Datastream supports replicating change events from a MongoDB source.
MongoDB is an open source, NoSQL database that uses JSON-like documents. One
document can have other documents embedded in it. The documents are gathered
together in collections, and a MongoDB database stores one or more collections
of documents.
Instead of storing data in tables of rows and columns like relational SQL
databases, each record in a MongoDB database is a document described in binary
JSON (BSON), a binary representation of data. Applications can then retrieve
this data in JSON format.
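To make the document model concrete, here is a minimal Python sketch of a single document that embeds another document and an array of documents. The collection contents and field names (`customer`, `items`, and so on) are hypothetical and chosen only for illustration. The sketch uses plain JSON from the standard library, not a BSON encoder.

```python
import json

# A hypothetical "orders" document; all field names are illustrative only.
order = {
    "_id": "order-1001",
    "status": "shipped",
    "customer": {  # an embedded document
        "name": "Ada",
        "address": {"city": "London", "country": "UK"},  # nested one level deeper
    },
    "items": [  # an array of embedded documents
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
}

# Applications typically work with the document as JSON-like data.
as_json = json.dumps(order, indent=2)
restored = json.loads(as_json)

# Embedded fields are reached by walking the nested structure.
print(restored["customer"]["address"]["city"])
```

Because the address lives inside the order document, reading it doesn't need a join against a separate table, which is the main contrast with the row-and-column model.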
This page contains information about:
The key terms that you need to understand when replicating from a MongoDB
database
How Datastream handles data that's pulled from
a source MongoDB database
The versions and editions of MongoDB that Datastream
supports
Known limitations for using MongoDB as a source
Key terms
The following are the key terms that you need to understand when working with
MongoDB sources:
SRV connection: a connection string with a hostname that corresponds to a
Domain Name System (DNS) service (SRV) record. The string has the following
format:
mongodb+srv://[username:password@]host[/[defaultauthdb][?options]]
Standard connection string: the standard format of the MongoDB
connection URI used to connect to a self-hosted MongoDB standalone deployment,
replica set, or sharded cluster. The string has the following format:
mongodb://[username:password@]host1[:port1][,...hostN[:portN]][/[defaultauthdb][?options]]
Replica set: a cluster of MongoDB servers that implements replication and
automated failover. Replica sets provide redundancy and high availability, and
are the basis for all production deployments.
Sharded cluster: a MongoDB deployment that consists of shards, mongos routers,
and configuration servers. MongoDB shards data at the collection level,
distributing the collection data across the shards in the cluster.
mongos: the interface between the client applications and the sharded
cluster. mongos acts as a query router, directing queries and write operations
to the shards.
Collection: MongoDB organizes data in a hierarchical structure. A MongoDB
deployment contains one or more databases, and each database contains one or
more collections. In each collection, MongoDB stores data as documents that
contain field and value pairs. Collections are analogous to tables in
relational databases.
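The two connection string formats described in the key terms can be assembled programmatically. The following is a minimal sketch that uses only the Python standard library. The helper name `build_mongodb_uri`, its parameters, and the example hostnames are assumptions made for illustration, not part of Datastream or of any MongoDB driver. It percent-encodes the credentials, because characters such as `@` or `/` in a username or password would otherwise break the URI.

```python
from urllib.parse import quote_plus, urlencode


def build_mongodb_uri(hosts, username=None, password=None,
                      default_auth_db=None, options=None, srv=False):
    """Build a MongoDB connection string (hypothetical helper).

    Standard: mongodb://[username:password@]host1[:port1][,...hostN[:portN]][/[defaultauthdb][?options]]
    SRV:      mongodb+srv://[username:password@]host[/[defaultauthdb][?options]]
    """
    if not hosts:
        raise ValueError("at least one host is required")
    if srv and (len(hosts) != 1 or ":" in hosts[0]):
        # The SRV format takes a single hostname and no port.
        raise ValueError("an SRV connection string takes one hostname without a port")

    scheme = "mongodb+srv" if srv else "mongodb"

    credentials = ""
    if username is not None:
        credentials = quote_plus(username)
        if password is not None:
            credentials += ":" + quote_plus(password)
        credentials += "@"

    uri = f"{scheme}://{credentials}{','.join(hosts)}"
    if default_auth_db or options:
        # The slash is required before the options even without a database.
        uri += "/" + (default_auth_db or "")
    if options:
        uri += "?" + urlencode(options)
    return uri


# Example hostnames and credentials are placeholders.
print(build_mongodb_uri(["cluster0.example.com"], "app", "p@ss/word", srv=True))
print(build_mongodb_uri(["h1:27017", "h2:27017"], default_auth_db="admin",
                        options={"replicaSet": "rs0"}))
```

The standard format lists every host explicitly, while the SRV format names one hostname and leaves the member list to DNS, which is why the sketch rejects multiple hosts or a port when `srv=True`.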
Behavior
Datastream relies on MongoDB change streams to replicate changes from the
source database to the destination. Change streams let you access real-time
data changes and are supported for replica sets and sharded clusters.
If you configure the stream to do so, Datastream replicates all historical
data for the included objects.
All changes to the specified objects, such as inserts, updates, and deletes,
are replicated.
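As a rough illustration of what replicating inserts, updates, and deletes involves, the following Python sketch applies change events to an in-memory copy of a collection. It isn't Datastream's implementation. The event fields (`operationType`, `documentKey`, `fullDocument`, `updateDescription`) follow the shape of MongoDB change events, while the function name `apply_change_event` and the sample data are assumptions. For brevity, the sketch handles only top-level fields in updates and ignores dotted paths into embedded documents.

```python
def apply_change_event(replica, event):
    """Apply one change event to `replica`, a dict keyed by document _id.

    Hypothetical helper for illustration; handles top-level fields only.
    """
    op = event["operationType"]
    if op not in ("insert", "replace", "update", "delete"):
        # Other event types (for example "drop" or "invalidate") are ignored.
        return replica

    doc_id = event["documentKey"]["_id"]
    if op in ("insert", "replace"):
        # Both carry the complete new version of the document.
        replica[doc_id] = dict(event["fullDocument"])
    elif op == "update":
        # Updates carry only the delta: changed fields and removed fields.
        doc = replica.setdefault(doc_id, {"_id": doc_id})
        changes = event["updateDescription"]
        doc.update(changes.get("updatedFields", {}))
        for field in changes.get("removedFields", []):
            doc.pop(field, None)
    else:  # delete
        replica.pop(doc_id, None)
    return replica


# Sample events for one hypothetical document.
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "name": "Ada", "tier": "free", "trial": True}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "updateDescription": {"updatedFields": {"tier": "pro"},
                           "removedFields": ["trial"]}},
]

replica = {}
for event in events:
    apply_change_event(replica, event)
print(replica)
```

Applying the events in order matters: the update in this example only makes sense after the insert, which is why a replication pipeline needs the historical data in place before it applies later changes.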
Versions
Datastream supports MongoDB versions later than 5.0.
Known limitations
Known limitations for using MongoDB as a source include:
When you use the Datastream API, you can specify only which fields to
exclude from your stream. Specifying an include list for fields
isn't supported.
Stream recovery isn't supported.
What's next
Learn how to configure a MongoDB source (/datastream/docs/configure-mongodb) for use with Datastream.
Last updated 2025-08-25 UTC.