A document database is a type of NoSQL database that can be used to store and query data as JSON-like documents. JavaScript Object Notation (JSON) is an open data interchange format that is both human and machine-readable. Developers can use JSON documents in their code and save them directly into the document database. The flexible, semi-structured, and hierarchical nature of documents and document databases allows them to evolve with applications’ needs.

What are the advantages of document databases

Document databases enable flexible indexing, powerful ad hoc queries, and analytics over collections of documents. Read more about the benefits below.

Ease of development

JSON documents map to objects—a common data type in most programming languages. When building applications, developers can flexibly create and update documents directly from the code. This means they spend less time creating data models beforehand. Therefore, application development is more rapid and efficient.
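
As a rough illustration of that mapping, the sketch below round-trips an in-memory object through JSON using only Python's standard library. The `post` object and its field names are hypothetical, and a real database driver would handle the serialization step for you.

```python
import json

# Hypothetical application object; the field names are illustrative only.
post = {
    "title": "Hello, documents",
    "tags": ["intro", "nosql"],
    "author": {"name": "Alice Smith"},
}

# Serialize the object to a JSON document, as a driver would before saving it.
document = json.dumps(post)

# Reading the document back yields an equivalent object.
restored = json.loads(document)
```

No separate mapping layer is needed here: the nested `author` object and the `tags` list survive the round trip unchanged.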

Flexible schema

A document-oriented database allows you to create multiple documents with different fields within the same collection. This can be handy when storing unstructured data like emails or social media posts. However, some document databases offer schema validation, so you can impose some restrictions on the structure.
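
The following sketch shows the idea under simple assumptions: a plain Python list stands in for a collection, and a hand-written check stands in for schema validation. The `REQUIRED_FIELDS` rule and the `is_valid` helper are hypothetical, and real databases express validation rules in their own syntax.

```python
# A hypothetical collection holding documents with different fields.
collection = [
    {"type": "email", "subject": "Invoice", "to": ["bob@example.com"]},
    {"type": "post", "text": "New release is out", "likes": 12},
    {"subject": "No type field"},
]

# Illustrative rule: every document must carry a string "type" field.
REQUIRED_FIELDS = {"type": str}


def is_valid(document: dict) -> bool:
    """Return True if the document has every required field with the expected type."""
    return all(
        isinstance(document.get(field), expected)
        for field, expected in REQUIRED_FIELDS.items()
    )


valid = [doc for doc in collection if is_valid(doc)]
rejected = [doc for doc in collection if not is_valid(doc)]
```

The first two documents share only the `type` field yet both pass, while the third is rejected because it lacks the one field the rule requires.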

Performance at scale

Document databases offer built-in distribution capabilities. You can scale them horizontally by adding servers as data volume and traffic grow, which can be more cost-efficient than moving to a single larger server. Moreover, document databases provide fault tolerance and availability through built-in replication.
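
One common way to spread documents across servers is to hash each document's identifier. The sketch below assumes that approach purely for illustration; the text above does not say which distribution scheme a given database uses, and the `shard_for` helper and the `doc-N` identifiers are hypothetical.

```python
import hashlib


def shard_for(document_id: str, shard_count: int) -> int:
    """Map a document ID to a shard index using a stable hash."""
    digest = hashlib.sha256(document_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical document IDs spread across three servers.
ids = [f"doc-{n}" for n in range(1000)]
placement = {}
for doc_id in ids:
    placement.setdefault(shard_for(doc_id, 3), []).append(doc_id)
```

Because the hash is stable, the same identifier always maps to the same shard, so reads can be routed without a central lookup table.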

What are the use cases of document databases

The document model works well with use cases such as content management, catalogs, sensor management, and more. For each use case, each document is unique and evolves over time.

Content management

A document database is an excellent choice for content management applications such as blogs and video platforms. With a document database, each entity the application tracks can be stored as a single document. This gives developers a more intuitive way to update an application as requirements evolve. In addition, if the data model needs to change, only the affected documents need to be updated. No schema update is required and no database downtime is necessary to make the changes.

Catalogs

Document databases are efficient and effective for storing catalog information. For example, in an e-commerce application, different products usually have different numbers of attributes. Managing thousands of attributes in a relational database is inefficient, and read performance suffers. With a document database, each product’s attributes can be described in a single document, which makes them easier to manage and faster to read. Changing the attributes of one product won’t affect others.

Sensor management

The Internet of Things (IoT) has resulted in organizations regularly collecting data from smart devices like sensors and meters. Sensor data typically comes in as a continuous stream of variable values. Due to latency issues, some data objects might be incomplete, duplicated, or missing. Additionally, you must collect a large volume of data before you can filter or summarize it for analytics.

Document stores are more convenient in this case. You can quickly store the sensor data as it is, without cleaning it or making it conform to pre-determined schemas. You can also scale it as required and delete entire documents once analytics is done.
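
A minimal sketch of this workflow, assuming readings are kept as plain Python dictionaries: the raw documents are stored untouched, and duplicates or incomplete readings are only skipped later, at summary time. The sensor names, field names, and the `summarize` helper are all hypothetical.

```python
from statistics import mean

# Hypothetical raw readings, stored exactly as they arrived.
readings = [
    {"sensor": "s1", "ts": 1, "temp": 21.5},
    {"sensor": "s1", "ts": 1, "temp": 21.5},  # duplicate delivery
    {"sensor": "s1", "ts": 2},  # incomplete: no temperature value
    {"sensor": "s2", "ts": 1, "temp": 19.0, "humidity": 40},
    {"sensor": "s1", "ts": 3, "temp": 22.5},
]


def summarize(docs: list[dict]) -> dict[str, float]:
    """Average temperature per sensor, skipping duplicates and incomplete documents."""
    seen = set()
    temps: dict[str, list[float]] = {}
    for doc in docs:
        key = (doc.get("sensor"), doc.get("ts"))
        if "temp" not in doc or key in seen:
            continue
        seen.add(key)
        temps.setdefault(doc["sensor"], []).append(doc["temp"])
    return {sensor: mean(values) for sensor, values in temps.items()}


summary = summarize(readings)
```

Note that the `s2` reading carries an extra `humidity` field that no other document has, and nothing had to change to accept it.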

How do document databases work

Document databases store data as key-value pairs in JSON format. You can read JSON documents from the database and write them to it programmatically.

JSON documents structure

JSON represents data in three ways:

Key value

Key-value pairs are recorded within curly braces. The key is a string, and the value can be any JSON data type, such as a number, a string, or a boolean. For example, a simple key-value is {"year": 2013}.

Array

An array is an ordered collection of values defined within left ([) and right (]) brackets. Items in the array are comma separated. For example, {"fruit": ["apple","mango"]}.

Objects

An object is a collection of key-value pairs. Essentially, JSON documents allow developers to embed objects and create nested pairs. For example, {"address": {"country": "USA","state": "Texas"}}.
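
To make the three forms concrete, the sketch below parses the examples above with Python's standard `json` module and reads each one back. Combining them into a single document is an assumption made for brevity.

```python
import json

# The three examples from this section, combined into one document.
text = (
    '{"year": 2013, "fruit": ["apple", "mango"], '
    '"address": {"country": "USA", "state": "Texas"}}'
)
doc = json.loads(text)

# A simple key-value pair: the JSON number becomes an int.
year = doc["year"]

# An array becomes a list and keeps its order.
first_fruit = doc["fruit"][0]

# A nested object becomes a dict that can be indexed by key.
state = doc["address"]["state"]
```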

JSON documents example

In the following example, a JSON-like document describes a film data set.

[
    {
        "year" : 2013,
        "title" : "Turn It Down, Or Else!",
        "info" : {
            "directors" : [ "Alice Smith", "Bob Jones"],
            "release_date" : "2013-01-18T00:00:00Z",
            "rating" : 6.2,
            "genres" : ["Comedy", "Drama"],
            "image_url" : "http://ia.media-imdb.com/images/N/O9ERWAU7FS797AJ7LU8HN09AMUP908RLlo5JF90EWR7LJKQ7@@._V1_SX400_.jpg",
            "plot" : "A rock band plays their music at high volumes, annoying the neighbors.",
            "actors" : ["David Matthewman", "Jonathan G. Neff"]
        }
    },
    {
        "year": 2015,
        "title": "The Big New Movie",
        "info": {
            "plot": "Nothing happens at all.",
            "rating": 0
        }
    }
]

You can observe that the JSON document holds simple values, arrays, and objects quite flexibly. You can even have an array with JSON objects within it. Hence, document-oriented databases let you create a deeply nested hierarchy of embedded JSON objects, although individual databases may limit the nesting depth. It's entirely up to you what schema you want to give your document store.
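
The sketch below loads a trimmed copy of the film data set above and reads nested and optional fields from it. It uses only Python's standard library, and the variable names are illustrative.

```python
import json

# Trimmed copy of the film data set above.
films = json.loads("""
[
    {"year": 2013, "title": "Turn It Down, Or Else!",
     "info": {"directors": ["Alice Smith", "Bob Jones"], "rating": 6.2,
              "genres": ["Comedy", "Drama"]}},
    {"year": 2015, "title": "The Big New Movie",
     "info": {"plot": "Nothing happens at all.", "rating": 0}}
]
""")

# Fields are optional, so use .get() with a default for fields a document may lack.
genre_counts = {
    film["title"]: len(film["info"].get("genres", [])) for film in films
}

# Reach into the nested "info" object to filter on a value.
rated_above_five = [film["title"] for film in films if film["info"]["rating"] > 5]
```

The second film has no `genres` field at all, which is why the lookup supplies an empty list as a default instead of assuming every document has the same shape.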

Document database operations

You can create, read, update, and delete entire documents stored in the database. Document databases provide a query language or API that allows developers to run the following operations:

Create

You can create documents in the database. Each document has a unique identifier that serves as a key.

Read

You can use the API or query language to read document data. You can run queries using field values or keys. You can also add indexes to the database to increase read performance.

Update

You can update existing documents flexibly. You can rewrite the entire document or update individual values.

Delete

You can delete documents from the database.
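
The create, read, update, and delete operations can be sketched with a small in-memory store. The `DocumentStore` class below is hypothetical and is not the API of any real database; it only illustrates the shape of the operations, with a generated UUID standing in for the unique identifier.

```python
import copy
import uuid
from typing import Any, Optional


class DocumentStore:
    """Hypothetical in-memory store; not the API of any real database."""

    def __init__(self) -> None:
        self._docs: dict[str, dict] = {}

    def create(self, document: dict) -> str:
        """Store a copy of the document under a new unique identifier."""
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = copy.deepcopy(document)
        return doc_id

    def read(self, doc_id: str) -> Optional[dict]:
        """Return a copy of the document, or None if the ID is unknown."""
        doc = self._docs.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def find(self, field: str, value: Any) -> list[dict]:
        """Return copies of all documents whose top-level field equals value."""
        return [
            copy.deepcopy(doc)
            for doc in self._docs.values()
            if doc.get(field) == value
        ]

    def update(self, doc_id: str, changes: dict) -> None:
        """Update individual values, leaving the other fields untouched."""
        self._docs[doc_id].update(changes)

    def delete(self, doc_id: str) -> None:
        """Remove the entire document."""
        del self._docs[doc_id]


store = DocumentStore()
film_id = store.create({"title": "The Big New Movie", "year": 2015})
store.update(film_id, {"rating": 0})
found = store.find("year", 2015)
store.delete(film_id)
after_delete = store.read(film_id)
```

The `find` method scans every document, which is the cost an index on the queried field would avoid.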

What is the difference between document database and key-value stores

A key-value database is a NoSQL database that uses a simple key-value method to store data. It stores data as a collection of key-value pairs in which a key serves as a unique identifier. Both keys and values can be anything, from simple to complex compound objects.

A document-oriented database is a special type of key-value store where keys can only be strings. Moreover, the document is encoded using standards like JSON or related languages like XML. You can also store PDFs, image files, or text documents directly as values.

When querying your document store, you can read the whole value or only part of it, which is especially useful when the value is another JSON object. For example, you can store {"book": {"id": 1, "price": 10}}, then query book.price, and the database returns the value 10. A key-value database, by contrast, always returns the whole value, with both the ID and the price.
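
The difference between the two read styles can be sketched in a few lines, using a numeric price for the book. The `store` dictionary, the key `book:1`, and both helper functions are hypothetical, and real document databases expose path queries through their own query language or API.

```python
from functools import reduce

# Hypothetical stored value; the key "book:1" is illustrative.
store = {"book:1": {"book": {"id": 1, "price": 10}}}


def get_value(key: str) -> dict:
    """Key-value style read: the whole value comes back."""
    return store[key]


def get_path(key: str, path: str):
    """Document style read: follow a dotted path into the value."""
    return reduce(lambda node, part: node[part], path.split("."), store[key])


whole = get_value("book:1")
price = get_path("book:1", "book.price")
```

With the key-value read, the caller receives the full object and must extract the price itself; with the path read, only the requested field is returned.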

How can AWS support your document database requirements

Amazon DocumentDB (with MongoDB compatibility) is a fully managed native JSON document database service that supports document workloads, including MongoDB workloads. Developers can use the same MongoDB application code, drivers, and tools to run, manage, and scale workloads on Amazon DocumentDB. You can enjoy improved performance, scalability, and availability without worrying about managing the underlying infrastructure. With Amazon DocumentDB you can:

  • Scale to millions of read and write requests per second with Amazon DocumentDB Elastic Clusters, with little to no impact on performance and zero management of underlying infrastructure.
  • Decouple storage and compute so you can increase read performance with up to 15 read replicas that share the same underlying storage, without having to perform writes at the replica nodes.
  • Automate undifferentiated manual database management tasks, including hardware provisioning, patching, setup, and more, without licensing fees.
  • Achieve 99.99% availability, enhanced with Amazon DocumentDB Global Clusters for globally distributed applications that need fast local read performance.
  • Achieve 99.99% availability with automatic replication, continuous backup, and strict network isolation.
  • Rely on fault-tolerant, self-healing storage with point-in-time recovery, continuous backups, and more. Amazon DocumentDB makes your data durable across three Availability Zones (AZs) within a Region by replicating new writes six ways, while you pay for only one copy.
  • Secure your data with default encryption at rest, network isolation, and advanced auditing, and control resource-level permissions with fine-grained access.
  • Meet broad compliance requirements, including SOC (1, 2, and 3), PCI DSS, HIPAA eligibility, and more.

Get started with document databases on AWS by creating a free account today!
