What Is a Document Database?

The document database defined

A document database is a type of nonrelational database that is designed to store semistructured data as documents. Document databases are intuitive for developers to use because the data in the application tier is typically represented as a JSON document. Developers can persist data by using the same document model format that they use in their application code. In a document database, each document can have the same or different data structure, and each document is self-describing—including its possibly unique schema—and isn’t necessarily dependent on any other document. Documents are grouped into "collections," which serve a similar purpose to a table in a relational database. 

For example, in a simple movie database, a JSON document that describes two movie items could look like the following.

[
    {
        "year" : 2013,
        "title" : "Turn It Down, Or Else!",
        "info" : {
            "directors" : [ "Alice Smith", "Bob Jones"],
            "release_date" : "2013-01-18T00:00:00Z",
            "rating" : 6.2,
            "genres" : ["Comedy", "Drama"],
            "image_url" : "http://ia.media-imdb.com/images/N/O9ERWAU7FS797AJ7LU8HN09AMUP908RLlo5JF90EWR7LJKQ7@@._V1_SX400_.jpg",
            "plot" : "A rock band plays their music at high volumes, annoying the neighbors.",
            "actors" : ["David Matthewman", "Jonathan G. Neff"]
        }
    },
    {
        "year": 2015,
        "title": "The Big New Movie",
        "info": {
            "plot": "Nothing happens at all.",
            "rating": 0
        }
    }
]
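
To make the "same or different data structure" point concrete, here is a minimal Python sketch that loads a trimmed copy of the sample above and inspects it. The helper names `info_fields` and `find_by_genre` are illustrative assumptions, and the in-memory list only stands in for a collection; no real database API is involved.

```python
import json

# A trimmed copy of the two sample documents above.
movies_json = """
[
  {"year": 2013, "title": "Turn It Down, Or Else!",
   "info": {"directors": ["Alice Smith", "Bob Jones"],
            "rating": 6.2,
            "genres": ["Comedy", "Drama"]}},
  {"year": 2015, "title": "The Big New Movie",
   "info": {"plot": "Nothing happens at all.", "rating": 0}}
]
"""

# The parsed list plays the role of a "collection" of documents.
movies = json.loads(movies_json)


def info_fields(doc):
    """Return the sorted field names under a document's "info" key."""
    return sorted(doc.get("info", {}).keys())


def find_by_genre(collection, genre):
    """Return titles of documents that list the given genre.

    Documents without a "genres" field are skipped, not treated as errors.
    """
    return [
        doc["title"]
        for doc in collection
        if genre in doc.get("info", {}).get("genres", [])
    ]


for doc in movies:
    # Each document describes its own structure; the two field lists differ.
    print(doc["title"], "->", info_fields(doc))

print(find_by_genre(movies, "Comedy"))
```

Because each document carries its own field names, the query helper has to tolerate missing fields, which is the usual trade-off of a flexible schema.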

Use cases

Content management

A document database is a good fit for content management applications such as blogs and video platforms. Each entity that the application tracks can be stored as a single document, which makes it easier for a developer to update the application as requirements evolve. If the data model needs to change, only the affected documents need to be updated. No schema update is required, and no database downtime is necessary to make the changes.
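
The idea that only the affected documents change can be sketched in a few lines of Python. This is an illustration under stated assumptions: the dictionary `posts` stands in for a collection, and `update_document`, the document IDs, and the field names are hypothetical, not part of any particular product's API.

```python
# A stand-in "collection" of content items, keyed by a hypothetical document ID.
posts = {
    "post-1": {"title": "Hello", "body": "First post."},
    "post-2": {"title": "Video launch", "body": "Watch now."},
}


def update_document(collection, doc_id, changes):
    """Merge new or changed fields into one document.

    Only the document with the given ID is rewritten; every other
    document in the collection keeps its existing structure.
    """
    collection[doc_id] = {**collection[doc_id], **changes}
    return collection[doc_id]


# A new requirement: video posts need a URL and a duration.
# Only the video post gains the new fields.
update_document(
    posts,
    "post-2",
    {"video_url": "https://example.com/v/1", "duration_s": 95},
)

print(posts["post-1"])
print(posts["post-2"])
```

The text post is left exactly as it was, so there is no collection-wide migration step in this sketch.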

Catalogs

Document databases are efficient and effective for storing catalog information. For example, in an e-commerce application, different products usually have different numbers of attributes. Managing thousands of attributes in a relational database is inefficient, and read performance suffers. In a document database, each product's attributes can be described in a single document, which simplifies management and speeds up reads. Changing the attributes of one product doesn't affect other products.
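
The following Python sketch contrasts the two layouts for a small catalog. It assumes an entity-attribute-value (EAV) table as the relational layout, which is only one of several ways to model sparse attributes and is not specified by the text above. The SKUs, attribute names, and the helper `product_from_rows` are hypothetical.

```python
# Assumed relational layout: one (sku, attribute, value) row per attribute.
eav_rows = [
    ("sku-1", "name", "Laptop"),
    ("sku-1", "ram_gb", 16),
    ("sku-1", "screen_in", 14),
    ("sku-2", "name", "T-shirt"),
    ("sku-2", "size", "M"),
]


def product_from_rows(rows, sku):
    """Reassemble one product by scanning and filtering the attribute rows."""
    return {attr: value for row_sku, attr, value in rows if row_sku == sku}


# Document layout: each product is one self-contained document.
catalog = {
    "sku-1": {"name": "Laptop", "ram_gb": 16, "screen_in": 14},
    "sku-2": {"name": "T-shirt", "size": "M"},
}

# Reading a product is a single lookup in the document layout.
laptop = catalog["sku-1"]

# Changing one product's attributes leaves the other product untouched.
catalog["sku-2"]["color"] = "blue"

print(product_from_rows(eav_rows, "sku-1") == laptop)
print(catalog["sku-1"])
```

Both layouts hold the same information; the difference in this sketch is that the document version reads a product with one lookup instead of collecting it from many rows.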

Popular document databases

Amazon DynamoDB

Amazon DynamoDB is a nonrelational database that delivers reliable performance at any scale. It's a fully managed, multi-region, multi-master database that provides consistent single-digit millisecond latency, and offers built-in security, backup and restore, and in-memory caching. DynamoDB supports native JSON, so you can write JSON documents directly into DynamoDB tables. With a maximum item size of 400 KB, DynamoDB allows you to store large JSON documents and nested objects in one transaction.

Get started with DynamoDB today.


MongoDB

MongoDB is an open-source, nonrelational database that provides JSON-like, document-oriented storage. It supports a flexible data model that enables you to store data of any structure, and provides a rich set of features, including full index support, sharding, and replication. AWS enables you to set up the infrastructure to support a MongoDB deployment in a flexible, scalable, and cost-effective manner in the AWS Cloud.

Use the AWS MongoDB Quick Start (also available in PDF format) to deploy a MongoDB cluster in the AWS Cloud. For an overview of MongoDB and its implementation on AWS, see the whitepaper, MongoDB on AWS: Guidelines and Best Practices. Also, be sure to review AWS security recommendations for MongoDB.

Couchbase

Designed to power engaging mobile, IoT, and web applications, the enterprise-class Couchbase Data Platform includes Couchbase Server and Couchbase Mobile. Couchbase Server is a cloud-native, nonrelational database designed with a distributed architecture for performance, scalability, and availability. It enables developers to build applications by leveraging the power of SQL with the flexibility of JSON. Couchbase Mobile includes a fully integrated embedded database, built-in security, and real-time automated sync with the highly scalable Couchbase Server.

Use the AWS Couchbase Quick Start (also available in PDF format) to deploy a Couchbase cluster in the AWS Cloud.