AWS Database Blog

Using $dateFromString and executionStats in Amazon DocumentDB (with MongoDB compatibility)

Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. Amazon DocumentDB makes it easy and intuitive to store, query, and index JSON data. If you are new to Amazon DocumentDB, see Ramping up on Amazon DocumentDB (with MongoDB compatibility).

Amazon DocumentDB continues to improve compatibility with MongoDB. As of this writing, Amazon DocumentDB has added support for two new capabilities:

  • $dateFromString, which is an aggregation pipeline operator that allows you to compose powerful aggregations over your documents
  • executionStats mode for explain(), which provides detailed execution statistics for each stage within a query plan.

For more information about supported MongoDB APIs and aggregation pipeline capabilities for Amazon DocumentDB, see Supported MongoDB APIs, Operations, and Data Types.

This post discusses use cases for $dateFromString and executionStats, and shows you use to use these new capabilities through code examples.

$dateFromString

The $dateFromString aggregation operator enables you to convert a date in string form into a DATE data type. $dateFromString is the inverse operation of $dateToString.

To understand how $dateFromString works, this post uses an example dataset that records the date and time of events that occur within a video game. Although the video game logs events as strings, your application must be able to analyze the event field as a DATE data type. To perform the conversion from string to date, use the $dateToString aggregation operator.

Each document in the following example dataset is a record of a distinct event and time at which an event occurred within the video game:

db.missionLog243.insertMany([
{ _id: 1, "event":"missionStart", logDate: "2020-03-15T13:41:33"},
{ _id: 2, "event":"jumpPoint1", logDate: "2020-03-15T13:45:34"},
{ _id: 3, "event":"jumpPoint2", logDate: "2020-03-15T13:48:21"},
{ _id: 4, "event":"jumpPoint3", logDate: "2020-03-15T13:52:09"},
{ _id: 5, "event":"missionEnd", logDate: "2020-03-15T13:58:44"}
])

The following aggregation query projects the event field in its native String type, and converts the logDate field to a DATE data type, which subsequent stages in the aggregation pipeline can use:

db.missionLog243.aggregate( [ {
   $project: {
      event: '$event',
      logDate: {
         $dateFromString: {
            dateString: '$logDate'
         }
      }
   }
} 
])

The following output is the result:

{ "_id" : 1, "event" : "missionStart", "logDate" : ISODate("2020-03-15T13:41:33Z") }
{ "_id" : 2, "event" : "jumpPoint1", "logDate" : ISODate("2020-03-15T13:45:34Z") }
{ "_id" : 3, "event" : "jumpPoint2", "logDate" : ISODate("2020-03-15T13:48:21Z") }
{ "_id" : 4, "event" : "jumpPoint3", "logDate" : ISODate("2020-03-15T13:52:09Z") }
{ "_id" : 5, "event" : "missionEnd", "logDate" : ISODate("2020-03-15T13:58:44Z") }

The logDate field in the preceding output is represented as a DATE data type.

executionStats

When you investigate why a query is running more slowly than expected, you should understand the execution statistics for the selected query plan. The executionStats option to the explain command provides the number of documents returned from a particular stage (nReturned) and the amount of time spent executing each stage (executionTimeMillisEstimate). The output of executionStats helps identify the longest-running stages of a query, so you can optimize your queries and thus your application’s performance.

Run the query that you want to improve under the explain command with the following command:

db.col.find({<em>&lt;query document&gt;</em>}).explain("executionStats");

The following command is an example operation:

db.fish.find({}).limit(2).explain("executionStats");

The output from this operation may look something like the following:

{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.fish",
		"winningPlan" : {
			"stage" : "SUBSCAN",
			"inputStage" : {
				"stage" : "LIMIT_SKIP",
				"inputStage" : {
					"stage" : "COLLSCAN"
				}
			}
		}
	},
	"executionStats" : {
		"executionSuccess" : true,
		"executionTimeMillis" : "0.063",
		"planningTimeMillis" : "0.040",
		"executionStages" : {
			"stage" : "SUBSCAN",
			"nReturned" : "2",
			"executionTimeMillisEstimate" : "0.012",
			"inputStage" : {
				"stage" : "LIMIT_SKIP",
				"nReturned" : "2",
				"executionTimeMillisEstimate" : "0.005",
				"inputStage" : {
					"stage" : "COLLSCAN",
					"nReturned" : "2",
					"executionTimeMillisEstimate" : "0.005"
				}
			}
		}
	},
	"serverInfo" : {
		"host" : "enginedemo",
		"port" : 27017,
		"version" : "3.6.0"
	},
	"ok" : 1
}

To show only executionStats from the preceding query, enter the following command:

db.fish.find({}).limit(2).explain("executionStats").executionStats;

For more information on troubleshooting and optimizing query performance, see Troubleshooting Amazon DocumentDB and Best Practices for Amazon DocumentDB.

Summary

AWS continues to work backward from our customers and build the capabilities you need. This post showed how to use the $dateFromString aggregation operator and executionStats to help build and optimize your applications.

For more information, see Getting Started with Amazon DocumentDB, or watch the video Getting Started with Amazon DocumentDB on YouTube. You can use the same application code, drivers, and tools that you use with MongoDB today to start developing against Amazon DocumentDB. For more information about migrating to Amazon DocumentDB, see Migrating to Amazon DocumentDB and the video AWS re:Invent 2019: Migrating your databases to Amazon DocumentDB on YouTube.

 


About the Author

 

Joseph Idziorek is a Principal Product Manager at Amazon Web Services.