
Overview
This listing provides a container-based agent for "Datadog Pro (Pay-As-You-Go)". You will need to subscribe to https://aws.amazon.com/marketplace/pp/B01LYD359R before using this agent.
Datadog is a SaaS-based monitoring and analytics platform for large-scale applications and infrastructure. Combining real-time logs, metrics from servers, containers, databases, and applications with end-to-end tracing, Datadog delivers actionable alerts and powerful visualizations to provide full-stack observability. Datadog includes over 250 vendor-supported integrations and APM libraries for several languages.
Highlights
- Turn-key integrations and easy-to-install agent to start monitoring all your servers and resources in minutes.
- Rich, out-of-the-box dashboards plus drag-and-drop tools to create your own.
- Easy-to-use API allows you to extend Datadog integrations and send metrics and events from your own applications.
Details
Delivery details
Datadog Pro (Pay-As-You-Go)
- Amazon ECS
- Amazon EKS
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
N/A
Additional details
Usage instructions
The Datadog Agent can be deployed as a Docker Container, an ECS or Fargate Task, or via the Kubernetes Daemonset or Helm Chart. For basic usage refer to: /agent. For advanced documentation see: https://github.com/DataDog/datadog-agent/tree/master/Dockerfiles/agent
Support
Vendor support
Contact our knowledgeable Support Engineers via email, IRC, or in-app chat.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
Customer reviews
Have improved incident response and centralized observability while optimizing resource usage
What is our primary use case?
Our main use case for Datadog includes monitoring and logs, custom metrics, as well as utilizing the APM feature and synthetic tests in our day-to-day operations.
A quick specific example of how Datadog helps with our monitoring and logs comes from all our applications sending logs into Datadog for troubleshooting purposes, with alerts built on top of the logs, and for custom metrics, we send our metrics from the applications via Prometheus to Datadog, building alerts on top of those as well, sometimes sending critical alerts directly to PagerDuty.
We generally have monitors and alerts set up for our applications and rely on them specifically to check our critical business units, such as databases. In GCP we use Cloud SQL, in AWS we use RDS, and we also monitor Scylla databases and EC2 instances running Kafka services, which we depend on heavily. Recently, we migrated from the US1 site to US5, which was a significant shift: we had to migrate all alerts and monitors to US5 and validate that they worked in the new site.
What is most valuable?
The best feature Datadog offers is its intuitive interface, which makes it very easy to track logs and custom metrics. We also appreciate the APM feature, which has helped reduce our log volumes and custom metric volumes, allowing us to turn off some custom metrics.
We recently learned how tags contribute to custom metrics volume, which led us to exclude certain tags to further reduce that volume, and we implement log indexing and exclusion filters, leaving us with much to explore and optimize in our use of Datadog as our major observability platform.
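The effect of tags on custom metric volume can be illustrated with a small sketch. It assumes that each distinct combination of metric name and tag values counts as a separate series; the metric name, tag keys, and the `count_series` helper below are hypothetical and are not part of any Datadog API.

```python
def count_series(points, excluded_tags=frozenset()):
    """Count distinct (metric name, tag set) combinations.

    Assumption: each unique combination is treated as one custom metric series.
    """
    series = set()
    for name, tags in points:
        # Drop the excluded tag keys before building the series identity.
        kept = frozenset((k, v) for k, v in tags.items() if k not in excluded_tags)
        series.add((name, kept))
    return len(series)


# Hypothetical data points: one latency metric reported by 100 pods in 2 regions.
points = [
    ("checkout.latency", {"env": "prod", "region": "us-east-1", "pod": f"pod-{i}"})
    for i in range(50)
] + [
    ("checkout.latency", {"env": "prod", "region": "eu-west-1", "pod": f"pod-{i}"})
    for i in range(50, 100)
]

all_tags = count_series(points)
without_pod = count_series(points, excluded_tags={"pod"})
print(all_tags)      # 100 series: one per pod
print(without_pod)   # 2 series: one per region
```

In this toy data set, excluding the single high-cardinality `pod` tag reduces the series count from 100 to 2, which is the kind of reduction the reviewer describes.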
What needs improvement?
Regarding metrics showing our improvements, the MTTR has been reduced by about 40% after integrating Datadog with PagerDuty, and we've seen our costs significantly drop in the most recent renewal after three years' contract.
Operationally, we spend about 30-40% less time correlating logs and metrics across services, while potential areas for improvement in Datadog include its integration depth and providing more flexible pricing models for large metric and log volumes.
I would suggest having an external Slack channel for urgent requests, which would enable quicker access to support or a dedicated support team for our needs.
I choose eight because, while we have used Datadog for three years and experienced growth in our business and services, the cost has also increased with the growth in metrics and log volumes, and proactive cost management feedback has not been provided to help manage or budget those rising costs. Thus, I'd like to see more proactive cost management in the future, as the pricing model seems to escalate quickly with increasing metrics ingestion and monitoring across clouds. Datadog is a powerful and reliable observability platform, but there is still room for improvement in cost efficiency and usability at scale.
Regarding pricing, setup costs, and licensing, I find Datadog's pricing model transparent but scaling quickly; the base licensing for host integration is straightforward, but costs can rapidly climb as we add custom metrics and log ingestion, especially in dynamic Kubernetes or multi-cloud environments, with the pricing being moderate to high, and while cost visibility is straightforward, it could become challenging with growing workloads. The upfront setup cost is minimal, mainly involving fine-tuning dashboards, tags, and alerts, making licensing very flexible to enable features as needed.
For how long have I used the solution?
I have been working in my current field for roughly 10 years, starting my AWS journey about 10 years ago, mainly focused on infrastructure and observability.
What do I think about the stability of the solution?
I believe Datadog is stable.
What do I think about the scalability of the solution?
Datadog's scalability is impressive, as it has the necessary integrations, supports agent-based and cloud-native solutions, and accommodates multi-cloud, multi-region features, making overall performance very good.
How are customer service and support?
Customer support has improved recently with online support available through a portal, allowing for quicker access to help.
How would you rate customer service and support?
Neutral
Which solution did I use previously and why did I switch?
Previously, we used Splunk SignalFx for a couple of years. We switched to Datadog because of its intuitive interface, which SignalFx lacked at the time.
What was our ROI?
Datadog has had a significant positive impact on our organization overall, particularly in visibility, reliability, and cost efficiency. It has allowed us to centralize metrics, logs, and traces across our cloud and to move from reactive to proactive monitoring. Improvements include faster incident detection and resolution, enhanced service reliability, and better cost and resource optimization. Shared dashboards give the engineering and product teams a single source of truth for system health and performance, which enhances our overall observability and operational efficiency.
I believe Datadog has delivered more than its value through reduced downtime, faster recovery, and infrastructure optimization. Although we sometimes miss critical alerts, overall it has improved our team's efficiency, with perhaps 30% less time spent troubleshooting logs and custom metrics, while providing measurable ROI through enhanced system reliability, reduced incident costs, and infrastructure spending optimization.
Which other solutions did I evaluate?
We only evaluated SignalFx before choosing Datadog, as Datadog offered simpler scaling, better management, broader integrations, and dashboards, allowing for easier monitoring of our multi-cloud setup.
What other advice do I have?
After reducing log and custom metric volumes, we saw a significant reduction in costs without any performance issues on our end.
I strongly recommend using Datadog, but suggest being proactive about resource usage and tracking anomalies monthly.
I find the interview process okay, although it runs longer than I expected, exceeding the anticipated 10 minutes.
My rating for Datadog is 8 out of 10.
Has improved our ability to identify cloud application issues quickly using trace data and detailed log filtering
What is our primary use case?
My team and I primarily rely on Datadog for logs from our application to identify issues in our cloud-based solution. We take the requests and information that our customers report as errors and use them to identify what the errors are within our back-end systems, which allows us to submit code fixes or configuration changes.
I had an error when I was trying to submit an API request this morning that just said unspecified error in the web interface. I took the request ID and filtered a facet of our logs to include that request ID, and it gave me the specific examples, allowing me to look at the code stack that we had logged to identify what specifically it was failing to convert in order to upload that data.
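The workflow described above, narrowing logs to a single request ID to find the underlying failure, can be sketched in plain Python. This is only an illustration of facet-style filtering on structured logs, not Datadog's query language or API; the log records, field names, and the `filter_by_facet` helper are hypothetical.

```python
import json

# Hypothetical structured log lines, one JSON object per line.
RAW_LOGS = [
    '{"request_id": "req-123", "status": "error", "message": "unspecified error"}',
    '{"request_id": "req-123", "status": "error", '
    '"message": "failed to convert field upload_date", "stack": "Converter.parse"}',
    '{"request_id": "req-456", "status": "info", "message": "upload complete"}',
]


def filter_by_facet(raw_lines, facet, value):
    """Return the parsed records whose `facet` attribute equals `value`."""
    records = (json.loads(line) for line in raw_lines)
    return [record for record in records if record.get(facet) == value]


matches = filter_by_facet(RAW_LOGS, "request_id", "req-123")
for record in matches:
    # The second matching record carries the specific conversion failure.
    print(record["message"], record.get("stack", ""))
```

Filtering on an exact attribute value returns only the records for that request, so the generic message from the web interface and the detailed back-end failure appear side by side.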
My team doesn't utilize Datadog logs very often, but we do have quite a few collections of dashboards and widgets that tell us the health of the various API requests that come through our application to identify any known issues with some of our product integrations. It's useful information, but it's not necessarily stuff that our team monitors directly as we're more of a reactionary team.
What is most valuable?
The best features Datadog offers, in my experience, are the ability to filter down by facets very quickly to identify the problems we're experiencing with our individual customers using our cloud application. I really enjoy the trace option so that I can see all of the various components and how they communicate with each other to see where the failures are occurring.
The trace option helps us spot issues by giving access to see if the problem is occurring within our Java components or if it's a result of the SQL queries, allowing us to look at the SQL queries themselves to identify what information it's trying to pull. We can also look at other integrations, whether that's serverless Lambda functions or different components from our outreach.
Datadog has impacted our organization positively because the general feeling is that it's superior to the ELK stack that we used to use, being significantly faster in searching and filtering the information down, as well as providing links to our search criteria that our development teams and cloud operations teams can use to look at the same problems without having to set up their own search and filter criteria.
What needs improvement?
For the most part, the issues that we come across with Datadog are related to training for our organization. Our development and operations teams have done a really good job of getting our software components into Datadog, allowing us to identify them. However, we do have reduced logging in our Datadog environment due to the amount of information that's going through.
The hardest thing we experience is just training people on what to search for when identifying a problem in Datadog, and having some additional training that might be easily accessible would probably be a benefit.
At this point, I do not know what I don't know, so while there may be options for improvements, Datadog works very well for the things that we currently use it for. Additionally, the extra training that would be more easily accessible would be extremely helpful, perhaps something within the user interface itself that could guide us on useful information or how to tie different components or build a good dashboard.
For how long have I used the solution?
I have worked for Calabrio for 13 years.
What do I think about the stability of the solution?
Datadog is very stable.
What do I think about the scalability of the solution?
Datadog's scalability is strong; we've continued to significantly grow our software, and there are processes in place to ensure that as new servers, realms, and environments are introduced, we're able to include them all in Datadog without noticing any performance issues. The reporting and search functionality remain just as good as when we had a much smaller implementation.
Which solution did I use previously and why did I switch?
Previously, we used the ELK stack (Elasticsearch, Logstash, and Kibana) to capture data. Our cloud operations team set that up because they were familiar with it from previous experiences. We stopped using it because as our environment continued to grow, the response times and the amount of data being kept reached a point where we couldn't effectively utilize it, and it lacked the capability to help us proactively identify issues.
What other advice do I have?
A general impression is that Datadog saves time because the ability to search, even over the vast amount of AWS realms and time spans that we have, is significantly faster compared to other solutions that I've used that have served similar purposes.
I would advise others looking into using Datadog to identify various components within their organization that could benefit from pulling that information in and how to effectively parse and process all of it before getting involved in a task, so they know what to look for. Specifically, when searching for data, if a metric can be pulled out into an individual facet and used, the amount of filtering that can be done is significantly improved compared to a general text search.
I would love to figure out how to use Datadog more effectively in the organization work that I do, but that is a discussion I need to have with our operations and research and development teams to determine if it can benefit the customer or the specific implementation software that I work with.
On a scale of one to ten, I rate Datadog a ten out of ten.
Has improved incident response time through centralized log monitoring and infrastructure automation
What is our primary use case?
My main use case for Datadog is for security SIEM, log management, and log archiving.
In my daily work, we send all our logs from different cloud services and SaaS products, including Okta, GCP, AWS, GitHub, as well as virtual machines, containers, and Kubernetes clusters. We send all this data to Datadog, and we have numerous different monitors configured. This allows us to create different security features, such as security monitoring, and to escalate items to an on-call security team for incident response. Archiving is significant because we can always restore logs from the archive and go back in time to see what happened on that exact day. It is very helpful for us to investigate security incidents and infrastructure incidents as well.
Regarding our main use case, we use the Terraform provider for Datadog, which is probably one of the biggest benefits of using Datadog over any other similar tool because Datadog has great Terraform support. We can create all our security monitoring infrastructure using Terraform. Even if something goes wrong and the Datadog tenant becomes completely compromised or if all our monitors were to get erased for whatever reason, we can always restore all our monitoring setup through Terraform, which provides peace of mind.
What is most valuable?
The best features Datadog offers are not necessarily about having the best individual features, but rather the sheer quantity of different features they offer. I appreciate how you can reuse a query across different indexes for logs or security monitoring. The syntax remains consistent for everything, so you do not have to learn multiple languages. Similarly, for different types of monitors, you can always reuse the same templating language, which makes things much more efficient.
Datadog positively impacted our organization by making us more cautious about how we manage our logs. Before Datadog, we would ingest substantial amounts of data without considering indexing priorities. We became more strategic about what we index, particularly for security and cloud audit logs. We improved our approach to indexing retention and determining which types of logs are important. Overall, we enhanced our internal log management practices.
After implementing Datadog, we observed specific improvements in outcomes and metrics. We started analyzing our logs more thoroughly than before, identifying different patterns, and determining log importance levels. We began looking for more signals from audit logs and distinguishing between critical and non-critical information. The most significant metric improvement has been reduced incident investigation time.
What needs improvement?
Datadog can be improved by addressing billing and spend calculation methods, as it would be better if these were more straightforward. Currently, these calculations can be complex. Additionally, while we use Terraform extensively, not everything is available in Terraform. It would be beneficial to have more features supported in Terraform, particularly some security features that have been available for a while but still lack Terraform support.
For how long have I used the solution?
I have been using Datadog for about four years.
What do I think about the stability of the solution?
Datadog is very stable.
What do I think about the scalability of the solution?
Datadog's scalability is excellent. We have never encountered any issues.
How are customer service and support?
The customer support is good. I have never had any issues.
I would rate the customer support as nine out of ten.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We previously used New Relic and switched because it was not very effective.
How was the initial setup?
My experience with pricing, setup cost, and licensing indicates that it was somewhat expensive.
What was our ROI?
I have seen a return on investment with Datadog, particularly in time saved responding to incidents. Regarding staffing requirements, that metric isn't applicable for our use case since log management and security monitoring inherently require personnel to respond. However, it has definitely improved our efficiency in terms of response time, though this isn't a hard metric but rather based on experience.
Which other solutions did I evaluate?
I do not remember evaluating other options before choosing Datadog as it was a long time ago.
What other advice do I have?
I would rate Datadog an eight out of ten because while it is expensive, it offers numerous features, though sometimes it attempts to do too much.
My advice to others considering Datadog is to explore other products and calculate potential spending carefully. If Terraform support is important to your organization, then Datadog is an excellent choice. However, keep in mind that costs will increase significantly as you scale, and different features have varying pricing structures.
Overall rating: 8/10
Has enabled our teams to detect application errors faster and shift company mindset toward proactive monitoring
What is our primary use case?
My main use case for Datadog is application monitoring.
Specifically for application monitoring, we monitor our production Laravel instances using APM spans and tracing.
In addition to application monitoring, I also use Datadog for log management for our applications, both on-prem and in the cloud, using the AWS integration.
What is most valuable?
In my experience, the best features that Datadog offers us include unprecedented visibility and the ability to dive deep on application debugging.
Datadog's visibility and debugging features help me day-to-day; specifically, we had an application that was throwing a bunch of errors causing an issue in our production database. Using Datadog, we were able to immediately isolate the error and plan around it.
Datadog has positively impacted my organization. I think it has given us not only the specific debug and error codes that we're looking for, but it has changed the entire company's mindset in how to extract value from data that's been lying around in our internal systems for years now and given everybody a new perspective on monitoring and debugging.
Since adopting Datadog, I've noticed specific outcomes. We've begun to handle our log management internally in a more efficient manner, so we've actually reduced our disk space usage and simplified our backup procedures and process chains using Datadog. Now that we have extracted the value from the logs, the traces, and the debug logs, we no longer have to rely so much on traditional text-based logs or even digging into the code and the error files themselves.
What needs improvement?
The only improvement I would like to see with Datadog is that the graphical user interface sometimes takes a little while to load, especially when diving deep on a subject, and a little more caching would help.
The largest pain point we've had with Datadog to this point was onboarding. This was partly our fault because our logs weren't really set up to be used in a modern observability platform like Datadog, but I definitely would have liked to have seen more comprehensive onboarding. We had a few appointments, but the more help we get up front, the easier it is for us to get more familiar and do more things with Datadog.
At this time, I do not think there are any other improvements Datadog needs that would make my experience even better.
For how long have I used the solution?
I have been using Datadog for approximately four months now.
What do I think about the stability of the solution?
Datadog is very stable.
What do I think about the scalability of the solution?
We have not yet hit a use case that would test Datadog's scalability, but based on everything else we've used with the infrastructure, I don't think there are going to be any issues. As a trial, we enabled the AWS integration, and it immediately found all of our AWS resources and presented them to us. It even surfaced cost and billing information, which we had not anticipated but were pleasantly surprised by.
How are customer service and support?
Customer support is excellent; I have opened and closed probably five tickets in the past seven days. The support techs are knowledgeable and very responsive.
I would rate customer support an eight out of ten. The only issue we had was needing more educational resources at the start to truly understand the specifics of log management and APM tracing setup, simply because those are very complicated procedures. Walking through that a couple more times with the support engineer probably would have been helpful. It was not a deal breaker or a significant pain point, but the quicker and deeper we get with Datadog, the happier people seem to be at our organization.
Overall, the entire Datadog comprehensive experience of support, onboarding, getting everything in there, and having a good line of feedback has been exceptional. I've been in the industry over 20 years, and part of my roles has always been customer-facing. I find that Datadog's client support is very engaging, comprehensive, and thorough.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
For on-prem infrastructure monitoring, we're currently using Nagios, but that's beginning to fade as we rely more on Datadog for our infrastructure monitoring. We had used New Relic for application performance monitoring, but because of the cost associated with that and not seeing the value from it, we stopped using that about two years ago.
How was the initial setup?
We did not purchase Datadog through the AWS Marketplace; we were contacted independently by a Datadog sales agent.
My experience with pricing, setup cost, and licensing has been fairly positive overall. With the on-demand/reserved pricing, we were not fully aware of how large the on-demand charges could get, especially while we were getting everything set up, but Datadog proactively took a strong hand in guiding us to get our costs under control. I'm proud to say that we are within 1% of our projected cost budget, which happened in the last month. Working with Datadog to control cost has been very efficient and effective.
What was our ROI?
In terms of time saved, I've noticed that when we're responding to potential errors or during our software deployments, it's saving us minutes at a time that quickly add up to hours, that quickly add up to days in terms of retrieving debug and application error information.
Which other solutions did I evaluate?
Before choosing Datadog, we evaluated other options including New Relic and SolarWinds.
What other advice do I have?
I would advise others looking into using Datadog to evaluate it against other competing properties and applications in the space, and really dig in. You will find that Datadog does what it's supposed to do very quickly and very efficiently, and does it more cost-competitively than some of the other offerings.
Datadog is deployed in my organization in both on-prem and in public cloud scenarios.
On a scale of one to ten, I rate Datadog a nine overall.
User sessions have been monitored effectively and beta user frustration points are now identified through behavioral insights
What is our primary use case?
I think the most important feature for me in Datadog is the RUM features.
I check the efficiency of the applications that I'm supporting in Datadog and also use it to view the sessions of users.
I have some trouble doing troubleshooting in our app currently, but RUM is my main use case in Datadog.
What is most valuable?
The personalized dashboards and alerting in Datadog stand out to me, so that way you can gear your use of the product towards what's important to you.
Datadog has allowed us to ensure that we can look at how our beta testers are using our new UIs and seeing where their frustration points are, which has been important to us.
We've been using the heat map feature in Datadog to measure those frustration points.
What needs improvement?
Some templates for certain roles and things that users care about could be auto-suggested for a dashboard or alerting in Datadog.
We had limitations around RUM and our feature flag provider in Datadog because it's a back-end forward feature flag usage in our Next.js application. We had trouble hooking up our feature flags due to RUM being client-side only. This issue arose because Next.js is a front-end and back-end focused application, and it would be beneficial to send the feature flag resolution from the back-end if needed. Our feature flag provider is GrowthBook, and the way we would have to get those feature flags into Datadog was time-consuming with a lot of boilerplate. We would have to mimic feature flag resolution on the client side, so we decided to forego that.
For how long have I used the solution?
We have been using Datadog for about two or three months.
What do I think about the stability of the solution?
Datadog seems stable in my experience without any downtime or reliability issues.
What do I think about the scalability of the solution?
Datadog is scalable and I don't think we'll have problems with scalability in terms of our use case. We might face limitations with logs, but I feel we would not be reaching any of Datadog's limits.
How are customer service and support?
The customer support has been one of the best parts of Datadog.
I would rate the customer support from Datadog a 10 on a scale of 1 to 10.
I would suggest staying in close contact with your customer support representative to get the most out of Datadog.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We did not have a different solution before Datadog.
How was the initial setup?
Setup with Datadog was pretty easy.
What was our ROI?
It is too early to tell if we've seen a return on investment so far with Datadog.
What's my experience with pricing, setup cost, and licensing?
I'm not clear on pricing, but it's not a huge concern for us at the moment in terms of RUM. For the other pieces, I know that there may be some pricing that they've been looking at for APM and logs.
Which other solutions did I evaluate?
I did not evaluate other options before choosing Datadog.
What other advice do I have?
I personally don't use the personalized dashboards and alerting, but I've seen some nice use cases from others on my team. On a scale of 1-10, I rate Datadog an 8.