
Reviews from AWS customer

21 AWS reviews

External reviews

58 reviews

External reviews are not included in the AWS star rating for the product.


4-star reviews

    Patrick Lynch

Has improved visibility into performance metrics and helped reduce cloud spend

  • October 16, 2025
  • Review from a verified AWS customer

What is our primary use case?

My main use case for Datadog is dashboards and monitoring.

We use dashboards and monitoring with Datadog to monitor the performance of our Nexus Artifactory system and make sure the services are running.

What is most valuable?

The best features Datadog offers are the dashboarding tools as well as the monitoring tools.

What I find most valuable about the dashboarding and monitoring tools in Datadog is the ease of use and simplicity of the interface.

Datadog has positively impacted our organization by allowing us to look at things such as Cloud Spend and make sure our services are running at an optimal performance level.

We have seen specific outcomes such as cost savings by utilizing the cost utilization dashboards to identify areas where we could trim our spend.

What needs improvement?

To improve Datadog, I suggest they keep doing what they're doing.

Newer features using AI to create monitors and dashboards would be helpful.

For how long have I used the solution?

I have been using Datadog for six years.

What do I think about the stability of the solution?

Datadog is stable.

What do I think about the scalability of the solution?

I am not sure about Datadog's scalability.

How are customer service and support?

Customer support with Datadog has been great when we needed it.

I rate the customer support a nine on a scale of 1 to 10.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We did not previously use a different solution.

What was our ROI?

In terms of return on investment, there is a lot of time saved from using the platform.

What's my experience with pricing, setup cost, and licensing?

I was not directly involved in the pricing, setup cost, and licensing details.

Which other solutions did I evaluate?

Before choosing Datadog, we evaluated other options such as Splunk and Grafana.

What other advice do I have?

I rate Datadog an eight out of ten because the expense of using it keeps it from being a nine or ten.

My advice to others looking into using Datadog is to brush up on their API programming skills.

My overall rating for Datadog is eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    Ilja Summala

Alerting and metrics improve monitoring efficiency while pricing presents challenges

  • August 07, 2025
  • Review from a verified AWS customer

What is our primary use case?

The primary purposes for which Datadog is used include infrastructure monitoring and application monitoring.

The main use case for Datadog's integration capabilities was monitoring workloads in public cloud, and the integrations that pull public cloud metrics natively were helpful, even critical, for us. At my last employer, we did not use Datadog for AI-driven data analysis at all; we relied on cloud-native and vendor-native tools for that instead.

What is most valuable?

I find alerting and metrics to be the most effective features of Datadog for system monitoring. It was still cheaper to run Datadog than other alternatives, so the running costs were cheaper because it was SaaS and quite easy to use.

Datadog is only available in SaaS.

What needs improvement?

The pricing nowadays is quite complex.

In future updates, I would like to see AI features included in Datadog for monitoring AI spend and usage to make the product more versatile and appealing for the customer.

For how long have I used the solution?

I have been using Datadog since 2014.

What was my experience with deployment of the solution?

There were no problems with the deployment of Datadog.

The deployment of Datadog just took a few hours.

What do I think about the stability of the solution?

The challenges I encountered while using Datadog were in the early days when the product was missing the ability to monitor Kubernetes and similar features, but they have since added those features. At the moment, I don't think there are too many challenges that I am worrying about.

How was the initial setup?

One person is enough to do the installation.

What other advice do I have?

I am currently on sabbatical and not working with any of these solutions, but I worked with Datadog until about six months ago.

We were using the tools that AWS and Azure came with natively to monitor the AI workflows on their platforms.

I used to work as the CTO at Northcloud, but I no longer work there.

On a scale of one to ten, I rate Datadog an eight out of ten.


    reviewer820579

Single pane of glass, easy to share dashboards, and good for monitoring

  • September 20, 2024
  • Review from a verified AWS customer

What is our primary use case?

We primarily use the solution for a variety of purposes, including:

  • Watching RUM data for frontend site, using LCP and INP metrics to compare across the old and new architecture to inform rollout decisions.
  • Watching APM data for backend services, observing how the backend server reacts (CPU util, memory, requests/second) to make sure the backend can handle the load.
  • Using Datadog CCM during our free trial period to get visibility over our AWS spend across accounts and resources and looking at recommendations and acting on those.
  • Browsing the service catalog to look at the current state of services that are running and what resources they use.
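
The old-versus-new comparison of RUM metrics mentioned above can be sketched roughly as follows. This is a minimal illustration, not the reviewer's method: the sample values, the `p75` and `compare_p75` helpers, and the choice of the 75th percentile (the percentile commonly used for web vitals such as LCP and INP) are all assumptions, and the samples are taken to have already been exported from a RUM tool.

```python
from statistics import quantiles


def p75(samples_ms):
    """Return the 75th percentile of a list of timings in milliseconds."""
    # quantiles(n=4) returns the three quartile cut points; index 2 is the upper quartile.
    return quantiles(samples_ms, n=4)[2]


def compare_p75(old_ms, new_ms):
    """Compare the p75 of a metric (e.g. LCP) between two architectures."""
    old_p75, new_p75 = p75(old_ms), p75(new_ms)
    return {"old_p75": old_p75, "new_p75": new_p75, "improved": new_p75 < old_p75}


# Hypothetical LCP samples in milliseconds for each architecture.
old_lcp = [1200, 1500, 1800, 2100, 2400, 2700, 3000]
new_lcp = [800, 1100, 1400, 1700, 2000, 2300, 2600]

result = compare_p75(old_lcp, new_lcp)
print(result)
```

A rollout decision would normally also look at sample counts and traffic mix, which this sketch leaves out.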

How has it helped my organization?

This provides a single place to find monitoring data. Prior to DD, we had some metrics living in New Relic, some in Grafana, and some in Circonus, and it was very confusing to navigate across them. Understanding different query languages is challenging. Here, there's a single UI to get used to, and everything is so sharable.

DD has led to teams making more decisions based on data that they observe about their service metrics and RUM metrics. I've seen decisions get made based on what has been observed in DD, and less based on anecdotal data.

What is most valuable?

I really enjoyed using CCM since it showed cloud cost data easily next to other metrics, and I could correlate the two.

Across CCM and the rest of Datadog, I like how sharable everything is. It's so easy to share dashboards and links with my teammates so we can quickly get up to speed on debugging/solving an issue.

I also have really enjoyed K8s view of pods and pod health. It's very visual, and as a non-K8s platform owner at my company, I can still observe the overall health of the system. Then I can drill in and have learned things about K8s by exploring that part of the product and talking with the team.

What needs improvement?

We've had some issues where we had Datadog automatically turned on in AWS regions that we weren't using, which incurred a small but steady cost that amounted to tens of thousands of dollars spent over a few weeks. I wish there was a global setting that lets an admin restrict which regions DD is turned on in as a default setup step.

Sometimes, the APM service dashboard link isn't sharable. I click something in the service catalog, and on that service's APM default view, I try to share a link to that with a teammate, and they reach a blank or error screen. 

I wish there was more organization and detail in the suggestions when I use the query editor. I'm never quite sure when the autofill dropdown shows up if I'm seeing some custom tag or some default property, so I have to know exactly what I'm looking for in order to build a chart. It's hard to navigate and explore using the query autofill suggestions without knowing exactly what tag to look for.

It's been a bit hard to understand how data gets sampled or how many data points a particular dashboard value is using. We've had questions about the RUM metrics that we see, and we had to ask for help with how values are calculated, bin sizes, etc., to get confidence in our data.

For how long have I used the solution?

I've used the solution for six months.

What do I think about the stability of the solution?

I've only been aware of a recent outage that affected the latency of data collection for one of our production tests. Outside of that, the solution seems stable.

What do I think about the scalability of the solution?

The solution seems like it can scale very well and beyond our needs.

How are customer service and support?

Technical support has been stellar. We love working with a team that responds fast, in great detail, and with great empathy. I trust what they say.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used New Relic, Grafana, and Circonus. Circonus was flaky, always having downtime, and we were always on the phone with them. With New Relic and Grafana, different metrics lived in each, and it was hard for consumers of the data to easily find what they needed. We also had licensing issues across the three, so not everybody could easily access all of them.

What's my experience with pricing, setup cost, and licensing?

I didn't do this portion of the product setup.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    Scott Palmer

Good query filtering and dashboards to make finding data easier

  • September 19, 2024
  • Review from a verified AWS customer

What is our primary use case?

We use the solution for monitoring microservices in a complex AWS-based cloud service.  

The system is composed of about a dozen services. This involves processing real-time data from tens of thousands of internet-connected devices that are providing telemetry. Thousands of user interactions are processed along with real-time reporting of device data over transaction intervals that can last for hours or even days. The need to view and filter data over periods of several months is not uncommon.

Datadog is used for daily monitoring and R&D research as well as during incident response.

How has it helped my organization?

The query filtering and improved search abilities offered by Datadog are far superior to other solutions we were using, such as AWS CloudWatch. We find that we can simply get at the data we need quicker and easier than before. This has made responding to incidents or investigating issues a much more productive endeavour. We simply have fewer roadblocks in the way when we need to "get at the data". It is also used occasionally to extract data while researching requirements for new features.

What is most valuable?

Datadog dashboards are used to provide a holistic view of the system across many services. Customizable views as well as the ability to "dive in" when we see something anomalous have improved the workflow for handling incidents.

Log filtering, pattern detection and grouping, and extracting values from logs for plotting on graphs all help to improve our ability to visualize what is going on in the system. The custom facets allow us to tailor the solution to fit our specific needs.

What needs improvement?

There are some areas on log filtering screens where the user interface can take some getting used to. Perhaps having the option for a simple vs advanced user interface would be helpful in making new or less experienced users comfortable with making their own custom queries.

Maybe it is just how our system is configured, yet finding the valid values for a key/value pair is not always intuitively obvious to me. While there is a pop-up window with historical or previously used values and saved views from previous query runs, I don't see a simple list or enumeration of the set of valid values for keys that have such a restriction.

For how long have I used the solution?

I've used the solution for one year.

What do I think about the stability of the solution?

The solution is very stable.

What do I think about the scalability of the solution?

The product is reasonably scalable, although costs can get out of hand if you aren't careful.

How are customer service and support?

I have not had the need to contact support.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We did use AWS CloudWatch. It was too awkward to use effectively and simply didn't have the features.

How was the initial setup?

We had someone experienced do the initial setup.  However, with a little training, it wasn't too bad for the rest of us.

What about the implementation team?

We handled the setup in-house.

What's my experience with pricing, setup cost, and licensing?

Take care of how you extract custom values from logs. You can do things without thought to make your life easier and not realize how expensive it can be from where you started.

Which other solutions did I evaluate?

I'm not aware of evaluating other solutions.

What other advice do I have?

Overall I recommend the solution. Just be mindful of costs.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    Jordan Lee

Good centralization with helpful monitoring and streamlined investigation capabilities

  • September 19, 2024
  • Review from a verified AWS customer

What is our primary use case?

We utilize Datadog to monitor both some legacy products and a new PaaS solution that we are building out here at Icario, which uses a microservice architecture.

All of our infrastructure is in AWS, with a very few legacy systems in Rackspace. For the PaaS, we mainly utilize the K8s Orchestrator, which implements the APM libraries into services deployed there and gives us infra info regarding the cluster.

For legacies, we mainly utilize the Agent or the AWS integration, with APM in specific places. For now, we monitor mainly prod in legacy and the full scope in the PaaS.

How has it helped my organization?

Datadog has greatly reduced the time needed to investigate issues by putting everything into a single pane of glass. This allows us to get ahead of infra/app-based issues before they affect customer experience with our products.

Outside of that, the ease of management, deployment of agents, integrations etc. has greatly helped the teams. There isn't much leg work needed by the devs to manage or deploy Datadog into their stacks. This is with the use of Terraform, pipelines and the orchestrator. All in all, it has been an improvement.

What is most valuable?

The two most valuable aspects are the Terraform provider for Datadog and the K8s Orchestrator. People don't take that into account when buying into a tooling product like Datadog in this age where scalability, management, and ease of implementation are key. Other tools not having good IaC products or options is a ball drop. Orchestration for the tool's agent is good. Not having to use another tool to manage the agents and config files in multiple places/instances is a huge win!

What needs improvement?

A big problem with Datadog is the billing. They need to make the billing more user-friendly. I know it like the back of my hand at this point, yet explaining to the C-suite why costs went up, or are what they are, is many times more complicated than it needs to be. I can't even say "why" due to the lack of metadata tied to billing. For instance, with the AWS Integration Host ingestion, I can't say, "This month THESE hosts got added, and that's what caused the cost to go up." The billing visibility really needs to be resolved!
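
The month-over-month question raised above ("which hosts got added?") can be approximated outside the billing page with a simple set difference. This is only a sketch under stated assumptions: the host lists are taken to be exported by the team itself (for example, from its own inventory snapshots), and the `host_changes` helper and host names are hypothetical rather than part of any Datadog API.

```python
def host_changes(previous_hosts, current_hosts):
    """Return the hosts added and removed between two inventory snapshots."""
    previous, current = set(previous_hosts), set(current_hosts)
    return {
        "added": sorted(current - previous),    # present now, absent last month
        "removed": sorted(previous - current),  # present last month, absent now
    }


# Hypothetical snapshots of monitored hosts taken one month apart.
last_month = ["web-1", "web-2", "db-1"]
this_month = ["web-1", "web-2", "web-3", "db-1", "batch-1"]

changes = host_changes(last_month, this_month)
print(changes)
```

Keeping such snapshots alongside the invoice gives at least a partial explanation for a cost change, even when the bill itself carries no host metadata.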

For how long have I used the solution?

I've used the solution for more than four years.

What do I think about the stability of the solution?

Datadog has always been extremely stable, with outages really only ever creating delays, never actual downtime of the service, which is amazing and impressive.

What do I think about the scalability of the solution?

The solution is very scalable if implemented right and not on top of complicated architecture.

How are customer service and support?

Support is excellent. They are always looking for a resolution, and a ticket is never left unresolved unless the feature just can't exist or isn't currently possible.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We did have New Relic, Datadog, Sumo Logic, Pingdom, and some other custom or third-party tooling. We switched because we wanted everything to be in a single pane and because Datadog is a better solution than the competitors.

How was the initial setup?

For us, set-up is a mixed bag as we support legacy apps and architectures as well as a new microservice architecture. That being said, legacy is somewhat complex just due to the nature of how those apps stack and the underlying infra and configuration and setup. Microservice is a breeze and straight-forward for most of the out-of-the-box stuff.

What about the implementation team?

Our Team of SRE Engineers, Platform Engineers and Cloud Engineers implemented the solution.

What was our ROI?

I can't really speak to ROI; however, from my perspective, we definitely get our money's worth from the product.

What's my experience with pricing, setup cost, and licensing?

Users just really need to make sure they stay on top of costs and don't let all of the engineers do as they please. Billing with Datadog can get out of hand if you let them. Not everything needs to be monitored.

Which other solutions did I evaluate?

We didn't really need to evaluate other options.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    Victor Chen1

Good for log ingestion and analyzing logs with easy searchability of data

  • September 19, 2024
  • Review from a verified AWS customer

What is our primary use case?

We use Datadog as our main log ingestion source, and Datadog is one of the first places we go to for analyzing logs.

This is especially true for cases of debugging, monitoring, and alerting on errors and incidents, as we send traffic logs from K8s, Amazon Web Services, and many other services at our company to Datadog. In addition, many products and teams at our company have dashboards for monitoring statistics (sometimes based on these logs directly, other times we set queries for these metrics) to alert us if there are any errors or health issues.

How has it helped my organization?

Overall, at my company, Datadog has made it easy to search for and look up logs at an impressively quick search rate over a large amount of logs.

It seamlessly allows you to set up monitoring and alerting directly from log queries which is convenient and helps for a good user experience, and while there is a bit of a learning curve, given enough time a majority of my company now uses Datadog as the first place to check when there are errors or bugs.

However, the cost aspect of Datadog is tricky to gauge because it's related to usage, and thus, it is hard to tell the relative value of Datadog year to year.

What is most valuable?

The feature I've found most valuable is the log search feature. It's set up with our ingestion to be a quick one-stop shop, is reliable and quick, and seamlessly integrates into building custom monitors and alerts based on log volume and timeframes.

As a result, it's easy to leverage this to triage bugs and errors, since we can pinpoint the logs around the time that they occur and get metadata/context around the issue. This is the main feature that I use the most in my workflow with Datadog to help debug and triage issues.

What needs improvement?

More log search keywords and tips would help improve Datadog's log dashboard. I recently struggled a lot to parse text from raw log lines that didn't seem to match directly with facets. There may be smart searching capabilities, but it's not intuitive to learn how to leverage them, and I instead had to resort to a Python script to do some simple regex parsing. (I was trying to parse "file:folder/*/*" from the logs and couldn't find a way to do this in Datadog. Maybe I'm just not familiar enough with the logs, but I couldn't easily find resources on how to do this either.)
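
The kind of regex parsing described above can be sketched in a few lines of Python. This is a minimal illustration and not the reviewer's actual script: the sample log lines, the `FILE_PATH` pattern, and the `extract_paths` helper are all assumptions about what a "file:folder/*/*" entry might look like in a raw log line.

```python
import re

# Assumed layout: "file:" followed by "folder/<segment>/<segment>".
# Each "*" is read as one path segment without slashes or whitespace.
FILE_PATH = re.compile(r"file:(folder/[^/\s]+/[^/\s]+)")


def extract_paths(lines):
    """Return every folder/*/* path that follows a 'file:' marker."""
    paths = []
    for line in lines:
        paths.extend(FILE_PATH.findall(line))
    return paths


# Hypothetical raw log lines.
sample = [
    "2024-09-19 12:00:01 INFO loaded file:folder/alpha/report.csv in 12ms",
    "2024-09-19 12:00:02 WARN nothing to parse here",
    "2024-09-19 12:00:03 INFO loaded file:folder/beta/data.json in 8ms",
]

print(extract_paths(sample))
```

If the real paths can be deeper than two segments, the pattern would need to be loosened accordingly.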

For how long have I used the solution?

I've used the solution for 10 months.

What's my experience with pricing, setup cost, and licensing?

Beware that the cost will fluctuate (and it often only gets more expensive very quickly).


    Charlie W.

Helpful support, with centralized pipeline tracking and error logging

  • September 19, 2024
  • Review from a verified AWS customer

What is our primary use case?

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. 

How has it helped my organization?

Through the use of Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps, and Datadog ties them all together in cohesive dashboards. 

What is most valuable?

The centralized pipeline tracking and error logging provide a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly. 

Synthetic testing is great, allowing us to catch potential problems before they impact real users. Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders. 

What needs improvement?

While the documentation is very good, there are areas that need a lot of focus to pick up on the key details. In some cases the screenshots don't match the text when updates are made. 

I spent longer than I should trying to figure out how to correlate logs to traces, mostly related to environmental variables.
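
As a rough illustration of why environment variables matter here: log-to-trace correlation generally depends on each log line carrying the same environment, service, and version tags as the trace, plus the trace and span IDs. The sketch below assumes the commonly documented `DD_ENV`, `DD_SERVICE`, and `DD_VERSION` variables and the `dd.*` log attribute names. The `correlated_log` helper is hypothetical, and in practice a tracing library would normally inject these fields rather than hand-written code.

```python
import json
import os


def correlated_log(message, trace_id, span_id, env=None):
    """Build a JSON log line tagged for log-to-trace correlation."""
    # Read tags from the process environment unless a mapping is passed in.
    env = os.environ if env is None else env
    record = {
        "message": message,
        "dd.env": env.get("DD_ENV", ""),
        "dd.service": env.get("DD_SERVICE", ""),
        "dd.version": env.get("DD_VERSION", ""),
        # IDs are placeholders here; a tracer would supply the real values.
        "dd.trace_id": str(trace_id),
        "dd.span_id": str(span_id),
    }
    return json.dumps(record)


# Hypothetical values standing in for a real deployment's settings.
line = correlated_log(
    "order processed",
    trace_id=123,
    span_id=456,
    env={"DD_ENV": "prod", "DD_SERVICE": "checkout", "DD_VERSION": "1.2.3"},
)
print(line)
```

If the variables are unset or differ between the application and the log pipeline, the tags on logs and traces will not match, which is one plausible source of the difficulty described above.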

For how long have I used the solution?

I've used the solution for about three years.

What do I think about the stability of the solution?

We have been impressed with the uptime.

What do I think about the scalability of the solution?

It's scalable and customizable. 

How are customer service and support?

Support is helpful. They help us tune our committed costs and alert us when we start spending out of the on-demand budget.

Which solution did I use previously and why did I switch?

We used a mix of SolarWinds, UptimeRobot, and GitHub Actions. We switched to find one platform that could give deep app visibility.

How was the initial setup?

Setup is generally simple. .NET Profiling of IIS and aligning logs to traces and profiles was a challenge.

What about the implementation team?

We implemented the solution in-house.

What was our ROI?

There has been significant time saved by the development team in terms of assessing bugs and performance issues.

What's my experience with pricing, setup cost, and licensing?

I'd advise others to set up live trials to assess cost scaling. Small decisions around how monitors are used can have big impacts on cost scaling.

Which other solutions did I evaluate?

New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

What other advice do I have?

We are excited to dig further into the new offerings around LLM and continue to grow our footprint in Datadog. 

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    ZJ

Very good custom metrics, dashboards, and alerts

  • September 18, 2024
  • Review from a verified AWS customer

What is our primary use case?

Our primary use case for Datadog involves utilizing its dashboards, monitors, and alerts to monitor several key components of our infrastructure.

We track the performance of AWS-managed Airflow pipelines, focusing on metrics like data freshness, data volume, pipeline success rates, and overall performance.

In addition, we monitor Looker dashboard performance to ensure data is processed efficiently. Database performance is also closely tracked, allowing us to address any potential issues proactively. This setup provides comprehensive observability and ensures that our systems operate smoothly.

How has it helped my organization?

Datadog has significantly improved our organization by providing a centralized platform to monitor all our key metrics across various systems. This unified observability has streamlined our ability to oversee infrastructure, applications, and databases from a single location.

Furthermore, the ability to set custom alerts has been invaluable, allowing us to receive real-time notifications when any system degradation occurs. This proactive monitoring has enhanced our ability to respond swiftly to issues, reducing downtime and improving overall system reliability. As a result, Datadog has contributed to increased operational efficiency and minimized potential risks to our services.

What is most valuable?

The most valuable features we’ve found in Datadog are its custom metrics, dashboards, and alerts. The ability to create custom metrics allows us to track specific performance indicators that are critical to our operations, giving us greater control and insights into system behavior.

The dashboards provide a comprehensive and visually intuitive way to monitor all our key data points in real-time, making it easier to spot trends and potential issues. Additionally, the alerting system ensures we are promptly notified of any system anomalies or degradations, enabling us to take immediate action to prevent downtime.

Beyond the product features, Datadog’s customer support has been incredibly timely and helpful, resolving any issues quickly and ensuring minimal disruption to our workflow. This combination of features and support has made Datadog an essential tool in our environment.

What needs improvement?

One key improvement we would like to see in a future Datadog release is the inclusion of certain metrics that are currently unavailable. Specifically, the ability to monitor CPU and memory utilization of AWS-managed Airflow workers, schedulers, and web servers would be highly beneficial for our organization. These metrics are critical for understanding the performance and resource usage of our Airflow infrastructure, and having them directly in Datadog would provide a more comprehensive view of our system’s health. This would enable us to diagnose issues faster, optimize resource allocation, and improve overall system performance. Including these metrics in Datadog would greatly enhance its utility for teams working with AWS-managed Airflow.

For how long have I used the solution?

I've used the solution for four months.

What do I think about the stability of the solution?

The stability of Datadog has been excellent. We have not encountered any significant issues so far.

The platform performs reliably, and we have experienced minimal disruptions or downtime. This stability has been crucial for maintaining consistent monitoring and ensuring that our observability needs are met without interruption.

What do I think about the scalability of the solution?

Datadog is generally scalable, allowing us to handle and display thousands of custom metrics efficiently. However, we’ve encountered some limitations in the table visualization view, particularly when working with around 10,000 data points. In those cases, the search functionality doesn’t always return all valid results, which can hinder detailed analysis.

How are customer service and support?

Datadog's customer support plays a crucial role in easing the initial setup process. Their team is proactive in assisting with metric configuration, providing valuable examples, and helping us navigate the setup challenges effectively. This support significantly mitigates the complexity of the initial setup.

Which solution did I use previously and why did I switch?

We used New Relic before.

How was the initial setup?

The initial setup of Datadog can be somewhat complex, primarily due to the learning curve associated with configuring each metric field correctly for optimal data visualization. It often requires careful attention to detail and a good understanding of each option to achieve the desired graphs and insights.

What about the implementation team?

We implemented the solution in-house.