
Datadog Pro

Datadog

Reviews from AWS customer

23 AWS reviews

External reviews

745 reviews

External reviews are not included in the AWS star rating for the product.


5-star reviews

    Accounting

Fast, Unified Observability That Speeds Up Production Root-Cause Analysis

  • January 21, 2026
  • Review provided by G2

What do you like best about the product?
What I like best about Datadog is how fast it helps teams understand what’s actually happening in production. The platform brings logs, metrics, traces, and real-time alerts into a single, intuitive view, so you’re not jumping between tools when something goes wrong. That unified observability makes it much easier to identify root causes quickly, especially in complex, distributed systems.
What do you dislike about the product?
Cost can escalate quickly. Pricing is usage-based, so as log volume, metrics, or hosts scale up, it’s easy for costs to grow faster than expected if usage isn’t closely monitored.
What problems is the product solving and how is that benefiting you?
Datadog is solving the core problem of not knowing what’s happening inside your systems when it matters most, and it benefits you by saving time, reducing stress, and helping you make better decisions.


    saurabh r.

All-in-One Monitoring with Real-Time Insights

  • January 08, 2026
  • Review provided by G2

What do you like best about the product?
It brings metrics, logs, traces, and alerts into a single, intuitive platform.
Real-time dashboards and powerful visualizations make it easy to identify issues quickly.
What do you dislike about the product?
Fine-tuning alerts and dashboards often takes time to avoid noise and false positives.
What problems is the product solving and how is that benefiting you?
It centralizes metrics, logs, traces, and alerts across applications, infrastructure, and cloud services in one platform.
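The alert-tuning effort mentioned in this review (avoiding noise and false positives) can be illustrated with a small sketch. It is not Datadog's monitor logic; it is a generic debounce rule, assuming an alert should fire only after several consecutive samples exceed a threshold. The function name, the threshold, and the sample data are all hypothetical.

```python
from typing import Iterable, List


def breaches_to_alerts(values: Iterable[float], threshold: float,
                       min_consecutive: int = 3) -> List[int]:
    """Return the sample indices at which an alert fires.

    An alert fires only once `min_consecutive` samples in a row exceed
    `threshold`, and it cannot fire again until the series recovers.
    """
    alerts: List[int] = []
    streak = 0        # current run of samples above the threshold
    firing = False    # True while an alert is already active
    for i, value in enumerate(values):
        if value > threshold:
            streak += 1
            if streak >= min_consecutive and not firing:
                alerts.append(i)
                firing = True
        else:
            streak = 0
            firing = False
    return alerts


# Hypothetical CPU utilisation samples (percent).
cpu = [50, 95, 60, 92, 93, 94, 96, 70, 91]

# Naive rule: every sample above 90 pages someone.
naive = [i for i, v in enumerate(cpu) if v > 90]

# Debounced rule: only a sustained breach pages someone.
debounced = breaches_to_alerts(cpu, threshold=90, min_consecutive=3)

print(naive)      # [1, 3, 4, 5, 6, 8]
print(debounced)  # [5]
```

The naive rule produces six notifications for this series, while the debounced rule produces one, at the cost of detecting the sustained breach two samples later.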


    Prasanth K.

Effortless Observability Across Platforms, Services and Integrations for Always-On Reliability

  • January 05, 2026
  • Review provided by G2

What do you like best about the product?
First, its integrations with hosts (Windows/Linux/macOS), cloud platforms (AWS and Azure, which I use, plus GCP), and container platforms (Docker, Kubernetes) benefit a lot of use cases, as Datadog serves as a one-stop shop. It supports multiple programming languages, so you can publish logs to Datadog directly from your application code without any extra tooling.

We have very reliable features like smart health checks and automated test suites, so we catch problems before they hit. On-call teams get instant alerts, incident triage, and even automated triage workflows, which help teams focus on fixing issues quickly and with less stress, with first-hand information readily available.

Dashboards with visualizations like line, bar, pie, and timeseries charts cater to different use cases, such as applications, infrastructure, and databases, making it easier to monitor performance. Datadog has become an integral part of our daily operations, helping us quickly spot anomalies and simplifying overall workflow management.

It is easy to set up: installing and configuring the agent (using a pre-built installation URL with an installation script) doesn't take more than 5 minutes. Users can build dashboards in under 15 minutes for a production-grade setup. [In general it is just 5 minutes, as publicized by Datadog.]

Datadog does have great customer support, but we rarely need it, as most things are documented and easy to set up or configure.
What do you dislike about the product?
In our current context, as our infrastructure or application footprint grows, storage costs increase proportionally and can become a major expense. If we need to retain data for extended periods, expect those costs to rise even further (so the storage requirement is the key factor).

Like other platforms, Datadog offers numerous integrations with third-party tools such as Slack, Microsoft Teams, and Jira. We initially used all of these channels, which led to increased costs, as each integration added resource usage as well as implementation complexity. We had to strip out some of them to manage cost and to match the purpose of applications at different environment levels.

There are so many options for the same purpose that, without proper guidance or a complete understanding of the use case, we may end up implementing more than is required. So purpose is key here.
What problems is the product solving and how is that benefiting you?
1. Datadog helps us by providing a complete picture of a problem with some initial details, and by giving us a single platform to monitor 50+ microservices across 40+ AWS accounts, so nothing slips through the cracks. Because we have first-hand information from automation or test suite logs, we know where to check, which leads to a shorter turnaround time.

2. It tackles incident management and response challenges with real-time alerts, on-call integration, and automated triage; identifying similar patterns, notes about the service, and resolution documents help us fix issues before they impact customers to a large extent. Its integration with almost all of the platforms we manage is a real value add.

3. Built-in health checks and test suites keep our systems in shape, while integrations with AWS, PagerDuty, Slack, and more make the whole workflow smooth and connected. Datadog eliminates tool silos and creates a smooth workflow for monitoring and incident resolution.

4. From service-level segregation to rich dashboards, Datadog turns most of our log data into simple insights for engineers and execs alike. Different dashboards at low level and higher level made our life easy from monitoring to presenting the data to higher-ups.
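The point in this review about publishing logs directly from application code can be sketched with the Python standard library alone. This is a minimal illustration, not Datadog's client library: it assumes the application writes one JSON object per log line and that a separately configured collector ships those lines onward. The logger name `checkout`, the `service` field, and the `JsonFormatter` class are hypothetical.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # `service` is an illustrative attribute passed via `extra=`.
            "service": getattr(record, "service", "unknown"),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("checkout")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()      # avoid duplicate handlers on re-run
    logger.propagate = False     # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# An in-memory stream stands in for stdout or a log file here.
buffer = io.StringIO()
log = build_logger(buffer)
log.info("payment accepted", extra={"service": "checkout-api"})

line = buffer.getvalue().strip()
print(line)
```

Emitting structured fields rather than free text is what later makes it possible to filter logs by attribute instead of by substring.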


    Abhishek Kumar M.

Real-Time Insights and Effortless Error Tracking

  • November 30, 2025
  • Review provided by G2

What do you like best about the product?
I like that Datadog shows issues in real time and the filters help me quickly find the key errors without digging through everything.
What do you dislike about the product?
Sometimes the dashboard feels a bit heavy, and it can take time to set up the right filters when the data gets large. Also, sometimes the errors are small, but it raises high-severity alerts for them.
What problems is the product solving and how is that benefiting you?
Datadog helps us catch issues fast during live esports events like the World Cup. By checking logs and errors in real time, we can fix problems quickly and keep everything running smoothly.


    Ajay V.

Unmatched Observability and AI Insights, Lightning-Fast Setup

  • November 26, 2025
  • Review provided by G2

What do you like best about the product?
Datadog gives us a single observability layer that ties metrics, logs, traces, and AI-driven insights together. What I like most is how fast it is to instrument new services, define custom metrics, and build dashboards that actually help teams make decisions. We also use Datadog extensively for deploying internal AI agents—its event streams, log ingestions, and metric pipelines make it easy to create intelligent triggers and automated workflows.
The correlation between logs → metrics → alerts is incredibly powerful, and the AI-based anomaly detection has helped us reduce blind spots in our observability stack.
What do you dislike about the product?
The biggest drawback is cost. Datadog becomes expensive very quickly—especially when log volumes grow or when you create many custom business metrics. Even with strict retention windows and log pipelines, the monthly bill requires constant governance. It’s a powerful platform, but the pricing model can be a challenge for teams that want broad coverage without compromising on granularity.
What problems is the product solving and how is that benefiting you?
Datadog helps us run a reliable observability platform. We use it to monitor application health, detect failures early, and define custom KPIs for product and engineering teams. It also acts as the signal layer for deploying AI agents—our automated workflows listen to Datadog metrics and logs to trigger alerts, escalations, and self-healing actions.

As a Product Manager, Datadog enables me to set clear SLOs, understand system behavior end-to-end, and reduce MTTD/MTTR significantly. It gives us a proactive observability mindset powered by intelligent metrics and real-time insights, helping us deliver a more stable and predictable platform for users.
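The MTTD and MTTR figures mentioned in this review can be made concrete with a short worked example. This is a sketch under stated assumptions: the incident timestamps are invented, and both metrics are measured from the moment the fault began (some teams instead measure MTTR from detection). The `Incident` class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Incident:
    started: datetime    # when the fault began
    detected: datetime   # when an alert fired
    resolved: datetime   # when service was restored


def mean_delta(deltas: List[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents: List[Incident]) -> timedelta:
    """Mean time to detect: fault start to detection."""
    return mean_delta([i.detected - i.started for i in incidents])


def mttr(incidents: List[Incident]) -> timedelta:
    """Mean time to resolve: fault start to resolution (assumed definition)."""
    return mean_delta([i.resolved - i.started for i in incidents])


# Two invented incidents on the same day.
day = datetime(2025, 11, 1)
incidents = [
    Incident(day.replace(hour=10, minute=0),
             day.replace(hour=10, minute=4),
             day.replace(hour=10, minute=34)),
    Incident(day.replace(hour=14, minute=0),
             day.replace(hour=14, minute=10),
             day.replace(hour=15, minute=0)),
]

print(mttd(incidents))  # 0:07:00
print(mttr(incidents))  # 0:47:00
```

Detection took 4 and 10 minutes, giving a mean of 7 minutes; resolution took 34 and 60 minutes, giving a mean of 47 minutes.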


    Johnny C.

Comprehensive and Dynamic Monitoring for a Complete End-to-End View

  • November 25, 2025
  • Review provided by G2

What do you like best about the product?
Its end-to-end monitoring, since it shows us everything we need to see, including what is consumed, along with all its logs, traces, and metrics, in a more dynamic way.
What do you dislike about the product?
Its handling of containers, which makes the price go up if not configured properly.
What problems is the product solving and how is that benefiting you?
Before, I had to check five different tools: one to see the CPU usage of my servers (Metrics), another to find out what caused the error (Logs), and another to see which part of the code slowed down (Traces). This is slow and makes me waste time jumping between tabs.


    Om T.

Comprehensive Monitoring, But Setup Could Improve

  • November 18, 2025
  • Review provided by G2

What do you like best about the product?
I love how Datadog integrates seamlessly into a multi-cloud platform, providing real-time metrics that are crucial for monitoring and observability. The highly customizable dashboards allow me to tailor the analytics to fit my exact needs while the fine-tuning alert options ensure that I am always informed without being overwhelmed. These combined features help significantly in tracking system performance metrics and logs, greatly aiding in root cause analysis (RCA).
What do you dislike about the product?
I find the setup of Datadog to be inefficient and somewhat challenging, making the initial configuration process moderately hard. Additionally, the cost of using Datadog is quite high, and it would be beneficial if there were discounts available for first-time users to ease the financial burden.
What problems is the product solving and how is that benefiting you?
I use Datadog for centralized monitoring and observability, tracking system performance metrics, logs, and root cause analysis. It integrates with multi-cloud platforms, provides real-time metrics, and features customizable dashboards and alert options.


    Entertainment

Comprehensive Monitoring Tool with Powerful Insights but High Costs

  • October 26, 2025
  • Review provided by G2

What do you like best about the product?
What I like best about Datadog is how seamlessly it brings together metrics, logs, and traces in one place. The dashboard is very intuitive, and it’s easy to set up real-time monitoring for applications and infrastructure. I also like how flexible it is — you can create custom dashboards, set alerts, and get deep visibility into performance issues quickly. It really helps in identifying bottlenecks before they impact users.
What do you dislike about the product?
The main downside of Datadog is its pricing — it can get quite expensive as your infrastructure and data volume grow. Managing costs can be tricky, especially when you’re monitoring multiple environments. Also, the interface, while powerful, can feel a bit overwhelming at first due to the number of features and options available. It takes some time to get fully comfortable navigating everything.
What problems is the product solving and how is that benefiting you?
Datadog helps us monitor our entire system — from backend services and APIs to frontend performance — all in one place. It gives real-time visibility into logs, metrics, and traces, which makes it much easier to detect and troubleshoot issues quickly. Thanks to Datadog, we’ve reduced downtime, improved application performance, and gained better insights into how different parts of our system interact. It really helps our team stay proactive instead of reactive when it comes to performance and reliability.


    Carson Waldrop

Has resolved user errors faster by reviewing behavior with replay features

  • October 17, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Datadog involves working on projects related to our sales reps in terms of registering new clients, and I've been using Datadog to pull up instances of them while they're beta testing our product that we're rolling out just to see where their errors are occurring and what their behavior was leading up to that.

I can't think of all of the specific details, but there was a sales rep who was running into a particular error message through their sales registration process, and they weren't giving us a lot of specific screenshots or other error information to help us troubleshoot. I went into Datadog and looked at the timestamp and was able to look at the actual steps they took in our platform during their registration and was able to determine what the cause of that error was. I believe if I remember correctly, it was user error; they were clicking something incorrectly.

One thing I've seen in my main use case for Datadog is an option that our team can add on, and it's the ability to track behavior based on the user ID. I'm not sure at this time if our team has turned that on, but I do think that's a really valuable feature to have, especially with the real-time user management where you can watch the replay. Because we have so many users that are using our platform, the ability to filter those replay videos based on the user ID would be so much more helpful. Especially in terms where we're testing a specific product that we're rolling out, we start with smaller beta tests, so being able to filter those users by the user IDs of those using the beta test would be much more helpful than just looking at every interaction in Datadog as a whole.

What is most valuable?

The best features Datadog offers are the replay videos, which I really find super helpful as someone who works in QA. So much of testing is looking at the UI, and being able to look back at the actual visual steps that a user is taking is really valuable.

Datadog has impacted our organization positively in a major way because not even just as a QA engineer having access to the real-time replay, but just as a team, all of us being able to access this data and see what parts of our system are causing the most errors or resulting in the most frustration with users. I can't speak for everybody else because I don't know how each other segment of the business is using it, but I can imagine just in terms of how it's been beneficial to me; I can imagine that it's being beneficial to everybody else and they're able to see those areas of the system that are causing more frustration versus less.

What needs improvement?

I think Datadog can be improved, but it's a question that I'm not totally sure what the answer is. Being that my use case for it is pretty specific, I'm not sure that I have used or even really explored all of the different features that Datadog offers. So I'm not sure that I know where there are gaps in terms of features that should be there or aren't there.

I will go back to just the ability to filter based on user ID as an option that has to be set up by an organization, but I would maybe recommend that being something part of an organization's onboarding to present that as a first step. I think as an organization gets bigger or even if the organization starts using Datadog and is large, it's going to be potentially more difficult to troubleshoot specific scenarios if you're sorting through such a large amount of data.

For how long have I used the solution?

I have been working in this role for a little over a year now.

What do I think about the stability of the solution?

As far as I can tell, Datadog has been stable.

What do I think about the scalability of the solution?

I believe we have about 500 or so employees in our organization using our platform, and Datadog seems to be able to handle that load sufficiently, as far as I can tell. So I think scalability is good.

How are customer service and support?

I haven't had an instance where I've reached out to customer support for Datadog, so I do not know.

How would you rate customer service and support?

Which solution did I use previously and why did I switch?

I do not believe we used a different solution previously for this.

What was our ROI?

I cannot say whether I have seen a return on investment; I'm not part of the leadership that makes that decision. Regarding time saved, in my specific use case as a QA engineer, I would say that Datadog probably didn't save me a ton of time, because there are so many replay videos that I had to sort through in order to find the particular sales reps I'm looking for in our beta test group. That's why I think the ability to filter videos by user ID would be so much more helpful. I believe features like that would provide a lot of time savings, enabling you to really narrow down and filter the type of frustration or user interaction you're looking for. But in regards to your specific question, I don't think that's something I'm totally qualified to answer.

Which other solutions did I evaluate?

I was not part of the decision-making process before choosing Datadog, so I cannot speak to whether we evaluated other options.

What other advice do I have?

Right now our users are in the middle of the beta test. At the beginning of rolling the test out, I probably used the replay videos more just as the users were getting more familiar with the tool. They were probably running into more errors than they would be at this point now that they're more used to the tool. So it kind of ebbs and flows; at the beginning of a test, I'm probably using it pretty frequently and then as it goes on, probably less often.

It does help resolve issues faster, especially because our sales reps are used to working really quickly in terms of the sales registration, as they're racing through it. They're more likely to accidentally click something or click something incorrectly and not fully pay attention to what they're doing because they're just used to their flow. Being able to go back and watch the replay and see that a person clicked this button when they intended to click another button, or to identify the action that caused an error, is better than going off of their memory.

I have not noticed any measurable outcomes in terms of reduction in support tickets or faster resolution times since I started using Datadog. For myself, looking at the users in our beta test group, none of those came as a result of any sort of support ticket. They came from messages in Microsoft Teams with all the people in the beta group. We have seen fewer messages in relation to the beta test because the users are more familiar with the tool. Now that they know there might be differences between their usual flow and their flow during the beta test, they are sending fewer messages, because they are probably being more careful or they've figured out those inflection points that would result in an error.

My biggest piece of advice for others looking into using Datadog would be to use the filters based on user ID; it will save so much time in terms of troubleshooting specific error interactions or occurrences. I would also suggest having a UI that's more simple for people that are less technical. For example, logging into Datadog, the dashboard is pretty overwhelming in terms of all of the bar charts and options; I think having a more simplified toggle for people that are not looking for all of the options in terms of data, and then having a more technical toggle for people that are looking for more granular data, would be helpful.

I rate Datadog 10 out of 10.


    Corey Peoples

Has improved our ability to identify cloud application issues quickly using trace data and detailed log filtering

  • October 16, 2025
  • Review from a verified AWS customer

What is our primary use case?

My team and I primarily rely on Datadog for logs to our application to identify issues in our cloud-based solution, so we can take the requests and information that's being presented as errors from our customers and use it to identify what the errors are within our back-end systems, allowing us to submit code fixes or configuration changes.

I had an error when I was trying to submit an API request this morning that just said unspecified error in the web interface. I took the request ID and filtered a facet of our logs to include that request ID, and it gave me the specific examples, allowing me to look at the code stack that we had logged to identify what specifically it was failing to convert in order to upload that data.

My team doesn't utilize Datadog logs very often, but we do have quite a few collections of dashboards and widgets that tell us the health of the various API requests that come through our application to identify any known issues with some of our product integrations. It's useful information, but it's not necessarily stuff that our team monitors directly as we're more of a reactionary team.

What is most valuable?

The best features Datadog offers, in my experience, are the ability to filter down by facets very quickly to identify the problems we're experiencing with our individual customers using our cloud application. I really enjoy the trace option so that I can see all of the various components and how they communicate with each other to see where the failures are occurring.

The trace option helps us spot issues by giving access to see if the problem is occurring within our Java components or if it's a result of the SQL queries, allowing us to look at the SQL queries themselves to identify what information it's trying to pull. We can also look at other integrations, whether that's serverless Lambda functions or different components from our outreach.

Datadog has impacted our organization positively because the general feeling is that it's superior to the ELK stack that we used to use, being significantly faster in searching and filtering the information down, as well as providing links to our search criteria that our development teams and cloud operations teams can use to look at the same problems without having to set up their own search and filter criteria.

What needs improvement?

For the most part, the issues that we come across with Datadog are related to training for our organization. Our development and operations teams have done a really good job of getting our software components into Datadog, allowing us to identify them. However, we do have reduced logging in our Datadog environment due to the amount of information that's going through.

The hardest thing we experience is just training people on what to search for when identifying a problem in Datadog, and having some additional training that might be easily accessible would probably be a benefit.

At this point, I do not know what I don't know, so while there may be options for improvements, Datadog works very well for the things that we currently use it for. Additionally, the extra training that would be more easily accessible would be extremely helpful, perhaps something within the user interface itself that could guide us on useful information or how to tie different components or build a good dashboard.

For how long have I used the solution?

I have worked for Calabrio for 13 years.

What do I think about the stability of the solution?

Datadog is very stable.

What do I think about the scalability of the solution?

Datadog's scalability is strong; we've continued to significantly grow our software, and there are processes in place to ensure that as new servers, realms, and environments are introduced, we're able to include them all in Datadog without noticing any performance issues. The reporting and search functionality remain just as good as when we had a much smaller implementation.

Which solution did I use previously and why did I switch?

Previously, we used the ELK stack—Elasticsearch, Logstash, and Kibana—to capture data. Our cloud operations team set that up because they were familiar with it from previous experiences. We stopped using it because as our environment continued to grow, the response times and the amount of data being kept reached a point where we couldn't effectively utilize it, and it lacked the capability to help us proactively identify issues.

What other advice do I have?

A general impression is that Datadog saves time because the ability to search, even over the vast amount of AWS realms and time spans that we have, is significantly faster compared to other solutions that I've used that have served similar purposes.

I would advise others looking into using Datadog to identify various components within their organization that could benefit from pulling that information in and how to effectively parse and process all of it before getting involved in a task, so they know what to look for. Specifically, when searching for data, if a metric can be pulled out into an individual facet and used, the amount of filtering that can be done is significantly improved compared to a general text search.
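The advice above about pulling a value out into its own facet, rather than relying on general text search, can be sketched in a few lines. This does not reproduce Datadog's query engine; it only contrasts an exact match on a structured field with a substring match on the message. The log records, the `request_id` field, and the helper names are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Invented structured log records.
logs = [
    {"request_id": "req-1", "status": "ok",
     "message": "upload complete"},
    {"request_id": "req-2", "status": "error",
     "message": "failed to convert field 'date'"},
    {"request_id": "req-2", "status": "error",
     "message": "unspecified error returned to client"},
    {"request_id": "req-3", "status": "error",
     "message": "timeout while handling req-2 retry"},
]


def text_search(records: List[dict], term: str) -> List[dict]:
    """Substring match on the free-text message only."""
    return [r for r in records if term in r["message"]]


def build_facet_index(records: List[dict], facet: str) -> Dict[str, List[dict]]:
    """Group records by the exact value of one structured field."""
    index: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        index[record[facet]].append(record)
    return index


by_request = build_facet_index(logs, "request_id")
facet_hits = by_request["req-2"]        # exact match on the facet
text_hits = text_search(logs, "req-2")  # substring match on the message

print([r["message"] for r in facet_hits])
print([r["request_id"] for r in text_hits])
```

In this sample the facet lookup returns both records that belong to `req-2`, while the text search misses them and instead returns a record from `req-3` that merely mentions the ID.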

I would love to figure out how to use Datadog more effectively in the organization work that I do, but that is a discussion I need to have with our operations and research and development teams to determine if it can benefit the customer or the specific implementation software that I work with.

On a scale of one to ten, I rate Datadog a ten out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?