Has enabled our teams to detect application errors faster and shift company mindset toward proactive monitoring
What is our primary use case?
My main use case for Datadog is application monitoring.
Specifically for application monitoring, we monitor our production Laravel instances using APM spans and tracing.
In addition to application monitoring, I also use Datadog for log management across our applications, which run both on-prem and in the cloud, using the AWS integration.
What is most valuable?
In my experience, the best features that Datadog offers us include unprecedented visibility and the ability to dive deep on application debugging.
Datadog's visibility and debugging features help me day-to-day; specifically, we had an application that was throwing a bunch of errors causing an issue in our production database. Using Datadog, we were able to immediately isolate the error and plan around it.
Datadog has positively impacted my organization. I think it has given us not only the specific debug and error codes that we're looking for, but it has changed the entire company's mindset in how to extract value from data that's been lying around in our internal systems for years now and given everybody a new perspective on monitoring and debugging.
Since adopting Datadog, I've noticed specific outcomes. We've begun to handle our log management internally in a more efficient manner, so we've actually reduced our disk space usage and simplified our backup procedures and process chains using Datadog. Now that we have extracted the value from the logs, the traces, and the debug logs, we no longer have to rely so much on traditional text-based logs or even dig into the code and the error files themselves.
What needs improvement?
The only improvement I would like to see with Datadog is that the graphical user interface sometimes takes a little while to load, especially when diving deep on a subject, and just a little bit more caching would help.
The largest pain point we've had with Datadog to this point was onboarding. This was partly our fault because our logs weren't really set up to be used in a modern observability platform like Datadog, but I definitely would have liked to have seen more comprehensive onboarding. We had a few appointments, but the more help we get up front, the easier it is for us to get more familiar and do more things with Datadog.
At this time, I do not think there are any other improvements Datadog needs that would make my experience even better.
For how long have I used the solution?
I have been using Datadog for approximately four months now.
What do I think about the stability of the solution?
What do I think about the scalability of the solution?
We have not yet hit a use case that lets us evaluate Datadog's scalability, but based on everything else we've used with the infrastructure, I don't think there are going to be any issues with it. We did, as a trial, engage the AWS integration, and it immediately found all of our AWS resources and presented them to us. In fact, it also surfaced costing and billing information, which we had not anticipated but were pleasantly surprised by.
How are customer service and support?
Customer support is excellent; I have opened and closed probably five tickets within the past seven days. The support techs are knowledgeable and very responsive.
I would rate customer support an eight out of ten. The only issue we had was needing more educational resources at the start to truly understand the specifics of log management and APM tracing setup, simply because those are very complicated procedures. Walking through that a couple more times with the support engineer probably would have been helpful. It was not a deal breaker or a significant pain point, but the quicker and deeper we get with Datadog, the happier people seem to be at our organization.
Overall, the entire Datadog comprehensive experience of support, onboarding, getting everything in there, and having a good line of feedback has been exceptional. I've been in the industry over 20 years, and part of my roles has always been customer-facing. I find that Datadog's client support is very engaging, comprehensive, and thorough.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
For on-prem infrastructure monitoring, we're currently using Nagios, but that's beginning to fade as we rely more on Datadog for our infrastructure monitoring. We had used New Relic for application performance monitoring, but because of the cost associated with that and not seeing the value from it, we stopped using that about two years ago.
How was the initial setup?
We did not purchase Datadog through the AWS Marketplace; we were contacted independently by a Datadog sales agent.
My experience with pricing, setup cost, and licensing has been overall fairly positive. With the on-demand/reserved pricing, we were not as cognizant as we should have been of how large the on-demand charges could get, especially while we were getting everything set up, but Datadog proactively took a strong hand in guiding us to get our costs under control. I'm proud to say that we are within 1% of our projected cost budget, which was very handy and has happened in the last month. Working with Datadog to control cost has been very efficient and very effective.
What was our ROI?
In terms of time saved, I've noticed that when we're responding to potential errors or during our software deployments, it's saving us minutes at a time that quickly add up to hours, that quickly add up to days in terms of retrieving debug and application error information.
Which other solutions did I evaluate?
Before choosing Datadog, we evaluated other options including New Relic and SolarWinds.
What other advice do I have?
I would advise others looking into using Datadog to evaluate it against other competing properties and applications in the space, and really dig in. You will find that Datadog does what it's supposed to do very quickly and very efficiently, and does it more cost-competitively than some of the other offerings.
Datadog is deployed in my organization in both on-prem and public cloud scenarios.
On a scale of one to ten, I rate Datadog a nine overall.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
User sessions have been monitored effectively and beta user frustration points are now identified through behavioral insights
What is our primary use case?
I think the most important feature for me in Datadog is the RUM features.
I check the efficiency of the applications that I'm supporting in Datadog and also use it to view the sessions of users.
I have some trouble doing troubleshooting in our app currently, but RUM is my main use case in Datadog.
What is most valuable?
The personalized dashboards and alerting in Datadog stand out to me, so that way you can gear your use of the product towards what's important to you.
Datadog has allowed us to look at how our beta testers are using our new UIs and see where their frustration points are, which has been important to us.
We've been using the heat map feature in Datadog to measure those frustration points.
What needs improvement?
Some templates for certain roles and things that users care about could be auto-suggested for a dashboard or alerting in Datadog.
We had limitations around RUM and our feature flag provider in Datadog because our Next.js application resolves feature flags primarily on the back end. We had trouble hooking up our feature flags because RUM is client-side only. This issue arose because Next.js spans both the front end and the back end, and it would be beneficial to be able to send the feature flag resolution from the back end when needed. Our feature flag provider is GrowthBook, and getting those feature flags into Datadog would have been time-consuming and required a lot of boilerplate. We would have had to mimic feature flag resolution on the client side, so we decided to forgo that.
For how long have I used the solution?
We have been using Datadog for about two or three months.
What do I think about the stability of the solution?
Datadog seems stable in my experience without any downtime or reliability issues.
What do I think about the scalability of the solution?
Datadog is scalable and I don't think we'll have problems with scalability in terms of our use case. We might face limitations with logs, but I feel we would not be reaching any of Datadog's limits.
How are customer service and support?
The customer support has been one of the best parts of Datadog.
I would rate the customer support from Datadog a 10 on a scale of 1 to 10.
I would suggest staying in close contact with your customer support representative to get the most out of Datadog.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
We did not have a different solution before Datadog.
How was the initial setup?
Setup with Datadog was pretty easy.
What was our ROI?
It is too early to tell if we've seen a return on investment so far with Datadog.
What's my experience with pricing, setup cost, and licensing?
I'm not clear on pricing, but it's not a huge concern for us at the moment in terms of RUM. For the other pieces, I know that there may be some pricing that they've been looking at for APM and logs.
Which other solutions did I evaluate?
I did not evaluate other options before choosing Datadog.
What other advice do I have?
I personally don't use the personalized dashboards and alerting, but I've seen some nice use cases from others on my team. On a scale of 1-10, I rate Datadog an 8.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Efficient and reliable.
What do you like best about the product?
Extremely reliable.
Ability to fine tune Custom metrics and logs ingested -> this helps to control the cost.
Many features integrated together multiplying the efficiency of Datadog as a global Observability solution.
It is easy to implement and to use.
Integration with 3rd parties is most of the time straightforward.
Scales well with a large organisation.
Dashboards and queries are very responsive even with a very large amount of data.
The Datadog team is very responsive; they have already implemented most of the feature requests we suggested (over 10 in a year), which is impressive.
Support is also responsive and we have most of our issues solved in a reasonable amount of time.
The platform is used intensively by the developers across our organisation.
What do you dislike about the product?
I don't like Zendesk; it is a poor way to interact with support compared to the Datadog platform itself.
Terraform and APIs often take longer to catch up with new features.
What problems is the product solving and how is that benefiting you?
Because the Datadog platform covers a wide range of functionality, it helps us consolidate our Observability tool suite, saving cost and time.
The fact that it's easy to use helps a lot with our application monitoring and our incident management.
Has created intuitive dashboards and streamlined monitoring across teams
What is our primary use case?
Our main use case for Datadog is collecting metrics, specifically things such as latency metrics and error metrics for our services at Procore.
To give a specific example of how I use Datadog for those metrics in my daily work, I had to create a new service to solve a particular problem, which was an API. I used Datadog to get metrics around successful requests, failure requests, and 400 requests. I then created dashboards that showed those metrics along with some latency metrics from the API, and I also built a monitor that triggers and sends an alert whenever we're over a certain number of the failure metrics.
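As a rough illustration of the workflow described above, here is a minimal Python sketch. It does not use the Datadog Python library itself; it only classifies responses into the three buckets mentioned (successful, failure, and 400-class requests), formats counters in the DogStatsD datagram layout, and applies a simple threshold check similar in spirit to a monitor. The metric name `my_api.requests`, the `outcome` tag, and the helper names are all hypothetical and chosen for illustration only.

```python
from collections import Counter


def classify(status_code: int) -> str:
    """Map an HTTP status code onto the three buckets named in the review."""
    if status_code >= 500:
        return "failure"
    if status_code >= 400:
        return "client_error"
    return "success"


def dogstatsd_counter(metric: str, value: int, tags: dict[str, str]) -> str:
    """Format a counter in the DogStatsD datagram layout: name:value|c|#k:v,..."""
    tag_str = ",".join(f"{key}:{val}" for key, val in sorted(tags.items()))
    return f"{metric}:{value}|c|#{tag_str}"


def monitor_triggered(failure_count: int, threshold: int) -> bool:
    """Hypothetical stand-in for a monitor: alert when failures exceed a threshold."""
    return failure_count > threshold


# Sample responses from the (hypothetical) API.
statuses = [200, 201, 400, 500, 503, 404, 200]
counts = Counter(classify(code) for code in statuses)

# One datagram per outcome; "my_api.requests" is an assumed metric name.
datagrams = [
    dogstatsd_counter("my_api.requests", count, {"outcome": outcome})
    for outcome, count in sorted(counts.items())
]

for line in datagrams:
    print(line)
print("alert:", monitor_triggered(counts["failure"], threshold=1))
```

In a real setup the counters would be sent through the Datadog client and the threshold would live in a monitor definition rather than in application code; the sketch only shows the shape of the data involved.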
How has it helped my organization?
The single biggest improvement has been breaking down the silos between our teams. Before we adopted it, our developers, operations, and SRE teams all lived in separate tools. Ops had their infrastructure graphs, Devs had their log files, and no one had a complete picture.
Here’s where we’ve seen the most significant impact:
- We Find and Fix Problems Drastically Faster: The "single pane of glass" is a real thing for us. When an alert fires, our on-call engineer can see the infrastructure metric spike (like CPU), pivot directly to the application traces (APM) running on that host, and see the exact, correlated logs from the services causing the problem, all in one place. We've cut our Mean Time to Resolution (MTTR) significantly because we're no longer "swivel-chairing" between three different tools trying to manually line up timestamps.
- We Are More Proactive and Less Reactive: Features like Watchdog (its anomaly detection) have been crucial. We've been alerted to a slow-building memory leak and an abnormal spike in error rates on a specific API endpoint before they breached our static thresholds and caused a user-facing outage. It's helped us move from a "firefighting" culture to one where we can catch problems before they escalate.
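To make the difference between a static threshold and anomaly detection concrete, here is a small Python sketch. It is not how Watchdog works (that algorithm is not described in this review); it swaps in a plain rolling z-score, which is one of the simplest ways to flag a value that is unusual relative to its recent history. The sample error rates, the window size, and the z-score limit are all made-up values for illustration.

```python
from statistics import mean, stdev


def static_breaches(values, threshold):
    """Indices where a value exceeds a fixed threshold."""
    return [i for i, value in enumerate(values) if value > threshold]


def zscore_anomalies(values, window=5, z_limit=3.0):
    """Indices where a value deviates strongly from the preceding window."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: the z-score is undefined, so skip
        if abs(values[i] - mu) / sigma > z_limit:
            flagged.append(i)
    return flagged


# Hypothetical error rate (percent) per minute for one API endpoint.
error_rate = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 2.5, 1.0]

print(static_breaches(error_rate, threshold=5.0))  # the spike stays under 5%
print(zscore_anomalies(error_rate))                # the spike stands out from its history
```

With these sample values the static 5% threshold never fires, while the rolling z-score flags the jump to 2.5%, which is the kind of early signal described above.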
What is most valuable?
The best features of Datadog include a great dashboard, a super simple and easy to use Python library, and an easy monitor, which together provide a really great UI experience.
What makes the dashboard and Python library stand out for me is that they save a lot of time, getting right to the point and being super intuitive.
Datadog has positively impacted my organization by allowing us to have a link to a dashboard for most services.
We have dashboards across the company, which can easily be passed around, making it super easy for everyone to understand the metrics they are looking at.
What needs improvement?
Oh, that's a great question. We actually have a running list of things we'd love to see. Even though we get a ton of value from it, no tool is perfect. Our feedback generally falls into two categories: making the current experience less painful and adding new capabilities we think are the logical next step.
Honestly, our biggest frustrations aren't about a lack of features, but about the management of the platform itself.
- Cost Predictability and Governance: This is, without a doubt, our number one issue. It's not just that Datadog is expensive; it's that the cost is incredibly complex and hard to predict. Our bill can fluctuate wildly based on custom metrics, log ingestion, and traces from a new service. We've had to dedicate engineering time just to managing our Datadog costs, creating exclusion filters, and sampling aggressively, which feels like we're being punished for using the product more.
- How to improve it: We need a "cost calculator" inside the platform. Before I enable monitoring on a new cluster or turn on a new integration, I want Datadog to give me a concrete estimate of what it will cost. We also need better built-in tools for attributing costs back to specific teams or services before the bill arrives.
- The Steep Learning Curve and UI Density: The UI is incredibly powerful, but it's dense. For a senior SRE who lives in the tool all day, it's fine. For a new engineer or a developer who only jumps in during an incident, it's overwhelming. We've seen people "click in circles" trying to find a simple stack trace that's buried three layers deep. Building a "perfect" dashboard is still too much of an art form.
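The aggressive sampling mentioned under cost governance can be sketched in a few lines of Python. This is only an assumed approach, not the reviewer's actual configuration and not a Datadog API: it keeps every error event and a deterministic fraction of everything else by hashing the trace ID, so all events from the same trace are kept or dropped together. The function name, the `level` values, and the 10% rate are hypothetical.

```python
import hashlib


def keep_event(trace_id: str, level: str, sample_rate: float) -> bool:
    """Always keep errors; keep a deterministic fraction of everything else."""
    if level == "error":
        return True
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    # Integer comparison avoids float rounding at the edges (rates 0.0 and 1.0).
    return bucket < int(sample_rate * 2**64)


# Roughly 10% of non-error events survive; every error survives.
events = [(f"trace-{i}", "info") for i in range(10_000)]
kept = sum(keep_event(trace_id, level, 0.10) for trace_id, level in events)
print(f"kept {kept} of {len(events)} info events")
print(keep_event("trace-42", "error", 0.0))
```

Hashing rather than random sampling means the decision is repeatable across services, which matters when the logs and traces for one request need to stay correlated.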
For how long have I used the solution?
I have been using Datadog for about five years.
What do I think about the stability of the solution?
Which solution did I use previously and why did I switch?
I did not previously use a different solution.
How was the initial setup?
I did not deal with any of the pricing, setup cost, or licensing.
What about the implementation team?
I do not know if we purchased Datadog through the AWS Marketplace.
What other advice do I have?
My advice to others looking into using Datadog is to just try using it and see how easy it is to use. I found this interview great. On a scale of 1-10, I rate Datadog a 10.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Has improved visibility into performance metrics and helped reduce cloud spend
What is our primary use case?
My main use case for Datadog is dashboards and monitoring.
We use dashboards and monitoring with Datadog to monitor the performance of our Nexus Artifactory system and make sure the services are running.
What is most valuable?
The best features Datadog offers are the dashboarding tools as well as the monitoring tools.
What I find most valuable about the dashboarding and monitoring tools in Datadog is the ease of use and simplicity of the interface.
Datadog has positively impacted our organization by allowing us to look at things such as Cloud Spend and make sure our services are running at an optimal performance level.
We have seen specific outcomes such as cost savings by utilizing the cost utilization dashboards to identify areas where we could trim our spend.
What needs improvement?
To improve Datadog, I suggest they keep doing what they're doing.
Newer features using AI to create monitors and dashboards would be helpful.
For how long have I used the solution?
I have been using Datadog for six years.
What do I think about the stability of the solution?
What do I think about the scalability of the solution?
I am not sure about Datadog's scalability.
How are customer service and support?
Customer support with Datadog has been great when we needed it.
I rate the customer support a nine on a scale of 1 to 10.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
We did not previously use a different solution.
What was our ROI?
In terms of return on investment, there is a lot of time saved from using the platform.
What's my experience with pricing, setup cost, and licensing?
I was not directly involved in the pricing, setup cost, and licensing details.
Which other solutions did I evaluate?
Before choosing Datadog, we evaluated other options such as Splunk and Grafana.
What other advice do I have?
I rate Datadog an eight out of ten because the expense of using it keeps it from being a nine or ten.
My advice to others looking into using Datadog is to brush up on their API programming skills.
My overall rating for Datadog is eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?