Effortless Observability Across Platforms, Services and Integrations for Always-On Reliability
What do you like best about the product?
First, its integration capabilities across hosts (Windows/Linux/macOS), cloud platforms (AWS and Azure, which I use, plus GCP and others), and container platforms (Docker, Kubernetes) benefit a lot of use cases, so DD serves as a one-stop shop. It supports multiple programming languages, so we can publish logs to Datadog directly from application code without any extra tooling.
We have very reliable features like smart health checks and automated test suites, so we catch problems before they hit. On-call teams get instant alerts, incident triage, and even automated triage workflows, which let teams focus on fixing issues quickly and without stress, with first-hand information readily available.
Dashboards with visualizations like line, bar, pie, and timeseries charts cater to different use cases, such as applications, infrastructure, and databases, making it easier to monitor performance. DD has become an integral part of our daily operations, helping us quickly spot anomalies and simplifying overall workflow management.
It's easy to set up, install, and implement the agent configuration (a pre-designed installation URL with an installation script); it doesn't take more than 5 minutes. Users can readily build dashboards in under 15 minutes for a prod-grade setup. [In general it's just 5 minutes, as publicised by Datadog.]
DD does have great customer support, but we rarely need it, as most things are documented and easy to set up or configure.
What do you dislike about the product?
In our current context,
As our infrastructure or application footprint grows, storage costs increase proportionally and can become a major expense. If we need to retain data for extended periods, expect those costs to rise even further (so deciding what actually needs to be stored is key).
Like other platforms, Datadog offers numerous integrations with third-party platforms such as Slack, Microsoft Teams, and Jira. We initially leveraged all of these channels, which led to increased costs, as each integration added resource usage along with implementation complexity. We had to strip some of them out to manage cost and to match the purpose of applications at different environment levels.
There are so many options for the same purpose that, without proper guidance or a complete understanding of the use case, we may end up implementing more than is required. So purpose is key here.
What problems is the product solving and how is that benefiting you?
1. Datadog gives us the complete picture of a problem, with some initial details, on a single platform that monitors 50+ microservices across 40+ AWS accounts, so nothing slips through the cracks. Because we have first-hand information from automation or test suite logs, we know where to check, which reduces turnaround time.
2. It tackles incident management and response challenges with real-time alerts, on-call integration, and automated triage. Identifying similar patterns, along with notes on the service and resolution documents, helps us fix issues before they impact customers to a large extent. Its integration with almost all of the platforms we manage is a real value add.
3. Built-in health checks and test suites keep our systems in shape, while integrations with AWS, PagerDuty, Slack, and more make the whole workflow smooth and connected. Datadog eliminates tool silos and creates a smooth workflow for monitoring and incident resolution.
4. From service-level segregation to rich dashboards, Datadog turns most of our log data into simple insights for engineers and execs alike. Having separate low-level and high-level dashboards has made our lives easier, from monitoring to presenting the data to higher-ups.
Unified observability has improved incident response and now reduces downtime across environments
What is our primary use case?
My main use case for Datadog is unified observability: I use it to correlate metrics, traces, and logs in a single pane of glass to ensure the health and security of our cloud infrastructure and applications.
I correlate those metrics, traces, and logs using the Service Map to visualize dependencies between our microservices. For example, during a latency spike, I can instantly see whether there is a bottleneck in a specific database query or a downstream API, which allows me to route the issue to the right team immediately.
What is most valuable?
Datadog is an incredibly powerful daily driver for any engineer, and the recent addition of LLM observability for AI apps and Cloud Security Management makes it feel like a platform that is truly keeping up with modern tech trends. The dashboarding and alert integrations are great features, giving us all the required information on a single screen, and the alert integration does its job very well.
Datadog has positively impacted our organization by eliminating what I call tool sprawl: it replaced four or five separate monitoring tools with one unified platform. This has improved our MTTR and broken down silos between Dev and Ops teams.
Since Datadog was introduced, our response time after an alert fires has improved, so alerts are handled in less time and routed to the teams who take the required actions. This has had a very positive effect on our overall working culture.
What needs improvement?
Datadog is a platform that can be improved by making its pricing more predictable, as sometimes it is difficult to forecast exactly how much a new project will cost until after we have started ingesting the data.
When it comes to the documentation, we do not have much available right now, so if Datadog can improve the documentation part, it would really help the engineers to work on this.
Datadog is the most comprehensive observability tool on the market, and it only loses two points because the pricing for log ingestion can grow quickly if we do not carefully manage our filters.
For how long have I used the solution?
I have been using Datadog for about three years to monitor our cloud-native application and infrastructure across multiple environments.
What do I think about the stability of the solution?
Datadog is extremely stable, as it is built for highly scalable environments and consistently maintains high availability, which is why I trust it as our primary monitoring tool.
What do I think about the scalability of the solution?
Datadog is built for hyperscale, as it automatically scales when we add new hosts or containers, and its Monitoring as Code approach via Terraform allows us to scale our monitoring setup instantly as our infrastructure grows.
How are customer service and support?
Their technical documentation is some of the best in the industry, and their support engineers are very proactive, helping us optimize the ingestion cost.
Which solution did I use previously and why did I switch?
I previously used a mix of open-source tools like Prometheus and Grafana, and I switched because manual upkeep was too high and I needed a platform that could handle logs and traces alongside metrics without having to manage the backend storage.
How was the initial setup?
Buying Datadog through the AWS Marketplace was seamless and helped me meet AWS spending commitments, and while Datadog's custom metric pricing can be complex, the setup cost is very low because the agent is easy to deploy.
What was our ROI?
I have seen a strong ROI through a thirty percent reduction in downtime and significant cost savings from identifying under-utilized cloud resources, for example idle EC2 instances, through their cloud cost management.
Which other solutions did I evaluate?
I evaluated New Relic, Dynatrace, and Amazon CloudWatch before choosing Datadog, and I chose Datadog because of its massive library of over seven hundred integrations and its superior user interface, which is easier for our developers to use daily.
What other advice do I have?
My biggest advice is to set up ingestion rules and filters early. You should not send all your logs and metrics at once, and being selective about what you need to store can maximize your ROI from day one. I would rate Datadog an eight out of ten.
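To illustrate the advice about being selective, here is a minimal sketch of filtering at the application side, before anything is shipped. It uses only Python's standard `logging` module and does not call any Datadog API. The names `SelectiveFilter`, `ListHandler`, `build_logger`, and the `keep_every` parameter are hypothetical, and `ListHandler` is only a stand-in for whatever handler actually forwards logs to the monitoring backend.

```python
import logging


class SelectiveFilter(logging.Filter):
    """Keep WARNING and above; keep only every Nth lower-severity record."""

    def __init__(self, keep_every: int = 10):
        super().__init__()
        self.keep_every = keep_every  # hypothetical sampling knob
        self._seen = 0                # count of lower-severity records seen

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True  # never drop warnings or errors
        self._seen += 1
        return self._seen % self.keep_every == 0


class ListHandler(logging.Handler):
    """Stand-in for the handler that would ship logs to a backend."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record.getMessage())


def build_logger(keep_every: int = 10):
    """Return a logger whose only handler applies the selective filter."""
    logger = logging.getLogger("selective-demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers.clear()
    handler = ListHandler()
    handler.addFilter(SelectiveFilter(keep_every))
    logger.addHandler(handler)
    return logger, handler


if __name__ == "__main__":
    logger, handler = build_logger(keep_every=10)
    for i in range(100):
        logger.info("routine event %d", i)
    logger.error("payment service timeout")
    print(len(handler.records), "of 101 records would be shipped")
```

Sampling by count rather than at random keeps the behaviour repeatable; in practice the same idea can also be applied in the agent or in the platform's own ingestion settings instead of in application code.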
Which deployment model are you using for this solution?
Hybrid Cloud
Comprehensive Monitoring with Easy Setup
What do you like best about the product?
I really like how detailed the log traces can be in Datadog, and how I can search for specific logs based on labels and facets. Setting up Datadog agents was also very easy.
What do you dislike about the product?
Pricing can become really expensive at scale, especially when log ingestion and custom metrics are not carefully managed. It would really be nice to be able to view a cost dashboard, as I don't think Datadog has that feature.
What problems is the product solving and how is that benefiting you?
I use Datadog to gain insights into application metrics and monitor key metrics like memory and CPU usage. It also provides visibility into services deployed across clouds like GCP and AWS.
Unmatched Reliability and Observability You Can Trust
What do you like best about the product?
What I appreciate most about DataDog are its reliability and observability. Even during major global outages, such as those affecting Cloudflare and AWS, DataDog continued to perform without any issues. The APM, RUM, and Synthetic Checks are exceptionally dependable, giving me the confidence to choose DataDog without hesitation.
What do you dislike about the product?
I wish the UI had a more modern look, as it currently feels like I'm using an old inventory management system from the 2000s.
What problems is the product solving and how is that benefiting you?
DataDog has been highly beneficial for us in many ways, from monitoring synthetic checks to tracking APM traces. The platform provides a map of interconnected, dependable services, allowing us to view latency and error rates all in one central location. The alerting mechanism is easy to use, supports numerous integrations, and consistently works smoothly.
Metrics hub for problem solving
What do you like best about the product?
It collects all important monitoring information in one place, so if a problem arises it is faster to find the cause, as every metric is collected.
What do you dislike about the product?
I find its interface a bit outdated and unintuitive.
What problems is the product solving and how is that benefiting you?
It provides a hub that collects a lot of data points from different applications in one place, which makes them much easier to access and makes it easier to set up notifications based on multiple data sources.
Unified Monitoring (APM) That Accelerates Issue Diagnosis and Incident Resolution
What do you like best about the product?
Datadog brings infrastructure, applications, logs, and security signals together in one place, which makes it much easier to understand what is really happening in an environment and to move quickly from detection to action. The correlation between metrics, traces, and logs is particularly valuable when diagnosing incidents, as it reduces guesswork and speeds up root cause analysis.
What do you dislike about the product?
While Datadog is extremely powerful, it can become difficult to control and predict costs in large or rapidly changing environments, particularly when ingesting high volumes of logs, metrics, and traces. Without strong governance and regular tuning, usage can grow quickly and lead to unexpected spend.
In addition, the breadth of features can sometimes feel overwhelming. Teams need time and clear ownership to configure dashboards, alerts, and monitors properly; otherwise, there is a risk of noise, alert fatigue, or under-utilisation of the platform’s capabilities.
What problems is the product solving and how is that benefiting you?
Datadog helps us centralise logs and monitor our Java applications and APIs, and provides APM (Application Performance Monitoring) to quickly detect performance issues and troubleshoot incidents or bottlenecks.
Intuitive Interface That Makes Data Insights Effortless
What do you like best about the product?
The user interface is very intuitive, making it easy to gain insights from the data. Getting data into Datadog is quite simple thanks to the many integrations, so it is ready to use in a few clicks. Support is responsive, and my team uses it every day.
What do you dislike about the product?
In terms of cost, this platform is not inexpensive. Additionally, making bulk changes across multiple widgets is not straightforward, which can be inconvenient.
What problems is the product solving and how is that benefiting you?
We leverage Datadog to generate alerts and reports for our services, which helps us maintain higher uptime and gain better visibility into any issues that arise.
Comprehensive Tracking Capabilities That Impress
What do you like best about the product?
It offers almost every possible way to track the application interactions.
What do you dislike about the product?
It's very expensive, and it's not easy for newcomers to grasp.
What problems is the product solving and how is that benefiting you?
It gives us insights into our application services and it allows us to catch issues. Also, it provides session replays that allow us to quickly spot issues with the way users interact with our application.
Empowers Confident Monitoring and Insightful System Analysis
What do you like best about the product?
Datadog is a powerful tool that gives us greater confidence in our company's systems and enhances our ability to detect outages. It offers a wide range of features, most of which we actively use. Datadog provides essential insights into our systems, which helps us investigate problems, identify issues, and monitor performance; these are just a few of the ways we rely on Datadog.
What do you dislike about the product?
The cost is one of Datadog's biggest drawbacks. There are some products that would be helpful to use, but the cost makes them impractical. The cost of Mobile App Testing comes to mind as an example.
Additionally, we have experienced some frustration due to pricing changes. Our previous SKUs were grandfathered, but we were eventually required to switch to the newer, more expensive SKU pricing.
What problems is the product solving and how is that benefiting you?
Datadog gives our company confidence that our systems are being monitored and that issues can be detected and addressed promptly by our observability team. We also use the information collected in Datadog to assess the overall health of our systems and drive investigations into any issues detected.
User-Friendly Dashboards with Comprehensive Analytics
What do you like best about the product?
It logs all the details we need for analysis. The platform is user-friendly, allowing us to easily set up various dashboards to view all the insights. Implementation is straightforward as well. We have integrated it with our different platforms, including both web and mobile.
What do you dislike about the product?
I have never encountered any major issues. However, there should be an option to retrieve data based on our own custom filters.
What problems is the product solving and how is that benefiting you?
We use Datadog for both performance monitoring and crash analytics. Previously, our main challenge was with crash reporting, as we struggled to debug or analyze crash data in our production environments. However, Datadog has helped us address this issue through its crash reports and log monitoring features.