
Datadog Pro

Datadog

Reviews from AWS customer

9 AWS reviews

External reviews

693 reviews

External reviews are not included in the AWS star rating for the product.


    reviewer08624379

Great documentation and learning platform with good built-in integrations

  • September 26, 2024
  • Review provided by PeerSpot

What is our primary use case?

We were looking for an all-in-one observability platform that could handle a number of different environments and products. At a basic level, we have a variety of on-premises servers (Windows/Mac/Linux) as well as a number of commercial, cloud-hosted products. 

While it's often possible to let each team rely on its own means for monitoring, we wanted something that the entire company could rally around - a unified platform that is developed and supported by the very same people, not others just slapping their name on some open source products they have no control over.

How has it helped my organization?

Datadog has effortlessly dropped into nearly every stage of observability for us. We appreciate its robust cross-platform support for our IT assets. For hosted products, enabling integrations often couldn't be easier, and many of them include native dashboards and other types of content packs.

Over the last couple of years, we have onboarded a number of engineering teams, and each of them feels comfortable using Datadog. This gives us the ability to build organizational knowledge.

What is most valuable?

Datadog's learning platform is second to none. It's the gold standard of training resources in my mind; not only are these self-paced courses available at no charge, but you can spin up an actual Datadog environment to try out its various features. 

I just hate when other vendors try to upsell you on training beyond their (often poorly-written) documentation. Apart from that, we appreciate the variety of content that comes from Datadog's built-in integrations - for common sources, we don't have to worry about parsing, creating dashboards, or otherwise reinventing the wheel.

What needs improvement?

Datadog's roadmap can be a bit unpredictable at times. For instance, a few years ago, our rep at the time stated that Datadog had dropped its plans to develop an incident on-call platform. However, this year, they released a platform that does exactly that.

They also decided to drop chat-based support just recently. While I understand that it's often easier to work with support tickets, I do miss the easy availability of live support. 

It would be nice if Datadog continued to broaden its variety of available integrations to include even more commercial platforms because that is central to its appeal. If we're looking at a new product and there isn't a native integration, then that's more work on our part.


    reviewer0962486

Good alerts and detailed data but needs UI improvements

  • September 23, 2024
  • Review provided by PeerSpot

What is our primary use case?

I work in product design, and although we use Datadog for monitoring, etc., my use case is different: I mostly review and watch session recordings from users to gain insight into user feedback.

We watch multiple sessions per week to understand how users are using our product. From this data, we are able to home in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.

How has it helped my organization?

Using Datadog has allowed us to review detailed interactions of users at a scale that leads us to make informed data-driven UX improvements as mentioned above.

Being able to pinpoint specific users via filtering is also very useful as it means when we have direct feedback from a specific user, we can follow up by watching their session back. 

The engineering team's use case for Datadog is alerting, which is also very useful for us as it gives us visibility into how stable our platform is through various lenses.

What is most valuable?

Session recordings have been the most valuable to me, as they help me gain insights into user behaviour at scale. By capturing real-time interactions, such as clicks, scrolls, and navigation paths, we can identify patterns and trends across a large user base. This helps us pinpoint usability issues, optimize the user experience, and improve the overall experience for our users. Analyzing these recordings enables us to make data-driven decisions that enhance both functionality and user satisfaction.

What needs improvement?

I'd like the ability to see more in-depth actions on user sessions, such as where specific problems occur. Rather than having to watch numerous session recordings to understand where this happens, I'd like to get alerts/notifications about specific areas where users are struggling, such as rage clicks.
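
To illustrate the kind of signal this request is about, here is a minimal sketch of flagging rage clicks from raw click events. The event shape, the `find_rage_clicks` name, and the thresholds (at least four clicks on the same element within one second) are assumptions for illustration only, not Datadog's implementation.

```python
from collections import defaultdict, deque


def find_rage_clicks(clicks, min_clicks=4, window_seconds=1.0):
    """Return the set of element ids that received a rapid burst of clicks.

    clicks: iterable of (timestamp_seconds, element_id) tuples (hypothetical shape).
    """
    recent = defaultdict(deque)  # element id -> timestamps still inside the window
    flagged = set()
    for ts, element in sorted(clicks):
        window = recent[element]
        window.append(ts)
        # Drop clicks that have fallen out of the sliding window.
        while ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= min_clicks:
            flagged.add(element)
    return flagged


# Hypothetical click stream: a burst on one button, slow clicks on a menu.
sample = [
    (0.0, "save-button"), (0.2, "save-button"),
    (0.4, "save-button"), (0.6, "save-button"),
    (5.0, "menu"), (9.0, "menu"),
]
rage = find_rage_clicks(sample)
print(rage)
```

An alert built on a signal like this would point straight at the struggling element instead of requiring someone to watch each recording.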

In terms of UI, everything is very small, which makes it quite difficult to navigate at times, especially in terms of accessibility, so I'd love for there to be more attention on this.

For how long have I used the solution?

I've used the solution for over one year.

Which solution did I use previously and why did I switch?

We did not evaluate other options. 

What's my experience with pricing, setup cost, and licensing?

I wasn't part of the decision-making process during licensing.

Which other solutions did I evaluate?

I wasn't part of the decision-making process during the evaluation stage.


    reviewer9637683

Great for logging and tracing but needs better customization

  • September 23, 2024
  • Review provided by PeerSpot

What is our primary use case?

We're using the product for logging and monitoring of various services in production environments. 

It excels at providing real-time observability across a wide range of metrics, logs, and traces, making it ideal for DevOps teams and enterprises managing complex environments. 

The platform integrates seamlessly with our cloud services, but browser-side logging lags a little.

Dashboards are very useful for quick insights, but can be time-consuming to create, and the learning curve is steep. Documentation is vast, but not as detailed as I'd like.

How has it helped my organization?

The solution has made logging and tracing a lot easier, and the RUM sessions are something we did not have previously. Datadog’s real-time alerting and anomaly detection help reduce downtime by allowing us to identify and address performance issues quickly. 

The platform’s intelligent alert system minimises noise, ensuring our team focuses on critical incidents. This results in faster Mean Time to Resolution (MTTR), improving service availability.

It consolidates monitoring for infrastructure, applications, logs, and security into a single platform. This enables us to view and analyse data across the entire stack in one place, reducing the time spent jumping between tools.

What is most valuable?

Real user monitoring has made triaging any possible bugs our users might face a lot easier. RUM tracks actual user interactions, including page load times, clicks, and navigation flows. This gives our organization a clear picture of how our users are experiencing our application in real-world conditions, including slow-loading pages, errors, and other performance issues that affect user satisfaction. We can then easily prioritize these, and make sure we offer our users the best possible experience.

What needs improvement?

I'm not sure if this is on Datadog; however, the Vercel integration is very limited.

They need to offer better/more customization of which logs we get, and making tracing possible on Edge runtime logs is a real requirement. It is extremely difficult, if not completely impossible, to get working traces and logs displayed in Datadog with our stack of Vercel, Next.js, and Datadog. This is a very common stack in front-end development, and the difficulty of implementing it is unacceptable. Please do something about it soon. Front-end logs matter.

For how long have I used the solution?

I've used the solution for a little over a year.


    reviewer820579

Single pane of glass, easy to share dashboards, and good for monitoring

  • September 20, 2024
  • Review from a verified AWS customer

What is our primary use case?

We primarily use the solution for a variety of purposes, including:

  • Watching RUM data for frontend site, using LCP and INP metrics to compare across the old and new architecture to inform rollout decisions.
  • Watching APM data for backend services, observing how the backend server reacts (CPU util, memory, requests/second) to make sure the backend can handle the load.
  • Using Datadog CCM during our free trial period to get visibility over our AWS spend across accounts and resources and looking at recommendations and acting on those.
  • Browsing the service catalog to look at the current state of the services that are running and what resources they use.
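
As an illustration of the first bullet, the sketch below compares 75th-percentile LCP and INP between two architectures. The sample values and the `p75` and `compare` helpers are hypothetical; the "good" thresholds (2.5 s for LCP, 200 ms for INP, assessed at the 75th percentile) come from Google's Core Web Vitals guidance rather than from this review.

```python
import math

# "Good" thresholds from Core Web Vitals guidance, in milliseconds.
GOOD_THRESHOLD_MS = {"LCP": 2500, "INP": 200}


def p75(samples):
    """75th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def compare(metric, old_samples, new_samples):
    """Summarize how the new architecture's p75 compares with the old one."""
    old, new = p75(old_samples), p75(new_samples)
    return {
        "metric": metric,
        "old_p75": old,
        "new_p75": new,
        "new_is_good": new <= GOOD_THRESHOLD_MS[metric],
        "improved": new < old,
    }


# Hypothetical RUM samples in milliseconds (old architecture, new architecture).
lcp = compare("LCP", [2100, 2600, 3100, 3400], [1800, 2000, 2300, 2900])
inp = compare("INP", [150, 220, 260, 310], [120, 160, 190, 240])
print(lcp)
print(inp)
```

A rollout gate could then require `improved` and `new_is_good` to hold for both metrics before shifting more traffic to the new architecture.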

How has it helped my organization?

This provides a single place to find monitoring data. Prior to DD, we had some metrics living in New Relic, some in Grafana, and some in Circonus, and it was very confusing to navigate across them. Understanding different query languages is challenging. Here, there's a single UI to get used to, and everything is so sharable.

DD has led to teams making more decisions based on data that they observe about their service metrics and RUM metrics. I've seen decisions get made based on what has been observed in DD, and less based on anecdotal data.

What is most valuable?

I really enjoyed using CCM since it showed cloud cost data easily next to other metrics, and I could correlate the two.

Across CCM and the rest of Datadog, I like how sharable everything is. It's so easy to share dashboards and links with my teammates so we can quickly get up to speed on debugging/solving an issue.

I have also really enjoyed the K8s view of pods and pod health. It's very visual, and as a non-K8s platform owner at my company, I can still observe the overall health of the system. Then I can drill in, and I have learned things about K8s by exploring that part of the product and talking with the team.

What needs improvement?

We've had some issues where we had Datadog automatically turned on in AWS regions that we weren't using, which incurred a small but steady cost that amounted to tens of thousands of dollars spent over a few weeks. I wish there was a global setting that lets an admin restrict which regions DD is turned on in as a default setup step.

Sometimes, the APM service dashboard link isn't sharable. I click something in the service catalog, and on that service's APM default view, I try to share a link to that with a teammate, and they reach a blank or error screen. 

I wish there was more organization and detail in the suggestions when I use the query editor. I'm never quite sure when the autofill dropdown shows up if I'm seeing some custom tag or some default property, so I have to know exactly what I'm looking for in order to build a chart. It's hard to navigate and explore using the query autofill suggestions without knowing exactly what tag to look for.

It's been a bit hard to understand how data gets sampled or how many data points a particular dashboard value is using. We've had questions about the RUM metrics that we see, and we had to ask for help with how values are calculated, bin sizes, etc., to gain confidence in our data.

For how long have I used the solution?

I've used the solution for six months.

What do I think about the stability of the solution?

I've only been aware of a recent outage that affected the latency of data collection for one of our production tests. Outside of that, the solution seems stable.

What do I think about the scalability of the solution?

The solution seems like it can scale very well and beyond our needs.

How are customer service and support?

Technical support has been stellar. We love working with a team that responds fast, in great detail, and with great empathy. I trust what they say.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used New Relic, Grafana, and Circonus. Circonus was flaky, always having downtime, and we were always on the phone with them. With New Relic and Grafana, different metrics lived in each, and it was hard for consumers of the data to easily find what they needed. We also had licensing issues across the three, so not everybody could easily access all of them.

What's my experience with pricing, setup cost, and licensing?

I didn't do this portion of the product setup.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    reviewer3796153

Intuitive user interface with good log management and a helpful Log Explorer feature

  • September 20, 2024
  • Review provided by PeerSpot

What is our primary use case?

In our fast-paced environment, managing and analyzing log data and performance metrics is crucial. That’s where Datadog comes in. We rely on it not just for monitoring but for deeper insights into our systems, and here’s how we make the most of it. 

One of the first things we appreciate about Datadog is its ability to centralize logs from various sources—think applications, servers, and cloud services. This means we can access everything from one dashboard, which saves us a lot of time and hassle. Instead of digging through multiple platforms, we have all our log data in one place, making it much easier to track events and troubleshoot issues.

How has it helped my organization?

Before Datadog, we faced the common challenge of fragmented data. Our logs, metrics, and traces were spread across different tools and platforms, making it difficult to get a complete picture of our system’s health. 

With Datadog, we now have a centralized monitoring solution that aggregates everything in one place. This has streamlined our workflow immensely. Whether it’s logs from our servers, metrics from our applications, or traces from user transactions, we can access all this information easily. This unified view has made it simpler for our teams to identify and troubleshoot issues quickly.

What is most valuable?

In my experience with Datadog, one feature stands out above the rest: the Log Explorer. It has completely transformed the way I interact with our log data and has become an essential part of my daily workflow.

The user interface is incredibly intuitive. When I first started using it, I was amazed at how easy it was to navigate. The design is clean and straightforward, allowing me to focus on the data rather than getting lost in complicated menus. Whether I’m searching for specific log entries or filtering by certain criteria, everything feels seamless. 

This ease of use helped me get up to speed with log management, since this is my first time using Datadog.

What needs improvement?

Interactive tutorials could be a game changer. Instead of just reading about how to use query filters, users could engage with step-by-step guides that walk them through the process. For example, a tutorial could start with a simple query and gradually introduce more complex filtering techniques, allowing users to practice along the way. These tutorials could include pop-up tips and hints that provide additional context or best practices as users work through examples. This hands-on approach not only reinforces learning but also builds confidence in using the tool.

For how long have I used the solution?

My company has recently made Datadog available to its software engineers, and I personally have been using it for almost a year now.


    reviewer2561892

A go-to tool for analyzing, understanding, and investigating application performance

  • September 20, 2024
  • Review provided by PeerSpot

What is our primary use case?

The solution is used for full stack enterprise performance monitoring for our primarily cloud-based stack on AWS. We have implemented monitoring coverage using RUM for critical apps and websites and utilize APM (integrated with RUM) for full stack traceability.

We use Datadog as our primary log repository for all apps and platforms, and the advanced log analytics enable accurate log-based monitoring/alerting and investigations. 

Additionally, we use some advanced RUM capabilities and metrics to track and optimize client-side user experience. We track SLOs for our critical apps and platforms using Datadog.

How has it helped my organization?

We now have full-stack observability, which allows us to better understand application behavior, quickly alert users about issues, and proactively manage application performance.  

We've seen value by implementing observability coordinated across multiple applications, allowing us to track things like customer shopping and orders across multiple applications and services.  

For critical application launches, we've built dashboards that can track user activity and confirm users are able to successfully utilize new features, tracking user activities in real-time in a war-room situation.  

Datadog is our go-to tool for analyzing, understanding, and investigating application performance and behavior.

What is most valuable?

APM accurately tracks our service performance across our ecosystem. RUM gives us client-side performance and user experience visibility, and the rate of new features implemented in the Digital Experience area recently has been high. Log analytics give us a powerful mechanism for error tracking, research, and analysis.  

Custom metrics that we've created allow us to track KPIs in real-time on dashboards. All of these have proven valuable in our organization.  Additionally, Datadog product support teams are responsive and have provided timely support when needed.

What needs improvement?

Agent remote configuration should be provided/improved and streamlined, allowing for config changes/upgrades to be performed via the portal instead of at the host.   

Cost tracking via the admin portal is a bit lacking, even though it has gotten better.  I'm looking for usage trends (that drive cost) across time and better visibility or notifications about on-demand charges.  

Network device and performance monitoring could be improved, as we've faced some limitations in this area.  

The Datadog usage-based cost model, while giving us better transparency, is difficult to follow at times and is constantly evolving.  

For how long have I used the solution?

I've used the solution for three years.

How are customer service and support?

Support has been responsive and helpful.  

How would you rate customer service and support?

Positive

What's my experience with pricing, setup cost, and licensing?

Pricing is straightforward. That said, it's sometimes difficult to estimate usage volumes.

Which other solutions did I evaluate?

We evaluated Datadog and New Relic in detail and chose Datadog for its straightforward and competitive pricing model, its full coverage of the monitoring features we wanted, and its easy-to-use UI.

Which deployment model are you using for this solution?

Public Cloud


    reviewer2561889

Easy to configure with synthetic testing and offers a consolidated approach to monitoring

  • September 20, 2024
  • Review provided by PeerSpot

What is our primary use case?

We use this solution for enterprise monitoring across a large number of applications in multiple environments like production, development, and testing. It helps us track application performance, uptime, and resource usage in real time, providing alerts for issues like downtime or performance bottlenecks. 

Our hybrid environment includes cloud and on-premise infrastructure. The solution is crucial for ensuring reliability, compliance, and high availability across our diverse application landscape.

How has it helped my organization?

Datadog has greatly improved our organization by centralizing all monitoring into one platform, allowing us to consolidate data from a wide range of sources. 

From infrastructure metrics and application logs to end-user experience and device monitoring, everything is now collected and displayed in one place. This has simplified our monitoring processes, improved visibility, and allowed for faster issue detection and resolution. 

By streamlining these operations, Datadog has enhanced both efficiency and collaboration across teams.

What is most valuable?

Synthetic testing is by far the most valuable feature in our organization. It’s highly requested since the setup process is both quick and straightforward, allowing us to simulate user interactions across our applications with minimal effort. 

The ease of configuring tests and interpreting the results makes it accessible even to non-technical team members. This feature provides valuable insights into user experience, helps identify performance bottlenecks, and ensures that our critical workflows are functioning as expected, enhancing reliability and uptime.

What needs improvement?

One area where the product could be improved is Application Performance Monitoring (APM). While it's a powerful feature, many in our organization find it difficult to fully understand and utilize to its maximum potential. 

The data provided is comprehensive, yet it can sometimes be overwhelming, especially for those who are less familiar with the intricacies of application performance metrics. 

Simplifying the interface, offering clearer guidance, or providing more intuitive visualizations would make it easier for users to extract valuable insights quickly and efficiently.

For how long have I used the solution?

I've used the solution for four years.

What do I think about the stability of the solution?

The solution is very stable. Issues happen once or twice a year and are usually solved before we have any real impact on the service.

What do I think about the scalability of the solution?

Scalability has never been a bottleneck for us; we've never felt any issues here.

How are customer service and support?

Support was slow at the beginning; however, they are much better and more responsive now.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Datadog offered the most consolidated approach to our monitoring needs.

How was the initial setup?

This was a migration project, so it was rather complex.

What about the implementation team?

We implemented the solution with our in-house team.

What's my experience with pricing, setup cost, and licensing?

I'd recommend new users look down the road and decide on at least a three-year plan.

Which other solutions did I evaluate?

We evaluated AppDynamics and Dynatrace.


    reviewer9864210

Good dashboards and observability capabilities but pricing needs improvement

  • September 20, 2024
  • Review provided by PeerSpot

What is our primary use case?

We have multiple nodes integrated into our Azure infrastructure and our AKS clusters. These nodes are integrated with traces (as APM hosts).

We also have infrastructure hosts integrated to see the metrics and resources of each host, mainly for Azure VMs and AKS nodes. Additionally, we have hosts from our VMs in Azure that run ActiveMQ, and we integrate them as messaging queues so they show up in the ActiveMQ dashboard.

We have recently added ActiveMQ as containers in AKS, and we are also integrating those as messaging queues so they show up in the ActiveMQ dashboard integration.

How has it helped my organization?

Logs are great. Having all services from different teams send their logs to Datadog, so that all logs are in the same place, is very helpful for understanding what is going on in our app. Filtering the logs is a huge help, adding special custom filters is easy, and the filters are fast. Documentation is better than average, with little room for improvement.

Dashboards are simple, and monitors are very easy to configure and get notified if something is wrong.

With the aggregated logs, we can now see logs from other systems and identify problems in other areas in which we had no visibility before.

What is most valuable?

Dashboards are the most valuable. We need the observability. We have given the dashboards to a dedicated team to monitor them outside working hours, and they report whatever they see going red. This helps us since people without any prior knowledge can understand when there is a problem, when to react, and when to inform others simply by checking whether the monitor (showing the dashboards) turns red.

Traces being connected to each other and seeing that each service is connected through one API call is very helpful for us to understand how the system works.

What needs improvement?

The monitors need improvement. We need easier root cause analysis when a monitor hits red. When we get the email, it's hard to identify why the trigger has gone red and which pod exactly is to blame in a scenario where the pod is restarting, for example.

Prices are a very difficult thing in Datadog. We have to be very mindful of any changes we make in Datadog, and we are a bit afraid of using new features since, if we change something, we might get charged a lot. For example, if we add a network feature to our nodes, we might get charged a lot simply by changing one flag, even though we are only going to use one small feature for those network nodes. However, due to the fact that we have more than 50 nodes, all of the nodes will be charged for the feature of "Network hosts".

This leads us to not fully utilize the capabilities of Datadog, and it's a shame. Maybe we could have a grace period to test features, like a trial, and then have Datadog turn the feature off for us so that we avoid paying more by mistake.
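
The concern above comes down to simple arithmetic: a per-host add-on is billed for every host that reports it, not only for the hosts that actually need it. The sketch below uses a made-up price and a made-up count of hosts that need the feature; neither figure comes from Datadog's price list or from this review, which only states "more than 50 nodes".

```python
def monthly_feature_cost(host_count, price_per_host):
    """Cost of a per-host add-on that is billed for every reporting host."""
    return host_count * price_per_host


HOSTS = 50                 # the review says "more than 50 nodes"; 50 is used as a floor
HOSTS_NEEDING_FEATURE = 3  # hypothetical: hosts that actually need the feature
HYPOTHETICAL_PRICE = 5.00  # placeholder USD per host per month, not a real Datadog price

billed = monthly_feature_cost(HOSTS, HYPOTHETICAL_PRICE)
needed = monthly_feature_cost(HOSTS_NEEDING_FEATURE, HYPOTHETICAL_PRICE)
print(f"billed on all hosts: ${billed:.2f}; needed only: ${needed:.2f}")
```

Under these assumed numbers, flipping one flag fleet-wide costs many times what the intended usage would, which is why a trial or grace period would help.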

For how long have I used the solution?

I've used the solution for five years.

What do I think about the stability of the solution?

The solution is stable enough. We found it to be down only a few times, and it's reasonable.

What do I think about the scalability of the solution?

The solution offers very good scalability. When we added more logs and more hosts, we did not notice any degradation in the service.

How are customer service and support?

Support is very good. They answer all of our questions, and with a few emails, we get what we need.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used Elastic. We had to set up everything and maintain it ourselves.

How was the initial setup?

Datadog has very good support and it is not so complicated to set up.

What about the implementation team?

We set up the solution in-house. We integrated everything on our own.

What was our ROI?

We found the product to be very valuable.

What's my experience with pricing, setup cost, and licensing?

I'd advise others to start small and then integrate more stuff. Be mindful when using Datadog.

Which other solutions did I evaluate?

We evaluated Splunk and ELK.

What other advice do I have?

Be careful of the costs. Set up only the important things.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure


    Accounting

Jorge A. Torres Martínez is horrible at customer service and support

  • September 19, 2024
  • Review provided by G2

What do you like best about the product?
Nothing. He is horrible. I would never use their service because of the experience we had with Jorge A. Torres Martínez.
What do you dislike about the product?
I was in touch with this person, and he is truly a horrible human being. I would not have him as an employee. We had reached out about a feature and were connected to him, and he was horrible. I would avoid this person and employee at all costs.
What problems is the product solving and how is that benefiting you?
We tried to enhance our customer satisfaction approach with these guys and it just didn't work. We need a simple API implemented and this man was horrible to work with.


    Traci Ortiz

Improved response time and cost-efficiency with good monitoring

  • September 19, 2024
  • Review provided by PeerSpot

What is our primary use case?

We monitor our multiple platforms using Datadog and post alerts to Slack to notify us of server and end-user issues. We also monitor user sessions to help troubleshoot an issue being reported. 

We monitor 3.5 platforms on our Datadog instance, and the team always monitors the trends and Dashboards we set up. We have two instances to span the 3.5 platforms and are currently looking to implement more platform monitoring over time. The user session monitoring is consistent for one of these platforms. 

How has it helped my organization?

Datadog has improved our response time and cost-efficiency in bug reporting and server maintenance. We're able to track our servers more fluidly, allowing us to expand our outreach and decrease response time. 

There are many different ways that Datadog is used, and we monitor three and a half platforms on the Datadog environment at this time. By monitoring all of these platforms in one easy-to-use instance, we're able to track the platform with the issue, the issue itself, and its impact on the end user. 

What is most valuable?

The server monitoring, service monitoring, and user session monitoring are extremely helpful, as they allow us to be alerted ahead of time of issues that users might experience. More often than not, an issue is not only able to be identified, but solved and released before an end user notices an issue. 

We are currently using this as an investigative tool to notice trends, identify issues, and locate areas of our program that we can improve upon that haven't been identified as pain points yet. This is another effective use case. 

What needs improvement?

I would like to see a longer retention time for user sessions, even if by 24 to 48 hours, or even just the option to make it configurable. This would enable us to keep user sessions that have remained unseen for a long time and to identify issues that people are working around.

I would also like to see an improvement in the server's data extraction times, as sometimes it can take up to ten minutes to download a report for a critical issue that is costing us money. Regardless, I am very happy with Datadog and love the uses we have for the program so far.  

For how long have I used the solution?

I've used the solution for more than four years.

Which solution did I use previously and why did I switch?

We did not previously use a different solution.