AWS Marketplace: Datadog Enterprise Reviews

reviewer254673

Good monitoring capabilities, centralizing of logs, and making data easily searchable

September 19, 2024
Review provided by PeerSpot

What is our primary use case?

Our primary use of Datadog involves monitoring over 50 microservices deployed across three distinct environments. These services vary widely in their functions and resource requirements.

We rely on Datadog to track usage metrics, gather logs, and provide insight into service performance and health. Its flexibility allows us to efficiently monitor both production and development environments, ensuring quick detection and response to any anomalies.

We also have better insight into metrics like latency and memory usage.

How has it helped my organization?

Datadog has significantly improved our organization’s monitoring capabilities by centralizing all of our logs and making them easily searchable. This has streamlined our troubleshooting process, allowing for quicker root cause analysis.

Additionally, its ease of implementation meant that we could cover all of our services comprehensively, ensuring that logs and metrics were thoroughly captured across our entire ecosystem. This has enhanced our ability to maintain system reliability and performance.

What is most valuable?

The intuitive user interface has been one of the most valuable features for us. On other platforms, Grafana for example, learning how to query involves either a lot of trial and error or memorization, almost like learning a new language. Datadog’s UI, by contrast, makes finding logs, metrics, and performance data straightforward and efficient. This ease of use has saved us time and reduced the learning curve for new team members, allowing us to focus more on analysis and troubleshooting rather than on learning the tool itself.

What needs improvement?

While the UI and search functionality are excellent, further improvement could be made in the querying of logs by offering more advanced templates or suggestions based on common use cases. This would help users discover powerful queries they might not think to create themselves.

Additionally, enhancing alerting capabilities with more customizable thresholds or automated recommendations could provide better insights, especially when dealing with complex environments like ours with numerous microservices.

For how long have I used the solution?

I've used the solution for five years.

What do I think about the stability of the solution?

We have never experienced any downtime.

Which solution did I use previously and why did I switch?

We previously used Sumo Logic.

Which deployment model are you using for this solution?

Public Cloud

Lin Qui

Excellent APM, RUM and dashboards

September 19, 2024
Review provided by PeerSpot

What is our primary use case?

We use the solution for APM, anomaly detection, resource metrics, RUM, and synthetics.

We use it to build baseline metrics for our apps before we start focusing in on performance improvements. A lot of times that’s looking at methods that take too long to run and diving into DB queries and parsing.

I’ve used it in multiple configurations in AWS and Azure. I’ve built it both with Terraform and hand-rolled.

I’ve used it predominantly with Ruby and Node and a little bit of Python.

How has it helped my organization?

The solution provides deep insights into our stack. It gives us the ability to measure and monitor before making decisions.

We're using it to make informed decisions about performance. Being able to show how across a timeline we increased performance from a release via a visual indication of p50+ metrics is almost magical.

Another way we use it is for leading indicators of issues that might be happening. For example, anomaly detection on gauge metrics across the app and having synthetics built in with alerting configurations are both ways we can get alerted, sometimes even before a big issue is about to happen.

What is most valuable?

The most valuable aspects include APM, RUM and dashboards.

I think of Datadog as an analytics company first, and of the integrations around notifications and alerts as a part of insight discoverability.

Everything Datadog offers, for me, is around knowledge building and how much I know about the deep details of my stack.

The pricing model makes more sense than what we paid competitors. At one job, we used two competing services because DD didn’t have a BAA for APM. When it offered one, we immediately dumped the other solution for Datadog.

What needs improvement?

Logging is not a great experience. Searching for specific logs and then navigating around the context of the results is slow and cumbersome. Honestly that is my only gripe for Datadog. It’s a wonderful product outside of log searching. I have had better experience using other services that aggregate logs for search.

My use case for it is around discoverability. Log search is fine if I’m just looking for something specific. That said, if it’s something less targeted and I am wandering around looking for possible issues, it’s really unintuitive.

For how long have I used the solution?

I've used the solution for more than eight years.

What do I think about the stability of the solution?

Very stable.

What about the implementation team?

We always implement the solution in-house.

Which deployment model are you using for this solution?

Private Cloud

reviewer902462

Capable of pinpointing warnings and errors in logs and providing detailed context

September 18, 2024
Review provided by PeerSpot

What is our primary use case?

Our primary use case for Datadog is to monitor, analyze, and optimize the performance and health of our applications and infrastructure.

We leverage its logging, metrics, and tracing capabilities to pinpoint issues, track system performance, and improve overall reliability.

Datadog’s ability to provide real-time insights and alerting on key metrics helps us quickly address issues, ensuring smooth operations. It’s integral for visibility across our microservices architecture and cloud environments.

How has it helped my organization?

Datadog has been incredibly valuable to our organization. Its ability to pinpoint warnings and errors in logs and provide detailed context is essential for troubleshooting.

The platform's request tracing feature offers comprehensive insights into user flows, allowing us to quickly identify issues and optimize performance.

Additionally, Datadog's real-time monitoring and alerting capabilities help us proactively manage system health, ensuring operational efficiency across our applications and infrastructure.

What is most valuable?

Being able to filter requests by latency is invaluable, as it provides immediate insight into which endpoints require further analysis and optimization. This feature helps us quickly identify performance bottlenecks and prioritize improvements.

Additionally, the ability to filter requests by user email is extremely useful for tracking down user-specific issues faster. It streamlines the troubleshooting process and enables us to provide more targeted support to individual users, improving overall customer satisfaction.

What needs improvement?

The query performance could be improved, particularly when handling large datasets, as slower response times can hinder efficiency.

Additionally, the interface can sometimes feel overwhelming, with so much happening at once, which may discourage users from exploring new features.

Simplifying the layout or providing clearer guidance could enhance user experience. Any improvements related to query optimization would be highly beneficial, as it would further streamline workflows and boost productivity.

For how long have I used the solution?

I've used the solution for five years.

reviewer907251

Good logging, easy to find issues, and saves time

September 18, 2024
Review from a verified AWS customer

What is our primary use case?

We use the solution for APM, AWS, Lambda, logging, and infrastructure. We have many different things all over AWS, and having one place to look is great.

We have all sorts of different AWS things out there that are in C# and Node. Having a single place to log and APM into is very important to us.

Keeping track of the cloud infrastructure is also important. We have Lambda, containers, EC2, etc.

Having a super simple interface to filter searches for APM and logging is great. It is super easy to show people how to use it. This is super important to us.

How has it helped my organization?

Finding issues quickly is super important, as is being able to create dashboards and alert on issues.

Having the ability to create dashboards has really taught us how to utilize the searching part of the system. We are able to share them, and build upon them so easily. Many iterations later people are putting some solid information out there.

Alerting is also important to us. We have set up many alerts that help us spot issues in the platform before they become bigger issues. This has enabled my teams to use incidents and address the issues so they are no longer problems.

What is most valuable?

Alerting on running systems is very helpful. Finding issues is quick. We have one place for logging and searching. We are able to save searches, reference them in the future, and build upon them.

The logging in general is one of my favorite features. The search is so straightforward and easy to use. Just being able to click on a field and add it to search has taught me so much about the interface. It might not be as useful without a shortcut like that to teach me the system. We have Cloudflare logs in there, and sometimes I have no idea how to filter on such a buried piece of JSON. That is where the interface helps me: by clicking "add to search", I get what I need.

What needs improvement?

The PagerDuty replacement is something we are very interested in. We only really use PagerDuty to call the team when things are down.

I would love to have a DD guru come in and do a department training directly on our setup. We would love to have someone show us the things we could do better within our current setup.

Saving a bit of cash would also help, if there are things we are doing that are costing us. It's a big enough tool that it is tough to have someone dedicated to managing it.

For how long have I used the solution?

I've used the solution for a bit over a year at this point.

What do I think about the stability of the solution?

The stability seems good here too.

What do I think about the scalability of the solution?

Scalability seems good to me. I have no complaints.

How are customer service and support?

I get answers from our contact, and one team member did reach out. It went well.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used Loggly.

We switched because we wanted an all-in-one tool.

How was the initial setup?

Some parts of our setup were tough. Some Windows container setups cost us a lot of time.

The AWS infrastructure was tough to fully turn on due to the large cost of everything being run.

What about the implementation team?

We handled the setup ourselves in-house.

What was our ROI?

This cost us more overall. ROI is hard to sell. That said, I can find issues way faster and see what is going on in my entire platform. I pay back the cost every month with productivity.

What's my experience with pricing, setup cost, and licensing?

It is going to cost you more than you think to keep everything running. We saw value in the one-for-all solution; however, it came at a premium to what we were paying.

Which other solutions did I evaluate?

We did evaluate Dynatrace.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Mason Parry

Customizable alerts, good dashboards, and improves reliability

September 18, 2024
Review provided by PeerSpot

What is our primary use case?

We have several teams and several different projects, all working in tandem, so there are a lot of logs and monitoring that need to be done. We use Datadog mostly for alerting when things go down.

We also have several dashboards to keep track of critical operations and to make sure things are running without issues. The Slack messaging is essential in our workflow in letting us know when an alert is triggered. I also appreciate all the graphs you can make, as it gives our team a good overview of how our services are doing.

How has it helped my organization?

It has improved our reliability and our time to get back up from an outage. By creating an alert and then messaging a Slack channel, we know when something goes down fairly fast. This, in turn, improves our response time to swarm on an issue without it affecting customers. The graphs have also been useful to demonstrate to higher-ups how our services are performing, allowing them to make more informed decisions when it comes to the team.

What is most valuable?

The alerts are the most valuable. Alerts have saved us countless times in the past and are essentially what we use Datadog for.

I like how we can customize alerts, and when alerts have become too noisy, we turn their threshold down fairly easily. This is also the case when alerts should be notifying us more often.

I also like the graphs and how customizable they are. It allows us to create a nice-looking dashboard with all sorts of information relating to our project. This gives us a quick overview of how things are going.

What needs improvement?

It's not that straightforward when creating an alert. The syntax is a little confusing. I guess that the trade-off is customizability. But it would be nice to have a click-and-drag kind of way when creating an alert. So, if someone who isn't so familiar with Datadog or tech in general wanted to create an alert, they wouldn't need to know the syntax.

It would also be great if AI could be used to generate alerts and graphs. I could write a short prompt, and then the AI could auto-generate alerts and graphs for me.

For how long have I used the solution?

I've used the solution for more than two years.

Hoon Kang

Good alerting and issue detection for many valuable features

September 18, 2024
Review provided by PeerSpot

What is our primary use case?

Our company has a microservice architecture, with different teams in charge of different services. It is also a startup, which means that we have to build fast and move very fast as well. Before we were properly using DD, we often had issues with things breaking, but without much information on where in our system the breakage happened. This was quite a big time sink, as teams were unfamiliar with other teams' code and needed their help to debug. This slowed our building down a lot. Implementing DD traces fixed this.

What is most valuable?

Datadog has many features, but the most valuable have become our primary uses.

Also, given our frequent concurrent deployments, the Datadog alert monitors allow us to quickly detect issues if anything occurs.

What needs improvement?

The monitors can be improved. The chart in the monitors only goes back a couple of hours, which is clunky. The monitors could also provide more info, like traces. We have many alerts connected to different notification systems, such as Slack and Opsgenie.

When the person on call receives notifications fired by the alerts, we are taken to the monitors. Yet often, we have to open up many different tabs to see logs, traces, and info that is not accessible on the monitors. I think it would make the lives of everyone on call easier if the monitor had more data.

For how long have I used the solution?

We've used the solution for three years.

Jola H.

Using Datadog for Log Management

June 26, 2024
Review provided by G2

What do you like best about the product?

Being able to have tedious tasks like log management dealt with in an efficient way. Ease of use and ease of integration are a big plus.

What do you dislike about the product?

Cost is a major drawback. While it might be justifiable for large enterprises with significant budgets, for small to medium-sized businesses or startups, the cost can be prohibitive. The charges accumulate quickly, especially as you add more hosts and services.

What problems is the product solving and how is that benefiting you?

Issues regarding AWS Monitoring

Michael K.

Datadog is like candy for DevSecOps or FinOps teams!

June 26, 2024
Review provided by G2

What do you like best about the product?

The variety of telemetry (features) that can be pulled in results in being able to make more meaningful decisions and process improvements.

What do you dislike about the product?

Sometimes pricing/licensing can be tricky to understand.

What problems is the product solving and how is that benefiting you?

Every month Datadog is solving new problems for us, whether it be cloud cost optimization or continuous profiling, which enables us to optimize our microservices quickly.

Information Technology and Services

SL Datadog review

June 26, 2024
Review provided by G2

What do you like best about the product?

Full stack monitoring for all of our cloud apps. One tool, one UI. Ease of use. Adding more functionality/features.

What do you dislike about the product?

Need to add more functionality for RBAC (RUM).

What problems is the product solving and how is that benefiting you?

APM/RUM/Watchdog pinpoints performance issues and helps our SRE team resolve issues quickly.

Pooja K.

Makes developers' lives easier

June 26, 2024
Review provided by G2

What do you like best about the product?

APM, RUM, Synthetic tests, Retention filters.

What do you dislike about the product?

Cost, of course. Some of the features are a bit costly.

What problems is the product solving and how is that benefiting you?

Observability for the infrastructure that we have
