
Reviews from AWS customers

18 AWS reviews

External reviews

725 reviews
from G2 and PeerSpot

External reviews are not included in the AWS star rating for the product.


3-star reviews

    Real Estate

A fantastic product with nebulous billing

  • June 10, 2025
  • Review provided by G2

What do you like best about the product?
A very strong monitoring and integration tool with many integrations across the market that allow for very fast setup and usage. You can generally derive value very quickly and have a functioning setup out of the box.
What do you dislike about the product?
Billing can very much catch up with you quickly, and some billing practices don't scale well into modern infra design. For example, per-host billing is still not very friendly to spot/dynamic workloads because of how time is accounted.
What problems is the product solving and how is that benefiting you?
Mainly application monitoring, logging, and APM/Tracing.

It acts as a strong centralized tool for things like tracing and allows cross-enterprise visibility for services that span many areas.
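The per-host time-accounting concern above can be illustrated with a small sketch. It assumes, as Datadog's public pricing pages describe, that monthly per-host billing ignores only the top 1% of hourly host counts (a high-watermark model), so even short-lived spot bursts can set the whole month's bill; the fleet sizes below are illustrative.

```python
# Sketch: why bursty spot fleets inflate per-host bills when usage is
# metered on a high percentile of hourly host counts (an assumption
# drawn from Datadog's published infrastructure pricing model).
import math

def billable_hosts(hourly_counts, percentile=0.99):
    """Return the host count at the given percentile of hourly samples."""
    ordered = sorted(hourly_counts)
    idx = min(len(ordered) - 1, math.ceil(percentile * len(ordered)) - 1)
    return ordered[idx]

# Steady fleet: 10 hosts for every hour of a 720-hour month.
steady = [10] * 720
# Spot-heavy fleet: 10 baseline hosts, plus 40 extra spot hosts for 20 hours.
bursty = [10] * 700 + [50] * 20

print(billable_hosts(steady))  # 10
print(billable_hosts(bursty))  # 50 -- the short burst sets the bill
```

Only bursts shorter than the excluded top 1% of hours (about 7 hours in a 30-day month under this assumption) would escape the higher charge, which is why dynamic workloads feel penalized.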


    Financial Services

Very powerful platform but can be overwhelming

  • June 10, 2025
  • Review provided by G2

What do you like best about the product?
APM
AWS Integrations
Synthetic Tests
Monitors / Alerts
Log Forwarder
Dashboards
What do you dislike about the product?
Busy UI / UX
Cost Transparency is lacking
What problems is the product solving and how is that benefiting you?
N/A


    Computer Software

SDE

  • June 10, 2025
  • Review provided by G2

What do you like best about the product?
The latest in observability, with ease of access.
What do you dislike about the product?
The onboarding is tricky, and there are too many buttons.
What problems is the product solving and how is that benefiting you?
network logs


    Media Production

Solid one stop platform for monitoring

  • June 10, 2025
  • Review provided by G2

What do you like best about the product?
It's easy to generate tests and monitor your applications.
What do you dislike about the product?
It could have clearer paths/documentation for custom platform integration.
What problems is the product solving and how is that benefiting you?
Continuous monitoring of applications after launch.


    ahmet.kose@sesame.org K.

QA Engineer

  • June 10, 2025
  • Review provided by G2

What do you like best about the product?
I enjoy the user experience of the platform.
What do you dislike about the product?
I do think that a new user needs training to use everything
What problems is the product solving and how is that benefiting you?
It is helping me with Synthetic Testing


    reviewer0962486

Good alerts and detailed data but needs UI improvements

  • September 23, 2024
  • Review provided by PeerSpot

What is our primary use case?

I work in product design, and although we use Datadog for monitoring, etc, my use case is different as I mostly review and watch session recordings from users to gain insight into user feedback.

We watch multiple sessions per week to understand how users are using our product. From this data, we are able to hone in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.

How has it helped my organization?

Using Datadog has allowed us to review detailed interactions of users at a scale that leads us to make informed data-driven UX improvements as mentioned above.

Being able to pinpoint specific users via filtering is also very useful as it means when we have direct feedback from a specific user, we can follow up by watching their session back.

The engineering team's use case for Datadog is alerting, which is also very useful for us, as it gives us visibility into how stable our platform is through various lenses.

What is most valuable?

Session recordings have been the most valuable to me, as they help me gain insight into user behaviour at scale. By capturing real-time interactions such as clicks, scrolls, and navigation paths, we can identify patterns and trends across a large user base. This helps us pinpoint usability issues and optimize the overall experience for our users. Analyzing these recordings enables us to make data-driven decisions that enhance both functionality and user satisfaction.

What needs improvement?

I'd like the ability to see more in-depth actions on user sessions - for example, getting alerts/notifications for specific areas where users are struggling (such as rage clicks) rather than having to watch numerous session recordings to find where the problems happen.

In terms of UI, everything is very small, which makes it quite difficult to navigate at times, especially in terms of accessibility, so I'd love for there to be more attention on this.

For how long have I used the solution?

I've used the solution for over one year.

Which solution did I use previously and why did I switch?

We did not evaluate other options.

What's my experience with pricing, setup cost, and licensing?

I wasn't part of the decision-making process during licensing.

Which other solutions did I evaluate?

I wasn't part of the decision-making process during the evaluation stage.


    reviewer9637683

Great for logging and tracing but needs better customization

  • September 23, 2024
  • Review provided by PeerSpot

What is our primary use case?

We're using the product for logging and monitoring of various services in production environments.

It excels at providing real-time observability across a wide range of metrics, logs, and traces, making it ideal for DevOps teams and enterprises managing complex environments.

The platform integrates seamlessly with our cloud services, but browser-side logging lags a little.

Dashboards are very useful for quick insights but can be time-consuming to create, and the learning curve is steep. Documentation is vast, but not as detailed as I'd like.

How has it helped my organization?

The solution has made logging and tracing a lot easier, and the RUM sessions are something we did not have previously. Datadog’s real-time alerting and anomaly detection help reduce downtime by allowing us to identify and address performance issues quickly.

The platform’s intelligent alert system minimises noise, ensuring the team focuses on critical incidents. This results in faster Mean Time to Resolution (MTTR), improving service availability.

It consolidates monitoring for infrastructure, applications, logs, and security into a single platform. This enables us to view and analyse data across the entire stack in one place, reducing the time spent jumping between tools.

What is most valuable?

Real user monitoring has made triaging any possible bugs our users might face a lot easier. RUM tracks actual user interactions, including page load times, clicks, and navigation flows. This gives our organization a clear picture of how users are experiencing our application in real-world conditions, including slow-loading pages, errors, and other performance issues that affect user satisfaction. We can then easily prioritize these and make sure we offer our users the best possible experience.

What needs improvement?

I'm not sure if this is on Datadog, however, Vercel integration is very limited.

They need to offer better/more customization of which logs we get, and making tracing possible on Edge-runtime logs is a real requirement. It is extremely difficult, if not completely impossible, to get working traces and logs displayed in Datadog with our stack of Vercel, Next.js, and Datadog. This is a very common stack in front-end development, and the difficulty of implementing it is unacceptable. Please do something about it soon. Front-end logs matter.

For how long have I used the solution?

I've used the solution for a little over a year.


    reviewer9864210

Good dashboards and observability capabilities but pricing needs improvement

  • September 20, 2024
  • Review provided by PeerSpot

What is our primary use case?

We have multiple nodes integrated into our Azure infrastructure and our AKS clusters. These nodes are integrated with traces (as APM hosts).

We also have infrastructure hosts integrated to see the metrics and resources of each host, mainly for Azure VMs and AKS nodes. Additionally, we have hosts from our VMs in Azure that run ActiveMQ, and we integrate them as messaging queues so they show up in the ActiveMQ dashboard.

We have recently added ActiveMQ as containers in AKS, and we are integrating those as messaging queues as well so they show up in the ActiveMQ dashboard integration.

How has it helped my organization?

Logs are great. Having all services, across different teams, send their logs to Datadog so that everything is in one place is very helpful for understanding what is going on in our app. Filtering the logs is a huge help, adding special custom filters is easy, and the filters are fast. Documentation is better than average, with little room for improvement.

Dashboards are simple, and monitors are very easy to configure and get notified if something is wrong.

With the aggregated logs, we can now see logs from other systems and identify problems in other areas in which we had no visibility before.
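Custom filters like those described above work best when logs arrive as structured JSON with consistent fields. A minimal, hypothetical sketch using Python's standard logging module (the `service` and `team` field names are illustrative, not part of any Datadog schema):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so log pipelines can filter on fields."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Extra attributes passed via `extra=` become filterable facets.
            **{k: v for k, v in record.__dict__.items() if k in ("service", "team")},
        }
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Each line is now one JSON object, e.g. filterable by service:payments-api.
logger.info("order placed", extra={"service": "payments-api", "team": "payments"})
```

Because every line shares the same field names, a filter such as `service:payments-api` matches reliably regardless of which team emitted the log.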

What is most valuable?

Dashboards are the most valuable. We need the observability. We have given the dashboards to a dedicated team to monitor outside working hours, and they report whatever they see going red. This helps us because people without any domain knowledge can tell when there is a problem, when to react, and when to inform others simply by checking whether the monitor (showing the dashboards) turns red.

Traces being connected to each other, and seeing how each service is connected through one API call, is very helpful for understanding how the system works.

What needs improvement?

The monitors need improvement. We need easier root cause analysis when a monitor hits red. When we get the email, it's hard to identify why the trigger has gone red and which pod exactly is to blame in a scenario where the pod is restarting, for example.

Pricing is a very difficult thing in Datadog. We have to be very mindful of any changes we make, and we are a bit afraid of using new features since, if we change something, we might get charged a lot. For example, if we enable a network feature on our nodes, we might be charged heavily simply by changing one flag, even though we only intend to use one small feature for those network nodes. Because we have more than 50 nodes, all of them would be charged for the "Network Hosts" feature.

This leads us to not fully utilize the capabilities of Datadog, which is a shame. Perhaps there could be a grace period to test features, like a trial, and then have Datadog stop them for us to avoid paying more by mistake.

For how long have I used the solution?

I've used the solution for five years.

What do I think about the stability of the solution?

The solution is stable enough. We found it to be down only a few times, and it's reasonable.

What do I think about the scalability of the solution?

The solution offers very good scalability. When we added more logs and more hosts, we did not notice any degradation in the service.

How are customer service and support?

Support is very good. They answer all of our questions, and with a few emails, we get what we need.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used Elastic. We had to set up everything and maintain it ourselves.

How was the initial setup?

Datadog has very good support and it is not so complicated to set up.

What about the implementation team?

We set up the solution in-house. We integrated everything on our own.

What was our ROI?

We found the product to be very valuable.

What's my experience with pricing, setup cost, and licensing?

I'd advise others to start small and then integrate more stuff. Be mindful when using Datadog.

Which other solutions did I evaluate?

We evaluated Splunk and ELK.

What other advice do I have?

Be careful of the costs. Set up only the important things.


    Ajay Thomas

Great features and synthetic testing but pricing can get expensive

  • September 19, 2024
  • Review provided by PeerSpot

What is our primary use case?

Our primary use case is log aggregation, performance tracing, and alerting for custom and vendor-supplied web applications. We run a mix of AWS EC2, Azure serverless, and colocated VMware servers to support higher-education web applications.

Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. Datadog agents on each web host, plus native integrations with GitHub, AWS, and Azure, get all of our instrumentation and error data in one place for easy analysis and monitoring.

How has it helped my organization?

Through the use of Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps, and Datadog ties them all together in cohesive dashboards.

Whether the app is vendor-supplied or built ourselves, the depth of tracing, profiling, and hooking into logs is all obtainable and tunable. Both legacy .NET Framework with Windows Event Viewer and cutting-edge .NET Core with streaming logs work.

The breadth of coverage for any app type or situation is really incredible. It feels like there's nothing we can't monitor.

What is most valuable?

When it comes to Datadog, several features have proven particularly valuable. The centralized pipeline tracking and error logging provides a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly.

Synthetic testing has been a game-changer, allowing us to catch potential problems before they impact real users. Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders.

Together, these features form a powerful toolkit that helps us maintain high performance and reliability across our applications and infrastructure, ultimately leading to better user satisfaction and more efficient operations.

What needs improvement?

I'd like to see the Android and iOS apps expanded to have a simplified CI/CD pipeline history view. I like the idea of monitoring on the go, yet the options still seem a bit limited out of the box. While the documentation is very good considering all the frameworks and technology Datadog covers, there are areas - specifically .NET profiling and tracing of IIS-hosted apps - that need a lot of focus to pick up on the key details. In some cases the screenshots don't match the text as updates are made. I spent longer than I should have figuring out how to correlate logs to traces, mostly related to environment variables.
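The log-to-trace correlation mentioned above hinges on a handful of tracer environment variables. A minimal sketch assuming Datadog's unified service tagging variables; the service name and version below are hypothetical placeholders, and exact behaviour should be checked against the tracer docs for your runtime:

```shell
# Unified service tagging: the Datadog tracer reads these and, with log
# injection enabled, stamps trace/span IDs into log lines so logs line up
# with traces. Service name and version are illustrative only.
export DD_ENV=production
export DD_SERVICE=campus-portal   # hypothetical service name
export DD_VERSION=1.4.2
export DD_LOGS_INJECTION=true     # tracer injects trace IDs into logs
```

Setting the same three tags (`env`, `service`, `version`) on the agent, the tracer, and the log pipeline is what lets the UI pivot from a log line to its trace.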

For how long have I used the solution?

I've used the solution for about three years.

What do I think about the stability of the solution?

We have been impressed with the uptime and clean and light resource usage of the agents.

What do I think about the scalability of the solution?

The solution is very scalable, very customizable.

How are customer service and support?

Service is always helpful in tuning our committed costs and alerting us when we start spending outside the on-demand budget.

Which solution did I use previously and why did I switch?

We used a mix of a custom error email system, SolarWinds, UptimeRobot, and GitHub actions. We switched to find one platform that could give deep app visibility regardless of whether it is Linux or Windows or Container, cloud or on-prem hosted.

How was the initial setup?

The setup was generally simple. However, .NET Profiling of IIS and aligning logs to traces and profiles was a challenge.

What about the implementation team?

We implemented the solution in-house.

What was our ROI?

I'd count our ROI as significant time saved by the development team assessing bugs and performance issues.

What's my experience with pricing, setup cost, and licensing?

Set up live trials to assess cost scaling. Small decisions about how monitors are used can have big impacts on cost.

Which other solutions did I evaluate?

New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

What other advice do I have?

I'm excited to dig further into the new offerings around LLM and continue to grow our footprint in Datadog.


    reviewer1494894

Improved time to discovery and resolution but needs better consumption visibility

  • September 18, 2024
  • Review provided by PeerSpot

What is our primary use case?

The product monitors multiple systems, from customer interactions on our web applications down to the database and all layers in between. RUM, APM, logging, and infrastructure monitoring are all surfaced into single dashboards.

We initially started with application logs and generated long-term business metrics out of critical logs. We have turned those metrics and logs into a collection of alerts integrated into our pager system. As we have evolved, we have also used APM and RUM data to trigger additional alerts.

How has it helped my organization?

The solution has surfaced how integrated our applications really are and helps us track calls from the top down, identifying slowness and errors all through the call stack.

The biggest improvement we have seen is our time to discovery and resolution. As Datadog has improved, and we add new features, the depth and clarity we get from top to bottom has been excellent. Our engineering teams have quickly adopted many features within Datadog, and are quick to build out their own dashboards and alerts. This has also led to a rapid sprawl when left unchecked.

What is most valuable?

We started with application logs and have expanded over the years to include infrastructure, APM, and now RUM. All of these tools have been incredibly valuable in their own sphere. The huge value is tying all of the data points together.

Logging was the first tool we started with years ago, replacing our ELK stack. It was the easiest to get in place, and our engineers quickly embraced the tools. Several critical dashboards were created years ago and are still in use today. Over time, we have shifted from verbose logs and matured into APM and RUM. That has helped us focus on fine-tuning the performance of our applications.

What needs improvement?

We need better visibility into our consumption rate, which is tied to our commit levels. We would love to see a percentage consumed and to be alerted if we are over budget before getting an overage charge 20 days into the month.

The biggest complaint we hear comes from the cost of the tool. It is pretty easy to accidentally consume a lot of extra data. Unless you watch everything come in almost daily, you could be in for a big surprise.

We utilize the Datadog estimated usage metrics to build out alerts and dashboards. The usage and cost system page still doesn't tie into our committed spending - it would be wonderful to see the monthly burn rate on any given day.
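As a concrete example of the alerts described above, a monitor can be defined over the estimated-usage metric family. The sketch below assumes the `datadog.estimated_usage.logs.ingested_bytes` metric from Datadog's usage-metrics docs; the threshold is illustrative and would need tuning against your actual commit.

```
sum(last_1d):sum:datadog.estimated_usage.logs.ingested_bytes{*}.as_count() > 500000000000
```

A monitor like this fires when roughly 500 GB of logs are ingested in a day, giving a mid-month warning well before the overage charge lands.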

For how long have I used the solution?

I've used the solution for six years.

What do I think about the stability of the solution?

There have not been as many outages in the past year. We also haven't been jumping into the new features as quickly as they come out. We may be working on more stable products.

What do I think about the scalability of the solution?

The solution has scaled up to meet our needs pretty well. Over the years, we have only managed to trigger internal Datadog alerts once or twice by misconfiguring a metric and spiralling out of control with costs.

How are customer service and support?

Support has been lacking. Opening a chat with the tech support rep of the day is always a gamble. We are looking into working with third-party support because it has been so rough over the years.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We used the ELK stack for logging and monitoring and AppDynamics for APM.

How was the initial setup?

The initial setup for new teams has become easier over the years. We are increasing our adoption rate as we shift our technology to more cloud-native tools. Datadog has supported easy implementation by simply adding a package to the app.

They have really focused on a lot of out-of-the-box functionality, but the real fun happens as you dive deeper into the configuration. We have also begun adopting OpenTelemetry standards, which has kept us from going too deep into vendor-specific implementations.

What about the implementation team?

We did the initial setup via an in-house team.

What was our ROI?

As long as we stay on top of our consumption mid-month, it has been worth it. However, the few engineers we have who are dedicated to playing whack-a-mole with the growing spending could be better utilized in teaching best practices to new users. I suppose our implementation of the rapidly changing tools over the years has led to a fair amount of technical debt.

What's my experience with pricing, setup cost, and licensing?

It is quite easy to set up any specific tool, but to take advantage of the full visibility it offers, you need to instrument across the board—which can be time-consuming. Be careful about how each tool is billed, and watch your consumption like a hawk.

Which other solutions did I evaluate?

We evaluated AppDynamics and Dynatrace.

What other advice do I have?

It's a very powerful tool, with lots of new features coming, but you certainly will pay for what you get.