My main use case for Datadog is dashboards and monitoring.
We use dashboards and monitoring with Datadog to monitor the performance of our Nexus Artifactory system and make sure the services are running.
The best features Datadog offers are the dashboarding tools as well as the monitoring tools.
What I find most valuable about the dashboarding and monitoring tools in Datadog is the ease of use and simplicity of the interface.
Datadog has positively impacted our organization by allowing us to look at things such as Cloud Spend and make sure our services are running at an optimal performance level.
We have seen specific outcomes such as cost savings by utilizing the cost utilization dashboards to identify areas where we could trim our spend.
To improve Datadog, I suggest they keep doing what they're doing.
Newer features using AI to create monitors and dashboards would be helpful.
I have been using Datadog for six years.
Datadog is stable.
I am not sure about Datadog's scalability.
Customer support with Datadog has been great when we needed it.
I rate the customer support a nine on a scale of 1 to 10.
Positive
We did not previously use a different solution.
In terms of return on investment, there is a lot of time saved from using the platform.
I was not directly involved in the pricing, setup cost, and licensing details.
Before choosing Datadog, we evaluated other options such as Splunk and Grafana.
I rate Datadog an eight out of ten because the expense of using it keeps it from being a nine or ten.
My advice to others looking into using Datadog is to brush up on their API programming skills.
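For readers wondering what that API work can look like, here is a minimal sketch (not this reviewer's actual setup) using the legacy `datadog` Python client to create a simple dashboard; the keys, dashboard title, and metric are placeholders.

```python
# Minimal sketch only: create a simple dashboard through the Datadog API using
# the legacy "datadog" Python client. The keys, title, and metric below are
# placeholders, not values taken from this review.
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

dashboard = api.Dashboard.create(
    title="Example service health dashboard",
    description="Created programmatically via the Datadog API",
    layout_type="ordered",
    widgets=[
        {
            "definition": {
                "type": "timeseries",
                "title": "Average CPU by host",
                "requests": [{"q": "avg:system.cpu.user{*} by {host}"}],
            }
        }
    ],
)
print(dashboard.get("id"), dashboard.get("url"))
```

Monitors can be created the same way through `api.Monitor.create`, which is where the API skills the reviewer mentions tend to pay off.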
My overall rating for Datadog is eight out of ten.
My main use case for Datadog is performance monitoring, SLOs, and SLIs.
For performance monitoring, SLOs, and SLIs, we create objectives and indicators around user feedback and stakeholder feedback. We have weekly meetings to create backlog items to work on if things have elapsed and gone into the red based on our SLO definitions.
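As a rough illustration of defining such an objective in code (assuming the legacy `datadog` Python client's ServiceLevelObjective resource; the service, metrics, and target below are invented, not the reviewer's actual SLOs):

```python
# Rough sketch of a metric-based SLO via the legacy "datadog" Python client.
# The service name, APM metrics, and 99.9% target are illustrative placeholders.
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

slo = api.ServiceLevelObjective.create(
    type="metric",
    name="Checkout API availability",            # hypothetical service
    description="Share of requests that do not error",
    query={
        # Numerator and denominator must be count-based queries.
        "numerator": "sum:trace.flask.request.hits{service:checkout}.as_count()"
                     " - sum:trace.flask.request.errors{service:checkout}.as_count()",
        "denominator": "sum:trace.flask.request.hits{service:checkout}.as_count()",
    },
    thresholds=[{"timeframe": "30d", "target": 99.9}],
    tags=["team:platform"],
)
print(slo)
```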
The best features Datadog offers are the analytics that are all associated with each other. RUM data associated with APM, trace data, and all of that, including information around inferred requests, has been super useful. Machine health data gives a complete picture of performance, which has been extremely useful for troubleshooting difficult problems.
Having all that associated analytics helps me in troubleshooting because I don't have to bounce around to other tools, which saves me a lot of time. I know that the quality of the metrics Datadog gathers is good enough that I can rule things in and out. This basically goes for any web app: when asking why a web app is slow, first you look at the code. If the code looks good, then you look at the hardware or the database. Being able to rule all of those out with one tool and one set of requests is useful.
Datadog has positively impacted my organization by allowing us to gather complete data instead of looking all over the place at incomplete data and actually make pointed determinations for fixing issues. It has helped increase efficiency and saved time.
I don't know how Datadog can be improved as they are doing a pretty good job.
I have been using Datadog for three years.
Datadog is stable.
Datadog's scalability is good.
The customer support is good.
My experience with pricing, setup cost, and licensing is that it is really expensive.
I have not seen a return on investment.
My advice to others looking into using Datadog is that it is good and they should use it.
I don't know if my company has a business relationship with this vendor other than being a customer.
On a scale of 1-10, I rate Datadog a 9.
My main use case for Datadog is application and portal monitoring.
For application or portal monitoring, we have several monitors set up that give us a heads up early when we believe there's a problem with end users getting to the applications that are available to them on the portal. Just yesterday, we were able to identify an error in code that was throwing thousands of errors a day, and it was very simple for us to actually find it using Datadog analytics on the error and the Watchdog alerts.
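For illustration, a monitor that gives that kind of early heads-up on error volume could be created as in the sketch below (legacy `datadog` Python client; the service name, threshold, and notification handle are invented, not this team's configuration):

```python
# Hedged example only: a log alert monitor that fires when error logs spike.
# Service name, threshold, and the notification handle are placeholders.
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

monitor = api.Monitor.create(
    type="log alert",
    name="Portal error volume is unusually high",
    query='logs("service:portal status:error").index("*").rollup("count").last("15m") > 500',
    message="More than 500 portal errors in 15 minutes. @slack-portal-alerts",
    tags=["app:portal"],
    options={"thresholds": {"critical": 500}},
)
print(monitor.get("id"))
```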
I don't have anything else to add about my main use case, other than how easily we were able to identify an issue that, before we had Datadog, we might not even have been aware of, even though it was consuming resources it didn't need to.
In my opinion, the best features Datadog offers are flexibility and extensive support. It can be a little overwhelming since there are so many features that come with Datadog, and I'm just scratching the surface of that. I also appreciate the support that our representative has provided to us, coming on-prem, providing training, being available to answer questions, and the extensive knowledge base documentation that I have been referred to, which has been extremely helpful also.
The flexibility I mentioned shows up in my day-to-day work because, traditionally, I was using SolarWinds to monitor infrastructure health, but its polling period is lengthier than we would like. Datadog has real-time monitoring, and the alerts we have configured reach us much more quickly, so we're able to address an issue sooner rather than later. When it comes to reviewing .NET code or application configuration, I previously had only limited visibility, but with Datadog analyzing the IIS logs and any other application logs, it has opened up visibility so that I can assist a developer in identifying the area of concern or where code could be written more efficiently.
Datadog has positively impacted my organization by helping us make our web portals more efficient. Our portals and integrations are extremely complex, and as we get the agent installed on more devices, it's really provided us visibility that we haven't had in my entire career with Ace Hardware.
I cannot provide specific numbers for the improved performance, but Datadog has identified issues that we have in our data source area. We have implemented additional indexes and have plans for breaking out complex queries that are pulling data across multiple data sources. We're in the crawl, walk, run phase, so right now we're identifying and prioritizing the things that need to be fixed. A few of the things that we've already addressed include adding additional resources to servers, and we have noticed improved performance. I know someone has the statistics; I just don't have them available to me at the moment.
At this point, I'm not sure how Datadog can be improved, but maybe some initial intense training from the vendor before setting us loose with the application is the only thing I can think of.
I think it would be helpful to have an administrative page right in the portal that gives us links to the application documentation. Unless I'm just not seeing them, I have to go to separate URLs to get to the various locations I need; I cannot get to some of the documentation and various other components from my company-specific portal.
Other than being restricted by cost, the main scalability challenge with Datadog has been the initial installation of the agent. We have upgraded all of our agents so that we can do upgrades remotely, but the initial install is still a little time-consuming and a little clunky.
I think the customer support is great. I love the ability to send flares directly from the machine or device that's having an issue, and my tickets are always opened promptly. I usually get links to documentation about the specific feature or function that I'm trying to implement, and when I have additional questions, the ticket is updated with actual recommendations or suggestions pointing me in the correct direction.
Positive
We continue to use SolarWinds, although I can see the infrastructure monitoring component of SolarWinds being replaced with Datadog. We also used Catchpoint to run synthetic scripts from various locations throughout the country, and we use Pingdom for our e-commerce solution. We're trying to phase out Pingdom at this time with the help of Datadog engineers, and we have ceased using Catchpoint because we have created those synthetic scripts within Datadog.
At this point, I'm leaving the return on investment metrics to my manager and director. I'm just focused on getting it up and running, installed, upgraded, and helping to train other folks to use it. I know they're trying to keep metrics on all of those questions, but I'm just not focusing on that at this time.
I was not included in the pricing, setup cost, and licensing decisions, but I have needed to gain more information about licensing and individual feature cost projections. Everybody wants the agent installed, but we only have so many dollars to spread across, so it's been difficult for me to prioritize who will benefit from Datadog at this time.
I'm excited to learn more about the application and can't wait to see, as my knowledge expands, all the exciting things we might be able to do with the tool.
I rate Datadog an 8 out of 10, only because I haven't had the ability to explore everything that I intend to explore, and some of the more complex monitors I want to create I'm just not able to build intuitively. But that might be on me and not the product. The complexity, and my lack of knowledge of all the features and how I can use them, keep it from being a 10 for me.
I would advise others looking into using Datadog to do more training and become much more familiar with the product before going live with it. There are so many wonderful things that can be done with it that it's a little overwhelming to only attempt to configure those or investigate them when the product's already live.
I'm excited to continue to learn and explore the tool. It's giving me some insight into systems that I have not had for the past 17 years, so it's exciting to be able to see that and put it to use almost immediately.
The primary purposes for which Datadog is used include infrastructure monitoring and application monitoring.
The main use case for Datadog's integration capabilities is monitoring workloads in the public cloud, and the integrations that pull public cloud metrics natively were helpful, even critical, for us. We are not using Datadog for AI-driven data analysis tasks; at my last employer we relied on cloud-native and vendor-native tools for that, and we didn't use Datadog for the AI piece at all.
I find alerting and metrics to be the most effective features of Datadog for system monitoring. It was still cheaper to run Datadog than other alternatives; as a SaaS product, the running costs were lower and it was quite easy to use.
Datadog is available only as SaaS.
The pricing nowadays is quite complex.
In future updates, I would like to see AI features included in Datadog for monitoring AI spend and usage to make the product more versatile and appealing for the customer.
I have been using Datadog since 2014.
There were no problems with the deployment of Datadog.
The deployment of Datadog just took a few hours.
The challenges I encountered while using Datadog were in the early days, when the product was missing the ability to monitor Kubernetes and similar capabilities, but they have since added those features. At the moment, I don't think there are too many challenges that I am worrying about.
One person is enough to do the installation.
I am not working with any of these solutions currently because I'm on sabbatical; I last worked with Datadog six months ago.
We were using the tools that AWS and Azure came with natively to monitor the AI workflows on their platforms.
I used to work as the CTO at Northcloud, but I no longer work there.
On a scale of one to ten, I rate Datadog an eight out of ten.
The technology itself is generally very useful, and the interface is great.
There should be a clearer view of the expenses.
I have used the solution for four years.
The solution is stable.
I have not personally interacted with customer service, but I am satisfied with tech support.
Neutral
I am using ThousandEyes and Datadog. Datadog supports AI-driven data analysis, with some AI elements for analysis, such as data processing tools. In Datadog, AI helps primarily with resolving application issues.
It was not difficult to set up for me. There was no problem.
I can confirm there is a return on investment.
I find the setup cost to be too expensive; the setup cost for Datadog is more than $100. I am evaluating the usage of this solution; however, it is too expensive.
I would rate this solution eight out of ten.
We use Datadog for monitoring and observing all of our systems, which range in complexity from lightweight, user-facing serverless lambda functions with millions of daily calls to huge, monolithic internal applications that are essential to our core operations. The value we derive from Datadog stems from its ability to handle and parse a massive volume of incoming data from many different sources and tie it together into a single, informative view of reliability and performance across our architecture.
Adopting Datadog has been fantastic for our observability strategy. Where previously we were grepping through gigabytes of plaintext logs, now we're able to quickly sort, filter, and search millions of log entries with ease. When an issue arises, Datadog makes it easy to track down the malfunctioning service, diagnose the problem, and push a fix.
Consequently, our team efficiency has skyrocketed. No longer does it take hours to find the root cause of an issue across multiple services. Shortened debugging time, in turn, leads to more time for impactful, user-facing work.
Our services have many moving parts, all of which need to talk to each other. The Service Map makes visualizing this complex architecture - and locating problems - an absolute breeze. When I reflect on the ways we used to track down issues, I can't imagine how we ever managed before Datadog.
Additionally, our architecture is written in several languages, and one area where Datadog particularly shines is in providing first-class support for a multitude of programming languages. We haven't found a case yet where we needed to roll out our own solution for communicating with our instance.
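As one concrete example of that language support (a sketch for a Python service, with a made-up service and span name rather than anything from this review), the `ddtrace` library can auto-instrument common libraries and wrap custom code:

```python
# Illustrative only: instrumenting a Python service for Datadog APM with ddtrace.
# The service and span names are placeholders.
from ddtrace import tracer, patch_all

patch_all()  # auto-instrument supported libraries (requests, Flask, psycopg2, ...)

@tracer.wrap(name="billing.generate_invoice", service="billing-worker")
def generate_invoice(account_id):
    # The custom span created by the decorator shows up on the trace in Datadog;
    # tags added here make it easier to filter traces later.
    span = tracer.current_span()
    if span:
        span.set_tag("account_id", account_id)
    ...  # business logic goes here
```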
A tool as powerful as Datadog is, understandably, going to have a bit of a learning curve, especially for new team members who are unfamiliar with the bevy of features it offers. Bringing new team members up to speed on its abilities can be challenging and sometimes requires too much hand-holding. The documentation is adequate, but team members coming into a project could benefit from more guided, interactive tutorials, ideally leveraging real-world data. This would give them the confidence to navigate the tool and make the most of all it offers.
The company was using it before I arrived; I'm unsure of how long before.
We use Datadog for monitoring the performance of our infrastructure across multiple types of hosts in multiple environments. We also use APM to monitor our applications in production.
We have some Kubernetes clusters and multi-cloud hosts with Datadog agents installed. We have recently added RUM to monitor our application from the user side, including replay sessions, and we are hoping to use those to replace our existing monitoring for errors and session replay for debugging issues in the application.
We have been using Datadog since I started working at the company ten years ago and it has been used for many reasons over the years. Datadog across our services has helped debug slow performance on specific parts of our application, which, in turn, allows us to provide a snappier and more performant application for our customers.
The monitoring and alerting system has allowed our team to be aware of the issues that have come up in our production system and react faster with more tools to debug and view to keep the system online for our customers.
Datadog infrastructure monitoring has helped us identify health issues with our virtual machines, such as high load, CPU, and disk usage, as well as monitoring uptime and alerting when Kubernetes containers have trouble staying up. Our use of Datadog's application performance monitoring (APM) over the last six years or so has been crucial to identifying performance and bottleneck issues, as well as alerting us when services are seeing high error rates, which has made it easier to debug when specific services may be going down.
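A sketch of the kind of infrastructure alert described here, again using the legacy `datadog` Python client (thresholds, environment tag, and notification handle are invented for illustration):

```python
# Hedged sketch: a metric alert on host CPU, analogous to the load/CPU/disk
# alerts described above. Thresholds and tags are placeholders.
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

api.Monitor.create(
    type="metric alert",
    name="High CPU on {{host.name}}",
    query="avg(last_5m):avg:system.cpu.user{env:production} by {host} > 90",
    message="CPU above 90% for 5 minutes on {{host.name}}. @pagerduty",
    tags=["team:infra"],
    options={"thresholds": {"critical": 90, "warning": 80}},
)
```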
We have found that some of the options for filtering log ingestion, APM trace and span ingestion, and RUM sessions versus replay settings can be hard to discover, and it can be tough to determine how to adjust and tweak them, both for optimal performance and monitoring and for billing within the console.
It can sometimes be difficult to determine which information is documented, as we have found inconsistencies with deprecated information, such as environment variables within the documentation.
I've been using the solution for ten years.
The solution seems pretty stable, as we've been using it for more than a decade.
The solution seems quite scalable, especially within Kubernetes. Costs are a factor.
Support has been very helpful whenever we need it.
Positive
We had tried some other APM monitoring in the past; however, it was too expensive. We then added APM through Datadog, since we were already using Datadog and it seemed like a good value add.
The solution is straightforward to set up. Sometimes, it is complex to find the correct documentation.
We handled the setup in-house.
Our ROI is ease of mind with alerts and monitoring, as well as the ability to review and debug issues for our customers.
Getting settled on pricing is something you want to keep an eye on, as things seem to change regularly.
We used New Relic previously.
Datadog is a great service that is continually growing its solution for monitoring and security. It is easy to set up and turn on and off its features once you have instrumented agents and tailored solutions to your needs.
We use Datadog across the enterprise for observability of infrastructure, APM, RUM, SLO management, alert management and monitoring, and other features. We're also planning on using the upcoming cloud cost management features and product analytics.
For infrastructure, we integrate with our Kube systems to show all hosts and their data.
For APM, we use it with all of our API and worker services, as well as cronjobs and other Kube deployments.
We use serverless to monitor our Cloud Functions.
We use RUM for all of our user interfaces, including web and mobile.
It's given us the observability we need to see what's happening in our systems, end to end. We get full stack visibility from APM and RUM, through to logging and infrastructure/host visibility. It's also becoming the basis of our incident management process in conjunction with PagerDuty.
APM is probably the most prominent place where it has helped us. APM gives us detailed data on service performance, including latency and request count. This drives all of the work that we do on SLOs and SLAs.
RUM is also prominent and is becoming the basis of our product team's vision of how our software is actually used.
APM is a fundamental part of our service management, both for viewing problems and improving latency and uptime. The latency views drive our SLOs and help us identify problems.
We also use APM and metrics to view the status of our Pub/Sub topics and queues, especially when dealing with undelivered messages.
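As a hedged sketch of checking that undelivered-message backlog from code (the subscription tag below is a placeholder; the metric name comes from Datadog's GCP integration):

```python
# Sketch only: query the Pub/Sub undelivered-messages metric through the
# legacy "datadog" Python client to spot a growing backlog.
import time

from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

now = int(time.time())
result = api.Metric.query(
    start=now - 3600,
    end=now,
    query="avg:gcp.pubsub.subscription.num_undelivered_messages"
          "{subscription_id:orders-sub} by {subscription_id}",  # placeholder subscription
)
for series in result.get("series", []):
    scope, points = series["scope"], series["pointlist"]
    print(scope, points[-1] if points else "no data")
```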
RUM has been critical in identifying what our users are actually doing, and we'll be using the new product analytics tools to research and drive new feature development.
All of this feeds into the PagerDuty integration, which we use to drive our incident management process.
Sometimes the solution changes features so quickly that the UI keeps moving around. The cost is pretty high. Outside of that, we've been relatively happy.
The APM service catalog is evolving fast. That said, it is redundant with our other tools and doesn't allow us to manage software maturity. However, we do link it with our other tools using the APIs, so that's helpful.
Product analytics is relatively new and based on RUM, so it will be interesting to see how it evolves.
Sometimes some of the graphs take a while to load, based on the window of data.
Some stock dashboards don't allow customization. You need to clone them first, but this can lead to an abundance of dashboards. Also, there are some things that stock dashboards do that can't yet be duplicated with custom dashboards, especially around widget organization.
The "top users" widget on the product analytics page only groups by user email, which is unfortunate, since user ID is the field we use to identify our users.
I've used the solution for three and a half years.
The solution is pretty stable.
The solution is very scalable.
Support was excellent during the sales process, with a huge drop-off after we purchased the product. Only recently (within the past year) have they begun to reach acceptable levels again.
Neutral
We did not have a global solution. Some teams were using New Relic.
The instructions aren't always clear, especially when dealing with multiple products across multiple languages. The tracer works very differently from one language to another.
We handled the setup in-house.
We have built our own set of installation instructions for our teams, to ensure consistent tagging and APM setup.
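As a sketch of what such instructions typically standardize (not this team's actual document), Datadog's unified service tagging hinges on three environment variables that `ddtrace` picks up automatically; the values below are placeholders:

```python
# Illustration only: consistent tagging for APM via DD_ENV / DD_SERVICE / DD_VERSION.
# In practice these are usually set in the deployment environment, not in code.
import os

os.environ.setdefault("DD_ENV", "production")
os.environ.setdefault("DD_SERVICE", "checkout-api")  # placeholder service name
os.environ.setdefault("DD_VERSION", "1.4.2")         # placeholder version

from ddtrace import patch_all  # imported after the env vars so ddtrace sees them

patch_all()  # every span now carries the env/service/version tags above
```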
We did look at Dynatrace.
The service was great during the initial testing phase. However, once we bought the product, the quality of service dropped significantly. However, in the past year or so, it has improved and is now approaching the level we'd expect based on the cost.
We were looking for an all-in-one observability platform that could handle a number of different environments and products. At a basic level, we have a variety of on-premises servers (Windows/Mac/Linux) as well as a number of commercial, cloud-hosted products.
While it's often possible to let each team rely on its own means for monitoring, we wanted something that the entire company could rally around - a unified platform that is developed and supported by the very same people, not others just slapping their name on some open source products they have no control over.
Datadog has effortlessly dropped in to nearly every stage of observability for us. We appreciate how it has robust cross-platform support for our IT assets, and for integrating hosted products, enabling integrations often couldn't be easier, with many of them including native dashboards and even other types of content packs.
Over the last couple of years, we have onboarded a number of engineering teams, and each of them feels comfortable using Datadog. This gives us the ability to build organizational knowledge.
Datadog's learning platform is second to none. It's the gold standard of training resources in my mind; not only are these self-paced courses available at no charge, but you can spin up an actual Datadog environment to try out its various features.
I just hate when other vendors try to upsell you on training beyond their (often poorly-written) documentation. Apart from that, we appreciate the variety of content that comes from Datadog's built-in integrations - for common sources, we don't have to worry about parsing, creating dashboards, or otherwise reinventing the wheel.
Datadog's roadmap can be a bit unpredictable at times. For instance, a few years ago, our rep at the time stated that Datadog had dropped its plans to develop an incident on-call platform. However, this year, they released a platform that does exactly that.
They also decided to drop chat-based support just recently. While I understand that it's often easier to work with support tickets, I do miss the easy availability of live support.
It would be nice if Datadog continued to broaden its variety of available integrations to include even more commercial platforms because that is central to its appeal. If we're looking at a new product and there isn't a native integration, then that's more work on our part.
I work in product design, and although we use Datadog for monitoring, etc., my use case is different, as I mostly review and watch session recordings from users to gain insight into user feedback.
We watch multiple sessions per week to understand how users are using our product. From this data, we are able to hone in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.
Using Datadog has allowed us to review detailed interactions of users at a scale that leads us to make informed data-driven UX improvements as mentioned above.
Being able to pinpoint specific users via filtering is also very useful as it means when we have direct feedback from a specific user, we can follow up by watching their session back.
The engineering team's use case for Datadog is alerting, which is also very useful for us, as it gives us visibility into how stable our platform is through various lenses.
Session recordings have been the most valuable to me as it helps me gain insights into user behaviour at scale. By capturing real-time interactions, such as clicks, scrolls, and navigation paths, we can identify patterns and trends across a large user base. This helps us pinpoint usability issues, optimize the user experience, and improve the overall experience for our users. Analyzing these recordings enables us to make data-driven decisions that enhance both functionality and user satisfaction.
I'd like the ability to see more in-depth actions on user sessions, such as where specific problems occur. Rather than having to watch numerous session recordings to understand where this happens, I'd like to get alerts or notifications for specific areas that users are struggling with, such as rage clicks.
In terms of UI, everything is very small, which makes it quite difficult to navigate at times, especially from an accessibility standpoint, so I'd love for there to be more attention on this.
I've used the solution for over one year.
We did not evaluate other options.
I wasn't part of the decision-making process during licensing.
I wasn't part of the decision-making process during the evaluation stage.