AI-driven monitoring has reduced incident resolution times and improved release confidence
What is our primary use case?
My main use case for Dynatrace involves daily work with monitoring charts, setting up alerts, and tracking response times and error rates to identify slow transaction bottlenecks in microservices. I also manage infrastructure monitoring, such as CPU, memory, and disk issues. When anomalies in resource consumption arise, I utilize the AI-powered Dynatrace Davis engine to quickly identify the root cause. Additionally, we employ real user monitoring (RUM) for alert and incident management, creating alerts with tools such as PagerDuty and ServiceNow when we need to raise incidents. We also focus on observability in our workloads deployed on a Kubernetes environment, including microservices and various servers.
I have a specific example of how Dynatrace helped me solve a performance issue: slow response times in our payment services, which could reach eight to ten seconds. I checked the traces and the request flow to understand the delays during calls to external APIs, and discovered that third-party API calls were waiting on responses because of DNS resolution issues. Dynatrace identified the slowdown and correlated it with spikes in DNS lookup times on a node in our Kubernetes cluster. After we rolled out a fix through our deployment releases, response times dropped to under one second. This approach has noticeably improved how we handle our common problems, with a success rate of almost fifty percent in troubleshooting.
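As a rough illustration of the kind of DNS check described above, the sketch below times name resolution from a node using only Python's standard library. The helper names `time_dns_lookup` and `is_slow`, and the 0.5-second threshold, are assumptions made for illustration; they are not taken from Dynatrace or from the original investigation.

```python
import socket
import time
from typing import Callable


def time_dns_lookup(hostname: str,
                    resolver: Callable[..., object] = socket.getaddrinfo) -> float:
    """Return the seconds spent resolving `hostname` (hypothetical helper).

    The resolver is injectable so the timing logic can be exercised
    without network access.
    """
    start = time.perf_counter()
    resolver(hostname, 443)  # socket.getaddrinfo raises socket.gaierror on failure
    return time.perf_counter() - start


def is_slow(elapsed_seconds: float, threshold_seconds: float = 0.5) -> bool:
    """Flag a lookup as slow; the 0.5 s default is an assumed value."""
    return elapsed_seconds > threshold_seconds
```

Running `time_dns_lookup` against the hostname of the external API from each node, and comparing the results, is one simple way to confirm whether a single node is the outlier.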
How has it helped my organization?
Dynatrace has positively impacted my organization by reducing incident resolution times, with Davis AI helping to pinpoint root causes effectively. We have seen a reduction of thirty to sixty percent in mean time to resolution (MTTR) for prioritized incidents and fewer escalations. Additionally, Dynatrace has helped us reduce alert noise, leading to forty to seventy percent fewer alerts while routing incidents more reliably to the correct teams. The quality of our releases improves gradually due to automated validation, allowing for quicker rollbacks and issue detection within minutes of deployment, which increases confidence in our CI/CD processes.
Dynatrace has contributed to significant improvements, such as reducing P1 ticket resolution time from four hours to under one hour and cutting alert volumes from between 200 and 400 alerts per week down to approximately 60 to 120. With PurePath and RUM data, the latency behind reported ticket issues dropped from over 3.5 seconds to around 2.1 seconds. After implementing the findings from Dynatrace analytics, we also recorded a substantial reduction in latency, from 3.8 to 1.4 seconds, and error rates now average under one percent.
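To put the before-and-after figures above on a common scale, the following sketch converts them to percentage reductions. The pairing of the low and high ends of the alert ranges (200 with 60, 400 with 120) is an assumption, and "under one hour" is treated as exactly one hour, so the P1 figure is a lower bound.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Figures quoted in the review; the range pairings are assumed.
reductions = {
    "P1 resolution time (hours)": percent_reduction(4.0, 1.0),
    "weekly alerts, low end": percent_reduction(200, 60),
    "weekly alerts, high end": percent_reduction(400, 120),
    "latency, first pair (s)": percent_reduction(3.5, 2.1),
    "latency, second pair (s)": percent_reduction(3.8, 1.4),
}

for label, value in reductions.items():
    print(f"{label}: {value:.0f}% reduction")
```

Under these assumptions the reductions come out at roughly 75% for P1 resolution time, 70% for alert volume, and 40% and 63% for the two latency pairs.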
What is most valuable?
The best features that Dynatrace offers include the AI-powered root cause analysis with Davis AI, which automatically identifies root causes by correlating metrics, logs, and traces, saving substantial time during incident resolution. Full-stack observability is another top feature, as it covers application, infrastructure, and network-related services while integrating with cloud environments. I appreciate the PurePath distributed tracing that provides deep dive insights into every transaction across microservices, helping us pinpoint slow database queries and external API calls. RUM allows us to track actual user sessions that impact UX, while synthetic monitoring proactively detects issues before they affect real users. OneAgents simplify infrastructure-related configurations, and I want to emphasize the importance of business analytics integration to tie technical metrics with business KPIs, as my role involves prioritizing issues based on their impact on business outcomes.
The feature that saves me the most time is Davis AI, as it automatically analyzes all data elements, understands metrics, logs, and traces, and pinpoints exact root causes of issues. Instead of manually digging through dashboards, I receive clear explanations of problems, such as high CPU usage due to garbage collection or memory issues, which drastically reduce the mean time to resolution (MTTR). The manual investigations that used to take hours can now be solved in under a minute, eliminating guesswork and allowing me to respond quickly without needing cross-team checks. For instance, Davis AI recently flagged a slowdown in microservices that led me to a recent inefficient data query introduced during deployment, allowing me to roll back changes in only fifteen minutes.
What needs improvement?
Beyond the features already discussed, I would like to see improvements in auto-discovery, smart instrumentation, and a unified data model to centralize all metrics and events on a single platform. This change would minimize the need to jump between tools and manually stitch data together. Continuous improvement features tied to SLO objectives should also ensure deployments meet performance standards.
In terms of improvements, I believe Dynatrace could enhance cost and licensing structures, as the current pricing can be expensive for large-scale deployments. More flexible and granular billing options would be beneficial, especially for ephemeral workloads. Additionally, while the initial setup is straightforward, understanding advanced features requires expertise. Improvements in user guidance, such as tutorials or workflow documentation, could help new users navigate the platform more easily, particularly with customization options and dashboard enhancements.
Further improvements could include fostering deep native integrations with major platforms and enhancing the ease of integrating with CI/CD tools such as Jenkins or GitHub Actions. Additionally, supporting better OpenTelemetry for custom traces and metrics would simplify setups. Native integrations with BI tools would enhance our analytical capabilities, making real-time dashboard creation easier.
For how long have I used the solution?
I have been using Dynatrace for three years. Before that, I was introduced to Kibana and other solutions such as AWS CloudWatch.
How was the initial setup?
Dynatrace was purchased through the AWS Marketplace, which made the setup process straightforward. I do not see a need for improvements to the setup beyond those I have already mentioned.
What other advice do I have?
For others exploring Dynatrace, my advice is to start by defining clear goals, such as improving incident resolution times or release quality. Familiarizing oneself with key features such as Davis AI and ensuring thorough tagging of services is essential for cleaner dashboards. Utilize AI for problem detection and integrate Dynatrace with incident management tools for efficient workflows.
Before concluding, I want to emphasize the importance of leveraging advanced features beyond basic monitoring, particularly with SLOs and release validations, and to be mindful of budgeting, as Dynatrace can get expensive at scale. I would rate this product an eight out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Unified observability has improved incident response and transformed complexity into insight
What is our primary use case?
My primary focus is unified observability and security, and in my organization, we use Dynatrace for monitoring the infrastructure, cloud platforms, and application performance to reduce major outages and intelligently combine metrics, logs, and traces into a single view.
We use Dynatrace's OneAgent to automatically discover and instrument our entire stack from the front-end user interaction to the back-end database. During high traffic events, we monitor real-time response and use this AI to automatically pinpoint the root cause of any slowdowns before they impact the customers.
What is most valuable?
Dynatrace is an excellent platform that is highly stable and easily manages over 10,000 hosts.
It does so through an elastic grid architecture, which efficiently handles thousands of services and containers with minimal overhead.
Dynatrace has drastically improved our MTTR and provides one monitoring tool across the entire landscape. We have achieved a massive reduction in operational costs and tool sprawl.
We have seen an average three-year return on investment of 451%, which is a huge number when it comes to savings. Since introducing Dynatrace, our IT and DevOps team has become more efficient, and we have 40% fewer Sev1 and Sev2 outages annually.
Dynatrace is a powerful expert-level tool that transforms complexity into a business asset. The inclusion of runtime application security has become a critical feature for our modern DevSecOps workflows.
What needs improvement?
The platform could improve its custom visualization options, as some executive dashboards feel restrictive compared to other tools. Additionally, making raw log data more cost-effective would add significant value.
For how long have I used the solution?
I have been using Dynatrace since early 2022 to manage our centralized log management and observability initiatives across multiple large-scale environments.
What do I think about the stability of the solution?
Dynatrace is very stable, with most users rating its reliability between an 8 and 10 on a 10-point scale, featuring an always-on architecture designed to be as resilient as the systems it monitors.
What do I think about the scalability of the solution?
Dynatrace is built for hyper-scale, easily managing over 10,000 hosts through an elastic grid architecture. It also efficiently handles thousands of services and containers.
How are customer service and support?
Technical support is highly responsive and expert, with users particularly valuing the Guardian Program for hands-on assistance during critical issues.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
We previously used a mix of traditional monitoring tools such as SolarWinds and Splunk, but we switched because we needed a more unified platform that was easier to deploy and provided better business-level visibility.
How was the initial setup?
Purchasing Dynatrace through the AWS Marketplace was straightforward and allowed us to leverage our existing AWS spend. While the cost can be high for CPU-intensive tasks, the automated setup and tools consolidation save us significant money.
What about the implementation team?
I advise defining a tagging policy before rolling Dynatrace out, to ensure your data stays organized as you scale. Also, start with the 15-day free trial to test OneAgent on your own machines and see automatic discovery in action.
What was our ROI?
We have seen an average three-year return on investment of 451%. Key metrics include 40% fewer Sev1 and Sev2 outages annually along with a 37% increase in the efficiency for our IT and DevOps team.
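The review does not say how the 451% figure was calculated. As a hedged illustration only, the sketch below uses the common definition of ROI, net benefit divided by cost, to show what such a figure would imply. The cost value of 100,000 is purely hypothetical.

```python
def roi_percent(total_benefit: float, total_cost: float) -> float:
    """Common ROI definition: net benefit over cost, in percent."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost * 100.0


def benefit_for_roi(roi_pct: float, total_cost: float) -> float:
    """Total benefit implied by a given ROI percentage and cost."""
    return total_cost * (1.0 + roi_pct / 100.0)


# Hypothetical three-year cost, chosen only to make the arithmetic concrete.
hypothetical_cost = 100_000.0
implied_benefit = benefit_for_roi(451.0, hypothetical_cost)
print(f"implied total benefit: {implied_benefit:,.0f}")
```

Under that definition, a 451% ROI means the total benefit over three years is about 5.5 times the cost.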
Which other solutions did I evaluate?
We evaluated several competitors, including New Relic, DataDog, and AppDynamics, but we ultimately chose Dynatrace for its superior automation, security-conscious design, and advanced AI-powered root cause analysis.
What other advice do I have?
I would rate this product a 9 out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Monitoring has improved full-stack visibility and now provides faster, AI-driven root cause analysis
What is our primary use case?
I use Dynatrace in my project for full-stack monitoring: to monitor EC2 instances and ECS containers, track application performance, and detect infrastructure, service, and container issues.
What is most valuable?
The best features Dynatrace offers include alert and dashboard configuration, such as problem notifications and custom dashboards for EC2 health, ECS service performance, error rates, and latency, along with anomaly detection. On the security and IAM side, we allow only outbound traffic to Dynatrace and use IAM roles for EC2, with no sensitive data stored locally. In my project, I run Dynatrace OneAgent on EC2 and ECS to monitor the infrastructure, containers, and application performance; it provides automatic service discovery, real-time metrics, distributed tracing, and AI-based root cause analysis.
What needs improvement?
Dynatrace can be improved by fine-tuning AI-based settings.
I feel that improvements around API rates and latency could be beneficial.
For how long have I used the solution?
I have been working in my current field for the past three to four years.
What do I think about the stability of the solution?
Dynatrace is stable in my experience.
What do I think about the scalability of the solution?
The scalability of Dynatrace is good.
How are customer service and support?
Customer support is excellent; whenever I raise a ticket with the vendor, they immediately connect with me and help me.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
Previously, I used only CloudWatch, but I switched to Dynatrace because the features in Dynatrace are not found in AWS CloudWatch.
How was the initial setup?
Dynatrace is deployed in my organization in the AWS cloud.
What about the implementation team?
I did not purchase Dynatrace from the AWS Marketplace; I hosted Dynatrace on AWS.
What was our ROI?
I have seen a return on investment in terms of time saved.
What's my experience with pricing, setup cost, and licensing?
My experience with pricing is that it costs $58 per month for full-stack monitoring.
Which other solutions did I evaluate?
I did not evaluate other options; Dynatrace was the only solution I considered.
What other advice do I have?
I advise anyone looking to set up a monitoring system in their project to consider Dynatrace because it will monitor their infrastructure, servers, and application performance, providing automatic service discovery and real-time metrics; it is the best option to choose or set up in their project. I gave this review a rating of 9 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
AI-driven insights have reduced downtime and improved cross-team collaboration
What is our primary use case?
Our main use case for Dynatrace is that we deployed Dynatrace OneAgent across our AKS nodes to monitor pod level metrics and service dependencies. Smartscape topology helped us visualize the entire environment in real-time and identify issues such as memory leaks and high response times during load. For our backend services, we were facing intermittent latency issues, but using Dynatrace we could pinpoint the exact methods and queries causing the application slowness and understand what exactly was happening.
What is most valuable?
Dynatrace offers several strong features. Before adopting Dynatrace we used other tools, including Prometheus and Grafana, but Dynatrace provides capabilities beyond them. Nowadays everything incorporates AI, and Dynatrace's Davis AI and built-in dashboards are clean and powerful. The AI integration allows us to get ahead of issues, reduce downtime, and resolve issues promptly. It is exceptional and gives Dynatrace better capabilities than its competitors.
Davis AI and the built-in dashboards have made a difference for our team. Whenever we were stuck on an issue and needed to dig into logs and metrics to find where our application was going wrong, the AI helped us narrow it down. It identifies the specific pod, application, or container and provides an overview at both the application and container level. We had to fine-tune it to align with our environment, so that whenever something goes down at the container or VM level we can drill down; the AI analysis then looks across the entire system and pinpoints the cause when we give it a detailed outline. This shortened our resolution time compared to using other tools.
Dynatrace has positively impacted our organization. We have had major outages across the organization, whether application-related or performance-related, with users experiencing slowness or blank pages. Setting up Dynatrace at the enterprise level cut outage time in half. All cross-functional teams can see exactly what is going on, allowing us to work on the actual resolution based on the metrics Dynatrace provides. The tool pinpointed the problem instead of leaving us scratching our heads over where things could have gone wrong, so we could proceed directly to the solution. This reduced our man-hours and outage time while increasing productivity based on performance metrics and observability.
What needs improvement?
We encountered some challenges while using Dynatrace. Although the initial setup was smooth, fine-tuning alert thresholds and custom metrics took some time. Another challenge was that Dynatrace charges based on host units, so we had to plan our agent deployments carefully. The licensing model is expensive. The complexity of setup is also an issue: while OneAgent and automatic service discovery are powerful, the setup is more complex than with tools such as Prometheus and Grafana. Those integrations are simple and basic, whereas Dynatrace setup becomes more complex depending on the environment, which makes it difficult for new users. However, the AI-related solutions and metrics took us to the next level for identifying and fixing things.
Dynatrace requires an agent to operate. OneAgent is powerful, but it is also resource-heavy, and on lightweight nodes or older systems it can slightly impact performance. A lightweight agent mode would make things faster. A long-term retention policy would also help, so that we could store more data and find fine-grained details. While Dynatrace Managed supports on-premises deployment, the SaaS version depends on cloud connectivity, so setup and updates can be challenging in highly regulated or air-gapped environments. Although the initial setup is smooth, fine-tuning the tool and fully understanding it end-to-end can be tricky.
What do I think about the stability of the solution?
What do I think about the scalability of the solution?
Dynatrace scalability is great. It is a powerful tool and helped us to reduce customer downtime and increase work efficiency. We could identify any issues causing problems in our application or environment.
How are customer service and support?
Customer support is very prompt. Whenever we faced any issues, we could get timely resolution from their support, so we did not face any issues.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
Before Dynatrace, we used open-source Prometheus. It is a standard option because it is open source and free, but it is not exceptional.
How was the initial setup?
What about the implementation team?
For purchasing Dynatrace, we explored the AWS Marketplace, but for the enterprise-level deployment we reached out to Dynatrace's representatives, who set up a call for the integration. The marketplace is fine, but implementing a solution at a large enterprise scale meant working with the Dynatrace team directly to set it up. For some of the bare metal hosts or engines, we purchased through the marketplace only.
What was our ROI?
When it comes to metrics and performance, we have clear return on investment. Before Dynatrace, we spent considerable time manually correlating logs, metrics, and traces from multiple tools to find root causes. Using Dynatrace directly improved application uptime and reduced customer impacting incidents. It helped our engineers focus on optimizing rather than firefighting.
What's my experience with pricing, setup cost, and licensing?
My experience with pricing, setup cost, and licensing is that the Dynatrace licensing model is based on host units and can be complex to understand at first. In a large-scale environment, it can be overwhelming because it is expensive. We had to plan carefully when deploying OneAgents across our nodes or clusters, ensuring we did not exceed our licensing capacity.
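The planning step described above can be sketched as a simple budget check before each rollout. Everything in this example is hypothetical: the node names, the per-host unit values, and the licensed total are made up, and how host units are assigned to each host should be taken from your own contract and the vendor documentation rather than from this sketch.

```python
from typing import Mapping


def host_unit_headroom(planned_units: Mapping[str, float],
                       licensed_units: float) -> float:
    """Return remaining licensed capacity; a negative value means over budget."""
    return licensed_units - sum(planned_units.values())


# Hypothetical rollout plan: host name -> host units that host would consume.
plan = {"aks-node-1": 1.0, "aks-node-2": 1.0, "aks-node-3": 0.5}
headroom = host_unit_headroom(plan, licensed_units=3.0)

if headroom < 0:
    print(f"over budget by {-headroom} host units")
else:
    print(f"within budget, {headroom} host units to spare")
```

With the made-up numbers above, the plan uses 2.5 of 3.0 licensed units, leaving 0.5 in reserve.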
Which other solutions did I evaluate?
Before choosing Dynatrace, we considered DataDog and other tools available in the market, but after comparing them and the value each could bring to our organization, we chose Dynatrace.
What other advice do I have?
My advice for others looking into using Dynatrace depends on the organization. My main advice is that if you want an enterprise-level observability and monitoring solution for a hassle-free experience with troubleshooting, identifying and reducing customer downtime, and increasing work efficiency, or if you want a stable and standard solution, Dynatrace is the best fit. This review received a rating of nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Monitoring tools streamline root cause analysis and enhance user experience
What is our primary use case?
My main use cases with Dynatrace cover a wide range of functionality. We use real user monitoring (RUM) and synthetic monitoring extensively, and we also use Kafka monitoring. I am aware of how AWS metrics are being sent, whether through a direct integration or only an account-level integration. We use the tool heavily.
What is most valuable?
Some of the best features I appreciate about Dynatrace include synthetic monitoring, where you can get into HTTP level monitoring and click browser monitoring, which is available in both Splunk and Dynatrace. This is very useful in our environment. Apart from that, mobile-based monitoring, which we have embedded in some cases with the apps that are connected, is also beneficial for monitoring APIs.
Synthetic monitoring has had a significant impact on my ability to track performance proactively. It has been very useful. Our monitoring serves two audiences. Synthetic monitoring is primarily for the front-end side, where we track availability and whether the website is running, and verify that users are able to log in and see things working.
From the infrastructure point of view, the availability of our AWS infrastructure, including Kafka, EC2 instances, and Lambda functions, is covered for the monitoring system and infrastructure team. We are catering to both audiences.
Dynatrace's AI-driven Davis engine absolutely helps us identify performance issues through root cause analysis, 200%. Whatever is integrated and visible, it can stitch together and show.
The comprehensive application topology visualization in Dynatrace, called Smartscape, has benefited my understanding of system dependencies. Even though Splunk has something similar, since we are using Dynatrace primarily for observability, we can get top-to-bottom visibility, from infrastructure to network to application front end. When integrated based on application requirements, we get a good grasp of what's happening during issues.
What needs improvement?
Dynatrace could be improved primarily with regard to pricing. These tools are expensive. They have been the pioneers from inception, and they remain at the top of the Gartner chart. Our team believes that the quicker we learn, the better we can serve.
Learning is another aspect that needs improvement, specifically how the tools can educate users. Education needs to be tailored differently for front-end user monitoring, infrastructure-based monitoring, and centralized monitoring teams. The segregation of educating how to use the tool is something we have recommended to all tools teams and product owners.
They could help with more learning capabilities. A monitoring tool will be used by different types of users. One is from the infrastructure point of view. Another is an application developer who wants to see if strings are getting attached and code-based applications are being stitched together. A front-end monitoring person or business wants to know about website and infrastructure availability, and what kinds of dashboards they can create for their comfort. That's how a tool gains visibility and inclusiveness. It depends on the owner of the tool to address these different aspects and let users choose their preferred way to use the tool.
For how long have I used the solution?
I have nearly six years of experience with Dynatrace.
What was my experience with deployment of the solution?
Regarding the initial setup, while I can't speak for how my company implemented it overall, I can say that the Dynatrace setup is good enough and not an issue. Integration with the cloud is straightforward. For new cloud services, some digging is needed to determine the installation options. That is the challenge in the cloud: whether to integrate with the cloud directly or opt for agent installation. These issues arise when new features or services are enabled in AWS; in parallel, Dynatrace, Splunk, and similar tools need to adapt so that these services can be monitored. Users start asking whether we can enable new AWS features in our tools. This synchronization should happen at the back end; users should not be involved in that process.
How are customer service and support?
I think Dynatrace's customer service and technical support for this product is good. Out of five, it rates nearly four. This is good enough to expect.
How would you rate customer service and support?
Which other solutions did I evaluate?
The monitoring team is able to work without any user intervention, which I appreciate. Compared to other tools such as Splunk, CloudWatch, and others I have researched for RFPs, I feel DataDog is good enough. However, based on the experience of my users, Dynatrace is more flexible for them.
What other advice do I have?
We deal with APM solutions and monitoring or logging solutions by having Splunk already in place in my environment, along with Dynatrace. We have other tools such as CloudWatch.
I have dealt with Splunk on-call and have all kinds of experience. I have used the PeerSpot platform extensively, and it does help significantly. I will be happy to provide individual product reviews.
On a scale of 1-10, I rate Dynatrace an 8 out of 10.
Which deployment model are you using for this solution?
Amazon Web Services (AWS)
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Integration empowers detailed reporting and automated alert management
What is our primary use case?
We decided to integrate Dynatrace primarily to monitor the availability and latency of our applications. We have a significant number of applications, and we needed a system that could manage their availability. Previously, we lacked evidence to show whether an application was available when a customer raised an issue. With Dynatrace, we can verify this information, even after a month.
What is most valuable?
Dynatrace offers exceptional features compared to other monitoring tools. It allows the creation of different dashboards related to application availability and server latency. The integration with Power BI for generating detailed reports is a standout feature. Furthermore, the automation of alert management, which integrates with ticketing systems to automatically raise and assign tickets, is a significant advantage over tools like Centreon. Dynatrace also integrates with AI tools to generate business reports.
What needs improvement?
When making comparisons with other tools, Dynatrace stands out, but it is a bit complex to understand and implement on servers. There are occasional issues with language and text, which can be frustrating. Despite this, it remains superior to Centreon.
For how long have I used the solution?
I started using Dynatrace six months ago.
What do I think about the stability of the solution?
There have been no stability issues with Dynatrace. It is well-maintained and, being supported by cloud infrastructure, we have experienced no interruptions.
What do I think about the scalability of the solution?
We have faced no scalability or reliability issues with Dynatrace. Everything functions smoothly.
How are customer service and support?
The technical support from Dynatrace is excellent. Recently, we had an issue, and the support team resolved it within 20 to 25 minutes. Their support is very fast and effective.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
In the past, I used Icinga and Centreon. Compared to these, Dynatrace remains the top priority. Its features surpass the typical expectations for monitoring tools.
What about the implementation team?
The implementation of Dynatrace was handled by a different team, so I don't have insights into their process.
What was our ROI?
The integration of Dynatrace has allowed us to monitor application stability and create business reports. However, I haven't quantified the ROI.
What's my experience with pricing, setup cost, and licensing?
Dynatrace is known to be costly, which delayed its integration into our system. However, the advanced features it provides are unmatched, making it a valuable investment for organizations with a higher budget.
What other advice do I have?
I highly recommend Dynatrace for organizations with a significant budget because it decreases resource costs and manpower. It is a versatile tool for IT organizations, providing features that are beneficial for comprehensive application monitoring. I rate Dynatrace a 9 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?