Dependable Alerts, Smart Incident Grouping, and Deep Integrations
What do you like best about the product?
It is dependable: when a system fails at 3 AM, PagerDuty ensures the alert actually gets through via phone or SMS. It uses AIOps to group related alerts into a single incident, preventing "alert fatigue" by filtering out the background static. With over 700 integrations (AWS, Datadog, Slack), it acts as a central system for the entire tech stack. It makes global rotations and escalation policies simple, ensuring there is always a "warm body" ready to respond.
What do you dislike about the product?
It is notoriously expensive, often burdening users with high per-seat costs and locking essential features like advanced analytics or AIOps behind pricey upper-tier plans. The web interface can feel cluttered and dated, making simple tasks unintuitive. Setting up sophisticated incident workflows and service dependencies often requires significant manual effort and involves a steep learning curve for new administrators. Despite its noise-reduction tools, teams still struggle with pager fatigue from low-priority alerts that haven't been perfectly tuned.
What problems is the product solving and how is that benefiting you?
Without PagerDuty, a single server failure might trigger 500 individual emails or Slack pings. It groups these related alerts into a single incident, preventing you from being overwhelmed. It reduces the time to mobilize a team by providing one-click "war rooms," automated diagnostic scripts, and deep links to the exact code or server that is failing. It automates status updates and communications so the engineers can focus on fixing the bug instead of answering questions.
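To make the grouping idea concrete, here is a minimal sketch in Python. It is not PagerDuty's actual AIOps logic, which the review does not describe. It assumes the simplest possible rule: alerts from the same host that arrive within a fixed time window of each other belong to one incident. The `Alert` and `Incident` types, the `group_alerts` helper, the five-minute window, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    host: str         # hypothetical field: machine that raised the alert
    message: str
    timestamp: float  # seconds since an arbitrary reference point


@dataclass
class Incident:
    host: str
    alerts: list = field(default_factory=list)


def group_alerts(alerts, window_seconds=300.0):
    """Group alerts per host; a gap longer than window_seconds starts a new incident."""
    incidents = []
    latest = {}  # host -> most recent Incident opened for that host
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest.get(alert.host)
        if (current is not None
                and alert.timestamp - current.alerts[-1].timestamp <= window_seconds):
            # Close enough in time to the previous alert: same incident.
            current.alerts.append(alert)
        else:
            # First alert for this host, or the gap was too long.
            current = Incident(host=alert.host, alerts=[alert])
            incidents.append(current)
            latest[alert.host] = current
    return incidents


# Hypothetical storm: 500 alerts from one failing server, one per second.
storm = [Alert("db-01", f"check {i} failed", float(i)) for i in range(500)]
incidents = group_alerts(storm)
print(len(storm), "alerts ->", len(incidents), "incident(s)")
```

With this toy rule the responder sees one incident instead of 500 notifications. A real system would likely also compare alert content, not just host and timing.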
Reliable Alerting and Strong On-Call Management
What do you like best about the product?
It has a reliable alerting system and strong on-call management. We receive call, email, and message notifications, which helps ensure we don’t miss important alerts.
What do you dislike about the product?
It has a complex setup, and I sometimes get confused when configuring alert routing and escalation policies, which can make it hard to set up correctly. It also takes time to fully optimize.
What problems is the product solving and how is that benefiting you?
PagerDuty is one of the best tools for incident alerting, and in our case it makes on-call management run smoothly. We’re a monitoring team, and it benefits us in several ways—most importantly, we don’t miss any important notifications.
Incident response has become faster and on-call alerts stay reliable for critical operations
What is our primary use case?
I am an end user of PagerDuty Operations Cloud in my organization, with a background in incident management. I primarily use it for managing on-call schedules, triggering and handling incidents, and monitoring alerts. It helps ensure timely responses, efficient escalation, and better coordination during incidents, making it a key tool for maintaining operational reliability.
How has it helped my organization?
PagerDuty Operations Cloud has improved our incident response by ensuring reliable alerting and faster escalation to the right teams. It has significantly reduced alert fatigue through better alert filtering and deduplication. The platform has also lowered our mean time to resolve (MTTR) with runbook automation and streamlined on-call management, leading to fewer disruptions and improved overall operational efficiency.
What is most valuable?
The features of PagerDuty Operations Cloud that I have found most valuable include alerting, which is very reliable with minimal delays, and the escalation policies and routing rules, which are flexible. Additionally, the on-call scheduling capabilities are great, and it integrates well with cloud platforms such as AWS, GCP, or Azure, and with observability tools such as Datadog and New Relic for checking logs.
I have noticed that PagerDuty Operations Cloud influences revenue protection by reducing alert fatigue and incident costs. AIOps has helped recently in reducing noise and alert duplications, and runbook automations aid in lowering the mean time to resolve by integrating triggers to Slack and updating runbooks.
I see PagerDuty Operations Cloud as a very good incident management and on-call platform, mostly used by large-scale organizations because it comes with premium pricing, but it is very reliable with alerting and on-call scheduling, triggering incidents, escalation policies, rules, and runbooks.
What needs improvement?
I think an area of PagerDuty Operations Cloud that could be improved is its premium pricing, as it compares unfavorably with competitors such as Atlassian's Opsgenie and ServiceNow, which offer bundle deals, and Datadog now has incident management capabilities as well. Overall, the premium pricing makes it less accessible for small to medium businesses.
I think the pricing of PagerDuty Operations Cloud is a bit too high. The UI can also feel a bit confusing for new users, and the learning curve might be steep for them. The initial setup is straightforward, but event orchestration can be complex, and the automation workflows definitely require considerable expertise.
For how long have I used the solution?
I have been using PagerDuty Operations Cloud for approximately four years.
What do I think about the stability of the solution?
I would rate the stability and reliability of PagerDuty Operations Cloud 9.5 out of 10. The platform is highly stable and dependable in production environments, especially for critical incident management workflows. We have experienced consistent alert delivery, reliable on-call scheduling, and minimal downtime or disruptions.
That said, no system is completely perfect, so I cannot say it is 100% flawless. However, overall it has proven to be very reliable for mission-critical operations, where even small delays or failures would have significant impact.
What do I think about the scalability of the solution?
I would rate the scalability of PagerDuty Operations Cloud 8 out of 10. From a technical perspective, the platform scales very well and can support large, distributed teams with complex incident management needs. It handles high volumes of alerts, multiple services, and integrations across cloud platforms efficiently.
However, the main limitation to scalability is its premium pricing. As organizations grow and onboard more users or services, the cost increases significantly, which can be a challenge for small to mid-sized teams. So while it is technically highly scalable, cost can be a limiting factor for broader adoption.
How are customer service and support?
I have had regular interactions with PagerDuty Operations Cloud’s technical support, and my overall experience has been positive. The support team is responsive and helpful in addressing queries.
For example, during a user audit, I requested specific data on active users and those who had not accepted invitations. The support team responded quickly and provided the required information without delays. Overall, the support experience has been efficient and reliable when assistance is needed.
Which solution did I use previously and why did I switch?
I have only used PagerDuty Operations Cloud; recently, at my new organization, I have also started using FireHydrant.
How was the initial setup?
I was not directly involved in the initial setup of PagerDuty Operations Cloud, as it was handled by senior team members. However, from my observations, the setup process appears to be straightforward at a basic level for core features like alerting and on-call scheduling.
That said, advanced configurations such as event orchestration and automation can become complex. If rules are not configured properly, they may lead to alert storms or missed incidents. Additionally, runbook automation is not plug-and-play and typically requires scripting knowledge and careful setup to function effectively.
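As a hedged illustration of why rule configuration matters, the sketch below models event routing as an ordered list of match rules with a catch-all fallback. It does not use PagerDuty's event orchestration API or rule syntax. The `route_event` helper, the rule format, and the service names are hypothetical.

```python
def route_event(event, rules, default_service="catch-all"):
    """Return the service of the first rule whose fields all match the event.

    Falls back to default_service so that an event no rule anticipates
    still reaches someone instead of being silently dropped.
    """
    for rule in rules:
        if all(event.get(key) == value for key, value in rule["match"].items()):
            return rule["service"]
    return default_service


# Hypothetical rules, evaluated top to bottom.
rules = [
    {"match": {"severity": "critical", "team": "db"}, "service": "db-oncall"},
    {"match": {"team": "web"}, "service": "web-oncall"},
]

print(route_event({"severity": "critical", "team": "db"}, rules))   # db-oncall
print(route_event({"severity": "warning", "team": "cache"}, rules)) # catch-all
```

In this toy model, a rule with an empty `match` would match every event. That shows how one overly broad rule can flood a single team, while a missing fallback can leave an event unrouted.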
What was our ROI?
From an ROI perspective, I do not have direct visibility into financial metrics, so I cannot quantify exact cost savings. However, I have seen strong operational ROI from PagerDuty Operations Cloud.
It has improved incident response efficiency by reducing alert fatigue, ensuring faster escalation, and lowering mean time to resolve (MTTR) through runbook automation. These improvements have helped prevent prolonged outages and reduced the impact of incidents, which indirectly contributes to cost savings and better service reliability at an operational level.
What other advice do I have?
Although I have some exposure to its autonomous AI agents, I have not extensively used its AIOps or generative AI capabilities. Despite that, the platform has had a strong positive impact on our operations.
By properly configuring alerting rules, we have been able to significantly reduce alert fatigue and shift focus toward more critical issues rather than routine noise. PagerDuty has also helped in reducing the number of duplicate alerts through intelligent pattern recognition.
Additionally, runbook automation has contributed to lowering our mean time to resolve (MTTR), enabling faster and more efficient incident handling. Overall, it has helped prevent costly incidents and improved operational efficiency across the team.
My review rating for PagerDuty Operations Cloud is 9.5 out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Reliable Alerts Undermined by Poor Billing Support
What do you like best about the product?
I find PagerDuty's reliability to be world-class, providing a technically superior tool for managing on-call rotations and incident response for our engineering team. The multi-channel alerting is flawless, and the Slack and Kubernetes integrations are seamless, which are crucial for ensuring high availability of our SaaS platform. It effectively solves the critical problem of alert fatigue by ensuring major incidents in our cloud infrastructure are routed to the right engineer immediately.
What do you dislike about the product?
I'm really frustrated with PagerDuty's billing support. It's extremely slow, and responses take over a week. Fixing invoice errors like missing tax IDs is a major struggle, which is a big problem. Despite PagerDuty being about 'urgent response,' their customer service is sluggish, and that's overshadowing its technical value. We need a partner that responds to business issues as quickly as their software responds to incidents. The billing and account setup process has been a nightmare due to this lack of support.
What problems is the product solving and how is that benefiting you?
I use PagerDuty to manage on-call rotations and incident response, solving alert fatigue and ensuring critical issues reach the right engineer immediately. It helps maintain SLA commitments through clear escalation paths, integrating with our monitoring tools for high availability.
No More Missed Notifications Without Constant Email Checking
What do you like best about the product?
I don’t have to constantly check my email anymore to make sure I’m not missing any notifications.
What do you dislike about the product?
The queue functionality could support more advanced automations to ensure the right people are notified during a workflow incident.
What problems is the product solving and how is that benefiting you?
Previously, we had to constantly check and refresh our email to make sure we weren’t missing any outage alerts. PagerDuty gave us peace of mind, and we can now see who has responded to an incident.
Centralized incident response has reduced downtime and now needs more predictable costs
What is our primary use case?
My main use case for PagerDuty Operations Cloud is cloud-based operations, including incident management and incident response, to reduce downtime and improve reliability.
PagerDuty Operations Cloud provides a central command center that collects data signals from various IT systems, which helps us detect high-priority incidents and reduce the noise.
In addition to my main use case, we are able to perform on-call scheduling and routing with the help of PagerDuty Operations Cloud very easily.
What is most valuable?
The best features PagerDuty Operations Cloud offers include automated on-call scheduling, AI-driven alert grouping, and an impressive number of integrations, as it has more than 700 integrations, which really help us.
The user interface is friendly to non-technical people, and it has been maintained well over the years, making it easier for our project managers and product managers to navigate PagerDuty Operations Cloud dashboards.
PagerDuty Operations Cloud's embedded AI has helped reduce alert fatigue and lower costs from incidents, which has contributed to retaining revenue by minimizing financial losses.
The AI-driven alert grouping, particularly for incident management, helps us significantly as it streamlines our processes.
PagerDuty Operations Cloud has positively impacted our organization by accelerating incident response and reducing MTTR by up to 27%, and we are also integrating AI and ML into PagerDuty Operations Cloud.
What needs improvement?
One area for improvement in PagerDuty Operations Cloud is its unpredictable costs, which can cause issues in our organization and add project complexity. Non-technical staff also occasionally perceive the user interface as outdated.
For how long have I used the solution?
I have been using PagerDuty Operations Cloud for around three years.
What do I think about the stability of the solution?
PagerDuty Operations Cloud is stable and scalable, capable of handling enterprise environments and multi-cloud setups efficiently.
How are customer service and support?
I would rate customer support a seven out of 10.
Which solution did I use previously and why did I switch?
Previously, we were using incident.io, AlertOps, and Datadog, but we switched to PagerDuty Operations Cloud because it is an all-in-one solution.
What about the implementation team?
I have implemented AI and automation through PagerDuty Operations Cloud for incident response, which has significantly changed our operational efficiency, allowing us to accomplish more with less manual input.
I have experimented with PagerDuty Operations Cloud's autonomous AI agents, striving to automate repetitive tasks, which improves operational efficiency.
What was our ROI?
While I cannot provide exact return on investment metrics, I estimate that we save around 20% of costs and approximately 10 to 15 hours a week with the efficient use of PagerDuty Operations Cloud.
The 27% reduction in MTTR has had a direct business impact as it enhances business growth and supports impactful business decisions.
The alert reduction feature has greatly impacted our ability to prevent costly incidents, as we can accurately respond to alerts with the help of autonomous AI agents, which reduces erroneous notifications.
What's my experience with pricing, setup cost, and licensing?
My experience with pricing, setup cost, and licensing has been positive compared to other tools, as PagerDuty Operations Cloud simplifies many management tasks that would otherwise be burdensome.
Which other solutions did I evaluate?
Before choosing PagerDuty Operations Cloud, I evaluated tools we had used previously, such as incident.io and Datadog, but we selected PagerDuty Operations Cloud for its comprehensive features and strong alerting.
What other advice do I have?
My advice for others considering PagerDuty Operations Cloud is to first understand their organizational needs before selecting any tool to prevent mismatches.
I believe all aspects of my experience with PagerDuty Operations Cloud have been covered.
I would rate my overall experience with PagerDuty Operations Cloud a seven out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Incident response has become faster and on-call teams manage alerts with reduced noise
What is our primary use case?
I use PagerDuty Operations Cloud for notifying engineers of any incidents or issues in our operations or infrastructure. Most of the time on our servers, I receive alerts regarding memory, disk utilization, and CPU, so when any infrastructure-related issues arise, we trigger PagerDuty alerts to the engineers to resolve them.
My main use case with PagerDuty Operations Cloud is resolving incidents and requesting peer help on an ongoing incident. I page out the relevant engineers, and in response, they join the call, which is very useful for resolving incidents.
What is most valuable?
Paging out is one of the best capabilities that PagerDuty Operations Cloud offers. We can respond to incidents through a message, call, mobile app, or website, which makes getting incident notifications quick and convenient.
We rely on the mobile app the most for incident notifications because it is easy to carry wherever we go. PagerDuty Operations Cloud is one of the important tools in our organization, and we use it widely for incident response and quick incident resolution. When we need an alert or peer help, PagerDuty assists us and helps manage who is on call for which team through a dashboard view, which is very useful for us.
The alert reduction feature of PagerDuty Operations Cloud helps in reducing incident alerts, allowing similar alerts to be resolved easily. The AI functionality is very useful in reducing alerts, automating workflow, improving on-call efficiency, and enabling faster incident resolution. PagerDuty's generative AI is particularly helpful as it reduces alerts and unnecessary noise.
What needs improvement?
It would be helpful if PagerDuty Operations Cloud offered a phone number that we could use to page out certain people. If we could call a number that detected which team we are currently active in, we could then ask the system to page out certain people based on the available options.
Overall, I believe the platform is good; however, I think a phone number that accepts a call, asks for some authentication such as a PIN, and then pages out certain people would be useful for us. I rate the platform an 8 out of 10 because, as mentioned, there could be better ways of creating incidents or of using a phone number to contact whoever is on call. Each team could also have a temporary email ID that we could email to automatically reach the on-call person. Additionally, an AI chatbot that could answer questions such as who is on call would be beneficial, because we have multiple teams and currently need to navigate different dashboards.
For how long have I used the solution?
I have been using PagerDuty Operations Cloud for around two to three years.
What do I think about the stability of the solution?
PagerDuty Operations Cloud is very stable.
What do I think about the scalability of the solution?
The scalability of PagerDuty Operations Cloud is very good, and it is easy to scale.
How are customer service and support?
Customer support for PagerDuty Operations Cloud is very nice, and we receive very good support.
What was our ROI?
We have seen a return on investment with PagerDuty Operations Cloud, and it is very useful because without it, we would need multiple people. Our headcount is very low because we are using PagerDuty, as it significantly reduces toil and manual work.
What's my experience with pricing, setup cost, and licensing?
My experience with pricing, setup cost, and licensing for PagerDuty Operations Cloud is good and easy to understand.
What other advice do I have?
For others looking into using PagerDuty Operations Cloud, my advice is that it is very useful, it is easy to onboard, and the response is very nice. You can manage multiple teams as multiple teams can have on-calls. I think you can integrate machine learning and AI features into PagerDuty and possibly have a chatbot that answers all questions related to PagerDuty, which will be helpful by providing a summary and similar functionalities. I rate this product an 8 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)