
Overview
High customer expectations and increasingly distributed systems mean that disruptions to digital services can have catastrophic effects on sales, brand loyalty, and operating costs. The PagerDuty Operations Cloud deflects unnecessary work from teams and subject matter experts so they can focus on delivering business value. Urgent work is escalated to the right teams and routine work is made self-service. Teams can automate and accelerate issue resolution with minimal human interruption, improving system resilience and team capacity while reducing the strain of operational complexity and unexpected events.
With more than 700 integrations, APIs, and apps for customer service, the PagerDuty Operations Cloud empowers rapid responses in any environment. And thanks to more than 10 years of data ingestion, its machine learning-powered AIOps functionality can reduce alert noise by up to 98% and drive down MTTR with critical context for faster triage and effective automation.
PagerDuty integrates with various AWS services, including Amazon CloudWatch, Amazon GuardDuty, AWS CloudTrail, AWS Personal Health Dashboard, Amazon EventBridge, AWS Security Hub, Amazon DevOps Guru, AWS Control Tower, AWS Outposts, and Amazon S3 Storage Lens.
AIOps
PagerDuty AIOps helps teams reduce noise, triage efficiently to drive the right actions towards resolution, and remove manual, repetitive work from the incident response process. Noise reduction baked in with an ML model that learns and adapts based on user behavior means teams see fewer incidents overall. And automating toil from manual event processing results in greater efficiency, saving teams valuable time for innovating.
Process Automation
PagerDuty Runbook Automation is a managed cloud service that enables DevOps teams and SREs to create and delegate operational tasks in automated runbooks to other stakeholders such as developers, NOC personnel, and incident responders. Runbook Automation provides automated workflows and task automation focused on IT and developer process automation. Examples include service provisioning, CI/CD, configuration management, incident diagnosis and remediation, and more. With PagerDuty Runbook Automation, you can resolve requests in minutes, rather than days, optimize security and compliance, and give your engineers more time to spend on innovation rather than firefighting.
Incident Response
PagerDuty helps you save time and money by bringing together the right teams with the right information to resolve incidents faster. Replace manual processes with automation to streamline incident response, freeing up time and resources for more innovation. Orchestrate end-to-end incident response with a service ownership model that only brings in the teams you need. Over 21K organizations trust PagerDuty to help them adopt DevOps best practices and build more resilient operational practices to minimize costly downtime and protect the customer experience.
Custom Private Offer
We can create a custom offer tailored to your needs. Please contact us at aws-sales@pagerduty.com.
Highlights
- Incident Response - Manage incidents end-to-end
- Process Automation - Automate and delegate business and IT processes
- AIOps - Maximize IT capacity with fewer incidents and faster resolution
Pricing
| Dimension | Description | Cost/12 months |
|---|---|---|
| Professional | On-call and incident response for growing teams | $252.00 |
| Business | Streamlined incident response for the enterprise | $492.00 |
| CustomerServProfessional | Bi-directional comms between CS & Dev, protect SLAs, & lower MTTR | $252.00 |
| CustomerService Business | Bi-directional comms between CS & Dev, protect SLAs, & lower MTTR | $492.00 |
| Runbook Automation | Automate manual procedures in runbooks | $1,500.00 |
| Automation Actions | Add-on: Automate steps to diagnose & remediate incidents | $240.00 |
| Live Call Routing | Add-on: For on-call schedules & escalations (by line) | $1,890.00 |
| Runbook Auto Job Runner | Add-on: For Runbook Automation | $750.00 |
| Stakeholder Users | Bundle of 50 Stakeholder users | $1,800.00 |
| PagerDuty Status Pages | 1000 User Pack | $1,068.00 |
The following dimensions are not included in the contract terms and are charged based on your usage.
| Dimension | Cost/unit |
|---|---|
| Additional events over contracted value | $0.06 |
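As a rough illustration of how the contract and usage dimensions combine, the sketch below estimates a 12-month total. It assumes each contract dimension is billed per unit purchased (the listing does not state the unit explicitly), and the quantities in the example are hypothetical. The function name `estimate_annual_cost` is introduced here for illustration only.

```python
from decimal import Decimal

# 12-month prices copied from the pricing table above.
# Assumption: each price applies per unit purchased.
CONTRACT_PRICES = {
    "Professional": Decimal("252.00"),
    "Business": Decimal("492.00"),
    "Runbook Automation": Decimal("1500.00"),
    "Automation Actions": Decimal("240.00"),
}

# Usage-based price per additional event over the contracted value.
OVERAGE_PER_EVENT = Decimal("0.06")


def estimate_annual_cost(quantities, overage_events=0):
    """Sum contract dimensions by quantity, then add usage-based overage."""
    contract_total = sum(
        CONTRACT_PRICES[name] * quantity for name, quantity in quantities.items()
    )
    return contract_total + OVERAGE_PER_EVENT * overage_events


# Hypothetical order: 10 Business units, 10 Automation Actions add-ons,
# and 5,000 events over the contracted value.
total = estimate_annual_cost(
    {"Business": 10, "Automation Actions": 10}, overage_events=5000
)
print(total)
```

`Decimal` is used instead of floating point so that currency amounts are not affected by binary rounding.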
Vendor refund policy
All fees are non-cancellable and non-refundable except as required by law.
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Support
Vendor support
Our team provides multiple resources for customers to find answers to questions and get help with our product. Users may browse our integration guides (pagerduty.com/integrations) to integrate with partner tools, our knowledge base (support.pagerduty.com) to learn more about using PagerDuty, and our developer docs (developer.pagerduty.com) to use our APIs. Additionally, anyone can interact with other PagerDuty users and PagerDuty employees via the PagerDuty Community (community.pagerduty.com). Our Support team is available during regular business hours around the globe, Monday through Friday, and can be contacted by email at support@pagerduty.com or via a ticket submitted at tickets.pagerduty.com.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

Customer reviews
Reliable Scheduling and App, but Needs More Integration Flexibility
Automated incident workflows have transformed on-call operations and improved response times
What is our primary use case?
I have been working in my current field for over seven years as a DevOps and site reliability engineer, and my primary experience involves managing the reliability of infrastructure platforms hosted in multi-cloud and on-premises environments. I have predominantly worked with systems hosted in AWS services, setting up infrastructure, CI/CD, observability, and completely establishing the release process where I utilize PagerDuty Operations Cloud for triage and other SRE operations.
I have been using PagerDuty Operations Cloud for over four years, and I have utilized it in multiple ways. One involves using PagerDuty Operations Cloud through enterprise services via a subscription model, and I have also used it in a project at Intel, where I utilized PagerDuty Operations Cloud from AWS for approximately one to one and a half years. Since then, I have been using it through a subscription at IBM.
One of the main use cases for PagerDuty Operations Cloud involves handling the operation center, particularly incident resolution and triaging different incidents as part of the core platform engineering team within a central IBM Cloud where various IBM Cloud services are hosted. To ensure continuous reliability, automated incidents are created in PagerDuty Operations Cloud and incident management automation is heavily utilized as part of the project. Previously, I worked on integrating PagerDuty Operations Cloud with default AWS services to create incidents for different AWS services as part of the host infrastructure at Intel. Currently, I am creating different incident workflows within IBM internal cloud operations to ensure an effective incident management process, utilizing integrations with different LLMs as part of incident management, along with agentic SRE tasks that have arisen in the project.
Since I am part of a larger platform engineering team and SRE operations team, there are many incidents and services that my team handles. I handle over 26 core IBM Cloud services hosted in our internal platform, where there have been many incidents related to service downtime, reliability issues, and update issues. A dedicated SRE team handles end-to-end incident management, and we wanted to automate the incident management process, especially since we receive hundreds of incidents per day, and up to thousands during critical release times of different services. Thus, the manual on-call process has been automated using PagerDuty Operations Cloud.
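For readers unfamiliar with automated incident creation, here is a minimal sketch of triggering an event through the PagerDuty Events API v2. It is not the reviewer's actual setup: the routing key, summary, and source values are placeholders, and the helper names `build_trigger_event` and `send_event` are introduced here for illustration.

```python
import json
import urllib.request

# Events API v2 endpoint for enqueueing events.
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity="critical", dedup_key=None):
    """Build the JSON body for a 'trigger' event."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,  # integration key of the target service
        "event_action": "trigger",
        "payload": {"summary": summary, "source": source, "severity": severity},
    }
    if dedup_key is not None:
        # Events sharing a dedup_key are grouped into the same open alert.
        event["dedup_key"] = dedup_key
    return event


def send_event(event, timeout=10):
    """POST the event to PagerDuty and return the decoded JSON response."""
    request = urllib.request.Request(
        EVENTS_API_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values; a real integration key comes from the service's settings.
event = build_trigger_event(
    routing_key="YOUR_INTEGRATION_KEY",
    summary="Health check failing on example-service",
    source="example-host-01",
    dedup_key="example-service/health-check",
)
# send_event(event)  # uncomment once a real routing key is configured
```

Supplying a stable `dedup_key` per failure mode keeps repeated triggers from opening a new alert each time.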
What is most valuable?
One of the features I find valuable in PagerDuty Operations Cloud, which is part of our current migration activities, involves automating the entire incident management process by integrating all service incidents into a single incident management page. In IBM cloud, since many services and incidents occur, I utilize the runbook automation feature where I create runbooks for each service and common issues that facilitate incident management for common incidents.
The runbook automation positively impacts my team's workflow by significantly speeding up the incident resolution process. In IBM cloud, we have different services hosted such as IBM Schematics and IBM Kubernetes Service, with thousands of concurrent global users. We face several issues during multiple incidents, particularly in reliability and infrastructure side issues. Basic level zero incidents often require simple commands run in kubectl. Therefore, I created runbooks to address these common issues, allowing SRE team members to refer to the runbook and manually fix the issues. However, before using PagerDuty Operations Cloud's runbook automation, it took over 20 minutes to resolve these issues. After implementing PagerDuty Operations Cloud's runbook automation, I have reduced the response time from over 20 minutes to less than two minutes, saving approximately 80 to 90 percent of the time and making mean time to resolve significantly faster. I review the runbooks quarterly to update them with any new steps necessary.
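To make the idea of runbooks for common kubectl fixes concrete, the sketch below maps an incident category to an ordered list of commands. It is an illustration only, not the reviewer's runbooks and not the PagerDuty Runbook Automation job format; the category names, the `RUNBOOKS` table, and `render_runbook` are all hypothetical.

```python
import shlex

# Hypothetical runbooks: incident category -> ordered kubectl command templates.
RUNBOOKS = {
    "pod_crashloop": [
        "kubectl get pods -n {namespace}",
        "kubectl rollout restart deployment/{deployment} -n {namespace}",
    ],
    "node_pressure": [
        "kubectl top nodes",
        "kubectl describe node {node}",
    ],
}


def render_runbook(category, **params):
    """Return the runbook steps as argv lists suitable for subprocess.run."""
    steps = RUNBOOKS.get(category)
    if steps is None:
        raise KeyError(f"no runbook for category {category!r}")
    # Quote each parameter so a value cannot inject extra arguments.
    quoted = {name: shlex.quote(str(value)) for name, value in params.items()}
    return [shlex.split(step.format(**quoted)) for step in steps]


commands = render_runbook("pod_crashloop", namespace="prod", deployment="api")
for argv in commands:
    print(" ".join(argv))
```

Returning argv lists rather than shell strings lets a caller run each step with `subprocess.run(argv)` without invoking a shell.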
PagerDuty Operations Cloud has greatly improved our productivity. Previously, I handled many incidents with manual automation runbooks, leading to substantial toil for the SRE teams in resolving even minor incidents and complicating our on-call schedule. Once I adopted PagerDuty Operations Cloud and heavily utilized the runbook automation, I provided a list of common incidents to PagerDuty Operations Cloud, which then did the heavy lifting in fixing basic incidents automatically. This allowed my team to focus more on development activities related to platform engineering. Overall, it has reduced our toil by at least 50 to 60 percent and improved our efficiency, enabling us to onboard more services. We increased from onboarding seven core IBM cloud services to over 28 services now hosted.
The expansion of services supports our organization's goals and customer experience by allowing all internal IBM Cloud services to be hosted on a dedicated platform engineering service called Rednote. Before using PagerDuty Operations Cloud, I utilized Nagios and Sysdig, and after migrating to PagerDuty Operations Cloud last year, I prepared a set of runbooks, automating on-call schedules and incident management. This automation has led to a significant increase in incident closure rates from around 40 to 45 percent, improving efficiency and reducing manual toil for basic incidents by about 60 percent. This enables my team to focus more on development activities, as basic incidents that can be managed through simpler runbooks are now handled automatically by PagerDuty Operations Cloud. Additionally, incorporating AIOps into our on-call scheduling and notifications helps it learn from previous incidents and proactively address issues. This scalability has allowed me to grow from handling four cloud services to 28 services, and from 200 to 300 customers to more than 1,800 customers, thanks to PagerDuty Operations Cloud.
What needs improvement?
Since I host our internal services, I want more customization relating to our specific use case.
The needed improvements include the configuration process, as new team members face a steep learning curve to understand the platform. With many new members, they need training to set up runbook workflows, event orchestration, and manage complex on-call schedules across 23 services, making it a challenge for new users. Additionally, I feel the web interface requires improvements.
I would rate PagerDuty Operations Cloud as eight out of ten because the cons include a complex configuration process and high costs for each add-on that I try to obtain, making subscriptions costly, along with limited customization in certain incident workflows.
The primary reasons for rating it an eight include the complex configuration which makes it challenging for new users, as well as their difficulty in setting up advanced runbook workflows and managing complex on-call setups. The web user interface also requires improvement. Although I receive alerts via the mobile app, which is beneficial for handling schedule maintenance, the same features should be added to the web interface. Customization issues persist, such as the inability to clone entire schedules as part of the workflows, and I want to keep incidents open for a specified duration, neither of which I can currently customize. Thus, I raised a ticket with PagerDuty Operations Cloud to address these concerns. Furthermore, the cost is high, making it one of the more expensive incident management solutions.
For how long have I used the solution?
I have been using PagerDuty Operations Cloud for over four years.
What do I think about the stability of the solution?
PagerDuty Operations Cloud is definitely stable, providing faster incident management and making managing our on-call roster easy along with effective escalation and notification channels.
What do I think about the scalability of the solution?
Regarding scalability, I do not find many issues. PagerDuty Operations Cloud effectively handles concurrent incidents, and incidents are fixed properly and on time.
How are customer service and support?
My interaction with PagerDuty Operations Cloud's customer support mainly focused on customizing our workflows. They understand our concerns and are willing to implement solutions that integrate into PagerDuty Operations Cloud effectively. From a reliability perspective, I have not faced any issues, and their support provides timely assistance for custom integration and workflows.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
Previously, I used Rundeck automation, which is why I switched to PagerDuty Operations Cloud after PagerDuty acquired Rundeck.
How was the initial setup?
Before using PagerDuty Operations Cloud, I utilized Nagios and Sysdig, and after migrating to PagerDuty Operations Cloud last year, I prepared a set of runbooks, automating on-call schedules and incident management. This automation led to a significant increase in incident closure rates from around 40 to 45 percent, improving efficiency and reducing manual toil for basic incidents by about 60 percent.
What's my experience with pricing, setup cost, and licensing?
I purchased PagerDuty Operations Cloud through AWS Marketplace while at Intel, and my experience has been positive regarding pricing, setup costs, and licensing. I had around seven users on it at a base price of around $450 per user, primarily for custom workflows and the ITSM part. Currently, I am utilizing the runbook automation part, which costs around $2000 per year, and in the last three months, I have also used the AIOps feature for approximately $700 to $800 per month, resulting in a cumulative cost of around $3000 per month.
Which other solutions did I evaluate?
Before choosing PagerDuty Operations Cloud, I evaluated other options, particularly competitors such as Zenduty for their cost effectiveness, but I favored PagerDuty Operations Cloud for its Rundeck features and automation capabilities for incident workflows, despite the required migration.
What other advice do I have?
Utilizing PagerDuty Operations Cloud allows me to save a significant amount of time, not only on routine incidents but also in focusing on onboarding additional services. This significantly aids me in spending less time on routine operational activities, quantified by the reduced personnel needed to manage routine tasks.
I highly recommend using PagerDuty Operations Cloud if you have numerous operational incidents to handle daily, especially if you prioritize reliability, particularly in critical projects such as IBM's core cloud services where outages must be avoided to ensure compliance and reliability.
I urge potential users to adopt PagerDuty Operations Cloud if reliability is a priority and they have a sufficient budget, as it is suited for larger infrastructures and effectively manages redundant incidents, standing out as the number one option in its market segment. I have rated PagerDuty Operations Cloud as eight out of ten overall.
PagerDuty: Powering Global Incident Response and Business Continuity at Scale
2. PagerDuty integrates seamlessly with all the above services, including Teams and Slack channels, giving us a single point for monitoring (dashboards) and response (its automation capabilities) across 40+ accounts and 50+ microservices.
3. Its automated, priority-based notifications (emails, calls, messages) and dynamic on-call scheduling let us notify the right teams based on the service and ensure an instant response (acknowledging via phone call, message, or email), reducing MTTR for critical services. Being able to do all of this from the mobile app is an added advantage.
4. Its built-in options for conference bridges for each service, issue resolution guides, automation (Rundeck) to run primary health checks, and service graphs enable global teams to troubleshoot together in real time, ensuring 24/7 business continuity, which is a priority for our business every single day.
5. Dashboards provide leadership with a clear view of service health, issue trends, and performance metrics, driving informed decisions and proactive improvements.
6. Learning-wise, PagerDuty University has a lot of content to learn from and certify with. I completed the PagerDuty Incident Responder course and certification for free.
While the alerting is highly commendable, it often triggers too many notifications based on false alerts. There should be a way to limit them within a given period of time. More often than not, they all need to be closed manually.
I really miss a "Quick templates" feature, which most tools provide. Since we needed to configure everything from scratch, it took us a lot of time (for new teams it will be challenging). There should be something like a starter kit to get going from day one.
Escalation policies have been instrumental in ensuring accountability from the responsible teams, helping us maintain high SLAs and ultimately driving overall customer satisfaction over the years. It is really easy to configure bridges per service, and managing on-call schedules tells us who to engage.
Rather than pulling data from a different dashboard on each platform, PagerDuty made it easy for us to present higher management with a single dashboard that covers overall impact, system health, and issue patterns.
Real-time monitoring has reduced downtime and ensures failed jobs are resolved quickly
What is our primary use case?
We receive a notification if there are any failed jobs or operations. We have some Bamboo agents working, so if one of the jobs fails on one of these servers, PagerDuty Operations Cloud creates an incident and notifies us. We use PagerDuty Operations Cloud for monitoring purposes, and it works great for our current needs.
What is most valuable?
The best features PagerDuty Operations Cloud offers include quick access to failed jobs and the ability to add descriptions about the failed job. The quick access allows us to rapidly identify which job or operation has failed because it sends the job name or the operation name that has failed.
I have heard that integration between our systems and PagerDuty Operations Cloud was easy to implement. For efficiency, we can monitor our deployment process in real time. For incident response, PagerDuty Operations Cloud creates alarms that make calls to the specific person who can handle these issues.
We have fewer missed incidents because it keeps calling regarding the incidents until they are resolved. We have also reduced downtime because we notice errors and failed jobs, and we work to fix them.
What needs improvement?
The system is very smooth right now.
For how long have I used the solution?
We have been using the solution for about one year.
What do I think about the stability of the solution?
We have not experienced any stability issues.
What do I think about the scalability of the solution?
We have not experienced any scalability issues.
How are customer service and support?
We did not try to reach out to customer service because we did not face any issues.
Which solution did I use previously and why did I switch?
I prefer not to use previous solutions.
How was the initial setup?
I joined the team after they had already purchased and configured PagerDuty Operations Cloud, so I did not have knowledge about the setup process.
What about the implementation team?
I do not have any experience with the implementation team.
What was our ROI?
Time saved.
What's my experience with pricing, setup cost, and licensing?
There was no relationship between setup cost and other factors.
Which other solutions did I evaluate?
We did not consider alternate solutions.
What other advice do I have?
PagerDuty Operations Cloud is a great tool that saves time and is worth starting to use. I would rate this product nine out of ten because nothing is fully perfect. My overall review rating for this product is nine out of ten.
On-call teams have reduced downtime and respond faster through integrated alerting workflows
What is our primary use case?
My main use case for PagerDuty Operations Cloud is monitoring and on-call management for downtime.
Last week, we had a service go down, and we were alerted to the issue via PagerDuty Operations Cloud. One of our on-call engineers responded to the page and quickly resolved the problem through the PagerDuty Operations Cloud app.
What is most valuable?
The best features PagerDuty Operations Cloud offers include the ability to integrate with various platforms such as Teams and monitoring platforms such as New Relic and Dynatrace. It is easy to use, easy to log in and configure your on-call rotation, and its business services and technical services let you configure how you want things monitored and alerted.
The integrations and easy configuration help our team by saving time and reducing errors. We use Terraform to create various modules, including integrations with PagerDuty Operations Cloud and our monitoring platform, New Relic. When a team creates a new application, we ask them to use our monitoring module to monitor their service using New Relic and PagerDuty Operations Cloud. By doing that, we save time and avoid errors by sparing people from manually setting up their PagerDuty Operations Cloud configuration; it is all done through this module, which is easy to use.
PagerDuty Operations Cloud has positively impacted our organization by allowing us to be immediately paged when a system or service is down, enabling us to quickly respond and provide updates to the organization on issues and their resolution.
This quick response has led to measurable improvements, with reduced downtime and faster incident resolution times, as our on-call engineers are appropriately alerted when things happen. We understand based on the page what is going on and how to quickly respond to it, and if we need help, we can loop in other engineers and our managers that own the product to resolve it quicker.
What needs improvement?
PagerDuty Operations Cloud can be improved by using automation or AI to advance the product in such a way that it allows the implementation of automation to resolve issues or speed up workflows.
For how long have I used the solution?
I have been using PagerDuty Operations Cloud for six years.
What do I think about the stability of the solution?
PagerDuty Operations Cloud is stable.
What do I think about the scalability of the solution?
Its scalability is impressive; it scales very well, allowing us to add licenses, add services, and more very quickly and easily.
How are customer service and support?
The customer support is great; we have never had an issue when reaching out to someone in customer service when we have questions.
Which solution did I use previously and why did I switch?
Previously, we were using New Relic for monitoring, which sent us alerts when services went down, but we ended up using PagerDuty Operations Cloud alongside it because PagerDuty Operations Cloud is used for on-call alerting.
How was the initial setup?
Our experience with pricing, setup cost, and licensing has been straightforward and easy. We have been using PagerDuty Operations Cloud for several years, so our pricing and cost have definitely increased over time, especially as we have hired additional engineers. Adding additional users and/or licenses is very straightforward, and we have always had a good experience with customer service from PagerDuty Operations Cloud side.
What was our ROI?
The best return on investment comes from being alerted and paged for ongoing issues or new issues appropriately, allowing us to set up those schedules and engineers. The fact that PagerDuty Operations Cloud allows us to be alerted when things go down and configure how our engineers are alerted speaks to the return on investment due to the quick response it facilitates.
What's my experience with pricing, setup cost, and licensing?
Our experience with pricing, setup cost, and licensing has been straightforward and easy. We have been using PagerDuty Operations Cloud for several years, so our pricing and cost have definitely increased over time, especially as we have hired additional engineers. Adding additional users and/or licenses is very straightforward, and we have always had a good experience with customer service from PagerDuty Operations Cloud side.
Which other solutions did I evaluate?
I did not evaluate other options before choosing PagerDuty Operations Cloud.
What other advice do I have?
I recommend PagerDuty Operations Cloud as a great service and application to anyone that needs to improve their on-call process at their company. I gave this product a rating of 10.