    Gremlin Reliability Management Platform

    Sold by: Gremlin 
    Deployed on AWS
    Free Trial
    AWS Free Tier
    Downtime is expensive and can hurt your brand. Gremlin gives engineers a framework to safely, securely, and easily simulate real outages through the practice of Reliability Engineering. As organizations build more cloud-native systems, they need to understand, and back with data, what will happen to those systems under any kind of degradation. Examples include a spike in CPU, added latency on a service, a completely unreachable service, or any other condition that results in a poor user experience or an unplanned outage.
    4.1

    Overview

    Gremlin's Reliability Management Platform builds upon the practice of Chaos Engineering. By giving teams a more guided, product-led way to achieve reliability goals, the platform lets you define services, integrate them with the Golden Signals in your APM tool, run a series of Reliability Tests, and receive a Reliability Score for each defined service. This helps teams approach their reliability efforts in a safe, secure, scalable, and standardized way.

    Highlights

    • Use Reliability Engineering to proactively get ahead of infrastructure-related issues in your environment
    • Use Reliability Scoring to get a comprehensive view of where your services rank from a reliability perspective
    • Don't let velocity conflict with reliability. Build a regression set of Reliability Tests to understand how changes to your applications and infrastructure impact your underlying microservices and infrastructure

    Details

    Sold by: Gremlin

    Delivery method: Software as a Service (SaaS)

    Deployed on AWS

    Introducing multi-product solutions

    You can now purchase comprehensive solutions tailored to use cases and industries.

    Multi-product solutions

    Features and programs

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    Free trial

    Try this product free according to the free trial terms set by the vendor.

    Gremlin Reliability Management Platform

    Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    12-month contract (1)

    Dimension: Reliability Management: 50 Agents
    Description: Reliability Management Platform - Unlimited Reliability Testing & Scoring, Fault Injection, Failure Flags - 50 Agents
    Cost/12 months: $45,000.00
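
    For budgeting, the contract price above can be broken down per agent. The sketch below is illustrative arithmetic only: it assumes the $45,000.00 is spread evenly across the 50 agents and the 12 months, which the listing does not state, and it excludes any additional AWS infrastructure costs. The function name `per_agent_costs` is hypothetical.

```python
from decimal import Decimal

# Figures taken from the 12-month contract row in this listing.
CONTRACT_COST = Decimal("45000.00")
AGENTS = 50
MONTHS = 12


def per_agent_costs(total: Decimal, agents: int, months: int) -> tuple[Decimal, Decimal]:
    """Return (cost per agent for the whole contract, cost per agent per month).

    Assumes an even split across agents and months (an assumption,
    not a billing rule stated by the vendor).
    """
    per_contract = total / agents
    per_month = per_contract / months
    return per_contract, per_month


per_contract, per_month = per_agent_costs(CONTRACT_COST, AGENTS, MONTHS)
print(f"Per agent, 12 months: ${per_contract:.2f}")
print(f"Per agent, per month: ${per_month:.2f}")
```

    Under that even-split assumption, this works out to $900.00 per agent for the contract term, or $75.00 per agent per month.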

    Vendor refund policy

    No refunds.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Resources

    Support

    Vendor support

    Email support is available 8am - 8pm PST, Monday - Friday, at support@gremlin.com or through the Zendesk widget in the app.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 25 in Testing, Network Infrastructure
    Top 50 in Compliance and Auditing, Monitoring and Observability
    Top 10 in Hybrid Monitoring

    Customer reviews

    Sentiment is AI generated from actual customer reviews on AWS and G2
    Reviews
    Functionality
    Ease of use
    Customer service
    Cost effectiveness
    4 reviews
    Insufficient data
    Insufficient data
    Insufficient data
    0 reviews
    Insufficient data
    Insufficient data
    Insufficient data
    Insufficient data
    1 review
    Insufficient data
    Insufficient data
    Insufficient data
    Insufficient data
    Positive reviews
    Mixed reviews
    Negative reviews

    Overview

    AI generated from product descriptions
    Chaos Engineering Framework
    Simulates real outages and infrastructure degradation scenarios including CPU spikes, service latency, and complete service unavailability to test system resilience.
    Reliability Scoring System
    Generates comprehensive reliability scores for defined services based on integration with Golden Signals from APM tools to assess service reliability rankings.
    Guided Reliability Testing
    Provides a product-led approach to define services, integrate monitoring data, and execute a series of structured reliability tests in a standardized manner.
    Regression Test Suite
    Enables creation of regression test sets to measure and understand the impact of application and infrastructure changes on microservices and underlying systems.
    Proactive Issue Detection
    Identifies and helps prevent infrastructure-related issues before they occur through systematic reliability engineering practices and testing.
    Service Level Objective Management
    Defines and monitors customer-centric Service Level Objectives (SLOs) with flexible Error Budgets, Occurrences, and Time-slices configurations.
    Multi-Source Data Integration
    Connects to existing observability and monitoring data sources through a library of integrations without requiring additional tooling.
    SLO Analysis and Reporting
    Provides SLI Analyzer for defining SLOs based on historical metrics, intuitive reliability burndown reports, and composite SLOs that aggregate multiple individual SLOs.
    Automated Alerting and Incident Response
    Triggers proactive automated alerts to existing incident response tools and executes customizable webhooks with runbooks based on SLOs-at-risk conditions.
    SLO-as-Code and Extended Features
    Supports SLOs-as-code, OpenSLO compatibility, sloctl command-line tool, annotations for context, and replay functionality for historical analysis.
    Automated Service Discovery and Modeling
    Lightweight, agentless, and scalable discovery of IT infrastructure, applications, and software components with automatic identification of dependencies and relationships across the IT landscape.
    AI-Powered Root Cause Analysis
    Causal AI technology to determine root causes and isolate issues across services, reducing mean time to resolution and eliminating manual war room investigations.
    Machine Learning-Based Event Correlation
    ML-powered situations for proactively correlating events and determining root causes across services with human-readable summaries and visual diagrams showing impact and diagnosis.
    Predictive Capacity Planning
    Saturation forecasting to predict up to 30 days in advance when infrastructure and application resources will run out of capacity, with what-if simulation capabilities for business event planning.
    Generative AI-Powered Remediation Recommendations
    Patented Best Action Recommendation engine powered by generative AI that recommends resolution steps based on similar past incidents and delivers code templates including Ansible runbooks and Bash scripts for automated remediation.

    Contract

    Standard contract
    No
    No

    Customer reviews

    Ratings and reviews

    4.1
    4 ratings
    5 star: 50%
    4 star: 25%
    3 star: 25%
    2 star: 0%
    1 star: 0%
    1 AWS review | 3 external reviews
    External reviews are from G2.
    reviewer2783910

    Platform has improved reliability metrics but still raises questions about overall value

    Reviewed on Dec 03, 2025
    Review from a verified AWS customer

    What is our primary use case?

    The Enterprise Reliability Platform serves as my main use case for the next question.

    A quick specific example of how I use The Enterprise Reliability Platform to maintain reliability and efficiency is that we have our own internal system to track and maintain the reliability and efficiency.

    What is most valuable?

    The Enterprise Reliability Platform has positively impacted my organization as it has significantly increased the efficiency and reliability of our systems.

    I measured that increase in efficiency, and I can share that the metrics I noticed include latency and the SLOs, error budget, and not burning through the error budgets.

    What needs improvement?

    I have no recommendations for how The Enterprise Reliability Platform can be improved.

    For how long have I used the solution?

    I have been using The Enterprise Reliability Platform for one year.

    What other advice do I have?

    I have no answer regarding the best features The Enterprise Reliability Platform offers.

    I would provide no advice to others looking into using The Enterprise Reliability Platform.

    My company does not have a business relationship with this vendor other than being a customer.

    I was not offered a gift card or incentive for this review.

    I do not have any additional thoughts about The Enterprise Reliability Platform before we wrap up.

    I gave this review a rating of 6.

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Computer Software

    Feature-full graph traversal language

    Reviewed on Mar 09, 2022
    Review provided by G2
    What do you like best about the product?
    Gremlin is quite easy to learn and use. I like that it supports both graph traversals and graph pattern matching (aka declarative traversals). In many cases, I would prefer the Gremlin syntax to the SPARQL syntax.
    What do you dislike about the product?
    I would be interested to see inferencing support (materialized or not) in the future. Mixing features like declarative and non-declarative traversals could be a bit cumbersome.
    What problems is the product solving and how is that benefiting you?
    There are many use cases where the property graph data model and graph traversals with Gremlin can be very useful. I used Gremlin to solve problems related to fraud detection, real-time recommendations and customer 360.
    Pranav S.

    Gremlin is one of the few good Chaos Engineering providers with continuous improvements

    Reviewed on Feb 16, 2022
    Review provided by G2
    What do you like best about the product?
    Support for Chaos Engineering on Cloud Platforms for testing weak points in infra availability, resilience & security. Support is good for new implementations.
    What do you dislike about the product?
    The only thing I can think of is that supporting new technologies like serverless requires continuous updates as cloud platforms change, so a maturity model is quite tough to maintain for such chaos products.
    What problems is the product solving and how is that benefiting you?
    We try to find out how the system will behave under inevitable failures, such as a specific service or VM going down, how long recovery takes (MTTR), and how a resilient architecture handles random failures.
    Emulating black-holed services to simulate different service failures helps us understand cascading failures that we might not have anticipated in the design. It also shows how interconnected services behave or misbehave when a dependency fails, and helps prevent data loss during middleware failures.
    Reuben Rajan G.

    Go-to solution to get started with Chaos Engineering

    Reviewed on Feb 02, 2022
    Review provided by G2
    What do you like best about the product?
    Easy to use chaos engineering tool, minimal installation, great for entrants in chaos engineering concepts. Easy cloud integration. Lots of documentation to get started quickly.
    What do you dislike about the product?
    Has limited support for on-premise chaos injection, and it requires a subscription to run multi-point chaos experiments. An open-source version of the product is not available.
    What problems is the product solving and how is that benefiting you?
    We use Gremlin to run chaos tests against our K8 workloads hosted on AWS. Our teams can get quickly onboarded with the chaos engineering concepts. Great tool for our new joiners.