    Starburst Galaxy

    Sold by: Starburst 
    Deployed on AWS
    Starburst Galaxy offers a full-featured data lake analytics platform that allows you to discover, manage, and consume the data in and around your data lake.

    Overview

    Starburst Galaxy is a fully managed data lake analytics platform designed for large and complex data sets in and around your cloud data lake. It is the easiest and fastest way for you to start running queries at interactive speeds across data sources using the business intelligence and analytics tools you already know.

    Starburst Galaxy takes just minutes to set up and takes care of the heavy lifting of designing, provisioning, maintaining, and securing your Trino infrastructure. In addition, Galaxy offers proprietary features such as fully managed connectors, global search, schema discovery, monitoring and metrics, and data sharing with data products that allow your data teams to focus on generating unique insights from your data - not managing and building analytics infrastructure.

    Highlights

    • Simplicity - Starburst Galaxy lets you discover, govern, and prepare your data from a single, fully-managed platform. Future-proof your architecture with a single point of access and governance to all your data, including RBAC and ABAC capabilities.
    • Scalability - Built on top of a query engine designed to run at internet-scale, Starburst Galaxy automatically scales your infrastructure to the needs of your workload in just a few clicks.
    • Optionality - Starburst Galaxy works with any data storage and table format, so you never have to worry about locking yourself into a proprietary data ecosystem.

    Details

    Delivery method

    Deployed on AWS

    Features and programs

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    Starburst Galaxy

    Pricing is based on the duration and terms of your contract with the vendor, and additional usage. You pay upfront or in installments according to your contract terms with the vendor. This entitles you to a specified quantity of use for the contract duration. Usage-based pricing is in effect for overages or additional usage not covered in the contract. These charges are applied on top of the contract price. If you choose not to renew or replace your contract before the contract end date, access to your entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    1-month contract (1)

    Dimension        Description      Cost/month
    Standard Tier    Pay as you go    $0.00

    Additional usage costs (1)

    The following dimensions are not included in the contract terms and are charged based on your usage.

    Dimension    Cost/unit
    Usage fee    $0.01
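
    The two pricing tables above combine into a simple monthly estimate: the contract cost plus the usage fee multiplied by the units consumed. The following is a minimal sketch of that arithmetic only. The function name and the sample unit count are hypothetical, the listing does not define what a unit is, and AWS infrastructure costs are not included.

```python
from decimal import Decimal

# Figures taken from the pricing tables in this listing.
CONTRACT_COST_PER_MONTH = Decimal("0.00")  # Standard Tier, pay as you go
USAGE_FEE_PER_UNIT = Decimal("0.01")       # additional usage, per unit


def estimate_monthly_charge(units_used: int) -> Decimal:
    """Return the contract cost plus usage charges for one month.

    Assumption: every unit is billed at the listed usage fee. The listing
    does not say what a unit measures, so treat the result as illustrative.
    """
    if units_used < 0:
        raise ValueError("units_used must be non-negative")
    return CONTRACT_COST_PER_MONTH + USAGE_FEE_PER_UNIT * units_used


if __name__ == "__main__":
    # Hypothetical month with 12,500 billed units.
    print(estimate_monthly_charge(12_500))  # 125.00
```

    Decimal is used instead of float so that currency amounts stay exact.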

    Vendor refund policy

    No refunds.

    Custom pricing options

    Request a private offer to receive a custom quote.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Support

    Vendor support

    Get help directly from Starburst in the Starburst Galaxy UI by using our chat app. You can use the app to get answers to frequently asked questions, chat with a support agent, and search our knowledge base. For free, on-demand training, visit Starburst Academy. Docs: https://docs.starburst.io/starburst-galaxy/index.html Support Packages:

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 25 in Databases & Analytics Platforms, Business Intelligence & Advanced Analytics, Data Analytics
    Top 100 in Log Analysis, Analytic Platforms
    Top 10 in Data Warehouses

    Customer reviews

    Sentiment is AI generated from actual customer reviews on AWS and G2

    Reviews      Functionality        Ease of use          Customer service     Cost effectiveness
    2 reviews    Insufficient data    Insufficient data    Insufficient data    Insufficient data

    Overview

    AI generated from product descriptions

    Query Engine Performance: Fully managed data lake analytics platform built on a query engine designed for internet-scale performance
    Data Source Connectivity: Supports multiple data storage systems and table formats with flexible, universal data access capabilities
    Infrastructure Management: Automated infrastructure design, provisioning, maintenance, and security for complex data environments
    Access Control: Role-based and attribute-based access control (RBAC and ABAC) for comprehensive data governance
    Data Discovery: Advanced schema discovery, global search, and monitoring capabilities for complex data ecosystems
    Data Indexing: Indexes Amazon S3 data without transformation, optimizing for data size and performance
    Analytics Integration: Supports search, SQL, and machine learning workloads through open APIs with tools like Kibana, Elastic, Looker, and Tableau
    Cloud Storage Transformation: Converts Amazon S3 into a hot analytical data lake with native indexing capabilities
    Data Access Architecture: Enables direct data access without complex data pipelines, parsing, or schema changes
    Scalability Mechanism: Provides infinite scale data analysis with no administrative overhead for re-indexing, sharding, or load balancing
    Data Lake Query Performance: Provides sub-second query response times using SQL query service on data lake platforms
    Open Standards Support: Utilizes community-driven standards like Apache Iceberg and Apache Arrow for processing engines
    Multi-Source Data Integration: Enables joining data from data lakes and external databases without data movement
    Compute Engine Management: Automatically handles compute engine lifecycle including provisioning, scaling, pausing, and decommissioning
    VPC-Based Data Processing: Deploys compute engines within customer's Amazon Virtual Private Cloud for secure data processing

    Contract

    Standard contract
    No
    No

    Customer reviews

    Ratings and reviews

    4.6 (5 ratings)

    5 star    60%
    4 star    40%
    3 star    0%
    2 star    0%
    1 star    0%

    5 AWS reviews | 91 external reviews
    Star ratings include only reviews from verified AWS customers. External reviews can also include a star rating, but star ratings from external reviews are not averaged in with the AWS customer star ratings.
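
    The 4.6 figure is consistent with a weighted mean of the star distribution shown for the 5 AWS ratings. As a quick check, here is a minimal sketch that assumes the displayed percentages are exact shares. The function name is hypothetical and this is not necessarily how AWS computes the displayed value.

```python
# Percentage of AWS ratings at each star level, as listed in this section.
STAR_DISTRIBUTION = {5: 60, 4: 40, 3: 0, 2: 0, 1: 0}


def average_rating(distribution: dict[int, int]) -> float:
    """Return the mean star rating weighted by each level's percentage."""
    total_percent = sum(distribution.values())
    if total_percent == 0:
        raise ValueError("distribution has no ratings")
    weighted_sum = sum(stars * pct for stars, pct in distribution.items())
    return weighted_sum / total_percent


if __name__ == "__main__":
    # (5 * 60 + 4 * 40) / 100
    print(round(average_rating(STAR_DISTRIBUTION), 1))  # 4.6
```

    With 5 ratings in total, 60% and 40% correspond to three 5-star and two 4-star ratings.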
    reviewer2750097

    Unified data access improves analytics and simplifies complex processes

    Reviewed on Aug 14, 2025
    Review from a verified AWS customer

    What is our primary use case?

    I use Starburst Galaxy on AWS as a federated query engine to access our S3-based Iceberg data lake, Snowflake, and Redshift without duplicating data. This enables secure, high-performance analytics and machine learning workloads with consistent governance across all data sources.

    How has it helped my organization?

    Starburst Galaxy has improved our organization by unifying access to all major data sources, reducing the need for complex ETL processes. In addition to our original use case, it has proven fast and reliable for Iceberg table maintenance, and it has enabled ingestion of Kafka feeds into our AWS S3 data lake, further increasing its value to our data platform.

    What is most valuable?

    The features I value most are federated querying across S3 Iceberg, Snowflake, and Redshift; native Iceberg table management tools that make maintenance operations simple and performant; and the ability to connect directly to Kafka for streaming ingestion. The federated query capability has also enabled me to build a Sigma Computing dashboard that pulls data from Postgres, BigQuery, and Snowflake through a single Starburst Galaxy connection, greatly simplifying data access and integration.

    What needs improvement?

    I would like to see better alerting integrations for failures and errors in scheduled tasks and maintenance jobs. I also want support for more connectors such as Kinesis and Firehose, support for more file types such as Avro and JSON, and object storage message queue integration for object storage integrations. A single view of query execution and optimization details, rather than needing to toggle between the Galaxy and Trino UI, would be helpful. Additionally, enhanced control over account and environment variables that would be available in the Enterprise edition would be beneficial.

    For how long have I used the solution?

    I have used the solution for 1.5 years.

    Which solution did I use previously and why did I switch?

    I previously used several query engines, including Athena, EMR, Redshift, Snowflake, and BigQuery. Starburst Galaxy’s federated query capabilities allowed me to join data across clouds and platforms, reducing complexity.

    What's my experience with pricing, setup cost, and licensing?

    I recommend tracking usage metrics from the start, focusing on data scanned and query concurrency, so you can right-size spend. If workloads are steady, you should explore commitment-based pricing for better rates and factor in the operational savings from not having to manage and scale your own Trino or query infrastructure.

    Which other solutions did I evaluate?

    I reviewed several options including Databricks and Dremio. I was an early adopter of Snowflake and still use it as well. Starburst Galaxy was a better fit for my technology stack and developers.

    What other advice do I have?

    I have found that Starburst Galaxy’s flexibility makes it worth experimenting beyond the initial deployment plan. Features I originally viewed as secondary, such as Iceberg maintenance and Kafka ingestion, have become everyday tools. Building a strong relationship with the Starburst team has also helped me optimize configurations and discover new capabilities faster.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    reviewer2750082

    Platform reduces management overhead by deploying multiple clusters and tracking costs efficiently while enhancing performance with low-latency responses

    Reviewed on Aug 14, 2025
    Review from a verified AWS customer

    What is our primary use case?

    Starburst Galaxy serves as our primary SQL-based data processing engine, a strategic decision driven by its seamless integration with our AWS cloud infrastructure and its ability to deliver high performance with low-latency responses.

    The platform provides a comprehensive suite of functionalities that significantly enhance the daily operations of our data engineers and data analysts.

    How has it helped my organization?

    Starburst Galaxy has been instrumental in reducing the maintenance effort and management overhead of our Trino cluster, which is particularly valuable given our lean platform team responsible for Kovi's data infrastructure.

    The platform has enabled us to deploy multiple clusters for different purposes while providing clear cost tracking and utilization monitoring capabilities.

    What is most valuable?

    The most relevant functionalities today are cluster autoscaling for intensive load periods and automated metadata management through cleaning, compression, and orphaned file deletion in Iceberg.

    These capabilities significantly reduce reading costs, storage expenses, and query processing overhead.
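
    For readers unfamiliar with these operations: in open source Trino, the Iceberg connector exposes file compaction, snapshot expiry, and orphan file removal as ALTER TABLE ... EXECUTE commands. The sketch below only builds those SQL statements as strings. The table name and the 7-day retention are hypothetical, mapping "cleaning" to snapshot expiry is an assumption, and the review does not say how Starburst Galaxy schedules or runs this maintenance.

```python
def iceberg_maintenance_statements(table: str, retention: str = "7d") -> list[str]:
    """Build Trino SQL for common Iceberg table maintenance tasks.

    `table` is a fully qualified name such as catalog.schema.table.
    `retention` is a Trino duration string; "7d" is an illustrative
    default, not a recommendation taken from the review.
    """
    return [
        # Compaction: rewrite many small data files into fewer, larger ones.
        f"ALTER TABLE {table} EXECUTE optimize",
        # Cleaning: drop snapshots older than the retention threshold.
        f"ALTER TABLE {table} EXECUTE expire_snapshots(retention_threshold => '{retention}')",
        # Orphan deletion: remove files no longer referenced by table metadata.
        f"ALTER TABLE {table} EXECUTE remove_orphan_files(retention_threshold => '{retention}')",
    ]


if __name__ == "__main__":
    # Hypothetical table name used only for illustration.
    for statement in iceberg_maintenance_statements("lake.analytics.events"):
        print(statement)
```

    Keeping statement generation separate from execution makes it possible to review the SQL before running it through any Trino-compatible client.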

    What needs improvement?

    I maintain weekly conversations with Starburst's development and support teams, which provides me with visibility into the product roadmap and evolution.

    Currently, my primary need is the impersonation functionality for BI solutions within Starburst clusters, which would enable enhanced access control and data governance capabilities.

    For how long have I used the solution?

    I have used the solution for almost 2 years.

    Which solution did I use previously and why did I switch?

    Previously, I utilized the AWS stack with Redshift and Athena.

    I chose to migrate to Starburst Galaxy due to their expertise with Trino, superior aggregate cost structure compared to my previous solutions, and the rapid product evolution with new functionalities, problem corrections, and performance improvements.

    What's my experience with pricing, setup cost, and licensing?

    Since Starburst Galaxy's pricing model is simple to understand and easy to predict, there are no major secrets.

    Everything is transparent and accessible through the product console.

    The only point of attention is the S3 and transfer costs that should also be included when calculating the total cost.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    reviewer2750067

    Significantly improved our data architecture flexibility and performance management

    Reviewed on Aug 14, 2025
    Review from a verified AWS customer

    What is our primary use case?

    My team uses Starburst Galaxy for cross-database querying, Iceberg table management, and workload separation across multiple data sources. We implemented Starburst Galaxy to replace our self-hosted Trino setup, bridging gaps in our data warehousing situation where we need flexibility to read from various warehouses and write to different formats while maintaining clean compute separation.

    How has it helped my organization?

    Starburst Galaxy has significantly improved our data architecture flexibility and performance management. We have successfully solved cross-database query challenges by utilizing Starburst Galaxy's ability to read and write in Iceberg format on Trino, making our Iceberg tables usable externally across our entire data ecosystem.

    The compute separation capabilities have been transformative. We can easily split workloads and prevent sporadic usage spikes from slowing down critical processes. This has resulted in much more predictable performance and better resource utilization across our data operations.

    The clean entry point provided by the built-in query engine has streamlined our SQL development workflow, while the data products functionality gives us an excellent way to present our end-state warehouse-level tables to stakeholders.

    What is most valuable?

    The flexibility to connect to numerous different warehouses and write to various formats is Starburst Galaxy's standout feature. This adaptability allows it to mold itself perfectly to our specific needs rather than forcing us to conform to rigid constraints.

    The compute-focused architecture makes workload management incredibly straightforward. Since Trino focuses primarily on compute, it is really easy to work with and optimize. The user interface for navigating, managing permissions, and viewing queries and clusters is excellent and makes administration tasks much more manageable.

    Cross-database functionality combined with Iceberg format support has been game-changing for our data integration workflows.

    What needs improvement?

    For teams heavily invested in cutting-edge dbt features, it is worth noting that Starburst Galaxy is not a tier 1 dbt partner, so it is typically slower to adopt the newest dbt capabilities such as the Fusion Engine and Semantic Layer. While these features would be nice to have, it was not significant enough to deter us from choosing Starburst Galaxy. The core functionality works well and the benefits far outweigh this limitation.

    Cluster startup time is another pain point, typically 3 to 5 minutes, which is not the worst with proper planning but can be annoying for ad-hoc work. The lack of a Terraform provider is also a notable gap for infrastructure-as-code workflows. Additionally, integration between data products and the dbt Semantic Layer would significantly enhance the platform's value proposition.

    For how long have I used the solution?

    We have used Starburst Galaxy for a few months.

    Which solution did I use previously and why did I switch?

    We migrated our self-hosted Trino instance to Starburst Galaxy.

    What's my experience with pricing, setup cost, and licensing?

    Pricing is competitive and the value proposition depends on your specific use case and requirements. When evaluating against alternatives such as Snowflake, it is worth considering the unique flexibility and cross-database capabilities that Starburst Galaxy provides rather than focusing solely on compute costs.

    Which other solutions did I evaluate?

    We briefly explored other options, but given the one-to-one nature of Trino and Starburst Galaxy, it made for a more seamless transition.

    What other advice do I have?

    Starburst Galaxy excels as a flexible, adaptable solution for teams dealing with complex, multi-source data architectures. It may not be the absolute best at any single function, but its strength lies in being very good at many things while remaining highly malleable.

    I would particularly recommend it for teams that need cross-database functionality and Iceberg format support, though dbt-focused teams should be prepared to work around the slower adoption of cutting-edge dbt features. It is important to plan for cluster startup times in your workflows, and if infrastructure-as-code is important, factor in the current lack of Terraform support.

    Overall, if you are looking for a solution that can bridge gaps in your data architecture rather than replace everything, Starburst Galaxy is an excellent choice that provides the flexibility to adapt to your specific needs.

    Which deployment model are you using for this solution?

    Public Cloud
    Stephen-Howard

    Federated querying delivers integrated data at record speed and reduces processing time

    Reviewed on Aug 11, 2025
    Review from a verified AWS customer

    What is our primary use case?

    We use Starburst Galaxy to query data across our diverse data ecosystem. Our data has evolved over many years and is spread across many data sources. Starburst enables us to query across this ecosystem without having to move everything into a single location.

    Our teams require a method for integrating data from various systems for reporting and ad-hoc analysis, and Starburst Galaxy fundamentally meets this need.

    How has it helped my organization?

    The biggest win has been the ability to combine data from multiple sources and deliver it to the business at record speed.

    This capability has allowed us to query directly through Starburst Galaxy, enabling teams to access integrated data that would otherwise be hard to pull together.

    This has reduced both our ETL processing time and storage costs. We are answering questions that would have been hard, if not impossible, to answer previously because the data came from disparate, disconnected sources.

    What is most valuable?

    Federated querying through Starburst Galaxy has unlocked our ability to move data using SQL, keeping data in the data layer. The ability to use SQL to query multiple data sources and then write to a single destination has been essential.

    Additionally, setting up new data connections is straightforward.

    What needs improvement?

    I would like to see per-model cluster routing selection when using dbt. Cluster startup time can be slow, sometimes taking over a minute.

    For how long have I used the solution?

    We have been using the solution for 6 months.

    Which solution did I use previously and why did I switch?

    We started using Trino, which worked, but we wanted a reliable managed solution to help us scale.

    What's my experience with pricing, setup cost, and licensing?

    The pricing is transparent and reasonable.

    Which other solutions did I evaluate?

    We considered using open source Trino.

    What other advice do I have?

    Starburst Galaxy addresses our primary problem of managing and working with data spread across multiple systems. Our teams can access and combine data from any source, enabling faster insights and reducing the time spent on manual data wrangling.

    Starburst Galaxy is becoming a cornerstone of our data platform, empowering us to make smarter and faster decisions across the organization.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    reviewer2748021

    Has a cost-effective transformation for data management as efficient querying enhances productivity

    Reviewed on Aug 05, 2025
    Review provided by PeerSpot

    What is our primary use case?

    Our primary use case is to manage hundreds of terabytes of data efficiently across a wide range of internal use cases, including ingestion/ETL, machine learning pipelining, and customer-facing product workflows.

    It is a top priority to enable all engineers to have access to this volume of data without the concern of overspending on expensive cloud warehouse providers.

    How has it helped my organization?

    We have experienced several improvements across our organization.

    Our data ingestion processes previously involved copying data from S3 to Snowflake, which was fairly costly and required constant vigilance to purge old data so that our source tables would not bloat.

    Now we are able to move ingestion staging data to Iceberg tables, resulting in a much better experience in terms of both compute and storage costs as well as maintenance.

    Data transformation has also become more efficient.

    Starburst on Trino, combined with our SQL-native data transformation tool SQLMesh, has delivered anywhere from a two to five times improvement in compute performance across our transformation DAG.

    This improvement is largely due to how efficiently Trino scans relevant data without requiring any additional setup, such as defining partitions in Snowflake.

    In terms of cost effectiveness, we are already forecasting a 25% reduction in cloud data provider spending, even while continuing to use both Snowflake and Starburst.

    This is because we are able to shift a significant amount of compute to Galaxy, and the cost difference compared to our previous approach of running jobs exclusively on Snowflake is substantial.

    What is most valuable?

    Cross-catalog querying and compatibility with AWS Glue have both significantly enhanced the user experience.

    We operate several accounts within our AWS organization, each containing substantial volumes of data, and the onboarding process with Starburst has been fairly quick, even in the face of AWS IAM complexities.

    What needs improvement?

    The most persistent issue is the cluster spin-up time.

    Coming from Snowflake, where warehouse spin-ups are nearly instantaneous, it has been a challenge to adapt.

    However, I believe the Starburst team is working on solutions for this.

    Additionally, the cluster and query monitoring UI lacks an optimal user experience.

    I would recommend that the Starburst team invest in forking the Trino console and enhancing that tool, as observability is very important to us.

    More Starburst-specific documentation would also be helpful.

    I understand that some Trino functionality, such as certain parameters, is not supported, so clearer guidance would be appreciated.

    Which solution did I use previously and why did I switch?

    We previously used only Snowflake but are now shifting toward a more hybrid architecture.

    We primarily added Starburst to our stack due to the potential for significant cost savings and because implementing a lakehouse is a more effective long-term data strategy.

    What's my experience with pricing, setup cost, and licensing?

    The setup cost is fairly transparent.

    There are many opportunities to find cost savings or discounts, especially for a startup like ours.

    I appreciate that the pricing is available online, although I will note that comparable compute is only slightly cheaper than Snowflake warehouse costs, for example.

    Which other solutions did I evaluate?

    We considered Onehouse and Clickhouse as alternative solutions.

    What other advice do I have?

    We are in the early phases of our Starburst relationship and are looking forward to how we can grow with it in the future.

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)