Overview
Starburst Galaxy is a fully managed data lake analytics platform designed for large and complex data sets in and around your cloud data lake. It is the easiest and fastest way for you to start running queries at interactive speeds across data sources using the business intelligence and analytics tools you already know.
Starburst Galaxy takes just minutes to set up and takes care of the heavy lifting of designing, provisioning, maintaining, and securing your Trino infrastructure. In addition, Galaxy offers proprietary features such as fully managed connectors, global search, schema discovery, monitoring and metrics, and data sharing with data products that allow your data teams to focus on generating unique insights from your data - not managing and building analytics infrastructure.
Highlights
- Simplicity - Starburst Galaxy lets you discover, govern, and prepare your data from a single, fully-managed platform. Future-proof your architecture with a single point of access and governance to all your data, including RBAC and ABAC capabilities.
- Scalability - Built on top of a query engine designed to run at internet-scale, Starburst Galaxy automatically scales your infrastructure to the needs of your workload in just a few clicks.
- Optionality - Starburst Galaxy works with any data storage and table format, so you never have to worry about locking yourself into a proprietary data ecosystem.
Details
Pricing
Vendor refund policy
No refunds.
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Support
Vendor support
Get help directly from Starburst in the Starburst Galaxy UI by using our chat app. You can use the app to get answers to frequently asked questions, chat with a support agent, and search our knowledge base. For free, on-demand training, visit Starburst Academy.
Docs: https://docs.starburst.io/starburst-galaxy/index.html
Support Packages:
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.


Customer reviews
Unified data access improves analytics and simplifies complex processes
What is our primary use case?
I use Starburst Galaxy on AWS as a federated query engine to access our S3-based Iceberg data lake, Snowflake, and Redshift without duplicating data. This enables secure, high-performance analytics and machine learning workloads with consistent governance across all data sources.
How has it helped my organization?
Starburst Galaxy has improved our organization by unifying access to all major data sources, reducing the need for complex ETL processes. In addition to our original use case, it has proven fast and reliable for Iceberg table maintenance, and it has enabled ingestion of Kafka feeds into our AWS S3 data lake, further increasing its value to our data platform.
What is most valuable?
The features I value most are federated querying across S3 Iceberg, Snowflake, and Redshift; native Iceberg table management tools that make maintenance operations simple and performant; and the ability to connect directly to Kafka for streaming ingestion. The federated query capability has also enabled me to build a Sigma Computing dashboard that pulls data from Postgres, BigQuery, and Snowflake through a single Starburst Galaxy connection, greatly simplifying data access and integration.
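To illustrate the federated pattern described above, here is a minimal sketch that builds one SQL statement joining tables from two catalogs. Trino addresses tables as catalog.schema.table, which is what lets a single query span sources. The catalog, schema, table, and column names (lake_iceberg, snowflake_dw, and so on) are hypothetical placeholders, not names from this review, and the sketch only builds the query text; it does not connect to a cluster.

```python
# Sketch only: every catalog, schema, table, and column name here is hypothetical.
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified, double-quoted Trino table name."""
    # Note: this simple quoting does not escape double quotes inside a name.
    return ".".join(f'"{part}"' for part in (catalog, schema, table))


def federated_join_sql() -> str:
    """Build one SELECT that joins an Iceberg table with a Snowflake table."""
    orders = qualified("lake_iceberg", "sales", "orders")      # assumed S3/Iceberg catalog
    customers = qualified("snowflake_dw", "crm", "customers")  # assumed Snowflake catalog
    return (
        "SELECT c.region, sum(o.amount) AS total_amount\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region"
    )


if __name__ == "__main__":
    print(federated_join_sql())
```

The resulting text could be submitted through any SQL client connected to the cluster; a BI tool using a single connection would issue the same kind of cross-catalog statement.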
What needs improvement?
I would like to see better alerting integrations for failures and errors in scheduled tasks and maintenance jobs. I also want support for more connectors such as Kinesis and Firehose, support for more file types such as Avro and JSON, and message queue integration for object storage. A single view of query execution and optimization details, rather than needing to toggle between the Galaxy and Trino UI, would be helpful. Additionally, the enhanced control over account and environment variables that is available in the Enterprise edition would be beneficial.
Which solution did I use previously and why did I switch?
I previously used several query engines, including Athena, EMR, Redshift, Snowflake, and BigQuery. Starburst Galaxy’s federated query capabilities allowed me to join data across clouds and platforms, reducing complexity.
What's my experience with pricing, setup cost, and licensing?
I recommend tracking usage metrics from the start, focusing on data scanned and query concurrency, so you can right-size spend. If workloads are steady, you should explore commitment-based pricing for better rates and factor in the operational savings from not having to manage and scale your own Trino or query infrastructure.
Which other solutions did I evaluate?
I reviewed several options including Databricks and Dremio. I was an early adopter of Snowflake and still use it as well. Starburst Galaxy was a better fit for my technology stack and developers.
What other advice do I have?
I have found that Starburst Galaxy’s flexibility makes it worth experimenting beyond the initial deployment plan. Features I originally viewed as secondary, such as Iceberg maintenance and Kafka ingestion, have become everyday tools. Building a strong relationship with the Starburst team has also helped me optimize configurations and discover new capabilities faster.
Platform reduces management overhead by deploying multiple clusters and tracking costs efficiently while enhancing performance with low-latency responses
What is our primary use case?
Starburst Galaxy serves as our primary SQL-based data processing engine, a strategic decision driven by its seamless integration with our AWS cloud infrastructure and its ability to deliver high performance with low-latency responses.
The platform provides a comprehensive suite of functionalities that significantly enhance the daily operations of our data engineers and data analysts.
How has it helped my organization?
Starburst Galaxy has been instrumental in reducing the maintenance effort and management overhead of our Trino cluster, which is particularly valuable given our lean platform team responsible for Kovi's data infrastructure.
The platform has enabled us to deploy multiple clusters for different purposes while providing clear cost tracking and utilization monitoring capabilities.
What is most valuable?
The most relevant functionalities today are cluster autoscaling for intensive load periods and automated metadata management through cleaning, compression, and orphaned file deletion in Iceberg.
These capabilities significantly reduce reading costs, storage expenses, and query processing overhead.
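As a rough illustration of what such maintenance involves, the sketch below generates the table maintenance statements that the open source Trino Iceberg connector exposes: optimize (file compaction), expire_snapshots (metadata cleanup), and remove_orphan_files. It assumes these map onto the cleaning, compression, and orphaned file deletion mentioned above. The table name and the retention value are hypothetical, and Galaxy's automated maintenance may be configured differently.

```python
from typing import List


def iceberg_maintenance_sql(table: str, retention: str = "7d") -> List[str]:
    """Return Trino Iceberg maintenance statements for one table.

    `table` is a fully qualified name (hypothetical in the example below);
    `retention` is a duration string such as '7d' (an assumed value).
    """
    return [
        # Compact small data files into larger ones.
        f"ALTER TABLE {table} EXECUTE optimize",
        # Drop snapshots older than the retention threshold.
        f"ALTER TABLE {table} EXECUTE expire_snapshots(retention_threshold => '{retention}')",
        # Delete files that no snapshot references any longer.
        f"ALTER TABLE {table} EXECUTE remove_orphan_files(retention_threshold => '{retention}')",
    ]


if __name__ == "__main__":
    for statement in iceberg_maintenance_sql("lake_iceberg.sales.orders"):
        print(statement + ";")
```

Running the statements on a schedule keeps file counts and metadata size in check, which is where the savings on reads and storage would come from.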
What needs improvement?
I maintain weekly conversations with Starburst's development and support teams, which provides me with visibility into the product roadmap and evolution.
Currently, my primary need is the impersonation functionality for BI solutions within Starburst clusters, which would enable enhanced access control and data governance capabilities.
For how long have I used the solution?
I have used the solution for almost 2 years.
Which solution did I use previously and why did I switch?
Previously, I utilized the AWS stack with Redshift and Athena.
I chose to migrate to Starburst Galaxy due to their expertise with Trino, superior aggregate cost structure compared to my previous solutions, and the rapid product evolution with new functionalities, problem corrections, and performance improvements.
What's my experience with pricing, setup cost, and licensing?
Since Starburst Galaxy's pricing model is simple to understand and easy to predict, there are no major secrets.
Everything is transparent and accessible through the product console.
The only point of attention is S3 and transfer costs, which should also be included when calculating the total cost.
Significantly improved our data architecture flexibility and performance management
What is our primary use case?
My team uses Starburst Galaxy for cross-database querying, Iceberg table management, and workload separation across multiple data sources. We implemented Starburst Galaxy to replace our self-hosted Trino setup, bridging gaps in our data warehousing situation where we need flexibility to read from various warehouses and write to different formats while maintaining clean compute separation.
How has it helped my organization?
Starburst Galaxy has significantly improved our data architecture flexibility and performance management. We have successfully solved cross-database query challenges by utilizing Starburst Galaxy's ability to read and write in Iceberg format on Trino, making our Iceberg tables usable externally across our entire data ecosystem.
The compute separation capabilities have been transformative. We can easily split workloads and prevent sporadic usage spikes from slowing down critical processes. This has resulted in much more predictable performance and better resource utilization across our data operations.
The clean entry point provided by the built-in query engine has streamlined our SQL development workflow, while the data products functionality gives us an excellent way to present our end-state warehouse-level tables to stakeholders.
What is most valuable?
The flexibility to connect to numerous different warehouses and write to various formats is Starburst Galaxy's standout feature. This adaptability allows it to mold itself perfectly to our specific needs rather than forcing us to conform to rigid constraints.
The compute-focused architecture makes workload management incredibly straightforward. Since Trino focuses primarily on compute, it is really easy to work with and optimize. The user interface for navigating, managing permissions, and viewing queries and clusters is excellent and makes administration tasks much more manageable.
Cross-database functionality combined with Iceberg format support has been game-changing for our data integration workflows.
What needs improvement?
For teams heavily invested in cutting-edge dbt features, it is worth noting that Starburst Galaxy is not a tier 1 dbt partner, so it is typically slower to adopt the newest dbt capabilities such as the Fusion Engine and Semantic Layer. While these features would be nice to have, their absence was not significant enough to deter us from choosing Starburst Galaxy. The core functionality works well and the benefits far outweigh this limitation.
Cluster startup time is another pain point, typically 3 to 5 minutes, which is not the worst with proper planning but can be annoying for ad-hoc work. The lack of a Terraform provider is also a notable gap for infrastructure-as-code workflows. Additionally, integration between data products and the dbt Semantic Layer would significantly enhance the platform's value proposition.
For how long have I used the solution?
We have used Starburst Galaxy for a few months.
Which solution did I use previously and why did I switch?
We migrated our self-hosted Trino instance to Starburst Galaxy.
What's my experience with pricing, setup cost, and licensing?
Pricing is competitive and the value proposition depends on your specific use case and requirements. When evaluating against alternatives such as Snowflake, it is worth considering the unique flexibility and cross-database capabilities that Starburst Galaxy provides rather than focusing solely on compute costs.
Which other solutions did I evaluate?
We briefly explored other options, but given the one-to-one nature of Trino and Starburst Galaxy, it made for a more seamless transition.
What other advice do I have?
Starburst Galaxy excels as a flexible, adaptable solution for teams dealing with complex, multi-source data architectures. It may not be the absolute best at any single function, but its strength lies in being very good at many things while remaining highly malleable.
I would particularly recommend it for teams that need cross-database functionality and Iceberg format support, though dbt-focused teams should be prepared to work around the slower adoption of cutting-edge dbt features. It is important to plan for cluster startup times in your workflows, and if infrastructure-as-code is important, factor in the current lack of Terraform support.
Overall, if you are looking for a solution that can bridge gaps in your data architecture rather than replace everything, Starburst Galaxy is an excellent choice that provides the flexibility to adapt to your specific needs.
Federated querying delivers integrated data at record speed and reduces processing time
What is our primary use case?
We use Starburst Galaxy to query data across our diverse data ecosystem. Our data has evolved over many years and is spread across many data sources. Starburst enables us to query across this ecosystem without having to move everything into a single location.
Our teams require a method for integrating data from various systems for reporting and ad-hoc analysis, and Starburst Galaxy fundamentally meets this need.
How has it helped my organization?
The biggest win has been the ability to combine data from multiple sources and deliver it to the business at record speed.
This capability has allowed us to query directly through Starburst Galaxy, enabling teams to access integrated data that would otherwise be hard to pull together.
This has reduced both our ETL processing time and storage costs. We are answering questions that would have been hard, if not impossible, to answer previously because the data came from disparate, disconnected sources.
What is most valuable?
Federated querying through Starburst Galaxy has unlocked our ability to move data using SQL, keeping data in the data layer. The ability to use SQL to query multiple data sources and then write to a single destination has been essential.
Additionally, setting up new data connections is straightforward.
What needs improvement?
I would like to see per-model cluster routing selection when using dbt. Cluster startup time can be slow, sometimes taking over a minute.
Which solution did I use previously and why did I switch?
We started using Trino, which worked, but we wanted a reliable managed solution to help us scale.
What's my experience with pricing, setup cost, and licensing?
The pricing is transparent and reasonable.
Which other solutions did I evaluate?
We considered using open source Trino.
What other advice do I have?
Starburst Galaxy addresses our primary problem of managing and working with data spread across multiple systems. Our teams can access and combine data from any source, enabling faster insights and reducing the time spent on manual data wrangling.
Starburst Galaxy is becoming a cornerstone of our data platform, empowering us to make smarter and faster decisions across the organization.
Has a cost-effective transformation for data management as efficient querying enhances productivity
What is our primary use case?
Our primary use case is to manage hundreds of terabytes of data efficiently across a wide range of internal use cases, including ingestion/ETL, machine learning pipelining, and customer-facing product workflows.
It is a top priority to enable all engineers to have access to this volume of data without the concern of overspending on expensive cloud warehouse providers.
How has it helped my organization?
We have experienced several improvements across our organization.
Our data ingestion processes previously involved copying data from S3 to Snowflake, which was fairly costly and required constant vigilance to purge old data so that our source tables would not bloat.
Now we are able to move ingestion staging data to Iceberg tables, resulting in a much better experience in terms of both compute and storage costs as well as maintenance.
Data transformation has also become more efficient.
Starburst on Trino, combined with our SQL-native data transformation tool SQLMesh, has delivered anywhere from a two to five times improvement in compute performance across our transformation DAG.
This improvement is largely due to how efficiently Trino scans relevant data without requiring any additional setup, such as defining partitions in Snowflake.
In terms of cost effectiveness, we are already forecasting a 25% reduction in cloud data provider spending, even while continuing to use both Snowflake and Starburst.
This is because we are able to shift a significant amount of compute to Galaxy, and the cost difference compared to our previous approach of running jobs exclusively on Snowflake is substantial.
What is most valuable?
Cross-catalog querying and compatibility with AWS Glue have both significantly enhanced the user experience.
We operate several accounts within our AWS organization, each containing substantial volumes of data, and the onboarding process with Starburst has been fairly quick, even in the face of AWS IAM complexities.
What needs improvement?
The most persistent issue is the cluster spin-up time.
Coming from Snowflake, where warehouse spin-ups are nearly instantaneous, it has been a challenge to adapt.
However, I believe the Starburst team is working on solutions for this.
Additionally, the cluster and query monitoring UI lacks an optimal user experience.
I would recommend that the Starburst team invest in forking the Trino console and enhancing that tool, as observability is very important to us.
More Starburst-specific documentation would also be helpful.
I understand that some Trino functionality, such as certain parameters, is not supported, so clearer guidance would be appreciated.
Which solution did I use previously and why did I switch?
We previously used only Snowflake but are now shifting toward a more hybrid architecture.
We primarily added Starburst to our stack due to the potential for significant cost savings and because implementing a lakehouse is a more effective long-term data strategy.
What's my experience with pricing, setup cost, and licensing?
The setup cost is fairly transparent.
There are many opportunities to find cost savings or discounts, especially for a startup like ours.
I appreciate that the pricing is available online, although I will note that comparable compute is only slightly cheaper than Snowflake warehouse costs, for example.
Which other solutions did I evaluate?
We considered Onehouse and Clickhouse as alternative solutions.
What other advice do I have?
We are in the early phases of our Starburst relationship and are looking forward to how we can grow with it in the future.