Cloud data warehousing has transformed analytics workflows and delivers faster, cheaper insights
What is our primary use case?
Snowflake is primarily used for data warehousing: we build data models, keep the raw data, and create reporting data that feeds further analytics.
Whenever we receive files through API integrations or from the front end, such as Excel files with HR, sales, or marketing data, we load them into Snowflake. These files may be in non-relational formats, such as Parquet or text files. In Snowflake, we transform those files into tables. Once the tables exist, we create dimension tables according to business needs, in either a star schema or a snowflake schema. On top of that schema, we build curated data, and then we create views that serve as the source for reporting and analytics. In short, we load the raw data, transform it, and expose it for reporting.
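The raw-to-reporting flow described above can be sketched roughly as follows. This is only an illustration: an in-memory SQLite database stands in for the warehouse, and the table names, view name, and sample rows are all hypothetical rather than taken from our environment.

```python
import sqlite3

# In-memory SQLite is a stand-in for the warehouse; this is NOT Snowflake's API.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1. Raw layer: land the file contents as-is (hypothetical sales rows).
cur.execute("CREATE TABLE raw_sales (order_id INTEGER, region TEXT, amount REAL)")
raw_rows = [(1, "EMEA", 120.0), (2, "APAC", 80.0), (3, "EMEA", 50.0)]
cur.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", raw_rows)

# 2. Dimension table: one row per distinct region (a star-schema dimension).
cur.execute(
    "CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT UNIQUE)"
)
cur.execute(
    "INSERT INTO dim_region (region) "
    "SELECT DISTINCT region FROM raw_sales ORDER BY region"
)

# 3. Curated fact table: raw rows keyed to the dimension.
cur.execute(
    """
    CREATE TABLE fact_sales AS
    SELECT r.order_id, d.region_id, r.amount
    FROM raw_sales r JOIN dim_region d ON d.region = r.region
    """
)

# 4. Reporting view that analytics tools would query.
cur.execute(
    """
    CREATE VIEW v_sales_by_region AS
    SELECT d.region, SUM(f.amount) AS total_amount
    FROM fact_sales f JOIN dim_region d ON d.region_id = f.region_id
    GROUP BY d.region
    """
)

report = cur.execute(
    "SELECT region, total_amount FROM v_sales_by_region ORDER BY region"
).fetchall()
print(report)  # [('APAC', 80.0), ('EMEA', 170.0)]
```

In a real deployment the first step would be a bulk load from staged files, but the layering (raw, dimension, curated, view) is the same idea.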
Snowflake is utilized as a hybrid cloud. We primarily use it as a private cloud, but some components are hybrid.
What is most valuable?
Snowflake has a very wide variety of data integration capabilities. For workflows and data sources, we can capture semi-structured or unstructured data such as Avro, Parquet, JSON, and text files, or fetch data through API integrations such as the Salesforce API or SAP. We try to gather all the data and load it into Snowflake. Snowflake works very well because it creates and manages micro-partitions automatically. Whenever we run SQL queries, it automatically distributes and runs the query across a multi-cluster, multi-cloud enterprise environment.
The top features of Snowflake are as follows. First, it is hybrid cloud native. Everything is in the cloud, so we have almost zero maintenance for compute or for Snowflake itself. Second, it is a very simple tool to understand, with an architecture that separates the data layer, storage layer, computational layer, and UI layer. This separation gives us very low costs and is very helpful for SQL analysis. We keep data in Snowflake at low cost, and if we don't need to compute on it, we can also move it to cloud storage. Third, it is very elastic in scaling. Whenever a large amount of SQL or computational work is required, the warehouse automatically scales in and out. This is one of Snowflake's best features: scaling is managed automatically, and transactions follow the ACID properties of atomicity, consistency, isolation, and durability. Each query runs in isolation, and Snowflake optimizes performance well. It is very simple to use.
The elastic scaling helps significantly. For example, if we receive a large amount of data after a leave period, the warehouse needs to be scaled to a larger size. If we have a heavy query, the warehouse scales automatically, and the query is processed in parallel over micro-partitions for quick execution. If we encounter unexpected data, such as special characters, we can use Snowflake's zero-copy cloning to create a copy for development and testing. A clone does not duplicate the physical storage, which reduces storage costs.
Snowflake has positively impacted us by making everything cloud-native, significantly reducing the effort of running systems and applications. Maintenance costs are low because Snowflake stores large amounts of data efficiently at a low cost. Data warehousing performance is also massively scalable: whenever our SQL workload grows, the warehouse scales up or down based on our needs and query requirements. If the query is not heavy, the costs remain low, and the separation of the storage and computational layers makes Snowflake very cost-effective compared to other data warehousing tools.
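To show why a clone adds no storage up front, here is a toy model of the idea, not Snowflake's implementation. It assumes a table is just a list of references to immutable partitions; the `Table` class and `stored_partitions` helper are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy table: an ordered list of references to immutable partitions."""

    partitions: list = field(default_factory=list)

    def clone(self) -> "Table":
        # Copy only the list of references; the partition objects are shared.
        return Table(partitions=list(self.partitions))

    def append_rows(self, rows) -> None:
        # New data goes into a new partition owned by this table alone.
        self.partitions.append(tuple(rows))


def stored_partitions(*tables) -> int:
    """Count distinct partition objects actually held across the tables."""
    return len({id(p) for t in tables for p in t.partitions})


prod = Table()
prod.append_rows([("a", 1), ("b", 2)])
prod.append_rows([("c", 3)])

dev = prod.clone()                       # nothing is physically duplicated
shared = stored_partitions(prod, dev)    # still 2 partitions in total

dev.append_rows([("bad\u00e9", 4)])      # test a row with a special character
after_write = stored_partitions(prod, dev)  # 3: only the new partition is added

print(shared, after_write, len(prod.partitions))  # 2 3 2
```

Only the partition written in the clone costs extra storage, and the production table is untouched, which is why a clone is convenient for testing problematic data.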
Snowflake has contributed to significant cost savings. Previously, we paid for storage and compute together, so unused data accumulated costs unnecessarily. With Snowflake, storage and compute are billed separately, and we can transfer less frequently accessed data to Glacier, further optimizing expenses. Snowflake's performance improves business efficiency and helps us meet targets on time. The analytics and transformations we run in Snowflake are notably faster. The scalability is remarkable; marketing teams can quickly analyze data on time-sensitive trends.
What needs improvement?
Snowflake is already quite mature, and it has recently introduced AI features. Further AI integration would be beneficial for capturing data directly from systems such as SAP and Salesforce into Snowflake as raw data, allowing for more efficient data warehousing.
Snowflake is very good overall, but it could improve its documentation on supporting different data structures. Even though Snowflake has strong ANSI SQL support, some features do not perform optimally, particularly Change Data Capture (CDC).
I do not rate it a 10 because there are still areas for improvement, such as AI integration and raw data capture. Once those enhancements arrive, the dependency on external ETL and scheduling tools would decrease, streamlining major data workloads within Snowflake.
For how long have I used the solution?
I have been working with it for more than six months in my current role, and it will soon be about a year.
What do I think about the stability of the solution?
Snowflake is highly stable and performs well even with large data sets exceeding terabytes, maintaining stability throughout.
What do I think about the scalability of the solution?
Snowflake's scalability is excellent across dataset sizes and query loads. It can scale from one engine up to 32 engines seamlessly while maintaining performance.
How are customer service and support?
We interacted with customer support, and they were quite helpful. They listened to our issues and provided solutions, but we faced challenges acquiring documentation for various structures. We sought this documentation multiple times but faced difficulty in obtaining it.
Which solution did I use previously and why did I switch?
Earlier, we were using on-premise data warehousing with Hadoop, but it was complex and high-maintenance. The setup was also prolonged and costly. Therefore, we transitioned to Snowflake, which allows for rapid setup.
What's my experience with pricing, setup cost, and licensing?
For pricing, setup cost, and licensing, everything is managed smoothly. Licensing is inexpensive. The setup cost is low, mainly because we purchase through AWS Marketplace; we only need to pay for serverless configurations. As a cloud service, its setup cost is very low compared to other platforms, such as Oracle or PostgreSQL, which typically cost more to set up. Snowflake's handling of computational resources and hardware is excellent, making it one of the most affordable data warehousing solutions on the market.
Which other solutions did I evaluate?
Before choosing Snowflake, we evaluated other options such as Redshift and Databricks. Redshift's architecture seemed tightly coupled, leading to slower scaling and limited concurrency, and it required extensive maintenance. Redshift's clustering methods also increased complexity in data management. In contrast, Snowflake offers a fully managed service with micro-partitioning and independent scaling, making it a more desirable option.
What other advice do I have?
One more feature is Time Travel, which maintains historical data. For transient tables, the retention period is at most one day, while for permanent tables it can be set up to 90 days on Enterprise Edition. If we accidentally drop a table, we can restore it ourselves within the retention period. After that, permanent tables pass into Fail-safe, a further seven-day period during which we can ask the Snowflake support team to retrieve the data. Additionally, Snowflake offers very secure data sharing, which avoids moving data between applications while still bringing data into one place for transformation and loading into the warehouse. Storage, security, and compliance are all managed well within Snowflake, and we just need to maintain the security policies as administrators.
Another advantage is very low maintenance. The infrastructure has moved to the cloud, which is very manageable. Snowflake gives us very high concurrency, and we can predict our costs effectively. Snowflake is very helpful for all the enterprise data analytics we perform, whether BI reports or other analytics.
My advice is to leverage Snowflake for its scalability, efficient query handling, user role assignments, and direct integrations with platforms such as SAP and Salesforce. The decoupling of compute and storage is a key feature, allowing us to pay based on actual usage. Features such as zero-copy cloning and Time Travel enable easy data recovery and improve overall data management. Additionally, automatic tuning enhances query performance, and secure data sharing aligns with stringent security policies.
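The recovery windows can be reasoned about with a small helper like the one below. It is a sketch under assumptions: the function name `recovery_path` is hypothetical, and the retention and Fail-safe lengths are passed in as parameters because the actual values depend on table type, edition, and account settings.

```python
from datetime import datetime, timedelta


def recovery_path(dropped_at, now, retention_days, fail_safe_days=7):
    """Classify how a dropped table could be recovered at time `now`.

    retention_days and fail_safe_days are assumptions supplied by the caller.
    """
    age = now - dropped_at
    if age <= timedelta(days=retention_days):
        return "time_travel"    # self-service restore
    if age <= timedelta(days=retention_days + fail_safe_days):
        return "fail_safe"      # requires a request to Snowflake support
    return "unrecoverable"


dropped = datetime(2024, 1, 1)

# 45 days after the drop, with a 90-day retention period
print(recovery_path(dropped, datetime(2024, 2, 15), retention_days=90))
# 95 days after the drop: past retention, inside the Fail-safe window
print(recovery_path(dropped, datetime(2024, 4, 5), retention_days=90))
# 121 days after the drop: both windows have passed
print(recovery_path(dropped, datetime(2024, 5, 1), retention_days=90))
```

The three calls print `time_travel`, `fail_safe`, and `unrecoverable` in that order.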
I believe we have covered almost everything. To summarize, Snowflake offers scalable storage that adjusts automatically, high concurrency, and virtually no downtime. It continually enhances performance by improving cluster configurations. I rate this product an 8.5 out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Users maximize data management with seamless third-party integration and AI capabilities
What is our primary use case?
I primarily use Snowflake for hosting and analyzing data. It acts as a data warehouse where data is stored, analyzed, and moved from stage to stage, ultimately exposing it to end users. Additionally, there is an increasing trend in implementing AI capabilities, allowing me to write SQL queries for insights into structured and unstructured data.
What is most valuable?
The independence of compute and storage within Snowflake is key. The integration with third-party solutions like dbt, Airflow, and Fivetran is highly beneficial. The scalability options it provides, addressing issues without tying workloads into one virtual machine, enhance functionality. The fast pace of delivering new AI features also brings excitement about future possibilities. Further, being able to perform AI and Machine Learning in the same location as the data is quite advantageous.
What needs improvement?
There is a need for a tool to help me estimate the cost of using Snowflake. Enhancements in user experience for data observability and quality checks would be beneficial, as these tasks currently require SQL coding, which might be challenging for some users. It would also help if Snowflake provided clear guidelines on how requests impact warehouse size.
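For context, the kind of hand-written quality check I mean looks roughly like this. The sketch uses an in-memory SQLite database as a stand-in so it runs anywhere; the `customers` table and its rows are hypothetical, and a real check would run the same style of SQL against Snowflake.

```python
import sqlite3

# SQLite stands in for the warehouse; table and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "b@example.com"), (3, "c@example.com")],
)

# Check 1: rows with a missing email.
null_emails = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE email IS NULL"
).fetchone()[0]

# Check 2: ids that appear more than once.
duplicate_ids = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT id FROM customers GROUP BY id HAVING COUNT(*) > 1
    )
    """
).fetchone()[0]

checks = {"null_emails": null_emails, "duplicate_ids": duplicate_ids}
print(checks)  # {'null_emails': 1, 'duplicate_ids': 1}
```

Each rule is another query to write and maintain, which is why a more guided experience would help less SQL-fluent users.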
For how long have I used the solution?
I began using Snowflake in 2021.
What do I think about the stability of the solution?
Snowflake as a SaaS offering means that maintenance isn't an issue for me, and I have not experienced any cases where it was down.
What do I think about the scalability of the solution?
While Snowflake provides the ability to scale resources, the expected return on investment is not always achieved. The billing doubles with size increase, but processing does not necessarily speed up accordingly.
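A quick calculation shows the effect. Credits per hour for standard warehouses double with each size step; the job runtimes below are hypothetical, and the sketch ignores per-second billing details and credit prices.

```python
# Credits per hour for standard warehouse sizes (each step doubles).
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}


def query_credits(size: str, runtime_minutes: float) -> float:
    """Credits consumed by a job that keeps the warehouse busy for the given time."""
    return CREDITS_PER_HOUR[size] * runtime_minutes / 60


# Hypothetical job: 60 minutes on a Medium warehouse.
baseline = query_credits("M", 60)          # 4.0 credits

# Ideal case: doubling the size halves the runtime, so cost is unchanged.
ideal = query_credits("L", 30)             # 4.0 credits

# What I often see: the job only gets somewhat faster, so it costs more.
realistic = query_credits("L", 45)         # 6.0 credits

print(baseline, ideal, realistic)  # 4.0 4.0 6.0
```

Sizing up only pays for itself when the runtime drops in proportion, which is workload dependent.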
How are customer service and support?
The technical support from Snowflake is very good, nice, and efficient. I rate it ten out of ten.
How was the initial setup?
Setting up Snowflake in 2021 was challenging, especially due to the strong security requirements at the enterprise level. It involved back-and-forth communication with Snowflake and Azure support. However, the documentation has improved over time, which would likely streamline the process now.
What was our ROI?
I assume I achieve a certain level of return on investment, though I am skeptical about the calculations. However, I am generally happy after adopting Snowflake.
What's my experience with pricing, setup cost, and licensing?
It is complicated to understand how requests impact warehouse size. Unlike competitors such as Microsoft and Databricks, Snowflake lacks transparency in estimating resource usage.
Which other solutions did I evaluate?
Snowflake's main competitor is Databricks. Databricks was initially built for big data and machine learning, and then moved to SQL capabilities, while Snowflake followed the opposite trajectory.
What other advice do I have?
New users should not proceed on their own without leveraging the experience of others who have already implemented Snowflake. Establishing a framework for operation and change management is crucial. Define a clear operating model for Snowflake adoption, and start with a small warehouse to adjust as needed. I rate Snowflake a 9.5 out of ten because there is still room for improvement.
Transformation in data querying speed with good migration capabilities
What is our primary use case?
I started working with Snowflake when I was with Fidelity Investments around 2016-2017. We used Snowflake on AWS cloud because Snowflake doesn’t have an on-premise offering. You need to use it with AWS, Azure, or Google Cloud.
As a consultant now, I assist enterprise customers, though I don't have Snowflake deployments yet.
What is most valuable?
Snowflake is a data lake on the cloud where all processing happens in memory, resulting in very fast query responses. One key feature is the separation of compute and storage, which eliminates storage limitations.
It also has tools for migrating data from legacy databases like Oracle. Its stability and efficiency enhance performance greatly. Tools in the AI/ML marketplace are readily available without needing development.
What needs improvement?
Cost reduction is one area I would like Snowflake to improve. The product is not very cheap, and a reduction in costs would be appreciated.
What do I think about the stability of the solution?
Snowflake is very stable, especially when used with AWS. It works best with AWS compared to Google Cloud and Azure.
What do I think about the scalability of the solution?
Snowflake is very scalable and has a dedicated team constantly improving the product. There are no problems on the scalability side.
How are customer service and support?
Snowflake's technical support is excellent. During my time at Fidelity, I received great support in migrating data to Snowflake, with quick responses and innovative solutions.
How was the initial setup?
I rate the initial setup six out of ten because of the time required to migrate existing data to Snowflake. Configuration and data migration are the major steps involved.
What's my experience with pricing, setup cost, and licensing?
Snowflake's pricing is on the higher side, rated as eight out of ten. If there were ways to reduce costs, it would be a positive improvement.
What other advice do I have?
Snowflake is a great solution if you have substantial data volume. For those considering Snowflake, be prepared for the necessary initial investment in time and resources.
I rate the overall solution nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Generates metrics efficiently, but the integration process needs enhancement
What is most valuable?
The platform's most valuable features include its ability to effectively summarize and manage large datasets, allowing multiple teams to analyze and generate insights. Its integration with data lakes for business impact analysis, performance metrics, and KPIs is particularly important.
What needs improvement?
Improvement is needed in integrating external tools, such as data catalogs, which can be complicated due to differing formats and usage across departments. The goal is to enhance collaboration and streamline workflows.
What do I think about the scalability of the solution?
The product's scalability is crucial for managing petabyte-scale data generated daily across various regions, allowing for efficient data validation and handling.
How was the initial setup?
The primary challenges during the initial setup were the high pricing and uncertainties regarding future costs associated with data usage.
The deployment involved consultation among managers, agreement on on-site requirements, scale calculations, and collaboration with engineers for setup approval.
I rate the process a seven out of ten.
What other advice do I have?
Snowflake is integrated through a complex workflow that involves collecting data on the publisher side, using tools like Airflow and Kafka for batch jobs, and frequently importing data into the product from various sources, including S3 and Data Lakes. It creates a smooth data pipeline.
I rate it a seven out of ten.
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)