Unified data workflows have cut ticket processing times and are driving faster business insights
What is our primary use case?
My main use case for Databricks involves the pipelines and ETL processes that we are implementing. Following the Medallion architecture with Gold, Silver, and Bronze layers, we filter the data, perform transformations, and integrate AI. Databricks has made this process significantly easier.
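As a rough illustration of the Bronze, Silver, and Gold flow described above, here is a minimal sketch in plain Python. It is not Databricks or Spark code; the booking records, field names, and the `to_silver` and `to_gold` helpers are all hypothetical and only show the idea of raw data being cleaned and then aggregated.

```python
from collections import defaultdict

# Hypothetical raw booking events as they might land in a Bronze layer.
bronze = [
    {"booking_id": "B1", "route": "JFK-LHR", "fare": "450.00"},
    {"booking_id": "B1", "route": "JFK-LHR", "fare": "450.00"},  # duplicate
    {"booking_id": "B2", "route": "JFK-LHR", "fare": "510.50"},
    {"booking_id": "B3", "route": "SFO-NRT", "fare": None},      # missing fare
    {"booking_id": "B4", "route": "SFO-NRT", "fare": "820.00"},
]


def to_silver(rows):
    """Drop incomplete rows, remove duplicate bookings, and cast fare to float."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["fare"] is None or row["booking_id"] in seen:
            continue
        seen.add(row["booking_id"])
        cleaned.append({**row, "fare": float(row["fare"])})
    return cleaned


def to_gold(rows):
    """Aggregate cleaned bookings into booking count and revenue per route."""
    totals = defaultdict(lambda: {"bookings": 0, "revenue": 0.0})
    for row in rows:
        totals[row["route"]]["bookings"] += 1
        totals[row["route"]]["revenue"] += row["fare"]
    return dict(totals)


silver = to_silver(bronze)  # filtered and typed records
gold = to_gold(silver)      # business-level summary per route
```

In a real pipeline each layer would typically be a persisted table rather than an in-memory list, but the separation of raw, cleaned, and aggregated data is the same.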
I worked for an airline company that experienced substantial delays in data processing. When a passenger booked a ticket, it took 20 to 25 minutes for the transaction to appear in the system. Using Databricks, we first cut that time to between 6 and 10 minutes and eventually reduced it to just a few seconds. After we set up all the pipelines and used Databricks features to accelerate the process, the project had a real impact: processing time fell and, ultimately, the airline's profit increased.
What is most valuable?
The best features Databricks offers are Unity Catalog, Databricks Workflow, Databricks AI, Agentic AI, and the automated pipelines that utilize AI. The AI models are very easy to create and deploy in just a few seconds. These are helpful and user-friendly tools.
I find myself using Unity Catalog most frequently because it provides a unified governance solution for all data and AI needs on Databricks, offering centralized access control, auditing, lineage, and data discovery capabilities across the platform. The main features include access control, security compliance standard models, built-in auditing, and lineage tracking. Most of my projects have involved integrating Unity Catalog into systems and providing overall security, including a migration project to transition to Unity Catalog.
The platform's unified data intelligence capabilities allow teams to analyze, manage, and activate data at scale, leading to faster time to insights, cleaner data pipelines, and significant savings on infrastructure and engineering efforts. Databricks eliminates data silos, accelerates the time to insight, empowers all data personnel, and provides built-in governance and security. It also supports AI and ML, which is an added advantage in today's AI-driven field.
What needs improvement?
Databricks already provides monthly updates and continuously works on delivering new features while enhancing existing ones. However, the platform could become easier to use. While instruction-led workshops are available, offering more free instructional workshops would allow a wider audience to access and learn about Databricks. Additionally, providing use cases would help beginners gain more knowledge and hands-on experience.
Regarding my experience, I was initially unfamiliar with the platform and had to conduct research and learn through various videos. I did find some instruction-led classes, but several of those required payment. The platform should provide more free resources to enable a broader audience to access and learn about Databricks. The platform itself is user-friendly and easy to use without complex issues, so I believe it does not need improvement in its core functionality. Rather, supporting aspects can be enhanced.
For how long have I used the solution?
I have been working as a data engineer for four years. Initially, I was a software engineer, but my career has progressed as a data engineer over this four-year period.
What was our ROI?
Definitely. As I mentioned regarding my airline project, it was impactful because the cost was reduced by 60 to 70 percent. The company was initially using Azure Blob storage, and in Databricks, the cluster and associated infrastructure were cheaper than other platforms. This reduction in both time and money resulted in real-time impact and significant cost savings.
What other advice do I have?
My advice for others considering Databricks is to start by understanding its place in the data ecosystem and how it fits your specific needs. Familiarize yourself with Databricks, learn the basics, start with data engineering, and incorporate ETL processes. You can then dive deeper into Databricks features such as notebooks, clusters, and jobs. Earning a certification helps validate your skills. As for best practices, it is critical to optimize performance and minimize complexity while continuously learning to stay competitive in the data field. Following these steps will be very beneficial for anyone pursuing a career as a data engineer or Databricks engineer.
Databricks is a truly essential platform for data engineering needs, and I recommend it to anyone looking to advance in the data engineering field. It is a very important platform and tool for every data engineer. I encourage everyone to learn and explore this product and to maximize its potential. I rate this product a 9 out of 10.
AI Integration with the Data Lakehouse Made Databricks a Clear Choice
What do you like best about the product?
The integration of AI with the data lakehouse is the key thing that encouraged us to use Databricks.
What do you dislike about the product?
Databricks is more complex than Spark, so it takes more effort to fine-tune it for a given business use case.
What problems is the product solving and how is that benefiting you?
As the world's largest wealth manager, BlackRock processes many terabytes of data daily via Spark jobs. Deriving meaningful, highly flexible analytics from that data used to require a great deal of effort and expertise, but with Databricks AI models it has become easy.
Report 1100
What do you like best about the product?
I started with Databricks 6 years ago and have earned more than 10 certifications. I really like the data analytics and fast calculation features of Databricks, as well as its integration with external tools like Power BI for reporting.
What do you dislike about the product?
All features were fine, but more AI-powered features are needed to enhance the current ones.
What problems is the product solving and how is that benefiting you?
Analytics, fast calculations and reporting.
Effortless Data Insights and Governance
What do you like best about the product?
I like the Databricks Data Intelligence Platform for its data governance capabilities. The platform supports machine learning applications and offers helpful autofilling features. I also find the quick analytics code support to be a valuable aspect.
What do you dislike about the product?
I find it problematic that when tables have two similar attributes and I need to choose another one that isn't many-to-many, the platform can't handle that yet. Also, when I ask for the keys of a table, whether foreign or primary, it is not able to provide the correct key.
What problems is the product solving and how is that benefiting you?
Databricks Data Intelligence Platform reduces the time to find relationships between tables.
Databricks - Scalable Data
What do you like best about the product?
1. Easy for data teams once set up; notebooks, SQL, and dashboards work smoothly in one place.
2. Used frequently for data engineering, analytics, and ML workloads.
3. Implementation is structured and scalable, especially on cloud environments.
What do you dislike about the product?
1. Customer support quality depends on the support tier purchased.
2. Too many advanced features can feel overwhelming for smaller teams.
3. Initial setup and architecture planning take time and skilled resources.
What problems is the product solving and how is that benefiting you?
1. Helps me make faster, data-driven decisions with scalable and trusted data pipelines.
2. Handles large data volumes reliably, supporting daily and recurring workloads.
3. Reduces time spent managing infrastructure so I can focus on insights and outcomes.
Improved data governance has enabled sensitive data tracking but cost management still needs work
What is our primary use case?
My usual use case for Databricks as an end-user mostly involves exporting data. This typically entails writing directly into a web interface to get the data out, so probably with Python.
What is most valuable?
The most significant benefit Databricks has brought to my company is the Unity Catalog. Previously, with our data warehouse, we weren't able to track where sensitive data was. The Unity Catalog has been a big improvement, even though we haven't gotten the rest right.
The user interface is very useful, especially in writing directly into a web interface.
From my perspective, the ability to export data effectively and use Python within Databricks are key valuable features.
What needs improvement?
I believe we could improve Databricks integration with cloud service providers. The impact of our current integration has not been particularly good, and it's becoming very expensive for us. The inefficiencies in our implementation, such as not shutting down warehouses when they're not in use or reserving the right number of credits, have led to increased costs.
We made several beginner mistakes, such as not taking advantage of incremental loading and running overly complicated queries all the time. We should be using ETL tools to help us instead of doing it directly in Databricks. We need more experienced professionals to manage Databricks effectively, as it's not as forgiving as other platforms such as Snowflake.
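To make the incremental loading point concrete, here is a minimal sketch of the idea in plain Python. It is not a Databricks API; the `incremental_load` function, the `updated_at` field, and the sample rows are hypothetical. The point is only that each run picks up rows changed since the last run instead of reprocessing everything.

```python
from datetime import datetime


def incremental_load(source_rows, target_rows, watermark):
    """Append only rows modified after the watermark; return the new watermark."""
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    target_rows.extend(new_rows)
    if new_rows:
        watermark = max(r["updated_at"] for r in new_rows)
    return watermark


# Hypothetical source table with a last-modified timestamp per row.
source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 2)},
]
target = []
watermark = datetime.min

# First run: nothing has been loaded yet, so both rows are copied.
watermark = incremental_load(source, target, watermark)

# A new row arrives; the second run copies only that row.
source.append({"id": 3, "updated_at": datetime(2024, 1, 3)})
watermark = incremental_load(source, target, watermark)
```

A full reload would reprocess all three rows on the second run. The watermark approach touches only the new one, which is where the compute savings come from.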
I think introducing customer repositories would facilitate easier implementation with Databricks.
For how long have I used the solution?
I have been working with Databricks for the last six months.
What do I think about the stability of the solution?
As a platform, Databricks is fine. However, our implementation isn't particularly reliable.
We've suffered from the lack of professionals with previous experience, which makes it difficult to dig ourselves out of the situation we've found ourselves in.
What do I think about the scalability of the solution?
The scalability level of Databricks at the moment exceeds our needs. It's not a problem for us.
The sky's the limit with Databricks.
How are customer service and support?
I have addressed technical support about our issue with Databricks. It was the team that engaged with them, and I believe our development teams also reach out for support, though I'm not sure what level of support they get.
Previously, when using Snowflake, we had customer reps who were really knowledgeable and helped us to avoid beginner mistakes. With Databricks, it seems we could have benefited from similar support. Our implementation team had no experience and made obvious mistakes. It may be that we opted not to have that support, but I believe we should have.
Which solution did I use previously and why did I switch?
Before Databricks, I used SQL Server.
The big decision to switch from SQL Server to Databricks was motivated by the lack of auditing, lineage, and tracking sensitive data in SQL Server, along with a need for more flexibility.
How was the initial setup?
I did not participate in the initial setup of Databricks.
What about the implementation team?
We use a consultancy, Avanade, for our Databricks implementation. They had previously done a Databricks implementation for another part of our organization. Our implementation team lacked experience which resulted in several beginner mistakes.
What was our ROI?
So far, we're not measuring any return on investment, such as saving time, money, or resources with Databricks. We're still in the phase where our old system and the new system are running simultaneously, so everything is twice as expensive and much effort is doubled. We haven't progressed far enough yet to realize any ROI.
What's my experience with pricing, setup cost, and licensing?
I believe that in terms of credits for Databricks, we're spending between £15,000 and £20,000 a month.
I think Databricks is priced correctly. If we managed our resources better, we wouldn't be paying anywhere near that amount. The issue is with our management of resources.
Which other solutions did I evaluate?
No other options were considered because we used the consultancy Avanade, who had done a previous Databricks implementation for another part of our organization. We used them to recreate our implementation.
What other advice do I have?
I'm probably not the best person to discuss certain aspects of Databricks since I haven't explored it deeply and am not part of the team developing it.
We haven't utilized Databricks' machine learning capabilities.
At my company, data ingestion and transformation are done with Databricks, though I don't do it directly.
I don't use Databricks' features for managing data, such as data lake and warehouse operations.
Most of our current work with Databricks isn't really live yet, so measuring savings in time and money or identifying any return on investment isn't applicable right now.
I would rate Databricks a 7 overall.
Unified Platform Enhances Data Collaboration and Processing
What do you like best about the product?
I like the Databricks Data Intelligence Platform because it's a unified, scalable platform for data engineering, analytics, and collaboration. It simplifies the process of building and running data pipelines, handles performance-intensive Spark workloads, and promotes collaboration across teams with shared notebooks and environments. It's great for managing and processing large datasets, improving performance, and providing a consistent platform for turning raw data into insights. The initial setup was easy, thanks to helpful documentation.
What do you dislike about the product?
It could improve in areas such as cluster startup time, deeper visibility into performance tuning, and clearer documentation for advanced configurations.
What problems is the product solving and how is that benefiting you?
I use the Databricks Data Intelligence Platform for large-scale data processing and analytics. It simplifies building reliable data pipelines, handles Spark workloads efficiently, reduces operational overhead, and enables team collaboration with shared notebooks. It's great for turning raw data into actionable insights.
Outstanding Experience with This Software
What do you like best about the product?
The Databricks Data Intelligence Platform helps us house all of our business and official data and share it with different teams and departments, so they can analyze it, build detailed analytics of past performance, and make the changes needed for future growth.
What do you dislike about the product?
One of the major challenges we face with the Databricks Data Intelligence Platform is that it cannot be run by a single data scientist. You need a team of professionals who can handle large data volumes and build multiple graphs and analytics from the available information, and all of this requires a lot of financial investment.
What problems is the product solving and how is that benefiting you?
This software helps us keep the data of different departments in one place, so access is easier and decisions can be made much more quickly. With this tool, data from departments such as finance, operations, sales, and marketing is screened at one time and thoroughly cross-checked as well.
Excellent ML Features and Data Controls in Databricks
What do you like best about the product?
Databricks offers useful features for machine learning engineers, such as the ability to use compute pools to run workloads efficiently. In addition, it includes the Unity Catalog, which is an excellent tool for managing data access controls.
What do you dislike about the product?
The product would benefit from having more built-in functions that are specifically optimized for GenAI use cases.
What problems is the product solving and how is that benefiting you?
We successfully consolidated data from multiple ERPs onto a single platform, which enabled us to use this unified data for data engineering, reporting, machine learning, and GenAI use cases.
Comprehensive Data Platform with Flexible Onboarding and Robust Governance
What do you like best about the product?
This is an end-to-end platform that begins with flexible onboarding of data from multiple sources, followed by processing through a medallion architecture. The Unity Catalog is used for governance, cataloging, and tracking data lineage. Databricks SQL serves as the endpoint for use cases such as business intelligence, as well as downstream integration through API endpoints.
What do you dislike about the product?
There isn't anything particularly specific, but in certain situations, we do depend on cloud-native services. For instance, when working on Azure and aiming for comprehensive end-to-end governance, we require Azure Purview.
What problems is the product solving and how is that benefiting you?
As a solution architect specializing in Data & AI, Databricks has become my preferred platform for all things related to data engineering, data warehousing, and analytics. In my experience designing solutions, I have found that Databricks offers a comprehensive suite that meets the majority of client needs. This means there is no need to search across multiple vendors for the best services, as most requirements can be addressed within a single platform.