Reimagining Data Workflows & Insights with Genie: NLQ spaces, Agent Mode, and Intelligent Coding
What do you like best about the product?
1) In our implementation, Genie Space is actively used to enable NLQ-based access across multiple data products like Finance, HR, Marketing, Sales, and Supply Chain (inventory, demand planning, and replenishment), reducing dependency on data teams for ad-hoc queries.
2) We designed separate Genie Spaces for each BU/team/data product, ensuring domain-level isolation while still supporting cross-functional querying where required (e.g., Finance + Sales joins).
Each Genie Space is carefully configured with curated data tables, business-level instructions, and semantic context, which significantly improves the accuracy of SQL generation.
3) We provide few-shot examples, guided prompts, and sample business questions tailored to each domain, helping Genie understand real business intent instead of generic query patterns.
4) In Chat Mode, business users directly ask questions in natural language, and Genie translates them into SQL and returns results, which has improved self-service analytics adoption.
5) In Agent Mode, Genie goes beyond SQL generation by creating a logical execution plan, breaking down complex queries into multiple steps before querying the underlying data.
6) We built a dedicated Anomaly Detector Genie Space, where users ask questions about cluster cost, performance issues, and inefficient workloads.
This anomaly-focused Genie analyzes long-running jobs, inefficient queries, and cluster utilization patterns, using historical workload data to identify optimization opportunities.
7) A key implementation is notebook-level analysis, where Genie highlights code issues, shows before vs after optimization, categorizes problems (performance, cost, inefficiency), and explains improvements clearly.
8) Genie also provides quantified recommendations, including expected cost savings (e.g., idle cluster reduction, query tuning impact) and workload-based optimization strategies, making it highly actionable for engineering teams.
9) We extended Genie into Genie Code integrated with Databricks AI Assistant, enabling an agentic development experience directly within our data engineering workflows.
Our team defined custom skills in Markdown (MD files) such as Coder, Tester, Mapper, and Data Generator, which are attached to Genie Code to modularize capabilities.
These skills are used to support end-to-end SDLC activities, including code generation, transformation logic creation, test case design, and synthetic data generation.
10) Genie Code operates by first creating a structured execution plan, outlining all required steps before starting any development activity.
It then breaks the plan into a detailed to-do list, executing each step sequentially (e.g., create notebook → write transformation → validate logic → optimize code).
11) During execution, Genie Code follows a human-in-the-loop model, asking for approvals at every step with options like allow once, always allow, or read-only execution.
The behavior of Genie Code is controlled through project-specific guidelines and instructions, ensuring it aligns with our coding standards, architecture patterns, and governance rules.
12) It acts as a co-developer within the workspace, assisting engineers in writing optimized code, validating logic, and ensuring best practices are followed consistently.
We are leveraging it for proactive development workflows, where Genie not only executes tasks but also suggests improvements and optimization opportunities during development itself.
This approach has enabled a “vibe coding” style of development, where engineers focus on intent while Genie handles structured execution, resulting in faster delivery, reduced manual effort, and improved overall code quality.
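Points 10 and 11 above describe Genie Code's plan → to-do → approve loop. As a rough, dependency-free sketch of that control flow (the `Step` class, `run_plan`, and the decision strings are invented for illustration; this is not a Databricks API):

```python
# Minimal sketch of the plan -> to-do -> human-in-the-loop execution described
# above. All names here are hypothetical, not part of Genie Code itself.
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    action: str       # e.g. "create notebook", "write transformation"
    approved: bool = False

def build_plan(goal: str) -> list[Step]:
    """Break a development goal into an ordered to-do list (hypothetical)."""
    return [
        Step("plan", f"outline steps for: {goal}"),
        Step("code", "create notebook and write transformation"),
        Step("validate", "validate logic against sample data"),
        Step("optimize", "optimize code and summarize changes"),
    ]

def run_plan(steps: list[Step], approver) -> list[str]:
    """Execute steps sequentially, asking the approver before each one
    (the allow once / always allow / read-only model from point 11)."""
    log = []
    always = False
    for step in steps:
        decision = "always" if always else approver(step)
        if decision == "always":
            always = True
        if decision in ("once", "always"):
            step.approved = True
            log.append(f"ran {step.name}: {step.action}")
        else:  # read-only: inspect the step but do not execute it
            log.append(f"skipped {step.name} (read-only)")
    return log

log = run_plan(build_plan("build sales aggregation"), lambda s: "always")
```

The point of the sketch is the ordering: the full plan exists before any step runs, and every step passes through an approval gate.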
What do you dislike about the product?
Context limitations across Genie Spaces; also, the number of tables that can be attached is capped at 30, if I remember correctly.
Agent Mode reasoning depth is good but not fully autonomous
Performance efficiency needs improvement, and latency should be reduced.
What problems is the product solving and how is that benefiting you?
1) Bridging business and data teams through NLQ
Databricks Genie solves the gap between business users and technical teams by enabling natural language access to data, reducing dependency on data engineers for everyday queries.
2) Eliminating data silos across domains
By integrating data from Finance, HR, Sales, and Supply Chain, it helps us analyze cross-domain datasets, improving decision-making for use cases like demand planning and inventory optimization.
3) Accelerating self-service analytics
With Genie Chat Mode converting NLQ to SQL, business users can independently fetch insights, significantly reducing turnaround time for reporting and analysis.
4) Handling complex analytical queries with Agent Mode
Genie Agent Mode solves complex query scenarios by breaking them into structured execution plans, which is especially useful for multi-step analytical and optimization problems.
5) Improving cost and performance visibility
Through our Anomaly Detector Genie Space, Databricks helps identify cluster inefficiencies, long-running jobs, and costly queries, giving clear visibility into platform usage.
6) Driving workload optimization and cost savings
The platform provides actionable recommendations like query tuning, cluster right-sizing, and idle resource reduction, helping us optimize cost based on actual workload patterns.
7) Enhancing code quality through notebook analysis
Genie analyzes notebook code and highlights performance issues with before/after comparisons, enabling developers to improve efficiency and follow best practices.
8) Supporting proactive development with Genie Code
Databricks enables an agentic development workflow, where Genie Code assists in planning, coding, testing, and executing tasks step-by-step, reducing manual effort.
9) Standardizing development using skill-based automation
By attaching custom skills (Coder, Tester, Mapper, Data Generator), we ensure consistent development practices and faster onboarding for new use cases.
10) Increasing overall productivity and faster delivery
Combining Genie Space and Genie Code, Databricks significantly improves developer productivity, reduces iteration cycles, and accelerates delivery of data solutions, while maintaining governance and control.
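The multi-step decomposition in point 4 can be sketched with plain SQL: instead of one monolithic query, an agent-style plan first materializes an intermediate result and then answers the question from it. Here `sqlite3` stands in for the warehouse, and the table and plan steps are invented for illustration; this is not Genie's actual internals.

```python
# Illustrative two-step "execution plan" for a natural-language question like
# "which region sold the most?". The schema and data are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('EMEA', 120.0), ('EMEA', 80.0), ('APAC', 50.0);
""")

# Step 1 of the plan: aggregate per region into an intermediate result.
conn.execute("""
    CREATE TEMP TABLE region_totals AS
    SELECT region, SUM(amount) AS total FROM sales GROUP BY region
""")

# Step 2: answer the original question from the intermediate table
# rather than in one monolithic query.
top = conn.execute(
    "SELECT region, total FROM region_totals ORDER BY total DESC LIMIT 1"
).fetchone()
```

Breaking the work into named intermediate steps is what makes the plan inspectable before it touches the underlying data.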
Essential Data Processing with Seamless Collaboration
What do you like best about the product?
I like how Databricks allows not just engineers but also data managers, analysts, data scientists, and everyone else to work in a simplified, collaborative manner. That's something Databricks does well, and it sets the platform apart from competitors trying to offer similar capabilities. Many people have already adopted it, and it has become the de facto choice.
What do you dislike about the product?
I think lineage and the addition of business assets, as well as how the data translates to the business layer of a bank or any other vendor, is where Databricks can improve. I don't see different departments being connected in Databricks through a shared glossary or the business terms they use internally.
What problems is the product solving and how is that benefiting you?
I use Databricks to manage vast datasets from multiple sources; it helps organize infrastructure and access management, and it aids in some visualization tasks.
Revolutionized HR Analytics with Genie, Minor Cost Concerns
What do you like best about the product?
I really like the Genie feature on Databricks, it's great and unifies well with the ecosystem. Combining the lakehouse with Genie is simple and has transformed our HR analytics. We can ask questions in plain English about attrition and get instant, accurate responses. This effectively removes the engineering bottleneck almost completely, allowing HR to access insights directly from Genie without waiting weeks for custom dashboards. It saves the engineering team loads of hours and accelerates decision-making. Plus, setting up Databricks is seamless, as we could set up the account and start running lakehouses in minutes.
What do you dislike about the product?
Setting up Genie requires meticulous planning and data curation to get excellent responses. If the semantic model isn't perfect, it can stumble. Cost management is tricky when multiple teams run open-ended queries all day. The metric views and serverless cost features make it better, but there's room for improvement.
What problems is the product solving and how is that benefiting you?
Databricks solves ingestion, transformation, governance, and data quality challenges, offering AI and BI tools for instant insights. With Genie, HR bypasses engineering bottlenecks, saving hours and accelerating decisions. It's simple to unify lakehouse with Genie for quick, accurate responses.
An all-in-one platform
What do you like best about the product?
It's an all-in-one platform for data engineers, analysts, data scientists, and business users.
What do you dislike about the product?
It's easy to overspend, and there is vendor lock-in.
What problems is the product solving and how is that benefiting you?
Data engineering, model training and inference, GenAI.
Databricks solves the problem of having fragmented tools across the data and AI lifecycle. Traditionally, teams would need separate platforms for data engineering, analytics, machine learning, and AI — leading to silos, duplicated work, and governance challenges.
With Databricks, data engineering pipelines, model training and inference, and GenAI development all live in one unified environment. This means data engineers can build and orchestrate pipelines, data scientists can train and deploy models, and teams can develop and serve GenAI applications — without constantly moving data or context-switching between tools.
Powerful Warehousing, Collaborative, AI Debugging
What do you like best about the product?
As a growing data engineer, the community support and clear documentation of Databricks really help guide me through problems. I've been managing jobs and pipelines where failures are bound to happen, and debugging with the "Diagnose this error with AI" feature has helped me meet failure-recovery SLAs faster. The UI is neat and makes it very easy to move between notebooks, SQL, and PySpark without much friction. Since I work with a team, collaboration is a must, and sharing notebooks and iterating with teammates feels easy. I also really like that I can rely on ABAC policies to set up data quality and governance.
What do you dislike about the product?
I am not a hundred percent sure I would use the term dislike; I think it's just a personal preference. I sometimes feel that more compute is used than a simple query should need. The shuffle read/write that gets involved when you're using Delta tables sometimes slows down the job.
What problems is the product solving and how is that benefiting you?
Databricks is helping our clients to manage the lakehouse and warehouse architecture in a much more structured way. We use it as the landing layer from S3 and then process data through our medallion architecture (bronze, silver, and gold) before delivering it to the final products. It’s been very effective for orchestrating daily jobs and pipelines. I also really like the asset bundles and how easily everything integrates with Git, which makes version control and deployments much smoother for the team. I am more likely to use Databricks as my go to platform for data lakehouse and warehousing.
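The bronze → silver → gold flow described above can be sketched without Spark. A real pipeline would use Spark DataFrames and Delta tables; the record shapes and cleaning rules below are invented for illustration only.

```python
# Toy, dependency-free sketch of a medallion pipeline: raw landing data
# (bronze) is cleaned into silver, then aggregated into a gold product.

def to_silver(bronze_rows):
    """Bronze -> silver: drop malformed rows and normalise types."""
    silver = []
    for row in bronze_rows:
        if row.get("order_id") is None or row.get("amount") is None:
            continue  # quarantine malformed landing records
        silver.append({
            "order_id": row["order_id"],
            "region": (row.get("region") or "UNKNOWN").upper(),
            "amount": float(row["amount"]),
        })
    return silver

def to_gold(silver_rows):
    """Silver -> gold: business-level aggregate served to final products."""
    totals = {}
    for row in silver_rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

bronze = [  # raw landing-layer records, e.g. as ingested from S3
    {"order_id": 1, "region": "emea", "amount": "120.5"},
    {"order_id": None, "region": "emea", "amount": "10"},  # malformed
    {"order_id": 2, "region": None, "amount": 30},
]
gold = to_gold(to_silver(bronze))
```

Each layer only reads from the one before it, which is what makes the daily orchestration of these jobs straightforward.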
Fast, Scalable Spark Processing with a Powerful Unified Analytics Workspace
What do you like best about the product?
fast distributed processing with Spark, collaborative notebooks for teams, strong integration with cloud data platforms, scalable data pipelines, unified workspace for data engineering and analytics, handles large datasets efficiently
What do you dislike about the product?
cluster startup time can be slow, costs can increase quickly with heavy workloads, UI can feel complex for new users, debugging distributed jobs is not always straightforward, notebook version control can be tricky
What problems is the product solving and how is that benefiting you?
large-scale data processing, building and managing data pipelines, unified environment for engineering and analytics, faster data transformations, improved scalability for big data workloads
Powerful Layered Data Structure with Flexible, Enterprise-Wide Joins
What do you like best about the product?
Its inherent layers for structuring the data, and the way user-driven data joins can be made available across the enterprise through the Gold layer.
What do you dislike about the product?
Data ingestion into the Bronze layer from various sources, and the laborious Silver-layer definitions.
What problems is the product solving and how is that benefiting you?
Defining feature sets across 100+ data sources and making them available for consumption by destination systems as needed.
Unified data workflows have cut ticket processing times and are driving faster business insights
What is our primary use case?
My main use case for Databricks involves the pipelines and ETL processes that we are implementing. Following the Medallion architecture with Gold, Silver, and Bronze layers, we filter the data, perform transformations, and integrate AI. Databricks has made this process significantly easier.
I worked for an airline company that experienced substantial delays in data processing. When a passenger booked a ticket, it took 20 to 25 minutes for the transaction to reflect in the system. Using Databricks, we initially compressed that time to between 10 and 6 minutes and eventually reduced it to just a few seconds. After setting up all the pipelines and leveraging Databricks features to enhance and accelerate the process, the project became truly impactful, resulting in reduced processing time and ultimately increased profit for the airline.
What is most valuable?
The best features Databricks offers are Unity Catalog, Databricks Workflow, Databricks AI, Agentic AI, and the automated pipelines that utilize AI. The AI models are very easy to create and deploy in just a few seconds. These are helpful and user-friendly tools.
I find myself using Unity Catalog most frequently because it provides a unified governance solution for all data and AI needs on Databricks, offering centralized access control, auditing, lineage, and data discovery capabilities across the platform. The main features include access control, security compliance standard models, built-in auditing, and lineage tracking. Most of my projects have involved integrating Unity Catalog into systems and providing overall security, including a migration project to transition to Unity Catalog.
The platform's unified data intelligence capabilities allow teams to analyze, manage, and activate data at scale, leading to faster time to insights, cleaner data pipelines, and significant savings on infrastructure and engineering efforts. Databricks eliminates data silos, accelerates the time to insight, empowers all data personnel, and provides built-in governance and security. It also supports AI and ML, which is an added advantage in today's AI-driven field.
What needs improvement?
Databricks already provides monthly updates and continuously works on delivering new features while enhancing existing ones. However, the platform could become easier to use. While instruction-led workshops are available, offering more free instructional workshops would allow a wider audience to access and learn about Databricks. Additionally, providing use cases would help beginners gain more knowledge and hands-on experience.
Regarding my experience, I was initially unfamiliar with the platform and had to conduct research and learn through various videos. I did find some instruction-led classes, but several of those required payment. The platform should provide more free resources to enable a broader audience to access and learn about Databricks. The platform itself is user-friendly and easy to use without complex issues, so I believe it does not need improvement in its core functionality. Rather, supporting aspects can be enhanced.
For how long have I used the solution?
I have been working as a data engineer for four years. Initially, I was a software engineer, but my career has progressed as a data engineer over this four-year period.
What was our ROI?
Definitely. As I mentioned regarding my airline project, it was impactful because the cost was reduced by 60 to 70 percent. The company was initially using Azure Blob storage, and in Databricks, the cluster and associated infrastructure were cheaper than other platforms. This reduction in both time and money resulted in real-time impact and significant cost savings.
What other advice do I have?
For advice for others considering Databricks, it is important to start by understanding its place in the data ecosystem and how it fits into your specific needs. Key points to consider include familiarizing yourself with Databricks, learning the basics, starting with data engineering, and incorporating ETL processes. You can then dive deeper into Databricks features such as notebooks, clusters, and jobs. Achieving certification enhances your skills validation. For best practices, it is critical to optimize performance and minimize complexity while continuously learning to stay competitive in the data field. Following these steps will be very beneficial for anyone pursuing a career as a data engineer and Databricks engineer.
Databricks is a truly essential platform for data engineering needs, and I recommend it to anyone looking to advance in the data engineering field. It is a very important platform and tool for every data engineer. I encourage everyone to learn and explore this product and to maximize its potential. I rate this product a 9 out of 10.
AI Integration with the Data Lakehouse Made Databricks a Clear Choice
What do you like best about the product?
The integration of AI into the data lakehouse is the key thing that encouraged us to use Databricks.
What do you dislike about the product?
Databricks is more complex than plain Spark, so it takes more effort to fine-tune it for a given business use case.
What problems is the product solving and how is that benefiting you?
As the world's largest wealth manager, BlackRock processes huge volumes of data (terabytes daily) via Spark jobs, and deriving meaningful, flexible analytics from that data used to require significant effort and expertise; with Databricks AI models it became easy.
Report 1100
What do you like best about the product?
I started with Databricks six years ago and have received more than 10 certifications. I really like Databricks' data analytics and fast calculation features, as well as its integration with external tools like Power BI for reporting.
What do you dislike about the product?
All the features are fine, but more AI-powered capabilities are needed to enhance the current ones.
What problems is the product solving and how is that benefiting you?
Analytics, fast calculations and reporting.