AWS Partner Network (APN) Blog
Zero to generative AI with Databricks and AWS
By Daniel Wirjo, Solutions Architect – AWS
By Sean Chang, Solutions Architect – AWS
By Venkatavaradhan Viswanathan, Sr. Partner Solutions Architect – AWS
By Josh Faure, Senior Solutions Architect – Databricks
Foundation models offer a major breakthrough in the ability to rapidly prototype new AI applications. However, for many businesses building generative AI applications, a key challenge is refining these applications to production quality. In this post, we cover how you can leverage tools from Databricks and Amazon Web Services (AWS) to address this challenge.
Creating production-quality generative AI applications
To achieve high-quality generative AI applications, businesses need tools for understanding the quality of their data and model outputs, along with a unified platform that lets them combine and optimize all aspects of the process. This involves many components, including data preparation, retrieval, model selection, ranking and post-processing pipelines, prompt engineering, and training models on custom data.
Databricks is an AWS Competency Partner holding multiple AWS Competency designations, including the AWS Machine Learning ISV Competency, and is available in AWS Marketplace. Databricks allows you to handle all your data, analytics, and artificial intelligence (AI) on one simple platform. For generative AI, Databricks provides a suite of tools designed to address common challenges in creating a high-quality production generative AI application. To achieve this, Databricks leverages best-in-class AWS infrastructure, with seamless integration with AWS services, including:
- Amazon Bedrock for access to a broad choice of foundation models from leading providers, such as Anthropic's Claude, while maintaining full control over your data
- Amazon S3 as the storage layer to deliver near-infinite scalability, 99.999999999% (11 nines) data durability, and 99.99% availability by default
- Amazon EC2 for high-performance compute to power collaborative notebooks for prototyping and testing, and real-time data pipelines
- AWS Trainium accelerators for training machine learning models with the best price-performance
Let’s take a look at an overview of the generative AI capabilities on Databricks and AWS, and how they can address three major challenges with generative AI applications.
Figure 1 – An overview of generative AI capabilities with Databricks and AWS
Challenge 1: Choosing the Right Foundation Model
One key challenge for businesses is choosing the right foundation model that achieves the desired balance between price, performance, and latency for their use case. For example, smaller models typically produce lower-quality outputs, but they respond with lower latency and cost less to run. In practice, achieving high quality often means mixing and matching base models according to the specific requirements of each application. Being able to compare capabilities across models and easily change models when required is crucial for generative AI applications.
With Databricks’ Mosaic AI Gateway, developers can access a wide variety of models, including models hosted by Databricks, such as Meta’s Llama, the broad selection of models available via Amazon Bedrock, and custom OSS models like models from Hugging Face, without the need to reconfigure their systems. This flexibility allows teams to use a single API to experiment with different models, easily route between multiple models, and seamlessly switch between models when required.
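As a concrete illustration, the following minimal sketch queries a Databricks Model Serving endpoint through its OpenAI-compatible interface. The workspace URL, access token, and endpoint name are placeholders for values from your own workspace; switching to a different model (including one routed through Amazon Bedrock) is typically just a change to the endpoint name.

```python
# Minimal sketch: calling a model behind Mosaic AI Gateway / Model Serving
# via the OpenAI-compatible API. All identifiers below are placeholders.
from openai import OpenAI

client = OpenAI(
    api_key="<DATABRICKS_TOKEN>",  # Databricks personal access token (placeholder)
    base_url="https://<workspace>.cloud.databricks.com/serving-endpoints",
)

# The call shape stays the same regardless of which model backs the endpoint,
# so swapping models is a one-line change.
response = client.chat.completions.create(
    model="databricks-meta-llama-3-3-70b-instruct",  # hypothetical endpoint name
    messages=[{"role": "user", "content": "Summarize this support ticket in one sentence: ..."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```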
To help businesses select the most suitable models for their needs, both Databricks and AWS provide tools for model evaluation throughout the development lifecycle. Amazon Bedrock supports model evaluation jobs that assess output based on accuracy, robustness, and toxicity, and evaluations can be run automatically (leveraging large language models as judges) or with human oversight for more precise assessments. On Databricks, MLflow's LLM evaluation capabilities let teams score candidate models against curated datasets using similar LLM-judged metrics. Together, these tools streamline model selection and refinement, aligning outputs with business objectives and ensuring production readiness.
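As a sketch of what this looks like on Databricks, the snippet below evaluates a serving endpoint against a small labeled dataset using MLflow's LLM-judged answer-correctness metric. The endpoint name and evaluation data are placeholders, and the metric assumes a judge model is accessible from your workspace.

```python
# Sketch: LLM-as-judge evaluation with MLflow. Endpoint and data are placeholders.
import mlflow
import pandas as pd

eval_df = pd.DataFrame({
    "inputs": ["What is our refund policy?"],
    "ground_truth": ["Refunds are available within 30 days of purchase."],
})

results = mlflow.evaluate(
    model="endpoints:/databricks-meta-llama-3-3-70b-instruct",  # endpoint under test
    data=eval_df,
    targets="ground_truth",
    model_type="question-answering",
    extra_metrics=[mlflow.metrics.genai.answer_correctness()],  # scored by an LLM judge
)
print(results.metrics)  # aggregate scores across the evaluation set
```

Running the same evaluation against two different endpoints gives a side-by-side quality comparison to pair with cost and latency measurements.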
Challenge 2: Improving Model Performance with your Business Context
While Large Language Models (LLMs) provide powerful capabilities, businesses often struggle to adapt them to their specific needs. One cost-effective solution is Retrieval-Augmented Generation (RAG), where the model references a relevant knowledge base containing proprietary business context before generating responses. This approach integrates external data with the model’s generative capabilities, ensuring outputs are aligned with business-specific knowledge. However, implementing RAG can be complex, requiring the management of a knowledge base, vector search mechanisms to retrieve relevant data, and augmented prompt design to enhance model responses.
Databricks’ Mosaic AI Vector Search simplifies this process by automatically indexing business data into a searchable knowledge base. Through a REST API, developers can efficiently query this index, retrieving the most relevant context to augment model prompts and improve response quality. This streamlines the traditionally complex RAG workflow, making it accessible to organizations looking to enhance LLM performance without excessive overhead.
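The retrieval step might look like the following sketch, which uses the databricks-vectorsearch Python client. The endpoint name, index name, and column names are placeholders for resources you would create in your own workspace.

```python
# Sketch: retrieve relevant context from Mosaic AI Vector Search, then
# augment the prompt. All names below are placeholders.
from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()
index = vsc.get_index(
    endpoint_name="support-docs-endpoint",  # hypothetical vector search endpoint
    index_name="main.support.docs_index",   # hypothetical index over document chunks
)

question = "How do I rotate my API keys?"
hits = index.similarity_search(
    query_text=question,
    columns=["chunk_text"],
    num_results=3,
)

# Concatenate the retrieved chunks and place them ahead of the question.
context = "\n".join(row[0] for row in hits["result"]["data_array"])
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```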
When RAG alone does not meet quality expectations, businesses may turn to fine-tuning or continued pre-training of LLMs for better alignment with their use case. With Mosaic AI Model Training, organizations can fine-tune or continue pre-training open-source models on proprietary datasets, tailoring them for enhanced performance. Databricks simplifies this customization by handling GPU-based training infrastructure, using a proprietary training stack to accelerate the process. At the infrastructure level, AWS Trainium, a machine learning chip purpose-built for cost-effective training, plays a key role. Trainium leverages the AWS Nitro System—a suite of hardware and software components designed to optimize performance, security, and scalability for cloud-based workloads. This combination allows businesses to train and deploy custom models efficiently while maintaining high standards for privacy and reliability.
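For teams that reach the fine-tuning stage, a run can be launched programmatically. The sketch below follows the pattern of the Mosaic AI Model Training API; the base model, data path, catalog locations, and parameter values are illustrative assumptions, so consult the Databricks documentation for the exact options available in your workspace.

```python
# Illustrative sketch of launching a fine-tuning run. All names and
# parameter values are placeholders / assumptions.
from databricks.model_training import foundation_model as fm

run = fm.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # open-source base model
    train_data_path="main.finetuning.train_data",   # UC table or JSONL of examples
    register_to="main.finetuning",                  # Unity Catalog destination
    task_type="CHAT_COMPLETION",
    training_duration="3ep",                        # e.g. three epochs
)
print(run.name)
```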
Together, RAG workflows and custom model training empower enterprises to align AI outputs with their specific needs, balancing cost-effectiveness with enhanced performance.
Challenge 3: Operationalizing and Ensuring Model Quality
When building generative AI applications, businesses need to ensure that the application's outputs align with their own ethical and operational guidelines. Given the non-deterministic outputs of LLMs, businesses are looking to set safeguards around the outputs to meet business and compliance requirements. At the model provider level, Amazon Bedrock Guardrails allow teams to block undesired topics, redact Personally Identifiable Information (PII) from inputs or outputs, and filter harmful content. These safeguards allow businesses to adopt AI responsibly, and reduce the risk and liability of undesirable outputs.
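For example, the ApplyGuardrail API can screen text independently of which model produced it. In the minimal sketch below, the guardrail identifier and version are placeholders for a guardrail configured in your AWS account.

```python
# Sketch: screening model output with Amazon Bedrock Guardrails.
# The guardrail identifier and version are placeholders.
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")
model_output = "Draft response generated by the model..."

result = bedrock_runtime.apply_guardrail(
    guardrailIdentifier="<guardrail-id>",  # placeholder
    guardrailVersion="1",
    source="OUTPUT",  # use "INPUT" to screen user prompts instead
    content=[{"text": {"text": model_output}}],
)

if result["action"] == "GUARDRAIL_INTERVENED":
    # Return the guardrail's sanitized or blocked message instead.
    model_output = result["outputs"][0]["text"]
```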
Operationalizing AI applications also involves ensuring consistent model quality in real-world scenarios, where there may not be a single correct answer or obvious error conditions. With Databricks’ Mosaic AI Gateway, organizations can leverage continuous monitoring to track performance metrics and rapidly debug issues in production. The Inference Tables feature logs all incoming requests and outgoing responses at a Mosaic AI Model Serving endpoint, giving engineering teams the visibility they need to troubleshoot issues efficiently.
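Because inference tables are ordinary Delta tables, logged traffic can be inspected directly from a notebook. The table name in this sketch is a placeholder; check the schema generated for your endpoint.

```python
# Sketch: inspecting recent failed requests in an inference table.
# Assumes a Databricks notebook where `spark` is available; the table
# name is a placeholder.
from pyspark.sql import functions as F

logs = spark.table("main.serving.chat_endpoint_payload")

(logs
    .select("timestamp_ms", "status_code", "execution_time_ms",
            "request", "response")
    .where(F.col("status_code") != 200)     # surface errors first
    .orderBy(F.col("timestamp_ms").desc())  # most recent at the top
    .limit(20)
    .show(truncate=False))
```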
With the logs in Inference Tables, businesses can leverage LLM judges to effectively evaluate the quality of outputs according to metrics like correctness, groundedness, and safety. Together with Databricks Lakehouse Monitoring, businesses can automatically generate data and model quality dashboards. This enables stakeholders to monitor trends, detect anomalies, and set alarms that alert teams to drops in model quality. With these tools, businesses can maintain reliability and trust in their AI systems, ensuring production-grade applications consistently meet expectations.
Innovation from flexibility and seamless integration
Many customers have successfully driven AI innovation to production leveraging capabilities from Databricks and AWS. One customer example is hipages, an online marketplace and Software-as-a-Service (SaaS) platform that connects tradespeople with residential and commercial customers across Australia and New Zealand.
“Databricks and AWS have significantly accelerated our ability to quickly prototype and develop AI solutions using foundational models, making it ideal for rapid proof-of-concepts (POCs) at hipages. Once we’ve validated a solution, we can seamlessly deploy it with Databricks Model Serving, and integrate Langchain for Retrieval-Augmented Generation or AI agent applications. This flexibility allows us to not only innovate quickly but also deploy scalable, production-grade AI models with ease,” says Shu Ming Peh, Lead Machine Learning Engineer at hipages.
Conclusion
In summary, Databricks on AWS offers a comprehensive platform for building production-grade generative AI applications. While foundation models provide immense potential, achieving trust and high-quality outcomes requires tools that integrate, monitor, and optimize every part of the AI architecture. By leveraging the combined power of Databricks and AWS, organizations can unlock the full potential of their AI initiatives, ensuring scalability, performance, and business alignment every step of the way.
Resources
- If you wish to learn more, try our hands-on workshop or contact your representative from Databricks or AWS.
- If you have not yet tried Databricks on AWS, you can begin a free 14-day trial in AWS Marketplace.
- Learn how Databricks is strengthening its partnership with AWS to deliver advanced generative AI capabilities.
Databricks – AWS Partner Spotlight
Databricks is an AWS Data and Analytics Competency Partner that allows customers to manage all of their data, analytics, and artificial intelligence (AI) on one platform.
Contact Databricks | Partner Overview | AWS Marketplace | Case Studies