AWS for Industries

Unlock new capabilities from product images using generative AI

Retail and consumer goods businesses are adopting generative AI to enhance customer experiences, improve operational efficiency, and create new revenue streams. Recent advancements in multimodal and image-generation large language models (LLMs) are also expanding visual data uses. For instance, Amazon’s generative AI tools help sellers create product descriptions and video ads, streamlining their operations and enhancing the selling experience.

In this blog post, we’ll explore three transformative generative AI use cases. Each use case highlights how generative AI can unlock new potential from your product images and visual assets. We’ll also discuss key benefits for retail and consumer goods companies and provide architectural guidance for implementing these solutions on AWS.

Image-based generative AI use cases

Image to text
Generative AI models with computer vision capabilities can transform product content, significantly improving customer experiences. By using multimodal LLMs such as Anthropic’s Claude 3 hosted on Amazon Bedrock, businesses can seamlessly automate the creation of rich product descriptions from their visual assets.

Multimodal LLMs can recognize and identify crucial elements within product images. They extract relevant metadata and convert this information into compelling, human-readable text. The resulting content enhances product listings by improving search engine optimization (SEO) for better discoverability, filling gaps in product information and creating more comprehensive, accurate details. These improvements can lead to increased conversion rates and higher customer satisfaction.

Brands can also streamline catalog management by automatically inferring product dimensions, materials, and styles. This automation creates more complete, enriched product data, improving operational efficiency. The LLMs can identify specific objects, scenes, and attributes within images, streamlining content-moderation workflows and ensuring regulatory compliance. They can also improve accessibility, with automated, detailed image captioning for users who are blind or with low vision.

Example architecture

Image to text example architecture

Image-based search

Image-based search employs computer vision to provide a more intuitive and effective search experience. By using multimodal embedding models such as Amazon Titan Multimodal Embeddings on Amazon Bedrock, and vector databases such as Vector Engine for Amazon OpenSearch Serverless, businesses can implement natural-language semantic search capabilities that understand both text and visual data. This approach enables a more intuitive and engaging shopping experience—it seeks to understand the customer’s intent through natural language and visual cues, rather than forcing them to conform to rigid search parameters.

In retail and consumer goods applications, image-based search helps customers find products using natural language queries. Customers can also upload reference images. A customer could search for “red dress with floral patterns” or upload an image of a similar dress. The system then retrieves visually and semantically similar products, improving search relevance and potentially increasing conversion rates. An embedding LLM processes product images, then maps text and visual inputs to relevant product embeddings. The embedding model does the heavy lifting of interpreting and matching complex search inputs, reducing the need for extensive keyword management or SEO efforts.

Image-based search significantly enhances product discoverability and search result relevance. It improves customer engagement and can lead to higher conversion rates and increased sales. Moreover, deeply understanding customer intent helps retailers offer personalized, context-aware product recommendations, further enhancing the shopping experience and driving business growth.

Example architecture

Image-generation-text-to-image-and-image-to-image

Image generation (text to image and image to image)

Image generation models such as Stable Diffusion Ultra from Stability AI and Amazon Titan Image Generator V2—both hosted on Amazon Bedrock—are opening new possibilities in product ideation and personalized experiences. This approach speeds up ideation and enables simultaneous evaluation of multiple design directions.

A common use case is product ideation through visual exploration. Designers can start with a basic sketch or concept and use an image generation model to produce a range of developed product ideas and variations.

Retailers are also using image generation to create personalized product experiences by rendering products in user-specified scenes or environments. For example, a customer can upload a reference image of their living room. The model then generates an image of a product seamlessly integrated into that space. This contextualized visualization supports purchasing decisions and boosts customer engagement.

Generative AI–powered image generation offers substantial business benefits. It accelerates product ideation and design while enabling highly personalized customer experiences that inform purchasing decisions. However, when implementing these capabilities, businesses must prioritize authenticity, transparency, and responsible use.

AWS supports these efforts by including invisible watermarking on images generated with Amazon Titan Image Generator models. This helps people maintain trust in product representations. And content filtering capabilities in foundational models help safeguard brand reputation by preventing the creation of misleading or harmful product imagery. To maximize the innovative potential of generative AI while preserving product integrity and strengthening customer relationships, brands should establish clear policies on AI use in visual content creation. This includes being transparent with customers about when and how AI is employed. By following these ethical guidelines and using generative AI capabilities from AWS, businesses can explore new creative applications and unlock revenue potential to advance their operations.

Example architecture

Image generation (text to image and image to image) example architecture

LLMs boost productivity for retailers

Generative AI isn’t about replacing workers. It’s about empowering your team to achieve more. By deploying these technologies, retailers can significantly increase both the quality and quantity of their outputs across various operations. Leading brands are already transforming their businesses with AWS generative AI solutions:

  1. Discover how The Very Group enhanced its customer experience with generative AI.
  2. Learn how Zalando and AWS Gen AI Innovation Center extracted product attributes from unstructured data with Amazon Bedrock.

Ready to transform your retail operations with generative AI? Take the next step:

  1. Explore our Generative AI for Retail and Consumer Goods page to learn how AWS can help drive efficiency, enhance customer engagement, and accelerate innovation in your business.
  2. Schedule a personalized consultation with an AWS retail specialist to discuss your specific use cases.
  3. Attend the RCG206: How Nykaa automates product descriptions using generative AI at AWS re:Invent to learn how Nykaa, a large retailer in India, is creating product descriptions using generative AI
  4. Join us at NRF 2025: Retail’s Big Show (Jan. 12th – 14th) for live demonstrations of these capabilities.
TAGS:
Matt Barbieri

Matt Barbieri

Matt Barbieri is a Senior Solutions Architect at AWS and is based in New York City. With nearly a decade of experience as a former AWS customer, Matt guides retail and consumer goods enterprise businesses through cloud adoptions and digital transformations. He specializes in using generative AI and other technologies to solve business challenges. Matt designs secure, compliant, and efficient AWS solutions, translating complex technical concepts into practical strategies. His work helps retail and consumer goods organizations innovate faster and compete more effectively in rapidly changing markets.