AWS Spatial Computing Blog

Text-to-Hologram: Unleashing Creativity with AWS Gen AI and Proto Hologram

In today’s digital landscape, where innovation is the driving force behind technological advancements, we find ourselves on the cusp of a paradigm shift. Imagine a world where creating captivating holographic content is as effortless as typing a thought. Thanks to our groundbreaking collaboration with the visionaries at Proto Hologram, this once-futuristic dream has become a reality.

Making Digital Real Again

Proto Hologram, a pioneer in the field of holographic display technology, partnered with the AWS Prototyping and Cloud Engineering (PACE) team to transform the way immersive content is created and consumed. Through this collaboration the teams have developed a cutting-edge text-to-hologram pipeline that empowers users to generate dynamic, immersive content for Proto’s state-of-the-art holographic displays with just a few lines of text.

This revolutionary technology is poised to redefine creative expression, communication, and visualization across a wide range of industries. Imagine artists effortlessly sculpting holographic masterpieces with the mere stroke of a keyboard, designers bringing their concepts to life in breathtaking three-dimensional form, and entrepreneurs captivating audiences with immersive presentations that transcend the boundaries of traditional mediums.

Solution Overview

This text-to-hologram solution is built on a robust and scalable serverless architecture powered by Amazon Web Services (AWS). At the core of this solution lies an AWS Step Functions state machine, orchestrating a seamless workflow that harnesses the power of cutting-edge generative AI technologies.

The creation pipeline begins with Amazon Cognito, a secure user authentication and authorization service, ensuring that only authorized users can access and interact with the text-to-hologram pipeline. User input is then passed through a REST API hosted by Amazon API Gateway, a fully managed service that acts as the entry point for our application. Once the user’s text input is received, this solution uses AWS Lambda to trigger an AWS Step Functions state machine.
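As a rough illustration of that hand-off, the sketch below shows how a Lambda function behind API Gateway might validate the prompt and start a state machine execution. The handler shape, the `STATE_MACHINE_ARN` environment variable, and the `job_id` field are assumptions made for this example, not details of the actual solution; `start_execution` is the standard boto3 Step Functions call.

```python
import json
import os
import uuid


def build_execution_input(event: dict) -> dict:
    """Pull the text prompt out of an API Gateway proxy event body."""
    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt") if isinstance(body, dict) else None
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("a non-empty 'prompt' string is required")
    # job_id is a hypothetical field used here to name the execution.
    return {"job_id": str(uuid.uuid4()), "prompt": prompt.strip()}


def handler(event, context, sfn_client=None, state_machine_arn=None):
    """Start one state machine execution per request and return its ARN."""
    try:
        payload = build_execution_input(event)
    except ValueError as err:  # also covers malformed JSON bodies
        return {"statusCode": 400, "body": json.dumps({"error": str(err)})}

    if sfn_client is None:
        import boto3  # bundled with the AWS Lambda Python runtime
        sfn_client = boto3.client("stepfunctions")
    if state_machine_arn is None:
        # Hypothetical environment variable name.
        state_machine_arn = os.environ["STATE_MACHINE_ARN"]

    result = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        name=payload["job_id"],
        input=json.dumps(payload),
    )
    return {
        "statusCode": 202,
        "body": json.dumps(
            {"job_id": payload["job_id"], "execution_arn": result["executionArn"]}
        ),
    }
```

Accepting the client as an optional argument keeps the handler easy to exercise locally with a stub, while Lambda itself only ever passes `event` and `context`.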

The state machine uses AWS Lambda states to synchronously orchestrate the various steps required to generate holographic visuals from text. The state machine begins by leveraging the Amazon Titan Image Generator model, a state-of-the-art AI model trained on a vast corpus of images and available through Amazon Bedrock, to create stunning visual representations based on the user’s textual input.

a dozen brown leather bags dropping to the ground against a white studio backdrop

Figure 1: Image generated by Amazon Titan Image Generator
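The image generation step can be sketched roughly as follows. This is a minimal example, assuming the Titan Image Generator G1 model ID and its text-to-image request format on Amazon Bedrock; the function names, image size, and `cfgScale` value are illustrative choices rather than the settings used in the actual pipeline.

```python
import base64
import json

# Model ID for Titan Image Generator G1 on Amazon Bedrock (assumed version).
TITAN_MODEL_ID = "amazon.titan-image-generator-v1"


def build_titan_request(prompt: str, width: int = 1024, height: int = 1024,
                        seed: int = 0) -> dict:
    """Build a text-to-image request body for the Titan Image Generator."""
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "width": width,
            "height": height,
            "cfgScale": 8.0,  # illustrative value
            "seed": seed,
        },
    }


def generate_image(bedrock_runtime, prompt: str) -> bytes:
    """Invoke the model and return the first generated image as raw bytes.

    bedrock_runtime is expected to be boto3.client("bedrock-runtime").
    """
    response = bedrock_runtime.invoke_model(
        modelId=TITAN_MODEL_ID,
        body=json.dumps(build_titan_request(prompt)),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    # Images come back base64-encoded in the "images" list.
    return base64.b64decode(payload["images"][0])
```

The returned bytes could then be written to Amazon S3 for the later stages of the workflow.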

Next, the generated image is sent to Stability AI’s Stable Video Diffusion model using the Stability AI Developer Platform API. This model takes the static image and transforms it into a captivating, dynamic video, bringing the user’s vision to life with mesmerizing motion and detail.

Figure 2: Video generated by Stable Video Diffusion
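Video generation of this kind typically runs asynchronously, so the calling code has to wait for the result. The sketch below assumes an API that returns HTTP 202 while the video is still rendering and HTTP 200 with the video bytes once it is ready; that status-code convention, the function name, and the timing defaults are assumptions for illustration and should be checked against the Stability AI documentation. The HTTP request itself is passed in as `fetch_result` so the example stays independent of any particular HTTP library.

```python
import time
from typing import Callable, Tuple


def wait_for_video(
    fetch_result: Callable[[], Tuple[int, bytes]],
    interval_s: float = 10.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Poll until the video is ready and return its bytes.

    fetch_result performs one request against the result endpoint and
    returns (http_status, response_body).
    """
    for _ in range(max_attempts):
        status, body = fetch_result()
        if status == 200:  # assumed: generation finished, body is the video
            return body
        if status != 202:  # assumed: 202 means "still in progress"
            raise RuntimeError(f"video generation failed with status {status}")
        sleep(interval_s)
    raise TimeoutError(f"video not ready after {max_attempts} attempts")
```

Bounding the number of attempts keeps a Lambda invocation from running until its own timeout if the upstream service stalls.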

Because this solution was built with AWS serverless services, data handled by this solution remains encrypted, whether at rest or in transit. By default, Amazon S3 encrypts data at rest with Amazon S3 managed keys, and Amazon DynamoDB encrypts data at rest with AWS owned keys. Optionally, if your organization has specific requirements around data encryption, you can supply AWS Key Management Service (AWS KMS) customer managed keys. AWS Step Functions uses Transport Layer Security (TLS) for communication between integrated services. As a result, when data is in transit between AWS Lambda states of the state machine, it remains encrypted and secure.
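To make the customer managed key option concrete, here is a small sketch of how an upload to Amazon S3 might opt in to AWS KMS encryption. The helper name and the bucket, key, and KMS key values are hypothetical; `ServerSideEncryption` and `SSEKMSKeyId` are standard `put_object` parameters.

```python
from typing import Optional


def build_put_object_args(bucket: str, key: str, body: bytes,
                          kms_key_id: Optional[str] = None) -> dict:
    """Build keyword arguments for an S3 put_object call.

    With no KMS key, the encryption parameters are omitted and the
    bucket's default encryption applies. With a key, the object is
    encrypted under that customer managed key.
    """
    args = {"Bucket": bucket, "Key": key, "Body": body}
    if kms_key_id:
        args["ServerSideEncryption"] = "aws:kms"
        args["SSEKMSKeyId"] = kms_key_id
    return args
```

The result would be passed along as `s3_client.put_object(**args)`, where `s3_client` is `boto3.client("s3")`.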

Now that an image has been generated from text input and a video has been generated from the image, the last step is to convert the video to holographic content using Proto’s technology. In the final step of this solution the video is uploaded to the Proto Hologram Content Management System where the content becomes viewable and spatial on a Proto holographic display.

Figure 3: Proto Epic live in Amazon JFK27

With today’s cutting-edge generative AI models, AWS serverless services, and Proto’s holographic displays, users can go from typing in an idea to having animated holographic content in less than two minutes.

The Technical Backbone: AWS Serverless Architecture

Leveraging the capabilities of the Amazon Titan Image Generator, powered by Amazon Bedrock, and Stable Video Diffusion (SVD), the Text-to-Hologram Creation Pipeline enables users to generate holograms for Proto’s holographic displays from simple text prompts. Prior to the Text-to-Hologram Creation Pipeline, crafting holographic content demanded proficiency in 3D modeling, texturing, lighting, and rendering, but now anyone can bring their visions to life by merely describing their desired output.

Figure 4: High-level Reference Architecture

    1. Amazon Cognito adds user access controls to this solution, handling the sign-in and sign-out processes. Once signed in, a user can be authorized to make requests to the backend.
    2. Amazon API Gateway is configured to act as the front door to the backend application. The API routes user requests to access data and assets.
    3. AWS Lambda functions route queries based on request parameters and perform backend operations.
    4. Amazon S3 is an object storage service that stores the generated image and video files.
    5. An Amazon DynamoDB table stores metadata such as the prompt, image and video filenames, and Amazon S3 object keys.
    6. The AWS Step Functions State Machine orchestrates the processing workflow to generate video from text by integrating with various AWS services.
    7. The Open Pipeline state creates placeholder data entries for the various objects that the workflow will generate.
    8. The Create Image state uses AWS Lambda to send the text prompt to the Titan Image Generator FM running on Amazon Bedrock. This step outputs an image file that is uploaded to Amazon S3.
    9. The Resize Image state scales and crops the generated image to align with SVD requirements. This new image is stored in Amazon S3.
    10. The Create Video state uses AWS Lambda to send the generated image to the SVD model using the Stability AI API. This step outputs a video file that is uploaded to Amazon S3.
    11. The Resize Video state scales the generated video to 4K resolution for viewing on the end-user display device. This step outputs a video file that is uploaded to Amazon S3.
    12. Once the video is upscaled, the state machine uploads the video to the Proto Hologram CMS so the user can view the generated video on their Proto holographic display device.
    13. AWS Web Application Firewall (WAF) protects the application from common web exploits and bots that can affect availability, compromise security, or consume excessive resources.
    14. AWS IAM securely manages identities and access to AWS services and resources.
    15. Amazon CloudWatch provides monitoring, logging, and observability for resources.
    16. AWS X-Ray provides a complete view of requests traced throughout the application.
    17. AWS Secrets Manager provides secure storage for API keys for the Proto Hologram CMS, as well as any additional secrets.
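The sequential workflow in steps 7 through 12 could be expressed in Amazon States Language roughly as follows. This is a sketch under stated assumptions: the state identifiers and the one-Lambda-function-per-state layout are hypothetical, and the real state machine likely adds retries, error handling, and other details not described here.

```python
import json

# Hypothetical state identifiers mirroring steps 7-12 of the list above.
PIPELINE_STEPS = [
    "OpenPipeline",
    "CreateImage",
    "ResizeImage",
    "CreateVideo",
    "ResizeVideo",
    "UploadToProto",
]


def build_state_machine(lambda_arns: dict) -> dict:
    """Chain one Lambda-backed Task state per pipeline step."""
    states = {}
    for index, name in enumerate(PIPELINE_STEPS):
        state = {
            "Type": "Task",
            # Step Functions service integration for invoking Lambda.
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
                "FunctionName": lambda_arns[name],
                "Payload.$": "$",  # forward the whole state input
            },
            "OutputPath": "$.Payload",  # pass the function result onward
        }
        if index + 1 < len(PIPELINE_STEPS):
            state["Next"] = PIPELINE_STEPS[index + 1]
        else:
            state["End"] = True
        states[name] = state
    return {
        "Comment": "Text-to-hologram pipeline (illustrative sketch)",
        "StartAt": PIPELINE_STEPS[0],
        "States": states,
    }
```

Serializing the result with `json.dumps` gives a definition string that could be supplied when creating the state machine, for example through infrastructure-as-code tooling.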

Pushing the Boundaries

This is only the beginning of what’s possible. As generative AI continues to unlock novel realms of potential, envision a future where holographic content becomes an integral part of our daily lives. From interactive educational experiences to collaborative design sessions, and even immersive entertainment, the possibilities are endless.

Start exploring the boundless horizons with generative AI on AWS today using the AI Playgrounds available from Amazon Bedrock in the AWS console.