Artificial Intelligence

Category: Advanced (300)

Perform batch transforms with Amazon SageMaker Jumpstart Text2Text Generation large language models

Today we are excited to announce that you can now perform batch transforms with Amazon SageMaker JumpStart large language models (LLMs) for Text2Text Generation. Batch transforms are useful in situations where the responses don’t need to be real time and therefore you can do inference in batch for large datasets in bulk. For batch transform, […]

Deploy generative AI models from Amazon SageMaker JumpStart using the AWS CDK

The seeds of a machine learning (ML) paradigm shift have existed for decades, but with the ready availability of virtually infinite compute capacity, a massive proliferation of data, and the rapid advancement of ML technologies, customers across industries are rapidly adopting and using ML technologies to transform their businesses. Just recently, generative AI applications have […]

GPT-NeoXT-Chat-Base-20B foundation model for chatbot applications is now available on Amazon SageMaker

Today we are excited to announce that Together Computer’s GPT-NeoXT-Chat-Base-20B language foundation model is available for customers using Amazon SageMaker JumpStart. GPT-NeoXT-Chat-Base-20B is an open-source model to build conversational bots. You can easily try out this model and use it with JumpStart. JumpStart is the machine learning (ML) hub of Amazon SageMaker that provides access […]

Publish predictive dashboards in Amazon QuickSight using ML predictions from Amazon SageMaker Canvas

April 2024: This post was reviewed and updated for accuracy. Understanding business trends, customer behavior, sales revenue, increase in demand, and buyer propensity all start with data. Exploring, analyzing, interpreting, and finding trends in data is essential for businesses to achieve successful outcomes. Business analysts play a pivotal role in facilitating data-driven business decisions through […]

Announcing provisioned concurrency for Amazon SageMaker Serverless Inference

Amazon SageMaker Serverless Inference allows you to serve model inference requests in real time without having to explicitly provision compute instances or configure scaling policies to handle traffic variations. You can let AWS handle the undifferentiated heavy lifting of managing the underlying infrastructure and save costs in the process. A Serverless Inference endpoint spins up […]

Host ML models on Amazon SageMaker using Triton: Python backend

Amazon SageMaker provides a number of options for users who are looking for a solution to host their machine learning (ML) models. Of these options, one of the key features that SageMaker provides is real-time inference. Real-time inference workloads can have varying levels of requirements and service level agreements (SLAs) in terms of latency and […]

Securing MLflow in AWS: Fine-grained access control with AWS native services

June 2024: The contents of this post are out of date. We recommend you refer to Announcing the general availability of fully managed MLflow on Amazon SageMaker for the latest. With Amazon SageMaker, you can manage the whole end-to-end machine learning (ML) lifecycle. It offers many native capabilities to help manage ML workflows aspects, such […]

Host ML models on Amazon SageMaker using Triton: TensorRT models

Sometimes it can be very beneficial to use tools such as compilers that can modify and compile your models for optimal inference performance. In this post, we explore TensorRT and how to use it with Amazon SageMaker inference using NVIDIA Triton Inference Server. We explore how TensorRT works and how to host and optimize these […]

Build an image search engine with Amazon Kendra and Amazon Rekognition

In this post, we discuss a machine learning (ML) solution for complex image searches using Amazon Kendra and Amazon Rekognition. Specifically, we use the example of architecture diagrams for complex images due to their incorporation of numerous different visual icons and text. With the internet, searching and obtaining an image has never been easier. Most […]

Achieve high performance with lowest cost for generative AI inference using AWS Inferentia2 and AWS Trainium on Amazon SageMaker

The world of artificial intelligence (AI) and machine learning (ML) has been witnessing a paradigm shift with the rise of generative AI models that can create human-like text, images, code, and audio. Compared to classical ML models, generative AI models are significantly bigger and more complex. However, their increasing complexity also comes with high costs […]