A guide to data, analytics, and machine learning (ML) tools for AWS
From generative AI to data lakes, visualizations to transformation, this guide will help you choose the right data tools for the job
Data storage
Storing data is a complex problem driven by multiple factors, which together define the capabilities and features a tool must offer to be a solid candidate for a production-grade deployment. These factors include data volume, rate of data change, restrictions on where the data can reside and how it can be used, and the performance needed to satisfy specific use cases, such as an interactive user experience. With this in mind, let’s look at the tools and AWS services that the AWS Marketplace developer relations team finds optimal for the following categories.
Transportation and transformation
For data to be of value, it must usually be taken from its raw form and enriched or otherwise transformed. Many data sources must be integrated and related, including streaming data sources (from edge or IoT), data batches from database management systems (DBMS), user-generated data from applications, and large volumes of unstructured or semi-structured data from data lakes and data warehouses. Transportation and transformation are usually the domain of data engineers.

Visualization and analysis
Gaining an understanding of data requires the ability to query and explore it flexibly and efficiently, and to produce the necessary representations of it once you’ve arrived at the desired insights. For analysis, an optimal tool lets you explore the data and produce insightful results with simple syntax. For visualization, it should integrate with many different data sources and present results in a variety of visual representations. This is the realm of data scientists, business intelligence engineers, and database administrators (DBAs).

Machine learning and generative AI
Data lies at the core of machine learning and is crucial to achieving the desired outcome from ML algorithms. Training, fine-tuning, and Retrieval-Augmented Generation (RAG) are just three areas where data is indispensable. Integrating large data sets into development processes raises unique challenges, yet doing so is a must given how rapidly machine learning model development is evolving.

Integration and connectivity
In today's complex IT landscapes, organizations face the challenge of seamlessly connecting diverse applications, data sources, and systems across cloud and on-premises environments. Key requirements include the ability to rapidly deploy integrations, automate workflows, and manage APIs without extensive coding. Solutions must offer scalability to handle growing data volumes, security to protect sensitive information during transit, and flexibility to adapt to evolving business needs. Additionally, integration platforms should provide real-time data synchronization, support for various data formats and protocols, and intuitive interfaces that enable both technical and non-technical users to create and manage integrations efficiently.

Why AWS Marketplace for on-demand cloud tools
Free to try. Deploy in minutes. Pay only for what you use.