AWS Partner Network (APN) Blog

Category: Amazon Simple Storage Service (S3)

Archiving Amazon MSK Data to Amazon S3 with the Lenses.io S3 Kafka Connect Connector

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed, highly available, and secure Apache Kafka service that makes it easy to build and run applications that use Kafka to process streaming data. Learn how to use the new open source Kafka Connect connector (StreamReactor) from Lenses.io to query, transform, optimize, and archive data from Amazon MSK to Amazon S3. We’ll also demonstrate how to use Amazon Athena to query the partitioned Parquet data directly from S3.
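
As a flavor of that last step, here is a minimal boto3 sketch of querying the archived, partitioned Parquet data with Athena. The database, table, partition column, and results bucket below are placeholders rather than values from the post.

```python
import boto3

# Placeholder names -- substitute the Glue/Athena database and table that
# point at the Parquet files written by the S3 sink connector.
DATABASE = "msk_archive"
RESULTS = "s3://example-athena-results/queries/"

athena = boto3.client("athena")

# Run a query against one partition of the archived topic data.
response = athena.start_query_execution(
    QueryString="SELECT * FROM telemetry WHERE dt = '2021-06-01' LIMIT 10",
    QueryExecutionContext={"Database": DATABASE},
    ResultConfiguration={"OutputLocation": RESULTS},
)
print("Query started:", response["QueryExecutionId"])
```

From there, get_query_execution and get_query_results can be polled to check completion and fetch rows.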

How DataArt Helped Inchcape Shipping Services to Revolutionize Document Processing on AWS

The main thing that all interested parties need when a vessel is in port—and this has never changed—is information. Learn how Inchcape, a global organization and leader in ships agency and maritime services that covers around 2,500 ports, teamed up with DataArt to reimagine its core operational platform, Optic, as a bespoke .NET Core-based microservices solution. Optic builds trust through transparency of the vessel program, real-time updates, and standardized workflow and data across all port calls and locations.

Bursting Your On-Premises Data Lake Analytics and AI Workloads on AWS

Developing and maintaining an on-premises data lake is a complex undertaking. To maximize the value of data and use it as the basis for critical decisions, the data platform must be flexible and cost-effective. Learn how to build a hybrid data lake with Alluxio to leverage analytics and AI on AWS alongside a multi-petabyte on-premises data lake. Alluxio calls this approach “zero-copy” hybrid cloud because workloads can burst to AWS without first copying data to Amazon S3.

Leveraging Serverless Architecture to Build an Enterprise Data Repository Platform for Customer Insights and Analytics

Moving data between multiple data stores requires an extract, transform, load (ETL) process using various data analysis approaches. ETL operations form the backbone of any modern enterprise data and analytics platform. AWS provides a broad range of services to deploy enterprise-grade applications in the cloud. This post explores a strategic collaboration between Tech Mahindra and a customer to build and deploy an enterprise data repository on AWS and create ETL workflows using a serverless architecture.
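
As a rough sketch of the serverless pattern (not the actual architecture built with the customer), an AWS Lambda function triggered by an S3 upload could start an AWS Glue job for each new file. The job name and argument below are hypothetical.

```python
import boto3

glue = boto3.client("glue")

# Hypothetical Glue job name; in practice this would be the ETL job that
# loads the newly arrived file into the enterprise data repository.
GLUE_JOB_NAME = "edr-ingest-job"

def lambda_handler(event, context):
    """Triggered by an S3 ObjectCreated event; starts a Glue ETL run
    for each uploaded object."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        run = glue.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        print(f"Started Glue run {run['JobRunId']} for s3://{bucket}/{key}")
```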

WANdisco Accelerates GoDaddy’s Hadoop Cloud Migration to AWS Without Business Interruption

Advances in migration technology enable you to migrate data from actively used Hadoop environments to the cloud at scale. You can benefit from AWS managed services and pace of innovation, and reduce costs for your largest and most complex analytic workloads. Learn how GoDaddy migrated data from their 800-node, 2.5 PB production Apache Hadoop cluster to Amazon S3 using WANdisco’s LiveData Migrator product.

Using Dremio for Fast and Easy Analysis of Amazon S3 Data

Although many SQL engines allow tools to query Amazon S3 data, organizations face multiple challenges, including high latency and infrastructure costs. Learn how Dremio empowers analysts and data scientists to analyze data in S3 directly at interactive speed, without having to physically copy data into other systems or create extracts, cubes, and/or aggregation tables. Dremio’s unique architecture enables faster and more reliable query performance than traditional SQL engines.

How to Turn Archive Data into Actionable Insights with Cohesity and AWS

A big challenge for enterprises is how to manage the growth of data in a cost-effective manner. CIOs are also looking for ways to get insights out of the data so their organizations can create actionable outcomes. Learn how the CloudArchive Direct feature of Cohesity’s DataPlatform works with AWS analytics services to drive insights into customers’ NAS data. Cohesity is redefining data management to lower TCO while simplifying the way businesses manage and protect their data.

Change Data Capture from On-Premises SQL Server to Amazon Redshift Target

Change Data Capture (CDC) is the technique of systematically tracking incremental change in data at the source, and subsequently applying these changes at the target to maintain synchronization. You can implement CDC in diverse scenarios using a variety of tools and technologies. Here, Cognizant uses a hypothetical retailer with a customer loyalty program to demonstrate how CDC can synchronize incremental changes in customer activity with the main body of data already stored about a customer.
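
For a feel of the apply step, the sketch below uses the Amazon Redshift Data API to merge a batch of staged changes into the target table in one transaction. The cluster, schema, and table names are invented for illustration, and the staging table is assumed to already hold the changes captured at the SQL Server source.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Hypothetical cluster, database, and table names.
statements = [
    # Remove rows that are about to be replaced by newer versions.
    """DELETE FROM loyalty.customer_activity
       USING loyalty.customer_activity_staging s
       WHERE customer_activity.customer_id = s.customer_id""",
    # Insert the latest version of each changed row.
    "INSERT INTO loyalty.customer_activity SELECT * FROM loyalty.customer_activity_staging",
    # Clear the staging table for the next CDC batch.
    "DELETE FROM loyalty.customer_activity_staging",
]

# BatchExecuteStatement runs the statements in order inside one transaction.
response = redshift_data.batch_execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    DbUser="etl_user",
    Sqls=statements,
)
print("Statement batch:", response["Id"])
```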

How to Use AWS Glue to Prepare and Load Amazon S3 Data for Analysis by Teradata Vantage

Customers want to use Teradata Vantage to analyze the data they have stored in Amazon S3, but the AWS service that prepares and loads data stored in S3 for analytics, AWS Glue, does not natively support Teradata Vantage. To use AWS Glue to prep and load data for analysis by Teradata Vantage, you need to rely on AWS Glue custom database connectors. Follow step-by-step instructions and learn how to set up Vantage and AWS Glue to perform Teradata-level analytics on the data you have stored in Amazon S3.
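
To give a feel for what such a job can look like, here is a minimal PySpark sketch of a Glue job that reads a cataloged S3 table and writes it to Vantage over a JDBC connection. The database, table, and connection names are assumptions, and the Teradata JDBC driver would need to be supplied to the job separately.

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

# Hypothetical Data Catalog names; "teradata-vantage" is assumed to be a Glue
# connection configured with the Teradata JDBC URL and credentials.
glue_context = GlueContext(SparkContext.getOrCreate())

# Read the S3-backed table registered in the Glue Data Catalog by a crawler.
source = glue_context.create_dynamic_frame.from_catalog(
    database="s3_sales_db",
    table_name="orders",
)

# Write the prepared records to Teradata Vantage through the JDBC connection.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=source,
    catalog_connection="teradata-vantage",
    connection_options={"dbtable": "orders", "database": "vantage_db"},
)
```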

Protecting Your Amazon EBS Volumes at Scale with Clumio

Many AWS customers who use Amazon EBS to store persistent data need to back up that data, sometimes for long periods of time. Clumio’s SaaS solution protects Amazon EBS volumes from multiple AWS accounts through a single policy via tagging. Amazon EBS backups by Clumio are securely stored outside of your AWS account in the Clumio service built on AWS, which is protected by end-to-end encryption and stored in an immutable format.