AWS Partner Network (APN) Blog

Tag: PySpark

Training Multiple Machine Learning Models Simultaneously Using Spark and Apache Arrow

Spark is a distributed computing framework that uses PyArrow to support newer features such as pandas UDFs. You can use Spark to distribute the machine learning model lifecycle and build massive-scale products with many models in production. Learn how Perion Network implemented a model lifecycle capability that distributes the training and testing stages with a few lines of PySpark code. This capability improved the performance and accuracy of Perion’s ML models.

Accelerating Data Warehouse Migration to Amazon Redshift Using Cognizant Intelligent Data Works

Many organizations are looking to migrate existing on-premises enterprise data warehouse systems to cloud-based data warehouses such as Amazon Redshift. Here, we discuss how Cognizant’s Intelligent Migration Workbench (IMW) can accelerate data warehouse migrations by converting Oracle PL/SQL and Teradata BTEQ scripts. IMW makes it easier to move mission-critical proprietary code to AWS, giving customers a competitive edge through faster time to market.