AWS Partner Network (APN) Blog
Tag: PySpark
Training Multiple Machine Learning Models Simultaneously Using Spark and Apache Arrow
Spark is a distributed computing framework whose newer features, such as Pandas UDFs, are built on PyArrow. You can use Spark's distributed processing to support an advanced machine learning model lifecycle and build massive-scale products with many models in production. Learn how Perion Network implemented a model lifecycle capability that distributes the training and testing stages with a few lines of PySpark code. This capability improved the performance and accuracy of Perion’s ML models.
Accelerating Data Warehouse Migration to Amazon Redshift Using Cognizant Intelligent Data Works
Many organizations are looking to migrate existing on-premises enterprise data warehouse systems to cloud-based data warehouses such as Amazon Redshift. Here, we discuss how Cognizant’s Intelligent Migration Workbench (IMW) can be used to accelerate data warehouse migrations while converting Oracle PL/SQL and Teradata BTEQ scripts. IMW makes it easy to move mission-critical proprietary code to AWS, giving customers a competitive edge through faster time to market.