Big Data | AWS Partner Network (APN) Blog

How Mactores Tripled Performance by Migrating from Oracle to Amazon Redshift with Zero Downtime

Mactores used a five-step approach to migrate a large manufacturing company from an on-premises Oracle data warehouse to Amazon Redshift with zero downtime. The migration tripled the performance of the customer’s reports, dashboards, and business processes, and lowered total cost of ownership (TCO) by 30 percent. Data refresh time dropped from 48 hours to three hours.

Monitoring Your Palo Alto Networks VM-Series Firewall with a Syslog Sidecar

By hosting a Palo Alto Networks VM-Series firewall in an Amazon VPC, you can use AWS native cloud services, such as Amazon CloudWatch, Amazon Kinesis Data Streams, and AWS Lambda, to monitor your firewall for changes in configuration. This post explains why that’s desirable and walks you through the steps required to do it. The result is a way to monitor your Palo Alto Networks firewall that is very similar to how you monitor your AWS environment with AWS Config.

Accelerating Machine Learning with Qubole and Amazon SageMaker Integration

Data scientists creating enterprise machine learning models to process large volumes of data spend a significant portion of their time managing the infrastructure required to process the data, rather than exploring the data and building ML models. You can reduce this overhead by running Qubole data processing tools and Amazon SageMaker. An open data lake platform, Qubole automates the administration and management of your resources on AWS.

How to Use AWS Glue to Prepare and Load Amazon S3 Data for Analysis by Teradata Vantage

Customers want to use Teradata Vantage to analyze the data they have stored in Amazon S3, but the AWS service that prepares and loads data stored in S3 for analytics, AWS Glue, does not natively support Teradata Vantage. To use AWS Glue to prep and load data for analysis by Teradata Vantage, you need to rely on AWS Glue custom database connectors. Follow step-by-step instructions and learn how to set up Vantage and AWS Glue to perform Teradata-level analytics on the data you have stored in Amazon S3.

Running SQL on Amazon Athena to Analyze Big Data Quickly and Across Regions

Data is the lifeblood of a digital business and a key competitive advantage for many companies holding large amounts of data in multiple cloud regions. Imperva protects web applications and data assets, and in this post we examine how you can use SQL to analyze big data directly, or to pre-process the data for further analysis by machine learning. You’ll also learn about the benefits and limitations of using SQL, and see examples of clustering and data extraction.

Powering Enterprise Analytics at Scale Using Teradata Vantage on AWS

The amount and variety of existing and newly generated data in today’s connected world are unparalleled. As this growth continues, so does the opportunity for organizations to extract real value from their data. Teradata Vantage is a modern analytics platform that combines open source and commercial analytic technologies. It can drive autonomous decision-making by helping you to operationalize insights, solve complex business problems, and enable descriptive, predictive, and prescriptive analytics.

Lower TCO and Increase Query Performance by Running Hive on Spark in Amazon EMR

Learn how Mactores helped Seagate Technology to use Apache Hive on Apache Spark for queries larger than 10TB, combined with the use of transient Amazon EMR clusters leveraging Amazon EC2 Spot Instances. It was imperative for Seagate to have systems in place to ensure the cost of collecting, storing, and processing data did not exceed their ROI. Moving to Hive on Spark enabled Seagate to continue processing petabytes of data at scale with significantly lower TCO.

Optimizing Amazon EC2 Spot Instance Usage with Qubole Data Platform

Amazon EC2 Spot Instances let you reduce costs by taking advantage of unused capacity. Learn how the Qubole Data Platform optimizes your Spot usage and reduces costs further by applying policy-based automation to balance performance, cost, and SLA requirements whenever you use Spot Instances.

Say Hello to 49 New AWS Competency, Service Delivery, and MSP Partners Added in March

We are excited to highlight 49 APN Partners that received new designations in March for our global AWS Competency, AWS Managed Service Provider (MSP), and AWS Service Delivery programs. These designations span workload, solution, and industry, and help AWS customers identify top APN Partners that can deliver on core business objectives. APN Partners are focused on your success, helping customers take full advantage of the business benefits AWS has to offer.

Optimizing Presto SQL on Amazon EMR to Deliver Faster Query Processing

Seagate asked Mactores Cognition to evaluate and deliver an alternative data platform to process petabytes of data with consistent performance. It needed to lower query processing time and total cost of ownership, and provide the scalability required to support about 2,000 daily users. Learn about the three migration options Mactores tested and the architecture of the solution Seagate selected. This effort improved the overall efficiency of Seagate’s Amazon EMR cluster and business operations.
