Tag: Data Pipeline
Building a Serverless Trigger-Based Data Movement Pipeline Using Apache NiFi, DataFlow Functions, and AWS Lambda
Organizations have a wide range of data processing use cases: collecting data from a variety of sources, transforming it, and loading it to different destinations to fulfill diverse business needs. Learn how DataFlow Functions, combined with the serverless compute services provided by AWS Lambda, enable developers to implement a wide spectrum of use cases using the low-code NiFi flow designer user interface, and deploy the flows as short-lived serverless functions.
Next Caller uses machine learning on AWS to drive data analysis and the processing pipeline. Amazon SageMaker helps Next Caller understand call pathways through the telephone network, rendering analysis in approximately 125 milliseconds with the VeriCall analysis engine. VeriCall verifies that a phone call is coming from the physical device that owns the phone number, and flags spoofed calls and other suspicious interactions in real time.
Despite the rich set of machine learning tools AWS provides, coordinating and monitoring workflows across an ML pipeline remains a complex task. Control-M by BMC Software simplifies complex application, data, and file transfer workflows, whether on-premises, on the AWS Cloud, or across a hybrid cloud model. Walk through the architecture of a predictive maintenance system we developed to simplify the complex orchestration steps in a machine learning pipeline used to reduce downtime and costs for a trucking company.
Attribution models allow companies to guide marketing, sales, and support efforts using data, and then custom-tailor every customer’s experience for maximum effect. Together, cloud-based data pipeline tools like Fivetran and data warehouses like Amazon Redshift form the infrastructure for integrating and centralizing data from across a company’s operations and activities, enabling business intelligence and analytics.
Amazon Redshift is a powerful yet affordable data warehouse, and while getting data out of Redshift is easy, getting data into and around Redshift can pose problems as the warehouse grows. Datacoral is a serverless data platform that manages metadata changes, data transformations, and pipeline orchestration for data consumers. In this post, learn how to write Redshift SQL to represent data flow, and how serverless data pipelines are automatically generated for that data flow.