AWS Public Sector Blog

Tag: dataops

Photo by Hunter Harritt on Unsplash

Modern data engineering in higher ed: Doing DataOps atop a data lake on AWS

Modern data engineering covers several key components of building a modern data lake. Most databases and data warehouses, to an extent, do not lend themselves well to a DevOps model. DataOps grew out of frustrations trying to build a scalable, reusable data pipeline in an automated fashion. DataOps was founded on applying DevOps principles on top of data lakes to help build automated solutions in a more agile manner. With DataOps, users apply principles of data processing on the data lake to curate and collect the transformed data for downstream processing. One reason that DevOps was hard on databases was because testing was hard to automate on such systems. At California State University Chancellors Office (CSUCO), we took a different approach by residing most of our logic with a programming framework that allows us to build a testable platform. Learn how to apply DataOps in ten steps.