AWS Big Data Blog


Create a serverless event-driven workflow to ingest and process Microsoft data with AWS Glue and Amazon EventBridge

Microsoft SharePoint is a document management system for storing files, organizing documents, and sharing and editing documents in collaboration with others. Your organization may want to ingest SharePoint data into your data lake, combine the SharePoint data with other data that’s available in the data lake, and use it for reporting and analytics purposes. AWS […]


How Roche democratized access to data with Google Sheets and Amazon Redshift Data API

This post was co-written with Dr. Yannick Misteli, João Antunes, and Krzysztof Wisniewski from the Roche global Platform and ML engineering team as the lead authors. Roche is a Swiss multinational healthcare company that operates worldwide. Roche is the largest pharmaceutical company in the world and the leading provider of cancer treatments globally. In this […]


Accelerate self-service analytics with Amazon Redshift Query Editor V2

Amazon Redshift is a fast, fully managed cloud data warehouse. Tens of thousands of customers use Amazon Redshift as their analytics platform. Users such as data analysts, database developers, and data scientists use SQL to analyze their data in Amazon Redshift data warehouses. Amazon Redshift provides a web-based query editor in addition to supporting connectivity […]


Introducing Amazon S3 shuffle in AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development. In AWS Glue, you can use Apache Spark, which is an open-source, distributed processing system for your data integration tasks and big data workloads. Apache Spark utilizes in-memory caching and […]


Integrate AWS Glue DataBrew and Amazon Pinpoint to launch marketing campaigns

Marketing teams often rely on data engineers to provide a consumer dataset that they can use to launch marketing campaigns. This can sometimes cause delays in launching campaigns and consume data engineers’ bandwidth. The campaigns are often launched using complex solutions that are either code-heavy or rely on licensed tools. The processes of both extract, […]


TrueBlue uses Amazon QuickSight to deliver more accurate pricing and grow business

This is a guest post by TrueBlue. In their own words, “Founded in 1989, TrueBlue provides specialized workforce solutions, including staffing, talent management, and recruitment process outsourcing (RPO). In 2020, the company connected approximately 490,000 people with work.” At TrueBlue, we offer solutions that help employers connect with workers worldwide. Every day, sales teams at […]


Query hierarchical data models within Amazon Redshift

In a hierarchical database model, information is stored in a tree-like structure or parent-child structure, where each record can have a single parent but many children. Hierarchical databases are useful when you need to represent data in a tree-like hierarchy. The perfect example of a hierarchical data model is the navigation file and folders or […]


Now Available: Updated guidance on the Data Analytics Lens for AWS Well-Architected Framework

Nearly all businesses today require some form of data analytics processing, from auditing user access to generating sales reports. For all your analytics needs, the Data Analytics Lens for AWS Well-Architected Framework provides prescriptive guidance to help you assess your workloads and identify best practices aligned to the AWS Well-Architected Pillars: Operational Excellence, Security, Reliability, […]


Cybersecurity Awareness Month: Learn about the job zero of securing your data using Amazon Redshift

Amazon Redshift is the most widely used cloud data warehouse. It allows you to run complex analytic queries against terabytes to petabytes of structured and semi-structured data, using sophisticated query optimization, high-performance columnar storage, and massively parallel query execution. At AWS, we embrace the culture that security is job zero, by which we mean […]


Copy large datasets from Google Cloud Storage to Amazon S3 using Amazon EMR

Many organizations have data sitting in various data sources in a variety of formats. Even though data is a critical component of decision-making, for many organizations this data is spread across multiple public clouds. Organizations are looking for tools that make it easy and cost-effective to copy large datasets across cloud vendors. With Amazon EMR […]
