AWS Big Data Blog
Use Amazon Redshift RA3 with managed storage in your modern data architecture
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers. Over the years, Amazon Redshift has evolved a […]
Ingest Stripe data in a fast and reliable way using Stripe Data Pipeline for Amazon Redshift
Enterprises typically host a myriad of business applications for varying data needs. As companies grow, so does the demand for insights from a complete set of business data. Having data from various applications that store data in disparate silos can delay the decision-making process. However, building and maintaining an API integration or a third-party extract, […]
Coming June 2022: An updated Amazon QuickSight dashboard experience
This post was last updated in June 2022 to include new experiences such as Analysis and Embedding. Starting June 30, 2022, Amazon QuickSight is introducing a new look and feel for your dashboards. In this post, we walk through the changes to expect with the new look. The new dashboard experience includes the following improvements: Simplified […]
Use a linear learner algorithm in Amazon Redshift ML to solve regression and classification problems
Amazon Redshift is the fastest, most widely used, fully managed, and petabyte-scale cloud data warehouse. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytics workloads. Amazon Redshift ML, powered by Amazon SageMaker, makes it easy for SQL users such as data analysts, data scientists, and […]
Deep dive into Amazon EMR Kerberos authentication integrated with Microsoft Active Directory
Many of our customers that use Amazon EMR as their big data platform need to integrate with their existing Microsoft Active Directory (AD) for user authentication. This integration requires the Kerberos daemon of Amazon EMR to establish a trusted connection with an AD domain, which involves a lot of moving pieces and can be difficult […]
Federate single sign-on access to Amazon Redshift query editor v2 with Okta
Amazon Redshift query editor v2 is a web-based SQL client application that you can use to author and run queries on your Amazon Redshift data warehouse. You can visualize query results with charts and collaborate by sharing queries with members of your team. You can use query editor v2 to create databases, schemas, tables, and […]
Federate access to Amazon Redshift query editor V2 with Active Directory Federation Services (AD FS): Part 3
In the first post of this series, Federate access to your Amazon Redshift cluster with Active Directory Federation Services (AD FS): Part 1, you set up Microsoft Active Directory Federation Services (AD FS) and Security Assertion Markup Language (SAML) based authentication and tested the SAML federation using a web browser. In Part 2, you learned […]
How Paytm modernized their data pipeline using Amazon EMR
This post was co-written by Rajat Bhardwaj, Senior Technical Account Manager at AWS and Kunal Upadhyay, General Manager at Paytm. Paytm is India’s leading payment platform, pioneering the digital payment era in India with 130 million active users. Paytm operates multiple lines of business, including banking, digital payments, bill recharges, e-wallet, stocks, insurance, lending and […]
Orchestrate big data jobs on on-premises clusters with AWS Step Functions
Customers who need to run big data compute jobs on on-premises infrastructure often require a scalable orchestration solution. For large-scale distributed compute clusters, job orchestration must scale to maximize cluster utilization while remaining resilient to failures, so that it does not block the ever-growing influx of data and […]
Analyze Amazon SES events at scale using Amazon Redshift
Email is one of the most important methods for business communication across many organizations. It’s also one of the primary methods for many businesses to communicate with their customers. With the ever-increasing necessity to send emails at scale, monitoring and analysis has become a major challenge. Amazon Simple Email Service (Amazon SES) is a cost-effective, […]