AWS Big Data Blog
Category: Learning Levels
Amazon EMR on EKS widens the performance gap: Run Apache Spark workloads 5.37 times faster and at 4.3 times lower cost
Amazon EMR on EKS provides a deployment option for Amazon EMR that allows organizations to run open-source big data frameworks on Amazon Elastic Kubernetes Service (Amazon EKS). With EMR on EKS, Spark applications run on the Amazon EMR runtime for Apache Spark. This performance-optimized runtime offered by Amazon EMR makes your Spark jobs run fast […]
SANS Institute uses Amazon QuickSight to drive transformational security awareness maturity within organizations
This is a guest post by Carl Marrelli from SANS Institute. The SANS Institute is a world leader in cybersecurity training and certification. For over 30 years, SANS has worked with leading organizations to help ensure security across their organization, as well as with individual IT professionals who want to build and grow their security […]
It’s the Amazon QuickSight Community’s 1st birthday!
Happy birthday, Amazon QuickSight Community! We are celebrating 1 year since the launch of our new Community. The Amazon QuickSight Community website is a one-stop shop where business intelligence (BI) authors and developers from across the globe can ask and answer questions, stay up to date, network, and learn together about Amazon QuickSight. In this post, […]
Push Amazon EMR step logs from Amazon EC2 instances to Amazon CloudWatch logs
Amazon EMR is a big data service offered by AWS to run Apache Spark and other open-source applications on AWS to build scalable data pipelines in a cost-effective manner. Monitoring the logs generated from the jobs deployed on EMR clusters is essential to help detect critical issues in real time and identify root causes quickly. […]
Connect to Amazon MSK Serverless from your on-premises network
Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed, highly available, and secure Apache Kafka service. Amazon MSK reduces the work needed to set up, scale, and manage Apache Kafka in production. With Amazon MSK, you can create a cluster in minutes and start sending data. With Amazon MSK Serverless, you can […]
How Morningstar used tag-based access controls in AWS Lake Formation to manage permissions for an Amazon Redshift data warehouse
This post was co-written by Ashish Prabhu, Stephen Johnston, and Colin Ingarfield at Morningstar and Don Drake at AWS. With “Empowering Investor Success” as the core motto, Morningstar aims to provide our investors and advisors with the tools and information they need to make informed investment decisions. In this post, Morningstar’s Data Lake Team Leads […]
Patterns for updating Amazon OpenSearch Service index settings and mappings
Amazon OpenSearch Service is used for a broad set of use cases like real-time application monitoring, log analytics, and website search at scale. As your domain ages and you add additional consumers, you need to reevaluate and change the domain’s configuration to handle additional storage and compute needs. You want to minimize downtime and performance […]
Showpad accelerates data maturity to unlock innovation using Amazon QuickSight
Showpad aligns sales and marketing teams around impactful content and powerful training, helping sellers engage with buyers and generate the insights needed to continuously improve conversion rates. In 2021, Showpad set forth the vision to use the power of data to unlock innovations and drive business decisions across its organization. Showpad’s legacy solution was fragmented […]
Create threshold alerts on tables and pivot tables in Amazon QuickSight
Amazon QuickSight previously launched threshold alerts on KPIs and gauge charts. Now, QuickSight supports creating threshold alerts on tables and pivot tables—our most popular visual types. This allows readers and authors to track goals or key performance indicators (KPIs) and be notified via email when they are met. These alerts allow readers and authors to […]
Generic orchestration framework for data warehousing workloads using Amazon Redshift RSQL
Tens of thousands of customers run business-critical workloads on Amazon Redshift, AWS’s fast, petabyte-scale cloud data warehouse delivering the best price-performance. With Amazon Redshift, you can query data across your data warehouse, operational data stores, and data lake using standard SQL. You can also integrate AWS services like Amazon EMR, Amazon Athena, Amazon SageMaker, AWS […]