Posted On: Jul 27, 2023

We are happy to announce that, starting today, you can retrieve secrets from AWS Secrets Manager in your Spark and Hive jobs on Amazon EMR Serverless. Amazon EMR Serverless is a serverless option that makes it easy for data analysts and engineers to run open-source big data analytics frameworks such as Apache Spark and Apache Hive without configuring, managing, and scaling clusters or servers.

Spark and Hive jobs often need sensitive information, such as database credentials and API keys, to connect to other systems. It is good practice to manage such information separately from application configuration: this improves code reusability and reduces the operational overhead of updating application configuration whenever a secret changes. You can now securely reference secrets stored in Secrets Manager in EMR Serverless job configurations or classifications; at runtime, those references are replaced with the secret values. This feature is especially useful when application configuration must specify credentials for an external Hive metastore database.
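
As a rough illustration of the external Hive metastore use case, the sketch below builds a `spark-defaults` classification in which the metastore password is a secret reference rather than a literal value. It assumes the reference syntax is `EMR.secret@` followed by the secret ID; confirm this against the EMR Serverless documentation before relying on it. The helper names, JDBC URL, user name, and secret ID are hypothetical placeholders chosen for this example.

```python
# Sketch only: the "EMR.secret@" prefix is an assumed reference syntax.
# Verify it against the EMR Serverless documentation.
SECRET_PREFIX = "EMR.secret@"


def secret_ref(secret_id: str) -> str:
    """Return a placeholder that refers to a Secrets Manager secret by its ID."""
    return f"{SECRET_PREFIX}{secret_id}"


def metastore_spark_defaults(jdbc_url: str, user: str, password_secret_id: str) -> dict:
    """Build a spark-defaults classification for an external Hive metastore.

    The password property holds a secret reference, so the secret value itself
    never appears in the application or job configuration.
    """
    return {
        "classification": "spark-defaults",
        "properties": {
            "spark.hadoop.javax.jdo.option.ConnectionURL": jdbc_url,
            "spark.hadoop.javax.jdo.option.ConnectionUserName": user,
            "spark.hadoop.javax.jdo.option.ConnectionPassword": secret_ref(password_secret_id),
        },
    }


# Hypothetical values for illustration only.
config = metastore_spark_defaults(
    jdbc_url="jdbc:mysql://db-host:3306/hive",
    user="hive_user",
    password_secret_id="my-metastore-password",
)
```

A dictionary like `config` could then be supplied as one of the application configurations when submitting a job. Rotating the password in Secrets Manager would not require touching this configuration, since only the reference is stored here.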

This feature is available for all release versions of EMR and in all AWS Regions where Amazon EMR Serverless is available. To learn more, and to see examples of how to specify secret references, see Using SecretsManager in EMR Serverless.