AWS Lake Formation FAQs

General

What is AWS Lake Formation?

AWS Lake Formation makes it easier to centrally govern, secure, and globally share data for analytics and machine learning (ML). With Lake Formation, you can centralize data security and governance using the AWS Glue Data Catalog, letting you manage metadata and data permissions in one place with familiar database-style features. It also delivers fine-grained data access control, so you can help ensure users have access to the right data, down to the row and column level, and then scale those permissions across your users. Lake Formation also makes it easier to share data internally across your organization, across Regions, and externally using AWS Data Exchange, letting you create a data mesh or meet other data sharing needs with no data movement. Because Lake Formation tracks data interactions by role and user, it also provides comprehensive data access auditing to help confirm that the right data was accessed by the right users at the right time.

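How does Lake Formation relate to AWS Glue and the AWS Glue Data Catalog?

Lake Formation shares console controls and the AWS Glue Data Catalog with AWS Glue. AWS Glue focuses on data integration and extract, transform, and load (ETL) jobs, while Lake Formation focuses on governing and securing access to the cataloged data.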

See AWS Lake Formation features for more details.

How does Lake Formation integrate with Amazon DataZone?

Lake Formation and the AWS Glue Data Catalog are integral parts of Amazon DataZone. Amazon DataZone supports granting access to Data Catalog tables that are managed in AWS Lake Formation, and it uses Lake Formation to manage permissions and facilitate sharing of data products. For example, when a producer makes data available for subscription in Amazon DataZone, the data has to be in the Data Catalog. When a subscription is granted, Amazon DataZone orchestrates the creation of the corresponding Lake Formation grants.

Can I use third-party business intelligence tools with Lake Formation?

Yes. You can use third-party business applications, such as Tableau and Looker, to connect to your AWS data sources through services such as Amazon Athena or Amazon Redshift. Access to data is managed by Lake Formation and the underlying AWS Glue Data Catalog, so regardless of which application you use, you’re assured that access to your data is governed and controlled.

Do third-party tools integrate with Lake Formation?

Several third-party tools integrate with Lake Formation, including Ahana, Dremio, Privacera, Collibra, and Starburst.

Centralized permissions management

How does Lake Formation simplify and centralize permissions management?

Lake Formation centralizes permission management on your resources in the AWS Glue Data Catalog, including databases and tables, letting you manage permissions for your data and metadata in one place. With Lake Formation, you can define and manage access for your users and applications by role, using familiar database-like grants on Data Catalog resources. This brings the simplicity of data warehouses and databases to your data lake.
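As a minimal sketch of what a database-like grant can look like in code, the example below builds the request for the Lake Formation GrantPermissions API as exposed by boto3 (`grant_permissions`). The role ARN, database name, and table name are hypothetical placeholders, and the `grant` helper is not called here because it needs boto3 and AWS credentials.

```python
def build_table_grant(principal_arn, database, table, permissions=("SELECT",)):
    """Build keyword arguments for the Lake Formation GrantPermissions API."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": list(permissions),
    }


def grant(request):
    """Send the grant. Requires boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder above works without boto3

    return boto3.client("lakeformation").grant_permissions(**request)


# Hypothetical role, database, and table names, for illustration only.
request = build_table_grant(
    "arn:aws:iam::111122223333:role/AnalystRole", "sales_db", "orders"
)
```

The same request shape can be passed to `revoke_permissions` to remove the grant, which mirrors the GRANT and REVOKE pattern of a relational database.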

How does Lake Formation work with AWS Identity and Access Management (IAM)?

Lake Formation integrates with AWS Identity and Access Management (IAM) to authenticate users and roles, and it enforces permissions across AWS analytics and ML services, including Amazon Athena, Amazon QuickSight, Amazon Redshift, and Amazon SageMaker. Lake Formation provides a permissions model that is based on a straightforward grant and revoke mechanism. Lake Formation permissions combine with IAM permissions to control access to data stored in data lakes and to the metadata that describes that data. When a principal requests access to AWS Glue Data Catalog resources or the underlying data, the request succeeds only if it passes permission checks by both IAM and Lake Formation. IAM and Lake Formation are complementary, and Lake Formation does not change existing IAM permissions.
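To make the two-check rule concrete, here is a small sketch. The IAM policy shows the kind of coarse permissions a principal might hold so that it can reach the Data Catalog and request data access; the action list is illustrative and should be adjusted to the services in use. The `request_succeeds` function is a toy model of the rule described above, not how either service is implemented.

```python
import json

# IAM side: coarse permissions that let the principal call the Data Catalog
# and request temporary data access. The action list is an illustrative
# assumption, not a complete or recommended policy.
iam_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "lakeformation:GetDataAccess",
                "glue:GetDatabase",
                "glue:GetTable",
                "glue:GetPartitions",
            ],
            "Resource": "*",
        }
    ],
}


def request_succeeds(iam_allows: bool, lake_formation_allows: bool) -> bool:
    """Toy model: a request passes only if both checks pass."""
    return iam_allows and lake_formation_allows


policy_json = json.dumps(iam_policy, indent=2)
```

In this model, a principal with a broad IAM policy but no Lake Formation grant on a table is still denied, and so is a principal with a grant but no IAM permission to call the APIs.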

What data granularity do Lake Formation permissions support?

Different users across your organization require different levels of access to your data. It’s important that users have access to the right data, and no more, to do their jobs. Lake Formation fine-grained access control (FGAC) lets you manage permissions at the column, row, and cell level. FGAC makes it easier to comply with increased business regulations, apply better data governance, and protect and manage consumers’ sensitive data.
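One way to express column, row, and cell restrictions in code is a data cells filter. The sketch below builds the request for the boto3 `create_data_cells_filter` call and a matching grant on the filter. The account ID, database, table, filter name, column names, and row expression are all hypothetical.

```python
def build_cell_filter(catalog_id, database, table, name, columns, row_expression):
    """Build keyword arguments for the CreateDataCellsFilter API."""
    return {
        "TableData": {
            "TableCatalogId": catalog_id,
            "DatabaseName": database,
            "TableName": table,
            "Name": name,
            "ColumnNames": list(columns),  # columns the filter exposes
            "RowFilter": {"FilterExpression": row_expression},  # rows it exposes
        }
    }


def build_filter_grant(principal_arn, filter_request):
    """Build a GrantPermissions request that grants SELECT through the filter."""
    data = filter_request["TableData"]
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "DataCellsFilter": {
                "TableCatalogId": data["TableCatalogId"],
                "DatabaseName": data["DatabaseName"],
                "TableName": data["TableName"],
                "Name": data["Name"],
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical values, for illustration only.
cell_filter = build_cell_filter(
    "111122223333", "sales_db", "orders", "eu_orders_no_pii",
    columns=["order_id", "amount", "region"],
    row_expression="region = 'EU'",
)
filter_grant = build_filter_grant(
    "arn:aws:iam::111122223333:role/AnalystRole", cell_filter
)
```

A principal granted SELECT through this filter would see only the three listed columns, and only for rows where `region` is `EU`. Sending the requests requires boto3, for example `boto3.client("lakeformation").create_data_cells_filter(**cell_filter)`.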

Security management and governance

How does Lake Formation help scale permissions across users?

Lake Formation makes it easier to scale permissions across your users. You can set attributes on data and apply permissions to those attributes rather than to individual resources. Lake Formation tag-based access control (LF-TBAC) uses attributes of the data to help keep permissions up to date as data changes. With LF-TBAC, administrators set the appropriate tags on their data, and existing policies then enforce the desired access to new data resources. This helps scale permissions management across a large number of AWS Glue Data Catalog resources, reducing management time and effort.
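The sketch below illustrates the LF-TBAC idea. `build_tag_policy_grant` builds a GrantPermissions request that targets a tag expression instead of a named table, and `matches` is a simplified local model of how a tagged resource is compared against that expression (every key in the expression must be present with one of the listed values). The tag keys, tag values, and role ARN are hypothetical, and the matcher is an illustration, not the service's implementation.

```python
def build_tag_policy_grant(principal_arn, expression, resource_type="TABLE"):
    """Build a GrantPermissions request scoped by LF-Tags rather than by name."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": resource_type,
                "Expression": [
                    {"TagKey": key, "TagValues": list(values)}
                    for key, values in expression.items()
                ],
            }
        },
        "Permissions": ["SELECT"],
    }


def matches(resource_tags, expression):
    """Simplified model: each expression key must match one allowed value."""
    return all(
        resource_tags.get(key) in values for key, values in expression.items()
    )


# Hypothetical tag expression: analysts may read public or internal data.
expression = {"classification": ["public", "internal"]}
tag_grant = build_tag_policy_grant(
    "arn:aws:iam::111122223333:role/AnalystRole", expression
)

# A table created later only needs the right tag to fall under the grant.
new_table_tags = {"classification": "internal", "domain": "sales"}
restricted_table_tags = {"classification": "restricted"}
new_table_visible = matches(new_table_tags, expression)
restricted_table_visible = matches(restricted_table_tags, expression)
```

The point of the design is that the grant is written once. Adding a table with `classification=internal` requires tagging the table (for example with the boto3 `add_lf_tags_to_resource` call), not writing a new grant.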

Secure data sharing

How does Lake Formation integrate with other AWS services?

AWS services such as Amazon Athena, AWS Glue, Amazon Redshift Spectrum, and Amazon EMR can use Lake Formation to securely access data in Amazon Simple Storage Service (Amazon S3) locations registered with Lake Formation. With Lake Formation, you can define and manage FGAC permissions for your data in the AWS Glue Data Catalog. Each of these AWS services is a trusted caller to Lake Formation, and Lake Formation provides access to data stored in Amazon S3 through temporary credentials. Amazon QuickSight and Amazon SageMaker Studio integrations are also supported when they use one of the supported engines.

For more information, see Lake Formation Permissions Management Workflow.
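As a rough sketch of the registration step mentioned above, the example below builds the request for the boto3 `register_resource` call, which registers an Amazon S3 location with Lake Formation. The bucket name and prefix are hypothetical, and the choice between the service-linked role and a custom role is an assumption to adapt to your setup.

```python
def build_registration(bucket, prefix="", role_arn=None):
    """Build keyword arguments for the Lake Formation RegisterResource API."""
    arn = f"arn:aws:s3:::{bucket}"
    if prefix:
        arn += "/" + prefix.strip("/")
    request = {"ResourceArn": arn}
    if role_arn:
        # Use a custom IAM role that Lake Formation assumes to reach the data.
        request["RoleArn"] = role_arn
    else:
        # Otherwise fall back to the Lake Formation service-linked role.
        request["UseServiceLinkedRole"] = True
    return request


def register(request):
    """Send the request. Requires boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder above works without boto3

    return boto3.client("lakeformation").register_resource(**request)


# Hypothetical bucket and prefix, for illustration only.
registration = build_registration("example-data-lake", "sales/")
```

Once a location is registered, integrated services obtain temporary credentials from Lake Formation for that location instead of relying on direct Amazon S3 permissions for each user.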

Data access monitoring and auditing

How can Lake Formation help monitor and audit data access?

Lake Formation provides comprehensive audit logs with AWS CloudTrail to monitor access and demonstrate compliance with centrally defined policies. You can audit the data access history of the analytics and ML services that read data in your data lake through Lake Formation. This lets you see which users or roles have attempted to access what data, with which services, and when. You can access audit logs the same way you access other CloudTrail logs, using the CloudTrail APIs and console. For more information about CloudTrail logs, see Logging AWS Lake Formation API Calls Using AWS CloudTrail.
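A minimal sketch of querying these logs through the CloudTrail API follows. It builds a request for the boto3 `lookup_events` call filtered to the Lake Formation event source and reduces the returned events to who did what and when. The sample event at the end is fabricated for illustration and only mimics the shape of a LookupEvents result.

```python
from datetime import datetime, timedelta, timezone


def build_lookup(hours=24, now=None):
    """Build keyword arguments for CloudTrail LookupEvents, limited to Lake Formation."""
    end = now or datetime.now(timezone.utc)
    return {
        "LookupAttributes": [
            {
                "AttributeKey": "EventSource",
                "AttributeValue": "lakeformation.amazonaws.com",
            }
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
    }


def summarize(events):
    """Reduce CloudTrail events to (time, user, event name) tuples."""
    return [
        (e["EventTime"].isoformat(), e.get("Username", "unknown"), e["EventName"])
        for e in events
    ]


def fetch(request):
    """Page through matching events. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without boto3

    paginator = boto3.client("cloudtrail").get_paginator("lookup_events")
    events = []
    for page in paginator.paginate(**request):
        events.extend(page["Events"])
    return events


# Fabricated sample event, shaped like a LookupEvents result entry.
sample_events = [
    {
        "EventTime": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
        "Username": "analyst",
        "EventName": "GetDataAccess",
    }
]
summary = summarize(sample_events)
```

With credentials configured, `summarize(fetch(build_lookup(hours=24)))` would list the Lake Formation calls recorded over the last day.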

AWS Lake Formation FAQs

General

What is AWS Lake Formation?

How does Lake Formation relate to AWS Glue and the AWS Glue Data Catalog?

How does Lake Formation integrate with Amazon DataZone?

Can I use third-party business intelligence tools with Lake Formation?

Do third-party tools integrate with Lake Formation?

Centralized permissions management

How does Lake Formation simplify and centralize permissions management?

How does Lake Formation work with AWS Identity and Access Management (IAM)?

What data granularity do Lake Formation permissions support?

Security management and governance

How does Lake Formation help scale permissions across users?

Secure data sharing

How does Lake Formation streamline data sharing?

What is the role of the AWS Glue Data Catalog in data sharing?

How does Lake Formation simplify external or business-to-business data sharing?

How does Lake Formation integrate with other AWS services?

Data access monitoring and auditing

How can Lake Formation help monitor and audit data access?
