
Overview
GenomeKit is Deep Genomics’ Python library for fast and easy access to genomic resources such as sequence, data tracks, and annotations. The goal is to let machine learning researchers build data sets easily, and to be creative about how those data sets are designed. Out of the box, GenomeKit provides access to pre-built optimized genomic data files that are required for its operation.
Features and programs
Open Data Sponsorship Program
Pricing
This is a publicly available data set. No subscription is required.
Delivery details
AWS Data Exchange (ADX)
AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.
Open data resources
Available with or without an AWS account.
- How to use: To access these resources, reference the Amazon Resource Name (ARN) using the AWS Command Line Interface (CLI).
- Description: Optimized data files required to query genomic data using GenomeKit (assemblies, annotations, etc.)
- Resource type: S3 bucket
- Amazon Resource Name (ARN): arn:aws:s3:::genomekit-public-dg
- AWS region: us-east-1
- AWS CLI access (no AWS account required): aws s3 ls --no-sign-request s3://genomekit-public-dg/
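The same anonymous listing can also be done without the AWS CLI. The sketch below is not part of GenomeKit or of this listing; it assumes the bucket answers unsigned HTTPS requests to the standard S3 ListObjectsV2 REST endpoint (which the `--no-sign-request` command suggests) and uses only the Python standard library. The function names `build_list_url`, `parse_listing`, and `list_bucket` are hypothetical helpers introduced for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 REST responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def build_list_url(bucket, region, prefix=""):
    """Build an unsigned ListObjectsV2 URL (virtual-hosted style)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "delimiter": "/", "prefix": prefix}
    )
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (sub-prefixes, [(key, size_in_bytes), ...]) from a listing response."""
    root = ET.fromstring(xml_text)
    prefixes = [el.text for el in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    objects = [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]
    return prefixes, objects


def list_bucket(bucket, region, prefix=""):
    """Fetch and parse the first page of results (S3 returns at most 1000 keys per page)."""
    url = build_list_url(bucket, region, prefix)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_listing(response.read())


# Example call (requires network access; assumes anonymous listing is permitted):
# prefixes, objects = list_bucket("genomekit-public-dg", "us-east-1")
```

This sketch reads only the first page of results; a complete listing would follow the continuation token in the response, which the AWS CLI handles automatically.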
Support
Managed By
Deep Genomics
How to cite
GenomeKit genomic data was accessed on DATE from https://registry.opendata.aws/genomekit.
License
Apache License, Version 2.0: https://www.apache.org/licenses/LICENSE-2.0