Overview
Quilt is a scientific data management system (SDMS) that integrates large heterogeneous datasets from instruments, pipelines, and analyses into beautiful, trustworthy data products in your data lake in Amazon S3. Quilt runs privately and securely in your AWS account as a CloudFormation stack. Quilt is powered by scalable and secure services like Amazon S3, Amazon OpenSearch, and Amazon Athena.
For more info, see: https://quiltdata.com/aws-marketplace
Highlights
- Link large instrument and data pipeline outputs to any notebook, ELN, or lab information management system (LIMS) with immutable URLs.
- Find, document, and understand all of your data in a central catalog.
- Confidently capture data, metadata, and documentation in immutable collections, known as Quilt packages.
Details
Pricing
Free trial
| Dimension | Description | Cost/unit/hour |
|---|---|---|
| Hours | Container Hours | $2.60 |
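As a rough illustration of the rate above, the following sketch multiplies the listed hourly rate by a number of container hours. It assumes the charge is simply rate times hours; the listing does not say how container hours are metered or how many containers a deployment runs, and the function name `estimated_cost` is hypothetical. AWS infrastructure charges are not included.

```python
from decimal import Decimal

# Hourly rate taken from the pricing table above (USD per container hour).
HOURLY_RATE_USD = Decimal("2.60")


def estimated_cost(container_hours: int) -> Decimal:
    """Return rate * hours, rounded to cents.

    Assumption: billing is linear in container hours. The listing does not
    describe metering details, so treat the result as an estimate only.
    """
    return (HOURLY_RATE_USD * Decimal(container_hours)).quantize(Decimal("0.01"))


# Example: one container running continuously for a 30-day month.
print(estimated_cost(24 * 30))
```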
Vendor refund policy
For refund information please contact us at: support@quiltdata.io
Delivery details
Quilt Business
- Amazon ECS
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
This release provides AI/ML capabilities, checksum and search updates, and security updates.
Additional details
Usage instructions
Before launching Quilt Business: (1) in AWS Certificate Manager, request or import an SSL certificate for your region of choice and an SSL certificate in us-east-1, and (2) locate the credentials needed to update DNS records in your domain.
Use CloudFormation to create a stack from this template: s3://quilt-marketplace/quilt-business-454c9d5.yaml. During the configuration process, you'll need to choose and enter the following parameters:
- (1-3) credentials for the first admin account
- (8-10) bucket title, icon URL, and description
- (11) SSL certificate ARN for the web catalog (us-east-1)
- (12) SSL certificate ARN for the ELB (installation region)
- (13) Quilt web host (the domain name for Quilt on your company's domain)
- (14) a password for the auth database
- (15) name of the S3 bucket to be connected to Quilt
- (17) name of a new bucket to be created to hold configuration files
- (18) whether or not to create default roles to read and write the bucket
Click Next, then Create. Once the stack is complete, open the Outputs tab and find CloudFrontDomain, LoadBalancerDNSName, and RegistryHost. These values still need to be mapped to user-facing URLs via DNS.
Go to your DNS service and create two CNAME records: one pointing (QuiltWebHost) to the CloudFrontDomain, and one pointing (RegistryHost) to the LoadBalancerDNSName.
See: https://quiltdocs.gitbook.io/t4/references/technical-reference#deploying-the-t4-catalog-on-aws
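The DNS mapping described above can be sketched as a small helper that turns the stack outputs into the two CNAME records to create. This is a minimal illustration, not part of Quilt or AWS tooling: the `CnameRecord` type, the `cname_records` function, and all hostnames in the example are hypothetical placeholders. Only the output names CloudFrontDomain, LoadBalancerDNSName, and RegistryHost come from the instructions.

```python
from typing import Dict, List, NamedTuple


class CnameRecord(NamedTuple):
    name: str    # user-facing hostname on your domain
    target: str  # AWS hostname the record should point to


def cname_records(quilt_web_host: str, outputs: Dict[str, str]) -> List[CnameRecord]:
    """Build the two CNAME records from the stack's Outputs tab values.

    quilt_web_host is the value entered for the Quilt web host parameter;
    outputs maps output names to values copied from the Outputs tab.
    """
    required = ("CloudFrontDomain", "LoadBalancerDNSName", "RegistryHost")
    missing = [key for key in required if key not in outputs]
    if missing:
        raise KeyError(f"missing stack outputs: {', '.join(missing)}")
    return [
        # Quilt web host -> CloudFront distribution serving the catalog
        CnameRecord(quilt_web_host, outputs["CloudFrontDomain"]),
        # Registry host -> load balancer in front of the registry
        CnameRecord(outputs["RegistryHost"], outputs["LoadBalancerDNSName"]),
    ]


# Placeholder values for illustration only; use the real values from your stack.
example_outputs = {
    "CloudFrontDomain": "d111111abcdef8.cloudfront.net",
    "LoadBalancerDNSName": "example-elb.us-east-1.elb.amazonaws.com",
    "RegistryHost": "registry.example.com",
}

for record in cname_records("quilt.example.com", example_outputs):
    print(f"{record.name} CNAME {record.target}")
```

The records themselves still have to be created in whichever DNS service hosts your domain; the helper only makes explicit which hostname points where.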
Support
Vendor support
Please find us by chat or email at https://quiltdata.com/
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
Customer reviews
The data hub I've been looking for
Quilt allows my team at DSP Concepts to focus on solving customer problems instead of data versioning problems. It is well established at this point that data quality is the foundation of serious, well-performing data teams. Organization is key to building and retaining value in high-quality data over time. Quilt solves this problem head-on, giving you a reliable single source of data truth with a suite of features to inspect the data and its documentation easily. This in turn allows each member of my team to find data in a self-serve fashion without having to rely on institutional knowledge that only one team member, if any, might have, depending on how long ago the data was collected, cleaned, processed, and labelled.
A necessity for every data-driven company
Quilt is an indispensable tool for anyone who wants to properly manage their data in AWS. A key element of Quilt is that the programmatic interface is intuitive and flexible, offering multiple ways to integrate it into the data analysis workflow (Python, R, command line). Because only a handful of Quilt functions provide the majority of the core functionality, there is not an overwhelming learning curve to get started, yet many additional features improve usability (e.g., reading data directly into memory, single-file installation). Beyond the programmatic functionality, the Quilt web-based interface is extremely useful for browsing files and packages and switching between the different versions. I would highly recommend integrating Quilt into your data science workflow.
Missing tool in Data Science pipeline
Quilt simplified our data maintenance and versioning flow. It is now extremely easy to keep track of changes in a dataset and to refer to a specific revision reproducibly, without worrying that someone will overwrite the data. We have already integrated it into our flow, so dataset updates no longer interfere with model building.
The Quilt team provides us with ongoing support. Bugs happen in all software, but when we found a small bug, we received a fix in no time and could smoothly continue our work.
We spotted some drawbacks in Quilt Teams some time ago. These are mostly resolved here, and the remaining "wishes" are on the roadmap. It's really nice that the devs listen to our needs!
What we love most about Quilt is the caching feature. We reduced data transfer costs while keeping our scripts simple.
Our overall grade is 5/5, since this tool was sorely missing from our machine learning flow. We now also use it for versioning models (especially since we generate models in a bunch of formats each time) and Jupyter Notebooks (for which Git isn't the best option).