Overview

THE PROBLEM

Setting up a data lakehouse shouldn't take a week. But if you've tried to configure Apache Iceberg with Hive Metastore, or wrestled with AWS Glue catalogs, you know the reality: days of YAML files, IAM policies, Spark clusters, and debugging before you can write your first query. For large enterprises with dedicated platform teams, that's fine. For a 5-person startup or a solo data engineer, it's a blocker.

THE SOLUTION

DuckLake Container gives you a fully functional data lakehouse in one container. PostgreSQL stores your catalog metadata. S3 stores your data as Parquet files. You connect from any DuckDB client and start querying. Time to first query: 5 minutes. No Spark. No Hive. No Glue. No YAML sprawl.

HOW IT WORKS
- Subscribe and launch the container on ECS, EKS, or Fargate
- Point it at your S3 bucket via environment variables or Secrets Manager
- Attach an IAM role with S3 permissions
- Connect from any DuckDB client and start querying
Your data lives in S3 as standard Parquet files. The catalog lives in PostgreSQL. Both stay in your VPC; we never see your data.

WHAT YOU GET

Lakehouse capabilities: ACID transactions on S3, time travel queries, schema evolution, and partition pruning for fast queries on large datasets.

Operational simplicity: Single container deployment with no external dependencies beyond S3. Automatic IAM credential handling: no access keys to manage. Configuration via environment variables or AWS Secrets Manager.

Flexible connectivity: Connect from Python, Go, Rust, Java, or any language with DuckDB bindings. Works with DataGrip, DBeaver, or any SQL tool. Multiple concurrent clients supported.

WHAT'S INSIDE
- PostgreSQL 17 for catalog storage (table schemas, snapshots, partitions)
- DuckDB 1.3 with the ducklake, postgres, and httpfs extensions pre-configured
- Python init script that validates S3 access and configures the catalog
- Built-in container health monitoring
WHO THIS IS FOR

Small data teams (1-10 people) who need lakehouse capabilities but don't have time to configure and maintain Iceberg, Delta Lake, or Glue.

Startups who want to start simple and scale later. Your data is in standard Parquet format; if you outgrow this solution, migrate to Iceberg or Databricks without rewriting everything.

Developers building analytics features who need a quick way to prototype lakehouse functionality.

Data engineers who are tired of complexity. If you've ever spent a week debugging Hive Metastore or Glue permissions, this is for you.

WHO THIS IS NOT FOR

Large enterprises already invested in Databricks, Snowflake, or a mature Iceberg deployment. This is a lightweight solution; it won't replace your existing data platform.

High-concurrency production workloads with hundreds of concurrent writers. The PostgreSQL catalog handles moderate concurrency, but this isn't designed for massive scale.

SECURITY

Runs in your VPC; your data never leaves your AWS account. Uses IAM-based authentication with task roles, no hardcoded credentials. Works with private S3 buckets. Supports AWS Secrets Manager for secure configuration.

PRICING

$0.05 per hour, billed per second with a 1-minute minimum. Running 4 hours/day costs approximately $6/month. Running 8 hours/day costs approximately $12/month. Running 24/7 costs approximately $36/month. Plus standard AWS costs for ECS/EKS compute, S3 storage, and data transfer. No annual commitment. No minimum spend. Stop the container, stop paying.

GETTING STARTED
- Subscribe to this listing
- Create an S3 bucket for your data
- Create an IAM role with S3 read/write permissions
- Launch the container with your configuration
- Connect and query
Documentation: https://docs.lokryn.com/ducklake

ABOUT LOKRYN

Lokryn builds infrastructure tools for small dev teams: the kind of stuff that usually costs $10K/month from enterprise vendors or takes weeks to configure from open source. Simple. Transparent pricing. Runs in your cloud.
Highlights
- 5-minute setup. One container gives you a fully functional data lakehouse: PostgreSQL for the catalog, S3 for your data as Parquet files. No Spark clusters. No Hive Metastore. No Glue configuration. Connect from any DuckDB client (Python, Go, etc.) and start querying immediately.
- Your data never leaves your VPC. The container runs in your AWS account, your data stays in your S3 bucket, and we never see any of it. IAM role authentication: no access keys to manage. Works with private buckets and AWS Secrets Manager.
- $0.05/hour with no minimum commitment. Running 8 hours a day costs about $12/month. Compare that to weeks of engineering time configuring open source alternatives, or $500+/month for managed lakehouse services. Stop the container, stop paying.
Details
Pricing
| Dimension | Description | Cost/unit/hour |
|---|---|---|
| Hours | Container Hours | $0.05 |
Vendor refund policy
DuckLake Container is billed at $0.05 per hour, prorated to the second. You only pay for what you use. Due to the low hourly cost and usage-based billing model, we do not offer refunds for consumed usage. If you experience technical issues that prevent the software from functioning, contact support@lokryn.com and we will work with you to resolve the issue or discuss credit on a case-by-case basis. For billing disputes related to AWS infrastructure charges, contact AWS directly.
Delivery details
DuckLake Container
- Amazon ECS
- Amazon EKS
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Summary: Refactored into a modular package structure and added automatic AWS region detection for simpler deployments.

What's New:
- Auto Region Detection: The container now automatically detects the AWS region from EC2 instance metadata (IMDSv2) or ECS task metadata. No need to set AWS_REGION in most deployments.
- Simplified Secrets Manager Config: aws_region is now optional in the secret. Required fields are just: postgres_user, postgres_password, postgres_db, s3_bucket_url.
- Environment-Driven Marketplace Metering: MARKETPLACE_PRODUCT_CODE now reads from an environment variable. It is empty by default for local/dev mode; set it in production task definitions.
Technical Changes:
Refactored monolithic init_ducklake.py into modular package:
- config.py - Region detection and configuration loading
- postgres.py - PostgreSQL initialization and management
- marketplace.py - AWS Marketplace metering
- s3.py - S3 access validation
- ducklake_setup.py - DuckLake catalog configuration
- constants.py - Shared constants and defaults
Region detection order: EC2 metadata → ECS metadata → AWS_REGION env var → us-east-1 default.
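For reference, a minimal sketch of that fallback chain in Python. This is not the shipped config.py: the metadata endpoints are the standard AWS ones, and deriving the region from the task's AvailabilityZone is an assumption.

```python
import json
import os
import urllib.request

METADATA_HOST = "http://169.254.169.254"

def detect_region(timeout: float = 1.0) -> str:
    """Resolve the AWS region using the documented fallback order:
    EC2 IMDSv2 -> ECS task metadata -> AWS_REGION env var -> us-east-1."""
    # 1. EC2 instance metadata (IMDSv2 requires fetching a session token first).
    try:
        token_req = urllib.request.Request(
            f"{METADATA_HOST}/latest/api/token",
            method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        )
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        region_req = urllib.request.Request(
            f"{METADATA_HOST}/latest/meta-data/placement/region",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(region_req, timeout=timeout) as resp:
            return resp.read().decode()
    except OSError:
        pass  # not on EC2, or IMDS unreachable from the container
    # 2. ECS task metadata (the endpoint URI is injected by the ECS agent).
    metadata_uri = os.environ.get("ECS_CONTAINER_METADATA_URI_V4")
    if metadata_uri:
        try:
            with urllib.request.urlopen(f"{metadata_uri}/task", timeout=timeout) as resp:
                az = json.load(resp).get("AvailabilityZone", "")
            if az:
                return az[:-1]  # "us-east-1a" -> "us-east-1"
        except (OSError, ValueError):
            pass
    # 3. Explicit env var, then the documented default.
    return os.environ.get("AWS_REGION", "us-east-1")
```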
Upgrade Notes:

No breaking changes. Existing Secrets Manager configs and task definitions will continue to work. For new deployments, you can remove AWS_REGION from env vars if running on EC2 or ECS (the region is auto-detected).
Additional details
Usage instructions
QUICK START
- Subscribe to this listing and note your container image URI
- Create an S3 bucket for your lakehouse data
- Create an IAM role with the required permissions (see below)
- Deploy the container to ECS or EKS
- Connect from any DuckDB client
STEP 1: CREATE S3 BUCKET
aws s3 mb s3://your-company-lakehouse --region us-east-1
STEP 2: CREATE IAM ROLE
Your ECS task role or EKS service account needs these permissions:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"], "Resource": ["arn:aws:s3:::your-bucket", "arn:aws:s3:::your-bucket/"] }, { "Effect": "Allow", "Action": ["aws-marketplace:RegisterUsage"], "Resource": "" } ] }
Optional: Add secretsmanager:GetSecretValue if using Secrets Manager for configuration.
STEP 3: STORE CONFIGURATION
Option A - Secrets Manager (recommended):
```bash
aws secretsmanager create-secret \
  --name lokryn/ducklake/config \
  --secret-string '{
    "POSTGRES_USER": "admin",
    "POSTGRES_PASSWORD": "your-secure-password",
    "POSTGRES_DB": "lakehouse",
    "S3_BUCKET_URL": "s3://your-bucket/data/",
    "AWS_REGION": "us-east-1"
  }'
```
Option B - Environment variables:
Pass POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB, S3_BUCKET_URL, and AWS_REGION directly in your task definition.
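If you go with Option A and want to verify the secret from your own tooling, here is a short boto3 sketch. It is illustrative only; the secret name and keys follow the example above, and region_name is a placeholder.

```python
import json

import boto3

# Fetch the DuckLake configuration secret created in Step 3 (Option A).
client = boto3.client("secretsmanager", region_name="us-east-1")
resp = client.get_secret_value(SecretId="lokryn/ducklake/config")
config = json.loads(resp["SecretString"])

# The keys mirror the environment-variable names from Option B.
print(config["POSTGRES_DB"], config["S3_BUCKET_URL"])
```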
STEP 4: DEPLOY TO ECS
Create an ECS service with:
- Container image: Use the image URI from your subscription
- Port mapping: 5432
- Task role: The IAM role from Step 2
- Memory: 1024 MB minimum recommended
- Health check: pg_isready -U admin -d lakehouse
For persistent catalog storage, mount an EFS volume to /var/lib/postgresql/data
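For scripted deployments, a hedged boto3 sketch of a matching task definition follows. The account ID, role ARNs, family name, and Fargate sizing are placeholders, not prescribed values; this is not an official task definition.

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# Values mirror the checklist above; ARNs and the image URI are placeholders.
ecs.register_task_definition(
    family="ducklake",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="512",
    memory="1024",  # 1024 MB minimum recommended
    taskRoleArn="arn:aws:iam::123456789012:role/ducklake-task-role",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[
        {
            "name": "ducklake",
            "image": "YOUR_MARKETPLACE_IMAGE_URI",
            "portMappings": [{"containerPort": 5432, "protocol": "tcp"}],
            "healthCheck": {
                "command": ["CMD-SHELL", "pg_isready -U admin -d lakehouse"],
                "interval": 30,
                "retries": 3,
            },
        }
    ],
)
```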
STEP 5: CONNECT
From Python:
```python
import duckdb

conn = duckdb.connect()
conn.execute("INSTALL ducklake; LOAD ducklake;")
# Attach the DuckLake catalog via the container's PostgreSQL endpoint.
conn.execute(
    "ATTACH 'ducklake:postgres:host=YOUR_ECS_HOST port=5432 "
    "user=admin password=YOUR_PASSWORD dbname=lakehouse' AS lake"
)
rows = conn.execute("SELECT * FROM lake.your_table").fetchall()
```
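Once attached, lake tables behave like ordinary SQL tables. As a quick illustration of the lakehouse features from the overview: the events table is hypothetical, and the AT (VERSION => ...) clause is DuckLake's time-travel syntax; check the DuckLake docs for the exact form in your version.

```python
# Continuing with the conn and attached lake from the snippet above.
# Create a table in the lake: data lands in S3 as Parquet, metadata in PostgreSQL.
conn.execute("CREATE TABLE lake.events (id INTEGER, payload VARCHAR)")
conn.execute("INSERT INTO lake.events VALUES (1, 'first'), (2, 'second')")

# Every committed change creates a new catalog snapshot.
conn.execute("INSERT INTO lake.events VALUES (3, 'third')")

# Time travel: read the table as of an earlier snapshot version.
old_rows = conn.execute("SELECT * FROM lake.events AT (VERSION => 2)").fetchall()
print(old_rows)  # rows as of snapshot 2 (exact numbering depends on prior commits)
```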
From DuckDB CLI:
ATTACH 'ducklake:postgres:host=YOUR_ECS_HOST port=5432 user=admin password=YOUR_PASSWORD dbname=lakehouse' AS lake;
TROUBLESHOOTING
"CustomerNotSubscribedException" - Verify your AWS account has an active subscription to this product.
"Access Denied on S3" - Check that your task role has the S3 permissions listed above.
"Connection refused" - Verify the container is running and security groups allow port 5432.
DOCUMENTATION
Full guides, ECS task definitions, and examples: https://docs.lokryn.com/ducklake
SUPPORT
Email: support@lokryn.com
Discord: https://discord.gg/ceTWDx97
Support
Vendor support
SUPPORT CHANNELS

Email: support@lokryn.com
Community Discord: https://discord.gg/ceTWDx97

WHAT'S INCLUDED

All DuckLake Container customers receive:
- Documentation at docs.lokryn.com/ducklake
- Community support via Discord
- Email support for bug reports and technical issues
- Response within 2 business days for email inquiries
Our Discord community is the fastest way to get help. Ask questions, share how you're using DuckLake, and connect with other users.

DOCUMENTATION COVERS
- IAM policy setup and permissions
- ECS and EKS deployment guides
- Connecting from Python, Go, and SQL tools
- Troubleshooting common issues
- Configuration via environment variables and Secrets Manager
PAID SUPPORT PLANS

For teams needing guaranteed response times or hands-on assistance, we offer paid support contracts starting at $199/month. Plans include faster SLAs, direct Slack access, and scheduled calls with our engineering team. Contact support@lokryn.com for details.

FEATURE REQUESTS AND BUGS

We actively maintain this product. If you find a bug or have a feature request, email us or post in Discord. We read everything.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.