Sold by: Chestnut Hill Technologies
Migration of client's existing Data Warehouse from SQL Server to AWS/Redshift.
Overview
- Our Client needed to migrate an existing data warehouse from SQL Server to AWS/Redshift. CHT and Client collaborated as follows: Client would help develop the initial platform infrastructure for deploying and integrating the ETL, which was developed to migrate data from the SQL Server data sources to the S3 buckets making up Client’s data lake. The data lake was accessed through Redshift external tables.
- Roy Helander designed the initial CHT platform. The platform was developed to support continuous integration and continuous delivery (CI/CD). This involved developing AWS Lambda functions, written in Python 3.x, to process the stored procedures created in Redshift and the AWS Glue jobs, which were also written in Python.
- The CHT platform also used CloudFormation stack files so that, during deployment, developers could create the multiple AWS/Redshift objects required for our jobs.
- Another part of the CHT platform is a PostgreSQL database called the Orchestration Database. All jobs are defined in its tables, which describe the connection strings our Glue jobs use to extract from or write to external databases. The connection strings are referenced by shortcuts and stored in Secrets Manager.
- As the CHT platform was enhanced, more and more of the deployment process was handled by the platform. The AWS scheduler was replaced with a Lambda function that runs every minute to look for jobs to execute. Source control was migrated from Bitbucket to CodeCommit. Rick Karpel worked with Client to set up the hierarchy of the CodeCommit repository. By moving to CodeCommit, Client was able to enhance the deployment process, using AWS access and roles to control who can migrate to which sets of branches.
- For the rest of the project, CHT migrated ETL packages written for SSIS. The migration process was to determine the original source files and then create a source-to-target mapping that shows every column conversion and transformation from source to target, identifies columns containing personally identifiable information (PII), and comments on all major columns. These mappings were submitted to the Data Architect for approval. Once approval was received, a new repository for the new System of Record (SOR) and a branch for the ticket were created. If the repository already existed, only the new branch was created.
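As an illustration of the source-to-target mapping step described above, the following is a minimal Python sketch of how such a mapping could be represented and checked before it is sent for approval. It is not the mapping format CHT used. The `ColumnMapping` class, the helper functions, and every table and column name are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ColumnMapping:
    """One row of a hypothetical source-to-target mapping (STM)."""
    source_column: str
    target_column: str
    target_type: str                      # assumed target (e.g. Redshift) type name
    transform: Optional[Callable] = None  # None means the value is copied as-is
    is_pii: bool = False                  # flags personally identifiable information
    comment: str = ""


# Illustrative mapping only; these column names are made up for the sketch.
CUSTOMER_STM = [
    ColumnMapping("CustID", "customer_id", "INTEGER", int,
                  comment="Business key"),
    ColumnMapping("EmailAddr", "email", "VARCHAR(256)", str.lower,
                  is_pii=True, comment="Contact email"),
    ColumnMapping("Created", "created_date", "DATE", lambda v: v[:10],
                  comment="Date part of the source timestamp"),
]


def apply_mapping(row, mapping):
    """Rename and convert one source row according to the mapping."""
    out = {}
    for m in mapping:
        value = row[m.source_column]
        if value is not None and m.transform is not None:
            value = m.transform(value)
        out[m.target_column] = value
    return out


def pii_columns(mapping):
    """Target columns flagged as containing PII."""
    return [m.target_column for m in mapping if m.is_pii]


def uncommented_columns(mapping):
    """Target columns still missing a comment, to fix before review."""
    return [m.target_column for m in mapping if not m.comment.strip()]
```

Calling `apply_mapping` on a source row returns a dictionary keyed by target column names, while `pii_columns` and `uncommented_columns` give a reviewer a quick list of sensitive or undocumented columns.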
Highlights
- 3. Orchestration Database Configuration
  - Creates the batches that group the jobs
  - Creates the jobs that define the job execution
  - Creates the job hierarchy that determines job precedence
- 4. Pipeline
  - Bash script that identifies the environment (dev, qa, or prod)
  - Passes the SOR
  - Passes the repository branch
  - Passes a manifest file name that contains all the files to be executed by the deployment process
  - A YAML file containing the definition of a stack and pipeline for the CodeCommit server to update the AWS environment that was passed
- 5. Documentation
  - Deployment Instructions
  - Runbook
  - Test Validation
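To make the idea of a job hierarchy that determines precedence more concrete, here is a small Python sketch that orders jobs by their dependencies using the standard-library `graphlib` module (Python 3.9+). It does not reproduce the Orchestration Database schema, which this listing does not describe. The job names, the `JOB_PREDECESSORS` dictionary, and the `execution_waves` function are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical hierarchy: each job maps to the set of jobs that must finish first.
JOB_PREDECESSORS = {
    "extract_orders": set(),
    "extract_customers": set(),
    "load_orders": {"extract_orders"},
    "load_customers": {"extract_customers"},
    "build_sales_mart": {"load_orders", "load_customers"},
}


def execution_waves(predecessors):
    """Group jobs into waves; every job in a wave has all predecessors done."""
    sorter = TopologicalSorter(predecessors)
    sorter.prepare()  # raises graphlib.CycleError if the hierarchy has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves
```

With the sample hierarchy, the two extract jobs form the first wave, the two load jobs the second, and `build_sales_mart` the last. Jobs within a wave have no dependency on each other, so a scheduler could start them together.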
Details
Pricing
Custom pricing options
Pricing is based on your specific requirements and eligibility. To get a custom quote for your needs, request a private offer.
Support
Vendor support
Contact Our Sales Team: Tel: 954-928-8221 Email: Sales@chtus.net Web: