AWS Database Blog

Empowering the role of the cloud database engineer

Automation has been both an adjustment and a gift to traditional database administrators (DBAs). Most traditional responsibilities of a DBA involve provisioning, access control, maintenance, monitoring, high availability, and backup/restore. In Part 1 of our series, we talked about how that role evolved to focus less on platform and more on applications. In Part 2, we discussed how, as fleets move to the cloud, there are tools and code that can perform those tasks more reliably and efficiently. This is still true today. What has changed is that tooling and instrumentation have evolved to make routine DBA tasks even easier, to the point that the role is evolving once again. As large enterprise customers migrate their databases to the cloud at scale, we are seeing these database roles morph into a new role: the cloud database engineer.

Customers managing their databases on premises previously maintained separation of duties through roles like database developer, DBA, and DevOps. Because each role was responsible for only a portion of the whole process, there could be tension and bottlenecks leading to project delays or misunderstandings. When implementing and maintaining a database is all manual, having multiple roles is a way to divide up work. In the cloud, the undifferentiated heavy lifting is handled by managed services and infrastructure as code, making room for unprecedented innovation. This shift also ushers in the opportunity for a dedicated leader to oversee end-to-end database project delivery, thereby streamlining processes and ensuring accountability. The cloud database engineer is that dedicated leader in supporting business-critical applications and services. The role combines database administration, software engineering, architecture, and cloud infrastructure with database design, management, and optimization. The cloud database engineer is critical for ensuring the reliability, performance, and security of cloud-based applications and services. In this role, you are responsible not only for designing, implementing, and maintaining cloud-based databases, but also for network segmentation, compliance, high availability and disaster recovery, inventory management, risk assessment, and governance.

Meeting the new responsibilities of the cloud database engineer is an exciting journey that blends your existing database expertise with new skills. You will still rely on your knowledge of database technologies, and you gain an opportunity to expand your scope beyond relational to key-value, in-memory, document, graph, time series, wide-column, and data warehousing. You will become proficient in AWS services, infrastructure as code, DevOps practices, security, networking, and scripting languages. Your new skills can help you navigate and thrive in a cloud-centric database environment.

In this post, we discuss how, as a database engineer, you can use AWS services to automate the undifferentiated heavy lifting of day-to-day database management using managed services, automation, and tools. You can gain efficiency, thereby streamlining communication and decision-making. Fewer people are needed to collaborate or approve changes. For databases in the cloud, the buck stops with you, the database engineer, which can speed up project completion times. You can cover a variety of databases such as Amazon Aurora, Amazon Relational Database Service (Amazon RDS), Amazon DynamoDB, Amazon ElastiCache, and Amazon DocumentDB (with MongoDB compatibility), which allows the organization to adapt quickly to technology changes or new products. With the basics automated, the new database engineer role allows you to stretch beyond the traditional confines of your role.

How can AWS help you?

AWS has a toolkit of services primed to revolutionize your work, amplify your productivity, and redefine the boundaries of what’s possible. With managed database services like Aurora, DynamoDB, Amazon MemoryDB for Redis, and Amazon Neptune, much of the heavy lifting is handled automatically. These services are your dependable allies, deftly managing administrative chores such as provisioning, software patching, setup, configuration, and backups. This means you are free to focus on more impactful tasks that bring value to your role and the business. Embrace the freedom to innovate, to strategize, to engineer. That is the true essence of being a database engineer in the cloud.

Fundamentals: AWS database services

AWS provides a number of managed database services that are optimized for specific workloads and use cases, and can help you achieve better performance, scalability, and reliability for your applications. As the data landscape grows increasingly complex, a one-size-fits-all approach to databases often falls short. Purpose-built databases are tailored for specific data models or workload types like key-value, document, time series, in-memory, or graph. Each one is designed from the ground up and built to scale easily when handling large volumes of data or sudden increases in workload demand. The following overview lists AWS managed database services and gives example use cases for each.

Aurora is a relational database service suited to mission-critical applications that need high availability and high durability. Amazon RDS is a relational database service used for traditional applications, customer relationship management (CRM), and some ecommerce applications. DynamoDB is a key-value store commonly used for high-traffic web applications, web-scale ecommerce systems, and gaming applications. ElastiCache is a fully managed in-memory database service for caching and session management. MemoryDB is a durable in-memory database with microsecond reads and low single-digit millisecond writes, used for gaming leaderboards and geospatial applications. DocumentDB specializes in JSON document storage and is used for content management, catalogs, and user profiles. Amazon Keyspaces is a wide-column store for high-scale industrial applications such as equipment maintenance, fleet management, and route optimization. Neptune is a fully managed graph database suited to fraud detection, social networking, and recommendation engines. Amazon Timestream is purpose built for time series data to accommodate use cases such as DevOps, industrial telemetry, and Internet of Things (IoT). Amazon Redshift is a petabyte-scale data warehouse service in the cloud that lets you access and analyze data for business intelligence, data integration, and analytics.

Enhanced proficiency: Harnessing advanced tools to simplify daily operations

Embracing the shift to the role of a database engineer in the cloud involves not just using the right database for the workload, but also harnessing a suite of advanced tools to simplify, automate, and enhance daily operations. Consider a common task like database provisioning. In a traditional setting, this could take hours or even days as you request hardware, install software, and configure settings. In the cloud, with a few lines of code, you can provision a fully managed database in minutes. You can also simply configure high availability, backups, and automatic scaling, reducing operational overhead and freeing you to focus on more strategic tasks. AWS offers a powerful lineup of services and features that take routine tasks to the next level, optimizing everything from database provisioning and access administration to maintenance, backup, and high availability. With AWS’s robust offerings, routine tasks become strategic operations that enhance security and efficiency. AWS is redefining the landscape for database engineers and propelling the role into the future of cloud computing.

Database provisioning

Infrastructure as code (IaC) allows you to automate the process of provisioning and managing infrastructure at scale using code. IaC simplifies the process of editing and distributing configurations by generating configuration files that encapsulate your infrastructure specifications. AWS Cloud Development Kit (CDK) allows you to create, update, and delete AWS resources using familiar programming languages. With AWS CDK, you can create and manage an entire stack of AWS resources, including databases, in an automated and repeatable manner.
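To make the idea of a configuration file that encapsulates your infrastructure specification concrete, here is a minimal sketch. AWS CDK synthesizes AWS CloudFormation templates from your code; the script below builds a comparable template by hand using only the Python standard library, so it is an illustration of the output rather than of the CDK API itself. The helper name build_rds_template, the logical ID AppDatabase, the admin user name, and the sizing values are all assumptions chosen for the example.

```python
import json


def build_rds_template(db_name: str,
                       instance_class: str = "db.t3.medium",
                       multi_az: bool = True,
                       backup_days: int = 7) -> dict:
    """Return a CloudFormation template (as a dict) describing one RDS instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Illustrative RDS instance for {db_name}",
        "Resources": {
            "AppDatabase": {  # hypothetical logical ID
                "Type": "AWS::RDS::DBInstance",
                "DeletionPolicy": "Snapshot",  # keep a final snapshot on delete
                "Properties": {
                    "DBName": db_name,
                    "Engine": "postgres",
                    "DBInstanceClass": instance_class,
                    "AllocatedStorage": "100",
                    "MasterUsername": "dbadmin",       # assumed admin name
                    "ManageMasterUserPassword": True,  # password stored in Secrets Manager
                    "MultiAZ": multi_az,
                    "BackupRetentionPeriod": backup_days,
                    "StorageEncrypted": True,
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the template so it can be reviewed and kept in version control.
    print(json.dumps(build_rds_template("orders"), indent=2))
```

Because the template is plain data, the same function can produce consistent definitions for many databases, and each change can be reviewed like any other code change.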

You can also use the AWS Command Line Interface (AWS CLI) or AWS Software Development Kits (SDKs) to automate the process of creating and managing databases in AWS. The AWS CLI is a unified tool for managing your AWS services from the command line. You can use it across your fleet without writing a full-fledged program. If you automate your work with scripting languages like Bash or Python, it’s easy to call AWS CLI commands directly from your scripts. The beauty of the AWS SDK is that it supports multiple programming languages (JavaScript, Python, PHP, .NET, Ruby, Java, Go, and Node.js) and can be embedded in your application code and incorporated into robust, production-grade applications. Compared with shell scripting around the AWS CLI, the AWS SDK gives you structured error handling and convenient access to more advanced features.
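As a rough sketch of the SDK approach, the following Python example separates building the request from sending it. The provision function assumes the AWS SDK for Python (boto3) is installed and credentials are configured; it uses the RDS create_db_instance call and the db_instance_available waiter. The function names, the admin user name, the default Region, and the sizing defaults are assumptions made for illustration.

```python
from typing import Optional


def build_create_params(identifier: str,
                        engine: str = "mysql",
                        instance_class: str = "db.t3.medium",
                        storage_gib: int = 50,
                        tags: Optional[dict] = None) -> dict:
    """Build keyword arguments for the RDS create_db_instance call."""
    if not identifier or not identifier[0].isalpha():
        raise ValueError("identifier must start with a letter")
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,
        "DBInstanceClass": instance_class,
        "AllocatedStorage": storage_gib,
        "MasterUsername": "dbadmin",       # assumed admin name
        "ManageMasterUserPassword": True,  # password kept in Secrets Manager
        "MultiAZ": True,
        "BackupRetentionPeriod": 7,
        "StorageEncrypted": True,
        "Tags": [{"Key": k, "Value": v}
                 for k, v in sorted((tags or {}).items())],
    }


def provision(params: dict, region: str = "us-east-1") -> None:
    """Create the instance and wait until it is available.

    Requires boto3 and AWS credentials; imported lazily so the
    parameter builder above can be used and tested without them.
    """
    import boto3
    from botocore.exceptions import ClientError

    rds = boto3.client("rds", region_name=region)
    try:
        rds.create_db_instance(**params)
    except ClientError as err:
        # Treat "already exists" as success so the script can be re-run.
        if err.response["Error"]["Code"] != "DBInstanceAlreadyExists":
            raise
    rds.get_waiter("db_instance_available").wait(
        DBInstanceIdentifier=params["DBInstanceIdentifier"])
```

A typical call would be provision(build_create_params("orders-db", tags={"team": "data"})). Keeping the parameter builder free of network calls makes it simple to review and unit test the settings before anything is created.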

Database access administration

AWS has revolutionized database access administration by incorporating sophisticated, service-integrated access controls to provide secure, scalable, and manageable authentication mechanisms. Notably, the ability to authenticate directly using AWS Identity and Access Management (IAM) and AWS Secrets Manager has been game-changing. IAM enables centralized control with granular permission to your AWS resources, including databases, effectively replacing the traditional approach of managing users and credentials within each database. It supports short-lived credentials, further improving security by reducing the risk of credentials being compromised. Secrets Manager addresses the challenges of secrets management by safely storing, retrieving, and rotating credentials for databases and other services. It seamlessly integrates with AWS databases, eliminating the need to hardcode sensitive information in plain text. These enhancements, working in harmony, offer a far more secure, streamlined, and automated experience for database access administration.
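The two mechanisms described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the secret is assumed to be a JSON document with username and password keys and optional host and port keys, and the function and class names are hypothetical. The two fetch functions rely on the boto3 calls get_secret_value and generate_db_auth_token and need boto3 plus AWS credentials to run.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DbCredentials:
    username: str
    password: str = field(repr=False)  # keep the password out of reprs and logs
    host: Optional[str] = None
    port: Optional[int] = None


def parse_db_secret(secret_string: str) -> DbCredentials:
    """Parse a JSON secret (assumed key names) into connection settings."""
    data = json.loads(secret_string)
    port = data.get("port")
    return DbCredentials(
        username=data["username"],
        password=data["password"],
        host=data.get("host"),
        port=int(port) if port is not None else None,
    )


def fetch_db_secret(secret_id: str, region: str) -> DbCredentials:
    """Read a secret from Secrets Manager instead of hardcoding credentials."""
    import boto3  # lazy import: only needed when talking to AWS

    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_id)
    return parse_db_secret(response["SecretString"])


def iam_auth_token(host: str, port: int, user: str, region: str) -> str:
    """Return a short-lived IAM authentication token to use as the password."""
    import boto3

    rds = boto3.client("rds", region_name=region)
    return rds.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region)
```

With either approach, the application asks AWS for credentials at connection time, so nothing sensitive needs to live in source code or configuration files.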

Database maintenance

Database maintenance in the AWS Cloud has been dramatically simplified and automated, largely due to the advent of several powerful tools and services. Amazon RDS Performance Insights for Aurora, Amazon RDS, and DocumentDB, combined with Amazon CloudWatch, provide deep insights into database performance, enabling proactive maintenance. Amazon DevOps Guru for RDS uses machine learning (ML) to identify operational issues before they impact your business. AWS CloudTrail helps track user activity and API usage, contributing to improved governance, compliance, and risk auditing. AWS Backup provides centralized backup across AWS services and automated backup tasks, including policy-driven backup retention and cross-account backup, enhancing data protection and disaster recovery. These advancements, coupled with the continuing evolution of managed database services like Amazon Keyspaces, DocumentDB, and Aurora, make the maintenance of cloud databases more efficient, freeing database engineers to focus on value-adding tasks rather than routine maintenance.
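As one small example of proactive monitoring, the sketch below defines a CloudWatch alarm on the CPUUtilization metric that Amazon RDS publishes in the AWS/RDS namespace. The 80 percent threshold, the three five-minute evaluation periods, the alarm naming scheme, and the function names are assumptions for illustration; suitable values depend on your workload. Creating the alarm uses the boto3 put_metric_alarm call and requires boto3 and AWS credentials.

```python
from typing import Optional


def build_cpu_alarm(db_instance_id: str,
                    threshold_pct: float = 80.0,
                    periods: int = 3,
                    sns_topic_arn: Optional[str] = None) -> dict:
    """Build keyword arguments for the CloudWatch put_metric_alarm call."""
    if not 0 < threshold_pct <= 100:
        raise ValueError("threshold_pct must be in the range (0, 100]")
    params = {
        "AlarmName": f"rds-{db_instance_id}-high-cpu",  # assumed naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier",
                        "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds per evaluation period
        "EvaluationPeriods": periods,
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        # Notify an SNS topic when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_alarm(params: dict, region: str) -> None:
    """Create or update the alarm (needs boto3 and credentials)."""
    import boto3  # lazy import: only needed when talking to AWS

    boto3.client("cloudwatch", region_name=region).put_metric_alarm(**params)
```

Generating alarm definitions from a function like this lets you apply the same monitoring baseline to every instance in your fleet.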

Global scale databases

In recent years, AWS databases have significantly evolved multi-Region capabilities by introducing powerful features like Amazon Aurora Global Database, DynamoDB global tables, and DocumentDB global clusters. These advances have opened the door to true global scalability for applications, enabling them to serve read requests, and in many cases write requests, with low latency irrespective of the users’ geographical location. Aurora Global Database, for instance, allows an Aurora database to span multiple Regions, replicating changes with typical latency of less than 1 second. For NoSQL workloads, DynamoDB global tables replicate your tables across multiple Regions, making them fully accessible for both read and write traffic globally. Moreover, Amazon RDS cross-Region read replicas provide ongoing cross-Region replication, offering benefits like geographic redundancy and multi-Region application deployment. These capabilities are pivotal for applications with a global footprint, offering increased availability, disaster recovery, latency reduction, and an enhanced user experience by serving customers from their nearest geographical location.
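To illustrate how adding Regions to a DynamoDB global table might be scripted, here is a hedged sketch. It plans one update_table request per new Region using the ReplicaUpdates parameter, then applies the requests one at a time, polling describe_table until the table and its replicas report ACTIVE. The function names and the polling interval are assumptions, and the sketch assumes the table already meets the prerequisites for global tables. Applying the plan requires boto3 and AWS credentials.

```python
import time
from typing import Iterable, List


def plan_replica_calls(table: str,
                       current: Iterable[str],
                       desired: Iterable[str]) -> List[dict]:
    """Return one update_table request per Region that still needs a replica."""
    current_set = set(current)
    # dict.fromkeys removes duplicates while preserving the requested order.
    missing = [r for r in dict.fromkeys(desired) if r not in current_set]
    return [{"TableName": table,
             "ReplicaUpdates": [{"Create": {"RegionName": region}}]}
            for region in missing]


def apply_replica_calls(calls: List[dict], home_region: str,
                        poll_seconds: int = 15) -> None:
    """Send the planned requests sequentially (needs boto3 and credentials)."""
    import boto3  # lazy import: only needed when talking to AWS

    ddb = boto3.client("dynamodb", region_name=home_region)
    for call in calls:
        ddb.update_table(**call)
        # Wait for the table and every replica to settle before the next call.
        while True:
            time.sleep(poll_seconds)
            table = ddb.describe_table(TableName=call["TableName"])["Table"]
            replicas = table.get("Replicas", [])
            if table["TableStatus"] == "ACTIVE" and all(
                    r.get("ReplicaStatus") == "ACTIVE" for r in replicas):
                break
```

For example, plan_replica_calls("orders", ["us-east-1"], ["us-east-1", "eu-west-1"]) produces a single request that adds a replica in eu-west-1.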

Advanced mastery: Strategizing for security, compliance, and optimization

Once you have established the fundamental and enhanced proficiency stages of your database engineer role, the next stage is advanced mastery. This goes beyond day-to-day database management and delves into strategic, high-impact areas such as risk assessment and governance, inventory management, and compliance. The focus shifts from operational excellence to strategic governance, with the database engineer playing a critical role in ensuring data integrity, security, and compliance with regulatory standards. This is where you truly become a database strategist, proactively addressing organizational risks, optimizing resource usage, and ensuring alignment with business goals and regulatory requirements.

Risk assessment and governance

As a database engineer, one of your critical responsibilities is to ensure that your organization's cloud governance plan identifies vulnerabilities in your databases and their host accounts. You must enact plans to mitigate risk and establish metrics to gauge the impact of security measures. AWS has introduced several services and features that aid in risk assessment and governance for databases. AWS Config enables you to assess, audit, and evaluate the configurations of your databases. To identify resources in your organization, such as RDS snapshots, that are shared with entities outside of your account, use IAM Access Analyzer. It guides you toward least-privilege permissions, helps manage access to resources, and reduces potential security risk. Amazon Macie is a data security and data privacy service that discovers sensitive data such as personally identifiable information (PII) in Amazon S3; to apply it to data held in databases like Amazon RDS or DynamoDB, you first export that data to Amazon S3. Amazon GuardDuty continuously monitors your AWS accounts and workloads, including database services. It exposes threats quickly using anomaly detection, ML, behavior modeling, and threat intelligence feeds from AWS and leading third parties. With GuardDuty, you can mitigate threats early by initiating automated responses.

All of these security services (AWS Config, IAM Access Analyzer, Macie, and GuardDuty) have built-in integrations with AWS Security Hub to consolidate findings across services. Security Hub is a cloud security posture management service that performs security best practice checks, aggregates alerts, and enables automated remediation. It gives you a comprehensive view of your security alerts and security posture across your AWS accounts.
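The kind of configuration assessment described in this section can be sketched locally to show the idea. The example below inspects instance descriptions in the shape returned by the RDS describe_db_instances API and reports settings that commonly raise risk. The four checks and their wording are assumptions chosen for illustration, not an official rule set, and a managed service such as AWS Config evaluates rules continuously rather than on demand. Listing real instances requires boto3 and AWS credentials.

```python
from typing import Iterable, List, Tuple


def audit_db_instances(instances: Iterable[dict]) -> List[Tuple[str, str]]:
    """Return (instance identifier, finding) pairs for risky settings."""
    findings = []
    for db in instances:
        name = db["DBInstanceIdentifier"]
        if db.get("PubliclyAccessible"):
            findings.append((name, "publicly accessible"))
        if not db.get("StorageEncrypted"):
            findings.append((name, "storage not encrypted"))
        if db.get("BackupRetentionPeriod", 0) == 0:
            findings.append((name, "automated backups disabled"))
        if not db.get("MultiAZ"):
            findings.append((name, "single-AZ deployment"))
    return findings


def list_db_instances(region: str) -> List[dict]:
    """Fetch every RDS instance description in a Region (needs boto3)."""
    import boto3  # lazy import: only needed when talking to AWS

    rds = boto3.client("rds", region_name=region)
    instances = []
    for page in rds.get_paginator("describe_db_instances").paginate():
        instances.extend(page["DBInstances"])
    return instances
```

Running audit_db_instances(list_db_instances("us-east-1")) would give a quick snapshot of one Region, which you could compare against the findings your governance tooling reports.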

Inventory management

Inventory management helps you identify what resources you have running in your cloud accounts at any given moment. Understanding what hardware, software, network, storage, compute, and other services you are running provides comprehensive visibility into your cloud infrastructure, enabling you to make informed decisions and optimize resource allocation and costs. A key tool is AWS Resource Groups, which allows you to create, manage, and automate tasks across a collection of AWS resources sharing common tags, effectively simplifying inventory management across a large number of databases. AWS Systems Manager, another crucial tool, provides visibility and control of your infrastructure on AWS. It offers an operations dashboard from which you can view all your operational data across multiple AWS services, including databases, and acts as a single pane of glass for your resource management.

These tools offer a centralized way to manage your database resources, allowing you to keep track of your resources effectively, making the overall process of inventory management more efficient and error-free.
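As a minimal sketch of tag-based inventory, the following Python example groups database resources by the value of a tag. It works on records in the shape returned by the get_resources call of the Resource Groups Tagging API, where each record has a ResourceARN and a list of Tags. The function names, the tag key you pass in, and the placeholder used for untagged resources are assumptions for illustration. Fetching live data requires boto3 and AWS credentials, and that API returns only resources that are or previously were tagged.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def arn_service(arn: str) -> str:
    """Return the service portion of an ARN, for example 'rds'."""
    parts = arn.split(":", 5)
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn}")
    return parts[2]


def group_by_tag(mappings: Iterable[dict], tag_key: str,
                 untagged: str = "(untagged)") -> Dict[str, List[str]]:
    """Group resource ARNs by the value of one tag key."""
    groups = defaultdict(list)
    for mapping in mappings:
        tags = {t["Key"]: t["Value"] for t in mapping.get("Tags", [])}
        groups[tags.get(tag_key, untagged)].append(mapping["ResourceARN"])
    return {value: sorted(arns) for value, arns in groups.items()}


def fetch_database_resources(region: str) -> List[dict]:
    """List tagged RDS and DynamoDB resources in a Region (needs boto3)."""
    import boto3  # lazy import: only needed when talking to AWS

    client = boto3.client("resourcegroupstaggingapi", region_name=region)
    paginator = client.get_paginator("get_resources")
    resources = []
    for page in paginator.paginate(ResourceTypeFilters=["rds", "dynamodb"]):
        resources.extend(page["ResourceTagMappingList"])
    return resources
```

Calling group_by_tag(fetch_database_resources("us-east-1"), "team") would show which team owns each database, and the untagged group highlights resources that need attention.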

Compliance and governance

Maintaining compliance with industry regulations and standards is a critical aspect of managing databases in the cloud. As organizations migrate their data infrastructure to cloud environments, they need to ensure that their databases meet the necessary compliance requirements for data security, privacy, and governance. AWS Config and Security Hub are not only helpful in risk assessment and governance, but they are also an integral part of your compliance strategy. Together, they help determine overall compliance against the configurations specified in your guidelines. They also simplify compliance auditing by providing you with historical data about your resource configurations, provide a comprehensive view of your compliance status across your AWS accounts, and make it simpler to maintain regulatory and internal policy compliance. AWS Audit Manager, on the other hand, assists in continuously auditing your AWS usage to maintain compliance with regulations and industry standards. It automates evidence collection to streamline audits. AWS Artifact is a comprehensive resource for auditor-issued reports, certifications, accreditations, and other third-party attestations of AWS. With it, you can manage online agreements at scale and perform due diligence on ISVs that sell products on AWS Marketplace through on-demand access to their security and compliance reports. As a database engineer, having tools to help manage and automate compliance reduces the burden of manual checks and record keeping, and makes it easier to uphold high standards of data integrity and security.

Summary

In an ever-evolving technology landscape, the role of a database engineer has undergone a significant transformation. AWS has been at the forefront of this shift, providing a myriad of tools and services to augment the traditional responsibilities of database administrators, database developers, DevOps, and now the database engineer. The evolution can be seen as a journey, starting from the fundamentals where purpose-built databases cater to specific workloads, moving towards enhanced proficiency with tools simplifying routine tasks, and finally arriving at advanced mastery, focusing on strategic governance and optimization. This journey empowers the database engineer to become a true database strategist, driving decision-making processes around security, compliance, risk assessment, and resource optimization. By embracing these advancements, a database engineer can ensure more efficient database operations, enhanced security, and a streamlined compliance process. This is a promising era for database engineers as they navigate their transformation, enabled by the robust capabilities of AWS.

If you have questions or feedback, leave a comment.


About the authors

Wendy Neu has worked as a Data Architect with Amazon since January 2015. Prior to joining Amazon, she worked as a consultant in Cincinnati, OH, helping customers integrate and manage their data from different unrelated data sources.

Rajib Sadhu is a Senior Data Specialist Solutions Architect with over 15 years of experience in Microsoft SQL Server and other database technologies. He helps customers architect and migrate their database solutions to AWS. Prior to joining AWS, he supported production and mission-critical database implementation across financial, travel, and hospitality industry segments.