Serverless COBOL: Rejuvenating legacy code with open source software — Part 1
In this post, we explain how using open source software, GnuCOBOL, combined with AWS Lambda functions, can extend the life of legacy code into a serverless context. We also examine additional benefits of open source software when legacy features are deployed in such a modern environment.
The COBOL code and CI/CD scripts described in this post are available from GitHub, released under an open source license (Apache-2.0). Detailed insights about the implementation are explained in Serverless COBOL: Rejuvenating legacy code with open source software — Part 2, a more technical companion post. Happy forking!
Why COBOL still matters
COBOL was initially designed more than 60 years ago, building on the pioneering work of Grace Hopper. This programming language remains vibrant today. Despite its age, COBOL is the foundation of numerous enterprise backbone applications. COBOL is still heavily used in organizations using mainframes, including banks, insurance companies, and government administrations. For example, skills shortages at the beginning of the Covid-19 pandemic showed how critical this programming language remains for running daily operations in US administrations. Old COBOL applications are still heavily used by several US states to process unemployment claims.
A 2017 Thomson Reuters report, COBOL Blues, claims that more than 200 billion lines of COBOL are still in operation. It also asserts that 43% of banking systems are built on COBOL and that 95% of ATM swipes rely on this language. COBOL’s importance in our economy is not declining anytime soon. In fact, IBM reports that more than 5 billion additional lines are produced each year.
Many of these COBOL programs are still relevant in today's digital business because they "do the job" even in the digital era. There is no need to rewrite them. So, the following sections demonstrate how to extract such programs from their legacy context and integrate them as first-class components of cloud-native applications.
GnuCOBOL and serverless
GnuCOBOL (formerly OpenCOBOL) is a free/libre implementation of the COBOL programming language. GnuCOBOL is a transcompiler: it translates COBOL into C, which is then compiled by a native C compiler. The 2.2 final release passes more than 9,688 (99.79%) of the tests included in the NIST COBOL 85 test suite.
Despite being nearly 20 years old, GnuCOBOL is still alive and well. Following a yearly release cadence, version 3.1.2 was released in December 2020. The copyrights on the source code and past name (OpenCOBOL) were transferred to the Free Software Foundation (FSF) in 2015. The FSF releases the GnuCOBOL compiler under the GPL and its associated runtime under the LGPL. Those two open source software licenses give users the freedom they need to run GnuCOBOL programs as Lambda functions.
Commercial proprietary software and middleware may create licensing issues when used in a serverless context. The ISVs that own them sometimes impose stringent terms and conditions in their licenses: they want to know how many machines are used and their exact hardware configuration (number of cores, size of RAM, etc.) in order to charge for all the servers in the distributed cluster, even if not all machines run their software.
Open source software can be a great option in a serverless context such as Lambda functions, in which AWS takes care of high availability and scalability through an abundant and redundant infrastructure whose exact size changes to match the growth of this fully managed service. GnuCOBOL's open source licensing terms allow workloads to be deployed wherever needed, at no licensing cost. So, there can be as many distinct Lambda functions as needed to match the functional requirements. They can be replicated as widely as needed by the AWS Lambda service orchestrator to ensure proper availability and scale.
AWS Serverless Application Model
At the end of 2016, AWS open sourced the AWS Serverless Application Model (AWS SAM) framework to describe serverless applications made of multiple Lambda functions with all their dependencies. The AWS SAM source code is available on GitHub.
The main purpose of SAM is to reduce the development effort when creating such applications. Required artifacts and definitions are specified at a high level of abstraction. The AWS Serverless Application Model processor on AWS, in collaboration with the AWS CloudFormation service, which it extends, will take care of the low-level definitions of corresponding required AWS resources to deploy the Lambda function and make it publicly accessible through the API Gateway. SAM fosters the implementation of Infrastructure as Code best practices.
In this project, SAM automatically takes care of the creation and management of the various AWS resources required to launch and run the Lambda function. The SAM YAML file describing the Lambda function contains the definition of a single item of type AWS::Serverless::Function with its parameters. Under the covers, the SAM framework turns this high-level abstraction into several AWS objects:
- Lambda function itself
- AWS Identity and Access Management (IAM) role to execute the Lambda function
- API Gateway elements
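To make the single-item template more concrete, here is a minimal sketch written in Python, which builds an equivalent template as a dictionary and prints it as JSON (SAM accepts JSON as well as YAML). It is not the project's actual template: the logical ID CobolFunction, the CodeUri folder, the /hello path, and the memory and timeout values are all assumptions chosen for illustration.

```python
import json

# Illustrative SAM template. All names and values below (CobolFunction,
# ./build/, /hello, 128 MB, 10 s) are hypothetical and not taken from
# the project's repository.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Transform": "AWS::Serverless-2016-10-31",  # marks the template as SAM
    "Resources": {
        "CobolFunction": {
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Runtime": "provided",   # custom runtime supplied by the package
                "Handler": "not-used",   # placeholder; assumed unused by the binary
                "CodeUri": "./build/",   # assumed folder holding the binary
                "MemorySize": 128,
                "Timeout": 10,
                "Events": {
                    # An Api event asks SAM to create the API Gateway elements.
                    "HttpGet": {
                        "Type": "Api",
                        "Properties": {"Path": "/hello", "Method": "get"},
                    }
                },
            },
        }
    },
}


def serverless_functions(tpl):
    """Return the logical IDs of all AWS::Serverless::Function resources."""
    return [
        name
        for name, resource in tpl["Resources"].items()
        if resource["Type"] == "AWS::Serverless::Function"
    ]


print(json.dumps(template, indent=2))
```

The template declares only one resource. The IAM role and the API Gateway elements do not appear in it because SAM derives them from the function definition and its Api event.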
Although SAM is used here through its raw CLI, contributions by the open source community have allowed its integration in various IDEs, such as PyCharm, IntelliJ, and Visual Studio, and standard CI/CD tools such as Jenkins.
So, COBOL features extracted from their legacy environment and restructured and deployed via SAM as GnuCOBOL-based Lambda functions can make a quantum leap forward in terms of technical debt elimination and use the best of open source solutions.
GnuCOBOL as Lambda custom runtime
Lambda natively supports Java, Go, PowerShell, Node.js, C#, Python, and Ruby code. When using those languages, developers upload their code as a .zip file and the Lambda environment takes care of the compilation, packaging, and deployment of the application code.
Additionally, Lambda provides a runtime API that lets developers use any additional programming language to author functions. This openness to custom runtimes is leveraged here: a single binary, in which the GnuCOBOL runtime is statically linked with the application code, is uploaded as the custom runtime of the deployed Lambda function.
Because the source code is compiled to a binary outside of the Lambda environment, the uploaded package does not need to include the source code, unlike with the standard Lambda languages.
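For readers unfamiliar with the Lambda runtime API, the sketch below shows the request/response loop that any custom runtime performs. It is written in Python for readability and is not the project's implementation, which is covered in Part 2. The handle function is a hypothetical stand-in for the compiled COBOL logic. The endpoint paths, the AWS_LAMBDA_RUNTIME_API environment variable, and the Lambda-Runtime-Aws-Request-Id header are part of the documented runtime API.

```python
import os
import urllib.request

API_VERSION = "2018-06-01"  # version segment of the Lambda runtime API paths


def next_url(host):
    """URL polled by the runtime to receive the next invocation event."""
    return f"http://{host}/{API_VERSION}/runtime/invocation/next"


def response_url(host, request_id):
    """URL to which the runtime posts the result of one invocation."""
    return f"http://{host}/{API_VERSION}/runtime/invocation/{request_id}/response"


def handle(event_bytes):
    # Hypothetical stand-in for the business logic; in the project this
    # role is played by the compiled COBOL program.
    return b"PROCESSED: " + event_bytes


def serve_forever():
    """Fetch an event, process it, post the result, and repeat."""
    host = os.environ["AWS_LAMBDA_RUNTIME_API"]
    while True:
        with urllib.request.urlopen(next_url(host)) as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = resp.read()
        reply = urllib.request.Request(
            response_url(host, request_id), data=handle(event), method="POST"
        )
        urllib.request.urlopen(reply).close()


# The loop only starts inside a Lambda execution environment, where the
# service sets this variable; elsewhere the module just defines helpers.
if "AWS_LAMBDA_RUNTIME_API" in os.environ:
    serve_forever()
```

With a custom runtime, Lambda starts an executable named bootstrap from the deployment package, so a loop of this kind has to be reachable from that entry point.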
Build, deploy, and test from GitHub
Our project was implemented using no-cost resources end-to-end to demonstrate how existing applications can use open source software across the entire application lifecycle. Lambda functions are part of an AWS no-cost tier, which includes 1 million no-cost requests per month and 400,000 GB-seconds of compute time per month.
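To get a feel for what these two limits mean in practice, the short calculation below converts GB-seconds into plain seconds for a given memory size and checks whether a monthly workload stays within the no-cost tier. The 128 MB memory size and the request volumes are assumptions picked for illustration, not measurements from the project.

```python
FREE_REQUESTS = 1_000_000    # no-cost requests per month
FREE_GB_SECONDS = 400_000    # no-cost compute per month, in GB-seconds


def free_compute_seconds(memory_mb):
    """Seconds of execution covered per month at the given memory size."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


def fits_in_free_tier(requests, avg_duration_s, memory_mb):
    """True if a monthly workload stays under both no-cost limits."""
    gb_seconds = requests * avg_duration_s * memory_mb / 1024
    return requests <= FREE_REQUESTS and gb_seconds <= FREE_GB_SECONDS


# Assumed configuration: a function sized at 128 MB (0.125 GB).
print(free_compute_seconds(128))                  # 3200000.0 seconds per month
print(fits_in_free_tier(1_000_000, 0.2, 128))     # True
print(fits_in_free_tier(2_000_000, 0.2, 128))     # False: too many requests
```

At 128 MB, the compute allowance corresponds to 3.2 million seconds of execution per month, so with short-running functions the request count is usually the first limit reached.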
Similarly, the application code and its DevOps components are hosted as a public GitHub repository. GitHub is an AWS Partner Network (APN) member with the AWS DevOps Competency. The build and deploy phases for this Lambda function are run as GitHub Actions grouped in a CI/CD workflow, whose first 2,000 minutes each month are at no cost. Additional details regarding use cases of GitHub Actions can be found in a blog post on using GitHub Actions to deploy serverless applications.
To respect the DevOps best practices of complete isolation and full portability, the compilation happens in a Docker container launched as the first stage of the workflow. This container is based on the Amazon Linux image published by AWS on Docker Hub. The image build pulls the compiler source code and builds it into a runtime library, which is then used to compile the application source code and generate a binary that embeds the GnuCOBOL runtime alongside the application logic.
The workflow compiles the code within the Docker image, extracts the binaries from this image, and uploads and deploys all components of the Lambda function via a single sam deploy command using the directives of the YAML template. Finally, the deployed Lambda function is tested from GitHub via a cURL test request.
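The stages of such a workflow can be sketched as a short command plan, shown here in Python with a dry run that only prints the commands. This is an assumption-laden outline rather than the project's actual GitHub Actions workflow: the image tag, the path of the binary inside the image, the stack name, and the template file name are all hypothetical. It uses sam deploy because that is the SAM CLI command that uploads artifacts and deploys the stack.

```python
import shlex
import subprocess
import urllib.request

# Hypothetical names, chosen for illustration only.
IMAGE = "cobol-lambda-build"
BINARY_IN_IMAGE = "/build/bootstrap"
STACK = "serverless-cobol-demo"


def plan(container="cobol-extract"):
    """Return the build, extract, and deploy commands in order."""
    return [
        # Stage 1: compile inside Docker (the Dockerfile does the work).
        ["docker", "build", "-t", IMAGE, "."],
        # Stage 2: copy the binary out of a stopped container, then clean up.
        ["docker", "create", "--name", container, IMAGE],
        ["docker", "cp", f"{container}:{BINARY_IN_IMAGE}", "./bootstrap"],
        ["docker", "rm", container],
        # Stage 3: upload and deploy everything described by the template.
        ["sam", "deploy", "--template-file", "template.yaml",
         "--stack-name", STACK, "--capabilities", "CAPABILITY_IAM",
         "--resolve-s3", "--no-confirm-changeset"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


def smoke_test(url, timeout=10):
    """Equivalent of the cURL test: GET the endpoint, return status and body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", "replace")


run(plan())  # dry run: prints the five commands without executing them
```

Calling run(plan(), dry_run=False) would execute the commands and requires Docker, the SAM CLI, and AWS credentials. The smoke_test function would then be called with the endpoint URL reported by the deployment.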
Further technical insights around the implementation are published in the companion post, Serverless COBOL: Rejuvenating legacy code with open source software — Part 2. The COBOL code and CI/CD scripts are available from the GitHub repository, released under the Apache-2.0 license.
This post explained the rationale and the value of combining open source software with leading-edge serverless services to rejuvenate legacy COBOL programs, which remain relevant in the transformed business of the digital era.