Overview
As organizations increasingly adopt cloud computing, and Amazon Web Services (AWS) offers a wide range of services to support DevOps philosophies, practices, and tools, it is crucial to understand the pros and cons of those services. Beyond that, integrating Sec (security) into DevOps is another important aspect that deserves attention: any update to the source code can introduce defects that lead to security breaches, potential exploitation, and more.
To solve this problem, we need an automated pipeline that builds, tests, and deploys the application, with security scanning tools integrated through AWS services. This solution aims to streamline the development process, respond to bugs faster, and reduce the manual work involved in deployment. As a result, developers can focus on their coding while keeping the application secure, and the application reaches customers as quickly as possible.
Developing A Solution
After some consideration and multiple tests, I arrived at a solution with the following capabilities:
The pipeline triggers every time there is a change in the source code.
Security scanning is integrated into the pipeline, and the scan reports determine whether the pipeline proceeds or fails.
The pipeline can self-mutate when its settings or infrastructure change.
The pipeline stops at the stage where a failure occurs.
A notification is sent every time the pipeline fails, succeeds, or reaches a manual approval stage.
Pipeline reports are centralized in one place for convenience.
With that, I created two stacks: one for the application pipeline and one for the self-mutate pipeline.
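As a rough illustration, the two stacks could be assembled in a CDK app like the sketch below. The stack classes and names are placeholders of my own, not the project's actual code.

```typescript
// bin/app.ts - minimal sketch of the CDK app (stack classes and names are illustrative)
import * as cdk from 'aws-cdk-lib';
import { ApplicationPipelineStack } from '../lib/application-pipeline-stack';
import { SelfMutatePipelineStack } from '../lib/self-mutate-pipeline-stack';

const app = new cdk.App();

// Stack 1: builds, scans, and deploys the application (CodePipeline + CodeBuild + CodeDeploy)
new ApplicationPipelineStack(app, 'ApplicationPipelineStack');

// Stack 2: watches the infrastructure repo and updates the stacks through CloudFormation
new SelfMutatePipelineStack(app, 'SelfMutatePipelineStack');

app.synth();
```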
Application Pipeline
Overview Diagram
As discussed above, using AWS native services was a priority for this project. In this overall architecture, I use the CDK to deploy the following resources to the AWS environment:
CI/CD
AWS CodeCommit: a version control service like GitHub or GitLab.
AWS CodeBuild: a fully managed build service in the cloud.
AWS CodeDeploy: a deployment service that automates application deployments to AWS compute platforms like EC2, Lambda, or ECS.
AWS CodePipeline: a continuous delivery service to model, visualize, and automate the steps required to release your software.
Amazon EventBridge: triggers the pipeline when there is a change in the source code.
Security Scan Tools
AWS CodeGuru Security: a static application security tool that uses machine learning to detect security policy violations and vulnerabilities.
Amazon Inspector: a vulnerability management service that continuously scans AWS workloads for software vulnerabilities and unintended network exposure.
Snyk: not an AWS native service, but an easy-to-integrate, powerful SCA solution that can be implemented in the pipeline.
Artifacts, Report, Notification, Cleanup
Amazon ECR: Store build image from CodeBuild.
Security Hub: collects security data across AWS accounts and services, supports third-party products, and helps you analyze your security trends and identify the highest priority security issues.
Lambda:
Triggers the first CodeBuild run so that ECS has an initial image to use.
Triggers SNS to send a message to the topic's subscribers when the pipeline state changes.
Triggers ECR to delete an image when a branch is deleted.
S3:
Store security logs.
Store input and output artifacts of a pipeline.
Store cache for CodeBuild to reuse.
SNS: Create a topic for email subscription.
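For the notification piece in particular, the SNS wiring could look roughly like this in CDK. This is a sketch; the email address and construct IDs are my own placeholders.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as sns from 'aws-cdk-lib/aws-sns';
import * as subs from 'aws-cdk-lib/aws-sns-subscriptions';
import * as targets from 'aws-cdk-lib/aws-events-targets';
import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';

// Created elsewhere in the pipeline stack
declare const stack: cdk.Stack;
declare const pipeline: codepipeline.Pipeline;

// Topic with an email subscription (address is a placeholder)
const topic = new sns.Topic(stack, 'PipelineStateTopic');
topic.addSubscription(new subs.EmailSubscription('devops-team@example.com'));

// Publish to the topic whenever the pipeline execution changes state
pipeline.onStateChange('NotifyOnStateChange', {
  target: new targets.SnsTopic(topic),
});
```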
Deployment Solution
There are two types of deployment:
In-place deployment
The application on each instance in the deployment group is stopped, the latest application revision is installed, and the new version of the application is started and validated. This deployment type only supports the EC2/on-premises compute platform.
Blue/green deployment
A blue/green deployment is a strategy in which you create two separate but identical environments. One environment (blue) runs the current version, and one environment (green) runs the new version. Once testing has been completed on the green environment, live application traffic is directed to the green environment, and the blue environment is deprecated.
In this project, I chose blue/green deployment because it increases application availability and reduces deployment risk by simplifying the rollback process if a deployment fails.
Self-Mutate Pipeline
Overview Diagram
The services serve the same purposes as above, but instead of deploying the application with CodeDeploy, I use a CloudFormation action to update the application stack.
Update Stack
CloudFormation: Updates the chosen stack based on the CloudFormation template generated by the CodeBuild project.
cdk-nag: Checks CDK applications or CloudFormation templates for best practices using a combination of available rule packs.
Purpose
Self-mutation of the application pipeline is delegated to a separate pipeline to solve some problems I noticed with an all-in-one pipeline:
Lower cost: CodePipeline type V2 charges by pipeline execution time. With an all-in-one pipeline, every run would perform a self-mutate action even when the infrastructure code has not changed, which increases execution time for both the pipeline and the CodeBuild service.
More intuitive: The source code is split in two, one repository for the application and one for the infrastructure.
No two builds in one pipeline: In some scenarios we want to change the CodeBuild project settings; with a single build project, those changes cannot take effect in the first build of the same pipeline run.
Note
I will deploy the application in a public subnet to keep costs down.
For this solution, I use trunk-based development because it is easy to apply to the pipeline: the main branch implements Continuous Deployment (CD), and feature branches implement Continuous Integration (CI).
Implementation
Prerequisite
Application repository: Store the application source code and related files (buildspec.yml, …) in CodeCommit so the CDK can reference this repo when setting up the pipeline.
Set up tokens: Tools like Snyk require a token to authenticate, so we need to store the token securely and retrieve it at build time, for example with the Secrets Manager service (see the sketch after this list).
Enable Amazon Inspector, Security Hub, and Snyk Code: These services are deactivated by default and must be enabled manually.
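For the token, one possible wiring (a sketch; the secret name snyk-auth-token and the construct IDs are assumptions) is to keep it in Secrets Manager and expose it to CodeBuild as an environment variable:

```typescript
import * as cdk from 'aws-cdk-lib';
import * as codebuild from 'aws-cdk-lib/aws-codebuild';
import * as secretsmanager from 'aws-cdk-lib/aws-secretsmanager';

declare const stack: cdk.Stack; // the pipeline stack, created elsewhere

// Reference a secret that was created beforehand (e.g. via the console or CLI)
const snykToken = secretsmanager.Secret.fromSecretNameV2(stack, 'SnykToken', 'snyk-auth-token');

const buildProject = new codebuild.PipelineProject(stack, 'BuildProject', {
  environment: {
    buildImage: codebuild.LinuxBuildImage.STANDARD_7_0,
    privileged: true, // needed to build Docker images
  },
  environmentVariables: {
    // CodeBuild resolves the secret value at build time; the buildspec can then use $SNYK_TOKEN
    SNYK_TOKEN: {
      type: codebuild.BuildEnvironmentVariableType.SECRETS_MANAGER,
      value: snykToken.secretName,
    },
  },
});

// Allow the build role to read the secret
snykToken.grantRead(buildProject);
```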
Application Pipeline
Detail Diagram
Workflow
Source Stage
Steps:
Every time there is a change in the main branch of the CodeCommit application repo, EventBridge will trigger the pipeline to run.
CodeCommit will push the source code to S3 as an output artifact.
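In CDK, that trigger can be expressed roughly as follows (a sketch; the repository and pipeline objects are assumed to be defined elsewhere in the stack):

```typescript
import * as codecommit from 'aws-cdk-lib/aws-codecommit';
import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';
import * as targets from 'aws-cdk-lib/aws-events-targets';

// Created elsewhere in the stack
declare const repository: codecommit.IRepository;
declare const pipeline: codepipeline.Pipeline;

// EventBridge rule: start the pipeline whenever the main branch receives a commit
repository.onCommit('OnMainCommit', {
  branches: ['main'],
  target: new targets.CodePipeline(pipeline),
});
```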
Build Stage
Steps
CodeBuild takes the source code from S3 as an input artifact.
CodeBuild finds the buildspec.yml file in the source code's root folder.
Run unit tests and generate test reports in JUnit XML format, which CodeBuild consumes to update the CodeBuild test dashboard.
Scan the application source code with CodeGuru Security; the returned findings are converted to ASFF format with a bash script and uploaded to Security Hub.
Scan the application source code with Snyk and generate a test report in JSON format.
Build a Docker image.
Export an SBOM from the built image and scan it with Amazon Inspector; the findings are in CycloneDX 1.5 format.
Upload the image to ECR. With Amazon Inspector enabled, the image is automatically scanned, and the findings are uploaded to Security Hub.
Upload all the findings, reports, and output artifacts (appspec.yml, taskdef.json) to S3.
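A trimmed-down buildspec for this stage, written inline with the CDK, might look like the sketch below. The scan commands, script names, and report paths are simplified assumptions rather than the project's actual buildspec.

```typescript
import * as codebuild from 'aws-cdk-lib/aws-codebuild';

// Illustrative buildspec skeleton mirroring the steps above; the real commands and scripts differ
const buildSpec = codebuild.BuildSpec.fromObject({
  version: '0.2',
  phases: {
    pre_build: {
      commands: [
        // 1. Unit tests with a JUnit XML report for the CodeBuild test dashboard
        'python -m pytest --junitxml=reports/unit-tests.xml',
        // 2. SAST scan with CodeGuru Security, converted to ASFF and pushed to Security Hub
        './scripts/codeguru-scan-to-asff.sh', // placeholder script name
        // 3. Source scan with Snyk (JSON report)
        'snyk code test --json > reports/snyk.json',
      ],
    },
    build: {
      commands: [
        // 4. Build the container image
        'docker build -t $REPOSITORY_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION .',
        // 5. Export an SBOM from the image and scan it with Amazon Inspector (CycloneDX 1.5)
        './scripts/sbom-inspector-scan.sh', // placeholder script name
      ],
    },
    post_build: {
      commands: [
        // 6. Push the image to ECR; Amazon Inspector re-scans it there automatically
        'docker push $REPOSITORY_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION',
      ],
    },
  },
  reports: {
    'unit-tests': { files: ['reports/unit-tests.xml'], 'file-format': 'JUNITXML' },
  },
  artifacts: {
    // appspec.yml and taskdef.json feed the deploy stage; reports go to S3
    files: ['appspec.yml', 'taskdef.json', 'reports/**/*'],
  },
});
```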
Output
CodeBuild Reports
CodeGuru
Snyk
Amazon Inspector
Security Hub
Deploy Stage
Blue Container: The current version of the application
Green Container: The latest version of the application
Steps:
CodeDeploy gets appspec.yml and taskdef.json from S3 as input artifacts.
CodeDeploy registers taskdef.json as a new task definition in ECS.
CodeDeploy runs appspec.yml, which defines the task that the ECS service uses to deploy the green container.
After creation, the container receives a health check on the designated port and path.
If the health check passes, the Application Load Balancer redirects user traffic from the blue container to the green container. If the health check fails, the deployment is stuck at this step until the timeout expires, which triggers a failure.
The blue container is then deleted to save resources.
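In CDK, the blue/green deployment target for this stage could be modeled roughly as follows (a sketch; the ECS service, target groups, and listener are assumed to be defined elsewhere in the stack):

```typescript
import * as cdk from 'aws-cdk-lib';
import * as codedeploy from 'aws-cdk-lib/aws-codedeploy';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';

// Created elsewhere in the stack
declare const stack: cdk.Stack;
declare const service: ecs.FargateService;
declare const blueTargetGroup: elbv2.ApplicationTargetGroup;
declare const greenTargetGroup: elbv2.ApplicationTargetGroup;
declare const listener: elbv2.ApplicationListener;

// CodeDeploy shifts traffic from the blue to the green target group once health checks pass
new codedeploy.EcsDeploymentGroup(stack, 'BlueGreenDG', {
  service,
  blueGreenDeploymentConfig: {
    blueTargetGroup,
    greenTargetGroup,
    listener,
    // Keep the old (blue) tasks around briefly before terminating them
    terminationWaitTime: cdk.Duration.minutes(5),
  },
  deploymentConfig: codedeploy.EcsDeploymentConfig.ALL_AT_ONCE,
});
```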
Self-mutate Pipeline
Detail Diagram
Source Stage
Steps:
Before pushing code to the remote repo, the developer needs to check the CDK application using cdk-nag.
Every time there is a change in the main branch of the CodeCommit infrastructure repo, EventBridge triggers the pipeline to run.
CodeCommit pushes the source code to S3 as an output artifact.
Update Stage
Steps
CodeBuild takes source code from S3 as input artifacts.
CodeBuild finds the buildspec.yml file in the source code’s root folder.
Generate CloudFormation templates using the CDK.
While generating the templates, cdk-nag checks the application for any defects (see the sketch after this list).
Scan the generated templates using Snyk and generate test reports in JSON format.
Upload all the reports and output artifacts (template.json) to S3.
CloudFormation will get the template from S3 and perform an update stack action.
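Attaching cdk-nag to the CDK app is essentially a one-liner; a minimal sketch using the AwsSolutions rule pack:

```typescript
import * as cdk from 'aws-cdk-lib';
import { AwsSolutionsChecks } from 'cdk-nag';

const app = new cdk.App();
// ...stacks are added to the app here...

// Run the AwsSolutions rule pack against every construct during synthesis;
// rule violations are reported as errors and fail `cdk synth`
cdk.Aspects.of(app).add(new AwsSolutionsChecks({ verbose: true }));

app.synth();
```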
Comparing CodeGuru Security and Snyk Code
After using CodeGuru Security and Snyk Code as SAST tools, I have come to some conclusions:
| Tools | CodeGuru Security | Snyk Code |
| --- | --- | --- |
| Features | Uses the Amazon CodeGuru Detector Library | Uses its own security rules and supports custom rules |
| Supported IDEs | Here | Here. Snyk also has an LSP for users to configure. |
| Integration | GitHub, GitLab, Bitbucket, AWS CLI, AWS CodePipeline, Amazon Inspector, Amazon CodeWhisperer | GitHub, GitLab, Azure (TFS) Repos, Bitbucket, CLI, APIs, notifications, Security Hub (paid) |
| Remediation | Creates code blocks that directly replace vulnerable lines of code | Creates data flow and fix analysis; can fix vulnerabilities automatically with DeepCode AI |
| Notification | Does not support notifications at this time | Supports notifications via email and Slack, covering vulnerabilities, usage alerts, … |
| Severity | CVSS | CVSS v3.1 |
| Report | UI dashboard | UI dashboard |
| Status | Preview | Stable |
| Price | Free to use | Three plans: Free, Teams, Enterprise |
Obstacles Encountered
During the implementation, I encountered some problems:
Repository code initialization fails when empty files exist within the contents
Description: When attempting to initialize a CodeCommit repository with some code contents, the stack fails with a “Code supplied is not a valid .zip archive” error if one or more empty files (e.g. init.py) exist among the contents. The error message was rather confusing to understand at first glance.
CodeGuru Security does not support Security Hub natively
Description: When using CodeGuru Security with Security Hub, I noticed that the CodeGuru findings are not in ASFF format, which is required to upload them to Security Hub and centralize all the findings.
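The conversion boils down to mapping each CodeGuru finding onto the required ASFF fields and calling the Security Hub BatchImportFindings API. The sketch below illustrates the idea in TypeScript with a simplified finding shape; the project itself does this with a bash script, so the field mapping here is an assumption.

```typescript
import { SecurityHubClient, BatchImportFindingsCommand } from '@aws-sdk/client-securityhub';

// Simplified shape of a CodeGuru Security finding (only the fields used below)
interface CodeGuruFinding {
  id: string;
  title: string;
  description: string;
  severity: string; // e.g. "High"
}

// Map a finding onto the minimum set of ASFF fields Security Hub requires
function toAsff(finding: CodeGuruFinding, accountId: string, region: string) {
  const now = new Date().toISOString();
  return {
    SchemaVersion: '2018-10-08',
    Id: finding.id,
    // "default" product ARN used for custom findings imported into Security Hub
    ProductArn: `arn:aws:securityhub:${region}:${accountId}:product/${accountId}/default`,
    GeneratorId: 'codeguru-security',
    AwsAccountId: accountId,
    Types: ['Software and Configuration Checks/Vulnerabilities'],
    CreatedAt: now,
    UpdatedAt: now,
    Severity: { Label: finding.severity.toUpperCase() },
    Title: finding.title,
    Description: finding.description,
    Resources: [{ Type: 'Other', Id: 'application-source-code' }],
  };
}

export async function uploadFindings(findings: CodeGuruFinding[], accountId: string, region: string) {
  const client = new SecurityHubClient({ region });
  await client.send(new BatchImportFindingsCommand({
    Findings: findings.map(f => toAsff(f, accountId, region)),
  }));
}
```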
Self-mutate can't handle VPC changes to subnets or availability zones when the IP range does not change
Description: When setting up the VPC using the CDK, I tried to change the number of availability zones without changing the IP range. This resulted in the error “The CIDR 10.0.0.0/16 conflicts with another subnet”. According to the related issue thread, a subnet CIDR conflict is unavoidable in this use case.
Workarounds:
Perform an update with a different IP/CIDR range together with the latest settings change, then deploy again with the old IP/CIDR range.
Specify the IP/CIDR range for each subnet explicitly to avoid overlaps.
CodeCommit is not compatible with the glob pattern in CodePipeline
Description: For example, I wanted the pipeline to be triggered for every branch with the “features/” prefix, but this cannot be done because CodePipeline only supports triggering on a specific branch for CodeCommit sources.
Workaround: create an EventBridge rule that listens for changes to the desired branches (event patterns support prefix matching) and add a Lambda target; the Lambda function then triggers the pipeline for the matching branch, as sketched below.
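The rule side of that workaround could look roughly like this in CDK (a sketch; the Lambda handler that actually starts the pipeline execution for the matching branch is assumed to exist and is not shown):

```typescript
import * as cdk from 'aws-cdk-lib';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Created elsewhere: the stack, the CodeCommit repo ARN, and the trigger function
declare const stack: cdk.Stack;
declare const repositoryArn: string;
declare const triggerFn: lambda.IFunction;

// Match pushes to any branch whose name starts with "features/"
new events.Rule(stack, 'FeatureBranchRule', {
  eventPattern: {
    source: ['aws.codecommit'],
    resources: [repositoryArn],
    detailType: ['CodeCommit Repository State Change'],
    detail: {
      event: ['referenceCreated', 'referenceUpdated'],
      referenceType: ['branch'],
      referenceName: [{ prefix: 'features/' }],
    },
  },
  targets: [new targets.LambdaFunction(triggerFn)],
});
```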
CodeBuild can’t be forced to succeed even if there is a failed step
Description: In some scenarios, we still want the build project to succeed even when a step fails. For example, when Snyk scans the IaC template, it exits with a non-zero code when a vulnerability is found, and any exit code other than 0 fails the build project.
Inconsistent documentation in blue/green deployment using CloudFormation
Description: The CodeDeploy user guide says that deploying through CloudFormation only supports the ECS compute platform.
However, the CloudFormation user guide on blue/green deployments states that only the AWS Lambda compute platform is supported.
Improvements
Even within the project's six-week timeframe, several opportunities for improvement became apparent to me:
Role creation: Right now, we delegate role creation to the CDK, which can result in some roles having more permissions than they need. We should follow the least-privilege best practice to limit the impact of any single role.
Testing Environment: Currently, the pipeline does not have a testing environment for the tester before the production stage. This environment is crucial to validate that the latest code functions as expected.
More communication channels: Right now, notifications are only sent by email (Gmail); for teams, they should also go to channels like Microsoft Teams and Slack. This would give teams better visibility and let them respond to the pipeline state faster.
A better way to centralize reports: At the moment, Security Hub does not support CodeGuru Security by default even though it is a native service, and manually parsing the reports only goes so far when some fields cannot be filled in.
Upgrade to the GitFlow development process: Currently, the project follows trunk-based development, which is easy to implement and suitable for small projects. However, migrating to GitFlow would better match the company's culture and projects.
Conclusion
This project's main goal is integrating security into the DevOps pipeline, preferring AWS native services where possible. I deployed the solution in an AWS environment using the CDK and tested it against several use cases. The solution works and meets all the requirements. The pipeline automatically builds, tests, and deploys the application. During the build stage, AWS security tools scan for findings, and the reports are centralized for easy review. Users are notified about the pipeline state. During information gathering and implementation, I encountered some problems; some could be resolved, while others could not be dealt with comprehensively.