If you read the news at all back in 2021, then you probably recall the SolarWinds hack. What you might not remember is that the attackers reportedly “got in” via SolarWinds’ build system. The build is one of the first steps in any continuous integration (CI) process. So let’s talk about securing continuous integration and continuous deployment (CI/CD).
In my experience, many companies do not treat their CI/CD systems with the same level of security rigor as they do their applications. I remember once suggesting to a co-worker that we perform vulnerability scanning of our CI/CD Docker images. He pushed back and said something to the effect of “the containers only run for minutes at a time each and aren’t accessible via the Internet”.
He technically wasn’t wrong, but it struck me as odd that we would allow potentially insecure build systems access to our most valuable intellectual property (source code). Additionally, some of these containers were being used to provision cloud infrastructure and had broad access with the credentials assigned to them.
Let’s go over some simple and not-so-simple ways to secure your CI/CD process.
Vulnerability scan your CI/CD Docker images! If you are running CI/CD containers on a managed CI/CD service, then you probably don’t have as much visibility into the lifecycle of those containers as you’d like. Maybe they simply run as a Kubernetes Job on the backend or use a managed service like AWS CodeBuild or GCP Cloud Build. Or maybe your provider has a 24/7 sysadmin manually starting your jobs on his Raspberry Pi cluster in his mom’s basement. Whatever the case, the CI/CD environment may have vulnerabilities that you don’t know about. Additionally, your containers will likely be running alongside other customers’ containers, some of which could be malicious. Vulnerability scanning will help tell you whether your containers are potentially vulnerable to these malicious neighbors.
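As a rough sketch of how a scan could gate a pipeline, the snippet below reads a scanner report and picks out the findings serious enough to block a build. It assumes a JSON report shaped like the one Trivy emits (a top-level `Results` list whose entries carry a `Vulnerabilities` list); Trivy itself, the function name, and the severity threshold are my own illustrative choices, and other scanners will use different field names.

```python
import json

# Severities that should fail the build (an illustrative policy choice).
BLOCKING_SEVERITIES = {"HIGH", "CRITICAL"}


def blocking_findings(report_json: str) -> list[str]:
    """Return 'ID (package)' strings for findings that should block the build.

    Assumes a Trivy-style report: {"Results": [{"Vulnerabilities": [...]}]}.
    """
    report = json.loads(report_json)
    findings = []
    for result in report.get("Results") or []:
        # The key may be missing or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            severity = str(vuln.get("Severity") or "").upper()
            if severity in BLOCKING_SEVERITIES:
                findings.append(
                    f'{vuln.get("VulnerabilityID")} ({vuln.get("PkgName")})'
                )
    return findings
```

A pipeline step could call `blocking_findings` on the report for each build image and exit non-zero whenever the returned list is not empty.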
Avoid static credentials. By “static”, I mean credentials that have a long expiration date or no expiration. If your CI/CD workloads integrate with third-party systems, then they likely use some form of static credentials. But fear not! Static credentials can be made more secure by having a process for expiring/rotating them. For instance, AWS Secrets Manager can integrate with AWS Lambda to rotate your secrets. Of course, this assumes your third-party vendor has some sort of API for creating new credentials that Lambda can call. Also, your Lambda needs access to that third-party API and will likely use static credentials to do that. This is still more secure than keeping static credentials in your CI/CD system, though, as long as you lock down the Lambda function so that it can only be invoked by Secrets Manager for key rotation.
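To make the rotation flow a bit more concrete, here is a minimal sketch of a rotation function. Secrets Manager invokes the rotation Lambda once for each of four steps (`createSecret`, `setSecret`, `testSecret`, `finishSecret`). The `vendor` object and its `create_api_key` / `check_api_key` methods are hypothetical stand-ins for whatever API your third-party vendor offers, and `secrets` is assumed to be a boto3 Secrets Manager client handed in by the real Lambda handler. The sketch skips the idempotency checks and old-key revocation that a production rotation function would need.

```python
def rotate_secret(event, secrets, vendor):
    """Handle one step of a Secrets Manager rotation.

    event   -- rotation event with "SecretId", "ClientRequestToken" and "Step"
    secrets -- Secrets Manager client (e.g. boto3.client("secretsmanager"))
    vendor  -- hypothetical wrapper around the third-party credential API
    """
    secret_id = event["SecretId"]
    token = event["ClientRequestToken"]  # also used as the new version's ID
    step = event["Step"]

    if step == "createSecret":
        # Ask the vendor for a fresh key and stage it as AWSPENDING.
        new_key = vendor.create_api_key()
        secrets.put_secret_value(
            SecretId=secret_id,
            ClientRequestToken=token,
            SecretString=new_key,
            VersionStages=["AWSPENDING"],
        )
    elif step == "setSecret":
        # Nothing to do in this sketch: the vendor issued the key itself.
        pass
    elif step == "testSecret":
        pending = secrets.get_secret_value(
            SecretId=secret_id, VersionId=token, VersionStage="AWSPENDING"
        )["SecretString"]
        if not vendor.check_api_key(pending):
            raise RuntimeError("pending credential was rejected by the vendor")
    elif step == "finishSecret":
        # Promote the pending version to AWSCURRENT.
        versions = secrets.describe_secret(SecretId=secret_id)["VersionIdsToStages"]
        current = next(
            (vid for vid, stages in versions.items() if "AWSCURRENT" in stages),
            None,
        )
        if current == token:
            return  # already promoted
        kwargs = {
            "SecretId": secret_id,
            "VersionStage": "AWSCURRENT",
            "MoveToVersionId": token,
        }
        if current is not None:
            kwargs["RemoveFromVersionId"] = current
        secrets.update_secret_version_stage(**kwargs)
    else:
        raise ValueError(f"unknown rotation step: {step}")
```

Passing the clients in as arguments keeps the logic easy to test with fakes; the actual Lambda entry point would create the boto3 client and the vendor wrapper and then call `rotate_secret(event, secrets, vendor)`.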
Stop using roles with overly broad permissions. tl;dr use Least Privilege for CI/CD workloads. It is pretty common for DevOps engineers to set up CI/CD pipelines and use a role with very broad access. In AWS, for example, your CI/CD process might assume the Admin IAM Role of an AWS account. This makes getting the pipeline set up much easier. However, it also means that those credentials can totally wreck and destroy your environments due to either a breach or even an accidental change. At the very least, I recommend you modify the permissions so that the pipeline does not have permissions to delete critical infrastructure such as virtual networks, databases, blob storage buckets, backups, etc. Perhaps make an IAM role called “SaferAdmin” or something like that. You’d much rather perform deletions like that manually instead of dealing with loss of critical company data.
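Here is one way a “SaferAdmin” policy might look, sketched as a small Python script that prints an IAM policy document. It relies on the fact that an explicit Deny in IAM overrides any Allow. The list of protected actions is only a starting point that I picked for illustration; you would extend it to cover the services your company actually depends on.

```python
import json

# Destructive actions the pipeline role should never perform on its own.
# The list is illustrative; extend it to match the services you rely on.
PROTECTED_DELETE_ACTIONS = [
    "ec2:DeleteVpc",
    "ec2:DeleteSubnet",
    "rds:DeleteDBInstance",
    "rds:DeleteDBCluster",
    "s3:DeleteBucket",
    "backup:DeleteBackupVault",
    "backup:DeleteRecoveryPoint",
]


def safer_admin_policy() -> dict:
    """Build an admin-like policy that explicitly denies critical deletions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowEverything",
                "Effect": "Allow",
                "Action": "*",
                "Resource": "*",
            },
            {
                # An explicit Deny wins over the Allow above.
                "Sid": "DenyCriticalDeletes",
                "Effect": "Deny",
                "Action": PROTECTED_DELETE_ACTIONS,
                "Resource": "*",
            },
        ],
    }


print(json.dumps(safer_admin_policy(), indent=2))
```

Keep in mind that a role allowed to edit its own IAM policies could simply remove the Deny statement, so a real version would probably need to restrict those IAM actions too.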
Use OIDC where supported. Modern CI/CD vendors generally have OIDC built in, meaning each of your CI/CD jobs can have its own identity. This identity can be used to retrieve short-lived credentials from modern cloud providers like AWS or GCP. This actually helps deal with the problem of “static credentials” mentioned above! Additionally, you can likely configure the credentials to be scoped only to your build job. So if the credentials get leaked, they are very difficult to exploit (an attacker would have to use them from the build job machine).
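On AWS, the scoping happens in the IAM role’s trust policy, which decides which OIDC identities may assume the role through `sts:AssumeRoleWithWebIdentity`. The sketch below builds such a trust policy. I’m using GitHub Actions as the example vendor, which is my own assumption; the account ID, organization, and repository names are placeholders, and other CI/CD vendors use a different issuer and a different format for the `sub` claim.

```python
import json


def oidc_trust_policy(account_id: str, issuer_host: str,
                      audience: str, subject: str) -> dict:
    """Build a trust policy that lets one specific OIDC subject assume a role."""
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{issuer_host}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Only tokens minted for this audience...
                        f"{issuer_host}:aud": audience,
                        # ...and for this exact repository and branch.
                        f"{issuer_host}:sub": subject,
                    }
                },
            }
        ],
    }


# Placeholder values for illustration only.
policy = oidc_trust_policy(
    account_id="123456789012",
    issuer_host="token.actions.githubusercontent.com",
    audience="sts.amazonaws.com",
    subject="repo:example-org/example-repo:ref:refs/heads/main",
)
print(json.dumps(policy, indent=2))
```

Pinning the `sub` claim to a single repository and branch means a job running from a fork or a feature branch cannot assume the role, even though it talks to the same OIDC provider.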
Governance for CI/CD. The vast majority of CI/CD workloads could benefit from some level of governance to prevent intentional or accidental misuse. For instance, CD jobs that deploy to production tend to occur at certain times, so you should consider blocking deployments at all other times. You probably don’t want to deploy to production on the weekend, because production deployments risk causing issues and you don’t want your team to get paged. Of course, maybe your company only deploys on the weekends because your customers don’t use the product much at that time.
Interestingly enough, many CI/CD systems make governance like this tricky to implement. Developers can simply modify some sort of spec.yml file in their Git repository to run any workflows they want. Governance implies that the developers shouldn’t be able to bypass governance simply by modifying a spec.yml file. So depending on your tooling, this can be a challenge.
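The deployment-window rule itself is simple to express. The sketch below assumes a policy of weekdays between 09:00 and 16:00 local time; the window, the constant names, and the function name are all illustrative choices on my part.

```python
from datetime import datetime, time

# Illustrative policy: deploy Monday to Friday, 09:00 to 16:00 local time.
ALLOWED_WEEKDAYS = {0, 1, 2, 3, 4}  # datetime.weekday(): Monday is 0
WINDOW_START = time(9, 0)
WINDOW_END = time(16, 0)


def deployment_allowed(now: datetime) -> bool:
    """Return True if a production deployment may start at `now`."""
    in_allowed_day = now.weekday() in ALLOWED_WEEKDAYS
    in_allowed_hours = WINDOW_START <= now.time() < WINDOW_END
    return in_allowed_day and in_allowed_hours
```

The hard part is where the check runs, not the check itself. It only counts as governance if it lives somewhere developers cannot edit as part of a normal change, such as a centrally managed pipeline step or a separate approval service, rather than in the repository’s own spec.yml.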
Use multi-layered protection. Always ask yourself what would happen if a mitigation were to fail. For instance, if an attacker breached a container running as part of a build job, what is the worst thing they could do? Once you have identified that worst case, determine what you could do to limit the impact.