
Pipeline Fundamentals

By Atif Alam

Before diving into a specific CI/CD platform (GitHub Actions, GitLab CI, etc.), it helps to understand the universal concepts that all of them share. This page covers the building blocks of any CI/CD pipeline.

Every CI/CD system organizes work into a hierarchy:

```
Pipeline
├── Stage: Build
│   └── Job: compile
│       ├── Step: checkout code
│       ├── Step: install dependencies
│       └── Step: build artifact
├── Stage: Test
│   ├── Job: unit-tests
│   │   ├── Step: run pytest
│   │   └── Step: upload coverage
│   └── Job: lint
│       └── Step: run eslint
└── Stage: Deploy
    └── Job: deploy-staging
        ├── Step: authenticate to cloud
        └── Step: deploy application
```

| Concept | What It Is | Example |
|---|---|---|
| Pipeline | The entire automated workflow triggered by an event | A full build-test-deploy run |
| Stage | A logical phase that groups related jobs; stages usually run sequentially | Build, Test, Deploy |
| Job | A unit of work that runs on a single runner/agent; jobs within a stage can run in parallel | unit-tests, lint, build-image |
| Step (or task) | A single command or action within a job | npm install, docker build, kubectl apply |

| Concept | GitHub Actions | GitLab CI | Azure Pipelines | Jenkins |
|---|---|---|---|---|
| Pipeline | Workflow | Pipeline | Pipeline | Pipeline |
| Stage | (implicit via needs) | Stage | Stage | Stage |
| Job | Job | Job | Job | Stage/Step |
| Step | Step | Script line | Task/Step | Step |
| Config file | .github/workflows/*.yml | .gitlab-ci.yml | azure-pipelines.yml | Jenkinsfile |

Triggers define what starts a pipeline:

| Trigger | When It Fires | Use Case |
|---|---|---|
| Push | Code pushed to a branch | CI on every commit |
| Pull/Merge Request | PR opened, updated, or reopened | Validate before merge |
| Tag | A git tag is pushed | Release builds |
| Schedule (cron) | On a time schedule | Nightly builds, drift checks |
| Manual | User clicks a button or calls an API | Production deploys, ad-hoc runs |
| API / webhook | External system sends a request | Cross-repo triggers, ChatOps |
| Pipeline completion | Another pipeline finishes | Chained/downstream pipelines |
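
As a sketch, several of these triggers can be combined on one pipeline. This fragment uses GitHub Actions-style keys (the exact key names differ per platform):

```yaml
# Pseudocode: one pipeline, several triggers (GitHub Actions-style keys)
on:
  push:
    branches: [main]       # CI on every commit to main
  pull_request:            # validate before merge
  schedule:
    - cron: "0 2 * * *"    # nightly run at 02:00 UTC
  workflow_dispatch:       # manual run from the UI or API
```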

Most CI/CD systems let you narrow triggers:

```yaml
# Pseudocode — run only on main branch, only when src/ files change
trigger:
  branches: [main]
  paths: [src/**]
```

This is critical for monorepos where you don’t want every service to rebuild when an unrelated file changes.
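
In a monorepo that might look like the following sketch, with one path filter per service (the service names and layout are illustrative):

```yaml
# Pseudocode: each service rebuilds only when its own files change
service-a:
  trigger:
    paths: [services/a/**]

service-b:
  trigger:
    paths: [services/b/**]
```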

Artifacts are files produced by one job and consumed by another (or downloaded later):

```
Job: build
└── produces: app.jar (artifact)

Job: deploy
├── downloads: app.jar (from build job)
└── deploys to server
```

| Use Case | Example |
|---|---|
| Pass build output to deploy job | Compiled binary, Docker image tag, Terraform plan file |
| Store test results | JUnit XML, coverage reports |
| Archive for auditing | Build logs, SBOM (Software Bill of Materials) |

Artifacts are typically stored by the CI/CD platform for a configurable retention period (e.g. 30 days).
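
The build/deploy handoff above could be wired up like this, assuming GitHub Actions-style syntax with the `upload-artifact`/`download-artifact` actions (job names and paths are illustrative):

```yaml
# Pseudocode: pass a build artifact to a later job (GitHub Actions-style)
jobs:
  build:
    steps:
      - run: ./gradlew build
      - uses: actions/upload-artifact@v4
        with:
          name: app-jar
          path: build/libs/app.jar

  deploy:
    needs: build              # wait for build, so its artifact exists
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: app-jar       # fetch the artifact produced by the build job
      - run: ./deploy.sh app.jar
```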

Caching stores dependencies between pipeline runs to avoid re-downloading every time:

```
Run 1: npm install (downloads 800 MB of node_modules) → cache saved
Run 2: npm install (cache hit — restores node_modules in seconds)
```

| What to Cache | Cache Key | Impact |
|---|---|---|
| node_modules | Hash of package-lock.json | 30-60s saved |
| Python venv / pip | Hash of requirements.txt | 20-40s saved |
| Go modules | Hash of go.sum | 10-30s saved |
| Docker layers | Image hash or Dockerfile hash | Minutes saved |
| Gradle / Maven | Hash of build.gradle / pom.xml | 30-60s saved |

Key rule: Cache key should change when dependencies change (e.g. hash of the lockfile). When the key changes, the cache is rebuilt.
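
For example, in GitHub Actions-style syntax the lockfile hash goes directly into the cache key (a sketch; the key prefix is arbitrary):

```yaml
# Pseudocode: cache keyed on the lockfile hash (GitHub Actions-style)
- uses: actions/cache@v4
  with:
    path: node_modules
    key: node-${{ hashFiles('package-lock.json') }}   # new key when deps change
- run: npm ci    # fast on a cache hit; full install (and a fresh cache) otherwise
```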

| | Caching | Artifacts |
|---|---|---|
| Purpose | Speed up future runs | Pass data between jobs/stages |
| Scope | Across pipeline runs | Within a single pipeline run |
| Example | node_modules, pip packages | Built binary, test report |
| Expiration | LRU eviction or time-based | Configurable retention (days) |

Variables configure job behavior without hardcoding values:

```yaml
# Pseudocode
env:
  NODE_ENV: production
  APP_VERSION: 1.2.3
steps:
  - run: echo "Deploying version $APP_VERSION"
```

Variables can be set at:

  • Pipeline level — available to all jobs.
  • Job level — available to all steps in that job.
  • Step level — available to a single step.
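
On most platforms the most specific scope wins when the same variable is defined at several levels. A sketch of that precedence (pseudocode; values are illustrative):

```yaml
# Pseudocode: the most specific scope usually wins
env:                       # pipeline level: visible to every job
  REGION: us-east-1
jobs:
  deploy:
    env:                   # job level: overrides the pipeline value for this job
      REGION: eu-west-1
    steps:
      - run: echo "$REGION"    # job-level value (eu-west-1) applies here
      - run: echo "$REGION"
        env:               # step level: overrides both, for this step only
          REGION: ap-south-1
```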

Secrets are encrypted environment variables for sensitive data:

| Secret | Example |
|---|---|
| Cloud credentials | AWS access keys, Azure service principal |
| API tokens | Docker Hub token, npm publish token |
| Database passwords | Connection strings |
| Signing keys | Code signing certificates |

Best practices for secrets:

  • Never hardcode secrets in the pipeline file or source code.
  • Use the CI/CD platform’s secret store (encrypted at rest, masked in logs).
  • Prefer OIDC (OpenID Connect) over long-lived credentials — the pipeline gets a short-lived token from the cloud provider without storing any keys. See GitHub Actions and GitLab CI for platform-specific OIDC setup.
  • Restrict secrets to specific branches or environments.
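
An OIDC login might look like this sketch, assuming GitHub Actions with AWS and the `aws-actions/configure-aws-credentials` action (the role ARN is hypothetical):

```yaml
# Pseudocode: OIDC-based cloud login (GitHub Actions + AWS style)
permissions:
  id-token: write    # allow the job to request an OIDC token
  contents: read
steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/ci-deploy   # hypothetical role
      aws-region: us-east-1
  # No stored AWS keys: the job exchanges its OIDC token for short-lived credentials
```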

Environments represent deployment targets with optional protection rules:

```
Pipeline passes CI
   │
   ▼
Deploy to "staging"       (automatic, no approval needed)
   │
   ▼
Deploy to "production"    (requires manual approval from team lead)
```

| Feature | What It Does |
|---|---|
| Environment | Named target (dev, staging, production) with its own variables and secrets |
| Approval gate | Require one or more people to approve before the job runs |
| Wait timer | Delay deployment by N minutes (cool-down period) |
| Branch restriction | Only allow deployments from specific branches (e.g. main only for prod) |
| Deployment history | Track what was deployed when and by whom |
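
Tying a job to a protected environment is usually a one-line reference; the approval rules live on the environment itself. A GitHub Actions-style sketch:

```yaml
# Pseudocode: job bound to a protected environment (GitHub Actions-style)
jobs:
  deploy-prod:
    environment: production   # approval gates configured on "production" apply here
    steps:
      - run: ./deploy.sh production   # hypothetical deploy script
```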

Jobs in the same stage (or with no dependency) run simultaneously:

Stage: Test (3 jobs in parallel)
├── Job: unit-tests (2 min)
├── Job: integration-tests (5 min)
└── Job: lint (1 min)
Total time: 5 min (not 8 min)

A matrix runs the same job across multiple configurations:

```yaml
# Pseudocode — test on 3 Node versions × 2 OS
matrix:
  node: [18, 20, 22]
  os: [ubuntu, macos]
# Creates 6 parallel jobs:
# node-18-ubuntu, node-18-macos, node-20-ubuntu, ...
```

Use cases:

  • Test against multiple language versions.
  • Test on multiple operating systems.
  • Test with different database versions.
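
Most platforms also let you drop specific combinations from the matrix. A GitHub Actions-style sketch (job names and steps are illustrative):

```yaml
# Pseudocode: matrix with an exclusion (GitHub Actions-style)
jobs:
  test:
    strategy:
      matrix:
        node: [18, 20, 22]
        os: [ubuntu, macos]
        exclude:
          - node: 18
            os: macos    # skip this one combination: 5 jobs instead of 6
    steps:
      - run: npm test
```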

A runner (also called an agent or executor) is the machine that executes pipeline jobs:

| Type | What It Is | Pros | Cons |
|---|---|---|---|
| Cloud-hosted | Provided by the CI/CD platform (ephemeral VMs) | Zero maintenance, clean environment every run | Limited customization, potential queue times |
| Self-hosted | Your own machine (VM, bare metal, Kubernetes pod) | Full control, faster (pre-cached), access to internal networks | You maintain it, security responsibility |

| Scenario | Recommendation |
|---|---|
| Open-source project, standard builds | Cloud-hosted |
| Need GPU, special hardware | Self-hosted |
| Strict compliance (data can’t leave your network) | Self-hosted |
| Very high build volume (cost savings) | Self-hosted |
| Need access to internal services (private DB, APIs) | Self-hosted |
| Want zero maintenance | Cloud-hosted |

A typical end-to-end pipeline puts these pieces together:

```
┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐
│ Trigger │───►│  Build  │───►│  Test   │───►│Security │───►│ Deploy  │───►│ Deploy  │
│ (push)  │    │         │    │         │    │  Scan   │    │ Staging │    │  Prod   │
└─────────┘    └─────────┘    └─────────┘    └─────────┘    └─────────┘    └─────────┘
               compile        unit tests     SAST           auto           manual
               install deps   integration    dependency     deploy         approval
               build image    e2e (optional) container scan                deploy
               push to        coverage                                     smoke test
               registry
```

| Stage | What Happens | Failure Action |
|---|---|---|
| Build | Compile, install dependencies, create Docker image, push to registry | Pipeline stops — no point testing broken code |
| Test | Run unit tests, integration tests, generate coverage reports | Pipeline stops — don’t deploy broken code |
| Security | Static analysis (SAST), dependency vulnerability scan, container image scan | Pipeline stops or warns (depends on severity) |
| Deploy Staging | Deploy to staging environment, run smoke tests | Pipeline stops — staging is broken |
| Deploy Production | Manual approval, deploy to production, run smoke tests, monitor | Rollback if smoke tests fail |
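
These stages could be wired up as a minimal skeleton, here in GitLab CI-style syntax (job names and scripts are illustrative):

```yaml
# Pseudocode: reference pipeline skeleton (GitLab CI-style)
stages: [build, test, security, deploy-staging, deploy-prod]

build-image:
  stage: build
  script:
    - docker build -t app:$CI_COMMIT_SHA .

unit-tests:
  stage: test
  script:
    - pytest --junitxml=report.xml

dependency-scan:
  stage: security
  script:
    - ./scan-dependencies.sh          # hypothetical scanner wrapper

deploy-staging:
  stage: deploy-staging
  script:
    - ./deploy.sh staging && ./smoke-test.sh staging

deploy-prod:
  stage: deploy-prod
  when: manual                        # the approval gate before production
  script:
    - ./deploy.sh prod && ./smoke-test.sh prod
```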

All modern CI/CD tools store pipeline definitions in the repository as YAML (or Groovy for Jenkins):

| Benefit | Why It Matters |
|---|---|
| Version controlled | Pipeline changes go through PR review like application code |
| Reproducible | Any commit has its exact pipeline definition |
| Auditable | Git history shows who changed what and when |
| Portable | Pipeline lives with the code, not in a separate UI |
  • A pipeline is a hierarchy: pipeline > stage > job > step.
  • Triggers define what starts a pipeline — push, PR, schedule, manual, tag.
  • Artifacts pass data between jobs; caching speeds up repeated dependency installs.
  • Secrets should never be hardcoded — use the platform’s encrypted secret store or OIDC.
  • Environments with approval gates control the path from staging to production.
  • Matrix builds test across multiple configurations in parallel.
  • Runners can be cloud-hosted (zero maintenance) or self-hosted (full control).
  • Store pipelines as code in the repository — version controlled, reviewed, reproducible.