ci-cd
CI/CD pipeline design, optimization, DevSecOps security scanning, and troubleshooting. Use for creating workflows, debugging pipeline failures, implementing SAST/DAST/SCA, optimizing build performance, implementing caching strategies, setting up deployments, securing pipelines with OIDC/secrets management, and troubleshooting common issues across GitHub Actions, GitLab CI, and other platforms.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install ahmedasmar-devops-claude-skills-ci-cd
Repository
Skill path: ci-cd
Best for
Primary workflow: Design Product.
Technical facets: Full Stack, DevOps, Designer, Security.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: ahmedasmar.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install ci-cd into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/ahmedasmar/devops-claude-skills before adding ci-cd to shared team environments
- Use ci-cd for development workflows
Original source / Raw SKILL.md
---
name: ci-cd
description: CI/CD pipeline design, optimization, DevSecOps security scanning, and troubleshooting. Use for creating workflows, debugging pipeline failures, implementing SAST/DAST/SCA, optimizing build performance, implementing caching strategies, setting up deployments, securing pipelines with OIDC/secrets management, and troubleshooting common issues across GitHub Actions, GitLab CI, and other platforms.
---
# CI/CD Pipelines
Comprehensive guide for CI/CD pipeline design, optimization, security, and troubleshooting across GitHub Actions, GitLab CI, and other platforms.
## When to Use This Skill
Use this skill when:
- Creating new CI/CD workflows or pipelines
- Debugging pipeline failures or flaky tests
- Optimizing slow builds or test suites
- Implementing caching strategies
- Setting up deployment workflows
- Securing pipelines (secrets, OIDC, supply chain)
- Implementing DevSecOps security scanning (SAST, DAST, SCA)
- Troubleshooting platform-specific issues
- Analyzing pipeline performance
- Implementing matrix builds or test sharding
- Configuring multi-environment deployments
## Core Workflows
### 1. Creating a New Pipeline
**Decision tree:**
```
What are you building?
├── Node.js/Frontend → GitHub: assets/templates/github-actions/node-ci.yml | GitLab: assets/templates/gitlab-ci/node-ci.yml
├── Python → GitHub: assets/templates/github-actions/python-ci.yml | GitLab: assets/templates/gitlab-ci/python-ci.yml
├── Go → GitHub: assets/templates/github-actions/go-ci.yml | GitLab: assets/templates/gitlab-ci/go-ci.yml
├── Docker Image → GitHub: assets/templates/github-actions/docker-build.yml | GitLab: assets/templates/gitlab-ci/docker-build.yml
└── Other → Follow the pipeline design pattern below
```
**Basic pipeline structure:**
```yaml
# 1. Fast feedback (lint, format) - <1 min
# 2. Unit tests - 1-5 min
# 3. Integration tests - 5-15 min
# 4. Build artifacts
# 5. E2E tests (optional, main branch only) - 15-30 min
# 6. Deploy (with approval gates)
```
**Key principles:**
- Fail fast: Run cheap validation first
- Parallelize: Remove unnecessary job dependencies
- Cache dependencies: Use `actions/cache` or GitLab cache
- Use artifacts: Build once, deploy many times
See [best_practices.md](references/best_practices.md) for comprehensive pipeline design patterns.
### 2. Optimizing Pipeline Performance
**Quick wins checklist:**
- [ ] Add dependency caching (can cut build times by 50-90% when dependency installation dominates the run)
- [ ] Remove unnecessary `needs` dependencies
- [ ] Add path filters to skip unnecessary runs
- [ ] Use `npm ci` instead of `npm install`
- [ ] Add job timeouts to prevent hung builds
- [ ] Enable concurrency cancellation for duplicate runs
**Analyze existing pipeline:**
```bash
# Use the pipeline analyzer script
python3 scripts/pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml
```
**Common optimizations:**
- **Slow tests:** Shard tests with matrix builds
- **Repeated dependency installs:** Add caching
- **Sequential jobs:** Parallelize with proper `needs`
- **Full test suite on every PR:** Use path filters or test impact analysis
See [optimization.md](references/optimization.md) for detailed caching strategies, parallelization techniques, and performance tuning.
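To see why removing unnecessary `needs` edges helps, it can be useful to estimate the wall-clock time of a job graph. The following is a minimal sketch, not part of the skill's scripts: the function name `pipeline_duration` and the job timings are hypothetical, and it assumes unlimited runners and an acyclic graph.

```python
from functools import lru_cache


def pipeline_duration(durations: dict, needs: dict) -> float:
    """Return wall-clock minutes for a job graph, assuming unlimited runners.

    durations maps job name -> minutes; needs maps job name -> jobs it waits for.
    The graph must be acyclic.
    """

    @lru_cache(maxsize=None)
    def finish_time(job: str) -> float:
        # A job starts once its slowest dependency has finished.
        start = max((finish_time(dep) for dep in needs.get(job, [])), default=0.0)
        return start + durations[job]

    return max(finish_time(job) for job in durations)


# Hypothetical job timings in minutes.
durations = {"lint": 1, "test": 5, "build": 3}

sequential = {"test": ["lint"], "build": ["test"]}  # lint -> test -> build
parallel = {"build": ["lint", "test"]}              # lint and test run side by side

print(pipeline_duration(durations, sequential))  # 9.0
print(pipeline_duration(durations, parallel))    # 8.0
```

The saving equals the time of whichever job no longer sits on the critical path, so the biggest wins come from decoupling long-running jobs.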
### 3. Securing Your Pipeline
**Essential security checklist:**
- [ ] Use OIDC instead of static credentials
- [ ] Pin actions/includes to commit SHAs
- [ ] Use minimal permissions
- [ ] Enable secret scanning
- [ ] Add vulnerability scanning (dependencies, containers)
- [ ] Implement branch protection
- [ ] Separate test from deploy workflows
**Quick setup - OIDC authentication:**
**GitHub Actions → AWS:**
```yaml
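permissions:
  id-token: write   # Required to request the OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Example ARN; AWS account IDs are 12 digits
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region: us-east-1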
```
**Secrets management:**
- Store in platform secret stores (GitHub Secrets, GitLab CI/CD Variables)
- Mark as "masked" in GitLab
- Use environment-specific secrets
- Rotate regularly (every 90 days)
- Never log secrets
See [security.md](references/security.md) for comprehensive security patterns, supply chain security, and secrets management.
### 4. Troubleshooting Pipeline Failures
**Systematic approach:**
**Step 1: Check pipeline health**
```bash
python3 scripts/ci_health.py --platform github --repo owner/repo
```
**Step 2: Identify the failure type**
| Error Pattern | Common Cause | Quick Fix |
|---------------|--------------|-----------|
| "Module not found" | Missing dependency or cache issue | Clear cache, run `npm ci` |
| "Timeout" | Job taking too long | Add caching, increase timeout |
| "Permission denied" | Missing permissions | Add to `permissions:` block |
| "Cannot connect to Docker daemon" | Docker not available | Use correct runner or DinD |
| Intermittent failures | Flaky tests or race conditions | Add retries, fix timing issues |
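
The lookup in this table can be automated as a first triage pass over a failed job's log. This is a rough sketch under stated assumptions: `classify_failure` and `FAILURE_RULES` are hypothetical names, the regular expressions are illustrative approximations of the error patterns, and intermittent failures are left out because they cannot be detected from a single log.

```python
import re

# (regex, common cause, quick fix) rows adapted from the table above.
FAILURE_RULES = [
    (r"module not found", "Missing dependency or cache issue", "Clear cache, run `npm ci`"),
    (r"cannot connect to (the )?docker daemon", "Docker not available", "Use correct runner or DinD"),
    (r"permission denied", "Missing permissions", "Add to `permissions:` block"),
    (r"timeout|timed out", "Job taking too long", "Add caching, increase timeout"),
]


def classify_failure(log_text: str):
    """Return (cause, quick_fix) for the first matching rule, or None."""
    for pattern, cause, fix in FAILURE_RULES:
        if re.search(pattern, log_text, re.IGNORECASE):
            return cause, fix
    return None


print(classify_failure("ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock"))
```

Rules are checked in order and the first match wins, so more specific patterns should come before generic ones such as "timeout".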
**Step 3: Enable debug logging**
GitHub Actions:
```yaml
# Add repository secrets:
# ACTIONS_RUNNER_DEBUG = true
# ACTIONS_STEP_DEBUG = true
```
GitLab CI:
```yaml
variables:
  CI_DEBUG_TRACE: "true"
```
**Step 4: Reproduce locally**
```bash
# GitHub Actions - use act
act -j build
# Or Docker
docker run -it ubuntu:latest bash
# Then manually run the failing steps
```
See [troubleshooting.md](references/troubleshooting.md) for comprehensive issue diagnosis, platform-specific problems, and solutions.
### 5. Implementing Deployment Workflows
**Deployment pattern selection:**
| Pattern | Use Case | Complexity | Risk |
|---------|----------|------------|------|
| Direct | Simple apps, low traffic | Low | Medium |
| Blue-Green | Zero downtime required | Medium | Low |
| Canary | Gradual rollout, monitoring | High | Very Low |
| Rolling | Kubernetes, containers | Medium | Low |
**Basic deployment structure:**
```yaml
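deploy:
  needs: [build, test]
  if: github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  environment:
    name: production
    url: https://example.com
  steps:
    - name: Download artifacts
      uses: actions/download-artifact@v4
    - name: Deploy
      run: ./deploy.sh production   # placeholder deploy script
    - name: Health check
      run: curl -f https://example.com/health
    - name: Rollback on failure
      if: failure()
      run: ./rollback.sh            # placeholder rollback script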
```
**Multi-environment setup:**
- **Development:** Auto-deploy on develop branch
- **Staging:** Auto-deploy on main, requires passing tests
- **Production:** Manual approval required, smoke tests mandatory
See [best_practices.md](references/best_practices.md#deployment-strategies) for detailed deployment patterns and environment management.
### 6. Implementing DevSecOps Security Scanning
**Security scanning types:**
| Scan Type | Purpose | When to Run | Speed | Tools |
|-----------|---------|-------------|-------|-------|
| Secret Scanning | Find exposed credentials | Every commit | Fast (<1 min) | TruffleHog, Gitleaks |
| SAST | Find code vulnerabilities | Every commit | Medium (5-15 min) | CodeQL, Semgrep, Bandit, Gosec |
| SCA | Find dependency vulnerabilities | Every commit | Fast (1-5 min) | npm audit, pip-audit, Snyk |
| Container Scanning | Find image vulnerabilities | After build | Medium (5-10 min) | Trivy, Grype |
| DAST | Find runtime vulnerabilities | Scheduled/main only | Slow (15-60 min) | OWASP ZAP |
**Quick setup - Add security to existing pipeline:**
**GitHub Actions:**
```yaml
jobs:
  # Add before build job
  secret-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history so earlier commits are scanned too
      - uses: trufflesecurity/trufflehog@main  # Pin to a commit SHA in production
      - uses: gitleaks/gitleaks-action@v2

  sast:
    runs-on: ubuntu-latest
    permissions:
      contents: read          # Needed by checkout once permissions are restricted
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript  # or python, go
      - uses: github/codeql-action/analyze@v3

  build:
    needs: [secret-scan, sast]  # Add dependencies
```
**GitLab CI:**
```yaml
stages:
  - security  # Add before other stages
  - build
  - test

# Secret scanning
secret-scan:
  stage: security
  image: trufflesecurity/trufflehog:latest
  script:
    - trufflehog filesystem . --json --fail

# SAST
sast:semgrep:
  stage: security
  image: returntocorp/semgrep
  script:
    - semgrep scan --config=auto .

# Use GitLab templates
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
```
**Comprehensive security pipeline templates:**
- **GitHub Actions:** `assets/templates/github-actions/security-scan.yml` - Complete DevSecOps pipeline with all scanning stages
- **GitLab CI:** `assets/templates/gitlab-ci/security-scan.yml` - Complete DevSecOps pipeline with GitLab security templates
**Security gate pattern:**
Add a security gate job that evaluates all security scan results and fails the pipeline if critical issues are found:
```yaml
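security-gate:
  needs: [secret-scan, sast, sca, container-scan]
  script:
    # Parse the JSON reports produced by the scan jobs, compare the finding
    # counts against your thresholds, and exit non-zero on critical issues.
    # check-security-reports.sh is a placeholder for that logic.
    - ./scripts/check-security-reports.sh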
```
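
The threshold evaluation inside a security gate might look like the sketch below. It is illustrative only: the report shape (a JSON array of findings, each with a `severity` field), the names `evaluate_gate` and `DEFAULT_LIMITS`, and the limits themselves are assumptions. Real scanners such as Trivy or Semgrep each emit their own schema, so a normalization step would be needed first.

```python
import json
from collections import Counter

# Hypothetical limits: maximum number of findings tolerated per severity.
DEFAULT_LIMITS = {"critical": 0, "high": 5}


def evaluate_gate(report_texts, limits=None):
    """Return (passed, violations) for a list of JSON report strings.

    Each report is assumed to be a JSON array of objects with a "severity" key.
    """
    limits = DEFAULT_LIMITS if limits is None else limits
    counts = Counter()
    for text in report_texts:
        for finding in json.loads(text):
            counts[str(finding.get("severity", "unknown")).lower()] += 1
    # Record every severity whose count exceeds its limit.
    violations = {sev: counts[sev] for sev, limit in limits.items() if counts[sev] > limit}
    return (not violations), violations


sample = json.dumps([{"id": "X-1", "severity": "CRITICAL"}, {"id": "X-2", "severity": "low"}])
passed, violations = evaluate_gate([sample])
print(passed, violations)
```

In a pipeline job, the script would exit with a non-zero status when `passed` is false so that downstream jobs do not run.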
**Language-specific security tools:**
- **Node.js:** CodeQL, Semgrep, npm audit, eslint-plugin-security
- **Python:** CodeQL, Semgrep, Bandit, pip-audit, Safety
- **Go:** CodeQL, Semgrep, Gosec, govulncheck
All language-specific templates now include security scanning stages. See:
- `assets/templates/github-actions/node-ci.yml`
- `assets/templates/github-actions/python-ci.yml`
- `assets/templates/github-actions/go-ci.yml`
- `assets/templates/gitlab-ci/node-ci.yml`
- `assets/templates/gitlab-ci/python-ci.yml`
- `assets/templates/gitlab-ci/go-ci.yml`
See [devsecops.md](references/devsecops.md) for comprehensive DevSecOps guide covering all security scanning types, tool comparisons, and implementation patterns.
## Quick Reference Commands
### GitHub Actions
```bash
# List workflows
gh workflow list
# View recent runs
gh run list --limit 20
# View specific run
gh run view <run-id>
# Re-run failed jobs
gh run rerun <run-id> --failed
# Download logs
gh run view <run-id> --log > logs.txt
# Trigger workflow manually
gh workflow run ci.yml
# Check workflow status
gh run watch
```
### GitLab CI
```bash
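# Assumes the python-gitlab CLI (pip install python-gitlab), which provides the `gitlab` command
# View pipelines
gitlab project-pipeline list --project-id <project-id>
# Pipeline status
gitlab project-pipeline get --project-id <project-id> --id <pipeline-id>
# Retry failed jobs
gitlab project-pipeline retry --project-id <project-id> --id <pipeline-id>
# Cancel pipeline
gitlab project-pipeline cancel --project-id <project-id> --id <pipeline-id>
# Download artifacts
gitlab project-job artifacts --project-id <project-id> --id <job-id> > artifacts.zip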
```
## Platform-Specific Patterns
### GitHub Actions
**Reusable workflows:**
```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      node-version:
        required: true
        type: string
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
```
**Call from another workflow:**
```yaml
jobs:
  test:
    uses: ./.github/workflows/reusable-test.yml
    with:
      node-version: '20'
```
### GitLab CI
**Templates with extends:**
```yaml
.test_template:
image: node:20
before_script:
- npm ci
unit-test:
extends: .test_template
script:
- npm run test:unit
integration-test:
extends: .test_template
script:
- npm run test:integration
```
**DAG pipelines with needs:**
```yaml
build:
stage: build
test:unit:
stage: test
needs: [build]
test:integration:
stage: test
needs: [build]
deploy:
stage: deploy
needs: [test:unit, test:integration]
```
## Diagnostic Scripts
### Pipeline Analyzer
Analyzes workflow configuration for optimization opportunities:
```bash
# GitHub Actions
python3 scripts/pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml
# GitLab CI
python3 scripts/pipeline_analyzer.py --platform gitlab --config .gitlab-ci.yml
```
**Identifies:**
- Missing caching opportunities
- Unnecessary sequential execution
- Outdated action versions
- Unused artifacts
- Overly broad triggers
### CI Health Checker
Checks pipeline status and identifies issues:
```bash
# GitHub Actions
python3 scripts/ci_health.py --platform github --repo owner/repo --limit 20
# GitLab CI
python3 scripts/ci_health.py --platform gitlab --project-id 12345 --token $GITLAB_TOKEN
```
**Provides:**
- Success/failure rates
- Recent failure patterns
- Workflow-specific insights
- Actionable recommendations
## Reference Documentation
For deep-dive information on specific topics:
- **[best_practices.md](references/best_practices.md)** - Pipeline design, testing strategies, deployment patterns, dependency management, artifact handling, platform-specific patterns
- **[security.md](references/security.md)** - Secrets management, OIDC authentication, supply chain security, access control, vulnerability scanning, secure pipeline patterns
- **[devsecops.md](references/devsecops.md)** - Comprehensive DevSecOps guide: SAST (CodeQL, Semgrep, Bandit, Gosec), DAST (OWASP ZAP), SCA (npm audit, pip-audit, Snyk), container security (Trivy, Grype, SBOM), secret scanning (TruffleHog, Gitleaks), security gates, license compliance
- **[optimization.md](references/optimization.md)** - Caching strategies (dependencies, Docker layers, build artifacts), parallelization techniques, test splitting, build optimization, resource management
- **[troubleshooting.md](references/troubleshooting.md)** - Common issues (workflow not triggering, flaky tests, timeouts, dependency errors), Docker problems, authentication issues, platform-specific debugging
## Templates
Starter templates for common use cases:
### GitHub Actions
- **`assets/templates/github-actions/node-ci.yml`** - Complete Node.js CI/CD with security scanning, caching, matrix testing, and multi-environment deployment
- **`assets/templates/github-actions/python-ci.yml`** - Python pipeline with security scanning, pytest, coverage, PyPI deployment
- **`assets/templates/github-actions/go-ci.yml`** - Go pipeline with security scanning, multi-platform builds, benchmarks, integration tests
- **`assets/templates/github-actions/docker-build.yml`** - Docker build with multi-platform support, security scanning, SBOM generation, and signing
- **`assets/templates/github-actions/security-scan.yml`** - Comprehensive DevSecOps pipeline with SAST, DAST, SCA, container scanning, and security gates
### GitLab CI
- **`assets/templates/gitlab-ci/node-ci.yml`** - GitLab CI pipeline with security scanning, parallel execution, services, and deployment stages
- **`assets/templates/gitlab-ci/python-ci.yml`** - Python pipeline with security scanning, parallel testing, Docker builds, PyPI and Cloud Run deployment
- **`assets/templates/gitlab-ci/go-ci.yml`** - Go pipeline with security scanning, multi-platform builds, benchmarks, Kubernetes deployment
- **`assets/templates/gitlab-ci/docker-build.yml`** - Docker build with DinD, multi-arch, Container Registry, security scanning
- **`assets/templates/gitlab-ci/security-scan.yml`** - Comprehensive DevSecOps pipeline with SAST, DAST, SCA, container scanning, GitLab security templates, and security gates
## Common Patterns
### Caching Dependencies
**GitHub Actions:**
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
- run: npm ci
```
**GitLab CI:**
```yaml
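cache:
  key:
    files:
      - package-lock.json
  paths:
    - .npm/  # npm ci deletes node_modules/, so cache npm's download cache instead

install:
  script:
    - npm ci --cache .npm --prefer-offline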
```
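
Both cache examples key the cache on a hash of the lock file, with a prefix fallback when there is no exact hit. The sketch below illustrates that idea in Python. It is not the algorithm GitHub's `hashFiles` or GitLab's `cache:key:files` actually use, and the names `cache_key` and `restore` are hypothetical.

```python
import hashlib
from typing import List, Optional


def cache_key(os_name: str, prefix: str, lock_file_bytes: bytes) -> str:
    """Build a key that changes whenever the lock file content changes."""
    digest = hashlib.sha256(lock_file_bytes).hexdigest()
    return f"{os_name}-{prefix}-{digest}"


def restore(existing_keys: List[str], key: str, restore_prefix: str) -> Optional[str]:
    """Pick a cache entry: exact match first, else the newest prefix match.

    existing_keys is assumed to be ordered oldest to newest.
    """
    if key in existing_keys:
        return key
    matches = [k for k in existing_keys if k.startswith(restore_prefix)]
    return matches[-1] if matches else None


old = cache_key("Linux", "node", b'{"lockfileVersion": 3, "packages": {}}')
new = cache_key("Linux", "node", b'{"lockfileVersion": 3, "packages": {"left-pad": {}}}')

print(old != new)                          # True: a changed lock file gets a new key
print(restore([old], new, "Linux-node-") == old)  # True: falls back to the older cache
```

A fallback hit restores a slightly stale cache, which is usually still faster than a cold install because only the changed packages need downloading.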
### Matrix Builds
**GitHub Actions:**
```yaml
strategy:
  fail-fast: false
  matrix:
    os: [ubuntu-latest, macos-latest]
    node: [18, 20, 22]
```
**GitLab CI:**
```yaml
test:
  parallel:
    matrix:
      - NODE_VERSION: ['18', '20', '22']
```
### Conditional Execution
**GitHub Actions:**
```yaml
- name: Deploy
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
```
**GitLab CI:**
```yaml
deploy:
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
```
## Best Practices Summary
**Performance:**
- Enable dependency caching
- Parallelize independent jobs
- Add path filters to reduce unnecessary runs
- Use matrix builds for cross-platform testing
**Security:**
- Use OIDC for cloud authentication
- Pin actions to commit SHAs
- Enable secret scanning and vulnerability checks
- Apply principle of least privilege
**Reliability:**
- Add timeouts to prevent hung jobs
- Implement retry logic for flaky operations
- Use health checks after deployments
- Enable concurrency cancellation
**Maintainability:**
- Use reusable workflows/templates
- Document non-obvious decisions
- Keep workflows DRY with extends/includes
- Regular dependency updates
## Getting Started
1. **New pipeline:** Start with a template from `assets/templates/`
2. **Add security scanning:** Use DevSecOps templates or add security stages to existing pipelines (see workflow 6 above)
3. **Optimize existing:** Run `scripts/pipeline_analyzer.py`
4. **Debug issues:** Check `references/troubleshooting.md`
5. **Improve security:** Review `references/security.md` and `references/devsecops.md` checklists
6. **Speed up builds:** See `references/optimization.md`
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### references/best_practices.md
```markdown
# CI/CD Best Practices
Comprehensive guide to CI/CD pipeline design, testing strategies, and deployment patterns.
## Table of Contents
- [Pipeline Design Principles](#pipeline-design-principles)
- [Testing in CI/CD](#testing-in-cicd)
- [Deployment Strategies](#deployment-strategies)
- [Dependency Management](#dependency-management)
- [Artifact & Release Management](#artifact--release-management)
- [Platform Patterns](#platform-patterns)
---
## Pipeline Design Principles
### Fast Feedback Loops
Design pipelines to provide feedback quickly:
**Priority ordering:**
1. Linting and code formatting (seconds)
2. Unit tests (1-5 minutes)
3. Integration tests (5-15 minutes)
4. E2E tests (15-30 minutes)
5. Deployment (varies)
**Fail fast pattern:**
```yaml
# GitHub Actions
jobs:
lint:
runs-on: ubuntu-latest
steps:
- run: npm run lint
test:
needs: lint # Only run if lint passes
runs-on: ubuntu-latest
steps:
- run: npm test
e2e:
needs: [lint, test] # Run after basic checks
```
### Job Parallelization
Run independent jobs concurrently:
**GitHub Actions:**
```yaml
jobs:
lint:
runs-on: ubuntu-latest
test:
runs-on: ubuntu-latest
# No 'needs' - runs in parallel with lint
build:
needs: [lint, test] # Wait for both
runs-on: ubuntu-latest
```
**GitLab CI:**
```yaml
stages:
- validate
- test
- build
# Jobs in same stage run in parallel
unit-test:
stage: test
integration-test:
stage: test
e2e-test:
stage: test
```
### Monorepo Strategies
**Path-based triggers (GitHub):**
```yaml
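on:
  push:
    paths:
      - 'services/api/**'
      - 'shared/**'
jobs:
  api-test:
    # The workflow-level paths filter above already limits runs to these paths.
    # Per-commit file lists (head_commit.modified) are not available to
    # Actions expressions, so use a change-detection step for per-job filtering.
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm test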
```
**GitLab rules:**
```yaml
api-test:
rules:
- changes:
- services/api/**/*
- shared/**/*
frontend-test:
rules:
- changes:
- services/frontend/**/*
- shared/**/*
```
### Matrix Builds
Test across multiple versions/platforms:
**GitHub Actions:**
```yaml
strategy:
  fail-fast: false  # See all results
  matrix:
    os: [ubuntu-latest, macos-latest, windows-latest]
    node: [18, 20, 22]
    include:
      - os: ubuntu-latest
        node: 22
        coverage: true
    exclude:
      - os: windows-latest
        node: 18
```
**GitLab parallel:**
```yaml
test:
parallel:
matrix:
- NODE_VERSION: ['18', '20', '22']
OS: ['ubuntu', 'alpine']
```
---
## Testing in CI/CD
### Test Pyramid Strategy
Maintain proper test distribution:
```
        /\
       /E2E\         10% - Slow, expensive, flaky
      /-----\
     /  Int  \       20% - Medium speed
    /---------\
   /   Unit    \     70% - Fast, reliable
  /-------------\
```
**Implementation:**
```yaml
jobs:
unit-test:
runs-on: ubuntu-latest
steps:
- run: npm run test:unit # Fast, runs on every commit
integration-test:
runs-on: ubuntu-latest
needs: unit-test
steps:
- run: npm run test:integration # Medium, after unit tests
e2e-test:
runs-on: ubuntu-latest
needs: [unit-test, integration-test]
if: github.ref == 'refs/heads/main' # Only on main branch
steps:
- run: npm run test:e2e # Slow, only on main
```
### Test Splitting & Parallelization
Split large test suites:
**GitHub Actions:**
```yaml
strategy:
matrix:
shard: [1, 2, 3, 4]
steps:
- run: npm test -- --shard=${{ matrix.shard }}/4
```
**Playwright example:**
```yaml
strategy:
matrix:
shardIndex: [1, 2, 3, 4]
shardTotal: [4]
steps:
- run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```
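
The idea behind `--shard=i/n` is that every runner computes the same partition of the test list and then runs only its own slice. The following is a simplified sketch of that idea. Jest, Playwright, and other runners use their own partitioning logic, and the function name `shard` and the file names are hypothetical.

```python
from typing import List


def shard(tests: List[str], index: int, total: int) -> List[str]:
    """Return the tests belonging to 1-based shard `index` out of `total`."""
    if total < 1 or not 1 <= index <= total:
        raise ValueError(f"invalid shard {index}/{total}")
    # Sort first so every runner derives an identical ordering,
    # then take every `total`-th test starting at this shard's offset.
    return sorted(tests)[index - 1::total]


tests = ["cart.test.js", "auth.test.js", "search.test.js", "api.test.js", "ui.test.js"]
for i in (1, 2):
    print(i, shard(tests, i, 2))
```

Round-robin assignment balances test counts, not durations. Splitting by recorded timings gives more even shards when test run times vary widely.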
### Code Coverage
**Track coverage trends:**
```yaml
- name: Run tests with coverage
run: npm test -- --coverage
- name: Upload coverage
uses: codecov/codecov-action@v4
with:
files: ./coverage/lcov.info
fail_ci_if_error: true # Fail if upload fails
- name: Coverage check
run: |
COVERAGE=$(jq -r '.total.lines.pct' coverage/coverage-summary.json)
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
echo "Coverage $COVERAGE% is below 80%"
exit 1
fi
```
### Test Environment Management
**Docker Compose for services:**
```yaml
jobs:
  integration-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Use the Compose v2 plugin ("docker compose"); the standalone
      # docker-compose v1 binary is no longer shipped on GitHub-hosted runners.
      - name: Start services
        run: docker compose up -d postgres redis
      - name: Wait for services
        run: |
          timeout 30 bash -c 'until docker compose exec -T postgres pg_isready; do sleep 1; done'
      - name: Run tests
        run: npm run test:integration
      - name: Cleanup
        if: always()
        run: docker compose down
```
**GitLab services:**
```yaml
integration-test:
services:
- postgres:15
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_PASSWORD: password
script:
- npm run test:integration
```
---
## Deployment Strategies
### Deployment Patterns
**1. Direct Deployment (Simple)**
```yaml
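deploy:
  if: github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  steps:
    - run: |
        aws s3 sync dist/ s3://${{ secrets.S3_BUCKET }}
        # create-invalidation requires --paths; "/*" invalidates every cached object
        aws cloudfront create-invalidation --distribution-id ${{ secrets.CF_DIST }} --paths "/*"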
```
**2. Blue-Green Deployment**
```yaml
deploy:
  steps:
    # Assumes the new build is already deployed to the staging slot
    - name: Swap staging slot into production
      run: az webapp deployment slot swap --slot staging --target-slot production --resource-group $RG --name $APP
    - name: Health check
      run: |
        for i in {1..10}; do
          if curl -f https://$APP.azurewebsites.net/health; then
            echo "Health check passed"
            exit 0
          fi
          sleep 10
        done
        exit 1
    - name: Rollback on failure
      if: failure()
      # Swapping again returns the previous version to production
      run: az webapp deployment slot swap --slot staging --target-slot production --resource-group $RG --name $APP
```
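
The health-check loop in the blue-green example is a bounded retry: probe, wait, and give up after a fixed number of attempts. A Python sketch of the same logic follows. The name `wait_until_healthy` is hypothetical, and the probe is passed in as a callable so the example stays free of network calls. A real probe would issue an HTTP request to the service's health endpoint.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, pausing between failed tries."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            # No need to wait after the final failed attempt.
            sleep(delay_seconds)
    return False


# Simulated service that becomes healthy on the third probe.
responses = iter([False, False, True])
healthy = wait_until_healthy(lambda: next(responses), sleep=lambda _: None)
print(healthy)  # True
```

Returning a boolean instead of exiting lets the caller decide what failure means, for example triggering the rollback step.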
**3. Canary Deployment**
```yaml
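deploy-canary:
  steps:
    # Assumes a separate app-canary Deployment behind the same Service as app,
    # so the canary pod receives a small share of production traffic.
    - run: kubectl set image deployment/app-canary app=myapp:${{ github.sha }}
    - run: kubectl scale deployment/app-canary --replicas=1  # 1 canary pod
    - run: kubectl rollout status deployment/app-canary --timeout=120s
    - run: sleep 300  # Monitor for 5 minutes (replace with a real metrics check)
    - run: kubectl set image deployment/app app=myapp:${{ github.sha }}  # Promote to the full deployment
    - run: kubectl rollout status deployment/app --timeout=300s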
```
### Environment Management
**GitHub Environments:**
```yaml
jobs:
deploy-staging:
environment:
name: staging
url: https://staging.example.com
steps:
- run: ./deploy.sh staging
deploy-production:
needs: deploy-staging
environment:
name: production
url: https://example.com
steps:
- run: ./deploy.sh production
```
**Protection rules:**
- Require approval for production
- Restrict to specific branches
- Add deployment delay
**GitLab environments:**
```yaml
deploy:staging:
stage: deploy
environment:
name: staging
url: https://staging.example.com
on_stop: stop:staging
only:
- develop
deploy:production:
stage: deploy
environment:
name: production
url: https://example.com
when: manual # Require manual trigger
only:
- main
```
### Deployment Gates
**Pre-deployment checks:**
```yaml
pre-deploy-checks:
steps:
- name: Check migration status
run: ./scripts/check-migrations.sh
- name: Verify dependencies
run: npm audit --audit-level=high
- name: Check service health
run: curl -f https://api.example.com/health
```
**Post-deployment validation:**
```yaml
post-deploy-validation:
  needs: deploy
  steps:
    - name: Smoke tests
      run: npm run test:smoke
    - name: Monitor errors
      run: |
        # datadog-api is a placeholder; substitute your monitoring tool's query command
        ERROR_COUNT=$(datadog-api errors --since 5m)
        if [ "$ERROR_COUNT" -gt 10 ]; then
          echo "Error spike detected!"
          exit 1
        fi
```
---
## Dependency Management
### Lock Files
**Always commit lock files:**
- `package-lock.json` (npm)
- `yarn.lock` (Yarn)
- `pnpm-lock.yaml` (pnpm)
- `Cargo.lock` (Rust)
- `Gemfile.lock` (Ruby)
- `poetry.lock` (Python)
**Use deterministic install commands:**
```bash
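# Good - installs exactly what the lock file specifies
npm ci # Not npm install
yarn install --frozen-lockfile # Yarn 1; use --immutable on Yarn 2+
pnpm install --frozen-lockfile
pip install --require-hashes -r requirements.txt # requirements.txt must pin versions and hashes
# Bad - may modify the lock file
npm install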
```
### Dependency Caching
**See optimization.md for detailed caching strategies**
Quick reference:
- Hash lock files for cache keys
- Include OS/platform in cache key
- Use restore-keys for partial matches
- Separate cache for build artifacts vs dependencies
### Security Scanning
**Automated vulnerability checks:**
```yaml
security-scan:
steps:
- name: Dependency audit
run: |
npm audit --audit-level=high
# Or: pip-audit, cargo audit, bundle audit
- name: SAST scanning
uses: github/codeql-action/analyze@v3
- name: Container scanning
run: trivy image myapp:${{ github.sha }}
```
### Dependency Updates
**Automated dependency updates:**
- Dependabot (GitHub)
- Renovate
- GitLab Dependency Scanning
**Configuration example (Dependabot):**
```yaml
# .github/dependabot.yml
version: 2
updates:
- package-ecosystem: npm
directory: "/"
schedule:
interval: weekly
open-pull-requests-limit: 5
groups:
dev-dependencies:
dependency-type: development
```
---
## Artifact & Release Management
### Artifact Strategy
**Build once, deploy many:**
```yaml
build:
steps:
- run: npm run build
- uses: actions/upload-artifact@v4
with:
name: dist-${{ github.sha }}
path: dist/
retention-days: 7
deploy-staging:
needs: build
steps:
- uses: actions/download-artifact@v4
with:
name: dist-${{ github.sha }}
- run: ./deploy.sh staging
deploy-production:
needs: [build, deploy-staging]
steps:
- uses: actions/download-artifact@v4
with:
name: dist-${{ github.sha }}
- run: ./deploy.sh production
```
### Container Image Management
**Multi-stage builds:**
```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
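# Install all dependencies: the build step typically needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Remove devDependencies before node_modules is copied into the runtime image
RUN npm prune --omit=dev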
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]
```
**Image tagging strategy:**
```yaml
- name: Build and tag images
run: |
docker build -t myapp:${{ github.sha }} .
docker tag myapp:${{ github.sha }} myapp:latest
docker tag myapp:${{ github.sha }} myapp:v1.2.3
```
### Release Automation
**Semantic versioning:**
```yaml
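release:
  if: startsWith(github.ref, 'refs/tags/v')
  runs-on: ubuntu-latest
  permissions:
    contents: write  # Required to create the release
  steps:
    - uses: actions/checkout@v4
    # actions/create-release is archived; the gh CLI preinstalled on hosted runners is used instead.
    # github.ref_name is the bare tag (v1.2.3); github.ref would be refs/tags/v1.2.3.
    - name: Create release
      env:
        GH_TOKEN: ${{ github.token }}
      run: gh release create "${{ github.ref_name }}" --title "Release ${{ github.ref_name }}" --generate-notes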
```
**Changelog generation:**
```yaml
- name: Generate changelog
run: |
git log $(git describe --tags --abbrev=0)..HEAD \
--pretty=format:"- %s (%h)" > CHANGELOG.md
```
---
## Platform Patterns
### GitHub Actions
**Reusable workflows:**
```yaml
# .github/workflows/reusable-test.yml
on:
workflow_call:
inputs:
node-version:
required: true
type: string
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/setup-node@v4
with:
node-version: ${{ inputs.node-version }}
- run: npm test
```
**Composite actions:**
```yaml
# .github/actions/setup-app/action.yml
name: Setup Application
runs:
using: composite
steps:
- uses: actions/setup-node@v4
with:
node-version: 20
- run: npm ci
shell: bash
```
### GitLab CI
**Templates & extends:**
```yaml
.test_template:
image: node:20
before_script:
- npm ci
script:
- npm test
unit-test:
extends: .test_template
script:
- npm run test:unit
integration-test:
extends: .test_template
script:
- npm run test:integration
```
**Dynamic child pipelines:**
```yaml
generate-pipeline:
script:
- ./generate-config.sh > pipeline.yml
artifacts:
paths:
- pipeline.yml
trigger-pipeline:
trigger:
include:
- artifact: pipeline.yml
job: generate-pipeline
```
---
## Continuous Improvement
### Metrics to Track
- **Build duration:** Target < 10 minutes
- **Failure rate:** Target < 5%
- **Time to recovery:** Target < 1 hour
- **Deployment frequency:** Aim for multiple/day
- **Lead time:** Commit to production < 1 day
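
The first two metrics can be computed from a list of recent runs. This is a sketch under stated assumptions: the record shape (`conclusion`, `duration_min`) and the name `summarize_runs` are hypothetical and are not the output format of `scripts/ci_health.py`.

```python
from statistics import mean

# Targets taken from the list above.
MAX_BUILD_MINUTES = 10
MAX_FAILURE_RATE = 0.05


def summarize_runs(runs: list) -> dict:
    """Summarize runs given dicts with 'conclusion' and 'duration_min' keys."""
    if not runs:
        raise ValueError("no runs to summarize")
    failures = sum(1 for run in runs if run["conclusion"] != "success")
    failure_rate = failures / len(runs)
    mean_duration = mean(run["duration_min"] for run in runs)
    return {
        "failure_rate": failure_rate,
        "mean_duration_min": mean_duration,
        "meets_duration_target": mean_duration < MAX_BUILD_MINUTES,
        "meets_failure_target": failure_rate < MAX_FAILURE_RATE,
    }


runs = [
    {"conclusion": "success", "duration_min": 8},
    {"conclusion": "failure", "duration_min": 12},
    {"conclusion": "success", "duration_min": 7},
    {"conclusion": "success", "duration_min": 9},
]
print(summarize_runs(runs))
```

Tracking these values per week rather than per run makes trends visible and smooths out one-off failures.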
### Pipeline Optimization Checklist
- [ ] Jobs run in parallel where possible
- [ ] Dependencies are cached
- [ ] Test suite is properly split
- [ ] Linting fails fast
- [ ] Only necessary tests run on PRs
- [ ] Artifacts are reused across jobs
- [ ] Pipeline has appropriate timeouts
- [ ] Flaky tests are identified and fixed
- [ ] Security scanning is automated
- [ ] Deployment requires approval
### Regular Reviews
**Monthly:**
- Review build duration trends
- Analyze failure patterns
- Update dependencies
- Review security scan results
**Quarterly:**
- Audit pipeline efficiency
- Review deployment frequency
- Update CI/CD tools and actions
- Team retrospective on CI/CD pain points
```
### references/optimization.md
```markdown
# CI/CD Pipeline Optimization
Comprehensive guide to improving pipeline performance through caching, parallelization, and smart resource usage.
## Table of Contents
- [Caching Strategies](#caching-strategies)
- [Parallelization Techniques](#parallelization-techniques)
- [Build Optimization](#build-optimization)
- [Test Optimization](#test-optimization)
- [Resource Management](#resource-management)
- [Monitoring & Metrics](#monitoring--metrics)
---
## Caching Strategies
### Dependency Caching
**Impact:** Can reduce build times by 50-90%
#### GitHub Actions
**Node.js/npm:**
```yaml
- uses: actions/cache@v4
with:
path: ~/.npm
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
restore-keys: |
${{ runner.os }}-node-
- run: npm ci
```
**Python/pip:**
```yaml
- uses: actions/cache@v4
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
restore-keys: |
${{ runner.os }}-pip-
- run: pip install -r requirements.txt
```
**Go modules:**
```yaml
- uses: actions/cache@v4
with:
path: |
~/.cache/go-build
~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
restore-keys: |
${{ runner.os }}-go-
- run: go build
```
**Rust/Cargo:**
```yaml
- uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
target/
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-
- run: cargo build --release
```
**Maven:**
```yaml
- uses: actions/cache@v4
with:
path: ~/.m2/repository
key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
restore-keys: |
${{ runner.os }}-maven-
- run: mvn clean install
```
#### GitLab CI
**Global cache:**
```yaml
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
- vendor/
```
**Job-specific cache:**
```yaml
build:
  cache:
    key: build-${CI_COMMIT_REF_SLUG}
    paths:
      - target/
    policy: push # Upload only
test:
  cache:
    key: build-${CI_COMMIT_REF_SLUG}
    paths:
      - target/
    policy: pull # Download only
```
**Cache with files checksum:**
```yaml
cache:
key:
files:
- package-lock.json
- yarn.lock
paths:
- node_modules/
```
### Build Artifact Caching
**Docker layer caching (GitHub):**
```yaml
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
with:
context: .
cache-from: type=gha
cache-to: type=gha,mode=max
push: false
tags: myapp:latest
```
**Docker layer caching (GitLab):**
```yaml
build:
image: docker:latest
services:
- docker:dind
variables:
DOCKER_DRIVER: overlay2
script:
- docker pull $CI_REGISTRY_IMAGE:latest || true
- docker build --cache-from $CI_REGISTRY_IMAGE:latest -t $CI_REGISTRY_IMAGE:latest .
- docker push $CI_REGISTRY_IMAGE:latest
```
**Gradle build cache:**
```yaml
- uses: actions/cache@v4
with:
path: |
~/.gradle/caches
~/.gradle/wrapper
key: ${{ runner.os }}-gradle-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}
- run: ./gradlew build --build-cache
```
### Cache Best Practices
**Key strategies:**
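- Include OS/platform: `${{ runner.os }}-` on GitHub; GitLab has no predefined `CI_RUNNER_OS` variable, so add an OS/arch label yourself (for example via a job-level variable)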
- Hash lock files: `hashFiles('**/package-lock.json')`
- Use restore-keys for fallback matches
- Separate caches for different purposes
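The key strategies above can be sketched in a few lines of Python. This is only an illustration of why a key built from the OS plus a lock-file hash changes exactly when the lock file changes; the function names (`cache_key`, `fallback_prefix`) are hypothetical, and the hashing is analogous to, not identical with, what `hashFiles` does.

```python
import hashlib


def cache_key(os_name: str, tool: str, lock_contents: bytes) -> str:
    """Build '<os>-<tool>-<sha256 of lock file contents>'."""
    digest = hashlib.sha256(lock_contents).hexdigest()
    return f"{os_name}-{tool}-{digest}"


def fallback_prefix(os_name: str, tool: str) -> str:
    """Prefix used for a partial match, analogous to a restore-keys entry."""
    return f"{os_name}-{tool}-"


if __name__ == "__main__":
    old = cache_key("Linux", "node", b'{"lockfileVersion": 3}')
    new = cache_key("Linux", "node", b'{"lockfileVersion": 3, "packages": {}}')
    # A changed lock file yields a new exact key, but the prefix still matches.
    print(old != new, new.startswith(fallback_prefix("Linux", "node")))
```

An unchanged lock file always produces the same key (an exact hit), while any edit produces a new key that can still fall back to the most recent cache sharing the prefix.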
**Cache invalidation:**
```yaml
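# Version prefix in the cache key: bump v2 -> v3 to invalidate every existing cache.
# Avoid per-pipeline values such as CI_PIPELINE_ID here; they make every key unique, so the cache is never reused.
cache:
  key: v2-${CI_COMMIT_REF_SLUG}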
```
**Cache size management:**
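- GitHub: 10 GB per repository by default; entries not accessed for 7 days are removed, and the least recently used entries are evicted once the limit is exceeded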
- GitLab: Configurable per runner
---
## Parallelization Techniques
### Job Parallelization
**Remove unnecessary dependencies:**
```yaml
# Before - Sequential
jobs:
lint:
test:
needs: lint
build:
needs: test
# After - Parallel
jobs:
lint:
test:
build:
needs: [lint, test] # Only wait for what's needed
```
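To see why dropping an unnecessary `needs` edge helps, the wall-clock time of a pipeline can be estimated as its longest dependency chain. The sketch below assumes unlimited runners and uses made-up job durations (lint 2 min, test 10 min, build 5 min); `pipeline_duration` is a hypothetical helper, not part of any CI platform.

```python
from functools import lru_cache


def pipeline_duration(durations: dict[str, int], needs: dict[str, list[str]]) -> int:
    """Wall-clock minutes when every job starts as soon as all of its needs finish."""

    @lru_cache(maxsize=None)
    def finish(job: str) -> int:
        # A job starts when its slowest dependency has finished (or at 0 if it has none).
        start = max((finish(dep) for dep in needs.get(job, [])), default=0)
        return start + durations[job]

    return max(finish(job) for job in durations)


if __name__ == "__main__":
    durations = {"lint": 2, "test": 10, "build": 5}  # hypothetical minutes
    sequential = {"test": ["lint"], "build": ["test"]}
    parallel = {"build": ["lint", "test"]}
    print(pipeline_duration(durations, sequential))
    print(pipeline_duration(durations, parallel))
```

With these numbers the sequential layout takes 17 minutes and the parallel layout 15, because `lint` no longer sits on the critical path.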
### Matrix Builds
**GitHub Actions:**
```yaml
strategy:
  matrix:
    os: [ubuntu-latest, macos-latest, windows-latest]
    node: [18, 20, 22]
    include:
      - os: ubuntu-latest
        node: 22
        coverage: true
    exclude:
      - os: macos-latest
        node: 18
  fail-fast: false
  max-parallel: 10 # Limit concurrent jobs
```
**GitLab parallel:**
```yaml
test:
parallel:
matrix:
- NODE_VERSION: ['18', '20', '22']
TEST_SUITE: ['unit', 'integration']
script:
- nvm use $NODE_VERSION
- npm run test:$TEST_SUITE
```
### Test Splitting
**Jest sharding:**
```yaml
strategy:
matrix:
shard: [1, 2, 3, 4]
steps:
- run: npm test -- --shard=${{ matrix.shard }}/4
```
**Playwright sharding:**
```yaml
strategy:
matrix:
shardIndex: [1, 2, 3, 4]
shardTotal: [4]
steps:
- run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```
**Pytest splitting:**
```yaml
strategy:
matrix:
group: [1, 2, 3, 4]
steps:
- run: pytest --splits 4 --group ${{ matrix.group }} # --splits/--group come from the pytest-split plugin
```
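Sharding pays off most when the shards finish at roughly the same time. As a rough sketch of the idea (Jest, Playwright, and pytest-split each use their own strategy), the function below balances tests across shards with a greedy "longest test first" rule. The timing data and the name `split_by_duration` are assumptions for illustration.

```python
import heapq


def split_by_duration(durations: dict[str, float], shards: int) -> list[list[str]]:
    """Greedy split: take the longest test first and put it on the lightest shard."""
    heap = [(0.0, index) for index in range(shards)]  # (current load, shard index)
    heapq.heapify(heap)
    groups: list[list[str]] = [[] for _ in range(shards)]
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        load, index = heapq.heappop(heap)
        groups[index].append(name)
        heapq.heappush(heap, (load + seconds, index))
    return groups


if __name__ == "__main__":
    timings = {"a": 10, "b": 8, "c": 5, "d": 3, "e": 2}  # hypothetical seconds
    for shard in split_by_duration(timings, 2):
        print(shard, sum(timings[name] for name in shard))
```

Splitting by file count alone can leave one shard with all the slow tests; weighting by recorded duration keeps the slowest shard, which determines total wall-clock time, as short as possible.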
### Conditional Execution
**Path-based:**
```yaml
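jobs:
  frontend-test:
    # contains() on an array only matches whole elements, so join the paths into one string first.
    # head_commit.modified exists on push events only and lists files modified in the head commit.
    if: contains(join(github.event.head_commit.modified, ','), 'frontend/')
  backend-test:
    if: contains(join(github.event.head_commit.modified, ','), 'backend/')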
```
**GitLab rules:**
```yaml
frontend-test:
rules:
- changes:
- frontend/**/*
backend-test:
rules:
- changes:
- backend/**/*
```
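Conceptually, path-based rules map the list of changed files to the jobs whose patterns match. The following is a minimal sketch of that mapping, assuming a hypothetical `JOB_PATHS` table and Python's `fnmatchcase` (whose `*` also matches `/`, unlike the glob dialects of GitHub Actions or GitLab).

```python
from fnmatch import fnmatchcase

# Hypothetical mapping of job name -> path patterns that should trigger it.
JOB_PATHS = {
    "frontend-test": ["frontend/*"],
    "backend-test": ["backend/*"],
}


def jobs_for_changes(changed_files: list[str],
                     job_paths: dict[str, list[str]] = JOB_PATHS) -> list[str]:
    """Return the jobs with at least one pattern matching a changed file."""
    return [
        job
        for job, patterns in job_paths.items()
        if any(fnmatchcase(path, pattern)
               for path in changed_files
               for pattern in patterns)
    ]


if __name__ == "__main__":
    print(jobs_for_changes(["frontend/src/app.ts", "README.md"]))
```

A change that touches only documentation selects no jobs at all, which is where the time saving comes from.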
---
## Build Optimization
### Incremental Builds
**Turborepo (monorepo):**
```yaml
- run: npx turbo run build test lint --filter=[HEAD^1]
```
**Nx (monorepo):**
```yaml
- run: npx nx affected --target=build --base=origin/main
```
### Compiler Optimizations
**TypeScript incremental:**
```json
{
"compilerOptions": {
"incremental": true,
"tsBuildInfoFile": ".tsbuildinfo"
}
}
```
**Cache tsbuildinfo:**
```yaml
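- uses: actions/cache@v4
  with:
    path: .tsbuildinfo
    key: ts-build-${{ hashFiles('**/*.ts') }}
    # Fall back to the most recent build info so tsc can still rebuild incrementally
    restore-keys: |
      ts-build-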
```
### Multi-stage Docker Builds
```dockerfile
# Build stage
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
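# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime packages are copied to the final image
RUN npm prune --omit=dev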
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
```
### Build Tool Configuration
**Webpack production mode:**
```javascript
module.exports = {
mode: 'production',
optimization: {
minimize: true,
splitChunks: {
chunks: 'all'
}
}
}
```
**Vite optimization:**
```javascript
export default {
build: {
minify: 'terser', // requires the terser package; the default 'esbuild' minifier is faster
rollupOptions: {
output: {
manualChunks(id) {
if (id.includes('node_modules')) {
return 'vendor';
}
}
}
}
}
}
```
---
## Test Optimization
### Test Categorization
**Run fast tests first:**
```yaml
jobs:
unit-test:
runs-on: ubuntu-latest
steps:
- run: npm run test:unit # Fast (1-5 min)
integration-test:
needs: unit-test
runs-on: ubuntu-latest
steps:
- run: npm run test:integration # Medium (5-15 min)
e2e-test:
needs: [unit-test, integration-test]
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- run: npm run test:e2e # Slow (15-30 min)
```
### Selective Test Execution
**Run only changed:**
```yaml
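# Requires actions/checkout with fetch-depth: 0 so the base branch is available
- name: Get changed files
  id: changed
  if: github.event_name == 'pull_request'
  env:
    BASE_REF: ${{ github.base_ref }}
  run: |
    echo "files=$(git diff --name-only "origin/${BASE_REF}...HEAD" | tr '\n' ' ')" >> "$GITHUB_OUTPUT"
- name: Run affected tests
  if: steps.changed.outputs.files
  env:
    CHANGED_FILES: ${{ steps.changed.outputs.files }}
  # Unquoted on purpose so each path becomes its own argument (breaks on paths containing spaces)
  run: npm test -- --findRelatedTests $CHANGED_FILES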
```
### Test Fixtures & Data
**Reuse test databases:**
```yaml
services:
postgres:
image: postgres:15
env:
POSTGRES_DB: testdb
POSTGRES_PASSWORD: testpass
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- run: npm test # All tests share same DB
```
**Snapshot testing:**
```javascript
// Faster than full rendering tests
expect(component).toMatchSnapshot();
```
### Mock External Services
```javascript
// Instead of hitting real APIs
jest.mock('./api', () => ({
fetchData: jest.fn(() => Promise.resolve(mockData))
}));
```
---
## Resource Management
### Job Timeouts
**Prevent hung jobs:**
```yaml
jobs:
test:
timeout-minutes: 30 # Default: 360 (6 hours)
build:
timeout-minutes: 15
```
**GitLab:**
```yaml
test:
timeout: 30m # Default: 1h
```
### Concurrency Control
**GitHub Actions:**
```yaml
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true # Cancel old runs
```
**GitLab:**
```yaml
workflow:
auto_cancel:
on_new_commit: interruptible
job:
interruptible: true
```
### Resource Allocation
**GitLab runner tags:**
```yaml
build:
tags:
- high-memory
- ssd
```
**Kubernetes resource limits:**
```toml
# GitLab Runner config
[[runners]]
[runners.kubernetes]
cpu_request = "1"
cpu_limit = "2"
memory_request = "2Gi"
memory_limit = "4Gi"
```
---
## Monitoring & Metrics
### Track Key Metrics
**Build duration:**
```yaml
- name: Track duration
run: |
START=$SECONDS
npm run build
DURATION=$((SECONDS - START))
echo "Build took ${DURATION}s"
```
**Cache hit rate:**
```yaml
- uses: actions/cache@v4
id: cache
with:
path: node_modules
key: ${{ hashFiles('package-lock.json') }}
- name: Cache stats
run: |
if [ "${{ steps.cache.outputs.cache-hit }}" == "true" ]; then
echo "Cache hit!"
else
echo "Cache miss"
fi
```
### Performance Regression Detection
**Compare against baseline:**
```yaml
- name: Benchmark
run: npm run benchmark > results.json
- name: Compare
run: |
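CURRENT=$(jq -r '.duration | floor' results.json) # Integer seconds for shell arithmetic
BASELINE=120
# Fail when the run is more than 20% slower than the baseline
if [ "$CURRENT" -gt $((BASELINE * 120 / 100)) ]; then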
echo "Performance regression: ${CURRENT}s vs ${BASELINE}s baseline"
exit 1
fi
```
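The same baseline comparison can be written without integer-only shell arithmetic. This is a sketch under the assumption that `results.json` carries a numeric `duration` field, as in the step above; `is_regression` and `check_results` are hypothetical names, and the 20% tolerance mirrors the `BASELINE * 120 / 100` threshold.

```python
import json


def is_regression(current: float, baseline: float, tolerance: float = 0.20) -> bool:
    """Return True when `current` exceeds `baseline` by more than `tolerance`."""
    return current > baseline * (1 + tolerance)


def check_results(results_json: str, baseline: float) -> bool:
    """Read the `duration` field from the benchmark output and compare it."""
    duration = float(json.loads(results_json)["duration"])
    return is_regression(duration, baseline)


if __name__ == "__main__":
    # Hypothetical benchmark output; a CI step would read results.json instead.
    regressed = check_results('{"duration": 150.5}', baseline=120)
    raise SystemExit(1 if regressed else 0)
```

Keeping the tolerance as a parameter makes it easy to loosen the gate for noisy benchmarks instead of disabling it.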
### External Monitoring
**DataDog CI Visibility:**
```yaml
- run: datadog-ci junit upload --service myapp junit-results.xml
```
**BuildPulse (flaky test detection):**
```yaml
- uses: buildpulse/buildpulse-action@<version>
with:
account: myaccount
repository: myrepo
path: test-results/*.xml
```
---
## Optimization Checklist
### Quick Wins
- [ ] Enable dependency caching
- [ ] Remove unnecessary job dependencies
- [ ] Add job timeouts
- [ ] Enable concurrency cancellation
- [ ] Use `npm ci` instead of `npm install`
### Medium Impact
- [ ] Implement test sharding
- [ ] Use Docker layer caching
- [ ] Add path-based triggers
- [ ] Split slow test suites
- [ ] Use matrix builds for parallel execution
### Advanced
- [ ] Implement incremental builds (Nx, Turborepo)
- [ ] Use remote caching
- [ ] Optimize Docker images (multi-stage, distroless)
- [ ] Implement test impact analysis
- [ ] Set up distributed test execution
### Monitoring
- [ ] Track build duration trends
- [ ] Monitor cache hit rates
- [ ] Identify flaky tests
- [ ] Measure test execution time
- [ ] Set up performance regression alerts
---
## Performance Targets
**Build times:**
- Lint: < 1 minute
- Unit tests: < 5 minutes
- Integration tests: < 15 minutes
- E2E tests: < 30 minutes
- Full pipeline: < 20 minutes
**Resource usage:**
- Cache hit rate: > 80%
- Job success rate: > 95%
- Concurrent jobs: Balanced across available runners
- Queue time: < 2 minutes
**Cost optimization:**
- Build minutes used: Monitor monthly trends
- Storage: Keep artifacts < 7 days unless needed
- Self-hosted runners: Monitor utilization (target 60-80%)
```
### references/security.md
```markdown
# CI/CD Security
Comprehensive guide to securing CI/CD pipelines, secrets management, and supply chain security.
## Table of Contents
- [Secrets Management](#secrets-management)
- [OIDC Authentication](#oidc-authentication)
- [Supply Chain Security](#supply-chain-security)
- [Access Control](#access-control)
- [Secure Pipeline Patterns](#secure-pipeline-patterns)
- [Vulnerability Scanning](#vulnerability-scanning)
---
## Secrets Management
### Never Commit Secrets
**Prevention methods:**
- Use `.gitignore` for sensitive files
- Enable pre-commit hooks (git-secrets, gitleaks)
- Use secret scanning (GitHub, GitLab)
**If secrets are exposed:**
1. Rotate compromised credentials immediately
2. Remove from git history: `git filter-repo` or BFG Repo-Cleaner
3. Audit access logs for unauthorized usage
### Platform Secret Stores
**GitHub Secrets:**
```yaml
# Repository, Environment, or Organization secrets
steps:
- name: Deploy
env:
API_KEY: ${{ secrets.API_KEY }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
run: ./deploy.sh
```
**Secret hierarchy:**
1. Environment secrets (highest priority)
2. Repository secrets
3. Organization secrets (lowest priority)
**GitLab CI/CD Variables:**
```yaml
# Project > Settings > CI/CD > Variables
deploy:
script:
# Never echo secret variables into the job log; masking is only a safety net
- deploy --token "$DEPLOY_TOKEN"
variables:
ENVIRONMENT: "production" # Non-secret variable
```
**Variable types:**
- **Protected:** Only available to pipelines running on protected branches or protected tags
- **Masked:** Hidden in job logs
- **Environment scope:** Limit to specific environments
### External Secret Management
**HashiCorp Vault:**
```yaml
# GitHub Actions
- uses: hashicorp/vault-action@v3
with:
url: https://vault.example.com
method: jwt
role: cicd-role
secrets: |
secret/data/app api_key | API_KEY ;
secret/data/db password | DB_PASSWORD
```
**AWS Secrets Manager:**
```yaml
- name: Get secrets
run: |
SECRET=$(aws secretsmanager get-secret-value \
--secret-id prod/api/key \
--query SecretString --output text)
echo "::add-mask::$SECRET"
echo "API_KEY=$SECRET" >> $GITHUB_ENV
```
**Azure Key Vault:**
```yaml
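# Note: this action is deprecated; `az keyvault secret show` after azure/login is the usual replacement
- uses: Azure/get-keyvault-secrets@v1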
with:
keyvault: "my-keyvault"
secrets: 'api-key, db-password'
```
### Secret Rotation
**Implement rotation policies:**
```yaml
check-secret-age:
steps:
- name: Check secret age
run: |
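# LastChangedDate moves when the secret value is updated; CreatedDate never changes after creation
CHANGED=$(aws secretsmanager describe-secret \
--secret-id myapp/api-key \
--query 'LastChangedDate' --output text)
AGE=$(( ($(date +%s) - $(date -d "$CHANGED" +%s)) / 86400 ))
if [ "$AGE" -gt 90 ]; then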
echo "Secret is $AGE days old, rotation required"
exit 1
fi
```
**Best practices:**
- Rotate secrets every 90 days
- Use short-lived credentials when possible
- Audit secret access logs
- Automate rotation where possible
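The 90-day policy boils down to a date subtraction. Below is a small sketch of that check, assuming the last-changed timestamp is available as an ISO 8601 string with an explicit UTC offset; `secret_age_days` and `needs_rotation` are hypothetical helpers, not part of any AWS SDK.

```python
from datetime import datetime, timezone


def secret_age_days(last_changed_iso: str, now: datetime) -> int:
    """Whole days between the last change of the secret and `now`."""
    changed = datetime.fromisoformat(last_changed_iso)
    if changed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        changed = changed.replace(tzinfo=timezone.utc)
    return (now - changed).days


def needs_rotation(last_changed_iso: str, now: datetime, max_age_days: int = 90) -> bool:
    """True when the secret is older than the rotation policy allows."""
    return secret_age_days(last_changed_iso, now) > max_age_days


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(needs_rotation("2024-01-01T00:00:00+00:00", now))
```

Passing `now` explicitly keeps the check deterministic in tests and lets a scheduled job evaluate many secrets against the same reference time.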
---
## OIDC Authentication
### Why OIDC?
**Benefits over static credentials:**
- No long-lived secrets in CI/CD
- Automatic token expiration
- Fine-grained permissions
- Audit trail of authentication
### GitHub Actions OIDC
**AWS example:**
```yaml
permissions:
id-token: write # Required for OIDC
contents: read
jobs:
deploy:
steps:
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
aws-region: us-east-1
- run: aws s3 sync dist/ s3://my-bucket
```
**AWS IAM Trust Policy:**
```json
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "repo:owner/repo:ref:refs/heads/main"
}
}
}]
}
```
**GCP example:**
```yaml
- uses: google-github-actions/auth@v2
with:
workload_identity_provider: 'projects/123/locations/global/workloadIdentityPools/github/providers/github-provider'
service_account: '<service-account>@<project-id>.iam.gserviceaccount.com'
- run: gcloud storage cp dist/* gs://my-bucket
```
**Azure example:**
```yaml
- uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- run: az storage blob upload-batch -d mycontainer -s dist/
```
### GitLab OIDC
**Configure ID token:**
```yaml
deploy:
id_tokens:
GITLAB_OIDC_TOKEN:
aud: https://aws.amazonaws.com
script:
- |
CREDENTIALS=$(aws sts assume-role-with-web-identity \
--role-arn $AWS_ROLE_ARN \
--role-session-name gitlab-ci \
--web-identity-token $GITLAB_OIDC_TOKEN \
--duration-seconds 3600)
```
**Vault integration:**
```yaml
deploy:
id_tokens:
VAULT_ID_TOKEN:
aud: https://vault.example.com
before_script:
- export VAULT_TOKEN=$(vault write -field=token auth/jwt/login role=cicd-role jwt=$VAULT_ID_TOKEN)
```
---
## Supply Chain Security
### Dependency Verification
**Lock files:**
- Always commit lock files
- Use `npm ci`, not `npm install`
- Use `--frozen-lockfile` (Yarn Classic and pnpm) or `--immutable` (Yarn 2+)
**Checksum verification:**
```yaml
- name: Verify dependencies
run: |
npm ci --audit=true
npx lockfile-lint --path package-lock.json --validate-https
```
**SBOM generation:**
```yaml
- name: Generate SBOM
run: |
syft dir:. -o spdx-json > sbom.json
- uses: actions/upload-artifact@v4
with:
name: sbom
path: sbom.json
```
### Action/Workflow Security
**Pin to commit SHA (GitHub):**
```yaml
# Bad - mutable tag
- uses: actions/checkout@v4
# Better - specific version tag (can still be moved by the publisher)
- uses: actions/checkout@v4.1.1
# Best - pinned to SHA
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
```
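SHA pinning is easy to audit mechanically. The sketch below scans workflow text for `uses:` references that are not pinned to a full 40-character commit SHA; it is a simplified, regex-based illustration (the name `unpinned_actions` is hypothetical), not a replacement for a real YAML-aware linter.

```python
import re

# Capture the reference after `uses:` up to whitespace or a trailing comment.
USES = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"@[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return `uses:` references that are not pinned to a full commit SHA."""
    return [
        ref
        for ref in USES.findall(workflow_text)
        # Local actions and docker:// images are out of scope for this check.
        if not ref.startswith(("./", "docker://")) and not FULL_SHA.search(ref)
    ]


if __name__ == "__main__":
    sample = """
steps:
  - uses: actions/checkout@v4
  - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
  - uses: ./local-action
"""
    print(unpinned_actions(sample))
```

Running a check like this in CI turns "pin to SHA" from a guideline into an enforced rule.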
**Verify action sources:**
- Only use actions from trusted sources
- Review action code before first use
- Monitor Dependabot alerts for actions
- Use verified creators when possible
**GitLab include verification:**
```yaml
include:
- project: 'security/ci-templates'
ref: 'v2.1.0' # Pin to specific version
file: '/security-scan.yml'
```
### Container Image Security
**Use specific tags:**
```yaml
# Bad
image: node:latest
# Good
image: node:20.11.0-alpine
# Best
image: node:20.11.0-alpine@sha256:abc123...
```
**Minimal base images:**
```dockerfile
# Prefer distroless or alpine
FROM gcr.io/distroless/nodejs20-debian12
# Or alpine
FROM node:20-alpine
```
**Image scanning:**
```yaml
- name: Build image
run: docker build -t myapp:${{ github.sha }} .
- name: Scan image
run: |
trivy image --severity HIGH,CRITICAL myapp:${{ github.sha }}
grype myapp:${{ github.sha }}
```
### Code Signing
**Sign commits:**
```bash
git config --global user.signingkey <key-id>
git config --global commit.gpgsign true
```
**Verify signed commits (GitHub):**
```yaml
- name: Verify signatures
run: |
git verify-commit HEAD || exit 1
```
**Sign artifacts:**
```yaml
- name: Sign release
run: |
cosign sign myregistry/myapp:${{ github.sha }}
```
---
## Access Control
### Principle of Least Privilege
**GitHub permissions:**
```yaml
# Minimal permissions
permissions:
contents: read # Only read code
pull-requests: write # Comment on PRs
jobs:
deploy:
permissions:
contents: read
id-token: write # For OIDC
```
**GitLab protected branches:**
- Configure in Settings > Repository > Protected branches
- Restrict who can push and merge
- Require approval before merge
### Branch Protection
**GitHub branch protection rules:**
- Require pull request reviews
- Require status checks to pass
- Require signed commits
- Require linear history
- Include administrators
- Restrict who can push
**GitLab merge request approval rules:**
```yaml
# .gitlab/CODEOWNERS
* @senior-devs
/infra/ @devops-team
/security/ @security-team
```
### Environment Protection
**GitHub environment rules:**
- Required reviewers (up to 6)
- Wait timer before deployment
- Deployment branches (limit to specific branches)
- Custom deployment protection rules
**GitLab deployment protection:**
```yaml
production:
  environment:
    name: production
  rules:
    # rules cannot be combined with only/except, so both conditions live in one rule
    - if: '$CI_COMMIT_BRANCH == "main" && $APPROVED == "true"'
      when: manual # Require manual trigger
```
### Audit Logging
**Enable audit logs:**
- GitHub: Enterprise > Settings > Audit log
- GitLab: Admin Area > Monitoring > Audit Events
**Monitor for:**
- Secret access
- Permission changes
- Workflow modifications
- Deployment approvals
---
## Secure Pipeline Patterns
### Isolate Untrusted Code
**Separate test from deploy:**
```yaml
test:
# Runs on PRs from forks
permissions:
contents: read
pull-requests: write
deploy:
if: github.event_name == 'push' # Not on PR
permissions:
contents: read
id-token: write
```
**GitLab fork protection:**
```yaml
deploy:
  rules:
    # Separate rules are OR-ed, so both checks must be combined in a single expression
    - if: '$CI_PROJECT_PATH == "myorg/myrepo" && $CI_COMMIT_BRANCH == "main"' # Only main in the main repo
```
### Sanitize Inputs
**Avoid command injection:**
```yaml
# Bad - command injection risk
- run: echo "Title: ${{ github.event.issue.title }}"
# Good - use environment variable
- env:
TITLE: ${{ github.event.issue.title }}
run: echo "Title: $TITLE"
```
**Validate inputs:**
```yaml
- name: Validate version
  env:
    VERSION: ${{ inputs.version }} # Pass through env instead of interpolating into the script
  run: |
    if [[ ! "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
      echo "Invalid version format"
      exit 1
    fi
```
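The same allow-list validation can live in a script that the workflow calls. This sketch assumes the `MAJOR.MINOR.PATCH` format checked above; `is_valid_version` is a hypothetical helper. It uses `fullmatch` so that trailing newlines or appended shell metacharacters are rejected rather than silently ignored.

```python
import re

# Same pattern as the shell check: three dot-separated groups of digits.
VERSION_PATTERN = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_valid_version(value: str) -> bool:
    """Accept only strings that consist entirely of MAJOR.MINOR.PATCH."""
    return VERSION_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["1.2.3", "1.2", "1.2.3; rm -rf /"]:
        print(repr(candidate), is_valid_version(candidate))
```

Validating against a strict pattern is safer than trying to strip dangerous characters, because anything unexpected is refused outright.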
### Network Restrictions
**Limit egress:**
```yaml
# GitHub Actions with StepSecurity
- uses: step-security/harden-runner@v2
with:
egress-policy: block
allowed-endpoints: |
api.github.com:443
registry.npmjs.org:443
```
**GitLab network policy:**
```yaml
# Kubernetes NetworkPolicy for GitLab Runner pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: gitlab-runner-policy
spec:
podSelector:
matchLabels:
app: gitlab-runner
policyTypes:
- Egress
egress:
- to:
- namespaceSelector: {}
ports:
- protocol: TCP
port: 443
```
---
## Vulnerability Scanning
### Dependency Scanning
**npm audit:**
```yaml
- run: npm audit --audit-level=high
```
**Snyk:**
```yaml
- uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
args: --severity-threshold=high
```
**GitLab Dependency Scanning:**
```yaml
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
```
### Static Application Security Testing (SAST)
**CodeQL (GitHub):**
```yaml
- uses: github/codeql-action/init@v3
with:
languages: javascript, python
- uses: github/codeql-action/autobuild@v3
- uses: github/codeql-action/analyze@v3
```
**SonarQube:**
```yaml
- uses: sonarsource/sonarqube-scan-action@master
env:
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
```
### Container Scanning
**Trivy:**
```yaml
- run: |
docker build -t myapp .
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp
```
**Grype:**
```yaml
- uses: anchore/scan-action@v3
with:
image: myapp:latest
fail-build: true
severity-cutoff: high
```
### Dynamic Application Security Testing (DAST)
**OWASP ZAP:**
```yaml
dast:
stage: test
image: ghcr.io/zaproxy/zaproxy:stable # the owasp/zap2docker-* images are deprecated
script:
- zap-baseline.py -t https://staging.example.com -r report.html
artifacts:
paths:
- report.html
```
---
## Security Checklist
### Repository Level
- [ ] Enable branch protection
- [ ] Require code review
- [ ] Enable secret scanning
- [ ] Configure CODEOWNERS
- [ ] Enable signed commits
- [ ] Audit third-party integrations
### Pipeline Level
- [ ] Use OIDC instead of static credentials
- [ ] Pin actions/includes to specific versions
- [ ] Minimize permissions
- [ ] Sanitize user inputs
- [ ] Enable vulnerability scanning
- [ ] Separate test from deploy workflows
- [ ] Add security gates
### Secrets Management
- [ ] Use platform secret stores
- [ ] Enable secret masking
- [ ] Rotate secrets regularly
- [ ] Use short-lived credentials
- [ ] Audit secret access
- [ ] Never log secrets
### Monitoring & Response
- [ ] Enable audit logging
- [ ] Monitor for security alerts
- [ ] Set up incident response plan
- [ ] Regular security reviews
- [ ] Dependency update automation
- [ ] Security training for team
```
### references/troubleshooting.md
```markdown
# CI/CD Troubleshooting
Comprehensive guide to diagnosing and resolving common CI/CD pipeline issues.
## Table of Contents
- [Pipeline Failures](#pipeline-failures)
- [Dependency Issues](#dependency-issues)
- [Docker & Container Problems](#docker--container-problems)
- [Authentication & Permissions](#authentication--permissions)
- [Performance Issues](#performance-issues)
- [Platform-Specific Issues](#platform-specific-issues)
---
## Pipeline Failures
### Workflow Not Triggering
**GitHub Actions:**
**Symptoms:** Workflow doesn't run on push/PR
**Common causes:**
1. Workflow file in wrong location (must be `.github/workflows/`)
2. Invalid YAML syntax
3. Branch/path filters excluding the changes
4. Workflow disabled in repository settings
**Diagnostics:**
```bash
# Validate YAML
yamllint .github/workflows/ci.yml
# Check if workflow is disabled
gh workflow list --repo owner/repo
```
**Solutions:**
```yaml
# Check trigger configuration
on:
  push:
    branches: [main] # Ensure your branch matches
    paths-ignore:
      - 'docs/**' # May be excluding your changes
# Enable a disabled workflow (run in a shell, not in the workflow file):
#   gh workflow enable ci.yml --repo owner/repo
```
**GitLab CI:**
**Symptoms:** Pipeline doesn't start
**Diagnostics:**
```bash
# Validate .gitlab-ci.yml with the GitLab CLI
glab ci lint .gitlab-ci.yml
# Check CI/CD settings
# Project > Settings > CI/CD > General pipelines
```
**Solutions:**
- Check if CI/CD is enabled for the project
- Verify `.gitlab-ci.yml` is in repository root
- Check that the "Pipelines must succeed" merge request setting isn't what is blocking you
- Review `only`/`except` or `rules` configuration
### Jobs Failing Intermittently
**Symptoms:** Same job passes sometimes, fails others
**Common causes:**
1. Flaky tests
2. Race conditions
3. Network timeouts
4. Resource constraints
5. Time-dependent tests
**Identify flaky tests:**
```yaml
# GitHub Actions - Run multiple times
strategy:
matrix:
attempt: [1, 2, 3, 4, 5]
steps:
- run: npm test
```
**Solutions:**
```javascript
// Add retries to flaky tests
jest.retryTimes(3);
// Increase timeouts
jest.setTimeout(30000);
// Fix race conditions
await waitFor(() => expect(element).toBeInTheDocument(), {
timeout: 5000
});
```
**Network retry pattern:**
```yaml
- name: Install with retry
uses: nick-fields/retry@v3 # formerly nick-invision/retry
with:
timeout_minutes: 10
max_attempts: 3
command: npm ci
```
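When the flaky call lives inside your own tooling rather than a workflow step, the same pattern is a short loop with exponential backoff. The sketch below is a generic illustration; `retry` and its parameters are hypothetical names, and the injectable `sleep` exists only to keep the function testable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `action` until it succeeds, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_download() -> str:
        # Hypothetical operation that fails twice before succeeding.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("transient network error")
        return "ok"

    print(retry(flaky_download, base_delay=0.1))
```

Retries hide transient network faults, but they also hide real bugs, so it is worth logging each failed attempt and keeping the attempt count low.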
### Timeout Errors
**Symptoms:** "Job exceeded maximum time" or similar
**Solutions:**
```yaml
# GitHub Actions - Increase timeout
jobs:
build:
timeout-minutes: 60 # Default: 360
# GitLab CI
test:
timeout: 2h # Default: 1h
```
**Optimize long-running jobs:**
- Add caching for dependencies
- Split tests into parallel jobs
- Use faster runners
- Identify and optimize slow tests
### Exit Code Errors
**Symptoms:** "Process completed with exit code 1"
**Diagnostics:**
```yaml
# Add verbose logging
- run: npm test -- --verbose
# Check specific exit codes
- run: |
    set +e # The default bash shell runs with -e and would stop before $? is read
    npm test
    EXIT_CODE=$?
    echo "Exit code: $EXIT_CODE"
    if [ "$EXIT_CODE" -eq 127 ]; then
      echo "Command not found"
    elif [ "$EXIT_CODE" -eq 1 ]; then
      echo "General error"
    fi
    exit "$EXIT_CODE"
```
**Common exit codes:**
- `1`: General error
- `2`: Misuse of a shell builtin
- `126`: Command found but cannot execute (for example, missing execute permission)
- `127`: Command not found
- `130`: Terminated by Ctrl+C (SIGINT)
- `137`: Killed (SIGKILL, often the OOM killer)
- `143`: Terminated (SIGTERM)
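Codes above 128 follow the shell convention `128 + signal number`, which is why 137 maps to SIGKILL (9) and 143 to SIGTERM (15). A small sketch of that decoding, with the hypothetical helper `describe_exit_code`:

```python
import signal

KNOWN_CODES = {
    0: "success",
    1: "general error",
    2: "misuse of a shell builtin",
    126: "command cannot execute",
    127: "command not found",
}


def describe_exit_code(code: int) -> str:
    """Translate a process exit code into a short human-readable hint."""
    if code in KNOWN_CODES:
        return KNOWN_CODES[code]
    if 128 < code < 256:
        number = code - 128  # Shell convention: 128 + signal number.
        try:
            return f"terminated by {signal.Signals(number).name}"
        except ValueError:
            # The signal is not defined on this platform.
            return f"terminated by signal {number}"
    return "application-specific error"


if __name__ == "__main__":
    for code in (1, 127, 137, 143):
        print(code, describe_exit_code(code))
```

Anything outside these ranges is defined by the failing program itself, so its own documentation or logs are the next place to look.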
---
## Dependency Issues
### "Module not found" or "Cannot find package"
**Symptoms:** Build fails with missing dependency error
**Causes:**
1. Missing dependency in `package.json`
2. Cache corruption
3. Lock file out of sync
4. Private package access issues
**Solutions:**
```yaml
# Clear cache and reinstall
- run: rm -rf node_modules package-lock.json
- run: npm install
# Use npm ci for clean install
- run: npm ci
# Clear GitHub Actions cache
# Repository Actions tab > Caches > delete the specific cache (or `gh cache delete <key>`)
# GitLab - clear cache
cache:
key: $CI_COMMIT_REF_SLUG
policy: push # Force new cache
```
### Version Conflicts
**Symptoms:** Dependency resolution errors, peer dependency warnings
**Diagnostics:**
```bash
# Check for conflicts
npm ls
npm outdated
# View dependency tree
npm list --depth=1
```
**Solutions:**
```json
// Use overrides (package.json)
{
"overrides": {
"problematic-package": "2.0.0"
}
}
// Or resolutions (Yarn)
{
"resolutions": {
"problematic-package": "2.0.0"
}
}
```
### Private Package Access
**Symptoms:** "401 Unauthorized" or "404 Not Found" for private packages
**GitHub Packages:**
```yaml
- run: |
echo "@myorg:registry=https://npm.pkg.github.com" >> .npmrc
echo "//npm.pkg.github.com/:_authToken=${{ secrets.GITHUB_TOKEN }}" >> .npmrc
- run: npm ci
```
**npm Registry:**
```yaml
- run: echo "//registry.npmjs.org/:_authToken=${{ secrets.NPM_TOKEN }}" >> .npmrc
- run: npm ci
```
**GitLab Package Registry:**
```yaml
before_script:
- echo "@mygroup:registry=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/npm/" >> .npmrc
- echo "${CI_API_V4_URL#https?}/projects/${CI_PROJECT_ID}/packages/npm/:_authToken=${CI_JOB_TOKEN}" >> .npmrc
```
---
## Docker & Container Problems
### "Cannot connect to Docker daemon"
**Symptoms:** Docker commands fail with connection error
**GitHub Actions:**
```yaml
# Ensure Docker is available
runs-on: ubuntu-latest # Has Docker pre-installed
steps:
- run: docker ps # Test Docker access
```
**GitLab CI:**
```yaml
# Use Docker-in-Docker
image: docker:latest
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2376
DOCKER_TLS_CERTDIR: "/certs"
DOCKER_TLS_VERIFY: 1
DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
```
### Image Pull Errors
**Symptoms:** "Error response from daemon: pull access denied" or timeout
**Solutions:**
```yaml
# GitHub Actions - Login to registry
- uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
# Or for Docker Hub
- uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
# Add retry logic
- run: |
    for i in 1 2 3; do
      docker pull myimage:latest && exit 0
      sleep 5
    done
    exit 1 # Fail the step if every attempt failed
```
### "No space left on device"
**Symptoms:** Docker build fails with disk space error
**Solutions:**
```yaml
# GitHub Actions - Clean up space
- run: docker system prune -af --volumes
# Or use a community action (third-party, not built in)
- uses: jlumbroso/free-disk-space@main
with:
tool-cache: true
android: true
dotnet: true
# GitLab - configure runner (config.toml on the runner host, TOML syntax)
[[runners]]
[runners.docker]
volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
[runners.docker.tmpfs]
"/tmp" = "rw,noexec"
```
### Multi-platform Build Issues
**Symptoms:** Build fails for ARM/different architecture
**Solution:**
```yaml
- uses: docker/setup-qemu-action@v3
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
with:
platforms: linux/amd64,linux/arm64
context: .
push: false
```
---
## Authentication & Permissions
### "Permission denied" or "403 Forbidden"
**GitHub Actions:**
**Symptoms:** Cannot push, create release, or access API
**Solutions:**
```yaml
# Add necessary permissions
permissions:
contents: write # For pushing tags/releases
pull-requests: write # For commenting on PRs
packages: write # For pushing packages
id-token: write # For OIDC
# Check GITHUB_TOKEN permissions
- run: |
curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
https://api.github.com/repos/${{ github.repository }}
```
**GitLab CI:**
**Symptoms:** Cannot push to repository or access API
**Solutions:**
```yaml
# Use CI_JOB_TOKEN for API access
script:
- 'curl --header "JOB-TOKEN: $CI_JOB_TOKEN" "${CI_API_V4_URL}/projects"'
# Or use personal/project access token
variables:
GIT_STRATEGY: clone
before_script:
- git config --global user.email "ci-bot@example.com"
- git config --global user.name "CI Bot"
```
### Git Push Failures
**Symptoms:** "failed to push some refs" or "protected branch"
**Solutions:**
```yaml
# GitHub Actions - Check branch protection
# Settings > Branches > Branch protection rules
# Grant the token write access (this alone does not bypass branch protection)
permissions:
contents: write
# Or use PAT with admin access
- uses: actions/checkout@v4
with:
token: ${{ secrets.ADMIN_PAT }}
# GitLab - Grant permissions
# Settings > Repository > Protected Branches
# Add CI/CD role with push permission
```
### AWS Credentials Issues
**Symptoms:** "Unable to locate credentials"
**Solutions:**
```yaml
# Using OIDC (recommended)
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
aws-region: us-east-1
# Using secrets (legacy)
- uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
# Test credentials
- run: aws sts get-caller-identity
```
---
## Performance Issues
### Slow Pipeline Execution
**Diagnostics:**
```bash
# GitHub - View timing
gh run view <run-id> --log
# Identify slow steps
# Each step shows duration in UI
```
**Solutions:**
- See [optimization.md](optimization.md) for comprehensive guide
- Add dependency caching
- Parallelize independent jobs
- Use faster runners
- Reduce test scope on PRs
### Cache Not Working
**Symptoms:** Cache always misses, builds still slow
**Diagnostics:**
```yaml
- uses: actions/cache@v4
id: cache
with:
path: node_modules
key: ${{ hashFiles('**/package-lock.json') }}
- run: echo "Cache hit: ${{ steps.cache.outputs.cache-hit }}"
```
**Common issues:**
1. Key changes every time
2. Path doesn't exist
3. Cache size exceeds limit
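4. Cache evicted (on GitHub, entries unused for 7 days are removed, and the least recently used entries go first once the repository exceeds its cache limit)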
**Solutions:**
```yaml
# Use consistent key
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
# Add restore-keys for partial match
restore-keys: |
${{ runner.os }}-node-
# Check cache size
- run: du -sh node_modules
```
---
## Platform-Specific Issues
### GitHub Actions
**"Resource not accessible by integration":**
```yaml
# Add required permission
permissions:
issues: write # Or whatever resource you're accessing
```
**"Workflow is not shared":**
- Reusable workflows must be in `.github/workflows/`
- The repository holding the reusable workflow must be public, or must grant access to the caller (Settings > Actions > General > Access)
- Check workflow access settings
**"No runner available":**
- Self-hosted: Check runner is online and has matching labels
- GitHub-hosted: May hit concurrent job limit (check usage)
### GitLab CI
**"This job is stuck":**
- No runner available with matching tags
- All runners are busy
- Runner not configured for this project
**Solutions:**
```yaml
# Remove tags to use any available runner
job:
tags: []
# Or check runner configuration
# Settings > CI/CD > Runners
```
**"Job failed (system failure)":**
- Runner disconnected
- Resource limits exceeded
- Infrastructure issue
**Check runner logs:**
```bash
# On runner host
journalctl -u gitlab-runner -f
```
---
## Debugging Techniques
### Enable Debug Logging
**GitHub Actions:**
```yaml
# Repository > Settings > Secrets and variables > Actions > add as secrets or variables:
# ACTIONS_RUNNER_DEBUG = true
# ACTIONS_STEP_DEBUG = true
```
**GitLab CI:**
```yaml
variables:
CI_DEBUG_TRACE: "true" # Caution: May expose secrets!
```
### Interactive Debugging
**GitHub Actions:**
```yaml
# Add tmate for SSH access
- uses: mxschmitt/action-tmate@v3
if: failure()
```
**Local reproduction:**
```bash
# Use act to run GitHub Actions locally
act -j build
# Or nektos/act for Docker
docker run -v $(pwd):/workspace -it nektos/act -j build
```
### Reproduce Locally
```bash
# GitHub Actions - Use same Docker image
docker run -it ubuntu:latest bash
# Install dependencies and test
apt-get update && apt-get install -y nodejs npm
npm ci
npm test
```
---
## Prevention Strategies
### Pre-commit Checks
```yaml
# .pre-commit-config.yaml
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: check-yaml
- id: check-added-large-files
- repo: local
hooks:
- id: tests
name: Run tests
entry: npm test
language: system
pass_filenames: false
```
### CI/CD Health Monitoring
Use the `scripts/ci_health.py` script:
```bash
python3 scripts/ci_health.py --platform github --repo owner/repo
```
### Regular Maintenance
- [ ] Monthly: Review failed job patterns
- [ ] Monthly: Update actions/dependencies
- [ ] Quarterly: Audit pipeline efficiency
- [ ] Quarterly: Review and clean old caches
- [ ] Yearly: Major version updates
---
## Getting Help
**GitHub Actions:**
- Community Forum: https://github.community
- Documentation: https://docs.github.com/actions
- Status: https://www.githubstatus.com
**GitLab CI:**
- Forum: https://forum.gitlab.com
- Documentation: https://docs.gitlab.com/ee/ci
- Status: https://status.gitlab.com
**General CI/CD:**
- Stack Overflow: Tag [github-actions] or [gitlab-ci]
- Reddit: r/devops, r/cicd
```
### references/devsecops.md
```markdown
# DevSecOps in CI/CD
Comprehensive guide to integrating security into CI/CD pipelines with SAST, DAST, SCA, and security gates.
## Table of Contents
- [Shift-Left Security](#shift-left-security)
- [SAST (Static Application Security Testing)](#sast-static-application-security-testing)
- [DAST (Dynamic Application Security Testing)](#dast-dynamic-application-security-testing)
- [SCA (Software Composition Analysis)](#sca-software-composition-analysis)
- [Container Security](#container-security)
- [Secret Scanning](#secret-scanning)
- [Security Gates & Quality Gates](#security-gates--quality-gates)
- [Compliance & License Scanning](#compliance--license-scanning)
---
## Shift-Left Security
**Core principle:** Integrate security testing early in the development lifecycle, not just before production.
**Security testing stages in CI/CD:**
```
Commit → SAST → Unit Tests → SCA → Build → Container Scan → Deploy to Test → DAST → Production
  ↓       ↓                   ↓                  ↓                            ↓
Secret   Code             Dependency       Docker                         Dynamic     Security
Scan     Analysis         Vuln Check       Image Scan                     App Testing Gates
```
**Benefits:**
- Find vulnerabilities early (cheaper to fix)
- Faster feedback to developers
- Reduce security debt
- Prevent vulnerable code from reaching production
---
## SAST (Static Application Security Testing)
Analyzes source code, bytecode, or binaries for security vulnerabilities without executing the application.
### Tools by Language
| Language | Tools | GitHub Actions | GitLab CI |
|----------|-------|----------------|-----------|
| **Multi-language** | CodeQL, Semgrep, SonarQube | ✅ | ✅ |
| **JavaScript/TypeScript** | ESLint (security plugins), NodeJsScan | ✅ | ✅ |
| **Python** | Bandit, Pylint, Safety | ✅ | ✅ |
| **Go** | Gosec | ✅ | ✅ |
| **Java** | SpotBugs, FindSecBugs, PMD | ✅ | ✅ |
| **C#/.NET** | Security Code Scan, Roslyn Analyzers | ✅ | ✅ |
### CodeQL (GitHub)
**GitHub Actions:**
```yaml
name: CodeQL Analysis
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
schedule:
- cron: '0 2 * * 1' # Weekly scan
jobs:
analyze:
name: Analyze Code
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: ['javascript', 'python']
steps:
- uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
queries: security-extended
- name: Autobuild
uses: github/codeql-action/autobuild@v3
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"
```
**Key features:**
- Supports 10+ languages
- Deep semantic analysis
- Low false positive rate
- Integrates with GitHub Security tab
- Custom query support
### Semgrep
**GitHub Actions:**
```yaml
- name: Run Semgrep
uses: returntocorp/semgrep-action@v1
with:
config: >-
p/security-audit
p/owasp-top-ten
p/cwe-top-25
```
**GitLab CI:**
```yaml
semgrep:
  stage: test
  image: returntocorp/semgrep
  script:
    # GitLab's SAST report expects GitLab's own JSON schema rather than SARIF
    - semgrep --config=auto --gitlab-sast --output=gl-sast-report.json .
  artifacts:
    reports:
      sast: gl-sast-report.json
```
**Benefits:**
- Fast (runs in seconds)
- Highly customizable rules
- Multi-language support
- CI-native design
### Language-Specific SAST
**Python - Bandit:**
```yaml
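# GitHub Actions
- name: Run Bandit
  run: |
    pip install bandit
    # Write the full JSON report without failing the step
    bandit -r src/ -f json -o bandit-report.json --exit-zero
    # Fail the build only on high-severity findings (-lll)
    bandit -r src/ -lll
# GitLab CI
bandit:
  stage: test
  image: python:3.11
  script:
    - pip install bandit
    # -ll reports medium severity and above; a non-zero exit fails the job
    - bandit -r src/ -ll -f json -o bandit-report.json
  artifacts:
    when: always
    # Bandit's JSON is not in GitLab's SAST schema, so keep it as a plain artifact
    paths:
      - bandit-report.json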
```
**JavaScript - ESLint Security Plugin:**
```yaml
# GitHub Actions
- name: Run ESLint Security
run: |
npm install eslint-plugin-security
npx eslint . --plugin=security --format=json --output-file=eslint-security.json
```
**Go - Gosec:**
```yaml
# GitHub Actions
- name: Run Gosec
uses: securego/gosec@master
with:
args: '-fmt sarif -out gosec.sarif ./...'
# GitLab CI
gosec:
stage: test
image: securego/gosec:latest
script:
- gosec -fmt json -out gosec-report.json ./...
artifacts:
reports:
sast: gosec-report.json
```
### SonarQube/SonarCloud
**GitHub Actions:**
```yaml
- name: SonarCloud Scan
uses: SonarSource/sonarcloud-github-action@master
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
with:
args: >
-Dsonar.projectKey=my-project
-Dsonar.organization=my-org
-Dsonar.sources=src
-Dsonar.tests=tests
-Dsonar.python.coverage.reportPaths=coverage.xml
```
**GitLab CI:**
```yaml
sonarqube:
stage: test
image: sonarsource/sonar-scanner-cli:latest
script:
- sonar-scanner
-Dsonar.projectKey=$CI_PROJECT_NAME
-Dsonar.sources=src
-Dsonar.host.url=$SONAR_HOST_URL
-Dsonar.login=$SONAR_TOKEN
```
---
## DAST (Dynamic Application Security Testing)
Tests running applications for vulnerabilities by simulating attacks.
### OWASP ZAP
**Baseline scan workflow (GitHub Actions):**
```yaml
name: DAST Scan
on:
schedule:
- cron: '0 3 * * 1' # Weekly scan
workflow_dispatch:
jobs:
dast:
runs-on: ubuntu-latest
services:
app:
image: myapp:latest
ports:
- 8080:8080
steps:
- name: Wait for app to start
run: |
timeout 60 bash -c 'until curl -f http://localhost:8080/health; do sleep 2; done'
- name: ZAP Baseline Scan
uses: zaproxy/action-baseline@vX.Y.Z # pin to a released tag
with:
target: 'http://localhost:8080'
rules_file_name: '.zap/rules.tsv'
fail_action: true
- name: Upload ZAP report
if: always()
uses: actions/upload-artifact@v4
with:
name: zap-report
path: report_html.html
```
**GitLab CI:**
```yaml
dast:
stage: test
image: ghcr.io/zaproxy/zaproxy:stable
services:
- name: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
alias: testapp
script:
# Baseline scan
- zap-baseline.py -t http://testapp:8080 -r zap-report.html -J zap-report.json
artifacts:
when: always
paths:
- zap-report.html
- zap-report.json
reports:
dast: zap-report.json
only:
- schedules
- main
```
**ZAP scan types:**
1. **Baseline Scan** (Fast, ~1-2 min)
```bash
zap-baseline.py -t https://staging.example.com -r report.html
```
- Passive scanning only
- No active attacks
- Good for PR checks
2. **Full Scan** (Comprehensive, 10-60 min)
```bash
zap-full-scan.py -t https://staging.example.com -r report.html
```
- Active + Passive scanning
- Attempts exploits
- Use on staging only
3. **API Scan**
```bash
zap-api-scan.py -t https://api.example.com/openapi.json -f openapi -r report.html
```
- For REST APIs
- OpenAPI/Swagger support
### Other DAST Tools
**Nuclei:**
```yaml
- name: Run Nuclei
uses: projectdiscovery/nuclei-action@main
with:
target: https://staging.example.com
templates: cves,vulnerabilities,exposures
```
**Nikto (Web server scanner):**
```yaml
nikto:
stage: dast
image: sullo/nikto
script:
- nikto -h http://testapp:8080 -Format json -output nikto-report.json
```
---
## SCA (Software Composition Analysis)
Identifies vulnerabilities in third-party dependencies and libraries.
### Dependency Scanning
**GitHub Dependabot (Built-in):**
```yaml
# .github/dependabot.yml
version: 2
updates:
- package-ecosystem: "npm"
directory: "/"
schedule:
interval: "weekly"
open-pull-requests-limit: 10
- package-ecosystem: "pip"
directory: "/"
schedule:
interval: "weekly"
```
**GitHub Actions - Dependency Review:**
```yaml
- name: Dependency Review
uses: actions/dependency-review-action@v4
with:
fail-on-severity: high
```
**npm audit:**
```yaml
- name: npm audit
run: |
npm audit --audit-level=high
# Or with audit-ci for better control
npx audit-ci --high
```
**pip-audit (Python):**
```yaml
- name: Python Security Check
run: |
pip install pip-audit
pip-audit --requirement requirements.txt --format json --output pip-audit.json
```
**Snyk:**
```yaml
# GitHub Actions
- name: Run Snyk
uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
args: --severity-threshold=high --fail-on=all
# GitLab CI
snyk:
stage: test
image: snyk/snyk:node
script:
- snyk test --severity-threshold=high --json-file-output=snyk-report.json
artifacts:
reports:
dependency_scanning: snyk-report.json
```
**OWASP Dependency-Check:**
```yaml
- name: OWASP Dependency Check
run: |
wget https://github.com/jeremylong/DependencyCheck/releases/download/v8.4.0/dependency-check-8.4.0-release.zip
unzip dependency-check-8.4.0-release.zip
./dependency-check/bin/dependency-check.sh \
--scan . \
--format JSON \
--out dependency-check-report.json \
--failOnCVSS 7
```
### GitLab Dependency Scanning (Built-in)
```yaml
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
dependency_scanning:
variables:
DS_EXCLUDED_PATHS: "test/,tests/,spec/,vendor/"
```
---
## Container Security
### Image Scanning
**Trivy (Comprehensive):**
```yaml
# GitHub Actions
- name: Run Trivy
uses: aquasecurity/trivy-action@master
with:
image-ref: myapp:${{ github.sha }}
format: 'sarif'
output: 'trivy-results.sarif'
severity: 'CRITICAL,HIGH'
exit-code: '1'
- name: Upload to Security tab
uses: github/codeql-action/upload-sarif@v3
if: always()
with:
sarif_file: 'trivy-results.sarif'
# GitLab CI
trivy:
stage: test
image: aquasec/trivy:latest
script:
- trivy image --severity HIGH,CRITICAL --format json --output trivy-report.json $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
- trivy image --severity HIGH,CRITICAL --exit-code 1 $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
artifacts:
reports:
container_scanning: trivy-report.json
```
**Grype:**
```yaml
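- name: Scan with Grype
  id: scan  # Required so the upload step can read steps.scan.outputs.sarif
  uses: anchore/scan-action@v3
  with:
    image: myapp:latest
    fail-build: true
    severity-cutoff: high
    output-format: sarif
- name: Upload Grype results
  if: always()  # Upload findings even when the scan step fails the build
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: ${{ steps.scan.outputs.sarif }}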
```
**Clair:**
```yaml
clair:
stage: scan
image: arminc/clair-scanner:latest
script:
- clair-scanner --ip $(hostname -i) myapp:latest
```
### SBOM (Software Bill of Materials)
**Syft:**
```yaml
- name: Generate SBOM
uses: anchore/sbom-action@v0
with:
image: myapp:${{ github.sha }}
format: spdx-json
output-file: sbom.spdx.json
- name: Upload SBOM
uses: actions/upload-artifact@v4
with:
name: sbom
path: sbom.spdx.json
```
**CycloneDX:**
```yaml
- name: Generate CycloneDX SBOM
run: |
npm install -g @cyclonedx/cyclonedx-npm
cyclonedx-npm --output-file sbom.json
```
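Once an SBOM exists, a common use is answering "does this release ship component X?". The sketch below assumes a CycloneDX JSON document with a top-level `components` array whose entries carry `name` and `version`; the function name `find_components` is illustrative and not part of any tool.

```python
from typing import List


def find_components(sbom: dict, name: str) -> List[str]:
    """Return 'name@version' strings for every SBOM component with the given name."""
    matches = []
    for component in sbom.get("components") or []:
        if component.get("name") == name:
            matches.append(f"{component.get('name')}@{component.get('version', 'unknown')}")
    return matches


if __name__ == "__main__":
    # Tiny hand-written SBOM fragment, for illustration only
    sbom = {
        "bomFormat": "CycloneDX",
        "components": [
            {"type": "library", "name": "lodash", "version": "4.17.21"},
            {"type": "library", "name": "express", "version": "4.18.2"},
        ],
    }
    print(find_components(sbom, "lodash"))
```

In practice the dictionary would come from `json.load()` on the generated `sbom.json`.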
---
## Secret Scanning
### Pre-commit Prevention
**TruffleHog:**
```yaml
# GitHub Actions
- name: TruffleHog Scan
uses: trufflesecurity/trufflehog@main
with:
path: ./
base: ${{ github.event.repository.default_branch }}
head: HEAD
# GitLab CI
trufflehog:
stage: test
image: trufflesecurity/trufflehog:latest
script:
- trufflehog filesystem . --json --fail > trufflehog-report.json
```
**Gitleaks:**
```yaml
# GitHub Actions
- name: Gitleaks
uses: gitleaks/gitleaks-action@v2
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# GitLab CI
gitleaks:
stage: test
image: zricethezav/gitleaks:latest
script:
- gitleaks detect --source . --report-format json --report-path gitleaks-report.json
```
**GitGuardian:**
```yaml
- name: GitGuardian scan
uses: GitGuardian/ggshield-action@master
env:
GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
```
### GitHub Secret Scanning (Native)
Enable in: **Settings → Code security and analysis → Secret scanning**
- Automatic detection
- Partner patterns (AWS, Azure, GCP, etc.)
- Push protection (prevents commits with secrets)
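To make pattern-based detection concrete, here is a minimal sketch of the idea behind these scanners. It is not how GitHub, TruffleHog, or Gitleaks are implemented; the two rules and the `scan_text` helper are illustrative assumptions, and real tools add many more patterns plus entropy checks and credential verification.

```python
import re
from typing import List, Tuple

# Two illustrative rules; production scanners ship hundreds
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                hits.append((line_no, rule_name))
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation
    sample = "region = us-east-1\naws_access_key_id = AKIAIOSFODNN7EXAMPLE\n"
    for line_no, rule in scan_text(sample):
        print(f"line {line_no}: {rule}")
```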
---
## Security Gates & Quality Gates
### Fail Pipeline on Security Issues
**Threshold-based gates:**
```yaml
security-gate:
  stage: gate
  script:
    # Count findings by severity (Trivy JSON nests them under .Results[].Vulnerabilities[])
    - |
      CRITICAL=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="CRITICAL")] | length' trivy-report.json)
      HIGH=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="HIGH")] | length' trivy-report.json)
      echo "Critical: $CRITICAL, High: $HIGH"
      if [ "$CRITICAL" -gt 0 ]; then
        echo "❌ CRITICAL vulnerabilities found!"
        exit 1
      fi
      if [ "$HIGH" -gt 5 ]; then
        echo "❌ Too many HIGH vulnerabilities: $HIGH"
        exit 1
      fi
```
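The same threshold logic can be kept in a small script when the rules outgrow a shell one-liner. This is a sketch under the assumption that the report is Trivy's JSON output, where findings sit under `Results[].Vulnerabilities[]` with a `Severity` field; the function names and default limits are illustrative and mirror the thresholds above.

```python
from typing import Dict, List


def count_by_severity(report: dict) -> Dict[str, int]:
    """Count vulnerabilities per severity in a Trivy-style JSON report."""
    counts: Dict[str, int] = {}
    for result in report.get("Results") or []:
        # Targets with no findings may omit the key or set it to null
        for vuln in result.get("Vulnerabilities") or []:
            severity = vuln.get("Severity", "UNKNOWN")
            counts[severity] = counts.get(severity, 0) + 1
    return counts


def evaluate_gate(counts: Dict[str, int], max_critical: int = 0, max_high: int = 5) -> List[str]:
    """Return a list of violations; an empty list means the gate passes."""
    violations = []
    critical = counts.get("CRITICAL", 0)
    high = counts.get("HIGH", 0)
    if critical > max_critical:
        violations.append(f"{critical} CRITICAL vulnerabilities (limit {max_critical})")
    if high > max_high:
        violations.append(f"{high} HIGH vulnerabilities (limit {max_high})")
    return violations


if __name__ == "__main__":
    # Hand-written sample; a pipeline would json.load() trivy-report.json instead
    sample = {"Results": [{"Vulnerabilities": [{"Severity": "HIGH"}, {"Severity": "LOW"}]}]}
    problems = evaluate_gate(count_by_severity(sample))
    print("gate passed" if not problems else "\n".join(problems))
```

A CI job would exit non-zero when the returned list is non-empty.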
**SonarQube Quality Gate:**
```yaml
- name: Check Quality Gate
run: |
STATUS=$(curl -u $SONAR_TOKEN: "$SONAR_HOST/api/qualitygates/project_status?projectKey=$PROJECT_KEY" | jq -r '.projectStatus.status')
if [ "$STATUS" != "OK" ]; then
echo "Quality gate failed: $STATUS"
exit 1
fi
```
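When the gate fails, it helps to report which conditions failed rather than only the overall status. The sketch below assumes the response body of `api/qualitygates/project_status` has a `projectStatus` object with `status` and a `conditions` list whose entries carry `status`, `metricKey`, and `actualValue`; verify those field names against your SonarQube version. The helper name is hypothetical.

```python
from typing import List


def quality_gate_failures(response: dict) -> List[str]:
    """Return human-readable reasons the quality gate failed; empty list if it passed."""
    project_status = response.get("projectStatus") or {}
    status = project_status.get("status")
    if status == "OK":
        return []
    failed = [
        f"{cond.get('metricKey')}: actual value {cond.get('actualValue')}"
        for cond in project_status.get("conditions") or []
        if cond.get("status") == "ERROR"
    ]
    # Fall back to the overall status when no failing condition is listed
    return failed or [f"quality gate status: {status}"]


if __name__ == "__main__":
    # Hand-written sample response, for illustration only
    sample = {
        "projectStatus": {
            "status": "ERROR",
            "conditions": [
                {"status": "ERROR", "metricKey": "new_coverage", "actualValue": "62.5"},
                {"status": "OK", "metricKey": "new_bugs", "actualValue": "0"},
            ],
        }
    }
    for reason in quality_gate_failures(sample):
        print(reason)
```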
### Manual Approval for Production
**GitHub Actions:**
```yaml
deploy-production:
runs-on: ubuntu-latest
needs: [sast, dast, container-scan]
environment:
name: production
# Requires manual approval in Settings → Environments
steps:
- run: echo "Deploying to production"
```
**GitLab CI:**
```yaml
deploy:production:
stage: deploy
needs: [sast, dast, container_scanning]
script:
- ./deploy.sh production
when: manual
only:
- main
```
---
## Compliance & License Scanning
### License Compliance
**FOSSology:**
```yaml
license-scan:
stage: compliance
image: fossology/fossology:latest
script:
- fossology --scan ./src
```
**License Finder:**
```yaml
- name: Check Licenses
run: |
gem install license_finder
license_finder --decisions-file .license_finder.yml
```
**npm license checker:**
```yaml
- name: License Check
run: |
npx license-checker --production --onlyAllow "MIT;Apache-2.0;BSD-3-Clause;ISC"
```
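The allowlist idea behind `--onlyAllow` can be sketched in a few lines. This assumes a mapping of package name to an object with a `licenses` field that is either a string or a list of strings (modeled on the JSON that license checkers commonly emit); the function name and input shape are assumptions, and SPDX expressions such as `(MIT OR Apache-2.0)` are treated as a single unrecognized string here.

```python
from typing import Dict, Iterable, List

ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause", "ISC"}


def disallowed_packages(packages: dict, allowed: Iterable[str] = ALLOWED_LICENSES) -> Dict[str, List[str]]:
    """Map each offending package to the licenses that are not on the allowlist."""
    allowed_set = set(allowed)
    violations: Dict[str, List[str]] = {}
    for name, info in packages.items():
        licenses = info.get("licenses", "UNKNOWN")
        if isinstance(licenses, str):
            licenses = [licenses]
        bad = [lic for lic in licenses if lic not in allowed_set]
        if bad:
            violations[name] = bad
    return violations


if __name__ == "__main__":
    # Hand-written sample, for illustration only
    sample = {
        "left-pad@1.3.0": {"licenses": "WTFPL"},
        "express@4.18.2": {"licenses": "MIT"},
    }
    print(disallowed_packages(sample))
```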
### Policy as Code
**Open Policy Agent (OPA):**
```yaml
policy-check:
stage: gate
image: openpolicyagent/opa:latest
script:
- opa test policies/
- opa eval --data policies/ --input violations.json "data.security.allow"
```
---
## Complete DevSecOps Pipeline
**Comprehensive example (GitHub Actions):**
```yaml
name: DevSecOps Pipeline
on: [push, pull_request]
jobs:
# Stage 1: Secret Scanning
secret-scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: trufflesecurity/trufflehog@main
# Stage 2: SAST
sast:
runs-on: ubuntu-latest
needs: secret-scan
steps:
- uses: actions/checkout@v4
- uses: github/codeql-action/init@v3
- uses: github/codeql-action/autobuild@v3
- uses: github/codeql-action/analyze@v3
# Stage 3: SCA
sca:
runs-on: ubuntu-latest
needs: secret-scan
steps:
- uses: actions/checkout@v4
- run: npm audit --audit-level=high
- uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
# Stage 4: Build & Container Scan
build-scan:
runs-on: ubuntu-latest
needs: [sast, sca]
steps:
- uses: actions/checkout@v4
- run: docker build -t myapp:${{ github.sha }} .
- uses: aquasecurity/trivy-action@master
with:
image-ref: myapp:${{ github.sha }}
exit-code: '1'
# Stage 5: DAST
dast:
runs-on: ubuntu-latest
needs: build-scan
if: github.ref == 'refs/heads/main'
steps:
- uses: zaproxy/action-baseline@vX.Y.Z # pin to a released tag
with:
target: 'https://staging.example.com'
# Stage 6: Security Gate
security-gate:
runs-on: ubuntu-latest
needs: [sast, sca, build-scan, dast]
steps:
- run: echo "All security checks passed!"
- run: echo "Ready for deployment"
# Stage 7: Deploy
deploy:
runs-on: ubuntu-latest
needs: security-gate
environment: production
steps:
- run: echo "Deploying to production"
```
---
## Best Practices
### 1. Fail Fast
- Run secret scanning first
- Run SAST early in pipeline
- Block PRs with critical vulnerabilities
### 2. Balance Speed vs Security
- SAST/SCA on every PR (fast)
- Container scanning after build
- DAST on schedules or staging only (slow)
### 3. Prioritize Findings
**Focus on:**
- Critical/High severity
- Exploitable vulnerabilities
- Direct dependencies (not transitive)
- Public-facing components
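The ordering above can be expressed as a sort key. This is a sketch only: the `Finding` fields are hypothetical and would need to be mapped from whatever your scanners actually report.

```python
from dataclasses import dataclass
from typing import List, Tuple

SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


@dataclass
class Finding:
    id: str
    severity: str
    exploitable: bool
    direct_dependency: bool
    public_facing: bool


def priority_key(finding: Finding) -> Tuple[int, bool, bool, bool]:
    """Lower tuples sort first: severity, then exploitable, direct, public-facing."""
    return (
        SEVERITY_RANK.get(finding.severity, len(SEVERITY_RANK)),
        not finding.exploitable,
        not finding.direct_dependency,
        not finding.public_facing,
    )


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority_key)


if __name__ == "__main__":
    backlog = [
        Finding("VULN-1", "HIGH", False, True, True),
        Finding("VULN-2", "CRITICAL", False, False, False),
        Finding("VULN-3", "HIGH", True, False, False),
    ]
    print([f.id for f in prioritize(backlog)])
```

Severity dominates in this sketch; teams that weight exploitability more heavily can reorder the tuple.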
### 4. Developer Experience
- Clear error messages
- Link to remediation guidance
- Don't overwhelm with noise
- Use quality gates, not just fail/pass
### 5. Continuous Improvement
- Track security debt over time
- Set SLAs for vulnerability remediation
- Regular tool evaluation
- Security training for developers
### 6. Reporting & Metrics
**Track:**
- Mean Time to Remediate (MTTR)
- Vulnerability backlog
- False positive rate
- Coverage (% of code scanned)
```yaml
- name: Generate Security Report
run: |
echo "## Security Scan Summary" >> $GITHUB_STEP_SUMMARY
echo "- SAST: ✅ Passed" >> $GITHUB_STEP_SUMMARY
echo "- SCA: ⚠️ 3 vulnerabilities" >> $GITHUB_STEP_SUMMARY
echo "- Container: ✅ Passed" >> $GITHUB_STEP_SUMMARY
echo "- DAST: 🔄 Scheduled" >> $GITHUB_STEP_SUMMARY
```
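As an example of computing one of these metrics, Mean Time to Remediate is the average of resolved time minus opened time over resolved findings. The sketch below assumes each finding is available as an `(opened, resolved)` pair of datetimes, with `resolved` set to `None` while the finding is still open; where those timestamps come from is left to your tracker.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def mean_time_to_remediate(
    findings: Iterable[Tuple[datetime, Optional[datetime]]]
) -> Optional[timedelta]:
    """Average open-to-resolved duration; None when nothing has been resolved yet."""
    durations = [resolved - opened for opened, resolved in findings if resolved is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


if __name__ == "__main__":
    history = [
        (datetime(2024, 1, 1), datetime(2024, 1, 3)),  # fixed in 2 days
        (datetime(2024, 1, 1), datetime(2024, 1, 5)),  # fixed in 4 days
        (datetime(2024, 1, 2), None),                  # still open, excluded
    ]
    print(mean_time_to_remediate(history))
```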
---
## Tool Comparison
| Category | Tool | Speed | Accuracy | Cost | Best For |
|----------|------|-------|----------|------|----------|
| **SAST** | CodeQL | Medium | High | Free (GH) | Deep analysis |
| | Semgrep | Fast | Medium | Free/Paid | Custom rules |
| | SonarQube | Medium | High | Free/Paid | Quality + Security |
| **DAST** | OWASP ZAP | Medium | High | Free | Web apps |
| | Burp Suite | Slow | High | Paid | Professional |
| **SCA** | Snyk | Fast | High | Free/Paid | Easy integration |
| | Dependabot | Fast | Medium | Free (GH) | Auto PRs |
| **Container** | Trivy | Fast | High | Free | Fast scans |
| | Grype | Fast | High | Free | SBOM support |
| **Secrets** | TruffleHog | Fast | High | Free/Paid | Git history |
| | GitGuardian | Fast | High | Paid | Real-time |
---
## Security Scanning Schedule
**Recommended frequency:**
| Scan Type | PR | Main Branch | Schedule | Notes |
|-----------|----|-----------|-----------| ------|
| Secret Scanning | ✅ Every | ✅ Every | - | Fast, critical |
| SAST | ✅ Every | ✅ Every | - | Fast, essential |
| SCA | ✅ Every | ✅ Every | Weekly | Check dependencies |
| Linting | ✅ Every | ✅ Every | - | Very fast |
| Container Scan | ❌ No | ✅ Every | - | After build |
| DAST Baseline | ❌ No | ✅ Every | - | Medium speed |
| DAST Full | ❌ No | ❌ No | Weekly | Very slow |
| Penetration Test | ❌ No | ❌ No | Quarterly | Manual |
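The table can also be kept as data so a pipeline generator or review script can look up which scans apply to a trigger. This sketch encodes the rows above; the event names `pr`, `main`, and `schedule` are illustrative labels, and penetration testing is omitted because it is manual.

```python
from typing import List

# Scan type -> triggers on which it should run (taken from the table above)
SCAN_SCHEDULE = {
    "Secret Scanning": {"pr", "main"},
    "SAST": {"pr", "main"},
    "SCA": {"pr", "main", "schedule"},
    "Linting": {"pr", "main"},
    "Container Scan": {"main"},
    "DAST Baseline": {"main"},
    "DAST Full": {"schedule"},
}


def scans_for(event: str) -> List[str]:
    """Return the scans that should run for a trigger, in table order."""
    return [name for name, events in SCAN_SCHEDULE.items() if event in events]


if __name__ == "__main__":
    for event in ("pr", "main", "schedule"):
        print(event, "->", ", ".join(scans_for(event)))
```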
---
## Security Checklist
- [ ] Secret scanning enabled and running
- [ ] SAST configured for all languages used
- [ ] Dependency scanning (SCA) enabled
- [ ] Container images scanned before deployment
- [ ] DAST running on staging environment
- [ ] Security findings triaged in issue tracker
- [ ] Quality gates prevent vulnerable deployments
- [ ] SBOM generated for releases
- [ ] Security scan results tracked over time
- [ ] Vulnerability remediation SLAs defined
- [ ] Security training for developers
- [ ] Incident response plan documented
```
### scripts/pipeline_analyzer.py
```python
#!/usr/bin/env python3
"""
CI/CD Pipeline Performance Analyzer
Analyzes CI/CD pipeline configuration and execution to identify performance
bottlenecks, caching opportunities, and optimization recommendations.
Usage:
# Analyze GitHub Actions workflow
python3 pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml
# Analyze GitLab CI pipeline
python3 pipeline_analyzer.py --platform gitlab --config .gitlab-ci.yml
# Analyze recent workflow runs
python3 pipeline_analyzer.py --platform github --repo owner/repo --analyze-runs 10
"""
import argparse
import json
import os
import re
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Optional, Tuple
import yaml
class PipelineAnalyzer:
def __init__(self, platform: str, **kwargs):
self.platform = platform.lower()
self.config = kwargs
self.findings = []
self.optimizations = []
self.metrics = {}
def analyze_github_workflow(self, workflow_file: str) -> Dict:
"""Analyze GitHub Actions workflow file"""
print(f"🔍 Analyzing GitHub Actions workflow: {workflow_file}")
if not os.path.exists(workflow_file):
return self._error(f"Workflow file not found: {workflow_file}")
try:
with open(workflow_file, 'r') as f:
workflow = yaml.safe_load(f)
# Analyze workflow structure
self._check_workflow_triggers(workflow)
self._check_caching_strategy(workflow, 'github')
self._check_job_parallelization(workflow, 'github')
self._check_dependency_management(workflow, 'github')
self._check_matrix_strategy(workflow)
self._check_artifact_usage(workflow)
self._analyze_action_versions(workflow)
return self._generate_report()
except yaml.YAMLError as e:
return self._error(f"Invalid YAML: {e}")
except Exception as e:
return self._error(f"Analysis failed: {e}")
def analyze_gitlab_pipeline(self, config_file: str) -> Dict:
"""Analyze GitLab CI pipeline configuration"""
print(f"🔍 Analyzing GitLab CI pipeline: {config_file}")
if not os.path.exists(config_file):
return self._error(f"Config file not found: {config_file}")
try:
with open(config_file, 'r') as f:
config = yaml.safe_load(f)
# Analyze pipeline structure
self._check_caching_strategy(config, 'gitlab')
self._check_job_parallelization(config, 'gitlab')
self._check_dependency_management(config, 'gitlab')
self._check_gitlab_specific_features(config)
return self._generate_report()
except yaml.YAMLError as e:
return self._error(f"Invalid YAML: {e}")
except Exception as e:
return self._error(f"Analysis failed: {e}")
    def _check_workflow_triggers(self, workflow: Dict):
        """Check workflow trigger configuration"""
        # PyYAML follows YAML 1.1 and loads a bare `on:` key as boolean True
        triggers = workflow.get('on', workflow.get(True, {}))
        if isinstance(triggers, list):
            trigger_types = triggers
        elif isinstance(triggers, dict):
            trigger_types = list(triggers.keys())
        else:
            trigger_types = [triggers] if triggers else []
        # Check for overly broad triggers
        if 'push' in trigger_types:
            # `push:` with no settings loads as None, so fall back to {}
            push_config = (triggers.get('push') or {}) if isinstance(triggers, dict) else {}
            if not push_config.get('branches'):
                self.findings.append("Workflow triggers on all push events (no branch filter)")
                self.optimizations.append(
                    "Add branch filters to 'push' trigger to reduce unnecessary runs:\n"
                    "  on:\n"
                    "    push:\n"
                    "      branches: [main, develop]"
                )
        # Check for path filters
        if 'pull_request' in trigger_types:
            pr_config = (triggers.get('pull_request') or {}) if isinstance(triggers, dict) else {}
            if not pr_config.get('paths') and not pr_config.get('paths-ignore'):
                self.optimizations.append(
                    "Consider adding path filters to skip unnecessary PR runs:\n"
                    "  pull_request:\n"
                    "    paths-ignore:\n"
                    "      - 'docs/**'\n"
                    "      - '**.md'"
                )
def _check_caching_strategy(self, config: Dict, platform: str):
"""Check for dependency caching"""
has_cache = False
if platform == 'github':
jobs = config.get('jobs', {})
for job_name, job in jobs.items():
steps = job.get('steps', [])
for step in steps:
if isinstance(step, dict) and step.get('uses', '').startswith('actions/cache'):
has_cache = True
break
if not has_cache:
self.findings.append("No dependency caching detected")
self.optimizations.append(
"Add dependency caching to speed up builds:\n"
" - uses: actions/cache@v4\n"
" with:\n"
" path: |\n"
" ~/.cargo\n"
" ~/.npm\n"
" ~/.cache/pip\n"
" key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json') }}"
)
elif platform == 'gitlab':
cache_config = config.get('cache', {})
job_has_cache = False
# Check global cache
if cache_config:
has_cache = True
# Check job-level cache
for key, value in config.items():
if isinstance(value, dict) and 'script' in value:
if value.get('cache'):
job_has_cache = True
if not has_cache and not job_has_cache:
self.findings.append("No caching configuration detected")
self.optimizations.append(
"Add caching to speed up builds:\n"
"cache:\n"
" key: ${CI_COMMIT_REF_SLUG}\n"
" paths:\n"
" - node_modules/\n"
" - .npm/\n"
" - vendor/"
)
def _check_job_parallelization(self, config: Dict, platform: str):
"""Check for job parallelization opportunities"""
if platform == 'github':
jobs = config.get('jobs', {})
# Count jobs with dependencies
jobs_with_needs = sum(1 for job in jobs.values()
if isinstance(job, dict) and 'needs' in job)
if len(jobs) > 1 and jobs_with_needs == 0:
self.optimizations.append(
f"Found {len(jobs)} jobs with no dependencies - they will run in parallel (good!)"
)
elif len(jobs) > 3 and jobs_with_needs == len(jobs):
self.findings.append("All jobs have 'needs' dependencies - may be unnecessarily sequential")
self.optimizations.append(
"Review job dependencies - remove 'needs' where jobs can run in parallel"
)
elif platform == 'gitlab':
stages = config.get('stages', [])
if len(stages) > 5:
self.findings.append(f"Pipeline has {len(stages)} stages - may be overly sequential")
self.optimizations.append(
"Consider reducing stages to allow more parallel execution"
)
    def _check_dependency_management(self, config: Dict, platform: str):
        """Check dependency installation patterns"""
        if platform == 'github':
            jobs = config.get('jobs', {})
            for job_name, job in jobs.items():
                if not isinstance(job, dict):
                    continue
                steps = job.get('steps', [])
                for step in steps:
                    if isinstance(step, dict):
                        run_cmd = step.get('run', '')
                        # Check for npm ci vs npm install
                        if 'npm install' in run_cmd and 'npm ci' not in run_cmd:
                            self.findings.append(f"Job '{job_name}' uses 'npm install' instead of 'npm ci'")
                            self.optimizations.append(
                                "Use 'npm ci' instead of 'npm install' for faster, reproducible installs"
                            )
                        # Check for pip install without cache
                        if 'pip install' in run_cmd:
                            # `with:` may be absent or empty (None), so fall back to {}
                            has_pip_cache = any(
                                s.get('uses', '').startswith('actions/cache') and
                                'pip' in str((s.get('with') or {}).get('path', ''))
                                for s in steps if isinstance(s, dict)
                            )
                            if not has_pip_cache:
                                self.optimizations.append(
                                    f"Add pip cache for job '{job_name}' to speed up Python dependency installation"
                                )
def _check_matrix_strategy(self, workflow: Dict):
"""Check for matrix strategy usage"""
jobs = workflow.get('jobs', {})
for job_name, job in jobs.items():
if isinstance(job, dict):
strategy = job.get('strategy', {})
matrix = strategy.get('matrix', {})
if matrix:
# Check fail-fast
fail_fast = strategy.get('fail-fast', True)
if fail_fast:
self.optimizations.append(
f"Job '{job_name}' has fail-fast=true (default). "
f"Consider fail-fast=false to see all matrix results"
)
# Check for large matrices
matrix_size = 1
for key, values in matrix.items():
if isinstance(values, list):
matrix_size *= len(values)
if matrix_size > 20:
self.findings.append(
f"Job '{job_name}' has large matrix ({matrix_size} combinations)"
)
self.optimizations.append(
f"Consider reducing matrix size or using 'exclude' to skip unnecessary combinations"
)
def _check_artifact_usage(self, workflow: Dict):
"""Check artifact upload/download patterns"""
jobs = workflow.get('jobs', {})
uploads = {}
downloads = {}
for job_name, job in jobs.items():
if not isinstance(job, dict):
continue
steps = job.get('steps', [])
for step in steps:
if isinstance(step, dict):
uses = step.get('uses', '')
if 'actions/upload-artifact' in uses:
artifact_name = step.get('with', {}).get('name', 'unknown')
uploads[artifact_name] = job_name
if 'actions/download-artifact' in uses:
artifact_name = step.get('with', {}).get('name', 'unknown')
downloads.setdefault(artifact_name, []).append(job_name)
# Check for unused artifacts
for artifact, uploader in uploads.items():
if artifact not in downloads:
self.findings.append(f"Artifact '{artifact}' uploaded but never downloaded")
self.optimizations.append(f"Remove unused artifact upload or add download step")
    def _analyze_action_versions(self, workflow: Dict):
        """Check for outdated action versions"""
        jobs = workflow.get('jobs', {})
        outdated_actions = []
        for job_name, job in jobs.items():
            if not isinstance(job, dict):
                continue
            steps = job.get('steps', [])
            for step in steps:
                if isinstance(step, dict):
                    uses = step.get('uses', '')
                    # Flag @v1/@v2 refs (likely outdated); the anchor avoids matching @v10, @v21, ...
                    if re.search(r'@v[12](\.|$)', uses):
                        outdated_actions.append(uses)
        if outdated_actions:
            self.findings.append(f"Found {len(outdated_actions)} potentially outdated actions")
            self.optimizations.append(
                "Update to latest action versions:\n" +
                "\n".join(f"  - {action}" for action in sorted(set(outdated_actions)))
            )
def _check_gitlab_specific_features(self, config: Dict):
"""Check GitLab-specific optimization opportunities"""
# Check for interruptible jobs
has_interruptible = any(
isinstance(v, dict) and v.get('interruptible')
for v in config.values()
)
if not has_interruptible:
self.optimizations.append(
"Consider marking jobs as 'interruptible: true' to cancel redundant pipeline runs:\n"
"job_name:\n"
" interruptible: true"
)
# Check for DAG usage (needs keyword)
has_needs = any(
isinstance(v, dict) and 'needs' in v
for v in config.values()
)
if not has_needs and config.get('stages') and len(config.get('stages', [])) > 2:
self.optimizations.append(
"Consider using 'needs' keyword for DAG pipelines to improve parallelization:\n"
"test:\n"
" needs: [build]"
)
def _error(self, message: str) -> Dict:
"""Return error report"""
return {
'status': 'error',
'error': message,
'findings': [],
'optimizations': []
}
def _generate_report(self) -> Dict:
"""Generate analysis report"""
return {
'status': 'success',
'platform': self.platform,
'findings': self.findings,
'optimizations': self.optimizations,
'metrics': self.metrics
}
def print_report(report: Dict):
"""Print formatted analysis report"""
if report['status'] == 'error':
print(f"\n❌ Error: {report['error']}\n")
return
print("\n" + "="*60)
print(f"📊 Pipeline Analysis Report - {report['platform'].upper()}")
print("="*60)
if report['findings']:
print(f"\n🔍 Findings ({len(report['findings'])}):")
for i, finding in enumerate(report['findings'], 1):
print(f"\n {i}. {finding}")
if report['optimizations']:
print(f"\n💡 Optimization Recommendations ({len(report['optimizations'])}):")
for i, opt in enumerate(report['optimizations'], 1):
print(f"\n {i}. {opt}")
if not report['findings'] and not report['optimizations']:
print("\n✅ No issues found - pipeline looks well optimized!")
print("\n" + "="*60 + "\n")
def main():
parser = argparse.ArgumentParser(
description='CI/CD Pipeline Performance Analyzer',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument('--platform', required=True, choices=['github', 'gitlab'],
help='CI/CD platform')
parser.add_argument('--workflow', help='Path to GitHub Actions workflow file')
parser.add_argument('--config', help='Path to GitLab CI config file')
parser.add_argument('--repo', help='Repository (owner/repo) for run analysis')
parser.add_argument('--analyze-runs', type=int, help='Number of recent runs to analyze')
args = parser.parse_args()
# Create analyzer
analyzer = PipelineAnalyzer(
platform=args.platform,
repo=args.repo
)
# Run analysis
if args.platform == 'github':
if args.workflow:
report = analyzer.analyze_github_workflow(args.workflow)
else:
# Try to find workflow files
workflow_dir = Path('.github/workflows')
if workflow_dir.exists():
workflows = list(workflow_dir.glob('*.yml')) + list(workflow_dir.glob('*.yaml'))
if workflows:
print(f"Found {len(workflows)} workflow(s), analyzing first one...")
report = analyzer.analyze_github_workflow(str(workflows[0]))
else:
print("❌ No workflow files found in .github/workflows/")
sys.exit(1)
else:
print("❌ No .github/workflows/ directory found")
sys.exit(1)
elif args.platform == 'gitlab':
config_file = args.config or '.gitlab-ci.yml'
report = analyzer.analyze_gitlab_pipeline(config_file)
# Print report
print_report(report)
# Exit with appropriate code
sys.exit(0 if report['status'] == 'success' else 1)
if __name__ == '__main__':
main()
```
### scripts/ci_health.py
```python
#!/usr/bin/env python3
"""
CI/CD Pipeline Health Checker

Checks pipeline status and recent failures for GitHub Actions and GitLab CI.
Identifies failing workflows and high failure or cancellation rates, and
provides actionable recommendations.

Usage:
    # GitHub Actions
    python3 ci_health.py --platform github --repo owner/repo

    # GitLab CI
    python3 ci_health.py --platform gitlab --project-id 12345 --token <token>

    # Check specific workflow/pipeline
    python3 ci_health.py --platform github --repo owner/repo --workflow ci.yml
"""

import argparse
import json
import os
import subprocess
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Dict


class CIHealthChecker:
    def __init__(self, platform: str, **kwargs):
        self.platform = platform.lower()
        self.config = kwargs
        self.issues = []
        self.warnings = []
        self.insights = []
        self.metrics = {}

    def check_github_workflows(self) -> Dict:
        """Check GitHub Actions workflow health"""
        print("🔍 Checking GitHub Actions workflows...")

        if not self._check_command("gh"):
            self.issues.append("GitHub CLI (gh) is not installed")
            self.insights.append("Install gh CLI: https://cli.github.com/")
            return self._generate_report()

        repo = self.config.get('repo')
        if not repo:
            self.issues.append("Repository not specified")
            self.insights.append("Use --repo owner/repo")
            return self._generate_report()

        try:
            # Get recent workflow runs
            limit = self.config.get('limit', 20)
            cmd = ['gh', 'run', 'list', '--repo', repo, '--limit', str(limit), '--json',
                   'status,conclusion,name,workflowName,createdAt,displayTitle']

            # Restrict to a single workflow when --workflow was given
            workflow_filter = self.config.get('workflow')
            if workflow_filter:
                cmd.extend(['--workflow', workflow_filter])

            result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)

            if result.returncode != 0:
                self.issues.append(f"Failed to fetch workflows: {result.stderr}")
                self.insights.append("Verify gh CLI authentication: gh auth status")
                return self._generate_report()

            runs = json.loads(result.stdout)

            if not runs:
                self.warnings.append("No recent workflow runs found")
                return self._generate_report()

            # Analyze runs (total_runs is non-zero past this point)
            total_runs = len(runs)
            failed_runs = [r for r in runs if r.get('conclusion') == 'failure']
            cancelled_runs = [r for r in runs if r.get('conclusion') == 'cancelled']
            success_runs = [r for r in runs if r.get('conclusion') == 'success']

            self.metrics['total_runs'] = total_runs
            self.metrics['failed_runs'] = len(failed_runs)
            self.metrics['cancelled_runs'] = len(cancelled_runs)
            self.metrics['success_runs'] = len(success_runs)
            self.metrics['failure_rate'] = (len(failed_runs) / total_runs * 100) if total_runs > 0 else 0

            # Group failures by workflow
            failure_by_workflow = {}
            for run in failed_runs:
                workflow = run.get('workflowName', 'unknown')
                failure_by_workflow[workflow] = failure_by_workflow.get(workflow, 0) + 1

            print(f"✅ Analyzed {total_runs} recent runs:")
            print(f" - Success: {len(success_runs)} ({len(success_runs)/total_runs*100:.1f}%)")
            print(f" - Failed: {len(failed_runs)} ({len(failed_runs)/total_runs*100:.1f}%)")
            print(f" - Cancelled: {len(cancelled_runs)} ({len(cancelled_runs)/total_runs*100:.1f}%)")

            # Identify issues
            if self.metrics['failure_rate'] > 20:
                self.issues.append(f"High failure rate: {self.metrics['failure_rate']:.1f}%")
                self.insights.append("Investigate failing workflows and address root causes")

            if failure_by_workflow:
                self.warnings.append("Workflows with recent failures:")
                for workflow, count in sorted(failure_by_workflow.items(), key=lambda x: x[1], reverse=True):
                    self.warnings.append(f" - {workflow}: {count} failure(s)")
                    self.insights.append(f"Review logs for '{workflow}': gh run view --repo {repo}")

            if len(cancelled_runs) > total_runs * 0.3:
                self.warnings.append(f"High cancellation rate: {len(cancelled_runs)/total_runs*100:.1f}%")
                self.insights.append("Excessive cancellations may indicate workflow timeout issues or manual interventions")

        except subprocess.TimeoutExpired:
            self.issues.append("Request timed out - check network connectivity")
        except json.JSONDecodeError as e:
            self.issues.append(f"Failed to parse workflow data: {e}")
        except Exception as e:
            self.issues.append(f"Unexpected error: {e}")

        return self._generate_report()

    def check_gitlab_pipelines(self) -> Dict:
        """Check GitLab CI pipeline health"""
        print("🔍 Checking GitLab CI pipelines...")

        url = (self.config.get('url') or 'https://gitlab.com').rstrip('/')
        # Fall back to the GITLAB_TOKEN environment variable, as the CLI help promises
        token = self.config.get('token') or os.environ.get('GITLAB_TOKEN')
        project_id = self.config.get('project_id')

        if not token:
            self.issues.append("GitLab token not provided")
            self.insights.append("Provide token with --token or GITLAB_TOKEN env var")
            return self._generate_report()

        if not project_id:
            self.issues.append("Project ID not specified")
            self.insights.append("Use --project-id <id>")
            return self._generate_report()

        try:
            # Get recent pipelines; the API accepts a numeric ID or a URL-encoded path
            per_page = self.config.get('limit', 20)
            project = urllib.parse.quote(str(project_id), safe='')
            api_url = f"{url}/api/v4/projects/{project}/pipelines?per_page={per_page}"
            req = urllib.request.Request(api_url, headers={'PRIVATE-TOKEN': token})

            with urllib.request.urlopen(req, timeout=30) as response:
                pipelines = json.loads(response.read())

            if not pipelines:
                self.warnings.append("No recent pipelines found")
                return self._generate_report()

            # Analyze pipelines
            total_pipelines = len(pipelines)
            failed = [p for p in pipelines if p.get('status') == 'failed']
            success = [p for p in pipelines if p.get('status') == 'success']
            running = [p for p in pipelines if p.get('status') == 'running']
            cancelled = [p for p in pipelines if p.get('status') == 'canceled']  # GitLab spells it with one "l"

            self.metrics['total_pipelines'] = total_pipelines
            self.metrics['failed'] = len(failed)
            self.metrics['success'] = len(success)
            self.metrics['running'] = len(running)
            self.metrics['failure_rate'] = (len(failed) / total_pipelines * 100) if total_pipelines > 0 else 0

            print(f"✅ Analyzed {total_pipelines} recent pipelines:")
            print(f" - Success: {len(success)} ({len(success)/total_pipelines*100:.1f}%)")
            print(f" - Failed: {len(failed)} ({len(failed)/total_pipelines*100:.1f}%)")
            print(f" - Running: {len(running)}")
            print(f" - Cancelled: {len(cancelled)}")

            # Identify issues
            if self.metrics['failure_rate'] > 20:
                self.issues.append(f"High failure rate: {self.metrics['failure_rate']:.1f}%")
                self.insights.append("Review failing pipelines and fix recurring issues")

            # Get details of recent failures
            if failed:
                self.warnings.append("Recent pipeline failures:")
                for pipeline in failed[:5]:  # Show up to 5 recent failures
                    ref = pipeline.get('ref', 'unknown')
                    pipeline_id = pipeline.get('id')
                    line = f" - Pipeline #{pipeline_id} on {ref}"
                    # Use the link returned by the API; a URL built from a numeric ID would not resolve
                    web_url = pipeline.get('web_url')
                    if web_url:
                        line += f" ({web_url})"
                    self.warnings.append(line)
                self.insights.append("Open the failed pipelines listed in the warnings to inspect their jobs")

        except urllib.error.HTTPError as e:
            self.issues.append(f"API error: {e.code} - {e.reason}")
            if e.code == 401:
                self.insights.append("Check GitLab token permissions")
        except urllib.error.URLError as e:
            self.issues.append(f"Network error: {e.reason}")
            self.insights.append("Check GitLab URL and network connectivity")
        except Exception as e:
            self.issues.append(f"Unexpected error: {e}")

        return self._generate_report()

    def _check_command(self, command: str) -> bool:
        """Check if command is available"""
        try:
            subprocess.run([command, '--version'], capture_output=True, timeout=5)
            return True
        except (FileNotFoundError, subprocess.TimeoutExpired):
            return False

    def _generate_report(self) -> Dict:
        """Generate health check report"""
        # Determine overall health status
        if self.issues:
            status = 'unhealthy'
        elif self.warnings:
            status = 'degraded'
        else:
            status = 'healthy'

        return {
            'platform': self.platform,
            'status': status,
            'issues': self.issues,
            'warnings': self.warnings,
            'insights': self.insights,
            'metrics': self.metrics
        }


def print_report(report: Dict):
    """Print formatted health check report"""
    print("\n" + "="*60)
    print(f"🏥 CI/CD Health Report - {report['platform'].upper()}")
    print("="*60)

    status_emoji = {"healthy": "✅", "degraded": "⚠️", "unhealthy": "❌"}.get(report['status'], "❓")
    print(f"\nStatus: {status_emoji} {report['status'].upper()}")

    if report['metrics']:
        print("\n📊 Metrics:")
        for key, value in report['metrics'].items():
            formatted_key = key.replace('_', ' ').title()
            if 'rate' in key:
                print(f" - {formatted_key}: {value:.1f}%")
            else:
                print(f" - {formatted_key}: {value}")

    if report['issues']:
        print(f"\n🚨 Issues ({len(report['issues'])}):")
        for i, issue in enumerate(report['issues'], 1):
            print(f" {i}. {issue}")

    if report['warnings']:
        print("\n⚠️ Warnings:")
        for warning in report['warnings']:
            # Indented entries are sub-items of the preceding warning
            if warning.startswith(' -'):
                print(f" {warning}")
            else:
                print(f" • {warning}")

    if report['insights']:
        print("\n💡 Insights & Recommendations:")
        for i, insight in enumerate(report['insights'], 1):
            print(f" {i}. {insight}")

    print("\n" + "="*60 + "\n")


def main():
    parser = argparse.ArgumentParser(
        description='CI/CD Pipeline Health Checker',
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    parser.add_argument('--platform', required=True, choices=['github', 'gitlab'],
                        help='CI/CD platform')
    parser.add_argument('--repo', help='GitHub repository (owner/repo)')
    parser.add_argument('--workflow', help='Specific workflow name or file to check (GitHub only)')
    parser.add_argument('--project-id', help='GitLab project ID')
    parser.add_argument('--url', default='https://gitlab.com', help='GitLab URL')
    parser.add_argument('--token', help='GitLab token (or use GITLAB_TOKEN env var)')
    parser.add_argument('--limit', type=int, default=20, help='Number of recent runs/pipelines to analyze')
    args = parser.parse_args()

    # Create checker
    checker = CIHealthChecker(
        platform=args.platform,
        repo=args.repo,
        workflow=args.workflow,
        project_id=args.project_id,
        url=args.url,
        token=args.token,
        limit=args.limit
    )

    # Run checks
    if args.platform == 'github':
        report = checker.check_github_workflows()
    elif args.platform == 'gitlab':
        report = checker.check_gitlab_pipelines()

    # Print report
    print_report(report)

    # Exit with error code unless healthy (a 'degraded' status also exits 1)
    sys.exit(0 if report['status'] == 'healthy' else 1)


if __name__ == '__main__':
    main()
```
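
The aggregation step in `check_github_workflows` can be tried without the `gh` CLI or network access. The sketch below is not part of the script: `summarize_runs` and the sample run records are hypothetical, and only the `conclusion` and `workflowName` fields and the 20% failure threshold are taken from the code above.

```python
from collections import Counter
from typing import Dict, List


def summarize_runs(runs: List[Dict]) -> Dict:
    """Mirror the failure-rate and per-workflow grouping used in ci_health.py."""
    total = len(runs)
    failed = [r for r in runs if r.get('conclusion') == 'failure']
    # Guard against an empty run list before dividing
    failure_rate = (len(failed) / total * 100) if total else 0.0
    by_workflow = Counter(r.get('workflowName', 'unknown') for r in failed)
    return {
        'total_runs': total,
        'failed_runs': len(failed),
        'failure_rate': failure_rate,
        'failure_by_workflow': dict(by_workflow),
        # Same threshold the script uses to raise a "High failure rate" issue
        'high_failure_rate': failure_rate > 20,
    }


# Hypothetical records shaped like `gh run list --json conclusion,workflowName` output
sample_runs = [
    {'conclusion': 'success', 'workflowName': 'Node.js CI'},
    {'conclusion': 'failure', 'workflowName': 'Node.js CI'},
    {'conclusion': 'failure', 'workflowName': 'Docs'},
    {'conclusion': 'cancelled', 'workflowName': 'Node.js CI'},
]

summary = summarize_runs(sample_runs)
print(summary)
```

Keeping the aggregation in a pure function like this would make the thresholds easy to unit test, at the cost of a small refactor of the class.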
### assets/templates/github-actions/node-ci.yml
```yaml
# Node.js CI/CD Pipeline
# Optimized workflow with caching, matrix testing, and deployment

name: Node.js CI

on:
  push:
    branches: [main, develop]
    paths-ignore:
      - '**.md'
      - 'docs/**'
  pull_request:
    branches: [main]

# Cancel in-progress runs for same workflow
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  # Security: Secret Scanning
  secret-scan:
    name: Secret Scanning
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: TruffleHog Secret Scan
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.repository.default_branch }}
          head: HEAD

      - name: Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  # Security: SAST
  sast:
    name: Static Analysis
    runs-on: ubuntu-latest
    timeout-minutes: 15
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: javascript
          queries: security-and-quality

      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3

      - name: Run Semgrep
        uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/security-audit
            p/owasp-top-ten

  # Security: Dependency Scanning
  dependency-scan:
    name: Dependency Security
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: npm audit
        run: |
          npm audit --audit-level=moderate --json > npm-audit.json || true
          npm audit --audit-level=high
        continue-on-error: false

      - name: Upload audit results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: npm-audit-report
          path: npm-audit.json

  lint:
    name: Lint
    runs-on: ubuntu-latest
    needs: [secret-scan]
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Check formatting
        run: npm run format:check

  test:
    name: Test (Node ${{ matrix.node }})
    runs-on: ubuntu-latest
    timeout-minutes: 20
    strategy:
      matrix:
        node: [18, 20, 22]
      fail-fast: false
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run unit tests
        run: npm run test:unit

      - name: Run integration tests
        run: npm run test:integration
        if: matrix.node == 20  # Only run on one version

      - name: Upload coverage
        uses: codecov/codecov-action@v4
        if: matrix.node == 20
        with:
          files: ./coverage/lcov.info
          fail_ci_if_error: false

  build:
    name: Build
    runs-on: ubuntu-latest
    needs: [lint, test, sast, dependency-scan]
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Build application
        run: npm run build

      - name: Upload build artifacts
        uses: actions/upload-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/
          retention-days: 7

  e2e:
    name: E2E Tests
    runs-on: ubuntu-latest
    needs: build
    if: github.ref == 'refs/heads/main'
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/

      - name: Run E2E tests
        run: npm run test:e2e

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-results
          path: test-results/

  deploy-staging:
    name: Deploy to Staging
    runs-on: ubuntu-latest
    needs: [build, test]
    if: github.ref == 'refs/heads/develop'
    environment:
      name: staging
      url: https://staging.example.com
    permissions:
      contents: read
      id-token: write  # For OIDC
    steps:
      - uses: actions/checkout@v4

      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1

      - name: Deploy to S3
        run: |
          aws s3 sync dist/ s3://${{ secrets.STAGING_BUCKET }}
          aws cloudfront create-invalidation --distribution-id ${{ secrets.STAGING_CF_DIST }} --paths "/*"

      - name: Smoke tests
        run: |
          sleep 10
          curl -f https://staging.example.com/health || exit 1

  deploy-production:
    name: Deploy to Production
    runs-on: ubuntu-latest
    needs: [e2e]
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://example.com
    permissions:
      contents: read
      id-token: write
    steps:
      - uses: actions/checkout@v4

      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1

      - name: Deploy to S3
        run: |
          aws s3 sync dist/ s3://${{ secrets.PRODUCTION_BUCKET }}
          aws cloudfront create-invalidation --distribution-id ${{ secrets.PRODUCTION_CF_DIST }} --paths "/*"

      - name: Health check
        run: |
          for i in {1..10}; do
            if curl -f https://example.com/health; then
              echo "Health check passed"
              exit 0
            fi
            echo "Attempt $i failed, retrying..."
            sleep 10
          done
          echo "Health check failed"
          exit 1

      - name: Create deployment record
        run: |
          echo "Deployed version: ${{ github.sha }}"
          echo "Deployment time: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
          # Optionally create release with gh CLI:
          # gh release create v${{ github.run_number }} \
          #   --title "Release v${{ github.run_number }}" \
          #   --notes "Deployed commit ${{ github.sha }}"
```
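
The `needs:` entries in the workflow above form a dependency graph between jobs. As a rough illustration (not part of the template), the sketch below copies those `needs` lists into a hypothetical `JOB_NEEDS` dict and uses Python's standard `graphlib` module (Python 3.9+) to group jobs into waves that could start together. It ignores the `if:` branch conditions and the test matrix, so it shows ordering only, not which jobs actually run on a given branch.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Hypothetical transcription of the `needs:` lists in node-ci.yml
JOB_NEEDS: Dict[str, List[str]] = {
    'secret-scan': [],
    'sast': [],
    'dependency-scan': [],
    'lint': ['secret-scan'],
    'test': [],
    'build': ['lint', 'test', 'sast', 'dependency-scan'],
    'e2e': ['build'],
    'deploy-staging': ['build', 'test'],
    'deploy-production': ['e2e'],
}


def execution_waves(job_needs: Dict[str, List[str]]) -> List[List[str]]:
    """Group jobs into waves where every job's dependencies are in earlier waves."""
    sorter = TopologicalSorter(job_needs)
    sorter.prepare()  # raises graphlib.CycleError if the needs graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(JOB_NEEDS)
for number, wave in enumerate(waves, 1):
    print(f"Wave {number}: {', '.join(wave)}")
```

The waves are a simplification: GitHub Actions starts each job as soon as its own dependencies finish, so `lint` can begin once `secret-scan` completes even if `sast` is still running. The grouping is still a quick way to see that `build` is the join point for all checks and that production deployment sits behind `e2e`.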