Production-ready AWS CodePipelines from TypeScript, CLI, or a single AI prompt. 124 reusable plugins, per-org compliance enforcement, multi-tenant isolation, and zero vendor lock-in.
Two deployment options: EC2 (single instance, Kubernetes) or Fargate (serverless containers).
Both deploy the full stack: app services, databases, observability (Prometheus, Loki, Grafana), and admin tools. TLS via Let’s Encrypt (custom domain) or self-signed (IP-only).
| Related docs: Environment Variables | API Reference | Plugin Catalog |
| | EC2 | Fargate |
|---|---|---|
| Runtime | Minikube on EC2 | ECS Fargate |
| Infra | 1 CloudFormation stack | 6 CloudFormation stacks |
| TLS | Let’s Encrypt or self-signed | Let’s Encrypt + ACM |
| Storage | hostPath PVCs on EBS | EFS access points |
| Scaling | Vertical (instance resize) | Horizontal (task count) |
| Cost | ~$30-80/mo | ~$100-300/mo |
| Best for | Dev/staging | Production |
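Whichever path you pick, a quick local sanity check can catch missing prerequisites before the first deploy. This is a hedged sketch; the tool list is an assumption based on the commands used below, so extend it to match your workflow:

```shell
# Check that the CLI tools used in this guide are on PATH.
for tool in aws ssh git; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: ok"
  else
    echo "$tool: MISSING"
  fi
done
```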
Single hardened EC2 instance running Minikube with all services.
With custom domain:
cd deploy/aws/ec2
aws cloudformation deploy \
--stack-name pipeline-builder \
--template-file template.yaml \
--parameter-overrides \
DomainName=pipeline.example.com \
HostedZoneId=Z1234567890 \
KeyPairName=my-keypair \
GhcrToken=ghp_xxxxxxxxxxxx \
--capabilities CAPABILITY_IAM
Without domain (self-signed TLS):
aws cloudformation deploy \
--stack-name pipeline-builder \
--template-file template.yaml \
--parameter-overrides \
KeyPairName=my-keypair \
GhcrToken=ghp_xxxxxxxxxxxx \
--capabilities CAPABILITY_IAM
Get the URL:
aws cloudformation describe-stacks --stack-name pipeline-builder \
--query 'Stacks[0].Outputs[?OutputKey==`ApplicationURL`].OutputValue' --output text
| Parameter | Required | Default | Description |
|---|---|---|---|
| `KeyPairName` | Yes | — | EC2 key pair for SSH |
| `GhcrToken` | Yes | — | GHCR token for pulling images |
| `DomainName` | No | — | FQDN for Route 53 + Let’s Encrypt |
| `HostedZoneId` | If domain set | — | Route 53 hosted zone ID |
| `InstanceType` | No | `t3.xlarge` | EC2 instance type (4 vCPU / 16 GiB recommended) |
| `GhcrUser` | No | `mwashburn160` | GitHub username for GHCR |
| `SshCidr` | No | `0.0.0.0/0` | CIDR for SSH access |
| `EbsVolumeSize` | No | `60` | Root volume size in GiB (OS, binaries) |
| `DataVolumeSize` | No | `200` | Data volume size in GiB (Docker, plugins, registry, databases). Increase to 500 for prebuilt. |
| `GitRepo` | No | (this repo) | Git repository URL |
| `GitBranch` | No | `main` | Branch to deploy |
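For repeatable deploys, the overrides can be kept in a small file and expanded on the command line. This is a convenience sketch, not a feature of the template; the `params.txt` name and contents are illustrative:

```shell
# Keep parameter overrides in a file (one Key=Value per line) and
# expand them into the deploy command.
cat > params.txt <<'EOF'
KeyPairName=my-keypair
InstanceType=t3.xlarge
DataVolumeSize=200
EOF
OVERRIDES=$(tr '\n' ' ' < params.txt)
echo "$OVERRIDES"
# aws cloudformation deploy --stack-name pipeline-builder \
#   --template-file template.yaml \
#   --parameter-overrides $OVERRIDES \
#   --capabilities CAPABILITY_IAM
```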
The EC2 deployment uses two EBS volumes:
| Volume | Default | Mount | Contents |
|---|---|---|---|
| Root | 60 GiB | `/` | OS, Docker/minikube binaries, app code |
| Data | 200 GiB | `/mnt/data` | Docker layers, plugin artifacts, registry, databases, logs |
Data volume breakdown:
| Component | build_image | prebuilt | prebuilt + --cleanup |
|---|---|---|---|
| Docker build cache + images | 20-30 GB | 60-90 GB | 60-90 GB |
| Plugin artifacts (image.tar + plugin.zip) | 0 GB | 130-190 GB | 0 GB |
| Registry (pushed images) | 40-60 GB | 40-60 GB | 40-60 GB |
| PostgreSQL + MongoDB | 5-15 GB | 5-15 GB | 5-15 GB |
| Minikube + logs + metrics | 15-25 GB | 15-25 GB | 15-25 GB |
| Total | 80-130 GB | 250-380 GB | 120-190 GB |
Recommendations:
| Plugin strategy | Data volume | Notes |
|---|---|---|
| `build_image` (default) | 200 GB | Builds from Dockerfile at upload time |
| `prebuilt` with `--cleanup` | 250 GB | Removes artifacts after upload |
| `prebuilt` without cleanup | 500 GB | Keeps artifacts for re-runs |
Daily runtime operations (after initial plugin load) add ~1-5 GB/month from database growth and logs. Add a weekly Docker prune cron to reclaim build cache:
# /etc/cron.weekly/docker-prune
docker system prune -af --filter "until=168h"
If you need more storage after deployment (e.g., switching to prebuilt), expand the data volume live — no reboot required:
# 1. Find the data volume ID
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $(curl -s -X PUT \
http://169.254.169.254/latest/api/token -H 'X-aws-ec2-metadata-token-ttl-seconds: 21600')" \
http://169.254.169.254/latest/meta-data/instance-id)
VOL_ID=$(aws ec2 describe-volumes \
--filters "Name=attachment.instance-id,Values=$INSTANCE_ID" "Name=tag:Name,Values=*data*" \
--query 'Volumes[0].VolumeId' --output text)
# 2. Expand to desired size (e.g., 500 GiB for prebuilt)
aws ec2 modify-volume --volume-id $VOL_ID --size 500
# 3. Wait for modification to complete (~30s)
watch -n5 "aws ec2 describe-volumes-modifications --volume-ids $VOL_ID \
--query 'VolumesModifications[0].ModificationState' --output text"
# Wait until it shows "optimizing" or "completed"
# 4. Grow the partition and filesystem (on the EC2 instance)
DEVICE=$(lsblk -no PKNAME $(findmnt -n -o SOURCE /mnt/data))
PART=$(lsblk -no PARTNUM $(findmnt -n -o SOURCE /mnt/data) 2>/dev/null)
[ -n "$PART" ] && sudo growpart /dev/$DEVICE $PART
sudo xfs_growfs /mnt/data # XFS filesystem
# or: sudo resize2fs $(findmnt -n -o SOURCE /mnt/data) # ext4 filesystem
# 5. Verify
df -h /mnt/data
Deploy with a larger data volume upfront:
aws cloudformation deploy \
--stack-name pipeline-builder \
--template-file template.yaml \
--parameter-overrides \
DataVolumeSize=500 \
KeyPairName=my-key \
GhcrToken=ghp_xxx \
--capabilities CAPABILITY_IAM
On first boot, UserData runs bootstrap.sh, which performs the full EC2 setup and generates .env with random secrets (JWT keys, DB passwords).
# Watch bootstrap progress
ssh -i my-keypair.pem ec2-user@<ip> 'tail -f /var/log/user-data.log'
# Check pods
ssh -i my-keypair.pem ec2-user@<ip> \
'sudo -u minikube kubectl --context=pipeline-builder get pods -n pipeline-builder'
# SSM (no SSH key needed)
aws ssm start-session --target <instance-id>
All in deploy/aws/ec2/bin/. On the instance: /opt/pipeline-builder/deploy/aws/ec2/bin/.
| Script | Purpose | Run as |
|---|---|---|
| `bootstrap.sh` | Full EC2 setup (runs automatically via UserData) | root |
| `startup.sh` | Start Minikube + deploy K8s manifests | root (sudo) |
| `shutdown.sh` | Stop Minikube + remove iptables rules | root (sudo) |
| `update-tls-secret.sh` | Certbot hook to update K8s TLS secret | root |
# Start (after bootstrap or reboot)
sudo bash /opt/pipeline-builder/deploy/aws/ec2/bin/startup.sh
# Stop
sudo bash /opt/pipeline-builder/deploy/aws/ec2/bin/shutdown.sh
# Check pod status
sudo -u minikube kubectl get pods -n pipeline-builder
TLS certificates live under /etc/pipeline-builder/tls/.
To tear down the stack:
aws cloudformation delete-stack --stack-name pipeline-builder
aws cloudformation wait stack-delete-complete --stack-name pipeline-builder
Serverless containers on ECS Fargate. 6 CloudFormation stacks deployed in dependency order.
Prerequisites: certbot with the Route 53 DNS plugin.
# macOS
brew install certbot && pip3 install certbot-dns-route53
# Linux
pip3 install certbot certbot-dns-route53
Then deploy:
cd deploy/aws/fargate
bash bin/deploy.sh \
--domain pipeline.example.com \
--hosted-zone-id Z1234567890 \
--ghcr-token ghp_xxxxxxxxxxxx
With an existing ACM certificate:
bash bin/deploy.sh \
--domain pipeline.example.com \
--hosted-zone-id Z1234567890 \
--ghcr-token ghp_xxxxxxxxxxxx \
--certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/abc-123
| Parameter | Required | Default | Description |
|---|---|---|---|
| `--domain` | Yes | — | FQDN for Route 53 + Let’s Encrypt |
| `--hosted-zone-id` | Yes | — | Route 53 hosted zone ID |
| `--ghcr-token` | Yes | — | GHCR token for pulling images |
| `--ghcr-user` | No | `mwashburn160` | GitHub username |
| `--region` | No | `us-east-1` | AWS region |
| `--stack-prefix` | No | `pb` | CloudFormation stack name prefix |
| `--certificate-arn` | No | (auto-provisioned) | Existing ACM certificate ARN |
Deployed in order. Each exports values consumed by downstream stacks.
| Stack | Contents |
|---|---|
| 01-foundation | VPC, ALB, Route 53, EFS, S3 config bucket, Cloud Map |
| 02-cluster | ECS Cluster, IAM roles, security groups, log groups |
| 03-databases | PostgreSQL, MongoDB, Redis |
| 04-services | Nginx, Platform, Pipeline, Plugin, Quota, Billing, Message, Reporting, Frontend |
| 05-observability | Prometheus, Loki, Grafana |
| 06-admin | Registry, PgAdmin, Mongo Express, Registry UI |
Fargate uses managed services — no EBS volumes to manage. Plugin builds use build_image with kaniko (prebuilt is not supported on Fargate).
| Resource | Type | Size | Notes |
|---|---|---|---|
| Task ephemeral | Per-task | 20 GB (30 GB for plugin) | Non-persistent, cleared on task restart |
| RDS PostgreSQL | RDS gp3 | 50 GB (autoscaling) | Pipelines, plugins, compliance, messages |
| DocumentDB / MongoDB Atlas | Managed | 10-20 GB | Quota + billing records |
| ECR | Managed | 40-60 GB | Plugin container images |
| EFS | Managed | 5-10 GB | Shared config, TLS certs |
| CloudWatch Logs | Managed | ~5 GB/month | 30-day retention recommended |
| S3 | Managed | <1 GB | CDK assets, CloudFormation templates |
Recommendations:
| Resource | Setting |
|---|---|
| RDS storage | 50 GB gp3 with autoscaling enabled |
| ECR lifecycle | Keep last 10 tags per repository |
| CloudWatch retention | 30 days |
| Plugin task ephemeral | 30 GB (kaniko builds need temp space) |
Task sizing:
| Service | CPU | Memory | Ephemeral |
|---|---|---|---|
| Plugin | 2048 | 4096 | 30 GB |
| Pipeline | 512 | 1024 | 20 GB |
| Platform | 512 | 1024 | 20 GB |
| Compliance | 512 | 1024 | 20 GB |
| Frontend | 256 | 512 | 20 GB |
| Reporting, Quota, Billing, Message | 256 | 512 | 20 GB |
Monthly cost estimate (storage only):
| Resource | Cost |
|---|---|
| RDS (50 GB gp3) | ~$5 |
| ECR (60 GB) | ~$6 |
| EFS (10 GB) | ~$3 |
| CloudWatch | ~$3 |
| Total | ~$17/mo |
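The line items above can be sanity-checked with a one-liner (the figures are the rough estimates from the table, not live pricing):

```shell
# RDS + ECR + EFS + CloudWatch, in $/mo, from the table above
costs=(5 6 3 3)
total=0
for c in "${costs[@]}"; do total=$((total + c)); done
echo "total: \$${total}/mo"   # matches the ~$17/mo figure above
```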
Unlike EC2, Fargate storage is per-service. Expand each independently:
RDS PostgreSQL — increase allocated storage:
aws rds modify-db-instance \
--db-instance-identifier pb-postgres \
--allocated-storage 100 \
--apply-immediately
# Enable autoscaling to avoid manual expansion in the future
aws rds modify-db-instance \
--db-instance-identifier pb-postgres \
--max-allocated-storage 200 \
--apply-immediately
ECR — add lifecycle policy to prevent unbounded growth:
aws ecr put-lifecycle-policy \
--repository-name plugin \
--lifecycle-policy-text '{
"rules": [{
"rulePriority": 1,
"description": "Keep last 10 images",
"selection": {
"tagStatus": "any",
"countType": "imageCountMoreThan",
"countNumber": 10
},
"action": { "type": "expire" }
}]
}'
CloudWatch Logs — set retention to control growth:
# List log groups
aws logs describe-log-groups --log-group-name-prefix /ecs/pb- --query 'logGroups[*].logGroupName' --output table
# Set 30-day retention on all
for lg in $(aws logs describe-log-groups --log-group-name-prefix /ecs/pb- --query 'logGroups[*].logGroupName' --output text); do
aws logs put-retention-policy --log-group-name "$lg" --retention-in-days 30
done
Task ephemeral storage — increase in CloudFormation stack:
Update EphemeralStorage in 04-services.yaml and redeploy:
cd deploy/aws/fargate
bash bin/deploy.sh --stack-prefix pb --region us-east-1 --domain app.example.com
| Kubernetes | Fargate |
|---|---|
| K8s DNS | Cloud Map (*.pipeline-builder.local) |
| hostPath PVCs | EFS access points |
| K8s Secrets | AWS Secrets Manager |
| ConfigMaps | S3 bucket (downloaded at startup) |
| NodePort + iptables | ALB + target groups |
| Docker socket mount | Kaniko sidecar |
| Promtail DaemonSet | Fluent Bit sidecar (FireLens) |
| NetworkPolicies | Security groups |
| Init containers | ECS container dependency ordering |
All in deploy/aws/fargate/bin/. Run from your local machine.
| Script | Purpose |
|---|---|
| `deploy.sh` | Full deploy: secrets → cert → 6 stacks → config upload |
| `teardown.sh` | Delete all stacks in reverse order |
| `init-cert.sh` | Let’s Encrypt cert via Route 53 DNS challenge → ACM |
| `init-secrets.sh` | Generate random secrets → Secrets Manager |
# Service status
aws ecs describe-services --cluster pipeline-builder \
--services nginx platform pipeline plugin --region us-east-1
# Tail logs
aws logs tail /pipeline-builder/nginx --follow --region us-east-1
Certificates expire every 90 days. Re-run to renew (same ACM ARN is reused):
bash bin/init-cert.sh --domain pipeline.example.com
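Renewal can be automated with a monthly cron hook. This is a sketch that assumes the repo is checked out at a stable path on the machine running it (the `$HOME/pipeline-builder` path below is an assumption); certbot typically skips re-issuing a certificate that is not yet due for renewal, so a monthly schedule is a reasonable default for 90-day certs:

```shell
# Write the renewal hook locally, then install it to /etc/cron.monthly/.
cat > renew-cert <<'EOF'
#!/bin/bash
# Path to your checkout -- adjust before installing (assumption).
cd "$HOME/pipeline-builder/deploy/aws/fargate" || exit 1
bash bin/init-cert.sh --domain pipeline.example.com
EOF
chmod +x renew-cert
# sudo mv renew-cert /etc/cron.monthly/renew-cert
```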
bash bin/teardown.sh --stack-prefix pb --region us-east-1
Secrets Manager entries are not auto-deleted. Remove manually if needed.
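A small helper can sweep those leftover entries. A hedged sketch: the `pb` name filter is an assumption (match it to your `--stack-prefix`), and `--force-delete-without-recovery` skips the usual recovery window, so double-check the listed names before calling it:

```shell
# Delete Secrets Manager entries left behind by teardown.sh.
cleanup_secrets() {
  local prefix="${1:-pb}" region="${2:-us-east-1}"
  aws secretsmanager list-secrets --region "$region" \
    --filters "Key=name,Values=$prefix" \
    --query 'SecretList[*].Name' --output text | tr '\t' '\n' |
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    echo "deleting $name"
    aws secretsmanager delete-secret --secret-id "$name" \
      --force-delete-without-recovery --region "$region"
  done
}
# cleanup_secrets pb us-east-1
```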
After deploying (EC2 or Fargate), complete these steps to initialize the platform and enable reporting.
Register the admin user and load pre-built plugins and sample pipelines:
cd deploy
# Interactive — prompts for admin credentials, build strategy, and categories
bash bin/init-platform.sh ec2 # EC2 (resolves URL from CloudFormation)
bash bin/init-platform.sh local # Docker Compose
bash bin/init-platform.sh minikube # Minikube
# Non-interactive
export PLATFORM_BASE_URL=https://pipeline.example.com
export PLATFORM_IDENTIFIER=admin@internal
export PLATFORM_PASSWORD=SecurePassword123!
bash bin/init-platform.sh ec2
# Non-interactive with prebuilt images
PLUGIN_BUILD_STRATEGY=prebuilt bash bin/init-platform.sh ec2
# Non-interactive with prebuilt + specific categories
PLUGIN_BUILD_STRATEGY=prebuilt PLUGIN_CATEGORY=infrastructure,language bash bin/init-platform.sh ec2
# Control upload parallelism (default: 4, auto-lowered to 1 for prebuilt)
PARALLEL_JOBS=2 bash bin/init-platform.sh local
# Force rebuild all prebuilt images even if image.tar exists
PLUGIN_BUILD_STRATEGY=prebuilt FORCE_REBUILD=true bash bin/init-platform.sh ec2
# Clean up plugin.zip and image.tar after upload (reclaim disk space)
./deploy/bin/init-platform.sh --cleanup local
./deploy/bin/load-plugins.sh --rebuild --cleanup
# EC2 with sudo (required for minikube user context)
sudo -u minikube PLATFORM_BASE_URL=https://your-ip bash /opt/pipeline-builder/deploy/bin/init-platform.sh ec2
sudo -u minikube PLATFORM_BASE_URL=https://your-ip bash /opt/pipeline-builder/deploy/bin/init-platform.sh --cleanup ec2
init-platform.sh does: health check → register admin → login → select build strategy → load plugins → load pipelines.
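For debugging, the same flow can be exercised by hand with curl. This is a sketch only; the endpoint paths in the comments are assumptions, so check the API Reference for the real routes:

```shell
# Manual version of the init-platform.sh flow (endpoints assumed).
BASE="${PLATFORM_BASE_URL:-https://pipeline.example.com}"
payload=$(printf '{"identifier":"%s","password":"%s"}' \
  "admin@internal" "SecurePassword123!")
echo "$payload"
# 1. Health check:   curl -sk "$BASE/api/health"
# 2. Register admin: curl -sk -X POST "$BASE/api/auth/register" -H 'Content-Type: application/json' -d "$payload"
# 3. Login:          curl -sk -X POST "$BASE/api/auth/login"    -H 'Content-Type: application/json' -d "$payload"
# 4. Load plugins and pipelines with the returned token via load-plugins.sh / load-pipelines.sh
```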
| Variable | Default | Description |
|---|---|---|
| `PLATFORM_BASE_URL` | auto-detected | Platform API URL (skips CloudFormation lookup when set) |
| `PLATFORM_IDENTIFIER` | `admin@internal` | Admin email |
| `PLATFORM_PASSWORD` | `SecurePassword123!` | Admin password |
| `PLUGIN_BUILD_STRATEGY` | `build_image` | `build_image` or `prebuilt` |
| `PLUGIN_CATEGORY` | all | Comma-separated categories (e.g., `language,security`) |
| `PARALLEL_JOBS` | 4 (1 for prebuilt) | Upload concurrency. Passed through to load-plugins.sh. Override with `--parallel N` on the CLI. |
| `FORCE_REBUILD` | `false` | Force rebuild all prebuilt image.tar files |
| `PLUGIN_S3_CLEAR` | `false` | Clear S3 bucket before upload (S3 strategy only) |
Use the --cleanup flag on init-platform.sh or load-plugins.sh to remove plugin.zip and image.tar files after upload. Useful on EC2, where prebuilt images can consume 25-75 GB of disk.
| Script | Purpose |
|---|---|
| `init-platform.sh` | Register admin + select build strategy + load plugins + pipelines (interactive) |
| `build-plugin-images.sh` | Pre-build Docker images for plugins (prebuilt strategy) |
| `load-plugins.sh` | Upload plugins from deploy/plugins/ |
| `load-pipelines.sh` | Upload pipelines from deploy/samples/pipelines/ |
| `test-plugins.sh` | Validate plugin specs and Dockerfiles |
The plugin-lookup Lambda and event-ingestion Lambda use a JWT token stored in Secrets Manager. Generate and store it using the CLI:
# First, login to get a PLATFORM_TOKEN
eval $(pipeline-manager login -u admin@your-domain.com -p '***' --quiet --no-verify-ssl)
# Then generate a long-lived token and store in Secrets Manager
pipeline-manager store-token --days 30 --region us-east-1
Set up pipeline execution reporting to track success rates, stage performance, and build analytics:
export PLATFORM_BASE_URL=https://pipeline.example.com
pipeline-manager setup-events --region us-east-1
This creates a CloudFormation stack (pipeline-builder-events) containing the EventBridge rule, SQS queue, and event-ingestion Lambda. Verify:
# Check the EventBridge stack
aws cloudformation describe-stacks --stack-name pipeline-builder-events \
--query 'Stacks[0].StackStatus' --output text
# Check the EventBridge rule
aws events describe-rule --name pipeline-builder-codepipeline-events
# Check the Lambda
aws lambda get-function --function-name pipeline-builder-event-ingestion \
--query 'Configuration.LastModified'
1. Deploy → pipeline-manager registers hashed ARN in pipeline_registry
2. Execute → CodePipeline runs → EventBridge captures state changes
3. Ingest → SQS → Lambda → hashes account → POST /api/reports/events
4. Store → Reporting API resolves org via registry → inserts into pipeline_events
5. View → Dashboard Reports page or GET /api/reports/...
AWS account numbers are never stored in plain text. Both the deploy command and Lambda handler hash account numbers with SHA-256 before sending to the API. The reporting API applies the same hash as defense in depth.
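The hashing step can be reproduced locally to confirm what the API receives, assuming, as a sketch, that the hash input is the raw 12-digit account ID string with no salt:

```shell
# Hex SHA-256 of an example account number (input format is an assumption).
ACCOUNT_ID="123456789012"
HASH=$(printf '%s' "$ACCOUNT_ID" | sha256sum | awk '{print $1}')
echo "$HASH"
```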
Plugin Docker builds are captured automatically by the plugin service (no EventBridge needed).
All endpoints require authentication and org context. Time range defaults to last 30 days.
| Endpoint | Description | Query Params |
|---|---|---|
| `GET /api/reports/execution/count` | Execution count per pipeline with status breakdown | — |
| `GET /api/reports/execution/success-rate` | Pass/fail rate over time | interval, from, to |
| `GET /api/reports/execution/timeline` | Execution timeline (alias for success-rate) | interval, from, to |
| `GET /api/reports/execution/duration` | Average/min/max/p95 execution duration | from, to |
| `GET /api/reports/execution/stage-failures` | Stage failure heatmap | from, to |
| `GET /api/reports/execution/stage-bottlenecks` | Slowest stages per pipeline | from, to |
| `GET /api/reports/execution/action-failures` | Action/step failure rate | from, to |
| `GET /api/reports/execution/errors` | Error categorization (top N) | from, to, limit |
| Endpoint | Description | Query Params |
|---|---|---|
| `GET /api/reports/plugins/summary` | Plugin inventory (total/active/public/private) | — |
| `GET /api/reports/plugins/distribution` | Type and compute distribution | — |
| `GET /api/reports/plugins/versions` | Version counts per plugin name | — |
| `GET /api/reports/plugins/build-success-rate` | Docker build success rate over time | interval, from, to |
| `GET /api/reports/plugins/build-duration` | Build time per plugin | from, to |
| `GET /api/reports/plugins/build-failures` | Build failure reasons (top N) | from, to, limit |
Common query parameters:
| Param | Values | Default |
|---|---|---|
| `interval` | `day`, `week`, `month` | `week` |
| `from` | ISO 8601 timestamp | 30 days ago |
| `to` | ISO 8601 timestamp | now |
| `limit` | integer | 20 |
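Putting the parameters together, a report query looks like this. A sketch assuming a bearer token from `pipeline-manager login`; the `report_url` helper is purely illustrative:

```shell
# Build a report URL from the common query parameters above.
report_url() {
  local base="$1" endpoint="$2" interval="$3" from="$4" to="$5"
  echo "${base}${endpoint}?interval=${interval}&from=${from}&to=${to}"
}
URL=$(report_url "https://pipeline.example.com" \
  "/api/reports/execution/success-rate" \
  day "2024-06-01T00:00:00Z" "2024-06-30T23:59:59Z")
echo "$URL"
# curl -s -H "Authorization: Bearer $PLATFORM_TOKEN" "$URL"
```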
After deployment, access services at:
| Service | Path |
|---|---|
| Application | / |
| Reports Dashboard | /dashboard/reports |
| Grafana | /grafana/ |
| PgAdmin | /pgadmin/ |
| Mongo Express | /mongo-express/ |
| Registry UI | /registry-express/ |
Pods stuck Pending (EC2):
Check CPU requests vs instance capacity. kubectl describe pod <name> shows scheduling failures.
ImagePullBackOff (EC2): Verify GHCR credentials and that iptables rules aren’t intercepting minikube’s outbound traffic.
CrashLoopBackOff on observability pods (EC2):
Usually hostPath permission issues. Check pod logs. Init containers handle chown for loki (10001), prometheus (65534), grafana (472).
ECS tasks stuck in PROVISIONING (Fargate):
Check CloudWatch logs: aws logs tail /pipeline-builder/<service> --follow
ALB health checks failing (Fargate): Verify nginx is running. Check target group health and port 8080 accessibility.
Certificate errors:
Ensure certbot has Route 53 permissions. Check ACM status: aws acm describe-certificate --certificate-arn <arn>
No reporting data after deploy:
Confirm pipeline-manager store-token was run, tail the ingestion Lambda logs with aws logs tail /aws/lambda/pipeline-builder-event-ingestion --follow, and make sure the pipeline was deployed after setup-events (its ARN must be registered).