Music Metadata API - Deployment
Deployment Overview
Music Metadata API supports two primary deployment models:
- Standalone binary - Single executable with database files
- Docker container - Containerized deployment with orchestration support
Both models need ~216GB of database files on disk; CPU and memory needs are modest by comparison (see Resource Requirements).
Build Process
Building from Source
Prerequisites:
- Go 1.24+
- Git
Build steps:
# Clone repository
git clone https://github.com/Aunali321/music-metadata-api.git
cd music-metadata-api
# Build binary (CGO disabled for static linking)
CGO_ENABLED=0 go build -ldflags="-s -w" -o metadata-api ./cmd/server
# Verify binary
./metadata-api -h
Build flags explained:
| Flag | Purpose | Impact |
|---|---|---|
| CGO_ENABLED=0 | Disable CGO | Pure Go binary, no C dependencies |
| -ldflags="-s -w" | Strip symbols | Smaller binary (~30% reduction) |
| -s | Strip debug symbols | Removes symbol table |
| -w | Strip DWARF | Removes debugging info |
Binary size: ~10-15MB (stripped)
Output: Single executable (metadata-api)
Cross-Compilation
Build for Linux (from macOS/Windows):
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -ldflags="-s -w" -o metadata-api-linux ./cmd/server
Build for ARM (Raspberry Pi, AWS Graviton):
GOOS=linux GOARCH=arm64 CGO_ENABLED=0 go build -ldflags="-s -w" -o metadata-api-arm64 ./cmd/server
Supported platforms:
- Linux (amd64, arm64)
- macOS (amd64, arm64)
- Windows (amd64)
Docker Build
Dockerfile
Multi-stage build:
# Stage 1: Build
FROM golang:1.24-alpine AS builder
WORKDIR /app
# Copy dependency files
COPY go.mod go.sum ./
RUN go mod download
# Copy source code
COPY . .
# Build binary
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o metadata-api ./cmd/server
# Stage 2: Runtime
FROM alpine:3.21
# Install CA certificates (for HTTPS if needed)
RUN apk --no-cache add ca-certificates
WORKDIR /app
# Copy binary from builder
COPY --from=builder /app/metadata-api .
# Expose port
EXPOSE 8080
# Run as non-root user
RUN adduser -D -u 1000 apiuser
USER apiuser
# Entry point
ENTRYPOINT ["/app/metadata-api"]
Build characteristics:
- Base image: Alpine Linux 3.21 (~5MB)
- Final image size: ~15-20MB (without databases)
- Security: Runs as non-root user
- Layers: Optimized for caching (dependencies separate from code)
Building Docker Image
Build locally:
docker build -t metadata-api:latest .
Build with specific tag:
docker build -t metadata-api:v1.0.0 .
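Build for multiple platforms (the classic local Docker image store cannot hold a multi-platform result, so either push to a registry with --push and a registry-qualified tag, or build one platform at a time with --load):
docker buildx build --platform linux/amd64,linux/arm64 -t metadata-api:latest .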
Official Docker Image
Registry: GitHub Container Registry (ghcr.io)
Image: ghcr.io/aunali321/music-metadata-api:latest
Pull image:
docker pull ghcr.io/aunali321/music-metadata-api:latest
Image tags:
- latest - Latest build from main branch
- v* - Semantic version tags (e.g., v1.0.0)
CI/CD Pipeline
GitHub Actions Workflow
File: .github/workflows/docker-publish.yml
Triggers:
- Push to main branch
- Push tags matching v* (e.g., v1.0.0)
- Pull requests (build only, no publish)
Workflow steps:
name: Docker Publish
on:
push:
branches: [main]
tags: ['v*']
pull_request:
branches: [main]
jobs:
build:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: ghcr.io/${{ github.repository }}
tags: |
type=ref,event=branch
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
Key features:
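- Multi-platform builds: amd64, arm64 (not configured in the workflow as shown; this needs a platforms: linux/amd64,linux/arm64 input on the build step, typically together with docker/setup-qemu-action)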
- Caching: GitHub Actions cache for faster builds
- Automatic tagging: Branch name, semantic versions
- Security: Uses GitHub token (no manual secrets)
Notable omission: No test step (zero tests in codebase)
Release Process
Create release:
# Tag version
git tag v1.0.0
git push origin v1.0.0
# GitHub Actions automatically:
# 1. Builds Docker image
# 2. Tags it from the Git tag (full version, major.minor, and latest; the workflow shown defines no major-only pattern)
# 3. Pushes to ghcr.io
Verify release:
docker pull ghcr.io/aunali321/music-metadata-api:v1.0.0
Standalone Deployment
Prerequisites
System requirements:
- Linux, macOS, or Windows
- 216GB disk space (databases)
- 4GB+ RAM
- SSD recommended (HDD too slow)
Database files:
- main_database.sqlite3 (~117GB)
- track_files.sqlite3 (~99GB)
- Must be obtained separately (not in repository)
Deployment Steps
1. Prepare environment:
# Create directory structure
mkdir -p /opt/metadata-api/data
cd /opt/metadata-api
# Copy databases
cp /path/to/main_database.sqlite3 data/
cp /path/to/track_files.sqlite3 data/
# Copy binary
cp metadata-api /opt/metadata-api/
chmod +x metadata-api
2. Run service:
./metadata-api -db /opt/metadata-api/data/main_database.sqlite3 -addr :8080
3. Verify:
curl http://localhost:8080/health
# Expected: {"status":"ok"}
Systemd Service
Create service file: /etc/systemd/system/metadata-api.service
[Unit]
Description=Music Metadata API
After=network.target
[Service]
Type=simple
User=apiuser
Group=apiuser
WorkingDirectory=/opt/metadata-api
ExecStart=/opt/metadata-api/metadata-api -db /opt/metadata-api/data/main_database.sqlite3 -addr :8080
Restart=on-failure
RestartSec=10s
# Resource limits
LimitNOFILE=65536
MemoryMax=8G
# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=metadata-api
[Install]
WantedBy=multi-user.target
Enable and start:
# Create user
sudo useradd -r -s /bin/false apiuser
sudo chown -R apiuser:apiuser /opt/metadata-api
# Enable service
sudo systemctl daemon-reload
sudo systemctl enable metadata-api
sudo systemctl start metadata-api
# Check status
sudo systemctl status metadata-api
# View logs
sudo journalctl -u metadata-api -f
Docker Deployment
Docker Run
Basic run:
docker run -d \
--name metadata-api \
-p 8080:8080 \
-v /path/to/databases:/data:ro \
ghcr.io/aunali321/music-metadata-api:latest \
-db /data/main_database.sqlite3
With resource limits:
docker run -d \
--name metadata-api \
-p 8080:8080 \
-v /path/to/databases:/data:ro \
--memory=8g \
--cpus=2 \
--restart=unless-stopped \
ghcr.io/aunali321/music-metadata-api:latest \
-db /data/main_database.sqlite3 \
-addr :8080
Verify:
docker logs metadata-api
curl http://localhost:8080/health
Docker Compose
File: docker-compose.yml
version: '3.8'
services:
metadata-api:
image: ghcr.io/aunali321/music-metadata-api:latest
container_name: metadata-api
ports:
- "8080:8080"
volumes:
- ./data:/data:ro
environment:
- LOG_LEVEL=info # NOTE: Not actually used in code
command: ["-db", "/data/main_database.sqlite3"]
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 10s
restart: unless-stopped
deploy:
resources:
limits:
memory: 8G
cpus: '2'
reservations:
memory: 4G
cpus: '1'
Deploy:
# Start services
docker-compose up -d
# View logs
docker-compose logs -f
# Stop services
docker-compose down
Health check details:
- Command: wget --spider -q http://localhost:8080/health
- Interval: Every 30 seconds
- Timeout: 10 seconds
- Retries: 3 failures before unhealthy
- Start period: 10 seconds grace period
Limitation: Health check doesn't verify database connectivity (naive implementation)
Kubernetes Deployment
Deployment Manifest
File: k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: metadata-api
labels:
app: metadata-api
spec:
replicas: 3
selector:
matchLabels:
app: metadata-api
template:
metadata:
labels:
app: metadata-api
spec:
containers:
- name: api
image: ghcr.io/aunali321/music-metadata-api:latest
args: ["-db", "/data/main_database.sqlite3", "-addr", ":8080"]
ports:
- containerPort: 8080
name: http
volumeMounts:
- name: database
mountPath: /data
readOnly: true
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 10
periodSeconds: 30
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
resources:
requests:
memory: "4Gi"
cpu: "1"
limits:
memory: "8Gi"
cpu: "2"
securityContext:
runAsNonRoot: true
runAsUser: 1000
readOnlyRootFilesystem: true
volumes:
- name: database
persistentVolumeClaim:
claimName: metadata-db-pvc
Service Manifest
File: k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
name: metadata-api
labels:
app: metadata-api
spec:
type: LoadBalancer
selector:
app: metadata-api
ports:
- port: 80
targetPort: 8080
protocol: TCP
name: http
sessionAffinity: None
Persistent Volume
File: k8s/pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: metadata-db-pvc
spec:
accessModes:
- ReadOnlyMany # Multiple pods can read
resources:
requests:
storage: 220Gi # 216GB databases + overhead
storageClassName: fast-ssd # Use SSD storage class
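Storage options:
- AWS EBS: Use gp3 volumes (SSD)
- GCP Persistent Disk: Use pd-ssd
- Azure Disk: Use Premium_LRS
- NFS: Shared filesystem (slower, but works)
Block volumes such as AWS EBS and Azure Disk normally attach to a single node at a time (ReadWriteOnce), so the ReadOnlyMany claim above needs a backend that supports multi-node read-only attachment, such as a shared filesystem.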
Horizontal Pod Autoscaler
File: k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: metadata-api-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: metadata-api
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
Deploy to Kubernetes
# Create namespace
kubectl create namespace metadata-api
# Apply manifests
kubectl apply -f k8s/pvc.yaml -n metadata-api
kubectl apply -f k8s/deployment.yaml -n metadata-api
kubectl apply -f k8s/service.yaml -n metadata-api
kubectl apply -f k8s/hpa.yaml -n metadata-api
# Verify deployment
kubectl get pods -n metadata-api
kubectl get svc -n metadata-api
# View logs
kubectl logs -f deployment/metadata-api -n metadata-api
# Get service URL
kubectl get svc metadata-api -n metadata-api -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
Cloud Platform Deployments
AWS ECS
Task definition:
{
"family": "metadata-api",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "2048",
"memory": "8192",
"containerDefinitions": [{
"name": "api",
"image": "ghcr.io/aunali321/music-metadata-api:latest",
"portMappings": [{
"containerPort": 8080,
"protocol": "tcp"
}],
"command": ["-db", "/data/main_database.sqlite3"],
"mountPoints": [{
"sourceVolume": "database",
"containerPath": "/data",
"readOnly": true
}],
"healthCheck": {
"command": ["CMD-SHELL", "wget --spider -q http://localhost:8080/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 10
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/metadata-api",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
}
}],
"volumes": [{
"name": "database",
"efsVolumeConfiguration": {
"fileSystemId": "fs-12345678",
"rootDirectory": "/databases",
"transitEncryption": "ENABLED"
}
}]
}
Deploy:
# Create EFS filesystem (for databases)
aws efs create-file-system --tags Key=Name,Value=metadata-db
# Register task definition
aws ecs register-task-definition --cli-input-json file://task-definition.json
# Create service
aws ecs create-service \
--cluster metadata-cluster \
--service-name metadata-api \
--task-definition metadata-api \
--desired-count 3 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-123],securityGroups=[sg-456]}"
Google Cloud Run
Deploy:
# Build and push image
gcloud builds submit --tag gcr.io/PROJECT_ID/metadata-api
# Create Cloud Filestore instance (for databases)
gcloud filestore instances create metadata-db \
--zone=us-central1-a \
--tier=BASIC_SSD \
--file-share=name=databases,capacity=250GB
# Deploy to Cloud Run
gcloud run deploy metadata-api \
--image gcr.io/PROJECT_ID/metadata-api \
--platform managed \
--region us-central1 \
--memory 8Gi \
--cpu 2 \
--min-instances 1 \
--max-instances 10 \
--port 8080 \
--args="-db,/data/main_database.sqlite3" \
--execution-environment gen2 \
--vpc-connector metadata-vpc
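Note: The deploy command above does not mount any storage at /data. The Filestore share has to be attached to the service as an NFS volume (available in the second-generation execution environment and reached over the VPC connection) before the -db path will resolve.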
Azure Container Instances
Deploy:
# Create Azure Files share (for databases)
az storage share create --name metadata-db --quota 250
# Deploy container
az container create \
--resource-group metadata-rg \
--name metadata-api \
--image ghcr.io/aunali321/music-metadata-api:latest \
--cpu 2 \
--memory 8 \
--ports 8080 \
--command-line "/app/metadata-api -db /data/main_database.sqlite3" \
--azure-file-volume-account-name STORAGE_ACCOUNT \
--azure-file-volume-account-key STORAGE_KEY \
--azure-file-volume-share-name metadata-db \
--azure-file-volume-mount-path /data
Resource Requirements
Minimum Requirements
| Resource | Minimum | Recommended | Notes |
|---|---|---|---|
| CPU | 1 core | 2 cores | Search queries CPU-intensive |
| RAM | 4GB | 8GB | 2.5GB for SQLite + 1.5GB for app/OS |
| Disk | 220GB | 250GB | 216GB databases + overhead |
| Disk Type | SSD | NVMe SSD | HDD too slow for 256M rows |
| Network | 100 Mbps | 1 Gbps | For serving JSON responses |
Scaling Considerations
Vertical scaling:
- More RAM: Larger SQLite cache (faster queries)
- More CPU: Faster search queries (CPU-bound)
- Faster disk: Lower query latency
Horizontal scaling:
- Each instance needs full 216GB database copy
- Read-only safe (no write conflicts)
- Load balancer distributes traffic
- No shared state (rate limiter per-instance)
Cost implications:
- 10 instances = 2.16TB storage (expensive)
- Consider shared filesystem (NFS, EFS) for databases
- Tradeoff: Shared storage slower than local SSD
Monitoring and Logging
Health Checks
Endpoint: GET /health
Response:
{"status":"ok"}
Limitation: Doesn't verify database connectivity
Improved health check (custom implementation):
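// Requires: database/sql, encoding/json, net/http
func healthCheck(db *sql.DB) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        // Ping the database; the request context bounds how long this may take
        if err := db.PingContext(r.Context()); err != nil {
            http.Error(w, "Database unavailable", http.StatusServiceUnavailable)
            return
        }
        // Report success as JSON, matching the existing response body
        w.Header().Set("Content-Type", "application/json")
        json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
    }
}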
Logging
Current implementation:
- Go stdlib log/slog
- Structured logging for errors
- Output to stdout/stderr
Log format:
2024-01-15T10:30:00Z level=ERROR msg="Database query failed" error="no such table"
Docker logging:
# View logs
docker logs -f metadata-api
# Follow logs with timestamps
docker logs -f --timestamps metadata-api
# Last 100 lines
docker logs --tail 100 metadata-api
Kubernetes logging:
# View logs
kubectl logs -f deployment/metadata-api -n metadata-api
# Logs from all pods
kubectl logs -f -l app=metadata-api -n metadata-api
# Previous container logs (after crash)
kubectl logs --previous pod/metadata-api-abc123 -n metadata-api
Metrics (Not Implemented)
Missing metrics:
- Request count by endpoint
- Request duration percentiles
- Error rate
- Database query duration
- Rate limiter rejections
Workaround: Use reverse proxy metrics (nginx, Envoy)
Security Considerations
Container Security
Best practices:
- Run as non-root user (UID 1000)
- Read-only root filesystem
- Drop all capabilities
- No privileged mode
Enhanced Dockerfile:
FROM alpine:3.21
RUN apk --no-cache add ca-certificates && \
adduser -D -u 1000 apiuser
WORKDIR /app
COPY --from=builder /app/metadata-api .
# Make the binary read-and-execute only; this must run as root, before USER,
# because apiuser cannot chmod a root-owned file
RUN chmod 555 /app/metadata-api
USER apiuser
ENTRYPOINT ["/app/metadata-api"]
Network Security
Recommendations:
- Deploy behind reverse proxy (nginx, Traefik)
- Use TLS/HTTPS (terminate at proxy)
- Firewall rules (allow only necessary ports)
- VPC/private network (not public internet)
Example nginx TLS:
server {
listen 443 ssl http2;
server_name api.example.com;
ssl_certificate /etc/ssl/cert.pem;
ssl_certificate_key /etc/ssl/key.pem;
ssl_protocols TLSv1.2 TLSv1.3;
location / {
proxy_pass http://localhost:8080;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
Database Security
Recommendations:
- Read-only volume mounts
- File permissions (chmod 440, so only the owner and the service group can read)
- Separate user for database files
- No write access to application
Example permissions:
sudo chown root:apiuser /data/main_database.sqlite3
sudo chmod 440 /data/main_database.sqlite3
Troubleshooting
Common Issues
Issue: Container fails to start
Diagnosis:
docker logs metadata-api
Common causes:
- Database file not found (check volume mount)
- Incorrect -db path
- Insufficient memory
Solution:
# Verify volume mount
docker inspect metadata-api | grep Mounts -A 10
# Check database path (needs a running container; if it exited, list the host directory instead)
docker exec metadata-api ls -lh /data
Issue: High memory usage
Diagnosis:
docker stats metadata-api
Causes:
- Rate limiter memory leak (unbounded visitor map)
- Large result sets
- Many concurrent requests
Solution:
- Restart container periodically
- Increase memory limit
- Implement visitor cleanup (code change)
Issue: Slow queries
Diagnosis:
- Check disk I/O (use SSD)
- Monitor CPU usage
- Review query patterns
Solution:
- Use SSD storage
- Increase SQLite cache size
- Use batch endpoints (not individual lookups)
Backup and Recovery
Backup Strategy
Database backup:
# Stop service (optional, but safer)
systemctl stop metadata-api
# Copy databases
cp /data/main_database.sqlite3 /backup/main_database.sqlite3.$(date +%Y%m%d)
cp /data/track_files.sqlite3 /backup/track_files.sqlite3.$(date +%Y%m%d)
# Restart service
systemctl start metadata-api
Online backup (while running):
sqlite3 /data/main_database.sqlite3 ".backup /backup/main_database.sqlite3"
Recovery
Restore from backup:
# Stop service
systemctl stop metadata-api
# Restore databases
cp /backup/main_database.sqlite3.20240115 /data/main_database.sqlite3
cp /backup/track_files.sqlite3.20240115 /data/track_files.sqlite3
# Verify integrity
sqlite3 /data/main_database.sqlite3 "PRAGMA integrity_check;"
# Restart service
systemctl start metadata-api
Performance Tuning
Database Optimization
Increase cache size:
_cache_size=-128000 # 128MB (from 64MB)
Increase mmap size:
_mmap_size=2147483648 # 2GB (from 1GB)
Connection pool:
db.SetMaxOpenConns(16) // Increase from 8
Container Optimization
CPU pinning (Docker):
docker run --cpuset-cpus="0-3" metadata-api
Memory limits:
docker run --memory=8g --memory-swap=8g metadata-api
I/O priority:
docker run --blkio-weight=1000 metadata-api