

# Harmony - Deployment and Operations Analysis
## Deployment Philosophy
Harmony follows a **self-hosted, no-containerization** approach:
- **No Docker**: Direct Deno runtime execution
- **No Kubernetes**: Simple systemd service management
- **No cloud-native complexity**: Traditional server deployment
- **Deno Deploy compatible**: Can deploy to Deno's edge platform
This design prioritizes:
- **Simplicity**: Minimal deployment dependencies
- **Deno consistency**: Same runtime across dev and prod
- **Low overhead**: No container orchestration
- **Easy debugging**: Direct process access
## Production Deployment
### Prerequisites
1. **Deno runtime**: Version 1.37+ (Fresh 1.6.8 requirement)
2. **Git**: For version tracking and deployment
3. **systemd**: For service management (Linux)
4. **Environment variables**: OAuth2 credentials, configuration
### Installation Steps
#### 1. Clone Repository
```bash
cd /opt
git clone https://github.com/kellnerd/harmony.git
cd harmony
```
#### 2. Configure Environment
Create a `.env` file from the template:
```bash
cp .env.example .env
```
Edit `.env`:
```bash
# OAuth2 Credentials
HARMONY_SPOTIFY_CLIENT_ID=your_spotify_client_id
HARMONY_SPOTIFY_CLIENT_SECRET=your_spotify_client_secret
HARMONY_TIDAL_CLIENT_ID=your_tidal_client_id
HARMONY_TIDAL_CLIENT_SECRET=your_tidal_client_secret
# MusicBrainz Configuration
HARMONY_MB_API_URL=https://musicbrainz.org/ws/2
HARMONY_MB_TARGET_URL=https://musicbrainz.org
# Data Storage
HARMONY_DATA_DIR=/var/lib/harmony
# Server Configuration
PORT=8000
FORWARD_PROTO=https
```
#### 3. Create Data Directory
```bash
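# Create the service account used by the unit file (skip if it already exists)
useradd --system --user-group --home-dir /var/lib/harmony harmony
mkdir -p /var/lib/harmony/snaps
chown -R harmony:harmony /var/lib/harmony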
```
#### 4. Create systemd Service
Create `/etc/systemd/system/harmony.service`:
```ini
[Unit]
Description=Harmony Music Metadata Aggregator
After=network.target
[Service]
Type=simple
User=harmony
Group=harmony
WorkingDirectory=/opt/harmony
EnvironmentFile=/opt/harmony/.env
ExecStart=/usr/local/bin/deno run -A server/main.ts
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/harmony
[Install]
WantedBy=multi-user.target
```
#### 5. Enable and Start Service
```bash
systemctl daemon-reload
systemctl enable harmony
systemctl start harmony
systemctl status harmony
```
### Server Startup
**Command**:
```bash
deno run -A server/main.ts
```
**Flags**:
- `-A`: Allow all permissions (network, file system, environment, subprocesses, FFI). This is convenient but disables Deno's permission sandbox.
**Alternative** (granular permissions):
```bash
deno run \
  --allow-net \
  --allow-read=/opt/harmony,/var/lib/harmony \
  --allow-write=/var/lib/harmony \
  --allow-env \
  server/main.ts
```
**Environment Variables**:
| Variable | Required | Default | Purpose |
|----------|----------|---------|---------|
| `PORT` | No | `8000` | HTTP server port |
| `DENO_DEPLOYMENT_ID` | No | Auto-generated | Version identifier |
| `HARMONY_SPOTIFY_CLIENT_ID` | Yes* | - | Spotify OAuth2 client ID |
| `HARMONY_SPOTIFY_CLIENT_SECRET` | Yes* | - | Spotify OAuth2 client secret |
| `HARMONY_TIDAL_CLIENT_ID` | Yes* | - | Tidal OAuth2 client ID |
| `HARMONY_TIDAL_CLIENT_SECRET` | Yes* | - | Tidal OAuth2 client secret |
| `HARMONY_MB_API_URL` | No | `https://musicbrainz.org/ws/2` | MusicBrainz API endpoint |
| `HARMONY_MB_TARGET_URL` | No | `https://musicbrainz.org` | MusicBrainz target instance |
| `HARMONY_DATA_DIR` | No | `./` | Data directory for cache |
| `FORWARD_PROTO` | No | - | Protocol for reverse proxy |
*Required only if using respective provider
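The defaults and the "required only if the provider is used" rule in the table can be expressed as a small loader. This is a hypothetical sketch, not Harmony's actual configuration code: the variable names and default values come from the table, while `HarmonyConfig`, `readCredentials` and `loadConfig` are illustrative names.

```typescript
// Hypothetical sketch: variable names and defaults come from the table above,
// but HarmonyConfig, readCredentials and loadConfig are illustrative names.
type Env = Record<string, string | undefined>;

interface ProviderCredentials {
  clientId: string;
  clientSecret: string;
}

interface HarmonyConfig {
  port: number;
  mbApiUrl: string;
  mbTargetUrl: string;
  dataDir: string;
  forwardProto: string | undefined;
  spotify: ProviderCredentials | undefined;
  tidal: ProviderCredentials | undefined;
}

// A provider counts as enabled only when both of its variables are set.
// Setting just one of them is reported as a configuration error.
function readCredentials(env: Env, prefix: string): ProviderCredentials | undefined {
  const clientId = env[`${prefix}_CLIENT_ID`];
  const clientSecret = env[`${prefix}_CLIENT_SECRET`];
  if (!clientId && !clientSecret) return undefined;
  if (!clientId || !clientSecret) {
    throw new Error(`${prefix}_CLIENT_ID and ${prefix}_CLIENT_SECRET must be set together`);
  }
  return { clientId, clientSecret };
}

function loadConfig(env: Env): HarmonyConfig {
  const port = Number(env['PORT'] ?? '8000');
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env['PORT']}`);
  }
  return {
    port,
    mbApiUrl: env['HARMONY_MB_API_URL'] ?? 'https://musicbrainz.org/ws/2',
    mbTargetUrl: env['HARMONY_MB_TARGET_URL'] ?? 'https://musicbrainz.org',
    dataDir: env['HARMONY_DATA_DIR'] ?? './',
    forwardProto: env['FORWARD_PROTO'],
    spotify: readCredentials(env, 'HARMONY_SPOTIFY'),
    tidal: readCredentials(env, 'HARMONY_TIDAL'),
  };
}
```

In a Deno process the input would typically be `Deno.env.toObject()`. Taking the environment as a plain object keeps the loader testable without touching the real environment.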
**Version Identifier**:
The `DENO_DEPLOYMENT_ID` is auto-generated from git tags:
```bash
export DENO_DEPLOYMENT_ID=$(git describe --tags --always)
# Example: v1.2.3-5-g1a2b3c4
```
This identifier is used for:
- Cache invalidation on deployments
- Version display in UI
- Debugging and logging
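For display or debugging it can help to split the identifier into its parts. The sketch below parses the output formats of `git describe --tags --always`; the function and type names are hypothetical and not part of Harmony.

```typescript
// Hypothetical helper for splitting a `git describe --tags --always` string.
interface DescribedVersion {
  tag: string | undefined;    // nearest tag, e.g. 'v1.2.3'
  commitsAhead: number;       // commits since that tag
  commit: string | undefined; // abbreviated commit hash, if present
}

function parseGitDescribe(id: string): DescribedVersion {
  // '<tag>-<n>-g<hash>' is printed when HEAD is <n> commits past the tag.
  const match = /^(.+)-(\d+)-g([0-9a-f]+)$/.exec(id);
  if (match) {
    return { tag: match[1], commitsAhead: Number(match[2]), commit: match[3] };
  }
  // With --always and no reachable tag, the output is only an abbreviated hash.
  // Caveat: a tag that itself looks like a hex hash would be misclassified here.
  if (/^[0-9a-f]{7,40}$/.test(id)) {
    return { tag: undefined, commitsAhead: 0, commit: id };
  }
  // Otherwise HEAD sits exactly on a tag.
  return { tag: id, commitsAhead: 0, commit: undefined };
}

const example = parseGitDescribe('v1.2.3-5-g1a2b3c4');
console.log(example.tag, example.commitsAhead, example.commit);
// prints: v1.2.3 5 1a2b3c4
```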
### Reverse Proxy Configuration
#### Nginx
```nginx
server {
    listen 80;
    server_name harmony.example.com;
    # Redirect HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}
server {
    listen 443 ssl http2;
    server_name harmony.example.com;
    # SSL configuration
    ssl_certificate /etc/letsencrypt/live/harmony.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/harmony.example.com/privkey.pem;
    # Proxy to Harmony
    location / {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
    }
    # Static assets: let browsers cache for one day.
    # proxy_cache_valid is omitted because it only takes effect
    # when a proxy_cache zone is configured.
    location /static/ {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        add_header Cache-Control "public, max-age=86400, immutable";
    }
}
```
#### Caddy
```caddy
harmony.example.com {
    reverse_proxy localhost:8000
    header /static/* {
        Cache-Control "public, max-age=86400, immutable"
    }
}
```
## CI/CD Pipeline
### GitHub Actions Workflow
**File**: `.github/workflows/deno.yml`
**Workflow Structure**:
```yaml
name: Deno CI/CD
on:
  push:
    branches: [main]
    tags: ['v*']
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Deno
        uses: denoland/setup-deno@v1
        with:
          deno-version: v1.x
      - name: Format check
        run: deno fmt --check
      - name: Lint
        run: deno lint
      - name: Type check
        run: deno check **/*.ts
      - name: Run tests
        run: deno test -A
  deploy:
    needs: test
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/v')
    steps:
      - uses: actions/checkout@v3
      - name: Deploy to server
        env:
          DEPLOY_KEY: ${{ secrets.DEPLOY_KEY }}
          DEPLOY_HOST: ${{ secrets.DEPLOY_HOST }}
          DEPLOY_PORT: ${{ secrets.DEPLOY_PORT }}
          DEPLOY_USER: ${{ secrets.DEPLOY_USER }}
          DEPLOY_TARGET: ${{ secrets.DEPLOY_TARGET }}
          DEPLOY_SERVICE: ${{ secrets.DEPLOY_SERVICE }}
        run: |
          # Setup SSH
          mkdir -p ~/.ssh
          echo "$DEPLOY_KEY" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          # Rsync code to server
          rsync -avz --delete \
            --exclude '/deno.lock' \
            --exclude '/.env' \
            --exclude '/snaps.db' \
            --exclude '/snaps/' \
            -e "ssh -i ~/.ssh/deploy_key -p $DEPLOY_PORT" \
            ./ "$DEPLOY_USER@$DEPLOY_HOST:$DEPLOY_TARGET"
          # Restart service
          ssh -i ~/.ssh/deploy_key -p "$DEPLOY_PORT" \
            "$DEPLOY_USER@$DEPLOY_HOST" \
            "systemctl restart $DEPLOY_SERVICE"
```
### Deployment Secrets
Configure in GitHub repository settings:
| Secret | Example | Purpose |
|--------|---------|---------|
| `DEPLOY_KEY` | SSH private key | SSH authentication |
| `DEPLOY_HOST` | `harmony.example.com` | Target server hostname |
| `DEPLOY_PORT` | `22` | SSH port |
| `DEPLOY_USER` | `harmony` | SSH user |
| `DEPLOY_TARGET` | `/opt/harmony` | Deployment directory |
| `DEPLOY_SERVICE` | `harmony` | systemd service name |
### Deployment Trigger
**Automatic deployment** on:
- Tagged releases: `v*` (e.g., `v1.2.3`)
- Authorized users only (repository collaborators)
**Triggering a deployment manually** (create and push a version tag):
```bash
git tag v1.2.3
git push origin v1.2.3
```
### Deployment Exclusions
Files excluded from rsync:
- `/deno.lock`: Lock file (regenerated on server)
- `/.env`: Environment variables (server-specific)
- `/snaps.db`: Cache database (preserved on server)
- `/snaps/`: Cache files (preserved on server)
**Rationale**: Preserve cache and configuration across deployments.
### Deployment Verification
After deployment, verify:
1. **Service status**:
```bash
systemctl status harmony
```
2. **Logs**:
```bash
journalctl -u harmony -f
```
3. **Health check**:
```bash
curl https://harmony.example.com/
```
4. **Version**:
Check `DENO_DEPLOYMENT_ID` in logs or UI
## Development Deployment
### Local Development
**Start development server**:
```bash
deno task dev
```
**Features**:
- Auto-reload on file changes
- Watch directories: `static/`, `routes/`
- Hot module replacement for islands
- Development logging (DEBUG level)
**Environment**:
- `DENO_DEPLOYMENT_ID`: Not set (enables localStorage for MBID cache)
- `PORT`: Default `8000`
### Testing
**Run all tests**:
```bash
deno task ok
```
**Equivalent to**:
```bash
deno fmt && deno lint && deno check **/*.ts && deno test -A
```
**Run specific test file**:
```bash
deno test -A providers/spotify_test.ts
```
**Offline testing** (the default: tests replay cached responses):
```bash
deno test -A
```
**Download fresh test data**:
```bash
deno test -A -- --download
```
## Deno Deploy (Edge Platform)
Harmony is compatible with Deno Deploy for edge deployment.
### Deployment Steps
1. **Create Deno Deploy project**:
- Visit https://dash.deno.com/new
- Connect GitHub repository
- Select `server/main.ts` as entry point
2. **Configure environment variables**:
- Add all `HARMONY_*` variables
- `PORT` is configured automatically by Deno Deploy and does not need to be set
3. **Deploy**:
- Automatic deployment on git push
- Edge distribution across global regions
### Deno Deploy Benefits
- **Global edge network**: Low latency worldwide
- **Automatic HTTPS**: Free SSL certificates
- **Auto-scaling**: Handle traffic spikes
- **Zero configuration**: No server management
### Deno Deploy Limitations
- **No persistent storage**: `snap_storage` cache not supported
- **Stateless only**: Each request independent
- **No systemd**: Different service management
**Workaround**: Use external cache (Redis, Cloudflare KV) instead of `snap_storage`.
## Monitoring and Logging
### Logging System
**Logger Configuration**:
```typescript
// utils/logger.ts
import * as log from 'std/log/mod.ts';
await log.setup({
  handlers: {
    console: new log.handlers.ConsoleHandler('DEBUG', {
      formatter: (record) => {
        const level = record.levelName.padEnd(7);
        const logger = record.loggerName.padEnd(20);
        return `${level} ${logger} ${record.msg}`;
      },
      useColors: true
    })
  },
  loggers: {
    'harmony.lookup': { level: 'INFO', handlers: ['console'] },
    'harmony.mbid': { level: 'DEBUG', handlers: ['console'] },
    'harmony.provider': { level: 'INFO', handlers: ['console'] },
    'harmony.server': { level: 'INFO', handlers: ['console'] },
    'requests': { level: 'INFO', handlers: ['console'] }
  }
});
```
**Log Levels**:
| Logger | Level | Purpose |
|--------|-------|---------|
| `harmony.lookup` | INFO | Release lookup operations |
| `harmony.mbid` | DEBUG | MusicBrainz ID resolution |
| `harmony.provider` | INFO | Provider interactions |
| `harmony.server` | INFO | Server lifecycle events |
| `requests` | INFO | HTTP request logging |
**Example Logs**:
```
INFO harmony.server Server listening on http://localhost:8000
INFO harmony.lookup Looking up GTIN 0602537347377 in regions: GB,US,DE,JP
INFO harmony.provider Spotify: Fetching album 3DiDSNVBRYVzccLn2yqhMJ
DEBUG harmony.provider Spotify: Using cached response
INFO harmony.provider Deezer: Fetching album 123456
WARN harmony.provider iTunes: Rate limit exceeded, retrying after 60s
INFO harmony.lookup Merge complete: 3 providers, 1 conflict
DEBUG harmony.mbid Resolving MBIDs for 3 URLs
INFO requests GET /release?gtin=0602537347377 200 1234ms
```
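Because the request lines follow a fixed `METHOD path status duration` layout, they can be turned into rough latency figures without any metrics system. The sketch below is a hypothetical helper, not part of Harmony. It assumes the raw message text shown above (for example from `journalctl -o cat`), without the timestamp prefix journald normally adds.

```typescript
// Hypothetical parser for lines written by the 'requests' logger.
interface RequestLogEntry {
  level: string;
  method: string;
  path: string;
  status: number;
  durationMs: number;
}

function parseRequestLog(line: string): RequestLogEntry | undefined {
  // \s+ tolerates the padding that the log formatter adds between columns.
  const match = /^(\w+)\s+requests\s+(\S+)\s+(\S+)\s+(\d{3})\s+(\d+)ms\s*$/.exec(line);
  if (!match) return undefined;
  const [, level = '', method = '', path = '', status = '', duration = ''] = match;
  return { level, method, path, status: Number(status), durationMs: Number(duration) };
}

// Average duration over all request lines; other log lines are skipped.
function averageDurationMs(lines: string[]): number | undefined {
  let total = 0;
  let count = 0;
  for (const line of lines) {
    const entry = parseRequestLog(line);
    if (entry) {
      total += entry.durationMs;
      count++;
    }
  }
  return count === 0 ? undefined : total / count;
}
```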
### systemd Journal
**View logs**:
```bash
# Follow logs
journalctl -u harmony -f
# Last 100 lines
journalctl -u harmony -n 100
# Logs since yesterday
journalctl -u harmony --since yesterday
# Logs with priority ERROR or higher
journalctl -u harmony -p err
```
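**Log rotation**: Handled automatically by journald. By default the persistent journal is capped at 10% of the file system, up to 4 GB (`SystemMaxUse`). Time-based retention applies only if `MaxRetentionSec` is set in `journald.conf`.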
### Request Logging Middleware
**File**: `server/middleware/request_logger.ts`
```typescript
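import * as log from 'std/log/mod.ts';
import type { HandlerContext } from '$fresh/server.ts';

// Must be async because it awaits the rest of the handler chain.
export async function requestLogger(req: Request, ctx: HandlerContext): Promise<Response> {
  const start = Date.now();
  const logger = log.getLogger('requests');
  const response = await ctx.next();
  const duration = Date.now() - start;
  const message =
    `${req.method} ${new URL(req.url).pathname} ${response.status} ${duration}ms`;
  // Client and server errors are logged at WARN, everything else at INFO.
  if (response.status >= 400) logger.warn(message);
  else logger.info(message);
  return response;
}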
```
### No Metrics or Monitoring
Harmony does **not include**:
- **Prometheus metrics**: No `/metrics` endpoint
- **Health checks**: No `/health` endpoint
- **APM integration**: No New Relic, Datadog, etc.
- **Error tracking**: No Sentry integration
- **Performance monitoring**: No tracing
**Workaround**: Add custom middleware for metrics collection.
**Example Health Check** (custom):
```typescript
// routes/health.ts
export const handler = {
  GET: () => {
    return new Response(JSON.stringify({
      status: 'ok',
      version: Deno.env.get('DENO_DEPLOYMENT_ID'),
      timestamp: Date.now()
    }), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
};
```
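The metrics workaround could look roughly like the collector below: it counts requests per status class and renders them in the Prometheus text exposition format. This is a minimal sketch under stated assumptions. The class name and the metric names `harmony_requests_total` and `harmony_request_duration_ms_total` are hypothetical; Harmony ships nothing like this.

```typescript
// Hypothetical in-memory collector; a middleware would call record() once per
// response, and a custom /metrics route would return render().
class RequestMetrics {
  private counts: Record<string, number> = {};
  private totalDurationMs = 0;

  record(status: number, durationMs: number): void {
    // Group by status class (2xx, 4xx, ...) to keep label cardinality low.
    const statusClass = `${Math.floor(status / 100)}xx`;
    this.counts[statusClass] = (this.counts[statusClass] ?? 0) + 1;
    this.totalDurationMs += durationMs;
  }

  render(): string {
    const lines = ['# TYPE harmony_requests_total counter'];
    for (const statusClass of Object.keys(this.counts).sort()) {
      lines.push(`harmony_requests_total{status="${statusClass}"} ${this.counts[statusClass]}`);
    }
    lines.push('# TYPE harmony_request_duration_ms_total counter');
    lines.push(`harmony_request_duration_ms_total ${this.totalDurationMs}`);
    return lines.join('\n') + '\n';
  }
}

const metrics = new RequestMetrics();
metrics.record(200, 10);
metrics.record(404, 5);
console.log(metrics.render());
```

The counters live in process memory, so they reset on every restart. That is the usual behaviour for Prometheus counters, but it means they only fit the systemd deployment, not a stateless platform.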
## Resource Requirements
### Minimum Requirements
- **CPU**: 1 core
- **RAM**: 512 MB
- **Disk**: 10 GB (for cache growth)
- **Network**: 10 Mbps
### Recommended Requirements
- **CPU**: 2 cores
- **RAM**: 2 GB
- **Disk**: 50 GB (for extensive cache)
- **Network**: 100 Mbps
### Resource Usage Estimates
**Idle**:
- CPU: <1%
- RAM: ~100 MB
**Under load** (10 req/sec):
- CPU: 10-20%
- RAM: ~200 MB
- Network: 1-5 Mbps
**Cache growth**:
- ~2-5 MB per day (100 lookups/day)
- ~730 MB - 1.8 GB per year
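The yearly figures follow directly from the daily estimate, and the same arithmetic gives a rough idea of when a disk fills up. A small sketch, assuming linear growth and 1 GB = 1000 MB (the function names are illustrative):

```typescript
const DAYS_PER_YEAR = 365;

// Linear projection of cache growth over one year, in MB.
function yearlyGrowthMb(mbPerDay: number): number {
  return mbPerDay * DAYS_PER_YEAR;
}

// Whole days until the disk is full at a constant growth rate.
function daysUntilFull(diskMb: number, usedMb: number, mbPerDay: number): number {
  if (mbPerDay <= 0) return Infinity;
  return Math.max(0, Math.floor((diskMb - usedMb) / mbPerDay));
}

console.log(yearlyGrowthMb(2)); // 730  (lower bound, 2 MB/day)
console.log(yearlyGrowthMb(5)); // 1825 (upper bound, about 1.8 GB)
console.log(daysUntilFull(10000, 0, 5)); // 2000 days for the 10 GB minimum disk
```

At 100 lookups per day, even the 10 GB minimum leaves several years of headroom, so disk size becomes a concern only at much higher lookup volumes.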
## Backup and Recovery
### Backup Strategy
**What to backup**:
1. **Cache database**: `/var/lib/harmony/snaps.db`
2. **Cache files**: `/var/lib/harmony/snaps/`
3. **Configuration**: `/opt/harmony/.env`
**What NOT to backup**:
- Application code (in git repository)
- Deno cache (regenerated automatically)
**Backup script**:
```bash
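#!/bin/bash
# /usr/local/bin/harmony-backup.sh
set -euo pipefail  # abort on the first failed step
BACKUP_DIR=/backup/harmony
DATE=$(date +%Y%m%d)
# Create backup directory
mkdir -p "$BACKUP_DIR/$DATE"
# Backup cache database (stop the service first if a consistent copy is required)
cp /var/lib/harmony/snaps.db "$BACKUP_DIR/$DATE/"
# Backup cache files (compressed)
tar -czf "$BACKUP_DIR/$DATE/snaps.tar.gz" /var/lib/harmony/snaps/
# Backup configuration
cp /opt/harmony/.env "$BACKUP_DIR/$DATE/"
# Delete dated backup directories older than 30 days.
# -mindepth/-maxdepth stop find from matching $BACKUP_DIR itself
# or descending into directories it has just removed.
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +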
```
**Cron schedule**:
```cron
0 2 * * * /usr/local/bin/harmony-backup.sh
```
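The backup script prunes by modification time. Because the directories are named `YYYYMMDD`, the same 30-day retention rule can also be evaluated from the names alone, which is handy for a dry run before deleting anything. The sketch below is a hypothetical helper that takes this name-based approach rather than the script's mtime-based one.

```typescript
// Parses a 'YYYYMMDD' directory name into a UTC date; other names are ignored.
function parseBackupDate(name: string): Date | undefined {
  const match = /^(\d{4})(\d{2})(\d{2})$/.exec(name);
  if (!match) return undefined;
  return new Date(Date.UTC(Number(match[1]), Number(match[2]) - 1, Number(match[3])));
}

// Returns the backup directories that are older than the retention window.
function expiredBackups(names: string[], today: Date, retentionDays = 30): string[] {
  const cutoffMs = today.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  return names.filter((name) => {
    const date = parseBackupDate(name);
    return date !== undefined && date.getTime() < cutoffMs;
  });
}

const today = new Date(Date.UTC(2024, 1, 15)); // 15 February 2024
console.log(expiredBackups(['20240101', '20240125', '20240201', 'notes'], today));
// prints: [ '20240101' ]
```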
### Recovery
**Restore from backup**:
```bash
# Stop service
systemctl stop harmony
# Restore cache database
cp /backup/harmony/20240101/snaps.db /var/lib/harmony/
# Restore cache files
tar -xzf /backup/harmony/20240101/snaps.tar.gz -C /
# Restore configuration
cp /backup/harmony/20240101/.env /opt/harmony/
# Fix permissions
chown -R harmony:harmony /var/lib/harmony
# Start service
systemctl start harmony
```
## Security Considerations
### systemd Hardening
**Security options** in `harmony.service`:
```ini
[Service]
# Prevent privilege escalation
NoNewPrivileges=true
# Private /tmp
PrivateTmp=true
# Read-only system directories
ProtectSystem=strict
# No access to /home
ProtectHome=true
# Read-write access only to data directory
ReadWritePaths=/var/lib/harmony
```
### OAuth2 Credentials
**Storage**:
- Store in `.env` file (not in git)
- Restrict file permissions: `chmod 600 .env`
- Use environment variables in production
**Rotation**:
- Rotate credentials periodically
- Update `.env` and restart service
### HTTPS
**Always use HTTPS** in production:
- Reverse proxy (Nginx, Caddy) handles SSL
- Free certificates via Let's Encrypt
- Set `FORWARD_PROTO=https` environment variable
### Rate Limiting
**No built-in rate limiting** on server:
- Implement in reverse proxy (Nginx `limit_req`)
- Or use Cloudflare rate limiting
**Example Nginx rate limiting**:
```nginx
http {
    limit_req_zone $binary_remote_addr zone=harmony:10m rate=10r/s;
    server {
        location / {
            limit_req zone=harmony burst=20 nodelay;
            proxy_pass http://localhost:8000;
        }
    }
}
```
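To build intuition for what `rate=10r/s` with `burst=20` means, the sketch below implements a simple token bucket. It is an approximation for illustration only: Nginx's `limit_req` uses a leaky-bucket algorithm whose accounting differs in detail, and the class here is hypothetical, not part of Harmony or Nginx.

```typescript
// Simplified token bucket: tokens refill at ratePerSec up to capacity,
// and every accepted request consumes one token.
class TokenBucket {
  private readonly ratePerSec: number;
  private readonly capacity: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(ratePerSec: number, capacity: number, nowMs: number) {
    this.ratePerSec = ratePerSec;
    this.capacity = capacity;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // The clock is passed in explicitly so the behaviour is deterministic.
  tryAcquire(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const bucket = new TokenBucket(10, 20, 0);
let accepted = 0;
for (let i = 0; i < 25; i++) {
  if (bucket.tryAcquire(0)) accepted++;
}
console.log(accepted); // 20: the burst is served, the remaining 5 are rejected
console.log(bucket.tryAcquire(100)); // true: 100 ms at 10 r/s refills one token
```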
## Troubleshooting
### Common Issues
#### Service won't start
**Check logs**:
```bash
journalctl -u harmony -n 50
```
**Common causes**:
- Missing environment variables
- Port already in use
- Permission issues on data directory
#### High memory usage
**Cause**: Large cache or memory leak
**Solution**:
```bash
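# Stop the service before touching the cache
systemctl stop harmony
# Clear cache (this deletes all stored snapshots)
rm -rf /var/lib/harmony/snaps.db /var/lib/harmony/snaps/
# Recreate the cache directory with the right owner
install -d -o harmony -g harmony /var/lib/harmony/snaps
# Start service
systemctl start harmony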
```
#### Provider errors
**Check provider status**:
- Spotify: https://developer.spotify.com/status
- Tidal: Check API version (v1 deprecated)
- MusicBrainz: https://musicbrainz.org/doc/MusicBrainz_Server/Status
**Verify credentials**:
```bash
# Test Spotify OAuth2 (curl -u builds the Basic auth header itself;
# piping long credentials through base64 can insert line breaks)
curl -X POST https://accounts.spotify.com/api/token \
  -u 'client_id:client_secret' \
  -d "grant_type=client_credentials"
```
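The `curl` check sends the client ID and secret as HTTP Basic credentials. The same request can be assembled in TypeScript as sketched below; the helper names are hypothetical, and the sketch only constructs the request without sending it. `btoa` handles Latin-1 input only, which is assumed to be sufficient for ASCII client IDs and secrets.

```typescript
// Builds the Authorization header value for HTTP Basic authentication.
function basicAuthHeader(clientId: string, clientSecret: string): string {
  return `Basic ${btoa(`${clientId}:${clientSecret}`)}`;
}

// Describes the client-credentials token request without sending it.
function buildTokenRequest(clientId: string, clientSecret: string) {
  return {
    url: 'https://accounts.spotify.com/api/token',
    method: 'POST',
    headers: {
      'Authorization': basicAuthHeader(clientId, clientSecret),
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    body: 'grant_type=client_credentials',
  };
}

console.log(basicAuthHeader('user', 'pass')); // Basic dXNlcjpwYXNz
```

The resulting object can be passed to `fetch(request.url, request)` to perform the actual check.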
## Summary
Harmony's deployment model demonstrates:
1. **Simplicity**: No Docker, no Kubernetes, direct Deno execution
2. **systemd integration**: Standard Linux service management
3. **CI/CD automation**: GitHub Actions with SSH deployment
4. **Deno Deploy compatibility**: Edge deployment option
5. **Comprehensive logging**: 5 specialized loggers with color formatting
6. **Security hardening**: systemd security options
7. **Backup strategy**: Cache and configuration backup
8. **No monitoring**: No built-in metrics or health checks (requires custom implementation)
This approach suits small to medium-scale installations that need minimal operational overhead.