# MusicBrainz Server Deployment

## Docker Architecture

### Build System

**Template Engine:** M4 macros
**Base Image:** Ubuntu Noble (24.04 LTS)
**Dockerfile Location:** `docker/Dockerfile.template`

**Template Processing:**
```bash
# Generate Dockerfile from template
m4 docker/Dockerfile.template > docker/Dockerfile
```

**M4 Macros:**
- `INSTALL_PERL_DEPENDENCIES` - Install Perl modules via carton
- `INSTALL_NODE_DEPENDENCIES` - Install Node.js packages via yarn
- `COMPILE_RESOURCES` - Compile static assets
- `SETUP_DATABASE` - Initialize PostgreSQL schema

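
To make the macro mechanism concrete, here is a minimal sketch of how one macro could be defined and expanded. The macro body and the `render_dockerfile` helper are assumptions for illustration, not the contents of the real `docker/Dockerfile.template`. The `-P` flag prefixes m4 builtins with `m4_`, so ordinary Dockerfile words such as `include` or `index` are not expanded by accident.

```bash
# Hypothetical helper: expand an inline template with m4.
# The macro body below is an illustrative guess, not the project's real macro.
render_dockerfile() {
    m4 -P <<'EOF'
m4_define(`INSTALL_PERL_DEPENDENCIES', `RUN carton install --deployment')m4_dnl
FROM ubuntu:noble
INSTALL_PERL_DEPENDENCIES
EOF
}
```

Calling `render_dockerfile` prints the `FROM` line followed by the expanded `RUN` instruction; the definition line itself produces no output because `m4_dnl` discards the rest of that line.
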
**Multi-Stage Build:**
1. Base stage - Install system dependencies
2. Build stage - Compile assets and dependencies
3. Runtime stage - Copy artifacts, minimal runtime

### Container Types

**website:**
- Main web application
- Serves HTML pages via Template Toolkit
- Handles user authentication and sessions
- Port: 5000

**webservice:**
- API endpoints (/ws/2/)
- JSON/XML serialization
- OAuth authentication
- Port: 5001

**tests:**
- Run test suites
- Perl unit tests
- JavaScript tests
- pgTAP database tests
- No exposed ports (ephemeral)

**cron:**
- Scheduled tasks
- Statistics calculation
- Data cleanup
- Replication packet export
- No exposed ports

**sitemaps:**
- Generate XML sitemaps
- Update search engine indexes
- Run daily
- No exposed ports

**json-dump:**
- Export database to JSON
- Generate data dumps for download
- Run weekly
- No exposed ports

**solr-backup:**
- Backup Solr indexes
- Run daily
- No exposed ports

**template-renderer:**
- Isolated Template Toolkit renderer
- Forked from main process
- Prevents template errors from crashing main app
- IPC via Unix socket

### Docker Compose

**File:** `docker-compose.yml`

**Services:**
```yaml
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_USER: musicbrainz
      POSTGRES_PASSWORD: musicbrainz
      POSTGRES_DB: musicbrainz_db
    ports:
      - "5432:5432"

  redis:
    image: redis:7
    volumes:
      - redisdata:/data
    ports:
      - "6379:6379"

  solr:
    image: solr:8.11
    volumes:
      - solrdata:/var/solr
    ports:
      - "8983:8983"

  website:
    build:
      context: .
      dockerfile: docker/Dockerfile
      target: website
    depends_on:
      - db
      - redis
      - solr
    ports:
      - "5000:5000"
    environment:
      MUSICBRAINZ_SERVER_PROCESSES: 10
      MUSICBRAINZ_USE_PROXY: 1

  webservice:
    build:
      context: .
      dockerfile: docker/Dockerfile
      target: webservice
    depends_on:
      - db
      - redis
      - solr
    ports:
      - "5001:5001"

volumes:
  pgdata:
  redisdata:
  solrdata:
```

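
In its short list form, `depends_on` only controls start order; it does not wait for PostgreSQL, Redis, or Solr to accept connections. The sketch below shows a readiness wait that an entrypoint script could run before starting the application. It assumes Bash (for `/dev/tcp`), and `wait_for_port` is a hypothetical helper name, not something shipped with the project.

```bash
# Hypothetical helper: poll a TCP port until it accepts connections.
# Usage: wait_for_port HOST PORT [ATTEMPTS], sleeping one second between attempts.
wait_for_port() {
    local host="$1" port="$2" attempts="${3:-30}" i=0
    while [ "$i" -lt "$attempts" ]; do
        # Bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "timed out waiting for ${host}:${port}" >&2
    return 1
}

# Example (service names from docker-compose.yml):
# wait_for_port db 5432 && wait_for_port redis 6379 && wait_for_port solr 8983
```

An alternative that stays inside Compose is a `healthcheck` on each backing service combined with `depends_on` using `condition: service_healthy`.
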
### Image Layers

**Base Layer (Ubuntu Noble):**
- System packages (build-essential, libpq-dev, etc.)
- Perl 5.38
- Node.js 20
- PostgreSQL client libraries

**Dependency Layer:**
- Perl modules (via carton)
- Node.js packages (via yarn)
- Cached for faster rebuilds

**Application Layer:**
- Application code
- Compiled assets
- Configuration templates

**Runtime Layer:**
- Minimal runtime dependencies
- No build tools
- Smaller image size

## PSGI Server Configuration

### Starlet

**Server:** Starlet (high-performance PSGI server)
**Protocol:** HTTP/1.1
**Concurrency:** Pre-forking worker model

**Configuration:**
```bash
# Start Starlet with 10 workers. Starlet is a Plack handler and is launched
# through plackup; starman is a different PSGI server with different flags.
plackup -s Starlet \
    --max-workers 10 \
    --min-reqs-per-child 30 \
    --max-reqs-per-child 90 \
    --port 5000 \
    app.psgi
```

**Worker Settings:**
- **Workers:** 10 (configurable via `MUSICBRAINZ_SERVER_PROCESSES`)
- **Max Requests per Worker:** 30-90 (chosen at random per worker so that workers do not all restart at the same time)
- **Worker Timeout:** 300 seconds (5 minutes)
- **Keepalive:** Enabled (60 seconds)

**Worker Lifecycle:**
1. Master process forks 10 workers
2. Each worker handles requests until its request limit is reached
3. Worker exits gracefully
4. Master forks a new worker to replace it

Recycling workers this way keeps memory leaks from accumulating.

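
The randomized request limit can be illustrated in a few lines of shell. This is a sketch of the idea only, not Starlet's own code; `pick_request_limit` is a hypothetical name.

```bash
# Hypothetical helper: choose a per-worker request limit in [MIN, MAX].
# Each worker draws its own limit, so workers started together
# do not all exit and respawn at the same moment.
pick_request_limit() {
    local min="$1" max="$2"
    awk -v min="$min" -v max="$max" -v seed="${RANDOM:-0}$$" \
        'BEGIN { srand(seed); print min + int(rand() * (max - min + 1)) }'
}

# Example: a limit between 30 and 90 requests, as in the worker settings above.
# pick_request_limit 30 90
```
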
### Server::Starter (Zero-Downtime Restarts)

**Purpose:** Enable zero-downtime deployments

**Mechanism:**
1. Server::Starter binds to the port
2. It forks Starlet, which inherits the listening socket
3. On restart signal (HUP):
   - Server::Starter starts a new Starlet process
   - The new process inherits the same listening socket
   - The old process finishes its in-flight requests
   - The old process exits
   - No connections are dropped

**Command:**
```bash
start_server \
    --port 5000 \
    --pid-file /var/run/musicbrainz.pid \
    --status-file /var/run/musicbrainz.status \
    -- \
    plackup -s Starlet --max-workers 10 app.psgi
```

**Restart:**
```bash
# Send HUP signal to trigger graceful restart
kill -HUP $(cat /var/run/musicbrainz.pid)
```

**Status Check:**
```bash
# Check server status
cat /var/run/musicbrainz.status
# Output: one GENERATION:PID line per running server process, e.g. 1:1234
```

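
Server::Starter writes one `generation:pid` line per live server process to the status file, so during a restart the file briefly lists both the old and the new generation. A minimal sketch for picking out the newest generation's PID, for example to confirm that a restart has completed; `newest_worker_pid` is a hypothetical helper name.

```bash
# Hypothetical helper: print the PID of the highest generation
# listed in a Server::Starter status file (lines look like "3:4242").
newest_worker_pid() {
    local status_file="$1"
    sort -t: -k1,1n "$status_file" | tail -n 1 | cut -d: -f2
}

# Example:
# newest_worker_pid /var/run/musicbrainz.status
```
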
### Reverse Proxy

**Production Setup:** Nginx reverse proxy in front of Starlet

**Nginx Configuration:**
```nginx
upstream musicbrainz {
    server localhost:5000;
    keepalive 32;
}

server {
    listen 80;
    server_name musicbrainz.org;

    location / {
        proxy_pass http://musicbrainz;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }

    location /static/ {
        alias /var/www/musicbrainz/root/static/;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
}
```

**Benefits:**
- SSL termination
- Static file serving
- Gzip compression
- Request buffering
- Load balancing (multiple Starlet instances)

## CI/CD Pipeline

### GitHub Actions

**Workflow File:** `.github/workflows/test.yml`

**Triggers:**
- Push to main branch
- Pull requests
- Manual workflow dispatch

### Build Stage

**Job:** `build-tests-image`

**Steps:**
1. Checkout code
2. Set up Docker Buildx
3. Build test Docker image
4. Push to GitHub Container Registry
5. Cache layers for faster rebuilds

**Dockerfile:** `docker/Dockerfile.test`

**Caching:**
- Perl dependencies cached by cpanfile.snapshot hash
- Node dependencies cached by yarn.lock hash
- Docker layer caching via GitHub Actions cache

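
Keying a cache on a lockfile hash means the cache is reused until the lockfile changes. A minimal sketch of how such a key could be derived, assuming `sha256sum` from GNU coreutils; the `cache_key` helper and the key layout are illustrative, not taken from the workflow file.

```bash
# Hypothetical helper: build a cache key from a prefix and a lockfile hash.
cache_key() {
    local prefix="$1" lockfile="$2" digest
    # Refuse to produce a key for a missing or unreadable lockfile.
    [ -r "$lockfile" ] || return 1
    digest="$(sha256sum "$lockfile" | cut -d' ' -f1)"
    printf '%s-%s\n' "$prefix" "$digest"
}

# Example:
# cache_key perl-deps cpanfile.snapshot
# cache_key node-deps yarn.lock
```
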
### Test Stages

**Job:** `js-perl-and-pgtap`

**Matrix:**
- Perl 5.38.0 (stable)
- Perl 5.42.0 (latest)

**Steps:**
1. Pull test image from registry
2. Start PostgreSQL container
3. Start Redis container
4. Initialize test database
5. Run Perl tests (`prove -lr t/`)
6. Run JavaScript tests (`yarn test`)
7. Run pgTAP tests (`pg_prove -d musicbrainz_test t/pgtap/`)
8. Upload coverage reports

**Parallelization:** Tests run in parallel across matrix

### Selenium Tests

**Jobs:** `selenium-1`, `selenium-2`, `selenium-3`, `selenium-4`

**Partitioning:** Tests split into 4 partitions for parallel execution

**Steps:**
1. Pull test image
2. Start PostgreSQL, Redis, Solr
3. Start Selenium standalone Chrome
4. Initialize test database with sample data
5. Start MusicBrainz server
6. Run Selenium tests for partition
7. Upload screenshots on failure

**Partition Strategy:**
```bash
# Partition 1: Artist and release tests
# Partition 2: Recording and work tests
# Partition 3: Edit and relationship tests
# Partition 4: Search and browse tests
```

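
One way to implement a topic-based split like this is to map each test file to a partition by name. The sketch below assumes hypothetical file names such as `artist_edit.t`; the real test layout and the way the workflow selects tests may differ.

```bash
# Hypothetical helper: map a Selenium test file name to a partition (1-4).
partition_for_test() {
    case "$(basename "$1")" in
        artist*|release*)     echo 1 ;;
        recording*|work*)     echo 2 ;;
        edit*|relationship*)  echo 3 ;;
        search*|browse*)      echo 4 ;;
        *)                    echo 4 ;;  # unmatched tests fall into the last partition
    esac
}

# Example: run only the tests that belong to partition 2.
# for t in t/selenium/*.t; do
#     [ "$(partition_for_test "$t")" = 2 ] && echo "$t"
# done
```
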
**Selenium Configuration:**
```perl
# t/selenium.pl
use Selenium::Remote::Driver;

my $driver = Selenium::Remote::Driver->new(
    remote_server_addr => 'localhost',
    port               => 4444,
    browser_name       => 'chrome',
    extra_capabilities => {
        chromeOptions => {
            args => ['--headless', '--no-sandbox', '--disable-dev-shm-usage'],
        },
    },
);
```

### Second-Tier Tests

**Job:** `second-perl-and-pgtap`

**Purpose:** Test against Perl 5.42.0 (latest stable)

**Trigger:** After main tests pass

**Allowed to Fail:** Yes (informational only)

### Report Generation

**Job:** `generate-reports`

**Steps:**
1. Download coverage reports from all test jobs
2. Merge coverage data
3. Generate HTML coverage report
4. Upload to Codecov
5. Comment on PR with coverage summary

**Coverage Tools:**
- Perl: Devel::Cover
- JavaScript: Istanbul/nyc

## Build Process

### Step 1: Install Perl Dependencies

```bash
# Install Carton (Perl dependency manager)
cpanm --notest Carton

# Install dependencies from cpanfile.snapshot
carton install --deployment
```

**Dependencies Installed:**
- Catalyst framework
- Moose object system
- DBD::Pg database driver
- Template::Toolkit
- JSON::XS
- XML::LibXML
- Redis client
- ~200 total CPAN modules

**Installation Time:** ~10 minutes (first time), ~1 minute (cached)

### Step 2: Install Node.js Dependencies

```bash
# Install Yarn (if not present)
npm install -g yarn

# Install dependencies from yarn.lock
yarn install --frozen-lockfile
```

**Dependencies Installed:**
- React 19.2.4
- Redux
- Webpack 5
- Babel 7
- Jest (testing)
- ESLint (linting)
- ~500 total npm packages

**Installation Time:** ~5 minutes (first time), ~30 seconds (cached)

### Step 3: Compile Static Resources

```bash
# Compile CSS, images, fonts
./script/compile_resources.sh
```

**Tasks:**
- Compile LESS to CSS
- Optimize images (pngcrush, optipng)
- Copy fonts to static directory
- Generate CSS sprites
- Minify CSS

**Output:** `root/static/styles/`, `root/static/images/`

**Time:** ~2 minutes

### Step 4: Build JavaScript Bundles

```bash
# Build production bundles with Webpack
yarn run build

# Or for development (with source maps)
yarn run build:dev
```

**Webpack Configuration:**
- Entry points: `root/static/scripts/main.js`, `root/static/scripts/edit.js`
- Output: `root/static/build/`
- Loaders: Babel (JSX, ES6+), CSS, file-loader
- Plugins: UglifyJS, ExtractTextPlugin, DefinePlugin
- Code splitting: Vendor bundle, async chunks

**Output Files:**
- `main.bundle.js` - Main application code
- `vendor.bundle.js` - Third-party libraries
- `edit.bundle.js` - Edit interface code
- `*.chunk.js` - Async-loaded chunks

**Time:** ~3 minutes (production), ~30 seconds (development)

### Step 5: Initialize Database

```bash
# Create database
createdb musicbrainz_db

# Load schema
psql musicbrainz_db < admin/sql/CreateTables.sql

# Load initial data
./admin/InitDb.pl --createdb --import
```

**Schema Loading:**
- 375 tables created
- 500+ foreign keys added
- Indexes created
- Triggers installed

**Initial Data:**
- Countries and areas
- Languages
- Relationship types
- Instrument types
- Genre definitions

**Time:** ~10 minutes (schema), ~30 minutes (sample data)

### Step 6: Build Search Indexes

```bash
# Build Solr indexes for all entities
./admin/BuildSearchIndexes.pl --all
```

**Indexes Built:**
- Artist index
- Release index
- Recording index
- Work index
- Label index
- Area, event, place, series, instrument indexes

**Time:** ~2 hours (full production data), ~5 minutes (sample data)

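
The six steps can be chained in a wrapper that stops at the first failure and reports how long each step took. A minimal sketch; `run_step` is a hypothetical helper, and the commented commands are the ones listed in the steps above.

```bash
# Hypothetical helper: run one build step, report its duration,
# and return the step's exit status so callers can stop on failure.
run_step() {
    local name="$1" start status
    shift
    start="$(date +%s)"
    echo "==> ${name}"
    "$@"
    status=$?
    echo "<== ${name} finished in $(( $(date +%s) - start ))s (exit ${status})"
    return "$status"
}

# Example, stopping at the first failing step:
# run_step "Perl dependencies"  carton install --deployment &&
# run_step "Node dependencies"  yarn install --frozen-lockfile &&
# run_step "Static resources"   ./script/compile_resources.sh &&
# run_step "JavaScript bundles" yarn run build
```
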
## System Requirements

### Minimum Requirements (Development)

**CPU:** 2 cores
**RAM:** 4 GB
**Disk:** 20 GB
**Database:** PostgreSQL 16+
**Cache:** Redis 6.0+
**Search:** Solr 8.11+

### Recommended Requirements (Production)

**CPU:** 8+ cores
**RAM:** 16+ GB
**Disk:** 500+ GB SSD
- 350 GB for PostgreSQL database
- 50 GB for Solr indexes
- 50 GB for backups
- 50 GB for logs and temp files

**Database:** PostgreSQL 16+ with:
- shared_buffers = 4GB
- effective_cache_size = 12GB
- work_mem = 64MB
- maintenance_work_mem = 1GB

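
The `shared_buffers` and `effective_cache_size` values above match the common rule of thumb of roughly 25% and 75% of system RAM on a 16 GB server. A minimal sketch that derives them for other sizes; the ratios are a general PostgreSQL tuning heuristic rather than a MusicBrainz-specific recommendation, and `pg_memory_settings` is a hypothetical helper.

```bash
# Hypothetical helper: derive memory settings from total RAM in whole GB.
# Uses integer arithmetic, so it is only meaningful for RAM sizes of 4 GB or more.
pg_memory_settings() {
    local ram_gb="$1"
    echo "shared_buffers = $(( ram_gb / 4 ))GB"
    echo "effective_cache_size = $(( ram_gb * 3 / 4 ))GB"
}

# Example:
# pg_memory_settings 16
```
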
**Cache:** Redis 6.0+ with:
- maxmemory = 2GB
- maxmemory-policy = allkeys-lru

**Search:** Solr 8.11+ with:
- Java heap = 4GB
- Solr cache = 512MB per core

### Network Requirements

**Bandwidth:** 100 Mbps+ (for replication and API traffic)

**Ports:**
- 5000 - Website
- 5001 - Web service API
- 5432 - PostgreSQL
- 6379 - Redis
- 8983 - Solr

**Firewall:**
- Allow inbound 80/443 (HTTP/HTTPS)
- Allow outbound 80/443 (external APIs)
- Restrict 5432, 6379, 8983 to localhost

### Software Requirements

**Operating System:**
- Ubuntu 24.04 LTS (Noble) - recommended
- Debian 12 (Bookworm)
- Any Linux with Perl 5.38+ and Node.js 20+

**Perl:** 5.38.0 or later (5.42.0 tested)

**Node.js:** 20.9.0 or later

**PostgreSQL:** 16.0 or later (16.3 recommended)

**Redis:** 6.0 or later (7.0 recommended)

**Solr:** 8.11 or later

**Optional:**
- Docker 24.0+
- Docker Compose 2.0+
- Nginx 1.24+ (reverse proxy)
- RabbitMQ 3.12+ (background jobs)

## Deployment Strategies

### Single Server

**Use Case:** Development, small mirrors

**Architecture:**
- All services on one server
- PostgreSQL, Redis, Solr, MusicBrainz on localhost
- Nginx reverse proxy

**Pros:**
- Simple setup
- Low cost
- Easy to manage

**Cons:**
- Single point of failure
- Limited scalability
- Resource contention

### Multi-Server

**Use Case:** Production, high-traffic mirrors

**Architecture:**
- Web tier: 2+ servers running MusicBrainz (load balanced)
- Database tier: PostgreSQL primary + replicas
- Cache tier: Redis (possibly clustered)
- Search tier: Solr (possibly sharded)

**Pros:**
- High availability
- Horizontal scalability
- Better performance

**Cons:**
- Complex setup
- Higher cost
- Requires load balancer

### Docker Swarm / Kubernetes

**Use Case:** Large-scale deployments, cloud environments

**Architecture:**
- Container orchestration
- Auto-scaling
- Service discovery
- Health checks

**Pros:**
- Automated deployment
- Self-healing
- Easy scaling

**Cons:**
- Steep learning curve
- Operational complexity
- Overhead

## Monitoring and Logging

### Logging

**Framework:** Log::Dispatch

**Log Levels:**
- DEBUG - Verbose debugging
- INFO - Informational messages
- WARN - Warnings
- ERROR - Errors
- FATAL - Fatal errors

**Log Destinations:**
- STDOUT (development)
- File (production): `/var/log/musicbrainz/server.log`
- Syslog (optional)

**Log Rotation:**
- Daily rotation
- Keep 30 days
- Compress old logs

### Error Tracking

**Platform:** Sentry

**Integration:**
- Server-side: Perl Sentry SDK
- Client-side: JavaScript Sentry SDK

**Captured:**
- Exceptions
- Error messages
- Stack traces
- Request context
- User context

### Metrics

**Current State:** No Prometheus/metrics endpoint

**Workaround:** Parse logs for metrics

**Future:** Prometheus exporter planned

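
As an illustration of the log-parsing workaround, the sketch below counts log lines per level. It assumes each line contains the level in square brackets, such as `[ERROR]`; the actual Log::Dispatch output depends on the configured formatter, so the pattern may need adjusting. `count_log_levels` is a hypothetical helper.

```bash
# Hypothetical helper: count log lines per level.
# Assumes levels appear as [DEBUG], [INFO], [WARN], [ERROR] or [FATAL].
# Reads the files given as arguments, or standard input if there are none.
count_log_levels() {
    awk '
        match($0, /\[(DEBUG|INFO|WARN|ERROR|FATAL)\]/) {
            level = substr($0, RSTART + 1, RLENGTH - 2)
            counts[level]++
        }
        END {
            for (level in counts) print level, counts[level]
        }
    ' "$@" | sort
}

# Example:
# count_log_levels /var/log/musicbrainz/server.log
```
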
### Health Checks

**Current State:** No dedicated health check endpoint

**Workaround:** Check `/` returns 200

**Future:** `/health` endpoint planned
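
The workaround can be scripted with `curl`. A minimal sketch, assuming `curl` is installed and the website listens on port 5000 as described earlier; `check_health` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed only if URL answers with HTTP 200.
check_health() {
    local url="$1" code
    # --write-out prints the status code; curl reports 000 if no response arrived.
    code="$(curl --silent --output /dev/null --max-time 5 \
        --write-out '%{http_code}' "$url")"
    [ "$code" = "200" ]
}

# Example:
# check_health http://localhost:5000/ || echo "website is down" >&2
```

Because `/` renders a full page, this probe is heavier than a dedicated endpoint would be, so a long polling interval is advisable.
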