MusicBrainz Server Deployment
Docker Architecture
Build System
Template Engine: M4 macros
Base Image: Ubuntu Noble (24.04 LTS)
Dockerfile Location: docker/Dockerfile.template
Template Processing:
# Generate Dockerfile from template
m4 docker/Dockerfile.template > docker/Dockerfile
M4 Macros:
- INSTALL_PERL_DEPENDENCIES - Install Perl modules via carton
- INSTALL_NODE_DEPENDENCIES - Install Node.js packages via yarn
- COMPILE_RESOURCES - Compile static assets
- SETUP_DATABASE - Initialize PostgreSQL schema
Multi-Stage Build:
- Base stage - Install system dependencies
- Build stage - Compile assets and dependencies
- Runtime stage - Copy artifacts, minimal runtime
Container Types
website:
- Main web application
- Serves HTML pages via Template Toolkit
- Handles user authentication and sessions
- Port: 5000
webservice:
- API endpoints (/ws/2/)
- JSON/XML serialization
- OAuth authentication
- Port: 5001
tests:
- Run test suites
- Perl unit tests
- JavaScript tests
- pgTAP database tests
- No exposed ports (ephemeral)
cron:
- Scheduled tasks
- Statistics calculation
- Data cleanup
- Replication packet export
- No exposed ports
sitemaps:
- Generate XML sitemaps
- Update search engine indexes
- Run daily
- No exposed ports
json-dump:
- Export database to JSON
- Generate data dumps for download
- Run weekly
- No exposed ports
solr-backup:
- Backup Solr indexes
- Run daily
- No exposed ports
template-renderer:
- Isolated Template Toolkit renderer
- Forked from main process
- Prevents template errors from crashing main app
- IPC via Unix socket
Docker Compose
File: docker-compose.yml
Services:
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_USER: musicbrainz
      POSTGRES_PASSWORD: musicbrainz
      POSTGRES_DB: musicbrainz_db
    ports:
      - "5432:5432"
  redis:
    image: redis:7
    volumes:
      - redisdata:/data
    ports:
      - "6379:6379"
  solr:
    image: solr:8.11
    volumes:
      - solrdata:/var/solr
    ports:
      - "8983:8983"
  website:
    build:
      context: .
      dockerfile: docker/Dockerfile
      target: website
    depends_on:
      - db
      - redis
      - solr
    ports:
      - "5000:5000"
    environment:
      MUSICBRAINZ_SERVER_PROCESSES: 10
      MUSICBRAINZ_USE_PROXY: 1
  webservice:
    build:
      context: .
      dockerfile: docker/Dockerfile
      target: webservice
    depends_on:
      - db
      - redis
      - solr
    ports:
      - "5001:5001"
volumes:
  pgdata:
  redisdata:
  solrdata:
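The list form of depends_on used above only controls start order; it does not wait until PostgreSQL, Redis, or Solr accept connections. A minimal Python sketch of a readiness check that a startup script could run first. `wait_for_port` is a hypothetical helper, not part of MusicBrainz Server, and the ports are the ones published in the compose file.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Ports as published in the compose file (assumed reachable on localhost).
DEPENDENCIES = {"db": 5432, "redis": 6379, "solr": 8983}
```

Calling `wait_for_port("localhost", port)` for each entry in `DEPENDENCIES` before starting the website or webservice container's main process would cover the gap. Compose health checks are another way to get the same effect.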
Image Layers
Base Layer (Ubuntu Noble):
- System packages (build-essential, libpq-dev, etc.)
- Perl 5.38
- Node.js 20
- PostgreSQL client libraries
Dependency Layer:
- Perl modules (via carton)
- Node.js packages (via yarn)
- Cached for faster rebuilds
Application Layer:
- Application code
- Compiled assets
- Configuration templates
Runtime Layer:
- Minimal runtime dependencies
- No build tools
- Smaller image size
PSGI Server Configuration
Starlet
Server: Starlet (high-performance PSGI server)
Protocol: HTTP/1.1
Concurrency: Pre-forking worker model
Configuration:
# Start Starlet with 10 workers (Starlet is launched through plackup)
plackup -s Starlet \
  --max-workers=10 \
  --min-reqs-per-child=30 \
  --max-reqs-per-child=90 \
  --port=5000 \
  app.psgi
Worker Settings:
- Workers: 10 (configurable via MUSICBRAINZ_SERVER_PROCESSES)
- Max Requests per Worker: 30-90 (randomized so that workers do not all restart at the same moment)
- Worker Timeout: 300 seconds (5 minutes)
- Keepalive: Enabled (60 seconds)
Worker Lifecycle:
- Master process forks 10 workers
- Each worker handles requests until max_requests reached
- Worker exits gracefully
- Master forks new worker to replace it
- Prevents memory leaks from accumulating
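The recycling loop above can be illustrated with a small simulation. This is a sketch in Python, not Starlet's implementation: requests are handed out round-robin (a simplification, since real workers compete to accept connections), and `pick_request_limit` and `simulate` are hypothetical names. The 30-90 range is the one listed under Worker Settings.

```python
import random


def pick_request_limit(rng: random.Random, low: int = 30, high: int = 90) -> int:
    """Choose how many requests a worker serves before it exits."""
    return rng.randint(low, high)


def simulate(total_requests: int, workers: int = 10, seed: int = 0) -> int:
    """Hand out requests round-robin and count how many workers get replaced."""
    rng = random.Random(seed)
    remaining = [pick_request_limit(rng) for _ in range(workers)]
    replaced = 0
    for i in range(total_requests):
        slot = i % workers
        remaining[slot] -= 1
        if remaining[slot] == 0:
            # The worker exits gracefully; the master forks a fresh one
            # with a newly drawn request limit.
            remaining[slot] = pick_request_limit(rng)
            replaced += 1
    return replaced
```

Because each worker draws its own limit, replacements are spread over time instead of all ten workers exiting after the same request count.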
Server::Starter (Zero-Downtime Restarts)
Purpose: Enable zero-downtime deployments
Mechanism:
- Server::Starter binds to the port
- Forks Starlet, which inherits the listening socket
- On restart signal (HUP):
  - Starts a new Starlet process
  - The new process inherits the same listening socket (it does not bind again)
  - The old process finishes its in-flight requests
  - The old process exits
- No dropped connections
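Server::Starter tells the spawned server which descriptors to use through the `SERVER_STARTER_PORT` environment variable, a semicolon-separated list of `port=fd` (or `host:port=fd`) pairs. A minimal Python sketch of how a server process could pick up those sockets. The function names are hypothetical, and in the actual stack this is handled on the Perl side.

```python
import os
import socket


def parse_server_starter_port(value: str) -> dict[str, int]:
    """Map each 'port', 'host:port' or socket path to its inherited descriptor."""
    ports = {}
    for pair in value.split(";"):
        if not pair:
            continue
        name, _, fd = pair.rpartition("=")
        ports[name] = int(fd)
    return ports


def inherited_listeners() -> list[socket.socket]:
    """Wrap the inherited descriptors as sockets instead of binding new ones."""
    value = os.environ.get("SERVER_STARTER_PORT", "")
    return [socket.socket(fileno=fd) for fd in parse_server_starter_port(value).values()]
```

Since the listening socket stays open in the Server::Starter parent for the whole restart, connections that arrive while generations overlap wait in the kernel's accept queue rather than being refused.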
Command:
start_server \
  --port 5000 \
  --pid-file /var/run/musicbrainz.pid \
  --status-file /var/run/musicbrainz.status \
  -- \
  plackup -s Starlet --max-workers=10 app.psgi
Restart:
# Send HUP signal to trigger graceful restart
kill -HUP $(cat /var/run/musicbrainz.pid)
Status Check:
# Check server status
cat /var/run/musicbrainz.status
# Output: 1:1234 (GENERATION:PID, one line per running generation)
Reverse Proxy
Production Setup: Nginx reverse proxy in front of Starlet
Nginx Configuration:
upstream musicbrainz {
    server localhost:5000;
    keepalive 32;
}
server {
    listen 80;
    server_name musicbrainz.org;
    location / {
        proxy_pass http://musicbrainz;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
    location /static/ {
        alias /var/www/musicbrainz/root/static/;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
}
Benefits:
- SSL termination
- Static file serving
- Gzip compression
- Request buffering
- Load balancing (multiple Starlet instances)
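With `$proxy_add_x_forwarded_for`, Nginx appends the address of the connecting peer to any X-Forwarded-For header the client already sent, so only the rightmost entries are trustworthy. A minimal Python sketch of extracting the client address under that assumption. `client_ip` is a hypothetical helper; how MusicBrainz Server itself consumes the header when MUSICBRAINZ_USE_PROXY is set is not shown in this document.

```python
from typing import Optional


def client_ip(x_forwarded_for: str, trusted_proxies: int = 1) -> Optional[str]:
    """Return the address recorded by the outermost trusted proxy.

    Entries further to the left were supplied by the client and can be forged.
    """
    hops = [hop.strip() for hop in x_forwarded_for.split(",") if hop.strip()]
    if trusted_proxies < 1 or len(hops) < trusted_proxies:
        return None
    return hops[-trusted_proxies]
```

With a single Nginx in front of the application, `trusted_proxies=1` selects the address Nginx itself saw; an additional load balancer in front of Nginx would call for `trusted_proxies=2`.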
CI/CD Pipeline
GitHub Actions
Workflow File: .github/workflows/test.yml
Triggers:
- Push to main branch
- Pull requests
- Manual workflow dispatch
Build Stage
Job: build-tests-image
Steps:
- Checkout code
- Set up Docker Buildx
- Build test Docker image
- Push to GitHub Container Registry
- Cache layers for faster rebuilds
Dockerfile: docker/Dockerfile.test
Caching:
- Perl dependencies cached by cpanfile.snapshot hash
- Node dependencies cached by yarn.lock hash
- Docker layer caching via GitHub Actions cache
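Keying a cache by a lockfile hash means the cache is reused until the lockfile's contents change. A minimal Python sketch of that idea. `cache_key` and the key prefixes are hypothetical; the workflow itself would compute the key with its own tooling.

```python
import hashlib
from pathlib import Path


def cache_key(prefix: str, lockfile: Path) -> str:
    """Build a cache key that changes only when the lockfile's contents change."""
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    return f"{prefix}-{digest[:16]}"
```

For example, `cache_key("perl", Path("cpanfile.snapshot"))` and `cache_key("node", Path("yarn.lock"))` would give independent keys, so a JavaScript dependency bump does not invalidate the Perl cache.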
Test Stages
Job: js-perl-and-pgtap
Matrix:
- Perl 5.38.0 (stable)
- Perl 5.42.0 (latest)
Steps:
- Pull test image from registry
- Start PostgreSQL container
- Start Redis container
- Initialize test database
- Run Perl tests (prove -lr t/)
- Run JavaScript tests (yarn test)
- Run pgTAP tests (pg_prove -d musicbrainz_test t/pgtap/)
- Upload coverage reports
Parallelization: Tests run in parallel across matrix
Selenium Tests
Jobs: selenium-1, selenium-2, selenium-3, selenium-4
Partitioning: Tests split into 4 partitions for parallel execution
Steps:
- Pull test image
- Start PostgreSQL, Redis, Solr
- Start Selenium standalone Chrome
- Initialize test database with sample data
- Start MusicBrainz server
- Run Selenium tests for partition
- Upload screenshots on failure
Partition Strategy:
# Partition 1: Artist and release tests
# Partition 2: Recording and work tests
# Partition 3: Edit and relationship tests
# Partition 4: Search and browse tests
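The split above is by topic. As a point of comparison, a hash-based split needs no manual assignment when tests are added. The Python sketch below shows that alternative technique, not the project's actual partitioning; `partition_of` and `select` are hypothetical names.

```python
import zlib


def partition_of(test_name: str, partitions: int = 4) -> int:
    """Return a stable 1-based partition number for a test name."""
    # crc32 is deterministic across runs, unlike Python's built-in hash().
    return zlib.crc32(test_name.encode("utf-8")) % partitions + 1


def select(tests: list[str], partition: int, partitions: int = 4) -> list[str]:
    """Keep only the tests that belong to the given partition."""
    return [t for t in tests if partition_of(t, partitions) == partition]
```

Each job would call `select(all_tests, n)` with its own partition number. Every test lands in exactly one partition, though the partitions are only roughly equal in size.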
Selenium Configuration:
# t/selenium.pl
use Selenium::Remote::Driver;
my $driver = Selenium::Remote::Driver->new(
    remote_server_addr => 'localhost',
    port => 4444,
    browser_name => 'chrome',
    extra_capabilities => {
        chromeOptions => {
            args => ['--headless', '--no-sandbox', '--disable-dev-shm-usage'],
        },
    },
);
Second-Tier Tests
Job: second-perl-and-pgtap
Purpose: Test against Perl 5.42.0 (latest stable)
Trigger: After main tests pass
Allowed to Fail: Yes (informational only)
Report Generation
Job: generate-reports
Steps:
- Download coverage reports from all test jobs
- Merge coverage data
- Generate HTML coverage report
- Upload to Codecov
- Comment on PR with coverage summary
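Merging coverage amounts to summing per-line hit counts from every test job. A minimal Python sketch of that step, assuming a simplified, hypothetical report shape (file path mapped to line number mapped to hit count). Devel::Cover and nyc use their own formats and ship their own merge tools.

```python
from collections import Counter


def merge_coverage(reports: list[dict[str, dict[int, int]]]) -> dict[str, dict[int, int]]:
    """Sum per-line hit counts across reports, keyed by file path."""
    merged: dict[str, Counter] = {}
    for report in reports:
        for path, hits in report.items():
            merged.setdefault(path, Counter()).update(hits)
    return {path: dict(counter) for path, counter in merged.items()}


def line_rate(coverage: dict[str, dict[int, int]]) -> float:
    """Fraction of instrumented lines that were executed at least once."""
    counts = [count for hits in coverage.values() for count in hits.values()]
    return sum(1 for c in counts if c > 0) / len(counts) if counts else 0.0
```

A line counts as covered in the merged report if any job executed it, which is why merging has to happen before the summary is computed.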
Coverage Tools:
- Perl: Devel::Cover
- JavaScript: Istanbul/nyc
Build Process
Step 1: Install Perl Dependencies
# Install Carton (Perl dependency manager)
cpanm --notest Carton
# Install dependencies from cpanfile.snapshot
carton install --deployment
Dependencies Installed:
- Catalyst framework
- Moose object system
- DBD::Pg database driver
- Template::Toolkit
- JSON::XS
- XML::LibXML
- Redis client
- ~200 total CPAN modules
Installation Time: ~10 minutes (first time), ~1 minute (cached)
Step 2: Install Node.js Dependencies
# Install Yarn (if not present)
npm install -g yarn
# Install dependencies from yarn.lock
yarn install --frozen-lockfile
Dependencies Installed:
- React 19.2.4
- Redux
- Webpack 5
- Babel 7
- Jest (testing)
- ESLint (linting)
- ~500 total npm packages
Installation Time: ~5 minutes (first time), ~30 seconds (cached)
Step 3: Compile Static Resources
# Compile CSS, images, fonts
./script/compile_resources.sh
Tasks:
- Compile LESS to CSS
- Optimize images (pngcrush, optipng)
- Copy fonts to static directory
- Generate CSS sprites
- Minify CSS
Output: root/static/styles/, root/static/images/
Time: ~2 minutes
Step 4: Build JavaScript Bundles
# Build production bundles with Webpack
yarn run build
# Or for development (with source maps)
yarn run build:dev
Webpack Configuration:
- Entry points: root/static/scripts/main.js, root/static/scripts/edit.js
- Output: root/static/build/
- Loaders: Babel (JSX, ES6+), CSS, file-loader
- Plugins: TerserPlugin (minification), MiniCssExtractPlugin (CSS extraction), DefinePlugin
- Code splitting: Vendor bundle, async chunks
Output Files:
- main.bundle.js - Main application code
- vendor.bundle.js - Third-party libraries
- edit.bundle.js - Edit interface code
- *.chunk.js - Async-loaded chunks
Time: ~3 minutes (production), ~30 seconds (development)
Step 5: Initialize Database
# Create database
createdb musicbrainz_db
# Load schema
psql musicbrainz_db < admin/sql/CreateTables.sql
# Load initial data
./admin/InitDb.pl --createdb --import
Schema Loading:
- 375 tables created
- 500+ foreign keys added
- Indexes created
- Triggers installed
Initial Data:
- Countries and areas
- Languages
- Relationship types
- Instrument types
- Genre definitions
Time: ~10 minutes (schema), ~30 minutes (sample data)
Step 6: Build Search Indexes
# Build Solr indexes for all entities
./admin/BuildSearchIndexes.pl --all
Indexes Built:
- Artist index
- Release index
- Recording index
- Work index
- Label index
- Area, event, place, series, instrument indexes
Time: ~2 hours (full production data), ~5 minutes (sample data)
System Requirements
Minimum Requirements (Development)
CPU: 2 cores
RAM: 4 GB
Disk: 20 GB
Database: PostgreSQL 16+
Cache: Redis 6.0+
Search: Solr 8.11+
Recommended Requirements (Production)
CPU: 8+ cores
RAM: 16+ GB
Disk: 500+ GB SSD
- 350 GB for PostgreSQL database
- 50 GB for Solr indexes
- 50 GB for backups
- 50 GB for logs and temp files
Database: PostgreSQL 16+ with:
- shared_buffers = 4GB
- effective_cache_size = 12GB
- work_mem = 64MB
- maintenance_work_mem = 1GB
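The shared_buffers and effective_cache_size values above match a common rule of thumb of roughly 25% and 75% of RAM, applied to the 16 GB figure. A small Python sketch of that arithmetic; the function name is hypothetical, and the results are starting points to tune from rather than fixed recommendations.

```python
def postgres_memory_settings(ram_gb: int) -> dict[str, str]:
    """Derive starting values from total RAM: 25% shared_buffers, 75% effective_cache_size."""
    ram_mb = ram_gb * 1024
    return {
        "shared_buffers": f"{ram_mb // 4}MB",
        "effective_cache_size": f"{ram_mb * 3 // 4}MB",
    }
```

For `ram_gb=16` this yields 4096MB and 12288MB, the same as the 4GB and 12GB listed.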
Cache: Redis 6.0+ with:
- maxmemory = 2GB
- maxmemory-policy = allkeys-lru
Search: Solr 8.11+ with:
- Java heap = 4GB
- Solr cache = 512MB per core
Network Requirements
Bandwidth: 100 Mbps+ (for replication and API traffic)
Ports:
- 5000 - Website
- 5001 - Web service API
- 5432 - PostgreSQL
- 6379 - Redis
- 8983 - Solr
Firewall:
- Allow inbound 80/443 (HTTP/HTTPS)
- Allow outbound 80/443 (external APIs)
- Restrict 5432, 6379, 8983 to localhost (or to the private network in a multi-server deployment)
Software Requirements
Operating System:
- Ubuntu 24.04 LTS (Noble) - recommended
- Debian 12 (Bookworm)
- Any Linux with Perl 5.38+ and Node.js 20+
Perl: 5.38.0 or later (5.42.0 tested)
Node.js: 20.9.0 or later
PostgreSQL: 16.0 or later (16.3 recommended)
Redis: 6.0 or later (7.0 recommended)
Solr: 8.11 or later
Optional:
- Docker 24.0+
- Docker Compose 2.0+
- Nginx 1.24+ (reverse proxy)
- RabbitMQ 3.12+ (background jobs)
Deployment Strategies
Single Server
Use Case: Development, small mirrors
Architecture:
- All services on one server
- PostgreSQL, Redis, Solr, MusicBrainz on localhost
- Nginx reverse proxy
Pros:
- Simple setup
- Low cost
- Easy to manage
Cons:
- Single point of failure
- Limited scalability
- Resource contention
Multi-Server
Use Case: Production, high-traffic mirrors
Architecture:
- Web tier: 2+ servers running MusicBrainz (load balanced)
- Database tier: PostgreSQL primary + replicas
- Cache tier: Redis (possibly clustered)
- Search tier: Solr (possibly sharded)
Pros:
- High availability
- Horizontal scalability
- Better performance
Cons:
- Complex setup
- Higher cost
- Requires load balancer
Docker Swarm / Kubernetes
Use Case: Large-scale deployments, cloud environments
Architecture:
- Container orchestration
- Auto-scaling
- Service discovery
- Health checks
Pros:
- Automated deployment
- Self-healing
- Easy scaling
Cons:
- Steep learning curve
- Operational complexity
- Overhead
Monitoring and Logging
Logging
Framework: Log::Dispatch
Log Levels:
- DEBUG - Verbose debugging
- INFO - Informational messages
- WARN - Warnings
- ERROR - Errors
- FATAL - Fatal errors
Log Destinations:
- STDOUT (development)
- File (production): /var/log/musicbrainz/server.log
- Syslog (optional)
Log Rotation:
- Daily rotation
- Keep 30 days
- Compress old logs
Error Tracking
Platform: Sentry
Integration:
- Server-side: Perl Sentry SDK
- Client-side: JavaScript Sentry SDK
Captured:
- Exceptions
- Error messages
- Stack traces
- Request context
- User context
Metrics
Current State: No Prometheus/metrics endpoint
Workaround: Parse logs for metrics
Future: Prometheus exporter planned
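A minimal Python sketch of the log-parsing workaround: counting log lines per level. The line shape is an assumption (a level name in square brackets); the actual Log::Dispatch output format depends on configuration and is not given in this document, so the pattern would need adjusting.

```python
import re
from collections import Counter

# Assumed line shape: "2024-05-01T12:00:00 [ERROR] message".
LEVEL_PATTERN = re.compile(r"\[(DEBUG|INFO|WARN|ERROR|FATAL)\]")


def count_levels(lines) -> Counter:
    """Count log lines per level; lines without a recognizable level are ignored."""
    counts = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open file handle for /var/log/musicbrainz/server.log as `lines` would give error and warning counts that a monitoring system could poll periodically.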
Health Checks
Current State: No dedicated health check endpoint
Workaround: Check that a GET request to / returns HTTP 200
Future: /health endpoint planned
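A minimal Python sketch of such a probe, using only the standard library. `is_healthy` is a hypothetical helper, and the default URL assumes the website port listed earlier in this document.

```python
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:5000/", timeout: float = 5.0) -> bool:
    """Return True if a GET request to the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections, and timeouts.
        return False
```

Rendering the front page is heavier than a dedicated endpoint would be, so a probe like this is best run at a modest interval.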