Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00


# GraphBrainz Deployment
## Deployment Modes
GraphBrainz supports three deployment modes:
| Mode | Use Case | Entry Point |
|------|----------|-------------|
| Standalone Server | Dedicated GraphQL service | `cli.js` |
| Express Middleware | Embed in existing app | `middleware()` export |
| Direct GraphQL | Programmatic queries | `schema` + `context` exports |
## Standalone Server
### NPM Package
**Package Name**: `graphbrainz`
**Installation**:
```bash
npm install -g graphbrainz
```
**Binary Command**:
```bash
graphbrainz
```
### Local Development
**Installation**:
```bash
git clone https://github.com/exogen/graphbrainz.git
cd graphbrainz
npm install
```
**Start Server**:
```bash
npm start
# or
node cli.js
```
**Default Configuration**:
- Port: 3000
- Path: /
- GraphiQL: enabled
### Environment Variables
| Variable | Default | Purpose |
|----------|---------|---------|
| PORT | 3000 | Server port |
| GRAPHBRAINZ_PATH | / | GraphQL endpoint path |
| GRAPHBRAINZ_CORS_ORIGIN | false | CORS configuration |
| GRAPHBRAINZ_GRAPHIQL | true (dev) | Enable GraphiQL |
| GRAPHBRAINZ_EXTENSIONS | - | Extension list |
| GRAPHBRAINZ_CACHE_SIZE | 8192 | LRU cache size |
| GRAPHBRAINZ_CACHE_TTL | 86400000 | Cache TTL (ms) |
| MUSICBRAINZ_BASE_URL | http://musicbrainz.org/ws/2/ | MusicBrainz API |
| NODE_ENV | development | Environment mode |
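As an illustration of how these variables and their defaults fit together, the sketch below resolves them into one configuration object. `loadConfig` is a hypothetical helper, not part of GraphBrainz, and the parsing rules (integer parsing, `'true'` for booleans) are assumptions. `GRAPHBRAINZ_EXTENSIONS` is left out because its exact format is not specified in the table.

```javascript
// Hypothetical helper: resolve environment variables into a config object,
// falling back to the defaults listed in the table above.
function loadConfig(env = process.env) {
  const nodeEnv = env.NODE_ENV || 'development';
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  return {
    nodeEnv,
    port: toInt(env.PORT, 3000),
    path: env.GRAPHBRAINZ_PATH || '/',
    // Unset means CORS is disabled, matching the "false" default.
    corsOrigin: env.GRAPHBRAINZ_CORS_ORIGIN || false,
    // GraphiQL defaults to on in development only; an explicit value wins.
    graphiql: env.GRAPHBRAINZ_GRAPHIQL !== undefined
      ? env.GRAPHBRAINZ_GRAPHIQL === 'true'
      : nodeEnv === 'development',
    cacheSize: toInt(env.GRAPHBRAINZ_CACHE_SIZE, 8192),
    cacheTTL: toInt(env.GRAPHBRAINZ_CACHE_TTL, 86400000),
    musicbrainzBaseURL: env.MUSICBRAINZ_BASE_URL || 'http://musicbrainz.org/ws/2/',
  };
}

console.log(loadConfig({ PORT: '4000', NODE_ENV: 'production' }));
```

Passing the environment as an argument keeps the function easy to test without mutating `process.env`.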
### Example Configuration
**.env**:
```bash
PORT=4000
GRAPHBRAINZ_PATH=/graphql
GRAPHBRAINZ_CORS_ORIGIN=*
GRAPHBRAINZ_EXTENSIONS=cover-art-archive,fanart,mediawiki,theaudiodb
FANART_API_KEY=your-fanart-key
THEAUDIODB_API_KEY=your-theaudiodb-key
GRAPHBRAINZ_CACHE_SIZE=16384
GRAPHBRAINZ_CACHE_TTL=3600000
```
**Start**:
```bash
node cli.js
```
**Access**:
- GraphQL endpoint: http://localhost:4000/graphql
- GraphiQL interface: http://localhost:4000/graphql
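To show what a client request against this endpoint could look like, here is a minimal sketch that builds the arguments for a standard GraphQL-over-HTTP POST. `buildGraphQLRequest` is a hypothetical helper, and the URL assumes the example configuration above (port 4000, path `/graphql`).

```javascript
// Hypothetical helper: build the fetch() arguments for a GraphQL POST request.
function buildGraphQLRequest(endpoint, query, variables = {}) {
  return [
    endpoint,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', Accept: 'application/json' },
      body: JSON.stringify({ query, variables }),
    },
  ];
}

const query = `
  {
    lookup {
      artist(mbid: "5b11f4ce-a62d-471e-81fc-a69a8278c7da") {
        name
        country
      }
    }
  }
`;

const [url, init] = buildGraphQLRequest('http://localhost:4000/graphql', query);

// With the server running, the request could be sent like this:
//   const response = await fetch(url, init);
//   console.log((await response.json()).data);
console.log(url, init.method);
```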
## Express Middleware
### Installation
```bash
npm install graphbrainz
```
### Basic Integration
```javascript
import express from 'express';
import { middleware } from 'graphbrainz';

const app = express();

app.use('/graphql', middleware());

app.listen(3000, () => {
  console.log('Server running on http://localhost:3000/graphql');
});
### Advanced Configuration
```javascript
import express from 'express';
import { middleware } from 'graphbrainz';
import lastfm from 'graphbrainz-extension-lastfm';

const app = express();

app.use('/graphql', middleware({
  // Extension configuration
  extensions: [
    lastfm
  ],
  // Cache configuration
  cacheSize: 16384,
  cacheTTL: 3600000,
  // MusicBrainz configuration
  musicbrainz: {
    baseURL: 'http://localhost:5000/ws/2/'
  },
  // Extension API keys
  fanart: {
    apiKey: process.env.FANART_API_KEY
  },
  theaudiodb: {
    apiKey: process.env.THEAUDIODB_API_KEY
  },
  // GraphiQL configuration
  graphiql: true,
  // CORS configuration
  cors: {
    origin: '*'
  }
}));

app.listen(3000);
```
### Multiple Endpoints
```javascript
import express from 'express';
import { middleware } from 'graphbrainz';

const app = express();

// Public endpoint (no extensions)
app.use('/graphql/public', middleware({
  extensions: []
}));

// Premium endpoint (all extensions)
app.use('/graphql/premium', middleware({
  extensions: ['cover-art-archive', 'fanart', 'mediawiki', 'theaudiodb']
}));

app.listen(3000);
```
## Direct GraphQL Client
### Installation
```bash
npm install graphbrainz
```
### Programmatic Queries
```javascript
import { schema, context } from 'graphbrainz';
import { graphql } from 'graphql';

const query = `
  {
    lookup {
      artist(mbid: "5b11f4ce-a62d-471e-81fc-a69a8278c7da") {
        name
        country
      }
    }
  }
`;

const result = await graphql({
  schema,
  source: query,
  contextValue: context
});

console.log(result.data);
```
### Custom Context
```javascript
import { graphql } from 'graphql';
import { createSchema, createContext } from 'graphbrainz';

const schema = createSchema({
  extensions: ['cover-art-archive', 'fanart']
});

const context = createContext({
  cacheSize: 16384,
  cacheTTL: 3600000,
  fanart: {
    apiKey: process.env.FANART_API_KEY
  }
});

// `query` is a GraphQL query string, such as the one in the previous example
const result = await graphql({
  schema,
  source: query,
  contextValue: context
});
```
## Heroku Deployment
GraphBrainz includes Heroku-specific deployment scripts.
### Procfile
**File**: `Procfile`
```
web: node cli.js
```
### Deployment Script
**File**: `scripts/deploy.sh`
```bash
#!/bin/bash
# Create deploy branch
git checkout -b deploy
# Build schema and docs
npm run update-schema
npm run build-docs
# Commit build artifacts
git add -f schema.json docs/
git commit -m "Build for deployment"
# Force push to Heroku
git push -f heroku deploy:master
# Clean up
git checkout main
git branch -D deploy
```
### Heroku Configuration
**Create App**:
```bash
heroku create my-graphbrainz
```
**Set Environment Variables**:
```bash
heroku config:set NODE_ENV=production
heroku config:set GRAPHBRAINZ_EXTENSIONS=cover-art-archive,fanart,mediawiki,theaudiodb
heroku config:set FANART_API_KEY=your-key
heroku config:set THEAUDIODB_API_KEY=your-key
heroku config:set GRAPHBRAINZ_CACHE_SIZE=16384
heroku config:set GRAPHBRAINZ_GRAPHIQL=false
```
**Deploy**:
```bash
./scripts/deploy.sh
```
**Access**:
```
https://my-graphbrainz.herokuapp.com/
```
### Heroku Dyno Sizing
| Dyno Type | Memory | Recommended Load |
|-----------|--------|------------------|
| Free (discontinued November 2022) | 512 MB | Development only |
| Hobby (since renamed Basic) | 512 MB | <10 req/s |
| Standard-1X | 512 MB | <25 req/s |
| Standard-2X | 1 GB | <100 req/s |
| Performance-M | 2.5 GB | <500 req/s |
## NPM Package Distribution
### Package Exports
**File**: `package.json`
```json
{
  "name": "graphbrainz",
  "version": "9.0.0",
  "main": "src/index.js",
  "bin": {
    "graphbrainz": "cli.js"
  },
  "exports": {
    ".": "./src/index.js",
    "./schema": "./schema.json",
    "./extensions/cover-art-archive": "./src/extensions/cover-art-archive/index.js",
    "./extensions/fanart": "./src/extensions/fanart/index.js",
    "./extensions/mediawiki": "./src/extensions/mediawiki/index.js",
    "./extensions/theaudiodb": "./src/extensions/theaudiodb/index.js"
  }
}
```
### Module Imports
```javascript
// Main module
import { middleware, schema, context } from 'graphbrainz';
// Schema introspection
import schemaJSON from 'graphbrainz/schema';
// Built-in extensions
import coverArt from 'graphbrainz/extensions/cover-art-archive';
import fanart from 'graphbrainz/extensions/fanart';
import mediawiki from 'graphbrainz/extensions/mediawiki';
import theaudiodb from 'graphbrainz/extensions/theaudiodb';
```
## Continuous Integration
### Travis CI
**File**: `.travis.yml`
```yaml
language: node_js
node_js:
  - "12"
  - "14"
  - "15"
cache:
  directories:
    - node_modules
script:
  - npm test
  - npm run build
after_success:
  - npm run coverage
  - npx codecov
  - npx coveralls < coverage/lcov.info
```
### GitHub Actions (Not Implemented)
GraphBrainz uses Travis CI. Migration to GitHub Actions would look like:
```yaml
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [12, 14, 16, 18]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test
      - run: npm run build
      - uses: codecov/codecov-action@v3
```
## Build Process
### Schema Generation
**Command**:
```bash
npm run update-schema
```
**Script**:
```javascript
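import fs from 'fs';
import { printSchema, introspectionFromSchema } from 'graphql';
import { schema } from './src/index.js';

// SDL representation
fs.writeFileSync('schema.graphql', printSchema(schema));

// Introspection JSON; GraphQLSchema has no toJSON() method, so the
// introspection result is built with introspectionFromSchema()
const schemaJSON = JSON.stringify(introspectionFromSchema(schema), null, 2);
fs.writeFileSync('schema.json', schemaJSON);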
```
**Output**:
- `schema.graphql` - SDL representation
- `schema.json` - Introspection JSON
### Documentation Generation
**Command**:
```bash
npm run build-docs
```
**Scripts**:
- `scripts/generate-readme-toc.js` - Table of contents
- `scripts/generate-schema-docs.js` - Schema reference
- `scripts/generate-type-docs.js` - Type documentation
- `scripts/generate-extension-docs.js` - Extension reference
### Preversion Hook
**File**: `package.json`
```json
{
  "scripts": {
    "preversion": "npm run update-schema && npm run build-docs && git add schema.json schema.graphql docs/"
  }
}
```
Ensures schema and docs are updated before version bump.
## Docker (Not Implemented)
GraphBrainz does not include Docker configuration. Example implementation:
### Dockerfile
```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "cli.js"]
```
### docker-compose.yml
```yaml
version: '3.8'
services:
  graphbrainz:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - GRAPHBRAINZ_EXTENSIONS=cover-art-archive,fanart,mediawiki,theaudiodb
      - FANART_API_KEY=${FANART_API_KEY}
      - THEAUDIODB_API_KEY=${THEAUDIODB_API_KEY}
      - GRAPHBRAINZ_CACHE_SIZE=16384
    restart: unless-stopped
```
### Build and Run
```bash
docker-compose up -d
```
## Kubernetes (Not Implemented)
Example Kubernetes deployment:
### Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphbrainz
spec:
  replicas: 3
  selector:
    matchLabels:
      app: graphbrainz
  template:
    metadata:
      labels:
        app: graphbrainz
    spec:
      containers:
        - name: graphbrainz
          image: graphbrainz:9.0.0
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: GRAPHBRAINZ_CACHE_SIZE
              value: "16384"
            - name: FANART_API_KEY
              valueFrom:
                secretKeyRef:
                  name: graphbrainz-secrets
                  key: fanart-api-key
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
```
### Service
```yaml
apiVersion: v1
kind: Service
metadata:
  name: graphbrainz
spec:
  selector:
    app: graphbrainz
  ports:
    - port: 80
      targetPort: 3000
  type: LoadBalancer
```
## Production Considerations
### Memory Allocation
**Node.js Heap Size**:
```bash
node --max-old-space-size=2048 cli.js
```
**Recommended Allocation**:
| Traffic | Heap Size | Total Memory |
|---------|-----------|--------------|
| <10 req/s | 512 MB | 1 GB |
| 10-50 req/s | 1 GB | 2 GB |
| 50-100 req/s | 2 GB | 4 GB |
| 100+ req/s | 4 GB | 8 GB |
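The table can also be expressed as a small lookup that returns the matching `--max-old-space-size` flag. This is only a sketch of the sizing guidance above: `recommendMemory` is a hypothetical helper, and assigning boundary values (exactly 10, 50, or 100 req/s) to the larger tier is an assumption, since the table leaves them ambiguous.

```javascript
// Rows from the sizing table: exclusive upper traffic bound (req/s),
// heap size and total memory in MB.
const SIZING = [
  { maxReqPerSec: 10, heapMB: 512, totalMB: 1024 },
  { maxReqPerSec: 50, heapMB: 1024, totalMB: 2048 },
  { maxReqPerSec: 100, heapMB: 2048, totalMB: 4096 },
  { maxReqPerSec: Infinity, heapMB: 4096, totalMB: 8192 },
];

// Hypothetical helper: pick the first tier whose upper bound exceeds the load.
function recommendMemory(reqPerSec) {
  const row =
    SIZING.find((tier) => reqPerSec < tier.maxReqPerSec) ??
    SIZING[SIZING.length - 1];
  return {
    heapMB: row.heapMB,
    totalMB: row.totalMB,
    nodeFlag: `--max-old-space-size=${row.heapMB}`,
  };
}

console.log(recommendMemory(75));
```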
### Process Management
**PM2**:
```bash
npm install -g pm2
pm2 start cli.js --name graphbrainz -i max
pm2 save
pm2 startup
```
**Systemd**:
```ini
[Unit]
Description=GraphBrainz GraphQL Server
After=network.target
[Service]
Type=simple
User=graphbrainz
WorkingDirectory=/opt/graphbrainz
ExecStart=/usr/bin/node cli.js
Restart=on-failure
Environment=NODE_ENV=production
Environment=PORT=3000
[Install]
WantedBy=multi-user.target
```
### Reverse Proxy
**Nginx**:
```nginx
upstream graphbrainz {
    server localhost:3000;
}

server {
    listen 80;
    server_name graphbrainz.example.com;

    location / {
        proxy_pass http://graphbrainz;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
```
### Monitoring
GraphBrainz does not include built-in monitoring. Recommended additions:
**Prometheus Metrics**:
```javascript
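import express from 'express';
import promClient from 'prom-client';

// Reuse the Express app that mounts the GraphBrainz middleware, if there is one
const app = express();

const register = new promClient.Registry();
const httpRequestDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  // Register on the custom registry only, not the global default one
  registers: [register]
});

// Record the duration of every request once the response has been sent
app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    httpRequestDuration.labels(req.method, req.path, res.statusCode).observe(duration);
  });
  next();
});

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  // register.metrics() returns a Promise in prom-client 13 and later
  res.end(await register.metrics());
});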
```
### Health Checks
GraphBrainz does not include health endpoints. Recommended implementation:
```javascript
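// `app` is the Express app; `cache` is assumed to be a reference to the
// LRU cache instance in use
app.get('/health', (req, res) => {
  res.json({
    status: 'ok',
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    cache: {
      size: cache.size,
      max: cache.max
    }
  });
});

app.get('/ready', async (req, res) => {
  try {
    // Check MusicBrainz connectivity. The base URL must end with a slash
    // so that the relative path is appended rather than replacing "2".
    const baseURL = process.env.MUSICBRAINZ_BASE_URL || 'http://musicbrainz.org/ws/2/';
    const url = new URL('artist/5b11f4ce-a62d-471e-81fc-a69a8278c7da?fmt=json', baseURL);
    const response = await fetch(url);
    // fetch() rejects only on network errors, so check the HTTP status too
    if (!response.ok) {
      throw new Error(`MusicBrainz responded with HTTP ${response.status}`);
    }
    res.json({ status: 'ready' });
  } catch (error) {
    res.status(503).json({ status: 'not ready', error: error.message });
  }
});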
```
## Scaling Strategies
### Horizontal Scaling
GraphBrainz is stateless (except LRU cache) and can be horizontally scaled:
**Load Balancer**:
```
Client -> Load Balancer -> GraphBrainz Instance 1
                        -> GraphBrainz Instance 2
                        -> GraphBrainz Instance 3
```
**Cache Considerations**:
- Each instance has independent LRU cache
- Cache hit ratio decreases with more instances
- Consider shared cache (Redis) for better hit ratio
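The effect of independent caches on hit ratio can be illustrated with a small simulation. This is a sketch under stated assumptions, not a measurement of GraphBrainz: `LruCache` is a minimal stand-in for the real cache, requests are spread round-robin, keys are drawn uniformly from a fixed set, and every cache is large enough that nothing is evicted. Under those assumptions each instance has to miss once per key it sees, so the combined hit ratio can only fall as instances are added.

```javascript
// Minimal LRU cache built on Map insertion order (illustrative stand-in).
class LruCache {
  constructor(max) {
    this.max = max;
    this.map = new Map();
  }
  has(key) {
    if (!this.map.has(key)) return false;
    // Refresh recency by re-inserting the key at the end.
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value);
    return true;
  }
  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.max) {
      // The first key in iteration order is the least recently used.
      this.map.delete(this.map.keys().next().value);
    }
  }
}

// Deterministic pseudo-random generator so every run sees the same requests.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 2 ** 32;
  };
}

// Replay one request stream against `instances` independent caches.
function hitRatio(instances, { requests = 20000, keys = 500, cacheSize = 8192 } = {}) {
  const caches = Array.from({ length: instances }, () => new LruCache(cacheSize));
  const rng = makeRng(42);
  let hits = 0;
  for (let i = 0; i < requests; i++) {
    const key = Math.floor(rng() * keys);
    const cache = caches[i % instances]; // round-robin load balancing
    if (cache.has(key)) hits++;
    else cache.set(key, true);
  }
  return hits / requests;
}

for (const n of [1, 3, 6]) {
  console.log(`${n} instance(s): hit ratio ${hitRatio(n).toFixed(3)}`);
}
```

A shared cache, or routing that sends each key to the same instance, would avoid the duplicated misses that this simulation counts.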
### Vertical Scaling
Increase memory allocation for larger cache:
```bash
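# 32768 entries is 4x the default cache size. Set the variable on the same
# command line (or export it first) so the node process inherits it.
GRAPHBRAINZ_CACHE_SIZE=32768 node --max-old-space-size=4096 cli.js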
```
### Local MusicBrainz Mirror
Pointing GraphBrainz at a local MusicBrainz mirror removes the public API's rate limit and reduces request latency:
```bash
MUSICBRAINZ_BASE_URL=http://localhost:5000/ws/2/
```
**Benefits**:
- No rate limiting
- <10ms latency (vs 100-500ms)
- Offline operation
- Full dataset access
**Setup**: https://musicbrainz.org/doc/MusicBrainz_Server/Setup
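A rough back-of-the-envelope sketch of why a mirror matters for cold-cache workloads follows. The figures are assumptions: about one request per second for the public MusicBrainz API, a 300 ms round trip taken from the middle of the latency range listed above, and 10 ms for a local mirror. `estimateColdFetchSeconds` is a hypothetical helper that models strictly sequential lookups.

```javascript
// Estimate wall-clock seconds for `count` sequential, uncached upstream lookups.
// `ratePerSecond` is the upstream request cap (Infinity when there is none).
function estimateColdFetchSeconds(count, { ratePerSecond, latencyMs }) {
  const latencyBound = (count * latencyMs) / 1000;
  const rateBound = Number.isFinite(ratePerSecond) ? count / ratePerSecond : 0;
  // The slower of the two constraints determines the total time.
  return Math.max(latencyBound, rateBound);
}

// Assumed figures, see the paragraph above.
const publicApi = estimateColdFetchSeconds(1000, { ratePerSecond: 1, latencyMs: 300 });
const localMirror = estimateColdFetchSeconds(1000, { ratePerSecond: Infinity, latencyMs: 10 });

console.log({ publicApi, localMirror });
```

With these inputs the rate limit, not latency, dominates the public API case, which is why caching and a mirror complement each other.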