MusicBrainz Server Integrations
Cover Art Archive
Overview
Service: Cover Art Archive (coverartarchive.org)
Storage: Amazon S3 + Internet Archive
Purpose: Store and serve album cover artwork
Upload Process
Method: Signed POST to S3
Authentication: HMAC-SHA1 signed policy
Configuration:
# DBDefs.pm
sub COVER_ART_ARCHIVE_ACCESS_KEY { 'access_key' }
sub COVER_ART_ARCHIVE_SECRET_KEY { 'secret_key' }
sub COVER_ART_ARCHIVE_UPLOAD_PREFIXER { 'MB' }
sub COVER_ART_ARCHIVE_DOWNLOAD_PREFIX { 'https://coverartarchive.org' }
Upload Flow:
- User uploads image via MusicBrainz interface
- Server generates S3 policy document
- Policy signed with HMAC-SHA1 using secret key
- Browser POSTs directly to S3 with signed policy
- S3 stores image and forwards to Internet Archive
- Image becomes available at coverartarchive.org
Policy Document:
{
"expiration": "2024-12-31T23:59:59Z",
"conditions": [
{"bucket": "mbid-{release_mbid}"},
{"acl": "public-read"},
["starts-with", "$key", "mbid-{release_mbid}/"],
["content-length-range", 0, 10485760]
]
}
Signature:
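use Digest::SHA qw(hmac_sha1_base64);
use MIME::Base64 qw(encode_base64);
# Pass '' as the line ending: by default encode_base64 inserts line breaks,
# and the signed string must match the submitted policy field exactly
my $policy_b64 = encode_base64($policy_json, '');
my $signature = hmac_sha1_base64($policy_b64, $secret_key);
$signature .= '=' while length($signature) % 4; # Digest::SHA omits Base64 padding; pad to a multiple of 4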
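The policy and signature steps can be combined into one routine. The following is a minimal sketch, not the server's actual code: the helper name `build_signed_policy` and the keys of the returned hash are assumptions made for illustration, and only core Perl modules are used.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha1_base64);
use JSON::PP ();
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: builds the policy shown above for one release and signs it.
sub build_signed_policy {
    my ($release_mbid, $secret_key, $expiration) = @_;
    my $bucket = "mbid-$release_mbid";
    my $policy = {
        expiration => $expiration,
        conditions => [
            { bucket => $bucket },
            { acl    => 'public-read' },
            [ 'starts-with', '$key', "$bucket/" ],
            [ 'content-length-range', 0, 10485760 ],   # at most 10 MiB
        ],
    };
    # canonical() sorts hash keys so the same input always yields the same bytes
    my $policy_json = JSON::PP->new->canonical->encode($policy);
    my $policy_b64  = encode_base64($policy_json, '');  # '' = no line breaks
    my $signature   = hmac_sha1_base64($policy_b64, $secret_key);
    $signature .= '=' while length($signature) % 4;     # restore Base64 padding
    return { policy => $policy_b64, signature => $signature };
}

my $fields = build_signed_policy(
    '76df3287-6cda-33eb-8e9a-044b5e15ffdd', 'secret_key', '2024-12-31T23:59:59Z');
print "policy=$fields->{policy}\nsignature=$fields->{signature}\n";
```

The browser would then submit both values as form fields alongside the image. Signing the Base64 string rather than the raw JSON matters, because the receiving side verifies the signature against the policy field exactly as it was posted.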
Retrieval
URL Pattern: https://coverartarchive.org/release/{mbid}/front
Image Types:
- front - Front cover
- back - Back cover
- {id} - Specific image by ID
Sizes:
- Original (full resolution)
- 250 - 250px thumbnail
- 500 - 500px thumbnail
- 1200 - 1200px large
Example:
https://coverartarchive.org/release/76df3287-6cda-33eb-8e9a-044b5e15ffdd/front-250.jpg
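A small helper can assemble these URLs from the pattern, image type, and size. This is an illustrative sketch only: `cover_art_url` is a hypothetical name, and the `.jpg` suffix on thumbnails simply follows the example above.

```perl
use strict;
use warnings;

my $CAA_BASE   = 'https://coverartarchive.org';
my %VALID_SIZE = map { $_ => 1 } (250, 500, 1200);

# Hypothetical helper: $image is 'front', 'back', or a numeric image ID;
# $size is optional and must be one of the thumbnail sizes listed above.
sub cover_art_url {
    my ($mbid, $image, $size) = @_;
    die "invalid MBID: $mbid\n"
        unless $mbid =~ /\A[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\z/;
    die "invalid image selector: $image\n"
        unless $image =~ /\A(?:front|back|[0-9]+)\z/;
    my $url = "$CAA_BASE/release/$mbid/$image";
    if (defined $size) {
        die "unsupported size: $size\n" unless $VALID_SIZE{$size};
        $url .= "-$size.jpg";   # thumbnail suffix as in the example above
    }
    return $url;
}

print cover_art_url('76df3287-6cda-33eb-8e9a-044b5e15ffdd', 'front', 250), "\n";
# prints https://coverartarchive.org/release/76df3287-6cda-33eb-8e9a-044b5e15ffdd/front-250.jpg
```

Validating the MBID and size up front keeps malformed input out of outgoing requests.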
Wikipedia/Wikidata/Wikimedia Commons
MediaWiki API Integration
Purpose: Fetch article extracts, images, and structured data
Endpoints:
- Wikipedia: https://{lang}.wikipedia.org/w/api.php
- Wikidata: https://www.wikidata.org/w/api.php
- Commons: https://commons.wikimedia.org/w/api.php
Wikipedia Extracts
API Action: query with prop=extracts
Request:
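use JSON::PP qw(decode_json); # core module; any JSON module exporting decode_json works
use URI::Escape qw(uri_escape);
my $url = "https://en.wikipedia.org/w/api.php?" .
"action=query&" .
"prop=extracts&" .
"exintro=1&" .
"explaintext=1&" .
"titles=" . uri_escape($artist_name) .
"&format=json";
my $response = $ua->get($url);
# LWP does not die on HTTP errors, so check before decoding the body
die "Wikipedia request failed: " . $response->status_line unless $response->is_success;
my $data = decode_json($response->content);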
Caching: 3 days for extracts
Display: Artist/release pages show Wikipedia extract in sidebar
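The decoded response nests the extract under `query.pages`, keyed by page ID. A minimal sketch of pulling it out follows. The helper name `first_extract` is an assumption, and the sample JSON is hand-written in the shape of a `format=json` reply rather than captured from the API.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Returns the intro extract of the first existing page in the response,
# or undef when the title was not found (MediaWiki flags such pages "missing").
sub first_extract {
    my ($data) = @_;
    my $pages = $data->{query}{pages} or return undef;
    for my $page_id (sort keys %$pages) {
        my $page = $pages->{$page_id};
        next if exists $page->{missing};
        return $page->{extract}
            if defined $page->{extract} && length $page->{extract};
    }
    return undef;
}

# Hand-written sample, not real API output.
my $sample = decode_json(<<'JSON');
{"query":{"pages":{"1234":{"pageid":1234,"title":"Example Artist",
  "extract":"Example Artist is a band."}}}}
JSON
print first_extract($sample), "\n";
```

Returning undef for missing pages lets the caller cache the negative result for the same period as a successful lookup.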
Language Links
API Action: query with prop=langlinks
Purpose: Find Wikipedia articles in different languages
Request:
my $url = "https://en.wikipedia.org/w/api.php?" .
"action=query&" .
"prop=langlinks&" .
"titles=" . uri_escape($title) .
"&lllimit=500" . # no trailing '&' here; the next fragment supplies it
"&format=json";
Caching: 7 days for language links
Usage: Display Wikipedia links in user's preferred language
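Choosing a link for the user's preferred language amounts to a lookup over the returned `langlinks` array. The sketch below assumes the default `format=json` layout, in which each entry carries its title under the key `*`. The helper name `pick_langlink` and the sample data are illustrative.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Returns { lang, title } for the first language in @$preferred that has a
# language link, or undef if none of the preferred languages is available.
sub pick_langlink {
    my ($data, $preferred) = @_;
    my %title_for;
    for my $page (values %{ $data->{query}{pages} || {} }) {
        $title_for{ $_->{lang} } = $_->{'*'} for @{ $page->{langlinks} || [] };
    }
    for my $lang (@$preferred) {
        return +{ lang => $lang, title => $title_for{$lang} }
            if exists $title_for{$lang};
    }
    return undef;
}

# Hand-written sample, not real API output.
my $sample = decode_json(<<'JSON');
{"query":{"pages":{"1234":{"langlinks":[
  {"lang":"de","*":"Beispiel"},{"lang":"fr","*":"Exemple"}]}}}}
JSON
my $link = pick_langlink($sample, ['ja', 'fr', 'de']);
print "$link->{lang}: $link->{title}\n";   # prints "fr: Exemple"
```

Passing the preferences as an ordered list lets the caller fall back from the user's first choice to later ones.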
Wikidata Integration
Purpose: Fetch structured data (birth dates, locations, etc.)
API Action: wbgetentities
Request:
my $url = "https://www.wikidata.org/w/api.php?" .
"action=wbgetentities&" .
"ids=Q{wikidata_id}&" .
"format=json";
Data Extracted:
- Birth/death dates
- Birth/death places
- Occupations
- Genres
- Record labels
- Official websites
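Each of these fields corresponds to a Wikidata property whose statements appear under `claims` in the entity JSON. The following sketch shows one way to read them. The helper name `claim_values`, the field names in `%PROPERTY`, and the sample entity (with the placeholder ID `Q0`) are assumptions for illustration, not the server's actual code.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Wikidata property IDs for the fields listed above.
my %PROPERTY = (
    birth_date   => 'P569', death_date       => 'P570',
    birth_place  => 'P19',  death_place      => 'P20',
    occupation   => 'P106', genre            => 'P136',
    record_label => 'P264', official_website => 'P856',
);

# Returns an array ref of raw values for one field. Dates come back as
# strings such as "+1947-01-08T00:00:00Z", linked items as "Q..." IDs.
sub claim_values {
    my ($entity, $field) = @_;
    my $pid = $PROPERTY{$field} or die "unknown field: $field\n";
    my @values;
    for my $claim (@{ $entity->{claims}{$pid} || [] }) {
        my $snak = $claim->{mainsnak};
        next unless ($snak->{snaktype} // '') eq 'value';  # skip novalue/somevalue
        my $value = $snak->{datavalue}{value};
        push @values, ref($value) eq 'HASH'
            ? ($value->{time} // $value->{id} // $value->{text})
            : $value;                                      # plain strings, e.g. URLs
    }
    return \@values;
}

# Hand-written sample in the shape of a wbgetentities reply.
my $reply = decode_json(<<'JSON');
{"entities":{"Q0":{"claims":{
 "P569":[{"mainsnak":{"snaktype":"value","datavalue":{"type":"time",
   "value":{"time":"+1947-01-08T00:00:00Z"}}}}],
 "P106":[{"mainsnak":{"snaktype":"value","datavalue":{"type":"wikibase-entityid",
   "value":{"id":"Q177220"}}}}],
 "P856":[{"mainsnak":{"snaktype":"value","datavalue":{"type":"string",
   "value":"https://example.org/"}}}]}}}}
JSON
my $entity = $reply->{entities}{Q0};
print "$_: @{ claim_values($entity, $_) }\n" for qw(birth_date occupation official_website);
```

Item IDs such as occupations and genres still need a second lookup to resolve to human-readable labels.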
Wikimedia Commons Images
Purpose: Fetch artist/band photos
API Action: query with prop=imageinfo
Request:
my $url = "https://commons.wikimedia.org/w/api.php?" .
"action=query&" .
"prop=imageinfo&" .
"iiprop=url|size|mime&" .
"titles=File:" . uri_escape($filename) .
"&format=json";
Display: Artist pages show Commons images in sidebar
CritiqueBrainz
Overview
Service: CritiqueBrainz (critiquebrainz.org)
Purpose: User-generated music reviews
Integration
Method: URL linking
Pattern: https://critiquebrainz.org/release/{mbid}
Display: Release pages show link to CritiqueBrainz reviews
Embedding: Review count and average rating displayed on release pages
API: CritiqueBrainz API used to fetch review statistics
Request:
my $url = "https://critiquebrainz.org/ws/1/release/$mbid";
my $response = $ua->get($url);
my $data = decode_json($response->content);
my $review_count = $data->{review_count};
my $avg_rating = $data->{average_rating};
Event Art Archive
Overview
Service: Event Art Archive
Purpose: Store event posters and promotional materials
Architecture: Similar to Cover Art Archive (S3 + Internet Archive)
URL Pattern: https://eventartarchive.org/event/{mbid}
Discourse SSO
Overview
Service: MusicBrainz Community Forum (community.metabrainz.org)
Protocol: Discourse SSO (Single Sign-On)
Authentication Flow
Method: HMAC-SHA256 signed payload
Flow:
- User clicks "Log in" on Discourse
- Discourse redirects to MusicBrainz with nonce
- MusicBrainz authenticates user
- MusicBrainz generates SSO payload
- Payload signed with HMAC-SHA256
- User redirected back to Discourse with signed payload
- Discourse verifies signature and logs in user
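Before building its own payload, the receiving side has to validate the request that Discourse sent in step 2, which carries a Base64 `sso` parameter and a hex `sig` parameter. The sketch below illustrates that check under stated assumptions: the helper names `nonce_from_discourse_request` and `secure_eq` are hypothetical, and the demo values are invented.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);
use MIME::Base64 qw(encode_base64 decode_base64);

# Constant-time comparison, so timing does not reveal how many characters matched.
sub secure_eq {
    my ($x, $y) = @_;
    return 0 unless length($x) == length($y);
    my $diff = 0;
    $diff |= ord(substr($x, $_, 1)) ^ ord(substr($y, $_, 1)) for 0 .. length($x) - 1;
    return $diff == 0 ? 1 : 0;
}

# Verifies the sso/sig pair from the incoming redirect and returns the nonce,
# or undef if the signature does not match the shared secret.
sub nonce_from_discourse_request {
    my ($sso, $sig, $secret) = @_;
    my $expected = hmac_sha256_hex($sso, $secret);
    return undef unless secure_eq($expected, lc $sig);
    my $query = decode_base64($sso);   # e.g. "nonce=...&return_sso_url=..."
    my ($nonce) = $query =~ /(?:\A|&)nonce=([^&]*)/;
    return $nonce;
}

# Demo with invented values.
my $secret = 'shared_secret';
my $sso    = encode_base64('nonce=cb68251eefb5211e&return_sso_url=https%3A%2F%2Fexample.org', '');
my $sig    = hmac_sha256_hex($sso, $secret);
print nonce_from_discourse_request($sso, $sig, $secret), "\n";   # prints cb68251eefb5211e
```

The nonce must be echoed back unchanged in the response payload so that Discourse can match the reply to the login attempt it started.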
Configuration:
# DBDefs.pm
sub DISCOURSE_SSO_SECRET { 'shared_secret' }
sub DISCOURSE_SERVER { 'https://community.metabrainz.org' }
Payload Generation:
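use Digest::SHA qw(hmac_sha256_hex);
use MIME::Base64 qw(encode_base64);
use URI::Escape qw(uri_escape uri_escape_utf8);
# URL-encode each value so characters such as '&', '=' or '+' cannot corrupt
# the payload (uri_escape_utf8 assumes decoded character strings)
my $payload = encode_base64(
    "nonce=" . uri_escape($nonce) .
    "&email=" . uri_escape_utf8($email) .
    "&external_id=" . uri_escape($user_id) .
    "&username=" . uri_escape_utf8($username) .
    "&name=" . uri_escape_utf8($name),
    '' # suppress the line breaks encode_base64 inserts by default
);
# The signature covers the Base64 string, not the raw query string
my $signature = hmac_sha256_hex($payload, $sso_secret);
my $redirect_url = "$discourse_server/session/sso_login?" .
    "sso=" . uri_escape($payload) .
    "&sig=$signature";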
User Data Synced:
- Email address
- Username
- Display name
- User ID (external_id)
- Avatar URL (optional)
- Admin status (optional)
- Moderator status (optional)
MetaBrainz OAuth
Overview
Service: Centralized OAuth provider for MetaBrainz services
Protocol: OAuth 2.0 with token introspection
Token Introspection
Endpoint: https://musicbrainz.org/oauth2/introspect
Method: POST
Request:
my $response = $ua->post(
'https://musicbrainz.org/oauth2/introspect',
{
token => $access_token,
client_id => $client_id,
client_secret => $client_secret,
}
);
my $data = decode_json($response->content);
Response:
{
"active": true,
"scope": "profile email tag rating collection",
"client_id": "client_id",
"username": "username",
"token_type": "Bearer",
"exp": 1609459200,
"iat": 1609372800,
"sub": "user_id"
}
Usage: Other MetaBrainz services (ListenBrainz, BookBrainz, etc.) validate tokens via introspection
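A consuming service typically checks three things in the introspection response: that the token is active, that it has not expired, and that it carries the scope the request needs. The following is a minimal sketch of that decision. `token_allows` is a hypothetical helper, and it takes the already decoded response as a hash reference.

```perl
use strict;
use warnings;

# Returns 1 if the introspection response authorizes $required_scope at
# time $now (epoch seconds, defaults to the current time), otherwise 0.
sub token_allows {
    my ($info, $required_scope, $now) = @_;
    $now //= time();
    return 0 unless $info->{active};
    return 0 if defined $info->{exp} && $info->{exp} <= $now;
    # "scope" is a space-separated list, as in the response above
    my %granted = map { $_ => 1 } split ' ', ($info->{scope} // '');
    return $granted{$required_scope} ? 1 : 0;
}

# Values taken from the example response above.
my $info = {
    active => 1,
    scope  => 'profile email tag rating collection',
    exp    => 1609459200,
    iat    => 1609372800,
};
print token_allows($info, 'rating', 1609400000) ? "allowed\n" : "denied\n";  # prints "allowed"
```

Checking `exp` locally, in addition to `active`, guards against acting on a cached introspection result after the token has expired.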
Services Using MetaBrainz OAuth
- ListenBrainz (listening history)
- BookBrainz (book metadata)
- CritiqueBrainz (music reviews)
- AcousticBrainz (audio analysis)
- Picard (music tagger)
Replication System
Overview
Purpose: Synchronize database changes from master to mirrors
Protocol: dbmirror2 packet system
Replication Modes
RT_MASTER:
- Generates replication packets
- Writes to dbmirror_pending and dbmirror_pendingdata tables
- Exports packets for mirrors
RT_MIRROR:
- Consumes replication packets
- Applies changes from master
- Read-only (no edits)
RT_STANDALONE:
- No replication
- Fully independent database
Configuration:
# DBDefs.pm
sub REPLICATION_TYPE { RT_MASTER } # or RT_MIRROR or RT_STANDALONE
sub REPLICATION_ACCESS_TOKEN { 'secret_token' }
Packet Structure
Tables:
- dbmirror_pending - Pending transactions
- dbmirror_pendingdata - Data changes (INSERT/UPDATE/DELETE)
Packet Format:
SeqId: 12345
TransactionId: 67890
Operation: i # i=INSERT, u=UPDATE, d=DELETE
TableName: artist
Data: {"id":123,"gid":"...","name":"..."}
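To make the fields concrete, the sketch below parses one change record written in the "Key: value" layout illustrated above. It is only an illustration of the field semantics: the layout is taken from this document and may not match how packets are encoded on disk, and `parse_change` and the sample values are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

my %OPERATION = (i => 'INSERT', u => 'UPDATE', d => 'DELETE');

# Parses one record in the illustrative "Key: value" layout into a hash ref.
sub parse_change {
    my ($text) = @_;
    my %field;
    for my $line (split /\n/, $text) {
        next unless $line =~ /\A\s*(\w+):\s*(.*?)\s*\z/;
        $field{$1} = $2;
    }
    my $operation = $OPERATION{ $field{Operation} // '' }
        or die "unknown operation\n";
    return {
        seq_id         => $field{SeqId} + 0,
        transaction_id => $field{TransactionId} + 0,
        operation      => $operation,
        table          => $field{TableName},
        data           => decode_json($field{Data}),
    };
}

# Invented sample record.
my $change = parse_change(<<'RECORD');
SeqId: 12345
TransactionId: 67890
Operation: u
TableName: artist
Data: {"id":123,"name":"Example"}
RECORD
print "$change->{operation} on $change->{table}, row id $change->{data}{id}\n";
# prints "UPDATE on artist, row id 123"
```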
Replication Flow
Master Side:
- Edit applied to database
- Triggers capture changes to dbmirror_pending
- Export script generates replication packets
- Packets uploaded to FTP server
Mirror Side:
- Download replication packets from FTP
- Apply packets in sequence order
- Update replication state
- Verify data integrity
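Applying packets "in sequence order" implies that a mirror must stop at the first missing packet rather than skip ahead. A minimal sketch of that planning step follows. `packets_to_apply` is a hypothetical helper that works on plain sequence numbers, not on the real loader's data structures.

```perl
use strict;
use warnings;

# Given the last applied sequence number and the packet numbers available
# for download, returns the packets to apply, in order, stopping at the
# first gap so that nothing is applied out of sequence.
sub packets_to_apply {
    my ($current_seq, @available) = @_;
    my %have = map { $_ => 1 } @available;
    my @plan;
    my $next = $current_seq + 1;
    while ($have{$next}) {
        push @plan, $next;
        $next++;
    }
    return @plan;
}

my @plan = packets_to_apply(100, 103, 101, 102, 105);
print "apply: @plan\n";   # prints "apply: 101 102 103"; 104 is missing, so 105 waits
```

After each packet is applied, the mirror would record the new sequence number as its replication state, so an interrupted run can resume from the right place.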
Packet Export:
# On master
./admin/replication/ExportReplicationPackets
# Generates packets in replication/ directory
# Uploads to FTP server
Packet Import:
# On mirror
./admin/replication/LoadReplicationChanges
# Downloads packets from FTP
# Applies changes to database
Replication Lag
Monitoring: Mirrors track replication lag (time behind master)
Typical Lag: Minutes to hours depending on packet size and network
Status Endpoint: /replication-status shows current replication state
Redis Integration
Architecture
Connection: Single Redis instance, 16 databases (0-15)
Configuration:
# DBDefs.pm
sub REDIS_SERVER { 'localhost:6379' }
sub REDIS_NAMESPACE { 'MB' }
Use Cases
Session Management (DB 1):
- Store user sessions
- 10 hour absolute expiry
- 3 hour idle timeout
Entity Cache (DB 0):
- Cache entity lookups by MBID
- 1 hour TTL
- Invalidate on edit
Search Cache (DB 2):
- Cache search results
- 15 minute TTL
Statistics Cache (DB 3):
- Cache homepage statistics
- 1 hour TTL
Rate Limiting (DB 4):
- Track API request counts
- 1 second sliding window
Pub/Sub (DB 5):
- Real-time notifications
- Edit submission events
- Cache invalidation events
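The two session limits interact: a session ends at whichever comes first, the absolute expiry or the idle timeout. The sketch below computes the remaining lifetime, which could be used as the key's TTL when the session is refreshed. The helper name `session_ttl` and the constant names are assumptions for illustration.

```perl
use strict;
use warnings;

use constant {
    ABSOLUTE_EXPIRY => 10 * 60 * 60,   # 10 hours after login
    IDLE_TIMEOUT    =>  3 * 60 * 60,   # 3 hours after the last request
};

# Returns the number of seconds the session may still live, or 0 if it
# has already expired. All arguments are epoch seconds.
sub session_ttl {
    my ($created_at, $last_seen_at, $now) = @_;
    my $absolute_left = $created_at   + ABSOLUTE_EXPIRY - $now;
    my $idle_left     = $last_seen_at + IDLE_TIMEOUT    - $now;
    my $left = $absolute_left < $idle_left ? $absolute_left : $idle_left;
    return $left > 0 ? $left : 0;
}

# Logged in at t=0, last active at 8h, checked at 9h:
# the absolute limit (1h left) is stricter than the idle limit (2h left).
print session_ttl(0, 8 * 3600, 9 * 3600), "\n";   # prints 3600
```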
Connection Pooling
Library: Redis.pm with connection pooling
Pool Size: 10 connections per worker
Reconnection: Automatic reconnection on connection loss
HTTP Client
LWP::UserAgent
Purpose: HTTP client for external service communication
Configuration:
use LWP::UserAgent;
my $ua = LWP::UserAgent->new(
agent => 'MusicBrainz/1.0 (https://musicbrainz.org)',
timeout => 30,
max_redirect => 5,
);
User-Agent: Always identifies as MusicBrainz with contact URL
Timeout: 30 seconds default
Redirects: Follow up to 5 redirects
SSL Verification: Enabled by default
Rate Limiting
External Services: Respect rate limits via delays
Wikipedia API: 1 request per second (recommended)
Wikidata API: 1 request per second (recommended)
Implementation:
use Time::HiRes qw(sleep time); # import time() as well, so $elapsed has sub-second resolution
my $last_request_time = 0;
sub rate_limited_request {
my ($url) = @_;
my $elapsed = time() - $last_request_time;
if ($elapsed < 1) {
sleep(1 - $elapsed);
}
my $response = $ua->get($url);
$last_request_time = time();
return $response;
}
Error Handling
Retry Logic: Exponential backoff for transient errors
Timeouts: Fail gracefully on timeout
Logging: Log all external service errors to Sentry
Example:
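use Try::Tiny;
my $response;
my $retries = 3;
for my $attempt (1..$retries) {
    # try {} is an anonymous sub, so `last` must not be called inside it.
    # LWP also reports HTTP failures in the response object instead of dying,
    # so the success check and the backoff both belong outside the try block.
    $response = try { $ua->get($url) } catch {
        warn "Request failed (attempt $attempt): $_";
        undef;
    };
    last if $response && $response->is_success;
    sleep(2 ** $attempt) if $attempt < $retries; # Exponential backoff: 2s, then 4s
}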