GraphBrainz Architecture
Schema Construction Strategy
GraphBrainz employs a hybrid schema construction approach:
- Core Schema: Programmatic construction using GraphQL.js constructors
- Extensions: SDL (Schema Definition Language) strings merged via extendSchema()
This strategy provides type safety and runtime flexibility for the core while allowing extensions to use the more ergonomic SDL syntax.
Why Programmatic Construction?
| Benefit | Description |
|---|---|
| Type Safety | Compile-time validation of schema structure |
| Dynamic Fields | Runtime field generation based on configuration |
| AST Inspection | Direct access to GraphQL AST for resolver optimization |
| Extension Points | Programmatic hooks for schema modification |
Entity Type System
GraphBrainz defines 17 entity types in src/types/ (~2000 lines of code):
| Entity Type | File Path | Purpose |
|---|---|---|
| Area | src/types/area.js | Geographic regions |
| Artist | src/types/artist.js | Musicians and groups |
| Collection | src/types/collection.js | User-curated lists |
| Disc | src/types/disc.js | Physical media |
| Event | src/types/event.js | Concerts and performances |
| Instrument | src/types/instrument.js | Musical instruments |
| Label | src/types/label.js | Record labels |
| Place | src/types/place.js | Venues and locations |
| Recording | src/types/recording.js | Audio recordings |
| Release | src/types/release.js | Album releases |
| ReleaseGroup | src/types/release-group.js | Release groupings |
| Series | src/types/series.js | Ordered collections |
| Tag | src/types/tag.js | User-generated tags |
| Track | src/types/track.js | Individual tracks |
| URL | src/types/url.js | External links |
| Work | src/types/work.js | Musical compositions |
| Relationships | src/types/relationships.js | Entity connections |
Each type file exports a GraphQL object type with field definitions, resolvers, and relationship mappings.
Query Type Hierarchy
GraphBrainz exposes four primary query patterns:
1. Lookup Queries
Direct entity retrieval by MusicBrainz ID (MBID).
Supported Entities: 13 types
lookup {
  area(mbid: String!)
  artist(mbid: String!)
  collection(mbid: String!)
  event(mbid: String!)
  instrument(mbid: String!)
  label(mbid: String!)
  place(mbid: String!)
  recording(mbid: String!)
  release(mbid: String!)
  releaseGroup(mbid: String!)
  series(mbid: String!)
  url(mbid: String!)
  work(mbid: String!)
}
2. Browse Queries
Retrieve entities linked to a parent entity with cursor-based pagination.
Supported Entities: 9 types
browse {
  areas(collection: String, first: Int, after: String)
  artists(area: String, collection: String, recording: String, release: String, releaseGroup: String, work: String, first: Int, after: String)
  collections(area: String, artist: String, editor: String, event: String, label: String, place: String, recording: String, release: String, releaseGroup: String, work: String, first: Int, after: String)
  events(area: String, artist: String, collection: String, place: String, first: Int, after: String)
  labels(area: String, collection: String, release: String, first: Int, after: String)
  places(area: String, collection: String, first: Int, after: String)
  recordings(artist: String, collection: String, release: String, first: Int, after: String)
  releases(area: String, artist: String, collection: String, label: String, recording: String, releaseGroup: String, track: String, trackArtist: String, first: Int, after: String)
  releaseGroups(artist: String, collection: String, release: String, first: Int, after: String)
}
3. Search Queries
Lucene-based full-text search across entity types.
Supported Entities: 10 types
search {
  areas(query: String!, first: Int, after: String)
  artists(query: String!, first: Int, after: String)
  events(query: String!, first: Int, after: String)
  instruments(query: String!, first: Int, after: String)
  labels(query: String!, first: Int, after: String)
  places(query: String!, first: Int, after: String)
  recordings(query: String!, first: Int, after: String)
  releases(query: String!, first: Int, after: String)
  releaseGroups(query: String!, first: Int, after: String)
  works(query: String!, first: Int, after: String)
}
4. Node Query (Relay)
Global object identification via Relay-compliant node interface.
node(id: ID!)
Resolver Architecture
GraphBrainz implements a three-tier resolver structure:
Tier 1: Query Resolvers
Entry points for lookup, browse, search, and node queries. Responsibilities:
- Validate input parameters
- Construct MusicBrainz API URLs
- Delegate to DataLoader
- Return raw API responses
Location: src/resolvers/query.js
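The first two responsibilities (validating input and constructing the API URL) might look roughly like the sketch below. It is not the code in src/resolvers/query.js: the helper names isMBID and buildLookupPath are hypothetical. The sketch relies on MBIDs being UUIDs and on the MusicBrainz web service separating multiple inc values with "+".

```javascript
// Hypothetical helpers for illustration; names are not taken from GraphBrainz.

// MBIDs are UUIDs: 8-4-4-4-12 hexadecimal digits.
const MBID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isMBID(value) {
  return typeof value === 'string' && MBID_PATTERN.test(value);
}

// Build the path and query string for a lookup request.
function buildLookupPath(entity, mbid, inc = []) {
  if (!isMBID(mbid)) {
    throw new TypeError(`Invalid MBID: ${mbid}`);
  }
  const path = `/ws/2/${entity}/${mbid}`;
  // Multiple inc values are joined with "+" in MusicBrainz URLs.
  return inc.length > 0 ? `${path}?inc=${inc.join('+')}` : path;
}
```

Rejecting malformed MBIDs before any request is queued avoids spending rate-limited API calls on input that cannot succeed.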
Tier 2: Field Resolvers
Resolve individual fields on entity types. Responsibilities:
- Extract field values from parent object
- Trigger subqueries for related entities
- Apply field-level transformations
- Handle null/undefined cases
Location: src/types/*.js (per entity type)
Tier 3: Subquery Resolvers
Handle nested entity relationships. Responsibilities:
- Inspect GraphQL AST for required fields
- Determine MusicBrainz inc parameters
- Batch related entity requests
- Resolve circular dependencies
Location: src/resolvers/subquery.js
AST Inspection for Query Optimization
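GraphBrainz resolvers inspect the GraphQL AST to determine which MusicBrainz inc parameters are needed. Requesting only the sub-resources that a query selects avoids over-fetching, and requesting all of them in a single call avoids follow-up requests for data that would otherwise be missing.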
Example
GraphQL Query:
{
  lookup {
    artist(mbid: "5b11f4ce-a62d-471e-81fc-a69a8278c7da") {
      name
      releases {
        title
        date
      }
    }
  }
}
AST Inspection Result:
- Detects releases field in selection set
- Adds inc=releases to MusicBrainz API request
- Avoids fetching recordings, works, or other unneeded relationships
MusicBrainz API Call:
GET /ws/2/artist/5b11f4ce-a62d-471e-81fc-a69a8278c7da?inc=releases
Implementation
AST inspection occurs in resolver functions via info.fieldNodes:
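function resolveArtist(parent, args, context, info) {
  // Selections requested beneath the artist field in the incoming query.
  const selections = info.fieldNodes[0].selectionSet.selections;
  const inc = [];
  for (const selection of selections) {
    // Inline fragments have no name, so only plain Field nodes are examined here.
    if (selection.kind !== 'Field') continue;
    if (selection.name.value === 'releases') {
      inc.push('releases');
    }
    if (selection.name.value === 'recordings') {
      inc.push('recordings');
    }
  }
  // The loader key carries both the MBID and the inc values derived above.
  return context.loaders.artist.load({ mbid: args.mbid, inc });
}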
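Queries that use inline fragments or named fragments place some selections one level deeper, so a more general version has to walk those as well. The sketch below is one way to do that, not GraphBrainz's actual code: the function name collectIncludes and the FIELD_TO_INC mapping are assumptions, and the fragments argument is expected to have the shape of info.fragments.

```javascript
// Hypothetical mapping from GraphQL field names to MusicBrainz inc values.
const FIELD_TO_INC = new Map([
  ['releases', 'releases'],
  ['recordings', 'recordings'],
  ['releaseGroups', 'release-groups'],
  ['works', 'works']
]);

// Collect inc values from a selection set, following inline fragments and
// named fragment spreads. `fragments` maps fragment names to definitions.
function collectIncludes(selectionSet, fragments = {}, inc = new Set()) {
  if (!selectionSet) return inc;
  for (const selection of selectionSet.selections) {
    if (selection.kind === 'Field') {
      // `name` is the schema field name even when the query uses an alias.
      const value = FIELD_TO_INC.get(selection.name.value);
      if (value) inc.add(value);
    } else if (selection.kind === 'InlineFragment') {
      collectIncludes(selection.selectionSet, fragments, inc);
    } else if (selection.kind === 'FragmentSpread') {
      const fragment = fragments[selection.name.value];
      if (fragment) collectIncludes(fragment.selectionSet, fragments, inc);
    }
  }
  return inc;
}
```

Inside a resolver this could be called as collectIncludes(info.fieldNodes[0].selectionSet, info.fragments). Using a Set means a field reached through several paths adds its inc value only once.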
Extension System
Extensions modify the schema and context in two phases:
Phase 1: Context Extension
Extensions add custom HTTP clients, DataLoaders, and caches to the GraphQL context.
Interface:
{
  extendContext(context, options) {
    return {
      ...context,
      [extensionName]: {
        client: new ExtensionClient(options),
        loader: new DataLoader(batchFn),
        cache: new LRUCache(options)
      }
    };
  }
}
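With several extensions loaded, each extendContext hook has to be applied in turn. A minimal sketch of that composition follows; it assumes that every hook receives the context returned by the previous one and that extensions without the hook are skipped. The function name applyContextExtensions is hypothetical.

```javascript
// Hypothetical helper: fold every extension's extendContext hook over a base context.
function applyContextExtensions(baseContext, extensions, options = {}) {
  return extensions.reduce((context, extension) => {
    // Extensions that only extend the schema may not define this hook.
    if (typeof extension.extendContext !== 'function') return context;
    return extension.extendContext(context, options);
  }, baseContext);
}
```

Because each hook returns a new object built with spread syntax, the base context is never mutated and later extensions can see what earlier ones added.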
Phase 2: Schema Extension
Extensions add fields to existing types or define new types via SDL.
Interface:
{
  extendSchema(schema, options) {
    const typeDefs = `
      extend type Artist {
        fanArt: FanArtImages
      }
      type FanArtImages {
        backgrounds: [FanArtImage]
        logos: [FanArtImage]
      }
    `;
    const resolvers = {
      Artist: {
        fanArt(artist, args, context) {
          return context.fanart.loader.load(artist.id);
        }
      }
    };
    return extendSchema(schema, { typeDefs, resolvers });
  }
}
Extension Loading
Extensions are loaded via environment variable or programmatic options:
Environment Variable:
GRAPHBRAINZ_EXTENSIONS="cover-art-archive,fanart,mediawiki,theaudiodb"
Programmatic:
import { middleware } from 'graphbrainz';
import lastfm from 'graphbrainz-extension-lastfm';
app.use('/graphql', middleware({
  extensions: [lastfm]
}));
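For the environment-variable route, the comma-separated value has to be split into individual extension names before anything can be loaded. The sketch below shows only that parsing step. The function name parseExtensionList is hypothetical, and how GraphBrainz resolves each name to a module is not covered here.

```javascript
// Hypothetical helper: split a comma-separated extension list,
// trimming whitespace and dropping empty entries.
function parseExtensionList(value) {
  if (!value) return [];
  return value
    .split(',')
    .map(name => name.trim())
    .filter(name => name.length > 0);
}

// An unset variable yields an empty list.
const extensionNames = parseExtensionList(process.env.GRAPHBRAINZ_EXTENSIONS);
```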
DataLoader Integration
GraphBrainz uses DataLoader for request batching and deduplication.
Per-Request Batching
Each GraphQL request receives a fresh DataLoader instance. This ensures:
- Requests within a single query are batched
- Duplicate requests are deduplicated
- Cache is scoped to request lifecycle
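The three properties above can be illustrated without the DataLoader library itself. The TinyLoader class below is a deliberately simplified stand-in written for this document: it only handles primitive keys and a single microtask-based batch window, and createContext is a hypothetical context factory.

```javascript
// Simplified stand-in for DataLoader, for illustration only.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.cache = new Map(); // key -> Promise; lives only as long as this instance
    this.pending = [];      // entries waiting for the next batch
  }

  load(key) {
    // Deduplication: a repeated key returns the promise created earlier.
    if (this.cache.has(key)) return this.cache.get(key);
    const promise = new Promise((resolve, reject) => {
      this.pending.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);
    // Batching: the first key of a batch schedules a single dispatch.
    if (this.pending.length === 1) {
      queueMicrotask(() => this.dispatch());
    }
    return promise;
  }

  async dispatch() {
    const batch = this.pending;
    this.pending = [];
    try {
      const results = await this.batchFn(batch.map(entry => entry.key));
      batch.forEach((entry, index) => entry.resolve(results[index]));
    } catch (error) {
      batch.forEach(entry => entry.reject(error));
    }
  }
}

// Request scoping: call this once per incoming GraphQL request.
function createContext(batchArtists) {
  return { loaders: { artist: new TinyLoader(batchArtists) } };
}
```

Because createContext builds a new loader each time, nothing cached for one request can leak into another; sharing across requests is left to the LRU layer.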
Batch Functions
Each entity type has a batch function that:
- Receives array of keys (MBIDs or query parameters)
- Groups keys by API endpoint
- Makes batched HTTP requests
- Returns array of results in same order as keys
Example:
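// Assumes `got` here is an HTTP client bound to the MusicBrainz base URL that exposes parsed JSON as `body`.
async function batchArtists(keys) {
  // This simplified example issues one request per key, concurrently.
  // Promise.all preserves input order, which DataLoader requires of the result array.
  const results = await Promise.all(
    keys.map(key =>
      got(`/ws/2/artist/${key.mbid}?inc=${key.inc.join(',')}`)
    )
  );
  return results.map(r => r.body);
}
// Keys are objects, so a cacheKeyFn is needed for equal keys to be deduplicated.
const artistLoader = new DataLoader(batchArtists, {
  cacheKeyFn: key => `${key.mbid}:${key.inc.join(',')}`
});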
LRU Cache Layer
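A shared LRU cache sits above DataLoader and persists across requests, so repeated queries for the same data do not reach the MusicBrainz API again until the entry expires or is evicted.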
Configuration
| Parameter | Environment Variable | Default |
|---|---|---|
| Size | GRAPHBRAINZ_CACHE_SIZE | 8192 items |
| TTL | GRAPHBRAINZ_CACHE_TTL | 86400000 ms (1 day) |
Cache Key Strategy
Cache keys combine entity type, MBID, and inc parameters:
artist:5b11f4ce-a62d-471e-81fc-a69a8278c7da:releases,recordings
This ensures different queries for the same entity don't collide.
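A key of that shape can be built with a small helper. The sketch below assumes a function named cacheKey, which is not taken from the GraphBrainz source, and keeps the inc values in the order given so that it reproduces the example key above.

```javascript
// Hypothetical helper: build a cache key from entity type, MBID, and inc values.
function cacheKey(entityType, mbid, inc = []) {
  // inc values keep their given order, so different orderings give different keys.
  return [entityType, mbid, inc.join(',')].join(':');
}
```

One design choice is left open here: sorting the inc values before joining would let requests that differ only in inc order share a cache entry.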
Per-Extension Caches
Each extension maintains its own LRU cache with separate configuration:
- FANART_CACHE_SIZE / FANART_CACHE_TTL
- THEAUDIODB_CACHE_SIZE / THEAUDIODB_CACHE_TTL
- COVERART_CACHE_SIZE / COVERART_CACHE_TTL
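Since every cache follows the same PREFIX_CACHE_SIZE / PREFIX_CACHE_TTL naming, one helper can read all of them. The sketch below is an assumption about how that might be done. The name readCacheConfig is hypothetical, and falling back to the core defaults for extensions is an illustrative choice rather than documented behavior.

```javascript
// Core defaults from the configuration table: 8192 items, 1 day in milliseconds.
const CORE_CACHE_DEFAULTS = { size: 8192, ttl: 86400000 };

// Hypothetical helper: read <PREFIX>_CACHE_SIZE and <PREFIX>_CACHE_TTL,
// falling back to the given defaults when a value is missing or not a positive integer.
function readCacheConfig(prefix, defaults = CORE_CACHE_DEFAULTS, env = process.env) {
  const read = (name, fallback) => {
    const parsed = Number.parseInt(env[name] ?? '', 10);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    size: read(`${prefix}_CACHE_SIZE`, defaults.size),
    ttl: read(`${prefix}_CACHE_TTL`, defaults.ttl)
  };
}
```

For example, readCacheConfig('FANART') would pick up FANART_CACHE_SIZE and FANART_CACHE_TTL from the process environment.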
Rate Limiting
A custom priority queue implementation keeps outgoing requests within each API's rate limits.
MusicBrainz Rate Limits
- Limit: 5 requests per 5.5 seconds
- Strategy: Token bucket with 5 tokens, refill rate 0.909 tokens/second
- Concurrency: 1 (sequential requests)
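The refill rate follows directly from the limit: 5 tokens spread over 5.5 seconds is about 0.909 tokens per second, or one token every 1.1 seconds. The sketch below spells out that arithmetic. The constant and function names are chosen for illustration.

```javascript
const MB_LIMIT = 5;              // requests allowed per window
const MB_INTERVAL_SECONDS = 5.5; // window length

// Tokens restored per second: 5 / 5.5, roughly 0.909.
const MB_REFILL_PER_SECOND = MB_LIMIT / MB_INTERVAL_SECONDS;

// Seconds until `needed` tokens are available, given the current token count.
function secondsUntilAvailable(tokens, needed = 1) {
  return Math.max(0, (needed - tokens) / MB_REFILL_PER_SECOND);
}
```

With an empty bucket, secondsUntilAvailable(0) is about 1.1, so a sustained stream of requests settles at roughly one request every 1.1 seconds after the initial burst of five.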
Extension Rate Limits
- Limit: 10 requests per second (default)
- Strategy: Token bucket with 10 tokens, refill rate 10 tokens/second
- Concurrency: 5 (parallel requests)
Priority Queue
Requests are queued with priority levels:
- High: Lookup queries (direct MBID access)
- Medium: Browse queries (relationship traversal)
- Low: Search queries (full-text search)
When the rate limit is reached, queued requests are released in priority order, highest first.
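A queue with these three levels can be kept very simple. The sketch below is one possible shape, not the GraphBrainz implementation: it assumes first-in, first-out ordering within a level and treats unknown priorities as medium.

```javascript
// Priority levels, highest first.
const PRIORITY_LEVELS = ['high', 'medium', 'low'];

// Illustrative queue: one FIFO bucket per priority level.
class PriorityQueue {
  constructor() {
    this.buckets = new Map(PRIORITY_LEVELS.map(level => [level, []]));
  }

  // Total number of queued items across all levels.
  get length() {
    let total = 0;
    for (const bucket of this.buckets.values()) total += bucket.length;
    return total;
  }

  // Items carry their own `priority`; unknown values are treated as medium.
  enqueue(item) {
    const bucket = this.buckets.get(item.priority) ?? this.buckets.get('medium');
    bucket.push(item);
  }

  // Remove and return the oldest item from the highest non-empty level.
  dequeue() {
    for (const level of PRIORITY_LEVELS) {
      const bucket = this.buckets.get(level);
      if (bucket.length > 0) return bucket.shift();
    }
    return undefined;
  }
}
```

Separate buckets make both operations cheap for a fixed number of levels, at the cost of allowing a steady stream of lookups to delay queued searches indefinitely.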
Implementation
Location: src/rate-limit.js
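// PriorityQueue is assumed to expose enqueue(), dequeue(), and length,
// and to dequeue higher-priority entries first.
class RateLimiter {
  constructor(options) {
    this.tokens = options.limit;
    this.limit = options.limit;
    // Tokens restored per unit of options.interval (for example, per second).
    this.refillRate = options.limit / options.interval;
    this.queue = new PriorityQueue();
  }

  async acquire(priority = 'medium') {
    // Tokens can be fractional after a refill, so require a whole token.
    if (this.tokens >= 1) {
      this.tokens--;
      return;
    }
    // No token available: wait until refill() releases this request.
    return new Promise(resolve => {
      this.queue.enqueue({ resolve, priority });
    });
  }

  // Assumed to be called once per interval unit, for example from a timer.
  refill() {
    this.tokens = Math.min(this.limit, this.tokens + this.refillRate);
    while (this.tokens >= 1 && this.queue.length > 0) {
      const { resolve } = this.queue.dequeue();
      this.tokens--;
      resolve();
    }
  }
}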
File Structure
src/
├── index.js # Entry point, start() function
├── schema.js # Schema construction
├── context.js # Context factory
├── types/ # Entity type definitions
│ ├── area.js
│ ├── artist.js
│ ├── collection.js
│ ├── disc.js
│ ├── event.js
│ ├── instrument.js
│ ├── label.js
│ ├── place.js
│ ├── recording.js
│ ├── release.js
│ ├── release-group.js
│ ├── series.js
│ ├── tag.js
│ ├── track.js
│ ├── url.js
│ ├── work.js
│ └── relationships.js
├── resolvers/ # Resolver implementations
│ ├── query.js
│ └── subquery.js
├── loaders/ # DataLoader batch functions
│ └── musicbrainz.js
├── rate-limit.js # Rate limiter implementation
├── client.js # Base HTTP client
└── extensions/ # Built-in extensions
    ├── cover-art-archive/
    ├── fanart/
    ├── mediawiki/
    └── theaudiodb/
Relay Compliance
GraphBrainz implements the Relay specification for cursor-based pagination:
Connection Pattern
All list fields return connection types:
type ArtistConnection {
  edges: [ArtistEdge]
  nodes: [Artist]
  pageInfo: PageInfo!
  totalCount: Int
}
type ArtistEdge {
  node: Artist
  cursor: String!
}
type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
  startCursor: String
  endCursor: String
}
Pagination Arguments
- first: Int - Number of items to return
- after: String - Cursor for pagination
- last: Int - Number of items from end (not implemented)
- before: String - Cursor for reverse pagination (not implemented)
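The MusicBrainz API pages with limit and offset parameters, so first and after have to be translated before a request is made and the response wrapped in a connection afterwards. The sketch below shows one way to do both. The cursor format (base64 of "offset:<n>"), the default page size of 25, and all function names are assumptions made for illustration.

```javascript
// Assumed cursor format: base64 of "offset:<n>", where n is the item's zero-based index.
function encodeCursor(offset) {
  return Buffer.from(`offset:${offset}`).toString('base64');
}

function decodeCursor(cursor) {
  const decoded = Buffer.from(cursor, 'base64').toString('utf8');
  const match = /^offset:(\d+)$/.exec(decoded);
  if (!match) throw new Error(`Invalid cursor: ${cursor}`);
  return Number(match[1]);
}

// Translate Relay arguments into limit/offset; `after` points at the last item already seen.
function toLimitOffset({ first = 25, after } = {}) {
  return { limit: first, offset: after ? decodeCursor(after) + 1 : 0 };
}

// Wrap one page of items in the connection shape shown above.
function toConnection(items, offset, totalCount) {
  const edges = items.map((node, index) => ({
    node,
    cursor: encodeCursor(offset + index)
  }));
  return {
    edges,
    nodes: items,
    totalCount,
    pageInfo: {
      hasNextPage: offset + items.length < totalCount,
      hasPreviousPage: offset > 0,
      startCursor: edges.length > 0 ? edges[0].cursor : null,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null
    }
  };
}
```

A client fetches the next page by passing pageInfo.endCursor back as after, which toLimitOffset turns into the offset of the first unseen item.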
Node Interface
Global object identification via node(id: ID!) query:
interface Node {
  id: ID!
}
All entity types implement the Node interface with globally unique IDs.
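An MBID alone does not say which entity type it belongs to, so a global ID has to carry the type as well. A common Relay convention is to base64-encode "TypeName:id". The sketch below follows that convention as an assumption, with hypothetical function names, and is not confirmed as the encoding GraphBrainz uses.

```javascript
// Assumed encoding: base64 of "<TypeName>:<mbid>".
function toGlobalId(typeName, mbid) {
  return Buffer.from(`${typeName}:${mbid}`).toString('base64');
}

// Split a global ID back into its type name and MBID.
function fromGlobalId(globalId) {
  const decoded = Buffer.from(globalId, 'base64').toString('utf8');
  const separator = decoded.indexOf(':');
  if (separator === -1) {
    throw new Error(`Invalid global ID: ${globalId}`);
  }
  return {
    typeName: decoded.slice(0, separator),
    mbid: decoded.slice(separator + 1)
  };
}
```

A node resolver can then decode the ID, use typeName to pick the matching lookup, and pass mbid to it.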