
GraphBrainz Integrations

Integration Architecture

GraphBrainz integrates with five external APIs. MusicBrainz is the core data source; the other four are built-in extensions that plug in through a unified extension system:

| Integration | Type | Authentication | Rate Limit |
| --- | --- | --- | --- |
| MusicBrainz | Core | None | 5 req/5.5 s |
| Cover Art Archive | Built-in | None | 10 req/s |
| fanart.tv | Built-in | API key | 10 req/s |
| MediaWiki | Built-in | None | 10 req/s |
| TheAudioDB | Built-in | API key | 10 req/s |

External extensions (separate npm packages):

| Extension | Package | Authentication |
| --- | --- | --- |
| Last.fm | graphbrainz-extension-lastfm | API key |
| Discogs | graphbrainz-extension-discogs | API key |
| Spotify | graphbrainz-extension-spotify | OAuth |

MusicBrainz REST API

Overview

| Property | Value |
| --- | --- |
| Base URL | http://musicbrainz.org/ws/2/ |
| Protocol | REST (JSON) |
| Authentication | None |
| Rate Limit | 5 requests per 5.5 seconds |
| Documentation | https://musicbrainz.org/doc/MusicBrainz_API |

Operations

Lookup

Retrieve a single entity by its MBID.

Endpoint Pattern:

GET /ws/2/{entity}/{mbid}?inc={relationships}&fmt=json

Supported Entities:

  • area, artist, collection, event, instrument, label, place, recording, release, release-group, series, url, work

Example:

GET /ws/2/artist/5b11f4ce-a62d-471e-81fc-a69a8278c7da?inc=releases+recordings&fmt=json
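
The endpoint pattern above can be assembled programmatically. The following is a minimal sketch, not GraphBrainz code: `buildLookupUrl`, `LOOKUP_ENTITIES`, and `MBID_PATTERN` are hypothetical names introduced for illustration, and the default base URL is taken from the overview table.

```javascript
// Hypothetical helper (not part of GraphBrainz): builds a MusicBrainz lookup URL.
const LOOKUP_ENTITIES = new Set([
  'area', 'artist', 'collection', 'event', 'instrument', 'label', 'place',
  'recording', 'release', 'release-group', 'series', 'url', 'work'
]);

// MBIDs are UUIDs: 8-4-4-4-12 hexadecimal digits.
const MBID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function buildLookupUrl(entity, mbid, inc = [], baseUrl = 'http://musicbrainz.org/ws/2/') {
  if (!LOOKUP_ENTITIES.has(entity)) {
    throw new Error(`Unsupported entity type: ${entity}`);
  }
  if (!MBID_PATTERN.test(mbid)) {
    throw new Error(`Not a valid MBID: ${mbid}`);
  }
  const url = new URL(`${entity}/${mbid}`, baseUrl);
  if (inc.length > 0) {
    // URLSearchParams serializes the space separator as "+", which produces
    // the "inc=releases+recordings" form used in the example above.
    url.searchParams.set('inc', inc.join(' '));
  }
  url.searchParams.set('fmt', 'json');
  return url.toString();
}
```

Calling `buildLookupUrl('artist', '5b11f4ce-a62d-471e-81fc-a69a8278c7da', ['releases', 'recordings'])` reproduces the example request. Rejecting unknown entity types and malformed MBIDs up front avoids spending a rate-limited request on a call that cannot succeed.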

Browse

Retrieve entities linked to a parent entity, paginated with limit and offset.

Endpoint Pattern:

GET /ws/2/{entity}?{parent-entity}={mbid}&limit={limit}&offset={offset}&inc={relationships}&fmt=json

Example:

GET /ws/2/release?artist=5b11f4ce-a62d-471e-81fc-a69a8278c7da&limit=25&offset=0&fmt=json
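
Browse results are paged, so a client typically issues one request per offset. The sketch below shows one way to build the browse URL and enumerate offsets. `buildBrowseUrl` and `pageOffsets` are hypothetical names, and the total count is assumed to come from the first response. The 1 to 100 bound on `limit` follows the MusicBrainz API documentation.

```javascript
// Hypothetical helpers (not part of GraphBrainz) for paging through browse results.
function buildBrowseUrl(entity, parentEntity, mbid, { limit = 25, offset = 0 } = {},
                        baseUrl = 'http://musicbrainz.org/ws/2/') {
  // MusicBrainz only accepts limits between 1 and 100 (default 25).
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError(`limit must be an integer between 1 and 100, got ${limit}`);
  }
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError(`offset must be a non-negative integer, got ${offset}`);
  }
  const url = new URL(entity, baseUrl);
  url.searchParams.set(parentEntity, mbid);
  url.searchParams.set('limit', String(limit));
  url.searchParams.set('offset', String(offset));
  url.searchParams.set('fmt', 'json');
  return url.toString();
}

// Offsets needed to cover `total` results in pages of `limit` items.
function pageOffsets(total, limit = 25) {
  const offsets = [];
  for (let offset = 0; offset < total; offset += limit) {
    offsets.push(offset);
  }
  return offsets;
}
```

For an artist with 60 releases and the default page size, `pageOffsets(60)` yields three offsets (0, 25, 50), which means three sequential requests under the rate limit.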

Search

Lucene-based full-text search.

Endpoint Pattern:

GET /ws/2/{entity}?query={lucene-query}&limit={limit}&offset={offset}&fmt=json

Example:

GET /ws/2/artist?query=artist:Radiohead%20AND%20country:GB&limit=25&fmt=json
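
User-supplied values need escaping before they are placed in a Lucene query, because characters such as `:` and `/` have syntactic meaning. The following sketch builds a fielded query like the one in the example. `escapeLucene` and `buildSearchQuery` are hypothetical names, and the escaped character set follows the Lucene classic query parser.

```javascript
// Characters that the Lucene classic query parser treats as syntax.
const LUCENE_SPECIAL = /[+\-!(){}\[\]^"~*?:\\\/&|]/g;

// Prefix every special character with a backslash.
function escapeLucene(value) {
  return String(value).replace(LUCENE_SPECIAL, '\\$&');
}

// Hypothetical helper: joins field/value pairs with AND, quoting values
// that contain whitespace so they are matched as a phrase.
function buildSearchQuery(fields) {
  return Object.entries(fields)
    .map(([field, value]) => {
      const escaped = escapeLucene(value);
      return /\s/.test(escaped) ? `${field}:"${escaped}"` : `${field}:${escaped}`;
    })
    .join(' AND ');
}

// The result still has to be URL-encoded before it goes into ?query=...
const query = buildSearchQuery({ artist: 'Radiohead', country: 'GB' });
const encodedQuery = encodeURIComponent(query);
```

Here `query` is `artist:Radiohead AND country:GB`, matching the example request. An artist name such as `AC/DC` would be sent as `AC\/DC` rather than being parsed as the start of a regular expression.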

Rate Limiting

Policy: 5 requests per 5.5 seconds (0.909 req/s average)

Implementation:

const musicbrainzLimiter = new RateLimiter({
  limit: 5,
  interval: 5500,
  concurrency: 1
});

Compliance Strategy:

  • Token bucket algorithm
  • Sequential requests (no parallelization)
  • Priority queue for request ordering
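
To make the token bucket idea concrete, here is a minimal sketch with continuous refill. It is not the `RateLimiter` class shown above: `TokenBucket`, `tryAcquire`, `msUntilNextToken`, and the injectable `now` clock are hypothetical names chosen so the behaviour can be checked without real timers.

```javascript
// Hypothetical token bucket: holds up to `limit` tokens and refills a full
// bucket over `interval` milliseconds.
class TokenBucket {
  constructor({ limit, interval, now = Date.now }) {
    this.limit = limit;        // bucket capacity (maximum burst size)
    this.interval = interval;  // milliseconds needed to refill a full bucket
    this.now = now;            // injectable clock, for testing
    this.tokens = limit;       // start with a full bucket
    this.lastRefill = now();
  }

  // Add the tokens earned since the last call, capped at the capacity.
  refill() {
    const current = this.now();
    const earned = ((current - this.lastRefill) * this.limit) / this.interval;
    this.tokens = Math.min(this.limit, this.tokens + earned);
    this.lastRefill = current;
  }

  // Consume one token if available; returns whether a request may be sent now.
  tryAcquire() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // Milliseconds until one token is available (0 if one is ready).
  msUntilNextToken() {
    this.refill();
    if (this.tokens >= 1) {
      return 0;
    }
    return Math.ceil(((1 - this.tokens) * this.interval) / this.limit);
  }
}
```

With `limit: 5` and `interval: 5500`, the bucket allows a burst of five requests and then one more every 1100 ms. A continuous-refill bucket bounds the burst size and the long-run average. It is not a strict "at most 5 in any 5.5 second window" guarantee, so a stricter policy would need a sliding-window check.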

Local Mirror Support

GraphBrainz supports local MusicBrainz mirrors to eliminate rate limits:

MUSICBRAINZ_BASE_URL=http://localhost:5000/ws/2/

Benefits:

  • No rate limiting
  • Reduced latency
  • Offline operation
  • Full dataset access

Setup: See https://musicbrainz.org/doc/MusicBrainz_Server/Setup

Cover Art Archive

Overview

| Property | Value |
| --- | --- |
| Base URL | http://coverartarchive.org/ |
| Protocol | REST (JSON) |
| Authentication | None |
| Rate Limit | 10 requests per second |
| Documentation | https://musicbrainz.org/doc/Cover_Art_Archive/API |

Purpose

Provides album artwork and thumbnails for MusicBrainz releases.

Schema Extension

Adds coverArtArchive field to Release type:

extend type Release {
  coverArtArchive: CoverArtArchiveRelease
}

type CoverArtArchiveRelease {
  front: Boolean
  back: Boolean
  artwork: Boolean
  count: Int
  release: String
  images: [CoverArtArchiveImage]
}

type CoverArtArchiveImage {
  fileID: String
  image: String
  thumbnails: CoverArtArchiveThumbnails
  front: Boolean
  back: Boolean
  types: [String]
  edit: Int
  approved: Boolean
  comment: String
}

type CoverArtArchiveThumbnails {
  small: String   # 250px
  large: String   # 500px
}

API Endpoints

Release Cover Art

Endpoint:

GET /release/{mbid}

Response:

{
  "images": [
    {
      "id": "12345",
      "image": "http://coverartarchive.org/release/{mbid}/12345.jpg",
      "thumbnails": {
        "small": "http://coverartarchive.org/release/{mbid}/12345-250.jpg",
        "large": "http://coverartarchive.org/release/{mbid}/12345-500.jpg"
      },
      "front": true,
      "back": false,
      "types": ["Front"],
      "approved": true
    }
  ],
  "release": "http://musicbrainz.org/release/{mbid}"
}

Front Cover (Direct)

Endpoint:

GET /release/{mbid}/front
GET /release/{mbid}/front-250  # Small thumbnail
GET /release/{mbid}/front-500  # Large thumbnail

Returns image binary (JPEG/PNG).
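
The JSON response has to be reshaped to fit the `CoverArtArchiveRelease` type from the schema extension. The following is a sketch of one plausible mapping, not the extension's actual code. `toCoverArtArchiveRelease` is a hypothetical name, and deriving `front`, `back`, `artwork`, and `count` from the image list is an assumption.

```javascript
// Hypothetical mapper from a Cover Art Archive /release/{mbid} response to
// the CoverArtArchiveRelease shape described in the schema extension.
function toCoverArtArchiveRelease(response) {
  const images = (response.images || []).map(image => ({
    fileID: String(image.id),           // the API calls this field "id"
    image: image.image,
    thumbnails: {
      small: image.thumbnails?.small ?? null,  // 250px
      large: image.thumbnails?.large ?? null   // 500px
    },
    front: Boolean(image.front),
    back: Boolean(image.back),
    types: image.types || [],
    edit: image.edit ?? null,
    approved: Boolean(image.approved),
    comment: image.comment ?? null
  }));

  return {
    // Assumption: the summary flags are derived from the individual images.
    front: images.some(image => image.front),
    back: images.some(image => image.back),
    artwork: images.length > 0,
    count: images.length,
    release: response.release ?? null,
    images
  };
}
```

Fields that are absent in the response, such as `edit` and `comment` in the sample above, become `null`, which fits the nullable fields in the schema.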

Configuration

| Environment Variable | Default | Purpose |
| --- | --- | --- |
| COVERART_CACHE_SIZE | 8192 | LRU cache size |
| COVERART_CACHE_TTL | 86400000 | Cache TTL in milliseconds (1 day) |

Example Query

{
  lookup {
    release(mbid: "f0c8b1e5-c3b6-46c0-9641-25fd3c00e56a") {
      title
      coverArtArchive {
        front
        back
        count
        images {
          image
          thumbnails {
            large
          }
          types
          front
        }
      }
    }
  }
}

Implementation

File: src/extensions/cover-art-archive/index.js

Client: Custom HTTP client extending base Client class

Resolver:

Release: {
  coverArtArchive(release, args, context) {
    return context.coverArtArchive.loader.load(release.id);
  }
}

fanart.tv

Overview

| Property | Value |
| --- | --- |
| Base URL | http://webservice.fanart.tv/v3/ |
| Protocol | REST (JSON) |
| Authentication | API key (required) |
| Rate Limit | 10 requests per second |
| Documentation | https://fanart.tv/api-docs/ |

Purpose

Provides high-quality artist images: backgrounds, banners, logos, thumbnails.

Schema Extension

Adds fanArt field to Artist type:

extend type Artist {
  fanArt: FanArtImages
}

type FanArtImages {
  backgrounds: [FanArtImage]
  banners: [FanArtImage]
  logos: [FanArtLabelImage]
  logosHD: [FanArtLabelImage]
  thumbnails: [FanArtImage]
}

type FanArtImage {
  imageID: String
  url: String
  likes: Int
}

type FanArtLabelImage {
  imageID: String
  url: String
  likes: Int
  color: String
}

API Endpoints

Artist Images

Endpoint:

GET /music/{mbid}?api_key={key}

Response:

{
  "name": "Radiohead",
  "mbid_id": "5b11f4ce-a62d-471e-81fc-a69a8278c7da",
  "artistbackground": [
    {
      "id": "12345",
      "url": "https://assets.fanart.tv/fanart/music/5b11f4ce.../artistbackground/...",
      "likes": "42"
    }
  ],
  "hdmusiclogo": [
    {
      "id": "67890",
      "url": "https://assets.fanart.tv/fanart/music/5b11f4ce.../hdmusiclogo/...",
      "likes": "128",
      "colour": "FFFFFF"
    }
  ],
  "artistthumb": [...],
  "musicbanner": [...]
}
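
The response differs from the GraphQL types in three ways: the keys are named differently, `likes` arrives as a string, and the logo colour is spelled `colour`. The sketch below shows one way to normalize it. `toFanArtImages` and its helpers are hypothetical names, and the `musiclogo` key for non-HD logos is an assumption, since the sample response above does not show it.

```javascript
// Hypothetical mappers from a fanart.tv artist response to the FanArtImages shape.
function toFanArtImage(raw) {
  return {
    imageID: raw.id,
    url: raw.url,
    // The API returns likes as a string; the schema declares an Int.
    likes: Number.parseInt(raw.likes, 10) || 0
  };
}

function toFanArtLabelImage(raw) {
  // The API uses the British spelling "colour"; the schema field is "color".
  return { ...toFanArtImage(raw), color: raw.colour ?? null };
}

function toFanArtImages(response) {
  const list = key => response[key] || [];
  return {
    backgrounds: list('artistbackground').map(toFanArtImage),
    banners: list('musicbanner').map(toFanArtImage),
    logos: list('musiclogo').map(toFanArtLabelImage),   // assumed key name
    logosHD: list('hdmusiclogo').map(toFanArtLabelImage),
    thumbnails: list('artistthumb').map(toFanArtImage)
  };
}
```

Missing image categories map to empty lists rather than `null`, so a query for `banners` on an artist without banners returns `[]`.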

Configuration

| Environment Variable | Required | Default | Purpose |
| --- | --- | --- | --- |
| FANART_API_KEY | Yes | - | API authentication |
| FANART_CACHE_SIZE | No | 8192 | LRU cache size |
| FANART_CACHE_TTL | No | 86400000 | Cache TTL in milliseconds (1 day) |

Example Query

{
  lookup {
    artist(mbid: "5b11f4ce-a62d-471e-81fc-a69a8278c7da") {
      name
      fanArt {
        backgrounds {
          url
          likes
        }
        logosHD {
          url
          color
          likes
        }
        banners {
          url
        }
      }
    }
  }
}

Implementation

File: src/extensions/fanart/index.js

Client: FanArtClient extending base Client

Resolver:

Artist: {
  fanArt(artist, args, context) {
    return context.fanart.loader.load(artist.id);
  }
}

MediaWiki

Overview

| Property | Value |
| --- | --- |
| Base URL | https://musicbrainz.org/w/api.php |
| Protocol | MediaWiki API |
| Authentication | None |
| Rate Limit | 10 requests per second |
| Documentation | https://www.mediawiki.org/wiki/API |

Purpose

Retrieves images from MusicBrainz Wiki for artists, including EXIF metadata and license information.

Schema Extension

Adds mediaWikiImages field to Artist type:

extend type Artist {
  mediaWikiImages: [MediaWikiImage]
}

type MediaWikiImage {
  url: String
  descriptionURL: String
  title: String
  user: String
  size: Int
  width: Int
  height: Int
  canonicalTitle: String
  objectName: String
  descriptionShortURL: String
  metadata: [MediaWikiImageMetadata]
}

type MediaWikiImageMetadata {
  name: String
  value: String
}

API Endpoints

Endpoint:

GET /w/api.php?action=query&titles={artist-name}&prop=images&format=json

Response:

{
  "query": {
    "pages": {
      "12345": {
        "title": "Radiohead",
        "images": [
          {
            "title": "File:Radiohead.jpg"
          }
        ]
      }
    }
  }
}

Image Info

Endpoint:

GET /w/api.php?action=query&titles=File:{filename}&prop=imageinfo&iiprop=url|size|metadata|user&format=json

Response:

{
  "query": {
    "pages": {
      "67890": {
        "imageinfo": [
          {
            "url": "https://musicbrainz.org/w/images/...",
            "descriptionurl": "https://musicbrainz.org/w/File:...",
            "width": 1200,
            "height": 800,
            "size": 245678,
            "user": "WikiUser",
            "metadata": [
              { "name": "DateTime", "value": "2020:01:15 10:30:00" },
              { "name": "Artist", "value": "Photographer Name" }
            ]
          }
        ]
      }
    }
  }
}
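
MediaWiki keys `query.pages` by page ID, so the image info has to be collected across an object rather than an array. The following sketch flattens the response into the `MediaWikiImage` shape. `toMediaWikiImages` is a hypothetical name, and only the fields visible in the sample response are mapped.

```javascript
// Hypothetical mapper from an imageinfo response to a list of MediaWikiImage objects.
function toMediaWikiImages(response) {
  const pages = (response.query && response.query.pages) || {};
  // Pages are keyed by page ID, so iterate over the object's values.
  return Object.values(pages).flatMap(page =>
    (page.imageinfo || []).map(info => ({
      url: info.url,
      descriptionURL: info.descriptionurl,  // the API uses all-lowercase names
      user: info.user,
      size: info.size,
      width: info.width,
      height: info.height,
      // Metadata values may be numbers or strings; the schema declares String.
      metadata: (info.metadata || []).map(({ name, value }) => ({
        name,
        value: String(value)
      }))
    }))
  );
}
```

Pages without an `imageinfo` entry contribute nothing to the result, so a missing file yields an empty list instead of an error.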

Configuration

| Environment Variable | Default | Purpose |
| --- | --- | --- |
| MEDIAWIKI_CACHE_SIZE | 8192 | LRU cache size |
| MEDIAWIKI_CACHE_TTL | 86400000 | Cache TTL in milliseconds (1 day) |

Example Query

{
  lookup {
    artist(mbid: "5b11f4ce-a62d-471e-81fc-a69a8278c7da") {
      name
      mediaWikiImages {
        url
        width
        height
        user
        metadata {
          name
          value
        }
      }
    }
  }
}

Implementation

File: src/extensions/mediawiki/index.js

Client: MediaWikiClient extending base Client

Resolver:

Artist: {
  mediaWikiImages(artist, args, context) {
    return context.mediawiki.loader.load(artist.name);
  }
}

TheAudioDB

Overview

| Property | Value |
| --- | --- |
| Base URL | http://www.theaudiodb.com/api/v1/json/ |
| Protocol | REST (JSON) |
| Authentication | API key (required) |
| Rate Limit | 10 requests per second |
| Documentation | https://www.theaudiodb.com/api_guide.php |

Purpose

Provides artist biographies, logos, and additional metadata.

Schema Extension

Adds theAudioDB field to Artist type:

extend type Artist {
  theAudioDB: TheAudioDBArtist
}

type TheAudioDBArtist {
  artistID: String
  biography: String
  biographyEN: String
  memberCount: Int
  banner: String
  logo: String
  thumbnail: String
  fanArt: [TheAudioDBImage]
}

type TheAudioDBImage {
  url: String
}

API Endpoints

Artist by MBID

Endpoint:

GET /{api-key}/artist-mb.php?i={mbid}

Response:

{
  "artists": [
    {
      "idArtist": "111239",
      "strArtist": "Radiohead",
      "strArtistMBID": "5b11f4ce-a62d-471e-81fc-a69a8278c7da",
      "strBiographyEN": "Radiohead are an English rock band...",
      "intMembers": "5",
      "strArtistBanner": "https://www.theaudiodb.com/images/media/artist/banner/...",
      "strArtistLogo": "https://www.theaudiodb.com/images/media/artist/logo/...",
      "strArtistThumb": "https://www.theaudiodb.com/images/media/artist/thumb/...",
      "strArtistFanart": "https://www.theaudiodb.com/images/media/artist/fanart/...",
      "strArtistFanart2": "https://www.theaudiodb.com/images/media/artist/fanart2/...",
      "strArtistFanart3": "https://www.theaudiodb.com/images/media/artist/fanart3/..."
    }
  ]
}
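
TheAudioDB returns flat, prefixed field names (`strArtistLogo`, `intMembers`, numbered fan art keys), which need folding into the `TheAudioDBArtist` type. The sketch below is one plausible mapping under stated assumptions. `toTheAudioDBArtist` is a hypothetical name, and the `biography` field is left out because the sample response does not show which source field feeds it.

```javascript
// Hypothetical mapper from an artist-mb.php response to the TheAudioDBArtist shape.
function toTheAudioDBArtist(response) {
  // Assumption: an unknown MBID yields a missing or empty "artists" list.
  const raw = response.artists && response.artists[0];
  if (!raw) {
    return null;
  }

  // Fold the numbered fan art fields into a single list, skipping empty slots.
  const fanArt = ['strArtistFanart', 'strArtistFanart2', 'strArtistFanart3']
    .map(key => raw[key])
    .filter(Boolean)
    .map(url => ({ url }));

  // intMembers arrives as a string; the schema declares an Int.
  const memberCount = Number.parseInt(raw.intMembers, 10);

  return {
    artistID: raw.idArtist,
    biographyEN: raw.strBiographyEN ?? null,
    memberCount: Number.isNaN(memberCount) ? null : memberCount,
    banner: raw.strArtistBanner ?? null,
    logo: raw.strArtistLogo ?? null,
    thumbnail: raw.strArtistThumb ?? null,
    fanArt
  };
}
```

Returning `null` for a missing artist lets the `theAudioDB` field resolve to `null` without raising an error for artists the service does not know.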

Configuration

| Environment Variable | Required | Default | Purpose |
| --- | --- | --- | --- |
| THEAUDIODB_API_KEY | Yes | - | API authentication |
| THEAUDIODB_CACHE_SIZE | No | 8192 | LRU cache size |
| THEAUDIODB_CACHE_TTL | No | 86400000 | Cache TTL in milliseconds (1 day) |

Example Query

{
  lookup {
    artist(mbid: "5b11f4ce-a62d-471e-81fc-a69a8278c7da") {
      name
      theAudioDB {
        biographyEN
        memberCount
        logo
        banner
        fanArt {
          url
        }
      }
    }
  }
}

Implementation

File: src/extensions/theaudiodb/index.js

Client: TheAudioDBClient extending base Client

Resolver:

Artist: {
  theAudioDB(artist, args, context) {
    return context.theaudiodb.loader.load(artist.id);
  }
}

Extension Pattern

All extensions follow a consistent pattern for integration.

Extension Interface

{
  name: String,              // Extension identifier
  description: String,       // Human-readable description
  extendContext: Function,   // Add HTTP client, DataLoader, cache to context
  extendSchema: Function     // Add GraphQL types and resolvers
}
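
A loader can check this shape before registering an extension. The following is a small sketch of such a check. `validateExtension` is a hypothetical helper, and whether both hooks are mandatory in GraphBrainz is not established here, so the sketch only reports what is missing.

```javascript
// Hypothetical check that an object matches the extension interface above.
// Returns a list of problems; an empty list means the shape looks valid.
function validateExtension(extension) {
  const problems = [];
  if (extension === null || typeof extension !== 'object') {
    return ['extension must be an object'];
  }
  if (typeof extension.name !== 'string' || extension.name === '') {
    problems.push('name must be a non-empty string');
  }
  if (extension.description !== undefined && typeof extension.description !== 'string') {
    problems.push('description must be a string when present');
  }
  for (const hook of ['extendContext', 'extendSchema']) {
    if (typeof extension[hook] !== 'function') {
      problems.push(`${hook} must be a function`);
    }
  }
  return problems;
}
```

Reporting every problem at once, rather than throwing on the first, gives clearer feedback when a third-party package is misconfigured.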

Context Extension

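// ExtensionClient, batchFetch, and extensionName are placeholders for each
// extension's own client class, batch function, and context key.
extendContext(context, options) {
  // Create the cache first so it can be handed to the client.
  const cache = new LRU({
    max: options.cacheSize || 8192,
    ttl: options.cacheTTL || 86400000  // milliseconds (1 day)
  });

  const client = new ExtensionClient({
    baseURL: options.baseURL,
    apiKey: options.apiKey,
    timeout: options.timeout,
    cache  // consulted by the base Client's get() before each request
  });

  const loader = new DataLoader(
    keys => batchFetch(client, keys),
    { cache: false }  // Disable DataLoader memoization; use the LRU cache instead
  );

  // Expose the extension's objects under its own key, leaving the rest of
  // the context untouched.
  return {
    ...context,
    [extensionName]: {
      client,
      loader,
      cache
    }
  };
}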

Schema Extension

extendSchema(schema, options) {
  const typeDefs = `
    extend type Artist {
      extensionField: ExtensionType
    }
    
    type ExtensionType {
      field1: String
      field2: Int
    }
  `;
  
  const resolvers = {
    Artist: {
      extensionField(artist, args, context) {
        return context.extensionName.loader.load(artist.id);
      }
    }
  };
  
  return extendSchema(schema, { typeDefs, resolvers });
}

Client Base Class

All extension clients extend a base Client class:

File: src/client.js

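const got = require('got');            // assumes got v11 (CommonJS build)
const createDebug = require('debug');

class Client {
  constructor(options) {
    this.client = got.extend({
      prefixUrl: options.baseURL,
      headers: options.headers,
      responseType: 'json',            // parse bodies so response.body is an object
      timeout: options.timeout || 30000,
      retry: { limit: 3 },
      hooks: {
        beforeRequest: [this.beforeRequest.bind(this)],
        afterResponse: [this.afterResponse.bind(this)]
      }
    });

    this.cache = options.cache;
    this.limiter = options.limiter;
    this.debug = createDebug(this.constructor.name);
  }

  async get(path, options) {
    const cacheKey = this.getCacheKey(path, options);
    const cached = this.cache.get(cacheKey);

    // Compare against undefined so that cached falsy values still count as hits.
    if (cached !== undefined) {
      return cached;
    }

    // Wait for a rate-limiter slot before touching the network.
    await this.limiter.acquire();

    const response = await this.client.get(path, options);
    const data = response.body;

    this.cache.set(cacheKey, data);

    return data;
  }

  // Note: JSON.stringify is sensitive to key order, so equivalent options
  // objects can produce different keys.
  getCacheKey(path, options) {
    return `${path}:${JSON.stringify(options)}`;
  }

  beforeRequest(options) {
    this.debug(`${options.method} ${options.url}`);
  }

  afterResponse(response) {
    return response;
  }
}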
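
One detail of `getCacheKey` is worth illustrating: `JSON.stringify` preserves insertion order, so two option objects with the same contents in a different key order produce different cache keys and miss each other's entries. The sketch below shows an order-independent alternative. `stableStringify` is a hypothetical helper, not something the base class is known to use.

```javascript
// Hypothetical order-independent serializer for cache keys.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value && typeof value === 'object') {
    // Sort keys so that { a, b } and { b, a } serialize identically.
    const body = Object.keys(value)
      .sort()
      .map(key => `${JSON.stringify(key)}:${stableStringify(value[key])}`)
      .join(',');
    return `{${body}}`;
  }
  // JSON.stringify(undefined) returns undefined, so fall back to "null".
  return JSON.stringify(value) ?? 'null';
}

function getCacheKey(path, options) {
  return `${path}:${stableStringify(options)}`;
}
```

With this variant, `{ inc: 'releases', fmt: 'json' }` and `{ fmt: 'json', inc: 'releases' }` map to the same key, which improves the hit rate when different call sites build options in a different order.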

External Extensions

Last.fm

Package: graphbrainz-extension-lastfm

Installation:

npm install graphbrainz-extension-lastfm

Configuration:

LASTFM_API_KEY=your-api-key

Schema Additions:

  • Artist.lastFM - Scrobble statistics, similar artists
  • Recording.lastFM - Play counts, listener counts

Discogs

Package: graphbrainz-extension-discogs

Installation:

npm install graphbrainz-extension-discogs

Configuration:

DISCOGS_API_KEY=your-api-key

Schema Additions:

  • Release.discogs - Marketplace data, pricing, community ratings

Spotify

Package: graphbrainz-extension-spotify

Installation:

npm install graphbrainz-extension-spotify

Configuration:

SPOTIFY_CLIENT_ID=your-client-id
SPOTIFY_CLIENT_SECRET=your-client-secret

Schema Additions:

  • Artist.spotify - Popularity, followers, genres
  • Recording.spotify - Audio features, preview URLs

Integration Best Practices

Error Handling

Each extension implements custom error classes:

class FanArtError extends Error {
  constructor(message, statusCode) {
    super(message);
    this.name = 'FanArtError';
    this.statusCode = statusCode;
  }
}

Graceful Degradation

Extension failures don't break core queries:

{
  lookup {
    artist(mbid: "...") {
      name          # Always works (core)
      fanArt {      # Returns null if fanart.tv fails
        backgrounds
      }
    }
  }
}
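
GraphQL already resolves a nullable field to `null` when its resolver throws, while recording the failure in the response's `errors` list. An extension that wants to log the failure and keep it out of `errors` can wrap its resolver, as in this sketch. `nullOnError` is a hypothetical helper, not a GraphBrainz API.

```javascript
// Hypothetical wrapper: turns a failing resolver into one that returns null
// and reports the error to a callback instead of propagating it.
function nullOnError(resolve, onError = () => {}) {
  return async (parent, args, context, info) => {
    try {
      return await resolve(parent, args, context, info);
    } catch (error) {
      onError(error);
      return null;
    }
  };
}

// Usage sketch with the fanArt resolver shown earlier in this document.
const resolvers = {
  Artist: {
    fanArt: nullOnError(
      (artist, args, context) => context.fanart.loader.load(artist.id),
      error => console.warn(`fanart.tv lookup failed: ${error.message}`)
    )
  }
};
```

Swallowing errors is a trade-off: clients get a cleaner response, but they can no longer tell "no images exist" apart from "the upstream service was down", so the log callback becomes the only trace of the failure.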

Rate Limit Coordination

Each extension has its own independent rate limiter, so throttling on one API does not delay requests to another:

const fanartLimiter = new RateLimiter({ limit: 10, interval: 1000 });
const theaudiodbLimiter = new RateLimiter({ limit: 10, interval: 1000 });

Cache Isolation

Each extension has its own cache, so heavy traffic to one API cannot evict another API's entries:

const fanartCache = new LRU({ max: 8192 });
const theaudiodbCache = new LRU({ max: 8192 });
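
To illustrate what the cache size and TTL settings control, here is a minimal LRU cache with expiry built on a `Map`. It is a teaching sketch, not the `LRU` class used above: `TtlLruCache` and its injectable `now` clock are hypothetical names.

```javascript
// Hypothetical LRU cache with per-entry expiry. A Map iterates in insertion
// order, so the first key is always the least recently used one.
class TtlLruCache {
  constructor({ max = 8192, ttl = 86400000, now = Date.now } = {}) {
    this.max = max;    // maximum number of entries
    this.ttl = ttl;    // entry lifetime in milliseconds
    this.now = now;    // injectable clock, for testing
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);  // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttl });
    if (this.entries.size > this.max) {
      // Evict the least recently used entry (the first key in the Map).
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}

// Separate instances share no state, so filling one never evicts from the other.
const fanartSketchCache = new TtlLruCache({ max: 8192 });
const theaudiodbSketchCache = new TtlLruCache({ max: 8192 });
```

Because each instance owns its own `Map`, a burst of fanart.tv lookups can only evict fanart.tv entries, which is the isolation property described above.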