metadata-agregator/docs/research/graphbrainz/analysis/OVERVIEW.md
Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00


# GraphBrainz Overview
## Project Identity
| Property | Value |
|----------|-------|
| Name | GraphBrainz |
| Version | 9.0.0 |
| Repository | https://github.com/exogen/graphbrainz |
| License | MIT (2016 Brian Beck) |
| Language | JavaScript (ESM) |
| Runtime | Node.js >=12.18.0 |
| Core Stack | Express + GraphQL |
| NPM Package | graphbrainz |
| Binary Command | graphbrainz |
## Purpose
GraphBrainz provides a GraphQL schema and Express server/middleware for querying the MusicBrainz API. It transforms the REST-based MusicBrainz web service into a modern GraphQL interface with extensible integrations for additional metadata sources.
The project serves three primary use cases:
1. **Standalone GraphQL Server** - Run as a dedicated service with built-in Express server
2. **Express Middleware** - Embed GraphQL endpoint into existing Express applications
3. **Direct GraphQL Client** - Import schema and context for programmatic queries
## Core Dependencies
| Package | Version | Purpose |
|---------|---------|---------|
| graphql | 15.5.0 | GraphQL implementation |
| express-graphql | 0.12.0 | Express middleware for GraphQL |
| @graphql-tools/schema | 7.1.3 | Schema composition utilities |
| dataloader | 2.0.0 | Request batching and deduplication |
| lru-cache | 6.0.0 | Shared response caching |
| got | 11.8.2 | HTTP client for API requests |
| graphql-relay | 0.6.0 | Relay specification helpers |
| debug | * | Namespace-based logging |
| es6-error | * | Custom error classes |
| dotenv | * | Environment configuration |
## Entry Points
The application flow starts at `cli.js`, which delegates to the `start()` function in `src/index.js`. This entry point handles:
- Environment variable loading via dotenv
- Extension discovery and loading
- Schema construction and extension
- Server initialization (standalone mode)
- Middleware export (embedded mode)
## Extension System
GraphBrainz includes 4 built-in extensions and supports 3 external extensions via separate npm packages.
### Built-in Extensions
| Extension | Source | Purpose |
|-----------|--------|---------|
| Cover Art Archive | http://coverartarchive.org/ | Album artwork and thumbnails |
| fanart.tv | http://webservice.fanart.tv/v3/ | Artist backgrounds, logos, banners |
| MediaWiki | MusicBrainz Wiki | Image URLs and metadata |
| TheAudioDB | http://www.theaudiodb.com/ | Artist biographies and logos |
### External Extensions
| Extension | NPM Package | Purpose |
|-----------|-------------|---------|
| Last.fm | graphbrainz-extension-lastfm | Scrobbling data and statistics |
| Discogs | graphbrainz-extension-discogs | Release marketplace data |
| Spotify | graphbrainz-extension-spotify | Streaming platform metadata |
Extensions are loaded via the `GRAPHBRAINZ_EXTENSIONS` environment variable or programmatic options. Each extension receives its own HTTP client, DataLoader instance, and LRU cache.
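As a rough illustration of environment-driven extension loading, the sketch below resolves a list of extension module specifiers from an environment-like object. It assumes the variable holds a JSON array of module specifiers; that format, the `resolveExtensionList` helper, and the placeholder default paths are assumptions made for illustration, not details confirmed from the GraphBrainz source.

```javascript
// Hypothetical helper: decide which extension modules to load.
// Assumption: GRAPHBRAINZ_EXTENSIONS contains a JSON array of module specifiers.
const DEFAULT_EXTENSIONS = [
  // Illustrative placeholders, not the real built-in module paths.
  './extensions/cover-art-archive.js',
  './extensions/fanart-tv.js',
];

function resolveExtensionList(env, defaults = DEFAULT_EXTENSIONS) {
  const raw = env.GRAPHBRAINZ_EXTENSIONS;
  if (raw === undefined || raw.trim() === '') {
    return [...defaults]; // Nothing configured: fall back to the defaults.
  }
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    throw new Error(`GRAPHBRAINZ_EXTENSIONS is not valid JSON: ${err.message}`);
  }
  if (!Array.isArray(parsed) || !parsed.every((item) => typeof item === 'string')) {
    throw new Error('GRAPHBRAINZ_EXTENSIONS must be a JSON array of strings');
  }
  // Drop duplicates while preserving the configured order.
  return [...new Set(parsed)];
}
```

A caller would typically pass `process.env` and then load each specifier with a dynamic `import()`.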
## Deployment Modes
### Standalone Server
```bash
npm start
# or
graphbrainz
```
Starts an Express server on port 3000 (configurable via the `PORT` environment variable) with the GraphQL endpoint at `/` (configurable via `GRAPHBRAINZ_PATH`).
### Express Middleware
```javascript
import express from 'express';
import { middleware } from 'graphbrainz';
const app = express(); // or reuse an existing Express app
app.use('/graphql', middleware());
```
Embeds the GraphQL endpoint into an existing Express application.
### Direct GraphQL Client
```javascript
import { schema, context } from 'graphbrainz';
import { graphql } from 'graphql';
const result = await graphql({
  schema,
  source: query, // `query` is a GraphQL query string supplied by the caller
  contextValue: context
});
```
Programmatic access to schema and context for custom integrations.
## Architecture Highlights
### Schema Construction
GraphBrainz uses programmatic schema construction via GraphQL.js constructors rather than SDL (Schema Definition Language) for the core schema. This approach provides:
- Type-safe schema building
- Dynamic field generation
- Runtime schema introspection
- Programmatic extension points
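Extensions instead supply SDL strings that are merged into the core schema via `extendSchema()`. Note that `extendSchema()` is exported by GraphQL.js itself; `@graphql-tools/schema` provides the companion utilities for attaching resolvers.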
### Performance Optimization
GraphBrainz uses a two-tier caching strategy:
1. **DataLoader** - Per-request batching and deduplication
2. **LRU Cache** - Shared cache across requests (8192 items, 1 day TTL)
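The two tiers can be sketched with dependency-free stand-ins. The `TtlLruCache` class and `createRequestLoader` function below are hypothetical names; they imitate the roles of `lru-cache` and `dataloader` rather than reproducing either library or the GraphBrainz implementation. In particular, the loader only deduplicates keys within a request and does not batch them.

```javascript
// Tier 2: shared LRU cache with TTL (hand-rolled stand-in for lru-cache).
class TtlLruCache {
  constructor({ max = 8192, ttl = 24 * 60 * 60 * 1000, now = Date.now } = {}) {
    this.max = max; // Maximum number of entries.
    this.ttl = ttl; // Entry lifetime in milliseconds.
    this.now = now; // Injectable clock, useful for testing.
    this.entries = new Map(); // A Map iterates in insertion order, oldest first.
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttl) {
      this.entries.delete(key); // Expired.
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.max) {
      // Evict the least recently used entry (the first key in the Map).
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}

// Tier 1: per-request loader that deduplicates identical keys.
// Create one loader per incoming request and discard it afterwards.
function createRequestLoader(fetchFn, sharedCache) {
  const inFlight = new Map(); // Lives only as long as one request.
  return function load(key) {
    if (inFlight.has(key)) return inFlight.get(key);
    const cached = sharedCache.get(key);
    const promise =
      cached !== undefined
        ? Promise.resolve(cached)
        : Promise.resolve(fetchFn(key)).then((value) => {
            sharedCache.set(key, value); // Populate the shared tier.
            return value;
          });
    inFlight.set(key, promise);
    return promise;
  };
}
```

Splitting the tiers this way keeps repeated lookups within one query cheap while the shared cache absorbs identical lookups across different requests.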
A custom rate limiter with a priority queue keeps outgoing requests within the MusicBrainz API limit (5 requests per 5.5 seconds) and the extension limits (10 requests per second).
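A minimal sketch of a sliding-window rate limiter with a priority queue follows. The `PriorityRateLimiter` class, its method names, and the injectable clock are hypothetical and chosen for illustration; only the default limit and period come from the figures quoted above. It is not the GraphBrainz implementation.

```javascript
// Sliding-window rate limiter with a priority queue (illustrative sketch).
class PriorityRateLimiter {
  constructor({ limit = 5, period = 5500, now = Date.now } = {}) {
    this.limit = limit; // Maximum tasks started per window.
    this.period = period; // Window length in milliseconds.
    this.now = now; // Injectable clock, useful for testing.
    this.started = []; // Start timestamps still inside the window.
    this.queue = []; // Pending { task, priority, seq } entries.
    this.seq = 0;
  }

  // Forget start times that have left the window; returns the current time.
  prune() {
    const now = this.now();
    this.started = this.started.filter((t) => now - t < this.period);
    return now;
  }

  enqueue(task, priority = 0) {
    this.queue.push({ task, priority, seq: this.seq++ });
    // Higher priority first; FIFO among equal priorities.
    this.queue.sort((a, b) => b.priority - a.priority || a.seq - b.seq);
  }

  // Start as many queued tasks as the window allows; returns how many ran.
  drain() {
    const now = this.prune();
    let ran = 0;
    while (this.queue.length > 0 && this.started.length < this.limit) {
      const { task } = this.queue.shift();
      this.started.push(now);
      task();
      ran += 1;
    }
    return ran;
  }

  // Milliseconds until a slot frees up (0 if one is free right now).
  msUntilNextSlot() {
    const now = this.prune();
    if (this.started.length < this.limit) return 0;
    return this.started[0] + this.period - now;
  }
}
```

A server would typically call `drain()` whenever work is enqueued and schedule the next call with `setTimeout` using `msUntilNextSlot()`.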
### Resolver Intelligence
Resolvers inspect the GraphQL AST of the incoming query to determine which MusicBrainz `inc` parameters are needed. Each REST call then requests only the sub-resources the query actually selects, which avoids both over-fetching and under-fetching.
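The idea can be sketched by walking the selection set of a field node and mapping selected field names to `inc` values. The field-to-`inc` table and the `collectIncludes` helper are hypothetical; the node objects are hand-built to mirror the GraphQL.js AST shape (`kind`, `name.value`, `selectionSet.selections`) so the example runs without the `graphql` package. Fragments are ignored for brevity.

```javascript
// Hypothetical mapping from GraphQL field names to MusicBrainz `inc` values.
const FIELD_TO_INC = new Map([
  ['aliases', 'aliases'],
  ['tags', 'tags'],
  ['rating', 'ratings'],
  ['releaseGroups', 'release-groups'],
]);

// Collect `inc` values for the fields selected directly under `fieldNode`.
function collectIncludes(fieldNode, mapping = FIELD_TO_INC) {
  const includes = new Set();
  const selections = fieldNode.selectionSet ? fieldNode.selectionSet.selections : [];
  for (const selection of selections) {
    if (selection.kind !== 'Field') continue; // Fragments are skipped in this sketch.
    const inc = mapping.get(selection.name.value);
    if (inc !== undefined) includes.add(inc);
  }
  return [...includes].sort();
}

// Hand-built AST for: artist(mbid: "...") { name aliases { name } tags { ... } }
const artistField = {
  kind: 'Field',
  name: { kind: 'Name', value: 'artist' },
  selectionSet: {
    kind: 'SelectionSet',
    selections: [
      { kind: 'Field', name: { kind: 'Name', value: 'name' } },
      { kind: 'Field', name: { kind: 'Name', value: 'aliases' } },
      { kind: 'Field', name: { kind: 'Name', value: 'tags' } },
    ],
  },
};

// MusicBrainz expects multiple `inc` values joined with "+".
const inc = collectIncludes(artistField).join('+');
console.log(inc); // aliases+tags
```

In a real GraphQL.js resolver, the field nodes would come from the `info.fieldNodes` property of the resolver's fourth argument.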
## Package Distribution
The NPM package exports:
- Main module with `start()`, `middleware()`, `schema`, `context`
- Built-in extensions as separate modules
- `schema.json` for tooling and introspection
- Binary command for CLI usage
## Version Requirements
| Component | Minimum Version | Notes |
|-----------|----------------|-------|
| Node.js | 12.18.0 | ESM support required |
| GraphQL | 15.5.0 | Not the latest (v16+ is available) |
| Express | 4.x | Via express-graphql |
## Configuration Surface
GraphBrainz exposes 10+ environment variables for configuration:
- `MUSICBRAINZ_BASE_URL` - MusicBrainz API endpoint
- `GRAPHBRAINZ_PATH` - GraphQL endpoint path
- `GRAPHBRAINZ_CORS_ORIGIN` - CORS configuration
- `GRAPHBRAINZ_CACHE_SIZE` - LRU cache size
- `GRAPHBRAINZ_CACHE_TTL` - Cache TTL in milliseconds
- `GRAPHBRAINZ_GRAPHIQL` - Enable GraphiQL interface
- `GRAPHBRAINZ_EXTENSIONS` - Extension loading
- `PORT` - Server port
- `NODE_ENV` - Environment mode
- Per-extension variables (API keys, cache settings)
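To show how such a configuration surface is typically consumed, the sketch below reads a few of these variables from an environment-like object and applies the defaults quoted elsewhere in this overview (port 3000, path `/`, 8192 cache items, 1 day TTL). The `loadConfig` helper, its property names, and the `'true'` convention for `GRAPHBRAINZ_GRAPHIQL` are assumptions for illustration, not the GraphBrainz implementation.

```javascript
// Hypothetical config loader; defaults mirror the values quoted in this overview.
function loadConfig(env) {
  // Parse an integer, falling back when the variable is unset or malformed.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };

  return {
    port: toInt(env.PORT, 3000),
    path: env.GRAPHBRAINZ_PATH || '/',
    cacheSize: toInt(env.GRAPHBRAINZ_CACHE_SIZE, 8192),
    cacheTtl: toInt(env.GRAPHBRAINZ_CACHE_TTL, 24 * 60 * 60 * 1000), // 1 day in ms
    // Assumption: the flag is enabled by the literal string 'true'.
    graphiql: env.GRAPHBRAINZ_GRAPHIQL === 'true',
    // Left undefined when unset; no default URL is assumed here.
    musicbrainzBaseUrl: env.MUSICBRAINZ_BASE_URL,
  };
}
```

Passing `process.env` yields a plain object that the server and cache setup can share.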
## Development Tooling
| Tool | Purpose |
|------|---------|
| AVA | Test framework |
| ava-nock | HTTP mocking (play/record/cache) |
| c8 | Code coverage |
| Travis CI | Continuous integration (Node 12/14/15) |
| Codecov + Coveralls | Coverage reporting |
| debug | Namespace-based logging |
## Project Maturity
GraphBrainz v9.0.0 represents a mature, stable project with:
- Comprehensive test suite (1475+ lines)
- Production-proven caching and rate limiting
- Relay-compliant GraphQL implementation
- Extensible architecture for metadata aggregation
- 5+ years of development history
The project has not seen major updates in recent years, indicating stability but potential technical debt in dependencies (Node.js 12 baseline, GraphQL v15).