feat: initial implementation of metadata aggregator

- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
@@ -0,0 +1,191 @@
# GraphBrainz Overview

## Project Identity

| Property | Value |
|----------|-------|
| Name | GraphBrainz |
| Version | 9.0.0 |
| Repository | https://github.com/exogen/graphbrainz |
| License | MIT (2016 Brian Beck) |
| Language | JavaScript (ESM) |
| Runtime | Node.js >=12.18.0 |
| Core Stack | Express + GraphQL |
| NPM Package | graphbrainz |
| Binary Command | graphbrainz |

## Purpose

GraphBrainz provides a GraphQL schema and an Express server/middleware for querying the MusicBrainz API. It wraps the REST-based MusicBrainz web service in a GraphQL interface and supports extensions that add further metadata sources.

The project serves three primary use cases:

1. **Standalone GraphQL Server** - Run as a dedicated service with the built-in Express server
2. **Express Middleware** - Embed the GraphQL endpoint in an existing Express application
3. **Direct GraphQL Client** - Import the schema and context for programmatic queries

## Core Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| graphql | 15.5.0 | GraphQL implementation |
| express-graphql | 0.12.0 | Express middleware for GraphQL |
| @graphql-tools/schema | 7.1.3 | Schema composition utilities |
| dataloader | 2.0.0 | Request batching and deduplication |
| lru-cache | 6.0.0 | Shared response caching |
| got | 11.8.2 | HTTP client for API requests |
| graphql-relay | 0.6.0 | Relay specification helpers |
| debug | * | Namespace-based logging |
| es6-error | * | Custom error classes |
| dotenv | * | Environment configuration |

## Entry Points

The application flow starts at `cli.js`, which delegates to the `start()` function in `src/index.js`. This entry point handles:

- Environment variable loading via dotenv
- Extension discovery and loading
- Schema construction and extension
- Server initialization (standalone mode)
- Middleware export (embedded mode)

## Extension System

GraphBrainz includes 4 built-in extensions and supports 3 external extensions via separate npm packages.

### Built-in Extensions

| Extension | Source | Purpose |
|-----------|--------|---------|
| Cover Art Archive | http://coverartarchive.org/ | Album artwork and thumbnails |
| fanart.tv | http://webservice.fanart.tv/v3/ | Artist backgrounds, logos, banners |
| MediaWiki | MusicBrainz Wiki | Image URLs and metadata |
| TheAudioDB | http://www.theaudiodb.com/ | Artist biographies and logos |

### External Extensions

| Extension | NPM Package | Purpose |
|-----------|-------------|---------|
| Last.fm | graphbrainz-extension-lastfm | Scrobbling data and statistics |
| Discogs | graphbrainz-extension-discogs | Release marketplace data |
| Spotify | graphbrainz-extension-spotify | Streaming platform metadata |

Extensions are loaded via the `GRAPHBRAINZ_EXTENSIONS` environment variable or programmatic options. Each extension receives its own HTTP client, DataLoader instance, and LRU cache.

## Deployment Modes

### Standalone Server

```bash
npm start
# or
graphbrainz
```

Starts an Express server on port 3000 (configurable via the `PORT` env var) with the GraphQL endpoint at `/` (configurable via `GRAPHBRAINZ_PATH`).

### Express Middleware

```javascript
import express from 'express';
import { middleware } from 'graphbrainz';

const app = express();
// Mount the GraphBrainz GraphQL endpoint at /graphql.
app.use('/graphql', middleware());
```

Embeds the GraphQL endpoint in an existing Express application.

### Direct GraphQL Client

```javascript
import { schema, context } from 'graphbrainz';
import { graphql } from 'graphql';

// Placeholder query; any document valid against the schema works here.
const query = '{ __typename }';

const result = await graphql({
  schema,
  source: query,
  contextValue: context
});
```

Provides programmatic access to the schema and context for custom integrations.

## Architecture Highlights

### Schema Construction

GraphBrainz builds its core schema programmatically with GraphQL.js constructors rather than SDL (Schema Definition Language). This approach provides:

- Type-safe schema building
- Dynamic field generation
- Runtime schema introspection
- Programmatic extension points

Extensions supply SDL strings that are merged into the core schema with `extendSchema()` (a GraphQL.js utility exported by `graphql`), alongside the schema composition helpers from `@graphql-tools/schema`.

### Performance Optimization

Two-tier caching strategy:

1. **DataLoader** - Per-request batching and deduplication
2. **LRU Cache** - Shared cache across requests (8192 items, 1 day TTL)
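
The two tiers can be illustrated with a dependency-free sketch. This is not GraphBrainz's code: `TinyLRU` and `createRequestLoader` are hypothetical stand-ins for `lru-cache` and `dataloader`, and the sketch only memoizes per request, whereas DataLoader also batches keys.

```javascript
// Tier 2: a tiny LRU cache with TTL, shared across requests.
// Hypothetical stand-in for the lru-cache package (defaults taken from
// the figures above: 8192 items, 1 day TTL).
class TinyLRU {
  constructor({ max = 8192, ttl = 24 * 60 * 60 * 1000, now = Date.now } = {}) {
    this.max = max;
    this.ttl = ttl;
    this.now = now;
    this.entries = new Map(); // Map keeps insertion order, oldest first
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttl) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.max) {
      // Evict the least recently used key (first in insertion order).
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}

// Tier 1: per-request memoization, so a key repeated within one query
// reaches the fetcher at most once. Batching is omitted in this sketch.
function createRequestLoader(sharedCache, fetchFn) {
  const seen = new Map(); // lives only as long as one request
  return function load(key) {
    if (seen.has(key)) return seen.get(key);
    let value = sharedCache.get(key);
    if (value === undefined) {
      value = fetchFn(key); // may be a promise; it is cached as-is
      sharedCache.set(key, value);
    }
    seen.set(key, value);
    return value;
  };
}
```

A server would create one `TinyLRU` at startup and one loader per incoming request, so that request-local state is discarded while the shared cache survives.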

A custom rate limiter with a priority queue keeps requests within the MusicBrainz API limit (5 requests per 5.5 seconds) and the extension limit (10 requests per second).
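
As a rough illustration of such a limiter, the sketch below allows at most `limit` task starts per `period` milliseconds and runs higher-priority tasks first. It is a simplified assumption, not GraphBrainz's implementation: the `RateLimiter` class, its `enqueue()` and `flush()` methods, and the injectable `now` clock are all hypothetical names.

```javascript
// Sliding-window rate limiter with a priority queue (illustrative only).
class RateLimiter {
  constructor({ limit = 5, period = 5500, now = Date.now } = {}) {
    this.limit = limit;   // max task starts per window
    this.period = period; // window length in milliseconds
    this.now = now;       // injectable clock, handy for tests
    this.started = [];    // start times of tasks inside the current window
    this.queue = [];      // pending { task, priority } entries
  }

  enqueue(task, priority = 0) {
    this.queue.push({ task, priority });
    // Stable sort: higher priority first, FIFO among equal priorities.
    this.queue.sort((a, b) => b.priority - a.priority);
  }

  // Run as many queued tasks as the window allows; returns how many ran.
  flush() {
    const time = this.now();
    // Forget starts that have fallen out of the window.
    this.started = this.started.filter((start) => time - start < this.period);
    let ran = 0;
    while (this.queue.length > 0 && this.started.length < this.limit) {
      const { task } = this.queue.shift();
      this.started.push(time);
      task();
      ran += 1;
    }
    return ran;
  }
}
```

In a real service `flush()` would be driven by a timer rather than called by hand; the explicit call keeps the sketch deterministic.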

### Resolver Intelligence

Resolvers inspect the GraphQL AST to determine which MusicBrainz `inc` parameters are needed. This avoids both over-fetching and under-fetching: each request asks MusicBrainz for only the data the query requires.
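
The idea can be sketched without any GraphQL library by walking a hand-built field node shaped like the ones GraphQL.js passes to resolvers in `info.fieldNodes`. The `FIELD_TO_INC` mapping and the `includesFor()` helper are hypothetical, and fragments are ignored for brevity; GraphBrainz's actual field names and traversal may differ.

```javascript
// Hypothetical mapping from GraphQL field names to MusicBrainz `inc` values.
const FIELD_TO_INC = new Map([
  ['aliases', 'aliases'],
  ['tags', 'tags'],
  ['rating', 'ratings'],
  ['releaseGroups', 'release-groups'],
]);

// Collect the `inc` values implied by the sub-fields selected on a field node.
function includesFor(fieldNode) {
  const selections = fieldNode.selectionSet
    ? fieldNode.selectionSet.selections
    : [];
  const incs = new Set();
  for (const selection of selections) {
    if (selection.kind !== 'Field') continue; // fragments skipped in this sketch
    const inc = FIELD_TO_INC.get(selection.name.value);
    if (inc) incs.add(inc);
  }
  return [...incs].sort();
}

// Hand-built AST for: artist { name aliases releaseGroups }
const exampleField = (value) => ({ kind: 'Field', name: { kind: 'Name', value } });
const exampleArtistNode = {
  ...exampleField('artist'),
  selectionSet: {
    kind: 'SelectionSet',
    selections: ['name', 'aliases', 'releaseGroups'].map(exampleField),
  },
};

// MusicBrainz joins multiple inc values with '+'.
console.log(includesFor(exampleArtistNode).join('+')); // aliases+release-groups
```

Fields with no entry in the mapping, such as `name` here, come back with the base entity and add nothing to `inc`.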

## Package Distribution

The NPM package exports:

- Main module with `start()`, `middleware()`, `schema`, `context`
- Built-in extensions as separate modules
- `schema.json` for tooling and introspection
- Binary command for CLI usage

## Version Requirements

| Component | Minimum Version | Notes |
|-----------|-----------------|-------|
| Node.js | 12.18.0 | ESM support required |
| GraphQL | 15.5.0 | Not latest (v16+ available) |
| Express | 4.x | Via express-graphql |

## Configuration Surface

GraphBrainz exposes 10+ environment variables for configuration:

- `MUSICBRAINZ_BASE_URL` - MusicBrainz API endpoint
- `GRAPHBRAINZ_PATH` - GraphQL endpoint path
- `GRAPHBRAINZ_CORS_ORIGIN` - CORS configuration
- `GRAPHBRAINZ_CACHE_SIZE` - LRU cache size
- `GRAPHBRAINZ_CACHE_TTL` - Cache TTL in milliseconds
- `GRAPHBRAINZ_GRAPHIQL` - Enable GraphiQL interface
- `GRAPHBRAINZ_EXTENSIONS` - Extension loading
- `PORT` - Server port
- `NODE_ENV` - Environment mode
- Per-extension variables (API keys, cache settings)
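
A minimal sketch of reading some of these variables is shown below. The `loadConfig()` helper is hypothetical; its fallbacks reuse the defaults stated elsewhere in this document (port 3000, path `/`, 8192 cache items, 1 day TTL), and variables whose defaults are not documented here are left out.

```javascript
// Hypothetical helper: turn environment variables into a config object.
function loadConfig(env = {}) {
  // Parse an integer, falling back when the variable is unset or invalid.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };

  return {
    port: toInt(env.PORT, 3000),
    path: env.GRAPHBRAINZ_PATH || '/',
    cacheSize: toInt(env.GRAPHBRAINZ_CACHE_SIZE, 8192),
    cacheTTL: toInt(env.GRAPHBRAINZ_CACHE_TTL, 24 * 60 * 60 * 1000), // ms
  };
}

// Typical usage in Node.js: read from the process environment.
const exampleConfig = loadConfig(process.env);
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the helper easy to test with plain objects.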

## Development Tooling

| Tool | Purpose |
|------|---------|
| AVA | Test framework |
| ava-nock | HTTP mocking (play/record/cache) |
| c8 | Code coverage |
| Travis CI | Continuous integration (Node 12/14/15) |
| Codecov + Coveralls | Coverage reporting |
| debug | Namespace-based logging |

## Project Maturity

GraphBrainz v9.0.0 is a mature, stable project with:

- Comprehensive test suite (1475+ lines)
- Production-proven caching and rate limiting
- Relay-compliant GraphQL implementation
- Extensible architecture for metadata aggregation
- 5+ years of development history

The project has not seen major updates in recent years. That suggests stability, but also likely technical debt in its dependencies (Node.js 12 baseline, GraphQL v15).