feat: initial implementation of metadata aggregator

- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
Alexander
2026-04-28 16:27:14 +02:00
commit a1f6701bac
163 changed files with 95884 additions and 0 deletions
@@ -0,0 +1,895 @@
# Harmony - Provider Integrations Analysis
## Provider Ecosystem Overview
Harmony integrates with **9 music metadata providers** using two primary access methods:
1. **API-based providers (5)**: Structured data via REST APIs
2. **HTML scraping providers (4)**: Data extraction from web pages
All providers share a common base architecture with URL pattern matching, rate limiting, caching, and harmonization to the `HarmonyRelease` schema.
## Provider Summary Table
| Provider | Type | Auth | Rate Limit | GTIN | Max Image | Regions | Status |
|----------|------|------|------------|------|-----------|---------|--------|
| Spotify | API | OAuth2 | Not specified | Yes (UPC) | 2000px | Global | Active |
| Deezer | API | Public | 50 req/5s | Yes | 1400px | Global | Active |
| iTunes | API | Public | Not specified | Yes | Varies | Multi-region | Active |
| Tidal | API | OAuth2 | Not specified | Yes | 1280px | Global | Active (v2) |
| MusicBrainz | API | Public | 5 req/5s | Yes (barcode) | N/A | Global | Active |
| Bandcamp | Scraping | None | Not specified | No | 3000px | Global | Active |
| Beatport | Scraping | None | Not specified | Yes | Varies | Global | Active |
| Mora | Scraping | None | Not specified | Yes | Varies | Japan | Active |
| Ototoy | Scraping | None | Not specified | Yes | Varies | Japan | Active |
## API-Based Providers
### 1. Spotify
**File**: `providers/spotify.ts`
#### Authentication
- **Method**: OAuth2 Client Credentials Flow
- **Credentials**: `HARMONY_SPOTIFY_CLIENT_ID`, `HARMONY_SPOTIFY_CLIENT_SECRET`
- **Token endpoint**: `https://accounts.spotify.com/api/token`
- **Token caching**: localStorage (dev) / sessionStorage (prod)
- **Token lifetime**: 3600 seconds (1 hour)
**OAuth2 Flow**:
```typescript
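// clientId and clientSecret are read from HARMONY_SPOTIFY_CLIENT_ID and
// HARMONY_SPOTIFY_CLIENT_SECRET
async function getAccessToken(): Promise<string> {
  const response = await fetch('https://accounts.spotify.com/api/token', {
    method: 'POST',
    headers: {
      'Authorization': `Basic ${btoa(`${clientId}:${clientSecret}`)}`,
      'Content-Type': 'application/x-www-form-urlencoded'
    },
    body: 'grant_type=client_credentials'
  });
  // Fail loudly instead of returning an undefined token on a 4xx/5xx response
  if (!response.ok) {
    throw new Error(`Spotify token request failed: ${response.status}`);
  }
  const data = await response.json();
  return data.access_token;
}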
```
#### API Endpoints
| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /v1/albums/{id}` | Album lookup by Spotify ID | `/v1/albums/3DiDSNVBRYVzccLn2yqhMJ` |
| `GET /v1/search` | Search by UPC | `/v1/search?q=upc:0602537347377&type=album` |
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'open.spotify.com',
pathname: '/album/:id'
});
```
**Matches**:
- `https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ`
- `https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ?si=xyz`
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.GOOD, // UPC in external_ids
title: FeatureQuality.GOOD, // Album name
artists: FeatureQuality.GOOD, // Artist array with names
releaseDate: FeatureQuality.GOOD, // release_date field
labels: FeatureQuality.PRESENT, // Label name (no catalog number)
media: FeatureQuality.GOOD, // Disc structure
tracks: FeatureQuality.GOOD, // Track listing with durations
isrc: FeatureQuality.GOOD, // ISRC per track
images: 2000, // Max 2000x2000px
copyright: FeatureQuality.PRESENT,// Copyright array
availability: FeatureQuality.GOOD // available_markets array
};
```
#### Data Mapping
**Spotify Album Object** → **HarmonyRelease**:
| Spotify Field | Harmony Field | Transformation |
|---------------|---------------|----------------|
| `name` | `title` | Direct |
| `artists[].name` | `artists[].name` | Map array |
| `external_ids.upc` | `gtin` | Direct |
| `release_date` | `releaseDate` | Parse to PartialDate |
| `label` | `labels[0].name` | Single label |
| `tracks.items[]` | `media[0].tracks[]` | Map to HarmonyTrack |
| `images[]` | `images[]` | Map with dimensions |
| `copyrights[0].text` | `copyright` | First copyright |
| `available_markets[]` | `availableIn[]` | Direct |
| `external_urls.spotify` | `externalLinks[0].url` | Streaming link |
**Example Harmonization**:
```typescript
harmonize(spotifyAlbum: SpotifyAlbum): HarmonyRelease {
  return {
    title: spotifyAlbum.name,
    artists: spotifyAlbum.artists.map(a => ({ name: a.name })),
    gtin: spotifyAlbum.external_ids?.upc,
    media: [{
      format: MediumFormat.Digital,
      position: 1,
      tracks: spotifyAlbum.tracks.items.map((t, i) => ({
        title: t.name,
        position: i + 1,
        length: t.duration_ms,
        isrc: t.external_ids?.isrc,
        artists: t.artists.length !== spotifyAlbum.artists.length
          ? t.artists.map(a => ({ name: a.name }))
          : undefined
      }))
    }],
    releaseDate: this.parseDate(spotifyAlbum.release_date),
    types: this.inferTypes(spotifyAlbum.album_type),
    images: spotifyAlbum.images.map(img => ({
      url: img.url,
      types: [ImageType.Front],
      width: img.width,
      height: img.height
    })),
    labels: spotifyAlbum.label ? [{ name: spotifyAlbum.label }] : [],
    copyright: spotifyAlbum.copyrights?.[0]?.text,
    availableIn: spotifyAlbum.available_markets,
    externalLinks: [{
      url: spotifyAlbum.external_urls.spotify,
      types: [LinkType.Streaming]
    }],
    info: {
      providers: ['spotify'],
      messages: []
    }
  };
}
```
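The `harmonize` example calls `this.parseDate` without showing it. Spotify's `release_date` may carry year, month, or day precision, so a parser has to accept all three forms. The following is a minimal sketch, not Harmony's implementation: the `PartialDate` shape and the name `parsePartialDate` are assumptions made for illustration.

```typescript
// Assumed shape; the real PartialDate type in Harmony may differ.
interface PartialDate {
  year?: number;
  month?: number;
  day?: number;
}

// Parses "YYYY", "YYYY-MM", or "YYYY-MM-DD" into a PartialDate.
// Returns undefined for any other input instead of guessing.
function parsePartialDate(value: string): PartialDate | undefined {
  const match = /^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$/.exec(value.trim());
  if (!match) return undefined;
  const [, year, month, day] = match;
  const date: PartialDate = { year: Number(year) };
  if (month !== undefined) date.month = Number(month);
  if (day !== undefined) date.day = Number(day);
  return date;
}
```

Keeping missing components absent, rather than defaulting them to `1`, lets a later merge step tell a year-only date apart from January 1st.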
#### Rate Limiting
- **Limit**: Not publicly specified
- **Handling**: Retry on 429 status with `Retry-After` header
- **Caching**: 24-hour cache reduces API calls
### 2. Deezer
**File**: `providers/deezer.ts`
#### Authentication
- **Method**: Public API (no authentication required)
- **Base URL**: `https://api.deezer.com`
#### Rate Limiting
- **Limit**: 50 requests per 5 seconds
- **Enforcement**: Server-side (429 status on exceed)
- **Handling**: Exponential backoff with `Retry-After` header
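The handling described above can be sketched as a retry wrapper that prefers the server's `Retry-After` value and falls back to exponential backoff. This is an illustrative sketch under stated assumptions: `fetchWithBackoff`, `retryDelayMs`, `maxRetries`, and `baseDelayMs` are hypothetical names and defaults, not taken from the Harmony codebase. The fetch and sleep functions are injectable so the logic can be exercised without network access or real delays.

```typescript
type FetchLike = (url: string) => Promise<Response>;

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Delay before the next attempt: use Retry-After when it is a number of
// seconds; otherwise (header absent, empty, or an HTTP date) fall back to
// exponential backoff of baseDelayMs * 2^attempt.
function retryDelayMs(response: Response, attempt: number, baseDelayMs: number): number {
  const header = response.headers.get('Retry-After');
  const seconds = header === null || header.trim() === '' ? NaN : Number(header);
  if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  return baseDelayMs * 2 ** attempt;
}

// Retries on 429 (and 503, which MusicBrainz uses for rate limiting).
// After maxRetries retries the last response is returned as-is.
async function fetchWithBackoff(
  url: string,
  fetchFn: FetchLike = (u) => fetch(u),
  sleep: (ms: number) => Promise<void> = defaultSleep,
  maxRetries = 4,
  baseDelayMs = 500,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const response = await fetchFn(url);
    const rateLimited = response.status === 429 || response.status === 503;
    if (!rateLimited || attempt >= maxRetries) return response;
    await sleep(retryDelayMs(response, attempt, baseDelayMs));
  }
}
```

A production version would likely add random jitter to the fallback delay so that concurrent clients do not retry in lockstep.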
#### API Endpoints
| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /album/{id}` | Album lookup by Deezer ID | `/album/123456` |
| `GET /search/album` | Search by UPC | `/search/album?q=upc:0602537347377` |
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'www.deezer.com',
pathname: '/:locale/album/:id'
});
```
**Matches**:
- `https://www.deezer.com/en/album/123456`
- `https://www.deezer.com/fr/album/123456`
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.GOOD, // UPC field
title: FeatureQuality.GOOD, // Title field
artists: FeatureQuality.GOOD, // Artist object
releaseDate: FeatureQuality.GOOD, // release_date field
labels: FeatureQuality.GOOD, // Label with catalog number
media: FeatureQuality.GOOD, // Disc structure
tracks: FeatureQuality.GOOD, // Track listing
isrc: FeatureQuality.GOOD, // ISRC per track
images: 1400, // Max 1400x1400px
copyright: FeatureQuality.GOOD, // Copyright field
availability: FeatureQuality.PRESENT // Available countries (limited)
};
```
#### Data Mapping
**Deezer Album Object** → **HarmonyRelease**:
| Deezer Field | Harmony Field | Notes |
|--------------|---------------|-------|
| `title` | `title` | Direct |
| `artist.name` | `artists[0].name` | Single artist |
| `upc` | `gtin` | Direct |
| `release_date` | `releaseDate` | YYYY-MM-DD format |
| `label` | `labels[0].name` | Label name |
| `tracks.data[]` | `media[0].tracks[]` | Track array |
| `cover_xl` | `images[0].url` | 1400x1400px |
| `copyright` | `copyright` | Direct |
### 3. iTunes (Apple Music)
**File**: `providers/itunes.ts`
#### Authentication
- **Method**: Public API (no authentication required)
- **Base URL**: `https://itunes.apple.com`
#### Multi-Region Support
The iTunes API is region-specific, so Harmony queries multiple regions in parallel.
**Supported Regions**:
- `US` (United States)
- `GB` (United Kingdom)
- `DE` (Germany)
- `JP` (Japan)
- `FR` (France)
- `CA` (Canada)
- `AU` (Australia)
**Region-Specific Endpoints**:
```
https://itunes.apple.com/us/lookup?id=123456
https://itunes.apple.com/gb/lookup?id=123456
https://itunes.apple.com/jp/lookup?id=123456
```
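Querying regions in parallel pairs naturally with `Promise.allSettled`, which the summary names as the mechanism for partial results. The sketch below shows one way this could look. `lookupAllRegions`, `RegionLookup`, and `RegionResults` are hypothetical names, and the per-region lookup is passed in as a function so the example does not depend on the network.

```typescript
// Hypothetical lookup signature: resolves with one region's result or rejects.
type RegionLookup<T> = (region: string, id: string) => Promise<T>;

interface RegionResults<T> {
  found: Map<string, T>;        // region -> successful result
  failed: Map<string, unknown>; // region -> rejection reason
}

// Queries every region concurrently. Promise.allSettled never rejects, so a
// failing region cannot discard the results of the others.
async function lookupAllRegions<T>(
  regions: readonly string[],
  id: string,
  lookup: RegionLookup<T>,
): Promise<RegionResults<T>> {
  const settled = await Promise.allSettled(regions.map((r) => lookup(r, id)));
  const results: RegionResults<T> = { found: new Map(), failed: new Map() };
  settled.forEach((outcome, i) => {
    const region = regions[i]; // allSettled preserves input order
    if (outcome.status === 'fulfilled') {
      results.found.set(region, outcome.value);
    } else {
      results.failed.set(region, outcome.reason);
    }
  });
  return results;
}
```

In practice the `lookup` argument would wrap a request to `https://itunes.apple.com/{region}/lookup?id=...`, and the `failed` map could feed the `info.messages` array of the harmonized release.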
#### API Endpoints
| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /{region}/lookup` | Album lookup by iTunes ID | `/us/lookup?id=123456` |
| `GET /{region}/lookup` | Lookup by UPC | `/us/lookup?upc=0602537347377` |
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'music.apple.com',
pathname: '/:region/album/:name/:id'
});
```
**Matches**:
- `https://music.apple.com/us/album/album-name/123456`
- `https://music.apple.com/jp/album/album-name/123456`
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.GOOD, // UPC in response
title: FeatureQuality.GOOD, // collectionName
artists: FeatureQuality.GOOD, // artistName
releaseDate: FeatureQuality.GOOD, // releaseDate
labels: FeatureQuality.PRESENT, // copyright (label name embedded)
media: FeatureQuality.GOOD, // Track listing
tracks: FeatureQuality.GOOD, // Track array
isrc: FeatureQuality.MISSING, // Not provided
images: 'varies', // 600x600 to 3000x3000
copyright: FeatureQuality.PRESENT,// copyright field
availability: FeatureQuality.GOOD // Region-specific
};
```
### 4. Tidal
**File**: `providers/tidal.ts`
#### Authentication
- **Method**: OAuth2 Client Credentials Flow
- **Credentials**: `HARMONY_TIDAL_CLIENT_ID`, `HARMONY_TIDAL_CLIENT_SECRET`
- **Token endpoint**: `https://auth.tidal.com/v1/oauth2/token`
- **API version**: v2 (v1 deprecated 2025-01-21)
#### API Version Migration
**v1 (deprecated 2025-01-21)**:
- Endpoint: `https://api.tidal.com/v1/albums/{id}`
- Status: No longer supported
**v2 (current)**:
- Endpoint: `https://openapi.tidal.com/v2/albums/{id}`
- Migration: Completed in Harmony codebase
#### API Endpoints
| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /v2/albums/{id}` | Album lookup by Tidal ID | `/v2/albums/123456` |
| `GET /v2/albums/byBarcode/{upc}` | Lookup by UPC | `/v2/albums/byBarcode/0602537347377` |
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'tidal.com',
pathname: '/browse/album/:id'
});
```
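**Matches**:
- `https://tidal.com/browse/album/123456`

**Does not match as written**:
- `https://listen.tidal.com/album/123456` (different hostname and no `/browse` prefix; accepting it would require a second pattern)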
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.GOOD, // barcode field
title: FeatureQuality.GOOD, // title field
artists: FeatureQuality.GOOD, // artists array
releaseDate: FeatureQuality.GOOD, // releaseDate
labels: FeatureQuality.GOOD, // label with catalog number
media: FeatureQuality.GOOD, // Media array
tracks: FeatureQuality.GOOD, // Track listing
isrc: FeatureQuality.GOOD, // ISRC per track
images: 1280, // Max 1280x1280px
copyright: FeatureQuality.GOOD, // copyright field
availability: FeatureQuality.GOOD // Available countries
};
```
### 5. MusicBrainz
**File**: `providers/musicbrainz.ts`
#### Authentication
- **Method**: Public API (no authentication required)
- **Base URL**: Configurable via `HARMONY_MB_API_URL` (default: `https://musicbrainz.org/ws/2`)
#### Rate Limiting
- **Limit**: 5 requests per 5 seconds (1 req/sec average)
- **Enforcement**: Server-side (503 status on exceed)
- **Handling**: Exponential backoff, respect `Retry-After` header
#### API Endpoints
| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /release/{mbid}` | Release lookup by MBID | `/release/12345678-1234-1234-1234-123456789012` |
| `GET /release?barcode={gtin}` | Search by barcode | `/release?barcode=0602537347377` |
| `GET /url?resource={url}` | MBID resolution | `/url?resource=https://open.spotify.com/album/xyz` |
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'musicbrainz.org',
pathname: '/release/:mbid'
});
```
**Matches**:
- `https://musicbrainz.org/release/12345678-1234-1234-1234-123456789012`
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.GOOD, // barcode field
title: FeatureQuality.GOOD, // title field
artists: FeatureQuality.GOOD, // artist-credit array
releaseDate: FeatureQuality.GOOD, // date field
labels: FeatureQuality.GOOD, // label-info array
media: FeatureQuality.GOOD, // media array
tracks: FeatureQuality.GOOD, // track array
isrc: FeatureQuality.GOOD, // ISRC per recording
images: FeatureQuality.MISSING, // No images in API
copyright: FeatureQuality.MISSING,// Not in API
availability: FeatureQuality.MISSING // Not tracked
};
```
#### Special Role: Template Provider
MusicBrainz serves as a **template provider** for the merge algorithm:
- **Purpose**: Provide reference data for comparison
- **Usage**: `musicbrainz!` parameter in URL
- **Behavior**: MusicBrainz data used as baseline, other providers compared against it
- **Use case**: Verify existing MusicBrainz releases against external sources
#### MBID Resolution
**Batch URL Lookup** (up to 100 URLs per request):
```typescript
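async function resolveMBIDs(urls: string[]): Promise<Map<string, string>> {
  const params = urls.map(url => `resource=${encodeURIComponent(url)}`).join('&');
  // fmt=json is required: the MusicBrainz web service returns XML by default
  const response = await fetch(
    `https://musicbrainz.org/ws/2/url?${params}&inc=release-rels&fmt=json`
  );
  if (!response.ok) {
    throw new Error(`MusicBrainz URL lookup failed: ${response.status}`);
  }
  const data = await response.json();
  const mbids = new Map<string, string>();
  // Assumes the multi-resource response shape with a top-level `urls` array
  for (const urlData of data.urls ?? []) {
    // URLs without release relationships have no `relations` to search
    const mbid = urlData.relations?.find(r => r.type === 'streaming')?.release?.id;
    if (mbid) {
      mbids.set(urlData.resource, mbid);
    }
  }
  return mbids;
}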
```
**Duplicate Detection**:
- Check whether external URLs are already linked to MusicBrainz releases
- Warn user before creating duplicate
- Provide link to existing release
## HTML Scraping Providers
### 6. Bandcamp
**File**: `providers/bandcamp.ts`
#### Scraping Method
- **Technique**: JSON-LD extraction from `<script type="application/ld+json">`
- **Fallback**: HTML parsing with CSS selectors
- **Reliability**: High (JSON-LD is stable)
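The JSON-LD technique can be illustrated without a DOM parser. The sketch below pulls every `application/ld+json` script body out of an HTML string with a regular expression and returns the first `MusicAlbum` object. It is a simplification for illustration only: `extractJsonLd` and `findMusicAlbum` are hypothetical names, and a real scraper would more likely use a proper HTML parser.

```typescript
// Extracts and parses every JSON-LD <script> block in an HTML document.
// Blocks that are not valid JSON are skipped rather than aborting the scan.
function extractJsonLd(html: string): unknown[] {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: unknown[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // Malformed block: ignore and continue with the next one
    }
  }
  return blocks;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

// Returns the first JSON-LD object whose @type is "MusicAlbum", if any.
function findMusicAlbum(html: string): Record<string, unknown> | undefined {
  return extractJsonLd(html)
    .filter(isRecord)
    .find((block) => block['@type'] === 'MusicAlbum');
}
```

When `findMusicAlbum` returns `undefined`, the scraper would fall back to the CSS-selector path mentioned above.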
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: '*.bandcamp.com',
pathname: '/album/:slug'
});
```
**Matches**:
- `https://artist.bandcamp.com/album/album-name`
- `https://label.bandcamp.com/album/album-name`
#### Data Extraction
**JSON-LD Schema.org MusicAlbum**:
```json
{
  "@type": "MusicAlbum",
  "name": "Album Title",
  "byArtist": {
    "@type": "MusicGroup",
    "name": "Artist Name"
  },
  "datePublished": "2014-11-24",
  "image": "https://f4.bcbits.com/img/a123456789_10.jpg",
  "track": [
    {
      "@type": "MusicRecording",
      "name": "Track 1",
      "duration": "PT4M5S"
    }
  ],
  "recordLabel": {
    "@type": "Organization",
    "name": "Label Name"
  }
}
```
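The `duration` values in this JSON-LD are ISO 8601 durations, while the Spotify mapping earlier fills track `length` from `duration_ms`, which suggests lengths are held in milliseconds. Assuming that unit, a conversion could look like the sketch below. `durationToMs` is a hypothetical helper, and it handles only the time-only form shown in the example (`PT…H…M…S`).

```typescript
// Converts an ISO 8601 time-only duration such as "PT4M5S" to milliseconds.
// Only hours, minutes, and seconds are handled; date components (days,
// months, years) are out of scope for track lengths. Returns undefined when
// the input is not in that form.
function durationToMs(iso: string): number | undefined {
  const match = /^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$/.exec(iso);
  if (!match) return undefined;
  const [, hours, minutes, seconds] = match;
  // A bare "PT" has no components and is not a valid duration
  if (hours === undefined && minutes === undefined && seconds === undefined) {
    return undefined;
  }
  const totalSeconds =
    Number(hours ?? 0) * 3600 + Number(minutes ?? 0) * 60 + Number(seconds ?? 0);
  return Math.round(totalSeconds * 1000);
}
```

For the sample track above, `durationToMs("PT4M5S")` yields `245000` (4 × 60 + 5 = 245 seconds).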
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.MISSING, // Not provided
title: FeatureQuality.GOOD, // name field
artists: FeatureQuality.GOOD, // byArtist
releaseDate: FeatureQuality.GOOD, // datePublished
labels: FeatureQuality.GOOD, // recordLabel
media: FeatureQuality.GOOD, // track array
tracks: FeatureQuality.GOOD, // Track listing
isrc: FeatureQuality.MISSING, // Not provided
images: 3000, // Max 3000x3000px (a123456789_10.jpg)
copyright: FeatureQuality.PRESENT,// publisher field
availability: FeatureQuality.MISSING // Not specified
};
```
#### Challenges
- **No GTIN**: Bandcamp doesn't display barcodes
- **Subdomain variability**: Each artist/label has unique subdomain
- **Rate limiting**: Not publicly specified, conservative approach
### 7. Beatport
**File**: `providers/beatport.ts`
#### Scraping Method
- **Technique**: HTML parsing with CSS selectors
- **Reliability**: Medium (HTML structure changes break scraper)
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'www.beatport.com',
pathname: '/release/:slug/:id'
});
```
**Matches**:
- `https://www.beatport.com/release/album-name/123456`
#### Data Extraction
**CSS Selectors**:
```typescript
const selectors = {
title: '.interior-release-chart-content-item h1',
artists: '.interior-release-chart-content-item .artist a',
releaseDate: '.interior-release-chart-content-item .release-date',
label: '.interior-release-chart-content-item .label a',
catalogNumber: '.interior-release-chart-content-item .catalog-number',
tracks: '.track-grid .track',
trackTitle: '.track-title',
trackArtists: '.track-artists a',
trackLength: '.track-length',
coverImage: '.interior-release-chart-artwork img'
};
```
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.PRESENT, // Sometimes in metadata
title: FeatureQuality.GOOD, // h1 element
artists: FeatureQuality.GOOD, // Artist links
releaseDate: FeatureQuality.GOOD, // Release date element
labels: FeatureQuality.GOOD, // Label + catalog number
media: FeatureQuality.GOOD, // Track grid
tracks: FeatureQuality.GOOD, // Track listing
isrc: FeatureQuality.MISSING, // Not displayed
images: 'varies', // Cover image
copyright: FeatureQuality.MISSING,// Not displayed
availability: FeatureQuality.MISSING // Not specified
};
```
#### Challenges
- **HTML structure changes**: Frequent redesigns break selectors
- **JavaScript rendering**: Some content loaded dynamically
- **Rate limiting**: Not specified, risk of IP blocking
### 8. Mora (Japan)
**File**: `providers/mora.ts`
#### Scraping Method
- **Technique**: HTML parsing with CSS selectors
- **Language**: Japanese (requires UTF-8 handling)
- **Reliability**: Medium
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'mora.jp',
pathname: '/package/:id'
});
```
**Matches**:
- `https://mora.jp/package/123456`
#### Data Extraction
**CSS Selectors** (Japanese labels):
```typescript
const selectors = {
title: '.productTitle',
artists: '.artistName a',
releaseDate: '.releaseDate',
label: '.labelName',
catalogNumber: '.catalogNumber',
tracks: '.trackList .track',
coverImage: '.productImage img'
};
```
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.PRESENT, // JAN code (Japanese barcode)
title: FeatureQuality.GOOD, // Product title
artists: FeatureQuality.GOOD, // Artist links
releaseDate: FeatureQuality.GOOD, // Release date
labels: FeatureQuality.GOOD, // Label + catalog number
media: FeatureQuality.GOOD, // Track list
tracks: FeatureQuality.GOOD, // Track details
isrc: FeatureQuality.MISSING, // Not displayed
images: 'varies', // Product image
copyright: FeatureQuality.PRESENT,// Copyright notice
availability: FeatureQuality.GOOD // Japan-specific
};
```
#### Challenges
- **Japanese text**: Requires proper encoding and language detection
- **JAN vs. UPC**: Japanese Article Number may differ from international UPC
- **Regional availability**: Japan-only releases
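On the JAN point: a JAN code is a 13-digit EAN-13 number, and a 12-digit UPC-A is the same GTIN as its 13-digit form with a leading zero. Comparing barcodes across providers therefore needs normalization before equality checks. The sketch below validates the GS1 check digit and compares codes after zero-padding to 14 digits. The function names are hypothetical and do not come from the Harmony codebase.

```typescript
// Validates the GS1 check digit of a GTIN-8/12/13/14 given as a digit string.
function hasValidCheckDigit(gtin: string): boolean {
  if (!/^(\d{8}|\d{12,14})$/.test(gtin)) return false;
  const digits = gtin.split('').map((c) => Number(c));
  const check = digits.pop() as number;
  // Weights alternate 3, 1, 3, ... starting from the digit next to the check digit
  const sum = digits
    .reverse()
    .reduce((acc, d, i) => acc + d * (i % 2 === 0 ? 3 : 1), 0);
  return (10 - (sum % 10)) % 10 === check;
}

// Strips non-digits and left-pads to 14 digits so that UPC-A (12 digits) and
// EAN-13/JAN (13 digits) representations of the same GTIN compare equal.
function normalizeGtin(gtin: string): string {
  return gtin.replace(/\D/g, '').padStart(14, '0');
}

function sameGtin(a: string, b: string): boolean {
  return normalizeGtin(a) === normalizeGtin(b);
}
```

With this, the example barcode used throughout this document compares equal in its 12-digit and 13-digit forms: `sameGtin('602537347377', '0602537347377')` is `true`.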
### 9. Ototoy (Japan)
**File**: `providers/ototoy.ts`
#### Scraping Method
- **Technique**: HTML parsing with CSS selectors
- **Language**: Japanese
- **Reliability**: Medium
#### URL Pattern
```typescript
urlPattern = new URLPattern({
hostname: 'ototoy.jp',
pathname: '/album/:id'
});
```
**Matches**:
- `https://ototoy.jp/album/123456`
#### Feature Quality
```typescript
featureQuality = {
gtin: FeatureQuality.PRESENT, // JAN code
title: FeatureQuality.GOOD, // Album title
artists: FeatureQuality.GOOD, // Artist name
releaseDate: FeatureQuality.GOOD, // Release date
labels: FeatureQuality.GOOD, // Label info
media: FeatureQuality.GOOD, // Track list
tracks: FeatureQuality.GOOD, // Track details
isrc: FeatureQuality.MISSING, // Not displayed
images: 'varies', // Album art
copyright: FeatureQuality.PRESENT,// Copyright info
availability: FeatureQuality.GOOD // Japan-specific
};
```
## Provider Base Architecture
### MetadataProvider (Abstract Base)
**File**: `providers/base.ts`
**Core Functionality**:
```typescript
abstract class MetadataProvider {
  // Identity
  abstract name: string;
  abstract urlPattern: URLPattern;

  // Lookup methods
  abstract lookupByUrl(url: string): Promise<ProviderRelease>;
  abstract lookupByGtin(gtin: string, region?: string): Promise<ProviderRelease>;

  // Harmonization
  abstract harmonize(release: ProviderRelease): HarmonyRelease;

  // Feature quality
  abstract featureQuality: FeatureQualityMap;

  // Rate limiting
  protected rateLimit: RateLimiter;
  protected async throttle(): Promise<void> {
    await this.rateLimit.wait();
  }

  // Caching
  protected cache: SnapStorage;
  protected async getCached(key: string): Promise<Response | null> {
    return await this.cache.get(key);
  }
  protected async setCached(key: string, response: Response): Promise<void> {
    await this.cache.set(key, response);
  }

  // URL matching
  matchesUrl(url: string): boolean {
    return this.urlPattern.test(url);
  }
}
```
### MetadataApiProvider (OAuth2)
**File**: `providers/api_base.ts`
**OAuth2 Support**:
```typescript
abstract class MetadataApiProvider extends MetadataProvider {
  protected abstract clientId: string;
  protected abstract clientSecret: string;
  protected abstract tokenEndpoint: string;

  protected async getAccessToken(): Promise<string> {
    // Check cache
    const cached = this.getTokenFromCache();
    if (cached && !this.isTokenExpired(cached)) {
      return cached.access_token;
    }
    // Request new token
    const token = await this.requestToken();
    this.cacheToken(token);
    return token.access_token;
  }

  // `abstract` and `async` cannot be combined; the Promise return type is enough
  protected abstract requestToken(): Promise<OAuth2Token>;

  protected async fetch(url: string, options?: RequestInit): Promise<Response> {
    const token = await this.getAccessToken();
    // Unqualified `fetch` here is the global function, not this method
    return await fetch(url, {
      ...options,
      headers: {
        ...options?.headers,
        'Authorization': `Bearer ${token}`
      }
    });
  }
}
```
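`getAccessToken` depends on `isTokenExpired`, which is not shown. One possible shape is sketched below, assuming the cached token keeps the standard OAuth2 `access_token` and `expires_in` fields plus a locally recorded timestamp. The `CachedToken` interface, the `obtainedAt` field, and the 60-second safety margin are assumptions for illustration.

```typescript
// access_token and expires_in are standard OAuth2 token response fields;
// obtainedAt is a hypothetical field recorded locally when caching the token.
interface CachedToken {
  access_token: string;
  expires_in: number; // lifetime in seconds, e.g. 3600 for Spotify
  obtainedAt: number; // Date.now() at the time the token was received
}

// Treats the token as expired slightly early (marginMs before the real
// expiry) so a request is not sent with a token that lapses in flight.
function isTokenExpired(
  token: CachedToken,
  now: number = Date.now(),
  marginMs = 60000,
): boolean {
  const expiresAt = token.obtainedAt + token.expires_in * 1000;
  return now >= expiresAt - marginMs;
}
```

Passing `now` as a parameter keeps the check deterministic in tests. With the one-hour Spotify lifetime, a token would be refreshed after 59 minutes.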
### RateLimiter
**File**: `utils/rate_limiter.ts`
**Implementation**:
```typescript
class RateLimiter {
  private queue: number[] = [];
  private maxRequests: number;
  private timeWindow: number; // milliseconds

  constructor(maxRequests: number, timeWindow: number) {
    this.maxRequests = maxRequests;
    this.timeWindow = timeWindow;
  }

  async wait(): Promise<void> {
    const now = Date.now();
    // Remove old requests outside time window
    this.queue = this.queue.filter(t => now - t < this.timeWindow);
    // If at limit, wait until oldest request expires
    if (this.queue.length >= this.maxRequests) {
      const oldestRequest = this.queue[0];
      const waitTime = this.timeWindow - (now - oldestRequest);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      return this.wait(); // Recursive call after waiting
    }
    // Add current request to queue
    this.queue.push(now);
  }
}

// Usage
const deezerLimiter = new RateLimiter(50, 5000); // 50 req / 5 sec
const mbLimiter = new RateLimiter(5, 5000); // 5 req / 5 sec
```
## Provider Registry
**File**: `providers/registry.ts`
**Registration**:
```typescript
class ProviderRegistry {
  private providers = new Map<string, MetadataProvider>();
  private categories = new Map<string, string[]>();

  register(provider: MetadataProvider, category: string): void {
    this.providers.set(provider.name, provider);
    if (!this.categories.has(category)) {
      this.categories.set(category, []);
    }
    this.categories.get(category)!.push(provider.name);
  }

  get(name: string): MetadataProvider | undefined {
    return this.providers.get(name);
  }

  getByCategory(category: string): MetadataProvider[] {
    const names = this.categories.get(category) || [];
    return names.map(name => this.providers.get(name)!);
  }

  getByUrl(url: string): MetadataProvider | undefined {
    for (const provider of this.providers.values()) {
      if (provider.matchesUrl(url)) {
        return provider;
      }
    }
    return undefined;
  }

  getByGtin(): MetadataProvider[] {
    return Array.from(this.providers.values()).filter(p =>
      p.featureQuality.gtin !== FeatureQuality.MISSING
    );
  }
}

// Initialize registry
const registry = new ProviderRegistry();
registry.register(new SpotifyProvider(), 'preferred');
registry.register(new DeezerProvider(), 'default');
registry.register(new iTunesProvider(), 'default');
registry.register(new TidalProvider(), 'preferred');
registry.register(new MusicBrainzProvider(), 'preferred');
registry.register(new BandcampProvider(), 'all');
registry.register(new BeatportProvider(), 'all');
registry.register(new MoraProvider(), 'japan');
registry.register(new OtotoyProvider(), 'japan');
```
## Not Implemented: KKBOX
**Status**: Mentioned in documentation but not implemented
**Reason**: Unknown (possibly API access issues or low priority)
**Potential Implementation**:
- **Region**: Taiwan, Hong Kong, Japan, Singapore, Malaysia
- **API**: Public API available
- **Authentication**: API key required
- **Data quality**: High (official metadata)
## Summary
Harmony's provider integration demonstrates:
1. **Diverse access methods**: API-based (5) and HTML scraping (4)
2. **Unified abstraction**: All providers implement common interface
3. **OAuth2 support**: Spotify and Tidal with token caching
4. **Rate limiting**: Per-provider rate limiters with exponential backoff
5. **Multi-region support**: iTunes queries multiple regions in parallel
6. **Feature quality ratings**: Transparent quality assessment per provider
7. **Graceful degradation**: `Promise.allSettled` ensures partial results
8. **MusicBrainz integration**: MBID resolution and duplicate detection
9. **Caching**: 24-hour HTTP response cache reduces API calls
Taken together, these patterns make Harmony a useful reference for building multi-source metadata aggregation systems.