# Harmony - Provider Integrations Analysis

## Provider Ecosystem Overview

Harmony integrates with **9 music metadata providers** using two primary access methods:

1. **API-based providers (5)**: Structured data via REST APIs
2. **HTML scraping providers (4)**: Data extraction from web pages

All providers share a common base architecture with URL pattern matching, rate limiting, caching, and harmonization to the `HarmonyRelease` schema.

## Provider Summary Table

| Provider | Type | Auth | Rate Limit | GTIN | Max Image | Regions | Status |
|----------|------|------|------------|------|-----------|---------|--------|
| Spotify | API | OAuth2 | Not specified | Yes (UPC) | 2000px | Global | Active |
| Deezer | API | Public | 50 req/5s | Yes | 1400px | Global | Active |
| iTunes | API | Public | Not specified | Yes | Varies | Multi-region | Active |
| Tidal | API | OAuth2 | Not specified | Yes | 1280px | Global | Active (v2) |
| MusicBrainz | API | Public | 5 req/5s | Yes (barcode) | N/A | Global | Active |
| Bandcamp | Scraping | None | Not specified | No | 3000px | Global | Active |
| Beatport | Scraping | None | Not specified | Yes | Varies | Global | Active |
| Mora | Scraping | None | Not specified | Yes | Varies | Japan | Active |
| Ototoy | Scraping | None | Not specified | Yes | Varies | Japan | Active |

## API-Based Providers

### 1. Spotify

**File**: `providers/spotify.ts`

#### Authentication

- **Method**: OAuth2 Client Credentials Flow
- **Credentials**: `HARMONY_SPOTIFY_CLIENT_ID`, `HARMONY_SPOTIFY_CLIENT_SECRET`
- **Token endpoint**: `https://accounts.spotify.com/api/token`
- **Token caching**: localStorage (dev) / sessionStorage (prod)
- **Token lifetime**: 3600 seconds (1 hour)

**OAuth2 Flow**:

```typescript
// `clientId` and `clientSecret` come from HARMONY_SPOTIFY_CLIENT_ID and
// HARMONY_SPOTIFY_CLIENT_SECRET.
async function getAccessToken(): Promise<string> {
  const response = await fetch('https://accounts.spotify.com/api/token', {
    method: 'POST',
    headers: {
      // HTTP Basic auth: base64("clientId:clientSecret")
      'Authorization': `Basic ${btoa(`${clientId}:${clientSecret}`)}`,
      'Content-Type': 'application/x-www-form-urlencoded'
    },
    body: 'grant_type=client_credentials'
  });
  if (!response.ok) {
    throw new Error(`Token request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.access_token;
}
```

#### API Endpoints

| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /v1/albums/{id}` | Album lookup by Spotify ID | `/v1/albums/3DiDSNVBRYVzccLn2yqhMJ` |
| `GET /v1/search` | Search by UPC | `/v1/search?q=upc:0602537347377&type=album` |

#### URL Pattern

```typescript
urlPattern = new URLPattern({
  hostname: 'open.spotify.com',
  pathname: '/album/:id'
});
```

**Matches**:

- `https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ`
- `https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ?si=xyz`

#### Feature Quality

```typescript
featureQuality = {
  gtin: FeatureQuality.GOOD,          // UPC in external_ids
  title: FeatureQuality.GOOD,         // Album name
  artists: FeatureQuality.GOOD,       // Artist array with names
  releaseDate: FeatureQuality.GOOD,   // release_date field
  labels: FeatureQuality.PRESENT,     // Label name (no catalog number)
  media: FeatureQuality.GOOD,         // Disc structure
  tracks: FeatureQuality.GOOD,        // Track listing with durations
  isrc: FeatureQuality.GOOD,          // ISRC per track
  images: 2000,                       // Max 2000x2000px
  copyright: FeatureQuality.PRESENT,  // Copyright array
  availability: FeatureQuality.GOOD   // available_markets array
};
```

#### Data Mapping

**Spotify Album Object** → **HarmonyRelease**:

| Spotify Field | Harmony Field | Transformation |
|---------------|---------------|----------------|
| `name` | `title` | Direct |
| `artists[].name` | `artists[].name` | Map array |
| `external_ids.upc` | `gtin` | Direct |
| `release_date` | `releaseDate` | Parse to PartialDate |
| `label` | `labels[0].name` | Single label |
| `tracks.items[]` | `media[0].tracks[]` | Map to HarmonyTrack |
| `images[]` | `images[]` | Map with dimensions |
| `copyrights[0].text` | `copyright` | First copyright |
| `available_markets[]` | `availableIn[]` | Direct |
| `external_urls.spotify` | `externalLinks[0].url` | Streaming link |

**Example Harmonization**:

```typescript
harmonize(spotifyAlbum: SpotifyAlbum): HarmonyRelease {
  return {
    title: spotifyAlbum.name,
    artists: spotifyAlbum.artists.map(a => ({ name: a.name })),
    gtin: spotifyAlbum.external_ids?.upc,
    media: [{
      format: MediumFormat.Digital,
      position: 1,
      tracks: spotifyAlbum.tracks.items.map((t, i) => ({
        title: t.name,
        position: i + 1,
        length: t.duration_ms, // milliseconds
        isrc: t.external_ids?.isrc,
        // Track artists are kept only when their count differs from the album's
        artists: t.artists.length !== spotifyAlbum.artists.length
          ? t.artists.map(a => ({ name: a.name }))
          : undefined
      }))
    }],
    releaseDate: this.parseDate(spotifyAlbum.release_date),
    types: this.inferTypes(spotifyAlbum.album_type),
    images: spotifyAlbum.images.map(img => ({
      url: img.url,
      types: [ImageType.Front],
      width: img.width,
      height: img.height
    })),
    labels: spotifyAlbum.label ? [{ name: spotifyAlbum.label }] : [],
    copyright: spotifyAlbum.copyrights?.[0]?.text,
    availableIn: spotifyAlbum.available_markets,
    externalLinks: [{
      url: spotifyAlbum.external_urls.spotify,
      types: [LinkType.Streaming]
    }],
    info: { providers: ['spotify'], messages: [] }
  };
}
```

#### Rate Limiting

- **Limit**: Not publicly specified
- **Handling**: Retry on HTTP 429, honoring the `Retry-After` header
- **Caching**: 24-hour cache reduces API calls

### 2. Deezer

**File**: `providers/deezer.ts`

#### Authentication

- **Method**: Public API (no authentication required)
- **Base URL**: `https://api.deezer.com`

#### Rate Limiting

- **Limit**: 50 requests per 5 seconds
- **Enforcement**: Server-side (429 status when exceeded)
- **Handling**: Exponential backoff, honoring the `Retry-After` header

#### API Endpoints

| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /album/{id}` | Album lookup by Deezer ID | `/album/123456` |
| `GET /search/album` | Search by UPC | `/search/album?q=upc:0602537347377` |

#### URL Pattern

```typescript
urlPattern = new URLPattern({
  hostname: 'www.deezer.com',
  pathname: '/:locale/album/:id'
});
```

**Matches**:

- `https://www.deezer.com/en/album/123456`
- `https://www.deezer.com/fr/album/123456`

#### Feature Quality

```typescript
featureQuality = {
  gtin: FeatureQuality.GOOD,            // UPC field
  title: FeatureQuality.GOOD,           // Title field
  artists: FeatureQuality.GOOD,         // Artist object
  releaseDate: FeatureQuality.GOOD,     // release_date field
  labels: FeatureQuality.GOOD,          // Label with catalog number
  media: FeatureQuality.GOOD,           // Disc structure
  tracks: FeatureQuality.GOOD,          // Track listing
  isrc: FeatureQuality.GOOD,            // ISRC per track
  images: 1400,                         // Max 1400x1400px
  copyright: FeatureQuality.GOOD,       // Copyright field
  availability: FeatureQuality.PRESENT  // Available countries (limited)
};
```

#### Data Mapping

**Deezer Album Object** → **HarmonyRelease**:

| Deezer Field | Harmony Field | Notes |
|--------------|---------------|-------|
| `title` | `title` | Direct |
| `artist.name` | `artists[0].name` | Single artist |
| `upc` | `gtin` | Direct |
| `release_date` | `releaseDate` | YYYY-MM-DD format |
| `label` | `labels[0].name` | Label name |
| `tracks.data[]` | `media[0].tracks[]` | Track array |
| `cover_xl` | `images[0].url` | 1400x1400px |
| `copyright` | `copyright` | Direct |

### 3. iTunes (Apple Music)

**File**: `providers/itunes.ts`

#### Authentication

- **Method**: Public API (no authentication required)
- **Base URL**: `https://itunes.apple.com`

#### Multi-Region Support

The iTunes API is region-specific, so Harmony queries multiple regions in parallel.

**Supported Regions**:

- `US` (United States)
- `GB` (United Kingdom)
- `DE` (Germany)
- `JP` (Japan)
- `FR` (France)
- `CA` (Canada)
- `AU` (Australia)

**Region-Specific Endpoints**:

```
https://itunes.apple.com/us/lookup?id=123456
https://itunes.apple.com/gb/lookup?id=123456
https://itunes.apple.com/jp/lookup?id=123456
```

#### API Endpoints

| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /{region}/lookup` | Album lookup by iTunes ID | `/us/lookup?id=123456` |
| `GET /{region}/search` | Search by UPC | `/us/search?term=upc:0602537347377` |

#### URL Pattern

```typescript
urlPattern = new URLPattern({
  hostname: 'music.apple.com',
  pathname: '/:region/album/:name/:id'
});
```

**Matches**:

- `https://music.apple.com/us/album/album-name/123456`
- `https://music.apple.com/jp/album/album-name/123456`

#### Feature Quality

```typescript
featureQuality = {
  gtin: FeatureQuality.GOOD,          // UPC in response
  title: FeatureQuality.GOOD,         // collectionName
  artists: FeatureQuality.GOOD,       // artistName
  releaseDate: FeatureQuality.GOOD,   // releaseDate
  labels: FeatureQuality.PRESENT,     // copyright (label name embedded)
  media: FeatureQuality.GOOD,         // Track listing
  tracks: FeatureQuality.GOOD,        // Track array
  isrc: FeatureQuality.MISSING,       // Not provided
  images: 'varies',                   // 600x600 to 3000x3000
  copyright: FeatureQuality.PRESENT,  // copyright field
  availability: FeatureQuality.GOOD   // Region-specific
};
```

### 4. Tidal

**File**: `providers/tidal.ts`

#### Authentication

- **Method**: OAuth2 Client Credentials Flow
- **Credentials**: `HARMONY_TIDAL_CLIENT_ID`, `HARMONY_TIDAL_CLIENT_SECRET`
- **Token endpoint**: `https://auth.tidal.com/v1/oauth2/token`
- **API version**: v2 (v1 deprecated 2025-01-21)

#### API Version Migration

**v1 (deprecated 2025-01-21)**:

- Endpoint: `https://api.tidal.com/v1/albums/{id}`
- Status: No longer supported

**v2 (current)**:

- Endpoint: `https://openapi.tidal.com/v2/albums/{id}`
- Migration: Completed in Harmony codebase

#### API Endpoints

| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /v2/albums/{id}` | Album lookup by Tidal ID | `/v2/albums/123456` |
| `GET /v2/albums/byBarcode/{upc}` | Lookup by UPC | `/v2/albums/byBarcode/0602537347377` |

#### URL Pattern

```typescript
urlPattern = new URLPattern({
  hostname: 'tidal.com',
  pathname: '/browse/album/:id'
});
```

**Matches**:

- `https://tidal.com/browse/album/123456`
- `https://listen.tidal.com/album/123456` (listed as supported, but the single pattern shown above fixes the hostname to `tidal.com` and the path to `/browse/album/:id`, so matching this URL requires an additional or broader pattern)

#### Feature Quality

```typescript
featureQuality = {
  gtin: FeatureQuality.GOOD,         // barcode field
  title: FeatureQuality.GOOD,        // title field
  artists: FeatureQuality.GOOD,      // artists array
  releaseDate: FeatureQuality.GOOD,  // releaseDate
  labels: FeatureQuality.GOOD,       // label with catalog number
  media: FeatureQuality.GOOD,        // Media array
  tracks: FeatureQuality.GOOD,       // Track listing
  isrc: FeatureQuality.GOOD,         // ISRC per track
  images: 1280,                      // Max 1280x1280px
  copyright: FeatureQuality.GOOD,    // copyright field
  availability: FeatureQuality.GOOD  // Available countries
};
```

### 5. MusicBrainz

**File**: `providers/musicbrainz.ts`

#### Authentication

- **Method**: Public API (no authentication required)
- **Base URL**: Configurable via `HARMONY_MB_API_URL` (default: `https://musicbrainz.org/ws/2`)

#### Rate Limiting

- **Limit**: 5 requests per 5 seconds (1 req/sec average)
- **Enforcement**: Server-side (503 status when exceeded)
- **Handling**: Exponential backoff, honoring the `Retry-After` header

#### API Endpoints

| Endpoint | Purpose | Example |
|----------|---------|---------|
| `GET /release/{mbid}` | Release lookup by MBID | `/release/12345678-1234-1234-1234-123456789012` |
| `GET /release?barcode={gtin}` | Search by barcode | `/release?barcode=0602537347377` |
| `GET /url?resource={url}` | MBID resolution | `/url?resource=https://open.spotify.com/album/xyz` |

#### URL Pattern

```typescript
urlPattern = new URLPattern({
  hostname: 'musicbrainz.org',
  pathname: '/release/:mbid'
});
```

**Matches**:

- `https://musicbrainz.org/release/12345678-1234-1234-1234-123456789012`

#### Feature Quality

```typescript
featureQuality = {
  gtin: FeatureQuality.GOOD,            // barcode field
  title: FeatureQuality.GOOD,           // title field
  artists: FeatureQuality.GOOD,         // artist-credit array
  releaseDate: FeatureQuality.GOOD,     // date field
  labels: FeatureQuality.GOOD,          // label-info array
  media: FeatureQuality.GOOD,           // media array
  tracks: FeatureQuality.GOOD,          // track array
  isrc: FeatureQuality.GOOD,            // ISRC per recording
  images: FeatureQuality.MISSING,       // No images in API
  copyright: FeatureQuality.MISSING,    // Not in API
  availability: FeatureQuality.MISSING  // Not tracked
};
```

#### Special Role: Template Provider

MusicBrainz serves as a **template provider** for the merge algorithm:

- **Purpose**: Provide reference data for comparison
- **Usage**: `musicbrainz!` parameter in URL
- **Behavior**: MusicBrainz data is used as the baseline, and other providers are compared against it
- **Use case**: Verify existing MusicBrainz releases against external sources

#### MBID Resolution

**Batch URL Lookup**
(up to 100 URLs per request):

```typescript
async function resolveMBIDs(urls: string[]): Promise<Map<string, string>> {
  // One `resource` parameter per URL to resolve
  const params = urls.map(url => `resource=${encodeURIComponent(url)}`).join('&');
  // `fmt=json` is needed because the MusicBrainz web service returns XML by default
  const response = await fetch(`https://musicbrainz.org/ws/2/url?${params}&inc=release-rels&fmt=json`);
  const data = await response.json();

  // Maps each external URL to the MBID of the release it is linked to
  const mbids = new Map<string, string>();
  for (const urlData of data.urls) {
    const mbid = urlData.relations.find((r: any) => r.type === 'streaming')?.release?.id;
    if (mbid) {
      mbids.set(urlData.resource, mbid);
    }
  }
  return mbids;
}
```

**Duplicate Detection**:

- Check whether external URLs are already linked to MusicBrainz releases
- Warn the user before creating a duplicate
- Provide a link to the existing release

## HTML Scraping Providers

### 6. Bandcamp

**File**: `providers/bandcamp.ts`

#### Scraping Method

- **Technique**: JSON-LD extraction from `