Harmony - Architecture Analysis
System Architecture Overview
Harmony implements a 4-stage pipeline architecture for metadata aggregation and harmonization:
┌──────────┐     ┌────────────┐     ┌───────┐     ┌──────┐
│  LOOKUP  │ --> │ HARMONIZE  │ --> │ MERGE │ --> │ SEED │
└──────────┘     └────────────┘     └───────┘     └──────┘
     │                 │                │            │
  Parallel          Provider         3-phase    MusicBrainz
Multi-source       Conversion         Merge        Format
  Queries          to Harmony       Algorithm    Conversion
Each stage has distinct responsibilities and operates on well-defined data structures.
Stage 1: LOOKUP
CombinedReleaseLookup
The entry point for all metadata retrieval operations.
Location: harmonizer/combined_lookup.ts
Responsibilities:
- Accepts GTIN, URLs, or provider-specific IDs
- Determines which providers to query based on input
- Executes provider lookups in parallel
- Handles provider failures gracefully via Promise.allSettled
- Returns array of provider-specific release objects
Input Types:
interface LookupInput {
gtin?: string; // Global Trade Item Number (barcode)
urls?: string[]; // Provider URLs
region?: string[]; // Market regions (e.g., ['GB', 'US', 'JP'])
category?: string; // Provider category filter
providerIds?: Record<string, string>; // Provider-specific IDs
}
Parallel Execution:
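// Conceptual flow
// Promise.allSettled never rejects: each provider's outcome is reported as
// 'fulfilled' or 'rejected', so one failing provider cannot abort the others.
const lookupPromises = providers.map(provider => provider.lookup(input));
const results = await Promise.allSettled(lookupPromises);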
Output: Array of provider-native release objects (Spotify, Deezer, iTunes formats, etc.)
Provider Selection Logic
- URL-based: Extract provider from URL pattern matching
- GTIN-based: Query all providers supporting GTIN lookup
- Category filtering: Apply user preferences (all/default/preferred)
- Region filtering: Pass region codes to region-aware providers
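The selection rules above could be sketched roughly as follows. This is a minimal illustration, not Harmony's actual code: the ProviderInfo shape, the selectProviders function, and the sample provider data are all assumptions made for the example.

```typescript
// Hypothetical provider descriptor; field names are assumptions for illustration.
interface ProviderInfo {
  name: string;
  hostnames: string[]; // hosts whose URLs this provider can handle
  supportsGtin: boolean;
  categories: string[]; // e.g. ['default', 'preferred']
}

interface SelectionInput {
  gtin?: string;
  urls?: string[];
  category?: string;
}

function selectProviders(input: SelectionInput, providers: ProviderInfo[]): string[] {
  const selected = new Set<string>();

  // URL-based: match each URL's hostname against the providers' known hosts.
  for (const raw of input.urls ?? []) {
    const host = new URL(raw).hostname; // throws on malformed URLs
    for (const p of providers) {
      if (p.hostnames.includes(host)) selected.add(p.name);
    }
  }

  // GTIN-based: every GTIN-capable provider in the requested category.
  if (input.gtin) {
    const category = input.category ?? 'default';
    for (const p of providers) {
      const inCategory = category === 'all' || p.categories.includes(category);
      if (p.supportsGtin && inCategory) selected.add(p.name);
    }
  }
  return Array.from(selected);
}

// Sample data only; not Harmony's real provider configuration.
const sampleProviders: ProviderInfo[] = [
  { name: 'spotify', hostnames: ['open.spotify.com'], supportsGtin: true, categories: ['default', 'preferred'] },
  { name: 'deezer', hostnames: ['www.deezer.com'], supportsGtin: true, categories: ['default'] },
  { name: 'mora', hostnames: ['mora.jp'], supportsGtin: false, categories: ['japan'] },
];
```

Region filtering is left out here; in this sketch it would simply mean forwarding the region codes to whichever selected providers accept them.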
Stage 2: HARMONIZE
Provider Conversion
Each provider implements a harmonize() method that converts its native format to HarmonyRelease.
Location: Individual provider files in providers/
Conversion Responsibilities:
- Map provider-specific field names to Harmony schema
- Normalize data types (dates, durations, ISRCs)
- Extract nested structures (artists, labels, media)
- Detect language and script from metadata
- Resolve release types (album, single, EP, etc.)
- Extract external links and identifiers
Example Provider Conversion (conceptual):
class SpotifyProvider extends MetadataApiProvider {
harmonize(spotifyAlbum: SpotifyAlbum): HarmonyRelease {
return {
title: spotifyAlbum.name,
artists: this.convertArtists(spotifyAlbum.artists),
gtin: spotifyAlbum.external_ids?.upc,
media: this.convertTracks(spotifyAlbum.tracks),
releaseDate: this.parseDate(spotifyAlbum.release_date),
images: this.convertImages(spotifyAlbum.images),
externalLinks: [{
url: spotifyAlbum.external_urls.spotify,
types: ['streaming']
}],
// ... additional fields
};
}
}
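The parseDate helper called in the example above is not shown in the source. One plausible shape for it, assuming providers deliver dates as YYYY, YYYY-MM, or YYYY-MM-DD strings and that PartialDate carries optional year, month, and day fields (both assumptions), is:

```typescript
// Assumed shape of PartialDate; the source names the type but does not define it.
interface PartialDate {
  year?: number;
  month?: number;
  day?: number;
}

// Parses 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD'; returns undefined for anything else.
// Note: this sketch does not range-check the month or day.
function parsePartialDate(value: string): PartialDate | undefined {
  const match = /^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$/.exec(value.trim());
  if (!match) return undefined;
  const date: PartialDate = { year: Number(match[1]) };
  if (match[2] !== undefined) date.month = Number(match[2]);
  if (match[3] !== undefined) date.day = Number(match[3]);
  return date;
}
```

Keeping missing components absent, rather than defaulting them to 1, preserves the distinction between a year-only date and January 1st, which matters later when dates from different providers are compared.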
HarmonyRelease Schema
Location: harmonizer/types.ts (273 lines)
Core Structure:
interface HarmonyRelease {
// Basic metadata
title: string;
artists: ArtistCreditName[];
gtin?: string;
// Media and tracks
media: HarmonyMedium[];
// Release details
language?: string;
script?: string;
status?: ReleaseStatus;
types: ReleaseType[];
releaseDate?: PartialDate;
// Commercial info
labels: Label[];
packaging?: PackagingType;
copyright?: string;
// Distribution
availableIn?: string[]; // Country codes
excludedFrom?: string[]; // Country codes
// Visual assets
images: Image[];
// Links and identifiers
externalLinks: ExternalLink[];
// Metadata about metadata
info: {
providers: string[]; // Which providers contributed
messages: Message[]; // Warnings, errors
sourceMap?: SourceMap; // Property -> provider mapping
incompatibleData?: IncompatibilityInfo;
};
}
Key Sub-structures:
ArtistCreditName
interface ArtistCreditName {
name: string; // Display name
creditedName?: string; // Alternative credit
joinPhrase?: string; // Separator (e.g., " & ", " feat. ")
mbid?: string; // MusicBrainz ID
}
HarmonyMedium
interface HarmonyMedium {
title?: string;
format?: MediumFormat; // CD, Vinyl, Digital, etc.
position: number;
tracks: HarmonyTrack[];
}
HarmonyTrack
interface HarmonyTrack {
title: string;
artists?: ArtistCreditName[];
position: number;
length?: number; // Duration in milliseconds
isrc?: string; // International Standard Recording Code
}
Label
interface Label {
name: string;
catalogNumber?: string;
mbid?: string;
}
Image
interface Image {
url: string;
types: ImageType[]; // 'front', 'back', 'medium', etc.
width?: number;
height?: number;
comment?: string;
}
Harmonizer Modules
Location: harmonizer/ directory
| Module | Purpose | Lines |
|---|---|---|
| types.ts | HarmonyRelease schema and type definitions | 273 |
| merge.ts | 3-phase merge algorithm | ~200 |
| compatibility.ts | Conflict detection and resolution | ~150 |
| deduplicate.ts | Remove duplicate entries | ~100 |
| isrc.ts | ISRC validation and normalization | ~50 |
| language_script.ts | Auto-detect language and script | ~100 |
| release_label.ts | Label normalization | ~80 |
| release_types.ts | Release type inference | ~120 |
| tracklist_gap.ts | Detect missing tracks | ~60 |
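As an illustration of what a module such as isrc.ts might do, here is a minimal sketch of ISRC normalization and validation. The function name normalizeIsrc is hypothetical, and the check is purely structural (it does not verify that the country or registrant codes are actually assigned).

```typescript
// ISRC structure: 2-letter country code, 3 alphanumeric registrant characters,
// 2-digit year of reference, 5-digit designation code (12 characters in total).
const ISRC_PATTERN = /^[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}$/;

// Strips separators, upper-cases, and returns the compact form,
// or undefined when the result is not a structurally valid ISRC.
function normalizeIsrc(raw: string): string | undefined {
  const compact = raw.replace(/[\s-]/g, '').toUpperCase();
  return ISRC_PATTERN.test(compact) ? compact : undefined;
}
```

For instance, the hyphenated display form gb-aye-69-00531 would normalize to GBAYE6900531, while a string with the wrong length would be rejected.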
Stage 3: MERGE
3-Phase Merge Algorithm
Location: harmonizer/merge.ts
The merge algorithm combines multiple HarmonyRelease objects into a single MergedHarmonyRelease using provider preferences and compatibility checking.
Phase 1: Property Collection
Collect all values for each property across all releases:
// Conceptual
const propertyValues = {
title: ['Album Title', 'Album Title (Deluxe)', 'Album Title'],
gtin: ['0602537347377', '0602537347377'],
releaseDate: ['2014-11-24', '2014-11-24', '2014-11-25'],
// ... all properties
};
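A collection step like this could be written as below. The ReleaseLike type is a trimmed-down stand-in for HarmonyRelease with string-only fields, and collectPropertyValues is a hypothetical name; unlike the conceptual snippet above, each value is tagged with its provider so later phases can apply preferences.

```typescript
// Minimal stand-in for a harmonized release; only a few fields, for brevity.
interface ReleaseLike {
  provider: string;
  title?: string;
  gtin?: string;
  releaseDate?: string;
}

type PropertyName = 'title' | 'gtin' | 'releaseDate';

interface CollectedValue {
  provider: string;
  data: string;
}

function collectPropertyValues(
  releases: ReleaseLike[],
  properties: PropertyName[],
): Record<PropertyName, CollectedValue[]> {
  const collected: Record<PropertyName, CollectedValue[]> = {
    title: [],
    gtin: [],
    releaseDate: [],
  };
  for (const release of releases) {
    for (const property of properties) {
      const data = release[property];
      // Skip providers that did not supply this property.
      if (data !== undefined) {
        collected[property].push({ provider: release.provider, data });
      }
    }
  }
  return collected;
}
```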
Phase 2: Compatibility Checking
For each property, check if values are compatible:
interface CompatibilityCheck {
compatible: boolean;
canonicalValue?: any;
conflicts?: ConflictInfo[];
}
Compatibility Rules:
- Strings: Case-insensitive comparison, whitespace normalization
- Dates: Partial date matching (year-only vs. full date)
- Arrays: Set comparison (order-independent)
- Numbers: Exact match or within tolerance
- Objects: Recursive field comparison
Example Compatibility:
// Compatible
'2014-11-24' ≈ '2014-11' // Partial date match
'Album Title' ≈ 'album title' // Case-insensitive
// Incompatible
'2014-11-24' ≠ '2014-11-25' // Date conflict
'Album' ≠ 'EP' // Type conflict
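The string and date rules can be expressed in a few lines. This is a sketch under the assumption that dates are compared as YYYY[-MM[-DD]] strings; the function names are illustrative and not taken from compatibility.ts.

```typescript
// Strings: compare case-insensitively after trimming and collapsing whitespace.
function stringsCompatible(a: string, b: string): boolean {
  const normalize = (s: string) => s.trim().replace(/\s+/g, ' ').toLowerCase();
  return normalize(a) === normalize(b);
}

// Dates in 'YYYY[-MM[-DD]]' form: compatible when every component that
// both sides specify is equal, so a year-only date matches any day in that year.
function datesCompatible(a: string, b: string): boolean {
  const partsA = a.split('-');
  const partsB = b.split('-');
  const shared = Math.min(partsA.length, partsB.length);
  for (let i = 0; i < shared; i++) {
    if (partsA[i] !== partsB[i]) return false;
  }
  return true;
}
```

When two dates are compatible, the more specific one is the natural candidate for the canonical value, since it contains everything the less specific one asserts.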
Phase 3: Value Selection
For each property, select the best value using provider preferences:
Provider Preference Order (configurable):
- MusicBrainz (template/reference)
- Spotify (high quality, comprehensive)
- Tidal (high quality audio metadata)
- Deezer (good coverage)
- iTunes (region-specific)
- Bandcamp (artist-verified)
- Beatport (electronic music specialist)
- Mora (Japan specialist)
- Ototoy (Japan specialist)
Selection Logic:
function selectBestValue(values: PropertyValues, preferences: string[]): any {
// 1. Filter to compatible values only
const compatible = values.filter(v => v.isCompatible);
// 2. If no compatible values, mark as conflict
if (compatible.length === 0) {
return { conflict: true, values };
}
// 3. Select from highest-preference provider
for (const provider of preferences) {
const value = compatible.find(v => v.provider === provider);
if (value) return value.data;
}
// 4. Fallback to first compatible value
return compatible[0].data;
}
MergedHarmonyRelease
Extends HarmonyRelease with merge metadata:
interface MergedHarmonyRelease extends HarmonyRelease {
sourceMap: SourceMap; // Property -> provider mapping
incompatibleData?: IncompatibilityInfo;
}
interface SourceMap {
[propertyPath: string]: string; // e.g., "title" -> "spotify"
}
interface IncompatibilityInfo {
conflicts: Conflict[];
warnings: string[];
}
interface Conflict {
property: string;
values: Array<{
provider: string;
value: any;
}>;
}
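To show how the source map and conflict list might be populated during a merge, here is a deliberately simplified sketch. It uses exact string comparison instead of the compatibility rules described earlier, and the names Candidate, MergeResult, and mergeProperties are assumptions rather than Harmony's real identifiers.

```typescript
interface Candidate {
  provider: string;
  value: string;
}

interface PropertyConflict {
  property: string;
  values: Candidate[];
}

interface MergeResult {
  merged: Record<string, string>;
  sourceMap: Record<string, string>; // property -> provider
  conflicts: PropertyConflict[];
}

function mergeProperties(
  candidates: Record<string, Candidate[]>,
  preferences: string[],
): MergeResult {
  const result: MergeResult = { merged: {}, sourceMap: {}, conflicts: [] };
  // Providers missing from the preference list rank after all listed ones.
  const rank = (c: Candidate): number => {
    const index = preferences.indexOf(c.provider);
    return index === -1 ? preferences.length : index;
  };
  for (const property of Object.keys(candidates)) {
    const values = candidates[property] ?? [];
    if (values.length === 0) continue;
    // Pick the candidate from the highest-preference provider.
    const best = values.reduce((a, b) => (rank(b) < rank(a) ? b : a));
    result.merged[property] = best.value;
    result.sourceMap[property] = best.provider;
    // Record a conflict when any provider disagrees with the chosen value.
    if (values.some(v => v.value !== best.value)) {
      result.conflicts.push({ property, values });
    }
  }
  return result;
}
```

The source map makes every merged field traceable: a UI can show which provider supplied the title, and the conflict list gives the raw alternatives for manual review.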
Deduplication
Location: harmonizer/deduplicate.ts
Removes duplicate entries in arrays:
- Artists: Match by name (case-insensitive) or MBID
- Labels: Match by name and catalog number
- Tracks: Match by position and title
- Images: Match by URL or dimensions
- External links: Match by URL
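The artist rule, for example, could look like the sketch below. It assumes a reduced ArtistCreditName with only name and mbid, and keeps the first occurrence of each artist; a fuller version would probably merge fields so that an MBID found on a later duplicate is not lost.

```typescript
interface ArtistEntry {
  name: string;
  mbid?: string;
}

// Keeps the first occurrence of each artist, treating two entries as the same
// artist when their names match case-insensitively or their MBIDs are equal.
function deduplicateArtists(artists: ArtistEntry[]): ArtistEntry[] {
  const seenNames = new Set<string>();
  const seenMbids = new Set<string>();
  const unique: ArtistEntry[] = [];
  for (const artist of artists) {
    const nameKey = artist.name.trim().toLowerCase();
    const duplicate =
      seenNames.has(nameKey) ||
      (artist.mbid !== undefined && seenMbids.has(artist.mbid));
    if (duplicate) continue;
    seenNames.add(nameKey);
    if (artist.mbid !== undefined) seenMbids.add(artist.mbid);
    unique.push(artist);
  }
  return unique;
}
```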
Compatibility Checking
Location: harmonizer/compatibility.ts
Detects and reports incompatible data:
Incompatibility Types:
- Value conflicts: Different values for same property
- Type conflicts: Different data types
- Structural conflicts: Different array lengths, missing required fields
- Semantic conflicts: Logically incompatible values (e.g., release date before artist birth)
Handling:
- Strict mode: Reject merge if any conflicts
- Lenient mode: Prefer highest-quality provider, log warnings
- User override: Allow manual conflict resolution
Stage 4: SEED
MusicBrainz Seeding
Location: musicbrainz/seeding.ts
Converts MergedHarmonyRelease to MusicBrainz import format.
Conversion Steps:
- Map HarmonyRelease fields to MusicBrainz schema
- Generate edit notes with provider URLs
- Create permalink for reproducibility
- Build annotation with extra data (copyright, availability)
- Format for MusicBrainz seeder form
MusicBrainz Mapping:
| Harmony Field | MusicBrainz Field | Notes |
|---|---|---|
| title | Release name | Direct mapping |
| artists | Artist credit | Join with joinPhrase |
| gtin | Barcode | Validate format |
| releaseDate | Release events | Per-country events |
| labels | Release labels | With catalog numbers |
| media | Mediums | With format and tracks |
| types | Release group types | Primary + secondary |
| language | Language | ISO 639-3 code |
| script | Script | ISO 15924 code |
| packaging | Packaging | Jewel case, digipak, etc. |
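A sketch of how part of this mapping might be flattened into seeder form fields follows. The dotted key names (name, barcode, artist_credit.names.N.name, labels.N.catalog_number, edit_note) reflect the MusicBrainz release editor seeding parameters as best I recall them and should be checked against the MusicBrainz documentation; SeedRelease and buildSeedFields are hypothetical names.

```typescript
interface SeedArtist {
  name: string;
  joinPhrase?: string;
}

interface SeedRelease {
  title: string;
  artists: SeedArtist[];
  gtin?: string;
  labels: { name: string; catalogNumber?: string }[];
}

// Flattens a release into dotted form-field keys (key names are assumptions
// based on the MusicBrainz seeding documentation; verify before relying on them).
function buildSeedFields(release: SeedRelease, editNote: string): Record<string, string> {
  const fields: Record<string, string> = {
    name: release.title,
    edit_note: editNote,
  };
  release.artists.forEach((artist, i) => {
    fields[`artist_credit.names.${i}.name`] = artist.name;
    if (artist.joinPhrase !== undefined) {
      fields[`artist_credit.names.${i}.join_phrase`] = artist.joinPhrase;
    }
  });
  if (release.gtin !== undefined) fields['barcode'] = release.gtin;
  release.labels.forEach((label, i) => {
    fields[`labels.${i}.name`] = label.name;
    if (label.catalogNumber !== undefined) {
      fields[`labels.${i}.catalog_number`] = label.catalogNumber;
    }
  });
  return fields;
}
```

Each entry in the returned record would become one hidden input in the seeder form, so the flat key/value shape maps directly onto an HTML form submission.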
Edit Note Generation:
function generateEditNote(release: MergedHarmonyRelease, permalink: string): string {
const sources = release.info.providers.join(', ');
return `
Imported from ${sources} via Harmony
Permalink: ${permalink}
${release.externalLinks.map(link => link.url).join('\n')}
`.trim();
}
MBID Resolution
Location: musicbrainz/mbid_mapping.ts
Resolves external URLs to MusicBrainz IDs (MBIDs).
Batch Lookup:
- Collects up to 100 URLs
- Single MusicBrainz API request: GET /ws/2/url?resource={url1}&resource={url2}&...
- Caches results in localStorage (dev) or sessionStorage (prod)
- Returns MBID mappings
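The batching step could be sketched like this: split the URL list into groups of at most 100 and build one request URL per group. The function name, the default API base, and the added fmt=json parameter (which asks the MusicBrainz web service for JSON rather than XML) are choices made for this example, not details taken from mbid_mapping.ts.

```typescript
const MAX_URLS_PER_REQUEST = 100; // batch size stated in the list above

// Builds one /ws/2/url request per batch of up to 100 external URLs.
function buildUrlLookupRequests(
  urls: string[],
  apiBase: string = 'https://musicbrainz.org',
): string[] {
  const requests: string[] = [];
  for (let start = 0; start < urls.length; start += MAX_URLS_PER_REQUEST) {
    const params = new URLSearchParams();
    for (const url of urls.slice(start, start + MAX_URLS_PER_REQUEST)) {
      params.append('resource', url); // URLSearchParams handles the escaping
    }
    params.set('fmt', 'json');
    requests.push(`${apiBase}/ws/2/url?${params.toString()}`);
  }
  return requests;
}
```

Batching matters because the MusicBrainz API is rate limited; resolving 250 links this way costs three requests instead of 250.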
Duplicate Detection:
- Checks if release already exists in MusicBrainz
- Warns user before creating duplicate
- Provides link to existing release
Cache Strategy:
interface MBIDCache {
[externalUrl: string]: {
mbid: string;
type: 'release' | 'release-group' | 'recording' | 'artist';
cached: number; // Timestamp
};
}
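Reading and writing such entries might look like the following sketch. It stores one entry per key rather than a single map object, uses a small KeyValueStore interface that both localStorage and sessionStorage satisfy, and assumes a 24-hour expiry; the source does not state how long entries stay valid, so MAX_AGE_MS is purely illustrative.

```typescript
interface MBIDCacheEntry {
  mbid: string;
  type: 'release' | 'release-group' | 'recording' | 'artist';
  cached: number; // Unix time in milliseconds
}

// Minimal subset of the Web Storage API, so localStorage, sessionStorage,
// or an in-memory stand-in can be passed in.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const MAX_AGE_MS = 24 * 60 * 60 * 1000; // assumed expiry, not from the source

function writeCachedMbid(store: KeyValueStore, url: string, entry: MBIDCacheEntry): void {
  store.setItem(`mbid:${url}`, JSON.stringify(entry));
}

// Returns the cached entry, or undefined when it is missing or too old.
function readCachedMbid(
  store: KeyValueStore,
  url: string,
  now: number,
): MBIDCacheEntry | undefined {
  const raw = store.getItem(`mbid:${url}`);
  if (raw === null) return undefined;
  const entry = JSON.parse(raw) as MBIDCacheEntry;
  return now - entry.cached <= MAX_AGE_MS ? entry : undefined;
}
```

Passing the current time in as a parameter, instead of calling Date.now() inside, keeps the expiry logic easy to test.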
Annotation Builder
Location: musicbrainz/annotation.ts
Generates MusicBrainz annotation text for additional metadata:
Included Data:
- Copyright information
- Availability/exclusion regions
- Provider-specific notes
- Compatibility warnings
- Image URLs (if not added as cover art)
Format:
Copyright: © 2014 Record Label
Available in: US, GB, DE, JP
Excluded from: CN
Sources:
- Spotify: https://open.spotify.com/album/xyz
- Deezer: https://www.deezer.com/album/123
Notes:
- Release date conflict: Spotify (2014-11-24) vs iTunes (2014-11-25)
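An annotation in this format could be assembled as in the sketch below. AnnotationInput and buildAnnotation are hypothetical names; sections with no data are simply omitted.

```typescript
interface AnnotationInput {
  copyright?: string;
  availableIn?: string[];
  excludedFrom?: string[];
  sources: { provider: string; url: string }[];
  notes: string[];
}

// Builds the annotation text line by line, skipping empty sections.
function buildAnnotation(input: AnnotationInput): string {
  const lines: string[] = [];
  if (input.copyright) lines.push(`Copyright: ${input.copyright}`);
  if (input.availableIn && input.availableIn.length > 0) {
    lines.push(`Available in: ${input.availableIn.join(', ')}`);
  }
  if (input.excludedFrom && input.excludedFrom.length > 0) {
    lines.push(`Excluded from: ${input.excludedFrom.join(', ')}`);
  }
  if (input.sources.length > 0) {
    lines.push('Sources:');
    for (const s of input.sources) lines.push(`- ${s.provider}: ${s.url}`);
  }
  if (input.notes.length > 0) {
    lines.push('Notes:');
    for (const n of input.notes) lines.push(`- ${n}`);
  }
  return lines.join('\n');
}
```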
Provider Architecture
Base Class Hierarchy
MetadataProvider (abstract)
├── MetadataApiProvider (OAuth2 support)
│ ├── SpotifyProvider
│ └── TidalProvider
├── ReleaseLookup (GTIN/URL/ID support)
│ ├── DeezerProvider
│ ├── iTunesProvider
│ ├── BandcampProvider
│ ├── BeatportProvider
│ ├── MoraProvider
│ └── OtotoyProvider
└── ReleaseApiLookup (multi-region support)
├── iTunesProvider
└── DeezerProvider
MetadataProvider (Abstract Base)
Location: providers/base.ts
Core Responsibilities:
- URL pattern matching via URLPattern
- Rate limiting with configurable delays
- HTTP response caching via snap_storage
- Error handling and retry logic
- Feature quality ratings
Key Methods:
abstract class MetadataProvider {
// URL pattern matching
abstract urlPattern: URLPattern;
matchesUrl(url: string): boolean;
// Lookup methods
abstract lookupByUrl(url: string): Promise<Release>;
abstract lookupByGtin(gtin: string, region?: string): Promise<Release>;
// Harmonization
abstract harmonize(release: Release): HarmonyRelease;
// Rate limiting
protected rateLimit: RateLimiter;
protected async throttle(): Promise<void>;
// Caching
protected cache: SnapStorage;
protected async getCached(key: string): Promise<Response | null>;
protected async setCached(key: string, response: Response): Promise<void>;
// Feature quality
abstract featureQuality: FeatureQualityMap;
}
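The RateLimiter type is referenced above but not shown. A minimal interval-based limiter, which is only one of several ways "configurable delays" could be implemented, might look like this. The class name and its reserve method are assumptions; the clock is passed in so the logic can be exercised without real timers.

```typescript
// Enforces a minimum interval between requests to one provider.
class IntervalRateLimiter {
  private readonly minIntervalMs: number;
  private nextAllowed = 0;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Reserves the next free slot and returns how many milliseconds
  // the caller should wait before sending its request.
  reserve(now: number): number {
    const startAt = Math.max(now, this.nextAllowed);
    this.nextAllowed = startAt + this.minIntervalMs;
    return startAt - now;
  }
}
```

A throttle() method built on this would call reserve(Date.now()) and, when the result is positive, await a timer for that many milliseconds before issuing the HTTP request.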
MetadataApiProvider (OAuth2)
Location: providers/api_base.ts
Additional Responsibilities:
- OAuth2 token acquisition and refresh
- Token caching in localStorage
- Automatic token renewal
- API client configuration
OAuth2 Flow:
abstract class MetadataApiProvider extends MetadataProvider {
  protected async getAccessToken(): Promise<string> {
    // 1. Check cache (tokens are stored as JSON strings, so parse before use)
    const cached = localStorage.getItem(`${this.name}_token`);
    if (cached) {
      const cachedToken: OAuth2Token = JSON.parse(cached);
      if (!this.isTokenExpired(cachedToken)) {
        return cachedToken.access_token;
      }
    }
    // 2. Request new token
    const token = await this.requestToken();
    // 3. Cache token
    localStorage.setItem(`${this.name}_token`, JSON.stringify(token));
    return token.access_token;
  }
  // Each provider implements its own token request
  protected abstract requestToken(): Promise<OAuth2Token>;
}
ReleaseLookup
Location: providers/release_lookup.ts
Lookup Methods:
interface ReleaseLookup {
lookupByUrl(url: string): Promise<Release>;
lookupByGtin(gtin: string): Promise<Release>;
lookupById(id: string): Promise<Release>;
}
ReleaseApiLookup (Multi-Region)
Location: providers/release_api_lookup.ts
Region Handling:
abstract class ReleaseApiLookup extends ReleaseLookup {
  protected supportedRegions: string[] = []; // e.g. ['US', 'GB', 'JP', ...]
  async lookupByGtin(gtin: string, regions: string[]): Promise<Release[]> {
    const lookups = regions
      .filter(r => this.supportedRegions.includes(r))
      .map(r => this.lookupInRegion(gtin, r));
    const results = await Promise.allSettled(lookups);
    // Keep successful lookups only; the type guard narrows to fulfilled results
    return results
      .filter((r): r is PromiseFulfilledResult<Release> => r.status === 'fulfilled')
      .map(r => r.value);
  }
  protected abstract lookupInRegion(gtin: string, region: string): Promise<Release>;
}
Provider Registry
Location: providers/registry.ts
Manages provider instantiation and categorization.
Registry Structure:
class ProviderRegistry {
private providers: Map<string, MetadataProvider>;
private categories: Map<string, string[]>; // category -> provider names
register(provider: MetadataProvider, category: string): void;
get(name: string): MetadataProvider | undefined;
getByCategory(category: string): MetadataProvider[];
getByUrl(url: string): MetadataProvider | undefined;
getByGtin(): MetadataProvider[]; // All GTIN-supporting providers
}
Categories:
- default: Commonly used providers (Spotify, Deezer, iTunes)
- preferred: High-quality providers (Spotify, Tidal, MusicBrainz)
- all: All registered providers
- japan: Japan-specific providers (Mora, Ototoy)
- electronic: Electronic music specialists (Beatport)
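A stripped-down registry along these lines is sketched below. It departs from the signature shown above in one respect: register accepts a list of categories, since a provider such as Spotify appears in more than one. The RegisteredProvider interface and the class name are assumptions for the example.

```typescript
interface RegisteredProvider {
  name: string;
  matchesUrl(url: string): boolean;
}

class SimpleProviderRegistry {
  private providers = new Map<string, RegisteredProvider>();
  private categories = new Map<string, string[]>(); // category -> provider names

  register(provider: RegisteredProvider, categories: string[]): void {
    this.providers.set(provider.name, provider);
    for (const category of categories) {
      const names = this.categories.get(category) ?? [];
      names.push(provider.name);
      this.categories.set(category, names);
    }
  }

  getByCategory(category: string): RegisteredProvider[] {
    // 'all' is implicit: every registered provider belongs to it.
    if (category === 'all') return Array.from(this.providers.values());
    return (this.categories.get(category) ?? [])
      .map(name => this.providers.get(name))
      .filter((p): p is RegisteredProvider => p !== undefined);
  }

  getByUrl(url: string): RegisteredProvider | undefined {
    return Array.from(this.providers.values()).find(p => p.matchesUrl(url));
  }
}
```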
Feature Quality Ratings
Each provider declares quality ratings for supported features:
interface FeatureQualityMap {
gtin: FeatureQuality;
title: FeatureQuality;
artists: FeatureQuality;
releaseDate: FeatureQuality;
labels: FeatureQuality;
media: FeatureQuality;
tracks: FeatureQuality;
isrc: FeatureQuality;
images: FeatureQuality | number; // Number = max dimension
copyright: FeatureQuality;
availability: FeatureQuality;
}
enum FeatureQuality {
MISSING = 0,
BAD = 1,
PRESENT = 2,
GOOD = 3,
}
Example (Spotify):
featureQuality = {
gtin: FeatureQuality.GOOD,
title: FeatureQuality.GOOD,
artists: FeatureQuality.GOOD,
releaseDate: FeatureQuality.GOOD,
labels: FeatureQuality.PRESENT,
media: FeatureQuality.GOOD,
tracks: FeatureQuality.GOOD,
isrc: FeatureQuality.GOOD,
images: 2000, // Max 2000px
copyright: FeatureQuality.PRESENT,
availability: FeatureQuality.GOOD,
};
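One way such ratings could feed into merging is to pick, per feature, the provider with the highest rating. The sketch below is an assumption about how the ratings might be used, not a description of Harmony's merge code; a plain object stands in for the FeatureQuality enum, and the sample ratings in the usage note are invented.

```typescript
// Plain-object stand-in for the FeatureQuality enum shown above.
const Quality = { MISSING: 0, BAD: 1, PRESENT: 2, GOOD: 3 } as const;

type QualityRatings = Record<string, number>; // feature -> rating

// Returns the provider with the highest rating for a feature, or undefined
// when no provider rates it above MISSING. Ties go to the first provider listed.
function bestProviderFor(
  feature: string,
  ratings: Record<string, QualityRatings>,
): string | undefined {
  let best: string | undefined;
  let bestScore: number = Quality.MISSING;
  for (const provider of Object.keys(ratings)) {
    const providerRatings = ratings[provider];
    const score = providerRatings?.[feature] ?? Quality.MISSING;
    if (score > bestScore) {
      best = provider;
      bestScore = score;
    }
  }
  return best;
}
```

Because images may be rated either with a quality level or with a maximum pixel dimension, a plain numeric comparison only makes sense when all providers use the same convention for that feature; with dimensions on both sides, the provider offering larger artwork wins.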
Server Architecture (Fresh Framework)
Fresh Islands Architecture
Fresh uses a hybrid rendering model:
- Server-side rendering (SSR): Default for all components
- Islands: Client-side interactive components
Benefits:
- Minimal JavaScript shipped to client
- Fast initial page load
- Progressive enhancement
- SEO-friendly
Route Structure
Location: routes/ directory
| Route File | URL | Purpose |
|---|---|---|
| index.tsx | / | Landing page |
| release.tsx | /release | Main lookup interface |
| release/actions.tsx | /release/actions | ISRC/cover submission |
| about.tsx | /about | Provider documentation |
| settings.tsx | /settings | User preferences |
Components
Location: components/ directory
22 Static Components (server-rendered):
- Layout components (Header, Footer, Navigation)
- Display components (ReleaseInfo, TrackList, ArtistCredit)
- Comparison components (ProviderTable, FeatureMatrix)
- Form components (LookupForm, SeederForm)
5 Interactive Islands (client-side):
- LookupForm.tsx: Dynamic form with validation
- ProviderSelector.tsx: Provider category filtering
- RegionSelector.tsx: Multi-region selection
- PermalinkGenerator.tsx: Timestamp-based permalink creation
- SeederForm.tsx: MusicBrainz import form with copy-to-clipboard
Request Flow
1. Browser Request
↓
2. Fresh Router (routes/release.tsx)
↓
3. CombinedReleaseLookup (parallel provider queries)
↓
4. Provider Harmonization (convert to HarmonyRelease)
↓
5. Merge Algorithm (combine releases)
↓
6. Server-Side Rendering (generate HTML)
↓
7. Island Hydration (activate interactive components)
↓
8. Browser Response
Data Flow Diagram
┌─────────────────────────────────────────────────────────────┐
│ User Input │
│ GTIN: 0602537347377 URLs: [spotify, deezer] Region: US │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ CombinedReleaseLookup │
│ - Parse input │
│ - Select providers (Spotify, Deezer) │
│ - Execute parallel lookups │
└────────────────────────┬────────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Spotify │ │ Deezer │ │ iTunes │
│ Provider │ │ Provider │ │ Provider │
│ │ │ │ │ │
│ - API call │ │ - API call │ │ - API call │
│ - Cache │ │ - Cache │ │ - Cache │
│ - Parse │ │ - Parse │ │ - Parse │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Harmonize │ │ Harmonize │ │ Harmonize │
│ (Spotify) │ │ (Deezer) │ │ (iTunes) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
└────────────────┼────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ Merge Algorithm │
│ Phase 1: Collect property values from all releases │
│ Phase 2: Check compatibility │
│ Phase 3: Select best value per property │
└────────────────────────┬────────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ MergedHarmonyRelease │
│ - Unified metadata │
│ - Source map (property -> provider) │
│ - Incompatibility warnings │
└────────────────────────┬────────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Web UI Display │ │ MusicBrainz │
│ - Comparison │ │ Seeding │
│ - Warnings │ │ - Convert │
│ - Permalink │ │ - Edit note │
└─────────────────┘ │ - Annotation │
└─────────────────┘
Summary
Harmony's architecture demonstrates:
- Clear separation of concerns: 4-stage pipeline with distinct responsibilities
- Provider abstraction: Base classes handle common functionality (caching, rate limiting, OAuth2)
- Type safety: 273-line HarmonyRelease schema ensures data consistency
- Intelligent merging: 3-phase algorithm with compatibility checking and provider preferences
- Graceful degradation: Promise.allSettled ensures partial results on provider failures
- MusicBrainz integration: Seamless conversion to MB format with MBID resolution
- Modern web stack: Fresh framework with SSR and islands for optimal performance
Taken together, these properties make the architecture a useful reference for building metadata aggregation systems.