# Harmony - Architecture Analysis

## System Architecture Overview

Harmony implements a **4-stage pipeline architecture** for metadata aggregation and harmonization:

```
┌──────────┐     ┌────────────┐     ┌───────┐     ┌──────┐
│  LOOKUP  │ --> │ HARMONIZE  │ --> │ MERGE │ --> │ SEED │
└──────────┘     └────────────┘     └───────┘     └──────┘
     │                 │                │            │
  Parallel          Provider         3-phase    MusicBrainz
Multi-source       Conversion         Merge        Format
  Queries          to Harmony       Algorithm    Conversion
```

Each stage has distinct responsibilities and operates on well-defined data structures.

## Stage 1: LOOKUP

### CombinedReleaseLookup

The entry point for all metadata retrieval operations.

**Location**: `harmonizer/combined_lookup.ts`

**Responsibilities**:
- Accepts GTIN, URLs, or provider-specific IDs
- Determines which providers to query based on input
- Executes provider lookups in parallel
- Handles provider failures gracefully via `Promise.allSettled`
- Returns array of provider-specific release objects

**Input Types**:
```typescript
interface LookupInput {
  gtin?: string;                        // Global Trade Item Number (barcode)
  urls?: string[];                      // Provider URLs
  region?: string[];                    // Market regions (e.g., ['GB', 'US', 'JP'])
  category?: string;                    // Provider category filter
  providerIds?: Record<string, string>; // Provider-specific IDs
}
```

**Parallel Execution**:
```typescript
// Conceptual flow: allSettled never rejects, so one failing provider
// cannot abort the other lookups
const lookupPromises = providers.map(provider => provider.lookup(input));
const results = await Promise.allSettled(lookupPromises);
```

**Output**: Array of provider-native release objects (Spotify, Deezer, iTunes formats, etc.)

### Provider Selection Logic

1. **URL-based**: Extract provider from URL pattern matching
2. **GTIN-based**: Query all providers supporting GTIN lookup
3. **Category filtering**: Apply user preferences (all/default/preferred)
4. **Region filtering**: Pass region codes to region-aware providers

## Stage 2: HARMONIZE

### Provider Conversion

Each provider implements a `harmonize()` method that converts its native format to `HarmonyRelease`.

**Location**: Individual provider files in `providers/`

**Conversion Responsibilities**:
- Map provider-specific field names to Harmony schema
- Normalize data types (dates, durations, ISRCs)
- Extract nested structures (artists, labels, media)
- Detect language and script from metadata
- Resolve release types (album, single, EP, etc.)
- Extract external links and identifiers

**Example Provider Conversion** (conceptual):
```typescript
class SpotifyProvider extends MetadataApiProvider {
  harmonize(spotifyAlbum: SpotifyAlbum): HarmonyRelease {
    return {
      title: spotifyAlbum.name,
      artists: this.convertArtists(spotifyAlbum.artists),
      gtin: spotifyAlbum.external_ids?.upc,
      media: this.convertTracks(spotifyAlbum.tracks),
      releaseDate: this.parseDate(spotifyAlbum.release_date),
      images: this.convertImages(spotifyAlbum.images),
      externalLinks: [{
        url: spotifyAlbum.external_urls.spotify,
        types: ['streaming']
      }],
      // ... additional fields
    };
  }
}
```

### HarmonyRelease Schema

**Location**: `harmonizer/types.ts` (273 lines)

**Core Structure**:
```typescript
interface HarmonyRelease {
  // Basic metadata
  title: string;
  artists: ArtistCreditName[];
  gtin?: string;

  // Media and tracks
  media: HarmonyMedium[];

  // Release details
  language?: string;
  script?: string;
  status?: ReleaseStatus;
  types: ReleaseType[];
  releaseDate?: PartialDate;

  // Commercial info
  labels: Label[];
  packaging?: PackagingType;
  copyright?: string;

  // Distribution
  availableIn?: string[];   // Country codes
  excludedFrom?: string[];  // Country codes

  // Visual assets
  images: Image[];

  // Links and identifiers
  externalLinks: ExternalLink[];

  // Metadata about metadata
  info: {
    providers: string[];      // Which providers contributed
    messages: Message[];      // Warnings, errors
    sourceMap?: SourceMap;    // Property -> provider mapping
    incompatibleData?: IncompatibilityInfo;
  };
}
```

**Key Sub-structures**:

#### ArtistCreditName
```typescript
interface ArtistCreditName {
  name: string;           // Display name
  creditedName?: string;  // Alternative credit
  joinPhrase?: string;    // Separator (e.g., " & ", " feat. ")
  mbid?: string;          // MusicBrainz ID
}
```

#### HarmonyMedium
```typescript
interface HarmonyMedium {
  title?: string;
  format?: MediumFormat;  // CD, Vinyl, Digital, etc.
  position: number;
  tracks: HarmonyTrack[];
}
```

#### HarmonyTrack
```typescript
interface HarmonyTrack {
  title: string;
  artists?: ArtistCreditName[];
  position: number;
  length?: number;  // Duration in milliseconds
  isrc?: string;    // International Standard Recording Code
}
```

#### Label
```typescript
interface Label {
  name: string;
  catalogNumber?: string;
  mbid?: string;
}
```

#### Image
```typescript
interface Image {
  url: string;
  types: ImageType[];  // 'front', 'back', 'medium', etc.
  width?: number;
  height?: number;
  comment?: string;
}
```

### Harmonizer Modules

**Location**: `harmonizer/` directory

| Module | Purpose | Lines |
|--------|---------|-------|
| `types.ts` | HarmonyRelease schema and type definitions | 273 |
| `merge.ts` | 3-phase merge algorithm | ~200 |
| `compatibility.ts` | Conflict detection and resolution | ~150 |
| `deduplicate.ts` | Remove duplicate entries | ~100 |
| `isrc.ts` | ISRC validation and normalization | ~50 |
| `language_script.ts` | Auto-detect language and script | ~100 |
| `release_label.ts` | Label normalization | ~80 |
| `release_types.ts` | Release type inference | ~120 |
| `tracklist_gap.ts` | Detect missing tracks | ~60 |

## Stage 3: MERGE

### 3-Phase Merge Algorithm

**Location**: `harmonizer/merge.ts`

The merge algorithm combines multiple `HarmonyRelease` objects into a single `MergedHarmonyRelease` using provider preferences and compatibility checking.

#### Phase 1: Property Collection

Collect all values for each property across all releases:

```typescript
// Conceptual
const propertyValues = {
  title: ['Album Title', 'Album Title (Deluxe)', 'Album Title'],
  gtin: ['0602537347377', '0602537347377'],
  releaseDate: ['2014-11-24', '2014-11-24', '2014-11-25'],
  // ... all properties
};
```

#### Phase 2: Compatibility Checking

For each property, check whether the collected values are compatible:

```typescript
interface CompatibilityCheck {
  compatible: boolean;
  canonicalValue?: any;
  conflicts?: ConflictInfo[];
}
```

**Compatibility Rules**:
- **Strings**: Case-insensitive comparison, whitespace normalization
- **Dates**: Partial date matching (year-only vs. full date)
- **Arrays**: Set comparison (order-independent)
- **Numbers**: Exact match or within tolerance
- **Objects**: Recursive field comparison

**Example Compatibility**:
```typescript
// Compatible
'2014-11-24' ≈ '2014-11'       // Partial date match
'Album Title' ≈ 'album title'  // Case-insensitive

// Incompatible
'2014-11-24' ≠ '2014-11-25'    // Date conflict
'Album' ≠ 'EP'                 // Type conflict
```

#### Phase 3: Value Selection

For each property, select the best value using provider preferences:

**Provider Preference Order** (configurable):
1. MusicBrainz (template/reference)
2. Spotify (high-quality, comprehensive)
3. Tidal (high-quality audio metadata)
4. Deezer (good coverage)
5. iTunes (region-specific)
6. Bandcamp (artist-verified)
7. Beatport (electronic music specialist)
8. Mora (Japan specialist)
9. Ototoy (Japan specialist)

**Selection Logic**:
```typescript
function selectBestValue(values: PropertyValues, preferences: string[]): any {
  // 1. Filter to compatible values only
  const compatible = values.filter(v => v.isCompatible);

  // 2. If no compatible values, mark as conflict
  if (compatible.length === 0) {
    return { conflict: true, values };
  }

  // 3. Select from highest-preference provider
  for (const provider of preferences) {
    const value = compatible.find(v => v.provider === provider);
    if (value) return value.data;
  }

  // 4. Fallback to first compatible value
  return compatible[0].data;
}
```

### MergedHarmonyRelease

Extends `HarmonyRelease` with merge metadata:

```typescript
interface MergedHarmonyRelease extends HarmonyRelease {
  sourceMap: SourceMap;  // Property -> provider mapping
  incompatibleData?: IncompatibilityInfo;
}

interface SourceMap {
  [propertyPath: string]: string;  // e.g., "title" -> "spotify"
}

interface IncompatibilityInfo {
  conflicts: Conflict[];
  warnings: string[];
}

interface Conflict {
  property: string;
  values: Array<{
    provider: string;
    value: any;
  }>;
}
```

### Deduplication

**Location**: `harmonizer/deduplicate.ts`

Removes duplicate entries in arrays:
- **Artists**: Match by name (case-insensitive) or MBID
- **Labels**: Match by name and catalog number
- **Tracks**: Match by position and title
- **Images**: Match by URL or dimensions
- **External links**: Match by URL

### Compatibility Checking

**Location**: `harmonizer/compatibility.ts`

Detects and reports incompatible data:

**Incompatibility Types**:
1. **Value conflicts**: Different values for same property
2. **Type conflicts**: Different data types
3. **Structural conflicts**: Different array lengths, missing required fields
4. **Semantic conflicts**: Logically incompatible values (e.g., release date before artist birth)

**Handling**:
- **Strict mode**: Reject merge if any conflicts
- **Lenient mode**: Prefer highest-quality provider, log warnings
- **User override**: Allow manual conflict resolution

## Stage 4: SEED

### MusicBrainz Seeding

**Location**: `musicbrainz/seeding.ts`

Converts `MergedHarmonyRelease` to MusicBrainz import format.

**Conversion Steps**:
1. Map HarmonyRelease fields to MusicBrainz schema
2. Generate edit notes with provider URLs
3. Create permalink for reproducibility
4. Build annotation with extra data (copyright, availability)
5. Format for MusicBrainz seeder form

**MusicBrainz Mapping**:

| Harmony Field | MusicBrainz Field | Notes |
|---------------|-------------------|-------|
| `title` | Release name | Direct mapping |
| `artists` | Artist credit | Join with `joinPhrase` |
| `gtin` | Barcode | Validate format |
| `releaseDate` | Release events | Per-country events |
| `labels` | Release labels | With catalog numbers |
| `media` | Mediums | With format and tracks |
| `types` | Release group types | Primary + secondary |
| `language` | Language | ISO 639-3 code |
| `script` | Script | ISO 15924 code |
| `packaging` | Packaging | Jewel case, digipak, etc. |

**Edit Note Generation**:
```typescript
function generateEditNote(release: MergedHarmonyRelease, permalink: string): string {
  const sources = release.info.providers.join(', ');
  return `
Imported from ${sources} via Harmony
Permalink: ${permalink}

${release.externalLinks.map(link => link.url).join('\n')}
  `.trim();
}
```

### MBID Resolution

**Location**: `musicbrainz/mbid_mapping.ts`

Resolves external URLs to MusicBrainz IDs (MBIDs).


**Batch Lookup**:
- Collects up to 100 URLs
- Single MusicBrainz API request: `GET /ws/2/url?resource={url1}&resource={url2}&...`
- Caches results in localStorage (dev) or sessionStorage (prod)
- Returns MBID mappings

**Duplicate Detection**:
- Checks if release already exists in MusicBrainz
- Warns user before creating duplicate
- Provides link to existing release

**Cache Strategy**:
```typescript
interface MBIDCache {
  [externalUrl: string]: {
    mbid: string;
    type: 'release' | 'release-group' | 'recording' | 'artist';
    cached: number;  // Timestamp
  };
}
```

### Annotation Builder

**Location**: `musicbrainz/annotation.ts`

Generates MusicBrainz annotation text for additional metadata:

**Included Data**:
- Copyright information
- Availability/exclusion regions
- Provider-specific notes
- Compatibility warnings
- Image URLs (if not added as cover art)

**Format**:
```
Copyright: © 2014 Record Label
Available in: US, GB, DE, JP
Excluded from: CN

Sources:
- Spotify: https://open.spotify.com/album/xyz
- Deezer: https://www.deezer.com/album/123

Notes:
- Release date conflict: Spotify (2014-11-24) vs iTunes (2014-11-25)
```

## Provider Architecture

### Base Class Hierarchy

```
MetadataProvider (abstract)
├── MetadataApiProvider (OAuth2 support)
│   ├── SpotifyProvider
│   └── TidalProvider
├── ReleaseLookup (GTIN/URL/ID support)
│   ├── DeezerProvider
│   ├── iTunesProvider
│   ├── BandcampProvider
│   ├── BeatportProvider
│   ├── MoraProvider
│   └── OtotoyProvider
└── ReleaseApiLookup (multi-region support)
    ├── iTunesProvider
    └── DeezerProvider
```

### MetadataProvider (Abstract Base)

**Location**: `providers/base.ts`

**Core Responsibilities**:
- URL pattern matching via `URLPattern`
- Rate limiting with configurable delays
- HTTP response caching via `snap_storage`
- Error handling and retry logic
- Feature quality ratings

**Key Methods**:
```typescript
// Signatures only; method bodies are omitted and return types are indicative
abstract class MetadataProvider {
  // URL pattern matching
  abstract urlPattern: URLPattern;
  matchesUrl(url: string): boolean;

  // Lookup methods
  abstract lookupByUrl(url: string): Promise<Release>;
  abstract lookupByGtin(gtin: string, region?: string): Promise<Release>;

  // Harmonization
  abstract harmonize(release: Release): HarmonyRelease;

  // Rate limiting
  protected rateLimit: RateLimiter;
  protected async throttle(): Promise<void>;

  // Caching
  protected cache: SnapStorage;
  protected async getCached(key: string): Promise<Response | undefined>;
  protected async setCached(key: string, response: Response): Promise<void>;

  // Feature quality
  abstract featureQuality: FeatureQualityMap;
}
```

### MetadataApiProvider (OAuth2)

**Location**: `providers/api_base.ts`

**Additional Responsibilities**:
- OAuth2 token acquisition and refresh
- Token caching in localStorage
- Automatic token renewal
- API client configuration

**OAuth2 Flow**:
```typescript
abstract class MetadataApiProvider extends MetadataProvider {
  protected async getAccessToken(): Promise<string> {
    // 1. Check cache (localStorage stores strings, so parse the cached JSON)
    const raw = localStorage.getItem(`${this.name}_token`);
    const cached = raw ? JSON.parse(raw) : null;
    if (cached && !this.isTokenExpired(cached)) {
      return cached.access_token;
    }

    // 2. Request new token
    const token = await this.requestToken();

    // 3. Cache token
    localStorage.setItem(`${this.name}_token`, JSON.stringify(token));
    return token.access_token;
  }

  // Token shape is indicative; only access_token is used above
  protected abstract requestToken(): Promise<{ access_token: string }>;
}
```

### ReleaseLookup

**Location**: `providers/release_lookup.ts`

**Lookup Methods**:
```typescript
// Return types are indicative
interface ReleaseLookup {
  lookupByUrl(url: string): Promise<Release>;
  lookupByGtin(gtin: string): Promise<Release>;
  lookupById(id: string): Promise<Release>;
}
```

### ReleaseApiLookup (Multi-Region)

**Location**: `providers/release_api_lookup.ts`

**Region Handling**:
```typescript
abstract class ReleaseApiLookup extends ReleaseLookup {
  protected supportedRegions: string[];  // ['US', 'GB', 'JP', ...]

  async lookupByGtin(gtin: string, regions: string[]): Promise<Release[]> {
    const lookups = regions
      .filter(r => this.supportedRegions.includes(r))
      .map(r => this.lookupInRegion(gtin, r));

    // Keep only the regions whose lookup succeeded
    const results = await Promise.allSettled(lookups);
    return results
      .filter((r): r is PromiseFulfilledResult<Release> => r.status === 'fulfilled')
      .map(r => r.value);
  }

  protected abstract lookupInRegion(gtin: string, region: string): Promise<Release>;
}
```

### Provider Registry

**Location**: `providers/registry.ts`

Manages provider instantiation and categorization.

**Registry Structure**:
```typescript
class ProviderRegistry {
  private providers: Map<string, MetadataProvider>;
  private categories: Map<string, string[]>;  // category -> provider names

  register(provider: MetadataProvider, category: string): void;
  get(name: string): MetadataProvider | undefined;
  getByCategory(category: string): MetadataProvider[];
  getByUrl(url: string): MetadataProvider | undefined;
  getByGtin(): MetadataProvider[];  // All GTIN-supporting providers
}
```

**Categories**:
- `default`: Commonly used providers (Spotify, Deezer, iTunes)
- `preferred`: High-quality providers (Spotify, Tidal, MusicBrainz)
- `all`: All registered providers
- `japan`: Japan-specific providers (Mora, Ototoy)
- `electronic`: Electronic music specialists (Beatport)

### Feature Quality Ratings

Each provider declares quality ratings for supported features:

```typescript
interface FeatureQualityMap {
  gtin: FeatureQuality;
  title: FeatureQuality;
  artists: FeatureQuality;
  releaseDate: FeatureQuality;
  labels: FeatureQuality;
  media: FeatureQuality;
  tracks: FeatureQuality;
  isrc: FeatureQuality;
  images: FeatureQuality | number;  // Number = max dimension
  copyright: FeatureQuality;
  availability: FeatureQuality;
}

enum FeatureQuality {
  MISSING = 0,
  BAD = 1,
  PRESENT = 2,
  GOOD = 3,
}
```

**Example** (Spotify):
```typescript
featureQuality = {
  gtin: FeatureQuality.GOOD,
  title: FeatureQuality.GOOD,
  artists: FeatureQuality.GOOD,
  releaseDate: FeatureQuality.GOOD,
  labels: FeatureQuality.PRESENT,
  media: FeatureQuality.GOOD,
  tracks: FeatureQuality.GOOD,
  isrc: FeatureQuality.GOOD,
  images: 2000,  // Max 2000px
  copyright: FeatureQuality.PRESENT,
  availability: FeatureQuality.GOOD,
};
```

## Server Architecture (Fresh Framework)

### Fresh Islands Architecture

Fresh uses a hybrid rendering model:
- **Server-side rendering (SSR)**: Default for all components
- **Islands**: Client-side interactive components

**Benefits**:
- Minimal JavaScript shipped to client
- Fast initial page load
- Progressive enhancement
- SEO-friendly

### Route Structure

**Location**: `routes/` directory

| Route File | URL | Purpose |
|------------|-----|---------|
| `index.tsx` | `/` | Landing page |
| `release.tsx` | `/release` | Main lookup interface |
| `release/actions.tsx` | `/release/actions` | ISRC/cover submission |
| `about.tsx` | `/about` | Provider documentation |
| `settings.tsx` | `/settings` | User preferences |

### Components

**Location**: `components/` directory

**22 Static Components** (server-rendered):
- Layout components (Header, Footer, Navigation)
- Display components (ReleaseInfo, TrackList, ArtistCredit)
- Comparison components (ProviderTable, FeatureMatrix)
- Form components (LookupForm, SeederForm)

**5 Interactive Islands** (client-side):
- `LookupForm.tsx`: Dynamic form with validation
- `ProviderSelector.tsx`: Provider category filtering
- `RegionSelector.tsx`: Multi-region selection
- `PermalinkGenerator.tsx`: Timestamp-based permalink creation
- `SeederForm.tsx`: MusicBrainz import form with copy-to-clipboard

### Request Flow

```
1. Browser Request
   ↓
2. Fresh Router (routes/release.tsx)
   ↓
3. CombinedReleaseLookup (parallel provider queries)
   ↓
4. Provider Harmonization (convert to HarmonyRelease)
   ↓
5. Merge Algorithm (combine releases)
   ↓
6. Server-Side Rendering (generate HTML)
   ↓
7. Island Hydration (activate interactive components)
   ↓
8. Browser Response
```

## Data Flow Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                         User Input                          │
│ GTIN: 0602537347377   URLs: [spotify, deezer]   Region: US  │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                    CombinedReleaseLookup                    │
│  - Parse input                                              │
│  - Select providers (Spotify, Deezer)                       │
│  - Execute parallel lookups                                 │
└────────────────────────┬────────────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
  │   Spotify   │ │   Deezer    │ │   iTunes    │
  │  Provider   │ │  Provider   │ │  Provider   │
  │             │ │             │ │             │
  │ - API call  │ │ - API call  │ │ - API call  │
  │ - Cache     │ │ - Cache     │ │ - Cache     │
  │ - Parse     │ │ - Parse     │ │ - Parse     │
  └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
         │               │               │
         ▼               ▼               ▼
  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
  │  Harmonize  │ │  Harmonize  │ │  Harmonize  │
  │  (Spotify)  │ │  (Deezer)   │ │  (iTunes)   │
  └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
         │               │               │
         └───────────────┼───────────────┘
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                       Merge Algorithm                       │
│  Phase 1: Collect property values from all releases         │
│  Phase 2: Check compatibility                               │
│  Phase 3: Select best value per property                    │
└────────────────────────┬────────────────────────────────────┘
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                    MergedHarmonyRelease                     │
│  - Unified metadata                                         │
│  - Source map (property -> provider)                        │
│  - Incompatibility warnings                                 │
└────────────────────────┬────────────────────────────────────┘
                         │
         ┌───────────────┴───────────────┐
         ▼                               ▼
┌─────────────────┐             ┌─────────────────┐
│ Web UI Display  │             │   MusicBrainz   │
│  - Comparison   │             │     Seeding     │
│  - Warnings     │             │  - Convert      │
│  - Permalink    │             │  - Edit note    │
└─────────────────┘             │  - Annotation   │
                                └─────────────────┘
```

## Summary

Harmony's architecture demonstrates:

1. **Clear separation of concerns**: 4-stage pipeline with distinct responsibilities
2. **Provider abstraction**: Base classes handle common functionality (caching, rate limiting, OAuth2)
3. **Type safety**: The `HarmonyRelease` schema in the 273-line `types.ts` gives every provider the same target structure
4. **Intelligent merging**: 3-phase algorithm with compatibility checking and provider preferences
5. **Graceful degradation**: `Promise.allSettled` ensures partial results on provider failures
6. **MusicBrainz integration**: Conversion to the MusicBrainz seeding format with MBID resolution
7. **Modern web stack**: Fresh framework with SSR and islands, which keeps client-side JavaScript small

Together, these choices make Harmony a useful reference for building metadata aggregation systems.
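
The graceful-degradation point can be illustrated with a small sketch. It assumes each provider lookup has already been settled (for example by `Promise.allSettled`) and shows one way the outcomes might be split into usable releases and warning messages. The `Settled` type, `partitionSettled`, and the message format are hypothetical names for illustration, not Harmony's actual code.

```typescript
// Minimal stand-in for the result shape produced by Promise.allSettled.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

interface PartialResult<T> {
  releases: T[];      // Results from providers that succeeded
  messages: string[]; // One warning per failed provider
}

// Hypothetical helper: keeps successful lookups and turns failures into
// warnings. providerNames[i] is assumed to name the provider behind settled[i].
function partitionSettled<T>(
  providerNames: string[],
  settled: Settled<T>[]
): PartialResult<T> {
  const result: PartialResult<T> = { releases: [], messages: [] };
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      result.releases.push(outcome.value);
    } else {
      const reason = outcome.reason instanceof Error
        ? outcome.reason.message
        : String(outcome.reason);
      result.messages.push(`${providerNames[i]}: lookup failed (${reason})`);
    }
  });
  return result;
}

// Usage with hand-written outcomes: one provider fails, two still contribute.
const partial = partitionSettled<{ title: string }>(
  ["spotify", "deezer", "itunes"],
  [
    { status: "fulfilled", value: { title: "Album Title" } },
    { status: "rejected", reason: new Error("HTTP 503") },
    { status: "fulfilled", value: { title: "Album Title" } },
  ]
);
```

In a real lookup the second argument would be the array returned by awaiting `Promise.allSettled` over the provider promises, and the collected messages could feed a field such as `info.messages` on the merged release.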