Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00


Harmony - Codebase and Implementation Analysis

Project Structure

harmony/
├── cli.ts                      # CLI entry point
├── config.ts                   # Configuration management (36 lines)
├── deno.json                   # Deno configuration and tasks
├── deno.lock                   # Dependency lock file
├── .env.example                # Environment variable template
├── .github/
│   └── workflows/
│       └── deno.yml            # CI/CD pipeline
├── components/                 # UI components (22 static)
│   ├── Header.tsx
│   ├── Footer.tsx
│   ├── ReleaseInfo.tsx
│   ├── TrackList.tsx
│   ├── ProviderTable.tsx
│   └── ...
├── islands/                    # Interactive components (5 islands)
│   ├── LookupForm.tsx
│   ├── ProviderSelector.tsx
│   ├── RegionSelector.tsx
│   ├── PermalinkGenerator.tsx
│   └── SeederForm.tsx
├── routes/                     # Fresh routes
│   ├── index.tsx               # Landing page
│   ├── release.tsx             # Main lookup interface
│   ├── about.tsx               # Provider documentation
│   ├── settings.tsx            # User preferences
│   └── release/
│       └── actions.tsx         # ISRC/cover submission
├── static/                     # Static assets
│   ├── styles.css
│   └── favicon.ico
├── server/                     # Server entry points
│   ├── main.ts                 # Production server
│   └── dev.ts                  # Development server
├── providers/                  # Provider implementations
│   ├── base.ts                 # MetadataProvider abstract class
│   ├── api_base.ts             # MetadataApiProvider (OAuth2)
│   ├── release_lookup.ts       # ReleaseLookup interface
│   ├── release_api_lookup.ts   # ReleaseApiLookup (multi-region)
│   ├── registry.ts             # ProviderRegistry
│   ├── spotify.ts              # Spotify provider
│   ├── deezer.ts               # Deezer provider
│   ├── itunes.ts               # iTunes provider
│   ├── tidal.ts                # Tidal provider
│   ├── musicbrainz.ts          # MusicBrainz provider
│   ├── bandcamp.ts             # Bandcamp provider
│   ├── beatport.ts             # Beatport provider
│   ├── mora.ts                 # Mora provider
│   └── ototoy.ts               # Ototoy provider
├── harmonizer/                 # Harmonization modules
│   ├── types.ts                # HarmonyRelease schema (273 lines)
│   ├── combined_lookup.ts      # CombinedReleaseLookup
│   ├── merge.ts                # 3-phase merge algorithm
│   ├── compatibility.ts        # Compatibility checking
│   ├── deduplicate.ts          # Deduplication
│   ├── isrc.ts                 # ISRC validation
│   ├── language_script.ts      # Language/script detection
│   ├── release_label.ts        # Label normalization
│   ├── release_types.ts        # Release type inference
│   └── tracklist_gap.ts        # Track gap detection
├── musicbrainz/                # MusicBrainz integration
│   ├── seeding.ts              # MB format conversion
│   ├── mbid_mapping.ts         # MBID resolution (batch 100)
│   ├── api_client.ts           # MB API client
│   ├── annotation.ts           # Annotation builder
│   └── edit_link.ts            # Edit link generation
├── utils/                      # Utility modules
│   ├── config.ts               # Config helpers
│   ├── logger.ts               # Logging setup
│   ├── rate_limiter.ts         # Rate limiting
│   ├── cache.ts                # Cache utilities
│   └── errors.ts               # Error classes
├── testdata/                   # Test fixtures (43 cached responses)
│   ├── spotify/
│   ├── deezer/
│   ├── itunes/
│   └── ...
└── tests/                      # Test files (38 total)
    ├── providers/
    │   ├── spotify_test.ts
    │   ├── deezer_test.ts
    │   └── ...
    ├── harmonizer/
    │   ├── merge_test.ts
    │   ├── compatibility_test.ts
    │   └── ...
    └── musicbrainz/
        ├── seeding_test.ts
        └── mbid_mapping_test.ts

Configuration Management

config.ts (36 lines)

Location: config.ts

Purpose: Centralized configuration with environment variable loading

Structure:

import { getFromEnv, getUrlFromEnv } from './utils/config.ts';

export const config = {
	// OAuth2 Credentials
	spotify: {
		clientId: getFromEnv('HARMONY_SPOTIFY_CLIENT_ID'),
		clientSecret: getFromEnv('HARMONY_SPOTIFY_CLIENT_SECRET')
	},
	tidal: {
		clientId: getFromEnv('HARMONY_TIDAL_CLIENT_ID'),
		clientSecret: getFromEnv('HARMONY_TIDAL_CLIENT_SECRET')
	},
	
	// MusicBrainz Configuration
	musicbrainz: {
		apiUrl: getUrlFromEnv('HARMONY_MB_API_URL', 'https://musicbrainz.org/ws/2'),
		targetUrl: getUrlFromEnv('HARMONY_MB_TARGET_URL', 'https://musicbrainz.org')
	},
	
	// Data Storage
	dataDir: getFromEnv('HARMONY_DATA_DIR', './'),
	
	// Server Configuration
	port: parseInt(getFromEnv('PORT', '8000'), 10),
	// Optional values: getFromEnv() throws when no default is given, so read these directly
	forwardProto: Deno.env.get('FORWARD_PROTO'),
	deploymentId: Deno.env.get('DENO_DEPLOYMENT_ID')
};

utils/config.ts

Configuration Helpers:

export function getFromEnv(key: string, defaultValue?: string): string {
	const value = Deno.env.get(key);
	if (value === undefined) {
		if (defaultValue !== undefined) {
			return defaultValue;
		}
		throw new Error(`Environment variable ${key} is required but not set`);
	}
	return value;
}

export function getBooleanFromEnv(key: string, defaultValue: boolean): boolean {
	const value = Deno.env.get(key);
	if (value === undefined) return defaultValue;
	return value.toLowerCase() === 'true' || value === '1';
}

export function getUrlFromEnv(key: string, defaultValue?: string): string {
	const value = getFromEnv(key, defaultValue);
	try {
		new URL(value); // Validate URL format
		return value;
	} catch {
		throw new Error(`Environment variable ${key} is not a valid URL: ${value}`);
	}
}
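
The helpers above cover strings, booleans, and URLs, while `port` is parsed with a bare `parseInt`, which yields `NaN` for malformed input without raising an error. A minimal sketch of a numeric helper in the same style follows. `getIntFromEnv` and the injected `lookup` parameter are hypothetical names and not part of Harmony; under Deno the lookup would be `(key) => Deno.env.get(key)`.

```typescript
// Hypothetical helper, not part of Harmony: integer variant of getFromEnv.
// The environment lookup is injected so the function also runs outside Deno.
type EnvLookup = (key: string) => string | undefined;

function getIntFromEnv(lookup: EnvLookup, key: string, defaultValue?: number): number {
	const raw = lookup(key);
	if (raw === undefined) {
		if (defaultValue !== undefined) return defaultValue;
		throw new Error(`Environment variable ${key} is required but not set`);
	}
	// Accept only plain decimal integers, so values such as "80a" are rejected
	if (!/^-?\d+$/.test(raw.trim())) {
		throw new Error(`Environment variable ${key} is not an integer: ${raw}`);
	}
	return parseInt(raw, 10);
}

// Usage with a fake environment
const fakeEnv: Record<string, string> = { PORT: '8080', BAD_PORT: '80a' };
const lookup: EnvLookup = (key) => fakeEnv[key];
console.log(getIntFromEnv(lookup, 'PORT', 8000));
```

Rejecting malformed values at startup keeps configuration errors close to their cause instead of surfacing later as a server that fails to bind.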

.env.example

Template:

# OAuth2 Credentials
# Get from: https://developer.spotify.com/dashboard
HARMONY_SPOTIFY_CLIENT_ID=
HARMONY_SPOTIFY_CLIENT_SECRET=

# Get from: https://developer.tidal.com/
HARMONY_TIDAL_CLIENT_ID=
HARMONY_TIDAL_CLIENT_SECRET=

# MusicBrainz Configuration
HARMONY_MB_API_URL=https://musicbrainz.org/ws/2
HARMONY_MB_TARGET_URL=https://musicbrainz.org

# Data Storage
HARMONY_DATA_DIR=/var/lib/harmony

# Server Configuration
PORT=8000
FORWARD_PROTO=https

Logging System

utils/logger.ts

Logger Setup:

import * as log from '@std/log/mod.ts';

export async function setupLogging() {
	await log.setup({
		handlers: {
			console: new log.handlers.ConsoleHandler('DEBUG', {
				formatter: (record) => {
					const timestamp = new Date(record.datetime).toISOString();
					const level = record.levelName.padEnd(7);
					const logger = record.loggerName.padEnd(20);
					return `${timestamp} ${level} ${logger} ${record.msg}`;
				},
				useColors: true
			})
		},
		loggers: {
			'harmony.lookup': {
				level: 'INFO',
				handlers: ['console']
			},
			'harmony.mbid': {
				level: 'DEBUG',
				handlers: ['console']
			},
			'harmony.provider': {
				level: 'INFO',
				handlers: ['console']
			},
			'harmony.server': {
				level: 'INFO',
				handlers: ['console']
			},
			'requests': {
				level: 'INFO',
				handlers: ['console']
			}
		}
	});
}

Logger Usage

Get logger:

import * as log from '@std/log/mod.ts';

const logger = log.getLogger('harmony.provider');

Log levels:

logger.debug('Debug message');
logger.info('Info message');
logger.warning('Warning message');
logger.error('Error message');
logger.critical('Critical message');

Interpolated messages (template literals):

logger.info(`Fetching album ${albumId} from ${providerName}`);
logger.warning(`Rate limit exceeded, retrying after ${retryAfter}s`);
logger.error(`Provider ${providerName} failed: ${error.message}`);

Color Formatting

Console output (with ANSI colors):

2024-01-01T12:00:00.000Z INFO    harmony.lookup       Looking up GTIN 0602537347377
2024-01-01T12:00:00.123Z INFO    harmony.provider     Spotify: Fetching album 3DiDSNVBRYVzccLn2yqhMJ
2024-01-01T12:00:00.456Z DEBUG   harmony.provider     Spotify: Using cached response
2024-01-01T12:00:00.789Z WARNING harmony.provider     iTunes: Rate limit exceeded
2024-01-01T12:00:01.234Z INFO    harmony.lookup       Merge complete: 3 providers

Color scheme:

  • DEBUG: Default terminal color (no color applied)
  • INFO: Blue
  • WARNING: Yellow
  • ERROR: Red
  • CRITICAL: Red + bold
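
The column alignment in the sample output comes from the two `padEnd` calls in the formatter. The sketch below reproduces that layout as a standalone function so it can be checked in isolation. `RecordLike` and `formatRecord` are illustrative names, not part of Harmony or of the Deno standard library.

```typescript
// Minimal stand-in for the log record fields the formatter reads
interface RecordLike {
	datetime: Date;
	levelName: string;
	loggerName: string;
	msg: string;
}

// Same layout as the formatter in setupLogging: padEnd keeps the columns aligned
function formatRecord(record: RecordLike): string {
	const timestamp = record.datetime.toISOString();
	const level = record.levelName.padEnd(7);
	const logger = record.loggerName.padEnd(20);
	return `${timestamp} ${level} ${logger} ${record.msg}`;
}

const line = formatRecord({
	datetime: new Date('2024-01-01T12:00:00.000Z'),
	levelName: 'INFO',
	loggerName: 'harmony.lookup',
	msg: 'Looking up GTIN 0602537347377',
});
console.log(line);
```

A width of 7 fits the longest common level name (`WARNING`). `padEnd` never truncates, so longer names such as `CRITICAL` or a logger name over 20 characters push the following columns to the right.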

Error Handling

Error Hierarchy

File: utils/errors.ts

// Base error
export class LookupError extends Error {
	constructor(message: string) {
		super(message);
		this.name = 'LookupError';
	}
}

// Provider errors
export class ProviderError extends LookupError {
	constructor(
		public provider: string,
		message: string
	) {
		super(`${provider}: ${message}`);
		this.name = 'ProviderError';
	}
}

// HTTP/API errors
export class ResponseError extends ProviderError {
	constructor(
		provider: string,
		public status: number,
		message: string
	) {
		super(provider, `HTTP ${status}: ${message}`);
		this.name = 'ResponseError';
	}
}

// Data compatibility errors
export class CompatibilityError extends LookupError {
	constructor(
		public property: string,
		public values: any[]
	) {
		super(`Incompatible values for ${property}: ${JSON.stringify(values)}`);
		this.name = 'CompatibilityError';
	}
}

// Cache errors
export class CacheMissError extends LookupError {
	constructor(
		public key: string
	) {
		super(`Cache miss for key: ${key}`);
		this.name = 'CacheMissError';
	}
}
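
Because each class extends the previous one, `instanceof` checks have to run from the most specific class to the most general. The sketch below illustrates this with condensed copies of three of the classes above. `classifyError` and its return labels are hypothetical and only serve the illustration.

```typescript
// Condensed copies of the classes above so the sketch is self-contained
class LookupError extends Error {
	constructor(message: string) {
		super(message);
		this.name = 'LookupError';
	}
}

class ProviderError extends LookupError {
	provider: string;
	constructor(provider: string, message: string) {
		super(`${provider}: ${message}`);
		this.name = 'ProviderError';
		this.provider = provider;
	}
}

class ResponseError extends ProviderError {
	status: number;
	constructor(provider: string, status: number, message: string) {
		super(provider, `HTTP ${status}: ${message}`);
		this.name = 'ResponseError';
		this.status = status;
	}
}

// Hypothetical helper: test the most specific class first, because a
// ResponseError is also a ProviderError and a LookupError
function classifyError(error: unknown): string {
	if (error instanceof ResponseError) {
		return error.status === 429 ? 'rate-limited' : 'http-error';
	}
	if (error instanceof ProviderError) return 'provider-error';
	if (error instanceof LookupError) return 'lookup-error';
	return 'unexpected';
}

console.log(classifyError(new ResponseError('Spotify', 429, 'Too Many Requests')));
```

Reversing the order of the checks would classify every `ResponseError` as a plain `LookupError`, since the first matching branch wins.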

Error Handling Patterns

Graceful Degradation

// Query all providers in parallel; Promise.allSettled never rejects,
// so one failing provider cannot abort the whole lookup
const results = await Promise.allSettled(
	providers.map((provider) => provider.lookup(input))
);

// Keep fulfilled results, log and skip rejected ones
const releases = [];
results.forEach((result, index) => {
	if (result.status === 'fulfilled') {
		releases.push(result.value);
	} else {
		logger.warning(`Provider ${providers[index].name} failed: ${result.reason}`);
	}
});

if (releases.length === 0) {
	throw new LookupError('All providers failed');
}
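
The same pattern can be packaged as a reusable function that also reports which providers failed. This is a sketch under assumptions: the `Provider` and `LookupOutcome` interfaces and the `lookupAll` name are hypothetical, and a plain `Error` stands in for `LookupError` to keep the block self-contained.

```typescript
// Hypothetical minimal provider shape, used only for this sketch
interface Provider<T> {
	name: string;
	lookup(input: string): Promise<T>;
}

interface LookupOutcome<T> {
	releases: T[];
	failures: { provider: string; reason: string }[];
}

async function lookupAll<T>(providers: Provider<T>[], input: string): Promise<LookupOutcome<T>> {
	// allSettled waits for every provider and never rejects
	const settled = await Promise.allSettled(providers.map((p) => p.lookup(input)));
	const outcome: LookupOutcome<T> = { releases: [], failures: [] };
	settled.forEach((result, index) => {
		if (result.status === 'fulfilled') {
			outcome.releases.push(result.value);
		} else {
			const reason = result.reason instanceof Error ? result.reason.message : String(result.reason);
			outcome.failures.push({ provider: providers[index].name, reason });
		}
	});
	if (outcome.releases.length === 0) {
		throw new Error('All providers failed');
	}
	return outcome;
}

// Two fake providers: one succeeds, one fails
const fakeProviders: Provider<string>[] = [
	{ name: 'A', lookup: (input) => Promise.resolve(`release for ${input}`) },
	{ name: 'B', lookup: () => Promise.reject(new Error('HTTP 500')) },
];

lookupAll(fakeProviders, '0602537347377').then((outcome) => console.log(outcome));
```

Returning the failures next to the releases lets a caller show partial results and still tell the user which sources were unavailable.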

Rate Limit Handling

async function fetchWithRetry(url: string, maxRetries = 3): Promise<Response> {
	for (let attempt = 0; attempt < maxRetries; attempt++) {
		const response = await fetch(url);
		
		if (response.status === 429) {
			// Rate limit exceeded: discard the unread body before retrying
			await response.body?.cancel();
			
			// Retry-After can also be an HTTP date, which parseInt turns into NaN
			const parsed = parseInt(response.headers.get('Retry-After') ?? '', 10);
			const retryAfter = Number.isNaN(parsed) ? 60 : parsed;
			
			if (retryAfter > 300) {
				// Don't wait more than 5 minutes
				throw new ResponseError('provider', 429, `Rate limit exceeded, retry after ${retryAfter}s (too long)`);
			}
			
			logger.warning(`Rate limit exceeded, retrying after ${retryAfter}s`);
			await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
			continue;
		}
		
		if (!response.ok) {
			throw new ResponseError('provider', response.status, response.statusText);
		}
		
		return response;
	}
	
	throw new ResponseError('provider', 429, 'Rate limit exceeded after max retries');
}
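
HTTP allows `Retry-After` to carry either a number of seconds or an HTTP date, so a retry loop benefits from a parser that handles both forms. The following is a minimal sketch. `parseRetryAfter` is a hypothetical helper, and the 60 second fallback simply mirrors the default used above.

```typescript
// Hypothetical helper: returns the delay in whole seconds.
// Retry-After is either delta-seconds ("120") or an HTTP date.
function parseRetryAfter(header: string | null, nowMs: number, fallbackSeconds = 60): number {
	if (header === null || header.trim() === '') return fallbackSeconds;
	const value = header.trim();
	if (/^\d+$/.test(value)) return parseInt(value, 10);
	const dateMs = Date.parse(value);
	if (Number.isNaN(dateMs)) return fallbackSeconds;
	// A date in the past means the client may retry immediately
	return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}

console.log(parseRetryAfter('120', Date.now()));
```

Passing the current time as a parameter keeps the function deterministic, which makes the date branch straightforward to test.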

Error Propagation

try {
	const release = await provider.lookup(input);
	return provider.harmonize(release);
} catch (error) {
	if (error instanceof ProviderError) {
		// Log and re-throw provider errors
		logger.error(error.message);
		throw error;
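	} else {
		// Wrap unexpected errors; caught values are `unknown` under strict mode
		const message = error instanceof Error ? error.message : String(error);
		throw new ProviderError(provider.name, message);
	}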
}

Testing Infrastructure

Test Framework

Deno built-in testing + @std/testing:

import { assertEquals, assertExists } from '@std/assert/mod.ts';
import { describe, it } from '@std/testing/bdd.ts';

Test Structure

38 test files organized by module (the tree below lists a subset):

tests/
├── providers/
│   ├── spotify_test.ts
│   ├── deezer_test.ts
│   ├── itunes_test.ts
│   ├── tidal_test.ts
│   ├── musicbrainz_test.ts
│   ├── bandcamp_test.ts
│   ├── beatport_test.ts
│   ├── mora_test.ts
│   └── ototoy_test.ts
├── harmonizer/
│   ├── merge_test.ts
│   ├── compatibility_test.ts
│   ├── deduplicate_test.ts
│   ├── isrc_test.ts
│   ├── language_script_test.ts
│   ├── release_label_test.ts
│   ├── release_types_test.ts
│   └── tracklist_gap_test.ts
└── musicbrainz/
    ├── seeding_test.ts
    ├── mbid_mapping_test.ts
    ├── annotation_test.ts
    └── edit_link_test.ts

Declarative Provider Tests

File: tests/utils/describe_provider.ts

Purpose: Consistent provider testing with minimal boilerplate

Usage:

import { describeProvider } from '../utils/describe_provider.ts';

describeProvider({
	name: 'Spotify',
	provider: new SpotifyProvider(),
	tests: {
		urlMatching: [
			{ url: 'https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ', shouldMatch: true },
			{ url: 'https://www.deezer.com/album/123456', shouldMatch: false }
		],
		gtinLookup: {
			gtin: '0602537347377',
			expectedTitle: 'Album Title',
			expectedArtists: ['Artist Name']
		},
		urlLookup: {
			url: 'https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ',
			expectedTitle: 'Album Title'
		},
		harmonization: {
			input: spotifyAlbumFixture,
			expectedFields: ['title', 'artists', 'gtin', 'media', 'images']
		}
	}
});

Generated tests:

  • URL pattern matching
  • GTIN lookup
  • URL lookup
  • Harmonization
  • Feature quality validation
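
To illustrate how a declarative spec can expand into individual test cases, the sketch below generates the URL-matching tests from a list of cases. All names (`UrlCase`, `ProviderLike`, `supportsUrl`, `generateUrlTests`) are assumptions made for this example. Harmony's actual `describeProvider` is not shown in this document and may work differently.

```typescript
// Hypothetical shapes; Harmony's real describeProvider may differ
interface UrlCase {
	url: string;
	shouldMatch: boolean;
}

interface ProviderLike {
	name: string;
	supportsUrl(url: string): boolean;
}

interface GeneratedTest {
	name: string;
	run(): void;
}

// Expand each declarative case into a named, runnable test
function generateUrlTests(provider: ProviderLike, cases: UrlCase[]): GeneratedTest[] {
	return cases.map(({ url, shouldMatch }) => ({
		name: `${provider.name}: ${shouldMatch ? 'matches' : 'rejects'} ${url}`,
		run() {
			if (provider.supportsUrl(url) !== shouldMatch) {
				throw new Error(`Unexpected match result for ${url}`);
			}
		},
	}));
}

// Fake provider that only recognizes open.spotify.com URLs
const fakeSpotify: ProviderLike = {
	name: 'Spotify',
	supportsUrl: (url) => new URL(url).hostname === 'open.spotify.com',
};

const generated = generateUrlTests(fakeSpotify, [
	{ url: 'https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ', shouldMatch: true },
	{ url: 'https://www.deezer.com/album/123456', shouldMatch: false },
]);
for (const test of generated) test.run();
```

Under Deno, each generated entry could be registered with `Deno.test(test.name, test.run)` instead of being run in a loop.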

Snapshot Testing

Purpose: Verify output stability across changes

Example:

import { assertSnapshot } from '@std/testing/snapshot.ts';

Deno.test('Spotify harmonization snapshot', async (t) => {
	const provider = new SpotifyProvider();
	const spotifyAlbum = await loadFixture('spotify/album.json');
	const harmonyRelease = provider.harmonize(spotifyAlbum);
	
	await assertSnapshot(t, harmonyRelease);
});

Snapshot file (auto-generated; abbreviated here to three fields):

// __snapshots__/spotify_test.ts.snap
export const snapshot = {};

snapshot[`Spotify harmonization snapshot 1`] = `
{
  artists: [
    {
      name: "Artist Name",
    },
  ],
  gtin: "0602537347377",
  title: "Album Title",
}
`;

Offline Testing

Test data: 43 cached responses in testdata/

Structure:

testdata/
├── spotify/
│   ├── album_3DiDSNVBRYVzccLn2yqhMJ.json
│   ├── album_search_upc_0602537347377.json
│   └── ...
├── deezer/
│   ├── album_123456.json
│   └── ...
├── itunes/
│   ├── lookup_us_123456.json
│   └── ...
└── ...

Loading fixtures:

async function loadFixture(path: string): Promise<any> {
	const content = await Deno.readTextFile(`testdata/${path}`);
	return JSON.parse(content);
}

Offline mode (default):

deno test -A

Uses cached responses from testdata/, no network requests.

Download mode (fetch fresh data):

deno test -A -- --download

Fetches fresh responses from providers and updates testdata/.

Test Coverage

Run tests with coverage:

deno test -A --coverage=coverage
deno coverage coverage

Coverage report:

file:///opt/harmony/providers/spotify.ts 95.2%
file:///opt/harmony/harmonizer/merge.ts 88.7%
file:///opt/harmony/musicbrainz/seeding.ts 92.3%
...

Code Style

Formatting Rules

File: deno.json

{
	"fmt": {
		"useTabs": true,
		"lineWidth": 120,
		"indentWidth": 4,
		"singleQuote": true,
		"proseWrap": "preserve"
	}
}

Rules:

  • Tabs: Use tabs for indentation (not spaces)
  • Line width: 120 characters maximum
  • Quotes: Single quotes for strings
  • Semicolons: Required
  • Trailing commas: Allowed

Format code:

deno fmt

Check formatting:

deno fmt --check

Linting Rules

File: deno.json

{
	"lint": {
		"rules": {
			"tags": ["recommended"],
			"exclude": ["no-explicit-any"]
		}
	}
}

Lint code:

deno lint

Common lint errors:

  • Unused variables
  • Missing return types (only when explicit-function-return-type is enabled; it is not part of recommended)
  • Unreachable code
  • Prefer const over let

Type Checking

Strict mode enabled:

{
	"compilerOptions": {
		"strict": true,
		"noImplicitAny": true,
		"strictNullChecks": true,
		"strictFunctionTypes": true
	}
}

Type check:

deno check **/*.ts

Dependency Management

deno.json

Import map:

{
	"imports": {
		"$fresh/": "https://deno.land/x/fresh@1.6.8/",
		"preact": "https://esm.sh/preact@10.19.6",
		"preact/": "https://esm.sh/preact@10.19.6/",
		"@preact/signals": "https://esm.sh/@preact/signals@1.2.2",
		"@kellnerd/musicbrainz": "https://deno.land/x/musicbrainz@v0.5.0/mod.ts",
		"snap-storage": "https://deno.land/x/snap_storage@v0.2.0/mod.ts",
		"@std/": "https://deno.land/std@0.208.0/"
	}
}

Key dependencies:

Dependency             Version  Purpose
---------------------  -------  ----------------------
Fresh                  1.6.8    Web framework
Preact                 10.19.6  UI library
@kellnerd/musicbrainz  0.5.0    MusicBrainz API client
snap-storage           0.2.0    HTTP response caching
@std/*                 0.208.0  Deno standard library

Lock File

deno.lock: Dependency integrity verification

Update lock file:

deno cache --reload --lock=deno.lock --lock-write server/main.ts cli.ts

Tasks

deno.json Tasks

{
	"tasks": {
		"check": "deno fmt --check && deno lint && deno check **/*.ts",
		"ok": "deno fmt && deno lint && deno check **/*.ts && deno test -A",
		"cli": "deno run -A cli.ts",
		"dev": "deno run -A --watch=static/,routes/ server/dev.ts",
		"build": "deno run -A server/dev.ts build",
		"server": "DENO_DEPLOYMENT_ID=$(git describe --tags --always) deno run -A server/main.ts"
	}
}

Task descriptions:

Task    Purpose                                         Usage
------  ----------------------------------------------  ----------------------------------
check   Verify code quality (format, lint, type check)  deno task check
ok      Format, lint, check, and test                   deno task ok
cli     Run CLI                                         deno task cli --gtin 0602537347377
dev     Start development server                        deno task dev
build   Build static assets                             deno task build
server  Start production server                         deno task server

No External Tooling

Harmony does not use:

  • Sentry: No error tracking
  • Prometheus: No metrics collection
  • Datadog/New Relic: No APM
  • Webpack/Vite: Fresh handles bundling
  • ESLint: Deno lint built-in
  • Prettier: Deno fmt built-in
  • Jest/Mocha: Deno test built-in

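Rationale: Deno covers formatting, linting, type checking, and testing out of the box, and Fresh handles bundling. Error tracking, metrics, and APM are not replaced by anything; the project simply does not integrate them.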

Performance Optimizations

Parallel Provider Queries

const lookups = providers.map(p => p.lookup(input));
const results = await Promise.allSettled(lookups);

Benefit: Reduce total response time from sum of provider latencies to max of provider latencies.

HTTP Response Caching

const cached = await cache.get(url);
if (cached) return cached;

const response = await fetch(url);
await cache.set(url, response);
return response;

Benefit: Avoid redundant API calls, comply with rate limits.
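
A `Response` body can be read only once, so a cache usually stores the decoded body instead of the `Response` object itself. The sketch below shows a cache-aside wrapper with a time-to-live. `withCache`, `Fetcher`, and the in-memory `Map` are assumptions for illustration and do not reflect the snap-storage API.

```typescript
// Hypothetical cache-aside wrapper; the Map stands in for a persistent store
type Fetcher = (url: string) => Promise<string>;

function withCache(fetcher: Fetcher, ttlMs: number, now: () => number = Date.now): Fetcher {
	const entries = new Map<string, { body: string; storedAt: number }>();
	return async (url: string): Promise<string> => {
		const hit = entries.get(url);
		if (hit !== undefined && now() - hit.storedAt < ttlMs) {
			return hit.body; // fresh cache hit: no request is made
		}
		const body = await fetcher(url);
		entries.set(url, { body, storedAt: now() });
		return body;
	};
}

// Count how often the underlying fetcher is actually invoked
let calls = 0;
const cachedFetch = withCache(async (url) => {
	calls++;
	return `body of ${url}`;
}, 60_000);

cachedFetch('https://example.com/a')
	.then(() => cachedFetch('https://example.com/a'))
	.then((body) => console.log(body, `fetcher calls: ${calls}`));
```

Two concurrent requests for the same uncached URL would both reach the fetcher in this version. Storing the pending promise instead of the resolved body is one way to avoid that.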

OAuth2 Token Caching

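// localStorage stores strings, so the token object is JSON-serialized
const raw = localStorage.getItem('spotify_token');
if (raw !== null) {
	const cached = JSON.parse(raw);
	if (!isExpired(cached)) {
		return cached.access_token;
	}
}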

Benefit: Reduce token requests, faster authentication.
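
The snippet above relies on an `isExpired` check that is not shown. One possible shape is sketched below, assuming the cached entry stores an absolute expiry time derived from the `expires_in` field of the OAuth2 token response. `CachedToken`, `toCachedToken`, and the 30 second safety margin are assumptions, not Harmony's implementation.

```typescript
// Hypothetical token shape for the cache entry
interface CachedToken {
	access_token: string;
	expires_at: number; // absolute expiry time in milliseconds since the epoch
}

// Build the cache entry from a token response (expires_in is given in seconds)
function toCachedToken(accessToken: string, expiresInSeconds: number, nowMs: number): CachedToken {
	return { access_token: accessToken, expires_at: nowMs + expiresInSeconds * 1000 };
}

// Treat the token as expired slightly early so it cannot lapse mid-request
function isExpired(token: CachedToken, nowMs: number, marginMs = 30_000): boolean {
	return nowMs >= token.expires_at - marginMs;
}

// Round trip through a string, as a string-only store such as localStorage requires
const stored = JSON.stringify(toCachedToken('abc', 3600, 0));
const restored: CachedToken = JSON.parse(stored);
console.log(isExpired(restored, 1_000_000));
```

Storing an absolute timestamp instead of the relative `expires_in` value means the check stays valid no matter how long the entry has been sitting in the cache.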

Server-Side Rendering

Fresh SSR generates HTML on server, reducing client-side JavaScript.

Benefit: Faster initial page load, better SEO.

Islands Architecture

Only interactive components load JavaScript on client.

Benefit: Minimal JavaScript bundle size, faster page interactivity.

Summary

Harmony's codebase demonstrates:

  1. Clean architecture: Clear separation of concerns (providers, harmonizer, MusicBrainz)
  2. Type safety: Full TypeScript coverage with strict mode
  3. Comprehensive testing: 38 test files with declarative provider specs
  4. Offline testing: 43 cached responses for reproducible tests
  5. Logging system: 5 specialized loggers with color formatting
  6. Error hierarchy: Structured error handling with graceful degradation
  7. Configuration management: Environment variables with validation
  8. Code quality: Deno fmt, lint, and type check enforced
  9. No external tooling: Deno provides all necessary tools
  10. Performance optimizations: Parallel queries, caching, SSR, islands

Taken together, these properties make the codebase a useful reference for building type-safe, well-tested metadata aggregation systems.