# Harmony - Project Overview

## Project Identity

| Property | Value |
|----------|-------|
| **Name** | Harmony |
| **Repository** | https://github.com/kellnerd/harmony |
| **License** | MIT (2022-2024 David Kellner) |
| **Language** | TypeScript |
| **Runtime** | Deno |
| **Primary Framework** | Fresh 1.6.8 |
| **UI Library** | Preact 10.19.6 |
| **Purpose** | Music metadata aggregator and MusicBrainz importer |

## Core Purpose

Harmony is a specialized tool designed to solve two critical problems in music metadata management:

1. **Multi-source metadata aggregation**: Fetches release information from 9 different music platforms and intelligently merges them into a unified, harmonized dataset
2. **MusicBrainz import facilitation**: Converts aggregated metadata into MusicBrainz-compatible format for seeding new releases or improving existing entries

The project targets MusicBrainz editors and music metadata enthusiasts who need to cross-reference multiple sources when adding or verifying release information.

## Technical Stack

### Runtime and Framework

- **Deno**: Modern TypeScript/JavaScript runtime with built-in tooling
- **Fresh 1.6.8**: Deno-native web framework with server-side rendering and islands architecture
- **Preact 10.19.6**: Lightweight React alternative for interactive UI components

### Key Dependencies

| Dependency | Purpose |
|------------|---------|
| `@kellnerd/musicbrainz` | MusicBrainz API client and data structures |
| `snap-storage` | HTTP response caching with SQLite backend |
| `@std/*` | Deno standard library modules (log, testing, http, etc.) |
| `preact` | UI rendering and component system |
| `preact-render-to-string` | Server-side rendering |

## Entry Points

The project provides three distinct entry points for different use cases:

### 1. Web Server (Production)

```bash
# File: server/main.ts
deno task server
```

Starts the Fresh web application for interactive metadata lookup and comparison.

### 2. Development Server

```bash
# File: server/dev.ts
deno task dev
```

Runs the web server with auto-reload on file changes.

### 3. Command-Line Interface

```bash
# File: cli.ts
deno task cli
```

Provides terminal-based GTIN/URL lookup for testing and automation.

## Available Tasks

The `deno.json` configuration defines the following tasks:

| Task | Command | Purpose |
|------|---------|---------|
| `check` | `deno fmt --check && deno lint && deno check **/*.ts` | Verify code formatting, linting, and type checking |
| `ok` | `deno fmt && deno lint && deno check **/*.ts && deno test -A` | Format, lint, check, and test in one command |
| `cli` | `deno run -A cli.ts` | Run command-line interface |
| `dev` | `deno run -A --watch=static/,routes/ server/dev.ts` | Start development server with auto-reload |
| `build` | `deno run -A server/dev.ts build` | Build static assets |
| `server` | `DENO_DEPLOYMENT_ID=$(git describe --tags --always) deno run -A server/main.ts` | Start production server |

## Provider Ecosystem

Harmony integrates with 9 music metadata providers, categorized by access method:

### API-Based Providers (5)

| Provider | Authentication | Rate Limit | Max Image Size | GTIN Support |
|----------|---------------|------------|----------------|--------------|
| **Spotify** | OAuth2 | Not specified | 2000px | Yes (UPC) |
| **Deezer** | Public API | 50 req/5s | 1400px | Yes |
| **iTunes** | Public API | Not specified | Varies | Yes |
| **Tidal** | OAuth2 | Not specified | 1280px | Yes |
| **MusicBrainz** | Public API | 5 req/5s | N/A | Yes (barcode) |

### HTML Scraping Providers (4)

| Provider | Region | Max Image Size | GTIN Support | Notes |
|----------|--------|----------------|--------------|-------|
| **Bandcamp** | Global | 3000px | No | JSON-LD extraction |
| **Beatport** | Global | Varies | Yes | Electronic music focus |
| **Mora** | Japan | Varies | Yes | Japanese market |
| **Ototoy** | Japan | Varies | Yes | Japanese market |

### Not Implemented

- **KKBOX**: Mentioned in documentation but not implemented

## Architecture Highlights

Harmony employs a **4-stage pipeline** for metadata processing:

1. **LOOKUP**: `CombinedReleaseLookup` queries multiple providers in parallel
2. **HARMONIZE**: Each provider converts its native format to `HarmonyRelease` schema
3. **MERGE**: Combines releases from multiple providers using configurable preferences
4. **SEED**: Converts harmonized data to MusicBrainz import format

This pipeline ensures:

- Parallel provider queries for performance
- Standardized internal data representation
- Intelligent conflict resolution
- MusicBrainz-compatible output

## Data Storage Strategy

Harmony uses a **cache-first, no-database** approach:

- **snap_storage**: SQLite-backed HTTP response cache (`snaps.db` + `snaps/` directory)
- **24-hour default cache policy**: Reduces API calls and enables permalink functionality
- **Permalink system**: `ts` parameter replays cached lookups for reproducible results
- **In-memory processing**: All data transformations happen in memory, no persistent storage

This design prioritizes:

- Reproducibility (permalinks)
- API rate limit compliance
- Simplicity (no database migrations)
- Statelessness (no user data storage)

## Deployment Model

Harmony is designed for **self-hosted deployment** without containerization:

### Production Deployment

```bash
deno run -A server/main.ts
```

Environment variables:

- `PORT`: Server port (default varies)
- `DENO_DEPLOYMENT_ID`: Version identifier (auto-set from git tags)
- `HARMONY_SPOTIFY_CLIENT_ID` / `HARMONY_SPOTIFY_CLIENT_SECRET`
- `HARMONY_TIDAL_CLIENT_ID` / `HARMONY_TIDAL_CLIENT_SECRET`
- `HARMONY_MB_API_URL`: MusicBrainz API endpoint
- `HARMONY_MB_TARGET_URL`: MusicBrainz target instance
- `HARMONY_DATA_DIR`: Data directory for cache storage

### CI/CD Pipeline

GitHub Actions workflow (`deno.yml`):

1. **Test stage**: Format check, lint, type check, unit tests
2. **Deploy stage**: SSH to server, rsync code, systemd service restart
3. **Trigger**: Tagged releases (`v*`) and authorized users only

### No Docker

The project intentionally avoids containerization:

- Deno provides consistent runtime across environments
- Fresh framework handles asset bundling
- Simple systemd service management
- Direct SSH deployment

## CLI Usage

The command-line interface supports GTIN and URL lookups:

```bash
# GTIN lookup
deno task cli --gtin 0602537347377

# URL lookup
deno task cli --url https://open.spotify.com/album/xyz

# Multiple URLs
deno task cli --url https://open.spotify.com/album/xyz --url https://www.deezer.com/album/123

# Region-specific lookup
deno task cli --gtin 0602537347377 --region JP,US
```

Output includes:

- Harmonized release metadata
- Provider comparison
- Compatibility warnings
- MusicBrainz seeding data

## Web Interface

The Fresh-based web UI provides:

### Main Route: `/release`

Query parameters:

- `gtin`: Global Trade Item Number (barcode)
- `url`: Provider URL(s) - supports multiple
- `region`: Market regions (default: GB,US,DE,JP)
- `category`: Provider category filter (all/default/preferred)
- `[provider_name]`: Provider-specific ID or GTIN lookup
- `[provider_name]!`: Template mode for provider
- `ts`: Timestamp for permalink replay

### Additional Routes

| Route | Purpose |
|-------|---------|
| `/` | Landing page with documentation |
| `/release/actions` | ISRC/cover submission for existing MusicBrainz releases |
| `/about` | Provider documentation and feature comparison |
| `/settings` | User preferences (stored in cookies) |

### UI Components

- **22 static components**: Server-rendered UI elements
- **5 interactive islands**: Client-side interactive features (Fresh islands architecture)

## Feature Quality System

Providers are rated on feature quality using a standardized scale:

| Rating | Meaning |
|--------|---------|
| `MISSING` | Feature not available |
| `BAD` | Feature present but unreliable/incomplete |
| `PRESENT` | Feature available with acceptable quality |
| `GOOD` | Feature available with high quality |
| Numeric | Specific measurements (e.g., image dimensions) |

This system enables:

- Informed provider selection
- Merge algorithm prioritization
- User transparency about data quality

## Development Workflow

### Code Quality Standards

```bash
# Format code (tabs, single quotes, 120 char width)
deno fmt

# Lint code
deno lint

# Type check
deno check **/*.ts

# Run tests
deno test -A

# All-in-one
deno task ok
```

### Testing Infrastructure

- **38 test files**: Comprehensive test coverage
- **Declarative provider specs**: `describeProvider` helper for consistent provider testing
- **Snapshot testing**: Verify output stability
- **Offline mode**: 43 cached responses in `testdata/` directory
- **Download flag**: `--download` to fetch fresh test data

### Logging System

5 specialized loggers using Deno std/log:

| Logger | Level | Purpose |
|--------|-------|---------|
| `harmony.lookup` | INFO | Release lookup operations |
| `harmony.mbid` | DEBUG | MusicBrainz ID resolution |
| `harmony.provider` | DEBUG/INFO | Provider interactions |
| `harmony.server` | INFO | Server lifecycle events |
| `requests` | INFO/WARN | HTTP request logging |

All loggers use `ConsoleHandler` with color formatting for readability.

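
As a rough illustration of how the per-logger levels in the table above act as filters, the following sketch models them without using Deno's std/log at all. The `LEVELS` ordering, the `loggerLevels` map and the `shouldLog` helper are hypothetical names introduced here, not part of Harmony. For the mixed entries (`DEBUG/INFO`, `INFO/WARN`) the lower level is assumed to be the threshold.

```typescript
// Severity order, lowest first. This ordering is an assumption of the sketch.
const LEVELS = ['DEBUG', 'INFO', 'WARN', 'ERROR'] as const;
type Level = typeof LEVELS[number];

// Minimum level per logger, copied from the table. For mixed entries
// the lower of the two listed levels is assumed.
const loggerLevels: Record<string, Level> = {
	'harmony.lookup': 'INFO',
	'harmony.mbid': 'DEBUG',
	'harmony.provider': 'DEBUG',
	'harmony.server': 'INFO',
	'requests': 'INFO',
};

// Returns true if a message at `level` would pass the threshold of `logger`.
function shouldLog(logger: string, level: Level): boolean {
	const minimum = loggerLevels[logger];
	// Unknown loggers stay silent in this sketch.
	if (minimum === undefined) return false;
	return LEVELS.indexOf(level) >= LEVELS.indexOf(minimum);
}
```

With these thresholds, `shouldLog('harmony.lookup', 'DEBUG')` is `false`, while `shouldLog('harmony.mbid', 'DEBUG')` is `true`.
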
## Error Handling Philosophy

Harmony uses a **graceful degradation** approach:

### Error Hierarchy

```
LookupError (base)
└── ProviderError
    ├── ResponseError (HTTP/API errors)
    ├── CompatibilityError (data conflicts)
    └── CacheMissError (cache lookup failures)
```

### Resilience Strategy

- `Promise.allSettled`: Continue processing even if some providers fail
- Rate limit handling: Parse `Retry-After` headers, dynamic delay adjustment
- Partial results: Return available data even with provider failures
- User feedback: Display warnings for failed providers

## Project Maturity

### Strengths

- **Single developer project**: Consistent vision and architecture
- **Active maintenance**: Recent Tidal v1 deprecation handling (2025-01-21)
- **Production-ready**: Used by MusicBrainz community
- **Well-tested**: 38 test files with offline test data
- **Type-safe**: Full TypeScript coverage with 273-line `HarmonyRelease` schema

### Limitations

- **No REST API**: Web UI only, no programmatic JSON endpoints
- **No authentication**: Public access only
- **No metrics/monitoring**: No health endpoint, no Sentry integration
- **Scraping fragility**: HTML-based providers break when sites change
- **Deno-only**: Fresh framework ties project to Deno ecosystem

## Relevance to Metadata Aggregation

Harmony represents the **gold standard** for multi-source music metadata aggregation:

### Architectural Lessons

1. **Provider abstraction**: Base classes with URLPattern matching, rate limiting, caching
2. **Harmonized schema**: `HarmonyRelease` as universal internal format
3. **Intelligent merging**: 3-phase merge with provider preferences
4. **Permalink system**: Timestamp-based cache replay for reproducibility
5. **Quality ratings**: Per-feature, per-provider quality assessment

### Adoption Recommendations

- **HarmonyRelease schema**: Adopt as internal data model
- **Merge algorithm**: Study 3-phase merge with compatibility checking
- **Provider base classes**: Reuse abstraction patterns
- **MBID resolution**: Batch URL lookup (100 per request) is efficient
- **Testing framework**: Declarative provider specs with offline mode

## Configuration Management

### Environment Variables

```bash
# OAuth2 Credentials
HARMONY_SPOTIFY_CLIENT_ID=your_client_id
HARMONY_SPOTIFY_CLIENT_SECRET=your_client_secret
HARMONY_TIDAL_CLIENT_ID=your_client_id
HARMONY_TIDAL_CLIENT_SECRET=your_client_secret

# MusicBrainz Integration
HARMONY_MB_API_URL=https://musicbrainz.org/ws/2
HARMONY_MB_TARGET_URL=https://musicbrainz.org

# Storage
HARMONY_DATA_DIR=/path/to/data

# Server
PORT=8000
FORWARD_PROTO=https
```

### Configuration Helpers

Located in `utils/config.ts`:

- `getFromEnv(key, defaultValue)`: String environment variables
- `getBooleanFromEnv(key, defaultValue)`: Boolean parsing
- `getUrlFromEnv(key, defaultValue)`: URL validation

### Template

`.env.example` provides a complete configuration template for new deployments.

## Community and Licensing

- **License**: MIT (permissive, commercial-friendly)
- **Copyright**: 2022-2024 David Kellner
- **Community**: MusicBrainz editor community
- **Contribution**: Single maintainer, open to contributions
- **Documentation**: Comprehensive inline comments and type definitions

## Summary

Harmony is a production-ready, TypeScript-based music metadata aggregator that demonstrates best practices in:

- Multi-source data integration
- Intelligent conflict resolution
- MusicBrainz ecosystem integration
- Type-safe architecture
- Graceful error handling

Its 4-stage pipeline (LOOKUP → HARMONIZE → MERGE → SEED) and provider abstraction system make it the most relevant reference project for building a comprehensive metadata aggregation system.
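
The four stages named in the summary can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Harmony's actual code: `Release`, `Provider`, `lookupAll`, `merge` and `seed` are hypothetical names. The real project uses `CombinedReleaseLookup` and the much larger `HarmonyRelease` schema. The merge shown here is a naive "first value wins" rule, not the project's 3-phase merge, and the seed field names are illustrative only.

```typescript
// Hypothetical, heavily reduced stand-in for the HarmonyRelease schema.
interface Release {
	title: string;
	gtin?: string;
	sources: string[];
}

// A provider is modelled as a name plus an async lookup that already returns
// harmonized data (LOOKUP and HARMONIZE are combined for brevity).
interface Provider {
	name: string;
	lookup(gtin: string): Promise<Release>;
}

// LOOKUP + HARMONIZE: query all providers in parallel and tolerate failures.
async function lookupAll(providers: Provider[], gtin: string) {
	const settled = await Promise.allSettled(providers.map((p) => p.lookup(gtin)));
	const releases: Release[] = [];
	const warnings: string[] = [];
	settled.forEach((result, i) => {
		if (result.status === 'fulfilled') releases.push(result.value);
		else warnings.push(`${providers[i].name}: ${String(result.reason)}`);
	});
	return { releases, warnings };
}

// MERGE: earlier releases take precedence, later ones only fill gaps.
function merge(releases: Release[]): Release | undefined {
	if (releases.length === 0) return undefined;
	return releases.reduce((merged, next) => ({
		title: merged.title || next.title,
		gtin: merged.gtin ?? next.gtin,
		sources: [...merged.sources, ...next.sources],
	}));
}

// SEED: flatten the merged release into key/value pairs.
// The keys are illustrative and do not follow the MusicBrainz seeding format.
function seed(release: Release): Record<string, string> {
	const params: Record<string, string> = { name: release.title };
	if (release.gtin) params.barcode = release.gtin;
	return params;
}
```

Because `lookupAll` uses `Promise.allSettled`, a failing provider becomes a warning and does not abort the lookup, which matches the graceful degradation approach described earlier. The order of the `providers` array doubles as the merge preference in this sketch.
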