metadata-agregator/docs/research/harmony/analysis/API.md
Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00


# Harmony - API and Interface Analysis
## API Architecture
Harmony is a **web UI-first application** built on the Fresh framework. It does not provide a traditional REST API or JSON endpoints. All interactions occur through server-side rendered HTML pages with embedded data.
### Framework: Fresh 1.6.8
Fresh is a Deno-native web framework with:
- **Server-side rendering (SSR)**: All pages rendered on server
- **Islands architecture**: Selective client-side interactivity
- **File-based routing**: Routes defined by file structure
- **Zero config**: No build step required for development
## Route Structure
### Main Application Routes
| Route | File | Method | Purpose |
|-------|------|--------|---------|
| `/` | `routes/index.tsx` | GET | Landing page with documentation |
| `/release` | `routes/release.tsx` | GET | Main lookup and comparison interface |
| `/release/actions` | `routes/release/actions.tsx` | GET | ISRC/cover submission for existing MB releases |
| `/about` | `routes/about.tsx` | GET | Provider documentation and feature matrix |
| `/settings` | `routes/settings.tsx` | GET/POST | User preferences (stored in cookies) |
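
The file-to-route mapping in the table above can be modelled in a few lines. This is a simplified sketch of file-based routing, not Fresh's actual resolver: the function name `routeFromFile` is hypothetical, and dynamic segments and special files are deliberately ignored.

```typescript
// Simplified model of file-based routing (assumption: static routes only;
// dynamic segments and special files such as _app.tsx are not handled).
function routeFromFile(file: string): string {
  const withoutPrefix = file.replace(/^routes\//, "");
  const withoutExtension = withoutPrefix.replace(/\.(tsx|ts|jsx|js)$/, "");
  // An index file maps to the path of its parent directory.
  const withoutIndex = withoutExtension.replace(/(^|\/)index$/, "");
  return "/" + withoutIndex;
}

const routeFiles = [
  "routes/index.tsx",
  "routes/release.tsx",
  "routes/release/actions.tsx",
  "routes/about.tsx",
  "routes/settings.tsx",
];
for (const file of routeFiles) {
  console.log(`${file} -> ${routeFromFile(file)}`);
}
```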
### Static Assets
| Route | Purpose |
|-------|---------|
| `/static/*` | CSS, JavaScript, images |
| `/favicon.ico` | Site favicon |
## Primary Route: `/release`
The main interface for metadata lookup and harmonization.
### Query Parameters
#### Core Lookup Parameters
| Parameter | Type | Required | Description | Example |
|-----------|------|----------|-------------|---------|
| `gtin` | string | No* | Global Trade Item Number (barcode) | `0602537347377` |
| `url` | string[] | No* | Provider URL(s), supports multiple | `https://open.spotify.com/album/xyz` |
*At least one of `gtin`, `url`, or a provider-specific parameter (see below) must be provided.
#### Provider-Specific Parameters
| Parameter | Type | Description | Example |
|-----------|------|-------------|---------|
| `[provider_name]` | string | Provider-specific ID or GTIN lookup | `spotify=3DiDSNVBRYVzccLn2yqhMJ` |
| `[provider_name]!` | empty | Template mode for provider | `musicbrainz!` |
**Supported Provider Names**:
- `spotify`
- `deezer`
- `itunes`
- `tidal`
- `bandcamp`
- `beatport`
- `musicbrainz`
- `mora`
- `ototoy`
#### Filtering Parameters
| Parameter | Type | Default | Description | Values |
|-----------|------|---------|-------------|--------|
| `region` | string[] | `GB,US,DE,JP` | Market regions for lookup | ISO 3166-1 alpha-2 codes |
| `category` | string | `default` | Provider category filter | `all`, `default`, `preferred` |
#### Permalink Parameters
| Parameter | Type | Description | Example |
|-----------|------|-------------|---------|
| `ts` | number | Unix timestamp for cache replay | `1704067200` |
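
The parameter tables above can be read as a single parsing step. The sketch below is an illustration only: `ReleaseQuery`, `parseReleaseQuery`, and `hasLookupInput` are hypothetical names, not Harmony's internal types, and treating `region` as either repeated or comma-separated is an assumption.

```typescript
// Hypothetical shape for a parsed /release query string.
interface ReleaseQuery {
  gtin: string | null;
  urls: string[];
  regions: string[];
  category: string;
  ts: number | null;
  providerIds: Record<string, string>;
  templateProviders: string[];
}

const PROVIDER_NAMES: string[] = [
  "spotify", "deezer", "itunes", "tidal", "bandcamp",
  "beatport", "musicbrainz", "mora", "ototoy",
];
const DEFAULT_REGIONS: string[] = ["GB", "US", "DE", "JP"];

function parseReleaseQuery(search: string): ReleaseQuery {
  const params = new URLSearchParams(search);
  // Accept both region=JP,US and region=JP&region=US (assumption).
  const regions = params
    .getAll("region")
    .join(",")
    .split(",")
    .map((r) => r.trim().toUpperCase())
    .filter((r) => r.length > 0);
  const tsRaw = params.get("ts");
  const query: ReleaseQuery = {
    gtin: params.get("gtin"),
    urls: params.getAll("url"),
    regions: regions.length > 0 ? regions : [...DEFAULT_REGIONS],
    category: params.get("category") ?? "default",
    ts: tsRaw !== null && /^\d+$/.test(tsRaw) ? Number(tsRaw) : null,
    providerIds: {},
    templateProviders: [],
  };
  params.forEach((value, key) => {
    const base = key.endsWith("!") ? key.slice(0, -1) : key;
    if (!PROVIDER_NAMES.includes(base)) return;
    if (key.endsWith("!")) {
      query.templateProviders.push(base); // e.g. "musicbrainz!"
    } else if (value.length > 0) {
      query.providerIds[base] = value; // e.g. "spotify=<id>"
    }
  });
  return query;
}

// True when the query names at least one thing to look up.
function hasLookupInput(query: ReleaseQuery): boolean {
  return query.gtin !== null || query.urls.length > 0 ||
    Object.keys(query.providerIds).length > 0;
}
```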
### Request Examples
#### GTIN Lookup (Default Regions)
```
GET /release?gtin=0602537347377
```
Queries all GTIN-supporting providers in default regions (GB, US, DE, JP).
#### GTIN Lookup (Specific Regions)
```
GET /release?gtin=0602537347377&region=JP,US
```
Queries only Japan and US regions.
#### URL Lookup (Single Provider)
```
GET /release?url=https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ
```
Queries only Spotify using the provided URL.
#### URL Lookup (Multiple Providers)
```
GET /release?url=https://open.spotify.com/album/3DiDSNVBRYVzccLn2yqhMJ&url=https://www.deezer.com/album/123456
```
Queries both Spotify and Deezer.
#### Provider-Specific ID Lookup
```
GET /release?spotify=3DiDSNVBRYVzccLn2yqhMJ&deezer=123456
```
Queries Spotify and Deezer using their native IDs.
#### Template Mode (MusicBrainz)
```
GET /release?gtin=0602537347377&musicbrainz!
```
Uses MusicBrainz as template provider (reference data for merge).
#### Category Filtering
```
GET /release?gtin=0602537347377&category=preferred
```
Queries only preferred providers (Spotify, Tidal, MusicBrainz).
#### Permalink (Cache Replay)
```
GET /release?gtin=0602537347377&ts=1704067200
```
Replays cached lookup from timestamp 1704067200.
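
The request examples above are written unencoded for readability. When such URLs are built in code, query values (especially a provider URL passed in `url`) should be percent-encoded so that characters like `&` or `?` inside the value are not read as separators. A minimal sketch using the standard `URLSearchParams` API; the `LookupOptions` type and `buildReleasePath` function are hypothetical helpers, not part of Harmony:

```typescript
// Hypothetical options object mirroring the documented query parameters.
interface LookupOptions {
  gtin?: string;
  urls?: string[];
  regions?: string[];
  category?: string;
  ts?: number;
}

// Builds a /release path with percent-encoded query values.
function buildReleasePath(options: LookupOptions): string {
  const params = new URLSearchParams();
  if (options.gtin) params.set("gtin", options.gtin);
  for (const url of options.urls ?? []) params.append("url", url);
  if (options.regions && options.regions.length > 0) {
    params.set("region", options.regions.join(","));
  }
  if (options.category) params.set("category", options.category);
  if (options.ts !== undefined) params.set("ts", String(options.ts));
  return `/release?${params.toString()}`;
}

console.log(buildReleasePath({ gtin: "0602537347377", regions: ["JP", "US"] }));
// prints "/release?gtin=0602537347377&region=JP%2CUS"
```

The encoded form decodes back to the same values on the server, so `region=JP%2CUS` and `region=JP,US` are equivalent.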
### Response Format
The `/release` route returns an **HTML page** with embedded data, not JSON.
#### Response Sections
1. **Release Header**
   - Title
   - Artist credit
   - Release date
   - GTIN (if available)
2. **Provider Comparison Table**
   - Side-by-side comparison of all providers
   - Color-coded compatibility indicators
   - Feature quality ratings
3. **Harmonized Metadata Display**
   - Merged release information
   - Track listing with ISRCs
   - Label and catalog number information
   - Cover art images
   - Copyright and availability info
4. **MusicBrainz Seeder Form**
   - Pre-filled form for MB import
   - Edit note with provider URLs
   - Annotation with extra data
   - Copy-to-clipboard functionality
5. **Warnings and Messages**
   - Compatibility conflicts
   - Provider errors
   - Missing data indicators
   - Duplicate detection warnings
6. **Permalink**
   - Timestamp-based URL for reproducibility
   - Share button
#### Example Response Structure (HTML)
```html
<!DOCTYPE html>
<html>
  <head>
    <title>Album Title - Artist Name | Harmony</title>
    <!-- Meta tags, CSS -->
  </head>
  <body>
    <header>
      <!-- Navigation -->
    </header>
    <main>
      <!-- Release Header -->
      <section class="release-header">
        <h1>Album Title</h1>
        <p class="artist-credit">Artist Name</p>
        <p class="release-date">2014-11-24</p>
        <p class="gtin">GTIN: 0602537347377</p>
      </section>
      <!-- Provider Comparison -->
      <section class="provider-comparison">
        <table>
          <thead>
            <tr>
              <th>Property</th>
              <th>Spotify</th>
              <th>Deezer</th>
              <th>iTunes</th>
              <th>Merged</th>
            </tr>
          </thead>
          <tbody>
            <!-- Comparison rows -->
          </tbody>
        </table>
      </section>
      <!-- Harmonized Metadata -->
      <section class="harmonized-release">
        <!-- Track listing, labels, images, etc. -->
      </section>
      <!-- MusicBrainz Seeder -->
      <section class="musicbrainz-seeder">
        <form>
          <!-- Pre-filled MB import form -->
        </form>
      </section>
      <!-- Warnings -->
      <section class="warnings">
        <!-- Compatibility warnings, errors -->
      </section>
      <!-- Permalink -->
      <section class="permalink">
        <input type="text" readonly value="https://harmony.example.com/release?gtin=0602537347377&ts=1704067200">
        <button>Copy</button>
      </section>
    </main>
    <footer>
      <!-- Footer content -->
    </footer>
    <!-- Island hydration scripts -->
    <script type="module" src="/islands/LookupForm.js"></script>
    <script type="module" src="/islands/SeederForm.js"></script>
  </body>
</html>
```
### Error Handling
Errors are displayed inline in the HTML response:
#### Provider Errors
```html
<div class="provider-error">
  <strong>Spotify:</strong> Rate limit exceeded. Retry after 60 seconds.
</div>
```
#### Lookup Errors
```html
<div class="lookup-error">
  <strong>Error:</strong> No providers found for GTIN 0602537347377 in region CN.
</div>
```
#### Compatibility Warnings
```html
<div class="compatibility-warning">
  <strong>Warning:</strong> Release date conflict:
  <ul>
    <li>Spotify: 2014-11-24</li>
    <li>iTunes: 2014-11-25</li>
  </ul>
  Using Spotify value (higher preference).
</div>
```
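
The compatibility warning above implies a simple rule: when providers disagree, the value from the most preferred provider wins and the conflict is reported. The following is a minimal sketch of that rule under stated assumptions; `ProviderValue`, `Resolution`, and `resolveByPreference` are hypothetical names, and Harmony's real merge logic may differ.

```typescript
interface ProviderValue {
  provider: string;
  value: string;
}

interface Resolution {
  value: string | null;
  source: string | null;
  warning: string | null;
}

// Picks the value of the most preferred provider; providers missing from
// the preference list rank last (assumption).
function resolveByPreference(values: ProviderValue[], preference: string[]): Resolution {
  const rank = (provider: string): number => {
    const index = preference.indexOf(provider);
    return index === -1 ? preference.length : index;
  };
  const sorted = [...values].sort((a, b) => rank(a.provider) - rank(b.provider));
  const winner = sorted[0];
  if (winner === undefined) return { value: null, source: null, warning: null };

  const distinctValues = new Set(values.map((v) => v.value));
  if (distinctValues.size <= 1) {
    return { value: winner.value, source: winner.provider, warning: null };
  }
  const details = values.map((v) => `${v.provider}: ${v.value}`).join(", ");
  return {
    value: winner.value,
    source: winner.provider,
    warning: `Conflict (${details}); using ${winner.provider} value.`,
  };
}

const releaseDate = resolveByPreference(
  [
    { provider: "itunes", value: "2014-11-25" },
    { provider: "spotify", value: "2014-11-24" },
  ],
  ["spotify", "tidal", "deezer", "itunes"],
);
console.log(releaseDate.value, releaseDate.warning);
```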
## Secondary Routes
### `/` - Landing Page
**Purpose**: Introduction and quick start guide
**Content**:
- Project description
- Supported providers
- Usage examples
- Link to `/about` for detailed documentation
**No query parameters**
### `/release/actions` - ISRC/Cover Submission
**Purpose**: Submit ISRCs or cover art for existing MusicBrainz releases
**Query Parameters**:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `mbid` | string | Yes | MusicBrainz release ID |
| `action` | string | Yes | `isrc` or `cover` |
**Example**:
```
GET /release/actions?mbid=12345678-1234-1234-1234-123456789012&action=isrc
```
**Response**: Form for submitting ISRCs or cover art to MusicBrainz
### `/about` - Provider Documentation
**Purpose**: Detailed provider information and feature comparison
**Content**:
- Provider descriptions
- Feature quality matrix
- Rate limits and authentication requirements
- Supported regions
- Known limitations
**No query parameters**
**Feature Quality Matrix Example**:
| Provider | GTIN | Title | Artists | Date | Labels | Tracks | ISRC | Images | Copyright |
|----------|------|-------|---------|------|--------|--------|------|--------|-----------|
| Spotify | ✓ | ✓ | ✓ | ✓ | ~ | ✓ | ✓ | 2000px | ~ |
| Deezer | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1400px | ✓ |
| iTunes | ✓ | ✓ | ✓ | ✓ | ~ | ✓ | ~ | Varies | ~ |
| Tidal | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1280px | ✓ |
| Bandcamp | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | 3000px | ✓ |
Legend:
- ✓ = GOOD quality
- ~ = PRESENT quality
- ✗ = MISSING
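
The legend suggests an ordered quality scale, which makes it easy to ask which providers supply a given field well enough. The sketch below encodes three columns of the example matrix; the names `QUALITY`, `FEATURE_MATRIX`, and `providersWith` are hypothetical and only illustrate the idea.

```typescript
// Ordered quality levels: a higher number means better data.
const QUALITY = { MISSING: 0, PRESENT: 1, GOOD: 2 } as const;
type FeatureQuality = (typeof QUALITY)[keyof typeof QUALITY];
type Feature = "gtin" | "labels" | "isrc";

const { MISSING, PRESENT, GOOD } = QUALITY;

// Subset of the example matrix above (GTIN, Labels, ISRC columns).
const FEATURE_MATRIX: Record<string, Record<Feature, FeatureQuality>> = {
  Spotify: { gtin: GOOD, labels: PRESENT, isrc: GOOD },
  Deezer: { gtin: GOOD, labels: GOOD, isrc: GOOD },
  iTunes: { gtin: GOOD, labels: PRESENT, isrc: PRESENT },
  Tidal: { gtin: GOOD, labels: GOOD, isrc: GOOD },
  Bandcamp: { gtin: MISSING, labels: GOOD, isrc: MISSING },
};

// Lists providers whose quality for a feature meets the minimum level.
function providersWith(feature: Feature, minimum: FeatureQuality): string[] {
  return Object.entries(FEATURE_MATRIX)
    .filter(([, features]) => features[feature] >= minimum)
    .map(([provider]) => provider);
}

console.log(providersWith("isrc", QUALITY.GOOD));
```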
### `/settings` - User Preferences
**Purpose**: Configure user preferences
**Method**: GET (display form), POST (save preferences)
**Preferences**:
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `defaultRegions` | string[] | `['GB','US','DE','JP']` | Default regions for lookup |
| `defaultCategory` | string | `default` | Default provider category |
| `providerPreferences` | string[] | Custom order | Provider preference order for merge |
| `showCompatibilityWarnings` | boolean | `true` | Display compatibility warnings |
| `cacheStrategy` | string | `24h` | Cache duration |
**Storage**: Preferences are stored in cookies; nothing is persisted server-side.
**Example Cookie**:
```
harmony_prefs={"defaultRegions":["JP","US"],"defaultCategory":"preferred","providerPreferences":["spotify","tidal","deezer"]}; Max-Age=31536000; Path=/
```
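
The cookie above is shown unencoded for readability. RFC 6265 does not allow double quotes or commas in a cookie value, so a JSON payload is normally percent-encoded before it is written. The sketch below shows one way to do that; the `HarmonyPrefs` type and both function names are hypothetical, and whether Harmony encodes the value this way is an assumption.

```typescript
// Hypothetical subset of the preferences listed in the table above.
interface HarmonyPrefs {
  defaultRegions: string[];
  defaultCategory: string;
  providerPreferences: string[];
}

// Serializes preferences into a Set-Cookie value with a one-year lifetime.
function serializePrefsCookie(prefs: HarmonyPrefs): string {
  const value = encodeURIComponent(JSON.stringify(prefs));
  return `harmony_prefs=${value}; Max-Age=31536000; Path=/`;
}

// Reads preferences back from a Cookie request header; null if absent or invalid.
function parsePrefsCookie(cookieHeader: string): HarmonyPrefs | null {
  for (const part of cookieHeader.split(";")) {
    const separator = part.indexOf("=");
    if (separator === -1) continue;
    if (part.slice(0, separator).trim() !== "harmony_prefs") continue;
    try {
      const raw = decodeURIComponent(part.slice(separator + 1).trim());
      return JSON.parse(raw) as HarmonyPrefs;
    } catch {
      return null; // malformed encoding or JSON
    }
  }
  return null;
}
```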
## Islands (Client-Side Interactivity)
Fresh's islands architecture enables selective client-side interactivity.
### Island Components
#### 1. LookupForm Island
**File**: `islands/LookupForm.tsx`
**Purpose**: Dynamic lookup form with validation
**Features**:
- Real-time GTIN validation
- URL parsing and provider detection
- Region multi-select
- Category radio buttons
- Form submission with loading state
**Client-Side Logic**:
```typescript
// Conceptual sketch; Fresh islands are Preact components
import { useState } from "preact/hooks";

function LookupForm() {
  const [gtin, setGtin] = useState('');
  const [urls, setUrls] = useState<string[]>([]);
  const [regions, setRegions] = useState(['GB', 'US', 'DE', 'JP']);

  // Format check only: exactly 13 digits (GTIN-13), check digit not verified
  const validateGtin = (value: string) => {
    return /^\d{13}$/.test(value);
  };

  const handleSubmit = (e: Event) => {
    e.preventDefault();
    // Navigate to /release with query params
    const params = new URLSearchParams();
    if (gtin) params.set('gtin', gtin);
    urls.forEach(url => params.append('url', url));
    params.set('region', regions.join(','));
    window.location.href = `/release?${params}`;
  };

  return (
    <form onSubmit={handleSubmit}>
      {/* Form fields */}
    </form>
  );
}
```
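
The `validateGtin` check in the form above only tests for 13 digits. GTINs also come in 8, 12, and 14 digit forms, and the last digit is a check digit computed from the others with alternating weights of 3 and 1, starting from the right. A stricter validator could look like the sketch below; `isValidGtin` is a hypothetical helper, not Harmony's implementation.

```typescript
// Validates length and check digit of a GTIN-8, GTIN-12, GTIN-13 or GTIN-14.
function isValidGtin(value: string): boolean {
  if (!/^(\d{8}|\d{12}|\d{13}|\d{14})$/.test(value)) return false;

  const checkDigit = Number(value.charAt(value.length - 1));
  let sum = 0;
  let weight = 3; // the rightmost data digit is weighted 3, then 1, 3, 1, ...
  for (let i = value.length - 2; i >= 0; i--) {
    sum += Number(value.charAt(i)) * weight;
    weight = weight === 3 ? 1 : 3;
  }
  // The check digit brings the weighted sum up to a multiple of 10.
  return (10 - (sum % 10)) % 10 === checkDigit;
}

console.log(isValidGtin("0602537347377")); // prints true
console.log(isValidGtin("0602537347378")); // prints false
```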
#### 2. ProviderSelector Island
**File**: `islands/ProviderSelector.tsx`
**Purpose**: Provider category filtering
**Features**:
- Category selection (all/default/preferred)
- Individual provider checkboxes
- Real-time URL update
#### 3. RegionSelector Island
**File**: `islands/RegionSelector.tsx`
**Purpose**: Multi-region selection
**Features**:
- Checkbox list of supported regions
- Select all / deselect all
- Common region presets (US+GB, Japan, Europe)
#### 4. PermalinkGenerator Island
**File**: `islands/PermalinkGenerator.tsx`
**Purpose**: Generate timestamp-based permalink
**Features**:
- Current timestamp capture
- URL generation with `ts` parameter
- Copy to clipboard
- Share button
**Client-Side Logic**:
```typescript
import { useState } from "preact/hooks";

function PermalinkGenerator({ currentUrl }: { currentUrl: string }) {
  const [permalink, setPermalink] = useState('');

  const generatePermalink = () => {
    const url = new URL(currentUrl);
    // Unix timestamp in seconds
    url.searchParams.set('ts', Math.floor(Date.now() / 1000).toString());
    setPermalink(url.toString());
  };

  const copyToClipboard = async () => {
    // writeText returns a promise that rejects if clipboard access is denied
    try {
      await navigator.clipboard.writeText(permalink);
    } catch (err) {
      console.error('Copy failed', err);
    }
  };

  return (
    <div>
      <button onClick={generatePermalink}>Generate Permalink</button>
      {permalink && (
        <>
          <input type="text" readonly value={permalink} />
          <button onClick={copyToClipboard}>Copy</button>
        </>
      )}
    </div>
  );
}
```
#### 5. SeederForm Island
**File**: `islands/SeederForm.tsx`
**Purpose**: MusicBrainz import form with copy functionality
**Features**:
- Pre-filled form fields
- Copy individual fields to clipboard
- Copy entire form as JSON
- Open MusicBrainz seeder in new tab
**Client-Side Logic**:
```typescript
function SeederForm({ release }: { release: MergedHarmonyRelease }) {
  const copyField = (field: string, value: string) => {
    navigator.clipboard.writeText(value);
  };

  const openSeeder = () => {
    const mbUrl = `https://musicbrainz.org/release/add`;
    const form = document.createElement('form');
    form.method = 'POST';
    form.action = mbUrl;
    form.target = '_blank';
    // Add one hidden input per top-level property (conceptual; the
    // MusicBrainz release editor expects its own flat field names)
    Object.entries(release).forEach(([key, value]) => {
      const input = document.createElement('input');
      input.type = 'hidden';
      input.name = key;
      // Keep strings as-is; JSON.stringify would wrap them in quotes
      input.value = typeof value === 'string' ? value : JSON.stringify(value);
      form.appendChild(input);
    });
    document.body.appendChild(form);
    form.submit();
    document.body.removeChild(form);
  };

  return (
    <div>
      {/* Form fields with copy buttons */}
      <button onClick={openSeeder}>Open in MusicBrainz</button>
    </div>
  );
}
```
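
The conceptual code above posts one field per top-level property. The MusicBrainz release editor is seeded with flat, dot-separated field names instead (for example `artist_credit.names.0.name`), so nested data has to be flattened first. The sketch below shows a generic flattening step; `SeedValue` and `flattenSeed` are hypothetical names, and the field names in the sample are given from memory and should be checked against the MusicBrainz seeding documentation.

```typescript
// A nested seed: scalars, arrays, and plain objects.
type SeedValue = string | number | SeedValue[] | { [key: string]: SeedValue };

// Flattens nested data into dot-separated keys with string values.
function flattenSeed(
  value: SeedValue,
  prefix = "",
  out: Record<string, string> = {},
): Record<string, string> {
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      flattenSeed(item, prefix ? `${prefix}.${index}` : String(index), out);
    });
  } else if (typeof value === "object") {
    for (const [key, child] of Object.entries(value)) {
      flattenSeed(child, prefix ? `${prefix}.${key}` : key, out);
    }
  } else {
    out[prefix] = String(value);
  }
  return out;
}

// Illustrative sample; field names are assumptions, not verified against MusicBrainz.
const sampleSeed: SeedValue = {
  name: "Album Title",
  barcode: "0602537347377",
  artist_credit: { names: [{ name: "Artist Name" }] },
  mediums: [{ track: [{ name: "Track 1" }] }],
};
console.log(flattenSeed(sampleSeed));
```

Each entry of the flattened record can then become one hidden `<input>` in the seeding form.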
## No REST API
Harmony **does not provide a REST API** or JSON endpoints. Key implications:
### No JSON Responses
All routes return HTML. There is no `Accept: application/json` support.
**Request**:
```
GET /release?gtin=0602537347377
Accept: application/json
```
**Response**:
```
HTTP/1.1 200 OK
Content-Type: text/html
<!DOCTYPE html>
<!-- HTML response, not JSON -->
```
### No Programmatic Access
Clients cannot fetch data programmatically without HTML parsing.
**Workaround** (not officially supported):
1. Fetch HTML response
2. Parse HTML with DOM parser
3. Extract data from structured elements
**Example** (conceptual):
```typescript
const response = await fetch('/release?gtin=0602537347377');
const html = await response.text();
const doc = new DOMParser().parseFromString(html, 'text/html');
const title = doc.querySelector('.release-header h1')?.textContent;
```
### No API Authentication
There are no API keys and no OAuth2 flow for API access; OAuth2 is used only for authenticating against providers.
### No Rate Limiting on Server
Server does not enforce rate limits (providers have their own limits).
## Request/Response Flow
### Typical Request Flow
```
1. User submits lookup form
2. Browser sends GET /release?gtin=...&region=...
3. Fresh router matches route to routes/release.tsx
4. Route handler executes:
   a. Parse query parameters
   b. Call CombinedReleaseLookup
   c. Parallel provider queries
   d. Harmonize responses
   e. Merge releases
   f. Generate MusicBrainz seeding data
5. Server-side rendering:
   a. Render components with data
   b. Generate HTML
   c. Inject island hydration scripts
6. HTTP response sent to browser
7. Browser renders HTML
8. Island hydration:
   a. Load island JavaScript modules
   b. Attach event listeners
   c. Enable client-side interactivity
```
### Caching Strategy
#### Server-Side Caching
- **snap_storage**: Caches HTTP responses from providers
- **Cache key**: URL + query parameters
- **Cache duration**: 24 hours (configurable)
- **Cache storage**: SQLite database (`snaps.db`) + file directory (`snaps/`)
#### Client-Side Caching
- **Browser cache**: Standard HTTP caching headers
- **localStorage**: OAuth2 tokens, MBID mappings (dev mode)
- **sessionStorage**: MBID mappings (production mode)
- **Cookies**: User preferences
#### Permalink Caching
The `ts` parameter enables cache replay:
1. User performs lookup at timestamp T
2. Responses cached with timestamp T
3. Permalink generated: `/release?gtin=...&ts=T`
4. Future requests with `ts=T` replay cached responses
5. Ensures reproducible results even if provider data changes
**Cache Lookup Logic**:
```typescript
async function getCachedResponse(url: string, timestamp?: number): Promise<Response | null> {
  if (timestamp) {
    // Permalink mode: lookup by timestamp
    return await cache.getByTimestamp(url, timestamp);
  } else {
    // Normal mode: lookup by recency
    return await cache.getRecent(url, MAX_AGE);
  }
}
```
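
The two lookup modes can be illustrated with a small in-memory store. This is a sketch, not the snap_storage implementation: `SnapshotStore` and `StoredSnapshot` are hypothetical, timestamps are Unix seconds, and resolving a permalink to the newest snapshot taken at or before `ts` is an assumption about the replay semantics.

```typescript
interface StoredSnapshot {
  url: string;
  timestamp: number; // Unix seconds
  body: string;
}

class SnapshotStore {
  private snapshots: StoredSnapshot[] = [];

  put(url: string, body: string, timestamp: number): void {
    this.snapshots.push({ url, timestamp, body });
  }

  // Permalink mode: newest snapshot taken at or before the given timestamp.
  getByTimestamp(url: string, timestamp: number): StoredSnapshot | null {
    return this.newest(url, (s) => s.timestamp <= timestamp);
  }

  // Normal mode: newest snapshot that is no older than maxAgeSeconds.
  getRecent(url: string, maxAgeSeconds: number, now: number): StoredSnapshot | null {
    return this.newest(url, (s) => s.timestamp <= now && now - s.timestamp <= maxAgeSeconds);
  }

  private newest(url: string, accept: (s: StoredSnapshot) => boolean): StoredSnapshot | null {
    let best: StoredSnapshot | null = null;
    for (const snapshot of this.snapshots) {
      if (snapshot.url !== url || !accept(snapshot)) continue;
      if (best === null || snapshot.timestamp > best.timestamp) best = snapshot;
    }
    return best;
  }
}

const store = new SnapshotStore();
const providerUrl = "https://api.example.com/albums?upc=0602537347377"; // placeholder URL
store.put(providerUrl, "first response", 1704067200);
store.put(providerUrl, "second response", 1704153600);
console.log(store.getByTimestamp(providerUrl, 1704067200)?.body); // prints "first response"
```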
## Error Responses
### HTTP Status Codes
| Status | Scenario |
|--------|----------|
| 200 | Success (even with partial provider failures) |
| 400 | Invalid query parameters |
| 404 | Route not found |
| 500 | Server error (unhandled exception) |
### Error Display
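Provider and lookup errors are displayed inline in the HTML rather than signalled through HTTP status codes; the 4xx and 5xx codes above apply only to malformed requests, unknown routes, and unhandled exceptions.
**Example**: If every provider fails, the response is still 200 OK, with the error messages rendered in the HTML.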
## Performance Considerations
### Parallel Provider Queries
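All provider lookups are started concurrently, and their results are collected with `Promise.allSettled`, which waits for every lookup to settle whether it succeeds or fails: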
```typescript
const lookups = providers.map(p => p.lookup(input));
const results = await Promise.allSettled(lookups);
```
**Benefits**:
- Faster total response time
- Graceful degradation (partial results)
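
Graceful degradation follows from how the settled results are consumed: fulfilled lookups feed the merge, rejected ones become inline error messages. The sketch below shows that split with stand-in providers; `ProviderLookup`, `LookupOutcome`, and `lookupAll` are hypothetical names, and a release is reduced to its title to keep the example short.

```typescript
interface ProviderLookup {
  name: string;
  lookup: (gtin: string) => Promise<string>; // resolves to a release title
}

interface LookupOutcome {
  releases: { provider: string; title: string }[];
  errors: { provider: string; message: string }[];
}

// Runs all lookups concurrently and separates successes from failures.
async function lookupAll(providers: ProviderLookup[], gtin: string): Promise<LookupOutcome> {
  const settled = await Promise.allSettled(providers.map((p) => p.lookup(gtin)));
  const outcome: LookupOutcome = { releases: [], errors: [] };
  settled.forEach((result, index) => {
    // Results are returned in the same order as the input promises.
    const provider = providers[index]?.name ?? "unknown";
    if (result.status === "fulfilled") {
      outcome.releases.push({ provider, title: result.value });
    } else {
      const reason: unknown = result.reason;
      const message = reason instanceof Error ? reason.message : String(reason);
      outcome.errors.push({ provider, message });
    }
  });
  return outcome;
}

// Stand-in providers: one succeeds, one fails.
const fakeProviders: ProviderLookup[] = [
  { name: "spotify", lookup: async () => "Album Title" },
  { name: "deezer", lookup: async () => { throw new Error("Rate limit exceeded"); } },
];

lookupAll(fakeProviders, "0602537347377").then((outcome) => {
  console.log(outcome.releases.length, "release(s),", outcome.errors.length, "error(s)");
});
```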
**Typical Response Times**:
- Single provider: 200-500ms
- Multiple providers (parallel): 500-1500ms
- Cached response: <50ms
### Server-Side Rendering Overhead
Fresh SSR adds minimal overhead:
- Component rendering: 10-50ms
- HTML generation: 5-20ms
- Total SSR overhead: <100ms
### Island Hydration
Islands load asynchronously after initial page render:
- Initial HTML render: Immediate
- Island JavaScript load: 100-300ms
- Island hydration: 50-100ms
**User experience**: The server-rendered page is visible immediately; island features become interactive once their scripts have loaded and hydration completes.
## Integration Patterns
### Embedding in Other Applications
Since Harmony has no REST API, integration is limited to the following options:
1. **iFrame embedding**: Embed `/release` route in iFrame
2. **Redirect**: Redirect users to Harmony for lookup
3. **HTML parsing**: Fetch and parse HTML responses (fragile)
**iFrame Example**:
```html
<iframe src="https://harmony.example.com/release?gtin=0602537347377" width="100%" height="600"></iframe>
```
### MusicBrainz Integration
Harmony integrates with MusicBrainz via:
1. **Seeder form**: Pre-filled form for MB import
2. **Edit notes**: Include provider URLs and permalink
3. **Annotations**: Extra metadata not in main form
4. **MBID resolution**: Batch URL lookup to detect duplicates
**Workflow**:
```
1. User performs lookup in Harmony
2. Harmony displays harmonized release
3. User clicks "Open in MusicBrainz"
4. Seeder form opens in new tab
5. User reviews and submits to MusicBrainz
```
## Summary
Harmony's API design prioritizes:
1. **Web UI first**: No REST API, HTML-only responses
2. **Server-side rendering**: Fast initial load, SEO-friendly
3. **Islands architecture**: Selective client-side interactivity
4. **Permalink system**: Reproducible results via timestamp caching
5. **Graceful degradation**: Partial results on provider failures
6. **MusicBrainz integration**: Seamless seeding workflow
This design is optimized for human users (MusicBrainz editors) rather than programmatic API consumers. For a metadata aggregation system targeting API consumers, a REST API layer would need to be added.