# Meelo Integrations

## Integration Overview

Meelo integrates with 8 metadata providers and 2 scrobbling services. The Matcher service handles provider queries, while the Server handles scrobbling. All integrations are configurable via `settings.json` and `.env`.

## Metadata Providers

### MusicBrainz

**Type**: Primary music database
**Library**: musicbrainzngs (Python)
**Authentication**: None (public API)
**Rate Limit**: 1 request/second
**Priority**: Highest (primary source)

#### Capabilities

- Artist metadata (name, sort name, areas, relationships)
- Album metadata (title, type, release date, labels)
- Track metadata (title, duration, ISRC)
- Recording relationships (covers, remixes, versions)
- Release groups and releases
- Area data (countries, cities with ISO 3166 codes)

#### Matching Strategy

1. Query by AcoustID fingerprint (most accurate)
2. If no fingerprint, search by artist + album + track title
3. Extract MBID (MusicBrainz ID) for future queries
4. Store MBID in LocalIdentifiers table

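The steps above can be sketched as a small resolver. This is a minimal illustration, not Meelo's actual code: `resolve_mbid`, the two injected lookup callables, and the `identifiers` dict (standing in for the LocalIdentifiers table) are hypothetical names. It also assumes the text search is used when a fingerprint lookup returns nothing, which the list does not state.

```python
from typing import Callable, Optional


def resolve_mbid(
    fingerprint: Optional[str],
    artist: str,
    album: str,
    title: str,
    lookup_fingerprint: Callable[[str], Optional[str]],
    search_text: Callable[[str, str, str], Optional[str]],
    identifiers: dict,
) -> Optional[str]:
    """Return an MBID, preferring the fingerprint match over the text search."""
    # Step 1: fingerprint lookup, only when a fingerprint exists
    mbid = lookup_fingerprint(fingerprint) if fingerprint else None
    # Step 2: fall back to artist + album + title
    # (assumption: also used when the fingerprint yields no match)
    if mbid is None:
        mbid = search_text(artist, album, title)
    # Steps 3-4: keep the MBID for future queries
    if mbid is not None:
        identifiers['musicbrainz'] = mbid  # hypothetical stand-in for LocalIdentifiers
    return mbid
```

Passing the lookups in as callables keeps the ordering logic testable without network access.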
#### Data Extraction

**Artist**:
```python
import musicbrainzngs as mb

mb.set_useragent('Meelo', '1.0')  # musicbrainzngs requires a user agent before any request

# 'areas' is not a valid include; the artist's area is part of the default response
artist_data = mb.get_artist_by_id(mbid, includes=['aliases'])
artist = artist_data['artist']
{
    'name': artist['name'],
    'sortName': artist['sort-name'],
    'areas': [artist[key]['name'] for key in ('area', 'begin-area') if key in artist]
}
```

**Album**:
```python
# 'labels' is only a valid include for release lookups, not release groups
release_group = mb.get_release_group_by_id(mbid, includes=['releases'])
{
    'name': release_group['release-group']['title'],
    'type': release_group['release-group'].get('type'),
    'releaseDate': release_group['release-group'].get('first-release-date'),
    'releases': [...]
}
```

**Track**:
```python
recording = mb.get_recording_by_id(mbid, includes=['isrcs', 'releases'])
{
    'title': recording['recording']['title'],
    'duration': recording['recording'].get('length'),  # milliseconds; absent when unknown
    'isrc': recording['recording'].get('isrc-list', [None])[0]
}
```

#### Rate Limiting

The `musicbrainzngs` library enforces 1 request/second by default, so no additional limiting is needed.

#### Error Handling

- **404 Not Found**: No match, skip provider
- **503 Service Unavailable**: Retry with exponential backoff (max 3 attempts)
- **Rate Limit Exceeded**: Wait and retry

### Genius

**Type**: Lyrics and song descriptions
**Library**: lyricsgenius (Python)
**Authentication**: API token (GENIUS_ACCESS_TOKEN)
**Rate Limit**: 10 requests/second
**Priority**: High (for lyrics)

#### Capabilities

- Song lyrics (plain text)
- Song descriptions and annotations
- Artist biographies
- Album descriptions

#### Matching Strategy

1. Search by artist + song title
2. Extract song ID from search results
3. Fetch full song data including lyrics
4. Store lyrics in Lyrics table

#### Data Extraction

**Lyrics**:
```python
import lyricsgenius

genius = lyricsgenius.Genius(token)
song = genius.search_song(title, artist)  # None when no match is found
{
    'plain': song.lyrics,
    'description': song.description
}
```

**Artist Bio**:
```python
artist = genius.search_artist(name)
{
    'description': artist.description
}
```

#### Rate Limiting

Implemented using aiolimiter:
```python
from aiolimiter import AsyncLimiter

limiter = AsyncLimiter(10, 1)  # 10 requests per second

# Inside a coroutine:
async with limiter:
    result = await fetch_genius(...)
```

#### Error Handling

- **404 Not Found**: No lyrics available, skip
- **401 Unauthorized**: Invalid token, log error
- **Rate Limit**: Wait and retry

### Wikipedia

**Type**: Artist and album context
**Library**: wikipedia (Python)
**Authentication**: None
**Rate Limit**: 5 requests/second (self-imposed)
**Priority**: Medium (for descriptions)

#### Capabilities

- Artist biographies
- Album background and reception
- Contextual information (formation, breakup, influences)

#### Matching Strategy

1. Search Wikipedia by artist/album name
2. Extract first paragraph as description
3. Store full URL as source

#### Data Extraction

**Artist Bio**:
```python
import wikipedia

page = wikipedia.page(artist_name)
{
    'description': page.summary,
    'url': page.url
}
```

**Album Context**:
```python
page = wikipedia.page(f"{album_name} ({artist_name} album)")
{
    'description': page.summary,
    'url': page.url
}
```

#### Disambiguation

Wikipedia often returns disambiguation pages. Handle them as follows:
1. Detect the disambiguation page (check for "may refer to")
2. Search for the most likely option (e.g., add "band" or "musician")
3. If the result is still ambiguous, skip

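One way to express steps 2 and 3 is to filter the candidate titles of a disambiguation page by a music-related qualifier. The sketch below is an assumption about how this could work rather than Meelo's implementation: `pick_option` and its `hints` parameter are hypothetical, and `options` is expected to be a list of candidate page titles (for instance the `options` attribute of the `wikipedia` library's `DisambiguationError`).

```python
import re
from typing import Optional, Sequence


def pick_option(options: Sequence[str], hints: Sequence[str] = ('band', 'musician')) -> Optional[str]:
    """Return the single option whose parenthesised qualifier contains a hint, else None."""
    candidates = [
        option
        for option in options
        # Look only inside "(...)" qualifiers such as "Genesis (band)"
        if any(
            hint in qualifier.lower()
            for qualifier in re.findall(r'\(([^)]*)\)', option)
            for hint in hints
        )
    ]
    # Zero or several matches means the result is still ambiguous: skip
    return candidates[0] if len(candidates) == 1 else None
```

Returning `None` for more than one match keeps the "skip when ambiguous" rule explicit instead of guessing.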
#### Rate Limiting

```python
limiter = AsyncLimiter(5, 1)  # 5 requests per second
```

#### Error Handling

- **PageError**: No Wikipedia page, skip
- **DisambiguationError**: Try disambiguation, or skip
- **HTTPError**: Retry with backoff

### Wikidata

**Type**: Structured data
**Library**: SPARQLWrapper (Python)
**Authentication**: None
**Rate Limit**: None self-imposed (public SPARQL endpoint)
**Priority**: Medium (for structured data)

#### Capabilities

- Artist relationships (members, collaborators)
- Area data (countries, cities, ISO codes)
- Dates (birth, death, formation, dissolution)
- External IDs (MusicBrainz, Discogs, AllMusic)

#### Matching Strategy

1. Query by MusicBrainz ID (if available)
2. Extract Wikidata entity ID
3. Query for additional properties
4. Store structured data

#### Data Extraction

**Artist Data**:
```sparql
SELECT ?property ?value WHERE {
  ?artist wdt:P434 "MBID" .  # MusicBrainz artist ID
  ?artist ?property ?value .
}
```

**Area Hierarchy**:
```sparql
SELECT ?area ?parent ?iso WHERE {
  ?area wdt:P31 wd:Q515 .   # instance of city
  ?area wdt:P131 ?parent .  # located in the administrative territorial entity
  ?area wdt:P300 ?iso .     # ISO 3166-2 code
}
```

#### Rate Limiting

No client-side rate limit is applied. The public SPARQL endpoint is generally fast, although it enforces its own query timeout and usage limits.

#### Error Handling

- **No Results**: Entity not in Wikidata, skip
- **Timeout**: Retry with simpler query
- **SPARQL Error**: Log and skip

### Discogs

**Type**: Release information
**Library**: discogs_client (Python)
**Authentication**: API token (DISCOGS_ACCESS_TOKEN)
**Rate Limit**: 60 requests/minute
**Priority**: Low (optional)

#### Capabilities

- Release details (catalog number, barcode, format)
- Label information
- Release variations (country, format)
- Marketplace data (not used)

#### Matching Strategy

1. Search by artist + album title
2. Filter by format (CD, Vinyl, etc.)
3. Extract release details
4. Store in Release.extensions JSON

#### Data Extraction

**Release**:
```python
import discogs_client

d = discogs_client.Client('Meelo/1.0', user_token=token)
results = d.search(artist=artist, release_title=album, type='release')
release = results[0]
{
    'catalogNumber': release.data['catno'],
    'barcode': release.data.get('barcode'),
    'format': release.formats[0]['name'],
    'country': release.country,
    'label': release.labels[0].name
}
```

#### Rate Limiting

```python
limiter = AsyncLimiter(60, 60)  # 60 requests per minute
```

#### Error Handling

- **404 Not Found**: No Discogs entry, skip
- **401 Unauthorized**: Invalid token, log error
- **Rate Limit**: Wait 60 seconds and retry

### AllMusic

**Type**: Editorial reviews and ratings
**Library**: BeautifulSoup (web scraping)
**Authentication**: None
**Rate Limit**: 1 request/second (self-imposed, no official API)
**Priority**: Low (optional)

#### Capabilities

- Album reviews
- Album ratings (1-5 stars)
- Artist biographies
- Genre classifications

#### Matching Strategy

1. Search AllMusic by artist + album
2. Scrape search results page
3. Extract review and rating
4. Store rating normalized to 0-100 scale

#### Data Extraction

**Album Review**:
```python
from urllib.parse import quote_plus

from bs4 import BeautifulSoup
import httpx

# quote_plus escapes spaces and special characters in the search terms
url = f"https://www.allmusic.com/search/albums/{quote_plus(f'{artist} {album}')}"
response = httpx.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# select_one returns None when nothing matches (e.g. the markup changed)
rating_elem = soup.select_one('.allmusic-rating')
rating = len(rating_elem.select('.star-rating.full')) if rating_elem else None  # Count full stars

review_elem = soup.select_one('.review-text')
review = review_elem.text.strip() if review_elem else None

{
    'rating': rating * 20 if rating is not None else None,  # Convert 1-5 stars to 0-100
    'description': review
}
```

#### Rate Limiting

```python
limiter = AsyncLimiter(1, 1)  # 1 request per second
```

#### Error Handling

- **404 Not Found**: No AllMusic page, skip
- **Parsing Error**: HTML structure changed, log and skip
- **Timeout**: Retry with backoff

#### Scraping Risks

AllMusic has no official API, so scraping may break whenever the HTML structure changes. This provider is disabled by default in `settings.json`.

### Metacritic

**Type**: Aggregated critic scores
**Library**: BeautifulSoup (web scraping)
**Authentication**: None
**Rate Limit**: 1 request/second (self-imposed)
**Priority**: Low (optional)

#### Capabilities

- Album critic scores (0-100)
- User scores (not used)
- Critic reviews (not extracted)

#### Matching Strategy

1. Search Metacritic by artist + album
2. Scrape album page
3. Extract Metascore
4. Store as rating (already 0-100 scale)

#### Data Extraction

**Album Score**:
```python
from bs4 import BeautifulSoup
import httpx

url = f"https://www.metacritic.com/music/{album_slug}/{artist_slug}"
response = httpx.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# select_one returns None when the element is missing (e.g. the markup changed)
score_elem = soup.select_one('.metascore_w')
score = int(score_elem.text.strip()) if score_elem else None

{
    'rating': score
}
```

#### Rate Limiting

```python
limiter = AsyncLimiter(1, 1)  # 1 request per second
```

#### Error Handling

- **404 Not Found**: Album not on Metacritic, skip
- **Parsing Error**: HTML structure changed, log and skip
- **Timeout**: Retry with backoff

#### Scraping Risks

The same risks as for AllMusic apply. This provider is disabled by default.

### LrcLib

**Type**: Synced lyrics
**Library**: httpx (direct API calls)
**Authentication**: None
**Rate Limit**: 10 requests/second (self-imposed)
**Priority**: High (for synced lyrics)

#### Capabilities

- Synced lyrics in .lrc format
- Plain lyrics (fallback)
- Lyrics by duration matching (improves accuracy)

#### Matching Strategy

1. Search by artist + title + duration
2. Parse .lrc format to JSON
3. Store in Lyrics.synced field

#### Data Extraction

**Synced Lyrics**:
```python
import re

import httpx

url = "https://lrclib.net/api/get"
params = {
    'artist_name': artist,
    'track_name': title,
    'duration': duration  # seconds
}
response = httpx.get(url, params=params)
data = response.json()

lrc_text = data.get('syncedLyrics') or ''  # empty when only plain lyrics exist
# Parse .lrc format
lines = []
for line in lrc_text.split('\n'):
    match = re.match(r'\[(\d+):(\d+\.\d+)\](.*)', line)
    if match:
        minutes, seconds, text = match.groups()
        time_ms = (int(minutes) * 60 + float(seconds)) * 1000
        # round() avoids off-by-one milliseconds caused by float truncation
        lines.append({'time': round(time_ms), 'text': text.strip()})

{
    'synced': lines,
    'plain': data.get('plainLyrics')
}
```

#### Rate Limiting

```python
limiter = AsyncLimiter(10, 1)  # 10 requests per second
```

#### Error Handling

- **404 Not Found**: No synced lyrics, try plain lyrics
- **Parsing Error**: Invalid .lrc format, skip
- **Timeout**: Retry with backoff

## Scrobbling Services

### Last.fm

**Type**: Scrobbling service
**Library**: pylast (Python)
**Authentication**: OAuth (LASTFM_API_KEY, LASTFM_API_SECRET)
**Rate Limit**: None specified
**Integration**: Server (NestJS)

#### Capabilities

- Scrobble track plays
- Update "now playing" status
- Retrieve user listening history (not implemented)

#### OAuth Flow

1. User clicks "Connect Last.fm" in settings
2. Server redirects to Last.fm OAuth page
3. User authorizes Meelo
4. Last.fm redirects to callback with token
5. Server exchanges token for session key
6. Session key stored in UserScrobbler.data JSON

#### Scrobbling

**Now Playing**:
```typescript
await lastfm.updateNowPlaying({
  artist: track.song.artist.name,
  track: track.song.name,
  album: track.release.album.name,
  duration: track.duration
});
```

**Scrobble**:
```typescript
await lastfm.scrobble({
  artist: track.song.artist.name,
  track: track.song.name,
  album: track.release.album.name,
  timestamp: Math.floor(Date.now() / 1000)
});
```

#### Scrobble Rules

- Track must play for at least 30 seconds or 50% of duration (whichever is shorter)
- Scrobble sent when track ends or user skips past 50%
- "Now playing" sent immediately on play

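The first rule reduces to a threshold check. The following is a small sketch of that rule only, written in Python for consistency with the other examples even though the Server is NestJS; `scrobble_threshold` and `should_scrobble` are hypothetical names, and durations are assumed to be in seconds.

```python
def scrobble_threshold(duration_s: float) -> float:
    """Minimum play time: 30 seconds or half the track, whichever is shorter."""
    return min(30.0, duration_s * 0.5)


def should_scrobble(played_s: float, duration_s: float) -> bool:
    """True once the listener has reached the threshold for this track."""
    # Tracks with an unknown or zero duration are never scrobbled (assumption)
    return duration_s > 0 and played_s >= scrobble_threshold(duration_s)
```

For a 200-second track the threshold is 30 seconds, while a 40-second track qualifies after 20 seconds.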
#### Error Handling

- **Invalid Session**: Re-authenticate user
- **Network Error**: Queue scrobble for retry
- **Rate Limit**: Wait and retry

### ListenBrainz

**Type**: Open-source scrobbling service
**Library**: pylistenbrainz (Python)
**Authentication**: User token
**Rate Limit**: None specified
**Integration**: Server (NestJS)

#### Capabilities

- Submit listens (scrobbles)
- Retrieve listening history (not implemented)
- Statistics and recommendations (not implemented)

#### Authentication

1. User obtains token from ListenBrainz settings
2. User enters token in Meelo settings
3. Token stored in UserScrobbler.data JSON
4. No OAuth flow needed

#### Submitting Listens

**Single Listen**:
```typescript
await listenbrainz.submitListen({
  listened_at: Math.floor(Date.now() / 1000),
  track_metadata: {
    artist_name: track.song.artist.name,
    track_name: track.song.name,
    release_name: track.release.album.name,
    additional_info: {
      duration_ms: track.duration * 1000,
      tracknumber: track.trackIndex
    }
  }
});
```

#### Listen Types

- **Single**: Submit one listen (used for scrobbling)
- **Playing Now**: Update current track (not implemented)
- **Import**: Bulk import (not used)

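To illustrate how the listen types differ on the wire, here is a hedged sketch of the JSON body for a ListenBrainz submission, assuming the public submit-listens format (a `listen_type` plus a `payload` list, with `listened_at` omitted for `playing_now`). `build_submission` is a hypothetical helper, not part of Meelo or of any client library.

```python
import time
from typing import Optional


def build_submission(
    listen_type: str,
    artist: str,
    track: str,
    release: Optional[str] = None,
    listened_at: Optional[int] = None,
) -> dict:
    """Build the request body for one listen of the given type."""
    listen = {'track_metadata': {'artist_name': artist, 'track_name': track}}
    if release:
        listen['track_metadata']['release_name'] = release
    # "playing_now" submissions carry no timestamp; other types need one
    if listen_type != 'playing_now':
        listen['listened_at'] = listened_at if listened_at is not None else int(time.time())
    return {'listen_type': listen_type, 'payload': [listen]}
```

A call such as `build_submission('single', artist, title, album)` produces the body for the scrobbling case described above.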
#### Error Handling

- **Invalid Token**: Notify user to re-enter token
- **Network Error**: Queue listen for retry
- **Rate Limit**: Wait and retry

## Provider Configuration

### settings.json

```json
{
  "providers": {
    "musicbrainz": {
      "enabled": true
    },
    "genius": {
      "enabled": true
    },
    "wikipedia": {
      "enabled": true
    },
    "wikidata": {
      "enabled": true
    },
    "discogs": {
      "enabled": false
    },
    "allmusic": {
      "enabled": false
    },
    "metacritic": {
      "enabled": false
    },
    "lrclib": {
      "enabled": true
    }
  },
  "metadata": {
    "source": "providers",
    "order": ["musicbrainz", "genius", "wikipedia", "lrclib", "wikidata"]
  }
}
```

**Fields**:
- `providers.<name>.enabled`: Enable/disable provider
- `metadata.source`: Prefer "embedded" tags or "providers"
- `metadata.order`: Provider priority for conflicting data

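As a rough illustration of how these fields combine, the sketch below derives the list of providers to query from a parsed `settings.json`. `active_providers` is a hypothetical helper, and placing enabled providers that are missing from `metadata.order` at the end is an assumption, since the document does not say how they are ranked.

```python
import json


def active_providers(settings: dict) -> list:
    """Return enabled provider names, highest priority first."""
    providers = settings.get('providers', {})
    order = settings.get('metadata', {}).get('order', [])
    enabled = [name for name, cfg in providers.items() if cfg.get('enabled')]
    ranked = [name for name in order if name in enabled]
    # Assumption: enabled providers not listed in metadata.order rank last
    return ranked + [name for name in enabled if name not in ranked]


example = json.loads("""
{
  "providers": {
    "musicbrainz": {"enabled": true},
    "discogs": {"enabled": true},
    "genius": {"enabled": false},
    "lrclib": {"enabled": true}
  },
  "metadata": {"source": "providers", "order": ["musicbrainz", "genius", "lrclib"]}
}
""")
print(active_providers(example))  # ['musicbrainz', 'lrclib', 'discogs']
```

Disabled providers are dropped even when they appear in `metadata.order`, so the order list can stay unchanged when a provider is toggled.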
### .env

```bash
# Genius
GENIUS_ACCESS_TOKEN=your_genius_token

# Discogs
DISCOGS_ACCESS_TOKEN=your_discogs_token

# Last.fm
LASTFM_API_KEY=your_lastfm_key
LASTFM_API_SECRET=your_lastfm_secret

# Public URL for OAuth callbacks
PUBLIC_URL=https://meelo.example.com
```

## Provider Priority

When multiple providers return conflicting data, the Matcher resolves the conflict using the priority defined in `metadata.order`:

1. **MusicBrainz**: Highest priority (most accurate)
2. **Genius**: High priority for lyrics
3. **Wikipedia**: Medium priority for descriptions
4. **LrcLib**: High priority for synced lyrics
5. **Wikidata**: Medium priority for structured data
6. **Discogs**: Low priority (optional)
7. **AllMusic**: Low priority (optional)
8. **Metacritic**: Low priority (optional)

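Priority-based resolution can be pictured as "first provider in the order that has a value wins". This is a minimal sketch under that assumption; `resolve_field` and `ORDER` are illustrative names, not Meelo's API.

```python
from typing import Any, Mapping, Optional, Sequence

# Same order as the list above
ORDER = ['musicbrainz', 'genius', 'wikipedia', 'lrclib',
         'wikidata', 'discogs', 'allmusic', 'metacritic']


def resolve_field(values: Mapping[str, Any], order: Sequence[str] = ORDER) -> Optional[Any]:
    """Pick the value reported by the highest-priority provider that has one."""
    for provider in order:
        value = values.get(provider)
        if value is not None:
            return value
    return None


# Two providers disagree on a release date: MusicBrainz wins
print(resolve_field({'discogs': '1969-09-27', 'musicbrainz': '1969-09-26'}))  # 1969-09-26
```

Lower-priority providers still contribute whenever the higher-priority ones have no value for a field.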
## Data Aggregation

### Descriptions

Concatenate descriptions from multiple providers:

```
MusicBrainz: "The Beatles were an English rock band..."
Wikipedia: "Formed in Liverpool in 1960..."
Genius: "Known for their innovative songwriting..."

Result: "The Beatles were an English rock band... Formed in Liverpool in 1960... Known for their innovative songwriting..."
```

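The concatenation above might look like the following sketch. `merge_descriptions` is a hypothetical helper; joining with a single space and following the provider order are assumptions drawn from the example, not confirmed behaviour.

```python
from typing import Mapping, Sequence


def merge_descriptions(descriptions: Mapping[str, str], order: Sequence[str]) -> str:
    """Join non-empty descriptions in provider priority order."""
    parts = [descriptions[name].strip() for name in order if descriptions.get(name)]
    return ' '.join(parts)


merged = merge_descriptions(
    {'wikipedia': 'Formed in Liverpool in 1960...',
     'musicbrainz': 'The Beatles were an English rock band...'},
    order=['musicbrainz', 'wikipedia', 'genius'],
)
print(merged)  # The Beatles were an English rock band... Formed in Liverpool in 1960...
```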
### Ratings

Average ratings from multiple providers:

```
AllMusic: 90/100
Metacritic: 85/100

Result: (90 + 85) / 2 = 87.5 → 88/100
```

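A short sketch of the averaging step follows. `aggregate_rating` is a hypothetical name. It rounds halves up, as in the example above, because Python's built-in `round` uses banker's rounding and would turn 86.5 into 86.

```python
import math
from typing import Mapping, Optional


def aggregate_rating(ratings: Mapping[str, Optional[float]]) -> Optional[int]:
    """Average the available 0-100 ratings, rounding halves up."""
    values = [value for value in ratings.values() if value is not None]
    if not values:
        return None
    mean = sum(values) / len(values)
    return math.floor(mean + 0.5)  # 87.5 -> 88


print(aggregate_rating({'allmusic': 90, 'metacritic': 85}))  # 88
```

Providers that returned no rating are ignored rather than counted as zero.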
### Lyrics

Prefer synced lyrics over plain:

```
LrcLib: Synced lyrics available → Use synced
Genius: Plain lyrics available → Use as fallback
```

If both are available, both are stored in the Lyrics table.

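The selection can be sketched as below. `select_lyrics` is hypothetical, and the inputs are assumed to be dicts shaped like the LrcLib and Genius extraction results (`synced`/`plain` keys). Using Genius for the plain text and LrcLib's plain lyrics only as a fallback is an assumption.

```python
from typing import Optional


def select_lyrics(lrclib: Optional[dict], genius: Optional[dict]) -> dict:
    """Keep synced lyrics when present and always keep a plain version if any exists."""
    lrclib = lrclib or {}
    genius = genius or {}
    return {
        # Empty lists are normalised to None
        'synced': lrclib.get('synced') or None,
        # Assumption: Genius plain text first, LrcLib plain lyrics as fallback
        'plain': genius.get('plain') or lrclib.get('plain'),
    }
```

A player can then display `synced` when it is not `None` and fall back to `plain` otherwise.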
## Matching Workflow

1. **Scanner** registers file with Server
2. **Scanner** publishes `file.added` event to RabbitMQ
3. **Matcher** consumes event
4. **Matcher** fetches file metadata from Server
5. **Matcher** queries enabled providers in parallel:
   - MusicBrainz by AcoustID fingerprint
   - Genius by artist + title
   - Wikipedia by artist name
   - LrcLib by artist + title + duration
   - Wikidata by MusicBrainz ID (if found)
   - Discogs by artist + album (if enabled)
   - AllMusic by artist + album (if enabled)
   - Metacritic by artist + album (if enabled)
6. **Matcher** aggregates results based on priority
7. **Matcher** pushes enriched metadata to Server
8. **Server** updates database and search index

## Error Recovery

### Provider Failures

If a provider fails:
1. Log error with provider name and reason
2. Continue with other providers
3. Push partial metadata to Server
4. Mark track as "partially matched"

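A sketch of this behaviour with `asyncio.gather(..., return_exceptions=True)` is shown below. `query_providers`, the `fetchers` mapping, and the `"matched"` status string are hypothetical; only `"partially matched"` comes from the steps above.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Dict, Tuple

logger = logging.getLogger('matcher')


async def query_providers(
    fetchers: Dict[str, Callable[[], Awaitable[dict]]],
) -> Tuple[Dict[str, dict], str]:
    """Run all fetchers concurrently and keep whatever succeeded."""
    names = list(fetchers)
    results = await asyncio.gather(
        *(fetchers[name]() for name in names), return_exceptions=True
    )
    metadata, failed = {}, []
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            # Step 1: log provider name and reason, then carry on
            logger.warning('provider %s failed: %r', name, result)
            failed.append(name)
        else:
            metadata[name] = result
    status = 'partially matched' if failed else 'matched'
    return metadata, status
```

Because exceptions are returned instead of raised, one failing provider never cancels the others.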
### Retry Logic

For transient errors (network, rate limit):
1. Retry with exponential backoff
2. Max 3 attempts per provider
3. If all attempts fail, skip provider

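The retry policy can be sketched as follows. `TransientError`, `with_retry`, and the 1-second base delay are assumptions made for illustration; the document only specifies exponential backoff and a maximum of 3 attempts.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for network and rate-limit failures."""


async def with_retry(
    fetch: Callable[[], Awaitable[Any]],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Optional[Any]:
    """Call fetch up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return await fetch()
        except TransientError:
            if attempt == attempts - 1:
                return None  # all attempts failed: skip this provider
            await sleep(base_delay * 2 ** attempt)  # 1s, 2s, ...
    return None
```

The `sleep` parameter exists so the backoff can be replaced in tests; non-transient exceptions propagate to the caller unchanged.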
### Manual Refresh

Users can trigger a metadata refresh via the Scanner API:
```http
POST /scanner/refresh
```

This re-queries all providers for existing tracks.

## Performance Optimization

### Parallel Queries

Matcher queries all providers in parallel using asyncio:

```python
import asyncio

async def enrich_metadata(file_id):
    tasks = [
        fetch_musicbrainz(file_id),
        fetch_genius(file_id),
        fetch_wikipedia(file_id),
        fetch_lrclib(file_id),
        fetch_wikidata(file_id)
    ]
    # return_exceptions=True keeps one failing provider from cancelling the rest
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return aggregate_results(results)
```

### Caching

Provider responses are cached in memory for 1 hour:
- Reduces duplicate queries during batch scans
- Invalidated on manual refresh

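An in-memory cache with a one-hour lifetime could be as small as the sketch below. `TTLCache` is a hypothetical class, not Meelo's cache. The `clock` parameter is only there to make expiry testable.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = 3600.0, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self.clock(), value)

    def clear(self) -> None:
        """Invalidate everything, e.g. on a manual refresh."""
        self._entries.clear()
```

A key such as `('musicbrainz', mbid)` would let repeated lookups during a batch scan reuse the first response.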
### Rate Limit Coordination

Rate limiters are shared across all workers:
- Prevents exceeding provider limits
- Uses token bucket algorithm

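For readers unfamiliar with the token bucket algorithm, here is a minimal single-process sketch. `TokenBucket` is a hypothetical class; sharing one bucket across separate worker processes would need a shared store, which this example does not cover.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the caller must wait."""
        now = self.clock()
        # Refill in proportion to the elapsed time, never above capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `rate=1` and `capacity=1` this matches a 1 request/second limit, while `rate=1` and `capacity=60` would allow a burst of 60 requests followed by one per second.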
## Privacy Considerations

### Data Sent to Providers

- **MusicBrainz**: AcoustID fingerprint, artist/album/track names
- **Genius**: Artist and track names
- **Wikipedia**: Artist and album names
- **Wikidata**: MusicBrainz IDs
- **Discogs**: Artist and album names
- **AllMusic**: Artist and album names
- **Metacritic**: Artist and album names
- **LrcLib**: Artist, track name, duration

No file paths or user data are sent.

### Scrobbling Privacy

- **Last.fm**: Track plays sent with timestamp
- **ListenBrainz**: Track plays sent with timestamp

Users control scrobbling via settings; it is disabled by default.

## Future Enhancements

### Additional Providers

Potential providers to add:
- **Spotify**: Metadata and popularity scores
- **Apple Music**: Editorial content
- **Bandcamp**: Independent artist data
- **RateYourMusic**: User ratings and reviews

### Provider Plugins

Allow users to add custom providers via a plugin system.

### Offline Mode

Cache provider responses for offline access.

### Provider Statistics

Track provider accuracy and response times, and display them in the admin panel.

## Summary

Meelo's integration architecture separates concerns: Matcher handles provider queries, Server handles scrobbling. The provider pattern enables easy addition of new sources. Parallel queries and rate limiting optimize performance. Priority-based aggregation ensures data quality. OAuth flows and token management handle authentication. The system is flexible (enable/disable providers), resilient (retry logic, partial results), and privacy-conscious (no file paths sent).