# Lidarr Metadata API - External Integrations

## Integration Overview

The Lidarr Metadata API integrates with 15 external systems to provide comprehensive music metadata aggregation:

| Integration | Type | Purpose | Authentication | Rate Limit |
|-------------|------|---------|----------------|------------|
| **MusicBrainz DB** | Database | Core metadata | Read-only user | N/A |
| **Solr Search** | Search Engine | Full-text search | None | N/A |
| **Cover Art Archive** | CDN | Album cover art | None | N/A |
| **FanArt.tv** | REST API | Artist/album images | API key | Unknown (7-day image lag on free keys) |
| **TheAudioDB** | REST API | Metadata fallback | API key "1" | Unknown |
| **Wikipedia** | Web Scraping | Artist biographies | None | Polite crawling |
| **Spotify** | REST API | ID mapping, charts | OAuth | 429 handling |
| **Last.fm** | REST API | Charts | API key | Unknown |
| **Billboard** | Web Scraping | Charts | None | Polite crawling |
| **Apple Music** | RSS API | Charts | None | N/A |
| **RabbitMQ** | Message Queue | Search index updates | Basic auth | N/A |
| **Redis** | Cache | Ephemeral cache | None | N/A |
| **Sentry** | Error Tracking | Monitoring | DSN | Redis-based |
| **Telegraf** | Metrics | StatsD metrics | None | N/A |
| **Cloudflare** | CDN | Edge caching | API token | 1200 req/5min |

## 1. MusicBrainz Database

### Overview

**Type**: Direct PostgreSQL database access

**Purpose**: Authoritative source for all music metadata

**Container**: `ghcr.io/lidarr/mb-postgres:1.0.10`

**Access method**: Read-only asyncpg connection

### Configuration

```python
MUSICBRAINZ_DB = {
    'host': 'musicbrainz',
    'port': 5432,
    'database': 'musicbrainz_db',
    'user': 'musicbrainz_ro',  # Read-only user
    'password': 'abc',  # Insecure default; change before deploying
    'min_pool_size': 10,
    'max_pool_size': 50,
    'command_timeout': 30  # Seconds
}
```

### Connection Pool

```python
import asyncpg

# Must run inside a coroutine (for example during application startup)
pool = await asyncpg.create_pool(
    host=config.MUSICBRAINZ_DB['host'],
    port=config.MUSICBRAINZ_DB['port'],
    database=config.MUSICBRAINZ_DB['database'],
    user=config.MUSICBRAINZ_DB['user'],
    password=config.MUSICBRAINZ_DB['password'],
    min_size=config.MUSICBRAINZ_DB['min_pool_size'],
    max_size=config.MUSICBRAINZ_DB['max_pool_size'],
    command_timeout=config.MUSICBRAINZ_DB['command_timeout']
)
```

### Replication Setup

**Replication method**: MusicBrainz replication packets

**Update frequency**: Hourly

**Replication script**: Custom script in container

**Process**:
1. Check current replication sequence
2. Download replication packets from MusicBrainz FTP
3. Apply SQL changes
4. Update replication control table
5. Trigger search index updates

**Monitoring replication lag**:
```sql
SELECT
    current_replication_sequence,
    last_replication_date,
    NOW() - last_replication_date AS lag
FROM replication_control;
```

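
The lag returned by the query above could feed a simple health check. The following is a minimal sketch, not part of the service: `classify_replication_lag` and both thresholds are hypothetical, chosen on the assumption that more than two hours of lag means at least one missed hourly cycle.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds (illustrative only): replication runs hourly, so
# two hours of lag suggests a missed cycle and six hours a stalled job.
WARN_AFTER = timedelta(hours=2)
CRITICAL_AFTER = timedelta(hours=6)


def classify_replication_lag(last_replication_date, now=None):
    """Return 'ok', 'warning' or 'critical' for a replication timestamp."""
    now = now or datetime.now(timezone.utc)
    lag = now - last_replication_date
    if lag >= CRITICAL_AFTER:
        return 'critical'
    if lag >= WARN_AFTER:
        return 'warning'
    return 'ok'
```

The `last_replication_date` argument is expected to be a timezone-aware `datetime`, which is what asyncpg returns for a `timestamptz` column.
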
### Database Size and Performance

**Database size**: 100 GB+ (full MusicBrainz dataset)

**Query performance**:
- Simple artist lookup: 50-100ms
- Complex artist with releases: 100-500ms
- Album with tracks: 200-1000ms
- Change detection query: 500-2000ms

**Optimization**: Custom indices on `last_updated` columns

### Security Considerations

**Read-only access**: User has SELECT-only permissions

**Network isolation**: Database accessible only within Docker network

**Credentials**: Hardcoded in the default configuration (password `abc`); this default is insecure and should be overridden before deployment

## 2. Solr Search

### Overview

**Type**: Apache Solr 8.x search engine

**Purpose**: Full-text search for artists and albums

**Container**: `ghcr.io/lidarr/mb-solr:3.3.1.9`

**Cores**: `artist`, `release-group`

### Configuration

```python
SOLR = {
    'url': 'http://solr:8983/solr',
    'artist_core': 'artist',
    'album_core': 'release-group',
    'timeout': 5,
    'rows': 10
}
```

### Query Interface

**HTTP client**: aiohttp

**Query format**: Standard `/select` query parameters with a JSON response (`wt=json`)

**Example query**:
```python
import aiohttp


async def search_artist(query, limit=10):
    """Search the artist core with a dismax query"""
    async with aiohttp.ClientSession() as session:
        params = {
            'q': query,
            'defType': 'dismax',
            'qf': 'artist^2 sortname alias',  # Boost matches on the artist field
            'mm': '1',
            'rows': limit,
            'wt': 'json'
        }

        async with session.get(
            f"{config.SOLR['url']}/{config.SOLR['artist_core']}/select",
            params=params,
            timeout=aiohttp.ClientTimeout(total=config.SOLR['timeout'])
        ) as response:
            response.raise_for_status()
            data = await response.json()
            return data['response']['docs']
```

### Real-Time Index Updates

**Update mechanism**: RabbitMQ + SIR (Search Index Rebuilder)

**Process**:
1. MusicBrainz database changes trigger RabbitMQ messages
2. SIR consumes messages from queue
3. SIR queries MusicBrainz DB for updated entity
4. SIR posts update to Solr core
5. Solr performs soft commit (1 second)

**Update latency**: 1-5 seconds from database change

### Index Maintenance

**Full reindex**: Required after schema changes

**Reindex process**:
```bash
# Stop SIR
docker-compose stop indexer

# Clear Solr cores
curl "http://solr:8983/solr/artist/update?commit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'
curl "http://solr:8983/solr/release-group/update?commit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'

# Rebuild indices
docker-compose run indexer rebuild-artist
docker-compose run indexer rebuild-album

# Restart SIR
docker-compose start indexer
```

**Reindex duration**: 4-8 hours for full MusicBrainz dataset

### Performance Tuning

**JVM heap size**: 2GB

**Solr cache settings**:
```xml
<filterCache size="512" initialSize="512" autowarmCount="256"/>
<queryResultCache size="512" initialSize="512" autowarmCount="256"/>
<documentCache size="512" initialSize="512" autowarmCount="0"/>
```

**Commit settings**:
```xml
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>
```

## 3. Cover Art Archive

### Overview

**Type**: Image CDN

**Purpose**: Album cover art images

**Base URL**: `https://coverartarchive.org`

**Proxy**: `https://imagecache.lidarr.audio`

### Image URL Format

**Direct URL**:
```
https://coverartarchive.org/release/{release-mbid}/front-500.jpg
```

**Proxied URL**:
```
https://imagecache.lidarr.audio/cover/{release-mbid}/front.jpg
```

### Image Types

| Type | Description | Typical Size |
|------|-------------|--------------|
| `front` | Front cover | 500x500 - 1200x1200 |
| `back` | Back cover | 500x500 - 1200x1200 |
| `booklet` | Booklet pages | Variable |
| `medium` | Disc/vinyl image | 500x500 |
| `tray` | CD tray card | Variable |
| `obi` | Japanese obi strip | Variable |
| `spine` | Spine image | Variable |
| `track` | Track listing | Variable |
| `liner` | Liner notes | Variable |
| `sticker` | Sticker image | Variable |
| `poster` | Poster image | Variable |

### Image Proxy Benefits

**Advantages of using imagecache.lidarr.audio**:
1. **Caching**: Images cached at edge for faster delivery
2. **Resizing**: Automatic image resizing via query parameters
3. **Format conversion**: WebP conversion for modern browsers
4. **Bandwidth**: Reduced load on Cover Art Archive
5. **Reliability**: Clients can fall back to the direct URL if the proxy fails

**Proxy query parameters**:
```
https://imagecache.lidarr.audio/cover/{mbid}/front.jpg?size=500&format=webp
```

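
A small helper can assemble proxied URLs with the optional `size` and `format` parameters shown above. This is a sketch only: `build_proxy_cover_url` is a hypothetical name, and the set of values the proxy accepts for either parameter is not documented here.

```python
from urllib.parse import urlencode

PROXY_BASE = "https://imagecache.lidarr.audio"


def build_proxy_cover_url(release_mbid, size=None, image_format=None):
    """Build a proxied front-cover URL with optional resize/format parameters."""
    url = f"{PROXY_BASE}/cover/{release_mbid}/front.jpg"
    params = {}
    if size is not None:
        params["size"] = size
    if image_format is not None:
        params["format"] = image_format
    # Only append a query string when at least one parameter is set
    return f"{url}?{urlencode(params)}" if params else url
```
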
### Integration Code

```python
async def get_cover_art(release_mbid):
    """Fetch cover art URLs for release"""
    # check_url_exists is assumed to be defined elsewhere
    # (for example an HTTP HEAD request that returns True on 200)
    images = []

    # Try proxy first
    proxy_url = f"https://imagecache.lidarr.audio/cover/{release_mbid}/front.jpg"
    if await check_url_exists(proxy_url):
        images.append({
            'Url': proxy_url,
            'CoverType': 'cover',
            'Extension': '.jpg'
        })
    else:
        # Fallback to direct URL
        direct_url = f"https://coverartarchive.org/release/{release_mbid}/front-500.jpg"
        if await check_url_exists(direct_url):
            images.append({
                'Url': direct_url,
                'CoverType': 'cover',
                'Extension': '.jpg'
            })

    return images
```

### Error Handling

**404 Not Found**: No cover art available for release

**503 Service Unavailable**: Cover Art Archive temporarily down

**Fallback**: Use FanArt.tv or TheAudioDB images

## 4. FanArt.tv

### Overview

**Type**: REST API

**Purpose**: High-quality artist and album images

**Base URL**: `https://webservice.fanart.tv/v3`

**Authentication**: API key

### Configuration

```python
FANART = {
    'api_key': 'your-api-key-here',
    'base_url': 'https://webservice.fanart.tv/v3',
    'timeout': 10,
    'cache_ttl': 2592000  # 30 days
}
```

### API Key Types

| Key Type | Cost | Rate Limit | Image Lag |
|----------|------|------------|-----------|
| **Free** | Free | Unknown | 7 days |
| **Personal** | $2/month | Higher | No lag |
| **Commercial** | $5/month | Highest | No lag |

**7-day lag**: Free API keys only return images added 7+ days ago

### Endpoints

#### Artist Images

**Endpoint**: `GET /music/{mbid}`

**Request**:
```python
import aiohttp


async def get_fanart_artist_images(mbid):
    """Fetch FanArt.tv images for an artist MBID"""
    async with aiohttp.ClientSession() as session:
        headers = {'api-key': config.FANART['api_key']}
        url = f"{config.FANART['base_url']}/music/{mbid}"
        timeout = aiohttp.ClientTimeout(total=config.FANART['timeout'])

        async with session.get(url, headers=headers, timeout=timeout) as response:
            if response.status == 404:
                # No images registered for this MBID
                return []
            response.raise_for_status()
            return await response.json()
```

**Response**:
```json
{
  "name": "Nirvana",
  "mbid_id": "5b11f4ce-a62d-471e-81fc-a69a8278c7da",
  "artistbackground": [
    {
      "id": "12345",
      "url": "https://assets.fanart.tv/fanart/music/5b11f4ce-a62d-471e-81fc-a69a8278c7da/artistbackground/nirvana-1.jpg",
      "likes": "42"
    }
  ],
  "artistthumb": [...],
  "hdmusiclogo": [...],
  "musicbanner": [...],
  "musiclogo": [...]
}
```

#### Album Images

**Endpoint**: `GET /music/albums/{mbid}`

**Response**:
```json
{
  "albums": {
    "1b022e01-4da6-387b-8658-8678046e4cef": {
      "albumcover": [
        {
          "id": "67890",
          "url": "https://assets.fanart.tv/fanart/music/1b022e01-4da6-387b-8658-8678046e4cef/albumcover/nevermind-1.jpg",
          "likes": "156"
        }
      ],
      "cdart": [...]
    }
  }
}
```

### Image Types

**Artist images**:
- `artistbackground`: Background images (1920x1080)
- `artistthumb`: Artist thumbnails (1000x1000)
- `hdmusiclogo`: HD logos (transparent PNG)
- `musicbanner`: Banners (1000x185)
- `musiclogo`: Standard logos (transparent PNG)

**Album images**:
- `albumcover`: Album covers (1000x1000)
- `cdart`: CD art (transparent PNG)

### Mapping to Lidarr Image Types

```python
FANART_TYPE_MAPPING = {
    'artistbackground': 'fanart',
    'artistthumb': 'poster',
    'hdmusiclogo': 'logo',
    'musicbanner': 'banner',
    'musiclogo': 'logo',
    'albumcover': 'cover',
    'cdart': 'disc'
}
```

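
To show how the mapping might be applied, the sketch below flattens an artist response into the `Url`/`CoverType`/`Extension` dictionaries used elsewhere in this document. `fanart_to_lidarr_images` is a hypothetical helper, and deriving the extension from the URL path is an assumption rather than documented behavior.

```python
import os
from urllib.parse import urlparse

# Copy of the mapping above so the sketch runs on its own
FANART_TYPE_MAPPING = {
    'artistbackground': 'fanart',
    'artistthumb': 'poster',
    'hdmusiclogo': 'logo',
    'musicbanner': 'banner',
    'musiclogo': 'logo',
    'albumcover': 'cover',
    'cdart': 'disc'
}


def fanart_to_lidarr_images(response):
    """Flatten a FanArt.tv response dict into Lidarr-style image dicts."""
    images = []
    for fanart_type, cover_type in FANART_TYPE_MAPPING.items():
        for item in response.get(fanart_type, []):
            url = item.get('url')
            if not url:
                continue
            # Assumption: take the extension from the URL path, default to .jpg
            extension = os.path.splitext(urlparse(url).path)[1] or '.jpg'
            images.append({
                'Url': url,
                'CoverType': cover_type,
                'Extension': extension
            })
    return images
```

Because both `hdmusiclogo` and `musiclogo` map to `logo`, a caller that wants a single logo would still need to pick one, for example by preferring the HD variant.
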
### Error Handling

**404 Not Found**: No images available for artist/album

**429 Too Many Requests**: Rate limit exceeded (retry with backoff)

**503 Service Unavailable**: FanArt.tv temporarily down

**Fallback**: Use TheAudioDB or Cover Art Archive

### Caching Strategy

**Cache TTL**: 30 days (images rarely change)

**Cache key**: `fanart:artist:{mbid}` or `fanart:album:{mbid}`

**Invalidation**: Manual only (images are immutable)

## 5. TheAudioDB

### Overview

**Type**: REST API

**Purpose**: Fallback metadata and images

**Base URL**: `https://theaudiodb.com/api/v1/json`

**Authentication**: API key "1" (public key)

### Configuration

```python
THEAUDIODB = {
    'api_key': '1',
    'base_url': 'https://theaudiodb.com/api/v1/json',
    'timeout': 10,
    'cache_ttl': 2592000  # 30 days
}
```

### Endpoints

#### Artist by MusicBrainz ID

**Endpoint**: `GET /{api_key}/artist-mb.php?i={mbid}` (with the public key: `/1/artist-mb.php?i={mbid}`)

**Request**:
```python
import aiohttp


async def get_theaudiodb_artist(mbid):
    """Fetch a TheAudioDB artist record by MusicBrainz ID"""
    async with aiohttp.ClientSession() as session:
        # The API key is a path segment, so take it from config instead of hardcoding "1"
        url = f"{config.THEAUDIODB['base_url']}/{config.THEAUDIODB['api_key']}/artist-mb.php"
        params = {'i': mbid}
        timeout = aiohttp.ClientTimeout(total=config.THEAUDIODB['timeout'])

        async with session.get(url, params=params, timeout=timeout) as response:
            if response.status == 404:
                return None
            response.raise_for_status()
            data = await response.json()
            # "artists" is null when nothing matches
            artists = data.get('artists')
            return artists[0] if artists else None
```

**Response**:
```json
{
  "artists": [
    {
      "idArtist": "111247",
      "strArtist": "Nirvana",
      "strArtistAlternate": "",
      "strLabel": "DGC Records",
      "idLabel": "45114",
      "intFormedYear": "1987",
      "intBornYear": "",
      "intDiedYear": "",
      "strDisbanded": "1994",
      "strStyle": "Grunge",
      "strGenre": "Rock",
      "strMood": "Angry",
      "strWebsite": "www.nirvana.com",
      "strFacebook": "www.facebook.com/Nirvana",
      "strTwitter": "twitter.com/nirvana",
      "strBiographyEN": "Nirvana was an American rock band...",
      "strBiographyDE": null,
      "strBiographyFR": null,
      "strGender": "Male",
      "strCountry": "United States",
      "strCountryCode": "US",
      "strArtistThumb": "https://www.theaudiodb.com/images/media/artist/thumb/uxrqxy1347913147.jpg",
      "strArtistLogo": "https://www.theaudiodb.com/images/media/artist/logo/urspuv1434553994.png",
      "strArtistFanart": "https://www.theaudiodb.com/images/media/artist/fanart/spvryu1347980801.jpg",
      "strArtistBanner": "https://www.theaudiodb.com/images/media/artist/banner/xuypqw1342640163.jpg",
      "strMusicBrainzID": "5b11f4ce-a62d-471e-81fc-a69a8278c7da",
      "strLastFMChart": "https://www.last.fm/music/Nirvana",
      "intCharted": "5",
      "strLocked": "unlocked"
    }
  ]
}
```

#### Album by MusicBrainz ID

**Endpoint**: `GET /{api_key}/album-mb.php?i={mbid}` (with the public key: `/1/album-mb.php?i={mbid}`)

**Response**: Similar structure with album-specific fields

### Data Extraction

**Biography**:
```python
def extract_biography(artist_data):
    """Extract biography with language fallback"""
    languages = ['EN', 'DE', 'FR', 'ES', 'IT', 'JP']
    for lang in languages:
        bio = artist_data.get(f'strBiography{lang}')
        if bio:
            return bio
    return None
```

**Images**:
```python
def extract_images(artist_data):
    """Extract image URLs"""
    images = []

    if artist_data.get('strArtistThumb'):
        images.append({
            'Url': artist_data['strArtistThumb'],
            'CoverType': 'poster',
            'Extension': '.jpg'
        })

    if artist_data.get('strArtistLogo'):
        images.append({
            'Url': artist_data['strArtistLogo'],
            'CoverType': 'logo',
            'Extension': '.png'
        })

    if artist_data.get('strArtistFanart'):
        images.append({
            'Url': artist_data['strArtistFanart'],
            'CoverType': 'fanart',
            'Extension': '.jpg'
        })

    if artist_data.get('strArtistBanner'):
        images.append({
            'Url': artist_data['strArtistBanner'],
            'CoverType': 'banner',
            'Extension': '.jpg'
        })

    return images
```

**Links**:
```python
def extract_links(artist_data):
    """Extract social media links"""
    links = []

    if artist_data.get('strWebsite'):
        links.append({
            'Url': f"http://{artist_data['strWebsite']}",
            'Name': 'website'
        })

    if artist_data.get('strFacebook'):
        links.append({
            'Url': f"https://{artist_data['strFacebook']}",
            'Name': 'facebook'
        })

    if artist_data.get('strTwitter'):
        links.append({
            'Url': f"https://{artist_data['strTwitter']}",
            'Name': 'twitter'
        })

    return links
```

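
`extract_links` prepends a scheme unconditionally, which would produce a malformed URL if a field already contained one. A possible guard is sketched below; `normalize_link` is a hypothetical helper, and whether TheAudioDB ever returns fully qualified URLs in these fields is an assumption, not something confirmed here.

```python
from urllib.parse import urlparse


def normalize_link(value, default_scheme="https"):
    """Return `value` as an absolute URL, adding a scheme only when missing."""
    value = value.strip()
    if not value:
        return None
    if urlparse(value).scheme in ("http", "https"):
        return value
    # Drop any leading slashes (e.g. "//example.com") before prefixing
    return f"{default_scheme}://{value.lstrip('/')}"
```
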
### Error Handling

**404 Not Found**: Artist/album not in TheAudioDB

**Timeout**: 10-second timeout, fallback to other providers

**Invalid JSON**: Graceful degradation

## 6. Wikipedia

### Overview

**Type**: Web scraping

**Purpose**: Artist biographical information

**Base URL**: `https://{lang}.wikipedia.org`

**Authentication**: None (public access)

### Configuration

```python
WIKIPEDIA = {
    'timeout': 2,
    'max_connections_per_host': 1,
    'user_agent': 'LidarrMetadataAPI/10.0.0 (https://github.com/Lidarr/LidarrAPI.Metadata)',
    'languages': ['en', 'fr', 'de', 'es', 'it', 'ja', 'zh', 'ru', 'pt', 'nl', 'sv', 'fi', 'no', 'da', 'pl', 'cs', 'hu', 'ro', 'tr', 'el', 'he', 'ar', 'fa', 'hi', 'th', 'ko', 'vi', 'id', 'ms', 'tl', 'bn', 'ta']
}
```

### Lookup Process

**Multi-stage lookup**:

1. **MusicBrainz → Wikidata**: Extract Wikidata ID from MusicBrainz links
2. **Wikidata → Wikipedia**: Get Wikipedia article title from Wikidata
3. **Wikipedia → Extract**: Scrape and parse Wikipedia article

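
The first stage amounts to finding a Wikidata URL among an artist's MusicBrainz URL relationships and pulling the entity ID out of it. The sketch below assumes the relationships are already available as a list of URL strings; `wikidata_id_from_urls` is a hypothetical helper and `Q11649` is used purely as a sample ID.

```python
import re

# Matches both /wiki/Q... and /entity/Q... forms of a Wikidata URL
_WIKIDATA_URL = re.compile(
    r"^https?://(?:www\.)?wikidata\.org/(?:wiki|entity)/(Q\d+)$"
)


def wikidata_id_from_urls(urls):
    """Return the first Wikidata entity ID found in `urls`, or None."""
    for url in urls:
        match = _WIKIDATA_URL.match(url.strip())
        if match:
            return match.group(1)
    return None
```
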
### Wikidata Integration

**Wikidata entity URL**: `https://www.wikidata.org/wiki/Special:EntityData/{entity_id}.json`

**Extract Wikipedia links**:
```python
import aiohttp


async def get_wikipedia_title_from_wikidata(wikidata_id, language='en'):
    """Get Wikipedia article title from Wikidata entity"""
    async with aiohttp.ClientSession() as session:
        url = f"https://www.wikidata.org/wiki/Special:EntityData/{wikidata_id}.json"
        headers = {'User-Agent': config.WIKIPEDIA['user_agent']}
        timeout = aiohttp.ClientTimeout(total=config.WIKIPEDIA['timeout'])

        async with session.get(url, headers=headers, timeout=timeout) as response:
            response.raise_for_status()
            data = await response.json()
            entity = data['entities'][wikidata_id]

            # Get Wikipedia link for language
            sitelinks = entity.get('sitelinks', {})
            wiki_key = f'{language}wiki'

            if wiki_key in sitelinks:
                return sitelinks[wiki_key]['title']

            return None
```

### Wikipedia Article Extraction

**Fetch article HTML**:
```python
async def get_wikipedia_article(title, language='en'):
    """Fetch Wikipedia article HTML"""
    async with aiohttp.ClientSession() as session:
        url = f"https://{language}.wikipedia.org/wiki/{title}"
        headers = {'User-Agent': config.WIKIPEDIA['user_agent']}
        timeout = aiohttp.ClientTimeout(total=config.WIKIPEDIA['timeout'])

        async with session.get(url, headers=headers, timeout=timeout) as response:
            if response.status == 404:
                return None
            response.raise_for_status()
            return await response.text()
```

**Parse and extract summary**:
```python
from bs4 import BeautifulSoup


def extract_wikipedia_summary(html):
    """Extract first paragraph as summary"""
    soup = BeautifulSoup(html, 'lxml')

    # Find main content div
    content = soup.find('div', {'id': 'mw-content-text'})
    if not content:
        return None

    # Article paragraphs sit inside the nested .mw-parser-output div,
    # so search its direct children rather than those of #mw-content-text
    body = content.find('div', class_='mw-parser-output') or content

    # Find first paragraph (skip disambiguation notices)
    for p in body.find_all('p', recursive=False):
        text = p.get_text().strip()

        # Skip empty paragraphs
        if not text:
            continue

        # Skip coordinate-only paragraphs
        if text.startswith('Coordinates:'):
            continue

        # Return first substantial paragraph
        if len(text) > 50:
            return text

    return None
```

### Language Fallback

**32-language fallback chain**:

```python
import logging

logger = logging.getLogger(__name__)


async def get_artist_overview(mbid):
    """Get artist overview with language fallback"""
    # Get Wikidata ID from MusicBrainz
    # (get_wikidata_id_from_musicbrainz is assumed to be defined elsewhere)
    wikidata_id = await get_wikidata_id_from_musicbrainz(mbid)
    if not wikidata_id:
        return None

    # Try each language in order
    for language in config.WIKIPEDIA['languages']:
        try:
            # Get Wikipedia title for language
            title = await get_wikipedia_title_from_wikidata(wikidata_id, language)
            if not title:
                continue

            # Fetch and parse article
            html = await get_wikipedia_article(title, language)
            if not html:
                continue

            summary = extract_wikipedia_summary(html)
            if summary:
                return summary

        except Exception as e:
            logger.debug(f"Wikipedia lookup failed for {language}: {e}")
            continue

    return None
```

### Rate Limiting

**Polite crawling**:
- 1 connection per host maximum
- 2-second timeout per request
- User-Agent header identifies bot
- Respect robots.txt (manual check)

**No explicit rate limit**: Wikipedia allows reasonable bot traffic

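
One way to enforce "one connection per host" in application code is a per-host lock, optionally with a minimum gap between requests. The sketch below is illustrative only: `PoliteHostLimiter` and its `min_interval` parameter are hypothetical, and the demo uses a stub coroutine instead of real HTTP calls.

```python
import asyncio
import time


class PoliteHostLimiter:
    """Allow one in-flight request per host, spaced by a minimum interval."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._locks = {}
        self._last_finished = {}

    async def run(self, host, request):
        """Await `request()` while holding the lock for `host`."""
        lock = self._locks.setdefault(host, asyncio.Lock())
        async with lock:
            last = self._last_finished.get(host)
            if last is not None:
                wait = self.min_interval - (time.monotonic() - last)
                if wait > 0:
                    await asyncio.sleep(wait)
            try:
                return await request()
            finally:
                self._last_finished[host] = time.monotonic()


async def _demo():
    """Fire three concurrent requests and record peak concurrency."""
    limiter = PoliteHostLimiter(min_interval=0.05)
    state = {'active': 0, 'peak': 0}

    async def fake_request():
        state['active'] += 1
        state['peak'] = max(state['peak'], state['active'])
        await asyncio.sleep(0.01)
        state['active'] -= 1
        return 'ok'

    results = await asyncio.gather(
        *(limiter.run('en.wikipedia.org', fake_request) for _ in range(3))
    )
    return results, state['peak']


demo_results, demo_peak = asyncio.run(_demo())
```

With aiohttp, a similar connection cap is available through `aiohttp.TCPConnector(limit_per_host=1)`; the lock-based version above additionally spaces requests out in time.
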
### Error Handling

**404 Not Found**: Article doesn't exist in language

**Timeout**: 2-second timeout, try next language

**Parse errors**: Graceful degradation, try next language

**Fallback**: Use TheAudioDB biography if Wikipedia unavailable

## 7. Spotify

### Overview

**Type**: REST API with OAuth

**Purpose**: ID mapping and cross-platform linking

**Base URL**: `https://api.spotify.com/v1`

**Authentication**: OAuth 2.0 Client Credentials

**Library**: spotipy 2.16.1

### Configuration

```python
SPOTIFY = {
    'client_id': 'your-client-id',
    'client_secret': 'your-client-secret',
    'redirect_uri': 'http://localhost:5001/spotify/callback',  # Not used by the Client Credentials flow
    'timeout': 5
}
```

### OAuth Flow

**Client Credentials Grant** (for server-to-server):

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

auth_manager = SpotifyClientCredentials(
    client_id=config.SPOTIFY['client_id'],
    client_secret=config.SPOTIFY['client_secret']
)

spotify = spotipy.Spotify(auth_manager=auth_manager)
```

**Token caching**: Tokens cached in Redis with automatic refresh

### ID Mapping

**MusicBrainz → Spotify**:

```python
async def map_musicbrainz_to_spotify(mbid, artist_name):
    """Map MusicBrainz ID to Spotify ID"""
    # Search Spotify by artist name
    # Note: spotipy is synchronous, so this call blocks the event loop
    results = spotify.search(q=f'artist:{artist_name}', type='artist', limit=10)

    if not results['artists']['items']:
        return None

    # Find best match using Levenshtein distance
    best_match = None
    best_score = 0

    for artist in results['artists']['items']:
        score = levenshtein_similarity(artist_name, artist['name'])
        if score > best_score and score >= 0.8:
            best_score = score
            best_match = artist

    return best_match['id'] if best_match else None
```

**Levenshtein similarity**:
```python
from Levenshtein import ratio


def levenshtein_similarity(s1, s2):
    """Calculate Levenshtein similarity (0-1)"""
    return ratio(s1.lower(), s2.lower())
```

**Threshold**: 0.8 minimum similarity for match

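
To get a feel for what the 0.8 threshold accepts and rejects, the selection logic can be tried without Spotify or the `Levenshtein` package. The sketch below swaps in `difflib.SequenceMatcher` from the standard library as a stand-in; its scores are similar in spirit but not identical to `Levenshtein.ratio`, and `pick_best_match` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive similarity in [0, 1] (difflib stand-in, not Levenshtein)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_best_match(target, candidates, threshold=0.8):
    """Return the candidate most similar to `target`, or None below threshold."""
    best_name, best_score = None, 0.0
    for name in candidates:
        score = name_similarity(target, name)
        if score >= threshold and score > best_score:
            best_name, best_score = name, score
    return best_name
```

Name similarity alone cannot distinguish different artists that share a name, so a match found this way is a candidate rather than a confirmed mapping.
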
### Spotify API Endpoints

**Get artist**:
```python
artist = spotify.artist('6olE6TJLqED3rqDCT0FyPh')
```

**Get album**:
```python
album = spotify.album('2guirTSEqLizK7j9i1MTTZ')
```

**Search**:
```python
results = spotify.search(q='nirvana', type='artist', limit=10)
```

### Error Handling

**429 Too Many Requests**: Retry with exponential backoff

**401 Unauthorized**: Refresh OAuth token

**404 Not Found**: Artist/album not on Spotify

**Timeout**: 5-second timeout, graceful degradation

### Caching Strategy

**Cache TTL**: 90 days (Spotify IDs rarely change)

**Cache key**: `spotify:artist:{spotify_id}` or `spotify:mbid:{mbid}`

## 8. Last.fm

### Overview

**Type**: REST API

**Purpose**: Music charts and scrobble data

**Base URL**: `https://ws.audioscrobbler.com/2.0`

**Authentication**: API key

**Library**: pylast 4.3.0

### Configuration

```python
LASTFM = {
    'api_key': 'your-api-key',
    'api_secret': 'your-api-secret',
    'timeout': 5
}
```

### pylast Integration

```python
import pylast

network = pylast.LastFMNetwork(
    api_key=config.LASTFM['api_key'],
    api_secret=config.LASTFM['api_secret']
)
```

### Chart Endpoints

**Top artists**:
```python
def get_lastfm_top_artists(limit=50):
    """Get Last.fm top artists chart"""
    top_artists = network.get_top_artists(limit=limit)

    results = []
    for artist in top_artists:
        results.append({
            'name': artist.item.name,
            'playcount': artist.weight,
            # get_listener_count() issues one extra API request per artist
            'listeners': artist.item.get_listener_count()
        })

    return results
```

**Top albums**:
```python
def get_lastfm_top_albums(limit=50):
    """Get Last.fm top albums chart"""
    # Verify that the installed pylast version provides a network-level
    # get_top_albums(); it may only offer top artists, tracks and tags
    top_albums = network.get_top_albums(limit=limit)

    results = []
    for album in top_albums:
        results.append({
            'name': album.item.title,
            'artist': album.item.artist.name,
            'playcount': album.weight
        })

    return results
```

**Top tracks**:
```python
def get_lastfm_top_tracks(limit=50):
    """Get Last.fm top tracks chart"""
    top_tracks = network.get_top_tracks(limit=limit)

    results = []
    for track in top_tracks:
        results.append({
            'name': track.item.title,
            'artist': track.item.artist.name,
            'playcount': track.weight
        })

    return results
```

### MusicBrainz Mapping

**Map Last.fm artist to MusicBrainz**:
```python
async def map_lastfm_to_musicbrainz(lastfm_artist_name):
    """Map Last.fm artist to MusicBrainz ID"""
    # Search MusicBrainz via Solr
    results = await search_artist(lastfm_artist_name, limit=5)

    if not results:
        return None

    # Return best match (first result)
    return results[0]['Id']
```

### Caching

**Cache TTL**: 6 hours (charts update daily)

**Cache key**: `lastfm:chart:{type}:{limit}`

## 9. Billboard

### Overview

**Type**: Web scraping

**Purpose**: Billboard music charts

**Base URL**: `https://www.billboard.com/charts`

**Authentication**: None

**Library**: billboard-py 7.0.0

### billboard-py Integration

```python
import billboard


def get_billboard_hot_100():
    """Get Billboard Hot 100 chart"""
    chart = billboard.ChartData('hot-100')

    results = []
    for entry in chart:
        results.append({
            'position': entry.rank,
            'title': entry.title,
            'artist': entry.artist,
            'last_position': entry.lastPos,
            'peak_position': entry.peakPos,
            'weeks_on_chart': entry.weeks
        })

    return results
```

### Supported Charts

| Chart Name | billboard-py ID | Type |
|------------|-----------------|------|
| **Hot 100** | `hot-100` | Tracks |
| **Billboard 200** | `billboard-200` | Albums |
| **Artist 100** | `artist-100` | Artists |
| **Streaming Songs** | `streaming-songs` | Tracks |
| **Radio Songs** | `radio-songs` | Tracks |
| **Digital Song Sales** | `digital-song-sales` | Tracks |

### MusicBrainz Mapping

**Map Billboard entry to MusicBrainz**:
```python
async def map_billboard_to_musicbrainz(artist_name, track_title=None):
    """Map Billboard entry to MusicBrainz"""
    # Search artist
    artist_results = await search_artist(artist_name, limit=5)
    if not artist_results:
        return None

    artist_mbid = artist_results[0]['Id']

    # If track title provided, search for recording
    if track_title:
        # Recording search is not implemented, so track_title is
        # currently ignored and only the artist MBID is returned
        pass

    return artist_mbid
```

### Error Handling

**HTTP errors**: Retry with backoff

**Parse errors**: Graceful degradation

**Rate limiting**: Polite crawling (1 request per second)

### Caching

**Cache TTL**: 6 hours (charts update weekly)

**Cache key**: `billboard:chart:{chart_name}`

## 10. Apple Music / iTunes

### Overview

**Type**: RSS API

**Purpose**: Apple Music and iTunes charts

**Base URL**: `https://rss.applemarketingtools.com/api/v2`

**Authentication**: None

### RSS Feed URLs

**Top albums**:
```
https://rss.applemarketingtools.com/api/v2/us/music/most-played/100/albums.json
```

**Top songs**:
```
https://rss.applemarketingtools.com/api/v2/us/music/most-played/100/songs.json
```

**New releases**:
```
https://rss.applemarketingtools.com/api/v2/us/music/new-releases/100/albums.json
```

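
The three feed URLs share one pattern: country, feed name, result limit, then media type. A builder along these lines could replace string literals; `apple_rss_url` is a hypothetical helper, and only the feed and media-type combinations listed above are known to exist.

```python
APPLE_RSS_BASE = "https://rss.applemarketingtools.com/api/v2"


def apple_rss_url(feed, media_type, limit=100, country="us"):
    """Build an Apple RSS feed URL such as most-played albums for a country."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"{APPLE_RSS_BASE}/{country}/music/{feed}/{limit}/{media_type}.json"
```
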
### Fetch and Parse

```python
import aiohttp


async def get_apple_music_chart(chart_type, limit=100):
    """Fetch Apple Music chart"""
    async with aiohttp.ClientSession() as session:
        url = f"https://rss.applemarketingtools.com/api/v2/us/music/most-played/{limit}/{chart_type}.json"

        async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as response:
            response.raise_for_status()
            data = await response.json()

            results = []
            # Feed order is chart order, so the position is the 1-based index
            for position, entry in enumerate(data['feed']['results'], start=1):
                results.append({
                    'position': position,
                    'name': entry['name'],
                    'artist': entry['artistName'],
                    'url': entry['url'],
                    'artwork': entry['artworkUrl100']
                })

            return results
```

### MusicBrainz Mapping

**Map Apple Music entry to MusicBrainz**: Similar to Billboard mapping

### Caching

**Cache TTL**: 6 hours

**Cache key**: `apple:chart:{type}:{limit}`

## 11. RabbitMQ

### Overview

**Type**: Message queue

**Purpose**: Real-time search index updates

**Technology**: RabbitMQ 3.x

**Protocol**: AMQP 0.9.1

### Configuration

```python
RABBITMQ = {
    'host': 'rabbitmq',
    'port': 5672,
    'user': 'abc',  # Insecure default; change before deploying
    'password': 'abc',  # Insecure default; change before deploying
    'exchange': 'search.index',
    'artist_queue': 'search.index.artist',
    'album_queue': 'search.index.album'
}
```

### Message Format

**Artist update message**:
```json
{
  "entity_type": "artist",
  "mbid": "5b11f4ce-a62d-471e-81fc-a69a8278c7da",
  "action": "update",
  "timestamp": "2025-04-28T12:34:56Z"
}
```

**Album update message**:
```json
{
  "entity_type": "release_group",
  "mbid": "1b022e01-4da6-387b-8658-8678046e4cef",
  "action": "update",
  "timestamp": "2025-04-28T12:34:56Z"
}
```

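
A consumer would typically validate a message before acting on it. The sketch below parses the JSON shape shown above into a typed object; `IndexUpdate` and `parse_index_update` are hypothetical names, and the set of valid entity types is inferred from the two sample messages only.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from uuid import UUID

# Inferred from the sample messages; the real producer may emit more types
VALID_ENTITY_TYPES = {"artist", "release_group"}


@dataclass(frozen=True)
class IndexUpdate:
    entity_type: str
    mbid: UUID
    action: str
    timestamp: datetime


def parse_index_update(body):
    """Parse and validate a JSON message body (bytes or str)."""
    data = json.loads(body)
    if data["entity_type"] not in VALID_ENTITY_TYPES:
        raise ValueError(f"unknown entity_type: {data['entity_type']!r}")
    return IndexUpdate(
        entity_type=data["entity_type"],
        mbid=UUID(data["mbid"]),  # Raises ValueError on a malformed MBID
        action=data["action"],
        # Older Python versions reject a trailing "Z" in fromisoformat()
        timestamp=datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00")),
    )
```
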
### SIR (Search Index Rebuilder)

**Purpose**: Consume RabbitMQ messages and update Solr

**Process**:
1. Connect to RabbitMQ
2. Subscribe to queues
3. Consume messages
4. Query MusicBrainz DB for entity
5. Post update to Solr
6. Acknowledge message

**Container**: Separate service in docker-compose

### Monitoring

**Queue depth**:
```bash
rabbitmqctl list_queues name messages
```

**Consumer count**:
```bash
rabbitmqctl list_consumers
```

## 12. Redis

### Overview

**Type**: In-memory cache

**Purpose**: Ephemeral cache and rate limiting

**Technology**: Redis 6+

**Memory**: 512MB limit

### Configuration

```python
REDIS = {
    'url': 'redis://redis:6379/0',
    'namespace': 'lm3.7',
    'max_memory': '512mb',
    'eviction_policy': 'allkeys-lfu'
}
```

### Use Cases

1. **Hot cache**: Frequently accessed metadata
2. **Rate limiting**: Request counting
3. **Sentry deduplication**: Error tracking
4. **Invalidation locks**: Distributed locking

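
To make the request-counting use case concrete, here is a minimal fixed-window counter built on Redis `INCR` and `EXPIRE`. It is a sketch under stated assumptions: the key layout `lm3.7:ratelimit:...`, the function name `allow_request`, and the `FakeRedis` stand-in (used so the example runs without a server) are all hypothetical, and the source does not say which rate-limiting algorithm the service uses.

```python
import time


class FakeRedis:
    """In-memory stand-in exposing only incr/expire, for illustration."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def expire(self, key, seconds):
        return True  # expiry is not simulated in this stand-in


def allow_request(redis, client_id, limit, window_seconds, now=None):
    """Return True while the client is within `limit` requests per window."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)  # index of the current fixed window
    key = f"lm3.7:ratelimit:{client_id}:{window}"  # hypothetical key layout

    count = redis.incr(key)
    if count == 1:
        # First hit in this window: let the counter expire with the window
        redis.expire(key, window_seconds)
    return count <= limit
```

Because the window index is part of the key, counters reset naturally at each window boundary and stale keys are removed by their TTL.
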
### Connection Pool

```python
import aioredis


async def create_redis():
    """Create the shared connection pool (aioredis 1.x API)."""
    return await aioredis.create_redis_pool(
        config.REDIS['url'],
        minsize=5,
        maxsize=20,
        encoding='utf-8'
    )
```

## 13. Sentry

### Overview

**Type**: Error tracking

**Purpose**: Application monitoring

**Technology**: Sentry SaaS

**Library**: sentry-sdk 0.19.5

### Configuration

```python
import sentry_sdk
from sentry_sdk.integrations.flask import FlaskIntegration

sentry_sdk.init(
    dsn=config.SENTRY_DSN,
    integrations=[FlaskIntegration()],
    release=f"lidarr-metadata@{__version__}",
    environment=config.ENVIRONMENT,
    traces_sample_rate=0.1
)
```

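### Redis-Based Rate Limiting

**Purpose**: Prevent alert fatigue

```python
import hashlib


class SentryRedisTtlProcessor:
    """Rate limit Sentry events using Redis"""

    def __init__(self, redis, ttl=3600):
        self.redis = redis
        self.ttl = ttl  # Seconds during which a repeated error is suppressed

    # Note: sentry-sdk calls `before_send` synchronously, so this coroutine
    # needs a synchronous wrapper if it is registered as that hook.
    async def __call__(self, event, hint):
        # Sentry stores exceptions under event['exception']['values'];
        # the last entry is the most recent exception.
        values = event.get('exception', {}).get('values') or []
        if not values:
            return event  # Not an exception event; nothing to deduplicate

        # Generate error hash (MD5 is a fingerprint here, not a security measure)
        error = values[-1]
        error_hash = hashlib.md5(
            f"{error.get('type')}:{error.get('value')}".encode()
        ).hexdigest()

        key = f"lm3.7:sentry:{error_hash}"

        # Check if error seen recently
        if await self.redis.exists(key):
            return None  # Drop event

        # Mark error as seen
        await self.redis.setex(key, self.ttl, "1")

        return event
```
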
### Release Tracking

**Sentry releases**: Tied to git commits

**CI/CD integration**:
```bash
sentry-cli releases new "lidarr-metadata@${GIT_SHA}"
sentry-cli releases set-commits "lidarr-metadata@${GIT_SHA}" --auto
sentry-cli releases finalize "lidarr-metadata@${GIT_SHA}"
```

## 14. Telegraf

### Overview

**Type**: Metrics collection

**Purpose**: StatsD metrics aggregation

**Technology**: Telegraf (InfluxData)

**Protocol**: StatsD

### Configuration

```python
TELEGRAF = {
    'host': 'telegraf',
    'port': 8125,
    'prefix': 'lidarr.metadata'
}
```

### StatsD Client

```python
import statsd

stats = statsd.StatsClient(
    host=config.TELEGRAF['host'],
    port=config.TELEGRAF['port'],
    prefix=config.TELEGRAF['prefix']
)
```

### Metrics

**Request counters**:
```python
stats.incr('requests.artist')
stats.incr('requests.album')
stats.incr('requests.search')
```

**Response times**:
```python
with stats.timer('response_time.artist'):
    artist = await get_artist(mbid)
```

**Cache hits/misses**:
```python
stats.incr('cache.hit')
stats.incr('cache.miss')
```

**Provider requests**:
```python
stats.incr('provider.fanart.request')
stats.incr('provider.wikipedia.request')
```

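
For orientation, the sketch below shows roughly what travels over UDP to Telegraf for the calls above, using the plain StatsD line format `<name>:<value>|<type>`. The helper `statsd_line` is hypothetical and only illustrates the protocol; the `statsd` client library does its own formatting, and its exact number formatting for timers may differ.

```python
def statsd_line(prefix, name, value, metric_type):
    """Format one StatsD datagram: <prefix>.<name>:<value>|<type> (illustrative)."""
    return f"{prefix}.{name}:{value}|{metric_type}"


PREFIX = "lidarr.metadata"  # prefix from the TELEGRAF config

# "c" marks a counter increment, "ms" a timing in milliseconds
counter = statsd_line(PREFIX, "requests.artist", 1, "c")
timing = statsd_line(PREFIX, "response_time.artist", 42, "ms")
```

The configured prefix is prepended to every metric name, so `requests.artist` reaches Telegraf as `lidarr.metadata.requests.artist`.
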
## 15. Cloudflare

### Overview

**Type**: CDN and edge caching

**Purpose**: Global content delivery

**Technology**: Cloudflare CDN

**API**: Cloudflare REST API v4

### Configuration

```python
CLOUDFLARE = {
    'zone_id': 'your-zone-id',
    'api_token': 'your-api-token',
    'base_url': 'https://api.cloudflare.com/client/v4'
}
```

### Cache Purge

**Purge by URL**:
```python
import aiohttp


async def purge_cloudflare_cache(urls):
    """Purge Cloudflare cache for URLs"""
    urls = list(urls)  # Allow any iterable; slicing below needs a list
    url = f"{config.CLOUDFLARE['base_url']}/zones/{config.CLOUDFLARE['zone_id']}/purge_cache"
    headers = {
        'Authorization': f"Bearer {config.CLOUDFLARE['api_token']}",
        'Content-Type': 'application/json'
    }

    async with aiohttp.ClientSession() as session:
        # Batch URLs (max 30 per request)
        for start in range(0, len(urls), 30):
            data = {'files': urls[start:start + 30]}

            async with session.post(url, headers=headers, json=data) as response:
                response.raise_for_status()
```

**Purge all**:
```python
async def purge_all_cloudflare_cache():
    """Purge entire Cloudflare cache"""
    url = f"{config.CLOUDFLARE['base_url']}/zones/{config.CLOUDFLARE['zone_id']}/purge_cache"
    headers = {
        'Authorization': f"Bearer {config.CLOUDFLARE['api_token']}",
        'Content-Type': 'application/json'
    }
    data = {'purge_everything': True}

    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=data) as response:
            response.raise_for_status()
```

### Rate Limits

**Cloudflare API**: 1200 requests per 5 minutes

**Batch purging**: Max 30 URLs per request

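
Taken together, the two limits bound how many URLs can be purged per window. The arithmetic can be sketched as below; the constant and function names are illustrative, and the figures assume purge calls are the only traffic counted against the API limit.

```python
import math

API_LIMIT_REQUESTS = 1200          # requests allowed per window
API_LIMIT_WINDOW_SECONDS = 5 * 60  # 5-minute window
MAX_URLS_PER_PURGE = 30            # URLs per purge request

# Upper bound on URLs purged per window: 1200 * 30
MAX_URLS_PER_WINDOW = API_LIMIT_REQUESTS * MAX_URLS_PER_PURGE

# Even pacing that never exceeds the limit: 300 s / 1200 requests
MIN_REQUEST_INTERVAL_SECONDS = API_LIMIT_WINDOW_SECONDS / API_LIMIT_REQUESTS


def purge_request_count(url_count):
    """Number of purge requests needed for `url_count` URLs."""
    return math.ceil(url_count / MAX_URLS_PER_PURGE)
```

For example, invalidating 10,000 URLs takes 334 requests, well inside a single 5-minute window.
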
### Cache-Control Headers

**Set by API**:
```python
response.headers['Cache-Control'] = 's-maxage=2592000, max-age=0'
```

**Interpretation**:
- `s-maxage=2592000`: Shared caches (the CDN) may serve the response for 30 days (2,592,000 seconds)
- `max-age=0`: Browsers and other private caches treat the response as stale immediately and must revalidate before reuse

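
The header value can also be derived from durations instead of a hard-coded number of seconds. A small sketch, with `cache_control` as a hypothetical helper name:

```python
from datetime import timedelta


def cache_control(cdn_ttl, client_ttl=timedelta(0)):
    """Build a Cache-Control value from CDN and client lifetimes (illustrative)."""
    return (
        f"s-maxage={int(cdn_ttl.total_seconds())}, "
        f"max-age={int(client_ttl.total_seconds())}"
    )


header_value = cache_control(timedelta(days=30))
```

Expressing the lifetime as `timedelta(days=30)` makes the 30-day intent visible where the literal `2592000` does not.
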
## Integration Summary

The 15 integrations provide comprehensive metadata aggregation:

**Core data**: MusicBrainz DB (direct SQL)

**Search**: Solr (real-time via RabbitMQ)

**Images**: Cover Art Archive, FanArt.tv, TheAudioDB

**Biographies**: Wikipedia (32 languages), TheAudioDB

**Charts**: Last.fm, Billboard, Apple Music, Spotify

**Cross-platform**: Spotify ID mapping

**Infrastructure**: Redis (cache), PostgreSQL (persistent cache), RabbitMQ (messaging)

**Monitoring**: Sentry (errors), Telegraf (metrics)

**CDN**: Cloudflare (edge caching)

Each integration sits behind its own boundary, and where several sources cover the same data (images, biographies) they form fallback chains for resilience.

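
A fallback chain of the kind mentioned above can be sketched generically as follows. This is an illustration only: `first_available` and the fetcher callables are hypothetical, the provider order is an assumption, and the real service is asynchronous whereas this sketch is synchronous for brevity.

```python
def first_available(providers, *args):
    """Try each (name, fetch) pair in order; return the first non-empty result.

    `providers` is a list of (name, callable) pairs supplied by the caller.
    A provider that raises or returns nothing is skipped.
    """
    for name, fetch in providers:
        try:
            result = fetch(*args)
        except Exception:
            continue  # Provider unavailable: fall through to the next one
        if result:
            return name, result
    return None, None


def failing_provider(mbid):
    """Stand-in for a provider that is currently down."""
    raise ConnectionError("provider unavailable")


# Illustrative chain; the order is an assumption, not the documented order
image_chain = [
    ("fanart", failing_provider),
    ("theaudiodb", lambda mbid: [f"https://example.invalid/{mbid}.jpg"]),
]
source, images = first_available(image_chain, "5b11f4ce")
```

Returning the provider name alongside the result lets the caller record which source actually served the data, for instance in the per-provider metrics.
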