bc9fa36646
Week 10 - Plugin System (FR-19): - Plugin traits: Plugin, OriginPlugin, MetadataPlugin, FormatPlugin - NativePluginHost with libloading for dynamic loading - WasmPluginHost (feature-gated) with wasmtime runtime - PluginManager coordinating both hosts with version checks - OriginInstance::watch() with WatchHandle, WatchEvent for live updates - FormatPlugin::synthesize_header() for metadata overlay Week 11 - Control API & Production (FR-17, FR-18, NFR-6, NFR-10): - gRPC server with full MusicFS service (status, cache, origins, events) - Proto extended: MountState enum, TierStats, full StatusResponse/CacheStats - WebhookHandler with HMAC-SHA256 signing and exponential retry - Metrics with latency histograms (p50/p95/p99) and origin health gauges - CLI with mount, status, cache, search, origin, events, shutdown commands - E2E player compatibility tests (mpv, VLC, file manager) - systemd service, PKGBUILD, RPM spec for packaging Plans added for Weeks 10-14 covering P1 features. All 154 tests passing.
16 KiB
16 KiB
Week 12: External Metadata Integration
Phase: 5 - P1 Feature Completion
Goal: Integrate external metadata sources for automatic tagging and artwork
Requirements: FR-21.1-21.4, FR-16.5
Deliverables
| Task | Crate | Files | Requirements |
|---|---|---|---|
| MusicBrainz client | musicfs-external | musicbrainz.rs |
FR-21.1 |
| Discogs client | musicfs-external | discogs.rs |
FR-21.2 |
| Last.fm client | musicfs-external | lastfm.rs |
FR-21.3 |
| AcoustID/Chromaprint | musicfs-external | acoustid.rs |
FR-21.4 |
| Online artwork fetch | musicfs-external | artwork_fetch.rs |
FR-16.5 |
| Metadata enrichment | musicfs-external | enrichment.rs |
All |
| Plugin integration | musicfs-plugins | metadata_plugin.rs |
FR-21.5 |
Task 1: Create musicfs-external Crate
1.1 Cargo.toml
[package]
name = "musicfs-external"
version.workspace = true
edition.workspace = true
[dependencies]
musicfs-core = { path = "../musicfs-core" }
reqwest = { version = "0.11", features = ["json"] }
serde = { workspace = true, features = ["derive"] }
serde_json.workspace = true
tokio.workspace = true
tracing.workspace = true
thiserror.workspace = true
chromaprint = "0.6" # Audio fingerprinting
base64 = "0.21"
[dev-dependencies]
wiremock = "0.5" # Mock HTTP responses
tokio-test = "0.4"
1.2 src/lib.rs
pub mod musicbrainz;
pub mod discogs;
pub mod lastfm;
pub mod acoustid;
pub mod artwork_fetch;
pub mod enrichment;
pub use enrichment::MetadataEnricher;
Task 2: MusicBrainz Client (musicfs-external/src/musicbrainz.rs)
use serde::Deserialize;
const MB_API: &str = "https://musicbrainz.org/ws/2";
const USER_AGENT: &str = "MusicFS/0.1.0 (https://github.com/user/musicfs)";
#[derive(Debug, Deserialize)]
pub struct MbRecording {
pub id: String,
pub title: String,
pub length: Option<u64>,
#[serde(rename = "artist-credit")]
pub artist_credit: Vec<ArtistCredit>,
pub releases: Option<Vec<MbRelease>>,
}
#[derive(Debug, Deserialize)]
pub struct MbRelease {
pub id: String,
pub title: String,
pub date: Option<String>,
#[serde(rename = "release-group")]
pub release_group: Option<MbReleaseGroup>,
}
#[derive(Debug, Deserialize)]
pub struct MbReleaseGroup {
pub id: String,
#[serde(rename = "primary-type")]
pub primary_type: Option<String>,
}
#[derive(Debug, Deserialize)]
pub struct ArtistCredit {
pub artist: MbArtist,
}
#[derive(Debug, Deserialize)]
pub struct MbArtist {
pub id: String,
pub name: String,
#[serde(rename = "sort-name")]
pub sort_name: String,
}
pub struct MusicBrainzClient {
client: reqwest::Client,
rate_limiter: RateLimiter, // 1 req/sec per MB guidelines
}
impl MusicBrainzClient {
pub fn new() -> Self {
let client = reqwest::Client::builder()
.user_agent(USER_AGENT)
.build()
.expect("client build");
Self {
client,
rate_limiter: RateLimiter::new(Duration::from_secs(1)),
}
}
/// Search by recording title + artist (FR-21.1)
pub async fn search_recording(
&self,
title: &str,
artist: Option<&str>,
) -> Result<Vec<MbRecording>, ExternalError> {
self.rate_limiter.wait().await;
let mut query = format!("recording:{}", title);
if let Some(artist) = artist {
query.push_str(&format!(" AND artist:{}", artist));
}
let resp = self.client
.get(format!("{}/recording", MB_API))
.query(&[
("query", query.as_str()),
("fmt", "json"),
("limit", "5"),
])
.send()
.await?;
let body: SearchResponse<MbRecording> = resp.json().await?;
Ok(body.recordings)
}
/// Get release artwork from Cover Art Archive
pub async fn get_cover_art(&self, release_id: &str) -> Result<Option<Vec<u8>>, ExternalError> {
let url = format!("https://coverartarchive.org/release/{}/front-500", release_id);
let resp = self.client.get(&url).send().await?;
if resp.status() == 404 {
return Ok(None);
}
let bytes = resp.bytes().await?;
Ok(Some(bytes.to_vec()))
}
/// Lookup recording by MusicBrainz ID
pub async fn get_recording(&self, mbid: &str) -> Result<MbRecording, ExternalError> {
self.rate_limiter.wait().await;
let resp = self.client
.get(format!("{}/recording/{}", MB_API, mbid))
.query(&[
("inc", "artist-credits+releases+release-groups"),
("fmt", "json"),
])
.send()
.await?;
Ok(resp.json().await?)
}
}
struct RateLimiter {
interval: Duration,
last_request: Mutex<Instant>,
}
impl RateLimiter {
fn new(interval: Duration) -> Self {
Self {
interval,
last_request: Mutex::new(Instant::now() - interval),
}
}
async fn wait(&self) {
let mut last = self.last_request.lock().await;
let elapsed = last.elapsed();
if elapsed < self.interval {
tokio::time::sleep(self.interval - elapsed).await;
}
*last = Instant::now();
}
}
Task 3: Discogs Client (musicfs-external/src/discogs.rs)
const DISCOGS_API: &str = "https://api.discogs.com";
pub struct DiscogsClient {
client: reqwest::Client,
token: Option<String>,
rate_limiter: RateLimiter, // 60 req/min authenticated
}
impl DiscogsClient {
pub fn new(token: Option<String>) -> Self;
/// Search releases (FR-21.2)
pub async fn search(
&self,
query: &str,
artist: Option<&str>,
) -> Result<Vec<DiscogsRelease>, ExternalError>;
/// Get master release details
pub async fn get_master(&self, id: u64) -> Result<DiscogsMaster, ExternalError>;
/// Get release images
pub async fn get_images(&self, release_id: u64) -> Result<Vec<DiscogsImage>, ExternalError>;
}
#[derive(Debug, Deserialize)]
pub struct DiscogsRelease {
pub id: u64,
pub title: String,
pub year: Option<u16>,
pub thumb: Option<String>,
pub master_id: Option<u64>,
}
#[derive(Debug, Deserialize)]
pub struct DiscogsImage {
pub uri: String,
pub width: u32,
pub height: u32,
#[serde(rename = "type")]
pub image_type: String, // "primary" or "secondary"
}
Task 4: Last.fm Client (musicfs-external/src/lastfm.rs)
const LASTFM_API: &str = "https://ws.audioscrobbler.com/2.0";
pub struct LastFmClient {
client: reqwest::Client,
api_key: String,
}
impl LastFmClient {
pub fn new(api_key: String) -> Self;
/// Get track info with play counts, tags (FR-21.3)
pub async fn get_track_info(
&self,
track: &str,
artist: &str,
) -> Result<LastFmTrack, ExternalError>;
/// Get album info with artwork
pub async fn get_album_info(
&self,
album: &str,
artist: &str,
) -> Result<LastFmAlbum, ExternalError>;
/// Get artist info
pub async fn get_artist_info(&self, artist: &str) -> Result<LastFmArtist, ExternalError>;
}
#[derive(Debug, Deserialize)]
pub struct LastFmTrack {
pub name: String,
pub playcount: Option<u64>,
pub listeners: Option<u64>,
pub duration: Option<u64>,
pub toptags: Option<Tags>,
pub album: Option<LastFmAlbumRef>,
}
#[derive(Debug, Deserialize)]
pub struct LastFmAlbum {
pub name: String,
pub artist: String,
pub image: Vec<LastFmImage>,
pub tracks: Option<Tracks>,
}
#[derive(Debug, Deserialize)]
pub struct LastFmImage {
#[serde(rename = "#text")]
pub url: String,
pub size: String, // "small", "medium", "large", "extralarge", "mega"
}
Task 5: AcoustID/Chromaprint (musicfs-external/src/acoustid.rs)
use chromaprint::{Fingerprinter, Configuration};
const ACOUSTID_API: &str = "https://api.acoustid.org/v2/lookup";
pub struct AcoustIdClient {
client: reqwest::Client,
api_key: String,
}
impl AcoustIdClient {
pub fn new(api_key: String) -> Self;
/// Generate fingerprint from audio data (FR-21.4)
pub fn fingerprint(&self, samples: &[i16], sample_rate: u32) -> Result<String, ExternalError> {
let config = Configuration::preset_test1();
let mut fp = Fingerprinter::new(&config);
fp.start(sample_rate, 1)?; // mono
fp.feed(samples)?;
fp.finish()?;
Ok(fp.fingerprint().to_string())
}
/// Lookup fingerprint on AcoustID database
pub async fn lookup(
&self,
fingerprint: &str,
duration: u32,
) -> Result<Vec<AcoustIdResult>, ExternalError> {
let resp = self.client
.get(ACOUSTID_API)
.query(&[
("client", self.api_key.as_str()),
("fingerprint", fingerprint),
("duration", &duration.to_string()),
("meta", "recordings+releasegroups"),
])
.send()
.await?;
let body: AcoustIdResponse = resp.json().await?;
Ok(body.results)
}
}
#[derive(Debug, Deserialize)]
pub struct AcoustIdResult {
pub id: String,
pub score: f32,
pub recordings: Option<Vec<AcoustIdRecording>>,
}
#[derive(Debug, Deserialize)]
pub struct AcoustIdRecording {
pub id: String, // MusicBrainz recording ID
pub title: Option<String>,
pub artists: Option<Vec<AcoustIdArtist>>,
}
Task 6: Online Artwork Fetch (musicfs-external/src/artwork_fetch.rs)
pub struct ArtworkFetcher {
musicbrainz: MusicBrainzClient,
discogs: Option<DiscogsClient>,
lastfm: Option<LastFmClient>,
}
impl ArtworkFetcher {
/// Fetch missing artwork from online sources (FR-16.5)
/// Tries sources in order: MusicBrainz Cover Art Archive → Discogs → Last.fm
pub async fn fetch_artwork(
&self,
artist: &str,
album: &str,
size: ArtworkSize,
) -> Result<Option<ArtworkData>, ExternalError> {
// 1. Try MusicBrainz release search → Cover Art Archive
if let Some(art) = self.try_musicbrainz(artist, album, size).await? {
return Ok(Some(art));
}
// 2. Try Discogs
if let Some(discogs) = &self.discogs {
if let Some(art) = self.try_discogs(discogs, artist, album, size).await? {
return Ok(Some(art));
}
}
// 3. Try Last.fm
if let Some(lastfm) = &self.lastfm {
if let Some(art) = self.try_lastfm(lastfm, artist, album, size).await? {
return Ok(Some(art));
}
}
Ok(None)
}
async fn try_musicbrainz(
&self,
artist: &str,
album: &str,
size: ArtworkSize,
) -> Result<Option<ArtworkData>, ExternalError> {
// Search for release, get cover art from Cover Art Archive
let releases = self.musicbrainz.search_release(album, Some(artist)).await?;
for release in releases.iter().take(3) {
if let Some(art) = self.musicbrainz.get_cover_art(&release.id).await? {
return Ok(Some(ArtworkData {
data: art,
source: ArtworkSource::MusicBrainz,
mime_type: "image/jpeg".to_string(),
}));
}
}
Ok(None)
}
}
#[derive(Debug)]
pub struct ArtworkData {
pub data: Vec<u8>,
pub source: ArtworkSource,
pub mime_type: String,
}
#[derive(Debug)]
pub enum ArtworkSource {
MusicBrainz,
Discogs,
LastFm,
Embedded,
}
pub enum ArtworkSize {
Small, // 150px
Medium, // 300px
Large, // 500px
Original,
}
Task 7: Metadata Enrichment (musicfs-external/src/enrichment.rs)
pub struct MetadataEnricher {
musicbrainz: MusicBrainzClient,
acoustid: Option<AcoustIdClient>,
artwork_fetcher: ArtworkFetcher,
}
impl MetadataEnricher {
/// Enrich metadata from external sources
pub async fn enrich(&self, meta: &AudioMeta) -> Result<EnrichedMetadata, ExternalError> {
let mut enriched = EnrichedMetadata::from(meta);
// If we have title + artist, search MusicBrainz
if let (Some(title), Some(artist)) = (&meta.title, &meta.artist) {
let recordings = self.musicbrainz.search_recording(title, Some(artist)).await?;
if let Some(best) = recordings.first() {
enriched.musicbrainz_recording_id = Some(best.id.clone());
// Enrich with release info
if let Some(releases) = &best.releases {
if let Some(release) = releases.first() {
enriched.musicbrainz_release_id = Some(release.id.clone());
}
}
}
}
Ok(enriched)
}
/// Identify unknown track by audio fingerprint
pub async fn identify_by_fingerprint(
&self,
samples: &[i16],
sample_rate: u32,
duration: u32,
) -> Result<Option<IdentifiedTrack>, ExternalError> {
let acoustid = self.acoustid.as_ref()
.ok_or(ExternalError::ServiceNotConfigured("AcoustID"))?;
let fingerprint = acoustid.fingerprint(samples, sample_rate)?;
let results = acoustid.lookup(&fingerprint, duration).await?;
// Return best match above threshold
results.into_iter()
.filter(|r| r.score > 0.8)
.flat_map(|r| r.recordings)
.flatten()
.next()
.map(|rec| IdentifiedTrack {
title: rec.title,
musicbrainz_id: Some(rec.id),
artists: rec.artists.map(|a| a.into_iter().map(|x| x.name).collect()),
})
.pipe(Ok)
}
}
#[derive(Debug)]
pub struct EnrichedMetadata {
pub original: AudioMeta,
pub musicbrainz_recording_id: Option<String>,
pub musicbrainz_release_id: Option<String>,
pub musicbrainz_artist_id: Option<String>,
pub genres: Vec<String>,
pub play_count: Option<u64>,
}
#[derive(Debug)]
pub struct IdentifiedTrack {
pub title: Option<String>,
pub musicbrainz_id: Option<String>,
pub artists: Option<Vec<String>>,
}
Configuration
[external]
# MusicBrainz (no auth required, rate limited to 1 req/sec)
musicbrainz.enabled = true
# Discogs (optional, requires token for higher rate limits)
discogs.enabled = true
discogs.token = "your_discogs_token"
# Last.fm (requires API key)
lastfm.enabled = true
lastfm.api_key = "your_lastfm_api_key"
# AcoustID (requires API key)
acoustid.enabled = true
acoustid.api_key = "your_acoustid_api_key"
# Artwork fetching behavior
artwork.fetch_missing = true
artwork.cache_fetched = true
artwork.preferred_size = "large" # small, medium, large, original
Tests
| Test | Type | Validates |
|---|---|---|
test_musicbrainz_search |
Integration | Recording search (FR-21.1) |
test_musicbrainz_cover_art |
Integration | Cover Art Archive |
test_discogs_search |
Integration | Release search (FR-21.2) |
test_lastfm_track_info |
Integration | Track metadata (FR-21.3) |
test_acoustid_fingerprint |
Unit | Chromaprint generation |
test_acoustid_lookup |
Integration | Fingerprint lookup (FR-21.4) |
test_artwork_fetch_cascade |
Integration | Multi-source artwork (FR-16.5) |
test_metadata_enrichment |
Integration | Full enrichment flow |
test_rate_limiting |
Unit | Rate limiter works |
test_mock_responses |
Unit | Offline testing with mocks |
Exit Criteria
- MusicBrainz search returns relevant recordings
- Cover Art Archive artwork downloads work
- Discogs integration retrieves release info
- Last.fm integration retrieves track/artist info
- AcoustID fingerprinting identifies tracks
- Artwork fetcher tries all sources in cascade
- Metadata enricher adds external IDs
- Rate limiting prevents API abuse
- All tests pass with mock HTTP responses
Architecture Alignment
Per requirements.md:
- FR-21.1: MusicBrainz for canonical metadata ✓
- FR-21.2: Discogs for release info, artwork ✓
- FR-21.3: Last.fm for play counts, tags ✓
- FR-21.4: AcoustID for audio fingerprinting ✓
- FR-16.5: Fetch missing artwork from online ✓
Per architecture.md section 4.3.4:
- External metadata via
MetadataPlugintrait ✓ - Plugin architecture allows adding more sources ✓