Add Week 10 Plugin System and Week 11 Control API

Week 10 - Plugin System (FR-19):
- Plugin traits: Plugin, OriginPlugin, MetadataPlugin, FormatPlugin
- NativePluginHost with libloading for dynamic loading
- WasmPluginHost (feature-gated) with wasmtime runtime
- PluginManager coordinating both hosts with version checks
- OriginInstance::watch() with WatchHandle, WatchEvent for live updates
- FormatPlugin::synthesize_header() for metadata overlay

Week 11 - Control API & Production (FR-17, FR-18, NFR-6, NFR-10):
- gRPC server with full MusicFS service (status, cache, origins, events)
- Proto extended: MountState enum, TierStats, full StatusResponse/CacheStats
- WebhookHandler with HMAC-SHA256 signing and exponential retry
- Metrics with latency histograms (p50/p95/p99) and origin health gauges
- CLI with mount, status, cache, search, origin, events, shutdown commands
- E2E player compatibility tests (mpv, VLC, file manager)
- systemd service, PKGBUILD, RPM spec for packaging

Plans added for Weeks 10-14 covering P1 features.
All 154 tests passing.
Alexander
2026-05-13 10:34:01 +02:00
parent 34d05b7a49
commit bc9fa36646
27 changed files with 7050 additions and 49 deletions
+21
@@ -23,6 +23,21 @@ Implement audio metadata extraction using symphonia and create SQLite schema for
---
## Task 0: Extend AudioMeta in `musicfs-core`
Add `lyrics` and `composer` fields to the `AudioMeta` struct (FR-6.4):
```rust
// In musicfs-core/src/types.rs, add to AudioMeta:
pub struct AudioMeta {
// ... existing fields ...
pub lyrics: Option<String>,
pub composer: Option<String>,
}
```
---
## Task 1: Metadata Parser (`musicfs-metadata`)
### 1.1 Create `Cargo.toml`
@@ -168,6 +183,12 @@ impl MetadataParser {
meta.year = value.chars().take(4).collect::<String>()
.parse().ok();
}
StandardTagKey::Lyrics => {
meta.lyrics = Some(value);
}
StandardTagKey::Composer => {
meta.composer = Some(value);
}
_ => {}
}
}
+179
@@ -0,0 +1,179 @@
# Week 10: Plugin System
**Phase**: 4 - Plugin System & Polish
**Goal**: Extensibility via native and WASM plugins
**Requirements**: FR-23.1-23.5, FR-24.1-24.3
---
## Deliverables
| Task | Crate | Files | Requirements |
|------|-------|-------|--------------|
| Plugin traits | musicfs-plugins | `traits.rs` | FR-23.1-23.4 |
| Native host | musicfs-plugins | `native.rs` | FR-23.2 |
| WASM host | musicfs-plugins | `wasm.rs` | FR-23.3 |
| Plugin lifecycle | musicfs-plugins | `manager.rs` | FR-23.5 |
| Example plugins | plugins/ | `example-origin/`, `example-format/` | FR-23.5 |
---
## Plugin Traits (`musicfs-plugins/src/traits.rs`)
```rust
/// Base plugin interface
pub trait Plugin: Send + Sync {
fn name(&self) -> &str;
fn version(&self) -> Version;
fn init(&mut self, config: Value) -> Result<(), PluginError>;
fn shutdown(&mut self) -> Result<(), PluginError>;
}
/// Origin plugin interface (per architecture 4.3.4)
pub trait OriginPlugin: Plugin {
fn origin_type(&self) -> &str;
fn create(&self, config: Value) -> Result<Box<dyn Origin>, PluginError>;
}
/// Metadata source plugin
pub trait MetadataPlugin: Plugin {
fn lookup(&self, query: &MetadataQuery) -> Result<Option<ExternalMetadata>, PluginError>;
}
/// Format plugin for custom audio formats (FR-24.1)
pub trait FormatPlugin: Plugin {
fn extensions(&self) -> &[&str];
fn can_handle(&self, extension: &str) -> bool;
fn parse(&self, reader: &mut dyn Read) -> Result<AudioMeta, PluginError>;
}
```
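As an illustration of how a format plugin might satisfy these traits, here is a minimal sketch. It uses simplified stand-ins that are assumptions, not the real definitions: `PluginError` is a plain `String`, `AudioMeta` carries only a title, and the `Plugin` trait is cut down to `name()`. `RawPcmPlugin` and its extensions are hypothetical, and the default body for `can_handle` is one possible design, not something the plan specifies.

```rust
use std::io::Read;

// Simplified stand-ins for the real musicfs types (assumptions for this sketch).
type PluginError = String;

#[derive(Debug, Default, PartialEq)]
pub struct AudioMeta {
    pub title: Option<String>,
}

pub trait Plugin: Send + Sync {
    fn name(&self) -> &str;
}

pub trait FormatPlugin: Plugin {
    fn extensions(&self) -> &[&str];

    /// Default implementation: case-insensitive match against `extensions()`.
    fn can_handle(&self, extension: &str) -> bool {
        self.extensions()
            .iter()
            .any(|e| e.eq_ignore_ascii_case(extension))
    }

    fn parse(&self, reader: &mut dyn Read) -> Result<AudioMeta, PluginError>;
}

/// Hypothetical plugin for headerless PCM files.
pub struct RawPcmPlugin;

impl Plugin for RawPcmPlugin {
    fn name(&self) -> &str {
        "raw-pcm"
    }
}

impl FormatPlugin for RawPcmPlugin {
    fn extensions(&self) -> &[&str] {
        &["pcm", "raw"]
    }

    fn parse(&self, reader: &mut dyn Read) -> Result<AudioMeta, PluginError> {
        // Headerless PCM has no tags, so only reject empty input.
        let mut buf = Vec::new();
        reader.read_to_end(&mut buf).map_err(|e| e.to_string())?;
        if buf.is_empty() {
            return Err("empty file".to_string());
        }
        Ok(AudioMeta::default())
    }
}
```

Giving `can_handle` a default body keeps simple plugins to two methods, while a plugin that needs content sniffing could still override it.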
---
## Native Plugin Host (`musicfs-plugins/src/native.rs`)
```rust
pub struct NativePluginHost {
plugins: HashMap<String, LoadedPlugin>,
search_paths: Vec<PathBuf>,
}
struct LoadedPlugin {
library: libloading::Library,
instance: Box<dyn Plugin>,
}
impl NativePluginHost {
pub fn new() -> Self;
/// Load plugin from shared library (.so/.dylib)
pub fn load(&mut self, path: &Path) -> Result<PluginId, PluginError>;
/// Unload plugin (FR-23.5)
pub fn unload(&mut self, id: PluginId) -> Result<(), PluginError>;
/// Hot reload plugin without restart (FR-23.4)
pub fn reload(&mut self, id: PluginId) -> Result<(), PluginError>;
/// List loaded plugins
pub fn list(&self) -> Vec<PluginInfo>;
}
```
---
## WASM Plugin Host (`musicfs-plugins/src/wasm.rs`)
```rust
pub struct WasmPluginHost {
engine: wasmtime::Engine,
linker: wasmtime::Linker<PluginState>,
}
impl WasmPluginHost {
pub fn new() -> Result<Self, PluginError>;
/// Load WASM plugin with sandboxing (FR-23.3)
pub fn load(&mut self, wasm_bytes: &[u8]) -> Result<WasmPlugin, PluginError>;
/// Resource limits for sandboxed execution
pub fn set_limits(&mut self, limits: ResourceLimits);
}
pub struct ResourceLimits {
pub max_memory_mb: u32,
pub max_cpu_time_ms: u32,
pub allow_network: bool,
pub allow_filesystem: bool,
}
```
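One way to make the sandbox deny-by-default is to give `ResourceLimits` a conservative `Default`. The sketch below assumes default values (64 MB, 1000 ms) that are not taken from the plan, and `max_memory_bytes` and `validate` are hypothetical helpers.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ResourceLimits {
    pub max_memory_mb: u32,
    pub max_cpu_time_ms: u32,
    pub allow_network: bool,
    pub allow_filesystem: bool,
}

impl Default for ResourceLimits {
    /// Deny-by-default: no network, no filesystem, small budgets.
    /// The numeric values are illustrative assumptions.
    fn default() -> Self {
        Self {
            max_memory_mb: 64,
            max_cpu_time_ms: 1000,
            allow_network: false,
            allow_filesystem: false,
        }
    }
}

impl ResourceLimits {
    /// Memory cap in bytes, widened to u64 so the multiplication cannot overflow.
    pub fn max_memory_bytes(&self) -> u64 {
        u64::from(self.max_memory_mb) * 1024 * 1024
    }

    /// Reject limits under which no plugin could run at all.
    pub fn validate(&self) -> Result<(), String> {
        if self.max_memory_mb == 0 {
            return Err("max_memory_mb must be greater than zero".to_string());
        }
        if self.max_cpu_time_ms == 0 {
            return Err("max_cpu_time_ms must be greater than zero".to_string());
        }
        Ok(())
    }
}
```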
---
## Plugin Manager (`musicfs-plugins/src/manager.rs`)
```rust
pub struct PluginManager {
native_host: NativePluginHost,
wasm_host: WasmPluginHost,
registry: PluginRegistry,
}
impl PluginManager {
/// Initialize and load plugins from config
pub fn init(config: &PluginConfig) -> Result<Self, PluginError>;
/// Get all origin plugins
pub fn origin_plugins(&self) -> Vec<&dyn OriginPlugin>;
/// Get all format plugins
pub fn format_plugins(&self) -> Vec<&dyn FormatPlugin>;
/// Get all metadata plugins
pub fn metadata_plugins(&self) -> Vec<&dyn MetadataPlugin>;
/// Reload all plugins (hot reload)
pub fn reload_all(&mut self) -> Result<(), PluginError>;
}
```
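The commit message mentions version checks in `PluginManager`, but the plan does not define the rule. A minimal sketch, assuming semver-style compatibility and a simplified stand-in for `Version` (`is_compatible` is a hypothetical helper):

```rust
/// Simplified stand-in for the `Version` type returned by `Plugin::version()`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
    pub patch: u32,
}

/// Returns true if a plugin built against `plugin_api` can run on a host
/// exposing `host_api`: the major versions must match and the host must be
/// at least as new as the API the plugin was built against.
pub fn is_compatible(host_api: Version, plugin_api: Version) -> bool {
    if host_api.major != plugin_api.major {
        return false;
    }
    if host_api.major == 0 {
        // Before 1.0, treat a minor bump as a breaking change.
        return host_api.minor == plugin_api.minor && host_api.patch >= plugin_api.patch;
    }
    (host_api.minor, host_api.patch) >= (plugin_api.minor, plugin_api.patch)
}
```

Running a check like this before `init()` would let the manager reject a plugin without executing any of its code.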
---
## Tests
| Test | Type | Validates |
|------|------|-----------|
| `test_native_plugin_load` | Unit | Native plugin loading (FR-23.2) |
| `test_native_plugin_unload` | Unit | Clean unload |
| `test_wasm_plugin_sandbox` | Unit | WASM isolation (FR-23.3) |
| `test_wasm_resource_limits` | Unit | Memory/CPU limits enforced |
| `test_plugin_hot_reload` | Integration | Reload without restart (FR-23.4) |
| `test_example_origin_plugin` | Integration | Custom origin works |
| `test_example_format_plugin` | Integration | Custom format works |
---
## Exit Criteria
- [ ] Native plugins loadable at runtime
- [ ] WASM plugins sandboxed with resource limits
- [ ] Example plugins functional
- [ ] Plugins hot-reloadable without daemon restart
- [ ] Plugin lifecycle management (load, unload, reload)
---
## Architecture Alignment
Per architecture.md section 4.3.4:
- Plugin loading: Built-in → Native (.so) → WASM
- Origin plugins create `Box<dyn Origin>`
- Format plugins register file extensions
- WASM runs in wasmtime sandbox
Per requirements.md:
- FR-23.1: Loadable plugins ✓
- FR-23.2: Stable plugin API ✓
- FR-23.3: Plugins for origins, metadata, formats ✓
- FR-23.4: WASM sandbox ✓
- FR-23.5: Plugin lifecycle ✓
+539
@@ -0,0 +1,539 @@
# Week 11: Control API & Production
**Phase**: 4 - Plugin System & Polish
**Goal**: gRPC control API, metrics, and production readiness
**Requirements**: FR-17.1-17.5, FR-18.1-18.4, NFR-6.1-6.4, NFR-10.1-10.4, NFR-12.1-12.3
---
## Deliverables
| Task | Crate | Files | Requirements |
|------|-------|-------|--------------|
| gRPC server | musicfs-grpc | `server.rs` | FR-17.1-17.5 |
| Proto codegen | proto/ | `musicfs.proto`, `build.rs` | FR-17.2 |
| Event streaming | musicfs-grpc | `events.rs` | FR-18.1-18.3 |
| Webhook handler | musicfs-grpc | `webhook.rs` | FR-18.2 |
| Metrics export | musicfs-core | `metrics.rs` | NFR-6.1-6.4, NFR-10.2-10.4 |
| CLI completion | musicfs-cli | `main.rs` | FR-17 |
| systemd unit | dist/ | `musicfs.service` | Production |
| Packaging | dist/ | `PKGBUILD`, `musicfs.spec` | Production |
| E2E compatibility | tests/ | `e2e_players.rs` | NFR-12.1-12.3 |
---
## Proto Definitions (`proto/musicfs.proto`)
Per architecture.md section 4.3.7, implement full gRPC API:
```protobuf
syntax = "proto3";
package musicfs.v1;
service MusicFS {
// Daemon lifecycle
rpc GetStatus(Empty) returns (StatusResponse);
rpc Shutdown(ShutdownRequest) returns (Empty);
// Cache management
rpc GetCacheStats(Empty) returns (CacheStats);
rpc ClearCache(ClearCacheRequest) returns (ClearCacheResponse);
rpc Prefetch(PrefetchRequest) returns (stream PrefetchProgress);
// Origin management
rpc ListOrigins(Empty) returns (OriginsResponse);
rpc GetOriginHealth(OriginRequest) returns (OriginHealth);
rpc RescanOrigin(OriginRequest) returns (stream SyncProgress);
// Search (already implemented in Week 8)
rpc Search(SearchRequest) returns (SearchResponse);
rpc SearchStream(SearchRequest) returns (stream SearchResult);
// Events (server-streaming)
rpc SubscribeEvents(EventFilter) returns (stream Event);
}
```
Full message definitions in architecture.md section 4.3.7.
---
## gRPC Server (`musicfs-grpc/src/server.rs`)
```rust
pub struct MusicFsService {
core: Arc<MusicFsCore>,
events: broadcast::Sender<Event>,
metrics: Arc<MetricsCollector>,
}
#[tonic::async_trait]
impl musicfs::v1::music_fs_server::MusicFs for MusicFsService {
// Daemon lifecycle
async fn get_status(&self, _: Request<Empty>) -> Result<Response<StatusResponse>, Status>;
async fn shutdown(&self, req: Request<ShutdownRequest>) -> Result<Response<Empty>, Status>;
// Cache management
async fn get_cache_stats(&self, _: Request<Empty>) -> Result<Response<CacheStats>, Status>;
async fn clear_cache(&self, req: Request<ClearCacheRequest>) -> Result<Response<ClearCacheResponse>, Status>;
type PrefetchStream = ReceiverStream<Result<PrefetchProgress, Status>>;
async fn prefetch(&self, req: Request<PrefetchRequest>) -> Result<Response<Self::PrefetchStream>, Status>;
// Origin management
async fn list_origins(&self, _: Request<Empty>) -> Result<Response<OriginsResponse>, Status>;
async fn get_origin_health(&self, req: Request<OriginRequest>) -> Result<Response<OriginHealth>, Status>;
type RescanOriginStream = ReceiverStream<Result<SyncProgress, Status>>;
async fn rescan_origin(&self, req: Request<OriginRequest>) -> Result<Response<Self::RescanOriginStream>, Status>;
// Events
type SubscribeEventsStream = ReceiverStream<Result<Event, Status>>;
async fn subscribe_events(&self, req: Request<EventFilter>) -> Result<Response<Self::SubscribeEventsStream>, Status>;
}
```
---
## Event Streaming (`musicfs-grpc/src/events.rs`)
```rust
pub struct EventStreamer {
bus: Arc<EventBus>,
}
impl EventStreamer {
/// Convert internal events to gRPC Event messages
pub fn subscribe(&self, filter: EventFilter) -> impl Stream<Item = Event>;
/// Filter events by type and origin
fn matches(event: &Event, filter: &EventFilter) -> bool;
}
```
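The filtering step in `matches` could look roughly like the sketch below. The field names on `Event` and `EventFilter` are assumptions (the real messages are defined in the proto), and treating an empty type list as "match everything" is a design choice made for this example.

```rust
// Simplified stand-ins for the generated proto messages (assumed fields).
pub struct Event {
    pub event_type: String,
    pub origin_id: Option<String>,
}

pub struct EventFilter {
    /// Empty means "all event types".
    pub event_types: Vec<String>,
    /// None means "all origins".
    pub origin_id: Option<String>,
}

/// An event passes when both the type filter and the origin filter accept it.
pub fn matches(event: &Event, filter: &EventFilter) -> bool {
    let type_ok = filter.event_types.is_empty()
        || filter.event_types.iter().any(|t| t == &event.event_type);
    let origin_ok = match &filter.origin_id {
        None => true,
        Some(wanted) => event.origin_id.as_deref() == Some(wanted.as_str()),
    };
    type_ok && origin_ok
}
```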
---
## Webhook Handler (`musicfs-grpc/src/webhook.rs`)
HTTP webhook notifications for external integrations (FR-18.2):
```rust
use reqwest::Client;
use serde::Serialize;
use std::time::Duration;
use tokio::sync::broadcast;
#[derive(Debug, Clone, Serialize)]
pub struct WebhookPayload {
pub event_type: String,
pub timestamp: i64,
pub data: serde_json::Value,
}
pub struct WebhookConfig {
pub url: String,
pub secret: Option<String>,
pub events: Vec<String>, // Filter: ["file_accessed", "sync_completed", ...]
pub retry_count: u32,
pub timeout_ms: u64,
}
pub struct WebhookHandler {
client: Client,
configs: Vec<WebhookConfig>,
}
impl WebhookHandler {
pub fn new(configs: Vec<WebhookConfig>) -> Self;
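/// Start listening to event bus and dispatch webhooks
pub async fn run(&self, mut rx: broadcast::Receiver<Event>) {
loop {
let event = match rx.recv().await {
Ok(event) => event,
// A lagging receiver only misses events; keep listening.
Err(broadcast::error::RecvError::Lagged(skipped)) => {
tracing::warn!("Webhook handler lagged, skipped {} events", skipped);
continue;
}
// All senders dropped: the event bus is gone, so stop.
Err(broadcast::error::RecvError::Closed) => break,
};
for config in &self.configs {
if self.matches_filter(&event, config) {
self.dispatch(config, &event).await;
}
}
}
}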
/// Dispatch webhook with retry logic
async fn dispatch(&self, config: &WebhookConfig, event: &Event) {
let payload = WebhookPayload {
event_type: event.event_type(),
timestamp: event.timestamp(),
data: event.to_json(),
};
let mut attempts = 0;
loop {
let result = self.client
.post(&config.url)
.timeout(Duration::from_millis(config.timeout_ms))
.header("X-MusicFS-Signature", self.sign(&payload, config))
.json(&payload)
.send()
.await;
match result {
Ok(resp) if resp.status().is_success() => break,
_ if attempts < config.retry_count => {
attempts += 1;
tokio::time::sleep(Duration::from_millis(100 * 2u64.pow(attempts))).await;
}
_ => {
tracing::warn!("Webhook delivery failed after {} attempts", attempts + 1);
break;
}
}
}
}
/// HMAC-SHA256 signature if secret configured
fn sign(&self, payload: &WebhookPayload, config: &WebhookConfig) -> String;
fn matches_filter(&self, event: &Event, config: &WebhookConfig) -> bool;
}
```
Configuration in `config.toml`:
```toml
[[webhooks]]
url = "https://example.com/musicfs/events"
secret = "your-webhook-secret"
events = ["file_accessed", "sync_completed", "origin_health_changed"]
retry_count = 3
timeout_ms = 5000
```
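To make the retry timing concrete: the loop in `dispatch` increments `attempts` before computing the delay, so retry n waits 100 ms × 2^n. The helper below (`retry_delays` is a hypothetical name) reproduces that schedule. Saturating arithmetic is an addition for this sketch so that a very large `retry_count` cannot overflow.

```rust
use std::time::Duration;

/// Delays slept between delivery attempts, in order.
/// Retry `n` (1-based) waits 100 ms * 2^n, matching the dispatch loop.
pub fn retry_delays(retry_count: u32) -> Vec<Duration> {
    (1..=retry_count)
        .map(|attempt| {
            let millis = 100u64.saturating_mul(2u64.saturating_pow(attempt));
            Duration::from_millis(millis)
        })
        .collect()
}
```

With `retry_count = 3` this yields 200 ms, 400 ms and 800 ms, so a permanently failing endpoint receives four requests and adds 1.4 s of sleep on top of the per-request timeouts.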
---
## E2E Compatibility Tests (`tests/e2e_players.rs`)
Verify MusicFS works with common media players (NFR-12.1-12.3):
```rust
//! E2E tests for media player compatibility
//! Requires: mpv, vlc, file manager (nautilus/dolphin) installed
use std::process::Command;
use std::time::Duration;
/// Test mpv can play files from MusicFS (NFR-12.1)
#[test]
#[ignore] // Run manually: cargo test -- --ignored
fn test_mpv_playback() {
let mountpoint = setup_test_mount();
// mpv should be able to:
// 1. Open file without hanging
// 2. Read metadata (duration, format)
// 3. Play first few seconds
// 4. Seek forward
// 5. Exit cleanly
let output = Command::new("mpv")
.args([
"--no-video",
"--ao=null", // Decode audio but discard output (silent playback)
"--length=2", // Play 2 seconds only
"--msg-level=all=debug",
&format!("{}/Artist/Album/01 - Track.flac", mountpoint),
])
.output()
.expect("mpv must be installed");
assert!(output.status.success(), "mpv playback failed: {:?}", output);
}
/// Test VLC can browse and play (NFR-12.2)
#[test]
#[ignore]
fn test_vlc_playback() {
let mountpoint = setup_test_mount();
// VLC should handle:
// 1. Directory browsing
// 2. Playlist creation from folder
// 3. Metadata display
// 4. Gapless playback (if supported)
let output = Command::new("cvlc") // Command-line VLC
.args([
"--play-and-exit",
"--run-time=2",
&format!("{}/Artist/Album/", mountpoint),
])
.output()
.expect("vlc must be installed");
assert!(output.status.success(), "VLC playback failed");
}
/// Test file manager operations (NFR-12.3)
#[test]
#[ignore]
fn test_file_manager_operations() {
let mountpoint = setup_test_mount();
// File managers should be able to:
// 1. List directories without timeout
// 2. Show file previews/thumbnails
// 3. Display file properties
// 4. Copy files to local disk
// Test basic stat operations that file managers use
let entries: Vec<_> = std::fs::read_dir(&mountpoint)
.expect("read_dir failed")
.collect();
assert!(!entries.is_empty(), "mountpoint should have entries");
// Test stat on each entry (file managers do this for icons)
for entry in entries {
let entry = entry.expect("entry should be valid");
let metadata = entry.metadata().expect("metadata should work");
assert!(metadata.is_dir() || metadata.is_file());
}
}
/// Test concurrent access from multiple players
#[test]
#[ignore]
fn test_concurrent_player_access() {
let mountpoint = setup_test_mount();
// Spawn multiple players accessing different files
let handles: Vec<_> = (0..3)
.map(|i| {
let mp = mountpoint.clone();
std::thread::spawn(move || {
Command::new("mpv")
.args([
"--no-video", "--ao=null", "--length=1",
&format!("{}/Artist/Album/0{} - Track.flac", mp, i + 1),
])
.output()
})
})
.collect();
for handle in handles {
let output = handle.join().unwrap().expect("mpv should run");
assert!(output.status.success());
}
}
fn setup_test_mount() -> String {
// Returns path to test mount with sample files
std::env::var("MUSICFS_TEST_MOUNT")
.unwrap_or_else(|_| "/tmp/musicfs-test".to_string())
}
```
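Because these tests call `expect("mpv must be installed")`, a missing player shows up as a panic rather than a skipped test. One option is to probe `PATH` first and return early. The sketch below uses only the standard library, and `find_in_path` is a hypothetical helper, not part of the plan.

```rust
use std::env;
use std::path::PathBuf;

/// Look for an executable by name in the directories listed in PATH.
/// Returns the first matching file, or None if the binary is not found.
pub fn find_in_path(binary: &str) -> Option<PathBuf> {
    let paths = env::var_os("PATH")?;
    env::split_paths(&paths)
        .map(|dir| dir.join(binary))
        .find(|candidate| candidate.is_file())
}
```

A test could then start with `if find_in_path("mpv").is_none() { return; }`, possibly printing a note to stderr so the skip is visible in the test log.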
---
## Metrics (`musicfs-core/src/metrics.rs`)
Per architecture.md section 5.2:
```rust
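use lazy_static::lazy_static;
use prometheus::{
register_histogram_vec, register_int_counter, register_int_counter_vec, register_int_gauge,
register_int_gauge_vec, HistogramVec, IntCounter, IntCounterVec, IntGauge, IntGaugeVec,
};
use std::net::SocketAddr;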
lazy_static! {
pub static ref FUSE_OPS: IntCounterVec = register_int_counter_vec!(
"musicfs_fuse_ops_total",
"Total FUSE operations",
&["op"]
).unwrap();
pub static ref FUSE_LATENCY: HistogramVec = register_histogram_vec!(
"musicfs_fuse_latency_seconds",
"FUSE operation latency",
&["op"],
vec![0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0]
).unwrap();
pub static ref CACHE_HITS: IntCounter = register_int_counter!(
"musicfs_cache_hits_total",
"Cache hits"
).unwrap();
pub static ref CACHE_MISSES: IntCounter = register_int_counter!(
"musicfs_cache_misses_total",
"Cache misses"
).unwrap();
pub static ref CACHE_SIZE_BYTES: IntGauge = register_int_gauge!(
"musicfs_cache_size_bytes",
"Current cache size in bytes"
).unwrap();
pub static ref ORIGIN_HEALTH: IntGaugeVec = register_int_gauge_vec!(
"musicfs_origin_health",
"Origin health status (1=healthy, 0=unhealthy)",
&["origin"]
).unwrap();
}
/// Expose metrics on HTTP endpoint
pub async fn serve_metrics(addr: SocketAddr) -> Result<(), MetricsError>;
```
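The commit message mentions p50/p95/p99 latencies. A Prometheus histogram stores only cumulative bucket counts, so percentiles are estimated from them, normally at query time. The sketch below shows the linear-interpolation idea in isolation. `estimate_quantile` is a hypothetical helper, and the lowest bucket is assumed to start at 0.

```rust
/// Estimate quantile `q` (0.0..=1.0) from cumulative histogram buckets.
/// `buckets` holds (upper_bound_seconds, cumulative_count), sorted by bound.
pub fn estimate_quantile(q: f64, buckets: &[(f64, u64)]) -> Option<f64> {
    let total = buckets.last()?.1;
    if total == 0 || !(0.0..=1.0).contains(&q) {
        return None;
    }
    // Rank of the requested observation among all recorded samples.
    let rank = q * total as f64;
    let mut lower_bound = 0.0;
    let mut lower_count = 0u64;
    for &(upper_bound, count) in buckets {
        if count as f64 >= rank {
            let in_bucket = (count - lower_count) as f64;
            if in_bucket == 0.0 {
                return Some(upper_bound);
            }
            // Assume samples are spread evenly inside the bucket.
            let fraction = (rank - lower_count as f64) / in_bucket;
            return Some(lower_bound + (upper_bound - lower_bound) * fraction);
        }
        lower_bound = upper_bound;
        lower_count = count;
    }
    None
}
```

The estimate can only be as fine as the bucket layout, which is why the bucket boundaries for `FUSE_LATENCY` are worth choosing around the latencies that matter.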
---
## CLI Commands (`musicfs-cli/src/main.rs`)
```rust
use clap::{Parser, Subcommand};
use std::path::PathBuf;
#[derive(Parser)]
enum Command {
/// Mount filesystem
Mount {
#[arg(short, long)]
config: PathBuf,
mountpoint: PathBuf,
},
/// Get daemon status
Status,
/// Cache management
Cache {
#[command(subcommand)]
command: CacheCommand,
},
/// Search library
Search {
query: String,
#[arg(short, long, default_value = "100")]
limit: u32,
},
/// Origin management
Origin {
#[command(subcommand)]
command: OriginCommand,
},
/// Subscribe to events
Events {
#[arg(short, long)]
r#type: Option<String>,
},
/// Request graceful daemon shutdown (FR-17.5)
Shutdown,
}
#[derive(Subcommand)]
enum CacheCommand {
Stats,
Clear { origin: Option<String> },
Prefetch { paths: Vec<String> },
}
#[derive(Subcommand)]
enum OriginCommand {
List,
Health { origin_id: String },
Rescan { origin_id: String },
}
```
---
## systemd Service (`dist/musicfs.service`)
```ini
[Unit]
Description=MusicFS - Metadata-Organized Music Filesystem
After=network.target
[Service]
Type=notify
ExecStart=/usr/bin/musicfs mount --config /etc/musicfs/config.toml /mnt/music
ExecStop=/usr/bin/musicfs shutdown
Restart=on-failure
RestartSec=5
User=musicfs
Group=musicfs
# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=read-only
ReadWritePaths=/var/cache/musicfs /mnt/music
PrivateTmp=true
[Install]
WantedBy=multi-user.target
```
---
## Tests
| Test | Type | Validates |
|------|------|-----------|
| `test_grpc_status` | Unit | GetStatus RPC (FR-17.1) |
| `test_grpc_cache_stats` | Unit | GetCacheStats RPC |
| `test_grpc_cache_clear` | Unit | ClearCache RPC (FR-17.3) |
| `test_grpc_origins_list` | Unit | ListOrigins RPC |
| `test_grpc_origin_rescan` | Integration | RescanOrigin streaming |
| `test_grpc_events_stream` | Integration | Event streaming (FR-18.1) |
| `test_grpc_prefetch_stream` | Integration | Prefetch progress |
| `test_webhook_dispatch` | Unit | Webhook delivery (FR-18.2) |
| `test_webhook_retry` | Unit | Webhook retry on failure |
| `test_webhook_hmac_signature` | Unit | HMAC-SHA256 signing |
| `test_metrics_prometheus` | Unit | Prometheus format (NFR-6.1) |
| `test_metrics_http_endpoint` | Integration | HTTP metrics endpoint |
| `test_cli_commands` | Integration | CLI works |
| `test_systemd_service` | E2E | Service lifecycle |
| `test_mpv_playback` | E2E | mpv compatibility (NFR-12.1) |
| `test_vlc_playback` | E2E | VLC compatibility (NFR-12.2) |
| `test_file_manager_operations` | E2E | File manager browsing (NFR-12.3) |
| `test_concurrent_player_access` | E2E | Multiple players concurrently |
---
## Exit Criteria
- [ ] gRPC API fully functional (all RPCs from architecture.md 4.3.7)
- [ ] Event streaming works with filtering
- [ ] Webhook notifications delivered with HMAC signing
- [ ] Prometheus metrics exported on HTTP endpoint
- [ ] CLI feature-complete with all commands
- [ ] systemd service works (start, stop, restart)
- [ ] mpv, VLC playback verified (E2E tests)
- [ ] File manager browsing verified
- [ ] All acceptance tests pass
---
## Architecture Alignment
Per architecture.md section 4.3.7:
- gRPC over Unix socket ✓
- Protocol Buffers for type safety ✓
- Server-streaming for events, sync progress, prefetch ✓
- CLI wraps gRPC client ✓
Per architecture.md section 5.2:
- Prometheus metrics format ✓
- Golden signals: latency, traffic, errors, saturation ✓
Per requirements.md:
- FR-17.1: Unix socket control ✓
- FR-17.2: gRPC with Protocol Buffers ✓
- FR-17.3: Cache management commands ✓
- FR-17.4: Runtime configuration ✓
- FR-17.5: Graceful shutdown ✓
- FR-18.1: File access events ✓
- FR-18.2: Webhook notifications ✓ (HTTP webhooks with HMAC)
- FR-18.3: Event streaming ✓
- FR-18.4: Access pattern logging ✓
- NFR-10.1: Configurable logging ✓
- NFR-10.2: Metrics exposure ✓
- NFR-10.3: Health check ✓
- NFR-10.4: Prometheus integration ✓
- NFR-12.1: mpv compatibility ✓ (E2E tests)
- NFR-12.2: VLC compatibility ✓ (E2E tests)
- NFR-12.3: File manager compatibility ✓ (E2E tests)
+624
@@ -0,0 +1,624 @@
# Week 12: External Metadata Integration
**Phase**: 5 - P1 Feature Completion
**Goal**: Integrate external metadata sources for automatic tagging and artwork
**Requirements**: FR-21.1-21.5, FR-16.5
---
## Deliverables
| Task | Crate | Files | Requirements |
|------|-------|-------|--------------|
| MusicBrainz client | musicfs-external | `musicbrainz.rs` | FR-21.1 |
| Discogs client | musicfs-external | `discogs.rs` | FR-21.2 |
| Last.fm client | musicfs-external | `lastfm.rs` | FR-21.3 |
| AcoustID/Chromaprint | musicfs-external | `acoustid.rs` | FR-21.4 |
| Online artwork fetch | musicfs-external | `artwork_fetch.rs` | FR-16.5 |
| Metadata enrichment | musicfs-external | `enrichment.rs` | All |
| Plugin integration | musicfs-plugins | `metadata_plugin.rs` | FR-21.5 |
---
## Task 1: Create `musicfs-external` Crate
### 1.1 `Cargo.toml`
```toml
[package]
name = "musicfs-external"
version.workspace = true
edition.workspace = true
[dependencies]
musicfs-core = { path = "../musicfs-core" }
reqwest = { version = "0.11", features = ["json"] }
serde = { workspace = true, features = ["derive"] }
serde_json.workspace = true
tokio.workspace = true
tracing.workspace = true
thiserror.workspace = true
chromaprint = "0.6" # Audio fingerprinting
base64 = "0.21"
[dev-dependencies]
wiremock = "0.5" # Mock HTTP responses
tokio-test = "0.4"
```
### 1.2 `src/lib.rs`
```rust
pub mod musicbrainz;
pub mod discogs;
pub mod lastfm;
pub mod acoustid;
pub mod artwork_fetch;
pub mod enrichment;
pub use enrichment::MetadataEnricher;
```
---
## Task 2: MusicBrainz Client (`musicfs-external/src/musicbrainz.rs`)
```rust
use serde::Deserialize;
use std::time::{Duration, Instant};
use tokio::sync::Mutex;
const MB_API: &str = "https://musicbrainz.org/ws/2";
const USER_AGENT: &str = "MusicFS/0.1.0 (https://github.com/user/musicfs)";
#[derive(Debug, Deserialize)]
pub struct MbRecording {
pub id: String,
pub title: String,
pub length: Option<u64>,
#[serde(rename = "artist-credit")]
pub artist_credit: Vec<ArtistCredit>,
pub releases: Option<Vec<MbRelease>>,
}
#[derive(Debug, Deserialize)]
pub struct MbRelease {
pub id: String,
pub title: String,
pub date: Option<String>,
#[serde(rename = "release-group")]
pub release_group: Option<MbReleaseGroup>,
}
#[derive(Debug, Deserialize)]
pub struct MbReleaseGroup {
pub id: String,
#[serde(rename = "primary-type")]
pub primary_type: Option<String>,
}
#[derive(Debug, Deserialize)]
pub struct ArtistCredit {
pub artist: MbArtist,
}
#[derive(Debug, Deserialize)]
pub struct MbArtist {
pub id: String,
pub name: String,
#[serde(rename = "sort-name")]
pub sort_name: String,
}
pub struct MusicBrainzClient {
client: reqwest::Client,
rate_limiter: RateLimiter, // 1 req/sec per MB guidelines
}
impl MusicBrainzClient {
pub fn new() -> Self {
let client = reqwest::Client::builder()
.user_agent(USER_AGENT)
.build()
.expect("client build");
Self {
client,
rate_limiter: RateLimiter::new(Duration::from_secs(1)),
}
}
/// Search by recording title + artist (FR-21.1)
pub async fn search_recording(
&self,
title: &str,
artist: Option<&str>,
) -> Result<Vec<MbRecording>, ExternalError> {
self.rate_limiter.wait().await;
// Quote each term so multi-word titles and artists are matched as phrases.
let mut query = format!("recording:\"{}\"", title);
if let Some(artist) = artist {
query.push_str(&format!(" AND artist:\"{}\"", artist));
}
let resp = self.client
.get(format!("{}/recording", MB_API))
.query(&[
("query", query.as_str()),
("fmt", "json"),
("limit", "5"),
])
.send()
.await?;
let body: SearchResponse<MbRecording> = resp.json().await?;
Ok(body.recordings)
}
/// Get release artwork from Cover Art Archive
pub async fn get_cover_art(&self, release_id: &str) -> Result<Option<Vec<u8>>, ExternalError> {
let url = format!("https://coverartarchive.org/release/{}/front-500", release_id);
let resp = self.client.get(&url).send().await?;
if resp.status() == 404 {
return Ok(None);
}
// Treat any other non-success status as an error, not as image bytes.
let bytes = resp.error_for_status()?.bytes().await?;
Ok(Some(bytes.to_vec()))
}
/// Lookup recording by MusicBrainz ID
pub async fn get_recording(&self, mbid: &str) -> Result<MbRecording, ExternalError> {
self.rate_limiter.wait().await;
let resp = self.client
.get(format!("{}/recording/{}", MB_API, mbid))
.query(&[
("inc", "artist-credits+releases+release-groups"),
("fmt", "json"),
])
.send()
.await?;
Ok(resp.json().await?)
}
}
struct RateLimiter {
interval: Duration,
last_request: Mutex<Instant>,
}
impl RateLimiter {
fn new(interval: Duration) -> Self {
Self {
interval,
last_request: Mutex::new(Instant::now() - interval),
}
}
async fn wait(&self) {
let mut last = self.last_request.lock().await;
let elapsed = last.elapsed();
if elapsed < self.interval {
tokio::time::sleep(self.interval - elapsed).await;
}
*last = Instant::now();
}
}
```
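The MusicBrainz search endpoint takes Lucene query syntax, so a title containing quotes or backslashes can change the meaning of the query. A minimal sketch of building the query with quoted, escaped phrases follows. `quote_phrase` and `build_recording_query` are hypothetical helpers, not part of the planned client API.

```rust
/// Wrap a value in double quotes, escaping embedded quotes and backslashes
/// so the value is treated as a single Lucene phrase.
fn quote_phrase(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        if ch == '"' || ch == '\\' {
            out.push('\\');
        }
        out.push(ch);
    }
    out.push('"');
    out
}

/// Build the `query` parameter for a recording search.
pub fn build_recording_query(title: &str, artist: Option<&str>) -> String {
    let mut query = format!("recording:{}", quote_phrase(title));
    if let Some(artist) = artist {
        query.push_str(" AND artist:");
        query.push_str(&quote_phrase(artist));
    }
    query
}
```

URL encoding is still left to `reqwest`'s `.query(...)`; this step only protects the Lucene layer.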
---
## Task 3: Discogs Client (`musicfs-external/src/discogs.rs`)
```rust
const DISCOGS_API: &str = "https://api.discogs.com";
pub struct DiscogsClient {
client: reqwest::Client,
token: Option<String>,
rate_limiter: RateLimiter, // 60 req/min authenticated
}
impl DiscogsClient {
pub fn new(token: Option<String>) -> Self;
/// Search releases (FR-21.2)
pub async fn search(
&self,
query: &str,
artist: Option<&str>,
) -> Result<Vec<DiscogsRelease>, ExternalError>;
/// Get master release details
pub async fn get_master(&self, id: u64) -> Result<DiscogsMaster, ExternalError>;
/// Get release images
pub async fn get_images(&self, release_id: u64) -> Result<Vec<DiscogsImage>, ExternalError>;
}
#[derive(Debug, Deserialize)]
pub struct DiscogsRelease {
pub id: u64,
pub title: String,
pub year: Option<u16>,
pub thumb: Option<String>,
pub master_id: Option<u64>,
}
#[derive(Debug, Deserialize)]
pub struct DiscogsImage {
pub uri: String,
pub width: u32,
pub height: u32,
#[serde(rename = "type")]
pub image_type: String, // "primary" or "secondary"
}
```
---
## Task 4: Last.fm Client (`musicfs-external/src/lastfm.rs`)
```rust
const LASTFM_API: &str = "https://ws.audioscrobbler.com/2.0";
pub struct LastFmClient {
client: reqwest::Client,
api_key: String,
}
impl LastFmClient {
pub fn new(api_key: String) -> Self;
/// Get track info with play counts, tags (FR-21.3)
pub async fn get_track_info(
&self,
track: &str,
artist: &str,
) -> Result<LastFmTrack, ExternalError>;
/// Get album info with artwork
pub async fn get_album_info(
&self,
album: &str,
artist: &str,
) -> Result<LastFmAlbum, ExternalError>;
/// Get artist info
pub async fn get_artist_info(&self, artist: &str) -> Result<LastFmArtist, ExternalError>;
}
#[derive(Debug, Deserialize)]
pub struct LastFmTrack {
pub name: String,
pub playcount: Option<u64>,
pub listeners: Option<u64>,
pub duration: Option<u64>,
pub toptags: Option<Tags>,
pub album: Option<LastFmAlbumRef>,
}
#[derive(Debug, Deserialize)]
pub struct LastFmAlbum {
pub name: String,
pub artist: String,
pub image: Vec<LastFmImage>,
pub tracks: Option<Tracks>,
}
#[derive(Debug, Deserialize)]
pub struct LastFmImage {
#[serde(rename = "#text")]
pub url: String,
pub size: String, // "small", "medium", "large", "extralarge", "mega"
}
```
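Since `LastFmAlbum.image` is a list keyed by a size label, the artwork path needs a rule for picking one entry. The sketch below ranks the labels from the comment above and skips entries with an empty URL, on the assumption that the API may return empty strings for missing images. `best_image` and `size_rank` are hypothetical helpers, and the struct omits the serde attributes.

```rust
pub struct LastFmImage {
    pub url: String,
    pub size: String,
}

/// Order the documented size labels; unknown labels rank lowest.
fn size_rank(size: &str) -> u8 {
    match size {
        "small" => 1,
        "medium" => 2,
        "large" => 3,
        "extralarge" => 4,
        "mega" => 5,
        _ => 0,
    }
}

/// Pick the largest image that actually has a URL.
pub fn best_image(images: &[LastFmImage]) -> Option<&LastFmImage> {
    images
        .iter()
        .filter(|img| !img.url.is_empty())
        .max_by_key(|img| size_rank(&img.size))
}
```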
---
## Task 5: AcoustID/Chromaprint (`musicfs-external/src/acoustid.rs`)
```rust
use chromaprint::{Fingerprinter, Configuration};
const ACOUSTID_API: &str = "https://api.acoustid.org/v2/lookup";
pub struct AcoustIdClient {
client: reqwest::Client,
api_key: String,
}
impl AcoustIdClient {
pub fn new(api_key: String) -> Self;
/// Generate fingerprint from audio data (FR-21.4)
pub fn fingerprint(&self, samples: &[i16], sample_rate: u32) -> Result<String, ExternalError> {
// AcoustID expects Chromaprint's default algorithm (TEST2).
let config = Configuration::preset_test2();
let mut fp = Fingerprinter::new(&config);
fp.start(sample_rate, 1)?; // mono
fp.feed(samples)?;
fp.finish()?;
Ok(fp.fingerprint().to_string())
}
/// Lookup fingerprint on AcoustID database
pub async fn lookup(
&self,
fingerprint: &str,
duration: u32,
) -> Result<Vec<AcoustIdResult>, ExternalError> {
let resp = self.client
.get(ACOUSTID_API)
.query(&[
("client", self.api_key.as_str()),
("fingerprint", fingerprint),
("duration", &duration.to_string()),
("meta", "recordings+releasegroups"),
])
.send()
.await?;
let body: AcoustIdResponse = resp.json().await?;
Ok(body.results)
}
}
#[derive(Debug, Deserialize)]
pub struct AcoustIdResult {
pub id: String,
pub score: f32,
pub recordings: Option<Vec<AcoustIdRecording>>,
}
#[derive(Debug, Deserialize)]
pub struct AcoustIdRecording {
pub id: String, // MusicBrainz recording ID
pub title: Option<String>,
pub artists: Option<Vec<AcoustIdArtist>>,
}
```
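`fingerprint` starts the fingerprinter with one channel, so decoded stereo audio has to be reduced to mono first. A minimal sketch of that step, assuming interleaved `i16` samples (`downmix_to_mono` is a hypothetical helper):

```rust
/// Average interleaved frames down to a single channel.
/// A trailing partial frame, if any, is dropped.
pub fn downmix_to_mono(interleaved: &[i16], channels: usize) -> Vec<i16> {
    if channels <= 1 {
        return interleaved.to_vec();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| {
            // Sum in i32 so that loud frames cannot overflow i16.
            let sum: i32 = frame.iter().map(|&s| i32::from(s)).sum();
            (sum / channels as i32) as i16
        })
        .collect()
}
```

An alternative would be to pass the real channel count to the fingerprinter and let it handle the mixing, if the library in use supports that.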
---
## Task 6: Online Artwork Fetch (`musicfs-external/src/artwork_fetch.rs`)
```rust
pub struct ArtworkFetcher {
musicbrainz: MusicBrainzClient,
discogs: Option<DiscogsClient>,
lastfm: Option<LastFmClient>,
}
impl ArtworkFetcher {
/// Fetch missing artwork from online sources (FR-16.5)
/// Tries sources in order: MusicBrainz Cover Art Archive → Discogs → Last.fm
pub async fn fetch_artwork(
&self,
artist: &str,
album: &str,
size: ArtworkSize,
) -> Result<Option<ArtworkData>, ExternalError> {
// 1. Try MusicBrainz release search → Cover Art Archive
if let Some(art) = self.try_musicbrainz(artist, album, size).await? {
return Ok(Some(art));
}
// 2. Try Discogs
if let Some(discogs) = &self.discogs {
if let Some(art) = self.try_discogs(discogs, artist, album, size).await? {
return Ok(Some(art));
}
}
// 3. Try Last.fm
if let Some(lastfm) = &self.lastfm {
if let Some(art) = self.try_lastfm(lastfm, artist, album, size).await? {
return Ok(Some(art));
}
}
Ok(None)
}
async fn try_musicbrainz(
&self,
artist: &str,
album: &str,
size: ArtworkSize,
) -> Result<Option<ArtworkData>, ExternalError> {
// Search for release, get cover art from Cover Art Archive
let releases = self.musicbrainz.search_release(album, Some(artist)).await?;
for release in releases.iter().take(3) {
if let Some(art) = self.musicbrainz.get_cover_art(&release.id).await? {
return Ok(Some(ArtworkData {
data: art,
source: ArtworkSource::MusicBrainz,
mime_type: "image/jpeg".to_string(),
}));
}
}
Ok(None)
}
}
#[derive(Debug)]
pub struct ArtworkData {
pub data: Vec<u8>,
pub source: ArtworkSource,
pub mime_type: String,
}
#[derive(Debug)]
pub enum ArtworkSource {
MusicBrainz,
Discogs,
LastFm,
Embedded,
}
pub enum ArtworkSize {
Small, // 150px
Medium, // 300px
Large, // 500px
Original,
}
```
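The `artwork.preferred_size` setting is a string, while the fetcher takes an `ArtworkSize`. A minimal sketch of the mapping in both directions, using the pixel values from the enum comments (`max_edge_px` and the `FromStr` impl are assumptions, not part of the plan):

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtworkSize {
    Small,
    Medium,
    Large,
    Original,
}

impl ArtworkSize {
    /// Target edge length in pixels; `None` means "do not resize".
    pub fn max_edge_px(self) -> Option<u32> {
        match self {
            ArtworkSize::Small => Some(150),
            ArtworkSize::Medium => Some(300),
            ArtworkSize::Large => Some(500),
            ArtworkSize::Original => None,
        }
    }
}

impl FromStr for ArtworkSize {
    type Err = String;

    /// Parse the config value case-insensitively.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "small" => Ok(ArtworkSize::Small),
            "medium" => Ok(ArtworkSize::Medium),
            "large" => Ok(ArtworkSize::Large),
            "original" => Ok(ArtworkSize::Original),
            other => Err(format!("unknown artwork size: {other}")),
        }
    }
}
```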
---
## Task 7: Metadata Enrichment (`musicfs-external/src/enrichment.rs`)
```rust
pub struct MetadataEnricher {
musicbrainz: MusicBrainzClient,
acoustid: Option<AcoustIdClient>,
artwork_fetcher: ArtworkFetcher,
}
impl MetadataEnricher {
/// Enrich metadata from external sources
pub async fn enrich(&self, meta: &AudioMeta) -> Result<EnrichedMetadata, ExternalError> {
let mut enriched = EnrichedMetadata::from(meta);
// If we have title + artist, search MusicBrainz
if let (Some(title), Some(artist)) = (&meta.title, &meta.artist) {
let recordings = self.musicbrainz.search_recording(title, Some(artist)).await?;
if let Some(best) = recordings.first() {
enriched.musicbrainz_recording_id = Some(best.id.clone());
// Enrich with release info
if let Some(releases) = &best.releases {
if let Some(release) = releases.first() {
enriched.musicbrainz_release_id = Some(release.id.clone());
}
}
}
}
Ok(enriched)
}
/// Identify unknown track by audio fingerprint
pub async fn identify_by_fingerprint(
&self,
samples: &[i16],
sample_rate: u32,
duration: u32,
) -> Result<Option<IdentifiedTrack>, ExternalError> {
let acoustid = self.acoustid.as_ref()
.ok_or(ExternalError::ServiceNotConfigured("AcoustID"))?;
let fingerprint = acoustid.fingerprint(samples, sample_rate)?;
let results = acoustid.lookup(&fingerprint, duration).await?;
// Return the first match above the score threshold
let best = results.into_iter()
.filter(|r| r.score > 0.8)
.flat_map(|r| r.recordings)
.flatten()
.next()
.map(|rec| IdentifiedTrack {
title: rec.title,
musicbrainz_id: Some(rec.id),
artists: rec.artists.map(|a| a.into_iter().map(|x| x.name).collect()),
});
Ok(best)
}
}
#[derive(Debug)]
pub struct EnrichedMetadata {
pub original: AudioMeta,
pub musicbrainz_recording_id: Option<String>,
pub musicbrainz_release_id: Option<String>,
pub musicbrainz_artist_id: Option<String>,
pub genres: Vec<String>,
pub play_count: Option<u64>,
}
#[derive(Debug)]
pub struct IdentifiedTrack {
pub title: Option<String>,
pub musicbrainz_id: Option<String>,
pub artists: Option<Vec<String>>,
}
```
---
## Configuration
```toml
[external]
# MusicBrainz (no auth required, rate limited to 1 req/sec)
musicbrainz.enabled = true
# Discogs (optional, requires token for higher rate limits)
discogs.enabled = true
discogs.token = "your_discogs_token"
# Last.fm (requires API key)
lastfm.enabled = true
lastfm.api_key = "your_lastfm_api_key"
# AcoustID (requires API key)
acoustid.enabled = true
acoustid.api_key = "your_acoustid_api_key"
# Artwork fetching behavior
artwork.fetch_missing = true
artwork.cache_fetched = true
artwork.preferred_size = "large" # small, medium, large, original
```
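The 1 req/sec MusicBrainz limit mentioned above (and the `test_rate_limiting` entry in the test table) suggests a minimum-interval limiter. The sketch below is one possible shape under stated assumptions: `MinIntervalLimiter` and `reserve` are hypothetical names, and the current time is passed in so the logic can be tested without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical limiter that allows at most one request per `interval`.
pub struct MinIntervalLimiter {
    interval: Duration,
    /// Earliest instant at which the next request may be sent.
    next_allowed: Option<Instant>,
}

impl MinIntervalLimiter {
    pub fn new(interval: Duration) -> Self {
        Self { interval, next_allowed: None }
    }

    /// Reserves the next request slot and returns how long the caller
    /// should wait, measured from `now`, before sending the request.
    pub fn reserve(&mut self, now: Instant) -> Duration {
        // The slot is either "right now" or the previously reserved time,
        // whichever is later.
        let slot = match self.next_allowed {
            Some(t) if t > now => t,
            _ => now,
        };
        self.next_allowed = Some(slot + self.interval);
        slot.duration_since(now)
    }
}
```

An async client would typically sleep for the returned duration (for example with `tokio::time::sleep`) before issuing the HTTP request.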
---
## Tests
| Test | Type | Validates |
|------|------|-----------|
| `test_musicbrainz_search` | Integration | Recording search (FR-21.1) |
| `test_musicbrainz_cover_art` | Integration | Cover Art Archive |
| `test_discogs_search` | Integration | Release search (FR-21.2) |
| `test_lastfm_track_info` | Integration | Track metadata (FR-21.3) |
| `test_acoustid_fingerprint` | Unit | Chromaprint generation |
| `test_acoustid_lookup` | Integration | Fingerprint lookup (FR-21.4) |
| `test_artwork_fetch_cascade` | Integration | Multi-source artwork (FR-16.5) |
| `test_metadata_enrichment` | Integration | Full enrichment flow |
| `test_rate_limiting` | Unit | Rate limiter works |
| `test_mock_responses` | Unit | Offline testing with mocks |
---
## Exit Criteria
- [ ] MusicBrainz search returns relevant recordings
- [ ] Cover Art Archive artwork downloads work
- [ ] Discogs integration retrieves release info
- [ ] Last.fm integration retrieves track/artist info
- [ ] AcoustID fingerprinting identifies tracks
- [ ] Artwork fetcher tries all sources in cascade
- [ ] Metadata enricher adds external IDs
- [ ] Rate limiting prevents API abuse
- [ ] All tests pass with mock HTTP responses
---
## Architecture Alignment
Per requirements.md:
- FR-21.1: MusicBrainz for canonical metadata ✓
- FR-21.2: Discogs for release info, artwork ✓
- FR-21.3: Last.fm for play counts, tags ✓
- FR-21.4: AcoustID for audio fingerprinting ✓
- FR-16.5: Fetch missing artwork from online ✓
Per architecture.md section 4.3.4:
- External metadata via `MetadataPlugin` trait ✓
- Plugin architecture allows adding more sources ✓
+699
View File
@@ -0,0 +1,699 @@
# Week 13: Import & Export
**Phase**: 5 - P1 Feature Completion
**Goal**: Import metadata from existing library managers, export library data
**Requirements**: FR-22.1-22.3
---
## Deliverables
| Task | Crate | Files | Requirements |
|------|-------|-------|--------------|
| Beets database import | musicfs-import | `beets.rs` | FR-22.1 |
| iTunes/Apple Music import | musicfs-import | `itunes.rs` | FR-22.2 |
| Library export | musicfs-import | `export.rs` | FR-22.3 |
| Import CLI | musicfs-cli | `import.rs` | All |
---
## Task 1: Create `musicfs-import` Crate
### 1.1 `Cargo.toml`
```toml
[package]
name = "musicfs-import"
version.workspace = true
edition.workspace = true
[dependencies]
musicfs-core = { path = "../musicfs-core" }
musicfs-cache = { path = "../musicfs-cache" }
rusqlite = { workspace = true, features = ["bundled"] }
serde = { workspace = true, features = ["derive"] }
serde_json.workspace = true
plist = "1.5" # For iTunes XML parsing
tokio.workspace = true
tracing.workspace = true
thiserror.workspace = true
csv = "1.3"
url = "2.4"
percent-encoding = "2.3"
chrono = { version = "0.4", features = ["serde"] }
[dev-dependencies]
tempfile.workspace = true
```
### 1.2 `src/lib.rs`
```rust
pub mod beets;
pub mod itunes;
pub mod export;
use musicfs_core::Result;
/// Common import result
#[derive(Debug, Default)]
pub struct ImportResult {
pub imported: usize,
pub skipped: usize,
pub errors: Vec<ImportError>,
}
#[derive(Debug)]
pub struct ImportError {
pub path: String,
pub reason: String,
}
/// Import progress callback
pub type ProgressCallback = Box<dyn Fn(ImportProgress) + Send>;
#[derive(Debug, Clone)]
pub struct ImportProgress {
pub current: usize,
pub total: usize,
pub current_file: String,
}
```
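The importers in Tasks 2 and 3 use `ImportError` in two different ways: as a per-file record with `path` and `reason` fields, and as an error enum with variants such as `InvalidDatabase` and `InvalidFormat` that is returned through `?`. A single type cannot serve both roles. The sketch below shows one way to separate them using only the standard library; `ImportFailure` and `open_library` are hypothetical names, and the `thiserror` dependency already listed in `Cargo.toml` could replace the manual trait impls.

```rust
use std::fmt;

/// Per-file failure record collected during an import.
/// (Hypothetical name; the plan calls this struct `ImportError`.)
#[derive(Debug)]
pub struct ImportFailure {
    pub path: String,
    pub reason: String,
}

/// Errors that abort an import as a whole.
#[derive(Debug)]
pub enum ImportError {
    InvalidDatabase(&'static str),
    InvalidFormat(&'static str),
    Io(std::io::Error),
}

impl fmt::Display for ImportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ImportError::InvalidDatabase(msg) => write!(f, "invalid database: {msg}"),
            ImportError::InvalidFormat(msg) => write!(f, "invalid format: {msg}"),
            ImportError::Io(e) => write!(f, "I/O error: {e}"),
        }
    }
}

impl std::error::Error for ImportError {}

/// Lets `?` convert I/O errors, as `ItunesImporter::from_xml` relies on.
impl From<std::io::Error> for ImportError {
    fn from(e: std::io::Error) -> Self {
        ImportError::Io(e)
    }
}

/// Hypothetical helper showing the `?` conversion in action.
pub fn open_library(path: &std::path::Path) -> Result<std::fs::File, ImportError> {
    Ok(std::fs::File::open(path)?)
}
```

With this split, `ImportResult::errors` would hold `Vec<ImportFailure>`, and similar `From` impls would be needed for the `rusqlite` and `plist` error types.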
---
## Task 2: Beets Database Import (`musicfs-import/src/beets.rs`)
```rust
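use rusqlite::Connection;
use std::path::Path;
use std::sync::Arc;
// Module paths below are assumed; adjust to the actual crate layout.
use musicfs_cache::Database;
use musicfs_core::{AudioFormat, AudioMeta, OriginId, VirtualPath};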
/// Beets database schema (simplified)
/// Full schema: https://beets.readthedocs.io/en/stable/dev/db.html
#[derive(Debug)]
pub struct BeetsItem {
pub id: i64,
pub path: String,
pub title: Option<String>,
pub artist: Option<String>,
pub album: Option<String>,
pub album_artist: Option<String>,
pub genre: Option<String>,
pub year: Option<i32>,
pub track: Option<i32>,
pub disc: Option<i32>,
pub length: Option<f64>,
pub bitrate: Option<i32>,
pub sample_rate: Option<i32>,
pub format: Option<String>,
pub mb_trackid: Option<String>,
pub mb_albumid: Option<String>,
pub mb_artistid: Option<String>,
pub mtime: f64,
}
pub struct BeetsImporter {
beets_db: Connection,
target_db: Arc<Database>,
}
impl BeetsImporter {
/// Open beets database for import (FR-22.1)
pub fn new(beets_db_path: &Path, target_db: Arc<Database>) -> Result<Self, ImportError> {
let conn = Connection::open_with_flags(
beets_db_path,
rusqlite::OpenFlags::SQLITE_OPEN_READ_ONLY,
)?;
// Verify this is a beets database
let tables: Vec<String> = conn
.prepare("SELECT name FROM sqlite_master WHERE type='table'")?
.query_map([], |row| row.get(0))?
.filter_map(|r| r.ok())
.collect();
if !tables.contains(&"items".to_string()) {
return Err(ImportError::InvalidDatabase("Not a beets database"));
}
Ok(Self {
beets_db: conn,
target_db,
})
}
/// Count items to import
pub fn count_items(&self) -> Result<usize, ImportError> {
self.beets_db
.query_row("SELECT COUNT(*) FROM items", [], |row| row.get(0))
.map_err(Into::into)
}
/// Import all items with progress callback
pub fn import_all(&self, progress: Option<ProgressCallback>) -> Result<ImportResult, ImportError> {
let total = self.count_items()?;
let mut result = ImportResult::default();
let mut stmt = self.beets_db.prepare(r#"
SELECT id, path, title, artist, album, albumartist, genre,
year, track, disc, length, bitrate, samplerate, format,
mb_trackid, mb_albumid, mb_artistid, mtime
FROM items
"#)?;
let items = stmt.query_map([], |row| {
Ok(BeetsItem {
id: row.get(0)?,
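// beets stores `path` as a BLOB rather than TEXT, so read the raw bytes
path: String::from_utf8_lossy(row.get_ref(1)?.as_bytes()?).into_owned(),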
title: row.get(2)?,
artist: row.get(3)?,
album: row.get(4)?,
album_artist: row.get(5)?,
genre: row.get(6)?,
year: row.get(7)?,
track: row.get(8)?,
disc: row.get(9)?,
length: row.get(10)?,
bitrate: row.get(11)?,
sample_rate: row.get(12)?,
format: row.get(13)?,
mb_trackid: row.get(14)?,
mb_albumid: row.get(15)?,
mb_artistid: row.get(16)?,
mtime: row.get(17)?,
})
})?;
for (idx, item) in items.enumerate() {
match item {
Ok(item) => {
if let Some(ref cb) = progress {
cb(ImportProgress {
current: idx + 1,
total,
current_file: item.path.clone(),
});
}
match self.import_item(&item) {
Ok(_) => result.imported += 1,
Err(e) => {
result.errors.push(ImportError {
path: item.path,
reason: e.to_string(),
});
}
}
}
Err(e) => {
result.skipped += 1;
result.errors.push(ImportError {
path: format!("item_{}", idx),
reason: e.to_string(),
});
}
}
}
Ok(result)
}
fn import_item(&self, item: &BeetsItem) -> Result<(), ImportError> {
let path = Path::new(&item.path);
// Convert to our AudioMeta
let audio_meta = AudioMeta {
title: item.title.clone(),
artist: item.artist.clone(),
album: item.album.clone(),
album_artist: item.album_artist.clone(),
genre: item.genre.clone(),
year: item.year.map(|y| y as u32),
track: item.track.map(|t| t as u32),
disc: item.disc.map(|d| d as u32),
duration_ms: item.length.map(|l| (l * 1000.0) as u64),
bitrate: item.bitrate.map(|b| b as u32),
sample_rate: item.sample_rate.map(|s| s as u32),
format: AudioFormat::from_extension(
path.extension().and_then(|e| e.to_str()).unwrap_or("")
),
..Default::default()
};
// Generate virtual path using our resolver
let virtual_path = VirtualPath::from_metadata(&audio_meta, path);
// Import to our database
self.target_db.upsert_file(
&OriginId::from("beets-import"),
path,
&virtual_path,
&audio_meta,
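// `from_secs_f64` panics on negative or non-finite input; fall back to the epoch instead
std::time::UNIX_EPOCH + std::time::Duration::try_from_secs_f64(item.mtime).unwrap_or_default(),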
std::fs::metadata(path).map(|m| m.len()).unwrap_or(0),
)?;
Ok(())
}
}
```
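The `mtime` column is a floating-point number of seconds since the Unix epoch, so converting it to a `SystemTime` has to cope with negative, NaN, or very large values. A minimal sketch of a defensive conversion follows; `mtime_to_system_time` is a hypothetical helper, and falling back to the epoch is an assumption about the desired behavior.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Converts a float timestamp (seconds since the Unix epoch) to `SystemTime`.
/// Negative, NaN, or out-of-range values fall back to `UNIX_EPOCH`
/// instead of panicking.
pub fn mtime_to_system_time(secs: f64) -> SystemTime {
    Duration::try_from_secs_f64(secs)
        .ok()
        .and_then(|d| UNIX_EPOCH.checked_add(d))
        .unwrap_or(UNIX_EPOCH)
}
```

A stored mtime of the epoch then simply looks "old" to any staleness check, which should trigger a rescan rather than a crash.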
---
## Task 3: iTunes/Apple Music Import (`musicfs-import/src/itunes.rs`)
```rust
use plist::Value;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use url::Url;
/// iTunes Library XML format
#[derive(Debug)]
pub struct ItunesTrack {
pub track_id: u64,
pub name: Option<String>,
pub artist: Option<String>,
pub album: Option<String>,
pub album_artist: Option<String>,
pub genre: Option<String>,
pub year: Option<u32>,
pub track_number: Option<u32>,
pub disc_number: Option<u32>,
pub total_time: Option<u64>, // milliseconds
pub bit_rate: Option<u32>,
pub sample_rate: Option<u32>,
pub location: Option<String>, // file:// URL
pub date_added: Option<String>,
}
pub struct ItunesImporter {
tracks: Vec<ItunesTrack>,
target_db: Arc<Database>,
}
impl ItunesImporter {
/// Parse iTunes Library.xml (FR-22.2)
pub fn from_xml(xml_path: &Path, target_db: Arc<Database>) -> Result<Self, ImportError> {
let file = std::fs::File::open(xml_path)?;
let plist: Value = plist::from_reader(file)?;
let dict = plist.as_dictionary()
.ok_or(ImportError::InvalidFormat("Expected dictionary at root"))?;
let tracks_dict = dict.get("Tracks")
.and_then(|v| v.as_dictionary())
.ok_or(ImportError::InvalidFormat("Missing Tracks dictionary"))?;
let mut tracks = Vec::new();
for (_, track_value) in tracks_dict {
if let Some(track_dict) = track_value.as_dictionary() {
tracks.push(Self::parse_track(track_dict)?);
}
}
Ok(Self { tracks, target_db })
}
fn parse_track(dict: &plist::Dictionary) -> Result<ItunesTrack, ImportError> {
Ok(ItunesTrack {
track_id: dict.get("Track ID")
.and_then(|v| v.as_unsigned_integer())
.unwrap_or(0),
name: dict.get("Name").and_then(|v| v.as_string()).map(String::from),
artist: dict.get("Artist").and_then(|v| v.as_string()).map(String::from),
album: dict.get("Album").and_then(|v| v.as_string()).map(String::from),
album_artist: dict.get("Album Artist").and_then(|v| v.as_string()).map(String::from),
genre: dict.get("Genre").and_then(|v| v.as_string()).map(String::from),
year: dict.get("Year").and_then(|v| v.as_unsigned_integer()).map(|v| v as u32),
track_number: dict.get("Track Number").and_then(|v| v.as_unsigned_integer()).map(|v| v as u32),
disc_number: dict.get("Disc Number").and_then(|v| v.as_unsigned_integer()).map(|v| v as u32),
total_time: dict.get("Total Time").and_then(|v| v.as_unsigned_integer()),
bit_rate: dict.get("Bit Rate").and_then(|v| v.as_unsigned_integer()).map(|v| v as u32),
sample_rate: dict.get("Sample Rate").and_then(|v| v.as_unsigned_integer()).map(|v| v as u32),
location: dict.get("Location").and_then(|v| v.as_string()).map(String::from),
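// "Date Added" is a plist <date> value, not a string, so `as_string()` would always yield None.
// (`to_xml_format` is assumed to be available in the pinned plist version.)
date_added: dict.get("Date Added").and_then(|v| v.as_date()).map(|d| d.to_xml_format()),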
})
}
/// Convert file:// URL to path
fn url_to_path(url_str: &str) -> Option<PathBuf> {
Url::parse(url_str).ok()
.filter(|u| u.scheme() == "file")
.and_then(|u| u.to_file_path().ok())
}
pub fn count_tracks(&self) -> usize {
self.tracks.len()
}
/// Import all tracks
pub fn import_all(&self, progress: Option<ProgressCallback>) -> Result<ImportResult, ImportError> {
let total = self.tracks.len();
let mut result = ImportResult::default();
for (idx, track) in self.tracks.iter().enumerate() {
if let Some(ref cb) = progress {
cb(ImportProgress {
current: idx + 1,
total,
current_file: track.name.clone().unwrap_or_default(),
});
}
// Skip tracks without location
let Some(ref location) = track.location else {
result.skipped += 1;
continue;
};
let Some(path) = Self::url_to_path(location) else {
result.skipped += 1;
result.errors.push(ImportError {
path: location.clone(),
reason: "Invalid file URL".to_string(),
});
continue;
};
match self.import_track(track, &path) {
Ok(_) => result.imported += 1,
Err(e) => {
result.errors.push(ImportError {
path: path.display().to_string(),
reason: e.to_string(),
});
}
}
}
Ok(result)
}
fn import_track(&self, track: &ItunesTrack, path: &Path) -> Result<(), ImportError> {
let audio_meta = AudioMeta {
title: track.name.clone(),
artist: track.artist.clone(),
album: track.album.clone(),
album_artist: track.album_artist.clone(),
genre: track.genre.clone(),
year: track.year,
track: track.track_number,
disc: track.disc_number,
duration_ms: track.total_time,
bitrate: track.bit_rate,
sample_rate: track.sample_rate,
format: AudioFormat::from_extension(
path.extension().and_then(|e| e.to_str()).unwrap_or("")
),
..Default::default()
};
let virtual_path = VirtualPath::from_metadata(&audio_meta, path);
let mtime = std::fs::metadata(path)
.map(|m| m.modified().unwrap_or(std::time::UNIX_EPOCH))
.unwrap_or(std::time::UNIX_EPOCH);
let size = std::fs::metadata(path).map(|m| m.len()).unwrap_or(0);
self.target_db.upsert_file(
&OriginId::from("itunes-import"),
path,
&virtual_path,
&audio_meta,
mtime,
size,
)?;
Ok(())
}
}
```
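To make the behavior expected from `url_to_path` (and `test_itunes_url_to_path`) concrete, the sketch below re-implements the conversion with the standard library only. It is a stand-in for illustration, not a replacement for `Url::to_file_path`: it handles Unix-style paths only, and `file_url_to_path` is a hypothetical name. It assumes `Location` values may carry either an empty host or `localhost`.

```rust
use std::path::PathBuf;

/// Converts a `file://` URL into a path, percent-decoding as it goes.
/// Returns `None` for other schemes, other hosts, or malformed escapes.
pub fn file_url_to_path(url: &str) -> Option<PathBuf> {
    let rest = url.strip_prefix("file://")?;
    // Accept an empty host or `localhost`; the path must then start with '/'.
    let path = rest.strip_prefix("localhost").unwrap_or(rest);
    if !path.starts_with('/') {
        return None;
    }

    let bytes = path.as_bytes();
    let mut decoded = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // A percent escape needs exactly two hex digits after '%'.
            let hex = bytes.get(i + 1..i + 3)?;
            if !hex.iter().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            let hex = std::str::from_utf8(hex).ok()?;
            decoded.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            decoded.push(bytes[i]);
            i += 1;
        }
    }
    // Paths that are not valid UTF-8 after decoding are rejected here.
    String::from_utf8(decoded).ok().map(PathBuf::from)
}
```

Tracks whose location cannot be converted are the ones `import_all` counts as skipped with an "Invalid file URL" error.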
---
## Task 4: Library Export (`musicfs-import/src/export.rs`)
```rust
use csv::Writer;
use serde::Serialize;
use std::path::Path;
use std::sync::Arc;
#[derive(Debug, Serialize)]
pub struct ExportedTrack {
pub virtual_path: String,
pub real_path: String,
pub title: String,
pub artist: String,
pub album: String,
pub album_artist: String,
pub genre: String,
pub year: Option<u32>,
pub track: Option<u32>,
pub disc: Option<u32>,
pub duration_ms: Option<u64>,
pub format: String,
pub musicbrainz_id: Option<String>,
}
pub struct LibraryExporter {
db: Arc<Database>,
}
impl LibraryExporter {
pub fn new(db: Arc<Database>) -> Self {
Self { db }
}
/// Export library to CSV (FR-22.3)
pub fn export_csv(&self, output: &Path) -> Result<usize, ExportError> {
let files = self.db.list_all_files()?;
let mut writer = Writer::from_path(output)?;
let mut count = 0;
for file in files {
let audio = file.audio.as_ref();
writer.serialize(ExportedTrack {
virtual_path: file.virtual_path.as_str().to_string(),
real_path: file.real_path.path.display().to_string(),
title: audio.and_then(|a| a.title.clone()).unwrap_or_default(),
artist: audio.and_then(|a| a.artist.clone()).unwrap_or_default(),
album: audio.and_then(|a| a.album.clone()).unwrap_or_default(),
album_artist: audio.and_then(|a| a.album_artist.clone()).unwrap_or_default(),
genre: audio.and_then(|a| a.genre.clone()).unwrap_or_default(),
year: audio.and_then(|a| a.year),
track: audio.and_then(|a| a.track),
disc: audio.and_then(|a| a.disc),
duration_ms: audio.and_then(|a| a.duration_ms),
format: audio.map(|a| format!("{:?}", a.format)).unwrap_or_default(),
musicbrainz_id: None, // TODO: Include if enriched
})?;
count += 1;
}
writer.flush()?;
Ok(count)
}
/// Export library to JSON
pub fn export_json(&self, output: &Path) -> Result<usize, ExportError> {
let files = self.db.list_all_files()?;
let tracks: Vec<ExportedTrack> = files.iter()
.map(|file| {
let audio = file.audio.as_ref();
ExportedTrack {
virtual_path: file.virtual_path.as_str().to_string(),
real_path: file.real_path.path.display().to_string(),
title: audio.and_then(|a| a.title.clone()).unwrap_or_default(),
artist: audio.and_then(|a| a.artist.clone()).unwrap_or_default(),
album: audio.and_then(|a| a.album.clone()).unwrap_or_default(),
album_artist: audio.and_then(|a| a.album_artist.clone()).unwrap_or_default(),
genre: audio.and_then(|a| a.genre.clone()).unwrap_or_default(),
year: audio.and_then(|a| a.year),
track: audio.and_then(|a| a.track),
disc: audio.and_then(|a| a.disc),
duration_ms: audio.and_then(|a| a.duration_ms),
format: audio.map(|a| format!("{:?}", a.format)).unwrap_or_default(),
musicbrainz_id: None,
}
})
.collect();
let json = serde_json::to_string_pretty(&tracks)?;
std::fs::write(output, json)?;
Ok(tracks.len())
}
/// Export to M3U playlist format
pub fn export_m3u(&self, output: &Path, base_path: Option<&Path>) -> Result<usize, ExportError> {
let files = self.db.list_all_files()?;
let mut content = String::from("#EXTM3U\n");
for file in &files {
let duration = file.audio.as_ref()
.and_then(|a| a.duration_ms)
.map(|d| d / 1000)
.unwrap_or(0);
let title = file.audio.as_ref()
.and_then(|a| a.title.clone())
.unwrap_or_else(|| file.virtual_path.as_str().to_string());
let artist = file.audio.as_ref()
.and_then(|a| a.artist.clone())
.unwrap_or_default();
content.push_str(&format!(
"#EXTINF:{},{} - {}\n",
duration, artist, title
));
// Use virtual path relative to base, or absolute real path
let path = if let Some(base) = base_path {
base.join(file.virtual_path.as_str().trim_start_matches('/'))
.display().to_string()
} else {
file.real_path.path.display().to_string()
};
content.push_str(&path);
content.push('\n');
}
std::fs::write(output, content)?;
Ok(files.len())
}
}
```
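In `export_m3u`, a track with no artist tag produces an `#EXTINF` line that starts with ` - `. The sketch below shows one way to format an entry that drops the separator when the artist is empty; `m3u_entry` is a hypothetical helper, and omitting the separator is an assumption about the preferred output.

```rust
/// Formats one extended-M3U entry: the `#EXTINF` line followed by the path.
/// The "artist - " prefix is omitted when no artist is known.
pub fn m3u_entry(duration_secs: u64, artist: &str, title: &str, path: &str) -> String {
    let display = if artist.is_empty() {
        title.to_string()
    } else {
        format!("{artist} - {title}")
    };
    format!("#EXTINF:{duration_secs},{display}\n{path}\n")
}
```

A complete playlist is then the `#EXTM3U` header line followed by the concatenated entries.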
---
## Task 5: Import CLI Commands (`musicfs-cli/src/import.rs`)
```rust
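use anyhow::Result;
use clap::Subcommand;
use indicatif::ProgressBar; // assumed source of `ProgressBar`
use std::path::PathBuf;
use std::sync::Arc;
#[derive(Subcommand)]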
pub enum ImportCommand {
/// Import from beets database
Beets {
/// Path to beets library.db
#[arg(short, long)]
db: PathBuf,
},
/// Import from iTunes Library.xml
Itunes {
/// Path to iTunes Library.xml
#[arg(short, long)]
xml: PathBuf,
},
/// Export library
Export {
/// Output file path
#[arg(short, long)]
output: PathBuf,
/// Format: csv, json, m3u
#[arg(short, long, default_value = "csv")]
format: String,
},
}
pub async fn handle_import(cmd: ImportCommand, db: Arc<Database>) -> Result<()> {
match cmd {
ImportCommand::Beets { db: beets_path } => {
println!("Importing from beets database: {:?}", beets_path);
let importer = BeetsImporter::new(&beets_path, db)?;
let total = importer.count_items()?;
println!("Found {} items to import", total);
let pb = ProgressBar::new(total as u64);
let result = importer.import_all(Some(Box::new(move |p| {
pb.set_position(p.current as u64);
})))?;
println!("\nImport complete:");
println!(" Imported: {}", result.imported);
println!(" Skipped: {}", result.skipped);
println!(" Errors: {}", result.errors.len());
}
ImportCommand::Itunes { xml } => {
println!("Importing from iTunes Library: {:?}", xml);
let importer = ItunesImporter::from_xml(&xml, db)?;
let total = importer.count_tracks();
println!("Found {} tracks to import", total);
let pb = ProgressBar::new(total as u64);
let result = importer.import_all(Some(Box::new(move |p| {
pb.set_position(p.current as u64);
})))?;
println!("\nImport complete:");
println!(" Imported: {}", result.imported);
println!(" Skipped: {}", result.skipped);
println!(" Errors: {}", result.errors.len());
}
ImportCommand::Export { output, format } => {
let exporter = LibraryExporter::new(db);
let count = match format.as_str() {
"csv" => exporter.export_csv(&output)?,
"json" => exporter.export_json(&output)?,
"m3u" => exporter.export_m3u(&output, None)?,
_ => return Err(anyhow::anyhow!("Unknown format: {}", format)),
};
println!("Exported {} tracks to {:?}", count, output);
}
}
Ok(())
}
```
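The Beets and iTunes arms of `handle_import` print the same three-line summary. A small helper removes the duplication and makes the output testable. This is a sketch only; `format_import_summary` is a hypothetical name and the exact wording and spacing are assumptions.

```rust
/// Builds the summary shown after an import finishes.
pub fn format_import_summary(imported: usize, skipped: usize, errors: usize) -> String {
    format!(
        "Import complete:\n  Imported: {imported}\n  Skipped: {skipped}\n  Errors: {errors}"
    )
}
```

Each arm would then call `println!("\n{}", format_import_summary(result.imported, result.skipped, result.errors.len()))`.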
---
## Tests
| Test | Type | Validates |
|------|------|-----------|
| `test_beets_import_valid` | Integration | Beets database parsing (FR-22.1) |
| `test_beets_import_missing_fields` | Unit | Handle incomplete metadata |
| `test_itunes_xml_parsing` | Unit | iTunes XML parsing (FR-22.2) |
| `test_itunes_url_to_path` | Unit | file:// URL conversion |
| `test_itunes_import_tracks` | Integration | Full iTunes import |
| `test_export_csv` | Unit | CSV export (FR-22.3) |
| `test_export_json` | Unit | JSON export |
| `test_export_m3u` | Unit | M3U playlist export |
| `test_import_preserves_musicbrainz_ids` | Integration | External IDs preserved |
| `test_import_deduplication` | Integration | No duplicates on re-import |
---
## Exit Criteria
- [ ] Beets database import works with real beets.db
- [ ] iTunes Library.xml import parses all tracks
- [ ] CSV/JSON/M3U export generates valid files
- [ ] Progress reporting works during import
- [ ] Errors are reported without crashing
- [ ] Import is idempotent (re-import updates, doesn't duplicate)
- [ ] MusicBrainz IDs from beets are preserved
---
## Architecture Alignment
Per requirements.md:
- FR-22.1: Import from beets database ✓
- FR-22.2: Import from iTunes/Apple Music ✓
- FR-22.3: Export library metadata ✓
+633
View File
@@ -0,0 +1,633 @@
# Week 14: Extended Formats & Audio Fingerprinting
**Phase**: 5 - P1 Feature Completion
**Goal**: Audio fingerprint search and audiobook format support
**Requirements**: FR-14.4, FR-24.2
---
## Deliverables
| Task | Crate | Files | Requirements |
|------|-------|-------|--------------|
| Fingerprint indexing | musicfs-search | `fingerprint.rs` | FR-14.4 |
| Fingerprint search | musicfs-search | `fingerprint_search.rs` | FR-14.4 |
| M4B audiobook support | musicfs-metadata | `formats/m4b.rs` | FR-24.2 |
| Chapter extraction | musicfs-metadata | `chapters.rs` | FR-24.2 |
| Virtual chapter files | musicfs-fuse | `ops/chapters.rs` | FR-24.2 |
---
## Task 1: Audio Fingerprint Generation
### 1.1 Add Dependencies
```toml
# In musicfs-search/Cargo.toml
[dependencies]
chromaprint = "0.6"
symphonia = { version = "0.5", features = ["all"] }
```
### 1.2 Fingerprint Generation (`musicfs-search/src/fingerprint.rs`)
```rust
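use chromaprint::{Configuration, Fingerprinter};
use std::path::Path;
use symphonia::core::audio::SampleBuffer;
use symphonia::core::codecs::{DecoderOptions, CODEC_TYPE_NULL};
use symphonia::core::formats::FormatOptions;
use symphonia::core::io::MediaSourceStream;
use symphonia::core::meta::MetadataOptions;
use symphonia::core::probe::Hint;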
/// Audio fingerprint using Chromaprint algorithm
#[derive(Debug, Clone)]
pub struct AudioFingerprint {
pub raw: Vec<u32>,
pub duration_secs: u32,
}
impl AudioFingerprint {
/// Generate fingerprint from audio file (FR-14.4)
pub fn from_file(path: &Path) -> Result<Self, FingerprintError> {
let file = std::fs::File::open(path)?;
let mss = MediaSourceStream::new(Box::new(file), Default::default());
let probed = symphonia::default::get_probe()
.format(&Hint::new(), mss, &FormatOptions::default(), &MetadataOptions::default())?;
let mut format = probed.format;
let track = format.tracks()
.iter()
.find(|t| t.codec_params.codec != CODEC_TYPE_NULL)
.ok_or(FingerprintError::NoAudioTrack)?;
let sample_rate = track.codec_params.sample_rate
.ok_or(FingerprintError::NoSampleRate)?;
let mut decoder = symphonia::default::get_codecs()
.make(&track.codec_params, &DecoderOptions::default())?;
// Chromaprint configuration
let config = Configuration::preset_test1();
let mut fingerprinter = Fingerprinter::new(&config);
fingerprinter.start(sample_rate, 1)?; // Mono
let mut samples: Vec<i16> = Vec::new();
let mut duration_samples = 0u64;
// Decode and collect samples (first 120 seconds max)
let max_samples = sample_rate as u64 * 120;
let track_id = track.id;
loop {
    match format.next_packet() {
        Ok(packet) => {
            // Skip packets that belong to other tracks
            if packet.track_id() != track_id {
                continue;
            }
            let decoded = decoder.decode(&packet)?;
            // Copy the spec out first: `copy_interleaved_ref` consumes `decoded`
            let spec = *decoded.spec();
            let channels = spec.channels.count();
            let mut sample_buf = SampleBuffer::<i16>::new(decoded.capacity() as u64, spec);
            sample_buf.copy_interleaved_ref(decoded);
            // Downmix to mono by averaging the channels of each frame
            let mono: Vec<i16> = if channels > 1 {
                sample_buf.samples()
                    .chunks(channels)
                    .map(|chunk| (chunk.iter().map(|&s| s as i32).sum::<i32>() / chunk.len() as i32) as i16)
                    .collect()
            } else {
                sample_buf.samples().to_vec()
            };
            samples.extend(&mono);
            duration_samples += mono.len() as u64;
            if duration_samples >= max_samples {
                break;
            }
        }
        Err(symphonia::core::errors::Error::IoError(e))
            if e.kind() == std::io::ErrorKind::UnexpectedEof => break,
        Err(e) => return Err(e.into()),
    }
}
// Feed samples to fingerprinter
fingerprinter.feed(&samples)?;
fingerprinter.finish()?;
let raw = fingerprinter.fingerprint().to_vec();
let duration_secs = (duration_samples / sample_rate as u64) as u32;
Ok(Self { raw, duration_secs })
}
/// Compress fingerprint for storage
pub fn to_bytes(&self) -> Vec<u8> {
// Use chromaprint's compressed format
chromaprint::encode_fingerprint(&self.raw, chromaprint::Algorithm::Test1)
}
/// Decompress fingerprint
pub fn from_bytes(bytes: &[u8]) -> Result<Self, FingerprintError> {
let (raw, _) = chromaprint::decode_fingerprint(bytes)?;
Ok(Self { raw, duration_secs: 0 })
}
}
#[derive(Debug, thiserror::Error)]
pub enum FingerprintError {
#[error("No audio track found")]
NoAudioTrack,
#[error("No sample rate")]
NoSampleRate,
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Decode error: {0}")]
Decode(#[from] symphonia::core::errors::Error), // lets `?` and `.into()` convert Symphonia errors
#[error("Chromaprint error: {0}")]
Chromaprint(String),
}
```
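The stereo-to-mono step inside the decode loop can be isolated and tested on its own. The following is a minimal sketch using only the standard library; `downmix_to_mono` is a hypothetical helper name, and it assumes interleaved 16-bit samples as produced by the `SampleBuffer` above.

```rust
/// Averages each interleaved frame down to a single mono sample.
/// With zero or one channel the input is returned unchanged.
/// A trailing partial frame, if any, is dropped.
pub fn downmix_to_mono(interleaved: &[i16], channels: usize) -> Vec<i16> {
    if channels <= 1 {
        return interleaved.to_vec();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| {
            // Sum in i32 so that adding several i16 values cannot overflow.
            let sum: i32 = frame.iter().map(|&s| i32::from(s)).sum();
            (sum / channels as i32) as i16
        })
        .collect()
}
```

Averaging in `i32` matters: two full-scale samples sum to 65534, which does not fit in an `i16`.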
---
## Task 2: Fingerprint Search (`musicfs-search/src/fingerprint_search.rs`)
```rust
use crate::fingerprint::AudioFingerprint;
/// Fingerprint similarity search using bit-level comparison
pub struct FingerprintIndex {
db: Arc<Database>,
}
impl FingerprintIndex {
pub fn new(db: Arc<Database>) -> Self {
Self { db }
}
/// Index a file's fingerprint
pub fn index(&self, file_id: FileId, fingerprint: &AudioFingerprint) -> Result<(), SearchError> {
let bytes = fingerprint.to_bytes();
self.db.store_fingerprint(file_id, &bytes, fingerprint.duration_secs)?;
Ok(())
}
/// Search by fingerprint similarity (FR-14.4)
pub fn search(
&self,
query: &AudioFingerprint,
threshold: f32, // 0.0-1.0, higher = more similar
limit: usize,
) -> Result<Vec<FingerprintMatch>, SearchError> {
let candidates = self.db.get_fingerprints_by_duration(
query.duration_secs.saturating_sub(10),
query.duration_secs.saturating_add(10),
)?;
let mut matches: Vec<FingerprintMatch> = candidates
.into_iter()
.filter_map(|(file_id, fp_bytes, duration)| {
let fp = AudioFingerprint::from_bytes(&fp_bytes).ok()?;
let similarity = self.compare(&query.raw, &fp.raw);
if similarity >= threshold {
Some(FingerprintMatch { file_id, similarity, duration })
} else {
None
}
})
.collect();
// Sort by similarity descending
// `total_cmp` gives a total order, so a NaN similarity cannot cause a panic
matches.sort_by(|a, b| b.similarity.total_cmp(&a.similarity));
matches.truncate(limit);
Ok(matches)
}
/// Similarity of two fingerprints: the fraction of matching bits
/// (1 - bit error rate) over their overlapping prefix
fn compare(&self, a: &[u32], b: &[u32]) -> f32 {
let len = a.len().min(b.len());
if len == 0 {
return 0.0;
}
let mut matching_bits = 0u32;
let mut total_bits = 0u32;
for i in 0..len {
let xor = a[i] ^ b[i];
matching_bits += 32 - xor.count_ones();
total_bits += 32;
}
matching_bits as f32 / total_bits as f32
}
/// Find duplicates by fingerprint
pub fn find_duplicates(&self, threshold: f32) -> Result<Vec<DuplicateGroup>, SearchError> {
let all_fps = self.db.get_all_fingerprints()?;
let mut groups: Vec<DuplicateGroup> = Vec::new();
let mut processed: HashSet<FileId> = HashSet::new();
for (file_id, fp_bytes, duration) in &all_fps {
if processed.contains(file_id) {
continue;
}
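let mut fp = AudioFingerprint::from_bytes(fp_bytes)?;
// `from_bytes` cannot recover the duration, so restore it from the index row;
// `search` uses it to pre-filter candidates by duration.
fp.duration_secs = *duration;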
let matches = self.search(&fp, threshold, 100)?;
if matches.len() > 1 {
let group = DuplicateGroup {
files: matches.iter().map(|m| m.file_id).collect(),
similarity: matches.iter().map(|m| m.similarity).sum::<f32>() / matches.len() as f32,
};
for m in &matches {
processed.insert(m.file_id);
}
groups.push(group);
}
}
Ok(groups)
}
}
#[derive(Debug)]
pub struct FingerprintMatch {
pub file_id: FileId,
pub similarity: f32,
pub duration: u32,
}
#[derive(Debug)]
pub struct DuplicateGroup {
pub files: Vec<FileId>,
pub similarity: f32,
}
```
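The score computed by `compare` is the share of identical bits across the overlapping part of the two fingerprints. A standalone sketch of that calculation, using only the standard library, is shown below; `bit_similarity` is a hypothetical free-function version of the method.

```rust
/// Fraction of matching bits (0.0 to 1.0) over the overlapping prefix of
/// two raw fingerprints. Returns 0.0 when either fingerprint is empty.
pub fn bit_similarity(a: &[u32], b: &[u32]) -> f32 {
    let len = a.len().min(b.len());
    if len == 0 {
        return 0.0;
    }
    // XOR leaves a 1 bit wherever the two words differ.
    let differing: u64 = a
        .iter()
        .zip(b)
        .map(|(x, y)| u64::from((x ^ y).count_ones()))
        .sum();
    let total = len as u64 * 32;
    (total - differing) as f32 / total as f32
}
```

Two unrelated fingerprints are expected to score around 0.5 rather than 0.0, since random bits agree half of the time; this is worth keeping in mind when choosing a threshold. Because only aligned words are compared, two recordings offset in time may also score low even when the audio is the same.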
---
## Task 3: M4B Audiobook Support (`musicfs-metadata/src/formats/m4b.rs`)
```rust
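use std::path::Path;
use symphonia::core::formats::{FormatOptions, FormatReader};
use symphonia::core::io::MediaSourceStream;
use symphonia::core::meta::{MetadataOptions, StandardTagKey};
use symphonia::core::probe::Hint;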
/// M4B audiobook metadata (FR-24.2)
#[derive(Debug, Clone, Default)]
pub struct AudiobookMeta {
pub title: Option<String>,
pub author: Option<String>, // Maps to "artist" in audio
pub narrator: Option<String>,
pub series: Option<String>,
pub series_part: Option<u32>,
pub description: Option<String>,
pub publisher: Option<String>,
pub year: Option<u32>,
pub duration_ms: Option<u64>,
pub chapters: Vec<Chapter>,
}
#[derive(Debug, Clone)]
pub struct Chapter {
pub index: u32,
pub title: String,
pub start_ms: u64,
pub end_ms: u64,
}
impl Chapter {
pub fn duration_ms(&self) -> u64 {
self.end_ms.saturating_sub(self.start_ms)
}
}
pub struct M4bParser;
impl M4bParser {
/// Parse M4B audiobook with chapters
pub fn parse(&self, path: &Path) -> Result<AudiobookMeta, MetadataError> {
let file = std::fs::File::open(path)?;
let mss = MediaSourceStream::new(Box::new(file), Default::default());
let mut hint = Hint::new();
hint.with_extension("m4b");
let probed = symphonia::default::get_probe()
.format(&hint, mss, &FormatOptions::default(), &MetadataOptions::default())?;
let mut meta = AudiobookMeta::default();
let mut format = probed.format; // `metadata()` takes `&mut self`
// Extract metadata
if let Some(metadata) = format.metadata().current() {
for tag in metadata.tags() {
if let Some(std_key) = tag.std_key {
let value = tag.value.to_string();
match std_key {
StandardTagKey::TrackTitle | StandardTagKey::Album => {
meta.title = Some(value);
}
StandardTagKey::Artist => {
meta.author = Some(value);
}
StandardTagKey::Composer => {
meta.narrator = Some(value);
}
StandardTagKey::Description => {
meta.description = Some(value);
}
StandardTagKey::Label => {
meta.publisher = Some(value);
}
StandardTagKey::Date => {
meta.year = value.chars().take(4).collect::<String>().parse().ok();
}
_ => {}
}
}
}
}
// Extract chapters from MP4 chpl atom
meta.chapters = self.extract_chapters(&format)?;
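// Get total duration
if let Some(track) = format.tracks().first() {
    if let (Some(n_frames), Some(sample_rate)) =
        (track.codec_params.n_frames, track.codec_params.sample_rate)
    {
        meta.duration_ms = Some((n_frames as u64 * 1000) / sample_rate as u64);
    }
}
// Clamp the open-ended last chapter (end_ms == u64::MAX) to the total duration
if let (Some(last), Some(total)) = (meta.chapters.last_mut(), meta.duration_ms) {
    last.end_ms = last.end_ms.min(total);
}
Ok(meta)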
}
fn extract_chapters(&self, format: &dyn FormatReader) -> Result<Vec<Chapter>, MetadataError> {
    // Cue timestamps are expressed in the track's time base, so a time base
    // is needed to convert them to milliseconds.
    let Some(time_base) = format.default_track().and_then(|t| t.codec_params.time_base) else {
        return Ok(Vec::new());
    };
    let to_ms = |ts: u64| {
        let time = time_base.calc_time(ts);
        time.seconds * 1000 + (time.frac * 1000.0) as u64
    };
    // Symphonia exposes chapter markers as cues
    // (this assumes the MP4 reader populates them for M4B files)
    let cues = format.cues();
    let chapters = cues.iter().enumerate()
        .map(|(idx, cue)| Chapter {
            index: idx as u32,
            title: cue.tags.iter()
                .find(|t| t.std_key == Some(StandardTagKey::TrackTitle))
                .map(|t| t.value.to_string())
                .unwrap_or_else(|| format!("Chapter {}", idx + 1)),
            start_ms: to_ms(cue.start_ts),
            // End time is the start of the next chapter; the last chapter is
            // left open-ended and must be clamped to the track duration
            end_ms: cues.get(idx + 1)
                .map(|next| to_ms(next.start_ts))
                .unwrap_or(u64::MAX),
        })
        .collect();
    Ok(chapters)
}
}
```
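Chapter extraction comes down to two pieces of arithmetic: converting a timestamp in time-base ticks to milliseconds, and deriving each chapter's end from the next chapter's start. The sketch below shows both using only the standard library. It assumes a time base expressed as `numer / denom` seconds per tick; `ts_to_ms` and `chapter_bounds` are hypothetical helpers.

```rust
/// Converts a timestamp in ticks to milliseconds, where one tick lasts
/// `numer / denom` seconds. Returns `None` on a zero denominator or overflow.
pub fn ts_to_ms(ts: u64, numer: u32, denom: u32) -> Option<u64> {
    if denom == 0 {
        return None;
    }
    // Widen to u128 so the multiplication cannot overflow.
    let ms = u128::from(ts) * u128::from(numer) * 1000 / u128::from(denom);
    u64::try_from(ms).ok()
}

/// Turns sorted chapter start times into `(start_ms, end_ms)` pairs.
/// Each chapter ends where the next begins; the last one ends at `total_ms`.
pub fn chapter_bounds(starts_ms: &[u64], total_ms: u64) -> Vec<(u64, u64)> {
    starts_ms
        .iter()
        .enumerate()
        .map(|(i, &start)| {
            let end = starts_ms.get(i + 1).copied().unwrap_or(total_ms);
            // Never report an end before the start, even with bad input.
            (start, end.max(start))
        })
        .collect()
}
```

Passing the total duration in explicitly avoids the `u64::MAX` sentinel, so later arithmetic on `end_ms` cannot overflow.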
---
## Task 4: Chapter Extraction (`musicfs-metadata/src/chapters.rs`)
```rust
/// Generic chapter support for various formats
pub trait ChapterSource {
fn chapters(&self) -> &[Chapter];
fn chapter_at(&self, position_ms: u64) -> Option<&Chapter>;
}
impl ChapterSource for AudiobookMeta {
fn chapters(&self) -> &[Chapter] {
&self.chapters
}
fn chapter_at(&self, position_ms: u64) -> Option<&Chapter> {
self.chapters.iter()
.find(|c| position_ms >= c.start_ms && position_ms < c.end_ms)
}
}
/// Virtual chapter file generator
pub struct ChapterFileGenerator;
impl ChapterFileGenerator {
/// Generate virtual files for each chapter
/// Example: chapter 1 "Introduction" -> <base_path>/01 - Introduction.chapter
pub fn generate_virtual_files(&self, meta: &AudiobookMeta, base_path: &VirtualPath) -> Vec<VirtualChapterFile> {
meta.chapters.iter()
.map(|chapter| {
let filename = format!(
"{:02} - {}.chapter",
chapter.index + 1,
sanitize_filename(&chapter.title)
);
VirtualChapterFile {
path: base_path.join(&filename),
chapter_index: chapter.index,
start_ms: chapter.start_ms,
end_ms: chapter.end_ms,
title: chapter.title.clone(),
}
})
.collect()
}
}
#[derive(Debug)]
pub struct VirtualChapterFile {
pub path: VirtualPath,
pub chapter_index: u32,
pub start_ms: u64,
pub end_ms: u64,
pub title: String,
}
fn sanitize_filename(name: &str) -> String {
name.chars()
.map(|c| match c {
'/' | '\\' | ':' | '*' | '?' | '"' | '<' | '>' | '|' => '_',
_ => c,
})
.collect()
}
```
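The naming scheme used by `generate_virtual_files` can be exercised in isolation. The sketch below is a slightly stricter variant of `sanitize_filename` that also replaces control characters and substitutes a placeholder for empty titles; both additions, the `untitled` placeholder, and the `chapter_filename` helper are assumptions for illustration, not part of the plan.

```rust
/// Replaces characters that are unsafe in file names with '_'.
/// Control characters are replaced too, and an empty result becomes
/// "untitled" (both are additions to the plan's version).
pub fn sanitize_filename(name: &str) -> String {
    let cleaned: String = name
        .chars()
        .map(|c| match c {
            '/' | '\\' | ':' | '*' | '?' | '"' | '<' | '>' | '|' => '_',
            c if c.is_control() => '_',
            c => c,
        })
        .collect();
    let trimmed = cleaned.trim();
    if trimmed.is_empty() {
        "untitled".to_string()
    } else {
        trimmed.to_string()
    }
}

/// Builds the virtual file name for a zero-based chapter index.
pub fn chapter_filename(index: u32, title: &str) -> String {
    format!("{:02} - {}.chapter", index + 1, sanitize_filename(title))
}
```

The two-digit prefix keeps chapters sorted in file managers for books with up to 99 chapters; longer books would need a wider field.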
---
## Task 5: Virtual Chapter Files (`musicfs-fuse/src/ops/chapters.rs`)
```rust
use crate::VirtualFs;

/// Fallback bitrate when the metadata does not report one (128kbps)
const DEFAULT_BITRATE_BPS: u64 = 128_000;

impl VirtualFs {
    /// Handle reads from virtual chapter files
    /// These return a byte-range reference to the parent M4B file
    pub async fn read_chapter(
        &self,
        chapter_file: &VirtualChapterFile,
        offset: u64,
        size: usize,
    ) -> Result<Vec<u8>, FuseError> {
        // Get the parent audiobook file
        let parent = self.get_parent_audiobook(&chapter_file.path)?;

        // Calculate byte range for this chapter.
        // Converting ms -> bytes assumes a constant bitrate and ignores container
        // overhead, so the range is approximate.
        let meta = self.get_audiobook_meta(&parent)?;
        let bitrate_bps = meta.bitrate.unwrap_or(DEFAULT_BITRATE_BPS);
        // bits/s -> bytes/ms is bitrate / 8000; multiply first so bitrates that are
        // not multiples of 8000 do not lose precision to integer division
        let chapter_start_bytes = chapter_file.start_ms * bitrate_bps / 8_000;
        let chapter_end_bytes = chapter_file.end_ms * bitrate_bps / 8_000;

        // Adjust offset to be within chapter
        let actual_offset = chapter_start_bytes + offset;
        // saturating_sub avoids underflow when the read starts past the chapter end
        let max_size = chapter_end_bytes.saturating_sub(actual_offset) as usize;
        let read_size = size.min(max_size);

        // Read from the actual file
        self.read_file(&parent, actual_offset, read_size).await
    }

    /// List chapter files for an audiobook
    pub fn list_chapters(&self, audiobook_path: &VirtualPath) -> Result<Vec<DirEntry>, FuseError> {
        let meta = self.get_audiobook_meta(audiobook_path)?;
        let bitrate_bps = meta.bitrate.unwrap_or(DEFAULT_BITRATE_BPS);
        let generator = ChapterFileGenerator;
        let chapters = generator.generate_virtual_files(&meta, audiobook_path);
        Ok(chapters.into_iter()
            .map(|c| DirEntry {
                name: c.path.filename().to_string(),
                kind: FileType::RegularFile,
                size: self.estimate_chapter_size(&c, bitrate_bps),
            })
            .collect())
    }

    fn estimate_chapter_size(&self, chapter: &VirtualChapterFile, bitrate_bps: u64) -> u64 {
        // Use the same ms -> bytes mapping as read_chapter so the reported size
        // matches the number of bytes that can actually be read
        let start_bytes = chapter.start_ms * bitrate_bps / 8_000;
        let end_bytes = chapter.end_ms * bitrate_bps / 8_000;
        end_bytes.saturating_sub(start_bytes)
    }
}
```
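
To make the millisecond-to-byte arithmetic in `read_chapter` concrete, here is a minimal standalone sketch. `chapter_read_range` is a hypothetical helper, not part of the planned API. It assumes a constant bitrate and ignores container overhead, multiplies before dividing to keep precision, and clamps the read length so a read that starts past the chapter end returns zero bytes instead of underflowing.

```rust
/// Hypothetical helper: maps a read at `offset` inside a chapter to
/// `(absolute_offset, length)` in the parent file.
///
/// Assumes a constant bitrate; real M4B files also carry container data,
/// so the computed range is only an approximation.
pub fn chapter_read_range(
    start_ms: u64,
    end_ms: u64,
    bitrate_bps: u64,
    offset: u64,
    size: usize,
) -> (u64, usize) {
    // bits per second -> bytes per millisecond is bitrate / 8000.
    // Multiplying first avoids truncating bitrates that are not multiples of 8000.
    let start_byte = start_ms * bitrate_bps / 8_000;
    let end_byte = end_ms * bitrate_bps / 8_000;

    // Position of the read in the parent file.
    let absolute = start_byte.saturating_add(offset);

    // Bytes left in the chapter from that position (zero if already past the end).
    let remaining = end_byte.saturating_sub(absolute);

    // Never return more than the caller asked for or more than the chapter holds.
    let len = size.min(usize::try_from(remaining).unwrap_or(usize::MAX));
    (absolute, len)
}
```

For a chapter spanning 60,000 ms to 120,000 ms at 128 kbps, the chapter occupies bytes 960,000 to 1,920,000 of the parent file, so a 4,096-byte read at chapter offset 0 maps to `(960_000, 4096)`. At 64,500 bps, dividing first (`64_500 / 8 / 1000`) would round down to 8 bytes per millisecond, while multiplying first places the one-second mark at byte 8,062.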
---
## Task 6: Fingerprint Search Virtual Directory
```rust
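// Requires base64 >= 0.21 (Engine API); the top-level base64::decode is deprecated
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine as _};

/// Virtual directory for fingerprint search
/// /.search/fingerprint/{base64_fingerprint} -> matching files
impl SearchOps {
    pub async fn search_by_fingerprint(
        &self,
        fingerprint_path: &str,
    ) -> Result<Vec<SearchResult>, SearchError> {
        // Path format: /.search/fingerprint/{base64_encoded_fingerprint}
        // Decode only the final path component. Unpadded URL-safe base64 is
        // assumed here, because the standard alphabet can contain '/', which
        // would split the fingerprint across path components.
        let encoded = fingerprint_path
            .rsplit('/')
            .next()
            .unwrap_or(fingerprint_path);
        let fp_bytes = URL_SAFE_NO_PAD
            .decode(encoded)
            .map_err(|_| SearchError::InvalidQuery)?;
        let fingerprint = AudioFingerprint::from_bytes(&fp_bytes)?;

        // Keep matches with similarity >= 0.8, at most 20 results
        let matches = self.fingerprint_index.search(&fingerprint, 0.8, 20)?;

        let mut results = Vec::new();
        for m in matches {
            // Skip index entries whose file is no longer in the database
            if let Some(file) = self.db.get_file_by_id(m.file_id)? {
                results.push(SearchResult {
                    path: file.virtual_path,
                    score: m.similarity,
                    snippet: format!("Similarity: {:.1}%", m.similarity * 100.0),
                });
            }
        }
        Ok(results)
    }
}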
```
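
The `0.8` passed to `fingerprint_index.search` is a similarity threshold. As a rough illustration of what such a score can mean, the sketch below computes the fraction of matching bits between two fingerprints. This is an assumption for illustration only: it treats a fingerprint as a slice of 32-bit words and uses a plain Hamming-style comparison, while the index in this plan may align, weight, or compress fingerprints differently. `bit_similarity` and `passes_threshold` are hypothetical names.

```rust
/// Hypothetical helper: fraction of equal bits between two fingerprints,
/// compared over the length of the shorter one. Returns a value in 0.0..=1.0.
pub fn bit_similarity(a: &[u32], b: &[u32]) -> f32 {
    let len = a.len().min(b.len());
    if len == 0 {
        // Nothing to compare, so report no similarity.
        return 0.0;
    }

    // XOR leaves a 1 in every position where the two words differ;
    // count_ones sums those positions. zip stops at the shorter slice.
    let differing: u32 = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x ^ y).count_ones())
        .sum();

    let total_bits = (len * 32) as f32;
    1.0 - differing as f32 / total_bits
}

/// Hypothetical helper: true when two fingerprints are at least `threshold` similar.
pub fn passes_threshold(a: &[u32], b: &[u32], threshold: f32) -> bool {
    bit_similarity(a, b) >= threshold
}
```

Under this definition, two words that differ in 4 of 32 bits score 0.875 and would pass a 0.8 threshold, while differing in 8 of 32 bits scores 0.75 and would not.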
---
## Database Schema Additions
```sql
-- Fingerprint storage
CREATE TABLE IF NOT EXISTS fingerprints (
    file_id     INTEGER PRIMARY KEY REFERENCES files(id) ON DELETE CASCADE,
    fingerprint BLOB NOT NULL,     -- Compressed chromaprint
    duration    INTEGER NOT NULL,  -- Duration in seconds
    indexed_at  INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
);

CREATE INDEX IF NOT EXISTS idx_fingerprints_duration ON fingerprints(duration);

-- Audiobook chapters
CREATE TABLE IF NOT EXISTS chapters (
    id          INTEGER PRIMARY KEY,
    file_id     INTEGER NOT NULL REFERENCES files(id) ON DELETE CASCADE,
    chapter_idx INTEGER NOT NULL,
    title       TEXT NOT NULL,
    start_ms    INTEGER NOT NULL,
    end_ms      INTEGER NOT NULL,
    UNIQUE(file_id, chapter_idx)
);

CREATE INDEX IF NOT EXISTS idx_chapters_file ON chapters(file_id);
```
---
## Tests
| Test | Type | Validates |
|------|------|-----------|
| `test_fingerprint_generation` | Unit | Chromaprint from audio (FR-14.4) |
| `test_fingerprint_similarity` | Unit | Bit comparison algorithm |
| `test_fingerprint_search` | Integration | Find similar tracks |
| `test_fingerprint_duplicates` | Integration | Detect duplicate audio |
| `test_m4b_parsing` | Unit | M4B metadata extraction (FR-24.2) |
| `test_chapter_extraction` | Unit | Chapter list from M4B |
| `test_virtual_chapter_files` | Integration | Chapter files appear in listing |
| `test_chapter_read` | Integration | Read chapter content |
| `test_audiobook_navigation` | E2E | Browse audiobook chapters |
---
## Exit Criteria
- [ ] Audio fingerprints generated from audio files
- [ ] Fingerprint similarity search finds matching tracks
- [ ] Duplicate detection works across library
- [ ] M4B files parsed with full metadata
- [ ] Chapters extracted and stored
- [ ] Virtual chapter files appear in directory listing
- [ ] Chapter files are readable (return correct byte range)
- [ ] All tests pass
---
## Architecture Alignment
Per requirements.md:
- FR-14.4: Audio fingerprint search ✓
- FR-24.2: Audiobook formats with chapters ✓

Per architecture.md section 4.3.4:
- FormatPlugin trait for M4B support ✓
- Chapter extraction via symphonia ✓