Commit Graph

11 Commits

Author SHA1 Message Date
Alexander 6285eeb6c0 Implement Phase A: Stop Dying resilience fixes
Implements all 6 critical resilience fixes from phase-a-stop-dying.md:

- Issue 2.9: Migrate std::sync::RwLock → parking_lot::RwLock (7 files)
  Prevents lock poisoning cascade on writer panic

- Issue 2.2: Add install_panic_hook() to log panics via tracing
  Ensures panics are captured in logs/journald before process death

- Issue 3.7: Add ExecStopPost to systemd service
  Cleans up stale FUSE mounts on service stop

- Issue 2.7: Add check_stale_mount() detection on startup
  Auto-cleans leftover mounts from previous crashes

- Issue 2.10: Integrate sd_notify for systemd lifecycle
  Sends READY=1 after mount, STOPPING on shutdown

- Issue 2.1: Add signal handling with spawn_mount
  Catches SIGTERM/SIGINT for clean shutdown instead of instant death

All 7 Phase A tests pass:
- test_poisoned_tree_lock_returns_eio_not_panic
- test_parking_lot_rwlock_survives_panic
- test_panic_hook_logs_to_tracing
- test_systemd_service_has_execstoppost
- test_stale_mount_check_function_exists
- test_sd_notify_ready_sent
- test_sigterm_triggers_shutdown
2026-05-13 14:48:32 +02:00
Alexander e3eeba4650 Add musicfs-test-utils crate with RED resilience tests
Phase 1 of resilience testing design doc implementation:
- New musicfs-test-utils crate with FaultyOrigin, FaultyCasStore, fixtures
- Failpoints instrumented in musicfs-cas/store.rs
- 16 resilience tests (13 RED for missing features, 3 GREEN for existing)
- 3 Docker/Toxiproxy network tests (RED until docker-compose up)
- docker-compose.yml for Toxiproxy + MinIO + SFTP test infrastructure

Tests properly fail-first (TDD): check_all() sequential, no health timeout,
missing corruption detection, no passthrough mode, etc.
2026-05-13 13:49:25 +02:00
Alexander 5ac33987c0 Add comprehensive logging with tracing, file rotation, and systemd integration
- Add tracing-appender and tracing-journald for production logging
- Add LoggingConfig with trace_sample_rate, json_output, journald options
- Expand init_logging() with file rotation, journald, and stderr layers
- Add sanitize_path() helper for PII protection in logs
- Instrument FUSE operations with #[instrument] and trace decision points
- Instrument gRPC handlers (10 methods) with span correlation
- Add spawn instrumentation for health monitor, indexer, watcher tasks
- Add broadcast lag handling (RecvError::Lagged) in event subscribers
- Fix webhook.rs expect() calls with proper error handling
- Add logging to patterns.rs, collections.rs, artwork.rs database ops
- Add Drop impl logging for PluginManager and WatchHandle
- Update systemd service with rate limiting and journal output
- Add logrotate config and example config.toml with logging section
2026-05-13 11:21:51 +02:00
Alexander 34d05b7a49 Add Week 9 Smart Features: collections, artwork, predictive prefetch
Smart Collections (musicfs-search/src/collections.rs):
- CollectionStore with thread-safe Mutex<Connection>
- CollectionQuery enum: Match, DateRange, RecentlyAdded/Played, MostPlayed, Genre, Compound
- Builtin collections for Recently Added, 80s/90s Music

Artwork Extraction & Caching:
- ArtworkExtractor using symphonia Visual (musicfs-metadata)
- ArtworkCache with CAS storage + on-demand resize (musicfs-cache)
- ArtType: Front/Back/Other, ArtSize: Thumbnail/Medium/Full

Predictive Prefetching:
- PatternStore tracks access patterns with sequence prediction
- PrefetchEngine listens to FileAccessed events, prefetches predictions
- PrefetchOps exposes /.prefetch/ virtual directory with status/hints

Oracle review fixes applied:
- CollectionStore uses Mutex for thread safety
- FileAccessed event now includes file_id for canonical correlation
- JSON parse warnings in collection deserialization

130 tests pass (15 new tests added)
2026-05-13 07:21:28 +02:00
Alexander 6bae6ca67b Add tantivy, moka, tonic workspace dependencies for Week 8 search
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/claude-agent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-12 23:23:04 +02:00
Alexander 09f019730f Implement Week 7 Remote Origins with Oracle fixes
- Add credentials.rs with CredentialStore, redacted Debug (session_token shows [REDACTED])
- Add nfs.rs with ESTALE retry using Fn closure, 5s health timeout
- Add smb.rs with ENOTCONN retry handling, 5s health timeout
- Add s3.rs/sftp.rs feature-gated stubs with security documentation
- Add error variants: S3, Sftp, Timeout, Credential, NfsStaleHandle
- Fix delta.rs unused imports

Oracle fixes applied:
- SMB retry_on_disconnect for ENOTCONN (errno 107)
- session_token Debug shows [REDACTED] when Some, None otherwise
- NFS/SMB health checks wrapped with tokio::time::timeout(5s)

102 tests pass, 0 warnings.
2026-05-12 22:26:19 +02:00
Alexander d5ef68c9c9 Implement Week 6 Origin Federation with Oracle fixes
New files:
- musicfs-core/src/config.rs: Config, OriginConfig, HealthConfig
- musicfs-origins/src/registry.rs: OriginRegistry with watch cleanup
- musicfs-origins/src/router.rs: Priority router with (priority, latency) ordering
- musicfs-origins/src/health.rs: HealthMonitor with per-origin-type thresholds
- musicfs-origins/src/failover.rs: FailoverExecutor with NFR-7.3 backoff

Oracle fixes applied:
- Per-OriginType threshold: Local=1, Remote=3 (check_one uses threshold_for)
- AllOriginsUnhealthy event: Added to events.rs, emitted in select_with_fallback
- Unified OriginType: Removed duplicate from traits.rs, use musicfs_core::OriginType
- Watch handle cleanup: Tracked and dropped on unregister()
- Retry delays: 100ms, 500ms, 2000ms (NFR-7.3 compliant)

Tests: 91 pass (+20 new)
2026-05-12 20:15:56 +02:00
Alexander 7ad554f8d5 Add CLI implementation and MVP performance review
- Implement functional CLI with clap argument parsing
- Add directory scanning and metadata extraction at startup
- Fix filesystem.rs to store tokio Handle for async/sync bridge
- Fix flake.nix with LD_LIBRARY_PATH for libfuse3
- Add MVP performance review with real-world benchmark results

Benchmarks show:
- Mount time: 8ms (target <500ms)
- Throughput: 2-3 GB/s (target >500 MB/s)
- Identifies critical gap: incomplete file caching (only ~2MB per file)
- Identifies missing CDC chunking per architecture spec
2026-05-12 19:28:13 +02:00
Alexander ffbb238633 Implement Week 4 CAS store with chunk deduplication and LRU eviction
- Add musicfs-cas crate: CasStore, ChunkHash, FileReader, ChunkManifest
- Add LruEviction policy to musicfs-cache for cache size management
- Integrate FileReader into FUSE filesystem for actual file reads
- Use xxHash64 for content hashing, sled for index, msgpack serialization
- Default cache path: ~/.cache/musicfs/chunks/ with 256 subdirs sharding
- 20 new tests (14 CAS unit + 3 integration + 3 eviction), 54 total
2026-05-12 18:43:39 +02:00
Alexander d664439746 Implement Week 2 metadata extraction and cache database
Week 1 fixes:
- Move hex to workspace dependencies
- Add cargo-criterion, protobuf, grpcurl to flake.nix

Week 2 implementation:
- musicfs-metadata: MetadataParser with symphonia 0.5 for FLAC, MP3,
  Opus/Vorbis, M4A/AAC (2 tests)
- musicfs-cache: SQLite schema per architecture 4.3.6 with track/disc
  columns, TEXT content_hash, all required indexes
- musicfs-cache/db.rs: Database with upsert, CRUD, mtime lookup (9 tests)
- musicfs-cache/metadata.rs: MetadataCache with store/lookup/is_fresh/
  invalidate (2 tests)
- musicfs-core: Added Metadata error variant

22 tests pass total. Oracle-verified against architecture doc.
2026-05-12 18:15:44 +02:00
Alexander 76856b893a Implement Week 1 foundation: workspace, core types, FUSE skeleton, LocalOrigin
- musicfs-core: OriginId, FileId, VirtualPath, ContentHash, AudioMeta,
  FileMeta, EventBus with FileAccessed event (5 tests)
- musicfs-fuse: FUSE skeleton with EROFS handlers for write ops
- musicfs-origins: Origin trait with watch(), LocalOrigin impl (6 tests)
- flake.nix: Nix dev shell with rust toolchain, clang, lld, fuse3

All 11 tests pass. Build produces no warnings.
2026-05-12 18:01:47 +02:00