Commit Graph

19 Commits

Author SHA1 Message Date
Alexander d6fadde50d Track BlueDoc and GreenDoc document templates 2026-05-13 12:39:16 +02:00
Alexander 0b97905826 Add resilience testing strategy
Maps all 34 resilience issues to concrete test approaches across 3 layers:
trait-based mocks + failpoints (fast), fork-kill crash recovery (medium),
and Toxiproxy + Docker integration (thorough). Includes tooling choices
(fail crate, rlimit, nix, wiremock), test organization, failpoint
instrumentation map, and coverage matrix.
2026-05-13 12:22:21 +02:00
Alexander 87574ce008 Add resilience audit and persistent state plans
Comprehensive fault tolerance analysis covering 34 issues across 6 phases:
signal handling, crash recovery, cache corruption, network failures,
resource exhaustion, and the critical finding that no persistent state
is used on mount (every restart is a full origin rescan).

Persistent state plan covers storage engine options, mount flow redesign,
background delta sync, and the in-memory state inventory.
2026-05-13 12:09:41 +02:00
Alexander 5ac33987c0 Add comprehensive logging with tracing, file rotation, and systemd integration
- Add tracing-appender and tracing-journald for production logging
- Add LoggingConfig with trace_sample_rate, json_output, journald options
- Expand init_logging() with file rotation, journald, and stderr layers
- Add sanitize_path() helper for PII protection in logs
- Instrument FUSE operations with #[instrument] and trace decision points
- Instrument gRPC handlers (10 methods) with span correlation
- Add spawn instrumentation for health monitor, indexer, watcher tasks
- Add broadcast lag handling (RecvError::Lagged) in event subscribers
- Fix webhook.rs expect() calls with proper error handling
- Add logging to patterns.rs, collections.rs, artwork.rs database ops
- Add Drop impl logging for PluginManager and WatchHandle
- Update systemd service with rate limiting and journal output
- Add logrotate config and example config.toml with logging section
2026-05-13 11:21:51 +02:00
Alexander bc9fa36646 Add Week 10 Plugin System and Week 11 Control API
Week 10 - Plugin System (FR-19):
- Plugin traits: Plugin, OriginPlugin, MetadataPlugin, FormatPlugin
- NativePluginHost with libloading for dynamic loading
- WasmPluginHost (feature-gated) with wasmtime runtime
- PluginManager coordinating both hosts with version checks
- OriginInstance::watch() with WatchHandle, WatchEvent for live updates
- FormatPlugin::synthesize_header() for metadata overlay

Week 11 - Control API & Production (FR-17, FR-18, NFR-6, NFR-10):
- gRPC server with full MusicFS service (status, cache, origins, events)
- Proto extended: MountState enum, TierStats, full StatusResponse/CacheStats
- WebhookHandler with HMAC-SHA256 signing and exponential retry
- Metrics with latency histograms (p50/p95/p99) and origin health gauges
- CLI with mount, status, cache, search, origin, events, shutdown commands
- E2E player compatibility tests (mpv, VLC, file manager)
- systemd service, PKGBUILD, RPM spec for packaging

Plans added for Weeks 10-14 covering P1 features.
All 154 tests passing.
2026-05-13 10:34:01 +02:00
Alexander 34d05b7a49 Add Week 9 Smart Features: collections, artwork, predictive prefetch
Smart Collections (musicfs-search/src/collections.rs):
- CollectionStore with thread-safe Mutex<Connection>
- CollectionQuery enum: Match, DateRange, RecentlyAdded/Played, MostPlayed, Genre, Compound
- Builtin collections for Recently Added, 80s/90s Music

Artwork Extraction & Caching:
- ArtworkExtractor using symphonia Visual (musicfs-metadata)
- ArtworkCache with CAS storage + on-demand resize (musicfs-cache)
- ArtType: Front/Back/Other, ArtSize: Thumbnail/Medium/Full

Predictive Prefetching:
- PatternStore tracks access patterns with sequence prediction
- PrefetchEngine listens to FileAccessed events, prefetches predictions
- PrefetchOps exposes /.prefetch/ virtual directory with status/hints

Oracle review fixes applied:
- CollectionStore uses Mutex for thread safety
- FileAccessed event now includes file_id for canonical correlation
- JSON parse warnings in collection deserialization

130 tests pass (15 new tests added)
2026-05-13 07:21:28 +02:00
Alexander 3cb6dfcaf8 Add Week 8 Search API docs and Week 8-9 plans with Oracle fixes
- docs/api/search.md: FUSE and gRPC search API documentation
- Week 8 plan: Oracle fixes for IndexWriter pattern, moka cache, gRPC API
- Week 9 plan: Oracle fixes for artwork schema, spawn_blocking, access_log
- Week 7 performance review

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/claude-agent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-12 23:23:49 +02:00
Alexander 0e5a514015 Add Week 5-7 plans with Oracle review fixes
Week 5 (CDC & Delta Detection):
- Add read_full() method to avoid u32 overflow on >4GB files
- Add chunk_streaming() to avoid 200MB+ memory per file
- Implement scan_origin() recursive walk (was stub)
- Use spawn_blocking for watcher instead of separate runtime
- Add 200ms event debouncing
- Add >90% bandwidth reduction test

Week 6 (Origin Federation):
- Define all-origins-unhealthy behavior (least-bad selection)
- Track watch handles for cleanup on unregister
- Clarify tuple-based priority routing
- Add per-origin-type health thresholds
- Align retry delays with NFR-7.3 spec (100ms, 500ms, 2000ms)

Week 7 (Remote Origins):
- Replace SFTP single mutex with connection pool
- Add 30s timeout to all remote operations
- Custom Debug impl to redact credentials
- SSH host verification against known_hosts
- Clamp S3 range requests to file size
- Use head_bucket for S3 health checks
2026-05-12 19:48:40 +02:00
Alexander 7ad554f8d5 Add CLI implementation and MVP performance review
- Implement functional CLI with clap argument parsing
- Add directory scanning and metadata extraction at startup
- Fix filesystem.rs to store tokio Handle for async/sync bridge
- Fix flake.nix with LD_LIBRARY_PATH for libfuse3
- Add MVP performance review with real-world benchmark results

Benchmarks show:
- Mount time: 8ms (target <500ms)
- Throughput: 2-3 GB/s (target >500 MB/s)
- Identifies critical gap: incomplete file caching (only ~2MB per file)
- Identifies missing CDC chunking per architecture spec
2026-05-12 19:28:13 +02:00
Alexander e575276b6f Add Week 4b plan: Origin-CAS connector for cache-miss handling
- Create week-04b-origin-connector.md with ContentFetcher design
- Update development-plan.md: Phase 1 now includes Week 4b
- Update architecture.md: Phase 1 table includes Week 4b
- Plan includes EventBus integration per FR-18.1 (Oracle-verified)
2026-05-12 18:55:58 +02:00
Alexander ffbb238633 Implement Week 4 CAS store with chunk deduplication and LRU eviction
- Add musicfs-cas crate: CasStore, ChunkHash, FileReader, ChunkManifest
- Add LruEviction policy to musicfs-cache for cache size management
- Integrate FileReader into FUSE filesystem for actual file reads
- Use xxHash64 for content hashing, sled for index, msgpack serialization
- Default cache path: ~/.cache/musicfs/chunks/ with 256 subdirs sharding
- 20 new tests (14 CAS unit + 3 integration + 3 eviction), 54 total
2026-05-12 18:43:39 +02:00
Alexander e08988f7f3 Add development plan and Oracle-validated weekly plans (Weeks 1-3)
development-plan.md (master plan):
- 11-week implementation broken into 4 phases
- 11 Rust crates with dependency graph
- Per-week deliverables, tests, exit criteria
- Deferred requirements (FR-21, FR-22) with rationale

plans/week-01-foundation.md:
- Workspace setup, core types, FUSE skeleton, local origin
- Origin trait with watch() method (arch 4.3.4)
- EventBus with FileAccessed event (FR-18.1)
- All EROFS handlers for read-only enforcement (FR-4.1-4.5)

plans/week-02-metadata.md:
- symphonia metadata extraction (FR-6.1-6.5)
- SQLite schema matching architecture 4.3.6 exactly
- Column names: track/disc (not track_number/disc_number)
- Hash columns as TEXT (hex-encoded, not BLOB)
- Added idx_files_real index (FR-7.3)

plans/week-03-virtual-tree.md:
- Path resolver with $var syntax (arch 4.3.1)
- Template vars: $artist, $album, $title, $track, $year, $disc, $genre, $format, $format_upper
- RefreshPolicy struct for FR-9.3 (TTL-based refresh)
- force_refresh() method for FR-9.4 (signal/API refresh)

All plans Oracle-validated against architecture.md and requirements.md
2026-05-12 17:52:33 +02:00
Alexander dac9f3dd02 Replace JSON-RPC with gRPC for Control API
Update Control API specification to use gRPC over Unix socket instead of
JSON-RPC 2.0. gRPC provides better type safety, native streaming for events,
and auto-generated clients for multi-language integration.

architecture.md:
- Add decision rationale table (JSON-RPC vs gRPC comparison)
- Add full .proto definitions (~200 lines) for musicfs.v1 package
- Define MusicFS service with 9 RPC methods:
  - Daemon: GetStatus, Shutdown
  - Cache: GetCacheStats, ClearCache, Prefetch (streaming)
  - Origins: ListOrigins, GetOriginHealth, RescanOrigin (streaming)
  - Search: Search, SearchStream
  - Events: SubscribeEvents (server-streaming)
- Add grpcurl debugging examples

requirements.md:
- FR-17.1: Clarify Unix socket uses gRPC
- FR-17.2: Upgrade from SHOULD to SHALL for gRPC requirement
2026-05-12 16:51:35 +02:00
Alexander 1374084135 Reorganize docs into v1 (beetfs) and v2 (new architecture)
docs/v1/ - Original beetfs documentation:
  - analysis.md, components.md, data-flow.md, drawbacks.md
  - features.md, modernization.md, rust-migration.md
  - benchmark-plan.md, benchmark-results.md, e2e-test-plan.md
  - README.md

docs/v2/ - New MusicFS architecture:
  - requirements.md: Full requirements spec (FR-1 to FR-25, NFR-1 to NFR-14)
    - P0: Multi-origin, plugins, CAS, control API
    - P1: Search, album art, prefetch, metadata sources
    - P3: HA, 10M+ files scalability
  - architecture.md: Google BlueDoc style design document
    - PlantUML diagrams for all components
    - Design requirements with quantitative targets
    - Alternatives considered, implementation plan
2026-05-12 16:46:37 +02:00
Alexander 3a6115cbab Add benchmark suite and results
Benchmark harness (benchmarks/run_benchmarks.py):
- Mount time, readdir, stat latency, file open/read, memory usage
- ENOENT lookup (missing file) benchmark per Oracle review
- Uses synthetic FLAC files from test infrastructure

Results: ALL BENCHMARKS BLOCKED BY BUGS
- Bug #2 (directory tree building) crashes mount with any content
- FSNode.adddir() assumes parent dirs exist, fails with KeyError
- Bug #1 (nested methods) would block FUSE ops even if mount worked

beetfs is non-functional for real-world use until both bugs fixed.
2026-05-12 14:44:02 +02:00
Alexander dacd3a7c1f Add benchmark plan for beetfs performance measurement
Covers:
- Mount time scaling (100 to 100K items)
- Metadata operations (stat, readdir throughput)
- File I/O (open latency, read throughput)
- Memory usage (idle, per-file, leak detection)
- Concurrent access (GIL impact)
- Realistic workloads (library scan, album playback)

Tools: fio, mdtest, hyperfine
Baselines: ext4, fuse-passthrough, sshfs

Key bottlenecks identified:
- FileHandler loads entire file into RAM on open
- Mount-time bulk load of all library items
- Python GIL limits parallelism
2026-05-12 14:36:24 +02:00
Alexander f8666ae8c6 Document test findings and fix mount script
Test Results (74 tests):
- 12 passed, 56 failures, 3 errors, 3 skipped

Bugs Detected:
1. Nested methods bug: lines 758-1144 indented inside access()
   - FUSE operations (readdir, open, read, write) unreachable
   - os.listdir() returns ENOSYS (Function not implemented)

2. Directory tree building: KeyError in FSNode.getnode()
   - Mount fails when library contains tracks

3. Unmount not clean: filesystem not releasing properly

Changes:
- Fix conftest.py: inline sanitization (no module-level sanitize fn)
- Add test findings to e2e-test-plan.md
- Add .gitignore for .pyc and test artifacts
2026-05-12 14:29:05 +02:00
Alexander 81df4790bf Add e2e test suite for beetfs
Tests use real FUSE operations against mounted beetfs filesystem:
- test_smoke: mount/unmount lifecycle
- test_nested_bug: detects critical indentation bug (13 failures)
- test_readdir: directory listing
- test_read: metadata overlay verification
- test_stat: file/directory attributes
- test_write: metadata modification
- test_error_handling: ENOENT, EOPNOTSUPP

Also includes:
- conftest.py with BeetFSTestCase base class and synthetic FLAC generator
- e2e-test-plan.md with Oracle-reviewed test strategy
- flake.nix updated with ffmpeg/flac for test fixtures

Run: cd tests && nix develop ../ --command python -m unittest discover
2026-05-12 14:02:55 +02:00
Alexander f0a83df190 Add reverse-engineered documentation
- README.md: Overview, core concept diagram, component summary
- architecture.md: System design, initialization flow, memory model
- components.md: Deep dive on all classes and functions
- data-flow.md: Complete read/write operation flows with diagrams
- analysis.md: Performance analysis (latency, memory footprint, I/O)
- drawbacks.md: 27 identified issues and limitations catalog
- modernization.md: Python 3 migration guide with effort estimates
2026-05-12 11:52:48 +02:00