Commit Graph

30 Commits

Author SHA1 Message Date
Alexander 6bae6ca67b Add tantivy, moka, tonic workspace dependencies for Week 8 search
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/claude-agent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-12 23:23:04 +02:00
Alexander 09f019730f Implement Week 7 Remote Origins with Oracle fixes
- Add credentials.rs with CredentialStore, redacted Debug (session_token shows [REDACTED])
- Add nfs.rs with ESTALE retry using Fn closure, 5s health timeout
- Add smb.rs with ENOTCONN retry handling, 5s health timeout
- Add s3.rs/sftp.rs feature-gated stubs with security documentation
- Add error variants: S3, Sftp, Timeout, Credential, NfsStaleHandle
- Fix delta.rs unused imports

Oracle fixes applied:
- SMB retry_on_disconnect for ENOTCONN (errno 107)
- session_token Debug shows [REDACTED] when Some, None otherwise
- NFS/SMB health checks wrapped with tokio::time::timeout(5s)
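
A retry-on-errno wrapper taking an `Fn` closure, as the fixes above describe for ENOTCONN, might look roughly like this. The helper name `retry_on_errno` and the `max_retries` parameter are assumptions; the actual retry_on_disconnect signature is not shown in the log, and the 5s `tokio::time::timeout` wrapping is left out so the sketch needs only the standard library.

```rust
use std::io;

/// ENOTCONN as given in the commit message (errno 107 on Linux).
pub const ENOTCONN: i32 = 107;

/// Hypothetical helper: re-run `op` while it fails with `errno`,
/// allowing up to `max_retries` extra attempts after the first call.
pub fn retry_on_errno<T, F>(errno: i32, max_retries: usize, op: F) -> io::Result<T>
where
    F: Fn() -> io::Result<T>,
{
    let mut attempt = 0;
    loop {
        match op() {
            // Only the targeted errno is retried, and only while budget remains.
            Err(e) if e.raw_os_error() == Some(errno) && attempt < max_retries => {
                attempt += 1;
            }
            // Success, a different error, or an exhausted budget ends the loop.
            other => return other,
        }
    }
}
```

Any error other than the targeted errno is returned immediately, so unrelated failures are not masked by the retry loop.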

102 tests pass, 0 warnings.
2026-05-12 22:26:19 +02:00
Alexander d5ef68c9c9 Implement Week 6 Origin Federation with Oracle fixes
New files:
- musicfs-core/src/config.rs: Config, OriginConfig, HealthConfig
- musicfs-origins/src/registry.rs: OriginRegistry with watch cleanup
- musicfs-origins/src/router.rs: Priority router with (priority, latency) ordering
- musicfs-origins/src/health.rs: HealthMonitor with per-origin-type thresholds
- musicfs-origins/src/failover.rs: FailoverExecutor with NFR-7.3 backoff
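
The "(priority, latency) ordering" mentioned for router.rs can be illustrated with a tuple sort key. The `Candidate` struct, its field names, and the convention that a lower priority value is preferred are all assumptions made for this sketch.

```rust
use std::time::Duration;

// Hypothetical candidate record; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub name: String,
    pub priority: u32,     // assumed: lower value is preferred
    pub latency: Duration, // last measured latency
}

/// Order candidates by the (priority, latency) tuple:
/// priority decides first, latency breaks ties.
pub fn route_order(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    candidates.sort_by_key(|c| (c.priority, c.latency));
    candidates
}
```

Because Rust compares tuples lexicographically, a slower origin with a better priority still sorts ahead of a faster origin with a worse one.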

Oracle fixes applied:
- Per-OriginType threshold: Local=1, Remote=3 (check_one uses threshold_for)
- AllOriginsUnhealthy event: Added to events.rs, emitted in select_with_fallback
- Unified OriginType: Removed duplicate from traits.rs, use musicfs_core::OriginType
- Watch handle cleanup: Tracked and dropped on unregister()
- Retry delays: 100ms, 500ms, 2000ms (NFR-7.3 compliant)
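
The fixed retry schedule listed above can be sketched as a small loop. Only the three delay values come from the commit message; `run_with_backoff` and its injected `sleep` callback are hypothetical and are not the FailoverExecutor API.

```rust
use std::time::Duration;

/// Retry delays listed in the commit message (NFR-7.3).
pub const RETRY_DELAYS: [Duration; 3] = [
    Duration::from_millis(100),
    Duration::from_millis(500),
    Duration::from_millis(2000),
];

/// Hypothetical executor: one initial attempt plus one retry per delay.
/// `sleep` is injected so callers decide how to wait (tests need not block).
pub fn run_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut last = op();
    for delay in RETRY_DELAYS {
        if last.is_ok() {
            break;
        }
        sleep(delay);
        last = op();
    }
    // Either the first success or the error from the final attempt.
    last
}
```

In a blocking context `sleep` could simply be `std::thread::sleep`; an async executor would need an awaitable variant instead.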

Tests: 91 pass (+20 new)
2026-05-12 20:15:56 +02:00
Alexander 32c96701c8 Implement Week 5 CDC & Delta Detection with Oracle fixes
- Add CdcChunker using FastCDC v3 (16KB/64KB/256KB chunks)
- Add DeltaDetector with scan_origin() returning ScannedFile (no FileId assignment)
- Add OriginWatcher with inotify and 200ms debounce using tokio::spawn
- Fix LocalOrigin::read() to loop until all bytes read
- Add read_full() method to Origin trait
- Add mtime field to ChunkManifest
- Update ContentFetcher to use CDC chunking
- Update bandwidth reduction test to assert >90% (NFR-6.4)
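
The "loop until all bytes read" fix addresses the fact that a single `read` call may return fewer bytes than requested. A generic sketch over `std::io::Read` follows; the function name `read_until_filled` is an assumption and this is not the LocalOrigin code itself.

```rust
use std::io::{self, Read};

/// Keep calling `read` until `buf` is full or the reader reports EOF.
/// Returns the number of bytes actually placed in `buf`.
pub fn read_until_filled<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF: return what we have
            Ok(n) => filled += n, // short read: keep going
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

Unlike `Read::read_exact`, this variant treats an early EOF as a short result rather than an error, which suits reads that may run past the end of a file.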

Tests: 71 pass (+11 new)
2026-05-12 20:05:44 +02:00
Alexander 0e5a514015 Add Week 5-7 plans with Oracle review fixes
Week 5 (CDC & Delta Detection):
- Add read_full() method to avoid u32 overflow on >4GB files
- Add chunk_streaming() to avoid 200MB+ memory per file
- Implement scan_origin() recursive walk (was stub)
- Use spawn_blocking for watcher instead of separate runtime
- Add 200ms event debouncing
- Add >90% bandwidth reduction test

Week 6 (Origin Federation):
- Define all-origins-unhealthy behavior (least-bad selection)
- Track watch handles for cleanup on unregister
- Clarify tuple-based priority routing
- Add per-origin-type health thresholds
- Align retry delays with NFR-7.3 spec (100ms, 500ms, 2000ms)

Week 7 (Remote Origins):
- Replace SFTP single mutex with connection pool
- Add 30s timeout to all remote operations
- Custom Debug impl to redact credentials
- SSH host verification against known_hosts
- Clamp S3 range requests to file size
- Use head_bucket for S3 health checks
2026-05-12 19:48:40 +02:00
Alexander 7ad554f8d5 Add CLI implementation and MVP performance review
- Implement functional CLI with clap argument parsing
- Add directory scanning and metadata extraction at startup
- Fix filesystem.rs to store tokio Handle for async/sync bridge
- Fix flake.nix with LD_LIBRARY_PATH for libfuse3
- Add MVP performance review with real-world benchmark results

Benchmarks show:
- Mount time: 8ms (target <500ms)
- Throughput: 2-3 GB/s (target >500 MB/s)
- Identifies critical gap: incomplete file caching (only ~2MB per file)
- Identifies missing CDC chunking per architecture spec
2026-05-12 19:28:13 +02:00
Alexander c46750b1ec Implement Week 4b Origin-CAS connector for cache-miss handling
- Add ContentFetcher bridging Origin→CAS on cache miss
- Integrate fetcher into FileReader via with_fetcher() constructor
- Add get_or_fetch_manifest() for lazy manifest loading
- Emit FileAccessed events per FR-18.1 via EventBus
- Add 2 integration tests for e2e fetch flow
- Test count: 60 (was 54)
2026-05-12 19:04:48 +02:00
Alexander e575276b6f Add Week 4b plan: Origin-CAS connector for cache-miss handling
- Create week-04b-origin-connector.md with ContentFetcher design
- Update development-plan.md: Phase 1 now includes Week 4b
- Update architecture.md: Phase 1 table includes Week 4b
- Plan includes EventBus integration per FR-18.1 (Oracle-verified)
2026-05-12 18:55:58 +02:00
Alexander ffbb238633 Implement Week 4 CAS store with chunk deduplication and LRU eviction
- Add musicfs-cas crate: CasStore, ChunkHash, FileReader, ChunkManifest
- Add LruEviction policy to musicfs-cache for cache size management
- Integrate FileReader into FUSE filesystem for actual file reads
- Use xxHash64 for content hashing, sled for index, msgpack serialization
- Default cache path: ~/.cache/musicfs/chunks/, sharded into 256 subdirectories
- 20 new tests (14 CAS unit + 3 integration + 3 eviction), 54 total
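
One common way to shard a chunk store into 256 subdirectories is to use the first byte of the content hash as the directory name. The sketch below assumes that layout and a 64-bit hash rendered as 16 hex digits; the actual on-disk naming used by musicfs-cas is not stated in the log.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical layout: the first byte of the 64-bit hash, as two hex digits,
/// names one of 256 subdirectories; the full hex hash names the chunk file.
pub fn chunk_path(root: &Path, hash: u64) -> PathBuf {
    let hex = format!("{:016x}", hash);
    root.join(&hex[..2]).join(&hex)
}
```

Zero-padding the hash to a fixed width keeps the two-character prefix well defined even for numerically small hash values.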
2026-05-12 18:43:39 +02:00
Alexander d9e5e06166 Implement Week 3 virtual tree with path resolver and FUSE integration 2026-05-12 18:25:24 +02:00
Alexander d664439746 Implement Week 2 metadata extraction and cache database
Week 1 fixes:
- Move hex to workspace dependencies
- Add cargo-criterion, protobuf, grpcurl to flake.nix

Week 2 implementation:
- musicfs-metadata: MetadataParser with symphonia 0.5 for FLAC, MP3,
  Opus/Vorbis, M4A/AAC (2 tests)
- musicfs-cache: SQLite schema per architecture 4.3.6 with track/disc
  columns, TEXT content_hash, all required indexes
- musicfs-cache/db.rs: Database with upsert, CRUD, mtime lookup (9 tests)
- musicfs-cache/metadata.rs: MetadataCache with store/lookup/is_fresh/
  invalidate (2 tests)
- musicfs-core: Added Metadata error variant

22 tests pass total. Oracle-verified against architecture doc.
2026-05-12 18:15:44 +02:00
Alexander 76856b893a Implement Week 1 foundation: workspace, core types, FUSE skeleton, LocalOrigin
- musicfs-core: OriginId, FileId, VirtualPath, ContentHash, AudioMeta,
  FileMeta, EventBus with FileAccessed event (5 tests)
- musicfs-fuse: FUSE skeleton with EROFS handlers for write ops
- musicfs-origins: Origin trait with watch(), LocalOrigin impl (6 tests)
- flake.nix: Nix dev shell with rust toolchain, clang, lld, fuse3

All 11 tests pass. Build produces no warnings.
2026-05-12 18:01:47 +02:00
Alexander e08988f7f3 Add development plan and Oracle-validated weekly plans (Weeks 1-3)
development-plan.md (master plan):
- 11-week implementation broken into 4 phases
- 11 Rust crates with dependency graph
- Per-week deliverables, tests, exit criteria
- Deferred requirements (FR-21, FR-22) with rationale

plans/week-01-foundation.md:
- Workspace setup, core types, FUSE skeleton, local origin
- Origin trait with watch() method (arch 4.3.4)
- EventBus with FileAccessed event (FR-18.1)
- All EROFS handlers for read-only enforcement (FR-4.1-4.5)

plans/week-02-metadata.md:
- symphonia metadata extraction (FR-6.1-6.5)
- SQLite schema matching architecture 4.3.6 exactly
- Column names: track/disc (not track_number/disc_number)
- Hash columns as TEXT (hex-encoded, not BLOB)
- Added idx_files_real index (FR-7.3)

plans/week-03-virtual-tree.md:
- Path resolver with $var syntax (arch 4.3.1)
- Template vars: $artist, $album, $title, $track, $year, $disc, $genre, $format, $format_upper
- RefreshPolicy struct for FR-9.3 (TTL-based refresh)
- force_refresh() method for FR-9.4 (signal/API refresh)
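
The `$var` template syntax can be illustrated with a small expander. The function name `expand_template`, the use of a `HashMap` for values, and the choice to leave unknown variables untouched are assumptions; the variable names are those listed in the plan.

```rust
use std::collections::HashMap;

/// Replace each `$name` in `template` with its value from `vars`.
/// A name is the longest run of ASCII letters, digits, or `_` after `$`,
/// so `$format_upper` is never mistaken for `$format`.
/// Unknown names are copied through unchanged (an assumption of this sketch).
pub fn expand_template(template: &str, vars: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut chars = template.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        if c != '$' {
            out.push(c);
            continue;
        }
        // Consume the identifier that follows the `$`.
        let start = i + 1;
        let mut end = start;
        while let Some(&(j, ch)) = chars.peek() {
            if ch.is_ascii_alphanumeric() || ch == '_' {
                end = j + ch.len_utf8();
                chars.next();
            } else {
                break;
            }
        }
        let name = &template[start..end];
        match vars.get(name) {
            Some(value) => out.push_str(value),
            None => {
                out.push('$');
                out.push_str(name);
            }
        }
    }
    out
}
```

A real resolver would also need to sanitize values such as artist names that contain path separators, which this sketch does not attempt.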

All plans Oracle-validated against architecture.md and requirements.md
2026-05-12 17:52:33 +02:00
Alexander dac9f3dd02 Replace JSON-RPC with gRPC for Control API
Update Control API specification to use gRPC over Unix socket instead of
JSON-RPC 2.0. gRPC provides better type safety, native streaming for events,
and auto-generated clients for multi-language integration.

architecture.md:
- Add decision rationale table (JSON-RPC vs gRPC comparison)
- Add full .proto definitions (~200 lines) for musicfs.v1 package
- Define MusicFS service with 9 RPC methods:
  - Daemon: GetStatus, Shutdown
  - Cache: GetCacheStats, ClearCache, Prefetch (streaming)
  - Origins: ListOrigins, GetOriginHealth, RescanOrigin (streaming)
  - Search: Search, SearchStream
  - Events: SubscribeEvents (server-streaming)
- Add grpcurl debugging examples

requirements.md:
- FR-17.1: Clarify Unix socket uses gRPC
- FR-17.2: Upgrade from SHOULD to SHALL for gRPC requirement
2026-05-12 16:51:35 +02:00
Alexander 1374084135 Reorganize docs into v1 (beetfs) and v2 (new architecture)
docs/v1/ - Original beetfs documentation:
  - analysis.md, components.md, data-flow.md, drawbacks.md
  - features.md, modernization.md, rust-migration.md
  - benchmark-plan.md, benchmark-results.md, e2e-test-plan.md
  - README.md

docs/v2/ - New MusicFS architecture:
  - requirements.md: Full requirements spec (FR-1 to FR-25, NFR-1 to NFR-14)
    - P0: Multi-origin, plugins, CAS, control API
    - P1: Search, album art, prefetch, metadata sources
    - P3: HA, 10M+ files scalability
  - architecture.md: Google BlueDoc style design document
    - PlantUML diagrams for all components
    - Design requirements with quantitative targets
    - Alternatives considered, implementation plan
2026-05-12 16:46:37 +02:00
Alexander 3a6115cbab Add benchmark suite and results
Benchmark harness (benchmarks/run_benchmarks.py):
- Mount time, readdir, stat latency, file open/read, memory usage
- ENOENT lookup (missing file) benchmark per Oracle review
- Uses synthetic FLAC files from test infrastructure

Results: ALL BENCHMARKS BLOCKED BY BUGS
- Bug #2 (directory tree building) crashes mount with any content
- FSNode.adddir() assumes parent dirs exist, fails with KeyError
- Bug #1 (nested methods) would block FUSE ops even if mount worked

beetfs is non-functional for real-world use until both bugs are fixed.
2026-05-12 14:44:02 +02:00
Alexander dacd3a7c1f Add benchmark plan for beetfs performance measurement
Covers:
- Mount time scaling (100 to 100K items)
- Metadata operations (stat, readdir throughput)
- File I/O (open latency, read throughput)
- Memory usage (idle, per-file, leak detection)
- Concurrent access (GIL impact)
- Realistic workloads (library scan, album playback)

Tools: fio, mdtest, hyperfine
Baselines: ext4, fuse-passthrough, sshfs

Key bottlenecks identified:
- FileHandler loads entire file into RAM on open
- Mount-time bulk load of all library items
- Python GIL limits parallelism
2026-05-12 14:36:24 +02:00
Alexander f8666ae8c6 Document test findings and fix mount script
Test Results (74 tests):
- 12 passed, 56 failures, 3 errors, 3 skipped

Bugs Detected:
1. Nested methods bug: lines 758-1144 indented inside access()
   - FUSE operations (readdir, open, read, write) unreachable
   - os.listdir() returns ENOSYS (Function not implemented)

2. Directory tree building: KeyError in FSNode.getnode()
   - Mount fails when library contains tracks

3. Unmount not clean: filesystem not releasing properly

Changes:
- Fix conftest.py: inline sanitization (no module-level sanitize fn)
- Add test findings to e2e-test-plan.md
- Add .gitignore for .pyc and test artifacts
2026-05-12 14:29:05 +02:00
Alexander 81df4790bf Add e2e test suite for beetfs
Tests use real FUSE operations against mounted beetfs filesystem:
- test_smoke: mount/unmount lifecycle
- test_nested_bug: detects critical indentation bug (13 failures)
- test_readdir: directory listing
- test_read: metadata overlay verification
- test_stat: file/directory attributes
- test_write: metadata modification
- test_error_handling: ENOENT, EOPNOTSUPP

Also includes:
- conftest.py with BeetFSTestCase base class and synthetic FLAC generator
- e2e-test-plan.md with Oracle-reviewed test strategy
- flake.nix updated with ffmpeg/flac for test fixtures

Run: cd tests && nix develop ../ --command python -m unittest discover
2026-05-12 14:02:55 +02:00
Alexander c18e15987c Add Nix flake for Python 2.7 development environment
Uses nixpkgs-18.09 which has all required Python 2 packages:
- fuse-python 0.2.1
- mutagen 1.41.1
- beets 1.4.9 (built from source)
- jellyfish 0.6.1
- munkres 1.0.6

Run 'nix develop' or 'direnv allow' to enter the environment.
2026-05-12 13:18:30 +02:00
Alexander f0a83df190 Add reverse-engineered documentation
- README.md: Overview, core concept diagram, component summary
- architecture.md: System design, initialization flow, memory model
- components.md: Deep dive on all classes and functions
- data-flow.md: Complete read/write operation flows with diagrams
- analysis.md: Performance analysis (latency, memory footprint, I/O)
- drawbacks.md: Catalog of 27 identified issues and limitations
- modernization.md: Python 3 migration guide with effort estimates
2026-05-12 11:52:48 +02:00
Johannes Baiter 39a9821a07 Make code PEP8-compliant 2013-05-27 14:47:31 +02:00
Johannes Baiter 04b75f6cf7 Add README 2013-05-27 13:34:52 +02:00
Martin Eve 02c04ffaf1 Change InterpolatedFLAC to operate from memory; first working version of write() 2010-07-23 20:04:30 +01:00
Martin Eve 5baa443428 Basic FLAC tags from database 2010-07-23 10:08:18 +01:00
Adrian Sampson 0256547b2c set MP3 header length to 0 for now so MP3 files can be read
This will need to be changed once ID3 interpolation actually works.
2010-07-21 11:19:29 -07:00
Adrian Sampson 6a2df1f1c1 use get_item (in beets HEAD) instead of get_path 2010-07-21 11:18:05 -07:00
Martin Eve 5736929acd FLAC interpolation now works 2010-07-18 14:27:39 +01:00
Martin Eve 04ff5691c8 Remove confusing debug messages 2010-07-18 10:47:05 +01:00
Martin Eve f958700eea Initial commit 2010-07-16 18:39:16 +01:00