Alexander 1374084135 Reorganize docs into v1 (beetfs) and v2 (new architecture)

docs/v1/ - Original beetfs documentation:
  - analysis.md, components.md, data-flow.md, drawbacks.md
  - features.md, modernization.md, rust-migration.md
  - benchmark-plan.md, benchmark-results.md, e2e-test-plan.md
  - README.md

docs/v2/ - New MusicFS architecture:
  - requirements.md: Full requirements spec (FR-1 to FR-25, NFR-1 to NFR-14)
    - P0: Multi-origin, plugins, CAS, control API
    - P1: Search, album art, prefetch, metadata sources
    - P3: HA, 10M+ files scalability
  - architecture.md: Google BlueDoc style design document
    - PlantUML diagrams for all components
    - Design requirements with quantitative targets
    - Alternatives considered, implementation plan

2026-05-12 16:46:37 +02:00

beetfs Benchmark Plan

Executive Summary

Benchmark suite to measure beetfs FUSE filesystem performance across mount time, metadata operations, file I/O, and memory usage. Focus on realistic music library workloads.

Critical Performance Findings (Pre-Benchmark)

Architecture Bottlenecks Identified

Bottleneck	Location	Impact
Full file load into RAM	`FileHandler.__init__` line 481	50-100MB per open FLAC
Mount-time bulk load	`mount()` line 143	O(N) for N library items
GIL serialization	Python 2.7	Single-core limit for metadata ops
Per-file DB lookup	`getattr()`, `access()`	SQLite query per stat call

Expected Performance Characteristics

Operation	Expected Performance	Bottleneck
Mount (10K items)	5-30 seconds	`lib.items()` + FSNode construction
readdir	Fast (in-memory dict)	None
getattr (file)	Slow (~1ms)	DB lookup + real file stat
open (first)	Very slow	Full file read into RAM
read	Fast	Memory-to-memory copy
Memory (10 open files)	500MB-1GB	FileHandler caches entire files

Benchmark Tools

Primary Tools

Tool	Purpose	Install
fio	I/O throughput, IOPS, latency	`nix-shell -p fio`
mdtest	Metadata operations (stat, readdir)	`nix-shell -p ior`
hyperfine	Mount time, command timing	`nix-shell -p hyperfine`
time	Basic timing	shell builtin
/usr/bin/time -v	Memory usage (maxrss)	GNU time (`nix-shell -p time`)

Measurement Scripts

Benchmarks use synthetic FLAC files (5-10MB by default, up to 100MB for the file-open tests) to avoid I/O variance from real storage.

Benchmark Categories

1. Mount Time Scaling

Goal: Measure how mount time scales with library size.

Method:

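# Create libraries with N items: 100, 1K, 10K, 50K, 100K
# Assumes `beet mount` returns once the mount is ready; the fixed 1s sleep is
# part of every timing, so subtract it before comparing against the test matrix
hyperfine --warmup 1 --runs 5 \
  'beet mount /mnt/beetfs && sleep 1 && fusermount -u /mnt/beetfs'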

Metrics:

Time to mount (seconds)
Memory usage at mount completion (RSS)

Expected scaling: O(N) - linear with library size

Test matrix:

Library Size	Expected Mount Time	Expected Memory
100 items	<1s	~50MB
1,000 items	1-3s	~60MB
10,000 items	5-15s	~100MB
50,000 items	30-60s	~300MB
100,000 items	60-120s	~500MB
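
One way to check the O(N) expectation against measured data is to fit a slope on a log-log scale: a slope near 1.0 is consistent with linear scaling, while a slope near 2.0 would point to quadratic behaviour. The sketch below is a minimal illustration in Python 3; the function name `scaling_exponent` and the sample numbers are assumptions for illustration, not measured results. Fixed startup cost can flatten the slope at small library sizes, so the larger sizes carry the most information.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) pairs")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Placeholder values for illustration only (library size, mount seconds).
sizes = [100, 1000, 10000, 100000]
times = [0.5, 5.0, 50.0, 500.0]
print("scaling exponent: %.2f" % scaling_exponent(sizes, times))
```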

2. Metadata Operations (stat/readdir)

Goal: Measure getattr and readdir performance - critical for music players that scan libraries.

2a. Single stat latency

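# Measure single stat call latency (each run also pays for spawning `stat`)
hyperfine --warmup 10 --runs 100 --export-json stat_latency.json \
  'stat /mnt/beetfs/Artist/Album/01-Track.flac'

Target: <5ms average, <20ms p99 (hyperfine prints mean, min and max; derive p99 from the per-run times in the JSON export)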

2b. Bulk stat (library scan simulation)

# Stat all files in library
hyperfine --warmup 1 --runs 5 \
  'find /mnt/beetfs -type f -exec stat {} + > /dev/null'

Metrics:

Total time for N files
stat operations per second
p50, p95, p99 latency

Target: >500 stat/s (Python FUSE baseline)
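
hyperfine does not print percentiles, but its `--export-json` output stores every run's wall time (in seconds) in a `times` array, which is enough to derive p50/p95/p99 and a throughput figure. The following Python 3 sketch assumes that export format; the helper names and the `files_per_run` parameter are hypothetical. It uses the nearest-rank definition of a percentile, which is one of several common conventions.

```python
import json
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, int(math.ceil(p * len(ordered) / 100.0)))
    return ordered[rank - 1]


def summarize_hyperfine(json_text, files_per_run=1):
    """Summarize the first command of a hyperfine --export-json document."""
    times = json.loads(json_text)["results"][0]["times"]  # seconds per run
    mean = sum(times) / len(times)
    return {
        "p50_ms": percentile(times, 50) * 1000.0,
        "p95_ms": percentile(times, 95) * 1000.0,
        "p99_ms": percentile(times, 99) * 1000.0,
        # files_per_run is how many files one run of the command stats
        "stats_per_s": files_per_run / mean,
    }
```

For the bulk scan, `files_per_run` would be the number of files in the library; for the single-stat benchmark it stays at 1.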

2c. Directory listing

# List directory with N entries
hyperfine --warmup 3 --runs 10 \
  'ls /mnt/beetfs/Artist/Album/'

Test matrix:

Directory entries	Target time
10	<50ms
100	<100ms
1,000	<500ms

3. File Open Performance

Goal: Measure file open latency - the critical bottleneck due to full file load.

3a. First open (cold)

# Drop the page cache before every run (needs root); a single drop would leave runs 2-10 warm
hyperfine --warmup 0 --runs 10 \
  --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
  'head -c 1 /mnt/beetfs/Artist/Album/01-Track.flac > /dev/null'

Test matrix:

File size	Expected open time
5MB	50-200ms
20MB	200-500ms
50MB	500ms-1s
100MB	1-2s

3b. Cached open (warm)

# File already opened once
hyperfine --warmup 5 --runs 50 \
  'head -c 1 /mnt/beetfs/Artist/Album/01-Track.flac > /dev/null'

Target: <10ms (should hit FileHandler cache)

4. Read Throughput

Goal: Measure sequential and random read performance.

4a. Sequential read

fio --name=seq_read \
    --filename=/mnt/beetfs/Artist/Album/01-Track.flac \
    --rw=read --bs=1M --direct=0 \
    --ioengine=sync --numjobs=1 \
    --runtime=30 --time_based

Metrics: MB/s throughput

Target: >100 MB/s (memory-backed after first read)

4b. Random read (simulates seeking in audio player)

fio --name=rand_read \
    --filename=/mnt/beetfs/Artist/Album/01-Track.flac \
    --rw=randread --bs=64k --direct=0 \
    --ioengine=sync --numjobs=1 \
    --runtime=30 --time_based

Metrics: IOPS, latency histogram

5. Memory Usage

Goal: Measure memory consumption under load.

5a. Idle memory (mounted, no activity)

# Mount and measure RSS
beet mount /mnt/beetfs &
sleep 5
ps -o rss= -p $(pgrep -f beetfs)

5b. Memory per open file

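# Open N files, measure memory growth
for i in 1 5 10 20; do
  # Open $i files simultaneously ({1..$i} does not expand in bash, so use seq)
  for n in $(seq -f '%02g' 1 "$i"); do
    cat /mnt/beetfs/Artist/Album/"$n"*.flac > /dev/null &
  done
  ps -o rss= -p "$(pgrep -f beetfs)"
  wait  # let the readers finish before the next round
done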

Expected: ~file_size × open_files (FileHandler caches entire file)

5c. Memory leak detection

# Repeatedly open/close files, check for memory growth
for i in {1..100}; do
  cat /mnt/beetfs/Artist/Album/01-Track.flac > /dev/null
done
# Compare RSS before and after
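
The before/after comparison can be scripted by reading `VmRSS` from `/proc/<pid>/status` on Linux. The sketch below is a minimal Python 3 illustration; the function names are hypothetical, and any pass/fail threshold applied to the result is a judgment call, since allocators do not always return freed memory to the OS and some growth after the first iterations is normal.

```python
import re


def parse_vmrss_kb(status_text):
    """Extract VmRSS in kB from the text of /proc/<pid>/status (Linux)."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS not found")
    return int(match.group(1))


def read_rss_kb(pid):
    """Read the current resident set size of a running process."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vmrss_kb(handle.read())


def rss_growth_percent(before_kb, after_kb):
    """Relative RSS growth between two samples, in percent."""
    return (after_kb - before_kb) / float(before_kb) * 100.0
```

A typical run would call `read_rss_kb(pid)` once after a few warm-up iterations, again after the 100-iteration loop, and pass both values to `rss_growth_percent`.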

6. Concurrent Access

Goal: Measure performance under parallel access (multiple processes).

# Parallel stat of the first 100 files found (repeat with -P 1, 2, 4, 8)
hyperfine --warmup 1 --runs 5 \
  'find /mnt/beetfs -type f -print0 | head -z -n 100 | xargs -0 -P 4 -n 1 stat > /dev/null'

Metrics:

Throughput scaling with parallelism (1, 2, 4, 8 workers)
Latency degradation

Expected: Limited scaling due to Python GIL
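
To make "limited scaling" measurable, throughput at each worker count can be turned into a speedup and a parallel efficiency relative to the single-worker run. This is a small Python 3 sketch with a hypothetical function name and placeholder throughputs; under a GIL-bound server, efficiency would be expected to fall roughly as 1/workers.

```python
def scaling_report(throughput_by_workers):
    """Return {workers: (speedup, efficiency)} relative to the 1-worker run."""
    base = float(throughput_by_workers[1])
    report = {}
    for workers, throughput in sorted(throughput_by_workers.items()):
        speedup = throughput / base
        report[workers] = (speedup, speedup / workers)
    return report


# Placeholder throughputs in stat/s, for illustration only.
example = {1: 500, 2: 520, 4: 530, 8: 525}
for workers, (speedup, efficiency) in scaling_report(example).items():
    print("%d workers: speedup %.2fx, efficiency %.0f%%"
          % (workers, speedup, efficiency * 100))
```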

7. Realistic Workloads

7a. Music player library scan

Simulates: Rhythmbox/Clementine scanning library at startup

# Recursive stat + readdir; -c %n prints one line per file so wc -l counts files
time find /mnt/beetfs -type f -name "*.flac" -exec stat -c %n {} + | wc -l

7b. Album playback

Simulates: Playing 12-track album sequentially

# Open each file, read 1MB (simulate buffering), close
for f in /mnt/beetfs/Artist/Album/*.flac; do
  dd if="$f" of=/dev/null bs=1M count=1 2>/dev/null
done

7c. Metadata edit

Simulates: Editing tags in Picard/Kid3

# Open file, write to header region, close
# (Requires write support to be functional)

Baseline Comparisons

Reference Filesystems

Filesystem	Purpose
ext4 (local)	Best-case baseline
fuse-passthrough	FUSE overhead baseline
sshfs	Network FUSE comparison

Comparison Method

Run identical benchmarks on:

Real music files on ext4
Same files via FUSE passthrough
Same files via beetfs

Calculate overhead: (beetfs_time - ext4_time) / ext4_time × 100%
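
As a worked instance of the formula above, here is a short Python 3 helper; the function name and the timings are placeholders chosen for illustration.

```python
def overhead_percent(beetfs_time, ext4_time):
    """Relative overhead of beetfs over the ext4 baseline, in percent."""
    if ext4_time <= 0:
        raise ValueError("baseline time must be positive")
    return (beetfs_time - ext4_time) / float(ext4_time) * 100.0


# Placeholder timings in seconds: a scan taking 2.0s on ext4 and 5.0s on beetfs.
print("%.0f%% overhead" % overhead_percent(5.0, 2.0))  # 150% overhead
```

The same function applies to the FUSE passthrough run, which separates generic FUSE overhead from the cost added by beetfs itself.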

Test Environment

Hardware Requirements

CPU: 4+ cores (to test GIL impact)
RAM: 8+ GB (for large library tests)
Storage: SSD recommended (reduces I/O variance)

Software Requirements

# Add to flake.nix devShell
buildInputs = [
  fio
  hyperfine
  # ior  # includes mdtest
];

Cache Control

# Clear all kernel caches before cold benchmarks (run as root)
sync
echo 3 > /proc/sys/vm/drop_caches

# Disable kernel FUSE caching for accurate measurements
# (libfuse mount options; they apply to the beetfs mount itself)
mount -o entry_timeout=0,attr_timeout=0,negative_timeout=0

Success Criteria

Minimum Viable Performance

Metric	Minimum	Target	Excellent
Mount time (10K items)	<60s	<15s	<5s
stat latency (avg)	<20ms	<5ms	<1ms
stat throughput	>100/s	>500/s	>2000/s
File open (50MB, cold)	<5s	<1s	<200ms
Read throughput	>50 MB/s	>200 MB/s	>500 MB/s
Memory (idle, 10K items)	<500MB	<100MB	<50MB
Memory per open file	<2× file size	<1.5×	<1.1×
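
The table mixes lower-is-better metrics (times, memory) with higher-is-better ones (throughput), so an automated check needs to know the direction. A minimal Python 3 sketch of such a classifier follows; the function name and the tier labels are assumptions, and the strict comparisons mirror the `<` and `>` signs used in the table.

```python
def classify(value, minimum, target, excellent, higher_is_better=False):
    """Map a measurement onto the Minimum / Target / Excellent tiers."""
    def meets(threshold):
        return value > threshold if higher_is_better else value < threshold

    if meets(excellent):
        return "excellent"
    if meets(target):
        return "target"
    if meets(minimum):
        return "minimum"
    return "below minimum"


# stat latency of 3 ms against the <20ms / <5ms / <1ms row
print(classify(3.0, 20.0, 5.0, 1.0))
# stat throughput of 600/s against the >100/s / >500/s / >2000/s row
print(classify(600.0, 100.0, 500.0, 2000.0, higher_is_better=True))
```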

Regression Detection

Any benchmark result >20% worse than baseline triggers investigation.
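
The 20% rule can be expressed as a small predicate. The Python 3 sketch below is one possible reading, not a fixed definition: it assumes "worse" means a relative increase for time and memory metrics and a relative decrease for throughput metrics, and the function and parameter names are hypothetical.

```python
def is_regression(baseline, current, higher_is_better=False, tolerance=0.20):
    """True when `current` is more than `tolerance` worse than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if higher_is_better:
        # throughput-style metric: a drop is a regression
        change = (baseline - current) / float(baseline)
    else:
        # time or memory metric: an increase is a regression
        change = (current - baseline) / float(baseline)
    return change > tolerance


print(is_regression(10.0, 12.5))                           # mount time 10s -> 12.5s
print(is_regression(500.0, 450.0, higher_is_better=True))  # stat/s 500 -> 450
```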

Implementation Notes

Test Data Generation

Use existing test infrastructure from tests/conftest.py:

create_synthetic_flac() - generates valid FLAC files
BeetFSTestCase - creates isolated beets library

Benchmark Script Structure

beetfs/
├── benchmarks/
│   ├── run_all.sh          # Master script
│   ├── bench_mount.sh      # Mount time tests
│   ├── bench_metadata.sh   # stat/readdir tests
│   ├── bench_io.sh         # Read/write throughput
│   ├── bench_memory.sh     # Memory profiling
│   └── results/            # Output directory
│       ├── mount_scaling.csv
│       ├── stat_latency.csv
│       └── ...

Output Format

# Example: mount_scaling.csv
library_size,mount_time_ms,memory_rss_kb,timestamp
100,450,52000,2024-01-15T10:30:00
1000,2100,61000,2024-01-15T10:31:00
10000,12500,98000,2024-01-15T10:33:00
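
Rows in this format can be appended with Python's standard `csv` module. The sketch below is a minimal Python 3 illustration; `append_result` and its parameters are hypothetical names, and the header is written only when the file is new or empty.

```python
import csv
import os
from datetime import datetime

FIELDS = ["library_size", "mount_time_ms", "memory_rss_kb", "timestamp"]


def append_result(path, library_size, mount_time_ms, memory_rss_kb, when=None):
    """Append one measurement row, writing the header for a new file."""
    when = when or datetime.now()
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "library_size": library_size,
            "mount_time_ms": mount_time_ms,
            "memory_rss_kb": memory_rss_kb,
            "timestamp": when.strftime("%Y-%m-%dT%H:%M:%S"),
        })
```

A call such as `append_result("results/mount_scaling.csv", 100, 450, 52000)` would add one row stamped with the current local time.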

Known Limitations

Python 2.7 GIL: Cannot achieve true parallelism - expect flat scaling beyond 1 core
FileHandler memory: Each open file = full file in RAM - will OOM with many large files
No lazy loading: All library items loaded at mount - slow for large libraries
SQLite single-writer: Concurrent writes will serialize

Optimization Opportunities (Post-Benchmark)

Based on benchmark results, consider:

Lazy FSNode construction - Build tree on first access, not mount
Memory-mapped file access - mmap instead of full read
LRU cache for FileHandler - Evict old files instead of holding all
Metadata caching - Cache getattr results, invalidate on DB change
Batch DB queries - Prefetch metadata for directory listings
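
To illustrate the LRU idea from the list above, here is a byte-budgeted cache sketch. It is not beetfs code: the class `FileBufferCache`, its methods, and the `loader` callback are hypothetical, and it is written for Python 3 because `OrderedDict.move_to_end` is not available in Python 2.7. The most recently loaded file is always kept, even if it alone exceeds the budget.

```python
from collections import OrderedDict


class FileBufferCache:
    """Hypothetical LRU cache of whole-file buffers with a byte budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # path -> bytes, least recent first

    def get(self, path, loader):
        """Return the cached buffer, loading it via loader(path) on a miss."""
        if path in self._entries:
            self._entries.move_to_end(path)  # mark as most recently used
            return self._entries[path]
        data = loader(path)
        self._entries[path] = data
        self.used_bytes += len(data)
        self._evict()
        return data

    def _evict(self):
        # Drop least recently used buffers until the budget is met,
        # but never evict the entry that was just inserted.
        while self.used_bytes > self.max_bytes and len(self._entries) > 1:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)

    def __contains__(self, path):
        return path in self._entries
```

With a 200MB budget and 50MB files, at most four buffers stay resident, which bounds memory regardless of how many files have been opened over time.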
