From dacd3a7c1fa1d44286dac8f369625379485cc21b Mon Sep 17 00:00:00 2001
From: Alexander
Date: Tue, 12 May 2026 14:36:24 +0200
Subject: [PATCH] Add benchmark plan for beetfs performance measurement

Covers:
- Mount time scaling (100 to 100K items)
- Metadata operations (stat, readdir throughput)
- File I/O (open latency, read throughput)
- Memory usage (idle, per-file, leak detection)
- Concurrent access (GIL impact)
- Realistic workloads (library scan, album playback)

Tools: fio, mdtest, hyperfine
Baselines: ext4, fuse-passthrough, sshfs

Key bottlenecks identified:
- FileHandler loads entire file into RAM on open
- Mount-time bulk load of all library items
- Python GIL limits parallelism
---
 docs/benchmark-plan.md | 403 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 403 insertions(+)
 create mode 100644 docs/benchmark-plan.md

diff --git a/docs/benchmark-plan.md b/docs/benchmark-plan.md
new file mode 100644
index 0000000..a360ab3
--- /dev/null
+++ b/docs/benchmark-plan.md
@@ -0,0 +1,403 @@
+# beetfs Benchmark Plan
+
+## Executive Summary
+
+Benchmark suite to measure beetfs FUSE filesystem performance across mount time, metadata operations, file I/O, and memory usage. Focus on realistic music library workloads.
+
+## Critical Performance Findings (Pre-Benchmark)
+
+### Architecture Bottlenecks Identified
+
+| Bottleneck | Location | Impact |
+|------------|----------|--------|
+| **Full file load into RAM** | `FileHandler.__init__` line 481 | 50-100MB per open FLAC |
+| **Mount-time bulk load** | `mount()` line 143 | O(N) for N library items |
+| **GIL serialization** | Python 2.7 | Single-core limit for metadata ops |
+| **Per-file DB lookup** | `getattr()`, `access()` | SQLite query per stat call |
+
+### Expected Performance Characteristics
+
+| Operation | Expected Performance | Bottleneck |
+|-----------|---------------------|------------|
+| Mount (10K items) | 5-30 seconds | `lib.items()` + FSNode construction |
+| readdir | Fast (in-memory dict) | None |
+| getattr (file) | Slow (~1ms) | DB lookup + real file stat |
+| open (first) | Very slow | Full file read into RAM |
+| read | Fast | Memory-to-memory copy |
+| Memory (10 open files) | 500MB-1GB | FileHandler caches entire files |
+
+---
+
+## Benchmark Tools
+
+### Primary Tools
+
+| Tool | Purpose | Install |
+|------|---------|---------|
+| **fio** | I/O throughput, IOPS, latency | `nix-shell -p fio` |
+| **mdtest** | Metadata operations (stat, readdir) | `nix-shell -p ior` |
+| **hyperfine** | Mount time, command timing | `nix-shell -p hyperfine` |
+| **time** | Basic timing | shell builtin |
+| **/usr/bin/time -v** | Peak memory usage (max RSS) | GNU time (`nix-shell -p time`) |
+
+### Measurement Scripts
+
+All benchmarks use synthetic FLAC files (5-10MB, unless a test matrix lists other sizes) to avoid I/O variance from real storage.
+
+---
+
+## Benchmark Categories
+
+### 1. Mount Time Scaling
+
+**Goal**: Measure how mount time scales with library size.
+
+**Method**:
+```bash
+# Create libraries with N items: 100, 1K, 10K, 50K, 100K
+hyperfine --warmup 1 --runs 5 \
+  'beet mount /mnt/beetfs && sleep 1 && fusermount -u /mnt/beetfs'
+```
+
+**Metrics**:
+- Time to mount (seconds)
+- Memory usage at mount completion (RSS)
+
+**Expected scaling**: O(N) - linear with library size
+
+**Test matrix**:
+| Library Size | Expected Mount Time | Expected Memory |
+|--------------|--------------------:|----------------:|
+| 100 items | <1s | ~50MB |
+| 1,000 items | 1-3s | ~60MB |
+| 10,000 items | 5-15s | ~100MB |
+| 50,000 items | 30-60s | ~300MB |
+| 100,000 items | 60-120s | ~500MB |
+
+---
+
+### 2. Metadata Operations (stat/readdir)
+
+**Goal**: Measure getattr and readdir performance - critical for music players that scan libraries.
+
+#### 2a. Single stat latency
+
+```bash
+# Measure single stat call latency (-N runs stat without an intermediate shell)
+hyperfine -N --warmup 10 --runs 100 \
+  'stat /mnt/beetfs/Artist/Album/01-Track.flac'
+```
+
+**Target**: <5ms average, <20ms p99
+
+#### 2b. Bulk stat (library scan simulation)
+
+```bash
+# Stat all files in library
+hyperfine --warmup 1 --runs 5 \
+  'find /mnt/beetfs -type f -exec stat {} + > /dev/null'
+```
+
+**Metrics**:
+- Total time for N files
+- stat operations per second
+- p50, p95, p99 latency
+
+**Target**: >500 stat/s (Python FUSE baseline)
+
+#### 2c. Directory listing
+
+```bash
+# List directory with N entries
+hyperfine --warmup 3 --runs 10 \
+  'ls /mnt/beetfs/Artist/Album/'
+```
+
+**Test matrix**:
+| Directory entries | Target time |
+|------------------:|------------:|
+| 10 | <50ms |
+| 100 | <100ms |
+| 1,000 | <500ms |
+
+---
+
+### 3. File Open Performance
+
+**Goal**: Measure file open latency - the critical bottleneck due to full file load.
+
+#### 3a. First open (cold)
+
+```bash
+# Drop the page cache before every run so each open is cold (requires root)
+hyperfine --warmup 0 --runs 10 \
+  --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
+  'head -c 1 /mnt/beetfs/Artist/Album/01-Track.flac > /dev/null'
+```
+
+**Test matrix**:
+| File size | Expected open time |
+|----------:|-------------------:|
+| 5MB | 50-200ms |
+| 20MB | 200-500ms |
+| 50MB | 500ms-1s |
+| 100MB | 1-2s |
+
+#### 3b. Cached open (warm)
+
+```bash
+# File already opened once
+hyperfine --warmup 5 --runs 50 \
+  'head -c 1 /mnt/beetfs/Artist/Album/01-Track.flac > /dev/null'
+```
+
+**Target**: <10ms (should hit FileHandler cache)
+
+---
+
+### 4. Read Throughput
+
+**Goal**: Measure sequential and random read performance.
+
+#### 4a. Sequential read
+
+```bash
+fio --name=seq_read \
+  --filename=/mnt/beetfs/Artist/Album/01-Track.flac \
+  --rw=read --bs=1M --direct=0 \
+  --ioengine=sync --numjobs=1 \
+  --runtime=30 --time_based
+```
+
+**Metrics**: MB/s throughput
+
+**Target**: >100 MB/s (memory-backed after first read)
+
+#### 4b. Random read (simulates seeking in audio player)
+
+```bash
+fio --name=rand_read \
+  --filename=/mnt/beetfs/Artist/Album/01-Track.flac \
+  --rw=randread --bs=64k --direct=0 \
+  --ioengine=sync --numjobs=1 \
+  --runtime=30 --time_based
+```
+
+**Metrics**: IOPS, latency histogram
+
+---
+
+### 5. Memory Usage
+
+**Goal**: Measure memory consumption under load.
+
+#### 5a. Idle memory (mounted, no activity)
+
+```bash
+# Mount and measure RSS ('beet mount' keeps pgrep from matching other commands that mention beetfs)
+beet mount /mnt/beetfs &
+sleep 5
+ps -o rss= -p $(pgrep -f 'beet mount')
+```
+
+#### 5b. Memory per open file
+
+```bash
+# Read N files in parallel, measure memory growth
+for n in 1 5 10 20; do
+  # {1..$n} does not expand in bash, so take the first $n tracks from the glob
+  printf '%s\n' /mnt/beetfs/Artist/Album/*.flac | head -n "$n" | xargs -d '\n' -P "$n" -n 1 cat > /dev/null &
+  sleep 1; ps -o rss= -p $(pgrep -f 'beet mount'); wait
+done
+```
+
+**Expected**: ~file_size × open_files (FileHandler caches entire file)
+
+#### 5c. Memory leak detection
+
+```bash
+# Repeatedly open/close files, check for memory growth
+for i in {1..100}; do
+  cat /mnt/beetfs/Artist/Album/01-Track.flac > /dev/null
+done
+# Compare RSS before and after
+```
+
+---
+
+### 6. Concurrent Access
+
+**Goal**: Measure performance under parallel access (multiple processes).
+
+```bash
+# Parallel stat operations (assumes tracks 01-Track.flac .. 99-Track.flac exist)
+hyperfine --warmup 1 --runs 5 \
+  'seq -w 1 99 | xargs -P 4 -I {} stat /mnt/beetfs/Artist/Album/{}-Track.flac'
+```
+
+**Metrics**:
+- Throughput scaling with parallelism (1, 2, 4, 8 workers)
+- Latency degradation
+
+**Expected**: Limited scaling due to Python GIL
+
+---
+
+### 7. Realistic Workloads
+
+#### 7a. Music player library scan
+
+Simulates: Rhythmbox/Clementine scanning library at startup
+
+```bash
+# Recursive stat + readdir; -c %n prints one line per file so wc -l counts files
+time find /mnt/beetfs -type f -name "*.flac" -exec stat -c %n {} + | wc -l
+```
+
+#### 7b. Album playback
+
+Simulates: Playing 12-track album sequentially
+
+```bash
+# Open each file, read 1MB (simulate buffering), close
+for f in /mnt/beetfs/Artist/Album/*.flac; do
+  dd if="$f" of=/dev/null bs=1M count=1 2>/dev/null
+done
+```
+
+#### 7c. Metadata edit
+
+Simulates: Editing tags in Picard/Kid3
+
+```bash
+# Open file, write to header region, close
+# (Requires write support to be functional)
+```
+
+---
+
+## Baseline Comparisons
+
+### Reference Filesystems
+
+| Filesystem | Purpose |
+|------------|---------|
+| **ext4 (local)** | Best-case baseline |
+| **fuse-passthrough** | FUSE overhead baseline |
+| **sshfs** | Network FUSE comparison |
+
+### Comparison Method
+
+Run identical benchmarks on:
+1. Real music files on ext4
+2. Same files via FUSE passthrough
+3. Same files via beetfs
+
+Calculate overhead: `(beetfs_time - ext4_time) / ext4_time × 100%`
+
+---
+
+## Test Environment
+
+### Hardware Requirements
+
+- CPU: 4+ cores (to test GIL impact)
+- RAM: 8+ GB (for large library tests)
+- Storage: SSD recommended (reduces I/O variance)
+
+### Software Requirements
+
+```nix
+# Add to flake.nix devShell
+buildInputs = [
+  fio
+  hyperfine
+  # ior  # includes mdtest
+];
+```
+
+### Cache Control
+
+```bash
+# Clear all caches before cold benchmarks (requires root)
+sync
+echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
+
+# Disable kernel FUSE caching for accurate measurements: mount beetfs with these FUSE options
+# -o entry_timeout=0,attr_timeout=0,negative_timeout=0
+```
+
+---
+
+## Success Criteria
+
+### Minimum Viable Performance
+
+| Metric | Minimum | Target | Excellent |
+|--------|--------:|-------:|----------:|
+| Mount time (10K items) | <60s | <15s | <5s |
+| stat latency (avg) | <20ms | <5ms | <1ms |
+| stat throughput | >100/s | >500/s | >2000/s |
+| File open (50MB, cold) | <5s | <1s | <200ms |
+| Read throughput | >50 MB/s | >200 MB/s | >500 MB/s |
+| Memory (idle, 10K items) | <500MB | <100MB | <50MB |
+| Memory per open file | <2× file size | <1.5× | <1.1× |
+
+### Regression Detection
+
+Any benchmark result >20% worse than baseline triggers investigation.
+
+---
+
+## Implementation Notes
+
+### Test Data Generation
+
+Use existing test infrastructure from `tests/conftest.py`:
+- `create_synthetic_flac()` - generates valid FLAC files
+- `BeetFSTestCase` - creates isolated beets library
+
+### Benchmark Script Structure
+
+```
+beetfs/
+├── benchmarks/
+│   ├── run_all.sh           # Master script
+│   ├── bench_mount.sh       # Mount time tests
+│   ├── bench_metadata.sh    # stat/readdir tests
+│   ├── bench_io.sh          # Read/write throughput
+│   ├── bench_memory.sh      # Memory profiling
+│   └── results/             # Output directory
+│       ├── mount_scaling.csv
+│       ├── stat_latency.csv
+│       └── ...
+```
+
+### Output Format
+
+```csv
+# Example: mount_scaling.csv
+library_size,mount_time_ms,memory_rss_kb,timestamp
+100,450,52000,2024-01-15T10:30:00
+1000,2100,61000,2024-01-15T10:31:00
+10000,12500,98000,2024-01-15T10:33:00
+```
+
+---
+
+## Known Limitations
+
+1. **Python 2.7 GIL**: Cannot achieve true parallelism - expect flat scaling beyond 1 core
+2. **FileHandler memory**: Each open file = full file in RAM - will OOM with many large files
+3. **No lazy loading**: All library items loaded at mount - slow for large libraries
+4. **SQLite single-writer**: Concurrent writes will serialize
+
+## Optimization Opportunities (Post-Benchmark)
+
+Based on benchmark results, consider:
+
+1. **Lazy FSNode construction** - Build tree on first access, not mount
+2. **Memory-mapped file access** - mmap instead of full read
+3. **LRU cache for FileHandler** - Evict old files instead of holding all
+4. **Metadata caching** - Cache getattr results, invalidate on DB change
+5. **Batch DB queries** - Prefetch metadata for directory listings
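
The LRU idea in optimization item 3 could look roughly like the sketch below. It is an illustration only, not beetfs code: `BufferCache`, `max_bytes`, and the `loader` callback are hypothetical names, and it assumes whole-file buffers are cached by path and bounded by a total byte budget. It avoids `OrderedDict.move_to_end` so the same code should also run on Python 2.7.

```python
from collections import OrderedDict


class BufferCache(object):
    """Hypothetical LRU cache of whole-file buffers, bounded by total bytes."""

    def __init__(self, max_bytes, loader):
        self.max_bytes = max_bytes      # memory budget for all cached buffers
        self.loader = loader            # assumed callable: path -> bytes
        self.used_bytes = 0
        self._entries = OrderedDict()   # path -> bytes, least recently used first

    def __contains__(self, path):
        return path in self._entries

    def get(self, path):
        if path in self._entries:
            # Hit: re-insert so the entry becomes the most recently used.
            data = self._entries.pop(path)
            self._entries[path] = data
            return data
        # Miss: load the file and account for its size.
        data = self.loader(path)
        self._entries[path] = data
        self.used_bytes += len(data)
        # Evict least recently used buffers until back under budget,
        # but always keep the entry that was just loaded.
        while self.used_bytes > self.max_bytes and len(self._entries) > 1:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        return data


if __name__ == "__main__":
    # Stand-in loader that fabricates 4-byte "files" instead of touching disk.
    cache = BufferCache(max_bytes=8, loader=lambda path: b"\x00" * 4)
    cache.get("a.flac")
    cache.get("b.flac")
    cache.get("a.flac")   # refreshes a.flac
    cache.get("c.flac")   # exceeds the budget, so b.flac is evicted
    print(sorted(p for p in ("a.flac", "b.flac", "c.flac") if p in cache))
```

Bounding by bytes rather than by entry count fits the memory criteria above, since a single 100MB FLAC costs as much as twenty 5MB ones. Whether eviction is safe while a file is still open depends on how `FileHandler` shares its buffer, which this sketch does not model.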