Reorganize docs into v1 (beetfs) and v2 (new architecture)
docs/v1/ - Original beetfs documentation:
- analysis.md, components.md, data-flow.md, drawbacks.md
- features.md, modernization.md, rust-migration.md
- benchmark-plan.md, benchmark-results.md, e2e-test-plan.md
- README.md
docs/v2/ - New MusicFS architecture:
- requirements.md: Full requirements spec (FR-1 to FR-25, NFR-1 to NFR-14)
- P0: Multi-origin, plugins, CAS, control API
- P1: Search, album art, prefetch, metadata sources
- P3: HA, 10M+ files scalability
- architecture.md: Google BlueDoc style design document
- PlantUML diagrams for all components
- Design requirements with quantitative targets
- Alternatives considered, implementation plan
@@ -0,0 +1,459 @@
# beetfs Modernization Guide

## Current State Analysis

### Technical Debt

| Issue | Severity | Location |
|-------|----------|----------|
| Python 2 syntax | 🔴 Critical | Throughout |
| fuse-python (deprecated) | 🔴 Critical | Lines 25, 51 |
| `basestring` usage | 🔴 Critical | Line 89 |
| `reduce` without import | 🟡 Medium | Line 197 |
| `0755` octal syntax | 🟡 Medium | Lines 654, 700 |
| `print` as statement | 🟡 Medium | N/A (not used) |
| `except Exception, e` | 🔴 Critical | Line 181 |
| Long integers (`0L`) | 🟡 Medium | Line 197 |
| Global state | 🟡 Medium | Lines 125-140 |
| Memory-heavy design | 🟡 Medium | Line 481 |

### Dependencies to Update

| Original | Replacement | Notes |
|----------|-------------|-------|
| `fuse-python` | `pyfuse3` or `llfuse` | Modern FUSE bindings |
| `beets` (old API) | `beets >= 1.6` | Check API compatibility |
| `mutagen` | `mutagen >= 1.45` | Mostly compatible |
| Python 2.7 | Python 3.9+ | Full migration needed |

---

## Migration Steps

### Phase 1: Python 3 Compatibility

#### 1.1 Fix Syntax Issues

```python
# BEFORE (Python 2)
except fuse.FuseError, e:
    log.error(str(e))

# AFTER (Python 3)
except fuse.FuseError as e:
    log.error(str(e))
```

```python
# BEFORE
if isinstance(value, basestring):

# AFTER
if isinstance(value, str):
```

```python
# BEFORE
return reduce(lambda a, b: (a << 8) + ord(b), string, 0L)

# AFTER
from functools import reduce
# Iterating over bytes yields ints in Python 3, so ord() is dropped.
return reduce(lambda a, b: (a << 8) + b, string, 0)
```
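
The `reduce` expression above folds a byte string into a big-endian integer. As a minimal sketch (the function names are illustrative and not taken from beetfs), the Python 3 form can be checked against the built-in `int.from_bytes`, which computes the same value:

```python
from functools import reduce


def bytes_to_int(data: bytes) -> int:
    """Fold bytes into a big-endian integer, as in the AFTER snippet."""
    # Iterating over bytes yields ints in Python 3, so no ord() is needed.
    return reduce(lambda acc, byte: (acc << 8) + byte, data, 0)


def bytes_to_int_builtin(data: bytes) -> int:
    """Same result using the standard library."""
    return int.from_bytes(data, byteorder="big")


sample = b"\x00\x01\x02"
assert bytes_to_int(sample) == bytes_to_int_builtin(sample) == 258
```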

```python
# BEFORE
mode = stat.S_IFDIR | 0755

# AFTER
mode = stat.S_IFDIR | 0o755
```

#### 1.2 Fix String/Bytes Handling

```python
# BEFORE - implicit string/bytes mixing
self.header = self.inf.get_header(self.real_path)
return self.header[offset:offset+size]

# AFTER - explicit bytes handling
self.header: bytes = self.inf.get_header(self.real_path)
return self.header[offset:offset+size]
```

```python
# BEFORE
self.item.title = str(self.inf["title"][0]).encode('utf-8')

# AFTER
self.item.title = self.inf["title"][0]  # Already str in Python 3
```

#### 1.3 Fix Dictionary Methods

```python
# BEFORE
return node.dirs.keys()

# AFTER
return list(node.dirs.keys())  # If list is needed
# or just
return node.dirs.keys()  # If iteration is sufficient
```
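
The distinction matters because `dict.keys()` returns a live view in Python 3 rather than a list. A short sketch of the difference, using a plain dictionary as a stand-in for `node.dirs`:

```python
dirs = {"Artist A": object(), "Artist B": object()}

view = dirs.keys()            # live view, not a list
snapshot = list(dirs.keys())  # independent copy

dirs["Artist C"] = object()

# The view sees the new key; the snapshot does not.
assert "Artist C" in view
assert "Artist C" not in snapshot

# Views support iteration and membership tests but not indexing.
try:
    view[0]
except TypeError:
    indexable = False
else:
    indexable = True
assert indexable is False
```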

---

### Phase 2: FUSE Library Migration

#### Option A: pyfuse3 (Recommended)

Modern, async-capable FUSE bindings.

```python
# BEFORE (fuse-python)
import fuse
fuse.fuse_python_api = (0, 2)

class beetFileSystem(fuse.Fuse):
    def read(self, path, size, offset):
        return data

# AFTER (pyfuse3)
import pyfuse3
import trio

class BeetFS(pyfuse3.Operations):
    async def read(self, fh, offset, size):
        return data

async def main():
    fs = BeetFS()
    fuse_options = set(pyfuse3.default_options)
    fuse_options.add('fsname=beetfs')
    pyfuse3.init(fs, mountpoint, fuse_options)
    try:
        await pyfuse3.main()
    finally:
        pyfuse3.close()

trio.run(main)
```

**Key Differences**:

| fuse-python | pyfuse3 |
|-------------|---------|
| `read(path, size, offset)` | `read(fh, offset, size)` |
| Synchronous | Async (trio) |
| Return data directly | Return bytes |
| Path-based | File handle based |
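
Moving from path-based to handle-based calls means the filesystem has to remember which file each handle refers to. One possible shape for that bookkeeping is sketched below. `HandleTable` and its methods are hypothetical names, not part of pyfuse3, and a real `open` handler would still need to return the handle in whatever form the FUSE binding expects.

```python
import io
import itertools
from typing import BinaryIO, Dict


class HandleTable:
    """Maps integer file handles to open file objects (illustrative only)."""

    def __init__(self) -> None:
        self._next = itertools.count(1)
        self._open: Dict[int, BinaryIO] = {}

    def register(self, file_object: BinaryIO) -> int:
        """Store an open file and return a fresh handle for it."""
        fh = next(self._next)
        self._open[fh] = file_object
        return fh

    def get(self, fh: int) -> BinaryIO:
        """Look up the file behind a handle; raises KeyError if unknown."""
        return self._open[fh]

    def release(self, fh: int) -> None:
        """Forget the handle and close the underlying file."""
        self._open.pop(fh).close()


# An in-memory stream stands in for a real audio file.
table = HandleTable()
fh = table.register(io.BytesIO(b"fLaC-rest-of-file"))
stream = table.get(fh)
assert stream.read(4) == b"fLaC"
table.release(fh)
assert stream.closed
```

With a table like this, `read(fh, offset, size)` can look up the file by handle instead of resolving the path on every call.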

#### Option B: llfuse (Alternative)

Lower-level, synchronous.

```python
import llfuse

class BeetFS(llfuse.Operations):
    def read(self, fh, offset, size):
        return data

def main():
    fs = BeetFS()
    llfuse.init(fs, mountpoint, options)
    try:
        llfuse.main()
    finally:
        llfuse.close()
```

#### Option C: fusepy (Simple)

Simple wrapper, but less maintained.

```python
from fuse import FUSE, Operations

class BeetFS(Operations):
    def read(self, path, size, offset, fh):
        return data

FUSE(BeetFS(), mountpoint, foreground=True)
```

---

### Phase 3: Architecture Improvements

#### 3.1 Remove Global State

```python
# BEFORE - Global variables
global structure_split
global structure_depth
global library
global directory_structure

# AFTER - Instance variables
class BeetFS:
    def __init__(self, lib: Library, path_format: str):
        self.lib = lib
        self.path_format = path_format
        self.structure_split = path_format.split("/")
        self.structure_depth = len(self.structure_split)
        self.directory_structure = FSNode({}, {})
        self._build_tree()
```

#### 3.2 Reduce Memory Usage

```python
# BEFORE - Load entire audio into memory
self.music_data = self.file_object.read()  # Could be 100MB+

# AFTER - Lazy loading with mmap or seek
class FileHandler:
    def __init__(self, path, lib):
        self.real_path = self._resolve_path(path)
        self.file_object = open(self.real_path, 'rb')
        self._header = None  # Lazy load
        # Start of the audio data in the real file. It must be set
        # (for example by _generate_header) before any audio read.
        self._music_offset = None

    @property
    def header(self) -> bytes:
        if self._header is None:
            self._header = self._generate_header()
        return self._header

    def read(self, size: int, offset: int) -> bytes:
        if offset < len(self.header):
            # Header region - return from generated header
            if offset + size <= len(self.header):
                return self.header[offset:offset+size]
            else:
                # Span header and audio
                header_part = self.header[offset:]
                audio_offset = 0
                audio_size = size - len(header_part)
                audio_part = self._read_audio(audio_offset, audio_size)
                return header_part + audio_part
        else:
            # Audio region - read directly from file
            audio_offset = offset - len(self.header)
            return self._read_audio(audio_offset, size)

    def _read_audio(self, offset: int, size: int) -> bytes:
        self.file_object.seek(self._music_offset + offset)
        return self.file_object.read(size)
```
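
The read logic above can be exercised without a real audio file. The following sketch is a simplified, self-contained variant: `SplicedFile` is a hypothetical name, the header is passed in rather than generated, and `music_offset` marks where audio begins in the backing file.

```python
import io


class SplicedFile:
    """Serves a replacement header followed by the original audio bytes."""

    def __init__(self, backing, header: bytes, music_offset: int):
        self.backing = backing            # seekable binary file object
        self.header = header              # regenerated metadata header
        self.music_offset = music_offset  # start of audio in the backing file

    def read(self, size: int, offset: int) -> bytes:
        header_len = len(self.header)
        if offset >= header_len:
            # Entirely inside the audio region.
            return self._read_audio(offset - header_len, size)
        # Slicing clamps at the end of the header automatically.
        header_part = self.header[offset:offset + size]
        remaining = size - len(header_part)
        if remaining == 0:
            return header_part
        # The request runs past the header, so continue into the audio.
        return header_part + self._read_audio(0, remaining)

    def _read_audio(self, offset: int, size: int) -> bytes:
        self.backing.seek(self.music_offset + offset)
        return self.backing.read(size)


# Original file: a 4-byte old header, then audio. The new header is 6 bytes.
original = io.BytesIO(b"OLD!" + b"audio-data")
spliced = SplicedFile(original, header=b"NEWHDR", music_offset=4)

assert spliced.read(6, 0) == b"NEWHDR"  # header only
assert spliced.read(4, 4) == b"DRau"    # spans header and audio
assert spliced.read(5, 6) == b"audio"   # audio only
```

Only the requested slice of audio is read from disk on each call, so memory use no longer grows with file size.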

#### 3.3 Add Type Hints

```python
from typing import Dict, List, Optional, Tuple
from pathlib import Path

class FSNode:
    def __init__(self, dirs: Dict[str, 'FSNode'], files: Dict[str, int]):
        self.dirs: Dict[str, FSNode] = dirs
        self.files: Dict[str, int] = files

    def getnode(self, elements: List[str], root: Optional['FSNode'] = None) -> 'FSNode':
        ...

    def addfile(self, elements: List[str], filename: str, item_id: int) -> None:
        ...
```

#### 3.4 Add MP3 Support

```python
from pathlib import Path

from mutagen.id3 import TALB, TCON, TIT2, TPE1

class FileHandler:
    def __init__(self, path: str, lib: Library):
        self.format = Path(path).suffix[1:].lower()
        # Resolution of self.real_path and self.item is omitted here.

        if self.format == "flac":
            self._handler = FLACHandler(self.real_path, self.item)
        elif self.format == "mp3":
            self._handler = MP3Handler(self.real_path, self.item)
        elif self.format in ("ogg", "opus"):
            self._handler = OggHandler(self.real_path, self.item)
        else:
            raise UnsupportedFormatError(f"Format {self.format} not supported")

class FLACHandler:
    def generate_header(self, item: Item) -> bytes:
        inf = InterpolatedFLAC(self.file_data)
        inf["title"] = item.title
        inf["album"] = item.album
        inf["artist"] = item.artist
        inf["genre"] = item.genre
        return inf.get_header()

class MP3Handler:
    def generate_header(self, item: Item) -> bytes:
        # Implement ID3v2 header generation
        id3 = InterpolatedID3()
        # encoding=3 selects UTF-8 text frames
        id3.add(TIT2(encoding=3, text=item.title))
        id3.add(TPE1(encoding=3, text=item.artist))
        id3.add(TALB(encoding=3, text=item.album))
        id3.add(TCON(encoding=3, text=item.genre))

        # Calculate padding to match original header size
        ...
        return id3.render()
```
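
The padding step is left open above. Presumably its purpose is to make the generated header occupy the same number of bytes as the original, so that audio offsets stay unchanged. A minimal sketch of the size arithmetic follows, assuming zero-byte padding and a hypothetical helper name; in a real ID3v2 tag the padding also has to be counted in the size field of the tag header, which this sketch does not handle.

```python
def pad_to_size(rendered: bytes, target_size: int) -> bytes:
    """Right-pad rendered tag data with zero bytes up to target_size."""
    if len(rendered) > target_size:
        # The new metadata does not fit in the space of the old header.
        raise ValueError(
            f"rendered data is {len(rendered)} bytes, "
            f"larger than the {target_size}-byte target"
        )
    return rendered + b"\x00" * (target_size - len(rendered))


padded = pad_to_size(b"frames", 10)
assert padded == b"frames\x00\x00\x00\x00"
assert len(padded) == 10
```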

---

### Phase 4: Testing

#### 4.1 Unit Tests

```python
import pytest
from beetfs import FSNode, FileHandler

class TestFSNode:
    def test_adddir(self):
        root = FSNode({}, {})
        root.adddir([], "Artist")
        assert "Artist" in root.dirs

    def test_addfile(self):
        root = FSNode({}, {})
        root.adddir([], "Artist")
        root.addfile(["Artist"], "track.flac", 42)
        assert root.dirs["Artist"].files["track.flac"] == 42

    def test_getnode(self):
        root = FSNode({}, {})
        root.adddir([], "Artist")
        root.adddir(["Artist"], "Album")
        node = root.getnode(["Artist", "Album"])
        assert node is not None

class TestFileHandler:
    # mock_lib is requested as a fixture so the name is defined in the test body.
    def test_read_header(self, mock_flac_file, mock_beets_item, mock_lib):
        handler = FileHandler("/Artist/Album/track.flac", mock_lib)
        data = handler.read(100, 0)
        assert data.startswith(b"fLaC")

    def test_read_audio(self, mock_flac_file, mock_beets_item, mock_lib):
        handler = FileHandler("/Artist/Album/track.flac", mock_lib)
        # Assumes handler.bound is the length of the generated header.
        data = handler.read(100, handler.bound + 100)
        # Should be audio data from original file
        assert data == mock_flac_file.audio_data[100:200]
```

#### 4.2 Integration Tests

```python
import os
import subprocess
import tempfile
import time

class TestFUSEMount:
    def test_mount_unmount(self, beets_library):
        with tempfile.TemporaryDirectory() as mountpoint:
            # Mount
            proc = subprocess.Popen(
                ["beet", "mount", mountpoint],
                stdout=subprocess.PIPE
            )
            time.sleep(1)

            # Verify mount
            assert os.path.ismount(mountpoint)

            # List files
            files = os.listdir(mountpoint)
            assert len(files) > 0

            # Unmount
            subprocess.run(["fusermount", "-u", mountpoint])
            proc.wait()
```
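
A fixed `time.sleep(1)` can make this test flaky on slow machines, because the mount may not be ready after one second. One alternative is to poll until the mount appears. `wait_until` below is a hypothetical helper, not part of pytest or the standard library.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One last check so a success right at the deadline is not missed.
    return predicate()


# In the test above this would replace time.sleep(1):
#     assert wait_until(lambda: os.path.ismount(mountpoint))
```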

---

### Phase 5: Standalone Mode (Optional)

Remove the beets dependency so the filesystem can be used as a standalone metadata overlay.

```python
import sqlite3
from pathlib import Path

class StandaloneFS:
    """Metadata overlay without beets dependency."""

    def __init__(self,
                 source_dir: Path,
                 metadata_db: Path,
                 path_format: str):
        self.source_dir = source_dir
        self.db = sqlite3.connect(metadata_db)
        self.path_format = path_format
        # The tree must exist before _build_tree() adds files to it.
        self.directory_structure = FSNode({}, {})
        self._build_tree()

    def _build_tree(self):
        """Build virtual tree from source directory and metadata DB."""
        for audio_file in self.source_dir.rglob("*.flac"):
            # Get metadata from DB or scan file
            metadata = self._get_metadata(audio_file)
            # Build virtual path from template
            virtual_path = self._format_path(metadata)
            # Add to tree
            self.directory_structure.addfile(
                list(virtual_path.parent.parts),
                virtual_path.name,
                str(audio_file)  # Store actual path instead of ID
            )
```
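
`_format_path` is not shown above. A minimal sketch of one way to implement it follows, assuming the template uses `str.format`-style fields such as `{artist}` (beets itself uses a different `$field` template syntax) and that path separators inside tag values should be replaced. The function name and the replacement character are illustrative choices.

```python
from pathlib import PurePosixPath
from typing import Mapping


def format_path(path_format: str, metadata: Mapping[str, str]) -> PurePosixPath:
    """Expand a template like '{artist}/{album}/{title}.flac' into a virtual path."""
    # Replace '/' in tag values so they cannot create extra directory levels.
    safe = {key: str(value).replace("/", "_") for key, value in metadata.items()}
    return PurePosixPath(path_format.format(**safe))


virtual = format_path(
    "{artist}/{album}/{title}.flac",
    {"artist": "AC/DC", "album": "Back in Black", "title": "Hells Bells"},
)
assert virtual.parent.parts == ("AC_DC", "Back in Black")
assert virtual.name == "Hells Bells.flac"
```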

---

## Recommended Migration Order

```
1. [ ] Fork and set up development environment
2. [ ] Add type hints throughout (helps catch issues)
3. [ ] Fix Python 3 syntax issues
4. [ ] Replace fuse-python with pyfuse3/llfuse
5. [ ] Add unit tests for FSNode and FileHandler
6. [ ] Refactor global state to instance variables
7. [ ] Implement lazy loading for audio data
8. [ ] Add MP3 support
9. [ ] Add integration tests
10. [ ] Optional: Create standalone mode
```

---

## Estimated Effort

| Phase | Effort | Risk |
|-------|--------|------|
| Phase 1 (Python 3) | 2-3 days | Low |
| Phase 2 (FUSE migration) | 3-5 days | Medium |
| Phase 3 (Architecture) | 3-5 days | Medium |
| Phase 4 (Testing) | 2-3 days | Low |
| Phase 5 (Standalone) | 3-5 days | Medium |
| **Total** | **13-21 days** | |

---

## Alternative: Rewrite from Scratch

Given the age of the codebase, a rewrite might be more efficient:

**Pros of Rewrite**:
- Clean architecture from the start
- Modern async design
- Better memory management
- Easier to test

**Cons of Rewrite**:
- More initial effort
- Risk of missing edge cases
- Need to rediscover FLAC/ID3 intricacies

**Recommended Approach**: Start with Phases 1-2 to understand the code deeply, then decide whether to continue refactoring or rewrite.