feat: initial implementation of metadata aggregator

- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
Alexander
2026-04-28 16:27:14 +02:00
commit a1f6701bac
163 changed files with 95884 additions and 0 deletions
@@ -0,0 +1,724 @@
# Meelo Architecture
## System Overview
Meelo implements a microservices architecture with four application services and five infrastructure services (PostgreSQL, MeiliSearch, RabbitMQ, the Kyoo transcoder, and Nginx), orchestrated via Docker Compose. Each service has a single responsibility and communicates through well-defined interfaces (REST APIs, message queues).
```
┌─────────────────────────────────────────────────────────────┐
│                            Nginx                            │
│                   Reverse Proxy (Port 80)                   │
│   Routes: / → Front, /api/ → Server, /scanner/ → Scanner    │
└─────────────────────────────────────────────────────────────┘
    │             │              │               │
┌───▼────┐   ┌────▼─────┐   ┌────▼─────┐   ┌─────▼─────┐
│ Front  │   │  Server  │   │ Scanner  │   │  Matcher  │
│ Next.js│   │  NestJS  │   │    Go    │   │  FastAPI  │
│ :3000  │   │  :4000   │   │  :8133   │   │   :6789   │
└────────┘   └────┬─────┘   └────┬─────┘   └─────┬─────┘
                  │              │               │
     ┌────────────┼──────────────┼───────────────┘
     │            │              │
┌────▼────┐ ┌─────▼──────┐ ┌─────▼──────┐
│ Postgres│ │ MeiliSearch│ │  RabbitMQ  │
│ :5432   │ │   :7700    │ │   :5672    │
└─────────┘ └────────────┘ └────────────┘
```
## Service Responsibilities
### Server (NestJS 11, TypeScript)
**Port**: 4000
**Database**: PostgreSQL via Prisma ORM
**Search**: MeiliSearch client
**Messaging**: RabbitMQ publisher
#### Module Structure
NestJS organizes code into modules. Each module encapsulates related functionality:
**Core Domain Modules**
- `ArtistModule`: CRUD operations, relationships to albums/songs/videos
- `AlbumModule`: Album management, release associations
- `SongModule`: Song entities, track relationships, lyrics
- `TrackModule`: Individual track instances (audio/video)
- `ReleaseModule`: Physical/digital release variants
- `GenreModule`: Genre taxonomy and associations
- `VideoModule`: Music video management
**Supporting Modules**
- `AuthModule`: JWT authentication, user registration, login
- `UserModule`: User management, preferences, scrobbler connections
- `LibraryModule`: Library configuration, scan triggers
- `FileModule`: File metadata, checksums, fingerprints
- `PlaylistModule`: Playlist CRUD, entry management
- `LyricsModule`: Plain and synced lyrics storage
**Integration Modules**
- `ExternalMetadataModule`: Provider data aggregation
- `SearchModule`: MeiliSearch indexing and queries
- `ScrobblerModule`: Last.fm and ListenBrainz integration
- `StreamModule`: Audio/video streaming endpoints
- `EventsModule`: WebSocket notifications for UI updates
**Infrastructure Modules**
- `PrismaModule`: Database connection and ORM
- `MeiliSearchModule`: Search client configuration
- `RabbitMQModule`: Message queue publisher
#### Data Flow
1. **Incoming Request**: Nginx forwards to Server at `/api/*`
2. **Controller**: Route handler validates request, extracts JWT
3. **Service**: Business logic executes, calls Prisma for data
4. **Repository**: Prisma queries PostgreSQL
5. **Response**: JSON returned to client
For write operations:
1. Service updates database via Prisma
2. Service publishes event to RabbitMQ (if needed)
3. Service updates MeiliSearch index
4. Service emits WebSocket event for live UI updates
#### Authentication Flow
1. User submits credentials to `/api/auth/login`
2. `AuthService` validates against bcrypt hash in database
3. JWT signed with `JWT_SIGNATURE` from .env
4. Token returned to client
5. Client includes token in `Authorization: Bearer <token>` header
6. `JwtStrategy` validates token on protected routes
7. User object attached to request context
Anonymous mode (`ALLOW_ANONYMOUS=1`) bypasses this flow.
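Steps 3 and 6 of the flow above (signing and validating the token) can be illustrated with a standard-library sketch. The Server does this in TypeScript through its JWT tooling; the Python below is only an illustration. The HS256 algorithm and the function names `sign_token` and `verify_token` are assumptions, not taken from the Meelo source.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: str) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}  # assumed algorithm
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode(), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: str) -> Optional[dict]:
    """Return the payload if the signature is valid and the token is unexpired."""
    try:
        signing_input, signature = token.rsplit(".", 1)
        expected = hmac.new(
            secret.encode(), signing_input.encode("ascii"), hashlib.sha256
        ).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return None
        payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    except (ValueError, IndexError):
        return None
    if payload.get("exp", float("inf")) < time.time():
        return None
    return payload
```

Here `secret` plays the role of `JWT_SIGNATURE`: a token signed with one secret fails verification under any other, which is why rotating that value invalidates all issued tokens.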
#### Scrobbling Flow
1. User authorizes Last.fm via OAuth (callback to `/api/scrobblers/lastfm/callback`)
2. Server exchanges code for access token
3. Token stored in `UserScrobbler` table
4. On track play, `ScrobblerService` posts to Last.fm API
5. ListenBrainz uses simpler token-based auth (user provides token directly)
#### Search Integration
1. On entity creation/update, service calls `MeiliSearchService.index()`
2. Service transforms entity to search document
3. Document pushed to MeiliSearch via HTTP API
4. Client queries `/api/search?q=<term>`
5. Server forwards to MeiliSearch
6. Results enriched with database data (illustrations, counts)
7. JSON returned to client
### Scanner (Go 1.25, Echo v5)
**Port**: 8133
**Framework**: Echo HTTP server
**Dependencies**: FFmpeg, FFprobe, AcoustID
#### Responsibilities
1. **Filesystem Watching**: Monitor library directories for changes
2. **Metadata Extraction**: Parse audio/video files using FFprobe
3. **Fingerprinting**: Generate AcoustID fingerprints for matching
4. **Filename Parsing**: Apply regex from settings.json to extract metadata
5. **File Registration**: POST file metadata to Server API
6. **Match Triggering**: Publish events to RabbitMQ for Matcher consumption
#### Scan Process
1. **Trigger**: POST to `/scanner/scan/:libraryId` or filesystem event
2. **Discovery**: Walk directory tree, filter by extension (.mp3, .flac, .m4a, .mkv, etc.)
3. **Extraction**: For each file:
- Run FFprobe to get duration, bitrate, codec, embedded tags
- Generate AcoustID fingerprint using chromaprint
- Parse filename using regex from settings.json
- Calculate file checksum (SHA256)
4. **Registration**: POST to Server `/api/files` with:
- File path
- Checksum
- Fingerprint
- Extracted metadata (title, artist, album, track number)
- Technical details (duration, bitrate, codec)
5. **Event Publishing**: Publish to RabbitMQ queue `file.added` with file ID
6. **Repeat**: Process next file
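The checksum part of the extraction step can be sketched as follows. The Scanner itself is written in Go; this Python version only illustrates the technique, and the function name `sha256_of_file` and the 64 KiB chunk size are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Reading in chunks keeps memory use flat even for large media files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the digest depends only on file contents, a moved or renamed file produces the same checksum and can be recognized as already registered.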
#### Filename Regex
`settings.json` contains a `trackRegex` pattern. Example:
```
(?P<artist>[^/]+)/(?P<album>[^/]+)/(?P<disc>\d+)-(?P<track>\d+) (?P<title>.+)\.(?P<ext>\w+)
```
Named capture groups extract metadata when embedded tags are missing or untrusted.
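To see what the pattern extracts, it can be applied to a sample path. The sketch below uses Python's `re` module, which accepts the same `(?P<name>...)` syntax as Go's `regexp`; the function name `parse_track_path` and the sample path are hypothetical.

```python
import re
from typing import Optional

# Pattern copied from the example above.
TRACK_REGEX = re.compile(
    r"(?P<artist>[^/]+)/(?P<album>[^/]+)/(?P<disc>\d+)-(?P<track>\d+) "
    r"(?P<title>.+)\.(?P<ext>\w+)"
)


def parse_track_path(path: str) -> Optional[dict]:
    """Return the named groups for a path, or None if the pattern does not match."""
    match = TRACK_REGEX.search(path)
    return match.groupdict() if match else None


fields = parse_track_path("Some Artist/Some Album/1-03 Some Title.flac")
print(fields)
```

All captured values are strings, so `disc` and `track` still need converting to integers before they are stored.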
#### Health Monitoring
Scanner exposes `GET /` endpoint. Returns JSON with:
- Service status
- Active scan tasks
- Last scan timestamp
- Library statistics
Docker health check hits this endpoint every 30 seconds.
#### Error Handling
- **File Read Errors**: Log and skip file, continue scan
- **FFprobe Failures**: Retry once, then skip
- **Server API Errors**: Retry with exponential backoff (max 3 attempts)
- **RabbitMQ Unavailable**: Queue events in memory, flush when connection restored
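The last rule (buffer events while RabbitMQ is down, flush once it is back) can be sketched as below. This is a Python illustration rather than the Scanner's Go code; the class `BufferedPublisher` and the `send` callable, which is assumed to raise `ConnectionError` when the broker is unreachable, are hypothetical.

```python
from collections import deque
from typing import Callable, Deque


class BufferedPublisher:
    """Buffer events in memory while the broker is unreachable, then flush in order."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send  # hypothetical: raises ConnectionError when the broker is down
        self._pending: Deque[str] = deque()

    def publish(self, event: str) -> None:
        # Always enqueue first so ordering is preserved across outages.
        self._pending.append(event)
        self.flush()

    def flush(self) -> int:
        """Send buffered events oldest-first; return how many were delivered."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # keep the event queued and retry on the next flush
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)
```

An in-memory buffer like this is lost if the process exits before the connection returns, so a rescan would be needed to republish those events.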
### Matcher (Python 3.14, FastAPI)
**Port**: 6789
**Framework**: FastAPI with async HTTP
**Messaging**: RabbitMQ consumer
#### Responsibilities
1. **Event Consumption**: Listen to RabbitMQ `file.added` queue
2. **Provider Queries**: Fetch metadata from 8 external sources
3. **Data Aggregation**: Merge results based on priority in settings.json
4. **Metadata Push**: POST enriched data to Server API
#### Provider Architecture
Each provider is a separate module implementing a common interface:
```python
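from __future__ import annotations  # lets the result models be referenced before import

from abc import ABC, abstractmethod
from typing import Optional


class Provider(ABC):
    """Common interface implemented by every provider module."""

    @abstractmethod
    async def search_track(self, fingerprint: str, title: str, artist: str) -> Optional[TrackMetadata]:
        pass

    @abstractmethod
    async def fetch_artist(self, artist_id: str) -> Optional[ArtistMetadata]:
        pass

    @abstractmethod
    async def fetch_album(self, album_id: str) -> Optional[AlbumMetadata]:
        pass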
```
**Provider Modules**
- `musicbrainz.py`: Primary database, uses musicbrainzngs library
- `genius.py`: Lyrics and song descriptions, requires API token
- `wikipedia.py`: Artist/album context, uses Wikipedia API
- `wikidata.py`: Structured data (areas, relationships), SPARQL queries
- `discogs.py`: Release details, requires API token
- `allmusic.py`: Editorial reviews, web scraping (no official API)
- `metacritic.py`: Critic scores, web scraping
- `lrclib.py`: Synced lyrics, public API
#### Matching Flow
1. **Event Received**: RabbitMQ delivers `file.added` message with file ID
2. **File Fetch**: GET `/api/files/:id` from Server to retrieve metadata
3. **Provider Selection**: Read settings.json for enabled providers and priority
4. **Parallel Queries**: Launch async tasks for each provider:
- MusicBrainz: Query by AcoustID fingerprint
- Genius: Search by title + artist
- Wikipedia: Search by artist name
- Wikidata: Query by MusicBrainz ID (if found)
- Discogs: Search by release title
- AllMusic: Scrape by artist + album
- Metacritic: Scrape by album title
- LrcLib: Search by title + artist + duration
5. **Result Aggregation**: Merge results based on priority:
- MusicBrainz IDs take precedence
- Lyrics: prefer synced (LrcLib) over plain (Genius)
- Descriptions: concatenate from multiple sources
- Ratings: average across providers
6. **Metadata Push**: POST to Server `/api/external-metadata` with:
- Track/album/artist IDs
- Descriptions
- Ratings
- Source URLs
- Provider names
7. **Acknowledgment**: ACK message to RabbitMQ
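Steps 4 and 5 can be sketched with `asyncio`. This is a minimal illustration, not the Matcher's actual code: the names `query_all` and `merge` are hypothetical, providers are modelled as zero-argument coroutines returning plain dicts, and only two of the aggregation rules are shown (first provider in priority order wins, ratings are averaged).

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

ProviderFn = Callable[[], Awaitable[Optional[dict]]]


async def query_all(providers: Dict[str, ProviderFn], timeout: float) -> Dict[str, dict]:
    """Run every provider concurrently; drop those that time out, fail, or return nothing."""

    async def guarded(fn: ProviderFn) -> Optional[dict]:
        try:
            return await asyncio.wait_for(fn(), timeout)
        except Exception:  # timeout or provider error: skip this provider
            return None

    names = list(providers)
    results = await asyncio.gather(*(guarded(providers[name]) for name in names))
    return {name: result for name, result in zip(names, results) if result}


def merge(results: Dict[str, dict], order: List[str]) -> dict:
    """Merge results: the first provider in `order` wins per field, ratings are averaged."""
    merged: dict = {}
    ratings = []
    for name in order:
        for key, value in results.get(name, {}).items():
            if key == "rating":
                ratings.append(value)
            else:
                merged.setdefault(key, value)
    if ratings:
        merged["rating"] = sum(ratings) / len(ratings)
    return merged
```

Wrapping each provider individually means one slow or failing source cannot delay or abort the others, which matches the "skip provider, continue with others" behaviour described for timeouts.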
#### Rate Limiting
Providers have different rate limits:
- **MusicBrainz**: 1 request/second (enforced by library)
- **Genius**: 10 requests/second (API limit)
- **Wikipedia**: No official limit, use 5 requests/second
- **Wikidata**: No client-side limit configured; the public SPARQL endpoint applies its own throttling
- **Discogs**: 60 requests/minute (API limit)
- **AllMusic**: No API, scraping limited to 1 request/second
- **Metacritic**: No API, scraping limited to 1 request/second
- **LrcLib**: No official limit, use 10 requests/second
Matcher implements per-provider rate limiters using `aiolimiter`.
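The idea behind a per-provider limiter can be shown with the standard library alone. The class below is a simplified stand-in, not `aiolimiter`'s API: `IntervalLimiter` is a hypothetical name, and it enforces a minimum interval between acquisitions (1.0 s would correspond to the MusicBrainz and Discogs limits above).

```python
import asyncio
import time


class IntervalLimiter:
    """Allow at most one acquisition every `interval` seconds."""

    def __init__(self, interval: float) -> None:
        self._interval = interval
        self._next_slot = 0.0
        self._lock = asyncio.Lock()

    async def __aenter__(self) -> None:
        # The lock serializes callers so each one reserves its own time slot.
        async with self._lock:
            now = time.monotonic()
            wait = self._next_slot - now
            if wait > 0:
                await asyncio.sleep(wait)
            self._next_slot = max(now, self._next_slot) + self._interval

    async def __aexit__(self, *exc_info) -> None:
        return None


async def demo() -> float:
    limiter = IntervalLimiter(0.02)  # one limiter instance per provider
    start = time.monotonic()
    for _ in range(3):
        async with limiter:
            pass  # the provider HTTP request would go here
    return time.monotonic() - start
```

Keeping one limiter per provider lets a fast source such as LrcLib proceed while MusicBrainz requests wait for their next slot.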
#### Error Handling
- **Provider Timeout**: Skip provider, continue with others
- **HTTP Errors**: Retry with exponential backoff (max 3 attempts)
- **Parsing Errors**: Log and skip provider result
- **Server API Errors**: NACK message to RabbitMQ for redelivery
- **No Results**: Push empty metadata (Server marks as "not found")
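The retry rule for HTTP errors can be sketched as a small helper. The function name `retry_with_backoff` and the 0.5 s base delay are assumptions; only the limit of three attempts comes from the list above.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Await `call`, retrying failures with delays of base_delay, 2 * base_delay, ..."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: the caller decides whether to skip or NACK
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable when max_attempts >= 1")
```

Re-raising after the final attempt keeps the policy decision (skip the provider, or NACK the message) with the caller instead of hiding the failure.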
#### Configuration
Settings.json controls provider behavior:
```json
{
  "providers": {
    "musicbrainz": { "enabled": true },
    "genius": { "enabled": true, "token": "..." },
    "wikipedia": { "enabled": true },
    "wikidata": { "enabled": true },
    "discogs": { "enabled": false },
    "allmusic": { "enabled": false },
    "metacritic": { "enabled": false },
    "lrclib": { "enabled": true }
  },
  "metadata": {
    "order": ["musicbrainz", "genius", "wikipedia", "lrclib"]
  }
}
```
Disabled providers are skipped. Order determines priority for conflicting data.
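How these two keys combine can be sketched as a small function. The name `active_providers` is hypothetical, and the treatment of enabled providers that are missing from `order` (ranked last, in file order) is an assumption; the source does not say how that case is handled.

```python
import json
from typing import List

# Abbreviated settings in the shape shown above.
SETTINGS = json.loads("""
{
  "providers": {
    "musicbrainz": { "enabled": true },
    "genius": { "enabled": true },
    "wikidata": { "enabled": true },
    "discogs": { "enabled": false },
    "lrclib": { "enabled": true }
  },
  "metadata": { "order": ["musicbrainz", "genius", "lrclib"] }
}
""")


def active_providers(settings: dict) -> List[str]:
    """Return enabled providers, highest priority first."""
    providers = settings.get("providers", {})
    enabled = [name for name, cfg in providers.items() if cfg.get("enabled")]
    order = settings.get("metadata", {}).get("order", [])
    ranked = [name for name in order if name in enabled]
    # Assumption: enabled providers missing from `order` rank last, in file order.
    unranked = [name for name in enabled if name not in order]
    return ranked + unranked


print(active_providers(SETTINGS))
```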
### Front (Next.js 16, React)
**Port**: 3000
**Framework**: Next.js with SSR
**UI**: Material-UI components
**State**: Jotai atoms
**Data Fetching**: TanStack Query
**i18n**: i18next
#### Responsibilities
1. **User Interface**: Render pages for browsing, playback, settings
2. **API Communication**: Fetch data from Server via REST
3. **State Management**: Manage playback queue, user preferences, auth tokens
4. **Internationalization**: Support multiple languages
#### Page Structure
- `/`: Home page with recent albums, top artists
- `/artists`: Artist grid with search
- `/artists/:id`: Artist detail with albums, songs, videos
- `/albums`: Album grid with filters
- `/albums/:id`: Album detail with tracks, releases
- `/songs`: Song list with search
- `/songs/:id`: Song detail with tracks, lyrics
- `/playlists`: User playlists
- `/playlists/:id`: Playlist detail with tracks
- `/videos`: Music video grid
- `/videos/:id`: Video player
- `/search`: Global search results
- `/settings`: User preferences, library management, scrobbler setup
#### State Management
Jotai atoms store global state:
- `authAtom`: JWT token, user info
- `playbackAtom`: Current track, queue, position, volume
- `settingsAtom`: Theme, language, playback preferences
TanStack Query caches API responses:
- `useArtists()`: Fetch artist list
- `useArtist(id)`: Fetch artist detail
- `useAlbums()`: Fetch album list
- `useAlbum(id)`: Fetch album detail
- `useTracks()`: Fetch track list
- `useSearch(query)`: Fetch search results
Queries invalidate on mutations (create playlist, update settings).
#### Playback Flow
1. User clicks track
2. `playbackAtom` updated with track ID
3. Component fetches stream URL: `/api/tracks/:id/stream`
4. HTML5 `<audio>` element loads stream
5. Playback starts
6. On play event, POST to `/api/scrobblers/scrobble` (if enabled)
7. On track end, advance queue, repeat flow
Video playback uses `<video>` element with transcoder stream.
#### Mobile App
Expo/React Native app shares components and state logic with web. Differences:
- Navigation: React Navigation instead of Next.js router
- Storage: AsyncStorage instead of localStorage
- Media: expo-av instead of HTML5 audio/video
- Notifications: expo-notifications for background playback
Monorepo structure:
```
front/
  web/       # Next.js app
  mobile/    # Expo app
  shared/    # Common components, hooks, state
```
#### Internationalization
i18next with JSON translation files:
```
locales/
  en/
    common.json
    artist.json
    album.json
  fr/
    common.json
    artist.json
    album.json
```
Language switcher in settings. Detects browser locale on first visit.
## Infrastructure Services
### PostgreSQL
**Port**: 5432
**Image**: postgres:alpine3.14
**Volume**: `meelo_db`
Stores all persistent data. Prisma manages schema migrations. Health check via `pg_isready`.
### MeiliSearch
**Port**: 7700
**Image**: meilisearch:v1.5
**Volume**: `meelo_search`
Indexes artists, albums, songs, videos. Configured with:
- Searchable attributes: name, title, artist names
- Filterable attributes: genre, year, type
- Sortable attributes: releaseDate, name
- Ranking rules: typo, words, proximity, attribute, sort, exactness
Health check via `GET /health`.
### RabbitMQ
**Port**: 5672 (AMQP), 15672 (management UI)
**Image**: rabbitmq:4.2-alpine
**Volume**: `meelo_rabbitmq_data`
Message queue for event-driven architecture. Queues:
- `file.added`: Scanner publishes, Matcher consumes
- `metadata.updated`: Matcher publishes, Server consumes (future use)
Health check via `rabbitmq-diagnostics ping`.
### Kyoo Transcoder
**Port**: 7666
**Volume**: `meelo_transcoder_cache`
Transcodes video files for web playback. Supports:
- Adaptive bitrate streaming (HLS)
- Multiple resolutions (480p, 720p, 1080p)
- Codec conversion (H.264, VP9)
- Subtitle burning
Server proxies requests to transcoder. Client receives HLS manifest.
### Nginx
**Port**: 80
**Image**: nginx:1.29.7-alpine
**Config**: Mounted from `nginx.conf`
Routes requests to services:
```nginx
location / {
    proxy_pass http://front:3000;
}
location /api/ {
    proxy_pass http://server:4000;
}
location /scanner/ {
    proxy_pass http://scanner:8133;
}
location /matcher/ {
    proxy_pass http://matcher:6789;
}
```
Handles WebSocket upgrades for Server events.
## Inter-Service Communication
### REST APIs
- **Front → Server**: All data fetching (artists, albums, tracks, playlists)
- **Scanner → Server**: File registration, library queries
- **Matcher → Server**: Metadata push, file queries
- **Server → MeiliSearch**: Index updates, search queries
- **Server → Transcoder**: Video stream requests
### Message Queue
- **Scanner → RabbitMQ**: Publish `file.added` events
- **RabbitMQ → Matcher**: Deliver `file.added` events
### Database
- **Server → PostgreSQL**: All CRUD operations via Prisma
## Startup Orchestration
Docker Compose defines service dependencies and health checks:
1. **PostgreSQL** starts first, health check via `pg_isready`
2. **MeiliSearch** starts, health check via `GET /health`
3. **RabbitMQ** starts, health check via `rabbitmq-diagnostics ping`
4. **Server** starts after database/search/queue are healthy
- Runs Prisma migrations
- Seeds initial data (admin user if none exists)
- Connects to MeiliSearch and RabbitMQ
5. **Scanner** starts after Server is healthy
- Registers with Server API
- Begins filesystem watching
6. **Matcher** starts after Server and RabbitMQ are healthy
- Connects to RabbitMQ
- Begins consuming events
7. **Front** starts after Server is healthy
- SSR requires Server API for initial data
8. **Transcoder** starts independently (no dependencies)
9. **Nginx** starts last, after all application services are healthy
Health checks run every 30 seconds. Unhealthy services restart automatically.
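The startup order above is a topological ordering of the dependency graph. A minimal sketch with Python's `graphlib` (3.9+) makes that explicit; the lowercase service names are assumptions modelled on the hostnames in the Nginx config, not copied from the Compose files.

```python
from graphlib import TopologicalSorter

# Service -> services that must be healthy first, as listed in the steps above.
DEPENDS_ON = {
    "postgres": set(),
    "meilisearch": set(),
    "rabbitmq": set(),
    "transcoder": set(),
    "server": {"postgres", "meilisearch", "rabbitmq"},
    "scanner": {"server"},
    "matcher": {"server", "rabbitmq"},
    "front": {"server"},
    "nginx": {"front", "server", "scanner", "matcher"},
}

# static_order() yields each service only after all of its dependencies.
startup_order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(startup_order)
```

Several valid orders exist (the three data stores and the transcoder have no dependencies on each other), so only the relative position of dependent services is guaranteed.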
## Data Consistency
### Transactions
Prisma transactions ensure atomicity:
```typescript
await prisma.$transaction([
prisma.song.create({ data: songData }),
prisma.track.create({ data: trackData }),
prisma.file.update({ where: { id: fileId }, data: { trackId } })
]);
```
If any operation fails, all of them are rolled back.
### Event Ordering
RabbitMQ preserves publish order within a queue for a single consumer; redelivered messages may arrive out of order. Matcher processes events sequentially to avoid race conditions.
### Search Consistency
MeiliSearch updates are asynchronous, so there is a brief window in which the database and the search index diverge. This is acceptable for this use case (eventual consistency).
### Cache Invalidation
TanStack Query invalidates caches on mutations:
```typescript
const mutation = useMutation({
mutationFn: createPlaylist,
onSuccess: () => {
// TanStack Query v5 accepts only the object form with a queryKey filter
queryClient.invalidateQueries({ queryKey: ['playlists'] });
}
});
```
## Scalability Considerations
### Horizontal Scaling
- **Scanner**: Run multiple instances for different libraries
- **Matcher**: Run multiple consumers for faster enrichment
- **Front**: Stateless, can run multiple instances behind load balancer
### Vertical Scaling
- **Server**: CPU-bound for complex queries, benefits from more cores
- **MeiliSearch**: Memory-bound, benefits from more RAM
- **PostgreSQL**: I/O-bound, benefits from SSD and connection pooling
### Bottlenecks
- **Matcher**: Limited by external provider rate limits
- **Transcoder**: CPU-intensive, limits concurrent video streams
- **Database**: Complex queries (artist with all albums/songs/videos) can be slow
## Monitoring and Observability
### Logging
- **Server**: NestJS Logger with configurable levels (error, warn, info, debug)
- **Scanner**: zerolog with structured JSON output
- **Matcher**: Python logging with JSON formatter
- **Front**: Console logs in development, silent in production
All logs written to stdout, captured by Docker.
### Health Checks
Every service exposes health endpoint:
- **Server**: `GET /api/health`
- **Scanner**: `GET /`
- **Matcher**: `GET /health`
- **Front**: `GET /api/health` (Next.js API route)
Docker Compose monitors these endpoints.
### Metrics
No built-in Prometheus metrics. Future enhancement.
## Security Architecture
### Authentication
- **JWT**: Signed tokens with expiration
- **API Keys**: `x-api-key` header for Scanner/Matcher
- **Bcrypt**: Password hashing with salt rounds = 10
### Authorization
- **Admin Flag**: Users have `isAdmin` boolean
- **Ownership**: Users can only modify their own playlists
- **Public Playlists**: Readable by all, writable by owner or if `allowChanges=true`
### Network Isolation
Docker Compose creates a private network. Only Nginx exposes port 80. Internal services are not accessible from the host.
### Input Validation
- **Server**: NestJS validation pipes with class-validator
- **Scanner**: Go struct validation
- **Matcher**: Pydantic models
Invalid input returns 400 Bad Request.
### SQL Injection
Prisma uses parameterized queries. No raw SQL in codebase.
### XSS Protection
React escapes output by default. No `dangerouslySetInnerHTML` except for sanitized lyrics.
## Deployment Variants
### Production (docker-compose.yml)
Pre-built images from Docker Hub. Environment variables from .env. Volumes for persistence. Restart policy: always.
### Development (docker-compose.dev.yml)
Mounted source directories. Hot reload enabled. Exposed ports for debugging (PostgreSQL 5432, MeiliSearch 7700, RabbitMQ 15672). Restart policy: unless-stopped.
### Local Build (docker-compose.local.yml)
Builds images from source using Dockerfiles. Tests changes before pushing to Docker Hub. Same volumes and network as production.
## Configuration Management
### Environment Variables (.env)
Deployment-specific settings:
- `PORT`: Server port (default 4000)
- `PUBLIC_URL`: External URL for OAuth callbacks
- `CONFIG_DIR`: Path to settings.json
- `DATA_DIR`: Path to music files
- `JWT_SIGNATURE`: Secret for signing tokens
- `GENIUS_ACCESS_TOKEN`: Genius API key
- `DISCOGS_ACCESS_TOKEN`: Discogs API key
- `LASTFM_API_KEY`, `LASTFM_API_SECRET`: Last.fm OAuth
### Settings File (settings.json)
User preferences:
- `trackRegex`: Filename parsing pattern
- `metadata.source`: Prefer embedded tags or external providers
- `metadata.order`: Provider priority list
- `providers`: Enable/disable specific providers
- `compilations`: Rules for detecting compilation albums
Server reads settings.json on startup. Changes require restart.
## Error Recovery
### Service Failures
Docker restart policy handles crashes. Health checks detect hung processes.
### Database Corruption
PostgreSQL volume backups recommended. Restore from backup if corruption detected.
### Message Queue Failures
RabbitMQ persists messages to disk. Unacknowledged messages redelivered on restart.
### Search Index Corruption
Rebuild MeiliSearch index from database:
```bash
curl -X POST http://localhost:4000/api/search/reindex
```
Server iterates all entities, pushes to MeiliSearch.
## Performance Optimization
### Database Indexes
Prisma schema defines indexes on:
- Foreign keys (artistId, albumId, songId)
- Unique constraints (slug, checksum)
- Frequently queried fields (releaseDate, type)
### Query Optimization
- **Eager Loading**: Prisma `include` to avoid N+1 queries
- **Pagination**: Limit/offset for large result sets
- **Caching**: TanStack Query caches API responses client-side
### Asset Optimization
- **Images**: Illustrations stored as blurhash + URL
- **Lazy Loading**: Front loads images on scroll
- **Code Splitting**: Next.js splits bundles per page
## Testing Strategy
### Unit Tests
- **Server**: Jest tests for services, controllers, utilities
- **Matcher**: pytest tests for provider modules
- **Scanner**: Go tests for file parsing, fingerprinting
### Integration Tests
- **Server**: Test API endpoints with in-memory database
- **Matcher**: Mock external provider responses
### End-to-End Tests
Not implemented. Future enhancement with Playwright.
### Coverage
SonarCloud tracks coverage per service. Minimum threshold: 80%.
## Summary
Meelo's architecture separates concerns across four microservices, each optimized for its task. The event-driven design decouples scanning from enrichment, enabling parallel processing and fault tolerance. Infrastructure services (PostgreSQL, MeiliSearch, RabbitMQ) provide persistence, search, and messaging. Docker Compose orchestrates startup order and health monitoring. The result is a scalable, maintainable system that handles complex metadata workflows without blocking user interactions.
@@ -0,0 +1,981 @@
# Meelo Codebase
## Repository Structure
```
Meelo/
├── server/ # NestJS backend
│ ├── src/
│ │ ├── artist/
│ │ ├── album/
│ │ ├── song/
│ │ ├── track/
│ │ ├── auth/
│ │ ├── search/
│ │ └── ...
│ ├── prisma/
│ │ ├── schema.prisma
│ │ └── migrations/
│ ├── test/
│ └── package.json
├── scanner/ # Go file scanner
│ ├── cmd/
│ ├── internal/
│ │ ├── scanner/
│ │ ├── fingerprint/
│ │ └── parser/
│ ├── go.mod
│ └── main.go
├── matcher/ # Python metadata matcher
│ ├── providers/
│ │ ├── musicbrainz.py
│ │ ├── genius.py
│ │ ├── wikipedia.py
│ │ └── ...
│ ├── main.py
│ ├── requirements.txt
│ └── tests/
├── front/ # Next.js frontend
│ ├── web/
│ │ ├── pages/
│ │ ├── components/
│ │ └── package.json
│ ├── mobile/
│ │ ├── App.tsx
│ │ └── package.json
│ └── shared/
│ ├── components/
│ ├── hooks/
│ └── state/
├── docker-compose.yml
├── docker-compose.dev.yml
├── docker-compose.local.yml
├── .env.example
├── biome.json
└── README.md
```
## Server (NestJS)
### Module Organization
NestJS organizes code into modules. Each module encapsulates related functionality.
**Core Modules**:
- `ArtistModule`: Artist CRUD, relationships
- `AlbumModule`: Album CRUD, releases
- `SongModule`: Song CRUD, lyrics
- `TrackModule`: Track CRUD, streaming
- `ReleaseModule`: Release CRUD
- `GenreModule`: Genre management
- `VideoModule`: Video CRUD, streaming
**Supporting Modules**:
- `AuthModule`: JWT authentication
- `UserModule`: User management
- `LibraryModule`: Library configuration
- `FileModule`: File metadata
- `PlaylistModule`: Playlist CRUD
- `LyricsModule`: Lyrics storage
**Integration Modules**:
- `ExternalMetadataModule`: Provider data
- `SearchModule`: MeiliSearch integration
- `ScrobblerModule`: Last.fm/ListenBrainz
- `StreamModule`: Audio/video streaming
- `EventsModule`: WebSocket events
**Infrastructure Modules**:
- `PrismaModule`: Database ORM
- `MeiliSearchModule`: Search client
- `RabbitMQModule`: Message queue
### Module Structure
Each module follows consistent structure:
```
artist/
├── artist.module.ts # Module definition
├── artist.controller.ts # HTTP endpoints
├── artist.service.ts # Business logic
├── artist.entity.ts # Prisma entity (generated)
├── dto/
│ ├── create-artist.dto.ts
│ ├── update-artist.dto.ts
│ └── artist-response.dto.ts
└── artist.spec.ts # Unit tests
```
### Controller Example
```typescript
@Controller('artists')
@UseGuards(JwtAuthGuard)
export class ArtistController {
constructor(private readonly artistService: ArtistService) {}
@Get()
async findAll(
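// Query values arrive as strings; parse them to numbers before they reach Prisma
@Query('skip', new ParseIntPipe({ optional: true })) skip?: number,
@Query('take', new ParseIntPipe({ optional: true })) take?: number,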
@Query('sortBy') sortBy?: string,
@Query('sortOrder') sortOrder?: 'asc' | 'desc',
) {
return this.artistService.findAll({ skip, take, sortBy, sortOrder });
}
@Get(':id')
async findOne(
@Param('id', ParseIntPipe) id: number,
@Query('include') include?: string[],
) {
return this.artistService.findOne(id, include);
}
@Post()
@UseGuards(AdminGuard)
async create(@Body() createArtistDto: CreateArtistDto) {
return this.artistService.create(createArtistDto);
}
@Patch(':id')
@UseGuards(AdminGuard)
async update(
@Param('id', ParseIntPipe) id: number,
@Body() updateArtistDto: UpdateArtistDto,
) {
return this.artistService.update(id, updateArtistDto);
}
@Delete(':id')
@UseGuards(AdminGuard)
async remove(@Param('id', ParseIntPipe) id: number) {
return this.artistService.remove(id);
}
}
```
### Service Example
```typescript
@Injectable()
export class ArtistService {
constructor(
private readonly prisma: PrismaService,
private readonly meilisearch: MeiliSearchService,
) {}
async findAll(params: {
skip?: number;
take?: number;
sortBy?: string;
sortOrder?: 'asc' | 'desc';
}) {
const { skip = 0, take = 20, sortBy = 'name', sortOrder = 'asc' } = params;
const [items, total] = await Promise.all([
this.prisma.artist.findMany({
skip,
take,
orderBy: { [sortBy]: sortOrder },
include: {
illustration: true,
_count: {
select: { albums: true, songs: true },
},
},
}),
this.prisma.artist.count(),
]);
return { items, total, skip, take };
}
async findOne(id: number, include?: string[]) {
const includeOptions = this.buildIncludeOptions(include);
const artist = await this.prisma.artist.findUnique({
where: { id },
include: includeOptions,
});
if (!artist) {
throw new NotFoundException(`Artist with ID ${id} not found`);
}
return artist;
}
async create(data: CreateArtistDto) {
const slug = this.generateSlug(data.name);
const artist = await this.prisma.artist.create({
data: {
...data,
slug,
},
});
await this.meilisearch.index('artists', artist);
return artist;
}
async update(id: number, data: UpdateArtistDto) {
const artist = await this.prisma.artist.update({
where: { id },
data,
});
await this.meilisearch.update('artists', artist);
return artist;
}
async remove(id: number) {
await this.prisma.artist.delete({
where: { id },
});
await this.meilisearch.delete('artists', id);
}
private buildIncludeOptions(include?: string[]) {
if (!include) return {};
const options: any = {};
if (include.includes('albums')) options.albums = true;
if (include.includes('songs')) options.songs = true;
if (include.includes('videos')) options.videos = true;
if (include.includes('areas')) options.areas = { include: { area: true } };
if (include.includes('externalMetadata')) {
options.externalMetadata = { include: { sources: true } };
}
return options;
}
private generateSlug(name: string): string {
return name
.toLowerCase()
.replace(/[^a-z0-9]+/g, '-')
.replace(/^-|-$/g, '');
}
}
```
### DTO Example
```typescript
export class CreateArtistDto {
@IsString()
@IsNotEmpty()
name: string;
@IsString()
@IsOptional()
sortName?: string;
@IsArray()
@IsInt({ each: true })
@IsOptional()
areaIds?: number[];
}
export class UpdateArtistDto extends PartialType(CreateArtistDto) {}
export class ArtistResponseDto {
id: number;
name: string;
slug: string;
sortName?: string;
illustration?: IllustrationDto;
albumCount?: number;
songCount?: number;
}
```
### Testing
Jest tests for services and controllers:
```typescript
describe('ArtistService', () => {
let service: ArtistService;
let prisma: PrismaService;
beforeEach(async () => {
const module: TestingModule = await Test.createTestingModule({
providers: [
ArtistService,
{
provide: PrismaService,
useValue: {
artist: {
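findMany: jest.fn(),
count: jest.fn(), // findAll() also calls artist.count(), and jest.spyOn needs an existing function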
findUnique: jest.fn(),
create: jest.fn(),
update: jest.fn(),
delete: jest.fn(),
},
},
},
{
provide: MeiliSearchService,
useValue: {
index: jest.fn(),
update: jest.fn(),
delete: jest.fn(),
},
},
],
}).compile();
service = module.get<ArtistService>(ArtistService);
prisma = module.get<PrismaService>(PrismaService);
});
it('should find all artists', async () => {
const mockArtists = [{ id: 1, name: 'Test Artist', slug: 'test-artist' }];
jest.spyOn(prisma.artist, 'findMany').mockResolvedValue(mockArtists);
jest.spyOn(prisma.artist, 'count').mockResolvedValue(1);
const result = await service.findAll({});
expect(result.items).toEqual(mockArtists);
expect(result.total).toBe(1);
});
});
```
## Scanner (Go)
### Package Structure
```
scanner/
├── cmd/
│ └── scanner/
│ └── main.go # Entry point
├── internal/
│ ├── scanner/
│ │ ├── scanner.go # Main scanner logic
│ │ └── watcher.go # Filesystem watcher
│ ├── fingerprint/
│ │ └── acoustid.go # AcoustID fingerprinting
│ ├── parser/
│ │ ├── metadata.go # FFprobe metadata extraction
│ │ └── filename.go # Regex filename parsing
│ ├── api/
│ │ └── client.go # Server API client
│ └── config/
│ └── config.go # Configuration loading
├── go.mod
└── go.sum
```
### Main Entry Point
```go
package main

import (
	"log"

	"github.com/labstack/echo/v5"

	"meelo/scanner/internal/config"
	"meelo/scanner/internal/scanner"
)

func main() {
	cfg, err := config.Load()
	if err != nil {
		log.Fatalf("Failed to load config: %v", err)
	}

	s := scanner.New(cfg)

	// Register the Scanner's HTTP endpoints.
	e := echo.New()
	e.GET("/", s.HealthCheck)
	e.GET("/tasks", s.ListTasks)
	e.POST("/scan", s.ScanAll)
	e.POST("/scan/:libraryId", s.ScanLibrary)
	e.POST("/clean", s.CleanOrphans)
	e.POST("/refresh", s.RefreshMetadata)

	log.Fatal(e.Start(":8133"))
}
```
### Scanner Logic
```go
package scanner
import (
"context"
"log"
"os"
"path/filepath"
"meelo/scanner/internal/api"
"meelo/scanner/internal/config"
"meelo/scanner/internal/fingerprint"
"meelo/scanner/internal/parser"
)
type Scanner struct {
client *api.Client
fingerprint *fingerprint.Generator
parser *parser.Parser
}
func New(cfg *config.Config) *Scanner {
return &Scanner{
client: api.NewClient(cfg.ServerURL, cfg.APIKey),
fingerprint: fingerprint.New(),
parser: parser.New(cfg.TrackRegex),
}
}
func (s *Scanner) ScanLibrary(ctx context.Context, libraryID int) error {
library, err := s.client.GetLibrary(libraryID)
if err != nil {
return err
}
return filepath.Walk(library.Path, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if info.IsDir() {
return nil
}
if !s.isAudioFile(path) {
return nil
}
return s.processFile(ctx, path, libraryID)
})
}
func (s *Scanner) processFile(ctx context.Context, path string, libraryID int) error {
// Extract metadata using FFprobe
metadata, err := s.parser.ExtractMetadata(path)
if err != nil {
log.Printf("Failed to extract metadata from %s: %v", path, err)
return nil // Skip file, continue scan
}
// Generate AcoustID fingerprint
fp, err := s.fingerprint.Generate(path)
if err != nil {
log.Printf("Failed to generate fingerprint for %s: %v", path, err)
// Continue without fingerprint
}
// Calculate checksum
checksum, err := s.calculateChecksum(path)
if err != nil {
return err
}
// Register file with Server
file := &api.FileRegistration{
Path: path,
Checksum: checksum,
Fingerprint: fp,
LibraryID: libraryID,
Metadata: metadata,
}
if err := s.client.RegisterFile(file); err != nil {
return err
}
log.Printf("Registered file: %s", path)
return nil
}
func (s *Scanner) isAudioFile(path string) bool {
ext := filepath.Ext(path)
audioExts := []string{".mp3", ".flac", ".m4a", ".ogg", ".opus", ".wav"}
for _, audioExt := range audioExts {
if ext == audioExt {
return true
}
}
return false
}
```
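
`processFile` calls `s.calculateChecksum`, which is not shown in this document. The following is a minimal sketch of a streaming file checksum, written in Python for brevity. The function name `calculate_checksum`, the chunk size, and the choice of SHA-256 are assumptions; the source does not say which hash the Scanner uses.

```python
import hashlib
import os
import tempfile


def calculate_checksum(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large audio files never sit fully in memory."""
    digest = hashlib.sha256()  # assumption: the real Scanner may use a different hash
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"example audio bytes")
    try:
        print(calculate_checksum(tmp.name))
    finally:
        os.remove(tmp.name)
```

Reading in chunks keeps memory use constant regardless of file size, which matters for large lossless files.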
### Metadata Extraction
```go
package parser
import (
	"encoding/json"
	"os/exec"
	"regexp"
)
type Parser struct {
trackRegex *regexp.Regexp
}
func New(regex string) *Parser {
return &Parser{
trackRegex: regexp.MustCompile(regex),
}
}
func (p *Parser) ExtractMetadata(path string) (*Metadata, error) {
// Run FFprobe
cmd := exec.Command("ffprobe",
"-v", "quiet",
"-print_format", "json",
"-show_format",
"-show_streams",
path,
)
output, err := cmd.Output()
if err != nil {
return nil, err
}
var probe ProbeResult
if err := json.Unmarshal(output, &probe); err != nil {
return nil, err
}
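	// Extract metadata from tags
	metadata := &Metadata{
		Title:    probe.Format.Tags.Title,
		Artist:   probe.Format.Tags.Artist,
		Album:    probe.Format.Tags.Album,
		Duration: probe.Format.Duration,
		Bitrate:  probe.Format.BitRate,
	}
	// Read the codec only when FFprobe reported at least one stream;
	// indexing an empty slice would panic
	if len(probe.Streams) > 0 {
		metadata.Codec = probe.Streams[0].CodecName
	}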
// Parse filename if tags missing
if metadata.Title == "" || metadata.Artist == "" {
fileMetadata := p.parseFilename(path)
if metadata.Title == "" {
metadata.Title = fileMetadata.Title
}
if metadata.Artist == "" {
metadata.Artist = fileMetadata.Artist
}
}
return metadata, nil
}
func (p *Parser) parseFilename(path string) *Metadata {
	matches := p.trackRegex.FindStringSubmatch(path)
	if matches == nil {
		return &Metadata{}
	}
	// group returns "" when the configured regex lacks the named group,
	// because SubexpIndex reports -1 in that case.
	group := func(name string) string {
		if i := p.trackRegex.SubexpIndex(name); i >= 0 {
			return matches[i]
		}
		return ""
	}
	return &Metadata{
		Artist: group("artist"),
		Album:  group("album"),
		Title:  group("title"),
	}
}
```
### Testing
```go
package scanner
import (
"testing"
)
func TestIsAudioFile(t *testing.T) {
s := &Scanner{}
tests := []struct {
path string
expected bool
}{
{"song.mp3", true},
{"song.flac", true},
{"song.txt", false},
{"song.jpg", false},
}
for _, tt := range tests {
result := s.isAudioFile(tt.path)
if result != tt.expected {
t.Errorf("isAudioFile(%s) = %v, want %v", tt.path, result, tt.expected)
}
}
}
```
## Matcher (Python)
### Package Structure
```
matcher/
├── providers/
│   ├── __init__.py
│   ├── base.py          # Base provider interface
│   ├── musicbrainz.py
│   ├── genius.py
│   ├── wikipedia.py
│   ├── wikidata.py
│   ├── discogs.py
│   ├── allmusic.py
│   ├── metacritic.py
│   └── lrclib.py
├── main.py              # FastAPI app + RabbitMQ consumer
├── config.py            # Configuration loading
├── aggregator.py        # Result aggregation
├── requirements.txt
└── tests/
    ├── test_musicbrainz.py
    ├── test_genius.py
    └── ...
```
### Main Entry Point
```python
import asyncio
from contextlib import asynccontextmanager

from aio_pika import connect_robust
from fastapi import FastAPI

from aggregator import MetadataAggregator
from config import load_config
from providers import ProviderFactory

config = load_config()


async def consume_events():
    connection = await connect_robust(config.rabbitmq_url)
    channel = await connection.channel()
    queue = await channel.declare_queue("file.added")
    async with queue.iterator() as queue_iter:
        async for message in queue_iter:
            async with message.process():
                # message.body is bytes; this assumes it holds the file ID as text
                await process_file(int(message.body))


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Start the consumer on the event loop that uvicorn runs,
    # and stop it when the application shuts down.
    consumer = asyncio.create_task(consume_events())
    yield
    consumer.cancel()


app = FastAPI(lifespan=lifespan)


@app.get("/health")
async def health():
    return {"status": "healthy"}


async def process_file(file_id: int):
    # Fetch file metadata from Server (fetch_file is not shown here)
    file_data = await fetch_file(file_id)

    # Query providers in parallel
    factory = ProviderFactory(config)
    providers = factory.get_enabled_providers()
    tasks = [provider.fetch_metadata(file_data) for provider in providers]
    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Aggregate results
    aggregator = MetadataAggregator(config.provider_order)
    metadata = aggregator.aggregate(results)

    # Push to Server (push_metadata is not shown here)
    await push_metadata(file_id, metadata)


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=6789)
```
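
Because `asyncio.gather(..., return_exceptions=True)` returns raised exceptions in place of results, whatever consumes `results` has to separate the two. A minimal sketch of that step follows, assuming hypothetical stand-in providers (`ok_provider`, `failing_provider`, `empty_provider`) rather than Meelo's real ones:

```python
import asyncio
from typing import Optional


async def ok_provider(file_data: dict) -> Optional[dict]:
    return {"title": file_data["title"], "source": "ok_provider"}


async def failing_provider(file_data: dict) -> Optional[dict]:
    raise RuntimeError("provider unavailable")


async def empty_provider(file_data: dict) -> Optional[dict]:
    return None  # the provider found nothing


async def query_all(file_data: dict) -> tuple[list[dict], list[BaseException]]:
    providers = [ok_provider, failing_provider, empty_provider]
    results = await asyncio.gather(
        *(provider(file_data) for provider in providers),
        return_exceptions=True,
    )
    # Exceptions come back as values, in the same order as the providers.
    errors = [r for r in results if isinstance(r, BaseException)]
    hits = [r for r in results if isinstance(r, dict)]
    return hits, errors


if __name__ == "__main__":
    hits, errors = asyncio.run(query_all({"title": "Come Together"}))
    print(hits, errors)
```

Keeping the errors separate lets one failing provider be logged or retried without discarding what the others returned.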
### Provider Base Class
```python
from abc import ABC, abstractmethod
from typing import Optional


class Provider(ABC):
    def __init__(self, config):
        self.config = config

    @abstractmethod
    async def fetch_metadata(self, file_data: dict) -> Optional[dict]:
        """Fetch metadata for file."""
        pass

    @abstractmethod
    async def search_artist(self, name: str) -> Optional[dict]:
        """Search for artist by name."""
        pass

    @abstractmethod
    async def search_album(self, artist: str, album: str) -> Optional[dict]:
        """Search for album by artist and title."""
        pass
```
### MusicBrainz Provider
```python
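import asyncio
from typing import Optional

import musicbrainzngs as mb
from aiolimiter import AsyncLimiter

from providers.base import Provider


class MusicBrainzProvider(Provider):
    def __init__(self, config):
        super().__init__(config)
        mb.set_useragent("Meelo", "1.0", "https://github.com/Arthi-chaud/Meelo")
        self.limiter = AsyncLimiter(1, 1)  # 1 request per second

    async def _call(self, func, *args, **kwargs):
        # musicbrainzngs is synchronous: run each call in a worker thread so it
        # does not block the event loop, and rate-limit every request.
        async with self.limiter:
            return await asyncio.to_thread(func, *args, **kwargs)

    async def fetch_metadata(self, file_data: dict) -> Optional[dict]:
        # Try AcoustID fingerprint first
        if file_data.get("fingerprint"):
            result = await self._query_by_fingerprint(file_data["fingerprint"])
            if result:
                return result
        # Fall back to text search
        return await self._query_by_text(
            file_data["metadata"]["artist"],
            file_data["metadata"]["album"],
            file_data["metadata"]["title"],
        )

    async def _query_by_fingerprint(self, fingerprint: str) -> Optional[dict]:
        # NOTE: call kept from the original. A PUID is not an AcoustID fingerprint,
        # and MusicBrainz no longer serves PUID lookups, so expect this to fall
        # through to the text search. Resolving a fingerprint needs the AcoustID
        # web service instead.
        try:
            result = await self._call(mb.get_recordings_by_puid, fingerprint)
        except mb.WebServiceError:
            return None
        recordings = result.get("recording-list")
        return self._extract_metadata(recordings[0]) if recordings else None

    async def _query_by_text(self, artist: str, album: str, title: str) -> Optional[dict]:
        try:
            result = await self._call(
                mb.search_recordings, artist=artist, release=album, recording=title, limit=1
            )
        except mb.WebServiceError:
            return None
        recordings = result.get("recording-list")
        return self._extract_metadata(recordings[0]) if recordings else None

    # The two methods below are required by the Provider base class;
    # without them the class cannot be instantiated.
    async def search_artist(self, name: str) -> Optional[dict]:
        try:
            result = await self._call(mb.search_artists, artist=name, limit=1)
        except mb.WebServiceError:
            return None
        artists = result.get("artist-list")
        return {"name": artists[0]["name"], "mbid": artists[0]["id"]} if artists else None

    async def search_album(self, artist: str, album: str) -> Optional[dict]:
        try:
            result = await self._call(mb.search_releases, artist=artist, release=album, limit=1)
        except mb.WebServiceError:
            return None
        releases = result.get("release-list")
        return {"title": releases[0]["title"], "mbid": releases[0]["id"]} if releases else None

    def _extract_metadata(self, recording: dict) -> dict:
        return {
            "title": recording["title"],
            "artist": recording["artist-credit"][0]["artist"]["name"],
            "album": recording["release-list"][0]["title"] if recording.get("release-list") else None,
            "duration": recording.get("length"),
            "mbid": recording["id"],
        }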
```
### Testing
```python
import pytest

from providers.musicbrainz import MusicBrainzProvider


@pytest.mark.asyncio
async def test_musicbrainz_search():
    provider = MusicBrainzProvider({})
    result = await provider.search_artist("The Beatles")
    assert result is not None
    assert result["name"] == "The Beatles"
    assert "mbid" in result
```
## Front (Next.js)
### Directory Structure
```
front/web/
├── pages/
│   ├── index.tsx        # Home page
│   ├── artists/
│   │   ├── index.tsx    # Artist list
│   │   └── [id].tsx     # Artist detail
│   ├── albums/
│   ├── songs/
│   ├── playlists/
│   └── settings/
├── components/
│   ├── ArtistCard.tsx
│   ├── AlbumCard.tsx
│   ├── TrackList.tsx
│   └── Player.tsx
├── hooks/
│   ├── useArtists.ts
│   ├── useAlbums.ts
│   └── usePlayback.ts
├── state/
│   ├── auth.ts          # Jotai atoms
│   ├── playback.ts
│   └── settings.ts
├── lib/
│   └── api.ts           # API client
└── styles/
    └── globals.css
```
### API Client
```typescript
import axios from 'axios';
const api = axios.create({
baseURL: process.env.NEXT_PUBLIC_API_URL,
});
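api.interceptors.request.use((config) => {
  // localStorage exists only in the browser; skip it during server-side rendering
  const token = typeof window !== 'undefined' ? localStorage.getItem('token') : null;
  if (token) {
    config.headers.Authorization = `Bearer ${token}`;
  }
  return config;
});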
export const artistsApi = {
getAll: (params?: { skip?: number; take?: number }) =>
api.get('/artists', { params }),
getOne: (id: number, include?: string[]) =>
api.get(`/artists/${id}`, { params: { include } }),
create: (data: CreateArtistDto) => api.post('/artists', data),
update: (id: number, data: UpdateArtistDto) => api.patch(`/artists/${id}`, data),
delete: (id: number) => api.delete(`/artists/${id}`),
};
```
### TanStack Query Hook
```typescript
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query';
import { artistsApi } from '../lib/api';
export function useArtists(params?: { skip?: number; take?: number }) {
  return useQuery({
    queryKey: ['artists', params],
    // Unwrap the Axios response so `data` is the API payload itself
    queryFn: () => artistsApi.getAll(params).then((res) => res.data),
  });
}
export function useArtist(id: number, include?: string[]) {
  return useQuery({
    queryKey: ['artists', id, include],
    queryFn: () => artistsApi.getOne(id, include).then((res) => res.data),
  });
}
export function useCreateArtist() {
const queryClient = useQueryClient();
return useMutation({
mutationFn: artistsApi.create,
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['artists'] });
},
});
}
```
### Component Example
```typescript
import { useArtists } from '../hooks/useArtists';
import ArtistCard from '../components/ArtistCard';
export default function ArtistsPage() {
const { data, isLoading, error } = useArtists({ take: 20 });
if (isLoading) return <div>Loading...</div>;
if (error) return <div>Error loading artists</div>;
return (
<div>
<h1>Artists</h1>
<div className="grid">
{data.items.map((artist) => (
<ArtistCard key={artist.id} artist={artist} />
))}
</div>
</div>
);
}
```
## Code Quality
### Biome Configuration
```json
{
"formatter": {
"enabled": true,
"indentStyle": "tab",
"lineWidth": 100
},
"linter": {
"enabled": true,
"rules": {
"recommended": true
}
},
"javascript": {
"formatter": {
"quoteStyle": "double"
}
}
}
```
### Logging
**Server (NestJS)**:
```typescript
import { Logger } from '@nestjs/common';
const logger = new Logger('ArtistService');
logger.log('Artist created', { id: artist.id });
logger.error('Failed to create artist', error.stack);
```
**Scanner (Go)**:
```go
import "github.com/rs/zerolog/log"
log.Info().Str("path", path).Msg("File registered")
log.Error().Err(err).Msg("Failed to extract metadata")
```
**Matcher (Python)**:
```python
import logging
logger = logging.getLogger(__name__)
logger.info(f"Fetching metadata for file {file_id}")
logger.error(f"Provider failed: {provider_name}", exc_info=True)
```
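
The Python snippet above emits plain-text messages. If JSON output is wanted from the standard `logging` module, a custom formatter is one way to get it. A minimal sketch follows; the class name `JsonFormatter`, the helper `build_logger`, and the chosen field names are assumptions, not Meelo's actual logging setup.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger("matcher", buffer)
    log.info("Fetching metadata for file %s", 42)
    print(buffer.getvalue(), end="")
```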
## Summary
Meelo's codebase is organized into four microservices with clear separation of concerns. Server uses NestJS modules for domain logic, Prisma for database access, and Jest for testing. Scanner uses Go packages for file processing, FFprobe for metadata extraction, and AcoustID for fingerprinting. Matcher uses Python provider modules for external queries, asyncio for parallelism, and pytest for testing. Front uses Next.js pages for routing, TanStack Query for data fetching, and Jotai for state management. Code quality is enforced via Biome linting, type checking (TypeScript, Pyright, Go), and SonarCloud quality gates. Logging uses structured formats (JSON) for easy parsing. The monorepo structure simplifies version coordination and cross-service changes.
@@ -0,0 +1,839 @@
# Meelo Deployment
## Deployment Overview
Meelo deploys as a multi-container Docker application orchestrated by Docker Compose. Three deployment variants support different use cases: production (pre-built images), development (hot reload), and local build (custom images).
## Docker Compose Variants
### Production (docker-compose.yml)
**Use Case**: End users running stable releases
**Images**: Pre-built from Docker Hub
**Startup Time**: Fast (no build step)
**Updates**: Pull new images, restart containers
```yaml
services:
  server:
    image: arthichaud/meelo-server:latest
    restart: always
    depends_on:
      db:
        condition: service_healthy
      meilisearch:
        condition: service_healthy
      mq:
        condition: service_healthy
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/meelo
      - MEILISEARCH_URL=http://meilisearch:7700
      - RABBITMQ_URL=amqp://guest:guest@mq:5672
    volumes:
      - ${CONFIG_DIR}:/config
      - ${DATA_DIR}:/data
```
**Key Features**:
- `restart: always` for automatic recovery
- Health check dependencies ensure startup order
- Environment variables from .env
- Volumes for config and data persistence
### Development (docker-compose.dev.yml)
**Use Case**: Contributors developing features
**Images**: Built from source with hot reload
**Startup Time**: Slower (build + watch)
**Updates**: Automatic on file save
```yaml
services:
  server:
    build:
      context: ./server
      dockerfile: Dockerfile.dev
    volumes:
      - ./server/src:/app/src
      - ./server/prisma:/app/prisma
    ports:
      - "4000:4000"
    environment:
      - NODE_ENV=development
    command: npm run start:dev
```
**Key Features**:
- Source directories mounted for hot reload
- Exposed ports for debugging
- Development commands (start:dev, test:watch)
- No restart policy (manual control)
### Local Build (docker-compose.local.yml)
**Use Case**: Testing Dockerfile changes, custom builds
**Images**: Built from source
**Startup Time**: Slow (full build)
**Updates**: Rebuild images manually
```yaml
services:
  server:
    build:
      context: ./server
      dockerfile: Dockerfile
    restart: unless-stopped
```
**Key Features**:
- Builds production images locally
- Tests Dockerfile changes before pushing
- `unless-stopped` restart policy
## Service Configuration
### Server (NestJS)
**Image**: arthichaud/meelo-server
**Port**: 4000
**Dependencies**: PostgreSQL, MeiliSearch, RabbitMQ
**Environment Variables**:
```bash
DATABASE_URL=postgresql://postgres:postgres@db:5432/meelo
MEILISEARCH_URL=http://meilisearch:7700
RABBITMQ_URL=amqp://guest:guest@mq:5672
JWT_SIGNATURE=your_secret_key
PORT=4000
PUBLIC_URL=https://meelo.example.com
CONFIG_DIR=/config
DATA_DIR=/data
```
**Volumes**:
- `${CONFIG_DIR}:/config` - settings.json
- `${DATA_DIR}:/data` - music files (read-only)
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:4000/api/health"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 40s
```
### Scanner (Go)
**Image**: arthichaud/meelo-scanner
**Port**: 8133
**Dependencies**: Server
**Environment Variables**:
```bash
SERVER_URL=http://server:4000
API_KEY=your_api_key
```
**Volumes**:
- `${DATA_DIR}:/data` - music files (read-only)
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:8133/"]
  interval: 30s
  timeout: 10s
  retries: 3
```
### Matcher (Python)
**Image**: arthichaud/meelo-matcher
**Port**: 6789
**Dependencies**: Server, RabbitMQ
**Environment Variables**:
```bash
SERVER_URL=http://server:4000
RABBITMQ_URL=amqp://guest:guest@mq:5672
GENIUS_ACCESS_TOKEN=your_genius_token
DISCOGS_ACCESS_TOKEN=your_discogs_token
```
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:6789/health"]
  interval: 30s
  timeout: 10s
  retries: 3
```
### Front (Next.js)
**Image**: arthichaud/meelo-front
**Port**: 3000
**Dependencies**: Server
**Environment Variables**:
```bash
NEXT_PUBLIC_API_URL=http://localhost/api
```
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:3000/api/health"]
  interval: 30s
  timeout: 10s
  retries: 3
```
### PostgreSQL
**Image**: postgres:alpine3.14
**Port**: 5432 (internal only)
**Volume**: meelo_db
**Environment Variables**:
```bash
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=meelo
```
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "pg_isready", "-U", "postgres"]
  interval: 10s
  timeout: 5s
  retries: 5
```
### MeiliSearch
**Image**: getmeili/meilisearch:v1.5
**Port**: 7700 (internal only)
**Volume**: meelo_search
**Environment Variables**:
```bash
MEILI_ENV=production
MEILI_NO_ANALYTICS=true
```
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:7700/health"]
  interval: 10s
  timeout: 5s
  retries: 5
```
### RabbitMQ
**Image**: rabbitmq:4.2-alpine
**Port**: 5672 (AMQP), 15672 (management UI, available only when the management plugin is enabled, for example via a `-management` image tag)
**Volume**: meelo_rabbitmq_data
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "rabbitmq-diagnostics", "ping"]
  interval: 10s
  timeout: 5s
  retries: 5
```
### Kyoo Transcoder
**Image**: zoriya/kyoo_transcoder:latest
**Port**: 7666 (internal only)
**Volume**: meelo_transcoder_cache
**Environment Variables**:
```bash
TRANSCODER_CACHE_ROOT=/cache
```
No health check (optional service).
### Nginx
**Image**: nginx:1.29.7-alpine
**Port**: 80 (exposed to host)
**Config**: Mounted from nginx.conf
**Configuration**:
```nginx
server {
listen 80;
server_name localhost;
location / {
proxy_pass http://front:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
location /api/ {
proxy_pass http://server:4000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
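    location /scanner/ {
        # The trailing slash strips the /scanner/ prefix, so /scanner/scan reaches the Scanner as /scan
        proxy_pass http://scanner:8133/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    location /matcher/ {
        # Same prefix stripping: /matcher/health reaches the Matcher as /health
        proxy_pass http://matcher:6789/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }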
location /api/events {
proxy_pass http://server:4000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
}
```
**Health Check**:
```yaml
healthcheck:
  test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost/"]
  interval: 30s
  timeout: 10s
  retries: 3
```
## Volumes
### Named Volumes
```yaml
volumes:
  meelo_db:
    driver: local
  meelo_search:
    driver: local
  meelo_rabbitmq_data:
    driver: local
  meelo_transcoder_cache:
    driver: local
```
**Persistence**:
- `meelo_db`: PostgreSQL data (critical, backup regularly)
- `meelo_search`: MeiliSearch index (can rebuild from database)
- `meelo_rabbitmq_data`: Message queue state (can lose without data loss)
- `meelo_transcoder_cache`: Transcoded video segments (can delete to free space)
### Bind Mounts
```yaml
volumes:
  - ${CONFIG_DIR}:/config
  - ${DATA_DIR}:/data:ro
```
**Paths**:
- `CONFIG_DIR`: Directory containing settings.json (default: ./config)
- `DATA_DIR`: Music library directory (default: ./data)
**Permissions**:
- `DATA_DIR` mounted read-only (`:ro`) to prevent accidental modification
- Services run as non-root user (UID 1000)
## Startup Order
Docker Compose orchestrates startup using health checks:
```
1. PostgreSQL starts
└─ Health check: pg_isready
2. MeiliSearch starts
└─ Health check: GET /health
3. RabbitMQ starts
└─ Health check: rabbitmq-diagnostics ping
4. Server starts (depends on db, meilisearch, mq)
└─ Runs Prisma migrations
└─ Seeds initial data
└─ Health check: GET /api/health
5. Scanner starts (depends on server)
└─ Registers with Server
└─ Health check: GET /
6. Matcher starts (depends on server, mq)
└─ Connects to RabbitMQ
└─ Health check: GET /health
7. Front starts (depends on server)
└─ SSR requires Server API
└─ Health check: GET /api/health
8. Transcoder starts (no dependencies)
9. Nginx starts (depends on all application services)
└─ Health check: GET /
```
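**Start Period**: Of the health checks shown above, only the Server's sets a start period (40s). Failed probes during that window do not count toward the retry limit, which lets migrations and seeding finish without false failures. The other services rely on their interval and retry settings alone.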
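
The ordering above is a topological sort of the `depends_on` graph. A minimal sketch that derives one valid order with Python's standard `graphlib` follows. The dictionary mirrors the dependencies listed above, with Nginx's dependencies taken to be the four application services; the service keys are illustrative names.

```python
from graphlib import TopologicalSorter

# service -> services it waits for (each must be healthy first)
DEPENDS_ON = {
    "db": set(),
    "meilisearch": set(),
    "mq": set(),
    "transcoder": set(),
    "server": {"db", "meilisearch", "mq"},
    "scanner": {"server"},
    "matcher": {"server", "mq"},
    "front": {"server"},
    "nginx": {"front", "server", "scanner", "matcher"},
}


def startup_order(graph: dict[str, set[str]]) -> list[str]:
    """Return one valid startup order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(startup_order(DEPENDS_ON))
```

Services with no dependency between them (for example `db`, `meilisearch`, and `mq`) can start concurrently; the numbered list shows just one valid order.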
## Configuration Files
### .env
Environment variables for deployment:
```bash
# Ports
PORT=4000
FRONT_PORT=3000
SCANNER_PORT=8133
MATCHER_PORT=6789
# URLs
PUBLIC_URL=https://meelo.example.com
# Directories
CONFIG_DIR=./config
DATA_DIR=/path/to/music
# Database
DATABASE_URL=postgresql://postgres:postgres@db:5432/meelo
# Search
MEILISEARCH_URL=http://meilisearch:7700
# Message Queue
RABBITMQ_URL=amqp://guest:guest@mq:5672
# Authentication
JWT_SIGNATURE=your_secret_key_here
ALLOW_ANONYMOUS=0
# External Providers
GENIUS_ACCESS_TOKEN=your_genius_token
DISCOGS_ACCESS_TOKEN=your_discogs_token
# Last.fm OAuth
LASTFM_API_KEY=your_lastfm_key
LASTFM_API_SECRET=your_lastfm_secret
# CORS
CORS_ORIGINS=https://meelo.example.com
```
### settings.json
User preferences (stored in CONFIG_DIR):
```json
{
"trackRegex": "(?P<artist>[^/]+)/(?P<album>[^/]+)/(?P<disc>\\d+)-(?P<track>\\d+) (?P<title>.+)\\.(?P<ext>\\w+)",
"metadata": {
"source": "providers",
"order": ["musicbrainz", "genius", "wikipedia", "lrclib"]
},
"providers": {
"musicbrainz": { "enabled": true },
"genius": { "enabled": true },
"wikipedia": { "enabled": true },
"wikidata": { "enabled": true },
"discogs": { "enabled": false },
"allmusic": { "enabled": false },
"metacritic": { "enabled": false },
"lrclib": { "enabled": true }
},
"compilations": {
"detectByArtist": true,
"detectByFolder": true,
"keywords": ["Various Artists", "Compilation", "Soundtrack"]
}
}
```
## First-Time Setup
### 1. Clone Repository
```bash
git clone https://github.com/Arthi-chaud/Meelo.git
cd Meelo
```
### 2. Configure Environment
```bash
cp .env.example .env
nano .env
```
Fill in required values:
- `DATA_DIR`: Path to music library
- `JWT_SIGNATURE`: Random secret key
- `GENIUS_ACCESS_TOKEN`: Genius API token (optional)
- `DISCOGS_ACCESS_TOKEN`: Discogs API token (optional)
- `LASTFM_API_KEY`, `LASTFM_API_SECRET`: Last.fm OAuth credentials (optional)
### 3. Create Settings File
```bash
mkdir -p config
nano config/settings.json
```
Copy the example settings from above and adjust `trackRegex` to match your file naming.
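
Before scanning, it can help to check the pattern against a sample path. The following is a minimal sketch using Python's `re`, which accepts the same `(?P<name>...)` group syntax as the example pattern. The sample path and the helper name `parse_track_path` are made up, and the Scanner itself compiles the pattern with Go's `regexp`, so edge cases may differ.

```python
import re

# Same pattern as the example settings.json, with JSON escaping removed
TRACK_REGEX = re.compile(
    r"(?P<artist>[^/]+)/(?P<album>[^/]+)/"
    r"(?P<disc>\d+)-(?P<track>\d+) (?P<title>.+)\.(?P<ext>\w+)"
)


def parse_track_path(path: str) -> dict:
    """Return the named groups for a path, or an empty dict when it does not match."""
    match = TRACK_REGEX.search(path)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    print(parse_track_path("/data/The Beatles/Abbey Road/1-01 Come Together.flac"))
```

A path that yields an empty dict indicates that the pattern needs adjusting for that naming scheme.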
### 4. Start Services
```bash
docker-compose up -d
```
Wait for all services to become healthy:
```bash
docker-compose ps
```
### 5. Register Admin User
Navigate to `http://localhost` and register the first user, who automatically becomes the admin.
### 6. Create Library
1. Go to Settings > Libraries
2. Click "Add Library"
3. Enter name and path (must match DATA_DIR mount)
4. Save
### 7. Trigger Initial Scan
```bash
curl -X POST http://localhost/scanner/scan
```
Monitor progress:
```bash
curl http://localhost/scanner/tasks
```
### 8. Wait for Enrichment
Matcher processes files asynchronously. Check progress in UI (Artists/Albums pages populate as metadata arrives).
## Updates
### Pull New Images
```bash
docker-compose pull
```
### Restart Services
```bash
docker-compose up -d
```
Docker Compose recreates containers with new images. Volumes persist data.
### Database Migrations
Prisma migrations run automatically on Server startup. No manual intervention needed.
## Backup
### Database Backup
```bash
docker exec meelo-db pg_dump -U postgres meelo > backup.sql
```
### Restore Database
```bash
docker exec -i meelo-db psql -U postgres meelo < backup.sql
```
### Volume Backup
```bash
docker run --rm -v meelo_db:/data -v $(pwd):/backup alpine tar czf /backup/db.tar.gz /data
```
### Restore Volume
```bash
docker run --rm -v meelo_db:/data -v $(pwd):/backup alpine tar xzf /backup/db.tar.gz -C /
```
### Config Backup
```bash
cp -r config config.backup
```
## Monitoring
### Service Status
```bash
docker-compose ps
```
Shows health status for all services.
### Logs
**All Services**:
```bash
docker-compose logs -f
```
**Specific Service**:
```bash
docker-compose logs -f server
```
**Last 100 Lines**:
```bash
docker-compose logs --tail=100 server
```
### Resource Usage
```bash
docker stats
```
Shows CPU, memory, network, and disk I/O per container.
## Troubleshooting
### Service Won't Start
Check logs:
```bash
docker-compose logs <service>
```
Common issues:
- **Database connection failed**: PostgreSQL not healthy yet, wait longer
- **Port already in use**: Change port in .env
- **Volume mount failed**: Check DATA_DIR path exists and has correct permissions
### Health Check Failing
Increase start period in docker-compose.yml:
```yaml
healthcheck:
  start_period: 60s # Increase from 40s
```
### Out of Memory
Increase Docker memory limit (Docker Desktop settings) or reduce concurrent services.
### Slow Performance
Check resource usage:
```bash
docker stats
```
Bottlenecks:
- **High CPU on Matcher**: Too many providers enabled, disable optional ones
- **High memory on MeiliSearch**: Large library, increase Docker memory
- **High I/O on Scanner**: Slow disk, use SSD
## Production Deployment
### Reverse Proxy
Use Nginx or Caddy as external reverse proxy:
```nginx
server {
listen 443 ssl http2;
server_name meelo.example.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
location / {
proxy_pass http://localhost:80;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
```
### HTTPS
Use Let's Encrypt with Certbot:
```bash
certbot --nginx -d meelo.example.com
```
Or use Caddy (automatic HTTPS):
```
meelo.example.com {
reverse_proxy localhost:80
}
```
### Firewall
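Allow only HTTPS from the public internet, and keep SSH open if you administer the host remotely:
```bash
ufw allow 22/tcp    # SSH, so enabling the firewall does not lock you out
ufw allow 443/tcp   # HTTPS
ufw enable
```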
### Security Hardening
- Set `ALLOW_ANONYMOUS=0` in .env
- Use strong `JWT_SIGNATURE` (32+ random characters)
- Restrict `CORS_ORIGINS` to your domain
- Run Docker in rootless mode
- Enable Docker Content Trust
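
For the `JWT_SIGNATURE` item above, one way to produce a suitable value is Python's `secrets` module. A minimal sketch follows; the helper name `generate_jwt_signature` is hypothetical.

```python
import secrets


def generate_jwt_signature(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS-level randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env
    print(f"JWT_SIGNATURE={generate_jwt_signature()}")
```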
### Monitoring
Use Prometheus + Grafana (future enhancement, not built-in).
### Backups
Automate database backups with cron:
```bash
0 2 * * * docker exec meelo-db pg_dump -U postgres meelo > /backups/meelo-$(date +\%Y\%m\%d).sql
```
Rotate backups:
```bash
find /backups -name "meelo-*.sql" -mtime +30 -delete
```
## CI/CD
### GitHub Actions
Meelo uses GitHub Actions for CI/CD. Workflows per service:
**server.yml**:
```yaml
name: Server CI/CD
on:
  push:
    branches: [main]
    paths:
      - 'server/**'
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 20
      - run: npm ci
        working-directory: server
      - run: npm run lint
        working-directory: server
      - run: npm test
        working-directory: server
      - uses: SonarSource/sonarcloud-github-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - uses: docker/build-push-action@v4
        with:
          context: ./server
          push: true
          tags: arthichaud/meelo-server:latest
```
Similar workflows for scanner, matcher, front.
### Quality Gates
SonarCloud enforces:
- Code coverage > 80%
- No critical bugs
- No security vulnerabilities
- Maintainability rating A
Failing quality gates block merges.
## Scaling
### Horizontal Scaling
Run multiple instances of stateless services:
```yaml
services:
  scanner:
    image: arthichaud/meelo-scanner
    deploy:
      replicas: 3
```
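Load balance with an Nginx upstream (the `scanner_N` hostnames are placeholders; actual container names depend on the Compose project name and version):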
```nginx
upstream scanner {
server scanner_1:8133;
server scanner_2:8133;
server scanner_3:8133;
}
location /scanner/ {
proxy_pass http://scanner;
}
```
### Vertical Scaling
Increase container resources:
```yaml
services:
  server:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
```
## Summary
Meelo's deployment uses Docker Compose to orchestrate eight core services plus an optional transcoder, with health checks ensuring correct startup order. Three variants (production, development, local build) support different use cases. Configuration is split between .env for deployment settings and settings.json for user preferences. Named volumes persist data, and bind mounts provide access to music files. First-time setup involves configuring the environment, creating settings, starting services, registering an admin, creating a library, and triggering a scan. Updates are simple (pull images, restart). Backups cover the database, volumes, and config. Production deployment adds a reverse proxy, HTTPS, a firewall, and security hardening. CI/CD via GitHub Actions enforces quality gates. Scaling options include horizontal (multiple instances) and vertical (more resources).
@@ -0,0 +1,564 @@
# Meelo Evaluation
## Strengths
### Data Model Sophistication
Meelo's data model is the most mature among self-hosted music servers. The Album/Release and Song/Track distinctions accurately represent real-world music organization.
**Album vs Release**:
- Albums are abstract concepts (e.g., "Abbey Road")
- Releases are physical/digital manifestations (original, 2019 remaster, deluxe edition)
- One album can have multiple releases with different track listings, mastering, labels
This mirrors how music collectors think. A remaster is not a different album, it's a different release of the same album.
**Song vs Track**:
- Songs are compositions (e.g., "Come Together")
- Tracks are recordings (studio version, live version, acoustic version)
- One song can have multiple tracks across different releases
This enables tracking different performances of the same composition without creating duplicate songs.
**Song Groups**:
- Group versions of the same composition (original, covers, remixes)
- Example: "Hallelujah" by Leonard Cohen, Jeff Buckley, Pentatonix
- Enables discovering different interpretations
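
The relationships above can be sketched with plain dataclasses. This illustrates the Album/Release and Song/Track split only; the class and field names are assumptions and do not reproduce Meelo's Prisma schema.

```python
from dataclasses import dataclass, field


@dataclass
class Song:
    name: str  # the composition, e.g. "Come Together"


@dataclass
class Track:
    song: Song    # which composition this recording performs
    version: str  # e.g. "studio", "live"


@dataclass
class Release:
    name: str  # e.g. "2019 Remaster"
    tracks: list[Track] = field(default_factory=list)


@dataclass
class Album:
    name: str  # the abstract album, e.g. "Abbey Road"
    releases: list[Release] = field(default_factory=list)

    def tracks_of(self, song: Song) -> list[Track]:
        """Every recording of one song across all releases of this album."""
        return [t for r in self.releases for t in r.tracks if t.song is song]


if __name__ == "__main__":
    song = Song("Come Together")
    album = Album("Abbey Road", [
        Release("Original", [Track(song, "studio")]),
        Release("2019 Remaster", [Track(song, "studio remaster")]),
    ])
    print(len(album.tracks_of(song)))
```

Because both tracks point at the same `Song` object, a remaster adds a release and a track without duplicating the song.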
No other self-hosted music server implements this level of versioning.
### Multi-Provider Metadata
Meelo queries 8 external providers:
1. **MusicBrainz**: Primary database, most accurate
2. **Genius**: Lyrics and song descriptions
3. **Wikipedia**: Artist/album context
4. **Wikidata**: Structured data
5. **Discogs**: Release details
6. **AllMusic**: Editorial reviews
7. **Metacritic**: Critic scores
8. **LrcLib**: Synced lyrics
**Aggregation Strategy**:
- Priority-based merging (MusicBrainz > Genius > Wikipedia)
- Concatenate descriptions from multiple sources
- Average ratings across providers
- Prefer synced lyrics over plain
**Result**: Richer metadata than single-provider systems. Descriptions combine MusicBrainz facts, Wikipedia context, and Genius annotations.
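
The merging rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `aggregate`, the result keys (`description`, `rating`), and the provider identifiers are hypothetical, the lyrics preference is omitted, and Meelo's actual aggregator may differ.

```python
from statistics import mean
from typing import Any


def aggregate(results: dict[str, dict[str, Any]], order: list[str]) -> dict[str, Any]:
    """Merge per-provider results: the first provider in `order` wins scalar fields,
    descriptions are concatenated, and ratings are averaged."""
    merged: dict[str, Any] = {}
    descriptions: list[str] = []
    ratings: list[float] = []
    for provider in order:
        data = results.get(provider)
        if not data:
            continue  # provider disabled, failed, or returned nothing
        for key, value in data.items():
            if value is None:
                continue
            if key == "description":
                descriptions.append(value)
            elif key == "rating":
                ratings.append(value)
            else:
                merged.setdefault(key, value)  # keep the higher-priority value
    if descriptions:
        merged["description"] = "\n\n".join(descriptions)
    if ratings:
        merged["rating"] = mean(ratings)
    return merged


if __name__ == "__main__":
    print(aggregate(
        {"genius": {"title": "B", "description": "Genius text"},
         "musicbrainz": {"title": "A", "description": "MusicBrainz text"}},
        ["musicbrainz", "genius", "wikipedia"],
    ))
```

Iterating in priority order and using `setdefault` means a lower-priority provider only fills fields that higher-priority providers left empty.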
### Music Video Support
Videos are first-class citizens, not afterthoughts.
**Video Types**:
- Official music videos
- Live performances
- Lyric videos
- Behind the scenes
- Interviews
- Documentaries
**Integration**:
- Videos link to songs (same as audio tracks)
- Kyoo transcoder handles adaptive streaming
- UI treats videos equally with audio
**Comparison**:
- **Navidrome**: No video support
- **Jellyfin**: Videos are separate media type, not linked to songs
- **Plex**: Similar to Jellyfin
Meelo is the only self-hosted music server with proper music video integration.
### Event-Driven Architecture
RabbitMQ decouples scanning from enrichment.
**Flow**:
1. Scanner registers file with Server
2. Scanner publishes event to RabbitMQ
3. Matcher consumes event asynchronously
4. Matcher queries providers in parallel
5. Matcher pushes enriched metadata to Server
**Benefits**:
- Scanning doesn't block on provider queries
- Matcher can retry failed providers without re-scanning
- Multiple matchers can process events in parallel
- Provider failures don't stop scanning
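
The decoupling described above can be sketched with an in-process `asyncio.Queue` standing in for RabbitMQ. The names `scan`, `matcher`, and `run` are illustrative, and a real broker adds persistence and cross-process delivery that this sketch does not model.

```python
import asyncio


async def scan(paths: list[str], queue: asyncio.Queue) -> None:
    """Producer: publish one event per file without waiting for enrichment."""
    for file_id, _path in enumerate(paths, start=1):
        await queue.put(file_id)


async def matcher(name: str, queue: asyncio.Queue, done: list) -> None:
    """Consumer: several matchers can drain the same queue in parallel."""
    while True:
        file_id = await queue.get()
        try:
            await asyncio.sleep(0)  # stand-in for slow provider queries
            done.append((name, file_id))
        finally:
            queue.task_done()


async def run(paths: list[str], workers: int = 2) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    done: list = []
    tasks = [asyncio.create_task(matcher(f"matcher-{i}", queue, done)) for i in range(workers)]
    await scan(paths, queue)  # scanning finishes as soon as the events are queued
    await queue.join()        # wait until every event has been processed
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return done


if __name__ == "__main__":
    print(asyncio.run(run(["a.flac", "b.flac", "c.flac"])))
```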
**Comparison**:
- **Navidrome**: Synchronous metadata fetching blocks scanning
- **Airsonic**: No external metadata providers
### Scrobbling Built-In
Last.fm and ListenBrainz integration is native, not a plugin.
**Features**:
- OAuth flow for Last.fm
- Token-based auth for ListenBrainz
- Automatic scrobbling on track play
- "Now playing" updates
**Comparison**:
- **Navidrome**: Also scrobbles to Last.fm and ListenBrainz natively
- **Airsonic**: No built-in scrobbling
### Mobile App
Expo/React Native app shares code with web frontend.
**Shared**:
- Components (ArtistCard, AlbumCard, TrackList)
- Hooks (useArtists, useAlbums, usePlayback)
- State management (Jotai atoms)
**Mobile-Specific**:
- React Navigation instead of Next.js router
- AsyncStorage instead of localStorage
- expo-av for media playback
- expo-notifications for background playback
**Result**: Feature parity between web and mobile without duplicating code.
**Comparison**:
- **Navidrome**: Third-party mobile apps (Substreamer, Subtracks)
- **Jellyfin**: Official mobile app, but music is secondary
### Search Performance
MeiliSearch provides sub-100ms search across large libraries.
**Features**:
- Typo tolerance (handles misspellings)
- Faceted search (filter by genre, year, type)
- Instant results (as-you-type)
- Relevance ranking
**Indexed Entities**:
- Artists (name, sort name)
- Albums (name, artist name, type, release date)
- Songs (name, artist name, type)
- Videos (name, artist name, type)
**Comparison**:
- **Navidrome**: Database full-text search (slower, no typo tolerance)
- **Airsonic**: Basic SQL LIKE queries
### Active Development
**Indicators**:
- 40 releases (consistent iteration)
- 1,095 stars (healthy community)
- GitHub Actions CI/CD per service
- SonarCloud quality gates
- Regular commits (weekly)
**Comparison**:
- **Navidrome**: Active (single maintainer)
- **Airsonic**: Stagnant (last release 2020)
- **Funkwhale**: Active but slower
### Geographic Context
Areas (countries, cities, regions) are first-class entities.
**Features**:
- ISO 3166 codes
- Parent/child hierarchy (city → state → country)
- Artist associations (birthplace, formation location)
**Use Case**:
- Browse artists by location
- Discover local music scenes
- Understand artist context
**Comparison**: No other self-hosted music server has area support.
### Code Quality
**Measures**:
- SonarCloud enforces 80% coverage, no critical bugs
- Biome linting for TypeScript
- Pyright type checking for Python
- golangci-lint for Go
- Unit tests with Jest (TypeScript), pytest (Python), and Go's `testing` package
**Result**: Quality checks run automatically for every service, whatever its language.
## Weaknesses
### Complex Deployment
9 containers required:
1. Server (NestJS)
2. Scanner (Go)
3. Matcher (Python)
4. Front (Next.js)
5. PostgreSQL
6. MeiliSearch
7. RabbitMQ
8. Kyoo Transcoder
9. Nginx
**Challenges**:
- Docker Compose orchestration
- Health check dependencies
- Volume management
- Network configuration
- Resource allocation
**Comparison**:
- **Navidrome**: Single binary, no dependencies
- **Airsonic**: Single JAR, embedded database option
**Impact**: High barrier to entry for non-technical users.
### Multi-Language Stack
3 languages across services:
- TypeScript (Server, Front web and mobile)
- Go (Scanner)
- Python (Matcher)
**Challenges**:
- Different toolchains (npm, go, pip)
- Different testing frameworks (Jest, Go testing, pytest)
- Different linting tools (Biome, golangci-lint, Ruff)
- Harder to contribute (need expertise in multiple languages)
**Comparison**:
- **Navidrome**: Single language (Go)
- **Airsonic**: Single language (Java)
**Impact**: Steeper learning curve for contributors.
### Heavy Infrastructure
Required services:
- **PostgreSQL**: Relational database
- **MeiliSearch**: Search engine
- **RabbitMQ**: Message queue
- **Kyoo Transcoder**: Video transcoding
**Resource Requirements**:
- Minimum: 4GB RAM, 2 CPU cores
- Recommended: 8GB RAM, 4 CPU cores
- Storage: 10GB + library size
**Comparison**:
- **Navidrome**: 512MB RAM, 1 CPU core, SQLite
- **Airsonic**: 1GB RAM, 1 CPU core, embedded database
**Impact**: Not suitable for low-power devices (Raspberry Pi 3, old NAS).
### Requires Clean Collection
Meelo works best with well-organized music:
- Embedded metadata (ID3 tags, Vorbis comments)
- Standard folder structure (Artist/Album/Track)
- Consistent naming
**Challenges**:
- Messy collections require manual cleanup
- Missing tags need filename regex
- Inconsistent naming breaks matching
**Comparison**:
- **Navidrome**: More forgiving, uses folder structure
- **Jellyfin**: Handles messy collections better
**Impact**: Not suitable for users with poorly organized libraries.
### GPL-3.0 License
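**Restrictions**:
- Distributed derivative works must be licensed under GPL-3.0
- Source code must be made available to anyone who receives the software
- No proprietary forks can be distributed
**Impact**:
- Commercial use and hosted (SaaS) offerings are permitted; GPL-3.0, unlike AGPL-3.0, does not require source disclosure for software offered only over a network
- Companies that want to ship closed-source derivatives cannot adopt the code
- Acceptable for self-hosters, restrictive for businesses building proprietary products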
**Comparison**:
- **Navidrome**: GPL-3.0 (same restrictions)
- **Jellyfin**: GPL-2.0 (similar restrictions)
- **Airsonic**: GPL-3.0 (same restrictions)
### Kyoo Transcoder Dependency
Video transcoding relies on external project (Kyoo).
**Risks**:
- Kyoo development stalls
- Breaking changes in Kyoo API
- Meelo must maintain compatibility
**Comparison**:
- **Jellyfin**: Built-in transcoder (FFmpeg wrapper)
- **Plex**: Built-in transcoder
**Impact**: Video support is fragile.
### No Prometheus Metrics
No built-in metrics for monitoring.
**Missing**:
- Request rates
- Error rates
- Latency percentiles
- Queue depths
- Provider response times
**Workaround**: Parse logs or use external monitoring.
**Comparison**:
- **Navidrome**: Prometheus metrics endpoint
- **Jellyfin**: No metrics
**Impact**: Harder to monitor in production.
## Integration Potential
### Data Model
**Applicability**: Excellent reference for metadata aggregator.
**Lessons**:
- Separate abstract entities (Album, Song) from concrete instances (Release, Track)
- Use song groups for versioning
- Store external metadata separately from core entities
- Use local identifiers for cross-referencing
**Adoption**:
- Implement Album/Release distinction
- Implement Song/Track distinction
- Implement song groups for covers/remixes
- Separate ExternalMetadata table
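
The split between abstract and concrete entities can be sketched with plain dataclasses. This is only an illustration, not Meelo's actual Prisma schema: every class and field name below (including `group_id` as the song-group link and the shape of `ExternalMetadata`) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Song:
    """Abstract work; group_id ties together versions, covers and remixes."""
    id: int
    name: str
    group_id: int


@dataclass
class Album:
    """Abstract album, independent of any particular edition."""
    id: int
    name: str


@dataclass
class Release:
    """Concrete edition of an album (original, remaster, deluxe)."""
    id: int
    album_id: int
    name: str


@dataclass
class Track:
    """Concrete recording of a song on a release, backed by one file."""
    id: int
    song_id: int
    release_id: int
    file_path: str


@dataclass
class ExternalMetadata:
    """Provider data kept apart from the core entity it describes."""
    entity_type: str  # "album", "song", ...
    entity_id: int
    provider: str
    description: Optional[str] = None
    rating: Optional[int] = None  # 0-100 scale


def tracks_of_song_group(tracks, songs, group_id):
    """Return every track whose song belongs to the given song group."""
    song_ids = {s.id for s in songs if s.group_id == group_id}
    return [t for t in tracks if t.song_id in song_ids]


songs = [Song(1, "Yesterday", group_id=10), Song(2, "Yesterday (Live)", group_id=10)]
tracks = [Track(1, 1, 100, "/music/a.flac"), Track(2, 2, 101, "/music/b.flac")]
print([t.id for t in tracks_of_song_group(tracks, songs, 10)])
```

Keeping `ExternalMetadata` in its own record means provider data can be refreshed or discarded without touching the core entities.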
### Provider Pattern
**Applicability**: Directly applicable to metadata aggregator.
**Architecture**:
- Base provider interface (search, fetch)
- Per-provider modules (musicbrainz.py, genius.py)
- Factory pattern for provider instantiation
- Parallel queries with asyncio
- Rate limiting per provider
- Priority-based aggregation
**Adoption**:
- Copy provider interface design
- Implement factory pattern
- Use asyncio for parallel queries
- Implement per-provider rate limiters
- Use priority-based merging
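
The pieces listed above (base interface, factory, parallel queries, priority-based merging) might fit together roughly as follows. This is a sketch under stated assumptions, not the Matcher's code: the `Provider` class, the `Fake*` providers, `REGISTRY`, and `enrich` are all hypothetical names, the providers return canned data instead of calling real APIs, and rate limiting is left out for brevity.

```python
import asyncio
from abc import ABC, abstractmethod


class Provider(ABC):
    """Base interface; concrete providers implement fetch()."""
    name = "base"

    @abstractmethod
    async def fetch(self, query: dict) -> dict:
        ...


class FakeMusicBrainz(Provider):
    name = "musicbrainz"

    async def fetch(self, query):
        return {"title": query["title"], "year": 1965}


class FakeWikipedia(Provider):
    name = "wikipedia"

    async def fetch(self, query):
        return {"year": 1966, "description": "An example description."}


class FailingProvider(Provider):
    name = "discogs"

    async def fetch(self, query):
        raise RuntimeError("provider unavailable")


REGISTRY = {cls.name: cls for cls in (FakeMusicBrainz, FakeWikipedia, FailingProvider)}


def create_providers(names):
    """Factory: instantiate providers by their configured name."""
    return [REGISTRY[n]() for n in names]


async def enrich(query, order):
    providers = create_providers(order)
    results = await asyncio.gather(
        *(p.fetch(query) for p in providers), return_exceptions=True
    )
    merged = {}
    # Apply lowest priority first so higher-priority providers overwrite it
    for provider, result in reversed(list(zip(providers, results))):
        if isinstance(result, Exception):
            continue  # a failed provider contributes nothing
        merged.update(result)
    return merged


result = asyncio.run(enrich({"title": "Yesterday"}, ["musicbrainz", "wikipedia", "discogs"]))
print(result)
```

In this sketch the conflicting `year` field is resolved in favour of the provider that appears first in the order list, while non-conflicting fields from lower-priority providers are kept.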
### Event-Driven Enrichment
**Applicability**: Scalable approach for metadata aggregator.
**Architecture**:
- Scanner publishes events to queue
- Matcher consumes events asynchronously
- Server receives enriched metadata via API
- Decouples scanning from enrichment
**Adoption**:
- Use message queue (RabbitMQ, Redis Streams)
- Separate scanner and matcher services
- Enable retries without re-scanning
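
The decoupling can be illustrated in a single process with `asyncio.Queue` standing in for RabbitMQ. This is a simplified sketch: the event shape, the `attempt` counter, and the `flaky_enrich` helper are assumptions made for the example, and a real deployment would use a broker so that scanner and matcher can run as separate services.

```python
import asyncio


async def scanner(queue, file_ids):
    """Publish one event per scanned file, then return without waiting."""
    for file_id in file_ids:
        await queue.put({"type": "file.added", "file_id": file_id, "attempt": 1})


async def matcher(queue, enrich, enriched, max_attempts=3):
    """Consume events; a failed enrichment is re-queued, not re-scanned."""
    while True:
        event = await queue.get()
        try:
            enriched[event["file_id"]] = await enrich(event["file_id"])
        except Exception:
            if event["attempt"] < max_attempts:
                await queue.put({**event, "attempt": event["attempt"] + 1})
        finally:
            queue.task_done()


async def main():
    queue = asyncio.Queue()
    enriched = {}
    calls = {}

    async def flaky_enrich(file_id):
        # Hypothetical provider call that fails on its first attempt for file 2
        calls[file_id] = calls.get(file_id, 0) + 1
        if file_id == 2 and calls[file_id] == 1:
            raise ConnectionError("provider timeout")
        return {"file_id": file_id, "matched": True}

    workers = [asyncio.create_task(matcher(queue, flaky_enrich, enriched)) for _ in range(2)]
    await scanner(queue, [1, 2, 3])
    await queue.join()  # wait until every event, including retries, is handled
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return enriched, calls


enriched, calls = asyncio.run(main())
print(sorted(enriched), calls[2])
```

The scanner finishes as soon as its events are queued; the retry for file 2 happens entirely on the matcher side.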
### Search Integration
**Applicability**: Fast search is critical for metadata aggregator.
**Architecture**:
- MeiliSearch for full-text search
- Index on entity creation/update
- Typo tolerance and faceted search
- Sub-100ms response times
**Adoption**:
- Integrate MeiliSearch or Typesense
- Index artists, albums, songs
- Implement as-you-type search
## Relevance to Metadata Aggregator
### High Relevance
**Data Model**:
- Album/Release and Song/Track distinctions are essential for accurate metadata
- Song groups enable tracking versions and covers
- External metadata separation keeps provider data clean
**Provider Architecture**:
- Factory pattern simplifies adding new providers
- Parallel queries optimize performance
- Rate limiting prevents API bans
- Priority-based aggregation ensures quality
**Event-Driven Design**:
- Decouples metadata fetching from file scanning
- Enables retries without re-processing
- Scales horizontally (multiple matchers)
### Medium Relevance
**Search Integration**:
- Fast search improves user experience
- Typo tolerance handles misspellings
- Faceted search enables filtering
**Scrobbling**:
- OAuth flows are reusable patterns
- Token management is standard practice
**Mobile App**:
- Code sharing between web and mobile reduces duplication
- Monorepo structure simplifies version coordination
### Low Relevance
**Video Support**:
- Metadata aggregator may not handle videos
- Transcoding is out of scope
**Geographic Context**:
- Areas are nice-to-have, not essential
- ISO 3166 codes are useful for standardization
**Deployment Complexity**:
- Metadata aggregator may use simpler deployment (single service)
- Docker Compose is overkill for smaller projects
## Comparison with Alternatives
### vs Navidrome
**Meelo Advantages**:
- Richer data model (Album/Release, Song/Track)
- Multi-provider metadata (8 vs 1)
- Music video support
- Built-in scrobbling
- Search performance (MeiliSearch vs SQL)
**Navidrome Advantages**:
- Simpler deployment (single binary)
- Lower resource requirements (512MB vs 4GB)
- Faster startup (no dependencies)
- More mature (older project)
**Verdict**: Meelo for metadata richness, Navidrome for simplicity.
### vs Jellyfin
**Meelo Advantages**:
- Music-focused (not general media server)
- Better music metadata (Album/Release, Song/Track)
- Multi-provider enrichment
- Faster search (MeiliSearch)
**Jellyfin Advantages**:
- Handles all media types (movies, TV, music)
- Larger community
- More mature
- Better transcoding (built-in)
**Verdict**: Meelo for music collectors, Jellyfin for general media.
### vs Airsonic
**Meelo Advantages**:
- Modern stack (NestJS, Next.js vs Java)
- Active development (40 releases vs stagnant)
- Better metadata (multi-provider)
- Search performance
**Airsonic Advantages**:
- Simpler deployment (single JAR)
- Subsonic API compatibility
- Larger ecosystem (mobile apps)
**Verdict**: Meelo for modern features, Airsonic for stability.
### vs Funkwhale
**Meelo Advantages**:
- Better metadata model
- Multi-provider enrichment
- Faster search
**Funkwhale Advantages**:
- Federated (share music across instances)
- Social features (follows, favorites)
- Podcast support
**Verdict**: Meelo for personal use, Funkwhale for communities.
## Recommendations for Metadata Aggregator
### Adopt
1. **Data Model**:
- Implement Album/Release distinction
- Implement Song/Track distinction
- Implement song groups for versions
- Separate ExternalMetadata table
2. **Provider Pattern**:
- Base provider interface
- Per-provider modules
- Factory pattern
- Parallel queries with asyncio
- Rate limiting per provider
- Priority-based aggregation
3. **Event-Driven Architecture**:
- Message queue for decoupling
- Separate scanner and matcher services
- Retry logic without re-scanning
### Adapt
1. **Search Integration**:
- Use MeiliSearch or Typesense
- Index on entity creation/update
- Implement typo tolerance
2. **Scrobbling**:
- OAuth flows for Last.fm
- Token-based auth for ListenBrainz
3. **Code Quality**:
- Linting (Biome, Ruff)
- Type checking (TypeScript, Pyright)
- Testing (Jest, pytest)
- SonarCloud quality gates
### Avoid
1. **Complex Deployment**:
- Prefer single service or fewer containers
- Avoid heavy infrastructure (PostgreSQL, RabbitMQ) if possible
- Use SQLite for smaller deployments
2. **Multi-Language Stack**:
- Stick to one or two languages
- Avoid mixing TypeScript, Go, Python unless necessary
3. **Kyoo Dependency**:
- If video support needed, use built-in transcoder (FFmpeg)
- Avoid external dependencies for core features
## Summary
Meelo excels at data modeling, multi-provider metadata enrichment, and music video support. Its Album/Release and Song/Track distinctions model real-world music organization more closely than the other self-hosted servers compared here. The provider pattern, with parallel queries and priority-based aggregation, is directly applicable to metadata aggregators, and the event-driven architecture decouples scanning from enrichment and scales horizontally. However, deployment complexity (nine containers), a three-language stack (TypeScript, Go, Python), and heavy infrastructure (PostgreSQL, MeiliSearch, RabbitMQ) limit accessibility. The GPL-3.0 license requires distributed derivative works to remain GPL-3.0, which rules out proprietary forks. For a metadata aggregator, adopt the data model and provider architecture, adapt the search integration and scrobbling patterns, and avoid the deployment complexity and multi-language stack. Meelo is an excellent reference for sophisticated metadata handling in a self-hosted context.
@@ -0,0 +1,814 @@
# Meelo Integrations
## Integration Overview
Meelo integrates with 8 metadata providers and 2 scrobbling services. The Matcher service handles provider queries, while the Server handles scrobbling. All integrations are configurable via settings.json and .env.
## Metadata Providers
### MusicBrainz
**Type**: Primary music database
**Library**: musicbrainzngs (Python)
**Authentication**: None (public API)
**Rate Limit**: 1 request/second
**Priority**: Highest (primary source)
#### Capabilities
- Artist metadata (name, sort name, areas, relationships)
- Album metadata (title, type, release date, labels)
- Track metadata (title, duration, ISRC)
- Recording relationships (covers, remixes, versions)
- Release groups and releases
- Area data (countries, cities with ISO 3166 codes)
#### Matching Strategy
1. Resolve the AcoustID fingerprint to a MusicBrainz recording ID through the AcoustID service (most accurate)
2. If no fingerprint, search by artist + album + track title
3. Extract MBID (MusicBrainz ID) for future queries
4. Store MBID in LocalIdentifiers table
#### Data Extraction
**Artist**:
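```python
import musicbrainzngs as mb

mb.set_useragent("example-app", "0.1")  # required before any request; name is a placeholder

def extract_artist(mbid):
    # 'areas' is not a valid include; the artist's main area is returned by default
    artist = mb.get_artist_by_id(mbid, includes=['aliases'])['artist']
    return {
        'name': artist['name'],
        'sortName': artist['sort-name'],
        'areas': [artist['area']['name']] if 'area' in artist else []
    }
```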
**Album**:
```python
import musicbrainzngs as mb

def extract_album(mbid):
    # 'labels' is not a valid release-group include; labels belong to individual releases
    group = mb.get_release_group_by_id(mbid, includes=['releases'])['release-group']
    return {
        'name': group['title'],
        'type': group.get('type'),
        'releaseDate': group.get('first-release-date'),
        'releases': [...]
    }
```
**Track**:
```python
import musicbrainzngs as mb

def extract_track(mbid):
    recording = mb.get_recording_by_id(mbid, includes=['isrcs', 'releases'])['recording']
    return {
        'title': recording['title'],
        'duration': recording.get('length'),  # milliseconds; may be missing
        'isrc': recording.get('isrc-list', [None])[0]
    }
```
#### Rate Limiting
The musicbrainzngs library enforces 1 request/second by default, so no additional limiter is needed.
#### Error Handling
- **404 Not Found**: No match, skip provider
- **503 Service Unavailable**: Retry with exponential backoff (max 3 attempts)
- **Rate Limit Exceeded**: Wait and retry
### Genius
**Type**: Lyrics and song descriptions
**Library**: lyricsgenius (Python)
**Authentication**: API token (GENIUS_ACCESS_TOKEN)
**Rate Limit**: 10 requests/second
**Priority**: High (for lyrics)
#### Capabilities
- Song lyrics (plain text)
- Song descriptions and annotations
- Artist biographies
- Album descriptions
#### Matching Strategy
1. Search by artist + song title
2. Extract song ID from search results
3. Fetch full song data including lyrics
4. Store lyrics in Lyrics table
#### Data Extraction
**Lyrics**:
```python
import lyricsgenius

genius = lyricsgenius.Genius(token)
song = genius.search_song(title, artist)  # returns None when nothing matches
lyrics = None if song is None else {
    'plain': song.lyrics,
    'description': song.description
}
```
**Artist Bio**:
```python
artist = genius.search_artist(name)
{
'description': artist.description
}
```
#### Rate Limiting
Implemented using aiolimiter:
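```python
from aiolimiter import AsyncLimiter

limiter = AsyncLimiter(10, 1)  # 10 requests per second

async def limited_fetch_genius(*args):
    async with limiter:  # waits until the limiter has capacity
        return await fetch_genius(*args)
```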
#### Error Handling
- **404 Not Found**: No lyrics available, skip
- **401 Unauthorized**: Invalid token, log error
- **Rate Limit**: Wait and retry
### Wikipedia
**Type**: Artist and album context
**Library**: wikipedia (Python)
**Authentication**: None
**Rate Limit**: 5 requests/second (self-imposed)
**Priority**: Medium (for descriptions)
#### Capabilities
- Artist biographies
- Album background and reception
- Contextual information (formation, breakup, influences)
#### Matching Strategy
1. Search Wikipedia by artist/album name
2. Extract first paragraph as description
3. Store full URL as source
#### Data Extraction
**Artist Bio**:
```python
import wikipedia
page = wikipedia.page(artist_name)
{
'description': page.summary,
'url': page.url
}
```
**Album Context**:
```python
page = wikipedia.page(f"{album_name} ({artist_name} album)")
{
'description': page.summary,
'url': page.url
}
```
#### Disambiguation
Wikipedia often returns disambiguation pages. Handle by:
1. Detect disambiguation page (check for "may refer to")
2. Search for most likely option (e.g., add "band" or "musician")
3. If still ambiguous, skip
#### Rate Limiting
```python
limiter = AsyncLimiter(5, 1) # 5 requests per second
```
#### Error Handling
- **PageError**: No Wikipedia page, skip
- **DisambiguationError**: Try disambiguation, or skip
- **HTTPError**: Retry with backoff
### Wikidata
**Type**: Structured data
**Library**: SPARQLWrapper (Python)
**Authentication**: None
**Rate Limit**: None configured in Matcher (the public SPARQL endpoint applies its own throttling)
**Priority**: Medium (for structured data)
#### Capabilities
- Artist relationships (members, collaborators)
- Area data (countries, cities, ISO codes)
- Dates (birth, death, formation, dissolution)
- External IDs (MusicBrainz, Discogs, AllMusic)
#### Matching Strategy
1. Query by MusicBrainz ID (if available)
2. Extract Wikidata entity ID
3. Query for additional properties
4. Store structured data
#### Data Extraction
**Artist Data**:
```sparql
SELECT ?property ?value WHERE {
?artist wdt:P434 "MBID" . # MusicBrainz artist ID
?artist ?property ?value .
}
```
**Area Hierarchy**:
```sparql
SELECT ?area ?parent ?iso WHERE {
?area wdt:P31 wd:Q515 . # instance of city
?area wdt:P131 ?parent . # located in
?area wdt:P300 ?iso . # ISO 3166 code
}
```
#### Rate Limiting
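No client-side rate limit is configured. The public SPARQL endpoint is fast, but it enforces its own query timeouts and throttles heavy clients, so slow or rejected queries still need the timeout handling described below.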
#### Error Handling
- **No Results**: Entity not in Wikidata, skip
- **Timeout**: Retry with simpler query
- **SPARQL Error**: Log and skip
### Discogs
**Type**: Release information
**Library**: discogs_client (Python)
**Authentication**: API token (DISCOGS_ACCESS_TOKEN)
**Rate Limit**: 60 requests/minute
**Priority**: Low (optional)
#### Capabilities
- Release details (catalog number, barcode, format)
- Label information
- Release variations (country, format)
- Marketplace data (not used)
#### Matching Strategy
1. Search by artist + album title
2. Filter by format (CD, Vinyl, etc.)
3. Extract release details
4. Store in Release.extensions JSON
#### Data Extraction
**Release**:
```python
import discogs_client
d = discogs_client.Client('Meelo/1.0', user_token=token)
results = d.search(artist=artist, release_title=album, type='release')
release = results[0]
{
'catalogNumber': release.data['catno'],
'barcode': release.data.get('barcode'),
'format': release.formats[0]['name'],
'country': release.country,
'label': release.labels[0].name
}
```
#### Rate Limiting
```python
limiter = AsyncLimiter(60, 60) # 60 requests per minute
```
#### Error Handling
- **404 Not Found**: No Discogs entry, skip
- **401 Unauthorized**: Invalid token, log error
- **Rate Limit**: Wait 60 seconds and retry
### AllMusic
**Type**: Editorial reviews and ratings
**Library**: BeautifulSoup (web scraping)
**Authentication**: None
**Rate Limit**: 1 request/second (self-imposed, no official API)
**Priority**: Low (optional)
#### Capabilities
- Album reviews
- Album ratings (1-5 stars)
- Artist biographies
- Genre classifications
#### Matching Strategy
1. Search AllMusic by artist + album
2. Scrape search results page
3. Extract review and rating
4. Store rating normalized to 0-100 scale
#### Data Extraction
**Album Review**:
```python
from bs4 import BeautifulSoup
import httpx
url = f"https://www.allmusic.com/search/albums/{artist}+{album}"
response = httpx.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
rating_elem = soup.select_one('.allmusic-rating')
rating = len(rating_elem.select('.star-rating.full')) # Count full stars
review_elem = soup.select_one('.review-text')
review = review_elem.text.strip()
{
'rating': rating * 20, # Convert 1-5 stars to 20-100 on the 0-100 scale
'description': review
}
```
#### Rate Limiting
```python
limiter = AsyncLimiter(1, 1) # 1 request per second
```
#### Error Handling
- **404 Not Found**: No AllMusic page, skip
- **Parsing Error**: HTML structure changed, log and skip
- **Timeout**: Retry with backoff
#### Scraping Risks
AllMusic has no official API. Scraping may break if HTML structure changes. Disabled by default in settings.json.
### Metacritic
**Type**: Aggregated critic scores
**Library**: BeautifulSoup (web scraping)
**Authentication**: None
**Rate Limit**: 1 request/second (self-imposed)
**Priority**: Low (optional)
#### Capabilities
- Album critic scores (0-100)
- User scores (not used)
- Critic reviews (not extracted)
#### Matching Strategy
1. Search Metacritic by artist + album
2. Scrape album page
3. Extract Metascore
4. Store as rating (already 0-100 scale)
#### Data Extraction
**Album Score**:
```python
url = f"https://www.metacritic.com/music/{album_slug}/{artist_slug}"
response = httpx.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
score_elem = soup.select_one('.metascore_w')
score = int(score_elem.text.strip())
{
'rating': score
}
```
#### Rate Limiting
```python
limiter = AsyncLimiter(1, 1) # 1 request per second
```
#### Error Handling
- **404 Not Found**: Album not on Metacritic, skip
- **Parsing Error**: HTML structure changed, log and skip
- **Timeout**: Retry with backoff
#### Scraping Risks
Same as AllMusic. Disabled by default.
### LrcLib
**Type**: Synced lyrics
**Library**: httpx (direct API calls)
**Authentication**: None
**Rate Limit**: 10 requests/second (self-imposed)
**Priority**: High (for synced lyrics)
#### Capabilities
- Synced lyrics in .lrc format
- Plain lyrics (fallback)
- Lyrics by duration matching (improves accuracy)
#### Matching Strategy
1. Search by artist + title + duration
2. Parse .lrc format to JSON
3. Store in Lyrics.synced field
#### Data Extraction
**Synced Lyrics**:
```python
import re
import httpx

url = "https://lrclib.net/api/get"
params = {
    'artist_name': artist,
    'track_name': title,
    'duration': duration
}
response = httpx.get(url, params=params)
data = response.json()
lrc_text = data.get('syncedLyrics') or ''  # empty when only plain lyrics exist

# Parse .lrc lines such as "[01:23.45] text" into millisecond offsets
lines = []
for line in lrc_text.split('\n'):
    match = re.match(r'\[(\d+):(\d+\.\d+)\](.*)', line)
    if match:
        minutes, seconds, text = match.groups()
        time_ms = (int(minutes) * 60 + float(seconds)) * 1000
        lines.append({'time': round(time_ms), 'text': text.strip()})

lyrics = {
    'synced': lines,
    'plain': data.get('plainLyrics')
}
```
#### Rate Limiting
```python
limiter = AsyncLimiter(10, 1) # 10 requests per second
```
#### Error Handling
- **404 Not Found**: No synced lyrics, try plain lyrics
- **Parsing Error**: Invalid .lrc format, skip
- **Timeout**: Retry with backoff
## Scrobbling Services
### Last.fm
**Type**: Scrobbling service
**Library**: Not identified (pylast is a Python package, so the NestJS Server cannot use it directly)
**Authentication**: OAuth (LASTFM_API_KEY, LASTFM_API_SECRET)
**Rate Limit**: None specified
**Integration**: Server (NestJS)
#### Capabilities
- Scrobble track plays
- Update "now playing" status
- Retrieve user listening history (not implemented)
#### OAuth Flow
1. User clicks "Connect Last.fm" in settings
2. Server redirects to Last.fm OAuth page
3. User authorizes Meelo
4. Last.fm redirects to callback with token
5. Server exchanges token for session key
6. Session key stored in UserScrobbler.data JSON
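
Step 5 relies on Last.fm's request signing: the call that exchanges the token for a session key (`auth.getSession`) carries an `api_sig` parameter. The sketch below shows that signature as Last.fm documents it (parameters sorted by name, concatenated as name followed by value, the shared secret appended, then MD5). It is written in Python for consistency with the Matcher examples, whereas the Server itself is TypeScript; the key, token, and secret values are placeholders.

```python
import hashlib


def lastfm_signature(params: dict, api_secret: str) -> str:
    """Build api_sig: sorted name+value pairs, then the secret, MD5-hashed."""
    payload = "".join(f"{name}{params[name]}" for name in sorted(params))
    return hashlib.md5((payload + api_secret).encode("utf-8")).hexdigest()


params = {
    "method": "auth.getSession",
    "api_key": "example_key",   # placeholder for LASTFM_API_KEY
    "token": "example_token",   # token received on the callback
}
# The signature is computed over the parameters before api_sig is added
params["api_sig"] = lastfm_signature(params, "example_secret")
print(params["api_sig"])
```

The secret never leaves the server; only the resulting hash is sent with the request.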
#### Scrobbling
**Now Playing**:
```typescript
await lastfm.updateNowPlaying({
  artist: track.song.artist.name,
  track: track.song.name,
  album: track.release.album.name,
  duration: track.duration
});
```
**Scrobble**:
```typescript
await lastfm.scrobble({
  artist: track.song.artist.name,
  track: track.song.name,
  album: track.release.album.name,
  // Last.fm defines this field as the time the track started playing
  timestamp: Math.floor(Date.now() / 1000)
});
```
#### Scrobble Rules
- Track must play for at least 30 seconds or 50% of duration (whichever is shorter)
- Scrobble sent when track ends or user skips past 50%
- "Now playing" sent immediately on play
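
The threshold rule above reduces to a single comparison. The function below is a sketch of the rule exactly as this document states it; the function name is hypothetical. Last.fm's own published guidance is different (tracks longer than 30 seconds, played for half their length or four minutes, whichever comes first), so the rule should be checked against the source before reuse.

```python
def should_scrobble(played_seconds: float, duration_seconds: float) -> bool:
    """Rule as stated above: 30 seconds or half the track, whichever is shorter."""
    threshold = min(30.0, duration_seconds * 0.5)
    return played_seconds >= threshold


print(should_scrobble(20, 240))  # False: below the 30-second threshold
print(should_scrobble(31, 240))  # True: past 30 seconds
print(should_scrobble(12, 20))   # True: past half of a 20-second track
```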
#### Error Handling
- **Invalid Session**: Re-authenticate user
- **Network Error**: Queue scrobble for retry
- **Rate Limit**: Wait and retry
### ListenBrainz
**Type**: Open-source scrobbling service
**Library**: Not identified (pylistenbrainz is a Python package, so the NestJS Server cannot use it directly)
**Authentication**: User token
**Rate Limit**: None specified
**Integration**: Server (NestJS)
#### Capabilities
- Submit listens (scrobbles)
- Retrieve listening history (not implemented)
- Statistics and recommendations (not implemented)
#### Authentication
1. User obtains token from ListenBrainz settings
2. User enters token in Meelo settings
3. Token stored in UserScrobbler.data JSON
4. No OAuth flow needed
#### Submitting Listens
**Single Listen**:
```typescript
await listenbrainz.submitListen({
  listened_at: Math.floor(Date.now() / 1000),
  track_metadata: {
    artist_name: track.song.artist.name,
    track_name: track.song.name,
    release_name: track.release.album.name,
    additional_info: {
      duration_ms: track.duration * 1000,
      tracknumber: track.trackIndex
    }
  }
});
```
#### Listen Types
- **Single**: Submit one listen (used for scrobbling)
- **Playing Now**: Update current track (not implemented)
- **Import**: Bulk import (not used)
#### Error Handling
- **Invalid Token**: Notify user to re-enter token
- **Network Error**: Queue listen for retry
- **Rate Limit**: Wait and retry
## Provider Configuration
### settings.json
```json
{
  "providers": {
    "musicbrainz": { "enabled": true },
    "genius": { "enabled": true },
    "wikipedia": { "enabled": true },
    "wikidata": { "enabled": true },
    "discogs": { "enabled": false },
    "allmusic": { "enabled": false },
    "metacritic": { "enabled": false },
    "lrclib": { "enabled": true }
  },
  "metadata": {
    "source": "providers",
    "order": ["musicbrainz", "genius", "wikipedia", "lrclib", "wikidata"]
  }
}
```
**Fields**:
- `providers.<name>.enabled`: Enable/disable provider
- `metadata.source`: Prefer "embedded" tags or "providers"
- `metadata.order`: Provider priority for conflicting data
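
A consumer of this file would typically combine the two sections into one ordered list of active providers. The sketch below uses a trimmed copy of the settings; `active_providers` is a hypothetical helper, and placing enabled providers that are missing from `metadata.order` at the end is an assumption, not documented Meelo behaviour.

```python
import json

SETTINGS = json.loads("""
{
  "providers": {
    "musicbrainz": {"enabled": true},
    "genius": {"enabled": true},
    "wikidata": {"enabled": true},
    "discogs": {"enabled": false},
    "lrclib": {"enabled": true}
  },
  "metadata": {"source": "providers", "order": ["musicbrainz", "genius", "lrclib", "discogs"]}
}
""")


def active_providers(settings: dict) -> list:
    """Enabled providers: ranked ones first in priority order, the rest after."""
    enabled = [n for n, cfg in settings["providers"].items() if cfg.get("enabled")]
    order = settings["metadata"]["order"]
    ranked = [n for n in order if n in enabled]
    unranked = [n for n in enabled if n not in order]
    return ranked + unranked


print(active_providers(SETTINGS))
```

Here `discogs` is dropped even though it is ranked, because it is disabled, and `wikidata` is appended after the ranked providers.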
### .env
```bash
# Genius
GENIUS_ACCESS_TOKEN=your_genius_token
# Discogs
DISCOGS_ACCESS_TOKEN=your_discogs_token
# Last.fm
LASTFM_API_KEY=your_lastfm_key
LASTFM_API_SECRET=your_lastfm_secret
# Public URL for OAuth callbacks
PUBLIC_URL=https://meelo.example.com
```
## Provider Priority
When multiple providers return conflicting data, Matcher uses priority from `metadata.order`:
1. **MusicBrainz**: Highest priority (most accurate)
2. **Genius**: High priority for lyrics
3. **Wikipedia**: Medium priority for descriptions
4. **LrcLib**: High priority for synced lyrics
5. **Wikidata**: Medium priority for structured data
6. **Discogs**: Low priority (optional)
7. **AllMusic**: Low priority (optional)
8. **Metacritic**: Low priority (optional)
## Data Aggregation
### Descriptions
Concatenate descriptions from multiple providers:
```
MusicBrainz: "The Beatles were an English rock band..."
Wikipedia: "Formed in Liverpool in 1960..."
Genius: "Known for their innovative songwriting..."
Result: "The Beatles were an English rock band... Formed in Liverpool in 1960... Known for their innovative songwriting..."
```
### Ratings
Average ratings from multiple providers:
```
AllMusic: 90/100
Metacritic: 85/100
Result: (90 + 85) / 2 = 87.5 → 88/100
```
### Lyrics
Prefer synced lyrics over plain:
```
LrcLib: Synced lyrics available → Use synced
Genius: Plain lyrics available → Use as fallback
```
If both available, store both in Lyrics table.
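
The three aggregation rules can be combined in one function. This is a sketch under assumptions: the per-provider result shape (`description`, `rating`, `synced`, `plain` keys) and the `aggregate` name are invented for the example, and ratings are rounded half up so that 87.5 becomes 88 as in the example above.

```python
def aggregate(results: dict, order: list) -> dict:
    """Join descriptions, average ratings, and prefer synced lyrics."""
    ranked = [results[name] for name in order if name in results]
    descriptions = [r["description"] for r in ranked if r.get("description")]
    ratings = [r["rating"] for r in ranked if r.get("rating") is not None]
    synced = next((r["synced"] for r in ranked if r.get("synced")), None)
    plain = next((r["plain"] for r in ranked if r.get("plain")), None)
    return {
        "description": " ".join(descriptions) or None,
        # int(x + 0.5) rounds halves up; Python's round() would round 86.5 down
        "rating": int(sum(ratings) / len(ratings) + 0.5) if ratings else None,
        "lyrics": {"synced": synced, "plain": plain},
    }


merged = aggregate(
    {
        "allmusic": {"rating": 90},
        "metacritic": {"rating": 85},
        "genius": {"plain": "La la la", "description": "A song."},
        "lrclib": {"synced": [{"time": 0, "text": "La la la"}]},
    },
    order=["genius", "lrclib", "allmusic", "metacritic"],
)
print(merged["rating"])  # 88
```

Both lyric forms are kept side by side, so a client can fall back to plain text when synced lyrics are missing.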
## Matching Workflow
1. **Scanner** registers file with Server
2. **Scanner** publishes `file.added` event to RabbitMQ
3. **Matcher** consumes event
4. **Matcher** fetches file metadata from Server
5. **Matcher** queries enabled providers in parallel:
   - MusicBrainz by AcoustID fingerprint
   - Genius by artist + title
   - Wikipedia by artist name
   - LrcLib by artist + title + duration
   - Wikidata by MusicBrainz ID (if found; this lookup has to wait for the MusicBrainz result)
   - Discogs by artist + album (if enabled)
   - AllMusic by artist + album (if enabled)
   - Metacritic by artist + album (if enabled)
6. **Matcher** aggregates results based on priority
7. **Matcher** pushes enriched metadata to Server
8. **Server** updates database and search index
## Error Recovery
### Provider Failures
If a provider fails:
1. Log error with provider name and reason
2. Continue with other providers
3. Push partial metadata to Server
4. Mark track as "partially matched"
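
The steps above amount to separating successful results from exceptions and deriving a status. The following sketch assumes results arrive as a mapping from provider name to either a result dict or an exception object; `summarize_match` and the "matched" and "unmatched" statuses are hypothetical, only "partially matched" comes from the text.

```python
def summarize_match(provider_results: dict) -> dict:
    """Split results into data and errors, then derive a match status."""
    data = {n: r for n, r in provider_results.items() if not isinstance(r, Exception)}
    errors = {n: repr(r) for n, r in provider_results.items() if isinstance(r, Exception)}
    for name, reason in errors.items():
        print(f"provider {name} failed: {reason}")  # stand-in for real logging
    if not errors:
        status = "matched"
    elif data:
        status = "partially matched"
    else:
        status = "unmatched"
    return {"status": status, "metadata": data, "errors": errors}


summary = summarize_match({
    "musicbrainz": {"title": "Yesterday"},
    "genius": TimeoutError("timed out"),
})
print(summary["status"])
```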
### Retry Logic
For transient errors (network, rate limit):
1. Retry with exponential backoff
2. Max 3 attempts per provider
3. If all attempts fail, skip provider
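
A retry helper following these three rules could look like the sketch below. `TransientError` and `with_retries` are hypothetical names, the base delay is an arbitrary choice, and the small random jitter is an addition that the text does not mention.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical marker for retryable failures (network errors, rate limits)."""


async def with_retries(call, max_attempts=3, base_delay=0.5):
    """Run call(); on TransientError wait base_delay * 2**n plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except TransientError:
            if attempt == max_attempts - 1:
                return None  # every attempt failed: skip this provider
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay / 10)
            await asyncio.sleep(delay)


attempts = {"count": 0}


async def flaky():
    # Fails twice, then succeeds on the third attempt
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("503 Service Unavailable")
    return "ok"


outcome = asyncio.run(with_retries(flaky, base_delay=0.01))
print(outcome, attempts["count"])
```

Non-transient errors are deliberately not caught, so they propagate to the caller and can be logged as provider failures.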
### Manual Refresh
Users can trigger a metadata refresh through the Scanner API:
```bash
curl -X POST https://meelo.example.com/scanner/refresh
```
This re-queries all providers for existing tracks.
## Performance Optimization
### Parallel Queries
Matcher queries all providers in parallel using asyncio:
```python
import asyncio

async def enrich_metadata(file_id):
    tasks = [
        fetch_musicbrainz(file_id),
        fetch_genius(file_id),
        fetch_wikipedia(file_id),
        fetch_lrclib(file_id),
        fetch_wikidata(file_id)
    ]
    # Failures come back as exception objects instead of being raised
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return aggregate_results(results)
```
### Caching
Provider responses are cached in memory for 1 hour:
- Reduces duplicate queries during batch scans
- Invalidated on manual refresh
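
A one-hour in-memory cache with manual invalidation can be as small as the sketch below. `TTLCache` is a hypothetical class, not the Matcher's implementation, and the injectable `clock` exists only to make expiry easy to demonstrate without waiting.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl=3600.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[key]  # entry has expired
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock())

    def clear(self):
        """Drop everything, for example when a manual refresh is requested."""
        self._entries.clear()


now = [0.0]
cache = TTLCache(ttl=3600, clock=lambda: now[0])
cache.set(("musicbrainz", "some-mbid"), {"name": "Example"})
hit = cache.get(("musicbrainz", "some-mbid"))
now[0] = 3601.0  # simulate one hour and one second passing
miss = cache.get(("musicbrainz", "some-mbid"))
print(hit, miss)
```

Keying entries by provider and identifier lets a batch scan of one album reuse the artist lookup for every track.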
### Rate Limit Coordination
Rate limiters shared across all workers:
- Prevents exceeding provider limits
- Uses token bucket algorithm
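
For reference, a token bucket refills at a fixed rate up to a maximum capacity, and each request spends one token. The sketch below is an in-process illustration only: `TokenBucket` is a hypothetical class, and sharing limits across separate worker processes would additionally need shared state (for example in a database or cache), which is not shown.

```python
import time


class TokenBucket:
    """Allow bursts up to capacity while refilling at rate tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the elapsed time, never exceeding the capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


now = [0.0]
bucket = TokenBucket(rate=1, capacity=1, clock=lambda: now[0])  # 1 request/second
first = bucket.try_acquire()
second = bucket.try_acquire()   # no token left yet
now[0] = 1.0                    # one second later a token has been refilled
third = bucket.try_acquire()
print(first, second, third)
```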
## Privacy Considerations
### Data Sent to Providers
- **MusicBrainz**: AcoustID fingerprint, artist/album/track names
- **Genius**: Artist and track names
- **Wikipedia**: Artist and album names
- **Wikidata**: MusicBrainz IDs
- **Discogs**: Artist and album names
- **AllMusic**: Artist and album names
- **Metacritic**: Artist and album names
- **LrcLib**: Artist, track name, duration
No file paths or user data sent.
### Scrobbling Privacy
- **Last.fm**: Track plays sent with timestamp
- **ListenBrainz**: Track plays sent with timestamp
Users control scrobbling via settings. Disabled by default.
## Future Enhancements
### Additional Providers
Potential providers to add:
- **Spotify**: Metadata and popularity scores
- **Apple Music**: Editorial content
- **Bandcamp**: Independent artist data
- **RateYourMusic**: User ratings and reviews
### Provider Plugins
Allow users to add custom providers via plugin system.
### Offline Mode
Cache provider responses for offline access.
### Provider Statistics
Track provider accuracy and response times. Display in admin panel.
## Summary
Meelo's integration architecture separates concerns: Matcher handles provider queries, Server handles scrobbling. The provider pattern enables easy addition of new sources. Parallel queries and rate limiting optimize performance. Priority-based aggregation ensures data quality. OAuth flows and token management handle authentication. The system is flexible (enable/disable providers), resilient (retry logic, partial results), and privacy-conscious (no file paths sent).
@@ -0,0 +1,374 @@
# Meelo Overview
## Project Identity
**Repository**: https://github.com/Arthi-chaud/Meelo
**License**: GPL-3.0
**Stars**: 1,095
**Releases**: 40 (latest: v3.10.1)
**Primary Languages**: TypeScript, Go, Python
**Architecture**: Microservices monorepo
## Purpose
Meelo is a self-hosted music server designed for music collectors who need flexible metadata management. Unlike typical music servers that treat metadata as static, Meelo provides sophisticated versioning and relationship tracking. The system supports music videos as first-class citizens, not afterthoughts, and includes built-in scrobbling to Last.fm and ListenBrainz.
The project targets users with well-organized collections who want control over their metadata without sacrificing modern features like full-text search, mobile access, and streaming.
## Core Services
### Server (NestJS 11, TypeScript)
- **Port**: 4000
- **Role**: Central API and business logic
- **Stack**: NestJS framework, Prisma ORM, PostgreSQL
- **Responsibilities**: Authentication, data persistence, search coordination, streaming, scrobbling, event publishing
### Scanner (Go 1.25, Echo v5)
- **Port**: 8133
- **Role**: Filesystem monitoring and metadata extraction
- **Stack**: Echo HTTP framework, FFmpeg/FFprobe bindings
- **Responsibilities**: File watching, metadata parsing, AcoustID fingerprinting, filename regex parsing, file registration, match triggering
### Matcher (Python 3.14, FastAPI)
- **Port**: 6789
- **Role**: External metadata enrichment
- **Stack**: FastAPI, async HTTP clients
- **Responsibilities**: Consuming match events, querying 8 external providers, pushing enriched metadata to Server
### Front (Next.js 16, React)
- **Port**: 3000
- **Role**: User interface
- **Stack**: Next.js SSR, Material-UI, Jotai state management, TanStack Query
- **Variants**: Web (Next.js) and mobile (Expo/React Native)
## Infrastructure Dependencies
### PostgreSQL
Primary data store. Handles all persistent data through Prisma ORM. Stores users, artists, albums, songs, tracks, releases, files, playlists, external metadata, and relationships.
### MeiliSearch (v1.5)
Full-text search engine. Indexes artists, albums, songs, and videos for fast, typo-tolerant search. Provides instant results as users type.
### RabbitMQ (4.2-alpine)
Message queue for event-driven architecture. Decouples Scanner and Matcher from Server. Enables asynchronous metadata enrichment without blocking file scanning.
### Kyoo Transcoder
Video transcoding service. Handles music video streaming with adaptive bitrate. Converts source files to web-compatible formats on demand.
### Nginx (1.29.7-alpine)
Reverse proxy. Routes requests to appropriate services:
- `/` → Front
- `/api/` → Server
- `/scanner/` → Scanner
- `/matcher/` → Matcher
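As a rough illustration of the prefix routing above, the following Python sketch resolves a request path to an upstream by longest matching prefix. The upstream host names and the `resolve_upstream` helper are hypothetical; only the path prefixes and ports come from this document, and real Nginx `location` matching has more rules than shown here.

```python
# Hypothetical routing table: prefixes from the list above, ports from the
# service descriptions. Host names are illustrative assumptions.
ROUTES = {
    "/": "http://front:3000",
    "/api/": "http://server:4000",
    "/scanner/": "http://scanner:8133",
    "/matcher/": "http://matcher:6789",
}


def resolve_upstream(path: str) -> str:
    """Return the upstream whose prefix is the longest match for path."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    # "/" matches every absolute path, so matches is non-empty for valid input.
    best = max(matches, key=len)
    return ROUTES[best]
```

Longest-prefix selection is what keeps `/api/...` from falling through to the catch-all `/` route.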
## Docker Images
All services ship as pre-built Docker images:
- `arthichaud/meelo-server`
- `arthichaud/meelo-front`
- `arthichaud/meelo-scanner`
- `arthichaud/meelo-matcher`
Images are built via GitHub Actions on every release. Development uses hot-reload containers with mounted source directories.
## Key Features
### Flexible Metadata Model
Albums can have multiple releases (original, remaster, deluxe). Songs can have multiple tracks (studio, live, acoustic). Tracks link to source files. This hierarchy mirrors real-world music organization.
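The hierarchy can be pictured with a few plain data classes. This is a simplified sketch, not Meelo's actual Prisma schema: the class and field names are illustrative, and attaching tracks to releases is an assumption made here to connect the two hierarchies.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """One concrete version of a song, backed by a single source file."""
    name: str
    file_path: str


@dataclass
class Song:
    """The abstract song; each track is one version of it."""
    name: str
    tracks: list[Track] = field(default_factory=list)


@dataclass
class Release:
    """One edition of an album. Holding tracks here is an assumption."""
    name: str
    tracks: list[Track] = field(default_factory=list)


@dataclass
class Album:
    name: str
    releases: list[Release] = field(default_factory=list)


studio = Track("Example Song", "/music/example/original/01.flac")
remaster = Track("Example Song (Remaster)", "/music/example/deluxe/01.flac")
live = Track("Example Song (Live)", "/music/example/deluxe/12.flac")

song = Song("Example Song", tracks=[studio, remaster, live])
album = Album("Example Album", releases=[
    Release("Original", tracks=[studio]),
    Release("Deluxe", tracks=[remaster, live]),
])
```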
### Music Video Support
Videos are not bolted on. They have dedicated types (official, live, lyric video, etc.), link to songs, and stream through the transcoder. The UI treats them as equals to audio tracks.
### Multi-Provider Metadata
Matcher queries 8 sources:
- MusicBrainz (primary database)
- Genius (lyrics, descriptions)
- Wikipedia (artist/album context)
- Wikidata (structured data)
- Discogs (release details)
- AllMusic (editorial reviews)
- Metacritic (critic scores)
- LrcLib (synced lyrics)
Users configure provider priority in settings.json.
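One plausible way a priority list can be applied is to fill each field from the first provider in the list that supplies a value. The sketch below assumes that strategy; `merge_by_priority`, the field names, and the sample data are hypothetical and are not taken from Matcher's code.

```python
def merge_by_priority(results: dict[str, dict], order: list[str]) -> dict:
    """Fill each field from the highest-priority provider that supplies it."""
    merged: dict = {}
    for provider in order:
        for key, value in results.get(provider, {}).items():
            # Keep the value from an earlier (higher-priority) provider.
            if value is not None and key not in merged:
                merged[key] = value
    return merged


# Illustrative provider responses, keyed by provider name.
results = {
    "genius": {"description": "From Genius", "lyrics": "la la la"},
    "wikipedia": {"description": "From Wikipedia"},
    "musicbrainz": {"description": None, "release_year": 1999},
}
order = ["wikipedia", "genius", "musicbrainz"]
merged = merge_by_priority(results, order)
```

Here `merged` takes its description from Wikipedia, its lyrics from Genius, and its release year from MusicBrainz.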
### Scrobbling Integration
Built-in support for Last.fm and ListenBrainz. OAuth flow for Last.fm, token-based for ListenBrainz. Scrobbles track plays automatically.
### Geographic Context
Areas (countries, cities, regions) are first-class entities with ISO 3166 codes. Artists link to areas. Areas form parent/child trees (city → state → country).
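The parent/child relationship amounts to a linked chain that can be walked from any area to its root. A minimal sketch, assuming a single `parent` reference per area; the `Area` class and `ancestry` helper are illustrative names, not Meelo's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Area:
    name: str
    iso_code: Optional[str] = None  # an ISO 3166 code, when one exists
    parent: Optional["Area"] = None


def ancestry(area: Area) -> list[str]:
    """Walk parent links from an area up to the root, most specific first."""
    names = []
    current: Optional[Area] = area
    while current is not None:
        names.append(current.name)
        current = current.parent
    return names


usa = Area("United States", iso_code="US")
washington = Area("Washington", iso_code="US-WA", parent=usa)
seattle = Area("Seattle", parent=washington)
```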
### Search Performance
MeiliSearch provides sub-100ms search across thousands of tracks. Typo tolerance handles misspellings. Faceted search filters by genre, year, type.
## Development Activity
- **40 releases** show consistent iteration
- **1,095 stars** indicate healthy community interest
- **Active CI/CD** with GitHub Actions per service
- **SonarCloud integration** enforces quality gates
- **Multi-language testing**: Jest (TypeScript), pytest (Python), Go testing
## Configuration Approach
### Environment Variables (.env)
Deployment settings: ports, URLs, directories, credentials for external services (Genius, Discogs, Last.fm).
### Settings File (settings.json)
User preferences: track filename regex, metadata source priority, provider enable/disable, compilation detection rules.
This split keeps deployment config separate from user preferences. Docker Compose reads `.env`; users edit `settings.json` through the UI or manually.
## Target Use Case
Meelo fits users who:
- Maintain large, well-organized music collections
- Want metadata control without manual database editing
- Need music video support beyond YouTube links
- Value data accuracy over convenience
- Run home servers or NAS devices
- Prefer self-hosting to cloud services
It does not fit users who:
- Want plug-and-play setup (8+ containers, complex config)
- Have messy folder structures (requires clean metadata or standard naming)
- Need lightweight deployment (heavy infrastructure stack)
- Need to avoid GPL-3.0 licensed software
## Architectural Philosophy
Meelo embraces microservices despite being a self-hosted app. Each service has a single responsibility:
- Scanner watches files
- Matcher enriches metadata
- Server manages state
- Front displays data
This separation enables:
- Independent scaling (run multiple scanners for large libraries)
- Language-specific optimization (Go for I/O, Python for HTTP scraping)
- Isolated failures (matcher crash doesn't stop playback)
- Parallel development (teams can work on different services)
The tradeoff is operational complexity. Users must manage 8+ containers, three languages (TypeScript, Go, Python), and inter-service communication. For the target audience (technical music collectors), this is acceptable.
## Comparison Context
Among self-hosted music servers:
- **Navidrome**: Simpler (single binary), less metadata flexibility
- **Funkwhale**: Federated, social features, lighter metadata model
- **Airsonic**: Java monolith, basic metadata, stable but dated
- **Jellyfin**: General media server, music is secondary
- **Plex**: Proprietary, cloud-dependent, limited metadata control
Meelo occupies the "sophisticated metadata, self-hosted, open source" niche. It is more complex to run than Navidrome but offers a richer metadata model. It is more focused on music than Jellyfin but less mature.
## Technical Highlights
### Monorepo Structure
All services live in one repository with shared tooling (Biome, Docker Compose). This simplifies version coordination and cross-service changes.
### Event-Driven Enrichment
Scanner publishes "file added" events to RabbitMQ. Matcher consumes them asynchronously. Server receives enriched metadata via API. This decoupling prevents blocking and enables retries.
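The decoupling can be sketched with an in-process queue standing in for RabbitMQ. This is only an analogy under stated assumptions: the event shape, the `scanner` and `matcher` functions, and the `enriched` list (a stand-in for pushes to the Server API) are all hypothetical, and no real broker is involved.

```python
import queue
import threading

events: queue.Queue = queue.Queue()  # stand-in for a RabbitMQ queue
enriched: list[dict] = []  # stand-in for metadata pushed to the Server API


def scanner(paths: list[str]) -> None:
    """Publish one event per file and return without waiting for enrichment."""
    for path in paths:
        events.put({"type": "file_added", "path": path})
    events.put(None)  # sentinel: no more events in this sketch


def matcher() -> None:
    """Consume events until the sentinel arrives."""
    while True:
        event = events.get()
        if event is None:
            break
        enriched.append({"path": event["path"], "source": "hypothetical-provider"})


consumer = threading.Thread(target=matcher)
consumer.start()
scanner(["/music/a.flac", "/music/b.flac"])
consumer.join()
```

The producer never calls the consumer directly, so a slow or restarted consumer only delays enrichment rather than blocking the scan.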
### Type Safety
Every service is statically typed or type-checked: TypeScript (Server, Front), Go (Scanner), and Python checked with Pyright (Matcher). Prisma generates TypeScript types from the database schema.
### Health Monitoring
Every Docker service has health checks. Compose orchestrates startup order: database first, then message queue, then application services, finally nginx. This prevents race conditions.
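Startup ordering of this kind is a topological sort over a dependency graph. The sketch below uses Python's standard `graphlib` to derive one valid order; the dependency map is a guess made for illustration and is not copied from Meelo's Compose files.

```python
from graphlib import TopologicalSorter

# Hypothetical map: service -> services that must be healthy first.
depends_on = {
    "postgres": set(),
    "meilisearch": set(),
    "rabbitmq": set(),
    "server": {"postgres", "meilisearch", "rabbitmq"},
    "scanner": {"server", "rabbitmq"},
    "matcher": {"server", "rabbitmq"},
    "front": {"server"},
    "nginx": {"server", "scanner", "matcher", "front"},
}

# static_order() yields each service only after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
```

Services with no ordering constraint between them may appear in any relative order, but infrastructure always precedes the application services and `nginx` comes last.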
### Mobile Parity
Front monorepo includes web (Next.js) and mobile (Expo). Shared components and state management. Mobile app is not an afterthought.
## Deployment Models
### Production (docker-compose.yml)
Pre-built images from Docker Hub. Fast startup. No build tools needed. Suitable for end users.
### Development (docker-compose.dev.yml)
Hot reload for all services. Exposed ports for debugging. Mounted source directories. Suitable for contributors.
### Local Build (docker-compose.local.yml)
Builds images from source. Tests Dockerfile changes. Suitable for CI or custom modifications.
All three share the same infrastructure services (PostgreSQL, MeiliSearch, RabbitMQ). Only application services differ.
## Data Flow Example
1. User adds music files to library folder
2. Scanner detects new files via filesystem watch
3. Scanner extracts metadata (tags, duration, bitrate) using FFmpeg
4. Scanner generates AcoustID fingerprint
5. Scanner registers file with Server API
6. Scanner publishes "file added" event to RabbitMQ
7. Matcher consumes event
8. Matcher uses the AcoustID fingerprint to identify the recording in MusicBrainz
9. Matcher queries Genius for lyrics
10. Matcher queries Wikipedia for artist bio
11. Matcher pushes enriched metadata to Server API
12. Server updates database
13. Server updates MeiliSearch index
14. Front queries Server API
15. User sees new track with complete metadata
This flow demonstrates the event-driven architecture and multi-provider enrichment.
## Quality Assurance
### Testing
- **Server**: Jest unit tests for NestJS modules
- **Matcher**: pytest with async support for provider modules
- **Scanner**: Go testing for file parsing and fingerprinting
- **Coverage**: SonarCloud tracks coverage per service
### Linting
- **TypeScript**: Biome (replaces ESLint + Prettier)
- **Python**: Ruff + Pyright
- **Go**: golangci-lint
### CI/CD
GitHub Actions per service:
1. Lint code
2. Run tests
3. Upload coverage to SonarCloud
4. Build Docker image
5. Push to Docker Hub (on release)
Quality gates block merges if coverage drops or the analysis reports new bugs.
## Configuration Files
### biome.json
Formatting rules: tabs, double quotes, line width 100. Applies to TypeScript (Server, Front).
### settings.json
User-editable preferences:
- `trackRegex`: Filename parsing pattern
- `metadata.source`: Prefer embedded tags or external providers
- `metadata.order`: Provider priority list
- `providers`: Enable/disable specific providers
- `compilations`: Rules for detecting compilation albums
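To show what a filename parsing pattern does, here is a sketch that extracts fields from a relative path with named capture groups. The pattern, group names, and `parse_track_path` helper are hypothetical; Meelo's real `trackRegex` syntax and expected groups may differ.

```python
import re
from typing import Optional

# Hypothetical pattern for paths shaped like "Artist/Album/01 Title.ext".
TRACK_REGEX = re.compile(
    r"^(?P<artist>[^/]+)/(?P<album>[^/]+)/(?P<index>\d+) (?P<title>.+)\.\w+$"
)


def parse_track_path(relative_path: str) -> Optional[dict]:
    """Return the captured fields, or None when the path does not match."""
    match = TRACK_REGEX.match(relative_path)
    if match is None:
        return None
    fields = match.groupdict()
    fields["index"] = int(fields["index"])  # "01" -> 1
    return fields
```

A path that does not follow the expected layout returns `None`, which is why a consistently organized library matters for this kind of parsing.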
### .env
Deployment secrets:
- `JWT_SIGNATURE`: Auth token signing key
- `GENIUS_ACCESS_TOKEN`: Genius API key
- `DISCOGS_ACCESS_TOKEN`: Discogs API key
- `LASTFM_API_KEY`, `LASTFM_API_SECRET`: Last.fm OAuth
- `PUBLIC_URL`: External URL for OAuth callbacks
- `CONFIG_DIR`, `DATA_DIR`: Volume mount paths
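A small pre-flight check can catch an incomplete `.env` before the containers start. The sketch below parses simple `KEY=VALUE` lines and reports empty or missing keys; which keys are strictly required is an assumption here, and the parser ignores quoting and other dotenv features.

```python
# Assumed-required keys, chosen from the list above for illustration.
REQUIRED = ["JWT_SIGNATURE", "PUBLIC_URL", "CONFIG_DIR", "DATA_DIR"]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED if not values.get(key)]


sample = "JWT_SIGNATURE=change-me\n# comment\nPUBLIC_URL=http://localhost\nCONFIG_DIR=/config\nDATA_DIR=\n"
problems = missing_keys(parse_env(sample))
```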
## First-Time Setup
1. Clone repository
2. Copy `.env.example` to `.env`
3. Fill in required credentials (Genius, Discogs, Last.fm)
4. Create `settings.json` with track regex and provider preferences
5. Run `docker-compose up -d`
6. Wait for health checks to pass
7. Navigate to `http://localhost:3000`
8. Register admin user
9. Create library pointing to music folder
10. Trigger initial scan via Scanner API
The system will scan files, extract metadata, query providers, and populate the database. Initial scan time depends on library size and provider response times.
## Maintenance Operations
### Rescan Library
POST to `/scanner/scan/:libraryId` triggers full rescan. Useful after bulk file changes.
### Clean Orphans
POST to `/scanner/clean` removes database entries for deleted files.
### Refresh Metadata
POST to `/scanner/refresh` re-queries providers for existing tracks. Updates descriptions, ratings, lyrics.
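The three Scanner calls above can be scripted with the standard library. This sketch only builds the requests and does not send them; the base URL, the library ID, and the `scanner_request` helper are placeholders, and a real instance may also expect authentication that is not shown.

```python
import urllib.request

BASE_URL = "http://localhost"  # assumption: the address where Nginx is reachable


def scanner_request(path: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to a Scanner route."""
    return urllib.request.Request(f"{BASE_URL}/scanner/{path}", method="POST")


rescan = scanner_request("scan/1")  # library ID 1 is a placeholder
clean = scanner_request("clean")
refresh = scanner_request("refresh")

# To actually send one (requires a running instance):
# with urllib.request.urlopen(rescan) as response:
#     print(response.status)
```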
### Backup Database
Standard PostgreSQL dump. Volume is `meelo_db` in Docker.
### Update Services
Pull new images, restart containers. Database migrations run automatically via Prisma.
## Extension Points
### Custom Providers
Add new provider modules to Matcher. Implement provider interface (search, fetch metadata). Register in factory. No Server changes needed.
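The interface-plus-factory pattern described here might look roughly like the following. Every name in this sketch (`Provider`, `register`, `create_provider`, the method names, the in-memory catalog) is hypothetical and chosen for illustration; Matcher's real base class and registration mechanism may differ.

```python
from abc import ABC, abstractmethod
from typing import Optional


class Provider(ABC):
    """Hypothetical provider contract: search for an ID, then fetch by ID."""

    name: str

    @abstractmethod
    def search_artist(self, artist_name: str) -> Optional[str]: ...

    @abstractmethod
    def fetch_artist(self, provider_id: str) -> dict: ...


PROVIDERS: dict[str, type[Provider]] = {}


def register(cls: type[Provider]) -> type[Provider]:
    """Class decorator that adds a provider to the factory registry."""
    PROVIDERS[cls.name] = cls
    return cls


@register
class ExampleProvider(Provider):
    name = "example"
    # In-memory data instead of a real HTTP API, to keep the sketch runnable.
    _catalog = {"42": {"name": "Example Artist", "description": "Placeholder"}}

    def search_artist(self, artist_name: str) -> Optional[str]:
        for provider_id, data in self._catalog.items():
            if data["name"].lower() == artist_name.lower():
                return provider_id
        return None

    def fetch_artist(self, provider_id: str) -> dict:
        return self._catalog[provider_id]


def create_provider(name: str) -> Provider:
    """Factory: instantiate a registered provider by name."""
    return PROVIDERS[name]()
```

Because callers only go through the factory and the shared interface, adding a provider does not require touching the code that consumes results.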
### Additional Scrobblers
Implement scrobbler interface in Server. Add OAuth flow if needed. Store credentials in UserScrobbler table.
### Alternative Frontends
Server API is client-agnostic. Build custom clients (CLI, desktop app, voice assistant) using the REST API.
### Transcoding Profiles
Configure Kyoo transcoder with custom profiles. Adjust bitrates, codecs, resolutions for different devices.
## Performance Characteristics
### Scan Speed
Go scanner processes ~100 files/second on SSD. Bottleneck is FFprobe metadata extraction, not file I/O.
### Search Latency
MeiliSearch returns results in <100ms for libraries up to 100k tracks. Latency grows roughly linearly with library size beyond that.
### Streaming Startup
Direct file streaming (no transcoding) starts in <500ms. Transcoded streams add 2-5s for initial segment generation.
### Metadata Enrichment
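Throughput is limited by external provider rate limits (MusicBrainz: 1 req/sec, Genius: 10 req/sec). The ~10 tracks/second figure is therefore a best case; any track that needs a MusicBrainz request is capped at roughly one per second.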
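A back-of-the-envelope estimate shows how a 1 req/sec limit shapes a first full enrichment pass. The sketch assumes exactly one rate-limited request per track and ignores every other provider; the function name and library size are illustrative.

```python
def enrichment_hours(track_count: int, requests_per_track: int = 1,
                     requests_per_second: float = 1.0) -> float:
    """Lower bound on wall-clock hours when one rate-limited provider dominates."""
    seconds = track_count * requests_per_track / requests_per_second
    return seconds / 3600


# 36,000 tracks at 1 request per second is 36,000 seconds of waiting.
hours = enrichment_hours(36_000)
```

Under these assumptions a 36,000-track library needs at least 10 hours for the rate-limited lookups alone.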
## Resource Requirements
### Minimum
- **CPU**: 2 cores
- **RAM**: 4GB
- **Storage**: 10GB + music library size
- **Network**: 10 Mbps upload for remote streaming
### Recommended
- **CPU**: 4 cores (for transcoding)
- **RAM**: 8GB (MeiliSearch benefits from memory)
- **Storage**: SSD for database and search index
- **Network**: 50 Mbps upload for multiple streams
## Security Considerations
### Authentication
JWT tokens with configurable expiration. Bcrypt password hashing. API keys for internal service communication.
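For readers unfamiliar with how a signed, expiring token works, here is a minimal sketch of HS256-style JWT signing and verification using only the standard library. It assumes an HMAC-SHA256 signature keyed by a shared secret such as `JWT_SIGNATURE`; the algorithm Meelo actually uses is not stated in this document, the helper names are hypothetical, and production code should rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def sign_token(payload: dict, secret: str) -> str:
    """Return header.payload.signature, each part base64url-encoded."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature matches and `exp` is in the future."""
    try:
        signing_input, signature = token.rsplit(".", 1)
        _header_b64, payload_b64 = signing_input.split(".")
    except ValueError:
        return None
    expected = _sign(signing_input, secret)
    # Constant-time comparison; this sketch does not inspect the header.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if payload.get("exp", float("inf")) <= current:
        return None
    return payload
```

The `exp` claim is what makes expiration configurable: the issuer picks the lifetime when it signs the token, and verification rejects anything past it.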
### Anonymous Access
`ALLOW_ANONYMOUS=1` disables auth. Useful for private networks. Not recommended for internet-exposed instances.
### External Providers
Credentials stored in .env. Never logged or exposed via API. Matcher makes requests server-side, not from client.
### File Access
Scanner and Server run as non-root in Docker. File permissions must allow read access. No write operations on music files.
## Community and Support
### Documentation
README covers setup. Wiki has advanced topics (custom providers, troubleshooting). API docs at `/api/docs`.
### Issue Tracker
GitHub Issues for bugs and features. Active maintainer responses. Template for bug reports.
### Contributions
Pull requests welcome. CI checks must pass. SonarCloud quality gates enforced. Biome formatting required.
### Roadmap
GitHub Projects track planned features. Community votes on priorities. Regular releases (every 2-3 weeks).
## Licensing Implications
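GPL-3.0 requires, for anyone who distributes Meelo or a modified version:
- Source code disclosure for the distributed modifications
- Same license for derivative works
- No proprietary redistributed forks
These obligations are triggered by distribution, not by use. Unlike AGPL-3.0, GPL-3.0 has no network-use clause, so running a modified Meelo privately or as a hosted service does not by itself require publishing the changes. This is unproblematic for self-hosters and restrictive mainly for vendors who want to ship a closed-source derivative.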
## Summary
Meelo is a sophisticated, microservices-based music server for technical users who value metadata accuracy and flexibility. It trades operational simplicity for data model richness and extensibility. The event-driven architecture, multi-provider metadata enrichment, and first-class video support distinguish it from simpler alternatives. The GPL-3.0 license and heavy infrastructure requirements limit its audience to self-hosting enthusiasts with technical skills and well-organized music collections.