feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
@@ -0,0 +1,275 @@
# MiniMediaMetadataAPI - Project Overview

## Project Identity

**Name:** MiniMediaMetadataAPI
**Repository:** https://github.com/MusicMoveArr/MiniMediaMetadataAPI
**License:** GPL-3.0 (copyleft)
**Maintainer:** Single maintainer (MusicMoveArr organization)
**Status:** Active development

## Technology Stack

### Runtime & Language
- **.NET 8.0** (SDK 8.0.0)
- **C#** (modern language features)
- **ASP.NET Core** web framework

### Database Layer
- **PostgreSQL** as primary data store
- **Dapper 2.1.72** micro-ORM (NOT Entity Framework)
- **Npgsql 10.0.2** PostgreSQL driver for .NET
- **pg_trgm extension** for fuzzy text search
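
To make the fuzzy-search idea concrete, here is a rough Python sketch of trigram similarity in the spirit of pg_trgm. It is an approximation for illustration only: the tokenization and padding rules are simplified assumptions, and the helper names `trigrams` and `similarity` are hypothetical, not taken from the project or from PostgreSQL.

```python
import re


def trigrams(text: str) -> set[str]:
    """Return the set of 3-character windows for each word in `text`.

    Assumption: words are lowercased, split on non-word characters, and
    padded with two leading spaces and one trailing space, roughly as the
    pg_trgm documentation describes. Real pg_trgm may differ in edge cases.
    """
    grams: set[str] = set()
    for word in re.findall(r"\w+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

With this sketch, `similarity("Radiohead", "radiohead")` is `1.0`, a misspelling such as `"Radiohed"` scores somewhere between 0 and 1, and two strings with no trigrams in common score `0.0`. In the database, the same comparison happens server-side and can be index-assisted.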

### Core Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| Dapper | 2.1.72 | Lightweight ORM, SQL mapping |
| Npgsql | 10.0.2 | PostgreSQL connectivity |
| FuzzySharp | 2.0.2 | String similarity matching |
| Polly | 8.6.6 | Resilience and transient fault handling |
| Quartz | 3.17.0 | Job scheduling framework |
| SpotifyAPI.Web.Auth | 7.4.2 | Spotify authentication (unused in API) |
| prometheus-net | 8.2.1 | Metrics collection and export |
| Swashbuckle | 10.1.7 | OpenAPI/Swagger documentation |

## Provider Coverage

The API aggregates metadata from **6 music providers**:

1. **Spotify** - Streaming service with rich metadata
2. **Tidal** - High-fidelity streaming platform
3. **MusicBrainz** - Open music encyclopedia
4. **Deezer** - European streaming service
5. **Discogs** - Music database and marketplace
6. **SoundCloud** - User-generated content platform

Each provider has dedicated database models and repository implementations.

## Solution Structure

The codebase is organized into **3 projects**:

### 1. MiniMediaMetadataAPI (Main API)
- ASP.NET Core web application
- Controllers for HTTP endpoints
- Middleware for request processing
- Configuration and dependency injection
- Entry point: `Program.cs`

### 2. MiniMediaMetadataAPI.Application (Business Logic)
- Repository pattern implementations
- Service layer (SearchArtist, SearchAlbum, SearchTrack)
- Database models for all 6 providers
- Entity models for API responses
- Helper utilities

### 3. MiniMediaMetadataAPI.Tests (Testing)
- xUnit test framework
- **Current state: Empty stub only (0% coverage)**

## Dependency Injection Configuration

`Program.cs` registers the following components:

### Repositories (7 total)
- `ISpotifyRepository` → `SpotifyRepository`
- `ITidalRepository` → `TidalRepository`
- `IMusicBrainzRepository` → `MusicBrainzRepository`
- `IDeezerRepository` → `DeezerRepository`
- `IDiscogsRepository` → `DiscogsRepository`
- `ISoundCloudRepository` → `SoundCloudRepository`
- `IJobRepository` → `JobRepository`

### Services (3 total)
- `ISearchArtistService` → `SearchArtistService`
- `ISearchAlbumService` → `SearchAlbumService`
- `ISearchTrackService` → `SearchTrackService`

## Resource Footprint

**Memory Usage:** <250MB
**Connection Pooling:** MinPoolSize=5, MaxPoolSize=100

This lightweight footprint makes the API suitable for containerized deployments and resource-constrained environments.

## Database Relationship

**Critical architectural note:** This API does NOT own the database schema.

- **Schema Owner:** MiniMediaScanner (separate project)
- **API Role:** Read-only consumer
- **Data Sync:** Handled entirely by MiniMediaScanner
- **No Migrations:** This project contains no database migration code

The API queries pre-populated tables. Data freshness depends on MiniMediaScanner's sync schedule.

## Codebase Metrics

- **Total C# files:** 99
- **Database models:** 60+
- **Controllers:** 4
- **Repositories:** 7
- **Services:** 3
- **Middleware:** 1 (Prometheus request tracking)

## Key Architectural Decisions

### Why Dapper over Entity Framework?
- Lightweight, minimal overhead
- Direct SQL control for complex queries
- Better performance for read-heavy workloads
- No change tracking overhead (read-only API)
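
As a rough illustration of the micro-ORM style described above (hand-written SQL, rows mapped straight onto plain objects, no change tracking), here is a Python sketch that uses the standard-library `sqlite3` module as a stand-in for PostgreSQL. The `Artist` type, the `artist` table, and the `query_artists` and `demo` functions are hypothetical; this is not the project's Dapper code.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Artist:
    """Immutable read model: nothing tracks changes to it after mapping."""
    id: int
    name: str


def query_artists(conn: sqlite3.Connection, name_like: str) -> list[Artist]:
    """Run explicit, parameterized SQL and map each row onto a dataclass."""
    rows = conn.execute(
        "SELECT id, name FROM artist WHERE name LIKE ? ORDER BY id",
        (name_like,),
    ).fetchall()
    return [Artist(id=row[0], name=row[1]) for row in rows]


def demo() -> list[Artist]:
    """Build a throwaway in-memory table and query it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO artist (id, name) VALUES (?, ?)",
        [(1, "Radiohead"), (2, "Portishead"), (3, "Bjork")],
    )
    return query_artists(conn, "%head")
```

Calling `demo()` returns the two artists whose names end in "head". The point of the style is that the query text stays fully visible and the mapping is a single explicit step.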

### Why Repository Pattern?
- Clean separation between data access and business logic
- Provider-specific implementations isolated
- Easy to mock for testing (though tests are missing)
- Consistent interface across all providers
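
The testability argument can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ArtistRepository` interface, the `InMemoryRepository` test double, and the `ArtistSearch` service are hypothetical stand-ins, not the project's C# types.

```python
from typing import Protocol


class ArtistRepository(Protocol):
    """Hypothetical common interface that every provider repository satisfies."""

    def search_artists(self, name: str) -> list[str]: ...


class InMemoryRepository:
    """Test double: serves artists from a fixed list instead of a database."""

    def __init__(self, artists: list[str]) -> None:
        self._artists = artists

    def search_artists(self, name: str) -> list[str]:
        needle = name.lower()
        return [a for a in self._artists if needle in a.lower()]


class ArtistSearch:
    """Business logic depends only on the interface, not on any provider."""

    def __init__(self, repository: ArtistRepository) -> None:
        self._repository = repository

    def search(self, name: str) -> list[str]:
        # Hypothetical rule: trim the input and return results sorted by name.
        return sorted(self._repository.search_artists(name.strip()))
```

Because `ArtistSearch` only sees the interface, a test can pass `InMemoryRepository(["Tool", "Toto", "Muse"])` and check the trimming and sorting rules without a database.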

### Why No Schema Ownership?
- Separation of concerns: MiniMediaScanner handles sync complexity
- API focuses on query optimization and response formatting
- Avoids dual-write problems
- Simpler deployment (no migration coordination)

## Integration Points

### External Dependencies
- PostgreSQL database (shared with MiniMediaScanner)
- Prometheus metrics collector (optional)

### Internal Dependencies
- No inter-service communication
- No message queues
- No caching layer
- No external API calls (data pre-populated)

## Configuration Surface

Primary configuration via `appsettings.json`:

```json
{
  "DatabaseConfiguration": {
    "ConnectionString": "Host=...;Database=...;Username=...;Password=..."
  },
  "Prometheus": {
    "MetricsUrl": "/metrics"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information"
    }
  }
}
```

## Deployment Artifacts

- **Dockerfile:** Multi-stage build, non-root user, ports 8080/8081
- **compose.yaml:** Minimal build configuration
- **Production compose:** Port mapping (56232:8080), memory limit (256M), volume mount for config

## CI/CD Pipeline

**GitHub Actions:** `docker-image.yml`

- **Trigger:** Push to main branch
- **Steps:** Build Docker image → Push to Docker Hub
- **Missing:** Test execution, deployment automation, health checks

## API Surface

**Base Path:** `/api`
**Documentation:** `/swagger` (Swagger UI)
**Metrics:** `/metrics` (Prometheus format)

### Endpoints
- `GET /api/SearchArtist` - Search artists across providers
- `GET /api/SearchAlbum` - Search albums across providers
- `GET /api/SearchTrack` - Search tracks across providers
- `GET /api/Search` - Stub endpoint (not implemented)

## Security Posture

**Authentication:** None (fully open API)
**Authorization:** None
**Rate Limiting:** None
**CORS:** Not configured
**HTTPS:** Commented out in production

This is a **trust-based deployment**, suitable only for internal networks or behind an authentication gateway.

## Observability

**Metrics:** Prometheus request counters (path, method, status labels)
**Logging:** ASP.NET Core default (console output)
**Tracing:** None
**Health Checks:** None
**Error Tracking:** None (no Sentry, no structured logging)
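
To show what a request counter labeled by path, method, and status amounts to, here is a minimal Python sketch that needs no metrics library. The metric name `http_requests_total` and the `record` and `render` helpers are assumptions for illustration; the project's actual metric names are not documented here, and label values are not escaped in this sketch.

```python
from collections import Counter

# Hypothetical metric name; not taken from the project.
METRIC = "http_requests_total"

# One counter value per (path, method, status) label combination.
requests_seen: Counter = Counter()


def record(path: str, method: str, status: int) -> None:
    """Increment the counter for one label combination."""
    requests_seen[(path, method, str(status))] += 1


def render() -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [f"# TYPE {METRIC} counter"]
    for (path, method, status), count in sorted(requests_seen.items()):
        labels = f'path="{path}",method="{method}",status="{status}"'
        lines.append(f"{METRIC}{{{labels}}} {count}")
    return "\n".join(lines) + "\n"
```

A middleware would call `record(...)` once per completed request, and the `/metrics` endpoint would return `render()`. Keeping the raw path as a label is simple but can create many series if paths contain identifiers.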

## Testing Strategy

**Current State:** No meaningful tests
**Test Framework:** xUnit configured but unused
**Coverage:** 0%
**CI Integration:** Tests not run in pipeline

This is a significant gap for production readiness.

## License Implications

**GPL-3.0** is a copyleft license requiring:
- Source code disclosure for derivative works
- Same license for modifications
- Patent grant to users

**Impact on integration:**
- Cannot incorporate code into proprietary systems without GPL compliance
- Can be used as a separate service (the API boundary preserves license isolation)
- Database schema and API patterns can inspire clean-room implementations

## Relevance to metadata-aggregator Project

**High relevance** - this is the closest existing implementation to our goals:

1. **Multi-provider aggregation** - exactly our use case
2. **Unified search API** - provider-agnostic queries
3. **Database schema design** - proven model for multi-provider storage
4. **Provider isolation** - clean separation via repository pattern
5. **Fuzzy search** - pg_trgm implementation reference

**Key learnings:**
- Repository-per-provider scales well
- Dapper performs well for read-heavy metadata queries
- Separate sync process (MiniMediaScanner) simplifies API
- Provider=Any pattern enables cross-provider search
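
The Provider=Any idea from the list above can be sketched as a small dispatch function. This is a Python illustration under stated assumptions: the `Provider` enum lists only a subset of providers, the `search` function and its result shape are hypothetical, and how the real API merges or ranks results across providers is not shown in this overview.

```python
from enum import Enum
from typing import Callable


class Provider(Enum):
    ANY = "Any"
    SPOTIFY = "Spotify"
    MUSICBRAINZ = "MusicBrainz"


# Hypothetical per-provider search function; a real one would query the database.
SearchFn = Callable[[str], list[str]]


def search(provider: Provider, name: str,
           repositories: dict[Provider, SearchFn]) -> dict[str, list[str]]:
    """Query one provider, or every registered provider when ANY is requested."""
    if provider is Provider.ANY:
        selected = list(repositories)
    elif provider in repositories:
        selected = [provider]
    else:
        raise ValueError(f"no repository registered for {provider.value}")
    # Results stay grouped by provider name; no merging or ranking is attempted.
    return {p.value: repositories[p](name) for p in selected}
```

With two registered repositories, `search(Provider.ANY, "Muse", repos)` returns one result list per provider, while `search(Provider.SPOTIFY, "Muse", repos)` returns only the Spotify entry.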

**Gaps to address:**
- Add comprehensive testing
- Implement authentication/authorization
- Add a caching layer for performance
- Add health checks for production readiness
- Introduce API versioning so the API can evolve
- Add rate limiting for abuse prevention
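
Of the gaps listed above, the caching layer is the easiest to sketch in isolation. The following Python sketch shows one possible shape, a small in-process cache with time-based expiry. The `TtlCache` class and its `get_or_load` method are hypothetical names, and a real deployment might prefer a shared cache instead.

```python
import time
from typing import Callable, Generic, Hashable, TypeVar

V = TypeVar("V")


class TtlCache(Generic[V]):
    """Tiny in-process cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[Hashable, tuple[float, V]] = {}

    def get_or_load(self, key: Hashable, load: Callable[[], V]) -> V:
        """Return a still-fresh cached value, or call `load` and cache its result."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = load()
        self._entries[key] = (now, value)
        return value
```

A search service could wrap its repository call as `cache.get_or_load(("artist", name), lambda: repository.search_artists(name))`, where `repository` is whatever data-access object is in use. The sketch never evicts expired entries, so a production version would need a size bound.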

## Project Maturity Assessment

**Strengths:**
- Clean architecture
- Multiple providers working
- Lightweight and performant
- Good separation of concerns

**Weaknesses:**
- Single maintainer risk
- No test coverage
- Missing production hardening (auth, rate limiting, health checks)
- Schema coupling with external project
- Limited observability

**Maturity Level:** Early production / Advanced prototype

Suitable for internal use or as a reference implementation. Needs hardening for public deployment.