Alexander a1f6701bac feat: initial implementation of metadata aggregator

- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects

2026-04-28 16:28:53 +02:00

MiniMediaMetadataAPI - Project Overview

Project Identity

Name: MiniMediaMetadataAPI
Repository: https://github.com/MusicMoveArr/MiniMediaMetadataAPI
License: GPL-3.0 (copyleft)
Maintainer: Single maintainer (MusicMoveArr organization)
Status: Active development

Technology Stack

Runtime & Language

.NET 8.0 (SDK 8.0.0)
C# (modern language features)
ASP.NET Core web framework

Database Layer

PostgreSQL as primary data store
Dapper 2.1.72 micro-ORM (NOT Entity Framework)
Npgsql 10.0.2 PostgreSQL driver for .NET
pg_trgm extension for fuzzy text search
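To make the fuzzy-search entry concrete, here is a minimal sketch of the trigram similarity that pg_trgm computes: each word is lowercased and padded with two leading spaces and one trailing space, and similarity is the number of shared trigrams divided by the number of distinct trigrams in both strings. The function names are hypothetical, and treating only ASCII letters and digits as word characters is a simplifying assumption. This is not the project's code and not the extension's exact implementation.

```python
import re


def trigrams(text: str) -> set[str]:
    """Return the set of trigrams for `text`, using pg_trgm-style padding."""
    grams: set[str] = set()
    # Assumption: only ASCII letters and digits count as word characters.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 if both are empty)."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


print(round(similarity("word", "two words"), 2))  # 0.36
```

In PostgreSQL itself the comparison runs inside the database, typically backed by a trigram index, so a sketch like this is mainly useful for reasoning about why two names do or do not match.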

Core Dependencies

Package              Version  Purpose
Dapper               2.1.72   Lightweight ORM, SQL mapping
Npgsql               10.0.2   PostgreSQL connectivity
FuzzySharp           2.0.2    String similarity matching
Polly                8.6.6    Resilience and transient fault handling
Quartz               3.17.0   Job scheduling framework
SpotifyAPI.Web.Auth  7.4.2    Spotify authentication (unused in API)
prometheus-net       8.2.1    Metrics collection and export
Swashbuckle          10.1.7   OpenAPI/Swagger documentation

Provider Coverage

The API aggregates metadata from 6 music providers:

Spotify - Streaming service with rich metadata
Tidal - High-fidelity streaming platform
MusicBrainz - Open music encyclopedia
Deezer - European streaming service
Discogs - Music database and marketplace
SoundCloud - User-generated content platform

Each provider has dedicated database models and repository implementations.

Solution Structure

The codebase is organized into 3 projects:

1. MiniMediaMetadataAPI (Main API)

ASP.NET Core web application
Controllers for HTTP endpoints
Middleware for request processing
Configuration and dependency injection
Entry point: Program.cs

2. MiniMediaMetadataAPI.Application (Business Logic)

Repository pattern implementations
Service layer (SearchArtist, SearchAlbum, SearchTrack)
Database models for all 6 providers
Entity models for API responses
Helper utilities

3. MiniMediaMetadataAPI.Tests (Testing)

xUnit test framework
Current state: Empty stub only (0% coverage)

Dependency Injection Configuration

Program.cs registers the following components:

Repositories (7 total)

ISpotifyRepository → SpotifyRepository
ITidalRepository → TidalRepository
IMusicBrainzRepository → MusicBrainzRepository
IDeezerRepository → DeezerRepository
IDiscogsRepository → DiscogsRepository
ISoundCloudRepository → SoundCloudRepository
IJobRepository → JobRepository

Services (3 total)

ISearchArtistService → SearchArtistService
ISearchAlbumService → SearchAlbumService
ISearchTrackService → SearchTrackService

Resource Footprint

Memory Usage: <250MB
Connection Pooling: MinPoolSize=5, MaxPoolSize=100

This lightweight footprint makes the API suitable for containerized deployments and resource-constrained environments.

Database Relationship

Critical architectural note: This API does NOT own the database schema.

Schema Owner: MiniMediaScanner (separate project)
API Role: Read-only consumer
Data Sync: Handled entirely by MiniMediaScanner
No Migrations: This project contains no database migration code

The API queries pre-populated tables. Data freshness depends on MiniMediaScanner's sync schedule.

Codebase Metrics

Total C# files: 99
Database models: 60+
Controllers: 4
Repositories: 7
Services: 3
Middleware: 1 (Prometheus request tracking)

Key Architectural Decisions

Why Dapper over Entity Framework?

Lightweight, minimal overhead
Direct SQL control for complex queries
Better performance for read-heavy workloads
No change tracking overhead (read-only API)
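The micro-ORM idea behind these points can be sketched in a few lines: run hand-written SQL, then map each result row onto a typed object by column name, with no change tracking. The example below uses Python's built-in sqlite3 as a stand-in for PostgreSQL. The `Artist` model, the `query` helper, the table layout, and the sample rows are all made up for illustration and are not taken from the project or from Dapper.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Artist:  # hypothetical read model, not a project class
    id: int
    name: str
    provider: str


def query(conn: sqlite3.Connection, model, sql: str, params=()):
    """Run hand-written SQL and map each row onto `model` by column name."""
    cur = conn.execute(sql, params)
    columns = [d[0] for d in cur.description]
    wanted = {f.name for f in fields(model)}
    # Columns without a matching field are ignored; nothing is tracked afterwards.
    return [
        model(**{c: v for c, v in zip(columns, row) if c in wanted})
        for row in cur
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (id INTEGER, name TEXT, provider TEXT)")
conn.executemany(
    "INSERT INTO artist VALUES (?, ?, ?)",
    [(1, "Example Artist", "Spotify"),
     (2, "Example Artist", "Deezer"),
     (3, "Other Artist", "Tidal")],
)

rows = query(
    conn, Artist,
    "SELECT id, name, provider FROM artist WHERE name = ? ORDER BY id",
    ("Example Artist",),
)
print(rows)
```

The returned objects are plain immutable values, which fits a read-only API: there is no session or unit of work to manage.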

Why Repository Pattern?

Clean separation between data access and business logic
Provider-specific implementations isolated
Easy to mock for testing (though tests are missing)
Consistent interface across all providers
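A minimal sketch of why this pattern is easy to mock: the service depends only on an interface, so a test can pass in an in-memory fake instead of a database-backed repository. All names here (`ArtistRepository`, `InMemoryArtistRepository`, `search_artists`, and the simplified `SearchArtistService`) are hypothetical Python stand-ins and do not mirror the project's actual C# signatures.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ArtistResult:
    provider: str
    name: str


class ArtistRepository(Protocol):
    """Interface the service depends on; one implementation per provider."""

    def search_artists(self, name: str) -> list[ArtistResult]: ...


class InMemoryArtistRepository:
    """Test double: same method shape, no database required."""

    def __init__(self, provider: str, names: list[str]) -> None:
        self.provider = provider
        self.names = names

    def search_artists(self, name: str) -> list[ArtistResult]:
        needle = name.casefold()
        return [
            ArtistResult(self.provider, n)
            for n in self.names
            if needle in n.casefold()
        ]


class SearchArtistService:
    """Business logic that only knows about the interface."""

    def __init__(self, repository: ArtistRepository) -> None:
        self._repository = repository

    def search(self, name: str) -> list[str]:
        return sorted(r.name for r in self._repository.search_artists(name))


fake = InMemoryArtistRepository("MusicBrainz", ["Beta Band", "Alpha Band", "Solo"])
service = SearchArtistService(fake)
print(service.search("band"))  # ['Alpha Band', 'Beta Band']
```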

Why No Schema Ownership?

Separation of concerns: MiniMediaScanner handles sync complexity
API focuses on query optimization and response formatting
Avoids dual-write problems
Simpler deployment (no migration coordination)

Integration Points

External Dependencies

PostgreSQL database (shared with MiniMediaScanner)
Prometheus metrics collector (optional)

Internal Dependencies

No inter-service communication
No message queues
No caching layer
No external API calls (data pre-populated)

Configuration Surface

Primary configuration via appsettings.json:

{
  "DatabaseConfiguration": {
    "ConnectionString": "Host=...;Database=...;Username=...;Password=..."
  },
  "Prometheus": {
    "MetricsUrl": "/metrics"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information"
    }
  }
}
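As a rough illustration of how a consumer might read this file, the sketch below parses the same three sections and fails fast when the connection string is missing. The `load_settings` helper and the fallback values for the metrics URL and log level are assumptions made for the example; the elided connection string is kept exactly as shown above.

```python
import json


def load_settings(raw: str) -> dict[str, str]:
    """Parse appsettings-style JSON and pull out the values shown above."""
    data = json.loads(raw)
    connection = data.get("DatabaseConfiguration", {}).get("ConnectionString")
    if not connection:
        raise ValueError("DatabaseConfiguration.ConnectionString is required")
    return {
        "connection_string": connection,
        # Fallbacks below are assumptions for this sketch, not project defaults.
        "metrics_url": data.get("Prometheus", {}).get("MetricsUrl", "/metrics"),
        "log_level": data.get("Logging", {}).get("LogLevel", {}).get(
            "Default", "Information"
        ),
    }


sample = """
{
  "DatabaseConfiguration": {
    "ConnectionString": "Host=...;Database=...;Username=...;Password=..."
  },
  "Prometheus": { "MetricsUrl": "/metrics" },
  "Logging": { "LogLevel": { "Default": "Information" } }
}
"""

settings = load_settings(sample)
print(settings["metrics_url"], settings["log_level"])
```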

Deployment Artifacts

Dockerfile: Multi-stage build, non-root user, ports 8080/8081
compose.yaml: Minimal build configuration
Production compose: Port mapping (56232:8080), memory limit (256M), volume mount for config

CI/CD Pipeline

GitHub Actions: docker-image.yml

Trigger: Push to main branch
Steps: Build Docker image → Push to Docker Hub
Missing: Test execution, deployment automation, health checks

API Surface

Base Path: /api
Documentation: /swagger (Swagger UI)
Metrics: /metrics (Prometheus format)

Endpoints

GET /api/SearchArtist - Search artists across providers
GET /api/SearchAlbum - Search albums across providers
GET /api/SearchTrack - Search tracks across providers
GET /api/Search - Stub endpoint (not implemented)

Security Posture

Authentication: None (fully open API)
Authorization: None
Rate Limiting: None
CORS: Not configured
HTTPS: Commented out in production

This is a trust-based deployment, suitable only for internal networks or behind an authentication gateway.

Observability

Metrics: Prometheus request counters (path, method, status labels)
Logging: ASP.NET Core default (console output)
Tracing: None
Health Checks: None
Error Tracking: None (no Sentry, no structured logging)
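The request-counter metric described above can be sketched as a counter keyed by (path, method, status) and rendered in the Prometheus text exposition format. The metric name `http_requests_total` and the `RequestCounter` class are hypothetical; the project uses prometheus-net, whose API is not shown here. Label-value escaping is omitted for brevity.

```python
from collections import Counter


class RequestCounter:
    """Counts requests per (path, method, status) label combination."""

    def __init__(self, name: str = "http_requests_total") -> None:
        self.name = name  # hypothetical metric name
        self._counts: Counter = Counter()

    def observe(self, path: str, method: str, status: int) -> None:
        self._counts[(path, method, str(status))] += 1

    def render(self) -> str:
        """Render all series in Prometheus text format (no label escaping)."""
        lines = [f"# TYPE {self.name} counter"]
        for (path, method, status), value in sorted(self._counts.items()):
            lines.append(
                f'{self.name}{{path="{path}",method="{method}",'
                f'status="{status}"}} {value}'
            )
        return "\n".join(lines) + "\n"


counter = RequestCounter()
counter.observe("/api/SearchArtist", "GET", 200)
counter.observe("/api/SearchArtist", "GET", 200)
counter.observe("/api/SearchTrack", "GET", 500)
print(counter.render())
```

Using the raw request path as a label is fine for a handful of fixed routes like the ones listed in this document, but it would need normalizing if routes ever carried identifiers.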

Testing Strategy

Current State: No meaningful tests
Test Framework: xUnit configured but unused
Coverage: 0%
CI Integration: Tests not run in pipeline

This is a significant gap for production readiness.

License Implications

GPL-3.0 is a copyleft license requiring:

Source code disclosure for derivative works
Same license for modifications
Patent grant to users

Impact on integration:

Cannot incorporate code into proprietary systems without GPL compliance
Can use as separate service (API boundary preserves license isolation)
Database schema and API patterns can inspire clean-room implementations

Relevance to metadata-aggregator Project

High relevance - this is the closest existing implementation to our goals:

Multi-provider aggregation - exactly our use case
Unified search API - provider-agnostic queries
Database schema design - proven model for multi-provider storage
Provider isolation - clean separation via repository pattern
Fuzzy search - pg_trgm implementation reference

Key learnings:

Repository-per-provider scales well
Dapper performs well for read-heavy metadata queries
Separate sync process (MiniMediaScanner) simplifies API
Provider=Any pattern enables cross-provider search
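One way to read the Provider=Any learning is as a simple fan-out: a named provider selects one repository, while "Any" runs the same query against all of them. The sketch below assumes that interpretation. The `search` function, the dictionary of search callables, and the sample results are hypothetical and do not reflect how the project actually merges or ranks results.

```python
from typing import Callable

SearchFn = Callable[[str], list[str]]


def search(
    provider: str, name: str, repositories: dict[str, SearchFn]
) -> dict[str, list[str]]:
    """Query one provider, or every provider when `provider` is "Any"."""
    if provider.casefold() == "any":
        selected = repositories
    else:
        selected = {
            key: fn
            for key, fn in repositories.items()
            if key.casefold() == provider.casefold()
        }
        if not selected:
            raise ValueError(f"unknown provider: {provider}")
    return {key: fn(name) for key, fn in selected.items()}


# Stand-in search functions; real ones would query provider-specific tables.
repositories: dict[str, SearchFn] = {
    "Spotify": lambda name: [f"{name} (spotify)"],
    "MusicBrainz": lambda name: [f"{name} (musicbrainz)"],
}

print(search("Any", "example", repositories))
print(search("spotify", "example", repositories))
```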

Gaps to address:

Add comprehensive testing
Implement authentication/authorization
Add caching layer for performance
Health checks for production readiness
API versioning for evolution
Rate limiting for abuse prevention
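For the caching gap, a small time-to-live cache in front of the repositories is one possible starting point, since the underlying data only changes when the sync process runs. The sketch below is an assumption about how such a layer could look, not something the analyzed project contains; `TtlCache`, `get_or_load`, and the injectable clock are hypothetical names.

```python
import time
from typing import Any, Callable, Hashable


class TtlCache:
    """Caches loader results per key until `ttl_seconds` have elapsed."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: dict[Hashable, tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value


# Usage with a fake clock and a loader that counts how often it runs.
fake_time = [0.0]
calls = []


def load_artist() -> str:
    calls.append(1)
    return "result"


cache = TtlCache(ttl_seconds=60, clock=lambda: fake_time[0])
cache.get_or_load(("artist", "example"), load_artist)
cache.get_or_load(("artist", "example"), load_artist)  # served from cache
fake_time[0] = 61.0
cache.get_or_load(("artist", "example"), load_artist)  # expired, reloads
print(len(calls))  # 2
```

This version is not thread-safe and never evicts old keys, so a real deployment would need locking and a size bound.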

Project Maturity Assessment

Strengths:

Clean architecture
Multiple providers working
Lightweight and performant
Good separation of concerns

Weaknesses:

Single maintainer risk
No test coverage
Missing production hardening (auth, rate limiting, health checks)
Schema coupling with external project
Limited observability

Maturity Level: Early production / Advanced prototype

Suitable for internal use or as a reference implementation. Needs hardening before public deployment.
