MiniMediaMetadataAPI - Codebase Analysis

Project Structure

MiniMediaMetadataAPI/
├── .github/
│   └── workflows/
│       └── docker-image.yml          # CI/CD pipeline
├── MiniMediaMetadataAPI/             # Web API project
│   ├── Controllers/
│   │   ├── SearchArtistController.cs
│   │   ├── SearchAlbumController.cs
│   │   ├── SearchTrackController.cs
│   │   └── SearchController.cs       # Stub
│   ├── Middlewares/
│   │   └── RequestMiddleware.cs      # Prometheus metrics
│   ├── Options/
│   │   └── PrometheusOptions.cs
│   ├── appsettings.json
│   ├── appsettings.Development.json
│   ├── Program.cs                    # Entry point
│   └── MiniMediaMetadataAPI.csproj
├── MiniMediaMetadataAPI.Application/ # Business logic project
│   ├── Configurations/
│   │   └── DatabaseConfiguration.cs
│   ├── Enums/
│   │   ├── ProviderType.cs
│   │   └── SearchResultType.cs
│   ├── Helpers/
│   │   ├── StringHelper.cs
│   │   └── DiscogsHelper.cs
│   ├── Models/
│   │   ├── Database/                 # 60+ provider-specific models
│   │   │   ├── Deezer/
│   │   │   ├── Discogs/
│   │   │   ├── MusicBrainz/
│   │   │   ├── SoundCloud/
│   │   │   ├── Spotify/
│   │   │   └── Tidal/
│   │   └── Entities/                 # API response models
│   │       ├── SearchArtistEntity.cs
│   │       ├── SearchAlbumEntity.cs
│   │       ├── SearchTrackEntity.cs
│   │       ├── ArtistImageEntity.cs
│   │       ├── AlbumImageEntity.cs
│   │       └── TrackImageEntity.cs
│   ├── Repositories/
│   │   ├── SpotifyRepository.cs
│   │   ├── TidalRepository.cs
│   │   ├── MusicBrainzRepository.cs
│   │   ├── DeezerRepository.cs
│   │   ├── DiscogsRepository.cs
│   │   ├── SoundCloudRepository.cs
│   │   └── JobRepository.cs          # Unused
│   ├── Services/
│   │   ├── SearchArtistService.cs
│   │   ├── SearchAlbumService.cs
│   │   └── SearchTrackService.cs
│   └── MiniMediaMetadataAPI.Application.csproj
├── MiniMediaMetadataAPI.Tests/       # Test project (empty)
│   ├── UnitTest1.cs                  # Stub
│   └── MiniMediaMetadataAPI.Tests.csproj
├── Dockerfile
├── compose.yaml
├── .gitignore
├── README.md
└── MiniMediaMetadataAPI.sln

Total Files: 99 C# files

Lines of Code: ~15,000 (estimated)

Configuration Files

appsettings.json

Location: MiniMediaMetadataAPI/appsettings.json

Full Configuration:

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*",
  "DatabaseConfiguration": {
    "ConnectionString": "Host=localhost;Database=minimediametadata;Username=postgres;Password=postgres;MinPoolSize=5;MaxPoolSize=100"
  },
  "Prometheus": {
    "MetricsUrl": "/metrics"
  }
}

Configuration Sections:

Logging

  • Default: Information level (Info, Warning, Error, Critical)
  • Microsoft.AspNetCore: Warning level (reduces framework noise)
  • Output: Console (Docker logs)

AllowedHosts

  • Value: * (all hosts allowed)
  • Security Risk: No host header validation
  • Recommendation: Specify allowed domains in production

DatabaseConfiguration

  • Host: PostgreSQL server hostname
  • Database: Database name
  • Username/Password: Credentials (plain text, NOT SECURE)
  • MinPoolSize: the pool keeps at least 5 connections open
  • MaxPoolSize: the pool is capped at 100 connections

Prometheus

  • MetricsUrl: Endpoint path for metrics (/metrics)

appsettings.Development.json

Location: MiniMediaMetadataAPI/appsettings.Development.json

Configuration:

{
  "Logging": {
    "LogLevel": {
      "Default": "Debug",
      "Microsoft.AspNetCore": "Information"
    }
  }
}

Changes from Base:

  • Debug logging enabled (verbose)
  • Framework logging at Information level

No Production Override: appsettings.Production.json not included

Entry Point

Program.cs

Location: MiniMediaMetadataAPI/Program.cs

Full Implementation:

using MiniMediaMetadataAPI.Application.Configurations;
using MiniMediaMetadataAPI.Application.Repositories;
using MiniMediaMetadataAPI.Application.Services;
using MiniMediaMetadataAPI.Middlewares;
using Prometheus;

var builder = WebApplication.CreateBuilder(args);

// Configuration
builder.Services.Configure<DatabaseConfiguration>(
    builder.Configuration.GetSection("DatabaseConfiguration"));

// Repositories
builder.Services.AddScoped<ISpotifyRepository, SpotifyRepository>();
builder.Services.AddScoped<ITidalRepository, TidalRepository>();
builder.Services.AddScoped<IMusicBrainzRepository, MusicBrainzRepository>();
builder.Services.AddScoped<IDeezerRepository, DeezerRepository>();
builder.Services.AddScoped<IDiscogsRepository, DiscogsRepository>();
builder.Services.AddScoped<ISoundCloudRepository, SoundCloudRepository>();
builder.Services.AddScoped<IJobRepository, JobRepository>();

// Services
builder.Services.AddScoped<ISearchArtistService, SearchArtistService>();
builder.Services.AddScoped<ISearchAlbumService, SearchAlbumService>();
builder.Services.AddScoped<ISearchTrackService, SearchTrackService>();

// Controllers
builder.Services.AddControllers();

// Swagger
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();

// Swagger (all environments)
app.UseSwagger();
app.UseSwaggerUI();

// HTTPS redirection (COMMENTED OUT)
// app.UseHttpsRedirection();

// Prometheus middleware
app.UseMiddleware<RequestMiddleware>();

// Authorization (not configured)
app.UseAuthorization();

// Map controllers
app.MapControllers();

// Prometheus metrics endpoint
app.MapMetrics();

app.Run();

Dependency Injection:

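  • Scoped lifetime: One instance per HTTP request
  • No Singleton services: No shared state in the container (the only static state is the Prometheus counter in RequestMiddleware)
  • No Transient services: Every registration is scoped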

Middleware Pipeline:

  1. Swagger (documentation)
  2. RequestMiddleware (Prometheus metrics)
  3. Authorization (no-op, no auth configured)
  4. Controllers (endpoint routing)

Missing Middleware:

  • HTTPS redirection (commented out)
  • CORS (not configured)
  • Authentication (not configured)
  • Rate limiting (not configured)
  • Exception handling (uses ASP.NET Core default)

Swagger in Production: Enabled in all environments (security risk)

Controllers

SearchArtistController

Location: MiniMediaMetadataAPI/Controllers/SearchArtistController.cs

Implementation:

using Microsoft.AspNetCore.Mvc;
using MiniMediaMetadataAPI.Application.Enums;
using MiniMediaMetadataAPI.Application.Helpers;
using MiniMediaMetadataAPI.Application.Services;

namespace MiniMediaMetadataAPI.Controllers;

[ApiController]
[Route("api/[controller]")]
public class SearchArtistController : ControllerBase
{
    private readonly ISearchArtistService _searchArtistService;
    private readonly ILogger<SearchArtistController> _logger;

    public SearchArtistController(
        ISearchArtistService searchArtistService,
        ILogger<SearchArtistController> logger)
    {
        _searchArtistService = searchArtistService;
        _logger = logger;
    }

    [HttpGet]
    public async Task<IActionResult> Get(
        [FromQuery] string? Id = null,
        [FromQuery] string? Name = null,
        [FromQuery] ProviderType Provider = ProviderType.Any,
        [FromQuery] int Offset = 0)
    {
        // Input sanitization
        Name = StringHelper.RemoveControlChars(Name);

        // Search by ID or Name
        if (!string.IsNullOrEmpty(Id))
        {
            var result = await _searchArtistService.GetArtistById(Id, Provider);
            return Ok(result);
        }
        else if (!string.IsNullOrEmpty(Name))
        {
            var result = await _searchArtistService.SearchArtist(Name, Provider, Offset);
            return Ok(result);
        }
        else
        {
            return BadRequest("Either Id or Name must be provided");
        }
    }
}

Features:

  • Optional parameters (Id, Name, Provider, Offset)
  • Input sanitization (control character removal)
  • Conditional logic (ID vs Name search)
  • Returns 200 OK whenever Id or Name is supplied, even if the lookup fails or finds nothing; 400 Bad Request only when both are missing

Missing:

  • Input validation (length, format)
  • Rate limiting
  • Authentication
  • Proper HTTP status codes (404 for not found)
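
One way to get proper status codes is to derive them from the SearchResultType the service already returns. The sketch below is not from the source: SearchStatusMapper is a hypothetical helper, and mapping InQueueSync to 202 Accepted is an assumption.

```csharp
using System;

// Mirrors the enum shown in the Enums section of this document.
public enum SearchResultType { Ok = 0, NotFound = 1, InQueueSync = 2 }

public static class SearchStatusMapper
{
    // Hypothetical helper: picks an HTTP status code for a service result.
    // InQueueSync -> 202 is an assumption, not behavior found in the source.
    public static int ToHttpStatus(SearchResultType resultType) => resultType switch
    {
        SearchResultType.Ok => 200,
        SearchResultType.NotFound => 404,
        SearchResultType.InQueueSync => 202,
        _ => throw new ArgumentOutOfRangeException(
            nameof(resultType), resultType, "Unknown result type")
    };
}
```

A controller could then return StatusCode(SearchStatusMapper.ToHttpStatus(result.SearchResultType), result) instead of Ok(result). Note that the service layer also reports NotFound when an exception is swallowed, so a 404 produced this way would still hide backend failures.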

SearchAlbumController

Location: MiniMediaMetadataAPI/Controllers/SearchAlbumController.cs

Similar Structure:

[HttpGet]
public async Task<IActionResult> Get(
    [FromQuery] string? AlbumId = null,
    [FromQuery] string? ArtistId = null,
    [FromQuery] string? AlbumName = null,
    [FromQuery] ProviderType Provider = ProviderType.Any,
    [FromQuery] int Offset = 0)
{
    AlbumName = StringHelper.RemoveControlChars(AlbumName);

    if (!string.IsNullOrEmpty(AlbumId))
    {
        var result = await _searchAlbumService.GetAlbumById(AlbumId, Provider);
        return Ok(result);
    }
    else if (!string.IsNullOrEmpty(AlbumName) || !string.IsNullOrEmpty(ArtistId))
    {
        var result = await _searchAlbumService.SearchAlbum(AlbumName, ArtistId, Provider, Offset);
        return Ok(result);
    }
    else
    {
        return BadRequest("Either AlbumId, AlbumName, or ArtistId must be provided");
    }
}

Additional Feature: Search accepts AlbumName, ArtistId, or both (the check is an OR, so either one is enough)

SearchTrackController

Location: MiniMediaMetadataAPI/Controllers/SearchTrackController.cs

Similar Structure:

[HttpGet]
public async Task<IActionResult> Get(
    [FromQuery] string? TrackId = null,
    [FromQuery] string? ArtistId = null,
    [FromQuery] string? TrackName = null,
    [FromQuery] ProviderType Provider = ProviderType.Any,
    [FromQuery] int Offset = 0)
{
    TrackName = StringHelper.RemoveControlChars(TrackName);

    if (!string.IsNullOrEmpty(TrackId))
    {
        var result = await _searchTrackService.GetTrackById(TrackId, Provider);
        return Ok(result);
    }
    else if (!string.IsNullOrEmpty(TrackName) || !string.IsNullOrEmpty(ArtistId))
    {
        var result = await _searchTrackService.SearchTrack(TrackName, ArtistId, Provider, Offset);
        return Ok(result);
    }
    else
    {
        return BadRequest("Either TrackId, TrackName, or ArtistId must be provided");
    }
}

SearchController

Location: MiniMediaMetadataAPI/Controllers/SearchController.cs

Stub Implementation:

[ApiController]
[Route("api/[controller]")]
public class SearchController : ControllerBase
{
    [HttpGet]
    public IActionResult Get()
    {
        return Ok("Search endpoint - not implemented");
    }
}

Status: Placeholder only

Intended Purpose: Unified search across artists, albums, tracks (speculation)

Middleware

RequestMiddleware

Location: MiniMediaMetadataAPI/Middlewares/RequestMiddleware.cs

Full Implementation:

using Prometheus;

namespace MiniMediaMetadataAPI.Middlewares;

public class RequestMiddleware
{
    private readonly RequestDelegate _next;
    private static readonly Counter RequestCounter = Metrics.CreateCounter(
        "minimediametadataapi_request_total",
        "Total HTTP requests",
        new CounterConfiguration
        {
            LabelNames = new[] { "path", "method", "status" }
        });

    public RequestMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        await _next(context);

        RequestCounter
            .WithLabels(
                context.Request.Path,
                context.Request.Method,
                context.Response.StatusCode.ToString())
            .Inc();
    }
}

Purpose: Prometheus metrics collection

Metrics:

  • Name: minimediametadataapi_request_total
  • Type: Counter
  • Labels: path, method, status

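Execution: After the rest of the pipeline completes. If a downstream component throws, the increment is skipped, so requests that end in an unhandled exception are not counted.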

Missing Metrics:

  • Request duration
  • Database query time
  • Error rates (only derivable from the status label, and unhandled exceptions are missed)
  • Active requests
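
Request duration and failures could be captured by wrapping the downstream call in a try/finally. The sketch below uses only the base class library. RequestTimer and its record callback are hypothetical names; in the middleware, next would be () => _next(context) and record would feed a Prometheus histogram or counter.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class RequestTimer
{
    // Runs `next` and always reports the elapsed time plus a failure flag,
    // even when `next` throws. The exception is rethrown unchanged.
    public static async Task MeasureAsync(Func<Task> next, Action<TimeSpan, bool> record)
    {
        var stopwatch = Stopwatch.StartNew();
        var failed = false;
        try
        {
            await next();
        }
        catch
        {
            failed = true;
            throw;
        }
        finally
        {
            stopwatch.Stop();
            record(stopwatch.Elapsed, failed);
        }
    }
}
```

Because the callback runs in the finally block, requests that fail with an exception are still measured, which the current post-await increment does not guarantee.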

Enums

ProviderType

Location: MiniMediaMetadataAPI.Application/Enums/ProviderType.cs

Definition:

namespace MiniMediaMetadataAPI.Application.Enums;

public enum ProviderType
{
    Any = 0,
    Deezer = 1,
    Discogs = 2,
    MusicBrainz = 3,
    Spotify = 4,
    Tidal = 5,
    SoundCloud = 6
}

Usage:

  • Query parameter in API endpoints
  • Filter for provider-specific searches
  • Any triggers multi-provider search

SearchResultType

Location: MiniMediaMetadataAPI.Application/Enums/SearchResultType.cs

Definition:

namespace MiniMediaMetadataAPI.Application.Enums;

public enum SearchResultType
{
    Ok = 0,
    NotFound = 1,
    InQueueSync = 2
}

Usage:

  • Response status indicator
  • Ok - Results found
  • NotFound - No results
  • InQueueSync - Data sync in progress (UNUSED)

Issue: InQueueSync never returned (dead code)

Helpers

StringHelper

Location: MiniMediaMetadataAPI.Application/Helpers/StringHelper.cs

Full Implementation:

using System.Text.RegularExpressions;

namespace MiniMediaMetadataAPI.Application.Helpers;

public static class StringHelper
{
    public static string RemoveControlChars(string? input)
    {
        if (string.IsNullOrEmpty(input))
            return input ?? string.Empty;

        // Remove control characters (0x00-0x1F, 0x7F-0x9F)
        return Regex.Replace(input, @"[\x00-\x1F\x7F-\x9F]", string.Empty);
    }

    public static string RemoveEmojis(string? input)
    {
        if (string.IsNullOrEmpty(input))
            return input ?? string.Empty;

        // Remove UTF-16 surrogate code units (strips all non-BMP characters, which includes most emojis)
        return Regex.Replace(input, @"\p{Cs}", string.Empty);
    }
}

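RemoveControlChars:

  • Removes C0 and C1 control characters (U+0000-U+001F and U+007F-U+009F), including tab, carriage return, and line feed
  • Keeps stray control bytes out of queries and logs; it is not an injection defense on its own (SQL injection is handled by parameterized queries)
  • Applied to the Name, AlbumName, and TrackName parameters only; Id, AlbumId, TrackId, and ArtistId are passed through unsanitized

RemoveEmojis:

  • Removes UTF-16 surrogate code units, i.e. every character outside the Basic Multilingual Plane; this covers most emojis but also other supplementary-plane characters, and leaves BMP emojis such as U+263A in place
  • NOT used in API (only in MiniMediaScanner)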
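
The exact effect of the two patterns is easy to check in isolation. The sketch below reuses the regular expressions from StringHelper; the SanitizerBehavior class and its method names are hypothetical and exist only for this demonstration.

```csharp
using System.Text.RegularExpressions;

public static class SanitizerBehavior
{
    // Same pattern as StringHelper.RemoveControlChars in the source.
    public static string StripControl(string input) =>
        Regex.Replace(input, @"[\x00-\x1F\x7F-\x9F]", string.Empty);

    // Same pattern as StringHelper.RemoveEmojis in the source.
    public static string StripSurrogates(string input) =>
        Regex.Replace(input, @"\p{Cs}", string.Empty);
}
```

StripControl("AC\tDC\n") yields "ACDC", since tab and line feed are in the removed range. StripSurrogates("ok \U0001F600 \u263A") drops the surrogate pair for U+1F600 but keeps U+263A, because that character sits inside the Basic Multilingual Plane.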

DiscogsHelper

Location: MiniMediaMetadataAPI.Application/Helpers/DiscogsHelper.cs

Full Implementation:

namespace MiniMediaMetadataAPI.Application.Helpers;

public static class DiscogsHelper
{
    public static int GetDiscNumber(string? position)
    {
        if (string.IsNullOrEmpty(position))
            return 1;

        var parts = position.Split('-');
        return parts.Length > 0 && int.TryParse(parts[0], out var disc)
            ? disc
            : 1;
    }

    public static int GetTrackNumber(string? position)
    {
        if (string.IsNullOrEmpty(position))
            return 0;

        var parts = position.Split('-');
        return parts.Length > 1 && int.TryParse(parts[1], out var track)
            ? track
            : 0;
    }
}

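Purpose: Parse Discogs position format ("2-5" → disc 2, track 5)

Limitations:

  • Only handles the hyphenated numeric format; vinyl sides like "A1" fall back to disc 1, track 0
  • A position without a hyphen, such as "5", is read as disc 5, track 0 rather than disc 1, track 5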
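
A more tolerant parser could cover plain track numbers and vinyl sides as well. The following is a minimal sketch under stated assumptions, not the project's code: DiscogsPosition and DiscogsPositionParser are hypothetical names, sides A/B are assumed to belong to disc 1 and C/D to disc 2, and the track number for vinyl is taken to be side-relative.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical result type: Side is null for purely numeric positions.
public readonly record struct DiscogsPosition(int Disc, int Track, char? Side);

public static class DiscogsPositionParser
{
    // "5" or "2-5"; digit runs are capped at 4 so int.Parse cannot overflow.
    private static readonly Regex Numeric = new(@"^(?:([0-9]{1,4})-)?([0-9]{1,4})$");
    // "A1", "b12": one side letter followed by a side-relative track number.
    private static readonly Regex Vinyl = new(@"^([A-Za-z])([0-9]{1,4})$");

    public static DiscogsPosition Parse(string? position)
    {
        // Same defaults as the original helper: disc 1, track 0.
        if (string.IsNullOrWhiteSpace(position))
            return new DiscogsPosition(1, 0, null);

        var text = position.Trim();

        var match = Numeric.Match(text);
        if (match.Success)
        {
            // A missing disc prefix means a single-disc release.
            var disc = match.Groups[1].Success ? int.Parse(match.Groups[1].Value) : 1;
            return new DiscogsPosition(disc, int.Parse(match.Groups[2].Value), null);
        }

        match = Vinyl.Match(text);
        if (match.Success)
        {
            var side = char.ToUpperInvariant(match.Groups[1].Value[0]);
            // Assumption: two sides per disc (A/B -> 1, C/D -> 2, ...).
            var disc = (side - 'A') / 2 + 1;
            return new DiscogsPosition(disc, int.Parse(match.Groups[2].Value), side);
        }

        // Any other format falls back to the defaults.
        return new DiscogsPosition(1, 0, null);
    }
}
```

With these rules, "2-5" parses to disc 2, track 5; "5" to disc 1, track 5; and "C3" to disc 2, track 3 on side C.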

Error Handling

Repository Level

Pattern:

public async Task<List<SearchArtistEntity>> SearchArtist(string name, int offset)
{
    try
    {
        using var connection = new NpgsqlConnection(_connectionString);
        var results = await connection.QueryAsync<SpotifyArtist>(sql, parameters);
        return results.Select(MapToEntity).ToList();
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error searching Spotify artists for term: {SearchTerm}", name);
        return new List<SearchArtistEntity>();
    }
}

Strategy:

  • Catch all exceptions
  • Log error with context
  • Return empty list (no error propagation)

Issues:

  • Client can't distinguish error from no results
  • No retry logic
  • No specific exception handling
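
The first issue could be addressed by returning a result object that separates "failed" from "empty". The sketch below is one possible shape, not the project's code: ProviderResult and SafeQuery are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical wrapper: carries either items or a failure reason.
public sealed record ProviderResult<T>(IReadOnlyList<T> Items, bool Failed, string? Error)
{
    public static ProviderResult<T> Success(IReadOnlyList<T> items) => new(items, false, null);
    public static ProviderResult<T> Failure(string error) => new(Array.Empty<T>(), true, error);
}

public static class SafeQuery
{
    // Runs a query and converts any exception into a Failure result,
    // so callers can tell an outage apart from an empty result set.
    public static async Task<ProviderResult<T>> RunAsync<T>(Func<Task<IReadOnlyList<T>>> query)
    {
        try
        {
            return ProviderResult<T>.Success(await query());
        }
        catch (Exception ex)
        {
            // A real repository would also log here, as the source does.
            return ProviderResult<T>.Failure(ex.Message);
        }
    }
}
```

A service receiving these results could report a distinct status when every provider failed, instead of folding failures into NotFound.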

Service Level

Pattern:

public async Task<SearchArtistResponse> SearchArtist(string name, ProviderType provider, int offset)
{
    try
    {
        // Orchestrate repositories
        var results = await GetResults(name, provider, offset);
        return new SearchArtistResponse
        {
            SearchResultType = results.Any() ? SearchResultType.Ok : SearchResultType.NotFound,
            Artists = results
        };
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error in SearchArtistService for term: {SearchTerm}", name);
        return new SearchArtistResponse
        {
            SearchResultType = SearchResultType.NotFound,
            Artists = new List<SearchArtistEntity>()
        };
    }
}

Strategy: Same as repository (swallow errors, return empty)

Controller Level

Pattern:

[HttpGet]
public async Task<IActionResult> Get(...)
{
    // No try-catch - relies on ASP.NET Core default error handling
    var result = await _searchArtistService.SearchArtist(name, provider, offset);
    return Ok(result);
}

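Strategy: No error handling. Because the repository and service layers already catch everything, exceptions rarely reach the controller; anything that does falls through to the framework.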

ASP.NET Core Default:

  • 500 Internal Server Error for unhandled exceptions
  • Generic error response (no details in production)
  • Stack trace hidden in production

Logging

Configuration

appsettings.json:

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  }
}

Log Levels:

  • Trace (0) - Most verbose
  • Debug (1) - Development details
  • Information (2) - General flow
  • Warning (3) - Unexpected but handled
  • Error (4) - Failures
  • Critical (5) - Fatal errors

Current Levels:

  • Application: Information
  • Framework: Warning

Usage Patterns

Repository Logging:

_logger.LogError(ex, "Error searching Spotify artists for term: {SearchTerm}", name);

Service Logging:

_logger.LogError(ex, "Error in SearchArtistService for term: {SearchTerm}", name);

No Info/Debug Logging: Only errors logged

Missing:

  • Request/response logging
  • Performance logging
  • Business event logging
  • Structured (JSON) log output, e.g. via Serilog; the message templates already use named placeholders such as {SearchTerm}

Log Output

Destination: Console (Docker logs)

Format: Plain text (not JSON)

Example (illustrative; application code emits only error entries):

fail: MiniMediaMetadataAPI.Application.Repositories.SpotifyRepository[0]
      Error searching Spotify artists for term: Beatles
      Npgsql.NpgsqlException: Connection refused

No Correlation IDs: Can't trace requests across logs

Testing

Test Project

Location: MiniMediaMetadataAPI.Tests/

Framework: xUnit

Current State: Empty stub

UnitTest1.cs:

namespace MiniMediaMetadataAPI.Tests;

public class UnitTest1
{
    [Fact]
    public void Test1()
    {
        // Empty test
    }
}

Coverage: 0%

CI/CD Integration: Tests not run in pipeline

Missing Tests

Unit Tests:

  • Repository methods
  • Service orchestration
  • Helper functions
  • Enum conversions

Integration Tests:

  • Database queries
  • Multi-provider aggregation
  • Error handling

API Tests:

  • Controller endpoints
  • Input validation
  • Response formats

Performance Tests:

  • Load testing
  • Stress testing
  • Fuzzy search performance

Code Quality

Naming Conventions

Consistent:

  • PascalCase for classes, methods, properties
  • camelCase for parameters, local variables
  • Prefix interfaces with I
  • Suffix repositories with Repository
  • Suffix services with Service

Examples:

public interface ISpotifyRepository { }
public class SpotifyRepository : ISpotifyRepository { }
public class SearchArtistService : ISearchArtistService { }

Code Organization

Good:

  • Clear separation of concerns (controllers, services, repositories)
  • Provider isolation (one repository per provider)
  • Consistent file structure

Issues:

  • Large repository files (500+ lines)
  • No partial classes for large models
  • Helpers in single file (could split)

Documentation

XML Comments: None

README: Basic setup instructions

API Documentation: Swagger only (no additional docs)

Missing:

  • Code comments
  • Architecture documentation
  • Deployment guide
  • Troubleshooting guide

Code Smells

Unused Dependencies:

  • Quartz (referenced but no jobs defined)
  • Polly (referenced but no policies defined)
  • FuzzySharp (referenced but not used)
  • SpotifyAPI.Web.Auth (referenced but not used)

Magic Numbers:

  • Hardcoded page size (20)
  • Hardcoded similarity threshold (0.5)
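  • Connection pool sizes (5, 100) are fixed inside the connection string rather than exposed as separate settings

Recommendation: Move page size and similarity threshold to configuration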
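
A small options class is one way to hold these values. The sketch below is an illustration only: SearchOptions, its property names, and the allowed ranges are assumptions, with defaults set to the values currently hardcoded.

```csharp
using System;

// Hypothetical options class; names and ranges are illustrative.
public sealed class SearchOptions
{
    public int PageSize { get; set; } = 20;                 // current hardcoded page size
    public double SimilarityThreshold { get; set; } = 0.5;  // current hardcoded threshold

    // Fails fast on values that make no sense for a search query.
    public void Validate()
    {
        if (PageSize < 1 || PageSize > 100)
            throw new ArgumentOutOfRangeException(
                nameof(PageSize), PageSize, "PageSize must be between 1 and 100");

        // Written as a negated range check so that NaN is rejected too.
        if (!(SimilarityThreshold >= 0.0 && SimilarityThreshold <= 1.0))
            throw new ArgumentOutOfRangeException(
                nameof(SimilarityThreshold), SimilarityThreshold,
                "SimilarityThreshold must be between 0 and 1");
    }
}
```

Such a class could be bound the same way Program.cs binds DatabaseConfiguration, for example with builder.Services.Configure<SearchOptions>(builder.Configuration.GetSection("Search")), where the section name "Search" is hypothetical.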

Swallowed Exceptions:

  • All exceptions caught and logged
  • No error propagation
  • Client can't distinguish errors from no results

No Async Suffix:

  • Methods return Task<T> but don't end with Async
  • Violates .NET naming conventions

Example:

// Current
Task<List<SearchArtistEntity>> SearchArtist(string name, int offset);

// Recommended
Task<List<SearchArtistEntity>> SearchArtistAsync(string name, int offset);

Security Analysis

Authentication

Status: None

Implications:

  • Fully open API
  • No user identification
  • No access control

Authorization

Status: None (middleware registered but not configured)

Code:

app.UseAuthorization(); // No-op without authentication

Input Validation

Implemented:

  • Control character removal
  • Null/empty checks

Missing:

  • Length validation
  • Format and range validation (ID format, non-negative Offset)

Covered elsewhere:

  • SQL injection (parameterized queries via Dapper)
  • XSS (responses are JSON-serialized, not rendered as HTML)
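
The missing checks are small enough to sketch without any framework support. QueryValidator and the 200-character limit below are hypothetical; the source defines neither.

```csharp
using System;

public static class QueryValidator
{
    // Hypothetical limit; the source does not define one.
    public const int MaxNameLength = 200;

    // Returns null when the input is acceptable, otherwise a message
    // suitable for a 400 Bad Request body.
    public static string? Validate(string? name, int offset)
    {
        if (offset < 0)
            return "Offset must not be negative";

        if (name is not null && name.Length > MaxNameLength)
            return $"Name must be at most {MaxNameLength} characters";

        return null;
    }
}
```

A controller could call this right after RemoveControlChars and return BadRequest(error) when the result is not null.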

HTTPS

Status: Disabled

Code:

// app.UseHttpsRedirection(); // COMMENTED OUT

Implication: Plain text traffic (expects reverse proxy)

Secrets Management

Status: Plain text in configuration

Issue:

{
  "DatabaseConfiguration": {
    "ConnectionString": "Host=localhost;Database=minimediametadata;Username=postgres;Password=postgres"
  }
}

Recommendation: Environment variables or secrets manager
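
A minimal sketch of the environment-variable approach, assuming a hypothetical variable name MINIMEDIA_DB_CONNECTION and a hypothetical ConnectionStringResolver helper (neither exists in the source):

```csharp
using System;

public static class ConnectionStringResolver
{
    // Hypothetical variable name, chosen for this example only.
    public const string VariableName = "MINIMEDIA_DB_CONNECTION";

    // Prefers the environment variable and falls back to the value
    // read from appsettings.json; throws if neither is set.
    public static string Resolve(string? fromConfigFile)
    {
        var fromEnvironment = Environment.GetEnvironmentVariable(VariableName);
        if (!string.IsNullOrWhiteSpace(fromEnvironment))
            return fromEnvironment;

        if (!string.IsNullOrWhiteSpace(fromConfigFile))
            return fromConfigFile;

        throw new InvalidOperationException(
            $"No connection string configured; set {VariableName}.");
    }
}
```

ASP.NET Core's default configuration already layers environment variables over appsettings.json, so setting DatabaseConfiguration__ConnectionString in the container environment overrides the JSON value without any code change. An explicit resolver like the one above is only needed for a custom variable name or a hard failure when nothing is set.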

CORS

Status: Not configured

Implication: Cross-origin requests from browser clients are blocked by default; same-origin pages and non-browser clients are unaffected

Rate Limiting

Status: Not implemented

Implication: Vulnerable to abuse
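
ASP.NET Core 7 and later ship a rate limiting middleware (AddRateLimiter / UseRateLimiter), which would be the usual choice. To show the underlying idea, the sketch below implements a fixed-window counter using only the base class library. FixedWindowLimiter is a hypothetical class, and the key would typically be a client IP or API key.

```csharp
using System;
using System.Collections.Generic;

public sealed class FixedWindowLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Dictionary<string, (DateTimeOffset Start, int Count)> _windows = new();
    private readonly object _gate = new();

    public FixedWindowLimiter(int limit, TimeSpan window)
    {
        if (limit < 1) throw new ArgumentOutOfRangeException(nameof(limit));
        if (window <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(window));
        _limit = limit;
        _window = window;
    }

    // Returns true if the caller identified by `key` may proceed at time `now`.
    public bool TryAcquire(string key, DateTimeOffset now)
    {
        lock (_gate)
        {
            // Start a new window on first use or once the old one has expired.
            if (!_windows.TryGetValue(key, out var state) || now - state.Start >= _window)
            {
                _windows[key] = (now, 1);
                return true;
            }

            if (state.Count >= _limit)
                return false;

            _windows[key] = (state.Start, state.Count + 1);
            return true;
        }
    }
}
```

Passing the current time as a parameter keeps the class deterministic under test. This sketch never evicts idle keys, so a production version would need cleanup or a bounded cache.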

Performance Considerations

Database Queries

Optimized:

  • Parameterized queries (Dapper)
  • Connection pooling
  • Fuzzy search with GIN indexes

Not Optimized:

  • No query result caching
  • No prepared statements
  • No query batching

Parallel Execution

Implemented:

var tasks = new[]
{
    _spotify.SearchArtist(name, offset),
    _tidal.SearchArtist(name, offset),
    // ... 4 more providers
};
var results = await Task.WhenAll(tasks);

Benefit: Multi-provider search in parallel (not sequential)
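
Task.WhenAll faults if any of its tasks faults, so the fan-out only works because each repository swallows its own exceptions. The sketch below makes that isolation explicit and flattens the per-provider lists. ProviderFanOut is a hypothetical helper, not code from the project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ProviderFanOut
{
    // Starts every provider query, waits for all of them, and merges the
    // results in provider order. A failing provider contributes no items.
    public static async Task<List<T>> SearchAllAsync<T>(
        IEnumerable<Func<Task<List<T>>>> providers)
    {
        var tasks = providers.Select(provider => RunIsolated(provider)).ToArray();
        var perProvider = await Task.WhenAll(tasks);
        return perProvider.SelectMany(items => items).ToList();
    }

    // Converts a provider failure into an empty list so that one outage
    // does not fail the combined search.
    private static async Task<List<T>> RunIsolated<T>(Func<Task<List<T>>> provider)
    {
        try
        {
            return await provider();
        }
        catch (Exception)
        {
            return new List<T>();
        }
    }
}
```

Task.WhenAll returns results in the order of the input tasks, so the merged list stays grouped by provider regardless of which query finishes first.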

Memory Usage

Efficient:

  • Scoped services (per-request lifetime)
  • No static state apart from the Prometheus counter in RequestMiddleware
  • Connection pooling

Potential Issues:

  • Large result sets: page size is fixed at 20, but Offset has no upper bound and an Any search merges results from all six providers
  • No streaming for large responses

Codebase Evaluation

Strengths:

  • Clean architecture (layers well-separated)
  • Consistent naming conventions
  • Provider isolation (repository pattern)
  • Parallel query execution
  • Input sanitization

Weaknesses:

  • No tests (0% coverage)
  • No authentication/authorization
  • Unused dependencies
  • Swallowed exceptions
  • No structured logging
  • No XML documentation
  • Magic numbers hardcoded
  • HTTPS disabled
  • No rate limiting

Maintainability: 7/10

  • Easy to understand
  • Clear structure
  • But lacks tests and documentation

Production Readiness: 5/10

  • Works but missing critical features
  • Security gaps
  • No observability beyond basic metrics

Recommendations:

  1. Add comprehensive tests
  2. Implement authentication
  3. Remove unused dependencies
  4. Add structured logging (Serilog)
  5. Move magic numbers to configuration
  6. Add XML documentation
  7. Implement proper error handling
  8. Enable HTTPS
  9. Add rate limiting
  10. Add health checks