metadata-agregator/docs/research/minimediametadataapi/analysis/INTEGRATIONS.md
Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00

MiniMediaMetadataAPI - Integration Analysis

Integration Philosophy

Critical Distinction: This API does NOT integrate with external provider APIs.

Data Source: Pre-populated PostgreSQL database
Sync Responsibility: MiniMediaScanner (separate project)
API Role: Query interface only

Architecture Overview

External Providers (Spotify, Tidal, etc.)
    ↓
MiniMediaScanner (separate project)
    ↓ (writes)
PostgreSQL Database
    ↓ (reads)
MiniMediaMetadataAPI (this project)
    ↓
API Clients

Separation of Concerns:

  • MiniMediaScanner: Provider API integration, authentication, rate limiting, data sync
  • MiniMediaMetadataAPI: Database queries, response formatting, API serving

Provider Integration Status

Spotify

Integration Type: None (data pre-populated)
Dependency: SpotifyAPI.Web.Auth 7.4.2 (UNUSED)

Why Dependency Exists:

  • Likely copied from MiniMediaScanner
  • Not removed during project split
  • Dead code / dependency bloat

Data Available:

  • Artists (with images, genres, popularity, followers)
  • Albums (with images, UPC, label, copyright)
  • Tracks (with ISRC, explicit flag, duration)

Data Sync: Handled by MiniMediaScanner via Spotify Web API

Authentication: Not needed in this API (MiniMediaScanner handles OAuth)

Tidal

Integration Type: None (data pre-populated)
Dependency: None

Data Available:

  • Artists (with image links)
  • Albums (with UPC, copyright, explicit flag)
  • Tracks (with ISRC, duration)

Data Sync: Handled by MiniMediaScanner via Tidal API

Authentication: Not needed in this API

MusicBrainz

Integration Type: None (data pre-populated)
Dependency: None

Data Available:

  • Artists (with sort name, type, country)
  • Releases (with barcode, status, packaging)
  • Labels (with hierarchy)
  • Tracks (with ISRC)

Data Sync: Handled by MiniMediaScanner via MusicBrainz API

Authentication: Not needed (MusicBrainz is open)

Deezer

Integration Type: None (data pre-populated)
Dependency: None

Data Available:

  • Artists (with image links, fans)
  • Albums (with genres, fans)
  • Tracks (with duration, explicit flag)

Data Sync: Handled by MiniMediaScanner via Deezer API

Authentication: Not needed in this API

Discogs

Integration Type: None (data pre-populated)
Dependency: None

Data Available:

  • Artists (with aliases, real names, profiles)
  • Releases (with identifiers, genres, styles)
  • Labels (with hierarchy, contact info)
  • Tracks (with disc/track numbers)

Data Sync: Handled by MiniMediaScanner via Discogs API

Authentication: Not needed in this API

SoundCloud

Integration Type: None (data pre-populated)
Dependency: None

Data Available:

  • Users (with avatars, follower counts)
  • Playlists (with artwork, track counts)
  • Tracks (with artwork, playback counts, genre)

Data Sync: Handled by MiniMediaScanner via SoundCloud API

Authentication: Not needed in this API

Repository Pattern Implementation

Interface Design

Each provider has a dedicated repository interface and implementation.

Example: ISpotifyRepository

public interface ISpotifyRepository
{
    Task<List<SearchArtistEntity>> SearchArtist(string name, int offset);
    Task<SearchArtistEntity> GetArtistById(string id);
    Task<List<SearchAlbumEntity>> SearchAlbum(string name, string artistId, int offset);
    Task<SearchAlbumEntity> GetAlbumById(string id);
    Task<List<SearchTrackEntity>> SearchTrack(string name, string artistId, int offset);
    Task<SearchTrackEntity> GetTrackById(string id);
}

Implementation: SpotifyRepository

public class SpotifyRepository : ISpotifyRepository
{
    private readonly string _connectionString;
    private readonly ILogger<SpotifyRepository> _logger;

    public SpotifyRepository(
        IOptions<DatabaseConfiguration> config,
        ILogger<SpotifyRepository> logger)
    {
        _connectionString = config.Value.ConnectionString;
        _logger = logger;
    }

    public async Task<List<SearchArtistEntity>> SearchArtist(string name, int offset)
    {
        try
        {
            using var connection = new NpgsqlConnection(_connectionString);
            
            var sql = @"
                SET LOCAL pg_trgm.similarity_threshold = 0.5;
                
                SELECT 
                    a.id,
                    a.name,
                    a.popularity,
                    a.external_url,
                    a.followers,
                    a.genres,
                    a.last_sync_time,
                    i.url AS image_url,
                    i.height AS image_height,
                    i.width AS image_width
                FROM spotify_artist a
                LEFT JOIN spotify_artist_image i ON a.id = i.artist_id
                WHERE lower(a.name) % lower(@searchTerm)
                ORDER BY similarity(lower(a.name), lower(@searchTerm)) DESC
                LIMIT 20 OFFSET @offset;
            ";
            
            var artistDict = new Dictionary<string, SearchArtistEntity>();
            
            await connection.QueryAsync<SpotifyArtist, SpotifyArtistImage, SearchArtistEntity>(
                sql,
                (artist, image) =>
                {
                    if (!artistDict.TryGetValue(artist.Id, out var entity))
                    {
                        entity = MapToEntity(artist);
                        artistDict.Add(artist.Id, entity);
                    }
                    
                    if (image != null)
                    {
                        entity.Images.Add(MapImageToEntity(image));
                    }
                    
                    return entity;
                },
                new { searchTerm = name, offset },
                splitOn: "image_url"
            );
            
            return artistDict.Values.ToList();
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error searching Spotify artists for term: {SearchTerm}", name);
            return new List<SearchArtistEntity>();
        }
    }

    private SearchArtistEntity MapToEntity(SpotifyArtist artist)
    {
        return new SearchArtistEntity
        {
            ProviderType = ProviderType.Spotify,
            Id = artist.Id,
            Name = artist.Name,
            Popularity = artist.Popularity,
            Url = artist.ExternalUrl,
            TotalFollowers = artist.Followers,
            Genres = artist.Genres,
            Images = new List<ArtistImageEntity>(),
            LastSyncTime = artist.LastSyncTime
        };
    }
}

Repository Variations

ID Type Differences:

Repository              Database ID Type   C# Type
SpotifyRepository       VARCHAR            string
TidalRepository         INTEGER            int
MusicBrainzRepository   UUID               Guid
DeezerRepository        BIGINT             long
DiscogsRepository       INTEGER            int
SoundCloudRepository    BIGINT             long

Interface Adaptation:

// Spotify
Task<SearchArtistEntity> GetArtistById(string id);

// Tidal
Task<SearchArtistEntity> GetArtistById(int id);

// MusicBrainz
Task<SearchArtistEntity> GetArtistById(Guid id);

// Deezer
Task<SearchArtistEntity> GetArtistById(long id);

No Common Interface: Each repository has provider-specific method signatures.
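
One way the differing ID types could share a contract is a generic interface parameterized on the ID type. The sketch below is hypothetical: IProviderRepository<TId>, ArtistResult, and InMemoryRepository<TId> are illustrative names that do not exist in the project, and the in-memory dictionary stands in for the Npgsql-backed queries.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for SearchArtistEntity, reduced to two fields.
public record ArtistResult(string Id, string Name);

// Hypothetical common contract: TId would be string for Spotify, int for Tidal,
// Guid for MusicBrainz, long for Deezer, and so on.
public interface IProviderRepository<TId> where TId : notnull
{
    Task<List<ArtistResult>> SearchArtist(string name, int offset);
    Task<ArtistResult?> GetArtistById(TId id);
}

// In-memory implementation, used only to show that one contract can serve
// repositories with different ID types.
public class InMemoryRepository<TId> : IProviderRepository<TId> where TId : notnull
{
    private readonly Dictionary<TId, string> _artists;

    public InMemoryRepository(Dictionary<TId, string> artists) => _artists = artists;

    public Task<List<ArtistResult>> SearchArtist(string name, int offset) =>
        Task.FromResult(_artists
            .Where(kv => kv.Value.Contains(name, StringComparison.OrdinalIgnoreCase))
            .OrderBy(kv => kv.Value, StringComparer.Ordinal)
            .Skip(offset)
            .Take(20) // same page size as the LIMIT 20 in the SQL shown earlier
            .Select(kv => new ArtistResult(kv.Key.ToString()!, kv.Value))
            .ToList());

    public Task<ArtistResult?> GetArtistById(TId id) =>
        Task.FromResult<ArtistResult?>(_artists.TryGetValue(id, out var name)
            ? new ArtistResult(id.ToString()!, name)
            : null);
}
```

The service layer would still need one field per provider, because IProviderRepository<string> and IProviderRepository<Guid> are distinct types; the gain is limited to a single definition of the method set.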

Provider-Specific Logic

Discogs Helper:

public static class DiscogsHelper
{
    public static int GetDiscNumber(string position)
    {
        // Discogs stores position as "1-1", "2-3", etc.
        // Format: "disc-track"
        if (string.IsNullOrEmpty(position))
            return 1;
        
        var parts = position.Split('-');
        return parts.Length > 0 && int.TryParse(parts[0], out var disc) 
            ? disc 
            : 1;
    }

    public static int GetTrackNumber(string position)
    {
        if (string.IsNullOrEmpty(position))
            return 0;
        
        var parts = position.Split('-');
        return parts.Length > 1 && int.TryParse(parts[1], out var track) 
            ? track 
            : 0;
    }
}

Usage in DiscogsRepository:

var track = new SearchTrackEntity
{
    DiscNumber = DiscogsHelper.GetDiscNumber(dbTrack.Position),
    TrackNumber = DiscogsHelper.GetTrackNumber(dbTrack.Position)
};

MusicBrainz Sort Name:

// MusicBrainz stores "Beatles, The" for alphabetical sorting
var artist = new SearchArtistEntity
{
    Name = dbArtist.Name, // "The Beatles"
    SortName = dbArtist.SortName // "Beatles, The"
};

SoundCloud User vs Artist:

// SoundCloud has "users" not "artists"
var artist = new SearchArtistEntity
{
    Name = dbUser.FullName ?? dbUser.Username,
    Url = dbUser.Url,
    TotalFollowers = dbUser.FollowersCount
};

Service Layer Orchestration

Cross-Provider Aggregation

SearchArtistService:

public class SearchArtistService : ISearchArtistService
{
    private readonly ISpotifyRepository _spotify;
    private readonly ITidalRepository _tidal;
    private readonly IMusicBrainzRepository _musicBrainz;
    private readonly IDeezerRepository _deezer;
    private readonly IDiscogsRepository _discogs;
    private readonly ISoundCloudRepository _soundCloud;
    private readonly ILogger<SearchArtistService> _logger;

    public async Task<SearchArtistResponse> SearchArtist(
        string name, 
        ProviderType provider, 
        int offset)
    {
        if (provider == ProviderType.Any)
        {
            return await SearchAllProviders(name, offset);
        }
        else
        {
            return await SearchSingleProvider(name, provider, offset);
        }
    }

    private async Task<SearchArtistResponse> SearchAllProviders(string name, int offset)
    {
        try
        {
            var tasks = new[]
            {
                _spotify.SearchArtist(name, offset),
                _tidal.SearchArtist(name, offset),
                _musicBrainz.SearchArtist(name, offset),
                _deezer.SearchArtist(name, offset),
                _discogs.SearchArtist(name, offset),
                _soundCloud.SearchArtist(name, offset)
            };
            
            var results = await Task.WhenAll(tasks);
            var combined = results.SelectMany(r => r).ToList();
            
            return new SearchArtistResponse
            {
                SearchResultType = combined.Any() 
                    ? SearchResultType.Ok 
                    : SearchResultType.NotFound,
                Artists = combined
            };
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error searching all providers for artist: {Name}", name);
            return new SearchArtistResponse
            {
                SearchResultType = SearchResultType.NotFound,
                Artists = new List<SearchArtistEntity>()
            };
        }
    }

    private async Task<SearchArtistResponse> SearchSingleProvider(
        string name, 
        ProviderType provider, 
        int offset)
    {
        try
        {
            var results = provider switch
            {
                ProviderType.Spotify => await _spotify.SearchArtist(name, offset),
                ProviderType.Tidal => await _tidal.SearchArtist(name, offset),
                ProviderType.MusicBrainz => await _musicBrainz.SearchArtist(name, offset),
                ProviderType.Deezer => await _deezer.SearchArtist(name, offset),
                ProviderType.Discogs => await _discogs.SearchArtist(name, offset),
                ProviderType.SoundCloud => await _soundCloud.SearchArtist(name, offset),
                _ => new List<SearchArtistEntity>()
            };
            
            return new SearchArtistResponse
            {
                SearchResultType = results.Any() 
                    ? SearchResultType.Ok 
                    : SearchResultType.NotFound,
                Artists = results
            };
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error searching {Provider} for artist: {Name}", provider, name);
            return new SearchArtistResponse
            {
                SearchResultType = SearchResultType.NotFound,
                Artists = new List<SearchArtistEntity>()
            };
        }
    }
}

Parallel Execution:

  • All 6 provider queries are started together and awaited with Task.WhenAll(), so they run concurrently
  • Total query time = slowest provider (not sum of all)
  • Typical: 20-50ms for all providers (with indexes)

No Result Deduplication:

  • Same artist from multiple providers returned multiple times
  • Each result has ProviderType field to distinguish
  • Client responsible for deduplication if needed

Error Handling:

  • Individual provider failures don't fail the entire request, because each repository catches its own exceptions
  • A failed provider contributes an empty list to the combined result
  • Failures are logged but not exposed to the client
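
The fan-out described above could also record which providers failed instead of collapsing failures into empty lists. This is a minimal sketch under stated assumptions: ProviderOutcome, FanOut, Capture, and SearchAll are hypothetical names, and plain strings stand in for SearchArtistEntity. It is not how the project currently behaves.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical per-provider outcome: Error is null when the query succeeded.
public record ProviderOutcome(string Provider, List<string> Artists, string? Error);

public static class FanOut
{
    // Wraps one provider query so that a failure is captured as data
    // instead of being swallowed or faulting the combined Task.WhenAll.
    private static async Task<ProviderOutcome> Capture(
        string provider, Func<Task<List<string>>> query)
    {
        try
        {
            return new ProviderOutcome(provider, await query(), null);
        }
        catch (Exception ex)
        {
            return new ProviderOutcome(provider, new List<string>(), ex.Message);
        }
    }

    // Starts every provider query before awaiting any of them, so the elapsed
    // time is bounded by the slowest provider rather than the sum of all.
    public static async Task<List<ProviderOutcome>> SearchAll(
        IReadOnlyDictionary<string, Func<Task<List<string>>>> providers)
    {
        var tasks = providers.Select(p => Capture(p.Key, p.Value)).ToList();
        var outcomes = await Task.WhenAll(tasks);
        return outcomes.ToList();
    }
}
```

With outcomes of this shape, a response could report "no results" only when every provider succeeded and returned nothing, and report a partial or failed search otherwise.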

Helper Utilities

StringHelper

File: Helpers/StringHelper.cs

Methods:

RemoveControlChars

public static string RemoveControlChars(string input)
{
    if (string.IsNullOrEmpty(input))
        return input;
    
    // Remove control characters (0x00-0x1F, 0x7F-0x9F)
    return Regex.Replace(input, @"[\x00-\x1F\x7F-\x9F]", string.Empty);
}

Usage: Sanitize user input before database queries

Protects Against:

  • Null byte injection
  • Terminal escape sequences
  • Control character exploits

RemoveEmojis

public static string RemoveEmojis(string input)
{
    if (string.IsNullOrEmpty(input))
        return input;
    
    // Remove surrogate pairs (emojis)
    return Regex.Replace(input, @"\p{Cs}", string.Empty);
}

Usage: Clean provider data before storage (in MiniMediaScanner)

Not Used in API: Data already cleaned during sync
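
The \p{Cs} pattern matches UTF-16 surrogate code units, not emojis as such. The sketch below repeats the pattern under a hypothetical name (EmojiCheck.RemoveSurrogates, not part of the project) to show the consequence: every character outside the Basic Multilingual Plane is removed, while emojis inside it are kept.

```csharp
using System.Text.RegularExpressions;

public static class EmojiCheck
{
    // Same pattern as RemoveEmojis: strips every UTF-16 surrogate code unit.
    public static string RemoveSurrogates(string input) =>
        string.IsNullOrEmpty(input)
            ? input
            : Regex.Replace(input, @"\p{Cs}", string.Empty);
}

// EmojiCheck.RemoveSurrogates("a\uD83D\uDE00b")   returns "ab"          (U+1F600 is a surrogate pair)
// EmojiCheck.RemoveSurrogates("I \u2764 jazz")    returns "I \u2764 jazz" (U+2764 HEAVY BLACK HEART is in the BMP)
// EmojiCheck.RemoveSurrogates("\U0001D11E clef")  returns " clef"       (U+1D11E MUSICAL SYMBOL G CLEF is not an emoji)
```

If the goal is to remove emojis specifically, the pattern would need to target emoji ranges or properties rather than the surrogate category.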

DiscogsHelper

File: Helpers/DiscogsHelper.cs

Purpose: Parse Discogs-specific position format

Methods:

GetDiscNumber

public static int GetDiscNumber(string position)
{
    // Input: "2-5" (disc 2, track 5)
    // Output: 2
    
    if (string.IsNullOrEmpty(position))
        return 1;
    
    var parts = position.Split('-');
    return parts.Length > 0 && int.TryParse(parts[0], out var disc) 
        ? disc 
        : 1;
}

GetTrackNumber

public static int GetTrackNumber(string position)
{
    // Input: "2-5" (disc 2, track 5)
    // Output: 5
    
    if (string.IsNullOrEmpty(position))
        return 0;
    
    var parts = position.Split('-');
    return parts.Length > 1 && int.TryParse(parts[1], out var track) 
        ? track 
        : 0;
}

Discogs Position Formats:

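  • "1-1" - Disc 1, Track 1
  • "2-5" - Disc 2, Track 5
  • "A1" - Vinyl side A, track 1 (not handled; falls back to disc 1, track 0)
  • "DVD1" - DVD disc (not handled; falls back to disc 1, track 0)

Limitations: Only the numeric disc-track format is handled. A bare track number such as "5" (no dash) is parsed as disc 5, track 0, because the first segment is always treated as the disc number.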
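
A more tolerant parser could cover bare track numbers and vinyl sides. The following is a sketch only: DiscogsPosition and DiscogsPositionParser are hypothetical names, and the mapping of two vinyl sides to one disc is an assumption, not something the Discogs data guarantees. Formats it does not recognize (such as "DVD1") keep the existing fallback of disc 1, track 0.

```csharp
#nullable enable
using System.Text.RegularExpressions;

// Hypothetical result type; Side is set only for vinyl-style positions.
public record DiscogsPosition(int Disc, int Track, char? Side);

public static class DiscogsPositionParser
{
    private static readonly Regex DiscTrack = new(@"^(\d+)-(\d+)$");
    private static readonly Regex TrackOnly = new(@"^(\d+)$");
    private static readonly Regex VinylSide = new(@"^([A-Za-z])(\d*)$");

    public static DiscogsPosition Parse(string? position)
    {
        var fallback = new DiscogsPosition(1, 0, null);
        if (string.IsNullOrWhiteSpace(position)) return fallback;
        var text = position.Trim();

        // "2-5": explicit disc and track.
        var m = DiscTrack.Match(text);
        if (m.Success && int.TryParse(m.Groups[1].Value, out var disc)
                      && int.TryParse(m.Groups[2].Value, out var track))
            return new DiscogsPosition(disc, track, null);

        // "5": a bare track number on a single-disc release.
        m = TrackOnly.Match(text);
        if (m.Success && int.TryParse(m.Groups[1].Value, out track))
            return new DiscogsPosition(1, track, null);

        // "A1", "B": a side letter with an optional track number.
        m = VinylSide.Match(text);
        if (m.Success)
        {
            var side = char.ToUpperInvariant(m.Groups[1].Value[0]);
            // Assumption: two sides per disc, so A/B map to disc 1, C/D to disc 2.
            var sideDisc = (side - 'A') / 2 + 1;
            // A bare side letter is treated as that side's first track.
            var sideTrack = int.TryParse(m.Groups[2].Value, out var n) ? n : 1;
            return new DiscogsPosition(sideDisc, sideTrack, side);
        }

        return fallback;
    }
}
```

Track numbers on vinyl restart per side here, so a caller that needs a continuous track order would have to sort by Side before Track.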

Job Repository

File: Repositories/JobRepository.cs

Purpose: Track background sync jobs (unused in current implementation)

Interface:

public interface IJobRepository
{
    Task<Job> GetJobById(int id);
    Task<List<Job>> GetPendingJobs();
    Task CreateJob(Job job);
    Task UpdateJobStatus(int id, JobStatus status);
}

Job Model:

public class Job
{
    public int Id { get; set; }
    public ProviderType Provider { get; set; }
    public JobType Type { get; set; } // ArtistSync, AlbumSync, TrackSync
    public JobStatus Status { get; set; } // Pending, InProgress, Completed, Failed
    public string EntityId { get; set; }
    public DateTime CreatedAt { get; set; }
    public DateTime? CompletedAt { get; set; }
    public string ErrorMessage { get; set; }
}

Current Status: Registered in DI but never used.

Intended Use: Track sync requests from API to MiniMediaScanner (not implemented).

SearchResultType.InQueueSync: Enum value exists but never returned.

Quartz Scheduler Integration

Dependency: Quartz 3.17.0

Configuration: Registered in DI

Jobs Defined: None

Current Status: Dead code

Intended Use: Scheduled background tasks (speculation):

  • Periodic sync triggers
  • Stale data cleanup
  • Metrics aggregation

Recommendation: Remove dependency if not used.

Polly Resilience Integration

Dependency: Polly 8.6.6

Configuration: Registered in DI

Policies Defined: None

Current Status: Dead code

Intended Use: Retry policies for database queries (speculation):

// NOT IMPLEMENTED
var retryPolicy = Policy
    .Handle<NpgsqlException>()
    .WaitAndRetryAsync(3, retryAttempt => 
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));

await retryPolicy.ExecuteAsync(async () =>
{
    return await connection.QueryAsync<SpotifyArtist>(sql, parameters);
});

Recommendation: Implement retry policies or remove dependency.

FuzzySharp Integration

Dependency: FuzzySharp 2.0.2

Purpose: String similarity matching (alternative to pg_trgm)

Current Status: Referenced but not used

Intended Use: Application-side fuzzy matching in the API process rather than in PostgreSQL (speculation):

// NOT IMPLEMENTED
var results = await _spotify.SearchArtist(name, offset);
var scored = results.Select(r => new
{
    Artist = r,
    Score = Fuzz.Ratio(name.ToLower(), r.Name.ToLower())
});
var filtered = scored.Where(s => s.Score >= 70).OrderByDescending(s => s.Score);

Why Not Used: pg_trgm handles fuzzy search in database (more efficient).

Recommendation: Remove dependency if not needed.

Prometheus Integration

Dependency: prometheus-net 8.2.1

Metrics Exposed:

minimediametadataapi_request_total

Type: Counter
Labels: path, method, status

Implementation:

public class RequestMiddleware
{
    private static readonly Counter RequestCounter = Metrics
        .CreateCounter(
            "minimediametadataapi_request_total",
            "Total HTTP requests",
            new CounterConfiguration
            {
                LabelNames = new[] { "path", "method", "status" }
            });

    public async Task InvokeAsync(HttpContext context, RequestDelegate next)
    {
        await next(context);
        
        RequestCounter
            .WithLabels(
                context.Request.Path,
                context.Request.Method,
                context.Response.StatusCode.ToString())
            .Inc();
    }
}

Endpoint: /metrics

Format: Prometheus text exposition

Missing Metrics:

  • Request duration histogram
  • Database query duration
  • Error rate by provider
  • Active requests gauge
  • Connection pool usage

Swagger Integration

Dependency: Swashbuckle.AspNetCore 10.1.7

Configuration:

builder.Services.AddSwaggerGen();
app.UseSwagger();
app.UseSwaggerUI();

Endpoint: /swagger

Features:

  • Auto-generated from controller attributes
  • Interactive API testing
  • Request/response schema documentation
  • Enum value descriptions

Customization: None (default configuration)

Production Access: Enabled (no environment check)

Database Connection Management

Pattern: Connection-per-request

Implementation:

using var connection = new NpgsqlConnection(_connectionString);
await connection.QueryAsync<T>(sql, parameters);
// Connection automatically disposed and returned to pool

No DbContext: Each repository method creates its own connection.

No Transactions: Read-only queries don't need transactions.

Connection Pooling: Handled by Npgsql driver (configured in connection string).

Error Handling Strategy

Repository Level:

try
{
    // Database query
}
catch (Exception ex)
{
    _logger.LogError(ex, "Error message with context");
    return new List<T>(); // Empty result
}

Service Level:

try
{
    // Orchestrate repositories
}
catch (Exception ex)
{
    _logger.LogError(ex, "Error message with context");
    return new Response
    {
        SearchResultType = SearchResultType.NotFound,
        Results = new List<T>()
    };
}

Controller Level:

// No try-catch - relies on ASP.NET Core default error handling
var response = await _service.SearchArtist(name, provider, offset);
return Ok(response);

Implications:

  • Errors logged but not exposed to client
  • Client can't distinguish between "no results" and "error"
  • No retry logic
  • No circuit breaker pattern

Integration Recommendations

For Production Use

  1. Implement Retry Policies (Polly). The repositories query PostgreSQL through Npgsql rather than HttpClient, so the policy should handle NpgsqlException rather than transient HTTP errors:
var retryPolicy = Policy
    .Handle<NpgsqlException>()
    .WaitAndRetryAsync(3, retryAttempt =>
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));
  2. Add Circuit Breaker:
var circuitBreaker = Policy
    .Handle<NpgsqlException>()
    .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30));
  3. Implement Health Checks:
builder.Services.AddHealthChecks()
    .AddNpgSql(_connectionString)
    .AddCheck<SpotifyRepositoryHealthCheck>("spotify_repository");
  4. Add Result Caching:
builder.Services.AddMemoryCache();
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost:6379";
});
  5. Implement Result Deduplication:
// Combine results from multiple providers, remove duplicates by name similarity
var deduplicated = DeduplicateArtists(combined, similarityThreshold: 0.9);
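
DeduplicateArtists is not defined anywhere in the project. A minimal sketch of what it might look like follows, assuming a normalized Levenshtein similarity on lower-cased names and a simplified, hypothetical Artist record in place of SearchArtistEntity. It returns groups rather than a flat list so that the per-provider entries stay available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced stand-in for SearchArtistEntity.
public record Artist(string Provider, string Name);

public static class ArtistDeduplicator
{
    // Normalized Levenshtein similarity in [0, 1]; 1 means identical strings.
    public static double Similarity(string a, string b)
    {
        if (a.Length == 0 && b.Length == 0) return 1.0;
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (var j = 0; j <= b.Length; j++) prev[j] = j;
        for (var i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (var j = 1; j <= b.Length; j++)
            {
                var cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            (prev, curr) = (curr, prev); // the finished row is now in prev
        }
        return 1.0 - (double)prev[b.Length] / Math.Max(a.Length, b.Length);
    }

    // Greedy grouping: each artist joins the first group whose first member
    // has a similar enough name, otherwise it starts a new group.
    public static List<List<Artist>> DeduplicateArtists(
        IEnumerable<Artist> artists, double similarityThreshold)
    {
        var groups = new List<List<Artist>>();
        foreach (var artist in artists)
        {
            var key = artist.Name.Trim().ToLowerInvariant();
            var match = groups.FirstOrDefault(g =>
                Similarity(g[0].Name.Trim().ToLowerInvariant(), key) >= similarityThreshold);
            if (match is null) groups.Add(new List<Artist> { artist });
            else match.Add(artist);
        }
        return groups;
    }
}
```

Matching on names alone will merge distinct artists that share a name, so a production version would likely need a second signal such as shared ISRC or UPC values from the track and album tables.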

For Integration with MiniMediaScanner

Potential Enhancements:

  1. Sync Triggering: API could request sync for missing data
  2. Job Status Tracking: Use JobRepository to track sync progress
  3. Webhook Notifications: MiniMediaScanner notifies API of sync completion
  4. Shared Message Queue: RabbitMQ/Kafka for async communication

Current Limitation: No communication channel between projects.

Integration Evaluation

Strengths:

  • Clean separation from provider APIs
  • Repository pattern isolates provider logic
  • Parallel query execution for multi-provider search
  • Helper utilities for provider-specific quirks

Weaknesses:

  • Unused dependencies (Polly, Quartz, FuzzySharp, SpotifyAPI.Web.Auth)
  • No retry logic despite Polly dependency
  • No caching layer
  • Error handling swallows failures
  • No communication with MiniMediaScanner
  • Job tracking infrastructure unused

Recommendations:

  • Remove unused dependencies
  • Implement retry policies
  • Add caching layer (Redis)
  • Expose error details to clients
  • Consider message queue for MiniMediaScanner integration