Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00

MiniMediaMetadataAPI - API Interface Analysis

API Surface Overview

Base URL: http://localhost:56232 (production deployment)
API Prefix: /api
Documentation: /swagger (Swagger UI)
Metrics: /metrics (Prometheus format)

Controllers

1. SearchArtistController

Path: /api/SearchArtist
File: Controllers/SearchArtistController.cs

GET /api/SearchArtist

Query Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| Id | string | No | null | Artist ID (provider-specific format) |
| Name | string | No | null | Artist name for fuzzy search |
| Provider | ProviderType | No | Any | Target provider or Any for all |
| Offset | int | No | 0 | Pagination offset |

Provider Parameter Values:

  • Any (default) - Search all 6 providers
  • Spotify - Spotify only
  • Tidal - Tidal only
  • MusicBrainz - MusicBrainz only
  • Deezer - Deezer only
  • Discogs - Discogs only
  • SoundCloud - SoundCloud only

Response Model: SearchArtistResponse

{
  "searchResultType": "Ok",
  "artists": [
    {
      "providerType": "Spotify",
      "id": "3WrFJ7ztbogyGnTHbHJFl2",
      "name": "The Beatles",
      "popularity": 100,
      "url": "https://open.spotify.com/artist/3WrFJ7ztbogyGnTHbHJFl2",
      "totalFollowers": 50000000,
      "genres": ["rock", "british invasion", "classic rock"],
      "images": [
        {
          "url": "https://i.scdn.co/image/...",
          "height": 640,
          "width": 640
        }
      ],
      "sortName": null,
      "lastSyncTime": "2024-01-15T10:30:00Z"
    }
  ]
}

SearchResultType Enum:

  • Ok (0) - Results found
  • NotFound (1) - No results
  • InQueueSync (2) - Data sync in progress (unused in current implementation)

SearchArtistEntity Fields:

| Field | Type | Nullable | Providers | Description |
|---|---|---|---|---|
| providerType | ProviderType | No | All | Source provider |
| id | string | No | All | Provider-specific artist ID |
| name | string | No | All | Artist name |
| popularity | int | Yes | Spotify, Deezer | Popularity score (0-100) |
| url | string | Yes | All | Artist URL on provider platform |
| totalFollowers | int | Yes | Spotify, Deezer | Follower count |
| genres | string[] | Yes | Spotify, Deezer, MusicBrainz | Genre tags |
| images | ArtistImageEntity[] | Yes | Spotify, Tidal, Deezer | Artist images |
| sortName | string | Yes | MusicBrainz | MusicBrainz sort name |
| lastSyncTime | DateTime | Yes | All | Last data sync timestamp |

Example Requests:

# Search by name across all providers
GET /api/SearchArtist?Name=Beatles&Provider=Any

# Search Spotify only
GET /api/SearchArtist?Name=Beatles&Provider=Spotify

# Get artist by ID
GET /api/SearchArtist?Id=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify

# Paginated search
GET /api/SearchArtist?Name=Beatles&Offset=20
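
The request shapes above can also be assembled programmatically. The following is a minimal sketch, assuming the base URL listed in this document; `build_search_artist_url` is a hypothetical helper name and not part of the API.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:56232"  # base URL as listed in this document


def build_search_artist_url(name=None, artist_id=None, provider="Any", offset=0):
    """Build a GET /api/SearchArtist URL (hypothetical helper, not part of the API)."""
    params = {}
    if artist_id is not None:
        params["Id"] = artist_id
    if name is not None:
        params["Name"] = name
    params["Provider"] = provider
    if offset:
        params["Offset"] = offset
    # quote_via=quote percent-encodes spaces as %20, matching the examples above.
    return f"{BASE_URL}/api/SearchArtist?{urlencode(params, quote_via=quote)}"
```

Encoding the values through `urlencode` keeps names that contain spaces, `&`, or `/` from breaking the query string.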

Input Sanitization:

Name = StringHelper.RemoveControlChars(Name);

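Removes C0 and C1 control characters (0x00-0x1F, 0x7F-0x9F), including tabs and newlines, which blocks control-character and null-byte injection. It is not a defense against SQL injection, which is handled separately by parameterized queries.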
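
To make the effect of this filter concrete, here is a Python sketch that applies the same character ranges. It is an illustration only, not the project's code; `remove_control_chars` is a hypothetical name.

```python
import re

# Same character ranges as quoted above: 0x00-0x1F and 0x7F-0x9F.
_CONTROL_CHARS = re.compile(r"[\x00-\x1F\x7F-\x9F]")


def remove_control_chars(text):
    """Strip C0/C1 control characters; None and empty strings pass through."""
    if not text:
        return text
    return _CONTROL_CHARS.sub("", text)
```

Note that tabs, newlines, and the ESC byte of terminal escape sequences fall inside the range and are dropped, while accented letters such as `é` (U+00E9) are untouched.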

2. SearchAlbumController

Path: /api/SearchAlbum
File: Controllers/SearchAlbumController.cs

GET /api/SearchAlbum

Query Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| AlbumId | string | No | null | Album ID (provider-specific) |
| ArtistId | string | No | null | Filter by artist ID |
| AlbumName | string | No | null | Album name for fuzzy search |
| Provider | ProviderType | No | Any | Target provider |
| Offset | int | No | 0 | Pagination offset |

Response Model: SearchAlbumResponse

{
  "searchResultType": "Ok",
  "albums": [
    {
      "providerType": "Spotify",
      "id": "1klALx0u4AavZNEvC4LrTL",
      "name": "Abbey Road",
      "popularity": 95,
      "url": "https://open.spotify.com/album/1klALx0u4AavZNEvC4LrTL",
      "label": "Apple Records",
      "releaseDate": "1969-09-26",
      "totalTracks": 17,
      "type": "album",
      "upc": "00602547437389",
      "copyright": "℗ 2009 Calderstone Productions Limited",
      "artistId": "3WrFJ7ztbogyGnTHbHJFl2",
      "images": [
        {
          "url": "https://i.scdn.co/image/...",
          "height": 640,
          "width": 640
        }
      ],
      "artists": [
        {
          "id": "3WrFJ7ztbogyGnTHbHJFl2",
          "name": "The Beatles"
        }
      ],
      "lastSyncTime": "2024-01-15T10:30:00Z"
    }
  ]
}

SearchAlbumEntity Fields:

| Field | Type | Nullable | Providers | Description |
|---|---|---|---|---|
| providerType | ProviderType | No | All | Source provider |
| id | string | No | All | Provider-specific album ID |
| name | string | No | All | Album name |
| popularity | int | Yes | Spotify, Deezer | Popularity score |
| url | string | Yes | All | Album URL |
| label | string | Yes | Spotify, Tidal, MusicBrainz, Discogs | Record label |
| releaseDate | string | Yes | All | Release date (ISO 8601) |
| totalTracks | int | Yes | All | Track count |
| type | string | Yes | Spotify, Tidal, MusicBrainz | Album type (album, single, compilation) |
| upc | string | Yes | Spotify, Tidal, Discogs | Universal Product Code |
| copyright | string | Yes | Spotify, Tidal | Copyright notice |
| artistId | string | Yes | All | Primary artist ID |
| images | AlbumImageEntity[] | Yes | Spotify, Tidal, Deezer | Album artwork |
| artists | ArtistReference[] | Yes | All | Contributing artists |
| lastSyncTime | DateTime | Yes | All | Last sync timestamp |

Example Requests:

# Search by album name
GET /api/SearchAlbum?AlbumName=Abbey%20Road&Provider=Any

# Search albums by artist
GET /api/SearchAlbum?ArtistId=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify

# Get album by ID
GET /api/SearchAlbum?AlbumId=1klALx0u4AavZNEvC4LrTL&Provider=Spotify

# Combined search
GET /api/SearchAlbum?AlbumName=Abbey&ArtistId=3WrFJ7ztbogyGnTHbHJFl2

3. SearchTrackController

Path: /api/SearchTrack
File: Controllers/SearchTrackController.cs

GET /api/SearchTrack

Query Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| TrackId | string | No | null | Track ID (provider-specific) |
| ArtistId | string | No | null | Filter by artist ID |
| TrackName | string | No | null | Track name for fuzzy search |
| Provider | ProviderType | No | Any | Target provider |
| Offset | int | No | 0 | Pagination offset |

Response Model: SearchTrackResponse

{
  "searchResultType": "Ok",
  "tracks": [
    {
      "providerType": "Spotify",
      "id": "6dGnYIeXmHdcikdzNNDMm2",
      "name": "Here Comes The Sun",
      "popularity": 90,
      "url": "https://open.spotify.com/track/6dGnYIeXmHdcikdzNNDMm2",
      "duration": 185000,
      "explicit": false,
      "discNumber": 1,
      "trackNumber": 7,
      "label": "Apple Records",
      "isrc": "GBAYE0601715",
      "album": {
        "id": "1klALx0u4AavZNEvC4LrTL",
        "name": "Abbey Road",
        "releaseDate": "1969-09-26"
      },
      "artists": [
        {
          "id": "3WrFJ7ztbogyGnTHbHJFl2",
          "name": "The Beatles"
        }
      ],
      "images": [
        {
          "url": "https://i.scdn.co/image/...",
          "height": 640,
          "width": 640
        }
      ],
      "lastSyncTime": "2024-01-15T10:30:00Z"
    }
  ]
}

SearchTrackEntity Fields:

| Field | Type | Nullable | Providers | Description |
|---|---|---|---|---|
| providerType | ProviderType | No | All | Source provider |
| id | string | No | All | Provider-specific track ID |
| name | string | No | All | Track name |
| popularity | int | Yes | Spotify, Deezer | Popularity score |
| url | string | Yes | All | Track URL |
| duration | int | Yes | All | Duration in milliseconds |
| explicit | bool | Yes | Spotify, Tidal, Deezer | Explicit content flag |
| discNumber | int | Yes | All | Disc number in multi-disc albums |
| trackNumber | int | Yes | All | Track number on disc |
| label | string | Yes | Spotify, Tidal | Record label |
| isrc | string | Yes | Spotify, Tidal, MusicBrainz | International Standard Recording Code |
| album | AlbumReference | Yes | All | Parent album info |
| artists | ArtistReference[] | Yes | All | Contributing artists |
| images | TrackImageEntity[] | Yes | Spotify, Tidal, Deezer | Track/album artwork |
| lastSyncTime | DateTime | Yes | All | Last sync timestamp |

Example Requests:

# Search by track name
GET /api/SearchTrack?TrackName=Here%20Comes%20The%20Sun&Provider=Any

# Search tracks by artist
GET /api/SearchTrack?ArtistId=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify

# Get track by ID
GET /api/SearchTrack?TrackId=6dGnYIeXmHdcikdzNNDMm2&Provider=Spotify

# Combined search
GET /api/SearchTrack?TrackName=Sun&ArtistId=3WrFJ7ztbogyGnTHbHJFl2

4. SearchController

Path: /api/Search
File: Controllers/SearchController.cs

Status: Stub implementation (not functional)

GET /api/Search

Current Implementation:

[HttpGet]
public IActionResult Get()
{
    return Ok("Search endpoint - not implemented");
}

Intended Purpose: Unified search across artists, albums, and tracks (speculation based on naming).

Current State: Returns placeholder string, no actual functionality.

Swagger/OpenAPI Documentation

Endpoint: /swagger
Framework: Swashbuckle.AspNetCore 10.1.7

Configuration:

builder.Services.AddSwaggerGen();
app.UseSwagger();
app.UseSwaggerUI();

Features:

  • Auto-generated from controller attributes
  • Interactive API testing
  • Request/response schema documentation
  • Enum value descriptions

Access: Available in all environments; Swagger is not disabled in production.

Example Swagger URL: http://localhost:56232/swagger/index.html

Prometheus Metrics

Endpoint: /metrics
Format: Prometheus text exposition format

Metrics Exposed:

minimediametadataapi_request_total

Type: Counter
Description: Total HTTP requests
Labels:

  • path - Request path (e.g., /api/SearchArtist)
  • method - HTTP method (GET, POST, etc.)
  • status - HTTP status code (200, 404, 500, etc.)

Example Output:

# HELP minimediametadataapi_request_total Total HTTP requests
# TYPE minimediametadataapi_request_total counter
minimediametadataapi_request_total{path="/api/SearchArtist",method="GET",status="200"} 1523
minimediametadataapi_request_total{path="/api/SearchAlbum",method="GET",status="200"} 892
minimediametadataapi_request_total{path="/api/SearchTrack",method="GET",status="404"} 45
minimediametadataapi_request_total{path="/metrics",method="GET",status="200"} 3200

No other metrics:

  • No request duration histograms
  • No database query metrics
  • No error rate metrics
  • No active request gauges
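
Because the counter carries a status label, a rough error rate can still be derived from it on the client side. The sketch below parses the exposition text shown above and sums the counter per status. It is a simplified parser written for this illustration (it ignores escaped quotes and optional timestamps), and `requests_by_status` is a hypothetical name.

```python
import re
from collections import defaultdict

# One sample line: metric_name{label="value",...} number
_SAMPLE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def requests_by_status(metrics_text, metric="minimediametadataapi_request_total"):
    """Sum the request counter per status label from Prometheus text output."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(_LABEL.findall(match.group("labels")))
        totals[labels.get("status", "unknown")] += float(match.group("value"))
    return dict(totals)
```

For production monitoring, a Prometheus query over the `status` label would normally replace this kind of ad hoc parsing.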

Security Analysis

Authentication

Status: None
Implications: Fully open API, no user identification

Missing:

  • API keys
  • OAuth 2.0
  • JWT tokens
  • Basic authentication
  • Client certificates

Authorization

Status: None
Implications: All endpoints accessible to all clients

Missing:

  • Role-based access control (RBAC)
  • Scope-based permissions
  • Rate limiting per user/key

HTTPS

Status: HTTPS redirection is commented out in Program.cs, so the application itself serves plain HTTP

Program.cs:

// app.UseHttpsRedirection(); // COMMENTED OUT

Implications:

  • Traffic sent in plain text
  • Vulnerable to man-in-the-middle attacks
  • No encryption for sensitive data (if any)

Deployment: Expects reverse proxy (nginx, Traefik) to handle TLS termination.

CORS

Status: Not configured

Implications:

  • Cross-origin requests from browser-based clients are blocked by the browser
  • Browser clients must be served from the same origin or go through a proxy

Missing Configuration:

// NOT PRESENT
builder.Services.AddCors(options => { ... });
app.UseCors();

Input Validation

Sanitization: StringHelper.RemoveControlChars()

Implementation:

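// Requires: using System.Text.RegularExpressions;
public static string RemoveControlChars(string input)
{
    if (string.IsNullOrEmpty(input))
        return input;
    
    // Strip C0 (0x00-0x1F) and DEL plus C1 (0x7F-0x9F) control characters.
    return Regex.Replace(input, @"[\x00-\x1F\x7F-\x9F]", string.Empty);
}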

Protects Against:

  • Control character injection
  • Null byte attacks
  • Terminal escape sequences

Does NOT Protect Against:

  • SQL injection (mitigated by Dapper parameterization)
  • XSS (JSON serialization handles escaping)
  • Path traversal (no file operations)
  • Command injection (no shell execution)

Additional Sanitization: StringHelper.RemoveEmojis()

Implementation:

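// Requires: using System.Text.RegularExpressions;
public static string RemoveEmojis(string input)
{
    if (string.IsNullOrEmpty(input))
        return input;
    
    // \p{Cs} matches UTF-16 surrogate code units, so this strips every
    // character outside the Basic Multilingual Plane: most emoji, but also
    // other supplementary-plane characters. Emoji inside the BMP
    // (for example U+2764) are left in place.
    return Regex.Replace(input, @"\p{Cs}", string.Empty);
}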

Usage: Applied to database queries, not API inputs.
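
The `\p{Cs}` pattern works on UTF-16 surrogate code units, which has consequences that the method name hides. The Python sketch below reproduces the effect under the assumption that removing surrogates is equivalent to removing every code point above U+FFFF; `remove_non_bmp` is a hypothetical name used only here.

```python
import re

# Python strings are sequences of code points, so the closest equivalent of
# removing UTF-16 surrogate code units is removing every code point above U+FFFF.
_NON_BMP = re.compile("[\U00010000-\U0010FFFF]")


def remove_non_bmp(text):
    """Drop all supplementary-plane characters; None and empty strings pass through."""
    if not text:
        return text
    return _NON_BMP.sub("", text)
```

The filter removes an emoji such as U+1F600 but keeps U+2764 (a BMP heart symbol), and it also removes non-emoji supplementary characters such as CJK Extension B ideographs.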

Rate Limiting

Status: None

Implications:

  • Vulnerable to abuse
  • No protection against DoS
  • Unlimited queries per client

Missing:

  • Request throttling
  • IP-based limits
  • API key quotas
  • Burst protection

SQL Injection Protection

Mechanism: Dapper parameterized queries

Example:

var sql = "SELECT * FROM spotify_artist WHERE name ILIKE @name";
var results = await connection.QueryAsync<SpotifyArtist>(sql, new { name = $"%{searchTerm}%" });

Protection: Parameters never concatenated into SQL strings.

Risk Level: Low (Dapper handles parameterization correctly).
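
The same binding principle can be demonstrated in a self-contained way. The sketch below swaps in Python's built-in `sqlite3` module and an in-memory table purely for illustration: the real service uses Dapper against PostgreSQL, and SQLite's `LIKE` stands in for `ILIKE` here. `find_artists` is a hypothetical name.

```python
import sqlite3


def find_artists(conn, search_term):
    """Substring search with the term bound as a parameter, never concatenated."""
    sql = "SELECT name FROM spotify_artist WHERE name LIKE :name ORDER BY name"
    rows = conn.execute(sql, {"name": f"%{search_term}%"}).fetchall()
    return [row[0] for row in rows]


# In-memory stand-in for the spotify_artist table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spotify_artist (name TEXT)")
conn.executemany(
    "INSERT INTO spotify_artist (name) VALUES (?)",
    [("The Beatles",), ("Beatsteaks",), ("Radiohead",)],
)
```

A search term such as `'; DROP TABLE spotify_artist; --` is treated as literal text to match, so it returns no rows and leaves the table intact.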

Response Formats

Content Type

Default: application/json
Encoding: UTF-8
Serialization: System.Text.Json (ASP.NET Core default)

Success Response (200 OK)

{
  "searchResultType": "Ok",
  "artists": [ /* array of entities */ ]
}

Not Found Response (200 OK with NotFound type)

{
  "searchResultType": "NotFound",
  "artists": []
}

Note: Returns HTTP 200 even when no results found. Client must check searchResultType.
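
Since the HTTP status alone does not distinguish a hit from a miss, client code has to branch on the payload. The following sketch shows one way to do that, assuming the enum names and numeric values listed earlier in this document; `parse_result_type` and `extract_entities` are hypothetical helper names.

```python
from enum import Enum


class SearchResultType(Enum):
    # Names and numeric values as listed in this document's enum section.
    Ok = 0
    NotFound = 1
    InQueueSync = 2


def parse_result_type(raw):
    """Accept either the string name ("Ok") or the numeric value (0)."""
    if isinstance(raw, str):
        return SearchResultType[raw]
    return SearchResultType(raw)


def extract_entities(payload, key):
    """Return payload[key] for an Ok result, otherwise an empty list."""
    if parse_result_type(payload["searchResultType"]) is SearchResultType.Ok:
        return payload.get(key) or []
    return []
```

Accepting both the name and the number is a defensive choice for this sketch, because the sample responses show the value as a string while the enum is documented with numeric values.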

Error Response (500 Internal Server Error)

ASP.NET Core default error handling:

{
  "type": "https://tools.ietf.org/html/rfc7231#section-6.6.1",
  "title": "An error occurred while processing your request.",
  "status": 500,
  "traceId": "00-abc123..."
}

No custom error responses - uses framework defaults.

Error Details: Hidden in production (no stack traces exposed).

Pagination

Offset-Based Pagination

Parameter: Offset (default: 0)
Limit: Hardcoded to 20 results per request

Example:

# First page (0-19)
GET /api/SearchArtist?Name=Beatles&Offset=0

# Second page (20-39)
GET /api/SearchArtist?Name=Beatles&Offset=20

# Third page (40-59)
GET /api/SearchArtist?Name=Beatles&Offset=40

SQL Implementation:

LIMIT 20 OFFSET @offset

Limitations:

  • No configurable page size
  • No total count in response
  • No next/previous links
  • No cursor-based pagination
  • Performance degrades with large offsets
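
With no total count or next link in the response, a client that wants the full result set has to walk the offsets and stop on a short page. A minimal sketch of that loop follows, assuming the fixed page size of 20 described above; `fetch_page` is a caller-supplied function (hypothetical) that performs the actual HTTP request for a given offset.

```python
PAGE_SIZE = 20  # fixed page size reported in this document


def iter_all_results(fetch_page, max_pages=50):
    """Yield results page by page until a short or empty page is returned.

    fetch_page(offset) is a caller-supplied function (hypothetical) that
    returns the list of entities for that offset.
    """
    for page_index in range(max_pages):
        page = fetch_page(page_index * PAGE_SIZE)
        yield from page
        if len(page) < PAGE_SIZE:  # no total count, so a short page signals the end
            return
```

The `max_pages` cap is a safety limit chosen for this sketch. When the result count is an exact multiple of 20, the loop makes one extra request that returns an empty page.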

Missing Metadata:

// NOT PRESENT
{
  "pagination": {
    "offset": 20,
    "limit": 20,
    "total": 150,
    "hasMore": true
  }
}

Provider-Specific Behavior

ID Format Differences

| Provider | ID Type | Example |
|---|---|---|
| Spotify | string (base62) | 3WrFJ7ztbogyGnTHbHJFl2 |
| Tidal | int | 12345678 |
| MusicBrainz | Guid | b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d |
| Deezer | long | 123456789 |
| Discogs | int | 12345 |
| SoundCloud | int/long | 123456789 |

Implication: ID parameter must match provider's format. Cross-provider ID lookups not possible.
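
A client can catch mismatched IDs before sending a request by checking the ID shape per provider. The sketch below infers its patterns from the examples listed above; the 22-character length for Spotify IDs is an assumption based on the sample value, and `id_matches_provider` is a hypothetical name.

```python
import re

# Patterns inferred from the example IDs above. The 22-character Spotify
# length is an assumption based on the sample ID, not a documented rule.
_NUMERIC = re.compile(r"[0-9]+")
_ID_PATTERNS = {
    "Spotify": re.compile(r"[0-9A-Za-z]{22}"),
    "MusicBrainz": re.compile(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
    ),
    "Tidal": _NUMERIC,
    "Deezer": _NUMERIC,
    "Discogs": _NUMERIC,
    "SoundCloud": _NUMERIC,
}


def id_matches_provider(provider, value):
    """Return True if value has the ID shape expected for the given provider."""
    pattern = _ID_PATTERNS.get(provider)
    return bool(pattern and pattern.fullmatch(value))
```

The check returns False for `Any`, since an ID lookup only makes sense against a specific provider.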

Field Availability

Popularity:

  • Available: Spotify, Deezer
  • Unavailable: Tidal, MusicBrainz, Discogs, SoundCloud
  • Returns: null when unavailable

Genres:

  • Available: Spotify, Deezer, MusicBrainz
  • Unavailable: Tidal, Discogs, SoundCloud
  • Returns: null or empty array

Images:

  • Available: Spotify, Tidal, Deezer
  • Limited: MusicBrainz (via relationships)
  • Unavailable: Discogs (URLs only), SoundCloud
  • Returns: null or empty array

UPC/ISRC:

  • Available: Spotify, Tidal, MusicBrainz
  • Limited: Discogs (identifiers table)
  • Unavailable: Deezer, SoundCloud
  • Returns: null when unavailable
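
Because several fields are null for some providers, code that merges results from a `Provider=Any` search needs to handle missing values explicitly. As a small illustration, the sketch below ranks artist entries by popularity and places entries without a score last; `by_popularity` is a hypothetical name and the ranking rule is an assumption made for this example.

```python
def by_popularity(artists):
    """Sort artist dicts by popularity, highest first; entries without a score go last."""
    return sorted(
        artists,
        key=lambda a: (a.get("popularity") is None, -(a.get("popularity") or 0)),
    )
```

The sort is stable, so entries without a popularity score keep their original relative order.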

Provider Comparison Table

Feature Spotify Tidal MusicBrainz Deezer Discogs SoundCloud
Artist Images Limited
Album Images Limited
Popularity Score
Follower Count
Genres
UPC
ISRC
Label Info
Release Date
Explicit Flag
Track Duration

API Versioning

Status: None

Current Approach: Single version at /api/*

Implications:

  • Breaking changes affect all clients
  • No gradual migration path
  • No deprecation strategy

Missing:

  • URL versioning (/api/v1/, /api/v2/)
  • Header versioning (Accept: application/vnd.api+json;version=1)
  • Query parameter versioning (?api-version=1.0)

Health Checks

Status: None

Missing Endpoints:

  • /health - Overall health status
  • /health/ready - Readiness probe (Kubernetes)
  • /health/live - Liveness probe (Kubernetes)

Implications:

  • No automated health monitoring
  • Load balancers can't detect unhealthy instances
  • Kubernetes probes not supported

Recommended Implementation:

builder.Services.AddHealthChecks()
    .AddNpgSql(connectionString);

app.MapHealthChecks("/health");

API Design Evaluation

Strengths

  1. Consistent Interface: All search endpoints follow same pattern
  2. Provider Abstraction: Provider=Any enables cross-provider search
  3. Fuzzy Search: pg_trgm provides forgiving name matching
  4. Swagger Docs: Interactive documentation out of the box
  5. Prometheus Metrics: Basic observability
  6. Input Sanitization: Control character removal

Weaknesses

  1. No Authentication: Fully open API
  2. No Rate Limiting: Vulnerable to abuse
  3. No HTTPS: Plain text traffic
  4. No CORS: Browser clients blocked
  5. No Versioning: Breaking changes affect all clients at once
  6. No Health Checks: Monitoring gaps
  7. Fixed Page Size: No pagination control
  8. No Total Counts: Can't determine result set size
  9. HTTP 200 for Not Found: Should use 404
  10. No Error Details: Generic error responses

Recommendations for Production

  1. Add API Key Authentication:

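    // ApiKeyAuthenticationOptions and ApiKeyAuthenticationHandler are custom types
    // that the application must implement; they are not part of ASP.NET Core.
    builder.Services.AddAuthentication("ApiKey")
        .AddScheme<ApiKeyAuthenticationOptions, ApiKeyAuthenticationHandler>("ApiKey", null);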
    
  2. Implement Rate Limiting:

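    // Requires .NET 7 or later (Microsoft.AspNetCore.RateLimiting).
    builder.Services.AddRateLimiter(options => {
        options.AddFixedWindowLimiter("api", opt => {
            opt.Window = TimeSpan.FromMinutes(1);
            opt.PermitLimit = 100;
        });
    });
    // The middleware must be enabled and the named policy applied to endpoints.
    app.UseRateLimiter();
    app.MapControllers().RequireRateLimiting("api");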
    
  3. Enable HTTPS:

    app.UseHttpsRedirection();
    
  4. Add CORS Policy:

    builder.Services.AddCors(options => {
        options.AddDefaultPolicy(policy => {
            policy.WithOrigins("https://trusted-domain.com")
                  .AllowAnyMethod()
                  .AllowAnyHeader();
        });
    });
    // Registering the policy is not enough; the middleware must also be enabled.
    app.UseCors();
    
  5. Add Health Checks:

    builder.Services.AddHealthChecks()
        .AddNpgSql(connectionString);
    app.MapHealthChecks("/health");
    
  6. Implement API Versioning:

    builder.Services.AddApiVersioning(options => {
        options.DefaultApiVersion = new ApiVersion(1, 0);
        options.AssumeDefaultVersionWhenUnspecified = true;
        options.ReportApiVersions = true;
    });
    
  7. Add Pagination Metadata:

    {
      "data": [ /* results */ ],
      "pagination": {
        "offset": 20,
        "limit": 20,
        "total": 150
      }
    }
    
  8. Use Proper HTTP Status Codes:

    • 200 OK - Results found
    • 404 Not Found - No results
    • 400 Bad Request - Invalid parameters
    • 429 Too Many Requests - Rate limit exceeded
    • 500 Internal Server Error - Server errors

Integration Examples

cURL

# Search artists
curl "http://localhost:56232/api/SearchArtist?Name=Beatles&Provider=Any"

# Get specific artist
curl "http://localhost:56232/api/SearchArtist?Id=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify"

# Search albums with pagination
curl "http://localhost:56232/api/SearchAlbum?AlbumName=Abbey&Offset=20"

JavaScript (Fetch API)

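async function searchArtist(name, provider = 'Any') {
  const params = new URLSearchParams({ Name: name, Provider: provider });
  const response = await fetch(`http://localhost:56232/api/SearchArtist?${params}`);
  // fetch() does not reject on HTTP error statuses, so check explicitly.
  if (!response.ok) {
    throw new Error(`SearchArtist failed with HTTP ${response.status}`);
  }
  const data = await response.json();

  // A miss also returns HTTP 200, so the result type decides the outcome.
  return data.searchResultType === 'Ok' ? data.artists : [];
}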

Python (requests)

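import requests

def search_track(track_name, artist_id=None, provider='Any'):
    params = {
        'TrackName': track_name,
        'Provider': provider
    }
    if artist_id:
        params['ArtistId'] = artist_id
    
    response = requests.get(
        'http://localhost:56232/api/SearchTrack',
        params=params,
        timeout=10  # avoid hanging indefinitely if the service is unreachable
    )
    response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
    data = response.json()
    
    # A miss also returns HTTP 200, so the result type decides the outcome.
    return data['tracks'] if data['searchResultType'] == 'Ok' else []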

C# (HttpClient)

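// Requires: using System.Net.Http.Json;
// Assumes the client-side SearchAlbumResponse model deserializes the enum from
// its string form (for example via JsonStringEnumConverter).
// Reuse one HttpClient; creating a new instance per call can exhaust sockets.
private static readonly HttpClient Client = new HttpClient();

public async Task<List<SearchAlbumEntity>> SearchAlbum(string albumName, string provider = "Any")
{
    var url = $"http://localhost:56232/api/SearchAlbum?AlbumName={Uri.EscapeDataString(albumName)}&Provider={Uri.EscapeDataString(provider)}";
    var response = await Client.GetFromJsonAsync<SearchAlbumResponse>(url);
    
    return response?.SearchResultType == SearchResultType.Ok 
        ? response.Albums 
        : new List<SearchAlbumEntity>();
}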