Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00

# MiniMediaMetadataAPI - Codebase Analysis
## Project Structure
```
MiniMediaMetadataAPI/
├── .github/
│   └── workflows/
│       └── docker-image.yml  # CI/CD pipeline
├── MiniMediaMetadataAPI/  # Web API project
│   ├── Controllers/
│   │   ├── SearchArtistController.cs
│   │   ├── SearchAlbumController.cs
│   │   ├── SearchTrackController.cs
│   │   └── SearchController.cs  # Stub
│   ├── Middlewares/
│   │   └── RequestMiddleware.cs  # Prometheus metrics
│   ├── Options/
│   │   └── PrometheusOptions.cs
│   ├── appsettings.json
│   ├── appsettings.Development.json
│   ├── Program.cs  # Entry point
│   └── MiniMediaMetadataAPI.csproj
├── MiniMediaMetadataAPI.Application/  # Business logic project
│   ├── Configurations/
│   │   └── DatabaseConfiguration.cs
│   ├── Enums/
│   │   ├── ProviderType.cs
│   │   └── SearchResultType.cs
│   ├── Helpers/
│   │   ├── StringHelper.cs
│   │   └── DiscogsHelper.cs
│   ├── Models/
│   │   ├── Database/  # 60+ provider-specific models
│   │   │   ├── Deezer/
│   │   │   ├── Discogs/
│   │   │   ├── MusicBrainz/
│   │   │   ├── SoundCloud/
│   │   │   ├── Spotify/
│   │   │   └── Tidal/
│   │   └── Entities/  # API response models
│   │       ├── SearchArtistEntity.cs
│   │       ├── SearchAlbumEntity.cs
│   │       ├── SearchTrackEntity.cs
│   │       ├── ArtistImageEntity.cs
│   │       ├── AlbumImageEntity.cs
│   │       └── TrackImageEntity.cs
│   ├── Repositories/
│   │   ├── SpotifyRepository.cs
│   │   ├── TidalRepository.cs
│   │   ├── MusicBrainzRepository.cs
│   │   ├── DeezerRepository.cs
│   │   ├── DiscogsRepository.cs
│   │   ├── SoundCloudRepository.cs
│   │   └── JobRepository.cs  # Unused
│   ├── Services/
│   │   ├── SearchArtistService.cs
│   │   ├── SearchAlbumService.cs
│   │   └── SearchTrackService.cs
│   └── MiniMediaMetadataAPI.Application.csproj
├── MiniMediaMetadataAPI.Tests/  # Test project (empty)
│   ├── UnitTest1.cs  # Stub
│   └── MiniMediaMetadataAPI.Tests.csproj
├── Dockerfile
├── compose.yaml
├── .gitignore
├── README.md
└── MiniMediaMetadataAPI.sln
```
**Total Files:** 99 C# files
**Lines of Code:** ~15,000 (estimated)
## Configuration Files
### appsettings.json
**Location:** `MiniMediaMetadataAPI/appsettings.json`
**Full Configuration:**
```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*",
  "DatabaseConfiguration": {
    "ConnectionString": "Host=localhost;Database=minimediametadata;Username=postgres;Password=postgres;MinPoolSize=5;MaxPoolSize=100"
  },
  "Prometheus": {
    "MetricsUrl": "/metrics"
  }
}
```
**Configuration Sections:**
#### Logging
- **Default:** Information level (Info, Warning, Error, Critical)
- **Microsoft.AspNetCore:** Warning level (reduces framework noise)
- **Output:** Console (Docker logs)
#### AllowedHosts
- **Value:** `*` (all hosts allowed)
- **Security Risk:** No host header validation
- **Recommendation:** Specify allowed domains in production
#### DatabaseConfiguration
- **Host:** PostgreSQL server hostname
- **Database:** Database name
- **Username/Password:** Credentials (plain text, NOT SECURE)
- **MinPoolSize:** 5 connections kept alive
- **MaxPoolSize:** 100 concurrent connections
#### Prometheus
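- **MetricsUrl:** Endpoint path for metrics (`/metrics`). The `Program.cs` listing in this analysis calls `app.MapMetrics()` without arguments, so the setting is not visibly consumed there; `/metrics` is also the library default.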
### appsettings.Development.json
**Location:** `MiniMediaMetadataAPI/appsettings.Development.json`
**Configuration:**
```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Debug",
      "Microsoft.AspNetCore": "Information"
    }
  }
}
```
**Changes from Base:**
- Debug logging enabled (verbose)
- Framework logging at Information level
**No Production Override:** `appsettings.Production.json` not included
## Entry Point
### Program.cs
**Location:** `MiniMediaMetadataAPI/Program.cs`
**Full Implementation:**
```csharp
using MiniMediaMetadataAPI.Application.Configurations;
using MiniMediaMetadataAPI.Application.Repositories;
using MiniMediaMetadataAPI.Application.Services;
using MiniMediaMetadataAPI.Middlewares;
using Prometheus;
var builder = WebApplication.CreateBuilder(args);
// Configuration
builder.Services.Configure<DatabaseConfiguration>(
    builder.Configuration.GetSection("DatabaseConfiguration"));
// Repositories
builder.Services.AddScoped<ISpotifyRepository, SpotifyRepository>();
builder.Services.AddScoped<ITidalRepository, TidalRepository>();
builder.Services.AddScoped<IMusicBrainzRepository, MusicBrainzRepository>();
builder.Services.AddScoped<IDeezerRepository, DeezerRepository>();
builder.Services.AddScoped<IDiscogsRepository, DiscogsRepository>();
builder.Services.AddScoped<ISoundCloudRepository, SoundCloudRepository>();
builder.Services.AddScoped<IJobRepository, JobRepository>();
// Services
builder.Services.AddScoped<ISearchArtistService, SearchArtistService>();
builder.Services.AddScoped<ISearchAlbumService, SearchAlbumService>();
builder.Services.AddScoped<ISearchTrackService, SearchTrackService>();
// Controllers
builder.Services.AddControllers();
// Swagger
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();
var app = builder.Build();
// Swagger (all environments)
app.UseSwagger();
app.UseSwaggerUI();
// HTTPS redirection (COMMENTED OUT)
// app.UseHttpsRedirection();
// Prometheus middleware
app.UseMiddleware<RequestMiddleware>();
// Authorization (not configured)
app.UseAuthorization();
// Map controllers
app.MapControllers();
// Prometheus metrics endpoint
app.MapMetrics();
app.Run();
```
**Dependency Injection:**
- **Scoped lifetime:** New instances per request
- **No Singleton services registered:** The only shared state is the static Prometheus counter in `RequestMiddleware`
- **No Transient services:** All scoped
**Middleware Pipeline:**
1. Swagger (documentation)
2. RequestMiddleware (Prometheus metrics)
3. Authorization (no-op, no auth configured)
4. Controllers (endpoint routing)
5. Metrics endpoint (`MapMetrics`, served at `/metrics` by default)
**Missing Middleware:**
- HTTPS redirection (commented out)
- CORS (not configured)
- Authentication (not configured)
- Rate limiting (not configured)
- Exception handling (uses ASP.NET Core default)
**Swagger in Production:** `UseSwagger()` and `UseSwaggerUI()` run unconditionally, so the full API surface is published in every environment (security risk)
## Controllers
### SearchArtistController
**Location:** `MiniMediaMetadataAPI/Controllers/SearchArtistController.cs`
**Implementation:**
```csharp
using Microsoft.AspNetCore.Mvc;
using MiniMediaMetadataAPI.Application.Enums;
using MiniMediaMetadataAPI.Application.Helpers;
using MiniMediaMetadataAPI.Application.Services;

namespace MiniMediaMetadataAPI.Controllers;

[ApiController]
[Route("api/[controller]")]
public class SearchArtistController : ControllerBase
{
    private readonly ISearchArtistService _searchArtistService;
    private readonly ILogger<SearchArtistController> _logger;

    public SearchArtistController(
        ISearchArtistService searchArtistService,
        ILogger<SearchArtistController> logger)
    {
        _searchArtistService = searchArtistService;
        _logger = logger;
    }

    [HttpGet]
    public async Task<IActionResult> Get(
        [FromQuery] string? Id = null,
        [FromQuery] string? Name = null,
        [FromQuery] ProviderType Provider = ProviderType.Any,
        [FromQuery] int Offset = 0)
    {
        // Input sanitization
        Name = StringHelper.RemoveControlChars(Name);

        // Search by ID or Name
        if (!string.IsNullOrEmpty(Id))
        {
            var result = await _searchArtistService.GetArtistById(Id, Provider);
            return Ok(result);
        }
        else if (!string.IsNullOrEmpty(Name))
        {
            var result = await _searchArtistService.SearchArtist(Name, Provider, Offset);
            return Ok(result);
        }
        else
        {
            return BadRequest("Either Id or Name must be provided");
        }
    }
}
```
**Features:**
- Optional parameters (Id, Name, Provider, Offset)
- Input sanitization (control character removal)
- Conditional logic (ID vs Name search)
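- Returns 200 OK whenever `Id` or `Name` is supplied, even if the lookup finds nothing or fails internally; 400 Bad Request only when both are missing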
**Missing:**
- Input validation (length, format)
- Rate limiting
- Authentication
- Proper HTTP status codes (404 for not found)
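One way to approach the missing status codes is to map the service's `SearchResultType` onto an HTTP status before returning. The sketch below is an illustration, not the project's code: `StatusCodeMapper` and `ToHttpStatus` are hypothetical names, the enum is re-declared only to keep the example self-contained, and treating `InQueueSync` as 202 Accepted is an assumption.

```csharp
using System;

// Re-declared here so the sketch compiles on its own; mirrors the documented enum.
public enum SearchResultType
{
    Ok = 0,
    NotFound = 1,
    InQueueSync = 2
}

public static class StatusCodeMapper
{
    // Hypothetical helper: chooses the HTTP status a controller could return.
    public static int ToHttpStatus(SearchResultType type)
    {
        switch (type)
        {
            case SearchResultType.Ok:
                return 200; // results found
            case SearchResultType.NotFound:
                return 404; // nothing matched
            case SearchResultType.InQueueSync:
                return 202; // assumption: sync accepted, data not ready yet
            default:
                throw new ArgumentOutOfRangeException(nameof(type), type, "Unknown result type");
        }
    }
}
```

A controller action could then end with `return StatusCode(StatusCodeMapper.ToHttpStatus(result.SearchResultType), result);`, using the `ControllerBase.StatusCode(int, object)` overload.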
### SearchAlbumController
**Location:** `MiniMediaMetadataAPI/Controllers/SearchAlbumController.cs`
**Similar Structure:**
```csharp
[HttpGet]
public async Task<IActionResult> Get(
    [FromQuery] string? AlbumId = null,
    [FromQuery] string? ArtistId = null,
    [FromQuery] string? AlbumName = null,
    [FromQuery] ProviderType Provider = ProviderType.Any,
    [FromQuery] int Offset = 0)
{
    AlbumName = StringHelper.RemoveControlChars(AlbumName);

    if (!string.IsNullOrEmpty(AlbumId))
    {
        var result = await _searchAlbumService.GetAlbumById(AlbumId, Provider);
        return Ok(result);
    }
    else if (!string.IsNullOrEmpty(AlbumName) || !string.IsNullOrEmpty(ArtistId))
    {
        var result = await _searchAlbumService.SearchAlbum(AlbumName, ArtistId, Provider, Offset);
        return Ok(result);
    }
    else
    {
        return BadRequest("Either AlbumId, AlbumName, or ArtistId must be provided");
    }
}
```
**Additional Feature:** Search by `AlbumName`, by `ArtistId`, or by both combined (either one is sufficient)
### SearchTrackController
**Location:** `MiniMediaMetadataAPI/Controllers/SearchTrackController.cs`
**Similar Structure:**
```csharp
[HttpGet]
public async Task<IActionResult> Get(
    [FromQuery] string? TrackId = null,
    [FromQuery] string? ArtistId = null,
    [FromQuery] string? TrackName = null,
    [FromQuery] ProviderType Provider = ProviderType.Any,
    [FromQuery] int Offset = 0)
{
    TrackName = StringHelper.RemoveControlChars(TrackName);

    if (!string.IsNullOrEmpty(TrackId))
    {
        var result = await _searchTrackService.GetTrackById(TrackId, Provider);
        return Ok(result);
    }
    else if (!string.IsNullOrEmpty(TrackName) || !string.IsNullOrEmpty(ArtistId))
    {
        var result = await _searchTrackService.SearchTrack(TrackName, ArtistId, Provider, Offset);
        return Ok(result);
    }
    else
    {
        return BadRequest("Either TrackId, TrackName, or ArtistId must be provided");
    }
}
```
### SearchController
**Location:** `MiniMediaMetadataAPI/Controllers/SearchController.cs`
**Stub Implementation:**
```csharp
[ApiController]
[Route("api/[controller]")]
public class SearchController : ControllerBase
{
    [HttpGet]
    public IActionResult Get()
    {
        return Ok("Search endpoint - not implemented");
    }
}
```
**Status:** Placeholder only
**Intended Purpose:** Unified search across artists, albums, tracks (speculation)
## Middleware
### RequestMiddleware
**Location:** `MiniMediaMetadataAPI/Middlewares/RequestMiddleware.cs`
**Full Implementation:**
```csharp
using Prometheus;

namespace MiniMediaMetadataAPI.Middlewares;

public class RequestMiddleware
{
    private readonly RequestDelegate _next;

    private static readonly Counter RequestCounter = Metrics.CreateCounter(
        "minimediametadataapi_request_total",
        "Total HTTP requests",
        new CounterConfiguration
        {
            LabelNames = new[] { "path", "method", "status" }
        });

    public RequestMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        await _next(context);

        RequestCounter
            .WithLabels(
                context.Request.Path,
                context.Request.Method,
                context.Response.StatusCode.ToString())
            .Inc();
    }
}
```
**Purpose:** Prometheus metrics collection
**Metrics:**
- **Name:** `minimediametadataapi_request_total`
- **Type:** Counter
- **Labels:** path, method, status
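**Execution:** The counter is incremented after `await _next(context)` returns, so a request whose downstream handler throws is not counted
**Missing Metrics:**
- Request duration
- Database query time
- Dedicated error counters (error rates can only be derived from the `status` label)
- Active requests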
## Enums
### ProviderType
**Location:** `MiniMediaMetadataAPI.Application/Enums/ProviderType.cs`
**Definition:**
```csharp
namespace MiniMediaMetadataAPI.Application.Enums;

public enum ProviderType
{
    Any = 0,
    Deezer = 1,
    Discogs = 2,
    MusicBrainz = 3,
    Spotify = 4,
    Tidal = 5,
    SoundCloud = 6
}
```
**Usage:**
- Query parameter in API endpoints
- Filter for provider-specific searches
- `Any` triggers multi-provider search
### SearchResultType
**Location:** `MiniMediaMetadataAPI.Application/Enums/SearchResultType.cs`
**Definition:**
```csharp
namespace MiniMediaMetadataAPI.Application.Enums;

public enum SearchResultType
{
    Ok = 0,
    NotFound = 1,
    InQueueSync = 2
}
```
**Usage:**
- Response status indicator
- `Ok` - Results found
- `NotFound` - No results
- `InQueueSync` - Data sync in progress (UNUSED)
**Issue:** `InQueueSync` never returned (dead code)
## Helpers
### StringHelper
**Location:** `MiniMediaMetadataAPI.Application/Helpers/StringHelper.cs`
**Full Implementation:**
```csharp
using System.Text.RegularExpressions;

namespace MiniMediaMetadataAPI.Application.Helpers;

public static class StringHelper
{
    public static string RemoveControlChars(string? input)
    {
        if (string.IsNullOrEmpty(input))
            return input ?? string.Empty;

        // Remove control characters (0x00-0x1F, 0x7F-0x9F)
        return Regex.Replace(input, @"[\x00-\x1F\x7F-\x9F]", string.Empty);
    }

    public static string RemoveEmojis(string? input)
    {
        if (string.IsNullOrEmpty(input))
            return input ?? string.Empty;

        // Remove surrogate pairs (emojis)
        return Regex.Replace(input, @"\p{Cs}", string.Empty);
    }
}
```
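**RemoveControlChars:**
- Removes C0 and C1 control characters (U+0000 to U+001F and U+007F to U+009F), including tab and newline
- Keeps control characters out of search terms; SQL injection is prevented separately by parameterized queries
- Applied to the name parameters only (`Name`, `AlbumName`, `TrackName`); ID parameters are passed through unchanged
**RemoveEmojis:**
- Removes UTF-16 surrogate code units (`\p{Cs}`), which strips every character outside the Basic Multilingual Plane, not only emojis; emojis inside the BMP (for example U+2764) are left in place
- NOT used in API (only in MiniMediaScanner)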
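To make the behaviour of the two patterns concrete, the sketch below re-declares them in a standalone class. `RegexBehaviourDemo` and its method names are hypothetical; only the two regular expressions are copied from the `StringHelper` listing above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexBehaviourDemo
{
    // Same pattern as StringHelper.RemoveControlChars: C0 and C1 control characters.
    public static string StripControlChars(string input) =>
        Regex.Replace(input, @"[\x00-\x1F\x7F-\x9F]", string.Empty);

    // Same pattern as StringHelper.RemoveEmojis: every UTF-16 surrogate code unit.
    public static string StripSurrogates(string input) =>
        Regex.Replace(input, @"\p{Cs}", string.Empty);
}
```

With these definitions, `StripControlChars("Daft\tPunk\n")` yields `"DaftPunk"`. `StripSurrogates` removes U+1F3B5 (an emoji) and also U+1D11E (a musical symbol that is not an emoji), while a BMP character such as U+2764 passes through untouched.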
### DiscogsHelper
**Location:** `MiniMediaMetadataAPI.Application/Helpers/DiscogsHelper.cs`
**Full Implementation:**
```csharp
namespace MiniMediaMetadataAPI.Application.Helpers;

public static class DiscogsHelper
{
    public static int GetDiscNumber(string? position)
    {
        if (string.IsNullOrEmpty(position))
            return 1;

        var parts = position.Split('-');
        return parts.Length > 0 && int.TryParse(parts[0], out var disc)
            ? disc
            : 1;
    }

    public static int GetTrackNumber(string? position)
    {
        if (string.IsNullOrEmpty(position))
            return 0;

        var parts = position.Split('-');
        return parts.Length > 1 && int.TryParse(parts[1], out var track)
            ? track
            : 0;
    }
}
```
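**Purpose:** Parse Discogs position format (`"2-5"` → disc 2, track 5)
**Limitations:**
- Only handles the numeric `disc-track` format (vinyl sides like `"A1"` fall back to disc 1, track 0)
- A position without a hyphen, such as `"5"`, is read as disc 5, track 0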
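A parser that also accepts vinyl-style positions could look roughly like the sketch below. It is not part of the codebase: `DiscogsPositionSketch` and `Parse` are hypothetical names, and the rules that sides A/B belong to disc 1, C/D to disc 2, and so on, and that a bare number is a track on disc 1, are assumptions about how such positions should be interpreted.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

public static class DiscogsPositionSketch
{
    // "2-5" (disc-track) or "5" (track only).
    private static readonly Regex Numeric = new Regex(@"^(?:([0-9]+)-)?([0-9]+)$");

    // "A1", "b12", or a bare side letter such as "A".
    private static readonly Regex Vinyl = new Regex(@"^([A-Za-z])([0-9]*)$");

    public static (int Disc, int Track) Parse(string? position)
    {
        var text = position?.Trim() ?? string.Empty;

        var numeric = Numeric.Match(text);
        if (numeric.Success && int.TryParse(numeric.Groups[2].Value, out var track))
        {
            var disc = 1; // assumption: a bare track number belongs to disc 1
            if (numeric.Groups[1].Success && !int.TryParse(numeric.Groups[1].Value, out disc))
                return (1, 0);
            return (disc, track);
        }

        var vinyl = Vinyl.Match(text);
        if (vinyl.Success)
        {
            // Assumption: two sides per record, so A/B -> disc 1, C/D -> disc 2.
            var sideIndex = char.ToUpperInvariant(vinyl.Groups[1].Value[0]) - 'A';
            var digits = vinyl.Groups[2].Value;
            var sideTrack = 1; // a bare side letter is treated as its first track
            if (digits.Length > 0 && !int.TryParse(digits, out sideTrack))
                return (1, 0);
            return (sideIndex / 2 + 1, sideTrack);
        }

        // Unrecognised input keeps the existing helper's defaults.
        return (1, 0);
    }
}
```

Note that the track number returned for a vinyl position is local to its side, so `"B1"` and `"A1"` both report track 1; a caller that needs disc-wide numbering would have to renumber.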
## Error Handling
### Repository Level
**Pattern:**
```csharp
public async Task<List<SearchArtistEntity>> SearchArtist(string name, int offset)
{
    try
    {
        using var connection = new NpgsqlConnection(_connectionString);
        var results = await connection.QueryAsync<SpotifyArtist>(sql, parameters);
        return results.Select(MapToEntity).ToList();
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error searching Spotify artists for term: {SearchTerm}", name);
        return new List<SearchArtistEntity>();
    }
}
```
**Strategy:**
- Catch all exceptions
- Log error with context
- Return empty list (no error propagation)
**Issues:**
- Client can't distinguish error from no results
- No retry logic
- No specific exception handling
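One possible way to let callers tell a failure apart from an empty result is to return a small result wrapper instead of a bare list. The sketch below is an illustration under assumptions: `QueryResult<T>` and `QueryRunner` are hypothetical types that do not exist in the codebase, and logging is left out to keep the example free of dependencies.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical wrapper: carries either the rows or the exception that prevented them.
public sealed class QueryResult<T>
{
    private QueryResult(IReadOnlyList<T> items, Exception? error)
    {
        Items = items;
        Error = error;
    }

    public IReadOnlyList<T> Items { get; }
    public Exception? Error { get; }
    public bool Succeeded => Error is null;

    public static QueryResult<T> Ok(IReadOnlyList<T> items) => new QueryResult<T>(items, null);
    public static QueryResult<T> Failed(Exception error) => new QueryResult<T>(Array.Empty<T>(), error);
}

public static class QueryRunner
{
    // Runs a query and records a failure instead of hiding it behind an empty list.
    public static async Task<QueryResult<T>> RunAsync<T>(Func<Task<IReadOnlyList<T>>> query)
    {
        try
        {
            return QueryResult<T>.Ok(await query());
        }
        catch (Exception ex)
        {
            return QueryResult<T>.Failed(ex);
        }
    }
}
```

A service could then map `Succeeded == false` to an error status and `Succeeded == true` with zero items to `NotFound`, which the current pattern cannot express.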
### Service Level
**Pattern:**
```csharp
public async Task<SearchArtistResponse> SearchArtist(string name, ProviderType provider, int offset)
{
    try
    {
        // Orchestrate repositories
        var results = await GetResults(name, provider, offset);
        return new SearchArtistResponse
        {
            SearchResultType = results.Any() ? SearchResultType.Ok : SearchResultType.NotFound,
            Artists = results
        };
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error in SearchArtistService for term: {SearchTerm}", name);
        return new SearchArtistResponse
        {
            SearchResultType = SearchResultType.NotFound,
            Artists = new List<SearchArtistEntity>()
        };
    }
}
```
**Strategy:** Same as repository: exceptions are swallowed and reported to the caller as `NotFound` with an empty list
### Controller Level
**Pattern:**
```csharp
[HttpGet]
public async Task<IActionResult> Get(...)
{
    // No try-catch - relies on ASP.NET Core default error handling
    var result = await _searchArtistService.SearchArtist(name, provider, offset);
    return Ok(result);
}
```
**Strategy:** No error handling (framework handles unhandled exceptions)
**ASP.NET Core Default:**
- 500 Internal Server Error for unhandled exceptions
- Generic error response (no details in production)
- Stack trace hidden in production
## Logging
### Configuration
**appsettings.json:**
```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  }
}
```
**Log Levels:**
- Trace (0) - Most verbose
- Debug (1) - Development details
- Information (2) - General flow
- Warning (3) - Unexpected but handled
- Error (4) - Failures
- Critical (5) - Fatal errors
**Current Levels:**
- Application: Information
- Framework: Warning
### Usage Patterns
**Repository Logging:**
```csharp
_logger.LogError(ex, "Error searching Spotify artists for term: {SearchTerm}", name);
```
**Service Logging:**
```csharp
_logger.LogError(ex, "Error in SearchArtistService for term: {SearchTerm}", name);
```
**No Info/Debug Logging:** Only errors logged
**Missing:**
- Request/response logging
- Performance logging
- Business event logging
- Structured (JSON) log output, for example via Serilog; message templates such as `{SearchTerm}` are already used
### Log Output
**Destination:** Console (Docker logs)
**Format:** Plain text (not JSON)
**Example (illustrative):**
```
fail: MiniMediaMetadataAPI.Application.Repositories.SpotifyRepository[0]
      Error searching Spotify artists for term: Beatles
      Npgsql.NpgsqlException: Connection refused
```
**No Correlation IDs:** Can't trace requests across logs
## Testing
### Test Project
**Location:** `MiniMediaMetadataAPI.Tests/`
**Framework:** xUnit
**Current State:** Empty stub
**UnitTest1.cs:**
```csharp
namespace MiniMediaMetadataAPI.Tests;

public class UnitTest1
{
    [Fact]
    public void Test1()
    {
        // Empty test
    }
}
```
**Coverage:** 0%
**CI/CD Integration:** Tests not run in pipeline
### Missing Tests
**Unit Tests:**
- Repository methods
- Service orchestration
- Helper functions
- Enum conversions
**Integration Tests:**
- Database queries
- Multi-provider aggregation
- Error handling
**API Tests:**
- Controller endpoints
- Input validation
- Response formats
**Performance Tests:**
- Load testing
- Stress testing
- Fuzzy search performance
## Code Quality
### Naming Conventions
**Consistent:**
- PascalCase for classes, methods, properties
- camelCase for parameters, local variables
- Prefix interfaces with `I`
- Suffix repositories with `Repository`
- Suffix services with `Service`
**Examples:**
```csharp
public interface ISpotifyRepository { }
public class SpotifyRepository : ISpotifyRepository { }
public class SearchArtistService : ISearchArtistService { }
```
### Code Organization
**Good:**
- Clear separation of concerns (controllers, services, repositories)
- Provider isolation (one repository per provider)
- Consistent file structure
**Issues:**
- Large repository files (500+ lines)
- No partial classes for large models
- Helpers in single file (could split)
### Documentation
**XML Comments:** None
**README:** Basic setup instructions
**API Documentation:** Swagger only (no additional docs)
**Missing:**
- Code comments
- Architecture documentation
- Deployment guide
- Troubleshooting guide
### Code Smells
**Unused Dependencies:**
- Quartz (referenced but no jobs)
- Polly (referenced but no policies)
- FuzzySharp (referenced but not used)
- SpotifyAPI.Web.Auth (not used)
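**Magic Numbers:**
- Hardcoded page size (20)
- Hardcoded similarity threshold (0.5)
- Connection pool sizes (5, 100) are set inside the connection string in `appsettings.json`, so they are configurable but not exposed as separate settings
**Recommendation:** Move the page size and similarity threshold to configuration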
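As a rough illustration of that recommendation, the values could live in an options class bound the same way `DatabaseConfiguration` already is. Everything in this sketch is hypothetical: the `SearchOptions` name, the `"Search"` section it would bind to, and the validation bounds are assumptions; only the defaults (20 and 0.5) come from the analysis above.

```csharp
using System;

// Hypothetical options class; defaults mirror the currently hardcoded values.
public sealed class SearchOptions
{
    public int PageSize { get; set; } = 20;
    public double SimilarityThreshold { get; set; } = 0.5;

    // Fails fast on values that would produce nonsensical queries.
    public void Validate()
    {
        if (PageSize < 1 || PageSize > 100) // upper bound is an assumed sanity limit
            throw new ArgumentOutOfRangeException(nameof(PageSize), PageSize, "Expected 1..100");

        if (SimilarityThreshold < 0.0 || SimilarityThreshold > 1.0)
            throw new ArgumentOutOfRangeException(nameof(SimilarityThreshold), SimilarityThreshold, "Expected 0..1");
    }
}
```

In `Program.cs` this could be registered with `builder.Services.Configure<SearchOptions>(builder.Configuration.GetSection("Search"));`, following the existing `DatabaseConfiguration` pattern.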
**Swallowed Exceptions:**
- All exceptions caught and logged
- No error propagation
- Client can't distinguish errors from no results
**No Async Suffix:**
- Methods return `Task<T>` but don't end with `Async`
- Violates .NET naming conventions
**Example:**
```csharp
// Current
Task<List<SearchArtistEntity>> SearchArtist(string name, int offset);
// Recommended
Task<List<SearchArtistEntity>> SearchArtistAsync(string name, int offset);
```
## Security Analysis
### Authentication
**Status:** None
**Implications:**
- Fully open API
- No user identification
- No access control
### Authorization
**Status:** None (middleware registered but not configured)
**Code:**
```csharp
app.UseAuthorization(); // No-op without authentication
```
### Input Validation
**Implemented:**
- Control character removal
- Null/empty checks
**Missing:**
- Length validation
- Format validation (for example ID formats, non-negative `Offset`)
**Handled Elsewhere:**
- SQL injection (parameterized queries via Dapper)
- XSS (responses are serialized as JSON)
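The length and range checks that are currently absent could be sketched as a small validator called at the top of each controller action. `SearchInputValidator`, `TryValidate`, and the 200-character limit are hypothetical and chosen only for illustration.

```csharp
#nullable enable
using System;

public static class SearchInputValidator
{
    // Assumed limit; the codebase does not define one.
    public const int MaxNameLength = 200;

    // Returns false and a human-readable reason when the input should be rejected.
    public static bool TryValidate(string? name, int offset, out string? error)
    {
        if (offset < 0)
        {
            error = "Offset must not be negative";
            return false;
        }

        if (name != null && name.Length > MaxNameLength)
        {
            error = $"Name must be at most {MaxNameLength} characters";
            return false;
        }

        error = null;
        return true;
    }
}
```

A controller could respond with `BadRequest(error)` when `TryValidate` returns `false`, before any service call is made.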
### HTTPS
**Status:** Disabled
**Code:**
```csharp
// app.UseHttpsRedirection(); // COMMENTED OUT
```
**Implication:** Plain text traffic (expects a reverse proxy to terminate TLS)
### Secrets Management
**Status:** Plain text in configuration
**Issue:**
```json
{
  "DatabaseConfiguration": {
    "ConnectionString": "Host=localhost;Database=minimediametadata;Username=postgres;Password=postgres"
  }
}
```
**Recommendation:** Environment variables or secrets manager
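With the default ASP.NET Core host, an environment variable named `DatabaseConfiguration__ConnectionString` overrides the `DatabaseConfiguration:ConnectionString` value from `appsettings.json`, so no code change is strictly required. The sketch below adds an optional fail-fast check on top of that; `ConnectionStringSource` and `Resolve` are hypothetical names, and taking the lookup as a delegate is a design choice made here only so the check can be exercised without touching the real environment.

```csharp
#nullable enable
using System;

public static class ConnectionStringSource
{
    // Double underscore is how the environment-variable provider spells the ':' separator.
    public const string VariableName = "DatabaseConfiguration__ConnectionString";

    // Hypothetical helper: refuses to start with a missing or blank connection string.
    public static string Resolve(Func<string, string?> getEnvironmentVariable)
    {
        var value = getEnvironmentVariable(VariableName);
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"{VariableName} is not set");

        return value;
    }
}
```

At startup this would be called as `ConnectionStringSource.Resolve(Environment.GetEnvironmentVariable)`.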
### CORS
**Status:** Not configured
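**Implication:** Cross-origin requests from browser clients are blocked by the browser's same-origin policy; same-origin and non-browser clients are unaffected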
### Rate Limiting
**Status:** Not implemented
**Implication:** Vulnerable to abuse
## Performance Considerations
### Database Queries
**Optimized:**
- Parameterized queries (Dapper)
- Connection pooling
- Fuzzy search with GIN indexes
**Not Optimized:**
- No query result caching
- No prepared statements
- No query batching
### Parallel Execution
**Implemented:**
```csharp
var tasks = new[]
{
    _spotify.SearchArtist(name, offset),
    _tidal.SearchArtist(name, offset),
    // ... 4 more providers
};
var results = await Task.WhenAll(tasks);
```
**Benefit:** Multi-provider search in parallel (not sequential)
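The fan-out and merge step can be sketched in isolation as follows. This is a simplified stand-in, not the service's code: providers are modelled as plain delegates returning strings, and `ParallelSearchSketch` and `SearchAllAsync` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSearchSketch
{
    public static async Task<List<string>> SearchAllAsync(
        IEnumerable<Func<string, Task<List<string>>>> providers,
        string name)
    {
        // ToArray() forces every provider call to start before the first await.
        Task<List<string>>[] tasks = providers.Select(search => search(name)).ToArray();

        // WhenAll returns results in task order, so they line up with the provider order.
        List<string>[] perProvider = await Task.WhenAll(tasks);

        // Flatten the per-provider lists into one combined result.
        return perProvider.SelectMany(list => list).ToList();
    }
}
```

`Task.WhenAll` rethrows if any task faults; in the analyzed code that case does not arise because each repository catches its own exceptions and returns an empty list.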
### Memory Usage
**Efficient:**
- Scoped services (per-request lifetime)
- No static state apart from the Prometheus counter in `RequestMiddleware`
- Connection pooling
**Potential Issues:**
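- Large result sets (the page size is fixed at 20 per query, but `Offset` has no upper bound and multi-provider searches combine several pages)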
- No streaming for large responses
## Codebase Evaluation
**Strengths:**
- Clean architecture (layers well-separated)
- Consistent naming conventions
- Provider isolation (repository pattern)
- Parallel query execution
- Input sanitization
**Weaknesses:**
- No tests (0% coverage)
- No authentication/authorization
- Unused dependencies
- Swallowed exceptions
- No structured logging
- No XML documentation
- Magic numbers hardcoded
- HTTPS disabled
- No rate limiting
**Maintainability:** 7/10
- Easy to understand
- Clear structure
- But lacks tests and documentation
**Production Readiness:** 5/10
- Works but missing critical features
- Security gaps
- No observability beyond basic metrics
**Recommendations:**
1. Add comprehensive tests
2. Implement authentication
3. Remove unused dependencies
4. Add structured logging (Serilog)
5. Move magic numbers to configuration
6. Add XML documentation
7. Implement proper error handling
8. Enable HTTPS
9. Add rate limiting
10. Add health checks