Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00


# MiniMediaMetadataAPI - API Interface Analysis
## API Surface Overview
**Base URL:** `http://localhost:56232` (production deployment)
**API Prefix:** `/api`
**Documentation:** `/swagger` (Swagger UI)
**Metrics:** `/metrics` (Prometheus format)
## Controllers
### 1. SearchArtistController
**Path:** `/api/SearchArtist`
**File:** `Controllers/SearchArtistController.cs`
#### GET /api/SearchArtist
**Query Parameters:**
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| Id | string | No | null | Artist ID (provider-specific format) |
| Name | string | No | null | Artist name for fuzzy search |
| Provider | ProviderType | No | Any | Target provider or Any for all |
| Offset | int | No | 0 | Pagination offset |
**Provider Parameter Values:**
- `Any` (default) - Search all 6 providers
- `Spotify` - Spotify only
- `Tidal` - Tidal only
- `MusicBrainz` - MusicBrainz only
- `Deezer` - Deezer only
- `Discogs` - Discogs only
- `SoundCloud` - SoundCloud only
**Response Model:** `SearchArtistResponse`
```json
{
  "searchResultType": "Ok",
  "artists": [
    {
      "providerType": "Spotify",
      "id": "3WrFJ7ztbogyGnTHbHJFl2",
      "name": "The Beatles",
      "popularity": 100,
      "url": "https://open.spotify.com/artist/3WrFJ7ztbogyGnTHbHJFl2",
      "totalFollowers": 50000000,
      "genres": ["rock", "british invasion", "classic rock"],
      "images": [
        {
          "url": "https://i.scdn.co/image/...",
          "height": 640,
          "width": 640
        }
      ],
      "sortName": null,
      "lastSyncTime": "2024-01-15T10:30:00Z"
    }
  ]
}
```
**SearchResultType Enum:**
- `Ok` (0) - Results found
- `NotFound` (1) - No results
- `InQueueSync` (2) - Data sync in progress (unused in current implementation)
**SearchArtistEntity Fields:**
| Field | Type | Nullable | Providers | Description |
|-------|------|----------|-----------|-------------|
| providerType | ProviderType | No | All | Source provider |
| id | string | No | All | Provider-specific artist ID |
| name | string | No | All | Artist name |
| popularity | int | Yes | Spotify, Deezer | Popularity score (0-100) |
| url | string | Yes | All | Artist URL on provider platform |
| totalFollowers | int | Yes | Spotify, Deezer | Follower count |
| genres | string[] | Yes | Spotify, Deezer, MusicBrainz | Genre tags |
| images | ArtistImageEntity[] | Yes | Spotify, Tidal, Deezer | Artist images |
| sortName | string | Yes | MusicBrainz | MusicBrainz sort name |
| lastSyncTime | DateTime | Yes | All | Last data sync timestamp |
**Example Requests:**
```bash
# Search by name across all providers
GET /api/SearchArtist?Name=Beatles&Provider=Any
# Search Spotify only
GET /api/SearchArtist?Name=Beatles&Provider=Spotify
# Get artist by ID
GET /api/SearchArtist?Id=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify
# Paginated search
GET /api/SearchArtist?Name=Beatles&Offset=20
```
**Input Sanitization:**
```csharp
Name = StringHelper.RemoveControlChars(Name);
```
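Removes control characters (0x00-0x1F, 0x7F-0x9F) from the `Name` parameter. This blocks control-character injection such as null bytes and terminal escape sequences; it is not a defense against SQL injection, which is handled separately by parameterized queries.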
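To see what this sanitization does to a concrete input, the same character class can be reproduced outside the service. The following is a minimal Python sketch, not the project's code: `remove_control_chars` is a hypothetical name, and only the regex ranges are taken from the analysis above.

```python
import re

# Same ranges as the C# pattern quoted in this analysis:
# C0 controls (0x00-0x1F), DEL (0x7F) and C1 controls (0x80-0x9F).
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f-\x9f]")


def remove_control_chars(value):
    """Strip control characters; None and empty strings pass through unchanged."""
    if not value:
        return value
    return _CONTROL_CHARS.sub("", value)


# A null byte, an ESC character and a trailing newline are all removed.
cleaned = remove_control_chars("Beat\x00les\x1b[31m\n")
print(repr(cleaned))
```

Only the control characters themselves are dropped. The printable remainder of an escape sequence (here `[31m`) stays in the string, and tabs and newlines are removed along with everything else in the range.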
### 2. SearchAlbumController
**Path:** `/api/SearchAlbum`
**File:** `Controllers/SearchAlbumController.cs`
#### GET /api/SearchAlbum
**Query Parameters:**
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| AlbumId | string | No | null | Album ID (provider-specific) |
| ArtistId | string | No | null | Filter by artist ID |
| AlbumName | string | No | null | Album name for fuzzy search |
| Provider | ProviderType | No | Any | Target provider |
| Offset | int | No | 0 | Pagination offset |
**Response Model:** `SearchAlbumResponse`
```json
{
  "searchResultType": "Ok",
  "albums": [
    {
      "providerType": "Spotify",
      "id": "1klALx0u4AavZNEvC4LrTL",
      "name": "Abbey Road",
      "popularity": 95,
      "url": "https://open.spotify.com/album/1klALx0u4AavZNEvC4LrTL",
      "label": "Apple Records",
      "releaseDate": "1969-09-26",
      "totalTracks": 17,
      "type": "album",
      "upc": "00602547437389",
      "copyright": "℗ 2009 Calderstone Productions Limited",
      "artistId": "3WrFJ7ztbogyGnTHbHJFl2",
      "images": [
        {
          "url": "https://i.scdn.co/image/...",
          "height": 640,
          "width": 640
        }
      ],
      "artists": [
        {
          "id": "3WrFJ7ztbogyGnTHbHJFl2",
          "name": "The Beatles"
        }
      ],
      "lastSyncTime": "2024-01-15T10:30:00Z"
    }
  ]
}
```
**SearchAlbumEntity Fields:**
| Field | Type | Nullable | Providers | Description |
|-------|------|----------|-----------|-------------|
| providerType | ProviderType | No | All | Source provider |
| id | string | No | All | Provider-specific album ID |
| name | string | No | All | Album name |
| popularity | int | Yes | Spotify, Deezer | Popularity score |
| url | string | Yes | All | Album URL |
| label | string | Yes | Spotify, Tidal, MusicBrainz, Discogs | Record label |
| releaseDate | string | Yes | All | Release date (ISO 8601) |
| totalTracks | int | Yes | All | Track count |
| type | string | Yes | Spotify, Tidal, MusicBrainz | Album type (album, single, compilation) |
| upc | string | Yes | Spotify, Tidal, Discogs | Universal Product Code |
| copyright | string | Yes | Spotify, Tidal | Copyright notice |
| artistId | string | Yes | All | Primary artist ID |
| images | AlbumImageEntity[] | Yes | Spotify, Tidal, Deezer | Album artwork |
| artists | ArtistReference[] | Yes | All | Contributing artists |
| lastSyncTime | DateTime | Yes | All | Last sync timestamp |
**Example Requests:**
```bash
# Search by album name
GET /api/SearchAlbum?AlbumName=Abbey%20Road&Provider=Any
# Search albums by artist
GET /api/SearchAlbum?ArtistId=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify
# Get album by ID
GET /api/SearchAlbum?AlbumId=1klALx0u4AavZNEvC4LrTL&Provider=Spotify
# Combined search
GET /api/SearchAlbum?AlbumName=Abbey&ArtistId=3WrFJ7ztbogyGnTHbHJFl2
```
### 3. SearchTrackController
**Path:** `/api/SearchTrack`
**File:** `Controllers/SearchTrackController.cs`
#### GET /api/SearchTrack
**Query Parameters:**
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| TrackId | string | No | null | Track ID (provider-specific) |
| ArtistId | string | No | null | Filter by artist ID |
| TrackName | string | No | null | Track name for fuzzy search |
| Provider | ProviderType | No | Any | Target provider |
| Offset | int | No | 0 | Pagination offset |
**Response Model:** `SearchTrackResponse`
```json
{
  "searchResultType": "Ok",
  "tracks": [
    {
      "providerType": "Spotify",
      "id": "6dGnYIeXmHdcikdzNNDMm2",
      "name": "Here Comes The Sun",
      "popularity": 90,
      "url": "https://open.spotify.com/track/6dGnYIeXmHdcikdzNNDMm2",
      "duration": 185000,
      "explicit": false,
      "discNumber": 1,
      "trackNumber": 7,
      "label": "Apple Records",
      "isrc": "GBAYE0601715",
      "album": {
        "id": "1klALx0u4AavZNEvC4LrTL",
        "name": "Abbey Road",
        "releaseDate": "1969-09-26"
      },
      "artists": [
        {
          "id": "3WrFJ7ztbogyGnTHbHJFl2",
          "name": "The Beatles"
        }
      ],
      "images": [
        {
          "url": "https://i.scdn.co/image/...",
          "height": 640,
          "width": 640
        }
      ],
      "lastSyncTime": "2024-01-15T10:30:00Z"
    }
  ]
}
```
**SearchTrackEntity Fields:**
| Field | Type | Nullable | Providers | Description |
|-------|------|----------|-----------|-------------|
| providerType | ProviderType | No | All | Source provider |
| id | string | No | All | Provider-specific track ID |
| name | string | No | All | Track name |
| popularity | int | Yes | Spotify, Deezer | Popularity score |
| url | string | Yes | All | Track URL |
| duration | int | Yes | All | Duration in milliseconds |
| explicit | bool | Yes | Spotify, Tidal, Deezer | Explicit content flag |
| discNumber | int | Yes | All | Disc number in multi-disc albums |
| trackNumber | int | Yes | All | Track number on disc |
| label | string | Yes | Spotify, Tidal | Record label |
| isrc | string | Yes | Spotify, Tidal, MusicBrainz | International Standard Recording Code |
| album | AlbumReference | Yes | All | Parent album info |
| artists | ArtistReference[] | Yes | All | Contributing artists |
| images | TrackImageEntity[] | Yes | Spotify, Tidal, Deezer | Track/album artwork |
| lastSyncTime | DateTime | Yes | All | Last sync timestamp |
**Example Requests:**
```bash
# Search by track name
GET /api/SearchTrack?TrackName=Here%20Comes%20The%20Sun&Provider=Any
# Search tracks by artist
GET /api/SearchTrack?ArtistId=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify
# Get track by ID
GET /api/SearchTrack?TrackId=6dGnYIeXmHdcikdzNNDMm2&Provider=Spotify
# Combined search
GET /api/SearchTrack?TrackName=Sun&ArtistId=3WrFJ7ztbogyGnTHbHJFl2
```
### 4. SearchController
**Path:** `/api/Search`
**File:** `Controllers/SearchController.cs`
**Status:** Stub implementation (not functional)
#### GET /api/Search
**Current Implementation:**
```csharp
[HttpGet]
public IActionResult Get()
{
    return Ok("Search endpoint - not implemented");
}
```
**Intended Purpose:** Unified search across artists, albums, and tracks (speculation based on naming).
**Current State:** Returns placeholder string, no actual functionality.
## Swagger/OpenAPI Documentation
**Endpoint:** `/swagger`
**Framework:** Swashbuckle.AspNetCore 10.1.7
**Configuration:**
```csharp
builder.Services.AddSwaggerGen();
app.UseSwagger();
app.UseSwaggerUI();
```
**Features:**
- Auto-generated from controller attributes
- Interactive API testing
- Request/response schema documentation
- Enum value descriptions
**Access:** Enabled in all environments; Swagger UI is not disabled in production.
**Example Swagger URL:** `http://localhost:56232/swagger/index.html`
## Prometheus Metrics
**Endpoint:** `/metrics`
**Format:** Prometheus text exposition format
**Metrics Exposed:**
### minimediametadataapi_request_total
**Type:** Counter
**Description:** Total HTTP requests
**Labels:**
- `path` - Request path (e.g., `/api/SearchArtist`)
- `method` - HTTP method (GET, POST, etc.)
- `status` - HTTP status code (200, 404, 500, etc.)
**Example Output:**
```
# HELP minimediametadataapi_request_total Total HTTP requests
# TYPE minimediametadataapi_request_total counter
minimediametadataapi_request_total{path="/api/SearchArtist",method="GET",status="200"} 1523
minimediametadataapi_request_total{path="/api/SearchAlbum",method="GET",status="200"} 892
minimediametadataapi_request_total{path="/api/SearchTrack",method="GET",status="404"} 45
minimediametadataapi_request_total{path="/metrics",method="GET",status="200"} 3200
```
**No other metrics:**
- No request duration histograms
- No database query metrics
- No error rate metrics
- No active request gauges
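Because the counter is the only metric, figures such as an error share have to be derived by the consumer. The sketch below shows one way to do that in Python by parsing the exposition text shown above; `requests_by_status` is a hypothetical helper, and the simple regexes assume label values contain no escaped quotes or closing braces.

```python
import re
from collections import Counter

# Sample copied from the example output in this analysis.
SAMPLE = """\
# HELP minimediametadataapi_request_total Total HTTP requests
# TYPE minimediametadataapi_request_total counter
minimediametadataapi_request_total{path="/api/SearchArtist",method="GET",status="200"} 1523
minimediametadataapi_request_total{path="/api/SearchAlbum",method="GET",status="200"} 892
minimediametadataapi_request_total{path="/api/SearchTrack",method="GET",status="404"} 45
minimediametadataapi_request_total{path="/metrics",method="GET",status="200"} 3200
"""

_LINE = re.compile(r"^minimediametadataapi_request_total\{([^}]*)\}\s+(\S+)$")
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def requests_by_status(text):
    """Sum the request counter per HTTP status label."""
    totals = Counter()
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            continue  # skips HELP/TYPE comments and unrelated metrics
        labels = dict(_LABEL.findall(match.group(1)))
        totals[labels.get("status", "unknown")] += float(match.group(2))
    return totals


totals = requests_by_status(SAMPLE)
non_200_share = 1 - totals["200"] / sum(totals.values())
print(dict(totals), round(non_200_share, 4))
```

In a real deployment this aggregation would normally be a PromQL query over scraped data; the sketch only illustrates that the `status` label carries enough information to compute it.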
## Security Analysis
### Authentication
**Status:** None
**Implications:** Fully open API, no user identification
**Missing:**
- API keys
- OAuth 2.0
- JWT tokens
- Basic authentication
- Client certificates
### Authorization
**Status:** None
**Implications:** All endpoints accessible to all clients
**Missing:**
- Role-based access control (RBAC)
- Scope-based permissions
- Rate limiting per user/key
### HTTPS
**Status:** Commented out in production
**Program.cs:**
```csharp
// app.UseHttpsRedirection(); // COMMENTED OUT
```
**Implications:**
- Traffic sent in plain text
- Vulnerable to man-in-the-middle attacks
- No encryption for sensitive data (if any)
**Deployment:** Expects reverse proxy (nginx, Traefik) to handle TLS termination.
### CORS
**Status:** Not configured
**Implications:**
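- The API sends no `Access-Control-Allow-Origin` header, so browsers block cross-origin scripts from reading its responses
- Non-browser clients (cURL, server-side code) are unaffected
- Browser front ends must be served from the same origin or go through a proxy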
**Missing Configuration:**
```csharp
// NOT PRESENT
builder.Services.AddCors(options => { ... });
app.UseCors();
```
### Input Validation
**Sanitization:** `StringHelper.RemoveControlChars()`
**Implementation:**
```csharp
public static string RemoveControlChars(string input)
{
    if (string.IsNullOrEmpty(input))
        return input;
    return Regex.Replace(input, @"[\x00-\x1F\x7F-\x9F]", string.Empty);
}
```
**Protects Against:**
- Control character injection
- Null byte attacks
- Terminal escape sequences
**Does NOT Protect Against:**
- SQL injection (mitigated by Dapper parameterization)
- XSS (JSON serialization handles escaping)
- Path traversal (no file operations)
- Command injection (no shell execution)
**Additional Sanitization:** `StringHelper.RemoveEmojis()`
**Implementation:**
```csharp
public static string RemoveEmojis(string input)
{
    if (string.IsNullOrEmpty(input))
        return input;
    return Regex.Replace(input,
        @"\p{Cs}", // Surrogate pairs (emojis)
        string.Empty);
}
```
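**Usage:** Applied to database queries, not API inputs. Note that `\p{Cs}` matches UTF-16 surrogate code units, so it strips every character outside the Basic Multilingual Plane (most emojis, but also rare CJK ideographs) and leaves BMP symbols such as U+2764 in place.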
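The effect of a surrogate-based filter is easiest to see on a mixed string. Python strings hold code points rather than UTF-16 units, so the closest equivalent of removing `\p{Cs}` matches is dropping every code point above U+FFFF (plus any lone surrogate). This is a rough sketch under that assumption; `remove_non_bmp` is a hypothetical name, not part of the project.

```python
def remove_non_bmp(value):
    """Approximate the C# \\p{Cs} removal for a Python str."""
    if not value:
        return value
    return "".join(
        ch for ch in value
        # Drop supplementary-plane characters (surrogate pairs in UTF-16)
        # and any stray lone surrogate code point.
        if ord(ch) <= 0xFFFF and not 0xD800 <= ord(ch) <= 0xDFFF
    )


# U+2764 (heavy black heart) is in the BMP; U+1F3B5 (musical note) and
# U+20BB7 (a CJK ideograph) are not.
sample = "Love \u2764 \U0001F3B5 \U00020BB7"
print(repr(remove_non_bmp(sample)))
```

The heart survives while both the musical note and the CJK ideograph are removed, which shows the filter is broader than "emojis" in one direction and narrower in the other.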
### Rate Limiting
**Status:** None
**Implications:**
- Vulnerable to abuse
- No protection against DoS
- Unlimited queries per client
**Missing:**
- Request throttling
- IP-based limits
- API key quotas
- Burst protection
### SQL Injection Protection
**Mechanism:** Dapper parameterized queries
**Example:**
```csharp
var sql = "SELECT * FROM spotify_artist WHERE name ILIKE @name";
var results = await connection.QueryAsync<SpotifyArtist>(sql, new { name = $"%{searchTerm}%" });
```
**Protection:** Parameters never concatenated into SQL strings.
**Risk Level:** Low (Dapper handles parameterization correctly).
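The same principle can be demonstrated without Dapper or PostgreSQL. The sketch below uses Python's built-in `sqlite3` as a stand-in (an assumption made purely for a self-contained example; the table and the `search_artists` helper are hypothetical) to show that a bound parameter is treated as data even when it contains SQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (name TEXT)")
conn.executemany(
    "INSERT INTO artist (name) VALUES (?)",
    [("The Beatles",), ("Beatsteaks",)],
)


def search_artists(connection, term):
    # The wildcard pattern is assembled in Python, but it reaches the
    # database as a bound value and is never spliced into the SQL text.
    rows = connection.execute(
        "SELECT name FROM artist WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    )
    return [name for (name,) in rows]


print(search_artists(conn, "beat"))
print(search_artists(conn, "'; DROP TABLE artist; --"))
```

Parameterization does not neutralize `LIKE` wildcards: a search term containing `%` or `_` still acts as a pattern. That is a matching-semantics issue rather than an injection risk, but it may matter for exact-name lookups.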
## Response Formats
### Content Type
**Default:** `application/json`
**Encoding:** UTF-8
**Serialization:** System.Text.Json (ASP.NET Core default)
### Success Response (200 OK)
```json
{
  "searchResultType": "Ok",
  "artists": [ /* array of entities */ ]
}
```
### Not Found Response (200 OK with NotFound type)
```json
{
  "searchResultType": "NotFound",
  "artists": []
}
```
**Note:** Returns HTTP 200 even when no results found. Client must check `searchResultType`.
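Since the HTTP status alone does not distinguish "found" from "not found", client code has to branch on the payload. A minimal sketch of that check follows; `extract_results` is a hypothetical helper, and it assumes the enum is serialized by name (`"Ok"`, `"NotFound"`, `"InQueueSync"`) as in the examples in this document.

```python
def extract_results(payload, collection_key):
    """Return the entity list from a search response body.

    payload        -- the decoded JSON object returned by the API
    collection_key -- "artists", "albums" or "tracks"
    """
    result_type = payload.get("searchResultType")
    if result_type == "Ok":
        # Guard against a null collection even when the type is Ok.
        return payload.get(collection_key) or []
    if result_type in ("NotFound", "InQueueSync"):
        return []
    # Anything else means the response shape changed; fail loudly.
    raise ValueError(f"unexpected searchResultType: {result_type!r}")


found = {"searchResultType": "Ok", "artists": [{"name": "The Beatles"}]}
missing = {"searchResultType": "NotFound", "artists": []}
print(extract_results(found, "artists"), extract_results(missing, "artists"))
```

Raising on an unknown value is a deliberate choice here: silently returning an empty list would make a contract change look like an empty result set.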
### Error Response (500 Internal Server Error)
**ASP.NET Core default error handling:**
```json
{
  "type": "https://tools.ietf.org/html/rfc7231#section-6.6.1",
  "title": "An error occurred while processing your request.",
  "status": 500,
  "traceId": "00-abc123..."
}
```
**No custom error responses** - uses framework defaults.
**Error Details:** Hidden in production (no stack traces exposed).
## Pagination
### Offset-Based Pagination
**Parameter:** `Offset` (default: 0)
**Limit:** Hardcoded to 20 results per request
**Example:**
```bash
# First page (0-19)
GET /api/SearchArtist?Name=Beatles&Offset=0
# Second page (20-39)
GET /api/SearchArtist?Name=Beatles&Offset=20
# Third page (40-59)
GET /api/SearchArtist?Name=Beatles&Offset=40
```
**SQL Implementation:**
```sql
LIMIT 20 OFFSET @offset
```
**Limitations:**
- No configurable page size
- No total count in response
- No next/previous links
- No cursor-based pagination
- Performance degrades with large offsets
**Missing Metadata:**
```json
// NOT PRESENT
{
  "pagination": {
    "offset": 20,
    "limit": 20,
    "total": 150,
    "hasMore": true
  }
}
```
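Without a total count or a `hasMore` flag, a client has to infer the end of a result set from the page length. The sketch below shows one way to do that in Python. `iter_all` and the injected `fetch_page(offset)` callable are hypothetical, and the stop rule assumes a single provider returning at most 20 items per request; how `Provider=Any` composes a page is not documented here, so stopping only on an empty page may be safer in that case.

```python
PAGE_SIZE = 20  # fixed server-side limit reported in this analysis


def iter_all(fetch_page, max_pages=50):
    """Yield every entity by walking offsets 0, 20, 40, ...

    fetch_page -- callable taking an offset and returning a list of entities
    max_pages  -- safety cap so a misbehaving server cannot loop forever
    """
    for page in range(max_pages):
        items = fetch_page(page * PAGE_SIZE)
        yield from items
        if len(items) < PAGE_SIZE:
            return  # a short (or empty) page means there is nothing left


# Stand-in for the HTTP call: 45 fake results served in slices of 20.
fake_results = [f"artist-{i}" for i in range(45)]
pages = list(iter_all(lambda offset: fake_results[offset:offset + PAGE_SIZE]))
print(len(pages))
```

When the result count is an exact multiple of 20, this costs one extra request that returns an empty page, which is the price of having no pagination metadata.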
## Provider-Specific Behavior
### ID Format Differences
| Provider | ID Type | Example |
|----------|---------|---------|
| Spotify | string (base62) | `3WrFJ7ztbogyGnTHbHJFl2` |
| Tidal | int | `12345678` |
| MusicBrainz | Guid | `b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d` |
| Deezer | long | `123456789` |
| Discogs | int | `12345` |
| SoundCloud | int/long | `123456789` |
**Implication:** ID parameter must match provider's format. Cross-provider ID lookups not possible.
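A client can catch mismatched IDs before sending a request by checking the shape of the value against the table above. The following Python sketch is illustrative only: `looks_valid` is a hypothetical helper, the 22-character length for Spotify is an assumption inferred from the example ID, and the numeric providers are checked only for being decimal digits.

```python
import re
import uuid

# Assumption: Spotify IDs are 22 base62 characters, as in the example above.
_SPOTIFY_ID = re.compile(r"[0-9A-Za-z]{22}")
_NUMERIC_PROVIDERS = ("Tidal", "Deezer", "Discogs", "SoundCloud")


def looks_valid(provider, value):
    """Cheap shape check for a provider-specific ID string."""
    if provider == "Spotify":
        return _SPOTIFY_ID.fullmatch(value) is not None
    if provider == "MusicBrainz":
        try:
            uuid.UUID(value)  # raises ValueError for non-GUID strings
        except ValueError:
            return False
        return True
    if provider in _NUMERIC_PROVIDERS:
        return value.isascii() and value.isdigit()
    return False  # "Any" or unknown providers cannot be validated by shape


print(looks_valid("Spotify", "3WrFJ7ztbogyGnTHbHJFl2"))
print(looks_valid("Tidal", "3WrFJ7ztbogyGnTHbHJFl2"))
```

A shape check like this cannot confirm that an ID exists; it only rejects values that could never match the given provider.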
### Field Availability
**Popularity:**
- Available: Spotify, Deezer
- Unavailable: Tidal, MusicBrainz, Discogs, SoundCloud
- Returns: `null` when unavailable
**Genres:**
- Available: Spotify, Deezer, MusicBrainz
- Unavailable: Tidal, Discogs, SoundCloud
- Returns: `null` or empty array
**Images:**
- Available: Spotify, Tidal, Deezer
- Limited: MusicBrainz (via relationships)
- Unavailable: Discogs (URLs only), SoundCloud
- Returns: `null` or empty array
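**UPC:**
- Available: Spotify, Tidal
- Limited: Discogs (identifiers table)
- Unavailable: MusicBrainz, Deezer, SoundCloud
- Returns: `null` when unavailable
**ISRC:**
- Available: Spotify, Tidal, MusicBrainz
- Unavailable: Deezer, Discogs, SoundCloud
- Returns: `null` when unavailable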
### Provider Comparison Table
| Feature | Spotify | Tidal | MusicBrainz | Deezer | Discogs | SoundCloud |
|---------|---------|-------|-------------|--------|---------|------------|
| Artist Images | ✓ | ✓ | Limited | ✓ | ✗ | ✗ |
| Album Images | ✓ | ✓ | Limited | ✓ | ✗ | ✗ |
| Popularity Score | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ |
| Follower Count | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ |
| Genres | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ |
| UPC | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ |
| ISRC | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Label Info | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ |
| Release Date | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Explicit Flag | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |
| Track Duration | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
## API Versioning
**Status:** None
**Current Approach:** Single version at `/api/*`
**Implications:**
- Breaking changes affect all clients
- No gradual migration path
- No deprecation strategy
**Missing:**
- URL versioning (`/api/v1/`, `/api/v2/`)
- Header versioning (`Accept: application/vnd.api+json;version=1`)
- Query parameter versioning (`?api-version=1.0`)
## Health Checks
**Status:** None
**Missing Endpoints:**
- `/health` - Overall health status
- `/health/ready` - Readiness probe (Kubernetes)
- `/health/live` - Liveness probe (Kubernetes)
**Implications:**
- No automated health monitoring
- Load balancers can't detect unhealthy instances
- Kubernetes probes not supported
**Recommended Implementation:**
```csharp
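// AddNpgSql comes from the AspNetCore.HealthChecks.NpgSql NuGet package;
// connectionString is assumed to be defined elsewhere in Program.cs.
builder.Services.AddHealthChecks()
    .AddNpgSql(connectionString);
app.MapHealthChecks("/health");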
```
## API Design Evaluation
### Strengths
1. **Consistent Interface:** All search endpoints follow same pattern
2. **Provider Abstraction:** `Provider=Any` enables cross-provider search
3. **Fuzzy Search:** pg_trgm provides forgiving name matching
4. **Swagger Docs:** Interactive documentation out of the box
5. **Prometheus Metrics:** Basic observability
6. **Input Sanitization:** Control character removal
### Weaknesses
1. **No Authentication:** Fully open API
2. **No Rate Limiting:** Vulnerable to abuse
3. **No HTTPS:** Plain text traffic
4. **No CORS:** Cross-origin browser clients blocked
5. **No Versioning:** Breaking changes hit all clients at once
6. **No Health Checks:** Monitoring gaps
7. **Fixed Page Size:** No pagination control
8. **No Total Counts:** Can't determine result set size
9. **HTTP 200 for Not Found:** Should use 404
10. **No Error Details:** Generic error responses
### Recommendations for Production
1. **Add API Key Authentication:**
```csharp
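// ApiKeyAuthenticationOptions and ApiKeyAuthenticationHandler are custom types
// the project would have to implement; ASP.NET Core does not ship them.
builder.Services.AddAuthentication("ApiKey")
    .AddScheme<ApiKeyAuthenticationOptions, ApiKeyAuthenticationHandler>("ApiKey", null);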
```
2. **Implement Rate Limiting:**
```csharp
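// Requires .NET 7+ (Microsoft.AspNetCore.RateLimiting).
builder.Services.AddRateLimiter(options => {
    options.AddFixedWindowLimiter("api", opt => {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100;
    });
});
// The policy only takes effect once the middleware runs and endpoints opt in.
app.UseRateLimiter();
app.MapControllers().RequireRateLimiting("api");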
```
3. **Enable HTTPS:**
```csharp
app.UseHttpsRedirection();
```
4. **Add CORS Policy:**
```csharp
builder.Services.AddCors(options => {
    options.AddDefaultPolicy(policy => {
        policy.WithOrigins("https://trusted-domain.com")
            .AllowAnyMethod()
            .AllowAnyHeader();
    });
});
app.UseCors(); // applies the default policy registered above
```
5. **Add Health Checks:**
```csharp
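// AddNpgSql comes from the AspNetCore.HealthChecks.NpgSql NuGet package;
// connectionString is assumed to be defined elsewhere in Program.cs.
builder.Services.AddHealthChecks()
    .AddNpgSql(connectionString);
app.MapHealthChecks("/health");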
```
6. **Implement API Versioning:**
```csharp
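// Requires the Asp.Versioning.Mvc NuGet package
// (formerly Microsoft.AspNetCore.Mvc.Versioning).
builder.Services.AddApiVersioning(options => {
    options.DefaultApiVersion = new ApiVersion(1, 0);
    options.AssumeDefaultVersionWhenUnspecified = true;
    options.ReportApiVersions = true;
});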
```
7. **Add Pagination Metadata:**
```json
{
  "data": [ /* results */ ],
  "pagination": {
    "offset": 20,
    "limit": 20,
    "total": 150
  }
}
```
8. **Use Proper HTTP Status Codes:**
- 200 OK - Results found
- 404 Not Found - No results
- 400 Bad Request - Invalid parameters
- 429 Too Many Requests - Rate limit exceeded
- 500 Internal Server Error - Server errors
## Integration Examples
### cURL
```bash
# Search artists
curl "http://localhost:56232/api/SearchArtist?Name=Beatles&Provider=Any"
# Get specific artist
curl "http://localhost:56232/api/SearchArtist?Id=3WrFJ7ztbogyGnTHbHJFl2&Provider=Spotify"
# Search albums with pagination
curl "http://localhost:56232/api/SearchAlbum?AlbumName=Abbey&Offset=20"
```
### JavaScript (Fetch API)
```javascript
async function searchArtist(name, provider = 'Any') {
  const params = new URLSearchParams({ Name: name, Provider: provider });
  const response = await fetch(`http://localhost:56232/api/SearchArtist?${params}`);
  // The API answers 200 for empty results, so a non-2xx status signals a real failure.
  if (!response.ok) {
    throw new Error(`SearchArtist failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  return data.searchResultType === 'Ok' ? data.artists : [];
}
```
### Python (requests)
```python
import requests


def search_track(track_name, artist_id=None, provider='Any'):
    params = {
        'TrackName': track_name,
        'Provider': provider
    }
    if artist_id:
        params['ArtistId'] = artist_id
    response = requests.get(
        'http://localhost:56232/api/SearchTrack',
        params=params,
        timeout=10,
    )
    # Fail loudly on 4xx/5xx instead of parsing an error body as search results.
    response.raise_for_status()
    data = response.json()
    return data['tracks'] if data['searchResultType'] == 'Ok' else []
```
### C# (HttpClient)
```csharp
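// Requires: using System.Net.Http.Json; (for GetFromJsonAsync)
// Assumes SearchResultType deserializes from its string name,
// e.g. via JsonStringEnumConverter on the client-side model.
public async Task<List<SearchAlbumEntity>> SearchAlbum(string albumName, string provider = "Any")
{
    // In long-running apps, reuse one HttpClient (or IHttpClientFactory) instead of creating one per call.
    using var client = new HttpClient();
    var url = $"http://localhost:56232/api/SearchAlbum?AlbumName={Uri.EscapeDataString(albumName)}&Provider={Uri.EscapeDataString(provider)}";
    var response = await client.GetFromJsonAsync<SearchAlbumResponse>(url);
    return response?.SearchResultType == SearchResultType.Ok
        ? response.Albums
        : new List<SearchAlbumEntity>();
}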
```