
# AcoustID API Reference
## API Overview
The AcoustID API provides fingerprint-based music identification services. The API is RESTful, supports multiple response formats (JSON, XML, JSONP), and requires API key authentication for most operations.
**Base URL**: `https://api.acoustid.org`
**Protocol**: HTTPS only
**Authentication**: API key (application key + user key for submissions)
**Rate Limiting**: Multi-tier (global, application, IP-based)
## Public API Endpoints
### Fingerprint Lookup
Identify recordings by audio fingerprint.
#### `/v2/lookup`
**Methods**: GET, POST
**Authentication**: Required (client key)
**Rate Limit**: 3 requests/second (IP), 10 requests/second (application)
**Required Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `duration` | integer | Track duration in seconds (required together with `fingerprint`) |
| `fingerprint` | string | Chromaprint fingerprint (base64 or compressed); required unless `trackid` is given |
| `trackid` | string | AcoustID track ID (alternative to `duration` + `fingerprint`) |
**Optional Parameters**:
| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `format` | string | Response format: `json`, `xml`, `jsonp` | `json` |
| `jsoncallback` | string | JSONP callback function name (only used with `format=jsonp`) | - |
| `meta` | string | Metadata to include (see below) | - |
**Metadata Options** (comma-separated):
- `recordings`: Include MusicBrainz recording metadata
- `recordingids`: Include only recording MBIDs (faster)
- `releases`: Include release metadata
- `releaseids`: Include only release MBIDs
- `releasegroups`: Include release group metadata
- `releasegroupids`: Include only release group MBIDs
- `tracks`: Include track metadata
- `compress`: Compress response with gzip
- `usermeta`: Include user-submitted metadata
- `sources`: Include submission source information
**Batch Lookup**:
Submit multiple fingerprints in a single request using indexed parameters:
```
duration.0=240&fingerprint.0=AQADtN...
duration.1=180&fingerprint.1=AQABtK...
```
**Limits**:
- Maximum 20 fingerprints per batch request
- Maximum 100 track IDs per request
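
The indexed-parameter convention above can be sketched as a small helper. This is a minimal illustration, not part of any official client: the function name `build_batch_lookup_params` and the constant are hypothetical, and the 20-fingerprint cap is taken from the limits listed above.

```python
from urllib.parse import urlencode

MAX_FINGERPRINTS_PER_BATCH = 20  # batch limit stated in this document


def build_batch_lookup_params(client, items, meta=None):
    """Build form fields for a batch lookup.

    items is an iterable of (duration_seconds, fingerprint) pairs.
    """
    items = list(items)
    if not items:
        raise ValueError("at least one fingerprint is required")
    if len(items) > MAX_FINGERPRINTS_PER_BATCH:
        raise ValueError(f"at most {MAX_FINGERPRINTS_PER_BATCH} fingerprints per request")

    params = {"client": client}
    if meta:
        # Metadata options are documented above as comma-separated.
        params["meta"] = ",".join(meta)
    for index, (duration, fingerprint) in enumerate(items):
        params[f"duration.{index}"] = int(duration)
        params[f"fingerprint.{index}"] = fingerprint
    return params


if __name__ == "__main__":
    fields = build_batch_lookup_params("8XaBELgH", [(240, "AQADtN"), (180, "AQABtK")])
    print(urlencode(fields))
```

Rejecting oversized batches on the client side avoids spending a rate-limited request on a call the server would refuse anyway.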
**Example Request** (GET):
```
GET /v2/lookup?client=8XaBELgH&duration=240&fingerprint=AQADtNGiJE...&meta=recordings
```
**Example Request** (POST):
```
POST /v2/lookup
Content-Type: application/x-www-form-urlencoded

client=8XaBELgH&duration=240&fingerprint=AQADtNGiJE...&meta=recordings
```
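
As a rough sketch of how the GET and POST forms above relate, the same set of fields can be encoded either into the query string or into the request body. The helper names (`build_lookup_params`, `lookup_get_url`, `lookup_post_body`) are hypothetical; only the base URL, path, and parameter names come from this document.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.acoustid.org"  # base URL from the overview


def build_lookup_params(client, duration, fingerprint, meta=("recordings",)):
    """Return the fields for a single-fingerprint /v2/lookup call."""
    params = {
        "client": client,
        "duration": int(duration),  # the API expects whole seconds
        "fingerprint": fingerprint,
        "format": "json",
    }
    if meta:
        # Metadata options are documented above as comma-separated.
        params["meta"] = ",".join(meta)
    return params


def lookup_get_url(params):
    """Encode the fields into the query string (GET form)."""
    return f"{BASE_URL}/v2/lookup?{urlencode(params)}"


def lookup_post_body(params):
    """Encode the fields as an application/x-www-form-urlencoded body (POST form)."""
    return urlencode(params).encode("ascii")


if __name__ == "__main__":
    fields = build_lookup_params("8XaBELgH", 240, "AQADtNGiJE")
    print(lookup_get_url(fields))
    print(lookup_post_body(fields))
```

Because fingerprints can be several kilobytes long, the POST form is usually the safer choice once real fingerprints are involved.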
**Example Response** (JSON):
```json
{
"status": "ok",
"results": [
{
"id": "7e8b1234-5678-90ab-cdef-1234567890ab",
"score": 0.95,
"recordings": [
{
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"title": "Example Song",
"duration": 240,
"artists": [
{
"id": "12345678-90ab-cdef-1234-567890abcdef",
"name": "Example Artist"
}
],
"releases": [
{
"id": "abcdef12-3456-7890-abcd-ef1234567890",
"title": "Example Album",
"country": "US",
"date": {
"year": 2020,
"month": 5,
"day": 15
},
"track_count": 12,
"medium_count": 1
}
]
}
]
}
]
}
```
**Response Fields**:
| Field | Type | Description |
|-------|------|-------------|
| `status` | string | `ok` or `error` |
| `results` | array | Array of match results |
| `results[].id` | string | AcoustID track ID |
| `results[].score` | float | Match confidence (0.0-1.0) |
| `results[].recordings` | array | MusicBrainz recordings (if requested) |
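
A client typically reduces this response to a single best match. The sketch below assumes the response shape shown in the example above; the function name `best_recording` and the `min_score` cut-off of 0.5 are illustrative assumptions, not values defined by the API.

```python
def best_recording(response, min_score=0.5):
    """Pick the highest-scoring result and summarise its first recording.

    Returns None when no result reaches min_score (a hypothetical threshold).
    """
    if response.get("status") != "ok":
        raise ValueError("lookup failed: %r" % (response.get("error"),))

    candidates = [r for r in response.get("results", []) if r.get("score", 0.0) >= min_score]
    if not candidates:
        return None

    top = max(candidates, key=lambda r: r["score"])
    # "recordings" is only present when it was requested via the meta parameter.
    recordings = top.get("recordings") or []
    first = recordings[0] if recordings else None
    return {
        "acoustid": top["id"],
        "score": top["score"],
        "recording_id": first["id"] if first else None,
        "title": first.get("title") if first else None,
    }
```

Taking only the first recording is a simplification; a track can be linked to several recordings, and a real client may want to rank them separately.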
### Fingerprint Submission
Submit audio fingerprints with optional metadata.
#### `/v2/submit`
**Method**: POST
**Authentication**: Required (client key + user key)
**Rate Limit**: 3 requests/second (IP), 10 requests/second (application)
**Required Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `user` | string | User API key |
| `duration.#` | integer | Track duration in seconds |
| `fingerprint.#` | string | Chromaprint fingerprint |
**Optional Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `clientversion` | string | Client application version |
| `bitrate.#` | integer | Audio bitrate in kbps |
| `fileformat.#` | string | Audio file format (mp3, flac, etc.) |
| `mbid.#` | string | MusicBrainz recording MBID |
| `track.#` | string | Track title |
| `artist.#` | string | Artist name |
| `album.#` | string | Album title |
| `albumartist.#` | string | Album artist name |
| `year.#` | integer | Release year |
| `trackno.#` | integer | Track number |
| `discno.#` | integer | Disc number |
**Batch Submission**:
Use indexed parameters (`.0`, `.1`, `.2`, etc.) to submit multiple fingerprints:
```
duration.0=240&fingerprint.0=AQADtN...&mbid.0=a1b2c3d4...
duration.1=180&fingerprint.1=AQABtK...&mbid.1=e5f67890...
```
**Example Request**:
```
POST /v2/submit
Content-Type: application/x-www-form-urlencoded

client=8XaBELgH&user=AbCdEfGh&duration.0=240&fingerprint.0=AQADtNGiJE...&mbid.0=a1b2c3d4-e5f6-7890-abcd-ef1234567890
```
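
A submission body like the one above could be assembled along these lines. This is a sketch under the assumption that every per-item field uses the `.N` suffix shown in the parameter tables; `build_submit_body` and `PER_ITEM_FIELDS` are hypothetical names.

```python
from urllib.parse import urlencode

# Per-item fields listed in the parameter tables above (each takes a ".N" suffix).
PER_ITEM_FIELDS = {
    "duration", "fingerprint", "bitrate", "fileformat", "mbid", "track",
    "artist", "album", "albumartist", "year", "trackno", "discno",
}


def build_submit_body(client, user, items, clientversion=None):
    """Encode a list of submission dicts as a form-urlencoded string."""
    fields = [("client", client), ("user", user)]
    if clientversion:
        fields.append(("clientversion", clientversion))

    for index, item in enumerate(items):
        missing = {"duration", "fingerprint"} - set(item)
        if missing:
            raise ValueError(f"item {index} is missing {sorted(missing)}")
        unknown = set(item) - PER_ITEM_FIELDS
        if unknown:
            raise ValueError(f"item {index} has unknown fields {sorted(unknown)}")
        for name, value in item.items():
            fields.append((f"{name}.{index}", value))
    return urlencode(fields)


if __name__ == "__main__":
    print(build_submit_body(
        "8XaBELgH", "AbCdEfGh",
        [{"duration": 240, "fingerprint": "AQADtNGiJE", "mbid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}],
    ))
```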
**Example Response**:
```json
{
"status": "ok",
"submissions": [
{
"id": 12345678,
"status": "pending"
}
]
}
```
**Response Fields**:
| Field | Type | Description |
|-------|------|-------------|
| `status` | string | `ok` or `error` |
| `submissions` | array | Array of submission results |
| `submissions[].id` | integer | Submission ID |
| `submissions[].status` | string | `pending`, `imported`, or `error` |
### Submission Status
Check the processing status of submitted fingerprints.
#### `/v2/submission_status`
**Method**: GET
**Authentication**: Required (client key)
**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `id` | integer | Submission ID (from submit response) |
| `format` | string | Response format: `json`, `xml`, `jsonp` |
**Example Request**:
```
GET /v2/submission_status?client=8XaBELgH&id=12345678
```
**Example Response**:
```json
{
"status": "ok",
"submission": {
"id": 12345678,
"status": "imported",
"result": {
"id": "7e8b1234-5678-90ab-cdef-1234567890ab"
}
}
}
```
**Status Values**:
- `pending`: Queued for processing
- `imported`: Successfully processed
- `error`: Processing failed
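
Since submissions start out as `pending`, a client usually polls until one of the two terminal states is reached. The loop below is a minimal sketch: `fetch_status` is a hypothetical callable that performs the `/v2/submission_status` request and returns the decoded JSON, and the polling interval and attempt count are arbitrary assumptions.

```python
import time

TERMINAL_STATES = {"imported", "error"}  # status values listed above


def wait_for_submission(fetch_status, submission_id, interval=2.0, max_attempts=10, sleep=time.sleep):
    """Poll until the submission leaves the 'pending' state.

    fetch_status(submission_id) must return a dict shaped like the example response.
    """
    for attempt in range(max_attempts):
        submission = fetch_status(submission_id)["submission"]
        if submission["status"] in TERMINAL_STATES:
            return submission
        if attempt < max_attempts - 1:
            sleep(interval)  # no need to sleep after the final attempt
    raise TimeoutError(f"submission {submission_id} still pending after {max_attempts} checks")
```

Passing `sleep` as a parameter keeps the loop easy to test and lets callers substitute an asynchronous or rate-limit-aware wait.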
### Fingerprint Retrieval
Retrieve stored fingerprint data.
#### `/v2/fingerprint`
**Method**: GET
**Authentication**: Required (client key)
**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `id` | string | AcoustID track ID |
| `format` | string | Response format: `json`, `xml`, `jsonp` |
**Example Request**:
```
GET /v2/fingerprint?client=8XaBELgH&id=7e8b1234-5678-90ab-cdef-1234567890ab
```
**Example Response**:
```json
{
"status": "ok",
"fingerprints": [
{
"id": 987654321,
"fingerprint": "AQADtNGiJE...",
"duration": 240,
"submission_count": 5
}
]
}
```
### Track Listing by MBID
List AcoustID tracks linked to a MusicBrainz recording.
#### `/v2/track/list_by_mbid`
**Method**: GET
**Authentication**: Required (client key)
**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `mbid` | string | MusicBrainz recording MBID |
| `format` | string | Response format: `json`, `xml`, `jsonp` |
**Example Request**:
```
GET /v2/track/list_by_mbid?client=8XaBELgH&mbid=a1b2c3d4-e5f6-7890-abcd-ef1234567890
```
**Example Response**:
```json
{
"status": "ok",
"tracks": [
{
"id": "7e8b1234-5678-90ab-cdef-1234567890ab",
"disabled": false
}
]
}
```
### Track Listing by PUID
List AcoustID tracks linked to a MusicIP PUID (legacy).
#### `/v2/track/list_by_puid`
**Method**: GET
**Authentication**: Required (client key)
**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `puid` | string | MusicIP PUID |
| `format` | string | Response format: `json`, `xml`, `jsonp` |
### User Management
#### `/v2/user/lookup`
Look up a user's API key by MusicBrainz account.
**Method**: POST
**Authentication**: Required (client key)
**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `musicbrainz_id` | string | MusicBrainz username |
#### `/v2/user/create_anonymous`
Create an anonymous user API key.
**Method**: POST
**Authentication**: Required (client key)
**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
**Example Response**:
```json
{
"status": "ok",
"user": {
"apikey": "AbCdEfGh"
}
}
```
#### `/v2/user/create_musicbrainz`
Create a user API key linked to a MusicBrainz account.
**Method**: POST
**Authentication**: Required (client key)
**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | string | Application API key |
| `access_token` | string | MusicBrainz OAuth access token |
## Legacy API Endpoints
### `/lookup`
Legacy lookup endpoint (API v1).
**Status**: Deprecated; use `/v2/lookup` instead
**Differences**: Limited metadata options, different response format
### `/submit`
Legacy submit endpoint (API v1).
**Status**: Deprecated; use `/v2/submit` instead
**Differences**: Synchronous processing, no batch support
## Health Check Endpoints
### `/_health`
Full health check with database write test.
**Method**: GET
**Authentication**: None
**Response**:
```json
{
"status": "ok"
}
```
**Status Codes**:
- `200`: All systems operational
- `503`: Service unavailable
### `/_health_ro`
Read-only health check (database read test only).
**Method**: GET
**Authentication**: None
### `/_health_docker`
Docker-specific health check (minimal checks).
**Method**: GET
**Authentication**: None
## Internal API Endpoints
These endpoints are for administrative use only and require special authentication.
### `/v2/internal/update_lookup_stats`
Trigger lookup statistics update.
**Method**: POST
**Authentication**: Internal only
### `/v2/internal/update_user_agent_stats`
Trigger user agent statistics update.
**Method**: POST
**Authentication**: Internal only
### `/v2/internal/lookup_stats`
Retrieve lookup statistics.
**Method**: GET
**Authentication**: Internal only
### `/v2/internal/create_account`
Create new user account.
**Method**: POST
**Authentication**: Internal only
### `/v2/internal/create_application`
Create new API application.
**Method**: POST
**Authentication**: Internal only
### `/v2/internal/update_application_status`
Update application status (active/inactive).
**Method**: POST
**Authentication**: Internal only
### `/v2/internal/check_application`
Check application validity.
**Method**: GET
**Authentication**: Internal only
## Index API Endpoints
The fingerprint index service exposes its own HTTP API (separate from the main API).
**Base URL**: `http://index:6081` (internal)
**Protocol**: HTTP
**Format**: MessagePack
### `PUT /:index`
Create new index.
**Parameters**:
- `:index`: Index name
### `GET /:index`
Get index information.
**Response**:
```json
{
"name": "fingerprints",
"doc_count": 1234567,
"segment_count": 42,
"memory_segment_size": 1048576
}
```
### `DELETE /:index`
Delete index.
### `POST /:index/_search`
Search for fingerprints.
**Request Body** (MessagePack):
```python
{
"query": [term1, term2, term3, ...],
"limit": 10,
"min_score": 0.5
}
```
**Response** (MessagePack):
```python
{
"results": [
{"id": fpid1, "score": 0.95},
{"id": fpid2, "score": 0.87}
]
}
```
### `POST /:index/_update`
Batch update fingerprints.
**Request Body** (MessagePack):
```python
{
"updates": [
{"id": fpid1, "terms": [term1, term2, ...]},
{"id": fpid2, "terms": [term3, term4, ...]}
]
}
```
### `GET /:index/_segments`
List index segments.
**Response**:
```json
{
"segments": [
{
"id": 0,
"type": "memory",
"doc_count": 1024,
"size_bytes": 1048576
},
{
"id": 1,
"type": "file",
"doc_count": 100000,
"size_bytes": 52428800
}
]
}
```
### `GET /:index/_snapshot`
Create index snapshot.
**Response**:
```json
{
"snapshot_id": "snapshot_20250428_120000",
"path": "/var/lib/acoustid-index/snapshots/snapshot_20250428_120000"
}
```
### `PUT /:index/:fpid`
Insert or update fingerprint.
**Parameters**:
- `:index`: Index name
- `:fpid`: Fingerprint ID
**Request Body** (MessagePack):
```python
{
"terms": [term1, term2, term3, ...]
}
```
### `GET /:index/:fpid`
Retrieve fingerprint.
**Response** (MessagePack):
```python
{
"id": fpid,
"terms": [term1, term2, term3, ...]
}
```
### `DELETE /:index/:fpid`
Delete fingerprint.
### `GET /_health`
Index health check.
**Response**:
```json
{
"status": "ok"
}
```
### `GET /_metrics`
Prometheus metrics.
**Response** (Prometheus text format):
```
# HELP fpindex_search_duration_seconds Search duration
# TYPE fpindex_search_duration_seconds histogram
fpindex_search_duration_seconds_bucket{le="0.005"} 1234
fpindex_search_duration_seconds_bucket{le="0.01"} 5678
...
```
## Rate Limiting
### Rate Limit Tiers
AcoustID implements a three-tier rate limiting system:
| Tier | Scope | Default Limit | Override |
|------|-------|---------------|----------|
| Global | All requests | 3 req/s | Config: `cluster.rate_limiter.global_limit` |
| Application | Per API key | 10 req/s | Database: `application.rate_limit` |
| IP Address | Per client IP | 3 req/s | Config: `cluster.rate_limiter.ip_limit` |
### Rate Limit Algorithm
**Implementation**: Redis-based sliding window
**Window Configuration**:
- Window duration: 20 seconds
- Window steps: 4 (5-second buckets)
- Cleanup: Automatic expiration (25-second TTL)
**Redis Keys**:
```
rl:bucket:global:{timestamp}
rl:bucket:app:{api_key}:{timestamp}
rl:bucket:ip:{ip_address}:{timestamp}
```
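
To make the window arithmetic concrete, here is an in-memory sketch of a bucketed sliding window with the same shape (20-second window, four 5-second buckets). It is not the server's Redis implementation: the class name is hypothetical, a plain dict stands in for Redis keys with a TTL, and treating the limit as `rate_per_second * window` requests per window is an assumption.

```python
import time


class SlidingWindowLimiter:
    """Approximate sliding window built from fixed-size time buckets."""

    def __init__(self, rate_per_second, window=20, steps=4, clock=time.time):
        self.limit = rate_per_second * window  # assumed: budget for the whole window
        self.step = window // steps            # 5-second buckets by default
        self.steps = steps
        self.clock = clock
        self.buckets = {}                      # (scope, bucket_start) -> request count

    def allow(self, scope):
        """Record one request for scope (e.g. 'ip:203.0.113.7'); return False if over limit."""
        current = int(self.clock() // self.step) * self.step
        oldest = current - (self.steps - 1) * self.step

        # Drop buckets that fell out of the window (Redis would do this via TTL).
        self.buckets = {k: v for k, v in self.buckets.items() if k[1] >= oldest}

        used = sum(count for (name, _), count in self.buckets.items() if name == scope)
        if used >= self.limit:
            return False
        key = (scope, current)
        self.buckets[key] = self.buckets.get(key, 0) + 1
        return True
```

With the defaults, `SlidingWindowLimiter(3)` admits at most 60 requests per scope across any four consecutive buckets, which averages to the 3 req/s per-IP figure quoted above while still tolerating short bursts.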
### Rate Limit Headers
Responses include rate limit information:
```
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 1714305600
```
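
A client can use these headers to decide how long to pause before the next request. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds (the example value is consistent with that, but the source does not state it) and that headers arrive as a plain dict with exactly this casing; `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before sending another request.

    Returns 0.0 while requests remain in the current window.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # Assumption: the reset value is an absolute Unix timestamp in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    return max(0.0, reset_at - now)
```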
### Rate Limit Exceeded Response
**Status Code**: 429 Too Many Requests
**Response**:
```json
{
"status": "error",
"error": {
"code": 5,
"message": "Rate limit exceeded"
}
}
```
## Error Handling
### Error Response Format
All errors return a consistent structure:
```json
{
"status": "error",
"error": {
"code": 1,
"message": "Invalid API key"
}
}
```
### Error Codes
| Code | Message | Description |
|------|---------|-------------|
| 1 | Invalid API key | Client or user key is invalid |
| 2 | Missing required parameter | Required parameter not provided |
| 3 | Invalid fingerprint | Fingerprint format is invalid |
| 4 | Internal error | Server-side error occurred |
| 5 | Rate limit exceeded | Too many requests |
| 6 | Invalid format | Unsupported response format |
| 7 | Fingerprint not found | Requested fingerprint doesn't exist |
| 8 | Too many requests | Batch size exceeds limit |
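
One way to surface these codes in client code is to convert error payloads into exceptions. The sketch below relies only on the response structure and code table above; the class name, the `retryable` property, and the choice of codes 4 and 5 as retryable are assumptions made for illustration.

```python
# Assumption: internal errors (4) and rate limiting (5) are worth retrying;
# the remaining codes indicate a problem with the request itself.
RETRYABLE_CODES = {4, 5}


class AcoustIDError(Exception):
    """Raised when a response carries status == 'error'."""

    def __init__(self, code, message):
        super().__init__(f"AcoustID error {code}: {message}")
        self.code = code
        self.message = message

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES


def raise_for_error(payload):
    """Return the payload unchanged, or raise AcoustIDError for error responses."""
    if payload.get("status") == "error":
        error = payload.get("error") or {}
        raise AcoustIDError(error.get("code"), error.get("message", "unknown error"))
    return payload
```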
### HTTP Status Codes
| Code | Meaning | Usage |
|------|---------|-------|
| 200 | OK | Successful request |
| 400 | Bad Request | Invalid parameters |
| 401 | Unauthorized | Missing or invalid API key |
| 403 | Forbidden | API key lacks permission |
| 404 | Not Found | Resource not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | Service down or degraded |
## Authentication
### API Key Types
1. **Application Key** (`client` parameter):
- Identifies the client application
- Required for all API calls
- Obtain from https://acoustid.org/new-application
2. **User Key** (`user` parameter):
- Identifies the end user
- Required for submissions
- Created via `/v2/user/create_*` endpoints
3. **Demo Key**:
- Limited functionality
- For testing only
- Key: `8XaBELgH`
### Key Management
**Application Keys**:
- Created via web UI or internal API
- Can be active or inactive
- Rate limits configurable per key
- Usage statistics tracked
**User Keys**:
- Anonymous or MusicBrainz-linked
- Created programmatically
- Tied to application key
- Submission history tracked
## Best Practices
### Lookup Optimization
1. **Use batch lookups** for multiple files (up to 20 per request)
2. **Request only needed metadata** (use specific `meta` flags)
3. **Cache results** to avoid redundant lookups
4. **Handle rate limits** with exponential backoff
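
The batching and caching advice above might be combined as follows. This is a sketch: `lookup_batch` is a hypothetical callable that sends one batch request and returns one result per input in the same order, and the cache is a plain dict keyed by `(duration, fingerprint)` tuples.

```python
MAX_BATCH = 20  # batch limit stated in this document


def chunked(items, size=MAX_BATCH):
    """Yield consecutive slices of at most size items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lookup_all(items, lookup_batch, cache=None):
    """Resolve every (duration, fingerprint) pair, hitting the API only for cache misses."""
    cache = {} if cache is None else cache
    # dict.fromkeys removes duplicates while preserving order.
    missing = [item for item in dict.fromkeys(items) if item not in cache]
    for batch in chunked(missing):
        for item, result in zip(batch, lookup_batch(batch)):
            cache[item] = result
    return [cache[item] for item in items]
```

Deduplicating before batching matters for music libraries, where the same file often appears in several folders.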
### Submission Guidelines
1. **Include MBIDs** when known (improves accuracy)
2. **Provide metadata** (artist, album, track) for better matching
3. **Use batch submissions** for efficiency
4. **Poll submission status** asynchronously
### Error Handling
1. **Retry on 5xx errors** with exponential backoff
2. **Respect rate limits** (check headers)
3. **Validate fingerprints** before submission
4. **Log errors** for debugging
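
A retry loop following these guidelines could look like the sketch below. `send` is a hypothetical zero-argument callable that performs the request and returns `(http_status, body)`; the delay values are arbitrary, and treating 429 as retryable alongside 5xx is an assumption based on the status table above.

```python
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    Waits base_delay, 2*base_delay, 4*base_delay, ... (capped at max_delay) between tries.
    """
    for attempt in range(max_attempts):
        status, body = send()
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts - 1:
            return status, body
        sleep(min(max_delay, base_delay * 2 ** attempt))
```

Client errors such as 400 or 401 are returned immediately, since repeating an invalid request only consumes rate-limit budget.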
### Performance
1. **Use POST** for large requests (avoid URL length limits)
2. **Enable compression** (`meta=compress`)
3. **Reuse connections** (HTTP keep-alive)
4. **Implement timeouts** (30-60 seconds recommended)