Music Metadata API - Overview
Project Identity
Name: Music Metadata API
Repository: https://github.com/Aunali321/music-metadata-api
License: MIT
Language: Go 1.24
Maintainer: Single maintainer (Aunali321)
Status: Active, production-ready
Purpose
Music Metadata API provides a self-hosted HTTP service for querying metadata on 256 million music tracks. The service operates entirely from pre-populated SQLite databases, requiring no external API calls at runtime. It's designed as a high-performance alternative to commercial music metadata APIs like Spotify's Web API.
Core Technology Stack
Runtime Dependencies
| Component | Version | Purpose | Notes |
|---|---|---|---|
| Go | 1.24 | Runtime & stdlib HTTP server | Uses Go 1.22+ enhanced routing |
| modernc.org/sqlite | v1.34.4 | Pure Go SQLite driver | No CGO required |
| golang.org/x/time | v0.14.0 | Rate limiting (token bucket) | Only external dependency besides the SQLite driver |
Build Configuration
CGO_ENABLED=0 go build -ldflags="-s -w" ./cmd/server
Flags explained:
- CGO_ENABLED=0: Pure Go binary, no C dependencies
- -s -w: Strip the symbol table and DWARF debug information (smaller binary)
Data Scale
Database Files
| Database | Size | Purpose | Records |
|---|---|---|---|
| main_database.sqlite3 | ~117GB | Core metadata (tracks, albums, artists) | 256M tracks |
| track_files.sqlite3 | ~99GB | Extended track data (lyrics flags, languages, roles) | 256M track files |
| Total | ~216GB | Combined storage requirement | - |
Dataset Coverage
- 256 million tracks across all databases
- Album metadata with images, labels, release dates
- Artist metadata with genres, follower counts, popularity scores
- ISRC codes for track identification
- Multi-language support (language_of_performance field)
- Artist role information (performer, composer, etc.)
Entry Points
Command Line
Entry point: cmd/server/main.go (62 lines)
Flags:
  -db string
        Path to main database file (REQUIRED)
  -addr string
        HTTP server address (default ":8080")
Example:
./metadata-api -db /data/main_database.sqlite3 -addr :8080
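The two flags above could be parsed with the standard `flag` package roughly as follows. This is a minimal sketch, not the project's actual `main.go`: the `config` type and `parseFlags` helper are hypothetical names introduced for illustration, and only the flag names and the `:8080` default come from the list above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// config holds the two documented command-line options (hypothetical type).
type config struct {
	dbPath string
	addr   string
}

// parseFlags parses args (without the program name) and enforces that -db is set.
func parseFlags(args []string) (config, error) {
	fs := flag.NewFlagSet("metadata-api", flag.ContinueOnError)
	var cfg config
	fs.StringVar(&cfg.dbPath, "db", "", "path to main database file (required)")
	fs.StringVar(&cfg.addr, "addr", ":8080", "HTTP server address")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if cfg.dbPath == "" {
		return config{}, errors.New("-db is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Printf("db=%s addr=%s\n", cfg.dbPath, cfg.addr)
}
```

Keeping the parsing in a function that takes a slice, rather than reading `os.Args` directly, makes the required-flag check easy to exercise without starting the server.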
Docker
Image: ghcr.io/aunali321/music-metadata-api:latest
Base: Alpine Linux 3.21
docker-compose.yml:
services:
  metadata-api:
    image: ghcr.io/aunali321/music-metadata-api:latest
    ports:
      - "8080:8080"
    volumes:
      - ./data:/data:ro
    environment:
      - LOG_LEVEL=info # NOTE: Not actually used in code
    command: ["-db", "/data/main_database.sqlite3"]
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped
Architecture Layers
Directory Structure
music-metadata-api/
├── cmd/
│   └── server/
│       └── main.go              # Entry point (62 lines)
├── internal/
│   ├── api/                     # HTTP handlers, routing, middleware
│   │   ├── handlers.go
│   │   ├── ratelimit.go
│   │   └── openapi.go
│   ├── db/
│   │   └── db.go                # Database layer (907 lines)
│   └── models/
│       └── models.go            # Data structures (65 lines)
├── Dockerfile
├── docker-compose.yml
└── .github/
    └── workflows/
        └── docker-publish.yml
Layer Responsibilities
API Layer (internal/api/)
- HTTP request handling
- Rate limiting (token bucket, per-IP)
- OpenAPI specification serving
- Swagger UI hosting
Database Layer (internal/db/)
- SQLite connection management
- Query execution
- Data enrichment (joining related entities)
- Batch optimization
Models Layer (internal/models/)
- Data structure definitions
- JSON serialization tags
- Response formatting
Key Features
Performance Optimizations
- Read-only databases - No write locks, safe concurrent reads
- Conservative PRAGMAs - Optimized for read-heavy workloads
- Batch endpoints - Process up to 400 items per request
- Connection pooling - MaxOpenConns=8 for controlled resource usage
- Memory-mapped I/O - 1GB mmap for faster reads
API Capabilities
- Batch lookup - Retrieve multiple tracks/albums/artists in single request
- ISRC lookup - Industry-standard track identification
- Search - Text search on tracks and artists (LIKE-based matching, no FTS index)
- Relationship traversal - Album tracks, artist albums, track artists
- OpenAPI documentation - Interactive Swagger UI at /docs
Operational Features
- Graceful shutdown - 10-second timeout for in-flight requests
- Health checks - /health endpoint for monitoring
- Rate limiting - 100 req/s with 200 burst capacity
- Structured logging - Go stdlib log/slog for error tracking
Deployment Models
Standalone Binary
Pros:
- Single executable, no dependencies
- Minimal resource footprint
- Direct filesystem access to databases
Cons:
- Manual process management
- No automatic restarts
- Manual log rotation
Docker Container
Pros:
- Consistent runtime environment
- Built-in health checks
- Automatic restarts
- Easy horizontal scaling
Cons:
- Requires Docker runtime
- Additional layer of abstraction
- Volume mount for large databases
Use Cases
Primary Use Cases
- Music library enrichment - Add metadata to existing track collections
- ISRC-based lookup - Resolve ISRCs to full track metadata
- Batch processing - Enrich large catalogs efficiently
- Self-hosted alternative - Replace commercial APIs with local service
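For ISRC-based lookup it can help to normalize and validate codes on the client before sending them. The sketch below relies on the standard ISRC layout (2-letter country code, 3-character registrant code, 2-digit year, 5-digit designation, 12 characters in total). The `normalizeISRC` helper is hypothetical, and whether the API itself accepts hyphenated or lower-case input is not stated in this document.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// isrcPattern matches the compact 12-character ISRC form, e.g. USRC17607839.
var isrcPattern = regexp.MustCompile(`^[A-Z]{2}[A-Z0-9]{3}[0-9]{7}$`)

// normalizeISRC upper-cases the input, removes hyphens and spaces, and reports
// whether the result has a valid ISRC shape.
func normalizeISRC(s string) (string, bool) {
	s = strings.ToUpper(strings.NewReplacer("-", "", " ", "").Replace(s))
	return s, isrcPattern.MatchString(s)
}

func main() {
	for _, in := range []string{"us-rc1-76-07839", "USRC17607839", "not-an-isrc"} {
		code, ok := normalizeISRC(in)
		fmt.Printf("%q -> %q valid=%t\n", in, code, ok)
	}
}
```

This only checks the shape of the code. A well-formed ISRC may still be absent from the dataset.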
Integration Scenarios
- Metadata aggregator pipelines - Complement MusicBrainz with Spotify-style data
- Music streaming services - Populate track/album/artist information
- DJ software - Enrich track libraries with popularity, genres, images
- Music analytics - Analyze trends across 256M tracks
Limitations
Technical Constraints
- Database size - Requires 216GB disk space
- No write operations - Read-only, no data updates
- No authentication - Public API, no access control
- No CORS headers - Cross-origin browser clients are blocked
- Memory leak - Rate limiter's per-IP visitor map grows unbounded
Data Constraints
- Database provenance unclear - The project states only that it is "Not affiliated with Spotify"
- No freshness mechanism - Static snapshot, no updates
- Search performance - LIKE queries slow on large datasets (no FTS)
Operational Constraints
- No metrics - No Prometheus, no counters
- Naive health check - Doesn't verify database connectivity
- Hardcoded config - Timeouts, limits not configurable
- No tests - Zero test coverage
Project Maturity
Strengths
- Clean, simple codebase
- Production-ready Docker setup
- Comprehensive OpenAPI spec
- Massive dataset (256M tracks)
- Pure Go (no CGO complexity)
Weaknesses
- Single maintainer
- No test suite
- No CI test step
- Unused config (LOG_LEVEL)
- Memory leak in rate limiter
Comparison to Alternatives
| Feature | Music Metadata API | Spotify Web API | MusicBrainz API |
|---|---|---|---|
| Self-hosted | Yes | No | Optional (mirror server) |
| Authentication | None | OAuth required | Optional |
| Dataset size | 256M tracks | Full catalog | ~40M recordings |
| Rate limits | 100 req/s | Varies by tier | 1 req/s |
| Batch support | 400 items | 50 items | Limited |
| Cost | Free (MIT) | Free tier limited | Free |
| Data freshness | Static | Real-time | Community-updated |
| Identifier | ISRC, internal IDs | Spotify IDs | MBIDs |
Getting Started
Minimum Requirements
- Go 1.24+ (for building from source)
- 216GB disk space for databases
- Database files (not included in repository)
- 2GB+ RAM recommended
Quick Start
# Clone repository
git clone https://github.com/Aunali321/music-metadata-api.git
cd music-metadata-api
# Build binary
CGO_ENABLED=0 go build -ldflags="-s -w" -o metadata-api ./cmd/server
# Run server (assumes databases in /data)
./metadata-api -db /data/main_database.sqlite3 -addr :8080
# Test health endpoint
curl http://localhost:8080/health
# View API documentation (open is macOS-specific; use xdg-open on Linux)
open http://localhost:8080/docs
Docker Quick Start
# Pull image
docker pull ghcr.io/aunali321/music-metadata-api:latest
# Run container
docker run -d \
-p 8080:8080 \
-v /path/to/databases:/data:ro \
ghcr.io/aunali321/music-metadata-api:latest \
-db /data/main_database.sqlite3
# Check health
curl http://localhost:8080/health
Documentation Resources
- OpenAPI Spec: http://localhost:8080/openapi.yaml
- Interactive Docs: http://localhost:8080/docs
- GitHub Repository: https://github.com/Aunali321/music-metadata-api
- Docker Image: ghcr.io/aunali321/music-metadata-api
License
MIT License - Free for commercial and personal use, provided the copyright and license notice are retained.