feat: initial implementation of metadata aggregator

- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:27:14 +02:00
commit a1f6701bac
163 changed files with 95884 additions and 0 deletions
@@ -0,0 +1,760 @@
+# Bedrock-API Evaluation
+
+## Executive Summary
+
+Bedrock-API is a music metadata and streaming aggregation service built in Go 1.25 with gRPC and HTTP interfaces. The project demonstrates strong architectural patterns (provider abstraction, fan-out concurrency, partial response handling) but lacks production-readiness features (caching, monitoring, comprehensive testing, security hardening).
+
+**Primary Value**: Cross-platform stream resolution (bridges non-streaming APIs like Spotify to streaming platforms like SoundCloud/YouTube Music).
+
+**Target Use Case**: Unified music search and streaming across multiple platforms.
+
+**Maturity Level**: Early production (functional but missing observability, caching, and security features).
+
+## Strengths
+
+### 1. Clean Provider Abstraction
+
+**Pattern**: Implicit `trackProvider` interface isolates platform-specific logic
+
+**Benefits**:
+- Easy to add new providers (implement interface)
+- Platform failures don't affect other providers
+- Testable in isolation (mock providers)
+
+**Example**:
+```go
+type trackProvider interface {
+    Name() string
+    SearchTracks(ctx context.Context, query string, limit int32) ([]*pb.Track, error)
+    GetStreamURL(ctx context.Context, id string) (string, error)
+    // ... other methods
+}
+```
+
+**Applicability to Metadata Aggregator**: Directly applicable. Same pattern can be used for metadata providers (Discogs, MusicBrainz, Last.fm, etc.).
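+
+For the metadata aggregator, the same abstraction might look like the sketch below. `metadataProvider`, `Release`, and `stubProvider` are hypothetical names chosen for illustration; none of them come from Bedrock-API.
+
+```go
+package main
+
+import (
+    "context"
+    "errors"
+)
+
+// Release is a hypothetical, minimal metadata record.
+type Release struct {
+    ID     string
+    Title  string
+    Artist string
+}
+
+// metadataProvider mirrors the shape of trackProvider for metadata lookups.
+type metadataProvider interface {
+    Name() string
+    SearchReleases(ctx context.Context, query string, limit int) ([]Release, error)
+}
+
+// stubProvider is an in-memory provider that can serve as a test double.
+type stubProvider struct {
+    name     string
+    releases []Release
+    err      error
+}
+
+var errUnavailable = errors.New("provider unavailable")
+
+func (s stubProvider) Name() string { return s.name }
+
+// SearchReleases ignores the query and returns up to limit canned releases.
+func (s stubProvider) SearchReleases(ctx context.Context, query string, limit int) ([]Release, error) {
+    if s.err != nil {
+        return nil, s.err
+    }
+    if limit >= 0 && limit < len(s.releases) {
+        return s.releases[:limit], nil
+    }
+    return s.releases, nil
+}
+```
+
+Because callers depend only on the interface, a stub like this can replace a real provider in unit tests without network access.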
+
+### 2. Fan-Out Concurrency
+
+**Pattern**: Parallel goroutines per provider with WaitGroup coordination
+
+**Benefits**:
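+- Response time is bounded by the slowest provider, not the sum of all providers
+- Typical search: 200-500ms (4 providers in parallel)
+- Adding providers adds goroutines and outbound calls, but latency grows only if the new provider is the slowest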
+
+**Example**:
+```go
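+var wg sync.WaitGroup
+var mu sync.Mutex // guards allTracks and errs
+var allTracks []*pb.Track
+var errs []error
+for _, provider := range providers {
+    wg.Add(1)
+    go func(p trackProvider) {
+        defer wg.Done()
+        results, err := p.SearchTracks(ctx, query, limit)
+        // Aggregate under the mutex (sketch; the project's own aggregation code is not shown here)
+        mu.Lock()
+        defer mu.Unlock()
+        if err != nil {
+            errs = append(errs, err)
+            return
+        }
+        allTracks = append(allTracks, results...)
+    }(provider)
+}
+wg.Wait()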
+```
+
+**Applicability to Metadata Aggregator**: Directly applicable. Metadata queries can be parallelized across providers.
+
+### 3. Partial Response Handling
+
+**Pattern**: Return successful results even if some providers fail
+
+**Benefits**:
+- Resilient to individual provider failures
+- Degraded service instead of complete failure
+- Client can decide how to handle partial results
+
+**Example**:
+```go
+if len(errors) > 0 {
+    if len(allTracks) == 0 {
+        status = pb.ResponseStatus_ERROR
+    } else {
+        status = pb.ResponseStatus_PARTIAL
+    }
+}
+
+return &pb.SearchTracksResponse{
+    Tracks: allTracks,
+    Status: status,
+    Errors: errors, // Per-provider error details
+}
+```
+
+**Applicability to Metadata Aggregator**: Directly applicable. Metadata aggregation should be resilient to individual provider failures.
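+
+The status decision above can be isolated into a pure function, which makes it easy to unit test. This is a sketch: `Status` and its constants are hypothetical stand-ins for the generated `pb.ResponseStatus` enum, and treating "no errors" as OK is an assumption.
+
+```go
+package main
+
+// Status is a hypothetical stand-in for the generated pb.ResponseStatus enum.
+type Status int
+
+const (
+    StatusOK Status = iota
+    StatusPartial
+    StatusError
+)
+
+// classify derives the response status from how many results and
+// per-provider errors were collected.
+func classify(resultCount, errorCount int) Status {
+    switch {
+    case errorCount == 0:
+        return StatusOK // assumption: no provider errors means success
+    case resultCount == 0:
+        return StatusError // every provider failed
+    default:
+        return StatusPartial // some providers failed, some returned data
+    }
+}
+```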
+
+### 4. Cross-Platform Stream Resolution
+
+**Pattern**: Bridge non-streaming platforms to streaming platforms
+
+**Algorithm**:
+1. Check if platform supports streaming (SoundCloud, YouTube Music)
+2. If not, search SoundCloud for matching track
+3. If SoundCloud fails, search YouTube Music
+4. Return first successful stream URL
+
+**Benefits**:
+- Unified streaming interface (even for non-streaming APIs)
+- Automatic fallback chain
+- Transparent to client
+
+**Applicability to Metadata Aggregator**: Not directly applicable (metadata aggregator doesn't need streaming). However, the fallback pattern is useful for metadata resolution (try provider A, fallback to provider B).
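+
+A minimal sketch of that fallback pattern for metadata lookups follows. `lookupFunc` and `firstSuccessful` are hypothetical names, and trying providers strictly in order is an assumption about the desired policy.
+
+```go
+package main
+
+import (
+    "context"
+    "errors"
+    "fmt"
+)
+
+// lookupFunc is a hypothetical signature for one provider's lookup.
+type lookupFunc func(ctx context.Context, id string) (string, error)
+
+// firstSuccessful tries each lookup in order and returns the first result.
+// If every lookup fails, the individual errors are joined into one.
+func firstSuccessful(ctx context.Context, id string, lookups ...lookupFunc) (string, error) {
+    var errs []error
+    for i, lookup := range lookups {
+        if err := ctx.Err(); err != nil {
+            return "", err // the caller gave up; stop trying fallbacks
+        }
+        value, err := lookup(ctx, id)
+        if err == nil {
+            return value, nil
+        }
+        errs = append(errs, fmt.Errorf("lookup %d: %w", i, err))
+    }
+    if len(errs) == 0 {
+        return "", errors.New("no lookups configured")
+    }
+    return "", errors.Join(errs...)
+}
+```
+
+Joining the errors keeps every provider's failure visible to the caller when the whole chain fails.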
+
+### 5. YouTube 7-Client Fallback
+
+**Pattern**: Rotate through 7 different YouTube client types to maximize stream availability
+
+**Clients**:
+- TVHTML5_SIMPLY_EMBEDDED (primary)
+- TVHTML5
+- ANDROID_VR (2 variants)
+- ANDROID
+- IOS
+- WEB
+
+**Benefits**:
+- Maximizes success rate (different clients have different capabilities)
+- Avoids ciphered streams (stream URLs whose signature must be deciphered before use)
+- Handles geo-restrictions
+
+**Applicability to Metadata Aggregator**: Pattern is applicable for providers with multiple API endpoints or client types.
+
+### 6. ID Namespacing
+
+**Pattern**: Platform-prefixed IDs (`{platform}:{type}:{native_id}`)
+
+**Examples**:
+- `spotify:track:3n3Ppam7vgaVa1iaRUc9Lp`
+- `soundcloud:track:1234567890`
+- `deezer:album:302127`
+
+**Benefits**:
+- Prevents ID collisions across platforms
+- Explicit routing (no lookup required)
+- Self-documenting (ID reveals source platform)
+
+**Applicability to Metadata Aggregator**: Directly applicable. Metadata IDs should be namespaced to prevent collisions.
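+
+Parsing and formatting such IDs could be sketched as below. `NamespacedID` and `ParseID` are hypothetical names, and allowing colons inside the native ID is an assumption.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// NamespacedID is the parsed form of "{platform}:{type}:{native_id}".
+type NamespacedID struct {
+    Platform string
+    Type     string
+    NativeID string
+}
+
+// ParseID splits a namespaced ID into its three parts. Only the first two
+// colons are treated as separators, so the native ID may contain colons.
+func ParseID(s string) (NamespacedID, error) {
+    parts := strings.SplitN(s, ":", 3)
+    if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
+        return NamespacedID{}, fmt.Errorf("invalid namespaced ID %q", s)
+    }
+    return NamespacedID{Platform: parts[0], Type: parts[1], NativeID: parts[2]}, nil
+}
+
+// String renders the ID back into its wire form.
+func (id NamespacedID) String() string {
+    return id.Platform + ":" + id.Type + ":" + id.NativeID
+}
+```
+
+A router can then switch on `Platform` to pick the provider without any lookup table.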
+
+### 7. gRPC for Performance
+
+**Benefits**:
+- HTTP/2 multiplexing (multiple requests over single connection)
+- Binary protocol (smaller payloads than JSON)
+- Streaming support (future use)
+- Strong typing (protobuf)
+
+**Tradeoffs**:
+- Requires client code generation
+- Less human-readable than REST/JSON
+- Tooling less mature than REST
+
+**Applicability to Metadata Aggregator**: Consider gRPC for internal services, REST for public API.
+
+### 8. JWT Authentication
+
+**Implementation**: HS256 tokens with bcrypt password hashing
+
+**Benefits**:
+- Stateless authentication (no session storage)
+- Token expiration (15-minute access tokens, 7-day refresh tokens)
+- Secure password storage (bcrypt cost 10)
+
+**Limitations**:
+- No token revocation
+- No refresh token rotation
+- Single shared secret (HS256)
+
+**Applicability to Metadata Aggregator**: JWT is suitable, but consider RS256 (asymmetric) for better security.
+
+### 9. SoundCloud Client ID Rotation
+
+**Pattern**: Rotate through multiple client IDs to avoid rate limits
+
+**Implementation**:
+```go
+func (p *SoundCloudProvider) getClientID() string {
+    p.mu.Lock()
+    defer p.mu.Unlock()
+    
+    id := p.clientIDs[p.currentID]
+    p.currentID = (p.currentID + 1) % len(p.clientIDs)
+    
+    return id
+}
+```
+
+**Benefits**:
+- Raises the effective rate limit (4 IDs give up to 4x the limit, assuming limits are enforced per client ID)
+- Automatic rotation (no manual intervention)
+
+**Applicability to Metadata Aggregator**: Applicable for providers with rate limits (rotate API keys).
+
+### 10. Batch Hydration (SoundCloud)
+
+**Pattern**: Fetch details for multiple IDs in single request
+
+**Implementation**: SoundCloud allows up to 30 IDs per request
+
+**Benefits**:
+- Reduces API calls (30x reduction for playlists)
+- Faster response times
+- Lower rate limit consumption
+
+**Applicability to Metadata Aggregator**: Applicable only where a provider offers a batch or multi-ID endpoint; confirm this per provider (for example MusicBrainz, Discogs) before designing around it.
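+
+Splitting a list of IDs into provider-sized batches can be sketched as follows. `chunk` is a hypothetical helper, and the batch size is whatever the provider allows (30 in the SoundCloud case described above).
+
+```go
+package main
+
+// chunk splits ids into consecutive batches of at most size elements.
+// A non-positive size yields no batches.
+func chunk(ids []string, size int) [][]string {
+    if size <= 0 {
+        return nil
+    }
+    var batches [][]string
+    for start := 0; start < len(ids); start += size {
+        end := start + size
+        if end > len(ids) {
+            end = len(ids)
+        }
+        batches = append(batches, ids[start:end])
+    }
+    return batches
+}
+```
+
+With a batch size of 30, a 65-track playlist needs 3 requests instead of 65.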
+
+## Weaknesses
+
+### 1. No Caching
+
+**Impact**:
+- High latency (200-500ms per search)
+- Provider API rate limits are reached sooner
+- Unnecessary API quota consumption
+- No offline capability
+
+**Recommendation**: Implement Redis caching
+
+**Cache Strategy**:
+- Track metadata: 1 hour TTL
+- Search results: 5 minutes TTL
+- Stream URLs: at most 1 hour TTL, and never longer than the URL's remaining lifetime (URLs expire after 1-6 hours)
+- Lyrics: 24 hours TTL (rarely change)
+
+**Applicability to Metadata Aggregator**: Critical. Metadata aggregator must cache to avoid repeated API calls.
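+
+The per-type TTL idea can be illustrated without Redis. The sketch below is an in-memory stand-in, not a Redis client; `ttlCache` and its methods are hypothetical, and the injectable clock exists only to make expiry testable.
+
+```go
+package main
+
+import (
+    "sync"
+    "time"
+)
+
+type entry struct {
+    value     string
+    expiresAt time.Time
+}
+
+// ttlCache is a minimal in-memory cache with per-entry expiry.
+type ttlCache struct {
+    mu    sync.Mutex
+    items map[string]entry
+    now   func() time.Time // injectable clock, for tests
+}
+
+func newTTLCache() *ttlCache {
+    return &ttlCache{items: make(map[string]entry), now: time.Now}
+}
+
+// Set stores value under key for the given time-to-live.
+func (c *ttlCache) Set(key, value string, ttl time.Duration) {
+    c.mu.Lock()
+    defer c.mu.Unlock()
+    c.items[key] = entry{value: value, expiresAt: c.now().Add(ttl)}
+}
+
+// Get returns the cached value if it is present and not yet expired.
+func (c *ttlCache) Get(key string) (string, bool) {
+    c.mu.Lock()
+    defer c.mu.Unlock()
+    e, ok := c.items[key]
+    if !ok || !c.now().Before(e.expiresAt) {
+        delete(c.items, key) // drop expired entries lazily
+        return "", false
+    }
+    return e.value, true
+}
+```
+
+A caller would pick the TTL by data type, for example 5 minutes for search results and 24 hours for lyrics, following the strategy above.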
+
+### 2. Minimal Database Schema
+
+**Current**: Single `users` table (authentication only)
+
+**Missing**:
+- No metadata persistence (tracks, albums, artists)
+- No user data (favorites, playlists, history)
+- No analytics (play counts, search trends)
+
+**Impact**:
+- All data is ephemeral (fetched from providers every time)
+- No historical data
+- No offline access
+- No data ownership
+
+**Applicability to Metadata Aggregator**: Metadata aggregator needs rich schema for metadata persistence.
+
+### 3. No Monitoring
+
+**Missing**:
+- Prometheus metrics (request rate, error rate, latency)
+- Grafana dashboards
+- Distributed tracing (Jaeger)
+- Log aggregation (Loki)
+
+**Impact**:
+- No visibility into performance
+- No alerting on failures
+- Difficult to debug production issues
+
+**Recommendation**: Implement full observability stack
+
+**Applicability to Metadata Aggregator**: Critical for production. Monitoring is essential.
+
+### 4. No Rate Limiting
+
+**Missing**:
+- Per-user rate limiting
+- Per-IP rate limiting
+- Provider-level rate limiting
+
+**Impact**:
+- Abuse possible (unlimited requests)
+- Provider API rate limits can be exceeded
+- No protection against DDoS
+
+**Recommendation**: Implement rate limiting
+
+**Example**:
+```go
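+import (
+    "sync"
+
+    "golang.org/x/time/rate"
+)
+
+var (
+    mu       sync.Mutex // guards limiters; handlers run concurrently
+    limiters = make(map[string]*rate.Limiter)
+)
+
+// getLimiter returns the limiter for userID, creating it on first use.
+func getLimiter(userID string) *rate.Limiter {
+    mu.Lock()
+    defer mu.Unlock()
+    limiter, exists := limiters[userID]
+    if !exists {
+        limiter = rate.NewLimiter(rate.Limit(10), 10) // 10 req/sec, burst of 10
+        limiters[userID] = limiter
+    }
+    return limiter
+}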
+```
+
+**Applicability to Metadata Aggregator**: Critical. Rate limiting prevents abuse and protects provider APIs.
+
+### 5. Stub Providers (Yandex, VK)
+
+**Status**: Placeholder only, no implementation
+
+**Impact**:
+- Incomplete platform coverage
+- Misleading (listed as supported but not functional)
+
+**Recommendation**: Remove stubs or implement fully
+
+**Applicability to Metadata Aggregator**: Don't list providers as supported unless fully implemented.
+
+### 6. No TLS
+
+**Current**: gRPC and HTTP without TLS
+
+**Impact**:
+- Credentials transmitted in plaintext
+- JWT tokens exposed
+- Man-in-the-middle attacks possible
+
+**Recommendation**: Deploy behind reverse proxy with TLS termination
+
+**Applicability to Metadata Aggregator**: TLS is mandatory for production.
+
+### 7. Go Version Mismatch
+
+**Issue**: `go.mod` specifies 1.25, Dockerfile uses 1.23
+
+**Impact**:
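+- Build failures: since Go 1.21 the `go` line in `go.mod` is a minimum version, so a 1.23 toolchain refuses to build the module unless it is allowed to download a newer toolchain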
+- Inconsistent builds
+
+**Fix**:
+```dockerfile
+FROM golang:1.25-alpine AS builder
+```
+
+**Applicability to Metadata Aggregator**: Keep build environment in sync with go.mod.
+
+### 8. Custom Submodule Dependency
+
+**Issue**: `spotapi-go` is a custom fork, not an official library
+
+**Impact**:
+- Maintenance burden
+- Submodule initialization required
+- Potential security issues (unmaintained fork)
+
+**Recommendation**: Use the official library directly
+
+**Applicability to Metadata Aggregator**: Avoid custom forks. Use official libraries or vendor dependencies.
+
+### 9. No Unit Tests
+
+**Current**: Integration tests only (require running server and providers)
+
+**Missing**:
+- Provider adapter unit tests (mocked HTTP responses)
+- Database store unit tests (mocked database)
+- Authentication unit tests (mocked JWT)
+
+**Impact**:
+- Slow test execution
+- Difficult to test edge cases
+- Requires provider credentials for testing
+
+**Recommendation**: Add unit tests with mocks
+
+**Applicability to Metadata Aggregator**: Unit tests are essential for fast feedback and edge case coverage.
+
+### 10. Health Check Stub
+
+**Current**: `GetServiceStatus` always returns healthy
+
+**Impact**:
+- No actual health monitoring
+- Kubernetes probes don't detect failures
+- No dependency health visibility
+
+**Recommendation**: Implement real health checks
+
+**Applicability to Metadata Aggregator**: Health checks are critical for orchestration (Kubernetes, Docker Swarm).
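+
+A real health check could probe each dependency and report per-dependency results, as in the sketch below. `checkFunc` and `runChecks` are hypothetical names; the actual probes (a database ping, a cheap provider request) are left to the caller.
+
+```go
+package main
+
+import (
+    "context"
+    "time"
+)
+
+// checkFunc probes one dependency; a nil error means healthy.
+type checkFunc func(ctx context.Context) error
+
+// runChecks runs every check under a shared timeout and reports, per
+// dependency name, the error message ("" when healthy). healthy is true
+// only if all checks passed.
+func runChecks(ctx context.Context, timeout time.Duration, checks map[string]checkFunc) (healthy bool, report map[string]string) {
+    ctx, cancel := context.WithTimeout(ctx, timeout)
+    defer cancel()
+
+    healthy = true
+    report = make(map[string]string, len(checks))
+    for name, check := range checks {
+        if err := check(ctx); err != nil {
+            healthy = false
+            report[name] = err.Error()
+            continue
+        }
+        report[name] = ""
+    }
+    return healthy, report
+}
+```
+
+The checks run sequentially here for simplicity; with many dependencies they could be fanned out in the same way as provider queries.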
+
+### 11. No Pagination
+
+**Current**: Search results limited by `limit` parameter (max 50)
+
+**Impact**:
+- Large result sets cannot be retrieved incrementally
+- No cursor-based pagination
+- No total count
+
+**Recommendation**: Add pagination
+
+**Example**:
+```protobuf
+message SearchRequest {
+    string query = 1;
+    int32 limit = 2;
+    string cursor = 3; // Pagination cursor
+}
+
+message SearchTracksResponse {
+    repeated Track tracks = 1;
+    string next_cursor = 2; // Next page cursor
+    int32 total = 3; // Total result count
+}
+```
+
+**Applicability to Metadata Aggregator**: Pagination is essential for large result sets.
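+
+On the server side, the `cursor` and `next_cursor` fields could be handled as sketched below. Encoding a plain offset is an assumption made for brevity; because the cursor is opaque to clients, the encoding can change later. All function names are hypothetical.
+
+```go
+package main
+
+import (
+    "encoding/base64"
+    "fmt"
+    "strconv"
+)
+
+// encodeCursor turns an offset into an opaque, URL-safe cursor string.
+func encodeCursor(offset int) string {
+    return base64.RawURLEncoding.EncodeToString([]byte(strconv.Itoa(offset)))
+}
+
+// decodeCursor reverses encodeCursor. An empty cursor means "first page".
+func decodeCursor(cursor string) (int, error) {
+    if cursor == "" {
+        return 0, nil
+    }
+    raw, err := base64.RawURLEncoding.DecodeString(cursor)
+    if err != nil {
+        return 0, fmt.Errorf("malformed cursor: %w", err)
+    }
+    offset, err := strconv.Atoi(string(raw))
+    if err != nil || offset < 0 {
+        return 0, fmt.Errorf("malformed cursor %q", cursor)
+    }
+    return offset, nil
+}
+
+// page returns up to limit items starting at the cursor, plus the cursor
+// for the next page ("" when there are no more items).
+func page(items []string, cursor string, limit int) ([]string, string, error) {
+    offset, err := decodeCursor(cursor)
+    if err != nil {
+        return nil, "", err
+    }
+    if limit <= 0 || offset >= len(items) {
+        return nil, "", nil
+    }
+    if limit >= len(items)-offset {
+        return items[offset:], "", nil // last page
+    }
+    end := offset + limit
+    return items[offset:end], encodeCursor(end), nil
+}
+```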
+
+### 12. No API Versioning
+
+**Current**: No version in package name or endpoint
+
+**Impact**:
+- Breaking changes affect all clients
+- No backward compatibility
+- No deprecation path
+
+**Recommendation**: Add versioning
+
+**Example**:
+```protobuf
+package bedrock.v1;
+
+service BedrockService {
+    // ...
+}
+```
+
+**Applicability to Metadata Aggregator**: API versioning is critical for backward compatibility.
+
+## Integration Complexity
+
+### Provider Integration Effort
+
+| Provider | Complexity | Reason |
+|----------|------------|--------|
+| Spotify | Medium | OAuth 2.0, submodule dependency |
+| SoundCloud | Low | Simple HTTP API, client ID rotation |
+| Deezer | Low | Public API, no auth |
+| YouTube Music | High | Undocumented Innertube API, 7-client fallback, cipher handling |
+| Yandex | Unknown | Not implemented |
+| VK | Unknown | Not implemented |
+
+**Easiest**: Deezer (public API, no auth)  
+**Hardest**: YouTube Music (undocumented API, complex fallback logic)
+
+### Client Integration Effort
+
+**gRPC Clients**: Requires protobuf compilation
+
+**Steps**:
+1. Install protoc compiler
+2. Install language-specific protobuf plugin
+3. Generate client code from `.proto` file
+4. Implement authentication (JWT in metadata)
+
+**Example** (Go):
+```bash
+protoc --go_out=. --go-grpc_out=. bedrock_service.proto
+```
+
+**Example** (Python):
+```bash
+python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. bedrock_service.proto
+```
+
+**Complexity**: Medium (requires tooling setup)
+
+**Alternative**: Provide pre-generated clients for popular languages
+
+## Performance Analysis
+
+### Latency Breakdown
+
+**Typical Search Request** (4 providers):
+
+| Component | Latency | Notes |
+|-----------|---------|-------|
+| gRPC overhead | 1-5ms | Minimal |
+| Authentication | 1-2ms | JWT validation |
+| Provider queries (parallel) | 200-500ms | Bounded by the slowest provider |
+| Response aggregation | 1-5ms | Mutex-protected append |
+| **Total** | **~203-512ms** | Dominated by provider latency |
+
+**Optimization Opportunities**:
+- Cache metadata (reduce provider calls)
+- Implement timeouts (don't wait for slow providers)
+- Add circuit breakers (skip failing providers)
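+
+The timeout suggestion can be combined with the fan-out pattern, as in the sketch below. `searchFunc` and `fanOut` are hypothetical names, and the approach only works if each provider call honors context cancellation.
+
+```go
+package main
+
+import (
+    "context"
+    "sync"
+    "time"
+)
+
+// searchFunc is a hypothetical signature for one provider's search call.
+type searchFunc func(ctx context.Context, query string) ([]string, error)
+
+// fanOut queries all providers in parallel, giving each at most perProvider
+// to respond. It returns the merged results and the names of providers that
+// failed or timed out.
+func fanOut(ctx context.Context, query string, perProvider time.Duration, providers map[string]searchFunc) (results []string, failed []string) {
+    var (
+        wg sync.WaitGroup
+        mu sync.Mutex // guards results and failed
+    )
+    for name, search := range providers {
+        wg.Add(1)
+        go func(name string, search searchFunc) {
+            defer wg.Done()
+            pctx, cancel := context.WithTimeout(ctx, perProvider)
+            defer cancel()
+
+            found, err := search(pctx, query)
+            mu.Lock()
+            defer mu.Unlock()
+            if err != nil {
+                failed = append(failed, name)
+                return
+            }
+            results = append(results, found...)
+        }(name, search)
+    }
+    wg.Wait()
+    return results, failed
+}
+```
+
+With a per-provider budget, total latency is capped near that budget rather than by the slowest provider.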
+
+### Throughput
+
+**Single Instance** (no caching):
+- Requests per second: ~10-20 (limited by provider APIs)
+- Concurrent requests: No explicit cap (goroutines are spawned without a bound, which risks resource exhaustion under load)
+
+**With Caching** (Redis):
+- Requests per second: ~1000+ (cache hits)
+- Concurrent requests: Limited by database connections (10 max)
+
+**Scaling**:
+- Horizontal: Run multiple instances behind load balancer
+- Vertical: Increase CPU/RAM for single instance
+
+### Resource Usage
+
+**Memory**: ~50-100 MB (idle), ~200-500 MB (under load)  
+**CPU**: Low (I/O bound, waiting on provider APIs)  
+**Network**: High (streaming proxy, provider API calls)
+
+## Security Assessment
+
+### Authentication
+
+**Strengths**:
+- JWT tokens (stateless)
+- bcrypt password hashing (secure)
+- gRPC interceptors (centralized auth)
+
+**Weaknesses**:
+- No token revocation
+- No refresh token rotation
+- Single shared secret (HS256)
+- No rate limiting (brute force possible)
+- No account lockout
+
+**Risk Level**: Medium
+
+**Recommendations**:
+- Implement token revocation list (Redis)
+- Use RS256 (asymmetric keys)
+- Add rate limiting on auth endpoints
+- Add account lockout after failed attempts
+
+### Transport Security
+
+**Strengths**: None (no TLS)
+
+**Weaknesses**:
+- Credentials transmitted in plaintext
+- JWT tokens exposed
+- Man-in-the-middle attacks possible
+
+**Risk Level**: High
+
+**Recommendations**:
+- Deploy behind reverse proxy with TLS
+- Use Let's Encrypt for free certificates
+- Enforce HTTPS redirects
+
+### Input Validation
+
+**Strengths**:
+- Parameterized queries (SQL injection safe)
+- Email format validation
+
+**Weaknesses**:
+- No query length limits
+- No ID format validation
+- No limit parameter bounds
+
+**Risk Level**: Low (no SQL injection, but potential DoS)
+
+**Recommendations**:
+- Validate all inputs (length, format, bounds)
+- Sanitize user-provided data
+- Add request size limits
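+
+Length and bounds checks for a search request might look like the following sketch. The constants are hypothetical, apart from the maximum of 50 noted earlier, and `normalizeSearch` is an illustrative name.
+
+```go
+package main
+
+import (
+    "errors"
+    "unicode/utf8"
+)
+
+// Hypothetical bounds; tune them to the real API's needs.
+const (
+    maxQueryRunes = 200
+    maxLimit      = 50
+    defaultLimit  = 20
+)
+
+var (
+    errEmptyQuery   = errors.New("query must not be empty")
+    errQueryTooLong = errors.New("query too long")
+)
+
+// normalizeSearch validates the query and clamps limit into [1, maxLimit],
+// substituting defaultLimit when the client sent none.
+func normalizeSearch(query string, limit int) (int, error) {
+    switch n := utf8.RuneCountInString(query); {
+    case n == 0:
+        return 0, errEmptyQuery
+    case n > maxQueryRunes:
+        return 0, errQueryTooLong
+    }
+    switch {
+    case limit <= 0:
+        return defaultLimit, nil
+    case limit > maxLimit:
+        return maxLimit, nil
+    }
+    return limit, nil
+}
+```
+
+Clamping the limit rather than rejecting it is a design choice; returning an error for out-of-range values is equally valid.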
+
+### Secrets Management
+
+**Strengths**: None (plaintext `.env` files)
+
+**Weaknesses**:
+- Secrets in plaintext
+- No rotation
+- No encryption at rest
+
+**Risk Level**: Medium
+
+**Recommendations**:
+- Use secrets manager (AWS Secrets Manager, Vault)
+- Rotate secrets periodically
+- Encrypt secrets at rest
+
+## Scalability
+
+### Vertical Scaling
+
+**Current Limits**:
+- Database connections: 10 max
+- Goroutines: Unbounded (risky)
+- Memory: ~500 MB under load
+
+**Scaling Up**:
+- Increase database connection pool
+- Add worker pool (bounded goroutines)
+- Increase instance size (CPU, RAM)
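+
+Bounding goroutines does not require a full worker pool; a buffered channel used as a semaphore is often enough. The sketch below assumes tasks are independent; `runBounded` is a hypothetical helper.
+
+```go
+package main
+
+import "sync"
+
+// runBounded executes every task, but never more than maxParallel at once.
+// A buffered channel acts as a counting semaphore.
+func runBounded(maxParallel int, tasks []func()) {
+    if maxParallel < 1 {
+        maxParallel = 1
+    }
+    sem := make(chan struct{}, maxParallel)
+    var wg sync.WaitGroup
+    for _, task := range tasks {
+        wg.Add(1)
+        sem <- struct{}{} // blocks while maxParallel tasks are in flight
+        go func(run func()) {
+            defer wg.Done()
+            defer func() { <-sem }() // release the slot when done
+            run()
+        }(task)
+    }
+    wg.Wait()
+}
+```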
+
+**Max Capacity** (single instance): ~100 req/sec (with caching)
+
+### Horizontal Scaling
+
+**Stateless Design**: Yes (JWT tokens, no sessions)
+
+**Scaling Out**:
+- Run multiple instances behind load balancer
+- Share PostgreSQL database (read replicas for reads)
+- Share Redis cache (cluster mode)
+
+**Max Capacity** (10 instances): ~1000 req/sec (with caching)
+
+### Database Scaling
+
+**Current**: Single PostgreSQL instance
+
+**Scaling Options**:
+- Read replicas (for read-heavy workloads)
+- Connection pooler (PgBouncer)
+- Sharding (by user ID)
+
+**Bottleneck**: The database is not currently a bottleneck (minimal schema, simple queries)
+
+## Maintainability
+
+### Code Organization
+
+**Strengths**:
+- Clean provider abstraction
+- Separation of concerns (providers, store, auth)
+
+**Weaknesses**:
+- Single 1300+ line file (`main.go`)
+- No package documentation
+- No API documentation
+
+**Recommendation**: Split `main.go` by domain (search, retrieval, streaming, etc.)
+
+### Testing
+
+**Strengths**:
+- Integration tests for all providers
+- GitHub Actions CI/CD
+
+**Weaknesses**:
+- No unit tests
+- No test coverage reporting
+- No mocks
+
+**Recommendation**: Add unit tests with mocks, measure coverage
+
+### Documentation
+
+**Strengths**:
+- README with setup instructions
+- `.env.example` template
+
+**Weaknesses**:
+- No API documentation (OpenAPI/Swagger)
+- No architecture documentation
+- No deployment guide
+
+**Recommendation**: Add comprehensive documentation
+
+### Dependency Management
+
+**Strengths**:
+- Go modules (versioned dependencies)
+- Minimal dependencies (8 direct)
+
+**Weaknesses**:
+- Custom submodule (spotapi-go)
+- No automated updates (Dependabot)
+
+**Recommendation**: Remove submodule, add Dependabot
+
+## Comparison to Metadata Aggregator Requirements
+
+### Alignment
+
+| Requirement | Bedrock-API | Metadata Aggregator | Alignment |
+|-------------|-------------|---------------------|-----------|
+| Multi-provider aggregation | Yes (4 active) | Yes (10+ planned) | High |
+| Parallel queries | Yes (goroutines) | Yes | High |
+| Partial response handling | Yes | Yes | High |
+| Metadata persistence | No | Yes | Low |
+| Caching | No | Yes (critical) | Low |
+| Rich metadata | Medium | High | Medium |
+| Streaming | Yes | No | N/A |
+| Authentication | JWT | TBD | Medium |
+| Monitoring | No | Yes | Low |
+| Testing | Integration only | Unit + Integration | Medium |
+
+### Reusable Patterns
+
+**Directly Applicable**:
+- Provider interface pattern
+- Fan-out concurrency
+- Partial response handling
+- ID namespacing
+- gRPC interceptors
+
+**Needs Adaptation**:
+- Authentication (add RBAC, token revocation)
+- Database schema (expand for metadata)
+- Caching (add Redis)
+- Monitoring (add Prometheus)
+
+**Not Applicable**:
+- Stream resolution (metadata aggregator doesn't need streaming)
+- YouTube 7-client fallback (specific to YouTube)
+
+## Recommendations for Metadata Aggregator
+
+### Adopt
+
+1. **Provider Interface Pattern**: Clean abstraction for platform-specific logic
+2. **Fan-Out Concurrency**: Parallel queries for fast responses
+3. **Partial Response Handling**: Resilient to individual provider failures
+4. **ID Namespacing**: Prevent collisions, enable explicit routing
+5. **gRPC for Internal Services**: Performance benefits for service-to-service communication
+6. **JWT Authentication**: Stateless, scalable authentication
+7. **bcrypt Password Hashing**: Secure password storage
+
+### Avoid
+
+1. **No Caching**: Implement Redis from day one
+2. **Minimal Database Schema**: Design rich schema for metadata persistence
+3. **No Monitoring**: Implement Prometheus + Grafana from start
+4. **No Rate Limiting**: Add rate limiting to prevent abuse
+5. **Stub Providers**: Only list fully implemented providers
+6. **No TLS**: Deploy with TLS from start
+7. **Custom Submodules**: Use official libraries or vendor dependencies
+8. **No Unit Tests**: Write unit tests with mocks
+9. **Single Large File**: Split code by domain
+10. **No API Versioning**: Version API from start
+
+### Enhance
+
+1. **Add Caching Layer**: Redis for metadata, search results, provider responses
+2. **Expand Database Schema**: Tables for tracks, albums, artists, labels, genres, etc.
+3. **Implement Monitoring**: Prometheus metrics, Grafana dashboards, distributed tracing
+4. **Add Rate Limiting**: Per-user, per-IP, per-provider limits
+5. **Implement Health Checks**: Real health checks for dependencies
+6. **Add Pagination**: Cursor-based pagination for large result sets
+7. **Add API Versioning**: Version API for backward compatibility
+8. **Add Comprehensive Testing**: Unit tests with mocks, integration tests, E2E tests
+9. **Add Documentation**: API docs (OpenAPI), architecture docs, deployment guide
+10. **Add Security Features**: Token revocation, refresh token rotation, RS256, TLS
+
+## Final Verdict
+
+**Overall Assessment**: Good architectural foundation, but lacks production-readiness features.
+
+**Strengths**: Clean provider abstraction, fan-out concurrency, partial response handling, cross-platform stream resolution.
+
+**Weaknesses**: No caching, minimal database schema, no monitoring, no rate limiting, no TLS, stub providers.
+
+**Maturity Level**: Early production (functional but missing critical features).
+
+**Recommendation for Metadata Aggregator**: Adopt core patterns (provider interface, fan-out concurrency, partial responses, ID namespacing), but enhance with caching, monitoring, comprehensive testing, and security features.
+
+**Effort to Adapt**: Medium (core patterns are reusable, but significant enhancements needed for production).
+
+**Value Proposition**: Bedrock-API demonstrates proven patterns for multi-provider aggregation. The metadata aggregator can learn from its strengths (clean abstraction, concurrency, resilience) while avoiding its weaknesses (no caching, minimal schema, no monitoring).