Bedrock-API Overview

Project Identity

Repository: https://github.com/feralbureau/bedrock-api
Language: Go 1.25
License: MIT
Primary Protocols: gRPC, HTTP
Database: PostgreSQL 15
Entry Point: bedrock_server/main.go

Bedrock-API is a unified music metadata and streaming aggregation service that puts six music platforms (four implemented, two stubs) behind a single gRPC interface. Its core value proposition is cross-platform stream resolution: when a platform does not provide streaming (Spotify partner API, Deezer public API), Bedrock bridges to SoundCloud or YouTube Music to deliver playable URLs.

Platform Coverage

| Platform | Status | API Type | Streaming | Authentication | Special Features |
|---|---|---|---|---|---|
| Spotify | Full | Partner API | No (bridged) | OAuth via submodule | Full discography, namespaced IDs |
| SoundCloud | Full | api-v2 | Yes (progressive MP3) | Client ID rotation | Batch hydration (30 IDs), /resolve endpoint |
| Deezer | Full | Public API | No (bridged) | None | Concurrent artist data fetching |
| YouTube Music | Full | Innertube | Yes (7-client fallback) | Cookies for age-restricted | WEB_REMIX metadata, itag priority |
| Yandex Music | Stub | N/A | No | N/A | Placeholder only |
| VK Music | Stub | N/A | No | N/A | Placeholder only |

Active Platforms: 4 (Spotify, SoundCloud, Deezer, YouTube Music)
Stub Platforms: 2 (Yandex, VK)

Core Capabilities

gRPC Service Interface

Total Methods: 23 RPC endpoints (the category breakdown below accounts for 20 of them)
Protocol Buffer: bedrock_service.proto (622 lines)

Method categories:

  • Search: 4 methods (tracks, albums, artists, playlists)
  • Retrieval: 4 methods (get track, album, artist, playlist by ID)
  • Streaming: 1 method (GetStreamURL)
  • Discovery: 1 method (GetSimilarTracks)
  • Lyrics: 2 methods (GetLyrics, GetSyncedLyrics)
  • Statistics: 3 methods (GetTopTracks, GetTopAlbums, GetTopArtists)
  • Import: 1 method (ImportPlaylist)
  • Health: 1 method (GetServiceStatus)
  • Authentication: 3 methods (Register, Login, RefreshToken)

HTTP Streaming Proxy

Endpoints:

  • /stream/{service}/{id} - Audio stream proxy with range request support
  • /cover/{service}/{id} - Album art proxy

Ports:

  • gRPC: :50052
  • HTTP: :8080

Both endpoints support HTTP range requests for seeking and partial content delivery.
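
Range support in a proxy typically means forwarding the client's Range header upstream and relaying the 206 response back. The following is a minimal sketch of that idea using only the standard library. The proxyStream and fetchRange names are hypothetical and the upstream is an in-memory test server, so none of this is taken from Bedrock's actual proxy.go.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// proxyStream returns a handler that relays upstreamURL to the client.
// The name and signature are hypothetical, not Bedrock's actual proxy code.
func proxyStream(client *http.Client, upstreamURL string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, upstreamURL, nil)
		if err != nil {
			http.Error(w, "bad upstream URL", http.StatusInternalServerError)
			return
		}
		// Forward the Range header so the upstream can answer with 206.
		if rng := r.Header.Get("Range"); rng != "" {
			req.Header.Set("Range", rng)
		}
		resp, err := client.Do(req)
		if err != nil {
			http.Error(w, "upstream unavailable", http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		// Relay the headers a player needs for seeking.
		for _, h := range []string{"Content-Type", "Content-Length", "Content-Range", "Accept-Ranges"} {
			if v := resp.Header.Get(h); v != "" {
				w.Header().Set(h, v)
			}
		}
		w.WriteHeader(resp.StatusCode)
		io.Copy(w, resp.Body)
	}
}

// fetchRange runs the proxy against an in-memory upstream and returns the
// status code and body for the given Range header value.
func fetchRange(rng string) (int, string) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// http.ServeContent implements Range handling on the upstream side.
		http.ServeContent(w, r, "audio.bin", time.Time{}, strings.NewReader("0123456789"))
	}))
	defer upstream.Close()

	req := httptest.NewRequest(http.MethodGet, "/stream/soundcloud/123", nil)
	if rng != "" {
		req.Header.Set("Range", rng)
	}
	rec := httptest.NewRecorder()
	proxyStream(upstream.Client(), upstream.URL)(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := fetchRange("bytes=2-5")
	fmt.Println(code, body)
}
```

Relaying Content-Range and Accept-Ranges unchanged is what lets a player seek without the proxy buffering the whole file.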

Technology Stack

Core Dependencies

google.golang.org/grpc v1.79.1
google.golang.org/protobuf v1.36.4
github.com/jackc/pgx/v5 v5.7.2
github.com/golang-jwt/jwt/v5 v5.2.1
golang.org/x/crypto (bcrypt)
github.com/joho/godotenv v1.5.1

Provider Libraries

github.com/zmb3/spotify/v2 (via spotapi-go submodule)
github.com/kkdai/youtube/v2 v2.10.3
github.com/rhnvrm/lyric-api-go v0.1.4 (Genius)

Submodule: spotapi-go (custom Spotify client wrapper)

Build Requirements

  • Go 1.25 (go.mod specification)
  • Git submodules (spotapi-go)
  • PostgreSQL 15+ (runtime)
  • Protocol buffer compiler (development)

Architecture Highlights

Fan-Out Concurrency Pattern

All search and retrieval methods execute parallel goroutines across enabled providers:

var wg sync.WaitGroup
for _, provider := range providers {
    wg.Add(1)
    go func(p trackProvider) {
        defer wg.Done()
        results, err := p.SearchTracks(query, limit)
        // aggregate results
    }(provider)
}
wg.Wait()

Because the providers are queried concurrently, total latency tracks the slowest provider rather than the sum of all four, which is what keeps sub-second responses within reach.
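
The snippet above leaves result aggregation as a comment. A self-contained sketch of one way to fill it in follows, assuming a trackProvider interface with Name and SearchTracks methods and tracks represented as plain strings. The real method set and result types in Bedrock are not shown in this document, and fakeProvider and searchAll are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

var errRateLimited = errors.New("rate limited")

// trackProvider reuses the interface name from the snippet above; its exact
// method set in Bedrock is an assumption.
type trackProvider interface {
	Name() string
	SearchTracks(query string, limit int) ([]string, error)
}

// fakeProvider is a hypothetical in-memory provider used for illustration.
type fakeProvider struct {
	name string
	err  error
}

func (f fakeProvider) Name() string { return f.name }

func (f fakeProvider) SearchTracks(query string, limit int) ([]string, error) {
	if f.err != nil {
		return nil, f.err
	}
	return []string{f.name + ":track:" + query}, nil
}

// searchAll queries every provider concurrently and merges the results.
// Failures are collected per provider instead of aborting the whole search.
func searchAll(providers []trackProvider, query string, limit int) ([]string, map[string]error) {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex // guards tracks and failed
		tracks []string
		failed = make(map[string]error)
	)
	for _, provider := range providers {
		wg.Add(1)
		go func(p trackProvider) {
			defer wg.Done()
			results, err := p.SearchTracks(query, limit)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				failed[p.Name()] = err
				return
			}
			tracks = append(tracks, results...)
		}(provider)
	}
	wg.Wait()
	sort.Strings(tracks) // goroutine completion order is not deterministic
	return tracks, failed
}

func main() {
	providers := []trackProvider{
		fakeProvider{name: "soundcloud"},
		fakeProvider{name: "deezer"},
		fakeProvider{name: "spotify", err: errRateLimited},
	}
	tracks, failed := searchAll(providers, "hello", 10)
	fmt.Println(tracks, len(failed))
}
```

The mutex matters because several goroutines append to the same slice. Without it the merge would be a data race.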

Stream Resolution Bridge

Problem: Spotify partner API and Deezer public API don't provide streaming URLs.

Solution: A three-tier fallback cascade that returns the first successful stream URL:

  1. If the requested platform supports streaming (SoundCloud, YouTube Music), resolve the stream there
  2. If not, search SoundCloud for "{artist} - {title}"
  3. If SoundCloud fails, search YouTube Music with the same query

Implementation: resolver.go (listed under bedrock_server/ in the file structure)
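
The cascade can be sketched as an ordered walk over streaming-capable sources. This is an illustration under stated assumptions, not the contents of resolver.go: the track and streamSource types, their fields, and resolveStream are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoStream = errors.New("no playable stream found")

// track holds the fields the resolver needs. Field names are hypothetical.
type track struct {
	Platform string // e.g. "spotify"
	NativeID string
	Artist   string
	Title    string
}

// streamSource is a streaming-capable platform that can either resolve one of
// its own IDs or search by free-text query.
type streamSource struct {
	name   string
	byID   func(id string) (string, error)
	search func(query string) (string, error)
}

// resolveStream walks the cascade and returns the first stream URL it finds.
// sources must be ordered by priority (SoundCloud before YouTube Music).
func resolveStream(t track, sources []streamSource) (string, error) {
	// Tier 1: the requested platform streams natively.
	for _, s := range sources {
		if s.name == t.Platform {
			if u, err := s.byID(t.NativeID); err == nil {
				return u, nil
			}
		}
	}
	// Tiers 2 and 3: bridge through the other sources by artist and title.
	query := t.Artist + " - " + t.Title
	var errs []error
	for _, s := range sources {
		if s.name == t.Platform {
			continue // already tried natively
		}
		u, err := s.search(query)
		if err == nil {
			return u, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", s.name, err))
	}
	return "", errors.Join(append(errs, errNoStream)...)
}

func main() {
	soundcloud := streamSource{
		name:   "soundcloud",
		byID:   func(id string) (string, error) { return "sc-stream/" + id, nil },
		search: func(q string) (string, error) { return "", errNoStream },
	}
	youtube := streamSource{
		name:   "youtube",
		byID:   func(id string) (string, error) { return "yt-stream/" + id, nil },
		search: func(q string) (string, error) { return "yt-search/" + q, nil },
	}
	u, err := resolveStream(
		track{Platform: "spotify", Artist: "Daft Punk", Title: "One More Time"},
		[]streamSource{soundcloud, youtube},
	)
	fmt.Println(u, err)
}
```

Collecting the per-source errors before giving up keeps enough context to tell "nothing matched" apart from "a bridge was down".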

YouTube Music 7-Client Fallback Pool

YouTube Music streams use a client rotation strategy to maximize success rate:

TVHTML5_SIMPLY_EMBEDDED (primary)
TVHTML5
ANDROID_VR (variant 1)
ANDROID_VR (variant 2)
ANDROID
IOS
WEB

Each client has different capabilities and restrictions. The service tries clients sequentially until a valid stream URL is obtained. Ciphered streams fall back to SoundCloud.
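
A sequential client fallback can be sketched as a loop over the ordered client names. The sketch below assumes a hypothetical fetchFunc that stands in for the real Innertube request, and it distinguishes the two ANDROID_VR variants with an invented "#1"/"#2" suffix, since this document does not say how they differ.

```go
package main

import (
	"errors"
	"fmt"
)

// clientOrder lists the client names from the pool above, in order.
// The "#1" and "#2" suffixes are placeholders for the two ANDROID_VR variants.
var clientOrder = []string{
	"TVHTML5_SIMPLY_EMBEDDED", "TVHTML5", "ANDROID_VR#1", "ANDROID_VR#2",
	"ANDROID", "IOS", "WEB",
}

var (
	errCiphered         = errors.New("stream URL is ciphered")
	errAllClientsFailed = errors.New("all clients failed")
)

// fetchFunc asks for a stream URL using one client identity. It stands in
// for the real Innertube call, which is not shown in this document.
type fetchFunc func(client, videoID string) (string, error)

// firstPlayable tries each client in order and returns the first usable URL
// together with the name of the client that produced it.
func firstPlayable(videoID string, fetch fetchFunc) (string, string, error) {
	var lastErr error
	for _, c := range clientOrder {
		u, err := fetch(c, videoID)
		if err == nil && u != "" {
			return u, c, nil
		}
		lastErr = err
	}
	return "", "", fmt.Errorf("%w (last error: %v)", errAllClientsFailed, lastErr)
}

func main() {
	// Simulated upstream: only the IOS client returns a plain URL.
	fetch := func(client, id string) (string, error) {
		if client == "IOS" {
			return "https://example.invalid/audio/" + id, nil
		}
		return "", errCiphered
	}
	u, client, err := firstPlayable("dQw4w9WgXcQ", fetch)
	fmt.Println(u, client, err)
}
```

A caller can check for errAllClientsFailed with errors.Is and then hand the track to the SoundCloud bridge.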

ID Namespacing

All entity IDs use platform prefixes to avoid collisions:

spotify:track:3n3Ppam7vgaVa1iaRUc9Lp
soundcloud:track:1234567890
deezer:album:302127
youtube:video:dQw4w9WgXcQ

Format: {platform}:{entity_type}:{native_id}
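
Parsing this format is a three-way split. The sketch below shows one way to do it. The EntityID type and ParseEntityID function are hypothetical names, not identifiers from the Bedrock source.

```go
package main

import (
	"fmt"
	"strings"
)

// EntityID is a parsed namespaced identifier. The type name is hypothetical.
type EntityID struct {
	Platform string
	Kind     string
	Native   string
}

// ParseEntityID splits "{platform}:{entity_type}:{native_id}".
// SplitN with a limit of 3 keeps any further colons inside the native ID.
func ParseEntityID(s string) (EntityID, error) {
	parts := strings.SplitN(s, ":", 3)
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return EntityID{}, fmt.Errorf("malformed entity ID %q", s)
	}
	return EntityID{Platform: parts[0], Kind: parts[1], Native: parts[2]}, nil
}

// String renders the ID back into its namespaced form.
func (e EntityID) String() string {
	return e.Platform + ":" + e.Kind + ":" + e.Native
}

func main() {
	id, err := ParseEntityID("spotify:track:3n3Ppam7vgaVa1iaRUc9Lp")
	fmt.Println(id.Platform, id.Kind, id.Native, err)
}
```

The Platform field is what makes explicit routing possible: a dispatcher can pick the provider adapter from it without guessing from the shape of the native ID.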

Data Layer

PostgreSQL Schema

Single Table: users

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    role VARCHAR(50) DEFAULT 'user',
    is_verified BOOLEAN DEFAULT false,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Connection: pgx/v5 with connection pooling
Migrations: db/migrations/ (up/down SQL pairs)

Caching Strategy

Current: No caching implemented
Planned: Redis for:

  • Play deduplication (30s window)
  • Service status cache (5min TTL)
  • Stream URL cache (1hr TTL)

Authentication System

Token Type: JWT (HS256)
Access Token: 15 minutes
Refresh Token: 7 days
Password Hashing: bcrypt (cost 10)

gRPC Interceptor: Validates JWT on all methods except:

  • Register
  • Login
  • RefreshToken
  • GetServiceStatus

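Storage: User credentials are stored in PostgreSQL. Tokens are stateless: they are not persisted server-side and there is no revocation list, so an issued token stays valid until it expires.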
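
The exemption list boils down to a lookup on the method name that a gRPC interceptor receives. The sketch below shows only that decision, using the standard library. The requiresAuth name is hypothetical, and "bedrock.BedrockService" is a placeholder because the actual proto package and service names are not given in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// publicMethods holds the RPCs that skip JWT validation, as listed above.
var publicMethods = map[string]bool{
	"Register":         true,
	"Login":            true,
	"RefreshToken":     true,
	"GetServiceStatus": true,
}

// requiresAuth reports whether a gRPC full method name, which has the form
// "/package.Service/Method", must carry a valid token.
func requiresAuth(fullMethod string) bool {
	method := fullMethod[strings.LastIndex(fullMethod, "/")+1:]
	return !publicMethods[method]
}

func main() {
	// "bedrock.BedrockService" is a placeholder service name.
	fmt.Println(requiresAuth("/bedrock.BedrockService/Login"))
	fmt.Println(requiresAuth("/bedrock.BedrockService/GetStreamURL"))
}
```

Defaulting to "requires auth" means a newly added RPC is protected unless someone deliberately adds it to the public set.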

Lyrics Integration

LrcLib (Synced Lyrics)

Endpoint: https://lrclib.net/api/get
Format: LRC (timestamped)
Timeout: 5 seconds
Matching: Artist + title + album + duration
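
A lookup against this endpoint amounts to a GET with the four matching fields as query parameters. The sketch below builds such a URL and configures a client with the 5-second timeout. The parameter names (artist_name, track_name, album_name, duration) follow LrcLib's public API documentation and should be verified before use. The lrclibURL helper is a hypothetical name, not Bedrock's lrclib.go.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
	"time"
)

// lrclibClient applies the 5-second timeout mentioned above to every request.
var lrclibClient = &http.Client{Timeout: 5 * time.Second}

// lrclibURL builds the lookup URL from the four matching fields.
// durationSec is the track length in whole seconds.
func lrclibURL(artist, title, album string, durationSec int) string {
	q := url.Values{}
	q.Set("artist_name", artist)
	q.Set("track_name", title)
	q.Set("album_name", album)
	q.Set("duration", strconv.Itoa(durationSec))
	// Encode sorts the keys and escapes the values.
	return "https://lrclib.net/api/get?" + q.Encode()
}

func main() {
	fmt.Println(lrclibURL("Daft Punk", "One More Time", "Discovery", 320))
	fmt.Println("timeout:", lrclibClient.Timeout)
}
```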

Genius (Plain Lyrics)

Authentication: GENIUS_ACCESS_TOKEN environment variable
Features: Plain text lyrics + annotations
Library: github.com/rhnvrm/lyric-api-go

Both services are queried in parallel when lyrics are requested. Synced lyrics take priority if available.

Configuration Management

Environment Variables

Required:

DATABASE_URL=postgresql://user:pass@localhost:5432/bedrock
JWT_SECRET=your-secret-key

Optional Platform Credentials:

SPOTIFY_CLIENT_ID
SPOTIFY_CLIENT_SECRET
SOUNDCLOUD_CLIENT_IDS=id1,id2,id3
DEEZER_APP_ID
YOUTUBE_COOKIES=cookie-string
GENIUS_ACCESS_TOKEN

Search Locations (for the .env file):

  1. Current working directory
  2. bedrock_server/ directory
  3. Parent directory

Loader: github.com/joho/godotenv

CLI Flags

-port int          gRPC server port (default 50052)
-proxy-addr string HTTP proxy address (default :8080)
-proxy-host string HTTP proxy host for URL generation
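
These flags map directly onto Go's standard flag package. A minimal sketch, assuming a hypothetical options struct and parseFlags helper (how main.go actually declares its flags is not shown in this document):

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options collects the parsed command-line settings. The type is hypothetical.
type options struct {
	Port      int
	ProxyAddr string
	ProxyHost string
}

// parseFlags declares the three flags with the defaults listed above.
// A dedicated FlagSet keeps the function testable.
func parseFlags(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("bedrock_server", flag.ContinueOnError)
	fs.IntVar(&o.Port, "port", 50052, "gRPC server port")
	fs.StringVar(&o.ProxyAddr, "proxy-addr", ":8080", "HTTP proxy address")
	fs.StringVar(&o.ProxyHost, "proxy-host", "", "HTTP proxy host for URL generation")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("gRPC on :%d, proxy on %s\n", o.Port, o.ProxyAddr)
}
```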

File Structure

bedrock-api/
├── bedrock_server/
│   ├── main.go           (1329 lines - service implementation)
│   ├── resolver.go       (stream resolution logic)
│   ├── proxy.go          (HTTP streaming proxy)
│   ├── auth.go           (JWT + bcrypt)
│   ├── lrclib.go         (synced lyrics)
│   └── genius.go         (plain lyrics)
├── providers/
│   ├── spotify.go        (partner API adapter)
│   ├── soundcloud.go     (api-v2 adapter)
│   ├── deezer.go         (public API adapter)
│   ├── youtube.go        (Innertube adapter)
│   ├── yandex.go         (stub)
│   └── vk.go             (stub)
├── store/
│   └── user.go           (PostgreSQL user operations)
├── db/
│   └── migrations/       (SQL migration files)
├── tests/
│   ├── auth_test.go
│   ├── spotify_test.go
│   ├── soundcloud_test.go
│   ├── youtube_test.go
│   ├── deezer_test.go
│   └── lyrics_test.go
├── proto/
│   └── bedrock_service.proto
├── Dockerfile
├── docker-compose.yml
└── go.mod

Total Service Code: 3000+ lines (main.go + providers + auth + lyrics)
Protocol Definition: 622 lines
Test Coverage: 6 integration test files

Deployment Options

Docker

Multi-stage Build:

  • Builder: golang:1.23-alpine
  • Runtime: alpine:latest
  • Exposed Ports: 50052, 8080

Note: The Dockerfile builds with Go 1.23, but go.mod specifies 1.25 (version mismatch). The CI workflow described under CI/CD Pipeline uses a third version, 1.24.

Docker Compose

Services:

  • PostgreSQL 15-alpine only
  • No Redis (planned)
  • No reverse proxy (TLS must be added externally)

Local Development

git clone https://github.com/feralbureau/bedrock-api
cd bedrock-api
git submodule update --init --recursive
cp .env.example .env
# Configure .env with credentials
go run ./bedrock_server

Submodule Requirement: spotapi-go must be initialized before build.

CI/CD Pipeline

GitHub Actions Workflows

test.yml:

  • Runs on: push, pull_request
  • Go version: 1.24
  • Services: PostgreSQL 15
  • Steps: Submodule init, integration tests with provider secrets
  • Timeout: 120 seconds per test

lint.yml:

  • golangci-lint (standard Go linting)
  • Custom comment linter (enforces no decorative comments, no uppercase-leading comments)

Secrets Required:

  • SPOTIFY_CLIENT_ID
  • SPOTIFY_CLIENT_SECRET
  • SOUNDCLOUD_CLIENT_IDS
  • GENIUS_ACCESS_TOKEN
  • YOUTUBE_COOKIES

Observability

Logging

Implementation: Go stdlib log.Printf
Format: [provider] message prefix pattern
Levels: No structured levels (info/warn/error mixed)

Monitoring

Current: None
Missing:

  • Prometheus metrics
  • APM/tracing
  • Structured logging (JSON)
  • Error tracking (Sentry, etc.)

Health Checks

Endpoint: GetServiceStatus RPC
Implementation: Stub (always returns OK)
Planned: Per-provider health checks with latency measurement

Performance Characteristics

Concurrency Model

  • Goroutine per provider for all search/retrieval operations
  • sync.WaitGroup for coordination
  • No rate limiting (relies on provider-level throttling)
  • No circuit breakers (failures are logged, partial responses returned)

Response Patterns

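Partial Response Strategy: If some providers fail, the service still returns what the others found. For example, if 2 of 4 providers fail, the response carries the results from the 2 successful providers, ResponseStatus: PARTIAL, and a ProviderError[] array listing the failures.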

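Timeout Handling: No global timeout. Each request relies on HTTP client defaults and provider-specific timeouts such as the 5-second LrcLib limit. Go's zero-value http.Client has no overall request timeout, so any adapter that uses one unconfigured can stall the whole fan-out, because wg.Wait() blocks until every goroutine returns.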
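
One way to bound the fan-out is to give every provider call a shared context deadline and to treat providers that miss it as failed. The sketch below combines that with the partial response status. It is an illustration of the technique, not Bedrock's implementation: searchFunc, outcome, and searchWithDeadline are hypothetical, and the "OK" and "FAILED" status strings are assumptions, since only PARTIAL is named in this document.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// searchFunc is a hypothetical context-aware provider call.
type searchFunc func(ctx context.Context, query string) ([]string, error)

type outcome struct {
	Status string // "OK", "PARTIAL" or "FAILED"
	Tracks []string
	Failed map[string]error // one entry per provider that errored or timed out
}

type reply struct {
	provider string
	tracks   []string
	err      error
}

// searchWithDeadline fans out to every provider and stops waiting when the
// timeout expires. Providers that have not answered by then count as failed.
func searchWithDeadline(timeout time.Duration, providers map[string]searchFunc, query string) outcome {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	replies := make(chan reply, len(providers)) // buffered: late senders never block
	pending := make(map[string]bool, len(providers))
	for name, fn := range providers {
		pending[name] = true
		go func(name string, fn searchFunc) {
			tracks, err := fn(ctx, query)
			replies <- reply{name, tracks, err}
		}(name, fn)
	}

	out := outcome{Failed: make(map[string]error)}
	for len(pending) > 0 {
		select {
		case r := <-replies:
			delete(pending, r.provider)
			if r.err != nil {
				out.Failed[r.provider] = r.err
			} else {
				out.Tracks = append(out.Tracks, r.tracks...)
			}
		case <-ctx.Done():
			for name := range pending {
				out.Failed[name] = ctx.Err()
			}
			pending = nil // nothing left to wait for
		}
	}

	switch {
	case len(out.Failed) == 0:
		out.Status = "OK"
	case len(out.Failed) < len(providers):
		out.Status = "PARTIAL"
	default:
		out.Status = "FAILED"
	}
	return out
}

// demo runs one fast, one failing, and one stalled provider with a 100 ms budget.
func demo() outcome {
	return searchWithDeadline(100*time.Millisecond, map[string]searchFunc{
		"soundcloud": func(ctx context.Context, q string) ([]string, error) {
			return []string{"soundcloud:track:1"}, nil
		},
		"deezer": func(ctx context.Context, q string) ([]string, error) {
			return nil, errors.New("HTTP 503")
		},
		"spotify": func(ctx context.Context, q string) ([]string, error) {
			<-ctx.Done() // simulates a stalled upstream
			return nil, ctx.Err()
		},
	}, "daft punk")
}

func main() {
	out := demo()
	fmt.Println(out.Status, len(out.Tracks), len(out.Failed))
}
```

Collecting replies over a buffered channel, rather than waiting on a WaitGroup, is what lets the caller return at the deadline even if a provider goroutine is still running.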

Security Posture

Authentication

  • JWTs signed with HS256 (shared secret), not RS256 (public/private key pair)
  • bcrypt password hashing (cost 10)
  • No rate limiting on auth endpoints
  • No account lockout after failed attempts
  • No email verification enforcement (is_verified field exists but unused)

Transport Security

  • No built-in TLS (requires reverse proxy like nginx/Caddy)
  • gRPC without TLS (insecure credentials)
  • HTTP proxy without HTTPS

Secrets Management

  • Environment variables only
  • No secrets rotation
  • Client IDs/tokens in plaintext .env files
  • No vault integration

Unique Features

  1. Cross-Platform Stream Resolution: Automatically bridges non-streaming platforms (Spotify, Deezer) to streaming platforms (SoundCloud, YouTube Music)

  2. YouTube 7-Client Fallback: Maximizes stream availability by rotating through 7 different YouTube client types

  3. SoundCloud Client ID Rotation: Handles rate limiting by cycling through multiple client IDs

  4. Dual Lyrics Sources: Combines synced (LrcLib) and annotated (Genius) lyrics

  5. Namespaced ID System: Platform-prefixed IDs prevent collisions and enable explicit routing

  6. Partial Response Model: Returns successful provider results even when some providers fail
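
The client ID rotation in item 3 can be sketched as a round-robin picker over the comma-separated SOUNDCLOUD_CLIENT_IDS value. This document does not say whether Bedrock rotates on every request or only after a rate-limit response, so the per-call rotation below is an assumption, and idRotator is a hypothetical type.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// idRotator hands out client IDs round-robin and is safe for concurrent use.
type idRotator struct {
	mu   sync.Mutex
	ids  []string
	next int
}

// newIDRotator parses a comma-separated list such as the value of
// SOUNDCLOUD_CLIENT_IDS, ignoring empty entries and surrounding spaces.
func newIDRotator(csv string) (*idRotator, error) {
	var ids []string
	for _, id := range strings.Split(csv, ",") {
		if id = strings.TrimSpace(id); id != "" {
			ids = append(ids, id)
		}
	}
	if len(ids) == 0 {
		return nil, errors.New("no client IDs configured")
	}
	return &idRotator{ids: ids}, nil
}

// Next returns the current ID and advances to the following one,
// wrapping around at the end of the list.
func (r *idRotator) Next() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	id := r.ids[r.next]
	r.next = (r.next + 1) % len(r.ids)
	return id
}

func main() {
	rot, err := newIDRotator("id1,id2,id3")
	if err != nil {
		fmt.Println(err)
		return
	}
	for i := 0; i < 4; i++ {
		fmt.Println(rot.Next())
	}
}
```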

Limitations

  1. Incomplete Platform Coverage: Yandex and VK are stubs only
  2. No Caching: Every request hits provider APIs (high latency, rate limit risk)
  3. Minimal Database Schema: Only user authentication, no metadata persistence
  4. No Observability: Missing metrics, tracing, structured logging
  5. Security Gaps: No TLS, no rate limiting, no account security features
  6. Version Mismatch: go.mod (1.25) vs Dockerfile (1.23) vs CI workflow (1.24)
  7. Submodule Dependency: Custom spotapi-go submodule creates maintenance burden

Use Cases

Primary

  • Multi-platform music search aggregation
  • Stream URL resolution for non-streaming APIs
  • Unified metadata retrieval across platforms
  • Lyrics lookup with sync support

Secondary

  • Playlist import across platforms (ImportPlaylist; no export RPC is listed)
  • Artist/album discovery with similar tracks
  • Top charts aggregation
  • Music recommendation engine backend

Integration Considerations

For Metadata Aggregator Project:

  • Provider adapter pattern is directly applicable
  • Fan-out concurrency model can be adopted
  • Partial response handling is valuable for resilience
  • ID namespacing prevents collision issues
  • Stream resolution bridge concept is novel but out of scope for pure metadata
  • gRPC interface requires client generation (protobuf compilation)

Reusable Patterns:

  • trackProvider interface design
  • Parallel goroutine search with WaitGroup
  • Error aggregation in partial responses
  • Platform-specific adapter isolation

Not Applicable:

  • Streaming focus (metadata aggregator doesn't need stream URLs)
  • JWT auth (different auth requirements)
  • Minimal database schema (metadata needs richer storage)