feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
@@ -0,0 +1,57 @@
# Bedrock-API

## Overview

Multi-source music streaming aggregator written in Go. Provides unified gRPC API across multiple streaming platforms with cross-platform track bridging.

## Key Features

- **API**: gRPC + HTTP streaming proxy
- **Performance**: High-performance Go implementation
- **Bridging**: Resolves non-streamable tracks to playable alternatives
- **Auth**: JWT with PostgreSQL backend
- **License**: MIT

## Source

| Resource | URL |
|----------|-----|
| **Repository** | https://github.com/feralbureau/bedrock-api |

## Supported Providers

| Provider | Metadata | Search | Streaming | Playlist | Bridge |
|----------|----------|--------|-----------|----------|--------|
| Spotify | Yes | Yes | Bridged | Yes | SoundCloud |
| SoundCloud | Yes | Yes | Yes | Yes | - |
| Deezer | Yes | Yes | Bridged | Yes | SoundCloud |
| YouTube Music | Yes | Yes | Limited | Yes | SoundCloud |
| Yandex | Partial | Partial | - | - | - |
| VK | Partial | Partial | - | - | - |

## Architecture

- **Unified gRPC/Protobuf models** for all music entities
- **Cross-platform bridging** - resolves non-streamable tracks
- **Parallel provider searches** with Go concurrency
- **HTTP streaming proxy** with range request support
- **Lyrics integration** (LrcLib, Genius in progress)

## Self-Hosting

```bash
git clone https://github.com/feralbureau/bedrock-api.git
cd bedrock-api

# Configure providers and database
cp config.example.yaml config.yaml

# Run
go run .
```

## Notes

- Best for streaming aggregation use cases
- gRPC for high performance
- Automatic track resolution across platforms
@@ -0,0 +1,978 @@
|
||||
# Bedrock-API Data Layer
|
||||
|
||||
## Database Technology
|
||||
|
||||
**RDBMS**: PostgreSQL 15
|
||||
**Driver**: `github.com/jackc/pgx/v5` (native PostgreSQL driver)
|
||||
**Connection Pooling**: `pgxpool` (pgx connection pool)
|
||||
**Migration Tool**: None (manual SQL execution)
|
||||
|
||||
## Database Schema
|
||||
|
||||
### Users Table
|
||||
|
||||
**File**: `db/migrations/001_create_users_table.up.sql`
|
||||
|
||||
```sql
|
||||
CREATE TABLE users (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
email VARCHAR(255) UNIQUE NOT NULL,
|
||||
password_hash VARCHAR(255) NOT NULL,
|
||||
role VARCHAR(50) DEFAULT 'user',
|
||||
is_verified BOOLEAN DEFAULT false,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
|
||||
CREATE INDEX idx_users_email ON users(email);
|
||||
```
|
||||
|
||||
**Columns**:
|
||||
|
||||
| Column | Type | Constraints | Purpose |
|--------|------|-------------|---------|
| id | UUID | PRIMARY KEY, DEFAULT gen_random_uuid() | Unique user identifier |
| email | VARCHAR(255) | UNIQUE, NOT NULL | User email (login identifier) |
| password_hash | VARCHAR(255) | NOT NULL | bcrypt hashed password |
| role | VARCHAR(50) | DEFAULT 'user' | User role (user/admin) |
| is_verified | BOOLEAN | DEFAULT false | Email verification status |
| created_at | TIMESTAMP | DEFAULT CURRENT_TIMESTAMP | Account creation timestamp |
|
||||
|
||||
**Indexes**:
|
||||
- Primary key index on `id` (automatic)
|
||||
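- B-tree index on `email` (`idx_users_email`, for login lookups); redundant in practice, because the `UNIQUE` constraint already creates its own unique B-tree index on the same column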
|
||||
|
||||
**No Foreign Keys**: Single table schema, no relationships
|
||||
|
||||
### Schema Limitations
|
||||
|
||||
**Missing Tables**:
|
||||
- No metadata cache (tracks, albums, artists, playlists)
|
||||
- No user listening history
|
||||
- No user playlists
|
||||
- No user favorites/likes
|
||||
- No play counts
|
||||
- No search history
|
||||
- No provider credentials (Spotify tokens, etc.)
|
||||
|
||||
**Minimal User Data**:
|
||||
- No user profile (name, avatar, bio)
|
||||
- No user preferences (language, region)
|
||||
- No user settings (privacy, notifications)
|
||||
- No user sessions (active logins)
|
||||
|
||||
## Connection Management
|
||||
|
||||
### Connection Pool Configuration
|
||||
|
||||
**File**: `bedrock_server/main.go`
|
||||
|
||||
```go
|
||||
func initDB() (*pgxpool.Pool, error) {
|
||||
dbURL := os.Getenv("DATABASE_URL")
|
||||
if dbURL == "" {
|
||||
return nil, errors.New("DATABASE_URL not set")
|
||||
}
|
||||
|
||||
config, err := pgxpool.ParseConfig(dbURL)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("parse config: %w", err)
|
||||
}
|
||||
|
||||
// Pool configuration
|
||||
config.MaxConns = 10
|
||||
config.MinConns = 2
|
||||
config.MaxConnLifetime = time.Hour
|
||||
config.MaxConnIdleTime = 30 * time.Minute
|
||||
config.HealthCheckPeriod = 1 * time.Minute
|
||||
|
||||
pool, err := pgxpool.NewWithConfig(context.Background(), config)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("create pool: %w", err)
|
||||
}
|
||||
|
||||
// Test connection
|
||||
if err := pool.Ping(context.Background()); err != nil {
|
||||
return nil, fmt.Errorf("ping: %w", err)
|
||||
}
|
||||
|
||||
log.Println("Database connection pool initialized")
|
||||
return pool, nil
|
||||
}
|
||||
```
|
||||
|
||||
**Pool Parameters**:
|
||||
|
||||
| Parameter | Value | Rationale |
|-----------|-------|-----------|
| MaxConns | 10 | Limit concurrent DB connections |
| MinConns | 2 | Keep warm connections ready |
| MaxConnLifetime | 1 hour | Prevent stale connections |
| MaxConnIdleTime | 30 minutes | Close idle connections |
| HealthCheckPeriod | 1 minute | Detect dead connections |
|
||||
|
||||
**Connection String Format**:
|
||||
```
|
||||
postgresql://username:password@host:port/database?sslmode=disable
|
||||
```
|
||||
|
||||
**Example**:
|
||||
```
|
||||
DATABASE_URL=postgresql://bedrock:bedrock@localhost:5432/bedrock?sslmode=disable
|
||||
```
|
||||
|
||||
### Connection Lifecycle
|
||||
|
||||
```
|
||||
Application Start:
|
||||
1. Parse DATABASE_URL from environment
|
||||
2. Create pgxpool.Config with custom parameters
|
||||
3. Initialize connection pool
|
||||
4. Ping database to verify connectivity
|
||||
5. Pass pool to service layer
|
||||
|
||||
Request Handling:
|
||||
1. Service method receives context and pool
|
||||
2. Acquire connection from pool (automatic)
|
||||
3. Execute query
|
||||
4. Release connection back to pool (automatic; pgxpool releases it once Scan returns, no explicit defer needed)
|
||||
|
||||
Application Shutdown:
|
||||
1. Close connection pool
|
||||
2. Wait for active connections to finish
|
||||
3. Release all resources
|
||||
```
|
||||
|
||||
## Data Access Layer
|
||||
|
||||
### User Store
|
||||
|
||||
**File**: `store/user.go`
|
||||
|
||||
```go
|
||||
type UserStore struct {
|
||||
db *pgxpool.Pool
|
||||
}
|
||||
|
||||
func NewUserStore(db *pgxpool.Pool) *UserStore {
|
||||
return &UserStore{db: db}
|
||||
}
|
||||
```
|
||||
|
||||
### User Operations
|
||||
|
||||
#### Save User
|
||||
|
||||
```go
|
||||
func (s *UserStore) Save(ctx context.Context, email, passwordHash string) (string, error) {
|
||||
var userID string
|
||||
|
||||
query := `
|
||||
INSERT INTO users (email, password_hash)
|
||||
VALUES ($1, $2)
|
||||
RETURNING id
|
||||
`
|
||||
|
||||
err := s.db.QueryRow(ctx, query, email, passwordHash).Scan(&userID)
|
||||
if err != nil {
|
||||
if strings.Contains(err.Error(), "duplicate key") {
|
||||
return "", errors.New("email already exists")
|
||||
}
|
||||
return "", fmt.Errorf("insert user: %w", err)
|
||||
}
|
||||
|
||||
return userID, nil
|
||||
}
|
||||
```
|
||||
|
||||
**Behavior**:
|
||||
- Inserts new user with email and password hash
|
||||
- Returns generated UUID
|
||||
- Handles duplicate email error
|
||||
- Uses parameterized query (SQL injection safe)
|
||||
|
||||
**Example**:
|
||||
```go
|
||||
userID, err := userStore.Save(ctx, "user@example.com", "$2a$10$...")
|
||||
// userID = "550e8400-e29b-41d4-a716-446655440000"
|
||||
```
|
||||
|
||||
#### Find User by Email
|
||||
|
||||
```go
|
||||
func (s *UserStore) Find(ctx context.Context, email string) (*User, error) {
|
||||
var user User
|
||||
|
||||
query := `
|
||||
SELECT id, email, password_hash, role, is_verified, created_at
|
||||
FROM users
|
||||
WHERE email = $1
|
||||
`
|
||||
|
||||
err := s.db.QueryRow(ctx, query, email).Scan(
|
||||
&user.ID,
|
||||
&user.Email,
|
||||
&user.PasswordHash,
|
||||
&user.Role,
|
||||
&user.IsVerified,
|
||||
&user.CreatedAt,
|
||||
)
|
||||
|
||||
if err != nil {
|
||||
if err == pgx.ErrNoRows {
|
||||
return nil, errors.New("user not found")
|
||||
}
|
||||
return nil, fmt.Errorf("query user: %w", err)
|
||||
}
|
||||
|
||||
return &user, nil
|
||||
}
|
||||
```
|
||||
|
||||
**Behavior**:
|
||||
- Queries user by email (uses index)
|
||||
- Returns full user record
|
||||
- Handles not found case
|
||||
- Uses parameterized query
|
||||
|
||||
**Example**:
|
||||
```go
|
||||
user, err := userStore.Find(ctx, "user@example.com")
|
||||
// user.ID = "550e8400-e29b-41d4-a716-446655440000"
|
||||
// user.Email = "user@example.com"
|
||||
// user.PasswordHash = "$2a$10$..."
|
||||
```
|
||||
|
||||
#### Find User by ID
|
||||
|
||||
```go
|
||||
func (s *UserStore) FindByID(ctx context.Context, id string) (*User, error) {
|
||||
var user User
|
||||
|
||||
query := `
|
||||
SELECT id, email, password_hash, role, is_verified, created_at
|
||||
FROM users
|
||||
WHERE id = $1
|
||||
`
|
||||
|
||||
err := s.db.QueryRow(ctx, query, id).Scan(
|
||||
&user.ID,
|
||||
&user.Email,
|
||||
&user.PasswordHash,
|
||||
&user.Role,
|
||||
&user.IsVerified,
|
||||
&user.CreatedAt,
|
||||
)
|
||||
|
||||
if err != nil {
|
||||
if err == pgx.ErrNoRows {
|
||||
return nil, errors.New("user not found")
|
||||
}
|
||||
return nil, fmt.Errorf("query user: %w", err)
|
||||
}
|
||||
|
||||
return &user, nil
|
||||
}
|
||||
```
|
||||
|
||||
**Behavior**: Similar to Find, but queries by UUID primary key
|
||||
|
||||
### User Model
|
||||
|
||||
```go
|
||||
type User struct {
|
||||
ID string
|
||||
Email string
|
||||
PasswordHash string
|
||||
Role string
|
||||
IsVerified bool
|
||||
CreatedAt time.Time
|
||||
}
|
||||
```
|
||||
|
||||
**No ORM**: Plain structs, manual scanning
|
||||
|
||||
## Database Migrations
|
||||
|
||||
### Migration Files
|
||||
|
||||
**Directory**: `db/migrations/`
|
||||
|
||||
**Naming Convention**: `{number}_{description}.{up|down}.sql`
|
||||
|
||||
**Example Structure**:
|
||||
```
|
||||
db/migrations/
|
||||
├── 001_create_users_table.up.sql
|
||||
├── 001_create_users_table.down.sql
|
||||
├── 002_add_user_roles.up.sql
|
||||
├── 002_add_user_roles.down.sql
|
||||
├── 003_add_email_verification.up.sql
|
||||
└── 003_add_email_verification.down.sql
|
||||
```
|
||||
|
||||
### Migration 001: Create Users Table
|
||||
|
||||
**Up Migration** (`001_create_users_table.up.sql`):
|
||||
```sql
|
||||
CREATE TABLE users (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
email VARCHAR(255) UNIQUE NOT NULL,
|
||||
password_hash VARCHAR(255) NOT NULL,
|
||||
role VARCHAR(50) DEFAULT 'user',
|
||||
is_verified BOOLEAN DEFAULT false,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
|
||||
CREATE INDEX idx_users_email ON users(email);
|
||||
```
|
||||
|
||||
**Down Migration** (`001_create_users_table.down.sql`):
|
||||
```sql
|
||||
DROP INDEX IF EXISTS idx_users_email;
|
||||
DROP TABLE IF EXISTS users;
|
||||
```
|
||||
|
||||
### Migration Execution
|
||||
|
||||
**No Automated Tool**: Migrations must be run manually
|
||||
|
||||
**Manual Execution**:
|
||||
```bash
|
||||
# Apply migration
|
||||
psql $DATABASE_URL -f db/migrations/001_create_users_table.up.sql
|
||||
|
||||
# Rollback migration
|
||||
psql $DATABASE_URL -f db/migrations/001_create_users_table.down.sql
|
||||
```
|
||||
|
||||
**Recommended Tools** (not integrated):
|
||||
- `golang-migrate/migrate`
|
||||
- `pressly/goose`
|
||||
- `rubenv/sql-migrate`
|
||||
|
||||
### Migration Tracking
|
||||
|
||||
**No Tracking Table**: No record of applied migrations
|
||||
|
||||
**Risks**:
|
||||
- No way to know which migrations have been applied
|
||||
- Manual tracking required
|
||||
- Risk of applying migrations out of order
|
||||
- Risk of applying same migration twice
|
||||
|
||||
**Recommendation**: Integrate migration tool with tracking table
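
To make the value of a tracking table concrete, the following sketch shows the core decision a migration tool makes: given the files on disk and the set of versions already applied, which up migrations are still pending, and in what order. It assumes the `{number}_{description}.{up|down}.sql` naming convention described above. The `applied` map is a stand-in for rows of a tracking table, and all function names are hypothetical; real tools such as golang-migrate or goose handle this (plus locking and dirty-state detection) for you.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// migration is one parsed "{number}_{description}.up.sql" file name.
type migration struct {
	Version int
	Name    string
}

// parseUp extracts the numeric prefix from an up-migration file name.
// It returns ok=false for down migrations and names that do not match the convention.
func parseUp(file string) (m migration, ok bool) {
	if !strings.HasSuffix(file, ".up.sql") {
		return migration{}, false
	}
	prefix, _, found := strings.Cut(file, "_")
	if !found {
		return migration{}, false
	}
	v, err := strconv.Atoi(prefix)
	if err != nil {
		return migration{}, false
	}
	return migration{Version: v, Name: file}, true
}

// pending returns the up migrations not yet recorded in applied, in version order.
func pending(files []string, applied map[int]bool) []migration {
	var out []migration
	for _, f := range files {
		if m, ok := parseUp(f); ok && !applied[m.Version] {
			out = append(out, m)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
	return out
}

func main() {
	files := []string{
		"003_add_email_verification.up.sql",
		"001_create_users_table.up.sql",
		"001_create_users_table.down.sql",
		"002_add_user_roles.up.sql",
	}
	// Version 1 is already applied, so only 2 and 3 are listed, in that order.
	for _, m := range pending(files, map[int]bool{1: true}) {
		fmt.Println(m.Version, m.Name)
	}
}
```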
|
||||
|
||||
## Caching Strategy
|
||||
|
||||
### Current Implementation
|
||||
|
||||
**No Caching**: All data fetched from providers on every request
|
||||
|
||||
**Impact**:
|
||||
- High latency (200-500ms per search)
|
||||
- Provider API rate limits
|
||||
- Unnecessary API quota consumption
|
||||
- No offline capability
|
||||
|
||||
### Planned Caching (Redis)
|
||||
|
||||
**Not Implemented**: Redis integration planned but not built
|
||||
|
||||
**Proposed Cache Keys**:
|
||||
|
||||
| Key Pattern | TTL | Purpose |
|-------------|-----|---------|
| `track:{platform}:{id}` | 1 hour | Track metadata |
| `album:{platform}:{id}` | 1 hour | Album metadata |
| `artist:{platform}:{id}` | 1 hour | Artist metadata |
| `playlist:{platform}:{id}` | 5 minutes | Playlist metadata (changes frequently) |
| `stream:{platform}:{id}` | 1 hour | Stream URLs (expire after 1-6 hours) |
| `search:{query}:{platform}` | 5 minutes | Search results |
| `lyrics:{artist}:{title}` | 24 hours | Lyrics (rarely change) |
| `play:{user_id}:{track_id}` | 30 seconds | Play deduplication |
| `status:{platform}` | 5 minutes | Provider health status |
|
||||
|
||||
**Proposed Cache Invalidation**:
|
||||
- TTL-based expiration (no manual invalidation)
|
||||
- No cache warming (lazy loading)
|
||||
- No cache preloading
|
||||
|
||||
**Proposed Redis Configuration**:
|
||||
```go
|
||||
redisClient := redis.NewClient(&redis.Options{
|
||||
Addr: os.Getenv("REDIS_URL"), // must be host:port; for a redis:// URL, build the options with redis.ParseURL instead
|
||||
Password: os.Getenv("REDIS_PASSWORD"),
|
||||
DB: 0,
|
||||
MaxRetries: 3,
|
||||
PoolSize: 10,
|
||||
MinIdleConns: 2,
|
||||
})
|
||||
```
|
||||
|
||||
### Cache-Aside Pattern (Proposed)
|
||||
|
||||
```go
func (s *server) GetTrack(ctx context.Context, req *pb.GetRequest) (*pb.Track, error) {
	// Try cache first
	cacheKey := fmt.Sprintf("track:%s", req.Id)
	cached, err := s.redis.Get(ctx, cacheKey).Result()
	if err == nil {
		var track pb.Track
		// Treat a corrupt cache entry as a miss instead of returning a zero-value track
		if err := json.Unmarshal([]byte(cached), &track); err == nil {
			return &track, nil
		}
	}

	// Cache miss (or unreadable entry), fetch from provider
	platform, nativeID := parseNamespacedID(req.Id)
	provider := s.getProvider(platform)
	track, err := provider.GetTrack(ctx, nativeID)
	if err != nil {
		return nil, err
	}

	// Store in cache; a failed write is logged but does not fail the request
	if trackJSON, err := json.Marshal(track); err == nil {
		if err := s.redis.Set(ctx, cacheKey, trackJSON, 1*time.Hour).Err(); err != nil {
			log.Printf("cache set %s: %v", cacheKey, err)
		}
	}

	return track, nil
}
```
|
||||
|
||||
## Data Persistence Patterns
|
||||
|
||||
### No Metadata Persistence
|
||||
|
||||
**Current**: All metadata is ephemeral (fetched from providers, not stored)
|
||||
|
||||
**Implications**:
|
||||
- No historical data
|
||||
- No offline access
|
||||
- No analytics on metadata changes
|
||||
- No data ownership
|
||||
|
||||
**Alternative Approach** (not implemented):
|
||||
- Store all fetched metadata in PostgreSQL
|
||||
- Update on cache miss
|
||||
- Enable historical queries
|
||||
- Reduce provider API dependency
|
||||
|
||||
### No User Data Persistence
|
||||
|
||||
**Current**: Only authentication data is stored
|
||||
|
||||
**Missing User Data**:
|
||||
- Listening history
|
||||
- Favorite tracks/albums/artists
|
||||
- Created playlists
|
||||
- Search history
|
||||
- Playback state (current track, position)
|
||||
- User preferences
|
||||
|
||||
**Implications**:
|
||||
- No personalization
|
||||
- No recommendations based on history
|
||||
- No cross-device sync
|
||||
- No user analytics
|
||||
|
||||
## Transaction Handling
|
||||
|
||||
### No Transactions
|
||||
|
||||
**Current**: All database operations are single-statement
|
||||
|
||||
**Example** (no transaction):
|
||||
```go
|
||||
func (s *UserStore) Save(ctx context.Context, email, passwordHash string) (string, error) {
|
||||
var userID string
|
||||
err := s.db.QueryRow(ctx,
|
||||
"INSERT INTO users (email, password_hash) VALUES ($1, $2) RETURNING id",
|
||||
email, passwordHash,
|
||||
).Scan(&userID)
|
||||
return userID, err
|
||||
}
|
||||
```
|
||||
|
||||
**No Multi-Statement Operations**: No need for transactions with single table
|
||||
|
||||
**Future Considerations**: If schema expands (user profiles, playlists, etc.), transactions will be needed
|
||||
|
||||
**Transaction Example** (not used):
|
||||
```go
|
||||
func (s *UserStore) SaveWithProfile(ctx context.Context, email, passwordHash, name string) error {
|
||||
tx, err := s.db.Begin(ctx)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer tx.Rollback(ctx)
|
||||
|
||||
var userID string
|
||||
err = tx.QueryRow(ctx,
|
||||
"INSERT INTO users (email, password_hash) VALUES ($1, $2) RETURNING id",
|
||||
email, passwordHash,
|
||||
).Scan(&userID)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
_, err = tx.Exec(ctx,
|
||||
"INSERT INTO profiles (user_id, name) VALUES ($1, $2)",
|
||||
userID, name,
|
||||
)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
return tx.Commit(ctx)
|
||||
}
|
||||
```
|
||||
|
||||
## Query Performance
|
||||
|
||||
### Index Usage
|
||||
|
||||
**Indexed Queries**:
|
||||
```sql
|
||||
-- Uses idx_users_email (B-tree index)
|
||||
SELECT * FROM users WHERE email = 'user@example.com';
|
||||
|
||||
-- Uses primary key index (automatic)
|
||||
SELECT * FROM users WHERE id = '550e8400-e29b-41d4-a716-446655440000';
|
||||
```
|
||||
|
||||
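**No Full Table Scans by Design**: Every query filters on an indexed column, although the PostgreSQL planner may still choose a sequential scan while the table is small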
|
||||
|
||||
### Query Patterns
|
||||
|
||||
**Point Lookups Only**: No range queries, no aggregations, no joins
|
||||
|
||||
**Example Queries**:
|
||||
```sql
|
||||
-- Login (index scan on email)
|
||||
SELECT id, email, password_hash, role, is_verified, created_at
|
||||
FROM users
|
||||
WHERE email = $1;
|
||||
|
||||
-- Token refresh (index scan on id)
|
||||
SELECT id, email, role
|
||||
FROM users
|
||||
WHERE id = $1;
|
||||
|
||||
-- Registration (insert with RETURNING)
|
||||
INSERT INTO users (email, password_hash)
|
||||
VALUES ($1, $2)
|
||||
RETURNING id;
|
||||
```
|
||||
|
||||
**No Complex Queries**: Simple CRUD operations only
|
||||
|
||||
## Data Consistency
|
||||
|
||||
### Email Uniqueness
|
||||
|
||||
**Constraint**: `UNIQUE` constraint on `email` column
|
||||
|
||||
**Enforcement**: Database-level (PostgreSQL)
|
||||
|
||||
**Race Condition Handling**:
|
||||
```go
|
||||
err := s.db.QueryRow(ctx, query, email, passwordHash).Scan(&userID)
|
||||
if err != nil {
|
||||
if strings.Contains(err.Error(), "duplicate key") {
|
||||
return "", errors.New("email already exists")
|
||||
}
|
||||
return "", fmt.Errorf("insert user: %w", err)
|
||||
}
|
||||
```
|
||||
|
||||
**Concurrent Registration**: Database prevents duplicate emails even with concurrent requests
|
||||
|
||||
### UUID Generation
|
||||
|
||||
**Method**: PostgreSQL `gen_random_uuid()` function
|
||||
|
||||
**Collision Probability**: Negligible (UUID v4 has 122 random bits)
|
||||
|
||||
**No Application-Level ID Generation**: Database handles ID creation
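
To put "negligible" into numbers, the birthday bound gives an approximate collision probability of n² / (2 · 2^122) for n random UUID v4 values, valid while the result is much smaller than 1. The helper below is a small sketch of that arithmetic; `collisionProbability` is a hypothetical name, and the formula is the standard approximation, not something taken from Bedrock-API.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the birthday-bound chance of at least one
// collision among n random UUID v4 values: p ≈ n² / (2 · 2^122).
// The approximation only holds while p is much smaller than 1.
func collisionProbability(n float64) float64 {
	return n * n / (2 * math.Pow(2, 122))
}

func main() {
	for _, n := range []float64{1e6, 1e9, 1e12} {
		fmt.Printf("n=%.0e  p≈%.2e\n", n, collisionProbability(n))
	}
}
```

Even at a billion rows the estimate stays below 1e-19, far beyond anything a users table would reach.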
|
||||
|
||||
## Backup and Recovery
|
||||
|
||||
### No Automated Backups
|
||||
|
||||
**Current**: No backup strategy implemented
|
||||
|
||||
**Risks**:
|
||||
- Data loss on database failure
|
||||
- No point-in-time recovery
|
||||
- No disaster recovery plan
|
||||
|
||||
**Recommendations**:
|
||||
- Enable PostgreSQL continuous archiving (WAL archiving)
|
||||
- Schedule daily full backups
|
||||
- Test restore procedures
|
||||
- Store backups off-site (S3, etc.)
|
||||
|
||||
### Manual Backup
|
||||
|
||||
**pg_dump**:
|
||||
```bash
|
||||
pg_dump $DATABASE_URL > backup.sql
|
||||
```
|
||||
|
||||
**Restore**:
|
||||
```bash
|
||||
psql $DATABASE_URL < backup.sql
|
||||
```
|
||||
|
||||
## Data Security
|
||||
|
||||
### Password Storage
|
||||
|
||||
**Hashing Algorithm**: bcrypt
|
||||
**Cost Factor**: 10 (2^10 = 1024 iterations)
|
||||
|
||||
**Implementation**:
|
||||
```go
|
||||
func hashPassword(password string) (string, error) {
|
||||
bytes, err := bcrypt.GenerateFromPassword([]byte(password), 10)
|
||||
return string(bytes), err
|
||||
}
|
||||
|
||||
func checkPasswordHash(password, hash string) bool {
|
||||
err := bcrypt.CompareHashAndPassword([]byte(hash), []byte(password))
|
||||
return err == nil
|
||||
}
|
||||
```
|
||||
|
||||
**Security Properties**:
|
||||
- Salted (bcrypt includes random salt)
|
||||
- Slow (cost factor 10 = ~100ms per hash, depending on hardware)
|
||||
- Resistant to rainbow tables
|
||||
- Resistant to brute force (with rate limiting, not implemented)
|
||||
|
||||
### SQL Injection Prevention
|
||||
|
||||
**Parameterized Queries**: All queries use `$1`, `$2` placeholders
|
||||
|
||||
**Safe Example**:
|
||||
```go
|
||||
// Safe: parameterized query
|
||||
err := s.db.QueryRow(ctx,
	"SELECT id, email FROM users WHERE email = $1",
	email,
).Scan(&user.ID, &user.Email) // one destination per selected column
|
||||
```
|
||||
|
||||
**Unsafe Example** (not used):
|
||||
```go
|
||||
// Unsafe: string concatenation (NOT USED IN CODEBASE)
|
||||
query := fmt.Sprintf("SELECT id, email FROM users WHERE email = '%s'", email)
err := s.db.QueryRow(ctx, query).Scan(&user.ID, &user.Email)
|
||||
```
|
||||
|
||||
**All Queries Are Safe**: No string concatenation in SQL queries
|
||||
|
||||
### Connection Security
|
||||
|
||||
**SSL Mode**: Configurable via connection string
|
||||
|
||||
**Example** (SSL disabled):
|
||||
```
|
||||
DATABASE_URL=postgresql://user:pass@localhost:5432/db?sslmode=disable
|
||||
```
|
||||
|
||||
**Example** (SSL required):
|
||||
```
|
||||
DATABASE_URL=postgresql://user:pass@localhost:5432/db?sslmode=require
|
||||
```
|
||||
|
||||
**Production Recommendation**: Use `sslmode=require` or `sslmode=verify-full`
|
||||
|
||||
## Database Monitoring
|
||||
|
||||
### No Monitoring
|
||||
|
||||
**Current**: No database monitoring implemented
|
||||
|
||||
**Missing Metrics**:
|
||||
- Connection pool utilization
|
||||
- Query latency
|
||||
- Slow query log
|
||||
- Deadlock detection
|
||||
- Table bloat
|
||||
- Index usage statistics
|
||||
|
||||
**Recommendations**:
|
||||
- Enable PostgreSQL `pg_stat_statements` extension
|
||||
- Monitor connection pool metrics (pgxpool provides stats)
|
||||
- Set up alerts for connection pool exhaustion
|
||||
- Log slow queries (> 1 second)
|
||||
|
||||
### Connection Pool Stats (Available but Not Used)
|
||||
|
||||
```go
|
||||
stats := pool.Stat()
|
||||
log.Printf("Total connections: %d", stats.TotalConns())
|
||||
log.Printf("Idle connections: %d", stats.IdleConns())
|
||||
log.Printf("Acquired connections: %d", stats.AcquiredConns())
|
||||
log.Printf("Max connections: %d", stats.MaxConns())
|
||||
```
|
||||
|
||||
**Not Implemented**: Stats are available but not logged or exposed
|
||||
|
||||
## Data Retention
|
||||
|
||||
### No Retention Policy
|
||||
|
||||
**Current**: Data is never deleted
|
||||
|
||||
**User Data**:
|
||||
- Users are never deleted (no account deletion endpoint)
|
||||
- No GDPR compliance (no data export, no right to be forgotten)
|
||||
|
||||
**Recommendations**:
|
||||
- Implement account deletion endpoint
|
||||
- Add soft delete (deleted_at timestamp)
|
||||
- Implement data export (GDPR compliance)
|
||||
- Add retention policy for inactive accounts
|
||||
|
||||
## Scalability Considerations
|
||||
|
||||
### Vertical Scaling
|
||||
|
||||
**Current Limits**:
|
||||
- Connection pool: 10 max connections
|
||||
- Single PostgreSQL instance
|
||||
- No read replicas
|
||||
|
||||
**Scaling Up**:
|
||||
- Increase connection pool size
|
||||
- Increase PostgreSQL resources (CPU, RAM)
|
||||
- Tune PostgreSQL configuration (shared_buffers, work_mem)
|
||||
|
||||
### Horizontal Scaling
|
||||
|
||||
**Not Supported**: Single database instance
|
||||
|
||||
**Challenges**:
|
||||
- No sharding strategy
|
||||
- No read/write splitting
|
||||
- No multi-region support
|
||||
|
||||
**Future Considerations**:
|
||||
- Add read replicas for search queries
|
||||
- Shard by user ID for user data
|
||||
- Use connection pooler (PgBouncer) for connection management
|
||||
|
||||
## Data Model Limitations
|
||||
|
||||
### Single Table Schema
|
||||
|
||||
**Pros**:
|
||||
- Simple to understand
|
||||
- No joins required
|
||||
- Fast queries (index lookups only)
|
||||
|
||||
**Cons**:
|
||||
- No relational data (playlists, favorites, etc.)
|
||||
- No metadata persistence
|
||||
- No user activity tracking
|
||||
- Limited functionality
|
||||
|
||||
### No Audit Trail
|
||||
|
||||
**Missing**:
|
||||
- No login history
|
||||
- No password change history
|
||||
- No account modification log
|
||||
- No admin action log
|
||||
|
||||
**Implications**:
|
||||
- No security forensics
|
||||
- No compliance audit trail
|
||||
- No user activity analytics
|
||||
|
||||
### No Soft Deletes
|
||||
|
||||
**Hard Delete Only**: If delete functionality is added, records are permanently removed
|
||||
|
||||
**Recommendation**: Add `deleted_at` timestamp for soft deletes
|
||||
|
||||
```sql
|
||||
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMP;
|
||||
CREATE INDEX idx_users_deleted_at ON users(deleted_at);
|
||||
|
||||
-- Query active users
|
||||
SELECT * FROM users WHERE deleted_at IS NULL;
|
||||
```
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### No Database Tests
|
||||
|
||||
**Current**: No unit tests for database operations
|
||||
|
||||
**Missing Tests**:
|
||||
- User creation with duplicate email
|
||||
- User lookup by email
|
||||
- User lookup by ID
|
||||
- Connection pool exhaustion
|
||||
- Database connection failure
|
||||
- Transaction rollback (if added)
|
||||
|
||||
**Recommendation**: Add integration tests with test database
|
||||
|
||||
**Example Test** (not implemented):
|
||||
```go
|
||||
func TestUserStore_Save_DuplicateEmail(t *testing.T) {
|
||||
db := setupTestDB(t)
|
||||
defer db.Close()
|
||||
|
||||
store := NewUserStore(db)
|
||||
|
||||
// First save should succeed
|
||||
_, err := store.Save(context.Background(), "test@example.com", "hash1")
|
||||
if err != nil {
|
||||
t.Fatalf("first save failed: %v", err)
|
||||
}
|
||||
|
||||
// Second save with same email should fail
|
||||
_, err = store.Save(context.Background(), "test@example.com", "hash2")
|
||||
if err == nil {
|
||||
t.Fatal("expected duplicate email error")
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Environment Configuration
|
||||
|
||||
### Database URL
|
||||
|
||||
**Environment Variable**: `DATABASE_URL`
|
||||
|
||||
**Format**: PostgreSQL connection string
|
||||
|
||||
**Example**:
|
||||
```
|
||||
DATABASE_URL=postgresql://bedrock:bedrock@localhost:5432/bedrock?sslmode=disable
|
||||
```
|
||||
|
||||
**Components**:
|
||||
- Protocol: `postgresql://`
|
||||
- Username: `bedrock`
|
||||
- Password: `bedrock`
|
||||
- Host: `localhost`
|
||||
- Port: `5432`
|
||||
- Database: `bedrock`
|
||||
- SSL Mode: `sslmode=disable`
|
||||
|
||||
**No Validation**: Application crashes if DATABASE_URL is invalid
|
||||
|
||||
**Recommendation**: Validate connection string format on startup
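
A startup check along these lines might look like the sketch below. It only covers the URL form shown above (not the `key=value` connection string form that PostgreSQL drivers also accept), it does not contact the database, and `validateDatabaseURL` is a hypothetical helper rather than existing Bedrock-API code.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// validateDatabaseURL performs basic structural checks on a PostgreSQL URL.
func validateDatabaseURL(raw string) error {
	if raw == "" {
		return errors.New("DATABASE_URL not set")
	}
	u, err := url.Parse(raw)
	if err != nil {
		// The parse error embeds the raw URL, which may contain the password,
		// so a generic message is returned instead of wrapping it.
		return errors.New("DATABASE_URL is not a valid URL")
	}
	if u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	if u.Hostname() == "" {
		return errors.New("missing host")
	}
	if strings.Trim(u.Path, "/") == "" {
		return errors.New("missing database name")
	}
	switch mode := u.Query().Get("sslmode"); mode {
	case "", "disable", "allow", "prefer", "require", "verify-ca", "verify-full":
		return nil
	default:
		return fmt.Errorf("unknown sslmode %q", mode)
	}
}

func main() {
	fmt.Println(validateDatabaseURL("postgresql://bedrock:bedrock@localhost:5432/bedrock?sslmode=disable"))
	fmt.Println(validateDatabaseURL("mysql://localhost/bedrock"))
}
```

Running the check before `pgxpool.ParseConfig` lets the service exit with a specific message instead of a generic parse or ping failure.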
|
||||
|
||||
## Docker Deployment
|
||||
|
||||
### Docker Compose PostgreSQL
|
||||
|
||||
**File**: `docker-compose.yml`
|
||||
|
||||
```yaml
|
||||
version: '3.8'
|
||||
|
||||
services:
|
||||
postgres:
|
||||
image: postgres:15-alpine
|
||||
environment:
|
||||
POSTGRES_USER: bedrock
|
||||
POSTGRES_PASSWORD: bedrock
|
||||
POSTGRES_DB: bedrock
|
||||
ports:
|
||||
- "5432:5432"
|
||||
volumes:
|
||||
- postgres_data:/var/lib/postgresql/data
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "pg_isready -U bedrock"]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
|
||||
volumes:
|
||||
postgres_data:
|
||||
```
|
||||
|
||||
**Features**:
|
||||
- PostgreSQL 15 Alpine (minimal image)
|
||||
- Named volume for data persistence
|
||||
- Health check for container orchestration
|
||||
- Exposed port for local development
|
||||
|
||||
**Missing**:
|
||||
- No initialization scripts (migrations must be run manually)
|
||||
- No backup configuration
|
||||
- No replication
|
||||
- No connection pooler (PgBouncer)
|
||||
|
||||
### Database Initialization
|
||||
|
||||
**Manual Process**:
|
||||
```bash
|
||||
# Start PostgreSQL
|
||||
docker-compose up -d postgres
|
||||
|
||||
# Wait for PostgreSQL to be ready
|
||||
docker-compose exec postgres pg_isready -U bedrock
|
||||
|
||||
# Run migrations
|
||||
# Pipe the file from the host; the compose file above does not mount a /migrations directory
docker-compose exec -T postgres psql -U bedrock -d bedrock < db/migrations/001_create_users_table.up.sql
|
||||
```
|
||||
|
||||
**No Automated Initialization**: Migrations must be run manually after container start
|
||||
|
||||
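**Recommendation**: Add init script to docker-compose. The official image runs every `*.sql` file in `/docker-entrypoint-initdb.d` in name order, and only when the data directory is empty, so the `.down.sql` files should be kept out of the mounted directory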
|
||||
|
||||
```yaml
|
||||
postgres:
|
||||
image: postgres:15-alpine
|
||||
volumes:
|
||||
- postgres_data:/var/lib/postgresql/data
|
||||
- ./db/migrations:/docker-entrypoint-initdb.d
|
||||
```
|
||||
|
||||
## Data Layer Summary
|
||||
|
||||
### Strengths
|
||||
|
||||
- Simple, focused schema (users only)
|
||||
- Proper indexing (email lookup is fast)
|
||||
- Connection pooling (pgx/v5)
|
||||
- Parameterized queries (SQL injection safe)
|
||||
- bcrypt password hashing (secure)
|
||||
|
||||
### Weaknesses
|
||||
|
||||
- No metadata persistence (all data is ephemeral)
|
||||
- No caching (high latency, provider API dependency)
|
||||
- No migration tool (manual SQL execution)
|
||||
- No monitoring (connection pool, query performance)
|
||||
- No backup strategy (data loss risk)
|
||||
- No audit trail (security, compliance)
|
||||
- Minimal schema (no user data beyond auth)
|
||||
|
||||
### Recommendations for Metadata Aggregator
|
||||
|
||||
**Adopt**:
|
||||
- pgx/v5 driver (excellent performance, native PostgreSQL features)
|
||||
- Connection pooling configuration (sensible defaults)
|
||||
- Parameterized queries (security best practice)
|
||||
|
||||
**Avoid**:
|
||||
- Manual migrations (use golang-migrate or goose)
|
||||
- No caching (implement Redis for metadata)
|
||||
- Minimal schema (metadata aggregator needs rich schema)
|
||||
|
||||
**Enhance**:
|
||||
- Add metadata tables (tracks, albums, artists, labels, etc.)
|
||||
- Add user data tables (favorites, playlists, history)
|
||||
- Add caching layer (Redis for hot data)
|
||||
- Add migration tool (automated schema management)
|
||||
- Add monitoring (connection pool, query latency)
|
||||
- Add backup strategy (automated backups, point-in-time recovery)
|
||||
@@ -0,0 +1,760 @@
|
||||
# Bedrock-API Evaluation
|
||||
|
||||
## Executive Summary
|
||||
|
||||
Bedrock-API is a music metadata and streaming aggregation service built in Go 1.25 with gRPC and HTTP interfaces. The project demonstrates strong architectural patterns (provider abstraction, fan-out concurrency, partial response handling) but lacks production-readiness features (caching, monitoring, comprehensive testing, security hardening).
|
||||
|
||||
**Primary Value**: Cross-platform stream resolution (bridges non-streaming APIs like Spotify to streaming platforms like SoundCloud/YouTube Music).
|
||||
|
||||
**Target Use Case**: Unified music search and streaming across multiple platforms.
|
||||
|
||||
**Maturity Level**: Early production (functional but missing observability, caching, and security features).
|
||||
|
||||
## Strengths
|
||||
|
||||
### 1. Clean Provider Abstraction
|
||||
|
||||
**Pattern**: Implicit `trackProvider` interface isolates platform-specific logic
|
||||
|
||||
**Benefits**:
|
||||
- Easy to add new providers (implement interface)
|
||||
- Platform failures don't affect other providers
|
||||
- Testable in isolation (mock providers)
|
||||
|
||||
**Example**:
|
||||
```go
|
||||
type trackProvider interface {
|
||||
Name() string
|
||||
SearchTracks(ctx context.Context, query string, limit int32) ([]*pb.Track, error)
|
||||
GetStreamURL(ctx context.Context, id string) (string, error)
|
||||
// ... other methods
|
||||
}
|
||||
```
|
||||
|
||||
**Applicability to Metadata Aggregator**: Directly applicable. Same pattern can be used for metadata providers (Discogs, MusicBrainz, Last.fm, etc.).
|
||||
|
||||
### 2. Fan-Out Concurrency
|
||||
|
||||
**Pattern**: Parallel goroutines per provider with WaitGroup coordination
|
||||
|
||||
**Benefits**:
|
||||
- Response time = slowest provider (not sum of all)
|
||||
- Typical search: 200-500ms (4 providers in parallel)
|
||||
- Latency stays bounded by the slowest provider as providers are added; only goroutine and API-call counts grow linearly
|
||||
|
||||
**Example**:
|
||||
```go
|
||||
var wg sync.WaitGroup
|
||||
for _, provider := range providers {
|
||||
wg.Add(1)
|
||||
go func(p trackProvider) {
|
||||
defer wg.Done()
|
||||
results, err := p.SearchTracks(ctx, query, limit)
|
||||
// Aggregate results
|
||||
}(provider)
|
||||
}
|
||||
wg.Wait()
|
||||
```
|
||||
|
||||
**Applicability to Metadata Aggregator**: Directly applicable. Metadata queries can be parallelized across providers.
|
||||
|
||||
### 3. Partial Response Handling
|
||||
|
||||
**Pattern**: Return successful results even if some providers fail
|
||||
|
||||
**Benefits**:
|
||||
- Resilient to individual provider failures
|
||||
- Degraded service instead of complete failure
|
||||
- Client can decide how to handle partial results
|
||||
|
||||
**Example**:
|
||||
```go
|
||||
if len(errors) > 0 {
|
||||
if len(allTracks) == 0 {
|
||||
status = pb.ResponseStatus_ERROR
|
||||
} else {
|
||||
status = pb.ResponseStatus_PARTIAL
|
||||
}
|
||||
}
|
||||
|
||||
return &pb.SearchTracksResponse{
|
||||
Tracks: allTracks,
|
||||
Status: status,
|
||||
Errors: errors, // Per-provider error details
|
||||
}
|
||||
```
|
||||
|
||||
**Applicability to Metadata Aggregator**: Directly applicable. Metadata aggregation should be resilient to individual provider failures.
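The status logic above can be isolated into a small pure function, which makes it easy to test. This is a sketch with hypothetical names (`Status`, `deriveStatus`); the real service uses generated `pb.ResponseStatus_*` constants instead.

```go
package main

import "fmt"

// Status mirrors the three response states described above.
type Status int

const (
	StatusOK Status = iota
	StatusPartial
	StatusError
)

func (s Status) String() string {
	switch s {
	case StatusOK:
		return "OK"
	case StatusPartial:
		return "PARTIAL"
	default:
		return "ERROR"
	}
}

// deriveStatus maps result and error counts to a response status.
// Zero results with zero errors is treated as OK (an empty result set).
func deriveStatus(resultCount, errorCount int) Status {
	switch {
	case errorCount == 0:
		return StatusOK
	case resultCount == 0:
		return StatusError
	default:
		return StatusPartial
	}
}

func main() {
	fmt.Println(deriveStatus(12, 0), deriveStatus(7, 1), deriveStatus(0, 3))
	// prints "OK PARTIAL ERROR"
}
```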
|
||||
|
||||
### 4. Cross-Platform Stream Resolution
|
||||
|
||||
**Pattern**: Bridge non-streaming platforms to streaming platforms
|
||||
|
||||
**Algorithm**:
|
||||
1. Check if platform supports streaming (SoundCloud, YouTube Music)
|
||||
2. If not, search SoundCloud for matching track
|
||||
3. If SoundCloud fails, search YouTube Music
|
||||
4. Return first successful stream URL
|
||||
|
||||
**Benefits**:
|
||||
- Unified streaming interface (even for non-streaming APIs)
|
||||
- Automatic fallback chain
|
||||
- Transparent to client
|
||||
|
||||
**Applicability to Metadata Aggregator**: Not directly applicable (metadata aggregator doesn't need streaming). However, the fallback pattern is useful for metadata resolution (try provider A, fallback to provider B).
|
||||
|
||||
### 5. YouTube 7-Client Fallback
|
||||
|
||||
**Pattern**: Rotate through 7 different YouTube client types to maximize stream availability
|
||||
|
||||
**Clients**:
|
||||
- TVHTML5_SIMPLY_EMBEDDED (primary)
|
||||
- TVHTML5
|
||||
- ANDROID_VR (2 variants)
|
||||
- ANDROID
|
||||
- IOS
|
||||
- WEB
|
||||
|
||||
**Benefits**:
|
||||
- Maximizes success rate (different clients have different capabilities)
|
||||
- Avoids ciphered streams (signature-protected URLs that must be deciphered before use)
|
||||
- Handles geo-restrictions
|
||||
|
||||
**Applicability to Metadata Aggregator**: Pattern is applicable for providers with multiple API endpoints or client types.
|
||||
|
||||
### 6. ID Namespacing
|
||||
|
||||
**Pattern**: Platform-prefixed IDs (`{platform}:{type}:{native_id}`)
|
||||
|
||||
**Examples**:
|
||||
- `spotify:track:3n3Ppam7vgaVa1iaRUc9Lp`
|
||||
- `soundcloud:track:1234567890`
|
||||
- `deezer:album:302127`
|
||||
|
||||
**Benefits**:
|
||||
- Prevents ID collisions across platforms
|
||||
- Explicit routing (no lookup required)
|
||||
- Self-documenting (ID reveals source platform)
|
||||
|
||||
**Applicability to Metadata Aggregator**: Directly applicable. Metadata IDs should be namespaced to prevent collisions.
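A small sketch of parsing and formatting IDs in the `{platform}:{type}:{native_id}` form. The `EntityID` type and `ParseEntityID` function are hypothetical. Splitting into at most three parts is a deliberate assumption so that a native ID containing a colon survives intact.

```go
package main

import (
	"fmt"
	"strings"
)

// EntityID is a hypothetical parsed form of a namespaced ID.
type EntityID struct {
	Platform string
	Type     string
	NativeID string
}

// String renders the ID back into platform:type:native_id form.
func (e EntityID) String() string {
	return e.Platform + ":" + e.Type + ":" + e.NativeID
}

// ParseEntityID splits s into exactly three non-empty parts. Anything
// after the second colon is kept as the native ID.
func ParseEntityID(s string) (EntityID, error) {
	parts := strings.SplitN(s, ":", 3)
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return EntityID{}, fmt.Errorf("invalid entity ID %q: want platform:type:native_id", s)
	}
	return EntityID{Platform: parts[0], Type: parts[1], NativeID: parts[2]}, nil
}

func main() {
	id, err := ParseEntityID("spotify:track:3n3Ppam7vgaVa1iaRUc9Lp")
	fmt.Println(id.Platform, id.Type, id.NativeID, err)

	_, err = ParseEntityID("spotify:track")
	fmt.Println(err)
}
```

With the platform available as a field, routing a request to the right provider becomes a map lookup on `id.Platform`.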
|
||||
|
||||
### 7. gRPC for Performance
|
||||
|
||||
**Benefits**:
|
||||
- HTTP/2 multiplexing (multiple requests over single connection)
|
||||
- Binary protocol (smaller payloads than JSON)
|
||||
- Streaming support (future use)
|
||||
- Strong typing (protobuf)
|
||||
|
||||
**Tradeoffs**:
|
||||
- Requires client code generation
|
||||
- Less human-readable than REST/JSON
|
||||
- Tooling less mature than REST
|
||||
|
||||
**Applicability to Metadata Aggregator**: Consider gRPC for internal services, REST for public API.
|
||||
|
||||
### 8. JWT Authentication
|
||||
|
||||
**Implementation**: HS256 tokens with bcrypt password hashing
|
||||
|
||||
**Benefits**:
|
||||
- Stateless authentication (no session storage)
|
||||
- Token expiration (15-minute access token, 7-day refresh token)
|
||||
- Secure password storage (bcrypt cost 10)
|
||||
|
||||
**Limitations**:
|
||||
- No token revocation
|
||||
- No refresh token rotation
|
||||
- Single shared secret (HS256)
|
||||
|
||||
**Applicability to Metadata Aggregator**: JWT is suitable, but consider RS256 (asymmetric) for better security.
|
||||
|
||||
### 9. SoundCloud Client ID Rotation
|
||||
|
||||
**Pattern**: Rotate through multiple client IDs to avoid rate limits
|
||||
|
||||
**Implementation**:
|
||||
```go
|
||||
func (p *SoundCloudProvider) getClientID() string {
|
||||
p.mu.Lock()
|
||||
defer p.mu.Unlock()
|
||||
|
||||
id := p.clientIDs[p.currentID]
|
||||
p.currentID = (p.currentID + 1) % len(p.clientIDs)
|
||||
|
||||
return id
|
||||
}
|
||||
```
|
||||
|
||||
**Benefits**:
|
||||
- Increases effective rate limit (4 IDs = 4x limit)
|
||||
- Automatic rotation (no manual intervention)
|
||||
|
||||
**Applicability to Metadata Aggregator**: Applicable for providers with rate limits (rotate API keys).
|
||||
|
||||
### 10. Batch Hydration (SoundCloud)
|
||||
|
||||
**Pattern**: Fetch details for multiple IDs in single request
|
||||
|
||||
**Implementation**: SoundCloud allows up to 30 IDs per request
|
||||
|
||||
**Benefits**:
|
||||
- Reduces API calls (up to 30x fewer for playlists)
|
||||
- Faster response times
|
||||
- Lower rate limit consumption
|
||||
|
||||
**Applicability to Metadata Aggregator**: Applicable for providers that support batch requests (MusicBrainz, Discogs).
|
||||
|
||||
## Weaknesses
|
||||
|
||||
### 1. No Caching
|
||||
|
||||
**Impact**:
|
||||
- High latency (200-500ms per search)
|
||||
- Provider API rate limits
|
||||
- Unnecessary API quota consumption
|
||||
- No offline capability
|
||||
|
||||
**Recommendation**: Implement Redis caching
|
||||
|
||||
**Cache Strategy**:
|
||||
- Track metadata: 1 hour TTL
|
||||
- Search results: 5 minutes TTL
|
||||
- Stream URLs: up to 1 hour TTL (the URLs themselves expire after 1-6 hours, so a cached URL must never outlive its own expiry)
|
||||
- Lyrics: 24 hours TTL (rarely change)
|
||||
|
||||
**Applicability to Metadata Aggregator**: Critical. Metadata aggregator must cache to avoid repeated API calls.
|
||||
|
||||
### 2. Minimal Database Schema
|
||||
|
||||
**Current**: Single `users` table (authentication only)
|
||||
|
||||
**Missing**:
|
||||
- No metadata persistence (tracks, albums, artists)
|
||||
- No user data (favorites, playlists, history)
|
||||
- No analytics (play counts, search trends)
|
||||
|
||||
**Impact**:
|
||||
- All data is ephemeral (fetched from providers every time)
|
||||
- No historical data
|
||||
- No offline access
|
||||
- No data ownership
|
||||
|
||||
**Applicability to Metadata Aggregator**: Metadata aggregator needs rich schema for metadata persistence.
|
||||
|
||||
### 3. No Monitoring
|
||||
|
||||
**Missing**:
|
||||
- Prometheus metrics (request rate, error rate, latency)
|
||||
- Grafana dashboards
|
||||
- Distributed tracing (Jaeger)
|
||||
- Log aggregation (Loki)
|
||||
|
||||
**Impact**:
|
||||
- No visibility into performance
|
||||
- No alerting on failures
|
||||
- Difficult to debug production issues
|
||||
|
||||
**Recommendation**: Implement full observability stack
|
||||
|
||||
**Applicability to Metadata Aggregator**: Critical for production. Monitoring is essential.
|
||||
|
||||
### 4. No Rate Limiting
|
||||
|
||||
**Missing**:
|
||||
- Per-user rate limiting
|
||||
- Per-IP rate limiting
|
||||
- Provider-level rate limiting
|
||||
|
||||
**Impact**:
|
||||
- Abuse possible (unlimited requests)
|
||||
- Provider API rate limits can be exceeded
|
||||
- No protection against DDoS
|
||||
|
||||
**Recommendation**: Implement rate limiting
|
||||
|
||||
**Example**:
|
||||
```go
|
||||
import "golang.org/x/time/rate"
|
||||
|
||||
var limiters = make(map[string]*rate.Limiter) // NOTE: not goroutine-safe; guard with a sync.Mutex in concurrent use
|
||||
|
||||
func getLimiter(userID string) *rate.Limiter {
|
||||
limiter, exists := limiters[userID]
|
||||
if !exists {
|
||||
limiter = rate.NewLimiter(rate.Limit(10), 10) // 10 req/sec, burst of 10
|
||||
limiters[userID] = limiter
|
||||
}
|
||||
return limiter
|
||||
}
|
||||
```
|
||||
|
||||
**Applicability to Metadata Aggregator**: Critical. Rate limiting prevents abuse and protects provider APIs.
|
||||
|
||||
### 5. Stub Providers (Yandex, VK)
|
||||
|
||||
**Status**: Placeholder only, no implementation
|
||||
|
||||
**Impact**:
|
||||
- Incomplete platform coverage
|
||||
- Misleading (listed as supported but not functional)
|
||||
|
||||
**Recommendation**: Remove stubs or implement fully
|
||||
|
||||
**Applicability to Metadata Aggregator**: Don't list providers as supported unless fully implemented.
|
||||
|
||||
### 6. No TLS
|
||||
|
||||
**Current**: gRPC and HTTP without TLS
|
||||
|
||||
**Impact**:
|
||||
- Credentials transmitted in plaintext
|
||||
- JWT tokens exposed
|
||||
- Man-in-the-middle attacks possible
|
||||
|
||||
**Recommendation**: Deploy behind reverse proxy with TLS termination
|
||||
|
||||
**Applicability to Metadata Aggregator**: TLS is mandatory for production.
|
||||
|
||||
### 7. Go Version Mismatch
|
||||
|
||||
**Issue**: `go.mod` specifies 1.25, Dockerfile uses 1.23
|
||||
|
||||
**Impact**:
|
||||
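- Build failures: since Go 1.21 the `go` directive is a minimum version, so a Go 1.23 toolchain either refuses to build or must download Go 1.25 first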
|
||||
- Inconsistent builds
|
||||
|
||||
**Fix**:
|
||||
```dockerfile
|
||||
FROM golang:1.25-alpine AS builder
|
||||
```
|
||||
|
||||
**Applicability to Metadata Aggregator**: Keep build environment in sync with go.mod.
|
||||
|
||||
### 8. Custom Submodule Dependency
|
||||
|
||||
**Issue**: `spotapi-go` is a custom fork, not an official library
|
||||
|
||||
**Impact**:
|
||||
- Maintenance burden
|
||||
- Submodule initialization required
|
||||
- Potential security issues (unmaintained fork)
|
||||
|
||||
**Recommendation**: Use official library directly
|
||||
|
||||
**Applicability to Metadata Aggregator**: Avoid custom forks. Use official libraries or vendor dependencies.
|
||||
|
||||
### 9. No Unit Tests
|
||||
|
||||
**Current**: Integration tests only (require running server and providers)
|
||||
|
||||
**Missing**:
|
||||
- Provider adapter unit tests (mocked HTTP responses)
|
||||
- Database store unit tests (mocked database)
|
||||
- Authentication unit tests (mocked JWT)
|
||||
|
||||
**Impact**:
|
||||
- Slow test execution
|
||||
- Difficult to test edge cases
|
||||
- Requires provider credentials for testing
|
||||
|
||||
**Recommendation**: Add unit tests with mocks
|
||||
|
||||
**Applicability to Metadata Aggregator**: Unit tests are essential for fast feedback and edge case coverage.
|
||||
|
||||
### 10. Health Check Stub
|
||||
|
||||
**Current**: `GetServiceStatus` always returns healthy
|
||||
|
||||
**Impact**:
|
||||
- No actual health monitoring
|
||||
- Kubernetes probes don't detect failures
|
||||
- No dependency health visibility
|
||||
|
||||
**Recommendation**: Implement real health checks
|
||||
|
||||
**Applicability to Metadata Aggregator**: Health checks are critical for orchestration (Kubernetes, Docker Swarm).
|
||||
|
||||
### 11. No Pagination
|
||||
|
||||
**Current**: Search results limited by `limit` parameter (max 50)
|
||||
|
||||
**Impact**:
|
||||
- Large result sets cannot be retrieved incrementally
|
||||
- No cursor-based pagination
|
||||
- No total count
|
||||
|
||||
**Recommendation**: Add pagination
|
||||
|
||||
**Example**:
|
||||
```protobuf
|
||||
message SearchRequest {
|
||||
string query = 1;
|
||||
int32 limit = 2;
|
||||
string cursor = 3; // Pagination cursor
|
||||
}
|
||||
|
||||
message SearchTracksResponse {
|
||||
repeated Track tracks = 1;
|
||||
string next_cursor = 2; // Next page cursor
|
||||
int32 total = 3; // Total result count
|
||||
}
|
||||
```
|
||||
|
||||
**Applicability to Metadata Aggregator**: Pagination is essential for large result sets.
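On the server side, the `cursor` and `next_cursor` fields need an encoding. One simple option, shown here as a sketch, is an opaque base64-encoded offset. The `encodeCursor`, `decodeCursor`, and `page` helpers are hypothetical, and paging over an in-memory slice is an assumption made to keep the example self-contained.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

// encodeCursor hides the offset behind an opaque, URL-safe token.
func encodeCursor(offset int) string {
	return base64.RawURLEncoding.EncodeToString([]byte(strconv.Itoa(offset)))
}

// decodeCursor reverses encodeCursor. An empty cursor means "first page".
func decodeCursor(cursor string) (int, error) {
	if cursor == "" {
		return 0, nil
	}
	raw, err := base64.RawURLEncoding.DecodeString(cursor)
	if err != nil {
		return 0, fmt.Errorf("invalid cursor: %w", err)
	}
	offset, err := strconv.Atoi(string(raw))
	if err != nil || offset < 0 {
		return 0, fmt.Errorf("invalid cursor payload %q", raw)
	}
	return offset, nil
}

// page returns up to limit items starting at the cursor, plus the cursor
// for the next page ("" when there are no more items).
func page(items []string, cursor string, limit int) (pageItems []string, next string, err error) {
	offset, err := decodeCursor(cursor)
	if err != nil {
		return nil, "", err
	}
	if limit <= 0 {
		return nil, "", fmt.Errorf("limit must be positive, got %d", limit)
	}
	if offset > len(items) {
		offset = len(items)
	}
	end := offset + limit
	if end >= len(items) {
		return items[offset:], "", nil
	}
	return items[offset:end], encodeCursor(end), nil
}

func main() {
	items := []string{"a", "b", "c", "d", "e"}
	cursor := ""
	for {
		got, next, err := page(items, cursor, 2)
		if err != nil {
			panic(err)
		}
		fmt.Println(got)
		if next == "" {
			break
		}
		cursor = next
	}
}
```

Offset cursors are only stable if the underlying result order is stable; for live multi-provider results, a cursor built from per-provider continuation tokens may be a better fit.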
|
||||
|
||||
### 12. No API Versioning
|
||||
|
||||
**Current**: No version in package name or endpoint
|
||||
|
||||
**Impact**:
|
||||
- Breaking changes affect all clients
|
||||
- No backward compatibility
|
||||
- No deprecation path
|
||||
|
||||
**Recommendation**: Add versioning
|
||||
|
||||
**Example**:
|
||||
```protobuf
|
||||
package bedrock.v1;
|
||||
|
||||
service BedrockService {
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
**Applicability to Metadata Aggregator**: API versioning is critical for backward compatibility.
|
||||
|
||||
## Integration Complexity
|
||||
|
||||
### Provider Integration Effort
|
||||
|
||||
| Provider | Complexity | Reason |
|
||||
|----------|------------|--------|
|
||||
| Spotify | Medium | OAuth 2.0, submodule dependency |
|
||||
| SoundCloud | Low | Simple HTTP API, client ID rotation |
|
||||
| Deezer | Low | Public API, no auth |
|
||||
| YouTube Music | High | Undocumented Innertube API, 7-client fallback, cipher handling |
|
||||
| Yandex | Unknown | Not implemented |
|
||||
| VK | Unknown | Not implemented |
|
||||
|
||||
**Easiest**: Deezer (public API, no auth)
|
||||
**Hardest**: YouTube Music (undocumented API, complex fallback logic)
|
||||
|
||||
### Client Integration Effort
|
||||
|
||||
**gRPC Clients**: Requires protobuf compilation
|
||||
|
||||
**Steps**:
|
||||
1. Install protoc compiler
|
||||
2. Install language-specific protobuf plugin
|
||||
3. Generate client code from `.proto` file
|
||||
4. Implement authentication (JWT in metadata)
|
||||
|
||||
**Example** (Go):
|
||||
```bash
|
||||
protoc --go_out=. --go-grpc_out=. bedrock_service.proto
|
||||
```
|
||||
|
||||
**Example** (Python):
|
||||
```bash
|
||||
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. bedrock_service.proto
|
||||
```
|
||||
|
||||
**Complexity**: Medium (requires tooling setup)
|
||||
|
||||
**Alternative**: Provide pre-generated clients for popular languages
|
||||
|
||||
## Performance Analysis
|
||||
|
||||
### Latency Breakdown
|
||||
|
||||
**Typical Search Request** (4 providers):
|
||||
|
||||
| Component | Latency | Notes |
|
||||
|-----------|---------|-------|
|
||||
| gRPC overhead | 1-5ms | Minimal |
|
||||
| Authentication | 1-2ms | JWT validation |
|
||||
| Provider queries (parallel) | 200-500ms | Bounded by the slowest provider |
|
||||
| Response aggregation | 1-5ms | Mutex-protected append |
|
||||
| **Total** | **~200-510ms** | Dominated by provider latency |
|
||||
|
||||
**Optimization Opportunities**:
|
||||
- Cache metadata (reduce provider calls)
|
||||
- Implement timeouts (don't wait for slow providers)
|
||||
- Add circuit breakers (skip failing providers)
|
||||
|
||||
### Throughput
|
||||
|
||||
**Single Instance** (no caching):
|
||||
- Requests per second: ~10-20 (limited by provider APIs)
|
||||
- Concurrent requests: No explicit limit (goroutines are unbounded, which risks resource exhaustion under load)
|
||||
|
||||
**With Caching** (Redis):
|
||||
- Requests per second: ~1000+ (cache hits)
|
||||
- Concurrent requests: Limited by database connections (10 max)
|
||||
|
||||
**Scaling**:
|
||||
- Horizontal: Run multiple instances behind load balancer
|
||||
- Vertical: Increase CPU/RAM for single instance
|
||||
|
||||
### Resource Usage
|
||||
|
||||
**Memory**: ~50-100 MB (idle), ~200-500 MB (under load)
|
||||
**CPU**: Low (I/O bound, waiting on provider APIs)
|
||||
**Network**: High (streaming proxy, provider API calls)
|
||||
|
||||
## Security Assessment
|
||||
|
||||
### Authentication
|
||||
|
||||
**Strengths**:
|
||||
- JWT tokens (stateless)
|
||||
- bcrypt password hashing (secure)
|
||||
- gRPC interceptors (centralized auth)
|
||||
|
||||
**Weaknesses**:
|
||||
- No token revocation
|
||||
- No refresh token rotation
|
||||
- Single shared secret (HS256)
|
||||
- No rate limiting (brute force possible)
|
||||
- No account lockout
|
||||
|
||||
**Risk Level**: Medium
|
||||
|
||||
**Recommendations**:
|
||||
- Implement token revocation list (Redis)
|
||||
- Use RS256 (asymmetric keys)
|
||||
- Add rate limiting on auth endpoints
|
||||
- Add account lockout after failed attempts
|
||||
|
||||
### Transport Security
|
||||
|
||||
**Strengths**: None (no TLS)
|
||||
|
||||
**Weaknesses**:
|
||||
- Credentials transmitted in plaintext
|
||||
- JWT tokens exposed
|
||||
- Man-in-the-middle attacks possible
|
||||
|
||||
**Risk Level**: High
|
||||
|
||||
**Recommendations**:
|
||||
- Deploy behind reverse proxy with TLS
|
||||
- Use Let's Encrypt for free certificates
|
||||
- Enforce HTTPS redirects
|
||||
|
||||
### Input Validation
|
||||
|
||||
**Strengths**:
|
||||
- Parameterized queries (SQL injection safe)
|
||||
- Email format validation
|
||||
|
||||
**Weaknesses**:
|
||||
- No query length limits
|
||||
- No ID format validation
|
||||
- No limit parameter bounds
|
||||
|
||||
**Risk Level**: Low (no SQL injection, but potential DoS)
|
||||
|
||||
**Recommendations**:
|
||||
- Validate all inputs (length, format, bounds)
|
||||
- Sanitize user-provided data
|
||||
- Add request size limits
|
||||
|
||||
### Secrets Management
|
||||
|
||||
**Strengths**: None (plaintext `.env` files)
|
||||
|
||||
**Weaknesses**:
|
||||
- Secrets in plaintext
|
||||
- No rotation
|
||||
- No encryption at rest
|
||||
|
||||
**Risk Level**: Medium
|
||||
|
||||
**Recommendations**:
|
||||
- Use secrets manager (AWS Secrets Manager, Vault)
|
||||
- Rotate secrets periodically
|
||||
- Encrypt secrets at rest
|
||||
|
||||
## Scalability
|
||||
|
||||
### Vertical Scaling
|
||||
|
||||
**Current Limits**:
|
||||
- Database connections: 10 max
|
||||
- Goroutines: Unbounded (risky)
|
||||
- Memory: ~500 MB under load
|
||||
|
||||
**Scaling Up**:
|
||||
- Increase database connection pool
|
||||
- Add worker pool (bounded goroutines)
|
||||
- Increase instance size (CPU, RAM)
|
||||
|
||||
**Max Capacity** (single instance): ~100 req/sec (with caching)
|
||||
|
||||
### Horizontal Scaling
|
||||
|
||||
**Stateless Design**: Yes (JWT tokens, no sessions)
|
||||
|
||||
**Scaling Out**:
|
||||
- Run multiple instances behind load balancer
|
||||
- Share PostgreSQL database (read replicas for reads)
|
||||
- Share Redis cache (cluster mode)
|
||||
|
||||
**Max Capacity** (10 instances): ~1000 req/sec (with caching)
|
||||
|
||||
### Database Scaling
|
||||
|
||||
**Current**: Single PostgreSQL instance
|
||||
|
||||
**Scaling Options**:
|
||||
- Read replicas (for read-heavy workloads)
|
||||
- Connection pooler (PgBouncer)
|
||||
- Sharding (by user ID)
|
||||
|
||||
**Bottleneck**: The database is not the bottleneck (minimal schema, simple queries)
|
||||
|
||||
## Maintainability
|
||||
|
||||
### Code Organization
|
||||
|
||||
**Strengths**:
|
||||
- Clean provider abstraction
|
||||
- Separation of concerns (providers, store, auth)
|
||||
|
||||
**Weaknesses**:
|
||||
- Single 1300+ line file (`main.go`)
|
||||
- No package documentation
|
||||
- No API documentation
|
||||
|
||||
**Recommendation**: Split `main.go` by domain (search, retrieval, streaming, etc.)
|
||||
|
||||
### Testing
|
||||
|
||||
**Strengths**:
|
||||
- Integration tests for all providers
|
||||
- GitHub Actions CI/CD
|
||||
|
||||
**Weaknesses**:
|
||||
- No unit tests
|
||||
- No test coverage reporting
|
||||
- No mocks
|
||||
|
||||
**Recommendation**: Add unit tests with mocks, measure coverage
|
||||
|
||||
### Documentation
|
||||
|
||||
**Strengths**:
|
||||
- README with setup instructions
|
||||
- `.env.example` template
|
||||
|
||||
**Weaknesses**:
|
||||
- No API documentation (OpenAPI/Swagger)
|
||||
- No architecture documentation
|
||||
- No deployment guide
|
||||
|
||||
**Recommendation**: Add comprehensive documentation
|
||||
|
||||
### Dependency Management
|
||||
|
||||
**Strengths**:
|
||||
- Go modules (versioned dependencies)
|
||||
- Minimal dependencies (8 direct)
|
||||
|
||||
**Weaknesses**:
|
||||
- Custom submodule (spotapi-go)
|
||||
- No automated updates (Dependabot)
|
||||
|
||||
**Recommendation**: Remove submodule, add Dependabot
|
||||
|
||||
## Comparison to Metadata Aggregator Requirements
|
||||
|
||||
### Alignment
|
||||
|
||||
| Requirement | Bedrock-API | Metadata Aggregator | Alignment |
|
||||
|-------------|-------------|---------------------|-----------|
|
||||
| Multi-provider aggregation | Yes (4 active) | Yes (10+ planned) | High |
|
||||
| Parallel queries | Yes (goroutines) | Yes | High |
|
||||
| Partial response handling | Yes | Yes | High |
|
||||
| Metadata persistence | No | Yes | Low |
|
||||
| Caching | No | Yes (critical) | Low |
|
||||
| Rich metadata | Medium | High | Medium |
|
||||
| Streaming | Yes | No | N/A |
|
||||
| Authentication | JWT | TBD | Medium |
|
||||
| Monitoring | No | Yes | Low |
|
||||
| Testing | Integration only | Unit + Integration | Medium |
|
||||
|
||||
### Reusable Patterns
|
||||
|
||||
**Directly Applicable**:
|
||||
- Provider interface pattern
|
||||
- Fan-out concurrency
|
||||
- Partial response handling
|
||||
- ID namespacing
|
||||
- gRPC interceptors
|
||||
|
||||
**Needs Adaptation**:
|
||||
- Authentication (add RBAC, token revocation)
|
||||
- Database schema (expand for metadata)
|
||||
- Caching (add Redis)
|
||||
- Monitoring (add Prometheus)
|
||||
|
||||
**Not Applicable**:
|
||||
- Stream resolution (metadata aggregator doesn't need streaming)
|
||||
- YouTube 7-client fallback (specific to YouTube)
|
||||
|
||||
## Recommendations for Metadata Aggregator
|
||||
|
||||
### Adopt
|
||||
|
||||
1. **Provider Interface Pattern**: Clean abstraction for platform-specific logic
|
||||
2. **Fan-Out Concurrency**: Parallel queries for fast responses
|
||||
3. **Partial Response Handling**: Resilient to individual provider failures
|
||||
4. **ID Namespacing**: Prevent collisions, enable explicit routing
|
||||
5. **gRPC for Internal Services**: Performance benefits for service-to-service communication
|
||||
6. **JWT Authentication**: Stateless, scalable authentication
|
||||
7. **bcrypt Password Hashing**: Secure password storage
|
||||
|
||||
### Avoid
|
||||
|
||||
1. **No Caching**: Implement Redis from day one
|
||||
2. **Minimal Database Schema**: Design rich schema for metadata persistence
|
||||
3. **No Monitoring**: Implement Prometheus + Grafana from start
|
||||
4. **No Rate Limiting**: Add rate limiting to prevent abuse
|
||||
5. **Stub Providers**: Only list fully implemented providers
|
||||
6. **No TLS**: Deploy with TLS from start
|
||||
7. **Custom Submodules**: Use official libraries or vendor dependencies
|
||||
8. **No Unit Tests**: Write unit tests with mocks
|
||||
9. **Single Large File**: Split code by domain
|
||||
10. **No API Versioning**: Version API from start
|
||||
|
||||
### Enhance
|
||||
|
||||
1. **Add Caching Layer**: Redis for metadata, search results, provider responses
|
||||
2. **Expand Database Schema**: Tables for tracks, albums, artists, labels, genres, etc.
|
||||
3. **Implement Monitoring**: Prometheus metrics, Grafana dashboards, distributed tracing
|
||||
4. **Add Rate Limiting**: Per-user, per-IP, per-provider limits
|
||||
5. **Implement Health Checks**: Real health checks for dependencies
|
||||
6. **Add Pagination**: Cursor-based pagination for large result sets
|
||||
7. **Add API Versioning**: Version API for backward compatibility
|
||||
8. **Add Comprehensive Testing**: Unit tests with mocks, integration tests, E2E tests
|
||||
9. **Add Documentation**: API docs (OpenAPI), architecture docs, deployment guide
|
||||
10. **Add Security Features**: Token revocation, refresh token rotation, RS256, TLS
|
||||
|
||||
## Final Verdict
|
||||
|
||||
**Overall Assessment**: Good architectural foundation, but lacks production-readiness features.
|
||||
|
||||
**Strengths**: Clean provider abstraction, fan-out concurrency, partial response handling, cross-platform stream resolution.
|
||||
|
||||
**Weaknesses**: No caching, minimal database schema, no monitoring, no rate limiting, no TLS, stub providers.
|
||||
|
||||
**Maturity Level**: Early production (functional but missing critical features).
|
||||
|
||||
**Recommendation for Metadata Aggregator**: Adopt core patterns (provider interface, fan-out concurrency, partial responses, ID namespacing), but enhance with caching, monitoring, comprehensive testing, and security features.
|
||||
|
||||
**Effort to Adapt**: Medium (core patterns are reusable, but significant enhancements needed for production).
|
||||
|
||||
**Value Proposition**: Bedrock-API demonstrates proven patterns for multi-provider aggregation. The metadata aggregator can learn from its strengths (clean abstraction, concurrency, resilience) while avoiding its weaknesses (no caching, minimal schema, no monitoring).
|
||||
@@ -0,0 +1,460 @@
|
||||
# Bedrock-API Overview
|
||||
|
||||
## Project Identity
|
||||
|
||||
**Repository**: https://github.com/feralbureau/bedrock-api
|
||||
**Language**: Go 1.25
|
||||
**License**: MIT
|
||||
**Primary Protocols**: gRPC, HTTP
|
||||
**Database**: PostgreSQL 15
|
||||
**Entry Point**: `bedrock_server/main.go`
|
||||
|
||||
Bedrock-API is a unified music metadata and streaming aggregation service that consolidates six music platforms into a single gRPC interface. The project's core value proposition is cross-platform stream resolution: when a platform doesn't provide streaming (Spotify partner API, Deezer public API), Bedrock bridges to SoundCloud or YouTube Music to deliver playable URLs.
|
||||
|
||||
## Platform Coverage
|
||||
|
||||
| Platform | Status | API Type | Streaming | Authentication | Special Features |
|
||||
|----------|--------|----------|-----------|----------------|------------------|
|
||||
| Spotify | Full | Partner API | No (bridged) | OAuth via submodule | Full discography, namespaced IDs |
|
||||
| SoundCloud | Full | api-v2 | Yes (progressive MP3) | Client ID rotation | Batch hydration (30 IDs), /resolve endpoint |
|
||||
| Deezer | Full | Public API | No (bridged) | None | Concurrent artist data fetching |
|
||||
| YouTube Music | Full | Innertube | Yes (7-client fallback) | Cookies for age-restricted | WEB_REMIX metadata, itag priority |
|
||||
| Yandex Music | Stub | N/A | No | N/A | Placeholder only |
|
||||
| VK Music | Stub | N/A | No | N/A | Placeholder only |
|
||||
|
||||
**Active Platforms**: 4 (Spotify, SoundCloud, Deezer, YouTube Music)
|
||||
**Stub Platforms**: 2 (Yandex, VK)
|
||||
|
||||
## Core Capabilities
|
||||
|
||||
### gRPC Service Interface
|
||||
|
||||
**Total Methods**: 23 RPC endpoints
|
||||
**Protocol Buffer**: `bedrock_service.proto` (622 lines)
|
||||
|
||||
Method categories (the counts listed here sum to 20, so three of the 23 methods are not categorized):
|
||||
- **Search**: 4 methods (tracks, albums, artists, playlists)
|
||||
- **Retrieval**: 4 methods (get track, album, artist, playlist by ID)
|
||||
- **Streaming**: 1 method (GetStreamURL)
|
||||
- **Discovery**: 1 method (GetSimilarTracks)
|
||||
- **Lyrics**: 2 methods (GetLyrics, GetSyncedLyrics)
|
||||
- **Statistics**: 3 methods (GetTopTracks, GetTopAlbums, GetTopArtists)
|
||||
- **Import**: 1 method (ImportPlaylist)
|
||||
- **Health**: 1 method (GetServiceStatus)
|
||||
- **Authentication**: 3 methods (Register, Login, RefreshToken)
|
||||
|
||||
### HTTP Streaming Proxy
|
||||
|
||||
**Endpoints**:
|
||||
- `/stream/{service}/{id}` - Audio stream proxy with range request support
|
||||
- `/cover/{service}/{id}` - Album art proxy
|
||||
|
||||
**Ports**:
|
||||
- gRPC: `:50052`
|
||||
- HTTP: `:8080`
|
||||
|
||||
Both endpoints support HTTP range requests for seeking and partial content delivery.
|
||||
|
||||
## Technology Stack
|
||||
|
||||
### Core Dependencies
|
||||
|
||||
```
|
||||
google.golang.org/grpc v1.79.1
|
||||
google.golang.org/protobuf v1.36.4
|
||||
github.com/jackc/pgx/v5 v5.7.2
|
||||
github.com/golang-jwt/jwt/v5 v5.2.1
|
||||
golang.org/x/crypto (bcrypt)
|
||||
github.com/joho/godotenv v1.5.1
|
||||
```
|
||||
|
||||
### Provider Libraries
|
||||
|
||||
```
|
||||
github.com/zmb3/spotify/v2 (via spotapi-go submodule)
|
||||
github.com/kkdai/youtube/v2 v2.10.3
|
||||
github.com/rhnvrm/lyric-api-go v0.1.4 (Genius)
|
||||
```
|
||||
|
||||
**Submodule**: `spotapi-go` (custom Spotify client wrapper)
|
||||
|
||||
### Build Requirements
|
||||
|
||||
- Go 1.25 (go.mod specification)
|
||||
- Git submodules (spotapi-go)
|
||||
- PostgreSQL 15+ (runtime)
|
||||
- Protocol buffer compiler (development)
|
||||
|
||||
## Architecture Highlights
|
||||
|
||||
### Fan-Out Concurrency Pattern
|
||||
|
||||
All search and retrieval methods execute parallel goroutines across enabled providers:
|
||||
|
||||
```go
|
||||
var wg sync.WaitGroup
|
||||
for _, provider := range providers {
|
||||
wg.Add(1)
|
||||
go func(p trackProvider) {
|
||||
defer wg.Done()
|
||||
results, err := p.SearchTracks(ctx, query, limit)
|
||||
// aggregate results
|
||||
}(provider)
|
||||
}
|
||||
wg.Wait()
|
||||
```
|
||||
|
||||
This pattern enables sub-second response times even when querying 4+ platforms simultaneously.
|
||||
|
||||
### Stream Resolution Bridge
|
||||
|
||||
**Problem**: Spotify partner API and Deezer public API don't provide streaming URLs.
|
||||
|
||||
**Solution**: Three-tier fallback cascade:
|
||||
|
||||
1. Check if requested platform supports streaming (SoundCloud, YouTube Music)
|
||||
2. If not, search SoundCloud for "{artist} - {title}"
|
||||
3. If SoundCloud fails, search YouTube Music with same query
|
||||
4. Return first successful stream URL
|
||||
|
||||
**Implementation**: `providers/resolver.go`
|
||||
|
||||
### YouTube Music 7-Client Fallback Pool
|
||||
|
||||
YouTube Music streams use a client rotation strategy to maximize success rate:
|
||||
|
||||
```
|
||||
TVHTML5_SIMPLY_EMBEDDED (primary)
|
||||
TVHTML5
|
||||
ANDROID_VR (variant 1)
|
||||
ANDROID_VR (variant 2)
|
||||
ANDROID
|
||||
IOS
|
||||
WEB
|
||||
```
|
||||
|
||||
Each client has different capabilities and restrictions. The service tries clients sequentially until a valid stream URL is obtained. Ciphered streams fall back to SoundCloud.
|
||||
|
||||
### ID Namespacing
|
||||
|
||||
All entity IDs use platform prefixes to avoid collisions:
|
||||
|
||||
```
|
||||
spotify:track:3n3Ppam7vgaVa1iaRUc9Lp
|
||||
soundcloud:track:1234567890
|
||||
deezer:album:302127
|
||||
youtube:video:dQw4w9WgXcQ
|
||||
```
|
||||
|
||||
Format: `{platform}:{entity_type}:{native_id}`
|
||||
|
||||
## Data Layer

### PostgreSQL Schema

**Single Table**: `users`

```sql
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    role VARCHAR(50) DEFAULT 'user',
    is_verified BOOLEAN DEFAULT false,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

**Connection**: pgx/v5 with connection pooling
**Migrations**: `db/migrations/` (up/down SQL pairs)
### Caching Strategy

**Current**: No caching implemented
**Planned**: Redis for:
- Play deduplication (30s window)
- Service status cache (5min TTL)
- Stream URL cache (1hr TTL)
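
To make the planned windows concrete, the sketch below expresses them as constants and uses a tiny in-memory TTL cache as a stand-in for Redis. Nothing here exists in the project: `ttlCache`, `fakeClock`, and the constant names are hypothetical, and a real deployment would use Redis key expiry instead.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Planned cache lifetimes, taken from the list above.
const (
	playDedupWindow  = 30 * time.Second
	serviceStatusTTL = 5 * time.Minute
	streamURLTTL     = time.Hour
)

type entry struct {
	value   string
	expires time.Time
}

// ttlCache is an in-memory stand-in for the planned Redis cache.
type ttlCache struct {
	mu    sync.Mutex
	now   func() time.Time // injectable clock
	items map[string]entry
}

func newTTLCache(now func() time.Time) *ttlCache {
	return &ttlCache{now: now, items: make(map[string]entry)}
}

// Set stores value under key until ttl has elapsed.
func (c *ttlCache) Set(key, value string, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expires: c.now().Add(ttl)}
}

// Get returns the value if it is present and not yet expired.
func (c *ttlCache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok || !c.now().Before(e.expires) {
		delete(c.items, key)
		return "", false
	}
	return e.value, true
}

// fakeClock lets the expiry logic be exercised without sleeping.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

func main() {
	cache := newTTLCache(time.Now)
	key := "stream:soundcloud:track:1234567890"
	cache.Set(key, "https://stream.example.test/abc", streamURLTTL)
	url, ok := cache.Get(key)
	fmt.Println(url, ok)
}
```

Play deduplication fits the same shape: set a key per user and track with `playDedupWindow`, and treat a hit as a duplicate play.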
## Authentication System

**Token Type**: JWT (HS256)
**Access Token**: 15 minutes
**Refresh Token**: 7 days
**Password Hashing**: bcrypt (cost 10)

**gRPC Interceptor**: Validates JWT on all methods except:
- Register
- Login
- RefreshToken
- GetServiceStatus

**Storage**: User credentials are stored in PostgreSQL. Tokens are signed in memory and never persisted, so there is no revocation list.
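
The exemption check inside such an interceptor can be sketched with the standard library alone. gRPC passes full method names of the form `/package.Service/Method`; the service name `bedrock.BedrockService` and the identifiers `publicMethods` and `requiresAuth` are assumptions for illustration, and the JWT validation itself is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// publicMethods lists the RPCs that skip JWT validation, as named above.
var publicMethods = map[string]bool{
	"Register":         true,
	"Login":            true,
	"RefreshToken":     true,
	"GetServiceStatus": true,
}

// requiresAuth reports whether a gRPC full method name, for example
// "/bedrock.BedrockService/SearchTracks", must carry a valid JWT.
func requiresAuth(fullMethod string) bool {
	// Keep only the part after the last slash: the bare method name.
	name := fullMethod[strings.LastIndex(fullMethod, "/")+1:]
	return !publicMethods[name]
}

func main() {
	for _, m := range []string{
		"/bedrock.BedrockService/Login",
		"/bedrock.BedrockService/SearchTracks",
	} {
		fmt.Printf("%s requires auth: %v\n", m, requiresAuth(m))
	}
}
```

An allowlist of public methods fails closed: any RPC added later requires a token unless it is explicitly exempted.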
## Lyrics Integration

### LrcLib (Synced Lyrics)

**Endpoint**: `https://lrclib.net/api/get`
**Format**: LRC (timestamped)
**Timeout**: 5 seconds
**Matching**: Artist + title + album + duration

### Genius (Plain Lyrics)

**Authentication**: `GENIUS_ACCESS_TOKEN` environment variable
**Features**: Plain text lyrics + annotations
**Library**: `github.com/rhnvrm/lyric-api-go`

Both services are queried in parallel when lyrics are requested. Synced lyrics take priority if available.
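
One way to express "query both in parallel, prefer synced" is sketched below. The `Lyrics` type, `lyricsFunc`, `fetchLyrics`, and `errNoLyrics` are hypothetical, the two sources are represented by plain functions instead of real HTTP clients, and the 5-second LrcLib timeout is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// Lyrics is a simplified result type for this sketch.
type Lyrics struct {
	Source string
	Synced bool
	Text   string
}

// lyricsFunc stands in for one lyrics source (LrcLib or Genius).
type lyricsFunc func(artist, title string) (Lyrics, error)

var errNoLyrics = errors.New("no lyrics found")

// fetchLyrics starts both lookups at once, then prefers the synced result
// and falls back to the plain result only if the synced lookup failed.
func fetchLyrics(synced, plain lyricsFunc, artist, title string) (Lyrics, error) {
	type result struct {
		lyrics Lyrics
		err    error
	}
	// Buffered channels let both goroutines finish even if one result is unused.
	syncedCh := make(chan result, 1)
	plainCh := make(chan result, 1)
	go func() {
		l, err := synced(artist, title)
		syncedCh <- result{l, err}
	}()
	go func() {
		l, err := plain(artist, title)
		plainCh <- result{l, err}
	}()
	if r := <-syncedCh; r.err == nil {
		return r.lyrics, nil
	}
	if r := <-plainCh; r.err == nil {
		return r.lyrics, nil
	}
	return Lyrics{}, errNoLyrics
}

func main() {
	lrclib := func(artist, title string) (Lyrics, error) {
		return Lyrics{Source: "lrclib", Synced: true, Text: "[00:12.00] first line"}, nil
	}
	genius := func(artist, title string) (Lyrics, error) {
		return Lyrics{Source: "genius", Text: "first line"}, nil
	}
	l, err := fetchLyrics(lrclib, genius, "Artist", "Title")
	fmt.Println(l.Source, l.Synced, err)
}
```

Because the plain lookup is already running while the synced one is awaited, the fallback costs no extra round trip when synced lyrics are missing.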
## Configuration Management

### Environment Variables

**Required**:
```
DATABASE_URL=postgresql://user:pass@localhost:5432/bedrock
JWT_SECRET=your-secret-key
```

**Optional Platform Credentials**:
```
SPOTIFY_CLIENT_ID
SPOTIFY_CLIENT_SECRET
SOUNDCLOUD_CLIENT_IDS=id1,id2,id3
DEEZER_APP_ID
YOUTUBE_COOKIES=cookie-string
GENIUS_ACCESS_TOKEN
```

**Search Locations**:
1. Current working directory
2. `bedrock_server/` directory
3. Parent directory

**Loader**: `github.com/joho/godotenv`
### CLI Flags

```
-port int            gRPC server port (default 50052)
-proxy-addr string   HTTP proxy address (default :8080)
-proxy-host string   HTTP proxy host for URL generation
```
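
These flags map directly onto Go's standard `flag` package. The sketch below shows one way they could be declared; the `options` struct and `parseFlags` function are hypothetical, while the flag names and defaults come from the listing above.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// options holds the parsed command-line settings.
type options struct {
	Port      int
	ProxyAddr string
	ProxyHost string
}

// parseFlags parses args (without the program name) into options.
func parseFlags(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("bedrock_server", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // return errors to the caller instead of printing usage
	fs.IntVar(&o.Port, "port", 50052, "gRPC server port")
	fs.StringVar(&o.ProxyAddr, "proxy-addr", ":8080", "HTTP proxy address")
	fs.StringVar(&o.ProxyHost, "proxy-host", "", "HTTP proxy host for URL generation")
	err := fs.Parse(args)
	return o, err
}

func main() {
	opts, err := parseFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid flags:", err)
		os.Exit(2)
	}
	fmt.Printf("gRPC on :%d, proxy on %s\n", opts.Port, opts.ProxyAddr)
}
```

Using a dedicated `FlagSet` rather than the global one keeps parsing testable, since the argument slice is passed in explicitly.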
## File Structure

```
bedrock-api/
├── bedrock_server/
│   ├── main.go          (1329 lines - service implementation)
│   ├── resolver.go      (stream resolution logic)
│   ├── proxy.go         (HTTP streaming proxy)
│   ├── auth.go          (JWT + bcrypt)
│   ├── lrclib.go        (synced lyrics)
│   └── genius.go        (plain lyrics)
├── providers/
│   ├── spotify.go       (partner API adapter)
│   ├── soundcloud.go    (api-v2 adapter)
│   ├── deezer.go        (public API adapter)
│   ├── youtube.go       (Innertube adapter)
│   ├── yandex.go        (stub)
│   └── vk.go            (stub)
├── store/
│   └── user.go          (PostgreSQL user operations)
├── db/
│   └── migrations/      (SQL migration files)
├── tests/
│   ├── auth_test.go
│   ├── spotify_test.go
│   ├── soundcloud_test.go
│   ├── youtube_test.go
│   ├── deezer_test.go
│   └── lyrics_test.go
├── proto/
│   └── bedrock_service.proto
├── Dockerfile
├── docker-compose.yml
└── go.mod
```

**Total Service Code**: 3000+ lines (main.go + providers + auth + lyrics)
**Protocol Definition**: 622 lines
**Test Coverage**: 6 integration test files
## Deployment Options

### Docker

**Multi-stage Build**:
- Builder: `golang:1.23-alpine`
- Runtime: `alpine:latest`
- Exposed Ports: `50052`, `8080`

**Note**: Dockerfile uses Go 1.23, but go.mod specifies 1.25 (version mismatch).

### Docker Compose

**Services**:
- PostgreSQL 15-alpine only
- No Redis (planned)
- No reverse proxy (TLS must be added externally)

### Local Development

```bash
git clone https://github.com/feralbureau/bedrock-api
cd bedrock-api
git submodule update --init --recursive
cp .env.example .env
# Configure .env with credentials
go run ./bedrock_server
```

**Submodule Requirement**: `spotapi-go` must be initialized before build.
## CI/CD Pipeline

### GitHub Actions Workflows

**test.yml**:
- Runs on: push, pull_request
- Go version: 1.24
- Services: PostgreSQL 15
- Steps: Submodule init, integration tests with provider secrets
- Timeout: 120 seconds per test

**lint.yml**:
- golangci-lint (standard Go linting)
- Custom comment linter (enforces no decorative comments, no uppercase-leading comments)

**Secrets Required**:
- `SPOTIFY_CLIENT_ID`
- `SPOTIFY_CLIENT_SECRET`
- `SOUNDCLOUD_CLIENT_IDS`
- `GENIUS_ACCESS_TOKEN`
- `YOUTUBE_COOKIES`
## Observability

### Logging

**Implementation**: Go stdlib `log.Printf`
**Format**: `[provider] message` prefix pattern
**Levels**: No structured levels (info/warn/error mixed)

### Monitoring

**Current**: None
**Missing**:
- Prometheus metrics
- APM/tracing
- Structured logging (JSON)
- Error tracking (Sentry, etc.)

### Health Checks

**Endpoint**: `GetServiceStatus` RPC
**Implementation**: Stub (always returns OK)
**Planned**: Per-provider health checks with latency measurement
## Performance Characteristics

### Concurrency Model

- Goroutine per provider for all search/retrieval operations
- `sync.WaitGroup` for coordination
- No rate limiting (relies on provider-level throttling)
- No circuit breakers (failures are logged, partial responses returned)

### Response Patterns

**Partial Response Strategy**: If 2 of 4 providers fail, the service returns results from the 2 successful providers with `ResponseStatus: PARTIAL` and a `ProviderError[]` array listing the failures.

**Timeout Handling**: No global timeout (relies on HTTP client defaults and provider-specific timeouts such as LrcLib's 5s).
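
The partial response rule can be sketched as a small classification step. Only `PARTIAL` and the `ProviderError` name come from the description above; the `OK` and `FAILED` status values, the struct fields, and `buildResponse` are assumptions for illustration and do not reflect the project's protobuf definitions.

```go
package main

import "fmt"

// ResponseStatus summarizes how many providers answered.
type ResponseStatus string

const (
	StatusOK      ResponseStatus = "OK"      // assumed name
	StatusPartial ResponseStatus = "PARTIAL" // named in the description above
	StatusFailed  ResponseStatus = "FAILED"  // assumed name
)

// ProviderError records one provider failure.
type ProviderError struct {
	Provider string
	Message  string
}

// SearchResponse is a simplified stand-in for the gRPC response message.
type SearchResponse struct {
	Status ResponseStatus
	Tracks []string
	Errors []ProviderError
}

// buildResponse derives the status from the success and failure counts and
// always returns whatever tracks the successful providers produced.
func buildResponse(tracks []string, succeeded int, errs []ProviderError) SearchResponse {
	status := StatusOK
	switch {
	case len(errs) > 0 && succeeded == 0:
		status = StatusFailed
	case len(errs) > 0:
		status = StatusPartial
	}
	return SearchResponse{Status: status, Tracks: tracks, Errors: errs}
}

func main() {
	errs := []ProviderError{
		{Provider: "spotify", Message: "rate limited"},
		{Provider: "deezer", Message: "timeout"},
	}
	resp := buildResponse([]string{"soundcloud:track:1", "youtube:video:2"}, 2, errs)
	fmt.Println(resp.Status, len(resp.Tracks), len(resp.Errors))
}
```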
## Security Posture

### Authentication

- JWT tokens (HS256, not RS256 public/private key)
- bcrypt password hashing (cost 10)
- No rate limiting on auth endpoints
- No account lockout after failed attempts
- No email verification enforcement (`is_verified` field exists but unused)

### Transport Security

- No built-in TLS (requires reverse proxy like nginx/Caddy)
- gRPC without TLS (insecure credentials)
- HTTP proxy without HTTPS

### Secrets Management

- Environment variables only
- No secrets rotation
- Client IDs/tokens in plaintext .env files
- No vault integration
## Unique Features

1. **Cross-Platform Stream Resolution**: Automatically bridges non-streaming platforms (Spotify, Deezer) to streaming platforms (SoundCloud, YouTube Music)

2. **YouTube 7-Client Fallback**: Maximizes stream availability by rotating through 7 different YouTube client types

3. **SoundCloud Client ID Rotation**: Handles rate limiting by cycling through multiple client IDs

4. **Dual Lyrics Sources**: Combines synced (LrcLib) and annotated (Genius) lyrics

5. **Namespaced ID System**: Platform-prefixed IDs prevent collisions and enable explicit routing

6. **Partial Response Model**: Returns successful provider results even when some providers fail
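
The SoundCloud client ID rotation (item 3) could look roughly like the sketch below, which parses a comma-separated value in the style of `SOUNDCLOUD_CLIENT_IDS` and advances to the next ID on demand. `clientIDPool` and its methods are hypothetical, and when the project actually rotates (for example on a rate-limit response) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// clientIDPool cycles through a fixed set of client IDs.
type clientIDPool struct {
	mu      sync.Mutex
	ids     []string
	current int
}

// newClientIDPool parses a comma-separated list such as "id1,id2,id3",
// trimming whitespace and skipping empty entries.
func newClientIDPool(raw string) *clientIDPool {
	p := &clientIDPool{}
	for _, id := range strings.Split(raw, ",") {
		if id = strings.TrimSpace(id); id != "" {
			p.ids = append(p.ids, id)
		}
	}
	return p
}

// Current returns the active client ID, or false if the pool is empty.
func (p *clientIDPool) Current() (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.ids) == 0 {
		return "", false
	}
	return p.ids[p.current], true
}

// Rotate advances to the next client ID, wrapping around at the end.
// A caller would invoke it after the active ID gets rate limited.
func (p *clientIDPool) Rotate() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.ids) > 0 {
		p.current = (p.current + 1) % len(p.ids)
	}
}

func main() {
	pool := newClientIDPool("id1,id2,id3")
	for i := 0; i < 4; i++ {
		id, _ := pool.Current()
		fmt.Println("using", id)
		pool.Rotate()
	}
}
```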
## Limitations

1. **Incomplete Platform Coverage**: Yandex and VK are stubs only
2. **No Caching**: Every request hits provider APIs (high latency, rate limit risk)
3. **Minimal Database Schema**: Only user authentication, no metadata persistence
4. **No Observability**: Missing metrics, tracing, structured logging
5. **Security Gaps**: No TLS, no rate limiting, no account security features
6. **Version Mismatch**: go.mod (1.25) vs Dockerfile (1.23) vs CI workflow (1.24)
7. **Submodule Dependency**: Custom spotapi-go fork creates maintenance burden
## Use Cases

### Primary

- Multi-platform music search aggregation
- Stream URL resolution for non-streaming APIs
- Unified metadata retrieval across platforms
- Lyrics lookup with sync support

### Secondary

- Playlist import/export across platforms
- Artist/album discovery with similar tracks
- Top charts aggregation
- Music recommendation engine backend
## Integration Considerations

**For Metadata Aggregator Project**:

- Provider adapter pattern is directly applicable
- Fan-out concurrency model can be adopted
- Partial response handling is valuable for resilience
- ID namespacing prevents collision issues
- Stream resolution bridge concept is novel but out of scope for pure metadata
- gRPC interface requires client generation (protobuf compilation)

**Reusable Patterns**:
- `trackProvider` interface design
- Parallel goroutine search with WaitGroup
- Error aggregation in partial responses
- Platform-specific adapter isolation

**Not Applicable**:
- Streaming focus (metadata aggregator doesn't need stream URLs)
- JWT auth (different auth requirements)
- Minimal database schema (metadata needs richer storage)