# Aggregators - Entity Relationship Diagrams

Entity structure analysis for the 5 Tier 2 aggregator projects.

## Overview

| Project | Type | Persistence | Entity Model |
|---------|------|-------------|--------------|
| **Harmony** | Multi-source merger | In-memory | Harmonized release structure |
| **GraphBrainz** | GraphQL layer | Cache only | MusicBrainz schema mirror |
| **Bedrock-API** | gRPC aggregator | PostgreSQL | Unified streaming model |
| **minim** | Python library | None | API response wrappers |
| **MusicMetaLinker** | Entity linker | None | Alignment/linking model |

---

## 1. Harmony

**Purpose**: Harmonizes release metadata from 10+ providers into unified format for MusicBrainz seeding.

**Storage**: In-memory only (no database). Cached snapshots via permalinks.

```mermaid
erDiagram
    HarmonyRelease {
        string title
        GTIN gtin
        Language language
        ScriptFrequency script
        ReleaseStatus status
        ReleaseDate releaseDate
        ReleasePackaging packaging
        string credits
        string copyright
        CountryCode[] availableIn
        CountryCode[] excludedFrom
    }

    HarmonyMedium {
        string title
        int number
        MediumFormat format
    }

    HarmonyTrack {
        string title
        string number
        int length_ms
        TrackType type
        string isrc
        CountryCode[] availableIn
    }

    ArtistCreditName {
        string name
        string creditedName
        string joinPhrase
        string mbid
    }

    Label {
        string name
        string catalogNumber
        string mbid
    }

    Artwork {
        string url
        string thumbUrl
        ArtworkType[] types
        string comment
        string provider
    }

    ExternalLink {
        string url
        LinkType[] types
    }

    ExternalEntityId {
        string provider
        string type
        string id
        CountryCode region
        LinkType[] linkTypes
    }

    ProviderInfo {
        string name
        string internalName
        string id
        string url
        string apiUrl
        int processingTime
        int cacheTime
        string[] linkedReleases
        bool isTemplate
    }

    ReleaseInfo {
        ProviderMessage[] messages
    }

    ResolvableEntity {
        string name
        string mbid
    }

    HarmonyRelease ||--o{ HarmonyMedium : "media"
    HarmonyRelease ||--o{ ArtistCreditName : "artists"
    HarmonyRelease ||--o{ Label : "labels"
    HarmonyRelease ||--o{ Artwork : "images"
    HarmonyRelease ||--o{ ExternalLink : "externalLinks"
    HarmonyRelease ||--o| ResolvableEntity : "releaseGroup"
    HarmonyRelease ||--|| ReleaseInfo : "info"

    HarmonyMedium ||--o{ HarmonyTrack : "tracklist"

    HarmonyTrack ||--o{ ArtistCreditName : "artists"
    HarmonyTrack ||--o| ResolvableEntity : "recording"

    ArtistCreditName ||--o{ ExternalEntityId : "externalIds"
    Label ||--o{ ExternalEntityId : "externalIds"

    ReleaseInfo ||--o{ ProviderInfo : "providers"
```

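To illustrate how `ArtistCreditName` entries combine into a display string, here is a minimal Python sketch. It assumes, following MusicBrainz convention, that each `joinPhrase` is appended after its own credit. The dataclass and the `format_artist_credit` helper are hypothetical and not taken from Harmony's TypeScript source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArtistCreditName:
    """Hypothetical Python mirror of the ArtistCreditName entity in the diagram."""
    name: str
    creditedName: Optional[str] = None  # name as printed on the release, if different
    joinPhrase: str = ""                # text placed after this credit, e.g. " feat. "
    mbid: Optional[str] = None


def format_artist_credit(credits: list[ArtistCreditName]) -> str:
    """Concatenate credited names and join phrases into one display string."""
    parts = []
    for credit in credits:
        parts.append(credit.creditedName or credit.name)
        parts.append(credit.joinPhrase)
    return "".join(parts)


credits = [
    ArtistCreditName(name="Artist A", joinPhrase=" feat. "),
    ArtistCreditName(name="Artist B", creditedName="B."),
]
print(format_artist_credit(credits))  # Artist A feat. B.
```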
### Key Entities

| Entity | Description |
|--------|-------------|
| `HarmonyRelease` | Unified release from multiple providers |
| `HarmonyMedium` | Disc/media within release (CD, Vinyl, Digital) |
| `HarmonyTrack` | Individual track with ISRC |
| `ArtistCreditName` | Artist credit with join phrases ("feat.", "&") |
| `Label` | Record label with catalog number |
| `ProviderInfo` | Metadata about each source provider used |

---

## 2. GraphBrainz

**Purpose**: GraphQL interface to MusicBrainz with extension support (Discogs, Spotify, Last.fm, etc.).

**Storage**: Configurable cache (Redis/memory). No persistent database - proxies MusicBrainz API.

```mermaid
erDiagram
    Artist {
        string id
        string mbid
        string name
        string sortName
        string disambiguation
        string country
        string gender
        string type
        string[] ipis
        string[] isnis
    }

    ReleaseGroup {
        string id
        string mbid
        string title
        string disambiguation
        Date firstReleaseDate
        ReleaseGroupType primaryType
        ReleaseGroupType[] secondaryTypes
    }

    Release {
        string id
        string mbid
        string title
        string disambiguation
        Date date
        string country
        string asin
        string barcode
        ReleaseStatus status
        string packaging
        string quality
    }

    Recording {
        string id
        string mbid
        string title
        string disambiguation
        string[] isrcs
        int length
        bool video
    }

    Track {
        string mbid
        string title
        int position
        string number
        int length
    }

    Label {
        string id
        string mbid
        string name
        string sortName
        string disambiguation
        string country
        int labelCode
        string type
        string[] ipis
    }

    Work {
        string id
        string mbid
        string title
        string disambiguation
        string[] iswcs
        string language
        string type
    }

    Area {
        string id
        string mbid
        string name
        string type
    }

    ArtistCredit {
        string name
        string joinPhrase
    }

    Media {
        int position
        string format
        int trackCount
    }

    ReleaseEvent {
        Date date
        string country
    }

    LifeSpan {
        Date begin
        Date end
        bool ended
    }

    Relationship {
        string type
        string direction
        string[] attributes
    }

    Tag {
        string name
        int count
    }

    Rating {
        int voteCount
        float value
    }

    Artist ||--o{ ReleaseGroup : "releaseGroups"
    Artist ||--o{ Release : "releases"
    Artist ||--o{ Recording : "recordings"
    Artist ||--o{ Work : "works"
    Artist ||--o| Area : "area"
    Artist ||--o| Area : "beginArea"
    Artist ||--o| Area : "endArea"
    Artist ||--|| LifeSpan : "lifeSpan"
    Artist ||--o{ Tag : "tags"
    Artist ||--o| Rating : "rating"
    Artist ||--o{ Relationship : "relationships"

    ReleaseGroup ||--o{ Release : "releases"
    ReleaseGroup ||--o{ ArtistCredit : "artistCredits"
    ReleaseGroup ||--o{ Tag : "tags"
    ReleaseGroup ||--o| Rating : "rating"

    Release ||--o{ Media : "media"
    Release ||--o{ ReleaseEvent : "releaseEvents"
    Release ||--o{ ArtistCredit : "artistCredits"
    Release ||--o{ Label : "labels"
    Release ||--o{ Recording : "recordings"
    Release ||--o{ Tag : "tags"

    Media ||--o{ Track : "tracks"

    Track ||--|| Recording : "recording"

    Recording ||--o{ ArtistCredit : "artistCredits"
    Recording ||--o{ Release : "releases"
    Recording ||--o{ Tag : "tags"
    Recording ||--o| Rating : "rating"

    Label ||--o{ Release : "releases"
    Label ||--o| Area : "area"
    Label ||--|| LifeSpan : "lifeSpan"
    Label ||--o{ Tag : "tags"

    Work ||--o{ Artist : "artists"
    Work ||--o{ Tag : "tags"

    ArtistCredit }o--|| Artist : "artist"
```

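The distinction between `Track` and `Recording` is easy to miss in the diagram: several tracks, on different media or releases, can point at the same recording. The Python sketch below illustrates that relationship with hypothetical dataclasses that mirror only a few fields from the diagram. It is not GraphBrainz code, and the `unique_recordings` helper and the placeholder MBIDs are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Recording:
    mbid: str
    title: str


@dataclass
class Track:
    position: int
    title: str
    recording: Recording  # Track ||--|| Recording in the diagram


@dataclass
class Media:
    position: int
    format: str
    tracks: list[Track] = field(default_factory=list)


def unique_recordings(media: list[Media]) -> list[Recording]:
    """Return each recording once, in first-appearance order."""
    seen: dict[str, Recording] = {}
    for medium in media:
        for track in medium.tracks:
            seen.setdefault(track.recording.mbid, track.recording)
    return list(seen.values())


# "rec-1" and "rec-2" are placeholders, not real MBIDs.
song = Recording(mbid="rec-1", title="Song")
live = Recording(mbid="rec-2", title="Song (live)")
media = [
    Media(1, "CD", [Track(1, "Song", song), Track(2, "Song (live)", live)]),
    Media(2, "CD", [Track(1, "Song (bonus disc)", song)]),  # same recording again
]
print([r.title for r in unique_recordings(media)])  # ['Song', 'Song (live)']
```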
### Key Entities

| Entity | Description |
|--------|-------------|
| `Artist` | Musician, band, or music professional |
| `ReleaseGroup` | Logical album concept (all editions) |
| `Release` | Specific edition (CD, vinyl, digital) |
| `Recording` | Distinct audio (linked to tracks) |
| `Track` | Recording on a specific medium |
| `Work` | Abstract composition (song as written) |
| `Label` | Record label/imprint |
| `Area` | Geographic region |

---

## 3. Bedrock-API

**Purpose**: Multi-platform streaming aggregator with cross-platform track bridging.

**Storage**: PostgreSQL (users, listening stats). Providers are queried in real-time.

```mermaid
erDiagram
    Track {
        string id "platform:native_id"
        string title
        string artist
        string album_title
        string cover_url
        int duration_ms
        string preview_url
        string external_url
        bool is_streamable
        int popularity
        string genre
        Platform source
        string platform_id
    }

    Artist {
        string id "platform:native_id"
        string name
        string image_url
        string[] genres
        int followers
        string external_url
        Platform source
    }

    Album {
        string id "platform:native_id"
        string title
        string artist
        string cover_url
        int total_tracks
        string release_date
        string external_url
        string album_type
        Platform source
        string platform_id
    }

    Playlist {
        string id "platform:native_id"
        string title
        string description
        string cover_url
        int total_tracks
        string owner
        string external_url
        Platform source
        string platform_id
    }

    User {
        string id
        string email
        string password_hash
        timestamp created_at
    }

    ListeningEvent {
        string id "uuid"
        string user_id
        string track_id
        string title
        string artist
        string artist_id
        int duration_s
        Platform source
        bool is_public
        timestamp created_at
    }

    Lyrics {
        string lyrics
        bool synced
        LyricsSource source
        string resolved_title
        string resolved_artist
        float similarity
        LyricsType type
    }

    LyricsLine {
        int time_ms
        string text
    }

    LyricAnnotation {
        int id
        string url
        string fragment
        string body
        int votes_total
        bool verified
        bool pinned
        int comment_count
        string created_at
    }

    AnnotationContributor {
        string login
        string url
        string avatar_url
        string role
        int iq
    }

    PopularTrackItem {
        int play_count
    }

    PopularArtistItem {
        string artist_name
        int play_count
        string cover_url
        string external_url
    }

    Track ||--o{ Artist : "artists"
    Album ||--o{ Artist : "artists"
    Album ||--o{ Track : "tracks"
    Playlist ||--o{ Track : "tracks"

    User ||--o{ ListeningEvent : "history"
    ListeningEvent }o--|| Track : "track"

    Lyrics ||--o{ LyricsLine : "synced_lines"
    LyricAnnotation ||--|| AnnotationContributor : "contributor"

    PopularTrackItem ||--|| Track : "track"
```

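One way the `PopularTrackItem.play_count` values could be derived from `ListeningEvent` rows is a simple count per `track_id`. The Python sketch below does this in memory. It is an assumption about the aggregation, not Bedrock-API's actual Go or SQL code, and the `popular_tracks` helper and its `public_only` flag are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ListeningEvent:
    """Subset of the ListeningEvent fields from the diagram."""
    user_id: str
    track_id: str          # "platform:native_id"
    is_public: bool = True


def popular_tracks(events, limit=10, public_only=True):
    """Count plays per track_id; return (track_id, play_count) pairs, most played first."""
    counts = Counter(
        e.track_id for e in events if e.is_public or not public_only
    )
    return counts.most_common(limit)


events = [
    ListeningEvent("u1", "spotify:abc"),
    ListeningEvent("u2", "spotify:abc"),
    ListeningEvent("u1", "deezer:123"),
    ListeningEvent("u3", "deezer:999", is_public=False),  # skipped by default
]
print(popular_tracks(events))  # [('spotify:abc', 2), ('deezer:123', 1)]
```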
### Key Entities

| Entity | Description |
|--------|-------------|
| `Track` | Unified track from any platform (Spotify, Deezer, SoundCloud, etc.) |
| `Artist` | Artist with platform-specific metadata |
| `Album` | Album with release info |
| `Playlist` | User/curated playlist |
| `User` | Authenticated user (JWT) |
| `ListeningEvent` | Play history for stats |
| `Lyrics` | Plain or synced lyrics (LrcLib, Genius) |
| `LyricAnnotation` | Genius community annotations |

### Platform Enum

```
PLATFORM_SPOTIFY, PLATFORM_YANDEX, PLATFORM_VK,
PLATFORM_DEEZER, PLATFORM_SOUNDCLOUD, PLATFORM_YOUTUBE
```

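The `platform:native_id` identifiers used by `Track`, `Artist`, `Album`, and `Playlist` can be split back into a platform and a native ID. A minimal Python sketch follows. The lowercase prefixes (`spotify`, `vk`, and so on) are an assumption, since the source only lists the enum constant names, and `parse_entity_id` is a hypothetical helper.

```python
from enum import Enum


class Platform(Enum):
    # Values are assumed ID prefixes; the source only gives the PLATFORM_* names.
    PLATFORM_SPOTIFY = "spotify"
    PLATFORM_YANDEX = "yandex"
    PLATFORM_VK = "vk"
    PLATFORM_DEEZER = "deezer"
    PLATFORM_SOUNDCLOUD = "soundcloud"
    PLATFORM_YOUTUBE = "youtube"


def parse_entity_id(entity_id: str) -> tuple[Platform, str]:
    """Split 'platform:native_id' at the first colon only, so native IDs may contain colons."""
    prefix, sep, native_id = entity_id.partition(":")
    if not sep or not native_id:
        raise ValueError(f"expected 'platform:native_id', got {entity_id!r}")
    return Platform(prefix), native_id  # Platform(...) raises ValueError for unknown prefixes


platform, native_id = parse_entity_id("deezer:3135556")
print(platform.name, native_id)  # PLATFORM_DEEZER 3135556
```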
---

## 4. minim

**Purpose**: Python library providing unified client interface to 7 music APIs.

**Storage**: None (library only). OAuth tokens cached locally.

```mermaid
erDiagram
    SpotifyTrack {
        string id
        string name
        int duration_ms
        int popularity
        bool explicit
        string preview_url
        string external_url
    }

    SpotifyArtist {
        string id
        string name
        string[] genres
        int followers
        int popularity
        string image_url
    }

    SpotifyAlbum {
        string id
        string name
        string album_type
        string release_date
        int total_tracks
        string[] genres
    }

    DeezerTrack {
        int id
        string title
        int duration
        int rank
        bool explicit
        string preview
        string link
    }

    DeezerArtist {
        int id
        string name
        int nb_fan
        string picture_url
    }

    DeezerAlbum {
        int id
        string title
        string release_date
        int nb_tracks
        string cover_url
    }

    TidalTrack {
        int id
        string title
        int duration
        int popularity
        bool explicit
        string isrc
    }

    TidalArtist {
        int id
        string name
        string picture_url
    }

    TidalAlbum {
        int id
        string title
        string releaseDate
        int numberOfTracks
        string cover_url
    }

    QobuzTrack {
        int id
        string title
        int duration
        bool hires
        string isrc
    }

    iTunesTrack {
        int trackId
        string trackName
        int trackTimeMillis
        string previewUrl
        string trackViewUrl
    }

    iTunesArtist {
        int artistId
        string artistName
        string artistLinkUrl
    }

    iTunesAlbum {
        int collectionId
        string collectionName
        string releaseDate
        int trackCount
    }

    AudioFile {
        string path
        string format
        int bitrate
        int sample_rate
        int channels
    }

    AudioMetadata {
        string title
        string artist
        string album
        int track_number
        int year
        string genre
        bytes cover_art
    }

    SpotifyAlbum ||--o{ SpotifyTrack : "tracks"
    SpotifyAlbum ||--o{ SpotifyArtist : "artists"
    SpotifyTrack ||--o{ SpotifyArtist : "artists"

    DeezerAlbum ||--o{ DeezerTrack : "tracks"
    DeezerAlbum ||--|| DeezerArtist : "artist"
    DeezerTrack ||--|| DeezerArtist : "artist"

    TidalAlbum ||--o{ TidalTrack : "tracks"
    TidalAlbum ||--o{ TidalArtist : "artists"

    AudioFile ||--|| AudioMetadata : "metadata"
```

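The entity model here is a set of per-provider response wrappers, so field names and units differ between providers in the diagram (`duration_ms`, `duration`, `trackTimeMillis`). A caller that wants one shape has to normalize. The Python sketch below works on plain dicts shaped like the entities above. It assumes Deezer's `duration` is in whole seconds, and `normalize_track` is a hypothetical helper, not part of minim.

```python
def normalize_track(provider: str, raw: dict) -> dict:
    """Map a provider-specific track dict to {'id', 'title', 'duration_ms'}."""
    if provider == "spotify":
        return {
            "id": raw["id"],
            "title": raw["name"],
            "duration_ms": raw["duration_ms"],
        }
    if provider == "deezer":
        # Assumption: Deezer reports duration in whole seconds.
        return {
            "id": str(raw["id"]),
            "title": raw["title"],
            "duration_ms": raw["duration"] * 1000,
        }
    if provider == "itunes":
        return {
            "id": str(raw["trackId"]),
            "title": raw["trackName"],
            "duration_ms": raw["trackTimeMillis"],
        }
    raise ValueError(f"unsupported provider: {provider}")


print(normalize_track("deezer", {"id": 1, "title": "Song", "duration": 215}))
# {'id': '1', 'title': 'Song', 'duration_ms': 215000}
```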
### API Modules

| Module | Provider | Auth |
|--------|----------|------|
| `spotify` | Spotify Web API | OAuth 2.0 (multiple grant types) |
| `discogs` | Discogs API | OAuth 1.0a |
| `itunes` | iTunes Search API | None |
| `qobuz` | Qobuz API | Password |
| `tidal` | TIDAL API | OAuth 2.0 |
| `audio` | Local files | N/A |

---

## 5. MusicMetaLinker

**Purpose**: Entity linking library - connects track metadata to external databases.

**Storage**: None (library only). Queries external APIs in real-time.

```mermaid
erDiagram
    Align {
        string mbid_track
        string mbid_release
        string artist
        string album
        string track
        int track_number
        float duration
        string[] isrc
        bool strict
    }

    MusicBrainzLink {
        string mbid
        string artist
        string album
        string track
        int track_number
        float duration
        string[] isrc
        string release_date
    }

    DeezerLink {
        int id
        string link
        string artist_name
        string album_title
        string track_title
        int track_number
        float duration
        string isrc
        float bpm
        string release_date
    }

    YouTubeLink {
        string video_id
        string link
        string title
        string artist
        string album
        float duration
    }

    AcousticBrainzLink {
        string mbid
        string link
        float bpm
        string key
        float danceability
        float energy
    }

    LinkedTrack {
        string mbid
        string isrc
        int deezer_id
        string youtube_id
        string acousticbrainz_link
        string artist
        string album
        string track
        int track_number
        float duration
        string release_date
        float bpm
    }

    Align ||--|| MusicBrainzLink : "mb_link"
    Align ||--|| DeezerLink : "dz_link"
    Align ||--|| YouTubeLink : "yt_link"

    MusicBrainzLink ||--o| AcousticBrainzLink : "acousticbrainz"

    LinkedTrack }o--|| MusicBrainzLink : "musicbrainz"
    LinkedTrack }o--|| DeezerLink : "deezer"
    LinkedTrack }o--|| YouTubeLink : "youtube"
    LinkedTrack }o--|| AcousticBrainzLink : "acousticbrainz"
```

### Linking Flow

```
Input (any combination):
  - MBID (MusicBrainz ID)
  - ISRC
  - Artist + Track + Album
  - Duration

         ┌─────────────────┐
         │      Align      │
         │  (coordinator)  │
         └────────┬────────┘
                  │
     ┌────────────┼────────────┐
     │            │            │
     ▼            ▼            ▼
┌────────┐   ┌────────┐   ┌────────┐
│MusicBr.│   │ Deezer │   │YouTube │
│  Link  │   │  Link  │   │  Link  │
└────┬───┘   └────────┘   └────────┘
     │
     ▼
┌────────────┐
│AcousticBr. │
│    Link    │
└────────────┘

Output:
  - Enriched metadata from all sources
  - Cross-platform IDs (MBID, Deezer ID, YouTube ID)
  - Additional data (BPM, key, etc.)
```

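The flow above can be sketched as a merge step that folds per-source link results into one `LinkedTrack`-like record. This Python sketch uses plain dicts and stubbed results instead of real API calls. `merge_links`, `first_present`, and the first-non-empty precedence between sources are assumptions for illustration, not MusicMetaLinker's actual logic.

```python
from typing import Optional


def first_present(*values):
    """Return the first value that is not None, or None if all are missing."""
    return next((v for v in values if v is not None), None)


def merge_links(mb: Optional[dict], dz: Optional[dict],
                yt: Optional[dict], ab: Optional[dict]) -> dict:
    """Fold MusicBrainz, Deezer, YouTube and AcousticBrainz results into one record."""
    # A source that found no match is passed as None and treated as empty.
    mb, dz, yt, ab = (x or {} for x in (mb, dz, yt, ab))
    return {
        "mbid": mb.get("mbid"),
        "deezer_id": dz.get("id"),
        "youtube_id": yt.get("video_id"),
        "acousticbrainz_link": ab.get("link"),
        "artist": first_present(mb.get("artist"), dz.get("artist_name"), yt.get("artist")),
        "track": first_present(mb.get("track"), dz.get("track_title"), yt.get("title")),
        "duration": first_present(mb.get("duration"), dz.get("duration"), yt.get("duration")),
        "bpm": first_present(dz.get("bpm"), ab.get("bpm")),
    }


# All values below are placeholders, not real identifiers.
linked = merge_links(
    mb={"mbid": "mb-1", "artist": "Artist", "track": "Song"},
    dz={"id": 42, "track_title": "Song", "duration": 215.0, "bpm": 120.0},
    yt=None,
    ab={"link": "https://example.org/ab/mb-1", "bpm": 119.8},
)
print(linked["deezer_id"], linked["duration"], linked["bpm"])  # 42 215.0 120.0
```

Keeping the merge separate from the lookups means each source can fail or return nothing without affecting the others.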
### Supported Sources

| Source | ID Type | Data Retrieved |
|--------|---------|----------------|
| MusicBrainz | MBID | Track, artist, album, ISRC, release date |
| Deezer | Deezer ID | Track, BPM, ISRC, release date |
| YouTube Music | Video ID | Track, duration |
| AcousticBrainz | MBID | BPM, key, audio features |

---

## Comparison

| Feature | Harmony | GraphBrainz | Bedrock-API | minim | MusicMetaLinker |
|---------|---------|-------------|-------------|-------|-----------------|
| **Primary Use** | MB seeding | GraphQL proxy | Streaming | API library | Entity linking |
| **Database** | None | Cache | PostgreSQL | None | None |
| **Sources** | 10+ | MB + extensions | 6 platforms | 7 APIs | 4 sources |
| **Output** | Merged release | GraphQL | gRPC/Protobuf | Python objects | Linked IDs |
| **Language** | TypeScript | JavaScript | Go | Python | Python |
| **Unique Value** | Intelligent merge | Schema stitching | Stream bridging | Unified interface | Cross-DB linking |