metadata-agregator/docs/research/musicbrainz-server/analysis/ARCHITECTURE.md
Alexander a1f6701bac feat: initial implementation of metadata aggregator
- gRPC service with MusicBrainz provider
- PostgreSQL schema with migrations
- Service layer with database-first caching
- Repository pattern for data access
- YAML configuration support
- Research documentation for 17 music metadata projects
2026-04-28 16:28:53 +02:00

MusicBrainz Server Architecture

Design Pattern

MusicBrainz Server uses a hybrid MVC and service-layer architecture built on the Catalyst web framework. The application is layered, with presentation, business logic, and data access kept in separate modules.

Directory Structure

lib/MusicBrainz/Server/
├── Controller/          # 53 controllers, 13,000 lines
│   ├── Artist.pm
│   ├── Release.pm
│   ├── Recording.pm
│   ├── WS/             # Web Service controllers
│   │   └── 2/          # API version 2
│   └── ...
├── Data/               # 106 modules, 26,000 lines
│   ├── Artist.pm
│   ├── Release.pm
│   ├── Recording.pm
│   ├── Relationship.pm
│   └── ...
├── Entity/             # 132 entity classes
│   ├── Artist.pm
│   ├── Release.pm
│   ├── Recording.pm
│   ├── Types.pm
│   └── ...
├── Form/               # 43 form handlers
│   ├── Artist.pm
│   ├── Release.pm
│   └── ...
├── View/               # 4 view modules
│   ├── Default.pm      # Template Toolkit
│   ├── JSON.pm
│   ├── XML.pm
│   └── JSONLD.pm
├── WebService/         # API implementation
│   ├── Serializer/
│   │   ├── JSON/
│   │   ├── XML/
│   │   └── JSONLD/
│   └── Validator.pm
├── Edit/               # Edit system
│   ├── Artist/
│   ├── Release/
│   ├── Recording/
│   └── ...
├── Context.pm          # Service layer coordinator
├── DBDefs.pm           # Configuration
└── Sql.pm              # SQL abstraction layer

admin/                  # Database administration
├── sql/
│   ├── CreateTables.sql    # Schema definition (4,068 lines)
│   └── updates/            # 332 migration files

root/                   # Frontend assets
├── static/
│   ├── scripts/        # JavaScript source
│   │   ├── common/
│   │   ├── edit/
│   │   └── release/
│   ├── styles/         # CSS/LESS
│   └── images/
└── layout.tt           # Main template

t/                      # Tests
├── lib/                # Test utilities
├── pgtap/              # Database tests
└── selenium/           # Integration tests

Architectural Layers

Controller Layer (53 modules, 13,000 lines)

Responsibility: Handle HTTP requests, coordinate business logic, render responses.

Key Controllers:

  • Artist.pm - Artist entity operations
  • Release.pm - Release entity operations
  • Recording.pm - Recording entity operations
  • ReleaseGroup.pm - Release group operations
  • Work.pm - Work entity operations
  • Label.pm - Label entity operations
  • Edit.pm - Edit submission and voting
  • Search.pm - Search interface
  • WS::2::* - Web service API endpoints

Controller Pattern:

package MusicBrainz::Server::Controller::Artist;
use Moose;
BEGIN { extends 'MusicBrainz::Server::Controller' }

sub show : Path Args(1) {
    my ($self, $c, $gid) = @_;
    my $artist = $c->model('Artist')->get_by_gid($gid);
    $c->stash( artist => $artist );
}

Responsibilities:

  • Request validation
  • Authentication/authorization checks
  • Coordinate Data layer calls
  • Prepare data for views
  • Handle form submissions
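
Request validation usually starts with the path argument itself: the gid passed to an action such as show is an MBID, which has the layout of a UUID. The following is a minimal sketch of that check in plain Perl. The helper names looks_like_mbid and check_gid_argument are hypothetical and are not taken from the MusicBrainz codebase, and the status hashes only stand in for a real Catalyst response.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $gid has the canonical 8-4-4-4-12
# hexadecimal UUID layout.
sub looks_like_mbid {
    my ($gid) = @_;
    return 0 unless defined $gid;
    return $gid =~ /\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\z/i ? 1 : 0;
}

# Hypothetical controller-side guard: reject malformed identifiers
# before any Data layer call is made.
sub check_gid_argument {
    my ($gid) = @_;
    return looks_like_mbid($gid)
        ? { status => 200, gid => lc $gid }
        : { status => 400, error => 'Invalid MBID' };
}

# Placeholder identifiers, not real MBIDs.
my $ok  = check_gid_argument('AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE');
my $bad = check_gid_argument('not-a-uuid');

print "$ok->{status} $ok->{gid}\n";
print "$bad->{status} $bad->{error}\n";
```

Normalising the identifier to lower case before the lookup keeps later comparisons and cache keys consistent.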

Data Layer (106 modules, 26,000 lines)

Responsibility: Repository pattern for database access. Each entity has a corresponding Data module.

Key Data Modules:

  • Data::Artist - Artist CRUD operations
  • Data::Release - Release CRUD operations
  • Data::Recording - Recording CRUD operations
  • Data::Relationship - Relationship management
  • Data::Edit - Edit persistence
  • Data::Search - Search operations

Data Module Pattern:

package MusicBrainz::Server::Data::Artist;
use Moose;
extends 'MusicBrainz::Server::Data::Entity';

sub _table { 'artist' }
sub _entity_class { 'MusicBrainz::Server::Entity::Artist' }

sub get_by_gid {
    my ($self, $gid) = @_;
    return $self->_get_by_key('gid', $gid);
}

Moose Roles:

  • Role::Editable - Entities that can be edited
  • Role::Taggable - Entities that can be tagged
  • Role::Rateable - Entities that can be rated
  • Role::Relatable - Entities that can have relationships
  • Role::Aliasable - Entities that can have aliases
  • Role::Annotation - Entities that can be annotated

Data Access Pattern:

  • No ORM (not DBIx::Class)
  • Custom Moose-based abstraction
  • Raw SQL through DBI with the DBD::Pg driver
  • DBIx::Connector for persistent connection and transaction management (it does not pool connections)
  • Sql.pm provides query execution helpers

Entity Layer (132 classes)

Responsibility: Domain objects representing database entities.

Key Entities:

  • Entity::Artist - Artist domain object
  • Entity::Release - Release domain object
  • Entity::Recording - Recording domain object
  • Entity::ReleaseGroup - Release group domain object
  • Entity::Work - Work domain object
  • Entity::Label - Label domain object
  • Entity::Relationship - Relationship between entities

Entity Pattern:

package MusicBrainz::Server::Entity::Artist;
use Moose;
extends 'MusicBrainz::Server::Entity';

has 'name' => ( is => 'rw', isa => 'Str' );
has 'sort_name' => ( is => 'rw', isa => 'Str' );
has 'type_id' => ( is => 'rw', isa => 'Maybe[Int]' );
has 'country_id' => ( is => 'rw', isa => 'Maybe[Int]' );
has 'begin_date' => ( is => 'rw', isa => 'PartialDate' );
has 'end_date' => ( is => 'rw', isa => 'PartialDate' );

Entity Characteristics:

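  • Largely treated as immutable after construction, although the attributes are declared rw
  • Type-checked at runtime via Moose type constraints
  • Related data such as relationships is loaded on demand by the Data layer and attached to the entity
  • No database logic (pure domain objects)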

Form Layer (43 modules)

Responsibility: Form validation and processing using HTML::FormHandler.

Key Forms:

  • Form::Artist - Artist creation/editing
  • Form::Release - Release creation/editing
  • Form::Recording - Recording creation/editing
  • Form::Edit::* - Edit-specific forms

Form Pattern:

package MusicBrainz::Server::Form::Artist;
use HTML::FormHandler::Moose;
extends 'MusicBrainz::Server::Form';

has_field 'name' => ( type => 'Text', required => 1 );
has_field 'sort_name' => ( type => 'Text', required => 1 );
has_field 'type_id' => ( type => 'Select' );

View Layer (4 modules)

Responsibility: Render responses in different formats.

Views:

  • View::Default - Template Toolkit for HTML
  • View::JSON - JSON serialization
  • View::XML - XML serialization
  • View::JSONLD - JSON-LD serialization
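
The idea of one view per output format can be sketched as a dispatch table. This is an illustration only: the real views are Catalyst view classes, while %VIEW_FOR, render, and xml_escape below are hypothetical names, and JSON::PP (a core Perl module) stands in for whatever encoder the server uses.

```perl
use strict;
use warnings;
use JSON::PP;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
    $text =~ s/([&<>])/$entity{$1}/g;
    return $text;
}

# Hypothetical format-to-renderer table; each renderer takes the stash.
my %VIEW_FOR = (
    json => sub { JSON::PP->new->canonical->encode($_[0]) },
    xml  => sub { '<artist><name>' . xml_escape($_[0]{name}) . '</name></artist>' },
);

# Pick the renderer for the requested format, or fail loudly.
sub render {
    my ($format, $stash) = @_;
    my $view = $VIEW_FOR{$format} or die "Unsupported format: $format\n";
    return $view->($stash);
}

print render(json => { name => 'Example Artist' }), "\n";
print render(xml  => { name => 'Simon & Garfunkel' }), "\n";
```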

Edit System Architecture

Pattern: Command Pattern

Concept: All data modifications are represented as "edits" - versioned, votable changes that go through a review process.

Edit Lifecycle:

  1. User submits edit via form
  2. Edit is validated and persisted to edit table
  3. Edit enters voting period (typically 7 days)
  4. Community votes on edit (yes/no/abstain)
  5. Auto-editors can approve immediately
  6. Edit is applied or rejected based on votes
  7. Full audit trail maintained
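
Steps 3 to 6 can be sketched as a single decision function. This is a simplified model under stated assumptions: the status strings, the resolve_edit name, and the rule that an edit fails only when "no" votes outnumber "yes" votes at expiry are all hypothetical, and the server's actual voting rules are not reproduced here.

```perl
use strict;
use warnings;

# Decide the next status of an edit.
#   votes            - array ref of 'yes' / 'no' / 'abstain'
#   expired          - true once the voting period has ended
#   approved_by_auto - true if an auto-editor approved the edit
sub resolve_edit {
    my (%args) = @_;

    # Step 5: an auto-editor approval short-circuits the vote.
    return 'applied' if $args{approved_by_auto};

    # Steps 3-4: while the voting period runs, the edit stays open.
    return 'open' unless $args{expired};

    # Step 6: tally the votes; abstentions count for neither side.
    my %tally = (yes => 0, no => 0, abstain => 0);
    $tally{$_}++ for @{ $args{votes} || [] };

    # Assumption: "no" must outnumber "yes" to reject the edit.
    return $tally{no} > $tally{yes} ? 'failed vote' : 'applied';
}

print resolve_edit(votes => [qw(yes yes no)], expired => 1), "\n";   # applied
print resolve_edit(votes => [qw(no no yes)],  expired => 1), "\n";   # failed vote
print resolve_edit(votes => [qw(no)],         expired => 0), "\n";   # open
```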

Edit Types (examples):

  • Edit::Artist::Create - Create new artist
  • Edit::Artist::Edit - Modify artist data
  • Edit::Artist::Delete - Delete artist
  • Edit::Release::Create - Create new release
  • Edit::Release::AddReleaseLabel - Add label to release
  • Edit::Relationship::Create - Create relationship
  • Edit::Relationship::Edit - Modify relationship
  • Edit::Relationship::Delete - Delete relationship

Edit Structure:

package MusicBrainz::Server::Edit::Artist::Edit;
use Moose;
extends 'MusicBrainz::Server::Edit';

sub edit_type { 1 }  # Unique edit type ID
sub edit_name { 'Edit artist' }

sub initialize {
    my ($self, %opts) = @_;
    # Store old and new data
    $self->data({
        entity_id => $opts{artist_id},
        old => { ... },
        new => { ... },
    });
}

sub accept {
    my $self = shift;
    # Apply the edit
    $self->c->model('Artist')->update($self->data->{entity_id}, $self->data->{new});
}

Edit Data Storage:

  • edit table - Edit metadata (type, status, votes)
  • edit_data table - Edit-specific data (JSON)
  • vote table - User votes on edits

Edit Statuses:

  • Open - Awaiting votes
  • Applied - Accepted and applied
  • Failed Vote - Rejected by community
  • Failed Dependency - Dependent edit failed
  • Error - Application error
  • Deleted - Cancelled by submitter

Serialization Architecture

JSON Serializer

Location: lib/MusicBrainz/Server/WebService/Serializer/JSON/2/

Modules:

  • Artist.pm - Artist JSON serialization
  • Release.pm - Release JSON serialization
  • Recording.pm - Recording JSON serialization
  • Utils.pm - Common serialization utilities

Pattern:

sub serialize {
    my ($self, $entity, $inc, $opts) = @_;
    
    my $data = {
        id => $entity->gid,
        name => $entity->name,
        'sort-name' => $entity->sort_name,
    };
    
    if ($inc->artist_credits) {
        $data->{'artist-credit'} = $self->serialize_artist_credit($entity->artist_credit);
    }
    
    return $data;
}
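
To show how the hash returned by serialize becomes a response body, here is a self-contained sketch. StubArtist and serialize_artist are hypothetical stand-ins for the entity class and serializer module, and JSON::PP (a core Perl module) is assumed as the encoder purely for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Stub entity with only the accessors the serializer touches.
package StubArtist {
    sub new       { my ($class, %args) = @_; return bless {%args}, $class }
    sub gid       { $_[0]{gid} }
    sub name      { $_[0]{name} }
    sub sort_name { $_[0]{sort_name} }
}

# Hypothetical serializer following the pattern above: build a plain
# hash with hyphenated API keys and leave encoding to the caller.
sub serialize_artist {
    my ($entity) = @_;
    return {
        id          => $entity->gid,
        name        => $entity->name,
        'sort-name' => $entity->sort_name,
    };
}

# Placeholder data, not a real MBID.
my $artist = StubArtist->new(
    gid       => '00000000-0000-0000-0000-000000000001',
    name      => 'Example Artist',
    sort_name => 'Artist, Example',
);

# canonical() sorts the keys so the output is stable across runs.
my $json = JSON::PP->new->canonical->encode(serialize_artist($artist));
print "$json\n";
```

Keeping the serializer's output as a plain data structure means the same function can be unit-tested without parsing JSON.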

XML Serializer

Location: lib/MusicBrainz/Server/WebService/Serializer/XML/2/

Namespace: http://musicbrainz.org/ns/mmd-2.0#

Pattern:

sub serialize {
    my ($self, $entity, $inc, $opts) = @_;
    
    my $xml = XML::LibXML::Element->new('artist');
    $xml->setAttribute('id', $entity->gid);
    $xml->appendTextChild('name', $entity->name);
    $xml->appendTextChild('sort-name', $entity->sort_name);
    
    return $xml;
}

JSON-LD Serializer

Location: lib/MusicBrainz/Server/WebService/Serializer/JSONLD/

Context: Schema.org vocabulary

Pattern:

sub serialize {
    my ($self, $entity) = @_;
    
    return {
        '@context' => 'http://schema.org',
        '@type' => 'MusicGroup',
        '@id' => 'https://musicbrainz.org/artist/' . $entity->gid,
        'name' => $entity->name,
    };
}

Frontend Architecture

Template Toolkit (Server-Side Rendering)

Location: root/

Main Template: root/layout.tt

Template Structure:

root/
├── layout.tt           # Main layout
├── artist/
│   ├── index.tt        # Artist listing
│   ├── show.tt         # Artist detail
│   └── edit.tt         # Artist edit form
├── release/
│   ├── index.tt
│   ├── show.tt
│   └── edit.tt
└── components/
    ├── header.tt
    ├── footer.tt
    └── sidebar.tt

Template Pattern:

[% WRAPPER 'layout.tt' title=artist.name %]
  <h1>[% artist.name %]</h1>
  <p>Sort name: [% artist.sort_name %]</p>
  
  [% IF artist.releases.size %]
    <h2>Releases</h2>
    <ul>
      [% FOR release IN artist.releases %]
        <li><a href="/release/[% release.gid %]">[% release.name %]</a></li>
      [% END %]
    </ul>
  [% END %]
[% END %]

React (Progressive Enhancement)

Location: root/static/scripts/

Strategy: Progressive enhancement - server renders HTML, React hydrates for interactivity.

Component Structure:

root/static/scripts/
├── common/
│   ├── components/
│   │   ├── EntityLink.js
│   │   ├── Autocomplete.js
│   │   └── ReleaseList.js
│   └── utility/
├── edit/
│   ├── components/
│   │   ├── EditNote.js
│   │   └── VotingSection.js
│   └── reducers/
└── release/
    ├── components/
    │   ├── ReleaseHeader.js
    │   └── TrackList.js
    └── reducers/

React Pattern:

import React from 'react';
import ReactDOM from 'react-dom';

const ReleaseList = ({ releases }) => (
  <ul>
    {releases.map(release => (
      <li key={release.gid}>
        <a href={`/release/${release.gid}`}>{release.name}</a>
      </li>
    ))}
  </ul>
);

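// Hydrate server-rendered content.
// ReactDOM.hydrate is the legacy (pre-React 18) API; newer code uses
// hydrateRoot from 'react-dom/client'.
const container = document.getElementById('release-list');
if (container) {
  // Fall back to an empty list when the data attribute is missing.
  const releases = JSON.parse(container.dataset.releases ?? '[]');
  ReactDOM.hydrate(<ReleaseList releases={releases} />, container);
}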

Legacy Knockout.js

Status: Being phased out, but still present in some views.

Location: root/static/scripts/ (mixed with React)

Pattern:

ko.applyBindings({
  releases: ko.observableArray([...]),
  addRelease: function() { ... }
});

Service Layer (Context)

File: lib/MusicBrainz/Server/Context.pm

Responsibility: Coordinate operations across multiple Data modules, manage transactions, provide unified interface.

Pattern:

my $artist = $c->model('Artist')->get_by_gid($gid);
$c->model('ArtistCredit')->load($artist);
$c->model('Release')->load_for_artist($artist);
$c->model('Relationship')->load($artist);

Context Provides:

  • Database connection management
  • Transaction handling
  • Model access ($c->model('Artist'))
  • Configuration access ($c->config)
  • Session management
  • Request/response handling
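
The model access listed above can be sketched as a small registry that resolves a short name to a data module and caches the instance. This is an assumption about the general shape, not the contents of Context.pm; every Sketch::* package below is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical data modules; each keeps a reference to the context so
# it can reach shared resources and other models.
package Sketch::Data::Artist {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub c   { $_[0]{c} }
}
package Sketch::Data::Release {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub c   { $_[0]{c} }
}

# Minimal context: one cached instance per model name.
package Sketch::Context {
    sub new { my ($class) = @_; return bless { models => {} }, $class }

    sub model {
        my ($self, $name) = @_;
        return $self->{models}{$name} //= do {
            my $class = "Sketch::Data::$name";
            die "Unknown model: $name\n" unless $class->can('new');
            $class->new(c => $self);
        };
    }
}

my $c       = Sketch::Context->new;
my $artists = $c->model('Artist');

print ref($artists), "\n";
print $artists == $c->model('Artist') ? "cached\n" : "new instance\n";
```

Note that the context and its models reference each other in this sketch; a long-running process would normally weaken one side of that cycle.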

Key Design Patterns

Repository Pattern

Implementation: Data layer modules

Purpose: Abstract database access, provide clean interface for entity operations.

Example:

# Instead of raw SQL everywhere:
my $artist = $c->model('Artist')->get_by_gid($gid);

# Data::Artist handles the SQL:
sub get_by_gid {
    my ($self, $gid) = @_;
    return $self->sql->select_single_row_hash(
        'SELECT * FROM artist WHERE gid = ?', $gid
    );
}
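
The example above hands back the raw row hash. A repository normally goes one step further and maps the row to an entity object, which the following sketch illustrates without a database. All Sketch::* packages are hypothetical, and Sketch::FakeSql answers from an in-memory list rather than running the SQL it receives.

```perl
use strict;
use warnings;

# Stand-in for Sql.pm: ignores the query text and looks the first bind
# value up in an in-memory list of rows.
package Sketch::FakeSql {
    sub new { my ($class, @rows) = @_; return bless { rows => [@rows] }, $class }

    sub select_single_row_hash {
        my ($self, $query, @args) = @_;
        my ($match) = grep { $_->{gid} eq $args[0] } @{ $self->{rows} };
        return $match;
    }
}

package Sketch::Entity::Artist {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub gid  { $_[0]{gid} }
    sub name { $_[0]{name} }
}

# Repository: hides the SQL and returns a domain object, or undef when
# no row matches.
package Sketch::Repo::Artist {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }

    sub get_by_gid {
        my ($self, $gid) = @_;
        my $row = $self->{sql}->select_single_row_hash(
            'SELECT gid, name FROM artist WHERE gid = ?', $gid,
        );
        return $row ? Sketch::Entity::Artist->new(%$row) : undef;
    }
}

# Placeholder data, not a real MBID.
my $gid  = '00000000-0000-0000-0000-000000000001';
my $repo = Sketch::Repo::Artist->new(
    sql => Sketch::FakeSql->new({ gid => $gid, name => 'Example Artist' }),
);

my $artist = $repo->get_by_gid($gid);
print ref($artist), ': ', $artist->name, "\n";
print defined($repo->get_by_gid('missing')) ? "found\n" : "not found\n";
```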

Command Pattern

Implementation: Edit system

Purpose: Encapsulate all data modifications as objects, enabling undo, audit trails, and voting.

Example:

my $edit = $c->model('Edit')->create(
    edit_type => $EDIT_ARTIST_EDIT,
    editor_id => $c->user->id,
    artist_id => $artist->id,
    old => { name => 'Old Name' },
    new => { name => 'New Name' },
);
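
What makes this a command is that the change is an object that can be stored, reviewed, and applied later. The sketch below models that with an in-memory store. The class name, the status strings, and the choice to report a stale old snapshot as a failed dependency are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical edit command: records the old and new values so the
# change can be reviewed before it is applied.
package Sketch::Edit::RenameArtist {
    sub new {
        my ($class, %args) = @_;
        return bless { %args, status => 'open' }, $class;
    }

    sub status { $_[0]{status} }

    # Apply the change only if the stored data still matches the "old"
    # snapshot; otherwise another change got there first.
    sub accept {
        my ($self, $store) = @_;
        my $current = $store->{ $self->{artist_id} };
        if ($current->{name} ne $self->{old}{name}) {
            $self->{status} = 'failed dependency';   # assumption, see lead-in
            return 0;
        }
        $current->{name} = $self->{new}{name};
        $self->{status}  = 'applied';
        return 1;
    }

    # Close the edit without touching the data.
    sub reject {
        my ($self) = @_;
        $self->{status} = 'failed vote';
        return 1;
    }
}

my %artists = (42 => { name => 'Old Name' });
my $edit = Sketch::Edit::RenameArtist->new(
    artist_id => 42,
    old       => { name => 'Old Name' },
    new       => { name => 'New Name' },
);

$edit->accept(\%artists);
print "$artists{42}{name} (", $edit->status, ")\n";   # New Name (applied)
```

Because the edit carries both snapshots, the same object serves as the audit record after it has been applied.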

Service Pattern

Implementation: Context object

Purpose: Coordinate operations across multiple repositories, manage transactions.

Example:

$c->model('MB')->with_transaction(sub {
    my $artist = $c->model('Artist')->insert({ name => 'New Artist' });
    $c->model('Edit')->create(
        edit_type => $EDIT_ARTIST_CREATE,
        entity_id => $artist->id,
    );
});
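
A helper like with_transaction is typically a thin wrapper that commits when the callback returns and rolls back when it dies. The following sketch shows that control flow under the assumption that the handle offers DBI-style begin_work, commit, and rollback methods. Sketch::FakeDbh is a hypothetical stand-in that only logs the calls, and this function is not the server's implementation.

```perl
use strict;
use warnings;

# Stand-in for a database handle that only records transaction calls.
package Sketch::FakeDbh {
    sub new        { my ($class) = @_; return bless { history => [] }, $class }
    sub begin_work { push @{ $_[0]{history} }, 'BEGIN';    return 1 }
    sub commit     { push @{ $_[0]{history} }, 'COMMIT';   return 1 }
    sub rollback   { push @{ $_[0]{history} }, 'ROLLBACK'; return 1 }
    sub history    { @{ $_[0]{history} } }
}

# Hypothetical with_transaction: commit if the callback returns,
# roll back and re-throw if it dies.
sub with_transaction {
    my ($dbh, $code) = @_;
    $dbh->begin_work;
    my @result;
    my $ok = eval { @result = $code->(); 1 };
    unless ($ok) {
        my $err = $@ || "transaction failed\n";
        $dbh->rollback;
        die $err;
    }
    $dbh->commit;
    return @result;
}

my $dbh = Sketch::FakeDbh->new;

my ($value) = with_transaction($dbh, sub { return 'inserted' });
eval { with_transaction($dbh, sub { die "constraint violation\n" }) };
my $error = $@;

print join(' ', $dbh->history), "\n";   # BEGIN COMMIT BEGIN ROLLBACK
```

Re-throwing after the rollback lets the caller decide how to report the failure while guaranteeing that no partial changes are committed.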

Data Access Layer

No ORM: MusicBrainz does not use DBIx::Class or any traditional ORM.

Custom Abstraction:

  • Moose-based Data modules
  • Raw SQL through DBI with the DBD::Pg driver
  • DBIx::Connector for persistent connection and transaction management (it does not pool connections)
  • Sql.pm provides query execution helpers

Rationale:

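  • Performance - Hand-written SQL avoids ORM query-generation and object-mapping overhead
  • Flexibility - Complex queries are easier to write directly in SQL
  • Control - Full control over query execution
  • Legacy - The codebase predates the widespread adoption of Perl ORMs such as DBIx::Class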

SQL Abstraction Example:

# Sql.pm
# Returns the first matching row as a hash ref, or undef if no row matches.
sub select_single_row_hash {
    my ($self, $query, @args) = @_;
    my $row = $self->dbh->selectrow_hashref($query, undef, @args);
    return $row;
}

# Returns an array ref containing one hash ref per matching row.
sub select_list_of_hashes {
    my ($self, $query, @args) = @_;
    my $rows = $self->dbh->selectall_arrayref($query, { Slice => {} }, @args);
    return $rows;
}