# MusicMetaLinker API Reference

## API Type

MusicMetaLinker is a Python library API. No REST API, no GraphQL, no command-line interface for library functionality. Batch processing has a CLI (link_partitions.py) but the core library is Python-only.

## Primary Interface: Align Class

### Constructor

```python
from musicmetalinker.linking import Align

linker = Align(
    mbid_track=None,
    mbid_release=None,
    artist=None,
    album=None,
    track=None,
    track_number=None,
    duration=None,
    isrc=None,
    strict=False
)
```

**Parameters:**

- **mbid_track** (str, optional): MusicBrainz recording ID. If provided, MusicBrainz is queried first and treated as authoritative.
- **mbid_release** (str, optional): MusicBrainz release ID. Used for album-level metadata.
- **artist** (str, optional): Artist name. Used for metadata-based search when identifiers unavailable.
- **album** (str, optional): Album name. Used for filtering and matching.
- **track** (str, optional): Track name. Primary search term for metadata-based queries.
- **track_number** (int, optional): Track position on album. Used for filtering multiple matches.
- **duration** (int or float, optional): Track duration in seconds. Critical for filtering. Deezer uses ±3 second threshold.
- **isrc** (str, optional): International Standard Recording Code. If provided, used for direct lookup on Deezer and MusicBrainz.
- **strict** (bool, optional): Strict matching mode. Behavior not fully documented. Likely affects fuzzy matching thresholds.

**Returns:** Align instance. No exceptions raised during construction. Queries execute lazily when getters called.
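
The lazy behavior described for the constructor can be pictured with a small stand-in. This is only a sketch of the general pattern (compute on first access, cache afterwards), not the library's code: `LazyField` and `fake_lookup` are hypothetical names, and caching a `None` result is an assumption of the sketch.

```python
from typing import Callable, Optional


class LazyField:
    """Hold a value that is computed on first access and cached afterwards."""

    def __init__(self, query: Callable[[], Optional[str]]) -> None:
        self._query = query
        self._resolved = False
        self._value: Optional[str] = None

    def get(self) -> Optional[str]:
        if not self._resolved:
            # First call: run the (potentially slow) query exactly once.
            self._value = self._query()
            # Assumption of this sketch: a None result is cached as well.
            self._resolved = True
        return self._value


calls = []


def fake_lookup() -> Optional[str]:
    """Hypothetical stand-in for a network query; records each invocation."""
    calls.append("lookup")
    return "6b9e7b9e-8f9e-4f9e-9f9e-9f9e9f9e9f9e"


mbid = LazyField(fake_lookup)       # construction performs no query
calls_after_construction = len(calls)
first = mbid.get()                  # triggers the query
second = mbid.get()                 # served from the cached value
print(calls_after_construction, len(calls))
```

Under this pattern, constructing the object is cheap and cannot fail; any cost or failure is deferred to the first getter call.
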

**Usage patterns:**

Minimal input (metadata only):

```python
linker = Align(artist="Radiohead", track="Creep")
```

With identifiers (preferred):

```python
linker = Align(
    mbid_track="6b9e7b9e-8f9e-4f9e-9f9e-9f9e9f9e9f9e",
    isrc="GBAYE9200070"
)
```

Full metadata for best matching:

```python
linker = Align(
    artist="The Beatles",
    track="Hey Jude",
    album="Hey Jude",
    duration=431,
    track_number=1
)
```

### Metadata Getter Methods

All getters return None if data unavailable. No exceptions raised.

#### get_artist()

```python
artist = linker.get_artist()
```

**Returns:** str or None. Artist name from best available source (MusicBrainz > Deezer > YouTube > input).

**Behavior:**

- If MBID available, returns MusicBrainz artist
- Falls back to Deezer artist if found
- Falls back to YouTube artist if found
- Returns input artist if no services matched
- Returns None if no artist information available

#### get_album()

```python
album = linker.get_album()
```

**Returns:** str or None. Album/release name.

**Behavior:** Same cascading fallback as get_artist().

#### get_track()

```python
track = linker.get_track()
```

**Returns:** str or None. Track/recording name.

**Behavior:** Same cascading fallback as get_artist().

#### get_track_number()

```python
track_number = linker.get_track_number()
```

**Returns:** int or None. Track position on album.

**Behavior:**

- Returns MusicBrainz track number if available
- Falls back to input track_number
- Returns None if unavailable

#### get_duration()

```python
duration = linker.get_duration()
```

**Returns:** int, float, or None. Track duration in seconds.

**Behavior:**

- Returns MusicBrainz duration if available (milliseconds converted to seconds)
- Falls back to Deezer duration
- Falls back to input duration
- Returns None if unavailable

**Note:** MusicBrainz stores duration in milliseconds. The library converts to seconds for consistency.
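
The cascading fallback shared by these getters amounts to "the first source with a value wins". A minimal sketch of that idea, using the get_duration() ordering, might look like the following. `first_available` and `ms_to_seconds` are hypothetical helpers written for illustration, and the per-source values are made up; none of this is taken from the library's source.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def first_available(candidates: Sequence[Optional[T]]) -> Optional[T]:
    """Return the first candidate that is not None, or None if all are missing."""
    for value in candidates:
        if value is not None:
            return value
    return None


def ms_to_seconds(duration_ms: Optional[int]) -> Optional[float]:
    """Convert a millisecond duration to seconds, passing None through."""
    return None if duration_ms is None else duration_ms / 1000.0


# Hypothetical per-source values, ordered MusicBrainz > Deezer > input.
musicbrainz_ms = None   # MusicBrainz had no match for this track
deezer_seconds = 238    # Deezer reports seconds
input_seconds = 240     # value the caller passed to the constructor

duration = first_available(
    [ms_to_seconds(musicbrainz_ms), deezer_seconds, input_seconds]
)
print(duration)
```

The helper tests for `None` explicitly rather than truthiness, so a legitimate value such as track number `0` or an empty string would not be skipped by accident.
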

#### get_release_date()

```python
release_date = linker.get_release_date()
```

**Returns:** str or None. Release date in ISO format (YYYY-MM-DD) or year only (YYYY).

**Behavior:**

- Returns MusicBrainz release date if available
- Falls back to Deezer release date
- Returns None if unavailable

**Format inconsistency:** MusicBrainz may return full date, Deezer typically returns year only.

#### get_isrc()

```python
isrc = linker.get_isrc()
```

**Returns:** str or None. International Standard Recording Code.

**Behavior:**

- Returns input ISRC if provided
- Extracts from MusicBrainz recording if available
- Extracts from Deezer result if available
- Returns None if unavailable

**Format:** Standard ISRC format (e.g., "GBAYE9200070"). No validation performed.

#### get_bpm()

```python
bpm = linker.get_bpm()
```

**Returns:** int, float, or None. Tempo in beats per minute.

**Behavior:**

- Returns Deezer BPM if available
- Returns None if unavailable

**Note:** MusicBrainz doesn't provide BPM in standard queries. Only Deezer source.

### Identifier Getter Methods

#### get_mbid()

```python
mbid = linker.get_mbid()
```

**Returns:** str or None. MusicBrainz recording ID (UUID format).

**Behavior:**

- Returns input mbid_track if provided
- Queries MusicBrainz by ISRC if available
- Queries MusicBrainz by metadata if ISRC unavailable
- Returns None if no match found

**Format:** UUID string (e.g., "6b9e7b9e-8f9e-4f9e-9f9e-9f9e9f9e9f9e").

#### get_deezer_id()

```python
deezer_id = linker.get_deezer_id()
```

**Returns:** int or None. Deezer track ID.

**Behavior:**

- Queries Deezer by ISRC if available
- Queries Deezer by metadata if ISRC unavailable
- Filters by duration (±3 seconds)
- Returns None if no match found

**Format:** Integer (e.g., 123456789).

#### get_deezer_link()

```python
deezer_link = linker.get_deezer_link()
```

**Returns:** str or None. Full Deezer track URL.

**Behavior:**

- Calls get_deezer_id() internally
- Constructs URL: `f"https://www.deezer.com/track/{deezer_id}"`
- Returns None if no Deezer ID available

**Format:** Full URL (e.g., "https://www.deezer.com/track/123456789").

#### get_youtube_link()

```python
youtube_link = linker.get_youtube_link()
```

**Returns:** str or None. YouTube Music track URL.

**Behavior:**

- Queries YouTube Music by metadata (artist, track, album)
- Returns first result (no sophisticated ranking)
- Returns None if no results

**Format:** Full YouTube URL (e.g., "https://www.youtube.com/watch?v=dQw4w9WgXcQ").

**Warning:** YouTube matching is weak. First result assumed correct. No duration filtering.

#### get_acousticbrainz_link()

```python
acousticbrainz_link = linker.get_acousticbrainz_link()
```

**Returns:** str or None. AcousticBrainz URL.

**Behavior:**

- Requires MBID (calls get_mbid() internally)
- Checks if https://acousticbrainz.org/{mbid} returns HTTP 200
- Returns URL if exists, None otherwise

**Critical issue:** AcousticBrainz shut down in 2022. This method always returns None. Dead code.

### Internal Service Methods

Not part of public API but exposed in service classes.

#### MusicBrainzAlign Methods

**get_recording(mbid):** Direct MusicBrainz recording lookup by MBID.

**get_best_match(artist, track, album, duration):** Search MusicBrainz by metadata with filtering.

**get_iswc():** Retrieve International Standard Musical Work Code.

**Implementation details:**

```python
from musicmetalinker.linking import MusicBrainzAlign

mbid = "..."  # MusicBrainz recording ID
mb = MusicBrainzAlign(mbid=mbid)
recording = mb.get_recording(mbid)
# Returns dict with artist, album, track, duration, isrcs, etc.
```

Not intended for direct use. Align class wraps these methods.

#### DeezerAlign Methods

**best_match(artist, track, album, duration, duration_threshold=3):** Search Deezer with duration filtering.

**get_rank():** Retrieve Deezer popularity rank.

**Implementation details:**

```python
from musicmetalinker.linking import DeezerAlign

artist, track, album, duration = "...", "...", "...", 123
deezer = DeezerAlign(artist=artist, track=track, album=album, duration=duration)
match = deezer.best_match(artist, track, album, duration)
# Returns Deezer track object or None
```

Duration threshold defaults to 3 seconds. Adjustable for stricter/looser matching.

#### YouTubeAlign Methods

**get_best_match(artist, track, album):** Search YouTube Music.

**get_youtube_id():** Extract video ID from search results.

**Implementation details:**

```python
from musicmetalinker.linking import YouTubeAlign

artist, track, album = "...", "...", "..."
yt = YouTubeAlign(artist=artist, track=track, album=album)
match = yt.get_best_match(artist, track, album)
# Returns YouTube Music result dict or None
```

No duration parameter. No filtering. First result returned.

### Batch Processing API

#### link_partitions.py CLI

```bash
python link_partitions.py <directory> [options]
```

**Arguments:**

- **directory** (positional): Path to directory containing JAMS files.

**Options:**

- **--save:** Write enriched JAMS files back to disk. Without this flag, only CSV output generated.
- **--limit audio:** Only process JAMS files with audio content. Skip annotation-only files.
- **--overwrite:** Overwrite existing enriched JAMS files. Without this flag, existing files skipped.

**Output:** CSV file with columns:

- jams_file: Original JAMS filename
- track_name, artist_name, album_name: Metadata
- track_number, duration, release_year: Attributes
- musicbrainz: MBID
- isrc: ISRC
- deezer_id, deezer_url: Deezer identifiers
- youtube_url: YouTube Music link
- acousticbrainz: AcousticBrainz link (always None)
- spotify_id: Spotify ID (if available)

Log file: link_partitions.log in current directory.

#### JAMSProcessor API

```python
from musicmetalinker.preprocessor import JAMSProcessor

# jams_file_path, align_instance and output_path are supplied by the caller.
processor = JAMSProcessor(jams_file_path)
metadata = processor.extract_metadata()
# Returns dict with artist, track, album, duration, etc.

processor.enrich_jams(align_instance)
processor.write_jams(output_path)
```

**extract_metadata():** Parses JAMS file and returns metadata dict.

**enrich_jams(align):** Takes Align instance and adds identifiers to JAMS structure.

**write_jams(path):** Writes enriched JAMS to file.

### Error Handling

No exceptions raised by public API. All errors silently suppressed.

**Pattern:**

- Service query fails: Returns None
- Network error: Returns None
- Invalid input: Returns None
- No match found: Returns None

**Implications:**

- No distinction between error types
- No error messages
- No logging of failures (except in batch mode)
- Caller cannot determine why None returned

**Debugging:**

- Enable logging to see internal errors
- Check link_partitions.log for batch processing errors
- Add print statements to source code

### Rate Limiting

No rate limiting implemented.

**Risks:**

- MusicBrainz rate limits: 1 request/second recommended, not enforced
- Deezer rate limits: Unknown, not enforced
- YouTube Music rate limits: Unknown, not enforced

**Batch processing:** No delays between requests. High risk of rate limiting or IP bans.

**Recommendation:** Add manual delays in batch processing loops.

### Caching

Results cached within Align instance lifetime. No cross-instance caching.

**Behavior:**

- First call to get_mbid() queries MusicBrainz
- Second call to get_mbid() returns cached value
- Creating new Align instance queries again

**No persistent cache:** No disk cache, no Redis, no memcached.

**Batch processing:** Each track creates new Align instance. No cache reuse across tracks.

### Thread Safety

Not thread-safe. No synchronization primitives.

**Unsafe operations:**

- Concurrent calls to same Align instance
- Concurrent batch processing of same directory

**Safe operations:**

- Multiple Align instances in separate threads (each queries independently)

### Authentication

**MusicBrainz:** No authentication. User-Agent header required ("elka/0.1" hardcoded).

**Deezer:** No authentication for search API.

**YouTube Music:** No authentication. Uses unofficial API.

**Spotify:** OAuth2 client credentials required. Configured in external mml_secrets.py file.

**Spotify usage:** Limited to ISRC extraction in Billboard dataset cleaning. Not used in main Align workflow.

### API Versioning

No API versioning. Library version 0.0.1 indicates pre-release.

**Breaking changes:** Possible in any release. No stability guarantees.

**Compatibility:** No backward compatibility promises.

### Dependencies for API Usage

Minimum dependencies for using Align class:

- musicbrainzngs
- deezer-python
- ytmusicapi
- requests

Optional dependencies:

- jams (for JAMS file support)
- pandas (for batch CSV output)
- spotipy (for Spotify integration)

### Performance Characteristics

**Query latency:**

- MusicBrainz: 100-500ms per query
- Deezer: 50-200ms per query
- YouTube Music: 100-300ms per query

**Total latency:** Sum of all service queries (sequential execution). Expect 250-1000ms per track.

**Batch processing:** Linear scaling. 1000 tracks = 1000x single track latency.

### API Limitations

1. **No bulk queries:** Each track requires separate Align instance
2. **No async support:** Synchronous only
3. **No streaming results:** All-or-nothing queries
4. **No partial updates:** Can't update single field
5. **No validation:** No input validation, no output validation
6. **No error details:** Only None on failure
7. **Dead integrations:** AcousticBrainz non-functional
8. **Weak YouTube matching:** First result assumed correct

### API Strengths

1. **Simple interface:** Single class, clear getters
2. **Flexible input:** Works with identifiers or metadata
3. **Cascading fallback:** Graceful degradation
4. **Lazy evaluation:** Only query when needed
5. **JAMS support:** Academic standard format

### API Design Recommendations

For production use:

1. **Add exceptions:** Raise specific errors instead of returning None
2. **Add validation:** Validate input parameters
3. **Add async API:** Async versions of all getters
4. **Add bulk API:** Process multiple tracks in single call
5. **Add configuration:** Runtime configuration for thresholds
6. **Add logging:** Structured logging with correlation IDs
7. **Add rate limiting:** Respect API limits
8. **Remove dead code:** Delete AcousticBrainz methods
9. **Add documentation:** Docstrings for all public methods
10. **Add type hints:** Full type annotations

The API surface is clean and simple. The implementation needs hardening.
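
Until rate limiting exists in the library, callers can approximate the "manual delays" recommendation themselves. The sketch below is one possible approach, not part of MusicMetaLinker: `Throttle` is a hypothetical helper that enforces a minimum gap between successive calls. The demo uses a short interval so it runs quickly; an interval of 1.0 second would match the MusicBrainz recommendation quoted in the Rate Limiting section.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None  # None until the first call

    def wait(self) -> float:
        """Sleep if needed so calls are at least min_interval apart.

        Returns the number of seconds slept (0.0 if no delay was required).
        """
        delay = 0.0
        if self._last_call is not None:
            elapsed = self._clock() - self._last_call
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last_call = self._clock()
        return delay


throttle = Throttle(min_interval=0.05)  # use 1.0 for MusicBrainz-bound loops
delays = []
for track in ["track-a", "track-b", "track-c"]:  # placeholder work items
    delays.append(throttle.wait())
    # A real loop would build an Align instance and call its getters here.
print(delays[0])
```

Injecting `clock` and `sleep` keeps the helper testable without real waiting. Because the document notes that Align is not thread-safe, a single-threaded loop like this one is the simplest place to apply the throttle.
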