# MusicMetaLinker Deployment

## Distribution Model

MusicMetaLinker is distributed as source code only. No binary distributions, no PyPI package, no conda package.

**Installation method:** Direct from GitHub via pip.

```bash
pip install git+https://github.com/andreamust/MusicMetaLinker.git
```

**Implications:**

- Requires git installed
- Requires network access to GitHub
- No version pinning by default (installs the latest commit unless a ref such as `@<commit-hash>` is appended to the URL)
- No offline installation

## Build System

### Build Backend

**PEP 517 compliant:** Uses pyproject.toml for build configuration.

**Build backend:** hatchling (modern Python build tool).

**pyproject.toml structure:**

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "musicmetalinker"
version = "0.0.1"
dependencies = [
    "musicbrainzngs",
    "deezer-python",
    "ytmusicapi",
    "spotipy",
    "requests",
    "tqdm",
    "jams",
    "pandas",
    "cryptography"
]
```

**No setup.py:** Modern packaging only.

**No setup.cfg:** All configuration in pyproject.toml.

### Build Process

**Local build:**

```bash
git clone https://github.com/andreamust/MusicMetaLinker.git
cd MusicMetaLinker
pip install -e .
```

**-e flag:** Editable install. Changes to source code are reflected immediately.

**Build artifacts:** None. Pure Python package, no compilation.

### Dependencies

**Runtime dependencies:**

- musicbrainzngs: MusicBrainz API client
- deezer-python: Deezer API wrapper
- ytmusicapi: YouTube Music API client
- spotipy: Spotify API client
- requests: HTTP library
- tqdm: Progress bars
- jams: JAMS format support
- pandas: CSV output
- cryptography: Required by spotipy

**No optional dependencies:** All dependencies required.

**No development dependencies:** No test framework, no linting tools, no type checkers.

**Dependency versions:** No version constraints. Always installs latest compatible versions.

**Risk:** Breaking changes in dependencies may break MusicMetaLinker.
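Because the dependency list carries no version constraints, it can help to record which versions are installed in an environment known to work. The following is a minimal sketch using only the standard library; the helper names `installed_versions` and `format_pins` are hypothetical and not part of MusicMetaLinker, and the distribution names are taken from the pyproject.toml excerpt above.

```python
from importlib import metadata

# Distribution names as listed in the pyproject.toml excerpt above.
RUNTIME_DEPENDENCIES = [
    "musicbrainzngs", "deezer-python", "ytmusicapi", "spotipy",
    "requests", "tqdm", "jams", "pandas", "cryptography",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_pins(versions):
    """Render 'name==version' lines for every distribution that is installed."""
    return [
        f"{name}=={version}"
        for name, version in sorted(versions.items())
        if version is not None
    ]


if __name__ == "__main__":
    # Print pins for the current environment, one per line.
    print("\n".join(format_pins(installed_versions(RUNTIME_DEPENDENCIES))))
```

The printed lines could be copied into a constraints file or used to decide on version bounds for pyproject.toml; which bounds are appropriate is a judgment call this sketch does not make.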
## Deployment Environments

### Library Deployment

**Target environment:** Python 3.8+ on any platform (Linux, macOS, Windows).

**Installation:**

```bash
pip install git+https://github.com/andreamust/MusicMetaLinker.git
```

**Usage:**

```python
from musicmetalinker.linking import Align

linker = Align(artist="...", track="...")
mbid = linker.get_mbid()
```

**No configuration required** (except Spotify credentials for dataset preparation).

### Batch Processing Deployment

**Target environment:** Python 3.8+ with file system access.

**Installation:** Same as library deployment.

**Usage:**

```bash
cd /path/to/MusicMetaLinker
python link_partitions.py /path/to/jams/files --save --limit audio --overwrite
```

**Requirements:**

- JAMS files in target directory
- Write permissions for output CSV and enriched JAMS files
- Network access for API queries

**Optional:** ffmpeg for audio conversion (if processing audio files directly).

### Research Environment Deployment

**Typical setup:** Jupyter notebook or Python script in research project.

**Installation:**

```bash
pip install git+https://github.com/andreamust/MusicMetaLinker.git
```

**Interactive testing:** Notebooks included in repository:

- deezer_test.ipynb: Test Deezer integration
- queries.ipynb: Test various query patterns

**Usage:**

```python
# In Jupyter notebook
from musicmetalinker.linking import Align

linker = Align(...)
# Interactive exploration of results
```

## Configuration Management

### No Configuration Files

All configuration hardcoded in source files.

**Hardcoded values:**

- User-Agent: "elka/0.1" (in linking.py)
- Duration thresholds: 3s (Deezer), 5s (MusicBrainz)
- Similarity threshold: 0.8
- API endpoints: In library code

**No config.ini, no config.yaml, no .env files.**

### Spotify Credentials

**Only external configuration:** mml_secrets.py for Spotify credentials.

**Location:** Must be in Python path (typically same directory as scripts).
**Structure:**

```python
# mml_secrets.py
SPOTIFY_CLIENT_ID = "your-client-id-here"
SPOTIFY_CLIENT_SECRET = "your-client-secret-here"
```

**Not in repository:** Users must create this file manually.

**No documentation:** The repository gives no instructions for obtaining Spotify credentials.

**Obtaining credentials:**

1. Register app at https://developer.spotify.com/dashboard
2. Copy client ID and secret
3. Create mml_secrets.py with credentials

### Environment Variables

**Not used:** No environment variable configuration.

**Recommendation:** Use environment variables for credentials instead of mml_secrets.py.

```python
import os

SPOTIFY_CLIENT_ID = os.getenv("SPOTIFY_CLIENT_ID")
SPOTIFY_CLIENT_SECRET = os.getenv("SPOTIFY_CLIENT_SECRET")
```

## Runtime Requirements

### Python Version

**Minimum:** Python 3.8

**Tested on:** Unknown (no CI/CD, no test matrix).

**Likely compatible:** Python 3.8, 3.9, 3.10, 3.11, 3.12

**Type hints:** Not used extensively. No runtime type checking.

### System Dependencies

**Required:**

- Python 3.8+
- pip
- git (for installation)
- Network access (for API queries)

**Optional:**

- ffmpeg (for audio conversion in batch processing)

**No database:** No PostgreSQL, MySQL, MongoDB, etc.

**No message queue:** No RabbitMQ, Redis, Kafka, etc.

**No web server:** No nginx, Apache, etc.

### Platform Support

**Linux:** Fully supported. Primary development platform (likely).

**macOS:** Fully supported. All dependencies available.

**Windows:** Likely supported. All dependencies have Windows wheels. Potential issues:

- Path separators (`/` vs `\`)
- Line endings (LF vs CRLF)
- Case sensitivity (Windows file systems are case-insensitive by default)

**No platform-specific code:** Pure Python, no C extensions (except in dependencies).
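The path separator and case issues noted for Windows can usually be avoided by building paths with `pathlib` rather than string concatenation. The sketch below is illustrative only: `find_jams_files` is a hypothetical helper, not a function in MusicMetaLinker, and it does not describe how link_partitions.py actually discovers files.

```python
from pathlib import Path


def find_jams_files(root):
    """Return every .jams file under root, sorted.

    The suffix is compared case-insensitively so that 'track.JAMS' and
    'track.jams' are treated the same on every platform.
    """
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() == ".jams"
    )
```

Calling `find_jams_files("/path/to/jams/files")` returns `Path` objects, which use the correct separator for the host platform when converted to strings.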
## Containerization

### Docker

**No Dockerfile provided.**

**Sample Dockerfile:**

```dockerfile
FROM python:3.11-slim

WORKDIR /app

RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
RUN pip install git+https://github.com/andreamust/MusicMetaLinker.git

COPY mml_secrets.py /app/

CMD ["python"]
```

**For batch processing:**

```dockerfile
FROM python:3.11-slim

WORKDIR /app

RUN apt-get update && apt-get install -y git ffmpeg && rm -rf /var/lib/apt/lists/*
RUN pip install git+https://github.com/andreamust/MusicMetaLinker.git
RUN git clone https://github.com/andreamust/MusicMetaLinker.git /app/MusicMetaLinker

WORKDIR /app/MusicMetaLinker

ENTRYPOINT ["python", "link_partitions.py"]
```

**Usage:**

```bash
docker build -t musicmetalinker .
docker run -v /path/to/jams:/data musicmetalinker /data --save
```

### Docker Compose

**Not provided.**

**Sample docker-compose.yml:**

```yaml
version: '3.8'
services:
  musicmetalinker:
    build: .
    volumes:
      - ./data:/data
      - ./output:/output
    environment:
      - SPOTIFY_CLIENT_ID=${SPOTIFY_CLIENT_ID}
      - SPOTIFY_CLIENT_SECRET=${SPOTIFY_CLIENT_SECRET}
```

### Kubernetes

**Not applicable:** MusicMetaLinker is a library/batch tool, not a long-running service.

**Possible use case:** Kubernetes Job for batch processing.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: musicmetalinker-batch
spec:
  template:
    spec:
      containers:
      - name: musicmetalinker
        image: musicmetalinker:latest
        args: ["/data", "--save"]
        volumeMounts:
        - name: data
          mountPath: /data
      restartPolicy: Never
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: jams-data
```

## Continuous Integration/Continuous Deployment

### CI/CD Status

**No CI/CD pipeline.**

**No GitHub Actions, no Travis CI, no CircleCI, no Jenkins.**

**Implications:**

- No automated testing on commits
- No automated builds
- No automated releases
- No quality gates

### Testing

**No test suite.**

**No pytest, no unittest, no nose.**

**Testing approach:**

- Manual testing via Jupyter notebooks
- `if __name__ == "__main__"` blocks in some modules

**No test coverage metrics.**

### Linting and Formatting

**No linting configuration.**

**No pylint, no flake8, no black, no isort.**

**Code quality:** Inconsistent. Debug prints, commented-out code, inconsistent naming.

### Type Checking

**No type checking.**

**No mypy, no pyright, no pyre.**

**Type hints:** Minimal. Not enforced.

## Monitoring and Logging

### Logging

**Library usage:** Minimal console logging.

**Batch processing:** File-based logging to link_partitions.log.

**Log format:**

```
2024-01-15 10:30:45 - INFO - Processing file: track001.jams
2024-01-15 10:30:46 - INFO - Found MBID: 6b9e7b9e-8f9e-4f9e-9f9e-9f9e9f9e9f9e
2024-01-15 10:30:47 - ERROR - Failed to query Deezer
```

**Log levels:** INFO, ERROR. No DEBUG, WARNING.

**Debug output:** Multiple print() statements in code (not controlled by logging).

### Monitoring

**No monitoring.**

**No metrics collection, no Prometheus, no Grafana, no Datadog.**

**No health checks, no status endpoints.**

### Error Tracking

**No error tracking.**

**No Sentry, no Rollbar, no Bugsnag.**

**Errors silently suppressed.** Returns None on failure.

## Scaling Considerations

### Horizontal Scaling

**Not applicable:** Library runs in single process.
**Batch processing:** Can be parallelized manually.

**Manual parallelization:**

```bash
# Split JAMS files into partitions
# Run multiple instances in parallel
python link_partitions.py /data/partition1 --save &
python link_partitions.py /data/partition2 --save &
python link_partitions.py /data/partition3 --save &
wait
```

**No built-in parallelization.**

### Vertical Scaling

**CPU:** Single-threaded. More CPU cores don't help.

**Memory:** Minimal usage. Each Align instance uses ~1KB. Batch processing uses more for pandas DataFrame.

**Network:** Bottleneck. Sequential API calls. More bandwidth doesn't help (latency-bound).

### Performance Optimization

**No performance optimization.**

**Bottlenecks:**

- Network latency (sequential API calls)
- No caching across instances
- No connection pooling
- No request batching

**Potential optimizations:**

- Async/await for concurrent API calls
- Persistent cache (Redis)
- Connection pooling
- Batch API requests (if services support)

## Security Considerations

### Secrets Management

**Current approach:** Hardcoded in mml_secrets.py.

**Issues:**

- Plaintext credentials
- No encryption
- Risk of committing to version control

**Recommendations:**

- Environment variables
- Secrets vault (HashiCorp Vault, AWS Secrets Manager)
- Encrypted configuration files

### Network Security

**HTTPS:** All API calls use HTTPS.

**Certificate validation:** Handled by requests library (validates by default).

**No proxy support:** No configuration for HTTP proxies.

### Input Validation

**No input validation.**

**Risks:**

- Invalid MBIDs accepted
- Negative durations accepted
- Malformed ISRCs accepted

**Actual risk:** Low. Invalid input causes query failures (returns None).

### Dependency Security

**No dependency scanning.**

**No Dependabot, no Snyk, no safety.**

**Vulnerable dependencies:** Unknown. No automated checks.

**Recommendation:** Run `pip-audit` or `safety check` regularly.
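Returning to the input validation gaps listed under Security Considerations, the three risky inputs (MBID, ISRC, duration) could be checked before any API query is issued. This is a minimal sketch under stated assumptions: the function names are hypothetical and not part of MusicMetaLinker, an MBID is assumed to be a UUID in canonical hyphenated form, and an ISRC is assumed to be 12 characters (two-letter country code, three alphanumeric characters, seven digits) with optional hyphens.

```python
import re

# Assumed formats: canonical hyphenated UUID for MBIDs, 12-character ISRC.
_MBID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
_ISRC_RE = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{7}")


def is_valid_mbid(value):
    """True if value is a string in canonical UUID form (case-insensitive)."""
    return isinstance(value, str) and _MBID_RE.fullmatch(value.lower()) is not None


def is_valid_isrc(value):
    """True if value is a well-formed ISRC, ignoring hyphens and letter case."""
    if not isinstance(value, str):
        return False
    return _ISRC_RE.fullmatch(value.replace("-", "").upper()) is not None


def is_valid_duration(seconds):
    """True if seconds is a non-negative number (rejects bool, NaN, negatives)."""
    if isinstance(seconds, bool) or not isinstance(seconds, (int, float)):
        return False
    return seconds >= 0
```

Checks like these only confirm that a value is well formed; whether an identifier actually exists in a given service is still determined by the query itself.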
## Backup and Recovery

### Data Backup

**No persistent data:** Nothing to back up (library is stateless).

**Batch output:** CSV and JAMS files. User responsible for backup.

### Disaster Recovery

**Not applicable:** Library has no state to recover.

**Batch processing:** Rerun if output lost. No checkpointing, no resume capability.

## Deployment Checklist

### Library Deployment

- [ ] Python 3.8+ installed
- [ ] pip installed
- [ ] git installed
- [ ] Network access to GitHub
- [ ] Network access to MusicBrainz, Deezer, YouTube Music
- [ ] (Optional) Spotify credentials in mml_secrets.py

### Batch Processing Deployment

- [ ] All library deployment requirements
- [ ] JAMS files prepared
- [ ] Write permissions for output directory
- [ ] (Optional) ffmpeg installed for audio conversion
- [ ] Sufficient disk space for output CSV and enriched JAMS files

### Production Deployment (Recommendations)

- [ ] Pin dependency versions in pyproject.toml
- [ ] Add automated tests
- [ ] Add CI/CD pipeline
- [ ] Add error tracking (Sentry)
- [ ] Add logging (structured JSON logs)
- [ ] Add monitoring (Prometheus metrics)
- [ ] Add rate limiting
- [ ] Add retry logic with exponential backoff
- [ ] Add health checks
- [ ] Use environment variables for configuration
- [ ] Add input validation
- [ ] Add dependency scanning
- [ ] Remove AcousticBrainz integration
- [ ] Fix User-Agent header
- [ ] Add documentation for Spotify setup

## Deployment Recommendations

### Immediate Actions

1. **Publish to PyPI:** Enable `pip install musicmetalinker` without git.
2. **Pin dependencies:** Add version constraints to prevent breaking changes.
3. **Document Spotify setup:** Instructions for obtaining credentials.
4. **Remove AcousticBrainz:** Delete defunct integration.

### Short-Term Improvements

1. **Add CI/CD:** GitHub Actions for automated testing and releases.
2. **Add tests:** pytest suite with mocked API calls.
3. **Add Docker support:** Official Dockerfile and Docker Compose.
4. **Add configuration:** Support environment variables and config files.
5. **Add logging:** Structured logging with configurable levels.

### Long-Term Enhancements

1. **Add monitoring:** Prometheus metrics for API latency, success rates.
2. **Add caching:** Redis for cross-instance caching.
3. **Add async support:** Concurrent API calls for better performance.
4. **Add health checks:** Service availability monitoring.
5. **Add error tracking:** Sentry integration for production debugging.
6. **Add documentation:** Comprehensive deployment guide.
7. **Add versioning:** Semantic versioning with changelog.
8. **Add security scanning:** Automated dependency vulnerability checks.

## Deployment Maturity Assessment

**Current state:** Research prototype. Suitable for academic exploration, not production.

**Maturity level:** 1/5

**Production readiness:** Low

**Gaps:**

- No PyPI distribution
- No CI/CD
- No tests
- No monitoring
- No error tracking
- Hardcoded configuration
- Dead code (AcousticBrainz)
- No documentation for deployment

**Recommendation:** Use for research and prototyping only. Significant work required for production deployment.