Update flow diagrams for event bus architecture, cancel cleanup, and SubscribeEvents

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/claude-agent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Alexander
2026-05-11 15:54:32 +02:00
parent 93821ab214
commit 69752bd6a2
5 changed files with 231 additions and 103 deletions
@@ -0,0 +1,86 @@
@startuml Event Bus Architecture
skinparam componentAlign center
title Event Bus: In-Process Pub/Sub Architecture
package "Publishers" {
[Workflow Goroutine 1\n(album A, LOSSLESS)] as WF1
[Workflow Goroutine 2\n(album B, LOSSY)] as WF2
}
database "PostgreSQL" as DB {
[workflow_runs] as WR
[album_events] as AE
}
package "Event Bus (in-memory)" {
[Topic: albumA:LOSSLESS] as T1
[Topic: albumB:LOSSY] as T2
[Global Subscribers] as GS
}
package "Subscribers" {
[MonitorAlbumStream\nClient A (album A)] as S1
[MonitorAlbumStream\nClient B (album A)] as S2
[SubscribeEvents\nClient C (global)] as S3
}
WF1 --> DB : 1. Write event\n(synchronous)
WF1 --> T1 : 2. Publish\n(async notification)
WF2 --> DB : 1. Write event
WF2 --> T2 : 2. Publish
T1 --> S1 : Ring buffer\n(per subscriber)
T1 --> S2 : Ring buffer
T1 --> GS
T2 --> GS
GS --> S3 : Ring buffer
note right of DB
**Source of truth.**
Events survive restarts.
Replay via seq numbers.
end note
note right of T1
**Ephemeral notification.**
Ring buffer per subscriber.
Slow subscribers: overwrite oldest.
No backpressure on publishers.
end note
note bottom of S1
Client disconnect removes
subscriber from topic.
Workflow continues.
end note
' Subscription Lifecycle
note as N1
**Subscribe flow:**
1. Client calls MonitorAlbumStream or SubscribeEvents
2. Server subscribes to EventBus (per-topic or global)
3. Server queries DB for historical events (replay)
4. Server bridges: EventBus → gRPC stream
5. On disconnect: cleanup func unsubscribes
**Topic cleanup:**
When the last subscriber leaves AND the workflow has completed,
the topic is removed from the EventBus map.
end note
' Recovery on Restart
note as N2
**Server restart recovery:**
1. Query workflow_runs WHERE status = 'running'
2. For each stale run:
- If active download exists → mark completed
- Otherwise → mark failed ("server restarted")
3. RecoverOrphanedDownloads reschedules poll jobs
4. New workflows start fresh (no goroutine resurrection)
end note
@enduml
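
The note on the topic says each subscriber has its own ring buffer, that slow subscribers lose their oldest events, and that publishers never see backpressure. A minimal Go sketch of that overwrite-oldest behaviour follows. The names `Event`, `Ring`, `NewRing`, `Push`, `Drain` and `Dropped` are hypothetical and do not come from this repository, and the sketch is not concurrency-safe: a real bus would presumably guard each ring with a mutex or confine it to a single goroutine.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for a row in album_events.
type Event struct {
	Seq     int64
	Payload string
}

// Ring is a fixed-capacity buffer owned by a single subscriber.
// Not safe for concurrent use; locking is omitted for brevity.
type Ring struct {
	buf     []Event
	start   int // index of the oldest buffered event
	size    int // number of buffered events
	dropped int // events overwritten before they were read
}

// NewRing returns a ring that holds at most capacity events (minimum 1).
func NewRing(capacity int) *Ring {
	if capacity < 1 {
		capacity = 1
	}
	return &Ring{buf: make([]Event, capacity)}
}

// Push never blocks. When the ring is full it overwrites the oldest
// event, so a slow subscriber cannot stall the publisher.
func (r *Ring) Push(e Event) {
	if r.size == len(r.buf) {
		r.buf[r.start] = e
		r.start = (r.start + 1) % len(r.buf)
		r.dropped++
		return
	}
	r.buf[(r.start+r.size)%len(r.buf)] = e
	r.size++
}

// Drain returns the buffered events oldest first and empties the ring.
func (r *Ring) Drain() []Event {
	out := make([]Event, 0, r.size)
	for i := 0; i < r.size; i++ {
		out = append(out, r.buf[(r.start+i)%len(r.buf)])
	}
	r.start, r.size = 0, 0
	return out
}

// Dropped reports how many events were overwritten unread.
func (r *Ring) Dropped() int { return r.dropped }

func main() {
	r := NewRing(3)
	for seq := int64(1); seq <= 5; seq++ {
		r.Push(Event{Seq: seq, Payload: "progress"})
	}
	for _, e := range r.Drain() {
		fmt.Println(e.Seq) // prints 3, 4, 5: events 1 and 2 were overwritten
	}
	fmt.Println("dropped:", r.Dropped()) // prints "dropped: 2"
}
```

Because the database is the source of truth, a subscriber that notices a gap in sequence numbers could fetch the missing events from `album_events` instead of relying on the ring. That gap-filling step is an assumption about how replay might be used and is not something the diagram states.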