podcast-mindmap

Author	SHA1	Message	Date
Dotty Dotter	d6ccea006a	#14/#15/#16/#17 backend: Endpoints for gaps, shifts, claims, and questions - /api/podcasts/{id}/episodes/{ep}/claims: claims made in an episode, optionally filtered by claim_type. - /api/podcasts/{id}/episodes/{ep}/questions: questions from the episode, filtered by type and answer status. - /api/podcasts/{id}/episodes/{ep}/analyses-summary: counters for the UI buttons (claims, questions, unanswered). - /api/analyses/gaps: gaps from data/gaps_analysis.json (#14), filtered by min_size and missing_in. - /api/analyses/shifts: narrative-shift drift from data/narrative_shifts.json (#15), filtered by podcast, theme, and min_drift. - Word timestamps via /api/podcasts/{id}/transcript/{ep}/words; the table is handled gracefully via _table_exists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 00:31:02 +02:00
Dotty Dotter	e678f75ee1	#8 Multi-podcast dashboard, #9 PWA, #10 cross-podcast links, #12 word timestamps - Backend: /api/compare endpoint for podcast comparison (stats, shared topics, top cross-connections), /api/.../words endpoint for word timestamps - Frontend: podcast comparison view with statistics and cross-links, cross-podcast search toggle, semantic links in the transcript (lazy-loaded), podcast switcher with back navigation - PWA: manifest.json, service worker (stale-while-revalidate for assets, network-first for API, cache-on-success for audio), icons - Scripts: transcribe_words.py (mlx-whisper batch transcription with word timestamps), import_words.py (imports word timestamps into the DB) - Dockerfile: copy PWA assets into the container Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 20:53:06 +02:00
Dotty Dotter	cb5978132c	Phase 2: Precomputed semantic similarity + API - precompute.py: computes the pairwise cosine similarity of all paragraphs and stores the top 10 neighbors per paragraph in the semantic_links table - API: /api/similar-precomputed/{podcast}/{episode}/{idx}: returns precomputed similar passages in <1ms - Tested: 728 paragraphs, 7144 links (threshold 0.55) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 21:23:31 +02:00
Dotty Dotter	b0649cea49	Phase 1+2: FastAPI backend, SQLite, embeddings, semantic search Phase 1: - FastAPI backend (backend/app.py) with REST API - SQLite database for podcasts, episodes, paragraphs, quotes - Auto-import from mindmap_data.json + srt_index.json on startup - Web app as SPA: API-first with static-file fallback - Audio as a mounted volume instead of in the Docker image - Docker Compose with Traefik labels Phase 2: - Qwen text-embedding-v3 via DashScope (1024-dim vectors) - Embedding of all transcript paragraphs (728 for NEU DENKEN) - Semantic search: /api/semantic-search?q=... - Similarity API: /api/similar/{podcast}/{episode}/{paragraph} - Cosine similarity on normalized vectors, <100ms - Finds thematically related passages across episodes, even when the wording is completely different Prepared for multi-podcast (#10): the data structure supports multiple podcasts, and cross-podcast similarity is a parameter. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 10:24:53 +02:00

4 Commits
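
The precomputation step described in commit cb5978132c (pairwise cosine similarity, top 10 neighbors per paragraph, threshold 0.55) could look roughly like the sketch below. It is not the code from precompute.py: the function names `normalize` and `top_neighbors` are hypothetical, the vectors are plain Python lists, and it assumes the threshold is applied before the top-k cut, which the commit message does not state.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def top_neighbors(embeddings, k=10, threshold=0.55):
    """Return {paragraph index: [(neighbor index, similarity), ...]}.

    On unit-length vectors the dot product equals the cosine similarity.
    Assumption: pairs below `threshold` are dropped first, then the k
    most similar remaining neighbors are kept per paragraph.
    """
    unit = [normalize(v) for v in embeddings]
    links = {}
    for i, a in enumerate(unit):
        scored = []
        for j, b in enumerate(unit):
            if i == j:
                continue  # a paragraph is not its own neighbor
            sim = sum(x * y for x, y in zip(a, b))
            if sim >= threshold:
                scored.append((j, sim))
        # Highest similarity first, then keep at most k entries.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        links[i] = scored[:k]
    return links
```

Each `(i, j, similarity)` entry would correspond to one row of a table such as semantic_links, so a lookup at request time becomes a simple indexed read instead of a similarity computation. The nested loop is quadratic in the number of paragraphs, which is unproblematic at the 728 paragraphs mentioned in the commit.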