podcast-mindmap

History

Dotty Dotter c5489eabaa #16/#17 match_answers.py and match_claims.py: cross-episode matching via embeddings + Qwen

scripts/match_answers.py (#17):
- Loads open questions (genuine, follow_up; answered='no').
- Embeds each question and searches for the best candidate paragraph from another episode (optionally cross-podcast) by cosine similarity over the paragraph embeddings.
- If score >= 0.55: Qwen verification 'Does B answer the question in A?' (yes/partial/no); on yes/partial, answered + answered_by_* are set in the questions table.
- Hard budget of 1.50 USD; --rerun re-sets existing matches.

scripts/match_claims.py (#16, stage 2):
- Analogous mechanism for claims: embedding, cosine search, Qwen verification on the four-level scale 'belegt' / 'widerspricht' / 'erweitert' / 'kein_bezug' (supports / contradicts / extends / no relation).
- Writes matches (excluding 'kein_bezug') to the new claim_matches table.
- By default only verifiable claims are processed (--include-non-verifiable flips that); --cross-podcast allows cross-podcast matches.

Both scripts use json_utils.parse_llm_json for robust parsing and are guarded against NaN vectors in the embeddings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>		2026-04-28 02:21:49 +02:00
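The matching step described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/match_answers.py or scripts/match_claims.py: only the 0.55 threshold, the cosine-similarity search, the NaN guard, and the four labels come from the commit message. The function names, the tuple layout of `paragraphs`, and the sample vectors are hypothetical, and the Qwen verification call is left out entirely.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple

THRESHOLD = 0.55  # score cutoff named in the commit message


def is_finite(vec: Sequence[float]) -> bool:
    """True if the vector is non-empty and contains no NaN/inf entries."""
    return len(vec) > 0 and all(math.isfinite(x) for x in vec)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def best_candidate(
    query_vec: Sequence[float],
    query_episode: str,
    paragraphs: Iterable[Tuple[str, str, Sequence[float]]],
) -> Optional[Tuple[str, float]]:
    """Return (paragraph_id, score) of the most similar paragraph from a
    different episode, or None. Hypothetical layout: each paragraph is
    (paragraph_id, episode_id, embedding)."""
    if not is_finite(query_vec):
        return None  # guard against NaN vectors on the query side
    best = None
    for pid, episode, vec in paragraphs:
        if episode == query_episode:
            continue  # only other episodes are candidates
        if len(vec) != len(query_vec) or not is_finite(vec):
            continue  # skip broken or mismatched embeddings
        score = cosine(query_vec, vec)
        if best is None or score > best[1]:
            best = (pid, score)
    return best


def needs_verification(match: Optional[Tuple[str, float]]) -> bool:
    """Only candidates at or above the threshold go on to LLM verification."""
    return match is not None and match[1] >= THRESHOLD


def keep_claim_match(label: str) -> bool:
    """Claim matches labelled 'kein_bezug' (no relation) are not stored."""
    return label in {"belegt", "widerspricht", "erweitert"}


# Made-up sample data for illustration only.
paragraphs = [
    ("p1", "ep1", [1.0, 0.0]),           # same episode as the query: skipped
    ("p2", "ep2", [0.9, 0.1]),
    ("p3", "ep3", [float("nan"), 1.0]),  # NaN embedding: skipped
    ("p4", "ep2", [0.0, 1.0]),
]
match = best_candidate([1.0, 0.0], "ep1", paragraphs)
print(match, needs_verification(match))  # p2 is the best candidate
```

Keeping the NaN check separate from the similarity function means one broken embedding is skipped instead of silently poisoning the comparison, since any comparison against NaN evaluates to false.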
..
analyse_arguments.py	#13/#18 Robust JSON parser + --rerun-errors mode	2026-04-28 00:30:45 +02:00
config.py	Initial commit: podcast-mindmap tool	2026-04-20 01:25:42 +02:00
convert_srt.py	Initial commit: podcast-mindmap tool	2026-04-20 01:25:42 +02:00
curate_debates.py	#13/#18 Robust JSON parser + --rerun-errors mode	2026-04-28 00:30:45 +02:00
detect_gaps.py	#12 Word-highlighting frontend, #14 gap detector, #15 narrative shift,	2026-04-23 22:29:41 +02:00
detect_narrative_shift.py	#12 Word-highlighting frontend, #14 gap detector, #15 narrative shift,	2026-04-23 22:29:41 +02:00
download_audio.py	Initial commit: podcast-mindmap tool	2026-04-20 01:25:42 +02:00
extract_claims.py	#12 Word-highlighting frontend, #14 gap detector, #15 narrative shift,	2026-04-23 22:29:41 +02:00
extract_questions.py	#12 Word-highlighting frontend, #14 gap detector, #15 narrative shift,	2026-04-23 22:29:41 +02:00
extract_quotes.py	extract_quotes.py: automatic quote extraction per episode via Qwen-plus	2026-04-28 00:30:54 +02:00
import_words.py	#8 Multi-podcast dashboard, #9 PWA, #10 cross-podcast links, #12 word timestamps	2026-04-23 20:53:06 +02:00
index_topics.py	#2 Obsidian links, #6 soundbite export, #7 timeline	2026-04-20 08:03:12 +02:00
json_utils.py	#13/#18 Robust JSON parser + --rerun-errors mode	2026-04-28 00:30:45 +02:00
match_answers.py	#16/#17 match_answers.py and match_claims.py: cross-episode matching via embeddings + Qwen	2026-04-28 02:21:49 +02:00
match_claims.py	#16/#17 match_answers.py and match_claims.py: cross-episode matching via embeddings + Qwen	2026-04-28 02:21:49 +02:00
match_quotes.py	Initial commit: podcast-mindmap tool	2026-04-20 01:25:42 +02:00
pipeline.py	Initial commit: podcast-mindmap tool	2026-04-20 01:25:42 +02:00
rerun_errors.py	#13/#18 Robust JSON parser + --rerun-errors mode	2026-04-28 00:30:45 +02:00
run_all_qwen.sh	#12 Word-highlighting frontend, #14 gap detector, #15 narrative shift,	2026-04-23 22:29:41 +02:00
transcribe_words.py	#8 Multi-podcast dashboard, #9 PWA, #10 cross-podcast links, #12 word timestamps	2026-04-23 20:53:06 +02:00