Newsscribe
Standalone local transcription and media-analysis scripts built around `whisper.cpp`, `ollama`, `ffmpeg`, and LM Studio.
One part of a portfolio focused on expressive interfaces and disciplined systems.
A branded visual for the privacy-first local transcription and media-analysis workflow at the center of Newsscribe.
Overview
Newsscribe is a 100% local toolchain for turning long-form video or audio into concise written briefs. The repo is essentially a collection of Python and shell utilities wrapped around `whisper.cpp`, `ffmpeg`, the Ollama / LM Studio LLM runners, and, optionally, the OpenAI API. I owned product design and engineering and shipped the first public release in 2026.
What The App Does
• `transcribe_video.sh` / `transcribe.sh`: uses `ffmpeg` to extract a clean mono WAV, then feeds it to `whisper.cpp` for full-resolution speech-to-text.
• `frame_grab_15s.py`: pulls a frame every 15 s to catch on-screen lower-thirds and slides.
• `ocr_frames_lmstudio.py`: runs OCR locally and pipes the text into a llama-based model hosted by LM Studio for context enrichment.
• `openai_fact_summary.py` (optional): sends the combined transcript/OCR bundle to OpenAI to generate a bullet-first "fact brief".
• Outputs are rendered via two Markdown templates (`SUMMARY_TEMPLATE.md`, `FACT_SUMMARY_TEMPLATE.md`) ready for newsroom CMS import.
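The two `ffmpeg` steps above (mono WAV extraction and the 15-second frame grab) can be sketched as plain command builders; this is a minimal sketch, not the repo's actual scripts, and the filenames and 16 kHz sample rate (whisper.cpp's expected input) are assumptions:

```python
import subprocess


def wav_extract_cmd(video: str, wav_out: str) -> list[str]:
    """ffmpeg command: strip video, downmix to 16 kHz mono WAV for whisper.cpp."""
    return ["ffmpeg", "-i", video, "-vn", "-ac", "1", "-ar", "16000", wav_out]


def frame_grab_cmd(video: str, out_pattern: str, interval_s: int = 15) -> list[str]:
    """ffmpeg command: save one frame every interval_s seconds for OCR."""
    return ["ffmpeg", "-i", video, "-vf", f"fps=1/{interval_s}", out_pattern]


if __name__ == "__main__":
    # Build the commands; run them with subprocess.run(cmd, check=True) once ffmpeg is installed.
    print(wav_extract_cmd("clip.mp4", "clip.wav"))
    print(frame_grab_cmd("clip.mp4", "frames/frame_%05d.png"))
```

Keeping the commands as lists (rather than shell strings) sidesteps quoting bugs on paths with spaces.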
Everything is driven from a dot-file (`.transcriberc`), so a single command can spin through a backlog with `batch_backfill_missing.sh`.
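A dot-file like this is simple enough to parse by hand; a minimal sketch, assuming `.transcriberc` uses plain `KEY=VALUE` lines with `#` comments (the actual key names in the repo may differ):

```python
def parse_rc(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    cfg: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


print(parse_rc("# defaults\nMODEL=llama3\nBACKEND=ollama\n"))
# → {'MODEL': 'llama3', 'BACKEND': 'ollama'}
```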
Product/UX Review
• Interaction surface: plain CLI. No GUI means zero learning curve for terminal-centric reporters but raises the bar for less technical staff.
• Speed: `whisper.cpp` in int8 mode delivers real-time transcription on an M2 and keeps the entire flow under newsroom deadlines.
• Privacy: keeping LLMs local (Ollama / LM Studio) satisfied legal that embargoed material never touches external servers; the OpenAI pass is clearly marked "optional."
• Documentation: the README and sample rc file are sufficient, though some flags are discoverable only by reading the scripts, an area for polish.
• Templates: Markdown summaries drop straight into the CMS, but there is no live preview.
Technical Architecture
Shell scripts handle orchestration; each heavy task is a discrete Python module so pieces can be unit-tested (pytest plus CI in .github/workflows/ci.yml).
Key components:
• whisper.cpp compiled with CPU optimisations, called by transcribe.sh.
• ffmpeg for demuxing and frame extraction.
• Ollama or LM Studio runs GGUF llama models behind a localhost REST endpoint.
• Python modules call those endpoints with plain JSON.
• Config and state are file-based (.transcriberc, Makefile, and a tiny JSON portfolio). No database.
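"Plain JSON over localhost" really is the whole integration surface; a minimal sketch against Ollama's default `/api/generate` endpoint on port 11434 (LM Studio instead exposes an OpenAI-compatible server, by default on port 1234), using only the standard library. The helper names are illustrative, not the repo's:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model: str, prompt: str) -> dict:
    """Non-streaming generation request, as a plain JSON-serialisable dict."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """POST the payload to the local runner and return the model's text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

Because the endpoint is plain HTTP+JSON, swapping runners is a URL change, not a client-library change.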
The design keeps the dependency graph shallow—no third-party packages listed in requirements.txt beyond standard library wrappers.
AI Techniques And Patterns
• Local inference: whisper.cpp for ASR and llama models for summarisation—no cloud GPU costs.
• Prompt templating: SUMMARY_TEMPLATE.md and FACT_SUMMARY_TEMPLATE.md store system messages; scripts interpolate transcript chunks before calling the model.
• Model routing: user can select LM Studio, Ollama, or OpenAI by flag—same prompt, different back-end.
• Guardrails: none bespoke; correctness is delegated to the optional OpenAI “fact brief.”
• Evaluation: compare_models.sh runs batch transcribes against multiple models for side-by-side WER inspection.
• Human-in-the-loop: final Markdown is meant for editorial review; no automatic publishing.
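The side-by-side inspection that `compare_models.sh` enables boils down to word error rate; a minimal Levenshtein-based WER as a sketch (the script's actual scoring and normalisation may differ):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1] / max(len(ref), 1)


print(wer("the embargo lifts at noon", "the embargo lifted at noon"))  # → 0.2
```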
What Was Learned
- Local LLMs are now performant enough to summarise hour-long transcripts on a laptop without thermal throttling.
- Whisper.cpp beats cloud ASR for both speed and cost at ≥medium model sizes but still struggles with crosstalk; extracting frames and OCRing lower-thirds materially improves entity recall.
- Simple file-based config kept onboarding friction low; over-engineering a web UI can wait until the workflow is proven.
- Optional cloud steps must be explicit; toggling them behind a flag avoided accidental data leakage.
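The explicit opt-in pattern can be as small as one argparse flag; a sketch of the idea (the actual flag name in the scripts may differ):

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Cloud steps are opt-in: off unless the operator passes the flag."""
    p = argparse.ArgumentParser(prog="openai_fact_summary")
    p.add_argument("--use-openai", action="store_true",
                   help="send the transcript/OCR bundle to OpenAI (default: stay local)")
    return p


args = make_parser().parse_args([])
print(args.use_openai)  # → False: nothing leaves the machine unless explicitly requested
```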
Strengths And Tradeoffs
Strengths
• Fully offline path meets strict confidentiality requirements.
• Modular scripts let newsrooms replace any step (e.g., swap in a faster LLM) without touching the others.
• Minimal dependencies make CI and multi-platform packaging trivial.
Tradeoffs / Limitations
• CLI-only interface excludes non-technical users.
• No bespoke guardrails or hallucination checks when using local llama models.
• Frame-based OCR adds ≈10% runtime overhead; batching frames into fewer model calls could reduce it.
• Lack of typed Python or logging abstractions means debugging still leans on print-style tracing.