Python · FastAPI · Make.com · Telegram

VidGist — Social Media Video Summariser

Python backend pipeline that converts Instagram and TikTok videos into structured, reusable notes via CLI, API, or Telegram automation.

Stack
Python · FastAPI · Make.com · Telegram API · yt-dlp · ffmpeg · OpenAI API
Status
Live

OVERVIEW

Given an Instagram or TikTok video URL, VidGist downloads the audio, transcribes the speech, and returns categorised notes: a summary, key insights, direct quotes, action steps, topics, and a confidence score. It runs in three modes: as a CLI for direct local use, as a FastAPI webhook for integration with automation tools, and as a conversational Telegram interface orchestrated through Make.com.
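
To make the note structure concrete, here is a minimal sketch of how the six categories might be modelled and validated in Python. The class name `VideoNotes`, the field names, the `parse_notes` helper, and the 0 to 1 range for the confidence score are all assumptions for illustration; the project's actual schema may differ.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class VideoNotes:
    """Hypothetical container for the six note categories."""
    summary: str
    key_insights: list[str] = field(default_factory=list)
    direct_quotes: list[str] = field(default_factory=list)
    action_steps: list[str] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)
    confidence: float = 0.0  # assumed to be a score between 0 and 1


LIST_FIELDS = ("key_insights", "direct_quotes", "action_steps", "topics")


def parse_notes(raw_json: str) -> VideoNotes:
    """Validate a JSON string returned by the model and build VideoNotes."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    summary = data.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        raise ValueError("summary must be a non-empty string")

    # Missing list fields default to empty lists; wrong types are rejected.
    lists = {}
    for name in LIST_FIELDS:
        value = data.get(name, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"{name} must be a list of strings")
        lists[name] = value

    confidence = data.get("confidence", 0.0)
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    return VideoNotes(summary=summary.strip(), confidence=float(confidence), **lists)
```

Validating the model output before returning it means a malformed response fails loudly inside the pipeline instead of reaching the CLI, the webhook caller, or the Telegram chat.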

THE PROBLEM

Short-form video on Instagram and TikTok is dense with useful information (tutorials, insights, advice, commentary), but manually extracting it into reusable notes is slow, inconsistent, and easy to abandon. There was no lightweight tool that could take a URL and return structured notes without a frontend, a subscription, or manual effort.

WHAT WAS BUILT

  • URL ingestion pipeline — accepts Instagram and TikTok links directly as input
  • Audio extraction — downloads and extracts audio from video URLs using yt-dlp and ffmpeg
  • Speech-to-text transcription — converts extracted audio to clean text via OpenAI transcription models
  • Structured note extraction — LLM processes the raw transcript and returns output in a strict JSON schema: summary, key insights, direct quotes, action steps, topics, and confidence score
  • Provider-flexible summarisation layer — OpenAI as the default summarisation provider, with Claude as an optional alternative — switchable without changing the pipeline architecture
  • FastAPI webhook service — exposes /health and /summarize endpoints, making the pipeline pluggable into external workflows without building a frontend
  • Make.com automation workflow — a no-code orchestration layer that provides a complete end-to-end user interface via Telegram: send a video URL to the bot, Make.com calls the FastAPI endpoint, formats the structured response, and returns the notes directly in the chat. No frontend, no app, no friction.
  • CLI mode — runs the full pipeline locally from the command line for direct personal use
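
The components listed above can be wired together in a few lines. The sketch below is not the project's code: the registry `PROVIDERS`, the `register` decorator, `audio_command`, and `run_pipeline` are hypothetical names, and the two summarisers are stubs standing in for real OpenAI and Claude API calls. The yt-dlp flags shown (`-x`, `--audio-format`, `-o`) are real options; yt-dlp relies on ffmpeg to perform the audio extraction.

```python
from __future__ import annotations

import json
from typing import Callable, Dict

# A summariser takes a transcript and returns the model's raw JSON string.
Summariser = Callable[[str], str]

# Hypothetical registry mapping a provider name to its summariser.
PROVIDERS: Dict[str, Summariser] = {}


def register(name: str) -> Callable[[Summariser], Summariser]:
    """Decorator that adds a summariser to the registry under `name`."""
    def decorator(fn: Summariser) -> Summariser:
        PROVIDERS[name] = fn
        return fn
    return decorator


def audio_command(url: str, out_template: str = "audio.%(ext)s") -> list[str]:
    """Build a yt-dlp command that downloads a URL and extracts mp3 audio."""
    return ["yt-dlp", "-x", "--audio-format", "mp3", "-o", out_template, url]


@register("openai")
def summarise_openai(transcript: str) -> str:
    # Stub: a real version would call the OpenAI API here.
    return json.dumps({"summary": transcript[:40], "provider": "openai"})


@register("claude")
def summarise_claude(transcript: str) -> str:
    # Stub: a real version would call the Claude API here.
    return json.dumps({"summary": transcript[:40], "provider": "claude"})


def run_pipeline(
    url: str,
    download: Callable[[str], str],    # URL -> path of extracted audio
    transcribe: Callable[[str], str],  # audio path -> transcript text
    provider: str = "openai",
) -> str:
    """Run download -> transcribe -> summarise with the chosen provider."""
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    audio_path = download(url)
    transcript = transcribe(audio_path)
    return PROVIDERS[provider](transcript)
```

Because the download and transcription steps are passed in as callables, the same `run_pipeline` function could sit behind the CLI and behind the `/summarize` endpoint, and switching from OpenAI to Claude is a change of one argument.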

THE INSIGHT

The best solution to a problem is not always a new app. Most people who want to save information from social videos either do it manually (pausing, rewinding, typing notes) or do not do it at all because the friction is too high. The obvious answer looks like: build a tool. Design an interface. Ship a product. The better answer was to ask: what do I already use every day, and how do I make those things work together?

Telegram was already there. Make.com was already there. The AI APIs were already there. VidGist is what happens when you connect existing tools intelligently instead of reinventing the wheel: a fully working, daily-use pipeline assembled from things that already existed, solving a real bottleneck without a single line of frontend code. That is the principle: find the gap between tools that already work, and fill it with the minimum viable logic to connect them.

STACK & TOOLS

Python · FastAPI · Make.com · Telegram API · yt-dlp · ffmpeg · OpenAI API