A ready-to-use, minimal app that converts any speech into text.
The main repo for Stage Whisper — a free, secure, and easy-to-use transcription app for journalists, powered by OpenAI's Whisper automatic speech recognition (ASR) machine learning models.
VOXD is speech-to-text, voice-typing, and dictation software for Linux distributions. It is open-source, free of charge, and user-friendly, and aims to support as many Linux distros as possible.
Online, front-end spectrum analysis for music transcription.
OpenAI/ChatGPT library for Java - Requires JDK 11 at minimum.
Explores automatic music transcription (AMT) from the perspective of timbre.
🎙️ AI-powered Telegram bot for voice-to-text transcription using OpenAI Whisper. CPU-only, no GPU required, privacy-focused with local processing.
A practical guide for non-technical users on how to use OpenAI's Whisper for transcription and translation.
Automatically generate accurate, per-word video captions with timestamps using Whisper ASR and FFmpeg, perfect for YouTube, social media, and accessibility.
Offline, privacy-first screen recorder with local AI transcription and smart summaries. Built with Electron, React, and TypeScript—capture desktop video, auto-generate transcripts, and get instant AI-powered meeting and lesson insights, all cross-platform and fully customizable.
"An offline video & audio transcription tool powered by OpenAI Whisper. Convert your tutorials, lectures, and podcasts into accurate text transcripts and use AI to generate summaries, notes, and mind maps — saving hours of time and boosting productivity."
🚀📜 Customized For Agentic AI: Enhanced the Whisper Assistant extension with improved setup scripts and documentation, ensuring seamless integration and functionality on Linux platforms.
Turn podcast audio into shareable videos. Upload audio, generate subtitles with AI, and export YouTube-ready 1920×1080 MP4 videos with animated waveforms... all in your browser. No API keys, no backend required.
An AI-powered intelligence platform to reverse-engineer short-form virality. Identify high-performing outliers and extract winning content strategies across Instagram, TikTok, and YouTube.
AI MOM is an AI-powered meeting intelligence platform that delivers real-time transcription, speaker recognition, and multi-LLM summaries using FastAPI, Whisper, Groq, and OpenRouter for intelligent meeting insights.
Offline macOS speech-to-text app powered by Whisper.cpp. Fully private, runs entirely on-device.
Flick is a powerful AI-driven SaaS platform for real-time video sharing and collaboration, crafted for both web and desktop environments. Designed for seamless video recording, streaming, and sharing without third-party dependencies, Flick offers teams and individuals an integrated workspace to create, manage, and share video content in real-time.
Open Video Transcribe - Open-source video transcription tool that emphasizes the primary use case: transcribing video files to text with support for multiple model types.
A Python script that automatically generates subtitles for YouTube videos.
Lightning-fast audio transcription (6x speed) with batch processing, Obsidian integration, and optimized real-time performance. Powered by faster-whisper and Distil-Whisper models.