A macOS desktop app that automatically detects Microsoft Teams meetings, transcribes them locally with Apple Silicon GPU acceleration, and produces structured AI-powered summaries — completely offline and invisible to other participants.
Features · How it works · Screenshots · Getting started · Usage · Configuration · API · Development · Roadmap
MeetingMind runs silently in the background, watching for active Teams calls. When a meeting starts, it captures both sides of the conversation — remote participants via system audio loopback and your voice via the microphone — then runs the recording through on-device speech-to-text and an AI summariser to produce structured notes with action items, decisions, and key topics.
Everything runs locally on your Mac. No cloud transcription, no bots joining calls, no data leaving your machine (unless you opt into the Claude API for summarisation).
| Format | Description |
|---|---|
| Markdown | Obsidian-compatible .md files with YAML frontmatter (Dataview-queryable) |
| Notion | Native database pages with headings, bullets, and to-do items |
| JSON / Clipboard | Export any meeting via the API or desktop app |
| Backend | Cost | Privacy | Quality |
|---|---|---|---|
| Ollama | Free | Fully local | Good (depends on model) |
| Claude API | API credits | Transcript sent to Anthropic | Excellent |
```mermaid
flowchart TB
subgraph Daemon["MeetingMind Daemon · Python"]
direction TB
Detect["🔍 Detector<br/><sub>pgrep · lsof · osascript</sub>"]
Capture["🎙️ Audio Capture<br/><sub>BlackHole + Microphone</sub>"]
Transcribe["📝 Transcriber<br/><sub>MLX Whisper · Apple Silicon GPU</sub>"]
Diarise["👥 Diariser<br/><sub>Energy RMS / PyAnnote</sub>"]
Summarise["🤖 Summariser<br/><sub>Ollama / Claude API</sub>"]
Templates["📋 Templates<br/><sub>standup · retro · 1:1 · custom</sub>"]
API["🌐 REST + WebSocket API<br/><sub>FastAPI · 127.0.0.1:9876</sub>"]
DB[("💾 SQLite + FTS5<br/><sub>meetings · transcripts · embeddings</sub>")]
Output["📤 Output Writers<br/><sub>Markdown · Notion</sub>"]
Detect --> Capture --> Transcribe --> Diarise --> Summarise --> Output
Templates -.-> Summarise
Summarise --> DB
API <--> DB
end
subgraph UI["MeetingMind Desktop App · Tauri v2 + React"]
direction LR
Dash["📊 Dashboard"]
Meet["📋 Meetings"]
Search["🔍 Search"]
Live["🎤 Live View"]
Settings["⚙️ Settings"]
Onboard["🚀 Onboarding"]
Tray["🖥️ System Tray"]
end
API <-->|"REST · WebSocket"| UI
style Daemon fill:#0f172a,stroke:#334155,color:#e2e8f0
style UI fill:#0f172a,stroke:#334155,color:#e2e8f0
```
```mermaid
sequenceDiagram
participant D as TeamsDetector
participant A as AudioCapture
participant T as Transcriber (MLX)
participant S as Diariser
participant M as Summariser
participant DB as SQLite + FTS5
participant WS as WebSocket
participant UI as Desktop App
D->>D: Poll every 3s (pgrep, lsof, osascript)
D->>A: Meeting detected (3 consecutive polls)
A->>A: Record BlackHole + Mic to separate WAVs
Note over A: Independent streams avoid clock drift
D->>A: Meeting ended
A->>A: RMS normalise + merge source files
A->>T: merged.wav
T->>T: MLX Whisper on Apple Silicon GPU
T->>S: Transcript with timestamps
S->>S: Energy RMS or PyAnnote labelling
S->>M: Labelled transcript
M->>M: Generate summary (Ollama/Claude)
M->>DB: Store meeting + transcript + summary
DB->>WS: Broadcast "meeting_complete" event
WS->>UI: Real-time update
```
Teams notifies participants when:
- A recording is started via the Teams UI
- A bot joins the meeting
MeetingMind does neither. It captures your local system audio via a loopback driver (BlackHole), which is functionally identical to listening through your speakers. No network traffic, no bot, no Teams API calls — from everyone else's perspective, nothing has changed.
Note
Recording meetings may have legal implications depending on your jurisdiction. Many regions operate under "one-party consent" laws, meaning you can record a conversation you participate in. Verify the laws and policies that apply to you before use.
- Native macOS app — Tauri v2 desktop shell with system tray, dark/light themes, and native notifications
- Meeting history — Browse, search, filter, and label all recorded meetings with full-text search (FTS5)
- Semantic search — Vector-based search across transcripts using sentence-transformers embeddings
- Live view — Real-time audio level meters, pipeline progress, and streaming transcript during recording
- Audio player — Waveform visualisation with playback controls, speed adjustment, and click-to-seek from transcript
- Settings UI — Full configuration through the app — no YAML editing required
- Command palette — `⌘K` to quickly search meetings, start recording, or jump to settings
- Onboarding wizard — Guided setup for BlackHole, audio devices, permissions, and model downloads
- Model management — Download and manage MLX Whisper models with real-time progress tracking
- Summary templates — Built-in templates (standard, standup, retro, 1:1, client-call) plus custom templates
- Meeting merge — Combine split meetings that were recorded as separate sessions
- Speaker management — Rename speaker labels across meetings
- Export — Export meetings as Markdown, JSON, or copy to clipboard
- Auto-updates — Built-in update checking via GitHub Releases
- Accessible — WCAG AA compliant with full keyboard navigation and screen reader support
- Automatic detection — Monitors macOS process state and audio device usage with debounce to detect live Teams calls without manual intervention or false positives
- Dual-source audio — Records system audio (remote participants) and microphone (your voice) to separate files, then merges with RMS normalisation so both sides are equally audible
- Apple Silicon GPU transcription — Uses MLX Whisper for ~10x faster on-device speech-to-text via the MLX framework
- Speaker diarisation — Two backends: energy-based RMS comparison (no ML, zero dependencies) or PyAnnote (ML-based, multi-speaker, requires torch)
- AI summarisation — Produces structured summaries with title, key decisions, detailed action items (with owners, deadlines, subtasks), open questions, and topic tags
- Summary templates — User-definable prompts for different meeting types (standups, retros, 1:1s, client calls)
- Reprocessing — Re-run transcription and summarisation on any meeting from its existing audio file
- Data retention — Configurable auto-cleanup of old audio files and meeting records
- Obsidian integration — Markdown output with YAML frontmatter designed for Obsidian Dataview queries
- Notion integration — Creates native Notion database pages with proper headings, bullets, and to-do blocks
- Export — Export meetings as Markdown, JSON, or copy to clipboard from the desktop app or API
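The semantic search feature listed above embeds transcript text with sentence-transformers (the tech stack section names `all-MiniLM-L6-v2`). A minimal sketch of that general technique, independent of MeetingMind's actual `src/embeddings.py`:

```python
# Minimal semantic-search sketch with sentence-transformers.
# Illustrates the general technique only; MeetingMind's real implementation
# (src/embeddings.py) may differ in model handling, storage, and ranking.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

segments = [
    "So the quarterly numbers show a 15% increase.",
    "We should focus on the enterprise segment.",
    "Agreed. Let's draft the proposal by Friday.",
]

# Embed the transcript segments once, then embed each query at search time.
corpus = model.encode(segments, convert_to_tensor=True)
query = model.encode("what was decided about the proposal?", convert_to_tensor=True)

# Cosine similarity ranks segments by relevance to the query.
for hit in util.semantic_search(query, corpus, top_k=2)[0]:
    print(f"{hit['score']:.3f}  {segments[hit['corpus_id']]}")
```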
MeetingMind records system audio and microphone to separate WAV files using independent audio streams, then merges them after capture with RMS normalisation.
```mermaid
flowchart LR
BH["BlackHole 2ch<br/><sub>System audio</sub>"] --> S1["system.wav"]
MIC["Microphone<br/><sub>Your voice</sub>"] --> S2["mic.wav"]
S1 --> NORM["RMS Normalisation<br/><sub>Balance volume levels</sub>"]
S2 --> NORM
NORM --> MERGE["merged.wav<br/><sub>Both sides, equal volume</sub>"]
S1 -.->|"if diarisation enabled"| DIAR["Diariser<br/><sub>Compare energy per segment</sub>"]
S2 -.-> DIAR
style BH fill:#1e3a5f,stroke:#334155,color:#e2e8f0
style MIC fill:#1e3a5f,stroke:#334155,color:#e2e8f0
```
Why separate files?
| Reason | Explanation |
|---|---|
| Eliminates clock drift | Two hardware devices (e.g. BlackHole + USB headset) run on independent clocks. Real-time mixing causes progressive desync. Separate files avoid this entirely. |
| Balances volume | System audio is typically much quieter than a close-range mic. Post-capture normalisation brings both to the same RMS level. |
| Enables diarisation | The separate source files allow energy comparison to determine who was speaking in each segment. |
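A minimal sketch of that balance-and-merge step, assuming two mono 16 kHz WAVs of roughly equal length (the real logic lives in `src/audio_capture.py` and may handle padding, resampling, and clipping differently):

```python
# Illustrative RMS-normalise-and-merge step for two mono 16 kHz recordings.
# The real implementation (src/audio_capture.py) may differ in detail.
import numpy as np
import soundfile as sf

def rms(x: np.ndarray) -> float:
    return float(np.sqrt(np.mean(np.square(x))))

def merge_normalised(system_path: str, mic_path: str, out_path: str) -> None:
    system, sr = sf.read(system_path, dtype="float32")
    mic, _ = sf.read(mic_path, dtype="float32")

    # Trim to the shorter recording so the arrays line up sample-for-sample.
    n = min(len(system), len(mic))
    system, mic = system[:n], mic[:n]

    # Scale both sources to a common target RMS so neither side dominates.
    target = 0.1
    system = system * (target / max(rms(system), 1e-9))
    mic = mic * (target / max(rms(mic), 1e-9))

    # Sum and clip to stay within full scale.
    sf.write(out_path, np.clip(system + mic, -1.0, 1.0), sr)
```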
MeetingMind supports two diarisation backends:
Zero dependencies. Compares RMS energy between the system-audio and microphone recordings for each transcript segment (see the sketch after this list):
- Mic significantly louder → "Me" (you were speaking)
- System significantly louder → "Remote" (another participant)
- Both similar → "Me + Remote" (crosstalk)
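A sketch of that per-segment comparison (illustrative; the real backend in `src/diariser.py` drives the threshold from `diarisation.energy_ratio_threshold`):

```python
# Illustrative energy-based labelling for one transcript segment.
# Assumes mono float32 audio arrays from the two source WAVs; the real
# backend (src/diariser.py) reads the threshold and labels from config.
import numpy as np

def label_segment(system_audio: np.ndarray, mic_audio: np.ndarray,
                  start_s: float, end_s: float,
                  sample_rate: int = 16000, ratio_threshold: float = 1.5) -> str:
    lo, hi = int(start_s * sample_rate), int(end_s * sample_rate)
    sys_rms = float(np.sqrt(np.mean(np.square(system_audio[lo:hi])))) + 1e-9
    mic_rms = float(np.sqrt(np.mean(np.square(mic_audio[lo:hi])))) + 1e-9

    if mic_rms / sys_rms >= ratio_threshold:
        return "Me"            # your mic clearly dominates this segment
    if sys_rms / mic_rms >= ratio_threshold:
        return "Remote"        # system audio clearly dominates
    return "Me + Remote"       # similar energy on both sides: crosstalk
```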
ML-based speaker diarisation via pyannote.audio. Requires torch and a HuggingFace token. Distinguishes multiple remote speakers by voice characteristics.
```text
[00:01:23] [Remote] So the quarterly numbers show a 15% increase...
[00:01:45] [Me] Right, and I think we should focus on the enterprise segment.
[00:02:10] [Remote] Agreed. Let's draft the proposal by Friday.
```
Tip
Diarisation works best with headsets that isolate your mic from system audio. Open speakers cause crosstalk and reduce accuracy.
| Requirement | Purpose | Install |
|---|---|---|
| macOS (Apple Silicon) | MLX Whisper requires Apple Silicon GPU | — |
| Python 3.11+ | Daemon runtime | brew install python@3.11 |
| BlackHole 2ch | Virtual audio loopback driver | brew install blackhole-2ch |
| Ollama (recommended) | Free local summarisation | brew install ollama |
| Node.js 20+ | Build desktop app (dev only) | brew install node |
| Rust (stable) | Build Tauri shell (dev only) | curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \| sh |
BlackHole creates a virtual audio device that captures system audio via loopback.
```bash
brew install blackhole-2ch
```
After installation, create a Multi-Output Device in Audio MIDI Setup:
- Open Audio MIDI Setup (Spotlight → "Audio MIDI Setup")
- Click + → Create Multi-Output Device
- Check both your real speakers/headphones and BlackHole 2ch
- Set your real device as the clock source
- Set this Multi-Output Device as your system output (System Settings → Sound → Output)
Important
If you use a USB headset or external speakers, also configure Teams to use the Multi-Output Device as its speaker output (Teams → Settings → Devices → Speaker). If Teams sends audio directly to your headset, BlackHole won't capture it.
```bash
brew install ollama
ollama pull qwen3:30b-a3b   # or llama3.1:8b for lighter hardware
ollama serve
```
Alternatively, set `backend: "claude"` in config and provide an Anthropic API key.
```bash
git clone https://github.com/JWhite212/meeting-mind.git
cd meeting-mind
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp config.example.yaml config.yaml   # then edit with your values
```
Or use the Makefile:
```bash
make setup   # Creates venv, installs Python + UI dependencies
```
Edit `config.yaml` — the essential settings:
| Setting | Description | Default |
|---|---|---|
| `summarisation.backend` | `"ollama"` (free, local) or `"claude"` (API) | `ollama` |
| `summarisation.ollama_model` | Ollama model name | `qwen3:30b-a3b` |
| `summarisation.anthropic_api_key` | Anthropic key (Claude backend only) | — |
| `audio.blackhole_device_name` | BlackHole device name | `BlackHole 2ch` |
| `audio.mic_device_name` | Microphone (empty = system default) | `""` |
| `diarisation.enabled` | Label speakers as "Me" / "Remote" | `false` |
| `diarisation.backend` | `"energy"` (no ML) or `"pyannote"` (ML) | `energy` |
| `markdown.vault_path` | Obsidian vault meetings folder | `~/Documents/Meetings` |
See config.example.yaml for the full reference with all options documented.
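Internally, the daemon maps this YAML onto typed configuration objects (`src/utils/config.py` in the project layout further down). A reduced sketch of that pattern, covering only two sections and ignoring keys the sketch does not model:

```python
# Reduced YAML-to-dataclass sketch; the real classes in src/utils/config.py
# cover every section of config.example.yaml.
from dataclasses import dataclass, fields
from pathlib import Path
import yaml

@dataclass
class SummarisationConfig:
    backend: str = "ollama"                  # "ollama" or "claude"
    ollama_model: str = "qwen3:30b-a3b"

@dataclass
class AudioConfig:
    blackhole_device_name: str = "BlackHole 2ch"
    mic_device_name: str = ""                # empty = system default

def _build(cls, section: dict):
    # Drop YAML keys this reduced sketch doesn't model.
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in section.items() if k in known})

def load_config(path: str = "config.yaml"):
    raw = yaml.safe_load(Path(path).read_text()) or {}
    return (_build(SummarisationConfig, raw.get("summarisation", {})),
            _build(AudioConfig, raw.get("audio", {})))
```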
```bash
python3 -m src.main
```
Polls for active Teams calls and automatically starts/stops recording. Detection uses debounce (3 consecutive positive polls over ~9 seconds) to prevent false positives.
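A stripped-down sketch of that debounce rule (it only checks for a Teams process, whereas the real detector in `src/detector.py` combines pgrep, lsof, and osascript signals and also handles meeting end, minimum duration, and the cool-down between meetings):

```python
# Stripped-down debounce sketch: require N consecutive positive polls before
# declaring a meeting. Checks only for a running Teams process; the real
# detector (src/detector.py) uses additional signals and a full state machine.
import subprocess
import time

def teams_running() -> bool:
    # pgrep exits 0 if any process matches the given name.
    return subprocess.run(["pgrep", "-x", "MSTeams"], capture_output=True).returncode == 0

def wait_for_meeting(required: int = 3, poll_interval: float = 3.0) -> None:
    consecutive = 0
    while consecutive < required:
        consecutive = consecutive + 1 if teams_running() else 0
        time.sleep(poll_interval)
    print("Meeting detected - start capture")

wait_for_meeting()
```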
```bash
python3 -m src.main --record-now
```
Starts recording immediately without waiting for Teams detection. Press Ctrl+C to stop — the recording is then transcribed and summarised.
```bash
python3 -m src.main --process /path/to/audio.wav
```
Skips recording and runs an existing audio file through the full pipeline.
```bash
make dev   # Start Tauri + Vite dev server
```
Or for production:
```bash
make build     # Build daemon binary + Tauri .app/.dmg
make install   # Build + install launch agent
```
To install the launch agent manually (auto-start on login):
```bash
cp com.meetingmind.agent.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.meetingmind.agent.plist
```
Each meeting produces a file like:
`~/Documents/Meetings/2026-04-08_quarterly-planning-review.md`
```markdown
---
title: "Quarterly Planning Review"
date: 2026-04-08
time: 14:30
duration_minutes: 45
tags: ["roadmap", "hiring", "q3-planning"]
type: meeting-note
---
```
Followed by the AI-generated summary with sections for:
- Summary — High-level overview of what was discussed and why it matters
- Key Decisions — Decisions made during the meeting
- Action Items — Each item includes full context, owner, deadline, and subtasks
- Open Questions — Unresolved topics for follow-up
- Full Transcript — Timestamped and speaker-labelled (`[00:01:23] [Remote] So the quarterly numbers...`)
A new page is created in your configured Notion database with:
- Properties: Title, Date, Tags (multi-select), Status
- Content: Native Notion blocks — headings, bullets, and to-do items (not raw Markdown)
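A minimal sketch of creating such a page with the official `notion-client` SDK (illustrative only: property names must match your database schema, the token and database ID are placeholders, and MeetingMind's real writer is `src/output/notion_writer.py`):

```python
# Minimal Notion page-creation sketch using the notion-client SDK.
# Property names ("Name", "Date", "Tags") must match your database schema;
# the token and database ID below are placeholders.
from notion_client import Client

notion = Client(auth="ntn_...")  # your Notion integration token

notion.pages.create(
    parent={"database_id": "<your-database-id>"},
    properties={
        "Name": {"title": [{"text": {"content": "Quarterly Planning Review"}}]},
        "Date": {"date": {"start": "2026-04-08"}},
        "Tags": {"multi_select": [{"name": "roadmap"}, {"name": "q3-planning"}]},
    },
    children=[
        {"object": "block", "type": "heading_2",
         "heading_2": {"rich_text": [{"text": {"content": "Key Decisions"}}]}},
        {"object": "block", "type": "bulleted_list_item",
         "bulleted_list_item": {"rich_text": [{"text": {"content": "Focus on the enterprise segment"}}]}},
        {"object": "block", "type": "to_do",
         "to_do": {"rich_text": [{"text": {"content": "Draft the proposal by Friday"}}], "checked": False}},
    ],
)
```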
Full config.example.yaml
```yaml
# Meeting Detection
detection:
  poll_interval_seconds: 3
  min_meeting_duration_seconds: 30
  required_consecutive_detections: 3
  min_gap_before_new_meeting: 60
  process_names:
    - "Microsoft Teams"
    - "MSTeams"
    - "Teams"

# Audio Capture
audio:
  blackhole_device_name: "BlackHole 2ch"
  mic_device_name: ""
  mic_enabled: true
  mic_volume: 1.0
  system_volume: 1.0
  sample_rate: 16000
  channels: 1
  temp_audio_dir: "/tmp/meetingmind"
  keep_source_files: false

# Transcription (MLX Whisper — Apple Silicon GPU)
transcription:
  model_size: "mlx-community/whisper-large-v3-turbo"
  language: "en"                 # "auto" for language detection
  vad_threshold: 0.35

# Summarisation
summarisation:
  backend: "ollama"
  ollama_base_url: "http://localhost:11434"
  ollama_model: "qwen3:30b-a3b"
  ollama_timeout: 600
  ollama_num_ctx: 32768
  anthropic_api_key: "sk-ant-..."
  model: "claude-sonnet-4-20250514"
  max_tokens: 4096
  chunk_threshold_words: 20000
  default_template: "standard"   # standard | standup | retro | 1on1 | client-call

# Speaker Diarisation
diarisation:
  enabled: false
  speaker_name: "Me"
  remote_label: "Remote"
  energy_ratio_threshold: 1.5
  backend: "energy"              # "energy" or "pyannote"
  pyannote_model: "pyannote/speaker-diarization-3.1"
  num_speakers: 0                # 0 = auto-detect (pyannote)

# Output: Markdown
markdown:
  enabled: true
  vault_path: "~/Documents/Meetings"
  filename_template: "{date}_{slug}.md"
  include_full_transcript: true

# Output: Notion
notion:
  enabled: false
  api_key: "ntn_..."
  database_id: ""
  properties:
    title: "Name"
    date: "Date"
    tags: "Tags"
    status: "Status"

# Data Retention
retention:
  audio_retention_days: 0        # 0 = keep forever
  record_retention_days: 0

# API Server
api:
  enabled: true
  host: "127.0.0.1"
  port: 9876

# Logging
logging:
  level: "INFO"
  log_file: "~/Library/Logs/meetingmind.log"
```
The daemon exposes a REST + WebSocket API at `http://127.0.0.1:9876`. Interactive Swagger docs available at `/docs` when running.
All endpoints require HMAC token authentication. The WebSocket uses a message-based auth handshake.
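For example, a quick status check from Python (the Authorization header below is an assumption about the token scheme; check the Swagger docs at `/docs` on the running daemon for the actual format):

```python
# Quick REST check against the local daemon. The Authorization header is an
# ASSUMPTION about the HMAC token scheme; see /docs on the running daemon
# for the actual authentication format.
import requests

BASE = "http://127.0.0.1:9876"
TOKEN = "<your-daemon-token>"  # placeholder
headers = {"Authorization": f"Bearer {TOKEN}"}

print(requests.get(f"{BASE}/api/health", headers=headers, timeout=5).json())
print(requests.get(f"{BASE}/api/status", headers=headers, timeout=5).json())
```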
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/health` | Health check |
| `GET` | `/api/status` | Daemon state + active meeting info |
| `GET` | `/api/meetings` | List meetings (paginated, filterable) |
| `GET` | `/api/meetings/{id}` | Get meeting by ID |
| `DELETE` | `/api/meetings/{id}` | Delete meeting and audio |
| `GET` | `/api/meetings/{id}/audio` | Download meeting audio file |
| `PATCH` | `/api/meetings/{id}/label` | Set meeting label |
| `POST` | `/api/meetings/merge` | Merge multiple meetings |
| `GET` | `/api/meetings/labels` | Get distinct labels |
| `POST` | `/api/meetings/{id}/resummarise` | Re-summarise with different template |
| `POST` | `/api/meetings/{id}/reprocess` | Reprocess from audio (transcribe + summarise) |
| `POST` | `/api/record/start` | Start recording manually |
| `POST` | `/api/record/stop` | Stop recording |
| `GET` | `/api/config` | Get current configuration |
| `PUT` | `/api/config` | Update configuration |
| `GET` | `/api/devices` | List audio input/output devices |
| `GET` | `/api/models` | List available Whisper models |
| `POST` | `/api/models/download` | Download a Whisper model |
| `POST` | `/api/search` | Full-text + semantic search |
| `POST` | `/api/search/reindex` | Rebuild search index |
| `GET` | `/api/templates` | List summary templates |
| `GET` | `/api/templates/{name}` | Get template by name |
| `POST` | `/api/templates` | Create custom template |
| `DELETE` | `/api/templates/{name}` | Delete custom template |
| `PATCH` | `/api/meetings/{id}/speakers/{sid}` | Rename speaker |
| `GET` | `/api/meetings/{id}/speakers` | Get speakers for meeting |
| `GET` | `/api/speakers` | Get all known speakers |
| `POST` | `/api/export/{id}` | Export meeting (Markdown/JSON) |
| `WS` | `/ws` | Real-time events (meeting updates, progress) |
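A sketch of listening for those real-time events with the `websockets` package (the auth message format is an assumption; the actual message-based handshake is defined by the daemon in `src/api/websocket.py`):

```python
# Listen for daemon events on the /ws endpoint using the `websockets` package.
# The auth message below is an ASSUMPTION about the message-based handshake;
# the real shape is defined by the daemon (src/api/websocket.py).
import asyncio
import json
import websockets

async def listen(token: str = "<your-daemon-token>") -> None:
    async with websockets.connect("ws://127.0.0.1:9876/ws") as ws:
        await ws.send(json.dumps({"type": "auth", "token": token}))  # assumed handshake
        async for raw in ws:
            event = json.loads(raw)
            print(event.get("type"), event)  # e.g. "meeting_complete"

asyncio.run(listen())
```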
```text
meeting-mind/
├── config.example.yaml # Config template (tracked)
├── meetingmind.spec # PyInstaller build spec
├── pyproject.toml # Project metadata, pytest & ruff config
├── Makefile # Build automation (setup, build, test, lint)
├── com.meetingmind.agent.plist # launchd agent for auto-start on login
│
├── src/ # Python daemon
│ ├── main.py # Orchestrator (MeetingMind class)
│ ├── detector.py # State machine + debounce logic
│ ├── audio_capture.py # Dual-source recording + RMS merge
│ ├── transcriber.py # MLX Whisper speech-to-text
│ ├── diariser.py # Energy-based speaker labelling
│ ├── pyannote_diariser.py # PyAnnote ML-based diarisation
│ ├── summariser.py # AI summarisation (Ollama / Claude)
│ ├── templates.py # Summary template system
│ ├── embeddings.py # Semantic search (sentence-transformers)
│ ├── platform/
│ │ ├── detector.py # PlatformDetector protocol + factory
│ │ ├── macos.py # macOS detection (pgrep, lsof, osascript)
│ │ ├── linux.py # Linux stub
│ │ └── windows.py # Windows stub
│ ├── api/
│ │ ├── server.py # FastAPI app + background thread
│ │ ├── auth.py # HMAC token authentication
│ │ ├── schemas.py # Pydantic response models
│ │ ├── events.py # EventBus (sync/async pub-sub)
│ │ ├── websocket.py # WebSocket connection manager
│ │ └── routes/ # 14 route modules (see API Reference)
│ ├── db/
│ │ ├── database.py # SQLite + FTS5 schema + migrations
│ │ └── repository.py # Meeting CRUD, search, retention cleanup
│ ├── output/
│ │ ├── markdown_writer.py # Obsidian-compatible .md output
│ │ └── notion_writer.py # Notion database page output
│ └── utils/
│ └── config.py # YAML config → typed dataclasses
│
├── ui/ # Tauri v2 + React desktop app
│ ├── src/
│ │ ├── App.tsx # Router + layout
│ │ ├── components/
│ │ │ ├── dashboard/ # Stats, recent meetings, quick actions
│ │ │ ├── meetings/ # MeetingList, MeetingDetail, AudioPlayer
│ │ │ ├── search/ # Full-text + semantic search UI
│ │ │ ├── live/ # Real-time transcript + audio meters
│ │ │ ├── settings/ # All config sections
│ │ │ ├── onboarding/ # Setup wizard (permissions, devices, models)
│ │ │ ├── layout/ # Sidebar navigation
│ │ │ └── common/ # Toast, Skeleton, Tooltip, CommandPalette
│ │ ├── hooks/ # useDaemonStatus, useWebSocket, useTheme
│ │ ├── stores/ # Zustand state management
│ │ └── lib/ # API client, types, constants
│ └── src-tauri/
│ ├── src/lib.rs # Tauri commands (auth, updates, daemon)
│ ├── src/tray.rs # System tray with dynamic menu
│ └── tauri.conf.json # App config, bundling, updater
│
├── tests/ # 31 test files, ~5700 lines
│ ├── conftest.py # Shared fixtures (tmp DB, config, EventBus)
│ ├── test_api*.py # API integration tests (11 files)
│ ├── test_repository.py # Meeting CRUD, search, retention
│ ├── test_config*.py # Config loading, validation, edge cases
│ ├── test_detector.py # State machine transitions
│ ├── test_summariser.py # Both backends
│ ├── test_templates.py # Template CRUD + built-ins
│ ├── test_embeddings.py # Semantic search
│ └── ... # Audio, diarisation, platform, orchestrator
│
├── scripts/
│ ├── build_daemon.sh # PyInstaller daemon binary
│ ├── install.sh # Build + install launch agent
│ ├── create_dmg.sh # Create macOS .dmg installer
│ ├── bump_version.sh # Sync version across all manifests
│ └── generate_icons.py # Generate app icons from source
│
└── .github/workflows/
    ├── test.yml # CI: ruff lint + pytest + TypeScript check
    └── release.yml # CD: build daemon → build app → GitHub Release
```
| Layer | Technology |
|---|---|
| Desktop app | Tauri v2 (Rust) + React 18 + TypeScript 5 |
| Styling | Tailwind CSS |
| State management | Zustand + React Query |
| Animations | Framer Motion |
| Daemon API | FastAPI + WebSocket |
| Database | SQLite + FTS5 (via aiosqlite) |
| Audio capture | sounddevice + BlackHole |
| Transcription | MLX Whisper (Apple Silicon GPU) |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) |
| Summarisation | Ollama or Claude API |
| Diarisation | Energy-based RMS (numpy) or PyAnnote |
| Packaging | PyInstaller (daemon binary) + Tauri bundler (.app/.dmg) |
| CI/CD | GitHub Actions (lint, test, build, release) |
| Platform | macOS Apple Silicon |
```bash
make setup   # Create venv, install all dependencies
make dev     # Start Tauri + Vite dev server with hot-reload
```
Daemon only:
```bash
source .venv/bin/activate
python3 -m src.main   # Run daemon
make test             # Run pytest suite (31 test files)
make lint             # Run ruff linter
```
UI only:
```bash
cd ui
npm install
npm run tauri dev     # Dev mode with hot-reload
npx tsc --noEmit      # Type check
```
Build:
```bash
make build     # Build daemon binary + Tauri .app/.dmg
make install   # Build + install launch agent

# Or step by step:
./scripts/build_daemon.sh      # PyInstaller → dist/meetingmind-daemon/
cd ui && npm run tauri build   # Tauri → .app + .dmg
```
Version bump:
```bash
./scripts/bump_version.sh 0.2.0
# Updates tauri.conf.json, Cargo.toml, and package.json
```
The daemon serves interactive Swagger UI at `http://localhost:9876/docs` when running.
No audio captured (silent recording)
- Verify BlackHole is installed: `brew list blackhole-2ch`
- Check your system output is set to the Multi-Output Device (not directly to speakers)
- Check Teams' speaker setting — go to Teams → Settings → Devices → Speaker and ensure it's set to "Multi-Output Device" or "System Default"
- Run `python3 -m sounddevice` to confirm BlackHole appears as an input device
- Test audio capture:

```bash
python3 -c "
import sounddevice as sd, numpy as np
data = sd.rec(int(3 * 16000), samplerate=16000, channels=2, device='BlackHole 2ch', dtype='float32')
sd.wait()
peak = np.max(np.abs(data))
print(f'Peak amplitude: {peak:.6f}')
print('Signal detected' if peak > 0.001 else 'SILENT — check audio routing')
"
```

VAD removes all audio
The Whisper VAD filter discards segments classified as silence. If your audio is very quiet, the entire recording may be filtered out. Try lowering transcription.vad_threshold in your config (default: 0.35, lower = less aggressive).
False positive meeting detection
The detector requires 3 consecutive positive polls (~9 seconds) before triggering. If false positives persist, increase detection.required_consecutive_detections in your config.
Ollama connection refused
Ensure the Ollama server is running (ollama serve) and listening on the configured port (default: http://localhost:11434).
Diarisation labels are inaccurate
- Use a headset with good mic isolation. Open speakers cause crosstalk.
- Adjust `diarisation.energy_ratio_threshold` — a lower value (e.g. `1.2`) assigns a single-speaker label more readily; a higher value (e.g. `2.0`) requires a bigger energy difference.
- For multi-speaker calls, switch to the `pyannote` backend for voice-characteristic-based labelling.
Contributions and issue-tracked suggestions welcome.
- Windows and Linux support — platform stubs exist at `src/platform/`. Needs WASAPI loopback (Windows) and PulseAudio/PipeWire monitor sources (Linux).
- More meeting platforms — extend the detector to recognise Zoom, Google Meet, Slack Huddles, and Discord calls.
- Speaker name enrichment — attach real participant names by cross-referencing Teams meeting invitations.
- Cross-meeting retrieval — query across meeting history: "what did we decide about pricing last quarter?"
- Real-time transcription — stream transcript to the UI during recording (currently post-capture only).
- Calendar integration — auto-tag meetings with calendar event titles and attendees.
Jamie White — early-career software engineer building MeetingMind as a personal product focused on local-first, privacy-respecting tooling.
If you're hiring for a software engineering role and want to discuss how MeetingMind was built — from the dual-source audio pipeline to on-device GPU transcription, speaker diarisation, and the Tauri desktop shell — email jamiecs@live.co.uk.
Released under the MIT License.




