An intelligent desktop assistant that monitors your screen, builds a knowledge graph of your work, and talks to you through voice and chat—with suggestions and actions synced to your mind map.
- Screen monitoring — Captures and analyzes your screen every few seconds; keeps a live context (app, topic, entities).
- Knowledge graph & mind map — Builds a NetworkX graph from your activity and visualizes it with React Flow.
- AI chat — Casual, helpful replies powered by Gemini, informed by current screen context and your mind map.
- Active listening — Optional always-on voice input; say something and Cortex replies and can speak back.
- Voice interface — Text-to-speech (proactive or on-demand) with a casual tone.
- Actions — Ask in natural language to open URLs (“open google”), type text (“type hello”), or press keys; Cortex runs them when possible.
- Daily reset — Graph resets each day for a privacy-first workflow.
| Layer | Stack |
|---|---|
| Frontend | Next.js 14, React Flow, Tailwind CSS, Socket.IO client |
| Desktop | Electron (optional frameless window) |
| Backend | Python 3.11+, FastAPI, Socket.IO, mss, Pillow |
| AI | Google Gemini (e.g. 2.5 Flash / 2.0 Flash) |
| Voice | pyttsx3 (TTS), Web Speech API (browser mic) |
| Automation | pyautogui, webbrowser |
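To illustrate where the backend and frontend halves of this stack meet, here is a hedged sketch of recording one screen observation in a NetworkX graph and serializing it into the node/edge lists React Flow consumes. The node schema, the `kind` attribute, and the naive layout positions are all assumptions:

```python
import networkx as nx

def add_activity(graph: nx.DiGraph, app: str, topic: str, entities: list[str]) -> None:
    """Record one screen observation (app, topic, entities) as nodes and edges."""
    graph.add_node(app, kind="app")
    graph.add_node(topic, kind="topic")
    graph.add_edge(app, topic)
    for entity in entities:
        graph.add_node(entity, kind="entity")
        graph.add_edge(topic, entity)

def to_react_flow(graph: nx.DiGraph) -> dict:
    """Serialize the graph into React Flow's {nodes, edges} shape (sketch layout)."""
    nodes = [
        {"id": str(n), "data": {"label": str(n)}, "position": {"x": i * 120, "y": 0}}
        for i, n in enumerate(graph.nodes)
    ]
    edges = [
        {"id": f"{u}->{v}", "source": str(u), "target": str(v)}
        for u, v in graph.edges
    ]
    return {"nodes": nodes, "edges": edges}

g = nx.DiGraph()
add_activity(g, "VS Code", "Cortex README", ["FastAPI", "React Flow"])
flow = to_react_flow(g)
```

A real layout would come from React Flow itself (or a NetworkX layout algorithm) rather than the placeholder x-offsets used here.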
- macOS 10.13+ (screen capture and automation are tested on macOS)
- Node.js 18+
- Python 3.11+
- Gemini API key (Google AI Studio)
- Clone and install

  ```bash
  git clone <repo-url>
  cd Cortex
  npm run install:all
  ```

- Backend (Python)

  ```bash
  cd backend
  python3 -m venv venv
  source venv/bin/activate   # Windows: venv\Scripts\activate
  pip install -r requirements.txt
  ```

- Environment

  ```bash
  cp .env.example .env       # Set GEMINI_API_KEY in .env
  ```

- Run

  ```bash
  # Terminal 1 – backend
  cd backend && source venv/bin/activate && python main.py

  # Terminal 2 – frontend
  npm run dev:frontend
  ```
Open the frontend URL (e.g. http://localhost:3000). Use Chrome for best voice support (Web Speech API).
Cortex needs:
- Screen Recording — for screen capture and analysis
- Accessibility — for computer control (e.g. typing, hotkeys)
Grant in: System Preferences → Security & Privacy → Privacy.
```
Cortex/
├── backend/              # FastAPI + Socket.IO
│   ├── api/              # REST routes (chat, graph, execute-action, voice)
│   ├── core/             # Screen monitor, knowledge graph, voice, action executor
│   └── utils/            # Gemini client
├── frontend/             # Next.js app
│   ├── app/              # Main page, layout
│   ├── components/       # Mind map, chat, titlebar
│   ├── hooks/            # useWebSocket
│   └── lib/              # Zustand stores
├── electron/             # Optional Electron shell
├── docs/                 # Add dashboard.png here for the screenshot above
├── .env.example
└── README.md
```
| Endpoint | Description |
|---|---|
| `POST /api/chat` | Send a message; returns AI response (uses screen + mind map context); can trigger actions. |
| `GET /api/graph` | Current knowledge graph in React Flow format. |
| `POST /api/execute-action` | Run an action from natural language (e.g. `{"message": "open google"}`). |
| `POST /api/voice-toggle` | Toggle proactive vs on-demand TTS. |
| `GET /api/health` | Health and voice status. |
- Capture a screenshot of the Cortex UI (mind map + chat).
- Save it as `docs/dashboard.png` (or another path you prefer).
- In this README, uncomment and fix the image line in the Dashboard section:

  ![Cortex Dashboard](docs/dashboard.png)
MIT
