pranayjoshi/Cortex

🧠 Cortex – AI Desktop Assistant

An intelligent desktop assistant that monitors your screen, builds a knowledge graph of your work, and talks to you through voice and chat—with suggestions and actions synced to your mind map.


Dashboard

<!-- ![Cortex Dashboard](./docs/dashboard.png) -->


Features

  • Screen monitoring — Captures and analyzes your screen every few seconds; keeps a live context (app, topic, entities).
  • Knowledge graph & mind map — Builds a NetworkX graph from your activity and visualizes it with React Flow.
  • AI chat — Casual, helpful replies powered by Gemini, informed by current screen context and your mind map.
  • Active listening — Optional always-on voice input; say something and Cortex replies and can speak back.
  • Voice interface — Text-to-speech (proactive or on-demand) with a casual tone.
  • Actions — Ask in natural language to open URLs (“open google”), type text (“type hello”), or press keys; Cortex runs them when possible.
  • Daily reset — Graph resets each day for a privacy-first workflow.
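The knowledge-graph feature above builds a NetworkX graph on the backend and renders it with React Flow on the frontend, which implies a conversion step between the two formats. A minimal sketch of that conversion is below; the grid layout and node fields are illustrative assumptions, not Cortex's actual schema:

```python
def to_react_flow(nodes, edges):
    """Convert graph data (e.g. from a networkx graph's g.nodes / g.edges)
    into the {"nodes": [...], "edges": [...]} shape React Flow renders.

    Grid positions here are a placeholder; Cortex's real layout logic
    may differ (illustrative assumption).
    """
    rf_nodes = [
        {
            "id": str(n),
            "data": {"label": str(n)},
            # Lay nodes out on a simple 4-column grid
            "position": {"x": (i % 4) * 220, "y": (i // 4) * 120},
        }
        for i, n in enumerate(nodes)
    ]
    rf_edges = [
        {"id": f"{u}-{v}", "source": str(u), "target": str(v)}
        for u, v in edges
    ]
    return {"nodes": rf_nodes, "edges": rf_edges}

# e.g. to_react_flow(["vscode", "python"], [("vscode", "python")])
```

React Flow only needs `id`, `position`, and `data.label` per node and `id`/`source`/`target` per edge, so a dict in this shape can be returned directly from a JSON endpoint.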

Tech stack

| Layer | Stack |
| --- | --- |
| Frontend | Next.js 14, React Flow, Tailwind CSS, Socket.IO client |
| Desktop | Electron (optional frameless window) |
| Backend | Python 3.11+, FastAPI, Socket.IO, mss, Pillow |
| AI | Google Gemini (e.g. 2.5 Flash / 2.0 Flash) |
| Voice | pyttsx3 (TTS), Web Speech API (browser mic) |
| Automation | pyautogui, webbrowser |
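The Automation row covers the natural-language actions described under Features ("open google", "type hello", press keys). In Cortex the interpretation is Gemini-driven, so the regex patterns and URL guessing below are only an illustrative fallback sketch, not the project's actual parser:

```python
import re
import webbrowser

# Hypothetical command patterns; Cortex's real action extraction is
# handled by the AI layer, so these regexes are an assumption.
PATTERNS = [
    (re.compile(r"^open\s+(.+)$", re.I), "open_url"),
    (re.compile(r"^type\s+(.+)$", re.I), "type_text"),
    (re.compile(r"^press\s+(.+)$", re.I), "press_key"),
]

def parse_action(message):
    """Map a natural-language command to an (action, argument) pair."""
    for pattern, action in PATTERNS:
        m = pattern.match(message.strip())
        if m:
            return action, m.group(1)
    return None  # not an action request

def execute(action, arg):
    """Dispatch a parsed action to the relevant automation library."""
    if action == "open_url":
        # Naive site-name-to-URL guess (assumption for the sketch)
        url = arg if arg.startswith("http") else f"https://www.{arg}.com"
        webbrowser.open(url)
    elif action == "type_text":
        import pyautogui  # needs Accessibility permission on macOS
        pyautogui.write(arg)
    elif action == "press_key":
        import pyautogui
        pyautogui.press(arg)
```

Importing pyautogui lazily keeps the parser usable (and testable) on machines without a display or the required macOS permissions.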

Prerequisites

  • macOS 10.13+ (screen capture and automation are tested on macOS)
  • Node.js 18+
  • Python 3.11+
  • Gemini API key (Google AI Studio)

Quick start

  1. Clone and install

    git clone <repo-url>
    cd Cortex
    npm run install:all
  2. Backend (Python)

    cd backend
    python3 -m venv venv
    source venv/bin/activate   # Windows: venv\Scripts\activate
    pip install -r requirements.txt
  3. Environment

    cp .env.example .env
    # Set GEMINI_API_KEY in .env
  4. Run

    # Terminal 1 – backend
    cd backend && source venv/bin/activate && python main.py
    
    # Terminal 2 – frontend
    npm run dev:frontend

    Open the frontend URL (e.g. http://localhost:3000). Use Chrome for best voice support (Web Speech API).

macOS permissions

Cortex needs:

  • Screen Recording — for screen capture and analysis
  • Accessibility — for computer control (e.g. typing, hotkeys)

Grant in: System Preferences → Security & Privacy → Privacy (System Settings → Privacy & Security on macOS 13+).

Project structure

Cortex/
├── backend/           # FastAPI + Socket.IO
│   ├── api/          # REST routes (chat, graph, execute-action, voice)
│   ├── core/         # Screen monitor, knowledge graph, voice, action executor
│   └── utils/        # Gemini client
├── frontend/         # Next.js app
│   ├── app/          # Main page, layout
│   ├── components/   # Mind map, chat, titlebar
│   ├── hooks/        # useWebSocket
│   └── lib/          # Zustand stores
├── electron/         # Optional Electron shell
├── docs/             # Add dashboard.png here for the screenshot above
├── .env.example
└── README.md

API overview

| Endpoint | Description |
| --- | --- |
| `POST /api/chat` | Send a message; returns the AI response (uses screen + mind-map context); can trigger actions. |
| `GET /api/graph` | Current knowledge graph in React Flow format. |
| `POST /api/execute-action` | Run an action from natural language (e.g. `{"message": "open google"}`). |
| `POST /api/voice-toggle` | Toggle proactive vs. on-demand TTS. |
| `GET /api/health` | Health and voice status. |
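The POST endpoints above accept a JSON body. A minimal stdlib client sketch is shown below; the base URL assumes the backend's default port is 8000, so adjust it to match your setup:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed default; match your backend's port

def build_request(path, payload):
    """Build a JSON POST request for a Cortex endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def post_json(path, payload):
    """Send the request and decode the backend's JSON reply."""
    with urllib.request.urlopen(build_request(path, payload)) as resp:
        return json.load(resp)

# With the backend running:
#   post_json("/api/chat", {"message": "what am I working on?"})
#   post_json("/api/execute-action", {"message": "open google"})
```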

Adding the dashboard image

  1. Capture a screenshot of the Cortex UI (mind map + chat).
  2. Save it as docs/dashboard.png (or another path you prefer).
  3. In this README, uncomment and fix the image line in the Dashboard section:
    ![Cortex Dashboard](./docs/dashboard.png)

License

MIT
