IRIS
Systems Engineering

SYSTEM ARCHITECTURE.

IRIS maps natural audio input directly into OS-level instructions. Explore the neural execution cycle, process isolation boundary, and complete system stack.

Real-Time Voice-To-System Pipeline

IRIS establishes a persistent full-duplex WebRTC connection with Gemini Live. Audio streams continuously; when an intent is matched, the state machine locks, executes the required actions, and feeds the result back to the user as audio.
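The locking behavior described above can be pictured with a small sketch. It is an illustration only, not the IRIS implementation: the phase names, the `Intent` shape, and the `VoiceCycle` class are all hypothetical, and the real orchestration is said to run inside LangGraph.

```typescript
// Hypothetical phases for one voice-command cycle; the names are illustrative only.
type Phase = "listening" | "executing" | "responding";

// Assumed shape of a matched intent (not taken from the IRIS source).
interface Intent {
  name: string;
  args: Record<string, unknown>;
}

class VoiceCycle {
  private phase: Phase = "listening";

  // Exposed as a method so callers always read the live value.
  current(): Phase {
    return this.phase;
  }

  // Locks the cycle, runs the action, and returns the text to be spoken.
  // Returns null if a cycle is already in flight, so audio that keeps
  // streaming cannot trigger a second action mid-execution.
  run(intent: Intent, action: (intent: Intent) => string): string | null {
    if (this.phase !== "listening") return null;
    this.phase = "executing";
    let output: string;
    try {
      output = action(intent);
    } catch (err) {
      // Failures are reported back through the same audio path.
      output = `error: ${err instanceof Error ? err.message : String(err)}`;
    }
    this.phase = "responding";
    return output;
  }

  // Called once the spoken response has been handed off; unlocks the cycle.
  release(): void {
    this.phase = "listening";
  }
}

// Usage: a second intent arriving before release() is rejected.
const cycle = new VoiceCycle();
const spoken = cycle.run({ name: "open_app", args: { app: "terminal" } }, (i) => `ran ${i.name}`);
const rejected = cycle.run({ name: "close_app", args: {} }, (i) => `ran ${i.name}`);
console.log(spoken, rejected); // "ran open_app" null
cycle.release();
```

Keeping the lock in one place means the audio stream never has to pause; it only has to tolerate a null result while an action is running.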

01 WebRTC Audio Stream: User Speech Input
Persistent bidirectional live stream (<500 ms latency).
02 Gemini 3.1 Live API: Intent Recognition & Parsing
Processes speech and outputs structured function-call payloads.
03 LangGraph State Machine: Cyclic Tool Selection
A protected backend process orchestrates the required system operations.
04 Native OS Execution: Protected Tools System
Nut.js cursor events, CLI commands, and filesystem operations carry out the action.
05 Feedback Synthesis: Audio & UI Update Loop
System output is returned to Gemini Live, which synthesizes a spoken response for the user.
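Steps 02 through 05 amount to routing a structured function call to a tool and sending the result back to the model. The sketch below shows one way that hand-off could look. The payload fields, the tool names, and the `toFunctionResponse` helper are assumptions made for illustration; the tools here are stubs that return strings rather than real Nut.js, CLI, or filesystem calls.

```typescript
// Assumed shape of a structured function call emitted by the model.
interface FunctionCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  ok: boolean;
  output: string;
}

type Tool = (args: Record<string, unknown>) => Promise<ToolResult>;

// Hypothetical tool registry. Real tools would wrap cursor, CLI, or
// filesystem operations; these stubs only describe what they would do.
const tools = new Map<string, Tool>([
  ["move_cursor", async (args) => ({
    ok: true,
    output: `cursor moved to ${String(args.x)},${String(args.y)}`,
  })],
  ["read_file", async (args) => ({
    ok: true,
    output: `read ${String(args.path)}`,
  })],
]);

// Step 03/04: pick the tool named in the payload and run it.
// Unknown names are refused rather than guessed at.
async function dispatch(call: FunctionCall): Promise<ToolResult> {
  const tool = tools.get(call.name);
  if (!tool) return { ok: false, output: `unknown tool: ${call.name}` };
  try {
    return await tool(call.args);
  } catch (err) {
    return { ok: false, output: err instanceof Error ? err.message : String(err) };
  }
}

// Step 05: package the result for the model (assumed message shape).
function toFunctionResponse(call: FunctionCall, result: ToolResult) {
  return { name: call.name, response: { ok: result.ok, output: result.output } };
}

// Usage: one full pass from payload to response message.
const call: FunctionCall = { name: "read_file", args: { path: "notes.txt" } };
dispatch(call).then((result) => console.log(toFunctionResponse(call, result)));
```

Returning failures as ordinary results, instead of throwing, lets the model explain the problem to the user in the same audio loop.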

Technical Stack Inventory

Core Desktop & UI Framework

Electron & Vite: High-performance desktop compilation & fast bundler lifecycle.
React 19: Component-based, optimized virtual DOM frontend engine.
Tailwind CSS v4: Utility-first styling matching modern HSL palettes.
Framer Motion & GSAP: Hardware-accelerated animations and scroll synchronization.
Three.js & React Three Fiber: 3D neural network visualizers and active orb renderers.
Zustand: Lightweight, centralized state management.

AI & Neural Orchestration Layer

Gemini 3.1 Live API: Bidirectional real-time WebRTC audio stream + vision intelligence.
Groq SDK: Sub-100 ms ultra-fast inference routing & fallback processing.
LangGraph: Agentic state loop orchestration & cyclic tool selectors (Protected).
LanceDB: High-speed local vector database for persistent RAG and memory.

OS Control & Automation

Nut.js: Native OS mouse coordinate targeting, keyboard typing injection.
Puppeteer & Stealth: Headless browser scraping, automated form fill, web agent.
Node Window Manager: Active application window resizing, repositioning, and alignment.
Tesseract.js: Optical Character Recognition (OCR) for screen peeling and code extraction.