How Voxa works
Voxa has three parts: your phone, your laptop, and Claude Code. Your laptop does all the real work; your phone is a beautiful, ringable client.
The pieces
- Phone client: a native iPhone app (or a browser page) that captures your voice and plays audio back.
- Laptop server: a small FastAPI server that bridges audio to a Gemini Live voice operator.
- Voice operator: Gemini Live listens, talks, and decides which actions to run.
- Claude Code controller: the operator drives Claude Code on your laptop to actually do the work.
The call flow
- You start a task by voice.
- When Claude Code finishes or needs input, the laptop sends a push notification to your phone.
- Your phone rings with a native call screen, even when locked.
- Answer to hear the update and reply by voice. If you decline, your phone rings again on the next update.
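The call flow above can be modeled as a small sketch. This is an illustration only, not Voxa's actual code: the `TaskEvent`, `Phone`, and `Laptop` names are hypothetical, and a real push and call screen are replaced by plain method calls.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskEvent(Enum):
    """The two moments that trigger a call, per the flow above."""
    FINISHED = "finished"
    NEEDS_INPUT = "needs_input"


@dataclass
class Phone:
    """Hypothetical stand-in for the phone client."""
    will_answer: bool = True
    rings: int = 0
    heard: list = field(default_factory=list)

    def ring(self, update: str) -> bool:
        # Every push rings the phone; the update is only heard if answered.
        self.rings += 1
        if self.will_answer:
            self.heard.append(update)
            return True
        return False


@dataclass
class Laptop:
    """Hypothetical stand-in for the laptop server."""
    phone: Phone

    def on_task_event(self, event: TaskEvent, summary: str) -> bool:
        # Assumption: one push per event. A declined call is not retried;
        # the phone simply rings again when the next event arrives.
        return self.phone.ring(f"{event.value}: {summary}")


phone = Phone(will_answer=False)
laptop = Laptop(phone)
laptop.on_task_event(TaskEvent.NEEDS_INPUT, "pick a branch")  # declined
phone.will_answer = True
laptop.on_task_event(TaskEvent.FINISHED, "tests pass")        # answered
```

In this sketch a declined call leaves nothing in `phone.heard`, and the next event rings the phone again. Whether a missed update is replayed on the next call is not specified here, so the sketch does not model it.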
What is supported today
The first version focuses on drive mode: pick a working directory, send spoken instructions, and hear the result read back. Voice folder browsing and barge-in interruption are on the roadmap.
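As a rough sketch of the drive-mode loop described above (working directory in, spoken instruction in, result read back), assuming hypothetical `Controller` and `Speaker` callables in place of the real Claude Code controller and audio playback:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Hypothetical signatures, not Voxa's real interfaces:
# a controller takes (working_dir, instruction) and returns a text result;
# a speaker takes text and reads it back to the user.
Controller = Callable[[Path, str], str]
Speaker = Callable[[str], None]


@dataclass
class DriveSession:
    working_dir: Path
    run: Controller
    speak: Speaker

    def handle_instruction(self, transcript: str) -> str:
        """Run one transcribed instruction and read the result back."""
        text = transcript.strip()
        if not text:
            # Assumption: an empty transcript is answered, not executed.
            reply = "I did not catch that."
        else:
            reply = self.run(self.working_dir, text)
        self.speak(reply)
        return reply


spoken = []
session = DriveSession(
    working_dir=Path("."),
    run=lambda cwd, text: f"ran {text!r} in {cwd}",  # fake controller
    speak=spoken.append,                             # fake text-to-speech
)
session.handle_instruction("  list files  ")
```

Passing the controller and speaker in as callables keeps the sketch testable without a microphone or a running Claude Code session.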
Ready to try it yourself? See the setup guide.