Hey Jarvis, What's the Weather?
Okay, maybe it's not a fully intelligent, sentient assistant like Jarvis, but it can do some pretty cool things. VoicePilot responds to this command by opening the weather app automatically. Born at a 12-hour hackathon and still in active development, VoicePilot aims to be an assistive program that lets users control their computers by voice, with no manual mouse or keyboard input.
We take in the user's voice input and feed it through a speech recognition algorithm. We then pass the raw transcript to a large language model, Gemini, which converts it into commands. These commands are run through the Win32 API to operate the computer. Because Gemini interprets natural language, speech doesn't have to be robotic, which allows for more robust use cases and interpretations.
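To make the flow concrete, here is a minimal sketch of the transcript-to-command step. It is not VoicePilot's actual code: the JSON command format, the prompt text, and every name in it (`Command`, `parse_command`, `run_pipeline`, `fake_model`) are assumptions made up for illustration. The model is passed in as a plain function so the sketch runs without a microphone, an API key, or Windows.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical instruction sent to the model ahead of the transcript.
PROMPT = (
    "Convert the user's request into JSON with the keys "
    '"action" and "target". Request: '
)


@dataclass
class Command:
    action: str  # e.g. "open_app" (an assumed action name)
    target: str  # e.g. the app to open


def parse_command(model_reply: str) -> Command:
    """Turn the model's JSON reply into a Command, rejecting anything else."""
    data = json.loads(model_reply)  # raises ValueError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    action, target = data.get("action"), data.get("target")
    if not isinstance(action, str) or not isinstance(target, str):
        raise ValueError("reply must contain string 'action' and 'target'")
    return Command(action, target)


def run_pipeline(
    transcript: str,
    ask_model: Callable[[str], str],
    handlers: Dict[str, Callable[[str], None]],
) -> Command:
    """Transcript -> model reply -> parsed command -> matching handler."""
    command = parse_command(ask_model(PROMPT + transcript))
    handler = handlers.get(command.action)
    if handler is None:
        raise ValueError(f"unsupported action: {command.action}")
    handler(command.target)
    return command


def fake_model(prompt: str) -> str:
    """Stand-in for the Gemini call so the sketch runs offline."""
    if "weather" in prompt.lower():
        return '{"action": "open_app", "target": "weather"}'
    return '{"action": "unknown", "target": ""}'


if __name__ == "__main__":
    opened = []  # records targets instead of really launching anything
    run_pipeline("Hey Jarvis, what's the weather?", fake_model,
                 {"open_app": opened.append})
    print(opened)
```

In a real build, `fake_model` would be swapped for a call to Gemini, and the `open_app` handler would drive the desktop through PyAutoGUI or the Win32 API rather than appending to a list. Restricting execution to a fixed table of handlers means an unexpected model reply raises an error instead of running an arbitrary action.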
Timeline
April 2024 - Present
Skills
Python - Tkinter, PyAutoGUI
Large Language Models - Gemini