I’ve been checking in on the open source virtual assistant projects for a while now, waiting for one that was completely self-hosted and looked complete enough to actually use. Mycroft looked promising but was almost impossible to separate from its cloud login, while Genie/Almond seemed to bury the fact that they used a cloud STT engine rather deep in their documentation. Finally Rhasspy came far enough along and seemed to check enough of my don’t-send-my-data-anywhere boxes to actually deploy.
I set it up in a base and satellite configuration, so a server that can handle the STT is doing the bulk of the work while the RPi is doing the microphone and playback work. All that seems to be going well. But I’ve hit what feels like the dumbest snag. Nowhere am I finding a good document on how to actually build out the voice commands and link them to underlying functions. I get that this was mostly designed to attach to Home Assistant and do things like operate light switches, but it feels like it’s so close to doing basic “what is the weather” commands and yet I’m struggling to figure out a simple way of doing that. Do I need Node-RED or Home Assistant to do that, or is there a native way on Rhasspy I’m just not seeing?
Hoping somebody here has played with this.
To answer my own question, just in case this ever ends up indexed: yes, either HASS or Node-RED, or possibly even both, are the recommended backend to Rhasspy. Also my dumbass was manually running the base server every time, and a few times the SSH session disconnected without my properly exiting the program, which left the old instance running. This manifested as the satellite receiving duplicate responses from the base, which results in output repeating / looping several times (once for each instance). This wouldn’t have happened if I’d used Docker for the base, but I don’t really like Docker if I can help it.
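For anyone wondering what "linking a voice command to a function" looks like underneath HASS or Node-RED, here is a rough sketch in Python. It is not the recommended route above, just an illustration, and it rests on my understanding that Rhasspy 2.5 publishes recognized intents as JSON over MQTT using the Hermes message layout (topic `hermes/intent/<intentName>`) and speaks any `text` sent to `hermes/dialogueManager/endSession`. The intent name `GetWeather`, the `location` slot, and the handler itself are made up for the example.

```python
import json


def get_weather(slots):
    # Hypothetical handler: a real one would query a weather source here.
    place = slots.get("location", "home")
    return f"I would look up the weather for {place} here."


# Map intent names (as defined in your sentence templates) to functions.
# "GetWeather" is an assumed name, not something Rhasspy ships with.
HANDLERS = {"GetWeather": get_weather}


def handle_intent(payload):
    """Turn one Hermes-style intent message into a (topic, json) reply pair."""
    msg = json.loads(payload)
    name = msg["intent"]["intentName"]
    # Flatten the slot list into a simple {slotName: value} dict.
    slots = {s["slotName"]: s["value"]["value"] for s in msg.get("slots", [])}
    handler = HANDLERS.get(name)
    text = handler(slots) if handler else "Sorry, I don't know how to do that."
    reply = {"sessionId": msg["sessionId"], "text": text}
    return "hermes/dialogueManager/endSession", json.dumps(reply)
```

To wire this up you would subscribe an MQTT client to `hermes/intent/#` on the broker the base and satellite share, pass each message payload to `handle_intent`, and publish the returned pair. HASS and Node-RED are essentially doing this dispatch step for you.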
Rhasspy now works reasonably well in this configuration. I ended up installing HASS. I want to integrate Spotify, and I need to work out which logic flow is less horrible to accomplish that. The options are either installing Node-RED on the RPi and using that to interact with Spotify, only sending things to HASS which actually need to use it, or installing Spotify on HASS and doing some horrid audio bounce back to Rhasspy. Something tells me the second one doesn’t scale, but the first one would add yet another system to this Rube Goldberg machine, and I’m not sure how many more cogs I want to manage.
The answer to the Spotify problem was installing Raspotify on the RPi and configuring my Spotify account in it. That adds the RPi as a device linked to and controlled by the Spotify account, which in turn can be controlled by HASS. This is nearing Rube Goldberg territory, but it’s actually significantly LESS horrifying than some of the flows I’d been heading towards prior to this. I still have one thorn in my side, and that is that the Spotify plugin to HASS is a nightmare to script. I can play no problem, but selecting the source, which is what they call the speaker for some ungodly reason, doesn’t work in a script. I’m still trying to figure out if it’s me, or if it’s just half broken and poorly documented, as almost everything relating to this plugin has been.
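One way to narrow down whether it's the script or the plugin is to call the service directly, outside of any HASS script. The sketch below builds the REST request for the `media_player.select_source` service in Python. The host, token, entity ID, and source name are all placeholders I made up, and it assumes the Raspotify device shows up in the Spotify entity's `source_list` attribute under exactly the name you pass in.

```python
import json
import urllib.request


def select_source_request(base_url, token, entity_id, source):
    """Build (but do not send) a HASS REST call to media_player.select_source."""
    body = json.dumps({"entity_id": entity_id, "source": source}).encode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/services/media_player/select_source",
        data=body,
        headers={
            # Long-lived access token created from your HASS user profile.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All four values here are hypothetical; substitute your own.
req = select_source_request(
    "http://homeassistant.local:8123",
    "LONG_LIVED_TOKEN",
    "media_player.spotify_example",
    "raspotify (kitchen)",
)
# urllib.request.urlopen(req) would actually send it.
```

If the direct call switches the speaker but the script doesn't, the problem is probably in the script. If neither works, comparing the `source` string against the entity's `source_list` in the HASS developer tools would be my next check.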