I think what you're looking for is a local speech-to-text API/suite, like Simon, Jasper, Natl or Sphinx, in combination with networked home automation.
I looked into this a while ago. Generally, offline speech recognition does not work as well as the cloud-based kind. But one maker wrote about getting decent results with Sphinx and a wordlist (meaning Sphinx could only recognize words or phrases on the list).
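To make the wordlist idea concrete, here is a rough sketch in plain Python. It does not call Sphinx at all; it only illustrates the principle of restricting output to a fixed list by snapping a (possibly noisy) transcript to the closest allowed phrase. The phrase list, the `match_command` name and the 0.6 cutoff are all made up for illustration.

```python
import difflib

# Hypothetical command list; a real setup would use its own phrases.
ALLOWED_PHRASES = ["lights on", "lights off", "play music", "stop music"]


def match_command(transcript, phrases=ALLOWED_PHRASES, cutoff=0.6):
    """Return the allowed phrase closest to the transcript, or None.

    Anything that is not similar enough to a phrase on the list is
    rejected, which mimics a recognizer limited to a wordlist.
    """
    cleaned = transcript.strip().lower()
    # get_close_matches returns candidates sorted best-first.
    matches = difflib.get_close_matches(cleaned, phrases, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for heard in ["lights onn", "Play musik", "banana"]:
        print(repr(heard), "->", match_command(heard))
```

A small list like this trades flexibility for reliability: the system can only ever answer with something on the list, so near misses get corrected and everything else is ignored.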
I would also like to hear about this. I don't think I have amassed nearly enough hardware to be able to run such a thing for myself, but at some point I will... :)
I think he was talking about http://lucida.ai/ at the time. I am keen to try this when FreeNAS 10 rolls out with Docker support. However, I might need to beef up the hardware in my NAS.
It seems Mycroft still has a cloud component for the recognition? Or is that just the Picroft Raspberry Pi variant?
@wendell spoke of a server with hundreds of gigabytes of RAM that he was feeding when he was messing around with this. He briefly mentioned it on The Tek, back when The Tek was still a thing.