For the summer, I’m starting my own Virtual Assistant for various reasons, such as personalized commands, integration with hardware modules and custom external hardware, and online access to smart home devices. I’m developing on Windows, but a more complete product would run on a Raspberry Pi 3 B+. I already have most of the libraries, structure, functions, and data sets planned out in pseudocode.
1, What STT and TTS API should I use? I know some of them from Google and Microsoft are great, but they’re not free. I need a free STT and TTS that works seamlessly in code.
2, Should I use Python or Kotlin/Java? Which one has the most support for the Raspberry Pi and external modules? I’m proficient in both, and IDEs are no problem. The other thing is, if I decide to connect it to my phone (or even put the entire app on my phone), do both sides have to be in Kotlin/Java, or can the desktop version be in Python and the phone version be in Kotlin/Java?
If there is anything else that you can tell me before I jump head first into this project, I am more than happy to be informed. I also know that Google Assistant, Siri, Cortana, and Alexa exist, but I want complete control over what I can do with the assistant. Plus it would be a fun project to do and good for a resume. Once it gets far enough into development I might even put it on GitHub and make it open source.
Python has excellent support on the Raspberry Pi because the Raspberry Pi Foundation promotes it. The JVM is also quite RAM hungry, which could be a problem on the Pi, though I’m not sure. Both languages will work; it’s really up to you.
Python is officially supported and has a huge ecosystem on the Pi
The JVM may be too RAM hungry for the Pi. Do a small test first
Java makes large programs easier to maintain due to static typing
Java will make it easier to port the thing to Android if you want to do so in the future
Pick your poison.
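For the "do a small test first" part, here is a minimal sketch of how you could measure it on the Pi itself. It assumes Linux (the resource module is not available on Windows, and ru_maxrss is reported in KiB on Linux), and the function name is just something I made up for illustration.

```python
import resource
import subprocess
import sys


def peak_child_rss_kib(command):
    """Run `command` to completion and return the peak resident set size
    recorded for child processes, in KiB (the Linux unit for ru_maxrss).

    Note: RUSAGE_CHILDREN reports the largest child waited for so far,
    so measure one program per script run for a clean number.
    """
    subprocess.run(command, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Example: measure a trivial Python child process.
peak = peak_child_rss_kib([sys.executable, "-c", "print('hello')"])
print(f"peak RSS: {peak / 1024:.1f} MiB")
```

Swap the command for something like ["java", "-jar", "assistant.jar"] (a hypothetical jar name) to get the JVM figure and compare the two.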
If you are feeling very adventurous, take a look at GraalVM. It allows you to combine Java and Python in the same project and is even supposed to use less RAM. The project is very new and highly experimental, though.
When I last looked into open source TTS systems, Festival sounded best. This was years ago though; better ones may exist by now.
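If you want to try Festival from code, a rough sketch is to pipe text to its command line tool. This assumes the festival binary is installed and on your PATH; the speak function and its runner parameter are my own names, with runner there only so you can test the wiring without Festival present.

```python
import subprocess


def speak(text, runner=subprocess.run):
    """Speak `text` aloud by piping it to `festival --tts`.

    `runner` defaults to subprocess.run; pass a stand-in when Festival
    is not installed (for example in unit tests).
    """
    return runner(["festival", "--tts"], input=text.encode("utf-8"), check=True)
```

Calling speak("Hello from the assistant") should then read the sentence out through the default audio device.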
The other thing is, if I decide to connect it to my phone (or even put the entire app on my phone), do both sides have to be in Kotlin/Java, or can the desktop version be in Python and the phone version be in Kotlin/Java?
The two sides can be in different languages. You’ll have to decide on a protocol for communication between the phone and the PC. Stay away from language-specific mechanisms like Python’s pickle and Java’s RMI (its built-in RPC) and you’ll be fine.
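As one example of a language-neutral protocol, you could send one JSON object per line over a socket. This is only a sketch under that assumption; the "type" and "payload" field names and both function names are made up for illustration, not any standard.

```python
import json


def encode_message(kind, payload):
    """Serialize one message as a single line of UTF-8 JSON."""
    return (json.dumps({"type": kind, "payload": payload}) + "\n").encode("utf-8")


def decode_message(line):
    """Parse one line produced by encode_message, or by any other
    language that writes the same JSON shape."""
    message = json.loads(line.decode("utf-8"))
    return message["type"], message["payload"]


wire = encode_message("command", {"text": "lights on"})
print(decode_message(wire))
```

The Kotlin/Java side only needs a JSON parser (Android ships org.json) to read and write the same lines, so neither end cares what language the other is written in.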
This sounds a lot like what the Mycroft project already does, up to and including running on the RPi 3! You are likely to get better results from your assistant by contributing there. That said, you could learn a lot and hone your skills by writing your own assistant. I’ll answer the next questions as if Mycroft weren’t a thing.
I work in EdTech, and this has been a dream item for us in our software for a long time. Unfortunately, we haven’t found any truly free products that give consistently good results. That said, CMU Sphinx seems to have the best results. Not accurate enough for my work, but probably OK here?
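If you go the Python route, one way to try CMU Sphinx is through the third-party SpeechRecognition package, which wraps it. A minimal sketch, assuming SpeechRecognition, PyAudio, and pocketsphinx are installed; listen_once and normalize_transcript are names I picked for illustration.

```python
def normalize_transcript(text):
    """Lowercase and collapse whitespace so commands compare reliably."""
    return " ".join(text.lower().split())


def listen_once():
    """Record one phrase from the default microphone and return it as a str.

    Assumes the third-party SpeechRecognition, PyAudio, and pocketsphinx
    packages are installed; the import is lazy so the rest of the module
    still loads without them.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate to room noise
        audio = recognizer.listen(source)            # blocks until a phrase ends
    try:
        return normalize_transcript(recognizer.recognize_sphinx(audio))
    except sr.UnknownValueError:
        return ""  # Sphinx could not make out any words
```

Recognition runs offline, which suits the Pi, but expect to tune or restrict the vocabulary before the accuracy feels usable.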
Optimize for developer comfort here. Remember, the steps to making good code are:
Make it work
Make it correct
Make it fast
If you feel the most comfortable in Python, then by all means go there! Nobody said the whole thing has to be in one language, either!
I suppose I could write the base program in Java, then write a bunch of the smaller functions that connect to external libraries and devices in Python. Is that what they are going for here?
All I need for it to do is record my voice and translate that into a String or something that can be converted into a string. Would this be good enough?