Language and APIs for new project

For the summer, I’m starting my own virtual assistant for various reasons, such as personalized commands, integration with hardware modules and custom external hardware, and online access to smart home devices. I’m developing on Windows, but a more complete product would run on a Raspberry Pi 3 B+. I already have most of the libraries, structure, functions, and data sets planned out in pseudocode.

1. What STT and TTS API should I use? I know some of them from Google and Microsoft are great, but they’re not free. I need a free STT and TTS that works seamlessly in code.

2. Should I use Python or Kotlin/Java? Which one has the most support for the Raspberry Pi and external modules? I’m proficient in both, and IDEs are no problem. The other thing is, if I decide to connect it to my phone (or even put the entire app on my phone), do both sides have to be in Kotlin/Java, or can the desktop version be in Python and the phone version be in Kotlin/Java?

If there is anything else you can tell me before I jump headfirst into this project, I am more than happy to be informed. I also know that Google Assistant, Siri, Cortana, and Alexa exist, but I want complete control over what I can do with the assistant. Plus, it would be a fun project and good for a resume. Once it gets far enough into development, I might even put it on GitHub and make it open source.

Should I use Python or Kotlin/Java?

Python has excellent support on the Raspberry Pi because it is pushed by the Raspberry Pi Foundation. In addition, the JVM is very RAM hungry, which could be a problem on the Pi, though I’m not sure. Both languages will work; it’s up to you, really.

  • Python is officially supported and has a huge ecosystem on the Pi
  • The JVM may be too RAM hungry for the Pi. Do a small test first
  • Java makes large programs easier to maintain due to static typing
  • Java will make it easier to port the thing to Android if you want to do so in the future

Pick your poison.

If you are feeling very adventurous, take a look at GraalVM. It allows you to combine Java and Python in the same project and is even supposed to use less RAM. The project is very new and highly experimental, though.

https://github.com/graalvm/graalpython


What STT and TTS API should I use?

When I last looked into open source TTS systems, Festival sounded best. This was years ago, though; better ones may exist by now.


The other thing is, if I decide to connect it to my phone (or even put the entire app on my phone), do both sides have to be in Kotlin/Java, or can the desktop version be in Python and the phone version be in Kotlin/Java?

No, the two sides do not have to share a language. You’ll have to decide on a protocol for communication between the phone and the PC. Stay away from language-specific mechanisms like Python’s pickle and Java’s RMI or built-in serialization, and you’re fine.
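To make the protocol point concrete, here is a minimal sketch of a language-neutral message format using newline-delimited JSON, which Kotlin/Java can parse just as easily as Python. The field names (`version`, `type`, `payload`) and the helper functions are made up for illustration, not taken from any existing project.

```python
import json

# Hypothetical protocol version; bump it when the message layout changes.
PROTOCOL_VERSION = 1


def encode_message(kind, payload):
    """Serialize one message as a single line of UTF-8 encoded JSON."""
    message = {"version": PROTOCOL_VERSION, "type": kind, "payload": payload}
    # json.dumps escapes newlines inside strings, so "\n" safely ends a message.
    return (json.dumps(message) + "\n").encode("utf-8")


def decode_message(raw):
    """Parse one line of bytes back into a (type, payload) pair."""
    message = json.loads(raw.decode("utf-8"))
    if message.get("version") != PROTOCOL_VERSION:
        raise ValueError("unsupported protocol version")
    return message["type"], message["payload"]


if __name__ == "__main__":
    wire = encode_message("command", {"text": "turn on the lights"})
    print(decode_message(wire))
```

The same bytes could be sent over a plain TCP socket, HTTP, or MQTT; the point is only that nothing in the format depends on the language at either end.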

This sounds a lot like what the Mycroft project already does, up to and including running on the RPi 3! You are likely to get better results from your assistant by contributing there. That said, you could learn a lot and hone your skills by writing your own assistant. I’ll answer the next questions as if Mycroft weren’t a thing.

I work in EdTech, and this has been a dream item for us in our software for a long time. Unfortunately, we haven’t found any truly free products that give consistently good results. That said, CMU Sphinx seems to have the best results. It’s not accurate enough for my work, but it’s probably OK here?

Optimize for developer comfort here. Remember, the steps to making good code are:

  1. Make it work
  2. Make it correct
  3. Make it fast

If you feel the most comfortable in Python, then by all means go there! Nobody said the whole thing has to be in one language, either!

Feel free to ping me with any other questions.

I suppose I could write the base program in Java, then write a bunch of the smaller functions connecting any external libraries/devices in Python. Is that what they are going for here?

All I need for it to do is record my voice and translate that into a String or something that can be converted into a string. Would this be good enough?

My understanding is that you don’t even have to write helper functions. Python/Java has direct access to all functions from the other language.

https://www.graalvm.org/docs/reference-manual/languages/python/

I really wouldn’t recommend this approach though. Graal is extremely new and experimental.

Should be possible. Look at what Mycroft is doing too as it might help guide you.