babble-rnn is a research project exploring the use of machine learning to generate new speech by modelling human speech audio directly, without any intermediate text or word representations. The idea is to learn to speak through imitation, much like a baby might. The goal is to generate a babbling audio output that emulates the speech patterns of the original speaker, ideally incorporating real words into the output.
The implementation is based on two components: Keras / Theano, used to build an LSTM RNN, and Codec 2, an open-source speech audio compression algorithm. The resulting models have learned the most common audio sequences of a ‘performer’, and can generate a probable babbling audio sequence when provided with a seed sequence.
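The seed-driven generation described above can be sketched as a sliding-window loop. This is a minimal illustration under stated assumptions, not the project's actual code: `generate_babble`, `mean_frame_model`, and the `Frame` type are hypothetical names, each frame is assumed to be a plain list of codec parameters, and the toy averaging function merely stands in for a trained LSTM's prediction step.

```python
from typing import Callable, List, Sequence

# Assumption: one frame of encoded audio is a flat list of numeric parameters.
Frame = List[float]


def generate_babble(
    predict_next: Callable[[Sequence[Frame]], Frame],
    seed: Sequence[Frame],
    num_frames: int,
) -> List[Frame]:
    """Extend a seed sequence by repeatedly predicting the next frame."""
    if not seed:
        raise ValueError("seed must contain at least one frame")
    # Copy the seed so the caller's data is never modified.
    window = [list(frame) for frame in seed]
    generated: List[Frame] = []
    for _ in range(num_frames):
        next_frame = list(predict_next(window))
        generated.append(next_frame)
        # Slide the window: drop the oldest frame, append the new prediction.
        window = window[1:] + [next_frame]
    return generated


def mean_frame_model(window: Sequence[Frame]) -> Frame:
    """Toy stand-in for a trained network: averages the frames in the window."""
    count = len(window)
    return [sum(column) / count for column in zip(*window)]
```

Calling `generate_babble(mean_frame_model, seed, num_frames=3)` with a short seed returns three new frames, each fed back in as context for the next. In a real setup the stand-in would presumably be replaced by the trained model's predict call, and the generated frames would be decoded back to audio.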
Read the babble-rnn tech post
View the babble-rnn code on GitHub