This template allows to detect certain keywords in the audio input and trigger effects when they are detected. The template generates melspectrogram image and predicts the probability of keywords with Machine Learning model.
Creating a Model
While the template comes with two ML models, each with different keywords available
you can detect any keywords by importing your own machine learning model. We’ll go through an example of what this might look like below.
Note: To learn more about Machine Learning and Lens Studio, take a look at the ML Overview page.
To create a model, you’ll need a:
- Machine learning training notebook: Download our example notebook here
- Data set: in example we are using
torch.utils.data.Datasetversion of a SPEECHCOMMANDS dataset
Tip: This dataset comes with 35 different keywords which you can pick a subset from.
Training Your Model
There are many different ways you can train your model. For our example, we will use Google Colaboratory. To see other ways of training, take a look at the ML Frameworks page of the guide section.
Head over to Google Colaboratory, select the Upload tab, and drag the python notebook into the upload area.
Tip: The notebook is well documented with information about what each section of the code is doing. Take a look at the notebook to learn more about the training process itself!
Once your notebook has been opened, you can choose which labels you want the model to segment out. Then, you can run the code by choosing
Runtime > Run All in the menu bar. This model takes about 2 hours to train.
Downloading your Model
You can scroll to the
Train Loop section of the notebook to see how your machine learning model is coming along.
Once you are happy with the result, you can download your
Note: When using a data set to train your model, make sure that you adhere to the usage license of that dataset.
Keyword Detection Controller
The keyword detection is driven by the
Keyword Detection Controller object. This object contains the
AudioSpectogram script which provides a function to calculate the
MelSpectogram of an audio and generates spectrogram texture.(You can find more information about this terms in the training notebook) The
KeywordDetectionController script then takes the generated texture, and passes it into an ML model.
Choosing your Keywords
There are two different ML models provided in the template, depending on which keywords you are looking to use. In the
Resources panel, you will find a folder called
ML [TRY_SWAPPING] with a
Labels folder. Each model has a corresponding label (keywords that you can use).
To find the list of keywords you can detect for each model, double click on one of the scripts in the Labels folder. Once you’ve decided which keywords you want to use, we can set up the
KeywordDetectionController to use them.
First we will set the model we want to use by changing the
Then, we’ll modify the
Labels [SWAP LABELS] object and choose the labels corresponding to that model.
Choosing audio to analyze
With the detection set up, we can next choose the
Audio Track we want to analyze. By default the template comes with an audio of a person saying “one two three”. You can select the
Audio Track field in the
KeywordDetectionController script to change what the Lens is listening to.
Audio from Microphone [link] to start processing user voice.
Then, make sure to enable
Microphone input in the bottom left corner of the Preview panel so that your voice will be passed into the Lens
Responding to Detected Keywords
Responding with Behavior
At the bottom of the
KeywordDetectionController, you’ll find a checkbox to
Use Behavior. This checkbox will tell the
KeywordDetectionController to call a
Custom Behavior Triggers when a keyword is detected.
With this trigger in mind, let’s see an example of how we can respond to a detected keyword.
Take a look at the example animation
Three under the
Example Numbers object to see how the animation is called when the user says the related keyword.
For example, in the
One object, you can see that there is an animation that is played in response to the trigger
Tip: There’s a lot more that Behavior can do, such as play sounds, texture animation, enable objects and more! Take a look at the Behavior guide for more information.
If you want to respond with your own script, you can listen to the behavior keyword:
Alternatively, if you want to use Visual Script Editor, you can use the provided custom node:
Scripts > Behavior > Behavior Nodes. Simply drag them into your script graph and connect the response you want to do.
Still Looking for help?Visit Support