
2018.11.22

Assignment 5: Voice Assistant

100 points

In this assignment, you will design your own customized voice assistant, triggered by a key phrase of your choice (in the spirit of “Hey Google” or “Hey Siri”). The implementation is simple, and you already have all the commands you need to accomplish this task.

Training your voice with Key Phrase [30 points]

Script File: run_record_voices.m

Begin with the starter code “run_record_voices.m” provided as part of this assignment. The code is designed so that you record five different modulations of the same key phrase (for instance, “Hey Jarvis”); make sure to repeat the same phrase all 5 times. Each recording lasts only 1.5 seconds, so make sure your phrase does not exceed that duration. Visualize the key phrase each time you record and inspect the plot; if you find any disturbances, redo the recording. None of the 5 recordings should contain disturbances.

Within the loop already written in this code, apply the function “voice_to_envelopes.m” (provided as part of this assignment) with the input arguments trigger_word and 100. This function acts like a filter: it converts your voice recordings to envelopes. Store the envelope returned by the function in one column of a matrix named training_envelopes, and visualize the envelope for each key phrase. After five recordings, training_envelopes should be of size 12000 x 5 (12000 samples corresponds to 1.5 seconds at a sampling rate of 8000 Hz). Make sure your envelopes meet this requirement. Now, save the envelopes in a MAT file as shown below.

save training_words training_envelopes

Check the comments in “run_record_voices.m” and the video “Recording Voice Sample.mov” (in the resources) for further information.
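The recording loop described above could be organized roughly as follows. This is only a sketch: the microphone recording and the provided envelope function are replaced by hypothetical stand-ins (record_phrase and to_envelope are not part of the assignment files, and fs = 8000 is assumed from the testing phase) so that the snippet runs on its own. The point is the column-wise storage, the plots, and the size check.

```matlab
fs = 8000;                      % sampling rate in Hz (assumed)
duration = 1.5;                 % seconds per recording
num_words = 5;                  % number of repetitions of the key phrase
num_samples = fs * duration;    % 12000 samples per recording
t = (0:num_samples - 1)' / fs;  % time axis in seconds

% Hypothetical stand-ins so the sketch runs without a microphone.
% In the real script, the starter code records trigger_word and the
% provided envelope function is called with (trigger_word, 100).
record_phrase = @() 0.5 * sin(2*pi*200*t) .* exp(-((t - 0.75) / 0.2).^2);
to_envelope   = @(x, n) abs(x);   % placeholder, NOT the provided filter

training_envelopes = zeros(num_samples, num_words);
for k = 1:num_words
    trigger_word = record_phrase();
    envelope = to_envelope(trigger_word, 100);
    training_envelopes(:, k) = envelope(:);   % store as column k

    % Inspect the recording and its envelope for disturbances.
    figure;
    subplot(2, 1, 1); plot(t, trigger_word);
    title(sprintf('Recording %d', k));
    subplot(2, 1, 2); plot(t, envelope);
    title(sprintf('Envelope %d', k)); xlabel('Time (s)');
end

% The matrix must be 12000 x 5 before it is saved.
assert(isequal(size(training_envelopes), [12000 5]));
```

Preallocating the matrix with zeros(num_samples, num_words) makes a wrong-length recording fail immediately at the column assignment instead of silently producing a matrix of the wrong size.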

Testing Phase [40 points]

Script file: run_test_voices.m

Create a new script file named “run_test_voices.m”. Make sure to clear the workspace before you proceed. Then load the MAT file training_words as shown below:

load training_words

This command loads the envelopes you computed in the previous script (the training samples). Now, start a while loop that lets the user utter phrases for a maximum of three attempts or until he/she says the key phrase, whichever comes first. You can copy and paste the code in “run_record_voices.m” to record the test words (make sure to set ‘fs’ to 8000). Apply voice_to_envelope to each phrase that the user utters and save the result in the variable testing_envelope:

testing_envelope=voice_to_envelope(test_audio,100)

Measure the similarity by computing the cross-correlation between testing_envelope and each column of training_envelopes. The maximum value of xcorr() provides the similarity measure between two envelopes. The input arguments to xcorr() are testing_envelope and one column of training_envelopes at a time.
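A threshold such as 0.9 is only meaningful if the similarity value does not depend on how loudly the phrase was spoken. The sketch below illustrates this with synthetic envelopes (hypothetical data, not real recordings). It assumes the 'coeff' option of xcorr() (Signal Processing Toolbox) is used for normalization; the assignment itself only asks for the maximum of xcorr(), and the provided envelope function may already scale its output, in which case the plain call could be sufficient.

```matlab
fs = 8000;
n = 12000;
t = (0:n - 1)' / fs;

% Synthetic envelopes standing in for real ones (hypothetical data).
training_column  = exp(-((t - 0.70) / 0.15).^2);  % one training envelope
testing_envelope = exp(-((t - 0.80) / 0.15).^2);  % same shape, 0.1 s later
louder_envelope  = 2 * testing_envelope;          % same phrase, twice as loud

% Unnormalized cross-correlation: the peak scales with the amplitude.
raw_quiet = max(xcorr(testing_envelope, training_column));
raw_loud  = max(xcorr(louder_envelope,  training_column));

% 'coeff' normalizes so that identical signals peak at 1 (assumed choice).
sim_quiet = max(xcorr(testing_envelope, training_column, 'coeff'));
sim_loud  = max(xcorr(louder_envelope,  training_column, 'coeff'));

fprintf('raw peaks:        %.1f vs %.1f\n', raw_quiet, raw_loud);
fprintf('normalized peaks: %.4f vs %.4f\n', sim_quiet, sim_loud);
```

Taking the maximum over all lags is what makes the measure tolerant to the phrase starting slightly earlier or later within the 1.5 second window.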

Hint: You can start a ‘for’ loop (over the number of training words) within the ‘while’ loop to compute the similarity between the test envelope and each training envelope.
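The inner ‘for’ loop from the hint might look like the sketch below. The training and test envelopes are synthetic placeholders (hypothetical data) so that the snippet runs without recordings, and the 'coeff' normalization is an assumption rather than something the assignment prescribes.

```matlab
fs = 8000;
n = 12000;
num_words = 5;
t = (0:n - 1)' / fs;

% Hypothetical training envelopes: the same bump at slightly different
% positions, mimicking five repetitions of one phrase.
training_envelopes = zeros(n, num_words);
for k = 1:num_words
    training_envelopes(:, k) = exp(-((t - 0.6 - 0.05 * k) / 0.15).^2);
end
testing_envelope = exp(-((t - 0.75) / 0.15).^2);

% One similarity value per training word.
similarity = zeros(1, num_words);
for k = 1:size(training_envelopes, 2)
    c = xcorr(testing_envelope, training_envelopes(:, k), 'coeff');
    similarity(k) = max(c);
end

disp(similarity);
```

Looping over size(training_envelopes, 2) instead of a hard-coded 5 keeps the code valid if the number of training recordings changes.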

For a given test phrase, you will have one similarity value per training word (5 values in total). If any of those values exceeds the threshold (0.9), display a message telling the user that he/she has decoded the password; otherwise, ask the user to repeat the phrase until he/she reaches the maximum number of attempts.
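The attempt logic described above can be sketched as follows. To keep the snippet runnable without a microphone, the similarity values of each attempt are read from a hypothetical matrix (simulated_similarity is made up for illustration); in the real script each row would be computed from a fresh recording. The message texts are also placeholders.

```matlab
threshold = 0.9;
max_attempts = 3;

% Hypothetical similarity values, one row per attempt and one column per
% training word. These are illustrative numbers, not measured results.
simulated_similarity = [0.41 0.38 0.52 0.47 0.35;
                        0.72 0.93 0.81 0.77 0.69;
                        0.95 0.91 0.97 0.92 0.90];

attempt = 0;
unlocked = false;
while attempt < max_attempts && ~unlocked
    attempt = attempt + 1;
    similarity = simulated_similarity(attempt, :);

    if any(similarity > threshold)
        % At least one training word matches closely enough.
        unlocked = true;
        fprintf('Attempt %d: key phrase recognized.\n', attempt);
    elseif attempt < max_attempts
        fprintf('Attempt %d: no match, please repeat the phrase.\n', attempt);
    else
        fprintf('Attempt %d: no match, maximum attempts reached.\n', attempt);
    end
end
```

Putting both conditions in the while header (attempt count and the unlocked flag) implements "whichever is earlier" without needing a break statement.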

Check “Test Result Sample #1.mov”, “Test Result Sample #2.mov” and “Test Result Sample #3.mov” (in the resources) for some of the possible results.

Report [30 points]

Now, write a report on your work. You may include the plots you obtained for your key phrase along with their envelopes. You can also include plots of the envelope for a case where the user said something different from the key phrase and for a case where he/she said exactly the same phrase. You may also study the performance of this algorithm by varying the threshold (instead of using 0.9) and report the results.
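One possible way to organize the threshold study is to collect the best similarity value of each test utterance and count how many would be accepted at each threshold. The scores below are hypothetical placeholders, not measured results; replace them with values from your own recordings.

```matlab
% Hypothetical best similarity scores (maximum over the 5 training words).
% Replace these placeholders with values measured in your own tests.
match_scores    = [0.96 0.91 0.88 0.97];   % user said the key phrase
nonmatch_scores = [0.62 0.86 0.71 0.93];   % user said something else

thresholds = [0.70 0.80 0.90 0.94];
true_accepts  = zeros(size(thresholds));
false_accepts = zeros(size(thresholds));

fprintf('threshold  accepted matches  accepted non-matches\n');
for i = 1:numel(thresholds)
    true_accepts(i)  = sum(match_scores > thresholds(i));
    false_accepts(i) = sum(nonmatch_scores > thresholds(i));
    fprintf('%9.2f  %16d  %20d\n', ...
            thresholds(i), true_accepts(i), false_accepts(i));
end
```

A lower threshold accepts more genuine utterances but also more wrong phrases, so a table or plot of both counts against the threshold shows the trade-off directly.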

Things to be submitted:

run_record_voices.m

run_test_voices.m

training_words.mat

Report