Predicting audio emotion from extracted audio features

I have built an audio emotion classifier and loaded it into my Streamlit app, where I extract the audio features so that I can predict the emotion. My problem is that an AssertionError occurs when I call the predict method on the model:

import pickle
filename = 'classifier.pkl'
pickle_in = open(filename, 'rb') 
model = pickle.load(pickle_in)

features = extract_feature(name_fileUp, mfcc=True, chroma=True, mel=True).reshape(1,-1)
prediction = model.predict([features])[0]
st.success(f"{prediction}")

Here is my error:
AssertionError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples).

Note: The extract_feature function already works.

Hey @nainiayoub,

It seems that features is not a string or list of strings. My advice would be to write features to the screen and check its type with type(features).

If any elements of features are not strings or lists of strings, you can convert them to strings and then pass them to the model!
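
As a rough illustration of that check, here is a minimal sketch in plain Python. The helper names describe_input and is_text_input are made up for this example, and the features list is only a stand-in for whatever extract_feature returns in the app. In Streamlit, the summary string could be shown with st.write instead of print.

```python
def describe_input(value):
    """Return a short summary of a value's type and the types of its elements."""
    summary = f"type: {type(value).__name__}"
    if isinstance(value, (list, tuple)):
        element_types = sorted({type(item).__name__ for item in value})
        summary += f", length: {len(value)}, element types: {element_types}"
    return summary


def is_text_input(value):
    """Return True if value is a str or a list made up only of str items."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


# Stand-in for the output of extract_feature(...); the real values will differ.
features = [0.12, -3.4, 5.6]

print(describe_input(features))                    # what the model would receive
print(is_text_input(features))                     # numeric features are not text
print(is_text_input([str(x) for x in features]))   # after converting to strings
```

If the extracted features are a NumPy array rather than a list, the first helper would only report the outer type, which is usually enough to see why the assertion fires.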

Happy Streamlit-ing!
Marisa

Thank you for your assistance and time @Marisa_Smith ,

I converted the features to a list of strings, and the program now runs without any error. However, instead of displaying the emotion (‘neutral’, ‘fearful’, ‘angry’ or ‘surprised’), which is the output I got in my notebook, I always get [2] as the prediction for every audio file I have used.

Here’s my code:

import pickle
filename = 'classifier.pkl'
pickle_in = open(filename, 'rb') 
model = pickle.load(pickle_in)

features = extract_feature(name_fileUp, mfcc=True, chroma=True, mel=True).reshape(1,-1)
features = list(features)
features = [str(feat) for feat in features]

prediction = model.predict(features)[0]
st.success(f"Prediction: {prediction}")

Here’s what I get: (screenshot not included)

Any ideas?

Hey @nainiayoub,

Am I correct in thinking that you have moved this over from a Jupyter Notebook? You may have left some code behind there, let me explain… :face_with_monocle:

Many machine learning models won’t take strings as the targets. That is, when you train your model you have to encode the correct answer as a number: happy = 0, sad = 1, angry = 2, etc. So when you run your model here, it is working properly and giving you the encoded answer of 2; if you were using my encoding above, that would correspond to angry. :female_detective:

I imagine that you have left out a dictionary or list from your notebook that maps the number to its word classification. :open_book:
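
To make that concrete, here is a minimal sketch of such a mapping. The ID_TO_EMOTION dictionary uses my example encoding from above, not your real one, so the numbers and labels are assumptions; the actual mapping has to come from the notebook where the model was trained. The decode_prediction helper is also just an illustrative name.

```python
# Hypothetical encoding taken from the example above (happy = 0, sad = 1, angry = 2).
# Replace it with the mapping that was actually used during training.
ID_TO_EMOTION = {0: "happy", 1: "sad", 2: "angry"}


def decode_prediction(prediction, id_to_label):
    """Turn an encoded prediction such as 2, [2] or [[2]] into its label."""
    # NumPy arrays and NumPy scalars expose tolist(); plain Python values do not.
    if hasattr(prediction, "tolist"):
        prediction = prediction.tolist()
    # Unwrap nested single-element containers until a bare number is left.
    while isinstance(prediction, (list, tuple)):
        prediction = prediction[0]
    return id_to_label[int(prediction)]


print(decode_prediction([2], ID_TO_EMOTION))
print(decode_prediction(0, ID_TO_EMOTION))
```

In the app, the decoded label could then be passed to st.success in place of the raw number. If the notebook used scikit-learn's LabelEncoder to build the targets, saving that encoder alongside the model and calling its inverse_transform method would likely serve the same purpose.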

Happy Streamlit-ing!
Marisa