How to convert Top2Vec to pickle extension

Summary

Hello. I would like to export a Top2Vec model using a download button. To do this, I'm trying to save the model as a pickle. Below is some sample code:

Steps to reproduce

Code snippet:

import io
import pickle
import pandas as pd
from top2vec import Top2Vec
import streamlit as st

# df is a dataframe with text columns
df = pd.read_csv('mycsv.csv')
docs = list(df.loc[:,"mycolumn"].values)

model = Top2Vec(docs, embedding_model='universal-sentence-encoder-multilingual')

def pickle_model(model):
    """Pickle the model inside bytes. In our case, it is the "same" as storing a file, but in RAM."""
    f = io.BytesIO()
    pickle.dump(model, f)
    return f

data = pickle_model(model)
st.download_button("Download .pkl file", data=data, file_name="my-pickled-model.pkl")

Expected behavior:

I expect the data variable to be <_io.BytesIO at 0xXXXXXXXX> and a download button to be created.

Actual behavior:

I get the following error:
AttributeError: Can't pickle local object 'Loader._recreate_base_user_object.<locals>._UserObject'

Do you have some ideas how to fix it?

Try this:

def pickle_model(model):
    return pickle.dumps(model)

data = pickle_model(model)
st.download_button("Download .pkl file", data=data, file_name="my-pickled-model.pkl",
                   mime="application/octet-stream")
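A minimal sketch of what this suggestion changes, using a plain dict as a hypothetical stand-in for the model (Top2Vec is not needed to see the difference). `pickle.dump` writes into a `BytesIO` buffer, while `pickle.dumps` returns the bytes directly. Both round-trip to the same object, so this change on its own may not help if the object itself cannot be pickled.

```python
import io
import pickle

# Hypothetical stand-in for the trained model; any picklable object works.
fake_model = {"topics": ["billing", "support"], "num_docs": 1000}


def pickle_to_buffer(obj):
    """Original approach: pickle into an in-memory file object."""
    buffer = io.BytesIO()
    pickle.dump(obj, buffer)
    buffer.seek(0)  # rewind so a later read() starts at the first byte
    return buffer


def pickle_to_bytes(obj):
    """Suggested approach: get the pickled payload as bytes directly."""
    return pickle.dumps(obj)


# Both payloads can be loaded back into an equal object.
from_buffer = pickle.loads(pickle_to_buffer(fake_model).getvalue())
from_bytes = pickle.loads(pickle_to_bytes(fake_model))
```

Either return value can be handed to a download widget that accepts bytes or a binary file object.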

For future readers: a Top2Vec model is not a typical machine learning model object, so pickling it directly can cause problems. I got both "NoneType object" and "Can't pickle local object" errors. I recommend first saving the model with model.save() and loading it back into a variable with Top2Vec.load(). Only then try to pickle it.

import pandas as pd
import io
import pickle
from top2vec import Top2Vec
import streamlit as st

@st.cache_resource()
def topic_modeling(docs):
    model = Top2Vec(docs, embedding_model='universal-sentence-encoder-multilingual')
    model.save("model_save")
    model = Top2Vec.load("model_save")
    return model

def pickle_model(model):
    f = io.BytesIO()
    pickle.dump(model, f)
    return f

uploaded_file = st.file_uploader("Choose a file", type=['csv'], help='Accepts only CSV files')

if uploaded_file is not None:
    df = pd.read_csv(uploaded_file)
    df = df.head(1000)
    docs = list(df.loc[:,"NPSCombined"].values)
    model = topic_modeling(docs)
    file = pickle_model(model)
    st.write(model, file)
else:
    st.write("upload file")
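The error message in the question names a "local object", that is, a class defined inside a function, which pickle cannot serialize because it looks classes up by their importable name. The sketch below reproduces that failure with hypothetical stand-ins (`make_encoder` and a dict playing the role of the model; neither is part of Top2Vec). That the save/load round trip helps because the reloaded model no longer carries such an object is an assumption, not something verified here.

```python
import pickle


def make_encoder():
    # A class defined inside a function is a "local object": pickle cannot
    # find it by a module-level name, so its instances cannot be pickled.
    class _LocalEncoder:
        pass

    return _LocalEncoder()


# Hypothetical stand-in for a model that holds an unpicklable encoder.
model = {"topics": ["billing", "support"], "embed": make_encoder()}

try:
    pickle.dumps(model)
    error_message = None
except (AttributeError, pickle.PicklingError) as exc:
    # The exception type depends on the Python version.
    error_message = str(exc)

# Dropping the unpicklable part makes the rest of the object picklable.
stripped = dict(model, embed=None)
payload = pickle.dumps(stripped)
restored = pickle.loads(payload)
```

If this reading is right, the downloaded pickle would not contain the encoder, so whoever loads it may need to attach an embedding model again before embedding new text.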
