Optimizing a language model app (GPT-Neo)

Hello community, I have been working on a text generation app using GPT-Neo, but I've observed that the app takes a long time to load and generate text. Does anyone know a way to optimize this app by using caching, so the app doesn't reload the model every time it runs? Here is the link to my code: Click here

Hi @seyirex,

That’s a really good idea! To prevent re-loading your entire model into memory on every widget interaction, you can wrap line 20 in st.cache. Doing so will reuse the same cached model across multiple simultaneous users, rather than loading a separate 500 MB model into RAM for each user.

Here’s an example – replace line 20 with the following:

# Tell st.cache to hash aitextgen objects by identity,
# since they aren't natively hashable
@st.cache(hash_funcs={aitextgen: id})
def load_model():
    # This slow download/load now runs only once per session
    model = aitextgen(model="EleutherAI/gpt-neo-125M")
    return model

ai = load_model()
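To see why this helps, here is a minimal sketch (plain Python, no Streamlit needed) of what the caching does conceptually: the expensive load runs once, and later calls reuse the stored result. The names and the counter here are purely illustrative.

```python
import functools

load_count = 0

@functools.lru_cache(maxsize=None)
def load_model():
    global load_count
    load_count += 1  # stand-in for the slow 500 MB model load
    return "gpt-neo-125M-weights"

m1 = load_model()  # first call: actually "loads" the model
m2 = load_model()  # later calls: served from the cache instantly
assert m1 is m2 and load_count == 1
```

With st.cache the idea is the same, except the cache is shared across reruns and users of your app.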

Happy Streamlit-ing! :balloon:
Snehan

Thanks a lot @snehankekre, this is really helpful.