Hello community, I have been working on a text generation app using GPT-Neo, but I've noticed that the app takes a long time to load and generate text. Does anyone know a way to optimise this with caching, so the app doesn't reload the model every time it runs? Here is the link to my code: Click here
Hi @seyirex,
That’s a really good idea! To prevent re-loading your entire model into memory on every widget interaction, you can wrap line 20 in st.cache. Doing so will reuse the same cached model across multiple simultaneous users, rather than loading a separate 500 MB model into RAM for each user.
Here’s an example – replace line 20 with the following:
# Cache by object identity, since aitextgen model objects aren't hashable
@st.cache(hash_funcs={aitextgen: id})
def load_model():
    model = aitextgen(model="EleutherAI/gpt-neo-125M")
    return model

ai = load_model()
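If it helps to see why this avoids the repeated load: here's a framework-free sketch of the idea behind st.cache (illustrative only, not Streamlit's actual implementation) – the expensive load runs once, and every later call gets the same object back.

```python
# Hypothetical stand-in for @st.cache: memoize the expensive load so
# repeated calls return the same object instead of reloading the model.
_cache = {}

def cache(func):
    def wrapper():
        if func not in _cache:
            _cache[func] = func()  # expensive load runs only on the first call
        return _cache[func]
    return wrapper

load_count = 0

@cache
def load_model():
    global load_count
    load_count += 1            # track how many times the "model" is loaded
    return {"weights": "..."}  # placeholder for the real aitextgen model

m1 = load_model()
m2 = load_model()
print(m1 is m2, load_count)  # → True 1
```

Streamlit reruns your whole script on every interaction, which is why an uncached load would repeat; with the cache, reruns hit the stored object instead.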
Happy Streamlit-ing!
Snehan
Thanks a lot @snehankekre, this is really helpful.