PyTorch App is re-running every time despite having cached the function

Hi guys!

I've got a Streamlit script to which I've added the st.cache decorator, as follows:

@st.cache
def get_response(input_text, num_return_sequences, num_beams):
    batch = tokenizer.prepare_seq2seq_batch([input_text], truncation=True, padding='longest', max_length=60).to(torch_device)
    translated = model.generate(**batch, max_length=60, num_beams=num_beams, num_return_sequences=num_return_sequences, temperature=1.5)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    return tgt_text

I assume this is the correct way to set up caching, yet everything re-runs every time I amend anything in the app, even though the cached data doesn't mutate.

I'm not sure whether I'm doing the caching incorrectly or whether something else in the script needs caching too?
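
For instance, I wondered whether the model and tokenizer loading at the top of the script should also live inside its own cached function. This is just a rough sketch of what I had in mind (I'm assuming allow_output_mutation=True would be needed since the model object isn't hashable), not something I've confirmed works:

import streamlit as st
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

# Sketch only: cache the expensive model/tokenizer loading so it isn't
# repeated on every rerun of the script
@st.cache(allow_output_mutation=True)
def load_model(model_name):
    torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
    return tokenizer, model, torch_device

tokenizer, model, torch_device = load_model('tuner007/pegasus_paraphrase')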

I've pasted the full code below FYI:

import streamlit as st
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
model_name = 'tuner007/pegasus_paraphrase'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)

@st.cache
def get_response(input_text,num_return_sequences,num_beams):
    batch = tokenizer.prepare_seq2seq_batch([input_text],truncation=True,padding='longest',max_length=60).to(torch_device)
    translated = model.generate(**batch, max_length=60, num_beams=num_beams, num_return_sequences=num_return_sequences, temperature=1.5)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    return tgt_text

context = "The ultimate test of your knowledge is your capacity to convey it to another."
num_return_sequences=10
num_beams=10
response = get_response(context,num_return_sequences,num_beams)

st.write('response 3')
st.write(response)

Any guidance is greatly appreciated, as always! 🙂

Thanks,
Charly