Hi guys!
I’ve got a Streamlit script to which I’ve added the st.cache
decorator - as follows:
@st.cache
def get_response(input_text, num_return_sequences, num_beams):
    batch = tokenizer.prepare_seq2seq_batch([input_text], truncation=True, padding='longest', max_length=60).to(torch_device)
    translated = model.generate(**batch, max_length=60, num_beams=num_beams, num_return_sequences=num_return_sequences, temperature=1.5)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    return tgt_text
I assume this is the correct way to cache the function, yet the whole script re-runs every time I change anything in the app, even though the cached data does not mutate.
I'm not sure whether I'm using the cache incorrectly or whether something else needs to be cached as well (I've put a sketch of what I had in mind after the full code below).
I've pasted the full code below, FYI:
import streamlit as st
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
model_name = 'tuner007/pegasus_paraphrase'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
@st.cache
def get_response(input_text, num_return_sequences, num_beams):
    batch = tokenizer.prepare_seq2seq_batch([input_text], truncation=True, padding='longest', max_length=60).to(torch_device)
    translated = model.generate(**batch, max_length=60, num_beams=num_beams, num_return_sequences=num_return_sequences, temperature=1.5)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    return tgt_text
context = "The ultimate test of your knowledge is your capacity to convey it to another."
num_return_sequences=10
num_beams=10
response = get_response(context,num_return_sequences,num_beams)
st.write('response 3')
st.write(response)
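One thing I wasn't sure about is whether the model and tokenizer load should be cached too. This is just a sketch of what I had in mind (not tested, and allow_output_mutation=True is only my guess for skipping the hashing of the returned model objects):

@st.cache(allow_output_mutation=True)
def load_model():
    # Load the tokenizer and model once and reuse them across re-runs
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
    return tokenizer, model

tokenizer, model = load_model()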
Any guidance is greatly appreciated, as always!
Thanks,
Charly