How to prevent recurring API calls with each code re-run? (st.cache throws the UnhashableTypeError)

Charly_Wargnier · July 19, 2020, 10:12pm

Hi guys,

I’m currently experimenting with the Google NLP API in Streamlit, seeking to limit the number of API calls in order to optimise costs. (app users will have to upload their own GCP credentials, so cost optimisation is key. :))

Currently, each time I move e.g. a slider, the whole script is re-run and API costs are incurred accordingly.

There are 2 functions related to these API calls:

@st.cache(allow_output_mutation=True) 
def sample_analyze_entities(html_content):
    client = language_v1.LanguageServiceClient()
    type_ = enums.Document.Type.HTML #you can change this to be just text; doesn't have to be HTML.
    language = "en"
    document = {"content": html_content, "type": type_, "language": language}
    encoding_type = enums.EncodingType.UTF8
    response = client.analyze_entities(document, encoding_type=encoding_type)
    return response

@st.cache(allow_output_mutation=True) 
def return_entity_dataframe(response):
  output = sample_analyze_entities(response.data)
  output_list = []
  for entity in output.entities:
    entity_dict = {}
    entity_dict['entity_name'] = entity.name
    entity_dict['entity_type'] = enums.Entity.Type(entity.type).name
    entity_dict['entity_salience('+response._request_url+')'] = entity.salience
    entity_dict['entity_number_of_mentions('+response._request_url+')'] = len(entity.mentions)
    output_list.append(entity_dict)
  json_entity_analysis = json.dumps(output_list)
  df = pd.read_json(json_entity_analysis)
  summed_df = df.groupby(['entity_name']).sum()
  # sort_values returns a new DataFrame, so the result has to be assigned back
  summed_df = summed_df.sort_values(by=['entity_salience('+response._request_url+')'], ascending=False)
  return summed_df

I tried @st.cache(allow_output_mutation=True) on both, yet no luck: Streamlit throws an UnhashableTypeError.

I tried various st.cache parameters, yet I get the same error every time.

Any idea on how to overcome this issue?

Very grateful for your help, as always!

Thanks,
Charly

Ian_Calvert · July 19, 2020, 11:24pm

The issue is that it can’t work out how to hash the input arguments. You can explicitly tell it how, but it is easier to use data structures Streamlit already knows how to hash.

The function only cares about response.data and the URL, so you could pass those into the function instead of the response object. They are more likely to be hashable.

Charly_Wargnier · July 20, 2020, 12:09am

Thanks Ian!

Pardon my ignorance about caching but would you mind clarifying:

easier is to use data structures streamlit knows how to hash already

Which data structures are you referring to?

Thanks,
Charly

Ian_Calvert · July 20, 2020, 6:26am

So, for example, Streamlit knows how to calculate the hash of strings, numbers, lists, and (I think) dicts. It also supports some more complex types, including StringIO and BytesIO.
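To illustrate why such types are easy to hash by content while an arbitrary object is not, here is a toy content hasher. It is not Streamlit's hasher and makes no claim about how Streamlit works internally; content_hash and _update are hypothetical names used only for this sketch.

```python
import hashlib
import io


def content_hash(value):
    """Toy content hash for a few built-in types (not Streamlit's hasher)."""
    h = hashlib.sha256()
    _update(h, value)
    return h.hexdigest()


def _update(h, value):
    # Mix in the type name so that, e.g., 1 and "1" hash differently.
    h.update(type(value).__name__.encode())
    if value is None or isinstance(value, (str, int, float, bool)):
        h.update(repr(value).encode())
    elif isinstance(value, bytes):
        h.update(value)
    elif isinstance(value, (list, tuple)):
        h.update(str(len(value)).encode())  # length keeps nesting unambiguous
        for item in value:
            _update(h, item)
    elif isinstance(value, dict):
        h.update(str(len(value)).encode())
        for key in sorted(value, key=repr):  # order-independent for dicts
            _update(h, key)
            _update(h, value[key])
    elif isinstance(value, (io.StringIO, io.BytesIO)):
        _update(h, value.getvalue())
    else:
        # An arbitrary object (such as an HTTP response) has no obvious content.
        raise TypeError("don't know how to hash " + type(value).__name__)


same = content_hash({"a": [1, 2], "b": "x"}) == content_hash({"b": "x", "a": [1, 2]})
```

Passing something like object() raises TypeError here, which mirrors the situation in the question: the cache has no rule for turning that argument into a key.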

Charly_Wargnier · October 24, 2020, 8:35pm

I’ve finally sorted it but thanks for your input!

Charly
