Help us stress test Streamlit’s latest caching update

Hi - Posting this here in case it helps anyone looking to resolve this issue:
I ran into a number of hashing problems while writing a class that uses Keras. Here is how I worked around them, covering every object type that caused a hashing error:

    hash_funcs = {'_thread.RLock': lambda _: None,
                  '_thread.lock': lambda _: None,
                  'builtins.PyCapsule': lambda _: None,
                  '_io.TextIOWrapper': lambda _: None,
                  'builtins.weakref': lambda _: None,
                  'builtins.dict': lambda _: None}

and then before every cached function:

    @st.cache(hash_funcs=hash_funcs)
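One caveat with mapping a type to `lambda _: None`: every object of that type then contributes the same value to the cache key, so changes inside such objects go unnoticed. The snippet below is a toy, stdlib-only model of that effect. `type_key` and `toy_hash` are hypothetical helpers written for this illustration and are not Streamlit's implementation; only the shape of the `hash_funcs` dictionary mirrors the one above.

```python
import hashlib


def type_key(obj):
    """Return 'module.qualname' for obj's type, e.g. 'builtins.dict'."""
    t = type(obj)
    return f"{t.__module__}.{t.__qualname__}"


def toy_hash(obj, hash_funcs):
    """Toy stand-in for a cache-key hasher (an assumption, not Streamlit code)."""
    override = hash_funcs.get(type_key(obj))
    if override is not None:
        # The override's return value is hashed in place of the object.
        obj = override(obj)
    return hashlib.md5(repr(obj).encode()).hexdigest()


hash_funcs = {"builtins.dict": lambda _: None}

# With the override, two different dicts collapse to the same key.
a = toy_hash({"lr": 0.1}, hash_funcs)
b = toy_hash({"lr": 0.9}, hash_funcs)
print(a == b)

# Without the override, the same two dicts produce different keys.
c = toy_hash({"lr": 0.1}, {})
d = toy_hash({"lr": 0.9}, {})
print(c == d)
```

If the same holds for the real hasher, ignoring `builtins.dict` wholesale is worth a second look whenever a cached function takes a dict argument whose contents should affect the result.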

Hey all :wave:,

A few quick updates.

As of 0.58.0, type tf.Session is now natively supported in Streamlit.

As of 0.59.0 the following are now natively supported in Streamlit:

As of 0.60.0 the following are now natively supported in Streamlit:

As of 0.61.0 the following are now natively supported in Streamlit:

We’ll update the thread when we have a few more on nightly or in a general release :hearts:


Similar bug to Issue #1181 mentioned above in comment #11.

Decorating a method that calls super() raises streamlit.hashing.InternalHashError: Cell is empty

Specifically:

    
    # from a file that I'm hesitant to import streamlit into because it's a shared dependency reused elsewhere
    class Dataset:

        def load_master_dataset(self, csv_path):
            self.master_df = pd.read_csv(csv_path)
            self.master_df.rename(columns = {v:k for k,v in self.label_map.items()},inplace=True)
            self.master_df.drop_duplicates(subset = ['catalog_number'], keep='first', inplace=True)
            self.master_df.set_index('catalog_number', inplace = True)
        
        ...

    # in another file
    class CacheDataset(Dataset):

        @st.cache
        def load_master_dataset(self, csv_path):
            super().load_master_dataset(csv_path)
    
        ...

Raises:

streamlit.hashing.InternalHashError: Cell is empty

While caching the body of load_master_dataset(), Streamlit encountered an
object of type builtins.function, which it does not know how to hash.

In this specific case, it’s very likely you found a Streamlit bug so please
file a bug report.

In the meantime, you can try bypassing this error by registering a custom
hash function via the hash_funcs keyword in @st.cache(). For example:

@st.cache(hash_funcs={builtins.function: my_hash_func})
def my_func(...):
    ...

If you don’t know where the object of type builtins.function is coming
from, try looking at the hash chain below for an object that you do recognize,
then pass that to hash_funcs instead:

Object of type builtins.function: <function CacheDataset.load_master_dataset at 0x12351fca0>

Please see the hash_funcs [documentation]
(https://streamlit.io/docs/caching.html)
for more details.

I’m pretty sure the built-in function in question is super().

  1. I tried decorating the base class’s load_master_dataset with @st.cache and then directly importing that (so cutting out CacheDataset), which works just fine. I can make this change for my use case, but it isn’t super elegant.
  2. super’s mro includes <class 'object'>

So am I misusing @st.cache, or should I figure out how to hash super()?
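One plausible explanation for the "Cell is empty" wording, offered as a guess and not confirmed against the Streamlit source: a method that uses zero-argument super() carries a hidden `__class__` closure cell, and Python only fills that cell after the class body has finished executing. A decorator runs while the class body is still executing, so anything that reads the cell at that moment gets `ValueError: Cell is empty`. The sketch below reproduces this with the standard library alone; `inspect_closure`, `Base`, and `Child` are hypothetical names used for illustration.

```python
def inspect_closure(func):
    """Decorator that tries to read func's closure cells at decoration time."""
    report = []
    cells = func.__closure__ or ()
    for name, cell in zip(func.__code__.co_freevars, cells):
        try:
            cell.cell_contents  # raises ValueError while the cell is unfilled
            report.append((name, "filled"))
        except ValueError as exc:
            report.append((name, str(exc)))
    func.closure_report = report
    return func


class Base:
    def load(self):
        return "base"


class Child(Base):
    @inspect_closure
    def load(self):
        # Zero-argument super() relies on the implicit __class__ cell.
        return super().load()


# What the decorator saw while the class body was still running.
print(Child.load.closure_report)

# After class creation the same cell is filled and the call works.
print(Child.load.__closure__[0].cell_contents is Child)
print(Child().load())
```

If this is indeed what happens, the object that cannot be hashed is the `__class__` cell of the decorated method and not super() itself, which would also be consistent with the base-class workaround in point 1 succeeding, since that method has no such cell.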