Caching with class variables

Summary

I’d like to know whether there’s a way to cache method results when the methods are part of a class and rely on the class’s attributes.

Steps to reproduce

Code snippet:

import streamlit as st


class TestClass(object):
    def __init__(self, test_var):
        self.test_var = test_var

    @st.experimental_memo
    def var_plus_one(self):
        return self.test_var + 1

test_class_a = TestClass(2)
print(test_class_a.var_plus_one())
print(test_class_a.var_plus_one())

test_class_b = TestClass(10)
print(test_class_b.var_plus_one())

Expected behavior:

Ideally this would print 3, then 3 and then 11. The second call should not have to re-run the method; it should return the cached 3 instead.

Basically, if neither a method’s input variables nor the class’s attributes have changed, it would use the cached version. Conversely, if we create a new instance with a different attribute, it should definitely re-run. Whether it should re-run for a new instance with the same attribute (for example, test_class_b = TestClass(2) above) is debatable: I don’t need it to re-run in my case, but I can imagine there are times when it might need to.

Actual behavior:

I get an error when trying to run this.

Additional information

If this isn’t possible currently, I’m wondering if the best way to cache is to move the methods outside of the class (then it becomes a balancing act between whether it’s better to have caching or OOP for each case).

That should be covered by functools.cache.
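A minimal sketch of what that could look like, assuming Python 3.9+ (where functools.cache was added). The call_log list is a hypothetical helper, added here only to show which calls actually execute. Note that self is hashed by identity by default, so two instances with the same test_var get separate cache entries, and changing test_var afterwards would not invalidate a cached result.

```python
import functools

# Hypothetical helper: records every call that really executes the method body.
call_log = []


class TestClass:
    def __init__(self, test_var):
        self.test_var = test_var

    @functools.cache  # cache key is (self,); self is hashed by identity by default
    def var_plus_one(self):
        call_log.append(self.test_var)
        return self.test_var + 1


test_class_a = TestClass(2)
first = test_class_a.var_plus_one()   # runs the method body
second = test_class_a.var_plus_one()  # served from the cache

test_class_b = TestClass(10)
third = test_class_b.var_plus_one()   # different instance, so it runs again

print(first, second, third)
print(call_log)
```

One thing to keep in mind with this approach: the cache holds a reference to each instance it has seen, so those instances stay alive for as long as the cache does.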

This is my first attempt at really using caching, so I might not be fully grasping your idea of using functools.cache:

  • If that exists and does basically the same thing as st.experimental_memo, why did Streamlit even come up with their own version - I’m guessing there’s some difference?
  • Is there a similar functools method that deals with singletons?

On cache hits, st.experimental_memo returns a new object each time you call the function, so mutating the returned object does not affect the cached one. It also offers ttl, max_entries and integration with Streamlit widgets.

I don’t understand what you mean by a similar method that deals with singletons.

@msquaredds What needs to happen is that the Streamlit cache needs to recognize the argument self (which has type TestClass) as a cacheable type of argument. Here are two ways to make an object of that class register as cacheable: you can either make the class a frozen dataclass, or add a __reduce__ method to it. Either of these works fine.

from dataclasses import dataclass

import streamlit as st


# Option 1: a frozen dataclass is immutable and hashable by its field values.
@dataclass(frozen=True)
class TestClass:
    test_var: int

    @st.experimental_memo
    def var_plus_one(self):
        return self.test_var + 1


test_class_a = TestClass(2)
print(test_class_a.var_plus_one())
print(test_class_a.var_plus_one())

test_class_b = TestClass(10)
print(test_class_b.var_plus_one())


class TestClass2:
    def __init__(self, test_var):
        self.test_var = test_var

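    # Option 2: __reduce__ returns the callable and arguments needed to rebuild
    # the object, which the cache can use to derive a key from test_var.
    def __reduce__(self):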
        return TestClass2, (self.test_var,)

    @st.experimental_memo
    def var_plus_one(self):
        return self.test_var + 1


test_class_a2 = TestClass2(2)
print(test_class_a2.var_plus_one())
print(test_class_a2.var_plus_one())

test_class_b2 = TestClass2(10)
print(test_class_b2.var_plus_one())

Interesting, thank you @Goyo and @blackary for the help, I think I understand a little better now.

@Goyo the experimental_memo use case makes sense to me now. What I meant by singletons is that streamlit also has experimental_singleton - can functools.cache be applied in the same way to deal with functions that return something like a DB connection (I haven’t played around with it yet but it wasn’t clear to me from the description)?

Well, in that sense functools.cache is more akin to st.experimental_singleton than to st.experimental_memo, because it caches by reference (it returns the very same object that was cached, not a copy of it), so I guess you can use it to cache a DB connection.

Indeed I couldn’t tell that just from looking at the docs, I had to actually try.
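A quick check along those lines might look like the sketch below. It is not necessarily the exact test described above. FakeConnection and get_connection are hypothetical names standing in for a real DB connection and its factory, and Python 3.9+ is assumed for functools.cache.

```python
import functools


class FakeConnection:
    """Hypothetical stand-in for a real DB connection object."""

    def __init__(self, dsn):
        self.dsn = dsn


@functools.cache
def get_connection(dsn):
    # The body runs once per distinct dsn; later calls return the stored object.
    return FakeConnection(dsn)


conn_a = get_connection("example-dsn")
conn_b = get_connection("example-dsn")

# Both names point at the very same object, not at two equal copies.
same_object = conn_a is conn_b
print(same_object)
```

Because the same object is handed back every time, any mutation a caller makes to it is visible to every later caller, which is the opposite of the copy-on-hit behaviour described for st.experimental_memo.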

Got it, that makes sense, thanks!