Caching not working for a model training attribute

I have a pipeline class that contains all the functions for preprocessing, feature analysis, and finally training. My application uses caching for data loading and model training. Each time I perform an operation such as feature removal or analysis, the result of the data loading function is retrieved from the cache. However, when I train the model and then try to run evaluation, the model training function is not served from the cache (the cache is missed). I tried replacing the model training function with a simple function that takes a string and returns it, and the cache is still missed.

However, when I only load the data and run model training, caching works. It fails only when I perform all the preprocessing steps, train the model, and then run evaluation.

Hi @Sarmila_Upadhyaya, welcome to the Streamlit community!

So that the community can more effectively help you, please post your code as text (not as a picture), so that people can see what you are doing.


I have a column of Y labels, which I convert into a NumPy ndarray using the following code:

truth = df[truth_column].astype(str).values
I then passed this value, along with other arguments, to a function:

> def get_trained_model(self, model, config, data, truth):
> ml_model = model.create(config)
>, truth)
> return ml_model

The “truth” argument was what broke the caching: when I did not pass “truth”, the cache worked. (I also tried a dummy function with three arguments that simply returns data.)
The cache was missed only when I passed truth.

However, when I replaced the above code with

truth = df.eval(truth_column).apply(str)

and passed a pandas Series rather than a NumPy ndarray, it worked. I am not sure whether this is a bug.
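
The elimination approach described above (dropping or swapping one argument at a time until the cache hits) can also be automated. The sketch below is a minimal, standard-library illustration and not Streamlit's actual hashing: `content_key`, `unstable_arguments`, and `build_kwargs` are hypothetical names, and the run counter inside `truth` only simulates an argument whose hashed bytes change between reruns even though the labels themselves do not. Whether that is what happens with the ndarray in this thread is an assumption, not something confirmed here.

```python
import hashlib
import itertools
import pickle


def content_key(value):
    """Hypothetical helper: derive a cache key from a value's pickled bytes."""
    return hashlib.sha256(pickle.dumps(value)).hexdigest()


def unstable_arguments(build_kwargs):
    """Build the arguments twice and return the names whose keys differ."""
    first = build_kwargs()
    second = build_kwargs()
    return [
        name
        for name in first
        if content_key(first[name]) != content_key(second[name])
    ]


# Counter standing in for "something that changes on every rerun".
_rerun = itertools.count()


def build_kwargs():
    """Hypothetical stand-in for the code that prepares the cached call."""
    labels = ["cat", "dog", "cat"]
    return {
        "config": {"max_depth": 3},
        "data": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
        # Assumption for illustration: same labels, but the hashed
        # representation differs on each rerun.
        "truth": (next(_rerun), labels),
    }


print(unstable_arguments(build_kwargs))  # prints ['truth']
```

Run against the real code that prepares the arguments, a check like this would point at whichever input needs to be converted to a form that hashes consistently before it is passed to the cached function.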