Streamlit-aggrid select rows very slow on Streamlit Cloud

I am running into problems using Streamlit-aggrid on Streamlit Cloud. The issue arises when using paginated display and allowing the user to select rows. In particular, when selection_mode="multiple" and I initialize all checkboxes to checked, the aggrid is very slow to render, and the problem grows with data size. I have this problem only on Streamlit Cloud, not on my local machine. I illustrate the problem with a simple app (codebase here).

Below is a summary of my analysis of the problem and some questions (you can reproduce this behavior using the “Explore in Depth” page of my app). Is it possible to fix this on Streamlit Cloud?

  • It seems like st_aggrid.__init__.py is being executed each time a selection event (click) occurs on the aggrid display. I can detect this by catching a warning that it throws (at line 42 of __init__.py, about the iteritems() deprecation) and recording the total number of warnings. I thought __init__.py ran only once, when the app started. Why does it run so often? I believe this is the source of the problem.

  • When I use selection_mode="multiple", initialize all checkboxes to checked, and use a modestly sized dataset (around 200 rows), the aggrid is very slow to render and throws many warnings. It looks like it is running st_aggrid.__init__.py (and perhaps some other initialization) once for each page of data in the aggrid display. I suspect this is why it is so slow.

  • This problem arises only on Streamlit Cloud, not on my local computer (which runs the same package versions, but on Windows 10 rather than Linux). I haven't tested other cloud platforms yet.
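
A note on the first bullet: a warning attributed to a line in __init__.py does not by itself show that the module is being re-imported. If the deprecated call sits inside a function that is defined in that file, the warning fires every time the function is called, and Streamlit reruns the whole script (including the AgGrid(...) call) on each interaction. The sketch below illustrates this with plain Python. It assumes the warning comes from inside a function; render_grid and count_warnings are hypothetical names, not part of st_aggrid.

```python
import warnings


def render_grid(rows):
    """Hypothetical stand-in for a component function defined in a package's __init__.py."""
    # Stand-in for the deprecated call: the warning is raised at call time, not import time.
    warnings.warn("iteritems is deprecated", FutureWarning)
    return len(rows)


def count_warnings(n_calls):
    """Call render_grid n_calls times and return how many warnings were recorded."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every occurrence instead of only the first one per code location.
        warnings.simplefilter("always")
        for _ in range(n_calls):
            render_grid([1, 2, 3])
    return len(caught)


print(count_warnings(3))
```

If this assumption holds, a growing warning count would mean that the grid function is being invoked repeatedly (one or more script reruns), which is a different thing from the package being initialized repeatedly.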

Below is code for a minimal example (same code as 01_Minimal_Example.py in repo) and a screenshot of the results I got running on Streamlit Cloud.

import warnings
import numpy as np
import pandas as pd
import streamlit as st
#
from st_aggrid import GridOptionsBuilder, AgGrid, GridUpdateMode
from st_aggrid.shared import ColumnsAutoSizeMode


# Make sample data
number_of_rows = 200
number_of_columns = 5
df = pd.DataFrame(
    data=np.arange(number_of_rows*number_of_columns).reshape((number_of_rows, number_of_columns))
)

# DataFrame for AgGrid
df = df.reset_index()   # aggrid doesn't show DataFrame index
df.columns = [str(c) for c in df.columns]    # bug: GridOptionsBuilder only allows string column names

# Display the DataFrame and get click events.
# Count the number of warnings it throws.
# These warnings all come from st_aggrid.__init__.py
with warnings.catch_warnings(record=True) as warns:
    st.header("Minimal Example")
    st.markdown("""
        This page is the minimum example illustrating the problem:
        
        - When select_mode="multiple" and all checkboxes are initialized to true, the data is slow to load, presumably related to the many warnings that are thrown, as can be seen by the warning count displayed below the aggrid DataFrame.  This problem is even worse for larger datasets.
    """)

    st.subheader("Aggrid Data Selection")
    # Set grid options.
    # Bug seems to occur when selection_mode="multiple" and all rows are pre-selected
    gb = GridOptionsBuilder.from_dataframe(df)
    gb.configure_selection(
        selection_mode="multiple",
        use_checkbox=True,
        pre_selected_rows=list(range(len(df))),
    )
    gb.configure_pagination(paginationAutoPageSize=False, paginationPageSize=10)
    gridOptions = gb.build()

    # Display DataFrame and get click events
    grid_response = AgGrid(
        df,
        gridOptions=gridOptions,
        update_mode=GridUpdateMode.GRID_CHANGED,
        columns_auto_size_mode=ColumnsAutoSizeMode.FIT_CONTENTS,
    )

    # Accumulate the warning count across reruns and display it
    st.subheader("Warning Information")
    if "warning_count_mwe" not in st.session_state:
        st.session_state["warning_count_mwe"] = 0
    st.session_state["warning_count_mwe"] += len(warns)
    st.write(f"""Number of warnings since app start: {st.session_state["warning_count_mwe"]}""")

    st.write("Most recent warnings:")
    st.write(warns)
    for w in warns:
        st.write(w)
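
To put a number on "slow", the AgGrid call could also be wrapped in a small timing helper and the durations compared between Streamlit Cloud and a local run. This is only a sketch: timed and timings are hypothetical names, the sleep is a placeholder for the AgGrid(...) call, and the measurement covers only the Python side of the call, not the time spent in the browser.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, timings):
    """Append the wall-clock duration of the wrapped block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(label, []).append(time.perf_counter() - start)


timings = {}
with timed("aggrid_call", timings):
    time.sleep(0.01)  # placeholder for the AgGrid(...) call

print(f"aggrid_call took {timings['aggrid_call'][-1]:.3f} s")
```

In the app, timings could be kept in st.session_state so that durations from successive reruns accumulate, the same way the warning count does above.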

Hi @ngallo1, have you figured out how to resolve this issue?

I also encountered the same problem on the Cloud side. I am using aggrid for single selection and then displaying the result. It takes an extremely long time for aggrid to start responding to the select action.

You may also report your issue at its GitHub repo so that it is easier to track.