My streamlit folium map crashes when I apply groupby operations on my original h3 dataframe

I am developing a geo app with streamlit that plots Uber H3 hexagons with different types of feature groups over a folium map. The dataframe with the hexagon information looks something like

            hex_id  hex_resolution    region  latitude   longitude  a_points  a_mean_value  b_points  b_mean_value
0  8b664175a6c9fff              11  Cacahual  3.380089  -67.729749         3    113.829310        16      0.000000
1  8b664e9958a5fff              11  Cacahual  3.159770  -67.741843         2     49.890930        19     85.263158
2  8b66412a5d51fff              11  Cacahual  3.426441  -67.520657         2     48.978256        20     90.000000
3  8b66412b50eefff              11  Cacahual  3.412557  -67.573996         4     49.393820        17     96.470588
4  8b664104ab4afff              11  Cacahual  3.606199  -67.513259         4     48.174440        16     99.375000

For displaying the map I am using streamlit-folium. Different zoom values display different resolutions of the H3 hexagons, and I do the conversion between resolutions using h3pandas with the following function

def charge_df(file_path, resolution, first_n_rows = None):

    # load the original dataframe
    df = pd.read_csv(file_path)

    # conversion of the original resolution to a new one with h3pandas
    if first_n_rows:
        h3df = df\
            .head(first_n_rows)\
            .h3.geo_to_h3(resolution, 
                        lat_col = "latitude", 
                        lng_col = "longitude")
    else:
        h3df = df\
            .h3.geo_to_h3(resolution, 
                        lat_col = "latitude", 
                        lng_col = "longitude")
    
    gdfh3 = h3df.h3.h3_to_geo_boundary()
    st.write(gdfh3.columns)
    return gdfh3

This function creates a new geopandas GeoDataFrame from the original one, which looks like

                 hex_resolution  latitude  longitude  a_points  a_mean_value  b_points  b_mean_value                                           geometry
h3_05
85664107fffffff            11.0  3.642390 -67.515120                2.538934                   67.715361                   16.963115                      100.343107  POLYGON ((-67.57841 3.7442, -67.62704 3.67422,...
8566410ffffffff            11.0  3.543830 -67.636296                2.428571                   93.160690                   17.000000                       65.569301  POLYGON ((-67.71448 3.68601, -67.76322 3.61592...
85664117fffffff            11.0  3.751133 -67.518407                2.524590                   97.021296                   17.016393                      140.074436  POLYGON ((-67.56855 3.89578, -67.61715 3.82594...
85664123fffffff            11.0  3.414848 -67.423126                2.697868                   66.142274                   17.145505                       80.192160  POLYGON ((-67.46216 3.49883, -67.51075 3.42869...
85664127fffffff            11.0  3.356174 -67.347446                2.576923                   76.884983                   16.879121                       79.051022  POLYGON ((-67.33614 3.40529, -67.38464 3.33512...
8566412bfffffff            11.0  3.360755 -67.560013                2.599462                   75.963019                   17.118280                       60.881480  POLYGON ((-67.59816 3.4404, -67.64685 3.37015,...
8566412ffffffff            11.0  3.287853 -67.446837                2.626781                   73.198487                   17.128205                       61.123238  POLYGON ((-67.47197 3.3468, -67.52059 3.27652,...
85664133fffffff            11.0  3.557516 -67.454476                2.578947                   68.361575                   17.093301                       89.720296  POLYGON ((-67.45236 3.65065, -67.50092 3.58065...

The index is the id of the hexagon at the new resolution. As you can see, there are duplicated indexes, which is why I modified the previous function so that it groups the rows by id and computes the mean value over the duplicated indexes. The function now looks like

def new_charge_df(file_path, resolution, first_n_rows = None):
    df = pd.read_csv(file_path)
    if first_n_rows:
        h3df = df\
            .head(first_n_rows)\
            .h3.geo_to_h3(resolution, 
                        lat_col = "latitude", 
                        lng_col = "longitude")
    else:
        h3df = df\
            .h3.geo_to_h3(resolution, 
                        lat_col = "latitude", 
                        lng_col = "longitude")
        
    # group the duplicated hex ids and average their values
    resolution_column = h3df.index.name
    h3df = h3df.groupby(resolution_column).mean(numeric_only=True).reset_index().set_index(resolution_column)
    gdfh3 = h3df.h3.h3_to_geo_boundary()
    st.write(gdfh3.columns)
    return gdfh3

Before this modification, the plotting part of the code worked correctly. It looks like

    # Sidebar widgets
    zoom, feature_group_to_add = sidebar_widgets()
    
    # Load the data
    first_n_rows = 10000
    cacahual_gdfh3 = charge_df(file_path="./db/cacahual_db.csv", 
                               resolution = zoom - 3,
                               first_n_rows = first_n_rows)
    print(cacahual_gdfh3)
    
    # Create the feature groups and the colormaps
    fg_cm_dic = {"A mean value": create_feature_group(gdf=cacahual_gdfh3, 
                                                                   data_to_plot="a_mean_value", 
                                                                   label_to_plot="A mean value (m.u.)", 
                                                                   feature_group_name="A mean value"),
                 "B mean value": create_feature_group(gdf=cacahual_gdfh3, 
                                                                       data_to_plot="b_cover_mean_value", 
                                                                       label_to_plot="B mean value (m.u.)", 
                                                                       feature_group_name="B mean value"),
                 "None": [None, None]}
    
    # Create the map
    m = folium.Map(location=[3.5252777777778, -67.415833333333],
                   max_bounds = True)
    
    # Add the colormap to the map
    if fg_cm_dic[feature_group_to_add][1]:
        colormap = fg_cm_dic[feature_group_to_add][1]
        colormap.add_to(m)
    
    # Obtain the feature group to add
    fg = fg_cm_dic[feature_group_to_add][0]
        
    # Display the map
    st_folium(m, 
              feature_group_to_add=fg, 
              zoom = st.session_state["zoom"])
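
For context, create_feature_group is not shown above; roughly, it returns a [feature_group, colormap] pair, where the feature group holds a folium.GeoJson layer whose tooltip references the region column. A simplified sketch of such a helper (not my exact code, and the palette name is only an example):

import folium
import branca.colormap as cm

def create_feature_group(gdf, data_to_plot, label_to_plot, feature_group_name):
    # linear colormap scaled to the values of the column being plotted
    colormap = cm.linear.YlGn_09.scale(float(gdf[data_to_plot].min()),
                                       float(gdf[data_to_plot].max()))
    colormap.caption = label_to_plot

    fg = folium.FeatureGroup(name=feature_group_name)
    folium.GeoJson(
        gdf,
        style_function=lambda feature: {
            "fillColor": colormap(feature["properties"][data_to_plot]),
            "fillOpacity": 0.7,
            "weight": 0.5,
        },
        # folium checks these tooltip fields against the GeoJSON properties
        # when the feature group is rendered
        tooltip=folium.GeoJsonTooltip(fields=["region", data_to_plot],
                                      aliases=["Region", label_to_plot]),
    ).add_to(fg)

    return [fg, colormap]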

After the modification that computes the mean in the groupby, I am getting the following error

AssertionError: The field region is not available in the data. Choose from: ('hex_resolution', 'latitude', 'longitude', 'a_points', 'a_mean_value', 'b_points', 'b_mean_value').
Traceback:

File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
    result = func()
             ^^^^^^
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 590, in code_to_exec
    exec(code, module.__dict__)
File "/home/juanessao2000/Documents/PRINCIPAL/PROYECTOS/TRABAJO-PLANETAI/Projects/Mapbox/Test-repository/app.py", line 155, in <module>
    main()
File "/home/juanessao2000/Documents/PRINCIPAL/PROYECTOS/TRABAJO-PLANETAI/Projects/Mapbox/Test-repository/app.py", line 51, in main
    st_folium(m,
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/streamlit_folium/__init__.py", line 350, in st_folium
    feature_group_string += _get_feature_group_string(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/streamlit_folium/__init__.py", line 163, in _get_feature_group_string
    feature_group_to_add.render()
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/folium/map.py", line 80, in render
    super().render(**kwargs)
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/branca/element.py", line 681, in render
    element.render(**kwargs)
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/folium/features.py", line 835, in render
    super().render()
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/folium/map.py", line 80, in render
    super().render(**kwargs)
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/branca/element.py", line 681, in render
    element.render(**kwargs)
File "/home/juanessao2000/.local/share/virtualenvs/Test-repository-vctNLQQo/lib/python3.11/site-packages/folium/features.py", line 1191, in render
    value in keys

What could be causing this problem? I cannot find a reason for this error, nor can I find other questions in forums that could help me. Any comment on the matter is of great help. Thanks in advance!

Solved! It turned out that mean(numeric_only=True) only keeps the columns with numeric values, so the "region" column was silently dropped from the grouped dataframe, which is exactly the field the error complains about.
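
If the region name is still needed after the aggregation (for example for the tooltip), one option is to aggregate it explicitly and join it back onto the numeric means. This is only a sketch, and it assumes every point that falls into the same hexagon belongs to the same region:

def new_charge_df(file_path, resolution, first_n_rows = None):
    df = pd.read_csv(file_path)
    if first_n_rows:
        df = df.head(first_n_rows)

    h3df = df.h3.geo_to_h3(resolution,
                           lat_col = "latitude",
                           lng_col = "longitude")

    resolution_column = h3df.index.name

    # average the numeric columns per hexagon
    numeric_means = h3df.groupby(resolution_column).mean(numeric_only=True)

    # keep one region per hexagon (assumption: all points inside a hexagon
    # share the same region; use a different aggregation if they do not)
    regions = h3df.groupby(resolution_column)["region"].first()

    h3df = numeric_means.join(regions)

    gdfh3 = h3df.h3.h3_to_geo_boundary()
    return gdfh3

With this, the grouped GeoDataFrame keeps the region column, so anything that references it later (tooltips, popups) keeps working.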
