Vaex hdf5 file in Streamlit: problem with the attribute "between"


Hi :),

I’m stuck on an attribute that stops working in my Streamlit app when I load my dataset from an hdf5 file (a format I only recently discovered).

An error message appears and my data is not displayed. The error comes from the code that filters my data:

AttributeError: ‘Expression’ object has no attribute ‘between’

If I convert my hdf5 dataframe to a pandas dataframe:

df = df.to_pandas_df()

…I no longer have this error message with the ‘between’ in my filters.

Steps to reproduce

Code snippet:

# 1 - Data

df = vaex.open("../Data.hdf5")

Local_selection = df['type_local'].unique()
Annees_selection = df['annee_mutation'].unique()
Departement_selection = df['code_departement'].unique()
Piece_selection = df['nb_piece_group'].unique()
Valeur_selection = df['valeur_m2'].unique()

# 2 - Sidebar and filters

Type_local = st.sidebar.selectbox(options=Local_selection,)

Region = st.sidebar.multiselect(options=["Region 1"])

Departement = st.sidebar.multiselect(Departement_selection,default=Departement_selection)

Annees = st.sidebar.multiselect(Annees_selection, default=Annees_selection)

Nb_piece_group = st.sidebar.multiselect(Piece_selection, default=Piece_selection)

Valeur = st.sidebar.slider("Choice:", 1500, 40000, (2500, 11000))  # value=(3500, 5000), step=250

# 3 - Filters for data

df_selection = df[(df['type_local'] == (Type_local)) & (df['code_departement'].isin(Departement)) & 
(df['annee_mutation'].isin(Annees)) & 
(df['nb_piece_group'].isin(Nb_piece_group)) & 

Expected behavior:

I would like an alternative to the “between” attribute that works with an hdf5 file, so that my double slider can select a minimum price and a maximum price.

If I convert my hdf5 data to a pandas dataframe, the problem goes away and everything works. If I keep the hdf5 file, I get this error and I’m blocked. I really don’t understand the problem.

Debug info

  • Streamlit version: 1.22.0
  • Python version: 3.10
  • OS version:
  • Browser version: Firefox

Additional information

This line of code does not work with my hdf5 file (it works with a .csv file, or after converting the hdf5 data to a pandas dataframe):


I had no problems converting my csv file to hdf5:

import vaex

vaex_df = vaex.from_pandas(df, copy_index=False)

The dtype of ‘valeur_m2’ is int64 before converting to .hdf5.

Thanks :)

This should do it (not tested):

(Valeur[0] <= df['valeur_m2']) & (df['valeur_m2'] <= Valeur[1])
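To make the idea reusable, here is a minimal sketch of a range helper. The name in_range is made up for illustration and is not part of pandas or vaex. It relies only on the comparison operators and on &, which both a pandas Series and a vaex Expression support, so it should not depend on a 'between' method being available.

```python
def in_range(column, low, high):
    """Inclusive range test built only from >=, <= and &.

    `column` can be any object that overloads these operators,
    for example a pandas Series or a vaex Expression.
    """
    return (column >= low) & (column <= high)


# Hypothetical usage with the names from the question (not run here):
#   low, high = Valeur                      # tuple returned by the slider
#   mask = in_range(df['valeur_m2'], low, high)
#   df_selection = df[mask]
```

The mask can be combined with the other conditions of the filter using & in the same way as the isin() tests.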

Thank you Goyo!!!

Your code works perfectly.

I don’t understand why hdf5 files require this kind of change for my code to work. I had no problem with a pandas dataframe.

Any idea, please?

For example, I now have another problem that I didn’t have before.

This works with a pandas DataFrame (.csv file):

df_eq = df_equipement.loc[df_equipement['insee'].isin(df_selection['insee'])]

This is the error I get with the same code when I use the hdf5 file:

AttributeError: 'DataFrameLocal' object has no attribute 'loc'

File "C:\Anaconda\envs\Immo3.10\lib\site-packages\streamlit\runtime\scriptrunner\", line 565, in _run_script
    exec(code, module.__dict__)
File "F:\01 - Formation Data Analyst\02B - ProjetImmo\Streamlit\", line 324, in <module>
    df_eq = df_equipement.loc[df_equipement['insee'].isin(df_selection['insee'])]
File "C:\Anaconda\envs\Immo3.10\lib\site-packages\vaex\", line 288, in __getattr__
    return object.__getattribute__(self, name)

I don’t understand why I have to make these changes to read and use an hdf5 file. I’ve researched and looked at examples, but couldn’t find the reason.

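Code that uses pandas interfaces works the same no matter where the data comes from, as long as you keep using pandas objects. It can stop working when you use vaex objects instead, because their interfaces are similar but not identical. The two errors in this thread are examples: a vaex Expression has no 'between' method, and a vaex DataFrame has no 'loc' accessor.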
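One way to keep the pandas-only parts of the script working is to convert at the boundary, using the to_pandas_df() call already mentioned in the question. The sketch below is only an illustration: ensure_pandas is a made-up helper name, and it assumes that any object exposing to_pandas_df() is a vaex DataFrame.

```python
def ensure_pandas(df):
    """Return a pandas DataFrame for code that needs pandas-only features.

    Assumption: an object with a to_pandas_df() method is a vaex DataFrame;
    anything else is returned unchanged (it is taken to be pandas already).
    """
    if hasattr(df, "to_pandas_df"):
        return df.to_pandas_df()
    return df


# Hypothetical usage with the names from the question (not run here):
#   df_equipement = ensure_pandas(df_equipement)
#   df_selection = ensure_pandas(df_selection)
#   df_eq = df_equipement.loc[df_equipement['insee'].isin(df_selection['insee'])]
```

Converting loads the data into memory, so it is probably best done after filtering, when the selection is small. The alternative is to stay with vaex and rewrite each pandas-specific call, as was done for 'between'.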
