After upgrading to the latest version, this error is showing up: **ArrowInvalid**

After upgrading to the latest Streamlit version, running my function now displays the error below:

ArrowInvalid: ('Could not convert int64 with type numpy.dtype: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column Data Type with type object')
Traceback:
File "F:\AIenv\lib\site-packages\streamlit\script_runner.py", line 350, in _run_script
    exec(code, module.__dict__)
File "f:\AIenv\streamlit\app2.py", line 1051, in <module>
    main()
File "f:\AIenv\streamlit\app2.py", line 597, in main
    explore(df)
File "f:\AIenv\streamlit\app2.py", line 91, in explore
    st.write(df_types)
File "F:\AIenv\lib\site-packages\streamlit\elements\write.py", line 182, in write
    self.dg.dataframe(arg)
File "F:\AIenv\lib\site-packages\streamlit\elements\dataframe_selector.py", line 85, in dataframe
    return self.dg._arrow_dataframe(data, width, height)
File "F:\AIenv\lib\site-packages\streamlit\elements\arrow.py", line 82, in _arrow_dataframe
    marshall(proto, data, default_uuid)
File "F:\AIenv\lib\site-packages\streamlit\elements\arrow.py", line 160, in marshall
    proto.data = type_util.data_frame_to_bytes(df)
File "F:\AIenv\lib\site-packages\streamlit\type_util.py", line 371, in data_frame_to_bytes
    table = pa.Table.from_pandas(df)
File "pyarrow\table.pxi", line 1479, in pyarrow.lib.Table.from_pandas
File "F:\AIenv\lib\site-packages\pyarrow\pandas_compat.py", line 591, in dataframe_to_arrays
    for c, f in zip(columns_to_convert, convert_fields)]
File "F:\AIenv\lib\site-packages\pyarrow\pandas_compat.py", line 590, in <listcomp>
    arrays = [convert_column(c, f)
File "F:\AIenv\lib\site-packages\pyarrow\pandas_compat.py", line 577, in convert_column
    raise e
File "F:\AIenv\lib\site-packages\pyarrow\pandas_compat.py", line 571, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
File "pyarrow\array.pxi", line 301, in pyarrow.lib.array
File "pyarrow\array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow\error.pxi", line 84, in pyarrow.lib.check_status

Code:

import streamlit as st
import pandas as pd

def explore(df):
    # DATA
    st.write('Data:')
    st.write(df)

    # SUMMARY
    df_types = pd.DataFrame(df.dtypes, columns=['Data Type'])
    numerical_cols = df_types[~df_types['Data Type'].isin(['object', 'bool'])].index.values
    df_types['Count'] = df.count()
    df_types['Null Values'] = df.isnull().sum()
    df_types['Unique Values'] = df.nunique()
    df_types['Min'] = df[numerical_cols].min()
    df_types['Max'] = df[numerical_cols].max()
    df_types['Average'] = df[numerical_cols].mean()
    df_types['Median'] = df[numerical_cols].median()
    df_types['St. Dev.'] = df[numerical_cols].std()
    st.write('Summary:')
    st.write(df_types)

# Show summary
if st.checkbox("Show Summary"):
    explore(df)

What does this error mean, and how can I fix it?

Hi!
I had the same problem. It’s a bug that came with Streamlit 0.85.0. I hope the developers will solve it soon.
The current workaround is to downgrade to version 0.84.2:

pip install streamlit==0.84.2
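
If downgrading is not an option, it may also be possible to sidestep the error in the code itself. The first traceback complains about the `Data Type` column, which holds `numpy.dtype` objects rather than plain values, and Arrow does not know how to serialize those. Below is a minimal sketch, assuming that casting the dtypes to strings is acceptable for display. The function name `summarize` is made up for illustration, and `select_dtypes(include="number")` is swapped in for the original `isin(['object', 'bool'])` filter, so non-numeric columns such as dates are treated differently than in the original.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Build a per-column summary that contains no numpy.dtype objects."""
    # Cast each dtype object to its string name (e.g. "int64") so the
    # "Data Type" column holds plain strings.
    summary = df.dtypes.astype(str).to_frame("Data Type")

    # Assumption: only numeric columns get Min/Max/Average/Median/St. Dev.
    numeric_cols = df.select_dtypes(include="number").columns

    summary["Count"] = df.count()
    summary["Null Values"] = df.isnull().sum()
    summary["Unique Values"] = df.nunique()
    # These assignments align on the index, so non-numeric rows become NaN.
    summary["Min"] = df[numeric_cols].min()
    summary["Max"] = df[numeric_cols].max()
    summary["Average"] = df[numeric_cols].mean()
    summary["Median"] = df[numeric_cols].median()
    summary["St. Dev."] = df[numeric_cols].std()
    return summary
```

In the app, the result would then be passed to `st.write(summarize(df))` in place of `st.write(df_types)`.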

Yes, that is what I did: I downgraded Streamlit to version 0.84.
There is now a newer version, 0.86. I hope it solves the problem.

I am still having the same issue with the latest release.

Me too. This is a very simple read of an .xlsx file with diverse column data types:

st.markdown("### Books In Print")
df = pd.read_excel("BIP4streamlit.xlsx")
df['title'].astype(str)
df['Contributor 1'].astype(str)
df
 File "/Users/fred/.virtualenvs/nimbleAI/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 571, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
  File "pyarrow/array.pxi", line 301, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: ('Could not convert A Day Book with Prompts with type str: tried to convert to int', 'Conversion failed for column title with type object')

I have upgraded to 0.88.
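
One thing that may be worth checking in the snippet above: `Series.astype` returns a new Series and does not modify the column in place, so the two `astype(str)` calls have no effect unless their result is assigned back. The error message suggests the `title` column mixes integers and strings, which Arrow cannot convert to a single type. A minimal sketch of the assignment, using a made-up two-row frame in place of the real spreadsheet and a hypothetical helper name `stringify_columns`:

```python
import pandas as pd


def stringify_columns(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Return a copy of df in which the given columns hold only strings."""
    out = df.copy()
    for col in columns:
        # astype returns a new Series; it must be assigned back to take effect.
        out[col] = out[col].astype(str)
    return out


# Made-up sample data: "title" mixes an integer and a string, similar to
# what the error message suggests about the real file.
books = pd.DataFrame(
    {
        "title": [1984, "A Day Book with Prompts"],
        "Contributor 1": ["Author A", "Author B"],
    }
)
fixed = stringify_columns(books, ["title", "Contributor 1"])
```

After this, every value in `fixed["title"]` is a string, so the column no longer mixes types when it is displayed.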

Hi all

A preferable solution for this is to use the old dataframe serializer by setting this in your .streamlit/config.toml file:

[global]
dataFrameSerialization = "legacy"

This allows you to continue upgrading to the latest version of Streamlit and getting all the latest goodies!

More info about Arrow here.


Meanwhile, I’ll forward this thread to our eng team. We want to find all instances where Arrow isn’t working well and fix them :slight_smile: