StreamlitAPIException: Unable to convert numpy.dtype to pyarrow.DataType

jaseuts · October 19, 2021, 1:33am

When I executed the following script

import streamlit as st
import pandas as pd

df = pd.read_csv(r'path\to\a random.csv')
df_types = pd.DataFrame(df.dtypes, columns=['Data Type'])

where my csv file had 3 columns: Text (object), Date-Time (object), Number (float64).

However, I got this error message

StreamlitAPIException: Unable to convert numpy.dtype to pyarrow.DataType.
This is likely due to a bug in Arrow (see https://issues.apache.org/jira/browse/ARROW-14087).
As a temporary workaround, you can convert the DataFrame cells to strings with df.astype(str).
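For illustration, here is a minimal sketch of the workaround the error message suggests. The small in-memory DataFrame is a hypothetical stand-in for the CSV (the real file is not shown in this thread), and the Streamlit call is left as a comment so the snippet runs without Streamlit installed.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: Text (object), Date-Time (object), Number (float64).
df = pd.DataFrame({
    "Text": pd.Series(["a", "b"], dtype=object),
    "Date-Time": pd.Series(["2021-10-19 01:33", "2021-10-28 13:33"], dtype=object),
    "Number": [1.5, 2.5],
})

# Same construction as in the script above: one row per column, one cell per dtype.
df_types = pd.DataFrame(df.dtypes, columns=["Data Type"])

# Workaround from the error message: convert every cell to a plain string.
df_types_str = df_types.astype(str)
print(df_types_str)

# In a Streamlit app, the converted frame would then be passed on, e.g.:
# st.dataframe(df_types_str)
```

After the conversion the "Data Type" column holds ordinary strings such as "object" and "float64" rather than dtype objects.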

Could someone please explain:

Why couldn't df_types be displayed, even though it is a DataFrame?
Which numpy.dtype did the error message refer to?
Why did numpy.dtype need to be converted to pyarrow.DataType?

Thanks in advance

Berry_Perembe · October 28, 2021, 1:33pm

I had the same issue in my function. I resolved it by returning df_types.astype(str) instead of df_types, and Streamlit was able to render the DataFrame without issues.

def explore(data):
    # Summarize each column of `data`: dtype, counts, and basic statistics.
    df_types = pd.DataFrame(data.dtypes, columns=['Data Type'])
    numerical_cols = df_types[~df_types['Data Type'].isin(['object',
                   'bool'])].index.values
    df_types['Count'] = data.count()
    df_types['Unique Values'] = data.nunique()
    df_types['Min'] = data[numerical_cols].min()
    df_types['Max'] = data[numerical_cols].max()
    df_types['Average'] = data[numerical_cols].mean()
    df_types['Median'] = data[numerical_cols].median()
    df_types['St. Dev.'] = data[numerical_cols].std()
    # Cast every cell to str so Streamlit can render the result.
    return df_types.astype(str)

jaseuts · October 28, 2021, 10:08pm

Yes, that’s what I did (using astype(str)) so that streamlit could display the dataframe.

Does that mean the Streamlit docs mislead readers when they say

st.dataframe(data=None, width=None, height=None)

where data is pandas.DataFrame, pandas.Styler, pyarrow.Table, numpy.ndarray, Iterable, dict, or None

Berry_Perembe · October 29, 2021, 7:25am

I don't really have the answer to that, but I presume so. What I also noticed is that in both your code and mine we are creating a DataFrame made up of dtypes. So st.dataframe(df_types) causes that error, yet if I load df using pd.read_csv and then call st.dataframe(df), there are no issues. I would be interested in finding out what the real issue is as well.
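The difference noted above can be checked directly. The following is a rough sketch, not a confirmed diagnosis: it assumes the error is triggered by cells that are numpy.dtype objects, which is what df.dtypes yields for NumPy-backed columns. The helper name columns_holding_dtypes and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for a frame loaded with pd.read_csv.
df = pd.DataFrame({
    "Text": pd.Series(["a", "b"], dtype=object),
    "Number": [1.5, 2.5],
})
df_types = pd.DataFrame(df.dtypes, columns=["Data Type"])


def columns_holding_dtypes(frame):
    """Return the names of columns with at least one numpy.dtype object as a cell."""
    return [
        col for col in frame.columns
        if any(isinstance(value, np.dtype) for value in frame[col])
    ]


print(columns_holding_dtypes(df))        # cells are strings and floats
print(columns_holding_dtypes(df_types))  # cells are the dtype objects themselves
```

Under that assumption, df contains only ordinary values, while every cell of df_types is a dtype object, which would explain why only the second frame needs astype(str) before display.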

system · October 29, 2022, 7:25am

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
