StreamlitAPIException: Unable to convert numpy.dtype to pyarrow.DataType

When I executed the following script

import streamlit as st
import pandas as pd

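df = pd.read_csv(r'path\to\a random.csv')  # raw string so \t and \a are not read as escapes
df_types = pd.DataFrame(df.dtypes, columns=['Data Type'])
st.dataframe(df_types)  # this call raises the exception below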

where my csv file had 3 columns: Text (object), Date-Time (object), Number (float64).

However, I got this error message

StreamlitAPIException: Unable to convert numpy.dtype to pyarrow.DataType.
This is likely due to a bug in Arrow (see https://issues.apache.org/jira/browse/ARROW-14087).
As a temporary workaround, you can convert the DataFrame cells to strings with df.astype(str).

Could someone please explain why:

  1. df_types is a DataFrame, so why couldn’t it be displayed?
  2. Which numpy.dtype did the error message refer to?
  3. Why did numpy.dtype need to be converted to pyarrow.DataType?

Thanks in advance

I had the same issue in my function. I resolved it by returning df_types.astype(str) instead of df_types, and Streamlit was then able to render the DataFrame without issues:

def explore(data):
    df_types = pd.DataFrame(data.dtypes, columns=['Data Type'])
    numerical_cols = df_types[~df_types['Data Type'].isin(['object',
                   'bool'])].index.values
    df_types['Count'] = data.count()
    df_types['Unique Values'] = data.nunique()
    df_types['Min'] = data[numerical_cols].min()
    df_types['Max'] = data[numerical_cols].max()
    df_types['Average'] = data[numerical_cols].mean()
    df_types['Median'] = data[numerical_cols].median()
    df_types['St. Dev.'] = data[numerical_cols].std()
    return df_types.astype(str)

Yes, that’s what I did too: I used astype(str) so that Streamlit could display the DataFrame.
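A possible variation on that workaround, offered only as a sketch: convert just the 'Data Type' column to strings instead of the whole frame, so that summary columns such as 'Count' stay numeric. The sample frame and the name summary below are made up for illustration, and I have not checked this against every Streamlit version.

```python
import pandas as pd

# Made-up sample data, only for illustration.
data = pd.DataFrame({'Number': [1.5, 2.5, 4.0], 'Flag': [True, False, True]})

# Same construction as in the thread: one row per column of `data`.
summary = pd.DataFrame(data.dtypes, columns=['Data Type'])
summary['Count'] = data.count()

# Convert only the column that holds dtype objects to plain strings;
# the numeric 'Count' column is left untouched.
summary['Data Type'] = summary['Data Type'].astype(str)

print(summary)
```

With this approach the numeric columns keep their types, which may matter if you want to sort or format them later.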

Does that mean the Streamlit docs mislead readers when they say

st.dataframe(data=None, width=None, height=None)

where data is pandas.DataFrame, pandas.Styler, pyarrow.Table, numpy.ndarray, Iterable, dict, or None

I haven’t really got an answer to that, but I presume so. What I also noticed is that in both your code and mine we are creating a DataFrame made up of dtypes, so st.dataframe(df_types) causes that error, whereas if I load df using pd.read_csv and then call st.dataframe(df), there are no issues. I would be interested in finding out what the real issue is there as well.
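One thing that can be checked without Streamlit is what the cells of df_types actually contain. The sketch below uses a small in-memory CSV (made up here to mimic the three columns described in the question) and suggests that the 'Data Type' column is an object column whose cells are numpy.dtype instances rather than ordinary values. That is presumably what the error message refers to, although I have not traced this through Streamlit's or Arrow's code.

```python
import io

import numpy as np
import pandas as pd

# Made-up CSV content mimicking the columns described in the question.
csv_text = (
    "Text,Date-Time,Number\n"
    "foo,2021-01-01 10:00,1.5\n"
    "bar,2021-01-02 11:30,2.5\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Same construction as in the thread.
df_types = pd.DataFrame(df.dtypes, columns=['Data Type'])

# The cell for the float column is a numpy.dtype instance, not a string.
number_cell = df_types.loc['Number', 'Data Type']
print(type(number_cell))

# After astype(str) the same cell is an ordinary Python string.
as_text = df_types.astype(str)
text_cell = as_text.loc['Number', 'Data Type']
print(type(text_cell), text_cell)
```

By contrast, df itself holds only strings and floats, which would be consistent with st.dataframe(df) rendering fine.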