Streamlit Can't use Pandas Describe function

So I'm trying to use the describe() function on my dataframe to get some descriptive statistics of the data in my Streamlit webpage. Here is the head of the dataframe:
image

st.write(superstore_data.describe(include='all'))

But it returns with an exception like this:

StreamlitAPIException: ("Could not convert 'CA-2018-100111' with type str: ,tried to convert to int64", 'Conversion failed for column Order ID with type object')

I tried describe function directly (not using Streamlit), and it works:

import pandas as pd
superstore_data = pd.read_csv('data/superstore.csv')
superstore_data.describe(include='all')

image

Is there any detail I missed in Streamlit? Why can't the Streamlit API use the pandas describe function?

Hi! That's super weird. Can you print both pandas versions, and share the dataset?

From the error, it seems that the error is on st.write. Try this to verify:

df_aux = superstore_data.describe(include='all')
st.write(df_aux)
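For the version check requested above, a minimal sketch along these lines could be run once inside the Streamlit script and once in the plain Python session where describe() works. Comparing the two outputs shows whether both environments really use the same interpreter and the same pandas. The variable names are only illustrative.

```python
import sys

import pandas as pd

# Collect the interpreter path and pandas version of the current environment.
# Run this in both places (Streamlit script and plain Python) and compare.
python_path = sys.executable
pandas_version = pd.__version__

print("python:", python_path)
print("pandas:", pandas_version)
```

Inside the Streamlit app, the same values can be shown with st.write instead of print.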

I already tried storing it in a dataframe first and then writing it with st.write (tried st.table too); it prints the same error. Here is the dataset: Superstore Sales Dataset | Kaggle
This is the full code:

import streamlit as st
import time
import numpy as np
import pandas as pd


st.set_page_config(page_title="About Datasets", page_icon="🕮")

st.title("About Datasets")
st.subheader("1. Superstore Sale Datasets, obtained from Kaggle [here](https://www.kaggle.com/datasets/rohitsahoo/sales-forecasting)")

superstore_data = pd.read_csv('data/superstore.csv')

st.write('preview:')
st.write(superstore_data.head())

columns_length=len(superstore_data.columns)
rows_length=len(superstore_data)

st.write('Number of columns:', columns_length)
st.write('Number of rows:', rows_length)


st.write('#### Check Null Values')
st.write(superstore_data.isnull().sum(), 'There are null values in the Postal Code column in 11 rows. Because the postal code field/column is not used in the analysis and is also hard to fill in due to lack of information, these null values are ignored')
# null = superstore_data[superstore_data.isnull().any(axis=1)]
# st.write(null)
st.write('#### Check Duplicate Data')
st.write('there are', superstore_data.duplicated().sum(), 'duplicate rows')
st.write('#### Descriptive Statistics')
df = superstore_data.describe(include='all')
st.write(df)

And this is the error

I agree it's a weird conversion error. I don't understand the internals, but at least you can avoid it with some additional manipulation!


df = superstore_data.describe(include='all').fillna("").astype("str")
st.write(df)
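A likely explanation, consistent with the error message: describe(include='all') puts integers (count, unique, freq), a string (top) and NaN (mean, std, ...) into the same object column, and the conversion Streamlit performs before display seems to expect one type per column. The sketch below uses a small made-up dataframe in place of the Superstore CSV (the rows are hypothetical) and leaves Streamlit out, so it only shows the mixed types and what the workaround does to them.

```python
import pandas as pd

# Hypothetical stand-in for the Superstore data: one text column, one numeric.
sample = pd.DataFrame({
    "Order ID": ["CA-2018-100111", "CA-2018-100111", "US-2017-200222"],
    "Sales": [10.5, 20.0, 7.25],
})

summary = sample.describe(include="all")

# The "Order ID" column of the summary mixes several Python types
# (integer counts, a string for "top", float NaN for "mean", "std", ...).
order_id_types = {type(value).__name__ for value in summary["Order ID"]}
print(order_id_types)

# The workaround from this thread: blank out NaN, then make every cell a string.
clean = summary.fillna("").astype("str")
print(clean)

# In the Streamlit app this would be followed by: st.write(clean)
```

The trade-off is that the numbers become text, so the displayed table is fine for reading but no longer suitable for further numeric work.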

Thanks, yeah, I think it's a bug or something. But your solution is perfect; the null values don't really have much impact in the describe output.
