I'm trying to use the pandas describe() function on my dataframe to show some descriptive statistics of the data on my Streamlit page. This is the call:
st.write(superstore_data.describe(include='all'))
But it raises an exception like this:
StreamlitAPIException: ("Could not convert 'CA-2018-100111' with type str: ,tried to convert to int64", 'Conversion failed for column Order ID with type object')
I tried calling describe() directly in pandas (without Streamlit), and it works:
import pandas as pd
superstore_data = pd.read_csv('data/superstore.csv')
superstore_data.describe(include='all')
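To narrow this down, here is a minimal sketch with a few made-up rows (not the real Superstore file; "Order ID" is taken from the error message, while the "Sales" column and all values are hypothetical). It suggests that describe(include='all') returns object-dtype columns that mix numbers and strings in the same column, which may be what the conversion in the error message is failing on:

```python
import pandas as pd

# Hypothetical sample rows standing in for data/superstore.csv
sample = pd.DataFrame(
    {
        "Order ID": ["CA-2018-100111", "CA-2018-100111", "US-2017-200222"],
        "Sales": [10.0, 20.5, 30.0],
    }
)

summary = sample.describe(include="all")

# For a text column, the summary holds counts (numbers) next to
# the most frequent value (a string), all in one column
order_id_stats = summary["Order ID"].dropna()
kinds = {type(value).__name__ for value in order_id_stats}

print(summary["Order ID"].dtype)  # dtype of the summarized text column
print(kinds)                      # Python/NumPy types found in that column
```

Plain pandas has no problem holding such a column, which would explain why the call works outside Streamlit.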
Is there any detail I missed about Streamlit? Why can't Streamlit display the output of the pandas describe() function?
I already tried storing the result in a dataframe first and then displaying it with st.write (I tried st.table too), but I get the same error. The dataset I'm using is the Superstore Sales Dataset from Kaggle.
This is the full code:
import streamlit as st
import time
import numpy as np
import pandas as pd
st.set_page_config(page_title="About Datasets", page_icon="🕮")
st.title("About Datasets")
st.subheader("1. Superstore Sales Dataset, obtained from Kaggle [here](https://www.kaggle.com/datasets/rohitsahoo/sales-forecasting)")
superstore_data = pd.read_csv('data/superstore.csv')
st.write('preview:')
st.write(superstore_data.head())
columns_length=len(superstore_data.columns)
rows_length=len(superstore_data)
st.write('Number of columns:', columns_length)
st.write('Number of rows:', rows_length)
st.write('#### Check Null Values')
st.write(superstore_data.isnull().sum(), 'The Postal Code column has null values in 11 rows. Because the postal code field/column is not used in the analysis and is also hard to fill in due to a lack of information, these null values are ignored')
# null = superstore_data[superstore_data.isnull().any(axis=1)]
# st.write(null)
st.write('#### Check Duplicate Data')
st.write('there are', superstore_data.duplicated().sum(), 'duplicate rows')
st.write('#### Descriptive Statistics')
df = superstore_data.describe(include='all')
st.write(df)
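A workaround that is often suggested for this kind of conversion error is to cast the summary to strings before displaying it, so each column holds a single type. A minimal sketch, assuming that is enough for Streamlit to serialize the table; the helper name `describe_as_text` and the sample rows are hypothetical, and I have not confirmed this is the recommended fix:

```python
import pandas as pd


def describe_as_text(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: summarize all columns, then cast every cell to text."""
    summary = data.describe(include="all")
    # astype(str) gives each column one uniform type instead of mixed objects
    return summary.astype(str)


# Hypothetical sample rows standing in for data/superstore.csv
sample = pd.DataFrame(
    {
        "Order ID": ["CA-2018-100111", "CA-2018-100111", "US-2017-200222"],
        "Sales": [10.0, 20.5, 30.0],
    }
)

text_summary = describe_as_text(sample)
print(text_summary)
```

In the page above this would replace the last two lines with st.write(describe_as_text(superstore_data)). The trade-off is that the numbers become text, so the displayed table no longer sorts numerically.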