Loading our own datasets

Hey guys, I started learning Streamlit just today and I have a question: how do I load my own datasets in Excel and CSV formats into Streamlit? The tutorial videos I have seen only load the built-in sklearn datasets. So please help me with this.

It would be really helpful if someone solves this for me!

Hi @Byri_Manoj, welcome to the Streamlit community!

You can use pandas to read in CSV and Excel files:
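
A minimal sketch of what that can look like, assuming the file sits on disk next to your script. The helper name `load_table` and the example path are hypothetical, and reading Excel files also assumes an Excel engine such as openpyxl is installed:

```python
from pathlib import Path

import pandas as pd


def load_table(path):
    """Read a CSV or Excel file into a DataFrame, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in (".xlsx", ".xls"):
        # pandas needs an Excel engine (e.g. openpyxl for .xlsx) for this call
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file type: {suffix}")


# Hypothetical usage: df = load_table("data/my_dataset.csv")
```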

Good luck on your learning journey!

Best,
Randy

Hey @Byri_Manoj, welcome to the community :slight_smile:

Well if you know how to load a dataset with Pandas, you’re already 90% done! Imagine that Streamlit is a layer over your Python code to visualize data.

So you can load your data the way you know using Pandas, then use Streamlit to visualize the data:

# app.py, run with 'streamlit run app.py'
import pandas as pd
import streamlit as st

df = pd.read_csv("./data/titanic.csv")  # read a CSV file inside the 'data' folder next to 'app.py'
# df = pd.read_excel(...)  # will work for Excel files

st.title("Hello world!")  # add a title
st.write(df)  # visualize my dataframe in the Streamlit app

Now you can edit the path to the file in your Streamlit script, and Streamlit will automatically rerun in the background to load the new dataframe and display its content immediately. Nice!

There’s a more interactive way to specify files: the file uploader. Basically, Streamlit passes the uploaded file to Python as a buffer, and pd.read_csv and pd.read_excel know how to read from it!

import pandas as pd
import streamlit as st

st.title("Hello world!")

uploaded_file = st.file_uploader("Choose a file")
if uploaded_file is not None:
  df = pd.read_csv(uploaded_file)
  st.write(df)
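
The uploader example above handles CSV only. Here is a rough sketch of how the same idea could cover Excel uploads too, by checking the uploaded file's name. The helper `read_uploaded` is hypothetical, and the `io.BytesIO` object stands in for what `st.file_uploader` returns (a file-like buffer), so the snippet runs without Streamlit:

```python
import io

import pandas as pd


def read_uploaded(buffer, filename):
    """Parse an in-memory file as CSV or Excel, based on its name."""
    if filename.lower().endswith((".xlsx", ".xls")):
        # requires an Excel engine such as openpyxl to be installed
        return pd.read_excel(buffer)
    return pd.read_csv(buffer)


# Stand-in for an uploaded file: raw CSV bytes held in memory
fake_upload = io.BytesIO(b"Name,Age\nAlice,29\nBob,35\n")
df = read_uploaded(fake_upload, "people.csv")
print(df)
```

In a Streamlit app you would call something like `read_uploaded(uploaded_file, uploaded_file.name)` inside the `if uploaded_file is not None:` branch.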

Other than that, all the rest is Python code. So you’re free to manipulate your dataframe df the way you want, and if you need to show something in Streamlit, try st.write(<your variable>) to display it. It works on a lot of things: Pandas DataFrames, Matplotlib figures, Markdown text.

For example, in the following I build a Matplotlib plot and display it in Streamlit:

import matplotlib.pyplot as plt
import pandas as pd
import streamlit as st

st.title("Hello world!")

uploaded_file = st.file_uploader("Choose a file")
if uploaded_file is not None:
  df = pd.read_csv(uploaded_file)
  st.write(df)

  # Add some matplotlib code !
  fig, ax = plt.subplots()
  df.hist(
    bins=8,
    column="Age",
    grid=False,
    figsize=(8, 8),
    color="#86bf91",
    zorder=2,
    rwidth=0.9,
    ax=ax,
  )
  st.write(fig)
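
The histogram above hard-codes the "Age" column, which may be missing from your own file. One possible way to guard against that, sketched with a hypothetical helper `numeric_columns` and a made-up dataframe, is to list the numeric columns first and only plot one that is actually present:

```python
import pandas as pd


def numeric_columns(df):
    """Return the names of the columns that hold numeric data."""
    return df.select_dtypes(include="number").columns.tolist()


# Small made-up dataframe, for illustration only
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [29, 35], "Fare": [7.25, 71.3]})
columns = numeric_columns(df)
print(columns)  # ['Age', 'Fare']
```

In the app, that list could feed a widget such as st.selectbox so the user picks which column to plot.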

And don’t forget, it’s all re-rendered in real time!

That’s the intro :slight_smile: hope it helps!

Fanilo
