"Unrecognised Dataset" error occurs in altair chart for multiuser

Summary
I have developed an app that plots values in an Altair chart. The values are first appended to a pandas DataFrame and then plotted with Altair. The app works fine with a single user. With multiple users, the app shows the "Unrecognised dataset" error in the plotting area.

Steps to reproduce
Import the necessary libraries
Declare the pandas dataframe
Declare the data to be plotted
Append the data one by one in the dataframe
Plot the data using altair chart

Code snippet:

import streamlit as st
import pandas as pd
from datetime import datetime, timedelta
import altair as alt
import time as t      
import math                               
st.title("Sine wave Plotting")
df_Tidal_Volume = pd.DataFrame(columns=["time","Tidal_Volume"])
chart_Tidal_Volume = st.empty()
start_time = datetime.now()
while True:
    try:
        data = [5*math.sin(math.radians(0)),5*math.sin(math.radians(9)),5*math.sin(math.radians(13.5)),5*math.sin(math.radians(18)),5*math.sin(math.radians(22.5)),5*math.sin(math.radians(27)),5*math.sin(math.radians(31.5)),5*math.sin(math.radians(36)),5*math.sin(math.radians(40.5)),5*math.sin(math.radians(45))] 
        t.sleep(0.03)
        time = datetime.now()
        # Note: DataFrame.append was removed in pandas 2.0; newer versions need pd.concat here
        df_Tidal_Volume = df_Tidal_Volume.append({"time": time, "Tidal_Volume": data[0]}, ignore_index=True)
        latest_time = time - start_time
        if latest_time >= timedelta(milliseconds=50):
            a=chart_Tidal_Volume.altair_chart(alt.Chart(df_Tidal_Volume.tail(200)).mark_line().encode(x='time',y=alt.Y('Tidal_Volume', scale=alt.Scale(domain=[0,6]))))
            print("Plot Success")
        else:
            print("Data not received")
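        t.sleep(0.03)
    except Exception as e:
        # The posted snippet has no except clause; this handler is an assumed placeholder
        st.error(e)
        break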

Append more data from the given array and plot it using the Altair chart. Deploy the app. The error occurs when more than one user accesses the app at the same time.

Expected behavior:

The app should plot the data when it is accessed from multiple devices.

Actual behavior:
The app throws the "Unrecognised dataset" error when it is accessed from more than one device at the same time.

Links

Hi @Santhosh_Graceson

Judging from the error message and from trying out the code in your GitHub repo, the dataset is not being generated as a DataFrame. Without a DataFrame, the downstream plot creation cannot work. As a first step, I would recommend making sure that the DataFrame is generated properly; once it is, we can move on to the plot creation.

As the code has several redundancies, I would suggest refactoring it, which would make it more concise and easier to debug.

It appears that you’re generating data from 4 data variables:

data = [5*math.sin(math.radians(0)),5*math.sin(math.radians(9)),5*math.sin(math.radians(13.5)),5*math.sin(math.radians(18)),5*math.sin(math.radians(22.5)),5*math.sin(math.radians(27)),5*math.sin(math.radians(31.5)),5*math.sin(math.radians(36)),5*math.sin(math.radians(40.5)),5*math.sin(math.radians(45))] 
data1 = [5*math.sin(math.radians(49.5)),5*math.sin(math.radians(54)),5*math.sin(math.radians(58.5)),5*math.sin(math.radians(63)),5*math.sin(math.radians(67.5)),5*math.sin(math.radians(72)),5*math.sin(math.radians(76.5)),5*math.sin(math.radians(81)),5*math.sin(math.radians(85.5)),5*math.sin(math.radians(90))] 
data2=[5*math.sin(math.radians(94.5)),5*math.sin(math.radians(99)),5*math.sin(math.radians(103.5)),5*math.sin(math.radians(108)),5*math.sin(math.radians(112.5)),5*math.sin(math.radians(117)),5*math.sin(math.radians(121.5)),5*math.sin(math.radians(126)),5*math.sin(math.radians(130.5)),5*math.sin(math.radians(135))]
data3=[5*math.sin(math.radians(139.5)),5*math.sin(math.radians(144)),5*math.sin(math.radians(148.5)),5*math.sin(math.radians(153)),5*math.sin(math.radians(157.5)),5*math.sin(math.radians(162)),5*math.sin(math.radians(166.5)),5*math.sin(math.radians(171)),5*math.sin(math.radians(175.5)),5*math.sin(math.radians(180))]

So in the following example I will show only the use of data which you can adapt to the other 3.
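
Since the four lists differ only in the angles passed to math.sin, one possible way to adapt the example to the other 3 is to build each list from its angles. This is only a minimal sketch: the helper name sine_samples and the angles_* variables are hypothetical, and the angles are copied from the lists above.

```python
import math


def sine_samples(angles_deg, amplitude=5):
    """Return amplitude * sin(angle) for each angle given in degrees."""
    return [amplitude * math.sin(math.radians(a)) for a in angles_deg]


# Angles copied from the four original lists
angles_data = [0, 9, 13.5, 18, 22.5, 27, 31.5, 36, 40.5, 45]
angles_data1 = [49.5, 54, 58.5, 63, 67.5, 72, 76.5, 81, 85.5, 90]
angles_data2 = [94.5, 99, 103.5, 108, 112.5, 117, 121.5, 126, 130.5, 135]
angles_data3 = [139.5, 144, 148.5, 153, 157.5, 162, 166.5, 171, 175.5, 180]

data = sine_samples(angles_data)
data1 = sine_samples(angles_data1)
data2 = sine_samples(angles_data2)
data3 = sine_samples(angles_data3)
```

Each list keeps the same 10 values as before, so the rest of the code can keep indexing data, data1, data2 and data3 unchanged.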

I would recommend creating modular functions such as:

# Uses the imports from the original script (math, pandas as pd, time as t, datetime)
def generate_data():
  # Start time
  start_time = datetime.now()
  # Generating the data
  data = [5*math.sin(math.radians(0)),5*math.sin(math.radians(9)),5*math.sin(math.radians(13.5)),5*math.sin(math.radians(18)),5*math.sin(math.radians(22.5)),5*math.sin(math.radians(27)),5*math.sin(math.radians(31.5)),5*math.sin(math.radians(36)),5*math.sin(math.radians(40.5)),5*math.sin(math.radians(45))]
  t.sleep(0.03)
  # Time
  time = datetime.now()
  # Create two pandas Series and concatenate them (with axis=0 the result is a single Series)
  time_Series = pd.Series({"time": time})
  Tidal_Volume_Series = pd.Series({"Tidal_Volume": data[0]})
  df_Tidal_Volume = pd.concat([time_Series, Tidal_Volume_Series], axis=0)
  # Time elapsed between the start of the call and the reading
  latest_time = time - start_time
  return df_Tidal_Volume, latest_time

which, when run, returns a tuple.

Thus, you can select the first value from the tuple to get the concatenated data:

# Display DataFrame
generate_data()[0]

which gives

time            2023-06-26 03:28:52.297175
Tidal_Volume                           0.0
dtype: object

Next, return the second value from the tuple:

# Print latest_time
generate_data()[1]
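
Note that each generate_data() call runs the function again, so the two snippets above take two separate readings. To get both values from the same call, the tuple can be unpacked once. A minimal sketch, using a simplified stand-in (the hypothetical generate_sample) that returns a plain dict instead of a pandas object:

```python
import math
import time as t
from datetime import datetime, timedelta


def generate_sample():
    """Simplified stand-in for generate_data(): returns (sample, elapsed)."""
    start_time = datetime.now()
    t.sleep(0.03)
    now = datetime.now()
    sample = {"time": now, "Tidal_Volume": 5 * math.sin(math.radians(0))}
    return sample, now - start_time


# One call, both values taken from the same reading
sample, latest_time = generate_sample()
print(sample["Tidal_Volume"], latest_time)
```

The same unpacking works with generate_data() itself, for example df_Tidal_Volume, latest_time = generate_data().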

In the above code you’ll see that I’ve replaced “creating an empty DataFrame + append” with creating 2 pandas Series and concatenating them via pd.concat(). Note that with axis=0 the result is a pandas Series (hence the dtype: object line in the output above); it can be turned into a one-row DataFrame with .to_frame().T. The original approach did not generate the DataFrame.

Thus, in the code snippet above the data has been generated successfully. Please apply this to the rest of the code and it should work.
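
To accumulate readings across loop iterations, as the original append call did, each new reading can be wrapped in a one-row DataFrame and concatenated onto the existing one. This is a minimal sketch, assuming pandas is installed; the helper name add_reading is hypothetical, and fixed example timestamps stand in for datetime.now(). It only shows how the DataFrame can be built and is not confirmed to resolve the multi-user error.

```python
import math
from datetime import datetime, timedelta

import pandas as pd


def add_reading(df, timestamp, value):
    """Return a new DataFrame with one extra (time, Tidal_Volume) row."""
    row = pd.DataFrame({"time": [timestamp], "Tidal_Volume": [value]})
    if df.empty:
        # Skip concatenating with an empty frame; the first row becomes the frame
        return row
    return pd.concat([df, row], ignore_index=True)


df_Tidal_Volume = pd.DataFrame(columns=["time", "Tidal_Volume"])
start = datetime(2023, 6, 26, 3, 28, 52)  # example timestamp, not a real reading
for i, angle in enumerate([0, 9, 13.5]):
    value = 5 * math.sin(math.radians(angle))
    timestamp = start + timedelta(milliseconds=30 * i)
    df_Tidal_Volume = add_reading(df_Tidal_Volume, timestamp, value)

print(df_Tidal_Volume)
```

The resulting object is a DataFrame with the columns time and Tidal_Volume, which is the shape that alt.Chart(df_Tidal_Volume.tail(200)) in the original loop expects.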

Hope this was helpful.

Best regards,
Chanin
