Altair time transforms are off?

I am getting wait times across a variety of geographic locations. The times are stored in a mongodb collection as a UTC timestamp, and when it comes time to present them, I convert them to local time. The dataframes come out ok, as in the data makes sense. However, the graphs seem offset by 2 hours.

Example of the graph I am illustrating

Example of the dataframe that feeds it:

Specific example of the offset error:


My code:

    hist = pd.DataFrame(mongo_queries.get_hist_data(f)) # get the data
    hist['local_time'] = hist.apply(get_local,axis=1) # convert UTC to local time
    
    alt_color = {'Maximum':'max(wait):Q', 'Average':'average(wait):Q','Median':'median(wait):Q'}
    hist
    st.write('### Schedule View')
    scatter = alt.Chart(hist).mark_rect().encode(
        alt.Y('hours(local_time):O', title='hour of day',),
        alt.X('day(local_time):O', title='Weekday'),        
        alt.Tooltip(['hours(local_time):O','day(local_time):O',alt_color[metric]]),
        color=alt_color[metric],
        
    ).interactive()
    st.altair_chart(scatter, use_container_width=True)

As you can see in the data, you would expect that the latest times (within the 7th hour AM) would bin together and average to 17.75. The average is correct, but it is being illustrated on the 9AM block…

After some googling it seemed that the date format was causing this issue, so I converted the local_time column to an ISOFORMAT which was apparently the fix, but it is not fixing it.

Upon further investigation, it seems that altair is converting it to MY local time, regardless. I put the UTC time as the input to the graph(alt.Y and alt.X above) and its still binning them to the 9th hour, which is what I am in currently…

I guess the question is, how do I stop Altair(I think that's the culprit) from converting times and letting them be on the specified local times. I am expecting each location to follow the same generic trends throughout the day, respective of their local times. With locations across the continent, if they are all mashed together in my timezone, it’s not going to work…

Another weird thing is, my UTC time seems to be the right time, but the wrong timezone offset. The local time is correct. I am confused…

More info:

the data is stored as a utc, here’s a screenshot from mongo compass:
image

I loaded up the same data in a notebook, and it doesn’t seem to come with my local timezone, so maybe it is a streamlit dataframe issue?

Even more info:
I’ve now fixed the UTC column, where the timezone is now correct.

However, the binning is still offset by 2 hours…
The average wait for all samples taken 9:00 - 9:59 AM is 22.166667, which is correctly calculated, but aligned with the wrong hour. It seems to default to my(the client viewing the app) timezone, as the samples taken near me are correct, but mountain time is off by 2, pacific by 3.

This is a serious problem. Streamlit doesn’t respect the expected behavior of dates and times in altair outlined extensively here: Times and Dates in Altair — Altair 4.2.0 documentation

Even if you don’t set timezones as altair suggests, they still get treated as UTC, not as local time, as altair intends.

Unfortunately, this off-by-two hours is probably related to this underlying altair issue (which is related to an underlying veta-lite issue) Date axis labels are off by one · Issue #2077 · altair-viz/altair · GitHub

1 Like

Yeah, I discovered that this has to do with how Vega-lite deserializes times encoded by altair.

This solution of using non-ISO compliant timestamps worked for us: Monthly data and ordinal encoding create an "off by one" error · Issue #1027 · altair-viz/altair · GitHub

Apologies for blaming streamlit above <3

1 Like

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.