Is there an example of using the `Arrow` data structure without using `Pandas`?

Is there an example of using the Arrow data structure without using Pandas?

Please update Streamlit to the latest version; support for Arrow data structures is already built in.

Maybe look into polars?

This is a great question @Ran_Feldesh! The purpose of adding Arrow serialization is two-fold:

  1. It makes Streamlit more efficient. Most people aren’t going to care how things work, just that their app is performant. By removing our custom data serialization logic to pass data between Python and the browser, we’ve removed something like 1000 lines of code from Streamlit, reducing the maintenance burden on us while also ensuring better compatibility (assuming that Arrow as a project is handling all the edge cases).

  2. It allows other packages to pass data to Streamlit in an efficient manner

For #2, take the following example. Suppose you, as a library developer, wanted to make a Streamlit component that passes a giant result to Streamlit. You could send that as JSON, or within the library you might do something like this:

import streamlit as st
import pyarrow as pa


# user probably wouldn't do this, but other packages could pass pyarrow RecordBatches
data = [
    pa.array([1, 2, 3, 4]),
    pa.array(['foo', 'bar', 'baz', None]),
    pa.array([True, None, False, True])
]

batch = pa.RecordBatch.from_arrays(data, ['f0', 'f1', 'f2'])

# or other libraries might pass pyarrow Tables
batches = [batch] * 5
table = pa.Table.from_batches(batches)

st.dataframe(table)

Now an end-user wouldn’t hand-write pyarrow structures like this; pandas is the expected higher-level API that people would be working with. But if you’re a library developer, passing pyarrow data structures means you don’t have to convert your data to pandas first, nor include pandas in your dependencies. Depending on what the package does, that could shave seconds off the run time.

So that’s the basic reasoning behind the move to integrate Arrow more deeply into Streamlit. For end-users, it’s more than likely they’ll never encounter an Arrow data structure during data analysis, but there are some benefits to having Streamlit support doing so.

Best,
Randy
