Data transfer costs using Streamlit

I’m running Streamlit on an AWS EC2 server, with the data in an S3 bucket. I’m trying to understand the data transfer costs associated with this setup, i.e., how much data is transferred externally (to the internet user).

Specifically: how much of the data that is loaded into the EC2 server from the S3 bucket is sent to the end client when displaying charts and dataframes in the web browser?

Scenario:
I load a large pandas dataframe from S3 onto the EC2 server and create a plotly chart using only a couple of columns from that dataframe.

Question:
Is the data transferred externally closer to the size of the couple of dataframe columns used to create the plotly chart, or closer to (and perhaps larger than) the size of the entire dataframe?

Any notes about st.cache with respect to data transfer are also appreciated.
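For context on the caching part of the question: st.cache memoizes on the server, so the expensive S3 → EC2 read happens once instead of on every script rerun; it does not change what is sent to the browser. Here is a rough stdlib sketch of that memoization idea, where `load_from_s3` is a hypothetical stand-in for the real S3 read (not actual Streamlit or boto3 code):

```python
import functools

CALLS = {"load": 0}  # counts how many times the loader body actually runs

@functools.lru_cache(maxsize=None)  # stands in for Streamlit's @st.cache
def load_from_s3(key: str) -> tuple:
    """Hypothetical loader: pretend this pulls a large object from S3."""
    CALLS["load"] += 1
    return tuple(range(1000))  # placeholder payload

# Streamlit reruns the whole script on every user interaction; with
# caching, the S3 read only happens on the first run for a given key.
for _rerun in range(3):
    data = load_from_s3("s3://bucket/big.parquet")

print(CALLS["load"])  # the loader body ran only once
```

So with caching you pay the S3-to-EC2 transfer once per cache key, while the EC2-to-browser transfer still happens on every page load.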

For that scenario, my understanding is that the browser only receives the Plotly figure spec (the trace data you actually put into the chart), not the backing dataframe, so the full dataframe would not flood the client’s browser memory.
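One way to build intuition for this: a Plotly chart travels as a JSON figure spec containing only the trace arrays, so the external transfer should track the plotted columns, not the whole dataframe. A stdlib-only sketch comparing the two serialized sizes (the 10-column “dataframe” and the “figure” dict here are simplified stand-ins, not real pandas/plotly objects):

```python
import json

rows = 10_000

# Stand-in for a wide dataframe: 10 numeric columns of similar size.
dataframe = {f"col{i}": [float(r + i) for r in range(rows)] for i in range(10)}

# Stand-in for a Plotly figure spec: only the two plotted columns
# end up inside the trace that would be sent to the browser.
figure = {"data": [{"x": dataframe["col0"], "y": dataframe["col1"], "type": "scatter"}]}

full_bytes = len(json.dumps(dataframe).encode())
chart_bytes = len(json.dumps(figure).encode())

print(chart_bytes < full_bytes)  # True: the chart payload is much smaller
```

Under these assumptions the payload ratio lands near the 10:2 column ratio, which is what you would hope to see in the browser’s Network tab as well.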

You can test this somewhat on your local machine by comparing resource usage of the server process against that of the browser. Run Streamlit from the command line (CMD, or whatever you use) and monitor its memory allocation via Task Manager or htop (Linux). Then load your page in the browser and check the memory allocated for that tab by hovering over the tab header if you use Chrome. Or, even more precisely, press F12, go to the Network tab, and monitor the exact sizes of the payloads received client side.

Hope this helps!
