Streamlit app on Google Cloud Run is expensive

Managed to deploy a multi-page Streamlit app on Google Cloud Run. However, even with minimal traffic (just family and friends), Cloud Run is costing about $1-$2 a day. That seems way too expensive, and I can't see anything obvious in the configuration either.

Capacity:

  • 256MiB

  • 1 CPU

  • 1800-second request timeout (30 mins), otherwise session state gets wiped

  • 80 max requests per container

  • Execution environment: Default (I checked, GCP is using First Gen)

  • Autoscaling: 1 min instance + 100 max instances
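For reference, the settings listed above map onto a single `gcloud run deploy` invocation. This is a sketch; the service name, project, and region are placeholders, not from the original post:

```shell
# Hypothetical deploy command mirroring the settings listed above.
# "my-streamlit-app", PROJECT_ID, and the region are placeholders.
gcloud run deploy my-streamlit-app \
  --image gcr.io/PROJECT_ID/my-streamlit-app \
  --memory 256Mi \
  --cpu 1 \
  --timeout 1800 \
  --concurrency 80 \
  --min-instances 1 \
  --max-instances 100 \
  --region us-central1
```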

Has anyone seen something similar? Or can anyone share rough Google Cloud Run costs for their Streamlit app?

Thanks!

Hi @dclin ,

We experience similar costs for the Streamlit apps we host on Google Cloud Run, but of course, this depends a lot on the deployment parameters.

Read through the Google Cloud Run pricing documentation - these three considerations might help you reduce your costs:

  • CPU allocation: The prices for "CPU always allocated" are lower than the prices for "CPU only allocated during request processing". You also get more free vCPU-seconds and GiB-seconds.

  • Region: Prices for Tier 1 regions are lower than prices for Tier 2 regions.

  • Committed use discounts (CUD): This option allows you to get a significant discount if you commit (i.e. no refunds) to a certain usage level.

Hope this helps!

Thanks! @marduk

Out of curiosity, how many concurrent users / billable container seconds do your applications get?

It seems like this may just be the baseline cost of self-hosting a Streamlit application on a serverless stack. Every 10-15 minutes or so, I am observing pings to "st-allowed-message-origins" and "stream" from the Streamlit server, and these pings are essentially keeping a container alive throughout the day, even when there is no actual traffic.
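As a rough sanity check on the "one container kept alive all day" theory: here is a back-of-the-envelope estimate in Python. The per-second rates are approximate Tier 1 "CPU always allocated" list prices and may have changed, and the free tier deduction is ignored, so treat the result as illustrative only:

```python
# Back-of-the-envelope Cloud Run cost for one container kept alive 24/7.
# Rates are approximate Tier 1 "CPU always allocated" prices (may be outdated).
CPU_PER_VCPU_SECOND = 0.000018   # USD per vCPU-second (assumed rate)
MEM_PER_GIB_SECOND = 0.000002    # USD per GiB-second (assumed rate)

seconds_per_day = 24 * 60 * 60   # 86,400 billable seconds if never scaled to zero
vcpus = 1
memory_gib = 256 / 1024          # 256 MiB = 0.25 GiB

daily_cost = seconds_per_day * (vcpus * CPU_PER_VCPU_SECOND
                                + memory_gib * MEM_PER_GIB_SECOND)
print(f"~${daily_cost:.2f} per day")  # lands in the $1-$2/day range reported above
```

With these assumed rates the estimate comes out around $1.60/day, which is consistent with the billing described in this thread.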

Can't seem to find a way to control this behavior in the Streamlit config options either.

I upgraded my Streamlit version to 1.19.0 late Friday. After that, my Cloud Run billing became much more reasonable. My guess is https://github.com/streamlit/streamlit/pull/5534 was the cause of what I was seeing.

Hi again @dclin ,

Forgot to reply to your previous question :slight_smile:.

Our apps might see 3-5 concurrent users at peak times, but never go beyond 1 active container instance (always active, since we use "CPU always allocated"). This depends a lot on your app's code, of course, and is not intended as a benchmark. For example: if your app performs CPU-intensive calculations, keep in mind that Google Cloud Run automatically monitors CPU usage and, if it goes high enough, will spin up a new container instance regardless of the number of concurrent requests.

Regarding Streamlit version 1.19.0 reducing Google Cloud Run costs, thanks a lot for the heads-up, we will test this over the next few days. It would also be great to know if the Streamlit team :balloon: has more insight into what is causing this great side effect :sweat_smile:.

...that seems extremely expensive to me :open_mouth:


I would like to add:
Google Cloud Run may not be the cheapest cloud service platform/provider for Streamlit apps. A Streamlit app is actually a classic web server application that runs 24/7. Maybe you should just rent a small VPS from a classic server provider (DigitalOcean, Vultr, OVH, IONOS and many more...) for about $5-$10 per month. Put the Streamlit app in a Docker container and add a second container with a reverse proxy.
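As a sketch of that two-container VPS setup (image names, domain, and ports here are hypothetical, not from the post), a minimal docker-compose.yml with Caddy as the reverse proxy might look like:

```yaml
# Hypothetical two-container setup: Streamlit app + Caddy reverse proxy.
services:
  app:
    build: .                      # your Streamlit app's Dockerfile
    command: streamlit run app.py --server.port 8501 --server.address 0.0.0.0
    expose:
      - "8501"
  proxy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    # A one-line Caddyfile such as
    #   example.com { reverse_proxy app:8501 }
    # proxies Streamlit's websocket traffic and provisions TLS automatically.
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
```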

After upgrading to 1.19.0 and changing autoscaling to a 0 minimum instance, the cost dropped to ~$0.20 a day (with a similar number of requests).


Hi @dclin, what does that mean exactly for the yaml file? I had the following set up and it was costing me something like $2.40 per day :japanese_goblin:

runtime: custom
env: flex
manual_scaling: 
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10

@Luke

Not the biggest Cloud Run expert here. But I wonder whether your manual_scaling instance count of 1 is keeping a container up the whole time? Check the billable container instance time metric on the Cloud Run dashboard. (When I was on Streamlit 1.17.0, something in that version was causing at least one billable container to be up at all times. That was the cause of my initial bill.)

Also, I did eventually switch course and hosted my app on Streamlit Community Cloud first, so I could gauge the demand before paying for hosting infrastructure.

Thanks for the reply @dclin :slight_smile: Yes, I suspect it's something to do with that, maybe the fact that it's a custom env: flex environment, which costs more money regardless of being constrained to 1 instance.

And just another small point of confusion for me: what do you mean by "keeping a container up the whole time"? How could the app be running if the container wasn't up? I know Streamlit Cloud puts projects to sleep after some inactivity, but then what's the point of paying for a service if it's not up?!

Hi @Luke. I ended up going with autoscaling (rather than manual scaling) and 0 minimum instances. That way, I only paid for a container when there was traffic. Yes, it meant slower startup times, but that was acceptable since the app was still under development.

Separately, a Streamlit 1.17.0 bug caused a single container to be up for me most of the time, despite the autoscaling + 0 minimum instance configuration and minimal friends and family traffic. That was resolved after I upgraded to Streamlit 1.19.0. The single container being up most of the time was costing me around $1-$1.50 a day.

In your case, I agree with you: one container always up, plus your custom setup, may well result in about $2 a day.

The key is to set the minimum instances to zero. You will be able to use it at no cost, as the free tier is largely enough.

When you set the minimum instances to one, it means that you reduce cold starts, but you're paying for resources you're not using. Consider that only for a production service where startup latency (~50 secs) is an issue. After the service is not used for a while, say 10 minutes, the service goes dormant, and the first request will incur that latency.
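The scale-to-zero change described above can be applied to an existing service without redeploying. A sketch, with the service name and region as placeholders:

```shell
# Hypothetical: enable scale-to-zero on an existing Cloud Run service.
gcloud run services update my-streamlit-app \
  --min-instances 0 \
  --region us-central1
```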