Streamlit app on Google Cloud Run is expensive

I managed to deploy a multi-page Streamlit app on Google Cloud Run. However, even with minimal traffic (just family and friends), Cloud Run is costing about $1-$2 a day. That seems way too expensive, and I can’t see anything obviously wrong either.

Capacity:

  • 256MiB

  • 1 CPU

  • 1800 s request timeout (30 mins), otherwise session state is wiped

  • 80 max requests per container

  • Execution environment: Default (I checked, GCP is using First Gen)

  • Autoscaling: 1 min instance + 100 max instances

Has anyone seen something similar? Or does anyone have a rough Google Cloud Run cost for their Streamlit app that they can share?

Thanks!

Hi @dclin ,

We experience similar costs for the Streamlit apps we host on Google Cloud Run, but of course, this depends a lot on the deployment parameters.

Read through the Google Cloud Run pricing documentation - these three considerations might help you reduce your costs:

  • CPU allocation: The prices for “CPU always allocated” are lower than the prices for “CPU only allocated during request processing”. You also get more free vCPU-seconds and GiB-seconds.

  • Region: Prices for Tier 1 regions are lower than prices for Tier 2 regions.

  • Committed use discounts (CUD): This option allows you to get a significant discount if you commit (i.e. no refunds) to a certain usage.

Hope this helps!

Thanks! @marduk

Out of curiosity, how many concurrent users / billable container seconds do your applications get?

It seems like this may just be the baseline cost of self-hosting a Streamlit application on a serverless stack. Every 10-15 minutes or so, I am observing pings to “st-allowed-message-origins” and “stream” from the Streamlit server, and these pings are essentially keeping a container alive throughout the day, even when there is no actual traffic.

Can’t seem to find a way to control this behavior in streamlit configs either.

I upgraded my streamlit version to 1.19.0 late Friday. After that, Cloud Run billing is now much more reasonable. My guess is https://github.com/streamlit/streamlit/pull/5534 was the cause of what I was seeing.
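If you want to check whether a deployment is already on the version mentioned above, here is a minimal sketch. It only compares version strings; `FIXED_IN` and the helper names are hypothetical, and the link between 1.19.0 and the lower bill is this thread's guess, not something confirmed here.

```python
import re

# Version reported in this thread as the point where billing improved.
# Treat it as an observation from one deployment, not a confirmed fix.
FIXED_IN = (1, 19, 0)


def parse_version(text):
    """Return the leading major.minor.patch numbers of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_reported_fix(installed):
    """True if the installed version is at or above FIXED_IN."""
    return parse_version(installed) >= FIXED_IN


print(has_reported_fix("1.17.0"))
print(has_reported_fix("1.19.0"))
```

Inside the container, the installed version string can be read with `importlib.metadata.version("streamlit")` and passed to `has_reported_fix`.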

Hi again @dclin ,

Forgot to reply to your previous question :slight_smile:.

Our apps might see 3-5 concurrent users at peak times, but never go beyond 1 active container instance (always active since we use “CPU always allocated”). This depends a lot on your app’s code, of course, and is not intended as a benchmark. For example, if your app performs CPU-intensive calculations, keep in mind that Google Cloud Run automatically monitors CPU usage and, if it goes high enough, will start a new container instance regardless of the number of concurrent requests.

Regarding Streamlit version 1.19.0 reducing Google Cloud Run costs, thanks a lot for the heads-up, we will test this over the next few days. It would also be great to know if the Streamlit team :balloon: has more insight into what is causing this great side effect :sweat_smile:.

…that seems extremely expensive to me :open_mouth:


I would like to add:
Google Cloud Run may not be the cheapest cloud platform for Streamlit apps. A Streamlit app is actually a classic web server application that runs 24/7. Maybe you should just rent a small VPS from a classic server provider (DigitalOcean, Vultr, OVH, Ionos and many more…) for about $5-$10 per month. Put the Streamlit app in a Docker container and add a second container with a reverse proxy.

After 1.19.0, and changing autoscaling to 0 min instance, the cost now drops to ~$0.20 a day (with similar # of requests).

Hi @dclin, what does that mean exactly for the yaml file? I had the following set up and it was costing me something like $2.40 per day :japanese_goblin:

runtime: custom
env: flex
manual_scaling: 
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10

@Luke

I’m not the biggest Cloud Run expert here, but I wonder whether your manual_scaling setting of 1 instance is causing a container to be up the whole time. Check the billable container instance time metric on the Cloud Run dashboard. (When I was on Streamlit 1.17.0, something in that version was causing at least one billable container to be up at all times. That was the cause of my initial bill.)

Also, I did eventually switch course and hosted my app on Streamlit Community Cloud first, so I can gauge the demand before paying for hosting infrastructure.

Thanks for the reply @dclin :slight_smile: Yes, I suspect it’s something to do with that. Maybe it’s the fact that it’s a custom env: flex environment, which costs more money even though it’s constrained to 1 instance.

And just another small point of confusion for me, what do you mean: “causing the container to be up the whole time”? How could the app be running if the container wasn’t up? I know streamlit cloud puts projects to sleep after some inactivity, but then what’s the point of paying for a service if it’s not up?!

Hi @Luke. I ended up going with autoscaling (rather than manual scaling) and 0 minimum instances. That way, I only paid for a container when there was traffic. Yes, it meant slower startup time, but that was acceptable since the app was still under development.
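For reference, here is a rough sketch of how those settings map onto `gcloud run deploy` flags, written as a small Python helper that only assembles and prints the command. The service name and image path are hypothetical placeholders, and the default values simply mirror the capacity settings listed at the top of this thread, so adjust them to your own setup.

```python
import shlex


def build_deploy_command(service, image, min_instances=0, max_instances=100,
                         concurrency=80, timeout_seconds=1800,
                         memory="256Mi", cpu=1):
    """Assemble a `gcloud run deploy` invocation as an argument list."""
    return [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--min-instances={min_instances}",   # 0 lets the service scale to zero
        f"--max-instances={max_instances}",
        f"--concurrency={concurrency}",       # max requests per container
        f"--timeout={timeout_seconds}",       # request timeout in seconds
        f"--memory={memory}",
        f"--cpu={cpu}",
    ]


# Hypothetical service name and image, for illustration only.
command = build_deploy_command(
    "my-streamlit-app", "gcr.io/my-project/my-streamlit-app"
)
print(shlex.join(command))
```

The helper does not run anything. Paste the printed command into a shell, or pass the list to `subprocess.run`, once the values look right.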

Separately, a Streamlit 1.17.0 bug caused a single container to be up for me most of the time, despite the autoscaling + 0 minimum instance configuration and minimal friends and family traffic. That was resolved after I upgraded to Streamlit 1.19.0. The single container being up most of the time was costing me around $1-$1.50 a day.
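As a back-of-the-envelope check on that figure, here is a minimal sketch of the arithmetic for one container that stays billable all day. The per-second rates are assumptions (Tier 1 list prices as I understand them), so check them against the current pricing page. The sketch also ignores the free tier and any idle-instance pricing, which means real bills should come out somewhat lower.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class Rates:
    """Per-second prices in USD (assumed values, not authoritative)."""
    vcpu_second: float
    gib_second: float


# Assumed rates for "CPU only allocated during request processing".
REQUEST_BASED = Rates(vcpu_second=0.000024, gib_second=0.0000025)


def daily_cost(billable_seconds, vcpus, memory_gib, rates):
    """Cost of the given billable time, before any free-tier credit."""
    cpu = billable_seconds * vcpus * rates.vcpu_second
    memory = billable_seconds * memory_gib * rates.gib_second
    return cpu + memory


# One container with 1 vCPU and 256 MiB (0.25 GiB), billable around the clock.
full_day = daily_cost(SECONDS_PER_DAY, vcpus=1, memory_gib=0.25,
                      rates=REQUEST_BASED)
print(f"${full_day:.2f} per day")
```

Under these assumptions a container that never goes idle lands at roughly $2 a day, in the same range as the bills reported in this thread. Lowering `billable_seconds` shows how quickly the cost drops once the container can actually scale to zero.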

In your case, I agree with you: one container always up plus your custom setup may well result in about $2 a day.

The key is to set the minimum instances to zero. You will be able to use it at no cost, as the free tier is more than enough.

When you set the minimum instances to one, you reduce cold starts, but you’re paying for resources that you’re not using. Consider that only for a production service where startup latency (~50 secs) is an issue. After the service has not been used for a while, say 10 minutes, it goes dormant, and the next request will incur that startup latency.