I have a large image classification model (859 MB) that I want to deploy on Streamlit, but since Streamlit Cloud can't clone the large model from GitHub, I added it as a release asset in the repository and referenced it in
requirements.txt like the following:
artists_classifier @ https://github.com/nainiayoub/paintings-artist-classifier/releases/download/v1.0.0/artists_classifier.h5
This is the same way I would add a spaCy model to requirements.txt.
Apparently this causes an error while installing the requirements on Streamlit Cloud. Any idea what causes the problem?
Thanks in advance for your help!
On Streamlit Cloud, each app is allocated 1 GB of RAM. Large image classification models will exceed that limit and crash the app. I would suggest hosting the model inference process on another provider and making API calls to it from your Streamlit Cloud app.
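A minimal sketch of that setup, assuming a hypothetical remote inference endpoint: the Streamlit app packages the uploaded image into a JSON payload and POSTs it to the API. The endpoint URL, payload shape, and `build_inference_payload` helper are all assumptions for illustration, not part of Streamlit Cloud.

```python
import base64
import json

# Hypothetical endpoint; replace with wherever the model is actually hosted.
API_URL = "https://your-inference-host.example.com/predict"

def build_inference_payload(image_bytes, top_k=3):
    """Package raw image bytes as a JSON string for the remote inference API."""
    return json.dumps({
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "top_k": top_k,
    })

# In the Streamlit app, the payload would be sent to the remote API, e.g.:
#   import requests, streamlit as st
#   uploaded = st.file_uploader("Painting", type=["jpg", "png"])
#   if uploaded:
#       resp = requests.post(API_URL, data=build_inference_payload(uploaded.read()),
#                            headers={"Content-Type": "application/json"})
#       st.write(resp.json())
```

This keeps the heavy model weights entirely off Streamlit Cloud; the app only holds the image bytes and the small JSON response.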
Hello @snehankekre, thank you so much for your response.
So I tried what you suggested by building an API with FastAPI and deploying it on Heroku.
However, I had added my large model (859 MB) to the repository via Git LFS, which Heroku doesn't support by default, and even if it did, the model alone would saturate the slug size, since the limit is 500 MB.
I am not sure if I should ask this here but what I did as a solution was to request the model at the top of the app and use it to return the prediction like the following:
import urllib.request

# Download the released model file next to the app at startup
url = 'https://github.com/nainiayoub/paintings-artist-classifier/releases/download/v1.0.0/artists_classifier.h5'
model_file = url.split('/')[-1]
urllib.request.urlretrieve(url, model_file)
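A slightly more robust version of this idea (my sketch, not from the original post) downloads the release asset only once and reuses the local copy on later calls, so the 859 MB file isn't re-fetched every time the process restarts a request handler:

```python
import os
import urllib.request

MODEL_URL = ('https://github.com/nainiayoub/paintings-artist-classifier/'
             'releases/download/v1.0.0/artists_classifier.h5')

def fetch_model(url=MODEL_URL, dest=None):
    """Download the released model file once and reuse the local copy."""
    dest = dest or url.split('/')[-1]
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest
```

Calling `fetch_model()` at the top of the API module then returns a local path that can be passed straight to the model-loading code.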
My API was successfully deployed; however, it returns
503 Undocumented Error: Service Unavailable.
At this point I am stuck and not sure how to proceed. Do you have any idea, or an alternative solution for deploying my large model?
Thank you so much in advance!
The solution that I found is to reduce my model size by converting it to TFLite like the following, and using it afterwards to classify the input images in an API:
import tensorflow as tf

tflite_model_name = 'model_reduced'

# Convert the trained Keras model to TensorFlow Lite with size optimizations
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]  # deprecated alias of tf.lite.Optimize.DEFAULT
tflite_model = converter.convert()

# Write the converted model to disk
with open(tflite_model_name + '.tflite', 'wb') as f:
    f.write(tflite_model)
The model size was reduced to 72 MB, which, together with the dependencies, fit within the Heroku slug-size limit, so the API was deployed successfully.
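For completeness, a sketch of running inference with the reduced .tflite file inside the API, using `tf.lite.Interpreter`. The function name and the preprocessing contract are assumptions; the input array must already match the shape and dtype the real model expects.

```python
import numpy as np
import tensorflow as tf

def classify_image(tflite_path, image_array):
    """Run one preprocessed image through a TFLite model and return the scores.

    `image_array` must already have the shape/dtype the model expects
    (an assumption here; adjust to the real model's input spec).
    """
    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    interpreter.set_tensor(inp['index'], image_array.astype(inp['dtype']))
    interpreter.invoke()
    return interpreter.get_tensor(out['index'])
```

The interpreter loads the 72 MB file lazily per call here; in a real API it would typically be created once at startup and reused across requests.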