How to download large model files to the sharing app?

Hi! I have a DL app that I’m trying to deploy where the model weights are about 200 MB. This is too big to check into GitHub, so I’m trying to have the app dynamically load the files from a Google Drive. Is there a better way to do this?

Hi @metasemantic, welcome to the forum!

I think downloading from Dropbox/Drive/an S3 bucket/… is your best bet for now. Do note that the team is aware of this requirement :wink:

Cheers,
Fanilo

2 Likes

Sounds good. TBH, this is something that I would pay for. I think GitHub LFS costs something too, though I’m not 100% sure on that.

A tutorial to point new users to for artifacts would be great if it exists.

1 Like

Hi! Can you please tell me how to use Dropbox for loading the weights? I have an image captioning model in PyTorch (two ~100 MB files).
Also, if we download weights from a service like Dropbox, are the weights downloaded each time, or are they stored/cached once downloaded?

Of course! I haven’t used Dropbox, but my Google Drive solution should be similar, and you can adapt it. This is what I used in my skyAR app:

import streamlit as st
import torch
from pathlib import Path

from GD_download import download_file_from_google_drive

@st.cache
def load_model():

    save_dest = Path('model')
    save_dest.mkdir(exist_ok=True)

    f_checkpoint = Path("model/skyAR_coord_resnet50.pt")

    if not f_checkpoint.exists():
        with st.spinner("Downloading model... this may take a while! \n Don't stop it!"):
            # cloud_model_location holds the Google Drive file ID of the checkpoint
            download_file_from_google_drive(cloud_model_location, f_checkpoint)

    # device is defined elsewhere in the app (e.g. "cpu" or "cuda")
    model = torch.load(f_checkpoint, map_location=device)
    model.eval()
    return model

Basically, check if the weights exist; if not, download them from the cloud. Because Streamlit sharing uses a semi-permanent instance, you only have to download once for all users. st.cache is clutch here: within each session you only need to load the model once. (GD_download is a small helper script; see the sketch below.)
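
A helper like download_file_from_google_drive typically looks something like this (a sketch based on the widely shared requests recipe; details may differ from the actual GD_download script):

import requests

def download_file_from_google_drive(file_id, destination):
    # Google Drive gates large files behind a virus-scan warning page;
    # grab the confirmation token from the cookies and retry with it.
    URL = "https://docs.google.com/uc?export=download"
    session = requests.Session()
    response = session.get(URL, params={"id": file_id}, stream=True)
    token = next(
        (v for k, v in response.cookies.items() if k.startswith("download_warning")),
        None,
    )
    if token:
        response = session.get(URL, params={"id": file_id, "confirm": token}, stream=True)
    with open(destination, "wb") as f:
        for chunk in response.iter_content(chunk_size=32768):
            if chunk:  # filter out keep-alive chunks
                f.write(chunk)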

4 Likes

Hey! Thanks for replying.
I actually used the wget command inside Python via os.system(). The only catch was that I had to install wget via packages.txt.
You can check out my code if you are interested. I am also downloading the weights and storing them, although I am not using cache; I removed it after I got a "Resource exceeded" error this morning.
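
Roughly, the pattern is (a sketch with a hypothetical path and a placeholder URL):

import os
from pathlib import Path

f_weights = Path("model/weights.pt")  # hypothetical local path
if not f_weights.exists():
    f_weights.parent.mkdir(exist_ok=True)
    # wget is listed in packages.txt so the sharing backend installs it
    os.system(f"wget -O {f_weights} '<direct-download-url>'")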

You can check my CaptionBot here.

I am seeing this GD_download for the first time. I will use this in the future. Thanks!
Btw, the skyAR app is awesome!!!

1 Like

Awesome, glad it worked. I had problems with resource limits too. What helped with caching was putting in a TTL (time to live) and a limit on the number of items to cache. I don’t like using os.system since it relies on the backend (though wget should be installable on every Linux backend, right?).
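
For example (a minimal sketch; load_image is a stand-in name, and the one-hour TTL and ten-entry cap are illustrative values):

import streamlit as st

@st.cache(ttl=3600, max_entries=10)  # drop entries after an hour, keep at most 10
def load_image(path):
    ...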

Thoughts on your app:
1. Have a default image so people can see what it looks like.
2. Allow a button to repeat the caption generation process.
3. Have links somewhere (on the sidebar?) to what model you’re using.

Nice job!

3 Likes

Thank you for the suggestions. I will make these changes ASAP. I think I will link the README section where I discuss the models. I did try to use a generate button during local testing, but it seems that the Python program does not wait for the file uploader. This leads to a None object from the file_uploader, and then nothing is generated. Although, that’s only for the first prediction.
(I don’t really understand whether Streamlit keeps looping over the same program? I will read a bit more on Streamlit.)

Yes, wget is available on all Linux backends; one can always install it with apt-get. When this worked, I used it :sweat_smile:. I had looked into gdown, requests, and even torch.hub.download_url_to_file(url, dst, hash_prefix=None, progress=True).
Check here for more info.
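
That torch.hub helper is handy because it ships with PyTorch, so no extra dependency is needed. Something like this (placeholder URL and paths):

import torch.hub

# Streams the file to dst, optionally showing a progress bar; hash_prefix,
# if given, checks the download against a prefix of its SHA-256 hash.
torch.hub.download_url_to_file(
    "https://example.com/weights.pt",  # placeholder URL
    "model/weights.pt",
    hash_prefix=None,
    progress=True,
)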

But it seems the session stopped for all of them. This was the first time I deployed something; thanks to Streamlit, the process was much smoother, I guess (Heroku would have been much more time-consuming, and the storage is also smaller). I have college exams coming up, so I am in a bit of a hurry :sweat_smile:

Regarding the None from the uploader: I load a default filename if the uploader returns None:

mod_sky_img = st.sidebar.file_uploader("Upload Sky Image 🌧️")
# Fall back to the default skybox from args when nothing has been uploaded yet
f_skybox = args["f_skybox"] if mod_sky_img is None else mod_sky_img
img_sky = load_skybox_image(f_skybox)

Then, when loading the image, I had to either load from the file or open the stream directly from Streamlit.

import cv2
import numpy as np
import streamlit as st
from PIL import Image

@st.cache(ttl=3600, max_entries=10)
def load_skybox_image(f_img):

    # f_img is either a filename (the default image) or an UploadedFile from st.file_uploader
    if isinstance(f_img, str):
        img = cv2.imread(f_img, cv2.IMREAD_COLOR)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    else:
        img = np.array(Image.open(f_img))
    ...
1 Like

Thanks. That is pretty clever. I will make the changes now. My app went over resource limits again; it ran fine for like 30 hours, and I don’t know why (I have removed st.cache() also). I have to look into this. (I am learning from your source code :grinning:)

Looking at your code, I realise that GD_download is a script and not a package :rofl: :rofl: (Earlier, I thought it was a module.)

Update:
Looking at your code

@st.cache(ttl=3600, max_entries=10)
def compute_skymask(img):
    # args, model, and device are globals defined elsewhere in the app
    h, w, c = img.shape
    imgx = cv2.resize(img, (args["in_size_w"], args["in_size_h"]))
    imgx = np.array(imgx, dtype=np.float32)
    imgx = torch.tensor(imgx).permute([2, 0, 1]).unsqueeze(0)

    with torch.no_grad():
        pred = model(imgx.to(device))
        pred = torch.nn.functional.interpolate(
            pred, (h, w), mode="bicubic", align_corners=False
        )
        pred = pred[0, :].permute([1, 2, 0])
        pred = torch.cat([pred, pred, pred], dim=-1)
        pred = np.array(pred.detach().cpu())
        pred = np.clip(pred, a_max=1.0, a_min=0.0)

Won’t caching the function cause any changes to the model’s operations when you are passing it to the resnet?
model(imgx.to(device))

If no changes happen, I could probably just cache the function in which i do four beam searches.

Not sure what you mean by “Won’t caching the function cause any changes to the model’s operations”, but ideally with inference we don’t want to change the model! Running the inference step in torch.no_grad() should be free of side effects.

I also learned Streamlit literally a few weeks ago, so I’m not the resident expert! Also, if you edit a comment, I won’t get notified of it until I come back to the forums.

Thank you, sir! I implemented all the suggestions and made the caching-related changes. It has been working fine so far.
Yes, I meant side effects only :sweat_smile:. Yes, I have been using torch.no_grad() when passing the image through the resnet, but I just realised that I am not using torch.no_grad() for the decoder :no_mouth:. Although, that was the only function I had not cached, and since I am not performing any gradient updates, I was safe.

I think I will save memory with torch.no_grad(). It could be the reason for the resource limit being exceeded!!

Yeah, torch.no_grad() will save memory… but it will also speed up the computation! It could be up to 2 to 3 times faster, as you don’t have to run autograd.
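
In code, it’s just a matter of wrapping the forward pass (a minimal sketch reusing the model/imgx/device names from the snippet above):

import torch

model.eval()  # put dropout/batch-norm layers into inference mode
with torch.no_grad():  # don't build the autograd graph: less memory, faster
    pred = model(imgx.to(device))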

1 Like

Yes. I somehow missed/forgot it in the streamlit script. Thanks. :innocent:

Also, I recently read about TTL. Another reason for exceeding resources could be the images, which might have been stored in some form on the server. Adding the default image function probably rectifies that. (Before implementing the default image function, the FileUploader object used to keep the previous image that I had uploaded in byte form, and that probably happens for all users. But then, AFAIK, those images are stored in the user’s browser.)

Enough speculation :grinning: now. Thanks a lot for your time.
You can check out the CaptionBot v1.1 now.

Hey, so I just noticed that my application is using Python 3.5, which is currently incompatible with Path().mkdir(). Is there a way to specify the Python version that we are using?

@Djmcflush, I’m not sure if this should be a new question, but are you talking about Streamlit sharing? In that case, I think they assume you are using Python >= 3.6, based on this page: https://docs.streamlit.io/en/stable/

If it’s Path().mkdir() that is the problem, you could replace it with os.mkdir, which is similar.
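
For instance, this is roughly equivalent to the Path('model').mkdir(exist_ok=True) call above (a sketch; os.makedirs adds the exist_ok handling that plain os.mkdir lacks):

import os

os.makedirs('model', exist_ok=True)  # no error if the directory already exists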

Hi @metasemantic,

Model weights can be saved within a GitHub release.

You can then download these within your Streamlit app using:

import urllib.request

url = 'https://github.com/pymedphys/data/releases/download/VacbagModelWeights/unet_vacbag_512_dsc_epoch_120.hdf5'
filename = url.split('/')[-1]

# download the release asset into the working directory
urllib.request.urlretrieve(url, filename)

Within the PyMedPhys repository (https://github.com/pymedphys/pymedphys) I go a few steps further and also cache the download to the user’s home directory, among a few other things, so that it doesn’t need to be redownloaded. See that logic over at:
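
In essence, the cache-to-home-directory pattern looks something like this (a simplified sketch, not the actual PyMedPhys implementation):

import urllib.request
from pathlib import Path

def fetch_weights(url):
    # hypothetical helper: download once into ~/.cache and reuse on later runs
    cache_dir = Path.home() / ".cache" / "model-weights"
    cache_dir.mkdir(parents=True, exist_ok=True)
    dst = cache_dir / url.split("/")[-1]
    if not dst.exists():
        urllib.request.urlretrieve(url, dst)
    return dst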

Zenodo (https://zenodo.org/) also offers free large file storage.

2 Likes

@SimonBiggs the progress bar with the download is a really nice trick! I’ll have to try that sometime.

In the meantime, I was able to get my model weights loaded in via a GDrive link that downloads once and caches. I found that GitHub wouldn’t save my model weights when they were > 500 MB. Zenodo is a nice idea, but I don’t want to create a full DOI for a one-off project using model weights that aren’t even mine to begin with.

1 Like

There is actually a “Zenodo sandbox” site where you can upload stuff and then delete it. It’s designed for demo uploads of data:

https://sandbox.zenodo.org/

1 Like