CaptionBot: The image captioning bot

Hi everybody!

CaptionBot takes an image and generates a caption of fewer than 40 words. It is an implementation of the research paper “Show, Attend and Tell”.
You can try it here. The source code is linked here.

Special thanks to the Streamlit team, the forums, and @metasemantic for answering my questions.


I gave it a go!

Looks like my :cat2: is a :dog2: and I never knew! :stuck_out_tongue_winking_eye:

Love the app, very easy to use!

Happy Streamlit-ing!


Hi! Actually, the model does not work well with cats :sweat_smile: :sweat_smile:. It seems there are no cat images in the Flickr30K dataset. In the ImageNet dataset, too, dogs are more common. Subjects like dogs and people work much better.


Thanks for trying it out!

WHAT! no cats in Flickr30K!?!? :scream_cat: :crying_cat_face:


Super cool! I hope this becomes standard, where researchers put up Streamlit demos right away.

A few thoughts on the app:

1] Maybe include a link to the publication or model on the app? (Streamlit team, would be really nice if we could modify the footer HINT HINT @Marisa_Smith et al )
2] Auto generate the captions if an image is uploaded … I mean why not?
3] st.file_uploader(label= 'Upload Image', type = ['png', 'jpg', 'jpeg']) why not webp? Dragging an image from another webpage sometimes uses that format and st can handle it just fine.
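
To make points 2 and 3 concrete, here is a rough sketch of what I have in mind, not the app's actual code. `run_app` and `generate_caption` are hypothetical names I made up; `generate_caption` stands in for whatever function wraps the model, and I am assuming it accepts raw image bytes. The `st` argument is just the imported `streamlit` module, passed in so the function is easy to try out.

```python
# Accepted upload formats; "webp" is the addition suggested in point 3.
ALLOWED_TYPES = ["png", "jpg", "jpeg", "webp"]


def run_app(st, generate_caption):
    """Show the uploader and caption the image as soon as one is uploaded.

    `st` is the imported streamlit module. `generate_caption` is a
    hypothetical callable (an assumption, not CaptionBot's real API)
    that takes raw image bytes and returns the caption as a string.
    """
    uploaded = st.file_uploader(label="Upload Image", type=ALLOWED_TYPES)
    if uploaded is None:
        return None  # nothing uploaded yet, so there is nothing to caption

    st.image(uploaded)  # preview the uploaded image
    # No button: Streamlit reruns the script on upload, so this runs at once.
    caption = generate_caption(uploaded.getvalue())
    st.write(caption)
    return caption
```

In the app script you would `import streamlit as st` and call `run_app(st, your_caption_function)` at the bottom. A button could still sit next to this for regenerating captions.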


@metasemantic hahaha, I will pass along your very subtle suggestion! :crazy_face: :rofl:


@metasemantic psst, add your text to the footer like this until then :wink:

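import streamlit as st

# Append text after Streamlit's default footer via injected CSS
footer_css = """<style>
    footer:after {
        content: " References: XYZ, ABC!" !important;
        color: red !important;
        font-size: 16px !important;
    }
</style>"""
st.markdown(footer_css, unsafe_allow_html=True)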

It will look like this,

Hope it helps!


Thanks again!

  1. The link to the GitHub repository, where I have explained everything, is given in the sidebar. But I will add a footer.
  2. I am using a button because I feel it creates some excitement. Also, you can press the button repeatedly to generate captions. In the earlier version of CaptionBot (trained on Flickr8k, and it had some mistakes), each button press created new captions. This model is mostly free of those mistakes and thus has less “randomness” in its predictions, but it might still create different captions for complex images.
  3. Thanks. I will add this right now. I was unaware of ‘webp’. :rofl: