How can I integrate a reCAPTCHA into my app, or anything else that can prevent crawlers from scraping my dataset?

Hello,

I am using Streamlit for a web app, and it’s fantastic! However, my app serves a big dataset, and I found that somebody has tried to use a crawler to pull the data automatically, which wastes system resources. How can I prevent this? My goal is to add a captcha before users can press the search button, or at the beginning of the main function, so that they can only go ahead if they pass it. Or is there another way to achieve this?

Thank you so much!


I haven’t tried reCAPTCHA with a Streamlit app yet. Given the new components feature, I’m planning to try building a reCAPTCHA Streamlit component: the user would pass in their Google reCAPTCHA credentials, and it would return the text that can be rendered with st.html() or something like that.
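For the server-side half of that idea, here is a minimal sketch, assuming the widget has already produced a token in the browser. Google's reCAPTCHA documentation describes a `siteverify` endpoint that takes the secret key and the token and returns JSON with a `success` field. The function names (`verify_recaptcha`, `post_form`) are made up for illustration, and getting the token from the widget into Python is the part a component would still have to solve.

```python
import json
import urllib.parse
import urllib.request

# Verification endpoint from Google's reCAPTCHA documentation.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def post_form(url, fields):
    """POST form-encoded fields and return the decoded JSON response."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def verify_recaptcha(token, secret, post=post_form):
    """Return True only if Google reports the token as valid.

    `post` is injectable (hypothetical design choice) so the logic can be
    exercised without network access.
    """
    if not token:
        return False  # nothing to verify, so skip the HTTP call
    try:
        result = post(VERIFY_URL, {"secret": secret, "response": token})
    except (OSError, ValueError):
        return False  # fail closed on network or JSON errors
    return bool(result.get("success"))
```

The app would then only run the search when `verify_recaptcha(token, secret)` returns True. Failing closed means a network problem blocks real users too, which may or may not be what you want.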

Another option would be to put your app behind a VPN and whitelist the IPs that are allowed to access it.
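Whitelisting is normally enforced in the firewall or reverse proxy rather than in the app, but the check itself is simple. A rough sketch using Python's standard `ipaddress` module; the networks listed and the `is_allowed` name are placeholders, and how the client IP reaches your code depends on your deployment.

```python
import ipaddress

# Hypothetical allowlist; replace with the networks you actually trust.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.8.0.0/24"),     # e.g. a VPN subnet
    ipaddress.ip_network("203.0.113.7/32"),  # a single address (documentation range)
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed address: reject
    return any(addr in net for net in networks)
```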


Thank you for your reply! It would be great if a component could be made for this feature! I am also thinking about setting a time limit, e.g. anyone who stays on the page for more than 10 minutes gets kicked out. I tried this, but the problem is that Streamlit reruns the entire script after each request. I also tried st.cache, but the problem there is that the cache is not per client. In other words, unless somebody clicks the ‘Clear cache’ button, the cached value stays the same for all visitors.
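One way the time limit could work per visitor is to keep the start time in per-session state instead of st.cache. This is only a sketch, assuming a Streamlit version that provides `st.session_state` (a dict-like store that survives reruns for one browser session). The helper name and the `session_started_at` key are made up, and the function accepts any mutable mapping so it can be tried with a plain dict.

```python
import time

SESSION_LIMIT_SECONDS = 10 * 60  # the 10-minute limit mentioned above


def session_expired(state, now=None, limit=SESSION_LIMIT_SECONDS):
    """Record the first-seen time in `state` and report whether the limit has passed.

    `state` is any mutable mapping that persists across reruns for one visitor.
    """
    if now is None:
        now = time.time()
    # setdefault stores the timestamp only on the first run of the session,
    # so full-script reruns do not reset the clock.
    started = state.setdefault("session_started_at", now)
    return (now - started) > limit


# Hypothetical use at the top of the Streamlit script:
#   import streamlit as st
#   if session_expired(st.session_state):
#       st.warning("Session expired, please reload the page.")
#       st.stop()
```

Note that reloading the page starts a new session, so this limits how long one session can run but will not stop a crawler that simply reconnects.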