Text-extraction-app

I created an app using Streamlit and Tesseract OCR that lets a user upload an image (jpg) of text and have the OCR extract the text. Streamlit and Tesseract run in separate Docker containers, linked via a Flask API, and my plan is to add a docker-compose file so the app can be deployed easily. This is currently just a demonstrator, but there are many features that could be added, such as named entity recognition, sentiment analysis, etc. Anyone interested in contributing is welcome to do so; otherwise, fork it and use it as the foundation for your own projects.
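
For anyone wondering what the Tesseract side of such a setup might look like, here is a minimal sketch. It is not the code from the repo: the `/ocr` route, the `file` form field, and the `create_app` / `run_tesseract` names are assumptions made for illustration, and it assumes Flask, Pillow and pytesseract are installed in the Tesseract container.

```python
import io

from flask import Flask, jsonify, request


def run_tesseract(image_bytes: bytes) -> str:
    """Run Tesseract on raw image bytes and return the extracted text."""
    # Imported lazily so the module still loads where the OCR stack is absent.
    from PIL import Image
    import pytesseract

    image = Image.open(io.BytesIO(image_bytes))
    return pytesseract.image_to_string(image)


def create_app(ocr_fn=run_tesseract) -> Flask:
    """Build the Flask app; ocr_fn can be swapped out, e.g. for testing."""
    app = Flask(__name__)

    @app.route("/ocr", methods=["POST"])  # hypothetical route name
    def ocr():
        uploaded = request.files.get("file")  # hypothetical form field name
        if uploaded is None:
            return jsonify(error="no file uploaded"), 400
        return jsonify(text=ocr_fn(uploaded.read()))

    return app
```

Saved as a (hypothetical) `ocr_api.py`, this could be started with `flask --app ocr_api run` on a recent Flask version. The frontend container then only needs to know the service URL.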
Cheers

That’s really cool! What’s the Flask part for, serving tesseract?

Hi Randy, that’s correct. It allows reusing the Tesseract serving in other applications, and I wanted to try something microservices-like with Streamlit as the frontend.
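
As a rough illustration of that reuse (a sketch, not the repo's code): any Python client, the Streamlit frontend included, could call the OCR service along these lines. The URL, the `/ocr` route, the `file` field, and the `text` key in the JSON response are all assumptions, and the `post` parameter exists only so the HTTP call can be replaced in tests.

```python
import requests


def extract_text(image_bytes: bytes,
                 api_url: str = "http://localhost:5000/ocr",  # assumed address
                 post=requests.post) -> str:
    """Send an image to the OCR service and return the extracted text."""
    # Upload the image as multipart form data under the assumed "file" field.
    files = {"file": ("upload.jpg", image_bytes, "image/jpeg")}
    response = post(api_url, files=files, timeout=30)
    response.raise_for_status()  # surface HTTP errors instead of bad JSON
    return response.json()["text"]  # assumes a {"text": "..."} response body
```

In a Streamlit script, the bytes would typically come from `st.file_uploader(...)`, e.g. `extract_text(uploaded.getvalue())` once a file has been uploaded.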

Hi Robo,

Really cool. I am a newbie and mostly do Python scripting. I can use this at my work, where there is a need for it. Basic questions (I am still learning Streamlit and data science in Python): can’t Streamlit directly host this ‘upload -> extract text’ app? As in, why is Docker needed (I have no knowledge of Docker)? I also thought Streamlit sort of obviated the need for Flask (I don’t want to learn Flask if I don’t have to).

Appreciate it!

Hi Raz
Sure, you can create a monolith and run it without Docker, but in the long run these are decisions you might regret :slight_smile:

Davide Fiocco has published a nice write-up on deploying Streamlit with a FastAPI backend, taking a similar approach to mine with Docker etc. Worth a read!
