AWS text extraction app

I’ve started work on an app to extract text from photos of documents using AWS Rekognition, which I have found to be quite accurate. The long-term goal is that someone could point it at a directory of PDF documents and iterate through them, extracting only the text they care about. They would draw boxes over the parts of the document they want text from, and anything outside those boxes would be excluded automatically. It is currently an MVP: it processes a single image and writes all the extracted text to a JSON file.
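For anyone curious how the box-based filtering might work, here is a rough sketch rather than the app's actual code. It assumes the service is Amazon Rekognition's DetectText, whose response lists `TextDetections` with bounding boxes given as ratios of the image size. The names `extract_regions`, `center_in_box`, and the `header` region are made up for illustration, and the sample response is hand-written, not real output.

```python
import json


def detect_text(image_bytes):
    """Call Rekognition DetectText. Needs boto3 and AWS credentials (not run below)."""
    import boto3  # imported here so the rest of the sketch runs without it
    client = boto3.client("rekognition")
    return client.detect_text(Image={"Bytes": image_bytes})


def center_in_box(bbox, region):
    """True if the center of bbox falls inside region (both use image ratios)."""
    cx = bbox["Left"] + bbox["Width"] / 2
    cy = bbox["Top"] + bbox["Height"] / 2
    return (region["Left"] <= cx <= region["Left"] + region["Width"]
            and region["Top"] <= cy <= region["Top"] + region["Height"])


def extract_regions(response, regions):
    """Keep only LINE detections whose center lies in a user-drawn region."""
    result = {name: [] for name in regions}
    for det in response["TextDetections"]:
        if det["Type"] != "LINE":  # skip WORD entries to avoid duplicates
            continue
        bbox = det["Geometry"]["BoundingBox"]
        for name, region in regions.items():
            if center_in_box(bbox, region):
                result[name].append(det["DetectedText"])
                break
    return result


# Hand-written sample in the shape of a DetectText response (illustrative only).
sample = {"TextDetections": [
    {"DetectedText": "Invoice 1234", "Type": "LINE",
     "Geometry": {"BoundingBox": {"Left": 0.6, "Top": 0.05, "Width": 0.3, "Height": 0.04}}},
    {"DetectedText": "Invoice", "Type": "WORD",
     "Geometry": {"BoundingBox": {"Left": 0.6, "Top": 0.05, "Width": 0.15, "Height": 0.04}}},
    {"DetectedText": "Page 1 of 3", "Type": "LINE",
     "Geometry": {"BoundingBox": {"Left": 0.4, "Top": 0.95, "Width": 0.2, "Height": 0.03}}},
]}

# Hypothetical user-drawn box covering the top-right of the page.
regions = {"header": {"Left": 0.5, "Top": 0.0, "Width": 0.5, "Height": 0.15}}

print(json.dumps(extract_regions(sample, regions), indent=2))
```

Testing on the center of each line instead of requiring full containment means a slightly sloppy box still picks up the text, which seems friendlier for hand-drawn regions. The page footer falls outside the box, so it is dropped.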


Brilliant!