Hey community! This is my first time sharing an app. I’ve been working on a little side project that I thought might be useful for anyone involved in research. It’s a Document Highlight Extractor, and it’s designed to make my (and hopefully your) life easier when you’re knee-deep in research papers or lengthy documents.
A big pain point for me is compiling all the highlights from my research projects. It’s a tedious, time-consuming process that usually involves a lot of copy-pasting and formatting headaches. I couldn’t find anything that was both reliable and free, so I tried making my own.
This app automates the extraction of highlighted text from PDFs and Word documents. Here are some features:
- Highlight Extraction: Pulls out all your highlighted text automatically.
- Image Extraction: Experimental feature to grab images from your documents too. This is kind of patchy, but neat when it works.
- Multiple Output Formats: Generates a neat, formatted document with your highlights in PDF, Word, HTML, or Markdown.
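For anyone curious what highlight extraction can look like under the hood, here is a minimal sketch for the Word side only. It is not necessarily how the app does it: it assumes highlights are stored as `w:highlight` elements in the `word/document.xml` part of the .docx file, reads that with the Python standard library, and turns the result into a Markdown list. The function names `extract_docx_highlights` and `to_markdown` are made up for this example, and runs nested inside hyperlinks are skipped to keep it short.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def extract_docx_highlights(source):
    """Return a list of (color, text) tuples for highlighted text in a .docx.

    `source` can be a file path or a binary file-like object.
    Hypothetical helper for illustration, not the app's actual code.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    highlights = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        current_color, parts = None, []
        # Only direct child runs are inspected (a simplification).
        for run in paragraph.findall("w:r", NS):
            mark = run.find("w:rPr/w:highlight", NS)
            color = mark.get(f"{{{W_NS}}}val") if mark is not None else None
            if color == "none":
                color = None
            text = "".join(t.text or "" for t in run.findall("w:t", NS))
            # Word often splits one highlighted phrase across several runs,
            # so adjacent runs with the same color are merged.
            if color != current_color:
                if current_color and parts:
                    highlights.append((current_color, "".join(parts)))
                current_color, parts = color, []
            if color:
                parts.append(text)
        if current_color and parts:
            highlights.append((current_color, "".join(parts)))
    return highlights


def to_markdown(highlights, title="Highlights"):
    """Render (color, text) tuples as a simple Markdown bullet list."""
    lines = [f"# {title}", ""]
    for color, text in highlights:
        if text.strip():
            lines.append(f"- {text.strip()} *({color})*")
    return "\n".join(lines) + "\n"
```

Calling `to_markdown(extract_docx_highlights("notes.docx"))` (with `notes.docx` standing in for any Word file) would give a Markdown string you could write to disk or display in a Streamlit page. PDFs store highlights differently, as annotations, so they would need a separate code path.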
It’s not perfect, but it kind of does the job for me! Here are some areas I think could use some love. All feedback is helpful:
- Performance Optimization: It’s not too bad, but I’m a novice at this.
- Image Extraction: This feature is still experimental and could use some refinement.
- UI Enhancement: The Streamlit interface is functional, but there’s room for a more polished look or perhaps some additional custom components you guys think it might benefit from.
- Additional File Formats: Currently limited to PDFs and Word docs. Support for more formats would be awesome, and I’m open to suggestions.
- Error Handling: It’s pretty basic right now. More robust error handling would make it more user-friendly. I’m working on this.