Extracting images from a docx

turrrick · October 25, 2021, 9:48am

Hello everyone,

I am working on an app that automatically processes Word files: it takes them in and repurposes them by extracting text snippets and some images, and more generally by copying data from one file into a different template.

Locally, I save these images in a temp folder using zipfile:

from zipfile import ZipFile

IMAGE_EXT = ('.png', '.jpeg', '.jpg')

def is_image(filename):
    # endswith accepts a tuple of suffixes; lower() also matches e.g. ".PNG"
    return filename.lower().endswith(IMAGE_EXT)

with ZipFile("incoming_file.docx") as working_zip:
    # keep only the image members of the archive
    image_list = [name for name in working_zip.namelist() if is_image(name)]
    working_zip.extractall(path=r".\Images", members=image_list)

Some of these images are then placed inside the template I want to use:

import docx

t = docx.Document(r".\Template.docx")
# new_product_images is defined elsewhere: the images to insert
pics_table = t.add_table(rows=len(new_product_images), cols=1)
pics_table.style = "Table Grid"
for r, image in enumerate(new_product_images):
    cell = pics_table.cell(r, 0)
    cell.add_paragraph().add_run().add_picture(image)

Locally, this works fine.
How would you go about doing this in a web app, where temporary storage on the server is likely to be hard to come by?
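
One direction that might work is to skip the disk entirely and keep the images in memory. The following is only a minimal sketch under assumptions: the function name extract_images_in_memory is made up for illustration, and it assumes the uploaded .docx reaches the app as a path or a binary file-like object (ZipFile accepts either).

```python
import io
from zipfile import ZipFile

IMAGE_EXT = (".png", ".jpeg", ".jpg")


def is_image(filename):
    # endswith accepts a tuple of suffixes; lower() also matches e.g. ".PNG"
    return filename.lower().endswith(IMAGE_EXT)


def extract_images_in_memory(docx_file):
    """Return a dict mapping archive member names to BytesIO image buffers.

    Hypothetical helper: docx_file may be a path or any binary
    file-like object, such as an upload held in memory.
    """
    images = {}
    with ZipFile(docx_file) as working_zip:
        for name in working_zip.namelist():
            if is_image(name):
                # read() returns the member's bytes without touching the disk
                images[name] = io.BytesIO(working_zip.read(name))
    return images
```

Since python-docx's add_picture accepts a file-like object as well as a path, and Document.save can write to a stream, the BytesIO values could presumably be passed straight to add_picture and the finished document saved into another BytesIO for download.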

Thanks for the help!!

