Extracting images from a docx

Hello everyone,

I am working on an app that automatically processes Word files: it takes them in and repurposes them by extracting text snippets and some images, and more generally by copying data from one file into a different template.

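Locally, I save these images in a temp folder by extracting them from the .docx archive with zipfile:

from zipfile import ZipFile

IMAGE_EXT = ('.png', '.jpeg', '.jpg')

def is_image(filename):
    # str.endswith accepts a tuple; lower() also catches names like IMAGE1.PNG
    return filename.lower().endswith(IMAGE_EXT)

# A .docx file is a zip archive, so its members can be listed and extracted
with ZipFile("incoming_file.docx") as working_zip:
    image_list = [name for name in working_zip.namelist() if is_image(name)]
    # Raw string so the backslash is not read as an escape character
    working_zip.extractall(path=r".\Images", members=image_list)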

Some of these images are then placed inside the template I want to use:

import docx

# new_product_images: list of image paths, defined elsewhere
t = docx.Document(r".\Template.docx")
# One row per image in a single-column table
pics_table = t.add_table(rows=len(new_product_images), cols=1)
pics_table.style = "Table Grid"
for r, image_path in enumerate(new_product_images):
    cell = pics_table.cell(r, 0)
    cell.add_paragraph().add_run().add_picture(image_path)

Locally, this works fine.
How would you go about doing this in a web app, where temporary storage on the server is likely to be hard to come by?
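
One direction that might avoid disk storage altogether is to keep everything in memory. The following is only a minimal sketch, not a tested solution: `read_images` and `build_document` are hypothetical helper names, and it relies on python-docx accepting file-like objects in `Document()`, `add_picture()` and `save()`. It also assumes the template defines the "Table Grid" style, as in the code above.

```python
import io
from zipfile import ZipFile

IMAGE_EXT = (".png", ".jpeg", ".jpg")


def is_image(filename):
    # str.endswith accepts a tuple of suffixes
    return filename.lower().endswith(IMAGE_EXT)


def read_images(docx_file):
    """Hypothetical helper: map archive name -> BytesIO for each image.

    docx_file may be a path or any seekable binary file-like object,
    for example an uploaded file. Nothing is written to disk.
    """
    with ZipFile(docx_file) as working_zip:
        return {
            name: io.BytesIO(working_zip.read(name))
            for name in working_zip.namelist()
            if is_image(name)
        }


def build_document(template_file, image_streams):
    """Hypothetical helper: place the images in a table, return .docx bytes."""
    import docx  # python-docx (third-party), imported only when needed

    t = docx.Document(template_file)
    pics_table = t.add_table(rows=len(image_streams), cols=1)
    pics_table.style = "Table Grid"
    for r, stream in enumerate(image_streams):
        # add_picture takes a stream in place of a file path
        pics_table.cell(r, 0).add_paragraph().add_run().add_picture(stream)

    out = io.BytesIO()
    t.save(out)  # save into memory instead of a file on disk
    return out.getvalue()
```

In a web app, the uploaded file object could be passed straight to `read_images`, and the bytes returned by `build_document(template, list(images.values()))` sent back as the response body. How the upload and response are handled depends on the framework in use.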

Thanks for the help!!
