Merge Word documents automatically with Python

Merging Docx files is complicated

Docx is a complicated document format made of multiple XML files zipped together. Concatenating Word documents that contain only text is straightforward and can be done directly in Word. However, if your Word documents contain more complex objects such as pictures or tables, you also need to copy their styles and content into the internal structure of the merged Docx file.
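To make the "zipped XML files" point concrete, here is a minimal sketch that lists what is stored inside a Docx archive using only Python's standard zipfile module. The function names list_docx_parts and media_parts are illustrative, not part of any library. In a typical Docx file the main body lives in word/document.xml, styles in word/styles.xml, and embedded pictures under word/media/.

```python
import zipfile


def list_docx_parts(docx_file):
    """Return the names of all parts stored inside a .docx archive.

    `docx_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        return archive.namelist()


def media_parts(docx_file):
    """Return only the embedded media parts (e.g. pictures) under word/media/."""
    return [
        name
        for name in list_docx_parts(docx_file)
        if name.startswith("word/media/")
    ]
```

Running list_docx_parts on two documents you plan to merge shows why plain concatenation is not enough: each archive carries its own styles and media parts, and all of them have to end up in the merged file.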

A very repetitive and common problem

Combining Word documents is a menial and repetitive task. Typically, you open two Word documents side by side and copy and paste the content from one into the other. Docx files are used by millions of people every day, so there must be a simple way to automate all this!

The solution

RPA (Robotic Process Automation) can help solve this problem. Indeed, it is now possible to merge multiple Word documents quickly and easily thanks to the following awesome Python libraries:

Then simply replace the files and composed variables with your own values in the following Python code:
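A minimal sketch of such a script follows. It assumes the two libraries are python-docx and docxcompose, which is an inference from the composer.save() call used in this post rather than something stated here. The function name merge_docx is hypothetical; files is the list of input paths and composed is the output path.

```python
def merge_docx(files, composed):
    """Merge the Word documents listed in `files`, in order, into `composed`."""
    if not files:
        raise ValueError("files must contain at least one .docx path")

    # Assumed dependencies: pip install python-docx docxcompose
    # Imported here so they are only required when a merge actually runs.
    from docx import Document
    from docxcompose.composer import Composer

    last = len(files) - 1
    composer = None
    for index, path in enumerate(files):
        document = Document(path)
        # Separate the documents with a page break, except after the last one.
        if index != last:
            document.add_page_break()
        if composer is None:
            # The first document becomes the base of the merged file.
            composer = Composer(document)
        else:
            composer.append(document)

    # Write the merged document to disk.
    composer.save(composed)
```

For example, merge_docx(["file1.docx", "file2.docx"], "composed.docx") would write the merged result to composed.docx; these file names are placeholders.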

That’s it! This code iterates over all the Word files we want to merge, adds a page break at the end of each file (except the last one), and appends each file to the composer. We now have a merged Docx file in memory that we can save to disk using composer.save().

If you need help, please leave a reply below; we answer within 24 hours.
