Python makes working with regular expressions lightweight and convenient. To extract the text between two strings tag1 and tag2 from the string content, we can use Python's regex library. We only need two lines of code:
pattern = "(?<=tag1).*(?=tag2)"
extracted = re.search(pattern, content).group(0)
re is the Python regex library; you will need to import […]
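As a rough illustration of the two-line approach above, here is a minimal sketch wrapped in a helper. The function name `extract_between` and the sample string are hypothetical and not from the original post. The sketch also departs from the excerpt in three labeled ways: it escapes the tags with `re.escape` so characters like `[` are treated literally, it uses the lazy `.*?` so the match stops at the first closing tag, and it passes `re.DOTALL` so the text may span several lines. It returns `None` when nothing matches, because calling `.group(0)` directly on a failed `re.search` raises an `AttributeError`.

```python
import re

def extract_between(content, tag1, tag2):
    """Return the text between tag1 and tag2, or None if there is no match.

    Hypothetical helper: the name and the escaping/lazy/DOTALL choices
    are assumptions, not part of the original two-line snippet.
    """
    # Lookbehind/lookahead keep the tags themselves out of the match.
    pattern = f"(?<={re.escape(tag1)}).*?(?={re.escape(tag2)})"
    match = re.search(pattern, content, flags=re.DOTALL)
    return match.group(0) if match else None

sample = "header <start>hello world<end> footer"
result = extract_between(sample, "<start>", "<end>")
print(result)
```

With the greedy `.*` from the excerpt, a string containing tag2 more than once would match up to the last occurrence; which behavior is wanted depends on the input.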
Category: Python
Merging Docx files is complicated
Docx is a complicated document format made of multiple XML files zipped together. Concatenating Word documents that contain only text is straightforward and can be done directly in Word. However, if your Word documents contain more complicated objects such as pictures or tables, then you need to add their style/content to the internal structure of […]
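To make the "XML files zipped together" point concrete, here is a small sketch that lists the parts inside a .docx-style archive using only the standard library. It is not the merging method from the post, which is cut off above. The helper name `list_docx_parts` is hypothetical, and the archive is a tiny stand-in built in memory with placeholder contents; a real Word file contains more parts and real XML.

```python
import io
import zipfile

def list_docx_parts(source):
    """Return the names of the files stored inside a .docx archive.

    Hypothetical helper: `source` may be a path or a file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()

# Build a stand-in archive in memory (placeholder contents, assumption only).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("word/document.xml", "<document/>")
    archive.writestr("word/media/image1.png", b"")

buffer.seek(0)
parts = list_docx_parts(buffer)
print(parts)
```

Passing the path of a real document, for example `list_docx_parts("report.docx")` with a hypothetical file name, should show where text, styles, and media live, which is why pictures and tables need more care than plain text when documents are combined.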