As a graduate student, I love writing in LaTeX. It is easy to get started with, handles the formatting requirements of different publishers, has great bibliography support, and so on. All these advantages make writing academic articles in LaTeX much more enjoyable than writing in MS Word. One drawback is that not all collaborators know how to use LaTeX, and my advisors always prefer that I write in Word, since its "Comments" and "Track Changes" features are very useful for collaborative writing. As a result, there are times when I need to convert between LaTeX and MS Word. I used to copy and paste manually from LaTeX into Word, a process that is painful and time-consuming.
Some software can do the job, but it is either paid or produces unsatisfactory results. Then I came across Pandoc, a wonderful program that converts between all kinds of markup formats (including Markdown, LaTeX, and docx documents). What's more, Pandoc is free, open-source software.
Step 1: Installation
Installing Pandoc is relatively easy, and detailed procedures for different operating systems are provided on the Pandoc website.
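To confirm that the installation worked, one option is to ask Pandoc for its version. The sketch below does this from Python. The helper names are my own, and the parsing assumes that the first line of `pandoc --version` looks like `pandoc 2.19.2` (or `pandoc.exe 2.7.3` on some Windows builds), so treat it as illustrative rather than definitive.

```python
import re
import shutil
import subprocess


def parse_pandoc_version(version_output):
    """Extract the version as a tuple of ints from `pandoc --version` output.

    Assumes the first line looks like 'pandoc 2.19.2' or 'pandoc.exe 2.7.3'.
    Returns None if the line does not match that shape.
    """
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_pandoc_version():
    """Return the installed Pandoc version, or None if pandoc is not on PATH."""
    executable = shutil.which("pandoc")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    return parse_pandoc_version(result.stdout)


if __name__ == "__main__":
    # Prints a version tuple, or None when Pandoc cannot be found.
    print(installed_pandoc_version())
```

Knowing the version is handy later on, because a few option names differ between older and newer Pandoc releases.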
Step 2: Convert from Latex to MS Word
I'm going to assume that you have a LaTeX file ready to convert to Word. Open a CMD window and navigate to the directory containing the LaTeX file you want to convert. Then convert the file with the following command:
pandoc mydoc.tex -o mydoc.docx
All this command does is tell Pandoc to take mydoc.tex and convert it to mydoc.docx. The -o option tells Pandoc the name of the output file. Note that we could name the output docx file anything we want; it doesn't need to have the same name as our input LaTeX document.
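If you would rather drive the conversion from a script, the same command can be assembled as an argument list. This is a minimal sketch: `basic_command` is a hypothetical helper name of mine, and it only builds the command without running it.

```python
from pathlib import Path


def basic_command(tex_file, docx_file=None):
    """Build the argument list for `pandoc <input> -o <output>`."""
    source = Path(tex_file)
    # Default to the input name with a .docx extension; any other name works too.
    target = Path(docx_file) if docx_file else source.with_suffix(".docx")
    return ["pandoc", str(source), "-o", str(target)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to perform the conversion, for example `subprocess.run(basic_command("mydoc.tex"), check=True)`.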
Pandoc handles LaTeX equations nicely: all the equations are converted into native Word equations (the built-in equation editor), so MathType is not required.
Currently, there is no good way to right-align equation numbers in MS Word using the equation editor. The common workaround is to create a three-column table and put the equation in the center column and the equation number in the right column. If we have a lot of equations, as is typical in academic publications, editing the equation numbering this way is very time-consuming.
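Citations in LaTeX may not show up properly in the converted Word file. We can fix this with Pandoc's citation processor. In older versions this is a separate filter, pandoc-citeproc, which is typically installed along with Pandoc; since Pandoc 2.11, citation processing is built in and is enabled with the --citeproc option (on older versions, omit --citeproc). We just need to tell Pandoc the location of the reference file, for example the .bib file. If the file is in the same folder as the LaTeX document, we can use the following command:
pandoc mydoc.tex --bibliography=myref.bib --citeproc -o mydoc.docx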
Another nice feature of Pandoc is that we can specify the style of the resulting docx file with a reference docx file. For example, if we want to submit a paper to IEEE, we can download the Word template from IEEE and use it as a reference file; the resulting docx file from Pandoc will then have the same styles as the IEEE template. The reference file must be in docx format, so a template distributed as a .doc file needs to be saved as .docx in Word first. This can be achieved with the following command (the option is --reference-doc since Pandoc 2.0; earlier versions call it --reference-docx):
pandoc mydoc.tex --bibliography=myref.bib --citeproc --reference-doc=IEEE_template.docx -o mydoc.docx
To handle the numbering of figures, equations, and tables, and the cross-references to them, there is a filter available called pandoc-crossref. I am using Windows, so I downloaded the pre-built executable from the releases page of its GitHub repository and put it in the Pandoc installation directory (by default, typically on the C drive).
Then we can specify the pandoc-crossref filter in the command,
pandoc mydoc.tex --filter pandoc-crossref --bibliography=myref.bib --citeproc --reference-doc=IEEE_template.docx -o mydoc.docx
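Since the command grows with each feature, it can help to assemble it in a small script. The sketch below is one way to do that, not a definitive recipe: `full_command` and its parameters are hypothetical names of mine. It relies on three details of Pandoc's command line as I understand them: filters run in the order they are given (pandoc-crossref is placed before citation processing), Pandoc 2.11 and later use the built-in `--citeproc` in place of the pandoc-citeproc filter, and Pandoc 2.0 and later use `--reference-doc` in place of `--reference-docx`.

```python
def full_command(tex_file, docx_file, bibliography=None, reference_doc=None,
                 use_crossref=False, pandoc_version=(2, 11)):
    """Assemble a pandoc argument list with the optional features from this post.

    pandoc_version is a tuple such as (2, 19, 2); it only selects between
    old and new option names and is an assumption of this sketch.
    """
    cmd = ["pandoc", tex_file]
    if use_crossref:
        # Placed first so cross-references are resolved before citations.
        cmd += ["--filter", "pandoc-crossref"]
    if bibliography:
        cmd.append("--bibliography=" + bibliography)
        if pandoc_version >= (2, 11):
            cmd.append("--citeproc")  # built-in citation processing
        else:
            cmd += ["--filter", "pandoc-citeproc"]  # separate filter in older releases
    if reference_doc:
        flag = "--reference-doc" if pandoc_version >= (2, 0) else "--reference-docx"
        cmd.append(flag + "=" + reference_doc)
    cmd += ["-o", docx_file]
    return cmd
```

For example, `full_command("mydoc.tex", "mydoc.docx", bibliography="myref.bib", reference_doc="IEEE_template.docx", use_crossref=True)` returns the argument list for a conversion with cross-references, citations, and the template styles, ready to hand to `subprocess.run`.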
I hope this was useful. I think that using Pandoc to convert LaTeX to Word is good enough for collaborating with coauthors who use MS Word. When we want to submit to journals that only accept MS Word files, Pandoc can also save us a lot of time: we only need to make small changes to the resulting docx file instead of retyping the entire document in MS Word manually.