Converters from PC Textprocessors to LaTeX - Overview

Switch conversion direction: From LaTeX to PC

last update: Nov 12, 1998

The url of this page is http://www.kfa-juelich.de/isr/1/texconv/pctotex.html

Although this page resides on the official WWW server of Forschungszentrum Jülich GmbH, it is NOT officially supported by Forschungszentrum Jülich but results from my personal work.

I maintain these pages because I need converters between LaTeX and PC Textprocessors for my work and I want to share the information with others who need it. The pages have grown significantly and, because I maintain them in my spare time, I can no longer keep a text version and a German version in parallel, as I did for the previous version.

This list is only as good as the support it receives, and I need YOUR support to update and supplement it. Please let me know if you know of more or better converters. There are more converters on the CTAN sites, but the following seem to be the most promising for conversion to and from the current versions of wordprocessors.

Neither correctness nor completeness is guaranteed.
All opinions mentioned (if any) are my own, not my employer's. Please send corrections, enhancements and supplements to the following address:
W.Hennings@fz-juelich.de

Note that this FAQ list contains information about converters ONLY between LaTeX and PC word processors. Converters to and from other formats may have their own FAQ lists, e.g. see the link for converters to and from HTML.


General Remarks

Principal problems of wordprocessor to LaTeX conversion

One advantage of LaTeX is that it forces the author to structure a document, whereas wordprocessors like Word or WordPerfect allow unstructured documents. It is hardly possible to add structure automatically to a document that had none before.

Nevertheless, it is possible to write a structured document with a wordprocessor by using styles consistently. Wordprocessor documents that use styles can therefore be converted to LaTeX, e.g. by a macro written for the specific wordprocessor.
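
To make the idea of a style-based conversion concrete, here is a minimal sketch in Python. It is not taken from any of the converters listed on this page. The style names ("Heading 1" etc.), the mapping table and the function name are assumptions chosen for illustration, and the sketch does not escape LaTeX special characters such as % or &.

```python
# Hypothetical mapping from wordprocessor style names to LaTeX sectioning
# commands. A real macro would use the style names of the actual document.
STYLE_TO_COMMAND = {
    "Heading 1": "section",
    "Heading 2": "subsection",
    "Heading 3": "subsubsection",
}


def paragraphs_to_latex(paragraphs):
    """Convert (style_name, text) pairs into LaTeX source.

    Paragraphs whose style is not in the mapping are emitted as plain text.
    """
    lines = []
    for style, text in paragraphs:
        command = STYLE_TO_COMMAND.get(style)
        if command is None:
            lines.append(text)
        else:
            lines.append(f"\\{command}{{{text}}}")
        lines.append("")  # a blank line separates paragraphs in LaTeX
    return "\n".join(lines)


if __name__ == "__main__":
    document = [
        ("Heading 1", "Introduction"),
        ("Normal", "This paragraph uses the default style."),
        ("Heading 2", "Background"),
    ]
    print(paragraphs_to_latex(document))
```

The sketch only works because every paragraph carries a style name. A document formatted by hand (bold, larger font) offers nothing comparable to map from.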

There are several ways to convert

To illustrate these, let me restrict the discussion to Microsoft Word:

  1. Word binary format -> LaTeX
  2. RTF (Word ASCII format, use Word's own RTF export) -> LaTeX
  3. WordPerfect 5.1 format (use Word's own export) -> LaTeX
  4. HTML (use Word's internet assistant) -> LaTeX
  5. maybe other external format(s)

In the previous version I dared to recommend using HTML as an intermediate format. However, several readers wrote that they had much better experiences with <insert your favorite converter here>.

Moreover, the <favorite converter> of someone else didn't work at all for me, and vice versa.

So I am sorry that I no longer have ANY recommendation. No converter satisfies everyone's needs or works under all conditions. You will have to try them out for yourself.


Using a Word macro

Free:

winw2ltx: A set of macros for WinWord 2, now also available for WinWord 6 and 7 (95)

Commercial:

MathType: PC equation editor with export to LaTeX. MathType home page (USA)


Converting from Word binary format

Free:

LAOLA: reads Word 6 and Word 7 documents under Unix and extracts the text. LAOLA homepage (DE site)

word2x: Converts MS Word for Windows 6.0 documents (binary!) to LaTeX or plain text. word2x homepage (UK site)

MSWordView: understands the Microsoft Word 8 binary file format (Office 97). It currently converts Word documents to HTML, which can then be read with a browser. MSWordView homepage (Ireland site)


Converting from RTF

To use an RTF converter, the wordprocessor document must first be "saved as" Rich Text Format. However, each new version of MS Word came with a new level of the RTF language. Most of the available converters cannot understand the current RTF version.
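
Before feeding a file to an RTF converter, it can help to check that the "save as" step really produced RTF and not a Word binary file. The following Python sketch is an illustration only. It relies on two facts: RTF files start with the characters {\rtf, and the binary documents of Word 6 and later are OLE2 compound files with a fixed 8-byte signature. The function name and the returned labels are assumptions. The sketch does not determine which RTF level a file uses.

```python
import sys

# Signature at the start of every OLE2 compound file, the container used by
# the binary documents of Word 6 and later.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_format(data: bytes) -> str:
    """Guess the file type from its first bytes.

    Returns "rtf", "ole2" or "unknown" (labels chosen for this sketch).
    """
    if data.startswith(b"{\\rtf"):
        return "rtf"
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


if __name__ == "__main__":
    # Usage: python sniff.py document.rtf
    with open(sys.argv[1], "rb") as handle:
        print(sniff_format(handle.read(16)))
```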

Free:

rtflatex: understands only older RTF levels

rtf2latex: understands only older RTF levels. RTF utilities homepage (USA site)

w2latex: understands only older RTF levels

Commercial:

Scientific Word: Win95-based TeX/LaTeX system with a graphical editor and RTF import, including equations from Microsoft's Equation Editor. Understands RTF levels up to WinWord 7 (95). Scientific Word home page (USA)


Converting from WordPerfect format

General Remarks

Apart from Scientific Word/Workplace, which come with an equation-capable RTF-to-LaTeX converter, the converters listed here are the only available ones that can handle equations. The problem for me is that Microsoft WinWord 7 (95) (I don't have other versions available) does a bad job of converting equations to WordPerfect. In fact, only very simple constructs are (partially) converted, and more complex equations are not converted at all.

Free:

WP2LaTeX: converts WordPerfect 4.x / 5.x / 6.x, including equations, to LaTeX. homepage

TeXPerfect: WordPerfect 5.1 for DOS -> LaTeX Translator

Commercial:

Publishing Companion: Word/WordPerfect -> LaTeX converter, equation editor. KTALK's home page (USA)


HTML as intermediate format

There are free HTML converters for Word 6 and 7 for Windows available from Microsoft:
Download... IA for Word 6 / IA for Word 7 / IA for Word for Mac
Word 97 includes an HTML converter by default but, in contrast to the previous versions, it only recognizes heading styles if they are first converted into the corresponding HTML styles.

WordPerfect 7 and up have an integrated InternetPublisher.
For WordPerfect 6.1 for Windows, the InternetPublisher is available separately:
Download... InternetPublisher for WPWin 6.1

There is also a tool for Unix that is intended to convert Word 6, Word 7 (95) and Word 8 (97) binary files to HTML. See http://www.su.shuttle.de/turbo/word2html.c.gz

General Remarks

Because HTML is a structured format, the conversion between HTML and LaTeX is rather straightforward. However, the limitations of HTML compared to LaTeX remain, i.e. there are many elements in LaTeX which cannot (yet?) be represented in HTML.
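
The following Python sketch shows why the structural mapping is straightforward: each HTML element is simply replaced by a LaTeX counterpart. It is not the method of html2latex or htmltools. The tag table is a deliberately tiny assumption, and the sketch neither escapes LaTeX special characters nor handles attributes, tables or images.

```python
from html.parser import HTMLParser

# Hypothetical, minimal tag mapping. Tags that are not listed are dropped,
# but the text inside them is kept.
OPEN = {"h1": "\\section{", "h2": "\\subsection{", "em": "\\emph{", "b": "\\textbf{"}
CLOSE = {"h1": "}\n\n", "h2": "}\n\n", "em": "}", "b": "}", "p": "\n\n"}


class HtmlToLatex(HTMLParser):
    """Collect LaTeX fragments while the parser walks through the HTML."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(OPEN.get(tag, ""))

    def handle_endtag(self, tag):
        self.out.append(CLOSE.get(tag, ""))

    def handle_data(self, data):
        self.out.append(data)


def html_to_latex(html):
    parser = HtmlToLatex()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)


if __name__ == "__main__":
    print(html_to_latex("<h1>Title</h1><p>Some <em>important</em> text.</p>"))
```

The reverse limitation mentioned above shows up here as well: the table has no entries for equations or cross-references, because HTML offers no elements to map them from.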

See www.w3.org for a list of converters between word processors and HTML, or see the list of converters between HTML and LaTeX (in German).

Some converters are available from CTAN ("Comprehensive TeX Archive Network"), e.g. in .../support/latex2html and .../support/html2latex.
(The ... stands for a host specific base directory, which often is either "/pub/tex" or "/tex-archive")

HTML to LaTeX

html2latex (local): Description of HTML-to-LaTeX converter
html2latex (USA site)

htmltools (NL site): Another HTML-to-LaTeX converter. The source is no longer downloadable from there, maybe because the author is no longer employed there. To preserve public access, I have put a copy of everything here (local).


Other intermediate formats

I received an e-mail about ongoing work to use SGML as an intermediate format for several conversions, but it does not seem to be ready to use at present.


Converting from FrameMaker

FrameMaker Utilities (UK site): Contains converters for both directions (LaTeX <-> FrameMaker) as well as templates which make conversion from FrameMaker to LaTeX easier


Converting from NotaBene

NB4LATEX: converts files from NotaBene4 (including ancient Greek and all the symbols of logic) to LaTeX2e format. homepage


Converting from Excel

Excel macro to convert Excel to LaTeX: http://www.jam-software.com/software.html

The generated LaTeX code uses the tabular environment: http://www.hsh.no/~ag/tabular/
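
For readers who do not know the tabular environment, the following Python sketch shows the kind of LaTeX code such a conversion produces. It is not the Excel macro mentioned above and makes no claim about that macro's actual output. The function name, the left-aligned column specification and the sample data are assumptions for illustration, and cell contents are not escaped.

```python
def rows_to_tabular(rows):
    """Render a list of equal-length rows as a LaTeX tabular environment."""
    if not rows:
        raise ValueError("need at least one row")
    ncols = len(rows[0])
    # One left-aligned column ("l") per spreadsheet column.
    lines = ["\\begin{tabular}{" + "l" * ncols + "}"]
    for row in rows:
        if len(row) != ncols:
            raise ValueError("all rows must have the same number of cells")
        # Cells are separated by "&", each row is terminated by "\\".
        lines.append(" & ".join(str(cell) for cell in row) + " \\\\")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


if __name__ == "__main__":
    sheet = [["Item", "Quantity"], ["Pens", 12], ["Notebooks", 3]]
    print(rows_to_tabular(sheet))
```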


Related WWW pages:

General:

DANTE's list of LaTeX-PC converters (in German)

The German CTAN server

The British CTAN server

The USA CTAN Server

German CTAN server, free converters

British CTAN server, free converters

USA CTAN Server, free converters