Bounding Unicorns

How To Create Empty Office Documents on Linux (LibreOffice Writer/Calc/AbiWord/Gnumeric)

Anyone using an office suite, such as LibreOffice, frequently creates new documents. My normal workflow involves a terminal, and it's much more convenient to specify the document path in the terminal than to navigate through the GUI save dialog. Unfortunately, none of the popular Linux office programs (LibreOffice/OpenOffice, AbiWord, Gnumeric) has a mechanism for specifying the path of a file to edit when that file doesn't exist yet.

For completeness, if you try to launch these programs with a non-existent path as an argument (e.g. soffice new.odt), you get the following behavior:

Program | Invocation | Behavior
LibreOffice Writer | soffice path/to/new.odt | Error message: File does not exist
AbiWord | abiword path/to/new.odt | Creates new unnamed document, save dialog offers no pre-completed file name and starts out in the current directory instead of in path/to
LibreOffice Calc | soffice path/to/new.ods | Error message: File does not exist
Gnumeric | gnumeric path/to/new.ods | Error message: File does not exist

Users have been asking how to create new named documents, and there is even a formal feature request for LibreOffice. It hasn't seen any progress in 12 years other than the "empty file" workaround, which, as we'll see shortly, doesn't even work for LibreOffice Calc.

So, the claimed way to get a new named document is to simply create an empty file with the desired name and pass its path to the program, as follows:

touch path/to/new.odt
soffice path/to/new.odt

Let's see what actually happens when this is attempted with the various programs. For this test, we'll create a new empty file with touch as just described, launch the office program, type something, save the changes, exit, then look at the contents of our file.

Program | File Extension | Outcome
LibreOffice Writer | .odt | Valid OpenDocument text file
AbiWord | .odt | Plain text file, silent loss of all formatting
LibreOffice Calc | .ods | Depending on LibreOffice build and runtime configuration, either a valid OpenDocument spreadsheet or a "Flat XML" file
Gnumeric | .ods | Error message: Unsupported file format

This solution works in 1.5 out of 4 cases. It works reliably only for Writer.

AbiWord has the worst behavior of the bunch: it saves the file with no warnings but uses the plain text format, so any formatting (and, presumably, embedded objects and such) entered by the user is silently lost.

For Calc, I learned that there is a beast called the OpenDocument Flat XML format. If Calc has the plugin to understand it, this format is used for saving the spreadsheet when Calc is opened with an empty file. This is not a complete disaster, because Calc should be able to read the file back. It won't be able to without the Flat XML plugin, though, which is apparently optional, and other software such as Gnumeric can't read the Flat XML format. Another issue is that the file contents don't match the extension (the proper extension for a Flat XML spreadsheet is .fods, not .ods).

Gnumeric refuses to launch when given an empty file as an argument, producing the "Unsupported file format" error (which is technically true: a zero-byte file is not a valid Zip archive), so this solution doesn't work for it at all.
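To see which of these outcomes you got, it's enough to look at the first bytes of the saved file rather than trusting the extension. Here is a minimal sketch; odf_kind is a name I made up for illustration, and the check is deliberately crude: Zip archives begin with "PK", while XML files (Flat XML included) normally begin with "<?", although a byte order mark or leading whitespace would defeat this test.

```shell
# odf_kind FILE: hypothetical helper that reports which container a file
# actually uses, regardless of its extension.
odf_kind() {
    # A zero-byte file is what `touch` leaves behind.
    if [ ! -s "$1" ]; then
        echo "empty"
        return
    fi
    # Read the first two bytes of the file.
    magic=$(head -c 2 "$1")
    case "$magic" in
        PK)   echo "zip" ;;      # Zip archive, e.g. a regular .ods/.odt
        '<?') echo "xml" ;;      # XML text, e.g. a Flat XML document
        *)    echo "unknown" ;;  # anything else, e.g. plain text
    esac
}
```

Running odf_kind path/to/new.ods after the save tells you whether Calc wrote a regular spreadsheet ("zip") or a Flat XML one ("xml").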

An alternative solution is to have an empty file of the desired type and copy it whenever a new document is needed. While technically a workable solution, it is not without its problems:

  1. Office documents are complex and, apart from the developers of office software, likely nobody knows what is actually stored in them. The original, empty document could be storing unknown identifiers and metadata (e.g. authorship) which would end up being reused for all of the copies.
  2. Intentional or unintentional modification of the "prototype" document would copy the changes into all subsequent new documents, again without anyone knowing about it.
  3. The "prototype" documents still need to be created by something.
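For reference, the copy approach itself is only a few lines of shell. This is a sketch, not a recommendation: new_from_prototype is a hypothetical helper name, the prototype path is whatever you choose, and none of the three problems above goes away. The only safety added here is refusing to overwrite an existing file.

```shell
# new_from_prototype PROTOTYPE TARGET
# Hypothetical helper: copy an empty "prototype" document to a new path,
# refusing to overwrite a file that already exists.
new_from_prototype() {
    proto=$1
    target=$2
    if [ ! -f "$proto" ]; then
        echo "prototype not found: $proto" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "refusing to overwrite: $target" >&2
        return 1
    fi
    # Create the destination directory if needed, then copy.
    mkdir -p "$(dirname "$target")" && cp "$proto" "$target"
}
```

It would be called as new_from_prototype some/prototype.ods path/to/new.ods, followed by soffice path/to/new.ods.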

Some users have resorted to writing programs that import, e.g., Python libraries that can read and/or write OpenDocument files, but fortunately this level of masochism is not necessary. It turns out that the requirements for a valid OpenDocument file are easy to meet with a simple shell script.

All OpenDocument files are Zip archives. The archives must contain a minimum of three files for LibreOffice to accept them:

  • mimetype
  • content.xml
  • META-INF/manifest.xml

All of these files can have fixed contents, as described below.

For a spreadsheet, mimetype must contain:

application/vnd.oasis.opendocument.spreadsheet
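A sketch of writing this file from the shell follows. The helper name write_mimetype is made up for illustration. Using printf instead of echo is my own precaution, on the assumption that the file should contain the media type string and nothing else, not even a trailing newline; I haven't tested whether the office programs care.

```shell
# write_mimetype DIR: hypothetical helper that writes DIR/mimetype
# containing exactly the media type string, with no trailing newline.
write_mimetype() {
    printf '%s' 'application/vnd.oasis.opendocument.spreadsheet' > "$1/mimetype"
}
```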

content.xml must contain:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
>
  <office:body>
    <office:spreadsheet>
      <table:table table:name="Sheet1" table:print="true">
      </table:table>
    </office:spreadsheet>
  </office:body>
</office:document-content>

Gnumeric requires there to be a <table:table/> tag. Without it, when the file is opened, the first sheet is called "Sheet2" instead of "Sheet1" and when the document is saved, Gnumeric prompts to choose a file name rather than saving to the specified path. Importantly, although there is no error message when the sans-table document is loaded, Gnumeric doesn't consider the current document to have come from the specified path on disk.

For LibreOffice, a "table" (sheet) is not necessary, and the following is sufficient:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
>
</office:document-content>

META-INF/manifest.xml must contain:

<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
 <manifest:file-entry manifest:full-path="/" manifest:media-type="application/vnd.oasis.opendocument.spreadsheet"/>
 <manifest:file-entry manifest:full-path="content.xml" manifest:media-type="text/xml"/>
</manifest:manifest>

This produces a document that LibreOffice Calc can open and save back to the same path. Gnumeric starts without errors but upon save prompts for a file name, i.e. it needs more fields filled out.
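Putting the three files together, here is a minimal sketch of such a shell script. The function name new_ods is made up for illustration, and the script assumes the Info-ZIP zip command is installed. Adding mimetype first, uncompressed and without extra fields, follows the usual OpenDocument packaging convention; it is not something the tests above established as required. The content.xml used is the variant with a sheet, since that is the one Gnumeric handles better.

```shell
# new_ods TARGET: hypothetical helper that builds a minimal .ods at TARGET
# from the three files described above. Assumes Info-ZIP `zip` is installed.
new_ods() {
    target=$1
    if [ -e "$target" ]; then
        echo "refusing to overwrite: $target" >&2
        return 1
    fi
    # Make the path absolute, because zip runs from a scratch directory.
    case "$target" in
        /*) ;;
        *) target="$PWD/$target" ;;
    esac
    mkdir -p "$(dirname "$target")" || return 1
    work=$(mktemp -d) || return 1
    mkdir "$work/META-INF"

    # No trailing newline in mimetype (a precaution, see above).
    printf '%s' 'application/vnd.oasis.opendocument.spreadsheet' > "$work/mimetype"

    cat > "$work/content.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
>
  <office:body>
    <office:spreadsheet>
      <table:table table:name="Sheet1" table:print="true">
      </table:table>
    </office:spreadsheet>
  </office:body>
</office:document-content>
EOF

    cat > "$work/META-INF/manifest.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
 <manifest:file-entry manifest:full-path="/" manifest:media-type="application/vnd.oasis.opendocument.spreadsheet"/>
 <manifest:file-entry manifest:full-path="content.xml" manifest:media-type="text/xml"/>
</manifest:manifest>
EOF

    # mimetype goes in first, stored (-0) and without extra fields (-X);
    # the other two files are then added with normal compression.
    (
        cd "$work" &&
        zip -q -0 -X "$target" mimetype &&
        zip -q -X "$target" content.xml META-INF/manifest.xml
    )
    status=$?
    rm -rf "$work"
    return $status
}
```

With the function defined, new_ods path/to/new.ods followed by soffice path/to/new.ods should give Calc a named document to save back to. For Gnumeric, the caveat above about the save prompt still applies.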