Writing an Export Filter for KWord using libexport

Ariya Hidayat
Nicolas Goutte

Last revision: $Date: 2002/03/05 14:25:47 $

Introduction

KWord's Export Filter Library (Libexport for short) is a library that helps developers write export filters for KWord. An export filter reads a KWord document and converts it to some other format. Since such filters are likely to share a fair amount of code, especially for parsing the KWord document, a common base class for that work is useful.

KWord libexport is written and maintained by Nicolas Goutte. It is part of KWord filters and can be found in koffice/filters/kword/libexport.

Basics of KWord libexport

An export filter consists of two parts: a "Leader" that parses the KWord document and a "Worker" that creates the output document. The Worker is "directed" by the Leader. Libexport already provides a Leader, namely KWEFKWordLeader, so you do not need to write your own, whereas the Worker is unique to your filter.

The worker is a sort of black box for the output file, the leader a sort of black box for the (KWord) input file. In particular, the worker does not care about the input file.

To implement your own Worker, derive it from KWEFBaseWorker and implement the virtual methods whose names start with "do". These methods are called by the Leader. Each method should return true if everything is OK (i.e. processing should continue) and false when an error occurs (in which case the conversion is aborted).

Here is a short explanation of some basic methods:

doOpenFile(const QString& filenameOut, const QString& to)
opens output file filenameOut
doCloseFile()
closes output file
doAbortFile()
closes file upon error
doOpenDocument()
opens the document (like <html> in HTML)
doCloseDocument()
closes the document (like </html> in HTML)
doFullParagraph(const QString& paraText, const LayoutData& layout, const ValueListFormatData& paraFormatDataList)
processes a paragraph: paraText is the plain text of the paragraph, layout holds the information from KWord's LAYOUT, and paraFormatDataList is a list of formatting sequences

Using the Worker/Leader is not difficult. First, create your Worker. Next, create an instance of KWEFKWordLeader, passing the Worker to its constructor. Now you can call KWEFKWordLeader::filter(filenameIn,filenameOut,from,to,param) to start the conversion. The Leader parses the KWord document specified by filenameIn and calls the virtual methods of its Worker, which do the actual work of creating the output file filenameOut. Effectively, this converts a KWord document into your filter's format.
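The division of labour described above can be illustrated with a minimal sketch that does not depend on libexport or Qt. Everything in it is an assumption made for illustration: SketchWorker, SketchLeader and RecordingWorker are hypothetical stand-ins, std::string replaces QString, the layout and format arguments are dropped, and the exact order in which the real KWEFKWordLeader calls the "do" methods (in particular when it calls doAbortFile) may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for KWEFBaseWorker (NOT the real class):
// same "do..." naming, simplified arguments.
class SketchWorker
{
public:
    virtual ~SketchWorker() {}
    virtual bool doOpenFile(const std::string& filenameOut) = 0;
    virtual bool doCloseFile() = 0;
    virtual bool doAbortFile() = 0;
    virtual bool doOpenDocument() = 0;
    virtual bool doCloseDocument() = 0;
    virtual bool doFullParagraph(const std::string& paraText) = 0;
};

// Hypothetical stand-in for the Leader. The "input document" is just a
// list of paragraphs that has already been parsed.
class SketchLeader
{
public:
    explicit SketchLeader(SketchWorker* worker) : m_worker(worker) {}

    // Directs the worker; stops at the first "do" method that returns false.
    bool filter(const std::vector<std::string>& paragraphs,
                const std::string& filenameOut)
    {
        if (!m_worker->doOpenFile(filenameOut))
            return false;
        if (!writeDocument(paragraphs)) {
            m_worker->doAbortFile();   // assumed clean-up path on error
            return false;
        }
        return m_worker->doCloseFile();
    }

private:
    bool writeDocument(const std::vector<std::string>& paragraphs)
    {
        if (!m_worker->doOpenDocument())
            return false;
        for (const std::string& text : paragraphs) {
            if (!m_worker->doFullParagraph(text))
                return false;
        }
        return m_worker->doCloseDocument();
    }

    SketchWorker* m_worker;
};

// A worker that only records which methods were called. It reports an
// error when it meets the paragraph text given in failOn.
class RecordingWorker : public SketchWorker
{
public:
    explicit RecordingWorker(const std::string& failOn = std::string())
        : m_failOn(failOn) {}

    bool doOpenFile(const std::string&) override { return record("doOpenFile"); }
    bool doCloseFile() override { return record("doCloseFile"); }
    bool doAbortFile() override { return record("doAbortFile"); }
    bool doOpenDocument() override { return record("doOpenDocument"); }
    bool doCloseDocument() override { return record("doCloseDocument"); }
    bool doFullParagraph(const std::string& paraText) override
    {
        record("doFullParagraph");
        return m_failOn.empty() || paraText != m_failOn;
    }

    std::vector<std::string> calls;

private:
    bool record(const char* name) { calls.push_back(name); return true; }
    std::string m_failOn;
};
```

Passing a RecordingWorker to a SketchLeader and calling filter() with a few paragraphs leaves the sequence of calls in the worker's calls member, which makes it easy to see where processing stops once a method returns false.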

Note that the order of the "do..." calls may not match the order that the output file format requires. The only workaround is to store the information and write it to the file later.
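As a sketch of that workaround, suppose the output format needs the number of paragraphs in a header line before the text itself (an invented requirement, used only for illustration). The worker below stores each paragraph as it arrives and writes everything once the document is closed. BufferingWorker and convertBuffered are hypothetical names, the class is deliberately not derived from the real KWEFBaseWorker, and std::string and std::ostream stand in for the Qt types.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical worker (NOT derived from the real KWEFBaseWorker).
// It buffers the paragraphs because the header can only be written
// once the total number of paragraphs is known.
class BufferingWorker
{
public:
    explicit BufferingWorker(std::ostream& out) : m_out(out) {}

    bool doOpenDocument()
    {
        m_paragraphs.clear();
        return true;
    }

    bool doFullParagraph(const std::string& paraText)
    {
        // Nothing is written yet; the text is only stored.
        m_paragraphs.push_back(paraText);
        return true;
    }

    bool doCloseDocument()
    {
        // All information is available now: header first, then the text.
        m_out << "paragraphs: " << m_paragraphs.size() << '\n';
        for (const std::string& text : m_paragraphs)
            m_out << text << '\n';
        return m_out.good();
    }

private:
    std::ostream& m_out;
    std::vector<std::string> m_paragraphs;
};

// Plays the role of the Leader for this sketch: feeds the paragraphs to
// the worker and returns the generated output (empty on error).
std::string convertBuffered(const std::vector<std::string>& paragraphs)
{
    std::ostringstream out;
    BufferingWorker worker(out);
    bool ok = worker.doOpenDocument();
    for (const std::string& text : paragraphs)
        ok = ok && worker.doFullParagraph(text);
    ok = ok && worker.doCloseDocument();
    return ok ? out.str() : std::string();
}
```

The cost of this approach is memory: the whole document is held by the worker until doCloseDocument() is called.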

Example

Code often speaks more clearly than prose, so here is a simple example of how to use KWord libexport. In this example, we want to print out a plain-text version of a KWord document.

The first step is to create the class TextWorker:

class TextWorker : public KWEFBaseWorker
{
public:
   TextWorker(void){}
   virtual ~TextWorker(void){}
   virtual bool doOpenFile (const QString& , const QString& );
   virtual bool doCloseFile();
   virtual bool doOpenDocument();
   virtual bool doCloseDocument();
   virtual bool doFullParagraph(const QString& paraText, const LayoutData& layout,
      const ValueListFormatData& paraFormatDataList);
};

And here is the implementation:

bool TextWorker::doOpenFile(const QString& , const QString& )
{
  return true;
}

bool TextWorker::doCloseFile()
{
  return true;
}

bool TextWorker::doOpenDocument()
{
  return true;
}

bool TextWorker::doCloseDocument()
{
  return true;
}

bool TextWorker::doFullParagraph(const QString& paraText,
    const LayoutData& /*layout*/, const ValueListFormatData& /*paraFormatDataList*/)
{
  // Print the plain text only; layout and formatting are ignored
  kdDebug() << paraText << endl;
  return true;
}

As you can see, we are not interested in anything other than doFullParagraph, so the other methods simply return true.

To invoke the Leader/Worker:

TextWorker* worker=new TextWorker();
KWEFKWordLeader* leader=new KWEFKWordLeader(worker);
leader->filter(filenameIn,filenameOut,from,to,param);

That's all!

At this point, it is worth examining the Workers found in the KWord export filters that already use libexport.

The ASCII filter is the easiest to understand. You might also want to take a look at the PalmDoc filter, which is very similar to the ASCII filter.

In case you still have problems, feel free to send questions to Nicolas Goutte.

History

The original goal of libexport was to allow changes in KWord's file format without having to change each export filter much. A second goal was introduced by Eva Brucherseifer, who had the idea of making filters as modular as possible (see her email). Parts of Eva's ideas have been put into libexport. The Leader and Worker in KWord libexport correspond to the Director and Builder in Eva's proposal, respectively.