This article is about generating several ebook file formats from HTML.
"Ebooks" are file formats that are viewable on ebook readers,
including PDAs and smart cell phones/mobile phones.
HTML files are fine for viewing on the web, and in fact many
ebook readers support HTML.
Plain text is the most universal format,
supported by nearly all ebook readers,
and is often more compact for small displays, but it lacks images.
Palm OS Doc PDB file format is supported on Palm Pilots and
other devices that run Palm OS.
zTXT file format is a highly compressed format (more so than Palm Doc), and requires its own reader.
Plucker file format, which optionally contains images, also runs on Palm Pilot devices, among others.
Plucker is my favorite format. It's open source, so it doesn't come pre-installed on PDAs and readers,
but it is easy to use and more versatile than proprietary software.
Plucker is free of DRM restrictions, so it is not widely used by commercial ebook publishers.
with thousands of eBooks, supports it though.
PDF files, supported by some ebook readers,
can also come with embedded images, although line wrapping is often a problem on small displays.
This article shows how to use four software programs:
txt2pdbdoc to generate plain text and Palm DOC PDB files,
Plucker to create Plucker PDB files,
Ebookconverter to create PDF files,
and Jmakeztxt to create zTXT files,
all from a source HTML or text file.
All of this software is open source.
Txt2pdbdoc is available from
converts from HTML and plain text formats to plain text and Palm OS Doc PDB format.
Paul J. Lucas is the current maintainer of txt2pdbdoc.
The software comes in source form, but is easy to compile on Solaris, Linux, and similar systems. Basically, you type this line in the source directory:
./configure && make && make check && make install
The files install under /usr/local/.
For your convenience, I compiled Txt2pdbdoc for Solaris SPARC and x86,
and Linux 2.6 (x86). The source and binaries are at
and may be extracted with this command:
gzcat txt2pdbdoc-1.4.4-bin.tar.gz | tar xvf -
For Plucker, it's best to obtain pre-compiled binaries if you can.
Solaris packages are available from
and many Linux distributions come with Plucker.
Otherwise, build from source files available at
Be sure to get the Plucker "distiller", not just the Plucker Viewer (which manages and views Plucker PDB files, but doesn't create them).
comes in Linux, Apple OS X, Windows versions
to view and upload Plucker files to your PDA or smartphone.
Jmakeztxt is available from http://jmakeztxt.sourceforge.net
It's a Java program by Karin Herm. To run in GUI mode, type
java -jar ./jmakeztxt-1.9.jar
To run in command line mode, type
java -jar ./jmakeztxt-1.9.jar net.sourceforge.jmakeztxt.MakeztxtCmd filename.txt
Either way creates a .pdb zTXT file.
is available from
as a zip file.
It's Java software, so it can run on any system with Java 1.5 or higher.
Extract the software using unzip or similar software.
I installed the software under /opt/ebookconverter/.
Kevin Boone wrote ebookconverter.
I use the software above to generate ebooks automatically
each week from selected HTML webpages.
This is done with a shell script, generate_ebooks.
I'll step through a simplified version of this shell script
(with error handling and site-specific stuff removed for readability).
Download generate_ebooks here.
You can use this to create ebooks automatically from your website,
or you can generate HTML files yourself with other software
(such as OCR software)
to create custom ebooks.
This last step is an exercise left to the reader :-).
The first part of the ksh shell script does initial housekeeping,
such as getting the input HTML filename and
creating output filenames.
The script extracts the author's name automatically from the
<meta name="author"> HTML tag.
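The housekeeping step can be sketched roughly as follows. This is a minimal illustration, not the actual generate_ebooks code: the function names extract_author and out_name, the filename suffixes, and the sed pattern are all assumptions. The pattern also assumes the meta tag sits on one line, in lower case, with name before content and double-quoted values.

```shell
# Sketch only: names and patterns here are hypothetical, not taken
# from the real generate_ebooks script.

# Print the content of the first <meta name="author" content="..."> tag.
# Assumes the tag is on a single line, lower case, double-quoted.
extract_author() {
    sed -n 's/.*<meta[^>]*name="author"[^>]*content="\([^"]*\)".*/\1/p' "$1" |
        head -n 1
}

# Build an output filename from the input name and a suffix,
# e.g. "index.html" and "-plucker.pdb" give "index-plucker.pdb".
out_name() {
    echo "${1%.*}$2"    # ${1%.*} strips the last extension
}
```

For example, out_name index.html .pdf yields index.pdf, and extract_author index.html prints the author string if the tag matches the assumed layout (and nothing otherwise).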
The first files we generate are plain text and Palm OS Doc PDB files.
html2pdbtxt creates a plain text file
for input into
txt2pdbdoc, which creates the PalmOS Doc file.
After creating the Doc file, the script removes
(\*) markers at the beginning of some lines (and removes an end-of-line if it's in the middle of a paragraph),
and removes the PalmOS end-of-file marker, <(\*)>.
# Convert to text and Palm Doc Pdb Format
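A minimal sketch of this step is below. It assumes the markers are the literal strings (*) and <(*)>, and that txt2pdbdoc takes a document name, an input text file, and an output PDB file in that order (check the txt2pdbdoc man page). The function names make_doc and clean_doc_text are hypothetical, and joining line breaks in the middle of a paragraph is not covered here.

```shell
# Sketch only: an assumed reading of the cleanup described above,
# not the author's actual commands.

# Assumed invocation: txt2pdbdoc <document-name> <file.txt> <file.pdb>
make_doc() {
    txt2pdbdoc "$1" "$2" "$3"
}

# Filter: read Doc-style text on stdin, write cleaned text on stdout.
#  - delete lines that consist only of the end-of-file marker <(*)>
#  - strip a (*) marker (and following spaces) from the start of a line
clean_doc_text() {
    sed -e '/^<(\*)>$/d' -e 's/^(\*) *//'
}
```

Usage would be along the lines of clean_doc_text < index-raw.txt > index.txt, run after the Doc file has been built from the unfiltered text.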
Next, we generate a zTXT format file, which also has a .pdb extension, but is in a different
format from Palm Doc files (and about half the size) and has its own reader.
We use Jmakeztxt to create a file from our earlier-generated plain text file.
# Create zTXT format from .txt with JmakezTXT
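This step might look like the sketch below. The java command line is the one quoted earlier in this article; the make_ztxt function, the JMAKEZTXT_JAR variable, and the DRY_RUN switch are hypothetical additions for illustration.

```shell
# Sketch only: wrapper names are hypothetical. The java command line is
# the command-line-mode invocation quoted earlier in the article.
JMAKEZTXT_JAR=${JMAKEZTXT_JAR:-./jmakeztxt-1.9.jar}

make_ztxt() {
    txt=$1
    # Minimal check: the plain text file from the previous step must exist.
    if [ ! -r "$txt" ]; then
        echo "make_ztxt: cannot read $txt" >&2
        return 1
    fi
    set -- java -jar "$JMAKEZTXT_JAR" net.sourceforge.jmakeztxt.MakeztxtCmd "$txt"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"       # print the command instead of running it
    else
        "$@"
    fi
}
```

Setting DRY_RUN=1 before calling make_ztxt index.txt prints the command without running Java, which is handy when testing the script unattended.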
Next, we generate a Plucker PDB file.
Plucker files share the same extension as PalmOS Doc files,
but the two formats are not interchangeable.
You need to install the Plucker Desktop software (free)
on your desktop computer and PDA to read them.
Plucker files, unlike Palm OS Doc files, can optionally come with embedded
images and have rich text capabilities, such as bold and italics,
for a richer reading experience.
# Create Plucker format
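A rough sketch of the Plucker step follows. It assumes the distiller is installed as plucker-build and accepts -H (home URL), -f (output file name prefix), -N (document name), and --maxdepth; verify these against plucker-build --help on your system. The helper names file_url and make_plucker are hypothetical.

```shell
# Sketch only: the plucker-build flags are assumptions to verify with
# plucker-build --help; function names are hypothetical.

# Turn a local path into a file:// URL for the distiller.
file_url() {
    case $1 in
        /*) echo "file://$1" ;;         # already absolute
        *)  echo "file://$PWD/$1" ;;    # relative to the current directory
    esac
}

# Usage: make_plucker <input.html> <output-prefix> <document title>
make_plucker() {
    plucker-build -H "$(file_url "$1")" -f "$2" -N "$3" --maxdepth=1
}
```

A depth of 1 is chosen here on the assumption that only the single input page should be converted, without following its links.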
Finally, we generate a PDF file from HTML, using ebookconverter.
Images are embedded with ebookconverter,
although not in a sophisticated way.
Image centering and sizing are ignored by ebookconverter,
and JPEG images tend to come out too large, and PNG images too small.
However, the ability to run this software unattended in the background is great.
The text rendering is excellent and may be done sans-serif (default),
or serif, as done here.
# Create PDF format with ebookconverter
Finally, let's run this shell script.
It produces 4 output files from one input file, index.html:
$ generate_ebooks index.html
Files generated by this software can be seen at the
Solaris x86 FAQ website,
which has the Solaris x86 FAQ available in multiple formats.
I also have dozens of ebooks available at the
Yosemite Online Library.
Ebook file formats come in many shapes and sizes—more than are necessary, in fact.
If you know of other ebook file formats not listed here, please leave a comment.
They must have freely-available converter software that runs on
UNIX-class operating systems (such as Solaris and Linux).
(Note: trademarks here are owned by their respective manufacturers.)