HTML Utilities

Contents of Distribution Package
================================

o HTMLTOOL.txt: This documentation file
o TXT2HTML.awk: AWK Program to convert plain text to HTML
o HTMLTBL.awk:  AWK Program to add a Table of Contents to HTML
                documents

=== Introduction ===

HTML (Hypertext Markup Language) has become a standard for writing
documents for the World Wide Web (WWW). Many documents, however, are still
plain text. Converting them to HTML manually is not difficult, but it is
time-consuming and boring. The program TXT2HTML takes over the initial work of
converting a text document into HTML. The result may still require some
cosmetic changes, but the bulk of the work is done. Manual post-processing of
TXT2HTML's output requires knowledge of HTML markup tags.

A frequent problem with HTML documents is the lack of a table of
contents. The second tool, HTMLTBL, addresses this by adding one to a
given file.

Required Software

In order to use the programs in this package you need:
* The AWK Programming Language. For the HP100/200 you can find one in
  the HPHAND library 15, file AWK.ZIP.
* Any HTML reader. There is an excellent HTML reader for the HP100/200
  in the HPHAND library 11, file HV.ZIP.

Program Descriptions
====================

The following sections describe how to use the AWK programs, their scope, and
their limitations.

TXT2HTML.awk

 USAGE: awk -f txt2html.awk file.txt > file.htm

TXT2HTML scans the text file <file.txt>, looks for formatting it
understands, and adds the appropriate HTML tags to the output file. The set of
recognized formatting elements is rather limited, so the generated HTML
file almost certainly requires manual changes. At the end of the generated
document TXT2HTML adds a table of contents which it has assembled from the
headings it was able to pick up.

If you want to convert documents written in MEMO, first print them to a file
with the left margin set to 1. AWK cannot handle the extremely long
lines generated by MEMO very well.

The quality of the generated HTML file also depends on the pattern
matching capabilities of the version of AWK you use. The various AWK
implementations differ in their support for the so-called EXTENDED REGULAR
EXPRESSIONS.

To get a feeling for TXT2HTML's capabilities, convert this text document to
HTML with:
 awk -f txt2html.awk htmltool.txt > htmltool.htm

Recognized Style elements:

* The very first line will be converted to the document title.
* A line containing just repeated characters like *,+,-,=,~,#
  is converted into a horizontal ruler line if the lines before and
  after this line are empty.
* A line underlined with *,+,-,=,~ is converted into a level 1
  heading
* A line is converted into a level 2 heading if the lines before and
  after this line are empty
* Lines starting with -,o,*,+,#,x are converted to unnumbered lists
* Lines starting with a number or letter followed by .,:,) are
  converted to numbered lists
* Indented lines are treated as pre-formatted
* Lines like '****** Some Text ******' are converted to level 1
  headings
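
The underline and ruler rules above can be sketched in a few lines of AWK.
This is only an illustration and not the code of TXT2HTML.awk: the function
name sketch_headings, the minimum run length of three characters, and the
use of a POSIX shell to wrap the AWK program are all assumptions. The other
rules (title, lists, pre-formatted text) are left out.

```sh
# Hypothetical helper, not part of the package: marks up underlined
# headings and ruler lines only and passes all other lines through.
sketch_headings() {
  awk '
    # Return 1 if s is a run of three or more copies of one character
    # taken from chars, otherwise 0. The minimum of three is an
    # assumption. c and k are local variables.
    function is_run(s, chars,    c, k) {
      if (length(s) < 3) return 0
      c = substr(s, 1, 1)
      if (index(chars, c) == 0) return 0
      for (k = 2; k <= length(s); k++)
        if (substr(s, k, 1) != c) return 0
      return 1
    }

    # Buffer the whole input so each line can see its neighbours.
    { line[NR] = $0 }

    END {
      for (i = 1; i <= NR; i++) {
        if (line[i] != "" && !is_run(line[i], "*+-=~#") &&
            is_run(line[i + 1], "*+-=~")) {
          # Text line followed by an underline: level 1 heading.
          print "<H1>" line[i] "</H1>"
          i++    # skip the underline itself
        } else if (is_run(line[i], "*+-=~#") &&
                   line[i - 1] == "" && line[i + 1] == "") {
          # Run of marker characters between empty lines: ruler.
          print "<HR>"
        } else {
          print line[i]
        }
      }
    }
  '
}
```

The sketch reads the text from standard input, for example
"sketch_headings < file.txt". The start and the end of the file count as
empty lines, so a ruler on the last line is still recognized.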

  ------------------------------------------------------------

HTMLTBL.awk

 USAGE: awk -f htmltbl.awk file.htm > filewithTOC.htm

HTMLTBL scans the HTML file <file.htm> for headings. Each heading is
converted to a so-called target location anchor, which serves as the target for
links in the table of contents. In addition, the headings
are collected and appended, in their original order, to the end of the file as
links pointing to the corresponding headings. The processed HTML file is
written to the file <filewithTOC.htm>.
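
The idea can be sketched as follows. This is not the code of HTMLTBL.awk but
a minimal illustration under stated assumptions: every heading sits on a
single line as <H1>...</H1> to <H6>...</H6> without attributes, there is at
most one heading per line, and the function name sketch_toc and the anchor
names toc1, toc2, ... are made up for this example. It is wrapped in a POSIX
shell function.

```sh
# Hypothetical helper, not part of the package: adds an anchor to every
# heading and appends a list of links to the end of the output.
sketch_toc() {
  awk '
    # Assumption: one heading per line, written without attributes.
    match($0, /<[Hh][1-6]>.*<\/[Hh][1-6]>/) {
      n++
      otag = substr($0, RSTART, 4)                 # opening tag, 4 chars
      ctag = substr($0, RSTART + RLENGTH - 5, 5)   # closing tag, 5 chars
      text = substr($0, RSTART + 4, RLENGTH - 9)   # heading text between
      pre  = substr($0, 1, RSTART - 1)
      post = substr($0, RSTART + RLENGTH)
      # Wrap the heading text in a named anchor (the link target).
      $0 = pre otag "<A NAME=\"toc" n "\">" text "</A>" ctag post
      toc[n] = text
    }

    # Every line, changed or not, goes to the output.
    { print }

    # Append the collected headings, in their original order, as links.
    END {
      print "<UL>"
      for (i = 1; i <= n; i++)
        print "<LI><A HREF=\"#toc" i "\">" toc[i] "</A></LI>"
      print "</UL>"
    }
  '
}
```

Like the usage line above, the sketch is meant to be run with input and
output redirected, for example "sketch_toc < file.htm > filewithTOC.htm".
Because the list is simply appended, it ends up after any closing tags at
the end of the input.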

If this sounds complicated, try it with an HTML file you have. Be sure to use
different names for <file.htm> and <filewithTOC.htm> in order not to destroy
your input file.

Legal Stuff
===========

Although the AWK programs are copyrighted, they are being licensed to you for
your use free of charge. However, ownership of and interest in this package
shall remain with the author. No one other than the author is allowed to derive
a commercial benefit from distributing or using this package.

You may:
1. Use this software on as many computers as you want at any given
   time.
2. Make as many backup copies of this software as you like.
3. Alter the software in any manner as you see fit FOR YOUR OWN
   PERSONAL USE. Such altered versions should not be distributed. The
   creation of such derivatives shall not diminish the author's title
   to this software.
4. Terminate this agreement at any time by destroying all copies of
   this software and derivatives of this software and cease
   distributing the same.

You may not create any derivative works from this software for
distribution.

DISCLAIMER OF WARRANTY

In using this software, you understand and agree that this software is
provided "as is" without warranty of any kind. The entire risk as to the
results and performance of using this software lies entirely with you, the
user.

  ---------------------------------------------------------
 |Peter ERNST          | Tel       : +49-7031-809494       |
 |                     | Fax       : +49-7031-809494       |
 |Dachsklingeweg 19    |                                   |
 |D-71067 Sindelfingen | CompuServe: 100271,632            |
 |GERMANY              | UNIX-Mail : peer@hpbbn.bbn.hp.com |
