:h1. SpHyDir Program Logic
:h2. Program Structure
:p.
Bugs can be most easily fixed when they are reported with enough
information to reproduce or at least localize the problem.
Suggestions will be most helpful if the user understands what
things are simple and what changes will be most difficult. For
these reasons, it is a good idea to provide at least a high
level review of SpHyDir program logic.
:p.
SpHyDir is driven by the abilities and limitations of the OS/2
operating system, the Rexx language, and the VX-Rexx development
environment. It would not have been possible to develop SpHyDir
so quickly with anything like its current function in any other
environment.
:p.
Rexx is an interpreted language. This means that execution works
directly off a version of the source. Development is fast, and
debugging is easy. However, some errors that in other languages
would be detected by the compiler are, in Rexx, not found until
execution time.
:p.
An important limit of Rexx is that programs in two separate
source files cannot share global variables. A program like
SpHyDir works only if it can be packaged as one massive source
file. Using a normal editor, such a large program becomes
unmanageable. VX-Rexx addresses this problem by dividing the code
into "sections." Each section has a name and logically appears
to the programmer as if it were a separate file. In the end,
VX-Rexx combines all the sections into a single file so that
variables can be shared.
:p.
Although IBM is distributing a test version of an Object
Oriented Rexx, SpHyDir is written in the traditional SAA OS/2
Rexx language. Objects would probably have been useful, but many
current Warp users will not have access to the Object Rexx
package and PCLT cannot redistribute it.
:p.
VX-Rexx creates its own objects to manage the Graphic User
Interface. A Rexx program, such as SpHyDir, cannot define new
types of objects, but there are a few general purpose VX-Rexx
objects that are quite flexible. In particular, the Document is
expanded in the Workarea as a sequence of Record Objects within a
Container. VX-Rexx created (actually OS/2 PM created) the
Container and Record. Records have attributes (the icon
displayed, the caption, the parent and sibling order that
creates the tree). VX-Rexx provides functions that SpHyDir can
call to set or interrogate these attributes or to create new
instances of Records. The rest of the VX-Rexx objects (windows,
buttons, lists, and menus) are defined statically during
development using the VX-Rexx tools.
:p.
The objects exist in memory managed by the VX-Rexx system
outside the scope of Rexx internal variables. Statically defined
objects have a name which is set at design time and can be added
to any program. Dynamic objects, such as records, have an
internally generated id (the "handle"). Handles can be passed as
arguments or stored in global variables. However, the currently
selected record in the Workarea can be determined at any time by
calling a VX-Rexx function (to essentially ask the workarea what
record is selected), and much of the code interrogates the
VX-Rexx objects for information rather than relying on global
variables or other ugly interface options.
:p.
The Main Window contains the Toolbar, Workarea, and entry
fields. It is loaded when SpHyDir starts and goes away when the
program ends. The other windows (Link Manager, Table of
Contents, Text Edit, and Hotword Selection) are loaded as
needed. Because the Main Window is always loaded, the Workarea
object (and its Records) are available to every part of the
code.
:p.
In VX-Rexx, a secondary window is created during program design
and is populated with lists, fields, buttons, and menus. The
Rexx code that will be called when the user types data into or
uses controls on a secondary window can be a separate file, but
SpHyDir chooses to make it part of the one large common code
routine.
:p.
However, the objects on a secondary window do not exist until
that window is "loaded." SpHyDir loads secondary windows in
response to actions in the Main Window. The Table of Contents
and Link Manager are loaded from the "Window" menu, the
Clipboard loads when the user puts something in it, and the Text
Edit window loads when the user double-clicks on a Paragraph,
Point, or other record containing text content. In all cases,
the objects in the secondary window do not exist until it has
been loaded.
:p.
Secondary windows are created by default as "children" of the
main window. This means that when the secondary and main windows
overlap, the secondary window is always "on top." A secondary
window can also be created as a child of the desktop, which
would allow the main window to be on top. SpHyDir doesn't use
the latter option (it didn't seem to look right, but that is a
value judgement).
:p.
The Workarea is a container permanently set to a Tree-Name view.
The Toolbar is a Value Set whose contents are icons. Normally a
value set is a kind of radio button (one value is selected) but
that is not how the Toolbar is used. Rather, sometimes the icons
are double-clicked and sometimes they are drag-dropped. The
Value Set was chosen simply because it is a convenient way to
arrange a bunch of icons.
:p.
VX-Rexx provides a function that returns the handle of the first
record in a Container. In SpHyDir, this is the Document record.
Other functions can find the first child of any parent record,
or the next record among a collection of siblings. All the
information about the document (text, links, structure, etc.)
can be globally accessed by any function in the SpHyDir program
by simply "walking" the tree of container records.
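:p.
The walk described above can be sketched as follows. This is an
illustration in Python, not the Rexx code in SpHyDir, and the
Record class and the first_child and next_sibling helpers are
hypothetical stand-ins for the VX-Rexx functions, whose real
names are not given here.

```python
class Record:
    """Hypothetical stand-in for a container record."""

    def __init__(self, type_, text=""):
        self.type = type_
        self.text = text
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def first_child(record):
    """Return the first child of a record, or None if it has none."""
    return record.children[0] if record.children else None


def next_sibling(record):
    """Return the next record under the same parent, or None."""
    if record.parent is None:
        return None
    siblings = record.parent.children
    position = siblings.index(record)
    return siblings[position + 1] if position + 1 < len(siblings) else None


def walk(record, depth=0):
    """Yield (depth, record) in document order using only the two helpers."""
    yield depth, record
    child = first_child(record)
    while child is not None:
        yield from walk(child, depth + 1)
        child = next_sibling(child)


# A tiny document: one section holding a paragraph, then an image.
doc = Record("DOC", "Sample")
section = doc.add(Record("SECTION", "Intro"))
section.add(Record("PARA", "Hello"))
doc.add(Record("IMG", "Logo"))
outline = [(depth, record.type) for depth, record in walk(doc)]
```

:p.
Because the walk needs only the first record and the two
navigation helpers, any routine can reach the whole document
without keeping its own copy of the structure.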
:p.
Records have attributes and data fields.
:UL.
:li.  :hp2.Attributes:ehp2. apply to all records. They are defined by
PM or VX-Rexx. They include the Icon, Caption text, and emphasis
(looks Selected, looks Open, etc.). Attributes are set when the
record is created, and can be changed by calling the method
functions provided by the Container object. A record can be
moved, for example, by changing the two attributes that
determine position&colon. :hp2.Parent:ehp2. that determines what
if any record is directly above it in the tree, and
:hp2.Previous:ehp2. that determines the record in front of it
among all the records under that parent (though if Previous is
set to "FIRST" it goes in front of its siblings just after the
Parent).
:li.  :hp2.Fields:ehp2. contain the data that an application chooses
to define. Normally, the Fields of a record are the columns
displayed when the container is put in Details View. However,
SpHyDir doesn't ever change the container view, so the Field
contents are never displayed on the screen. What does show up to
the right of the Icon is the Caption, which is an attribute as
described above. SpHyDir assigns six text string fields to
every record&colon.
:OL.
:li.  :hp2.Type:ehp2. is a word that determines the type of the
object. The Type value is always capitalized. Types include DOC,
SECTION, PARA, IMG, OLIST, ULIST, DLIST, POINT, TOC, TARGET,
PRE, HR, SUBDOC, FORM, ENTRY, MLE, RADIOBUT, CHECKBOX, SPIN, and
LISTBOX.
:li. The :hp2.Text:ehp2. string contains the text of the paragraph,
point, or preformatted section, the title of a document, the
header of a section, the alternate text of an image, or the
caption of a box or button. Conceptually, Text is what is
"inside" the object in the sense that when you double-click on
an object to open the text window, this is what shows up.
:li. The :hp2.Name:ehp2. string contains the file name of the
Document or Image object, the term of a Definition List item,
the label of a Target, or the name of a variable associated with
a Form object.
:li. The :hp2.Links:ehp2. string contains a list of file references
or URLs. An image has only one link. A paragraph or point can
have multiple hotword links. Each 0x10,0x11 character pair in
the Text string corresponds to one filename/URL in the Links
string.
:li.  :hp2.Align:ehp2. was originally used only with images and
contained the alignment option. When Forms objects were added,
the field was reused to contain shape information (height,
width, length) but it has not yet been renamed.
:li.  :hp2.Entry:ehp2. contains the static default string for a
simple text entry area. The Text field contains the caption, and
there was no other field to put this in.
:eOL.
:p.
The Text field contains a string that represents the contents of
the paragraph or list point. Within this string, <BR> tags have
been converted to Carriage Return, Linefeed. Hotwords start with
a 0x10 and end with a 0x11. Each consecutive hotword string in
the text corresponds to one file/URL in the Links field. The
character format tags (I, B, CITE, CODE, etc.) are embedded in
the text, but the < and > have been replaced by 0x1E and 0x1F
respectively. The &amp.amp, &amp.lt, and &amp.gt entities have
been converted back to &amp., <, and > for easy editing. They
will be converted back to entities when the file is written out.
:eUL.
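:p.
As a rough illustration of the Text field encoding, the Python
sketch below converts an internal Text string back to HTML. It
is not the SpHyDir code. The function name, the use of a Python
list for the Links field, and the exact form of the generated
anchor tag are all assumptions made for the example.

```python
import re

HOT_START, HOT_END = "\x10", "\x11"    # delimit a hotword
TAG_OPEN, TAG_CLOSE = "\x1e", "\x1f"   # stand in for "<" and ">" of format tags


def text_to_html(text, links):
    """Convert an internal Text string to HTML (illustrative only).

    `links` is assumed to hold one file/URL per hotword, in order.
    """
    # Literal characters go back to entities first, so that the tag
    # brackets restored in the next step are not escaped.
    html = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    html = html.replace(TAG_OPEN, "<").replace(TAG_CLOSE, ">")
    html = html.replace("\r\n", "<BR>")

    remaining = iter(links)

    def anchor(match):
        # Assumed output form; each hotword consumes the next link.
        return '<A HREF="%s">%s</A>' % (next(remaining), match.group(1))

    pattern = HOT_START + "(.*?)" + HOT_END
    return re.sub(pattern, anchor, html, flags=re.DOTALL)


sample = "See \x10the \x1eB\x1fguide\x1e/B\x1f\x11 for A < B & C.\r\nDone"
converted = text_to_html(sample, ["../guide.htm"])
```

:p.
The sketch does not handle a Text string with more hotwords than
links. A real implementation would need to decide what to do in
that case.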
:h2. Input HTML Processing
:p.
SpHyDir loads an HTML file when one is dropped onto the
workarea, or if one is passed as an argument when SpHyDir is
loaded (usually by dropping an HTML file on the Program Object
in WPS configured to run SpHyDir). To process a new file, first
the workarea is cleared of all existing objects. Then the HTML
file is processed in three stages.
:p.
The first stage reads the file into memory and breaks it into
sections of Text and Tags. Tags are the part between "<" and ">"
characters. Text is outside the tags. The process creates a stem
variable called _Token. Each entry in the stem contains either
tag information or the text string.
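:p.
A minimal sketch of this first stage, written in Python for
illustration. SpHyDir builds a Rexx stem variable. The list of
tuples used here, and the choice to split a tag into an
uppercase name and the rest of its contents, are assumptions
made for the example.

```python
import re


def tokenize(html):
    """Split HTML into ("TAG", name, attributes) and ("TEXT", string, "")."""
    tokens = []
    # The capturing group makes re.split keep the tags in the result.
    for piece in re.split(r"(<[^>]*>)", html):
        if not piece:
            continue
        if piece.startswith("<") and piece.endswith(">"):
            body = piece[1:-1].strip()
            parts = body.split(None, 1) or [""]
            name = parts[0].upper()
            attributes = parts[1] if len(parts) > 1 else ""
            tokens.append(("TAG", name, attributes))
        else:
            tokens.append(("TEXT", piece, ""))
    return tokens


tokens = tokenize('<h1>Title</h1>Some <B>bold</B> text<IMG SRC="a.gif">')
```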
:p.
When the _Token. table has been built, a second pass scans the
tags for matching Start and End tags. SpHyDir has a vocabulary
of tag names that pair off&colon. PRE HTML HEAD BODY TITLE H1 H2
H3 H4 H5 H6 A B I U DL UL OL ADDRESS. Every time it encounters a
starting tag in this list, it pushes an element on a stack. When
it encounters the matching end tag, it pops the stack. Ordinary
<P> tags are not on the list because the </P> tag is often
missing. Normally, the tag that ends should be on the top of the
stack, but a large number of real Web documents are sloppy.
SpHyDir allows two adjacent tags to end in the wrong order,
since "<H1><B> ... </H1> </B>" has been observed in many
documents. Otherwise, if an ending tag doesn't seem valid given
the current stack, SpHyDir stops and pops up the HTML edit
window for manual correction.
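:p.
The matching pass can be sketched in Python as shown below. The
tag vocabulary is the one listed above. The function and
exception names are hypothetical, and raising an exception
stands in for popping up the HTML edit window. What SpHyDir does
with start tags that are never closed is not described here, so
the sketch simply leaves them unpaired.

```python
PAIRED = set(
    "PRE HTML HEAD BODY TITLE H1 H2 H3 H4 H5 H6 A B I U DL UL OL ADDRESS".split()
)


class TagMismatch(Exception):
    """Raised where SpHyDir would ask for manual correction."""


def pair_tags(names):
    """Map the index of each start tag to the index of its end tag."""
    stack = []   # entries are (tag name, index of the start tag)
    pairs = {}
    for index, name in enumerate(names):
        if name in PAIRED:
            stack.append((name, index))
        elif name.startswith("/") and name[1:] in PAIRED:
            wanted = name[1:]
            if stack and stack[-1][0] == wanted:
                # Normal case: the ending tag is on top of the stack.
                pairs[stack.pop()[1]] = index
            elif len(stack) >= 2 and stack[-2][0] == wanted:
                # Tolerated case: two adjacent tags end in the wrong order.
                pairs[stack.pop(-2)[1]] = index
            else:
                raise TagMismatch("unexpected <%s> at token %d" % (name, index))
    return pairs


# "<H1><B> ... </H1> </B>" followed by a well-formed list.
pairs = pair_tags(["H1", "B", "/H1", "/B", "P", "UL", "/UL"])
```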
:p.
Once the scope of the tags has been determined, it is now
possible to build the objects that structurally represent the
document. The first step is to scan through the HTML HEAD TITLE
... /TITLE /HEAD BODY part at the front. The title goes on the
document object. The rest of the header is discarded by the
current version of SpHyDir but may be supported in a later
release. The body of the document is then processed, with
H1...H6 tags turned into sections, IMG tags turned into images,
OL/UL/DL tags turned into lists, and text between paragraph
breaks turned into paragraphs.
:p.
Tag parsing is a big SELECT ("case") statement. Anything that is
not recognized is turned back into a tag (though the "<" and ">"
are replaced by 0x1E and 0x1F during SpHyDir editing). This is
intentional for formatting tags (B, I, CITE, CODE) and seemed
like a good fallback for any unrecognized experimental,
obsolete, or exotic stuff.
:p.
Object construction is a recursive procedure. When it encounters
an object that has contents (SECTION, ULIST, OLIST, DLIST) then
it benefits from the previous scan that paired off starting and
ending tag locations. It can create a new parent object, push a
new level into the tree, and recursively call itself to process
all the tags between the start and the end of the current
structural component.
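:p.
The recursion can be illustrated with a reduced Python sketch
that handles only lists and text. The token layout, the function
names, and the rule that text inside a list becomes a POINT are
simplifications assumed for the example. The sketch also assumes
list tags are properly nested.

```python
LIST_TYPES = {"UL": "ULIST", "OL": "OLIST", "DL": "DLIST"}


def match_lists(tokens):
    """Map the index of each list start tag to the index of its end tag."""
    stack, pairs = [], {}
    for index, (kind, value) in enumerate(tokens):
        if kind != "TAG":
            continue
        if value in LIST_TYPES:
            stack.append(index)
        elif value.startswith("/") and value[1:] in LIST_TYPES:
            pairs[stack.pop()] = index
    return pairs


def build(tokens, pairs, start, end, inside_list=False):
    """Return the objects found in tokens[start:end] as nested tuples."""
    objects = []
    index = start
    while index < end:
        kind, value = tokens[index]
        if kind == "TAG" and value in LIST_TYPES and index in pairs:
            close = pairs[index]
            # Recurse over everything between the start and end tags.
            children = build(tokens, pairs, index + 1, close, inside_list=True)
            objects.append((LIST_TYPES[value], children))
            index = close + 1
            continue
        if kind == "TEXT" and value.strip():
            objects.append(("POINT" if inside_list else "PARA", value.strip()))
        index += 1
    return objects


tokens = [
    ("TEXT", "Intro"), ("TAG", "UL"), ("TAG", "LI"), ("TEXT", "one"),
    ("TAG", "OL"), ("TAG", "LI"), ("TEXT", "deep"), ("TAG", "/OL"),
    ("TAG", "LI"), ("TEXT", "two"), ("TAG", "/UL"), ("TEXT", "Outro"),
]
tree = build(tokens, match_lists(tokens), 0, len(tokens))
```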
:h2. Filenames
:p.
The issue of file names should have been more carefully thought
out. SpHyDir has tried to go back and fix things systematically,
but problems may still arise.
:p.
SpHyDir is typically used to edit files on a personal machine.
They need to be locally tested with Web Explorer. Then they will
be transferred to the production server, which can be a Unix,
NT, or OS/2 machine. Three problems can occur in the
transfer&colon.
:OL.
:li. A Unix system requires the use of "/" between directory levels.
Although NT and OS/2 use "\" normally, they will tolerate the
forward slash. Therefore, a forward slash is generated in all
HTML references to files. However, Rexx is not quite so
forgiving. The Rexx functions that parse and validate file names
require that the "\" character be used. Thus SpHyDir is always
internally jumping back and forth between the internal "\" and
external "/" version of file names.
:li. Unix file names are case sensitive. OS/2 and NT are not. If the
file names are left in mixed case, then Rexx string comparisons
will not find that "stuff.htm" and "STUFF.HTM" are the same
file. If everything is folded to uppercase, then the Unix server
may complain. This is currently categorized as an outstanding
bug that needs to be fixed sometime soon.
:li. The target HTML library on the server may be a subdirectory of
some larger structure. For example, on the author's machine
SpHyDir operates against the F&colon.\PCLT directory, but on the
server the same tree is stored under D&colon.\HTTP\PCLT. All the
file references generated in the HTML are relative to the
current document. This document is SPHYDIR/LOGIC.HTM when viewed
from the main PCLT directory. Within the SPHYDIR subdirectory, a
reference to ../ICONS/HOOD.GIF references \PCLT\ICONS\HOOD.GIF
(the ".." goes up one level from \PCLT\SPHYDIR). Again, although
this particular syntax has to be religiously generated in HTML,
Rexx won't accept any of this syntax. Every such relative
reference has to be convertible to an absolute path name on the
current machine ("F&colon.\PCLT\SPHYDIR\LOGIC.HTM") in order for
Rexx to open the file or find its extended attributes.
:eOL.
:p.
The "solution" was to create the general purpose Parse_Filename
subroutine. It takes a file name in one of two formats&colon. a
fully qualified OS/2 path ("F&colon.\PCLT\WINWORLD\OS2.HTM") or
a Unix path relative to the current document
("../author.htm"). It produces three output forms for the
filename&colon. the OS/2 path, the document-relative path, and
the library-relative path ("winworld/os2.htm"). To get the
document-relative path you have to pass the library-relative
path of the current document. If this argument is not supplied,
then there is no current document and all Unix-style parameters
are library-relative.
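:p.
The behavior of Parse_Filename can be approximated by the Python
sketch below. It is not the Rexx routine. The function
signature, the explicit library_root argument (SpHyDir takes the
library location from the HTMLLIB environment setting), and the
order of the returned values are assumptions. The sketch leaves
the case of names untouched, so it does not address the case
sensitivity problem described above.

```python
import ntpath
import posixpath


def parse_filename(name, library_root, current_doc=None):
    """Return (os2_path, document_relative, library_relative) for a name.

    `name` is a fully qualified OS/2 path or a Unix-style path relative
    to the current document. `current_doc` is the library-relative path
    of the current document, if there is one.
    """
    doc_dir = posixpath.dirname(current_doc) if current_doc else ""

    if ntpath.splitdrive(name)[0]:
        # Fully qualified OS/2 path: strip the library root.
        relative = ntpath.relpath(name, library_root)
        library_relative = relative.replace("\\", "/")
    else:
        # Unix-style path: resolve ".." against the document directory.
        library_relative = posixpath.normpath(posixpath.join(doc_dir, name))

    os2_path = ntpath.normpath(
        ntpath.join(library_root, library_relative.replace("/", "\\"))
    )
    document_relative = posixpath.relpath(library_relative, doc_dir or ".")
    return os2_path, document_relative, library_relative


root = "F:\\PCLT"
from_os2 = parse_filename("F:\\PCLT\\WINWORLD\\OS2.HTM", root, "SPHYDIR/LOGIC.HTM")
from_unix = parse_filename("../ICONS/HOOD.GIF", root, "SPHYDIR/LOGIC.HTM")
no_document = parse_filename("winworld/os2.htm", root)
```

:p.
Only the second and third values would ever be stored in a
record field or written to a file. The first exists solely for
file I/O on the local machine.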
:p.
The PC path format is only used for system interface subroutines
and file I/O. It is never stored in any record field, kept in a
shared variable, or written to any file. You can copy the entire
library to another disk or directory without affecting any
logical links.
:p.
The Unix-style reference to the position of another file
relative to the current document is used for the GIF file that
is the source of an Image object, as a Hotlink to another
library file, and when referencing a subdocument.
:p.
The Unix-style reference from the start of the library (or more
properly from the HTMLLIB environment setting) is used to refer
to the current document itself. As a consequence, it is also
stored in the Parent Extended Attribute for a subdocument file.
:p.
If care is taken to remember the rules, then things will come
out all right. Sloppy thinking can embed the wrong type of file
reference and cause trouble. After a bit of thinking about the
problem, the various forms become somewhat natural. It should be
noted, however, that while the Subdoc Extended Attribute from
the parent file to the subdocuments is relative to the parent
location, the Parent Extended Attribute from the subdocuments
back to the parent is a library-relative expression.
:h2. Document Editing
:p.
There are five ways to edit a document.  Some attributes of
objects can be changed by selecting the object and typing new
values in the entry fields that appear at the top of the
Workarea.  Image and Subdocument objects are associated with a
file by dropping the icon of the file on top of the object. 
Text-y objects are edited by double-clicking the object and
opening the Text Editor Window.  New objects are created by
dropping icons from the Toolbar into the document.  Existing
objects can be moved by dragging them around or through the
SpHyDir Clipboard.
:h3. Fields at top of Workarea
:p.
At design time, the top of the work area is configured to have a
set of fields&colon.
