How the TowerBells Website is maintained

This page presents a moderately detailed overview of how most of the work of maintenance of this Website is done, for the enlightenment of those visitors who are interested in such things.  In doing so, it also serves to provide some documentation of the types of material found here, as well as of various unusual aspects of the technology that is employed for this work.  Finally, it explains why some requested updates don't get done as promptly as their requesters might like to see.

The processes and technology described in this overview became effective in midsummer 2021.  While many aspects are functionally identical to those of the previous maintenance process, there are enough significant differences to warrant this almost entirely new page.  The most significant change was the abandonment of three stages of computer hardware, some of which used technology that was over 30 years old.  While software that old is still being used, the hardware that it formerly ran on is now being emulated on modern hardware; details are given below.

Hardware systems

The hardware systems used for this Website are a host system and an offline system.

Host system

The computer system which hosts this Website (the Web server) belongs to a commercial Web-hosting enterprise.  Its services are paid for by Your Editor as a public service, and they include provision of the basic hardware/software infrastructure and the Internet access point.  The infrastructure includes the host computer and its operating system (Linux), utilities for Website maintenance and disk storage management, as well as Web server software (Apache).  The host computer and Web server support long, mixed-case filenames which are case-sensitive.  Public access is unrestricted for display of Webpages and downloading of selected files; maintenance access for adding or replacing Webpages and other files is restricted to authorized personnel (currently only Your Editor).

Offline system

The offline system hardware which supports this Website consists of two Apple Macintosh computers owned by Your Editor and resident in his home.  They are locally networked together, and both have full-time high-speed access to the Internet.  One contains the offline mirror of the Website plus a workspace and various software tools that run in the native Mac OS X environment.  The other contains the database for carillons of the world and its DOS-based maintenance tools (all described fully below); it also handles email and other facilities for collecting new information from elsewhere.  While all of these features could in fact be made resident on a single computer, using two closely-linked computers provides useful backup and redundancy.  There is additional off-site backup for disaster recovery.

(For the technorati:  The offline mirror for the entire Website is resident on an Apple Macintosh computer running Mac OS X 10.6.8.  The HTML editor is BBEdit 10.5.13.  The FTP utility used for uploading files is Fetch 5.7.7.)

Main pages

The "main pages" on this Website are those Webpages which are in the home (or root) directory of the Website or in subdirectories other than the data directory (see below).  They include the TowerBells Home Page and all others which have Uniform Resource Locators (URLs) in which the domain name "www.TowerBells.org" is immediately followed by a solidus (/) and then a filename.  That includes all pages in the "general introduction to tower bells" section, as well as almost everything else linked from the home page.  All of these pages are essentially plain text (even if sometimes formatted in complex ways), and contain NO information from the database of carillons of the world.

NOTE:  Each and every page on this Website, regardless of what section it is in, has the owner's email address (csz_stl@swbell.net) at the bottom of the page, since he should be contacted for any questions, suggestions or comments which you may have about that page.  Please do NOT use the email link on any page to write about a different page, because each such link pre-sets the subject line of your message to reference the particular page on which that link occurs.
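
As a rough illustration of how a page-specific email link can pre-set the subject line, the following Python sketch builds a "mailto" URL for a given page.  The function name and the wording of the subject are assumptions made only for this example; the actual subject text used by the links on this Website is not documented here.

```python
from urllib.parse import quote

OWNER_EMAIL = "csz_stl@swbell.net"  # address shown at the bottom of every page


def page_mailto(page_name: str) -> str:
    """Build a mailto: URL whose subject line names the given page.

    The subject wording is hypothetical, chosen only for illustration.
    """
    subject = f"Re: {page_name}"
    # Percent-encode the subject so that spaces and punctuation survive in a URL.
    return f"mailto:{OWNER_EMAIL}?subject={quote(subject)}"


print(page_mailto("CODENVUD.HTM"))
```

Because the page name is baked into each link, a message sent from one page's link about a different page would arrive with a misleading subject line.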

Data pages

The "data pages" on this Website are those Web pages which are in the "data" subdirectory or below it, i.e., those which have URLs of the same form as that of the page you are now reading.  Specifically, they all have the "/data/" directory name in the middle of their URLs.  These pages have to do with the instruments which are the reason for the existence of both the database that undergirds this Website and the institutions that it most directly supports — carillons and their "relatives" in the world of tower bells.  "World" is not only symbolic but geographically literal, because these pages are derived directly or indirectly from the author's database of carillons of the world.  These pages are maintained offline using the technologies described below, and uploaded to the Website when ready.  Like the main pages, each such page has his email address in page-customized form (e.g., csz_stl@swbell.net), since he should be contacted for any questions, suggestions or comments which you may have about any particular page.

The data pages may be further subdivided into text data pages, site data pages and index pages.  There are also a very few hybrid data pages, as well as hardcopy files (PDFs) derived from the same database.

Text data pages

Text data pages provide the framework within which all data is organized — introductory and general descriptive material, as well as access to the various kinds of indexes.  There are also some special articles on various subjects.  Text data pages all have mixed-case filenames ending in ".html", e.g., "Data_Top.html".  They were composed and are maintained using a methodology just like that used for maintaining the main pages (see above).  The page that you are reading falls into this category.

Site data pages

Site data pages are the reason for the existence of the /data/ section of this Website, since each such page describes one carillon or other tower bell instrument somewhere in the world.  Collectively, they make up by far the largest part of this Website.  All site data pages have uppercase filenames ending in ".HTM" (e.g., "CODENVUD.HTM"); they are derived from a database which originally was resident on an IBM-compatible personal computer (PC) and now resides in a DOS PC emulation environment on a Mac computer system.  This database and its current environment are further described below.

The process of producing these pages works as follows:

  1. Run a data-extraction program (described below) in the DOS emulation environment to select information from the database and organize it into a set of new HTML files (Web pages) that reflect all substantive changes to the database since the last such run; one of those pages will be a revision index (see below).
  2. Move the generated HTML files from the PC emulation environment to a workspace in the Mac's native environment.
  3. For each generated file, apply the HTML editor as follows:
    1. If it is an improved version of an existing site data page, use the file-compare feature to find and copy forward any information which was manually added in previous versions and remains relevant.  This includes all site-specific internal and external Weblinks. 
      If revisions to technical details in that site data page necessitate changes to any of the indexes which reference it, move those indexes from the internal mirror into the workspace and update them appropriately.
    2. If it is a new site data page, add to it the appropriate site-specific internal Weblinks, using standard templates; also copy the revision index entry for this site into all relevant existing index files, with appropriate adaptations for each.
    3. Convert from DOS format (CR-LF line ends) to Macintosh format (CR line ends).  (Both are different from Unix format, which has LF line ends.)  This happens automatically within BBEdit.
    4. Convert any remaining characters with diacritical marks (such as á,é,É,ç) from what works on the PC to the equivalent Macintosh character.
    5. If it is a site data page, and new external Weblinks have been found for that site, incorporate those into the page.
  4. Add an appropriate descriptive entry to the "What Is New" page, including a link to the revision index.
  5. Verify the consistency of all internal links, and verify that all external links are still current.

Once the batch of new and revised pages is ready, they are uploaded to the Website together.  Then BBEdit is used to find all of the email addresses in the batch, in order to send a standard notification message to as many of those sites as possible; a copy of that message also goes to other interested persons.  (See the Subscribing to update notices page for more information about this.)
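
The line-end conversions mentioned in the process above are performed automatically by BBEdit and Fetch.  Purely to show what changes at the byte level, here is a minimal Python sketch; the function names are invented for this example and do not correspond to any tool actually used here.

```python
def dos_to_mac(data: bytes) -> bytes:
    """Convert DOS line ends (CR-LF) to Macintosh line ends (CR)."""
    return data.replace(b"\r\n", b"\r")


def mac_to_unix(data: bytes) -> bytes:
    """Convert Macintosh line ends (CR) to Unix line ends (LF).

    Assumes the input no longer contains any CR-LF pairs.
    """
    return data.replace(b"\r", b"\n")


dos_page = b"<html>\r\n<body>\r\n</body>\r\n</html>\r\n"
mac_page = dos_to_mac(dos_page)    # as held in the offline workspace
unix_page = mac_to_unix(mac_page)  # as stored on the Web server after upload
print(mac_page)
print(unix_page)
```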

Index pages

Index pages enable the finding of site data pages using any of several different criteria, as well as showing how sites relate to others with similar characteristics under any particular index criterion.  There are two categories of index pages — revision indexes and permanent indexes.

Website revision indexes are pages with names such as "IX990924.HTM".  Their purpose is solely to identify the members of the package of new or revised site data pages with which they were uploaded, and they are referenced only from the What Is New page entry for the date of that upload.  They are generated by the same data-extraction program as the site data pages, as part of the same data extraction process, and are handled in the same manner (see above).

Permanent indexes are pages that have mixed-case filenames like text data pages, but they are composites of text plus database material, and their filenames always begin with IX.  Each such index is based on one or more clearly defined criteria, which may be geographic and/or technical in nature.  (For example, there are indexes to the traditional carillons of each region of the world in which such instruments are found.)  Most of these indexes began as special extracts from the database, in a format essentially identical to revision indexes, to which appropriate text material was added.  Over time, these permanent indexes have been expanded by copy-and-paste methods using revision indexes as the raw source material (see above).  Normally, these indexes are revised and uploaded only in conjunction with the new or changed site data pages which are associated with changes in the index lines which link to such pages.  This stricture is followed in an attempt to make sure that index pages are never out of sync with the site data pages which they index.  (It is still possible for human error in editing an index page to result in a discrepancy between the index and what is being indexed.  Please report such discrepancies as soon as you find them, using the "mailto" link at the very bottom of the affected page.  Doing so will automatically generate an appropriate subject line which identifies the page.  Please send a separate message for each discrepant page.)

Hybrid site data pages

A very small number of site data pages have filenames ending in ".htm", e.g., "DENEWCAS.htm".  These are for 6-bell rings in North America, which are too small to be included in the database but too important to be omitted as we support the North American Guild of Change Ringers (NAGCR).  These pages were constructed by hand to resemble normal site data pages, and are also maintained by hand.
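
The filename conventions described above are regular enough that the kind of a data page can usually be guessed from its name alone.  The following Python sketch shows one way to do that.  It is only an illustration: the function is hypothetical, the assumption that every revision index is named "IX" plus six digits is drawn from the single example given above, and "IX_example.html" is an invented filename.

```python
import re


def classify_data_page(filename: str) -> str:
    """Guess the kind of /data/ page from its filename (case-sensitive)."""
    # Assumed pattern for revision indexes, based on "IX990924.HTM".
    if re.fullmatch(r"IX\d{6}\.HTM", filename):
        return "revision index"
    # Site data pages: all-uppercase names ending in ".HTM".
    if filename.endswith(".HTM") and filename == filename.upper():
        return "site data page"
    # Hand-built hybrid pages: names ending in lowercase ".htm".
    if filename.endswith(".htm"):
        return "hybrid site data page"
    # Mixed-case ".html" names: permanent indexes begin with "IX".
    if filename.endswith(".html"):
        if filename.startswith("IX"):
            return "permanent index"
        return "text data page"
    return "unknown"


for name in ["Data_Top.html", "CODENVUD.HTM", "IX990924.HTM",
             "DENEWCAS.htm", "IX_example.html"]:
    print(name, "->", classify_data_page(name))
```

Such a check works only because the host computer treats filenames as case-sensitive, so ".HTM" and ".htm" remain distinct.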

Maps

The site locator maps and regional locator maps that visitors to this Website can see do not exist as static pages on the Web server.  Instead, they are dynamically generated upon request, using geographic information that is contained in XML files.  This is what makes it possible for visitors to interact dynamically with maps in interesting and perhaps unusual ways, including (for example) the combining of maps for adjacent regions into a single map for the conjoined region.  Those XML files are produced and maintained in ways that are almost identical to the production and maintenance of index pages.

All of the above

No matter how they were originally constructed, final maintenance of all pages and map files on this Website is now done using a sophisticated HTML-aware text editor, operating in a workspace associated with an offline mirror of the Website.

Database

Several aspects of the underlying database are particularly relevant to the process of Website maintenance.

The DOS emulation environment

As implied above, the database of carillons of the world resides in a software environment that emulates an IBM-compatible personal computer (model AT/486) running the Disk Operating System (DOS) version 6.22.  The reason for using that environment is to be able to use a set of DOS-based software that has no equivalent or replacement in more modern hardware and software environments.  That software includes a text editor (KEDIT), a file manager (Stereo Shell), the data-extraction program and a Borland Turbo Pascal compiler that is used for revising it.

Database maintenance

The database is composed of plain text files (sometimes called flat files), each of which includes information about one category of known tower bell instruments in one region of the world.  These files are updated using KEDIT 5.0, a PC-based equivalent of XEDIT, a powerful text editor that has been widely used on IBM mainframe computers for decades.  Some of its text-manipulation features are still not available in any contemporary WYSIWYG ("what you see is what you get") text editor.

Tracking of changes to the database is done using the two dates which are presented near the bottom of every site data page.  These dates show when the textual and technical parts of each site's data were last revised in the database.  They also control which sites will be extracted in each batch run, when data-extraction criteria such as "changed on date" or "changed since date" are used.
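
To make the role of these two dates concrete, here is a small Python sketch of date-based selection.  It is not the actual data-extraction program (which is written in Pascal); the record layout, field names and site identifiers are hypothetical, and treating "changed since date" as inclusive of the given date is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class SiteDates:
    site_id: str        # hypothetical identifier for one site
    text_revised: date  # when the textual data was last revised
    tech_revised: date  # when the technical data was last revised


def changed_on(site: SiteDates, day: date) -> bool:
    """True if either part of the site's data was revised on that day."""
    return day in (site.text_revised, site.tech_revised)


def changed_since(site: SiteDates, day: date) -> bool:
    """True if either part was revised on or after that day (assumed inclusive)."""
    return max(site.text_revised, site.tech_revised) >= day


def select(sites: List[SiteDates], day: date) -> List[str]:
    """Return the identifiers of all sites changed since the given day."""
    return [s.site_id for s in sites if changed_since(s, day)]


sites = [
    SiteDates("SITE_A", date(2021, 7, 1), date(2021, 8, 15)),
    SiteDates("SITE_B", date(2020, 1, 1), date(2020, 1, 1)),
]
print(select(sites, date(2021, 8, 1)))
```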

Special characters

Different types of computers represent character data in different ways, so one of the most troublesome problems for computer users is managing the translation of character data from one system to another.  American-made computers all share a common base of ASCII (American Standard Code for Information Interchange), which includes upper and lower case alphabetic characters, numeric digits, and common punctuation marks.  But they have differing ways of extending ASCII to represent various "special" characters, e.g., those with diacritical marks, such as á,é,É,ç.

In the database for carillons of the world, the encoding of special characters was originally designed for optimal display of western European languages on an HP LaserJet IIIP printer.  As the data portion of the GCNA Website expanded to cover areas beyond North America, procedures were developed to manage the translation of these special characters from one system to another.  When the database-related software was moved from PC hardware to the DOS emulation environment, these translation procedures were revised appropriately.  However, the database itself remains limited to an extended ASCII character set. 

The extraction program translates most extended ASCII characters within site data pages into HTML entities so as to be independent of font and character set.  Those few special characters which are not handled this way are corrected by hand in the editing step in which they are first seen outside of the DOS emulation environment; simultaneously the files are converted to Unicode (UTF-8).  Thereafter, these alterations are carried forward semi-automatically as described above.  The Fetch utility automatically preserves the Unicode character set during the upload process.  Please report incorrect special characters (which may appear as asterisks or other "garbage" characters) as soon as you find them, using the "mailto" link at the bottom of the page where they appear.  Fetch also translates Mac line ends to Unix line ends during the upload process (see above).
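
The idea of replacing special characters with HTML entities can be sketched in a few lines of Python.  This is not the extraction program's own logic (that program is written in Pascal and works from the database's particular extended ASCII encoding, which is not specified here); the function name is hypothetical, and the sketch starts from text that has already been decoded to Unicode.

```python
from html.entities import codepoint2name


def to_html_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML entity.

    Named entities are used where one exists; otherwise a numeric
    character reference is used.  ASCII characters pass through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)
        elif code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")
        else:
            out.append(f"&#{code};")
    return "".join(out)


print(to_html_entities("á,é,É,ç"))
```

The result contains only plain ASCII, which is why such pages display correctly regardless of the font or character set in use.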

PDF files

Linked from the Hardcopy text data page are various files in Adobe Portable Document Format (PDF), intended to be downloaded for viewing and printing.  They are all contained in the "pdf" directory below the "data" directory, i.e., they all have the "/data/pdf/" path name in the middle of their URLs.  All of these files originated as printable output from the database data-extraction program, though some were produced by processing plain text files which are not, strictly speaking, part of the database itself.  Those print files are moved from the PC emulation environment to a workspace in the Mac's native environment.  On the Mac, they are imported into a WordPerfect template document to add a copyright footer, lightly edited to polish the pagination, and "printed" to PDF files.

Precautions

After verification of a set of changes, all new and revised pages, together with a concurrent revision of the What Is New page, are immediately uploaded to the Website in one batch.  This minimizes the possibility that a visitor to the Website might follow an index link to a site data page which did not agree with the content of the index.

No information extracted from the database is ever altered after extraction, to minimize the risk of inconsistencies or loss of information.  (Occasionally, information from the database may be slightly re-formatted for the sake of appearance.  Also, since the Technical data section of each site data page is a limited interpretation of only certain aspects of the database contents, some editorial changes are occasionally made to overcome those limitations and present such information more accurately.)  Whenever information kept in the database is changed, the affected pages are extracted anew.  (Note the distinction between information and its format.)

Mirroring

A local mirror of the Website is maintained on one of the author's Web-linked computers for test and reference purposes, so that statistics of public access to the "real" Website are not biased by access for development work.  This mirror is distinct from the workspace on the same computer, and is normally updated only when the public Website is updated.

Archiving

Immediately after a set of files is uploaded to the Website, it is also copied into the local mirrors, as well as being copied to archives on three different storage devices for backup purposes.  One of those devices is part of a group which is periodically rotated off-site to provide backup against natural disaster.  The others provide backup against device failure and/or human error.  As a result, it is theoretically possible to reconstruct this Website as it looked at any point in its history.  From a practical point of view, this archive is only used to check the historical content of individual pages.

Process timeline

The batch process which is used for extraction, finishing and upload of site data pages explains in part why additions and corrections sent to the maintainer may not appear promptly on the Website.  Such changes are first collected and organized, then used to update the database itself, as described above.  Then the data-extraction program is run to produce a printable report of the changes, for purposes of proofreading.  After all changes have been confirmed as correct, the extract and upload process described above can take place.

Although this may appear to be a tedious process, it is vital to ensure that the information which we present is as accurate and consistent as we can make it.  If we violated process rules for the sake of speed, we would not only risk introducing discrepancies into what we publish, but also lose track of what is current versus what is not, or else lose the ability to proofread changes to the database effectively.

Database history

As indicated under "Process timeline" (above), the database which underlies the "data" pages of the Website is nothing more than a set of card-image text files, maintained with a plain-text editor.  What is meant by "card-image" is that each record of a file contains nothing but human-readable characters as if it were an 80-column punched card of the sort that once was used on mainframe computers.  In its original form, the database was in fact a box of such cards, and revision with a card punch machine was tedious.  The basic format of those original cards is still in use today, just as it was designed more than 50 years ago.  Although it has been extended and expanded to add new categories of information, it has never been necessary to change it.  Conversion from a box of cards to a set of files on a floppy disk, and now a set of files on a hard drive, has greatly eased the maintenance process.  Not only can changes be made more easily and quickly, but the risk of error has been considerably reduced, because the result of each change is immediately visible on the computer screen.

The program which extracts information from the database has undergone considerably more change.  Originally it was written in the high-level programming language FORTRAN, and itself resided in a box of punched cards.  Several versions of FORTRAN were used, as the program migrated from one mainframe to another.  Eventually, the data (and the program) migrated from actual punched cards to card-image files on reels of half-inch magnetic tape, of the type that used to be common on mainframe systems.

When personal computers became not only affordable but also sufficiently powerful to handle the processing requirements for the database, the program was completely rewritten in Pascal, another high-level programming language.  (Initially, that was on a DEC Rainbow personal computer; later it was converted to Borland Turbo Pascal on an IBM AT/486-compatible personal computer.)  All of the fundamental logic of the original program was retained, including the subprogram structures, though a number of minor implementation details had to be changed.  It was at this point that both program and data became resident on direct-access storage ("hard disk"), eliminating the need for either punched cards or magnetic tape.  The increased ease of editing in this environment affected not only maintenance of the data but also maintenance of the program.  As new categories of information were added to the database, changes were made to the program to display that information appropriately.  The largest single change was the addition of an option to format extracted information as HTML files, i.e., Web pages.  Previously, all program output had been in the form of print files, i.e., files formatted for delivery to a printer for producing paper copy.

Data extraction

The data-extraction program mentioned above, which is used to select and extract information from the database, can be viewed conceptually as having three principal components: a control statement interpreter, a data loader, and a report generator. 

Control statement interpreter

The control statement interpreter accepts input from either the keyboard or a control file.  Control statements can cause loading of a data file, setting of various processing options, selection of data (based on a wide variety of simple or complex criteria), sorting of selected data (based on single or multiple parameters), and generation of several different kinds of reports.  There is online help in the use and format of all control statements, though some can only be used in control files.

Data loader

The data loader reads one flat file and builds three temporary data structures, one in memory and two on disk.  It can be invoked repeatedly to load any number of data files together, subject only to the constraints of available space for the in-memory table.  (Loading additional flat files simply expands the three temporary data structures, as if a single large flat file had been read.)  Unfortunately, the total number of carillons and chimes in the world is so large that it is now impossible to load all data simultaneously.  This makes the production of certain types of reports impractical (e.g., world-wide summaries).

The in-memory table contains all of the condensed technical information found in the flat file, some of it converted from character strings to integers.  This table also has forward and backward pointers which connect all of the rows which describe a particular tower bell instrument, with each row describing a particular stage in the instrument's history.  The newest (or only) row for a particular instrument always describes its present (or last known) configuration, and is considered the primary record; other rows are secondary to it, in reverse chronological order.
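
The linked structure of the in-memory table can be pictured with the following Python sketch.  It is only a model of the idea described above: the real table lives inside a Pascal program, and the field names used here (an instrument identifier, a year and a bell count for each stage) are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Row:
    instrument: str                   # hypothetical instrument identifier
    year: int                         # hypothetical: year of this stage
    bells: int                        # hypothetical: number of bells at this stage
    next_older: Optional[int] = None  # forward pointer to the next (older) row
    prev_newer: Optional[int] = None  # backward pointer to the newer row


def link_history(table: List[Row], indexes: List[int]) -> int:
    """Link the rows of one instrument, newest first.

    Returns the index of the primary (newest) row.
    """
    ordered = sorted(indexes, key=lambda i: table[i].year, reverse=True)
    for newer, older in zip(ordered, ordered[1:]):
        table[newer].next_older = older
        table[older].prev_newer = newer
    return ordered[0]


def history(table: List[Row], primary: int) -> Iterator[Row]:
    """Walk from the primary row through its secondary rows."""
    i: Optional[int] = primary
    while i is not None:
        yield table[i]
        i = table[i].next_older


table = [Row("X", 1950, 23), Row("X", 1999, 49), Row("X", 1975, 35)]
primary = link_history(table, [0, 1, 2])
for row in history(table, primary):
    print(row.year, row.bells)
```

Walking the forward pointers from the primary row visits the instrument's stages in reverse chronological order, as described above.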

One of the on-disk temporary data structures is a sequential list of the condensed technical information records found in the flat file, in un-converted form.  The in-memory table has pointers to the records in this list, which are used to produce Condensed Information Listing (CIL) reports. 

The other on-disk temporary data structure is a sequential list of the textual information records found in the flat file, in their original form.  The in-memory table also has pointers to the records in this list, which are used to produce Master Information Listing (MIL) reports, etc.

Report generator

What is conceptually a single report generator is actually a collection of special purpose report generators.  Which ones will be used in a given batch run, and how each of them will operate, is determined by the control statement interpreter.  It is trivially simple to combine several different reports into a single print file which can be used to make a PDF file.  Several of the possible reports are described on the Hardcopy page, and several combinations of them are available there as PDF downloads. 

The production of Web pages (both site data pages and index pages) and XML files for generating maps (all as described above) is accomplished with one of these report generators.

The future of the database

From time to time, various people have urged the author to convert to use of a conventional relational database program (e.g., Microsoft Access).  The motivation for such a conversion is that it would enable the database to be maintained by anyone who has a reasonable degree of expertise in the use of such a program.  Of course that is in principle an excellent idea; very few (if any) people have the author's peculiar combination of expertise in relatively obscure programming languages and data design, and so such a conversion would make continuation of the author's work after his inevitable death a relatively straightforward matter.  Unfortunately, from the author's viewpoint it is not such a good idea (at least not just now), for two reasons. 

Firstly, the present organization of the data files does not fit the requirements of relational databases.  Relational database programs (RDBs) are excellent for handling data which fit the constraints of the relationships for which they are designed, and many kinds of commonly used data do that.  Nevertheless, RDBs are not a panacea.  On the one hand, they are unnecessarily complex for managing small quantities of simple data.  On the other hand, there are data relationships which can be forced into the "relational" mold only with great difficulty, requiring extraordinary contortions in design and programming and maintenance.  In the author's professional opinion, his existing set of data regarding carillons and chimes falls into this category.  It seems highly probable that significant information might actually be lost in the course of conversion to such a database.

Secondly, such a conversion would have almost no direct benefit for the author, and indeed would be to his detriment.  The total redesign of the database, the conversion of the existing data into an entirely new format, and the construction and debugging of an entirely new report-generator system would be a very large task that would contribute nothing to the current project for extending the scope of the Website to cover all of the carillons and chimes of the world.  Indeed, the time and effort required would considerably delay that project, the basic data for which already exists in the present format.

The one possible direct benefit to the database owner from such a conversion would be the ability to view the entire database at once for production of worldwide summaries and reports.  At present, that is an insufficient incentive for the author.

/signed/   Carl Scott Zimmerman, database owner.


Postscript (Feb. 2024):  A new project has just begun, which may provide a path to a future database.  Modifications to the report-generator program will enable output in JSON files, which could then be ingested by a non-relational database program.  Currently a project design document is under development, along with a JSON schema.  If successful, this would lay the groundwork for transforming the entirety of the present database into a modern format, as well as eliminating all of the deficiencies of the present format.



This page was created 1999/10/04 and last revised 2024/01/22.

Please send comments or questions about this page to csz_stl@swbell.net.