
Using Directories to Organize Web Content

Web content is often organized the same way that directories are organized under DOS or Unix. This allows web pages to be grouped by category, making documents easier to locate and manage.

URLs

A Uniform Resource Locator (URL) is a three-part string that identifies a resource on the World Wide Web. (A URL is actually much more general than its use on the web, but the web will be used as the example here.) When you specify the URL http://www.modsoft.com/piclan/manual.htm you are actually specifying three different elements.

The string before the : indicates the protocol. HTTP stands for "HyperText Transfer Protocol". This is a specification of the communications layer that is used between the web server and browser. Most web applications use the HTTP/1.0 protocol.

After the : is the name of the host system that the request is addressed to. For internet servers, this name consists of two / characters followed by either the web server DNS name or the web server IP address as decimal numbers separated by periods. In this example, the host name is www.modsoft.com which is the equivalent of 207.137.72.146.

After the host name is a string which specifies what resource is to be queried. In this case, the resource is /piclan/manual.htm. With early web servers, the resource name would simply point to a Unix file complete with directory specification. Modern web servers (including PicLan-IP) allow you to create "rules" of how resource names are interpreted and related to file space.
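The three-part split described above can be checked with a few lines of Python. This is only an illustrative sketch using Python's standard urllib.parse module; it is not part of PicLan-IP, and the variable names are chosen here for the example.

```python
from urllib.parse import urlsplit

# Split the example URL from the text into its three parts.
parts = urlsplit("http://www.modsoft.com/piclan/manual.htm")

protocol = parts.scheme    # the string before the ":"
host = parts.hostname      # the host name that follows the "//"
resource = parts.path      # the resource the server is asked for

print(protocol)   # http
print(host)       # www.modsoft.com
print(resource)   # /piclan/manual.htm
```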

PicLan-IP "Directories"

The PicLan-IP Web Server allows you to create resources that appear to the outside world as DOS or Unix style directories. This is allowed even if the web content is stored completely within the MV native file system.

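The next step is to decide where to store web content. PicLan-IP can store web content in three distinctly different manners:

In a local MV file (file type FILE)
In a host file system directory (file type HOST)
In a network directory accessed through a PicLan DSG (file type DSG)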

You choose which storage method is best for your application based on the characteristics and convenience of each. Storing web content directly in an MV file is the fastest solution because it eliminates file search and transfer operations between environments. Storing web content in a network file system can make authoring of content easier because content does not have to be batch transferred to the MV file system. You can also initially store your content in a network directory and then use an import utility (included with PicLan-IP) to move it to a local MV file when development is complete or when you wish to deploy a set of web pages.

Wherever possible, the PicLan-IP Web Server will perform detailed processing on the contents of a document only when the document is created or altered. Subsequent accesses to a document then use the originally computed data. This is referred to as "server-side caching". This behaviour is very important in reducing the overhead that the web server places on the rest of the MV host and in maximizing server speed to the web clients. When you set up the PicLan-IP Web Server, you have several choices that impact the behaviour of server caching:

With web content stored in local MV files, you can choose whether the cache is checked automatically when a document is accessed or updated manually. If you configure access so that the cache is automatically checked, the web server will keep a second copy of your document. Whenever a web document is accessed, this second copy will be compared to your original web document and the cache contents will only be rebuilt if the two copies are different. This is the default configuration. If you configure access so that the cache is not automatically checked, the web server will be unaware of document updates until you run a utility program that manually updates the cache. Non-automatic cache updating is faster and uses less disk space, but it is often less convenient. To help you manage content that is not automatically checked, you can use the utility program CACHE-WEB to ensure that all cache content is up to date.
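The automatic check described above (keep a second copy, compare on every access, rebuild only on a difference) can be sketched in Python as follows. This is an illustration of the idea only, not the PicLan-IP implementation; ShadowCopyCache, the process callable, and the use of in-memory dictionaries are all assumptions made for the example.

```python
class ShadowCopyCache:
    """Rebuild a processed document only when its source text changes."""

    def __init__(self, process):
        self._process = process   # hypothetical "detailed processing" step
        self._shadow = {}         # name -> second copy of the source document
        self._built = {}          # name -> processed (cached) result
        self.rebuilds = 0         # how many times processing actually ran

    def get(self, name, source):
        # Compare the current document with the saved second copy.
        if self._shadow.get(name) != source:
            self._built[name] = self._process(source)
            self._shadow[name] = source
            self.rebuilds += 1
        return self._built[name]


cache = ShadowCopyCache(process=str.upper)
cache.get("index.htm", "<p>hello</p>")   # first access: built
cache.get("index.htm", "<p>hello</p>")   # unchanged: served from the cache
cache.get("index.htm", "<p>hi</p>")      # changed: rebuilt
print(cache.rebuilds)   # 2
```

The second copy is what costs the extra disk space mentioned above; dropping it is only possible if something else (such as a manual update utility) tells the server when to rebuild.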

With web content stored in a remote network directory, you have different options available to you. You can configure for automatic cache updating. In this case, instead of storing a second copy of the data, the web server will store the date and time of the directory entry. If these do not match the current directory entry, the file will be re-imported and the cache rebuilt. You can also configure remote network directories for operation without caching of any kind. This is primarily intended for serving large binary files that are not accessed frequently. Eliminating caching makes web server operations much slower, but reduces disk space requirements dramatically if you are serving large binary files (such as software .ZIP downloads).
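For network directories, the check described above compares a stored date and time instead of a second copy. A minimal Python sketch of that idea follows. It is not PicLan-IP code: TimestampCache is a hypothetical name, and the file's modification time as reported by the operating system stands in for "the date and time of the directory entry".

```python
import os
import tempfile


class TimestampCache:
    """Re-import a file only when its modification time has changed."""

    def __init__(self):
        self._stamp = {}    # path -> modification time at the last import
        self._data = {}     # path -> imported file contents
        self.imports = 0    # how many times the file was actually read

    def get(self, path):
        stamp = os.stat(path).st_mtime_ns
        if self._stamp.get(path) != stamp:
            with open(path, "rb") as f:
                self._data[path] = f.read()
            self._stamp[path] = stamp
            self.imports += 1
        return self._data[path]


with tempfile.TemporaryDirectory() as d:
    page = os.path.join(d, "manual.htm")
    with open(page, "wb") as f:
        f.write(b"v1")
    cache = TimestampCache()
    cache.get(page)                       # first access: imported
    cache.get(page)                       # same timestamp: no re-import
    with open(page, "wb") as f:
        f.write(b"v2")
    os.utime(page, ns=(10**18, 10**18))   # force a clearly different timestamp
    latest = cache.get(page)              # timestamp differs: re-imported

print(cache.imports, latest)
```

Only a timestamp is stored per file, which is why this approach avoids the extra disk space of a second copy.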

Which type of web content storage you use will depend on your needs and system configuration. Fortunately, it is possible (even easy) to configure the PicLan-IP web server so that some content is stored locally and some content is stored on a network directory. This configuration takes place by defining web directories that control how the web server responds to requests.

A web directory is a grouping of web files that are stored with a single set of rules. For a simple web server, you would define a web directory as:

/ FILE WWW.ROOT

The / is the web resource directory portion. The word FILE indicates that this resource is stored in a local MV data file. WWW.ROOT is the name of the local MV file. This file can have any name so long as it is accessible from the PICLAN-IP account.

At this point, it is best to introduce what the web configuration control item looks like. In order to define a web server's functions, you must create this control item. Each line in the control item defines a "virtual directory" that the web server will respond to. The format of each line is:

ip_addr:tcp_port dir_path file_type type_parameters (options

ip_addr and tcp_port indicate which TCP/IP address and port this rule applies to. If you are building a web server that only responds as a single site (that is, not a virtual host), then you would use the IP address of the MV host in dotted-decimal format. The tcp_port that you use is typically port 80 (the default for http), but can be any other decimal number if desired.

dir_path is the directory part of the URL.

file_type is either FILE for a local file, HOST for a host file system directory, or DSG for a file referenced through a DSG.

type_parameters is a collection of words that varies depending on whether the file is local or accessed through a DSG. For local files, this is the name of the dictionary of the MV file. For DSG files, this is the name of the local cache file followed by the name of the PicLan DSG and the root directory that will be accessed. If a DSG file is run in no-cache mode, the file name is specified as RAW and no local file is used.

options control caching and other interpretation of data. If the letter C is included, the cache is not automatically validated. If the letter B is included, embedded BASIC is not interpreted in HTML files. If the letter S is included, then this file is allowed to contain "sub-directories". For a local MV file, sub-directories allow MV items with embedded / characters. For network directories, sub-directories allow network sub-directories to be accessed with a single MV cache file. If sub-directories are specified, the total length of the resource name must be less than the item length limit for the underlying MV file system.
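The line format above can be read mechanically. The following Python sketch splits one line into its fields. It is not part of PicLan-IP: WebDirectory and parse_rule are hypothetical names, and the sketch assumes that fields are separated by spaces and that the options, when present, are the last word and begin with (. The sample line is one of the configuration lines shown in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WebDirectory:
    ip_addr: str
    tcp_port: int
    dir_path: str
    file_type: str
    type_parameters: List[str]
    options: str


def parse_rule(line: str) -> WebDirectory:
    words = line.split()
    options = ""
    # Assumption: a trailing word that starts with "(" holds the option letters.
    if words and words[-1].startswith("("):
        options = words.pop()[1:]
    address, dir_path, file_type, *params = words
    # Split "ip_addr:tcp_port" at the last colon.
    ip_addr, _, port = address.rpartition(":")
    return WebDirectory(ip_addr, int(port), dir_path, file_type, params, options)


rule = parse_rule("207.215.231.99:80 /SRC/ DSG DSG1 O:/WWW/ (SB")
print(rule.dir_path, rule.file_type, rule.type_parameters, rule.options)
# /SRC/ DSG ['DSG1', 'O:/WWW/'] SB

# The option letters described above can then be tested individually.
print("B" in rule.options)   # True: embedded BASIC is not interpreted
print("C" in rule.options)   # False: the cache is validated automatically
```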

The apweb.piclan.com host is currently using the following web configuration:

207.215.231.99:80 / DSG DSG1 O:/WWW/ (S
207.215.231.99:80 /SRC/ DSG DSG1 O:/WWW/ (SB

This allows resources whose names begin with the root / to be read through a DSG (called DSG1) from the drive letter and directory O:/WWW/, including subdirectories. If a resource is specified that begins with /SRC/, the B option indicates that embedded BASIC will not be interpreted (this is how you can actually see the BASIC source pages).
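Because both directory paths in this configuration match any resource that begins with /SRC/, the server has to prefer one rule over the other. The Python sketch below shows one plausible way to do that, by choosing the longest matching dir_path. This selection rule is an assumption made for illustration; the manual does not state how PicLan-IP resolves overlapping rules. RULES and match_directory are hypothetical names, and the resource names are invented examples.

```python
# dir_path -> option letters, taken from the two configuration lines above.
RULES = {
    "/": "S",
    "/SRC/": "SB",
}


def match_directory(resource, rules):
    """Return the longest dir_path that the resource name begins with."""
    candidates = [d for d in rules if resource.startswith(d)]
    return max(candidates, key=len) if candidates else None


for resource in ("/INDEX.HTM", "/SRC/INDEX.HTM"):
    dir_path = match_directory(resource, RULES)
    # The B option means embedded BASIC is left uninterpreted.
    interpreted = "B" not in RULES[dir_path]
    print(resource, dir_path, interpreted)
# /INDEX.HTM / True
# /SRC/INDEX.HTM /SRC/ False
```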
