
How the NETS Web pages work

Introduction

The NETS Web is the directory tree of HTML files that contain information used by the Network Engineering and Technology Section. The pages are maintained by NETS staff. Pete Siemsen is the "webmaster". This page describes conventions used to organize the NETS Web. Many of these conventions apply to other web trees housed on netserver, including those for the FRGP, BRAN, NPAD and NLR.

NETS Web pages are maintained using several tools, including DreamWeaver, text editors like Emacs and vim, Microsoft Word (not recommended), Excel and PowerPoint, and Visio. DreamWeaver is the recommended tool.

NETS web pages have been maintained by many people over several years, so they don't have a completely consistent look and feel. Nevertheless, we try to use standard headers and footers in most NETS Web pages.

NETS web pages are meant to convey content efficiently, without glitz. We try to keep complexity to a minimum. This makes web pages load fast and requires little supporting software on either the user's browser or the server. Thus, in general, we don't use frames, JavaScript, PHP, Flash, or even images except where necessary. We use server-side includes to standardize headers and footers, and we use CSS a little bit in the port lists and in a few other places.

Where the NETS Web is located

The NETS Web is maintained on the machine named netserver, in directory /usr/web/nets. NETS staff members who are comfortable with Unix simply log in to netserver to edit the files. Staff members who use Microsoft Windows "share" the /usr/web directory onto their PCs, taking advantage of the SAMBA server that runs on netserver to support this access. The SAMBA server is also accessed by NETS staff that use DreamWeaver's "site" feature to copy files to/from their PCs for local editing.

Each night, the entire NETS Web tree is copied from netserver to the production Web machine named webpub.ucar.edu. This allows NETS staff to develop web pages on netserver without affecting the production NETS Web, so they have a chance to "see how it looks" before the world sees their changes. See How the web pages are copied to their servers.

How the NETS web is organized

The directory hierarchy in the NETS web pages evolved to its present form over time. Many discussions, arguments and meetings influenced the categories on the main NETS web page. We think we've settled on a good set of main categories, though admittedly "Documents" is a catch-all for everything that doesn't fit in one of the other categories.

Here's a brief overview of the directory hierarchy in the NETS web.

frgp (Front Range GigaPop)
nets
    cgi (programs used by forms)
    datacomm (files protected for NETS-only access)
    devices (major classes of devices)
    docs ("documentation" that doesn't fit elsewhere)
    forms (interactive forms for purchase requests, etc.)
    images (pictures used in other pages, like the NETS logo)
    internal (files protected for UCAR-only access)
    intro (the "About NETS" pages)
    inventory (administrative inventories of NETS assets)
    linkdoc (cron-generated analysis of NETS web broken link problems)
    minutes (minutes of meetings)
    noc (per-machine notification instructions used by the NCAR NOC)
    presentations (PowerPoint files of talks presented by NETS staff)
    projects (current and completed major NETS projects)
    other-sites (websites that NETS might find convenient)
    stats (statistics of UCAR networks)
    tools (programs that NETS finds useful)
    topics (experimental topic-based indices)
    ucar-directories (PeopleSearch databases and yellow and blue pages)
nlr (National Lambda Rail pages)
npad (Network Path and Application Diagnosis pages)

Internal (secure) directories

The "internal" directory contains secure webpages - anything that should be seen only by UCAR employees. For example, the NETS "Contacts" page contains home phone numbers of many people, so it isn't for general distribution. The "internal" directory contains a directory hierarchy that is almost identical to the one in the main NETS directory. Here are some of the directories found in "internal".

       archives
          projects
             index.html (pointing to annual completed project indices)
             2004
             2005
             2006
          docs
          forms
       contacts.html
       portlists
       projects (other NETS folders as needed for internal)
       vendors

Archiving old web pages

The "archives" directory contains old stuff that we don't consider current, but that we think we might want someday. It lives under "internal" so that we don't have to consider whether our obsolete stuff should be secure or not - it just is.

Like the "internal" directory, "archives" contains a directory hierarchy that is almost identical to the one in the main NETS directory. We did this so that when we "archive" a directory, we can just move it to "archives" and be fairly sure that all its internal relative links remain valid after the move. When we do this, only the links into and out of the moved directory are broken, which minimizes the work that is needed to repair links.

When web pages become merely "old" or "completed", we move them to a directory named "old" or "completed" under the directory that they belong in. For example, under the main "Projects" page, there is a "completed" page. Also, under "Projects/voip", there is a "completed" page for the completed subprojects of the voip project.

When files are copied from netserver to the production web server, the /usr/web/nets/archives directory is not copied. Other directories are, including for example /usr/web/nets/devices/upses/archives.

Programs that help manage the NETS web pages

weblint

The weblint program is installed on the netserver machine. It scans one or more HTML files and complains about basic violations of HTML. See the weblint man page. Weblint can be run manually, or by the verify-html.pl script.

linklint

The linklint program is installed on the netserver machine. It scans Web directory structures and complains about various problems, such as broken links. See the NETS System Administration Documents for notes on running linklint.

Some cron jobs run linklint on every web tree every weekend. The results are at

Replacing strings in NETS Web pages

Sometimes, it's useful to replace a string with another string in all the NETS web pages. A tool called "rpl" does this. For example, as part of the effort to install standard headers on NETS web pages, I wanted to get rid of the unneeded "index.html" on the end of URLs in the NETS web pages. I ran the following rpl command on 2008-04-21, as root, from /usr/web/nets:

rpl -R -v -x'.html' -x'.htm' -x'.shtml' '/index.html"' '/"' *

The above replaced 14996 strings in 284041 files. I immediately ran linklint to see if I'd caused any damage.
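
For readers who don't have rpl at hand, the same kind of tree-wide replacement can be sketched in Python. This is a minimal sketch, not the tool used above: the function name replace_in_tree and its dry_run parameter are hypothetical, and it only looks at files whose names end in one of the given suffixes.

```python
import os


def replace_in_tree(root, old, new, suffixes=(".html", ".htm", ".shtml"), dry_run=True):
    """Replace `old` with `new` in every matching file under `root`.

    Returns a dict mapping each affected file path to the number of
    occurrences found. With dry_run=True (the default) nothing is written.
    """
    changed = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(tuple(suffixes)):
                continue
            path = os.path.join(dirpath, name)
            # latin-1 maps every byte to a character, so old pages with
            # unknown encodings round-trip without decoding errors.
            with open(path, "r", encoding="latin-1", newline="") as handle:
                text = handle.read()
            count = text.count(old)
            if count == 0:
                continue
            changed[path] = count
            if not dry_run:
                with open(path, "w", encoding="latin-1", newline="") as handle:
                    handle.write(text.replace(old, new))
    return changed
```

Calling replace_in_tree("/usr/web/nets", '/index.html"', '/"') first, with the default dry_run=True, reports what would change without touching any file. Running linklint after the real pass is still a good idea.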

fixweb

There is Perl code in /usr/web/frgp/fixweb/ on netserver that can help with problems in the NETS web pages.

Renaming NETS Web pages

Pete wrote a Perl program named "rename.pl" that safely renames files in the NETS web tree. When you use the program to rename a file, it finds and changes all the links to the renamed file. The program will also help when you want to move a file from one place to another in the web directory tree. We now have a way to avoid generating broken links every time we protect or archive web pages.

Automated Web pages

Some web pages are generated automatically by programs every night. If you edit these pages, your changes will be lost when the programs come along and overwrite the files with new contents. Here is a brief list of the web pages that you should never edit, followed by more detailed descriptions:

Details about automatically-generated web pages:
Statistics pages
The Statistics pages are generated by Cricket software. If there's something that needs to be fixed on these pages, send an email to Dave Mitchell or John Hernandez.

Port lists pages
The portlists pages are generated by a Perl script on netserver named /usr/web/nets/internal/portlists/SwitchMap.pl. If there's something that needs to be fixed on these pages, send an email to Pete Siemsen.

Contacts pages
These include
  • All FRGP internal pages, including
    • FRGP contacts page. If you need to change contact information on this page, edit the entries in the main NETS Contacts page. To mark a person as an FRGP contact so that they will appear in the FRGP contacts page, put an "FRGP" tag comment on the entry in the NETS Contacts page. At night, your changes will be copied to the FRGP contacts page.
    • Authorized contacts pages, such as /usr/web/nets/internal/csm/authorized.html. To change authorized contacts, edit the main NETS Contacts page. Change the marker tags, which say things like "Site contact 1" or "Site contact 2". At night, the entries that are marked will be copied to the authorized contacts pages.
    • FRGP members pages, such as /usr/web/nets/internal/csm/
  • NETS staff call-list page
These pages are generated by a Perl script on netserver named /usr/web/nets/internal/convert-contacts/convert-contacts.pl, which reads the main NETS Contacts page.

Configuration analysis pages
These pages are generated by a Perl program on netserver named ~siemsen/check-configs/check-configs.pl. If there's something that needs to be fixed on these pages, send an email to Pete Siemsen.

Errors

On the production web server, if CGI scripts are generating "Internal Server Error" messages, you'll find the Apache error logs at /web/var/weblog/error.log.

Demoroniser

Microsoft programs such as Word, Excel and PowerPoint produce non-standard HTML. A program exists to help people who need to repair the HTML files produced by Microsoft Office software. See Demoroniser.

DreamWeaver

Many NETS staff people use Macromedia Dreamweaver to edit HTML files. Those who do should watch out for a gotcha: if they have Automatic Wrapping set, Dreamweaver will compress the HTML. This makes the HTML file smaller and unreadable by humans and some software. In particular, Perl scripts that parse the NETS contacts page can't deal with a contacts page that has been compressed. All NETS Dreamweaver users should turn off Automatic Wrapping. Here is how to turn it off in Dreamweaver:

  1. Edit --> Preferences
  2. Category --> Code Format
  3. Disable the 'Automatic Wrapping' box
  4. Click 'OK'

Security

There are several kinds of security mechanisms in use in the NETS web pages:

  1. No security
  2. "internal" directories. On the production web server, if the browser is not in 128.117.x.x, Apache will require UCAS passwords. This is for pages that are meant to be accessed by UCAR staff, but not outsiders. On the FRGP web server, Apache will require the FRGP username/password for these files. WARNING: Be careful: the top-level "internal" directory gets deleted and overwritten every night by the netserver cron job that runs the /usr/web/nets/internal/convert-contacts/convert-contacts.pl script, so don't put files there!
  3. Password-protected pages that have a unique username/password. These are typically used when Marla wants to share some sensitive content with a set of people. She then tells them the username/password. This form of security has nothing to do with whether the people have UCAS passwords or not. On the Westnet web page and on the WRN "protected" page, a file named "pwords" holds the password needed to access the page. The Apache web server on nets.ucar.edu is configured specifically to look for the pwords files for these directories.
  4. Files that simply aren't copied from the development server to the production server.

internal directories

The production web server machine runs an Apache that is configured such that, for any directory path that contains "/internal/", the user's browser will prompt the user for their UCAR UCAS username and password. Thus, to protect a Web directory so that people outside UCAR can't read the pages, you put the files in a directory named "internal". This security mechanism is maintained by Joel Daves.

WARNING: The FRGP web server does something totally different. It has a bunch of separate clauses in the /etc/apache/httpd.conf file, one for each file or directory that is protected. This is a source of much confusion, because NETS staff assumes that "internal" directories work the same way on all webservers.

Password-protected

Internal directories are fine for basic security, but don't work for the FRGP "internal" directory, or for Marla's special directories, where she wants users outside NCAR, who don't have UCAS accounts, to have simple username/password authentication. You can't use .htaccess files for this, because they are turned off in the production web server. Instead, use a command like

htpasswd -c pwords <username>
to create a file named pwords allowing a "user" named username to access the directory. You'll be prompted for a password. Then ask Joel Daves to modify the production web server's Apache config to do "basic authorization" on the directory. Tell him the name of the password file and the "username". Henceforth, you'll be able to change the password yourself by running the above command. You will not be able to change the "username" without telling Joel, because the username is needed in the production Apache server configuration file.

Verifying security of web pages

We try to protect all sensitive web pages using the mechanisms described above, but it is fairly easy to make mistakes. Such mistakes can quietly leave web pages unprotected. To prevent this, we periodically check the protection of several representative web pages.

As of 2006-02-22, we use the machine named npad, which is outside the UCAR security perimeter. David Mitchell wrote a Python script that attempts to access all the pages that should be secure. The script runs in a cron job as mitchell once a week. The script is /home/mitchell/bin/web_check.py. It emails a report to wc@ucar.edu.
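
The kind of check described above can be sketched in a few lines of Python. This is not the contents of web_check.py, which are not shown here. It is a minimal sketch under stated assumptions: the function names fetch_status and find_unprotected are hypothetical, the list of URLs to test is supplied by the caller, and a page is treated as protected only if the server answers 401 or 403 to a request that carries no credentials. Any other answer, including a 404, is reported so that a person can look at it. Network failures are not caught and will stop the run.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for an unauthenticated GET of `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx answers; the code is what we want.
        return error.code


def find_unprotected(urls, fetch=fetch_status):
    """Return the URLs that answered without demanding credentials.

    `fetch` is any callable that maps a URL to an HTTP status code, which
    makes the check easy to exercise without touching the network.
    """
    unprotected = []
    for url in urls:
        if fetch(url) not in (401, 403):
            unprotected.append(url)
    return unprotected
```

Such a script only gives a meaningful answer when it runs from a machine outside the UCAR security perimeter, since requests from inside 128.117.x.x are not challenged.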

Virtual hosts on the FRGP web server

In July 2004, DU requested that the FRGP temporarily host the DU web server while they move their computer center from one building to another. The move was scheduled to last 4 days, from August 19th to the 23rd. The idea was to temporarily add a second IP address to the Ethernet interface on the FRGP web server machine. Then, before the move, change DNS to point to the new IP address. After the move, change it back.

We'd configure the Apache web server on the FRGP machine with an Apache IP-based Virtual Host using the new IP address. That way, during the move, users would perceive no difference when accessing www.du.edu. We'd create an account on the FRGP machine for DU to use to copy their content to the FRGP server and maintain it.

Setting up the du account

Create an account for DU, so they can maintain their content. Create the root directory for their content and CGI scripts. Create a directory for their log files, and change its group ownership to www-data, so that the web server can write to the log directory.

		useradd du -u 1005 -g 96 -c "temporary account so DU can maintain web content" -m
		passwd du
		mkdir /home/du/htdocs /home/du/logs /home/du/cgi-bin
		chown du:datacomm /home/du/htdocs /home/du/cgi-bin
		chgrp www-data /home/du/logs
	

Setting up the IP alias

The first step was to establish a second IP address, or "IP alias" on the FRGP machine. I chose the unused IP address 192.43.217.15. To set it up instantly:

ifconfig eth0:0 192.43.217.15 netmask 255.255.255.0

To set it up so it happens at boot time, add this to the /etc/network/interfaces file:

	  # Temporary second IP address on the same Ethernet card
	  # Added 2004-07-28 by Pete Siemsen to set up temporary
	  # hosting of DU's web pages while their web server is down.
	  auto eth0:0
	  iface eth0:0 inet static
	  address 192.43.217.15
	  netmask 255.255.255.0
    

Setting up the Apache virtual host

The Apache server can support more than one tree of HTML files. This is described on the Apache IP-based Virtual Host page.

# Virtual host for University of Denver temporary web
# August 19th through the 23rd, 2004, installed by Pete Siemsen
<VirtualHost 192.43.217.15>
    ServerAdmin  siemsen@ucar.edu
    DocumentRoot /home/du/htdocs
    ServerName   www.du.edu
    ErrorLog     /home/du/logs/error_log
    TransferLog  /home/du/logs/access_log
    ScriptAlias  /cgi-bin/ "/home/du/cgi-bin/"
<Directory />
    Order deny,allow
    Allow From All
    Options FollowSymLinks ExecCGI
    AllowOverride None
</Directory>
</VirtualHost>
    

BRAN website

Subject: Mirroring of BRAN website
Date: 08 Jan 2004 09:35:29 -0700
From: Alex Hsia
To: bran-tech@branfiber.net

As requested at the last BRAN tech meeting, the BRAN website has been mirrored to a server at NOAA-Boulder. This will increase the reliability of access to the BRAN information should NCAR Mesa Lab lose connectivity due to a network outage, i.e. BRAN cut.

The mirroring is accomplished via wget which runs daily to keep the content up to date. The intelligent DNS resolution is done through Cisco Distributed Directors which can determine if a web server is available and only returns the IP address(es) of healthy servers to client requests. Any questions about the process can be directed to the NOAA-Boulder NOC nb-noc@noaa.gov.

Alex Hsia

NLR website

Marla takes minutes of NLR meetings, and puts them on the NLR website at http://www.nlr.net/board/. She develops the pages on netserver in /usr/web/nlr/board. They get copied to the NLR web server every night. See How the web pages are copied to their servers. All the files in netserver:/usr/web/nlr/board are copied to the NLR web server at webdoc2.grnoc.iu.edu.

People Search

NETS maintains the UCAR phone directories, in /usr/web/nets/ucar-directories and /usr/web/nets/internal/ucar-directories. WEG (Markus Stobbs) maintains the headers on some of the pages. The headers say "People Search". When Markus needs to update the headers, he asks us to edit some of the directory files. The files that have Markus's stuff in them are:

Markus will supply a new styles.css file and the HTML for the first table in the .html files.

How the web pages are copied to their servers

Copying in general

NETS staff edits web pages on the netserver machine. Every night, cron jobs copy the web pages to their "production" web servers. All the cron jobs that copy web pages are run as "siemsen" except the cron job that copies the NLR web tree, which is run as root. As of 2006-02-16, logging in as siemsen and running a "crontab -l" command showed:

12 1 * * * /usr/web/copyweb/copy-cyrdas-to-webpub.sh
22 1 * * * /usr/web/copyweb/copy-cyrdaswww-to-webpub.sh
32 1 * * * /usr/web/copyweb/copy-ncab-to-webpub.sh
42 1 * * * /usr/web/copyweb/copy-npad-to-webpub.sh
52 1 * * * /usr/web/copyweb/copy-portlists-to-webpub.sh
12 2 * * * /usr/web/copyweb/copy-accis-to-webpub.sh
# nlr is now copied by a root cron job
#22 2 * * * /usr/web/copyweb/copy-nlr-to-bix.sh
32 2 * * * /usr/web/copyweb/copy-irwin-to-webpub.sh
12 3 * * * /usr/web/nets/internal/convert-contacts/convert-contacts.pl
22 3 * * * /usr/web/copyweb/copy-bran-to-webpub.sh
42 3 * * * /usr/web/copyweb/copy-nets-to-netman.sh
12 4 * * * /usr/web/copyweb/copy-frgp-to-frgp.sh
42 4 * * * /usr/web/copyweb/copy-nets-to-webpub.sh
12 5 * * * /usr/web/copyweb/copy-nets-to-nagman.sh

The cron jobs use rsync, and rsync uses ssh to transfer the files. To satisfy CSAC policies for secure file transfer, we adhere to the NETS System Administration Best Practices for SSH and cron jobs. See those pages for a description of how ssh keys are managed for copying pages to servers.

Copying NLR pages

For the NLR web pages, there is a cron job on netserver that runs /usr/web/copyweb/main.sh every night. The main.sh script runs several other scripts, one of which is named copy-nlr-to-nlrserver.sh. That script copies the NLR board web pages from /usr/web/nlr/board/bd/ to the NLR web server named webdoc.grnoc.iu.edu. The script executes an rsync command to do the copy. It uses ssh as the transport mechanism, and requires an account on the NLR web server to exist. The NLR folks have created an account named nlrboard for this purpose.

Not copying stats pages

There are some directories that are explicitly excluded from being copied to other machines every night. Two examples are the NETS and FRGP /stats directories. The NETS /stats directory is simply not copied - it doesn't exist on the production web server. Here's the README.txt file that explains how the FRGP /stats directory is handled.

This text is found in the netserver:/usr/web/frgp/stats/README.txt file and in the frgp-ws-1:/usr/web/stats/README.txt file and in netserver:/usr/web/nets/intro/how-pages-work/index.html, a.k.a. http://netserver.ucar.edu/nets/intro/how-pages-work/. This text explains why the FRGP /stats directory is handled specially by the cron job that copies files from netserver to the FRGP workstation.

A nightly cron job on netserver runs a shell script named /usr/web/copyweb/copy-frgp-to-frgp.sh, which copies web pages from netserver to the FRGP workstation. The shell script is responsible for copying the entire /usr/web/frgp directory tree from netserver to the FRGP workstation. This /stats directory is a subdirectory of /usr/web/frgp, but it gets copied differently than all the other subdirectories under /usr/web/frgp.

We intend the FRGP workstation to have an exact copy of what's on netserver, so we use the "--delete" option on the rsync command that copies /usr/web/frgp. The "--delete" option deletes any files found on the FRGP side that don't exist on the netserver side. The /stats subdirectory is an exception because on the FRGP side, it contains some files that we don't want to be deleted, even though they don't exist on the netserver side. The files are statistics pages generated periodically by software that lives on the FRGP workstation.

To solve this, there are two rsync commands in the cron job shell script, one for the /stats directory, and one for all the others. The rsync command for the /stats directory does not have the "--delete" option, and the other rsync command does.

Thus, files in /usr/web/frgp/stats get copied from netserver to the FRGP workstation, but other files that may exist in /stats on the FRGP workstation are not disturbed. This means that whoever maintains the /stats directory (Scot Colburn as of 2006-02-16) has to remember this.
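
The two-pass copy that the README describes can be sketched as follows. This is not the contents of copy-frgp-to-frgp.sh, which are not shown here. It is a minimal sketch in Python, assuming that the first pass keeps /stats out of the mirrored copy with rsync's --exclude option. The function names and the dry_run parameter are hypothetical. The rsync option that removes extra files on the receiving side is spelled --delete.

```python
import subprocess


def build_rsync_commands(source, destination):
    """Return two rsync argument lists: everything except stats, then stats."""
    common = ["rsync", "-a", "-e", "ssh"]
    # Mirror the tree exactly, but leave /stats out of this pass. Because
    # --delete-excluded is not given, --delete also leaves the excluded
    # directory alone on the receiving side.
    everything_else = common + [
        "--delete",
        "--exclude=/stats/",
        source + "/",
        destination + "/",
    ]
    # Copy /stats without --delete so that files generated on the
    # receiving side survive.
    stats_only = common + [source + "/stats/", destination + "/stats/"]
    return [everything_else, stats_only]


def run_copy(source, destination, dry_run=True):
    """Print the two commands, or run them when dry_run is False."""
    for command in build_rsync_commands(source, destination):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

For example, run_copy("/usr/web/frgp", "frgp-ws-1:/usr/web") prints both commands without running them. The destination shown is inferred from the README path above and the ssh account is left out, so treat both as assumptions.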

Favicons

Favicons are those little graphic pictures that show up to the left of the URL in many browser "location" boxes. If you place a file named favicon.ico in the root directory of a website, all the pages on the website will display with the favicon. If you place a favicon.ico file in a directory, all pages in that directory will display the favicon. There is a way to specify the favicon file in the HTML of a web page to get per-page favicons.

To create a favicon, I went to http://antifavicon.com/.

If you don't have a file named favicon.ico in your home directory, you'll see messages in your web server's error.log file every time someone's browser tries to get the favicon. This is reason enough to make such a file, even if it contains a blank square, like the one I created.

Server-side includes for standard headers and footers

To improve the "look and feel" of the NETS website, we try to make all the pages have the same top and bottom (header and footer). That way, the header stays the same when a user clicks from page to page - only the content changes. To the user, it feels like they are still on the same website.

To get a standard header and footer on all NETS web pages, we use "server-side includes", or SSI. We put the HTML that represents the standard NETS header and footer into two files named standard-nets-header.html and standard-nets-trailer.html. These files are in the NETS home directory. We include the files in other HTML files using SSI tags like this:

<!--#include virtual="/nets/standard-nets-header.html" -->
<!--#include virtual="/nets/standard-nets-trailer.html" -->

Include tags like this work only in files with names that end in ".shtml". If the above lines were put into a ".html" or ".htm" file, they'd be ignored.

Besides improving the look and feel of NETS web pages, this scheme allows us to change the headers or footers of all the NETS web pages at once, just by changing a single file.

Using SSI goes a long way towards making pages look the same, but it's not the whole story. Each NETS page has a title which appears in two places: the "title" tag and the main title of the webpage. We chose to standardize the main title along with the header. For example, the first few lines of the NETS "Devices" page look like this:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
        <head>
            <title>Devices</title>
        </head>
        <body bgcolor="white">

            <!--#include virtual="/nets/standard-nets-header.html" -->

            <br><font size="+3"><strong><center>Devices</center></strong></font>
            <hr>
	        

The first 11 lines of all NETS web pages should look like this, with the 2 instances of the word "Devices" replaced with the title of the web page. This standard produces a nice effect when clicking through the NETS web - the header and the horizontal rule on every page stay the same, but the title and the content change.
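
A page can be checked against this convention mechanically. The following is a minimal sketch in Python, not an existing NETS tool: the function name check_page_top is hypothetical, and it assumes that the main title is the text inside the first "center" element, as in the Devices example above.

```python
import re

HEADER_INCLUDE = '<!--#include virtual="/nets/standard-nets-header.html" -->'


def check_page_top(html):
    """Return a list of problems found in the top of a NETS page."""
    problems = []
    flags = re.IGNORECASE | re.DOTALL
    title = re.search(r"<title>(.*?)</title>", html, flags)
    heading = re.search(r"<center>(.*?)</center>", html, flags)
    if title is None:
        problems.append("missing <title> tag")
    if HEADER_INCLUDE not in html:
        problems.append("missing standard header include")
    if heading is None:
        problems.append("missing centered main title")
    # The two instances of the page title are supposed to match.
    if title and heading and title.group(1).strip() != heading.group(1).strip():
        problems.append("title tag and main title differ")
    return problems
```

An empty list means the top of the page follows the standard. The check is textual, so a page that spells the include tag with different spacing would be reported even though the server would accept it.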

To make the web server on netserver match the behavior of the production CISL server, I edited the /etc/apache/srm.conf file to look like this:

AddType text/html .shtml .html .htm
AddHandler server-parsed .shtml .html .htm
DirectoryIndex index.html index.htm index.shtml

To make the server handle SSI in Apache2 (which we run on nagman), you have to enable the mod_include module. To do so, do this as root:

$ a2enmod include
$ service apache2 restart

CSS

As of 2010-12, we don't use much CSS in the NETS web pages. That said, there is a file named "styles.css" in the main NETS directory. It contains some code needed to handle the header navigation used by the "Directories" page at /usr/web/nets/ucar-directories, and some classes for formatting source code in our web pages. An example of a NETS page that uses CSS is

NETS "topics"

The NETS header contains a light blue line that reads "Browse NETS topics", followed by 26 links, one for each letter of the alphabet. These web pages started out as glossary lists that define the meanings of various arcane networking acronyms. Most of the original entries came from a glossary maintained by Jeff Custard. Jeff now uses these pages instead of his own glossary. They are a convenient place to make definitions or links to anything.


Address comments or questions about this Web page to the Network Engineering & Telecommunications Section (NETS) by opening a ticket at netshelp@ncar.ucar.edu. The NETS is part of the Computational & Information Systems Laboratory (CISL) of the National Center for Atmospheric Research (NCAR). NCAR is managed by the University Corporation for Atmospheric Research (UCAR). This website follows the UCAR General Privacy Policy and the NCAR/UCAR/UCP Terms of Use. The National Center for Atmospheric Research is sponsored by the National Science Foundation (NSF). Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the National Science Foundation.