The NETS Web is the directory tree of HTML files that contain information used by the Network Engineering and Technology Section. The pages are maintained by NETS staff. Pete Siemsen is the "webmaster". This page describes conventions used to organize the NETS Web. Many of these conventions apply to other web trees housed on netserver, including those for the FRGP, BRAN, NPAD and NLR.
NETS Web pages are maintained using several tools, including DreamWeaver, text editors like Emacs and vim, Microsoft Word (not recommended), Excel and PowerPoint, and Visio. DreamWeaver is the recommended tool.
NETS web pages have been maintained by many people over several years, so they don't have a completely consistent look and feel. Nevertheless, we try to use standard headers and footers in most NETS Web pages.
The NETS Web is maintained on the machine named
netserver, in directory
/usr/web/nets. NETS staff members
who are comfortable with Unix simply log in to netserver to edit
the files. Staff members who use Microsoft Windows "share" the
/usr/web directory onto their PCs,
taking advantage of the SAMBA server that runs on netserver to
support this access. The SAMBA server is also accessed by NETS
staff that use DreamWeaver's "site" feature to copy files to/from
their PCs for local editing.
Each night, the entire NETS Web tree is copied from netserver to
the production Web machine named
webpub.ucar.edu. This allows NETS
staff to develop web pages on netserver without affecting the
production NETS Web, so they have a chance to "see how it looks"
before the world sees their changes. See
How the web pages are copied to their servers.
The directory hierarchy in the NETS web pages evolved to its present form over time. Many discussions, arguments and meetings influenced the categories on the main NETS web page. We think we've settled on a good set of main categories, though admittedly "Documents" is a catch-all for everything that doesn't fit in one of the other categories.
Here's a brief overview of the directory hierarchy in the NETS web.
frgp (Front Range GigaPop)
nets
    cgi (programs used by forms)
    datacomm (files protected for NETS-only access)
    devices (major classes of devices)
    docs ("documentation" that doesn't fit elsewhere)
    forms (interactive forms for purchase requests, etc.)
    images (pictures used in other pages, like the NETS logo)
    internal (files protected for UCAR-only access)
    intro (the "About NETS" pages)
    inventory (administrative inventories of NETS assets)
    linkdoc (cron-generated analysis of NETS web broken link problems)
    minutes (minutes of meetings)
    noc (per-machine notification instructions used by the NCAR NOC)
    presentations (PowerPoint files of talks presented by NETS staff)
    projects (current and completed major NETS projects)
    other-sites (websites that NETS might find convenient)
    stats (statistics of UCAR networks)
    tools (programs that NETS finds useful)
    topics (experimental topic-based indices)
    ucar-directories (PeopleSearch databases and yellow and blue pages)
nlr (National Lambda Rail pages)
npad (Network Path and Application Diagnosis pages)
The "internal" directory contains secure webpages - anything that should be seen only by UCAR employees. For example, the NETS "Contacts" page contains home phone numbers of many people, so it isn't for general distribution. The "internal" directory contains a directory hierarchy that is almost identical to the one in the main NETS directory. Here are some of the directories found in "internal".
archives
    projects
        index.html (pointing to annual completed project indices)
        2004
        2005
        2006
    docs
    forms
contacts.html
portlists
projects
(other NETS folders as needed for internal)
vendors
The "archives" directory contains old stuff that we don't consider current, but that we think we might want someday. It lives under "internal" so that we don't have to consider whether our obsolete stuff should be secure or not - it just is.
Like the "internal" directory, "archives" contains a directory hierarchy that is almost identical to the one in the main NETS directory. We did this so that when we "archive" a directory, we can just move it to "archives" and be fairly sure that all its internal relative links remain valid after the move. When we do this, only the links into and out of the moved directory are broken, which minimizes the work that is needed to repair links.
When web pages become merely "old" or "completed", we move them to a directory named "old" or "completed" under the directory that they belong in. For example, under the main "Projects" page, there is a "completed" page. Also, under "Projects/voip", there is a "completed" page for the completed subprojects of the voip project.
When files are copied from netserver to the production web server, the /usr/web/nets/archives directory is not copied. Other directories are, including for example /usr/web/nets/devices/upses/archives.
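The rule above applies only to the top-level archives directory, not to every directory that happens to be named "archives". Here is a minimal sketch of that rule as a shell function. The name is_copied is hypothetical and this is not the code the copy scripts use. With rsync, an anchored exclude pattern such as --exclude=/archives/ would behave the same way, but whether the real script does it that way is an assumption.

```shell
# Hypothetical helper: decides whether a path (relative to /usr/web/nets)
# would be copied to the production server under the rule described above.
is_copied() {
    case "$1" in
        archives|archives/*) echo "no" ;;   # top-level archives tree is skipped
        *)                   echo "yes" ;;  # everything else is copied
    esac
}

is_copied archives/projects/2004/index.html
is_copied devices/upses/archives/index.html
```

The first call prints "no" and the second prints "yes", because only a path that starts with archives is skipped.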
The weblint program is installed on the
netserver machine. It scans one or more HTML files and complains
about basic violations of HTML. See the weblint man page. Weblint
can be run manually, or by the
The linklint program is installed on the netserver
machine. It scans Web directory structures and complains about various problems,
such as broken links. See the
NETS System Administration Documents
for notes on running linklint.
Some cron jobs run linklint on every web tree every weekend. The results are at
Sometimes, it's useful to replace a string with another string in all the NETS web pages. A tool called "rpl" does this. For example, as part of the effort to install standard headers on NETS web pages, I wanted to get rid of the unneeded "index.html" on the end of URLs in the NETS web pages. I ran the following rpl command on 2008-04-21, as root, from /usr/web/nets:
rpl -R -v -x'.html' -x'.htm' -x'.shtml' '/index.html"' '/"' *
The above replaced 14996 strings in 284041 files. I immediately ran linklint to see if I'd caused any damage.
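The trailing double quote in the search string is what limits the replacement to URLs that end in "index.html". Here is a small sketch that shows the effect on sample input. It uses sed rather than rpl, the function name strip_index is hypothetical, and the sample links are made up for illustration.

```shell
# Hypothetical helper: same substitution as the rpl command, applied to stdin.
strip_index() {
    sed 's|/index\.html"|/"|g'
}

# Try it on two sample links, not on the real web tree.
printf '%s\n' \
    '<a href="/nets/devices/index.html">Devices</a>' \
    '<a href="/nets/docs/index.html#top">Docs</a>' | strip_index
```

The first link is shortened to end in "/nets/devices/". The second is left alone, because its "index.html" is followed by "#top" rather than by the closing quote.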
There is Perl code in /usr/web/frgp/fixweb/ on netserver that can help with problems in the NETS web pages.
Pete wrote a Perl program named "rename.pl" that safely renames files in the NETS web tree. When you use the program to rename a file, it finds and changes all the links to the renamed file. The program will also help when you want to move a file from one place to another in the web directory tree. We now have a way to avoid generating broken links every time we protect or archive web pages.
Some web pages are generated automatically by programs every night. If you edit these pages, your changes will be lost when the programs come along and overwrite the files with new contents. Here is a brief list of the web pages that you should never edit, followed by more detailed descriptions:
- Statistics pages
- These include
- Port lists pages
- The portlists pages are generated by a Perl script on netserver named /usr/web/nets/internal/portlists/SwitchMap.pl. If there's something that needs to be fixed on these pages, send an email to Pete Siemsen.
- Contacts pages
- These include
These pages are generated by a Perl script on netserver named /usr/web/nets/internal/convert-contacts/convert-contacts.pl, which reads the main NETS Contacts page.
- All FRGP internal pages, including
- FRGP contacts page. If you need to change contact information on this page, edit the entries in the main NETS Contacts page. To mark a person as an FRGP contact so that they will appear in the FRGP contacts page, put an "FRGP" tag comment on the entry in the NETS Contacts page. At night, your changes will be copied to the FRGP contacts page.
- Authorized contacts pages, such as /usr/web/nets/internal/csm/authorized.html. To change authorized contacts, edit the main NETS Contacts page. Change the marker tags, which say things like "Site contact 1" or "Site contact 2". At night, the entries that are marked will be copied to the authorized contacts pages.
- FRGP members pages, such as /usr/web/nets/internal/csm/
- NETS staff call-list page
- Configuration analysis pages
- These include
On the production web server, if CGI scripts are generating "Internal Server Error" messages, you'll find the Apache error logs at /web/var/weblog/error.log.
Microsoft programs such as Word, Excel and PowerPoint produce non-standard HTML. A program exists to help people who need to repair the HTML files produced by Microsoft Office software. See Demoroniser.
Many NETS staff people use Macromedia Dreamweaver to edit HTML files. Those that do should watch out for a gotcha: if they have Automatic Wrapping set, Dreamweaver will compress the HTML. This makes the HTML file smaller and unreadable by humans and some software. In particular, Perl scripts that parse the NETS contacts page can't deal with a contacts page that has been compressed. All NETS Dreamweaver users should turn off Automatic Wrapping. Here is how to set it in DreamWeaver:
There are several kinds of security mechanisms in use in the NETS web pages:
The production web server machine runs an Apache server configured so that, for any directory path that contains "/internal/", the user's browser prompts the user for their UCAR UCAS username and password. Thus, to protect a Web directory so that people outside UCAR can't read the pages, you put the files in a directory named "internal". This security mechanism is maintained by Joel Daves.
WARNING: The FRGP web server does something totally different. It has a bunch of separate clauses in the /etc/apache/httpd.conf file, one for each file or directory that is protected. This is a source of much confusion, because NETS staff assumes that "internal" directories work the same way on all webservers.
Internal directories are fine for basic security, but don't work for the FRGP "internal" directory, or for Marla's special directories, where she wants users outside NCAR, who don't have UCAS accounts, to have simple username/password authentication. You can't use .htaccess files for this, because they are turned off in the production web server. Instead, use a command like
htpasswd -c pwords <username>
to create a file named pwords allowing a "user" named username to access the directory. You'll be prompted for a password. Then ask Joel Daves to modify the production web server's Apache config to do "basic authorization" on the directory. Tell him the name of the password file and the "username". Henceforth, you'll be able to change the password yourself by running the above command. You will not be able to change the "username" without telling Joel, because the username is needed in the production Apache server configuration file.
We try to protect all sensitive web pages using the mechanisms described above, but it is fairly easy to make mistakes. Such mistakes can quietly leave web pages unprotected. To prevent this, we periodically check the protection of several representative web pages.
As of 2006-02-22, we use the machine named npad, which is outside the UCAR security perimeter. David Mitchell wrote a Python script that attempts to access all the pages that should be secure. The script runs in a cron job as mitchell once a week. The script is /home/mitchell/bin/web_check.py. It emails a report to email@example.com.
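The idea behind such a check can be sketched in a few lines of shell. This is not David's Python script. The function names are hypothetical, and treating 401 or 403 as "protected" and 200 as "unprotected" is an assumption about how the servers respond.

```shell
# Hypothetical helper: interpret the HTTP status returned for a page
# that is supposed to be protected.
classify_status() {
    case "$1" in
        401|403) echo "protected" ;;
        200)     echo "UNPROTECTED" ;;
        *)       echo "unknown ($1)" ;;
    esac
}

# Fetch only the status code for a URL and report the result.
# This needs network access and should be run from outside the perimeter.
check_url() {
    status=$(curl -s -o /dev/null -w '%{http_code}' "$1")
    echo "$1: $(classify_status "$status")"
}

classify_status 401
classify_status 200
```

To test a real page, call check_url with the URL of a page that should prompt for a password, from a machine outside the security perimeter.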
In July 2004, DU requested that the FRGP temporarily host the DU web server while they move their computer center from one building to another. The move was scheduled to last 4 days, from August 19th to the 23rd. The idea was to temporarily add a second IP address to the Ethernet interface on the FRGP web server machine. Then, before the move, change DNS to point to the new IP address. After the move, change it back.
We'd configure the Apache web server on the FRGP machine with an Apache IP-based Virtual Host using the new IP address. That way, during the move, users would perceive no difference when accessing www.du.edu. We'd create an account on the FRGP machine for DU to use to copy their content to the FRGP server and maintain it.
Create an account for DU, so they can maintain their content. Create the root directory for their content and CGI scripts. Create a directory for their log files, and change its group ownership to www-data, so that the web server can write to the log directory.
useradd du -u 1005 -g 96 -c "temporary account so DU can maintain web content" -m
passwd du
mkdir /home/du/htdocs /home/du/logs /home/du/cgi-bin
chown du:datacomm /home/du/htdocs /home/du/cgi-bin
chgrp www-data /home/du/logs
The first step was to establish a second IP address, or "IP alias" on the FRGP machine. I chose the unused IP address 220.127.116.11. To set it up instantly:
ifconfig eth0:0 18.104.22.168 netmask 255.255.255.0
To set it up so it happens at boot time, add this to the /etc/network/interfaces file:
# Temporary second IP address on the same Ethernet card
# Added 2004-07-28 by Pete Siemsen to set up temporary
# hosting of DU's web pages while their web server is down.
auto eth0:0
iface eth0:0 inet static
    address 22.214.171.124
    netmask 255.255.255.0
The Apache server can support more than one tree of HTML files. This is described on the Apache IP-based Virtual Host page.
# Virtual host for University of Denver temporary web
# August 19th through the 23rd, 2004, installed by Pete Siemsen
<VirtualHost 126.96.36.199>
    ServerAdmin firstname.lastname@example.org
    DocumentRoot /home/du/htdocs
    ServerName www.du.edu
    ErrorLog /home/du/logs/error_log
    TransferLog /home/du/logs/access_log
    ScriptAlias /cgi-bin/ "/home/du/cgi-bin/"
    <Directory />
        Order deny,allow
        Allow From All
        Options FollowSymLinks ExecCGI
        AllowOverride None
    </Directory>
</VirtualHost>
As requested at the last BRAN tech meeting, the BRAN website has been mirrored to a server at NOAA-Boulder. This will increase the reliability of access to the BRAN information should NCAR Mesa Lab lose connectivity due to a network outage, i.e. BRAN cut.
The mirroring is accomplished via wget which runs daily to keep the content up to date. The intelligent DNS resolution is done through Cisco Distributed Directors which can determine if a web server is available and only returns the IP address(es) of healthy servers to client requests. Any questions about the process can be directed to the NOAA-Boulder NOC email@example.com.
Marla takes minutes of NLR meetings, and puts them on the NLR website at http://www.nlr.net/board/. She develops the pages on netserver in /usr/web/nlr/board. They get copied to the NLR web server every night. See How the web pages are copied to their servers. All the files in netserver:/usr/web/nlr/board are copied to the NLR web server at webdoc2.grnoc.iu.edu.
NETS maintains the UCAR phone directories, in /usr/web/nets/ucar-directories and /usr/web/nets/internal/ucar-directories. WEG (Markus Stobbs) maintains the headers on some of the pages. The headers say "People Search". When Markus needs to update the headers, he asks us to edit some of the directory files. The files that have Markus's stuff in them are:
Markus will supply a new styles.css file and the HTML for the first table in the .html files.
NETS staff edits web pages on the netserver machine. Every night, cron jobs copy the web pages to their "production" web servers. All the cron jobs that copy web pages are run as "siemsen" except the cron job that copies the NLR web tree, which is run as root. As of 2006-02-16, when I logged in as siemsen and did a "crontab -l" command, I got this:
12 1 * * * /usr/web/copyweb/copy-cyrdas-to-webpub.sh
22 1 * * * /usr/web/copyweb/copy-cyrdaswww-to-webpub.sh
32 1 * * * /usr/web/copyweb/copy-ncab-to-webpub.sh
42 1 * * * /usr/web/copyweb/copy-npad-to-webpub.sh
52 1 * * * /usr/web/copyweb/copy-portlists-to-webpub.sh
12 2 * * * /usr/web/copyweb/copy-accis-to-webpub.sh
# nlr is now copied by a root cron job
#22 2 * * * /usr/web/copyweb/copy-nlr-to-bix.sh
32 2 * * * /usr/web/copyweb/copy-irwin-to-webpub.sh
12 3 * * * /usr/web/nets/internal/convert-contacts/convert-contacts.pl
22 3 * * * /usr/web/copyweb/copy-bran-to-webpub.sh
42 3 * * * /usr/web/copyweb/copy-nets-to-netman.sh
12 4 * * * /usr/web/copyweb/copy-frgp-to-frgp.sh
42 4 * * * /usr/web/copyweb/copy-nets-to-webpub.sh
12 5 * * * /usr/web/copyweb/copy-nets-to-nagman.sh
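In each crontab line, the first field is the minute and the second is the hour, so the jobs above run between 01:12 and 05:12 every night. Here is a small sketch that turns such a listing into a readable schedule. The name cron_times is hypothetical, and it only handles plain numeric minute and hour fields like the ones in this listing.

```shell
# Hypothetical helper: print "HH:MM command" for each active crontab line on stdin.
# Lines that start with "#" are comments or disabled jobs and are skipped.
cron_times() {
    awk '!/^#/ && NF >= 6 { printf "%02d:%02d %s\n", $2, $1, $6 }'
}

printf '%s\n' \
    '# nlr is now copied by a root cron job' \
    '#22 2 * * * /usr/web/copyweb/copy-nlr-to-bix.sh' \
    '42 4 * * * /usr/web/copyweb/copy-nets-to-webpub.sh' | cron_times
```

Piping the output of "crontab -l" into cron_times should list the active jobs in the same way.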
The cron jobs use rsync, and rsync uses ssh to transfer the files. To satisfy CSAC policies for secure file transfer, we adhere to the NETS System Administration Best Practices for SSH and cron jobs. See those pages for a description of how ssh keys are managed for copying pages to servers.
For the NLR web pages, there is a cron job on netserver that runs /usr/web/copyweb/main.sh every night. The main.sh script runs several other scripts, one of which is named copy-nlr-to-nlrserver.sh. That script copies the NLR board web pages from /usr/web/nlr/board/bd/ to the NLR web server named webdoc.grnoc.iu.edu. The script executes an rsync command to do the copy. It uses ssh as the transport mechanism, and requires an account on the NLR web server to exist. The NLR folks have created an account named nlrboard for this purpose.
There are some directories that are explicitly excluded from being copied to other machines every night. Two examples are the NETS and FRGP /stats directories. The NETS /stats directory is simply not copied - it doesn't exist on the production web server. Here's the README.txt file that explains how the FRGP /stats directory is handled.
This text is found in the netserver:/usr/web/frgp/stats/README.txt file and in the frgp-ws-1:/usr/web/stats/README.txt file and in netserver:/usr/web/nets/intro/how-pages-work/index.html, a.k.a. http://netserver.ucar.edu/nets/intro/how-pages-work/. The text explains why the FRGP /stats directory is handled specially by the cron job that copies files from netserver to the FRGP workstation.
A nightly cron job on netserver runs a shell script named /usr/web/copyweb/copy-frgp-to-frgp.sh, which copies web pages from netserver to the FRGP workstation. The shell script is responsible for copying the entire /usr/web/frgp directory tree from netserver to the FRGP workstation. This /stats directory is a subdirectory of /usr/web/frgp, but it gets copied differently than all the other subdirectories under /usr/web/frgp.
We intend the FRGP workstation to have an exact copy of what's on netserver, so we use the "--delete" option on the rsync command that copies /usr/web/frgp. The "--delete" option deletes any files found on the FRGP side that don't exist on the netserver side. The /stats subdirectory is an exception because on the FRGP side, it contains some files that we don't want to be deleted, even though they don't exist on the netserver side. The files are statistics pages generated periodically by software that lives on the FRGP workstation.
To solve this, there are two rsync commands in the cron job shell script, one for the /stats directory, and one for all the others. The rsync command for the /stats directory does not have the "--delete" option, and the other rsync command does.
Thus, files in /usr/web/frgp/stats get copied from netserver to the FRGP workstation, but other files that may exist in /stats on the FRGP workstation are not disturbed. This means that whoever maintains the /stats directory (Scot Colburn as of 2006-02-16) has to remember this.
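The two-command approach described in the README might look roughly like the sketch below. This is not the contents of copy-frgp-to-frgp.sh. The function name, the rsync options other than the delete option, and the destination path (inferred from the README locations above) are all assumptions. Passing "echo" prints the commands instead of running them.

```shell
# Assumed source and destination; the real script may differ.
SRC=/usr/web/frgp
DEST=frgp-ws-1:/usr/web

# Hypothetical sketch. Pass "echo" as the first argument for a dry run
# that prints the commands; pass nothing to actually run rsync.
sync_frgp() {
    run=$1
    # Everything except /stats: mirror exactly, deleting extra files on the far side.
    $run rsync -a --delete --exclude=/stats/ -e ssh "$SRC/" "$DEST/"
    # /stats: copy only, so files generated on the FRGP workstation are left alone.
    $run rsync -a -e ssh "$SRC/stats/" "$DEST/stats/"
}

sync_frgp echo
```

Excluding /stats/ from the first command also keeps rsync from deleting that directory's contents on the receiving side, since rsync does not delete excluded files unless told to.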
Favicons are those little graphic pictures that show up to the left of the URL in many browser "location" boxes. If you place a file named favicon.ico in the root directory of a website, all the pages on the website will display with the favicon. If you place a favicon.ico file in a directory, all pages in that directory will display the favicon. There is a way to specify the favicon file in the HTML of a web page to get per-page favicons.
To create a favicon, I went to http://antifavicon.com/.
If you don't have a file named favicon.ico in your home directory, you'll get messages in your web server's error.log file every time someone's browser tries to get the favicon. This is reason enough to make such a file, even if it contains a blank square, like the one I created.
To improve the "look and feel" of the NETS website, we try to make all the pages have the same top and bottom (header and footer). That way, the header stays the same when a user clicks from page to page - only the content changes. To the user, it feels like they are still on the same website.
To get a standard header and footer on all NETS web pages, we use "server-side includes", or SSI. We put the HTML that represents the standard NETS header and footer into two files named standard-nets-header.html and standard-nets-trailer.html. These files are in the NETS home directory. We include the files in other HTML files using SSI tags like this:
<!--#include virtual="/nets/standard-nets-header.html" -->
<!--#include virtual="/nets/standard-nets-trailer.html" -->
Include tags like this work only in files with names that end in ".shtml", unless the web server is configured to parse other file types too. By default, if the above lines were put into a ".html" or ".htm" file, they'd be ignored.
Besides improving the look and feel of NETS web pages, this scheme allows us to change the headers or footers of all the NETS web pages at once, just by changing a single file.
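A quick way to find pages that don't yet use the standard header and footer is to look for the two include tags. Here is a minimal sketch. The function name has_standard_includes is hypothetical, and it only checks that both tags appear somewhere in the file, not that they are in the right place.

```shell
# Hypothetical helper: succeed only if the file contains both standard include tags.
has_standard_includes() {
    grep -qF '<!--#include virtual="/nets/standard-nets-header.html" -->' "$1" &&
    grep -qF '<!--#include virtual="/nets/standard-nets-trailer.html" -->' "$1"
}

# Try it on a throwaway sample page.
page=$(mktemp)
printf '%s\n' \
    '<!--#include virtual="/nets/standard-nets-header.html" -->' \
    '<p>Page content</p>' \
    '<!--#include virtual="/nets/standard-nets-trailer.html" -->' > "$page"
if has_standard_includes "$page"; then
    echo "standard header and footer found"
else
    echo "missing include"
fi
rm -f "$page"
```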
Using SSI goes a long way towards making pages look the same, but
it's not the whole story. Each NETS page has a title which appears
in two places: the "title" tag and the main title of the webpage.
We chose to standardize the main title along with the header. For
example, the first few lines of the NETS "Devices" page look like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!--#include virtual="/nets/standard-nets-header.html" -->
The first 11 lines of all NETS web pages should look like this, with the 2 instances of the word "Devices" replaced with the title of the web page. This standard produces a nice effect when clicking through the NETS web - the header and the horizontal rule on every page stay the same, but the title and the content change.
To make the web server on netserver match the behavior of the production CISL server, I edited the /etc/apache/srm.conf file to look like this:
AddType text/html .shtml .html .htm
AddHandler server-parsed .shtml .html .htm
DirectoryIndex index.html index.htm index.shtml
To make the server handle SSI in Apache2 (which we run on nagman), you have to enable the mod_include module. To do so, do this as root:
$ a2enmod include
$ service apache2 restart
The NETS header contains a light blue line that reads "Browse NETS topics", followed by 26 links, one for each letter of the alphabet. These web pages started out as glossary lists that define the meanings of various arcane networking acronyms. Most of the original entries came from a glossary maintained by Jeff Custard. Jeff now uses these pages instead of his own glossary. They are a convenient place to make definitions or links to anything.