Sunday, October 31, 2010

Dealing with Disconnection

I sometimes work from home over a VPN connection. I also have a small home network of about four computers of various types, all of which route through my BSD box to the Internet. However, like most people, I travel back and forth to the office each day. Sometimes I don't want to stop what I'm working on remotely just because it's time to leave.

I use ssh from my BSD desktop to connect to the computers I work on. They are all remotely hosted, and I have never actually seen any of the boxes. Usually I am editing files with vi(1), managing Asterisk servers from the console, or setting up OpenSIPS processes.

My biggest problem working remotely is not the lack of speed. High-speed Internet really solved that, unlike the old days when I first started doing this. I remember dealing with that: push a key, go eat lunch, no real problem. The biggest problem now is getting disconnected. It's really frustrating when I am in the middle of a project and the VPN drops, or the WiFi loses signal because someone turns on the microwave, and I lose all my work. vi(1) can sometimes be forgiving, and I can usually recover part of my work, but it's far from ideal.

Back when dialup was the only option, I would often get disconnected and log back in and the program I was working on was still running. Using w(1), I could see that the server still thought that I was logged in and hadn't terminated the program.

I was still working from dialup back when a friend of mine introduced me to screen(1). He ranted and raved about how wonderful screen(1) was and how the world needed to know about it. I've been using it ever since.

Saturday, October 30, 2010

DragonFlyBSD 2.8 Released

DragonFlyBSD 2.8 was just released.  You can download it here.

From the Release Notes:

Big-ticket items


Return of the GUI - The 2.8 release includes a larger 4G USB image with a working X environment and full sources in addition to the standard 700M ISO and 1G USB images.
Crypto support - A cryptsetup compatible cryptographic device mapper target was written for DragonFly. This means that it is now possible to encrypt DragonFly partitions (e.g., HAMMER and UFS). While it is possible to only encrypt any partition like /home/, it is also possible to encrypt the whole root file system. The latter is especially useful for mobile devices. It is also possible to encrypt the swap partition while still being able to dump a kernel core. Further, the code is SMP aware, so expect a speedup if using multi-core machines and don't have cryptographic hardware support.
Packet Filter (pf) - Pf was updated to a version based upon OpenBSD 4.2. The previous version of pf in DragonFly was based on OpenBSD 3.5. This, in addition to laying the ground for further following OpenBSD's implementation, introduced several performance gains: Information like route-to, altq, tags, etc. are now stored in the mbuf header directly. This was partially already the case up to DragonFly 2.6, but now the implementation corresponds to OpenBSD's. Furthermore, an often unnecessary checksumming was removed, which gains another 10% performance. Also state tables and interface bound states were reimplemented and the pf_test_*() functions were folded into pf_test_rule() to make things clearer. DragonFly-specific additions, support for fairq packet queueing and pickups, have remained intact.
WiFi Stack Update - FreeBSD's WiFi (802.11) network stack has been ported. While not all WiFi drivers have been ported the ability to port drivers from FreeBSD much more easily will allow us to ultimately add support for more and newer WiFi devices in the near future.
MP Performance - The multiprocessor work that has been ongoing in DragonFly is really starting to bear fruit. The MPLOCK (The primary lock, that when held ensures only a single cpu is operating within the kernel) has been pushed back significantly with this release. Most of the frontend code now uses soft tokens instead of the MPLOCK, though for safety these particular soft tokens still acquire the MPLOCK. We will be phasing out the safety feature as work progresses. More importantly, HAMMER now runs with a per-mount lock and has specific optimizations to run 100% MPSAFE in the cached read & stat paths. Much of the system backend including the buffer cache, the networking subsystem (protocol stacks and netif drivers), and the AHCI disk driver are now completely MP-safe and do not acquire the MPLOCK at all. For most intents and purposes the system is running MP-safe. I don't want to sell this short because large portions of the core infrastructure have been MP-safe for years. But now those MP-safe paths for the first time can reach all the way from userland to the device drivers on the backend.

Friday, October 29, 2010

Invite to MeetBSD 2011

Matt Olander invites us to attend MeetBSD 2011:
Come discuss all the BSD flavors with your peers next week at MeetBSD
California! It's on Friday and Saturday, November 5th & 6th at
Hacker Dojo, in Mountain View, California, USA.

We have an interactive Unconference on the first day. This means that
the attendees will get to decide the topics in real time.
For the second day, a more traditional format of speakers and
works-in-progress will be followed. It's highly hackable, informative,
and fun.

Of course, a legendary BSD-party featuring special guests, activities,
and entertainment will occur Saturday evening at the Dojo ;)

Thanks to our generous sponsors, the cost is only $25 USD which includes
snacks, lunches, and admission to the after-party.


If you are planning on attending, please reserve your space now:
http://www.meetbsd.com

Wednesday, October 27, 2010

Philip Paeps - FreeBSD, Detangling and debugging


Philip recommends debugging without using the debug tools.

"Debugging is universally anticipated with distaste, performed with reluctance and bragged about forever"  -- anonymous.

One of the biggest drawbacks to using the debug tools is losing an entire day rebuilding the system to include debug symbols, only to find that the problem was a simple typo you could have caught with five minutes of critical thinking and some code review.

Suggestions to debug without the debugger

  • Printfs are boring.  Instead, when your program crashes, have it print a stack trace.
  • Cookies -- Declare an unsigned long as a global variable and use it as a poor man's running stack trace.  Write to it (fiddle with the bits) in the different subsystems to keep track of where you have been.  Works great for embedded systems.
  • GCC is your friend.  Don't silence the compiler with a cast; fix the problem.
  • Use gcc -E  -- It runs only the preprocessor and prints the preprocessed source.
  • Know your -W flags.  Use -Werror to make the build stop on warnings, then fix them.  Lots of problems go away when you fix the warnings.
  • Use GCC instrumentation  -- Very useful in userspace, not so much in the kernel.
  • Do an object dump with objdump(1). -- Useful, but you need to know a lot to use it.  Also, you have to remove the -fomit-frame-pointer flag on Intel platforms, or this process is useless.  You can use this method to disassemble your program and figure out where it crashed.  Very useful in trace analysis.
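
To make the -E and -W suggestions a little more concrete, here is a rough sketch of my own (it is not from Philip's talk). The scratch directory, the demo.c file, and its contents are all made up for illustration, and it assumes a C compiler is reachable as cc or through $CC. If none is found, it just says so.

```shell
#!/bin/sh
# Scratch directory and file names below are made up for this sketch.
work=$(mktemp -d /tmp/gccdemo.XXXXXX)

cat > "$work/demo.c" <<'EOF'
#define ANSWER 42
int main(void)
{
    int unused;          /* triggers an unused-variable warning under -Wall */
    return ANSWER - 42;
}
EOF

cc=${CC:-cc}
if command -v "$cc" >/dev/null 2>&1; then
    # -E stops after preprocessing, so the ANSWER macro is already expanded.
    "$cc" -E "$work/demo.c" | grep 'return'

    # -Wall enables the warning; -Werror promotes it to a hard error.
    if "$cc" -Wall -Werror -o "$work/demo" "$work/demo.c" 2>"$work/build.log"; then
        result="built"
    else
        result="rejected"
    fi
    echo "build with -Wall -Werror: $result"
else
    result="skipped"
    echo "no C compiler found, skipping"
fi
```

The compiler's complaint ends up in build.log inside the scratch directory. Deleting the unused variable is the fix; casting or silencing it is the thing to avoid.
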

Summary

  • Try not to debug; try to think first.
  • Take shortcuts.  You have already broken something; cheating won't make it worse.
  • Remember who your friends are, like nm and objdump.
  • Document your clever tricks.

Tuesday, October 26, 2010

New Episode of The BSD Show: John Hixson


The guys at The BSD Show try an impromptu program with an iXsystems developer.  Originally, he was hired to do the Flash port to FreeBSD, but Adobe withdrew and didn't cooperate, which is why Flash doesn't work natively on BSD.  John has been working with FreeNAS and PC-BSD recently.

They discuss the future of sysinstall and all the die-hard sysinstall users who won't migrate to MSI files.
There is a lot of good sysinstall vs. PC Sysinstall conversation, and the case is made that it's time for FreeBSD to migrate to PC Sysinstall.

PC Sysinstall handles a lot of the new features such as ZFS, gmirror, GPT partitions, etc.  It doesn't support PXE booting yet, but it will.  It does have cool stuff like ZFS root.

PC Sysinstall has support for scripting, so you could use PC Sysinstall as a backend and script your own front end for whatever you were doing.

Much discussion, swearing and planning for MeetBSD ensues.

Monday, October 25, 2010

Sławek Żak - NoSQL

He gives a good explanation of what databases do well and what they don't do well.  Then he gets into how NoSQL makes things better.

Sunday, October 24, 2010

SSH Primer

When you ask how you can make your BSD box more secure, the first thing people will tell you is to use SSH if you aren't already. If you are new to BSD or Unix in general, you might still be mastering the art of logging in to the console and not have given a thought to logging in remotely. If you are using Mac OS X, you may not have even realized that you can log in remotely.

When you log in at the console, BSD gives you a login: prompt and asks for your user name. You are then prompted for your password. If you give both correctly, it logs you in and presents you with a shell prompt. The alternative is a GUI login such as xdm, kdm, or Mac OS X. The login process is the same; however, you are presented with a windowing system instead of the shell. To get to a shell from there, you need to run an xterm, or the Terminal app in Mac OS X, which is found in the Utilities folder.

From the shell prompt, you can use the ssh utility. In its most basic form, ssh gives you secure console access to another computer. All the traffic between the two computers is encrypted, so it can't be read even if it is intercepted in transit. Previously a utility called telnet handled the same duties, but it did so over an unencrypted protocol, which made it very easy for attackers to intercept telnet sessions and discover the passwords used to remotely manage systems.

OpenBSD, NetBSD, FreeBSD, Mac OS X, and Darwin all come with OpenSSH installed as part of the base installation. OpenSSH was developed by the OpenBSD project and has quickly become the de facto standard for ssh. If you don't like using ssh as a command line, people have developed GUI front ends to ssh, but I'm not covering that in this article.

Saturday, October 23, 2010

AsiaBSDCon 2011 -- Call for Papers

Hiroki Sato is calling for papers for AsiaBSDCon 2011.

It will be held on March 17-20, 2011 in Tokyo, Japan. That would be a fun conference to attend. I've been to Asia before, but not Japan specifically.

The details on submitting a paper for the conference can be found at http://2011.asiabsdcon.org/.  Submission deadline is December 20th, 2010.

I'm going to have to go back and watch all the videos from last year to catch up.

Friday, October 22, 2010

Remove .svn entries from your web site

I have several websites that are managed by Subversion. I didn't realize until I ran a security scan of my web server that the .svn directories were exposed to the public.

I did a search and found this code to add to my httpd.conf file.


<DirectoryMatch "/\.svn">
    Deny from all
</DirectoryMatch>


I added that code and restarted apache and it worked like a charm.
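
If you want to know how many .svn directories are sitting under a document root in the first place, find(1) will list them. This is only a sketch: it builds a throwaway directory tree with made-up names so it can run anywhere. On a real server you would point find at your actual document root instead.

```shell
#!/bin/sh
# Throwaway tree standing in for a web document root (names are made up).
docroot=$(mktemp -d /tmp/docroot.XXXXXX)
mkdir -p "$docroot/site/.svn" "$docroot/site/images/.svn"

# List every directory named .svn below the document root.
found=$(find "$docroot" -type d -name .svn | sort)
printf '%s\n' "$found"
```

Every path it prints is a directory a browser could have fetched before the DirectoryMatch block went in.
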

Thursday, October 21, 2010

error: the --with-apr parameter is incorrect

If you are upgrading Apache, note that the apr port has moved up a major version. The old version you have installed is now apr1 in the ports tree, and the new version is just apr. However, tools like portupgrade don't notice the change.

You have to manually remove the old port to get things working again.

# cd /usr/ports/devel/apr1
# make deinstall clean

This fixed my errors and allowed me to get apache22 upgraded successfully.

Wednesday, October 20, 2010

pcre error with php5-filter and php5-zip

I upgraded to PHP 5.3.3 and found that pcre is now bundled into PHP 5 instead of being a separate add-on in the BSD ports/pkgsrc collection. This caused a couple of problems when I tried to upgrade all the modules I was using on FreeBSD 7.3.

I had to force an extra include path so it could find the pcre.h file that was missing.

# cd /usr/ports/archivers/php5-zip/
# make install CFLAGS=-I/usr/local/include

I did the same thing for php5-filter:

# cd /usr/ports/security/php5-filter/
# make install CFLAGS=-I/usr/local/include

I expect they will fix the ports before too long.

Tuesday, October 19, 2010

Advances in Embedded ARM processors, for performance


Dimitri works for Marvell, a semiconductor company. Most of the chips they make have to do with networking. I was surprised to learn how many ARM CPUs there are out there. ARM processors are optimized for cost but still deliver very high performance, reaching up to 2 GHz as of this video.

Marvell makes a point of working with compilers like GCC and operating systems like BSD and Linux to make sure all the advanced features of the CPU can be taken advantage of.

Marvell also makes a plug computer: a small computer that plugs directly into an outlet. It seems to be a micro server that fits in a plug and is designed to work in a home setting. The entire plug computer consumes less than 15 watts of power.

Kris Moore and PCBSD

ZFS in FreeBSD, by Pawel Jakub Dawidek

Keynote, Peter Losher, Internet Systems Consortium, AsiaBSDCon 2008

Using FreeBSD to Promote Open Source Development Methods, Brooks Davis, ...

GEOM - in Infrastructure We Trust, Pawel Jakub Dawidek, AsiaBSDCon 2008

Reducing Lock Contention in a Multi-Core System, Randall Stewart, AsiaBS...

Tracking FreeBSD in a Commercial Setting, M. Warner Losh

Send and Receive of File System Protocols: Userspace Approach With puffs...

BSD Implementations of XCAST6, Yuji Imai

Logical Resource Isolation in the NetBSD Kernel, Kristaps Džonsons

A Portable iSCSI Initiator, Alistair Crooks

OpenBSD Network Stack Internals, Claudio Jeker

Ken Caruso, Using BSD in SchmooCon Labs (DCBSDCon 2009)

Robert Luciani, M:N Threading in DragonFly BSD (DCBSDCon 2009)

Kurt Miller, Implementing pie on OpenBSD (DCBSDCon 2009)

Isolating Cluster Jobs for Performance and Predictability, Brooks Davis ...

Epitome, Marco Peereboom (DCBSDCon 2009)

OpenBSD vs SMP, Threading, and Concurrency, Ted Unangst

network perimeter redundancy with pfsense, chris buechler

Jason Dixon Closing Remarks of DCBSDCon - BSD is Still Dying

faster packets: performance tuning in the openbsd network

Philip Paeps, Crypto Acceleration on FreeBSD, AsiaBSDCon2009

Constantine A. Murenin, OpenBSD Hardware Sensors Framework

M. Warner Losh, An Overview of FreeBSD/mips, AsiaBSDCon2009

R. Jaworowski, FreeBSD on hi-perf. multi-core embedded PPC

D. Gwynne, Active-Active Firewall Cluster Support in OpenBSD

K. Dzonsons, Deprecating groff for BSD manual display

A. Zakharchenko, Mail system for distributed network

Work-in-Progress Session in AsiaBSDCon 2009

A. Kantee: Environmental Independence: BSD Kernel TCP/IP

A. Rao: The Locking Infrastructure in the FreeBSD kernel #1

A. Rao: The Locking Infrastructure in the FreeBSD kernel #2

Isolating Cluster Users for Performance and Predictability

Mohamad Dikshie Fauzie: FreeBSD and SOI-Asia Project

Kris Moore: PC-BSD - Making FreeBSD on the Desktop a reality

AsiaBSDCon 2009: Internet Mail — Past, Present, and (a bit of) the Future

AsiaBSDCon 2009:The OpenBSD Release Process: A Success Story

Slackathon 2009: OpenBSD physical memory management

Slackathon 2009: OpenBSD on sun4v - virtualization done right

Slackathon 2009: Active-Active - high performance firewalling with PF

Slackathon 2009: Faster Packets - Performance Tuning in the OpenBSD netw...

Slackathon 2009: Libssh2

Slackathon 2009: Schizophrenic Firewalls - virtualized network stack and...

Slackathon 2009: A quick show of things that make Bob scream

Slackathon 2009: Hacking VFS in OpenBSD

Kris Moore: PC-SYSINSTALL - A new system installer backend for PC-BSD an...

Alexandre Ratchov: OpenBSD audio & MIDI framework for music and desktop ...

Ana Kukec: Native SeND kernel API for *BSD

Claudio Jeker: vscsi(4) and iscsid -- iSCSI initiator the OpenBSD way

Peter Losher: Closing the DNS Security Loop with DNSSEC

Paul Schenkeveld: Minimizing service windows on servers using NanoBSD + ...

Simon Perreault: Ecdysis: Open-Source DNS64 and NAT64

Antti Kantee: Rump Device Drivers: Shine On You Kernel Diamond

Takuya ASADA: SMP Implementation for OpenBSD/sgi

Ryan McBride: What's wrong with PF

Constantine A. Murenin: Quiet Computing with BSD

Rui Paulo: Wireless Mesh Networks under FreeBSD

Marco Peereboom: Epitome2: dedup for the masses

Brooks Davis: Porting HPC Tools to FreeBSD

Marco Peereboom: Softraid: OpenBSD's virtual HBA, with benefits

Massimiliano Stucchi: BSD in the routing industry

George Neville-Neil: Hardware Performance Monitoring Counters on non-X86...

Journaled Soft-Updates, Dr. Kirk McKusick, BSDCan 2010

Attilio Rao - The VFS/vnode interface in the FreeBSD kernel

Dru Lavigne - Update on BSD Certification

Hans Peter Selasky - The new USB stack in FreeBSD

Jakub Klama - FreeBSD on DaVinci DMSoC (polish)

Jan Srzednicki - What ideas can FreeBSD borrow from AIX?

Martin Matuska - mfsBSD

Nikolay Aleksandrov - FreeBSD-based solution for Internet traffic manage...

Monday, October 18, 2010

Redirecting in PHP and Apache

There are a couple of ways to redirect a webpage.  The first consideration is whether this is a permanent or a temporary redirect.  Each has its own HTTP code.  Permanent is 301 and temporary is 302.

Permanent Redirect in PHP

<?php 
    header( "HTTP/1.1 301 Moved Permanently" ); 
    header( "Location: http://www.example.com/newpage.php" ); 
    die(); 
?>

Temporary Redirect in HTML


<META HTTP-EQUIV="refresh" CONTENT="10; URL=http://example.com/newpage.php">

It delays 10 seconds before taking them to the new page, allowing them to see the contents of your page before going to the new page.

Permanent Redirect in Apache


Add the following line to httpd.conf

Redirect permanent /foo http://www.example.com/bar


You can redirect a directory, a single page, or the entire site.


Sunday, October 17, 2010

What to do AFTER you have BSD installed

The first time you install BSD, you face a huge learning curve. Unless you come from a UNIX environment, it's a totally new way of thinking. There are several things that people assume you know and therefore leave out of the documentation, or they have arranged the documentation in such a way that you have to know about it to find it.

This happens a lot in the UNIX world. The man(1) pages, which are the primary source of UNIX online documentation, assume that you already know which command you are trying to learn about and how it is spelled. (Commands found in the manual are often tagged with their manual section in parentheses, like this: man(1).  The 'man' command is found in section 1 of the manual. Try typing 'man man' for more information on the 'man' command.)

If you don't already know the command you are trying to learn, the man pages won't help much. Another place people seem to cut documentation short is immediately after installation.

The install process on BSD is getting easier and easier, so more people with fewer UNIX skills are getting through it. Because of this, there seems to be a growing number of people who get BSD installed and have no clue what to do when presented with the first login prompt.

Darwin/BSD (localhost) (console)

  login:

Because BSD is a highly secure operating system, the newly installed system is useless to a new user who doesn't know the default login name and password. The command line interface is also foreign to most new-generation users.

Friday, October 15, 2010

Creating an SSL Certificate on FreeBSD for ApacheSSL

This is just a quick syntax guide for creating a CSR for Apache SSL.  I do this a couple of times a year for the different domains I manage and I have to look up the syntax each time.  So, I'm recording it here so I can look it up easily.

Creating A CSR

A CSR is a Certificate Signing Request.  It's what you send to the SSL provider so they can create your actual certificate.

sudo openssl req -new -key /path/to/your/private/ssl/server.key \
-out /path/to/your/new/signingrequest.csr

The server.key is the private key for your server.  It identifies your computer and should be locked with a password, so that if someone breaks in and copies it, they can't use it without the password.  If evildoers manage to get hold of your server key, they can pretend to be your website and people will trust them, because they have your ID.

The only drawback to using a password-protected key is that Apache will not start until you enter the password manually.  If you have Apache set to load at boot, it can actually keep you from logging into the server, because the boot process stops and waits for you to enter the password.
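
For reference, the whole round trip from private key to CSR can be sketched non-interactively. Treat this as a rough sketch rather than my exact procedure: the scratch directory, the file names, and the www.example.com subject are placeholders, and the key is generated without a password so the script can run unattended. To get the password-protected key described above, add a cipher option such as -aes256 to the openssl genrsa line and it will prompt for a passphrase.

```shell
#!/bin/sh
# Scratch directory so no real key material is touched (name is made up).
dir=$(mktemp -d /tmp/csrdemo.XXXXXX)

# Generate a 2048-bit RSA private key (unencrypted, for the demo only).
openssl genrsa -out "$dir/server.key" 2048 2>/dev/null

# Create the signing request; -subj answers the usual prompts up front.
openssl req -new -key "$dir/server.key" \
    -subj "/CN=www.example.com" \
    -out "$dir/signingrequest.csr"

# Read the request back to check the subject before sending it off.
openssl req -in "$dir/signingrequest.csr" -noout -subject
```

The .csr file is what goes to the SSL provider. The .key file never leaves the server.
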

FreeBSD RC Scripts

The real problem is that SSH hasn't started at the time apache is trying to load.  One way to solve this problem is to add/modify the following line in the FreeBSD rc.d script for apache.

# REQUIRE: LOGIN cleanvar sshd

The REQUIRE field tells the RC system to wait for certain things to happen before loading. Since I use sshd to login remotely, I absolutely want to make sure that sshd is running before apache tries to load and things get stuck.

Yes, the # is part of the line. I worried the first time I tried it that it was commented out and I would have to remove the # to make it work.  But it works fine with the # sign, because it is parsed by the RC scripts as a directive instead of by the shell parser as a command to be executed.
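
You can see both views of that line with a throwaway file. In this sketch the script name and its contents are made up, and sed is only standing in for a tool like rcorder(8), which reads the REQUIRE keyword out of the comment block to work out startup order.

```shell
#!/bin/sh
# A throwaway stand-in for an rc.d script; the name and body are made up.
script=$(mktemp /tmp/rcdemo.XXXXXX)
cat > "$script" <<'EOF'
#!/bin/sh
# PROVIDE: apache_demo
# REQUIRE: LOGIN cleanvar sshd
echo "starting"
EOF

# To the shell the REQUIRE line is only a comment, so running the
# script executes nothing but the echo.
shell_output=$(sh "$script")
echo "shell ran: $shell_output"

# A tool that reads the file as text still sees the directive.
requires=$(sed -n 's/^# REQUIRE: //p' "$script")
echo "ordering tool sees: $requires"

rm -f "$script"
```
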

Normally, I don't bother running secure Apache from the rc scripts.  I just start Apache manually, but just in case, I like to make sure it won't hang the server at boot.


This does have one side effect, though: if sshd isn't running for any reason, you won't be able to start Apache using the rc scripts.  You can still start it using apachectl.

Thursday, October 14, 2010

Setting up Apache SSL with Multiple Certificates on FreeBSD behind a PFSense firewall using NAT

I've set up SSL before, but only for one domain at a time.  This time, I was setting up SSL to run through a PFSense box doing load balancing.  I have three servers in the web cluster, all running the same software, so I planned to set up SSL the same way I set up regular HTTP access, using name-based virtual hosting.  I already had five different sites running on the web cluster with no problems, and the PFSense setup was extremely easy.

However, when I added the SSL certificates to the setup, I started getting errors from the client web browsers.  They complained that the site certificate didn't match.  And it was true: Apache was serving one SSL cert for all the domains.

I googled around and figured out that you can't use name-based virtual hosting for SSL like you can with regular HTTP.  You have to switch to IP-based virtual hosting and give each SSL cert its own IP address.

In PFSense, under Firewall -> Virtual IP, I created a new external IP address for the example2.com domain, since I already had an IP created for the example1.com domain.

Then I went to Services -> LoadBalancer to create the pools.  A pool is the list of internal servers that will handle the web requests.  I had already created a pool for example1.com, served by 10.10.1.5, 10.10.1.6, and 10.10.1.7.  I labeled it "example1 secure".

Wednesday, October 13, 2010

Backing up Key Files

Not everyone has access to a tape drive backup system big enough to make daily copies of all the files on all their servers. Even if they do have access to such a system, making a complete backup of every file can be a waste of tape.

Most system administrators choose to back up only key files and user data. Then, when it comes time to restore the operating system, they use the original install media to do the bulk of the work. It's even possible to do software upgrades in this manner.

The theory is that after backing up all the right data, you re-install using your newest OS install CD set instead of the version you used originally. Then, after a quick restore of the key files and user data, you should be up and running. The tricky part comes in knowing which files are "key" files and which ones you can live without. Some stuff is easy to regenerate on a new system, but things like the tape backup driver that took 10 hours to troubleshoot should never be left behind.

Oh Kernel, My Kernel!

You should always back up the kernel. Most administrators have created a custom kernel to allow for special drivers and specific setups on their systems.

FreeBSD calls it /boot/kernel/kernel (it's a good idea to back up the entire /boot directory)
NetBSD calls it /netbsd
OpenBSD calls it /bsd

However, backing up just the kernel won't necessarily get everything. Some systems use loadable kernel modules. If you have installed a custom module, you will need to back that up as well. Normally you will be able to get all the standard modules from the install CDs.

If you are upgrading to a new version, the kernel file won't really be much of a benefit. What you really need to back up is the custom kernel configuration file you used to create your kernel, since the config file is what you use to build a new kernel binary. If you are upgrading to a major release, you may need to merge in some changes from the new release.

On most BSD systems you can find the kernel config files in /usr/src/sys/arch/$ARCH/conf for NetBSD and OpenBSD, or in /usr/src/sys/$ARCH/conf for FreeBSD, where $ARCH is the platform or architecture of the system (for example, i386 or sparc64). Consult your kernel compiling documentation if you can't remember where you left your config file.

System Configuration Files - /etc

The /etc directory contains all the important system configuration files. The directory is usually very small and easy to back up. If you don't back up the entire directory, you will at least want to get the following files:

/etc/rc.conf
    Main system configuration file.
/etc/master.passwd
    The master password file.
/etc/group
    Contains all your user groups.
/etc/fstab
    Your filesystem layout table. If you fdisk your drive when you upgrade, you may not want to restore this file, but it will be invaluable during a system recovery procedure.
/etc/inetd.conf
    The internet daemon configuration file. This file has become less important recently, since most of its services come disabled by default; however, if you are using it, you will want to back it up.
/etc/XF86Config
    The XFree86 configuration file; it could also be located in /etc/X11 or /usr/X11R6/etc/X11.

You will also want to back up your mail config files. Generally this will be /etc/sendmail.cf (or /etc/mail/sendmail.cf); however, you may have upgraded to Postfix or Qmail, which usually place their configuration under /usr/local/etc/postfix and /var/qmail/control respectively (if you are not sure, consult the documentation for the mail system you are using).
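
Collecting the files named above into a single archive could look something like this. It is a minimal sketch, not a full backup procedure: the archive location under /tmp is a placeholder, and files that are missing or unreadable on your system (master.passwd is only readable by root, for example) are simply skipped.

```shell
#!/bin/sh
# Archive location is a placeholder; point it at your real backup media.
archive=${ARCHIVE:-/tmp/keyfiles-backup.tar.gz}

# Candidate key files from the list above; anything unreadable is skipped.
keep=""
for f in /etc/rc.conf /etc/master.passwd /etc/group /etc/fstab /etc/inetd.conf
do
    if [ -r "$f" ]; then
        keep="$keep $f"
    fi
done

# $keep is left unquoted on purpose so each path is a separate argument.
tar -czf "$archive" $keep

# List what was captured, as a quick sanity check.
tar -tzf "$archive"
```

tar stores the paths without the leading slash, so restoring from / puts everything back where it came from.
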

Local Configuration Files - /usr/local/etc

Much of the third-party software you install puts its configuration files in a common directory, usually /usr/local/etc. This generally makes things easy to back up. Another directory you will want to pay attention to is the rc.d directory. It contains startup scripts for various software packages you install. Many of these files can be regenerated easily by re-installing the package, but of course you should back up anything you have changed.

Conclusion

As with any backup situation, it's always a good idea to test your procedures before needing to use them in real life. This also helps you be sure you have backed things up on accessible media. There are times when I print out some of the configuration files just to preserve the settings for later use, just in case. And don't forget to keep an install CD set around if you aren't backing up all the system files.

Webalizer

It is increasingly common to have several virtual domains running on a single server. Apache makes this extremely easy to configure, and FreeBSD is such a powerful server that the number of domains is limited mostly by available bandwidth.

Weblogs are an important part of running a website. A good traffic analysis can help you figure out where your traffic is coming from and where you need to promote your site to increase traffic. It also helps convince advertisers to advertise on your site if you can reliably show how large an audience you reach.

Since Daemon News is so broken up into virtual domains, I found it quite hard to get a reliable snapshot of traffic to the box. I've been using Webalizer to parse and graph our weblogs, but it has a few quirks with regard to processing previous information from multiple sites.

Webalizer runs in two modes: regular and incremental. The incremental mode is designed to run every day and keep track of where it left off. If it encounters data older than where it thinks it is, it discards the data. I found that trying to cat together the logs from several sites resulted in only the first site being recorded: the other sites' entries covered the same time period as the first, so they looked like old data and were not processed.

The solution was to sort the logs before processing them with Webalizer. My first thought was to sort the entire logfile collection, but this proved impractical. The data was several gigabytes and difficult to sort reliably, and a simple sort command line wasn't sufficient: it would require two or more sorting passes or a more complex algorithm. So I opted for a simpler solution, since the logs were already partially sorted by default.

I ended up collecting all the data for a single day from every site, then sorting that by hour and minute. It took a few tries to get sort to grab the right context. And since the days were already in chronological order, I could do this with pipes and not have to make extra copies of the data.

Here is the resulting script.

#!/bin/sh

# Get the filename for each day for the year 2003
cd /usr/local/weblogs/ezine/ || exit 1
for i in access.2003*.gz
do

# Change back to the directory where the script is running and
# where the webalizer config file is located.
cd /usr/home/chrisc/webstats || exit 1

# Collect each site's weblogs one day at a time into one large temp file
zcat "/usr/local/weblogs/ezine/$i" > tmp
zcat "/usr/local/weblogs/daily/$i" >> tmp
zcat "/usr/local/weblogs/www/$i" >> tmp
zcat "/usr/local/weblogs/magazine/$i" >> tmp
zcat "/usr/local/weblogs/search/$i" >> tmp

# Sort on the text after the "[" that opens the timestamp. Every entry in
# the temp file is from the same day, so this orders them by time.
# (-k2 is the current spelling of the obsolete "+1" key syntax.)
sort -t'[' -k2 tmp | webalizer

# Clean up the tmp file so it's empty for next time
rm tmp

done

This is just a quick script and could use better temp file management, but it illustrates the steps needed to get the right results from webalizer for multiple sites. There are probably other ways to get the same results and your mileage may vary.
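
If you want to check the sort step on its own before pointing it at real logs, something like the following works as a rough test. The three log lines are invented for the example; the only thing that matters is that they share a date and arrive out of order, the way entries from different sites would.

```shell
#!/bin/sh
# Three made-up log lines from one day, deliberately out of order,
# standing in for entries pulled from different sites.
logs=$(mktemp /tmp/logsort.XXXXXX)
cat > "$logs" <<'EOF'
10.0.0.1 - - [28/Nov/2003:18:49:05 -0800] "GET /ezine/ HTTP/1.0" 200 512
10.0.0.2 - - [28/Nov/2003:07:15:42 -0800] "GET /daily/ HTTP/1.0" 200 1024
10.0.0.3 - - [28/Nov/2003:12:00:00 -0800] "GET /search/ HTTP/1.0" 200 256
EOF

# Split each line at "[" and sort on everything after it. Within a single
# day the date prefix is identical, so this orders the lines by time.
sorted=$(LC_ALL=C sort -t'[' -k2 "$logs")
printf '%s\n' "$sorted"

rm -f "$logs"
```

Note that this only works within a single day. Across days the month names would sort alphabetically rather than chronologically, which is one more reason to process the logs a day at a time.
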

Symbolic Links

There comes a time when it's handy to have a file exist in more than one place on your file system; however, it's not practical to copy it to each place. Aside from the added disk space, you would quickly lose track of which file was the latest copy. Under certain circumstances you could use CVS or RCS to maintain the several copies, but that would still require updating each copy every time a change was committed.
A symbolic link accomplishes the same thing as multiple copies of a file, without requiring any actual copies of the file. The link merely places an additional name for the file in another part of the filesystem. No file copying is involved. Two different names both reference the same data.
There are two kinds of links: hard links and symbolic links. A hard link is indistinguishable from the original file and must exist on the same filesystem partition as the original file. It truly becomes an alternate name for the file. Deleting a hard link to a file does not delete the file until the last link is deleted. Since there is no difference between the original filename and the hard links, it makes no difference which filename remains; deleting the original filename doesn't delete the file as long as one link remains to provide access to the data.
When you try to link between two partitions or link to a directory, you have to use a symbolic link. A symbolic link behaves similarly to a hyperlink like you would find on a web page. It places an alternate name for the file that points to the location of the file being linked to. Much like web links, if you delete the destination or original file, the link becomes a broken link.
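A quick way to convince yourself of the hard link behavior is to try it in a scratch directory. This is only a sketch; the names original and alias are made up, and it assumes mktemp(1) is available.

```sh
#!/bin/sh
# Sketch: two hard-linked names share one inode, and the data
# survives until the last name is removed.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

echo "some data" > original
ln original alias            # no -s, so this is a hard link

# ls -i prints the inode number first; both names report the same one.
inode_original=$(ls -i original | awk '{print $1}')
inode_alias=$(ls -i alias | awk '{print $1}')
echo "original: $inode_original  alias: $inode_alias"

rm original                  # removes one name, not the data
contents=$(cat alias)
echo "alias still contains: $contents"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```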
Symbolic links are also easy to spot. Using ls -l, you can see which files are symbolic links and where they point to.
> ls -l /
 lrwxrwxr-t   1 root    admin          9 Nov 28 18:49 mach -> /mach.sym
 -r--r--r--   1 root    admin     705904 Nov 28 18:49 mach.sym
 -rw-r--r--   1 root    wheel    3728752 Nov  5 23:01 mach_kernel
 drwxr-xr-x   3 chrisc  admin        102 Nov 13 22:33 opt
 drwxr-xr-x   6 root    wheel        204 Nov 28 18:49 private
 drwxr-xr-x  60 root    wheel       2040 Nov 28 18:43 sbin
 lrwxrwxr-t   1 root    admin         11 Nov 28 18:49 tmp -> private/tmp
 drwxr-xr-x  12 root    wheel        408 Jul 14 00:21 usr
 lrwxrwxr-t   1 root    admin         11 Nov 28 18:49 var -> private/var
You will notice that the above listing shows three symbolic links, or symlinks. The file mach points to mach.sym, and the directories /tmp and /var are linked to directories in the /private directory. If I deleted any of the symlinks, the destination files would remain untouched; the link would merely go away. So, deleting mach would not delete mach.sym; however, deleting mach.sym would make the symlink pointing to it stop working.
The big benefit of using symlinks is that you can move the destination file around and then adjust the symlinks that point to it. A practical use of this would be to move a large file or directory from a small partition to a larger partition and then replace it with a symlink that points to the new location.
For example, my /tmp directory is located on my root partition, which is very small, and I keep having problems with programs requiring more space than /tmp has available. I have two options. First, I could repartition the disk and give more disk space to the /tmp directory. Or, I could move the /tmp directory to a partition that has lots of room and replace it with a symlink. If I didn't replace it with a symlink, then any program that relied on the location of /tmp would quit working. However, the symlink tells them the new location of /tmp without making them go look for it.
Links are created with the ln(1) command. The syntax for ln(1) can be a bit confusing at first. The first argument is the file that already exists, and the second argument is the location of the link to be created. By default ln(1) creates hard links; you have to use the -s flag to create a symbolic link.
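Both directions can be checked from a script. The sketch below recreates the mach and mach.sym pair in a scratch directory (so nothing real is touched) and uses test(1): -L is true for any symlink, while -e follows the link and is false when the destination is gone. The extra link name mach2 is made up for the demonstration.

```sh
#!/bin/sh
# Sketch: removing a symlink leaves its destination alone, while
# removing the destination leaves a broken symlink behind.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

echo "symbols" > mach.sym
ln -s mach.sym mach
ln -s mach.sym mach2

# Delete one symlink: the destination file is untouched.
rm mach2
if [ -e mach.sym ]; then target_state="intact"; else target_state="gone"; fi
echo "after removing a link, mach.sym is $target_state"

# Delete the destination: the remaining symlink still exists but is broken.
rm mach.sym
if [ -L mach ] && [ ! -e mach ]; then link_state="broken"; else link_state="ok"; fi
echo "after removing mach.sym, mach is $link_state"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```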
So to move the /tmp directory, you would use the following commands. (You will need to be root to do this.)
# mv /tmp /usr/local/tmp
 # ln -s /usr/local/tmp /tmp

 >ls -l /tmp
 lrwxrwxr-t   1 root    admin         14 Nov 28 18:49 tmp -> /usr/local/tmp
As you can see, /tmp has been moved to /usr/local/tmp and been replaced with a symlink. If I cd to that /tmp I will find myself in /usr/local/tmp instead.
> cd /tmp
 > pwd 
 /usr/local/tmp
 > cd ..
 > pwd 
 /usr/local
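If you would rather rehearse the move before doing it to the real /tmp as root, here is a sketch that plays it out inside a scratch directory. The small and big directories are stand-ins I made up for the cramped and roomy partitions.

```sh
#!/bin/sh
# Sketch: move a directory elsewhere and leave a symlink in its place.
workdir=$(mktemp -d) || exit 1
mkdir -p "$workdir/small/tmp" "$workdir/big"
echo "scratch" > "$workdir/small/tmp/file"

# Same two steps as for /tmp: move the directory, then link the old name.
mv "$workdir/small/tmp" "$workdir/big/tmp"
ln -s "$workdir/big/tmp" "$workdir/small/tmp"

# Programs using the old path still find their files.
via_link=$(cat "$workdir/small/tmp/file")
echo "read through the old path: $via_link"

# pwd -P prints the physical directory, with symlinks resolved.
physical=$(cd "$workdir/small/tmp" && pwd -P)
expected=$(cd "$workdir/big/tmp" && pwd -P)
echo "old path really lives at: $physical"

# Clean up the scratch directory.
rm -rf "$workdir"
```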
If you find yourself constantly changing to a directory that is very deep, a symlink shortcut could be just what you need. It is possible to create a shell alias to do that for you, but a symlink shortcut will allow your scripts to use it as well as any programs that might rely upon it, where your shell alias won't.
The options to ln(1) are as follows:
-f
Unlink any already existing file, permitting the link to occur.
-h
If the target_file or target_dir is a symbolic link, do not follow it. This is most useful with the -f option, to replace a symlink which may point to a directory.
-n
Same as -h, for compatibility with other ln implementations.
-s
Create a symbolic link.
To adjust a link use the -f and -h options together.
#  mv /usr/local/tmp /usr/share/tmp
 #  ln -sfh /usr/share/tmp /tmp
An ls -l /tmp will reveal the newly adjusted location.
> ls -l /tmp
 lrwxrwxr-t   1 root    admin         14 Nov 28 18:49 tmp -> /usr/share/tmp
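The reason -h matters is easy to miss, so here is a sketch of what happens with and without it. I use -n, which the option list above describes as the same as -h, because other ln implementations accept it too. The names old, new, and current are made up, and readlink(1) is assumed to be installed.

```sh
#!/bin/sh
# Sketch: re-pointing a symlink that currently points at a directory.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
mkdir old new
ln -s old current

# Without -n, ln follows "current" into old/ and creates the new
# link inside that directory instead of replacing "current".
ln -sf new current
before=$(readlink current)
if [ -L old/new ]; then stray="yes"; else stray="no"; fi
echo "without -n: current -> $before (stray link created: $stray)"

# With -n, "current" itself is replaced.
ln -sfn new current
after=$(readlink current)
echo "with -n:    current -> $after"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```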
One word of caution, however: you don't want to link a symbolic link back around to itself. ln(1) will let you do it, because it doesn't resolve where the link points to when the link is created. So it's possible to do this:
lrwxr-xr-x   1 chrisc  staff     7 Dec  2 21:46 file -> oldfile
 lrwxr-xr-x   1 chrisc  staff     4 Dec  2 21:46 newfile -> file
 lrwxr-xr-x   1 chrisc  staff     7 Dec  2 21:46 oldfile -> newfile
However, none of these names leads to an actual file, so when you try to read or write through any of them, it will give you an error.
Error: file: Too many levels of symbolic links.
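The loop is easy to reproduce safely in a scratch directory. This sketch builds the same three links and then tries to read through one of them. The exact wording of the error differs between systems, so the sketch only checks that the read fails.

```sh
#!/bin/sh
# Sketch: a three-way symlink loop, as in the listing above.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# ln -s does not check its destination, so all three succeed.
ln -s oldfile file
ln -s file newfile
ln -s newfile oldfile

# Reading through the loop fails because no real file is ever reached.
if cat file > /dev/null 2>&1; then
    looped="no"
else
    looped="yes"
fi
echo "read through the loop failed: $looped"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```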

Conclusion

We can now recognize symlinks with ls(1) and create them with ln(1). We have also learned how to adjust them without first using rm(1) to delete them. Links are very powerful when you need to re-arrange your filesystem without disturbing programs already installed. They can also make management easier by creating shortcuts to files or directories that are located deep in the filesystem.

Gathering System Information

If you manage multiple BSD computers, you have noticed that one command line looks very much like any other command line. It is easy to get turned around and think you are working on one computer, when you are really working on another. This is especially true if you use the same or similar passwords on multiple systems.
I know several BSD admins that have rebooted the wrong computer because they forgot which computer they were logged into. They all look the same when you are logged in remotely.
BSD provides several utilities to display information about the computer you are working on. hostname will display the name of the computer you are logged into.
myname# hostname
myname.my.domain
This helps if all you need to know is the name of the computer to verify that you are working on the right one. However, this isn't always enough. The next level of information about your system can be gathered from the uname utility. According to the manpage:
The uname command writes the name of the operating system implementation to standard output. When options are specified, strings representing one or more system characteristics are written to standard output.
If you just type uname it will tell you which operating system it is.
To get any really useful information from uname, we need to use the -a option. I have listed below the uname -a output from FreeBSD, OpenBSD, and NetBSD. I tried to grab one from different architectures as well as different BSDs. I don't actually have all these machines, so I grabbed these off the mailing list archives.
myname# uname -a
FreeBSD myname.my.domain 3.3-STABLE FreeBSD 3.3-STABLE #8: Fri Dec 17 20:43:04 GMT 1999 root@myname.my.domain:/usr/src/sys/compile/grumpy i386

# uname -a
OpenBSD bigturd 2.5 GENERIC#172 sparc

pc164# uname -a
NetBSD pc164 1.4P NetBSD 1.4P (PC164.v6-intl) #5: Sat Nov 27 18:31:37 CET 1999 root@pc164:/usr/src/sys/arch/alpha/compile/PC164.v6-intl alpha
Each one of them starts out by printing the operating system type, followed by the computer name. Then it gives the OS version. The # following the OS version is the number of times the running kernel has been recompiled without modification to the kernel configuration file. Basically, the system was upgraded or changed and they used the same kernel config file.
The architecture of the computer is also listed, along with the name of the Kernel config file that was used.
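If you only want one piece of that line, uname has a flag for each of the standard fields. The sketch below pulls them out one at a time and then shows one way to make a script refuse to go on when it is run on the wrong machine. The expected_host value is a placeholder borrowed from the examples above, not a real host.

```sh
#!/bin/sh
# Sketch: pick apart the uname fields and guard against the wrong host.
os=$(uname -s)         # operating system name
host=$(uname -n)       # node (computer) name
release=$(uname -r)    # OS release
arch=$(uname -m)       # hardware architecture
echo "You are on $host ($os $release, $arch)"

# Placeholder name; set this to the machine you mean to work on.
expected_host="myname.my.domain"
if [ "$host" = "$expected_host" ]; then
    verdict="right machine, carry on"
else
    verdict="this is $host, not $expected_host; stopping here"
fi
echo "$verdict"
```

Putting a check like this at the top of anything that reboots or wipes a machine is cheap insurance against the mistake described earlier.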
When you are asking other people for help, especially on a mailing list, they will want to know the output of uname -a. Yet, there will often be times when this is still not enough information about your system. dmesg is a utility that will display the information that appeared during boot up. This includes the processor speed, amount of RAM, and all the device drivers that loaded.
OpenBSD 2.1 (TWP) #3: Sat Jul 19 18:37:43 CDT 1997
    twp@twp.tezcat.com:/usr/src/sys/arch/i386/compile/TWP
CPU: Pentium (GenuineIntel 586-class CPU) 133 MHz
BIOS mem  = 654336 conventional, 32505856 extended
real mem  = 33157120
avail mem = 29097984
using 430 buffers containing 1761280 bytes of memory
mainbus0 (root)
isa0 at mainbus0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
lpt0 at isa0 port 0x378-0x37f irq 7
isadma0 at isa0
aic0 at isa0 port 0x340-0x35f irq 11
scsibus0 at aic0: 8 targets
probe(aic0:3:0): sync, offset 8, period 100nsec
sd0 at scsibus0 targ 3 lun 0:  SCSI2 0/direct fixed
sd0: 2063MB, 6703 cyl, 5 head, 126 sec, 512 bytes/sec
cd0 at scsibus0 targ 4 lun 0:  SCSI2 5/cdrom removable
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 drive 0: 
wd0: 2067MB, 4200 cyl, 16 head, 63 sec, 512 bytes/sec (96KB cache)
wd0: using 16-sector 16-bit pio transfers, lba addressing
npx0 at isa0 port 0xf0-0xff: using exception 16
vt0 at isa0 port 0x60-0x6f irq 1: s3 765 (Trio64 V+), 80 col, color, 8 scr, mf2-kbd, [R3.32]
spkr0 at vt0 port 0x61
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
pci0 at mainbus0 bus 0: configuration mode 1
vendor 0x8086 product 0x7030 (class bridge, subclass host, revision 0x02) at pci0 dev 0 function 0 not configured
vendor 0x8086 product 0x7000 (class bridge, subclass ISA, revision 0x01) at pci0 dev 7 function 0 not configured
vendor 0x5333 product 0x8811 (class display, subclass VGA, revision 0x54) at pci0 dev 9 function 0 not configured
biomask 4840 netmask 4840 ttymask 48da
root on wd0a
This dmesg output provides a lot of very helpful information. The downside to this is that it isn't apparent to the new user what it all means. The first section of it describes the operating system, CPU type and physical RAM. Try to get Windows to tell you that without rebooting. :-)
The rest of the dmesg output is a list of device drivers and how they loaded. I'll list a few of the basic devices:
aic0 is the SCSI controller.
sd0 is the first SCSI disk drive.
wd0 is the first IDE disk drive.
cd0 is a SCSI CD Drive.
wdc0 is an IDE controller.
fdc0 is the floppy disk controller.
spkr0 is the internal speaker.
fd0 is the floppy disk drive.

These will differ a bit on each BSD and they tend to change a little over time. This is an older dmesg, but it does the job.
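Once you know the device names, grep(1) will pull out just the lines you care about. The sketch below runs against a few lines copied from the dmesg listing above instead of a live dmesg, so it behaves the same on any machine. On a real system you would pipe dmesg into the same grep. The pattern is my own guess at what is useful and would need adjusting for other device names.

```sh
#!/bin/sh
# Sketch: find the disk size lines in saved dmesg output.
sample=$(cat <<'EOF'
sd0 at scsibus0 targ 3 lun 0:  SCSI2 0/direct fixed
sd0: 2063MB, 6703 cyl, 5 head, 126 sec, 512 bytes/sec
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 drive 0: 
wd0: 2067MB, 4200 cyl, 16 head, 63 sec, 512 bytes/sec (96KB cache)
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
EOF
)

# Keep lines that start with an sd or wd unit followed by a size in MB.
disks=$(printf '%s\n' "$sample" | grep -E '^(sd|wd)[0-9]+: [0-9]+MB')
printf '%s\n' "$disks"
```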
There is still some information that we might need. The filesystem has not been addressed yet. We still need to be able to tell how much disk space we have available and how our filesystem is arranged. The command df displays the filesystem layout and how much of it is used. According to the man page:
Df displays statistics about the amount of free disk space on the specified filesystem or on the filesystem of which file is a part.
chris# df
Filesystem  1K-blocks     Used    Avail Capacity  Mounted on
/dev/wd0s1a     31743    27686     1518    95%    /
/dev/wd0s1f   1359855  1078917   172150    86%    /usr
/dev/wd0s1e     31743     5779    23425    20%    /var
procfs              4        4        0   100%    /proc
Each disk or partition on disk is listed. The "Mounted on" column shows which directory each device is attached to. For example, we have 172 Meg available in the /usr/ directory and only 23 Meg available in the /var/ directory.
This should be enough information to keep you informed about your computer. For further information, read the man pages for dmesg, df, and uname.
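The Capacity column is the one to watch, and awk(1) can watch it for you. This sketch feeds the df listing above through awk and reports anything at or over a threshold. The 90 percent limit is an arbitrary number I picked. On a live system you would replace the sample text with the output of df -k.

```sh
#!/bin/sh
# Sketch: flag filesystems that are nearly full, using the df
# output from the example above as sample input.
sample=$(cat <<'EOF'
Filesystem  1K-blocks     Used    Avail Capacity  Mounted on
/dev/wd0s1a     31743    27686     1518    95%    /
/dev/wd0s1f   1359855  1078917   172150    86%    /usr
/dev/wd0s1e     31743     5779    23425    20%    /var
procfs              4        4        0   100%    /proc
EOF
)

# Skip the header, strip the % sign, and compare against the limit.
full=$(printf '%s\n' "$sample" | awk -v limit=90 '
    NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit) print $6 " is " $5 " full"
    }')
printf '%s\n' "$full"
```

With the sample above this reports / at 95% and /proc at 100%. The procfs entry is always full, so a real version would probably skip it.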