VirtualBox gets upset about screen resolution

I've just spent considerable time trying to get a Fedora 12 (beta) VirtualBox guest running (on a Windows 7 host). The guest was working perfectly before, and when I shut it down it was running in seamless mode. Apparently this was a bad idea.

When trying to boot the guest, I got the normal splash screen, but instead of the login window I just got a black screen with a blinking cursor in the corner. When I logged in on another virtual terminal, I found the following messages in the X log file:

[...]
(II) VBoxVideo(0): Not using built-in mode "1920x1170" (unknown reason)
(II) VBoxVideo(0): No remaining probed modes for output VBOX1
(II) VBoxVideo(0): Output VBOX1 connected
(WW) VBoxVideo(0): Unable to find initial modes
(EE) VBoxVideo(0): Output VBOX1 enabled but has no modes
(EE) VBoxVideo(0): Initial CRTC configuration failed!
(II) UnloadModule: "vboxvideo"
[...]
This indicated that the driver was unhappy with the desired resolution for an incredibly helpful "unknown reason". Apparently, attempting to set the resolution used in seamless mode was failing, possibly because it was too high for windowed mode.

As it turns out, VirtualBox stores the last resolution in the machine description file in the guest property /VirtualBox/GuestAdd/Vbgl/Video/SavedMode:

$ VBoxManage guestproperty get Fedora /VirtualBox/GuestAdd/Vbgl/Video/SavedMode
VirtualBox Command Line Management Interface Version 3.0.8
(C) 2005-2009 Sun Microsystems, Inc.
All rights reserved.

Value: 1920x1170x32

After spotting this, the fix was reasonably straightforward:

$ VBoxManage guestproperty set Fedora /VirtualBox/GuestAdd/Vbgl/Video/SavedMode 800x600x32
fixed the problem.
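
A minimal sketch of how the saved mode could be checked before booting the guest. The helper name mode_fits and the 1024x768 limit are made up for illustration; only the mode string itself comes from the session above.

```shell
#!/bin/sh
# mode_fits MODE MAXW MAXH: succeed if a WIDTHxHEIGHTxDEPTH string such as
# "1920x1170x32" fits within MAXW x MAXH. (Hypothetical helper.)
mode_fits() {
	mode=$1 maxw=$2 maxh=$3
	width=${mode%%x*}        # everything before the first "x"
	rest=${mode#*x}          # drop the width
	height=${rest%%x*}       # everything before the next "x"
	[ "$width" -le "$maxw" ] && [ "$height" -le "$maxh" ]
}

saved=1920x1170x32           # value reported by guestproperty get above
if mode_fits "$saved" 1024 768; then
	echo "$saved fits"
else
	echo "$saved is too large; consider resetting SavedMode"
fi
```

On a real host the mode string would be taken from the "Value:" line printed by VBoxManage guestproperty get.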

Avoid timecounter "TSC" on modern PCs and laptops!

Most times when I boot NetBSD, the system time runs at only about half the real speed, meaning the time displayed by date is way behind. This would only be a nuisance if it didn't also mean that my e-mail client puts the wrong timestamps on every e-mail I send, which can be confusing or even embarrassing. Sometimes, very rarely, all is fine and the clock runs quite accurately.

I managed to capture the dmesg output for both cases, and this is the relevant difference:

--- ./dmesg.accurateclock       2009-09-08 16:20:59.000000000 +0100
+++ ./dmesg.clockslow   2009-09-07 09:49:39.000000000 +0100
@@ -12,14 +12,14 @@
 timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
 System manufacturer System Product Name (System Version)
 mainbus0 (root)
-cpu0 at mainbus0 apid 0: Intel 686-class, 2672MHz, id 0x106a5
-cpu1 at mainbus0 apid 2: Intel 686-class, 2672MHz, id 0x106a5
-cpu2 at mainbus0 apid 4: Intel 686-class, 2672MHz, id 0x106a5
-cpu3 at mainbus0 apid 6: Intel 686-class, 2672MHz, id 0x106a5
-cpu4 at mainbus0 apid 1: Intel 686-class, 2672MHz, id 0x106a5
-cpu5 at mainbus0 apid 3: Intel 686-class, 2672MHz, id 0x106a5
-cpu6 at mainbus0 apid 5: Intel 686-class, 2672MHz, id 0x106a5
-cpu7 at mainbus0 apid 7: Intel 686-class, 2672MHz, id 0x106a5
+cpu0 at mainbus0 apid 0: Intel 686-class, 3741MHz, id 0x106a5
+cpu1 at mainbus0 apid 2: Intel 686-class, 3741MHz, id 0x106a5
+cpu2 at mainbus0 apid 4: Intel 686-class, 3741MHz, id 0x106a5
+cpu3 at mainbus0 apid 6: Intel 686-class, 3741MHz, id 0x106a5
+cpu4 at mainbus0 apid 1: Intel 686-class, 3741MHz, id 0x106a5
+cpu5 at mainbus0 apid 3: Intel 686-class, 3741MHz, id 0x106a5
+cpu6 at mainbus0 apid 5: Intel 686-class, 3741MHz, id 0x106a5
+cpu7 at mainbus0 apid 7: Intel 686-class, 3741MHz, id 0x106a5
 ioapic0 at mainbus0 apid 8: pa 0xfec00000, version 20, 24 pins
 ioapic1 at mainbus0 apid 9: pa 0xfec8a000, version 20, 24 pins
 acpi0 at mainbus0: Intel ACPICA 20090730
@@ -170,7 +170,7 @@
 ieee1394if0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
 ieee1394if0: bus manager 0 (me)
 timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
-timecounter: Timecounter "TSC" frequency 2672861800 Hz quality 3000
+timecounter: Timecounter "TSC" frequency 3741967920 Hz quality 3000
 azalia0: codec[0]: ATI R600 HDMI (rev. 1.0), HDA rev. 1.0
 audio0 at azalia0: full duplex, independent
 azalia1: codec[0]: Realtek ALC888 (rev. 1.1), HDA rev. 1.0
This shows that the processor clock is detected differently in the two cases: when my clock runs slow, the processor is detected as running at 3.7GHz, otherwise at 2.7GHz. Having misdetected the processor clock, the kernel also assumes an incorrect frequency for the "TSC" timecounter.
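
As a rough worked example, assume the counter really ticks at the lower of the two detected rates (the dmesg output alone does not prove this). The ratio of the two frequencies then gives the factor by which the system clock would lag. The function name clock_ratio is made up for this sketch.

```shell
#!/bin/sh
# clock_ratio ACTUAL ASSUMED: fraction of real time the system clock advances
# if the counter ticks at ACTUAL Hz while the kernel believes ASSUMED Hz.
# Both frequencies below are copied from the dmesg diff; treating the lower
# one as the true rate is an assumption.
clock_ratio() {
	awk -v actual="$1" -v assumed="$2" \
		'BEGIN { printf "%.2f\n", actual / assumed }'
}

clock_ratio 2672861800 3741967920
```

Under that assumption the clock would advance about 0.71 seconds per real second. A slowdown to about half would imply that the counter ticks more slowly still.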

The reason this happens is that my CPU is an Intel i7 which uses a technology called Turbo Boost. Basically, when there isn't much work to do the processor gets clocked down to save energy. Processors designed for mobile devices have been doing this for quite a while.

The manual page timecounter(9) indicates that it originated on FreeBSD, and in fact I found the solution to my problem in the FreeBSD FAQ.

Luckily, the "TSC" timecounter isn't my only timing source on this machine. As a matter of fact, the kernel variable kern.timecounter.choice offers plenty of such:

kern.timecounter.choice = TSC(q=3000, f=3741919640 Hz)
        clockinterrupt(q=0, f=100 Hz) hpet0(q=2000, f=14318179 Hz)
        ACPI-Fast(q=1000, f=3579545 Hz) lapic(q=-100, f=133687675 Hz)
        i8254(q=100, f=1193182 Hz) dummy(q=-1000000, f=1000000 Hz)
NetBSD has even already gone to the trouble of rating the quality of each of these, so all I need to do is to pick the one with the next lower value:
# sysctl -w kern.timecounter.hardware=hpet0
kern.timecounter.hardware: TSC -> hpet0
This seems to have solved the problem.
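
The manual selection above can be sketched as a small filter. This is only an illustration: the function name next_best_timecounter is made up, and it relies on grep -o, which is widely available but not required by POSIX.

```shell
#!/bin/sh
# next_best_timecounter SKIP: read kern.timecounter.choice output on stdin
# and print the highest-quality counter other than SKIP.
next_best_timecounter() {
	grep -oE '[A-Za-z0-9_-]+\(q=-?[0-9]+' |   # "TSC(q=3000", "hpet0(q=2000", ...
	sed 's/(q=/ /' |                          # -> "TSC 3000"
	awk -v skip="$1" '$1 != skip' |           # drop the counter to avoid
	sort -k2,2nr |                            # best quality first
	head -n 1 |
	cut -d ' ' -f 1
}

# Sample input copied from the sysctl output above.
next_best_timecounter TSC <<'EOF'
kern.timecounter.choice = TSC(q=3000, f=3741919640 Hz)
        clockinterrupt(q=0, f=100 Hz) hpet0(q=2000, f=14318179 Hz)
        ACPI-Fast(q=1000, f=3579545 Hz) lapic(q=-100, f=133687675 Hz)
        i8254(q=100, f=1193182 Hz) dummy(q=-1000000, f=1000000 Hz)
EOF
```

With the sample input this prints hpet0, the same choice as the one made by hand. On a live system the input would come from sysctl kern.timecounter.choice.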

I think the code for the "TSC" timecounter should be adapted to modern hardware with changing processor speeds. At the very least, the quality rating should be lowered significantly if such hardware is detected so it isn't chosen as the default time source.

Simple conditional formatting with CSS and JavaScript

Today, I wanted to make sections of shell commands in my blog stand out more clearly from other examples like program code or sample files. To do that, I decided to add some JavaScript to my site that automatically assigns a special CSS class to preformatted code sections containing shell code:
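// Mark every <pre> whose <code> child starts with a shell prompt.
var shells = document.getElementsByTagName('pre');
for (var sh = shells.length - 1; sh != -1; --sh) {
	var first = shells[sh].firstChild;
	// Skip empty <pre> elements instead of failing on a null firstChild.
	if (!first) {
		continue;
	}
	var txt = first.textContent;
	// '$ ' is a user prompt, '# ' a root prompt; the space rules out '#!'.
	if ('CODE' === first.nodeName &&
		('$' === txt[0] || '#' === txt[0]) &&
		' ' === txt[1])
	{
		shells[sh].className = 'shell';
	}
}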
Sections illustrating shell code are recognisable because the first line starts with either '$ ' for a user shell or '# ' for a root shell. Requiring the space after the prompt character excludes scripts that start with a shebang ('#!').

Building new packages from scratch

Today I tried yet another way to upgrade my packages. Since it was recently fixed, and after thinking more about Hubert's upgrade procedure, I decided to give mk/bulk/mksandbox another chance. For more discussion about upgrading packages, also see the NetBSD wiki. I have also described a method to upgrade only packages that changed.

This time, I wanted to skip the step of building packages and installing them later. Instead, I decided to build the packages in a sandbox on my /usr partition and simply move the resulting /usr/pkg, /var/db/pkg and /var/db/pkg.refcount directories into the right place when finished. This way, I get the benefit of having a working system while building the new one and being able to switch over to the new packages in one swell foop when finished. This requires the build environment in the sandbox to be fairly similar to the real system: in particular, the user and group ids in the sandbox must match the ones in the real system. Luckily, this is exactly the way that mk/bulk/mksandbox sets up the sandbox, so nothing special needs to be done.

Because all packages are rebuilt during the following upgrade procedure, there are no problems when dynamic libraries in the base system are upgraded. This happens fairly frequently in my case because the base system is NetBSD-current.

To update, I first get a new pkgsrc tree:

$ cd /usr/pkgsrc
$ mv cvs.up cvs.up.0
$ cvs up -A 2>&1 | tee cvs.up
$ grep '^[^cUP]' cvs.up
Then I generate a pkg_chk configuration file of my required packages:
$ pkg_leaves -a | \
>       while read pkg; do
>               pkg_info -Q PKGPATH "$pkg";
>       done > /usr/pkgsrc/pkgchk.conf
or without pkg_leaves:
$ pkg_info -e "*" | \
>       while read pkg; do
>               if [ -z "`pkg_info -q -R "$pkg"`" ]; then
>                       pkg_info -Q PKGPATH "$pkg";
>               fi;
>       done > /usr/pkgsrc/pkgchk.conf
Don't fall into the trap of trying to use pkg_info -u for this purpose: it uses the package variable "automatic" which gets set to a completely random value in my experience. Instead, the above script prints all packages that are not depended on like in pkg/32827. This speeds up the following build procedure by reducing the number of lines in pkgchk.conf since each line of pkgchk.conf gives rise to an execution of make clean CLEANDEPENDS=yes.

I then save all my binary packages and clean any obsolete distfiles using lintpkgsrc.

$ rm -r packages-old
$ mv packages packages-old
$ lintpkgsrc -mopr

I also clean any stale working directories that might still be around:

$ rm -r /usr/pkgsrc/*/*/work

The only package that needs to be installed in the »host« environment is pkgtools/mksandbox. Note that it has to be called with an absolute path or bad things will happen! This is how I build a new sandbox in my home directory and jump inside:

$ sudo mksandbox $HOME/sb
$ cd $HOME/sb
$ sudo sh sandbox chroot
#
I then set up the /etc/mk.conf file with my desired settings for the build:
.ifdef BSD_PKG_MK
# Setting WRKOBJDIR is inconvenient for development:
#WRKOBJDIR=     /usr/obj/usr/pkgsrc
PKG_DEVELOPER=  yes

UPDATE_TARGET=  bin-install
BINPKG_SITES=   # prevent bin-install from accessing ftp
USE_DESTDIR=    yes
MAKE_JOBS=      8
SKIP_LICENSE_CHECK=yes

# Make Firefox look like Firefox
PKG_OPTIONS.firefox+= official-mozilla-branding
PKG_OPTIONS.gecko+=   official-mozilla-branding
PKG_OPTIONS.thunderbird+=official-mozilla-branding

PKG_DEFAULT_OPTIONS+= gnome

ALLOW_VULNERABLE_PACKAGES=yes

.if exists(/usr/pkg/bin/sudo)
SU_CMD=        /usr/pkg/bin/sudo /bin/sh -c
.endif
.endif

The first package that needs to be installed is pkg_chk because it is in charge of the following procedure.

# cd /usr/pkgsrc/pkgtools/pkg_chk
# make update
Update: don't do this, it breaks at least textproc/icu. I then tended to install pkgtools/autoswc, which provides a cache for configure. This is useful because these tests are typically performed by every package just to produce the same answers to identical questions.

Now, all that is left to do is to start the build and get some coffee while it's working away:

# pkg_chk -akr 2>&1 | tee /usr/obj/pkgchk-akr.0.log

After the build is finished, the sandbox can be unmounted:

# ^D
$ sudo sh sandbox umount

It's probably worth checking for any special configuration in /usr/pkg/etc and merging any differences you care to keep into the new tree:

$ diff -ur /usr/pkg/etc $HOME/sb/usr/pkg/etc

Also, you should check for any changes to files in /etc and apply them to your system's configuration. In particular, many packages set up users and groups that you have to copy with the same IDs as in the chroot in order to match the owners of the installed files.

$ diff -ur /etc $HOME/sb/etc
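
To spot accounts that were created in the sandbox but are missing on the host, something along these lines may help. It is a sketch only: the function name missing_users is made up, and it assumes the usual colon-separated passwd layout (name, password, uid, gid, ...).

```shell
#!/bin/sh
# missing_users HOST_PASSWD SANDBOX_PASSWD: print "name uid gid" for every
# account present in the sandbox's passwd file but not in the host's.
missing_users() {
	awk -F: -v host="$1" '
		FILENAME == host { seen[$1] = 1; next }   # remember host accounts
		!($1 in seen)    { print $1, $3, $4 }     # sandbox-only accounts
	' "$1" "$2"
}

# Example invocation (paths as used in this article):
#   missing_users /etc/passwd "$HOME/sb/etc/passwd"
```

The printed uid and gid are the values that have to be reused when creating the accounts on the real system.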

Then I switch into single-user mode and exchange my /usr/pkg, /var/db/pkg and /var/db/pkg.refcount directories with the newly built ones:

$ sudo shutdown now
# mv /usr/pkg /usr/pkg-old
# mv /var/db/pkg /var/db/pkg-old
# mv /var/db/pkg.refcount /var/db/pkg.refcount-old
# mv $HOME/sb/usr/pkg /usr/pkg
# mv $HOME/sb/var/db/pkg /var/db/pkg
# mv $HOME/sb/var/db/pkg.refcount /var/db/pkg.refcount
# ^D
At this point, the system should boot back into multi-user mode using the new packages. If anything goes wrong, it's fairly easy to revert back to the old packages using the backups. For this to work properly it's essential that the sandbox directories are on the same partition as their final locations: if they're not, moving the created files while preserving hard links and the like is best done using pax -rw.
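
The exchange and its fallback can be wrapped in a small helper. This is a sketch under the same assumption as above, namely that both directories live on the same partition so that mv is a plain rename. The function name swap_dir is made up.

```shell
#!/bin/sh
# swap_dir NEW CURRENT: move CURRENT aside to CURRENT-old and put NEW in its
# place. If the second move fails, the original is restored.
swap_dir() {
	new=$1 cur=$2
	[ -d "$new" ] || { echo "swap_dir: $new does not exist" >&2; return 1; }
	[ ! -e "$cur-old" ] || { echo "swap_dir: $cur-old already exists" >&2; return 1; }
	if [ -e "$cur" ]; then
		mv "$cur" "$cur-old" || return 1
	fi
	if ! mv "$new" "$cur"; then
		# Roll back so the system keeps its old packages.
		if [ -e "$cur-old" ]; then
			mv "$cur-old" "$cur"
		fi
		return 1
	fi
}

# Example invocation in single-user mode (paths as used in this article):
#   swap_dir "$HOME/sb/usr/pkg" /usr/pkg &&
#   swap_dir "$HOME/sb/var/db/pkg" /var/db/pkg &&
#   swap_dir "$HOME/sb/var/db/pkg.refcount" /var/db/pkg.refcount
```

Refusing to run when a -old backup already exists keeps an earlier backup from being overwritten by accident.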

It can't hurt to read the install messages for the installed packages using

$ pkg_info -Da
and do as they suggest (create users and groups, copy files from /usr/pkg/share/examples/rc.d to /etc/rc.d etc.)

Once everything works to satisfaction, the sandbox and the backups can be removed:

# rm -fr $HOME/sb
# rm -fr /usr/pkg-old
# rm -fr /var/db/pkg-old
# rm -fr /var/db/pkg.refcount-old

Flash videos and other essentials

Today I mentioned to my girlfriend that one of the reasons I liked NetBSD was that by installing 300MB you can get a fully working system in less than 1GB of hard disk space. She coughed when I said "fully working", reminding me that my sound card isn't working yet, there's no 3D acceleration and my browser can't even play youtube videos. I went on to explain that another great thing about NetBSD is that it supports 57 platforms on 16 CPU architectures and that as a NetBSD user you tend to care more about important UNIX things such as running daemons, rather than dispensable amusements like playing music or watching movies -- but who am I fooling? I've used NetBSD as my daily desktop for many years and I like it when things work.

So I set out to make flash videos play tonight, which I'd tried briefly before without success. The youtube plug-in in totem could search youtube, but as soon as I double-clicked a video it told me that a plug-in was missing, keeping obnoxiously secret about which one. Since I'd already installed the gstreamer0.10-plugins meta-package, which claimed to include all the GStreamer plug-ins, I didn't know where to start looking. Eventually I found out that four current GStreamer plug-ins are actually missing from the meta-package, namely alsa, faac, jpeg and soup. I don't know about the other three, but the soup plug-in was mentioned as a dependency of totem's youtube plug-in. So I installed it and now totem can actually play the videos it finds!

The next thing I tried was the gnash Firefox plug-in, but their wiki page makes it look rather daunting to play even a single video using shell commands like these:

$ url=http://youtube.com/watch?v=9sJUDx7iEJw
$ vars=$(wget --quiet -O - "$url" | grep -F watch_fullscreen | cut -d \? -f 2 | cut -d \" -f1)
$ echo "$vars"
$ gnash -vv -F 2 -P "FlashVars=$vars" http://youtube.com/player2.swf
I might have misunderstood Web 2.0 completely, but surely this would hardly put an end to my girlfriend's mockery?

Next up was swfdec. I already had the standalone swfdec-gnome player installed, but hadn't taken a look at the swfdec-mozilla plug-in yet. And behold -- after installing it and linking the resulting plug-ins from share/mozilla/plugins to share/firefox/plugins, I now have a youtube player! There are decoding errors and sound sync problems in some of the videos I've tried, but with the quality of most of the videos on youtube nowadays that can only enhance the atmosphere. And the best thing about it: it already has Flashblock built-in, requiring a click before it will start playing -- bonus!

So now I have two ways of watching youtube videos, either in a separate UI or in the browser, increasing the chances that one of them can actually play all those indispensable videos out there.

Show coloured diffs in blogs

A quick Google search for JavaScript code to format unified diffs as coloured side-by-side diffs didn't turn up anything obvious (admittedly I haven't the slightest clue where to look for this sort of stuff on the 'net).

So I went and quickly wrote my own formatting function which takes a unified diff and returns a table-formatted side-by-side diff:

function formatDiff(textContent) {
    "use strict";

    function E(name, attrs) {
        var arg, elem = document.createElement(name), prop;

        for (prop in attrs) {
            elem.setAttribute(prop, attrs[prop]);
        }
        for (arg = 2; arg != arguments.length; ++arg) {
            elem.appendChild(arguments[arg]);
        }

        return elem;
    }

    var td = E.bind(undefined, "td");
    var th = E.bind(undefined, "th", {});
    var tr = E.bind(undefined, "tr", {});
    var T = document.createTextNode.bind(document);

    var res = E("table", {"class": "diff"});

    var sects = {"unchanged": [], "removed": [], "added": []};

    var lines = textContent.split("\n");
    lines.push("");
    lines.forEach(function (l) {
        var leftAttrs = {}, rightAttrs = {};

        var lineType = !l.length ? "eof"
                     : "--- " === l.substr(0, 4) ? "oldfile"
                     : "+++ " === l.substr(0, 4) ? "newfile"
                     : "@@ " === l.substr(0, 3) ? "range"
                     : "-" === l[0] ? "removed"
                     : "+" === l[0] ? "added"
                                    : "unchanged";

        /*
         * An "unchanged" section ends at a line that isn't "unchanged".
         */
        if (sects["unchanged"].length && "unchanged" !== lineType) {
            res.appendChild(tr(
                td({}, T(sects["unchanged"].join("\n"))),
                td({}, T(sects["unchanged"].join("\n")))
            ));

            sects["unchanged"].length = 0;
        }

        /*
         * An added/removed section ends
         *  o at a line that isn't "added" if there are "added" lines,
         *  o at a line that isn't "added" or "removed" if there are "removed" lines.
         */
        if (sects["added"].length && "added" !== lineType ||
            sects["removed"].length && "removed" !== lineType && "added" !== lineType)
        {
            if (sects["added"].length && sects["removed"].length) {
                leftAttrs["class"] = rightAttrs["class"] = "diff-changed";
            } else if (sects["removed"].length) {
                leftAttrs["class"] = "diff-removed";
            } else {
                rightAttrs["class"] = "diff-added";
            }

            res.appendChild(tr(
                td(leftAttrs, T(sects["removed"].join("\n"))),
                td(rightAttrs, T(sects["added"].join("\n")))
            ));

            sects["added"].length = sects["removed"].length = 0;
        }

        if ("oldfile" === lineType) {
            res.appendChild(tr(th(T(l.substring(4)))));
        } else if ("newfile" === lineType) {
            res.lastChild.appendChild(th(T(l.substring(4))));
        } else if ("range" === lineType) {
            var ds = td(
                {"class": "diff-section", "colspan": 2},
                T(l.split("@@").join(" "))
            );
            res.appendChild(tr(ds));
        } else if ("eof" !== lineType) {
            sects[lineType].push(l.substring(1));
        }
    });

    return res;
}

// Clone the HTML node collection before modifying the DOM.
var diffs = Array.prototype.slice.call(document.getElementsByClassName("diff"));
diffs.forEach(function (diff) {
    diff.replaceChild(formatDiff(diff.firstChild.textContent), diff.firstChild);
});
The code at the bottom reformats any elements of class diff: see it in action!

Here's a small unit test:

--- file1
+++ file2
@@ -0,+1 @@ something
 0
 1
-2
+2
+3
-3
-4
+4
-5
 6
+7
 8

There are already various source code highlighters out there so arguably this posting could be made to look a lot nicer. :-)

Why does pulseaudio deliberately ignore my sound card?

After installing the Gnome desktop on my new NetBSD machine, I was over the moon about having the sleekest NetBSD system I'd ever used by a long way, approaching the nearly immaculate out-of-the-box experience of recent Fedora or OpenSolaris distributions. However, I could not get any sound from the video player totem, the messaging client Pidgin or even the file browser nautilus. Since pulseaudio is in charge of sound output nowadays, it was the place to look for the cause of the problem.

In the default configuration, pulseaudio uses a module called module-hal-detect which has an interesting check to "detect" only the audio device which happens to be assigned index 0 by hald. The commit comment admits that's what it's doing but leaves us in the dark as to why except that it's also done for ALSA. On my machine, device 0 happens to be the HDMI sound interface of my video card which I'd want to use if I watched a movie on a digitally connected TV. My regular sound card gets assigned device number 1 and hence pulseaudio never becomes aware of it.

After recognising the cause, the problem was easily fixed by commenting out the offending test:

--- src/modules/module-hal-detect.c.orig        2009-01-12 23:10:34.000000000 +0000
+++ src/modules/module-hal-detect.c
@@ -253,7 +253,7 @@ static int hal_oss_device_is_pcm(LibHalC
             goto finish;
 
     device = libhal_device_get_property_int(context, udi, "oss.device", error);
-    if (dbus_error_is_set(error) || device != 0)
+    if (dbus_error_is_set(error) /* || device != 0 */)
         goto finish;
 
     r = 1;
Now pulseaudio happily recognises my sound card, and after selecting the newly appeared device as my default, I finally have sound -- but not before wiping my ~/.pulse directory which still mapped the output of all the programs I had used to the old device (which used to be the default) with no obvious GUI to change it (it's stored in the gdbm database ~/.pulse/<host>:stream-volumes.<arch>--<os>.gdbm if you must know).

Now I have to admit that I'm slightly behind because I'm using pulseaudio 0.9.14 which is currently in pkgsrc instead of the current version 0.9.15. In the meantime, two things have happened: First, someone had my exact problem so (while still owing an explanation for the check in the first place) module-hal-detect got an argument subdevs that makes it recognise all sound cards in the system and which was later renamed to subdevices. Second, the entire hal support was deprecated in line with the abandonment of hald in favour of udev, which I don't know anything about including whether it will work on NetBSD. Oh well, something to look forward to in pulseaudio 0.9.16 I guess...

Shell configuration

Each time I log on to a system for the first time, I get reminded how accustomed I am to my normal shell environment. There are commands that I use unconsciously many times a minute, and if they don't work I keep getting annoyed at the resulting error messages. So it's never long before I tweak the shell configuration to match my required environment in order to be able to do some work. Some shells don't require many changes (sh, ksh), while others need a lot of persuasion before they behave in a remotely sane way (bash).

Everything used to be very simple: each user had a .profile that was executed once every time they logged in. It was the place to lift your spirits with an amusing message, be let down by a calendar of the day's events in history, and set up environment variables for the day's work that all the child processes could inherit.

It was not a good place to set up shell functions, aliases etc. because as soon as a child shell was spawned (from inside an editor or by opening a new terminal window), those settings would be forgotten. That's where the environment variable ENV came in: if it contains a file name, this file gets executed each time a shell process starts (and after .profile is finished). It follows from the above that each child shell that inherits the login shell's environment will run the contents of this file. It is therefore the perfect place to set up shell functions, aliases etc. that should always be available. As an added bonus a user becoming a different user (for example by using su) can take their shell configuration with them unless they explicitly reset their environment variables.

In an environment like this, my .profile usually looks like this:

PATH=$HOME/bin:$PATH

export EDITOR=vi
export PAGER=more

# To shut up perl when gdm has set LANG=de_DE.UTF-8:
export LC_COLLATE=C
export LC_NUMERIC=C

ulimit -p 320 # -j8 compiles may fail with the default -p 160

export ENV=$HOME/.shrc
This sets up my favourite editor and pager. On GNU systems, more tends to be terminally feature-starved, while less is unable to show short files without clearing the entire screen and requires an extra keystroke. While "less -E" gets rid of the last problem, clearing the screen on terminals that support it means you cannot see short files at all -- in which case "less -EX" is actually needed to emulate more's time-tested standard behaviour.

The locale variables get around limitations of NetBSD's implementation that upset Perl in particular. The last line specifies that I would like for the file .shrc in my home directory to be read by every shell instance.

The .shrc file usually contains something like the following:

ps1="$PS1"
if [ -f /etc/shrc ]; then
        . /etc/shrc
fi

case "$-" in *i*)
        # interactive mode settings go here
        cd() { command cd "$@" && PS1="${USER}@${HOST%%.*}:${PWD##*?/}$ps1"; }
        PS1="${USER}@${HOST%%.*}:${PWD##*?/}$ps1"
        set -o vi
        alias ls='ls -F'
        alias la='ls -a'
        alias ll='la -l'
        alias ..='cd ..'
        alias m="$PAGER"
        alias hd='hexdump -C'
        ;;
esac
In it I first save the shell's prompt, then run the system's shrc file (only on machines on which I don't object to the contents). If the shell is interactive (as indicated by the occurrence of the letter 'i' in $-), I lastly set command line editing to vi-mode, configure some aliases and define a cd() function that displays the current directory in the shell prompt.
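
The two parameter expansions used in the prompt are easy to misread, so here is a small sketch that isolates them. The function names and the host name build.example.org are made up for illustration.

```shell
#!/bin/sh
# ${VAR%%.*} removes the longest suffix starting at the first dot,
# leaving the short host name.
short_host() { printf '%s\n' "${1%%.*}"; }

# ${VAR##*?/} removes the longest prefix that ends in a slash and has at
# least one character before that slash, leaving the last path component.
last_component() { printf '%s\n' "${1##*?/}"; }

short_host build.example.org          # hypothetical host name
last_component /usr/pkgsrc/pkgtools
last_component /
```

This prints build, pkgtools and /. The ? in the pattern is what keeps the root directory from being reduced to an empty string.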

Unfortunately, these perfectly simple and reasonable mechanisms were not considered adequate by some, who created shells with much more elaborate, intricate startup behaviours. Coercing such shells to support this simple scheme faithfully requires implementing a number of files that get executed and ignored at times which are so impenetrably documented that it is best left to experimentation.

Setting up /etc/mk.conf for NetBSD and pkgsrc

For my NetBSD builds, I usually set
MKCRYPTO_IDEA=  yes
MKCRYPTO_MDC2=  yes
MKCRYPTO_RC5=   yes
MKDEBUGLIB=     yes
in /etc/mk.conf. The first three build several cryptography libraries which can only be a good thing (I don't think I've ever actually needed one of those).

The effect of MKDEBUGLIB is to build, for each system library, an additional version with source file and line number information whose name has _g appended. With these libraries I hope to be able to do source-level debugging inside the system libraries, which is pretty sweet if you happen to run GNOME and regularly get random crashes inside libpthread, for example.

The remainder of my /etc/mk.conf file is dedicated to pkgsrc:

.ifdef BSD_PKG_MK
# For package development, setting WRKOBJDIR is inconvenient
#WRKOBJDIR=     /usr/obj/pkgsrc
PKG_DEVELOPER=  yes

UPDATE_TARGET=  bin-install
BINPKG_SITES=   # prevent bin-install from accessing ftp
USE_DESTDIR=    yes
MAKE_JOBS=      8
SKIP_LICENSE_CHECK=yes

PKG_DEFAULT_OPTIONS+=   gnome
# Make Firefox look like Firefox
PKG_OPTIONS.firefox+=   official-mozilla-branding

ALLOW_VULNERABLE_PACKAGES=yes # x11/vte

.if exists(/usr/pkg/bin/sudo)
SU_CMD=                 /usr/pkg/bin/sudo /bin/sh -c
.endif
.endif
The interesting value for SU_CMD is needed because pkgsrc passes a shell script to be executed as a single argument, while sudo only supports running a single command with some arguments.
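
The difference can be tried out without root privileges. In this sketch env stands in for sudo, which is an assumption made purely for illustration: like sudo, env executes exactly one command with its arguments and does not interpret shell syntax.

```shell
#!/bin/sh
# Assumption: env is used as a harmless stand-in for sudo.
runner=env
script='cd / && pwd'

# Handing the whole script over as the command fails: there is no program
# called "cd / && pwd".
if $runner "$script" 2>/dev/null; then
	direct=ok
else
	direct=failed
fi

# Wrapping it in /bin/sh -c turns the script into a single argument to sh,
# which is the shape that SU_CMD above provides.
wrapped=$($runner /bin/sh -c "$script")

echo "direct: $direct, wrapped: $wrapped"
```

This prints "direct: failed, wrapped: /".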

I always use make update to install packages, even if they aren't installed yet, because it automatically builds, installs and cleans packages. The target it runs is configurable through UPDATE_TARGET, so if I ever change my mind I can simply change the variable instead of my habit of typing make update.

Finding a VoIP client that works

For a while I've tried to find an Open Source VoIP client that works. I try SIP Communicator regularly, but it fails to connect to the test numbers of all my different SIP accounts. When I tried QuteCom it seemed to work, so I kept it.

The other day I had to attend a phone conference, so I used QuteCom to dial into the conference system. It was my first time using that system, and somehow I couldn't make it let me join the conference. I tried all the codes I had in all the different menus it was offering, to no avail.

After much frustration (which involved a colleague setting up a conference call for me to test with), I got suspicious that the system didn't seem to react to all my key presses. So I checked my VoIP provider's homepage, which recommended X-Lite, except for Windows 7, which I was using. Instead, it offered a link to PhonerLite, which I installed. And surprise, I was suddenly able to operate the conference system.

I think it's quite surprising that QuteCom seems to generate dial tones that are close to those required by the standard, but not close enough to be interoperable. What's more, all the VoIP clients mentioned above, including SJPhone which I use on OS X, have the ugliest user interfaces I have had to use in a very long time, ironically in some cases due to lack of styling and in others because of its exuberance. Why isn't there a reliable, easy-to-use open source VoIP client?

Updating packages in NetBSD

Assuming you're using pkgsrc for your packages, updating regularly can be a bit of a nightmare because make update tends to remove most of your stuff and then regularly fails building the new ones, rendering your computer an expensive doorstop.

To alleviate this, I have used chroot environments to build my packages for a long time. Until recently I used mk/bulk/mksandbox which comes with pkgsrc, but is a little rough around the edges. For example, the unmodified version stopped working when X Window moved to /usr/X11R7 because it only null-mounts /usr/X11R6 (this is fixed). Also while the null mounts are very frugal, they are a bit messy and don't allow building in a different environment from the one your main system is using (such as i386 on an amd64, for example).

Reluctantly, I tried pkgtools/pkg_comp because it felt like overkill using a package for such a simple task. It turns out pkg_comp is comfortably light-weight, so switching from my old jail was a snap.

The following upgrade method makes sure that only packages that actually changed are rebuilt. Unfortunately it can lead to problems when dynamic libraries in the base system are upgraded. Recently, for example, the version number of /usr/lib/libintl changed from 0.0 to 1.0 on NetBSD. My upgrade method will produce new packages linked against version 1.0, while unchanged old packages still use version 0.0, which can lead to problems when the programs are being run. In this case, the only proper way to upgrade is to rebuild all packages that one wants to install by moving the existing /usr/pkgsrc/packages out of the way. For more discussion and other ways to upgrade packages, also see the NetBSD wiki.

To update, first I get a new pkgsrc tree:

$ cd /usr/pkgsrc
$ mv cvs.up cvs.up.0
$ cvs up -A 2>&1 | tee cvs.up
$ grep '^[^cUP]' cvs.up
Then I create a pkg_chk configuration file that lists all my packages:
$ pkg_chk -g

I then save all my binary packages and clean any obsolete packages and distfiles using lintpkgsrc. Removing outdated binary packages is essential because the bin-install target I use to build the new packages will silently pick up stale dependencies instead of rebuilding them, and unfortunately pkg_chk isn't clever enough to build all required packages first to prevent this from happening.

$ rm -r packages.old
$ mkdir -p packages.old/All.old
$ ln packages/All/* packages.old/All.old
$ lintpkgsrc -mopr
Then I build a fresh chroot environment and jump in:
$ sudo su
# pkg_comp removeroot
# pkg_comp makeroot
# pkg_comp build pkgtools/pkg_chk
# pkg_comp chroot
After going around mindlessly deleting packages using lintpkgsrc above, our binary packages are in a somewhat precarious state: the dependencies of some binary packages may be missing, meaning that a pkg_add of those packages will fail. Unfortunately, when this happens pkg_chk gives up instead of resorting to building the package and all its dependencies from source.

One possible workaround is to pass the -s switch to pkg_chk to prevent it from ever trying to install binary packages. Because I set UPDATE_TARGET=bin-install in /etc/mk.conf (see the pkg_comp configuration file), the update target invoked by pkg_chk will use binary packages if they exist and can be installed successfully. If pkg_add fails, the package and its dependencies are automatically rebuilt.

Although this does the trick, it is undesirable because it means we get many spurious runs of clean CLEANDEPENDS=yes actions which would not occur if pkg_chk had used pkg_add instead of make update. At least three solutions come to mind:

  • Fix pkg_chk to first update all dependencies required by a package so pkg_add doesn't fail
  • Fix pkg_chk so it behaves like bin-install and tries to build the package from source if pkg_add fails
  • Fix the update target so it doesn't clean CLEANDEPENDS=yes if pkg_add succeeded
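The second idea, falling back to a source build when the binary install fails, could be sketched roughly as follows. This is not how pkg_chk works today; install_or_build, BIN_INSTALL and SRC_BUILD are hypothetical names, and the hooks default to harmless stubs so the sketch runs anywhere:

```shell
#!/bin/sh
# install_or_build PKG: try a binary install first, build from source on failure.
# BIN_INSTALL and SRC_BUILD are hypothetical hooks, not pkg_chk options.
: "${BIN_INSTALL:=false}"   # stub: pretend the binary install always fails
: "${SRC_BUILD:=true}"      # stub: pretend the source build always succeeds

install_or_build() {
        pkg=$1
        if $BIN_INSTALL "$pkg" >/dev/null 2>&1; then
                echo "binary: $pkg"
        elif $SRC_BUILD "$pkg"; then
                echo "source: $pkg"
        else
                echo "failed: $pkg" >&2
                return 1
        fi
}

install_or_build pkgtools/pkg_chk
```

On a real system, BIN_INSTALL might point at pkg_add and SRC_BUILD at a small wrapper that runs make in the package directory, but that wiring is left out here.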
The final steps are building all packages in the chroot and then installing them in the real world:
# pkg_chk -a -r -s 2>&1 | tee /p/pkg_chk-ars.log
# ^D
# cd /usr/pkgsrc
# pkg_chk -a -r -b 2>&1 | tee pkg_chk-arb.log
# ^D
$
Redirecting the pkg_chk log file in the chroot environment is necessary because pkgsrc is mounted appropriately read-only. The log file will appear in /var/chroot/pkg_comp/default/p/.

It can't hurt to read the install messages for the installed packages using

$ pkg_info -Da
and do as they suggest (create users and groups, copy files from /usr/pkg/share/examples/rc.d to /etc/rc.d etc.)

Lastly, and mostly because I just spent a couple of hours looking for it (and reading some other interesting material in the process), here's a command to show you beforehand what make update will update:

$ make show-needs-update
defined in mk/flavor/pkg/utility.mk (obviously). And there are more treats in this file:
$ make show-installed-depends # alias sid
$ make show-depends-options

Setting up chroot NetBSD

To simplify building packages and trying other risky things with the system, I use chroot environments set up using pkgtools/pkg_comp. Here is my ~/pkg_comp/default.conf configuration:
AUTO_TARGET="bin-install"
BUILD_TARGET="bin-install"
DISTRIBDIR="/usr/obj/releasedir/amd64"
DESTDIR="/var/chroot/pkg_comp/default"
REAL_PACKAGES="/usr/pkgsrc/packages.x86_64"
ROOTSHELL="/bin/sh"
PKG_DEFAULT_OPTIONS="gnome official-mozilla-branding"
MKCONF_VARS="$MKCONF_VARS PKG_DEFAULT_OPTIONS"
USE_DESTDIR="yes"
MKCONF_VARS="$MKCONF_VARS USE_DESTDIR"
UPDATE_TARGET="bin-install"
MKCONF_VARS="$MKCONF_VARS UPDATE_TARGET"
BINPKG_SITES=""
MKCONF_VARS="$MKCONF_VARS BINPKG_SITES"
MAKE_JOBS="8"
MKCONF_VARS="$MKCONF_VARS MAKE_JOBS"
SKIP_LICENSE_CHECK="yes"
MKCONF_VARS="$MKCONF_VARS SKIP_LICENSE_CHECK"
It sets up an environment suitable for building packages. This will only work after build.sh release was used to build a NetBSD release in /usr/obj/releasedir.
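
The configuration file is plain shell, and each MKCONF_VARS="$MKCONF_VARS NAME" line appends one more variable name to the list of settings that should end up in the sandbox's mk.conf. The sketch below imitates that accumulation and renders the named variables; render_mkconf is a hypothetical helper, and the exact format pkg_comp writes may differ:

```shell
#!/bin/sh
# Mimics how a sourced configuration accumulates MKCONF_VARS and how the
# named variables could then be rendered as mk.conf-style assignments.
# render_mkconf is a hypothetical helper, not part of pkg_comp.
MKCONF_VARS=""
UPDATE_TARGET="bin-install"
MKCONF_VARS="$MKCONF_VARS UPDATE_TARGET"
MAKE_JOBS="8"
MKCONF_VARS="$MKCONF_VARS MAKE_JOBS"

render_mkconf() {
        for var in $MKCONF_VARS; do
                # Indirect lookup: expand the variable whose name is in $var.
                eval "val=\$$var"
                echo "$var=$val"
        done
}

render_mkconf
```

Forgetting the second line of a pair means the variable is set in the configuration but never listed, so it would not be passed on.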

Inspired by Mike Volokhov's response to Jared McNeill's post about building 32bit packages, I then set up a 32bit build environment: First, I built a 32bit NetBSD release in /usr/obj/releasedir by following my normal procedure and adding the argument -m i386 to build.sh. Here is my ~/pkg_comp/32bit.conf that sets up a 32bit environment to build packages (I'm not sure yet what this could be used for, maybe for building JDK):

AUTO_TARGET="bin-install"
BUILD_TARGET="bin-install"
DISTRIBDIR="/usr/obj/releasedir/i386"
DESTDIR="/var/chroot/pkg_comp/32bit"
REAL_PACKAGES="/usr/pkgsrc/packages.i386"
ROOTSHELL="/bin/sh"
PKG_DEFAULT_OPTIONS="gnome official-mozilla-branding"
MKCONF_VARS="$MKCONF_VARS PKG_DEFAULT_OPTIONS"
USE_DESTDIR="yes"
MKCONF_VARS="$MKCONF_VARS USE_DESTDIR"
UPDATE_TARGET="bin-install"
MKCONF_VARS="$MKCONF_VARS UPDATE_TARGET"
BINPKG_SITES=""
MKCONF_VARS="$MKCONF_VARS BINPKG_SITES"
MAKE_JOBS="8"
MKCONF_VARS="$MKCONF_VARS MAKE_JOBS"
SKIP_LICENSE_CHECK="yes"
MKCONF_VARS="$MKCONF_VARS SKIP_LICENSE_CHECK"
Simply use this with pkg_comp -c 32bit.

Both configurations use the user-destdir method of building packages, which means that packages get installed underneath their work directory instead of /usr/pkg, and the binary package is created from there. To really install a package in /usr/pkg there is a package-install target. To get rid of the binary package you can use package-remove.

This procedure has obvious advantages over the old method where it was basically impossible to retroactively build a binary package for an already installed package.

I noticed that amd64 NetBSD has 32bit libraries installed in /usr/lib/i386, so it can run 32bit binaries unmodified.

NetBSD update procedure

To keep myself from constantly forgetting the correct steps to update my NetBSD system, or issuing them in a sub-optimal order, I took some notes today:

Before using CVS for the first time, it is advisable to set up a ~/.cvsrc file with at least the following contents to make some standard commands work more as expected:

update -Pd
checkout -P
diff -pU8
Now we can update the source code (remembering that X Window lives in a separate directory):
$ cd /usr/xsrc
$ mv cvs.up cvs.up.0
$ cvs up -A 2>&1 | tee cvs.up
$ cd /usr/src
$ mv cvs.up cvs.up.0
$ cvs up -A 2>&1 | tee cvs.up
To find out whether there were any changes at all (you do update every day, don't you?), use
$ grep '^[^cPU]' cvs.up
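The pattern drops every line that starts with U or P (files that were updated or patched) and the "cvs update: Updating ..." progress lines, which start with a lowercase c. What remains are locally modified files (M), conflicts (C), unknown files (?) and the like. Here is a small sketch with a made-up log; cvs_changes is a hypothetical wrapper name, and the pattern is quoted so the shell cannot treat the brackets as a file name glob:

```shell
#!/bin/sh
# cvs_changes LOG: print only the lines of a saved "cvs up" log that need
# attention, i.e. everything except U/P lines and "cvs update:" chatter.
cvs_changes() {
        # Quoted so the shell passes the pattern to grep unchanged.
        grep '^[^cPU]' "$1"
}

# Sample log with made-up file names.
log=$(mktemp) || exit 1
cat > "$log" <<'EOF'
cvs update: Updating sys/kern
U sys/kern/kern_exec.c
P sys/kern/kern_fork.c
M sys/kern/kern_exit.c
C sys/kern/kern_sig.c
? sys/kern/notes.txt
EOF
changes=$(cvs_changes "$log")
echo "$changes"
rm -f "$log"
```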
After checking any entries in UPDATING, either risk losing 20 minutes on a build that trips over stale objects, or simply do
$ rm -fr /usr/obj/amd64
Time to build and install a new kernel and reboot:
$ sh build.sh -j 8 -U \
        -M /usr/obj/amd64 -R /usr/obj/releasedir \
        tools kernel=$HOME/NetBSD/YACHT 2>&1 | \
        tee build-yacht.log
$ cd /usr/obj/amd64/usr/src/sys/arch/amd64/compile/YACHT
$ sudo su
# make install
# shutdown -r now
If anything seems dodgy after the machine comes up, quickly save your old kernel /onetbsd as /netbsd.works, for example, because after the next attempt /onetbsd will get overwritten.

In the next step, I use the release target to ensure I get all sets built correctly. While "distribution sets" builds some sets, it omits others (for example, kern-*.tgz), which I need for some purposes like setting up chroot environments.

Time to build and install the world, update /etc, perform post-install fixes and reboot again:

$ cd /usr/src
$ sh build.sh -x -j 8 -U \
        -M /usr/obj/amd64 -R /usr/obj/releasedir \
        release 2>&1 | \
        tee build-release-amd64.log | grep ===
$ sudo su
# sh build.sh -x -j 8 -U \
        -M /usr/obj/amd64 -R /usr/obj/releasedir \
        install=/ 2>&1 | \
        tee build-install.log
# S=/usr/obj/releasedir/amd64/binary/sets
# etcupdate -a -l -s $S/etc.tgz -s $S/xetc.tgz
# postinstall -s $S/etc.tgz -s $S/xetc.tgz fix ...
# shutdown -r now

As a side note, this procedure doesn't update the boot code, so when features are added this is needed:

$ sudo cp /usr/share/mdec/boot /boot

And this is what it looks like as a shell script:

#!/bin/sh

usage() {
        cat <<EOF
usage: $0 [-h] [-u]
        -h      print help
        -u      continue previous build
EOF
        exit $1
}

opt_u=
tools=tools
while [ $# != 0 ]
do
        case "$1" in
        -h)     usage 0
                ;;
        -u)     opt_u=-u
                tools=
                ;;
        *)      exec >&2
                echo "Invalid option $1"
                usage 1
                ;;
        esac
        shift
done

objdir=/usr/obj/amd64
releasedir=/usr/obj/releasedir
sets="$releasedir"/amd64/binary/sets
kernel=$HOME/NetBSD/YACHT

if [ -z "$opt_u" ]; then
        (cd /usr/xsrc && \
                { [ ! -f cvs.up ] || mv cvs.up cvs.up.0; } && \
                cvs up -PdA 2>&1 | tee cvs.up | grep '^C') &
        (cd /usr/src && \
                { [ ! -f cvs.up ] || mv cvs.up cvs.up.0; } && \
                cvs up -PdA 2>&1 | tee cvs.up | grep '^C') &
        echo "Starting CVS update..."
        wait || exit 1
        echo "Finished CVS update."
fi
cd /usr/src && \
        sh build.sh $opt_u -j 8 -U -M "$objdir".new -R "$releasedir" \
                $tools kernel="$kernel" 2>&1 | \
                tee build-"`basename "$kernel"`".log | \
                grep === || \
        exit 1
cd /usr/src && \
        sh build.sh $opt_u -x -j 8 -U -M "$objdir".new -R "$releasedir" \
                release 2>&1 | \
                tee build-release-amd64.log | \
                grep === || \
        exit 1
cd "$objdir".new/usr/src/sys/arch/amd64/compile/"`basename "$kernel"`" && \
        tail -30 /usr/src/build-"`basename "$kernel"`".log && \
        echo "$PWD make install" && \
        sudo make install || \
        exit 1
cd /usr/src && \
        tail -30 build-release-amd64.log && \
        echo "$PWD build.sh install=/" &&
        sudo sh build.sh -x -j 8 -U -M "$objdir".new -R "$releasedir" \
                install=/ 2>&1 | tee build-install.log || \
        exit 1
# move $objdir out of the way if it exists
([ -d "$objdir" ] && \
        rm -fr "$objdir".old && \
        mv "$objdir" "$objdir".old) &
sudo etcupdate -a -l -s "$sets"/etc.tgz -s "$sets"/xetc.tgz
echo "Removing old $objdir..."
wait
echo "Finished removing old $objdir..."
[ -d "$objdir" ] || mv "$objdir".new "$objdir"
echo "Don't forget to run postinstall!"
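
One caveat about the script: in a pipeline such as build.sh ... | tee ... | grep === || exit 1, the exit status tested by || is that of grep, the last command, not that of build.sh. A failed build that printed at least one "===" line would therefore not stop the script. A minimal sketch of one way to keep the command's own status in plain sh; run_logged is a hypothetical helper and not part of the script above:

```shell
#!/bin/sh
# run_logged LOG CMD...: run CMD, copy its output to LOG, show only lines
# containing "===", and return CMD's own exit status rather than grep's.
run_logged() {
        log=$1
        shift
        status_file=$(mktemp) || return 1
        # Record the command's status before the pipe hides it.
        # "|| true": grep matching nothing is not treated as an error.
        { "$@" 2>&1; echo $? > "$status_file"; } | tee "$log" | grep '===' || true
        status=$(cat "$status_file")
        rm -f "$status_file"
        return "$status"
}
```

It could then be used as run_logged build-release-amd64.log sh build.sh ... release || exit 1, with the build.sh arguments as in the script.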

NetBSD vesa boot

In an attempt to test Jared McNeill's recent contributions to a boot splash in NetBSD, I updated my boot code today and typed
vesa list
at the boot prompt. I chose
vesa 1600x1200x32
and the boot messages appeared very slowly in high resolution in a frightening resemblance to Linux. Here are the relevant dmesg differences:
@@ -51,11 +51,11 @@
 ppb1: unsupported PCI Express version
 pci2 at ppb1 bus 2
 pci2: i/o space, memory space enabled, rd/line, wr/inv ok
-vga0 at pci2 dev 0 function 0: vendor 0x1002 product 0x954f (rev. 0x00)
-wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
+genfb0 at pci2 dev 0 function 0: vendor 0x1002 product 0x954f
+genfb0: framebuffer at 0xd0000000, size 640x480, depth 8, stride 640
+wsdisplay0 at genfb0 kbdmux 1: console (default, vt100 emulation)
 wsmux1: connecting to wsdisplay0
-radeondrm0 at vga0: ATI Radeon HD 4350
-radeondrm0: Initialized radeon 1.29.0 20080613
+drm at genfb0 not configured
 azalia0 at pci2 dev 0 function 1: Generic High Definition Audio Controller
 azalia0: interrupting at ioapic1 pin 10
 azalia0: host: 0x1002/0xaa38 (rev. 0), HDA rev. 1.0
@@ -375,18 +375,12 @@
 boot device: wd0
 root on wd0a dumps on wd0b
 root file system type: ffs
-wsdisplay0: screen 1 added (80x25, vt100 emulation)
-wsdisplay0: screen 2 added (80x25, vt100 emulation)
-wsdisplay0: screen 3 added (80x25, vt100 emulation)
-wsdisplay0: screen 4 added (80x25, vt100 emulation)
+wsdisplay0: screen 1 added (default, vt100 emulation)
+wsdisplay0: screen 2 added (default, vt100 emulation)
+wsdisplay0: screen 3 added (default, vt100 emulation)
+wsdisplay0: screen 4 added (default, vt100 emulation)
Unfortunately, the fact that radeondrm didn't attach meant that X Window wasn't accelerated which is not tolerable. Also, I learned later from Hubert's post that setting the splash image wasn't actually implemented yet (as I'd foolishly assumed). So I turned it back off in my /boot.cfg and will give it another try later.

One important lesson I learned is that the /boot code doesn't actually get updated by my regular update routine, so I should make a note to issue a cp /usr/share/mdec/boot /boot every once in a while.
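
Rather than relying on memory, a quick comparison can tell whether the installed boot code has fallen behind. This is only a sketch: boot_outdated is a hypothetical helper, its default paths are the ones mentioned above, and it only detects that the two files differ, not which one is newer:

```shell
#!/bin/sh
# boot_outdated [NEW [INSTALLED]]: succeed if the installed boot code
# differs from the freshly installed copy in mdec, or is missing.
boot_outdated() {
        new=${1:-/usr/share/mdec/boot}
        installed=${2:-/boot}
        # cmp -s is silent and fails when the files differ or one is missing.
        ! cmp -s "$new" "$installed"
}
```

For example, boot_outdated && echo "time to copy /usr/share/mdec/boot to /boot" could run at the end of the update routine.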