Installing NetBSD on a DEC Multia by booting over the network

As it happens, I own a DEC Multia which has been sitting around for far too long and deserves some of my attention.

Unfortunately, when I boot from the built-in SCSI disc, the NetBSD bootstrap cannot find /boot and crashes. I then get back to the SRM console. I don't know what I did to the system years ago, but it's not booting.

The Multia has a built-in floppy drive, but it's my only floppy drive, so I can't create boot floppies for it (even though I still have an old pack of disks).

So I decided to try installation via bootp/tftp. On my OpenWRT router, I configured a static IP address for my Multia's MAC address since bootp doesn't support dynamic assignments. I then set the tftp path to /tmp/tftp since that partition has enough space on my router.

I'm following the instructions in NetBSD's diskless(8) manual page for setting up a network boot.

NetBSD's netboot secondary stage boot loader goes into the /tmp/tftp/ directory and I set the tftp file name to netboot.
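
The router-side setup so far (static lease, TFTP directory, boot file name) can be sketched as a handful of uci commands. This is only a dry run under assumptions: the helper name print_netboot_uci and the host name multia are made up for illustration, the MAC and IP address are placeholders (192.0.2.10 is a documentation address), and the option names enable_tftp, tftp_root and dhcp_boot are the ones I believe OpenWRT's dnsmasq integration uses, so check them against your release. The function prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Dry run: print the uci commands for a static lease plus TFTP boot file.
# Nothing is executed here; review the output, then run it on the router.
print_netboot_uci() {
    mac=$1      # client MAC address (placeholder)
    ip=$2       # static IP address for the client (placeholder)
    cat <<EOF
uci add dhcp host
uci set dhcp.@host[-1].name='multia'
uci set dhcp.@host[-1].mac='$mac'
uci set dhcp.@host[-1].ip='$ip'
uci set dhcp.@dnsmasq[0].enable_tftp='1'
uci set dhcp.@dnsmasq[0].tftp_root='/tmp/tftp'
uci set dhcp.@dnsmasq[0].dhcp_boot='netboot'
uci commit dhcp
EOF
}

print_netboot_uci xx:xx:xx:xx:xx:xx 192.0.2.10
```

Since /tmp is usually a RAM disk on OpenWRT, the netboot file has to be copied there again after a router reboot.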

Here the trouble starts: the boot loader needs to have the client's MAC address hardcoded into it because it can't find it out by itself. This is the job of setnetbootinfo(8), which of course needs another NetBSD system to run on.

Since I only have an OpenBSD system running on Intel handy, I compile their setnetbootinfo program:

$ cc -o setnetbootinfo \
    -I /usr/src/sys/arch/alpha/stand \
    /usr/src/sys/arch/alpha/stand/setnetbootinfo/setnetbootinfo.c

I use it like this:

$ wget https://ftp.openbsd.org/pub/OpenBSD/6.4/alpha/netboot
$ ./setnetbootinfo -a xx:xx:xx:xx:xx:xx netboot

It also works on NetBSD's netboot loader and it now advances one step further, trying to find out the kernel server and file name by using bootp.

The above-mentioned manual page has the following sample configuration for NetBSD's dhcpd:

host myclient {
        hardware ethernet 8:0:20:7:c5:c7;
        fixed-address myclient;         # client's assigned IP address
        filename "myclient.netboot";    # secondary bootstrap
        next-server myserver;           # TFTP server for secondary bootstrap
        option swap-server myserver;    # NFS server for root
        option root-path "/export/myclient/root";
}

OpenWRT runs dnsmasq instead of dhcpd, so I need to translate the swap-server and root-path options to it.

On my OpenWRT box, I get this (abridged for brevity):

# dnsmasq --help dhcp
Known DHCP options:
[...]
 16 swap-server
[...]
 17 root-path
[...]

This suggests a dnsmasq(1) command line like this:

# dnsmasq \
        --dhcp-option=option:swap-server,0.0.0.0 \
        --dhcp-option=option:root-path,/data/netboot

This can be set up using the following uci commands:

# uci add_list 'dhcp.@dnsmasq[0].dhcp_option=option:swap-server,0.0.0.0'
# uci add_list 'dhcp.@dnsmasq[0].dhcp_option=option:root-path,/data/netboot'
# uci commit

This should add two lines in the config dnsmasq section of /etc/config/dhcp:

list dhcp_option 'option:swap-server,0.0.0.0'
list dhcp_option 'option:root-path,/data/netboot'

Afterwards, /etc/init.d/dnsmasq restart activates the changes.
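
A quick sanity check that both options really ended up in the config file can look like the sketch below. It assumes the list entries have exactly the shape shown above; check_netboot_opts is a hypothetical helper name, not part of OpenWRT. A misspelled option name (for example server-swap instead of swap-server) shows up as "missing".

```shell
#!/bin/sh
# Report whether both DHCP options needed by netboot appear in a
# uci dhcp config file (default: /etc/config/dhcp).
check_netboot_opts() {
    conf=${1:-/etc/config/dhcp}
    status=0
    for opt in swap-server root-path; do
        if grep -q "dhcp_option 'option:$opt," "$conf"; then
            echo "found:   $opt"
        else
            echo "missing: $opt"
            status=1
        fi
    done
    return $status
}
```

On the router, calling check_netboot_opts without arguments inspects /etc/config/dhcp; the exit status is non-zero if either option is absent.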

I now get stuck at "bootp: no reply", exactly like Tobias, whose problem turned out to be his BNC connection, whereas I'm using twisted pair. In my case, the problem turned out to be caused by a fix for »broken« firmware, as described in PR/6446. After finding another working Alpha box, I was able to patch netboot and get it working (see below).

Eventually I found a netboot image for Debian Lenny (the last release to support Alpha) in their archive. The Linux boot works without a second-stage boot loader: the SRM console's BOOTP/TFTP request loads the boot loader and kernel together. The boot loader asks for kernel arguments at the aboot> prompt and then proceeds to load the kernel. However, there is no further reaction after the message

aboot: starting kernel network with arguments

Back to the NetBSD netboot loader: it seems I need a little patch in if_prom.c to remove some code that looks like it was added to work around an unspecified bug in certain firmware versions:

Index: if_prom.c
===================================================================
RCS file: /pub/NetBSD-CVS/src/sys/arch/alpha/stand/netboot/if_prom.c,v
retrieving revision 1.19
diff -u -p -u -8 -r1.19 if_prom.c
--- if_prom.c   13 Mar 2003 14:15:58 -0000      1.19
+++ if_prom.c   9 Apr 2019 06:09:21 -0000
@@ -98,26 +98,24 @@ prom_get(struct iodesc *desc, void *pkt,
        prom_return_t ret;
        time_t t;
        int cc;
        char hate[2000];
 
        t = getsecs();
        cc = 0;
        while (((getsecs() - t) < timeout) && !cc) {
-               if (broken_firmware)
+               if (0 && broken_firmware)
                        ret.bits = prom_read(booted_dev_fd, 0, hate, 0);
                else
                        ret.bits = prom_read(booted_dev_fd, sizeof hate, hate, 0);
                if (ret.u.status == 0)
                        cc = ret.u.retval;
        }
-       if (broken_firmware)
-               cc = min(cc, len);
-       else
+       if (len < cc)
                cc = len;
        memcpy(pkt, hate, cc);
 
        return cc;
 }
 

What is strange here is that, unless broken_firmware is true, the original routine always returns the requested number of bytes, even if a timeout occurs. If broken_firmware is true, it always passes 0 to the PROM read routine as the number of bytes to read, which doesn't work (on my machine). The broken_firmware flag depends on whether the boot device name contains the hardware Ethernet address, which isn't the case on my system (hence the need for setnetbootinfo, see above). Still, I need to pass an actual buffer length rather than 0 in order to read successfully. For reference, the SRM console version is:

Multia SRM Console  BL5 V3.8-1, built on Jun 22 1995 at 18:10:45
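
To make the difference in the returned length concrete, here is a toy model in shell of the clamping at the end of prom_get(), before and after the patch. It only mirrors the arithmetic from the diff; the function names orig_len and patched_len and the sample sizes are made up for illustration, and no PROM is involved.

```shell
#!/bin/sh
# Toy model of the length that prom_get() hands back to its caller.
#   cc     = bytes the PROM actually delivered (0 on timeout)
#   len    = size of the caller's buffer
#   broken = value of the broken_firmware flag (0 or 1)

orig_len() {
    cc=$1 len=$2 broken=$3
    if [ "$broken" -ne 0 ]; then
        # cc = min(cc, len)
        if [ "$cc" -gt "$len" ]; then cc=$len; fi
    else
        # unconditional assignment: the real byte count is thrown away
        cc=$len
    fi
    echo "$cc"
}

patched_len() {
    cc=$1 len=$2
    # clamp only when the packet is larger than the caller's buffer
    if [ "$len" -lt "$cc" ]; then cc=$len; fi
    echo "$cc"
}

orig_len 0 1500 0       # timeout, flag clear: prints 1500
patched_len 0 1500      # timeout: prints 0
patched_len 2000 1500   # oversized read: prints 1500
```

With the flag clear, the original code reports a full buffer even though nothing was received; the patched version reports 0 on timeout and still never exceeds len.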

With this, I can successfully boot the NetBSD 8 installation kernel. Installing OpenBSD was then just a couple of commands away:

# mount_nfs jeopardy:/data/netboot /mnt
# dd if=/mnt/miniroot64.fs of=/dev/rsd0c
# reboot

Patching the OpenBSD installation process

To successfully install OpenBSD 6.4 and 6.5 with 88MB of RAM and a 512MB HDD, I needed to disable kernel relinking (KARL) because the machine has no swap space and runs out of memory, resulting in randomly killed ksh processes. So before running the installation, I removed that section from the installation script install.sub.

# ed install.sub
/Relink
                echo -n "Relinking to create unique kernel..."
-
        if [[ -f $_kernel_dir.tgz ]]; then
c
        if false; then
.
w
q
# ./install

Note: It might be sufficient to skip installing the kernel tgz package, but I haven't tried that.

After installation and before the first boot, I also modified the on-disk rc script to avoid kernel relinking:

# ed /mnt/etc/rc
/reorder_kernel
/usr/libexec/reorder_kernel &
s/^/#
#/usr/libexec/reorder_kernel &
w
q

Lastly, I turned off library address space layout randomisation as it takes a very long time during boot on this machine.

# echo library_aslr=NO >> /mnt/etc/rc.conf.local

For geeks (and Google), here is the NetBSD dmesg of the machine:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
    2018 The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 8.0 (INSTALL) #0: Tue Jul 17 14:59:51 UTC 2018
 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/alpha/compile/INSTALL
(PCI ISA), 267MHz, s/n
8192 byte page size, 1 processor.
total memory = 90112 KB
(2368 KB reserved for PROM, 87744 KB used by NetBSD)
avail memory = 75640 KB
timecounter: Timecounters tick every 0.976 msec
Kernelized RAIDframe activated
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), LCA-2 (21066)
lca0 at mainbus0
pci0 at lca0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
siop0 at pci0 dev 6 function 0: Symbios Logic 53c810 (fast scsi)
siop0: interrupting at isa irq 11
scsibus0 at siop0: 8 targets, 8 luns per target
sio0 at pci0 dev 7 function 0: vendor 8086 product 0484 (rev. 0x84)
tlp0 at pci0 dev 8 function 0: DECchip 21040 Ethernet, pass 2.3
tlp0: interrupting at isa irq 15
tlp0: Ethernet address 08:00:2b:e5:f0:1e
tlp0: 10baseT, 10baseT-FDX, 10base5, manual
vendor 0047 product 0280 (miscellaneous network, revision 0x47) at pci0 dev 9 function 0 not configured
tga0 at pci0 dev 11 function 0: DC21030 step B, board type T8-02
tga0: 1024 x 768, 8bpp, Bt485 RAMDAC
tga0: interrupting at isa irq 10
wsdisplay0 at tga0 (kbdmux ignored): console (std, vt100 emulation)
isa0 at sio0
lpt0 at isa0 port 0x3bc-0x3bf irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 (mux ignored): console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 (mux ignored)
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
mcclock0 at isa0 port 0x70-0x71: mc146818 compatible time-of-day clock
stray isa irq 3
stray isa irq 4
timecounter: Timecounter "clockinterrupt" frequency 1024 Hz quality 0
timecounter: Timecounter "PCC" frequency 266631696 Hz quality 1000
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0:  disk fixed
sd0: 518 MB, 4212 cyl, 4 head, 63 sec, 512 bytes/sect x 1061712 sectors
sd0: sync (100.00ns offset 8), 8-bit (10.000MB/s) transfers
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
md0: internal 4650 KB image area
root on md0a dumps on md0b
root file system type: ffs
kern.module.path=/stand/alpha/8.0/modules

And here is OpenBSD's:

[ using 1143368 bytes of bsd ELF symbol table ]
consinit: not using prom console
Copyright (c) 1982, 1986, 1989, 1991, 1993
 The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2018 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 6.4 (GENERIC) #264: Thu Oct 11 23:15:35 MDT 2018
    deraadt@alpha.openbsd.org:/usr/src/sys/arch/alpha/compile/GENERIC
(PCI ISA), 267MHz
8192 byte page size, 1 processor.
real mem = 92274688 (88MB)
rsvd mem = 2424832 (2MB)
avail mem = 78299136 (74MB)
mainbus0 at root
cpu0 at mainbus0: ID 0 (primary), LCA-2 (21066 pass 2)
lca0 at mainbus0
pci0 at lca0 bus 0
siop0 at pci0 dev 6 function 0 "Symbios Logic 53c810" rev 0x02: isa irq 11
scsibus0 at siop0: 8 targets, initiator 7
sd0 at scsibus0 targ 0 lun 0:  SCSI2 0/direct fixed serial.TOSHIBA_MK1924FBV_75M20420W_M_/_
sd0: 518MB, 512 bytes/sector, 1061712 sectors
sio0 at pci0 dev 7 function 0 "Intel 82378IB ISA" rev 0x84
de0 at pci0 dev 8 function 0 "DEC 21040" rev 0x23, DEC 21040 pass 2.3: isa irq 15, address 08:00:2b:e5:f0:1e
tga0 at pci0 dev 11 function 0 "DEC 21030" rev 0x02: DC21030 step B, board type T8-02
tga0: 1024 x 768, 8bpp, Bt485 RAMDAC
tga0: interrupting at isa irq 10
wsdisplay0 at tga0 mux 1: console (std, vt100 emulation)
isa0 at sio0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
lpt0 at isa0 port 0x3bc/4 irq 7
pcic0 at isa0 port 0x3e0/2 iomem 0xd0000/65536
pcic0 controller 0:  has sockets A and B
pcmcia0 at pcic0 controller 0 socket 0
pcmcia1 at pcic0 controller 0 socket 1
pcic0: irq 14, polling enabled
mcclock0 at isa0 port 0x70/2: mc146818 or compatible
stray isa irq 14
stray isa irq 7
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
siop0: target 0 now using 8 bit 10.0 MHz 8 REQ/ACK offset xfers
root on sd0a (942e3e0ab7215cc1.a) swap on sd0b dump on sd0b
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec