[OpenBIOS] Read errors from SILO booting Debian Wheezy/sparc64

Mark Cave-Ayland mark.cave-ayland at ilande.co.uk
Sun Apr 28 22:29:05 CEST 2013


On 27/04/13 07:06, Artyom Tarasenko wrote:

>> Is this a regression? If so, what was the last know good revision?
>
> Doesn't look like a regression. Checked the OpenBIOS image packaged with QEMU
> (they didn't update it recently) and it fails too.
> Just haven't tested it before. Usually I boot from a CD/DVD image or
> load the kernel directly with the -kernel option.

Ha okay - I finally figured this one out with a reproducible test case 
like this:

1) First drop the Linux filesystem cache:
echo 3 > /proc/sys/vm/drop_caches

2) Start up QEMU - this now blows up in SILO:
./qemu-system-sparc64 -hda 
/home/build/src/qemu/image/sparc64/debian-wheezy.qcow2 -boot c 
-nographic -snapshot

3) Load the debian-wheezy.qcow2 image into the filesystem cache
cat debian-wheezy.qcow2 > /dev/null

4) Restart QEMU again and the image boots
./qemu-system-sparc64 -hda 
/home/build/src/qemu/image/sparc64/debian-wheezy.qcow2 -boot c 
-nographic -snapshot

Once I could reproduce this at will, I recompiled OpenBIOS with 
CONFIG_DEBUG_IDE enabled and got the following output:

Jumping to entry point 0000000000004000 for type 0000000000000005...
switching to new context: entry point 0x4000 stack 0x00000000ffe86a39
SIDE - ob_ide_open: opening channel -1539032 unit 0
IDE DRIVE @ffe88428:
unit: 0
present: 1
type: 1
media: 32
model: QEMU HARDDISK
nr: 0
cyl: 8322
head: 16
sect: 63
bs: 512
IDE - ob_ide_block_size: ob_ide_block_size: block size 200
IDE - ob_ide_max_transfer: max_transfer 1fe00
IDE - ob_ide_read_blocks: ob_ide_read_blocks ffe86b20 block=0 n=1
IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=0 sectors=1
IDE - ob_ide_read_blocks: ob_ide_read_blocks ffe86b20 block=0 n=1
IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=0 sectors=1
IDE - ob_ide_read_blocks: ob_ide_read_blocks ffea9e98 block=2 n=2
IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=2 sectors=2
IDE - ob_ide_read_blocks: ob_ide_read_blocks 41dc block=5248 n=1
IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=5248 sectors=1
IDE - ob_ide_error: ob_ide_error drive<0>: timed out waiting for BUSY clear:
IDE - ob_ide_error:     cmd=20, stat=d0IDE - ob_ide_error:
IDE - ob_ide_pio_data_in: bytes=512, stat=d0
IDE - ob_ide_read_blocks: ob_ide_read_blocks: error
EXIT
0 >

Looking at drivers/ide.c, it was apparent that we were timing out 
waiting for the IDE event to finish:

/*
* wait for BUSY clear
*/
if (ob_ide_wait_stat(drive, 0, BUSY_STAT | ERR_STAT, &stat)) {
	ob_ide_error(drive, stat, "timed out waiting for BUSY clear");
	cmd->stat = stat;
	break;
}

And ob_ide_wait_stat() has a loop that looks like this:

for (i = 0; i < 5000; i++) {
	stat = ob_ide_pio_readb(drive, IDEREG_STATUS);
	if (!(stat & BUSY_STAT))
		break;

	udelay(1000);
}

This led me to wonder if there was a problem with udelay() returning too 
quickly and so I had a look at the implementation in 
arch/sparc64/openbios.c which looks like this:

  void udelay(unsigned int usecs)
  {
  }

Ah. So the problem is that we are not waiting long enough for the IDE 
command to finish which is why we are getting the error. A quick patch 
to get things working again for me is:

--- a/openbios-devel/arch/sparc64/openbios.c
+++ b/openbios-devel/arch/sparc64/openbios.c
@@ -547,6 +547,9 @@ void setup_timers(void)

  void udelay(unsigned int usecs)
  {
+    volatile int i;
+
+    for (i = 0; i < usecs * 100; i++);
  }

But we probably need a better solution for this that works for both 
SPARC32 and SPARC64. I've seen some code that waits for the RTC to tick 
1s and measures the number of loop iterations within that time to find a 
basic timing constant which looks quite appealing as it is simple and 
will work across both SPARC32 and SPARC64 (SPARC32 doesn't have a %tick 
register).

Blue - any thoughts on how to implement this?


ATB,

Mark.



More information about the OpenBIOS mailing list