Re: [Hampshire] soft lockup detected on CPU#0

Author: Simon Capstick
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] soft lockup detected on CPU#0
Samuel Penn wrote:
> Hi all,
>
> We have an 8-core Dell PowerEdge which we're running Gentoo on,
> which is basically just being used to run VMware Server (not ESX).
>
> Kernel/CPU details as follows:
>
> Linux hydra 2.6.19-gentoo-r5 #2 SMP Tue Oct 9 16:25:09 GMT 2007
> x86_64 Intel(R) Xeon(R) CPU E5335 @ 2.00GHz GenuineIntel GNU/Linux
>
> (later versions of the kernel cause problems for VMware).
>
> Occasionally, the machine hangs for a couple of minutes, during which
> we can't access it in any way. Eventually it comes back as if nothing
> has happened.
>
> According to dmesg, we get a "soft lockup detected on CPU#0" around
> the time that the hangs occur. There are also a large number of
> hda/ide errors in dmesg, though the only device on hda is the cdrom
> drive, which isn't being used (and the tray isn't open).
>
> There are some articles on LKML describing a similar error, though we
> haven't been able to apply anything there to our problem.
>
>
> Has anyone seen anything like this before? Suggestions?
>
>
> Some of the dmesg output is included below.
>
>
>
> end_request: I/O error, dev hda, sector 0
> hda: tray open
> end_request: I/O error, dev hda, sector 0
> hda: irq timeout: status=0xd0 { Busy }
> ide: failed opcode was: unknown
> hda: status timeout: status=0xd0 { Busy }
> ide: failed opcode was: unknown
> hda: drive not ready for command
> BUG: soft lockup detected on CPU#0!
>
> Call Trace:
> <IRQ> [<ffffffff80252a3f>] softlockup_tick+0xdb/0xed
> [<ffffffff80239bf5>] update_process_times+0x42/0x68
> [<ffffffff802181d0>] smp_local_timer_interrupt+0x34/0x55
> [<ffffffff80218874>] smp_apic_timer_interrupt+0x52/0x6a
> [<ffffffff8020a146>] apic_timer_interrupt+0x66/0x70
> [<ffffffff80439e1d>] ide_outb+0x0/0x9
> [<ffffffff80438b60>] ide_inb+0x4/0x8
> [<ffffffff80439d37>] ide_wait_stat+0xaa/0x110
> [<ffffffff80437b6c>] ide_do_request+0x437/0x983
> [<ffffffff80219573>] __unmask_IO_APIC_irq+0x4f/0x6f
> [<ffffffff802195b4>] unmask_IO_APIC_irq+0x21/0x35
> [<ffffffff80253990>] default_enable+0x18/0x21
> [<ffffffff80253936>] check_irq_resend+0x16/0x58
> [<ffffffff80438b40>] ide_timer_expiry+0x2bf/0x2db
> [<ffffffff8022ad6d>] rebalance_tick+0x170/0x369
> [<ffffffff80438881>] ide_timer_expiry+0x0/0x2db
> [<ffffffff802394a2>] run_timer_softirq+0x130/0x1a5
> [<ffffffff80236096>] __do_softirq+0x55/0xc3
> [<ffffffff8020a69c>] call_softirq+0x1c/0x28
> [<ffffffff8020bae3>] do_softirq+0x2c/0x7d
> [<ffffffff80218879>] smp_apic_timer_interrupt+0x57/0x6a
> [<ffffffff80208071>] mwait_idle+0x0/0x20
> [<ffffffff8020a146>] apic_timer_interrupt+0x66/0x70
> <EOI> [<ffffffff881c27f8>] :vmnet:VNetHub_AllocVnet+0x29c/0x2dc
> [<ffffffff80208070>] mwait_idle_with_hints+0x44/0x45
> [<ffffffff8020807d>] mwait_idle+0xc/0x20
> [<ffffffff80208988>] cpu_idle+0x8a/0xae
> [<ffffffff807876e0>] start_kernel+0x218/0x21d
> [<ffffffff8078715a>] _sinittext+0x15a/0x15e
>
> hda: status timeout: status=0xd0 { Busy }
> ide: failed opcode was: unknown
> hda: drive not ready for command
> hda: status timeout: status=0xd0 { Busy }
> ide: failed opcode was: unknown
> hda: drive not ready for command
> hda: status timeout: status=0xd0 { Busy }
> ide: failed opcode was: unknown
>


I've seen similar soft lockups when using Xen, but the causes looked very
different. Here the trace points at the IDE layer: CPU#0 appears to have
been busy-waiting in ide_wait_stat(), under ide_timer_expiry(), when the
lockup was reported, and hda keeps reporting a Busy status. I saw this
many years ago but cannot remember the cause. It is worth checking that
the cable to the CD drive is firmly seated. If that doesn't help and you
can't live with the hangs, remove or disconnect the CD drive.
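
If you want to check whether every lockup report looks like the one you
posted, a rough sketch along these lines might help. It assumes the dmesg
text has been saved to a string and that traces use the 2.6.19-style
"[<address>] function+0xoff/0xlen" frame format shown above. The function
name summarise_lockups and the "ide_" prefix test are my own choices for
illustration, not anything taken from the kernel or from a standard tool.

```python
import re

# Matches the report header, e.g. "BUG: soft lockup detected on CPU#0!"
LOCKUP_RE = re.compile(r"BUG: soft lockup detected on CPU#(\d+)")

# Matches one stack frame, e.g. "[<ffffffff80439d37>] ide_wait_stat+0xaa/0x110"
# with an optional ":module:" prefix such as ":vmnet:".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(\w+):)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarise_lockups(log_text):
    """Return one dict per soft-lockup report found in log_text.

    Each dict holds the CPU number and, in trace order, the names of the
    frames whose function name starts with "ide_" (hypothetical heuristic).
    """
    reports = []
    current = None  # report whose call trace is currently being read
    for raw in log_text.splitlines():
        # Drop mail quoting ("> ") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        header = LOCKUP_RE.search(line)
        if header:
            current = {"cpu": int(header.group(1)), "ide_frames": []}
            reports.append(current)
            continue
        if current is None:
            continue
        frame = FRAME_RE.search(line)
        if frame:
            if frame.group(2).startswith("ide_"):
                current["ide_frames"].append(frame.group(2))
        elif line and not line.startswith("Call Trace"):
            # Any other non-empty line means the trace has ended.
            current = None
    return reports
```

Feeding it the saved output of dmesg and printing the result shows, per
report, which CPU was flagged and whether IDE functions were on the stack.
If reports turn up with no ide_ frames at all, the CD drive is probably not
the whole story.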

Simon