Network transmit timed out

Issues related to hardware problems
my2cents
Posts: 1
Joined: 2014/07/29 02:31:03

Post by my2cents » 2014/07/29 03:51:51

I am getting sporadic network failures with the message "NETDEV WATCHDOG: eth0: transmit timed out". I have looked at a lot of postings on this and other sites that give suggestions about APIC, ACPI, mDNSResponder, BIOS settings, and so on.

I have a 2000-era HP Pavilion XE783 running CentOS Linux 5.10, kernel 2.6.18-371.9.1.el5 on i686. The network card is a Linksys EtherPCI LAN Card II (LNEPCI2) with a Winbond W89C940F chip. This computer ran Windows before and worked fine. I recently installed CentOS with Asterisk. At some random point after bootup, anywhere from a few hours to a few days, it logs the following errors and the network loses connection. A system reboot gets it working again. This time the messages below started appearing in /var/log/messages about 15 hours after the last boot and were spread over a 2-hour period before the network lost connection.

kernel: WARNING: at drivers/net/8390.c:815 ei_rx_overrun()
kernel: [<e0a6783a>] ei_interrupt+0x10f/0x2bc [8390]
kernel: [<c0451c89>] handle_IRQ_event+0x45/0x8c
kernel: [<c0451d98>] __do_IRQ+0xc8/0x118
kernel: [<c0451cd0>] __do_IRQ+0x0/0x118
kernel: [<c040753c>] do_IRQ+0x9b/0xc3
kernel: [<c04059ca>] common_interrupt+0x1a/0x20
kernel: [<c0403c55>] default_idle+0x31/0x59
kernel: [<c0416fed>] apm_cpu_idle+0x197/0x1e8
kernel: [<c0403d1c>] cpu_idle+0x9f/0xb9
kernel: [<c07189fc>] start_kernel+0x37b/0x383
kernel: =======================
kernel: NETDEV WATCHDOG: eth0: transmit timed out
kernel: NETDEV WATCHDOG: eth0: transmit timed out
last message repeated 4 times
last message repeated 3 times
kernel: NETDEV WATCHDOG: eth0: transmit timed out
last message repeated 2 times
kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
kernel: Netfilter messages via NETLINK v0.30.
kernel: ip_conntrack version 2.4 (4087 buckets, 32696 max) - 228 bytes per conntrack
kernel: NETDEV WATCHDOG: eth0: transmit timed out
last message repeated 3 times
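For anyone who wants to tally how many timeouts an excerpt like this really contains, here is a minimal sketch in Python. It assumes the usual syslog behaviour where a "last message repeated N times" line stands for N more copies of the message logged just before it; the function name `count_timeouts` and the `SAMPLE` text are only illustrative, and the sample is a trimmed copy of the lines above.

```python
import re

WATCHDOG = "NETDEV WATCHDOG: eth0: transmit timed out"
REPEAT = re.compile(r"last message repeated (\d+) times")

def count_timeouts(lines):
    """Count watchdog timeouts, expanding 'last message repeated N times' lines.

    Assumption: a repeat line refers to the most recent non-repeat message.
    """
    total = 0
    last_was_watchdog = False
    for line in lines:
        match = REPEAT.search(line)
        if match:
            # Only add the repeats if the message being repeated was a timeout.
            if last_was_watchdog:
                total += int(match.group(1))
            continue  # a repeat line does not change what the last message was
        last_was_watchdog = WATCHDOG in line
        if last_was_watchdog:
            total += 1
    return total

# Trimmed copy of the log excerpt from this post (illustrative input only).
SAMPLE = """\
kernel: NETDEV WATCHDOG: eth0: transmit timed out
kernel: NETDEV WATCHDOG: eth0: transmit timed out
last message repeated 4 times
last message repeated 3 times
kernel: NETDEV WATCHDOG: eth0: transmit timed out
last message repeated 2 times
kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
kernel: NETDEV WATCHDOG: eth0: transmit timed out
last message repeated 3 times
""".splitlines()

if __name__ == "__main__":
    print(count_timeouts(SAMPLE))
```

On a real system the same function could be fed the lines of /var/log/messages instead of the sample.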

When I rebooted after the above errors occurred, I noticed the following messages in /var/log/messages.
The following lines appear near the start of the boot process (notice the shared IRQ 9):

kernel: PCI: Probing PCI hardware
kernel: ACPI Error (tbget-0168): Invalid address flags 8 [20060707]
last message repeated 3 times
kernel: PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO
kernel: PCI quirk: region 1180-11bf claimed by ICH4 GPIO
kernel: ACPI Error (tbget-0168): Invalid address flags 8 [20060707]
last message repeated 4 times
kernel: pci 0000:00:1e.0: PCI bridge to [bus 01-01] (subtractive decode)
kernel: pci 0000:00:1f.0: PIIX/ICH IRQ router [8086/2410]
kernel: pci 0000:00:1f.5: found PCI INT B -> IRQ 9
kernel: pci 0000:00:1f.5: sharing IRQ 9 with 0000:00:1f.3

The following lines appear near the end of the boot process (again, notice the shared IRQ 9):

kernel: ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
kernel: http://www.scyld.com/network/ne2k-pci.html
kernel: ne2k-pci 0000:01:08.0: found PCI INT A -> IRQ 9
kernel: ne2k-pci 0000:01:08.0: sharing IRQ 9 with 0000:00:01.0
kernel: eth0: Winbond 89C940 found at 0x2000, IRQ 9, 00:20:78:16:8D:9A.
kernel: i801_smbus 0000:00:1f.3: found PCI INT B -> IRQ 9
kernel: i801_smbus 0000:00:1f.3: sharing IRQ 9 with 0000:00:1f.5
kernel: Intel ICH 0000:00:1f.5: found PCI INT B -> IRQ 9
kernel: Intel ICH 0000:00:1f.5: sharing IRQ 9 with 0000:00:1f.3
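To make the sharing easier to see, the boot lines above can be grouped by IRQ number. This is a rough sketch in Python that only understands the two message shapes shown in this post ("found PCI INT x -> IRQ n" and "sharing IRQ n with ..."); the helper name `devices_by_irq` and the `BOOT_LINES` sample are illustrative, not part of any tool.

```python
import re
from collections import defaultdict

# Patterns for the two kernel message shapes quoted in this post.
FOUND = re.compile(r"(\S+): found PCI INT [A-D] -> IRQ (\d+)")
SHARING = re.compile(r"(\S+): sharing IRQ (\d+) with (\S+)")

def devices_by_irq(lines):
    """Map each IRQ number to the sorted PCI addresses reported on it."""
    irqs = defaultdict(set)
    for line in lines:
        found = FOUND.search(line)
        if found:
            irqs[int(found.group(2))].add(found.group(1))
        sharing = SHARING.search(line)
        if sharing:
            # Both the device itself and the one it shares with use this IRQ.
            irqs[int(sharing.group(2))].update(
                [sharing.group(1), sharing.group(3)]
            )
    return {irq: sorted(devs) for irq, devs in irqs.items()}

# Boot messages copied from this post (illustrative input only).
BOOT_LINES = """\
kernel: ne2k-pci 0000:01:08.0: found PCI INT A -> IRQ 9
kernel: ne2k-pci 0000:01:08.0: sharing IRQ 9 with 0000:00:01.0
kernel: i801_smbus 0000:00:1f.3: found PCI INT B -> IRQ 9
kernel: i801_smbus 0000:00:1f.3: sharing IRQ 9 with 0000:00:1f.5
kernel: Intel ICH 0000:00:1f.5: found PCI INT B -> IRQ 9
kernel: Intel ICH 0000:00:1f.5: sharing IRQ 9 with 0000:00:1f.3
""".splitlines()

if __name__ == "__main__":
    for irq, devices in devices_by_irq(BOOT_LINES).items():
        print(f"IRQ {irq}: {', '.join(devices)}")
```

Going by these messages, four PCI addresses end up on IRQ 9, including the network card at 0000:01:08.0.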

I have used other flavors of Linux before. I'm not sure if I'm looking at an IRQ conflict or if the older network card is just not able to handle the traffic load Asterisk throws at it. I'm looking for any suggestions on tweaking the system configuration, etc.

Thanks

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Network transmit timed out

Post by TrevorH » 2014/07/29 08:49:31

I had a quick look around and found the routine that's raising that warning in the CentOS 6 version of the driver. It's been refactored, so the source is different, but the routine still exists and has the following comment block before it:
/**
 * ei_rx_overrun - handle receiver overrun
 * @dev: network device which threw exception
 *
 * We have a receiver overrun: we have to kick the 8390 to get it started
 * again. Problem is that you have to kick it exactly as NS prescribes in
 * the updated datasheets, or "the NIC may act in an unpredictable manner."
 * This includes causing "the NIC to defer indefinitely when it is stopped
 * on a busy network." Ugh.
 * Called with lock held. Don't call this with the interrupts off or your
 * computer will hate you - it takes 10ms or so.
 */
Honestly, at this point I would spend the US$5 it takes to find a secondhand Intel e100 based 100Mbps network card and install that instead.
