11/13/2007 - Replaced Dell cables with Gore cables on the following machines after seeing physical link counters increase:
umfs07, umfs09, umfs10, umfs11
1/25/2008 - The eth2 interface on dq2 started going up and down at random. Contacted Myricom (case 56372); they did not think the card was bad and recommended trying a different brand of cable if possible. Swapped in the Gore cable from umfs11. We have seen no more errors to date (2/01/2008).
2/01/2008 - Collecting stats on all 10GE systems. I (Shawn) recorded the ethtool -S ethX statistics on January 30 or 31 and compared them against the readings taken on Feb 1.
All results below were updated on Feb 22. Counters not listed were "0":
- dq2.aglt2.org - No "bad" counters incremented since the check on Jan 30th. Total counters (accumulated with the original Dell cable): dropped_bad_phy: 101173 dropped_bad_crc32: 2
- umfs03.aglt2.org - The dropped_bad_phy increased from 65458 to 65500 in 27 hours over a total of (128267482-128134650) received packets and (122944383-122855082) transmitted packets. Feb 22: Uptime 70 days, rx_packets: 130955925 tx_packets: 124658635 dropped_bad_phy: 66715
- umfs04.aglt2.org - The dropped_bad_phy increased from 1417873 to 1422247 in 27 hours over a total of (940188767-924079838) received packets and (614773595-598672216) transmitted packets. Feb 22: Uptime 92 days, rx_packets: 1136637582 tx_packets: 813449213 dropped_bad_phy: 1486149 dropped_bad_crc32: 12
- umfs05.aglt2.org - The dropped_bad_phy increased from 30 to 590 in 27.5 hours over a total of (705040387-30288502) received packets and (182970290-8971014) transmitted packets. Feb 22: Uptime 22 days, rx_packets: 10030738419 tx_packets: 2753525892 dropped_bad_phy: 2466
- umfs06.aglt2.org - The dropped_bad_phy increased from 12556 to 12633 in 27.5 hours over a total of (48119618-41238009) received packets and (45801997-39066716) transmitted packets. Feb 22: Uptime 45 days, rx_packets: 175393592 tx_packets: 168720871 dropped_bad_phy: 15711
- umfs07.aglt2.org - No "bad" phy counters incremented over 27.5 hours. Total of (5311026410-5271395657) received packets and (2076273867-2069393839) transmitted packets. Feb 22: No "bad" phy counters incremented in 63 days uptime (rx_packets: 5854957553 tx_packets: 2171953191).
- umfs08.aglt2.org - The dropped_bad_phy increased from 48081 to 50782 in 27.5 hours over a total of (51862696-33138392) received packets and (18589145-11646834) transmitted packets. Feb 22: Uptime 41 days, rx_packets: 87177517 tx_packets: 27290062 dropped_bad_phy: 56907 dropped_bad_crc32: 2
- umfs09.aglt2.org - Feb 22: No "bad" counters incremented over 10 days uptime. rx_packets: 605210 tx_packets: 224296
- umfs10.aglt2.org - No "bad" counters incremented over 27.5 hours. Total of (89533936-8900951) received packets and (75269179-11291051) transmitted packets. Feb 22: No "bad" phy counters incremented in 7 days uptime (rx_packets: 293449 tx_packets: 54339).
- umfs11.aglt2.org - Solaris 10 x86 (others RHEL4). No stats from Feb 1. Feb 22: Uptime 3 days - ipackets: 64048, opackets: 659, dropped_bad_crc32: 0, dropped_bad_phy: 1225
- msufs02.aglt2.org - No "bad" counters incremented over 28 hours. Total of (58836448-58818660) received packets and (7968785-7966208) transmitted packets. Feb 22: No bad counters over last 10 days (rx_packets: 287885 tx_packets: 182212)
- msufs03.aglt2.org - The dropped_bad_phy increased from 3526 to 3529 in 28 hours over a total of (4214335-4198054) received packets and (719590-718553) transmitted packets. Feb 22: Over a 10 day period accumulated 64 dropped_bad_phy errors (rx_packets: 288832 tx_packets: 183719).
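The per-host entries above give counter values as before/after pairs, so the error rate over each interval has to be worked out by subtraction. A minimal Python sketch of that arithmetic follows, using the umfs03 numbers from the list. The function names (parse_counters, error_rate) are hypothetical helpers written for this page and are not part of ethtool or the Myricom driver; the parser only assumes the "name: value" layout used in the entries above.

```python
import re
from typing import Dict, Optional, Tuple


def parse_counters(text: str) -> Dict[str, int]:
    """Parse 'name: value' pairs as written in the entries above,
    e.g. 'rx_packets: 130955925 tx_packets: 124658635'."""
    return {name: int(value) for name, value in re.findall(r"(\w+):\s*(\d+)", text)}


def error_rate(
    before: Dict[str, int],
    after: Dict[str, int],
    error_key: str = "dropped_bad_phy",
    packet_key: str = "rx_packets",
) -> Tuple[int, int, Optional[float]]:
    """Return (error delta, packet delta, errors per packet) between two readings.

    The rate is None when the packet counter did not advance (for example
    after a reboot or counter reset), since the ratio is meaningless then.
    """
    err_delta = after.get(error_key, 0) - before.get(error_key, 0)
    pkt_delta = after.get(packet_key, 0) - before.get(packet_key, 0)
    rate = err_delta / pkt_delta if pkt_delta > 0 else None
    return err_delta, pkt_delta, rate


# umfs03 readings taken from the list above (27 hour interval).
umfs03_before = {"rx_packets": 128134650, "dropped_bad_phy": 65458}
umfs03_after = {"rx_packets": 128267482, "dropped_bad_phy": 65500}

errs, pkts, rate = error_rate(umfs03_before, umfs03_after)
print(f"umfs03: {errs} dropped_bad_phy over {pkts} received packets, rate {rate:.2e}")

# The Feb 22 totals can be parsed straight from the text of an entry.
feb22 = parse_counters("rx_packets: 130955925 tx_packets: 124658635 dropped_bad_phy: 66715")
print(feb22["dropped_bad_phy"])
```

Comparing errors per received packet rather than raw counts makes hosts with very different traffic volumes (umfs05 versus msufs03, for instance) easier to compare. Dividing by rx_packets is a choice made for this sketch, not something the driver documentation prescribes.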
--
BenMeekhof - 01 Feb 2008