Network Testing and Debugging for AGLT2
During the last year we have seen many indications that all is not right with our network connections to BNL (and perhaps other sites). The most graphic example of the problem was observed during preparations for the SC08 Terapaths demos. We had Hiro (BNL) start 10 "production" transfers from AGLT2 to NET2. Each of the 10 transfers was running at around 8 Mbits/second. A Terapaths reservation was then put in place, reserving 1 Gbit/second between AGLT2 and NET2. As soon as the reservation was activated, each transfer increased from 8 Mbits/second to 12 MBytes/second (a total of 120 MBytes/second for the 10 flows, or roughly 1 Gbit/second). Since the end systems and applications were the same before and after the reservation, we can conclude that the change in network path corresponding to the reservation made the difference.
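The unit conversion behind those figures can be checked with a short sketch. It takes the rates quoted above at face value (8 Mbits/second per flow before, 12 MBytes/second per flow after); the variable and function names are illustrative only.

```python
# Sanity check of the SC08 Terapaths numbers quoted in the text.
FLOWS = 10
BEFORE_MBIT_PER_FLOW = 8     # Mbit/s per flow before the reservation (from the text)
AFTER_MBYTE_PER_FLOW = 12    # MByte/s per flow after the reservation (from the text)


def mbyte_to_mbit(mbytes_per_s: float) -> float:
    """Convert MBytes/second to Mbits/second (8 bits per byte)."""
    return mbytes_per_s * 8


before_total_mbit = FLOWS * BEFORE_MBIT_PER_FLOW
after_total_mbit = FLOWS * mbyte_to_mbit(AFTER_MBYTE_PER_FLOW)
speedup = after_total_mbit / before_total_mbit

print(f"before: {before_total_mbit} Mbit/s aggregate")
print(f"after:  {after_total_mbit} Mbit/s aggregate")
print(f"speedup: {speedup:.0f}x")
```

With these inputs the aggregate rate after the reservation comes to 960 Mbit/s, which is consistent with the flows filling the 1 Gbit/second reservation.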
Recently, poor performance when transferring files from AGLT2 to BNL has prompted us to debug the issue and, we hope, isolate it so that it can be fixed.
We have the following resources to use:
- Two hosts at BNL, both with 10GE NICs: nettest10g.usatlas.bnl.gov and newmon.bnl.gov
- One host at StarLight/Chicago with a 10GE NIC: wood1-chi.uslhcnet.org (10GE at 192.84.86.10)
- Multiple 10GE hosts at AGLT2:
  - UM: umfs11.aglt2.org, umfs12.aglt2.org, umfs13.aglt2.org
  - MSU: msufs05.aglt2.org, msufs08.aglt2.org
In addition, we have perfSONAR boxes at BNL, UM (psum01.aglt2.org, psum02.aglt2.org) and MSU (psmsu01.aglt2.org, psmsu02.aglt2.org).
Initial Info
I needed to set up explicit routes on wood1-chi.uslhcnet.org to ensure the 10GE NIC was used when testing to/from the subnets of interest. I added:
- route add -net 192.41.230.0/23 eth2 ! (AGLT2)
- route add -net 130.199.0.0/16 eth2 ! (BNL)
- route add -net 192.12.15.0/24 eth2 ! (BNL/usatlas)
[root@wood1-chi ~]# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.65.196.64 * 255.255.255.224 U 0 0 0 eth0
192.84.86.0 * 255.255.255.224 U 0 0 0 eth2
192.12.15.0 * 255.255.255.0 U 0 0 0 eth2
192.41.230.0 * 255.255.254.0 U 0 0 0 eth2
130.199.0.0 * 255.255.0.0 U 0 0 0 eth2
169.254.0.0 * 255.255.0.0 U 0 0 0 eth2
default E600chi-prod.us 0.0.0.0 UG 0 0 0 eth0
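As a cross-check of the table above, the following sketch applies longest-prefix matching to confirm which interface each test destination would leave on. The route list is transcribed by hand from the `route` output (netmasks rewritten as prefix lengths); the function name `egress_interface` is hypothetical, and the sketch models only the table, not the kernel's full routing policy.

```python
import ipaddress

# Routes transcribed from the `route` output above, as (network, interface).
ROUTES = [
    ("192.65.196.64/27", "eth0"),
    ("192.84.86.0/27", "eth2"),
    ("192.12.15.0/24", "eth2"),   # BNL/usatlas
    ("192.41.230.0/23", "eth2"),  # AGLT2
    ("130.199.0.0/16", "eth2"),   # BNL
    ("169.254.0.0/16", "eth2"),
    ("0.0.0.0/0", "eth0"),        # default route
]


def egress_interface(dest: str) -> str:
    """Return the interface of the most specific route that contains dest."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(net), iface)
        for net, iface in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    # Longest-prefix match: the route with the largest prefix length wins.
    _, iface = max(matches, key=lambda m: m[0].prefixlen)
    return iface


if __name__ == "__main__":
    targets = {
        "umfs11.aglt2.org": "192.41.230.31",
        "newmon.bnl.gov": "130.199.3.7",
        "nettest10g.usatlas.bnl.gov": "192.12.15.25",
    }
    for host, ip in targets.items():
        print(f"{host} ({ip}) -> {egress_interface(ip)}")
```

All three test hosts resolve to eth2, the 10GE NIC, while any address outside the listed subnets falls through to the default route on eth0.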
The current traceroutes between machines are shown here.
root@umfs11 ~# traceroute 192.84.86.10
traceroute to 192.84.86.10 (192.84.86.10), 30 hops max, 46 byte packets
1 vl4001-nile.aglt2.or.230.41.192.in-addr.arpa (192.41.230.2) 0.289 ms 0.217 ms 0.209 ms
2 r04chi-te-1-4-ptp-umich.ultralight.org (192.84.86.229) 5.734 ms 5.690 ms 5.691 ms
3 192.84.86.10 (192.84.86.10) 5.683 ms !<10> 5.699 ms !<10> 5.644 ms !<10>
root@umfs11 ~# traceroute newmon.bnl.gov
traceroute to newmon.bnl.gov (130.199.3.7), 30 hops max, 46 byte packets
1 vl4001-nile.aglt2.or.230.41.192.in-addr.arpa (192.41.230.2) 0.281 ms 0.215 ms 0.194 ms
2 r04chi-te-1-4-ptp-umich.ultralight.org (192.84.86.229) 5.791 ms 5.738 ms 5.710 ms
3 chi-ultralight.es.net (198.125.140.205) 6.021 ms 6.058 ms 6.056 ms
4 chiccr1-starcr1.es.net (134.55.207.33) 6.215 ms 6.078 ms 6.041 ms
5 clevcr1-ip-chiccr1.es.net (134.55.217.53) 15.141 ms 23.795 ms 15.043 ms
6 washcr1-ip-clevcr1.es.net (134.55.222.58) 22.760 ms 22.723 ms 22.719 ms
7 aofacr2-washcr1.es.net (134.55.218.78) 27.921 ms 27.919 ms 45.452 ms
8 bnlmr1-aoacr1.es.net (134.55.217.57) 29.814 ms 29.872 ms 29.803 ms
9 bnlsite-bnlmr1.es.net (198.124.216.178) 125.724 ms 31.603 ms 48.383 ms
10 newmon.bnl.gov (130.199.3.7) 29.663 ms 29.620 ms 29.615 ms
root@umfs11 ~# traceroute nettest10g.usatlas.bnl.gov
traceroute to nettest10g.usatlas.bnl.gov (192.12.15.25), 30 hops max, 46 byte packets
1 vl4001-nile.aglt2.or.230.41.192.in-addr.arpa (192.41.230.2) 0.454 ms 0.225 ms 0.201 ms
2 r04chi-te-1-4-ptp-umich.ultralight.org (192.84.86.229) 5.723 ms 5.707 ms 5.684 ms
3 chi-ultralight.es.net (198.125.140.205) 5.997 ms 6.003 ms 5.952 ms
4 chiccr1-starcr1.es.net (134.55.207.33) 6.073 ms 6.012 ms 6.039 ms
5 clevcr1-ip-chiccr1.es.net (134.55.217.53) 15.084 ms 15.048 ms 15.051 ms
6 washcr1-ip-clevcr1.es.net (134.55.222.58) 22.704 ms 22.704 ms 22.688 ms
7 aofacr2-washcr1.es.net (134.55.218.78) 27.876 ms 27.907 ms 27.859 ms
8 bnlmr1-aoacr1.es.net (134.55.217.57) 30.423 ms 30.169 ms 29.876 ms
9 bnlsite-bnlmr1.es.net (198.124.216.178) 192.944 ms 204.639 ms 89.670 ms
10 nettest10g.usatlas.bnl.gov (192.12.15.25) 29.759 ms 29.733 ms 29.746 ms
[root@wood1-chi ~]# traceroute umfs11.aglt2.org
traceroute to umfs11.aglt2.org (192.41.230.31), 30 hops max, 40 byte packets
1 umfs11.aglt2.org (192.41.230.31) 6.897 ms 6.961 ms 6.950 ms
[root@wood1-chi ~]# tracepath umfs11.aglt2.org
1: 192.84.86.10 (192.84.86.10) 0.055ms pmtu 9000
1: 192.84.86.1 (192.84.86.1) 3.318ms
2: 192.84.86.230 (192.84.86.230) 6.535ms
3: umfs11.aglt2.org (192.41.230.31) 5.718ms reached
Resume: pmtu 9000 hops 3 back 3
[root@wood1-chi ~]# traceroute newmon.bnl.gov
traceroute to newmon.bnl.gov (130.199.3.7), 30 hops max, 40 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * washcr1-ip-clevcr1.es.net (134.55.222.58) 18.794 ms 18.859 ms
6 aofacr2-washcr1.es.net (134.55.218.78) 23.969 ms 22.274 ms 22.281 ms
7 bnlmr1-aoacr1.es.net (134.55.217.57) 24.317 ms 24.185 ms 24.261 ms
8 bnlsite-bnlmr1.es.net (198.124.216.178) 24.235 ms 24.248 ms 24.249 ms
9 newmon.bnl.gov (130.199.3.7) 24.038 ms 23.998 ms 24.034 ms
[root@wood1-chi ~]# tracepath newmon.bnl.gov
1: 192.84.86.10 (192.84.86.10) 0.061ms pmtu 9000
1: 192.84.86.1 (192.84.86.1) 3.172ms
2: chi-ultralight.es.net (198.125.140.205) asymm 3 0.349ms
3: chiccr1-starcr1.es.net (134.55.207.33) asymm 4 0.760ms
4: clevcr1-ip-chiccr1.es.net (134.55.217.53) asymm 5 9.625ms
5: no reply
6: aofacr2-washcr1.es.net (134.55.218.78) asymm 7 22.459ms
7: bnlmr1-aoacr1.es.net (134.55.217.57) 25.035ms
8: bnlsite-bnlmr1.es.net (198.124.216.178) 102.907ms
9: bnlsite-bnlmr1.es.net (198.124.216.178) asymm 8 103.916ms pmtu 1500
10: newmon.bnl.gov (130.199.3.7) asymm 9 24.073ms reached
Resume: pmtu 1500 hops 10 back 9
[root@wood1-chi ~]# traceroute nettest10g.usatlas.bnl.gov
traceroute to nettest10g.usatlas.bnl.gov (192.12.15.25), 30 hops max, 40 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * washcr1-ip-clevcr1.es.net (134.55.222.58) 19.051 ms 19.115 ms
6 aofacr2-washcr1.es.net (134.55.218.78) 24.143 ms 22.246 ms 22.250 ms
7 bnlmr1-aoacr1.es.net (134.55.217.57) 24.642 ms 45.929 ms 46.031 ms
8 bnlsite-bnlmr1.es.net (198.124.216.178) 24.518 ms 24.142 ms 24.155 ms
9 nettest10g.usatlas.bnl.gov (192.12.15.25) 24.186 ms 24.144 ms 24.196 ms
[root@wood1-chi ~]# tracepath nettest10g.usatlas.bnl.gov
1: 192.84.86.10 (192.84.86.10) 0.054ms pmtu 9000
1: 192.84.86.1 (192.84.86.1) 2.888ms
2: chi-ultralight.es.net (198.125.140.205) asymm 3 0.352ms
3: chiccr1-starcr1.es.net (134.55.207.33) asymm 4 0.749ms
4: clevcr1-ip-chiccr1.es.net (134.55.217.53) asymm 5 9.622ms
5: no reply
6: aofacr2-washcr1.es.net (134.55.218.78) asymm 7 22.493ms
7: bnlmr1-aoacr1.es.net (134.55.217.57) 25.018ms
8: bnlsite-bnlmr1.es.net (198.124.216.178) 40.501ms
9: bnlsite-bnlmr1.es.net (198.124.216.178) asymm 8 73.872ms pmtu 1500
10: nettest10g.usatlas.bnl.gov (192.12.15.25) 24.270ms reached
Resume: pmtu 1500 hops 10 back 10
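The tracepath runs from wood1-chi to BNL start with a path MTU of 9000 and end with 1500. A minimal sketch for spotting where the path MTU changes in saved tracepath output is given below. The sample text is the wood1-chi to newmon.bnl.gov run copied from above; the function name `pmtu_changes` and the regular expression are assumptions written for this output layout and may need adjusting for other tracepath versions.

```python
import re

# tracepath output from wood1-chi to newmon.bnl.gov, copied from above.
SAMPLE = """\
1: 192.84.86.10 (192.84.86.10) 0.061ms pmtu 9000
1: 192.84.86.1 (192.84.86.1) 3.172ms
2: chi-ultralight.es.net (198.125.140.205) asymm 3 0.349ms
3: chiccr1-starcr1.es.net (134.55.207.33) asymm 4 0.760ms
4: clevcr1-ip-chiccr1.es.net (134.55.217.53) asymm 5 9.625ms
5: no reply
6: aofacr2-washcr1.es.net (134.55.218.78) asymm 7 22.459ms
7: bnlmr1-aoacr1.es.net (134.55.217.57) 25.035ms
8: bnlsite-bnlmr1.es.net (198.124.216.178) 102.907ms
9: bnlsite-bnlmr1.es.net (198.124.216.178) asymm 8 103.916ms pmtu 1500
10: newmon.bnl.gov (130.199.3.7) asymm 9 24.073ms reached
Resume: pmtu 1500 hops 10 back 9
"""

# Hop number, first host token, and a trailing "pmtu N" on the same line.
HOP_WITH_PMTU = re.compile(r"^\s*(\d+)\??:\s+(\S+).*\bpmtu (\d+)\s*$")


def pmtu_changes(text: str) -> list[tuple[int, str, int]]:
    """Return (hop, host, pmtu) for each hop line that reports a new path MTU."""
    changes = []
    last_pmtu = None
    for line in text.splitlines():
        m = HOP_WITH_PMTU.match(line)
        if not m:
            continue  # hop lines without a pmtu field, "no reply", "Resume:"
        hop, host, pmtu = int(m.group(1)), m.group(2), int(m.group(3))
        if pmtu != last_pmtu:
            changes.append((hop, host, pmtu))
            last_pmtu = pmtu
    return changes


if __name__ == "__main__":
    for hop, host, pmtu in pmtu_changes(SAMPLE):
        print(f"hop {hop}: {host} reports pmtu {pmtu}")
```

For the sample above, the drop from 9000 to 1500 is reported at hop 9 (bnlsite-bnlmr1.es.net), which suggests the smaller MTU applies at or just beyond the BNL site router.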
Information on NICs/stack settings:
[root@wood1-chi ~]# ethtool -i eth2
driver: myri10ge
version: 1.4.4
firmware-version: 1.4.38 -- 2008/12/18 23:27:11 m
bus-info: 0000:09:00.0
net.ipv4.udp_wmem_min = 4096
net.ipv4.udp_rmem_min = 4096
net.ipv4.udp_mem = 386208 514944 772416
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_mem = 196608 262144 393216
net.ipv4.igmp_max_memberships = 20
net.core.optmem_max = 20480
net.core.rmem_default = 126976
net.core.wmem_default = 126976
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
root@umfs11 ~# ethtool -i eth2
driver: myri10ge
version: 1.4.0
firmware-version: 1.4.29 -- 2008/01/03 14:46:05 m
bus-info: 0000:0a:00.0
net.ipv4.tcp_rmem = 4096 87380 800000000
net.ipv4.tcp_wmem = 4096 65536 800000000
net.ipv4.tcp_mem = 786432 1048576 1572864
net.ipv4.igmp_max_memberships = 20
net.core.optmem_max = 20480
net.core.rmem_default = 135168
net.core.wmem_default = 135168
net.core.rmem_max = 800000000
net.core.wmem_max = 800000000
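To put the buffer limits above in context, the sketch below computes the bandwidth-delay product for a 10 Gbit/s path and the rough single-stream ceiling implied by a given maximum TCP buffer. The round-trip times are rounded from the traceroutes above (about 30 ms from umfs11 to BNL, about 24 ms from wood1-chi to BNL). Treating the full buffer as usable TCP window is a simplifying assumption, so the results are upper bounds rather than predictions.

```python
LINK_BPS = 10e9  # 10GE NIC line rate in bits/second

# Approximate round-trip times read off the traceroutes above (assumed, rounded).
RTT_UMFS11_TO_BNL_S = 0.030
RTT_WOOD1_TO_BNL_S = 0.024

# Maximum TCP buffer sizes (third field of net.ipv4.tcp_rmem / tcp_wmem above).
WOOD1_TCP_MAX_BYTES = 16777216
UMFS11_TCP_MAX_BYTES = 800000000


def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return link_bps * rtt_s / 8


def window_limited_bps(buffer_bytes: float, rtt_s: float) -> float:
    """Upper bound on one stream's rate if the window equals the buffer size."""
    return buffer_bytes * 8 / rtt_s


if __name__ == "__main__":
    bdp_um = bdp_bytes(LINK_BPS, RTT_UMFS11_TO_BNL_S)
    bdp_chi = bdp_bytes(LINK_BPS, RTT_WOOD1_TO_BNL_S)
    print(f"BDP umfs11 <-> BNL:    {bdp_um / 1e6:.1f} MB")
    print(f"BDP wood1-chi <-> BNL: {bdp_chi / 1e6:.1f} MB")

    cap = window_limited_bps(WOOD1_TCP_MAX_BYTES, RTT_WOOD1_TO_BNL_S)
    print(f"wood1-chi single-stream ceiling: {cap / 1e9:.2f} Gbit/s")
    print(f"umfs11 buffer exceeds BDP: {UMFS11_TCP_MAX_BYTES > bdp_um}")
```

Under these assumptions, the 16 MB maximum on wood1-chi is smaller than the roughly 30 MB needed to fill a 10 Gbit/s path to BNL, so a single TCP stream from that host could not reach line rate, whereas the 800 MB maximum on umfs11 is well above the requirement.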
-- ShawnMcKee - 12 Feb 2009