Tribe has 177 compute nodes with an InfiniPath interconnect. This page documents their hardware and software configuration.
This section documents the hardware that makes up each icompute node.
The motherboard is an ASUS K8N-DRE rev 1.
There are two dual-core AMD Opteron 275 processors running at 2.2 GHz.
There are two memory configurations for the icompute nodes. All DIMMs are 1 GB ECC modules.
There is a single 80 GB hard drive attached via SATA. Each disk uses approximately 8 GB for software and 4 GB for swap space; the rest holds temporary files for compute jobs.
There are two Broadcom BCM95721 1GbE interfaces on the motherboard.
There is one SDR 4x InfiniBand to PCI Express x8 Host Channel Adapter, also known as the InfiniPath QLE7140 HCA. It is attached to the InfiniBand switch via a very large and sensitive cable, which makes recabling the cluster time-consuming and expensive.
This section documents the drivers used by the icompute nodes.
The two on-board GbE interfaces both use the tigon3 (tg3) Linux kernel module.
The PCI Express InfiniPath SDR card uses the QLogic driver.
The motherboard uses the nforce2 and w83792d sensor modules.
Here are some performance numbers for this hardware.
These are some initial performance numbers for MPI communication over the InfiniPath card.
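The latency and bandwidth figures below were presumably gathered with a ping-pong style microbenchmark (the exact tool is not named here: each message is bounced between two ranks, and one-way latency is taken as half the average round-trip time). As an illustration of the technique only, here is a sketch that times round trips over a local socketpair rather than over MPI and the InfiniPath fabric:

```python
import socket
import threading
import time

def pingpong_latency_us(size, iters=1000):
    """One-way latency estimate: half the average round-trip time."""
    a, b = socket.socketpair()
    payload = b"x" * size

    def echo():
        # Peer side: receive each full message and send it straight back.
        for _ in range(iters):
            data = b""
            while len(data) < size:
                data += b.recv(size - len(data))
            b.sendall(data)

    t = threading.Thread(target=echo)
    t.start()
    start = time.perf_counter()
    for _ in range(iters):
        a.sendall(payload)
        data = b""
        while len(data) < size:
            data += a.recv(size - len(data))
    elapsed = time.perf_counter() - start
    t.join()
    return elapsed / iters / 2 * 1e6  # microseconds, one way

print(f"{pingpong_latency_us(8):.1f} us for 8-byte messages")
```

Local socket latency will of course not match the InfiniPath numbers; the sketch only shows the measurement loop.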
<format gnuplot>
set title "QLogic QLE7140 Bandwidth Benchmark"
set log xy
set style line
set xtic (1,8,128,2048,32768,524288)
set xlabel "Packet Size (bytes)"
set ylabel "Bandwidth (MB/s)"
set grid
plot '-' using 1:2 lw 2 with lines title "OpenMPI", \
     '-' using 1:2 lw 2 with lines title "QLogic MPI"
1 1.54883
2 3.11035
4 5.35547
8 10.7725
16 20.8506
32 39.3379
64 72.4219
128 124.479
256 195.767
512 276.749
1024 412.351
2048 562.999
4096 690.003
8192 779.283
16384 644.171
32768 717.925
65536 797.36
131072 848.962
262144 869.342
524288 887.411
1048576 893.077
2097152 900.281
e
1 1.80469
2 3.6084
4 6.4502
8 12.8213
16 23.5479
32 43.1953
64 77.9033
128 138.97
256 212.755
512 292.649
1024 435.465
2048 585.407
4096 705.567
8192 789.323
16384 648.296
32768 709.626
65536 796.315
131072 850.257
262144 873.186
524288 889.866
1048576 900.948
2097152 906.362
e
</format>
<format gnuplot>
set title "QLogic QLE7140 Latency Benchmark"
set style line
set log xy
set xtic (1,8,128,2048,32768,524288)
set xlabel "Packet Size (bytes)"
set ylabel "Latency (us)"
set grid
plot '-' using 1:2 lw 2 with lines title "OpenMPI", \
     '-' using 1:2 lw 2 with lines title "QLogic MPI"
1 2.45
2 2.45
4 2.84
8 2.83
16 2.93
32 3.10
64 3.37
128 3.92
256 4.99
512 7.06
1024 9.47
2048 13.88
4096 22.64
8192 40.10
16384 97.02
32768 174.11
65536 313.53
131072 588.95
262144 1150.30
524288 2253.75
1048576 4478.90
2097152 8886.11
e
1 2.10
2 2.11
4 2.36
8 2.38
16 2.59
32 2.83
64 3.13
128 3.51
256 4.59
512 6.67
1024 8.97
2048 13.35
4096 22.15
8192 39.59
16384 96.41
32768 176.15
65536 313.95
131072 588.06
262144 1145.23
524288 2247.53
1048576 4439.76
2097152 8826.49
e
</format>
<format gnuplot>
set title "QLogic QLE7140 Multi-host Latency Benchmark"
set xlabel "Slots (1 per node)"
set ylabel "Latency (us)"
set yrange [0:14]
set grid
plot '-' using 1:2 lw 2 with lines title "OpenMPI", \
     '-' using 1:2 lw 2 with lines title "QLogic MPI"
2 2.47
3 2.43
4 2.41
5 2.40
6 2.43
7 2.41
8 2.42
9 2.41
10 2.41
11 2.40
12 2.42
13 2.43
14 2.44
15 2.44
16 2.43
17 2.44
18 2.42
19 2.42
20 2.42
21 2.43
22 2.43
23 2.41
24 2.43
25 2.43
26 2.43
27 2.43
28 2.43
29 2.44
30 2.44
31 2.44
32 2.44
33 2.45
34 2.45
35 2.49
36 2.44
37 2.45
38 2.46
39 2.45
40 2.45
e
2 2.11
3 2.14
4 2.13
5 2.14
6 2.13
7 2.14
8 2.14
9 2.15
10 2.15
11 2.16
12 2.14
13 2.17
14 2.14
15 2.16
16 2.15
17 2.17
18 2.12
19 2.15
20 2.13
21 2.15
22 2.14
23 2.12
24 2.14
25 2.15
26 2.14
27 2.14
28 2.15
29 2.15
30 2.16
31 2.15
32 2.15
33 2.14
34 2.16
35 2.15
36 2.16
37 2.13
38 2.16
39 2.18
40 2.16
e
</format>
<float left>
Size (bytes) | OpenMPI Latency (us) | OpenMPI Bandwidth (MB/s) |
---|---|---|
1 | 2.45 | 1.54883 |
2 | 2.45 | 3.11035 |
4 | 2.84 | 5.35547 |
8 | 2.83 | 10.7725 |
16 | 2.93 | 20.8506 |
32 | 3.10 | 39.3379 |
64 | 3.37 | 72.4219 |
128 | 3.92 | 124.479 |
256 | 4.99 | 195.767 |
512 | 7.06 | 276.749 |
1024 | 9.47 | 412.351 |
2048 | 13.88 | 562.999 |
4096 | 22.64 | 690.003 |
8192 | 40.10 | 779.283 |
16384 | 97.02 | 644.171 |
32768 | 174.11 | 717.925 |
65536 | 313.53 | 797.36 |
131072 | 588.95 | 848.962 |
262144 | 1150.30 | 869.342 |
524288 | 2253.75 | 887.411 |
1048576 | 4478.90 | 893.077 |
2097152 | 8886.11 | 900.281 |
</float> <float left>
Size (bytes) | QLogic MPI Latency (us) | QLogic MPI Bandwidth (MB/s) |
---|---|---|
1 | 2.10 | 1.80469 |
2 | 2.11 | 3.6084 |
4 | 2.36 | 6.4502 |
8 | 2.38 | 12.8213 |
16 | 2.59 | 23.5479 |
32 | 2.83 | 43.1953 |
64 | 3.13 | 77.9033 |
128 | 3.51 | 138.97 |
256 | 4.59 | 212.755 |
512 | 6.67 | 292.649 |
1024 | 8.97 | 435.465 |
2048 | 13.35 | 585.407 |
4096 | 22.15 | 705.567 |
8192 | 39.59 | 789.323 |
16384 | 96.41 | 648.296 |
32768 | 176.15 | 709.626 |
65536 | 313.95 | 796.315 |
131072 | 588.06 | 850.257 |
262144 | 1145.23 | 873.186 |
524288 | 2247.53 | 889.866 |
1048576 | 4439.76 | 900.948 |
2097152 | 8826.49 | 906.362 |
</float> <float left>
Slots (1 per node) | OpenMPI Latency (us) | QLogic MPI Latency (us) |
---|---|---|
2 | 2.47 | 2.11 |
3 | 2.43 | 2.14 |
4 | 2.41 | 2.13 |
5 | 2.40 | 2.14 |
6 | 2.43 | 2.13 |
7 | 2.41 | 2.14 |
8 | 2.42 | 2.14 |
9 | 2.41 | 2.15 |
10 | 2.41 | 2.15 |
11 | 2.40 | 2.16 |
12 | 2.42 | 2.14 |
13 | 2.43 | 2.17 |
14 | 2.44 | 2.14 |
15 | 2.44 | 2.16 |
16 | 2.43 | 2.15 |
17 | 2.44 | 2.17 |
18 | 2.42 | 2.12 |
19 | 2.42 | 2.15 |
20 | 2.42 | 2.13 |
21 | 2.43 | 2.15 |
22 | 2.43 | 2.14 |
23 | 2.41 | 2.12 |
24 | 2.43 | 2.14 |
25 | 2.43 | 2.15 |
26 | 2.43 | 2.14 |
27 | 2.43 | 2.14 |
28 | 2.43 | 2.15 |
29 | 2.44 | 2.15 |
30 | 2.44 | 2.16 |
31 | 2.44 | 2.15 |
32 | 2.44 | 2.15 |
33 | 2.45 | 2.14 |
34 | 2.45 | 2.16 |
35 | 2.49 | 2.15 |
36 | 2.44 | 2.16 |
37 | 2.45 | 2.13 |
38 | 2.46 | 2.16 |
39 | 2.45 | 2.18 |
40 | 2.45 | 2.16 |
</float> <clear/>
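One way to summarize the bandwidth tables above is the half-peak message size, often written n_1/2: the smallest message that achieves at least half of peak bandwidth. A short Python sketch computing it from the OpenMPI table:

```python
# Estimate n_1/2, the smallest message size reaching half of peak
# bandwidth, from the OpenMPI bandwidth table above.
openmpi_bw = [
    (1, 1.54883), (2, 3.11035), (4, 5.35547), (8, 10.7725),
    (16, 20.8506), (32, 39.3379), (64, 72.4219), (128, 124.479),
    (256, 195.767), (512, 276.749), (1024, 412.351), (2048, 562.999),
    (4096, 690.003), (8192, 779.283), (16384, 644.171), (32768, 717.925),
    (65536, 797.36), (131072, 848.962), (262144, 869.342),
    (524288, 887.411), (1048576, 893.077), (2097152, 900.281),
]
peak = max(bw for _, bw in openmpi_bw)
n_half = next(size for size, bw in openmpi_bw if bw >= peak / 2)
print(f"peak {peak:.0f} MB/s, n_1/2 = {n_half} bytes")  # n_1/2 = 2048 bytes
```

By this measure the card reaches half of its roughly 900 MB/s peak already at 2 KB messages.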