I've spent quite a lot of time over the last month or two investigating an interesting network performance problem involving three systems connected by a mixture of ethernet and infiniband.

Given a pair of machines connected together by a 10GE link, the network performance is perfectly acceptable. When I introduce a third machine connected to the second via a QDR infiniband link running with superpackets and transfer data from the first machine to the third, the available bandwidth appears to halve. When I enable path MTU discovery on the third machine to reduce the amount of fragmentation carried out by the machine in the middle, the bandwidth drops still further.
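
For anyone wanting to poke at something similar, here's a minimal sketch of a one-way throughput probe using plain Python sockets; the port and transfer sizes are placeholders rather than anything from the systems above, and a tool like iperf does the same job far more rigorously:

```python
# One-way TCP throughput probe: run receiver() on the destination node
# and sender("destination-host") on the source node. Port and sizes are
# illustrative only.
import socket
import time

PORT = 5201                  # hypothetical port (iperf3's default)
CHUNK = 64 * 1024            # application write size
TOTAL = 1 * 1024**3          # move 1 GiB per run

def receiver():
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            # Drain the stream until the sender closes the connection.
            while conn.recv(CHUNK):
                pass

def sender(host):
    buf = b"\0" * CHUNK
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, PORT)) as s:
        while sent < TOTAL:
            s.sendall(buf)
            sent += CHUNK
    elapsed = time.monotonic() - start
    print(f"{sent * 8 / elapsed / 1e9:.2f} Gbit/s")
```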

My conclusions? I suspect that the difference in observed bandwidth between the two-node 10GE case and the three-node case with the QDR infiniband hop is caused by the overhead of fragmenting and forwarding the packets. I also think that the additional drop seen when the MTU is stepped down to the lowest common denominator - in this case 1500 bytes - is caused by infiniband's relatively poor performance when working with small packets.
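
A quick back-of-envelope check on the fragmentation side, assuming the middle machine is breaking maximal 64 KiB IPoIB superpackets down to a 1500-byte ethernet MTU (my guess at the sizes involved, not something confirmed against these particular systems):

```python
import math

def fragments(datagram_len, mtu, ip_header=20):
    """Count the IPv4 fragments needed to carry one datagram over a link
    with the given MTU: every fragment repeats the 20-byte IP header, and
    every fragment except the last carries a payload that is a multiple
    of eight bytes."""
    payload = datagram_len - ip_header       # original payload to carry
    per_frag = (mtu - ip_header) // 8 * 8    # usable payload per fragment
    return math.ceil(payload / per_frag)

# A maximal 65520-byte IPoIB superpacket over a 1500-byte MTU link:
n = fragments(65520, 1500)
print(n, "fragments,", f"~{n * 20 / 65520:.1%} extra header bytes")
# -> 45 fragments, ~1.4% extra header bytes
```

The extra header bytes are trivial, which suggests the cost is the per-fragment processing and forwarding work on the machine in the middle rather than the bytes themselves.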

Although this seems counter-intuitive — the default rule of network optimisation is to avoid packet fragmentation wherever possible — it seems to be backed up by IBM's documentation on superpackets, which states "[c]hanging the MTU value from the default settings can also have unexpected consequences so it is not recommended except for the most advanced user."
