NIS performance tuning
Apr. 16th, 2008 09:58 pm

My decision to reconfigure NIS has been comprehensively vindicated. I decided to up the number of slaves from a paltry one to six and to change the default bindings to distribute the load evenly across the servers.
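For the curious, spreading the bindings amounts to giving each group of clients a different `/etc/yp.conf` — a sketch only; the domain and server names below are placeholders, not my real ones:

```
# /etc/yp.conf on one group of clients (names are illustrative)
domain hpc.example.com server nis-slave1
domain hpc.example.com server nis-slave2
```

A second group of clients would list a different pair of slaves, and so on. After editing, restart ypbind and confirm the binding with ypwhich.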
When I moved the first cluster of systems off the master, I noticed that the parallel ypwhich I issued came back rather more swiftly than the first, but I assumed that it was load related and thought no more of it. When the same thing happened as I moved the second cluster, I decided to run a couple of experiments to confirm my intuitive feeling that things were running with greater celerity.

To establish a baseline, I ran a parallel ypwhich across 23 systems connected to the original NIS master. The systems were not under load — all the work had been checkpointed to allow for maintenance. The command took some 16 seconds to complete.

I then ran the command against 34 systems, split to use two different NIS slaves. Each system was running a full workload. The command took 0.6 seconds to complete — 26 times faster than the baseline case!
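The measurement can be reproduced with a harness along these lines — a sketch rather than my actual script; the host names are placeholders, and the remote shell defaults to a dry run:

```shell
#!/bin/sh
# Parallel ypwhich probe (a sketch, not my actual harness).
# RSH defaults to 'echo' so this is a safe dry run; for a real
# measurement, set RSH=ssh and HOSTS to your node list, e.g.:
#   time RSH=ssh HOSTS="$(cat nodelist)" sh nis-probe.sh
RSH=${RSH:-echo}
HOSTS=${HOSTS:-"node01 node02 node03"}

for h in $HOSTS; do
    $RSH "$h" ypwhich &   # one remote shell per host, all in parallel
done
wait                      # returns only when the slowest host has answered
```

Wrapping the whole run in `time` gives the figure that matters here: the wall-clock time until the slowest system's ypwhich comes back.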
My test, which relies on logging in to systems in parallel and running a series of commands, is likely to be extremely sensitive to poor NIS latencies. But it also seems to me to be indicative of the sorts of delays a large-scale parallel application might see when launching across a large number of nodes. In most cases, a 15–20 second delay would be negligible. In the case of a real-time application, especially one that depends on running a number of large MPI executables within a single job, the cumulative start-up delays might well have a significant impact.