A tricky scheduling problem solved...
Dec. 13th, 2013 02:51 pm
A busy few days trying to trace the source of a substantial degradation in the performance of the batch subsystem. Examining the dispatch cycles (normally 10-20 seconds but now inflated to 130-180 seconds) I immediately suspected a bad job but was unable to trace the cause until I enabled full and performance debugging. With this switched on I noticed that each dispatch cycle contained over 330,000 lines referring to a specific job which, when examined, proved to have requested a set of resources that could never be satisfied.
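For anyone chasing something similar, here is a rough sketch of how that kind of tally could be done: count how often each job is mentioned in one cycle's worth of debug output and list the noisiest. The log format is an assumption (lines that mention a job as "job <id>"), and the names `JOB_RE`, `count_job_mentions` and `noisiest_jobs` are made up for illustration; a real scheduler's debug output will need its own pattern.

```python
import re
from collections import Counter

# Hypothetical pattern: assumes a debug line names a job as "job <id>",
# "job=<id>" or "job:<id>". Adjust to the real scheduler's log format.
JOB_RE = re.compile(r"\bjob[ =:]+(\d+)", re.IGNORECASE)


def count_job_mentions(lines):
    """Return a Counter mapping job ID -> number of log lines mentioning it."""
    counts = Counter()
    for line in lines:
        match = JOB_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def noisiest_jobs(lines, top=5):
    """Return the `top` most frequently mentioned jobs as (job_id, count) pairs."""
    return count_job_mentions(lines).most_common(top)
```

Feeding it the lines of a single dispatch cycle (an open file handle works, since it is iterated line by line) should make a job with hundreds of thousands of mentions stand out at the top of the list.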
Knowing the likely cause of the problem, I slapped an admin hold on the job and the cycle times promptly dropped back to something like normal. My sense of timing was, as ever, impeccable: I fixed the problem in time for a couple of the others to push off for their Christmas lunch, while I hung around for as long as I could to make sure that everything had settled before heading off to keep my appointment in town. Excelsior!