sawyl: (Default)
[personal profile] sawyl
I've discovered a nasty little problem with LoadLeveler reservations. It is kind of hard to describe, but it means that sometimes, just sometimes, the reservation actually prevents the job from running.

Imagine a pair of reservations. The two have at least one node in common, but because the first reservation is due to finish before the second is due to start, this is not a problem.

Now, consider what happens when RESERVATION_CAN_BE_EXCEEDED is set to true in the LoadL_config file and a job is submitted into the first reservation. If the job has a wallclock time that is longer than the reservation, but which means that it will finish before the second reservation becomes active, there is no problem. But if the wallclock time of the job means that it will end after the second reservation is due to start (i.e. if the current time plus the wallclock time of the job is larger than the start time of the second reservation), the two resource requests will clash and, consequently, the job will not run.
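
The condition is easier to see written out. What follows is only a sketch of the arithmetic described above, not anything LoadLeveler itself exposes: the function name and the example times are made up for illustration.

```python
from datetime import datetime, timedelta


def clashes_with_next_reservation(now, wallclock_limit, next_start):
    """Return True if a job started at `now` could still be running
    when the next reservation on a shared node is due to start.

    Hypothetical helper: it only mirrors the comparison described in
    the text (current time + wallclock time > start of second reservation).
    """
    return now + wallclock_limit > next_start


# Illustrative times only (assumed, not taken from a real system).
now = datetime(2018, 8, 1, 11, 0)             # job is ready to start
first_end = datetime(2018, 8, 1, 12, 0)       # first reservation ends
second_start = datetime(2018, 8, 1, 14, 0)    # second reservation begins

# A two hour job overruns the first reservation, but is done by 13:00,
# before the second reservation starts: no clash.
short_job = timedelta(hours=2)
print(now + short_job > first_end)                                   # True
print(clashes_with_next_reservation(now, short_job, second_start))   # False

# A four hour job would still be running at 14:00: the requests clash.
long_job = timedelta(hours=4)
print(clashes_with_next_reservation(now, long_job, second_start))    # True
```

Note that the end time of the first reservation plays no part in the check; only the job's wallclock time and the start of the second reservation matter.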

Thus, under certain circumstances, running a job in a reservation may actually hamper its execution, rather than assist it. And, worse still, because the placement of reservations depends on the state of the system, the problem does not necessarily occur repeatably...