Backfill simulations
Jun. 27th, 2014 03:44 pm

I'm rather happy with the way my backfill model has come out. Using a very simple algorithm that does almost no forward planning, I've been able to come up with something that keeps my simulated system almost fully utilised and produces a very cute allocation map at the end of its run.

The allocation map shows simulation time on the ordinate and simulated nodes on the abscissa. Individual jobs are shown in different colours, although this is slightly hard to see on the plot: the colouring algorithm doesn't take the trouble to ensure that adjacent jobs use distinct colours, and a single job is not necessarily assigned to contiguous nodes, so it can be tricky to tell which job is which. The model was run with 128 nodes and 256 jobs, with the largest possible job constrained to 1/16th of the total system size (8 nodes).
A plot of the utilisation at each step (nothing more than the percentage of allocated nodes) shows that while there are sufficient jobs waiting in the input queue, the scheduler is able to keep the system over 98 per cent busy.
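
To make the utilisation figure concrete, here is a minimal sketch of the calculation in Python. The `utilisation` function and the per-node list representation are assumptions made for illustration, not the model's actual code.

```python
def utilisation(allocation, total_nodes):
    """Percentage of nodes that hold a job at one simulation step.

    `allocation` is a hypothetical list with one entry per node:
    a job id if the node is busy, or None if it is idle.
    """
    busy = sum(1 for job in allocation if job is not None)
    return 100.0 * busy / total_nodes


# One step of a 128-node system with 126 nodes allocated to job 7.
step = [7] * 126 + [None] * 2
print(f"{utilisation(step, 128):.1f}%")  # prints 98.4%
```

On a 128-node system, staying over 98 per cent busy means no more than two nodes idle at any step.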

Obviously the situation is not realistic — job times are assumed to be accurate and all the jobs are known ahead of time, rather than appearing as the simulation evolves — but it provides a useful way of studying the effects of a few basic parameters on job allocation and scheduler efficiency, as well as pointing up some interesting areas for further study.
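
For readers who want to experiment, the following is a rough sketch of a greedy backfill loop under the same simplifications: every job is known up front and runs for exactly its stated time. It is not the model described here. The `simulate` function, the uniform job-size and run-time distributions, and the `max_time` parameter are all assumptions, and the sketch tracks only node counts rather than which nodes each job occupies.

```python
import random


def simulate(num_nodes=128, num_jobs=256, max_fraction=16, max_time=20, seed=0):
    """Greedy backfill sketch: at each step, start every queued job that fits.

    Job sizes and run times are drawn uniformly at random; that choice is
    an assumption for illustration, not taken from the original model.
    """
    rng = random.Random(seed)
    max_size = num_nodes // max_fraction  # largest job: 1/16th of the system
    # Queue of (nodes_needed, run_time) pairs, all known ahead of time.
    queue = [(rng.randint(1, max_size), rng.randint(1, max_time))
             for _ in range(num_jobs)]
    running = []   # (finish_step, nodes_held) for jobs in progress
    free = num_nodes
    step = 0
    history = []   # utilisation percentage at each busy step

    while queue or running:
        # Release the nodes of jobs that have finished by this step.
        still_running = []
        for finish, size in running:
            if finish <= step:
                free += size
            else:
                still_running.append((finish, size))
        running = still_running

        # Walk the queue in order, starting anything that fits right now.
        waiting = []
        for size, run_time in queue:
            if size <= free:
                free -= size
                running.append((step + run_time, size))
            else:
                waiting.append((size, run_time))
        queue = waiting

        if running:
            history.append(100.0 * (num_nodes - free) / num_nodes)
        step += 1
    return history


history = simulate()
print(f"steps: {len(history)}, peak utilisation: {max(history):.1f}%")
```

Changing `num_nodes`, `num_jobs` or `max_fraction` is a quick way to see how those basic parameters affect the utilisation curve, particularly once the queue starts to drain.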
