Monday, February 20, 2017

What datastructure does the CFS use and why

CFS is the Linux kernels completely fair scheduler. It uses red black (RB) trees. 

Red Black trees - datastructure
Nice notes on understanding red-black trees are here [1] and [2].

Why and How is it used in the scheduler?


  • RB trees offer guarantees on worst case times (log n) for insertion, deletion, and search along with their self balancing properties which makes them useful for working with time sensitive applications. 
  • The RB tree for the scheduler is indexed by virtual run time which is the accounted CPU time already made available to the task. Hence, tasks in the left half of the tree are the ones that have received the least time while the ones on the right are the ones which have already received CPU time. 
  • The CFS just picks a node from the left half of the tree and after giving the task CPU time, it adds it to the virtual run time of the task. Then if the task is still runnable it is added back to the RB tree. 
  • Now since the aggregate virtual run time of the tree has gone up as compared to the other nodes, it gets moved to the right of the tree. In this way, tasks keep moving through the RB tree as they receive CPU time. 


References:
[1] Red black trees on Wikipedia https://en.wikipedia.org/wiki/Red%E2%80%93black_tree
[2] Visualization of red black trees https://www.cs.usfca.edu/~galles/visualization/RedBlack.html
[3] Completely fair scheduler design on wikipedia https://en.wikipedia.org/wiki/Completely_Fair_Scheduler
[4] Linux Journal - completely fair scheduler http://www.linuxjournal.com/magazine/completely-fair-scheduler