My guess is that the lower-level code has a MUCH easier time with tasks like "render the initial substring of this that is at most N pixels wide".
Fun fact: your example code is almost exactly how we do it internally. (Although we use a binary search and cache the result, which is highly recommended if you want performance.)
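For anyone who wants to try the binary-search-plus-cache idea, here is a rough sketch in Python. It is not our internal code: `make_fitter`, `fit_prefix`, and `fake_measure` are hypothetical names, and `fake_measure` is a stand-in for whatever text-measuring call your toolkit provides. It also assumes a longer prefix is never narrower than a shorter one, which is what makes binary search valid here.

```python
from functools import lru_cache
from typing import Callable


def make_fitter(measure: Callable[[str], float]) -> Callable[[str, float], str]:
    """Build a cached function returning the longest prefix that fits a width.

    `measure` is a hypothetical callback returning the rendered width of a
    string in pixels. Assumes width never decreases as the prefix grows.
    """

    @lru_cache(maxsize=1024)  # cache results per (text, max_width) pair
    def fit_prefix(text: str, max_width: float) -> str:
        lo, hi = 0, len(text)  # invariant: text[:lo] fits (empty prefix assumed to fit)
        while lo < hi:
            mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
            if measure(text[:mid]) <= max_width:
                lo = mid       # prefix of length mid fits, try longer
            else:
                hi = mid - 1   # too wide, try shorter
        return text[:lo]

    return fit_prefix


# Stand-in measurement for illustration only: every character is 7 pixels wide.
def fake_measure(s: str) -> float:
    return 7.0 * len(s)


fit = make_fitter(fake_measure)
truncated = fit("Hello, world", 30)  # longest prefix at most 30 pixels wide
```

This takes O(log n) measurements per string instead of one per character, and repeated calls with the same text and width skip measuring entirely. If your measuring function depends on font or size, those would need to be part of the cache key too.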
