This removes all of the code that was previously used to get captions
from /timedtext and instead always uses whatever is extracted from the
video page.
Unfortunately, this now requires a whole video fetch just for the
captions. But assuming captions are only requested by a frontend, this
won't be a problem thanks to the memory cache: the captions link will
already be in memory because the just-requested video is in memory too.
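
A minimal sketch of the caching argument above, not the actual code in
this change. Every name here (fetch_video_page, get_video, get_captions,
CACHE_TTL_SECONDS, the example URL) is hypothetical, and the cache
lifetime is an assumed value. It only illustrates why a captions request
that follows a video request should not trigger a second fetch.

```python
import time

# Hypothetical names throughout; this is an illustration, not project code.
CACHE_TTL_SECONDS = 300  # assumed cache lifetime
_video_cache = {}        # video id -> (time stored, extracted video data)
stats = {"fetches": 0}   # counts how often the expensive fetch runs


def fetch_video_page(video_id):
    """Stand-in for the expensive video page fetch and extraction."""
    stats["fetches"] += 1
    return {
        "id": video_id,
        # Caption links are extracted as part of the video page data.
        "captions": [
            {
                "label": "English",
                "url": f"https://example.invalid/captions/{video_id}/en",
            }
        ],
    }


def get_video(video_id):
    """Return video data, reusing the in-memory copy while it is fresh."""
    now = time.monotonic()
    entry = _video_cache.get(video_id)
    if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]
    video = fetch_video_page(video_id)
    _video_cache[video_id] = (now, video)
    return video


def get_captions(video_id):
    """Captions come from the (usually cached) video data."""
    return get_video(video_id)["captions"]
```

In this sketch, a frontend that calls get_video("abc") and then
get_captions("abc") pays for one fetch. A captions request for a video
that is not in the cache still costs a full video fetch.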