Also, detecting videos that are inappropriate for children is a lot harder than identifying content creators who can be trusted to post appropriate videos (and to tag them correctly). That trust can be learned from the user's history: how many times their stuff has been flagged, whether they get upvotes from users that are themselves deemed credible, and so on. The more layers of indirection, the better, a la PageRank.
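To make the PageRank comparison a bit more concrete, here is a rough sketch of trust flowing through upvotes. It is only an illustration under my own assumptions: the `upvotes` mapping, the `trust_scores` function, and the damping value are made up for the example and say nothing about how any real site does it. Flags and posting history are left out to keep it short.

```python
def trust_scores(upvotes, damping=0.85, iterations=50):
    """PageRank-style trust over a hypothetical upvote graph.

    `upvotes` maps each user to the list of users whose videos they upvoted.
    An upvote passes along a share of the upvoter's own score, so upvotes
    from credible users count for more than upvotes from unknown ones.
    """
    # Collect every user that appears as an upvoter or as a target.
    users = set(upvotes)
    for targets in upvotes.values():
        users.update(targets)
    users = sorted(users)
    n = len(users)

    # Start with uniform trust.
    score = {u: 1.0 / n for u in users}

    for _ in range(iterations):
        # Everyone gets a small baseline, as in standard PageRank.
        new = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = upvotes.get(u, [])
            if targets:
                # Split this user's score among the users they upvoted.
                share = damping * score[u] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Users who upvoted nobody spread their score evenly.
                share = damping * score[u] / n
                for t in users:
                    new[t] += share
        score = new
    return score


# Hypothetical example data: carol is upvoted by alice and bob.
example_upvotes = {
    "alice": ["carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["erin"],
}
example_scores = trust_scores(example_upvotes)
```

In this toy graph carol ends up with the highest score, and alice benefits indirectly because her only upvote comes from carol. That second-hand effect is the "layers of indirection" part.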
So even without analyzing the videos themselves, the algorithm would have a much smaller set to recommend from, but still potentially millions of videos. You still need some staff to train the algorithm, but you don't need paid staff to look at every single video to get a good set of videos it can recommend. The staff might spend most of their time on anomalous videos, such as ones posted by a user the algorithm trusted but then flagged by a user the algorithm considered credible. They would then tag that video with rich information that helps the algorithm in the future, beyond just removing the video or reducing the trust of the poster or the credibility of the flagger.
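The "anomalous video" triage could look something like the sketch below. Everything here is hypothetical: the `Video` record, the `review_queue` function, and the 0.7 thresholds are placeholders I picked for illustration, and the `trust` and `credibility` dictionaries are assumed to come from whatever scoring the algorithm already does.

```python
from dataclasses import dataclass, field


@dataclass
class Video:
    video_id: str
    poster: str
    flagged_by: list = field(default_factory=list)


def review_queue(videos, trust, credibility, trust_min=0.7, cred_min=0.7):
    """Pick videos where a trusted poster was flagged by a credible user.

    `trust` maps poster -> score and `credibility` maps flagger -> score,
    both assumed to be in [0, 1]. Unknown users default to 0.0.
    """
    queue = []
    for video in videos:
        # Videos from untrusted posters are not surprising when flagged.
        if trust.get(video.poster, 0.0) < trust_min:
            continue
        # Keep only the flaggers the algorithm considers credible.
        credible_flaggers = [
            user for user in video.flagged_by
            if credibility.get(user, 0.0) >= cred_min
        ]
        if credible_flaggers:
            queue.append((video.video_id, credible_flaggers))
    return queue


# Hypothetical example data.
example_videos = [
    Video("v1", "trusted_creator", ["credible_user", "random_user"]),
    Video("v2", "unknown_creator", ["credible_user"]),
    Video("v3", "trusted_creator"),
]
example_trust = {"trusted_creator": 0.9, "unknown_creator": 0.2}
example_credibility = {"credible_user": 0.8, "random_user": 0.1}
example_queue = review_queue(example_videos, example_trust, example_credibility)
```

Only `v1` lands in the queue here, since it is the one case where the two signals disagree. Those are the cases where a human's time seems best spent.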
I'm not sure heavy automation is needed here; people jump from content creator to content creator by word of mouth. In contrast, most algorithmic suggestions seem to me highly biased towards what is popular in general. I click on one wrong video in a news article and for the next two days my recommendations are pop music, Jimmy Kimmel, Ben Shapiro and animal videos.
All three things I just mentioned are fairly niche, comparatively, yet it knows that I've been watching a lot of them lately and is giving me more of them.