It seems that a lot of users upload a video that is published with the default settings and is therefore visible from the outside. Even if they change the settings fairly quickly, automated systems like ours will already know that the video exists.
There could be other reasons, but this seems the most likely, especially since an uploaded video can be published fairly swiftly.
[0] https://pex.com
[1] https://blog.pex.com/what-content-dominates-on-youtube-39081...
Even if it is not illegal, it seems unethical at the very least to link to private videos like this. It also seems trivial for you to "re-scrape" your database every now and then, check whether any existing videos have changed from listed -> unlisted, and remove those that have.
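To make the re-scrape idea concrete, here is a rough Python sketch. Everything in it is hypothetical: `fetch_visibility` stands in for whatever lookup the scraper actually uses, and the "public" / "unlisted" labels are assumed for illustration only.

```python
from typing import Callable, Dict, List


def prune_non_public(
    index: Dict[str, str],
    fetch_visibility: Callable[[str], str],
) -> List[str]:
    """Re-check every stored video and drop those that are no longer public.

    `index` maps a video ID to its last known visibility and is updated
    in place. `fetch_visibility` is a hypothetical callback that returns
    the current visibility label for a video ID.
    Returns the IDs that were removed.
    """
    removed = []
    # Iterate over a copy of the keys because entries are deleted in the loop.
    for video_id in list(index):
        current = fetch_visibility(video_id)
        if current != "public":
            del index[video_id]
            removed.append(video_id)
        else:
            index[video_id] = current
    return removed
```

Run on a schedule, this would catch videos whose owners changed the setting after the first scrape; how often to run it is a trade-off between request volume and how long a stale link stays around.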
I think a better approach for everyone involved would be to store references only to videos that were posted more than x minutes ago. I'm not sure whether they have that information when scraping, though.
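As a sketch of that age check in Python, assuming the scraper does have an upload timestamp for each video (which may not be the case). The function name and the `min_age` parameter are made up for illustration; `min_age` plays the role of the "x minutes" above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def old_enough_to_store(
    published_at: datetime,
    min_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the video was posted at least `min_age` ago.

    `published_at` is assumed to be a timezone-aware upload timestamp.
    `now` can be passed explicitly to make the check testable.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - published_at >= min_age
```

For example, with `min_age=timedelta(minutes=30)` a video uploaded 10 minutes ago would be skipped on this pass and picked up on a later one, provided it is still public by then.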