PSA: Watch in a private/incognito tab/window. If you are currently logged into your Google account, this WILL pollute your watch history: https://www.youtube.com/feed/history
Wow. It's a fascinating look at what likely makes up a large majority of YouTube content that I would otherwise never come into contact with.
I also love how the creator packaged it up as though an alien visitor was using YouTube to sample our civilization. This is the 99%.
But this... this is mesmerizing. As cheesy as the premise may be, you do feel a little like an outsider voyeur - not in a perverse sense, but in the having-no-expectations-or-context sense. Each video proves a gem, and the timing is right. And knowing that you may be the only person who has ever seen it just adds to the mystique... absolutely brilliant! :O
[1] "a"/"last hour": https://www.youtube.com/results?search_query=a&sp=EgIIAQ%253...
The entire value here, for me at least, is a) the randomness and b) the fact they haven't been watched before.
There's so much curated stuff (which is great), and all tech companies are trying so hard to send me what they think I'll like / agree with, it's refreshing to step out of that box.
Edit: apologies if, perchance, I missed some subtle sarcasm btw... You never know on them interwebs
It's a video of a woman reciting a poem that she wrote for her eldest son that speaks of her love for her son and her wish that he would get "off the streets". Emotional, honest, real. YouTube like I've never experienced. Brilliant.
I also saw someone rave about "Donut" this week - schedules random 1-on-1's with people in your company to help with cross-pollination and bigger picture context. Chat Roulette and what a dumpster fire that is comes to mind, but I wonder if a LinkedIn-based service of a similar nature would be good just to learn about other companies, other corporate cultures, etc..
Good idea to do a full reset periodically anyways to reset your recommendations.
Netflix, Amazon, YouTube, PornHub, etc... they're all accomplishing little more than "similar to the one, and only one, item you last saw", with dramatic shifts in "profiling" from one or two videos.
Actually, Netflix acknowledges this and splits the recommendation into "because you watched X..", so at least it covers a greater range (eg last 5 things seen)
I'm damned sure they could be much more useful if they would let me tell them what I like, by implementing rating systems that are worth using (e.g. the ability to browse and edit previous ratings in a sane fashion)
But user-useful recommendation is not the actual goal, so really it's just that our metrics are wrong. It's probably great according to view counts.
How does it work, technically? Is there an API to pull videos with a certain title format within a certain range, and then are the sections of video randomly chosen?
Edit: Found this https://github.com/wonga00/astronaut - answers some questions:
> The server currently pulls in videos daily from youtube. Search criteria is [TAG]XXXX with upload time this week, where TAG is a raw video prefix such as 'dsc' or 'img'. This search turns out to be a good approximation for the data set of home videos created in the last week.
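For anyone wondering what that looks like in practice, here is a minimal sketch of generating such search terms. It assumes the XXXX part is a four-digit number (the README doesn't spell that out), `random_query` is a made-up helper name, and it only builds strings; it doesn't call any YouTube API or apply the "this week" filter.

```python
import random

# Raw-file prefixes named in the README quote; the real list may be longer.
PREFIXES = ["dsc", "img"]


def random_query(rng: random.Random) -> str:
    """Build one search term of the form [TAG]XXXX (hypothetical helper)."""
    tag = rng.choice(PREFIXES)
    number = rng.randrange(10_000)  # assumption: XXXX means four digits
    return f"{tag}{number:04d}"     # zero-padded, like camera file counters


rng = random.Random(0)  # seeded only so repeated runs give the same terms
queries = [random_query(rng) for _ in range(5)]
print(queries)
```

Each term would then be fed to a search restricted to recent uploads, which is the part the server does daily according to the README.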
Can anyone explain the prevalence of group exercise videos? Has anyone had radically different experiences?
That is, the user is capable of efficiently informing the engine of their taste, and there’s significant incentive for the user to consistently re-evaluate their ratings (playlists), so it can be trusted as up to date.
Another very important aspect is that playlists are useful enough to the user that they actually want to maintain it.
For example, Amazon, Netflix and PornHub all have rating systems, but they’re not at all useful. The interface isn’t useful enough for reviewing and reflecting on, it’s not comprehensive enough to keep as a primary list (because it only covers what they offer, which is very limited) and there’s of course no impact on the recommendation engine (because the rating systems are not worth using; chicken and egg). No sane person would touch the things (beyond “upvoting”, which isn’t significantly related to taste)
Imo ratings are absolutely vital to useful recommendation, but they’ve been totally neglected
I hope he was able to retire after all!
It seems that a lot of users will upload a video which is published with the default settings and thus is visible from the outside. Even if they change the settings fairly quickly, automated systems like ours will already know about the existence of that video.
There could be other reasons but this seems the most likely, especially as a video that is being uploaded can be published fairly swiftly.
[0] https://pex.com
[1] https://blog.pex.com/what-content-dominates-on-youtube-39081...
When I go to YouTube on a fresh device without being logged in, it's a pile of steaming clickbait and pop-internet-culture garbage. On my account, by contrast, YouTube is full of mostly great recommendations of high-quality content and I can pretty much always find several new and interesting things to watch should I feel like it.
(I watch stuff like Kurzgesagt, Smarter every day, AvE, Rick Beato, Today I found out, Wisecrack, Wendover Productions, Practical Engineering, Vox, Crash Course, SciShow... that's just from browsing my current recommendations. I would guess none of that shows up for fresh accounts)
Even if it may not be illegal, at the very least it would seem unethical to link to private videos like this, and it would seem trivial for you to "re-scrape" your database every now and then to check whether any existing videos have changed from listed -> unlisted, and if they have, remove them.
That said, the "patient" may not be one; lots of trainee doctors and nurses use YouTube to show their abilities, using mock patients.
I think a better approach for everyone involved would be to only store references to videos which were posted more than x minutes ago. I'm not sure if they have that information when scraping though.
Nice to see you completely ignored my advice though :P
We are far from the "SHOCKING: A WHALE EATS A BABY LIVE!!!" with the red circled preview image.
This was one of the first few that popped up for me, a cyberpunk/Terry Gilliam reality thing: https://www.youtube.com/watch?v=RPyfqim1KA8
It's not immediately clear which one is the "original one", but this one has been around since at least 2012[1].
[1] https://github.com/wonga00/astronaut/commit/a9bdaf0d00588b7a...
That said, you can indeed tell them what you like, but by use of negative space. When you get bombarded with obviously horrible recommendations, do the two-step process of clicking 'Not Interested' (if possible without even watching the video, or you can check it in incognito mode, assuming they're not watching that even more closely) and then 'tell us why', and respond 'I'm not interested in this channel: "undesirable video maker"'.
That assumes you can be sure you want to nuke the channels and subjects in question, but when it's clickbait channels and/or alt-right propaganda it's generally easy to identify and not get wrong. I'm sure the same would be true for leftwing propaganda, but the stuff I don't want pushed on me has a whole language and lexicon that's easily recognizable by video title, channel title and attempted clickbait image. If stuff trips my sensors on those grounds, I'm generally comfortable nuking it unseen.
I think quite a lot has to do with machine learning working out that if you panic the human animal they pay attention to threats, and therefore to maximize engagement 'if it bleeds, it leads' (old newspaper maxim). Newspaper editors can (and might not) automatically apply a social-benefit heuristic or sense of social shame (not wanting to be a 'muckraker' or troublemaker), and machine learning may not even start with such a concept.
If engagement was maximized by turning viewers into cannibalistic humanoid underground dwellers (CHUDs), machine learning would simply make note of that and run as hard as it could in that direction, since it doesn't have a larger context in mind unless programmed to do so.
(Such a larger context is actually sort of controversial: a lot of people demonize the very concept of social justice, and without it you get these hacks to maximize engagement by tapping into really unmanageable human/animal behaviors)
>It seems that a lot of users will upload video which is by default published [and then they change it to private] //
So to avoid that sort of unexpected public-ing (ie publishing) only one extra scrape would be needed. Or, if they knew the period over which the setting was normally changed then they could just delay the scrape until most would have already been changed.
I imagine though, in part, the 'fun' is catching inadvertent publication and morality is not considered.
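The "delay the scrape" idea above could look something like this. It's a minimal sketch, not how any real crawler works: `ready_to_index`, the `(video_id, uploaded_at)` tuples and the 30-minute grace period are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def ready_to_index(uploads, now, grace=timedelta(minutes=30)):
    """Return IDs uploaded at least `grace` ago (hypothetical helper).

    `uploads` is an iterable of (video_id, uploaded_at) pairs; the
    30-minute default is an arbitrary placeholder, not a measured value.
    """
    return [vid for vid, uploaded_at in uploads if now - uploaded_at >= grace]


now = datetime(2017, 1, 17, 12, 0, tzinfo=timezone.utc)
uploads = [
    ("aaa", now - timedelta(minutes=5)),   # too fresh: owner may still flip the setting
    ("bbb", now - timedelta(minutes=45)),  # past the grace period
]
print(ready_to_index(uploads, now))
```

Anything still inside the grace period is simply skipped until a later pass, by which time videos that were switched to unlisted or private would no longer show up.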
In the same vein of avoiding bubbles, I browse reddit by 'Top Of The Hour'. Filtered enough to be decent quality, very fresh content, and not yet subverted by bubble affiliation or the hive mind.
It would go something like: System transcodes when demand is applied; normally demand misses 90% of videos; demand surely focuses only on unprocessed videos; 'attack' diverts resources to processing videos that would otherwise have never required processing.
Doesn't anyone question that YouTube is basically run by an ad service? If you really want to connect to people then why does "company x" want to know and keep tabs?
Why is there a thumbs up or down in the first place? Or even, why doesn't the number of views matter to you in this case? "company x" had a great search engine but now it seems crippled by the fact you can't say: "exclude the top x percent popular results"
I'm writing this because when I searched for something obscure, I went to page 8 of the "company x" results and got slapped multiple times with "suspicious behaviour" notifications and had to wait or solve a captcha.
You can also see how many viewers are currently on the site, if you inspect the websockets messages.
Or maybe because an astronaut explores the space between the stars, and watching unwatched videos is exploring the space between YouTube "stars".
The concept is great. It's real, and that's amazing, but it's also a reminder of how terrible people can be.
https://alexgarces.github.io/loststories/
The titles of all the videos shown are random strings based on the default media file names of some popular devices, such as iPhone or Samsung Galaxy. Some examples of these titles would be IMG_8869.MOV, DSC 0711 or MVI 6710.
All the videos, requested in real time, are not more than one year old. They are almost undiscovered, usually with very few views (or not even one).
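Those two criteria (under a year old, hardly any views) boil down to a simple filter. Here's a minimal sketch; the `Video` record, the `undiscovered` helper and the cutoff of 10 views are assumptions for illustration, since the page only says "very few views", and the sample data is made up apart from the example titles.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Video:
    """Hypothetical record; a real search result carries more fields."""
    title: str
    published: datetime
    views: int


def undiscovered(videos, now, max_age=timedelta(days=365), max_views=10):
    """Keep videos no older than `max_age` with at most `max_views` views."""
    return [
        v for v in videos
        if now - v.published <= max_age and v.views <= max_views
    ]


now = datetime(2020, 1, 1, tzinfo=timezone.utc)
videos = [
    Video("IMG_8869.MOV", now - timedelta(days=30), 0),   # recent, unwatched
    Video("DSC 0711", now - timedelta(days=400), 2),      # older than a year
    Video("MVI 6710", now - timedelta(days=10), 5000),    # already discovered
]
picked = undiscovered(videos, now)
print([v.title for v in picked])
```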
I think of the ad ecosystem as like a tree that has grown so tall and dense that little can grow beneath it. If that tree weren't there I don't think we'd have nothing. I think we'd have a richer ecosystem with many more things growing.
What alternative funding model do you propose?
Also I don't believe unlisted videos are considered to be private. There is a separate private setting which prevents the public from seeing such a video.
And finally, it's not very trivial to touch 5.5 billion videos often enough to see if any of those became unlisted.
It would defeat the purpose of our service if we delayed our identification, and introducing such capabilities into our system would require significant engineering effort, with significant economic impact on our business.
- https://twitter.com/@wongavision
- https://twitter.com/@astrojams1
Originally posted here: https://news.ycombinator.com/item?id=13413225
- Video IDs are spit out onto a Socket.io connection. (Another person claims it’s synchronized, which seems likely.)
- While one video plays, another player is in the background buffering the next video, making it quite seamless.
- The code is from 2011, apparently, and it feels like it. You have code in script tags and plain old unminified JS, not to mention jQuery. Nothing wrong with that, but it’s almost nostalgic at this juncture.
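The two-player trick from the list above can be sketched in a few lines. To be clear, this is a language-agnostic illustration written in Python, not the site's actual jQuery code: `Player`, `DoubleBuffer` and the video IDs are all invented, and real players would load and buffer asynchronously.

```python
class Player:
    """Stand-in for an embedded video player (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.video_id = None

    def load(self, video_id):
        # A real player would start buffering here.
        self.video_id = video_id


class DoubleBuffer:
    """One visible player, one hidden player preloading the next video."""

    def __init__(self):
        self.front = Player("A")  # currently visible
        self.back = Player("B")   # hidden, buffering

    def preload(self, video_id):
        self.back.load(video_id)

    def advance(self):
        # Swap roles: the preloaded player becomes visible immediately.
        self.front, self.back = self.back, self.front
        return self.front.video_id


ids = ["vid1", "vid2", "vid3"]  # placeholder IDs, e.g. as received over a socket
db = DoubleBuffer()
db.preload(ids[0])
played = [db.advance()]
for nxt in ids[1:]:
    db.preload(nxt)              # buffers while the current video plays
    played.append(db.advance())
print(played)
```

Because the swap is just a role change, there is no loading gap between clips as long as the hidden player has finished buffering by the time the current one ends.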
So many of the videos it was pulling up had IMG/MOV/DSC in the title that I wondered if that was the strategy for finding unwatched videos, but I don’t think so, it must just be a consequence of many people uploading videos directly from camera files.
One remark I do have is that it seems to not be picking the most recent videos. There might be good reason for that (maybe waiting filters out bad content, or content that will have views?)
That is from the initial page load. So it would seem that the title pattern that you observed is intentional
I guess on one level it's invasive as hell but in an increasingly streamlined online experience it's nice to get glimpses at all the other stuff that's going on out there.
"from, get this, talking to my colleagues on a regular basis"
There's a ton of content on YouTube that's generated automatically, as well as marketing videos, screencasts, etc. but those are not going to be nearly as interesting as something that someone recorded and uploaded by hand.
Right now I see the follow-up to the guy who built his own VGA card, a video of someone building a camera that can see wifi, and a PBS video on the "quantum internet". All great suggestions.
Fun fact: if you listen to a song returned from the site it'll never be seen on the site again (as it would have >0 views)
Suppose I had an alternate funding model. Also suppose I wanted youtube to change it (note: I never said that). How does that invalidate the bad things I pointed out with the current model?
You never said whether or not you agreed with my original points.
So the question really just becomes: is the good of YouTube worth the bad?
Perhaps we need something like Astronaut for Spotify?
[1] https://write.as/poseur-to-composer/poseur-approach-to-makin...
[2] https://tonedeaf.thebrag.com/spotify-turns-5-reveals-theres-...
My only worry is that it selects for people who don't know or care to properly name their youtube videos, e.g. after watching for ten minutes, I'm yet to see a young person from the West. Though this is probably one of the reasons why, for me, the videos are so strikingly unfamiliar.
I'd say that's why. To deter dislike bots