Over the last year, it's become palpable.
Google has so little utility in this regard that in some cases, a hallucinating lie-machine offers a better answer than an index of what information is available on the internet.
This is an issue with Google's failure to respond to the deluge of SEO-driven content in their search results. They can do better. They've chosen not to.
I’d argue that they even encouraged it.
AI will be a victim of its own success too. Or it will need to be human researched and curated rather than just letting an algorithm run freely across the web.
It can only index stuff that's on the Web. Stuff on the Web is, contrary to what is popularly asserted, only a tiny fraction of all human knowledge.
I think people are forgetting how bad search was before Google. Google drove Web directories to extinction. Remember Yahoo!? Back in that era, if I were looking for something as simple as the University of Michigan, I clicked and drilled down through a Yahoo directory. The obvious search query would have been useless. Google changed all that.
I view Google as the yellow pages. It works well for that. Is it an oracle of knowledge? Of course not. How could I possibly expect to find knowledge in a place where there is no reward for making it available? People producing knowledge don't work for free.
I've tried ChatGPT and it's no better. It serves up stuff that is flat-out wrong.
Don't optimize for "most documents indexed" but for "highest quality of results". One of them encourages adding spam to their index, the other encourages removing spam from their index.
It has been a self-fulfilling positive feedback loop since then.
And yet for some reason they're all too eager to serve up sites scraping stackoverflow.
So do I. I can't tell you the last time I even held yellow pages in my hands.
In the last 2-3 months search quality for me has absolutely crashed and is barely usable.
In the present day, I cannot find my answer on the first page. If I click on the top hits, the page is a deluge of useless blog fluff, so it takes me more time to find what I am looking for.
More often than not I have to add reddit, forum, stackoverflow, etc. to find what I am looking for, because online communities provide more concise answers.
This is why Google's utility has collapsed.
idk, sounds plausible to me, the way things've been going.
HN is constantly pushing this notion that "spam" is some well-defined, solvable problem, so obviously Google wants it. That narrative just doesn't make sense from any angle. The notion that more clickbait improves Google's bottom line is absurd.
How much do you think companies are willing to spend to be the answer to, "what is the most reliable car?" or, "who should I vote for?"
It's not that the content doesn't exist or isn't indexed, it's that it's been drowned out by noise. Sifting through noise better was the entire reason Google took off compared to more standard crawlers. It now returns results worse than crawlers from the previous era.
That’s still a thing, although it seems they’re A/B testing its removal. I just opened a private tab (as I always do) and got a boring "More results" button, but I tried another browser (also with a private tab) and got the classic pagination.
Relevant search results that aren't just marketing sites or the big websites.
> It can only index stuff that's on the Web.
And much of it isn't really exposed by Google search.
> I view Google as the yellow pages. It works well for that
It used to. For me, it stopped working well for that a few years ago and has been getting steadily worse ever since.
Product reviews alone, whether for enterprise software or sports clothing, should be something that they can easily comb through by hand, as humans, and uprank sites that are putting out more than affiliate link assemblies.
That is an absurd exaggeration.
And unfortunately Google has become worse than ever at being able to differentiate between insight and fluff.
Before, the results would just not match what I was looking for. Now they do match what I was looking for, except some AI procedurally generated the content to show up when I searched those terms, with no regard for the accuracy of what the page says.
And why is the GoogleBot still on HTTP 1.1...
Today:
* Any term that might be related to a commercial product? That product comes first and frequently only.
* Search for two terms? It will first give its preferred result for each separately - usually commercial products (ha). And then it might give them together.
* Quoted terms are often taken as vague suggestions. Negative sign is often useless, etc.
Luckily HN posters don’t exactly represent a meaningful portion of the population.
I'm willing to accept that maybe you are exaggerating to make a point. Maybe you have a better example that is actually illustrative?
If I say "show me the best winter gloves, and only from sites that you can verify actually tested the product" and it follows the instruction (ignoring sites that just have a list of popular search results aggregated) then it is better. If it doesn't do what I want, I expect to be able to follow up and teach it.
I expect the chat style stateful search to take instruction for what type of sites I want results from. "Return me a list of websites with recipes for Bolognese that do not have a long story above the recipe. Build a table with the top five results normalized for portion size, comparing and contrasting the ingredients. Highlight unique ingredients in bold."
Then, with Google, it got better and almost all results were relevant.
But we’ve been regressing over the years, and now we’re at the point where 80% of all results are both irrelevant and simply SEO spam.
I find it really hard to believe Google has some of the smartest people in the world on search and they cannot identify this.
For example, just the other day I was searching for one string that I knew was part of a common code repository. To my surprise, Google couldn't find anything at all. Yandex, on the other hand, found the repository immediately and linked to GitHub.
Another common issue with Google is the difficulty of finding stuff like forum posts related to the search query. Sure, you could append "reddit" to the query, but there are still plenty of traditional forum sites and some of them have decades worth of discussion. I never see those sites pop up on a typical Google search unless I specifically look for them. Again, with Yandex, my experience is much better: it is not uncommon to see forum posts on the first page of results.
ChatGPT usually gives me the answer that I'm looking for and nothing else. Sometimes it does add extra info, which often teaches me about something that I wasn't aware of at all.
But the greatest benefit is I can ask it to clarify anything I don't understand. I don't need to go on a completely new Google quest, or jump through hoops to register on some site and hope a random internet person will deign to help me out. I can just ask, in the same conversation, and immediately get clarification.
Many people underestimate the incredible learning opportunities a well trained language model provides. It doesn't matter that it hallucinates or lies. Whatever it claims is usually easy to validate. What matters is the speed with which you can find uncluttered new leads or answers.
Yet GANs work quite well
You are dealing with a moving target that has a huge financial incentive. It's a very difficult problem.
Google didn't innovate that much except to provide a clutter-free interface and slightly better search. Prior to that, I used Webcrawler and then HotBot. A search like what you described would have easily returned useful results.
Also, once the ChatGPT AI takes off and becomes ubiquitous, then what if there is a lack of credible content for it to train on?
I want you to start a blank slate C (or C++) project. Ask Google how to write heapify, push_heap, and pop_heap in C. Ask ChatGPT the same.
I did this a few weeks ago. I literally could not find the answer on Google. ChatGPT gave me actual C code that I definitely did not trust but did verify.
Google results for questions like that are genuinely awful. It’s full of shitty tutorial websites that are full of ads and either don’t have the answer I need or don’t have it in a convenient form.
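For anyone who lands here looking for exactly that, here is a minimal sketch of the three operations for a max-heap of ints in plain C. It is not the code ChatGPT produced and not any library's implementation; the function names just mirror the C++ standard library ones mentioned above, and the calling convention (the new element is already sitting at the end of the array before `push_heap`, and `pop_heap` moves the max to the end) is borrowed from C++ as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two ints in place. */
static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Move the element at index i down until neither child is larger. */
static void sift_down(int *h, size_t n, size_t i) {
    for (;;) {
        size_t largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && h[l] > h[largest]) largest = l;
        if (r < n && h[r] > h[largest]) largest = r;
        if (largest == i) return;
        swap_int(&h[i], &h[largest]);
        i = largest;
    }
}

/* Move the element at index i up while it is larger than its parent. */
static void sift_up(int *h, size_t i) {
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h[parent] >= h[i]) return;
        swap_int(&h[parent], &h[i]);
        i = parent;
    }
}

/* Turn an arbitrary array of n ints into a max-heap. */
void heapify(int *h, size_t n) {
    for (size_t i = n / 2; i-- > 0; ) sift_down(h, n, i);
}

/* h[0..n-2] is already a heap and h[n-1] is the new value;
   afterwards h[0..n-1] is a heap. */
void push_heap(int *h, size_t n) {
    assert(n > 0);
    sift_up(h, n - 1);
}

/* h[0..n-1] is a heap; afterwards the old maximum sits in h[n-1]
   and h[0..n-2] is a heap. The caller shrinks its own length. */
void pop_heap(int *h, size_t n) {
    assert(n > 0);
    swap_int(&h[0], &h[n - 1]);
    sift_down(h, n - 1, 0);
}
```

Calling `pop_heap` repeatedly with a shrinking `n` leaves the array sorted in ascending order, which is a quick way to sanity-check it.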
And I've /recently/ hunted for something obscure, couldn't find it, managed to find an old bookmark to it, the server was still online and the content I wanted was still there. And no amount of crafting of a google search would bring it up. And the server in question didn't contain copyrighted material which would have resulted in a takedown block or anything like that.
It's frustrating how /bad/ Google has gotten for anything other than fairly basic, high level "searching".
I mean what you just listed.
Google won the search war because of PageRank eliminating lots of spam, and then something like 15 years of staying ahead of SEO spam and providing useful search.
Lately it seems like they've given up on the arms race and let the SEO spam win, but it isn't clear why.
And Google didn't produce high quality search for free, they used ads and sold the eyeballs they won.
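For context on what PageRank means here, this is a rough sketch of the textbook version: damped power iteration over a link graph. It is the published idea only, not whatever Google actually runs; the 4-page graph size, the `links` matrix layout, and spreading the rank of pages with no outgoing links evenly are all assumptions made for the example.

```c
#include <stddef.h>

#define N_PAGES 4  /* size of the toy graph (assumption for this sketch) */

/* links[i][j] != 0 means page i links to page j.
   On return, rank[] holds the scores, which sum to roughly 1. */
void pagerank(int links[N_PAGES][N_PAGES], double damping,
              int iterations, double rank[N_PAGES]) {
    int out_degree[N_PAGES];
    for (int i = 0; i < N_PAGES; i++) {
        out_degree[i] = 0;
        for (int j = 0; j < N_PAGES; j++)
            if (links[i][j]) out_degree[i]++;
        rank[i] = 1.0 / N_PAGES;  /* start from a uniform distribution */
    }

    for (int it = 0; it < iterations; it++) {
        double next[N_PAGES];

        /* Rank held by pages with no outgoing links is shared by all pages. */
        double dangling = 0.0;
        for (int i = 0; i < N_PAGES; i++)
            if (out_degree[i] == 0) dangling += rank[i];

        for (int j = 0; j < N_PAGES; j++)
            next[j] = (1.0 - damping) / N_PAGES + damping * dangling / N_PAGES;

        /* Each page passes its rank in equal shares to the pages it links to. */
        for (int i = 0; i < N_PAGES; i++) {
            if (out_degree[i] == 0) continue;
            for (int j = 0; j < N_PAGES; j++)
                if (links[i][j]) next[j] += damping * rank[i] / out_degree[i];
        }

        for (int j = 0; j < N_PAGES; j++) rank[j] = next[j];
    }
}
```

A page that many others link to ends up with a high score, which is also why link farms became the obvious way to game it.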
In theory it is possible that sponsored content will creep in, but that does not invalidate the incredible benefits a well trained language model will have, even despite the occasional for-profit bias.
I'm legitimately asking, who is responsible for Search at Google? Prabhakar Raghavan is SVP, Search, Assistant & Ads, and when I click under him, he has 8 product groups reporting to him, and none of those people are responsible for Search. Yossi Matias is responsible for Search Engineering.
It may at first come off as a laughable answer, but Google Search has been in a directionless spiral since Marissa Mayer left. Her Yahoo tenure was not well received, but at Google she cared about the end quality of the product. Her title was Search Products and User Experience. Notice how we have gone from User Experience to Search Engineering, forgetting about the people who actually use the product.
Google search has two huge problems: SEO and censorship. Search for anything related to products/torrents/streams/politics on Google and your results will SUCK, due to one of the two reasons stated above.
The recent Yandex hack/leak has the cynic in me connecting the dots and, seeing how Google search seems to be facing REAL threats to its dominance for the first time since its creation... maybe some guys with the deepest pockets in the world are starting to enter WAR mode.
Destroying Yandex advantages on the SEO battlefield by way of divulging their parameters to the world would be the Franz Ferdinand assassination moment of the Great Search War.
The competition for many kinds of search terms is causing a race to the bottom. E.g. tech docs, lyrics, recipes, reviews.
That’s why Kagi has a lens for “non-spammy recipe searches”: there’s just so much noise on popular, easily copyable material.
You don’t get the best site by popular vote like PageRank was known for, you get the one that generates the most ad revenue.
You figure out a way to crowdsource certain decisions and establish who you can trust. Ask them questions with right and wrong answers. You start to tackle it one product category at a time. Instead of PageRank, which was a web of "who linked to whom", you start figuring out which voters you have who consistently turn in good feedback.
This is some form of the metamoderation that Slashdot tried to implement.
If you are going to be a tastemaker, stop hiding behind "the algorithm" having some mind of its own that can't be controlled.
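To make the idea concrete, here is a rough sketch of one way it could work: score each voter by how often they get the known-answer questions right, then weight their votes on sites by that score. This is not how Slashdot's metamoderation or anything at Google works; `voter_weight`, `weighted_score`, and the +1/-1 vote encoding are all made up for illustration.

```c
#include <stddef.h>

/* Fraction of known-answer ("gold") questions a voter got right.
   Returns 0 if no gold questions were asked (hypothetical policy). */
double voter_weight(const int *answers, const int *gold, size_t n) {
    if (n == 0) return 0.0;
    size_t correct = 0;
    for (size_t i = 0; i < n; i++)
        if (answers[i] == gold[i]) correct++;
    return (double)correct / (double)n;
}

/* Score for one site: each vote is +1 (good) or -1 (spam),
   scaled by how much that voter has earned our trust. */
double weighted_score(const int *votes, const double *weights,
                      size_t n_voters) {
    double score = 0.0;
    for (size_t v = 0; v < n_voters; v++)
        score += weights[v] * (double)votes[v];
    return score;
}
```

With this, one voter who aces the gold questions outweighs several who mostly guess, so a site's score reflects the reliable voters rather than the raw count.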
#1 result is a long article with culinary history, detailed instructions, many pictures, and a credited author originally from Shanghai.
#2 result is a simple recipe listing from Buzzfeed. Written by a young white guy from Minnesota who worked as a producer. No fluff, no pictures, no backstory. Doubtful the author ever made the recipe at all. You could grab a recipe database and generate thousands of these pages.
I've been burned by #2 too many times to disregard the fluff. It shows their investment in the content.
Are you asserting that they look at their copious data and decide to make search worse because it makes them more money? Rather than figuring out a way to make search better and then further optimize their advertising income with this better product? And it seems like they've been pretty damn clever about monetizing quality over the years. It's possible that they have chosen to make search worse for profit; it wouldn't be the first time a business did something like that. But they have a pretty deep institutional fear of search losing relevance, and it's hard to see them doing that.
As a long-time user, and a user of the other guys before Google, I think Google is shockingly good at finding specific answers to specific questions that I have about all sorts of things, often with fairly deep technical context. What is definitely lacking now: in the good old days I'd enter some search terms, get pages of results, and then some time later I'd find myself enjoyably down some rabbit hole that was tangential to my search needs, on some part of the internet that I never even knew existed before. Maybe I'm too busy with work, but I used to spend a lot more time doing internet "research" to get some specific answers; that time seems to be spent much more efficiently now. I do sort of feel like I'm corralled into a smaller portion of the internet than I used to be. I don't feel like I can't find the information, but I have had a hard time re-finding some specific web page I found once way back.
Where do ChatGPT and Bard fit in this? I've played with ChatGPT and it's fun, it's neat, but I haven't been able to get it to somehow synthesize some wisdom. It's not hard to see it just mimicking things. That might be valuable. That might be fun. Using it to seed search might be an enjoyable thing. Maybe it can help extract context from people to find out the actual question they are asking to find the actual answer they seek. Now, I can absolutely see ChatGPT/Bard assisting me in wasting time going down rabbit holes; I'm just not sure it'll be as enjoyable or as magical as it used to be.
Are there some examples of shitty Google search you can bring up? I just entered "Roth contribution income limit" and without even going to another website, I got what looks like a legit answer to my question. Now I'll click through a few to make sure it's accurate and authentic; at a glance, it's coming from Schwab and it looks like a legit answer to my question. Bing comes up with the same answer; it's presented in a nice table, but it's lower on the page and below a sizable paid ad from Merrill Lynch (edge?)
Lots of trash out there but Serious Eats is good quality.
https://www.verybestbaking.com/toll-house/recipes/original-n...
https://www.allrecipes.com/recipe/10813/best-chocolate-chip-...
> The notion that more click bait improves Google's bottom line is absurd
If you don't find what you're looking for on the first try, you'll need to try again, and see more ads. What else are you going to do, go elsewhere, visit a library, ask the town elders or give up on looking for things you want to know? You don't have a choice, you know it, they know it.
I find it equally plausible that Youtube's search sucks badly because they don't care what you're looking for, they want you to watch videos that they predict will lead to the maximum time spent on the site, again so you watch more ads. What other explanation is there that the world's leading search engine has the search of one of their flagship products run at 1999 quality? Presumably they have giant teams of people working on that too?
I see two options: a) Google can't do any better than that, b) Google has a reason to keep it in the current state (I'll put "Google doesn't know because nobody at Google has used Youtube in the last 5 years" and similar options under "a").
a) sounds ridiculous, b) sounds conspiratorial. What are the other options?
And again, I'm not saying they are making search worse on purpose (no "from now on our core mission is to make search suck"). I'm saying they aren't optimizing for SERP quality. They seem to care about index size (maybe it's an internal KPI? would certainly explain their aggressive guessing at additional URLs that you might have on their page but don't link to, don't add in sitemaps etc, and their stubbornness in keeping results from the index even if they've been 301ed or 410ed ages ago (they do get downranked after a while though)), but I assume that they mostly care about paid ad clicks, and if something increases ad clicks while the result quality decreases, they'll do it.
I use BBC good food, almost always straight to the point
You mean all your Replika "friends" that spit out answers from Bard's mouth? :)
I get not everyone is a foodie that cares about the details and wants to tweak it, but I appreciate them.