zlacker

[parent] [thread] 14 comments
1. danShu+(OP)[view] [source] 2023-02-08 22:19:19
I don't understand this hype and I feel like I'm looking at different products than everyone else is. There are very few complaints I have about Google that I think this technology helps solve, and for most of my complaints, getting summaries of searches makes the situation worse, not better. To be completely clear: even if the AI was perfect, I don't know that I want even an actual human being to sit down and summarize an answer to my question rather than show me a list of search results.

The problem with search is not that our answers aren't summarized well, it's that the quality of information returned for those searches is getting increasingly worse, and we are getting increasingly worse at categorizing or filtering that information in any useful way. And LLMs pulling information in and summarizing it for me is... not helpful? It's summarizing the same garbage, except now sometimes it also summarizes it wrong.

But it's not even an issue with the quality (although the quality of information from LLMs is also pretty over-hyped I think). Conceptually, I don't know that this is a product that I would ever want. I can't think of any time where I've sat down to do a search on Google or DuckDuckGo and thought, "You know what I want? I want these results presented to me in a less structured format using natural language and with less granular knowledge about where each specific statement is coming from."

At least Bing seems to be trying to do inline citations in some of its answers, which is a step up over Google's AI announcement, I guess?

Maybe I'm just in the minority on that. Users seem to like this a lot. But my ideal version of the Internet is one that decreases the number of abstractions and layers and summaries between myself and primary data rather than increasing them. My ideal Internet is a tool that makes it easier for me to actually find things, not a tool that increases the layers between me and the raw source/information that I'm looking for. I already have enough trouble needing to double-check news summaries of debates, events, and research. Getting another summary of the summaries doesn't seem helpful to me?

I can think of some ways where I might use an LLM in search, even really exciting ways where maybe it could help with categorization or grouping, but it doesn't seem like Google/Bing are interested in pursuing any of that. I look at both the Bing and Google announcements and just think, "why are you making it worse?" But who knows, maybe the actual products will sell me on the concept more.

replies(5): >>emoden+B >>dilipp+5v >>lumb63+AF >>nether+KW >>hLineV+P71
2. emoden+B[view] [source] 2023-02-08 22:22:52
>>danShu+(OP)
You know what it reminds me a lot of? CPedia. It’s basically the same concept, though, from the sound of it, much more capably executed.

I do see LLMs as potentially more useful for “fanciful” queries, like “what can I make with kale, tomatoes, and mushrooms?”

replies(1): >>danShu+K9
3. danShu+K9[view] [source] [discussion] 2023-02-08 23:02:50
>>emoden+B
Just for clarification, do you mean the actual encyclopedia, or Wikipedia?

I think that encyclopedias are cool, but I also feel like the Internet was hopefully going to be a slightly better version of that and that it's a little frustrating to be going in the opposite direction. I'm not sure how to articulate that other than that encyclopedias are in some ways a compromise around the fact that we very often don't have good search, so we accept human beings trying to pre-aggregate data for us with the hope that they are better than Google is at aggregating that data.

And I think Wikipedia is valuable not so much because of the summaries, and more because it's obsessively curated and has (or attempts to have) a very specific, predictable set of rules that it (tries) to adhere to about sources and coverage. The text-portion of Wikipedia isn't really the part that I think is most impressive about it. If GPT-3 was being used to aggressively curate search results and remove low-quality content, then yeah, that would be potentially interesting (although I'm not sure how well it could handle that task).

----

> what can I make with kale, tomatoes, and mushrooms?

I sort of see it, it's one of the more... it's fine. It wouldn't be a strict downgrade over existing search, maybe it would save time in some situations. But if I'm being completely honest, what I want as an answer to that question isn't a paragraph of text explaining multiple recipes, it's a bullet-pointed list of recipes with links to the original sites they're listed on, so I can check to see if they're worth making.

Bing's results (to its credit) seem like they're sort of headed in that direction, which, nice. Yeah, I could use that. But a bullet-pointed list of recipes is also... search results. So how much time and effort have we put into reinventing a cleaner search interface where at best it solves the same problems that search already solved, and where we in the meanwhile haven't made any progress on the really hard fundamental problem of "how do we get a good list of recipes to display in the first place and what does 'good' mean in that context?"

----

To be less cynical, one way I could see this genuinely being useful would be completely behind the scenes in a non-user-facing role just as a way of applying "tags" to webpages or doing filtering. I would love to be able to search "is spinach poisonous for rats who are pets and just because I used the word 'poisonous' does not mean I want 2 pages of links about exterminators or getting rid of wild rats."

An LLM would be a great fit for that, because we've done such a garbage job of categorizing the web that it's very difficult to know which words to type to exclude "categories" of information from a search query. So maybe an LLM helps standardize that a bit? But then I want a normal page of search results. I don't want to have a conversation with the LLM, I want the LLM's role to be exclusively "I think this webpage should be additionally included/excluded in your query. I think this webpage is about extermination even if it doesn't use that specific word."

If Bing's service ends up being able to do that kind of thing well, then yeah, that's useful. I'm a little skeptical they can, because... gestures to the current quality of search results, but maybe their integration with GPT gives them way more capabilities and ratchets up their quality.

But similar to above, it feels a little bit like reinventing the wheel. I can refine GPT queries during a conversation, great. Could I have that feature for regular search? Why do I need to do it as part of a conversation? That seems like a good thing tied to a bad UX (although again, I might be atypical in thinking that natural language conversations are often bad UX).

And I do want to couch that by saying that maybe Bing will surprise me and it will have easy options to do that kind of thing. I'm just not currently seeing it presented in a way that looks useful.

replies(1): >>emoden+Ya
4. emoden+Ya[view] [source] [discussion] 2023-02-08 23:09:13
>>danShu+K9
CPedia was an attempted pivot by doomed search startup Cuil that would more or less build a Wikipedia-like page out of random text on the Web about whatever you searched for. This article describes it a bit. https://www.plagiarismtoday.com/2010/06/09/cpedia-a-spam-blo...
replies(1): >>danShu+ni
5. danShu+ni[view] [source] [discussion] 2023-02-08 23:42:50
>>emoden+Ya
Oh wow. Thanks for the link, first time I've heard of that.
6. dilipp+5v[view] [source] 2023-02-09 00:57:05
>>danShu+(OP)
There seem to be two different directions for innovation here.

The first is a little more mundane: LLM embeddings. OpenAI currently offers an API that turns sentences into coordinates for a point in some 1536-dimensional conceptual space such that two points are close together if they are conceptually close together. This is insanely powerful. For example, you can generate captions for a bunch of images and store the embeddings for them. Then, you can look for a "picture of a rabbit eating a carrot" by turning that phrase into a 1536-dimensional point and looking for the nearest points around it. Basically, it blows open search technology for everyone. You no longer have to deal with synonyms, idiomatic phrases that mean similar things, misspellings etc - the problems you'd run into when trying to implement simple text search using traditional techniques. It all gets simplified to generating coordinates in some hyperspace and looking for nearest neighbors. This is a total game changer.
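A minimal sketch of the nearest-neighbor lookup described above. It makes no real API call: `embed` is a hypothetical stand-in backed by hand-made 3-dimensional vectors, where an actual embedding model would return something like the 1536 floats mentioned above. Only the ranking step (cosine similarity, closest first) is the part being illustrated.

```python
import math

# Hand-made toy vectors, purely illustrative. A real embedding model would
# produce these from the text itself, with far more dimensions.
TOY_EMBEDDINGS = {
    "a bunny nibbling a carrot": [0.9, 0.8, 0.1],
    "a rabbit sleeping in hay": [0.9, 0.1, 0.1],
    "a city skyline at night": [0.0, 0.1, 0.9],
    "picture of a rabbit eating a carrot": [0.95, 0.75, 0.05],
}


def embed(text):
    """Hypothetical stand-in for a call to an embedding service."""
    return TOY_EMBEDDINGS[text]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, captions, k=1):
    """Return the k captions whose embeddings are closest to the query's."""
    query_vec = embed(query)
    ranked = sorted(
        captions,
        key=lambda caption: cosine_similarity(query_vec, embed(caption)),
        reverse=True,
    )
    return ranked[:k]


captions = [
    "a city skyline at night",
    "a rabbit sleeping in hay",
    "a bunny nibbling a carrot",
]
print(nearest("picture of a rabbit eating a carrot", captions))
```

Note that the best match shares almost no words with the query ("bunny" vs "rabbit", "nibbling" vs "eating"); the match comes entirely from the vectors being close. With a large collection you would swap the linear scan for an approximate nearest-neighbor index, but the idea is the same.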

The second direction is ChatGPT. Sure, if you want to read a detailed analysis of the demographic situation in China, you'd prefer an article written by an expert. You would still use a search engine, pick a search result and do things the way you do them today. However, there's an entire collection of things that can be answered directly by ChatGPT. For example "how many mins should I hard boil an egg" or "Can I take NyQuil when I'm stoned" or anything else where you really just want a single sentence answer. Today, you launch a browser, search for what you want, skip past the first 10 advertisements, look for a site that seems reasonably reputable, click through all the GDPR warnings, scroll past the banner ads and the SEO optimizing bullshit text to find that one sentence that you wanted all along. Or, you could ask ChatGPT and get an answer instantly. (assuming chatGPT is good enough eventually).

It's hard to predict which of these two technologies will disrupt the current status quo in search. Neither might. But we haven't ever been this close to a level playing field in search since the 1990s. The excitement is hard to resist.

replies(2): >>karpie+YQ >>danShu+pR
7. lumb63+AF[view] [source] 2023-02-09 02:08:43
>>danShu+(OP)
I shared this sentiment in one of my comments recently: “certainly I can’t be the only one who wants their search engine to search, right?”
8. karpie+YQ[view] [source] [discussion] 2023-02-09 03:54:33
>>dilipp+5v
> The first is a little more mundane: LLM embeddings.

You know Google has been doing this for years now?

> However, there's an entire collection of things that can be answered directly by ChatGPT. For example "how many mins should I hard boil an egg" or "Can I take NyQuil when I'm stoned" or anything else where you really just want a single sentence answer.

Google has been doing this for years via search cards, which are AI generated summaries of website information.

replies(1): >>dilipp+d21
9. danShu+pR[view] [source] [discussion] 2023-02-09 04:00:43
>>dilipp+5v
So LLM embeddings are an actual useful thing that could actually improve search. Categorization is a real problem that AI could help solve (particularly with search queries). But that's not a new category of search, it's just a question of whether the current LLMs would be better than whatever Google is currently using to make the same inferences. And the direction Bard and Bing seem to be going with these giant models is the conversational direction, and where that's concerned:

> there's an entire collection of things that can be answered directly by ChatGPT. [...] assuming chatGPT is good enough eventually

I am a lot less impressed with this. And I know I'm an outlier and plenty of people are shocked at how good GPT is at this kind of problem, so I am constantly second-guessing myself and thinking to myself, "are we using the same product?" Because I think ChatGPT produces really bad quality information. It's cool, it's wildly impressive, it's a massive achievement and an incredible milestone for AI, but 'cool' is different from 'useful.'

Leaving aside the problem that answering simple questions is a very small subset of what search is used for, and isn't on its own probably a big enough category of questions to make me change search engines, the bigger problem is that the current state of ChatGPT seems to be wildly inconsistent about what it knows and what it doesn't know and I don't have a way to pre-predict what categories of information it's safe to ask about. And the only way for me to verify the answers it gives me is to... double check its work with a real search.

I would not advise anyone to ask ChatGPT for advice about what drugs are safe to take while high, that seems profoundly unwise to me.

So it's a bit like Instant Answers. Google has been trying to auto-answer questions for ages, and in practice the only time it's ever been useful for me is when it's extremely predictable and when I know that a category of question will only ever have its answer pulled from one site and where I know what the format of that answer will be.

Unpredictability is generally a quality that I try to avoid any time that I am using a computer. One of the primary strengths of a computer to me is specificity and predictability. And so the bar here is really high. The question I ask myself is, "would I want to replace a search engine with a human assistant?" And I think the answer is no, I feel like that would be missing the point of what a search engine is. And ChatGPT gives worse answers than a human assistant would, and its sources/knowledge is just as unpredictable as a human's would be if not worse. So, I also don't want to replace my search engine with ChatGPT.

It could get more accurate in the future, and if it does then maybe my opinion will change then, but... it's hard for me to get excited about using a worse product today on the promise that it might get better in the future. And I guess it's accurate enough that a bunch of people keep telling me that they're saving time when they use it, so maybe I don't understand what I'm talking about. But I just don't see how people are reaching that conclusion unless they're either asking questions where they don't actually care about the accuracy or unless they're just rolling the dice and trusting that ChatGPT won't accidentally poison them when they ask what drug combinations they can take.

replies(1): >>dilipp+d61
10. nether+KW[view] [source] 2023-02-09 04:54:22
>>danShu+(OP)
From my point of view the only use for an LLM is generation, such as writing peer reviews, self reviews, and OKRs, when I already know the truth and can edit out any errors. I will never trust an LLM's answer to something I don't already know.
replies(1): >>hLineV+Gb1
11. dilipp+d21[view] [source] [discussion] 2023-02-09 05:58:04
>>karpie+YQ
> You know Google has been doing this for years now?

Of course.

What’s changed is that before now, they were the only ones who could do it, vs now, everyone can do it. So this technology only got deployed where someone could get a promo out of it whereas now, every to-do list app, dating app, and even a reasonably sophisticated Nigerian prince can find a place to deploy it.

Think of what gcc did to software tool chains, Apache to servers, Linux to operating systems and to a lesser degree, blockchains to distributed databases.

> Google has been doing this for years via search cards, which are AI generated summaries of website information.

Yes. But now, someone else can do it too. And if that someone else does a good job, Google just lost the opportunity to show advertisements to all of those “searches”.

Doesn’t mean that anyone is about to beat Google in terms of sheer talent and experience with this stuff. But a very hungry and determined community of entrepreneurs just got hands on something that’s about as good as Google’s secret sauce and they’re about to run wild.

12. dilipp+d61[view] [source] [discussion] 2023-02-09 06:44:33
>>danShu+pR
There are three groups of people here.

First, you have people who don’t have any idea about how any of this works and are generally far removed from the tech communities. They did not see this coming. To this community, ChatGPT is a fascinating toy. It’s not perfect, but at this point everyone is conditioned to believe that things will somehow get better. This group is excited.

Second, you have the tech community, which is skeptical. This group of people sees everything that’s wrong with ChatGPT and sees the magnitude of work needed for anything to even start approaching Google as a credible threat. This group is generally confused by the excitement going around because it doesn’t seem warranted, and worse, the excitement is being seen in people who should know better. There’s a range of responses from being dismissive to feeling like they’re being gaslighted.

Third, you have the people who are filling up the YC summer 23 applications. They’re all looking for big unsaturated markets to build a pitch deck around. ChatGPT looks like a very promising sign post that says “look for ideas here” to this group. They are excited. Most of them will fail. But if anyone survives and manages to thrive, where will they be 10 years from now? How about FDA-approved chatbots integrated with a blood pressure monitor and thermometer that can take a first pass at routine prescription refills at 1/100th the cost of an equivalent doctor’s appointment? How about live translation of television events synthesized back to the original speaker’s voice? How about video game engines that can synthesize music loops dynamically to keep up with the gaming pace of that particular gaming session?

Sure you might say, but none of that “dethrones google”. My response to that is - what role does a text based internet play in daily life 10 years from now? Everything on the internet today went something along this path: primary research -> classrooms -> textbooks -> niche blogs/forums -> mainstream websites.

10 years from now, would you bet against primary research -> ChatGPT ingress -> widely deployed ChatGPT model? What role do ad driven websites play in this chain? What role does a search engine play in this chain?

Sure, it doesn’t make the internet nor search engines obsolete. But it changes how we do things. Potentially in a very big way.

replies(1): >>danShu+wy2
13. hLineV+P71[view] [source] 2023-02-09 07:00:46
>>danShu+(OP)
Fully agree, but the problem is not really specific to AI. Most search results today, whether written by humans or machines, tend to be mindless, inaccurate, and unhelpful n-th hand summaries produced with very little effort.

And this doesn't seem unreasonable, given that you get everything for free or at a very low cost. When you pay for high quality books or periodicals, you get in return much better sourced information written by people who know a lot more about the subject they're writing on than the average journalist or AI language model.

Yes, occasionally one might find high quality contents in blogs, forums, wikis, or open-access periodicals, but far more are locked inside proprietary platforms or behind paywalls that do very little to actually compensate the authors.

Search engines and content platforms are supposed to make it easier to find what you want. But the reality is that it's a lose-lose situation for both the writer and the reader. The writer is forced to give up their contents at a very low price and overpay for ads, while the reader is left with low quality contents that aren't relevant to their needs. But neither can escape the monopoly, who alone profits at everyone else's expense.

14. hLineV+Gb1[view] [source] [discussion] 2023-02-09 07:36:56
>>nether+KW
Or creative writing, where factual inaccuracies are often a feature, not a bug.
15. danShu+wy2[view] [source] [discussion] 2023-02-09 16:07:52
>>dilipp+d61
> what role does a text based internet play in daily life 10 years from now? [...] 10 years from now, would you bet against primary research -> ChatGPT ingress -> widely deployed ChatGPT model?

I suspect I'm a little bit more skeptical about this than other people might be? 10 years out is a long way to predict and I'm hesitant to try, but I might be primed wrong looking at voice assistants/video/etc... where I think the format changes have been a little over-exaggerated in some ways in the past, and where I think people have traditionally underestimated how much staying power traditional models have. There were a lot of things that were supposed to kill a traditional text-based Internet, but the only thing that's come close is video, and that seems to get a lot of blowback; I'm not sure it was an improvement.

But regardless, your comment is an insightful perspective that gives me some alternative ways of looking at this. So I think I agree. I might be just more on the skeptical side of things of how well-suited the current tools are for building the kinds of applications you're describing, but I get the theory.

And generative AI that's separate from AI answering questions is kind of another story; I can absolutely imagine potential creative impacts around stuff like music loop generation, art generation, etc... I similarly am looking at the current state of things and saying, "well... the tech doesn't seem to be as good as people say it is, so I don't want to start using it now before it improves", but I can at least imagine how things might change if the tech does get good. I'm not sure it would be a "revolution" or that it's going to put artists out of business or whatever, but it could potentially lower the barrier of entry for certain applications. And certainly LLMs for general website classification I think would be a really good use that I wish was being pursued more.

At the least though, even if I'm skeptical, I can definitely understand why someone would have that perspective, and your summation of the different groups rings true to me.

I still don't think I want it in my current search box today though :) I don't think that LLMs are useless at all, I'm just not sure searching in specific is a good use for them.
