zlacker

[parent] [thread] 11 comments
1. rxhern+(OP)[view] [source] 2023-02-09 12:35:15
I really don't understand how engineers are having good experiences with it; a lot of the stuff I've seen it output w.r.t. software engineering is only correct if you're very generous with your interpretation of it (i.e. dangerous if you use it as anything more than a casual glance at the tech). As for anything else it outputs, it's either so generic that I could do it better, outright wrong (e.g. it cannot handle something as simple as tic-tac-toe), or it functions as an unreliable source (in cases where I simply don't have the background to judge).

I wish I could derive as much utility from it as everyone else who's praising it. I mean, it's great fun, but it doesn't wow me in the slightest when it comes to augmenting anything beyond my own amusement.

replies(3): >>bsaul+O >>Engine+94 >>AnIdio+e4
2. bsaul+O[view] [source] 2023-02-09 12:40:45
>>rxhern+(OP)
The fact that I can use this tool as a source of inspiration, or as a first opinion on any kind of problem on earth, is totally incredible. Now, whenever I'm stuck on a problem, ChatGPT has become an option.

And this is happening in the artistic world as well with the other branch of NNs: "mood boards" can now be generated from prompts endlessly.

I don't understand how some engineers still fail to see that a threshold has been passed.

replies(1): >>rxhern+u2
3. rxhern+u2[view] [source] [discussion] 2023-02-09 12:52:07
>>bsaul+O
I've literally asked it to generate stories from prompts, and it has, without fail, produced the most generic stories I have ever read. High-school me could have done better with little to no effort (and I don't say that lightly), and I'm not a good writer by any means.

Moreover, its first opinion on the things I'm good at has been a special kind of awful. It generates sentences that are true on their face but, as a complete idea, are outright wrong. I mean, you're effectively gaslighting yourself by learning these half-truths. And as someone with unfortunately lengthy experience of being gaslit as a kid, I can tell you that depending on how much you learn from it, you could end up spending 3x as much time learning what you originally sought to learn (and that's if you're lucky and the only three things you need to do are learn it very poorly, unlearn it, and relearn it the right way).

replies(1): >>bsaul+6j
4. Engine+94[view] [source] 2023-02-09 13:01:13
>>rxhern+(OP)
I'm a civil engineer with a modest background that includes some work in AI. I'm pretty impressed with it. It's about as good as, or better than, an average new intern, and it's nearly instant.

I think a big part of my success with it is that I'm used to providing good specifications for tasks. This is, apparently, non-trivial for people, to the point where it drives the existence of many middle-management or high-level engineering roles whose primary job is translating between business people, clients, and the technical staff.

I thought of a basic chess position with a mate in 1 and described it to ChatGPT, and it correctly found the mate. I don't expect much chess skill from it, but by god, it has learned a LOT about chess for an AI that was never explicitly trained on chess itself, with positions as input and moves as output.
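
For reference, "positions as input and moves as output" is what the explicit approach looks like. A quick sketch with the python-chess library (the position here is an illustrative back-rank mate, not the one I gave ChatGPT):

    import chess  # pip install python-chess

    # Illustrative back-rank position: white to move and mate in 1.
    board = chess.Board("6k1/5ppp/8/8/8/8/8/4R1K1 w - - 0 1")

    # Brute force: try every legal move and test for checkmate.
    # (Materialize the list first, since we mutate the board while looping.)
    for move in list(board.legal_moves):
        board.push(move)
        if board.is_checkmate():
            print("Mate in 1:", move.uci())  # expected: e1e8
        board.pop()

ChatGPT does none of this explicitly, which is why finding the mate from a plain-language description surprised me.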

I asked it to write a brief summary of the area, climate, geology, and geography of a location where I'm doing a project, for an engineering report. These are trivial but fairly tedious to write, and new interns are very marginal at this task without a template to go off of. I have to look up at least 2 or 3 different maps, annual rainfall averages over the last 30 years, general effects of the geography on the climate, the average and range of elevations, names of all the jurisdictions and other entities, population estimates, zoning and land-use stats, etc.

And it instantly produced 3 or 4 paragraphs with well-worded and correct descriptions. I had already done this task a few months earlier, and its output was eerily similar to what I'd written. The downside is that it can't (or rather won't) give me a confidence value for each figure or phrase it produces. So, given that it's prone to hallucinations, I'd presumably still have to go pull all the same information anyway to double-check. But nevertheless, I was pretty impressed. It's also frankly probably better than I am at bringing in all that information and figuring out how to phrase it (and certainly MUCH more time-efficient).

I think it's evident that the intelligence of these systems is evolving very rapidly. The difference between GPT-2 and GPT-3 is substantial. With the current level of interest and investment, I think we're going to see continued rapid development here, at least for the near future.

replies(1): >>rxhern+M5
5. AnIdio+e4[view] [source] 2023-02-09 13:01:28
>>rxhern+(OP)
I agree. Even understanding its limitations as essentially a really good bullshit generator, I have yet to find a good use for it in my life. I've tried using it for brainstorming on creative activities and it consistently disappoints; it frequently spouts utter nonsense when asked to explain something; the code it produces is questionable at best; and it's even a very boring conversation partner.
6. rxhern+M5[view] [source] [discussion] 2023-02-09 13:11:04
>>Engine+94
I can't speak to the rest of what you wrote, because I couldn't be further from the field of civil engineering, but if you're impressed by it at chess, ask it to play a game of tic-tac-toe. For me, it didn't seem to understand the very simple rules or even keep track of my position on the grid.

There are so few permutations in tic-tac-toe that its lack of memory and its inability to understand extremely simple rules make it difficult for me to have confidence in anything it says. I mean, I barely had any confidence left before I ran that "experiment", but that was the final nail in the coffin for me.
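
To be concrete about how small the game is: there are at most 3^9 = 19,683 raw grids (far fewer legal positions), and the full win condition fits in a few lines. A quick Python sketch (the board encoding is my own choice):

    # The entire win condition of tic-tac-toe: 8 fixed lines on a 3x3 board.
    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

    def winner(board):
        """board is a 9-char string of 'X', 'O' or '.'; returns 'X', 'O' or None."""
        for a, b, c in LINES:
            if board[a] != "." and board[a] == board[b] == board[c]:
                return board[a]
        return None

    print(winner("XXX" "OO." "..."))  # -> X

And it still couldn't keep track of my moves on that grid.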

replies(2): >>Sunhol+ib >>billyt+2e
7. Sunhol+ib[view] [source] [discussion] 2023-02-09 13:44:01
>>rxhern+M5
This is like complaining that your computer can't toast bread. It's a language model based on multicharacter tokens; outputting grids of single characters is not something you would expect it to succeed at.

If you explained the rules carefully and asked it to respond in paragraphs rather than a grid, it might be able to do it. I can't test that right now since the service is down.
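
You can see the token issue directly with OpenAI's tiktoken library, which runs locally. Something like this (the exact splits depend on the encoding, so run it yourself to see):

    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("gpt2")

    # A tic-tac-toe row isn't seen character by character; it gets
    # chopped into multi-character tokens.
    row = " X | O | X "
    print([enc.decode([t]) for t in enc.encode(row)])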

replies(1): >>rxhern+ze
8. billyt+2e[view] [source] [discussion] 2023-02-09 13:57:15
>>rxhern+M5
Let's talk about what ChatGPT (or fine-tuned GPT-3) actually is and what it is not. It is a zero-shot or few-shot model that is pretty good at a variety of NLP tasks. Playing tic-tac-toe or chess is not a traditional NLP task, so you shouldn't expect it to be good at that. But board games can be played entirely in a text format, so it's also not unexpected that it can kinda play a board game.

If GPT-3 were listed on Hugging Face, its main category would be text completion. Completion models tend to be good at generative NLP tasks, like creating a Shakespearean sonnet about French fries. But they tend not to be as good at similarity tasks, as used by semantic search engines, as models specifically trained for those tasks.
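
You can see that split in OpenAI's own API: generation and similarity go through different endpoints and different models. Roughly, with the current openai Python package (model names as of today; check the docs before relying on this):

    import openai  # pip install openai

    openai.api_key = "sk-..."  # your API key

    # Generative task: a completion model.
    completion = openai.Completion.create(
        model="text-davinci-003",
        prompt="Write a Shakespearean sonnet about French fries:",
        max_tokens=200,
    )
    print(completion.choices[0].text)

    # Similarity task: an embedding model plus cosine similarity.
    emb = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=["french fries", "pommes frites"],
    )
    a, b = (item["embedding"] for item in emb["data"])
    print(sum(x * y for x, y in zip(a, b)))  # ada-002 vectors are unit
                                             # length, so dot = cosine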

replies(1): >>rxhern+ng
9. rxhern+ze[view] [source] [discussion] 2023-02-09 13:59:07
>>Sunhol+ib
You're acting like it's a grid of arbitrary size with an arbitrary number of characters. It's a 3x3 grid with 2 choices for each square.

Neglecting that (only because it's harder for me to judge whether I should expect it to handle state for an extremely finite space, even if it's in a different representation than the one it's used to), I know I saw a post where it failed at rock, paper, scissors. Just found it:

https://www.reddit.com/r/OpenAI/comments/zjld09/chat_gpt_isn...

10. rxhern+ng[view] [source] [discussion] 2023-02-09 14:07:39
>>billyt+2e
That's a core problem with this. If people with expertise can't even tell us the clear boundaries of its truthfulness, how is anyone else going to come to rely on it for that purpose? I mean, you could say you've defined a fuzzy boundary and that I shouldn't approach that boundary from the wrong direction (i.e. text games that use different tokens than the ones it was trained on), but how will I know when I'm too close to this boundary while approaching from the direction of things it's supposed to be good at?

It can't play tic-tac-toe, fine. But I know it gets concepts wrong on things I'm good at. I've seen it generate a lot of sentences that are correct on their own, but when you combine them to form a bigger picture, they paint something fundamentally different from what's going on.

Moreover, I've had terrible results using it to generate creative writing, to the extent that it's on par with a lazy secondary-school student who only knows a rudimentary outline of what they're writing about. For example, I asked it to generate a debate between Chomsky and Trump, and it gave me a basic debate format around a vague outline of their beliefs, in which they argue respectfully and blandly (neither of which Trump is known for).

It's entirely possible I haven't exercised it enough, and that it requires more than the hours I've put into it, or that it just doesn't work for anything I find interesting.

replies(1): >>billyt+0d1
11. bsaul+6j[view] [source] [discussion] 2023-02-09 14:19:16
>>rxhern+u2
My experience was more like 50% bullshit, 50% fact. I did, however, explicitly forbid members of my team at work from using its code answers for subjects they weren't already experts in.

However, I'm not advocating using its answers directly; rather, treating them as a source of inspiration.

By now, everybody is aware of the problem of ChatGPT not "knowing" the difference between fact and opinion. That does, however, seem like a less difficult feature to add than what they've already built (and MS already claims its own version is able to correctly provide sources). Time will tell if I'm wrong...

12. billyt+0d1[view] [source] [discussion] 2023-02-09 17:32:54
>>rxhern+ng
I agree that the state of the art isn't ready yet for general consumption. I think GPT-3 etc. are good enough to help with a wide range of tasks with guardrails. To be clear, I don't mean guardrails around racist language and so on, which is a separate topic; rather, guardrails around when to use the results, given the models' limitations and accuracy.

For example, let's say you have a website that sells clothes and you want to make the site's search engine better. Let's also say that a lot of work has been done to make the top 100 queries return relevant results, but the effort required to get the same relevance for the long tail of unique queries (think misspellings and unusual keywords) doesn't make sense. However, you still want to provide a good search experience, so you can turn to ML for that. Even if the model only has 60% accuracy, that's still a lot better than 0% accuracy. So applying ML to queries outside the top 100 should improve the overall search experience.
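
As a sketch of that guardrail (the names here are made up, and ml_model is a stand-in for whatever ranking model you'd actually use):

    # Guardrail: hand-tuned results for the head, ML only for the long tail.
    CURATED = {
        "jeans": ["slim-jeans-01", "relaxed-jeans-02"],
        # ...the rest of the hand-tuned top-100 queries...
    }

    def search(query, ml_model):
        normalized = query.strip().lower()
        if normalized in CURATED:
            return CURATED[normalized]    # known-good relevance
        # Long tail: even ~60% accuracy beats returning nothing useful.
        return ml_model.rank(normalized)

The point is that the model's error rate only applies to the queries you were already serving badly.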

ChatGPT/GPT-3 has increased the number of areas where ML can be used, but it still has plenty of limitations.
