zlacker

[parent] [thread] 22 comments
1. bongod+(OP)[view] [source] 2023-11-21 01:10:18
I see lots of people trying to prompt with incomplete sentences, not capitalizing, using slang, bad grammar, imprecise terminology, etc. And it still works. However, I find that you get a noticeable quality boost if you use proper English and treat it more like a human.

"i want a python app that calculates a roadtrip for me"

vs

"Please write me a Python program using a map API that measures the distance between two locations as a car would drive. Think carefully about the program architecture and be sure to use a human readable Pythonic style. Please show me the complete program in it's entirety."

The former gave me a high-level overview with a ton of explanation and didn't write any code. You can try to walk it through the process of all the steps it needs, but it will write "confused", albeit working, code after a few prompts. The latter just wrote working code on the first response. Moving forward, the context is just so much more concise and correct that everything after it will be of much higher quality.
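
Reconstructed from memory, the code it wrote was roughly this shape (treat this as a sketch rather than the verbatim output; the Google Maps Distance Matrix API and the environment variable for the key are my stand-ins):

    import os
    import requests

    API_URL = "https://maps.googleapis.com/maps/api/distancematrix/json"

    def driving_distance(origin: str, destination: str) -> tuple[str, str]:
        """Return (distance, duration) text for driving between two places."""
        params = {
            "origins": origin,
            "destinations": destination,
            "mode": "driving",
            "key": os.environ["GOOGLE_MAPS_API_KEY"],  # stand-in for however you store the key
        }
        response = requests.get(API_URL, params=params, timeout=10)
        response.raise_for_status()
        element = response.json()["rows"][0]["elements"][0]
        if element["status"] != "OK":
            raise ValueError(f"No route found: {element['status']}")
        return element["distance"]["text"], element["duration"]["text"]

    if __name__ == "__main__":
        distance, duration = driving_distance("Denver, CO", "Moab, UT")
        print(f"Driving distance: {distance}, estimated time: {duration}")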

I rarely go past 5-10 responses due to what I'd call "context poisoning". If it makes a simple syntax error or something small, I'll shoot it the error and let it correct itself. But as soon as it invents a function or otherwise hallucinates, the code gets copy-pasted into a new prompt saying "here's some bad code, fix this", and it is far more likely to come up with an elegant solution rather than rewriting everything or making huge changes to solve a one-off error or something its previous context was preventing it from grasping.

What you're saying is almost the meta of using good grammar and context, and I completely agree.

replies(8): >>galaxy+qd >>lupire+4j >>CtrlAl+5x >>kromem+fE >>averev+4U >>miven+601 >>ameliu+Q61 >>jiggaw+Z91
2. galaxy+qd[view] [source] 2023-11-21 02:34:19
>>bongod+(OP)
If your prompt input is high quality, it is more likely to match high-quality training inputs.
replies(1): >>eurode+jN
3. lupire+4j[view] [source] 2023-11-21 03:13:34
>>bongod+(OP)
Your example confounds many variables.
replies(1): >>bongod+fL
4. CtrlAl+5x[view] [source] 2023-11-21 04:56:44
>>bongod+(OP)
Using a common search engine for "python app calculate roadtrip"

is way faster, free, doesn't require a phone number or login, and gives much better results.

replies(2): >>cosmoj+uB >>jibal+WM
5. cosmoj+uB[view] [source] [discussion] 2023-11-21 05:28:42
>>CtrlAl+5x
Not nearly as quickly or directly, though. LLMs augmented by search engines (or vice versa) seem to be an obvious and permanent innovation, especially for the general public who are notoriously awful at personally generating optimal keywords for a desired search query.
replies(1): >>Roark6+qK
6. kromem+fE[view] [source] 2023-11-21 05:55:21
>>bongod+(OP)
A recent paper along these lines that you might be interested in is Large Language Models Understand and Can be Enhanced by Emotional Stimuli: https://arxiv.org/abs/2307.11760

It makes complete sense and has been a part of my own usage for well over a year now, but it's been cool seeing it demonstrated in research across multiple models.

replies(1): >>bongod+n41
7. Roark6+qK[view] [source] [discussion] 2023-11-21 06:56:29
>>cosmoj+uB
I'm not convinced. On the few occasions where an AI chatbot went out, did a Google search, and responded with results, the quality of the answer was always much worse than if it had just replied from its training data. This of course excludes things that happened after the training data ends.

For example, ask ChatGPT to write a Python script that does anything with AWS Inspector 2. It will do very badly, it will hallucinate, etc., even with Internet access. Ask about doing the same with some other API that was well represented in the training set and it's great.

This is why I think predicting the death of sites like Stack Overflow is very premature. What happens 10 years down the line once everything ChatGPT knows is old tech? It can't simply be trained with more recent data, because unless Stack Overflow regains its popularity there will be very little training data. Of course various data generation techniques will be invented and tried, but none will match the gold standard of human-generated data.

Unfortunately I have to predict inevitable enshittification of general purpose chat bots.

replies(1): >>dwattt+WX
8. bongod+fL[view] [source] [discussion] 2023-11-21 07:03:50
>>lupire+4j
You're definitely right. I'm painting with very broad strokes to make a point of what I've been seeing.
9. jibal+WM[view] [source] [discussion] 2023-11-21 07:21:19
>>CtrlAl+5x
Utterly false. A google search for that phrase yields "It looks like there aren't many great matches for your search". And no search engine will yield the code for such an app unless the engine is LLM-based.
replies(2): >>CtrlAl+1P1 >>jibal+on4
10. eurode+jN[view] [source] [discussion] 2023-11-21 07:25:47
>>galaxy+qd
This seems intuitively true, but has it been established?
replies(1): >>galaxy+ek2
11. averev+4U[view] [source] 2023-11-21 08:21:27
>>bongod+(OP)
Smallish models (7B) require a somewhat simplified grammar, though. Especially with longer, complex instructions, I've found more luck joining all the conditions with "and", and joining everything that's a follow-up and needs to happen in order with "then", instead of using more natural sentences.

So instead of "write a short story of a person that's satisfied at work", something along the lines of "write a short story and the protagonist must be a person and the protagonist must be happy at work" boosts comprehension, especially as the condition list becomes longer.
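
Mechanically it's an easy rewrite to script; a toy sketch of the idea:

    def to_simple_grammar(task, conditions, ordered_steps=None):
        """Join conditions with 'and' and ordered follow-ups with 'then'."""
        prompt = " and ".join([task] + conditions)
        if ordered_steps:
            prompt += " then " + " then ".join(ordered_steps)
        return prompt

    print(to_simple_grammar(
        "write a short story",
        ["the protagonist must be a person",
         "the protagonist must be happy at work"],
    ))
    # -> write a short story and the protagonist must be a person
    #    and the protagonist must be happy at work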

12. dwattt+WX[view] [source] [discussion] 2023-11-21 08:54:44
>>Roark6+qK
https://www.inf.ufpr.br/renato/profession.html
13. miven+601[view] [source] 2023-11-21 09:13:30
>>bongod+(OP)
Are there any risks I'm missing in asking a model (in a separate context, so as not to muddy the waters) to rewrite the informal prompt into something more proper and then using that as the prompt?

Seems like a pretty simple task for an LLM as long as the initial prompt isn't too ambiguous. If it really does help with recall, it could be interesting to have this as an optional preprocessing layer in chat clients and such.
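
Something like this minimal sketch is what I have in mind (OpenAI Python client; the model name and the rewrite instruction are placeholders):

    from openai import OpenAI

    client = OpenAI()

    REWRITE_INSTRUCTION = (
        "Rewrite the following prompt in clear, grammatical English. "
        "Preserve every requirement. Do not answer the prompt itself."
    )

    def refine_prompt(informal_prompt: str) -> str:
        """Rewrite an informal prompt in a separate, clean context."""
        response = client.chat.completions.create(
            model="gpt-4",  # placeholder model name
            messages=[
                {"role": "system", "content": REWRITE_INSTRUCTION},
                {"role": "user", "content": informal_prompt},
            ],
        )
        return response.choices[0].message.content

    # The refined prompt then starts a fresh conversation of its own.
    refined = refine_prompt("i want a python app that calculates a roadtrip for me")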

replies(3): >>bongod+ck1 >>geoduc+Cz1 >>kromem+hx4
14. bongod+n41[view] [source] [discussion] 2023-11-21 09:46:28
>>kromem+fE
This is wonderful, thank you.
15. ameliu+Q61[view] [source] 2023-11-21 10:07:07
>>bongod+(OP)
How often does:

"Please write me ..."

occur in training data? And why does it still work?

16. jiggaw+Z91[view] [source] 2023-11-21 10:37:05
>>bongod+(OP)
When experimenting with the early models that were set up for "text completion" instead of question-answer chat, I noticed that I could get them to generate vastly better code by having the LLM complete a high-quality "doc comment" style preamble instead of a one-line comment.

I also noticed that if I wrote comments in "my style", then it would complete the code in my style also, which I found both hilarious and mildly disturbing.
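
To illustrate (a made-up example, not one of my actual prompts), instead of a one-line comment I'd feed the model a preamble like this and let it complete the body:

    def driving_distance(origin: str, destination: str) -> float:
        """Return the driving distance in kilometers between two places.

        Queries a routing API for the road distance a car would actually
        drive, rather than the straight-line (great-circle) distance.
        Raises ValueError if no route exists between the two locations.
        """
        # ...the completion model takes over from here...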

replies(1): >>kromem+Mx4
17. bongod+ck1[view] [source] [discussion] 2023-11-21 12:04:13
>>miven+601
I do this all the time. "Summarize in a YAML-like markup that retains all information." Then plug that, as-is, into something else.
18. geoduc+Cz1[view] [source] [discussion] 2023-11-21 13:42:35
>>miven+601
That is a pretty good use case. In fact, if your prompt is very long, you will need to summarize it (with an LLM).

Also, when you fine-tune an LLM, you can use an LLM to summarize or concatenate the content that you train it on (e.g., "rewrite this content in the style of a human having a conversation with a computer").

19. CtrlAl+1P1[view] [source] [discussion] 2023-11-21 14:56:30
>>jibal+WM
Are we using the same google? Did you make a typo?

"python app calculate roadtrip"

>About 6,470,000 results (0.34 seconds)

Four out of the top five results have code. The other one is a video tutorial where the app is coded live.

20. galaxy+ek2[view] [source] [discussion] 2023-11-21 16:59:22
>>eurode+jN
Not sure. But it does make sense, like you say. The output must somehow correspond to the input in a meaningful way; that is the purpose of LLMs. If you gave the LLM just one word as input, who knows what the output would be. But if you give it more meaningful information, it has more to work with to give you an answer that more precisely matches your question.
21. jibal+on4[view] [source] [discussion] 2023-11-22 03:07:17
>>jibal+WM
P.S.

https://www.google.com/search?q=%22python+app+calculate+road...

If you leave off the quotes (which were present in the comment I responded to), then of course you will get millions of irrelevant hits. Somewhere in that chaff there is some Python code that alleges to have something to do with road trips, though it's not always clear what. If I give the same prompt to ChatGPT, I get a nicely formatted box with a program that uses the Google Maps Distance Matrix API to calculate distance and duration, without a bunch of junk to wade through. (I haven't tried it, so it could be a complete hallucination.)

22. kromem+hx4[view] [source] [discussion] 2023-11-22 04:50:28
>>miven+601
Preprocessing prompts is actually a great approach.

Personally, I think that given the model loss that comes with fine-tuning, people who want the cutting-edge LLM at any cost would - instead of fine-tuning the model itself - fine-tune a preprocessing prompter that takes a chat/instruction request and converts it into a good TextCompletion prompt.

So for example, taking "write me a paragraph of marketing copy for an athletic shoe" and turning it into:

"Marketing case study: Athletic shoe The problem: The client needed a paragraph of high quality marketing copy to promote their new athletic shoe on their website. The solution: Our award winning copywriters wrote the outstanding copy reproduced below."

Followed by an extractor that reformats the completion result into an answer to the initial prompt, as well as potentially a safety filter that checks that the result isn't breaking any rules (which, as a bonus, will be much more resistant to jailbreaking attempts).
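
An untuned, template-based sketch of that pipeline (everything here is hypothetical, including the stand-in complete() call):

    def complete(prompt: str) -> str:
        """Stand-in for a raw text-completion model call."""
        return "Step into effortless speed...\n\n(rest of completion)"  # canned text so the sketch runs

    def to_completion_prompt(instruction: str, product: str) -> str:
        """Recast a chat-style instruction as a case-study completion prompt."""
        return (
            f"Marketing case study: {product}\n"
            f"The problem: The client needed {instruction}.\n"
            "The solution: Our award winning copywriters wrote the "
            "outstanding copy reproduced below.\n\n"
        )

    def extract_answer(completion: str) -> str:
        """Reformat the raw completion into an answer to the original prompt."""
        # Keep only the first paragraph; a tuned extractor (plus a safety
        # filter) would replace this naive cut.
        return completion.strip().split("\n\n")[0]

    prompt = to_completion_prompt(
        "a paragraph of high quality marketing copy to promote their "
        "new athletic shoe on their website",
        "Athletic shoe",
    )
    print(extract_answer(complete(prompt)))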

23. kromem+Mx4[view] [source] [discussion] 2023-11-22 04:54:59
>>jiggaw+Z91
The fact that 90% of the people aware of and using LLMs have yet to experience it thinking their own thoughts before they do means we're in store for a whole new slew of freak-outs as integration into real-world products expands.

It's a very weird feeling for sure. I remember when Copilot first took a comment I'd left at the end of the day for me to start my next day with, and generated exactly the thing I would have come up with 5 minutes later, in my own personal style.

It doesn't always work and it often has compile issues, but when it does align just right, it's quite amazing and unsettling at the same time.
