That's what makes it such a good giveaway. I'm happy to be told that I'm wrong, and that you do actually use the proper double long dash in your writing, but I'm guessing that you actually use the human slang for an emdash, which is visually different and easily sets your writing apart as not AI writing!
Word converts any - into an em dash based on context. Guess who’s always accused of being a bot?
The thing is, AI learned to use these things because it is good typographical style represented in its training set.
Also, phone keyboards make it easy. Just hold down the - and you can select various types.
"the formal emdash"?
> AIs are very consistent about using the proper emdash—a double long dash with no spaces around it
Setting an em-dash closed is separate from whether you are using an em-dash (and an em-dash is exactly what it says, a dash that is the width of the em-width of the font; "double long" is fine, I guess, if you consider the en-dash "single long", but not if, as you seem to be, you take the standard width as that of the ASCII hyphen-minus, which is usually considerably narrower than en width in a proportional font.)
But, yes, most people who intentionally use em-dashes are doing so because they care about detail enough that they are also going to set them closed, at least in the uses where that is standard. (There are uses where it is conventional to set them half-closed, but that's not important here.)
> whereas humans almost always tend to use a slang version - a single dash with spaces around it.
That's not an em-dash (and it's not even an approximation of one; using a hyphen-minus set open—possibly doubled—is an approximation of the typographic convention of using an en-dash set open – different style guides prefer that for certain uses for which other guides prefer an em-dash set closed.) But I disagree with your claim that "most humans" who describe themselves as using em-dashes instead are actually just approximating the use of en-dashes set open with the easier-to-type hyphen-minus.
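For reference, the three characters being argued over are distinct Unicode code points. A minimal Python sketch, using only the standard-library unicodedata module (the describe helper is a name made up for illustration):

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Escapes are used so the source does not depend on how dashes get rendered.
for ch in ["-", "\u2013", "\u2014"]:
    print(describe(ch))
# U+002D HYPHEN-MINUS
# U+2013 EN DASH
# U+2014 EM DASH
```

Whether any of them is set open or closed is a separate question from which code point is used.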
In certain places it does seem to do the substitution - Notes for example - but in comment boxes on here and (old) Reddit at least it doesn't.
Still less obvious than the emails I see sent out which contain emojis, so maybe I'm overthinking things...
https://news.ycombinator.com/threads?id=tkgally&next=3380763...
They’re simple enough key combinations (on a Mac) that I wouldn’t be surprised if I guessed them. I certainly find it confusing to imagine someone who has to write professionally or academically not working out how to type them for those purposes at least.
on Macintosh: option+shift+-
on Linux: compose - - -
We're the training data.
On Linux, I use Compose-hyphen-hyphen-hyphen.
I don't use it as often as I used to; but when I was younger, I was enough of a nerd to use it in my writing all the time. And yes, always careful to use it correctly, and not confuse it with an en-dash. Also used to write out proper balanced curly quotes on macOS, before it was done automatically in many places.
Being able to insert self-interjections and such with the correct character would undoubtedly be more widespread if it were more accessible to insert for most.
>That's not an em-dash (blahblahblah...
What, exactly, did you think "slang" in the phrase "slang version" meant?
Examples within the last week include >>44996702 , >>44989129 , >>44991769 , >>44989444 . I typed all of those.
I never use space-hyphen-space instead of an em dash. I do sometimes use TeX's " --- ".
I don't buy the pro-clanker pro-em dash movement that has come out of nowhere in the past several years.
There’s a subculture effect: this has been trivial on Apple devices for a long time—I’m pretty sure I learned the Shift-Option-hyphen shortcut in the 90s, long before iOS introduced the long-press shortcut—and that’s also been a world disproportionately popular with the kind of people who care about this kind of detail. If you spend time in communities with designers, writers, etc. your sense of what’s common is wildly off the average.
No longer. Just like you can no longer bold key phrases, you can no longer use emdashes if your writing being ID'd as "AI" is important (or not).
Any source of text with huge amounts of automated and community moderation will be better quality than, say, Twitter.
(I learned to use dashes like this from Philip Dick's writings, of all places, and it stuck. Bet nobody ever thought of looking for writing style in PKD!).
Hope AI didn't ruin this for me!
Bots that are trying to convince you they’re human..
The LLM is first trained as an extremely large Markov model predicting text scraped from the entire Internet. Ideally, such a well-trained Markov model would use em dashes approximately as frequently as they appear in real texts.
But that model is not the LLM you actually interact with. The LLM you interact with is trained by something called Reinforcement Learning from Human Feedback, which involves people reading, rating and editing its responses, biasing the outputs and giving the model a "persona".
That persona is the actual LLM you interact with. Since em dash usage was rated highly by the people providing the feedback, the persona learned to use it much more frequently.
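A toy numerical sketch of that last step, under heavy assumptions: the "base model" here is just a frequency table over three punctuation choices, the corpus counts and reward value are made up, and the reweighting rule (new probability proportional to base probability times exp(reward / beta)) is the textbook solution for KL-regularized reward maximization, not a description of what any particular lab actually runs.

```python
import math
from collections import Counter

EM_DASH = "\u2014"

def base_distribution(corpus_tokens):
    """Frequency table standing in for the pretrained model (toy assumption)."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

def reweight(base, reward, beta=1.0):
    """Tilt the base distribution toward highly rated tokens and renormalize."""
    unnorm = {tok: p * math.exp(reward.get(tok, 0.0) / beta)
              for tok, p in base.items()}
    z = sum(unnorm.values())
    return {tok: w / z for tok, w in unnorm.items()}

# Hypothetical corpus: commas dominate, em dashes are rare.
corpus = [","] * 70 + ["("] * 20 + [EM_DASH] * 10
base = base_distribution(corpus)

# Hypothetical rater preference: only the em dash gets a positive reward.
tuned = reweight(base, {EM_DASH: 1.5})

print(f"base  em dash share: {base[EM_DASH]:.3f}")
print(f"tuned em dash share: {tuned[EM_DASH]:.3f}")
```

The only point is directional: a character that is rare in the scraped text can end up much more common in the tuned output if raters happen to like responses that contain it.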
If they’re using AI to speed things up and deliver really clear and on point documents faster then great. If they can’t stand behind what they’re saying I will call them out.
I get AI written stuff from team members all the time. When it’s bad and is a waste of my time I just hit reply and say don’t do this.
But I’ve trained many people to use AI effectively and often with some help they can produce way better SOPs or client memos or whatever else.
It’s just a tool. It’s like getting mad someone used spell check. Which by the way, people used to actually argue back in the 80’s. Oh no we killed spelling bees what a lost tradition.
This conversation has been going on as long as I’ve been using tech which is about 4 decades.
But yes, it's absurd to complain about LLMs resulting in increased literacy.
Anyone who makes errors like this should not be talking.
I've found that people who say this sort of thing rarely change their beliefs, even after being given evidence that they are wrong. The fact is, as numerous people have pointed out, Word and other editors/word processors change '--' to an em-dash. And the "slang version" of an em-dash is "I went to work--but forgot to put on pants", not "I went to work - but forgot to put on pants".
BTW, "humans almost always tend to use" is very poor writing--pick one or the other between "almost always" and "tend to". It wouldn't be a bad thing if LLMs helped increase human literacy, so I don't know why people are so gung ho on identifying AI output based on utterly non-substantive markers like em-dashes. Having an LLM do homework is a bad thing, but that's not what we're talking about. And someone foolishly using the presence of em-dashes to detect LLM output will utterly fail against someone using an editor macro to replace em-dashes with the gawdawful ' - '.
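Both transformations mentioned above are a line of regex each. A minimal Python sketch, assuming a much simpler rule than Word's real autoformat logic (the function names are made up for illustration):

```python
import re

EM_DASH = "\u2014"

def autocorrect_double_hyphen(text: str) -> str:
    """Turn a '--' sitting between two word characters into an em dash."""
    return re.sub(r"(?<=\w)--(?=\w)", EM_DASH, text)

def flatten_em_dashes(text: str) -> str:
    """Replace each em dash, plus any whitespace around it, with ' - '."""
    return re.sub(r"\s*" + EM_DASH + r"\s*", " - ", text)

typed = "I went to work--but forgot to put on pants"
corrected = autocorrect_double_hyphen(typed)   # what the word processor shows
flattened = flatten_em_dashes(corrected)       # what the "macro" would emit
print(flattened)  # I went to work - but forgot to put on pants
```

So the presence or absence of the character says more about the editing pipeline than about who or what wrote the sentence.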
I'm gonna use it more thanks to this tip. Thanks!
I don't care if people or robots think I'm a robot.
I'd be suspicious of people doing their writing in Word and copying it over into random comment fields, too.
> And the "slang version" of an em-dash is "I went to work--but forgot to put on pants", not "I went to work - but forgot to put on pants".
The fun thing about slang is that different groups have different slangs! I use the latter pretty regularly, but have never done the former.
> BTW, "humans almost always tend to use" is very poor writing--pick one or the other between "almost always" and "tend to".
Nah.
> It wouldn't be a bad thing if LLMs helped increase human literacy,
Where "literacy" is defined as strictly following arbitrary rules without any concern for whether it actually helps people read it?
And, on the assumption that those rules actually are meaningful, wouldn't you rather have people learn them for themselves?
Sigh.
I agree, HN is an amazing community with brilliant people and top quality content, but it's not enough to train an LLM.
Last thing. An LLM is just a tool, it can clean up your writing the same way a photo app can enhance your pictures. It took a while for people to accept that grandma's photos looked professional because they had filters. Same will happen with text. With ChatGPT, anyone can write like a journalist. We're just not used to grandma texting like one, yet :)
That said, this feature doesn't sound like a great leap for mankind.
I’m not the person you asked, but I do.
> the proper emdash—a double long dash with no spaces around it
The spaces around it depend on style guide, it is not universal that they should not exist.
> That's because most keyboards don't have an emdash key
Nor do they have keys for proper quotes and apostrophes or interrobangs, yet it doesn’t stop people from using them. The keys don’t need to exist.
> That's what makes it such a good giveaway.
It’s not. It might be one signal but it is far from sufficient.
> I'm happy to be told that I'm wrong, and that you do actually use the proper double long dash in your writing
I do use the proper em-dash in my writing—and many other characters too—and my HN history is ample proof. I explained at length in another comment how I insert the characters, plus how simple it is if you use any Apple OS.
Both make sense, to a degree. On the one hand you can argue that the em-dash—being longer—should require an extra key, but on the other hand it has more uses so it should not have the extra key to be more accessible.
I reject everything else about that poorly reasoned "suspicious" response as well.
Once I started self-publishing in the 1990s, I disregarded her opinion.
I never use hyphens where em dashes would be correct.
I do have issues determining when a two-word phrase should or shouldn't be hyphenated. It surely doesn't help that I grew up in a bilingual English/German household, so that my first instinct is often to reject either option, and fully concatenate the two words instead.
(Whether that last comma is appropriate opens a whole other set of punctuation issues ... and yes, I do tend to deliberately misuse ellipses for effect.)
Sentences "need" very little, but without style and personality, writing becomes very boring. I suppose simplicity without any affectation works for raw communication of plain technical facts, but there's more to writing than that.
I would argue that LLMs overuse the emdash more because they overuse specific rhetorical devices, e.g. antithesis, than because they are being too correct about punctuation.
Also you can ctrl-z immediately after an autocorrect to undo it.
I do them without surrounding spaces, because that's... how you're supposed to use them, and it's also less typing.
They also used to be a really good Shibboleth to tell if someone was using a Mac—the key combo on there is easy, and also easy to remember, so Mac users were far more likely than the median to employ em-dashes. It wasn't a sure tell, but it was pretty reliable.
I would personally avoid writing that "poorly composed sentences" have an "affect"—rather than the writer having or presenting an affect, or the sentences' tone being affected—as I find an implied anthropomorphizing of "sentences" in that usage, which anthropomorphizing isn't serving enough useful purpose, to my eye, that I'd want it in my writing, but I'm not sure I'd call that an error either.
What did you mean?
> Commas and parentheses can do it all, and an excess of either is a sign of poorly edited prose.
This attitude, however, is a disease of modern English literacy.
It took centuries for the written word to acquire spaces between words, and then the US decided to jam them back together again.
Curious why folk are using two hyphens "--" instead of en-dash.
a) prose doesn't have intentions ... it should be "prose intended to"
b) "effect of", not "affect of"
> I don't see what I'd call an actual error.
That's a serious problem. It's downright weird that you thought he was actually talking about affect (the noun).
This is an old conversation ... I won't revisit it.
But it’s possible I was reading too generously and this was a botched attempt to employ “effect”, which would also fit (and better, I think).
Oh no, oh lord lmao
I meant "affect" and not "effect." You need to learn what affect means. I'm not asking you to learn about affect theory, but ffs no part of my sentence implied it meant "effect" and not "affect." Ugh. It doesn't even make sense. What would the "effect" of "poorly composed sentences" be? Only affect makes sense there.
noun
Psychology., feeling or emotion.
Psychiatry., an expressed or observed emotional response.
Restricted, flat, or blunted affect may be a symptom of mental illness, especially schizophrenia.
Obsolete., affection; passion; sensation; inclination; inward disposition or feeling.
Now let's replace that in my original phrase:
> prose intending to imitate the affect of poorly composed sentences
becomes
> prose intending to imitate the feeling or emotion of poorly composed sentences
My point was that the author is trying to convey a specific feeling by way of poorly composed sentences. Perhaps they want a colloquial feel or a ranting feel or a rambling one. An obvious example would be the massive run-on sentence in Ulysses.
Minus the fact-checking, transparency, truth and social responsibility.
Been using shift+option+hyphen to make and use em-dashes (sans spaces) since at least 2005, when I got my first publishing job and also started blogging (so writing a ton more). I also use option+hyphen (en-dash) for date and number ranges. In my experience, ChatGPT consistently adds spaces around both.
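Claims like this are easy to check against any text you have on hand. A rough Python sketch that tallies how dashes are set; the category names and patterns are my own assumptions, and half-open settings are deliberately not counted:

```python
import re

EM = "\u2014"

# Hypothetical categories; each pattern matches one dash setting.
PATTERNS = {
    "closed_em": rf"(?<=\S){EM}(?=\S)",      # word, em dash, word with no spaces
    "open_em": rf"(?<=\s){EM}(?=\s)",        # em dash with whitespace on both sides
    "spaced_hyphen": r"(?<=\s)-(?=\s)",      # lone hyphen-minus set open
    "double_hyphen": r"(?<!-)--(?!-)",       # exactly two hyphens
    "triple_hyphen": r"(?<!-)---(?!-)",      # exactly three hyphens (TeX style)
}

def tally_dash_styles(text: str) -> dict:
    """Count occurrences of each dash setting in the given text."""
    return {name: len(re.findall(pat, text)) for name, pat in PATTERNS.items()}

sample = "A\u2014b c \u2014 d e - f g--h i --- j"
print(tally_dash_styles(sample))
```

Run it over your own comment history and over a batch of model output, and you can see whether the "closed vs. spaced" split holds up for the samples you care about.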
So... that's just to say that people who are exposed to the sorts of can't-unsee-it-now typesetting OCD that LaTeX and various popular extension packages within that ecosystem expose can learn to write "--" as en-dash.
It's sort of like being unable to return to the blissful state of not being hyperaware that Arial and Helvetica are different.
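For anyone who hasn't picked up the habit: in TeX, "--" in the source is typeset as an en dash and "---" as an em dash. A minimal Python sketch of the same mapping onto Unicode characters (tex_dashes is a hypothetical helper, not part of any TeX tooling):

```python
def tex_dashes(text: str) -> str:
    """Apply TeX-style dash ligatures: '---' to em dash, '--' to en dash."""
    # Replace the longer sequence first so '---' is not consumed as '--' plus '-'.
    return text.replace("---", "\u2014").replace("--", "\u2013")

print(tex_dashes("pages 10--12 --- roughly"))
```

Note the contrast with word processors, which commonly turn the same "--" into an em dash instead.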