I don't comment much, but I have read everything that Friedrich Nietzsche wrote, and because of him, have always used em-dashes in my writing. I think I even saw some memes in circles that discuss his work when people started realizing GPT used them a lot...
Genuine question: How could you tell they were em-dashes?
Like, I could see some people noticing that the book they're reading has dashes that are a bit longer than normal, but what made you think "That must be its own thing, separate from a normal dash" as opposed to something like "In this font the dashes are very long"?
While it is generally considered a No-No to start a bar chart from a baseline that is not zero, there is no corresponding prohibition, especially among numerically sophisticated audiences, for scatter plots or line charts. In general, we want graphs to focus on the area of variation.
The real answer actually depends. In cases where you want to visually emphasize the ratio between any pair of values, you should start from zero. In cases where only the difference between any pair of values matters and the ratio is meaningless you can start at a different baseline. A surprising number of measures are interesting in their ratio though, so we generally prefer a zero-based chart.
For stock prices, starting the y axis wherever is aesthetically pleasing makes some sense because everybody will have a different non-zero cost basis for their investment, and the graphs need to be able to clearly depict fluctuations that are minor on a percentage basis. For something like the em-dash prevalence on HN, the most meaningful question is whether it has doubled, tripled, or whatever relative to the pre-LLM corpus, and that's most clearly visually depicted by starting the y axis at precisely zero.
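To put rough numbers on the baseline argument, here is a minimal sketch. It assumes the 19.3 and 32.5 figures quoted elsewhere in this thread, and a purely hypothetical truncated axis starting at 15 (the chart's real axis start isn't stated here). The function name `visual_ratio` is made up for illustration.

```python
def visual_ratio(before: float, after: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn heights of two values on an axis that starts at `baseline`.

    With baseline=0 this equals the true ratio after/before. With a
    truncated axis, the drawn heights no longer reflect that ratio.
    """
    if baseline >= before:
        raise ValueError("baseline must be below the smaller value")
    return (after - baseline) / (before - baseline)


before, after = 19.3, 32.5  # figures quoted in the thread

true_ratio = visual_ratio(before, after, baseline=0.0)    # about 1.68
drawn_ratio = visual_ratio(before, after, baseline=15.0)  # about 4.07 (hypothetical axis)

print(f"zero baseline: later value drawn {true_ratio:.2f}x as tall")
print(f"baseline 15:   later value drawn {drawn_ratio:.2f}x as tall")
```

The differences between values are drawn identically in both cases; only the apparent ratio changes, which is why the choice depends on whether the ratio or the difference is the quantity of interest.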
This is interesting. I just fixed a GitHub issue where the code did not handle the em-dash correctly. Ran some queries to check the stats there. No surprises:
https://deepspaceplace.com/emdash
Em-dashes help ideas flow better than other means. For whatever reason, it's easier for my brain to process a comment with an em-dash than one that splits the idea into separate succinct sentences.
You can do small succinct sentences, but style-wise it sucks for longer passages.
AI raised awareness of em-dashes among people who didn't/don't read much, especially the kind of long-form writing that LLMs have been trained on. Treating em-dashes as a tell of LLM output is a form of unintentional "vice signalling".
Press Ctrl+Shift+U to enter Unicode entry mode in GTK controls, then enter the code point for the em dash, 2014. That will produce '—'.
Although I still prefer the traditional ASCII double-dash -- easier to type, and less potential for character encoding issues. Also, LLMs don't seem to use it at all.
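For anyone who wants to check the code point mentioned above, a small sketch using only Python's standard `unicodedata` module. The `DASHES` table and the `describe` helper are names invented for this example; escapes are used so no literal dash characters need to be typed.

```python
import unicodedata

# Code points of the dash-like characters discussed in this thread.
DASHES = {
    "hyphen": 0x002D,
    "en dash": 0x2013,
    "em dash": 0x2014,  # the 2014 entered after Ctrl+Shift+U
}


def describe(code_point: int) -> str:
    """Return the code point in U+XXXX form plus its official Unicode name."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


for label, cp in DASHES.items():
    utf8_len = len(chr(cp).encode("utf-8"))
    print(f"{label}: {describe(cp)} ({utf8_len} byte(s) in UTF-8)")
```

The hyphen is plain ASCII and takes one byte in UTF-8, while the en and em dashes take three bytes each, which is where the encoding worries about non-ASCII dashes come from.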
I think it's both. People started writing AI comments and also started using em-dashes. However, when my former boss would write emails with AI, he would add intentional typos and remove all dashes.
For my part, editing Wikipedia raised my awareness of the different types of dashes, and when to use them appropriately. Unfortunately, my Chromebook is not so forthcoming in ease of input.
Unconsciously and consciously, yes, and this new awareness means others are now consciously avoiding them so their writing is less likely to be perceived as AI-generated junk.
In my case, yes. I have never used AI to write any prose (including HN comments), and I never will. But I certainly started using them more often since the ChatGPT era began, purely through osmosis. I'm not exactly proud of that, but there you have it.
I just learnt that the em dash on a Mac is Option+Shift+hyphen. I hadn't realized it was so difficult and inconvenient, and in the end it looks so similar to the other one: — -. Thin value. It's no surprise humans barely use them. Then why did it get picked up so much by AIs? I'd have imagined it's not in a lot of training data. Print media practices, I guess?
> and in the end it looks so similar to the other one:
Maybe if you are looking at it in a monospaced environment like the HN edit window; rendered in a proportional font, hyphens, en-dashes, and em-dashes are quite distinct from each other.
> It's no surprise humans barely use them. Then why did it get picked up so much by AIs?
It got picked up by AIs because their training corpus includes plenty of professionally published work, not just informal, off-the-cuff communication, and professionally published work uses typographic dashes (em-dashes, en-dashes, and even 2-em- and 3-em-dashes) extensively. (3-em less so in newer works, it having, e.g., dropped out of the recommendations of the Chicago Manual of Style as of 2024.)
I love em dashes. They are so much less pretentious than colons or semicolons — and they help with flow of speech. I learned that key command a couple years ago and it made me feel so smart. I’ve had my comeuppance but I’m not stopping — just a better way to write
Difficult and inconvenient compared to what, I wonder? I've always really liked the Mac OS option-key system, which I found convenient and easy to understand; I sometimes wish I could type that way in Linux instead of using compose keys.
What is it that you like about it specifically? If you’re not picky about the choice of modifier key, you can configure the so-called “level 3 shift key” and have the em dash on the hyphen key at level four (both L3 shift and L2 aka normal shift pressed). For instance, on GNOME Wayland I have “Input Source” = “English (Western European AltGr dead keys)”, “Alternate Characters Key” (GNOME lingo for the L3 shift) = “Right Alt”, so the em dash is RAlt-Shift-hyphen.
The option-key layout system was easier to memorize than the compose-key patterns, which I struggle to recall. I couldn't tell you why, I just felt like I got the hang of it easily, while using the compose key system has always been slow and clunky.
I've never heard of a "level 3 shift key"; I'll have to look that up.
It's used a lot in LaTeX and Word. It's not as rare as people make it out to be. It's just that, because we haven't had a convenient way to enter it in a browser form, some of us (younger folks!) find the em-dash weird.
If it's all comments, including flagged/dead/downvoted/etc., then it's not reflective of the actual filtering HN does.
But if it's weighting comments by their likelihood of being read -- e.g. mostly top comments on popular stories -- then I'd be a lot more curious.
I'm not surprised AI spam has increased substantially. But I'd be surprised if it's affected the comments most people actually read to anywhere close to the degree shown in this graph.
All the time. So funny, it's so automatic I genuinely didn't even realize I was using them in a comment about em dashes. My comment history has been full of them for over a decade by now... and I think you can tell which comments are from my phone vs my laptop by whether they're converted to — or not.
I'll stand firm on my belief that no one types an em or en dash. It's always an LLM. It's a pain in the ass to type on most keyboards, impossible on some, and pointless on phones.
For example, take a look at just about any stock chart (try https://www.google.com/finance/beta/quote/GOOG:NASDAQ?hl=en). There's actual money on the line, but no baseline. Why do you think that is?
> Visually, this is vastly exaggerating the variation. Actual usage did not even double.
No, it is literally showing the exact variation of interest. If you think it's exaggerating the variation, you are not reading the chart. You are glancing at the chart, ignoring what it actually says in multiple ways, and imagining it has a baseline of zero, when it clearly does not.
Read the chart. What does it actually say?
Now she's been accused of using AI for her pieces.
Oh well.
> Although I still prefer the traditional ASCII double-dash -- easier to type, and less potential for character encoding issues. Also, LLMs don't seem to use it at all.
This gets corrected to an emdash.
I get annoyed and put the double dash back in.
Sometimes swearing a little or grumbling “HEY. I typed what I typed” at it helps a little.
I don’t even know how many times in 20-30+ years I’ve checked some box in system or program preferences begging it to knock that off.
This is the real reason I already loathe and avoid the emdash (nitpicking over a personal stylistic preference I won’t relent on even if I’m wrong) but I can’t be the only one this happens to.
Getting piled on and called “AI” really doesn’t ease my distaste for it, but .. do people.. not write enough to understand that it brute forces its way into human copy as well?
and yes. phone posting on HN. will insert them. to my dismay.
The other one that ticks me off endlessly but I’ve finally said to hell with it and just let it go?
Turning " into “.
(Writer. Not a very good one and I’m not here to steer anyone to that drivel. But at least I’m a human one.)
It went from 19.3 to 32.5. It did not even double. Which means that if you see a comment with an em-dash, it's more likely to be human than LLM.
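The arithmetic behind that claim, as a back-of-the-envelope sketch. It rests on two loud assumptions that the thread does not establish: that the earlier figure reflects purely human usage, and that human usage stayed flat, so all of the increase is attributed to LLM output. The function names are invented for this example.

```python
def growth_ratio(before: float, after: float) -> float:
    """How many times larger the later rate is than the earlier one."""
    return after / before


def baseline_share(before: float, after: float) -> float:
    """Fraction of the later rate accounted for by the earlier rate.

    Assumption: the earlier rate is all-human and did not change, so
    everything above it is attributed to LLMs.
    """
    return before / after


before, after = 19.3, 32.5  # figures from the comment above

print(f"growth: {growth_ratio(before, after):.2f}x")                    # about 1.68x, short of doubling
print(f"share from the baseline: {baseline_share(before, after):.1%}")  # about 59.4%
```

Under those assumptions, a bit under 60% of em-dash comments would still come from the human baseline. The share only drops below half once the rate more than doubles.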
key insight - https://trends.google.com/explore?q=key%2520insight&date=all...
etc.
https://en.wikipedia.org/wiki/Caedite_eos._Novit_enim_Dominu....
Is HN more botted, or less? And are banned accounts excluded?
— A spooky ghost
WooooooOOOOOOOOooooooooOOOOOOOOOoooooo-
- A less spooky ghost