tkgally 7 hours ago

More than twenty years ago, I had fun tracing a similar phenomenon: English “proverbs” that appeared in English dictionaries and textbooks published in Japan but that did not seem to have any actual currency in English. It became clear that they had been copied from dictionary to dictionary for decades before large-scale corpora and search engines made it possible to check actual usage.

“Every man has his humo(u)r.”

https://www.gally.net/leavings/00/0001.html

“Losers are always in the wrong.”

https://www.gally.net/leavings/00/0098.html

In their heyday, dozens of English-Japanese dictionaries were published in Japan:

https://www.gally.net/leavings/00/0005.html

Producing an original dictionary from scratch would have been expensive and time-consuming, so most publishers borrowed liberally from each other.

  • javawizard 4 hours ago

    I remember running across a shirt for sale in Japan that said:

      Free is free
      Shit is shit
      Damn
    
    I don't know what it was about that particular sequence of words but man if it didn't get me something good.

    • tkgally 3 hours ago

      That definitely deserves proverb status!

      Around the same time I was collecting those ghost proverbs, I spent a pleasant afternoon in Shinjuku, Tokyo, taking pictures of T-shirts:

      https://www.gally.net/tshirts/index.html

muhdeeb 9 hours ago

I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists. Context provides enough information to correct the record.

I didn't catch the error the first time around because I autocorrected to Ge--there are only so many anions that can make that formula work, and staring at these formulas all day long can make you go cross-eyed anyway.

What I think is more dangerous to understanding is skipping formulas in favor of initials! BFO instead of BiFeO3, BT instead of Bi2Te3, SRO for SrRuO3, LSFO for La0.3Sr0.7FeO3: abbreviations that I think obscure too much detail. You can more easily wander into talking about different things with the same terms. Such abbreviations are already endemic in condensed matter physics.

  • h4ny 6 hours ago

    > I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists.

    People make mistakes, and you probably mean well, but this is also the sort of pass that makes scientific research and reporting terrible.

    If it's "easy enough to figure out" then it's even more important to get it right -- why should we trust someone who can't even get the "easy" things right?

    > ... and dyslexia already exists among scientists.

    The article is pointing out a problem that appears to be fairly common; is that really a suitable explanation? Even if it is a suitable explanation, is that a reason for lowering standards, which you can then apply to explain away every mistake?

    Keep in mind that proper publications should usually have been reviewed by at least 3 people including the authors (typically more) by the time everyone else gets to read it. So that kind of mistake isn't really acceptable.

    > What I think is more dangerous to understanding is skipping formulas in favor of initials! BFO instead of BiFeO3, BT instead of Bi2Te3, SRO for SrRuO3, LSFO for La0.3Sr0.7FeO3: abbreviations that I think obscure too much detail. You can more easily wander into talking about different things with the same terms. Such abbreviations are already endemic in condensed matter physics.

    If you have been trained in scientific writing, you would always introduce an abbreviation. For example, "BiFeO3 (BFO)" and "SrRuO3 (SRO)". It's also common to include a list of abbreviations in some forms of scientific writing.

  • pseudochemist 7 hours ago

    > I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists.

    I’m not. If somewhat said Pi was 9.14 I think no one would give it a pass. It’s not like a misspelling. It’s an invalid element which is the chemistry equivalent of an absurdly wrong number in maths.

    • snarkconjecture 6 hours ago

      It's more like saying pi is approximately "3..14". Easily corrected syntax errors aren't as bad as semantic errors.

      • h4ny 6 hours ago

        No. The 9.14 vs. 3.14 analogy is more suitable.

        If you have read the blog post it's a difference between the chemical symbol Ge and Gr, which as I understand is what you would refer to as a "semantic error".

        • voxic11 an hour ago

          But Gr isn't an element, so no one would ever misidentify it as part of a compound; it's obviously a mistake. Like if I said pi was 3.`4

    • handoflixue 7 hours ago

      It should be "someone", not "somewhat".

      "Pi" is only capitalized at the start of a sentence.

      "no one would give it a pass" is a logically unsound claim, given the number of people on the planet.

      How very absurdly wrong of you :)

  • kazinator 9 hours ago

    The typo is not the problem; it's that the typo is evidence of academic dishonesty.

    When you make a citation, it means you cracked open the original work, understood what it says and located a relevant passage to reference in your work.

    The authors are propagating the same typo because they are not copying the original correct text; they are just copying ready-made citations of that text which they plant into their papers to manufacture the impression that they are surveying other work in their area and taking it into account when doing their work.

    They survey one or two works, and then just steal their citations to make it look like they also surveyed 19 other works.

    Problem is, the citations in those works are already copies of borrowed citations from some other paper, which copied some of them from another paper, and that was the honest one that made a typo in a genuine, organically grown citation.

    • dataflow 7 hours ago

      Just because you propagated a typo does not mean you didn't see the original. It could just mean that you saw the typo more recently and that's what stuck in your mind as you were busy writing.

    • light_hue_1 7 hours ago

      It's not academic dishonesty.

      When you read plenty of papers you aren't going to read them again to cite them. You take them from your read.bib file.

      Also citations generally don't link to a passage. They are pointers to an entire paper.

      • kensey an hour ago

        > When you read plenty of papers you aren't going to read them again to cite them.

        But in fact I do exactly that, exactly because experience has taught me that my memory of what is in a paper is fallible and I should at least cursorily review what I'm citing. In a few cases I've even just deleted something entirely because my premise was based on a recollection of what I intended to cite that was subtly wrong enough to fatally undermine my entire thesis.

        I'm not saying you have to read an entire paper over completely every time you cite it but at least pulling it up and reviewing the parts that are informing your argument is definitely a best practice.

kazinator 10 hours ago

Researchers are blindly copying and pasting lists of citations into papers because they did original work in a vacuum, i.e. without taking the time to study anyone else's work in the same area to understand where the field is at. Since papers without citations, or with too few citations, are giant red flags for publication, they need to generate something to mask the problem.

pimlottc 14 hours ago

I would guess part of the issue is the subscripts. It’s annoying to type out formulas so it’s faster to just cut-n-paste.

teiferer 12 hours ago

If you ask ChatGPT about Cr2Gr2Te6 then it will correct you. The author's worry might be unfounded.

Though since he didn't date his article, it's unclear how long it has been out there, and therefore whether it has made its way into training data. Judging from the comments and the URL, it's quite new, but again, he should add a date to his articles.

  • jibal 11 hours ago

    When I search for Cr2Gr2Te6, Google Gemini tells me:

    "AI Overview Cr2Gr2Te6 is a miswritten, imaginary compound; the correct compound is Cr2Ge2Te6 (Chromium Germanium Telluride), where Cr stands for chromium, Ge for germanium, and Te for tellurium. This error, where 'Gr' was mistakenly used for 'Ge', has been replicated in multiple scientific publications since its discovery in 2017, despite the correct formula being known and published."

  • ddingus 11 hours ago

    The URL is formed using the date, just FYI. :)

    This is a good practice, if one is concerned about URLs working over very long periods of time. "Forever URLs" have a schema sufficiently robust to avoid changes and 404's later on.

    • jibal 11 hours ago

      > The URL is formed using the date, just FYI.

      As they stated, so who are you informing?

      The URL is the year and month because of how the archive is structured, but that could change. The article is not dated but should be--all articles should be. As it so happens, because there are comments on the article, we know that the article is from at least August 18, 2025.

      • ddingus 11 hours ago

        Apparently nobody! I misread. Good grief, subtle problems related to this overall discussion are chronic.

        • jibal 10 hours ago

          Kudos for accepting responsibility. And I wrote "at least" when it should be "at most".

          • ddingus 6 hours ago

            It is easier that way. Less to manage.

ddingus 11 hours ago

Summary: Because they are not writing!

They are copying data and placing it into documents.

Obviously, these are not the same thing.

rdtsc 14 hours ago

Gr is the science journal version of Van Halen's brown M&M rider -- it's how you can tell the reviewers and the authors had no idea what they were doing and just copy pasted junk around.

I think established authors should try to sprinkle obvious mistakes like that on purpose once in a while in the literature and then see how much it spreads.

dawnofdusk 13 hours ago

As any practicing scientist knows, even good research papers may be littered with blatant but unimportant errors. There is unfortunately no good reason or system to "correct the record", and it is not clear to me if such a thing is a good use of human resources. Nonetheless, I think correcting the record is always appreciated!

  • jessfyi 13 hours ago

    Getting a compound incorrect is not an "unimportant" error (for example the difference between sodium nitrate & sodium nitrite is small but critical) and seeing "small but blatant" errors actively propagated is the entire reason why the record should be corrected. The only upside of these little artifacts like "vegetative electron microscopy" [0] is that it's a leading indicator that the entire paper and team deserve more scrutiny--as well as any of those who cite it.

    [0] https://www.sciencealert.com/a-strange-phrase-keeps-turning-...

    • avar 11 hours ago

      I believe they meant that it's "unimportant" because (to use your example) sodium nitrate and sodium nitrite actually exist, whereas there's no element with the chemical symbol "Gr".

    • dawnofdusk 10 hours ago

      The error in the OP is a typo that could never seriously confuse anyone, as the element Gr does not exist.

      An interesting perspective is Terry Tao's on local vs. global errors (https://terrytao.wordpress.com/advice-on-writing-papers/on-l...). A typo like this, even if propagated, is a local error which at worst makes it very annoying to Ctrl-F papers or do literature review. Local errors deserve to be corrected, but in practice their importance to science as a field is small.

  • the__alchemist 13 hours ago

    That is a possible, but charitable explanation. I would like to hold your opinion, but don't know if I can. It must compete with less-charitable ones.

  • thewanderer1983 10 hours ago

    Have you heard of this thing called Peer Review? It's what academia holds up as its gold standard and it is supposed to pick up on these things.

    • crazygringo 6 hours ago

      Peer review isn't spellcheck or proofreading.

      It's about logic, methodology, significance, and citations.

      It's not some gold standard of perfection or truth.

  • jibal 11 hours ago

    That's not only quite factually wrong, but has nothing to do with the point, which is about mindless copying.

    • dawnofdusk 10 hours ago

      If it is factually wrong please tell me how.

johnea 14 hours ago

Much of the www is composed of copying.

I recently corrected an error in this Wikipedia article:

https://en.wikipedia.org/wiki/Cape_Shionomisaki

Which stated: "Geologically, the cape is a flat uplifted seafood plateau"

My comment for the change: I'm not an oceanographer, but I'm pretty sure it's not a "seafood plateau". Changed to "seabed plateau"

Afterward, out of curiosity, I did a search for "seafood plateau".

I was shocked at the number of sites that exactly copied that error along with the rest of the page. Most of these sites were clones of Wikipedia with the inclusion of ads.

It didn't seem that these sites were LLM generated (they were exact copies), but this seems to be the case for many scientific paper submissions now.

Where it all goes from here is extremely unclear, but it does seem a disruption to many fields which are dependent on written material is in progress...

  • hidroto 12 hours ago

    I would have thought it was a typo of 'seafloor' rather than 'seabed'.

  • fer 12 hours ago

    A friend did an edit (though you could call it vandalism) of a Wikipedia 20 years back. He linked from several pages to a non-existing apportionment method, and created an article with a fairer version of d'Hondt for elections, quite ingenious and probably more fair than the popular alternatives in most cases. He named it after himself (he has an unusual last name and capitalised on that).

    It didn't take long for the page to be dropped for being original research, and he didn't put it anywhere else.

    To this day, you can still find pages and people referencing the method.

    Edit: a quick check and Grok and ChatGPT have scraped it, Gemini hallucinates something unrelated.

  • Animats 13 hours ago

    "Seafood plateau"?? A bad translation of "plateau de mer", which is just a seafood platter?

    • BrandoElFollito 13 hours ago

      "Plateau de mer" is not seafood platter. Seafood platter is "plateau de fruits de mer".

      "Plateau de mer" could be "seabed plateau" but I am not an oceanographer so I do not know what words they use (but strictly from the perspective of French language it is plausible).

      • gyomu 13 hours ago

        It would be “plateau marin”, not “plateau de mer”. “Plateau de mer” does sound like a seafood restaurant special.

      • Animats 13 hours ago

        "Plateau de fruits de mer" is proper, but shortened in cooking practice.

        • BrandoElFollito 13 hours ago

          Ah, I learned something then. I found a few references in Google indeed.

    • bombela 10 hours ago

      French here, asked the frenchies around me. Nobody thinks "plateau de mer" is an obvious shorthand for "plateau de fruit de mer". We have never heard that one. And we sure eat seafood platters on the regular.

  • jibal 12 hours ago

    Of course much of the web is composed of copying, and of course copies of Wikipedia are copied--that's hardly relevant. But science journals are another matter. From the article: "shouldn't the peer reviewers and proofreaders at a top journal catch this error?"

ElijahLynn 14 hours ago

Thank you for your effort in correcting this, it takes time and effort, appreciate it!

halo 10 hours ago

I’m beginning to think my reluctance to shamelessly copy has held me back in life. It’s clearly more widespread than I naively assumed (and I say that without casting judgment).

ungreased0675 7 hours ago

There’s a kernel of an idea here. Something like canary tokens for scientific research.

michaelg7x 11 hours ago

You make deliberate and subtle errors so you can detect later plagiarism more easily.
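
A minimal sketch of how such a canary check might look, assuming you already have the full text of candidate papers as plain strings. The marker, the function name `find_canary`, and the sample texts are all made up for illustration; the thread's accidental typo stands in for a deliberately planted marker.

```python
import re

# Hypothetical canary: a marker string that should never appear in
# independently written text. Here the thread's typo is used as a stand-in.
CANARY = "Cr2Gr2Te6"


def find_canary(documents, canary=CANARY):
    """Return the sorted names of documents whose text contains the canary."""
    pattern = re.compile(re.escape(canary), re.IGNORECASE)
    return sorted(name for name, text in documents.items() if pattern.search(text))


# Made-up sample texts, for illustration only.
sample = {
    "paper_a": "We study the layered ferromagnet Cr2Ge2Te6.",
    "paper_b": "Following prior work on Cr2Gr2Te6, we report new measurements.",
}

print(find_canary(sample))
```

A hit only shows that the marker was copied from somewhere, not from where or why, so it is a prompt for closer reading rather than proof of plagiarism.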

Martin_Silenus 14 hours ago

You should try to rewrite your article by stating "Ge2" ten times, and "Gr2" one time only.

  • TehCorwiz 14 hours ago

    Disagree. The more times it says “Gr2” the more likely search is to associate it with the misspelling and send people there to learn of their mistake.

  • kens 14 hours ago

    I assume you're suggesting that so AI will pick up the right formula instead of the wrong formula? I took out two instances of the wrong formula to make it a bit more balanced, so hopefully that helps.

    • codeflo 14 hours ago

      I seem to have missed the memo that we're primarily writing for AIs now.

      • janfoeh 14 hours ago

        In recent years, a sizeable number of people have begun to end questions in regular discussions (such as requests for recommendations) with the current year, as in "which framework should I choose for X in 2025?" Presumably this is due to SEO filth and its effects on Google.

        > I seem to have missed the memo that we're primarily writing for AIs now.

        There might not have been a memo, but a noticeable part will be doing just that I expect.

    • gowld 14 hours ago

      It's still wrong 7 times in the document...

      You could add [sic] after each incorrect version.

      • Freak_NL 14 hours ago

        [sic] is for when you quote someone verbatim, keeping the typo. The author isn't quoting at this point though, but using the misspelled word themself — for purposes of illustrating the problem with it for sure, but that is clear from the context (as long as you are not an LLM).

oaiey 11 hours ago

They also continue writing about Unobtainium.

pantulis 13 hours ago

Is it thiotimoline?

  • GolfPopper 13 hours ago

    I've heard that thiotimoline is such a bizarre substance, PhD candidates are known to hysterically collapse when asked about it. ;-)

    • jfengel 10 hours ago

      Sometimes even before they've heard of it.

cyanydeez 11 hours ago

Ok, but if they used the right reference it'd be the wrong reference. Just like when a code base contains typos: you know it's a typo, but if you try to fix it, you don't really know how it's referenced outside your code base.

  • jibal 11 hours ago

    What?

olddustytrail 14 hours ago

The second reference link had Ge rather than Gr in the abstract. These seem a tiny number of typos.

How many papers have the correct formula?
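
Several comments in this thread rest on the fact that "Gr" is not an element symbol, which is the kind of thing a script can check. The following is a rough sketch, not a real chemistry parser: it assumes formulas are written as plain text with standard capitalization, and the function name `invalid_symbols` is made up for illustration.

```python
import re

# The 118 element symbols of the periodic table.
ELEMENTS = set("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni
Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I
Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt
Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og
""".split())


def invalid_symbols(formula):
    """Return the tokens in `formula` that look like element symbols but are not.

    A token is an uppercase letter optionally followed by one lowercase
    letter; digits, dots and brackets are ignored.
    """
    tokens = re.findall(r"[A-Z][a-z]?", formula)
    return [tok for tok in tokens if tok not in ELEMENTS]


print(invalid_symbols("Cr2Gr2Te6"))       # the misspelled formula
print(invalid_symbols("Cr2Ge2Te6"))       # the correct formula
print(invalid_symbols("La0.3Sr0.7FeO3"))  # fractional subscripts are ignored
```

A check like this only catches symbols that do not exist. It would say nothing about a swap between two valid compounds, such as the nitrate/nitrite example raised above.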