
Tacrolimus shares the allergy inducing pathway with vitamin D

Tacrolimus and vitamin D both suppress IL-2 production.

The mechanisms of IL-2 suppression differ, however. Tacrolimus binds FKBP12 to inhibit calcineurin, blocking NFAT dephosphorylation and IL-2 gene transcription in activated T cells. Vitamin D (1,25(OH)₂D₃) activates the VDR to directly inhibit the IL-2 promoter. High vitamin D levels correlate with reduced IL-2 and Th1 suppression.

Oral vitamin D is pro-allergic in newborns, as I have described in a dozen papers. So if the prohormone vitamin D and the calcineurin inhibitor tacrolimus share the same immunological endpoint, IL-2, I would anticipate that tacrolimus can make you allergic. And well – only today did I discover that this is true, while working on an unrelated review of tacrolimus. So let’s search the literature. https://doi.org/10.1111/j.1365-2222.2011.03761.x says

Results The prevalence of sensitization was significantly higher in the tacrolimus- than in the cyclosporin A-treated group (34%, n = 34, vs. 20%, n = 20; P = 0.026). The rate of clinically relevant allergy in patients receiving tacrolimus was twice that in patients receiving  cyclosporin A (15%, n = 15, vs. 8%, n = 8; P = 0.12).
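As a plausibility check, the reported P values can be recomputed from the quoted counts. The sketch below assumes 100 patients per group (suggested by 34% = n 34 and 20% = n 20) and an uncorrected Pearson chi-square test. Neither assumption is confirmed by the quoted abstract, and the function name chi_square_2x2 is mine.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Assumed group size of 100 each: rows are tacrolimus and cyclosporin A,
# columns are affected and unaffected patients.
_, p_sensitization = chi_square_2x2(34, 66, 20, 80)
_, p_allergy = chi_square_2x2(15, 85, 8, 92)
print(f"sensitization: P = {p_sensitization:.3f}")
print(f"clinically relevant allergy: P = {p_allergy:.2f}")
```

Under these assumptions both values round to the published ones (0.026 and 0.12).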

So this study seems to confirm my hypothesis. Let’s look at another study: https://doi.org/10.1016/j.aller.2017.09.030

Transplant-acquired food allergy was found in 7/12 (58%) children with liver transplantations and in none of the 10 children with kidney transplantations.

This study adds another interesting observation. Conceptually, the “portal–hepatic immune filter + tacrolimus‑induced Th2 shift + high early antigen load in a young gut” model is consistent with this paper. The kidney, lacking this gut–portal interface and typically being transplanted in older children, sits in a different immunologic context, which likely explains why tacrolimus appears “allergy‑inducing” only in the liver setting rather than via renal blood flow.


CC-BY-NC Science Surf , accessed 14.06.2026

Lies appear more plausible to reason

from Hannah Arendt, Die Lüge in der Politik (Lying in Politics)

Lies are often much more plausible, more appealing to reason, than reality, since the liar has the great advantage of knowing beforehand what the audience wishes or expects to hear. He has prepared his story for public consumption with a careful eye to making it credible, whereas reality has the disconcerting habit of confronting us with the unexpected, for which we were not prepared.


The biggest turning point in medical science that I have probably ever encountered

Vitamin D insufficiency? Gone!

I can’t even remember how many vitamin D studies I did, explaining how the prohormone was discovered and how stupid guidelines came onto the scene.

https://academic.oup.com/jcem/article/109/8/1948/7685309

And it didn’t happen quietly. It wasn’t a minor tweak, a footnote, or an incremental update. It was a full reversal of a doctrine that has dominated labs, clinics, public-health brochures, and countless biomarker panels for decades. A classic paper even claimed that 50% of the world population is vitamin D insufficient. For years, we had to live with the tidy triplet:

<20 ng/mL = deficiency
20–30 ng/mL = insufficiency
≥30 ng/mL = sufficiency

That middle category “insufficiency” became a diagnosis in itself. It justified mass screening. It justified supplementation campaigns. It justified entire clinical cultures built around chasing numbers.  And then 2024 arrived.

Because after reviewing all high-quality randomized trials, the Endocrine Society concluded something truly astonishing:

there is no reliable evidence that people with 25(OH)D levels between 20 and 30 ng/mL derive any clinically meaningful benefit from raising those levels

In fact, the guideline panel found that even below 20–24 ng/mL, evidence for clear benefit is surprisingly weak or uncertain — except perhaps in the very elderly, and even there the benefit didn’t map neatly to a threshold. Vitamin D physiology makes the whole “insufficiency” concept biologically dubious, because serum 25(OH)D is only an external storage marker of an intracellular prohormone system — a tank that appears “empty” only in true deficiency like rickets. Let me put that differently: The category of “vitamin D insufficiency,” introduced in 2011 and used worldwide, is now considered *scientifically unsupported*. The Society explicitly withdraws it.

That is not merely unusual. In the world of clinical guidelines, this is as close as you get to a scientific earthquake. Why did they withdraw it? Because the evidence never really showed what everyone assumed.

The new communication explains the problem with striking clarity:

1. Observational associations misled us.
Many early threshold claims came from correlations — low vitamin D and higher PTH, low vitamin D and lower bone density, etc. But none of this proved causality, and much of it turned out to be non-informative once RCTs were performed.

2. Surrogate markers were overinterpreted.
Calcium absorption, PTH suppression, even bone mineral density — these are *indirect* signals. They don’t automatically translate into fewer fractures, fewer falls, fewer infections, or longer life. And when RCTs finally tested real outcomes, the expected clinical benefits simply weren’t there.

3. Large RCTs showed no special benefit in “low–normal” ranges.
VITAL — one of the biggest vitamin D trials ever — found no difference in fractures even in participants below 24 ng/mL, and even those below 12 ng/mL did not exhibit the dramatic benefit everyone predicted (though the subgroup was very small).

4. Across thousands of participants aged 50–74, supplementation beyond the RDA made essentially no difference — including in those below the supposed thresholds.
The forest plots in the guideline communication make this visually obvious: the <20–24 ng/mL subgroups almost never differ from the overall population in any meaningful direction. (See page 5 of the document: identical risk-change estimates for falls, fractures, cancer, CVD, etc.)

We rarely see a major medical society openly dismantle one of its own most influential guidelines — not because of scandal, not because of politics, but because the evidence finally matured and said: we were wrong. And they didn’t hedge. They didn’t massage the language. They called the new stance what it is: epistemic humility.

Still not convinced? For key readings, search for the roughly 10 vitamin D “umbrella reviews” and the 20 studies showing that “vitamin D is a marker of inflammation” and not vice versa.


Correct me if I am wrong

https://www.reddit.com/r/slatestarcodex/comments/7qguze/the_puzzle_why_do_scientists_typically_respond_to

For most researchers it takes a long time to develop ideas, run experiments, do the analysis and write up the results to the standard that journals expect. By the time that you get the reviews back for a piece of work it is likely that you are coming towards the end of your funding, or even that your funding has long since run out. If a reviewer points out a likely problem and the author recognises it as such, they are often left with the thought that they don’t have the time to go back to the drawing board. Developing a better idea can happen the next day, but it could also require several months of intense work and those months may not be available.
As a researcher you are not only emotionally invested in your hypothesis (with all the inadvertent biases you may then apply to your study) but you are literally invested with a lot of your time and money.
I wonder if the state of science publishing could be vastly improved if we started with something similar to what physicists do and expand further.
‘Physics’ has theoretical physicists who develop hypotheses, and experimental physicists then design experiments to test those hypotheses.
This could be taken a step further, in all scientific fields, for example with a further division in responsibility.

https://www.science.org/doi/epdf/10.1126/science.adk1852

Honest mistakes happen, and journals need to be accessible and on the record about their behaviors. Issuing carefully worded statements and “no comment” has no place in a generative culture. Meanwhile, although there have been good recent discussions about universities and journals working together to accelerate corrections and retractions, the universities need to realize that threats of litigation may not be the major consideration when so many within and outside the scientific community are losing trust in science.

https://www.science.org/doi/epdf/10.1126/science.adw5838

Media and public interest in research integrity cases – spurred by online platforms like X, Bluesky, and PubPeer that give a front row seat to potential disputes in real time – is increasing … A university is likely to opt for silence because of fear of litigation and damage to the institution’s reputation. However, authors should ask themselves whether silence could be interpreted by the media and public as an admission of guilt. So, in addition to consulting with institutional professionals, authors should think about talking to the media directly. This can be an opportunity to provide the unvarnished truth in response to tough questions.


AI as a provocation for faith?

Under this title, albeit without the question mark, Feinschwarz carries an article well worth reading.

Contributions from the churches, by contrast, are rare and as a rule amount to no more than general appeals: AI must meet ethical principles and serve human dignity (Rome Call for AI Ethics, Vatican, 2020), must not decide over the life and death of human beings (Antiqua et nova, Vatican, 2025) and must serve human freedom (Freiheit digital, EKD, 2021). …
So far, at any rate, the theses of those responsible in the churches do not reach into the congregations: in the pulpit and at the ambo, in the KFD and in seniors’ groups, artificial intelligence has only rarely been a topic. This pastoral and theological gap is fatal. For the provocation posed by AI is aimed not only at ethics and society, but at the very heart of the Christian faith.

Not only have I already sat through a sermon that sounded suspiciously like AI; just last week I myself wanted to learn something from chatGPT (namely how evangelicals such as John Stott resolve the cognitive dissonance between election and a warring God in Joel 32 and the statements of the Sermon on the Mount – the answer was nothing but waffle).

Most of the time, however, as Michael Brendel rightly writes, we can make use of the answers. AI has incorporated more theological books than I have and “knows” the Bible better than I do. And with that we have a massive provocation for faith, because AI is more devoted to the word than we think.

The prologue of John’s Gospel sums up a central statement of the New Testament: that the Word is divine. God shows himself not only in thorn bushes, pillars of fire and natural disasters; he communicates verbally with human beings. The faithful, for their part, can bring their concerns, their praise and their laments before God through the word. Revelation, liturgy and doctrine are mediated through language. Sacraments only become valid through words. And finally: the Logos, the divine Word, became human in Jesus Christ. The Word of God thus operates in the dimensions of meaning, salvation and revelation. And it is this zone that artificial intelligence is now entering. Since 2022 it is no longer only humans who communicate with humans through the medium of the word, no longer only God and humans. Since the release of ChatGPT there has been a communicative entity that creates meaning through language.

AI talks very opportunistically in the process – at least the three LLMs I have here as a reference. Language models learn from massive amounts of human text what the most frequent (written) patterns in dialogues happen to be: agreeing, explaining, placating, being friendly. When a topic is unclear, contested or risky, models often choose the lowest-risk answer. This looks like telling people what they want to hear, but it is really just a hedging strategy. And of course a model has no convictions of its own (unless, like Grok, it is being channelled in a particular direction) but will only produce the statistically most probable answer.

Without a position of its own, an LLM cannot “disagree”; most three-year-olds are better at that!

The Protestant publicist Johanna Haberer, for instance, pointedly asks whether humans are not creating an image of themselves with AI, just as God created an image of himself with human beings. Of course, the difference between the two acts of creation is fundamental. But her conclusion hits the mark: in both cases the question of responsibility and control arises.

Johanna Haberer, one of the two Pfarrerstöchter (pastor’s daughters), does indeed hit the point. And so we can also give clear answers to Brendel’s 3 questions.

How far is it from the status quo to divine omniscience?

AI is impressive only where printed texts and their soulless reproduction are concerned. Since hallucinations occur again and again, one cannot rely on the answers.

AI already has power today. Will this at some point become omnipotence?

Here I remain skeptical, see the answer to the previous question – language models will always need our control.

AI chatbots are always available, always friendly, always helpful and seemingly always on the side of their users – is that perhaps already omnibenevolence?

Of course not – it is the hedging strategy described above. Nota bene:

https://doi.org/10.1038/s42256-019-0114-4


Unknown facts about COPE

Wilmshurst about COPE then

The Committee on Publication Ethics (COPE) was formed “to address breaches of research and publication ethics”. It was a discussion forum providing advice for editors. Its aims were to find practical ways of dealing with issues of concern and to develop good practice. At that time, the members consisted of a small number of editors of medical journals in the BMJ publishing group and the Lancet. There were two individuals who were not journal editors – Professor Ian Kennedy (subsequently Sir Ian Kennedy) and me.

and now

The majority of COPE’s income is from large publishing houses that obtain COPE membership for their entire portfolio of journals. Because some publishers have enrolled more than one thousand journals and Springer Nature has enrolled more than 3000 journals, there should be a question as to whether all the editors of the journals that are members of COPE are truly signed up to adhere to COPE principles and practices, rather than passively complying with the policy of their publishers. This arrangement also means that COPE does not know precisely how many members it has because publishing houses do not keep COPE informed about the number of journals in their stable. Individual journals can enrol for a small fee. COPE makes a selling point of the fact that COPE membership enables journals to use the COPE logo.


The famous Wilmshurst Spiegel interview


https://www.spiegel.de/international/zeitgeist/spiegel-interview-with-whistleblower-doctor-peter-wilmshurst-a-1052159.html

SPIEGEL: Is money enough?

Wilmshurst: I think for many of them enough money is all they need, really. I mean, most people have a price. But of course you have to be very friendly with them, you have to go and talk to them, have lunch with them, discuss the research. They have to forget that it is your aim to make the maximum amount of money.

SPIEGEL: And what if the professor produces results in a trial that we don’t like?

Wilmshurst: You have made the principal investigator sign a confidentiality agreement beforehand, so that you have control over the data. And you may put into the contract arrangements for a bonus in case of a positive result. (grins) This may help the researcher interpret the data in a more favorable way for you.

SPIEGEL: But there’s one obstinate researcher who keeps insisting on publishing negative results. And he doesn’t accept the bribe that we offer him. What now?

Wilmshurst: Well, you can try and discredit him, you can try to ruin his reputation and cast doubts on his results. That’s what was done with me. And then in the end you can sue him. The threat is often enough to keep him quiet.

Please read the full interview at the link above.


Why the Eickelberg “Gold Standard Science” editorial is so ridiculously shallow

I sent chatGPT 5 the new JCI piece and here is what I got:

The editorial opens by invoking the new federal plan “Restoring Gold Standard Science”, a phrase so inflated that it already tests the reader’s pain threshold before the first paragraph is over.

Rather than examining what this initiative actually changes, the authors simply echo its vocabulary (“rigor,” “reproducibility,” “transparency”), as if repetition itself could restore credibility. The JCI editorial “Publishing gold standard science” may not be offensive or incompetent, but it comes across as self-congratulatory and bureaucratic: essentially a polished form of institutional advertising.

It mistakes bureaucratic slogans for substance
The authors repeat official NIH language—“rigor,” “reproducibility,” “transparency,” “data sharing”—without analyzing what these terms achieve in practice. The piece confuses administrative formality with scientific reform.

“We encourage the NIH to recognize publishers’ role in conveying research results with transparency and accuracy.”

Such sentences state the obvious but contribute nothing analytic. They reinforce policy orthodoxy rather than scrutinize it.

It rehashes two-decade-old debates
Most of the text is a retelling of Begley & Ellis (2012) and Freedman & Inglese (2014). The authors restate familiar concerns about irreproducibility but add no evidence or theoretical insight—just a timeline of NIH initiatives already well known to anyone in biomedical research.

It presents correlation as insight
The figure showing that SciScore doubled between 2000 and 2012 but stagnated after 2014 directly contradicts the claim that the NIH “Rigor and Reproducibility” framework improved research quality. Instead of confronting that, the authors offer speculation about funding.

“We hypothesize that in the US, underfunding also threatens R&R. … It would be valuable to know what accounted for this apparent improvement.”

This turns a negative result into vague conjecture, avoiding the obvious conclusion that the framework had little measurable effect.

It functions as self-promotion
Large sections describe JCI’s internal procedures—manual QC, data policies, peer-review traditions—without connecting them to measurable improvements in reproducibility.

“In 2025, our staff began manual quality control measures of high-throughput sequencing and proteomic datasets prior to acceptance.”

The tone resembles a corporate sustainability report more than a scientific analysis.

It trivializes the reproducibility crisis
Although the authors admit that there is no evidence of better translation from these reforms, they close with uplifting boilerplate.

“Editors, reviewers, NIH-supported scientists, and the scientific community will continue to work with the NIH… This is a noble and worthy cause for all of us to unite in advancing together.”

The conclusion replaces reflection with ceremony.

It uses circular logic about peer review
The authors argue that disagreement among reviewers is proof of rigor, not a sign of inconsistency.

“Peer review… includes disagreement and dissent — hallmarks of scientific progress.”

This equates friction with validity; dissent becomes self-justifying evidence that the process works, which is logically empty.

It exemplifies performative virtue
Frequent references to “accountability,” “taxpayer dollars,” and “public trust” suggest moral legitimacy but not measurable outcomes. The piece sounds reassuring rather than investigative.

“Calls for restoring Gold Standard science are said to be motivated by the erosion of public trust in science.”

This is a rhetorical maneuver—invoking civic virtue while sidestepping the deeper problem of perverse incentives, selective reporting, and impact-factor economics.

It relies on inflated rhetoric
Expressions such as “robust education of scientists,” “gold standard science,” and “noble and worthy cause” give the essay a sermon-like tone. The writing feels more like grant-language or an internal memo than the reflective editorial it claims to be.

Summary
The editorial is so shallow not because it contains factual errors, but because it exemplifies performative concern for rigor: an official text celebrating reproducibility while offering no reproducible evidence. It turns the language of reform into a form of self-validation, mistaking compliance with substance.

Postscript June 2026: Eickelberg could have written this differently. Kahn et al. demonstrated in Diabetes Care what an editorial of this kind looks like when taken seriously: concrete data (89% drop in NIH funding announcements, 66% fewer grant awards, 304 interrupted clinical trials), plain language, and the willingness to name political interference directly — at personal cost.  That is the difference between performative concern for scientific rigor and actual civic courage.


Honorable mention

I had been at previous HUGO meetings where, just by alphabetic order, WJST followed WATSON. But I had never been happy about that … Here are two obituaries of Jim Watson. The first is “Titan of science with tragic flaws” in Science:

Nathaniel Comfort, a science historian at Johns Hopkins University who early in his career worked as a writer at CSHL, is completing a biography of Watson, which he has been researching for more than a decade. The working title is American Icarus, a reference to hubris. “Watson was the most important and most famous scientist of the 20th century, and the most infamous of the 21st, and in both cases, the reason is due to his genetic determinism,” says Comfort, who says his subject was guilty of “over believing in the power of DNA.”

And another from one of my favorite blogs, by Lior Pachter. The official CSHL release is crossed out there; in the excerpt below, each CSHL statement is followed by Pachter’s correction.

As an author, Watson wrote two books at Harvard that were and remain best sellers. The textbook Molecular Biology of the Gene, published in 1965 (7th edition, 2020), changed the nature of science textbooks, and its style was widely emulated.
In this textbook Watson got the central dogma wrong, presenting it in a profoundly misleading way. (source: Matthew Cobb, 2024).
The Double Helix (1968) was a sensation at the time of publication. Watson’s account of the events that resulted in the elucidation of the structure of DNA remains controversial, but still widely read.
Prior to the publication of The Double Helix, Francis Crick wrote that “If you publish your book now, in the teeth of my opposition, history will condemn you”. Watson published the book anyway (source: letter by Francis Crick, 1967) .

Nevertheless, last week I had an honorable encounter with the esteemed colleague Peter Wilmshurst, who was also commenting at PubPeer on a trial that went wrong.

Wilmshurst is one of the real heroes in an era where an entire generation of scientists is in danger of sinking into the mud.


Forensic image analysis

Given my interest in photography and forensic image analysis, only now did I have the chance to read the full AP documentation of the “Napalm Girl”. It is a cruel and exciting story, but a perfect example of provenance research.

Any effort to reconstruct what happened on the road using available footage is going to be imperfect, with a wide margin for error. It’s important to keep in mind this took place in an analog world, where film stock and camera rolls were a finite resource… [Ut’s Pentax] camera was tested by AP on April 18, 2025. The results appear to show very clear similarities to the famous image and are another marker to suggest that the famous image was likely taken with a Pentax. The negative images do not match exactly. The camera itself seems to be in mint condition, so it is unlikely to have been used in combat for any length of time, and given the date of manufacture, could not have been inherited from Ut’s brother. The investigation showed Ut owned Pentax cameras and used Pentax cameras while covering the war. It does not prove he held a Pentax in Trang Bang on June 8, 1972.


Who needs how much vitamin D, when and why?

Jörg Spitz is probably not the right expert for this topic, unless one would also want to ask Wolfgang Wodarg, Michael Meyen, Sucharit Bhakdi or Stefan Homburg about COVID.

I will therefore spare myself the hunt for errors in the MMW Fortschr Med. 2025; 167 (S3) article, at the end of which Spitz declares

The author declares that he was not guided by any commercial interests in preparing this article. He discloses the following potential conflicts of interest: none.
The publisher declares that the quality of the article’s content has been confirmed by two independent reviews. Advertising in this issue of the journal is unrelated to the CME training. The publisher guarantees that the CME training and the CME questions are free of promotional statements and contain no product recommendations whatsoever.

At the same time, however, Spitz sells his “digital event package” on the controversial high-dose vitamin D therapy online for €149.

https://digitalewelt.spitzen-praevention.com/vitamin-d-event/event-paket/ accessed 4/11/25

I would not have guessed it, but vitamin D seems to be a lucrative business model after all, even if no money can be made from the substance itself. As medwatch noted quite some time ago

Behind the site is the nuclear medicine physician, book author and self-proclaimed prevention expert Jörg Spitz. And: vitamin D seems to be one of his favorite topics; he has already published five guidebooks on the vitamin. He also runs a whole network of different websites, all of which revolve around vitamin D and other dietary supplements, such as the Akademie für menschlichen Medizin (AMM). The GmbH had a turnover of 225,000 euros in 2019 – Spitz is its sole employee.

Its assets on 31.12.2021 then stood at €373,000.

And the speakers at spitzen-praevention.com? A real cabinet of curiosities, most of them without scientific qualifications but with missionary zeal.

And the “serious” experts among them? All controversial – at least judging by the sources chatGPT spits out – listed here for your own verification, all without guarantee.


Is there a data agnostic method to find repetitive data in clinical trials?

There is an interesting observation by Nick Brown over at PubPeer, who analysed a clinical dataset (see also my comment at the BMJ)

…there is a curious repeating pattern of records in the dataset. Specifically, every 101 records, in almost every case the following variables are identical: WBC, Hb, Plt, BUN, Cr, Na, BS, TOTALCHO, LDL, HDL, TG, PT, INR, PTT

which is remarkable detective work. By plotting the full dataset as a heatmap of z scores, I can confirm his observation of clusters after sorting for modulo 101 bin.

How could we have found the repetitive values without knowing the period length? Is there any formal, data-agnostic detection method?

If we don’t even know the initial sorting variable, it may make sense to look primarily for monotonic and nearly unique variables, i.e. plausible ordering variables. Clearly, that’s obs_id in the BMJ dataset.
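A minimal sketch of such a screen, assuming the data are available as a dict of numeric columns. The function name ordering_candidates, the 0.95 cut-off and the toy values are illustrative choices of mine, not taken from the BMJ dataset.

```python
def ordering_candidates(columns, min_fraction=0.95):
    """Names of columns that are nearly unique and nearly monotonic."""
    result = []
    for name, values in columns.items():
        if len(values) < 2:
            continue
        unique_fraction = len(set(values)) / len(values)
        steps = [b - a for a, b in zip(values, values[1:])]
        increasing = sum(s > 0 for s in steps) / len(steps)
        decreasing = sum(s < 0 for s in steps) / len(steps)
        monotonic_fraction = max(increasing, decreasing)
        if unique_fraction >= min_fraction and monotonic_fraction >= min_fraction:
            result.append(name)
    return result

# Toy columns, invented for illustration only.
toy = {
    "obs_id": list(range(1, 11)),
    "WBC": [6.1, 7.4, 5.9, 6.1, 8.0, 7.4, 5.9, 6.6, 7.1, 6.1],
    "age": [54, 61, 47, 70, 66, 58, 49, 73, 62, 55],
}
print(ordering_candidates(toy))
```

Here age is unique but not monotonic and WBC is neither, so only obs_id survives the screen.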

Let us first collapse all continuous variables of a row into a string forming a fingerprint. Then we compute pairwise correlations (or Euclidean distances in this case) of all fingerprints. If a dataset contains many identical or near-identical rows, we will see a multimodal distribution of correlations plus an additional big spike at 1.0 for duplicated rows. This is exactly what happens here.

Unfortunately, this works only when mainly repetitive variables are included and not too many non-repetitive variables.

Next, I thought of Principal Component Analysis (PCA), as the identical blocks may create linear dependencies so that the covariance matrix becomes rank-deficient. But unfortunately the results here were not very impressive, so we had better stick with the pairwise similarity above.
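The fingerprint comparison described above could look roughly like this. It is a sketch on invented toy rows, using Euclidean distance, where a distance of 0 plays the role of the spike at similarity 1.0. The helper names and the rounding to three decimals are my assumptions.

```python
from collections import Counter
from itertools import combinations
import math

def fingerprint(row, digits=3):
    """Collapse the numeric values of one row into a single string."""
    return "|".join(f"{value:.{digits}f}" for value in row)

def duplicated_rows(rows):
    """Number of rows whose fingerprint occurs more than once."""
    counts = Counter(fingerprint(row) for row in rows)
    return sum(c for c in counts.values() if c > 1)

def zero_distance_share(rows):
    """Share of row pairs with Euclidean distance 0 (identical rows)."""
    pairs = list(combinations(rows, 2))
    return sum(math.dist(a, b) == 0.0 for a, b in pairs) / len(pairs)

# Toy rows (two lab values each), invented for illustration only.
toy_rows = [(6.1, 13.2), (7.4, 14.0), (6.1, 13.2), (5.9, 12.8), (7.4, 14.0)]
print(duplicated_rows(toy_rows), zero_distance_share(toy_rows))
```

The pairwise step is quadratic in the number of rows, so for large datasets the Counter over fingerprints is the cheaper first look.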

So rest assured we find an excess of identical values, but how to proceed? Duplicates spaced by a fixed lag will cause a high lag-k autocorrelation in each variable. Scanning k=1…N/2 reveals spikes at the duplication lag, as shown by a periodogram of row-wise similarity in the BMJ dataset.

So there are peaks at around 87, 101 and 122. Unfortunately I am not an expert in time series or signal processing analysis. Can somebody else jump in here and provide some help with FFT?
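Until somebody contributes a proper FFT, the lag scan itself can be sketched in the time domain. This is not a periodogram: it simply counts, for each lag k, how often row i is identical to row i + k. The data are random toy rows with one planted block copy, and the function names are mine.

```python
import random

def lag_match_fraction(rows, k):
    """Share of positions i where row i is identical to row i + k."""
    n = len(rows) - k
    return sum(rows[i] == rows[i + k] for i in range(n)) / n

def scan_lags(rows):
    """Match fraction for every lag k = 1 ... N/2."""
    return {k: lag_match_fraction(rows, k) for k in range(1, len(rows) // 2 + 1)}

# Toy data, invented for illustration: 40 random rows, then a block of
# 7 rows is copied 7 positions further down.
rng = random.Random(1)
rows = [(rng.random(), rng.random()) for _ in range(40)]
for i in range(10, 17):
    rows[i + 7] = rows[i]

scores = scan_lags(rows)
best_lag = max(scores, key=scores.get)
print(best_lag, round(scores[best_lag], 3))
```

Exact equality only catches verbatim copies; near-duplicates would need a tolerance or a correlation measure instead.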

There may be an even easier method, using the fingerprint gap. For every fingerprint that occurs more than once, we sort those rows by obs_id and compute the differences of obs_id between consecutive matches. Well, this shows just one dominant gap, at 101!
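The fingerprint-gap idea is short enough to sketch directly. The records below are invented toy data with a planted gap of 5, and fingerprint_gaps is a hypothetical helper name, not part of my actual script.

```python
from collections import Counter, defaultdict

def fingerprint_gaps(records, id_key="obs_id"):
    """Count obs_id differences between consecutive rows sharing a fingerprint."""
    groups = defaultdict(list)
    for rec in records:
        fp = tuple(v for k, v in sorted(rec.items()) if k != id_key)
        groups[fp].append(rec[id_key])
    gaps = Counter()
    for ids in groups.values():
        ids.sort()
        # Fingerprints that occur only once contribute no gap.
        gaps.update(b - a for a, b in zip(ids, ids[1:]))
    return gaps

# Toy records, invented for illustration only.
wbc = [6.1, 7.4, 5.9, 8.0, 6.6, 6.1, 7.4, 5.9, 7.1, 9.2, 6.1, 5.0]
records = [{"obs_id": i + 1, "WBC": v} for i, v in enumerate(wbc)]
gaps = fingerprint_gaps(records)
print(gaps.most_common(3))
```

A single dominant entry in the counter points to one copy distance; several comparable entries would suggest more than one copying step.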

We could also test all relevant mod values, let’s say between 50 and 150. For each candidate we compute the across-group variance of the standardized lab means. The result is interesting

Modulus 52: variance = 0.084019
Modulus 87: variance = 0.138662
Modulus 101: variance = 0.789720
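One possible reading of this modulus scan, as a sketch: z-score each lab variable, group rows by obs_id mod m, and take the variance of the group means, averaged over variables. The toy series below is perfectly periodic with period 7, and the function names and the scan range 5 to 10 are assumptions for the example, not the values used on the BMJ data.

```python
from statistics import mean, pstdev, pvariance

def zscores(values):
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def modulus_variance(ids, columns, m):
    """Mean over variables of the variance of group means, groups = id mod m."""
    per_variable = []
    for values in columns:
        groups = {}
        for i, z in zip(ids, zscores(values)):
            groups.setdefault(i % m, []).append(z)
        per_variable.append(pvariance([mean(g) for g in groups.values()]))
    return mean(per_variable)

# Toy data, invented for illustration: one variable repeating every 7 rows.
template = [4.0, 9.0, 5.5, 7.0, 6.2, 8.1, 5.0]
ids = list(range(1, 71))
wbc = [template[i % 7] for i in ids]

scores = {m: modulus_variance(ids, [wbc], m) for m in range(5, 11)}
best_modulus = max(scores, key=scores.get)
for m, value in scores.items():
    print(f"Modulus {m}: variance = {value:.6f}")
```

With the true period the groups are internally constant, so the between-group variance approaches the total variance of 1; for unrelated moduli the group means average out towards 0.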

As a cross-check, let us look into white blood cell counts (WBC) and hemoglobin (Hb).

I am not sure how to interpret this. Mod 52 may reflect shorter template fragments but did not show up in the autocorrelation test. Mod 87 has a rather smooth, coherent curve and is supported by autocorrelation. Mod 101 is noisier, but probably gives the best explanation for block-copied values. Maybe the authors block copied on two occasions?

On the next day, I thought of a strategy to find the exact repetition numbers. Why not loop over mod 50 through 150 and just count the number of identical blocks? This is very informative – blocks of size 2, size 3 and 4 or greater show an exact maximum at modulus 101.
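A sketch of that counting loop, under one interpretation of “block”: a set of rows with identical fingerprints that fall into the same obs_id mod m bin. Other definitions (for example runs of consecutive matching rows) are equally conceivable. The data and names are invented for illustration.

```python
from collections import Counter

def block_sizes(ids, fingerprints, m):
    """Count identical-fingerprint groups per modulo bin, by size (4 means 4 or more)."""
    groups = Counter((i % m, fp) for i, fp in zip(ids, fingerprints))
    sizes = Counter()
    for count in groups.values():
        if count >= 2:
            sizes[min(count, 4)] += 1
    return sizes

# Toy data: 30 rows, three fingerprints copied at a distance of 5 rows.
ids = list(range(1, 31))
fingerprints = [f"unique-{i}" for i in ids]
for copied, label in (((1, 6, 11, 16), "A"), ((2, 7, 12), "B"), ((3, 8), "C")):
    for i in copied:
        fingerprints[i - 1] = label

# Score each modulus by the number of blocks of size 3 or larger.
large_blocks = {}
for m in range(2, 9):
    sizes = block_sizes(ids, fingerprints, m)
    large_blocks[m] = sizes[3] + sizes[4]
best = max(large_blocks, key=large_blocks.get)
print(best, dict(block_sizes(ids, fingerprints, best)))
```

Small blocks also appear at divisors and multiples of the true distance, which is why the larger blocks are the more specific signal in this toy case.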


23.3.2026 Appendix

There seem to be many more studies out there with signs of copy-pasting, including a Parkinson Cell paper, a PLoS Genetics toxicology paper and a Nat Comm fish ecology study. Here is the Github link to the implementation by Markus Eglund

Hopefully I get the pipeline right by summarizing the entropy calculation there. This is not Shannon entropy – it is a custom measure of how informationally surprising a raw number is. The logic is:

  • Strip the decimal point and trailing zeros from the number’s string representation, then take the absolute integer value. So 0.314 → 314, 0.500 → 5 (trailing zeros stripped), 2016 → 16 (year exception: years 1900-2030 get a capped entropy of 100).
  • Apply a log-scaled transformation: values below 100 get log10(value); values up to 100,000 get 5×log10 - 8; larger values get log10 + 12.
  • For column sequences, sum the individual entropy scores of each value in the run.
  • Adjust downward for “regularity” – if the values in a sequence follow a regular arithmetic interval (e.g. 1.0, 2.0, 3.0), the score is reduced proportionally, because regular sequences can appear legitimately.
  • Normalise by logNumberCountModifier (log of the total number of numeric cells on the sheet) so large sheets don’t get disproportionately penalised.
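The first three steps above can be sketched in Python as follows. This is my reading of the description, not a port of the TypeScript code: the year exception, the regularity adjustment and the normalisation are left out, and I assume plain decimal strings, trailing zeros stripped only after a decimal point, and a score of 0 for a value of 0. All function names are mine.

```python
import math

def stripped_integer(text):
    """Drop sign, decimal point and fractional trailing zeros: '0.500' gives 5."""
    text = text.strip().lstrip("+-")
    if "." in text:
        text = text.rstrip("0")  # assumption: only fractional zeros are stripped
    digits = text.replace(".", "")
    return int(digits) if digits else 0

def number_entropy(text):
    """Piecewise log-scaled score of a single number, as summarized above."""
    value = stripped_integer(text)
    if value < 1:
        return 0.0  # assumption: zero carries no information
    if value < 100:
        return math.log10(value)
    if value <= 100_000:
        return 5 * math.log10(value) - 8
    return math.log10(value) + 12

def sequence_entropy(cells):
    """Sum of the individual scores of a run of values."""
    return sum(number_entropy(cell) for cell in cells)

print(stripped_integer("0.314"), stripped_integer("0.500"))
print(number_entropy("100"), number_entropy("100000"))
```

The three branches join without jumps: both neighbouring formulas give 2 at a value of 100 and 17 at 100,000, so the steep middle section is what rewards numbers with three to five significant digits.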

The suspicion grades are fixed thresholds on the resulting normalized score. I will add the strategy to my Python script (it is implemented here in TypeScript) as another module and upload it to Github once it has been sufficiently tested.

31.3.2026 Appendix

PREVENT-TAHA8, the starting point of this analysis, has been retracted today. I will give a presentation on the avalanche that has been triggered by this paper on 29–31 July 2026 in Hannover.

Screenshot 31/3/26
