The Nutri-Score analogy
When Yuka scans a food product, it does not tell you if it tastes good. It tells you if the composition is balanced: sugar, salt, saturated fats, additives. The Nutri-Score evaluates the recipe, not the flavor.
Publi-Score does the same for scientific publications. We do not evaluate whether the authors are right. We evaluate whether the process that produced their results is rigorous: randomization, pre-registration, data sharing, funding independence.
This is a fundamental distinction. A paper can have true results and a fragile methodology. A paper can have an exemplary methodology and results that will later be refuted. Science advances through accumulation — a single paper, even perfect, is never definitive. Publi-Score measures the quality of the contribution to the scientific debate, not its ultimate truth.
The 7 categories
The Publi-Score v1 grid covers 7 categories and more than 30 sub-criteria. Here are the 7 categories explained simply.
Study design
How was the experiment built?
Randomization, double-blind, control group, active comparator or placebo. This is the backbone of any study. A well-designed randomized controlled trial is the reference: it avoids selection biases and allows causal conclusions.
Example: A study comparing a drug to a placebo in double-blind fashion scores well here. A study with no control group scores poorly.
Transparency and pre-registration
Were the rules of the game defined before playing?
Pre-registration (on ClinicalTrials.gov, OSF, PROSPERO...) forces researchers to define their hypotheses and outcome criteria before seeing the data. Without it, it is easy to look for what works in the data after the fact — what is called HARKing (Hypothesizing After Results are Known).
Example: A pre-registered study with verified protocol compliance scores 4/4. A study without pre-registration scores 0.
Statistical power
Was the study large enough to detect a real effect?
An underpowered study misses a real effect. An overpowered study detects effects with no practical relevance. The a priori power calculation shows that researchers thought about their sample size before starting — not after.
Example: A trial of 50 patients on a rare disease with a justified power calculation can score correctly. A trial of 12 patients without justification scores poorly.
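The a priori power calculation mentioned above can be sketched with the standard normal-approximation formula for a two-sample comparison of means. This is a minimal illustration, not the Publi-Score grid itself; the effect size in the usage line is a hypothetical planning assumption.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size: float, alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """A priori sample size per group for a two-sided, two-sample
    comparison of means (normal approximation).
    effect_size is Cohen's d, the standardized mean difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return math.ceil(n)

# Hypothetical scenario: a medium expected effect (d = 0.5)
print(sample_size_per_group(0.5))  # 63 per group
```

The point of the criterion is not the arithmetic, it is that this calculation exists in the protocol before recruitment starts.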
Statistical analysis
Were the numbers manipulated?
Post-hoc subgroups, multiple endpoints, unplanned interim analyses, per-protocol vs intention-to-treat — each of these practices, if unplanned, increases the risk of false positives. Publi-Score applies cumulative penalties for statistical inflation.
Example: A study that pre-specifies its primary analyses and corrects for multiple comparisons scores well. A study reporting 30 subgroups of which only one is significant scores poorly.
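The cumulative-penalty idea can be sketched as follows. The flag names and penalty weights here are invented for illustration; the real grid defines its own sub-criteria and weights. The key property is that penalties compound multiplicatively, so each additional unplanned practice erodes the remaining score.

```python
# Hypothetical penalty weights for unplanned statistical practices;
# the actual Publi-Score grid defines its own sub-criteria and weights.
PENALTIES = {
    "post_hoc_subgroups": 0.15,
    "unplanned_interim_analysis": 0.10,
    "switched_primary_endpoint": 0.20,
    "no_multiplicity_correction": 0.10,
}

def analysis_score(base: float, flags: list[str]) -> float:
    """Apply cumulative penalties: each flagged practice multiplies
    the category score down, so penalties compound rather than add."""
    score = base
    for flag in flags:
        score *= 1 - PENALTIES[flag]
    return round(score, 2)

print(analysis_score(10.0, []))  # 10.0 — clean pre-specified analysis
print(analysis_score(10.0, ["post_hoc_subgroups",
                            "no_multiplicity_correction"]))  # 7.65
```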
Data and reproducibility
Can the calculations be verified?
Sharing of raw data, analysis code, and complete protocol. Reproducibility is a fundamental property of science: if no one can redo the calculation, no one can catch an error. Open-access journals with shared data score better.
Example: A study depositing its data on a public repository (OSF, Zenodo, Dryad) scores 2/2. A study with no shared data and no justification scores 0.
Reporting
Does the article say everything that needs to be said?
The CONSORT (clinical trials), PRISMA (meta-analyses), and STROBE (observational studies) reporting standards define what must be reported for an article to be evaluable. Conflicts of interest, acknowledged limitations, negative results — all must appear.
Example: A trial following CONSORT and explicitly declaring conflicts of interest scores well. An article omitting non-significant secondary results scores poorly.
Editorial process
Was the publication seriously reviewed?
Journal quality (DOAJ, impact factor, double-blind peer review), submission to acceptance delay, presence in recognized indexed databases. An article accepted in 3 days by a little-known journal is a warning signal — not proof.
Example: A NEJM article with documented peer review scores 1.0. An article in a predatory journal (Beall's list) scores 0.
The integrity coefficient: a multiplier, not a bonus
The 7 categories produce a raw score out of 100 — the methodological quality. This score is then multiplied by an integrity coefficient between 0 and 1.
The integrity coefficient evaluates two dimensions:
- Author integrity — conflict of interest disclosure, funding independence
- Editorial process — peer review quality, publication delay, recognized journal
The logic of the multiplier is important: it is not a bonus that adds up, it is a necessary condition. A paper retracted for fraud with a raw score of 90/100 is not worth a slightly reduced 80 — it is worth 0. Blocking warning signals (retraction, proven fraud, predatory journal) force the coefficient to zero, regardless of the rest.
The A–E scale (and X)
A — Standards are met on almost all criteria. The evidence is robust and interpretable.
B — A few minor shortcomings that do not invalidate the conclusions, but deserve attention.
C — Important methodological weaknesses that call for caution in interpretation.
D — Conclusions are fragile. The study may contain useful observations, not evidence.
E — Limitations are too numerous to draw reliable conclusions.
X — Retraction, proven fraud, or blocking signal. Score is forced to zero.
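Mapping the final score to a letter is a simple threshold lookup. The cut-off values below are hypothetical placeholders for illustration; the actual boundaries are defined in the public grid.

```python
# Hypothetical thresholds for mapping a final score (0-100) to a letter;
# the actual Publi-Score cut-offs are defined in the public v1 grid.
def letter_grade(score: float, blocked: bool = False) -> str:
    if blocked:  # retraction, proven fraud, or other blocking signal
        return "X"
    for threshold, letter in [(80, "A"), (65, "B"), (50, "C"), (35, "D")]:
        if score >= threshold:
            return letter
    return "E"

print(letter_grade(86))                # A
print(letter_grade(40))                # D
print(letter_grade(90, blocked=True))  # X — the raw score is irrelevant
```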
What Publi-Score does not measure
- ⊘ Truth — a score A does not mean the study is right. A score E does not mean it is wrong.
- ⊘ Impact — a highly cited study can have a low Publi-Score if its methodology is lacking.
- ⊘ Practical relevance — a statistically significant effect can be clinically negligible, and vice versa.
- ⊘ Novelty or importance — a replication confirming a known result can have an excellent score.
Publi-Score answers one question: did this study follow the rules of the scientific game? That is a necessary question. It is not the only one.
The grid is public — and contestable
The Publi-Score v1 grid is fully documented on the methodology page. Every sub-criterion, every scale, every acknowledged limitation is accessible.
If you think a criterion is miscalibrated, a score is incorrect, or an important dimension is missing, you can flag it from any article page. Every justified challenge leads to a documented revision of the score or the grid.
Transparency is not a selling point. It is the condition for Publi-Score itself to be credible.
Try it with an article → you will see the score category by category.
