Holding up RCTs as 'gold standard' of research is a false hope and conceals statistical pitfalls, experts say

By Hank Schultz

Randomized, placebo-controlled trials are a straitjacket for nutritional science, making FTC’s hardening stance an increasing burden on industry, experts said at the recent CRN conference. The approach poses ethical quandaries and may even have statistical traps built in, they said.

Regina Nuzzo, PhD, and Jeffrey Blumberg, PhD, participated in a session titled “Clinical Research, Statistics and Other Deceptions” at the Council for Responsible Nutrition’s annual meeting, held this year in Laguna Beach, CA. Nuzzo spoke on the widespread fallacy of equating P values with high levels of certainty about the truth of underlying hypotheses, while Blumberg laid out why the RCT model is inappropriate for most of the questions that nutritional science asks.

The question is a burning one because for the last several years the Federal Trade Commission has required that companies that sign consent decrees have at least two RCTs under their belts to substantiate future claims on products. While the agency has said that this standard applies only to those companies that signed decrees and is meant as a ‘fencing in’ measure for companies that were making outlandish claims, the development has had a chilling effect on the broad spectrum of nutritional research.

False promise of P values

Part of the move to hold RCTs up as a gold standard of evidence lies in a fallacy about what their results actually mean, said Nuzzo, a professor of statistics at Gallaudet University in Washington, DC. There is widespread belief among the mainstream media and even among some regulators that a P value of 0.05 (a measure of statistical probability) equates to a 95% certainty that the hypothesis is true. This is a simplistic view of what a P value actually represents, Nuzzo said.

“All a P value can do is to summarize the data assuming a specific null hypothesis. It can’t work backwards and make statements about the underlying reality,” Nuzzo said. What’s really necessary to make a judgement about reality using a P value as a reference point is knowledge of how likely the original hypothesis was to be true in the first place. A P value of 0.05 or lower on a study provides additional support for something you were already pretty sure was true, Nuzzo said. But a P = 0.05 result for a very uncertain hypothesis might boost confidence in the truth of the underlying reality only to something like 50%, she said. It doesn’t mean that a P = 0.05 study is true and valuable and a P = 0.07 study is not, and drawing this sort of bright line is something that FTC has been increasingly leaning toward.
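The dependence on the starting plausibility of a hypothesis can be illustrated with a short calculation. The sketch below is not taken from Nuzzo's talk. It assumes one published calibration, the Sellke-Bayarri-Berger bound, under which the Bayes factor in favor of the null hypothesis is at least -e · p · ln(p) for P values below 1/e. The function names are hypothetical, and the results are best-case upper limits on how much a given P value can raise confidence, not exact probabilities.

```python
import math


def min_bayes_factor_for_null(p_value: float) -> float:
    """Lower bound on the Bayes factor favoring the null hypothesis.

    Assumption: the Sellke-Bayarri-Berger calibration, -e * p * ln(p),
    which only holds for 0 < p < 1/e.
    """
    if not 0.0 < p_value < 1.0 / math.e:
        raise ValueError("the bound only holds for 0 < p < 1/e")
    return -math.e * p_value * math.log(p_value)


def max_posterior_probability(prior: float, p_value: float) -> float:
    """Upper bound on the probability that the effect is real, given the
    prior probability that it is real and the observed P value."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    # Dividing by the smallest possible Bayes factor for the null gives
    # the largest possible odds in favor of a real effect.
    posterior_odds = prior_odds / min_bayes_factor_for_null(p_value)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Long shot, toss-up, and near-certain starting points (illustrative).
    for prior in (0.05, 0.50, 0.90):
        best_case = max_posterior_probability(prior, p_value=0.05)
        print(f"prior {prior:.2f} -> at most {best_case:.2f} after P = 0.05")
```

Under this assumption, a P = 0.05 result lifts a 5% long shot to no more than about 11%, a 50-50 toss-up to about 71%, and a 90% favorite to about 96%. In none of these cases does the result by itself deliver 95% certainty.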

The concept of the P value has been around for about seven decades, and researchers have had reservations about the validity of using it as a tool to measure a study’s quality for almost that whole time, Nuzzo said. While it can be a valuable reference point, it was originally meant as a measure of whether a given line of research seemed promising enough to deserve a second look, she said.

“Some journals are moving away from P values altogether,” she said. “What’s really needed is more sharing, more transparency and more collaboration to build statistical power from related studies.”

Drug testing

Blumberg, who heads the Antioxidants Research Laboratory at Tufts University in Medford, MA, said FTC is trying to tie the research question up in a neat bow using the two-RCT standard, but the tool the agency has chosen is not fit for the task at hand.

“RCTs were in fact designed to test drugs,” Blumberg said. “Drugs have large effects and target specific systems. Nutrients by definition are pan-systemic. RCTs have very limited generalized ability to test the effects of nutrients.”

And then there are ethical questions, Blumberg said. To have a true control group, a deprivation of some sort would have to be in place. While this can be done with rats and mice, it obviously can’t happen with humans in most cases. And in nutrition science numerous other confounding factors can come into play, he said.

“At best you can test high intake levels and compare them to what low levels look like,” he said.

Totality of evidence as the standard

So what is a researcher to do? The effort to drive evidence-based decision making into ever more rigid channels is something that responsible members of industry and the research community have to struggle against, Blumberg said. The question is not black or white, but rather how dark a shade of gray you can achieve.

“I believe in the power of nutrition, but the effects are modest and aggregated across multiple systems. Nutrients usually have multiple thresholds that are often under homeostatic control. Rather than using an RCT ‘gold standard’ we should consider the totality of research approaches,” Blumberg said.

“We should be able to use scientific judgement. We need to look at things like benefit and risk. If the shoe doesn’t fit, we shouldn’t have to wear it. We need to be able to use the power of nutrition to advance public health without being held to a false standard like the RCT,” he said.

To P or not to P.

Posted by Larry,

I remember doing these calculations in college. Whether P is a problem is a different question. The real problem is how P is being used. Too many nutritional companies pay for a test on a small population and then market the results of that test on the basis of P.

P-Value Does Not Speak to the Validity of Hypotheses

Posted by Chris Melville,

The P-values are a measure of the likelihood the findings are random. It is about the observed data and does not provide any information about the author's interpretation of the data.

Experts Say?

Posted by Michael Evans,

Your "experts" want to replace data (P values) with "scientific judgement" whatever that is, and look at "benefit and risk". What does that really mean and how do you measure benefit and risk anyway? Well if you are going to get meaningful data you will need to collect data in a controlled study, blind it from bias, compare it to a control, and measure it using proven statistical data. Since when is that a "false standard" and if so what proven alternative would the "experts" propose?
