In a recent article for Reason, I argued that the hundreds of studies New York University professor Jonathan Haidt has assembled to support his claim that social media is causing the teen mental health crisis don't merely fail to back up that claim; they undermine it.
Haidt, who has also been calling for new federal legislation that would restrict teen access to social media, responded to my article (and the criticisms of three other writers, whom he labeled "the skeptics") with a lengthy Substack piece.
Jonathan Haidt continues to set a fine example of how debate on public policy should be conducted, and so seldom is. He offers clear claims with detailed references. He acknowledges complexities and uncertainties. He offers detailed replies to critics, without rancor or name-calling.
However, in his response, Haidt erroneously depicted me as being dismissive of all social science research. He characterized my critique of his work as consisting mainly "of criticisms of specific studies," conceding that many of those concerns are "justified," but asking "what level of skepticism is right when addressing the overall question: is social media harming girls?" He continued, "If multiple studies find that girls who become heavy users of social media have merely twice the risk of depression, anxiety, self-harm, or suicide, [Brown] doesn't want to hear about it because it COULD conceivably be random noise."
I didn't express "concerns" about specific studies; I argued that the majority of the 301 papers cited in his document are garbage. I went through each category of studies on Haidt's list, chose the first in each category that studied social media and depression (so the sample couldn't be cherry-picked), and then showed that those studies were so embarrassingly bad as to be completely useless. They were guilty of coding errors, fatal defects hidden in mid-paper jargon, inappropriate statistics, longitudinal studies that weren't longitudinal, experiments in name only, and red flags for hypothesis shopping and p-hacking (that is, misusing data analysis to yield results that can be presented as statistically significant).
He should remove them from his research compendium and excise them from his upcoming book. Including them would be analogous to the financial industry's decision to bundle toxic mortgage assets in the lead-up to the 2008 financial crisis. "A bad study is like a bad mortgage loan," I wrote in my original piece. "Packaging them up on the assumption that somehow their defects will cancel each other out is based on flawed logic, and it's a recipe for drawing fantastically wrong conclusions."
Haidt was correct, however, when he noted that I won't take seriously a field of study that can't produce a 3–1 odds ratio or greater, i.e., a subpopulation with roughly three times the risk of depression compared with similar people who use less social media. That's because there are so many studies that draw conclusions based on weak findings. It's not that studies with weaker findings "COULD conceivably be random noise," as Haidt wrote; we must assume they're random noise until a researcher can meet a high enough bar to demonstrate actual causation. If you lower the bar so that studies can be rigged to confirm our suspicions instead of actually testing them, statistics is worse than useless, because it gives a false veneer of rigor, or what the economist Friedrich Hayek called "scientism."
When a researcher can refine a result to a 3–1 odds ratio, there probably is an important causal connection to investigate. When no one can, the research is probably worth discounting.
Of course, investigations rarely start with 3–1 odds ratios. Skilled researchers know how to home in on an observation. Maybe you see in a broad population survey that heavy social media users have a 10 percent higher rate (a 1.1 odds ratio) of being admitted to emergency rooms for self-harm. So you do more research and zero in on teenage girls, getting a 2–1 odds ratio. Further studies isolate certain types of teenage girls (perhaps ones with single parents and no siblings) and specific types of mental health issues (perhaps insomnia caused by anxiety). Once you get to the 3–1 odds ratio, you have a proverbial smoking gun.
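To make the arithmetic concrete, here is a minimal sketch, with entirely made-up numbers, of how an odds ratio falls out of a 2×2 exposure-outcome table:

```python
# A 2x2 exposure-outcome table with entirely invented numbers:
#
#                        self-harm ER visit   no visit
#   heavy social media           30              970
#   light social media           10              990

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """(Odds of the outcome among the exposed) / (odds among the unexposed)."""
    return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)

print(round(odds_ratio(30, 970, 10, 990), 2))  # 3.06 -- a "3-1" finding worth chasing
```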
Haidt compared his research quest to a civil trial in which a preponderance of the evidence is sufficient. But he hasn't come close to reaching that bar. The analogy also fails because legal trials have to result in verdicts, but for policy questions there's always the option of not making any changes because the statistical evidence doesn't lead to a strong enough conclusion.
So what is an appropriate standard? If you are going to recommend parents think twice about buying a 12-year-old girl a smartphone, guessing that it could be harmful even in the absence of statistical evidence may suffice. But if you are going to recommend new laws, as Haidt has, which are ultimately enforced by state violence, you must clear a higher bar. And even then you need to consider the likelihood that your intervention may not yield the effect you want.
Contrary to Haidt's claim that I'm dismissive of all social science research, I actually found one excellent study, the first on its list, titled "A Large-Scale Test of the Goldilocks Hypothesis: Quantifying the Relations Between Digital-Screen Use and the Mental Well-Being of Adolescents." It pre-registered its design, a simple step that hugely increases credibility. It used a large and carefully selected sample with a high response rate. And it employed exploratory data analysis rather than cookbook statistical routines.
But it didn't measure the variables Haidt is interested in. (Its results, in fact, strongly undercut his thesis, as I'll explain in a moment.) Indeed, none of the studies I looked at in Haidt's compendium studied depressed teenage girls who used social media. Instead, researchers found data on other types of subjects that someone had compiled for another purpose. Few or none of the subjects were depressed teenage girls who used social media heavily.
Haidt also responded to my critique by asserting that "the map is not the territory. The dataset is not reality." Haidt argued correctly that a weak effect in one study might be the result of measurement error, misspecified models, insufficient data, or other problems. So one weak result doesn't mean there isn't a strong causal effect. In other words, if you can't find a lost city of gold on a map of the Americas, it doesn't mean there is no lost city of gold.
But he's using this metaphor to cover for the glaring deficiencies in the research he's assembled. Yes, the map is not the territory. If 301 maps have missed the lost city, I oppose policies that assume it exists.
All that said, must we still assume that Haidt is right because there are no other plausible explanations? Haidt conceded that there are problems with the research he's assembled (my claim, again, goes much further), but then concluded that nothing else "can explain the relatively synchronous international timing" of the mental health crisis and spiking use of smartphones and social media. Writing in the Washington Examiner in defense of Haidt, Tim Carney did not dispute my claim "that social science has yet to prove social media is harming the mental and emotional health of young people." But then he asserted that if "you know any significant number of teenagers, you know that this is true. If you spend any time on social media, you can see roughly why and how social media use would be both addictive and harmful."
The purpose of social science research is not to confirm but to challenge our knee-jerk assumptions, because reality is so complicated. It wasn't that long ago that everybody knew that homosexuals were rare and mentally troubled, that women took wolf whistles as compliments, and that sparing children the rod spoiled them. And most social changes remain unexplained. Why do crime rates, attitudes toward gay marriage, music, fashions, and everything else change the way they do? Why did the Arab Spring, reality television, and PT Cruisers come and go?
I grant that social media use is a plausible contributing factor to teenage girl depression, in terms of both psychosocial development and timing. But Haidt is claiming far more certainty than he should. Instead of compiling flawed studies to confirm his guess, he should ponder more seriously why he's been unable to find any well-executed studies that support his thesis.
The proper scientific approach is to try to falsify hypotheses, not to confirm them.
If Haidt wants parents to allow smartphones only for high school–age students, he should look into the age at which depressed teenage girls got smartphones and see whether it's younger than the average for similar girls. If it isn't, that would falsify Haidt's assumption and show the policy to be unfounded. We trust hypotheses that survive rigorous falsification efforts, not ones that are weakly confirmed by indirect and low-quality studies.
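Here is a rough sketch of what that falsification test could look like; the ages and the choice of a simple two-sample t-test are mine for illustration, not Haidt's or any study's:

```python
# Hypothetical ages (in years) at which girls got their first smartphone.
# All numbers are invented purely to show the shape of the test.
from scipy import stats

depressed_girls  = [10.5, 11.0, 11.2, 11.5, 11.8, 12.0, 12.3, 12.6]
comparison_girls = [11.4, 11.9, 12.1, 12.4, 12.7, 12.9, 13.1, 13.5]

t_stat, p_value = stats.ttest_ind(depressed_girls, comparison_girls)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# If the depressed group did NOT get phones significantly younger, the
# hypothesis fails this test and the case for the age restriction weakens.
```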
Haidt also wants schools to forbid phones during the school day. Why is there no study asking depressed teenage girls about their schools' phone rules? Did their schools have looser rules than would be expected by random chance?
Testing Haidt's proposal to limit social media access to kids over 16 is a little trickier, but you could gather some evidence as to whether it would work. In many states the school-year cutoff is September 1, so the girls in a seventh-grade class turn 13 between September 1 and August 31. Girls born in September will be internet adults under current law for most of seventh grade, while girls born in August will be internet minors until eighth grade. If this legislation would help, we'd expect to see higher rates of teenage girl depression for September birthdays than for August birthdays. Of course, we'd have to adjust for specific state rules, and for children who are younger or older than usual for their grade.
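A toy version of that comparison, with invented numbers and assuming a September 1 cutoff, might look like this:

```python
# Hypothetical screening results for seventh-grade girls, split by birth month
# (1 = screened positive for depression, 0 = did not). Assumes a Sept. 1
# school cutoff; every number here is invented for illustration.
from statistics import mean

september_born = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]  # "internet adults" most of the year
august_born    = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]  # "internet minors" all year

print(f"September rate: {mean(september_born):.0%}")  # 30%
print(f"August rate:    {mean(august_born):.0%}")     # 20%

# If age-based access rules matter, September birthdays should show
# consistently higher rates; similar rates would weigh against the law.
```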
The good study that I mentioned above, "A Large-Scale Test of the Goldilocks Hypothesis," does shed some light on the likely impact of social media prohibition. The authors found the greatest well-being among moderate television watchers, video game players, computer users, and smartphone users, with both lower and higher use rates associated with lower well-being. Nearly all the bad studies use methods such as correlation that assume linear relations, and these are useless if the actual relation is nonlinear. The very similar graphs for these four different activities suggest that the specific activity doesn't matter. Fifteen-year-olds who spend seven or more hours a day, nearly all their free time, on any one activity report lower well-being than kids with varied activities. This suggests that even if depression is associated with heavy social media use, the problem is spending excessive time on any one activity, not anything uniquely dangerous about social media. Prohibiting social media would thus likely push kids to concentrate on other activities, eroding rather than improving mental health.
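To see the statistical point, consider a toy example with purely synthetic data, where well-being follows an inverted U and a linear correlation registers almost nothing:

```python
# Synthetic "Goldilocks" data: well-being peaks at moderate use, then falls.
# A straight-line measure (Pearson r) sees almost nothing.
import numpy as np

hours = np.linspace(0, 6, 100)            # daily hours spent on one activity
wellbeing = -(hours - 3.0) ** 2 + 9.0     # inverted U, peak at 3 hours (made up)

r = np.corrcoef(hours, wellbeing)[0, 1]
print(f"Pearson r = {r:.2f}")  # ~0.00 despite a strong, systematic relation
```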
This isn't surprising, because prohibition rarely has its intended effect. It often drives behaviors underground, making them harder to monitor and thus less safe. In this case, if there were a law preventing teenagers from having social media accounts, they might switch to harder-to-regulate forms of connecting online that are further outside their parents' purview. Or they might find new activities with new risks. The only certainty is that they won't go back to doing what teenagers did 20 years ago. So there is no reason to assume depression rates will fall to 2003 levels even if leisure activities are the key driver of depression.
Raising kids is hard, and no new social media law is going to make it easy. Parents need freedom, information, and help more than they need acts of Congress. I applaud calling attention ("alarm ringing," in Haidt's words) to the consequential choices of giving young people smartphones or allowing extensive social media use in elementary and middle school. Responsible parents will keep their eyes on social media use, especially if it consumes most of a teenager's spare time or seems to involve negative moods and emotions. Haidt's writings might help focus and inform that attention. But I don't see anything like the evidence I would need to support the legislation Haidt is calling for, and he should eliminate the many, many deeply flawed studies from his analysis.