In Neal Stephenson’s 1995 science fiction novel, The Diamond Age, readers meet Nell, a young girl who comes into possession of a highly advanced book, The Young Lady’s Illustrated Primer. The book is not the usual static collection of texts and images but a deeply immersive tool that can converse with the reader, answer questions, and personalize its content, all in service of educating and motivating a young girl to be a strong, independent individual.
Such a device, even after the introduction of the Internet and tablet computers, has remained in the realm of science fiction—until now. Artificial intelligence, or AI, took a giant leap forward with the introduction in November 2022 of ChatGPT, an AI technology capable of producing remarkably creative responses and sophisticated analysis through human-like dialogue. It has triggered a wave of innovation, some of which suggests we might be on the brink of an era of interactive, super-intelligent tools not unlike the book Stephenson dreamed up for Nell.
Sundar Pichai, Google’s CEO, calls artificial intelligence “more profound than fire or electricity or anything we have done in the past.” Reid Hoffman, the founder of LinkedIn and current partner at Greylock Partners, says, “The power to make positive change in the world is about to get the biggest boost it’s ever had.” And Bill Gates has said that “this new wave of AI is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone.”
Over the last year, developers have released a dizzying array of AI tools that can generate text, images, music, and video with no need for complicated coding but simply in response to instructions given in natural language. These technologies are rapidly improving, and developers are introducing capabilities that would have been considered science fiction just a few years ago. AI is also raising pressing ethical questions around bias, appropriate use, and plagiarism.
In the realm of education, this technology will influence how students learn, how teachers work, and ultimately how we structure our education system. Some educators and leaders look forward to these changes with great enthusiasm. Sal Khan, founder of Khan Academy, went so far as to say in a TED talk that AI has the potential to effect “probably the biggest positive transformation that education has ever seen.” But others warn that AI will enable the spread of misinformation, facilitate cheating in school and college, kill whatever vestiges of individual privacy remain, and cause massive job loss. The challenge is to harness the positive potential while avoiding or mitigating the harm.
What Is Generative AI?
Artificial intelligence is a branch of computer science that focuses on creating software capable of mimicking behaviors and processes we would consider “intelligent” if exhibited by humans, including reasoning, learning, problem-solving, and exercising creativity. AI systems can be applied to an extensive range of tasks, including language translation, image recognition, navigating autonomous vehicles, detecting and treating cancer, and, in the case of generative AI, producing content and knowledge rather than simply searching for and retrieving it.
“Foundation models” in generative AI are systems trained on a large dataset to learn a broad base of knowledge that can then be adapted to a range of different, more specific purposes. This learning method is self-supervised, meaning the model learns by finding patterns and relationships in the data it is trained on.
Large Language Models (LLMs) are foundation models that have been trained on a vast amount of text data. For example, the training data for OpenAI’s GPT model consisted of web content, books, Wikipedia articles, news articles, social media posts, code snippets, and more. OpenAI’s GPT-3 models underwent training on a staggering 300 billion “tokens” or word pieces, using more than 175 billion parameters to shape the model’s behavior—nearly 100 times more data than the company’s GPT-2 model had.
By doing this analysis across billions of sentences, LLMs develop a statistical understanding of language: how words and phrases are usually combined, what topics are typically discussed together, and what tone or style is appropriate in different contexts. That allows them to generate human-like text and perform a wide range of tasks, such as writing articles, answering questions, or analyzing unstructured data.
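To make the idea of a "statistical understanding of language" concrete, here is a deliberately tiny sketch. It is not how GPT or any production LLM works: real models use neural networks trained on word pieces, whereas this example simply counts which word follows which (a bigram count). The three-sentence corpus and the function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration; real models see billions of sentences.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]

def count_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

def next_word_probabilities(follows, word):
    """Turn raw follower counts for `word` into relative frequencies."""
    counts = follows.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

follows = count_bigrams(corpus)
print(most_likely_next(follows, "the"))
print(next_word_probabilities(follows, "the"))
```

Even at this scale the counts capture a pattern: in this corpus, "cat" is the most common word after "the." An LLM learns far richer patterns of the same general kind, over much longer stretches of text.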
LLMs include OpenAI’s GPT-4, Google’s PaLM, and Meta’s LLaMA. These LLMs serve as “foundations” for AI applications. ChatGPT is built on GPT-3.5 and GPT-4, while Bard uses Google’s Pathways Language Model 2 (PaLM 2) as its foundation.
Some of the best-known applications are:
ChatGPT 3.5. The free version of ChatGPT released by OpenAI in November 2022. It was trained on data only up to 2021, and while it is very fast, it is prone to inaccuracies.
ChatGPT 4.0. The newest version of ChatGPT, which is more powerful and accurate than ChatGPT 3.5 but also slower and requires a paid account. It also has extended capabilities through plug-ins that give it the ability to interface with content from websites, perform more sophisticated mathematical functions, and access other services. A new Code Interpreter feature gives ChatGPT the ability to analyze data, create charts, solve math problems, edit files, and even develop hypotheses to explain data trends.
Microsoft Bing Chat. An iteration of Microsoft’s Bing search engine that is enhanced with OpenAI’s ChatGPT technology. It can browse websites and offers source citations with its results.
Google Bard. Google’s AI generates text, translates languages, writes different kinds of creative content, and writes and debugs code in more than 20 different programming languages. The tone and style of Bard’s replies can be adjusted to be simple, long, short, professional, or casual. Bard also leverages Google Lens to analyze images uploaded with prompts.
Anthropic Claude 2. A chatbot that can generate text, summarize content, and perform other tasks, Claude 2 can analyze texts of roughly 75,000 words—about the length of The Great Gatsby—and generate responses of more than 3,000 words. The model was built using a set of principles that serve as a sort of “constitution” for AI systems, with the aim of making them more helpful, honest, and harmless.
Ten minutes after class starts, a student flings open the door, struts in, and yells, “What’s up, bitches?”
If this kind of conduct is familiar to you, you don’t need a primer on how behavior has become worse—much worse—since students returned to school post-pandemic. Chances are you’ve observed just what the data from the National Center for Education Statistics report: 84 percent of school leaders say student behavioral development has been negatively impacted. This is evident in a dramatic increase in classroom disruptions, ranging from student misconduct to acts of disrespect toward teachers and staff to the prohibited use of electronic devices.
Bad behavior “continues to escalate,” said Matt Cretsinger, director of special services for the Marshalltown Community School District in Iowa. “There are more behavioral needs than we’ve ever seen. . . . It’s a shock to teachers.”
Student behavior is “definitely worse” post-pandemic, said Mona Delahooke, a pediatric psychologist. “There are much heavier stress loads that teachers and students are carrying around.”
And it’s not as if discipline weren’t a problem pre-pandemic. “The numbers tell the story,” said student-behavior specialist Ross Greene. “We’re suspending kids like there’s no tomorrow; we’re giving detentions even more than that. We’re expelling to the tune of 100,000 students a year.” Greene added that corporal punishment is at 100,000 instances a year, restraint or seclusion is close to that, and school arrests tally more than 50,000 a year.
Through the nonprofit organization he founded in 2009, Lives in the Balance, Greene and his colleagues train schools in his Collaborative & Proactive Solutions model and advocate for the elimination of punitive, exclusionary disciplinary practices in schools and treatment facilities.
In a small but growing number of schools, teachers and administrators are drawing on Greene’s advice to change how they handle misbehavior. Pointing to hundreds of research studies that say students who respond poorly to problems and frustrations are lacking skills, these schools are actively looking to end punitive discipline, take the focus off student behavior, and train their staffs to recognize—and avoid—situations likely to cause bad behavior. If something is triggering outbursts from students—simply asking them to sit quietly at their desks or giving them a surprise quiz, for instance—teachers might be better off finding other ways to accomplish what is needed.
Not blaming children for their outbursts requires a paradigm shift that, according to some practitioners, is long overdue.
Stuart Ablon, the founder and director of Think:Kids in Massachusetts General Hospital’s department of psychiatry, said simply, “We must move away from thinking students do well if they want to, to students do well if they can.”
Delahooke has her own go-to phrase: “Children don’t throw tantrums; tantrums throw children.”
And Robert Sapolsky, a noted neuroendocrinology researcher and Stanford University professor, goes even further when he traces how various influences, ranging from neurons and hormones to evolution, culture, and history, shape a person’s behaviors. “Biology is pretty much out of our control, and free will looks pretty suspect,” he said.
The Staying Power of Behaviorism
While these beliefs about student behavior and the growing number of schools adopting these disciplinary methods may seem new, leaders such as Ablon say they’ve been pushing this model for 30 years. And even though some schools are changing their practices, getting people to end their reliance on the punishments and rewards of behaviorism has proven difficult.
Behaviorism—the notion that behavior is shaped by conditioning via environmental stimuli (rewards and punishment)—was a popular theory in the early and mid-20th century. The irony, Ablon said, is that even when the idea was most in vogue, it was not effective. Punishment may put a stop to a certain behavior, but the effect is only temporary.
“It’s not only ineffective; it actually makes matters worse,” Ablon said.
A report that examined how discipline could alienate students from schools found that “when responses to student behavior fail to account for student perspectives and experiences, youths can experience feelings of alienation and disconnection.” Another study that looked specifically at why attempts to influence adolescent behavior often founder proposed the hypothesis “that traditional interventions fail when they do not align with adolescents’ enhanced desire to feel respected and be accorded status; however, interventions that do align with this desire can motivate internalized, positive behavior change.”
Part of the problem is that even when people agree that suspensions and other punishments aren’t working, they fall back on these patterns if they lack an alternative, according to Greene.
“The old mentality is dying hard,” Greene said. “People know a certain way of doing things. They have structures in place [that reinforce those practices]. You’ve got to replace what you’re doing with something; there can’t be a vacuum.”
“The research is pretty clear about what works and what doesn’t,” said Cretsinger. “There’s a significant delay between research and school practice.”
A 2021 study by the American Institutes for Research concluded that out-of-school suspensions for middle school students “actually had a negative effect on . . . students’ future behavioral incidents.” These students were also more likely to be suspended in the future, the study found.
While the study did not report the same effect for high school students, it did conclude that severely disciplining these older students “does not serve as a deterrent for future misbehavior.”
“Our educational system is in the dark ages when it comes to understanding behaviors,” said Delahooke. “That’s the bottom line.”
It’s understandable. The education world is awash in articles trying to figure out what artificial intelligence is going to mean for schools and students (see “AI in Education,” features, Fall 2023). But before we get too focused on the latest technological breakthrough, let’s not pretend that we have figured out how to cope with the previous one. Over the last decade, smartphones have become commonplace. Today, 95 percent of American teenagers have a supercomputer in their pocket.
Jonathan Haidt, Jean Twenge, and others have brought necessary attention to the likelihood that smartphones and social media are partly to blame for the teenage mental health epidemic gripping our nation. It’s not a watertight case, because it’s nearly impossible to prove a causal relationship with a phenomenon as ubiquitous as this one.
What scholars can say is that the sudden rise in teenage anxiety and depression, suicidal ideation, and suicide all happened at the same time that teenagers’ adoption of smartphones passed the 50 percent mark—around 2012 or 2013. They can also show that the children most likely to engage in heavy use of smartphones and social media—girls, especially liberal girls—also experienced the greatest increase in mental health challenges. And they can point to other countries that show similar patterns.
My purpose here is not to evaluate this evidence, though I generally agree with Haidt that we should adopt the precautionary principle and assume that phones and social media are likely doing real damage to our kids. Then we should act accordingly.
My immediate question, however, is whether phones and social media might also be behind the plateauing and decline of student achievement that we’ve seen in America, also starting around 2013, long before pandemic-era shutdowns sent test scores over a cliff.
I don’t believe this was the only cause of our achievement woes in the 2010s. As I’ve argued before, I believe the Great Recession was also to blame, both because of its impact on families’ home circumstances, and because of the sudden and significant budget cuts that followed in 2013 and 2014, especially in high-poverty schools. Kirabo Jackson has been particularly persuasive that these spending cuts had a measurable negative impact on achievement (see “The Costs of Cutting School Spending,” research, Fall 2020). Another potential factor was a shift away from school accountability; in 2012 the Obama administration softened the consequences for low test scores targeted by the No Child Left Behind Act. Then, in 2015, Congress replaced it with the Every Student Succeeds Act.
But I do think we need to take the smartphone hypothesis seriously, especially because, unlike the Great Recession or the pandemic, these trends are not receding in the rearview mirror. Indeed, adolescent phone use continues to rise. If it is one reason that students aren’t learning as much as they did in the pre-smartphone era, that’s a problem we need to grapple with.
Figure 1: Explosive Growth in Adolescents with Smartphones
So what’s the evidence? First and foremost, as mentioned above, the timing lines up (see Figures 1 and 2). We see smartphone ownership really taking off among adolescents in middle and high school around 2013. That’s also when median achievement on the 8th-grade math test in the National Assessment of Educational Progress (NAEP) peaked. It’s fallen modestly ever since. For our lowest-performing students, those at the 10th and 25th percentiles, the declines were more dramatic.
Figure 2: Declines in Math Performance
Another piece of evidence comes from Catholic schools, which serve as a plausible control group for the smartphone hypothesis (see Figure 3). Catholic-school students also take NAEP math and reading tests. But they are not directly impacted by changes in education policy such as the shifts in federal school-accountability rules or cuts in public-school spending. So if Catholic schoolkids also saw achievement declines around 2013, which in fact happened, especially in reading, that could be an indication that something outside education policy is to blame.
Figure 3: Similar Trends in Catholic Schools
But there is also some conflicting evidence. The drops in achievement in the 2010s tended to be for our lowest-achieving students, who are disproportionately poor, Black, Hispanic, and male. And yet, as we know from the studies that Haidt and others point to, phone and social media use was most concentrated among middle-class girls (at least initially). So that doesn’t match up.
Before I conclude with the obligatory call for more research, it’s worth pondering what mechanisms could link smartphone and social media use to lower student achievement. Most obvious are problems around attention, as students’ brains adapt to the rush from “likes,” YouTube videos, TikToks, and other platforms, and then struggle to listen to (much less read) slower-moving and less-vivid presentations, such as the ones they are likely to encounter in class and homework. (Our poor teachers!) Or it could be phones’ impact on mental health; it’s hard to learn when you’re anxious or depressed.
There’s also the issue of sleep (see Figure 4). This is cited in the mental health literature, too, as we know that kids sleep less today than before phones and social media entered the scene, and we also know that there’s a relationship between less sleep and poor mental health.
Figure 4: Teens Sleeping Less
But so too is there a relationship between less sleep and less student learning. After all, sleep is when the brain works much of its magic, forming connections and cementing ideas in long-term memory. Plus, it’s hard to learn when you’re tired, and it’s really hard to learn when you stay home from school because you have been up much of the night. So there is an angle here that also connects with our chronic absenteeism crisis.
What to make of all of this? If we return to the precautionary principle, the least we can do is try to encourage parents to curb their tweens’ and teens’ phone and social media use. Educators can do their part by setting and enforcing classroom rules that phones be turned off or at least stowed away, unless there’s a compelling instructional reason to use them—though that is admittedly an uphill battle (see “Take Away Their Cellphones,” features, Fall 2022). Abolition is likely impossible, though some legislative proposals to make it harder for kids to access social media apps until they are 16 might help. But schools could certainly encourage parents to limit screen time to a reasonable number of hours per day, be much tougher about earlier bedtimes, and require kids to dock their phones outside their bedroom during sleeping hours. There’s a strong foundation of research to back up any effort to protect and promote students’ sleep, which may help ease some uncomfortable conversations (see “Rise and Shine,” research, Summer 2019).
Indeed, more sleep might be the killer app that could make a huge difference, both for students’ academic achievement and for their mental health. It’s a good reminder that as we contemplate the future impact of AI on schools and society, what likely matters most is not the machines we use but the attention we give to our children’s timeless human needs.
Michael J. Petrilli is president of the Thomas B. Fordham Institute, visiting fellow at Stanford University’s Hoover Institution, and an executive editor of Education Next.