Want ethical AI? Hand the keys to middle schoolers.


LI XIN ZHANG’S SUMMER CAMP began with sandwiches—not eating them but designing them. The rising seventh grader listened as teachers asked her and her peers to write instructions for building the ideal peanut butter, jelly, and bread concoction. Heads down, the students each created their own how-to.

When they returned to the Zoom matrix of digital faces and told one another about their constructions, they realized something: Each of them had made a slightly different sandwich, favoring the characteristics they held dear. Not necessarily good, not necessarily bad, but definitely not neutral. Their sandwiches were biased. Because they were biased, and they had built the recipe.

The activity was called Best PB&J Algorithm, and Zhang and more than 30 other Boston-area kids between the ages of 10 and 15 were embarking on a two-week initiation into artificial intelligence—the ability of machines to display smarts typically associated with the human brain. Over the course of 18 lessons, they would focus on the ethics embedded in the algorithms that snake through their lives, influencing their entertainment, their social lives, and, to a large degree, their view of the world. Also, in this case, their sandwiches.

“Everybody’s version of ‘best’ is different,” says Daniella DiPaola, a graduate student at Massachusetts Institute of Technology who helped develop the series of lessons, which is called Everyday AI. “Some can be the most sugary, or they’re optimizing for an allergy, or they don’t want crust.” Zhang put her food in the oven for a warm snack. A parent’s code might take cost into account.
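In code, the camp's point might look something like the toy sketch below: two "best sandwich" scoring functions that encode different values. The ingredients, weights, and function names are invented for illustration, not taken from the lesson.

```python
# A toy illustration of the Best PB&J Algorithm idea: two scoring
# functions that define "best" differently. The ingredients and weights
# below are invented, not taken from the camp's lesson.

def kid_score(sandwich):
    # One kid optimizes for sweetness and hates crust.
    return 2 * sandwich["jelly_tbsp"] - 5 * sandwich["has_crust"]

def parent_score(sandwich):
    # A parent's version might weigh cost heavily.
    return sandwich["jelly_tbsp"] - 3 * sandwich["cost_dollars"]

pbj = {"jelly_tbsp": 3, "has_crust": True, "cost_dollars": 1.50}
print(kid_score(pbj), parent_score(pbj))  # Same sandwich, two different "bests"
```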

A pricey PB&J is low on the world’s list of concerns. But given a familiar, nutrient-rich example, the campers could squint at bias and discern how it might creep into other algorithms. Take, for example, facial recognition software, which Boston banned in 2020: This code, which the city’s police department potentially could have deployed, matches anyone caught on camera to databases of known faces. But such software in general is notoriously inaccurate at identifying people of color and performs worse on women’s faces than on men’s—both of which lead to false matches. A 2019 study by the National Institute of Standards and Technology used 189 algorithms from 99 developers to analyze images of 8.49 million people worldwide. The report found that false positives were uniformly more common for women and up to 100 times more likely among West and East African and East Asian people than among Eastern Europeans, who had the lowest rate. In a domestic database of mug shots, false-positive rates were highest for American Indians and elevated for Black and Asian populations.

The kids’ algorithms showed how preference creeps in, even in benign ways. “Our values are embedded in our peanut butter and jelly sandwiches,” DiPaola says.

The camp doesn’t aim to depress students with the realization that AI isn’t all-knowing and neutral. Instead, it gives them the tools to understand, and perhaps change, the technology’s influence—as the AI creators, consumers, voters, and regulators of the future.

To accomplish that, instructors based their lessons on an initiative called DAILy (Developing AI Literacy), shaped over the past few years by MIT educators, grad students, and researchers, including DiPaola. It introduces middle schoolers to the technical, creative, and ethical implications of AI, taking them from building PB&Js to totally redesigning YouTube’s recommendation algorithm. For the project, MIT partnered with an organization called STEAM Ahead, a nonprofit whose mission is to create educational opportunities for Boston-area kids from groups traditionally underrepresented in scientific, technical, and artistic fields. They did a trial run in 2020, then repeated the curriculum in 2021 for Everyday AI, expanding the camp to include middle-school teachers. The goal is for educators across the country to be able to easily download the course and implement it.

DAILy is designed to enable average people to be better informed about AI. “I knew that AI was pretty helpful for humans, and it might be a huge part of our life,” Zhang says, reflecting on what she’d learned. When she started, she says, “I just knew a little bit, not a lot.” Coding was totally new to her.

DAILy’s creators and instructors are at the forefront of a movement to bake ethics into the development process, as opposed to its being an afterthought once the code is complete. The program isn’t unique, though others like it are hardly widespread. Grassroots efforts range from a middle-school ethics offering in Indiana called AI Goes Rural to the website Explore AI Ethics, started for teachers by a Minnesota programmer. The National Science Foundation (NSF) recently funded a high-school program called TechHive AI that covers cybersecurity and AI ethics.

Historically, ethics hasn’t been incorporated into technical AI education. “It’s something that has been lacking,” says Fred Martin, professor and associate dean for teaching, learning and undergraduate studies at the University of Massachusetts Lowell. In 2018, Martin co-founded the AI4K12 initiative, which produced guidelines for teaching AI in K–12 schools. “We conceived of what we call five big ideas of AI, and the fifth is all about ethics.” He’s since seen AI ethics education expand and reach younger students, as evidenced by AI4K12’s growing database of resources.

The directory links to MIT offerings, including DAILy. Ethics is “front and center in their work,” Martin says. “It’s important that kids begin learning about it early so they can be informed citizens.”

At the Everyday AI workshop, the hope is that students will feel empowered. “You do have agency,” says Wesley Davis, an instructor at the 2020 pilot camp. “You have the agency to understand. You have the agency to explore that curiosity, down to creating a better system, creating a better world.

“That’s a little flowery-philosophical,” he laughs. But that peculiar mix of idealism and cynicism is the specialty of teenagers. And so when asked if she thought she could, someday, make AI better than today’s, Zhang gave a resounding “Maybe.”

[Illustration: binary code balancing on scales]

DAILY BEGAN as a way to right a wrong. Blakeley Payne (née Hoffman), a computer science major at the University of South Carolina, was hanging out in 2015 with her best friend, who had just applied for a job at Twitter. The rejection came back in a blink. How could the company possibly have decided so quickly that she wasn’t “a good fit”? They posited that perhaps an algorithm had made the decision based on specific keywords. Mad, Payne began reading up on research about bias in AI and the inequities it causes.

Since Payne’s experience, AI partiality in hiring has become a famously huge problem. Amazon, for instance, made headlines in 2018 when Reuters reported that the company’s recruitment engine discriminated against women—knocking out résumés containing the word “women’s” (as in “women’s chess club captain”) and penalizing applicants for having gone to women’s colleges. Turns out developers had trained their algorithm using “résumés submitted to the company over a 10-year period,” according to Reuters, most of which had come from men. A 2021 paper in the International Journal of Selection and Assessment found that people largely rate a human’s hiring judgment as more fair than an algorithm’s, though they often perceive automation to be more consistent.

At first, the whole situation soured Payne on her field. Ultimately, though, she decided to try to improve the situation. When she graduated in 2017, she enrolled at MIT as a graduate student to focus on AI ethics and the demographic where education could make the most difference: middle-school students. Kids this age are often labeled “AI natives.” They’ve never not known the tech, are old enough to consider its complications, and will grow up to make the next versions.

Over the next couple of years, Payne developed one of the first AI ethics curricula for middle schoolers, and her master’s thesis helped inform another set of interactive lessons, called “How to Train Your Robot.” When she graduated in 2020 and went on to do research at the University of Colorado Boulder, MIT scholars like DiPaola continued and expanded her efforts.

Payne’s projects helped lay the groundwork for the larger-scale DAILy program, funded by the NSF in March 2020. DAILy is a collaboration among the MIT Scheller Teacher Education Program (STEP), Boston College, and the Personal Robots Group at the MIT Media Lab, an interdisciplinary center where DiPaola works. A second NSF grant, in March 2021, funds a training program to help teachers use DAILy in their classrooms. By forging partnerships with districts in Florida, Illinois, New Mexico, and Virginia and with youth-education nonprofits like STEAM Ahead, the MIT educators are able to see how their ivory-tower lessons play out. “The proving ground for any curriculum is in the real classroom and in summer camps,” says DiPaola.

When those kids—and many adults, even—think of AI, one thing usually comes to mind: robots. “Robots from the future, killer robots that will take over the world, superintelligence,” says DiPaola. “It was a big shock to them that AI is actually in the technologies they use every single day.”

Teachers have often told the STEP Lab’s Irene Lee, who oversees the grants, that they didn’t realize AI was being “deployed.” They thought it was an abstraction in labs. “‘Deployed’?!” Lee says to them. “You’re immersed in it!”

It’s in smart speakers. It recommends a Netflix film to chill to. It suggests new shoes. It helps give the yea or nay on bank loans. Companies weed out job applicants with it; schools use it to grade papers. Perhaps most importantly to the summer-camp students, it powers apps like TikTok and whatever meme-bending video the platform surfaces.

They know that when they’re looking at cat-mischief TikToks, they’ll get recommendations for similar ones, and that their infinite scroll of videos is different from their friends’. But they don’t usually realize that those results are AI’s doing. “I didn’t know all these facts,” says Zhang.

Soham Patil, one of her camp-mates, agrees. A rising eighth grader, he’d been studying how AI works and writing software recreationally for a few months before the program. “I kind of knew how to code, but I didn’t really know the practical uses of AI,” Patil says. “I knew how to use it but not what it’s for.”

PATIL, ZHANG, AND THEIR PEERS’ next activity involved a different food group: noodles. They saw on their screens a member of a strange royal family—a cat wearing a tiara, with hearts for eyes.

“There is a land of pasta known for most excellent cuisine with a queen who wants to classify all the dry pasta in her land and store them in bins,” reads the lesson. “… YOU, as a subject in PastaLand, are tasked with building a classification system that can be used to describe and classify the pasta so the pasta can easily be found when the queen wants a certain dish.”

Ethics of monarchy aside, the students’ goal was to develop an identification system called a decision tree, which classifies objects by asking a series of questions about their characteristics: the first question splits the pile into two groups, the next splits each of those in two again, and so on until only one kind of object is left in each group. For pasta, STEP Lab’s Lee explains, “The first question could be, ‘Is it long?’ ‘Is it curly?’ ‘Does it have ridges?’ ‘Is it a tube?’” Zhang’s team started with “Is it round?” “Is it long?” and “Is it short?”
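To see the structure the campers were building, here is a minimal sketch of a pasta decision tree in code. The traits, questions, and bin names are hypothetical examples, not the students' actual classification keys.

```python
# A minimal sketch of a pasta decision tree in the spirit of the PastaLand
# exercise. The traits, questions, and bin names are hypothetical, not the
# students' actual keys.

def classify_pasta(pasta):
    """Sort a pasta described by yes/no traits into a bin, one question at a time."""
    if pasta["is_long"]:
        if pasta["is_tube"]:
            return "bucatini bin"
        return "spaghetti bin"
    if pasta["is_curly"]:
        return "fusilli bin"
    if pasta["has_ridges"]:
        return "radiatori bin"
    return "orecchiette bin"

print(classify_pasta({"is_long": False, "is_tube": False,
                      "is_curly": True, "has_ridges": True}))
# -> fusilli bin
```

Each branch is one of the queen's questions; a different set of questions, say about how much sauce a shape holds, would sort the same pasta into different bins.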

As before, though, when the kids reassembled, they realized their questions were all different: Some might ask whether a piece of pasta can hold a lot of sauce or only a little. Another might separate types based on whether they’re meant to be stuffed or not. Patil noticed that some kids would try to separate the unclassified pasta into two roughly equal groups at every juncture.

“Could someone who is blind follow their key?” the teachers asked. What about the subjectivity in simply determining what “long” is? Even pasta was influenced by culture, experience, and ability. The students then extended this realization—that it’s easy to bake in bias, exclude people, or misread your opinions as objective—to higher-stakes situations. Predictive policing is an example. The technology uses past crime data to forecast which areas are high risk or who is purportedly most likely to offend. But any AI that uses legacy data to predict the future is liable to reinforce past prejudices. A 2019 New York University Law Review paper looked at case studies in Illinois, Arizona, and Louisiana and noted that a failure to reform such systems risks “creating lasting consequences that will permeate throughout the criminal justice system and society more widely.”

The students could see, again, how AI-based choices affect outputs. “They can know, ‘If I design it this way, these people will be impacted positively, these people will be impacted negatively,’” says DiPaola. They can ask themselves, How do I make sure the most vulnerable people are not harmed?

AI developers find themselves grappling with these questions more frequently, in part because their work now touches so many aspects of people’s lives. The biases in their code are largely society’s own. Take recommendation algorithms like YouTube’s, which former Google developer Guillaume Chaslot asserts drive viewers toward more sensationalistic, more divisive, often misinformation-laden videos—to keep more people watching longer and attract advertising. Such a choice arguably favors profits over impartiality.
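A toy sketch can make that design choice plain. This is not YouTube's actual system; the videos, scores, and objectives below are invented to show how the ranking objective decides what surfaces first.

```python
# A toy illustration (not YouTube's actual system) of how the choice of
# ranking objective shapes what gets recommended. The video data is invented.
videos = [
    {"title": "Calm explainer",      "predicted_watch_minutes": 4,  "accuracy_score": 0.9},
    {"title": "Outrage compilation", "predicted_watch_minutes": 11, "accuracy_score": 0.3},
]

# Optimizing purely for watch time surfaces the sensational clip first...
by_engagement = sorted(videos, key=lambda v: v["predicted_watch_minutes"], reverse=True)

# ...while a different objective would rank the same videos differently.
by_accuracy = sorted(videos, key=lambda v: v["accuracy_score"], reverse=True)

print([v["title"] for v in by_engagement])  # ['Outrage compilation', 'Calm explainer']
print([v["title"] for v in by_accuracy])    # ['Calm explainer', 'Outrage compilation']
```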

By teaching kids early what ethical AI looks like, how unfairness gets in there, and how to work around it, educators hope to enable them to recognize that unfairness when it occurs and devise strategies to correct the problem. “Ethics has been taught either as a completely separate course or in the last two or three lessons of a semester course,” says DiPaola. That, she says, conveys an implicit lesson: “Ethics doesn’t need to be thought of at the same time as you’re actually building something, or ethics is kind of an afterthought.”

Better integration of ethics is important to Denise Dreher, a database programmer who recently retired from the IT department of St. Paul, Minnesota’s Macalester College. As a personal project, she has been cataloging curricula like DAILy and making the K-12 lessons available on her website, Explore AI Ethics, for teachers to use in the classroom. She believes that AI education should look more like engineering instruction. “There’s a long and very good tradition of safety and ethics for engineer training,” she says, “because it’s a profession,” one with a codified career path. You can’t just go build a bridge, or get through bridge-building school without having to work through the implications of your bridge.

“AI?” she continues. “Any 10-year-old in your basement can do it.”

[Illustration: binary code as colorful toys]

AS CAMP PROGRESSED, the ethical questions grew bigger, as did the technology that students dealt with. One day, Mark Zuckerberg—CEO of Facebook, a social network largely populated by olds—appeared on their screens. “I wish I could keep telling you that our mission in life is connecting people, but it isn’t,” Zuckerberg said. “We just want to predict your future behaviors. … The more you express yourself, the more we own you.”

That would be an unusually candid speech. And, actually, the whole thing looked a little off. Zuckerberg’s eyelids were a little blurrier than the rest of him. And he stared at the camera without blinking for longer than a normal person would. These, instructors pointed out, are tells.

He didn’t look like a normal person because he wasn’t a normal person. He wasn’t even a real person. He was a deepfaked videomorph giving a deeply faked speech. A deepfake is footage or an image produced by an AI after it parses lots of footage or photos of someone. In this case, the software learned how Zuckerberg looks and sounds saying different words in different situations. With that material, it assembled a Zuck that doesn’t exist, saying something he never said. “It’s kind of hard to think how AI could create a video,” says Patil.

Zhang, whose preferred social medium is YouTube, watches a lot of videos and already assumed that not all of them are “real”—but didn’t have any tools to parse truth from fiction till this course.

The campers had all likely encountered AI-based fakery before. An app called Reface, for example, lets them switch visages with another person—a popular TikTok hobby. FaceTune conforms selfies to conventional European standards of beauty, bleaching teeth, slimming noses, pouting up lips. But they can’t always tell when someone else has been tuned. They may just think that so-and-so had a good complexion day.

In fake visual media, the real and synthetic—the human and the AI—have two faces that look nearly identical. When the kids fully grasp that, “It’s a moment where shit gets real, so to speak,” says Gabi Souza, who worked at the camp both summers. “They know that you can’t trust everything you see, and that’s important to know, especially in our world of so much falsehood so widely propagated.” They at least know to question what’s presented.

NOT ALL LESSONS went over so well. “There are a couple of activities that even in person would be scratching at the top level of comprehension,” says instructor Davis. Patil, for instance, had a hard time understanding the details of neural networks, software inspired by the brain’s interconnected neurons. The goal of the code is to recognize patterns in a dataset and use those patterns to make predictions. In astronomy, for instance, such programs can learn to predict what type of galaxy is shining in a telescope picture. At camp, the kids acted like the nodes of a neural network to predict the caption for a photo of a squirrel “water-skiing” in a pool. It worked kind of like a game of telephone: Teachers showed the picture to several students, who wrote down keywords describing it, and then each passed a single word on to students who hadn’t seen the image. Those kids each picked two words to pass to a final camper, who chose four words for the caption. For the “nodes,” understanding their role in that network, and transposing that onto software, was hard.
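For readers curious what those interconnected nodes look like in software, here is a minimal sketch of the general idea, not the camp's activity or any production system; the photo features, weights, and caption score are invented for illustration.

```python
# A minimal sketch of the idea behind a neural network: layers of simple
# nodes that each combine their inputs and pass a signal forward, a bit
# like campers passing keywords down the line. The feature names and
# weights are invented; a real network would learn its weights from data.
import numpy as np

def layer(inputs, weights, biases):
    """One layer of nodes: a weighted sum of inputs, squashed to between 0 and 1."""
    return 1 / (1 + np.exp(-(weights @ inputs + biases)))

# Made-up signals describing a photo: [has_water, has_animal, is_outdoors]
photo = np.array([1.0, 1.0, 1.0])

hidden = layer(photo,
               np.array([[0.8, 0.9, 0.1],
                         [0.2, 0.1, 0.9]]),
               np.array([-0.5, -0.3]))
output = layer(hidden, np.array([[1.2, 0.7]]), np.array([-0.8]))

print(f"'water-skiing squirrel' caption score: {output[0]:.2f}")
```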

But even with the activities that didn’t melt youthful brains, how well a lesson went depended on “how many students had breakfast this morning, is it Monday or is it Thursday afternoon,” says Davis. It wasn’t all canoes and archery, like traditional camps. “It’s a lot of work,” says Zhang.

Making AI education accessible, and diversely implemented, is more complicated than teaching it in person to private-school kids who get MacBook Pros. While the collaboration partners had always planned to make the curriculum virtual to make it more accessible, the pandemic sped up that timeline and highlighted where they needed to improve, like by making sure that the activities would work across different platforms and devices.

Then there are complications with the Media Lab’s involvement. The organization came under fire in 2019 for taking money and ostensible cultural cachet from convicted sex offender Jeffrey Epstein, which led to the departure of the lab’s director. Writer Evgeny Morozov, who researches the social and political implications of technology, pointed out in the Guardian that the “third culture” promoted by organizations like the lab—where scientists and technologists represent society’s foremost “deep thinkers”—is “a perfect shield for pursuing entrepreneurial activities under the banner of intellectualism.” Perhaps you could apply that criticism to Personal Robots director Cynthia Breazeal, whose company garnered around $70 million in funding between 2014 and 2016 for a “social robot” named Jibo that would help usher in a new era of human-machine interaction. The story had an unhappy ending: delayed shipments, dissatisfied customers, layoffs, a sell-off of intellectual property, and no real revolution.

But those too are perhaps good lessons for students to learn while they’re young. Flashy, fancy things can disappoint in myriad ways, and even places that teach ethics early can nevertheless have lapses of their own. And maybe that shouldn’t be so surprising: After all, the problems with AI are just human problems, de-personified.

The seamy undersilicon of AI—its discrimination, its invasiveness, its deception—didn’t, though, discourage campers from wanting to join the field, as both Zhang and Patil are considering.

And now they know that, more likely than not, no matter what job they apply for, an algorithm will help determine if they’re worthy of it. An algorithm that, someday, they might help rewrite.

This story originally ran in the Fall 2021 Youth issue of Popular Science.

Sarah Scoles

Contributing Editor

Sarah Scoles is a freelance science journalist and regular Popular Science contributor, who’s been writing for the publication since 2014. She covers the ways that science and technology interact with societal, corporate, and national security interests.