Could this game replace the SAT?
Jack Buckley, Imbellus: “If people teach to the test, then we need to make tests that are worth teaching to.”
Transcript
ARCHIVAL: We’re here today to announce charges in the largest college admissions scam ever prosecuted by the Department of Justice.
CATERINA FAKE: Hi, it’s Caterina. About a year ago, the big news in education was the SAT cheating and college admissions scandal. Code name: Varsity Blues. It was all very sensational.
ARCHIVAL: Business executives and yes, Hollywood celebrities, caught rigging the system. Is there a rethinking now of the whole college admissions process?
FAKE: The test with little bubbles you fill in with a number two pencil has weathered many controversies over the years. The SAT has been around for almost a century.
ARCHIVAL: His crime was taking the SAT and ACT tests for other people.
FAKE: Now, the testing giants are again in the position of defending the value of their exams.
ARCHIVAL: Testing centers have been closed. Everything’s come to a standstill with all exams cancelled or postponed due to the coronavirus outbreak.
ROB FRANEK: We don’t use the word “tectonic” lightly at Princeton Review. But it will be tectonic.
FAKE: Rob Franek is editor-in-chief at the Princeton Review, a test-prep company.
None of us chose the reason for this testing standstill. But it could be just the opening needed for some tech pioneers to deliver a brand-new standard for testing. Something more interactive. Less biased. Something that looks like a video game.
REBECCA KANTAR: Our tests aim to be beautiful and interesting and replicate a lot of kinds of what you’d expect from a triple-A video game, but perhaps with fewer zombies and less shooting.
FAKE: Rebecca Kantar is a 28-year-old Harvard dropout. She’s challenging the dominance of outdated paper-and-pencil tests. And so far, she’s raised $25 million from people who agree that today’s students would benefit from a new kind of testing.
KANTAR: I think the bar for us is even higher than the bar for a traditional multiple-choice question. Simulation-based or game-based assessments are a totally new category of assessment, and they have never been used in a large-scale, high-stakes assessment situation.
FAKE: Kantar says what she has to offer could measure more than just how good someone is at algebra or finding the main point of a paragraph. It could reveal potential in high schoolers that isn’t currently measured – like empathy and problem solving.
But do we really need another high-stakes test to replace the old one? Is this the perfect time to change an outdated model? Or is it the worst time, when we’re still trying to answer basic questions like, “How will all students be allowed back on campus?”
[THEME MUSIC]
FAKE: Hey, it’s Caterina. Back in mid-March, the United States was just beginning self-quarantine. It was also the time of year when high schoolers were getting ready to take the SAT. At the last minute, the College Board, a nonprofit entity that oversees the SAT, closed all of its testing sites. Not just in the U.S. but around the world. But it happened so suddenly, so last-minute, that not everybody got the message.
In Southern California, one test site was nearly empty – its proctors replaced by honking geese. The only students who were there were a handful of teenagers who hadn’t gotten the memo from the College Board.
DAKOTA PEARSON: We are at Pasadena High School. I was here to take the SAT, but apparently it’s canceled, due to the virus. My name is Dakota Pearson. I’m 18 years old.
FAKE: Dakota acknowledges the importance of the SAT – even if he doesn’t personally believe in it.
PEARSON: I think it’s really important, because it determines your future, to get into college. Honestly, it’s not really necessary. It’s just a really hard test for no reason that we have to take.
FAKE: The College Board estimates that over a million students couldn’t take the SAT this spring, and subsequent SAT and ACT test dates were pushed back for months.
In September, half the students who registered were still unable to take the SAT because of pandemic restrictions. Clearly, the old system of college standardized testing was struggling in the face of the COVID-19 outbreak. But for some people, like Rebecca Kantar, the need for updated testing had already been apparent for years.
KANTAR: What’s so exciting and compelling is we actually have gotten the medium of simulation-based assessment to work pretty well. We’ve actually seen our tests be valid. We’ve seen them be reliable. We’ve seen them be fair.
FAKE: Kantar believes that tests like the SAT and ACT are inherently flawed. That’s because the tests focus on an outdated set of skills that require the memorization of facts in math, reading comprehension, writing, and basic science.
Students who have the money to pay for tutoring and test-prep services will always have the upper hand. Instead of measuring what a person knows, she says her tests analyze how a person thinks. And her next-generation test is interactive.
KANTAR: Our scenarios are a lot like little video games where, on the back end, we’re looking at the types of decisions you make, how you navigate information, the order of the steps you take in developing a mental model of what’s going on.
FAKE: Back in March, I spoke with Rebecca.
FAKE: Great talking to you.
FAKE: Rebecca’s testing company, Imbellus, is based in Los Angeles. But Rebecca was in quarantine back in Boston, with her family and her Dalmatian.
KANTAR: I’m very fortunate to be in a nice warm home with lots of food.
FAKE: The Dalmatian colleague actually also sounds ideal.
KANTAR: Well, we have two labs, two yellow labs here and one Dalmatian, and I’ll tell you, it’s a study in genetics.
FAKE: Kantar’s an unlikely person to try to redesign the entire U.S. education system through the prism of an innovative, high-tech college entrance exam.
KANTAR: I went to public school all the way through. I experienced my first dislike of school, I would say, after kindergarten. I always felt it was downhill from there.
FAKE: But outside the classroom, Kantar was an energetic overachiever. She learned Mandarin in junior high. She even secured a grant and produced Cinderella in Chinese. When she was just 14, Kantar helped launch a national campaign.
KANTAR: …working on combating child sex trafficking in the United States.
FAKE: Then she got into Harvard. But the zenith of American education felt like a letdown.
FAKE: And then, what gave you the confidence to walk away?
KANTAR: Well, at the time, it was really not an act of courage or defiance. It was really, I was just unhappy, and I had at the time been fortunate enough to know hundreds of other young entrepreneurs and social entrepreneurs who I’d met throughout my nonprofit work in high school and at Harvard.
So I had this kind of network of do-gooders and really interesting entrepreneurial folks. And that’s what I’d really built out first.
FAKE: Rebecca Kantar left Harvard after her sophomore year to launch her first company. But her parents weren’t exactly thrilled – at first.
KANTAR: I think the turning point for me was actually a year after I’d left school, I was at a Christmas party that my family has gone to in Boston every year forever. And my mom was explaining that I had left school. And a woman overheard her and just said, “Oh, I’m so sorry that your daughter didn’t make it through.” And my mom kind of responded, “Well, you know, she’s been getting an MBA for free in the real world, building something, and she loves it. And she’s happy every day.”
FAKE: Then at 21, Kantar began working on Imbellus.
KANTAR: I started really trying to understand how might you change high school in particular so that it leaves all kids – no matter what they’re going to pursue next – you leave them with a really solid foundation of skills that they need to be functional adults. And that was my driving question.
FAKE: She says one thing that drives what gets taught in high schools is what gets tested. So she wants to measure traits employers say they need for the future – empathy, creativity, collaboration, problem solving. But persuading the powerful and change-resistant testing industry to try something new, like using a game-based assessment, wouldn’t be easy. Especially for a 21-year-old with no gaming experience and no background in education.
FAKE: Thank you for sending over the screenshots. They looked a lot like a video game. Was that on purpose?
KANTAR: Yeah. So all of our content looks and feels like a video game. You know, one of my initial theories in starting Imbellus was, if people are engaged and not just because they’re running on adrenaline and fear of what happens if I don’t do well on this test but because they actually find the content interesting and stimulating and beautiful and awe-inspiring, perhaps you get kids to put their best foot forward, right? Maybe you can help make that assessment experience an authentic view of how people actually think and what they’re like.
FAKE: And also, no black-and-white sheets, numbered questions, and bubbles to fill in.
KANTAR: Nope. We really are more interested in when we set up an interesting scenario, and you start to understand what’s going on, how do you use the information there? What kind of actions do you take to help us understand your mental model and the way you’re approaching solving it?
FAKE: So, I’m a test taker. Put me in the room. What’s it like?
KANTAR: An example of a scenario would be something like: It’s your job to figure out how to stop an invasive species from reaching a native plant. And you’re kind of seeing this game board where there are different tiles that have different properties. There might be a mountain tile which causes an invasive species to re-route and seek a different path. There might be a rocky terrain which causes an invasive species to spend an extra turn. It’s also your job to think about how to deploy different defenders who do damage to the invading populations.
How do you think through order of consequences? How do you adapt based on the information the scenario gives back to you? And how do you really moderate and make a different plan and edit your approach? So, that’s an example.
FAKE: Then an administrator or an analyst has to interpret the data to tell us what the test is revealing about each person. Tell me a little bit about how that works.
KANTAR: Yes. So right now, we do not provide student-level or applicant-level results that are just delivered to you as an individual test taker. That obviously has to change when you’re working in the education system.
FAKE: It’s not being used in schools yet, but Imbellus has been working with management consulting firm McKinsey & Company to use their assessment as part of their recruitment and hiring process.
Since Rebecca Kantar didn’t start Imbellus with a background in video games or testing, she hired to compensate. At Imbellus, she put together a team of education-establishment veterans – like Jack Buckley as President and Chief Scientist.
KANTAR: I met him while he was at AIR, the American Institutes for Research. Jack had previously been the head of the National Center for Education Statistics. He’d also worked on the redesign of the SAT at the College Board. So, Jack was someone who had very much been not only in the belly of the beast, but controlling the entire beast.
JACK BUCKLEY: Yes, hello.
FAKE: Hi Jack, how are you?
FAKE: Rebecca and I called Jack at his home in New York City.
FAKE: What brought you to Imbellus?
BUCKLEY: You know, one of the things that I think a lot of people in the field have always wanted to do is to get more engaging, authentic assessments, to better integrate gameplay and scenarios to try to build the kinds of assessments that we’re building now at Imbellus. And for me, this was really a unique opportunity to try to build the kinds of assessments that we’ve always wanted to have as a field.
FAKE: And what is the utopian scenario that you envision for Imbellus?
BUCKLEY: Well, I mean, I think. That’s… I’m not a utopian-ist.
FAKE: He’s a more cautious man.
KANTAR: Jack’s about as practical a guy as you’re ever going to find. That’s why I wanted to make sure we asked him this uncomfortable question.
BUCKLEY: It’s funny. I measure things for a living. So, on a good day, I’m happy when we’re able to measure something with validity and reliability.
FAKE: Still, Jack does think this technology has the potential to do good.
BUCKLEY: This is going to sound trite, but if people teach to the test, then you need to make tests that are worth teaching to. And part of that is making sure that the types of cognition that you really want to see in your citizens and in your workforce, there’s an incentive for teachers to teach those things. For the system to say, it’s not just about drilling on these same Algebra 1 concepts, it’s actually demonstrating that you can think creatively or you can cooperate in a team.
FAKE: Jack Buckley and Rebecca Kantar are motivated to make a positive difference in young people’s lives. But this new high-stakes technology could go sideways. I asked Jack Buckley what keeps him up at night.
BUCKLEY: You know, the history of assessment is one with a lot of dead ends and sort of dark alleys. I think that what I worry about most in any next generation of assessment is that it’s misused, right? That somehow either people over-index on it or make all their decisions on the basis of some new measurement tool.
They get a shiny new number, and they do make too many decisions on that basis when it’s probably not valid or reliable for those purposes. Any kind of measurement, honestly – be it a simple survey or a three-hour assessment – I always worry that it could be misused or used irresponsibly.
FAKE: Jack brings up a great point. Rebecca Kantar said the point of developing new testing is to improve teaching. But in college admissions, tests aren’t used that way. They’re used to measure, to compare – and to cull. A new testing model might uncover new reasons to let a particular student into an elite school. Couldn’t it also turn up reasons to keep them out?
[AD BREAK]
FAKE: Hi, welcome back. We’re talking about the future of standardized tests: how they could change, what they should measure, and how they might be administered. We’re now going to hear from one of the world’s leading thinkers in this field, Dr. Robert Sternberg. Former president of the American Psychological Association, he’s now a professor at Cornell.
Despite all his accolades, Bob Sternberg knows what it feels like to be at the bottom of the statistical heap.
ROBERT STERNBERG: I had severe test anxiety.
FAKE: What did you feel like as a child when you were being measured?
STERNBERG: When I was a kid, we had IQ tests probably every year, paper-and-pencil IQ tests. And a psychologist would come into our elementary school classrooms. She looked really frightening to me.
And when she came in, I would freeze, as do so many kids when they take standardized tests. And other kids would be turning the page and I would be just sitting there, frozen in time. And then she’d say “stop”, and I would have done two or three problems maybe.
FAKE: What Bob is describing is “test anxiety,” and it’s incredibly common in children. The National Education Association has called it an epidemic of its own. And once a child gets test anxiety, it’s often reinforced.
STERNBERG: My teachers thought I was stupid. I thought I was stupid. I did stupid work. But the bottom line was that in large part, because of this standardized testing business, people set low expectations for me, which I met. And everyone thought – including me – that was the best I could do.
FAKE: But even as a kid, Sternberg believed he could come up with something better. He created his own IQ test and started giving it to his junior high school peers until the principal shut him down. He eventually went to Yale, and as a freshman…
STERNBERG: I took introductory psychology, planning to major in psychology, and I bombed it. I got a C. Which my professor, he handed me a paper and he said, “Well, you know, there’s a famous Sternberg in psychology and it’s obvious there won’t be another one.”
FAKE: Haha.
STERNBERG: Years later, when I was president of the American Psychological Association, I said to the guy who was president the year before, Phil Zimbardo, who was a famous Stanford psychologist. So, I said to Phil, “Jeez, how did I ever get to be president when I got a C in introductory psychology?” And he said, “Well, I got a C, too.”
FAKE: No, haha.
STERNBERG: So, there’s a real lesson for young people there. And that is: you don’t let the C or the D or whatever it is discourage you. If you have a passion, and you’re willing to work hard for something, then you go for it.
FAKE: Today, Bob Sternberg is professor of Human Development at Cornell. His perspective on testing – and on failure – is uniquely well-rounded. His point of view is broader than just his personal experiences. It’s historical.
STERNBERG: The SAT, the ACT, the GRE – all of these tests go back to the effort of Alfred Binet.
FAKE: Alfred Binet was a French psychologist. In the early 1900s, he was commissioned to design a test that could help identify which students might struggle in school and would benefit from extra help.
Eventually, the results of these tests were summed up with a number: IQ. The intelligence quotient. And while Binet himself argued that intelligence was malleable, his IQ test would be used to try and prove the opposite, blocking and regulating disadvantaged groups with its results.
STERNBERG: When the SAT started, they had a good goal, too, which was to create a meritocracy instead of everything being your socioeconomic status – how much money your parents have.
FAKE: But, in the end, Sternberg says the SAT only reinforced the status quo. Privileged kids still did better on the test. There are a number of reasons for that – from access to test prep to academic support. So in 2005, when he was Dean of Arts and Sciences at Tufts University, Sternberg took it upon himself to design an alternative admissions test.
STERNBERG: I have felt for many years that tests like the SAT and the ACT only measure a small part of intelligence. If you look at people who really make a difference to the world, who make the world a better place, they’re not just good analytical thinkers, which is what the tests measure. These are people who are creative. They have common sense. And most importantly, they have wisdom.
And I said, I have these ideas about how we could change admissions at Tufts by measuring not just SAT kinds of smarts or ACT kinds of smarts, but also creative, practical, and wisdom-based skills. And the first reaction I got was: “How much money is it going to cost?” Isn’t that a surprise?
FAKE: Here’s the kicker: Bob funded the program by specifically soliciting alumni who had been C students like him – many of whom had gone on to find success in other endeavors. And Rebecca Kantar cites the program, called the Kaleidoscope Project, as an influence for her Imbellus test.
But the Kaleidoscope Project didn’t become the new standard. It didn’t replace the SAT or the ACT. Bob Sternberg still believes that current standardized tests prioritize the wrong values.
FAKE: But if tests like the SAT reflect our values as a society, what values does it currently reflect?
STERNBERG: If you look at failures of leadership, whether it’s in the national political scene or in corporations or nonprofits, the failed leaders so often are people who are smart in an SAT sense, but they only use the smarts to advance their own interests.
And if you look at the response to the current pandemic, you can see the staggering cost of having leadership that knew for many years that there was going to be a pandemic, and then did nothing about it.
So our big mistake is thinking that people who are good at solving multiple-choice problems that are fairly trivial, that have a unique correct answer, and that measure things that are never going to matter again in your whole life, that that is going to be predictive of who is going to be able to handle really tough, unstructured, ambiguous, real-world problems.
FAKE: Another pioneer in developing new approaches to student assessment is Dr. Valerie Shute. Her work has been an inspiration to Rebecca Kantar at Imbellus. Shute is a professor of educational psychology and learning systems at Florida State University. She started studying the benefits of game-based learning more than 30 years ago. But even now…
VALERIE SHUTE: I’m a little bit nervous about setting this free into the world for fears of it being used wrongly, I guess.
FAKE: Educators have long believed that video games and virtual worlds could be used to supplement classroom instruction but not necessarily as testing tools.
Shute coined the term “stealth assessment” – meaning the computerized test is seamless, flowing, and unobtrusive, woven into a lesson so that students are almost unaware of it. To do that, she turned to video games.
SHUTE: It’s to move a ball over to hit a balloon. Kids are solving the problem by drawing objects on the screen. They’re drawing things like ramps, levers and pendulums and springboards to be able to move the ball to hit the balloon.
FAKE: And her target has been engaging kids who aren’t normally interested in science. Shute’s game is called “Physics Playground,” and in it, players solve puzzles related to Newton’s laws.
SHUTE: And when a player draws objects on the screen, these objects come alive because they’re operating with the laws of physics that’s built into the game.
FAKE: The game allows kids to experiment and fail. And try again. It doesn’t just measure an understanding of physics; it goes much deeper, analyzing the critical thinking skills behind a student’s decision making.
SHUTE: So measures of persistence and problem solving and critical thinking and creativity can provide a much more accurate gauge of a student’s potential to succeed in college than just their verbal and quantitative scores. And that’s what the SATs are and the ACT is, to see who is going to succeed in college. And that’s very limited information.
FAKE: Her research also shows that the virtual world reduces gender and racial bias in the learning process. In Shute’s studies, when girls answer physics questions before playing Shute’s game, they generally score lower than the boys. But that changes after the girls play Physics Playground.
SHUTE: The same kind of findings appear where some of our preliminary ethnicity comparisons, where there’s initial advantages for white students on the pre-test, but at the end, there’s no differences on the post-test in terms of ethnicity. So that was a really big deal as far as really kind of leveling the playing field.
FAKE: That’s undeniably a positive result. But the fact that Shute’s work is a step forward in testing doesn’t mean that it couldn’t be misused.
SHUTE: I don’t want it to be the basis for anybody making decisions about my child or myself down the road – promotion decisions or whatever. Like a credit card that contains your entire history of learning this stuff.
If that got in the wrong hands, that could potentially be used against you somehow. And I don’t know how to thwart that kind of bad behavior.
FAKE: Shute says her video game test and the concept of “stealth assessment” work better when the stakes are low. She believes the best possible future for this technology is not for high-stakes testing.
SHUTE: I really try to emphasize that the data collected for stealth assessments stay at a low-stakes level for formative purposes to help students learn and not serve as any kind of gatekeeper.
FAKE: For now, the gatekeepers are still the SAT and the ACT. And for the students who take them, the stakes are high. Stakes are also high for colleges – especially private universities that make a point of taking only the very top percentile of test-takers.
And now that a global pandemic has driven high schools and colleges online, testing stays frozen in place. So, is this the moment that determines who the new gatekeepers might be?
[AD BREAK]
FAKE: Hi, we’re back, and concluding with some reflections on how COVID-19 has revealed multiple system failures. So we’re kicking around the idea of how this outbreak might be the catalyst to launch the next generation of interactive testing.
Psychologist Robert Sternberg is all in.
STERNBERG: If there’s ever been a wakeup call, the pandemic is it. Clearly, whatever it is that these tests are measuring, it’s not important for the future of the world.
FAKE: Yeah.
STERNBERG: So here’s our chance, as people are dying, to say, “We have screwed up.” Let’s do something different. There are other options out there.
FAKE: Pre-COVID, more than 1,000 colleges and universities offered test-optional admissions, and hundreds more have joined that list for the next year or two. And there’s a growing list of schools that are declaring themselves “test free” – eliminating the ACT and SAT from their admissions process.
Even so, Rob Franek, with the Princeton Review test-prep company, says the SAT is still the most trusted and reliable assessment.
FRANEK: You know, the SAT and the ACT are our great gatekeepers in the admission process. Even test-optional schools are still using the SAT and ACT, combined with GPA from high school, for scholarship dollars – not all test-optional schools, but a heck of a lot of them are.
FAKE: Even Rebecca Kantar at Imbellus defends the importance of these tests – for now.
KANTAR: I wish I could say, “Yeah, Imbellus’ assessment is ready to go, and we’ll replace everything, everywhere.” It’s just not true, right? The reality is, these tests – something like the SAT, ACT – are the most sophisticated tests in the world, full stop, for educational assessment.
FAKE: Needless to say, there’s a lot of pressure to get any new high-stakes test right. So, Imbellus is moving forward incrementally.
KANTAR: You never want to be the one who botched it so no one ever tries it again. And so it’s very important to me that no matter what happens long term, we are as diligent and thoughtful and as thorough in exploring this opportunity and helping other people understand it and adopt it.
FAKE: Jack Buckley, president of Imbellus, agrees. And for him, there’s another best possible future on the horizon.
BUCKLEY: To me, I guess a utopia looks like we still have a system of assessments. But those assessments actually are engaging and have a consensus from not only colleges but also in the workforce and the population more broadly that they measure the things that matter – and therefore that our education systems teach those things.
FAKE: It’s a modest but powerful utopia. I like it.
FAKE: When the Imbellus alternative test is ready, Rebecca Kantar says she wants it to deliver more than a score. She envisions a test that helps students recognize their own natural talents.
KANTAR: This is all just one dimension of a person, right? And so much more than any test, ours or anyone else’s, is ever going to measure. But this could be a starting point for exploring an interest or exploring a skill that you might have known you had. But of course, seeing it reflected in a standardized assessment is always nice validation that other people saw that skill present in you as well.
FAKE: It’s nice to imagine a world in which student assessments behave more the way Rebecca Kantar and Valerie Shute envision them: as ways to improve teaching and correct bias, not just to quantify.
We’re not there yet. But it seems as though we might be getting closer.
Look, I don’t get to decide Should This Exist? And neither does this show. Our goal is to inspire you to ask that question, and the intriguing questions that grow from it.
LISTENER: If the tests to get into college were more like a video game, I assure you I could have gone anywhere I wanted.
LISTENER: But it’s not going to be fun the same way that regular videogames are.
LISTENER: The first thing I have to say is some gratitude. I was one of those B, C students. Doing very well on the SAT really reset my perception of what my life could be.
LISTENER: It caused anxiety. I had a panic attack. I didn’t do well on it, even though I did really well in school, because I was a super hard worker.
LISTENER: I think testing is such an elitist scam.
LISTENER: What about creativity? What about empathy?
LISTENER: Why is it that the educational system gets away with having something which is so proverbially limited and unsophisticated? Why can’t we do better? And I don’t think a video game is just better.
FAKE: Agree? Disagree? You might have perspectives that are completely different from what we’ve shared so far. We want to hear them.
To tell us the questions you’re asking, go to “www.ShouldThisExist.com” where you can record a message for us. And join the Should This Exist? newsletter at www.shouldthisexist-stage1220.mystagingwebsite.com.
I’m Caterina Fake.