Feed aggregator

Model predicts long-term effects of nuclear waste on underground disposal systems

MIT Latest News - Fri, 07/18/2025 - 12:00am

As countries across the world experience a resurgence in nuclear energy projects, the questions of where and how to dispose of nuclear waste remain as politically fraught as ever. The United States, for instance, has indefinitely stalled its only long-term underground nuclear waste repository. Scientists are using both modeling and experimental methods to study the effects of underground nuclear waste disposal and ultimately, they hope, build public trust in the decision-making process.

New research from scientists at MIT, Lawrence Berkeley National Lab, and the University of Orléans makes progress in that direction. The study shows that simulations of underground nuclear waste interactions, generated by new, high-performance-computing software, aligned well with experimental results from a research facility in Switzerland.

The study, which was co-authored by MIT PhD student Dauren Sarsenbayev and Assistant Professor Haruko Wainwright, along with Christophe Tournassat and Carl Steefel, appears in the journal PNAS.

“These powerful new computational tools, coupled with real-world experiments like those at the Mont Terri research site in Switzerland, help us understand how radionuclides will migrate in coupled underground systems,” says Sarsenbayev, who is first author of the new study.

The authors hope the research will improve confidence among policymakers and the public in the long-term safety of underground nuclear waste disposal.

“This research — coupling both computation and experiments — is important to improve our confidence in waste disposal safety assessments,” says Wainwright. “With nuclear energy re-emerging as a key source for tackling climate change and ensuring energy security, it is critical to validate disposal pathways.”

Comparing simulations with experiments

Disposing of nuclear waste in deep underground geological formations is currently considered the safest long-term solution for managing high-level radioactive waste. As such, much effort has been put into studying the migration behaviors of radionuclides from nuclear waste within various natural and engineered geological materials.

Since its founding in 1996, the Mont Terri research site in northern Switzerland has served as an important test bed for an international consortium of researchers interested in studying materials like Opalinus clay — a thick, water-tight claystone abundant in the tunneled areas of the mountain.

“It is widely regarded as one of the most valuable real-world experiment sites because it provides us with decades of datasets around the interactions of cement and clay, and those are the key materials proposed to be used by countries across the world for engineered barrier systems and geological repositories for nuclear waste,” explains Sarsenbayev.

For their study, Sarsenbayev and Wainwright collaborated with co-authors Tournassat and Steefel, who have developed high-performance computing software to improve modeling of interactions between the nuclear waste and both engineered and natural materials.

To date, several challenges have limited scientists’ understanding of how nuclear waste reacts with cement-clay barriers. For one thing, the barriers are made up of irregularly mixed materials deep underground. Additionally, the existing class of models commonly used to simulate radionuclide interactions with cement-clay does not take into account electrostatic effects associated with the negatively charged clay minerals in the barriers.

Tournassat and Steefel’s new software accounts for electrostatic effects, making it the only one that can simulate those interactions in three-dimensional space. The software, called CrunchODiTi, was developed from established software known as CrunchFlow and was most recently updated this year. It is designed to be run on many high-performance computers at once in parallel.

For the study, the researchers looked at a 13-year-old experiment, with an initial focus on cement-clay rock interactions. Within the last several years, a mix of both negatively and positively charged ions was added to the borehole located near the center of the cement emplaced in the formation. The researchers focused on a 1-centimeter-thick zone between the radionuclides and cement-clay referred to as the “skin.” They compared their experimental results to the software simulation, finding the two datasets aligned.

“The results are quite significant because previously, these models wouldn’t fit field data very well,” Sarsenbayev says. “It’s interesting how fine-scale phenomena at the ‘skin’ between cement and clay, the physical and chemical properties of which changes over time, could be used to reconcile the experimental and simulation data.” 

The experimental results showed the model successfully accounted for electrostatic effects associated with the clay-rich formation and the interaction between materials in Mont Terri over time.

“This is all driven by decades of work to understand what happens at these interfaces,” Sarsenbayev says. “It’s been hypothesized that there is mineral precipitation and porosity clogging at this interface, and our results strongly suggest that.”

“This application requires millions of degrees of freedom because these multibarrier systems require high resolution and a lot of computational power,” Sarsenbayev says. “This software is really ideal for the Mont Terri experiment.”

Assessing waste disposal plans

The new model could now replace older models that have been used to conduct safety and performance assessments of underground geological repositories.

“If the U.S. eventually decides to dispose nuclear waste in a geological repository, then these models could dictate the most appropriate materials to use,” Sarsenbayev says. “For instance, right now clay is considered an appropriate storage material, but salt formations are another potential medium that could be used. These models allow us to see the fate of radionuclides over millennia. We can use them to understand interactions at timespans that vary from months to years to many millions of years.”

Sarsenbayev says the model is reasonably accessible to other researchers and that future efforts may focus on the use of machine learning to develop less computationally expensive surrogate models.

Further data from the experiment will be available later this month. The team plans to compare those data to additional simulations.

“Our collaborators will basically get this block of cement and clay, and they’ll be able to run experiments to determine the exact thickness of the skin along with all of the minerals and processes present at this interface,” Sarsenbayev says. “It’s a huge project and it takes time, but we wanted to share initial data and this software as soon as we could.”

For now, the researchers hope their study leads to a long-term solution for storing nuclear waste that policymakers and the public can support.

“This is an interdisciplinary study that includes real world experiments showing we’re able to predict radionuclides’ fate in the subsurface,” Sarsenbayev says. “The motto of MIT’s Department of Nuclear Science and Engineering is ‘Science. Systems. Society.’ I think this merges all three domains.”

Warmer ecosystems save their breath

Nature Climate Change - Fri, 07/18/2025 - 12:00am

Nature Climate Change, Published online: 18 July 2025; doi:10.1038/s41558-025-02382-2

Land stores vast amounts of carbon, and how much of it is released as temperatures rise could accelerate climate change. Now research shows ecosystems are more adaptable to climate warming than previously thought, potentially reducing future carbon–climate feedbacks.

Thermal adaptation of respiration in terrestrial ecosystems alleviates carbon loss

Nature Climate Change - Fri, 07/18/2025 - 12:00am

Nature Climate Change, Published online: 18 July 2025; doi:10.1038/s41558-025-02377-z

Terrestrial ecosystems are expected to release more carbon under warming due to temperature-driven increases in ecosystem respiration. Here the authors use eddy covariance data to show that respiration may adapt to warmer temperatures and carbon losses may be lower than expected.

We Support Wikimedia Foundation’s Challenge to UK’s Online Safety Act

EFF: Updates - Thu, 07/17/2025 - 6:18pm

The Electronic Frontier Foundation and ARTICLE 19 strongly support the Wikimedia Foundation’s legal challenge to the categorization regulations of the United Kingdom’s Online Safety Act.

The Foundation – the non-profit that operates Wikipedia and other Wikimedia projects – announced its legal challenge earlier this year, arguing that the regulations endanger Wikipedia and the global community of volunteer contributors who create the information on the site. The High Court of Justice in London will hear the challenge on July 22 and 23.

EFF and ARTICLE 19 agree with the Foundation’s argument that, if enforced, the Category 1 duties – the OSA’s most stringent obligations – would undermine the privacy and safety of Wikipedia’s volunteer contributors, expose the site to manipulation, and divert essential resources from protecting people and improving the site. For example, because the law requires Category 1 services to allow users to block all unverified users from editing any content they post, the law effectively requires the Foundation to verify the identity of many Wikipedia contributors. However, that compelled verification undermines the privacy that keeps the site’s volunteers safe.

Wikipedia is the world’s most trusted and widely used encyclopedia, with users across the world accessing its wealth of information and participating in free information exchange through the site. The OSA must not be allowed to diminish it and jeopardize the volunteers on which it depends.

Beyond the issues raised in Wikimedia’s lawsuit, EFF and ARTICLE 19 emphasize that the Online Safety Act poses a serious threat to freedom of expression and privacy online, both in the U.K. and globally. Several key provisions of the law become operational July 25, and some companies already are rolling out age-verification mechanisms which undermine free expression and privacy rights of both adults and minors.

Helping cities evolve

MIT Latest News - Thu, 07/17/2025 - 4:50pm

Growing up in Paris, Vincent Rollet was exposed to the world beyond France from an early age. His dad was an engineer who traveled around the globe to set up electrical infrastructure, and he moved the family to the United States for two years when Rollet was a small child. His father’s work sparked Rollet’s interest in international development and growth. “It made me want to see and learn how things work in other parts of the world,” he says.

Today, Rollet is a fifth-year PhD student in MIT’s Department of Economics, studying how cities evolve — and how they may become constrained by their past. “Cities constantly need to adapt to economic changes,” he explains. “For example, you might need more housing as populations grow, or want to transform manufacturing spaces into modern lab facilities. With the rise of remote work, many cities now have excess office space that could potentially become residential housing.” Ultimately, Rollet hopes his research can influence urban policymakers to better serve city residents.

A happy accident

Rollet’s first exposure to economics was almost accidental. As a teenager, he stumbled upon the lecture videos of a game theory course at Yale University. “I randomly clicked on the available courses,” he said, “and I watched the videos, and I found it interesting.”

In high school and college, he focused on math and physics. “It’s the kind of training you’re typically pushed to do in France,” he says. But at the end of his first year at École Polytechnique — mandatory military training for all students — he remembered the Yale course that he had watched in high school. He had spent that year helping run a military service program for disadvantaged youth. “I was looking for an enjoyable way to start studying again,” he says. “So I went back to game theory.”

Rollet decided to take a game theory course with an economics professor, Pierre Boyer, who would play a key role in his academic path. Through conversations with Boyer, Rollet learned that economics could provide a rigorous, mathematical approach to understanding the topics around international development and international politics that had long fascinated him. Boyer introduced Rollet to two MIT-trained economists, professors Vincent Pons and Benjamin Marx, with whom he continues to collaborate today. A research visit to the U.S. in 2019 to work with them solidified his interest in pursuing graduate school. Shortly thereafter, he began his PhD at MIT.

Why cities get “stuck”

Rollet’s research explores why cities struggle to adapt their built environments as economic conditions shift, and why certain urban spaces become “stuck” in outdated patterns of development. He’s drawn to cities because they are a microcosm of different interacting systems in economics. “To understand cities, you need to understand how labor markets work, how the housing market works, and how transportation works,” he notes.

Rollet has spent most of his PhD focusing on New York City. By examining detailed data on building permits, real estate transactions, rents, and zoning changes, he has tracked the evolution of every building in the city over nearly two decades, studying when and why developers choose to demolish buildings and construct new ones, and how these decisions are influenced by economic, regulatory, and technological constraints. By combining computational theory and data — which often includes information on natural experiments (i.e., What happens when a city changes a regulation?) — Rollet aims to reveal generalizable principles underlying how cities grow and evolve.

Originally shaped as a manufacturing hub with dense commercial centers and sprawling residential outskirts, New York’s physical structure has been largely frozen since zoning regulations were imposed in the 1960s. Despite dramatic shifts in population and economic activity, the city’s regulations have barely budged, creating profound mismatches: soaring housing costs, overcrowded residential areas, and underutilized commercial spaces. The buildings are expensive to replace, and regulations are notoriously hard to change once they are established.

Rollet’s findings reveal critical inefficiencies. In cities like New York or Boston, housing often sells for hundreds of thousands of dollars more than it costs to build. This large gap suggests that demand far outpaces supply: There simply aren’t enough homes being built. “When the housing supply is too constrained, we are effectively wasting resources, making housing unnecessarily expensive,” he explains.

But implementing any kind of action or policy to alleviate these inefficiencies has downstream effects. For example, it can have different impacts on different groups of people. “There will be winners and losers,” Rollet explains. “One reason is that you might directly care about the welfare of a certain group, like directly providing housing for lower-income households. Another reason is that if there are sufficiently many people who are losers of a certain policy, or if they’re sufficiently powerful, they’re going to be able to block the policy change, and this poses a political constraint.”

So what makes a city “stuck”? “Much of the time,” Rollet says, “it’s policy.” But the effects of policy changes take time to materialize and might be difficult for people to detect. Rollet cites Cambridge’s recent zoning reform allowing the construction of six-story buildings as a case in point. “These policy changes can benefit a lot of people, by reducing the housing prices a bit for everyone,” he says, “but individual people won’t know it. This makes collective action very hard.”

Economics, however, provides a toolkit to characterize and quantify these effects. “What economists can bring to the table is to give policymakers more information on the likely consequences of their policy actions,” Rollet says.

Striving to “improve things”

As Rollet enters the home stretch of his PhD, he’s grateful to his advisors in the economics department for helping him develop a foundation for the diverse set of tools necessary for his work. From professors Dave Donaldson and David Atkin, he learned how to adapt methods traditionally used in the study of international trade to analyze the movement of people across neighborhoods and cities. From Professor Tobias Salz, he gained insights into modeling the behavior of firms over time, which he now applies to understanding the actions of real estate developers. “The training here pushes you to produce research that truly stands out,” he says. “The courses helped me discover a new set of fields and methods.”

Beyond research, Rollet actively contributes to his department, including serving as the co-president of the Graduate Economics Association. “MIT is truly the best place for economics, not just because of their courses, but because it’s a really friendly department where people help each other out,” he says. “The Graduate Economics Association helps to build that sense of community, and I wanted to be a part of that.” In addition, he is a member of a mental health and peer support group in the department.

Rollet also enjoys teaching. He has been a teaching assistant for microeconomics and international trade courses and has built an impressive writing repertoire explaining complex concepts in several fields. In high school, one of Rollet’s hobbies was writing quantum theory explainers on the internet for general audiences. Some publishers found his writing and contacted him about turning it into a book. The book was published, and has sold more than 14,000 copies. As a college student, Rollet worked on two books: one on game theory for general audiences, and an intro to economics textbook that two professors recruited him to co-author. It’s still the standard textbook at École Polytechnique today. “It was my Covid activity,” Rollet laughs.

Looking forward, Rollet aims to pursue a career in research and teaching. His immediate goal remains clear: develop research that meaningfully impacts policy, by shedding light on how cities can overcome constraints and evolve in ways that better serve their residents. He’s excited about how, in the future, more fine-grained and detailed data sources could shed light on how micro behavior can lead to macro outcomes.

“Housing and cities — these markets are failing in important ways in many parts of the world. There’s real potential for policy to improve things.”

MIT’s Mason Estrada to sign with the Los Angeles Dodgers

MIT Latest News - Thu, 07/17/2025 - 2:00pm

Like almost any MIT student, Mason Estrada wants to take what he learned on campus and apply it to the working world.

Unlike any other MIT student, Estrada will soon be going to work on a pitcher’s mound, and some day Dodger Stadium might be his office.

Estrada, the star pitcher for MIT’s baseball team, is signing a contract with the Los Angeles Dodgers organization, after the team selected him in the 7th round of the Major League Baseball draft on July 14. The right-hander, whose stellar stuff earned significant attention from MLB scouts, will be reporting soon to the Dodgers’ instructional camp in Arizona.

“I’m definitely excited,” says Estrada, who was projected as a likely draft pick but did not know he would be selected by the Dodgers, Major League Baseball’s defending champions.

From the outside, MIT might seem like an atypical starting point for a pitching career, but it has helped Estrada in multiple ways: by providing a strong baseball program in itself, and, more subtly, by reinforcing the value of systematic improvement, at a time when baseball pitching increasingly resembles, well, engineering.

On the first count, Estrada praises his MIT coaches and teammates for the baseball environment they have helped provide.

“It was really awesome,” Estrada says about playing baseball at the Institute. “I was surrounded by a bunch of guys that wanted to win. There was a great team culture of grinding and working hard.”

Meanwhile, pitching in professional baseball more than ever involves “pitch design” or “pitch shaping.” For a decade now, major-league teams have used high-speed cameras to determine which pitches work best. In turn, pitchers are often reverse-engineering parts of their arsenals, by starting with the desired outcome, then finding the combination of velocity and movement to stymie hitters.

Into this setting, enter Estrada, an MIT aeronautics and astronautics major — although, he makes clear, pitching at MIT has never involved transferring aerodynamic knowledge from the classroom to the mound. Rather, what counts is using feedback and analysis to get better.

“It’s not necessarily based on the subject I was studying,” Estrada says. “It’s learning to think like an engineer generally, learning to think through problems the right way, and finding the best solution.”

This season, Estrada went 6-0 with a 2.21 ERA for MIT, striking out 66 and allowing a paltry 22 hits in 40 2/3 innings. There are additional numbers that hint at his potential: Estrada’s fastball has hit 96 miles per hour, and he throws two types of sliders, with velocity in the upper 80s while producing up to 2,700 rotations per minute, in line with big-league metrics.
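For readers who want to put the season line on a per-nine-innings footing, the arithmetic can be sketched as follows. This is only an illustrative calculation from the figures quoted above. The variable and function names are made up for the example, and the earned-run total is inferred from the rounded ERA using the standard definition (ERA = 9 × earned runs / innings pitched), not a number reported in the article.

```python
from fractions import Fraction

# Figures quoted in the article for Estrada's season.
innings = Fraction(40) + Fraction(2, 3)  # "40 2/3 innings"
strikeouts = 66
hits = 22
era = 2.21


def per_nine(count, innings_pitched):
    """Scale a counting stat to a nine-inning rate (hypothetical helper)."""
    return float(count * 9 / innings_pitched)


k_per_9 = per_nine(strikeouts, innings)
h_per_9 = per_nine(hits, innings)

# ERA = 9 * earned runs / innings, so the earned-run total can be backed out.
# This is an inference from the rounded ERA, not a figure from the article.
implied_earned_runs = era * float(innings) / 9

print(f"K/9: {k_per_9:.1f}")                              # K/9: 14.6
print(f"H/9: {h_per_9:.1f}")                              # H/9: 4.9
print(f"Implied earned runs: {implied_earned_runs:.1f}")  # Implied earned runs: 10.0
```

Using `Fraction` for the innings keeps "40 2/3" exact until the final conversion to a float.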

On the mound, Estrada uses his lower body to generate significant drive toward the plate — “I have to rely on my strength,” he says. Pitchers who share elements of this approach include Spencer Strider of the Atlanta Braves, although, Estrada emphasizes, “Everybody at the professional level is different.”

MIT’s baseball coaches praise Estrada’s dedication to the sport.

“Mason’s work ethic is through the roof,” says Todd Carroll, MIT’s pitching coach and recruiting coordinator, now in his 13th season at the Institute. Carroll thinks Estrada’s fastball and sliders could translate well to the professional game. The forward drive of Estrada’s motion, Carroll also notes, means that when Estrada delivers a pitch, “It’s on a hitter quick.”

Carroll concurs that the engineering mindset on campus actively helps players improve over time.

“MIT students are problem-solvers,” he says. “MIT is a place where people can do that as well as anywhere in the world. When a pitcher here misses the strike zone, that’s a problem they want to solve.”

Inevitably, all the off-field work, analysis, and preparation is designed to let Estrada simply be himself on the diamond. For athletes, some parts of the brain are best put on pause when competing.

“In games, I’m just focused on getting the hitter out,” Estrada says. “I’m staying in the moment.”

As it happens, baseball’s relatively new world of pitch shaping and pitch design has been enabled by MIT-linked technology. The kind of high-speed video camera many teams use, the Edgertronic, is manufactured by Sanstreak Corp., founded by Mike Matter ’84, a graduate of what is now the Department of Electrical Engineering and Computer Science. If the camera name sounds familiar, it should: Matter named it in homage to Harold “Doc” Edgerton, the legendary MIT pioneer of high-speed photography, whom Matter counted as a mentor.

Estrada is the fifth MIT undergraduate selected in baseball’s draft, which dates to 1966, and the highest-drafted player in MIT history at 225th overall. The others are Alan Dopfel ’72, selected by the California Angels; Jason Szuminski ’00, drafted by the San Diego Padres; Austin Filiere ’18, picked by the Chicago Cubs; and David Hesslink ’17, chosen by the Seattle Mariners. Of those players, Szuminski reached the majors, with the Padres.

At least two major-league pitchers also earned MIT degrees after finishing long baseball careers: Chris Capuano MBA ’19, a former All-Star with the Brewers, who received his master’s degree in management as part of the MIT Sloan Fellows program, and Skip Lockwood SM ’83.

As a Dodger, Estrada joins an organization famed for great pitching: Since the team moved to Los Angeles in 1958, their star pitchers have included Sandy Koufax, Don Drysdale, Fernando Valenzuela, Orel Hershiser, and Clayton Kershaw.

Beyond that, the Dodgers are known for investing considerable resources in player development, staying on the leading edge of analytics while bulking up their staff in order to help players improve. They have won the World Series twice this decade, in 2020 and 2024.

Whatever happens on the diamond, Estrada wants to return to MIT to complete his degree. Before the draft, he had made plans to temporarily transfer to the University of Tennessee to play Division I baseball next season, with the plan of returning to MIT as a student. However, Estrada will not be doing that now that he is signing with the Dodgers.

As things now stand, Estrada is taking a leave of absence from the Institute while his professional career starts to unfold.

“I just want to be clear I’m very thankful to MIT and to the MIT baseball staff for all they’ve done,” Estrada says.

And now, campus experience in hand, Estrada is off to his very distinctive work environment. 

Security Vulnerabilities in ICEBlock

Schneier on Security - Thu, 07/17/2025 - 7:06am

The ICEBlock tool has vulnerabilities:

The developer of ICEBlock, an iOS app for anonymously reporting sightings of US Immigration and Customs Enforcement (ICE) officials, promises that it “ensures user privacy by storing no personal data.” But that claim has come under scrutiny. ICEBlock creator Joshua Aaron has been accused of making false promises regarding user anonymity and privacy, being “misguided” about the privacy offered by iOS, and of being an Apple fanboy. The issue isn’t what ICEBlock stores. It’s about what it could accidentally reveal through its tight integration with iOS...

Why the megalaw didn’t kill Biden’s biggest climate program

ClimateWire News - Thu, 07/17/2025 - 6:41am
EPA officials say President Donald Trump’s massive policy law is a death blow to the Greenhouse Gas Reduction Fund. But the courts could have the final word.

Marjorie Taylor Greene introduces ‘weather modification’ ban

ClimateWire News - Thu, 07/17/2025 - 6:40am
Conspiracy theories about weather-altering technologies have spiked online after the Texas floods.

House Dems scrutinize Trump plan to cut off weather data

ClimateWire News - Thu, 07/17/2025 - 6:38am
Lawmakers have asked for more information about the Pentagon’s decision to stop publicly sharing data from its Defense Meteorological Satellite Program.

Democrats’ bill would require OSHA to issue worker heat protections

ClimateWire News - Thu, 07/17/2025 - 6:38am
The legislation, which is supported by some House Republicans, comes as the Trump administration considers whether to move forward with a Biden-era proposal to protect workers from extreme heat.

Youth fighting Trump on climate get boost from Democrats

ClimateWire News - Thu, 07/17/2025 - 6:37am
A congressional resolution introduced Wednesday calls for the acknowledgement that young people have the right to a clean environment.

HSBC hit by backlash from green clients after net-zero exit

ClimateWire News - Thu, 07/17/2025 - 6:35am
Last week, HSBC became the first U.K. bank to leave the Net-Zero Banking Alliance, which is the industry’s largest climate group.

Norway’s $1.9T wealth fund calls out banks over emissions reports

ClimateWire News - Thu, 07/17/2025 - 6:34am
The comments show the determination of major asset owners in Europe to include climate risk in their investment decisions, despite pushback in some jurisdictions.

How climate change could force FIFA to rethink World Cup calendar

ClimateWire News - Thu, 07/17/2025 - 6:34am
With temperatures rising worldwide, scientists warn that staging soccer tournaments in the Northern Hemisphere summer is getting increasingly dangerous for both players and spectators.

Indigenous youth face violence in bid to protect Colombian resources

ClimateWire News - Thu, 07/17/2025 - 6:33am
In regions like Cauca, violent groups frequently target Indigenous children and teenagers for recruitment.

New tool gives anyone the ability to train a robot

MIT Latest News - Thu, 07/17/2025 - 12:00am

Teaching a robot new skills used to require coding expertise. But a new generation of robots could potentially learn from just about anyone.

Engineers are designing robotic helpers that can “learn from demonstration.” This more natural training strategy enables a person to lead a robot through a task, typically in one of three ways: via remote control, such as operating a joystick to remotely maneuver a robot; by physically moving the robot through the motions; or by performing the task themselves while the robot watches and mimics.

Learning-by-doing robots usually train in just one of these three demonstration approaches. But MIT engineers have now developed a three-in-one training interface that allows a robot to learn a task through any of the three training methods. The interface is in the form of a handheld, sensor-equipped tool that can attach to many common collaborative robotic arms. A person can use the attachment to teach a robot to carry out a task by remotely controlling the robot, physically manipulating it, or demonstrating the task themselves — whichever style they prefer or best suits the task at hand.

The MIT team tested the new tool, which they call a “versatile demonstration interface,” on a standard collaborative robotic arm. Volunteers with manufacturing expertise used the interface to perform two manual tasks that are commonly carried out on factory floors.

The researchers say the new interface offers increased training flexibility that could expand the type of users and “teachers” who interact with robots. It may also enable robots to learn a wider set of skills. For instance, a person could remotely train a robot to handle toxic substances, while further down the production line another person could physically move the robot through the motions of boxing up a product, and at the end of the line, someone else could use the attachment to draw a company logo as the robot watches and learns to do the same.

“We are trying to create highly intelligent and skilled teammates that can effectively work with humans to get complex work done,” says Mike Hagenow, a postdoc at MIT in the Department of Aeronautics and Astronautics. “We believe flexible demonstration tools can help far beyond the manufacturing floor, in other domains where we hope to see increased robot adoption, such as home or caregiving settings.”

Hagenow will present a paper detailing the new interface at the IEEE Intelligent Robots and Systems (IROS) conference in October. The paper’s MIT co-authors are Dimosthenis Kontogiorgos, a postdoc at the MIT Computer Science and Artificial Intelligence Lab (CSAIL); Yanwei Wang PhD ’25, who recently earned a doctorate in electrical engineering and computer science; and Julie Shah, MIT professor and head of the Department of Aeronautics and Astronautics.

Training together

Shah’s group at MIT designs robots that can work alongside humans in the workplace, in hospitals, and at home. A main focus of her research is developing systems that enable people to teach robots new tasks or skills “on the job,” as it were. Such systems would, for instance, help a factory floor worker quickly and naturally adjust a robot’s maneuvers to improve its task in the moment, rather than pausing to reprogram the robot’s software from scratch — a skill that a worker may not necessarily have.

The team’s new work builds on an emerging strategy in robot learning called “learning from demonstration,” or LfD, in which robots are designed to be trained in more natural, intuitive ways. In looking through the LfD literature, Hagenow and Shah found LfD training methods developed so far fall generally into the three main categories of teleoperation, kinesthetic training, and natural teaching.

One training method may work better than the other two for a particular person or task. Shah and Hagenow wondered whether they could design a tool that combines all three methods to enable a robot to learn more tasks from more people.

“If we could bring together these three different ways someone might want to interact with a robot, it may bring benefits for different tasks and different people,” Hagenow says.

Tasks at hand

With that goal in mind, the team engineered a new versatile demonstration interface (VDI). The interface is a handheld attachment that can fit onto a typical collaborative robotic arm. The attachment is equipped with a camera and markers that track the tool’s position and movements over time, along with force sensors to measure the amount of pressure applied during a given task.

When the interface is attached to a robot, the entire robot can be controlled remotely, and the interface’s camera records the robot’s movements, which the robot can use as training data to learn the task on its own. Similarly, a person can physically move the robot through a task, with the interface attached. The VDI can also be detached and physically held by a person to perform the desired task. The camera records the VDI’s motions, which the robot can also use to mimic the task when the VDI is reattached.

To test the attachment’s usability, the team brought the interface, along with a collaborative robotic arm, to a local innovation center where manufacturing experts learn about and test technology that can improve factory-floor processes. The researchers set up an experiment where they asked volunteers at the center to use the robot and all three of the interface’s training methods to complete two common manufacturing tasks: press-fitting and molding. In press-fitting, the user trained the robot to press and fit pegs into holes, similar to many fastening tasks. For molding, a volunteer trained the robot to push and roll a rubbery, dough-like substance evenly around the surface of a center rod, similar to some thermomolding tasks.

For each of the two tasks, the volunteers were asked to use each of the three training methods, first teleoperating the robot using a joystick, then kinesthetically manipulating the robot, and finally, detaching the robot’s attachment and using it to “naturally” perform the task as the robot recorded the attachment’s force and movements.

The researchers found the volunteers generally preferred the natural method over teleoperation and kinesthetic training. The users, who were all experts in manufacturing, did offer scenarios in which each method might have advantages over the others. Teleoperation, for instance, may be preferable in training a robot to handle hazardous or toxic substances. Kinesthetic training could help workers adjust the positioning of a robot that is tasked with moving heavy packages. And natural teaching could be beneficial in demonstrating tasks that involve delicate and precise maneuvers.

“We imagine using our demonstration interface in flexible manufacturing environments where one robot might assist across a range of tasks that benefit from specific types of demonstrations,” says Hagenow, who plans to refine the attachment’s design based on user feedback and will use the new design to test robot learning. “We view this study as demonstrating how greater flexibility in collaborative robots can be achieved through interfaces that expand the ways that end-users interact with robots during teaching.”

This work was supported, in part, by the MIT Postdoctoral Fellowship Program for Engineering Excellence and the Wallenberg Foundation Postdoctoral Research Fellowship.

This “smart coach” helps LLMs switch between text and code

MIT Latest News - Thu, 07/17/2025 - 12:00am

Large language models (LLMs) excel at using textual reasoning to understand the context of a document and provide a logical answer about its contents. But these same LLMs often struggle to correctly answer even the simplest math problems.

Textual reasoning is usually a less-than-ideal way to deliberate over computational or algorithmic tasks. While some LLMs can generate code like Python to handle symbolic queries, the models don’t always know when to use code, or what kind of code would work best.

LLMs, it seems, may need a coach to steer them toward the best technique.

Enter CodeSteer, a smart assistant developed by MIT researchers that guides an LLM to switch between code and text generation until it correctly answers a query.

CodeSteer, itself a smaller LLM, automatically generates a series of prompts to iteratively steer a larger LLM. It reviews the model’s current and previous answers after each round and provides guidance for how it can fix or refine that solution until it deems the answer is correct.

The researchers found that augmenting a larger LLM with CodeSteer boosted its accuracy on symbolic tasks, like multiplying numbers, playing Sudoku, and stacking blocks, by more than 30 percent. It also enabled less sophisticated models to outperform more advanced models with enhanced reasoning skills.

This advance could improve the problem-solving capabilities of LLMs for complex tasks that are especially difficult to solve with textual reasoning alone, such as generating paths for robots in uncertain environments or scheduling shipments in an international supply chain.

“There is a race to develop better and better models that are capable of doing everything, but we’ve taken a complementary approach. Researchers have spent years developing effective technologies and tools to tackle problems in many domains. We want to enable LLMs to select the right tools and methods, and make use of others’ expertise to enhance their own capabilities,” says Chuchu Fan, an associate professor of aeronautics and astronautics (AeroAstro) and principal investigator in the MIT Laboratory for Information and Decision Systems (LIDS).

Fan, the senior author of the study, is joined on a paper about the work by LIDS graduate student Yongchao Chen; AeroAstro graduate student Yilun Hao; University of Illinois at Urbana-Champaign graduate student Yueying Liu; and MIT-IBM Watson AI Lab Research Scientist Yang Zhang. The research will be presented at the International Conference on Machine Learning.

An LLM “trainer”  

Ask an LLM which number is bigger, 9.11 or 9.9, and it will often give the wrong answer by using textual reasoning. But ask it to use code to answer the same question, and it can generate and execute a Python script to compare the two numbers, easily solving the problem.
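As an illustration of how little code the comparison takes, here is a minimal Python sketch of the kind of script a model could generate for this question. The function name `larger` is a hypothetical choice for this example, not something taken from the research.

```python
from decimal import Decimal

def larger(a: str, b: str) -> str:
    """Return whichever decimal string denotes the larger number."""
    # Decimal parses the digits exactly, so 9.9 is compared as 9.90, not as "9" and "9".
    return a if Decimal(a) > Decimal(b) else b

print(larger("9.11", "9.9"))  # prints 9.9
```

Comparing the values numerically sidesteps the trap of treating "11" as larger than "9" after the decimal point.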

Initially trained to understand and predict human language, LLMs are more likely to answer queries using text, even when code would be more effective. And while they have learned to generate code through fine-tuning, these models often generate an incorrect or less efficient version of the code.

Rather than trying to retrain a powerful LLM like GPT-4 or Claude to improve these capabilities, the MIT researchers fine-tune a smaller, lightweight LLM to guide a larger model between text and code. Fine-tuning a smaller model doesn’t change the larger LLM, so there is no risk it would undermine the larger model’s other abilities.

“We were also inspired by humans. In sports, a trainer may not be better than the star athlete on the team, but the trainer can still give helpful suggestions to guide the athlete. This steering method works for LLMs, too,” Chen says.

This trainer, CodeSteer, works in conjunction with the larger LLM. It first reviews a query and determines whether text or code is suitable for this problem, and which sort of code would be best.

Then it generates a prompt for the larger LLM, telling it to use a coding method or textual reasoning to answer the query. The larger model follows this prompt to answer the query and sends the result back to CodeSteer, which reviews it.

If the answer is not correct, CodeSteer will continue prompting the LLM to try different things that might fix the problem, such as incorporating a search algorithm or constraint into its Python code, until the answer is correct.

“We found that oftentimes, the larger LLM will try to be lazy and use a shorter, less efficient code that will not carry the correct symbolic calculation. We’ve designed CodeSteer to avoid this phenomenon,” Chen says.

A symbolic checker evaluates the code’s complexity and sends a signal to CodeSteer if it is too simple or inefficient. The researchers also incorporate a self-answer checker into CodeSteer, which prompts the LLM to generate code that calculates the answer to verify it is correct.
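The review-and-reprompt cycle described above can be sketched as a simple loop. This is a toy illustration under stated assumptions, not CodeSteer's actual implementation: the names `steer`, `toy_solver`, `toy_checker`, and `GUIDANCE_STEPS` are hypothetical, the "larger model" is a stand-in function, and the guidance simply escalates through a fixed list rather than being generated by a fine-tuned model.

```python
from typing import Callable, List, Tuple

# Hypothetical types for this sketch; they are not CodeSteer's real API.
Solver = Callable[[str, str], str]    # (query, guidance) -> answer
Checker = Callable[[str, str], bool]  # (query, answer) -> acceptable?

# Assumed escalation order: text first, then code, then code with search.
GUIDANCE_STEPS = [
    "Answer with textual reasoning.",
    "Write and run code to compute the answer.",
    "Write code that uses an explicit search over candidates.",
]

def steer(query: str, solver: Solver, checker: Checker,
          max_rounds: int = 3) -> Tuple[str, List[str]]:
    """Re-prompt the solver with new guidance until the checker accepts."""
    history: List[str] = []
    answer = ""
    for round_idx in range(max_rounds):
        guidance = GUIDANCE_STEPS[min(round_idx, len(GUIDANCE_STEPS) - 1)]
        answer = solver(query, guidance)
        history.append(guidance)
        if checker(query, answer):
            break
    return answer, history

# Toy stand-ins so the loop can run without any model.
def toy_solver(query: str, guidance: str) -> str:
    a, b = query.split(" * ")
    if "code" in guidance:
        return str(int(a) * int(b))  # "code" path: exact arithmetic
    return "about 7000"              # "text" path: a vague guess

def toy_checker(query: str, answer: str) -> bool:
    a, b = query.split(" * ")
    return answer == str(int(a) * int(b))

answer, history = steer("123 * 57", toy_solver, toy_checker)
print(answer)        # prints 7011
print(len(history))  # prints 2
```

In this toy run the first round uses text and fails the check, and the second round switches to code and passes, so the loop stops after two prompts.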

Tackling complex tasks

As the researchers designed CodeSteer, they couldn’t find suitable symbolic datasets to fine-tune and test the model, since many existing benchmarks don’t point out whether a certain query could be best solved with text or code.

So, they gathered a corpus of 37 complex symbolic tasks, including spatial reasoning, mathematics, order reasoning, and optimization, and built their own dataset, called SymBench. They implemented a fine-tuning approach that leverages SymBench to maximize the performance of CodeSteer.

In their experiments, CodeSteer outperformed all nine baseline methods they evaluated and boosted average accuracy from 53.3 percent to 86.4 percent. It maintains similar performance even on unseen tasks, and on a variety of LLMs.
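The size of that gain can be read two ways. A short calculation using only the two figures quoted above, and assuming both are averages over the same set of tasks, separates the absolute gain in percentage points from the relative improvement.

```python
baseline = 53.3  # average accuracy without CodeSteer, in percent
steered = 86.4   # average accuracy with CodeSteer, in percent

# Absolute gain, in percentage points.
point_gain = round(steered - baseline, 1)

# Relative gain, as a percentage of the baseline accuracy.
relative_gain = round(100 * (steered - baseline) / baseline, 1)

print(point_gain)     # prints 33.1
print(relative_gain)  # prints 62.1
```

The improvement is therefore about 33 percentage points, or roughly a 62 percent relative increase over the baseline.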

In addition, a general-purpose model augmented with CodeSteer can achieve higher accuracy than state-of-the-art models designed to focus on complex reasoning and planning, while requiring much less computation.

“Our method uses an LLM’s own capabilities. By augmenting an LLM with the ability to smartly use coding, we can take a model that is already very strong and improve its performance even more,” Chen says.

In the future, the researchers want to streamline CodeSteer to speed up its iterative prompting process. In addition, they are studying how to effectively fine-tune a unified model with the ability to switch between textual reasoning and code generation, rather than relying on a separate assistant.

“The authors present an elegant solution to the critical challenge of tool utilization in LLMs. This simple yet impactful method enables state-of-the-art LLMs to achieve significant performance improvements without requiring direct fine-tuning,” says Jinsung Yoon, a staff research scientist at Google Cloud AI, who was not involved with this work. “This research represents a substantial contribution that promises to significantly enhance the application of LLMs to a diverse range of tasks with which they currently struggle.”

“Their success in training a smaller, specialized model to strategically guide larger, advanced models is particularly impactful,” adds Chi Wang, a senior staff scientist at Google DeepMind who was not involved with this work. “This intelligent collaboration among diverse AI ‘agents’ paves the way for more robust and versatile applications in complex real-world scenarios.”

This research is supported, in part, by the U.S. Office of Naval Research and the MIT-IBM Watson AI Lab.

Can AI really code? Study maps the roadblocks to autonomous software engineering

MIT Latest News - Wed, 07/16/2025 - 4:55pm

Imagine a future where artificial intelligence quietly shoulders the drudgery of software development: refactoring tangled code, migrating legacy systems, and hunting down race conditions, so that human engineers can devote themselves to architecture, design, and the genuinely novel problems still beyond a machine’s reach. Recent advances appear to have nudged that future tantalizingly close, but a new paper by researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and several collaborating institutions argues that this potential future reality demands a hard look at present-day challenges. 

Titled “Challenges and Paths Towards AI for Software Engineering,” the work maps the many software-engineering tasks beyond code generation, identifies current bottlenecks, and highlights research directions to overcome them, aiming to let humans focus on high-level design while routine work is automated. 

“Everyone is talking about how we don’t need programmers anymore, and there’s all this automation now available,” says Armando Solar‑Lezama, MIT professor of electrical engineering and computer science, CSAIL principal investigator, and senior author of the study. “On the one hand, the field has made tremendous progress. We have tools that are way more powerful than any we’ve seen before. But there’s also a long way to go toward really getting the full promise of automation that we would expect.”

Solar-Lezama argues that popular narratives often shrink software engineering to “the undergrad programming part: someone hands you a spec for a little function and you implement it, or solving LeetCode-style programming interviews.” Real practice is far broader. It includes everyday refactors that polish design, plus sweeping migrations that move millions of lines from COBOL to Java and reshape entire businesses. It requires nonstop testing and analysis — fuzzing, property-based testing, and other methods — to catch concurrency bugs, or patch zero-day flaws. And it involves the maintenance grind: documenting decade-old code, summarizing change histories for new teammates, and reviewing pull requests for style, performance, and security.

Industry-scale code optimization — think re-tuning GPU kernels or the relentless, multi-layered refinements behind Chrome’s V8 engine — remains stubbornly hard to evaluate. Today’s headline metrics were designed for short, self-contained problems, and while multiple-choice tests still dominate natural-language research, they were never the norm in AI-for-code. The field’s de facto yardstick, SWE-Bench, simply asks a model to patch a GitHub issue: useful, but still akin to the “undergrad programming exercise” paradigm. It touches only a few hundred lines of code, risks data leakage from public repositories, and ignores other real-world contexts — AI-assisted refactors, human–AI pair programming, or performance-critical rewrites that span millions of lines. Until benchmarks expand to capture those higher-stakes scenarios, measuring progress — and thus accelerating it — will remain an open challenge.

If measurement is one obstacle, human‑machine communication is another. First author Alex Gu, an MIT graduate student in electrical engineering and computer science, sees today’s interaction as “a thin line of communication.” When he asks a system to generate code, he often receives a large, unstructured file and even a set of unit tests, yet those tests tend to be superficial. This gap extends to the AI’s ability to effectively use the wider suite of software engineering tools, from debuggers to static analyzers, that humans rely on for precise control and deeper understanding. “I don’t really have much control over what the model writes,” he says. “Without a channel for the AI to expose its own confidence (‘this part’s correct … this part, maybe double‑check’), developers risk blindly trusting hallucinated logic that compiles, but collapses in production. Another critical aspect is having the AI know when to defer to the user for clarification.”

Scale compounds these difficulties. Current AI models struggle profoundly with large code bases, often spanning millions of lines. Foundation models learn from public GitHub, but “every company’s code base is kind of different and unique,” Gu says, making proprietary coding conventions and specification requirements fundamentally out of distribution. The result is AI-generated code that “hallucinates”: it looks plausible yet calls non‑existent functions, violates internal style rules, fails continuous‑integration pipelines, or otherwise ignores the internal conventions, helper functions, and architectural patterns of a given company.

Models also often retrieve the wrong code, because retrieval matches on similar names (syntax) rather than on functionality and logic, which is what a model needs in order to write the function. “Standard retrieval techniques are very easily fooled by pieces of code that are doing the same thing but look different,” says Solar‑Lezama.
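A small sketch can make this failure mode concrete. It is an illustration only: token overlap (Jaccard similarity) stands in here for a name-based retrieval technique, and the snippets, the helper names `tokens`, `jaccard`, and `run`, and the test input are all hypothetical rather than drawn from the paper.

```python
import re

QUERY = "def mean(values):\n    return sum(values) / len(values)\n"

CORPUS = {
    # Same behavior as the query, but it looks different.
    "average": (
        "def average(xs):\n"
        "    total = 0\n"
        "    for x in xs:\n"
        "        total += x\n"
        "    return total / len(xs)\n"
    ),
    # Similar names to the query, but different behavior.
    "mean_label": (
        "def mean_label(values):\n"
        "    return 'mean of ' + str(len(values)) + ' values'\n"
    ),
}

def tokens(code: str) -> set:
    """Split source text into a set of word-like tokens."""
    return set(re.findall(r"[A-Za-z_]+", code))

def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two snippets, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

def run(source: str, name: str, arg):
    """Define the function in a fresh namespace and call it on arg."""
    namespace = {}
    exec(source, namespace)
    return namespace[name](arg)

# Retrieval by surface similarity picks the look-alike.
lexical_best = max(CORPUS, key=lambda n: jaccard(QUERY, CORPUS[n]))

# Comparing outputs on a sample input finds the functional match.
sample = [1, 2, 3]
expected = run(QUERY, "mean", sample)
behavioral = [n for n in CORPUS if run(CORPUS[n], n, sample) == expected]

print(lexical_best)  # prints mean_label
print(behavioral)    # prints ['average']
```

The look-alike scores higher on token overlap (about 0.56 versus 0.25) even though only the differently written function computes the same result.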

Since there is no silver bullet for these issues, the authors call instead for community‑scale efforts: richer data that captures the process of developers writing code (for example, which code developers keep versus throw away, and how code gets refactored over time); shared evaluation suites that measure progress on refactor quality, bug‑fix longevity, and migration correctness; and transparent tooling that lets models expose uncertainty and invite human steering rather than passive acceptance. Gu frames the agenda as a “call to action” for larger open‑source collaborations that no single lab could muster alone. Solar‑Lezama imagines incremental advances, “research results taking bites out of each one of these challenges separately,” that feed back into commercial tools and gradually move AI from autocomplete sidekick toward genuine engineering partner.

“Why does any of this matter? Software already underpins finance, transportation, health care, and the minutiae of daily life, and the human effort required to build and maintain it safely is becoming a bottleneck. An AI that can shoulder the grunt work, and do so without introducing hidden failures, would free developers to focus on creativity, strategy, and ethics,” says Gu. “But that future depends on acknowledging that code completion is the easy part; the hard part is everything else. Our goal isn’t to replace programmers. It’s to amplify them. When AI can tackle the tedious and the terrifying, human engineers can finally spend their time on what only humans can do.”

“With so many new works emerging in AI for coding, and the community often chasing the latest trends, it can be hard to step back and reflect on which problems are most important to tackle,” says Baptiste Rozière, an AI scientist at Mistral AI, who wasn’t involved in the paper. “I enjoyed reading this paper because it offers a clear overview of the key tasks and challenges in AI for software engineering. It also outlines promising directions for future research in the field.”

Gu and Solar-Lezama wrote the paper with University of California at Berkeley Professor Koushik Sen and PhD students Naman Jain and Manish Shetty, Cornell University Assistant Professor Kevin Ellis and PhD student Wen-Ding Li, Stanford University Assistant Professor Diyi Yang and PhD student Yijia Shao, and incoming Johns Hopkins University assistant professor Ziyang Li. Their work was supported, in part, by the National Science Foundation (NSF), SKY Lab industrial sponsors and affiliates, Intel Corp. through an NSF grant, and the Office of Naval Research.

The researchers are presenting their work at the International Conference on Machine Learning (ICML). 

What do we owe each other?

MIT Latest News - Wed, 07/16/2025 - 4:30pm

MIT equips students with the tools to advance science and engineering — but a new class aims to ensure they also develop their own values and learn how to navigate conflicting viewpoints.

Offered as a pilot this past spring, the multidisciplinary class 21.01 (Compass Course: Love, Death, and Taxes: How to Think — and Talk to Others — About Being Human) invites students to wrestle with difficult questions like:

  • What do we value (and why)?
  • What do we know (and how do we know it)?
  • What do we owe to each other (and what should we do about it)?

The class is part of the Compass Initiative, which is led by faculty from across the MIT School of Humanities, Arts, and Social Sciences (SHASS). 

Lily L. Tsai, Ford Professor of Political Science and lead faculty for Compass, says the new course is meant to help students use the humanities and social sciences as their guide to thinking about the kind of humans they want to be and what kind of society they want to help create.

"At MIT, we're some of the people who are creating the technologies that are accelerating change and leading to more unpredictability in the world. We have a special responsibility to envision and reimagine a moral and civic education that enables people to navigate it," says Tsai.

The course is the result of a multi-year collaboration involving over 30 faculty from 19 departments, ranging from Philosophy and Literature to Brain and Cognitive Sciences and Electrical Engineering and Computer Science, all led by a core team of 14 faculty from SHASS and a student advisory board.

During its initial run in the spring, Compass followed an arc that began with students investigating questions of value. Early in the semester, students explored what makes a genius, using Beethoven's "Symphony No. 9" as a case study, accompanied by lectures from Emily Richmond Pollock, associate professor of music, and a podcast conversation with Larry Guth, professor of mathematics, and David Kaiser, professor of physics and science, technology, and society. 

Students then grappled with the concept of a merit-based society by digging into the example of the imperial Chinese civil service exam, guided by professor of history Tristan Brown. Next, they questioned what humans really know to be true by examining the universality of language through lectures by professor of linguistics Adam Albright, and the philosophy of truth and knowledge through lectures by professor of philosophy Alex Byrne.

The semester ended with challenging debates about what humans owe one another, including a class designed by Nobel laureate and professor of economics Esther Duflo on taxation and climate burdens. 

More than anything, Tsai says, she hopes that Compass prepares students to navigate dorm hallways, the family Thanksgiving table, or future labs or boardroom tables, and learn how to express opinions and actively listen to others with whom they may disagree — all without canceling one another. 

The class takes a "flipped classroom" approach: Students watch recorded lectures at home and come to class prepared for discussion and debate. Each section is co-taught by two faculty members, combining disciplines and perspectives.

Second-year mechanical engineering major Kayode Dada signed up because it fulfilled a communications-intensive requirement and offered cross-departmental exposure. But Compass ultimately became more than that to him. "College isn't just about learning science stuff — it's also about how we grow as people," he says. Dada was assigned to a section co-taught by Tsai and professor of literature Arthur Bahr. 

Forming a social contract

In the first week, students draft a Rousseau-inspired social compact and learn firsthand how to build a classroom community. "We knew these were deep topics," Dada says. "To get the most out of the class, we had to open up, respect each other, and keep conversations confidential."

One early exercise was especially impactful. After watching lectures by Ford Professor of Philosophy and Women’s and Gender Studies Sally Haslanger on value, students were asked to draw a map representing their values, with arrows pointing from ones that were more instrumental to ones that were fundamental.

At first, Dada felt stuck. Growing up in Kentucky, the son of a Nigerian immigrant who had dreamed of attending MIT himself, Dada had focused for years on gaining admission to the Institute. "I thought getting into MIT would make me feel fulfilled," he admits. "But once I got here, I realized the work alone wasn't enough."

The values exercise helped him reorient. He identified practicing Christianity, hard work, helping others, and contributing to society as central to his belief system. The exercise led Dada to volunteer at a robotics camp for kids in Louisville, where he could share his MIT education with others.

Who governs science? 

Later in the semester, Dada was animatedly representing a figure whose views contradicted his own: James D. Watson, the Nobel Prize winner who co-discovered DNA's structure — and is also a controversial figure. 

That week, each student had been assigned a persona from a 1976 Cambridge City Council hearing debating recombinant DNA research. The class, designed by Associate Professor Robin Scheffler, was investigating the question: Who governs science — scientists, the government, those who fund research, or the public?

They revisited a real-life debate over recombinant DNA research carried out in MIT and Harvard University labs, which citizens of the time believed posed dangers to the public, including the development of biological weapons. Pioneered in the 1970s, the technique involved the splicing of genes related to the E. coli bacterium. In the Compass classroom, students argued different sides from their personas: banning the research, moving labs outside city limits, or proceeding without government interference.

Dada notes how faculty intentionally seeded conflicting viewpoints. "It taught me how to negotiate with someone who has different values and come to a resolution that respects everyone involved," he says. "That's something I want to keep exploring."

When Dada closed his presentation with frantically-Googled sentimental music piped unexpectedly from his phone, his classmates laughed in appreciation. The atmosphere was more intimate than academic — an ethos Tsai hoped to cultivate. "They really built intellectual relationships based on trust," she says. "There was a lot of laughter. They took joy in disagreeing and debating."

Changing opinions 

First-year student-athlete Shannon Cordle, who is majoring in mechanical engineering, didn't know what to expect from Compass. Since it was new, there were no student reviews. What stood out to her was the grading system: 15 percent of the final grade is based on a rubric each student created for themselves.

Cordle's goal was to become more comfortable expressing an opinion — even before she's fully formed it. "It's easy to stay quiet when you're unsure," she says. "Compass helped me practice speaking up and being willing to be wrong, because that's how you learn."

One week, the class debated whether a meritocracy creates a just society — an especially relevant topic at MIT, given its famously selective admissions process. 

Students were able to pick their stance beforehand, and were then invited to change it as they gained more perspectives during the debate.

"This helps students grasp not only the flaws in another viewpoint, but also how to strengthen their arguments," Tsai says.

Cordle, who hopes to go into prosthetics, views her future field as representing the perfect balance between creativity and ethics. "The humanities challenge how we view our fields as scientists and engineers," she says.

A compass helps travelers find their way — but it's most useful when they need to reorient and change direction. In that spirit, Compass prepares students not just to ask big questions, but to keep asking — and keep adapting — as their lives and careers evolve.

“Bringing these unexpected class elements together with students and faculty generated magical alchemy — a kind of transformation that we didn't even know we could create,” Tsai says.

In addition to the class, the MIT Compass Podcast engages with these fundamental questions with guests from across the MIT schools of Science and Engineering. There are also plans to adapt the residential version of this class for online learners on MITx.

In addition to philanthropic support from MIT Corporation life member emeritus Ray Stata '57, the initiative is supported by the Office of the Vice Chancellor and the MIT Human Insight Collaborative's SHASS Education Innovation Fund, which promotes new, transformative educational approaches in SHASS fields.
