Feed aggregator

Exposing biases, moods, personalities, and abstract concepts hidden in large language models

MIT Latest News - Thu, 02/19/2026 - 2:00pm

By now, ChatGPT, Claude, and other large language models have accumulated so much human knowledge that they’re far from simple answer-generators; they can also express abstract concepts, such as certain tones, personalities, biases, and moods. However, it’s not obvious how these models come to represent abstract concepts from the knowledge they contain.

Now a team from MIT and the University of California San Diego has developed a way to test whether a large language model (LLM) contains hidden biases, personalities, moods, or other abstract concepts. Their method can zero in on connections within a model that encode for a concept of interest. What’s more, the method can then manipulate, or “steer,” these connections to strengthen or weaken the concept in any answer a model is prompted to give.

The team proved their method could quickly root out and steer more than 500 general concepts in some of the largest LLMs used today. For instance, the researchers could home in on a model’s representations for personalities such as “social influencer” and “conspiracy theorist,” and stances such as “fear of marriage” and “fan of Boston.” They could then tune these representations to enhance or minimize the concepts in any answers that a model generates.

In the case of the “conspiracy theorist” concept, the team successfully identified a representation of this concept within one of the largest vision language models available today. When they enhanced the representation, and then prompted the model to explain the origins of the famous “Blue Marble” image of Earth taken from Apollo 17, the model generated an answer with the tone and perspective of a conspiracy theorist.

The team acknowledges there are risks to extracting certain concepts, which they also illustrate (and caution against). Overall, however, they see the new approach as a way to illuminate hidden concepts and potential vulnerabilities in LLMs, that could then be turned up or down to improve a model’s safety or enhance its performance.

“What this really says about LLMs is that they have these concepts in them, but they’re not all actively exposed,” says Adityanarayanan “Adit” Radhakrishnan, assistant professor of mathematics at MIT. “With our method, there’s ways to extract these different concepts and activate them in ways that prompting cannot give you answers to.”

The team published their findings today in a study appearing in the journal Science. The study’s co-authors include Radhakrishnan, Daniel Beaglehole and Mikhail Belkin of UC San Diego, and Enric Boix-Adserà of the University of Pennsylvania.

A fish in a black box

As use of OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude, and other artificial intelligence assistants has exploded, scientists are racing to understand how models represent certain abstract concepts such as “hallucination” and “deception.” In the context of an LLM, a hallucination is a response that is false or contains misleading information, which the model has “hallucinated,” or constructed erroneously as fact.

To find out whether a concept such as “hallucination” is encoded in an LLM, scientists have often taken an approach of “unsupervised learning” — a type of machine learning in which algorithms broadly trawl through unlabeled representations to find patterns that might relate to a concept such as “hallucination.” But to Radhakrishnan, such an approach can be too broad and computationally expensive.

“It’s like going fishing with a big net, trying to catch one species of fish. You’re gonna get a lot of fish that you have to look through to find the right one,” he says. “Instead, we’re going in with bait for the right species of fish.”

He and his colleagues had previously developed the beginnings of a more targeted approach with a type of predictive modeling algorithm known as a recursive feature machine (RFM). An RFM is designed to directly identify features or patterns within data by leveraging a mathematical mechanism that neural networks — a broad category of AI models that includes LLMs — implicitly use to learn features.

Since the algorithm was an effective, efficient approach for capturing features in general, the team wondered whether they could use it to root out representations of concepts in LLMs, which are by far the most widely used type of neural network and perhaps the least well understood.

“We wanted to apply our feature learning algorithms to LLMs to, in a targeted way, discover representations of concepts in these large and complex models,” Radhakrishnan says.

Converging on a concept

The team’s new approach identifies any concept of interest within an LLM and “steers,” or guides, a model’s response based on this concept. The researchers looked for 512 concepts within five classes: fears (such as of marriage, insects, and even buttons); experts (social influencer, medievalist); moods (boastful, detachedly amused); a preference for locations (Boston, Kuala Lumpur); and personas (Ada Lovelace, Neil deGrasse Tyson).

The researchers then searched for representations of each concept in several of today’s large language and vision models. They did so by training RFMs to recognize numerical patterns in an LLM that could represent a particular concept of interest.

A standard large language model is, broadly, a neural network that takes a natural language prompt, such as “Why is the sky blue?” and divides the prompt into individual words, each of which is encoded mathematically as a list, or vector, of numbers. The model takes these vectors through a series of computational layers, creating matrices of many numbers that, throughout each layer, are used to identify other words that are most likely to be used to respond to the original prompt. Eventually, the layers converge on a set of numbers that is decoded back into text, in the form of a natural language response.

The team’s approach trains RFMs to recognize numerical patterns in an LLM that could be associated with a specific concept. As an example, to see whether an LLM contains any representation of a “conspiracy theorist,” the researchers would first train the algorithm to identify patterns among LLM representations of 100 prompts that are clearly related to conspiracies, and 100 other prompts that are not. In this way, the algorithm would learn patterns associated with the conspiracy theorist concept. Then, the researchers can mathematically modulate the activity of the conspiracy theorist concept by perturbing LLM representations with these identified patterns. 
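The extract-and-steer recipe described above can be sketched in a few lines. This is a minimal illustration on synthetic activations, not the authors’ RFM code: the difference-of-means direction is a simplifying stand-in for the feature patterns an RFM would learn, and the dimensions, prompt counts, and `alpha` scale are all illustrative assumptions.

```python
import numpy as np

def steering_direction(pos_acts, neg_acts):
    # Stand-in for the RFM step: a single direction in activation space
    # that separates concept-related prompts from neutral ones. (The real
    # method learns richer feature patterns; difference-of-means is the
    # simplest illustrative choice.)
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def steer(hidden, direction, alpha):
    # Perturb a hidden state along the concept direction.
    # alpha > 0 amplifies the concept in the output; alpha < 0 suppresses it.
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

# Toy data standing in for LLM activations: 100 "concept" prompts and
# 100 neutral prompts, embedded in an 8-dimensional activation space.
rng = np.random.default_rng(0)
dim = 8
concept_axis = rng.normal(size=dim)
pos = rng.normal(size=(100, dim)) + concept_axis  # concept-related prompts
neg = rng.normal(size=(100, dim))                 # neutral prompts

direction = steering_direction(pos, neg)
h = rng.normal(size=dim)                  # a hidden state mid-forward-pass
h_steered = steer(h, direction, alpha=3.0)
```

In a real model, the same perturbation would be applied to the hidden states at one or more layers during generation, shifting every answer toward (or away from) the concept.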

The method can be applied to search for and manipulate any general concept in an LLM. Among many examples, the researchers identified representations of a “conspiracy theorist” and manipulated an LLM to answer in that tone and perspective. They also identified and enhanced the concept of “anti-refusal,” and showed that a model that would normally refuse certain prompts instead answered them, for instance by giving instructions on how to rob a bank.

Radhakrishnan says the approach can be used to quickly search for and minimize vulnerabilities in LLMs. It can also be used to enhance certain traits, personalities, moods, or preferences, such as emphasizing the concept of “brevity” or “reasoning” in any response an LLM generates. The team has made the method’s underlying code publicly available.

“LLMs clearly have a lot of these abstract concepts stored within them, in some representation,” Radhakrishnan says. “There are ways where, if we understand these representations well enough, we can build highly specialized LLMs that are still safe to use but really effective at certain tasks.”

This work was supported, in part, by the National Science Foundation, the Simons Foundation, the TILOS institute, and the U.S. Office of Naval Research. 

A neural blueprint for human-like intelligence in soft robots

MIT Latest News - Thu, 02/19/2026 - 12:55pm

A new artificial intelligence control system enables soft robotic arms to learn a wide repertoire of motions and tasks once, then adjust to new scenarios on the fly, without needing retraining or sacrificing functionality. 

This breakthrough brings soft robotics closer to human-like adaptability for real-world applications, such as in assistive robotics, rehabilitation robots, and wearable or medical soft robots, by making them more intelligent, versatile, and safe.

The work was led by the Mens, Manus and Machina (M3S) interdisciplinary research group — a play on the Latin MIT motto “mens et manus,” or “mind and hand,” with the addition of “machina” for “machine” — within the Singapore-MIT Alliance for Research and Technology. Co-leading the project are researchers from the National University of Singapore (NUS), alongside collaborators from MIT and Nanyang Technological University in Singapore (NTU Singapore).

Unlike regular robots that move using rigid motors and joints, soft robots are made from flexible materials such as soft rubber and move using special actuators — components that act like artificial muscles to produce physical motion. While their flexibility makes them ideal for delicate or adaptive tasks, controlling soft robots has always been a challenge because their shape changes in unpredictable ways. Real-world environments are often complicated and full of unexpected disturbances, and even small changes in conditions — like a shift in weight, a gust of wind, or a minor hardware fault — can throw off their movements. 

Despite substantial progress in soft robotics, existing approaches often can only achieve one or two of the three capabilities needed for soft robots to operate intelligently in real-world environments: using what they’ve learned from one task to perform a different task, adapting quickly when the situation changes, and guaranteeing that the robot will stay stable and safe while adapting its movements. This lack of adaptability and reliability has been a major barrier to deploying soft robots in real-world applications until now.

In an open-access study titled “A general soft robotic controller inspired by neuronal structural and plastic synapses that adapts to diverse arms, tasks, and perturbations,” published Jan. 6 in Science Advances, the researchers describe how they developed a new AI control system that allows soft robots to adapt across diverse tasks and disturbances. The study takes inspiration from the way the human brain learns and adapts, and was built on extensive research in learning-based robotic control, embodied intelligence, soft robotics, and meta-learning.

The system uses two complementary sets of “synapses” — connections that adjust how the robot moves — working in tandem. The first set, known as “structural synapses,” is trained offline on a variety of foundational movements, such as bending or extending a soft arm smoothly. These form the robot’s built‑in skills and provide a strong, stable foundation. The second set, called “plastic synapses,” continually updates online as the robot operates, fine-tuning the arm’s behavior to respond to what is happening in the moment. A built-in stability measure acts like a safeguard, so even as the robot adjusts during online adaptation, its behavior remains smooth and controlled.
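The two-synapse idea can be sketched as a simple linear controller. This is an illustrative toy, not the paper’s actual equations: the fixed matrix stands in for the offline-trained “structural synapses,” the online-updated delta for the “plastic synapses,” and a norm bound stands in for the stability safeguard; the class name, learning rule, and bound are all assumptions.

```python
import numpy as np

class TwoSynapseController:
    """Toy sketch of a structural + plastic synapse controller."""

    def __init__(self, w_struct, lr=0.1, max_plastic_norm=1.0):
        self.w_struct = w_struct                  # fixed, trained offline
        self.w_plastic = np.zeros_like(w_struct)  # adapted online
        self.lr = lr
        self.max_norm = max_plastic_norm          # stand-in stability bound

    def act(self, state):
        # Both synapse sets contribute to the actuation command.
        return (self.w_struct + self.w_plastic) @ state

    def adapt(self, state, error):
        # Simple gradient-style online update from the observed tracking error.
        self.w_plastic -= self.lr * np.outer(error, state)
        # Clip the plastic component so adaptation stays bounded,
        # mimicking the built-in stability safeguard.
        n = np.linalg.norm(self.w_plastic)
        if n > self.max_norm:
            self.w_plastic *= self.max_norm / n
```

The design point is that the structural part never changes at run time, so however aggressively the plastic part adapts, the controller can never drift arbitrarily far from its validated baseline behavior.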

“Soft robots hold immense potential to take on tasks that conventional machines simply cannot, but true adoption requires control systems that are both highly capable and reliably safe. By combining structural learning with real-time adaptiveness, we’ve created a system that can handle the complexity of soft materials in unpredictable environments,” says MIT Professor Daniela Rus, co-lead principal investigator at M3S, director of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), and co-corresponding author of the paper. “It’s a step closer to a future where versatile soft robots can operate safely and intelligently alongside people — in clinics, factories, or everyday lives.”

“This new AI control system is one of the first general soft-robot controllers that can achieve all three key aspects needed for soft robots to be used in society and various industries. It can apply what it learned offline across different tasks, adapt instantly to new conditions, and remain stable throughout — all within one control framework,” says Associate Professor Zhiqiang Tang, first author and co-corresponding author of the paper, who was a postdoc at M3S and at NUS when he carried out the research and is now an associate professor at Southeast University in China (SEU China).

The system supports multiple task types, enabling soft robotic arms to execute trajectory tracking, object placement, and whole-body shape regulation within one unified approach. The method also generalizes across different soft-arm platforms, demonstrating cross-platform applicability. 

The system was tested and validated on two physical platforms — a cable-driven soft arm and a shape-memory-alloy–actuated soft arm — and delivered impressive results. It achieved a 44–55 percent reduction in tracking error under heavy disturbances; over 92 percent shape accuracy under payload changes, airflow disturbances, and actuator failures; and stable performance even when up to half of the actuators failed. 

“This work redefines what’s possible in soft robotics. We’ve shifted the paradigm from task-specific tuning and capabilities toward a truly generalizable framework with human-like intelligence. It is a breakthrough that opens the door to scalable, intelligent soft machines capable of operating in real-world environments,” says Professor Cecilia Laschi, co-corresponding author and principal investigator at M3S, Provost’s Chair Professor in the NUS Department of Mechanical Engineering at the College of Design and Engineering, and director of the NUS Advanced Robotics Centre.

This breakthrough opens the door to more robust soft robotic systems in manufacturing, logistics, inspection, and medical robotics that do not need constant reprogramming, reducing downtime and costs. In health care, assistive and rehabilitation devices can automatically tailor their movements to a patient’s changing strength or posture, while wearable or medical soft robots can respond more sensitively to individual needs, improving safety and patient outcomes.

The researchers plan to extend this technology to robotic systems or components that can operate at higher speeds and in more complex environments, with potential applications in assistive robotics, medical devices, and industrial soft manipulators, as well as integration into real-world autonomous systems.

The research conducted at SMART was supported by the National Research Foundation Singapore under its Campus for Research Excellence and Technological Enterprise program.

Malicious AI

Schneier on Security - Thu, 02/19/2026 - 7:05am

Interesting:

Summary: An AI agent of unknown ownership autonomously wrote and published a personalized hit piece about me after I rejected its code, attempting to damage my reputation and shame me into accepting its changes into a mainstream python library. This represents a first-of-its-kind case study of misaligned AI behavior in the wild, and raises serious concerns about currently deployed AI agents executing blackmail threats.

Part 2 of the story. And a Wall Street Journal article.

Europe defies Trump team over IEA climate fight

ClimateWire News - Thu, 02/19/2026 - 6:29am
European leaders pushed back after Energy Secretary Chris Wright threatened to quit the agency for using climate modeling in its forecasts.

Alabama echoes Trump with bid to limit regulatory science

ClimateWire News - Thu, 02/19/2026 - 6:28am
The state is the latest to tie pollution curbs to a federal ceiling — with restrictions on the science that regulators can use to inform new rules.

Nonprofit throws its weight behind Arctic geoengineering

ClimateWire News - Thu, 02/19/2026 - 6:26am
Ocean Visions is funding six research projects that will examine ways to cool the region or preserve sea ice.

Tech companies overstate AI’s climate benefits, report says

ClimateWire News - Thu, 02/19/2026 - 6:25am
The “evidence of massive climate benefits for AI is weak, whilst the evidence of substantial harm is strong,” green groups say.

States sue Trump admin for revoked energy funds

ClimateWire News - Thu, 02/19/2026 - 6:25am
The Trump administration blocked $2.7 billion in clean energy funding to states.

Enviros, health groups are first to sue over Trump’s big climate rollback

ClimateWire News - Thu, 02/19/2026 - 6:24am
Young climate activists are also going to court over EPA’s repeal of a landmark Obama-era scientific finding.

Calif. lawmakers revive push to require coverage for wildfire-ready properties

ClimateWire News - Thu, 02/19/2026 - 6:22am
Previous versions of the mandate have stalled in the Legislature amid heavy industry opposition.

Olympic skiers voice concern over receding glaciers

ClimateWire News - Thu, 02/19/2026 - 6:22am
“Most of the glaciers that I used to ski on are pretty much gone,” Lindsey Vonn said.

Reform UK vows to scrap Britain’s carbon border tax

ClimateWire News - Thu, 02/19/2026 - 6:21am
Business groups warn that ditching the scheme could backfire.

EV sales boom as Ethiopia bans gas-powered car imports

ClimateWire News - Thu, 02/19/2026 - 6:20am
In the two years since the ban, EV adoption has grown from less than 1 percent to nearly 6 percent of all the vehicles on the road.

Parking-aware navigation system could prevent frustration and emissions

MIT Latest News - Thu, 02/19/2026 - 12:00am

It happens every day — a motorist heading across town checks a navigation app to see how long the trip will take, only to find no parking spots available upon reaching their destination. By the time they finally park and walk to their destination, they’re significantly later than they expected to be.

Most popular navigation systems send drivers to a location without considering the extra time that could be needed to find parking. This causes more than just a headache for drivers. It can worsen congestion and increase emissions by causing motorists to cruise around looking for a parking spot. This underestimation could also discourage people from taking mass transit because they don’t realize it might be faster than driving and parking.

MIT researchers tackled this problem by developing a system that can be used to identify parking lots that offer the best balance of proximity to the desired location and likelihood of parking availability. Their adaptable method points users to the ideal parking area rather than their destination.

In simulated tests with real-world traffic data from Seattle, this technique achieved time savings of up to 66 percent in the most congested settings. For a motorist, this would reduce travel time by about 35 minutes, compared to waiting for a spot to open in the closest parking lot.

While the system isn’t ready for real-world deployment yet, the demonstrations show the viability of this approach and indicate how it could be implemented.

“This frustration is real and felt by a lot of people, and the bigger issue here is that systematically underestimating these drive times prevents people from making informed choices. It makes it that much harder for people to make shifts to public transit, bikes, or alternative forms of transportation,” says MIT graduate student Cameron Hickert, lead author on a paper describing the work.

Hickert is joined on the paper by Sirui Li PhD ’25; Zhengbing He, a research scientist in the Laboratory for Information and Decision Systems (LIDS); and senior author Cathy Wu, the Class of 1954 Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS) at MIT, and a member of LIDS. The research appears today in Transactions on Intelligent Transportation Systems.

Probable parking

To solve the parking problem, the researchers developed a probability-aware approach that considers all possible public parking lots near a destination, the distance to drive there from a point of origin, the distance to walk from each lot to the destination, and the likelihood of parking success.

The approach, based on dynamic programming, works backward from good outcomes to calculate the best route for the user.

Their method also considers the case where a user arrives at the ideal parking lot but can’t find a space. It takes into account the distance to other parking lots and the probability of parking success at each.

“If there are several lots nearby that have slightly lower probabilities of success, but are very close to each other, it might be a smarter play to drive there rather than going to the higher-probability lot and hoping to find an opening. Our framework can account for that,” Hickert says.

In the end, their system can identify the optimal lot that has the lowest expected time required to drive, park, and walk to the destination.
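The backward calculation described in this section can be sketched under simple assumptions: each candidate lot has a known success probability, a walk time to the destination, and a drive time to a fallback lot, and the driver visits lots in a fixed order, waiting at the last one if it is full. All the numbers and the single fallback chain are illustrative; the actual framework also optimizes the choice of lots and, as discussed next, models other drivers.

```python
def expected_time(drive_from_origin, lots, wait_if_full):
    """Expected drive + park + walk time for visiting `lots` in order.

    lots: list of dicts with keys
      p          - probability of finding a spot on arrival
      walk       - walking time from the lot to the destination
      next_drive - driving time to the next fallback lot (absent for the last)
    wait_if_full: expected wait for a spot to open at the final lot
    """
    # Base case: at the last lot, either park immediately or wait for a spot.
    last = lots[-1]
    e = last["p"] * last["walk"] + (1 - last["p"]) * (wait_if_full + last["walk"])
    # Work backward (dynamic programming): at each earlier lot, either park
    # and walk, or fail and drive on, incurring the downstream expected time.
    for lot in reversed(lots[:-1]):
        e = lot["p"] * lot["walk"] + (1 - lot["p"]) * (lot["next_drive"] + e)
    return drive_from_origin + e

# Illustrative scenario: a close lot with a 50 percent chance of success,
# backed up by a farther lot that always has space.
lots = [
    {"p": 0.5, "walk": 4.0, "next_drive": 2.0},
    {"p": 1.0, "walk": 6.0},
]
total = expected_time(8.0, lots, wait_if_full=10.0)
```

Comparing this expected total across different first lots (and visiting orders) yields the lot with the lowest expected drive-park-walk time.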

But no motorist expects to be the only one trying to park in a busy city center. So, this method also incorporates the actions of other drivers, which affect the user’s probability of parking success.

For instance, another driver may arrive at the user’s ideal lot first and take the last parking spot. Or another motorist could try parking in another lot but then park in the user’s ideal lot if unsuccessful. In addition, another motorist may park in a different lot and cause spillover effects that lower the user’s chances of success.

“With our framework, we show how you can model all those scenarios in a very clean and principled manner,” Hickert says.

Crowdsourced parking data

The data on parking availability could come from several sources. For example, some parking lots have magnetic detectors or gates that track the number of cars entering and exiting.

But such sensors aren’t widely used, so to make their system more feasible for real-world deployment, the researchers studied the effectiveness of using crowdsourced data instead.

For instance, users could indicate available parking using an app. Data could also be gathered by tracking the number of vehicles circling to find parking, or how many enter a lot and exit after being unsuccessful.

Someday, autonomous vehicles could even report on open parking spots they drive by.

“Right now, a lot of that information goes nowhere. But if we could capture it, even by having someone simply tap ‘no parking’ in an app, that could be an important source of information that allows people to make more informed decisions,” Hickert adds.

The researchers evaluated their system using real-world traffic data from the Seattle area, simulating different times of day in a congested urban setting and a suburban area. In congested settings, their approach cut total travel time by about 60 percent compared to sitting and waiting for a spot to open, and by about 20 percent compared to a strategy of continually driving to the next closest parking lot.

They also found that crowdsourced observations of parking availability would have an error rate of only about 7 percent, compared to actual parking availability. This indicates it could be an effective way to gather parking probability data.

In the future, the researchers want to conduct larger studies using real-time route information in an entire city. They also want to explore additional avenues for gathering data on parking availability, such as using satellite images, and estimate potential emissions reductions.

“Transportation systems are so large and complex that they are really hard to change. What we look for, and what we found with this approach, is small changes that can have a big impact to help people make better choices, reduce congestion, and reduce emissions,” says Wu.

This research was supported, in part, by Cintra, the MIT Energy Initiative, and the National Science Foundation.

How MIT OpenCourseWare is fueling one learner’s passion for education

MIT Latest News - Wed, 02/18/2026 - 7:40pm

Training for a clerical military role in France, Gustavo Barboza felt a spark he couldn’t ignore. He remembered his love of learning, which once guided him through two college semesters of mechanical engineering courses in his native Colombia, coupled with supplemental resources from MIT Open Learning’s OpenCourseWare. Now, thousands of miles away, he realized it was time to follow that spark again.

“I wasn’t ready to sit down in the classroom,” says Barboza, remembering his initial foray into higher education. “I left to try and figure out life. I realized I wanted more adventure.”

Joining the military in France in 2017 was his answer. For the first three years of service, he was very military-minded, only focused on his training and deployments. With more seniority, he took on more responsibilities, and eventually was sent to take a four-month training course on military correspondence and software. 

“I reminded myself that I like to study,” he says. “I started to go back to OpenCourseWare because I knew in the back of my mind that these very complete courses were out there.”

At that point, Barboza realized that military service was only a chapter in his life, and the next would lead him back to learning. He was still interested in engineering, and knew that MIT OpenCourseWare could help prepare him for what was next. 

He dove into OpenCourseWare’s free, online, open educational resources — which cover nearly the entire MIT curriculum — including classical mechanics, intro to electrical engineering, and single variable calculus with David Jerison, which he says was his most-visited resource. These allowed him to brush up on old skills and learn new ones, helping him tremendously in preparing for college entrance exams and his first-year courses. 

Now in his third year at Grenoble-Alpes University, Barboza studies electrical engineering, a shift from his initial interest in mechanical engineering.

“There is an OpenCourseWare lecture that explains all the specializations you can get into with electrical engineering,” he says. “They go from very natural things to things like microprocessors. What interests me is that if someone says they are an electrical engineer, there are so many different things they could be doing.” 

At this point in his academic career, Barboza is most interested in microelectronics and the study of radio frequencies and electromagnetic waves. But he admits he has more to learn and is open to where his studies may take him. 

MIT OpenCourseWare remains a valuable resource, he says. When thinking about his future, he checks out graduate course listings and considers the different paths he might take. When he is having trouble with a certain concept, he looks for a lecture on the subject, undeterred by the differences between French and U.S. conventions.  

“Of course, the science doesn't change, but the way you would write an equation or draw a circuit is different at my school in France versus what I see from MIT. So, you have to be careful,” he explains. “But it is still the first place I visit for problem sets, readings, and lecture notes. It’s amazing.”

The thoroughness and openness of MIT Open Learning’s courses and resources — like OpenCourseWare — stand out to Barboza. In the wide world of the internet, he has found resources from other universities, but he says their offerings are not as robust. And in a time of disinformation and questionable sources, he appreciates that MIT values transparency, accessibility, and knowledge. 

“Human knowledge has never been more accessible,” he says. “MIT puts coursework online and says, ‘here’s what we do.’ As long as you have an internet connection, you can learn all of it.”

“I just feel like MIT OpenCourseWare is what the internet was originally for,” Barboza continues. “A network for sharing knowledge. I’m a big fan.”

Explore lifelong learning opportunities from MIT, including courses, resources, and professional programs, on MIT Learn.

AI Found Twelve New Vulnerabilities in OpenSSL

Schneier on Security - Wed, 02/18/2026 - 7:03am

The title of the post is “What AI Security Research Looks Like When It Works,” and I agree:

In the latest OpenSSL security release on January 27, 2026, twelve new zero-day vulnerabilities (meaning unknown to the maintainers at time of disclosure) were announced. Our AI system is responsible for the original discovery of all twelve, each found and responsibly disclosed to the OpenSSL team during the fall and winter of 2025. Of those, 10 were assigned CVE-2025 identifiers and 2 received CVE-2026 identifiers. Adding the 10 to the three we already found in the ...

Big Tech meets Big Oil: Self-driving trucks roar into the Permian Basin

ClimateWire News - Wed, 02/18/2026 - 6:23am
West Texas has been a longtime cash cow for the oil industry. Now it holds promise for automated trucking companies, too.

EPA docs: 47 climate staffers reassigned

ClimateWire News - Wed, 02/18/2026 - 6:20am
Internal records show how the agency dispersed its climate employees under Administrator Lee Zeldin.

Emails show DHS agreed to restore canceled disaster grant program

ClimateWire News - Wed, 02/18/2026 - 6:19am
But the Trump administration has taken no steps to comply with a judge's order to allocate billions of dollars it withheld from local projects, Democratic attorneys general say.

Elizabeth Warren questions a company’s effort to sell flood insurance

ClimateWire News - Wed, 02/18/2026 - 6:17am
The Senate Democrat raised concerns after company execs visited the White House and projected the demise of government flood insurance.
