Feed aggregator
Study: Platforms that rank the latest LLMs can be unreliable
A firm that wants to use a large language model (LLM) to summarize sales reports or triage customer inquiries can choose between hundreds of unique LLMs with dozens of model variations, each with slightly different performance.
To narrow down the choice, companies often rely on LLM ranking platforms, which gather user feedback on model interactions to rank the latest LLMs based on how they perform on certain tasks.
But MIT researchers found that a handful of user interactions can skew the results, leading someone to mistakenly believe one LLM is the ideal choice for a particular use case. Their study reveals that removing a tiny fraction of crowdsourced data can change which models are top-ranked.
They developed a fast method to test ranking platforms and determine whether they are susceptible to this problem. The evaluation technique identifies the individual votes most responsible for skewing the results so users can inspect these influential votes.
The researchers say this work underscores the need for more rigorous strategies to evaluate model rankings. While they didn’t focus on mitigation in this study, they provide suggestions that may improve the robustness of these platforms, such as gathering more detailed feedback to create the rankings.
The study also offers a word of warning to users who may rely on rankings when making decisions about LLMs that could have far-reaching and costly impacts on a business or organization.
“We were surprised that these ranking platforms were so sensitive to this problem. If it turns out the top-ranked LLM depends on only two or three pieces of user feedback out of tens of thousands, then one can’t assume the top-ranked LLM is going to be consistently outperforming all the other LLMs when it is deployed,” says Tamara Broderick, an associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS); a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society; an affiliate of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author of this study.
She is joined on the paper by lead authors and EECS graduate students Jenny Huang and Yunyi Shen as well as Dennis Wei, a senior research scientist at IBM Research. The study will be presented at the International Conference on Learning Representations.
Dropping data
While there are many types of LLM ranking platforms, the most popular variations ask users to submit a query to two models and pick which LLM provides the better response.
The platforms aggregate the results of these matchups to produce rankings that show which LLM performed best on certain tasks, such as coding or visual understanding.
By choosing a top-performing LLM, a user likely expects that model’s top ranking to generalize, meaning it should outperform other models on their similar, but not identical, application with a set of new data.
The MIT researchers previously studied generalization in areas like statistics and economics. That work revealed certain cases where dropping a small percentage of data can change a model’s results, indicating that those studies’ conclusions might not hold beyond their narrow setting.
The researchers wanted to see if the same analysis could be applied to LLM ranking platforms.
“At the end of the day, a user wants to know whether they are choosing the best LLM. If only a few prompts are driving this ranking, that suggests the ranking might not be the end-all-be-all,” Broderick says.
But it would be impossible to test the data-dropping phenomenon manually. For instance, one ranking they evaluated had more than 57,000 votes. Testing a data drop of 0.1 percent means removing each subset of 57 votes out of the 57,000, (there are more than 10194 subsets), and then recalculating the ranking.
Instead, the researchers developed an efficient approximation method, based on their prior work, and adapted it to fit LLM ranking systems.
“While we have theory to prove the approximation works under certain assumptions, the user doesn’t need to trust that. Our method tells the user the problematic data points at the end, so they can just drop those data points, re-run the analysis, and check to see if they get a change in the rankings,” she says.
Surprisingly sensitive
When the researchers applied their technique to popular ranking platforms, they were surprised to see how few data points they needed to drop to cause significant changes in the top LLMs. In one instance, removing just two votes out of more than 57,000, which is 0.0035 percent, changed which model is top-ranked.
A different ranking platform, which uses expert annotators and higher quality prompts, was more robust. Here, removing 83 out of 2,575 evaluations (about 3 percent) flipped the top models.
Their examination revealed that many influential votes may have been a result of user error. In some cases, it appeared there was a clear answer as to which LLM performed better, but the user chose the other model instead, Broderick says.
“We can never know what was in the user’s mind at that time, but maybe they mis-clicked or weren’t paying attention, or they honestly didn’t know which one was better. The big takeaway here is that you don’t want noise, user error, or some outlier determining which is the top-ranked LLM,” she adds.
The researchers suggest that gathering additional feedback from users, such as confidence levels in each vote, would provide richer information that could help mitigate this problem. Ranking platforms could also use human mediators to assess crowdsourced responses.
For the researchers’ part, they want to continue exploring generalization in other contexts while also developing better approximation methods that can capture more examples of non-robustness.
“Broderick and her students’ work shows how you can get valid estimates of the influence of specific data on downstream processes, despite the intractability of exhaustive calculations given the size of modern machine-learning models and datasets,” says Jessica Hullman, the Ginni Rometty Professor of Computer Science at Northwestern University, who was not involved with this work. “The recent work provides a glimpse into the strong data dependencies in routinely applied — but also very fragile — methods for aggregating human preferences and using them to update a model. Seeing how few preferences could really change the behavior of a fine-tuned model could inspire more thoughtful methods for collecting these data.”
This research is funded, in part, by the Office of Naval Research, the MIT-IBM Watson AI Lab, the National Science Foundation, Amazon, and a CSAIL seed award.
How MIT’s 10th president shaped the Cold War
Today, MIT plays a key role in maintaining U.S. competitiveness, technological leadership, and national defense — and much of the Institute’s work to support the nation’s standing in these areas can be traced back to 1953.
Two months after he took office that year, U.S. President Dwight Eisenhower received a startling report from the military: The USSR had successfully exploded a nuclear bomb nine months sooner than intelligence sources had predicted. The rising Communist power had also detonated a hydrogen bomb using development technology more sophisticated than that of the U.S. And lastly, there was evidence of a new Soviet bomber that rivaled the B-52 in size and range — and the aircraft was of an entirely original design from within the USSR. There was, the report concluded, a significant chance of a surprise nuclear attack on the United States.
Eisenhower’s understanding of national security was vast (he had led the Allies to victory in World War II and served as the first supreme commander of NATO), but the connections he’d made during his two-year stint as president of Columbia University would prove critical to navigating the emerging challenges of the Cold War. He sent his advisors in search of a plan for managing this threat, and he suggested they start with James Killian, then president of MIT.
Killian had an unlikely path to the presidency of MIT. “He was neither a scientist nor an engineer,” says David Mindell, the Dibner Professor of the History of Engineering and Manufacturing and a professor of aeronautics and astronautics at MIT. “But Killian turned out to be a truly gifted administrator.”
While he was serving as editor of MIT Technology Review (where he founded what became the MIT Press), Killian was tapped by then-president Karl Compton to join his staff. As the war effort ramped up on the MIT campus in the 1940s, Compton deputized Killian to lead the RadLab — a 4,000-person effort to develop and deploy the radar systems that proved decisive in the Allied victory.
Killian was named MIT’s 10th president in 1948. In 1951, he launched MIT Lincoln Laboratory, a federally funded research center where MIT and U.S. Air Force scientists and engineers collaborated on new air defense technologies to protect the nation against a nuclear attack.
Two years later, within weeks of Eisenhower’s 1953 request, Killian convened a group of leading scientists at MIT. The group proposed a three-part study: The U.S. needed to reassess its offensive capabilities, its continental defense, and its intelligence operations. Eisenhower agreed.
Killian mobilized 42 engineers and scientists from across the country into three panels matching the committee’s charge. Between September 1954 and February 1955, the panels held 307 meetings with every major defense and intelligence organization in the U.S. government. They had unrestricted access to every project, plan, and program involving national defense. The result, a 190-page report titled “Meeting the Threat of a Surprise Attack,” was delivered to Eisenhower’s desk on Feb. 14, 1955.
The Killian Report, as it came to be known, would go on to play a dramatic role in defining the frontiers of military technology, intelligence gathering, national security policy, and global affairs over the next several decades. Killian’s input would also have dramatic impacts on Eisenhower’s presidency and the relationship between the federal government and higher education.
Foreseeing an evolving competition
The Killian Report opens by anticipating four projected “periods” in the shifting balance of power between the U.S. and the Soviet Union.
In 1955, the U.S. had a decided offensive advantage over the USSR, but it was overly vulnerable to surprise attack. In 1956 and 1957, the U.S. would have an even larger offensive advantage and be only somewhat less vulnerable to surprise. By 1960, the U.S.’ offensive advantage would be narrower, but it would be in a better position to anticipate an attack. Within a decade, the report stated, the two nations would enter “Period IV” — during which “an attack by either side would result in mutual destruction … [a period] so fraught with danger to the U.S. that we should push all promising technological development so that we may stay in Periods II and III as long as possible.”
The report went on to make extensive, detailed recommendations — accelerated development of intercontinental ballistic missiles and high-energy aircraft fuels, expansion and increased ground security for “delivery system” facilities, increased cooperation with Canada and more studies about establishing monitoring stations on polar pack ice, and “studies directed toward better understanding of the radiological hazards that may result from the detonation of large numbers of nuclear weapons,” among others.
“Eisenhower really wanted to draw the perspectives of scientists and engineers into his decision-making,” says Mindell. “Generals and admirals tend to ask for more arms and more boots on the ground. The president didn’t want to be held captive by these views — and Killian’s report really delivered this for him.”
On the day it arrived, President Eisenhower circulated the Killian Report to the head of every department and agency in the federal government and asked them to comment on its recommendations. The Cold War arms race was on — and it would be between scientists and engineers in the United States and those in the Soviet Union.
An odd couple
The Killian Report made many recommendations based on “the correctness of the current national intelligence estimates” — even though “Eisenhower was frustrated with his whole intelligence apparatus,” says Will Hitchcock, the James Madison Professor of History at the University of Virginia and author of “The Age of Eisenhower.” “He felt it was still too much World War II ‘exploding-cigar’ stuff. There wasn’t enough work on advance warning, on seeing what’s over the hill. But that’s what Eisenhower really wanted to know.” The surprise attack on Pearl Harbor still lingered in the minds of many Americans, Hitchcock notes, and “that needed to be avoided.”
Killian needed an aggressive, innovative thinker to assess U.S. intelligence, so he turned to Edwin Land. The cofounder of Polaroid, Land was an astonishingly bold engineer and inventor. He also had military experience, having developed new ordnance targeting systems, aerial photography devices, and other photographic and visual surveillance technologies during World War II. Killian approached Land knowing their methods and work style were quite different. (When the offer to lead the intelligence panel was made, Land was in Hollywood advising filmmakers on the development of 3D movies; Land told Killian he had a personal rule that any committee he served on “must fit into a taxicab.”)
In fall 1954, Land and his five-person panel quickly confirmed Killian and Eisenhower’s suspicions: “We would go in and interview generals and admirals in charge of intelligence and come away worried,” Land reported to Killian later. “We were [young scientists] asking questions — and they couldn’t answer them.” Killian and Land realized this would set their report and its recommendations on a complicated path: While they needed to acknowledge and address the challenges of broadly upgrading intelligence activities, they also needed to make rapid progress on responding to the Soviet threat.
As work on the report progressed, Land and Killian held briefings with Eisenhower. They used these meetings to make two additional proposals — neither of which, President Eisenhower decided, would be spelled out in the final report for security reasons. The first was the development of missile-firing submarines, a long-term prospect that would take a decade to complete. (The technology developed for Polaris-class submarines, Mindell notes, transferred directly to the rockets that powered the Apollo program to the moon.)
The second proposal — to fast-track development of the U-2, a new high-altitude spy plane —could be accomplished within a year, Land told Eisenhower. The president agreed to both ideas, but he put a condition on the U-2 program. As Killian later wrote: “The president asked that it should be handled in an unconventional way so that it would not become entangled in the bureaucracy of the Defense Department or troubled by rivalries among the services.”
Powered by Land’s revolutionary imaging devices, the U-2 would become a critical tool in the U.S.’ ability to assess and understand the Soviet Union’s nuclear capacity. But the spy plane would also go on to have disastrous consequences for the peace process and for Eisenhower.
The aftermath(s)
The Killian Report has a very complex legacy, says Christopher Capozzola, the Elting Morison Professor of History. “There is a series of ironies about the whole undertaking,” he says. “For example, Eisenhower was trying to tamp down interservice rivalries by getting scientists to decide things. But within a couple of years those rivalries have all gotten worse.” Similarly, Capozzola notes, Eisenhower — who famously coined the phrase “military-industrial complex” and warned against it — amplified the militarization of scientific research “more than anyone else.”
Another especially painful irony emerged on May 1, 1960. Two weeks before a meeting between Eisenhower and Khrushchev in Paris to discuss how the U.S. and USSR could ease Cold War tensions and slow the arms race, a U-2 was shot down in Soviet airspace. After a public denial by the U.S. that the aircraft was being used for espionage, the Soviets produced the plane’s wreckage, cameras, and pilot — who admitted he was working for the CIA. The peace process, which had become the centerpiece of Eisenhower’s intended legacy, collapsed.
There were also some brighter outcomes of the Killian Report, Capozzola says. It marked a dramatic reset of the national government’s relationship with academic scientists and engineers — and with MIT specifically. “The report really greased the wheels between MIT scientists and Washington,” he notes. “Perhaps more than the report itself, the deep structures and relationships that Killian set up had implications for MIT and other research universities. They started to orient their missions toward the national interest,” he adds.
The report also cemented Eisenhower’s relationship with Killian. After the launch of Sputnik, which induced a broad public panic in the U.S. about Soviet scientific capabilities, the president called on Killian to guide the national response. Eisenhower later named Killian the first special assistant to the president for science and technology. In the years that followed, Killian would go on to help launch NASA, and MIT engineers would play a critical role in the Apollo mission that landed the first person on the moon. To this day, researchers at MIT and Lincoln Laboratory uphold this legacy of service, advancing knowledge in areas vital to national security, economic competitiveness, and quality of life for all Americans.
As Eisenhower’s special assistant, Killian met with him almost daily and became one of his most trusted advisors. “Killian could talk to the president, and Eisenhower really took his advice,” says Capozzola. “Not very many people can do that. The fact that Killian had that and used it was different.”
A key to their relationship, Capozzola notes, was Killian’s approach to his work. “He exemplified the notion that if you want to get something done, don’t take the credit. At no point did Killian think he was setting science policy. He was advising people on their best options, including decision-makers who would have to make very difficult decisions. That’s it.”
In 1977, after many tours of duty in Washington and his retirement from MIT, Killian summarized his experience working for Eisenhower in his memoir, “Sputnik, Scientists, and Eisenhower.” Killian said of his colleagues: “They were held together in close harmony not only by the challenge of the scientific and technical work they were asked to undertake but by their abiding sense of the opportunity they had to serve a president they admired and the country they loved. They entered the corridors of power in a moment of crisis and served there with a sense of privilege and of admiration for the integrity and high purpose of the White House.”
Mountains magnify mechanisms in climate change biology
Nature Climate Change, Published online: 09 February 2026; doi:10.1038/s41558-025-02549-x
Mountains, with their sharp climatic contrasts, are emblematic of climate-driven species movement and, ultimately, loss. Here, we argue that these same contrasts make mountains powerful natural laboratories for discovering the mechanisms that underlie biological change.Preserving mountains
Nature Climate Change, Published online: 09 February 2026; doi:10.1038/s41558-026-02572-6
Disappearing glaciers and missing snow in mountain regions are some of the most immediate signs of global change today. In this issue, we focus on the broader changes in mountains and how they affect people living both within and far away from their peaks and valleys.Melting glaciers as symbols of tourism paradoxes
Nature Climate Change, Published online: 09 February 2026; doi:10.1038/s41558-025-02544-2
Visitors are increasingly drawn to disappearing glacier landscapes for their beauty and scientific value. This Comment examines the paradoxes reshaping relationships among glaciers, people and communities, and highlights research needed to avoid maladaptation harming local communities.Melting ice and transforming beliefs
Nature Climate Change, Published online: 09 February 2026; doi:10.1038/s41558-025-02551-3
Mountains and their ecosystems have been important to religious beliefs in many regions around the world. In this Viewpoint, researchers describe how climate change in mountain regions is interpreted by local communities and how they transform their spiritual practice in response to it.Cascading downstream impacts of water cycle changes in mountain regions
Nature Climate Change, Published online: 09 February 2026; doi:10.1038/s41558-025-02552-2
Mountains are hotspots of climate change, with melting glaciers, changing water flows and moving ecosystems. Here the authors discuss how these different changes in mountain regions affect downstream regions.Friday Squid Blogging: Squid Fishing Tips
This is a video of advice for squid fishing in Puget Sound.
As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.
I Am in the Epstein Files
Once. Someone named “Vincenzo lozzo” wrote to Epstein in email, in 2016: “I wouldn’t pay too much attention to this, Schneier has a long tradition of dramatizing and misunderstanding things.” The topic of the email is DDoS attacks, and it is unclear what I am dramatizing and misunderstanding.
Rabbi Schneier is also mentioned, also incidentally, also once. As far as either of us know, we are not related.
“This is science!” – MIT president talks about the importance of America’s research enterprise on GBH’s Boston Public Radio
In a wide-ranging live conversation, MIT President Sally Kornbluth joined Jim Braude and Margery Eagan live in studio for GBH’s Boston Public Radio on Thursday, February 5. They talked about MIT, the pressures facing America’s research enterprise, the importance of science, that Congressional hearing on antisemitism in 2023, and more – including Sally’s experience as a Type 1 diabetic.
Reflecting on how research and innovation in the treatment of diabetes has advanced over decades of work, leading to markedly better patient care, Kornbluth exclaims: “This is science!”
With new financial pressures facing universities, increased competition for talented students and scholars from outside the U.S., as well as unprecedented pressures on university leaders and campuses, co-host Eagan asks Kornbluth what she thinks will happen in years to come.
“For us, one of the hardest things now is the endowment tax,” remarks Kornbluth. “That is $240 million a year. Think about how much science you can get for $240 million a year. Are we managing it? Yes. Are we still forging ahead on all of our exciting initiatives? Yes. But we’ve had to reconfigure things. We’ve had to merge things. And it’s not the way we should be spending our time and money.”
Watch and listen to the full episode on YouTube. President Kornbluth appears one hour and seven minutes into the broadcast.
Following Kornbluth’s appearance, MIT Assistant Professor John Urschel – also a former offensive lineman for the Baltimore Ravens – joined Edgar B. Herwick III, host of GBH’s newest show, The Curiosity Desk, to talk about his love of his family, linear algebra, and football.
On how he eventually chose math over football, Urschel quips: “Well, I hate to break it to you, I like math better… let me tell you, when I started my PhD at MIT, I just fell in love with the place. I fell in love with this idea of being in this environment [where] everyone loves math, everyone wants to learn. I was just constantly excited every day showing up.”
Prof. Urschel appears about 2 hours and 40 minutes into the webcast on YouTube.
Coming up on Curiosity Desk later this month…
Airing weekday afternoons from 1-2 p.m., The Curiosity Desk will welcome additional MIT guests in the coming weeks. On Thursday, Feb. 12 Anette “Peko” Hosoi, Pappalardo Professor of Mechanical Engineering, and Jerry Lu MFin ’24, a former researcher at the MIT Sports Lab, visit The Curiosity Desk to discuss their work using AI to help Olympic figure skaters improve their jumps.
Then, on Thursday, Feb. 19, Professors Sangeeta Bhatia and Angela Belcher talk with Herwick about their research to improve diagnostics for ovarian cancer. We learn that about 80% of the time ovarian cancer starts in the fallopian tubes and how this points the way to a whole new approach to diagnosing and treating the disease.
MIT News · Curiosity Desk PreviewSource: GBH
iPhone Lockdown Mode Protects Washington Post Reporter
404Media is reporting that the FBI could not access a reporter’s iPhone because it had Lockdown Mode enabled:
The court record shows what devices and data the FBI was able to ultimately access, and which devices it could not, after raiding the home of the reporter, Hannah Natanson, in January as part of an investigation into leaks of classified information. It also provides rare insight into the apparent effectiveness of Lockdown Mode, or at least how effective it might be before the FBI may try other techniques to access the device.
“Because the iPhone was in Lockdown mode, CART could not extract that device,” the court record reads, referring to the FBI’s Computer Analysis Response Team, a unit focused on performing forensic analyses of seized devices. The document is written by the government, and is opposing the return of Natanson’s devices...
Here’s what could happen when the endangerment finding dies
Equinor CEO: Energy investments becoming ‘politicalized and polarized’
Swedish youth sue to force government to act on climate change
State Farm seeks to block prosecutor access to internal records
Prominent environmental groups revive ‘superbill’ priority list
EU to soften emissions curbs on companies in flagship market
As winter comes, a river in Bosnia chokes in tons of waste each year
Fund managers saw historic withdrawals from ESG labels last year
I’m walking here! A new model maps foot traffic in New York City
Early in the 1969 film “Midnight Cowboy,” Dustin Hoffman, playing the character of Ratso Rizzo, crosses a Manhattan street and angrily bangs on the hood of an encroaching taxi. Hoffman’s line — “I’m walking here!” — has since been repeated by thousands of New Yorkers. Where cars and people mix, tensions rise.
And yet, governments and planners across the U.S. haven’t thoroughly tracked where it is that cars and people mix. Officials have long measured vehicle traffic closely while largely ignoring pedestrian traffic. Now, an MIT research group has assembled a routable dataset of sidewalks, crosswalks, and footpaths for all of New York City — a massive mapping project and the first complete model of pedestrian activity in any U.S. city.
The model could help planners decide where to make pedestrian infrastructure and public space investments, and illuminate how development decisions could affect non-motorized travel in the city. The study also helps pinpoint locations throughout the city where there are both lots of pedestrians and high pedestrian hazards, such as traffic crashes, and where streets or intersections are most in need of upgrades.
“We now have a first view of foot traffic all over New York City and can check planning decisions against it,” says Andres Sevtsuk, an associate professor in MIT’s Department of Urban Studies and Planning (DUSP), who led the study. “New York has very high densities of foot traffic outside of its most well-known areas.”
Indeed, one upshot of the model is that while Manhattan has the most foot traffic per block, the city’s other boroughs contain plenty of pedestrian-heavy stretches of sidewalk and could probably use more investment on behalf of walkers.
“Midtown Manhattan has by far the most foot traffic, but we found there is a probably unintentional Manhattan bias when it comes to policies that support pedestrian infrastructure,” Sevtsuk says. “There are a whole lot of streets in New York with very high pedestrian volumes outside of Manhattan, whether in Queens or the Bronx or Brooklyn, and we’re able to show, based on data, that a lot of these streets have foot-traffic levels similar to many parts of Manhattan.”
And, in an advance that could help cities anywhere, the model was used to quantify vehicle crashes involving pedestrians not only as raw totals, but on a per-pedestrian basis.
“A lot of cities put real investments behind keeping pedestrians safe from vehicles by prioritizing dangerous locations,” Sevtsuk says. “But that’s not only where the most crashes occur. Here we are able to calculate accidents per pedestrian, the risk people face, and that broadens the picture in terms of where the most dangerous intersections for pedestrians really are.”
The paper, “Spatial Distribution of Foot-traffic in New York City and Applications for Urban Planning,” is published today in Nature Cities.
The authors are Sevtsuk, the Charles and Ann Spaulding Associate Professor of Urban Science and Planning in DUSP and head of the City Design and Development Group; Rounaq Basu, an assistant professor at Georgia Tech; Liu Liu, a PhD student at the City Form Lab in DUSP; Abdulaziz Alhassan, a PhD student at MIT’s Center for Complex Engineering Systems; and Justin Kollar, a PhD student at MIT’s Leventhal Center for Advanced Urbanism in DUSP.
Walking everywhere
The current study continues work Sevtsuk and his colleagues have conducted charting and modeling pedestrian traffic around the world, from Melbourne to MIT’s Kendall Square neighborhood in Cambridge, Massachusetts. Many cities collect some pedestrian count data — but not much. And while officials usually request vehicle traffic impact assessments for new development plans, they rarely study how new developments or infrastructure proposals affect pedestrians.
However, New York City does devote part of its Department of Transportation (DOT) to pedestrian issues, and about 41 percent of trips city-wide are made on foot, compared to just 28 percent by vehicle, likely the highest such ratio in any big U.S. city. To calibrate the model, the MIT team used pedestrian counts that New York City’s DOT recorded in 2018 and 2019, covering up to 1,000 city sidewalk segments on weekdays and up to roughly 450 segments on weekends.
The researchers were able to test the model — which incorporates a wide range of factors — against New York City’s pedestrian-count data. Once calibrated, the model could expand foot-traffic estimates throughout the whole city, not just the points where pedestrian counts were observed.
The results showed that in Midtown Manhattan, there are about 1,697 pedestrians, on average, per sidewalk segment per hour during the evening peak of foot traffic, the highest in the city. The financial district in lower Manhattan comes in second, at 740 pedestrians per hour, with Greenwich Village third at 656.
Other parts of Manhattan register lower levels of foot traffic, however. Morningside Heights and East Harlem register 226 and 227 pedestrians per block per hour. And that’s similar to, or lower than, some parts of other boroughs. Brooklyn Heights has 277 pedestrians per sidewalk segment per hour; University Heights in the Bronx has 263; Borough Park in Brooklyn and the Grand Concourse in the Bronx average 236; and a slice of Queens in the Corona area averages 222. Many other spots are over 200.
The model overlays many different types of pedestrian journeys for each time period and shows that people are generally headed to work and schools in the morning, but conduct more varied types of trips in mid-day and the evening, as they seek out amenities or conduct social or recreational visits.
“Because of jobs, transit stops are the biggest generators of foot traffic in the morning peak,” Liu observes. “In the evening peak, of course people need to get home too, but patterns are much more varied, and people are not just returning from work or school. More social and recreational travel happens after work, whether it’s getting together with friends or running errands for family or family care trips, and that’s what the model detects too.”
On the safety front, pedestrians face danger in many places, not just the intersections with the most total accidents. Many parts of the city are riskier than others on a per-pedestrian basis, compared to the locations with the most pedestrian-related crashes.
“Places like Times Square and Herald Square in Manhattan may have numerous crashes, but they have very high pedestrian volumes, and it’s actually relatively safe to walk there,” Basu says. “There are other parts of the city, around highway off-ramps and heavy car-infrastructure, including the relatively low-density borough of Staten Island, which turn out to have a disproportionate number of crashes per pedestrian.”
Taking the model across the U.S.
The MIT model stands a solid chance of being applied in New York City policy and planning circles, since officials there are aware of the research and have been regularly communicating with the MIT team about it.
For his part, Sevtsuk emphasizes that, as distinct as New York City might be, the MIT model can be applied to cities and town anywhere in the U.S. As it happens, the team is working with municipal officials in two other places at the moment. One is Los Angeles, where city officials are not only trying to upgrade pedestrian and public transit mobility for regular daily trips, but making plans to handle an influx of visitors for the 2028 summer Olympics.
Meanwhile the state of Maine is working with the MIT team to evaluate pedestrian movement in over 140 of its cities and towns, to better understand the kinds of upgrades and safety improvements it could make for pedestrians across the state. Sevtsuk hopes that still other places will take notice of the New York City study and recognize that the tools are in place to analyze foot traffic more broadly in U.S. cities, to address the urgent need to decarbonize cities, and to start balancing what he views as the disproportionate focus on car travel prevalent in 20th century urban planning.
“I hope this can inspire other cities to invest in modeling foot traffic and mapping pedestrian infrastructure as well,” Sevtsuk says. “Very few cities make plans for pedestrian mobility or examine rigorously how future developments will impact foot-traffic. But they can. Our models serve as a test bed for making future changes.”
