Feed aggregator
Biden’s green bank is on the ropes
NOAA creates powerful role for wind critic
Offshore wind is in a fight for survival
Trump admin asks court to reverse New York climate superfund law
Cruise industry sues Hawaii over climate tax
Trump cuts could weaken federal disaster response, GAO says
Scientists slam inaccuracies in DOE climate change report
France says EU leaders, not ministers, should decide 2040 climate target
Zimbabwe publishes draft regulations to establish climate fund
Sudan seeks aid after landslide kills over 1,000 in single village
Hotter and longer summers are shifting wedding season
3 Questions: The pros and cons of synthetic data in AI
Synthetic data are artificially generated by algorithms to mimic the statistical properties of actual data, without containing any information from real-world sources. While concrete numbers are hard to pin down, some estimates suggest that more than 60 percent of data used for AI applications in 2024 was synthetic, and this figure is expected to grow across industries.
Because synthetic data don’t contain real-world information, they hold the promise of safeguarding privacy while reducing the cost and increasing the speed at which new AI models are developed. But using synthetic data requires careful evaluation, planning, and checks and balances to prevent loss of performance when AI models are deployed.
To unpack some pros and cons of using synthetic data, MIT News spoke with Kalyan Veeramachaneni, a principal research scientist in the Laboratory for Information and Decision Systems and co-founder of DataCebo whose open-core platform, the Synthetic Data Vault, helps users generate and test synthetic data.
Q: How are synthetic data created?
A: Synthetic data are algorithmically generated but do not come from a real situation. Their value lies in their statistical similarity to real data. If we’re talking about language, for instance, synthetic data look very much as if a human had written those sentences. While researchers have created synthetic data for a long time, what has changed in the past few years is our ability to build generative models out of data and use them to create realistic synthetic data. We can take a little bit of real data and build a generative model from that, which we can use to create as much synthetic data as we want. Plus, the model creates synthetic data in a way that captures all the underlying rules and infinite patterns that exist in the real data.
There are essentially four different data modalities: language, video or images, audio, and tabular data. All four of them have slightly different ways of building the generative models to create synthetic data. An LLM, for instance, is nothing but a generative model from which you are sampling synthetic data when you ask it a question.
A lot of language and image data are publicly available on the internet. But tabular data, which is the data collected when we interact with physical and social systems, is often locked up behind enterprise firewalls. Much of it is sensitive or private, such as customer transactions stored by a bank. For this type of data, platforms like the Synthetic Data Vault provide software that can be used to build generative models. Those models then create synthetic data that preserve customer privacy and can be shared more widely.
One powerful thing about this generative modeling approach for synthesizing data is that enterprises can now build a customized, local model for their own data. Generative AI automates what used to be a manual process.
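As a rough illustration of that workflow, here is a minimal sketch using the open-source Synthetic Data Vault (`sdv`) Python package mentioned above; the table, column names, and values are invented for illustration, and the exact class names may differ across SDV versions.

```python
# A minimal sketch, assuming the sdv package (Synthetic Data Vault) and an invented table.
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import GaussianCopulaSynthesizer

# A small sample of "real" tabular data (values invented for illustration).
real = pd.DataFrame({
    "state": ["OH", "NY", "OH", "CA", "TX", "OH"],
    "month": [2, 3, 2, 7, 3, 2],
    "amount": [12.50, 80.00, 33.20, 5.75, 61.10, 24.99],
})

# Describe the table so the synthesizer knows each column's type.
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(real)

# Build a generative model from the real sample, then draw as much synthetic data as needed.
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(real)
synthetic = synthesizer.sample(num_rows=1_000)
print(synthetic.head())
```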
Q: What are some benefits of using synthetic data, and which use-cases and applications are they particularly well-suited for?
A: One fundamental application which has grown tremendously over the past decade is using synthetic data to test software applications. There is data-driven logic behind many software applications, so you need data to test that software and its functionality. In the past, people have resorted to manually generating data, but now we can use generative models to create as much data as we need.
Users can also create specific data for application testing. Say I work for an e-commerce company. I can generate synthetic data that mimics real customers who live in Ohio and made transactions pertaining to one particular product in February or March.
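A hedged sketch of that kind of targeted generation, again with the `sdv` package and an invented transactions table; `Condition` and `sample_from_conditions` are the calls I believe SDV provides for this, so treat the exact API as an assumption:

```python
# A sketch of conditional generation with sdv (API names are an assumption; table is invented).
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.sampling import Condition

real = pd.DataFrame({
    "state": ["OH", "NY", "OH", "CA", "TX", "OH"],
    "month": [2, 3, 2, 7, 3, 2],
    "amount": [12.50, 80.00, 33.20, 5.75, 61.10, 24.99],
})
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(real)
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(real)

# Ask only for synthetic customers in Ohio with February transactions.
ohio_february = Condition(column_values={"state": "OH", "month": 2}, num_rows=500)
test_data = synthesizer.sample_from_conditions(conditions=[ohio_february])
```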
Because synthetic data aren’t drawn from real situations, they are also privacy-preserving. One of the biggest problems in software testing has been getting access to sensitive real data for testing software in non-production environments, due to privacy concerns. Another immediate benefit is in performance testing. You can create a billion transactions from a generative model and test how fast your system can process them.
Another application where synthetic data hold a lot of promise is in training machine-learning models. Sometimes, we want an AI model to help us predict an event that is less frequent. A bank may want to use an AI model to predict fraudulent transactions, but there may be too few real examples to train a model that can identify fraud accurately. Synthetic data provide data augmentation — additional data examples that are similar to the real data. These can significantly improve the accuracy of AI models.
Also, sometimes users don’t have time or the financial resources to collect all the data. For instance, collecting data about customer intent would require conducting many surveys. If you end up with limited data and then try to train a model, it won’t perform well. You can augment by adding synthetic data to train those models better.
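As one illustration of the augmentation idea, here is a sketch under the assumption of a fraud-detection table with a binary `label` column, using scikit-learn rather than any specific tool named in this interview:

```python
# A sketch of data augmentation for a rare class (fraud), assuming an invented labeled table.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def augment_and_train(real_train: pd.DataFrame,
                      synthetic_fraud: pd.DataFrame) -> RandomForestClassifier:
    """Mix synthetic fraud examples into the real training set, then fit a classifier.

    real_train: real transactions with a binary "label" column (1 = fraud, rare).
    synthetic_fraud: rows drawn from a generative model fit only on the real fraud rows
    (for example, with a synthesizer like the ones sketched above; an assumption here).
    """
    synthetic_fraud = synthetic_fraud.assign(label=1)  # all synthetic rows are fraud examples
    train = pd.concat([real_train, synthetic_fraud], ignore_index=True)
    X, y = train.drop(columns="label"), train["label"]
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X, y)
    return model
```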
Q: What are some of the risks or potential pitfalls of using synthetic data, and are there steps users can take to prevent or mitigate those problems?
A: One of the biggest questions people often have in their mind is, if the data are synthetically created, why should I trust them? Determining whether you can trust the data often comes down to evaluating the overall system where you are using them.
There are a lot of aspects of synthetic data we have been able to evaluate for a long time. For instance, there are existing methods to measure how close synthetic data are to real data, and we can measure their quality and whether they preserve privacy. But there are other important considerations if you are using those synthetic data to train a machine-learning model for a new use case. How would you know the data are going to lead to models that still make valid conclusions?
New efficacy metrics are emerging, and the emphasis is now on efficacy for a particular task. You must really dig into your workflow to ensure the synthetic data you add to the system still allow you to draw valid conclusions. That is something that must be done carefully on an application-by-application basis.
Bias can also be an issue. Since synthetic data are created from a small amount of real data, any bias that exists in the real data can carry over into the synthetic data. Just like with real data, you would need to purposefully make sure the bias is removed through different sampling techniques, which can create balanced datasets. It takes some careful planning, but you can calibrate the data generation to prevent the proliferation of bias.
To help with the evaluation process, our group created the Synthetic Data Metrics Library. We worried that people would use synthetic data in their environment and it would give different conclusions in the real world. We created a metrics and evaluation library to ensure checks and balances. The machine learning community has faced a lot of challenges in ensuring models can generalize to new situations. The use of synthetic data adds a whole new dimension to that problem.
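The library’s own metrics are not detailed here, but a minimal stand-in for the basic idea (not the Synthetic Data Metrics Library’s API) is to compare each numeric column’s marginal distribution in the real and synthetic tables, for example with a two-sample Kolmogorov-Smirnov test:

```python
# A stand-in similarity check (not the Synthetic Data Metrics Library itself).
import pandas as pd
from scipy.stats import ks_2samp

def marginal_similarity(real: pd.DataFrame, synthetic: pd.DataFrame) -> pd.Series:
    """Return the KS statistic per shared numeric column (0 means identical marginals)."""
    cols = real.select_dtypes("number").columns.intersection(synthetic.columns)
    return pd.Series({c: ks_2samp(real[c], synthetic[c]).statistic for c in cols})

# Columns with large KS statistics are places where the synthetic data diverge from
# the real data and may lead models to different conclusions in the real world.
```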
I expect that the old systems of working with data, whether to build software applications, answer analytical questions, or train models, will dramatically change as we get more sophisticated at building these generative models. A lot of things we have never been able to do before will now be possible.
Soft materials hold onto “memories” of their past, for longer than previously thought
If your hand lotion is a bit runnier than usual coming out of the bottle, it might have something to do with the goop’s “mechanical memory.”
Soft gels and lotions are made by mixing ingredients until they form a stable and uniform substance. But even after a gel has set, it can hold onto “memories,” or residual stress, from the mixing process. Over time, the material can give in to these embedded stresses and slide back into its former, premixed state. Mechanical memory is, in part, why hand lotion separates and gets runny over time.
Now, an MIT engineer has devised a simple way to measure the degree of residual stress in soft materials after they have been mixed, and found that common products like hair gel and shaving cream have longer mechanical memories, holding onto residual stresses for longer periods of time than manufacturers might have assumed.
In a study appearing today in Physical Review Letters, Crystal Owens, a postdoc in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), presents a new protocol for measuring residual stress in soft, gel-like materials, using a standard benchtop rheometer.
Applying this protocol to everyday soft materials, Owens found that if a gel is made by mixing it in one direction, once it settles into a stable and uniform state, it effectively holds onto the memory of the direction in which it is mixed. Even after several days, the gel will hold some internal stress that, if released, will cause the gel to shift in the direction opposite to how it was initially mixed, reverting back to its earlier state.
“This is one reason different batches of cosmetics or food behave differently even if they underwent ‘identical’ manufacturing,” Owens says. “Understanding and measuring these hidden stresses during processing could help manufacturers design better products that last longer and perform more predictably.”
A soft glass
Hand lotion, hair gel, and shaving cream all fall under the category of “soft glassy materials” — materials that exhibit properties of both solids and liquids.
“Anything you can pour into your hand and it forms a soft mound is going to be considered a soft glass,” Owens explains. “In materials science, it’s considered a soft version of something that has the same amorphous structure as glass.”
In other words, a soft glassy material is a strange amalgam of a solid and a liquid. It can be poured out like a liquid, and it can hold its shape like a solid. Once they are made, these materials exist in a delicate balance between solid and liquid. And Owens wondered: For how long?
“What happens to these materials after very long times? Do they finally relax or do they never relax?” Owens says. “From a physics perspective, that’s a very interesting concept: What is the essential state of these materials?”
Twist and hold
In the manufacturing of soft glassy materials such as hair gel and shampoo, ingredients are first mixed into a uniform product. Quality control engineers then let a sample sit for about a minute — a period of time they assume is enough to allow any residual stresses from the mixing process to dissipate. In that time, the material should settle into a steady, stable state, ready for use.
But Owens suspected that the materials may hold some degree of stress from the production process long after they’ve appeared to settle.
“Residual stress is a low level of stress that’s trapped inside a material after it’s come to a steady state,” Owens says. “This sort of stress has not been measured in these sorts of materials.”
To test her hypothesis, she carried out experiments with two common soft glassy materials: hair gel and shaving cream. She made measurements of each material in a rheometer — an instrument consisting of two rotating plates that can twist and press a material together at precisely controlled pressures and forces that relate directly to the material’s internal stresses and strains.
In her experiments, she placed each material in the rheometer and spun the instrument’s top plate around to mix the material. Then she let the material settle, and then settle some more — much longer than one minute. During this time, she observed the amount of force it took the rheometer to hold the material in place. She reasoned that the greater the rheometer’s force, the more it must be counteracting any stress within the material that would otherwise cause it to shift out of its current state.
Over multiple experiments using this new protocol, Owens found that different types of soft glassy materials held a significant amount of residual stress, long after most researchers would assume the stress had dissipated. What’s more, she found that the degree of stress that a material retained was a reflection of the direction in which it was initially mixed, and when it was mixed.
“The material can effectively ‘remember’ which direction it was mixed, and how long ago,” Owens says. “And it turns out they hold this memory of their past, a lot longer than we used to think.”
In addition to the protocol she has developed to measure residual stress, Owens has developed a model to estimate how a material will change over time, given the degree of residual stress that it holds. Using this model, she says scientists might design materials with “short-term memory,” or very little residual stress, such that they remain stable over longer periods.
One material where she sees room for such improvement is asphalt — a substance that is first mixed, then poured in molten form over a surface where it then cools and settles over time. She suspects that residual stresses from the mixing of asphalt may contribute to cracks forming in pavement over time. Reducing these stresses at the start of the process could lead to longer-lasting, more resilient roads.
“People are inventing new types of asphalt all the time to be more eco-friendly, and all of these will have different levels of residual stress that will need some control,” she says. “There’s plenty of room to explore.”
This research was supported, in part, by MIT’s Postdoctoral Fellowship for Engineering Excellence and an MIT Mathworks Fellowship.
Moving beyond projects to achieve transformative adaptation
Nature Climate Change, Published online: 03 September 2025; doi:10.1038/s41558-025-02414-x
Projects are not delivering the transformative change needed for climate change adaptation. This failure is due in part to the delivery of adaptation as projects, but there are viable alternatives that can better address the underlying and structural causes of vulnerability.
3 Questions: On biology and medicine’s “data revolution”
Caroline Uhler is an Andrew (1956) and Erna Viterbi Professor of Engineering at MIT; a professor of electrical engineering and computer science in the Institute for Data, Systems, and Society (IDSS); and director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, where she is also a core institute member and a member of the scientific leadership team.
Uhler is interested in all the methods by which scientists can uncover causality in biological systems, ranging from causal discovery on observed variables to causal feature learning and representation learning. In this interview, she discusses machine learning in biology, areas that are ripe for problem-solving, and cutting-edge research coming out of the Schmidt Center.
Q: The Eric and Wendy Schmidt Center has four distinct areas of focus structured around four natural levels of biological organization: proteins, cells, tissues, and organisms. What, within the current landscape of machine learning, makes now the right time to work on these specific problem classes?
A: Biology and medicine are currently undergoing a “data revolution.” The availability of large-scale, diverse datasets — ranging from genomics and multi-omics to high-resolution imaging and electronic health records — makes this an opportune time. Inexpensive and accurate DNA sequencing is a reality, advanced molecular imaging has become routine, and single-cell genomics is allowing the profiling of millions of cells. These innovations — and the massive datasets they produce — have brought us to the threshold of a new era in biology, one where we will be able to move beyond characterizing the units of life (such as all proteins, genes, and cell types) to understanding the “programs of life,” such as the logic of gene circuits and cell-cell communication that underlies tissue patterning and the molecular mechanisms that underlie the genotype-phenotype map.
At the same time, in the past decade, machine learning has seen remarkable progress with models like BERT, GPT-3, and ChatGPT demonstrating advanced capabilities in text understanding and generation, while vision transformers and multimodal models like CLIP have achieved human-level performance in image-related tasks. These breakthroughs provide powerful architectural blueprints and training strategies that can be adapted to biological data. For instance, transformers can model genomic sequences similar to language, and vision models can analyze medical and microscopy images.
Importantly, biology is poised to be not just a beneficiary of machine learning, but also a significant source of inspiration for new ML research. Much like agriculture and breeding spurred modern statistics, biology has the potential to inspire new and perhaps even more profound avenues of ML research. Unlike fields such as recommender systems and internet advertising, where there are no natural laws to discover and predictive accuracy is the ultimate measure of value, in biology, phenomena are physically interpretable, and causal mechanisms are the ultimate goal. Additionally, biology boasts genetic and chemical tools that enable perturbational screens on an unparalleled scale compared to other fields. These combined features make biology uniquely suited to both benefit greatly from ML and serve as a profound wellspring of inspiration for it.
Q: Taking a somewhat different tack, what problems in biology are still really resistant to our current tool set? Are there areas, perhaps specific challenges in disease or in wellness, which you feel are ripe for problem-solving?
A: Machine learning has demonstrated remarkable success in predictive tasks across domains such as image classification, natural language processing, and clinical risk modeling. However, in the biological sciences, predictive accuracy is often insufficient. The fundamental questions in these fields are inherently causal: How does a perturbation to a specific gene or pathway affect downstream cellular processes? What is the mechanism by which an intervention leads to a phenotypic change? Traditional machine learning models, which are primarily optimized for capturing statistical associations in observational data, often fail to answer such interventional queries. There is a strong need for biology and medicine to also inspire new foundational developments in machine learning.
The field is now equipped with high-throughput perturbation technologies — such as pooled CRISPR screens, single-cell transcriptomics, and spatial profiling — that generate rich datasets under systematic interventions. These data modalities naturally call for the development of models that go beyond pattern recognition to support causal inference, active experimental design, and representation learning in settings with complex, structured latent variables. From a mathematical perspective, this requires tackling core questions of identifiability, sample efficiency, and the integration of combinatorial, geometric, and probabilistic tools. I believe that addressing these challenges will not only unlock new insights into the mechanisms of cellular systems, but also push the theoretical boundaries of machine learning.
With respect to foundation models, a consensus in the field is that we are still far from creating a holistic foundation model for biology across scales, similar to what ChatGPT represents in the language domain — a sort of digital organism capable of simulating all biological phenomena. While new foundation models emerge almost weekly, these models have thus far been specialized for a specific scale and question, and focus on one or a few modalities.
Significant progress has been made in predicting protein structures from their sequences. This success has highlighted the importance of iterative machine learning challenges, such as CASP (critical assessment of structure prediction), which have been instrumental in benchmarking state-of-the-art algorithms for protein structure prediction and driving their improvement.
The Schmidt Center is organizing challenges to increase awareness in the ML field and make progress in the development of methods to solve causal prediction problems that are so critical for the biomedical sciences. With the increasing availability of single-gene perturbation data at the single-cell level, I believe predicting the effect of single or combinatorial perturbations, and which perturbations could drive a desired phenotype, are solvable problems. With our Cell Perturbation Prediction Challenge (CPPC), we aim to provide the means to objectively test and benchmark algorithms for predicting the effect of new perturbations.
Another area where the field has made remarkable strides is disease diagnostics and patient triage. Machine learning algorithms can integrate different sources of patient information (data modalities), generate missing modalities, identify patterns that may be difficult for us to detect, and help stratify patients based on their disease risk. While we must remain cautious about potential biases in model predictions, the danger of models learning shortcuts instead of true correlations, and the risk of automation bias in clinical decision-making, I believe this is an area where machine learning is already having a significant impact.
Q: Let’s talk about some of the headlines coming out of the Schmidt Center recently. What current research do you think people should be particularly excited about, and why?
A: In collaboration with Dr. Fei Chen at the Broad Institute, we have recently developed a method, called PUPS, for predicting the subcellular location of unseen proteins. Many existing methods can only make predictions based on the specific protein and cell data on which they were trained. PUPS, however, combines a protein language model with an image in-painting model to utilize both protein sequences and cellular images. We demonstrate that the protein sequence input enables generalization to unseen proteins, and the cellular image input captures single-cell variability, enabling cell-type-specific predictions. The model learns how relevant each amino acid residue is for the predicted subcellular localization, and it can predict changes in localization due to mutations in the protein sequences. Since a protein’s function is closely tied to its subcellular localization, our predictions could provide insights into potential mechanisms of disease. In the future, we aim to extend this method to predict the localization of multiple proteins in a cell and possibly understand protein-protein interactions.
Together with Professor G.V. Shivashankar, a long-time collaborator at ETH Zürich, we have previously shown how simple images of cells stained with fluorescent DNA-intercalating dyes to label the chromatin can yield a lot of information about the state and fate of a cell in health and disease, when combined with machine learning algorithms. Recently, we have furthered this observation and proved the deep link between chromatin organization and gene regulation by developing Image2Reg, a method that enables the prediction of unseen genetically or chemically perturbed genes from chromatin images. Image2Reg utilizes convolutional neural networks to learn an informative representation of the chromatin images of perturbed cells. It also employs a graph convolutional network to create a gene embedding that captures the regulatory effects of genes based on protein-protein interaction data, integrated with cell-type-specific transcriptomic data. Finally, it learns a map between the resulting physical and biochemical representation of cells, allowing us to predict the perturbed gene modules based on chromatin images.
We also recently finalized the development of MORPH, a method for predicting the outcomes of unseen combinatorial gene perturbations and identifying the types of interactions occurring between the perturbed genes. MORPH can guide the design of the most informative perturbations for lab-in-a-loop experiments. Furthermore, its attention-based framework provably enables the method to identify causal relations among the genes, providing insights into the underlying gene regulatory programs. Finally, thanks to its modular structure, we can apply MORPH to perturbation data measured in various modalities, including not only transcriptomics but also imaging. We are very excited about the potential of this method to enable efficient exploration of the perturbation space and to advance our understanding of cellular programs by bridging causal theory to important applications, with implications for both basic research and therapeutics.
New gift expands mental illness studies at Poitras Center for Psychiatric Disorders Research
One in every eight people — 970 million globally — lives with mental illness, according to the World Health Organization, with depression and anxiety being the most common mental health conditions worldwide. Existing therapies for complex psychiatric disorders like depression, anxiety, and schizophrenia have limitations, and federal funding to address these shortcomings is growing increasingly uncertain.
Patricia and James Poitras ’63 have committed $8 million to the Poitras Center for Psychiatric Disorders Research to launch pioneering research initiatives aimed at uncovering the brain basis of major mental illness and accelerating the development of novel treatments.
“Federal funding rarely supports the kind of bold, early-stage research that has the potential to transform our understanding of psychiatric illness. Pat and I want to help fill that gap — giving researchers the freedom to follow their most promising leads, even when the path forward isn’t guaranteed,” says James Poitras, who is chair of the McGovern Institute for Brain Research board.
Their latest gift builds upon their legacy of philanthropic support for psychiatric disorders research at MIT, which now exceeds $46 million.
“With deep gratitude for Jim and Pat’s visionary support, we are eager to launch a bold set of studies aimed at unraveling the neural and cognitive underpinnings of major mental illnesses,” says Professor Robert Desimone, director of the McGovern Institute, home to the Poitras Center. “Together, these projects represent a powerful step toward transforming how we understand and treat mental illness.”
A legacy of support
Soon after joining the McGovern Institute Leadership Board in 2006, the Poitrases made a $20 million commitment to establish the Poitras Center for Psychiatric Disorders Research at MIT. The center’s goal, to improve human health by addressing the root causes of complex psychiatric disorders, is deeply personal to them both.
“We had decided many years ago that our philanthropic efforts would be directed towards psychiatric research. We could not have imagined then that this perfect synergy between research at MIT’s McGovern Institute and our own philanthropic goals would develop,” recalls Patricia.
The center supports research at the McGovern Institute and collaborative projects with institutions such as the Broad Institute of MIT and Harvard, McLean Hospital, Mass General Brigham, and other clinical research centers. Since its establishment in 2007, the center has enabled advances in psychiatric research including the development of a machine learning “risk calculator” for bipolar disorder, the use of brain imaging to predict treatment outcomes for anxiety, and studies demonstrating that mindfulness can improve mental health in adolescents.
For the past decade, the Poitrases have also fueled breakthroughs in the lab of McGovern investigator and MIT Professor Feng Zhang, backing the invention of powerful CRISPR systems and other molecular tools that are transforming biology and medicine. Their support has enabled the Zhang team to engineer new delivery vehicles for gene therapy, including vehicles capable of carrying genetic payloads that were once out of reach. The lab has also advanced innovative RNA-guided gene engineering tools such as NovaIscB, published in Nature Biotechnology in May 2025. These revolutionary genome editing and delivery technologies hold promise for the next generation of therapies needed for serious psychiatric illness.
In addition to fueling research in the center, the Poitras family has gifted two endowed professorships — the James and Patricia Poitras Professor of Neuroscience at MIT, currently held by Feng Zhang, and the James W. (1963) and Patricia T. Poitras Professor of Brain and Cognitive Sciences at MIT, held by Guoping Feng — and an annual postdoctoral fellowship at the McGovern Institute.
New initiatives at the Poitras Center
The Poitras family’s latest commitment to the Poitras Center will launch an ambitious set of new projects that bring together neuroscientists, clinicians, and computational experts to probe underpinnings of complex psychiatric disorders including schizophrenia, anxiety, and depression. These efforts reflect the center’s core mission: to speed scientific discovery and therapeutic innovation in the field of psychiatric brain disorders research.
McGovern cognitive neuroscientists Evelina Fedorenko PhD ’07, an associate professor, and Nancy Kanwisher ’80, PhD ’86, the Walter A. Rosenblith Professor of Cognitive Neuroscience — in collaboration with psychiatrist Ann Shinn of McLean Hospital — will explore how altered inner speech and reasoning contribute to the symptoms of schizophrenia. They will collect functional MRI data from individuals diagnosed with schizophrenia and matched controls as they perform reasoning tasks. The goal is to identify the brain activity patterns that underlie impaired reasoning in schizophrenia, a core cognitive disruption in the disorder.
A complementary line of investigation will focus on the role of inner speech — the “voice in our head” that shapes thought and self-awareness. The team will conduct a large-scale online behavioral study of neurotypical individuals to analyze how inner speech characteristics correlate with schizophrenia-spectrum traits. This will be followed by neuroimaging work comparing brain architecture among individuals with strong or weak inner voices and people with schizophrenia, with the aim of discovering neural markers linked to self-talk and disrupted cognition.
A different project led by McGovern neuroscientist and MIT Associate Professor Mark Harnett and 2024–2026 Poitras Center Postdoctoral Fellow Cynthia Rais focuses on how ketamine — an increasingly used antidepressant — alters brain circuits to produce rapid and sustained improvements in mood. Despite its clinical success, ketamine’s mechanisms of action remain poorly understood. The Harnett lab is using sophisticated tools to track how ketamine affects synaptic communication and large-scale brain network dynamics, particularly in models of treatment-resistant depression. By mapping these changes at both the cellular and systems levels, the team hopes to reveal how ketamine lifts mood so quickly — and inform the development of safer, longer-lasting antidepressants.
Guoping Feng is leveraging a new animal model of depression to uncover the brain circuits that drive major depressive disorder. The new animal model provides a powerful system for studying the intricacies of mood regulation. Feng’s team is using state-of-the-art molecular tools to identify the specific genes and cell types involved in this circuit, with the goal of developing targeted treatments that can fine-tune these emotional pathways.
“This is one of the most promising models we have for understanding depression at a mechanistic level,” says Feng, who is also associate director of the McGovern Institute. “It gives us a clear target for future therapies.”
Another novel approach to treating mood disorders comes from the lab of James DiCarlo, the Peter de Florez Professor of Neuroscience at MIT, who is exploring the brain’s visual-emotional interface as a therapeutic tool for anxiety. The amygdala, a key emotional center in the brain, is heavily influenced by visual input. DiCarlo’s lab is using advanced computational models to design visual scenes that may subtly shift emotional processing in the brain — essentially using sight to regulate mood. Unlike traditional therapies, this strategy could offer a noninvasive, drug-free option for individuals suffering from anxiety.
Together, these projects exemplify the kind of interdisciplinary, high-impact research that the Poitras Center was established to support.
“Mental illness affects not just individuals, but entire families who often struggle in silence and uncertainty,” adds Patricia Poitras. “Our hope is that Poitras Center scientists will continue to make important advancements and spark novel treatments for complex mental health disorders and, most of all, give families living with these conditions a renewed sense of hope for the future.”
What WhatsApp’s “Advanced Chat Privacy” Really Does
In April, WhatsApp launched its “Advanced Chat Privacy” feature, which, once enabled, disables certain AI features in chats and prevents conversations from being exported. Since its launch, an inaccurate viral post has been ping-ponging around social networks, creating confusion about what exactly the feature does.
The viral post falsely claims that if you do not enable Advanced Chat Privacy, Meta’s AI tools will be able to access your private conversations. This isn’t true, and it misrepresents both how Meta AI works and what Advanced Chat Privacy is.
The confusion seems to stem from the fact that Meta AI can be invoked in a number of ways, including in any group chat with the @Meta AI command. While the chat contents between you and other people are always end-to-end encrypted on the app, what you say to Meta AI is not. Similarly, if you or anyone else in the chat chooses to use Meta AI’s “Summarize” feature, which uses Meta’s “Private Processing” technology, that feature routes the text of the chat through Meta’s servers. However, the company claims that it cannot view the content of those messages. This feature remains opt-in, so it’s up to you to decide whether you want to use it. The company also recently released the results of two audits detailing the issues found thus far and what it has done to fix them.
For example, if you and your buddy are chatting, and your friend types in @Meta AI and asks it a question, that part of the conversation, which you can both see, is not end-to-end encrypted, and is usable for AI training or whatever other purposes are included in Meta’s privacy policy. But otherwise, chats remain end-to-end encrypted.
Advanced Chat Privacy offers some control over this. The new privacy feature isn’t a universal setting in WhatsApp; you can enable or disable it on a per-chat basis, but it’s turned off by default. When enabled, Advanced Chat Privacy does three core things:
- Blocks anyone in the chat from exporting the chats,
- Disables auto-downloading media to chat participant’s phones, and
- Disables some Meta AI features
Beyond disabling some Meta AI features, Advanced Chat Privacy can be useful in other ways. For example, while someone can always screenshot chats, if you’re concerned about someone easily exporting an entire group chat history, Advanced Chat Privacy makes this harder to do because there’s no longer a one-tap option to do so. And since media can’t be automatically downloaded to someone’s phone (the “Save to Photos” option on the chat settings screen), it’s harder for an attachment to accidentally end up on someone’s device.
How to Enable Advanced Chat Privacy
Advanced Chat Privacy is enabled or disabled per chat. To enable it:
- Tap the chat name at the top of the screen.
- Select Advanced chat privacy, then tap the toggle to turn it on.
There are some quirks to how this works, though. For one, by default, anyone involved in a chat can turn Advanced Chat Privacy on or off at will, which limits its usefulness but at least helps ensure something doesn’t accidentally get sent to Meta AI.
There’s one way around this, which is for a group admin to lock down what users in the group can do. In an existing group chat that you are the administrator of, tap the chat name at the top of the screen, then:
- Scroll down to Group Permissions.
- Disable the option to “Edit Group Settings.” This makes it so only the administrator can change several important permissions, including Advanced Chat Privacy.
You can also set this permission when starting a new group chat; just be sure to pop into the permissions page when prompted. Even without Advanced Chat Privacy, the “Edit Group Settings” option is an important one for privacy, because it also controls whether participants can change how long disappearing messages remain viewable. It’s worth reviewing for every group chat you administer, and it’s something WhatsApp should require admins to choose before starting a new chat.
When it comes to one-on-one chats, there is currently no way to block the other person from changing the Advanced Chat Privacy setting, so you’ll have to come to an agreement with the other person on keeping it enabled if that’s what you want. If the setting is changed, you’ll see a notice in the chat saying so.
There are already serious concerns with how much metadata WhatsApp collects, and as the company introduces ads and AI, it’s going to get harder and harder to navigate the app, understand what each setting does, and properly protect the privacy of conversations. One of the reasons alternative encrypted chat options like Signal tend to thrive is because they keep things simple and employ strong default settings and clear permissions. WhatsApp should keep this in mind as it adds more and more features.
New particle detector passes the “standard candle” test
A new and powerful particle detector just passed a critical test in its goal to decipher the ingredients of the early universe.
The sPHENIX detector is the newest experiment at Brookhaven National Laboratory’s Relativistic Heavy Ion Collider (RHIC) and is designed to precisely measure products of high-speed particle collisions. From the aftermath, scientists hope to reconstruct the properties of quark-gluon plasma (QGP) — a white-hot soup of subatomic particles known as quarks and gluons that is thought to have sprung into existence in the few microseconds following the Big Bang. Just as quickly, the mysterious plasma disappeared, cooling and combining to form the protons and neutrons that make up today’s ordinary matter.
Now, the sPHENIX detector has made a key measurement that proves it has the precision to help piece together the primordial properties of quark-gluon plasma.
In a paper in the Journal of High Energy Physics, scientists including physicists at MIT report that sPHENIX precisely measured the number and energy of particles that streamed out from gold ions that collided at close to the speed of light.
Straight ahead
This test is considered in physics to be a “standard candle,” meaning that the measurement is a well-established constant that can be used to gauge a detector’s precision.
In particular, sPHENIX successfully measured the number of charged particles that are produced when two gold ions collide, and determined how this number changes when the ions collide head-on, versus just glancing by. The detector’s measurements revealed that head-on collisions produced 10 times more charged particles, which were also 10 times more energetic, compared to less straight-on collisions.
“This indicates the detector works as it should,” says Gunther Roland, professor of physics at MIT, who is a member of, and former spokesperson for, the sPHENIX Collaboration. “It’s as if you sent a new telescope up in space after you’ve spent 10 years building it, and it snaps the first picture. It’s not necessarily a picture of something completely new, but it proves that it’s now ready to start doing new science.”
“With this strong foundation, sPHENIX is well-positioned to advance the study of the quark-gluon plasma with greater precision and improved resolution,” adds Hao-Ren Jheng, a graduate student in physics at MIT and a lead co-author of the new paper. “Probing the evolution, structure, and properties of the QGP will help us reconstruct the conditions of the early universe.”
The paper’s co-authors are all members of the sPHENIX Collaboration, which comprises over 300 scientists from multiple institutions around the world, including Roland, Jheng, and physicists at MIT’s Bates Research and Engineering Center.
“Gone in an instant”
Particle colliders such as Brookhaven’s RHIC are designed to accelerate particles at “relativistic” speeds, meaning close to the speed of light. When these particles are flung around in opposite, circulating beams and brought back together, any smash-ups that occur can release an enormous amount of energy. In the right conditions, this energy can very briefly exist in the form of quark-gluon plasma — the same stuff that sprung out of the Big Bang.
Just as in the early universe, quark-gluon plasma doesn’t hang around for very long in particle colliders. If and when QGP is produced, it exists for only about 10 to the minus 22 seconds, less than a sextillionth of a second. In this moment, quark-gluon plasma is incredibly hot, up to several trillion degrees Celsius, and behaves as a “perfect fluid,” moving as one entity rather than as a collection of random particles. Almost immediately, this exotic behavior disappears, and the plasma cools and transitions into more ordinary particles such as protons and neutrons, which stream out from the main collision.
“You never see the QGP itself — you just see its ashes, so to speak, in the form of the particles that come from its decay,” Roland says. “With sPHENIX, we want to measure these particles to reconstruct the properties of the QGP, which is essentially gone in an instant.”
“One in a billion”
The sPHENIX detector is the next generation of Brookhaven’s original Pioneering High Energy Nuclear Interaction eXperiment, or PHENIX, which measured collisions of heavy ions generated by RHIC. In 2021, sPHENIX was installed in place of its predecessor, as a faster and more powerful version, designed to detect quark-gluon plasma’s more subtle and ephemeral signatures.
The detector itself is about the size of a two-story house and weighs around 1,000 tons. It sits at the intersection of RHIC’s two main collider beams, where relativistic particles, accelerated from opposite directions, meet and collide, producing particles that fly out into the detector. The sPHENIX detector is able to catch and measure 15,000 particle collisions per second, thanks to its novel, layered components, including the MVTX, or micro-vertex — a subdetector that was designed, built, and installed by scientists at MIT’s Bates Research and Engineering Center.
Together, the detector’s systems enable sPHENIX to act as a giant 3D camera that can track the number, energy, and paths of individual particles during an explosion of particles generated by a single collision.
“sPHENIX takes advantage of developments in detector technology since RHIC switched on 25 years ago, to collect data at the fastest possible rate,” says MIT postdoc Cameron Dean, who was a main contributor to the new study’s analysis. “This allows us to probe incredibly rare processes for the first time.”
In the fall of 2024, scientists ran the detector through the “standard candle” test to gauge its speed and precision. Over three weeks, they gathered data from sPHENIX as the main collider accelerated and smashed together beams of gold ions traveling at nearly the speed of light. Their analysis of the data showed that sPHENIX accurately measured the number of charged particles produced in individual gold ion collisions, as well as the particles’ energies. What’s more, the detector was sensitive to a collision’s “head-on-ness,” and could observe that head-on collisions produced more particles with greater energy, compared to less direct collisions.
“This measurement provides clear evidence that the detector is functioning as intended,” Jheng says.
“The fun for sPHENIX is just beginning,” Dean adds. “We are currently back colliding particles and expect to do so for several more months. With all our data, we can look for the one-in-a-billion rare process that could give us insights on things like the density of QGP, the diffusion of particles through ultra-dense matter, and how much energy it takes to bind different particles together.”
This work was supported, in part, by the U.S. Department of Energy Office of Science, and the National Science Foundation.
Judges say EPA can take back billions in climate grants
1965 Cryptanalysis Training Workbook Released by the NSA
In the early 1960s, National Security Agency cryptanalyst and cryptanalysis instructor Lambros D. Callimahos coined the term “Stethoscope” to describe a diagnostic computer program used to unravel the internal structure of pre-computer ciphertexts. The term appears in the newly declassified September 1965 document Cryptanalytic Diagnosis with the Aid of a Computer, which compiled 147 listings from this tool for Callimahos’s course, CA-400: NSA Intensive Study Program in General Cryptanalysis.
The listings in the report are printouts from the Stethoscope program, run on the NSA’s Bogart computer, showing statistical and structural data extracted from encrypted messages, but the encrypted messages themselves are not included. They were used in NSA training programs to teach analysts how to interpret ciphertext behavior without seeing the original message...