
AI Generates Hypotheses Human Scientists Have Not Thought Of

Machine-learning algorithms can guide humans toward new experiments and theories

Machine learning techniques can help researchers develop novel hypotheses. Credit: Getty Images

Electric vehicles have the potential to substantially reduce carbon emissions, but car companies are running out of materials to make batteries. One crucial component, nickel, is projected to cause supply shortages as early as the end of this year. Scientists recently discovered four new materials that could potentially help—and what may be even more intriguing is how they found these materials: the researchers relied on artificial intelligence to pick out useful chemicals from a list of more than 300 options. And they are not the only humans turning to A.I. for scientific inspiration.

Creating hypotheses has long been a purely human domain. Now, though, scientists are beginning to ask machine learning to produce original insights. They are designing neural networks (a type of machine-learning setup with a structure inspired by the human brain) that suggest new hypotheses based on patterns the networks find in data instead of relying on human assumptions. Many fields may soon turn to the muse of machine learning in an attempt to speed up the scientific process and reduce human biases.

In the case of new battery materials, scientists pursuing such tasks have typically relied on database search tools, modeling and their own intuition about chemicals to pick out useful compounds. Instead a team at the University of Liverpool in England used machine learning to streamline the creative process. The researchers developed a neural network that ranked chemical combinations by how likely they were to result in a useful new material. Then the scientists used these rankings to guide their experiments in the laboratory. They identified four promising candidates for battery materials without having to test everything on their list, saving them months of trial and error.
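To make that workflow concrete, here is a minimal sketch in Python of the general idea: train a model on materials with known properties, score an untested candidate list, and send only the top few to the lab. The descriptors, scores, and model choice are hypothetical stand-ins, not the Liverpool team's actual pipeline.

```python
# Illustrative sketch (not the Liverpool team's actual code): train a simple
# model on known compounds, then rank untested chemical combinations so that
# only the most promising candidates go on to lab synthesis.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical descriptors for known compounds (e.g., composition features)
# and a measured "usefulness" score such as ionic conductivity.
X_known = rng.random((200, 8))
y_known = X_known @ rng.random(8) + 0.1 * rng.standard_normal(200)

model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
model.fit(X_known, y_known)

# Score a candidate list of untested combinations and pick the top few.
X_candidates = rng.random((300, 8))
scores = model.predict(X_candidates)
top = np.argsort(scores)[::-1][:4]
print("Candidates to synthesize first:", top)
```

Even a rough ranking like this can save months, because the expensive step is synthesizing and testing compounds in the lab, not generating predictions.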


“It’s a great tool,” says Andrij Vasylenko, a research associate at the University of Liverpool and a co-author of the study on finding battery materials, which was published in Nature Communications last month. The A.I. process helps identify the chemical combinations that are worth looking at, he adds, so “we can cover much more chemical space more quickly.”

The discovery of new materials is not the only area where machine learning could contribute to science. Researchers are also applying neural networks to larger technical and theoretical questions. Renato Renner, a physicist at Zurich’s Institute for Theoretical Physics, hopes to someday use machine learning to develop a unified theory of how the universe works. But before A.I. can uncover the true nature of reality, researchers must tackle the notoriously difficult question of how neural networks make their decisions.

Getting inside the Machine-Learning Mind

In the past 10 years, machine learning has become an extremely popular tool for classifying big data and making predictions. Explaining the logical basis for its decisions can be very difficult, however. Neural networks are built from interconnected nodes, modeled after the neurons of the brain, with a structure that changes as information flows through it. While this adaptive model is able to solve complex problems, it is also often impossible for humans to decode the logic involved.
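For readers unfamiliar with the mechanics, the toy sketch below shows what "interconnected nodes" means in practice. The weights here are random placeholders rather than learned values, but even at this tiny scale the individual numbers carry no obvious human-readable meaning, which is the heart of the problem described next.

```python
# Minimal sketch of a tiny neural network: each node combines signals from the
# nodes feeding into it. In a real network the connection weights are learned
# from data; here they are random placeholders for illustration only.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((4, 8))   # connections from 4 inputs to 8 hidden nodes
W2 = rng.standard_normal((8, 1))   # connections from hidden nodes to the output

def forward(x):
    hidden = np.tanh(x @ W1)       # each hidden node mixes all inputs
    return hidden @ W2             # the output node mixes all hidden nodes

print(forward(rng.standard_normal(4)))
```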

This lack of transparency has been nicknamed “the black box problem” because no one can see inside the network to explain its “thought” process. Not only does this opacity undermine trust in the results—it also limits how much neural networks can contribute to humans’ scientific understanding of the world.

Some scientists are trying to make the black box transparent by developing “interpretability techniques,” which attempt to offer a step-by-step explanation for how a network arrives at its answers. It may not be possible to obtain a high level of detail from complex machine-learning models. But researchers can often identify larger trends in the way a network processes data, sometimes leading to surprising discoveries—such as who is most likely to develop cancer.

Several years ago, Anant Madabhushi, a professor of biomedical engineering at Case Western Reserve University, used interpretability techniques to understand why some patients are more likely than others to have a recurrence of breast or prostate cancer. He fed patient scans to a neural network, and the network identified those with a higher risk of cancer recurrence. Then Madabhushi analyzed the network to find the most important feature for determining a patient’s probability of developing cancer again. The results suggested that how tightly glands’ interior structures are packed together is the factor that most accurately predicts the likelihood that a cancer will come back.
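The passage above does not spell out Madabhushi's exact method, but one widely used interpretability technique, permutation importance, captures the general idea: shuffle each input feature in turn and see how much the model's accuracy drops. The sketch below uses synthetic data in which a hypothetical "gland packing" feature drives the labels; it is an illustration of the technique, not the published analysis.

```python
# Hedged sketch of permutation importance: after training, shuffle each input
# feature and measure how much performance degrades. Features whose shuffling
# hurts most are the ones the model relies on most.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(2)

# Hypothetical per-patient features extracted from scans; feature 0 plays the
# role of "gland packing density" and drives the synthetic labels.
X = rng.random((300, 5))
y = (X[:, 0] > 0.6).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=3000, random_state=0).fit(X, y)
result = permutation_importance(clf, X, y, n_repeats=20, random_state=0)
print("Mean importance per feature:", result.importances_mean.round(3))
```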

“That wasn’t a hypothesis going in. We didn’t know that,” Madabhushi says. “We used a methodology to discover an attribute of the disease that turned out to be important.” It was only after the A.I. had drawn its conclusion that his team found the result also aligns with current scientific literature about pathology. The neural network cannot yet explain why the density of glands’ structure contributes to cancer, but it still helped Madabhushi and his colleagues better understand how tumor growth progresses, leading to new directions for future research.

When A.I. Hits a Wall

Although peeking inside the black box can help humans construct novel scientific hypotheses, “we still have a long way to go,” says Soumik Sarkar, an associate professor of mechanical engineering at Iowa State University. Interpretability techniques can hint at correlations that pop up in the machine-learning process, but they cannot prove causation or offer explanations. They still rely on subject matter experts to derive meaning from the network.

Machine learning also often uses data collected through human processes—which can lead it to reproduce human biases. One neural network, called Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), was even accused of being racist. The network has been used to predict incarcerated people’s likelihood of reoffending. A ProPublica investigation purportedly found that in one Florida county, the system incorrectly flagged Black people as likely to break the law after being released nearly twice as often as it did white people. Equivant, formerly called Northpointe, the criminal justice software company that created COMPAS, has disputed ProPublica’s analysis and claimed its risk-assessment program has been mischaracterized.
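As a rough illustration of the kind of disparity ProPublica reported, the sketch below compares false positive rates, the share of people who did not reoffend but were still flagged as high risk, across two groups. The numbers are made up for illustration; this is not the actual COMPAS data or ProPublica's analysis code.

```python
# Toy sketch of a group-fairness check on a risk tool: compare how often each
# group is flagged "high risk" among people who did not go on to reoffend.
# All figures below are synthetic placeholders.
import numpy as np

def false_positive_rate(flagged_high_risk, reoffended):
    no_reoffense = ~reoffended
    return (flagged_high_risk & no_reoffense).sum() / no_reoffense.sum()

rng = np.random.default_rng(3)
# Hypothetical flags and outcomes for two groups of 1,000 people each.
group_a_flags = rng.random(1000) < 0.45
group_a_reoffended = rng.random(1000) < 0.30
group_b_flags = rng.random(1000) < 0.25
group_b_reoffended = rng.random(1000) < 0.30

print("False positive rate, group A:", round(false_positive_rate(group_a_flags, group_a_reoffended), 3))
print("False positive rate, group B:", round(false_positive_rate(group_b_flags, group_b_reoffended), 3))
```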

Despite such issues, Renner, the Zurich-based physicist, remains hopeful that machine learning can help people pursue knowledge from a less biased perspective. Neural networks could inspire people to think about old questions in new ways, he says. While the networks cannot yet make hypotheses entirely by themselves, they can give hints and direct scientists toward a different view of a problem.

Renner is going so far as to try designing a neural network that can examine the true nature of the cosmos. Physicists have been unable to reconcile two theories of the universe—quantum theory and Einstein’s general theory of relativity—for more than a century. But Renner hopes machine learning will give him the fresh perspective he needs to bridge science’s understanding of how matter works at the scales of the very small and very large.

“We can only make big steps in physics if we look at stuff in an unconventional way,” he says. For now, he is building up the network with historical theories, giving it a taste of how humans think the universe is structured. In the next few years, he plans to ask it to come up with its own answer to this ultimate question.