Prestigious Grant Awarded to Prof. Yonatan Belinkov

Prof. Belinkov, together with Dr. David Bau from Northeastern University, has been awarded funding from Open Philanthropy for “Initiative for the Interpretable Control of AI”

Assistant Professor Yonatan Belinkov from the Henry and Marilyn Taub Faculty of Computer Science at the Technion – Israel Institute of Technology has been awarded funding from Open Philanthropy for “An Initiative for the Interpretable Control of Artificial Intelligence.” Open Philanthropy identifies outstanding giving opportunities, makes grants, follows the results, and publishes its findings. Its mission is to give as effectively as it can.

Prof. Belinkov won the grant together with Dr. David Bau from the Khoury College of Computer Sciences at Northeastern University. The grant will support the two research teams’ development of interpretable methods to control artificial intelligence.

Prof. Yonatan Belinkov

“Our initiative aims to develop methods to trace and analyze world knowledge in large language models,” said Prof. Belinkov. “We expect this research will help us deal with emergent and unexpected behaviors of AI systems, including potentially harmful behavior, by providing new ways to control unexpected capabilities that may emerge in AI systems.”

As automatic decisions made by AI systems increasingly affect human society, it is important that the objectives of these systems remain aligned with the best interests of humankind, even if their capabilities eventually surpass those of humans. To this end, the two researchers aim to open the AI “black box” and close the gap between human and AI knowledge by developing interpretable tools for mapping, evaluating, and controlling the processing of knowledge within large language models. Such tools would facilitate the study of ways to ameliorate serious alignment challenges in critical areas such as misinformation, bias, and privacy.

Prof. Belinkov joined the Taub Faculty of Computer Science in October 2020 after completing a Ph.D. at the Massachusetts Institute of Technology (MIT) and a post-doctorate at Harvard University and MIT.