• Yongfeng Zhang

Prof. Yongfeng Zhang received an NSF IIS grant for a project titled "Scrutable and Explainable Information Retrieval with Model Intrinsic and Agnostic Approaches" for a total amount of $500,000, covering a three-year period starting 10/01/2020. In this collaborative project with researchers from the University of Utah, the team will develop explainable machine learning algorithms that provide transparent and explainable results in the search engines people use in their daily lives.

Information Retrieval (IR) systems are an important means of information access. For example, intelligent search engines are widely used in Web-based services such as web search, product search, and job search. Recently, sophisticated data and complicated black-box models have made modern IR systems less transparent to users. However, as more and more people rely on IR systems to guide their daily lives and decision making, there is a growing need for explainable search results, both in technical communities and among the general public, so that users understand why certain search results are provided. Meanwhile, governmental agencies are demanding that IR systems provide not only high-quality results but also reasonable justifications, so as to enhance the trustworthiness of the systems. This project focuses on developing algorithms and frameworks to improve the scrutability, explainability, and transparency of modern IR systems. It aims to inspire large-scale academic-industry collaboration, which will benefit billions of users by facilitating the development of reliable and explainable information access services.

This project will develop general and reusable frameworks for scrutable and explainable IR. Research in this project will proceed in two directions. The first direction aims at new retrieval models for model-intrinsic explanation. This includes developing transparent inference processes and decision boundaries for retrieval actions, scrutable functions that support result exploration with user feedback, and traceable information flow to distinguish the contributions of model inputs. The second direction aims at building analytical and simulation frameworks for model-agnostic explanation. This includes post-hoc explanation systems with external knowledge, and a simulation framework over black-box retrieval models with explainable outputs. Beyond the model-intrinsic and model-agnostic approaches, this project will also investigate crowd-sourcing tasks and systematic metrics to compare the effectiveness of intrinsic and agnostic explanations. The research outcomes will include multiple public benchmark datasets and evaluation platforms for explainable IR, which will support sustainable and reproducible future studies in the research community.

More details can be found on the National Science Foundation's webpage at https://www.nsf.gov/awardsearch/showAward?AWD_ID=2007907 and https://www.nsf.gov/awardsearch/showAward?AWD_ID=2007398.