Science Genome, a project that will use new statistical and analytical methods to address fundamental questions about how scientific fields and discoveries evolve, has received its Minerva Research Initiative funding.
YY Ahn (IUB Informatics), the project’s lead, says the project is ramping up with new students, computing resources, and help from IUNI, whose team will build the software infrastructure for the project.
Science Genome will collaborate with another major IUNI-partner project, the Collaborative Archive & Data Research Environment (CADRE).
CADRE is a platform that allows researchers to examine millions of scientific publications in massive bibliometric datasets, such as the Web of Science and Microsoft Academic Graph, and perform analyses and create visualizations of those datasets.
Science Genome’s initial award period is three years. Funding for an additional two years is contingent on the U.S. Department of Defense’s approval of the project’s progress, for a total of up to $4.4 million.
Ahn’s team includes Staša Milojević, Alessandro Flammini, and Fil Menczer (IUB Informatics). Sriraam Natarajan, who is now at the University of Texas at Dallas, is also a collaborator on the project. IUNI’s Xiaoran Yan and Valentin Pentchev will play key roles in Science Genome’s development.
Project potential
Ahn says Science Genome will use graph embedding methods to let researchers recast questions about how fields evolve as new quantitative questions in vector-space algebra and dynamics.
“In science of science, you deal with many different entities—authors, papers, and journals. Although they are closely linked together, it’s not easy to analyze them in a unified framework,” Ahn said. “Once you represent these entities as vectors living in the same space, it will become much easier to formulate fundamental questions about the dynamics of scientific enterprise with a completely new perspective.”
Ahn refers to Science Genome as a “useful, compact representation” of each entity (an author, paper, or journal) that succinctly captures its essential characteristics. With these representations, concepts such as the similarity between an author and a paper can be expressed with a simple mathematical formula. If the team wants to see how a scientific enterprise evolves, for example, they could simply observe how its vectors change over time and ask whether there is a general law behind the change.
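To make the idea of comparing entities in a shared vector space concrete, here is a minimal sketch in Python. It is illustrative only: the article does not say which formula the team uses, so cosine similarity is an assumption, and the three-dimensional vectors and the names author_2010, author_2020, and paper are hypothetical toy values rather than project data.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def displacement(earlier, later):
    """Return the component-wise change between two snapshots of a vector."""
    return [b - a for a, b in zip(earlier, later)]


# Hypothetical embeddings (assumed values, not real data): one author at
# two points in time, and one paper, all living in the same 3-D space.
author_2010 = [1.0, 0.0, 0.0]
author_2020 = [0.0, 1.0, 0.0]
paper = [1.0, 1.0, 0.0]

# Author-paper similarity at each point in time.
sim_2010 = cosine_similarity(author_2010, paper)
sim_2020 = cosine_similarity(author_2020, paper)

# How the author's vector moved between the two snapshots.
drift = displacement(author_2010, author_2020)
```

Because authors and papers share one space, the same function compares any pair of entities, and tracking displacement across snapshots is one simple way to look at how a vector changes over time.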
Science Genome can then be leveraged to create metrics and methods to assess the potential of scientific discoveries and innovations for specific scientific enterprises—from individual teams to entire countries.