Software to improve efficiency of medical research

November 02, 2006

A data-finding tool that promises to dramatically improve the efficiency of medical research has been developed by a small team of ASU researchers.

The software, called Collaborative Bio Curation, or CBioC, can analyze vast amounts of biomedical data to locate and extract specific information critical to research efforts. It's a fusion of computer science, information management, medical research methods and clinical practice that could lead to significant advances in the way scientific data searches are conducted, say CBioC developers Chitta Baral and Graciela Gonzalez.

The quantity of biomedical literature worldwide increases on average by more than 1,000 articles and research papers each day, making it almost impossible for researchers to keep up with the latest findings, says Baral, a professor in ASU's Department of Computer Science and Engineering, and an affiliate faculty member with the Department of Biomedical Informatics. Both departments are part of the university's new School of Computing and Informatics.

CBioC is expected to save researchers the time and effort of wading through hundreds of thousands of articles to locate specific information relevant to their particular research, Baral says.

CBioC can be compared to Wikipedia in its collaborative capabilities, and to Google in its search capabilities, although Google searches only by terms and not by higher-level concepts.

For example, a Google search can find occurrences of information in medical literature about a specific gene. But Google doesn't produce the results provided by CBioC when performing a search for concepts such as “genes related to brain cancer,” or data on gene-disease relationships or protein interactions that are crucial to understanding diseases and the development of new therapies.

The program is a Web browser application that serves as a search engine and collaboration tool for PubMed, the primary online repository of biomedical papers maintained by the National Library of Medicine.

CBioC runs in a small frame at the bottom of the browser each time a researcher uses PubMed. When an article is selected, CBioC extracts and displays the facts reported in the article. For example, extracted facts that a certain gene has been found to be linked to brain cancer are added to the CBioC database. Similar facts then can be searched from within CBioC.
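The extract-store-search flow described above can be illustrated with a minimal sketch. It is not CBioC's actual code or data model: the `Fact` and `FactStore` names, the relation labels, and the placeholder gene and article identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One extracted relationship (hypothetical schema, not CBioC's)."""
    subject: str    # e.g. a gene name
    relation: str   # e.g. "associated_with"
    obj: str        # e.g. a disease name
    source_id: str  # placeholder identifier of the article it came from


class FactStore:
    """In-memory stand-in for a database of extracted facts."""

    def __init__(self):
        self._facts = set()

    def add(self, fact):
        # A set silently ignores exact duplicates of the same fact.
        self._facts.add(fact)

    def search(self, relation=None, obj=None):
        # Match on the relationship itself rather than on raw text terms.
        matches = (
            f for f in self._facts
            if (relation is None or f.relation == relation)
            and (obj is None or f.obj == obj)
        )
        return sorted(matches, key=lambda f: (f.subject, f.source_id))


store = FactStore()
store.add(Fact("GENE_A", "associated_with", "brain cancer", "article-1"))
store.add(Fact("GENE_B", "associated_with", "brain cancer", "article-2"))
store.add(Fact("GENE_A", "interacts_with", "PROTEIN_X", "article-3"))

# A concept-level query: "genes related to brain cancer".
genes = [f.subject for f in store.search("associated_with", "brain cancer")]
print(genes)
```

Storing facts as structured relationships, rather than as text, is what would let a query ask for "genes related to brain cancer" instead of matching individual words.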

CBioC allows individual researchers to vote on the correctness of the extracted facts and enables them to share notes and comments about the data among colleagues and other PubMed users. Over time, a consensus is reached among researchers as to which facts are correct, enabling information to be updated and kept accurate, Baral says.
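One simple way such a consensus could be computed is sketched below. The article does not say how CBioC tallies votes, so the `VoteTally` class, the minimum of three votes, and the 60 percent threshold are purely illustrative assumptions.

```python
from collections import defaultdict


class VoteTally:
    """Hypothetical vote counter; thresholds are assumptions, not CBioC's."""

    def __init__(self, min_votes=3, threshold=0.6):
        self.min_votes = min_votes
        self.threshold = threshold
        self._votes = defaultdict(dict)  # fact_id -> {user: True/False}

    def vote(self, fact_id, user, is_correct):
        # One vote per user per fact; a later vote replaces the earlier one.
        self._votes[fact_id][user] = is_correct

    def status(self, fact_id):
        votes = self._votes.get(fact_id, {})
        total = len(votes)
        if total < self.min_votes:
            return "undecided"
        share_correct = sum(votes.values()) / total
        if share_correct >= self.threshold:
            return "accepted"
        if share_correct <= 1 - self.threshold:
            return "rejected"
        return "undecided"


tally = VoteTally()
tally.vote("fact-1", "alice", True)
tally.vote("fact-1", "bob", True)
tally.vote("fact-1", "carol", False)
tally.vote("fact-2", "alice", False)

print(tally.status("fact-1"))  # two of three votes say "correct"
print(tally.status("fact-2"))  # only one vote so far
```

Requiring a minimum number of votes keeps a single early opinion from settling a fact, and letting a user's later vote replace an earlier one allows the consensus to shift as new evidence appears.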

Use of CBioC has been steadily increasing since it became available in December. Researchers throughout the United States, Japan, Australia and Europe are downloading the software. It also caught the attention of Science magazine, which featured CBioC in the NetWatch news section of its Web site.

Gonzalez, an ASU biomedical informatics researcher, says she and Baral are exploring ways to make CBioC useful to a wider range of biological and medical research endeavors.

“We've been talking with the Biodesign Institute at ASU about adapting the software to look for sugar and gene relationships,” Gonzalez says. “We're also working with TGen (the Translational Genomics Research Institute in Phoenix), exploring its applications for cancer research.”

Collaborative Bio Curation is available as a free download at www.cbioc.org.

Kelley Emeneker
(480) 965-4808
