ASU team receives half-million-dollar NSF grant for web-based data-categorization app
You’re a social scientist trying to do a study on health, wealth and discrimination across hundreds of ethnicities worldwide. Until recently, this would have taken a lot of time and manual calculations with plenty of room for error.
There was no way to easily connect data on health and discrimination for ethnicities across thousands of datasets from countries around the world, explains Daniel Hruschka, professor and associate director at the School of Human Evolution and Social Change at Arizona State University.
Out of frustration, Hruschka and colleagues teamed up and started building CatMapper three years ago. The program helps scientists map categories between different datasets, which inspired the name "CatMapper."
“The number of datasets with information about ethnicities, religions, languages and geographic districts has exploded in recent years, and there is great potential for new analyses that bring these diverse datasets together,” Hruschka said. “However, bringing them together is a wicked problem, because every dataset has a different way of labeling the same thing. To make matters worse, there are thousands of religions, tens of thousands of ethnicities, languages and dialects, and hundreds of thousands of geographic districts to map across datasets.”
The app builds bridges between datasets that represent the same information but categorize, or name, it in different ways. This makes it much easier to unlock and combine data from a far larger number of sources.
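To make the idea of bridging concrete, here is a minimal sketch in Python. It is not CatMapper's actual code or API. It assumes a hand-built crosswalk that links each dataset's own label to a shared category ID. The dataset names, labels, IDs and counts are all invented for illustration.

```python
# Hypothetical crosswalk: (dataset name, that dataset's label) -> shared category ID.
# All names, IDs, and numbers are made up for illustration; this is not CatMapper's data or API.
CROSSWALK = {
    ("survey_a", "Spanish"): "LANG_001",
    ("survey_b", "Español"): "LANG_001",
    ("survey_a", "Navajo"): "LANG_002",
    ("survey_b", "Diné bizaad"): "LANG_002",
}


def bridge(datasets):
    """Group values from several datasets under shared category IDs.

    Returns the merged values plus a list of labels that had no mapping,
    so that unmatched categories stay visible instead of being dropped silently.
    """
    merged = {}
    unmatched = []
    for name, rows in datasets.items():
        for label, value in rows.items():
            shared_id = CROSSWALK.get((name, label))
            if shared_id is None:
                unmatched.append((name, label))
                continue
            merged.setdefault(shared_id, {})[name] = value
    return merged, unmatched


# Two toy datasets that label the same languages differently.
datasets = {
    "survey_a": {"Spanish": 120, "Navajo": 15},
    "survey_b": {"Español": 98, "Diné bizaad": 22, "Unknown": 3},
}
merged, unmatched = bridge(datasets)
```

Keeping the crosswalk as explicit data, rather than burying the matching choices in analysis code, is one simple way to leave a record of which labels were treated as the same category.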
“CatMapper helps users sort through this Wild West of categories when bringing data together from different sources,” Hruschka said.
Recently, the project received a $550,000 National Science Foundation grant to continue building and expanding the platform. Year to date, CatMapper has received more than 40,000 views and has helped build new datasets for a number of projects.
Currently, CatMapper houses two applications, each handling a different kind of category. SocioMap handles sociopolitical categories and ArchaMap handles archaeological artifact types, explained Robert Bischoff, an anthropology graduate student. Bischoff is the primary developer of the applications; he has written all of the code and manages the database.
“I never thought I'd be managing websites and Linux servers as an archaeologist, but not only have I learned how to do these things working on CatMapper, I'm now in a full-time position with the Center for Archaeology and Society where I'm using these same skills as the database manager,” Bischoff said.
The applications have four functions: exploring information on hundreds of thousands of categories; translating categories from new datasets; bridging data across datasets; and documenting and sharing users’ prior work.
“The big thing is how do you determine what counts as the same across different datasets when bringing data together?” Hruschka said. “Other people have done comparative studies like this where they have brought data together, but it's a challenging task and it involves tons of decisions. And these decisions are usually not well-documented. So if you want to try and replicate what someone has done in the past, it's almost impossible.”
The web-based applications are free, and Hruschka said users include scholars and policymakers, as well as graduate and undergraduate students. The team also aims to make them useful for a wider range of everyday users.
The CatMapper team also includes Matthew Peeples, associate professor at the School of Human Evolution and Social Change, and Sharon Hsiao, assistant professor at the School of Engineering at Santa Clara University.