State-of-the-art data science finds hidden relationships in data through human-steerable AI
Artificial and human intelligence work together in new research supported by the National Security Agency (NSA) and the University of Hawai‘i (UH). Researchers at UH's Hawai‘i Data Science Institute are using state-of-the-art artificial intelligence (AI) of the kind common in visual analytic systems to find hidden relationships between data elements.
Analysts working to understand high-dimensional data sets, such as large document collections, rely not only on the data but also on the specific questions users bring to it. Using an expanded design space of semantic interaction systems applied to state-of-the-art pipelines, analysts can draw insights from their domain knowledge and steer the AI toward a desired visualization.
Semantic interaction enables the direct manipulation of two-dimensional views of high-dimensional data, in which similar data are represented as clusters. The Zexplorer system, developed by Ph.D. candidate Alberto González Martínez, creates a visualization that allows users to manually connect and categorize documents, and then embeds that user knowledge into the system models. The system is built atop Zotero, a widely used document organization system.
“Traditional analytical pipelines are driven solely by algorithms or models, and without a human in the loop they can potentially limit sense-making by masking expected or known structure in the data. When doing data analysis, the insights are not solely based on the data but are mainly driven by the users’ specific questions,” said González, who will earn a degree in computer science from the University of Hawai‘i at Mānoa. “For example, the same document or corpus of text may hold numerous orthogonal pieces of information, each of which is valuable to different users to different degrees. This work provides a first step towards embedding the domain knowledge and questions into the system models, allowing analysts to steer the system model towards their own mental model.”
While the work has been applied to textual data, the methods are general enough to be used in other high-dimensional domains such as astronomy or genomics. They can also improve current research in visual analytics and drive the future discovery of more powerful pipelines that emphasize explainability in visual analytic systems.
González, master’s candidate Billy Troy Wooten, Ph.D. candidates Nurit Kirshenbaum and Dylan Kobayashi, and Hawai‘i Data Science Institute Co-Director Jason Leigh were awarded ‘Best Paper in trending now – machine learning and artificial intelligence’ at the 2020 Practice and Experience in Advanced Research Computing (PEARC) national conference for this research.
González and the team of UH researchers presented this research at PEARC20, which was held as a virtual conference from July 27 to 31, 2020.