UW Interactive Data Lab

rTisane: Externalizing Conceptual Models for Data Analysis Prompts Reconsideration of Domain Assumptions and Facilitates Statistical Modeling

Eunice Jun, Edward Misback, Jeffrey Heer, René Just. ACM Human Factors in Computing Systems (CHI), 2024
Figure: rTisane provides a DSL for specifying conceptual models (left). Analysts validate and refine their conceptual models as the first step of a two-phase interactive disambiguation process. Interactive refinement updates the internal graph representation (middle). rTisane traverses this graph to formulate possible statistical models. Analysts learn about rTisane’s modeling decisions and can update them prior to getting a statistical modeling script as output (right).
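To make the pipeline in the figure concrete, here is a minimal Python sketch of the general idea: a conceptual model stored as a graph, an ambiguous relationship resolved by the analyst, and a traversal that proposes a model formula. It is an illustration under stated assumptions, not rTisane itself. rTisane is an R package, and every name here (ConceptualModel, add_cause, add_relates, disambiguate, candidate_formula, and the example variables) is hypothetical. The rule used to choose covariates (include variables that directly cause both the independent and dependent variable) is a deliberate simplification and should not be read as rTisane's actual traversal rules.

```python
from collections import defaultdict


class ConceptualModel:
    """Toy conceptual model: directed 'causes' edges plus ambiguous 'relates' pairs.

    Hypothetical illustration only; this is not rTisane's API.
    """

    def __init__(self):
        self.causes = defaultdict(set)  # cause -> set of effects
        self.relates = set()            # frozenset({a, b}): direction not yet known

    def add_cause(self, cause, effect):
        self.causes[cause].add(effect)

    def add_relates(self, a, b):
        self.relates.add(frozenset((a, b)))

    def disambiguate(self, a, b, cause):
        """Resolve an ambiguous pair into a directed edge (the refinement step)."""
        pair = frozenset((a, b))
        if pair not in self.relates:
            raise KeyError(f"no ambiguous relationship between {a} and {b}")
        if cause not in pair:
            raise ValueError("cause must be one of the two related variables")
        self.relates.remove(pair)
        effect = b if cause == a else a
        self.add_cause(cause, effect)

    def parents(self, var):
        """Variables with a direct 'causes' edge into var."""
        return {c for c, effects in self.causes.items() if var in effects}


def candidate_formula(model, dv, iv):
    """Propose a formula: the IV plus direct common causes of the IV and the DV.

    Assumed simplification: only direct common causes are treated as covariates.
    """
    if model.relates:
        raise ValueError("resolve ambiguous relationships before modeling")
    confounders = (model.parents(dv) & model.parents(iv)) - {iv, dv}
    return f"{dv} ~ " + " + ".join([iv] + sorted(confounders))


# Hypothetical example variables.
model = ConceptualModel()
model.add_cause("tutoring", "test_score")
model.add_cause("motivation", "tutoring")
model.add_relates("motivation", "test_score")  # analyst unsure of direction
model.disambiguate("motivation", "test_score", cause="motivation")
print(candidate_formula(model, dv="test_score", iv="tutoring"))
```

In this sketch, formula generation refuses to run while any relationship is still ambiguous, which mirrors the idea of allowing ambiguity when the model is first written down and resolving it before a statistical model is produced.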
Materials
PDF | Software | Best Paper Award
Abstract
Statistical models should accurately reflect analysts’ domain knowledge about variables and their relationships. While recent tools let analysts express these assumptions and use them to produce a resulting statistical model, it remains unclear what analysts want to express and how externalization impacts statistical model quality. This paper addresses these gaps. We first conduct an exploratory study of analysts using a domain-specific language (DSL) to express conceptual models. We observe a preference for detailing how variables relate and a desire to allow, and then later resolve, ambiguity in their conceptual models. We leverage these findings to develop rTisane, a DSL for expressing conceptual models augmented with an interactive disambiguation process. In a controlled evaluation, we find that analysts reconsidered their assumptions, self-reported externalizing their assumptions accurately, and maintained analysis intent with rTisane. Additionally, rTisane enabled some analysts to author statistical models they were unable to specify manually. For others, rTisane resulted in models that better fit the data or enabled iterative improvement.
BibTeX
@inproceedings{2024-rtisane,
  title = {rTisane: Externalizing Conceptual Models for Data Analysis Prompts Reconsideration of Domain Assumptions and Facilitates Statistical Modeling},
  author = {Jun, Eunice and Misback, Edward and Heer, Jeffrey and Just, Ren\'{e}},
  booktitle = {ACM Human Factors in Computing Systems (CHI)},
  year = {2024},
  url = {https://uwdata.github.io/papers/rtisane},
  doi = {10.1145/3613904.3642267}
}