ComputableViz: Mathematical Operators as a Formalism for Visualization Processing and Analysis

Original link: http://vis.pku.edu.cn/blog/computableviz/

With the growing availability and popularity of visualization authoring tools, a large number of visualizations have been produced and shared on the web. As visualizations are generated and digitized at scale, research interest in techniques for processing and analyzing them has grown. For example, researchers have begun to study style transfer and example-based retrieval for visualizations. As new ideas and questions keep emerging, a unifying framework for this body of work would help guide future research. In this context, Aoyu Wu and colleagues from three institutions, including the Hong Kong University of Science and Technology, proposed ComputableViz [1], a unified framework based on mathematical operators for visualization processing and analysis.


The design space of visualization operations has two dimensions: operation object and operation type.

The design space of the framework has two dimensions: the operation object and the operation type. The operation object is a basic visualization primitive, which is either data-related or style-related. The primitives follow the Vega-Lite language, whose specification consists of view specification, data, data transformation, mark, encoding, view composition, parameters, and configuration. The authors group these into data (data and transformations) and style (mark, encoding, and so on). Operation types comprise basic operations, which act on two visualizations, and advanced operations, which act on more than two. For the basic operations, the authors draw on mature image processing and analysis tools and identify union, difference, and intersection. The advanced operations draw on existing work on machine learning for visualization, visualization comparison, composite visualization, visualization composition, and more. Each operation is described below.


Visualization operators

Basic operations include union (merge), difference, and intersection. The union operation supports style transfer, that is, applying the style of one visualization to another. It also supports augmenting a static visualization with new data, a task called mosaicing, in which two or more visualizations showing different views of a dataset are combined to represent the complete view. Union further supports visualization composition and visual comparison, including juxtaposition, overlay, and item-by-item juxtaposition. The difference operation can be used to explicitly encode differences, which requires that the data and styles of the two visualizations match. If they do not match, data differences and style differences can be computed instead. Both kinds of difference map to semantic operations that are useful for visualization authoring: data differences can be used to generate data stories, and style differences can inform chart recommendation or visualization redesign. The intersection operation is closely related to difference and covers explicitly encoding the intersection, taking the data intersection, and taking the style intersection.
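To make the data/style split and the style-transfer use of union more concrete, here is a minimal Python sketch. It assumes that the top-level Vega-Lite keys `data` and `transform` count as data-related and that everything else counts as style; the names `DATA_KEYS`, `split_spec`, and `style_transfer` are made up for illustration and are not taken from the paper, whose exact partition may differ.

```python
import copy

# Assumption: these top-level Vega-Lite keys are "data"; all other keys are "style".
DATA_KEYS = {"data", "transform"}


def split_spec(spec):
    """Split a Vega-Lite spec (a dict) into data-related and style-related parts."""
    data = {k: v for k, v in spec.items() if k in DATA_KEYS}
    style = {k: v for k, v in spec.items() if k not in DATA_KEYS}
    return data, style


def style_transfer(style_source, data_target):
    """Combine the target's data primitives with the source's style primitives."""
    data, _ = split_spec(data_target)
    _, style = split_spec(style_source)
    # Deep copy so that editing the result never mutates either input spec.
    return copy.deepcopy({**style, **data})


# Two toy specs with made-up values; both use the same field names "a" and "b".
plain_bar = {
    "data": {"values": [{"a": "A", "b": 28}, {"a": "B", "b": 55}]},
    "mark": "bar",
    "encoding": {"x": {"field": "a", "type": "nominal"},
                 "y": {"field": "b", "type": "quantitative"}},
}
styled_bar = {
    "data": {"values": [{"a": "C", "b": 3}]},
    "mark": {"type": "bar", "color": "firebrick"},
    "encoding": {"x": {"field": "a", "type": "nominal"},
                 "y": {"field": "b", "type": "quantitative"}},
    "config": {"axis": {"grid": False}},
}

restyled = style_transfer(styled_bar, plain_bar)
print(restyled["mark"], len(restyled["data"]["values"]))
```

The sketch only works cleanly because both charts share field names. When the fields differ, the transferred encoding would point at columns that do not exist in the target data, so a real system would need some form of field matching.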

Advanced operations fall into six categories: serialization, topological sorting, matching and filtering, clustering, synthetic aggregation, and representative aggregation. Serialization finds the best order for a set of visualizations. However, relationships between visualizations are often non-linear and are instead modeled as a graph in which each node represents a visualization and each edge represents an edit operation or a difference between visualizations. This second task, topological sorting, has been studied extensively in visualization recommender systems. Matching and filtering and clustering are both concerned with measuring differences between visualizations. Matching and filtering relates to visualization retrieval, that is, finding visualizations similar to an input visualization. Clustering supports meta-visualization analysis. Both tasks require a distance function that converts differences and commonalities into a numerical score. Synthetic aggregation and representative aggregation summarize multiple visualizations: the former aggregates the data of multiple visualizations, and the latter selects representative visualizations from a collection.
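The distance function mentioned above can be illustrated with a small sketch. It uses the Jaccard distance over (key, value) pairs of already flattened specifications, which is one plausible choice and not necessarily the measure used in the paper. The function names `distance`, `retrieve`, and `representative`, and the three toy charts, are assumptions made for this example.

```python
def distance(table_a, table_b):
    """Jaccard distance between two flattened specs given as {key: value} dicts."""
    a, b = set(table_a.items()), set(table_b.items())
    if not a and not b:
        return 0.0
    # Shared pairs are the commonalities; pairs found on one side only are the differences.
    return 1.0 - len(a & b) / len(a | b)


def retrieve(query, collection, k=1):
    """Matching and filtering: names of the k charts closest to the query."""
    return sorted(collection, key=lambda name: distance(query, collection[name]))[:k]


def representative(collection):
    """Representative aggregation: the chart with the smallest total distance to the rest."""
    return min(
        collection,
        key=lambda name: sum(distance(collection[name], other)
                             for other in collection.values()),
    )


# Toy flattened style tables with made-up contents.
charts = {
    "bar": {"mark": "bar", "encoding-x-field": "a", "encoding-y-field": "b"},
    "red_bar": {"mark": "bar", "encoding-x-field": "a", "encoding-y-field": "b",
                "encoding-color-value": "red"},
    "line": {"mark": "line", "encoding-x-field": "t", "encoding-y-field": "b"},
}

print(distance(charts["bar"], charts["red_bar"]))  # 3 shared pairs out of 4 -> 0.25
print(retrieve(charts["bar"], {n: c for n, c in charts.items() if n != "bar"}))
print(representative(charts))
```

The same distance could feed a clustering routine. The point here is only that differences and commonalities between two visualizations reduce to a single number once both are expressed as tables.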

At the implementation level, the framework builds on the Vega-Lite grammar and database theory. A visualization's Vega-Lite specification is converted into a relational database consisting of base tables and mapping tables, and nested JSON objects are flattened by concatenating the keys along each nesting path with a dash (e.g. “encoding-x-field”). All the operators above are implemented through join operations on relational tables. The union operator builds on the FULL OUTER JOIN clause, which returns every record that appears in the left or the right table, combining records that match. The intersection operator is implemented with the INNER JOIN clause, which returns only records that have a match in both tables. The difference operator consists of three steps, which include an ANTI-JOIN of the left and right tables to find the records that appear only in the left or only in the right table.


Visual representation as relational table
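The flattening and the three joins can be sketched in plain Python, with a dict from flattened key to value standing in for a relational table. This is a simplified illustration under stated assumptions, not the authors' implementation: it ignores the base/mapping table split, joins on the key column only, and the helper names (`flatten`, `full_outer_join`, `inner_join`, `anti_join`, `difference`) are hypothetical.

```python
def flatten(obj, prefix=""):
    """Flatten nested JSON into rows such as {"encoding-x-field": "a"}."""
    rows = {}
    if isinstance(obj, dict):
        for name, child in obj.items():
            rows.update(flatten(child, f"{prefix}{name}-"))
    elif isinstance(obj, list):
        for index, child in enumerate(obj):
            rows.update(flatten(child, f"{prefix}{index}-"))
    else:
        rows[prefix[:-1]] = obj  # drop the trailing dash
    return rows


def full_outer_join(left, right):
    """Union: every key from either table. None marks a side without a matching
    row (a JSON null would look the same in this sketch)."""
    return {k: (left.get(k), right.get(k)) for k in sorted(left.keys() | right.keys())}


def inner_join(left, right):
    """Intersection: only keys present in both tables."""
    return {k: (left[k], right[k]) for k in sorted(left.keys() & right.keys())}


def anti_join(left, right):
    """Rows of `left` whose key has no match in `right`."""
    return {k: left[k] for k in sorted(left.keys() - right.keys())}


def difference(left, right):
    """One reading of the difference steps: an anti-join in each direction."""
    return {"left_only": anti_join(left, right), "right_only": anti_join(right, left)}


# Two toy style specifications with made-up fields.
left = flatten({"mark": "bar", "encoding": {"x": {"field": "a"}, "y": {"field": "b"}}})
right = flatten({"mark": "line", "encoding": {"x": {"field": "a"}, "color": {"field": "c"}}})

print(full_outer_join(left, right))
print(inner_join(left, right))
print(difference(left, right))
```

Because the join is on the key alone, a row such as `mark` shows up in the inner join with both values ("bar" and "line"). Joining on key and value together would instead treat those as two unrelated records.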

An operator function takes two visualizations and a set of parameters as input. The on parameter specifies the columns on which to perform the join: “key” for the primary key or “all” for all columns. The how parameter handles data conflicts and can be “left” or “right” (keep the left or the right record), or “merge”, which keeps both by adding a new indicator column that records whether each record comes from the left or the right table. Setting the “auto_encoding” parameter to true automatically creates a new visual encoding for the indicator column.


Operator functions and parameters
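The behavior of the on and how parameters can be approximated with a short sketch that operates on tables given as lists of dict rows. The function name `union`, the `key` and `indicator` arguments, the "both" label for identical rows, and the sample values are assumptions made for illustration; the real function signature may differ, and auto_encoding is not modeled here.

```python
def union(left, right, on="key", how="left", key="key", indicator="_source"):
    """Full outer join of two row tables, resolving conflicts according to `how`."""
    if on not in ("key", "all") or how not in ("left", "right", "merge"):
        raise ValueError("unsupported value for 'on' or 'how'")

    def row_id(row):
        # on="key": rows match on the primary key; on="all": on every column.
        return row[key] if on == "key" else tuple(sorted(row.items()))

    left_by_id = {row_id(r): r for r in left}
    right_by_id = {row_id(r): r for r in right}
    ids = list(left_by_id) + [i for i in right_by_id if i not in left_by_id]

    out = []
    for i in ids:
        l, r = left_by_id.get(i), right_by_id.get(i)
        if how == "merge":
            if l is not None and l == r:
                out.append({**l, indicator: "both"})  # identical on both sides
                continue
            # Keep both versions and record where each one came from.
            if l is not None:
                out.append({**l, indicator: "left"})
            if r is not None:
                out.append({**r, indicator: "right"})
        elif l is not None and r is not None:
            out.append(dict(l if how == "left" else r))  # conflict: pick one side
        else:
            out.append(dict(l if l is not None else r))  # present on one side only
    return out


# Made-up data tables in which key "B" carries conflicting values.
left = [{"key": "A", "value": 28}, {"key": "B", "value": 55}]
right = [{"key": "B", "value": 60}, {"key": "C", "value": 43}]

print(union(left, right, on="key", how="left"))
print(union(left, right, on="key", how="merge"))
```

With how set to "merge", the extra `_source` column is what an automatic encoding step could then map to a visual channel such as color, so that conflicting records remain distinguishable in the combined chart.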

The framework has diverse application scenarios, including but not limited to style transfer, visualization composition, version control, meta-visualization analysis, cluster analysis, and exploring the lineage of visualizations. The first scenario is style transfer, where the style of one visualization is applied to another. The second is visualization composition. For example, interactively transforming a bar chart into a stacked bar chart in an AR environment can improve the user experience by treating the visualization as a whole. The third is version control. If two people edit the same visualization in their own branches and the edit histories are saved, the branches can be merged as in git. The fourth is meta-visualization analysis (visualization stitching). Suppose we have a series of pie charts showing each country's share of world carbon emissions year by year. Combining them into a heat map shows every country's share in every year. There are two charts for 2017 whose values differ, and combining and comparing these two makes the specific differences visible. Further scenarios include cluster analysis and visualization lineage analysis.


Application Scenario: Version Control
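As a rough illustration of the version-control scenario, the sketch below applies a git-style three-way merge to flattened specification tables. This borrows git's merge strategy rather than the paper's own operators, and the function name `three_way_merge`, the `MISSING` sentinel, and the sample edits are all assumptions made for this example.

```python
MISSING = object()  # sentinel for "this key does not exist in this version"


def three_way_merge(base, ours, theirs):
    """Merge two edited versions of a flattened spec against their common ancestor."""
    merged, conflicts = {}, []
    for k in sorted(base.keys() | ours.keys() | theirs.keys()):
        b = base.get(k, MISSING)
        o = ours.get(k, MISSING)
        t = theirs.get(k, MISSING)
        if o == t:        # both branches agree (including "both deleted")
            value = o
        elif o == b:      # only their branch changed this key
            value = t
        elif t == b:      # only our branch changed this key
            value = o
        else:             # both changed it in different ways
            conflicts.append(k)
            continue
        if value is not MISSING:
            merged[k] = value
    return merged, conflicts


base = {"mark": "bar", "encoding-x-field": "a", "encoding-y-field": "b"}
ours = {**base, "mark": "line"}                   # one editor changes the mark
theirs = {**base, "encoding-color-field": "c"}    # the other adds a color encoding

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged, conflicts)
```

Keys listed in `conflicts` are the ones a person would have to resolve by hand, much like conflict markers in a git merge.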


Application scenario: meta-visualization analysis (visualization stitching)

References:

[1] Aoyu Wu, Wai Tong, Haotian Li, Dominik Moritz, Yong Wang, and Huamin Qu. ComputableViz: Mathematical Operators as a Formalism for Visualisation Processing and Analysis. In Proceedings of CHI Conference on Human Factors in Computing Systems, Article No. 410, pp. 1–15, 2022.

