Analysing and designing visualizations – Diagrammatics (1984) revisited

This paper reviews how the authors’ current framework – the DNA of visualization – has evolved from the work laid out in Diagrammatics (Richards, 1984). The goal of this line of work was, and is, to enable the analysis and specification of an extensive range of different types of visual representations of information, such as statistical charts, maps, family trees, Venn diagrams, flow charts, texts using indenting, technical drawings and scientific illustrations. Inspired by an analogy with language, fundamental possibilities of graphic organization were identified in 1984. This work has been further developed into the current DNA of visualization framework. We identify the main concepts within the current framework and trace their roots to the 1984 work.


Introduction
This paper is about the development of a framework that may support designers in the creation of charts, diagrams or other visualizations. By defining the fundamental building blocks of such visual encoding systems, and their various combinatorial possibilities, the framework can be used to explore design choices, deconstruct visualizations, and guide visualization research. This work is presented here along with the origins of this research in 1984, the relevance of which has grown (Scha, 2005).

"I want to thank […] Clive Richards for his phonebook-sized thesis titled 'Diagrammatics', which I read while trying to live in a cave on the Canary Islands. While rats were chewing holes into my inflatable mattress, Clive's book made a lasting impression on my thinking about graphic representation." (Engelhardt, 2002, p. xi)
In this paper the co-authors outline their joint work, with a backward glance to the original Diagrammatics from 1984 that lies at its foundation.

Parts of graphical speech – an analogy with language
The relational meaning of a diagram is taken from the arrangement of its elements, and in this respect it is akin to a sentence or text. Although we can distinguish between sentences and diagrams, in that amongst other things the former have a one-dimensional, one-directional scheme to order their elements, and the latter have the potential to utilize fully two (or even three) dimensions, both make use of a grammar to establish their meaning. (10/2-3)
The original Diagrammatics thesis proposes a "grammatically-based analysis" (10/3). For example, the reader is invited to "Consider Figure [2] which may be thought of as saying, 'A is connected to B'. We might then say that 'A' is the equivalent of a grammatical subject and its connection with 'B' is the predicate; thus, the line serves a verb-like function for the nouns A and B."
Richards, C., & Engelhardt, Y. | Analysing and designing visualizations – Diagrammatics (1984) revisited
In line with this idea, Graham Wills writes that "a visualization can be defined by a collection of 'parts of graphical speech', so a well-formed visualization will have a structure, but within that structure you are free to substitute a variety of different items for each part of speech" (Wills, 2012, p. 22). Our current work includes a 'universal grammar' that describes how 'parts of graphical speech' can be combined. We have devised a system of grammar-based, colour-coded tree diagrams that specify these syntactic relationships and describe the compositional syntax of different visualization types (Richards & Engelhardt, forthcoming).
In our approach, visual encodings include not only the use of Bertin's (1967) 'visual variables', but also Gestalt principles of perception. Colour coding and shape coding use the Gestalt principle of 'similarity'. Connecting is an application of the Gestalt principle of 'connection'. Grouping by position can be achieved either through spatial proximity – using the Gestalt principle of 'proximity' – or through spatial alignment – using the Gestalt principle of 'continuity'. Our visual encodings also cover some of Johnson's (1987) and Lakoff's (1987) 'image schemata', concepts from Tversky's (1995) 'cognitive origins of graphic conventions' and Ware's (2008) 'graphical codings' (for a description of how our approach relates to all of these, see Engelhardt & Richards, 2018).
A visual component can be involved in several different visual encodings simultaneously, often representing different types of information.
An overview of visual encodings, categorized into arranging, linking and varying, is given in Figure 3, along with explanations and examples.

The DNA of visualization
In our joint work, the 1984 mode of organization has been extended and renamed as the mode of visual encoding. It now encompasses not only the visual encodings themselves, but also the comprehensive catalogue of building blocks that make up the DNA of visualization.

'DNA' and 'species' – a metaphor for visualization
In their 'Tour through the Visualization Zoo', Jeffrey Heer et al. (2010, p. 60) say that "all visualizations share a common 'DNA' – a set of mappings between data properties and visual attributes such as position, size, shape, and color – and that customized species of visualization might always be constructed by varying these encodings." We use this metaphorical idea of the "DNA" and "species of visualization" in a similar vein, taking it to the extent of identifying a comprehensive set of DNA building blocks that specify different 'visualization species', and the rules for combining these building blocks. This allows for the construction of a broad range of different types of visualization – Heer's "customized species of visualization". The DNA building blocks of 'visualization species' in this biological metaphor correspond to the 'parts of graphical speech' in the linguistic analogy discussed above. We will refer to these building blocks as 'VisDNA'.
1 In this context, 'components from the same set' means components fulfilling the same general function in a visualization.
2 When the exact locations are meaningful for all the points on a demarcating line, enclosure or shared background, we do not regard those as grouping by boundary, but as line locators or surface locators (e.g. country borders or areas on a map).

Figure 3
Visual encodings, categorized into arranging (red), linking (pink), and varying (blue). Picturing involves arranging into a configuration as well as varying visual appearance, hence the combination of red and blue colouring.

The main groups of VisDNA building blocks
The VisDNA building blocks fall into several main groups – these main groups and their relationships are shown in Figure 4. We have given each group a colour code. These groups are: types of information to be represented (grey DNA), visual encodings to represent them (red/blue/pink DNA), visual components that make up the visualization (green DNA), and any directions or layout principles that may be involved (black-on-white DNA). In addition to colour coding, every VisDNA building block has a three-letter code, as shown in Figures 3 and 6. These codes have been devised for the convenience of auditing visualizations, a process introduced in section 10.
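As an illustration only, the grouping and coding scheme described above could be modelled as a small data structure. The following Python sketch is not part of the VisDNA framework itself: the class and function names are hypothetical, only the codes BLO, BAR, ver and hor are taken from this paper, and reading 'ver' and 'hor' as vertical and horizontal is our assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Group(Enum):
    """Main groups of building blocks, valued by the colour code named in the text."""
    INFORMATION_TYPE = "grey"
    VISUAL_ENCODING = "red/blue/pink"
    VISUAL_COMPONENT = "green"
    DIRECTION_OR_LAYOUT = "black-on-white"


@dataclass(frozen=True)
class BuildingBlock:
    """One building block: a three-letter code, a readable name and its group."""
    code: str
    name: str
    group: Group

    def __post_init__(self):
        # The text states that every building block has a three-letter code.
        if len(self.code) != 3:
            raise ValueError(f"expected a three-letter code, got {self.code!r}")


# Codes quoted in this paper; the readable names are illustrative assumptions.
BLOCK = BuildingBlock("BLO", "block", Group.VISUAL_COMPONENT)
BAR = BuildingBlock("BAR", "bars", Group.VISUAL_COMPONENT)
VERTICAL = BuildingBlock("ver", "vertical (assumed reading)", Group.DIRECTION_OR_LAYOUT)
HORIZONTAL = BuildingBlock("hor", "horizontal (assumed reading)", Group.DIRECTION_OR_LAYOUT)


def colour_of(block: BuildingBlock) -> str:
    """Return the colour code of the group a building block belongs to."""
    return block.group.value
```

With these definitions, `colour_of(BAR)` gives the colour of the visual component group, and constructing a block with a code of any other length raises an error.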

Visualization species
We refer to a 'well-formed' combination of building blocks, i.e., one that follows the VisDNA grammar rules (Richards & Engelhardt, forthcoming), as a visualization species. Tamara Munzner (2014)

Figure 4
This basic overview diagram shows the main groups of VisDNA building blocks and how they relate to each other: types of information in terms of the questions they answer, possible visual encodings (listed separately in Figure 3), visual components (listed separately in Figure 6), and layout principles that may be used in a visualization species.
[…] species have been given a name (e.g., 'pie chart') and are generally referred to as 'chart types', while novel or rare visualization species often do not have a name (yet). As Heer et al. (2010, p. 67) write, "many more species of visualization exist in the wild, and others await discovery." There is, however, no standard for classifying visualization species (chart types). For example, does using vertical bars versus horizontal bars constitute a different type of chart? Does a chronological ordering of bars versus an ordering by value constitute the same type of chart? There are many ways to 'draw the lines' between species, subspecies or variants of species, and most of the differences between these can be identified by differences in their VisDNA. We have analyzed a large number of visualization species using the VisDNA system, including most of the corpus at datavizproject.com plus many other examples. Example analyses can be found on our accompanying website: VisDNA.com.
An aspect of visualization that largely falls outside the VisDNA framework is the prescription of 'rules for good design'. Like academic work in linguistics, the framework is primarily descriptive rather than prescriptive, in the sense that it enables the understanding and modelling of (graphic) language.
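The observation that differences between species can be identified by differences in their VisDNA can be illustrated with a minimal Python sketch. It is a deliberate simplification and rests on assumptions: a species is reduced to a flat set of codes, whereas an actual VisDNA specification is a tree that follows the grammar rules, and the two example sets and the function name are hypothetical.

```python
def visdna_difference(species_a, species_b):
    """Return the building-block codes unique to each of two species."""
    a, b = set(species_a), set(species_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical, heavily simplified specifications of two bar-chart variants.
vertical_bar_chart = {"BAR", "ver"}
horizontal_bar_chart = {"BAR", "hor"}

only_vertical, only_horizontal = visdna_difference(
    vertical_bar_chart, horizontal_bar_chart
)
print(only_vertical, only_horizontal)  # prints ['ver'] ['hor']
```

In this reduced form, the vertical versus horizontal question raised above comes down to a single differing direction code, which leaves open whether one chooses to call the two variants different species.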

Visual components
A significant element is the primary unit of analysis in the scheme to be proposed here. […] I take the view that there seems to be little profit in using such items as an individual dot or line as a unit of analysis. If we are going to use linguistics as a model, then what is needed for present purposes is not the pictorial equivalent of a phoneme or morpheme but something closer to a noun phrase […] A significant element is, then, literally any single graphic element in a diagram which signifies something or which at least is capable of having some meaning. (3/13)
What were referred to in 1984 as 'significant elements' that make up visualizations, we now define as visual components. A visualization consists of one or more sets of visual components, of which at least one set is involved in one or more visual encodings. See Figure 5 for an example of a chart disaggregated into its visual components. The chart shows the development of products manufactured by a machine tool company. The small drawings of machines are visual components that are involved in three types of visual encodings: picturing, colour coding, and connecting with directed connector lines. A list of the different types of visual components (green DNA) can be found in Figure 6.

blocks BLO
A block may or may not be part of a grid structure, but it always has the shape of a grid cell in either a regular grid or – in the case of a curved block – in a polar grid. If the description for bars applies (see below), it is not a block. Variation: BLO*curved block.
Examples: cells in a heat map, outer bounding box of a tree map.

bars BAR
Bars use sizing of length, away from a fixed 'foot' and/or shared baseline (usually representing 'zero'), and all bars in a set do this in the same direction (ver, hor, rad, ang). Variation: BAR*100% bars, which are always equally sized and composed using proportional space-filling.

Composite visual components
Complex visualizations may be structured at different levels, with lower-level structures being embedded in higher-level structures (e.g., a time series of maps, drawings of animals embedded in an evolutionary tree, small pie charts on a map, etc.). Thus, visual components may be either basic visual components (most of which are commonly referred to as 'marks' in the data visualization community) or they may be composite visual components (last item at the bottom of Figure 6). We refer to a composite visual component as a visualization. Components at any level can be subject to visual encodings. This approach accommodates the analysis of complex embedded structures.
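The embedding of lower-level structures in higher-level ones can be sketched as a recursive data structure. The Python example below is an illustrative assumption rather than the VisDNA notation: the class name, the component labels and the `depth` measure are all hypothetical, and the example models the 'small pie charts on a map' case mentioned above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VisualComponent:
    """A visual component; it is composite when it contains other components."""
    name: str
    # Empty for a basic component (a 'mark'), non-empty for a composite one.
    parts: List["VisualComponent"] = field(default_factory=list)

    @property
    def is_composite(self) -> bool:
        return bool(self.parts)

    def depth(self) -> int:
        """Number of structural levels, counting this component as level 1."""
        return 1 + max((part.depth() for part in self.parts), default=0)


# Hypothetical labels: a pie chart embedded as a component of a map.
pie_chart = VisualComponent("pie chart", [VisualComponent("pie slice")])
pies_on_map = VisualComponent("map", [VisualComponent("map surface"), pie_chart])

print(pies_on_map.depth())  # prints 3: map, embedded pie chart, pie slice
```

Because every level is the same kind of object, an analysis that works on one component can in principle be applied at any level of embedding.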

Mode of visuospatial resemblance and mode of semantic correspondence
In addition to the mode of visual encoding, two other representational modes have been more or less retained for the VisDNA framework from the original Diagrammatics – the mode of visuospatial resemblance and the mode of semantic correspondence.

Mode of visuospatial resemblance
[…] the term schematization is used to denote the process of image reduction which leads to what may be thought of as a synopsis […] (7/10)
The mode of visuospatial resemblance applies to pictures and maps, covering projection methods, detail-revealing techniques and level of schematization. Projection methods include linear perspective, orthographic views, and cartographic projections. Detail-revealing techniques for showing otherwise occluded or difficult-to-see parts include cut-away views, exploded views, ghosted views and insets showing enlarged details (some of these are discussed in Richards, 2017). Schematization, also referred to as 'mode of depiction' in the original Diagrammatics, is "concerned with the degree of fidelity with which the image is rendered, that is, the extent to which it is barren of detail" (10/7). The degree of schematization ranges along a continuum from the mimetic to the schematic, from being visually or spatially realistic and detailed to being visually or spatially edited and synoptic – see Figures 7 and 8. Regarding picturing, the idea of a continuum from the mimetic to the schematic is illustrated by Scott McCloud (1993, p. 45) with a sequence of images running from a 'realistic' picture of a face to a very simplified one. In the case of mapping, a detailed relief map of a mountain range is an example of a relatively mimetic map, while a subway map is an example of a schematic map.

Mode of semantic correspondence
[…] it is proposed that the mode of correspondence may range from the literal, to the non-literal (3/32-33)
Picturing can be characterized by its mode of semantic correspondence, which deals with the type of relationship between 'what is pictured' and 'what is meant'. The mode of semantic correspondence may be literal or non-literal.
• In literal picturing, 'what is pictured' – the physical entity (or scene), existing or imagined – is 'what is meant'.
• In non-literal picturing, 'what is pictured' is not 'what is meant', but rather represents it through metaphor, metonymy or convention, for example.
This concept of semantic correspondence being literal or non-literal constitutes a further analogy between visualization and language (see section 3). Figure 8 shows that mode of semantic correspondence and mode of visuospatial resemblance can vary independently from each other. It shows literal and non-literal examples of both mimetic and schematic picturing.

Visual treatment
[…] rhetoric and associated modes of speech can in some cases have a visual counterpart […] what is represented can be subject to mediation by a process we might well describe as diagrammatic rhetoric. (6/36)
Through visual treatment the visual components and the visual configurations in a visualization may be manipulated to suggest additional nuances of meaning, or connotations, beyond what is conveyed by the visual encodings. The graphic designer Nigel Holmes has made statistical charts take on the appearance of something related to the topic, adding a further level of meaning. For example, a spiky graph of 'Monstrous Costs' is pictured as the teeth of a dragon (Holmes, 1984, p. 45). We may term this a case of 'graphical rhetoric'. Related to the idea of graphical rhetoric are inflections in meaning created by the illustrative style used to produce a visualization – giving it a 'mood' or 'tone of voice', e.g., 'whispering' versus 'shouting' its message. Within style we may also include the use of decoration and backgrounds. Clive Ashwin (1979) discusses style in illustration, offering a framework for its analysis.
In working with a taxonomy of diagram types there may be a tendency to design within common families and to overlook the possibilities of hybrid forms. (10/9)
All of the above, from 1984, still holds for our current VisDNA framework. The framework provides a tool for the analysis and specification of a comprehensive range of different types of visualizations in terms of specific combinations of VisDNA building blocks. Figure 9 details which visual encodings may be used to represent which types of information. For opening up further visual encoding options, information of one type may be transformed into another type – Figure 10 lists possible transformations. When creating a visualization, one may follow the process laid out in Figure 11.
Through this process the VisDNA framework offers a means of exploring a wide range of available options for visual encoding and composition. It may even support the generation of entirely novel visualization species. Because of its flexible structure, further VisDNA building blocks may be added to the framework to accommodate any additional visualization species that one may want to describe and that cannot be fully specified using the current scheme. Examples may be the addition of VisDNA building blocks for animation or interactivity in visualizations.
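The design process of Figure 11 can be summarized in a short Python sketch, using the ten step names and the division between steps 1 to 6 and steps 7 to 10 given with that figure. The function names are hypothetical, and treating 'the next step' as the only forward move is our assumption; the source states only that the designer may return to any previous step.

```python
# Step names as listed in Figure 11, in order (step 1 first).
STEPS = [
    "Types of information",
    "Possible information transformations",
    "Visual encodings",
    "Visual components",
    "Directions and layout principles",
    "Creating visualization species",
    "Visualization species selection",
    "Prototyping",
    "Evaluation",
    "Production",
]

LAST_VISDNA_STEP = 6  # steps 1 to 6 use the VisDNA framework


def phase(step: int) -> str:
    """Say which part of the process a step number (1 to 10) belongs to."""
    if not 1 <= step <= len(STEPS):
        raise ValueError(f"step must be between 1 and {len(STEPS)}")
    if step <= LAST_VISDNA_STEP:
        return "VisDNA framework"
    return "standard design process"


def allowed_next(step: int) -> list:
    """Steps reachable from `step`: any earlier step, or (assumed) the next one."""
    if not 1 <= step <= len(STEPS):
        raise ValueError(f"step must be between 1 and {len(STEPS)}")
    earlier = list(range(1, step))
    following = [step + 1] if step < len(STEPS) else []
    return earlier + following
```

For instance, `allowed_next(9)` lists every earlier step as well as step 10, reflecting the iterative character of the evaluation step.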

Auditing visualizations
A process of auditing diagrams is proposed which is aimed at isolating the fundamental modes of graphic organization available for certain classes of diagram. (0/8)
The Diagrammatics of 1984 introduced a method of analyzing diagrams, a process referred to as 'auditing'. We have taken this concept forward and devised a new method of analyzing visualizations using VisDNA. This approach uses 'specification trees' – an example is shown in Figure 12, which describes the diagram shown in Figure 5 (this diagram was analyzed in the original Diagrammatics, 9/13-9/19). VisDNA specification trees are constructed using rigorous rules of composition, and aligned with every specification tree is an equivalent description in an English language sentence – which may help when discussing visualization options.
The complete set of VisDNA grammar rules for creating specification trees is given in Richards and Engelhardt (forthcoming) – together with descriptions of layout principles and directions (only touched on here). Also see the VisDNA.com website.

Figure 10
Transforming information from one type to another opens up further visual encoding options.

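The idea of transforming information from one type into another can be made concrete with the example given in step 2 of the design process: pairs of locations (spatial location) transformed into distances (quantity). The Python sketch below is illustrative only. The coordinates and names are hypothetical, and straight-line distance in a plane is our assumption; for geographic coordinates a great-circle distance would be more appropriate.

```python
import math
from itertools import combinations

# Hypothetical planar coordinates for three locations.
locations = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 8.0)}


def pairwise_distances(points):
    """Transform locations into one distance (a quantity) per pair of locations."""
    return {
        (first, second): math.dist(points[first], points[second])
        for first, second in combinations(sorted(points), 2)
    }


distances = pairwise_distances(locations)
```

The resulting quantities can then be represented with any visual encoding suitable for quantities, such as the sizing of bar lengths, which was not an option for the original locations.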
The VisDNA building blocks and the way in which they can be combined, as exemplified by the specification trees, may offer the basis for a process of formalization and the potential for machine-readable specifications. This may serve as a basis for a software system that provides computer-generated visualization advice, which could be linked to a rendering engine in order to produce actual visualizations and variants of them.
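What a machine-readable specification might look like can be hinted at with a small Python sketch. It is speculative and does not implement the VisDNA grammar rules: the dictionary keys, the nesting of a direction code under a component code, and the function name are all placeholders chosen for illustration. It shows only that a tree of building-block codes can be written out as JSON, read back unchanged, and traversed by software.

```python
import json

# Hypothetical specification tree; keys and nesting are placeholders,
# not the structure prescribed by the VisDNA grammar rules.
spec_tree = {
    "code": "BAR",
    "children": [
        {"code": "ver", "children": []},
    ],
}


def collect_codes(node):
    """Return all building-block codes in a tree, in depth-first order."""
    codes = [node["code"]]
    for child in node["children"]:
        codes.extend(collect_codes(child))
    return codes


# Write the tree out as text and read it back in.
serialized = json.dumps(spec_tree, sort_keys=True)
restored = json.loads(serialized)
```

A plain-text form of this kind is what an advice system or a rendering engine, as envisaged above, would need as its input.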

1 Types of information
Identify the types of information to be visualized, listed in Figure 9.

2 Possible information transformations
Consider transformations from one type of information into another type of information. For example, pairs of locations (spatial location) can be transformed into distances (quantity). See Figure 10.

3 Visual encodings
Identify visual encodings that can be used to represent these types of information. See Figure 9 and Figure 3.

4 Visual components
Identify visual components that can be used to express these visual encodings. See Figure 6.

5 Directions and layout principles
Identify directions and layout principles that may be applied to these visual encodings and visual components. See VisDNA.com.

6 Creating visualization species
Having chosen all the VisDNA building blocks in steps 1 to 5 above, combine these into possible visualization species (i.e. types of visualizations). This can be an iterative process, sketched out by hand or otherwise created.

7 Visualization species selection
Select some of the most promising species, bearing in mind the intended audience, purpose and context of use.

8 Prototyping
Implement these with the information to be represented, while considering choices regarding mode of visuospatial resemblance, mode of semantic correspondence, reference elements (e.g. axis labels, grid marks, legends) and visual treatment (e.g. graphic style).

9 Evaluation
Evaluate these implementations, ideally including testing with a group of target users, identify aspects that may deserve further attention, and go back to previous steps accordingly, reconsidering choices made there.

10 Production
Select the preferred implementation and produce the final visualization(s).

The VisDNA framework can be used to create possible visualization species for implementation – steps 1 to 6. Steps 7 to 10, shown in grey, can be regarded as part of a standard design process. At any step the designer may return to any previous step for reconsideration, including to step 1.

Figure 11
A design process that can be followed to create visualizations.

Figure 12
The VisDNA specification tree for the chart of Miyano machine tools (also shown in disaggregated form in Figure 5). More example specification trees can be found in Richards and Engelhardt (forthcoming) and at VisDNA.com.

Conclusions
Much of the original theoretical basis of Diagrammatics, propounded in 1984, with its "grammatically-based analysis" (10/3), still holds good today, and has provided much of the foundation on which our newer VisDNA framework has been constructed. That earlier work has been extended by adding to its grammatical analogy the biological metaphor of DNA. This has introduced the scheme of colour-coded building blocks with three-letter codes, and the rules for their combination in representing various visualization species.
One of the goals of Diagrammatics was "to provide a more precise scheme of terminology than is customarily used by designers and design teachers [… and] those engaged in research into various issues related to communication through diagrams" (1/7). This has been addressed through the VisDNA vocabulary.
The work introduced here offers the designer a means to explore visualization options, as opposed to "working with a taxonomy of diagram types, which could be potentially restricting" (10/19). The DNA of visualization (VisDNA) goes beyond the 1984 work. With its precisely defined building blocks and rigorous grammatical combination rules, the VisDNA framework provides a system for undertaking a range of analytical activities, both in visualization design practice and in related visualization research.