Fotovis: user-centered development of a tool for visually browsing photographic collections

Fotovis is a tool for visualizing, exploring, and analyzing photographic collections. It is a partial result of a PhD thesis that investigated alternative, visual, and more generous browsing strategies — based on Digital Humanities — for digital tools. This paper discusses how Fotovis was conceived, developed (at the prototype level), and evaluated (at the interface level), through a user-centered approach and using a sample of photographs from the Moreira Salles Institute, an important Brazilian cultural entity. The goal is to highlight the definition of the users of Fotovis, and the design decisions that meet their exploratory and analytical needs.


Introduction
Fotovis is a web-based tool for visualizing, exploring, and analyzing digital photographic collections (DPC). It is designed for different user profiles and is partially the result of a doctoral thesis (Giannella, 2020) that investigated the visual exploration of cultural collections from the observation that there is a growing interest in framing collections as data. This framing "… seeks to foster an expanded set of research, pedagogical, and artistic potential predicated on the computational use of cultural heritage collections" (Padilla, 2018). Within this paradigm, singular issues related to Design emerge; for example, the design of informational spaces for DPC. Keyword search is a dominant search paradigm, but it is ungenerous (Whitelaw, 2015). Thus, this study demonstrates the concept design process for a visual browsing tool, based on data visualization techniques and interactive tasks that amplify exploration and analysis activities performed by consultants.
The central question that the thesis seeks to answer is: How and for whom does data visualization favor discovery and inquiring processes in DPC tools? To answer this, I needed to address other questions. Therefore, this paper will focus on this specific question: What are the needs and expectations of DPC users, and how can we design solutions to improve their exploration and analysis?
To answer these questions, I found an appropriate framework by combining concepts and methods from three theoretical and practical domains: Design, Computing, and Humanities. I subsequently approached Digital Humanities, a field that represents a heterogeneous set of studies and practices that aim to understand the implications that digital technologies have on research, especially those that deal with large-scale data in Human Sciences (Berry, 2012). Lunenfeld et al. (2012) state that the most significant change in Humanities research due to digital technologies is the reconsideration of the relationship between practice and theory, between "building" and "reflecting", between "doing things" and "writing about them". The "things" being done by digital humanists, whether software, coding, platforms, data visualizations, or user interfaces, are not only individualized research results but also new ways of investigating the field. Thus, Design contributes to the Digital Humanities through its user-centered approach; best practices in information design, interface, interaction, and navigation in digital environments; and the development of solutions to interpret data graphically.
Within this approach, the aim of this paper is to highlight the definition of users of a tool for visually browsing DPC, as well as the design decisions that meet their exploratory and analytical needs. I contribute with an understanding of the target audience for such a tool, and a reflection on the design space of a visualization-based browsing interface for DPC. The remainder of the paper is structured as follows: the Related Work section provides an overview of data visualization and user modelling approaches to digital cultural heritage (DCH), as well as existing tools for visualizing image collections; the Case Study section covers the methodological framework in which Fotovis was undertaken as a Digital Humanities project, with a focus on discussing the context and needs of prospective users of the tool as well as its design solution; and the User Tests section describes how Fotovis successfully offers discovery and interaction experiences with photographic collections and how it can be improved.

Related Work
In designing Fotovis, the interaction between Information Visualization, Digital Humanities, and DCH research resulted in the emergence of the two key topics detailed below.

Visualization of cultural heritage collection data
Visualization of cultural heritage collection data (VCHCD) can be understood as an emerging theoretical-practical approach to the development and discussion of visually rich web-based tools for enhancing access to cultural collections in order to support their scholarly analysis and casual appreciation.
In contrast to conventional tools for accessing cultural collections (usually centered on a keyword search), this alternative approach is based on concepts like the information flaneur (Dörk, Carpendale & Williamson, 2011), visual exploration (Dörk, 2012), and serendipity and generosity (Whitelaw, 2012, 2015). Windhager et al. (2019) reviewed the state of the art for VCHCD, and their examples cover a variety of DCH data, visualization techniques, and information activities. Another contribution of their work is the notion of visual granularity, which is a design conceptualization for the interface of digital tools, and is based on the observation that collections can be accessed, analyzed, and interpreted at four levels of aggregation: Collection Overviews (G1), Collection Overviews Utilizing Discrete Surrogates (G2), Multi-object Previews (G3), and Single Object Previews (G4).

User categories for DCH
The access to and discovery of DCH has been studied from the user perspective. Related work has resulted in a strategy that simplifies user profiles by creating user categories based on level of domain and/or technical expertise, with users commonly called experts or novices (Walsh, Clough & Foster, 2016; Windhager et al., 2019).
Experts include users with a professional or scientific interest related to DCH. An expert can be someone employed by a cultural heritage organization (e.g., curator or librarian), a trained scholar (e.g., historian), or another professional/specialist (e.g., a cultural producer or iconographic researcher). In-depth knowledge of the collection and of the technological structure in which it is organized allows experts/professionals to use relevant keywords and search filters that lead them to more accurate and satisfying results. VCHCD for expert users was the focus of Kräutli (2016).
The novices (also called lay or non-expert) category is more heterogeneous. These users are motivated by the need to satisfy personal curiosity and interest in an intellectually challenging environment. They fully comprehend neither the contents of the collection nor its indexing logic, so more effort in their queries is necessary in order to obtain meaningful results. VCHCD for novice users has been widely studied. Hinrichs, Schmidt & Carpendale (2008), Thudt, Hinrichs & Carpendale (2012), and Rogers, Hinrichs & Quigley (2014) discuss the casual use of information visualization in-situ (museum or library), through the user's interaction with physical devices (mobile or display installation). Whitelaw (2012) suggests that a user who is unfamiliar with the collection's scope, content, and structure will benefit the most from web-based visualization tools for cultural browsing. Dörk, Carpendale & Williamson (2011) discuss the "urban flaneur making sense of a city" metaphor to designate a new paradigm of information seeking on the web. Glinka, Meier & Dörk (2015) investigated interfaces to DCH that are, above all, more inviting to the general public.

Case Study
The review of related works enabled identification of design aspects of VCHCD that have not yet been fully explored, thus indicating the following three requirements (R) for this present case study: R1. DPC users: although the contrast between expert and novice is an adequate abstraction for framing user categories in the DCH context, these two stereotypes alone do not address the particular expectations of consultants interested in photographic collections. R2. Multiple visual granularities. R3. Multiple access points.

Methodological framework
Given these requirements, I proposed a case study to confront and reflect on the practical challenges of developing a digital tool for browsing photographs. My research identified an appropriate methodological framework for planning and design of digital products in the theoretical and practical context of Digital Humanities. Organized by the Digital Humanities Laboratory (DHLab) of Yale University Library, the Project Management & UX Design for Digital Scholarship guide (DHLab, 2018) was chosen and slightly adapted. Figure 1 summarizes the methodological framework, organized in phases and steps.
In the following sections, I will highlight key aspects that emerged from this framework, which helped me to understand the potential users (R1) of Fotovis and to make decisions for its design (R2/R3).

Understanding the attributes of the photographic collection
The Moreira Salles Institute (Instituto Moreira Salles, IMS) is an important Brazilian cultural entity. Its DCH is divided into four areas: photography, iconography (prints and drawings), music, and literature. The photographic collection of about 2 million images is distributed among 52 collections representing nineteenth- and twentieth-century Brazilian photography. Another particularity is the strong authorial nature of the collections; that is, the preference of the Institute to acquire, maintain, and disseminate the complete work of photographers.
The path from the arrival of a new photographic collection at IMS to its online availability for external consultation involves many technical processing steps. Due to the large volume of photographs and the lean team of professionals, the digitized collection becomes available in the image repository only gradually. Of the 2 million photographs, approximately 310,000 had been digitized at the time this study was conducted.
Besides the feasibility aspect, there is also the cataloging standard. IMS's policy satisfies certain criteria such as making available for external consultation only photographs with minimal essential and revised information (e.g., cataloging code, author, title, date, place, and process), while simultaneously satisfying other relevant metadata. The digital management system used is Cumulus, with an intranet version named Cumulus Client and a mirrored and reduced version (Cumulus Sites / Portal) for external access, in which a portion of the entire digitized collection (about 30,000 items) is available for online consultation. The digital catalog has about 45 fields for inserting metadata, 15 of which are most frequently reported.
From an evaluation of Cumulus's interface, I identified two main limitations: 1. the software has limited and non-visual query options; 2. the results are always displayed in a vertical grid, and there are few options for rearrangement and ordering. Separation of the results over multiple pages makes an overview difficult to envisage.

Understanding user needs
When designing a new tool, it is fundamental to understand user needs by considering their contexts and expectations. Many research techniques can be used for this; however, in this research, I used interview techniques (Cooper, 2017), personas (Cooper, 1998), and the user journey (Stickdorn & Schneider, 2014) to address R1. The interviews' objective was to collect qualitative data from potential users. Depth was prioritized over quantity: eight one-hour interviews were conducted. I developed two semi-structured interview scripts and selected three internal participants (IMS staff and curators) and five external participants (researchers and general visitors).
The two interview scripts had 22 open-ended questions (i.e., not simple yes/no questions), which encouraged respondents to report experiences and express expectations and frustrations in a more contextualized manner. The interview questions and their interpretation were structured into four dimensions:
1. Usage profile: aims to recognize the interviewee's domain of knowledge and motivations for browsing DPC. This led to the decision to build a tool that supports both straightforward tasks (e.g., keyword search) and overview and orientation tasks, the latter oriented to a more curious and serendipitous information-seeking paradigm.
2. Consultation practices: aims to understand the interviewee's perception of IMS's current management system (Cumulus), considering the situations in which they consult DPC, and the differences between physical and digitized collections. This highlighted the importance of presenting query results in a visually rich configuration that preserves the photographs' metadata (e.g., time, location, physical dimension, etc.).
3. Consultation techniques: aims to understand the interviewee's perception of the browsing, interaction, and visualization techniques offered in Cumulus. This helped in determining that the less familiar a user is with the content and the cataloging logic of a collection, the more that user requires a tool with a consistent and trustworthy interface. Even users who were more familiar with the collection recognized that the visualization options could be improved, both in the collection overview and in the detailed view of a specific photograph.
4. Opportunities: aims to identify usage opportunities yet to be explored. This led to the decision to design a customization environment in Fotovis, which was later named Light Table.
Evaluation of the interviews made it possible to identify response patterns and create four main personas to represent Fotovis's target audience, described using two axes: knowledge domain of the photographic collection (high or low), and reason for using the tool (specific or exploratory). Figure 2 maps the personas along the axes and details their profession, main attributions, and reason for using Fotovis.
For each of the personas, a user journey was developed to map their use of Fotovis (Figures 3-6).

Fotovis concept design
The conceptual model of Fotovis is based on the visualization reference model developed by Card (2003) that has three main steps: Data Transformations; Visual Mappings; and View Transformations (Figure 7).

Data Transformations
The model begins with "Data Transformations", a stage in which the "Raw Data" are collected, assembled, and structured in "Data Tables". In this research, this step comprised obtaining the photographs' metadata from IMS in the form of an XML file, followed by all the procedures for cleaning and manipulating that data until they were converted into a final data table in a Google Sheets spreadsheet.
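As an illustration of this step, the following is a minimal sketch of turning an XML metadata export into a flat table. The element names (`record`, `author`, `date`, `process`), the sample values, and the helper names are all hypothetical; the actual IMS export schema and the cleaning procedures used in the study are not described here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML export; element names and values are assumptions.
SAMPLE_XML = """
<records>
  <record id="P001">
    <author>  Author A </author>
    <date>1885</date>
    <process>Gelatin/Silver</process>
  </record>
  <record id="P002">
    <author>Author B</author>
    <date></date>
    <process>Collodion/Silver</process>
  </record>
</records>
"""

FIELDS = ["author", "date", "process"]  # assumed subset of the metadata fields


def xml_to_rows(xml_text):
    """Parse the XML and return one dict per photograph, with cleaned values."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.iter("record"):
        row = {"id": record.get("id")}
        for field in FIELDS:
            # Trim whitespace and represent empty fields as missing (None).
            cleaned = record.findtext(field, default="").strip()
            row[field] = cleaned or None
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, ready to import into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id"] + FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = xml_to_rows(SAMPLE_XML)
csv_text = rows_to_csv(rows)
```

Keeping missing values explicit (here as `None`) matters later on, since unindexed records are reported separately in the visualizations rather than silently dropped.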
The final data table consists of 1124 rows and 17 columns. The first column identifies the photograph via registration ID, while the other 16 columns correspond to the variables/attributes and are grouped into seven main access points (A): Authorship (A1), Creation Date (A2), Location (A3), Photographic Processes (A4), Support (A5), Physical Dimension (A6), and Formal Aspects (A7).
Another important result of this step is the identification of the data profile; that is, the understanding of the structure, size, and length/range of the variables. For example, Photographic Processes is a categorical variable that has a length of 10; that is, it has 10 possible values (Albumin/Silver, Digital file, Autochrome/Dye and silver, Collodion/Silver, Collotype/Pigment, Gelatin/Dye, Gelatin/Silver, Digital printing, Planotype/Platinum, or null). On the other hand, the height is a quantitative variable and has an interval that varies between 2.4 cm (record with the lowest height) and 52 cm (record with the highest height).
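The idea of a data profile can be sketched in a few lines. The function names and the sample lists below are hypothetical and only illustrate the two kinds of summary mentioned above: the set of possible values of a categorical variable (with missing values counted as "null") and the range of a quantitative one.

```python
def profile_categorical(values):
    """Return the sorted distinct values, counting missing entries as 'null'."""
    return sorted({"null" if v is None else v for v in values})


def profile_quantitative(values):
    """Return (minimum, maximum) over the values that are present."""
    present = [v for v in values if v is not None]
    return (min(present), max(present))


# Hypothetical sample columns; only the 2.4 cm and 52 cm extremes come from the text.
processes = ["Gelatin/Silver", "Collodion/Silver", None, "Gelatin/Silver"]
heights_cm = [2.4, 18.0, None, 52.0, 24.5]

process_values = profile_categorical(processes)
height_range = profile_quantitative(heights_cm)
```

The length of a categorical variable is then simply `len(process_values)`, which is the figure that informs the choice of visual structure in the next stage.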

Visual Mappings
The "Visual Mappings" stage is the most important of the visualization model: it is here that "Data Tables" are translated into "Visual Structures", making visible the underlying relationships of data. From understanding the principles that rule visual encoding (spatial substrate, markers, and visual variables) and continuing with the data profile mentioned above, it was possible to define visual structures for each variable and define a final visual schema for each access point in the Collection Overviews (G1) level: stacked bar chart for Creation Date; treemap for Authorship, Photographic Processes, and Support; scatter plot for Physical Dimension; alluvial diagram for Formal Aspects; and bubble map for Location.
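The visual schema described above can be written down as a simple lookup table. This is only a sketch of how the mapping might be represented in code; the dictionary and function names are hypothetical, while the pairings of access point and chart type are the ones listed in the text.

```python
# Access point -> visual structure at the Collection Overviews (G1) level.
VISUAL_SCHEMA = {
    "Authorship": "treemap",
    "Creation Date": "stacked bar chart",
    "Location": "bubble map",
    "Photographic Processes": "treemap",
    "Support": "treemap",
    "Physical Dimension": "scatter plot",
    "Formal Aspects": "alluvial diagram",
}


def chart_for(access_point):
    """Return the chart type used for a given access point."""
    return VISUAL_SCHEMA[access_point]


def access_points_for(chart_type):
    """Return, in alphabetical order, the access points that share a chart type."""
    return sorted(ap for ap, chart in VISUAL_SCHEMA.items() if chart == chart_type)


treemap_points = access_points_for("treemap")
```

Representing the schema as data rather than as branching logic makes it easy to revise a single mapping after user testing without touching the rest of the interface code.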

View Transformations
"View Transformations" are intrinsically linked to interaction and the new meanings this brings to visualizations. In Fotovis, interaction plays a decisive role, by offering the user resources to visually explore DPC using Multiple Visual Granularities (R2) and Multiple Access Points (R3).
Fotovis offers five levels of visual granularity, one more than what was proposed by Windhager et al. (2019). The new level, named Single Object Preview with Segmentation (G5), corresponds to the act of looking inside a photograph and the consequent identification of objects and other properties contained in it. I borrowed the term "segmented" from Computer Vision literature, in which it refers to the process of dividing a digital image into multiple regions or objects. Figure 8 shows how R2 and R3 are conceptually orchestrated in the informational space of Fotovis. It is interesting that, while Multiple Visual Granularities leads to vertical movement in the interface, Multiple Access Points allows for horizontal exploration.
To interact with Fotovis and explore it at its different levels of visual granularity, users perform information activities. Based on Windhager et al. (2019), I distinguished and characterized seven types of information activities that support consultation and exploratory tasks in Fotovis. These information activities result in integrated tasks (T); see Figures 9-16.
Object Search corresponds to the act of "finding one or more relevant objects in an otherwise irrelevant information space". In Fotovis, this is achieved by interacting with the Keyword Search (T1) and the Faceted Visualizations, which will be explained below.
Overview and Orientation deals with "conceptual abstractions or discrete object surrogates to visualize collections at a macro level". The purpose of this activity is to ensure that users orient themselves in the informational space according to various metadata dimensions, and analyze distributions, relationships, patterns, and trends in the DPC. In Fotovis, this is achieved by interacting with the following: the Counter (T2), which reports the number of items displayed on the screen; the Main Visualization (T3), which encodes collection data in any of the available access points (A1-A6); and the Faceted Visualizations (T4) (Dörk, 2012), which are facets in the form of bar charts, line charts, and histograms that filter the main visualization while also showing distribution patterns at a glance.
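The interplay between the Faceted Visualizations and the Counter can be sketched as follows. This is a hedged illustration only: Fotovis is a click-through prototype, so the record structure, field names, and function names here are assumptions rather than the tool's implementation.

```python
from collections import Counter


def apply_facets(records, selections):
    """Keep only records whose values match every active facet selection.

    `selections` maps a field name to the set of values the user has picked.
    """
    return [
        r for r in records
        if all(r.get(field) in allowed for field, allowed in selections.items())
    ]


def facet_distribution(records, field):
    """Count records per value of `field`, as a bar-chart facet would display."""
    return Counter(r.get(field) for r in records)


# Hypothetical records with two assumed metadata fields.
records = [
    {"id": "P001", "process": "Gelatin/Silver", "support": "Paper"},
    {"id": "P002", "process": "Gelatin/Silver", "support": "Glass"},
    {"id": "P003", "process": "Collodion/Silver", "support": "Glass"},
]

filtered = apply_facets(records, {"support": {"Glass"}})
counter_value = len(filtered)  # what a Counter-like component would report
distribution = facet_distribution(filtered, "process")
```

Recomputing the distributions on the filtered set is what lets each facet act as both a filter and an at-a-glance summary of the remaining items.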
Vertical Immersion or Abstraction supports vertical movements of immersion (zoom in) or abstraction (zoom out) along the overview-detail axis. In Fotovis, this is achieved by interacting with the Granularity Control (T5), which allows the user to move along the visual granularity levels (G1-G5), and the Zoomable Interface (T6).
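One way to think about vertical movement is as stepping through an ordered set of levels. The sketch below models the five granularity levels as an ordered enumeration with clamped zoom operations; the class, member, and function names are hypothetical, and the code is not taken from the prototype.

```python
from enum import IntEnum


class Granularity(IntEnum):
    """The five visual granularity levels, ordered from overview to detail."""
    COLLECTION_OVERVIEW = 1          # G1
    OVERVIEW_WITH_SURROGATES = 2     # G2
    MULTI_OBJECT_PREVIEW = 3         # G3
    SINGLE_OBJECT_PREVIEW = 4        # G4
    SINGLE_OBJECT_SEGMENTATION = 5   # G5


def zoom_in(level):
    """Move one level toward detail (immersion), stopping at G5."""
    return Granularity(min(level + 1, Granularity.SINGLE_OBJECT_SEGMENTATION))


def zoom_out(level):
    """Move one level toward overview (abstraction), stopping at G1."""
    return Granularity(max(level - 1, Granularity.COLLECTION_OVERVIEW))


current = zoom_in(Granularity.MULTI_OBJECT_PREVIEW)
```

Clamping at both ends keeps repeated zoom gestures from leaving the valid range, which is one simple way a granularity control could behave at its limits.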
Accessing Object Details provides users with a more detailed and in-depth experience. At the overview levels (G1-G3) in Fotovis, this is achieved by interacting with the following: the Tooltip (T7), which reveals details on demand; the Visualization Detail Control (T8), which adds a categorical variable to the Main Visualization; and the Missing Data Inclusion Control (T9), which adds to the Main Visualization (in a separate area) the number of records that were not indexed to the dimension being mapped and, therefore, could not be plotted in the visualization. For the Single Object Preview (G4), the following are worth mentioning: the Catalog Sheet (T10); the Photograph Controls (T11), which let the image be manipulated, an error be reported, and/or use of the image be requested; the Situated View (T12), which shows on a map the possible location where the photograph was taken; and the Photography Usage History (T13), which details the history of the photograph's use in exhibitions and/or IMS publications. For the Single Object Preview with Segmentation (G5), the following are worth mentioning: Objects (T14) and Tags (T15), which, respectively, present the objects and tags automatically detected and classified by pre-trained machine learning models; and Dominant Colors (T16), which reports the dominant colors extracted from the photograph in a color distribution graph.
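The text does not say how the dominant colors are extracted. As one possible illustration, the sketch below uses simple color quantization: each RGB pixel is snapped to the center of a coarse bucket and the most frequent buckets are reported with their share of the image. The function name, the bucket size, and the pixel data are all assumptions, and this is a swapped-in technique rather than the method used in Fotovis.

```python
from collections import Counter


def dominant_colors(pixels, bucket=64, top=3):
    """Return the `top` most frequent quantized colors with their share of pixels.

    Each channel value (0-255) is snapped to the center of a `bucket`-wide bin,
    so that similar shades are counted together.
    """
    quantized = Counter(
        tuple((channel // bucket) * bucket + bucket // 2 for channel in pixel)
        for pixel in pixels
    )
    total = sum(quantized.values())
    return [(color, count / total) for color, count in quantized.most_common(top)]


# Hypothetical pixel data: mostly red, with two slightly different blues.
pixels = [(250, 10, 10)] * 6 + [(10, 10, 250)] * 3 + [(12, 8, 240)]

palette = dominant_colors(pixels)
```

The resulting list of color and proportion pairs is the kind of data a color distribution graph could be drawn from; more refined approaches, such as clustering in a perceptual color space, would follow the same input and output shape.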

Horizontal Exploration includes various open-ended, lateral movements. In Fotovis, this is achieved by interacting with the following: the Dimension Control (T17), which allows users to switch the access point of view; the Slideshow (T18); and the Similar Images (T19), which displays photographs visually similar to the one selected.
Curated Paths corresponds to a specific horizontal functionality that can be achieved by author-driven navigation routes. In Fotovis, it is achieved via the Tutorial (T20).
Other Activities corresponds to the tasks performed in the Light Table environment, which gathers a set of tasks (T21) that allow the user to select, juxtapose, rotate, scale, edit, and save a personalized set of photographs on a personalized canvas.
Finally, the Fotovis prototype assumes the form of a medium-fidelity click-through model. This means a scale prototype with a simplified layout that can be partially navigated by users for validation purposes. The complete set of screens and a walk-through simulation video can be accessed at https://www.juliagiannella.com.br/fotovis/.

User Tests
User tests help validate the success of a tool and identify aspects for improvement. In this work, I conducted this step using thinking aloud, a technique introduced in the field of usability by Lewis (1982) and later framed within a broader study in task-centric user interface design (Lewis & Rieman, 1993). The basic idea of thinking aloud is to have participants perform tasks at the interface of a tool/system while verbalizing (aloud) what they are seeing, doing, and thinking. The authors establish three necessary elements for conducting the method: users, tasks to be performed, and a version of the product to be tested.
The tests were conducted with four participants, with each participant representing one of the personas of Fotovis's target audience.
From the tests, it was possible to quantitatively evaluate how the users performed the tasks, as well as to gather their impressions and determine their levels of understanding of Fotovis and satisfaction with it. The evaluation was graded on the following three-level scale:
- Evaluation 1 (green): clearly accomplished the task;
- Evaluation 2 (orange): partially fulfilled the task and/or had doubts;
- Evaluation 3 (red): failed to complete the task.
The evaluation is based on two criteria: user profile and informational activity class. Regarding the first criterion (Figure 17), the better performance of Users 2 and 4 indicates that Fotovis is more suitable for users who do not have a specific objective and want to explore and perform analytical tasks in collections. Regarding the second criterion (Figure 18), in general, all classes of informational activity were performed with a satisfactory level of clarity and assertiveness.
However, closer observation revealed specific usability problems in the Fotovis interface; for example, regarding the labeling system, the interaction with the Granularity Control (T5), and the components present in the Light Table environment (T21). A second iteration of the Fotovis design could mitigate these problems.
Besides completing tasks, participants also commented and assigned a score from zero to five for their level of satisfaction with the use and perception of Fotovis, considering three parameters: interface, interaction, and relevance (Figure 19). The average satisfaction for the interface and interaction was 4 and 3.63, respectively, while the relevance question returned a score of 4.5. The interface was rated highest by Users 2 and 4, who made the following comments:
[The tool] seemed clear to me. I could generally identify that there were filters [on overview granularities] and contextual information on the left [on preview granularities]. What I didn't find so consistent is that there is also contextual information [in the case of the catalog sheet] on the right side of the screen (User 2).
I thought it was very good that you had the filters and several visualizations at the same time so that you could understand your search with several data being presented simultaneously (User 4).
User 3 justified his score of 3.5 for the Fotovis interface by the way the components are visualized and described. For him, the use of certain terminology made it unfriendly:
What made it more difficult for me was the use of vocabulary that I'm unfamiliar with [...] it wasn't clear what the names of the elements mean and where they take you (User 3).
Finally, User 1 evaluated the relevance of Fotovis with some reservations:
I think [the system] is relevant for specific uses [...], for a specific project that I need to separate all photographs of a certain format, filtered by aspect n. In these cases I would use it, but not for everyday use. It seemed to me to be one more system besides the one we already use [Cumulus], a bit more visual and, therefore, more intuitive [...], but I don't know if I would replace it (User 1).

Conclusion and Future Work
This paper detailed user studies and the concept design process of Fotovis, a tool for visually browsing photographic collections. This tool is part of broader research concerning VCHCD and DCH.
The empirical results emerging from the case study presented herein offer contributions to the areas of Digital Humanities, Information Visualization, and DCH. Three contributions are summarized below:
- Through a user-centered design approach, this investigation enabled a better understanding of the target audience interested in visually exploring and analyzing DPC. The interviews and their subsequent evaluation led to mapping four personas described along two axes: knowledge domain of the photographic collection (high or low), and reason for using the tool (specific or exploratory). For each of these personas, a user journey was developed. As far as I know, this is the only research to have systematically modeled user categories for DPC and to propose user journeys for each of them, while navigating visualization-based tools for digital photographs.
- When designing and prototyping the Fotovis interface, it was possible to instantiate the requirements (DPC users, Multiple visual granularities, and Multiple access points) in a design solution that responds to the complexity of photographic collections and supports specific tasks for this kind of cultural artifact. As the prototype is powered by a sample of real data and is built following a user-centered approach, it reflects upon and exposes the potential and the challenges of visualization-based tools for browsing digital photographs.
- User tests provided third-party perceptions of Fotovis, by offering an approach to validate the usage value of the tool and the effectiveness of the proposed informational activities. Fotovis demonstrates benefits for users at different levels.
For users with high domain knowledge and a specific use objective (User 1), Fotovis sought to offer a digital environment to cluster and examine photographs during the curation process. For users with high domain knowledge but an objective of exploratory use (User 2), Fotovis fulfilled the role of highlighting the practices and attitudes of classification, cataloging, and insertion of metadata performed by the IMS's professionals. For users with little domain knowledge but a specific use objective (User 3), Fotovis demonstrated that different information-seeking tasks can be performed complementarily in the same tool. For users with little knowledge and an exploratory goal (User 4), Fotovis offers an open, flexible, and high-level approach to interact with the collection and enable discoveries.
Finally, since Fotovis is a medium-fidelity prototype, it would be opportune to go through a second iteration of its design, taking into account the test evaluations. With the main design problems solved, future work can focus on the implementation of Fotovis. Thus, it would be appropriate to delve deeper into technical implications by addressing design and computational aspects.