Development of measurement instrument for visual qualities of graphical user interface elements (VISQUAL): a test in the context of mobile game icons
https://doi.org/10.1007/s11257-020-09263-7
Henrietta Jylhä1 · Juho Hamari1
Received: 20 February 2019 / Accepted in revised form: 28 March 2020 / Published online: 17 May 2020
© The Author(s) 2020
Abstract
Graphical user interfaces are ubiquitous in everyday human–computer interaction, most prominently in computers and smartphones. Today, various actions are performed via graphical user interface elements, e.g., windows, menus and icons. An attractive user interface that adapts to user needs and preferences is increasingly important, as it often allows personalized information processing that facilitates interaction. However, practitioners and scholars have lacked an instrument
for measuring user perception of aesthetics within graphical user interface elements
to aid in creating successful graphical assets. Therefore, we studied dimensionality
of ratings of different perceived aesthetic qualities in GUI elements as the founda-
tion for the measurement instrument. First, we devised a semantic differential scale
of 22 adjective pairs by combining prior scattered measures. We then conducted
a vignette experiment with random participant (n = 569) assignment to evaluate 4 icons from a pool of 68 pre-selected game app icons across 4 categories (concrete, abstract, character and text) using the semantic scales. This resulted in a total of
2276 individual icon evaluations. Through exploratory factor analyses, the obser-
vations converged into 5 dimensions of perceived visual quality: Excellence/Infe-
riority, Graciousness/Harshness, Idleness/Liveliness, Normalness/Bizarreness and
Complexity/Simplicity. We then proceeded to conduct confirmatory factor analyses
to test the model fit of the 5-factor model with all 22 adjective pairs as well as with
an adjusted version of 15 adjective pairs. Overall, this study developed, validated,
and consequently presents a measurement instrument for perceptions of visual quali-
ties of graphical user interfaces and/or singular interface elements (VISQUAL) that
can be used in multiple ways in several contexts related to visual human–computer interaction, interfaces and their adaptation.
* Henrietta Jylhä
henrietta.jylha@tuni.fi
Extended author information available on the last page of the article
1 Introduction
1 Linux Information Project, “GUI Definition,” http://www.linfo.org/gui.html (accessed October 23, 2018).
2 Android Developers, “Iconography,” http://www.androiddocs.com/design/style/iconography.html (accessed October 15, 2018).
Salimun et al. 2010; Sarsam and Al-Samarraie 2018; Tractinsky 1997; Tractinsky
et al. 2000) as well as sense of pleasure and trust (Cyr et al. 2006; Jordan 1998;
Zen and Vanderdonckt 2016). A positive user experience is essential for successful human–computer interaction, as a user quickly abandons an interface that is associated with negative experiences. As the user experience is increasingly tied to adaptive visual aesthetics, further research on graphical user interface elements is needed. Perceptions of successful (i.e., appealing) visual aesthetics
are subjective (Zen and Vanderdonckt 2016), which complicates creating engaging
user experiences for critical masses. Theories and tools have been proposed to assess
and design appropriate graphical user interfaces (e.g., Choi and Lee 2012; Hassen-
zahl et al. 2003; Ngo et al. 2000; Ngo 2001; Ngo et al. 2003; Zen and Vanderdonckt
2016), yet no consensus exists on a consistent method to guide producing success-
ful user interface elements considering the subjective experience. In the pursuit of
investigating what aesthetic features appear together in graphical icons, we attempt
to address this gap by developing an instrument that measures graphical user inter-
face elements via individual user perceptions.
First, we devised a semantic differential scale of 22 adjective pairs. We then conducted a survey-based vignette study with random participant (n = 569) assignment to evaluate 4 icons from a pool of 68 pre-selected game app icons across 4 categories (concrete, abstract, character and text) using the semantic scales. Game app
icons were used for validity and comparability in the results. This resulted in a total
of 2276 individual icon evaluations. The large-scale quantitative data were analyzed
in several ways. Firstly, we examined factor loadings of the perceived visual quali-
ties with exploratory factor analysis (EFA). Secondly, we performed confirmatory
factor analyses (CFA) to test whether the proposed theory could be applied to simi-
lar latent constructs. Although further validation is required, the results show prom-
ise. Based on these studies, we compose VISQUAL, an instrument for measuring
individual user perceptions of visual qualities of graphical user interface elements,
which can be used for research into adaptive user interfaces. Therefore, this study offers theoretical and practical guidelines for the design process of personalized graphical user interface elements, analyzed via 5 dimensions: Excellence/Infe-
riority, Graciousness/Harshness, Idleness/Liveliness, Normalness/Bizarreness and
Complexity/Simplicity.
Graphical user interface design has experienced tremendous change during the past
decades due to technological evolution. An increasingly diverse range of devices has adopted interfaces that adapt according to device characteristics and user pref-
erences. An adaptive user interface (AUI) is defined as a system that changes its
structure and elements depending on the context of the user (Schneider-Hufschmidt
et al. 1993); hence, the UI has to be flexible to satisfy various needs. User interface adaptation consists of modifying parts of or the whole UI. User modeling algorithms at
the software level provide the personalization concept, while GUIs display the con-
tent, expressing personalization from the user’s perspective (Alvarez-Cortes et al.
2009). For example, UI elements are expected to scale automatically with screen
size and hide unwanted menu elements. Adaptation can be divided into two cate-
gories depending on the end user: adaptability and adaptivity. Adaptability means
the user’s ability to adapt the UI, and adaptivity means the system’s ability to adapt
the UI. When users communicate with interfaces, both the human and the machine
collaborate toward adaptation, i.e., mixed initiative adaptation (Bouzit et al. 2017).
Adaptiveness in interfaces has been widely studied in terms of user performance
(Gajos et al. 2006), preference (Cockburn et al. 2007) and satisfaction (Gajos et al.
2006), as well as improving task efficiency and learning curve (Lavie and Meyer
2010).
The most important advantage of AUIs is argued to be the total control that the user has over UI appearance, although this is at the same time considered a shortcoming for users with a lower level of technology experience and skill (Gullà et al. 2015).
Adaptive user interfaces may in many cases result in undesired or unpredictable interface behavior because of the challenges in specifying the design for a wide variety of users, which in some cases leads to users not accepting the UI (Alvarez-Cortes et al. 2009; Bouzit et al. 2017; Gajos et al. 2006). Moreover, prior research
(Gajos et al. 2006) has shown that purely mechanical properties of an adaptive inter-
face lead to poor user performance and satisfaction. Therefore, understanding user
preferences and perceptions is essential in creating interfaces, and it is necessary to
assess these in early stages of the design process to effectively identify different user
profiles (Gullà et al. 2015). Due to the rapid changes in UI design, new adaptation techniques and systematic methods are needed in which design decisions are guided by appropriate parameters concerning users and contexts.
A distinction has been made between two types of aesthetics within human–com-
puter interaction, namely classical and expressive aesthetics (Hartmann et al. 2008).
Classical aesthetics refers to orderly and clear designs, whereas expressive aesthetics refers to creative and original designs. Classical aesthetics seems to be perceived more evenly by users, while expressive aesthetics shows more dispersion depending on contextual stimuli (Mahlke and Thüring 2007). Several attempts have been made to measure the aesthetic value of graphical user interfaces objectively, using geometry- and image-related metrics, e.g., balance, equilibrium, symmetry and sequence, as well as color contrast and saturation, in order to avoid human involvement in
the process (Maity et al. 2015, 2016; Ngo et al. 2000, 2001, 2003; Vanderdonckt
and Gillo 1994; Zen and Vanderdonckt 2014, 2016). These visual techniques in the
arrangement of layout components can be divided into physical techniques, composition techniques, association and dissociation techniques, ordering techniques, as well as photographic techniques (Vanderdonckt and Gillo 1994). Furthermore, balance is defined as a centered layout where components are equally weighted. Equilibrium is defined as equal balance between opposing forces. Symmetry is defined
not necessarily reveal which elements in the user interface are successful. Layout
designs vary, which may cause difficulties in generalization. This can be regarded as
a shortcoming of the empirical measurements as inclusivity may prevent calculat-
ing genuine values of user interfaces. A prior study (Vanderdonckt and Gillo 1994) attempting to automate the calculation of visual techniques for single interface components found that some techniques, such as physical techniques, could be measured, while others, such as photographic techniques, appeared more challenging to measure. We note that contextual factors surrounding single GUI components are
important in affecting user perceptions, thus evaluating GUI elements separately
may in some cases prove challenging. Moreover, the application of principles heav-
ily depends on visual aims, and hence, further comparison between measurement
instruments is needed in order to explore the relationship between single compo-
nents and their context.
In order to address these gaps, and rather than experimenting with a graphical user interface as a single piece, we scaled the validation of VISQUAL down to single interface components, i.e., icons. Icons are pictographic symbols within a computer
system, applied principally to graphical user interfaces (Gittins 1986) that have
replaced text-based commands as the means to communicate with users (García
et al. 1994; Gittins 1986; McDougall et al. 1998; Huang et al. 2002). This is because
icons are easy to process (Horton 1994, 1996; Lin and Yeh 2010; McDougall et al.
1999; Wiedenbeck 1999) and convenient for universal communication (Arend et al.
1987; Horton 1994, 1996; Lodding 1983; McDougall et al. 1999).
Prior research has found that attractiveness leads to better ratings of interfaces
primarily due to the use of graphic elements, such as icons (Roberts et al. 2003).
Icons are one main component of GUI design, and results show that attractive and
appropriately designed icons increase consumer interest and interaction within
online storefront interfaces, such as app stores (Burgers et al. 2016; Chen 2015;
Hou and Ho 2013; Jylhä and Hamari 2019; Lin and Chen 2018; Lin and Yeh 2010;
Salman et al. 2010, 2012; Shu and Lin 2014; Wang and Li 2017). While icons do not constitute a graphical user interface on their own, an icon-based GUI is a highly common presentation in best-selling devices at present. This justifies using icons as
study material for evaluating visual qualities of graphical user interface elements.
Hence, VISQUAL was validated by experimenting on user interface icons.
Prior studies have introduced different methods to measure the aesthetics of
graphical user interfaces during the past decades. Please refer to Table 1 for a sum-
mary list of instruments.
Metric-based instruments include multi-screen interface assessment with formu-
lated aesthetic measures and visual techniques (Ngo et al. 2000, 2001; Vanderdonckt
and Gillo 1994), semi-automated computation of user interfaces with the online tool
QUESTIM (Zen and Vanderdonckt 2016) as well as predictive computation of on-
screen image and typeface aesthetics (Maity et al. 2015, 2016). Survey-based instru-
ments include a semantic differential scale measuring hedonic and pragmatic quali-
ties of interface appeal (Hassenzahl et al. 2003) and a scale measuring perceived
simplicity of user interfaces in relation to visual aesthetics (Choi and Lee 2012).
Semantic differential is a commonly used tool for measuring connotative
meanings of concepts. Similar to AttrakDiff 2 (Hassenzahl et al. 2003), semantic
Table 1 Measurements for graphical user interface aesthetics

Measure: Aesthetic measures for assessing graphic screens
Construct: Multi-screen interface assessment (metric-based)
Description: Aesthetic measures of (1) balance, (2) equilibrium, (3) symmetry, (4) sequence, (5) order, and (6) complexity
Original paper: Ngo et al. (2000)

Measure: Aesthetic measures for assessing graphic screens (extended)
Construct: Multi-screen interface assessment (metric-based)
Description: Aesthetic measures of (1) balance, (2) equilibrium, (3) symmetry, (4) sequence, (5) cohesion, (6) unity, (7) proportion, (8) simplicity, (9) density, (10) regularity, (11) economy, (12) homogeneity, and (13) rhythm
Original paper: Ngo (2001)

Measure: Visual techniques for traditional and multimedia layouts
Construct: Computation of visual techniques (metric-based)
Description: Five sets of visual techniques measuring (1) physical techniques, (2) composition techniques, (3) association and dissociation techniques, (4) ordering techniques, and (5) photographic techniques
Original paper: Vanderdonckt and Gillo (1994)

Measure: Quality estimator using metrics (QUESTIM)
Construct: Computation of aesthetic user interface metrics (metric-based, online software)
Description: Semi-automated computation of (1) balance, (2) density, (3) alignment, (4) con-
Original paper: Zen and Vanderdonckt (2014, 2016)

Measure: AttrakDiff 2
Construct: Hedonic and pragmatic evaluation of interface appeal (survey-based, online software)
Description: Seven-point semantic differential scale of 21 items measuring (1) hedonic quality–identification, (2) hedonic quality–stimulation, and (3) pragmatic quality. Accessible as online software: attrakdiff.de/index-en.html
Original paper: Hassenzahl et al. (2003)

Measure: Scale of simplicity
Construct: Simplicity perception of interfaces (survey-based)
Description: Seven-point scale measuring six components: (1) reduction, (2) organization, (3) component complexity, (4) coordinative complexity, (5) dynamic complexity, and (6) visual aesthetics
Original paper: Choi and Lee (2012)
3 Methods and data
As a foundation for this study, a semantic differential scale of 22 adjective pairs was
employed to measure visual qualities of graphical user interface elements. We con-
ducted a within-subjects vignette study with random participant (n = 569) assignment to evaluate 4 icons from a pool of 68 pre-selected game app icons across 4 categories (concrete, abstract, character and text) using the semantic scales. Game
app icons were used for validity and comparability in the results. This resulted in a
total of 2276 individual icon evaluations. The following describes the participants in
the study.
3.1 Participants
3.2 Measure development
Table 2 Demographic information (n, %)

Age (Mean = 26.90, SD = 7.24, Median = 25.00): –20: 60 (10.54); 21–25: 249 (43.76); 26–30: 145 (25.48); 31–35: 45 (7.91); 36–40: 37 (6.50); 41–45: 16 (2.81); 46–50: 7 (1.23); 51–55: 5 (0.88); 56–60: 3 (0.53); 60–: 2 (0.35)

Education: Less than high school: 5 (0.9); High school: 135 (23.7); College: 95 (16.7); Bachelor’s degree: 227 (39.9); Master’s degree: 98 (17.2); Higher than master’s degree: 9 (1.6)

Employment: Working full-time: 133 (23.4); Working part-time: 62 (10.9); Student: 351 (61.7); Unemployed: 11 (1.9); Retired: 1 (0.2)

Gender: Male: 297 (52.2); Female: 257 (45.2); Other: 15 (2.6)

Yearly income: Less than $19,999: 330 (58.0); $20,000 to $39,999: 105 (18.5); $40,000 to $59,999: 57 (10.0); $60,000 to $79,999: 25 (4.4); $80,000 to $99,999: 13 (2.3); $100,000 to $119,999: 14 (2.5); $120,000 to $139,999: 10 (1.8); $140,000 or more: 15 (2.6)
Table 3 Adjective pairs, means and standard deviations (scale values ranged from 1 to 7)

Beautiful–Ugly (Shaikh 2009): M = 4.57, SD = 1.618
Calm–Exciting (Shaikh 2009): M = 3.96, SD = 1.452
Colorful–Colorless (Allen and Matheson 1977): M = 3.77, SD = 1.810
Complex–Simple (Choi and Lee 2012; Goonetilleke et al. 2001; McDougall and Reppa 2008, 2013; McDougall et al. 2016): M = 4.69, SD = 1.669
Concrete–Abstract (Arend et al. 1987; Blankenberger and Hahn 1991; Dewar 1999; Hou and Ho 2013; Isherwood et al. 2007; McDougall and Reppa 2008; McDougall et al. 1999, 2000; Moyes and Jordan 1993; Rogers and Oborne 1987): M = 4.02, SD = 1.998
Delicate–Rugged (Shaikh 2009): M = 4.42, SD = 1.368
Expensive–Cheap (Shaikh 2009): M = 4.83, SD = 1.563
Feminine–Masculine (Shaikh 2009): M = 4.34, SD = 1.388
Good–Bad (Shaikh 2009): M = 4.34, SD = 1.641
Happy–Sad (Shaikh 2009): M = 3.80, SD = 1.507
Old–Young (Shaikh 2009): M = 3.98, SD = 1.611
Ordinary–Unique (Creusen and Schoormans 2005; Creusen et al. 2010; Dewar 1999; Goonetilleke et al. 2001; Huang et al. 2002; Salman et al. 2010): M = 3.39, SD = 1.651
Passive–Active (Shaikh 2009): M = 3.97, SD = 1.708
Professional–Unprofessional (Hassenzahl et al. 2003): M = 4.22, SD = 1.736
Quiet–Loud (Shaikh 2009): M = 4.12, SD = 1.601
Realistic–Unrealistic (Vanderdonckt and Gillo 1994): M = 4.22, SD = 1.592
Relaxed–Stiff (Shaikh 2009): M = 4.47, SD = 1.560
Slow–Fast (Shaikh 2009): M = 3.87, SD = 1.576
Soft–Hard (Shaikh 2009): M = 4.19, SD = 1.545
Strong–Weak (Shaikh 2009): M = 3.93, SD = 1.464
Three-dimensional–Two-dimensional (Vanderdonckt and Gillo 1994): M = 4.67, SD = 1.863
Warm–Cool (Shaikh 2009): M = 4.02, SD = 1.435
and highest scores are between − 0.5 and 0.5, which indicates that the data are
fairly symmetrical.
3.3 Materials
A total of 68 game app icons from Google Play Store were selected for the experi-
ment. Four icons corresponding to common icon styles (concrete, abstract, charac-
ter and text) were selected from each of the 17 categories for game apps (action,
adventure, arcade, board, card, casino, casual, educational, music, puzzle, racing,
role playing, simulation, sports, strategy, trivia and word). The design of graphical
user interface elements is dependent on context (Shu and Lin 2014). Hence, we con-
sidered it justified to include icons from all categories in order to avoid systematic
bias. Moreover, as the prior literature has highlighted the relevance of concreteness
and abstractness as well as whether an icon includes face-like elements or letters, we
ensured that one icon from each category was characteristic of one of these attrib-
utes. Please refer to Table 4 for the icons used in the study.
Additional criteria were the publishing date of the apps and the number of installs
and reviews they had received at the time of selection. Since the icons in the experi-
ment were chosen during December 2016, the acceptable publishing date for the
apps was determined to range from December 3–17, 2016. No more than 500
installs and 30 reviews were permitted. The aim of this was to choose new app icons
to eliminate the chance of app and icon familiarity and thus, systematic bias. Moreo-
ver, the goal was to have a varied sample of icons both in terms of visual styles and
quality, meaning that several different computer graphic techniques were included,
such as 2D and 3D rendered images.
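To make the selection criteria concrete, the filter below is a minimal sketch in Python, assuming a hypothetical metadata file with columns named category, icon_style, published, installs and reviews; neither the file nor the column names come from the original study.

```python
import pandas as pd

# Hypothetical metadata for candidate game apps (file and column names are illustrative only).
apps = pd.read_csv("candidate_game_apps.csv", parse_dates=["published"])

# Criteria described above: published December 3-17, 2016, and at most 500 installs
# and 30 reviews at the time of selection.
eligible = apps[
    (apps["published"] >= "2016-12-03")
    & (apps["published"] <= "2016-12-17")
    & (apps["installs"] <= 500)
    & (apps["reviews"] <= 30)
]

# One icon per style (concrete, abstract, character, text) from each of the 17 game categories.
selection = eligible.groupby(["category", "icon_style"], group_keys=False).head(1)
print(len(selection))  # target: 17 categories x 4 styles = 68 icons
```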
3.4 Procedure
the closer you choose to the left or right adjective, the better you think it fits to
the adjective. If you choose the middle space, you think both adjectives fit equally
well.” The respondent was reminded that there are no right or wrong answers and
was then instructed to click “Next” to begin. The respondent was shown one icon
at a time and was asked to rate the 22 adjective pairs under the icon graphic with
the following text: “In my opinion, this icon is…”. Each respondent was randomly
assigned four icons to evaluate, one from each category of pre-selected icon attrib-
utes (abstract, concrete, character and text). After the semantic scales, the partici-
pant rated their willingness to click the icon as well as download and purchase the
imagined app that the icon belongs to, by using a seven-point Likert scale on the
same page as the icon. Lastly, demographic information (age, gender, etc.) was collected. The survey took about 10 min to complete. The survey was implemented via
SurveyGizmo, an online survey tool. All content was in English. The data were ana-
lyzed with IBM SPSS Statistics and Amos version 24 as well as Microsoft Office
Excel 2016.
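The random assignment of one icon per style category to each respondent can be sketched as follows; a minimal illustration under the assumption that the 68 pre-selected icons are stored per style, with variable and function names that are ours rather than the study's.

```python
import random

STYLES = ("abstract", "concrete", "character", "text")

def assign_icons(icons_by_style, n_participants=569, seed=42):
    """Give each participant one randomly drawn icon from each of the four style categories."""
    rng = random.Random(seed)
    assignments = []
    for _ in range(n_participants):
        assignments.append({style: rng.choice(icons_by_style[style]) for style in STYLES})
    return assignments

# 569 participants x 4 icons each = 2276 individual icon evaluations.
```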
The instrument was evaluated with three stages of consecutive analyses. First, we
examined factor loadings of the 22 visual qualities with exploratory factor analy-
sis (EFA) to examine underlying latent constructs (Table 5). Second, we performed
a confirmatory factor analysis (CFA) with structural equation modeling (SEM) to
assess whether the psychometric properties of the instrument (Fig. 1) are applicable
to similar latent constructs, which revealed the need for modification in the model.
Following the adjustments, another CFA was performed in order to finalize the
model (Fig. 2).
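The CFA itself was run in IBM SPSS Amos. As a rough open-source analogue, a five-factor measurement model of this kind can be specified in lavaan-style syntax with the Python package semopy; the sketch below is our own and uses placeholder item names (exc1, gra1, ...), since the full item-to-factor mapping is reported in Table 5 and Fig. 1 of the article.

```python
import pandas as pd
from semopy import Model, calc_stats  # assumes the semopy SEM package is installed

# One row per icon evaluation, one column per semantic differential item (placeholder names).
df = pd.read_csv("icon_evaluations.csv")

# Five-factor measurement model in lavaan-like syntax (item labels are illustrative only).
MODEL_DESC = """
Excellence   =~ exc1 + exc2 + exc3 + exc4
Graciousness =~ gra1 + gra2 + gra3
Idleness     =~ idl1 + idl2 + idl3 + idl4
Normalness   =~ nor1 + nor2
Complexity   =~ cmp1 + cmp2
"""

model = Model(MODEL_DESC)
model.fit(df)

print(model.inspect())      # parameter estimates (loadings, covariances)
print(calc_stats(model).T)  # chi-square, CFI, RMSEA and related fit indices
```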
Initially, the factorability of the 22 adjective pairs was examined. The data set was
determined suitable for this purpose, as the correlation matrix showed coefficients above .3 between most items and their respective predicted dimensions. Moreover,
the Kaiser–Meyer–Olkin measure of sampling adequacy indicated that the strength
of the relationships among variables was high (KMO = .87), and Bartlett’s test of
sphericity was significant (χ2 (231) = 21,919.22; p < .001).
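For reference, the same factorability checks can be computed with the Python factor_analyzer package; this is a sketch under the assumption that the 22 item ratings form the columns of a data frame, not the SPSS workflow the authors used.

```python
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

# 2276 rows (icon evaluations) x 22 columns (semantic ratings); the file name is illustrative.
items = pd.read_csv("icon_evaluations.csv")

chi_square, p_value = calculate_bartlett_sphericity(items)  # Bartlett's test of sphericity
kmo_per_item, kmo_total = calculate_kmo(items)              # Kaiser-Meyer-Olkin measure

print(f"Bartlett chi2 = {chi_square:.2f}, p = {p_value:.3g}")
print(f"KMO = {kmo_total:.2f}")  # the paper reports KMO = .87
```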
Given these overall indicators, EFA with varimax rotation was performed to
explore factor structures of the 22 adjective pairs used in the experiment, using data
from 2276 icon evaluations. There were no initial expectations regarding the number
of factors. Principal component analysis (PCA) was used as extraction method to
maximize the variance extracted. Varimax rotation with Kaiser normalization was
used. Please refer to Table 5 for the results of the analysis.
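A corresponding extraction step with the same package might look as follows; again a sketch rather than the SPSS procedure used in the study, with principal component extraction and varimax rotation as described above.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

items = pd.read_csv("icon_evaluations.csv")  # 2276 x 22 rating matrix (illustrative file name)

# Principal component extraction with varimax rotation, as in the reported EFA.
fa = FactorAnalyzer(n_factors=5, method="principal", rotation="varimax")
fa.fit(items)

loadings = pd.DataFrame(fa.loadings_, index=items.columns)
print(loadings.where(loadings.abs() > 0.4).round(2))  # report loadings > .4, as in Table 5

variance, proportional, cumulative = fa.get_factor_variance()
print(proportional.round(3))  # proportion of variance extracted per factor
```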
The analysis revealed five distinguishable factors: Excellence/Inferiority, Gra-
ciousness/Harshness, Idleness/Liveliness, Normalness/Bizarreness and Complexity/
Simplicity. Typically, at least two variables must load on a factor so that it can be
given a meaningful interpretation (Henson and Roberts 2006). Correlations starting
from .4 can be considered credible in that the correlations are of moderate strength
or higher (Evans 1996). In this light, all the factors formed in the analysis are valid.
3 Kenny, D.A., “Measuring Model Fit,” http://davidakenny.net/cm/fit.htm (accessed November 21, 2018).
Table 5 Exploratory factor analysis with varimax rotation (loadings > .4 bolded). Factors and variance extracted: Excellence/Inferiority (17.353%), Graciousness/Harshness (16.434%), Idleness/Liveliness (15.720%), Normalness/Bizarreness (7.828%), Complexity/Simplicity (6.163%).
three of the factors showed a good level of internal consistency, while two were found to have unacceptable alpha values.
Additionally, there were some concerns related to convergent validity where the average variance extracted (AVE) was less than .5, namely Graciousness/Harshness (AVE = .393) and Complexity/Simplicity (AVE = .361). There were also concerns related to composite reliability where the CR was less than .7, namely Normalness/Bizarreness (CR = .686) and Complexity/Simplicity
(CR = .520). In terms of discriminant validity, the square root of the average var-
iance extracted of each construct is larger than any correlation between the same
construct and all the other constructs (Fornell and Larcker 1981). Please refer to
Table 6 for full validity and reliability scores.
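For readers reproducing these checks, AVE and composite reliability follow directly from the standardized loadings (Fornell and Larcker 1981); the helper below is our own generic sketch, and the loadings passed to it are made-up illustrations rather than the published estimates.

```python
import numpy as np

def ave_and_cr(std_loadings):
    """Average variance extracted (AVE) and composite reliability (CR) from standardized loadings."""
    lam = np.asarray(std_loadings, dtype=float)
    ave = np.mean(lam ** 2)                                          # AVE: mean squared loading
    cr = lam.sum() ** 2 / (lam.sum() ** 2 + np.sum(1.0 - lam ** 2))  # CR per Fornell and Larcker (1981)
    return ave, cr

ave, cr = ave_and_cr([0.84, 0.81, 0.78, 0.72])  # illustrative loadings only
print(round(ave, 3), round(cr, 3))

# Discriminant validity (Fornell-Larcker criterion): the square root of a factor's AVE
# should exceed that factor's correlations with every other factor.
```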
According to these results, two factors out of five proved to be robust, namely
Excellence/Inferiority and Idleness/Liveliness. At this stage, the instrument does
not seem to be an optimally fitting measurement model due to the poor model fit
indices and the noted problems with validity and reliability. An additional issue here is the unacceptable loadings (Fig. 1). While loadings should fall between .32 and 1.00 (Matsunaga 2010; Tabachnick and Fidell 2007), the model contains values that are outside of these boundaries. These observations suggest post hoc adjustments to the model.
As noted in the prior literature (Brown 2015; MacKenzie et al. 2011), the removal of poorly behaved reflective indicators may improve the overall model fit. Furthermore, examining strong modification indices (MI > 3.84) and covarying items accordingly (MacKenzie et al. 2011) is likely to prove beneficial in balancing unacceptable loadings in the model. By addressing issues associated with
the problematic factors, low scores related to model fit as well as validity and reli-
ability are expected to improve.
error covariances. Please refer to Fig. 2 for the adjusted model evaluated in the
CFA.
With these changes, the results of the model fit indices were as follows:
χ2 = 1499.114, DF = 78; χ2/DF = 19.219, p ≤ .001, CFI = .906, RMSEA = .089,
and SRMR = .0705. As discussed previously, the χ2 and p values are highly sensi-
tive to sample size and are thus easily inflated (Matsunaga 2010; Russell 2002).
For this reason, they should be disregarded in this particular context where the
instrument was assessed by using data from 2276 icon evaluations. With the
exception of the discussed values, all indices showed acceptable model fit. Fur-
thermore, all item loadings now fall between the preferred .32 and 1.00 (Matsu-
naga 2010; Tabachnick and Fidell 2007), although some loadings remained low (< .55), particularly on the factors with only two items.
While the adjusted model retained good alpha values concerning the first three
factors, previously observed issues with the last two factors remained, as fol-
lows: Excellence/Inferiority (α = .896), Graciousness/Harshness (α = .740), Idle-
ness/Liveliness (α = .818), Normalness/Bizarreness (α = .588), and Complexity/
Simplicity (α = .496). The Complexity/Simplicity factor was not altered, thus the
alpha is unchanged. However, regardless of adjustments to the model, the Nor-
malness/Bizarreness factor did not reach an adequate alpha level.
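Cronbach's alpha for any of the factors can be recomputed with a few lines of generic code; this is our sketch, and the column names in the usage comment are placeholders.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the sum score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Example (placeholder column names for the Excellence/Inferiority items):
# alpha = cronbach_alpha(df[["exc1", "exc2", "exc3", "exc4"]])
```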
Similarly, adjusting the model improved the AVE values, yet issues remained
relating to convergent validity with three factors having AVE values under .5,
namely Idleness/Liveliness (AVE = .499), Normalness/Bizarreness (AVE = .494)
and Complexity/Simplicity (AVE = .378). The lower AVE score of the Normal-
ness/Bizarreness factor in this stage is presumably caused by the removal of one
semantic pair, ordinary–unique, which transforms the initial three-item factor
into a two-item factor.
Although reliability scores showed a significant increase in this stage, issues related
to composite reliability remained for two factors, namely Normalness/Bizarreness
(CR = .646) and Complexity/Simplicity (CR = .533). The model shows continued
support for discriminant validity of the five-factor model in that the square root of
AVE for each of the five factors was > 0.50 and greater than the shared variance
between each of the factors. Please refer to Table 8 for full validity and reliability
scores.
These results confirm the robustness of the Excellence/Inferiority and Idleness/Liveliness factors. Moreover, the Graciousness/Harshness factor can be considered solid in terms of validity and reliability, as the AVE value was close to the
7 Discussion
The initial measurement model of 22 items formed a five-factor structure in the EFA
in Stage 1. The factors were named to correspond to the referents on the factors:
Excellence/Inferiority, Graciousness/Harshness, Idleness/Liveliness, Normalness/
Bizarreness and Complexity/Simplicity. All items and factors were valid in the EFA. The CFA in Stage 2 revealed concerns in the model, which were addressed by item removal in Stage 3. The adjusted model retained 15 (68%) of the initial 22 items. As such, seven items with loadings under .65 (Table 7) were deleted from factors that held more than 2 items, which is the recommended solution for indicators that have low validity and reliability (MacKenzie et al. 2011). This resulted in better validity and reliability and more robust factors, thereby theoretically justifying this choice. The
majority of the removed items represent qualities that may be interpreted as ambigu-
ous in the context of visual qualities of graphical user interfaces (e.g., strong–weak,
hard–soft, old–young). It may be that these adjective pairs are often related to more
concrete, tangible traits than visuals on an interface that are generally impalpable.
Furthermore, some of these items poorly reflected others on the same factor, e.g.,
strong–weak, which can be interpreted as a synonym for quality or as a feature in a
visual (e.g., a character), among other explanations. Considering the other items on the factor that represent excellence in a more explicit way, this further justifies item removal from a methodological perspective.

Table 6 Validity and reliability for VISQUAL (Stage 2): CR, AVE, MSV and MaxR(H), together with inter-factor correlations, for Excellence/Inferiority, Graciousness/Harshness, Idleness/Liveliness, Normalness/Bizarreness and Complexity/Simplicity.
During Stage 3, modification indices were examined for values greater than 3.84
(MacKenzie et al. 2011). Error terms were allowed to correlate between the two pairs of items with the largest modification indices, namely professional–unprofessional and expensive–cheap as well as quiet–loud and calm–exciting. These
items can be considered colloquially quite similar to their correlated pair, only that
they represent similar concepts in different ways, i.e., in general and specific terms.
There is an ongoing discussion whether post hoc correlations based on modifica-
tion indices should be made. A key principle is that a constrained parameter should
be allowed to correlate freely only with empirical, conceptual or practical justifica-
tion (e.g., Brown 2015; Hermida 2015; Kaplan 1990; MacCallum 1986). Examining
modification indices has been criticized, e.g., for the risk of biasing parameters in
the model and their standard errors, as well as leading to incorrect interpretations
on model fit and the solutions to its improvement (Brown 2015; Hermida 2015).
To justify these two covaried errors in the development of this particular measurement model, it should be noted that, similar to the χ2 value and standardized residuals, modification indices are sensitive to sample size (Brown 2015). When the sample size is large (more than 200 cases), modification indices can be considered in determining re-specification (Kaplan 1990). VISQUAL was evaluated using data from 2276 icon evaluations, which inflates the aforementioned values. Therefore, appropriate measures need to be taken in order to circumvent issues related to sample size. Furthermore, residuals were allowed to correlate strictly and only when the items loaded on the same factor.
This was a first-time evaluation and validation study for VISQUAL. The instru-
ment was developed in the pursuit of aiding research and design of aesthetic inter-
face elements, which has been lacking in the field of HCI. In this era of user-adapted
interaction systems, it is crucial to advance the understanding of the relationship
between interface aesthetics and user perceptions. As such, the measurement model
shows promise in examining visual qualities of graphical user interface elements.
However, the model fit indices were nearer to acceptable than good. In addition,
convergent validity and composite reliability remain open for critique. This is per-
haps an expected feature for instruments that are based on subjective perceptions rather than more specific psychological traits.

Table 8 Validity and reliability for VISQUAL (Stage 3): CR, AVE, MSV and MaxR(H), together with inter-factor correlations, for Excellence/Inferiority, Graciousness/Harshness, Idleness/Liveliness, Normalness/Bizarreness and Complexity/Simplicity.

While aesthetic perception is subjective, this study shows evidence of features uniformly clustering in the evaluation of
graphical user interface elements. Therefore, not only is the sentiment of what is
aesthetically pleasing parallel within the responses, but also the way in which visual
features in graphical items appear together. For this reason, it is advisable to observe
items separately in conjunction with factors when utilizing VISQUAL in studying
graphical user interface elements. Additionally, experimenting on the initial model
(Fig. 1) as well as the adjusted model (Fig. 2) is recommended in further assessment
of the instrument.
7.1 Implications
The growing need for customizable and adaptive interactive systems requires new
ways of measuring and understanding perceptions and personality dimensions that
affect how graphical user interfaces are designed and adapted. This study was one
of the first attempts to develop a measurement model for individual perceptions on
visual qualities of graphical user interface elements, rather than measuring an entire
user interface. The scale was validated using a large sample of both graphical mate-
rial (i.e., icons) and respondent data, which enhances generalizability.
Icon-based interfaces are customizable, e.g., by user navigation and theme
design. Essentially, this type of user-adaptation aims for effective use, where the
user-perceived pragmatic and hedonic attributes are satisfied. Features for person-
alization include, e.g., rearranging user interface elements per preference. Users
also have the option to customize interface design by installing skins, from which data are usually gathered to determine user preferences and inform further recommendations for adaptation. With VISQUAL, data on individual perceptions of GUI elements become available, which can then be applied for user-adaptation. However, as
modeling dynamic user preference requires both preference representation and user
profile building (Liu 2015), a complementary measurement model that investigates
surrounding the component may affect the perceived utility and usability of the
component and the subjective perception of its aesthetics. As such, further research
is invited to compare subjective assessments on GUI components in two scenarios:
isolated and within (part of) a GUI. It is also to be studied whether the instrument is
applicable in other, broader contexts as well as in other fields aside from user inter-
face aesthetics research.
8 Conclusion
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as
you give appropriate credit to the original author(s) and the source, provide a link to the Creative Com-
mons licence, and indicate if changes were made. The images or other third party material in this article
are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is
not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission
directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Please use the following reference when using, adapting, further validating or otherwise referring to VISQUAL or the paper in which it was published: Jylhä and Hamari (2020).
VISQUAL is designed for measuring perceived visual qualities of graphical user
interfaces and/or singular graphical elements. The following manual describes how to apply the VISQUAL instrument. All items marked “Yes” for “Included in the final VISQUAL” should be used; however, we also recommend including the “Optional” items when administering VISQUAL. All items should preferably be presented on the same page as the graphical elements. However, if this is
impractical or impossible, all measurement items should be treated equally in terms
of their cognitive proximity to the graphic under investigation.
Use a seven-point semantic differential scale for each adjective pair (e.g., Beauti-
ful 1 2 3 4 5 6 7 Ugly). The following instructions should be added beside the meas-
ured graphic: “Please evaluate the appearance of the [graphic] shown. The closer
you choose to the left or right adjective, the better you think that adjective charac-
terized the [graphic]. If you choose the middle space, you think both adjectives fit
equally well.” The scale for each GUI element should be initiated with the following
text: “In my opinion, this [graphic] is…”
Polarity of the adjective pairs should be randomized so that perceivably positive
and negative adjectives do not align on the same side of the scale. Please refer to
Table A for list of items.
Table A Items used in VISQUAL (items marked as Optional omitted from the adjusted model); columns: Factor, Adjective pair, Included in the final VISQUAL.
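As an illustration of the recommended polarity randomization and the reverse-scoring it implies, the sketch below randomly flips the on-screen order of each pair and maps responses back to a common direction. The helper is ours and not part of the published instrument; the example pairs are taken from Table 3.

```python
import random

# Example adjective pairs from Table 3 (first adjective treated as the reference pole for scoring).
PAIRS = [("Beautiful", "Ugly"), ("Good", "Bad"), ("Calm", "Exciting"), ("Complex", "Simple")]

def presentation_order(pairs, rng=None):
    """Randomize which adjective of each pair appears on the left-hand side of the scale."""
    rng = rng or random.Random()
    order = []
    for left, right in pairs:
        flipped = rng.random() < 0.5
        order.append(((right, left) if flipped else (left, right), flipped))
    return order

def rescore(response, flipped, points=7):
    """Map a 1..7 response back to the reference direction when the pair was shown flipped."""
    return points + 1 - response if flipped else response
```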
References
Ahmed, S.U., Mahmud, A.A., Bergaust, K.: Aesthetics in human-computer interaction: views and
reviews. In: Proceedings of the 30th International Conference on HCI—New Trends in Human-
Computer Interaction, San Diego, USA, pp. 559–568 (2009)
Allen, S., Matheson, J.: Development of a semantic differential to assess users’ attitudes towards a batch
mode information retrieval system (ERIC). J. Am. Soc. Inf. Sci. 28, 268–272 (1977)
Alvarez-Cortes, A., Zarate, V.H., Uresti, J.A.R., Zayas, B.E.: Current challenges and applications for
adaptive user interfaces. In: Human–Computer interaction, Inaki Maurtua, Intech Open (2009).
https://doi.org/10.5772/7745
Arend, U., Muthig, K.P., Wandmacher, J.: Evidence for global superiority in menu selection by icons.
Behav. Inf. Technol. 6, 411–426 (1987). https://doi.org/10.1080/01449298708901853
Blankenberger, S., Hahn, K.: Effects of icon design on human–computer interaction. Int. J. Man-Mach.
Stud. 35, 363–377 (1991). https://doi.org/10.1016/S0020-7373(05)80133-6
Bouzit, S., Calvary, G., Coutaz, J., Chêne, D., Petit, E., Vanderdonckt, J.: The PDA-LPA design space
for user interface adaptation. In: Proceedings of the 11th International Conference on Research
Challenges in Information Science (RCIS). Brighton, UK (2017). https://doi.org/10.1109/
rcis.2017.7956559
Brown, T.A.: Confirmatory Factor Analysis for Applied Research. Guilford Publications, New York
(2015)
Burgers, C., Eden, A., Jong, R., Buningh, S.: Rousing reviews and instigative images: the impact of
online reviews and visual design characteristics on app downloads. Mob. Media Commun. 4, 327–
346 (2016). https://doi.org/10.1177/2050157916639348
Chen, C.C.: User recognition and preference of app icon stylization design on the smartphone. In: Stepha-
nidis, C. (ed.) HCI International 2015—Posters’ Extended Abstracts. HCI 2015. Communications in
Computer and Information Science, vol. 529. Springer, Cham (2015). https://doi.org/10.1007/978-
3-319-21383-5_2
Chin, D.N.: Empirical evaluation of user models and user-adapted systems. User Model. User-Adapt.
Interact. 11, 181–194 (2001). https://doi.org/10.1023/A:1011127315884
Choi, J.H., Lee, H.-J.: Facets of simplicity for the smartphone interface: a structural model. Int. J. Hum.
Comput Stud. 70, 129–142 (2012). https://doi.org/10.1016/j.ijhcs.2011.09.002
Cockburn, A., Gutwin, C., Greenberg, S.: A predictive model of menu performance. In: Proceedings of
the 25th Annual SIGCHI Conference on Human Factors in Computing Systems. San Jose, USA, pp.
627–636 (2007). https://doi.org/10.1145/1240624.1240723
Creusen, M.E.H., Schoormans, J.P.L.: The different roles of product appearance in consumer choice. J.
Prod. Innov. Manage. 22, 63–81 (2005). https://doi.org/10.1111/j.0737-6782.2005.00103.x
Creusen, M.E.H., Veryzer, R.W., Schoormans, J.P.L.: Product value importance and consumer pref-
erence for visual complexity and symmetry. Eur. J. Mark. 44, 1437–1452 (2010). https://doi.
org/10.1108/03090561011062916
Cyr, D., Head, M., Ivanov, A.: Design aesthetics leading to m-loyalty in mobile commerce. Inf. Manage.
43, 950–963 (2006). https://doi.org/10.1016/j.im.2006.08.009
Debevc, M., Meyer, B., Donlagic, D., Svecko, R.: Design and evaluation of an adaptive icon toolbar. User
Model. User-Adap. Interact. 6, 1–21 (1996). https://doi.org/10.1007/BF00126652
Dewar, R.: Design and evaluation of public information symbols. In: Zwaga, H.J.G., Boersema, T., Hoon-
hout, H.C.M. (eds.) Visual Information for Everyday Use, pp. 285–303. Taylor & Francis, London
(1999)
Digman, J.M.: Personality structure: emergence of the five-factor model. Annu. Rev. Psychol. 41, 417–
440 (1990). https://doi.org/10.1146/annurev.ps.41.020190.002221
Evans, J.D.: Straightforward Statistics for the Behavioral Sciences. Brooks/Cole Publishing, Pacific
Grove (1996)
Fornell, C., Larcker, D.F.: Evaluating structural equation models with unobservable variables and meas-
urement error. J. Mark. Res. 18, 39–50 (1981). https://doi.org/10.2307/3151312
Gait, J.: An aspect of aesthetics in human–computer communications: pretty windows. IEEE Trans. Soft.
Eng. 8, 714–717 (1985). https://doi.org/10.1109/TSE.1985.232520
Gajos, K.Z., Czerwinski, M., Tan, D.S., Weld, D.S.: Exploring the design space for adaptive graphical user
interfaces. In: Proceedings of Advanced Visual Interfaces (AVI). Venezia, Italy, pp. 201–208 (2006)
García, M., Badre, A.N., Stasko, J.T.: Development and validation of icons varying in their abstractness.
Interact. Comput. 6, 191–211 (1994). https://doi.org/10.1016/0953-5438(94)90024-8
Gittins, D.: Icon-based human–computer interaction. Int J. Man-Mach. Stud. 24, 519–543 (1986). https://
doi.org/10.1016/S0020-7373(86)80007-4
Goonetilleke, R.S., Shih, H.M., On, H.K., Fritsch, J.: Effects of training and representational charac-
teristics in icon design. Int. J. Hum. Comput Stud. 55, 741–760 (2001). https://doi.org/10.1006/
ijhc.2001.0501
Gullà, F., Ceccacci, S., Germani, M., Cavalieri, L.: Design adaptable and adaptive user interfaces: a
method to manage the information. In: Andò, B., Siciliano, P., Marletta, V., Monteriù, A. (eds.)
Ambient Assisted Living. Biosystems & Biorobotics, vol. 11, pp. 47–58. Springer, Cham (2015)
Hamborg, K.-C., Hülsmann, J., Kaspar, K.: The interplay between usability and aesthetics: more evi-
dence for the “what is usable is beautiful” notion. Adv. Hum. Comput. Int. (2014). https://doi.
org/10.1155/2014/946239
Hartmann, J., Sutcliffe, A., Angeli, A.D.: Towards a theory of user judgment of aesthetics and
user interface quality. ACM Trans. Comput. Hum. Interact. 15, Article 15 (2007). https://doi.
org/10.1145/1460355.1460357
Hartmann, J., Angeli, A.D., Sutcliffe, A.: Framing the user experience: information biases on website
quality judgement. In: Proceedings of the 26th Annual SIGCHI Conference on Human Factors in
Computing Systems. Florence, Italy, pp. 855–864 (2008)
Hassenzahl, M.: The interplay of beauty, goodness, and usability in interactive products. Hum. Comput.
Int. (2004). https://doi.org/10.1207/s15327051hci1904_2
Hassenzahl, M., Burmester, M., Koller, F.: AttrakDiff: Ein Fragebogen zur Messung wahrgenommener hedonischer und pragmatischer Qualität [AttrakDiff: a questionnaire to measure perceived hedonic and pragmatic quality]. In: Ziegler, J., Szwillus, G. (eds.) Mensch & Computer 2003: Interaktion in Bewegung, pp. 187–196. B. G. Teubner, Stuttgart (2003)
Henson, R.K., Roberts, J.K.: Use of exploratory factor analysis in published research: common errors
and some comment on improved practice. Educ. Psychol. Meas. 66, 393–416 (2006). https://doi.
org/10.1177/0013164405282485
Hermida, R.: The problem of allowing correlated errors in structural equation modeling: concerns and
considerations. Comput. Methods Soc. Sci. 3, 5–17 (2015)
Horton, W.: The Icon Book: Visual Symbols for Computing Systems and Documentation. Wiley, New
York (1994)
Horton, W.: Designing icons and visual symbols. In: Proceedings of the CHI 96 Conference on
Human Factors in Computing Systems. Vancouver, Canada, pp. 371–372 (1996). https://doi.
org/10.1145/257089.257378
Hou, K.-C., Ho, C.-H.: A preliminary study on aesthetic of apps icon design. In: Proceedings of the 5th
International Congress of International Association of Societies of Design Research. Tokyo, Japan
(2013)
Huang, S.-M., Shieh, K.-K., Chi, C.-F.: Factors affecting the design of computer icons. Int. J. Ind. Ergon.
29, 211–218 (2002). https://doi.org/10.1016/S0169-8141(01)00064-6
Isherwood, S.J., McDougall, S.J.P., Curry, M.B.: Icon identification in context: The changing role of icon
characteristics with user experience. Hum. Fact. 49, 465–476 (2007). https://doi.org/10.1518/00187
2007X200102
Jankowski, J., Bródka, P., Hamari, J.: A picture is worth a thousand words: an empirical study on the
influence of content visibility on diffusion processes within a virtual world. Behav. Inf. Technol. 35,
926–945 (2016)
Jankowski, J., Hamari, J., Watrobski, J.: A gradual approach for maximising user conversion without
compromising experience with high visual intensity website elements. Int. Res. 29, 194–217 (2019)
Jennings, M.: Theory and models for creating engaging and immersive ecommerce websites. In: Proceed-
ings of the 2000 ACM SIGCPR Conference on Computer Personnel Research. ACM, New York,
USA, pp. 77–85 (2000). https://doi.org/10.1145/333334.333358
Jordan, P.W.: Human factors for pleasure in product use. Appl. Ergon. 29, 25–33 (1998). https://doi.
org/10.1016/S0003-6870(97)00022-7
Jylhä, H., Hamari, J.: An icon that everyone wants to click: how perceived aesthetic qualities predict app
icon successfulness. Int. J. Hum. Comput. Stud. 130, 73–85 (2019). https://doi.org/10.1016/j.ijhcs
.2019.04.004
Jylhä, H., Hamari, J.: Development of measurement instrument for visual qualities of graphical user inter-
face elements (VISQUAL): a test in the context of mobile game icons. User Model. User-Adap.
Inter. (2020). https://doi.org/10.1007/s11257-020-09263-7
Kaplan, D.: Evaluating and modifying covariance structure models: a review and recommendation. Mul-
tivar. Behav. Res. 24, 137–155 (1990). https://doi.org/10.1207/s15327906mbr2502_1
Kline, R.B.: Principles and Practice of Structural Equation Modeling. Guilford Press, New York (2011)
Kurosu, M., Kashimura, K.: Apparent usability vs. inherent usability. In: Proceedings of the CHI 95 Con-
ference Companion on Human Factors in Computing Systems. ACM, New York, USA, pp. 292–293
(1995). https://doi.org/10.1145/223355.223680
Lavie, T., Meyer, J.: Benefits and costs of adaptive user interfaces. Int. J. Hum. Comput. Stud. 68, 508–
524 (2010). https://doi.org/10.1016/j.ijhcs.2010.01.004
Lee, S.H., Boling, E.: Screen design guidelines for motivation in interactive multimedia instruction: a
survey and framework for designers. Educ. Technol. 39, 19–26 (1999)
Lin, C.-H., Chen, M.: The icon matters: how design instability affects download intention of mobile
apps under prevention and promotion motivations. Electron. Commer. Res. (2018). https://doi.
org/10.1007/s10660-018-9297-8
Lin, C.-L., Yeh, J.-T.: Marketing aesthetics on the web: personal attributes and visual communication
effects. In: Proceedings of the 5th IEEE International Conference on Management of Innovation &
Technology. IEEE, Singapore, pp. 1083–1088 (2010)
Liu, X.: Modeling users’ dynamic preference for personalized recommendation. In: Proceedings of the
24th International Joint Conference on Artificial Intelligence. IEEE, Buenos Aires, pp. 1785–1791
(2015)
Lodding, K.N.: Iconic interfacing. IEEE Comput. Graph. Appl. 3, 11–20 (1983). https://doi.org/10.1109/
MCG.1983.262982
MacCallum, R.: Specification searches in covariance structure modeling. Psychol. Bull. 100, 107–120
(1986). https://doi.org/10.1037/0033-2909.100.1.107
MacKenzie, S.B., Podsakoff, P.M., Podsakoff, N.P.: Construct measurement and validation procedures in
MIS and behavioral research: Integrating new and existing techniques. Manag. Inf. Syst. 35, 293–
334 (2011). https://doi.org/10.2307/23044045
Mahlke, S., Thüring, M.: Studying antecedents of emotional experiences in interactive contexts. In: Pro-
ceedings of the SIGCHI Conference on Human Factors in Computing Systems. San Jose, USA, pp.
915–918 (2007)
Maity, R., Uttav, A., Gourav, V., Bhattacharya, S.: A non-linear regression model to predict aesthetic rat-
ings of on-screen images. In: Proceedings of the Annual Meeting of the Australian Special Interest
Group for Computer Human Interaction, OZCHI 2015, Parkville, Australia, pp. 44–52 (2015). https
://doi.org/10.1145/2838739.2838743
Maity, R., Madrosiya, A., Bhattacharya, S.: A computational model to predict aesthetic quality of text ele-
ments of GUI. Proc. Comput. Sci. 84, 152–159 (2016). https://doi.org/10.1016/j.procs.2016.04.081
Matsunaga, M.: How to factor-analyze your data right: do’s, don’ts, and how-to’s. Int. J. Psychol. Res. 3,
97–110 (2010). https://doi.org/10.21500/20112084.854
McDougall, S.J.P., Reppa, I.: Why do I like it? The relationships between icon characteristics, user per-
formance and aesthetic appeal. In: Proceedings of the Human Factors and Ergonomics Society 52nd
Annual Meeting. New York, USA, pp. 1257–1261 (2008). https://doi.org/10.1177/1541931208
05201822
McDougall, S.J.P., Reppa, I.: Ease of icon processing can predict icon appeal. In: Proceedings of the 15th
international conference on Human–Computer Interaction. Las Vegas, USA, pp. 575–584 (2013).
https://doi.org/10.1007/978-3-642-39232-0_62
McDougall, S.J.P., Curry, M.B., de Bruijin, O.: Understanding what makes icons effective: how subjec-
tive ratings can inform design. In: Hanson, M. (ed.) Contemporary Ergonomics, pp. 285–289. Tay-
lor & Francis, London (1998)
McDougall, S.J.P., Curry, M.B., de Bruijin, O.: Measuring symbol and icon characteristics: norms for
concreteness, complexity, meaningfulness, familiarity, and semantic distance for 239 symbols.
Behav. Res. Methods Instrum. Comput. 31, 487–519 (1999). https://doi.org/10.3758/BF03200730
McDougall, S.J.P., de Bruijn, O., Curry, M.B.: Exploring the effects of icon characteristics on user per-
formance: the role of icon concreteness, complexity, and distinctiveness. J. Exp. Psychol. Appl. 6,
291–306 (2000). https://doi.org/10.1037/1076-898X.6.4.291
McDougall, S.J.P., Reppa, I., Kulik, J., Taylor, A.: What makes icons appealing? The role of processing
fluency in predicting icon appeal in different task contexts. Appl. Ergon. 55, 156–172 (2016). https
://doi.org/10.1016/j.apergo.2016.02.006
Mõttus, M., Lamas, D., Pajusalu, M., Torres, R.: The evaluation of interface aesthetics. In: Proceedings
of the International Conference on Multimedia, Interaction, Design and Innovation (MIDI). Warsaw,
Poland (2013). https://doi.org/10.1145/2500342.2500345
Moyes, J., Jordan, P.W.: Icon design and its effect on guessability, learnability, and experienced user per-
formance. In: Alty, J.D., Diaper, D., Gust, S. (eds.) People and Computers VIII, pp. 49–59. Cam-
bridge University Society, Cambridge (1993)
Ngo, D.C.L.: Measuring the aesthetic elements of screen designs. Displays 22, 73–78 (2001). https://doi.
org/10.1016/S0141-9382(01)00053-1
Ngo, D.C.L., Samsudin, A., Abdullah, R.: Aesthetic measures for assessing graphic screens. J. Inf. Sci.
Eng. 16, 97–116 (2000)
Ngo, D.C.L., Teo, L.S., Byrne, J.G.: Modelling interface aesthetics. Inf. Sci. 152, 25–46 (2003). https://
doi.org/10.1016/S0020-0255(02)00404-8
Norman, D.A.: Emotional design: why we love (or hate) everyday things. Basic Books, New York (2004)
Nunnally, J.C., Bernstein, I.: Psychometric Theory. McGraw-Hill, New York (1994)
Overby, E., Sabyasachi, M.: Physical and electronic wholesale markets: an empirical analysis of product
sorting and market function. J. Manag. Inf. Syst. 31, 11–46 (2014). https://doi.org/10.2753/MIS07
42-1222310202
Roberts, L., Rankin, L., Moore, D., Plunkett, S., Washburn, D., Wilch-Ringen, B.: Looks good to me. In:
Proceedings of CHE03, Extended Abstracts on Human Factors in Computing Systems. ACM, New
York, USA, pp. 818–819 (2003)
Rogers, Y., Oborne, D.J.: Pictorial communication of abstract verbs in relation to human–computer inter-
action. Br. J. Psychol. 78, 99–112 (1987). https://doi.org/10.1111/j.2044-8295.1987.tb02229.x
Russell, D.W.: In search of underlying dimensions: the use (and abuse) of factor analysis in personal-
ity and social psychology bulletin. Personal. Soc. Psychol. Bull. 28, 1629–1646 (2002). https://doi.
org/10.1177/014616702237645
Salimun, C., Purchase, H.C., Simmons, D., Brewster, S.: The effect of aesthetically pleasing composition
on visual search performance. In: Proceedings of the 6th Nordic Conference on Human-Computer
Interaction: Extending Boundaries. ACM, Reykjavik, Iceland, pp. 422–431 (2010). https://doi.
org/10.1145/1868914.1868963
Salman, Y.B., Kim, Y., Cheng, H.I.: Senior-friendly icon design for the mobile phone. In: Proceedings of
the 6th International Conference on Digital Content, Multimedia Technology and its Applications
(IDC 2010). IEEE, Seoul, South Korea, pp. 103–108 (2010)
Salman, Y.B., Cheng, H.I., Patterson, P.E.: Icon and user interface design for emergency medical infor-
mation systems: a case study. Int. J. Med. Inform. 81, 29–35 (2012). https://doi.org/10.1016/j.ijmed
inf.2011.08.005
Sarsam, S.M., Al-Samarraie, H.: Towards incorporating personality into the design of an interface:
a method for facilitating users’ interaction with the display. User Model. User-Adap. Interact. 28,
75–96 (2018). https://doi.org/10.1007/s11257-018-9201-1
Schneider-Hufschmidt, M., Malinowski, U., Kühme, T.: Adaptive User Interfaces: Principles and Practice. Elsevier Science Inc., New York (1993)
Shaikh, A.D.: Know your typefaces! Semantic differential presentation of 40 onscreen typefaces. Usability News 11, 23–65 (2009)
Shu, W., Lin, C.-S.: Icon design and game app adoption. In: Proceedings of the 20th Americas Conference on Information Systems. Georgia, USA (2014)
Smith, K.A., Dennis, M., Masthoff, J., Tintarev, N.: A methodology for creating and validating psychological stories for conveying and measuring psychological traits. User Model. User-Adap. Interact. 29, 573–618 (2019). https://doi.org/10.1007/s11257-019-09219-6
Tabachnick, B.G., Fidell, L.S.: Using Multivariate Statistics. Allyn and Bacon/Pearson, Boston (2007)
Tractinsky, N.: Aesthetics and apparent usability: empirically assessing cultural and methodological issues. In: Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. ACM, New York, pp. 115–122 (1997). https://doi.org/10.1145/258549.258626
Tractinsky, N., Katz, A.S., Ikar, D.: What is beautiful is usable. Interact. Comput. 13, 127–145 (2000).
https://doi.org/10.1016/S0953-5438(00)00031-X
Vanderdonckt, J., Gillo, X.: Visual techniques for traditional and multimedia layouts. In: Proceedings of the Workshop on Advanced Visual Interfaces AVI. Bari, Italy, pp. 95–104 (1994). https://doi.org/10.1145/192309.192334
Wang, M., Li, X.: Effects of the aesthetic design of icons on app downloads: evidence from an android
market. Electron. Commer. Res. 17, 83–102 (2017). https://doi.org/10.1007/s10660-016-9245-4
Wiedenbeck, S.: The use of icons and labels in an end user application program: an empirical study of learning and retention. Behav. Inf. Technol. 18, 68–82 (1999). https://doi.org/10.1080/014492999119129
Wu, W., Chen, L., Zhao, Y.: Personalizing recommendation diversity based on user personality. User
Model. User-Adap. Interact. 28, 237–276 (2018). https://doi.org/10.1007/s11257-018-9205-x
Zen, M., Vanderdonckt, J.: Towards an evaluation of graphical user interfaces aesthetics based on metrics. In: Proceedings of the IEEE 8th International Conference on Research Challenges in Information Science (RCIS). Marrakech, Morocco, pp. 1–6 (2014). https://doi.org/10.1109/rcis.2014.6861050
Zen, M., Vanderdonckt, J.: Assessing user interface aesthetics based on the inter-subjectivity of judgment. In: Proceedings of the 30th International BCS Human Computer Interaction Conference. BCS, Swindon, UK (2016). https://doi.org/10.14236/ewic/hci2016.25
Zukerman, I., Albrecht, D.W.: Predictive statistical models for user modeling. User Model. User-Adap.
Interact. 11, 5–18 (2001). https://doi.org/10.1023/A:1011175525451
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
Henrietta Jylhä is a researcher and a PhD candidate at the Gamification Group at Tampere University. Her research focuses on visual aspects of interactive environments, such as graphical user interfaces, in relation to consumer psychology. She has experience in quantitative methods, including extensive international survey studies and online experiments. She also has a degree in game and computer graphics and a strong background in digital arts. Jylhä's current research explores the relationship between consumer perceptions and app icons. http://gamification.group/h-jylha/.
Juho Hamari is a Professor of Gamification and leads the Gamification Group at Tampere University. He has authored several seminal academic articles on gamification, games, extended realities and online economies from the perspectives of human-computer interaction, information systems science and consumer behavior. His research has been published in a variety of prestigious venues such as IEEE Transactions on Affective Computing, UMUAI, IJHCS, IJHCI, JASIST, IJIM, Organization Studies, New Media & Society, Journal of Business Research, Computers in Human Behavior, Internet Research, Electronic Commerce Research and Applications, Simulation & Gaming, as well as in books published by, among others, MIT Press. http://juhohamari.com.
Affiliations
Henrietta Jylhä1 · Juho Hamari1
Juho Hamari
juho.hamari@tuni.fi
1 Gamification Group, Faculty of Information Technology and Communication Sciences, Tampere University, 33014 Tampere University, Finland