The Science of Visual Search
How We Find Information on Screens and Interfaces
Every time you pick up your phone or look at a computer screen, the interaction often begins with finding an icon or a piece of information on the display. Visual search is a fundamental part of interacting with graphical user interfaces (GUIs). Developed in the 1970s at Xerox PARC, GUIs remain the dominant interaction model, widely used across phones, computers, home displays, and car systems. While voice user interfaces have made some progress, they remain challenging to use effectively (for more, see our recent research: Cooking With Agents: Designing Context-aware Voice Interaction). As a result, GUIs continue to be the primary paradigm for how we interact with computers.
Why? Because we can visually navigate displays with a clear idea of what we’re looking for and where to find it. Our eyes are remarkably fast and efficient, capable of processing visual information at incredible speeds. By constantly adjusting our gaze, we build a mental representation of the world, enabling us to scan, focus, and recognize targets with ease. This efficiency makes visual search a natural and effective way to interact with GUIs, offering a speed and precision that other interaction paradigms have yet to match.
But what happens during the visual search process when you’re looking for something on a screen? How do we efficiently locate an item? In a short video I filmed during the COVID-19 pandemic as part of the MSc HCI program at UCL, I delve into the fundamentals of visual search and the strategies people use to find information.
The video begins with a simple example: finding a number in a vertical list. From this, we uncover some basic, intuitive properties of visual search. For example, serial position effects show that search times increase as the target appears further down the list, reflecting a top-to-bottom search strategy. Interestingly, the first two positions often have relatively flat search times, suggesting special processing at the start of the list. Similarly, the menu size effect reveals that larger menus take longer to search, as users must scan more items and are more likely to skip over items along the way. Research refines these intuitions, offering a deeper understanding of the mechanisms behind visual search.
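The serial position and menu size effects described above can be captured with a toy linear model. This is an illustrative sketch, not the fitted models from the research: the parameter values (`FIXATION_MS`, `BASE_MS`) are made-up assumptions, and a strict top-to-bottom scan is assumed.

```python
# Toy model of menu search time (illustrative only; parameters are
# assumptions, not values fitted to the studies discussed in the text).
# Assumes a strict top-to-bottom scan: each inspected item adds a fixed
# fixation cost, so search time grows with target position, and expected
# time grows with menu size when the target is equally likely anywhere.

FIXATION_MS = 250   # hypothetical cost per inspected item
BASE_MS = 300       # hypothetical constant overhead (decision, motor prep)

def search_time_ms(target_position: int) -> float:
    """Predicted time to find an item at a given 1-based position."""
    return BASE_MS + FIXATION_MS * target_position

def expected_time_ms(menu_size: int) -> float:
    """Expected search time when the target is equally likely anywhere."""
    total = sum(search_time_ms(p) for p in range(1, menu_size + 1))
    return total / menu_size

# Serial position effect: later targets take longer to find.
# Menu size effect: larger menus take longer to search on average.
```

Real data are richer than this straight line — for instance, the flat search times for the first two positions noted above — which is exactly where empirical research refines the intuition.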
To deepen this understanding, we turn to the physiology of the eye. Our visual field shapes search strategies: the fovea, where we focus, provides sharp detail, while the parafovea, just beyond the point of fixation, offers broader but less precise information. Visual acuity and perceptual span are critical, with optimal fixation often occurring on the second or third item to maximize information gathering. Search strategies adapt to these qualities, balancing efficiency with the probability of detecting a target. This leads to fascinating behaviors, such as skipping over items during searches.
In the video, I demonstrate an AI simulation model developed using modern reinforcement learning techniques. The model incorporates the physiology of the eye, including effective visual acuity, and is designed to find a target as quickly as possible, with errors penalized by a time cost. By running the model thousands of times across simulated trials, complex visual search behaviors emerge—behaviors that closely mirror how people search. We validate this by comparing the model’s gaze patterns to those of real people using eye-tracking technology. The model reveals adaptive search strategies, such as skipping over items to locate the target faster—though this can occasionally result in missed targets.
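The speed-versus-accuracy trade-off behind item skipping can be illustrated with a toy simulation. To be clear, this is not the reinforcement-learning model from the video — it is a hand-written sketch of the emergent strategy, with assumed parameters (`FIXATION_MS`, `PARAFOVEAL_DETECT_P`): a searcher who fixates only every few items covers the menu faster, but relies on less reliable parafoveal detection and sometimes has to rescan after a miss.

```python
import random

# Toy simulation of a "skipping" search strategy (an illustration of the
# behavior described above, NOT the model from the video). The searcher
# fixates every `skip`-th item; items between fixations are detected only
# with some probability, so skipping trades speed against the chance of
# missing the target and having to rescan from the top.

FIXATION_MS = 250          # assumed cost per fixation
PARAFOVEAL_DETECT_P = 0.7  # assumed detection probability off-fixation

def search(menu_size: int, target: int, skip: int, rng: random.Random) -> float:
    """Return time (ms) to find the target when fixating every `skip` items."""
    time_ms = 0.0
    while True:
        for pos in range(1, menu_size + 1, skip):
            time_ms += FIXATION_MS
            # The fixated item is always recognized; items within the skip
            # window are detected only probabilistically.
            if pos == target:
                return time_ms
            if pos < target < pos + skip and rng.random() < PARAFOVEAL_DETECT_P:
                return time_ms
        # Target missed on this pass: rescan from the top.

rng = random.Random(42)
for skip in (1, 2, 3):
    avg = sum(search(12, rng.randrange(1, 13), skip, rng)
              for _ in range(2000)) / 2000
    print(f"skip={skip}: mean search time {avg:.0f} ms")
```

Running many trials shows the same qualitative pattern the text describes: skipping can lower average search time, at the cost of occasional misses and rescans.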
Eye-tracking provides detailed, moment-by-moment insights into where people look, when, and in what order. This uncovers nuanced search strategies, including how menu organization influences performance. For instance, when searching alphabetically organized menus, people locate targets faster by leveraging their knowledge of alphabetical order to estimate the target’s likely location (e.g., a word like "Apple" will be near the top of the list, while "Zoo" will be near the end). Our research also shows that with repeated practice on the same menu, people learn the locations of target items, further improving search speed.
Finally, the video explores how visual search and motor behavior interact. Eye-tracking and mouse-tracking data reveal two distinct patterns. In the search-and-point strategy, users visually scan the menu from top to bottom, with the cursor trailing behind in discrete steps. In contrast, the direct-selection strategy occurs when users are familiar with the target’s location; they move their gaze and cursor directly to the item in a rapid, ballistic motion, followed by fine adjustments to finalize selection. These movements align with our understanding of how people perform rapid aim movements, such as those in a Fitts' Law pointing task.
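The rapid, ballistic pointing behavior mentioned above is commonly modeled with Fitts' Law, which predicts movement time from the distance to a target and its width. The formula below is the standard Shannon formulation; the constants `A_MS` and `B_MS` are illustrative placeholders, since the real values are fit empirically per person and device.

```python
import math

# Fitts' Law: movement time for a rapid aimed movement, as a function of
# the distance to the target (D) and its width (W). The constants a and b
# are empirical; the values below are illustrative, not measured.

A_MS = 100.0   # hypothetical intercept (reaction/start-up cost)
B_MS = 150.0   # hypothetical slope (ms per bit of difficulty)

def fitts_mt_ms(distance: float, width: float) -> float:
    """Movement time via the Shannon formulation: MT = a + b * log2(D/W + 1)."""
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return A_MS + B_MS * index_of_difficulty

# A farther or smaller target has a higher index of difficulty, so it
# takes longer to acquire:
#   fitts_mt_ms(800, 20) > fitts_mt_ms(200, 20) > fitts_mt_ms(200, 80)
```

This is why the direct-selection strategy still ends with fine adjustments: a small menu item has a high index of difficulty, so the final homing-in phase takes a measurable share of the movement time.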
Incorporating empirical research, AI simulations, and real-time behavioral data, the video unpicks how we optimize our search strategies on screens. It highlights how visual and motor behaviors work together to make our interactions with GUIs efficient and intuitive. From uncovering the impact of menu size and item position to exploring adaptive strategies and the influence of physiological factors like visual acuity, the video offers a comprehensive look at the science behind visual search. By connecting these insights to practical applications—such as how menu organization and familiarity shape performance—it reveals how understanding these processes can improve interface design and user experience.
For those interested in exploring further, here are the key references mentioned in the video:
Xiuli Chen, Gilles Bailly, Duncan P. Brumby, Antti Oulasvirta, and Andrew Howes. 2015. The Emergence of Interactive Behavior: A Model of Rational Menu Search. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15). [https://doi.org/10.1145/2702123.2702483]
Gilles Bailly, Antti Oulasvirta, Duncan P. Brumby, and Andrew Howes. 2014. Model of visual search and selection time in linear menus. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '14). [https://doi.org/10.1145/2556288.2557093]
If you’re keen to learn more about the field of Human-Computer Interaction, consider applying to the MSc program at UCL. Learn more here:
https://www.ucl.ac.uk/prospective-students/graduate/taught-degrees/human-computer-interaction-msc

