Bricks: Laying the Foundations for Graspable User Interfaces
Fitzmaurice, G., Ishii, H., & Buxton, W.
(1) Dynamic Graphics Project CSRI, University of Toronto Toronto, Ontario, CANADA M5S 1A4 Tel: +1 (416) 978-6619 E-mail: gf@dgp.toronto.edu E-mail: buxton@dgp.toronto.edu
KEYWORDS: input devices, graphical user interfaces, graspable user interfaces, haptic input, two-handed interaction, prototyping, computer augmented environments, ubiquitous computing
Graspable UIs allow direct control of electronic or virtual objects through physical artifacts that act as handles for control (see Figure 1). These physical artifacts are essentially new input devices that can be tightly coupled or "attached" to virtual objects for manipulation or for expressing action (e.g., to set parameters or to initiate a process). In essence, Graspable UIs are a blend of virtual and physical artifacts, each offering affordances in its respective instantiation. In many cases, we wish to offer a seamless blend between the physical and virtual worlds.
FIGURE 1. A graspable object.
The basic premise is that the affordances of the physical handles are inherently richer than what virtual handles afford through conventional direct manipulation techniques. These physical affordances, which we will discuss in more detail later, include facilitating two handed interactions, spatial caching, and parallel position and orientation control.
The Graspable UI design offers a concurrence between space-multiplexed input and output. Input devices can be classified as being space-multiplexed or time-multiplexed. With space-multiplexed input, each function to be controlled has a dedicated transducer, each occupying its own space. For example, an automobile has a brake, clutch, throttle, steering wheel, and gear shift which are distinct, dedicated transducers, each controlling a single specific task. In contrast, time-multiplexed input uses one device to control different functions at different points in time. For instance, the mouse uses time multiplexing as it controls functions as diverse as menu selection, navigation using the scroll widgets, pointing, and activating "buttons." Traditional GUIs have an inherent dissonance in that the display output is often space-multiplexed (icons or control widgets occupy their own space and must be made visible to use) while the input is time-multiplexed (i.e., most of our actions are channeled through a single device, a mouse, over time). Only one task, therefore, can be performed at a time, as they all use the same transducer. The resulting interaction techniques are often sequential in nature and mutually exclusive. Graspable UIs attempt to overcome this.
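To make the distinction concrete, here is a minimal sketch (the class and attribute names are our own illustrative choices, not from the paper) contrasting the two input styles: a time-multiplexed device must be re-bound to each new function, while a space-multiplexed controller dedicates one transducer per function and never pays that switching cost.

```python
class TimeMultiplexedInput:
    """A single transducer (e.g., a mouse) reassigned to one function at a time."""
    def __init__(self):
        self.current_function = None
        self.acquisitions = 0  # times the device had to be re-bound to a function

    def operate(self, function, value):
        if function != self.current_function:
            self.current_function = function  # mode switch: a sequential cost
            self.acquisitions += 1
        return (function, value)


class SpaceMultiplexedInput:
    """One dedicated transducer per function; all usable in parallel."""
    def __init__(self, functions):
        self.transducers = {f: f for f in functions}

    def operate(self, function, value):
        return (self.transducers[function], value)  # no mode switch needed


mouse = TimeMultiplexedInput()
for f, v in [("point", 1), ("scroll", 2), ("point", 3)]:
    mouse.operate(f, v)
print(mouse.acquisitions)  # three function changes -> 3 re-bindings
```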
In general, the Graspable UI design philosophy has several advantages:
The bricks act as specialized input devices and are tracked by the host computer. From the computer's perspective, the brick devices are tightly coupled to the host computer -- capable of constantly receiving brick related information (e.g., position, orientation and selection information) which can be relayed to application programs and the operating system. From the user's perspective, the bricks act as physical handles to electronic objects and offer a rich blend of physical and electronic affordances.
FIGURE 3. Move and rotate virtual object by manipulating physical brick which acts as a handle.
A simple example application may be a floor planner (see Figure 5a). Each piece of furniture has a physical brick attached and the user can arrange the pieces, most likely in a rapid trial-and-error fashion. This design lends itself to two handed interaction and the forming of highly transient groupings by touching and moving multiple bricks at the same time.
FIGURE 4. Two bricks can stretch the square. One brick acts like an anchor while the second brick is moved.
Placing more than one brick on an electronic object gives the user multiple control points to manipulate an object. For example, a spline-curve can have bricks placed on its control points (see Figure 5b). A more compelling example is using the position and orientation information of the bricks to deform the shape of an object. In Figure 6, the user starts off with a rectangle shaped object. By placing a brick at both ends and rotating them at the same time, the user specifies a bending transformation similar to what would happen in the real world if the object were made out of a malleable material such as clay. It is difficult to imagine how this action or transformation could be expressed easily using a mouse.
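One way to approximate such a two-brick bend in 2-D is to interpolate a rotation angle across the shape from one brick's orientation to the other's. This is only a sketch under our own assumptions (the paper does not specify the deformation math), with a hypothetical `bend` function:

```python
import math

def bend(points, angle_left, angle_right):
    """Rotate each vertex by an angle linearly interpolated between the
    orientations of the two bricks, one at each end of the shape's x-extent.
    A crude, clay-like deformation; the interpolation scheme is illustrative."""
    xs = [p[0] for p in points]
    x0, x1 = min(xs), max(xs)
    out = []
    for (x, y) in points:
        t = (x - x0) / (x1 - x0) if x1 != x0 else 0.0
        a = (1 - t) * angle_left + t * angle_right  # blend the brick angles
        out.append((x * math.cos(a) - y * math.sin(a),
                    x * math.sin(a) + y * math.cos(a)))
    return out
```

With both bricks held at the same orientation the shape is unchanged; twisting one brick bends the shape progressively toward that end.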
One key idea that these examples illustrate is that the bricks offer a significantly richer vocabulary of expression for input devices. Compared to most pointing devices (e.g., the mouse), which offer only a single x-y location, the bricks offer multiple x-y locations plus orientation information at the same instant in time.
FIGURE 6. Moving and rotating both bricks at the same time causes the electronic object to be transformed.
The LegoWall prototype (developed by A/S Modulex, Billund Denmark in conjunction with the Jutland Institute of Technology in 1988) consists of specially designed LEGO blocks that fasten to a wall mounted peg-board panel composed of a grid of connectors. The connectors supply power and a means of communication from the blocks to a central processing unit. This central processing unit runs an expert system to help track where the blocks are and what actions are valid.
The behavior construction kits [9] consist of computerized LEGO pieces with electronic sensors (such as light, temperature, pressure) which can be programmed by a computer (using LEGO/Logo) and assembled by users. These LEGO machines can be spread throughout the environment to capture or interact with behaviors of people, animals or other physical objects. The "programmable brick," a small battery powered computer containing a microprocessor, non-volatile ROM and I/O ports, is also being developed to spread computation.
The AlgoBlock system [13] is a set of physical blocks that can be connected to each other to form a program. Each block corresponds to a single Logo-like command in the programming language. Once again, the emphasis is on manipulating physical blocks each with a designated atomic function which can be linked together to compose a more complex program. The system facilitates collaboration by providing simultaneous access and mutual monitoring of each block.
Based on a similar philosophy to the 3-Draw computer-aided design tool [11], Hinckley et al. have developed passive real-world interface props [5]. Here users are given physical props as a mechanism to manipulate 3D models. They are striving for interfaces in which the computer passively observes a natural user dialog in the real world (manipulating physical objects), rather than forcing a user to engage in a contrived dialog in the computer generated world.
Finally, the DigitalDesk [15] merges our everyday physical desktop with
paper documents and electronic documents. A computer display is projected
down onto a real physical desk and video cameras pointed at the desk use
image analysis techniques to sense what the user is doing. The DigitalDesk
is a great example of how well we can merge physical and electronic artifacts,
taking advantage of the strengths of both mediums.
We observed rapid hand movements and a high degree of parallelism in terms of the use of two hands throughout the task. A very rich gestural vocabulary was exhibited. For instance, a subject's hands and arms would cross during the task. Subjects would sometimes slide the bricks instead of picking them up and dropping them. Multiple bricks were moved at the same time. Occasionally a hand was used as a "bulldozer" to form groups or to move a set of bricks at the same time. The task allowed subjects to perform imprecise actions and interactions. That is, they could use mostly ballistic actions throughout the task and the system allowed for imprecise and incomplete specifications (e.g., "put this brick in that pile," which does not require a precise (x, y) position specification). Finally, we noticed that users would enlarge their workspace to be roughly the range of their arms' reach.
Once again this sorting task also revealed interesting interaction properties. Tactile feedback was often used to grab dominos while visually attending to other tasks. The non-dominant hand was often used to reposition and align the dominos into their final resting place while, in parallel, the dominant hand was used to retrieve new dominos. The most interesting observation was that subjects seemed to inherently know the geometric properties of the bricks and made use of this everyday knowledge in their interactions without prompting. For example, if 5 bricks are side-by-side in a row, subjects knew that applying simultaneous pressure to the left-most and right-most end bricks would cause the entire row of bricks to be moved. Finally, in the restricted workspace domino condition we observed one subject taking advantage of the "stackability" of the dominos and occasionally piling similar dominos on top of others to conserve space. Also, sometimes a subject would use their non-dominant hand as a "clipboard" or temporary buffer while they planned or manipulated other dominos.
FIGURE 7. Flexible curve and stretchable square.
We found that each subject had a different style of grasping the stretchable square for position and orientation tasks. This served to remind us that physical objects often have a wide variety of ways to grasp and to manipulate them even given natural grasp points. In addition, subjects did not hesitate and were not confounded by trying to plan a grasp strategy. One subject used his dominant hand to perform the primary manipulation and the non-dominant hand as a braking mechanism and for finer control.
Perhaps the most salient observation is that users performed the three operations (translation, rotation and scaling) in parallel. That is, as the subjects were translating the square towards its final position, they would also rotate and scale the square at the same time. These atomic operations are combined and chunked together [1].
We observed that even when we factor out the time needed to switch in and out of rotation mode in MacDraw, task completion time was about an order of magnitude longer than the physical manipulation using the stretchable square. We noticed a "zoom-in" effect to reach the desired end target goal. For example, subjects would first move the object on top of the target. Then they would rotate the object, but often be unable to plan ahead and realize that the center of rotation will cause the object to be displaced. Thus, they often had to perform another translation operation. They would repeat this process until satisfied with a final match.
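The displacement that forced this repeated translate-rotate cycle follows from simple 2-D geometry: when a sequential rotate tool spins a shape about a fixed pivot that is not the shape's center, the center itself is carried to a new location. A small sketch (our own illustration, with a hypothetical `rotate_about` helper):

```python
import math

def rotate_about(point, pivot, angle):
    """Rotate a 2-D point about a pivot by `angle` radians."""
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    return (pivot[0] + dx * math.cos(angle) - dy * math.sin(angle),
            pivot[1] + dx * math.sin(angle) + dy * math.cos(angle))

center = (5.0, 5.0)   # center of the object being positioned
corner = (0.0, 0.0)   # pivot used by the rotation tool
moved = rotate_about(center, corner, math.pi / 2)
# Rotating 90 degrees about the corner carries the center to (-5, 5),
# so the user must translate again to re-land on the target.
```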
The MacDraw user interface, and many other interfaces, forces the subject to perform the operations in a strictly sequential manner. While we can become very adept at performing a series of atomic operations in sequence, the interface constrains user interaction behavior. In effect, the interface forces users to remain novices by not allowing them to exhibit more natural and efficient expressions of specifying atomic operations in parallel.
We found that users quickly learned and explored the physical
properties of the flexible curve and exhibited very expert performance
in under a minute. All ten fingers were often used to impart forces and
counterforces onto the curve. The palm of the hand was also used to preserve
portions of the shape during the curve matching task. We observed that
some subjects would "semantically load" their hands and arms before making
contact with the flexible curve in anticipation of their interactions.
The semantic loading is a preconceived grasp and manipulation strategy which requires the arms, hands and fingers to start in a specific, sometimes uncomfortable, loaded position in order to execute properly. This process often allowed the subject to reach the final target curve shape in one gestural action.
All of these exploratory studies and mock-ups helped us quickly explore some of the core concepts with minimal set-up effort. Finally, the videotapes that we created often serve as inspirational material.
The two Bird receivers act like bricks and can be used simultaneously to perform operations in parallel. One of the bricks has a push button attached to it to register additional user input. This button is primarily used for creating new objects. Grasps (i.e., attaching the brick to a virtual object) are registered when a brick is near or on the desktop surface. To release a grasp, the user lifts the brick off the desktop (about 2 cm).
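The grasp/release rule above amounts to a simple height threshold on the tracker data. A minimal sketch, assuming the brick's height above the desk is available each frame (the `Brick` class and `update` signature are hypothetical):

```python
GRASP_HEIGHT_CM = 2.0  # lift threshold from the prototype: ~2 cm releases a grasp

class Brick:
    def __init__(self):
        self.attached_object = None

    def update(self, z_cm, nearest_object):
        """Register a grasp when the brick is on or near the desk surface;
        release it once the brick is lifted past the threshold."""
        if z_cm < GRASP_HEIGHT_CM and self.attached_object is None:
            self.attached_object = nearest_object   # grasp the object under the brick
        elif z_cm >= GRASP_HEIGHT_CM:
            self.attached_object = None             # lifted off the desk: release
        return self.attached_object
```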
To select the current tool (select, delete, rectangle, triangle, line, circle) and current draw color, we use a physical tray and an ink-well metaphor. Users dunk a brick in a compartment in the tray to select a particular tool. A soft audio beep provides feedback when switching tools. Once a tool is selected, a prototype shape or tool icon is attached to the brick. The shape or icon is drawn in a semi-transparent layer so that users may see through the tool.
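The ink-well dunk reduces to a hit test of the brick's tracked position against the tray's compartments. A sketch under assumed geometry (tray origin, slot sizes, and the `dunk` function are all our own illustrative choices):

```python
TRAY_X, TRAY_Y = 0.0, 0.0   # assumed tray origin on the desk, in cm
SLOT_W = 4.0                # assumed width of each compartment, in cm
TOOLS = ["select", "delete", "rectangle", "triangle", "line", "circle"]

def dunk(x, y, slot_h=3.0):
    """Return the tool whose tray compartment contains the brick, else None."""
    if TRAY_Y <= y < TRAY_Y + slot_h:
        slot = int((x - TRAY_X) // SLOT_W)   # which compartment along the tray
        if 0 <= slot < len(TOOLS):
            return TOOLS[slot]
    return None  # the brick was not dunked in the tray
```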
FIGURE 8. GraspDraw application and ActiveDesk.
The concepts of an anchor and an actuator have been defined for interactions that involve two or more bricks. An anchor serves as the origin of an interaction operation. Anchors often specify an orientation value as well as a positional value. Actuators specify only positional values and operate within a frame of reference defined by an anchor. For example, performing a stretching operation on a virtual object involves using two bricks: one as an anchor and the other as an actuator. The first brick attached to the virtual object acts as an anchor; the object can be moved or rotated. When the second brick is attached, it serves as an actuator, and its position information is registered relative to the anchor brick. If the anchor brick is released, the actuator brick is promoted to the role of an anchor.
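The attach/release/promotion protocol can be sketched as a small state machine. This is a minimal illustration of the rules described above, not the prototype's actual code, and the class and method names are hypothetical:

```python
class TwoBrickInteraction:
    """The anchor defines the frame of reference; the actuator's position is
    expressed relative to it. Releasing the anchor promotes the actuator."""
    def __init__(self):
        self.anchor = None
        self.actuator = None

    def attach(self, brick):
        if self.anchor is None:
            self.anchor = brick        # first brick attached becomes the anchor
        elif self.actuator is None:
            self.actuator = brick      # second brick becomes the actuator

    def release(self, brick):
        if brick == self.anchor:
            # anchor released: the actuator is promoted to anchor
            self.anchor, self.actuator = self.actuator, None
        elif brick == self.actuator:
            self.actuator = None

    def relative_actuator_pos(self, positions):
        """Actuator position in the anchor's frame (orientation ignored here)."""
        ax, ay = positions[self.anchor]
        bx, by = positions[self.actuator]
        return (bx - ax, by - ay)
```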
The first point can be dealt with by careful experimental design and differentiating between controlled experiments and user testing. Our approach to the second is to partner with a commercial software company that has a real application with real users. In so doing, we were able to access both a real application and a highly trained user community.
Hence, we have integrated a critical mass of the Graspable UI concepts
into a modified version of Alias Studio(TM), a high-end 3D modeling and
animation program for SGI machines. Specifically, we are exploring how
multiple bricks can be used to aid curve editing tasks. Although we have
just begun this stage of research, we currently have two bricks integrated
into the Studio program. The bricks can be used to simultaneously edit
the position, orientation and scale factor for points along a curve. Future
investigations may use bricks to clamp or freeze portions of the curve.
This integration process and evaluation will further help us to refine
the Graspable UI concepts.
One could argue that all Graphical UI interactions, except perhaps touch (e.g., touchscreens), are already graspable interfaces if they use a mouse or stylus. However, this claim misses a few important distinctions. First, Graspable UIs make a distinction between "attachment" and "selection." In traditional Graphical UIs, the selection paradigm dictates that there is typically only one active selection; selection N implicitly causes selection N-1 to be unselected. In contrast, when bricks are attached to virtual objects the association persists across multiple interactions. Selections are then made by making physical contact with the bricks. Therefore, with Graspable UIs we can possibly eliminate many of the redundant selection actions and make selections easier by replacing the act of precisely positioning a cursor over a small target with the act of grabbing a brick. Secondly, Graspable UIs advocate using multiple devices (e.g., bricks) instead of channeling all interactions through one device (e.g., mouse). Consequently, not only are selections persistent, there can be one persistent selection per brick. Thirdly, the bricks are inherently spatial. For example, we can temporarily arrange bricks to form spatial caches or use them as spatial landmarks for storage. By having more spatial persistence, we can use more of our spatial reasoning skills and muscle memory. This was exhibited during the LEGO and Domino exploratory studies. Clearly, the bricks are handled differently than a mouse.
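The attachment/selection distinction can be stated compactly in code. A sketch with hypothetical class names: the conventional model holds one mutable selection, while the graspable model holds one persistent association per brick:

```python
class SelectionModel:
    """Traditional GUI: selecting object N implicitly deselects object N-1."""
    def __init__(self):
        self.selected = None

    def select(self, obj):
        self.selected = obj  # the previous selection is silently lost


class AttachmentModel:
    """Graspable UI: each brick keeps its own association across interactions."""
    def __init__(self):
        self.attachments = {}

    def attach(self, brick, obj):
        self.attachments[brick] = obj  # persists until explicitly detached

    def selected_objects(self, touched_bricks):
        """Touching bricks selects their attached objects, several at once."""
        return [self.attachments[b] for b in touched_bricks
                if b in self.attachments]
```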
One might suggest eliminating the bricks and using only our hands as the physical input devices. While this may be useful for some applications, in general a physical intermediary (i.e., a brick) may be more desirable. First, tactile feedback is essential; it provides a way of safeguarding user intent. The bricks supply tactile confirmation and serve as a visual interaction residue. Secondly, hand gestures lack natural delimiters for starting and stopping points, which makes it difficult to segment commands and introduces lexical pragmatics problems. In contrast, the affordances of touching and releasing a brick serve as very natural start and stop points.
There are many open design issues and interaction pragmatics to research. For example, should we vary the attributes of a brick (shape, size, color, weight) to indicate its function? Should all the bricks have symmetrical behavior? How many bricks can a user operate with at the same time? Do the bricks take up too much space and cause screen clutter (perhaps we can stack the bricks and they can be made out of translucent material)? For fine, precise pointing, do bricks have a natural hot spot (perhaps a corner or edge)? Sometimes it is more advantageous to have a big "cursor" to acquire a small target [6].
Our goal has been to quickly explore the new design space and identify major landmarks and issues rather than quantify any specific subset of the terrain. The next phase of our evaluation will include more detailed studies of the places in the design space that show the most potential.
It should be noted that this is not an exhaustive parsing of the design space. Robinett [10], however, proposes a more formal taxonomy for technologically mediated experiences which may aid our investigation. Yet the many dimensions of our design space exhibit its richness and provide a more structured mechanism for exploring the concepts behind Graspable UIs.
The Graspable User Interface is an example of "radical evolution." It is evolutionary in the sense that it builds upon the conventions of the GUI. Hence, both existing technology and human skill will transfer to the new technique. However, it is radical in that the incremental change that it introduces takes us into a radically new design space. Assuming that this new space is an improvement on what preceded it, this combination gives us the best of both worlds: the new and the status quo.
From the experience gained in the work described, we believe these
new techniques to be highly potent and worthy of deeper study. What we
have attempted is a proof of concept and exposition of our ideas. Hopefully
this work will lead to a more detailed exploration of the technique and
its potential.
More material including dynamic figures can be found on the CHI'95 Electronic Proceedings CD-ROM and at URL: http://www.dgp.utoronto.ca/people/GeorgeFitzmaurice/home.html
2. Eilan, N., McCarthy, R. and Brewer, B. (1993). Spatial Representation. Oxford, UK: Blackwell.
3. Guiard, Y. (1987). Asymmetric Division of Labor in Human Skilled Bimanual Action: The Kinematic Chain Model. In Journal of Motor Behavior, 19(4), pp. 486-517.
4. Fitzmaurice, G.W. (1993). Situated Information Spaces and Spatially Aware Palmtop Computers, Communications of the ACM. 36(7), pp. 38-49.
5. Hinckley, K., Pausch, R., Goble, J. C. and Kassell, N. F. (1994). Passive Real-World Interface Props for Neurosurgical Visualization. Proc. of CHI'94, pp. 452-458.
6. Kabbash, P. and Buxton, W. (1995). The 'Prince' Technique: Fitts' Law and Selection Using Area Cursors, To appear in Proc. of CHI'95.
7. Kabbash, P., Buxton, W. and Sellen, A. (1994). Two-Handed Input in a Compound Task. Proc. of CHI94, pp. 417-423.
8. MacKenzie, C. L. and Iberall, T. (1994). The Grasping Hand. Amsterdam: North-Holland, Elsevier Science.
9. Resnick, M. (1993). Behavior Construction Kits. In Communications of the ACM. 36(7), pp. 64-71.
10. Robinett, W. (1992). Synthetic Experience: A Proposed Taxonomy. Presence, 1(2), pp. 229-247.
11. Sachs, E., Roberts, A. and Stoops, D. (1990). 3-Draw: A tool for the conceptual design of three-dimensional shapes. CHI'90 Technical Video Program, ACM SIGGRAPH Video Review, Issue 55, No. 2.
12. Schneider, S.A. (1990). Experiments in the dynamic and strategic control of cooperating manipulators. Ph.D. Thesis, Dept. of Elec. Eng., Stanford Univ.
13. Suzuki, H., Kato, H. (1993). AlgoBlock: a Tangible Programming Language, a Tool for Collaborative Learning. Proceedings of 4th European Logo Conference, Aug. 1993, Athens Greece, pp. 297-303.
14. Weiser, M. (1991). The Computer for the 21st Century. In Scientific American, 265(3), pp. 94-104.
15. Wellner, P. (1993). Interacting with Paper on the DigitalDesk. In Communications of the ACM, 36(7), pp. 86-96.
16. Zimmerman, T., Smith, J.R., Paradiso, J.A., Allport, D. and Gershenfeld, N. (1995). Applying Electric Field Sensing to Human-Computer Interfaces. To appear in Proc. of CHI'95.