Precision Timing in Human-Robot Interaction: Coordination of Head Movement and Utterance

Summary:
Yamazaki et al., a team of researchers from Future University-Hakodate and Saitama University, discuss the significance of non-verbal actions by robots when conversing with human partners.  Previous research had indicated the importance of non-verbal actions in human-robot communication, but Yamazaki et al. aimed to show that the timing of such actions is also important.  They conducted preliminary experiments with museum guide robots in which the robots either turned their head toward a visitor while talking about an exhibit or continued looking straight at the exhibit.  They believed that visitors were likely to look at the robot if the robot looked at them.  They took things a step further in the next experiment, where robots performed either in systematic mode (turning the head toward a visitor at interactionally significant cues) or unsystematic mode (turning the head at interactionally insignificant cues).
In the main experiment of the study, the researchers observed 46 participants who were shown a display in a museum by a guide robot.  What they found was that participants showed synchronized responses to the robot in systematic mode, often turning their heads at the same time as the robot.  The robots in unsystematic mode, however, elicited fewer responses from the visitors and communicated much less smoothly.
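
To make the distinction concrete, here is a minimal sketch of what a "systematic mode" controller might look like. The cue markers, phrases, and timings below are all invented for illustration; the actual system times its head turns from the structure of the guide's talk, not from a hard-coded script like this one:

```python
# Hypothetical sketch: turning a guide robot's head only at
# "interactionally significant" points of a scripted utterance,
# in the spirit of the paper's systematic mode.

import time

# A scripted explanation, annotated with the points where a turn toward
# the visitor is interactionally significant (e.g., ends of key phrases).
SCRIPT = [
    ("This statue was excavated in 1923,", False),
    ("and here is the surprising part:", True),   # turn head: key point
    ("it was found in two separate pieces,", False),
    ("nearly a mile apart.", True),               # turn head: punchline
]

def turn_head_toward_visitor():
    print("  [robot turns head toward visitor]")

def face_exhibit():
    print("  [robot faces exhibit]")

def run_systematic_mode(words_per_second=2.5):
    """Speak each phrase, turning toward the visitor only at marked cues."""
    for phrase, significant in SCRIPT:
        print(f"robot says: {phrase}")
        # Crude speech-duration estimate from word count.
        time.sleep(len(phrase.split()) / words_per_second)
        if significant:
            turn_head_toward_visitor()
        else:
            face_exhibit()

if __name__ == "__main__":
    run_systematic_mode()
```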

Discussion:
It's very interesting to me how easily people begin to interact with the robot in a more natural way when it turns its head at the appropriate cues - it displays how important non-verbal communication really is and how simple changes to robots or other interaction systems can make dealing with them so much smoother.  The researchers mentioned that visitors would often stop paying attention to the robot's explanation of an exhibit if it seemed that the robot was not turning toward them at the appropriate cues, but were more engaged when the robot looked at them at important points in the explanation.  The research still doesn't explain why Japanese people seem to be obsessed with robots, but it does indicate that we can make them more interactive with the appropriate timing of non-verbal cues.

Building Mashups By Example

Summary:
Rattapoom Tuchinda, Pedro Szekely and Craig Knoblock of the University of Southern California's Information Sciences Institute present their research on building internet mashups in this paper.  A mashup is a web application that integrates data from multiple web sources to provide a unique service.  Many tools exist for creating mashups that ostensibly accommodate non-programmers, relying on things like widget controls to execute their functions, but these tools are usually confusing or do in fact rely on the user's understanding of programming concepts in order to function.  The authors briefly discuss several existing tools and the main issues of creating mashups - data retrieval, source modeling, data cleaning, data integration, and data visualization.  They then present their own mashup builder, Karma, which they posit solves the problems of other mashup-creation tools.
In Karma, users drag data from a browser pane into a table to "extract" it, after which Karma helps them assign attribute names and clean and integrate the data.  The authors discuss the implementation of each step of the process as realized by Karma in detail.  They held an evaluation comparing Karma to the mashup tools Dapper and Pipes to see if it made mashup creation easier and faster; an expert was tasked with carrying out three mashup-building tasks and the researchers tallied how many "steps" the user took for each process in the task.  Overall, Karma outperformed the other two systems, consistently taking fewer steps to achieve the same results.
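
As a rough illustration of the "by example" idea (not Karma's actual implementation), dragging out one value can be generalized to pull out all of its siblings by wildcarding its position in the page structure. The toy page model below is invented:

```python
# Hypothetical sketch of extraction by example: find the example value's
# path in a nested page structure, wildcard the list index, and re-walk
# the structure to collect every value in the same "column".

page = {
    "table": [
        {"name": "Casa Bianca", "rating": "4.5"},
        {"name": "Zankou Chicken", "rating": "4.2"},
        {"name": "Langer's Deli", "rating": "4.8"},
    ]
}

def find_path(node, target, path=()):
    """Depth-first search for the example value; returns its path."""
    if node == target:
        return path
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return None
    for key, child in children:
        result = find_path(child, target, path + (key,))
        if result is not None:
            return result
    return None

def extract_by_example(node, example):
    """Generalize the example's path: wildcard list indices, keep keys."""
    path = find_path(node, example)
    if path is None:
        return []
    def walk(node, steps):
        if not steps:
            return [node]
        head, *rest = steps
        if head == "*":            # wildcard: fan out over all list items
            return [v for child in node for v in walk(child, rest)]
        return walk(node[head], rest)
    generalized = ["*" if isinstance(s, int) else s for s in path]
    return walk(node, generalized)

# Dragging "Casa Bianca" into the table yields the whole name column.
print(extract_by_example(page, "Casa Bianca"))
# -> ['Casa Bianca', 'Zankou Chicken', "Langer's Deli"]
```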

Discussion:
I have to be honest: after reading this paper and even looking at a few examples cited by the authors, I still can't really tell what a mashup is or what its purpose might be.  All of the things I saw looked like spruced-up RSS feeds or link dumps...maybe that's what a mashup is supposed to accomplish?  Karma certainly seemed up to the task of creating one, I guess, but it didn't help me figure out what a mashup was.  I even went to MashupCamp to see what all the fuss was about, and the first sentence I read in their about page said, "Ask 10 self-proclaimed mashup developers what a mashup is and you might get 10 different answers."  Thanks for the illumination, MashupCamp.

Tagsplanations: Explaining Recommendations Using Tags

Summary:
In this paper, Jesse Vig, Shilad Sen, and John Riedl of the University of Minnesota's GroupLens Research present Tagsplanations, a system for explaining recommendations based on community tags.  Popular media services like Netflix or iTunes often have recommender systems that suggest similar items based on the user's previous selections or perceived "tastes."  The researchers first discuss the most common method of recommendation - establishing an "intermediary entity" between the user and the item to be recommended.
Tagsplanations uses tags as this intermediary.  For example, “We recommend the movie Fargo because it is tagged with quirky and you have enjoyed other movies tagged with quirky.”  The aim is to give the user more information about why they might like a recommendation.  The system uses several measures like tag preference, tag relevance, and filtering to increase the overall quality of the tags.
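
Here is a minimal sketch of how a tag-based explanation could be assembled, assuming (as one plausible reading of the paper) that tag preference is an average of the user's ratings for items carrying the tag. The movies, tags, and ratings are made up:

```python
# Hypothetical sketch of a Tagsplanation-style explanation built from
# a user's ratings and community tags.

ratings = {          # user's ratings, 1-5 stars (invented)
    "Fargo": 5, "Raising Arizona": 4, "Armageddon": 2,
}
item_tags = {
    "Fargo": {"quirky", "dark comedy"},
    "Raising Arizona": {"quirky"},
    "Armageddon": {"action"},
}

def tag_preference(tag):
    """Average rating the user gave to items tagged with `tag`."""
    rated = [r for item, r in ratings.items() if tag in item_tags[item]]
    return sum(rated) / len(rated) if rated else None

def explain(recommendation, tags):
    """Build an explanation sentence from the user's best-loved tag."""
    scored = [(tag_preference(t), t) for t in tags]
    scored = [(p, t) for p, t in scored if p is not None]
    if not scored:
        return f"We recommend {recommendation}."
    pref, tag = max(scored)
    return (f"We recommend {recommendation} because it is tagged with "
            f"'{tag}' and you rated movies tagged '{tag}' {pref:.1f}/5 "
            f"on average.")

print(explain("The Big Lebowski", {"quirky", "action"}))
```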

The researchers conducted a user study in which participants tested Tagsplanations in four different interfaces and filled out a survey about their experience.  80% of the subjects found that the system helped them understand recommendations better and make a selection decision.

Discussion:
From my experience with Netflix, I know that tagging and recommendation systems can help a user discover new media that they otherwise might not have come across.  I think Tagsplanations is a great way to bolster current recommendation systems and help users make decisions about what they want to see.  The specific tags are a lot better than a vague "you might also like this" statement.

Opening Skinner's Box

Opening Skinner's Box by Lauren Slater was at once entertaining, fascinating, horrifying, disorienting, and just plain weird.  It was refreshing to read a book that was packed with information rather than acidic complaints about design or software.  Slater dispenses history and commentary in an accessible narrative style that belies the profound observations about the human psyche made by the discussed researchers.  However, some of her more personal and colorful remarks (as well as her tendency to try and replicate some of the experiments from the book) undermined my confidence in her as a reliable narrator.  I found her thoughts about each experiment useful, though, as they often echoed or contrasted my own.

Slater gives a brief historical synopsis of each experiment (the full list of which I'll put below) and the responses or controversy generated by it; she often interweaves personal narrative or vivid analogies with facts which, while making the book more interesting as a whole, are sometimes off-putting or detrimental to the reader.

Slater discusses ten important psychological researchers and their experiments: B.F. Skinner and his work on behaviorism; Stanley Milgram's authority experiment; David Rosenhan's skewering of psychiatric diagnosis; Harry Harlow's brutal look at love; Bruce Alexander's Rat Park; Elizabeth Loftus' work with false memories; Antonio Moniz and his most famous creation, lobotomy; Latané and Darley's discovery of diffusion of responsibility; and Eric Kandel, whose work showed the brain chemistry behind learning.  Each of these experiments is fascinating in its own right, and having them all together in one place makes them that much easier to compare and discuss.

I enjoyed reading this book, though I was cautious to accept many of the things Slater said in it.  Several interviewees in the book later claimed to have been misquoted and Slater's own sanity is called into question several times by the things she says and does.  Personally, I thought that made the book more interesting - I would have enjoyed a dusty academic fact-roll far less than I enjoyed Opening Skinner's Box.

MediaGLOW: Organizing Photos in a Graph-based Workspace

Summary:
This short paper from IUI 2008 by Girgensohn et al. presents MediaGLOW (Graph Layout Organizing Workspace), a photo-browsing system that groups user photos and displays them via a graph-based layout algorithm that places items in stacks and local neighborhoods.  MediaGLOW maintains a graph whose nodes are photos and photo stacks; edges act like springs, weighted by the similarity distances between the photos they connect.  These distance values can be calculated in a number of ways - photo creation time, geographic location, and visual similarity.  Once the user groups photos into a stack, the stack is surrounded by a "neighborhood" of similar photos.  The neighborhoods are represented by "halos" that show up "hot" (red) or "cold" (blue) based on how strongly related they are.
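
A hedged sketch of the spring idea: if edge rest lengths are set to pairwise dissimilarities and the graph is relaxed iteratively, similar photos settle near each other. The distances, constants, and update rule below are invented stand-ins for the paper's actual layout algorithm:

```python
# Hypothetical spring relaxation: photos as nodes, spring rest lengths
# proportional to dissimilarity, positions nudged toward rest lengths.

import math, random

# Pairwise dissimilarity between four photos (0 = identical; invented).
dist = {
    ("a", "b"): 0.2, ("a", "c"): 0.9, ("a", "d"): 0.8,
    ("b", "c"): 0.7, ("b", "d"): 0.9, ("c", "d"): 0.1,
}

random.seed(0)
pos = {n: [random.random(), random.random()] for n in "abcd"}

def relax(steps=500, k=0.05):
    """Move each pair along its spring toward the spring's rest length."""
    for _ in range(steps):
        for (u, v), rest in dist.items():
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d = math.hypot(dx, dy) or 1e-9
            # Positive when stretched, negative when compressed.
            force = k * (d - rest)
            ux, uy = dx / d, dy / d
            pos[u][0] += force * ux; pos[u][1] += force * uy
            pos[v][0] -= force * ux; pos[v][1] -= force * uy

relax()
for n, (x, y) in pos.items():
    print(f"{n}: ({x:.2f}, {y:.2f})")  # similar photos end up close together
```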


The authors conducted a user study of MediaGLOW in which participants were given 450 geo-tagged photos and asked to place them into five categories, then choose three photos from each category to place into a travelogue.  The overall results showed that traditional interfaces were more efficient for the task than MediaGLOW, but that MediaGLOW was more "fun" to use.


Discussion:
MediaGLOW does a good job of visually associating photos and seems like a fun way to navigate a photo library.  The problem with it (for me) is that the layout would get confusing and cluttered very quickly as the number of photos in the library increased.  When combined with some sort of touch interface, I think MediaGLOW could be a very powerful tool for photo browsing and organization.

Detecting and Correcting User Activity Switches: Algorithms and Interfaces

Summary:
In this paper from Oregon State University’s School of EECS, authors Jianqiang Shen, et al. present an updated version of an interface they had previously designed to detect and catalog switches in user activity, called TaskTracer.  TaskTracer applies machine learning methods to associate sets of resources with a particular user activity and make those resources more readily available to the user.  In this context, the term “resources” includes documents, folders, email messages, contacts, and web pages.  TaskTracer configures the desktop in several ways to make these resources easy to get to: the task explorer presents a unified view of all resources associated with the current activity; the folder predictor modifies the Windows Open/Save dialogs by defaulting to folders associated with the current activity and adding shortcuts to them; a time reporting feature lets the user see how much time was spent on each activity in a given period; finally, TaskTracer automatically tags incoming and outgoing emails associated with the activity.  The first version of TaskTracer had several problems, including incorrect associations, unnecessary interruption of users with dialog boxes, and very slow learning algorithms.  TaskTracer2 fixes these issues by implementing an improved association engine, a desktop state estimator, a more intelligent switch detector, a notification controller that minimizes user interruption cost, and a clearer two-panel UI.
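
The paper's association engine is far more sophisticated, but a toy version of "associate resources with activities, then predict the current activity" could look like this (all file names and activities are invented):

```python
# Rough sketch (not the authors' actual engine): score the current
# window title against bags of words learned from resources the user
# previously filed under each activity.

from collections import Counter

# Resources the user has associated with each declared activity.
activities = {
    "CHI paper": ["chi2010_draft.doc", "related_work_notes.txt",
                  "reviewer comments - chi paper.eml"],
    "Teaching": ["cs361_syllabus.doc", "homework3_grades.xls",
                 "lecture 12 slides.ppt"],
}

def tokens(name):
    """Lowercase alphanumeric word tokens from a resource name or title."""
    word, out = "", []
    for ch in name.lower():
        if ch.isalnum():
            word += ch
        elif word:
            out.append(word); word = ""
    if word:
        out.append(word)
    return out

# Per-activity word counts: a stand-in for a learned association model.
model = {a: Counter(t for r in res for t in tokens(r))
         for a, res in activities.items()}

def predict_activity(window_title):
    """Pick the activity whose vocabulary best overlaps the title."""
    title_tokens = tokens(window_title)
    return max(model, key=lambda a: sum(model[a][t] for t in title_tokens))

print(predict_activity("chi2010_draft.doc - Microsoft Word"))  # CHI paper
print(predict_activity("homework3_grades.xls - Excel"))        # Teaching
```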

The researchers conducted a study with two people, a “power user” who recorded 4 months of data, and a second user who used the system for 6 days.  Overall, the participants found that TaskTracer2 rarely made an incorrect prediction, though it didn’t always make exactly the right one.

Discussion:
I think the user study for this application speaks the most clearly in regards to how I feel about TaskTracer: the “power user” found it very useful and accurate and was even described by the researchers as “fairly careful about declaring switches.”  I think associating resources with particular tasks is a great idea (like a more powerful, dynamic version of Windows’ “Recent Documents” feature), but having to explicitly declare when I’m switching activities would annoy me very quickly, especially if the system started pestering me about it with dialog boxes.  I think TaskTracer is best suited to the sort of conscientious power users that the researchers studied; the average user probably wouldn’t reap enough benefit for all the explicit task-switch declaration to be worth it.

The Inmates Are Running the Asylum (Chapters 8-14)

I think it's safe to say that Alan Cooper's tone softens somewhat in the second half of this book.  While the first seven chapters concerned themselves primarily with calling out the hubris and generally dickish behavior of programmers, the latter seven offered constructive suggestions and real-world examples of how to fix the problems present in the software industry.  It was refreshing to see solutions offered to balance out all the acidic complaining.  His concept of developing for "personas" rather than the generic "user" has a lot of merit - I even found myself using this process with my project team when developing our sketch application.

As I stated in my summary of the first seven chapters, I think a lot of the things Cooper has to say in this book lack the impact in 2010 that they had in 1999 largely because so much of it is part of the industry now.  Companies do focus on interaction design.  Programmers aren't the huge jocks they used to be (at least not to as large a degree).  The software we have now is better than what we used to make.  Is some of it still dancing bearware?  Probably.  I don't think that kind of software will ever completely disappear.  However, I think it's safe to say at this point that (forgive me) the inmates are no longer running the asylum.

Annotating Gigapixel Images

Summary:
Qing Luan et al. present their developments in the annotation of very large (gigapixel, i.e., billions of pixels) images in this paper.  Their aim was to augment the pan-and-zoom interface used by applications like Google Earth and Virtual Earth with visual and audio annotations, driven by an intelligent rendering system that takes into account the viewer's perceptual distance from the objects being annotated.  They mention related work in areas like zoomable UIs, map labeling, human psychophysics, and augmented reality systems.  The system they developed runs on HD View, and its annotations consist of text labels, audio loops, or narrative audio.
The system gauges viewer perspective, depth, and field of view relative to each annotation; it then assigns strengths to the various annotations based on these elements.  Thus, farther-off text labels are drawn smaller, and farther-off audio annotations play at lower volume.
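
As a rough sketch of how such strength values might drive rendering (the falloff shape and constants here are invented, not the paper's):

```python
# Hypothetical distance-driven annotation strength: weight falls off as
# the viewer's zoom drifts from the annotation's ideal viewing scale,
# and the weight drives both font size and audio volume.

import math

def strength(view_zoom, best_zoom, falloff=1.5):
    """1.0 at the ideal zoom, decaying as the view zoom drifts away."""
    # Work in log space so 2x-in and 2x-out are penalized symmetrically.
    gap = abs(math.log2(view_zoom) - math.log2(best_zoom))
    return max(0.0, 1.0 - gap / falloff)

def render(annotation, view_zoom):
    s = strength(view_zoom, annotation["best_zoom"])
    font_px = 8 + 16 * s           # weaker annotations draw smaller...
    volume = s                     # ...and their audio loops play quieter
    return annotation["label"], round(font_px), round(volume, 2)

annotations = [
    {"label": "harbor",     "best_zoom": 1.0},  # visible when zoomed out
    {"label": "lighthouse", "best_zoom": 8.0},  # a close-up detail
]
for zoom in (1.0, 8.0):
    print(f"zoom {zoom}:", [render(a, zoom) for a in annotations])
```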


Discussion:
I haven't played with Google Earth very much, but according to the authors this system is a lot like it, so I think I have a pretty good idea of how the interface works.  I think it's a great idea to render annotations dynamically based on viewer position -- if all available annotations were rendered simultaneously on a map of the United States it would be completely unreadable.  Hiding or showing labels as the user pans and zooms encourages the user to explore the image to see what annotations can be uncovered.  This would be great if applied to an educational setting where children could browse a map of, say, Philadelphia and read excerpts from historical documents or hear audio clips about important locations.

Edge-Respecting Brushes

Summary:
In this paper, Dan R. Olsen Jr. and Mitchell K. Harris of Brigham Young University's Computer Science Department discuss various methods of making brushes for image-editing programs like Photoshop "smarter" by utilizing least-cost algorithms and edge-respecting implementations.  
The authors discuss five prior techniques that framed their research -- flood fill (fills all adjacent pixels the same color until a clearly-defined edge is reached), boundary specification (the "lasso" tool), tri-maps (user-specified map of foreground, background, and unused pixels), bilateral grids (color difference-aware), and quick select (inferred selection from a brush stroke).  The edge-respecting brush they developed takes elements of these techniques and refines them with least-cost and alpha computation algorithms.
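
The paper's actual cost and alpha formulas aren't reproduced here, but the general flavor -- grow the brush with a least-cost flood so that crossing a strong edge is expensive -- can be sketched like this (the image values and cost function are invented):

```python
# Hypothetical least-cost (Dijkstra-style) brush flood: stepping across
# a big intensity difference costs a lot, so the selection hugs edges.

import heapq

# A tiny grayscale "image": a bright region meeting a dark region.
img = [
    [200, 205, 210,  40,  35],
    [198, 202, 208,  38,  30],
    [201, 199, 206,  42,  36],
]
H, W = len(img), len(img[0])

def brush_alpha(seed, budget=60):
    """Alpha mask: 1 where the seed is reachable within a cost budget."""
    cost = {seed: 0}
    heap = [(0, seed)]
    while heap:
        c, (y, x) = heapq.heappop(heap)
        if c > cost.get((y, x), float("inf")):
            continue                       # stale heap entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < H and 0 <= nx < W:
                # Crossing a strong color edge is expensive.
                step = abs(img[ny][nx] - img[y][x]) + 1
                nc = c + step
                if nc <= budget and nc < cost.get((ny, nx), float("inf")):
                    cost[(ny, nx)] = nc
                    heapq.heappush(heap, (nc, (ny, nx)))
    return [[1 if (y, x) in cost else 0 for x in range(W)]
            for y in range(H)]

# Painting at (1, 1) selects the bright region but stops at the edge.
for row in brush_alpha((1, 1)):
    print(row)
```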


A user study was conducted with 9 subjects who were asked to manipulate 6 images each.  The edge-respecting brush achieved 99% edge agreement -- about 3% higher on average than the traditional lasso and snap tools.

Discussion:
I wish Photoshop already had something like this.  Lasso selection is a painful process that requires Keebler-elf precision and saintly patience to yield accurate results, and most snap tools don't perform well enough for my needs.  Being able to paint along an edge without fear of messing up a critical part of the image would make creating layers much easier, and generally speed up the image editing process.  
EDIT:  I found out a few days ago that Photoshop CS5 is actually implementing something like this with their "content-aware fill" tool.  Awesome!

Foldable Interactive Displays

Summary:
Authors Johnny Chung Lee and Scott E. Hudson of Carnegie Mellon's HCI Institute and Edward Tse of Smart Technologies discuss their research on foldable displays in this paper.  Most "flexible display" prototypes reported to date are based on OLED (organic light-emitting diode) technology, which is still limited in its implementation and offers no form of touch input.  The foldable displays discussed by the authors are actually flexible screens with embedded infrared (IR) LEDs that allow a camera/projector combination to track the display and project an appropriately-sized image onto it.  The IR LEDs basically function as fiducial markers in an augmented-reality system where digital images (in this case, whatever is supposed to show up on the display) are superimposed over real life.  The authors discuss the advantages and drawbacks of four main foldable shapes -- the newspaper, scroll, fan, and umbrella -- as well as how IR tracking is accomplished on each shape.
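
To give a feel for the fiducial idea (a simplification, not the authors' tracking pipeline): with two IR LEDs at known positions on the fabric, the screen's translation, scale, and rotation can be recovered from their camera coordinates and used to place the projected image. All coordinates below are invented:

```python
# Hypothetical two-point fiducial tracking: recover a similarity
# transform for the screen and map screen points into projector space.
# Real systems track more points per shape than this.

import math

# LED positions in the display's own coordinate frame (two corners).
LED_A = (0.0, 0.0)
LED_B = (1.0, 0.0)     # one unit apart along the top edge

def pose_from_leds(cam_a, cam_b):
    """Similarity transform (translation, scale, angle) from two points."""
    dx, dy = cam_b[0] - cam_a[0], cam_b[1] - cam_a[1]
    scale = math.hypot(dx, dy) / math.hypot(LED_B[0] - LED_A[0],
                                            LED_B[1] - LED_A[1])
    angle = math.atan2(dy, dx)       # screen rotation in the camera frame
    return cam_a, scale, angle

def project_point(p, pose):
    """Map a point in screen coordinates into camera/projector space."""
    (tx, ty), s, th = pose
    x = tx + s * (p[0] * math.cos(th) - p[1] * math.sin(th))
    y = ty + s * (p[0] * math.sin(th) + p[1] * math.cos(th))
    return round(x, 2), round(y, 2)

# Camera sees the two LEDs here: the screen is shifted, scaled, tilted.
pose = pose_from_leds((100.0, 50.0), (241.4, 191.4))
print(project_point((0.5, 0.0), pose))   # center of the screen's top edge
```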

Discussion:
When I read the title of the paper, I assumed that the displays would be driven by OLED and that the authors had somehow solved the problem of touch input on such a display.  Alas, it was not to be as they basically rigged an augmented-reality system together with IR LEDs and a Wii-mote which allowed camera-based tracking and projection onto the static, non-interactive surface.  The displays they discussed could have been made of anything -- paper, cloth, plastic, whatever.  The whole system is driven by the camera tracking.  I suppose I shouldn't have gotten my hopes up so high, but it's difficult to see how this system could be used anywhere outside of a classroom or research facility with the necessary technologies already installed.  The point of smaller, mobile displays is to be able to carry them with you without the need for external equipment.  The only real advantage I can see here (other than the orientation-sensitivity, which was kind of cool) is that a user would be able to carry a movie-screen-sized display in a backpack or even their pocket and set it up with minimal effort, assuming that the IR-tracking and camera equipment is available wherever they planned on watching a movie.

Inky: A Sloppy Command Line for the Web with Rich Visual Feedback

Summary:
In this paper, Robert C. Miller, et al. (researchers at MIT CSAIL and the University of Southampton) present Inky (short for "internet keywords"), which is basically a command-line interface that provides shortcuts to common web browser tasks.  It functions as a hybrid between a command line and a GUI, giving dynamic visual feedback to the user as they type and accepting sloppy syntax, which frees the user from having to learn rigid command forms.  Keywords can be provided in any order, replaced with synonyms, or entered in a variety of ways; the Inky interpreter attempts to match the user's input against available commands.  Inky is a Firefox extension built using HTML, CSS, JavaScript, Java, and XML.  The rest of the paper covers construction, command, and content-highlighting specifics, with details about the keyword interpreter, functions, and usage.
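
A minimal sketch of what "sloppy" matching might amount to -- rank commands by keyword overlap, order-independent, with a synonym table. The commands and vocabulary below are invented, not Inky's:

```python
# Hypothetical sloppy interpreter: match user keywords against each
# command's vocabulary in any order and rank candidates by overlap.

COMMANDS = {
    "book flight": {"book", "flight", "fly", "plane"},
    "reserve room": {"reserve", "room", "hotel", "book"},
    "check weather": {"weather", "forecast", "check"},
}
SYNONYMS = {"flights": "flight", "hotels": "hotel", "rooms": "room"}

def interpret(user_input):
    """Return commands ranked by how many input keywords they absorb."""
    words = {SYNONYMS.get(w, w) for w in user_input.lower().split()}
    ranked = sorted(COMMANDS,
                    key=lambda c: len(words & COMMANDS[c]),
                    reverse=True)
    # Feedback for the best few interpretations, like Inky's popup.
    return [(c, len(words & COMMANDS[c])) for c in ranked]

print(interpret("flight book boston"))
# -> [('book flight', 2), ('reserve room', 1), ('check weather', 0)]
```
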
The authors conducted a small user study with seven participants and found that 95 out of 131 user commands were correctly interpreted, and that Inky was fairly easy to learn based on user response.  Most of the commands that were not correctly interpreted were simply attempts by users to invoke commands that did not exist (by entering website names, for example).  The researchers used their findings about Inky to begin development on a second prototype called Pinky (Personal Information Keywords), which focuses on lightweight data capture for personal information management.


Discussion:
My first thought about this application was that it seemed an awful lot like Quicksilver for Mac -- an application launcher/wizard's tool that interprets user input to execute commands and launch applications.  This paper even mentions Quicksilver, in fact, when talking about how Inky is invoked.  It also shares many similarities with Mozilla's own Ubiquity.  Like, a lot of similarities.  As in, these tools are nearly identical in function and aim.  Having used Quicksilver and Ubiquity pretty extensively, I think a command-line-style interface for quick command execution is a great idea that can really streamline a user's experience (if they are willing to take on a small learning curve).

Understanding the Intent Behind Mobile Information Needs

Summary:
In this paper, Karen Church and Barry Smyth (of Telefonica Research and University College Dublin, respectively) lay out their study of users' mobile content needs and mobile contexts.  The explosion of mobile device popularity across the world (3.5 billion subscribers in 2007) has led to new patterns of information retrieval and new needs for users.  The authors analyze previous work dealing with what mobile users search for and why; that research indicated that 50% of the top mobile search queries were related to adult content (though only 8% of all users in the study engaged in any kind of search at all) and that the intent behind a search could usually be attributed to a need for awareness or to status-checking behavior.


The authors conducted a four-week study of 20 participants who were asked to keep a diary of all their information needs while they were at home, work, or on the move.  Participants logged the date and time, location, and information need, as well as any additional comments they had.  The study generated 405 diary entries (approx. 20.3 per person), the majority of which concerned on-the-go conditions for mobile information need.  The researchers created three categories for user intent: informational (how-to's, advice, showtimes), geographical (location-based or directions), and PIM or personal information management (PIN codes, friend requests, to-do lists).  30% of entries were geographical in nature, and 42% of entries were non-informational.  PIM entries represented 11% of the entries.  


Overall, the study indicated that mobile users look for considerably different information than standard Web users, seeming to have a greater need for geographical information such as directions or service locations while on the move; user information needs also seemed to revolve heavily around social interaction (friend requests, questions asked in conversation, status updates, etc.).


Discussion:
The research presented in this paper is very relevant to current trends in technology - users are gravitating more and more to mobile content, and the advent of user-friendly mobile browsing devices like the iPhone and the upcoming iPad will only heighten this trend.  I found the categorizations made by the researchers to be pretty accurate - in the brief period that I owned a web-enabled phone, I used it primarily for getting directions, checking Facebook or Twitter, and managing email.  This falls right in line with the results of the study.  However, in the future I think user content needs will shift more and more toward entertainment in a variety of contexts and locations as more content is pushed to the mobile space and bandwidth increases.

Simplified Facial Animation Control Utilizing Novel Input Devices: A Comparative Study

Summary:
This paper by Nikolaus Bee, Bernhard Falk and Elisabeth André of the University of Augsburg's Institute of Computer Science details new methods and controls for manipulating facial animations.  First, the researchers discuss the advantages and drawbacks of the predominant current methods for facial animation control.  Slider-based GUI systems, for example, are easy to implement and familiar to users, but they afford control over only one parameter at a time and do not typically have an obvious mapping for manipulation.  The authors discuss several previous studies about direct mapping, including data gloves, data suits, and MIDI keyboards; they also detail typical facial expression generation technologies like the Facial Action Coding System (FACS), which has been used in everything from the Lord of the Rings trilogy to Half-Life 2.  The researchers implemented FACS to drive the facial model - "Alfred" - that they used for their study.

Basically, control points on Alfred's facial structure were mapped to various buttons on an Xbox 360 controller or data glove, with three different settings - upper face, lower face without inner lips, and inner lips.
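
A speculative sketch of such a mapping (the axis names, facial parameters, and assignments below are invented placeholders, not the authors' actual bindings):

```python
# Hypothetical gamepad-to-face mapping: analog stick axes drive several
# facial parameters at once, with a mode button cycling among the three
# region settings described above.

MODES = ["upper face", "lower face (no inner lips)", "inner lips"]

# Which facial parameters each mode's stick axes control (invented).
MODE_MAP = {
    "upper face": {"left_x": "brow_l", "left_y": "brow_r",
                   "right_y": "eyelids"},
    "lower face (no inner lips)": {"left_x": "jaw", "left_y": "cheeks",
                                   "right_y": "lip_corners"},
    "inner lips": {"left_x": "lip_press", "left_y": "lip_funnel",
                   "right_y": "lip_part"},
}

class FaceRig:
    def __init__(self):
        self.mode = 0
        self.params = {}                 # facial parameter intensities

    def on_mode_button(self):
        self.mode = (self.mode + 1) % len(MODES)

    def on_stick(self, axis, value):
        """Route a stick axis (-1..1) to whatever the mode maps it to."""
        mapping = MODE_MAP[MODES[self.mode]]
        if axis in mapping:
            self.params[mapping[axis]] = round(value, 2)

rig = FaceRig()
rig.on_stick("left_x", 0.8)      # raise a brow in "upper face" mode
rig.on_mode_button()             # switch to the lower face
rig.on_stick("right_y", -0.5)    # pull the lip corners down
print(MODES[rig.mode], rig.params)
```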

To evaluate the ease of use of this system, the authors recruited 17 subjects who were trained in using the slider and gamepad systems and then asked to create three facial expressions based on a photo.  Overall, the participants found the gamepad more enjoyable, accurate, and satisfying to use than the sliders.  The gamepad also yielded a nearly 30% increase in speed in most cases.


Discussion:
The authors' work in this area is very promising.  The obvious application for such a system would be for users to tweak the facial expression of their Xbox Live avatar as they see fit.  It would also be cool if players could create simple animations for sending messages to friends or other players on the network.  Allowing gamers this level of customization to their characters might increase the affective level of interaction between players.

Emotional Design

It's been a very fascinating thing to witness the complete 180 that Donald Norman has made over the course of nearly two decades between the publishing of The Design of Everyday Things and 2004's Emotional Design.  In the former, he made some very strong (and very cranky) arguments about the cardinal importance of functionality in the design of products, and frequently made fun of the sort of products and structures that "must have won a design award."  Critics of Norman have stated that if we were all to follow the principles in TDoET we would have usable but ugly designs.


But after years of research in the area of human emotions, Norman embraces the very sort of products he so readily dismissed in his previous book -- like the juicer that dominates the front cover of Emotional Design.  It is pretty to look at, certainly, and it does have some functional aspects (the lowest point of the "rocket" body serves as a drip point for juice) but according to its designer it is not actually intended to be used for making juice.  Norman's point with this example (and the rest of his book) is that emotional and sensory appeal in design takes precedence over functionality.


Norman breaks emotional appeal down into three main categories: visceral, behavioral, and reflective.  Designs that appeal on the visceral level are the ones that elicit a base, visually-stimulated response.  Behavioral design focuses on a product's ease of use and the pleasure derived from using it.  Reflective design considers the rationalization and intellectualization of a product -- does it tell a story or make its owner think more deeply about it?  Norman's talk at the 2003 TED Conference illustrates these three principles pretty well (and will take you a lot less time than reading this book).


I enjoyed reading this book and finding out more about how emotions are so critical to the way that we work as humans.  Norman was much less of a blowhard in this book and I had an easier time taking him seriously.  Well, at least until I got to the final chapter on the future of robots -- that was a major left-turn that felt like the end of the movie A.I. Artificial Intelligence; things were going fine and then all of a sudden I was like, "wait, did that really just happen?"  It felt very out of place (though I suppose it's kind of a segue into his next book, The Design of Future Things) and was the most weakly-argued of his chapters.  Overall, though, I enjoyed it much more than TDoET, but the last chapter has me a little scared of what The Design of Future Things has in store.

MusicSim: Integrating Audio Analysis and User Feedback in an Interactive Music Browsing UI

Summary:
Two researchers from the University of Munich, Ya-Xi Chen and Andreas Butz, conducted research in the area of music information retrieval (MIR) on integrating audio content analysis with metadata-based interfaces, producing a program called MusicSim.  In their paper, they address users' increasing need to browse music in more diverse ways and the difficulty that non-experts have in navigating many current music browsing interfaces.  The researchers advocate active user control of, and feedback to, the system to improve performance, integrating audio analysis in the UI and providing visual assistance for music browsing.  MusicSim presents songs clustered by content similarity and is controlled by the user through mouse operations; it supports text search, playlist generation, album-art display, and color-coded genre visualization.  The user can give the system feedback by splitting or merging album clusters or adjusting a slider in the recluster control panel.  MusicSim uses low-level audio analysis tools like jAudio to compute similarity between songs.
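
As a hedged sketch of content-based clustering: the feature vectors below stand in for jAudio output, and the simple threshold clustering is a simplification of whatever MusicSim actually does:

```python
# Hypothetical similarity clustering: songs as small feature vectors,
# merged greedily whenever two clusters contain songs closer than a
# threshold (single-link style). All values are invented.

import math

features = {                 # e.g., (tempo-ish, brightness-ish, energy)
    "song_a": (0.9, 0.8, 0.7),
    "song_b": (0.85, 0.82, 0.72),
    "song_c": (0.2, 0.3, 0.25),
    "song_d": (0.22, 0.28, 0.3),
}

def dist(u, v):
    return math.dist(features[u], features[v])

def cluster(threshold=0.3):
    """Greedy single-link clustering: join songs closer than threshold."""
    clusters = [{s} for s in features]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(dist(u, v) < threshold
                       for u in clusters[i] for v in clusters[j]):
                    clusters[i] |= clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

print(cluster())   # -> two clusters: {a, b} and {c, d}
```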



MusicSim was tested by 36 study participants, more than half of whom found the visualization UI very useful.  However, the system did have a few problems, such as a confusing graph view and the fact that genre understanding varies wildly from user to user and should therefore be used as additional information rather than as the basis for clustering.

Discussion:
I think MusicSim is a great idea that provides a good starting point for future work in music collection visualization and management.  In fact, a system similar to MusicSim was proposed by my senior capstone group for our semester-long project not too long ago.  We felt that users wanted as much direct interaction with their libraries as possible, and drafted a multitouch system that would utilize a large screen (thereby solving MusicSim's difficulty with the cramped graph view).  The clustering functionality of MusicSim is really useful for large libraries because it can help users locate what they want more quickly, and though using album art to represent music items is a no-brainer (and not exactly the most original idea in the world) it is nevertheless demonstrably effective and easy for users to identify with.

Passages Through Time: Chronicling Users' Information Interaction History by Recording When and What They Read

Summary:
In this paper, Karl Gyllstrom of the University of North Carolina Computer Science Department outlines his research in building interaction histories for users to improve information retrieval and characterize document activity.  His system Passages collects data from text-based desktop artifacts like web pages, emails, and other files, as well as precise timing information about these items' visibility.  These data combine to provide a very detailed record of what content was viewed and when.  This record can then be used to answer user questions about their content such as, "when did I read this paper," or "which documents have I spent the most time composing?"


The system is composed of two subsystems: the tracing subsystem, which records event streams in the GUI and filesystem, and the retrieval subsystem, which handles artifact requests ("when was the last time I read this document?") and temporal retrieval ("what files did I work on the most during this time period?").  The author goes on to detail the algorithms behind these subsystems and compares them to existing approaches, which do not currently consider how long users view material and give as much weight to glanced-over pages as to ones that have been thoroughly pored over.  The author's user study examined 15 participants' use of Passages over a total of 14.27 hours of activity; the results indicated that the system was well-suited to adding nuance and precision to document history requests.
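
A small sketch of the tracing idea, assuming a simple stream of window-focus events (the real system records much finer-grained GUI and filesystem streams; the events below are invented):

```python
# Hypothetical visibility accounting: log focus events with timestamps,
# then answer "how long did I read X?" by summing the intervals during
# which X was the focused document.

events = [  # (timestamp in seconds, document gaining focus)
    (0,    "draft.tex"),
    (600,  "related_paper.pdf"),
    (2400, "draft.tex"),
    (3000, None),          # everything minimized / user away
]

def reading_time(doc, end_time=3000):
    """Total seconds `doc` spent as the focused document."""
    total, current, since = 0, None, 0
    for t, focused in events + [(end_time, None)]:
        if current == doc:
            total += t - since          # close the open interval
        current, since = focused, t
    return total

print(reading_time("draft.tex"))          # 600 + 600 = 1200 seconds
print(reading_time("related_paper.pdf"))  # 1800 seconds
```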


Discussion:
For the super-organized, this is probably a dream come true.  Being able to know which documents you were reading during a certain period of time or finding out how much time you spent reading a paper could potentially be very useful.  I found a couple of the author's assumptions about the algorithm to be somewhat ill-founded; for starters, it is based on whether content is visible as the active window.  I have a 27" monitor and frequently have multiple documents open simultaneously, with one or neither set as the active window.  I may be reading either of them without that time or content being logged by the system, because of the way Passages is set up.  This obviously isn't going to be a problem for everyone, but I feel like the productivity hawks who would use something like this would probably have large or multiple displays, thus rendering the system less effective.

More than Face-to-Face: Empathy Effects of Video Framing

Summary:
In this paper, David T. Nguyen of Accenture Technology Labs and John Canny of UC Berkeley's Institute of Design present their research about the effectiveness of different videoconferencing techniques in the context of empathic interaction.  The authors first cover some of the basic benefits of videoconferencing (time and money-saving for business interactions) as well as some of its shortcomings (lower degree of trust, decreased measure of non-verbal cues, disparities in gaze matching).  They mentioned several previous bodies of work (mainly pertaining to gaze preservation) and established their central hypothesis -- that the correct framing of the subjects in a video conference could reduce or eliminate any disparities between it and a face-to-face meeting. A basic hierarchy was established: 
  • dyads (pairs of participants) in face-to-face meetings exhibit the highest level of empathy
  • dyads in upper-body-framed video meetings exhibit the next highest level of empathy
  • dyads in head-only-framed video meetings exhibit the lowest level of empathy
Based on the findings of their study (which consisted of 62 test sessions with the various types of meeting/framing), the authors then presented some design guidelines for video systems that would ensure the highest level of dyad empathy in video conferences.  These basically consisted of measures that framed the participants' upper bodies and allowed the greatest amount of non-verbal body-language cues to be detected.

Discussion:
The old sci-fi B-movie staple of the "videophone" has become rather commonplace now with the inclusion of integrated webcams and chat software in most commercially-available laptops.  Users can place video calls to people across the planet (via Skype or another video conferencing client) and have empathic interactions in real-time.  Based on my own personal experience with video conferencing, I can attest that framing which favors the transmission of non-verbal cues is the most effective; much of the authors' research provided unsurprising results in this regard.  The most intriguing thing to me was the concept of gaze matching.  This is often the one part of video conferencing that I (and high-powered business types around the world) find the most trouble with -- dyads can never really make "eye contact" in a video conference, and some of the systems the authors discussed presented some cool solutions to this problem.

Learning from IKEA Hacking: "I'm Not One to Decoupage a Tabletop and Call It a Day."

Related comments: Jill's blog.

Summary:
Daniela Rosner and Jonathan Bean, two UC Berkeley students, present their findings about the activities of the DIY community, specifically in the area of IKEA furniture "hacking."  These hackers take pieces of IKEA furniture kits and cobble them together in new and interesting ways as a method of artistic expression, practical modification, or just to see what they can create.  The challenge in IKEA hacking is to create something new that has no instructions, though participants have increasingly reached out to other hackers via online communities like instructables.com and ikeahacker.com.  The paper gives an overview of IKEA hacking and delves a little into the rationale for it -- DIY-types enjoy the change of pace provided by working with their hands to create something physical, in contrast with their daily jobs; the presence of IKEA products in homes and workplaces has become so prevalent that "hacking" provides participants a way to personalize their furniture and distinguish it from others; and displaying creativity and ingenuity in hacking can be very rewarding to the participant.

Discussion:
Being a participant in (or at least a follower of) one particular DIY community -- music -- I can understand and appreciate the appeal of creating something new from readily-available parts and seeking to put together something totally individual without any kind of manual or assistance.  One thing the paper hit on very well was the online community aspect of hacking.  Collaboration is a huge part of any DIY community -- participants can inspire and be inspired by the work of others, share ideas, show off their creations, and get help if necessary.  This kind of collaborative, supportive environment really fuels participation.  One of the most interesting points of this paper was the idea that IKEA has no style; that it has become so ubiquitous that any style it held has disappeared.  This "lack of style" really gives DIY-types a wide-open playing field to create something that would never hit a store or showroom floor.

The Inmates Are Running the Asylum (Chapters 1-7)


My initial reaction to Alan Cooper's The Inmates Are Running the Asylum was almost the same as to Donald Norman's The Design of Everyday Things -- these are the irrelevant rants of a cranky old professor who has nothing better to do than complain and would sooner send mail via carrier pigeon than learn to use "one of these newfangled machines."  I'm sure this was due in part to his less-than-hospitable treatment of programmers and his unapologetic tone, and part due to my tendency to not take seriously as a technology authority an individual who can't open a Word document.

As I read on, however, my critical attitude softened somewhat.  Some of his arguments (particularly concerning "dancing bearware" and "cognitive friction") seemed apropos to the modern discussion of software design.  A lot of what he argued was really a paraphrasing of Norman's points in TDoET, applied almost exclusively to interface design.  My annoyance (and sometimes anger) at his vitriol aimed toward programmers turned to pity over the course of the book as I slowly started to put myself in his place -- and time -- and realized that most of the mistakes he rails against simply aren't happening anymore.  
The software of the turn of the century was, as Cooper says, written and designed primarily by programmers who were as much concerned with finding discreet, creative ways to blame users for the problems with their programs (giving themselves an out for responsibility) as they were with making money from the stuff they continued to shovel onto hapless computer owners.  Obviously, this pattern revealed itself when the dot-com bubble burst and a large contingent of these shovelware developers disappeared as quickly as they had come.  No doubt much of Cooper's readily apparent bitterness stems from his time working with such companies.

The developers that remained, however, learned their lesson and have started to design in a much more user-centric manner.  Cooper's arguments didn't seem relevant to me because they aren't relevant -- companies are already doing the things he's talking about.  Additionally, I felt that some of his concerns didn't apply to a society where a large part of the population has grown up using computers (though he would undoubtedly classify our generation as "scarred" to the point of numbness by our experiences with poorly-designed interfaces).  I don't know how large a part his writings and work influenced the industry, but I do know that a lot has changed since this book was published.  Apple, Microsoft, and other developers spend millions of dollars on interface design for every product they make, and though I'm sure Alan Cooper may still think they're far from perfect, the software products of today are much more user-centric than those of the past.

The Application of Forgiveness in a Social System Design

Related comments: Jill's blog.


Summary:
In this paper, Asimina Vasalou, Jens Riegelsberger and Adam Joinson (a joint research team from the University of Bath and Google UK) lay out a framework for designing "reparative social systems" by defining the process of forgiveness in social interaction.  They focus especially on the case of the online user who may offend unintentionally or accidentally; reparative systems may more easily allow trust to be re-established and the user returned to good standing in the online community.  They define forgiveness as "the victim's prosocial change towards the offender as s/he replaces these initial negative motivations with positive motivations."  These positive motivations can be influenced by a number of factors: offense severity, intent, apology, reparative actions taken, non-verbal expressions, dyadic history (previously rewarding interactions between two users), and history in the community.  Victims assess all of the above factors when deciding whether to forgive the offender.  They expound further on the definition of forgiveness in the following ways:

  • Forgiveness cannot be mandatory (it does not follow automatically after an offender's penance)
  • Forgiveness is not unconditional (rather, it follows the offender's acknowledgement of responsibility and amends)
  • Forgiveness does not necessarily repair trust or remove accountability
Forgiveness has its benefits -- offenders can relieve guilt or shame through apology or reparative action, victims may reduce or release their anger toward the offender, and it can empower an online community to learn by example and move toward self-moderation.  


From the definitions presented, the researchers arrived at the following provisions for reparative design in social systems (a toy data-model sketch follows the list):

  • Respect the dyadic nature of forgiveness (overcome the asynchronous nature of online forum communication by notifying offenders and providing a grace period for response)
  • Support the motivating factors of forgiveness (provide systems that allow the victim and community to measure an offender's previous and current actions, intent, apology, etc., and give the offender tools to provide an adequate apology)
  • Increase public awareness (make the offense public to educate the community)
  • Build flexibility within and around forgiveness (allow victims to retract decisions of trust and accountability if desired)
  • Design interventions to lower attributions (assess an offender's previous history to prevent victims from jumping to conclusions about an offender's motivation)
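
As a speculative sketch (field names and rules invented), a forum might encode some of these provisions in an offense record that tracks the apology, a grace period, and a revocable forgiveness decision:

```python
# Hypothetical data model for a reparative social system: forgiveness
# is not mandatory, not unconditional, and remains retractable.

from dataclasses import dataclass, field
from typing import Optional
import time

@dataclass
class Offense:
    offender: str
    victim: str
    severity: int                      # 1 (minor) .. 5 (severe)
    created: float = field(default_factory=time.time)
    grace_period: float = 72 * 3600.0  # offender's window to respond (s)
    apology: Optional[str] = None
    forgiven: bool = False

    def apologize(self, text: str) -> None:
        """Offender acknowledges responsibility within the grace period."""
        if time.time() - self.created <= self.grace_period:
            self.apology = text

    def forgive(self) -> None:
        # Not unconditional: requires an apology on record first.
        if self.apology is not None:
            self.forgiven = True

    def retract_forgiveness(self) -> None:
        """The flexibility provision: the victim may revisit the decision."""
        self.forgiven = False

o = Offense("user_a", "user_b", severity=2)
o.forgive()                        # no effect: no apology yet
o.apologize("I misread your post and responded unfairly. Sorry.")
o.forgive()
print(o.forgiven)                  # True
```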


Discussion:
The research presented in this paper certainly gives a thorough definition of forgiveness as well as methods of enacting it; the problem with putting such systems in place is that online communities willing to self-moderate and enact these kinds of reparative systems are probably few and far between.  The researchers mention "trolls" all too briefly at the beginning of the paper and don't seem to realize that a troll can simply create another account on a forum after being banned for offensive or abusive conduct.  The systems they mention apply most directly to buyer/seller sites like Amazon, eBay or Craigslist where buyer/seller trust is paramount, and they do a good job of mentioning their system's applications in this respect.  However, they seem to view these systems as being more widely applicable than they really are.

“Pimp My Roomba”: Designing for Personalization

Related comments: Zach's blog

Summary:
In this paper, JaYoung Sung, Rebecca E. Grinter and Henrik I. Christensen of the Georgia Institute of Technology present the results of their research on the effects of personalization on user interaction with Roombas.  They postulated that people will naturally personalize their Roombas (or any other object, really) if encouraged by the design of the object or given a set of tools to do so; they also wanted to see if people's experiences with the device were positively impacted by their personalization.  The researchers conducted a study of 30 households in the Atlanta area, each of which was given a Roomba.  15 of the households were given "personalization toolkits" to use on their Roombas, which included stickers, lettering sets, coupons for Roomba skins, and a booklet showing how other users had customized their Roombas in the past.  The other 15 households were given no toolkit or any other indication that customization was possible or desirable.

It turns out that 10 of the 15 households with toolkits used them or went online to order additional skins with the included coupons, but of those, only 6 actually customized the Roomba.  There were three primary motivations behind this customization: expressing the Roomba's identity, showing its value in the household, and making it either stand out from or blend in with its home environment.  For example, some families gave their Roomba a name and decorated it according to its gender to give it some personality.  Other families decorated the Roomba as a "reward" or out of "gratitude" for the services it provided, to show its worth to the family.  Finally, some of the households decorated Roomba to make it stand out from the carpet or decorations, and others tried to make Roomba fit into the aesthetic of their home.  The end result of the customization was a feeling of connectedness to Roomba, with householders seeing it as "their" robot instead of "a" robot.  Conversely, none of the 15 households that did not receive toolkits decorated Roomba.

Discussion:
Personalization is ingrained in our society.  Think about it -- what's the first thing most users do with a new computer?  More likely than not they will change the desktop display image to something they want, to make the computer more theirs.  Customizable skins are available for every device imaginable, from laptops to cell phones and game consoles.  The paper shows that users will customize a device to make it more personal, as long as easy methods to do so exist.  Some of the families talked about ordering more skins but found the process too complicated.  The families without kits, obviously, didn't know that customization was an option, and therefore didn't personalize Roomba.  In fact, after reading this paper, I decided I want a Roomba.  This Roomba.  If you don't think he's cute you don't have a heart.

What's Next?: Emergent Storytelling from Video Collections

Related comments: no other posts on this paper, comment on Kerry's blog.


Summary: 
Three researchers from MIT's Media Laboratory (Edward Yu-Te Shen, Henry Lieberman and Glorianna Davenport) developed Storied Navigation, a video editing system that lets users quickly search and browse a video library to compose a story.  The system works by annotating video clips with sequence information and story attributes; users can then search with words or sentences to find clips with relevant characters, themes, emotions and story elements.  In this way, users simply type in an emotion or story sequence they wish to convey and the program returns clips matching their criteria -- the user doesn't even have to be familiar with the video library to be able to tell a story.  Storied Navigation uses natural language processing (NLP) to parse user input for story elements, characters, etc.  It features an "edit-by-typing" function that sequences video clips based on a story the user types in English, and an "edit-by-recommendation" feature that can either find similar alternatives to selected clips or provide clips that will continue the current story sequence ("what's next").
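
A minimal sketch of the retrieval idea, with keyword overlap standing in for the system's fuller NLP (the clip library and annotations are invented):

```python
# Hypothetical story-driven clip retrieval: clips carry annotations
# (people, emotions, themes); a typed sentence is reduced to keywords
# and clips are ranked by annotation overlap.

clips = {
    "clip01": {"maria", "arrives", "hopeful", "city"},
    "clip02": {"maria", "argument", "angry", "apartment"},
    "clip03": {"sunset", "calm", "harbor"},
}

STOPWORDS = {"a", "an", "the", "is", "in", "and", "then", "at"}

def keywords(sentence):
    return {w.strip(".,").lower() for w in sentence.split()} - STOPWORDS

def find_clips(sentence, top=2):
    """Rank clips by how many of the sentence's keywords they match."""
    kw = keywords(sentence)
    scored = sorted(clips, key=lambda c: len(kw & clips[c]), reverse=True)
    return scored[:top]

def whats_next(current_clip):
    """'Edit by recommendation': suggest the clip sharing most annotations."""
    others = [c for c in clips if c != current_clip]
    return max(others, key=lambda c: len(clips[current_clip] & clips[c]))

print(find_clips("Maria arrives in the city, hopeful"))  # clip01 first
print(whats_next("clip01"))                              # clip02 (shares maria)
```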


The researchers conducted two separate studies, the first of which focused on whether the system helped users develop their story threads.  The subjects praised the system's ease of use, helpful features, and powerful search, the improved efficiency with which they could sift through large amounts of video to find what they needed, the ability to compose stories on the fly, and the way the interface helped them understand the logic behind a story.  The second study focused on browsing an unknown video library.  All seven subjects reported that the system helped them find what they wanted, mentioning that the edit-by-typing function was the most useful to them -- some of the subjects even wanted to create stories from the clips that were returned by their queries.


Discussion:
BRILLIANT.  As a video editor, I can tell you that there is nothing more painful than trying to sift through hours of documentary footage to find an appropriate clip -- this tool is literally the coolest thing a documentary filmmaker could ever want (aside from one of these, maybe).  There are times during editing when I even know what all of the footage is (in the event that it isn't dozens of hours of material), and just don't know how to move forward with it.  The Storied Navigation system answers my unspoken question ("what's next?") and even provides multiple ways of filtering the results it provides by matching emotions or placement in the story structure.  The ability to compose stories on-the-fly or quickly change a train-of-thought to another direction greatly streamlines the film editing process.  I could save so much time and tedious browsing by using this system; the "ways to think" that it provides are simply brilliant and ought to spur a storytelling sense in any filmmaker.

A Vehicle for Research: Using Street Sweepers to Explore the Landscape of Environmental Community Action

Related comments: no other posts on this paper, comment on Patrick's blog.


Summary:
In this paper, Paul M. Aoki et al. discuss several key points concerning the gathering of air quality data by citizens, as well as the social and political ramifications of such action.  The authors conducted this research in hopes of refining the development of mobile participatory sensors that ordinary people can use to collect viable data.  This data could be used to augment existing data stations and provide community activist groups with politically relevant evidence for legislative change.  Environmental decision-making is something of a closed-door affair; government agencies (such as the EPA) have established data-gathering facilities that they deem reliable enough to inform legislation, and outside intervention or assistance is typically dismissed.  Three groups typically inform or influence the environmental-legislation process: government bodies, emitters (factories, oil refineries, etc.), and public interest advocates like community groups or the Sierra Club.


This particular study was of the San Francisco Bay Area, which is an ideal location to study air quality due to various historical factors about the area and characteristics of California environmental legislation that make it a sort of "laboratory" for air quality standards.  The primary monitoring station in the Bay Area is operated by the Bay Area Air Quality Management District (BAAQMD, or "Air District").  However, dissenting opinions about how air quality should be measured make agreement about its effectiveness difficult.  Many citizen groups in the area feel that the data models produced by the Air District are an inaccurate reflection of the average citizen's pollution intake because the monitoring stations are too spread out and too far above the ground.  


The data gathered by citizens (or in this case, by sensor packages mounted on street sweeper vehicles) seems immediately advantageous since it is "on-the-ground" data.  However, this approach is also rife with problems.  People may be interested in gathering data, but only for the purpose of influencing others -- they have no interest in learning anything from it.  The quality of the data gathered by less expensive, mobile equipment is also much lower, calling into question whether such data would be valid or useful at all.  Because of disagreements about data and the way it is utilized, government agencies, researchers, and community groups are usually so distrustful of each other that very little progress toward change can be achieved at all.  The researchers concluded that in order to promote successful community action toward environmental change, advocacy groups should connect with collaborative social mapping tools to preserve continuity in long advocacy campaigns, and some means of validation for data collected by non-expert users should be established so that their data is taken seriously.


Discussion:
As with any other hotly-debated policy topic, environmental legislation carries a host of variables that often make it nearly impossible for disputing groups to arrive at an effective solution; in this case the disputing groups are the people (activist groups) and the government (the Air District, and by larger association, the EPA).  The people don't think the government is using "real" enough, "on-the-ground" data, and the government doesn't think the people could possibly gather any accurate, useful data on their own.  What bothered me about this paper is that the researchers never mention their findings regarding the data -- what did the street sweepers collect?  Was the data any good?  Could it have been used to assist government agencies in their analysis of air quality?  We never find out, because the authors are too busy telling us what we already know about policy debates.

Exploring the Analytical Processes of Intelligence Analysts

Related comments: no other posts on this paper, so I commented on William's blog.


Summary:
This paper presents the results of an observational case study of intelligence analysts (IAs) at work, conducted by a group of three researchers from Pacific Northwest National Laboratory (George Chin, Jr., Olga A. Kuchar, and Katherine E. Wolf).  Research cases like this one have gained more interest over the past decade; these efforts attempt to find ways of developing new information technologies and visualization tools to improve or assist intelligence analysis.  The bulk of the paper describes two case scenarios presented to a group of IAs.  The first scenario charted the intelligence-gathering and analysis methodology of the IAs, and the second scenario examined how IAs collaborated in real-time to carry out a group analysis.  In each scenario, IAs were given material similar to what one would find in an actual case, such as group background info, intercepted communications, electronic files, witness statements, etc.  New information was also given periodically through the course of the analysis to see how it would be integrated with the existing information.

The researchers found that the strategies used in analysis varied from IA to IA, but the IAs all followed a similar sequence of steps.  The first step, obviously, is information gathering.  All of the IAs printed out all the case files despite being given electronic copies, and all the IAs proceeded to arrange or order the information in some way so that they could extract relevant facts.  Once data was ordered and facts were extracted, IAs attempted to identify patterns and trends in the evidence.  Most of the IAs displayed relationships in the data visually, either via a hand-drawn diagram or an electronic aid like a graph or spreadsheet.  The weight given to certain pieces of evidence shifted for each IA depending on the credibility of the source or method of obtainment; the IA's judgment in this area was based largely on his or her previous experience.

In the second scenario, IAs collaborated to corroborate data and share resources rather than compare and pass on conclusions; they worked to form a single, more accurate analysis.


From all this observation, the researchers were able to identify a few areas in which technology might aid IAs in their work: computers could auto-generate standard analysis views based on given facts and relationships, improving the speed and efficiency of an IA's analysis; optical character recognition tools could convert oft-used sketches into text or graphs for easy electronic storage and sharing; multiple displays could accommodate the simultaneous viewing of many documents; and integrated link analysis and case management tools could provide more sophisticated pattern-matching so IAs could locate and draw conclusions from past cases more quickly.


Discussion:
Intelligence analysis is a critically important field that gets a pretty wonky dramatization on television (see 24 for evidence), but it nonetheless often requires the snap decisions and life-or-death implications depicted in such mediums.  One of the things I found most encouraging about this study was the willingness of IAs to collaborate with one another, and the results they achieved by sharing facts and helping each other eliminate information.  The single largest problem facing our intelligence community today isn't al-Qaeda or missing Soviet nukes -- it's the lack of collaboration between various agencies and a competitive attitude that weakens our ability to assess and identify threats.  It's in this area, not in multiple displays or workflow management, that technology researchers need to develop new tools.  Something like a secure digital "whiteboard" where information could be shared, edited and annotated across multiple global agencies would be extremely useful in enabling that kind of collaboration.

PrintMarmoset: Redesigning the Print Button for Sustainability

Related comments: Jacob's blog


Summary:
In this paper, HP Labs researchers Jun Xiao and Jian Fan predominantly discuss the concept of sustainable interaction design (SID) and the challenges presented by systems and users when attempting to introduce "green" behaviors.  They tested and refined their observations by developing PrintMarmoset, a refinement of other "smart" printing technologies like "printer-friendly" versions of web pages and HP Smart Web Printing.  The aim of PrintMarmoset was to address the problems presented by printing web content -- undesirable formatting, cut-off text, advertisements, and blank pages -- and reduce the amount of waste paper associated with those problems.  All of this, of course, had to be implemented in a way that did not require developers to modify existing web sites or users to change their existing print flow; it also had to require near-zero user input effort, offer user flexibility and pre-defined templates, maintain a history of print activities for future reference, and preferably raise awareness of itself among users.  Xiao and Fan achieved this by writing PrintMarmoset as a Firefox browser extension; when the user clicks the PrintMarmoset button, the main content of the page is selected by the plugin and highlighted for the user.  PrintMarmoset then allows the user to add to or remove from the content selection with one very simple mouse "stroke" gesture.  Once the user is satisfied with the selection, he or she clicks the PrintMarmoset button again and a document containing the selected content is generated.
PrintMarmoset approximates desired content via analysis of a scaled-down screen image.  Visual separators are identified and used to divide the page into blocks.  Once these blocks are determined, content importance is evaluated by parsing text information from the HTML portion of the page.  PrintMarmoset also utilizes a tool called PrintMark that stores the user's selection for that page and uses it as the default printing template for that website in any future printing attempts.  These PrintMarks can then also be shared with other PrintMarmoset users to encourage a more "viral" and communicative method of promoting sustainability through saving printout paper.  PrintMarmoset was tested and compared against other tools like HP Smart Web Printing.  These tests favored PrintMarmoset's easy stroke selection and most users stated that the tool offered a natural WYSIWYG printing experience.
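
A speculative sketch of the selection-plus-template idea (the block model, density heuristic, and PrintMark store below are invented stand-ins for the paper's actual analysis):

```python
# Hypothetical content selection: score page blocks by text density
# versus link/ad clutter, keep the dense ones, and remember the user's
# final choice per site as a reusable template (a stand-in for PrintMarks).

blocks = [  # (block id, words of body text, link/ad-ish elements)
    ("masthead", 8, 12),
    ("article", 640, 4),
    ("sidebar_ads", 15, 30),
    ("comments", 220, 18),
]

printmarks = {}   # site -> block ids the user ultimately chose to print

def auto_select(blocks, density_threshold=20.0):
    """Keep blocks whose text-to-clutter ratio clears the threshold."""
    return [bid for bid, words, clutter in blocks
            if words / (clutter + 1) >= density_threshold]

def print_page(site, blocks, user_adjustment=None):
    """Use the stored PrintMark if one exists, else auto-select."""
    selection = printmarks.get(site) or auto_select(blocks)
    if user_adjustment:                 # one "stroke" adds/removes a block
        selection = user_adjustment(selection)
    printmarks[site] = selection        # becomes the site's template
    return selection

# First visit: auto-selection keeps only the article; the user adds the
# comments with a stroke, and that choice persists for the site.
print(print_page("news.example.com", blocks,
                 user_adjustment=lambda s: s + ["comments"]))
print(print_page("news.example.com", blocks))   # reuses the PrintMark
```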

Discussion:
The difficulty with presenting sustainable technologies to users is that most sustainable technologies are only effective on a long-term basis and there are few if any short-term benefits; users want results now with as little effort as possible.  In this regard, PrintMarmoset is a very effective tool -- it requires little of its users besides the click of a button and a single mouse stroke, and it provides a paper-saving print method with clear short-term and long-term benefits.  I personally don't print a whole lot of web content (hardly any, in fact), but for those who do, in offices or businesses or wherever, I can see this being a very useful, easy-to-integrate tool for getting the desired content on the page.  There are only two readily apparent problems I see with PrintMarmoset: it is limited to the Firefox web browser (which is widespread but hardly ubiquitous) and therefore excludes a large audience of other browser users; and it is a browser extension, which by definition requires a user to find and install it.  As discussed above, users will exert only the least possible amount of effort, and they may not be convinced that an "alternate print button" is worth their time to install, or may not know how to install it at all.