CHI 2013 Papers and Notes (Refereed)
Reviews of submission #806: "The Design and Field Observation of a Haptic Notification System for Timing Awareness During Oral Presentations"

------------------------ Submission 806, Review 4 ------------------------

Reviewer: primary
Overall rating: 3.5 (scale is 1..5; 5 is best)

Your Assessment of this Paper's Contribution to HCI

Overall Rating
3.5 . . . Between neutral and possibly accept

Review

Expertise
4 (Expert)

The Meta-Review

As should be clear from the scores, we have a bit of a split in opinions here. R2 and R3 are considerably more positive about the paper than R1. Overall it appears that R2 and R3 both appreciate the real-world application aspect of this work, and consider this to be an interesting innovation backed by a good study. R1, on the other hand, raises several concerns. The first, and in my view most important, of those is methodological, saying: “It is unclear how questions were phrased to participants. This coupled with the way the data is presented as simple percentages make it very hard to trust the data.”, “Without knowing the specific questions you asked, it's hard to trust your conclusion here”, and “I feel that I am being presented a limited view of the data to try and sell a particular conclusion.”. However, R1 also indicates “my issues here are almost entirely with the paper and not the work”, so it would seem that they don’t feel these flaws are fundamental. R1’s second concern is about whether there is really enough content/contribution here, saying: “content of this paper is not sufficient for a full CHI paper” and “I feel that the paper has been padded out with unnecessary information”. R1 also asks for both fewer and more references. While that is no doubt a bit frustrating, keep in mind that they are really asking for *better* references and better coverage of the related work, particularly with respect to the interruption and task switching literature.
My own review of this paper falls a bit between the two poles of the other reviewers. However, I would say I share some of the concerns of R1. I find this a quite well-documented design exercise, justified and analyzed fairly well with respect to prior work (although I think R1 has a point about needing a somewhat better tie to the interruption literature). While I personally have less concern about the study than expressed by R1, I do share a feeling that the paper is a bit “light” on actual content to fill 10 pages. On the other hand, I don’t think the content here would fit well in 4 pages, so I’m not sure I can complain too much about it being 10. I’m also not sure about the level of overall impact: this is a small domain and I’m not seeing the amount of innovation here as extremely high. Finally, I would raise a new point: the paper doesn’t make a really strong case for why chair-speaker communication is really necessary (as opposed to just making the speaker aware of time, i.e., what at one point you call system-speaker communication). In fact, the discussion alludes to the fact that system-speaker communication needs to be considered as an alternative. Overall I would personally rate this paper at a “3” if I were giving a separate score (as a meta-review score I have used the average of that and the other reviewers' scores).

Associate Chairs Additional Comments

The Review

Areas for Improvement

------------------------ Submission 806, Review 5 ------------------------

Reviewer: secondary
Overall rating: 3.5 (scale is 1..5; 5 is best)

Your Assessment of this Paper's Contribution to HCI

This paper describes a haptic notification system designed and used to help session chairs notify speakers when their time is up. The system was deployed and its effectiveness/tradeoffs were discussed.

Overall Rating
3.5 . . . Between neutral and possibly accept

Review

This review is entered before reading the other reviews or rebuttal/discussion.
Later comments will be added after reading those. As a 2AC, I feel that a 3.0 rating is unacceptable, hence the 3.5, though this paper really is only slightly above 3.0 for me.

Pros: I like the notion of a haptic system for presentation time notification and like that it was deployed in a variety of different contexts. The problem being addressed is something that every conference attendee, speaker, and chair is acutely aware of, and the haptic channel seems like an appropriate solution.

Cons: I felt that the amount of information imparted in this paper was more appropriate to a note, though the extensive deployment shows that this has been in development for a while. What I was missing was some more controlled examination of the notification system: what were the tradeoffs between speakers' attention and awareness of the haptic notifications (and the content of the notifications)? How distracting were they? Even a small controlled study would be a useful addition to the paper. How many bits of information could users effectively perceive, and under what circumstances? Also, it's not clear how important and how clear the communication aspects of the system were between the session chair and the speaker. Why wouldn't a simple buzz at a number of preset times be appropriate without any messaging? Did the chairs intentionally wait for appropriate moments in order to signal the speakers?

So, it's a balance. I like the work, and feel like it would make a useful addition, but would have preferred more rigor and appropriate information such that I might be able to use that information and build on top of the ideas presented here.

Expertise
3 (Knowledgeable)

The Meta-Review

Associate Chairs Additional Comments

2AC Additional Comments Post other reviews and rebuttal.

It seems that, at the core, the reviewers all agree, and the rebuttal did not significantly sway many opinions.
In particular, the review by R2 did raise some important and worrisome points as to the overall satisfaction of the solution and whether the authors were cherry picking results to make their point. That being said, collecting field data is (for me) often an end goal that would be backed up by a quantitative assessment, but those assessments always run the risk of being hard to compare to the eventual task. Measuring this in a real situation is worthy of presentation and would be preferable to the other way around (just testing and not deploying).

The Review

Areas for Improvement

2AC Additional Comments to Rebuttal

------------------------ Submission 806, Review 1 ------------------------

Reviewer: external
Overall rating: 2.5 (scale is 1..5; 5 is best)

Your Assessment of this Paper's Contribution to HCI

The paper presents HaNs, a tactile wristband used to deliver cues to people making presentations to help them manage their time. The system was tested in 3 different settings and subjective data was gathered from speakers and session chairs. The authors conclude that the technology has the potential to promote better time management in such situations and outline further avenues of study. The primary contribution of the paper is a method of time management for conference speakers along with data about how well it performs in practice.

Overall Rating
2.5 . . . Between possibly reject and neutral

Review

In my opinion the content of this paper is not sufficient for a full CHI paper, and I also believe that the research (as presented) is weak and shows signs of bias. Some specific issues:

* The trial as described was carried out in 3 contexts, but in 2 of those contexts it is presented alongside traditional interaction methods. I don't feel that the implications of this are adequately addressed.

* It is unclear how questions were phrased to participants. This coupled with the way the data is presented as simple percentages make it very hard to trust the data.
Basically, I don't know if you loaded the questions, and there does appear to be quite a strong bias in the reporting of some figures, for example: "Chairs liked how HaNs automatically notified them of the time without the effort to poll...". I don't see how you could reach that conclusion without a loaded question. At one point you state "a more comprehensive report...can be found in [39]"; that is fine under some circumstances, but in this case there are just too many missing details in the paper.

* Page 6: "Some speakers behavior did change after a cue: they talked faster, skipped slides, or mentioned timing..." followed a few paragraphs later by "few audience members noticed delivery...there were no external signs of speakers losing a train of thought, startling, or breaking their speech due to cues". Without knowing the specific questions you asked, it's hard to trust your conclusion here, and you don't back the latter assertion up with any real data. Of course you expect the speaker's behavior to change; how can you say that the audience didn't notice this?

* Your assertion that the wearable technology resulted in freedom of movement is extremely weak. "Some Speakers" is not a number. There are too many weasel-words in this paper, and it looks like you're trying to disguise weak results. Most of the results are anecdotal and all of the data are provided without any type of statistical analysis (not enough information is given for me to figure out how badly this affects the validity of your paper). It just isn't strong enough to convince me. Quotes are great to provide insights, but data are more important; exactly how many negative comments did you receive about the device size, for example? To count it out: under Results/Feasibility, Results/Improvements and Results/Limitations you make 17 points. Of those points, 13 are backed up with numerical data. The other 4 are only backed up with quotes or anecdotes.
* The HaNs system wasn't popular with speakers, but it was popular with chairs. I went back and took a look at the presented quotes and counted the particularly pro-HaNs quotes and the anti-HaNs quotes, not counting 'neutral' or 'suggestion' quotes. Of course, my interpretation means this isn't particularly reliable, but for pro-HaNs quotes I counted 6 from the audience, 4 from speakers, and 2 from chairs. For anti-HaNs quotes I counted 0 from the audience, 3 from speakers, and 1 from a chair. So I really don't think there is a fair or balanced discussion going on; I feel like I'm being fed quotes that back up your assertions. I often felt that there was an interesting story that I wasn't being told; for example, p.7: "I buzzed him again and it was clear he got the message!". What is the other side of this? How does this compare to traditional methods?

* Your discussion often makes assertions without referring back to your data or to other work. For example "some speakers...would prefer not to be controlled".

* I feel that the paper has been padded out with unnecessary information, such as the battery capacity of the devices, which adds nothing. I also feel that this is present in your references section, which lists 39 references of which 28 are papers or journals, 4 are books, 6 are websites or products and 1 is an omitted thesis. I also feel that you're missing some important citations that would back up your work. I think you should be referencing Iqbal and Bailey's work on Breakpoints, Vastenburg et al.'s work on "Considerate Home Notification Systems" (2007/2009) and Gillie and Broadbent's seminal paper "What makes interruptions disruptive? A study of length, similarity, and complexity". You might also find it useful to look at Warnock et al.'s work on multimodal notifications because I think that would support a lot of your assertions about tactile signals.
Some of these papers are based around interruption management; that is, the theory that interruption timing affects disruption/intrusiveness/acceptability. When a chair interrupts the speaker they are unlikely to do so mid-sentence, yet that will happen in your system. I think this is a point you need to address and discuss.

What this basically adds up to is that I'm not convinced by your paper. I feel that I am being presented a limited view of the data to try and sell a particular conclusion. That being said, my issues here are almost entirely with the paper and not the work. While I still feel that there isn't enough here for a CHI paper (perhaps a Note would have been more suitable), I feel that if the authors were to present fewer points (remove the weak stuff, basically) and then present a broader and more thorough analysis and discussion of the remaining points, this could be an interesting paper.

Expertise
4 (Expert)

Areas for Improvement

Important things to improve:

* Cut out the weak results from the paper and expand on what remains.
* Try to provide a more balanced reporting of your results, or at least address the problem of the paper appearing to be biased.
* Address the issue of interruption management.
* Make sure you back up your assertions with citations or your own data.
* Clarify how speakers responded to notifications and how this affected social acceptability.
* I think you need a much stronger comparison with existing methods, especially as your system failed to impress speakers.

Some smaller things:

* Abstract's opening sentence is a bit weak.
* Method of using inline enumerated lists is awkward and inconsistent (p.4 Method para.1, p.5 Results para.2).
* The cut-out box on page 6 really just overcomplicates things. I would suggest using the notations (HS:0%, SM:0%, GR:0%) which would make your paper easier to read.
* What you called a Haptic Icon is usually called a Tacton (Google Scholar search: tacton = 467 results, haptic icon = 99 results).

Some additional references to consider, as mentioned above:

* M. H. Vastenburg, D. V. Keyson, and H. de Ridder, “Considerate home notification systems: a field study of acceptability of notifications in the home,” Personal and Ubiquitous Computing, vol. 12, no. 8, pp. 555–566, Jun. 2007.
* M. H. Vastenburg, D. V. Keyson, and H. de Ridder, “Considerate home notification systems: A user study of acceptability of notifications in a living-room laboratory,” International Journal of Human-Computer Studies, vol. 67, no. 9, pp. 814–826, Sep. 2009.
* S. T. Iqbal and B. P. Bailey, “Oasis,” ACM Transactions on Computer-Human Interaction, vol. 17, no. 4, pp. 1–28, Dec. 2010.
* B. P. Bailey and J. A. Konstan, “On the need for attention-aware systems: Measuring effects of interruption on task performance, error rate, and affective state,” Computers in Human Behavior, vol. 22, no. 4, pp. 685–708, Jul. 2006.
* T. Gillie and D. Broadbent, “What makes interruptions disruptive? A study of length, similarity, and complexity,” Psychological Research, vol. 50, no. 4, pp. 243–250, Apr. 1989.
* D. M. Cades, J. G. Trafton, and D. A. Boehm-Davis, “Mitigating disruptions: Can resuming an interrupted task be trained,” in Human Factors and Ergonomics Society Annual Meeting Proceedings, 2006, vol. 50, no. 3, pp. 368–371.
* D. Warnock, M. McGee-Lennon, and S. Brewster, “The impact of unwanted multimodal notifications,” in Proceedings of the 13th international conference on multimodal interfaces - ICMI ’11, 2011, pp. 177–184.
* D. Warnock, M. McGee-Lennon, and S. Brewster, “Older Users, Multimodal Reminders and Assisted Living Technology,” Journal of Health Informatics, vol. 18, no. 3, pp. 181–190, 2012.
* E. Hoggan, A. Crossan, S. Brewster, and T. Kaaresoja, “Audio or tactile feedback: which modality when?,” in Proceedings of the 27th international conference on Human factors in computing systems, 2009, pp. 2253–2256.

------------------------ Submission 806, Review 2 ------------------------

Reviewer: external
Overall rating: 4.5 (scale is 1..5; 5 is best)

Your Assessment of this Paper's Contribution to HCI

The paper demonstrates a Haptic Notification System to remind both speakers and chairs of a conference session (and/or of a seminar) about the time left for a talk, and also to establish a low-level and non-disruptive communication between them during the talks. Personally, I have always had problems keeping track of the time, as a speaker and as a chair, so I really appreciate this work.

Overall Rating
4.5 . . . Between possibly accept and strong accept

Review

The paper is well written. The authors targeted a real-world problem and proposed a simple solution that is socially, cognitively and technically feasible. Unfortunately there is no prior art on using haptic/tactile feedback to notify timings during a session, so a comparison of vibro-tactile feedback schemes is not possible, and I consider this paper to be the first in this direction. The authors made a great effort to collect data from two mid-level conferences and one seminar series, which was not trivial. The only caveat in the authors' approach is that the performance is assessed qualitatively, and the paper lacks quantitative assessment. This is somewhat of a setback, because there is no way for future readers to compare the performance with their own schemes. Perhaps collecting quantitative data before, during and after the conference is not practical, but it would highlight the findings and significantly improve the impression of the paper. Nevertheless, this is an early attempt to use a haptic system for such a task, so it might get a pass. The research as a whole is well conducted, and the work is presented as a usable solution.
I would also emphasize putting in some technical content, so readers can imagine improved solutions for the problem. For example, the authors could explain the process for designing the vibratory profiles, the selection of motors, and the selection of the voltage range and its scaling (linear versus logarithmic, for example). What was the reaction (or rise) time of the tactor used? What was the weight of the HaNS, and is it easily applicable on other body sites? For example, the device could be placed in a pocket and tied to the belt.

The authors mentioned that speakers' ratings were lower than chairs' ratings. Why was that the case, and how could it be improved? What was the reaction of speakers when the tactile feedback was first presented to them during their talk? Was there a moment when the audience could sense the reaction? Was the audience receptive to a sudden reaction of the speaker when tactile cues turned ON, and when the speaker turned OFF vibrations?

It would have been nice if the authors had conducted a similar study where no haptics were provided to the speaker and to the session chair. In this case, they could quantify and compare the performance of the haptics and no-haptics cases. For example, if chairs were asked to display signs at intermediate intervals, how many times did they either miss it or delay in displaying the sign? How long did the chair put up the sign before getting acknowledgement from the speakers, and how much time did the speakers extend beyond the allotted time?

Nevertheless, the paper has insights and will be useful for future readers.
Expertise
3 (Knowledgeable)

Areas for Improvement

------------------------ Submission 806, Review 3 ------------------------

Reviewer: external
Overall rating: 4.5 (scale is 1..5; 5 is best)

Your Assessment of this Paper's Contribution to HCI

The paper contains the design rationale, development and evaluation of a haptic notification system (HaNS) that is used by session chairs and presenters to keep the presentations on time. This is a new application for haptic feedback which, based on the results, works in its intended use, provides private information for the speaker and supports a session chair's task without interruptions that would be noticed by the audience. Overall, the paper can be thought of as a well-grounded design of a haptic solution to a common problem which has been proven to be successfully solved, with some limitations. The results contribute to the practical use of haptic interaction in real context.

Overall Rating
4.5 . . . Between possibly accept and strong accept

Review

First, I must say that even if I haven't tried out the HaNS system in practice, I observed its use this spring in the Haptic Symposium conference in Vancouver. My initial opinion was that this is a well-suited use for haptics even if the present implementation (as a prototype device) is still rather cumbersome and can later be manufactured in much smaller form if commercially deployed.

The authors know their field of research well, and they have clearly included extensive enough references to earlier research. They also discuss the dynamics of conference sessions in detail, including the chair-speaker loop and the speaker's internal control. Naturally it is simpler to affect the communication between the speaker and the chair than what happens internally with the speaker -- thus leading to going overtime in HS even if the feedback was correctly provided and noticed during the presentation.

Overall, I am positive towards this submission.
As I commented in the contribution section, it contains the full design, development and evaluation cycle of a useful new haptic system. The system was extensively evaluated in three different settings: a conventional conference, madness sessions, and a local seminar. The results in general support that the solution is usable and fixes a number of problems related to chair-speaker communication. The authors have also listed a number of limitations and potential improvements that can later be addressed.

I had no problems with the evaluation being done in real settings and not in a controlled environment. The CHI community should address real-life problems with real people, not just always partly artificial controlled settings in research laboratories (which are important in their own right).

One thing that was not clearly explained in the paper is the design decision to include three tactors in the device. The reason can be guessed to be a better contact with the user, as the tactors were not used in any spatial haptic messages (like in series 1-2-3 or 3-2-1). This could, however, be mentioned in the paper. Spatial messages would also offer more expressive communication possibilities, but it looks like they were not necessary in this study. Earlier work on the use of spatial haptic messages can be found in the literature.

Expertise
4 (Expert)

Areas for Improvement

The paper is well written and organized. I have no additional comments for improvement. All the required references seem to be included.