We thank the reviewers. We begin by addressing R3’s concerns, as these were highlighted by the AC. R1 also raised many important questions, which we discuss below.

Sample size: R3 was concerned that a sample size of 6 participants per group was too small and that the older group was too diverse. To clarify, there were 12 participants per age group (p.3, and in the N values of the figure captions). To remove any ambiguity, we will adjust the wording on page 3 to read: “We recruited 12 participants from each age group (AGE), for a total of 24 participants”. While we agree with the AC that a larger sample would provide additional evidence, our sample size is consistent with the field (see [13,27,28] in the paper). In terms of diversity, we used the standard groupings for age-comparative research on movement control [14]. Moreover, very few significant differences were found between the 55-69 and 70+ age groups in [19].

Importance: R3 was also concerned that pen interfaces might be inappropriate for older adults; however, prior work has not discounted pen interaction. When compared directly, older adults perform better with a pen than with a mouse [5]. While older adults do experience problems with pen interaction, they experience at least as many difficulties with a mouse (contrast the difficulties listed in [25] with those in [19]). Touch interaction also works well for simple interfaces; however, it has been shown to be more error prone than the mouse when targets are small (9.8% vs. 2.1% in [ii]). In contrast to R3, R1 was uncertain whether improvements to the already low error rates for pen interfaces are worthy of additional study. We argue that because of older adults’ cognitive impairments, errors are of substantially greater significance in the real world than is reflected in experimental studies. Because there is no error penalty in a Fitts task, errors are likely to be more acceptable to users in that context than they would be in a real-world situation.
This makes it challenging to determine an acceptable error rate for computer use. There is a shortage of real-world data on selection frequency (and none on error frequency), but one workplace study (N=2146, mean age=42) found that in 1.3 hours of computer use (the daily average), participants made on average about 1300 selections [i]. We extrapolate that for those users, a 5% error rate would result in 65 errors, which could have a substantial impact, depending on the exact cost of the errors. While these figures may not map directly to older users’ experiences, it is reasonable to expect that an older user would encounter a comparable number of errors, from which they might have additional difficulty recovering. An additional factor to consider is the variability between participants. In [19], the overall error rates were 4.2% and 6.4% (for pre-old and old, respectively); however, the respective maximum error rates were 16.7% and 22.2%, suggesting that some users had substantial problems with errors.

We were unable to fully understand R1’s third point. We believe many of the concerns could be addressed by a better breakdown of the overall error data, including the following means (for Control, Steady, Bubble, and Steady-Bubble, respectively): older: 22.2%, 19.3%, 10.9%, 10.6%; younger: 16.0%, 13.0%, 7.5%, 6.8%. Similar data are currently in the paper, but we agree that a summary table would aid interpretation. Further, we can clarify that most of the errors (and the largest effects) occurred with the smallest target sizes, which is consistent with prior work [19] and representative of many real-world tasks.

Target-awareness: We agree with the excellent points made by R1 on the limitations of target-aware techniques, but note that the majority of successful pointing techniques to date have been target-aware (as discussed in [iii]), suggesting a tension between the usefulness of target knowledge and the difficulty of obtaining it.
Moreover, one of the strengths of combining approaches is that support can shift seamlessly to match different task constraints (potentially using target knowledge when it is available and relying on other techniques when it is not). Though our particular combination is not perfect, it is the first example of this general approach, and it demonstrates potential that has not been sufficiently explored to date. Finally, there were many excellent recommendations for improving clarity, which we will incorporate (e.g., we will expand the description of the combined technique as recommended by R2).

[i] Andersen et al. Computer mouse use predicts acute pain but not prolonged or chronic pain in the neck and shoulder. Occup Environ Med, 2008, 65: 126-131.
[ii] Sasangohar et al. Evaluation of mouse and touch input for a tabletop display using Fitts’ reciprocal tapping task. Proc. HFES’09, pp. 839-843.
[iii] Wobbrock et al. The Angle Mouse: target-agnostic dynamic gain adjustment based on angular deviation. Proc. CHI’09, pp. 1401-1410.