June082005Reviews < SPL

Mik's Review

Problem

Modern IDEs don't provide programmers with adequate support for comprehending and navigating the subset of the system that is relevant to their task. The paper identifies this problem specifically in maintenance tasks.

Contributions

The contributions are a lab-based user study of 10 programmers, which concludes that 35% of the time programmers spent navigating could be saved by a better tool; an outline of the requirements for such a tool; and a mockup. Some interesting highlights of the proposed tool include:

6.2: Allow programmers to explicitly manage working sets and task contexts.
6.5: Show context in place when the model of what's on the screen prevents classical containment hierarchy browsing.

The paper also outlines some interesting problems:

Programmers frequently make false hypotheses that lead them to the wrong parts of the system (88% of the time seems excessive though).
When returning to a task, it takes time to recover a working set (60 seconds).
There is a desire to use Eclipse bookmarks to save context, but this is cumbersome.
Addressing the problem by using very large screen resolutions will introduce a real estate management problem.
Sharing working sets could be useful in a collaborative context.

Weaknesses

Consider that:

The student study subjects observed were nowhere near the level considered expert in industry. An expert Java/Eclipse programmer does not get confused by Eclipse in the way that the subjects did (explained by Wes below).
Including dependencies, real systems are 2-5 orders of magnitude bigger than the one studied.
It is extremely rare for industry developers to use 1024x768 screen resolution since that causes excessive navigation.

Since this paper is based on observations of programming in the small, the paper leads to conclusions that do not scale to industry development:

6.1: Having programmers manually prune a working set of automatically generated dependencies would be a huge burden when working on a large system.
6.3: Semantic highlighting of occurrences is already in Eclipse, and I think that it has been there for over a year. Extending this across files would show way too much if it were dependency-based.
6.4: It's a big stretch to imagine this UI scaling. Folding has already been around for ages.
6.6: Very heuristic and unclear how it could scale.

The pie chart in Figure 3 is really hard to believe. As Wes points out, reading and navigating code are part of the same task. Navigating dependencies and searching for names are the same task for many Eclipse programmers who use the search shortcuts and view. Testing can also be dominated by reading and navigating.

Questions

Could the copy/paste tracking be useful?
Do those "what happens" and "why" queries seem useful?
Can we imagine a useful version of Figure 10 (the proposed IDE)?
How effective would the current Eclipse be at QXGA resolution (2048x1536)? How about the proposed UI?

Clint's Review

Problem Addressed / Brief Summary

Much of a programmer's time is spent maintaining code, and most of the time during maintenance is spent developing an understanding of the code. This work explores the types of activities programmers perform during maintenance tasks to understand the associated difficulties and to create design requirements for better IDE support.

The bulk of the paper details a user study of 10 "expert" programmers. The programmers were told to perform a series of maintenance tasks on a small Java code base. This work was captured and analyzed. The last quarter of the paper presents a maintenance-oriented IDE designed to overcome some of the problems identified in the case study.

Key Contributions

The user study details how programmers spend time on maintenance.
The authors identify several bottlenecks in the maintenance process, namely working set navigation and manipulation.
The authors propose designs for several new IDE tools to complement the maintenance process.

Weaknesses

I did not feel that the interruptions were useful; they could only add noise to the observations. I realize the desire to mimic real-world environments, but 22% of the 70-minute trial was wasted on this.
Many of the screenshot figures were superfluous (e.g., Figures 5, 8, and 9).

Questions

How do the results of this study map to projects "in the large" (larger code base and multiple developers)?
What overhead might the proposed IDE (Figure 10) create? In particular, the developer must now spend time creating code fragments and trimming automatically generated dependencies.

Wesley's Review

Problem Addressed

Developers spend a relatively small portion of their time editing code during maintenance tasks. This suggests that even modern IDEs do not provide adequate support for these tasks, which results in considerable amounts of time spent navigating, searching, and reading task-irrelevant code.

This problem is approached by conducting a study of developers performing maintenance tasks with Eclipse. Observations of the programmers' difficulties and inefficiencies are used to deduce "new" design requirements for development tools. The authors then propose a development tool design that they believe satisfies these requirements.

Key Contributions

The authors suggest, based on their observations, that maintenance tasks consist of three main activities: forming, navigating, and manipulating a 'working set' of task-relevant code fragments.
The paper reports on the average amount of time developers spent on reading (22%), editing (20%), navigating (16%), searching (13%), and other activities.
Some specific observations of programmers' behaviour are made. For example, programmers often perform quick there-and-back navigations they call 'glances.' They also observe that indirect dependency navigation and visual scrolling searches in editors and the package explorer account for a substantial amount of time.
A set of six design requirements for maintenance tools based on their observations (Table 5).
A proposal for a single maintenance-oriented tool that addresses each of the six design requirements. The proposed tool lays out task-specific code fragments connected by dependency arrows such as "declares" and "copy of."
This study helps justify the need for the long list of tools that have already been developed to address the requirements in Table 5.

Weaknesses

A major weakness of this paper is that the authors are oblivious to related work and claim to be discovering and solving problems that are actually well known and actively researched. The authors are essentially re-observing that the code associated with a maintenance task is often scattered and thus forms a crosscutting concern, which they rename a 'working set.' This is a modularity problem addressed by years of AOSD research and many development tools. For example, many of the requirements listed in Table 5 are addressed by FEAT, CME, Mylar, the Design Snippets Tool, Eclipse, AOP languages, and many others.
Several factors detracted from the realism of the development tasks:
- 503-line program
- The expert Java developers did not seem to have experience with Eclipse. For example, they were confused by error highlighting and the incremental compilation, missed the "wrap search" option, and used searches when direct structure navigation was possible.
- The paint program seems to have been developed just for the experiment, so it isn't clear if the source code is representative.
- 5 small unrelated tasks in just over an hour doesn't seem like a common development scenario.
The authors state that developers tended to ask one of two questions: "How does X work?" and "Why did(n't) X happen?" They also state that during exploration programmers were looking for answers to specific questions such as "What defines this variable's value?" However, the paper doesn't explain how the experimenters knew when a developer asked such a question or what they were really looking for. Screen capture videos were recorded, but there is no mention of verbal data.
The proposed development tool in Figure 10 shows a small amount of code, mostly from the same class. However, this requires nearly the entire screen to display. The paper could include a discussion of how the approach will scale to present non-trivial crosscutting concerns.

Questions

The authors suggest that forming, navigating, and manipulating a working set are distinct activities. Is it possible that navigating and forming are part of the same activity since you need to navigate to find relevant code? What is the distinction between forming and navigating?
What changes when the developer is familiar with the code? Does this case make the suggested requirements less useful?
Why have all previous studies cited in the paper involved a program study period followed by a program modification period? Is this an artificial constraint that should be avoided?
Do the math breaks help or hurt the realism? Should we use this technique in future maintenance task studies?


Topic revision: r3 - 2005-10-25 - JohnAnvik