Ducky Thesis Proposal Notes

Problem Statement

I propose to study which techniques developers use when developing code, how the choice of techniques varies from person to person, and how success at programming tasks correlates with that choice. I will do so by replicating part or all of Robillard et al.'s work.

Problems:

  • P1: We (software engineering researchers) do not know what different low-level techniques developers use when developing code using an IDE.
    • P1.1: We do not have a shared vocabulary for discussing different techniques that developers use when developing code with IDEs. (?)
  • P2: We do not know which techniques are the most productive.
  • P3: We do not know how to teach/train developers how to be more productive.

Givens (right word?):

Robillard et al. showed:
  • G1: Different people use different techniques for locating relevant pieces of code.
  • G2: Characteristic interaction patterns reflecting those techniques can be discovered by analyzing coded transcripts of videos of users navigating code.
  • G3: Success at finding relevant pieces of code correlates with what technique(s) the developer uses.

Hypotheses:

  • H1: These characteristic interaction patterns can be discovered by analyzing interaction telemetry of navigation tasks.
  • H2: Software can recognize those patterns in navigation tasks.
  • H3: Data mining software can discover interesting interaction patterns in navigation tasks.
  • H4: Data mining software can discover interesting interaction patterns in more general code-development tasks.
  • H5: Success in coding tasks correlates with which interaction patterns the developer uses.
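
A minimal sketch of how H5 might be checked for a single interaction pattern, assuming each task attempt has already been reduced to two yes/no facts: whether the developer used the pattern and whether the task succeeded. The function name and the input shape are my own assumptions, and the phi coefficient is only one candidate measure of association for a 2x2 table, not a method taken from Robillard et al.

```python
from math import sqrt


def phi_coefficient(observations):
    """Phi coefficient for (used_pattern, succeeded) boolean pairs.

    Returns a value in [-1, 1], or None when a row or column of the
    2x2 table is empty and the coefficient is undefined.
    """
    n11 = n10 = n01 = n00 = 0
    for used, succeeded in observations:
        if used and succeeded:
            n11 += 1
        elif used:
            n10 += 1
        elif succeeded:
            n01 += 1
        else:
            n00 += 1
    # Product of the four marginal totals of the 2x2 table.
    marginals = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if marginals == 0:
        return None
    return (n11 * n00 - n10 * n01) / sqrt(marginals)
```

A coefficient near zero would suggest no association between the pattern and success; any real analysis would also need a significance test and enough participants to support it.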

Literature Review

@@@ A presentation of the relevant literature and the theoretical framework.

  • Robillard et al.
  • Murphy/Kersten/Findlater
  • BSD et al. (unpublished)
  • ?

Proposed data-gathering methods

@@@ A description of the research design and instruments and data gathering methods.

I will use data collected by BSD, which contains a replication of the first part of Robillard et al.'s study, in which professional programmers search for specific interesting methods in the code.

For further work, I have access to

  • many individual logs of traces of a small number of developers either fixing one well-described bug or adding a well-defined feature, with the code available
  • many individual logs of traces of a large number of developers working on unknown material, without the code available

We do not have data corresponding to the second part of Robillard et al.'s study, in which professional programmers attempt to add a feature. I might need to run a study replicating that part.

Proposed analysis methods

@@@ An outline of the plan for data analysis and the rationale for the level and method chosen, applicable statistical tests and computer programs.

I will use three techniques to analyze the data:

  1. eyeballs (better word?) -- I will examine the data visually, with filters as appropriate to change how the data is visualized
  2. protein motif-finding algorithm -- I will use a modified protein motif-finding algorithm to search for common patterns, and judgement to select interesting ones.
  3. data-visualization and mining tools, e.g. YALE -- I will use data mining and visualization tools to search through the patterns.
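
To make the pattern search in item 2 concrete, here is a deliberately simplified stand-in: it counts contiguous runs of n events (n-grams) and keeps those that occur in at least a given number of sessions. This is not a protein motif-finding algorithm, which would also tolerate gaps and substitutions. The event names, the function name, and the representation of a session as a list of event labels are all assumptions for illustration.

```python
from collections import defaultdict


def common_ngrams(sessions, n, min_sessions):
    """Map each length-n event run to the number of sessions containing it.

    Only runs that appear in at least `min_sessions` sessions are kept.
    A run is counted once per session, however often it repeats there.
    """
    support = defaultdict(int)
    for events in sessions:
        # Set comprehension de-duplicates repeats within one session.
        seen = {tuple(events[i:i + n]) for i in range(len(events) - n + 1)}
        for gram in seen:
            support[gram] += 1
    return {gram: count for gram, count in support.items() if count >= min_sessions}


# Hypothetical telemetry: one list of event labels per session.
sessions = [
    ["search", "open", "scroll", "open"],
    ["open", "search", "open", "scroll"],
    ["scroll", "scroll", "search", "open"],
]
frequent = common_ngrams(sessions, n=2, min_sessions=3)
```

Counting sessions rather than raw occurrences keeps one very long session from dominating the results; selecting which of the surviving runs are interesting would still be a matter of judgement.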

Having found patterns, I will write code to recognize those patterns.
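
One possible shape for such a recognizer, as a sketch only: encode each event kind as a single letter and describe a pattern as a regular expression over the encoded session. The event kinds, the letter codes, and the "search then browse" pattern below are hypothetical, chosen to illustrate the idea rather than taken from the data.

```python
import re

# Hypothetical event kinds and their one-letter codes.
CODES = {"search": "S", "open": "O", "scroll": "C", "edit": "E"}

# Hypothetical pattern: a search followed by one or more file visits,
# where a visit is an open followed by any number of scrolls.
SEARCH_THEN_BROWSE = re.compile(r"S(?:OC*)+")


def encode(events):
    """Turn a list of event kinds into a string of letter codes.

    Raises KeyError for an event kind that has no code.
    """
    return "".join(CODES[event] for event in events)


def find_pattern(events, pattern=SEARCH_THEN_BROWSE):
    """Return (start, end) index spans of non-overlapping pattern matches.

    Because each event maps to exactly one character, the spans index
    directly into the original event list (end is exclusive).
    """
    return [match.span() for match in pattern.finditer(encode(events))]


events = ["edit", "search", "open", "scroll", "scroll", "open",
          "edit", "search", "open"]
spans = find_pattern(events)
```

The one-letter encoding limits the vocabulary to the number of available characters and ignores timing between events, so it would suit only patterns defined purely by event order.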

Topic revision: r15 - 2006-11-20 - TWikiGuest
 