Difference: HAL (1 vs. 45)

Revision 45 (2011-01-05) - mavc

Line: 1 to 1
 

Feature Milestones

HAL 1.0

target: September, 2010
Line: 207 to 207
 
  • (CN) DataManager-decorated ExecutionManager still requires an explicit commit to save results. Also, run results cannot be saved unless explicitly associated with an experiment id.
  • (CN) Parameter values (e.g. instance files) containing spaces are split during command string construction; they need to be quoted as necessary.
  • (CN) Form input is not validated (moved from feature requests)
Added:
>
>
  • (MC) After error: java.io.IOException: Cannot run program "gnuplot" (in directory "gnuplotData"): java.io.IOException: error=2, No such file or directory, experiment cannot be aborted.
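The (CN) item above about parameter values with spaces can be illustrated with a minimal sketch. It assumes the command is assembled as a single POSIX-style shell string, and the names CommandQuoting, quote and buildCommand are hypothetical; this is not HAL's actual command construction code. Values containing anything other than a small set of safe characters are wrapped in single quotes, with embedded single quotes escaped, so that a path with a space reaches the target as one argument.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of HAL: quotes values for a POSIX-style shell command string.
final class CommandQuoting {

    // Return the value unchanged if it only contains safe characters;
    // otherwise wrap it in single quotes.
    static String quote(String value) {
        if (!value.isEmpty() && value.matches("[A-Za-z0-9_@%+=:,./-]+")) {
            return value;
        }
        // Inside single quotes, a literal single quote is written as:
        // close quote, escaped quote, reopen quote.
        return "'" + value.replace("'", "'\\''") + "'";
    }

    // Join the executable and its arguments into one command string,
    // quoting each piece independently.
    static String buildCommand(String executable, List<String> args) {
        StringJoiner joiner = new StringJoiner(" ");
        joiner.add(quote(executable));
        for (String arg : args) {
            joiner.add(quote(arg));
        }
        return joiner.toString();
    }

    public static void main(String[] unused) {
        // The instance path contains a space and must stay a single argument.
        System.out.println(buildCommand("solver", Arrays.asList("-inst", "my instances/a.cnf")));
    }
}
```

Where the execution path allows it, passing the arguments as a list to java.lang.ProcessBuilder instead of building one string may avoid the splitting problem altogether.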
 

Revision 44 (2010-08-24) - ChrisNell

Line: 1 to 1
 

Feature Milestones

HAL 1.0

target: September, 2010
Line: 19 to 19
 

Functionality for meta-algorithm developers

  • Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
  • Ability to transform algorithm parameter spaces: log transforms, discretization done
Changed:
<
<
  • Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in redesign
>
>
  • Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done
 
  • Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
Changed:
<
<
  • Ability to query database of previous runs directly done in redesign
  • Ability to access instance features in prog. in refactor
  • Pre-defined metrics for aggregating performance across runs done in redesign
>
>
  • Ability to query database of previous runs directly done
  • Ability to access instance features done
  • Pre-defined metrics for aggregating performance across runs done
 

Backend functionality exposed in above

  • Ability to execute algorithms locally done
  • Ability to execute algorithms on a remote host via SSH needs update re: API changes
  • Ability to execute algorithms on a SGE cluster needs update re: object API changes
  • Ability to actively monitor remotely running algorithms via RPC needs update re: object API changes
Changed:
<
<
  • MySQL database storing records of all algorithms, instances, runs, etc. being redesigned now
  • SQLite database fallback if MySQL unavailable as above
>
>
  • MySQL database storing records of all algorithms, instances, runs, etc. done
  • SQLite database fallback if MySQL unavailable done
 
  • R interface for performing statistical tests, etc. done

Meta-Algorithms Included

Changed:
<
<
  • Configuration procedure: ParamILS (external) done; will need minor updates to work with backend redesign
>
>
  • Configuration procedure: ParamILS (external) in progress
 
  • Configuration procedure: ROAR (internal) done; will need minor updates to work with backend redesign
Changed:
<
<
  • Analysis procedure: Paired algorithm comparison done, updated
  • Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign
>
>
  • Analysis procedure: Paired algorithm comparison in progress
  • Analysis procedure: Single-algorithm analysis in progress
 

Distribution Issues

  • Documentation
Line: 63 to 63
 
  • Support for "bag-of-machines" execution manager

Meta-Algorithms Included

Changed:
<
<
>
>
 
  • Multi-algorithm comparison
  • SATzilla-like portfolio builder
  • Parallelized AC
Line: 107 to 107
 
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
  • (CN) Convenience methods in MetaAlgorithm that hide next(), hasNext(), and report() from the third-party developer, providing instead an interface like AlgorithmRun fetchRun(Algorithm a) with no InterruptedException. This implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation, before and after the true environment fetchRun(...) call is made or returns.
  • (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
Changed:
<
<
>
>
  • (KLB) Handle network issues (e.g. loss of connection to the DataManager) robustly. Restart runs, etc., as required to ensure that the originally requested job ultimately completes correctly with as little babysitting by the user as possible.
  • (FH) Normalization transform, in addition to existing log transform
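As a rough illustration of the (FH) request above, the following sketch shows a log transform next to one possible normalization transform for a numeric parameter. It assumes that "normalization" means linear min-max scaling of a domain [lo, hi] onto [0, 1]; the page does not define it, and the class and method names are hypothetical rather than HAL's parameter space API.

```java
// Hypothetical sketch, not HAL's ParameterSpace API.
final class DomainTransforms {

    // Log transform (base 10) for strictly positive parameter values.
    static double log10Transform(double value) {
        if (value <= 0.0) {
            throw new IllegalArgumentException("log transform requires value > 0");
        }
        return Math.log10(value);
    }

    // Inverse of the log transform: maps back to the original scale.
    static double inverseLog10(double transformed) {
        return Math.pow(10.0, transformed);
    }

    // Assumed meaning of "normalization": map [lo, hi] linearly onto [0, 1].
    static double normalize(double value, double lo, double hi) {
        if (!(hi > lo)) {
            throw new IllegalArgumentException("normalization requires hi > lo");
        }
        return (value - lo) / (hi - lo);
    }

    // Inverse of normalize: map a value in [0, 1] back onto [lo, hi].
    static double denormalize(double unit, double lo, double hi) {
        return lo + unit * (hi - lo);
    }

    public static void main(String[] unused) {
        System.out.println(log10Transform(1000.0));
        System.out.println(normalize(5.0, 0.0, 20.0));
    }
}
```

Keeping each transform paired with its inverse lets a meta-algorithm search in the transformed space and still hand original-scale values to the target algorithm.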
 

Active work items

Frontend

Line: 132 to 133
 

Backend

Release-critical

Changed:
<
<
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; in progress for DB)
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done in Java objects; to do in data management
  • CN Refactor code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
>
>
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. done
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done
  • CN Refactor code to align class hierarchy with terminology of paper (CN: done for all but meta-algorithm implementations, which are in progress)
 
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
Changed:
<
<
  • CN Database schema -- speed-related refactor (CN: in progress)
>
>
  • CN Database schema -- speed-related refactor done (may want further tuning)
 
  • CN Refactor SSH & RPC execution managers to work under refactor

Important

Changed:
<
<
  • CN Connection pooling done (contingent on rest of DataManager refactor, above)
  • Caching analysis results
  • CN Query optimization (CN: in progress)
>
>
  • CN Connection pooling done
  • Caching analysis results (CN: in progress as part of meta-alg changes above)
  • CN Query optimization done (may want more depending on real-world observations)
 
  • Selective limitation of run-level archiving (dynamic based on runtime?)
  • add incumbentname semantic input to (design) procedures
  • instance features
Line: 151 to 152
 
  • CN DataManager API refinement (in progress as part of DataManager refactor)
  • CF N-way performance comparison
  • Stale connection issue; incl. robustness to general network issues
Changed:
<
<
  • CN Read-only DataManager connection for use by individual MA procedures done (as part of DataManager refactor)
>
>
  • CN Read-only DataManager connection for use by individual MA procedures done
 
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
Changed:
<
<
  • Ability to quantify membership of configurations to different design spaces
>
>
  • Ability to quantify membership of configurations to different design spaces done
 

Application: ActiveConfigurator

Release Critical

Changed:
<
<
  • VC ROAR in HAL done
  • VC Calling Matlab from Java done
>
>
  • VC ROAR in Java in testing
  • VC Calling Matlab from Java in testing
 
  • CN parameter transformations (log, discretization, etc.) done
  • VC SMBO, calling Matlab for model building/evaluation (VC: implemented, in testing)
  • Adapt Weka RF implementation for regression
Line: 170 to 171
 

Support/QA/Misc.

Release Critical

Changed:
<
<
  • JX unit testing: parameters (domains) (in progress)
  • unit testing: parameter spaces
>
>
  • unit testing: parameters (domains) OK
  • unit testing: parameter spaces OK
 
  • unit testing: algorithms
  • unit testing: execution managers (local, SSH, cluster)
  • unit testing: data managers (SQLite, MySQL)
Line: 181 to 182
 

Important

  • CN Git, not CVS done
Changed:
<
<
  • CN Order+configure new DB server (CN: ordered; waiting for shipment)
>
>
  • CN Order+configure new DB server (CN: waiting for Dave B to make final changeover)
 
  • user-facing documentation (help)
Changed:
<
<
  • CN Better logging/error-reporting (to console/within HAL). eg: log4j (in progress)
>
>
  • CN Better logging/error-reporting (to console/within HAL). done (for most cases; exceptions are auto-logged)
 
  • CN JX VC Basic Windows support done, in testing
  • Better handling of overhead runtime vs. target algorithm runtime

Nice-to-have

Changed:
<
<
  • JX developer-facing documentation (javadocs) (in progress in parallel with unit testing)
>
>
  • developer-facing documentation (javadocs) (in progress in parallel with other work)
 

Bug Reports

Revision 43 (2010-08-05) - ChrisNell

Line: 1 to 1
 

Feature Milestones

HAL 1.0

target: September, 2010
Line: 23 to 23
 
  • Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
  • Ability to query database of previous runs directly done in redesign
  • Ability to access instance features in prog. in refactor
Deleted:
<
<
  • Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
 
  • Pre-defined metrics for aggregating performance across runs done in redesign

Backend functionality exposed in above

Line: 38 to 37
 

Meta-Algorithms Included

  • Configuration procedure: ParamILS (external) done; will need minor updates to work with backend redesign
  • Configuration procedure: ROAR (internal) done; will need minor updates to work with backend redesign
Deleted:
<
<
 
  • Analysis procedure: Paired algorithm comparison done, updated
  • Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign
Line: 57 to 55
 
  • Ability to "chain" experiments (e.g. design procedures followed by an analysis procedure comparing incumbents)

Functionality for meta-algorithm developers

Added:
>
>
  • Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
 
  • support for feature extraction procedures

Backend functionality

Line: 64 to 63
 
  • Support for "bag-of-machines" execution manager

Meta-Algorithms Included

Added:
>
>
 
  • Multi-algorithm comparison
  • SATzilla-like portfolio builder
  • Parallelized AC

Revision 42 (2010-07-28) - ChrisNell

Line: 1 to 1
 

Feature Milestones

HAL 1.0

target: September, 2010
Line: 24 to 24
 
  • Ability to query database of previous runs directly done in redesign
  • Ability to access instance features in prog. in refactor
  • Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
Changed:
<
<
  • Pre-defined metrics for aggregating performance across runs
>
>
  • Pre-defined metrics for aggregating performance across runs done in redesign
 

Backend functionality exposed in above

  • Ability to execute algorithms locally done

Revision 41 (2010-07-28) - ChrisNell

Line: 1 to 1
 

Feature Milestones

HAL 1.0

target: September, 2010
Line: 12 to 12
 
  • Page to view summary of all queued, running, and completed jobs
  • Page to browse, view details of, and delete runs/problems/instances/algorithms/environments
  • Dynamic run monitoring analysis pages, including:
Changed:
<
<
  • Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs.
  • Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist
  • Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist
>
>
  • Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs. done but being reworked
  • Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist done
  • Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist done
 

Functionality for meta-algorithm developers

  • Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
Line: 42 to 42
 
  • Analysis procedure: Paired algorithm comparison done, updated
  • Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign
Changed:
<
<
>
>

Distribution Issues

  • Documentation
  • Detection/configuration of external dependencies (cf. UI/execution environment specification)
  • Double-click-to-run universal JAR distribution
 

HAL 1.1

Line: 54 to 57
 
  • Ability to "chain" experiments (e.g. design procedures followed by an analysis procedure comparing incumbents)

Functionality for meta-algorithm developers

Changed:
<
<
>
>
  • support for feature extraction procedures
 

Backend functionality

  • Support for TORQUE clusters
Added:
>
>
  • Support for "bag-of-machines" execution manager
 

Meta-Algorithms Included

  • Multi-algorithm comparison
  • SATzilla-like portfolio builder
  • Parallelized AC
Added:
>
>
 

HAL 1.x

target: 2011
Deleted:
<
<
  • Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
 
  • libraries of:
    • search/optimization procedures
    • machine learning tools
Line: 75 to 79
 
  • bootstrapped analyses
  • robustness analyses
  • parameter response analyses
Deleted:
<
<
 
  • Parallel portfolios in HAL
  • Iterated F-Race in HAL
  • support for optimization/Monte-Carlo experiments
  • support instance generators
Deleted:
<
<
  • support for feature extraction procedures
 
  • support for instance format converters
  • Support text-file inputs and outputs for external algorithms (currently only command line and stdin/err)
  • array jobs in SGE

Revision 40 (2010-07-27) - ChrisNell

Line: 1 to 1
Deleted:
<
<
 

Feature Milestones

HAL 1.0

target: September, 2010
Line: 7 to 6
 
  • Page to add new external target algorithms
  • Page to add new parameter spaces for a given target algorithm (modified from existing spaces)
  • Page to add new problem instances/distributions (in the form of lists of files)
Changed:
<
<
  • Page to specify new execution environments (i.e. cluster config details)
>
>
  • Page to specify new execution environments (e.g. cluster config details)
 
  • Pages to specify & launch included meta-algorithms
  • Ability to view algorithms/instances by problem (instance compatibility) during above specification
  • Page to view summary of all queued, running, and completed jobs
  • Page to browse, view details of, and delete runs/problems/instances/algorithms/environments
Changed:
<
<
  • Dnyamic monitoring pages, including:
>
>
  • Dynamic run monitoring analysis pages, including:
 
  • Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs.
  • Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist
  • Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist
Line: 20 to 19
 

Functionality for meta-algorithm developers

  • Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
  • Ability to transform algorithm parameter spaces: log transforms, discretization done
Changed:
<
<
  • Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in refactor
>
>
  • Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in redesign
 
  • Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
Changed:
<
<
  • Ability to query database of previous runs directly done in refactor
>
>
  • Ability to query database of previous runs directly done in redesign
 
  • Ability to access instance features in prog. in refactor
  • Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
Added:
>
>
  • Pre-defined metrics for aggregating performance across runs
 

Backend functionality exposed in above

  • Ability to execute algorithms locally done
Line: 33 to 33
 
  • Ability to actively monitor remotely running algorithms via RPC needs update re: object API changes
  • MySQL database storing records of all algorithms, instances, runs, etc. being redesigned now
  • SQLite database fallback if MySQL unavailable as above
Changed:
<
<
>
>
  • R interface for performing statistical tests, etc. done
 

Meta-Algorithms Included

Changed:
<
<
  • Configuration procedure: ParamILS (external) done; will need minor updates to work with backend refactor
  • Configuration procedure: ROAR (internal) done; will need minor updates to work with backend refactor
>
>
  • Configuration procedure: ParamILS (external) done; will need minor updates to work with backend redesign
  • Configuration procedure: ROAR (internal) done; will need minor updates to work with backend redesign
 
  • Configuration procedure: ActiveConfigurator (internal) in progress
  • Analysis procedure: Paired algorithm comparison done, updated
Changed:
<
<
  • Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend refactor
>
>
  • Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign
 
Line: 64 to 64
 
  • SATzilla-like portfolio builder
  • Parallelized AC
Changed:
<
<

Scheduled Tasks

>
>

HAL 1.x

target: 2011
  • Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
  • libraries of:
    • search/optimization procedures
    • machine learning tools
  • multi-algorithm comparisons
  • scaling analyses
  • bootstrapped analyses
  • robustness analyses
  • parameter response analyses
  • ParamILS in HAL
  • Parallel portfolios in HAL
  • Iterated F-Race in HAL
  • support for optimization/Monte-Carlo experiments
  • support instance generators
  • support for feature extraction procedures
  • support for instance format converters
  • Support text-file inputs and outputs for external algorithms (currently only command line and stdin/err)
  • array jobs in SGE
  • Wider support for working directory requirements of individual algorithm runs, e.g. Concorde's creation of 20 files with fixed names.

Unprioritized Features

New feature requests should initially be added here; notify a HAL developer and come to a HAL meeting if you feel your feature must move up the stack quickly.
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances (CN: can hopefully be implemented as a chained experiment)
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") (CN: this is what is being implemented in the ongoing backend redesign)
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
  • (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
  • (CF) Continued testing to support LAMA-ish difficulties in HAL:
    • Wallclock vs. CPU cutoff options
    • Warnings in the dashboard if target runs or experiments are behaving "strangely"
    • Email notifications sent to users when various events happen
  • (CF) Restricted data/execution/targetalgs for the demo server
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
  • (CN) Convenience methods in MetaAlgorithm that hide next(), hasNext(), and report() from the third-party developer, providing instead an interface like AlgorithmRun fetchRun(Algorithm a) with no InterruptedException. This implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation, before and after the true environment fetchRun(...) call is made or returns.
  • (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
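The (CN) convenience-method request above describes a run object that switches between a "queued" and a "running" implementation. One way this could look is sketched below, purely as an assumption: RunView, SwitchingRun, MetaAlgorithmSketch and the status() method are hypothetical stand-ins, not HAL's MetaAlgorithm or AlgorithmRun classes. The fetch call returns a placeholder immediately instead of blocking, which is why it has no need to declare InterruptedException; the placeholder later delegates to the real run once the environment attaches it.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a run handle; the real AlgorithmRun API is not shown on this page.
interface RunView {
    String status();
}

// A run handle that starts out "queued" and later delegates to the real run.
final class SwitchingRun implements RunView {
    private static final RunView QUEUED = () -> "queued";

    private final String algorithmName;
    private final AtomicReference<RunView> delegate = new AtomicReference<>(QUEUED);

    SwitchingRun(String algorithmName) {
        this.algorithmName = algorithmName;
    }

    String algorithmName() {
        return algorithmName;
    }

    // Called by the (hypothetical) execution environment once the real run exists.
    void attach(RunView running) {
        delegate.set(running);
    }

    @Override
    public String status() {
        return delegate.get().status();
    }
}

final class MetaAlgorithmSketch {
    // Returns at once with a queued placeholder; it never blocks,
    // so it does not need to declare InterruptedException.
    static SwitchingRun fetchRun(String algorithmName) {
        return new SwitchingRun(algorithmName);
    }

    public static void main(String[] unused) {
        SwitchingRun run = fetchRun("solver");
        System.out.println(run.status());
        run.attach(() -> "running");
        System.out.println(run.status());
    }
}
```

The caller holds the same object throughout, so third-party code written against the handle does not need to know when the switch happens.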

Active work items

 

Frontend

Release-critical

  • CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress
Line: 146 to 189
 
  • JX developer-facing documentation (javadocs) (in progress in parallel with unit testing)
Deleted:
<
<

Medium-term

For future HAL 1.x revisions
  • Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
  • libraries of:
    • search/optimization procedures
    • machine learning tools
  • multi-algorithm comparisons
  • scaling analyses
  • bootstrapped analyses
  • robustness analyses
  • parameter response analyses
  • SATzilla in HAL
  • ParamILS in HAL
  • Parallel portfolios in HAL
  • Iterated F-Race in HAL
  • chained-procedure experiments
  • support for optimization/Monte-Carlo experiments
  • support instance generators
  • Support text-file inputs and outputs for external algorithms
  • array jobs in SGE
  • Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
  • Validation of form input.
  • Scriptable submission of experiments. (CF): Accelerated for Frank, finished 18/05/2010.
  • Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.

Long-term/Unprioritized

Feature requests should be initially added here
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
  • (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
  • (CF) Continued testing to support LAMA-ish difficulties in HAL:
    • Wallclock vs. CPU cutoff options
    • Warnings in the dashboard if target runs or experiments are behaving "strangely"
    • Email notifications sent to users when various events happen
  • (CF) Restricted data/execution/targetalgs for the demo server
  • (CN) Support of performance metrics
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
  • (CN) Convenience methods in MetaAlgorithm that hide next(), hasNext(), and report() from the third-party developer, providing instead an interface like AlgorithmRun fetchRun(Algorithm a) with no InterruptedException. This implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation, before and after the true environment fetchRun(...) call is made or returns.
  • (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.

 

Bug Reports

  • (CN) JSC test reliability issue (compared to R)
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
Deleted:
<
<
  • (JS) InnoDB SQL errors (CN): fixed 11/05/10
 
  • (LX) Missing current-time point in solution quality trace, so the final "flat line" is not shown
  • (CN) accuracy of mid-run overhead accounting for PILS/GGA
  • (CF) Configuration file callstrings with weird spaces, e.g. "... -param '$val$ blah' ..." where '$val$ blah' needs to be passed to the target as a single argument. (CN) does this work with double quotes instead of single quotes?
Line: 203 to 199
 
Deleted:
<
<
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
 
  • (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
  • (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
  • (CN) DataManager-decorated ExecutionManager still requires an explicit commit to save results. Also, run results cannot be saved unless explicitly associated with an experiment id.
  • (CN) Parameter values (e.g. instance files) containing spaces are split during command string construction; they need to be quoted as necessary.
Added:
>
>
  • (CN) Form input is not validated (moved from feature requests)
 

Revision 39 (2010-07-27) - ChrisNell

Line: 1 to 1
Changed:
<
<

Short-term

Target: CRC/initial release
>
>

Feature Milestones

HAL 1.0

target: September, 2010

Web UI Features

  • Page to add new external target algorithms
  • Page to add new parameter spaces for a given target algorithm (modified from existing spaces)
  • Page to add new problem instances/distributions (in the form of lists of files)
  • Page to specify new execution environments (i.e. cluster config details)
  • Pages to specify & launch included meta-algorithms
  • Ability to view algorithms/instances by problem (instance compatibility) during above specification
  • Page to view summary of all queued, running, and completed jobs
  • Page to browse, view details of, and delete runs/problems/instances/algorithms/environments
  • Dnyamic monitoring pages, including:
  • Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs.
  • Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist
  • Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist

Functionality for meta-algorithm developers

  • Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
  • Ability to transform algorithm parameter spaces: log transforms, discretization done
  • Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in refactor
  • Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
  • Ability to query database of previous runs directly done in refactor
  • Ability to access instance features in prog. in refactor
  • Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference

Backend functionality exposed in above

  • Ability to execute algorithms locally done
  • Ability to execute algorithms on a remote host via SSH needs update re: API changes
  • Ability to execute algorithms on a SGE cluster needs update re: object API changes
  • Ability to actively monitor remotely running algorithms via RPC needs update re: object API changes
  • MySQL database storing records of all algorithms, instances, runs, etc. being redesigned now
  • SQLite database fallback if MySQL unavailable as above

Meta-Algorithms Included

  • Configuration procedure: ParamILS (external) done; will need minor updates to work with backend refactor
  • Configuration procedure: ROAR (internal) done; will need minor updates to work with backend refactor
  • Configuration procedure: ActiveConfigurator (internal) in progress
  • Analysis procedure: Paired algorithm comparison done, updated
  • Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend refactor

HAL 1.1

target: December, 2010

Web UI Features

  • Ability to export complete experiment packages (including algorithms, instances, run instructions)
  • Ability to load and execute an experiment package
  • Ability to "chain" experiments (e.g. design procedures followed by an analysis procedure comparing incumbents)

Functionality for meta-algorithm developers

Backend functionality

  • Support for TORQUE clusters

Meta-Algorithms Included

  • Multi-algorithm comparison
  • SATzilla-like portfolio builder
  • Parallelized AC

Scheduled Tasks

 

Frontend

Release-critical

Deleted:
<
<
functionality promised in paper
 
  • CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress
  • CF left side of landing page: task selection/presentation according to pattern concept (CF): In Progress
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
Line: 11 to 74
 
  • CF Execution environment specification (incl. R, Gnuplot, java locations) (CF): In Progress
  • RTDs/per-target-algorithm-run monitoring and navigation
  • design space specification by revision of existing spaces
Added:
>
>
  • Merge with backend refactor (when done)
 

Important

Deleted:
<
<
works as-is but end-user experience significantly impacted
 
  • Data management interface:
    • deleting runs/expts/etc.
    • data export
Line: 24 to 87
 

Backend

Release-critical

Changed:
<
<
for functionality mentioned in paper for which post-release changes would be problematic
  • CN Named instance set table done
  • CN Named configuration table done
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; mostly done but not checked in
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; begun for DB)
>
>
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; in progress for DB)
 
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done in Java objects; to do in data management
Changed:
<
<
  • CN rename objects to match paper terminology done
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
>
>
  • CN Refactor code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
 
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
  • CN Database schema -- speed-related refactor (CN: in progress)
Added:
>
>
  • CN Refactor SSH & RPC execution managers to work under refactor
 

Important

Deleted:
<
<
mostly to (substantially) improve UI responsiveness
 
  • CN Connection pooling done (contingent on rest of DataManager refactor, above)
  • Caching analysis results
  • CN Query optimization (CN: in progress)
  • Selective limitation of run-level archiving (dynamic based on runtime?)
  • add incumbentname semantic input to (design) procedures
Added:
>
>
  • instance features
 

Nice-to-have

Deleted:
<
<
noticeable mostly to developer-users
 
Changed:
<
<
  • CF N-way performance comparison first-cut for Frank.
>
>
  • CF N-way performance comparison
 
  • Stale connection issue; incl. robustness to general network issues
  • CN Read-only DataManager connection for use by individual MA procedures done (as part of DataManager refactor)
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
Line: 55 to 113
 

Application: ActiveConfigurator

Release Critical

Changed:
<
<
  • VC ROAR in HAL (CN: implemented, in testing)
>
>
  • VC ROAR in HAL done
 
  • VC Calling Matlab from Java done
  • CN parameter transformations (log, discretization, etc.) done
Changed:
<
<
  • VC SMBO, calling Matlab for model building/evaluation
>
>
  • VC SMBO, calling Matlab for model building/evaluation (VC: implemented, in testing)
 
  • Adapt Weka RF implementation for regression
  • Pure-Java SMBO implementation
  • Merge Java AC with refactored HAL codebase once refactor is completed
Line: 67 to 125
 

Support/QA/Misc.

Release Critical

Changed:
<
<
  • JX unit testing: parameters (domains)
>
>
  • JX unit testing: parameters (domains) (in progress)
 
  • unit testing: parameter spaces
  • unit testing: algorithms
  • unit testing: execution managers (local, SSH, cluster)
Changed:
<
<
  • unit testing: data managers (SQLite, Mysql)
>
>
  • unit testing: data managers (SQLite, MySQL)
 
  • unit testing: meta-algorithms
  • functional testing: full pipeline
Added:
>
>
  • Licensing issues (GPL'd components...)
 

Important

  • CN Git, not CVS done
  • CN Order+configure new DB server (CN: ordered; waiting for shipment)
  • user-facing documentation (help)
Changed:
<
<
  • Better logging/error-reporting (to console/within HAL). eg: log4j
>
>
  • CN Better logging/error-reporting (to console/within HAL). eg: log4j (in progress)
  • CN JX VC Basic Windows support done, in testing
 
  • Better handling of overhead runtime vs. target algorithm runtime

Nice-to-have

Changed:
<
<
  • developer-facing documentation (javadocs) (JX: in progress in parallel with unit testing)
>
>
  • JX developer-facing documentation (javadocs) (in progress in parallel with unit testing)
 

Medium-term

Changed:
<
<
Planned for future HAL 1.x revisions
>
>
For future HAL 1.x revisions
 
  • Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
Deleted:
<
<
  • Windows support
 
  • libraries of:
    • search/optimization procedures
    • machine learning tools
Line: 107 to 165
 
  • support for optimization/Monte-Carlo experiments
  • support instance generators
  • Support text-file inputs and outputs for external algorithms
Deleted:
<
<
  • Instance features
  • Explicit representation of problems (e.g. particular instance formats)
  • Experiments calling experiments, not just external target algs
 
  • array jobs in SGE
Deleted:
<
<
  • Hashing everything, including instances, instance sets and configurations.
 
  • Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
  • Validation of form input.
  • Scriptable submission of experiments. (CF): Accelerated for Frank, finished 18/05/2010.
Line: 126 to 180
 
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Deleted:
<
<
 
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
  • (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
  • (CF) Continued testing to support LAMA-ish difficulties in HAL:

Revision 38 2010-07-20 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 37 to 37
 

Important

mostly to (substantially) improve UI responsiveness
Changed:
<
<
  • Connection pooling
>
>
  • CN Connection pooling done (contingent on rest of DataManager refactor, above)
 
  • Caching analysis results
Changed:
<
<
  • Query optimization
>
>
  • CN Query optimization (CN: in progress)
 
  • Selective limitation of run-level archiving (dynamic based on runtime?)
  • add incumbentname semantic input to (design) procedures

Nice-to-have

noticeable mostly to developer-users
Changed:
<
<
>
>
 
  • CF N-way performance comparison first-cut for Frank.
  • Stale connection issue; incl. robustness to general network issues
Changed:
<
<
  • Read-only DataManager connection for use by individual MA procedures
>
>
  • CN Read-only DataManager connection for use by individual MA procedures done (as part of DataManager refactor)
 
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
  • Ability to quantify membership of configurations to different design spaces

Revision 37 2010-07-15 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 28 to 28
 
  • CN Named instance set table done
  • CN Named configuration table done
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; mostly done but not checked in
Changed:
<
<
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; not started for DB)
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). (CN: in progress)
>
>
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; begun for DB)
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done in Java objects; to do in data management
 
  • CN rename objects to match paper terminology done
Changed:
<
<
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: in progress)
>
>
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
 
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
Changed:
<
<
  • CN Database schema -- speed-related refactor
>
>
  • CN Database schema -- speed-related refactor (CN: in progress)
 

Important

mostly to (substantially) improve UI responsiveness
Line: 53 to 53
 
  • Ability to quantify membership of configurations to different design spaces
Added:
>
>

Application: ActiveConfigurator

Release Critical

  • VC ROAR in HAL (CN: implemented, in testing)
  • VC Calling Matlab from Java done
  • CN parameter transformations (log, discretization, etc.) done
  • VC SMBO, calling Matlab for model building/evaluation
  • Adapt Weka RF implementation for regression
  • Pure-Java SMBO implementation
  • Merge Java AC with refactored HAL codebase once refactor is completed
  • Adapt standalone Java AC to work as "internal" HAL meta-algorithm
 

Support/QA/Misc.

Release Critical

Changed:
<
<
  • more unittests; also functional/integration tests
>
>
  • JX unit testing: parameters (domains)
  • unit testing: parameter spaces
  • unit testing: algorithms
  • unit testing: execution managers (local, SSH, cluster)
  • unit testing: data managers (SQLite, Mysql)
  • unit testing: meta-algorithms
  • functional testing: full pipeline
 

Important

Added:
>
>
  • CN Git, not CVS done
  • CN Order+configure new DB server (CN: ordered; waiting for shipment)
 
  • user-facing documentation (help)
  • Better logging/error-reporting (to console/within HAL). eg: log4j
  • Better handling of overhead runtime vs. target algorithm runtime

Nice-to-have

Changed:
<
<
  • developer-facing documentation (javadocs)
>
>
  • developer-facing documentation (javadocs) (JX: in progress in parallel with unit testing)
 

Medium-term

Line: 82 to 102
 
Deleted:
<
<
 
  • Iterated F-Race in HAL
  • chained-procedure experiments
  • support for optimization/Monte-Carlo experiments
  • support instance generators
Deleted:
<
<
 
  • Support text-file inputs and outputs for external algorithms
  • Instance features
  • Explicit representation of problems (e.g. particular instance formats)

Revision 36 2010-06-16 - ChrisFawcett

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 119 to 119
 
  • (CN) Support of performance metrics
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
  • (CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
Added:
>
>
  • (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
 

Bug Reports

  • (CN) JSC test reliability issue (compared to R)

Revision 35 2010-06-14 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 118 to 118
 
  • (CF) Restricted data/execution/targetalgs for the demo server
  • (CN) Support of performance metrics
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
Added:
>
>
  • (CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
 

Bug Reports

  • (CN) JSC test reliability issue (compared to R)

Revision 34 2010-06-10 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 27 to 27
 for functionality mentioned in paper for which post-release changes would be problematic
  • CN Named instance set table done
  • CN Named configuration table done
Changed:
<
<
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; in progress
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: in progress)
  • CN Database schema -- speed-related refactor (CN: next up)
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings).
>
>
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; mostly done but not checked in
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; not started for DB)
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). (CN: in progress)
 
  • CN rename objects to match paper terminology done
Changed:
<
<
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above
>
>
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: in progress)
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
  • CN Database schema -- speed-related refactor
 

Important

mostly to (substantially) improve UI responsiveness
Line: 133 to 133
 
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
  • (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
  • (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
Added:
>
>
  • (CN) DataManager-decorated ExecutionManager still requires an explicit commit to save results. Also, run results cannot be saved unless explicitly associated with an experiment id.
  • (CN) Parameter values (e.g., instance files) containing spaces are split during command string construction; they need to be quoted as necessary.
 

Revision 33 2010-06-08 - ChrisFawcett

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 69 to 69
 

Medium-term

Planned for future HAL 1.x revisions
Changed:
<
<
  • Packaging complete experiments
>
>
  • Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
 
  • Windows support
  • libraries of:
    • search/optimization procedures

Revision 32 2010-05-27 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 52 to 52
 
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
  • Ability to quantify membership of configurations to different design spaces
Added:
>
>
 

Support/QA/Misc.

Added:
>
>

Release Critical

  • more unittests; also functional/integration tests
 

Important

  • user-facing documentation (help)
  • Better logging/error-reporting (to console/within HAL). eg: log4j
  • Better handling of overhead runtime vs. target algorithm runtime
Deleted:
<
<
  • WAY more unittests; also functional/integration tests
 

Nice-to-have

  • developer-facing documentation (javadocs)
Line: 99 to 102
 

Long-term/Unprioritized

Feature requests should be initially added here
Deleted:
<
<
  • (FH) Probably simple: Support PAR-10 as one parameter for ParamILS
  • (FH) Probably simple: Support single-CPU arrow runs with csh shell (currently, I can only run ParamILS using 2 CPUs, one of which is then idle)
 
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator

Revision 31 2010-05-27 - FrankHutter

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 100 to 99
 

Long-term/Unprioritized

Feature requests should be initially added here
Added:
>
>
  • (FH) Probably simple: Support PAR-10 as one parameter for ParamILS
  • (FH) Probably simple: Support single-CPU arrow runs with csh shell (currently, I can only run ParamILS using 2 CPUs, one of which is then idle)
 
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator

Revision 30 2010-05-27 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 27 to 27
 for functionality mentioned in paper for which post-release changes would be problematic
  • CN Named instance set table done
  • CN Named configuration table done
Changed:
<
<
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below.
>
>
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; in progress
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: in progress)
  • CN Database schema -- speed-related refactor (CN: next up)
 
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings).
  • CN rename objects to match paper terminology done
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above
Deleted:
<
<
  • CN Database schema -- speed-related refactor
 

Important

mostly to (substantially) improve UI responsiveness
Line: 114 to 114
 
    • Warnings in the dashboard if target runs or experiments are behaving "strangely"
    • Email notifications sent to users when various events happen
  • (CF) Restricted data/execution/targetalgs for the demo server
Added:
>
>
  • (CN) Support of performance metrics
 
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?

Bug Reports

Revision 29 2010-05-27 - ChrisFawcett

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 19 to 19
 
    • data export
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
Changed:
<
<
>
>
  • Documentation as a header on most of the experiment pages, paragraph explaining the intention etc.
  • Hiding "advanced" settings, such as configurator-specific settings or other tools, with appropriate defaults.
 

Backend

Release-critical

Line: 128 to 129
 
  • (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
  • (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
Added:
>
>
  • (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
 

Revision 28 2010-05-25 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 127 to 127
 
  • (JS) Algorithms with a requirement of a new directory for each run.
  • (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
Added:
>
>
  • (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
 

Revision 27 2010-05-19 - ChrisFawcett

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Release-critical

functionality promised in paper
Changed:
<
<
  • CF algorithm specification screen: implement (includes initial design space specification)
  • CF left side of landing page: task selection/presentation according to pattern concept
>
>
  • CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress
  • CF left side of landing page: task selection/presentation according to pattern concept (CF): In Progress
 
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
Changed:
<
<
  • CF instance specification screen: implement
  • CF Execution environment specification (incl. R, Gnuplot, java locations)
>
>
  • CF instance specification screen: implement (CF): In Progress
  • CF Execution environment specification (incl. R, Gnuplot, java locations) (CF): In Progress
 
  • RTDs/per-target-algorithm-run monitoring and navigation
  • design space specification by revision of existing spaces
Line: 26 to 26
 for functionality mentioned in paper for which post-release changes would be problematic
  • CN Named instance set table done
  • CN Named configuration table done
Changed:
<
<
  • CN Execution environment table done
>
>
  • CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.
 
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below.
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings).
  • CN rename objects to match paper terminology done
Line: 93 to 93
 
  • Hashing everything, including instances, instance sets and configurations.
  • Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
  • Validation of form input.
Added:
>
>
  • Scriptable submission of experiments. (CF): Accelerated for Frank, finished 18/05/2010.
 
  • Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
Line: 106 to 107
 
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
  • (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
Changed:
<
<
  • (HH) Significance-gated testing / Sequential Testing (see email from HH).
>
>
  • (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
  • (CF) Continued testing to support LAMA-ish difficulties in HAL:
    • Wallclock vs. CPU cutoff options
    • Warnings in the dashboard if target runs or experiments are behaving "strangely"
    • Email notifications sent to users when various events happen
  • (CF) Restricted data/execution/targetalgs for the demo server
  • (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
 

Bug Reports

  • (CN) JSC test reliability issue (compared to R)

Revision 26 2010-05-19 - ChrisFawcett

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 106 to 106
 
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
  • (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
Added:
>
>
  • (HH) Significance-gated testing / Sequential Testing (see email from HH).
 

Bug Reports

  • (CN) JSC test reliability issue (compared to R)

Revision 25 2010-05-19 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 107 to 107
 
  • (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
Changed:
<
<

Bugs Reports

>
>

Bug Reports

 
  • (CN) JSC test reliability issue (compared to R)
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  • (JS) InnoDB SQL errors (CN): fixed 11/05/10
Line: 118 to 118
 
Changed:
<
<
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time.
>
>
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
 

Revision 24 2010-05-18 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 23 to 23
 

Backend

Release-critical

Changed:
<
<
mostly to enable critical UI tasks
>
>
for functionality mentioned in paper for which post-release changes would be problematic
 
  • CN Named instance set table done
  • CN Named configuration table done
  • CN Execution environment table done
Changed:
<
<
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model.
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings)
>
>
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below.
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings).
 
  • CN rename objects to match paper terminology done
Added:
>
>
  • CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper
  • CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above
  • CN Database schema -- speed-related refactor
 

Important

mostly to (substantially) improve UI responsiveness
Deleted:
<
<
  • Database schema -- speed-related refactor
 
  • Connection pooling
  • Caching analysis results
  • Query optimization
Line: 48 to 50
 
  • Read-only DataManager connection for use by individual MA procedures
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
  • Ability to quantify membership of configurations to different design spaces
Deleted:
<
<
  • Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper
  • Refactor/cleanup Algorithm/ParameterSpace/Parameter/Domain structure
 

Support/QA/Misc.

Revision 23 2010-05-18 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 29 to 29
 
  • CN Execution environment table done
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model.
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings)
Changed:
<
<
  • CN add incumbentname semantic input to (design) procedures
  • CN rename objects to match paper terminology
>
>
  • CN rename objects to match paper terminology done
 

Important

mostly to (substantially) improve UI responsiveness
Line: 39 to 38
 
  • Caching analysis results
  • Query optimization
  • Selective limitation of run-level archiving (dynamic based on runtime?)
Added:
>
>
  • add incumbentname semantic input to (design) procedures
 

Nice-to-have

noticeable mostly to developer-users

Revision 22 2010-05-18 - ChrisFawcett

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 105 to 105
 
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
  • (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
Added:
>
>
  • (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
 

Bugs Reports

  • (CN) JSC test reliability issue (compared to R)
Line: 117 to 118
 
Added:
>
>
  • (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time.
 

Revision 21 2010-05-18 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 48 to 48
 
  • Read-only DataManager connection for use by individual MA procedures
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
  • Ability to quantify membership of configurations to different design spaces
Added:
>
>
  • Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper
  • Refactor/cleanup Algorithm/ParameterSpace/Parameter/Domain structure
 

Support/QA/Misc.

Revision 20 2010-05-12 - ChrisNell

Line: 1 to 1
 

Short-term

Target: CRC/initial release

Frontend

Line: 102 to 102
 
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Added:
>
>
 

Bugs Reports

  • (CN) JSC test reliability issue (compared to R)
Line: 113 to 114
 
Added:
>
>
 

Revision 19 2010-05-12 - ChrisNell

Line: 1 to 1
Changed:
<
<

Short-term priorities (Pre-CRC/release)

>
>

Short-term

Target: CRC/initial release
 

Frontend

Release-critical

functionality promised in paper
  • CF algorithm specification screen: implement (includes initial design space specification)
  • CF left side of landing page: task selection/presentation according to pattern concept
Changed:
<
<
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and inclubment naming
>
>
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
 
  • CF instance specification screen: implement
  • CF Execution environment specification (incl. R, Gnuplot, java locations)
  • RTDs/per-target-algorithm-run monitoring and navigation
Line: 18 to 19
 
    • data export
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
Deleted:
<
<
  • Named configurations, ability to specify name for final incumbent a priori and reference it in subsequent comparison experiments.
 

Backend

Line: 50 to 50
 
  • Ability to quantify membership of configurations to different design spaces
Changed:
<
<

Code/Robustness/Misc. tasks

>
>

Support/QA/Misc.

 

Important

  • user-facing documentation (help)
  • Better logging/error-reporting (to console/within HAL). eg: log4j
Line: 61 to 61
 
  • developer-facing documentation (javadocs)
Changed:
<
<

Medium-term Plans

>
>

Medium-term

Planned for future HAL 1.x revisions
 
  • Packaging complete experiments
  • Windows support
  • libraries of:
Line: 92 to 94
 
  • Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
Changed:
<
<

Feature Requests (unprioritized/long-term)

>
>

Long-term/Unprioritized

Feature requests should be initially added here
 
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Deleted:
<
<
  • (CN) Ability to actively manage database, including properly cascaded deletion of elements.
 
Changed:
<
<

Known Bugs:

>
>

Bugs Reports

 
  • (CN) JSC test reliability issue (compared to R)
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  • (JS) InnoDB SQL errors (CN): fixed 11/05/10

Revision 182010-05-12 - ChrisFawcett

Line: 1 to 1
 

Short-term priorities (Pre-CRC/release)

Frontend

Release-critical

Line: 87 to 87
 
  • Experiments calling experiments, not just external target algs
  • array jobs in SGE
  • Hashing everything, including instances, instance sets and configurations.
Changed:
<
<
>
>
  • Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
  • Validation of form input.
  • Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
 

Feature Requests (unprioritized/long-term)

  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
Line: 96 to 99
 
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Deleted:
<
<
  • (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
  • (CF) Form validation.
  • (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
 
  • (CN) Ability to actively manage database, including properly cascaded deletion of elements.

Known Bugs:

Revision 172010-05-12 - ChrisNell

Line: 1 to 1
Changed:
<
<

Pre-CRC/release tasks

UI tasks

>
>

Short-term priorities (Pre-CRC/release)

Frontend

 

Release-critical

Deleted:
<
<
  • CF N-way performance comparison first-cut for Frank.
 functionality promised in paper
Changed:
<
<
  • CF algorithm specification screen: implement
>
>
  • CF algorithm specification screen: implement (includes initial design space specification)
 
  • CF left side of landing page: task selection/presentation according to pattern concept
Changed:
<
<
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements (may require DB changes)
  • CF instance specification screen: implement (requires DB change)
  • CF Execution environment specification (incl. R, Gnuplot, java locations; requires DB change)
  • CN RTDs/per-target-algorithm-run monitoring and navigation
>
>
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and inclubment naming
  • CF instance specification screen: implement
  • CF Execution environment specification (incl. R, Gnuplot, java locations)
  • RTDs/per-target-algorithm-run monitoring and navigation
  • design space specification by revision of existing spaces
 

Important

works as-is but end-user experience significantly impacted
Line: 21 to 21
 
  • Named configurations, ability to specify name for final incumbent a priori and reference it in subsequent comparison experiments.
Changed:
<
<

Database tasks

>
>

Backend

 

Release-critical

mostly to enable critical UI tasks
  • CN Named instance set table done
  • CN Named configuration table done
  • CN Execution environment table done
Changed:
<
<
  • CN Configuration spaces vs. algorithms; appropriate unique ID hashing; algorithm versions?
>
>
  • CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model.
  • CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings)
  • CN add incumbentname semantic input to (design) procedures
  • CN rename objects to match paper terminology
 

Important

mostly to (substantially) improve UI responsiveness
Added:
>
>
  • Database schema -- speed-related refactor
 
  • Connection pooling
  • Caching analysis results
Deleted:
<
<
  • Database schema -- speed-related redesign
 
  • Query optimization
Changed:
<
<
  • Selective limitation of run-level logging (dynamic based on runtime?)
>
>
  • Selective limitation of run-level archiving (dynamic based on runtime?)
 

Nice-to-have

noticeable mostly to developer-users
Added:
>
>
  • CF N-way performance comparison first-cut for Frank.
 
  • Stale connection issue; incl. robustness to general network issues
  • Read-only DataManager connection for use by individual MA procedures
Added:
>
>
  • Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality.
  • Ability to quantify membership of configurations to different design spaces
 

Code/Robustness/Misc. tasks

Line: 81 to 87
 
  • Experiments calling experiments, not just external target algs
  • array jobs in SGE
  • Hashing everything, including instances, instance sets and configurations.
Added:
>
>
 
Changed:
<
<

Feature Requests

>
>

Feature Requests (unprioritized/long-term)

 
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Deleted:
<
<
  • (CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.) (CN) need to flesh out how this interacts with named configurations/configuration spaces also under discussion
 
  • (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
  • (CF) Form validation.
  • (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.

Revision 162010-05-11 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 26 to 26
 mostly to enable critical UI tasks
  • CN Named instance set table done
  • CN Named configuration table done
Changed:
<
<
  • CN Execution environment table
  • CN Configuration spaces vs. algorithms; appropriate unique ID hashing
>
>
  • CN Execution environment table done
  • CN Configuration spaces vs. algorithms; appropriate unique ID hashing; algorithm versions?
 

Important

mostly to (substantially) improve UI responsiveness

Revision 152010-05-11 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 77 to 77
 
  • Git, not CVS
  • Support text-file inputs and outputs for external algorithms
  • Instance features
Added:
>
>
  • Explicit representation of problems (e.g. particular instance formats)
 
  • Experiments calling experiments, not just external target algs
  • array jobs in SGE
  • Hashing everything, including instances, instance sets and configurations.
Line: 97 to 98
 

Known Bugs:

  • (CN) JSC test reliability issue (compared to R)
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
Changed:
<
<
>
>
  • (JS) InnoDB SQL errors (CN): fixed 11/05/10
 
  • (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
  • (CN) accuracy of mid-run overhead accounting for PILS/GGA
  • (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument. (CN) does this work with double-quotes instead of single-quotes?
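
The callstring bug above (and the related problem of parameter values containing spaces) can be avoided by tokenizing the template first and substituting values afterwards, so that a substituted value is never re-split. The following is a minimal sketch, not HAL's actual command construction; `CallStringSketch` and `buildCommand` are hypothetical names, and the `$name$` placeholder syntax is taken from the example in the bug report. It treats single and double quotes the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of HAL: turns a callstring template into an argument list. */
public final class CallStringSketch {
    private CallStringSketch() {}

    /**
     * Splits the template on unquoted whitespace, keeps quoted spans inside one
     * token, and only then replaces $name$ placeholders with their values.
     */
    public static List<String> buildCommand(String template, Map<String, String> values) {
        List<String> args = new ArrayList<>();
        StringBuilder token = new StringBuilder();
        boolean inToken = false;
        char quote = 0; // 0 means "not inside a quoted span"
        for (int i = 0; i < template.length(); i++) {
            char c = template.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;           // closing quote: the token may still continue
                } else {
                    token.append(c);     // whitespace inside quotes is preserved
                }
            } else if (c == '\'' || c == '"') {
                quote = c;               // opening quote starts or continues a token
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {           // unquoted whitespace ends the current token
                    args.add(substitute(token.toString(), values));
                    token.setLength(0);
                    inToken = false;
                }
            } else {
                token.append(c);
                inToken = true;
            }
        }
        if (inToken) {
            args.add(substitute(token.toString(), values));
        }
        return args;
    }

    /** Replaces every $name$ placeholder in a single, already-split token. */
    private static String substitute(String token, Map<String, String> values) {
        String result = token;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("$" + entry.getKey() + "$", entry.getValue());
        }
        return result;
    }
}
```

Passing the resulting list to `new ProcessBuilder(args)` hands each element to the target as one argument without going through a shell, so no further quoting is needed.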

Revision 142010-05-11 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 25 to 25
 

Release-critical

mostly to enable critical UI tasks
  • CN Named instance set table done
Changed:
<
<
  • CN Named configuration table
>
>
  • CN Named configuration table done
 
  • CN Execution environment table
  • CN Configuration spaces vs. algorithms; appropriate unique ID hashing

Revision 132010-05-10 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 92 to 92
 
  • (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
  • (CF) Form validation.
  • (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
Added:
>
>
  • (CN) Ability to actively manage database, including properly cascaded deletion of elements.
 

Known Bugs:

  • (CN) JSC test reliability issue (compared to R)

Revision 122010-05-10 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 88 to 88
 
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Changed:
<
<
  • (CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.)
>
>
  • (CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.) (CN) need to flesh out how this interacts with named configurations/configuration spaces also under discussion
 
  • (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
  • (CF) Form validation.
  • (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
Line: 99 to 99
 
  • (JS) InnoDB SQL errors
  • (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
  • (CN) accuracy of mid-run overhead accounting for PILS/GGA
Changed:
<
<
  • (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument.
>
>
  • (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument. (CN) does this work with double-quotes instead of single-quotes?
 

Revision 112010-05-05 - FrankHutter

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 82 to 82
 
  • Hashing everything, including instances, instance sets and configurations.

Feature Requests

Changed:
<
<
  • (FH) submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
>
>
  • (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  • (FH) Developers of configurators should be able to swap in new versions of a configurator
  • (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  • (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  • (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  • (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
 
  • (CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.)
  • (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
  • (CF) Form validation.

Revision 102010-05-05 - ChrisFawcett

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 18 to 18
 
    • data export
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
Added:
>
>
  • Named configurations, ability to specify name for final incumbent a priori and reference it in subsequent comparison experiments.
 

Database tasks

Line: 78 to 79
 
  • Instance features
  • Experiments calling experiments, not just external target algs
  • array jobs in SGE
Changed:
<
<
>
>
  • Hashing everything, including instances, instance sets and configurations.
 

Feature Requests

  • (FH) submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Changed:
<
<
>
>
  • (CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.)
  • (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
  • (CF) Form validation.
  • (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
 

Known Bugs:

  • (CN) JSC test reliability issue (compared to R)
Line: 90 to 94
 
  • (JS) InnoDB SQL errors
  • (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
  • (CN) accuracy of mid-run overhead accounting for PILS/GGA
Added:
>
>
  • (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument.
  • (JS) FixedConfigurationExperiment UI is outdated, unusable.
  • (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager.
  • (JS) Algorithms with a requirement of a new directory for each run.
 

Revision 92010-05-02 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 23 to 23
 

Database tasks

Release-critical

mostly to enable critical UI tasks
Changed:
<
<
  • CN Named instance set table
>
>
  • CN Named instance set table done
 
  • CN Named configuration table
  • CN Execution environment table
  • CN Configuration spaces vs. algorithms; appropriate unique ID hashing

Revision 82010-04-22 - ChrisFawcett

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Added:
>
>
  • CF N-way performance comparison first-cut for Frank.
 functionality promised in paper
  • CF algorithm specification screen: implement
  • CF left side of landing page: task selection/presentation according to pattern concept

Revision 72010-04-22 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Release-critical

Line: 88 to 88
 
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  • (JS) InnoDB SQL errors
  • (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
Added:
>
>
  • (CN) accuracy of mid-run overhead accounting for PILS/GGA
 

Revision 62010-04-21 - ChrisNell

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

Changed:
<
<
  • Completely replace current Hal/HalServer.java servlets with better-designed modular implementation
  • left side of landing page: task selection drill-down to fit pattern concept
  • algorithm specification screen: implement
  • instance specification screen: implement
  • experiment specification and monitor screens from a pattern template, and procedure-specific requirements
  • RTDs/per-target-algorithm-run monitoring and navigation
>
>

Release-critical

functionality promised in paper
  • CF algorithm specification screen: implement
  • CF left side of landing page: task selection/presentation according to pattern concept
  • CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements (may require DB changes)
  • CF instance specification screen: implement (requires DB change)
  • CF Execution environment specification (incl. R, Gnuplot, java locations; requires DB change)
  • CN RTDs/per-target-algorithm-run monitoring and navigation

Important

works as-is but end-user experience significantly impacted
 
  • Data management interface:
    • deleting runs/expts/etc.
    • data export
Deleted:
<
<
  • Execution environment specification:
    • R, Gnuplot, java locations
    • DB authentication/path -- .properties file?
 
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
Line: 16 to 18
 
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
Added:
>
>
 

Database tasks

Added:
>
>

Release-critical

mostly to enable critical UI tasks
  • CN Named instance set table
  • CN Named configuration table
  • CN Execution environment table
  • CN Configuration spaces vs. algorithms; appropriate unique ID hashing

Important

mostly to (substantially) improve UI responsiveness
 
  • Connection pooling
  • Caching analysis results
Deleted:
<
<
 
  • Database schema -- speed-related redesign
  • Query optimization
Deleted:
<
<
  • Stale connection issue; incl. robustness to general network issues
 
  • Selective limitation of run-level logging (dynamic based on runtime?)
Changed:
<
<
  • Named configurations
  • Named instance sets
  • Instance features
  • Execution environments
  • Configuration spaces vs. algorithms; appropriate unique ID hashing
>
>

Nice-to-have

noticeable mostly to developer-users
  • DataManager API refinement
  • Stale connection issue; incl. robustness to general network issues
 
  • Read-only DataManager connection for use by individual MA procedures

Added:
>
>
 

Code/Robustness/Misc. tasks

Added:
>
>

Important

  • user-facing documentation (help)
 
  • Better logging/error-reporting (to console/within HAL). eg: log4j
  • Better handling of overhead runtime vs. target algorithm runtime
  • WAY more unittests; also functional/integration tests
Changed:
<
<
  • array jobs in SGE
  • user-facing documentation (help)
>
>

Nice-to-have

 
  • developer-facing documentation (javadocs)
Deleted:
<
<
  • Experiments calling experiments, not just external target algs
 

Medium-term Plans

Line: 62 to 74
 
  • support instance generators
  • Git, not CVS
  • Support text-file inputs and outputs for external algorithms
Added:
>
>
  • Instance features
  • Experiments calling experiments, not just external target algs
  • array jobs in SGE
 

Feature Requests

Revision 52010-04-20 - HolgerHoos

Line: 1 to 1
 

Pre-CRC/release tasks

UI tasks

  • Completely replace current Hal/HalServer.java servlets with better-designed modular implementation
  • left side of landing page: task selection drill-down to fit pattern concept
  • algorithm specification screen: implement
Changed:
<
<
  • instance specificaiton screen: implement
  • experiment specification and monitor screens from a pattern template, and proceure-specific requirements
  • RTDs/per-target-algorithm-run monitoring and navigaiton
>
>
  • instance specification screen: implement
  • experiment specification and monitor screens from a pattern template, and procedure-specific requirements
  • RTDs/per-target-algorithm-run monitoring and navigation
 
  • Data management interface:
    • deleting runs/expts/etc.
    • data export
Line: 17 to 17
 
  • Plotting ex-gnuplot

Database tasks

Changed:
<
<
  • Conneciton pooling
>
>
  • Connection pooling
 
  • Caching analysis results
  • DataManager API redesign
  • Database schema -- speed-related redesign
Line: 65 to 65
 

Feature Requests

Changed:
<
<
  • (FH) submitting runs from a machine that is itelf a cluster submit hos should not need to go through SSH
>
>
  • (FH) submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
 

Known Bugs:

Revision 42010-04-20 - ChrisNell

Line: 1 to 1
Changed:
<
<
Usage notes/observations/etc for HAL 1.0. To be considered post-paper.
>
>

Pre-CRC/release tasks

UI tasks

  • Completely replace current Hal/HalServer.java servlets with better-designed modular implementation
  • left side of landing page: task selection drill-down to fit pattern concept
  • algorithm specification screen: implement
  • instance specificaiton screen: implement
  • experiment specification and monitor screens from a pattern template, and proceure-specific requirements
  • RTDs/per-target-algorithm-run monitoring and navigaiton
  • Data management interface:
    • deleting runs/expts/etc.
    • data export
  • Execution environment specification:
    • R, Gnuplot, java locations
    • DB authentication/path -- .properties file?
  • Error logging/handling/browsing
  • Plotting ex-gnuplot
 
Changed:
<
<
TODO:
  • dealing with runs that report 0.00s -- PILS doesn't progress if this happens
  • improve cluster "niceness" -- if many subruns are spawned, HAL can take over all cpu's on a node
  • remove POSIX requirements
  • "tagging" of configurations, instances
  • investigate PILS thinking it performed fewer runs than are committed to the DB.
>
>

Database tasks

  • Conneciton pooling
  • Caching analysis results
  • DataManager API redesign
  • Database schema -- speed-related redesign
  • Query optimization
  • Stale connection issue; incl. robustness to general network issues
  • Selective limitation of run-level logging (dynamic based on runtime?)
  • Named configurations
  • Named instance sets
  • Instance features
  • Execution environments
  • Configuration spaces vs. algorithms; appropriate unique ID hashing
  • Read-only DataManager connection for use by individual MA procedures

Code/Robustness/Misc. tasks

  • Better logging/error-reporting (to console/within HAL). eg: log4j
  • Better handling of overhead runtime vs. target algorithm runtime
  • WAY more unittests; also functional/integration tests
  • array jobs in SGE
  • user-facing documentation (help)
  • developer-facing documentation (javadocs)
  • Experiments calling experiments, not just external target algs

Medium-term Plans

  • Packaging complete experiments
  • Windows support
  • libraries of:
    • search/optimization procedures
    • machine learning tools
  • multi-algorithm comparisons
  • scaling analyses
  • bootstrapped analyses
  • robustness analyses
  • parameter response analyses
  • SATzilla in HAL
  • ParamILS in HAL
  • Parallel portfolios in HAL
  • ActiveConfigurator in HAL
  • Iterated F-Race in HAL
  • chained-procedure experiments
  • support for optimization/Monte-Carlo experiments
  • support instance generators
  • Git, not CVS
  • Support text-file inputs and outputs for external algorithms

Feature Requests

  • (FH) submitting runs from a machine that is itelf a cluster submit hos should not need to go through SSH

Known Bugs:

  • (CN) JSC test reliability issue (compared to R)
  • (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  • (JS) InnoDB SQL errors
  • (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
 

Revision 32010-03-31 - ChrisNell

Line: 1 to 1
 Usage notes/observations/etc for HAL 1.0. To be considered post-paper.

TODO:

Changed:
<
<
  • dealing with runs that report 0.00s
>
>
  • dealing with runs that report 0.00s -- PILS doesn't progress if this happens
  • improve cluster "niceness" -- if many subruns are spawned, HAL can take over all cpu's on a node
 
  • remove POSIX requirements
  • "tagging" of configurations, instances
  • investigate PILS thinking it performed fewer runs than are committed to the DB.

Revision 22010-03-31 - ChrisNell

Line: 1 to 1
Changed:
<
<
Notes in adapting Iterated F-Race (IFR) to the HAL framework. This "diary" is intended to:
  1. provide fodder for eventual better documentation
  2. serve as a record for improving the process described therein
>
>
Usage notes/observations/etc for HAL 1.0. To be considered post-paper.
 
Changed:
<
<

Terminology

In HAL:

  • An Algorithm instance is an arbitrary code object which implements the HAL algorithm API, so that its inputs and outputs are well-defined and accessed in a standardized manner.
  • A Parameter is fixed if its value should not be adjusted to improve performance, and free otherwise.
  • A Pattern is an Algorithm which accepts a particular set of standard fixed input parameters in addition to an arbitrary set of free, "configurable" parameters. Similarly, a pattern's output includes a set of standard parameters. The fixed inputs include a problem instance, a random seed, a maximum runtime, a maximum runlength, and a A Pattern instance can accept additional, nonstandard inputs, but these must have default values so that if left unspecified the algorithm will still run.
  • A Scenario specifies conditions under which Patterns are run, including training and test instances, execution budgets, and overall performance objectives.
  • A Configurator is an Algorithm which accepts Patterns and Scenarios, and outputs parameter instantiations for the Pattern. In practice a Configurator will most likely attempt to return the best-evaluated parameter instance.
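
The fixed/free distinction in the terminology above can be pictured with a small type sketch. This is illustrative only and does not reflect HAL's real class hierarchy: `TerminologySketch`, its nested classes, and the input names `instance`, `seed`, `maxRuntime`, and `maxRunlength` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical types for illustration; HAL's actual classes may look quite different. */
public final class TerminologySketch {
    private TerminologySketch() {}

    /** A parameter is fixed (never tuned) or free (may be adjusted to improve performance). */
    public static final class Parameter {
        public final String name;
        public final boolean free;

        public Parameter(String name, boolean free) {
            this.name = name;
            this.free = free;
        }
    }

    /** A Pattern: standard fixed inputs plus any additional fixed or free parameters. */
    public static final class Pattern {
        private final List<Parameter> parameters = new ArrayList<>();

        public Pattern(List<Parameter> additional) {
            // Standard fixed inputs listed in the terminology (names assumed here).
            for (String name : new String[] {"instance", "seed", "maxRuntime", "maxRunlength"}) {
                parameters.add(new Parameter(name, false));
            }
            parameters.addAll(additional);
        }

        /** Names of the parameters a Configurator would be allowed to adjust. */
        public List<String> freeParameterNames() {
            List<String> names = new ArrayList<>();
            for (Parameter p : parameters) {
                if (p.free) {
                    names.add(p.name);
                }
            }
            return names;
        }
    }
}
```

Under this reading, a Configurator only ever sees the list returned by `freeParameterNames()`, while the Scenario supplies values for the standard fixed inputs.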

Initial Considerations

Iterated F-Race is presented as a set of R functions which are called via a set of Linux shell scripts. HAL has a very basic (vanilla) F-Race component which will be useful as a starting point for both of the following tasks:

Tasks which need to be completed:

  1. Write a .json configuration file for IFR, so that it can be loaded into HAL as a generic ExternalAlgorithm. This wrapper should preserve the naming conventions/etc. used by the algorithm's authors.
  2. Write a WrappedConfigurator subclass for IFR, which enables HAL to use the IFR ExternalAlgorithm as a configurator on arbitrary Pattern instances.

Each of these is approached in turn.

1. Configuration File

IFR is "natively" configured to work with a particular algorithm by editing a BASH script file, tune.sh. This script feeds a sequence of commands to an R interpreter, and itself takes no command-line arguments. An example script follows (apologies for the crazy width...):

R --no-save --no-restore --slave<<EOF
source("race.R")
source("hrace.R")
source("eval.R")

# doesn't matter descriptions
experiment.name<-"Iterative F-race for Tuning ACOTSP"
extra.description<-"F-RACE applied to ACOTSP"

# excutable initials, usually the rest of the command lines is followed by "--parameter_name parameter_value"
executable<-"../ACOTSP.V1.0/acotsp --tries 1 --time 20 "

# instance directory for tuning and testing
instance.dir<-"../../../../Instances"
test.instance.dir<-"../../../../TestInstances"

# tuning budget in number of evaluations
maxAllotedExperiments = 6000

# type "r" means continuous parameters (real numbers), "i" means integer parameters, "c" means categorical parameters, "m" also means categorical parameters which is called in command lines by "--parameter_value", e.g. "--mmas"; while the usual parameters are called in command lines using the format "--parameter_name parameter_value". 
parameter.type.list<-list(alpha="r",beta="r",rho="r",ants="i",nnants="i", nnls="i", q0="r", localsearch="c", dlb="c", mode="m", rasrank="i", elitistants="i")

# boundary inclusive for continuous or integer parameters. for categorical parameters this simply lists all the possible levels. 
parameter.boundary.list<-list(alpha=c(0.01,5.0), beta=c(0.0,10.0), rho=c(0.0001, 1.0), ants=c(1,100), nnants=c(5,100), nnls=c(5, 100),  q0=c(0.0,1.0), localsearch=c(0,1,2,3), dlb=c(0,1), mode=c("mmas", "acs", "ras", "eas", "as"), rasrank=c(1,10), elitistants=c(1,750))

# the conditional parameters
parameter.subsidiary.list<-list(q0=list(mode=c("acs")), rasrank=list(mode=c("ras")), elitistants=list(mode=c("eas")), nnls=list(localsearch=c(1,2,3)), dlb=list(localsearch=c(1,2,3)))

# in case the parameter names differ from what we give above, usually leave it as empty
parameter.name.list<-list()

# wrapper file for racing
wrapper.file="race-wrapper.R"

result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments,parameter.type.list=parameter.type.list,parameter.boundary.list=parameter.boundary.list,
experiment.name=experiment.name,extra.description=extra.description,executable=executable,instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)

# to perform the tuned parameter on the testing instances
eval(result=result, executable=executable, test.instance.dir=test.instance.dir)

The target algorithm-specific parameters (R variables) set in this file are:

  • experiment.name (string)
  • extra.description (string)
  • executable (path)
  • instance.dir (path)
  • test.instance.dir (path)
  • maxAllotedExperiments (integer)
  • parameter.type.list (list mapping string names to characters in "ricm" indicating domains)
  • parameter.boundary.list (list mapping above names to either endpoints or categorical values, further specifying domains)
  • parameter.subsidiary.list (conditionals; list mapping above names to another mapping of above names to conditional values)
  • parameter.name.list (optional, list of parameter display names if they differ from the ones used above)
Note that all of these parameters (except possibly maxAllotedExperiments) are fixed.
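
The type codes and the conditional list above imply how a candidate configuration turns into a command line. The following is a rough Python sketch of that rule as described in the script comments only; the real logic lives in race-wrapper.R, which is not shown here, so build_command and its arguments are hypothetical names.

```python
def build_command(executable, values, types, subsidiary):
    """Return the command line for one candidate configuration."""
    def active(name):
        # A conditional parameter is used only if every parent it depends
        # on currently holds one of the listed values.
        conditions = subsidiary.get(name, {})
        return all(values.get(parent) in allowed
                   for parent, allowed in conditions.items())

    parts = [executable.strip()]
    for name, value in values.items():
        if not active(name):
            continue
        if types[name] == "m":
            parts.append(f"--{value}")          # type "m": --parameter_value
        else:
            parts.append(f"--{name} {value}")   # otherwise: --name value
    return " ".join(parts)


# A cut-down version of the ACOTSP lists from the example script.
types = {"alpha": "r", "mode": "m", "q0": "r", "rasrank": "i"}
subsidiary = {"q0": {"mode": ["acs"]}, "rasrank": {"mode": ["ras"]}}
values = {"alpha": 1.0, "mode": "acs", "q0": 0.9, "rasrank": 6}

command = build_command("../ACOTSP.V1.0/acotsp --tries 1 --time 20 ",
                        values, types, subsidiary)
print(command)
```

Here rasrank is dropped because mode is not "ras", while q0 is kept because mode is "acs".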

While it would be possible to have HAL control R directly (circumventing this script), for the time being we will simply configure HAL to generate tune.sh scripts for arbitrary target Patterns.
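
Generating tune.sh amounts to filling $name placeholders in a script template. A minimal Python sketch of that substitution is given here, assuming placeholders are written as $ followed by the input name; fill_template is a hypothetical helper, not part of HAL. Python's string.Template is not used because its default identifiers cannot contain the dots that appear in names such as parameter.type.list.

```python
import re


def fill_template(lines, inputs):
    """Replace each $name placeholder with the matching input value."""
    # Try longer names first so that a short name never cuts off a longer
    # name that starts with the same characters.
    names = sorted(inputs, key=len, reverse=True)
    pattern = re.compile(r"\$(" + "|".join(re.escape(n) for n in names) + ")")
    return "\n".join(
        pattern.sub(lambda m: str(inputs[m.group(1)]), line) for line in lines
    )


template = [
    "executable<-'$executable'",
    "maxAllotedExperiments = $maxAllotedExperiments",
    "parameter.type.list<-$parameter.type.list",
]
inputs = {
    "executable": "../ACOTSP.V1.0/acotsp --tries 1 --time 20 ",
    "maxAllotedExperiments": 6000,
    "parameter.type.list": 'list(alpha="r",beta="r")',
}

script = fill_template(template, inputs)
print(script)
```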

Note that the parameter.*.list parameters can be viewed as strings of a particular format. The exact realization of this string depends on the target algorithm, and as such is generated in the ExternalConfigurator subclass. However, as a sanity check we can enforce that the format of the string is roughly valid using regular expressions; while this is not strictly necessary it is done below.
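
As an illustration of the sanity check, the sketch below renders a type list as an R literal and tests it against the pattern used for parameter.type.list in the configuration file. It is written in Python with a hypothetical helper, r_type_list; how HAL itself applies the String(...) domain is not shown in these notes.

```python
import re

# Pattern copied from the parameter.type.list domain in the configuration.
TYPE_LIST_RE = re.compile(
    r"list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)"
)


def r_type_list(types):
    """Render {'alpha': 'r', ...} as the R literal list(alpha="r",...)."""
    items = ",".join(f'{name}="{code}"' for name, code in types.items())
    return "list(" + items + ")"


rendered = r_type_list({"alpha": "r", "ants": "i", "mode": "m"})
print(rendered)

# The whole string must match, not just a prefix of it.
is_valid = TYPE_LIST_RE.fullmatch(rendered) is not None
is_rejected = TYPE_LIST_RE.fullmatch("list(alpha r)") is None
print(is_valid, is_rejected)
```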

The outputs of the iterated F-Race algorithm are the same as those of the original F-Race; thus, they can be carried over from that algorithm's .json file. The difference is that here the outputs are printed to STDOUT rather than dumped into a file. The JSON-format configuration is then:

{
    "path" : "../../../../ifrace/TUNE",
    "command" : "bash",
    "deterministic": false,
    "inputFormat": {
        "callstring": "$bashscript",
        "$bashscript": [
            "R --no-save --no-restore --slave<<EOF",
            "source('race.R')",
            "source('hrace.R')",
            "source('eval.R')",
            "experiment.name<-'Iterative F-race run by HAL'",
            "extra.description<-'F-RACE run by HAL'",
            "executable<-'$executable'",
            "instance.dir<-'$instance.dir'",
            "test.instance.dir<-'$test.instance.dir'",
            "maxAllotedExperiments = $maxAllotedExperiments",
            "parameter.type.list<-$parameter.type.list",
            "parameter.boundary.list<-$parameter.boundary.list",
            "parameter.subsidiary.list<-$parameter.subsidiary.list",
            "parameter.name.list<-$parameter.name.list",
            "wrapper.file='race-wrapper.R'",
            "result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments, parameter.type.list=parameter.type.list,parameter.boundary.list=parameter.boundary.list, experiment.name=experiment.name,extra.description=extra.description,executable=executable,instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)",
            "eval(result=result, executable=executable, test.instance.dir=test.instance.dir)",
            "EOF"
        ]
    },
    "inputs": {
        "bashscript": {"domain": "String()", "properties":{"fixed":1}},
        "experiment.name": {"domain": "String()", "properties":{"fixed":1}},
        "extra.description": {"domain": "String()", "properties":{"fixed":1}},
        "executable": {"domain": "String()", "properties":{"fixed":1}},
        "instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "test.instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "maxAllotedExperiments": {"domain": "Integer(0, None)", "properties":{"fixed":1}},
        "parameter.type.list": {"domain": "String('list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)')", "properties":{"fixed":1}},
        "parameter.boundary.list": {"domain": "String('list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)')", "properties":{"fixed":1}},
        "parameter.subsidiary.list": {"domain": "String('list\([^\s=]+=list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)(?:,\s*[^\s=]+=list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\))*\)')", "properties":{"fixed":1}},
        "parameter.name.list": {"domain": "String('list\([^\s=]+=[^\s=,]?(?:,\s*[^\s=,]+=[^\s=,]+)*\)')", "properties":{"fixed":1}}
    },
    "outputFormat": {
        "stdout": [{"^\|([x=-])\|\s*([0-9]+)\|\s*([0-9]+)\|\s*([0-9]+)\|\s*([0-9]+(?:.[0-9]+)?)\|\s*([0-9]+)\|":
                    ["marker", "task", "alive", "best", "meanbest", "nruns"],
 
                   {"Description of the selected candidate:\s}
        ]
    },
    "outputs": {
        "marker": ["x", "-", "="],        
        "task": "Integer(0, None)",
        "alive": "Integer(0, None)",
        "best": "Integer(0, None)",
        "meanbest": "Real()",
        "nruns": "Integer(0, None)"
    }
}
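
To show what the first stdout pattern is meant to capture, here is a small Python sketch that applies it to one progress line and converts the groups to the declared output types. The sample line is made up to fit the pattern and is not real IFR output; parse_progress is a hypothetical helper, and the decimal point is escaped here (\.) where the pattern above uses a bare dot.

```python
import re

LINE_RE = re.compile(
    r"^\|([x=-])\|\s*([0-9]+)\|\s*([0-9]+)\|\s*([0-9]+)\|"
    r"\s*([0-9]+(?:\.[0-9]+)?)\|\s*([0-9]+)\|"
)
FIELDS = ["marker", "task", "alive", "best", "meanbest", "nruns"]
CASTS = [str, int, int, int, float, int]


def parse_progress(line):
    """Return the named outputs of one progress line, or None if no match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return {name: cast(text)
            for name, cast, text in zip(FIELDS, CASTS, match.groups())}


# Invented line, shaped only to satisfy the pattern.
sample = "|x|          1|         10|          3|      0.25|         10|"
row = parse_progress(sample)
print(row)

ignored = parse_progress("Description of the selected candidate:")
```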

2. WrappedConfigurator subclass

This is where the bulk of the work of integrating a new configurator with HAL occurs. The WrappedConfigurator subclass defines the logic required to:

  1. Configure the ExternalAlgorithm's parameters appropriately given a Pattern and a Scenario.
  2. Moderate the ExternalAlgorithm's calls to the target Pattern.
  3. Interpret the ExternalAlgorithm's output.

There are three methods of WrappedConfigurator which will likely need to be overridden:

Additionally, an ExecutionServer subclass will need to be created, which will intercept the external algorithm's attempts to call the Pattern, execute said Pattern appropriately, and return an appropriately formatted result.
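
The interception step has to recover parameter values from the command line that IFR tries to run. A rough Python sketch of that parsing follows, assuming the "--parameter_name parameter_value" and "--parameter_value" conventions from the example script. parse_call and flag_values are hypothetical and do not describe the ExecutionServer API, and how IFR passes the problem instance is not covered by these notes, so it is left out.

```python
import shlex


def parse_call(command, flag_values):
    """Split an intercepted command line into (program, parameters).

    flag_values maps a bare flag such as "acs" to the name of the type "m"
    parameter it belongs to, e.g. {"acs": "mode"}.
    """
    tokens = shlex.split(command)
    program, args = tokens[0], tokens[1:]
    params = {}
    i = 0
    while i < len(args):
        if not args[i].startswith("--"):
            raise ValueError(f"unexpected token: {args[i]}")
        name = args[i][2:]
        if name in flag_values:
            # Type "m": the flag itself is the value.
            params[flag_values[name]] = name
            i += 1
        elif i + 1 < len(args):
            # Usual case: the next token is the value.
            params[name] = args[i + 1]
            i += 2
        else:
            raise ValueError(f"missing value for --{name}")
    return program, params


modes = {m: "mode" for m in ("mmas", "acs", "ras", "eas", "as")}
program, params = parse_call(
    "../ACOTSP.V1.0/acotsp --tries 1 --time 20 --alpha 1.0 --acs --q0 0.9",
    modes,
)
print(program, params)
```

Values are kept as strings; converting them to the Pattern's declared domains would be a separate step.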

-- ChrisNell - 08 Sep 2009

>
>
TODO:
  • dealing with runs that report 0.00s
  • remove POSIX requirements
  • "tagging" of configurations, instances
  • investigate PILS thinking it performed fewer runs than are committed to the DB.
 

Revision 12009-09-08 - ChrisNell

Line: 1 to 1
Added:
>
>
Notes in adapting Iterated F-Race (IFR) to the HAL framework. This "diary" is intended to:
  1. provide fodder for eventual better documentation
  2. serve as a record for improving the process described therein

Terminology

In HAL:

  • An Algorithm instance is an arbitrary code object which implements the HAL algorithm API, so that its inputs and outputs are well-defined and accessed in a standardized manner.
  • A Parameter is fixed if its value should not be adjusted to improve performance, and free otherwise.
  • A Pattern is an Algorithm which accepts a particular set of standard fixed input parameters in addition to an arbitrary set of free, "configurable" parameters. Similarly, a pattern's output includes a set of standard parameters. The fixed inputs include a problem instance, a random seed, a maximum runtime, a maximum runlength, and a A Pattern instance can accept additional, nonstandard inputs, but these must have default values so that if left unspecified the algorithm will still run.
  • A Scenario specifies conditions under which Patterns are run, including training and test instances, execution budgets, and overall performance objectives.
  • A Configurator is an Algorithm which accepts Patterns and Scenarios, and outputs parameter instantiations for the Pattern. In practice a Configurator will most likely attempt to return the best-evaluated parameter instance.

Initial Considerations

Iterated F-Race is distributed as a set of R functions which are called via a set of Linux shell scripts. HAL already has a very basic (vanilla) F-Race component, which will be useful as a starting point for both of the tasks below.

Tasks which need to be completed:

  1. Write a .json configuration file for IFR, so that it can be loaded into HAL as a generic ExternalAlgorithm. This wrapper should preserve the naming conventions etc. used by the algorithm's authors.
  2. Write a WrappedConfigurator subclass for IFR, which enables HAL to use the IFR ExternalAlgorithm as a configurator on arbitrary Pattern instances.

Each of these is approached in turn.

1. Configuration File

IFR is "natively" configured to work with a particular algorithm by editing a BASH script file, tune.sh. This script feeds a sequence of commands to an R interpreter, and itself takes no command-line arguments. An example script follows (apologies for the crazy width...):

R --no-save --no-restore --slave<<EOF
source("race.R")
source("hrace.R")
source("eval.R")

# doesn't matter descriptions
experiment.name<-"Iterative F-race for Tuning ACOTSP"
extra.description<-"F-RACE applied to ACOTSP"

# excutable initials, usually the rest of the command lines is followed by "--parameter_name parameter_value"
executable<-"../ACOTSP.V1.0/acotsp --tries 1 --time 20 "

# instance directory for tuning and testing
instance.dir<-"../../../../Instances"
test.instance.dir<-"../../../../TestInstances"

# tuning budget in number of evaluations
maxAllotedExperiments = 6000

# type "r" means continuous parameters (real numbers), "i" means integer parameters, "c" means categorical parameters, "m" also means categorical parameters which is called in command lines by "--parameter_value", e.g. "--mmas"; while the usual parameters are called in command lines using the format "--parameter_name parameter_value". 
parameter.type.list<-list(alpha="r",beta="r",rho="r",ants="i",nnants="i", nnls="i", q0="r", localsearch="c", dlb="c", mode="m", rasrank="i", elitistants="i")

# boundary inclusive for continuous or integer parameters. for categorical parameters this simply lists all the possible levels. 
parameter.boundary.list<-list(alpha=c(0.01,5.0), beta=c(0.0,10.0), rho=c(0.0001, 1.0), ants=c(1,100), nnants=c(5,100), nnls=c(5, 100),  q0=c(0.0,1.0), localsearch=c(0,1,2,3), dlb=c(0,1), mode=c("mmas", "acs", "ras", "eas", "as"), rasrank=c(1,10), elitistants=c(1,750))

# the conditional parameters
parameter.subsidiary.list<-list(q0=list(mode=c("acs")), rasrank=list(mode=c("ras")), elitistants=list(mode=c("eas")), nnls=list(localsearch=c(1,2,3)), dlb=list(localsearch=c(1,2,3)))

# in case the parameter names differ from what we give above, usually leave it as empty
parameter.name.list<-list()

# wrapper file for racing
wrapper.file="race-wrapper.R"

result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments,parameter.type.list=parameter.type.list,parameter.boundary.list=parameter.boundary.list,
experiment.name=experiment.name,extra.description=extra.description,executable=executable,instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)

# to perform the tuned parameter on the testing instances
eval(result=result, executable=executable, test.instance.dir=test.instance.dir)

The target algorithm-specific parameters (R variables) set in this file are:

  • experiment.name (string)
  • extra.description (string)
  • executable (path)
  • instance.dir (path)
  • test.instance.dir (path)
  • maxAllotedExperiments (integer)
  • parameter.type.list (list mapping string names to characters in "ricm" indicating domains)
  • parameter.boundary.list (list mapping above names to either endpoints or categorical values, further specifying domains)
  • parameter.subsidiary.list (conditionals; list mapping above names to another mapping of above names to conditional values)
  • parameter.name.list (optional, list of parameter display names if they differ from the ones used above)
Note that all of these parameters (except possibly maxAllotedExperiments) are fixed.

While it would be possible to have HAL control R directly (circumventing this script), for the time being we will simply configure HAL to generate tune.sh scripts for arbitrary target Patterns.

Note that the parameter.*.list parameters can be viewed as strings of a particular format. The exact realization of this string depends on the target algorithm, and as such is generated in the ExternalConfigurator subclass. However, as a sanity check we can enforce that the format of the string is roughly valid using regular expressions; while this is not strictly necessary it is done below.

The outputs of the iterated F-Race algorithm are the same as those of the original F-Race; thus, they can be carried over from that algorithm's .json file. The difference is that here the outputs are printed to STDOUT rather than dumped into a file. The JSON-format configuration is then:

{
    "path" : "../../../../ifrace/TUNE",
    "command" : "bash",
    "deterministic": false,
    "inputFormat": {
        "callstring": "$bashscript",
        "$bashscript": [
            "R --no-save --no-restore --slave<<EOF",
            "source('race.R')",
            "source('hrace.R')",
            "source('eval.R')",
            "experiment.name<-'Iterative F-race run by HAL'",
            "extra.description<-'F-RACE run by HAL'",
            "executable<-'$executable'",
            "instance.dir<-'$instance.dir'",
            "test.instance.dir<-'$test.instance.dir'",
            "maxAllotedExperiments = $maxAllotedExperiments",
            "parameter.type.list<-$parameter.type.list",
            "parameter.boundary.list<-$parameter.boundary.list",
            "parameter.subsidiary.list<-$parameter.subsidiary.list",
            "parameter.name.list<-$parameter.name.list",
            "wrapper.file='race-wrapper.R'",
            "result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments, parameter.type.list=parameter.type.list,parameter.boundary.list=parameter.boundary.list, experiment.name=experiment.name,extra.description=extra.description,executable=executable,instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)",
            "eval(result=result, executable=executable, test.instance.dir=test.instance.dir)",
            "EOF"
        ]
    },
    "inputs": {
        "bashscript": {"domain": "String()", "properties":{"fixed":1}},
        "experiment.name": {"domain": "String()", "properties":{"fixed":1}},
        "extra.description": {"domain": "String()", "properties":{"fixed":1}},
        "executable": {"domain": "String()", "properties":{"fixed":1}},
        "instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "test.instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "maxAllotedExperiments": {"domain": "Integer(0, None)", "properties":{"fixed":1}},
        "parameter.type.list": {"domain": "String('list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)')", "properties":{"fixed":1}},
        "parameter.boundary.list": {"domain": "String('list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)')", "properties":{"fixed":1}},
        "parameter.subsidiary.list": {"domain": "String('list\([^\s=]+=list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)(?:,\s*[^\s=]+=list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\))*\)')", "properties":{"fixed":1}},
        "parameter.name.list": {"domain": "String('list\([^\s=]+=[^\s=,]?(?:,\s*[^\s=,]+=[^\s=,]+)*\)')", "properties":{"fixed":1}}
    },
    "outputFormat": {
        "stdout": [{"^\|([x=-])\|\s*([0-9]+)\|\s*([0-9]+)\|\s*([0-9]+)\|\s*([0-9]+(?:.[0-9]+)?)\|\s*([0-9]+)\|":
                    ["marker", "task", "alive", "best", "meanbest", "nruns"],
 
                   {"Description of the selected candidate:\s}
        ]
    },
    "outputs": {
        "marker": ["x", "-", "="],        
        "task": "Integer(0, None)",
        "alive": "Integer(0, None)",
        "best": "Integer(0, None)",
        "meanbest": "Real()",
        "nruns": "Integer(0, None)"
    }
}

2. WrappedConfigurator subclass

This is where the bulk of the work of integrating a new configurator with HAL occurs. The WrappedConfigurator subclass defines the logic required to:

  1. Configure the ExternalAlgorithm's parameters appropriately given a Pattern and a Scenario.
  2. Moderate the ExternalAlgorithm's calls to the target Pattern.
  3. Interpret the ExternalAlgorithm's output.

There are three methods of WrappedConfigurator which will likely need to be overridden:

Additionally, an ExecutionServer subclass will need to be created, which will intercept the external algorithm's attempts to call the Pattern, execute said Pattern appropriately, and return an appropriately formatted result.

-- ChrisNell - 08 Sep 2009

 