Difference: HAL (1 vs. 45)

Revision 45 (2011-01-05) - mavc

Line: 1 to 1
	Feature Milestones HAL 1.0 target: September, 2010
Line: 207 to 207
	(CN) DataManager-decorated ExecutionManager still requires explicit commit to save results. Also, run results cannot be saved unless explicitly associated with an experiment id.
	(CN) Parameter values (e.g. instance files) with spaces are split during command string construction; need to enquote them as necessary.
	(CN) Form input not validated (moved from feature requests)
Added:
> >	(MC) After error: java.io.IOException: Cannot run program "gnuplot" (in directory "gnuplotData"): java.io.IOException: error=2, No such file or directory, experiment cannot be aborted.
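
The exception text in the report above is what Java raises when `ProcessBuilder.start()` cannot launch a program. A minimal sketch of catching that failure at launch time, so the caller receives an ordinary "not started" result that it can still act on, is given here. `ExternalToolLauncher` and `tryStart` are hypothetical names used for illustration only; they are not taken from the HAL codebase, and HAL's actual handling is not shown in this page.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of HAL: launches an external tool and
// reports a launch failure as an empty result instead of an exception.
final class ExternalToolLauncher {
    static Optional<Process> tryStart(List<String> command, File workingDir) {
        ProcessBuilder builder = new ProcessBuilder(command);
        if (workingDir != null) {
            builder.directory(workingDir);
        }
        try {
            return Optional.of(builder.start());
        } catch (IOException e) {
            // Raised, for example, when the executable cannot be found.
            System.err.println("Could not start " + command.get(0) + ": " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

A caller could check the returned `Optional` and mark the plotting step as failed while leaving the rest of the experiment in a state that can be aborted normally.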

Revision 44 (2010-08-24) - ChrisNell

  Feature Milestones 
 HAL 1.0 
target: September, 2010
  Functionality for meta-algorithm developers 
 
 Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
  Ability to transform algorithm parameter spaces:  log transforms, discretization done
-<
<
+ Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in redesign
->
>
+ Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done
  Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
-<
<
+ Ability to query database of previous runs directly done in redesign
  Ability to access instance features in prog. in refactor
  Pre-defined metrics for aggregating performance across runs done in redesign
->
>
+ Ability to query database of previous runs directly done
  Ability to access instance features done
  Pre-defined metrics for aggregating performance across runs done
  Backend functionality exposed in above 
 
 Ability to execute algorithms locally done
  Ability to execute algorithms on a remote host via SSH needs update re: API changes
  Ability to execute algorithms on a SGE cluster needs update re: object API changes
  Ability to actively monitor remotely running algorithms via RPC needs update re: object API changes
-<
<
+ MySQL database storing records of all algorithms, instances, runs, etc. being redesigned now
  SQLite database fallback if MySQL unavailable as above
->
>
+ MySQL database storing records of all algorithms, instances, runs, etc. done
  SQLite database fallback if MySQL unavailable done
  R interface for performing statistical tests, etc. done
 

 Meta-Algorithms Included
-<
<
+ Configuration procedure: ParamILS (external) done; will need minor updates to work with backend redesign
->
>
+ Configuration procedure: ParamILS (external) in progress
  Configuration procedure: ROAR (internal) done; will need minor updates to work with backend redesign
-<
<
+ Analysis procedure: Paired algorithm comparison done, updated
  Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign
->
>
+ Analysis procedure: Paired algorithm comparison in progress
  Analysis procedure: Single-algorithm analysis in progress
  Distribution Issues 
 
 Documentation
  Support for "bag-of-machines" execution manager
 

 Meta-Algorithms Included
-<
<
+ Configuration procedure: ActiveConfigurator (internal) in progress
->
>
+ Configuration procedure: ActiveConfigurator (internal)
  Multi-algorithm comparison
  SATzilla-like portfolio builder
  Parallelized AC
  (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
  (CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
  (HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
-<
<
->
>
+ (KLB) Handle network issues (e.g. loss of connection to datamanager, etc.) robustly.  Restart runs, etc., as required to ensure that the originally-requested job ultimately completes correctly with as little babysitting by the user as possible.
  (FH) Normalization transform, in addition to existing log transform
  Active work items 
 Frontend
  Backend 
 Release-critical
-<
<
+ CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces.  Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; in progress for DB)
  CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done in Java objects; to do in data management
  CN Refactor code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
->
>
+ CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces.  Both DB and Java object model; requires Algorithm refactor below. done
  CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done
  CN Refactor code to align class hierarchy with terminology of paper (CN: done for all but meta-algorithm implementations, which are in progress)
  CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
-<
<
+ CN Database schema -- speed-related refactor (CN: in progress)
->
>
+ CN Database schema -- speed-related refactor done (may want further tuning)
  CN Refactor SSH & RPC execution managers to work under refactor
 

 Important
-<
<
+ CN Connection pooling done (contingent on rest of DataManager refactor, above)
  Caching analysis results
  CN Query optimization (CN: in progress)
->
>
+ CN Connection pooling done
  Caching analysis results (CN: in progress as part of meta-alg changes above)
  CN Query optimization done (may want more depending on real-world observations)
  Selective limitation of run-level archiving (dynamic based on runtime?)
  add incumbentname semantic input to (design) procedures
  instance features
  CN DataManager API refinement (in progress as part of DataManager refactor)
  CF N-way performance comparison
  Stale connection issue; incl. robustness to general network issues
-<
<
+ CN Read-only DataManager connection for use by individual MA procedures done (as part of DataManager refactor)
->
>
+ CN Read-only DataManager connection for use by individual MA procedures done
  Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc.  Also for different "versions" (without reuse) corresponding to added functionality.
-<
<
+ Ability to quantify membership of configurations to different design spaces
->
>
+ Ability to quantify membership of configurations to different design spaces done
  Application: ActiveConfigurator 
 Release Critical
-<
<
+ VC ROAR in HAL done
  VC Calling Matlab from Java done
->
>
+ VC ROAR in Java in testing
  VC Calling Matlab from Java in testing
  CN parameter transformations (log, discretization, etc.) done
  VC SMBO, calling Matlab for model building/evaluation  (VC: implemented, in testing)
  Adapt Weka RF implementation for regression
  Support/QA/Misc. 
 Release Critical
-<
<
+ JX unit testing: parameters (domains) (in progress)
  unit testing: parameter spaces
->
>
+ unit testing: parameters (domains) OK
  unit testing: parameter spaces OK
  unit testing: algorithms
  unit testing: execution managers (local, SSH, cluster)
  unit testing: data managers (SQLite, MySQL)
  Important 
 
 CN Git, not CVS done
-<
<
+ CN Order+configure new DB server (CN: ordered; waiting for shipment)
->
>
+ CN Order+configure new DB server (CN: waiting for Dave B to make final changeover)
  user-facing documentation (help)
-<
<
+ CN Better logging/error-reporting (to console/within HAL).  eg: log4j (in progress)
->
>
+ CN Better logging/error-reporting (to console/within HAL).  eg: done (for most cases; exceptions are auto-logged)
  CN JX VC Basic Windows support done, in testing
  Better handling of overhead runtime vs. target algorithm runtime
 

 Nice-to-have
-<
<
+ JX developer-facing documentation (javadocs) (in progress in parallel with unit testing)
->
>
+ developer-facing documentation (javadocs) (in progress in parallel with other work)
  Bug Reports
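
The (CN) request in this revision for convenience methods in MetaAlgorithm describes an `AlgorithmRun` that switches between a "queued" and a "running" implementation. One way to picture that idea is a handle that delegates to a replaceable state object. This is only a sketch under assumptions: every type and method name below (`RunState`, `AdaptiveAlgorithmRun`, `attach`, `MetaAlgorithmSupport`) is hypothetical, and `fetchRun` takes a plain algorithm name rather than HAL's real `Algorithm` type.

```java
// All types below are hypothetical stand-ins; none are taken from the HAL codebase.
interface RunState {
    String status();
}

final class QueuedState implements RunState {
    public String status() { return "queued"; }
}

final class RunningState implements RunState {
    public String status() { return "running"; }
}

// Handle given to the meta-algorithm developer. It forwards to whichever
// state is current, so the caller never sees the switch.
final class AdaptiveAlgorithmRun {
    private final String algorithmName;
    private volatile RunState state = new QueuedState();

    AdaptiveAlgorithmRun(String algorithmName) {
        this.algorithmName = algorithmName;
    }

    String algorithmName() { return algorithmName; }

    String status() { return state.status(); }

    // Called by the execution environment once the real run exists.
    void attach(RunState realRun) { state = realRun; }
}

final class MetaAlgorithmSupport {
    // Returns at once with a queued handle and declares no checked exception.
    static AdaptiveAlgorithmRun fetchRun(String algorithmName) {
        return new AdaptiveAlgorithmRun(algorithmName);
    }
}
```

The design choice illustrated is that the developer-facing call never blocks; the environment swaps the delegate later, which is what removes the need for `InterruptedException` in the caller's code.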

Revision 43 (2010-08-05) - ChrisNell

Line: 1 to 1
	Feature Milestones HAL 1.0 target: September, 2010
Line: 23 to 23
	Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done Ability to query database of previous runs directly done in redesign Ability to access instance features in prog. in refactor
Deleted:
< <	Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
	Pre-defined metrics for aggregating performance across runs done in redesign Backend functionality exposed in above
Line: 38 to 37
	Meta-Algorithms Included Configuration procedure: ParamILS (external) done; will need minor updates to work with backend redesign Configuration procedure: ROAR (internal) done; will need minor updates to work with backend redesign
Deleted:
< <	Configuration procedure: ActiveConfigurator (internal) in progress
	Analysis procedure: Paired algorithm comparison done, updated Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign
Line: 57 to 55
	Ability to "chain" experiments (eg. design procs. followed by analysis proc comparing incumbents) Functionality for meta-algorithm developers
Added:
> >	Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
	support for feature extraction procedures Backend functionality
Line: 64 to 63
	Support for "bag-of-machines" execution manager Meta-Algorithms Included
Added:
> >	Configuration procedure: ActiveConfigurator (internal) in progress
	Multi-algorithm comparison SATzilla-like portfolio builder Parallelized AC

Revision 42 (2010-07-28) - ChrisNell

Line: 1 to 1
	Feature Milestones HAL 1.0 target: September, 2010
Line: 24 to 24
	Ability to query database of previous runs directly done in redesign Ability to access instance features in prog. in refactor Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
Changed:
< <	Pre-defined metrics for aggregating performance across runs
> >	Pre-defined metrics for aggregating performance across runs done in redesign
	Backend functionality exposed in above Ability to execute algorithms locally done

Revision 41 (2010-07-28) - ChrisNell

Line: 1 to 1
	Feature Milestones HAL 1.0 target: September, 2010
Line: 12 to 12
	Page to view summary of all queued, running, and completed jobs Page to view browse/view details/delete runs/problems/instances/algorithms/environments Dynamic run monitoring analysis pages, including:
Changed:
< <	Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs.
< <	Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist
< <	Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist
> >	Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs. done but being reworked
> >	Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist done
> >	Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist done
	Functionality for meta-algorithm developers Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
Line: 42 to 42
	Analysis procedure: Paired algorithm comparison done, updated Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign
Changed:
< <
> >	Distribution Issues
> >	Documentation
> >	Detection/configuration of external dependencies (cf. UI/execution environment specification)
> >	Double-click-to-run universal JAR distribution
	HAL 1.1
Line: 54 to 57
	Ability to "chain" experiments (eg. design procs. followed by analysis proc comparing incumbents) Functionality for meta-algorithm developers
Changed:
< <
> >	support for feature extraction procedures
	Backend functionality Support for TORQUE clusters
Added:
> >	Support for "bag-of-machines" execution manager
	Meta-Algorithms Included Multi-algorithm comparison SATzilla-like portfolio builder Parallelized AC
Added:
> >	ParamILS (internal)
	HAL 1.x target: 2011
Deleted:
< <	Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
	libraries of: search/optimization procedures machine learning tools
Line: 75 to 79
	bootstrapped analyses robustness analyses parameter response analyses
Deleted:
< <	ParamILS in HAL
	Parallel portfolios in HAL Iterated F-Race in HAL support for optimization/Monte-Carlo experiments support instance generators
Deleted:
< <	support for feature extraction procedures
	support for instance format converters Support text-file inputs and outputs for external algorithms (now is only cmd line, and stdin/err) array jobs in SGE

Revision 40 (2010-07-27) - ChrisNell

Line: 1 to 1
Deleted:
< <
	Feature Milestones HAL 1.0 target: September, 2010
Line: 7 to 6
	Page to add new external target algorithms Page to add new parameter spaces for a given target algorithm (modified from existing spaces) Page to add new problem instances/distributions (in the form of lists of files)
Changed:
< <	Page to specify new execution environments (i.e. cluster config details)
> >	Page to specify new execution environments (e.g. cluster config details)
	Pages to specify & launch included meta-algorithms Ability to view algorithms/instances by problem (instance compatibility) during above specification Page to view summary of all queued, running, and completed jobs Page to view browse/view details/delete runs/problems/instances/algorithms/environments
Changed:
< <	Dnyamic monitoring pages, including:
> >	Dynamic run monitoring analysis pages, including:
	Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs. Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist
Line: 20 to 19
	Functionality for meta-algorithm developers Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done Ability to transform algorithm parameter spaces: log transforms, discretization done
Changed:
< <	Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in refactor
> >	Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in redesign
	Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
Changed:
< <	Ability to query database of previous runs directly done in refactor
> >	Ability to query database of previous runs directly done in redesign
	Ability to access instance features in prog. in refactor Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
Added:
> >	Pre-defined metrics for aggregating performance across runs
	Backend functionality exposed in above Ability to execute algorithms locally done
Line: 33 to 33
	Ability to actively monitor remotely running algorithms via RPC needs update re: object API changes MySQL database storing records of all algorithms, instances, runs, etc. being redesigned now SQLite database fallback if MySQL unavailable as above
Changed:
< <
> >	R interface for performing statistical tests, etc. done
	Meta-Algorithms Included
Changed:
< <	Configuration procedure: ParamILS (external) done; will need minor updates to work with backend refactor Configuration procedure: ROAR (internal) done; will need minor updates to work with backend refactor
> >	Configuration procedure: ParamILS (external) done; will need minor updates to work with backend redesign Configuration procedure: ROAR (internal) done; will need minor updates to work with backend redesign
	Configuration procedure: ActiveConfigurator (internal) in progress Analysis procedure: Paired algorithm comparison done, updated
Changed:
< <	Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend refactor
> >	Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend redesign

Line: 64 to 64
	SATzilla-like portfolio builder Parallelized AC
Changed:
< <	Scheduled Tasks
> >	HAL 1.x target: 2011
> >	Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
> >	libraries of:
> >	search/optimization procedures
> >	machine learning tools
> >	multi-algorithm comparisons
> >	scaling analyses
> >	bootstrapped analyses
> >	robustness analyses
> >	parameter response analyses
> >	ParamILS in HAL
> >	Parallel portfolios in HAL
> >	Iterated F-Race in HAL
> >	support for optimization/Monte-Carlo experiments
> >	support instance generators
> >	support for feature extraction procedures
> >	support for instance format converters
> >	Support text-file inputs and outputs for external algorithms (now is only cmd line, and stdin/err)
> >	array jobs in SGE
> >	Wider support for working directory requirements of individual algorithm runs, e.g. Concorde's creation of 20 files with fixed names.
> >	Unprioritized Features
> >	new feature requests should be initially added here; notify a HAL developer and come to a HAL meeting if you feel your feature must move up the stack quickly
> >	(FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances CN: can hopefully be implemented as a chained experiment
> >	(FH) Developers of configurators should be able to swap in new versions of a configurator CN:
> >	(FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
> >	(FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
> >	(FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") CN: this is what is being implemented in the ongoing backend redesign
> >	(FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
> >	(CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
> >	(HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
> >	(CF) Continued testing to support LAMA-ish difficulties in HAL:
> >	  * Wallclock vs. CPU cutoff options
> >	  * Warnings in the dashboard if target runs or experiments are behaving "strangely"
> >	  * Email notifications sent to users when various events happen
> >	(CF) Restricted data/execution/targetalgs for the demo server
> >	(CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
> >	(CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
> >	(HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
> >	Active work items
	Frontend Release-critical CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress
Line: 146 to 189
	JX developer-facing documentation (javadocs) (in progress in parallel with unit testing)
Deleted:
< <	Medium-term
< <	For future HAL 1.x revisions
< <	Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
< <	libraries of:
< <	search/optimization procedures
< <	machine learning tools
< <	multi-algorithm comparisons
< <	scaling analyses
< <	bootstrapped analyses
< <	robustness analyses
< <	parameter response analyses
< <	SATzilla in HAL
< <	ParamILS in HAL
< <	Parallel portfolios in HAL
< <	Iterated F-Race in HAL
< <	chained-procedure experiments
< <	support for optimization/Monte-Carlo experiments
< <	support instance generators
< <	Support text-file inputs and outputs for external algorithms
< <	array jobs in SGE
< <	Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
< <	Validation of form input.
< <	Scriptable submission of experiments. (CF): Accelerated for Frank, finished 18/05/2010.
< <	Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
< <	Long-term/Unprioritized
< <	Feature requests should be initially added here
< <	(FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
< <	(FH) Developers of configurators should be able to swap in new versions of a configurator
< <	(FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
< <	(FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
< <	(FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
< <	(FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
< <	(CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
< <	(HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
< <	(CF) Continued testing to support LAMA-ish difficulties in HAL:
< <	  * Wallclock vs. CPU cutoff options
< <	  * Warnings in the dashboard if target runs or experiments are behaving "strangely"
< <	  * Email notifications sent to users when various events happen
< <	(CF) Restricted data/execution/targetalgs for the demo server
< <	(CN) Support of performance metrics
< <	(CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
< <	(CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
< <	(HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
	Bug Reports (CN) JSC test reliability issue (compared to R) (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
Deleted:
< <	(JS) InnoDB SQL errors (CN): fixed 11/05/10
	(LX) missing current-time point in solution quality trace, so don't see the final "flat line" (CN) accuracy of mid-run overhead accounting for PILS/GGA (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument. (CN) does this work with double-quotes instead of single-quotes?
Line: 203 to 199
	(JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager. (JS) Algorithms with a requirement of a new directory for each run. (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
Deleted:
< <	(CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
	(FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway. (CN) DataManager-decorated ExecutionManager still requires explicit commit to save results. Also run results cannot be saved unless explicitly associated with an experiment id. (CN) Parameter values (eg Instance files) with spaces are split during command string construction; need to enquote them as necessary.
Added:
> >	(CN) Form input not validated (moved from feature requests)
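
The (CN) bug report in this revision about parameter values with spaces being split during command string construction, and the related (CF) report about callstrings with embedded spaces, both come down to quoting. A minimal sketch of quoting each argument before joining is shown here, assuming the command string is later interpreted by a POSIX shell. `ShellQuoting`, `quote`, and `join` are hypothetical names for illustration and are not HAL's implementation.

```java
import java.util.List;

// Hypothetical helper, not taken from HAL: quotes arguments for a POSIX shell.
final class ShellQuoting {
    // Returns the value unchanged if it contains only characters that need no
    // quoting; otherwise wraps it in single quotes, writing each embedded
    // single quote as '\'' so the shell reassembles the original text.
    static String quote(String value) {
        if (value.matches("[A-Za-z0-9_./=:-]+")) {
            return value;
        }
        return "'" + value.replace("'", "'\\''") + "'";
    }

    // Builds one command string in which every argument stays a single token.
    static String join(List<String> args) {
        StringBuilder sb = new StringBuilder();
        for (String arg : args) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(quote(arg));
        }
        return sb.toString();
    }
}
```

When no shell is involved, passing the arguments as a list to `java.lang.ProcessBuilder` avoids the problem entirely, because each list element is handed to the program as one argument without any string splitting.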

Revision 39 (2010-07-27) - ChrisNell

-<
<
+ Short-term 
Target: CRC/initial release
->
>
+ Feature Milestones 
 HAL 1.0 
target: September, 2010
 Web UI Features 
 
 Page to add new external target algorithms
  Page to add new parameter spaces for a given target algorithm (modified from existing spaces)
  Page to add new problem instances/distributions (in the form of lists of files)
  Page to specify new execution environments (i.e. cluster config details)
  Pages to specify & launch included meta-algorithms
  Ability to view algorithms/instances by problem (instance compatibility) during above specification
  Page to view summary of all queued, running, and completed jobs
  Page to view browse/view details/delete runs/problems/instances/algorithms/environments
  Dnyamic monitoring pages, including:
  Plots: Overlaid SCDs for (fixed #) multi-alg, multi inst meta-algs (RTDs for single-inst), SQT for meta-algs where possible, scatter plot for 2-target multi-instance meta-algs, incumbent SCD/RTD for design meta-algs.  
  Descriptive statistics: (mean/sd, quantiles/iqrs) for assessing single-algorithm on an instance dist
  Statistical tests: Wilcoxon signed rank, Spearman correlation for comparing 2 algs on an instance dist
 

 Functionality for meta-algorithm developers 
 
 Ability to interact with the parameter space of an algorithm (examine domains, conditionalities, etc.) done
  Ability to transform algorithm parameter spaces:  log transforms, discretization done
  Ability to run arbitrary algorithms, including other meta-algorithms, in identical fashion done in refactor
  Ability to monitor the trajectories of all output variables of an executed algorithm run, in real time done
  Ability to query database of previous runs directly done in refactor
  Ability to access instance features in prog. in refactor
  Random Forest classification + regression models, incl. interface accepting AlgorithmRun objects for training and inference
 

 Backend functionality exposed in above 
 
 Ability to execute algorithms locally done
  Ability to execute algorithms on a remote host via SSH needs update re: API changes
  Ability to execute algorithms on a SGE cluster needs update re: object API changes
  Ability to actively monitor remotely running algorithms via RPC needs update re: object API changes
  MySQL database storing records of all algorithms, instances, runs, etc. being redesigned now
  SQLite database fallback if MySQL unavailable as above
  
 

 Meta-Algorithms Included 
 
 Configuration procedure: ParamILS (external) done; will need minor updates to work with backend refactor
  Configuration procedure: ROAR (internal) done; will need minor updates to work with backend refactor
  Configuration procedure: ActiveConfigurator (internal) in progress
  Analysis procedure: Paired algorithm comparison done, updated
  Analysis procedure: Single-algorithm analysis done; will need minor updates to work with backend refactor
 




 HAL 1.1 
target: December, 2010

 Web UI Features 
 
 Ability to export complete experiment packages (including algorithms, instances, run instructions)
  Ability to load and execute an experiment package
  Ability to "chain" experiments (eg. design procs. followed by analysis proc comparing incumbents)
 

 Functionality for meta-algorithm developers 
 
 
 

 Backend functionality 
 
 Support for TORQUE clusters
 

 Meta-Algorithms Included 
 
 Multi-algorithm comparison
  SATzilla-like portfolio builder
  Parallelized AC
 

 Scheduled Tasks
  Frontend 
 Release-critical
-<
<
+functionality promised in paper
  CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress
  CF left side of landing page:  task selection/presentation according to pattern concept (CF): In Progress
  CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
  CF Execution environment specification (incl. R, Gnuplot, java locations) (CF): In Progress
  RTDs/per-target-algorithm-run monitoring and navigation
  design space specification by revision of existing spaces
->
>
+ Merge with backend refactor (when done)
  Important
-<
<
+works as-is but end-user experience significantly impacted
  Data management interface: 
 deleting runs/expts/etc.
  data export
  Backend 
 Release-critical
-<
<
+for functionality mentioned in paper for which post-release changes would be problematic 
 CN Named instance set table done
  CN Named configuration table done
  CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; mostly done but not checked in
  CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces.  Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; begun for DB)
->
>
+ CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces.  Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; in progress for DB)
  CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done in Java objects; to do in data management
-<
<
+ CN rename objects to match paper terminology done
  CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
->
>
+ CN Refactor code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
  CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
  CN Database schema -- speed-related refactor (CN: in progress)
->
>
+ CN Refactor SSH & RPC execution managers to work under refactor
  Important
-<
<
+mostly to (substantially) improve UI responsiveness
  CN Connection pooling done (contingent on rest of DataManager refactor, above)
  Caching analysis results
  CN Query optimization (CN: in progress)
  Selective limitation of run-level archiving (dynamic based on runtime?)
  add incumbentname semantic input to (design) procedures
->
>
+ instance features
  Nice-to-have
-<
<
+noticeable mostly to developer-users
  CN DataManager API refinement (in progress as part of DataManager refactor)
-<
<
+ CF N-way performance comparison first-cut for Frank.
->
>
+ CF N-way performance comparison
  Stale connection issue; incl. robustness to general network issues
  CN Read-only DataManager connection for use by individual MA procedures done (as part of DataManager refactor)
  Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc.  Also for different "versions" (without reuse) corresponding to added functionality.
  Application: ActiveConfigurator 
 Release Critical
-<
<
+ VC ROAR in HAL (CN: implemented, in testing)
->
>
+ VC ROAR in HAL done
  VC Calling Matlab from Java done
  CN parameter transformations (log, discretization, etc.) done
-<
<
+ VC SMBO, calling Matlab for model building/evaluation
->
>
+ VC SMBO, calling Matlab for model building/evaluation  (VC: implemented, in testing)
  Adapt Weka RF implementation for regression
  Pure-Java SMBO implementation
  Merge Java AC with refactored HAL codebase once refactor is completed
  Support/QA/Misc. 
 Release Critical
-<
<
+ JX unit testing: parameters (domains)
->
>
+ JX unit testing: parameters (domains) (in progress)
  unit testing: parameter spaces
  unit testing: algorithms
  unit testing: execution managers (local, SSH, cluster)
-<
<
+ unit testing: data managers (SQLite, Mysql)
->
>
+ unit testing: data managers (SQLite, MySQL)
  unit testing: meta-algorithms
  functional testing:  full pipeline
->
>
+ Licensing issues (GPL'd components...)
  Important 
 
 CN Git, not CVS done
  CN Order+configure new DB server (CN: ordered; waiting for shipment)
  user-facing documentation (help)
-<
<
+ Better logging/error-reporting (to console/within HAL).  eg: log4j
->
>
+ CN Better logging/error-reporting (to console/within HAL).  eg: log4j (in progress)
  CN JX VC Basic Windows support done, in testing
  Better handling of overhead runtime vs. target algorithm runtime
 

 Nice-to-have
-<
<
+ developer-facing documentation (javadocs) (JX: in progress in parallel with unit testing)
->
>
+ JX developer-facing documentation (javadocs) (in progress in parallel with unit testing)
  Medium-term
-<
<
+Planned for future HAL 1.x revisions
->
>
+For future HAL 1.x revisions
  Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
-<
<
+ Windows support
  libraries of: 
 search/optimization procedures
  machine learning tools
  support for optimization/Monte-Carlo experiments
  support instance generators
  Support text-file inputs and outputs for external algorithms
-<
<
+ Instance features
  Explicit representation of problems (e.g. particular instance formats)
  Experiments calling experiments, not just external target algs
  array jobs in SGE
-<
<
+ Hashing everything, including instances, instance sets and configurations.
  Wider support for working directory requirements of individual algorithm runs, e.g. Concorde's creation of 20 files with fixed names.
  Validation of form input.
  Scriptable submission of experiments. (CF): Accelerated for Frank, finished 18/05/2010.
  (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
-<
<
+ (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
  (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
  (HH) Significance-gated analysis / sequential hypothesis testing (see email from HH).
  (CF) Continued testing to support LAMA-ish difficulties in HAL:

Revision 38 2010-07-20 - ChrisNell

  Short-term 
Target: CRC/initial release
 Frontend
  Important 
mostly to (substantially) improve UI responsiveness
-<
<
+ Connection pooling
->
>
+ CN Connection pooling done (contingent on rest of DataManager refactor, above)
  Caching analysis results
-<
<
+ Query optimization
->
>
+ CN Query optimization (CN: in progress)
  Selective limitation of run-level archiving (dynamic based on runtime?)
  add incumbentname semantic input to (design) procedures
 

 Nice-to-have 
noticeable mostly to developer-users
-<
<
+ DataManager API refinement
->
>
+ CN DataManager API refinement (in progress as part of DataManager refactor)
  CF N-way performance comparison first-cut for Frank.
  Stale connection issue; incl. robustness to general network issues
-<
<
+ Read-only DataManager connection for use by individual MA procedures
->
>
+ CN Read-only DataManager connection for use by individual MA procedures done (as part of DataManager refactor)
  Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc.  Also for different "versions" (without reuse) corresponding to added functionality.
  Ability to quantify membership of configurations to different design spaces

Revision 37 2010-07-15 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 28 to 28
	CN Named instance set table done CN Named configuration table done CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; mostly done but not checked in
Changed:
< <	CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; not started for DB) CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). (CN: in progress)
> >	CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; begun for DB) CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). done in Java objects; to do in data management
	CN rename objects to match paper terminology done
Changed:
< <	CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: in progress)
> >	CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: done for all but configurator implementations)
	CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done
Changed:
< <	CN Database schema -- speed-related refactor
> >	CN Database schema -- speed-related refactor (CN: in progress)
	Important mostly to (substantially) improve UI responsiveness
Line: 53 to 53
	Ability to quantify membership of configurations to different design spaces
Added:
> >	Application: ActiveConfigurator Release Critical VC ROAR in HAL (CN: implemented, in testing) VC Calling Matlab from Java done CN parameter transformations (log, discretization, etc.) done VC SMBO, calling Matlab for model building/evaluation Adapt Weka RF implementation for regression Pure-Java SMBO implementation Merge Java AC with refactored HAL codebase once refactor is completed Adapt standalone Java AC to work as "internal" HAL meta-algorithm
	Support/QA/Misc. Release Critical
Changed:
< <	more unittests; also functional/integration tests
> >	JX unit testing: parameters (domains) unit testing: parameter spaces unit testing: algorithms unit testing: execution managers (local, SSH, cluster) unit testing: data managers (SQLite, Mysql) unit testing: meta-algorithms functional testing: full pipeline
	Important
Added:
> >	CN Git, not CVS done CN Order+configure new DB server (CN: ordered; waiting for shipment)
	user-facing documentation (help) Better logging/error-reporting (to console/within HAL). eg: log4j Better handling of overhead runtime vs. target algorithm runtime Nice-to-have
Changed:
< <	developer-facing documentation (javadocs)
> >	developer-facing documentation (javadocs) (JX: in progress in parallel with unit testing)
	Medium-term
Line: 82 to 102
	SATzilla in HAL ParamILS in HAL Parallel portfolios in HAL
Deleted:
< <	ActiveConfigurator in HAL
	Iterated F-Race in HAL chained-procedure experiments support for optimization/Monte-Carlo experiments support instance generators
Deleted:
< <	Git, not CVS
	Support text-file inputs and outputs for external algorithms Instance features Explicit representation of problems (e.g. particular instance formats)

Revision 36 2010-06-16 - ChrisFawcett

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 119 to 119
	(CN) Support of performance metrics (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration? (CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
Added:
> >	(HH) Service-oriented volunteer computing. See, e.g., "Service-Oriented Volunteer Computing for Massively Parallel Constraint Solving Using Portfolios", Zeynep Kiziltan and Jacopo Mauro, in CPAIOR-2010 proceedings.
	Bug Reports (CN) JSC test reliability issue (compared to R)

Revision 35 2010-06-14 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 118 to 118
	(CF) Restricted data/execution/targetalgs for the demo server (CN) Support of performance metrics (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
Added:
> >	(CN) convenience methods in MetaAlgorithm hiding next(), hasNext(), report() from the 3rd-party developer; instead providing an interface like AlgorithmRun fetchRun(Algorithm a), with no InterruptedException; implies an AlgorithmRun class that can adaptively switch between a "queued" and a "running" implementation for before and after the true environment fetchRun(...) call is made/returns.
	Bug Reports (CN) JSC test reliability issue (compared to R)
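The fetchRun(...) request above implies a run handle that is returned immediately in a "queued" state and switches to a "running" implementation once the environment actually launches the run. A minimal Java sketch of that idea follows. Every name in it (AdaptiveRun, RunState, markStarted, and this fetchRun signature) is hypothetical and chosen for illustration; none is taken from the HAL codebase.

```java
import java.util.concurrent.atomic.AtomicReference;

public class AdaptiveRunSketch {
    /** Hypothetical view of a run's current state; not a HAL interface. */
    interface RunState {
        String status();
    }

    /** State used before the environment has launched the run. */
    static final class QueuedState implements RunState {
        public String status() { return "queued"; }
    }

    /** State used once the run has actually been started. */
    static final class RunningState implements RunState {
        public String status() { return "running"; }
    }

    /** Run handle that delegates to whichever state is current. */
    static final class AdaptiveRun {
        private final String algorithmName;
        private final AtomicReference<RunState> state =
                new AtomicReference<>(new QueuedState());

        AdaptiveRun(String algorithmName) { this.algorithmName = algorithmName; }

        String algorithmName() { return algorithmName; }

        String status() { return state.get().status(); }

        // Assumed callback: the environment calls this when the run launches.
        // The swap only happens if the run is still queued.
        void markStarted() {
            state.updateAndGet(s -> s instanceof QueuedState ? new RunningState() : s);
        }
    }

    /** Returns a handle at once; declares no InterruptedException. */
    static AdaptiveRun fetchRun(String algorithmName) {
        return new AdaptiveRun(algorithmName);
    }

    public static void main(String[] args) {
        AdaptiveRun run = fetchRun("solver");
        System.out.println(run.status());
        run.markStarted();
        System.out.println(run.status());
    }
}
```

The caller only ever holds the AdaptiveRun handle, so the switch between the two states stays invisible to third-party meta-algorithm code.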

Revision 34 2010-06-10 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 27 to 27
	for functionality mentioned in paper for which post-release changes would be problematic CN Named instance set table done CN Named configuration table done
Changed:
< <	CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; in progress CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: in progress) CN Database schema -- speed-related refactor (CN: next up) CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings).
> >	CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; mostly done but not checked in CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: done for Java objects; not started for DB) CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). (CN: in progress)
	CN rename objects to match paper terminology done
Changed:
< <	CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above
> >	CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper (CN: in progress) CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above done CN Database schema -- speed-related refactor
	Important mostly to (substantially) improve UI responsiveness
Line: 133 to 133
	(CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10 (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever (FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.
Added:
> >	(CN) DataManager-decorated ExecutionManager still requires explicit commit to save results. Also run results cannot be saved unless explicitly associated with an experiment id. (CN) Parameter values (eg Instance files) with spaces are split during command string construction; need to enquote them as necessary.
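The space-splitting bug above can be illustrated with a small quoting helper. This is a sketch under stated assumptions, not HAL's actual fix: the class and method names (CommandQuoting, enquote, buildCommand) are hypothetical, and the escaping is deliberately simplified (it handles whitespace, double quotes and backslashes only, not shell metacharacters such as $ or backticks).

```java
import java.util.Arrays;
import java.util.List;

public class CommandQuoting {
    // Hypothetical helper: wraps a value in double quotes when it is empty
    // or contains whitespace or a double quote; other values pass unchanged.
    static String enquote(String value) {
        boolean needsQuotes = value.isEmpty();
        for (int i = 0; i < value.length() && !needsQuotes; i++) {
            char c = value.charAt(i);
            needsQuotes = Character.isWhitespace(c) || c == '"';
        }
        if (!needsQuotes) {
            return value;
        }
        // Escape backslashes first, then embedded double quotes.
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Joins the binary and its arguments into one command string,
    // quoting each piece only where needed.
    static String buildCommand(String binary, List<String> args) {
        StringBuilder sb = new StringBuilder(enquote(binary));
        for (String arg : args) {
            sb.append(' ').append(enquote(arg));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> params = Arrays.asList("-instance", "my instances/file 1.cnf");
        System.out.println(buildCommand("solver", params));
    }
}
```

Where the command is launched directly from Java, passing the arguments as a list to java.lang.ProcessBuilder avoids building (and re-splitting) a single command string in the first place.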

Revision 33 2010-06-08 - ChrisFawcett

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 69 to 69
	Medium-term Planned for future HAL 1.x revisions
Changed:
< <	Packaging complete experiments
> >	Packaging/bundling complete experiments or other HAL primitives for easy reproduction or installation by other users.
	Windows support libraries of: search/optimization procedures

Revision 32 2010-05-27 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 52 to 52
	Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality. Ability to quantify membership of configurations to different design spaces
Added:
> >
	Support/QA/Misc.
Added:
> >	Release Critical more unittests; also functional/integration tests
	Important user-facing documentation (help) Better logging/error-reporting (to console/within HAL). eg: log4j Better handling of overhead runtime vs. target algorithm runtime
Deleted:
< <	WAY more unittests; also functional/integration tests
	Nice-to-have developer-facing documentation (javadocs)
Line: 99 to 102
	Long-term/Unprioritized Feature requests should be initially added here
Deleted:
< <	(FH) Probably simple: Support PAR-10 as one parameter for ParamILS (FH) Probably simple: Support single-CPU arrow runs with csh shell (currently, I can only run ParamILS using 2 CPUs, one of which is then idle)
	(FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances (FH) Developers of configurators should be able to swap in new versions of a configurator (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator

Revision 31 2010-05-27 - FrankHutter

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 100 to 99
	Long-term/Unprioritized Feature requests should be initially added here
Added:
> >	(FH) Probably simple: Support PAR-10 as one parameter for ParamILS (FH) Probably simple: Support single-CPU arrow runs with csh shell (currently, I can only run ParamILS using 2 CPUs, one of which is then idle)
	(FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances (FH) Developers of configurators should be able to swap in new versions of a configurator (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator

Revision 30 2010-05-27 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 27 to 27
	for functionality mentioned in paper for which post-release changes would be problematic CN Named instance set table done CN Named configuration table done
Changed:
< <	CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc. CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below.
> >	CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.; in progress CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. (CN: in progress) CN Database schema -- speed-related refactor (CN: next up)
	CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). CN rename objects to match paper terminology done CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above
Deleted:
< <	CN Database schema -- speed-related refactor
	Important mostly to (substantially) improve UI responsiveness
Line: 114 to 114
	* Warnings in the dashboard if target runs or experiments are behaving "strangely" * Email notifications sent to users when various events happen (CF) Restricted data/execution/targetalgs for the demo server
Added:
> >	(CN) Support of performance metrics
	(CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration? Bug Reports

Revision 29 2010-05-27 - ChrisFawcett

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 19 to 19
	data export Error logging/handling/browsing Plotting ex-gnuplot
Changed:
< <
> >	Documentation as a header on most of the experiment pages, paragraph explaining the intention etc. Hiding "advanced" settings, such as configurator-specific settings or other tools, with appropriate defaults.
	Backend Release-critical
Line: 128 to 129
	(JS) one of the ExecutionManagers produces unstarted AlgorithmRuns (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10 (FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever
Added:
> >	(FH) Database table contention causes locking and high query latency. Likely to be fixed by database changes and use of InnoDB, but I'm reporting it anyway.

Revision 28 2010-05-25 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 127 to 127
	(JS) Algorithms with a requirement of a new directory for each run. (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns (CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10
Added:
> >	(FH) If a HAL slave process fails to start, the associated expt. status stays on "queued" forever

Revision 27 2010-05-19 - ChrisFawcett

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend Release-critical functionality promised in paper
Changed:
< <	CF algorithm specification screen: implement (includes initial design space specification) CF left side of landing page: task selection/presentation according to pattern concept
> >	CF algorithm specification screen: implement (includes initial design space specification) (CF): In Progress CF left side of landing page: task selection/presentation according to pattern concept (CF): In Progress
	CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
Changed:
< <	CF instance specification screen: implement CF Execution environment specification (incl. R, Gnuplot, java locations)
> >	CF instance specification screen: implement (CF): In Progress CF Execution environment specification (incl. R, Gnuplot, java locations) (CF): In Progress
	RTDs/per-target-algorithm-run monitoring and navigation design space specification by revision of existing spaces
Line: 26 to 26
	for functionality mentioned in paper for which post-release changes would be problematic CN Named instance set table done CN Named configuration table done
Changed:
< <	CN Execution environment table done
> >	CN Execution environment table. (CF): Reopened to account for java/ruby/gnuplot location specification etc.
	CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings). CN rename objects to match paper terminology done
Line: 93 to 93
	Hashing everything, including instances, instance sets and configurations. Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names. Validation of form input.
Added:
> >	Scriptable submission of experiments. (CF): Accelerated for Frank, finished 18/05/2010.
	Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
Line: 106 to 107
	(FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
Changed:
< <	(HH) Significance-gated testing / Sequential Testing (see email from HH).
> >	(HH) Significance-gated analysis / sequential hypothesis testing (see email from HH). (CF) Continued testing to support LAMA-ish difficulties in HAL: * Wallclock vs. CPU cutoff options * Warnings in the dashboard if target runs or experiments are behaving "strangely" * Email notifications sent to users when various events happen (CF) Restricted data/execution/targetalgs for the demo server (CF) Selection of performance metric before selecting the configurator to use. What is the exact problem specification for configuration?
	Bug Reports (CN) JSC test reliability issue (compared to R)

Revision 26 2010-05-19 - ChrisFawcett

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 106 to 106
	(FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
Added:
> >	(HH) Significance-gated testing / Sequential Testing (see email from HH).
	Bug Reports (CN) JSC test reliability issue (compared to R)

Revision 25 2010-05-19 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 107 to 107
	(JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private (CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
Changed:
< <	Bugs Reports
> >	Bug Reports
	(CN) JSC test reliability issue (compared to R) (CN) end-of-experiment hanging bug (GGA, multinode cluster runs) (JS) InnoDB SQL errors (CN): fixed 11/05/10
Line: 118 to 118
	(JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager. (JS) Algorithms with a requirement of a new directory for each run. (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
Changed:
< <	(CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time.
> >	(CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time. (CN): fixed 18/05/10

Revision 24 2010-05-18 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 23 to 23
	Backend Release-critical
Changed:
< <	mostly to enable critical UI tasks
> >	for functionality mentioned in paper for which post-release changes would be problematic
	CN Named instance set table done CN Named configuration table done CN Execution environment table done
Changed:
< <	CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model. CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings)
> >	CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model; requires Algorithm refactor below. CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings).
	CN rename objects to match paper terminology done
Added:
> >	CN Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper CN Refactor Algorithm/ParameterSpace/Parameter/Domain structure to allow above CN Database schema -- speed-related refactor
	Important mostly to (substantially) improve UI responsiveness
Deleted:
< <	Database schema -- speed-related refactor
	Connection pooling Caching analysis results Query optimization
Line: 48 to 50
	Read-only DataManager connection for use by individual MA procedures Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality. Ability to quantify membership of configurations to different design spaces
Deleted:
< <	Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper Refactor/cleanup Algorithm/ParameterSpace/Parameter/Domain structure
	Support/QA/Misc.

Revision 23 2010-05-18 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 29 to 29
	CN Execution environment table done CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces. Both DB and Java object model. CN Explicit representation of problems/encodings, compatibility of algs and instances via problem (encodings)
Changed:
< <	CN add incumbentname semantic input to (design) procedures CN rename objects to match paper terminology
> >	CN rename objects to match paper terminology done
	Important mostly to (substantially) improve UI responsiveness
Line: 39 to 38
	Caching analysis results Query optimization Selective limitation of run-level archiving (dynamic based on runtime?)
Added:
> >	add incumbentname semantic input to (design) procedures
	Nice-to-have noticeable mostly to developer-users

Revision 22 2010-05-18 - ChrisFawcett

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 105 to 105
	(FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH (JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
Added:
> >	(CF) Memory usage / CPU time monitoring in HAL of target algorithm runs, in order to report warnings on potential problems (like excessive swapping for example).
	Bugs Reports (CN) JSC test reliability issue (compared to R)
Line: 117 to 118
	(JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager. (JS) Algorithms with a requirement of a new directory for each run. (JS) one of the ExecutionManagers produces unstarted AlgorithmRuns
Added:
> >	(CF) When HAL kills a target algorithm run, it does not also kill all child processes spawned by that run. This can leave zombies and all kinds of other very bad things after a period of time.

Revision 21 2010-05-18 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 48 to 48
	Read-only DataManager connection for use by individual MA procedures Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc. Also for different "versions" (without reuse) corresponding to added functionality. Ability to quantify membership of configurations to different design spaces
Added:
> >	Refactor Algorithms/Meta-algorithms in code to align class hierarchy with terminology of paper Refactor/cleanup Algorithm/ParameterSpace/Parameter/Domain structure
	Support/QA/Misc.

Revision 20 2010-05-12 - ChrisNell

Line: 1 to 1
	Short-term Target: CRC/initial release Frontend
Line: 102 to 102
	(FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it) (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") (FH) Submitting runs from a machine that is itself a cluster submit host should not need to go through SSH
Added:
> >	(JS) public static AlgorithmRun subclasses in most ExecutionManagers should probably be private
	Bugs Reports (CN) JSC test reliability issue (compared to R)
Line: 113 to 114
	(JS) FixedConfigurationExperiment UI is outdated, unusable. (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager. (JS) Algorithms with a requirement of a new directory for each run.
Added:
> >	(JS) one of the ExecutionManagers produces unstarted AlgorithmRuns

Revision 19 2010-05-12 - ChrisNell

-<
<
+ Short-term priorities (Pre-CRC/release)
->
>
+ Short-term 
Target: CRC/initial release
  Frontend 
 Release-critical 
functionality promised in paper 
 CF algorithm specification screen: implement (includes initial design space specification)
  CF left side of landing page:  task selection/presentation according to pattern concept
-<
<
+ CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and inclubment naming
->
>
+ CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and incumbent naming
  CF instance specification screen: implement
  CF Execution environment specification (incl. R, Gnuplot, java locations)
  RTDs/per-target-algorithm-run monitoring and navigation
  data export
 
  Error logging/handling/browsing
  Plotting ex-gnuplot
-<
<
+ Named configurations, ability to specify name for final incumbent a priori and reference it in subsequent comparison experiments.
  Backend
  Ability to quantify membership of configurations to different design spaces
-<
<
+ Code/Robustness/Misc. tasks
->
>
+ Support/QA/Misc.
  Important 
 
 user-facing documentation (help)
  Better logging/error-reporting (to console/within HAL).  eg: log4j
  developer-facing documentation (javadocs)
-<
<
+ Medium-term Plans
->
>
+ Medium-term 
Planned for future HAL 1.x revisions
  Packaging complete experiments
  Windows support
  libraries of:
  Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
-<
<
+ Feature Requests (unprioritized/long-term)
->
>
+ Long-term/Unprioritized 
Feature requests should be initially added here
  (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  (FH) Developers of configurators should be able to swap in new versions of a configurator
  (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  (FH) Submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
-<
<
+ (CN) Ability to actively manage database, including properly cascaded deletion of elements.
-<
<
+ Known Bugs:
->
>
+ Bugs Reports
  (CN) JSC test reliability issue (compared to R)
  (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  (JS) InnoDB SQL errors (CN): fixed 11/05/10

Revision 182010-05-12 - ChrisFawcett

Line: 1 to 1
	Short-term priorities (Pre-CRC/release) Frontend Release-critical
Line: 87 to 87
	Experiments calling experiments, not just external target algs array jobs in SGE Hashing everything, including instances, instance sets and configurations.
Changed:
< <
> >	Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names. Validation of form input. Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
	Feature Requests (unprioritized/long-term) (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
Line: 96 to 99
	(FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it) (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") (FH) Submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
Deleted:
< <	(CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration. (CF) Form validation. (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
	(CN) Ability to actively manage database, including properly cascaded deletion of elements. Known Bugs:

Revision 172010-05-12 - ChrisNell

-<
<
+ Pre-CRC/release tasks 
 UI tasks
->
>
+ Short-term priorities (Pre-CRC/release) 
 Frontend
  Release-critical
-<
<
+ CF N-way performance comparison first-cut for Frank.
 functionality promised in paper
-<
<
+ CF algorithm specification screen: implement
->
>
+ CF algorithm specification screen: implement (includes initial design space specification)
  CF left side of landing page:  task selection/presentation according to pattern concept
-<
<
+ CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements (may require DB changes)
  CF instance specification screen: implement (requires DB change)
  CF Execution environment specification (incl. R, Gnuplot, java locations; requires DB change)
  CN RTDs/per-target-algorithm-run monitoring and navigation
->
>
+ CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements, including experiment and inclubment naming
  CF instance specification screen: implement
  CF Execution environment specification (incl. R, Gnuplot, java locations)
  RTDs/per-target-algorithm-run monitoring and navigation
  design space specification by revision of existing spaces
  Important 
works as-is but end-user experience significantly impacted
  Named configurations, ability to specify name for final incumbent a priori and reference it in subsequent comparison experiments.
-<
<
+ Database tasks
->
>
+ Backend
  Release-critical 
mostly to enable critical UI tasks 
 CN Named instance set table done
  CN Named configuration table done
  CN Execution environment table done
-<
<
+ CN Configuration spaces vs. algorithms; appropriate unique ID hashing; algorithm versions?
->
>
+ CN Split algorithms and configuration spaces, allowing run reuse for common-binary configuration spaces.  Both DB and Java object model.
  CN Explicit representation of problems/encodings, compatability of algs and instances via problem (encodings)
  CN add incumbentname semantic input to (design) procedures
  CN rename objects to match paper terminology
  Important 
mostly to (substantially) improve UI responsiveness
->
>
+ Database schema -- speed-related refactor
  Connection pooling
  Caching analysis results
-<
<
+ Database schema -- speed-related redesign
  Query optimization
-<
<
+ Selective limitation of run-level logging (dynamic based on runtime?)
->
>
+ Selective limitation of run-level archiving (dynamic based on runtime?)
  Nice-to-have 
noticeable mostly to developer-users 
 DataManager API refinement
->
>
+ CF N-way performance comparison first-cut for Frank.
  Stale connection issue; incl. robustness to general network issues
  Read-only DataManager connection for use by individual MA procedures
->
>
+ Allowing relationships (incl. possible run-reuse) between different-binary "builds" of algorithms, including due to bugfixes, additional exposed parameters, etc.  Also for different "versions" (without reuse) corresponding to added funcitonality.
  Ability to quantify membership of configurations to different design spaces
  Code/Robustness/Misc. tasks
  Experiments calling experiments, not just external target algs
  array jobs in SGE
  Hashing everything, including instances, instance sets and configurations.
->
>
-<
<
+ Feature Requests
->
>
+ Feature Requests (unprioritized/long-term)
  (FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances
  (FH) Developers of configurators should be able to swap in new versions of a configurator
  (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator
  (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it)
  (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances")
  (FH) Submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
-<
<
+ (CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.) (CN) need to flesh out how this interacts with named configurations/configuration spaces also under discussion
  (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration.
  (CF) Form validation.
  (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.

Revision 162010-05-11 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 26 to 26
	mostly to enable critical UI tasks CN Named instance set table done CN Named configuration table done
Changed:
< <	CN Execution environment table CN Configuration spaces vs. algorithms; appropriate unique ID hashing
> >	CN Execution environment table done CN Configuration spaces vs. algorithms; appropriate unique ID hashing; algorithm versions?
	Important mostly to (substantially) improve UI responsiveness

Revision 152010-05-11 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 77 to 77
	Git, not CVS Support text-file inputs and outputs for external algorithms Instance features
Added:
> >	Explicit representation of problems (e.g. particular instance formats)
	Experiments calling experiments, not just external target algs array jobs in SGE Hashing everything, including instances, instance sets and configurations.
Line: 97 to 98
	Known Bugs: (CN) JSC test reliability issue (compared to R) (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
Changed:
< <	(JS) InnoDB SQL errors
> >	(JS) InnoDB SQL errors (CN): fixed 11/05/10
	(LX) missing current-time point in solution quality trace, so don't see the final "flat line" (CN) accuracy of mid-run overhead accounting for PILS/GGA (CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument. (CN) does this work with double-quotes instead of single-quotes?

Revision 142010-05-11 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 25 to 25
	Release-critical mostly to enable critical UI tasks CN Named instance set table done
Changed:
< <	CN Named configuration table
> >	CN Named configuration table done
	CN Execution environment table CN Configuration spaces vs. algorithms; appropriate unique ID hashing

Revision 132010-05-10 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 92 to 92
	(CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration. (CF) Form validation. (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
Added:
> >	(CN) Ability to actively manage database, including properly cascaded deletion of elements.
	Known Bugs: (CN) JSC test reliability issue (compared to R)

Revision 122010-05-10 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 88 to 88
	(FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it) (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") (FH) Submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
Changed:
< <	(CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.)
> >	(CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.) (CN) need to flesh out how this interacts with named configurations/configuration spaces also under discussion
	(CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration. (CF) Form validation. (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
Line: 99 to 99
	(JS) InnoDB SQL errors (LX) missing current-time point in solution quality trace, so don't see the final "flat line" (CN) accuracy of mid-run overhead accounting for PILS/GGA
Changed:
< <	(CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument.
> >	(CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument. (CN) does this work with double-quotes instead of single-quotes?
	(JS) FixedConfigurationExperiment UI is outdated, unusable. (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager. (JS) Algorithms with a requirement of a new directory for each run.

Revision 112010-05-05 - FrankHutter

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 82 to 82
	Hashing everything, including instances, instance sets and configurations. Feature Requests
Changed:
< <	(FH) submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
> >	(FH) Support for complete configuration experiment, front to back: run configurator N times on a training set, report the N training and test set performances (FH) Developers of configurators should be able to swap in new versions of a configurator (FH) Configuration scenarios, specifying a complete configuration task including the test set; only missing part being the configurator (FH) Saveable sets of configuration scenarios to perform (use case: I change the configurator and want to evaluate it) (FH) Taking this a step further: support for optimizing a parameterized configurator (configurator is an algorithm, and the above set of experiments is the set of "instances") (FH) Submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
	(CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.) (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration. (CF) Form validation.

Revision 102010-05-05 - ChrisFawcett

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 18 to 18
	data export Error logging/handling/browsing Plotting ex-gnuplot
Added:
> >	Named configurations, ability to specify name for final incumbent a priori and reference it in subsequent comparison experiments.
	Database tasks
Line: 78 to 79
	Instance features Experiments calling experiments, not just external target algs array jobs in SGE
Changed:
< <
> >	Hashing everything, including instances, instance sets and configurations.
	Feature Requests (FH) submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
Changed:
< <
> >	(CF) Distinction made in HAL between parameterised and parameter-less algorithms, including the ability to see the "parent" parameterised algorithm of a given parameter-less algorithm. Other similar queries would be equally useful (all SPEAR configurations, the parameterised algorithm corresponding to a partial instantiation of another parameterised algorithm, etc.) (CF) Ability to browse algorithms, instances, instance sets, configurations, etc. This includes the ability to see things related to the item being browsed. Performance of different algorithms/configurations on a given instance, performance of algorithms across an instance set, performance of a given configuration. (CF) Form validation. (CF) Wider support for working directory requirements of individual algorithm runs, i.e. Concorde's creation of 20 files with fixed names.
	Known Bugs: (CN) JSC test reliability issue (compared to R)
Line: 90 to 94
	(JS) InnoDB SQL errors (LX) missing current-time point in solution quality trace, so don't see the final "flat line" (CN) accuracy of mid-run overhead accounting for PILS/GGA
Added:
> >	(CF) Configuration file callstrings with weird spaces, i.e. "... -param '$val$ blah' ..." where '$val blah' needs to be passed to the target as a single argument. (JS) FixedConfigurationExperiment UI is outdated, unusable. (JS) HAL is not usable on WestGrid. We need a TorqueClusterExecutionManager. (JS) Algorithms with a requirement of a new directory for each run.

Revision 92010-05-02 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 23 to 23
	Database tasks Release-critical mostly to enable critical UI tasks
Changed:
< <	CN Named instance set table
> >	CN Named instance set table done
	CN Named configuration table CN Execution environment table CN Configuration spaces vs. algorithms; appropriate unique ID hashing

Revision 82010-04-22 - ChrisFawcett

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Added:
> >	CF N-way performance comparison first-cut for Frank.
	functionality promised in paper CF algorithm specification screen: implement CF left side of landing page: task selection/presentation according to pattern concept

Revision 72010-04-22 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Release-critical
Line: 88 to 88
	(CN) end-of-experiment hanging bug (GGA, multinode cluster runs) (JS) InnoDB SQL errors (LX) missing current-time point in solution quality trace, so don't see the final "flat line"
Added:
> >	(CN) accuracy of mid-run overhead accounting for PILS/GGA

Revision 62010-04-21 - ChrisNell

Line: 1 to 1
	Pre-CRC/release tasks UI tasks
Changed:
< <	Completely replace current Hal/HalServer.java servlets with better-designed modular implementation left side of landing page: task selection drill-down to fit pattern concept algorithm specification screen: implement instance specification screen: implement experiment specification and monitor screens from a pattern template, and procedure-specific requirements RTDs/per-target-algorithm-run monitoring and navigation
> >	Release-critical functionality promised in paper CF algorithm specification screen: implement CF left side of landing page: task selection/presentation according to pattern concept CF experiment specification and monitor screens from a pattern template, and procedure-specific requirements (may require DB changes) CF instance specification screen: implement (requires DB change) CF Execution environment specification (incl. R, Gnuplot, java locations; requires DB change) CN RTDs/per-target-algorithm-run monitoring and navigation Important works as-is but end-user experience significantly impacted
	Data management interface: deleting runs/expts/etc. data export
Deleted:
< <	Execution environment specification: R, Gnuplot, java locations DB authentication/path -- .properties file?
	Error logging/handling/browsing Plotting ex-gnuplot
Line: 16 to 18
	Error logging/handling/browsing Plotting ex-gnuplot
Added:
> >
	Database tasks
Added:
> >	Release-critical mostly to enable critical UI tasks CN Named instance set table CN Named configuration table CN Execution environment table CN Configuration spaces vs. algorithms; appropriate unique ID hashing Important mostly to (substantially) improve UI responsiveness
	Connection pooling Caching analysis results
Deleted:
< <	DataManager API redesign
	Database schema -- speed-related redesign Query optimization
Deleted:
< <	Stale connection issue; incl. robustness to general network issues
	Selective limitation of run-level logging (dynamic based on runtime?)
Changed:
< <	Named configurations Named instance sets Instance features Execution environments Configuration spaces vs. algorithms; appropriate unique ID hashing
> >	Nice-to-have noticeable mostly to developer-users DataManager API refinement Stale connection issue; incl. robustness to general network issues
	Read-only DataManager connection for use by individual MA procedures
Added:
> >
	Code/Robustness/Misc. tasks
Added:
> >	Important user-facing documentation (help)
	Better logging/error-reporting (to console/within HAL). eg: log4j Better handling of overhead runtime vs. target algorithm runtime WAY more unittests; also functional/integration tests
Changed:
< <	array jobs in SGE user-facing documentation (help)
> >	Nice-to-have
	developer-facing documentation (javadocs)
Deleted:
< <	Experiments calling experiments, not just external target algs
	Medium-term Plans
Line: 62 to 74
	support instance generators Git, not CVS Support text-file inputs and outputs for external algorithms
Added:
> >	Instance features Experiments calling experiments, not just external target algs array jobs in SGE
	Feature Requests

Revision 52010-04-20 - HolgerHoos

Line: 1 to 1
	Pre-CRC/release tasks UI tasks Completely replace current Hal/HalServer.java servlets with better-designed modular implementation left side of landing page: task selection drill-down to fit pattern concept algorithm specification screen: implement
Changed:
< <	instance specificaiton screen: implement experiment specification and monitor screens from a pattern template, and proceure-specific requirements RTDs/per-target-algorithm-run monitoring and navigaiton
> >	instance specification screen: implement experiment specification and monitor screens from a pattern template, and procedure-specific requirements RTDs/per-target-algorithm-run monitoring and navigation
	Data management interface: deleting runs/expts/etc. data export
Line: 17 to 17
	Plotting ex-gnuplot Database tasks
Changed:
< <	Conneciton pooling
> >	Connection pooling
	Caching analysis results DataManager API redesign Database schema -- speed-related redesign
Line: 65 to 65
	Feature Requests
Changed:
< <	(FH) submitting runs from a machine that is itelf a cluster submit hos should not need to go through SSH
> >	(FH) submitting runs from a machine that is itelf a cluster submit host should not need to go through SSH
	Known Bugs:

Revision 42010-04-20 - ChrisNell

-<
<
+Usage notes/observations/etc for HAL 1.0.  To be considered post-paper.
->
>
+ Pre-CRC/release tasks 
 UI tasks 
 
 Completely replace current Hal/HalServer.java servlets with better-designed modular implementation
  left side of landing page:  task selection drill-down to fit pattern concept
  algorithm specification screen: implement
  instance specificaiton screen: implement
  experiment specification and monitor screens from a pattern template, and proceure-specific requirements
  RTDs/per-target-algorithm-run monitoring and navigaiton
  Data management interface: 
 deleting runs/expts/etc.
  data export
 
  Execution environment specification: 
 R, Gnuplot, java locations
  DB authentication/path -- .properties file?
 
  Error logging/handling/browsing
  Plotting ex-gnuplot
-<
<
+TODO: 
 dealing with runs that report 0.00s -- PILS doesn't progress if this happens
  improve cluster "niceness" -- if many subruns are spawned, HAL can take over all cpu's on a node
  remove POSIX requirements
  "tagging" of configurations, instances
  investigate PILS thinking it performed fewer runs than are committed to the DB.
->
>
+ Database tasks 
 
 Conneciton pooling
  Caching analysis results
  DataManager API redesign
  Database schema -- speed-related redesign
  Query optimization
  Stale connection issue; incl. robustness to general network issues
  Selective limitation of run-level logging (dynamic based on runtime?)
  Named configurations
  Named instance sets
  Instance features
  Execution environments
  Configuration spaces vs. algorithms; appropriate unique ID hashing
  Read-only DataManager connection for use by individual MA procedures
 

 Code/Robustness/Misc. tasks 
 
 Better logging/error-reporting (to console/within HAL).  eg: log4j
  Better handling of overhead runtime vs. target algorithm runtime
  WAY more unittests; also functional/integration tests
  array jobs in SGE
  user-facing documentation (help)
  developer-facing documentation (javadocs)
  Experiments calling experiments, not just external target algs
 


 Medium-term Plans 
 
 Packaging complete experiments
  Windows support
  libraries of: 
 search/optimization procedures
  machine learning tools
 
  multi-algorithm comparisons
  scaling analyses
  bootstrapped analyses
  robustness analyses
  parameter response analyses
  SATzilla in HAL
  ParamILS in HAL
  Parallel portfolios in HAL
  ActiveConfigurator in HAL
  Iterated F-Race in HAL
  chained-procedure experiments
  support for optimization/Monte-Carlo experiments
  support instance generators
  Git, not CVS
  Support text-file inputs and outputs for external algorithms
 


 Feature Requests 
 
 (FH) submitting runs from a machine that is itelf a cluster submit hos should not need to go through SSH
 


 Known Bugs: 
 
 (CN) JSC test reliability issue (compared to R)
  (CN) end-of-experiment hanging bug (GGA, multinode cluster runs)
  (JS) InnoDB SQL errors
  (LX) missing current-time point in solution quality trace, so don't see the final "flat line"

Revision 32010-03-31 - ChrisNell

Line: 1 to 1
	Usage notes/observations/etc for HAL 1.0. To be considered post-paper. TODO:
Changed:
< <	dealing with runs that report 0.00s
> >	dealing with runs that report 0.00s -- PILS doesn't progress if this happens improve cluster "niceness" -- if many subruns are spawned, HAL can take over all cpu's on a node
	remove POSIX requirements "tagging" of configurations, instances investigate PILS thinking it performed fewer runs than are committed to the DB.

Revision 22010-03-31 - ChrisNell

-<
<
+Notes on adapting Iterated F-Race (IFR) to the HAL framework.
This "diary" is intended to: 
 provide fodder for eventual better documentation
  serve as a record for improving the process described therein
->
>
+Usage notes/observations/etc for HAL 1.0.  To be considered post-paper.
-<
<
+ Terminology 

In HAL: 

 An Algorithm instance is an arbitrary code object which implements the HAL algorithm API, so that its inputs and outputs are well-defined and accessed in a standardized manner.
  A Parameter is fixed if its value should not be adjusted to improve performance, and free otherwise.
  A Pattern is an Algorithm which accepts a particular set of standard fixed input parameters in addition to an arbitrary set of free, "configurable" parameters.  Similarly, a pattern's output includes a set of standard parameters.  The fixed inputs include a problem instance, a random seed, a maximum runtime, a maximum runlength, and a  A Pattern instance can accept additional, nonstandard inputs, but these must have default values so that if left unspecified the algorithm will still run.
  A Scenario specifies conditions under which Patterns are run, including training and test instances, execution budgets, and overall performance objectives.
  A Configurator is an Algorithm which accepts Patterns and Scenarios, and outputs parameter instantiations for the Pattern.  In practice a Configurator will most likely attempt to return the best-evaluated parameter instance.
 


 Initial Considerations 
Iterated F-Race is presented as a set of R functions which are called via a set of Linux shell scripts.  HAL has a very basic (vanilla) F-Race component which will be useful as a starting point for both of the following tasks:

Tasks which need to be completed: 

 Write a .json configuration file for IFR, so that it can be loaded into HAL as a generic ExternalAlgorithm.  This wrapper should preserve the naming conventions/etc. used by the algorithm's authors.
  Write a WrappedConfigurator subclass for IFR, which enables HAL to use the IFR ExternalAlgorithm as a configurator on arbitrary Pattern instances.
 

Each of these is approached in turn.

 1. Configuration File 
IFR is "natively" configured to work with a particular algorithm through editing of a BASH script file, tune.sh.  This script is used to provide a sequence of commands to an R interpreter, and itself takes no command-line arguments.  An example script follows (apologies for the crazy width):

R --no-save --no-restore --slave<<EOF
source("race.R")
source("hrace.R")
source("eval.R")

# doesn't matter descriptions
experiment.name<-"Iterative F-race for Tuning ACOTSP"
extra.description<-"F-RACE applied to ACOTSP"

# excutable initials, usually the rest of the command lines is followed by "--parameter_name parameter_value"
executable<-"../ACOTSP.V1.0/acotsp --tries 1 --time 20 "

# instance directory for tuning and testing
instance.dir<-"../../../../Instances"
test.instance.dir<-"../../../../TestInstances"

# tuning budget in number of evaluations
maxAllotedExperiments = 6000

# type "r" means continuous parameters (real numbers), "i" means integer parameters, "c" means categorical parameters, "m" also means categorical parameters which is called in command lines by "--parameter_value", e.g. "--mmas"; while the usual parameters are called in command lines using the format "--parameter_name parameter_value". 
parameter.type.list<-list(alpha="r",beta="r",rho="r",ants="i",nnants="i", nnls="i", q0="r", localsearch="c", dlb="c", mode="m", rasrank="i", elitistants="i")

# boundary inclusive for continuous or integer parameters. for categorical parameters this simply lists all the possible levels. 
parameter.boundary.list<-list(alpha=c(0.01,5.0), beta=c(0.0,10.0), rho=c(0.0001, 1.0), ants=c(1,100), nnants=c(5,100), nnls=c(5, 100),  q0=c(0.0,1.0), localsearch=c(0,1,2,3), dlb=c(0,1), mode=c("mmas", "acs", "ras", "eas", "as"), rasrank=c(1,10), elitistants=c(1,750))

# the conditional parameters
parameter.subsidiary.list<-list(q0=list(mode=c("acs")), rasrank=list(mode=c("ras")), elitistants=list(mode=c("eas")), nnls=list(localsearch=c(1,2,3)), dlb=list(localsearch=c(1,2,3)))

# in case the parameter names differ from what we give above, usually leave it as empty
parameter.name.list<-list()

# wrapper file for racing
wrapper.file="race-wrapper.R"

result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments,parameter.type.list=parameter.type.list,parameter.boundary.list=parameter.boundary.list,
experiment.name=experiment.name,extra.description=extra.description,executable=executable,instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)

# to perform the tuned parameter on the testing instances
eval(result=result, executable=executable, test.instance.dir=test.instance.dir)


The target algorithm-specific parameters (R variables) set in this file are: 

 experiment.name (string)
  extra.description (string)
  executable (path)
  instance.dir (path)
  test.instance.dir (path)
  maxAllotedExperiments (integer)
  parameter.type.list (list mapping string names to characters in "ricm" indicating domains)
  parameter.boundary.list (list mapping above names to either endpoints or categorical values, further specifying domains)
  parameter.subsidiary.list (conditionals; list mapping above names to another mapping of above names to conditional values)
  parameter.name.list (optional, list of parameter display names if they differ from the ones used above)
 
Note that all of these parameters (except possibly maxAllotedExperiments) are fixed.
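
As a rough illustration of how the conditionals in parameter.subsidiary.list might be interpreted, the sketch below treats a parameter as active only when each parent it depends on takes one of the listed values. This reading of the semantics is an assumption, and the helper name active_parameters is hypothetical (it is not part of IFR or HAL).

```python
# Hypothetical mirror of parameter.subsidiary.list from the example script.
SUBSIDIARY = {
    "q0": {"mode": ["acs"]},
    "rasrank": {"mode": ["ras"]},
    "elitistants": {"mode": ["eas"]},
    "nnls": {"localsearch": [1, 2, 3]},
    "dlb": {"localsearch": [1, 2, 3]},
}


def active_parameters(assignment, subsidiary):
    """Return the names in `assignment` whose conditions (if any) all hold."""
    active = []
    for name in assignment:
        conditions = subsidiary.get(name, {})
        # A parameter without conditions is always active (all([]) is True).
        if all(assignment.get(parent) in allowed
               for parent, allowed in conditions.items()):
            active.append(name)
    return active


assignment = {"alpha": 1.0, "mode": "acs", "localsearch": 0,
              "q0": 0.9, "rasrank": 6, "nnls": 20}
print(active_parameters(assignment, SUBSIDIARY))
```

With mode set to "acs" and localsearch set to 0, only q0 among the conditional parameters stays active; rasrank and nnls are dropped.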

While it would be possible to have HAL control R directly (circumventing this script), for the time being we will simply configure HAL to generate tune.sh scripts for arbitrary target Patterns.
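
One possible way to generate such a script is plain placeholder substitution. The sketch below is not HAL's actual mechanism; it assumes placeholders of the form $name where names may contain dots (as in $parameter.type.list), and the names DottedTemplate, SCRIPT_LINES and render_script are made up for illustration. Only a few lines of the script are shown.

```python
from string import Template


class DottedTemplate(Template):
    # Allow dots inside placeholder names, e.g. $parameter.type.list.
    idpattern = r"[A-Za-z](?:[A-Za-z0-9.]*[A-Za-z0-9])?"


# Abbreviated script body; the full tune.sh has more lines.
SCRIPT_LINES = [
    "R --no-save --no-restore --slave<<EOF",
    "executable<-'$executable'",
    "maxAllotedExperiments = $maxAllotedExperiments",
    "parameter.type.list<-$parameter.type.list",
    "EOF",
]


def render_script(values):
    """Fill every $placeholder; raises KeyError if a value is missing."""
    return DottedTemplate("\n".join(SCRIPT_LINES)).substitute(values)


script = render_script({
    "executable": "../ACOTSP.V1.0/acotsp --tries 1 --time 20 ",
    "maxAllotedExperiments": 6000,
    "parameter.type.list": 'list(alpha="r",ants="i")',
})
print(script)
```

Using substitute rather than safe_substitute means a forgotten input fails loudly instead of leaving a literal $name in the generated script.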

Note that the parameter.*.list parameters can be viewed as strings of a particular format.  The exact realization of this string depends on the target algorithm, and as such is generated in the ExternalConfigurator subclass.  However, as a sanity check we can enforce that the format of the string is roughly valid using regular expressions; while this is not strictly necessary it is done below.
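
To make the sanity check concrete, here is a small sketch that renders a parameter.type.list string from a Python mapping and tests it against the type-list pattern used in the configuration below. The function names r_type_list and looks_valid are hypothetical, and using a full match (rather than a search) is an assumption about how the domain check is applied.

```python
import re

# Same pattern as the parameter.type.list domain in the configuration.
TYPE_LIST_RE = re.compile(
    r"list\([^\s=]+=[^\s=,]+(?:,\s*[^\s=,]+=[^\s=,]+)*\)"
)


def r_type_list(types):
    """Render {'alpha': 'r', ...} as the R expression list(alpha="r",...)."""
    items = ",".join(f'{name}="{code}"' for name, code in types.items())
    return f"list({items})"


def looks_valid(text):
    """Rough format check only; it does not verify that the R code is sound."""
    return TYPE_LIST_RE.fullmatch(text) is not None


rendered = r_type_list({"alpha": "r", "ants": "i", "mode": "m"})
print(rendered, looks_valid(rendered))
```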

The outputs of the iterated F-Race algorithm are the same as those of the original F-Race; thus, they can be carried over from that algorithm's .json file. The only difference is that here the outputs are printed to STDOUT rather than dumped into a file. The JSON-format configuration is then:
{
    "path" : "../../../../ifrace/TUNE",
    "command" : "bash",
    "deterministic": false,
    "inputFormat": {
        "callstring": "$bashscript",
        "$bashscript": [
            "R --no-save --no-restore --slave<<EOF",
            "source('race.R')",
            "source('hrace.R')",
            "source('eval.R')",
            "experiment.name<-'Iterative F-race run by HAL'",
            "extra.description<-'F-RACE run by HAL'",
            "executable<-'$executable'",
            "instance.dir<-'$instance.dir'",
            "test.instance.dir<-'$test.instance.dir'",
            "maxAllotedExperiments = $maxAllotedExperiments",
            "parameter.type.list<-$parameter.type.list",
            "parameter.boundary.list<-$parameter.boundary.list",
            "parameter.subsidiary.list<-$parameter.subsidiary.list",
            "parameter.name.list<-$parameter.name.list",
            "wrapper.file='race-wrapper.R'",
            "result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments, parameter.type.list=parameter.type.list, parameter.boundary.list=parameter.boundary.list, experiment.name=experiment.name, extra.description=extra.description, executable=executable, instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)",
            "eval(result=result, executable=executable, test.instance.dir=test.instance.dir)",
            "EOF"
        ]
    },
    "inputs": {
        "bashscript": {"domain": "String()", "properties":{"fixed":1}},
        "experiment.name": {"domain": "String()", "properties":{"fixed":1}},
        "extra.description": {"domain": "String()", "properties":{"fixed":1}},
        "executable": {"domain": "String()", "properties":{"fixed":1}},
        "instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "test.instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "maxAllotedExperiments": {"domain": "Integer(0, None)", "properties":{"fixed":1}},
        "parameter.type.list": {"domain": "String('list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)')", "properties":{"fixed":1}},
        "parameter.boundary.list": {"domain": "String('list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)')", "properties":{"fixed":1}},
        "parameter.subsidiary.list": {"domain": "String('list\\([^\\s=]+=list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)(?:,\\s*[^\\s=]+=list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\))*\\)')", "properties":{"fixed":1}},
        "parameter.name.list": {"domain": "String('list\\([^\\s=]+=[^\\s=,]?(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)')", "properties":{"fixed":1}}
    },
    "outputFormat": {
        "stdout": [{"^\\|([x=-])\\|\\s*([0-9]+)\\|\\s*([0-9]+)\\|\\s*([0-9]+)\\|\\s*([0-9]+(?:\\.[0-9]+)?)\\|\\s*([0-9]+)\\|":
                    ["marker", "task", "alive", "best", "meanbest", "nruns"]},
 
                   {"Description of the selected candidate:\s}
        ]
    },
    "outputs": {
        "marker": ["x", "-", "="],        
        "task": "Integer(0, None)",
        "alive": "Integer(0, None)",
        "best": "Integer(0, None)",
        "meanbest": "Real()",
        "nruns": "Integer(0, None)"
    }
}
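
To show what the stdout pattern in this configuration captures, the sketch below applies it to a single progress row and converts the captured groups according to the declared output domains. The sample row is constructed by hand to match the pattern and is not taken from a real IFR run; parse_row is a hypothetical helper, not HAL's output-mapping code.

```python
import re

# Progress-row pattern from the "outputFormat" section (decimal point escaped).
ROW_RE = re.compile(
    r"^\|([x=-])\|\s*([0-9]+)\|\s*([0-9]+)\|\s*([0-9]+)"
    r"\|\s*([0-9]+(?:\.[0-9]+)?)\|\s*([0-9]+)\|"
)
FIELDS = ("marker", "task", "alive", "best", "meanbest", "nruns")


def parse_row(line):
    """Return a dict of typed outputs, or None if the line is not a row."""
    match = ROW_RE.match(line)
    if match is None:
        return None
    row = dict(zip(FIELDS, match.groups()))
    for key in ("task", "alive", "best", "nruns"):
        row[key] = int(row[key])
    row["meanbest"] = float(row["meanbest"])
    return row


# Hand-made sample row (assumption: real rows follow this layout).
sample = "|x|   1|  10|   3|  0.25|  10|"
print(parse_row(sample))
print(parse_row("Description of the selected candidate:"))
```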


 2. WrappedConfigurator subclass 

This is where the bulk of the work of integrating a new configurator with HAL occurs.  The WrappedConfigurator subclass defines the logic required to:
 

 Configure the ExternalAlgorithm's parameters appropriately given a Pattern and a Scenario.
  Moderate the ExternalAlgorithm's calls to the target Pattern.
  Interpret the ExternalAlgorithm's output.
 

There are three methods of WrappedConfigurator which will likely need to be overridden: 

 WrappedConfigurator._setupEnvironment(), for input manipulation and general environment setup
  WrappedConfigurator._cleanupEnvironment(), for post-execution cleanup
  WrappedConfigurator._mapOutput(.), to interpret the raw algorithm output 
 

Additionally, an ExecutionServer subclass will need to be created, which will intercept the external algorithm's attempts to call the Pattern, execute said Pattern appropriately, and return an appropriately formatted result.
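
The division of labour between the three overridable methods can be sketched as follows. Everything here is a stand-in: WrappedConfiguratorStub only imitates the shape of HAL's WrappedConfigurator, and the method signatures, the run() driver, and the IFRConfiguratorSketch class are assumptions made for illustration. The ExecutionServer part is not covered.

```python
import os
import shutil
import tempfile


class WrappedConfiguratorStub:
    """Stand-in base class; HAL's real WrappedConfigurator may differ."""

    def run(self, raw_output):
        # Assumed lifecycle: set up, interpret output, always clean up.
        self._setupEnvironment()
        try:
            return self._mapOutput(raw_output)
        finally:
            self._cleanupEnvironment()

    def _setupEnvironment(self):
        pass

    def _cleanupEnvironment(self):
        pass

    def _mapOutput(self, raw_output):
        raise NotImplementedError


class IFRConfiguratorSketch(WrappedConfiguratorStub):
    def __init__(self, script_text):
        self.script_text = script_text
        self.workdir = None

    def _setupEnvironment(self):
        # Write the generated tune.sh into a scratch directory.
        self.workdir = tempfile.mkdtemp(prefix="ifr_")
        with open(os.path.join(self.workdir, "tune.sh"), "w") as handle:
            handle.write(self.script_text)

    def _cleanupEnvironment(self):
        if self.workdir is not None:
            shutil.rmtree(self.workdir, ignore_errors=True)
            self.workdir = None

    def _mapOutput(self, raw_output):
        # Keep only table rows; real mapping would parse them further.
        return [ln for ln in raw_output.splitlines() if ln.startswith("|")]


sketch = IFRConfiguratorSketch("R --no-save --no-restore --slave<<EOF\nEOF\n")
rows = sketch.run("|x|1|\nnoise\n|-|2|")
print(rows)
```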



-- ChrisNell - 08 Sep 2009
->
>
+TODO: 
 dealing with runs that report 0.00s
  remove POSIX requirements
  "tagging" of configurations, instances
  investigate PILS thinking it performed fewer runs than are committed to the DB.

Revision 1 2009-09-08 - ChrisNell

Line: 1 to 1

Added:

>
>

Notes on adapting Iterated F-Race (IFR) to the HAL framework. This "diary" is intended to:

provide fodder for eventual better documentation
serve as a record for improving the process described therein

Terminology

In HAL:

An Algorithm instance is an arbitrary code object which implements the HAL algorithm API, so that its inputs and outputs are well-defined and accessed in a standardized manner.
A Parameter is fixed if its value should not be adjusted to improve performance, and free otherwise.
A Pattern is an Algorithm which accepts a particular set of standard fixed input parameters in addition to an arbitrary set of free, "configurable" parameters. Similarly, a pattern's output includes a set of standard parameters. The fixed inputs include a problem instance, a random seed, a maximum runtime, a maximum runlength, and a A Pattern instance can accept additional, nonstandard inputs, but these must have default values so that if left unspecified the algorithm will still run.
A Scenario specifies conditions under which Patterns are run, including training and test instances, execution budgets, and overall performance objectives.
A Configurator is an Algorithm which accepts Patterns and Scenarios, and outputs parameter instantiations for the Pattern. In practice a Configurator will most likely attempt to return the best-evaluated parameter instance.

Initial Considerations

Iterated F-Race is presented as a set of R functions which are called via a set of Linux shell scripts. HAL has a very basic (vanilla) F-Race component which will be useful as a starting point for both of the following tasks:

Tasks which need to be completed:

Write a .json configuration file for IFR, so that it can be loaded into HAL as a generic ExternalAlgorithm. This wrapper should preserve the naming conventions, etc. used by the algorithm's authors.
Write a WrappedConfigurator subclass for IFR, which enables HAL to use the IFR ExternalAlgorithm as a configurator on arbitrary Pattern instances.

Each of these is approached in turn.

1. Configuration File

IFR is "natively" configured to work with a particular algorithm through editing of a BASH script file, tune.sh. This script is used to provide a sequence of commands to an R interpreter, and itself takes no command-line arguments. An example script is (apologies for crazy width...).

R --no-save --no-restore --slave<<EOF
source("race.R")
source("hrace.R")
source("eval.R")

# doesn't matter descriptions
experiment.name<-"Iterative F-race for Tuning ACOTSP"
extra.description<-"F-RACE applied to ACOTSP"

# excutable initials, usually the rest of the command lines is followed by "--parameter_name parameter_value"
executable<-"../ACOTSP.V1.0/acotsp --tries 1 --time 20 "

# instance directory for tuning and testing
instance.dir<-"../../../../Instances"
test.instance.dir<-"../../../../TestInstances"

# tuning budget in number of evaluations
maxAllotedExperiments = 6000

# type "r" means continuous parameters (real numbers), "i" means integer parameters, "c" means categorical parameters, "m" also means categorical parameters which is called in command lines by "--parameter_value", e.g. "--mmas"; while the usual parameters are called in command lines using the format "--parameter_name parameter_value". 
parameter.type.list<-list(alpha="r",beta="r",rho="r",ants="i",nnants="i", nnls="i", q0="r", localsearch="c", dlb="c", mode="m", rasrank="i", elitistants="i")

# boundary inclusive for continuous or integer parameters. for categorical parameters this simply lists all the possible levels. 
parameter.boundary.list<-list(alpha=c(0.01,5.0), beta=c(0.0,10.0), rho=c(0.0001, 1.0), ants=c(1,100), nnants=c(5,100), nnls=c(5, 100),  q0=c(0.0,1.0), localsearch=c(0,1,2,3), dlb=c(0,1), mode=c("mmas", "acs", "ras", "eas", "as"), rasrank=c(1,10), elitistants=c(1,750))

# the conditional parameters
parameter.subsidiary.list<-list(q0=list(mode=c("acs")), rasrank=list(mode=c("ras")), elitistants=list(mode=c("eas")), nnls=list(localsearch=c(1,2,3)), dlb=list(localsearch=c(1,2,3)))

# in case the parameter names differ from what we give above, usually leave it as empty
parameter.name.list<-list()

# wrapper file for racing
wrapper.file="race-wrapper.R"

result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments,parameter.type.list=parameter.type.list,parameter.boundary.list=parameter.boundary.list,
experiment.name=experiment.name,extra.description=extra.description,executable=executable,instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)

# to perform the tuned parameter on the testing instances
eval(result=result, executable=executable, test.instance.dir=test.instance.dir)

The target algorithm-specific parameters (R variables) set in this file are:

experiment.name (string)
extra.description (string)
executable (path)
instance.dir (path)
test.instance.dir (path)
maxAllotedExperiments (integer)
parameter.type.list (list mapping string names to characters in "ricm" indicating domains)
parameter.boundary.list (list mapping above names to either endpoints or categorical values, further specifying domains)
parameter.subsidiary.list (conditionals; list mapping above names to another mapping of above names to conditional values)
parameter.name.list (optional, list of parameter display names if they differ from the ones used above)

Note that all of these parameters (except possibly maxAllotedExperiments) are fixed.

While it would be possible to have HAL control R directly (circumventing this script), for the time being we will simply configure HAL to generate tune.sh scripts for arbitrary target Patterns.

Note that the parameter.*.list parameters can be viewed as strings of a particular format. The exact realization of this string depends on the target algorithm, and as such is generated in the ExternalConfigurator subclass. However, as a sanity check we can enforce that the format of the string is roughly valid using regular expressions; while this is not strictly necessary it is done below.

The outputs of the iterated F-Race algorithm are the same as those of the original F-Race; thus, they can be carried over from that algorithm's .json file. The only difference is that here the outputs are printed to STDOUT rather than dumped into a file. The JSON-format configuration is then:

{
    "path" : "../../../../ifrace/TUNE",
    "command" : "bash",
    "deterministic": false,
    "inputFormat": {
        "callstring": "$bashscript",
        "$bashscript": [
            "R --no-save --no-restore --slave<<EOF",
            "source('race.R')",
            "source('hrace.R')",
            "source('eval.R')",
            "experiment.name<-'Iterative F-race run by HAL'",
            "extra.description<-'F-RACE run by HAL'",
            "executable<-'$executable'",
            "instance.dir<-'$instance.dir'",
            "test.instance.dir<-'$test.instance.dir'",
            "maxAllotedExperiments = $maxAllotedExperiments",
            "parameter.type.list<-$parameter.type.list",
            "parameter.boundary.list<-$parameter.boundary.list",
            "parameter.subsidiary.list<-$parameter.subsidiary.list",
            "parameter.name.list<-$parameter.name.list",
            "wrapper.file='race-wrapper.R'",
            "result=hrace.wrapper(maxAllotedExperiments=maxAllotedExperiments, parameter.type.list=parameter.type.list, parameter.boundary.list=parameter.boundary.list, experiment.name=experiment.name, extra.description=extra.description, executable=executable, instance.dir=instance.dir, test.instance.dir, parameter.subsidiary.list=parameter.subsidiary.list, parameter.name.list=parameter.name.list, wrapper.file=wrapper.file)",
            "eval(result=result, executable=executable, test.instance.dir=test.instance.dir)",
            "EOF"
        ]
    },
    "inputs": {
        "bashscript": {"domain": "String()", "properties":{"fixed":1}},
        "experiment.name": {"domain": "String()", "properties":{"fixed":1}},
        "extra.description": {"domain": "String()", "properties":{"fixed":1}},
        "executable": {"domain": "String()", "properties":{"fixed":1}},
        "instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "test.instance.dir": {"domain": "String()", "properties":{"fixed":1}},
        "maxAllotedExperiments": {"domain": "Integer(0, None)", "properties":{"fixed":1}},
        "parameter.type.list": {"domain": "String('list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)')", "properties":{"fixed":1}},
        "parameter.boundary.list": {"domain": "String('list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)')", "properties":{"fixed":1}},
        "parameter.subsidiary.list": {"domain": "String('list\\([^\\s=]+=list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)(?:,\\s*[^\\s=]+=list\\([^\\s=]+=[^\\s=,]+(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\))*\\)')", "properties":{"fixed":1}},
        "parameter.name.list": {"domain": "String('list\\([^\\s=]+=[^\\s=,]?(?:,\\s*[^\\s=,]+=[^\\s=,]+)*\\)')", "properties":{"fixed":1}}
    },
    "outputFormat": {
        "stdout": [{"^\\|([x=-])\\|\\s*([0-9]+)\\|\\s*([0-9]+)\\|\\s*([0-9]+)\\|\\s*([0-9]+(?:\\.[0-9]+)?)\\|\\s*([0-9]+)\\|":
                    ["marker", "task", "alive", "best", "meanbest", "nruns"]},
 
                   {"Description of the selected candidate:\s}
        ]
    },
    "outputs": {
        "marker": ["x", "-", "="],        
        "task": "Integer(0, None)",
        "alive": "Integer(0, None)",
        "best": "Integer(0, None)",
        "meanbest": "Real()",
        "nruns": "Integer(0, None)"
    }
}

2. WrappedConfigurator subclass

This is where the bulk of the work of integrating a new configurator with HAL occurs. The WrappedConfigurator subclass defines the logic required to:

Configure the ExternalAlgorithm's parameters appropriately given a Pattern and a Scenario.
Moderate the ExternalAlgorithm's calls to the target Pattern.
Interpret the ExternalAlgorithm's output.

There are three methods of WrappedConfigurator which will likely need to be overridden:

WrappedConfigurator._setupEnvironment(), for input manipulation and general environment setup
WrappedConfigurator._cleanupEnvironment(), for post-execution cleanup
WrappedConfigurator._mapOutput(.), to interpret the raw algorithm output

Additionally, an ExecutionServer subclass will need to be created, which will intercept the external algorithm's attempts to call the Pattern, execute said Pattern appropriately, and return an appropriately formatted result.

-- ChrisNell - 08 Sep 2009

