
Since moving cylc to github, I'm using this page to keep track of 
half-formed and relatively insignificant ideas, and issues that require
further investigation before concluding there is a problem ... at which
point the issue can be migrated to the github issue tracker.
_______
GENERAL
 
 + with a ctrl-Z stopped suite present, gcylc takes the port scan
 timeout time to start up. We should be able to show the window *before*
 the updater thread does a port scan.

 + Python dict.get( key, default ) never returns KeyError, and seems to
 be valid right back to Python 2.4 and earlier.

 + cylc command tab completion?  see /etc/bash_completion.d/git.sh !

 + cycling modules: for clarity, instead of initial_adjust_up adjusting
only if necessary, have it always adjust up, but only call it if T is
not valid (i.e. not on-sequence for the cycler).

 + Cycle times are still passed around as strings and temporarily
 converted to cycle time objects whenever time manipulation
 is required - we should move entirely to objects USING THE NEW
 taskid.py CLASSES.
 + USE 'a:INT' INTERNALLY FOR ASYNC TAGS TO MAKE CHECKING FOR TAG
VALIDITY EASIER.
 + To complete the extension to YYYYMMDDHHmmss (i.e. with mmss) check
 for any hardwired assumptions about ctime string length (e.g. in suite 
 log filtering - command and GUI).

* bad graph syntax crashes cylc graph (e.g. "color:blue" for
  "color=blue").

* can validation be made to detect circular dependencies. E.g. 
if foo is a member of FAM, then "FAM => foo" means foo will not run
because it depends on itself.  This could be detected as a
"self-prerequisite", but "FAM => bar => foo" would be trickier.

* remote tasks using initial scripting to set PATH to cylc - could the
default use of CYLC_DIR cause problems for this?

* (from Matt Shin) event handlers should read the "MESSAGE" from stdin
rather than as a command line argument. This would allow multi-line 
messages such as remote error tracebacks to be sent to the handler.

* Replace cumbersome checks for dict key validity throughout the code.
  This:
    if key not in foo:
        foo[key] = 0
    res = foo[key]
  is equivalent to this:
    res = foo.get(key,0)
  Likewise, this:
    if key in foo:
        foo[key].append(bar)
    else:
        foo[key] = [bar]
  is equivalent to:
    foo.setdefault(key,[]).append(bar)
  The defaultdict can be used similarly (i.e. without checking for 
  existence of an item first) but only since Python 2.5.

* We need to be able to distinguish cycle time and async tags
  throughout, by using distinct derived classes or preprending 
  markers such as: c:YYYYMMDDHH and a:1. Currently both are 
  just treated as integers in certain comparisons, e.g. can
  set a stop time of '1' in a cycling suite, and the suite will
  stop as soon as all running tasks have finished.

* Consider changing the task pool list to a dictionary indexed by task
  ID. This would avoid having to loop through the pool to find a task.
* Also, consider a major change to the scheduling algorithm that would
  fit quite nicely with a dict-based task pool: if dict values contain 
  a list of *known trigger tasks* determined from the dependency graph
  each task could just ask those tasks, rather than all of them (via the
  broker) for satisfaction. We could then eliminate the broker, and
  may significantly increase efficiency for very large suites?

* a 'nudge' that just updates task state info, without going through
  the full task processing loop?

* warm and cold start should insert tasks beyond the optional
  final task in the 'held' state, not just delete them? (see 'STOPPING'
  in bin/_run

* Consider replacing finished tasks with a data structure that remembers
  what happened. More efficient for very large suites?

* exclude inserted spinup/temporary tasks from runahead check?
  (or manually alter the check - this should be fine because the
  catching up spinup task will be all alone out behind the 
  caught-up operational tasks - so the suite won't get cluttered
  with finished tasks as a result of the long runahead limit).

* GUI: check for suite blocking before popping up entry dialogs?

* method arg checking in remote switch: just use generic exception
  handling for efficiency, leaving specific arg format checking to
  the commands?

* stop-at-cycle-time won't shutdown if there are any tasks stuck in
  'waiting', even if they are all passed the stop cycle (still true?).

* update cylc xdot code from latest version available? (see recently
  deleted 'external' directory)

* config(suite) writes to stdout/stderr (gcylc parent terminal) - see
  gcylc:view_log() for a non-validating suite.

* clean up prerequisites classes - put all in one module? (same with
  outputs + requisites?)

* Any SystemExit() left in config.py, cylcconfigobj, (...?) BAD FOR GUI!

* gcylc: setting running suite group green fails if multiple suites are
  running. Need to parse each group member and check if any are running.

* Consider possible changes to cylc internals in light of the fact
  that as of cylc-3 we know all task dependencies in advance, thanks
  to the suite.rc dependency graph. E.g. we may be able to remove many
  finished tasks ahead of the generic cleanup cut off.

* currently all task reset commands go through the 'remote' object,
  which all suites have, but we could get a proxy for the task proxy
  directly, in which case we'd could catch Pyro.NamingError to detect if
  no such task exists in the suite. 

* Pyro config:
  + PYRO_MAXCONNECTIONS , default 200: max number
  of simultaneous connections to a single pyro server?
  The default connection validator checks for this.
  + port range: set sufficiently wide for large sites.
  + consider Pyro multi-threading again?

* Check that user-defined task names do not clash with parent class
  or attribute names.

* ?can deletion and insertion while a suite is HELD result in 
    "INSERTING A TASK
      ERROR: A%2010081800 has already registered its outputs"

* Python re: check use of re m.group() and m.groups(): 

    m.group(0)  # entire match
    m.group(1)  # first parenthesised sub-group
                # ...

    m.groups()  # tuple of parenthesised sub-groups


* task ID should not be needed in message strings (?); it can be supplied 
  automatically by access methods (for started, finished, etc.)

* cycle-dependent number of restarts: this can cause problems because
  only the most recent finished task is retained to satisfy
  prerequisites. If a task alternately does short and long forecasts
  with accordingly few and many restart outputs, we must retain *both*
  previously finished short and long versions of the task when currently
  only the short one would be retained. THE CYLC TASK ELIMINATION
  ALGORITHM MUST TAKE RESTART OUTPUTS INTO ACCOUNT: Do not eliminate a
  finished task if its restart outputs are still valid.
* DUPLICATE?: consider restart messages for split tasks that actually
  use the same restart files: e.g. nzlam_long (06,18Z) and nzlam_short
  (00,12Z).  Currently must combine this into one task with
  conditionals, OR register (and report) restart messages explicitly
  because the automatic restart messages will not have the right task
  name.

* check all open( file, mode ) statements for mode 'rb' etc. 

* wherever I've used this:
    os.makedirs( os.path.dirname( filename ) )
  check that filename is not just a bare local dir filename like foo.bar
  as opposed to dir/foo.bar, else dirname() will return empty string and 
  makedirs() will fail.  This could happen if the user tries to use
  files in $PWD without specifying the full path.

* allow insertion of tasks in 'spawned already' state, for oneoff runs 
  of non-oneoff tasks?

* task state at initialization: there's no point in giving initial
  state of "finished" for example, if this does not result in setting
  prerequisites and outputs satisfied and completed, respectively.
  Consolidate this task state resetting code into methods in task.py?

* use new cylc_mode class everywhere mode test is required.

* consider more specialized messaging commands:
    cylc task error
    cylc task warning
    cylc task message
    cylc task output

* cylc submit --scheduler: task state resets to 'running' but all
  messages are ignored.

* extreme task elimination method - extrapolate forward in time to see
  if a finished task will ever be needed, otherwise delete? Or, post 
  cylc-3, use the predefined dependency graph to detect tasks that 
  will not be needed anymore?

* CHECK USE OF DICTIONARY ITERATION THROUGHOUT THE CODE:
    for item in dict:
       (operation that adds or removes items from dict)
  => RuntimeError: dictionary changed size during iteration

  Instead use:
    for item in dict.keys():
  which must build a temporary list?

* src/which.py, like the shell command, searches only for executable
  files. The ll_raw job submit classes use which to find task scripts
  that don't have a full path supplied in the taskdef; these need
  to check that which returns a result before continuing (e.g. if 
  the task script has not been set executable). 

* consider reset --no-spawn (Bernard tried to reset a waiting unspawned
  task to finished, which made it spawn ... is this the desired
  behaviour?)

* DOCUMENT or CHANGE: cotemporal peers of failed tasks are not deleted
  automatically because we USED TO restart with failed tasks in the 
  'waiting' state (not 'ready') - thus the aforementioned peers may be
  required to satisfy the failed task's prerequisites post resetting.
  (still relevant in cylc-4 or not?)

* consider task proc loop invocation - could we separate summary update
  vs task processing (e.g. when a task fails we have to run pointlessly
  (?) through the loop in order to update the summary immediately for
  monitoring.

* check that all error messages go to stderr (print >> sys.stderr).

* allow cylc task proxies to kill their real external tasks at shutdown
  (and otherwise)?

* when there are multiple finished tasks that can satisfy a new task's
  restart prerequisites, the one that actually satisfies the new task
  will be an essentially random choice (the first one that comes along). 
  This is OK because the only thing that matters is that at least one
  task can satisfy the restart dependency, then the new task calls the
  prerequisite satisfied. However, we could get tasks to record the ID
  of the satisfier task as well, for each prerequisite, and also to
  choose the latest task as satisfier if more than one can do it.

* Cylc internals can easily handle different task commands, environment,
outputs etc., at different times, BUT this would complicate the suite.rc 
spec considerably for very rare use cases.  (EXAMPLE
different n_restart_outputs for a model that does a short
forecast at 0,12 and long forecast at 6,18). To change this,
use more 'if HOUR in ...' conditionals in taskdef.py, as for
prerequisites via the dependency graph cycle-time lists, and some
interface to this in config.py (which however should be designed
not to further complicate suite.rc files for general use).

GUI

+ can we keep the control GUI graph view centered, after layout
 changes, on the most recently clicked-on task? (clicking on a task
 centers it now, for a given layout).

 + New multi-view suite control GUI: all suite info is now got from the
 suite via Pyro instead of from a local config file parsed when the GUI
 starts up (because with remote control capability we may not have a
 local config file to parse).  Will this be a performance hit (either
 for the GUI or the suite) in large suites?  If so, we could optionally
 allow the old method with the proviso that it would not work for remote
 controlled suites.

* currently scans for port on every command - should store found port?

* catch invalid regex (e.g. unbalanced paren) in user-entered filters.

====================================
GRAPH STABILITY:

Small changes to graph content can cause large layout jumps - this is
probably inevitable. 

However, changes to the order of nodes and edges in the dot-file
for the *same* graph may also cause layout changes. I've traced 
this back to config.get_graph() for the cylc base graph and verified
that I'm sorting edges and nodes prior to adding them to the graph.
Thus the random(?) reordering seems to be a feature of the internals
of pygraphviz? (maybe it stores nodes in dicts in the underlying C
library...?)
CONSIDER ADDING LAYOUT ONCE to base graph, and thereafter only when new 
tasks are added to the graph? 

====================================
FUZZY PREREQUISITES

* allow mixed fuzzy and non-fuzzy prerequisites (currently have to 
  set identical fuzzy bounds to simulate the non-fuzzy case; see
  topnet.py).

* FUZZY MATCHING CURRENTLY ASSUMES AT MOST ONE COMPATIBLE OLDER FINISHED
  VERSION OF THE UPSTREAM TASK IS PRESENT, otherwise the match occurs
  with the first one found, whichever it is. This can fail if a suite
  problem puts a hold on spent task deletion! E.g.: topnet and oper
  interface go on ahead when topnet_vis has failed (unlikely to happen
  though!)

* just before a task runs, try to re-satisfy fuzzy prerequisites in case
  a more up-to-date satisfier has shown up while the task was waiting
  for other prerequisites to be satisfied ... OR (better?!) don't try to
  satisfy fuzzy prerequistes until after all non-fuzzy ones have been
  satsified. 
