TODO List:


As with all open source projects... if you want something strongly
enough, then please (1) code it and submit it, or (2) pay me to add it.
You have the source, you have the power - use it.  Or, as has been said for years:

  Use the Source, Luke.


* In general, modify the program so that it ports more easily.  Currently,
  it assumes a Unix-like system (esp. in the shell programs), and it requires
  md5sum as a separate executable.
  There are probably some other nonportable constructs, in particular
  for non-Unix systems (e.g., symlink handling and file/dirnames).

* Rewrite the Bourne shell code in either Perl or Python (probably Python),
  and make the call to md5sum optional.  That way, the program
  could run on Windows without Cygwin.
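
  As a rough illustration of how the md5sum dependency could be dropped,
  here is a minimal sketch that uses Python's standard hashlib module.
  The function name file_md5 and the chunk size are assumptions made for
  this example; nothing like it exists in the program yet.

```python
import hashlib

def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

  The digest is the same hex string that md5sum prints in its first
  column, so duplicate detection could compare these values directly.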

* Improve the heuristics for detecting language type.

* Clean up the program.  This was originally written as a one-off program
  that wouldn't be used again (or distributed!), and it shows.

  For example, adding a language should require only one or two
  changes; instead, you have to modify multiple locations.
  The heuristics used to detect language type should
  be made more modular, so they could be reused in other programs, and
  so you don't HAVE to write out a list of filenames first if you
  don't want to.
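
  One possible shape for this cleanup, sketched in Python: keep a single
  table that describes each language, and expose a function that
  classifies one filename at a time.  The table contents and the names
  LANGUAGE_BY_EXTENSION, guess_language and register_language are
  assumptions for illustration, not the program's actual tables.

```python
import os

# Illustrative registry (an assumption, not the real tables): each
# language is described in exactly one place.
LANGUAGE_BY_EXTENSION = {
    ".c": "ansic",
    ".py": "python",
    ".pl": "perl",
    ".sh": "sh",
}

def guess_language(filename):
    """Return a language type for one filename, or None if unrecognized."""
    extension = os.path.splitext(filename)[1]
    return LANGUAGE_BY_EXTENSION.get(extension)

def register_language(extension, language_type):
    """Add a language with one call; no other location needs editing."""
    LANGUAGE_BY_EXTENSION[extension] = language_type
```

  Because guess_language takes a single filename, a caller can classify
  files as it finds them instead of writing out a file list first.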

* Consider rewriting everything not in C into Python.  Perl is
  a write-only language, and it's absurdly hard to read Perl code later.
  I find Python code much cleaner.  And shell isn't as portable.

  One reason I didn't rewrite it in Python earlier is that I had concerns
  about Python's licensing; Python releases starting with 1.6 had
  questionable compatibility with the GPL.  Thankfully, the Free Software
  Foundation (FSF) and the Python developers have worked together, and the
  Python developers have fixed the license for version 2.0.1 and up.
  Joy!!  I'm VERY happy about this!

* Create a "javascript" category: ".js" extension, "js" type.

* Are any CGI files (.cgi) unhandled?  Are any files left unidentified?

* Also create a category, "only the code embedded in HTML"
  (e.g., Javascript scripts, PHP statements, etc.).

* Handle Cobol.

* Add support for:
  .pco -> Oracle preprocessed Cobol Code
  .pfo -> Oracle preprocessed Fortran Code

* Handle Fortran beyond Fortran 77.

* Handle PL/1.

* Handle Ruby.

* Handle ML/CAML.  It uses Pascal-style comments (*..*),
  double-quoted C-like strings "\n...", and .ml or .mli file extensions
  (.mli is an interface file for CAML).
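
  A minimal sketch of such a counter in Python, under stated assumptions:
  comments are taken to nest (as they do in OCaml), a backslash escapes
  the next character inside a string, and character literals and strings
  inside comments are ignored.  The name count_ml_sloc is hypothetical.

```python
def count_ml_sloc(text):
    """Count lines that contain something outside (* ... *) comments."""
    depth = 0            # current comment nesting depth
    in_string = False    # inside a double-quoted string
    sloc = 0
    for line in text.splitlines():
        has_code = False
        i = 0
        while i < len(line):
            two = line[i:i + 2]
            if in_string:
                has_code = True
                if line[i] == "\\":
                    i += 2           # skip the escaped character
                    continue
                if line[i] == '"':
                    in_string = False
                i += 1
            elif depth > 0:
                if two == "(*":
                    depth += 1
                    i += 2
                elif two == "*)":
                    depth -= 1
                    i += 2
                else:
                    i += 1
            elif two == "(*":
                depth = 1
                i += 2
            elif line[i] == '"':
                in_string = True
                has_code = True
                i += 1
            else:
                if not line[i].isspace():
                    has_code = True
                i += 1
        if has_code:
            sloc += 1
    return sloc
```

  A comment opener inside a string is not treated as a comment, and a
  line that holds code after a closing comment still counts as code.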

* If an unpreprocessed database source code file (e.g., ".pc") exists,
  then its corresponding generated file (e.g., ".c") should be counted
  as being automatically generated, NOT as a normal program to count.
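
  The check could look roughly like the following Python sketch.  Only
  the ".pc" to ".c" pairing comes from the note above; the names
  GENERATED_FROM and find_generated are assumptions for illustration.

```python
import os

# Preprocessor input extension -> extension of the file it generates.
# Only the ".pc" -> ".c" pair is taken from the TODO item.
GENERATED_FROM = {".pc": ".c"}

def find_generated(paths):
    """Return the set of paths that have a preprocessor source beside them."""
    present = set(paths)
    generated = set()
    for path in paths:
        root, extension = os.path.splitext(path)
        target_extension = GENERATED_FROM.get(extension)
        if target_extension and root + target_extension in present:
            generated.add(root + target_extension)
    return generated
```

  Files in the returned set would then be reported as automatically
  generated instead of being passed to the normal counters.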

* Improve makefile identification and counting.
  Currently the program does not identify as makefiles "Imakefile"
  (generated by xmkmf and processed by imake, used by MIT X server)
  nor automake/autoconf files (Makefile.am/Makefile.in).
  Need to handle ".rules" too.

  I didn't just add these files to the "makefile" list, because
  I have concerns about processing them correctly using the
  makefile counter.  Since most people won't count makefiles anyway,
  this isn't an issue for most.  I welcome patches to change this,
  _IF_ you ensure that the resulting counts are correct.

  The current version is sufficient for handling programs that have
  ordinary makefiles, which are included in the SLOC count when
  the user enables the option to count makefiles.

  Currently the makefile counter counts "all non-blank lines"; conceivably
  someone might want to count only the actual directives, not the
  conditions under which they fire.
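
  To make the difference concrete, here is a small Python sketch of both
  counts.  It assumes "directives" means the command lines of a rule,
  which make requires to begin with a tab; that reading, and the function
  names, are assumptions rather than settled design.

```python
def count_nonblank(text):
    """Count every non-blank line, as described for the current counter."""
    return sum(1 for line in text.splitlines() if line.strip())

def count_recipe_lines(text):
    """Count only rule command lines, i.e. non-blank lines starting with a tab."""
    return sum(
        1 for line in text.splitlines()
        if line.startswith("\t") and line.strip()
    )
```

  For a makefile with two rules of one command each, the first count
  includes the target lines and the second does not.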


* Improve the flexibility in symlink handling; see "make_filelists".
  It should be rewritten.  Some systems don't allow
  "test"ing for symlinks, which was a portability problem - that problem
  at least has been removed.

* I've added a few utilities that I use for counting whole Linux systems
  to the tar file, but they're not installed by the RPM and they're not
  documented.

* Roger Pilkey suggested handling this warning:
  WARNING! File /cygdrive/d/dev/src/eUnification/load/bin/schemaDoc.sh has
  unknown start: #!D:/cygnus/bin/bash.exe
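
  One way to handle this, sketched in Python: reduce the "#!" line to a
  bare interpreter name before looking it up, so a Windows-style path
  with a drive letter and ".exe" suffix resolves the same way as
  /bin/bash.  The function name interpreter_name is hypothetical.

```python
import ntpath

def interpreter_name(first_line):
    """Return the bare interpreter name from a '#!' line, or None."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    # ntpath.basename splits on both "/" and "\" and ignores drive letters.
    name = ntpath.basename(parts[0]).lower()
    # "#!/usr/bin/env python" names the interpreter in the next word.
    if name == "env" and len(parts) > 1:
        name = ntpath.basename(parts[1]).lower()
    if name.endswith(".exe"):
        name = name[:-4]
    return name
```

  With this, "#!D:/cygnus/bin/bash.exe" yields "bash", which an existing
  interpreter-to-language lookup could then treat like any other shell.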

* Modify the code, esp. sloccount, to handle systems so large that
  the data directory list can't be expanded using "*".
  This would involve using "xargs" in sloccount, maybe getting rid
  of the separate filelist creation, and having break_filelist
  call compute_all directly (break_filelist needs to run all the time,
  or its reloading of hashes during initialization would become the
  bottleneck).
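
  If the driver were rewritten in Python, the same idea could be sketched
  as follows: list the data directories lazily instead of expanding "*",
  and hand them on in bounded batches much as xargs would.  The names
  iter_data_dirs and batches are assumptions for illustration.

```python
import itertools
import os

def iter_data_dirs(data_root):
    """Yield each subdirectory of data_root without building a full list."""
    with os.scandir(data_root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield entry.path

def batches(items, size):
    """Group any iterable into lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

  A caller could loop over batches(iter_data_dirs(root), 500) and process
  each batch in turn, so no single command line ever has to hold every
  directory name.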

