freebsd-dev/contrib/cvs/PROJECTS
peter 732631db68 Import cvs-1.9.23 as at 19980123. There are a number of really nice
things fixed in here, including the '-ko' vs. -A problem with
remote cvs which caused all files with -ko to be resent each time
(which is damn painful over a modem, I can tell you).  It also found a
heap of stray empty directories that should have been pruned with the -P
flag to cvs update but were not for some reason.

It also has the fully integrated rcs and diff, so no more fork/exec
overheads for rcs,ci,patch,diff,etc.  This means that it parses the control
data in the rcs files only once rather than twice or more.

If the 'cvs diff' vs. Index thing is going to be fixed for future patch
compatibility, this is the place to do it.
1998-01-26 03:09:57 +00:00


This is a list of projects for CVS. In general, unlike the things in
the TODO file, these need more analysis to determine if and how
worthwhile each task is.
I haven't gone through TODO, but it's likely that it has entries that
are actually more appropriate for this list.
0. Improved Efficiency
* CVS uses a single doubly linked list/hash table data structure for
all of its lists. Since the back links are only used for deleting
list nodes it might be beneficial to use singly linked lists or a
tree structure. Most likely, a single list implementation will not
be appropriate for all uses.
One easy change would be to remove the "type" field from the list
and node structures. I have found it to be of very little use when
debugging, and each instance eats up a word of memory. This can add
up and be a problem on memory-starved machines.
Profiles have shown that on fast machines like the Alpha, fsortcmp()
is one of the hot spots.
* Dynamically allocated character strings are created, copied, and
destroyed throughout CVS. The overhead of malloc()/strcpy()/free()
needs to be measured. If significant, it could be minimized by using a
reference-counted string "class".
* File modification time is stored as a character string. It might be
worthwhile to use a time_t internally if the time to convert a time_t
(from struct stat) to a string is greater than the time to convert a
ctime style string (from the entries file) to a time_t. time_t is
a machine-dependent type (although it's pretty standard on UN*X
systems), so we would have to have different conversion routines.
Profiles show that both operations are called about the same number
of times.
* stat() is one of the largest performance bottlenecks on systems
without the 4.4BSD filesystem. By splitting information out of
the filesystem (perhaps the "rename database") we should be
able to improve performance.
* Parsing RCS files is very expensive. This might be unnecessary if
RCS files were only used as containers for revisions, and tag,
revision, and date information were available in easy to read
(and modify) indexes. The cost becomes very apparent with files
that have several hundred revisions.
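
The reference-counted string "class" suggested above for the
malloc()/strcpy()/free() overhead could look roughly like the following
in C. This is only a sketch under stated assumptions: the rcstr type and
the rcstr_new(), rcstr_ref(), and rcstr_unref() functions are
hypothetical names, not part of CVS, and the layout relies on a C99
flexible array member.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type, not part of CVS: a string whose storage is shared
   between holders and freed when the last holder releases it.  */
struct rcstr
{
    size_t refs;   /* number of live references */
    size_t len;    /* length, excluding the terminating NUL */
    char text[];   /* string data, allocated inline (C99) */
};

/* Allocate a new string with one reference.  Returns NULL if malloc fails.  */
struct rcstr *
rcstr_new (const char *s)
{
    size_t len = strlen (s);
    struct rcstr *r = malloc (sizeof *r + len + 1);

    if (r == NULL)
        return NULL;
    r->refs = 1;
    r->len = len;
    memcpy (r->text, s, len + 1);
    return r;
}

/* Take another reference instead of copying the string.  */
struct rcstr *
rcstr_ref (struct rcstr *r)
{
    assert (r != NULL && r->refs > 0);
    r->refs++;
    return r;
}

/* Drop one reference; free the storage when none remain.
   Returns the number of references still outstanding.  */
size_t
rcstr_unref (struct rcstr *r)
{
    size_t left;

    assert (r != NULL && r->refs > 0);
    left = --r->refs;
    if (left == 0)
        free (r);
    return left;
}
```

A caller that currently duplicates a string with malloc() and strcpy()
would call rcstr_ref() instead and release it with rcstr_unref().
Whether this is a net win still depends on the measurement called for
above.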
1. Improved testsuite/sanity check script
* Need to use a code coverage tool to determine how much of the code
the sanity script exercises, and fill in the holes.