freedups-0.6.14-5mdv2010.0.noarch.rpm


V0.6.7 - Apr 26, 2003
- If the user requested DatesEqual, add mtime to the equivalence class.
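
A rough sketch of that equivalence class, written in Python for
illustration rather than the Perl freedups uses. The function name, the
dates_equal flag, and the sample records are assumptions made for this
example, not the script's actual code.

```python
from types import SimpleNamespace


def equivalence_key(st, dates_equal=False):
    """Return the tuple two inodes must share to be link candidates."""
    key = (st.st_size, st.st_uid, st.st_gid, st.st_mode)
    if dates_equal:
        # Hypothetical flag mirroring the DatesEqual option:
        # the modification time becomes part of the class.
        key += (int(st.st_mtime),)
    return key


# Two stat-like records that differ only in mtime (made-up values).
first = SimpleNamespace(st_size=10, st_uid=0, st_gid=0,
                        st_mode=0o100644, st_mtime=100.0)
second = SimpleNamespace(st_size=10, st_uid=0, st_gid=0,
                         st_mode=0o100644, st_mtime=200.0)

same_class_by_default = equivalence_key(first) == equivalence_key(second)
same_class_with_dates = (equivalence_key(first, dates_equal=True)
                         == equivalence_key(second, dates_equal=True))
```

The same function accepts a real os.stat() result, since it only reads
the st_* attributes.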


V0.6.6 - Apr 26, 2003
- InodeOfFile used to be read in Md5SumOf (which could stand on its
  own), created and cleared in IndexFile, and used heavily in LinkFiles.
  Md5SumOf now creates an entry on demand, and the create and clear are
  moved entirely to LinkFiles.  In fact, we don't clear it at all now.


V0.6.5 - Apr 26, 2003
- Start using size+uid+gid+mode as an equivalence class instead of size
  in the InodesOfSize array.  All the same tests are performed as
  before, only we have fewer inodes to which the current one is
  compared.
- Stop reconstructing the InodesOfSize{size} on every inode.  Just
  replace an entry in this array if it becomes necessary.
- Quiet the "Tried to link identical Inodes" message.


V0.6.4 - Apr 16, 2003
- Brown paper bag time.  I compared a new inode to each of the inodes of
  its size, trying to find a link.  The problem was, if it didn't match
  any of them, I failed to add it to the list of inodes of that size so
  it might match future inodes.  For example, if I have two pairs of
  identical files, the first pair gets linked, the second does not.  The
  regression test was updated to check that this works in the future. 
  My sincere thanks to Martin Sheppard and Milton Yates at csiro.au for
  debugging, finding, and sending in a flawless fix for, this bug.
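
The corrected loop can be sketched as follows, in Python for
illustration. place_inode, the candidates dictionary, and the
same_content callback are hypothetical names; only the "remember an
unmatched inode" step is taken from the entry above.

```python
def place_inode(candidates, key, inode, same_content):
    """Match inode against earlier inodes of its class, or remember it.

    Returns the inode it would be linked to, or None if it matched
    nothing and was stored as a new candidate.
    """
    bucket = candidates.setdefault(key, [])
    for other in bucket:
        if same_content(other, inode):
            return other
    # The step the fix restores: an unmatched inode must stay in the
    # list so that later inodes can still match it.
    bucket.append(inode)
    return None


# Two pairs of identical files that all share one size: (1, 2) and (3, 4).
contents = {1: b"aaaa", 2: b"aaaa", 3: b"bbbb", 4: b"bbbb"}
candidates = {}
links = [
    place_inode(candidates, 4, ino, lambda a, b: contents[a] == contents[b])
    for ino in (1, 2, 3, 4)
]
```

With the append in place, inode 4 finds inode 3; without it, the second
pair would never be linked.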


V0.6.3 - Mar 9, 2003
- Reasonably big change; we're now processing files immediately as
  they're read from disk rather than waiting until everyone's read into
  memory.  I'm hoping this will allow one to actually make some headway
  even in the case where there's a huge number of files.  I also suspect
  it'll go faster as we're down from order ~1.5x num_files to 1x
  num_files.  This loses a bit of disk cache locality in processing the
  nodes, but the lack of seeks and the fact that we don't have to wait
  until everything's read in to make some progress should more than make
  up for that.
- Because of the above, I no longer figure out how many nodes are
  solitary versus multiple.
- Minor fixes.


V0.6.2 - Feb 26, 2003
- Slight modification.  All calculated sums get written to KnownMd5sums
  and NewMd5sums.  We use KnownMd5sums for all internal work, NewMd5sums
  is only used for appending new sums to the cache at the end.
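
One way to picture the two tables, sketched in Python. The SumCache
class, its method names, and the "md5 spec" line layout are assumptions
for illustration; the changelog does not describe the cache file format.

```python
import hashlib


class SumCache:
    """Known sums serve every lookup; new sums are what gets appended."""

    def __init__(self, cached):
        self.known = dict(cached)  # sums loaded from the cache file
        self.new = {}              # sums calculated during this run

    def record(self, inode_spec, md5sum):
        # A freshly calculated sum goes into both tables.
        self.known[inode_spec] = md5sum
        self.new[inode_spec] = md5sum

    def lookup(self, inode_spec):
        # All internal work consults the known table only.
        return self.known.get(inode_spec)

    def lines_to_append(self):
        # Hypothetical on-disk layout: one "md5 spec" pair per line.
        return [f"{md5} {spec}" for spec, md5 in self.new.items()]


cache = SumCache({"spec-a": hashlib.md5(b"").hexdigest()})
cache.record("spec-b", hashlib.md5(b"hello").hexdigest())
pending = cache.lines_to_append()
```

Only the sum for spec-b ends up in pending, so the cache file grows by
what this run actually computed.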


V0.6.1 - Feb 22, 2003
- Break md5sums into known and new md5sums.  Known sums came from the
  cache and therefore don't need to be written out.  New sums were
  calculated on this run and are appended to the cache at the end.
- Minor typos and fixes.


V0.6.0 - Dec 9, 2002
- Cleanups of a stable 0.5.9.  Removed a few variables.  Old debugging
  code removed.  Move \n into Debug.
- LinkInodes doesn't call LinkFiles if ActuallyLink=no any more (there's
  a mini version embedded in LinkInodes now that does the Debug prints).


V0.5.9 - Dec 9, 2002
- Changed Inodespec storage format from slash delimited string to packed  
  SLSSSLLL format.  Runtime peak memory for a 219800 file run went from
  87.8M to 78.8M; 10% memory savings.  Informal numbers show it about 15%
  faster as well.
- Wow.  Instead of loading InodeOfFile during the initial file scan, I
  leave it blank until we've discarded solitary inodes, and then I load
  it with _just_ the files and inodes of the currently-being-worked-on
  size.  This brings the peak memory use for that same 219800 file run
  down to 41.8M.  Woah.  
- For reference, given that v0.5.6 needed 1.08x the RAM of v0.5.7,
  v0.5.6 would have needed 94.8M for the same run.  We've saved 56% of
  our peak RAM requirements.
- On a P3-1500, I can process 219800 files (whose directory entries are
  in disk cache and that are already linked) in 80 seconds.  2,747
  files/second.
- New regression tests, including a full link of two copies of a kernel
  source tree and a diff afterwards.
- The truly verbose debugs show garbage if they try to print md5sums or
  inodespecs, sorry.  I'm guessing I'm the only person that sees them
  anyways.
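
To illustrate the packed inodespec idea, here is a Python sketch using
the struct module. In Perl's pack, S is an unsigned 16-bit value and L
an unsigned 32-bit value, so SLSSSLLL maps to "=HLHHHLLL" here. The
changelog does not say which fields the eight slots hold; the field
names and sample values below are guesses for illustration only.

```python
import struct

# Eight fixed-width slots: 2+4+2+2+2+4+4+4 = 24 bytes per record.
INODESPEC = struct.Struct("=HLHHHLLL")

# Hypothetical field order; the real layout is not documented here.
FIELDS = ("dev", "ino", "mode", "uid", "gid", "size", "mtime", "nlink")


def pack_spec(values):
    """Pack eight integers into the fixed-width binary form."""
    return INODESPEC.pack(*values)


def unpack_spec(blob):
    """Turn a packed record back into a field-name dictionary."""
    return dict(zip(FIELDS, INODESPEC.unpack(blob)))


sample = (2049, 1234567, 0o100644, 1000, 1000, 4096, 1039392000, 2)
packed = pack_spec(sample)
slashed = "/".join(str(value) for value in sample)  # the older style
restored = unpack_spec(packed)
```

The packed record is always 24 bytes, while the slash-delimited string
grows with the number of digits in each field.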


V0.5.8 - Nov 30, 2002
- Print a reasonably accurate estimate of how much space would have been
  saved on dry runs (-a turned off).
- Slightly restructure LinkInodes to reduce code repetition.


V0.5.7 - Nov 27, 2002
- Don't use IndexedFiles{File}=0 test to guarantee unique files anymore,
  use defined(@InodeOfFile{File}) which we have already.  Saves 8% of
  memory usage.  :-)


V0.5.6 - Nov 24, 2002
- Use separate cache for every user - safer.
- Ignore files we can't stat for some reason.
- Added regression test, to be run on every new version.


V0.5.5 - Nov 22, 2002
- Added code overview at the top.
- Show the size we're working on at each new link (if size has changed
  from last time).
- Ignore blank md5sums.
- Discard the md5sum of an inode if we perform the last unlink on that
  inode.


V0.5.4 - Nov 20, 2002
- Slightly different array syntax, per Ross Carlson


V0.5.3 - Nov 17, 2002
- Stop using :::: as a separator between the filenames in FilesOfInode;
  make the FilesOfInode values real arrays.


V0.5.2 - Nov 4, 2002
- Discard solitary inodes early to (theoretically) save memory (note
  that perl (5, at least) doesn't actually return memory to the OS if
  the app undef's it).


V0.5.1 - Nov 3, 2002
- Added IndexFile function to load all arrays.
- Load md5sum cache late.


V0.5 - Nov 3, 2002
- Freedups has been rewritten in perl.
- First perl release with the following features:
    - (shared) md5 checksum cache
    - Read filenames in and stat them, storing inode info in internal arrays.
    - Do comparison of _inodes_, not filenames.
    - If a given size has a single inode, discard it as there's no chance of linking.


V0.4 - May 6, 2001
- v0.3 and below were spending a _lot_ of time forking basename, even
  when we didn't need to test for basename.  By removing that and
  grouping files with identical md5sums together, it processes large
  numbers of files in about a tenth of the time.  It does need to 
  read all the files now, some twice, but it's worth it for the speedup.
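
Grouping by checksum can be sketched as below. This is a Python
illustration, not the shell code v0.4 shipped; md5_of_file and
group_by_md5 are hypothetical helpers, and the temporary files exist
only to make the example self-contained.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_md5(paths):
    """Return lists of paths whose contents share an md5sum."""
    groups = defaultdict(list)
    for path in paths:
        groups[md5_of_file(path)].append(path)
    return [names for names in groups.values() if len(names) > 1]


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, data in (("a", b"same"), ("b", b"same"), ("c", b"other")):
        path = os.path.join(tmp, name)
        with open(path, "wb") as handle:
            handle.write(data)
        paths.append(path)
    duplicate_names = [
        [os.path.basename(p) for p in group] for group in group_by_md5(paths)
    ]
```

Every file is read once to compute its sum; a careful tool would still
compare contents byte for byte before linking.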


V0.3 - Mar 11, 2001
- Handles command line parameters now.  Setting options via environment
  variables works for the moment, but will be removed in a future
  version.
- Updated documentation.  List of apps, more verbose answers to
  questions.
- GPL text block added.
- Other minor fixes and cleanups.

V0.2.1 - Mar 2, 2001
- Added README and Changelog to package.
- Don't debug by default in shipping version.
- Clean out more debugging code.
- Minor code cleanups.
- Add examples to Usage output.
- Use mktemp if available for temporary signature file.


V0.2 - Feb 23, 2001
- Removal of a lot of forks and simplification of tests.
- More equivalency testing done in find's output.
- Link to the older of the two files or the file with the most links.
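
A sketch of the target choice, in Python for illustration. The entry
does not say which rule wins when the two disagree; this example
assumes the link count is checked first and age breaks ties, and the
(name, nlink, mtime) tuples are hypothetical.

```python
def pick_link_target(first, second):
    """Decide which file is kept and which is replaced by a link.

    Each argument is a (name, nlink, mtime) tuple.
    Returns (keep, replace).
    """
    # Assumed precedence: more existing links first, then older mtime.
    keep = max(first, second, key=lambda f: (f[1], -f[2]))
    replace = second if keep is first else first
    return keep, replace


older_wins = pick_link_target(("x", 1, 200), ("y", 1, 100))
more_links_wins = pick_link_target(("z", 3, 300), ("w", 1, 50))
```

Keeping the file with more links means fewer directory entries have to
be rewritten.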


V0.1 - Feb 19, 2001
- Basic search and link functionality
- Environment variables available:
    - ACTUALLYLINK=YES       #Just reports on potential savings if anything but YES.
    - VERBOSE=YES            #Show directory listing and wait before linking if YES.
    - CHECKDATE=YES          #Modified date and time must be equal to be considered for linking if YES.
    - FILENAMESEQUAL=YES     #Files must have the same name (in different directories) to be considered for linking if YES.
    - MINSIZE=size           #Files must be larger than this size (in bytes) to be considered for linking.
- Not publicly released.