File-UStore


README


      * An Analysis of Compare-by-hash - for reasons why UUID-based storage
      may be preferred over a hash-based solution in certain cases.
      http://www.usenix.org/events/hotos03/tech/full_papers/henson/henson.pdf

USE CASE FOR THIS MODULE IN LIEU OF A HASH BASED STORAGE

    File::HStore is a similar module that provides file-hash-based
    storage. Because of the nature of file hashing, File::HStore does
    not allow duplicates: if the same file is stored a second time, it
    transparently returns the same hash it returned the first time as
    the ID, without adding a new file to the store.

    This is useful if a user does not want any duplicates in a store.
    However, this apparently trivial difference is risky when two
    processes upload a duplicate file to the store and both want to do
    file handling on it simultaneously. Only one of the processes will
    be able to get a lock (for deletion, manipulation, etc.) on the
    file at a time, and if the first process deletes the file referred
    to by its ID, the second process will never know what happened to
    the file it added. However, in circumstances where filename based
    deduplication is desired you must use
lib/File/UStore.pm


  * An Analysis of Compare-by-hash - for reasons why UUID-based storage
  may be preferred over a hash-based solution in certain cases.
  http://www.usenix.org/events/hotos03/tech/full_papers/henson/henson.pdf

=head1 USE CASE FOR THIS MODULE IN LIEU OF A HASH BASED STORAGE

File::HStore is a similar module that provides file-hash-based storage.
Because of the nature of file hashing, File::HStore does not allow
duplicates: if the same file is stored a second time, it transparently
returns the same hash it returned the first time as the ID, without
adding a new file to the store.

This is useful if a user does not want any duplicates in a store.
However, this apparently trivial difference is risky when two processes
upload a duplicate file to the store and both want to do file handling
on it simultaneously. Only one of the processes will be able to get a
lock (for deletion, manipulation, etc.) on the file at a time, and if
the first process deletes the file referred to by its ID, the second
process will never know what happened to the file it added. However, in
circumstances where filename based


