
De-duplication of data in a data processing system

DC CAFC
  • US 7,949,662 B2
  • Filed: 12/23/2003
  • Issued: 05/24/2011
  • Est. Priority Date: 04/11/1995
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method of eliminating a particular data item at a given storage location in a distributed file system when a copy of said particular data item can be obtained from at least one other storage location in the file system, wherein said file system comprises (i) a plurality of distinct storage locations, and (ii) at least one computer distinct from said plurality of storage locations, and (iii) at least one database, and wherein each data item in the file system is stored at multiple distinct storage locations in the file system, based, at least in part, on a predetermined degree of redundancy for said file system, and wherein said at least one database comprises mapping data that identifies where data items are stored in the file system, and wherein the file system further includes a table including a plurality of records indicating changes made to said file system, said table being accessible by said at least one computer, the method comprising the steps of:

    (A) obtaining, at said at least one computer, a request to delete a particular data item from said given storage location; and

    (B) based on the request, ascertaining a particular substantially unique data identifier for the particular data item, wherein said particular data item consists of a particular sequence of bits and wherein said particular data identifier is based, at least in part, on a given function of all of the bits in the particular sequence of bits of the particular data item; and

    (C) determining, using at least said particular substantially unique data identifier for the particular data item, whether deletion of said particular data item from said given storage location will leave a sufficient number of copies of said data item present at one or more other storage locations in the file system to satisfy said predetermined degree of redundancy for said file system with respect to that particular data item; and

    (D) based at least in part on said determining, if it is determined that deletion of the particular data item from the given storage location will leave at least a sufficient number of copies of said data item at one or more other storage locations in said file system to satisfy said predetermined degree of redundancy for said file system with respect to that particular data item, then (d1) adding an entry to said table to indicate deletion of said particular file from the given storage location in the file system, said entry including said particular substantially unique identifier for said particular data item.
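
Steps (A) through (D) can be illustrated with a minimal in-memory sketch. It is not the patented implementation and does not define the claim's scope. The names `FileSystemSketch`, `data_identifier`, `store`, and `request_delete` are hypothetical, and SHA-256 is assumed as the "given function of all of the bits". The sketch reads "sufficient number of copies" as meaning the count of copies at other locations is at least the predetermined degree of redundancy. It also removes the item and updates the mapping data after adding the table entry, which goes beyond step (d1) and is included only so that later requests see the new state.

```python
import hashlib
from dataclasses import dataclass, field


def data_identifier(data: bytes) -> str:
    """Identifier derived from every bit of the item (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class FileSystemSketch:
    """Hypothetical stand-in for the claimed file system; all names are illustrative."""
    redundancy: int                                   # predetermined degree of redundancy
    locations: dict = field(default_factory=dict)     # location -> {name: bytes}
    mapping: dict = field(default_factory=dict)       # identifier -> set of locations
    change_table: list = field(default_factory=list)  # records of changes

    def store(self, location: str, name: str, data: bytes) -> str:
        """Store one copy and record where it lives in the mapping data."""
        ident = data_identifier(data)
        self.locations.setdefault(location, {})[name] = data
        self.mapping.setdefault(ident, set()).add(location)
        return ident

    def request_delete(self, location: str, name: str) -> bool:
        # (A) A delete request arrives for an item at a given location.
        data = self.locations.get(location, {}).get(name)
        if data is None:
            return False

        # (B) Ascertain the identifier from all bits of the item.
        ident = data_identifier(data)

        # (C) Count the copies that would remain at other locations.
        others = self.mapping.get(ident, set()) - {location}
        if len(others) < self.redundancy:
            return False  # deletion would break the redundancy target

        # (D)(d1) Add an entry to the table, including the identifier.
        self.change_table.append(
            {"op": "delete", "location": location, "id": ident}
        )

        # Assumption beyond (d1): drop the copy and update the mapping.
        # This also assumes a location holds at most one copy of an item.
        del self.locations[location][name]
        self.mapping[ident].discard(location)
        return True
```

With a redundancy of 2 and one item stored at three locations, the first delete request in this sketch is accepted and logged, while a second request is refused because only one other copy would remain.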

  • 9 Assignments