Deletion Due To Duplication Patents (Class 707/664)
  • Patent number: 8516149
    Abstract: An information retrieval system having: a client adapted for accessing a plurality of file sets stored on one of a plurality of file servers; a plurality of file servers configured to operate with a federated file system namespace; and a memory for storing re-direction information accessible by the client for identifying a request issued by the client for a file set at a first location in the namespace where the file set is located at a second, different location on one of the file servers and wherein the client in examining the re-direction information in the memory, re-directs the request to the second location in accordance with the re-direction information.
    Type: Grant
    Filed: December 17, 2010
    Date of Patent: August 20, 2013
    Assignee: EMC Corporation
    Inventor: Christopher Howard Edmett Stacey
  • Patent number: 8504528
    Abstract: The various embodiments herein operate to identify, consolidate, and reduce redundant backup data storage. One embodiment includes storing data blocks and first signatures of data chunks of each stored data block, the first signature of each data chunk including a reference to a storage location of the data chunk within a stored data block, the stored data blocks including data blocks of previous and recent backup sessions. Some embodiments further include storing second signatures in a second signature repository, where the second signatures are calculated based on determined boundaries of the first signatures from previous backup sessions. At least one of the second signatures is calculated based on at least two first signatures, and in the range of 32 to 64 first signatures in some embodiments. Some embodiments may identify data chunks of the recent backup session present in the stored data blocks prior to the recent backup session.
    Type: Grant
    Filed: November 9, 2009
    Date of Patent: August 6, 2013
    Assignee: CA, Inc.
    Inventors: Chandra Reddy, Ming Yan, Liqiu Song
  • Patent number: 8495288
    Abstract: A storage controller of the present invention narrows down the target for data comparison by comparing hash codes beforehand and rapidly detects duplicated data. A hash value setting unit sets a hash code in data received from a host. Hash code-attached data is stored in a logical volume. A microprocessor unit compares the hash codes for each comparison-targeted data. When hash codes match with one another, a data comparator compares the target data, and determines whether or not the data is duplicated data. When duplicated data is detected, the microprocessor unit removes the duplicated data.
    Type: Grant
    Filed: June 10, 2008
    Date of Patent: July 23, 2013
    Assignee: Hitachi, Ltd.
    Inventors: Mutsumi Hosoya, Hiroshi Kanayama, Wataru Mineta
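
The two-stage check described in the abstract for patent 8495288 (compare hash codes first, then compare the data itself only when the codes match) can be sketched roughly as follows. This is an illustrative sketch, not the patented storage controller: the function name `remove_duplicates`, the use of a truncated SHA-256 digest as the "hash code", and the in-memory list of blocks are all assumptions made for the example.

```python
import hashlib


def remove_duplicates(blocks):
    """Return the unique blocks in first-seen order.

    Hypothetical helper: a cheap hash-code match narrows the candidates,
    and a full byte comparison confirms that two blocks are duplicates.
    """
    by_code = {}  # hash code -> unique blocks that share that code
    unique = []
    for data in blocks:
        # Short code (assumed 4 bytes), so collisions are possible.
        code = hashlib.sha256(data).digest()[:4]
        candidates = by_code.setdefault(code, [])
        # Full comparison only against blocks whose hash code matches.
        if any(data == kept for kept in candidates):
            continue  # confirmed duplicate: drop it
        candidates.append(data)
        unique.append(data)
    return unique
```

The full comparison matters because a short hash code can collide: two different blocks with the same code are both kept.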
  • Patent number: 8468320
    Abstract: Methods for utilizing a locality table when performing data deduplication are disclosed. One method involves accessing a locality table stored in memory. The locality table includes several signatures, each of which identifies one of several data units that were consecutively added to a deduplicated data store on a persistent storage device. The method then involves searching the locality table for a new signature of a new data unit, in order to determine whether a copy of the new data unit is already present in the deduplicated data store. If the new signature is not found in the locality table, a pointer table is accessed. The pointer table indicates a subset of a set of signatures stored on the persistent storage device. In response to accessing the pointer table, the subset of the set of signatures, indicated by the pointer table, is searched for the new signature.
    Type: Grant
    Filed: June 30, 2008
    Date of Patent: June 18, 2013
    Assignee: Symantec Operating Corporation
    Inventor: Russell R. Stringham
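
The lookup order in the abstract for patent 8468320 (search an in-memory locality table first, then use a pointer table to pick a subset of the stored signatures) might look something like the sketch below. The class name `SignatureIndex`, the two-character prefix used as the pointer-table key, and the use of plain Python containers in place of persistent storage are assumptions for illustration only.

```python
from collections import defaultdict


class SignatureIndex:
    """Hypothetical two-level signature lookup."""

    def __init__(self, stored_signatures, locality_signatures):
        # Locality table: signatures of recently, consecutively added units.
        self.locality_table = set(locality_signatures)
        # Pointer table: signature prefix -> subset of stored signatures.
        self.pointer_table = defaultdict(list)
        for sig in stored_signatures:
            self.pointer_table[sig[:2]].append(sig)

    def contains(self, signature):
        """Return (found, which table answered the query)."""
        if signature in self.locality_table:
            return True, "locality"
        # Fall back to the subset indicated by the pointer table.
        subset = self.pointer_table.get(signature[:2], [])
        return signature in subset, "pointer"
```

For example, with stored signatures `["aa11", "ab22", "cc33"]` and a locality table holding only `"aa11"`, a query for `"ab22"` misses the locality table and is resolved through the pointer table.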
  • Patent number: 8447732
    Abstract: Deduplication in a network storage environment includes, for files stored in a network, determining a location constraint status specified by a compliance agreement for each of the files. Location constraint statuses include a location of persistent residency and no residency restriction. Deduplication also includes selecting a file from the files in the network and identifying corresponding redundant files, the selected file and the corresponding redundant files representing a set. Deduplication further includes determining the location constraint status for each of the files in the set. For the files in the set having a location constraint status specifying a location of persistent residency, the deduplication includes retaining a master copy at the respective location of persistent residency, and removing the corresponding redundant files from the network.
    Type: Grant
    Filed: August 2, 2011
    Date of Patent: May 21, 2013
    Assignee: International Business Machines Corporation
    Inventors: Abhinay R. Nagpal, Sandeep R. Patil, Gandhi Sivakumar, Carolyn A. Whitehead
  • Patent number: 8438137
    Abstract: Techniques for selecting between source and target deduplication include analyzing resource information related to resources available for deduplication, analyzing backup metadata of a backup job containing information related to backup of data from the source to the target, and selecting between deduplication on the source or the target based on the analyzed resource information and the backup metadata.
    Type: Grant
    Filed: February 28, 2011
    Date of Patent: May 7, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Stephen Gold, Sri Harshan Kapanipathi
  • Patent number: 8407186
    Abstract: A computer-implemented method for data-selection-specific data deduplication associated with a single-instance-storage computing subsystem may comprise: 1) detecting a request to store a data selection to the single-instance-storage computing subsystem, 2) identifying a data-selection-specific fingerprint set associated with the data selection and stored on a storage device, and 3) utilizing the data-selection-specific fingerprint set associated with the data selection for data deduplication associated with the request to store the data selection to the single-instance-storage computing subsystem. Other exemplary data deduplication methods, as well as corresponding exemplary systems and computer-readable media, are also disclosed.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: March 26, 2013
    Assignee: Symantec Corporation
    Inventors: Nick Cremelie, Bastiaan Stougie
  • Patent number: 8396837
    Abstract: When accepting a write request including data, an apparatus 100A acquires a first hash value based on a first hash function and, meanwhile, acquires a second hash value based on a second hash function. When a storage device 110A has not stored the acquired first hash value and second hash value in correlation with each other, the apparatus correlates the data, the first hash value, the second hash value, and information of referenced times, and then stores the correlated items into the storage device. On the other hand, when the storage device has stored the acquired first hash value and second hash value in correlation with each other, the apparatus changes the information of referenced times stored in correlation with the first hash value and the second hash value so as to add one to the number of times denoted by the information of referenced times.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: March 12, 2013
    Assignee: NEC Corporation
    Inventor: Masatsugu Matsuura
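
The write path in the abstract for patent 8396837 (two hash values per write, plus a reference count that is incremented when the pair is already stored) can be approximated as below. This is a sketch under stated assumptions: the class name `DualHashStore`, the choice of SHA-256 and BLAKE2b as the two hash functions, and the in-memory dictionary standing in for the storage device are not taken from the patent.

```python
import hashlib


class DualHashStore:
    """Hypothetical store keyed by a pair of independent hash values."""

    def __init__(self):
        # (first hash, second hash) -> [data, reference count]
        self.entries = {}

    def write(self, data: bytes):
        """Store data once; repeated writes only bump the reference count."""
        key = (
            hashlib.sha256(data).hexdigest(),   # assumed first hash function
            hashlib.blake2b(data).hexdigest(),  # assumed second hash function
        )
        entry = self.entries.get(key)
        if entry is None:
            self.entries[key] = [data, 1]  # first occurrence
        else:
            entry[1] += 1  # pair already stored: add one to the count
        return key

    def references(self, key):
        return self.entries[key][1]
```

Using two hash functions lowers the chance that different data is mistaken for a duplicate, since both values would have to collide at once.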
  • Patent number: 8396839
    Abstract: A subset of de-duplicated data is outputted. In some embodiments, the output includes a subset of data stored in de-duplicated form in a plurality of containers each including a plurality of data segments comprising the data. For each container that includes one or more data segments comprising the subset, a corresponding container data is included in the output. Each container may include one or more segments not included in the subset. For each container the corresponding container data of which is included in the output, a corresponding value in a data structure including for each container stored on the de-duplicated storage system a data value indicating whether or not the corresponding container data has been included in the output is updated.
    Type: Grant
    Filed: June 25, 2010
    Date of Patent: March 12, 2013
    Assignee: EMC Corporation
    Inventor: Mark Huang
  • Patent number: 8392375
    Abstract: The claimed subject matter relates to a network-accessible, online data archival service with a data store for archiving data for clients of the archival service. The archival service can include an architecture that can facilitate perpetual sustainability and accessibility of data by conforming to a model. In particular, the model can describe or define a minimum set of extensible or pluggable components or modules needed to facilitate and guarantee sustainability of and accessibility to the data in perpetuity.
    Type: Grant
    Filed: March 23, 2009
    Date of Patent: March 5, 2013
    Assignee: Microsoft Corporation
    Inventors: Elissa E. Murphy, Yan V. Leshinsky, John D. Mehr, Navjot Virk
  • Patent number: 8392376
    Abstract: A system and method for managing a resource reclamation reference list at a coarse level. A storage device is configured to store a plurality of storage objects in a plurality of storage containers, each of said storage containers being configured to store a plurality of said storage objects. A storage container reference list is maintained, wherein for each of the storage containers the storage container reference list identifies which files of a plurality of files reference a storage object within a given storage container. In response to detecting deletion of a given file that references an object within a particular storage container of the storage containers, a server is configured to update the storage container reference list by removing from the storage container reference list an identification of the given file. A reference list associating segment objects with files that reference those segment objects may not be updated in response to the deletion.
    Type: Grant
    Filed: September 3, 2010
    Date of Patent: March 5, 2013
    Assignee: Symantec Corporation
    Inventor: Fanglu Guo
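
The coarse-level bookkeeping in the abstract for patent 8392376 (tracking which files reference each storage container, rather than each individual segment object) could be modelled along these lines. The class name `ContainerRefList`, its methods, and the idea of returning containers that no file references any longer are assumptions added for illustration; the abstract itself only describes removing the deleted file from the container reference list.

```python
class ContainerRefList:
    """Hypothetical container-level reference list."""

    def __init__(self):
        # container id -> set of file names referencing any object in it
        self.files_by_container = {}

    def add_file(self, file_name, container_ids):
        for cid in container_ids:
            self.files_by_container.setdefault(cid, set()).add(file_name)

    def delete_file(self, file_name):
        """Drop the file from every container's list.

        Returns the sorted ids of containers that no file references
        afterwards (assumed here to be candidates for reclamation).
        """
        unreferenced = []
        for cid, files in self.files_by_container.items():
            files.discard(file_name)
            if not files:
                unreferenced.append(cid)
        return sorted(unreferenced)
```

Because the list is kept per container, deleting a file touches one entry per container instead of one entry per segment object.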
  • Patent number: 8370309
    Abstract: Redundant data is removed from a volume of data by partitioning the volume of data into fixed-length input segments and determining, for each of the input segments, whether a selected portion of the input segment matches a portion of a segment within a de-duplication dictionary. If the portion of the input segment matches a portion of the segment within the dictionary, the segment within the de-duplication dictionary is compared with the input segment and a token representative of the segment within the dictionary is substituted for at least part of the input segment determined to match the segment within the dictionary.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: February 5, 2013
    Assignee: Infineta Systems, Inc.
    Inventors: Karempudi V. Ramarao, Raj Kanaya
  • Patent number: 8346730
    Abstract: Deduplication of data on disk devices based on a threshold number (THN) of sequential blocks is described herein, the threshold number being two or greater. Deduplication may be performed when a series of THN or more received blocks (THN series) match a sequence of THN or more stored blocks (THN sequence), whereby a sequence comprises blocks stored on the same track of a disk device. Deduplication may be performed using a block-comparison mechanism comprising metadata entries of stored blocks and a mapping mechanism containing mappings of deduplicated blocks to their matching blocks. The mapping mechanism may be used to perform later read requests received for the deduplicated blocks. The deduplication described herein may reduce the read latency as the number of seeks between tracks may be reduced. Also, when a seek to a different track is performed, the seek time cost is spread over THN or more blocks.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: January 1, 2013
    Assignee: NetApp, Inc.
    Inventors: Kiran Srinivasan, Garth Goodson, Kaladhar Voruganti
  • Patent number: 8327097
    Abstract: A technique of backing up data for networked storage devices using de-duplication is disclosed in which a communication device divides a to-be-stored new file into data blocks, defines and updates a statistical value representative of a history of reference to each data block within previous files, and transmits the statistical value to another communication device. The communication device, upon reception of the statistical value, selects a preloaded data block, based on the received statistical value, and transmits to another communication device a copying request for making a copy of a real data block identical to the preloaded data block. The communication device, upon reception of the copy, stores the copy as the preloaded data block.
    Type: Grant
    Filed: February 25, 2009
    Date of Patent: December 4, 2012
    Assignee: KDDI Corporation
    Inventors: Takahiro Miyamoto, Michiaki Hayashi
  • Patent number: 8327061
    Abstract: A digital data storage device physically stores blocks of identical data only once on its storage medium, wherein a second or even further identical blocks are stored only as references to the first of these identical blocks. By this technique, storage of duplicate data is most effectively avoided on the lowest storage level of the disk storage device, even in cases where identical blocks are written by different operating systems. In the preferred embodiment, the underlying storage medium (magnetic hard disk, optical disk, tape, or M-RAM) is segmented into two areas, the first area particularly comprising a relatively small block reference table and the remaining physical storage area for storing real blocks of information.
    Type: Grant
    Filed: September 28, 2010
    Date of Patent: December 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Manfred Boldy, Peter Sander, Hermann Stamm-Wilbrandt
  • Patent number: 8315985
    Abstract: A method and apparatus for optimizing a de-duplication rate for backup streams is described. In one embodiment, the method for optimizing data de-duplication using an extent mapping of a backup stream includes processing a backup stream to access an extent mapping associated with a plurality of data files, wherein the plurality of the data files are arranged within the backup stream and examining the extent mapping to identify at least one extent group within the backup stream, wherein the plurality of the data files are de-duplicated using at least one location of the at least one extent group.
    Type: Grant
    Filed: December 18, 2008
    Date of Patent: November 20, 2012
    Assignee: Symantec Corporation
    Inventors: James Ohr, Michael Zeis, Dean Elling, Stephan Kurt Gipp, William DesJardin
  • Patent number: 8307089
    Abstract: A system for media content storage and delivery. The system includes a server that has a receiver and a processor. The receiver receives media data that indicates media content to be stored. The processor is in communication with the receiver. The processor determines media content characteristics that correspond to the media content to be stored. The processor determines a length of time to store the media content based on the media data and determines a cost amount based at least in part on the determined media content characteristics and length of time to store the media content.
    Type: Grant
    Filed: November 21, 2011
    Date of Patent: November 6, 2012
    Assignee: Ariel Inventions, LLC
    Inventor: Leigh M. Rothschild
  • Patent number: 8290911
    Abstract: A system and method for implementing data deduplication-aware copying of data are provided. In response to a request to copy a source file between a source filesystem and a destination filesystem, file mapping information corresponding to the source file is retrieved. The file mapping information is stored in a source filesystem map. The source filesystem accesses a source logical volume. The source logical volume maps to a deduplication storage area. The destination filesystem accesses a destination logical volume. The destination logical volume maps to the deduplication storage area. The source file comprises data stored in the deduplication storage area. A destination file is allocated, based on the file mapping information, in the destination filesystem. The destination file is mapped to the data stored in the data deduplication storage area.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: October 16, 2012
    Assignee: Symantec Corporation
    Inventors: Viswesvaran Janakiraman, Bruce Robert Montague
  • Patent number: 8285827
    Abstract: A method and apparatus for software and resource management with a model-based architecture.
    Type: Grant
    Filed: March 31, 2006
    Date of Patent: October 9, 2012
    Assignee: EMC Corporation
    Inventors: David Stephen Reiner, George M. Ericson
  • Patent number: 8280854
    Abstract: A computer-implemented method for relocating deduplicated data within a multi-device storage system. The method may include identifying a set of deduplicated data units stored on a first device of the multi-device storage system. Each data unit in the set of data units is referred to by one or more deduplication references. The method may also include procuring reference data that indicates, for each data unit in the set of deduplicated data units, the number of deduplication references that point to the data unit. The method may further include using the reference data to select one or more data units from the set of deduplicated data units for relocation to a second device in the multi-device storage system and relocating the one or more data units to the second device in the multi-device storage system. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: September 1, 2009
    Date of Patent: October 2, 2012
    Assignee: Symantec Corporation
    Inventor: Travis Emmert
  • Patent number: 8280859
    Abstract: The present invention provides for a system and method for assuring integrity of deduplicated data objects stored within a storage system. A data object is copied to secondary storage media, and a digital signature such as a checksum is generated of the data object. Then, deduplication is performed upon the data object and the data object is split into chunks. The chunks are combined when the data object is subsequently accessed, and a signature is generated for the reassembled data object. The reassembled data object is provided if the newly generated signature is identical to the originally generated signature, and otherwise a backup copy of the data object is provided from secondary storage media.
    Type: Grant
    Filed: August 2, 2010
    Date of Patent: October 2, 2012
    Assignee: International Business Machines Corporation
    Inventors: Matthew J. Anglin, David M. Cannon
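
The integrity check in the abstract for patent 8280859 (sign the object, split it into chunks, then verify the reassembled object and fall back to a backup copy on mismatch) can be illustrated with the sketch below. The function names, the fixed chunk size, the SHA-256 checksum, and the dictionary record holding the backup copy are all assumptions; real deduplication would also share chunks between objects, which is omitted here.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def store_object(data: bytes, chunk_size: int = 4) -> dict:
    """Record a signature, the chunked form, and a backup copy."""
    return {
        "signature": checksum(data),  # generated before chunking
        "chunks": [data[i:i + chunk_size]
                   for i in range(0, len(data), chunk_size)],
        "backup": data,  # stands in for the copy on secondary media
    }


def read_object(record: dict) -> bytes:
    """Reassemble the chunks; serve the backup if the signature differs."""
    reassembled = b"".join(record["chunks"])
    if checksum(reassembled) == record["signature"]:
        return reassembled
    return record["backup"]
```

If a chunk is corrupted after storage, the recomputed signature no longer matches and `read_object` returns the backup copy instead of the damaged data.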
  • Patent number: 8271452
    Abstract: A database archiving method includes storing a plurality of record fields, wherein each of the plurality of record fields is a field of a record of the database, and storing in a first database archive an index that includes at least one record entry, wherein each of the at least one record entry references at least one record field of the database. The plurality of record fields is stored independently of the first database archive, and each field included in more than one record of any single table of the database is stored for the more than one record as a single record field referenced by a plurality of record entries, each of the plurality of record entries corresponding to a different one of the more than one record.
    Type: Grant
    Filed: June 12, 2006
    Date of Patent: September 18, 2012
    Assignee: Rainstor Limited
    Inventor: Tom Benjamin Longshaw
  • Patent number: 8250079
    Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including for each document in the collection, reading textual content from the document; filtering the textual content based on user settings; determining N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M near and exact-duplicate documents are identified in the document collection.
    Type: Grant
    Filed: March 30, 2011
    Date of Patent: August 21, 2012
    Assignee: MSC Intellectual Properties B.V.
    Inventors: Johannes C. Scholtes, Siebe Bloembergen
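
The procedure in the abstract for patent 8250079 (take the N most frequent words of a document, then run a quorum search that requires at least M of them to be present) might be sketched as follows. The function names, the simple lowercase word tokenizer, and ranking by the number of matched words are assumptions; the user-configurable filtering step mentioned in the abstract is left out.

```python
import re
from collections import Counter


def words_of(text):
    return re.findall(r"[a-z]+", text.lower())


def top_words(text, n):
    """The n most frequent words of a document, as a set."""
    return {word for word, _ in Counter(words_of(text)).most_common(n)}


def quorum_matches(docs, query_id, n=5, m=4):
    """Ids of documents containing at least m of the query's top-n words.

    Results are sorted by how many of the words matched (a stand-in for
    relevancy), then by id for a stable order.
    """
    query = top_words(docs[query_id], n)
    hits = []
    for doc_id, text in docs.items():
        if doc_id == query_id:
            continue
        score = len(query & set(words_of(text)))
        if score >= m:
            hits.append((score, doc_id))
    return [doc_id for _, doc_id in sorted(hits, key=lambda h: (-h[0], h[1]))]
```

Raising M toward N makes the search stricter and tends toward exact duplicates, while a lower M admits near-duplicates.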
  • Patent number: 8234444
    Abstract: A method to select a deduplication protocol for a data storage library comprising a plurality of data storage devices configured as a RAID array, by establishing a normal deduplication protocol, a RAID failure deduplication protocol, and a multiple storage device failure deduplication protocol. The method receives host data comprising a plurality of interleaved data blocks. If the system is operating without any storage device failures, then the method processes the host data using the normal deduplication protocol. If the system is operating with a storage device failure, then the method processes the host data using the RAID failure deduplication protocol. If the system is operating with multiple storage device failures, then the method processes the host data using the multiple storage device failure deduplication protocol.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: July 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Allen Keith Bates, Nils Haustein, Craig Anthony Klein, Ulf Troppens, Daniel James Winarski
  • Patent number: 8225060
    Abstract: A computer-enabled method of storing an input dataset in a storage medium includes storing a copy for each of a plurality of repeatable blocks of data in an input dataset in a storage medium. The process further includes finding a location in the storage medium of the copy of a block of data in the input dataset. Finding the location includes determining a most likely location in the storage medium of the copy of the block of data from one or more blocks of data preceding the block of data based on statistics of past stored data. Finding the location further includes if the determined most likely location contains a block of data that matches with the actual block of data, retrieving the location in the storage medium of the copy of the block of data. The process also includes storing the location of the copy of the block of data.
    Type: Grant
    Filed: October 16, 2009
    Date of Patent: July 17, 2012
    Inventor: Andrew Leppard
  • Patent number: 8209506
    Abstract: A data de-duplication application de-duplicates redundant data in the pooled storage capacity of a virtualized storage environment. The virtualized storage environment includes a plurality of storage devices and a virtualization or abstraction layer that aggregates all or a portion of the storage capacity of each storage device into a single pool of storage capacity, all or portions of which can be allocated to one or more host systems. For each host system, the virtualization layer presents a representation of at least a portion of the pooled storage capacity wherein the corresponding host system can read and write data. The data de-duplication application identifies redundant data in the pooled storage capacity and replaces it with one or more pointers pointing to a single instance of the data. The de-duplication application can operate on fixed or variable size blocks of data and can de-duplicate data either post-process or in-line.
    Type: Grant
    Filed: September 28, 2007
    Date of Patent: June 26, 2012
    Assignee: EMC Corporation
    Inventor: Jedidiah Yueh
  • Patent number: 8204918
    Abstract: An image forming apparatus, an image forming system and a file managing method thereof include displaying a file list, selecting a deletion target file from the displayed file list, and storing the selected deletion target file in a temporary storing unit.
    Type: Grant
    Filed: July 30, 2008
    Date of Patent: June 19, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Yang-hun Jung
  • Patent number: 8185504
    Abstract: An image processing apparatus including: a correspondence determination unit configured to refer to respective additional information data included in a file and another file and determine whether identical additional information data exists; a size determination unit configured to compare a combined size of the two files with a reference size when judged that identical additional data exists; a flag addition unit configured to add a flag indicating data exempt from search target to identical additional information data included in either one of the two files when judged that the combined size is smaller than the reference size; a deletion unit configured to delete identical additional information data included in either one of the two files when judged that the combined size is equal to or greater than the reference size; and a storing unit configured to store a combined file.
    Type: Grant
    Filed: July 10, 2008
    Date of Patent: May 22, 2012
    Assignee: Canon Kabushiki Kaisha
    Inventor: Yasuhiro Hino
  • Publication number: 20120059800
    Abstract: A system and method for managing a resource reclamation reference list at a coarse level. A storage device is configured to store a plurality of storage objects in a plurality of storage containers, each of said storage containers being configured to store a plurality of said storage objects. A storage container reference list is maintained, wherein for each of the storage containers the storage container reference list identifies which files of a plurality of files reference a storage object within a given storage container. In response to detecting deletion of a given file that references an object within a particular storage container of the storage containers, a server is configured to update the storage container reference list by removing from the storage container reference list an identification of the given file. A reference list associating segment objects with files that reference those segment objects may not be updated in response to the deletion.
    Type: Application
    Filed: September 3, 2010
    Publication date: March 8, 2012
    Inventor: Fanglu Guo
  • Patent number: 8131687
    Abstract: A method for deduplicating and managing data blocks within a file system includes adding a deduplication identifier to each pointer pointing to a data block to indicate whether the data block is deduplicated, detecting duplicate data blocks, determining whether one of the duplicate data blocks has been deduplicated, when detected, determining that one duplicate data block is a master copy when it is determined that one duplicate data block has been deduplicated, selecting one of the duplicate data blocks to be a master copy when it is determined that the duplicate data blocks have not been deduplicated, and setting the deduplication identifier of the selected duplicate data block to indicate deduplication, and determining that the other duplicate data block is a new duplicate data block and setting the deduplication identifier of the other duplicate data block to indicate deduplication and directing the respective pointer to the master copy.
    Type: Grant
    Filed: November 13, 2008
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Allen K. Bates, Nils Haustein, Craig A. Klein, Frank Krick, Ulf Troppens, Daniel Winarski
  • Patent number: 8131685
    Abstract: A system matches accounts based on attributes of the accounts, and scores the matched account pairs based on a probability of the matched accounts being duplicate accounts. The system can utilize the matched and scored account pairs to determine duplicate accounts, and terminate at least one of the accounts in a duplicate account pair.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: March 6, 2012
    Assignee: Google Inc.
    Inventors: Joel Gedalius, Brian Sinay, Naval Verma, Julian Wong
  • Patent number: 8108353
    Abstract: The invention provides a method and apparatus for determining sizing of chunk portions in data de-duplication. The method chunks input data into segments where each segment has a first size, assigns an identifier to each of the data segments, assigns an index to each of the identifiers, creates a suffix structure and a longest common prefix structure from the indexes, detects repeated sequences of indexes and non-repeated indexes from the suffix structure and the longest common prefix structure, determines a second size based on said detected repeated sequences and non-repeated indexes, and chunks the input data into a second plurality of data segments each having the second size.
    Type: Grant
    Filed: June 11, 2008
    Date of Patent: January 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Subashini Balachandran, Mihail Corneliu Constantinescu, Jan Hendrik Pieper
  • Patent number: 8095530
    Abstract: A computer-implemented method includes receiving a plurality of character strings. The number of strings (M) in the plurality of strings having a unique substring of X characters at an extremity of the string is determined, the number of strings (N) in the plurality of strings having at least X characters in the string is determined. A probability is determined, based on a predetermined model for a distribution of characters in the strings, that the unique substring of X characters would occur M or more times out of the N strings, given that the unique character string occurs at least once. Based on the probability, the number M, and the number N, it is determined that the unique character string is a significant affix in the plurality of character strings, and the unique character string is stored.
    Type: Grant
    Filed: July 21, 2008
    Date of Patent: January 10, 2012
    Assignee: Google Inc.
    Inventor: Matthew Lloyd
  • Patent number: 8078814
    Abstract: Provided is a copy pair monitoring method which is for a storage system having at least one host computer, at least one storage subsystem, and a management computer, the storage subsystem including volumes storing data requested by the host computer to be written, the management computer being accessible to the host computer and the storage subsystem. The copy pair monitoring method is characterized by including the steps of: obtaining every piece of copy pair definition information that is stored in the host computer; removing duplicate copy pair definition information from the whole copy pair definition information obtained; and collecting the copy pair status based on the obtained copy pair definition information from which duplicate copy pair definition information has been removed.
    Type: Grant
    Filed: November 15, 2006
    Date of Patent: December 13, 2011
    Assignee: Hitachi, Ltd.
    Inventors: Hironori Emaru, Yuichi Yagawa, Hiroyuki Inoue
  • Patent number: 8060469
    Abstract: A file containing proprietary content can be protected against unauthorized duplication via file sharing between remote computers connected to an Internet swapping service. To this end, the content to be protected is searched on the Internet, at least the hash ID of each data record offered as a search hit is stored, this hash ID is linked to substitute content data, and queries of remote computers for the file to be protected are responded to by offering the modified data record.
    Type: Grant
    Filed: March 27, 2009
    Date of Patent: November 15, 2011
    Assignee: Arvato Storage media GmbH
    Inventors: Mario Dzeko, Jens Maukisch, Sebastian Uhl
  • Patent number: 8037029
    Abstract: A records management system and method includes sending periodic notifications to record owners and managers when their records are under a hold order. Also, return receipts in response to an e-mail message related to a record are automatically declared as records themselves and linked to the original record.
    Type: Grant
    Filed: October 10, 2006
    Date of Patent: October 11, 2011
    Assignee: International Business Machines Corporation
    Inventors: Tod Andrew DeBie, Ivan Chi Wei Lee, Tina Joyce Lustig, Bao Vu, Hsien-Rong Yang
  • Patent number: 8024572
    Abstract: A system and method for data storage and removal includes providing databases and providing encryption keys. Each database is associated with a database time period and each encryption key is associated with an encryption time period. Data items are received and each data item is encrypted using the encryption key associated with the encryption time period that corresponds to a time associated with the data item. Each encrypted data item is stored in the database associated with the database time period that corresponds to the time associated with the data item. Each encryption key is deactivated at a predetermined time after the associated encryption time period ends. Each database is made irretrievable upon a determination that all of the encryption keys associated with the data items stored in that database have been deactivated.
    Type: Grant
    Filed: December 22, 2004
    Date of Patent: September 20, 2011
    Assignee: AOL Inc.
    Inventor: Harmannus Vandermolen
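The bookkeeping behind the abstract above can be sketched without any real cryptography. The sketch below only tracks which encryption-period keys were used for items stored in each database-period database, and reports a database as irretrievable once all of those keys have been deactivated. The class name `PeriodStore`, the use of integer timestamps, and the single `retention` delay are assumptions for illustration; no data is actually encrypted here.

```python
from dataclasses import dataclass, field


@dataclass
class PeriodStore:
    db_period: int   # length of one database time period, in arbitrary time units
    key_period: int  # length of one encryption time period, same units
    retention: int   # delay between a key period ending and the key's deactivation
    db_keys: dict = field(default_factory=dict)  # database index -> set of key indices

    def record(self, t):
        """Note that an item with timestamp t went into database db under key `key`."""
        db, key = t // self.db_period, t // self.key_period
        self.db_keys.setdefault(db, set()).add(key)
        return db, key

    def key_deactivated(self, key, now):
        """A key is deactivated `retention` units after its encryption period ends."""
        period_end = (key + 1) * self.key_period
        return now >= period_end + self.retention

    def database_irretrievable(self, db, now):
        """True once every key used for items in this database is deactivated."""
        keys = self.db_keys.get(db, set())
        return bool(keys) and all(self.key_deactivated(k, now) for k in keys)
```

With `PeriodStore(db_period=30, key_period=10, retention=5)`, items recorded at times 3 and 25 share database 0 but use keys 0 and 2, so database 0 becomes irretrievable only at time 35, when key 2 is deactivated.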
  • Patent number: 8024300
    Abstract: In an image forming apparatus, a first document manager stores image data in an image storage device, associates the image data with a first identifier, outputs the image data via an image output device when the first identifier is specified via an operation device, and deletes the image data from the image storage device when a first condition is satisfied. A second document manager associates the image data with a second identifier, outputs the second identifier via the image output device, and outputs the image data via the image output device when the second identifier is input via an image input device. The first document manager does not delete the image data from the image storage device and prohibits the operation device from specifying the first identifier when the image data is associated with both the first and second identifiers.
    Type: Grant
    Filed: July 30, 2008
    Date of Patent: September 20, 2011
    Assignee: Ricoh Company Limited
    Inventor: Hiroki Hiraguchi
  • Patent number: 8015146
    Abstract: In a networked information system, a portion of the information processing is offloaded from servers to a storage system to reduce network traffic and conserve server resources. The information system includes a storage system storing files or objects and having a function which automatically extracts portions of text from the files and transmits the extracted text to the servers. The text extraction is responsive to file requests from the servers. The extracted text and files are stored on the storage system, decreasing the need to send entire files across the network. Thus, by transmitting smaller extracted text data instead of entire files over the network, network performance can be increased through the reduction of traffic. Additionally, the processing strain on physical resources of the servers can be reduced by extracting the text at the storage system rather than at the servers.
    Type: Grant
    Filed: June 16, 2008
    Date of Patent: September 6, 2011
    Assignee: Hitachi, Ltd.
    Inventor: Yasuyuki Mimatsu
  • Publication number: 20110191302
    Abstract: A storage system 1 including: a plurality of data storage systems 3000 each including a storage apparatus 5000 providing a data storage area to an external apparatus 1000, and an information processor 4000 controlling data input and output between the external apparatus and the storage apparatus, the storage system comprising: a data attribute information retention part 4420 holding data delete allow/disallow information 4423 and data attribute information 4421 and 4424; a management information retention part 2420 that is a list holding at least one of data attribute information 2421, and data storage location information 2422 of each piece of data; and a data delete control part 4440 receiving a data delete command and controlling a process of deleting the data stored in the storage apparatus based on the command.
    Type: Application
    Filed: April 24, 2009
    Publication date: August 4, 2011
    Inventors: Hiroshi Nasu, Yuri Hiraiwa
  • Patent number: 7991746
    Abstract: A storage system comprising apparatus for consolidating portions of free space from a plurality of remote storage units; and apparatus for presenting the consolidated portions as a single file system.
    Type: Grant
    Filed: April 8, 2008
    Date of Patent: August 2, 2011
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Rajesh Anantha Krishnaiyer, Subramaniam Venkata Kalambur, Puneet Puri
  • Patent number: 7984022
    Abstract: Provided are techniques for space recovery with storage management coupled with a deduplicating storage system. A notification is received that one or more data objects have been logically deleted by deleting metadata about the one or more data objects, wherein the notification provides storage locations within one or more logical storage volumes corresponding to the deleted one or more data objects, wherein each of the one or more data objects are divided into one or more extents. In response to determining that a sparse file represents the one or more logical storage volumes, physical space is deallocated by nulling out space in the sparse file corresponding to each of the one or more extents.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: July 19, 2011
    Assignee: International Business Machines Corporation
    Inventors: David Maxwell Cannon, Mark Andrew Smith
  • Patent number: 7979400
    Abstract: A database spread over multiple nodes allows each node to store a journal recording changes made to the database and also allows a journaling component to manage the memory space available for journaling. Two threshold size values may be specified for the journal. The first threshold value specifies a journal size at which to begin pruning the journal on a given node. A journal pruning algorithm may be used to identify journal entries that may be removed. For example, once a given transaction completes (i.e., commits) the journal entries related to that transaction may be pruned from the journal. The second threshold value specifies the maximum size of the journal. After reaching this size, journal entries may be written to disk instead of the in-memory journal.
    Type: Grant
    Filed: June 10, 2008
    Date of Patent: July 12, 2011
    Assignee: International Business Machines Corporation
    Inventors: Eric Lawrence Barsness, David L. Darrington, Amanda Peters, John Matthew Santosuosso
  • Patent number: 7979399
    Abstract: A database spread over multiple nodes allows each node to store a journal recording changes made to the database and also allows a journaling component to manage the memory space available for journaling. Two threshold size values may be specified for the journal. The first threshold value specifies a journal size at which to begin pruning the journal on a given node. A journal pruning algorithm may be used to identify journal entries that may be removed. For example, once a given transaction completes (i.e., commits) the journal entries related to that transaction may be pruned from the journal. The second threshold value specifies the maximum size of the journal. After reaching this size, journal entries may be written to disk instead of the in-memory journal.
    Type: Grant
    Filed: June 10, 2008
    Date of Patent: July 12, 2011
    Assignee: International Business Machines Corporation
    Inventors: Eric Lawrence Barsness, David L. Darrington, Amanda Peters, John Matthew Santosuosso
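The two-threshold journal described in the abstract above can be sketched as a small in-memory class. Several details are assumptions for illustration: journal size is measured as a count of entries rather than bytes, pruning removes every entry whose transaction has committed, and a plain list named `spilled` stands in for the on-disk journal.

```python
class Journal:
    def __init__(self, prune_threshold, max_size):
        if prune_threshold > max_size:
            raise ValueError("prune_threshold must not exceed max_size")
        self.prune_threshold = prune_threshold  # first threshold: start pruning
        self.max_size = max_size                # second threshold: stop growing in memory
        self.entries = []      # in-memory journal of (txn_id, change) tuples
        self.spilled = []      # stand-in for entries written to disk
        self.committed = set()

    def commit(self, txn_id):
        """Mark a transaction as complete so its entries become prunable."""
        self.committed.add(txn_id)

    def prune(self):
        """Drop in-memory entries that belong to committed transactions."""
        self.entries = [e for e in self.entries if e[0] not in self.committed]

    def append(self, txn_id, change):
        """Add an entry, pruning first if needed and spilling if memory is full."""
        if len(self.entries) >= self.prune_threshold:
            self.prune()
        if len(self.entries) >= self.max_size:
            self.spilled.append((txn_id, change))
        else:
            self.entries.append((txn_id, change))
```

Pruning is attempted before the size check, so entries spill to the stand-in disk list only when pruning could not free enough room.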
  • Patent number: 7937371
    Abstract: Data that is to be deduplicated and compressed is received. The data is compressed then deduplicated to generate first compressed then deduplicated data. The data is deduplicated then compressed to generate first deduplicated then compressed data. The first compressed then deduplicated data is stored if the first compressed then deduplicated data is smaller in size than the first deduplicated then compressed data. The first deduplicated then compressed data is stored if the first deduplicated then compressed data is smaller in size than the first compressed then deduplicated data.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: May 3, 2011
    Assignee: International Business Machines Corporation
    Inventors: Allen Keith Bates, Nils Haustein, Craig Anthony Klein, Daniel James Winarski
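The comparison in the abstract above, trying both orderings and keeping the smaller result, can be sketched with the Python standard library. The choices here are assumptions, not details from the patent: deduplication is done over fixed-size chunks keyed by SHA-256, compression uses `zlib`, stored size counts unique chunk bytes plus one digest per chunk reference, and ties go to deduplicate-then-compress because the abstract does not say how ties are handled.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size


def dedup(data, chunk_size=CHUNK_SIZE):
    """Split data into chunks; return (unique chunks by digest, ordered digest list)."""
    store, order = {}, []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        store.setdefault(digest, chunk)
        order.append(digest)
    return store, order


def stored_size(store, order):
    """Bytes needed for the unique chunks plus the list of chunk references."""
    return sum(len(c) for c in store.values()) + sum(len(d) for d in order)


def compress_then_dedup(data):
    store, order = dedup(zlib.compress(data))
    return stored_size(store, order)


def dedup_then_compress(data):
    store, order = dedup(data)
    compressed = {d: zlib.compress(c) for d, c in store.items()}
    return stored_size(compressed, order)


def choose_order(data):
    """Return (name, size) of whichever ordering yields the smaller stored size."""
    a, b = compress_then_dedup(data), dedup_then_compress(data)
    return ("compress_then_dedup", a) if a < b else ("dedup_then_compress", b)
```

Which ordering wins depends on the data, which is why the sketch measures both rather than fixing one in advance.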
  • Patent number: 7930306
    Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including for each document in the collection, reading textual content from the document; filtering the textual content based on user settings; determining N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M, near and exact-duplicate documents are identified in the document collection.
    Type: Grant
    Filed: April 30, 2008
    Date of Patent: April 19, 2011
    Assignee: MSC Intellectual Properties B.V.
    Inventors: Johannes C. Scholtes, Siebe Bloembergen
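The pipeline in the abstract above (filter, take the N most frequent words, run a quorum search with threshold M, sort by relevancy) can be sketched in a few functions. The interpretation is an assumption: a quorum search is taken to mean "match documents containing at least M of the N query words", filtering is reduced to lowercasing plus an optional stopword set, and relevancy is simply the number of query words matched.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_words(text, n, stopwords=frozenset()):
    """Return the n most frequent words after filtering out stopwords."""
    words = [w for w in tokenize(text) if w not in stopwords]
    return [w for w, _ in Counter(words).most_common(n)]


def quorum_matches(query_words, documents, m):
    """Return (doc_id, hits) for documents containing at least m of the query words."""
    results = []
    for doc_id, text in documents.items():
        vocabulary = set(tokenize(text))
        hits = sum(1 for w in query_words if w in vocabulary)
        if hits >= m:
            results.append((doc_id, hits))
    # Sort by assumed relevancy: most matched words first, then by document id.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


def find_duplicates(doc_id, documents, n=5, m=4):
    """Find other documents that share at least m of doc_id's n most frequent words."""
    words = top_words(documents[doc_id], n)
    return [(d, h) for d, h in quorum_matches(words, documents, m) if d != doc_id]
```

Setting M equal to N makes the match strict, while lowering M relative to N admits looser near-duplicates.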
  • Patent number: 7916856
    Abstract: A technique is disclosed for receiving call-control data, often from a variety of sources; processing the data into a format, content, and size that is appropriate for a telecommunications endpoint; and transmitting the processed call-control data to the endpoint. The personal profile manager of the illustrative embodiment is what first acquires the call-control data, which includes a dialing plan. The manager also reformats the call-control data and deletes redundant data. Subsequently, when a request is received from an endpoint, the personal profile manager further processes the call-control data and then transmits, to the requesting endpoint, the portion of the processed data that is appropriate for the endpoint. Transmitting the call-control data to the endpoint offloads some of the processing from the supporting call-processing server, as the data constitute call-control rules that the endpoint uses to make call-control decisions.
    Type: Grant
    Filed: June 6, 2006
    Date of Patent: March 29, 2011
    Assignee: Avaya Inc.
    Inventors: Albert J Baker, Pankaj Omprakash Agrawal, Glen George Freundlich
  • Publication number: 20100332456
    Abstract: Systems and methods are disclosed for performing data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy.
    Type: Application
    Filed: March 31, 2010
    Publication date: December 30, 2010
    Inventors: Anand Prahlad, Marcus S. Muller, Rajiv Kottomtharayil, Srinivas Kavuri, Parag Gokhale, Manoj Vijayan
  • Publication number: 20100306176
    Abstract: Files may be stored by a plurality of director devices. At times, a central director archive may deduplicate a file by removing the file from each of the plurality of director devices that is storing the file. The archive generally stores all files stored by each of the director devices. In one example, a director archive includes a processing unit that generates a list of identifiers of files stored by the director archive device that should be removed from a director device in accordance with policies of the files, wherein the director archive device is configured to store copies of files stored by the director device, and wherein the director device is communicatively coupled to the director archive device, and a network interface that sends the list of identifiers to the director device to cause the director device to delete the files from local storage of the director device.
    Type: Application
    Filed: January 27, 2010
    Publication date: December 2, 2010
    Applicant: Digitiliti, Inc.
    Inventors: Rodd Eric Johnson, Kenneth M. Peters, Brad D. Wenzel
  • Patent number: 7814074
    Abstract: The present invention provides for a system and method for assuring integrity of deduplicated data objects stored within a storage system. A data object is copied to secondary storage media, and a digital signature such as a checksum is generated for the data object. Then, deduplication is performed upon the data object and the data object is split into chunks. The chunks are combined when the data object is subsequently accessed, and a signature is generated for the reassembled data object. The reassembled data object is provided if the newly generated signature is identical to the originally generated signature, and otherwise a backup copy of the data object is provided from secondary storage media.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: October 12, 2010
    Assignee: International Business Machines Corporation
    Inventors: Matthew J. Anglin, David M. Cannon
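The verify-on-read flow in the abstract above can be sketched as a toy store. The specifics are assumptions for illustration: SHA-256 serves as the signature, chunks are fixed-size, and an in-memory dictionary named `backup` stands in for the secondary storage media. The class and method names are hypothetical.

```python
import hashlib


class DedupStore:
    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}   # chunk digest -> chunk bytes (shared across objects)
        self.objects = {}  # object name -> (whole-object signature, list of chunk digests)
        self.backup = {}   # object name -> full copy, standing in for secondary media

    def put(self, name, data):
        """Copy the object to backup, sign it, then split it into deduplicated chunks."""
        self.backup[name] = data
        signature = hashlib.sha256(data).hexdigest()
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.objects[name] = (signature, digests)

    def get(self, name):
        """Reassemble the object; fall back to the backup copy if the signature differs."""
        signature, digests = self.objects[name]
        data = b"".join(self.chunks[d] for d in digests)
        if hashlib.sha256(data).hexdigest() == signature:
            return data
        return self.backup[name]
```

Signing the whole object before chunking means a damaged or mismatched chunk is caught at read time, regardless of which chunk went wrong.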