Patents by Inventor Hugo Patterson

Hugo Patterson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8126852
    Abstract: A method of determining whether a data segment is a duplicate using cooperating deduplicators is disclosed. The data segment is received. A first deduplicator is operated to determine whether the incoming data segment is a duplicate based on first information available to the first deduplicator regarding stored data segments that are stored in a memory. A second deduplicator is selectively operated to determine whether the incoming data segment is a duplicate based on second information available to the second deduplicator, wherein the selective operation of the second deduplicator depends on the determination made by the first deduplicator.
    Type: Grant
    Filed: December 1, 2008
    Date of Patent: February 28, 2012
    Assignee: EMC Corporation
    Inventor: R. Hugo Patterson
  • Publication number: 20120041957
    Abstract: Techniques for efficiently indexing and searching similar data are described herein. According to one embodiment, in response to a query for one or more terms received from a client, a query index is accessed to retrieve a list of one or more super files. Each super file is associated with a group of similar files. Each super file includes terms and/or sequences of terms obtained from the associated group of similar files. Thereafter, the super files representing groups of similar files are presented to the client, where each of the super files includes at least one of the queried terms. Other methods and apparatuses are also described.
    Type: Application
    Filed: October 24, 2011
    Publication date: February 16, 2012
    Inventors: Windsor W. Hsu, R. Hugo Patterson
  • Patent number: 8099401
    Abstract: Techniques for efficiently indexing and searching similar data are described herein. According to one embodiment, in response to a query for one or more terms received from a client, a query index is accessed to retrieve a list of one or more super files. Each super file is associated with a group of similar files. Each super file includes terms and/or sequences of terms obtained from the associated group of similar files. Thereafter, the super files representing groups of similar files are presented to the client, where each of the super files includes at least one of the queried terms. Other methods and apparatuses are also described.
    Type: Grant
    Filed: July 18, 2007
    Date of Patent: January 17, 2012
    Assignee: EMC Corporation
    Inventors: Windsor W. Hsu, R. Hugo Patterson
  • Publication number: 20110307530
    Abstract: A method and apparatus for different embodiments of incremental garbage collection of data in a secondary storage. In one embodiment, a method comprises locating blocks of data in a log that are referenced and within a range at a tail of the log. The method also includes copying the blocks of data that are referenced and within the range to an unallocated segment of the log.
    Type: Application
    Filed: August 23, 2011
    Publication date: December 15, 2011
    Inventor: R. Hugo Patterson
  • Publication number: 20110302326
    Abstract: Selecting a segment boundary within block b is disclosed. A first anchor location j|j+1 is identified wherein a value of f(b[j−A+1 . . . j+B]) satisfies a constraint and wherein A and B are non-negative integers. A segment boundary location k|k+1 is determined wherein k is greater than minimum distance from j.
    Type: Application
    Filed: June 2, 2011
    Publication date: December 8, 2011
    Applicant: EMC CORPORATION
    Inventors: Kai Li, Umesh Maheshwari, R. Hugo Patterson
  • Publication number: 20110270887
    Abstract: Cluster storage is disclosed. A data stream or a data block is received. The data stream or the data block is broken into segments. For each segment, a cluster node is selected, and a portion of the segment smaller than the segment is identified that is a duplicate of a portion of a segment already managed by the cluster node.
    Type: Application
    Filed: July 8, 2011
    Publication date: November 3, 2011
    Applicant: EMC CORPORATION
    Inventors: Sazzala Reddy, Umesh Maheshwari, Edward K. Lee, R. Hugo Patterson
  • Patent number: 8028009
    Abstract: A method and apparatus for different embodiments of incremental garbage collection of data in a secondary storage. In one embodiment, a method comprises locating blocks of data in a log that are referenced and within a range at a tail of the log. The method also includes copying the blocks of data that are referenced and within the range to an unallocated segment of the log.
    Type: Grant
    Filed: November 10, 2008
    Date of Patent: September 27, 2011
    Assignee: EMC Corporation
    Inventor: R. Hugo Patterson
  • Patent number: 8005861
    Abstract: Cluster storage is disclosed. A data stream or a data block is received. The data stream or the data block is broken into segments. For each segment, a cluster node is selected, and a portion of the segment smaller than the segment is identified that is a duplicate of a portion of a segment already managed by the cluster node.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: August 23, 2011
    Assignee: EMC Corporation
    Inventors: Sazzala Reddy, Umesh Maheshwari, Edward K. Lee, R. Hugo Patterson
  • Publication number: 20110196869
    Abstract: Storage of data segments is disclosed. For each segment, a similar segment to the segment is identified, wherein the similar segment is already managed by a cluster node. In the event the similar segment is identified, a reference to the similar segment and a delta between the similar segment and the segment are caused to be stored instead of the segment.
    Type: Application
    Filed: April 19, 2011
    Publication date: August 11, 2011
    Applicant: EMC CORPORATION
    Inventors: R. Hugo Patterson, Kai Li, Ming Benjamin Zhu, Sazzala Venkata Reddy, Umesh Maheshwari, Edward K. Lee
  • Patent number: 7979584
    Abstract: Selecting a segment boundary within block b is disclosed. A first anchor location j|j+1 is identified wherein a value of f(b[j−A+1 . . . j+B]) satisfies a constraint and wherein A and B are non-negative integers. A segment boundary location k|k+1 is determined wherein k is greater than minimum distance from j.
    Type: Grant
    Filed: July 14, 2006
    Date of Patent: July 12, 2011
    Assignee: EMC Corporation
    Inventors: Kai Li, Umesh Maheshwari, R. Hugo Patterson
  • Patent number: 7962520
    Abstract: Cluster storage is disclosed. A data stream or a data block is received. The data stream or the data block is broken into segments. For each segment, a cluster node is selected, and in the event that a similar segment to the segment is identified that is already managed by the selected cluster node, a reference to the similar segment and a delta between the similar segment and the segment are caused to be stored on the selected cluster node.
    Type: Grant
    Filed: April 9, 2008
    Date of Patent: June 14, 2011
    Assignee: EMC Corporation
    Inventors: R. Hugo Patterson, Kai Li, Ming Benjamin Zhu, Sazzala Venkata Reddy, Umesh Maheshwari, Edward K. Lee
  • Publication number: 20110072227
    Abstract: A system for storing data comprises a performance storage unit for storing a data stream or a data block. The data stream or the data block comprises one or more data items. The system further comprises a segment storage system for automatically storing a stored data item of the one or more data items as a set of segments. The system further comprises a performance segment storage unit for storing the set of segments in the event that the stored data item has been stored using the segment storage system.
    Type: Application
    Filed: September 21, 2010
    Publication date: March 24, 2011
    Applicant: EMC CORPORATION
    Inventor: R. Hugo Patterson
  • Publication number: 20110071980
    Abstract: A system for storing data comprises a performance storage unit and a performance segment storage unit. The system further comprises a determiner. The determiner determines whether requested data is stored in the performance storage unit. The determiner determines whether the requested data is stored in the performance segment storage unit in the event that the requested data is not stored in the performance storage unit.
    Type: Application
    Filed: September 21, 2010
    Publication date: March 24, 2011
    Applicant: EMC CORPORATION
    Inventor: R. Hugo Patterson
  • Publication number: 20110072226
    Abstract: A system for storing data comprises a performance storage system for storing one or more data items. A data item of the one or more data items comprises a data file or a data block. The system further comprises a segment storage system for storing a snapshot of a stored data item of the one or more data items in the performance storage system. The taking of the snapshot of the stored data item enables recall of the stored data item as stored at the time of the snapshot. At least one newly written segment is stored as a reference to a previously stored segment.
    Type: Application
    Filed: September 21, 2010
    Publication date: March 24, 2011
    Applicant: EMC CORPORATION
    Inventor: R. Hugo Patterson
  • Patent number: 7882064
    Abstract: File system replication includes determining whether one of a plurality of files included in an original file system has been updated since a previous replication, the file having a plurality of data segments, and in the event that the file has been updated, locating among the plurality of data segments a previously stored data segment that is newly referenced by the file, and that does not require replication.
    Type: Grant
    Filed: July 6, 2006
    Date of Patent: February 1, 2011
    Assignee: EMC Corporation
    Inventors: Edward K. Lee, Ming Benjamin Zhu, Umesh Maheshwari, R. Hugo Patterson
  • Publication number: 20110016083
    Abstract: Seeding replication is disclosed. One or more but not all files stored on a deduplicating storage system are selected to be replicated. One or more segments referred to by the selected one or more but not all files are determined. A data structure is created that is used to indicate that at least the one or more segments are to be replicated. In the event that an indication based at least in part on the data structure indicates that a candidate segment stored on the deduplicating storage system is to be replicated, the candidate segment is replicated.
    Type: Application
    Filed: September 26, 2010
    Publication date: January 20, 2011
    Applicant: EMC CORPORATION
    Inventor: R. Hugo Patterson
  • Patent number: 7870103
    Abstract: A method for tolerating collisions of identifiers for data segments is disclosed. The method comprises combining a primary identifier and secondary identifier of a first segment to make a combined identifier of the first segment and combining a primary identifier and secondary identifier of a second segment to make a combined identifier of the second segment. The method further comprises determining if the combined identifier of the first segment is the same as the combined identifier of the second segment.
    Type: Grant
    Filed: October 13, 2005
    Date of Patent: January 11, 2011
    Assignee: EMC Corporation
    Inventors: Umesh Maheshwari, R. Hugo Patterson
  • Publication number: 20100332452
    Abstract: A system for storing files comprises a processor and a memory. The processor is configured to break a file into one or more segments; store the one or more segments in a first storage unit; and add metadata to the first storage unit so that the file can be accessed independent of a second storage unit, wherein a single namespace enables access for files stored in the first storage unit and the second storage unit.
    Type: Application
    Filed: June 25, 2009
    Publication date: December 30, 2010
    Inventors: Windsor W. Hsu, R. Hugo Patterson
  • Patent number: 7827137
    Abstract: Seeding replication is disclosed. One or more but not all files stored on a deduplicating storage system are selected to be replicated. One or more segments referred to by the selected one or more but not all files are determined. A data structure is created that is used to indicate that at least the one or more segments are to be replicated. In the event that an indication based at least in part on the data structure indicates that a candidate segment stored on the deduplicating storage system is to be replicated, the candidate segment is replicated.
    Type: Grant
    Filed: May 24, 2007
    Date of Patent: November 2, 2010
    Assignee: EMC Corporation
    Inventor: R. Hugo Patterson
  • Publication number: 20100257315
    Abstract: A system and method are disclosed for providing efficient data storage. A plurality of data segments is received in a data stream. The system determines whether a data segment has been stored previously in a low latency memory. In the event that the data segment is determined to have been stored previously, an identifier for the previously stored data segment is returned.
    Type: Application
    Filed: June 21, 2010
    Publication date: October 7, 2010
    Applicant: EMC CORPORATION
    Inventors: Ming Benjamin Zhu, Kai Li, R. Hugo Patterson
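
Illustrative sketch (not taken from any of the patents listed above): the abstract of publication 20100257315 describes checking whether an incoming data segment has been stored previously and, if so, returning an identifier for the previously stored segment. The Python below is a minimal sketch of that general idea under stated assumptions. The class name SegmentStore, the function store_stream, the use of SHA-256 fingerprints, and the in-memory dictionary standing in for the "low latency memory" are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import hashlib
from typing import Iterable, List, Tuple


class SegmentStore:
    """Toy deduplicating store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # Assumption: a dict of fingerprints stands in for the low-latency memory.
        self._index = {}
        # Assumption: a list stands in for the slower backing store.
        self._segments = []

    def put(self, segment: bytes) -> Tuple[int, bool]:
        """Return (identifier, was_duplicate) for the given segment."""
        # Assumption: SHA-256 is used as the segment fingerprint.
        fingerprint = hashlib.sha256(segment).digest()
        existing = self._index.get(fingerprint)
        if existing is not None:
            # Segment seen before: return the identifier of the stored copy.
            return existing, True
        # New segment: store it and record its fingerprint.
        identifier = len(self._segments)
        self._segments.append(segment)
        self._index[fingerprint] = identifier
        return identifier, False

    def stored_count(self) -> int:
        """Number of unique segments actually stored."""
        return len(self._segments)


def store_stream(store: SegmentStore, segments: Iterable[bytes]) -> List[int]:
    """Store each segment of a stream and collect the returned identifiers."""
    return [store.put(segment)[0] for segment in segments]
```

In this sketch, a stream whose first and third segments are identical yields the same identifier for both, and only two unique segments end up in the backing list. A production system would also have to handle fingerprint collisions and an index too large for memory, which this sketch ignores.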