Patents by Inventor Mark S. Manasse

Mark S. Manasse has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9934311
    Abstract: Weighted features associated with a document are scaled using scales to generate a set of unweighted elements for each scale. A sketch is generated for each scale by sampling the unweighted elements generated for the scale. The scales are chosen based on a selected cutoff factor so that documents that have a similarity that is less than the cutoff factor might have no scales in common, while documents that have a similarity that is greater than the cutoff factor will have sufficiently many scales (and at least one) in common. The similarity of these documents can be estimated using the sketches associated with each of the documents for the common scales.
    Type: Grant
    Filed: April 24, 2014
    Date of Patent: April 3, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bernhard Haeupler, Kunal Talwar, Mark S. Manasse
  • Patent number: 9189488
    Abstract: Hash values corresponding to a file are processed in windows to determine a minimum hash value for each window. Each window may begin at a minimum hash value determined for a previous window and end after a fixed number of hash values. If a hash value is less than a threshold hash value, it is added to a buffer that is used to store the hash values in sorted order for a current window. If a hash value is greater than the threshold, it is added to another buffer whose hash values are not stored in sorted order. At the end of the current window, the minimum hash value in the first buffer is selected as the landmark for the window. If the first buffer is empty, then the hash values in the other buffer are sorted and the minimum hash value is selected as the landmark for the window.
    Type: Grant
    Filed: April 7, 2011
    Date of Patent: November 17, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mark S. Manasse, Arnd Christian König, Paul Adrian Oltean
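The two-buffer landmark selection described in the abstract of patent 9189488 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `select_landmarks`, the parameters `window_size` and `threshold`, and the choice to start the next window just after the previous landmark are all assumptions made for the example.

```python
import bisect

def select_landmarks(hashes, window_size, threshold):
    """Return the index of the minimum hash value (the landmark) of each window.

    Assumption: the next window starts at the position just after the
    previous landmark, so every iteration makes progress.
    """
    landmarks = []
    start = 0
    while start < len(hashes):
        end = min(start + window_size, len(hashes))
        sorted_buf = []    # values below the threshold, kept in sorted order
        unsorted_buf = []  # values at or above the threshold, insertion order
        for i in range(start, end):
            entry = (hashes[i], i)
            if hashes[i] < threshold:
                bisect.insort(sorted_buf, entry)
            else:
                unsorted_buf.append(entry)
        if sorted_buf:
            # Common case: the minimum is the head of the sorted buffer.
            _, idx = sorted_buf[0]
        else:
            # Fallback: no small value was seen, so sort the other buffer now.
            unsorted_buf.sort()
            _, idx = unsorted_buf[0]
        landmarks.append(idx)
        start = idx + 1
    return landmarks
```

For example, `select_landmarks([50, 10, 70, 90, 80, 95, 20, 60], 3, 40)` picks one landmark index per window. The point of the split is that the sorted buffer stays small when the threshold is low, and the larger buffer is sorted only in the less common case where no hash value fell below the threshold.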
  • Publication number: 20150310102
    Abstract: Weighted features associated with a document are scaled using scales to generate a set of unweighted elements for each scale. A sketch is generated for each scale by sampling the unweighted elements generated for the scale. The scales are chosen based on a selected cutoff factor so that documents that have a similarity that is less than the cutoff factor might have no scales in common, while documents that have a similarity that is greater than the cutoff factor will have sufficiently many scales (and at least one) in common. The similarity of these documents can be estimated using the sketches associated with each of the documents for the common scales.
    Type: Application
    Filed: April 24, 2014
    Publication date: October 29, 2015
    Applicant: Microsoft Corporation
    Inventors: Bernhard Haeupler, Kunal Talwar, Mark S. Manasse
  • Patent number: 8972649
    Abstract: A generator matrix is provided to generate codewords from messages of write operations. Rather than generate a codeword using the entire generator matrix, some number of bits of the codeword are determined to be, or designated as, stuck bits. One or more submatrices of the generator matrix are determined based on the columns of the generator matrix that correspond to the stuck bits. The submatrices are used to generate the codeword from the message, and only the bits of the codeword that are not the stuck bits are written to a memory block. By designating one or more bits as stuck bits, the operating life of the bits is increased. Some of the submatrices of the generator matrix may be pre-computed for different stuck bit combinations. The pre-computed submatrices may be used to generate the codewords, thereby increasing the performance of write operations.
    Type: Grant
    Filed: October 5, 2012
    Date of Patent: March 3, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: John D. Davis, Parikshit Gopalan, Mark S. Manasse, Karin Strauss, Sergey Yekhanin
  • Publication number: 20140101366
    Abstract: A generator matrix is provided to generate codewords from messages of write operations. Rather than generate a codeword using the entire generator matrix, some number of bits of the codeword are determined to be, or designated as, stuck bits. One or more submatrices of the generator matrix are determined based on the columns of the generator matrix that correspond to the stuck bits. The submatrices are used to generate the codeword from the message, and only the bits of the codeword that are not the stuck bits are written to a memory block. By designating one or more bits as stuck bits, the operating life of the bits is increased. Some of the submatrices of the generator matrix may be pre-computed for different stuck bit combinations. The pre-computed submatrices may be used to generate the codewords, thereby increasing the performance of write operations.
    Type: Application
    Filed: October 5, 2012
    Publication date: April 10, 2014
    Applicant: Microsoft Corporation
    Inventors: John D. Davis, Parikshit Gopalan, Mark S. Manasse, Karin Strauss, Sergey Yekhanin
  • Patent number: 8594239
    Abstract: Each of a plurality of documents is divided into samples. Small bit-strings are generated for selected samples from each of the documents and used to create a sketch for each document. Because the bit-strings are small (e.g., only one, two, or three bits in length), the generated sketches are smaller than the sketches generated using previous methods for generating sketches, and therefore use less storage space. The generated sketches are compared to determine documents that are near-duplicates of one another.
    Type: Grant
    Filed: February 21, 2011
    Date of Patent: November 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Mark S. Manasse, Arnd Christian König
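As a rough illustration of the small-bit-string idea in the abstract of patent 8594239, the sketch below keeps only the low `b` bits of each sample value and compares two such sketches. The helper names and the chance-collision correction (subtracting the 2^-b probability that two unrelated b-bit values agree) are assumptions for the example; the abstract does not specify how the comparison is scored.

```python
def truncate(sample_values, b):
    """Keep only the low b bits of each integer sample value."""
    mask = (1 << b) - 1
    return [v & mask for v in sample_values]

def estimate_similarity(sketch_a, sketch_b, b):
    """Estimate similarity from two b-bit sketches of equal length.

    Assumption: unrelated samples still agree with probability 2**-b,
    so that chance agreement is subtracted out and the result rescaled.
    """
    if len(sketch_a) != len(sketch_b) or not sketch_a:
        raise ValueError("sketches must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(sketch_a, sketch_b))
    match_rate = matches / len(sketch_a)
    chance = 2.0 ** -b
    return max(0.0, (match_rate - chance) / (1.0 - chance))
```

With `b = 2`, each sample costs two bits of storage instead of a full hash value. The trade-off is that more samples are needed for the same accuracy, because each individual match carries less evidence.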
  • Publication number: 20120259897
    Abstract: Hash values corresponding to a file are processed in windows to determine a minimum hash value for each window. Each window may begin at a minimum hash value determined for a previous window and end after a fixed number of hash values. If a hash value is less than a threshold hash value, it is added to a buffer that is used to store the hash values in sorted order for a current window. If a hash value is greater than the threshold, it is added to another buffer whose hash values are not stored in sorted order. At the end of the current window, the minimum hash value in the first buffer is selected as the landmark for the window. If the first buffer is empty, then the hash values in the other buffer are sorted and the minimum hash value is selected as the landmark for the window.
    Type: Application
    Filed: April 7, 2011
    Publication date: October 11, 2012
    Applicant: Microsoft Corporation
    Inventors: Mark S. Manasse, Arnd Christian König, Paul Adrian Oltean
  • Publication number: 20120213313
    Abstract: Each of a plurality of documents is divided into samples. Small bit-strings are generated for selected samples from each of the documents and used to create a sketch for each document. Because the bit-strings are small (e.g., only one, two, or three bits in length), the generated sketches are smaller than the sketches generated using previous methods for generating sketches, and therefore use less storage space. The generated sketches are compared to determine documents that are near-duplicates of one another.
    Type: Application
    Filed: February 21, 2011
    Publication date: August 23, 2012
    Applicant: Microsoft Corporation
    Inventors: Mark S. Manasse, Arnd Christian König
  • Patent number: 8112496
    Abstract: The present invention finds candidate objects for remote differential compression. Objects are updated between two or more computing devices using remote differential compression (RDC) techniques such that required data transfers are minimized. An algorithm provides enhanced efficiencies for allowing the receiver to locate a set of objects that are similar to the object that needs to be transferred from the sender. Once this set of similar objects has been found, the receiver may reuse any chunks from these objects during the RDC algorithm.
    Type: Grant
    Filed: July 31, 2009
    Date of Patent: February 7, 2012
    Assignee: Microsoft Corporation
    Inventors: Mark S. Manasse, Dan Teodosiu, Akhil Wable
  • Publication number: 20100064141
    Abstract: The present invention finds candidate objects for remote differential compression. Objects are updated between two or more computing devices using remote differential compression (RDC) techniques such that required data transfers are minimized. An algorithm provides enhanced efficiencies for allowing the receiver to locate a set of objects that are similar to the object that needs to be transferred from the sender. Once this set of similar objects has been found, the receiver may reuse any chunks from these objects during the RDC algorithm.
    Type: Application
    Filed: July 31, 2009
    Publication date: March 11, 2010
    Applicant: Microsoft Corporation
    Inventors: Mark S. Manasse, Dan Teodosiu, Akhil Wable
  • Patent number: 7613787
    Abstract: The present invention finds candidate objects for remote differential compression. Objects are updated between two or more computing devices using remote differential compression (RDC) techniques such that required data transfers are minimized. An algorithm provides enhanced efficiencies for allowing the receiver to locate a set of objects that are similar to the object that needs to be transferred from the sender. Once this set of similar objects has been found, the receiver may reuse any chunks from these objects during the RDC algorithm.
    Type: Grant
    Filed: September 24, 2004
    Date of Patent: November 3, 2009
    Assignee: Microsoft Corporation
    Inventors: Mark S. Manasse, Dan Teodosiu, Akhil Wable
  • Patent number: 7603370
    Abstract: A method detects similar objects in a collection of such objects by modification of a previous method in such a way that per-object memory requirements are reduced while false detections are avoided approximately as well as in the previous method. The modification includes (i) combining k samples of features into s supersamples, the value of k being reduced from the corresponding value used in the previous method; (ii) recording each supersample to b bits of precision, the value of b being reduced from the corresponding value used in the previous method; and (iii) requiring l matching supersamples in order to conclude that the two objects are sufficiently similar, the value of l being greater than the corresponding value required in the previous method. One application of the invention is in association with a web search engine query service to determine clusters of query results that are near-duplicate documents.
    Type: Grant
    Filed: March 22, 2004
    Date of Patent: October 13, 2009
    Assignee: Microsoft Corporation
    Inventor: Mark S. Manasse
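The roles of the parameters k, s, b, and l in the abstract of patent 7603370 can be illustrated with a small sketch. It assumes the per-object samples are already available as 64-bit integers and uses SHA-256 to combine each group, which is a choice made for the example; the abstract does not name a hash function, and the helper names are hypothetical.

```python
import hashlib

def supersamples(samples, k, s, bits):
    """Combine s groups of k samples each into s supersamples of `bits` bits."""
    if len(samples) != k * s:
        raise ValueError("expected exactly k * s samples")
    mask = (1 << bits) - 1
    result = []
    for g in range(s):
        group = samples[g * k:(g + 1) * k]
        data = b"".join(v.to_bytes(8, "big") for v in group)
        digest = hashlib.sha256(data).digest()
        # Record only `bits` bits of the combined value.
        result.append(int.from_bytes(digest[:8], "big") & mask)
    return result

def sufficiently_similar(super_a, super_b, l):
    """Declare two objects similar when at least l supersamples match."""
    return sum(x == y for x, y in zip(super_a, super_b)) >= l
```

A supersample matches only when all k of its underlying samples match, so lowering k makes each match easier, while storing fewer bits per supersample saves memory. Requiring more matches (a larger l) is what compensates for both changes.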
  • Patent number: 7257554
    Abstract: An electronic commerce system and method has a number of computer systems connected by a network, including a broker computer system having a database of scrips representing a form of currency, a vendor computer system having a database containing products which may be exchanged for the scrips, and a consumer computer system with which a user may initiate transactions to obtain the products contained in the database of the vendor computer system in return for scrip. The broker issues scrip to the consumer having a Customer ID including a Hash subfield containing a value produced by consumer identifying information hashed with a nonce. When the scrip is exchanged for additional scrip, the value in the Hash subfield is hashed with another nonce. The consumer stores the nonces used to produce the Hash subfield in a wallet.
    Type: Grant
    Filed: March 19, 1999
    Date of Patent: August 14, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Steven C. Glassman, Mark S. Manasse, John W. Court, Edmund J. Grohn, Andrew M. Palka, Nigel Norris
  • Patent number: 7203343
    Abstract: A system and method for finding one or more target biometric samples that are similar to or match a query biometric sample. A query feature vector is generated from a query biometric vector. The query biometric vector represents the query biometric sample as a set of characteristics. The characteristics are either invariable or variable. The query feature vector comprises a plurality of features which are derived from the query biometric vector using a process that includes canonicalization of the characters in the biometric vector. The query feature vector is compared to a plurality of similarly created target feature vectors, each target feature vector representing a respective target biometric sample. A target biometric sample is a potential match to the query biometric sample when a threshold number of features in the corresponding target feature vector are identical to features in the query biometric vector.
    Type: Grant
    Filed: September 21, 2001
    Date of Patent: April 10, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Mark S. Manasse, Andrei Z. Broder
  • Publication number: 20040139072
    Abstract: The invention provides a system and method for locating records in a database storing objects similar to a specified object. A set of object expansion rules and a set of canonicalization rules are applied to the specified object to generate a sequence of tokens. A set of features are then generated for the sequence of tokens. Generating a set of features includes: generating a set of characters from the sequence of tokens; assigning an identification element to each character in the set of characters to create a set of identification elements; creating a set of permuted identification elements; selecting a predetermined number of permuted identification elements from the set of permuted identification elements; partitioning the selected, permuted identification elements into a plurality of groups; and producing a feature value from each of these groups. Finally, a set of objects from the database with a predefined number of feature values in common with those of the specified object are located.
    Type: Application
    Filed: January 13, 2003
    Publication date: July 15, 2004
    Inventors: Andrei Z. Broder, Mark S. Manasse
  • Publication number: 20030061233
    Abstract: A system and method for finding one or more target biometric samples that are similar to or match a query biometric sample. A query feature vector is generated from a query biometric vector. The query biometric vector represents the query biometric sample as a set of characteristics. The characteristics are either invariable or variable. The query feature vector comprises a plurality of features which are derived from the query biometric vector using a process that includes canonicalization of the characters in the biometric vector. The query feature vector is compared to a plurality of similarly created target feature vectors, each target feature vector representing a respective target biometric sample. A target biometric sample is a potential match to the query biometric sample when a threshold number of features in the corresponding target feature vector are identical to features in the query biometric vector.
    Type: Application
    Filed: September 21, 2001
    Publication date: March 27, 2003
    Inventors: Mark S. Manasse, Andrei Z. Broder
  • Patent number: 6523012
    Abstract: An electronic commerce system includes a broker computer system having a database of scrip representing a form of currency, a vendor computer system having a database containing products which may be exchanged for the scrip, a consumer computer system with which a user may initiate transactions with the scrip, and an agent computer system to which the consumer can delegate rights to perform actions with the scrip. To delegate actions on scrip, the delegator provides the delegatee with a delegation having a list of the delegated actions. In addition, the delegator determines a delegation scrip secret (DSS) and a delegation pass phrase (DPP) and securely passes these to the delegatee. The delegatee uses the DSS to authenticate itself to servers accepting the scrip and uses the DPP to encrypt the DSS while the scrip is stored by the delegatee. To perform an action with delegated scrip, the delegatee sends a request for the action to a server.
    Type: Grant
    Filed: May 21, 1999
    Date of Patent: February 18, 2003
    Assignee: Compaq Information Technology Group, L.P.
    Inventors: Steven C. Glassman, Mark S. Manasse
  • Patent number: 6453305
    Abstract: An electronic commerce system and method enforces a license agreement for content on an open network by restricting the number of consumers that can concurrently access the content. A consumer initially acquires vendor scrip, either from a broker or the vendor itself. The consumer presents the vendor scrip to the vendor along with a request to access the content. In response, the vendor gathers information about the consumer to determine whether the consumer belongs to the class allowed to access the content. The information may be gathered from the scrip or from other sources. If the consumer belongs to the class, then the vendor determines if a license to access the content is available. Generally, a license is available if the number of other consumers having licenses to access the content is less than the maximum specified in the license agreement. If no licenses are available, the vendor provides the consumer with an estimate of when a license will be available.
    Type: Grant
    Filed: May 21, 1999
    Date of Patent: September 17, 2002
    Assignee: Compaq Computer Corporation
    Inventors: Steven C. Glassman, Mark S. Manasse
  • Patent number: 6424953
    Abstract: An electronic commerce system and method includes a broker computer system having a database of scrips, a vendor computer, and a consumer computer system having a wallet protected by a pass phrase. To strengthen the pass phrase, the wallet adds a nonce and a random phrase having a length determined by the processing speed of the computer system. The internal phrase is hashed with another nonce to form a checksum, which is stored in the wallet. A portion of each scrip is encrypted by hashing a unique nonce and the internal pass phrase, XORing the scrip with the hash, and storing the encrypted portion and nonce in the wallet. The wallet adds another nonce and random string to form an internal pass phrase. To use the scrip, the user provides the pass phrase. The wallet verifies that the phrase is correct, and if so, decrypts the scrip.
    Type: Grant
    Filed: March 19, 1999
    Date of Patent: July 23, 2002
    Assignee: Compaq Computer Corp.
    Inventors: Steven C. Glassman, Mark S. Manasse
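The hash-and-XOR step in the abstract of patent 6424953 can be sketched as below. This is an illustration under stated assumptions only: SHA-256 and a 16-byte random nonce are chosen for the example (the abstract names neither), the function names are hypothetical, and the protected portion is limited to the 32-byte length of one hash output.

```python
import hashlib
import os

def _pad(nonce, internal_phrase):
    """Hash the per-scrip nonce together with the internal pass phrase."""
    return hashlib.sha256(nonce + internal_phrase).digest()

def encrypt_portion(portion, internal_phrase):
    """XOR a short portion of a scrip with the hash; return (nonce, ciphertext)."""
    if len(portion) > hashlib.sha256().digest_size:
        raise ValueError("portion must fit within one hash output")
    nonce = os.urandom(16)  # unique nonce stored next to the ciphertext
    pad = _pad(nonce, internal_phrase)
    ciphertext = bytes(p ^ k for p, k in zip(portion, pad))
    return nonce, ciphertext

def decrypt_portion(nonce, ciphertext, internal_phrase):
    """Recompute the same hash and XOR again to recover the portion."""
    pad = _pad(nonce, internal_phrase)
    return bytes(c ^ k for c, k in zip(ciphertext, pad))
```

Because XOR is its own inverse, decryption is the same operation as encryption once the wallet has rebuilt the internal pass phrase. A fresh nonce per scrip keeps the pad from repeating across entries.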
  • Patent number: 6349296
    Abstract: A computer-implemented method determines the resemblance of data objects such as Web pages. Each data object is partitioned into a sequence of tokens. The tokens are grouped into overlapping sets of the tokens to form shingles. Each shingle is represented by a unique identification element encoded as a fingerprint. A minimum element from each of the images of the set of fingerprints associated with a document under each of a plurality of pseudo random permutations of the set of all fingerprints is selected to generate a sketch of each data object. The sketches characterize the resemblance of the data objects. The sketches can be further partitioned into a plurality of groups. Each group is fingerprinted to form a feature. Data objects that share more than a certain number of features are estimated to be nearly identical.
    Type: Grant
    Filed: August 21, 2000
    Date of Patent: February 19, 2002
    Assignee: AltaVista Company
    Inventors: Andrei Z. Broder, Steven C. Glassman, Charles G. Nelson, Mark S. Manasse, Geoffrey G. Zweig
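
The shingle, fingerprint, and minimum-element steps described in the abstract of patent 6349296 can be sketched as follows. This is a minimal illustration, not the patented method: whitespace tokenization, the shingle width `w`, and the use of salted BLAKE2b hashes as a stand-in for pseudo random permutations are all assumptions, and every function name is hypothetical.

```python
import hashlib

def fingerprint(data, salt=0):
    """64-bit fingerprint; distinct salts stand in for distinct permutations."""
    h = hashlib.blake2b(data, digest_size=8, salt=salt.to_bytes(8, "little"))
    return int.from_bytes(h.digest(), "big")

def shingles(text, w=4):
    """Overlapping runs of w consecutive tokens."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + w]) for i in range(max(len(tokens) - w + 1, 1))}

def sketch(text, num_perms=64, w=4):
    """For each simulated permutation, keep the minimum permuted fingerprint."""
    fps = [fingerprint(s.encode("utf-8")) for s in shingles(text, w)]
    return [
        min(fingerprint(fp.to_bytes(8, "big"), salt=k + 1) for fp in fps)
        for k in range(num_perms)
    ]

def resemblance(sketch_a, sketch_b):
    """Fraction of positions where the two sketches agree."""
    return sum(x == y for x, y in zip(sketch_a, sketch_b)) / len(sketch_a)

def features(sk, groups=8):
    """Partition a sketch into groups and fingerprint each group."""
    size = len(sk) // groups
    return [
        fingerprint(b"".join(v.to_bytes(8, "big") for v in sk[g * size:(g + 1) * size]))
        for g in range(groups)
    ]
```

Two documents can then be compared through `resemblance(sketch(a), sketch(b))`, or more coarsely by counting how many entries of `features(sketch(a))` and `features(sketch(b))` coincide. A feature matches only when every sketch element in its group matches, which is what makes shared features a strong signal of near-identical content.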