Patents by Inventor Mike Bendersky

Mike Bendersky has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220004918
    Abstract: Implementations relate to training a model that can be used to process values for defined features, where the values are specific to a user account, to generate a predicted user measure that reflects both popularity and quality of the user account. The model is trained based on losses that are each generated as a function of both a corresponding generated popularity measure and a corresponding generated quality measure of a corresponding training instance. Accordingly, the model can be trained to generate, based on values for a given user account, a single measure that reflects both quality and popularity of the given user account. Implementations are additionally or alternatively directed to utilizing such predicted user measures to restrict provisioning of content items that are from user accounts having respective predicted user measures that fail to satisfy a threshold.
    Type: Application
    Filed: July 6, 2020
    Publication date: January 6, 2022
    Inventors: Spurthi Amba Hombaiah, Vladimir Ofitserov, Mike Bendersky, Marc Alexander Najork
  • Publication number: 20210374345
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a tuple of respective input sequences to generate an output. In one aspect, one of the systems includes a neural network comprising a plurality of encoder neural networks and a head neural network, each encoder neural network configured to: receive a respective input sequence from the tuple; process the respective input sequence using one or more encoder network layers to generate an encoded representation comprising a sequence of tokens; and process each of some or all of the tokens in the sequence of tokens using a projection layer to generate a lower-dimensional representation, and the head neural network configured to: receive lower-dimensional representations of a respective proper subset of the sequence of tokens generated by the encoder neural network; and process the lower-dimensional representations to generate the output.
    Type: Application
    Filed: June 1, 2021
    Publication date: December 2, 2021
    Inventors: Karthik Raman, Liu Yang, Mike Bendersky, Jiecao Chen, Marc Alexander Najork
  • Patent number: 10970293
    Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: April 6, 2021
    Assignee: GOOGLE LLC
    Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
  • Patent number: 10540610
    Abstract: Methods, apparatus, and computer-readable media are provided for analyzing a cluster of communications, such as B2C emails, to generate a template for the cluster that defines transient segments and fixed segments of the cluster of communications. More particularly, methods, apparatus, and computer-readable media are provided for generating and/or applying a trained structured machine learning model for a generated template that can be used to determine, for one or more transient segments of subsequent communications, a corresponding probability that a given semantic label is the correct semantic label for extracted content of the transient segment(s).
    Type: Grant
    Filed: April 27, 2016
    Date of Patent: January 21, 2020
    Assignee: GOOGLE LLC
    Inventors: Jie Yang, Amr Ahmed, Luis Garcia Pueyo, Mike Bendersky, Amitabh Saikia, Marc-Allen Cartright, Marc Alexander Najork, MyLinh Yang, Hui Tan, Weinan Zhang, Vanja Josifovski, Alexander J. Smola
  • Publication number: 20190377741
    Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.
    Type: Application
    Filed: August 26, 2019
    Publication date: December 12, 2019
    Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
  • Patent number: 10394832
    Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.
    Type: Grant
    Filed: October 24, 2016
    Date of Patent: August 27, 2019
    Assignee: GOOGLE LLC
    Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
  • Patent number: 10360537
    Abstract: Techniques are described herein for generating and applying event data extraction templates. In various implementations, a data extraction template may be applied to structured communications to extract, from each structured communication, event data associated with a transient markup language path indicated in the data extraction template. The data extraction template may include an event-related semantic data type assigned to the transient markup language path and a strength of association between the transient structural path and the event-related semantic data type. Feedback may be obtained concerning event data extracted from one or more of the structured communications. Based on the feedback, the strength of association between the transient markup language path and the event-related semantic data type may be altered.
    Type: Grant
    Filed: April 11, 2017
    Date of Patent: July 23, 2019
    Assignee: GOOGLE LLC
    Inventors: Mike Bendersky, Maureen Heymans, Jinan Lou, Jie Yang, MyLinh Yang, Amitabh Saikia, Marc-Allen Cartright, Vanja Josifovski, Hui Tan, Luis Garcia Pueyo
  • Patent number: 10216838
    Abstract: Methods, apparatus, and computer-readable media are provided for generating and applying data extraction templates. In various implementations, a corpus of structured communications such as emails may be grouped into clusters based on one or more similarities between the structured communications. A set of structural paths may be identified from structured communications of a particular cluster. One or more structural paths of the set may be classified as transient wherein a count of occurrences of one or more associated segments of text across the particular cluster satisfies a criterion. One or more transient paths may be assigned a semantic data type and/or a confidentiality designation based on various signals. A data extraction template may be generated to extract, from subsequent structured communications, segments of text associated with transient (and in some cases, non-confidential) structural paths.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: February 26, 2019
    Assignee: Google LLC
    Inventors: Luis Garcia Pueyo, Vanja Josifovski, Amitabh Saikia, Jie Yang, Mike Bendersky, Srinidhi Viswanatha, Marc-Allen Cartright
  • Patent number: 10216837
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for selecting pattern matching segments suitable for electronic communication clustering. A set of pattern matching segments may be identified that match at least one of a corpus of electronic communication addresses. A measure of coverage of each of the set of pattern matching segments across the corpus of electronic communication addresses may be determined. A score associated with each pattern matching segment may be determined based on the measure of coverage and one or more measures of flexibility associated with each of the set of pattern matching segments. One or more of the pattern matching segments may be selected based on the determined scores. A corpus of electronic communications may then be grouped into a plurality of clusters based on a comparison of the one or more selected pattern matching segments to electronic communication addresses associated with the corpus of electronic communications.
    Type: Grant
    Filed: December 29, 2014
    Date of Patent: February 26, 2019
    Assignee: GOOGLE LLC
    Inventors: Amitabh Saikia, Marc-Allen Cartright, Luis Garcia Pueyo, Vanja Josifovski, Jie Yang, Mike Bendersky, MyLinh Yang
  • Publication number: 20180113866
    Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.
    Type: Application
    Filed: October 24, 2016
    Publication date: April 26, 2018
    Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
  • Patent number: 9953185
    Abstract: In various implementations, a plurality of non-private n-grams that satisfy a privacy criterion may be identified within a search log of private search queries and corresponding post-search activity. A plurality of query patterns may be generated based on the plurality of non-private n-grams. Aggregate search activity statistics associated with each of the plurality of query patterns may be determined from the search log. Aggregate search activity statistics associated with each query pattern may be indicative of search activity associated with a plurality of private search queries in the search log that match the query pattern. In response to a determination that aggregate search activity statistics for a given query pattern satisfy a performance criterion, a methodology for generating data that is presented in response to search queries that match the given query pattern may be altered based on aggregate search activity statistics associated with the given query pattern.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: April 24, 2018
    Assignee: GOOGLE LLC
    Inventors: Mike Bendersky, Donald Metzler, Marc Alexander Najork, Dor Naveh, Vlad Panait, Xuanhui Wang
  • Publication number: 20170293696
    Abstract: A computing device may generate a graph that includes a plurality of nodes, wherein the plurality of nodes includes a plurality of entity nodes representing a plurality of entities and a plurality of feature nodes representing a plurality of features, and wherein each of the plurality of entity nodes is connected in the graph to one or more of the plurality of feature nodes. The computing device may perform label propagation to associate a distribution of labels with each of the plurality of nodes. The computing device may be configured to receive an indication of at least one of a feature of interest or an entity of interest. The computing device may further be configured to output an indication of one or more related entities that are related to the feature of interest or the entity of interest.
    Type: Application
    Filed: April 11, 2016
    Publication date: October 12, 2017
    Inventors: Mike Bendersky, Vijay Garg, Sujith Ravi, Cheng Li
  • Patent number: 9785705
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for generating and applying data extraction templates. In various implementations, a corpus of plain text communications such as emails may be grouped into clusters based on one or more similarities between the plain text communications. One or more segments of communications of a particular cluster may be classified as transient based on textual pattern matching. One or more other segments of the communications of the particular cluster may be classified as transient based on various criteria. One or more transient segments may be assigned a generic and/or specific semantic data type and/or a confidentiality designation based on various signals. A data extraction template may be generated to extract, from subsequent plain text communications, content associated with transient (and in some cases, non-confidential) segments.
    Type: Grant
    Filed: October 16, 2014
    Date of Patent: October 10, 2017
    Assignee: GOOGLE INC.
    Inventors: Marc-Allen Cartright, Luis Garcia Pueyo, Vanja Josifovski, Amitabh Saikia, Jie Yang, Mike Bendersky, MyLinh Yang
  • Patent number: 9756073
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for determining whether communications are attempts at phishing. In various implementations, a potentially-deceptive communication may be matched to one or more templates of a plurality of templates. Each template may represent content shared among a cluster of communications sent by a legitimate entity. In various implementations, it may be determined that an address associated with the communication is not affiliated with one or more legitimate entities associated with the one or more matched templates. In various implementations, the communication may be classified as a phishing attempt based on the determining.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: September 5, 2017
    Assignee: GOOGLE INC.
    Inventors: Mike Bendersky, Luis Garcia Pueyo, Kashyap Ramesh Puranik, Amitabh Saikia, Jie Yang, Marc-Allen Cartright
  • Patent number: 9734148
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for redacting data from a document collection generated for a set of documents that include personal information. The redaction of the data is based in part on a comparison of the document collection to a set of personal documents of users for which the users have provided explicit approval to use in the processing of the document collection.
    Type: Grant
    Filed: October 21, 2014
    Date of Patent: August 15, 2017
    Assignee: Google Inc.
    Inventors: Mike Bendersky, Vanja Josifovski, Amitabh Saikia, Marc-Allen Cartright, Jie Yang, Luis Garcia Pueyo, MyLinh Yang
  • Publication number: 20170147834
    Abstract: In various implementations, a plurality of non-private n-grams that satisfy a privacy criterion may be identified within a search log of private search queries and corresponding post-search activity. A plurality of query patterns may be generated based on the plurality of non-private n-grams. Aggregate search activity statistics associated with each of the plurality of query patterns may be determined from the search log. Aggregate search activity statistics associated with each query pattern may be indicative of search activity associated with a plurality of private search queries in the search log that match the query pattern. In response to a determination that aggregate search activity statistics for a given query pattern satisfy a performance criterion, a methodology for generating data that is presented in response to search queries that match the given query pattern may be altered based on aggregate search activity statistics associated with the given query pattern.
    Type: Application
    Filed: November 24, 2015
    Publication date: May 25, 2017
    Inventors: Mike Bendersky, Donald Metzler, Marc Alexander Najork, Dor Naveh, Vlad Panait, Xuanhui Wang
  • Publication number: 20170149824
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for determining whether communications are attempts at phishing. In various implementations, a potentially-deceptive communication may be matched to one or more templates of a plurality of templates. Each template may represent content shared among a cluster of communications sent by a legitimate entity. In various implementations, it may be determined that an address associated with the communication is not affiliated with one or more legitimate entities associated with the one or more matched templates. In various implementations, the communication may be classified as a phishing attempt based on the determining.
    Type: Application
    Filed: January 26, 2017
    Publication date: May 25, 2017
    Inventors: Mike Bendersky, Luis Garcia Pueyo, Kashyap Ramesh Puranik, Amitabh Saikia, Jie Yang, Marc-Allen Cartright
  • Patent number: 9652530
    Abstract: Methods and apparatus are described herein for generating and applying event data extraction templates. In various implementations, a set of structural paths may be identified from a corpus of communications. A first structural path of the set of structural paths, associated with a first segment of text, may be classified as transient in response to a determination that a frequency of occurrences of the first segment of text across the corpus satisfies a criterion. Event heuristics may be applied to the communications of the corpus. A determination may be made, based on the applying, that the communications of the corpus are event-related. An event data type may be assigned to the transient structural path based on the applying. An event data extraction template may be generated to extract, from one or more subsequent communications, one or more event-related segments of text associated with the transient structural path.
    Type: Grant
    Filed: August 27, 2014
    Date of Patent: May 16, 2017
    Assignee: GOOGLE INC.
    Inventors: Mike Bendersky, Maureen Heymans, Jinan Lou, Jie Yang, MyLinh Yang, Amitabh Saikia, Marc-Allen Cartright, Vanja Josifovski, Hui Tan, Luis Garcia Pueyo
  • Patent number: 9596265
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for determining whether communications are attempts at phishing. In various implementations, a potentially-deceptive communication may be matched to one or more templates of a plurality of templates. Each template may represent content shared among a cluster of communications sent by a trustworthy entity. In various implementations, it may be determined that an address associated with the communication is not affiliated with one or more trustworthy entities associated with the one or more matched templates. In various implementations, the communication may be classified as a phishing attempt based on the determining.
    Type: Grant
    Filed: May 13, 2015
    Date of Patent: March 14, 2017
    Assignee: GOOGLE INC.
    Inventors: Mike Bendersky, Luis Garcia Pueyo, Kashyap Ramesh Puranik, Amitabh Saikia, Jie Yang, Marc-Allen Cartright
  • Patent number: 9563689
    Abstract: Methods, apparatus, and computer-readable media are provided for generating and applying data extraction templates. In various implementations, a corpus of structured communications such as emails may be grouped into clusters based on one or more similarities between the structured communications. A set of structural paths may be identified from structured communications of a particular cluster. One or more structural paths of the set may be classified as transient wherein a count of occurrences of one or more associated segments of text across the particular cluster satisfies a criterion. One or more transient paths may be assigned a semantic data type and/or a confidentiality designation based on various signals. A data extraction template may be generated to extract, from subsequent structured communications, segments of text associated with transient (and in some cases, non-confidential) structural paths.
    Type: Grant
    Filed: August 27, 2014
    Date of Patent: February 7, 2017
    Assignee: Google Inc.
    Inventors: Luis Garcia Pueyo, Vanja Josifovski, Amitabh Saikia, Jie Yang, Mike Bendersky, Srinidhi Viswanatha, Marc-Allen Cartright
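
The abstracts of patents 9563689 and 10216838 describe classifying structural paths in a cluster of communications as transient, then building a template that extracts text at those paths from later communications. The following is a minimal illustrative sketch of that idea, not the patented method. The function names (classify_paths, build_template, extract), the max_share parameter, and the specific criterion (a path is transient when its most frequent text appears in at most half of the communications that contain it) are all assumptions made for this example, as is the sample data.

```python
from collections import Counter, defaultdict


def classify_paths(cluster, max_share=0.5):
    """Label each structural path in a cluster as 'transient' or 'fixed'.

    cluster is a list of dicts, one per communication, mapping a structural
    path to the text found at that path. The criterion is hypothetical: a
    path is transient when its most common text accounts for at most
    max_share of the communications that contain the path.
    """
    texts_by_path = defaultdict(Counter)
    for communication in cluster:
        for path, text in communication.items():
            texts_by_path[path][text] += 1

    labels = {}
    for path, counts in texts_by_path.items():
        total = sum(counts.values())
        top_count = counts.most_common(1)[0][1]
        labels[path] = "transient" if top_count / total <= max_share else "fixed"
    return labels


def build_template(labels):
    """A template here is simply the sorted list of transient paths."""
    return sorted(path for path, label in labels.items() if label == "transient")


def extract(template, communication):
    """Pull the text at each transient path from a later communication."""
    return {path: communication[path] for path in template if path in communication}


# Invented sample cluster: three shipping notices from the same sender.
cluster = [
    {"/html/body/h1": "Your order has shipped",
     "/html/body/p[1]": "Order #1001", "/html/body/p[2]": "Arrives Monday"},
    {"/html/body/h1": "Your order has shipped",
     "/html/body/p[1]": "Order #1002", "/html/body/p[2]": "Arrives Tuesday"},
    {"/html/body/h1": "Your order has shipped",
     "/html/body/p[1]": "Order #1003", "/html/body/p[2]": "Arrives Friday"},
]

template = build_template(classify_paths(cluster))
new_email = {"/html/body/h1": "Your order has shipped",
             "/html/body/p[1]": "Order #2001", "/html/body/p[2]": "Arrives Sunday"}
print(extract(template, new_email))
```

In this sketch the heading is identical in every message and is labeled fixed, while the order number and arrival day differ per message and are labeled transient. The abstracts also mention semantic data types and confidentiality designations, which this example leaves out.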
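
The phishing-detection abstracts (patents 9596265 and 9756073) describe matching a communication to templates built from a legitimate entity's messages, then checking whether the sender's address is affiliated with that entity. A rough sketch under stated assumptions follows. The Template class, the phrase-overlap matching rule, the min_fraction threshold, and the example domains are hypothetical; the abstracts do not specify how matching or affiliation is determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    entity: str               # legitimate sender the template was built from
    fixed_phrases: frozenset  # lowercase content shared across the cluster
    domains: frozenset        # domains affiliated with the entity (assumed known)


def sender_domain(address):
    """Return the lowercase domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def matches(template, body, min_fraction=0.8):
    """Hypothetical matching rule: enough fixed phrases appear in the body."""
    if not template.fixed_phrases:
        return False
    text = body.lower()
    hits = sum(1 for phrase in template.fixed_phrases if phrase in text)
    return hits / len(template.fixed_phrases) >= min_fraction


def is_phishing(sender, body, templates):
    """Flag a message that looks like a known template but comes from an
    address not affiliated with any entity behind the matched templates."""
    matched = [t for t in templates if matches(t, body)]
    if not matched:
        return False  # resembles no known sender, so this check does not apply
    domain = sender_domain(sender)
    return all(domain not in t.domains for t in matched)


# Invented example data.
bank = Template(
    entity="Example Bank",
    fixed_phrases=frozenset({"your statement is ready", "log in to view"}),
    domains=frozenset({"examplebank.com"}),
)
body = "Your statement is ready. Log in to view it."
print(is_phishing("alerts@examplebank.com", body, [bank]))
print(is_phishing("alerts@examp1ebank.net", body, [bank]))
```

The first call uses an affiliated domain and is not flagged; the second reuses the same content from an unaffiliated look-alike domain and is flagged. A message that matches no template is left to other checks.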
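
The abstract of patent 9953185 describes identifying non-private n-grams in a log of private search queries, generating query patterns from them, and aggregating search activity per pattern. The sketch below illustrates one possible reading, with several assumptions: the privacy criterion is taken to be "the n-gram was issued by at least min_users distinct users", the aggregate statistic is a simple click rate, and the function names, the wildcard convention, and the log entries are invented for this example.

```python
from collections import defaultdict


def public_ngrams(search_log, n=2, min_users=2):
    """Return n-grams seen in queries from at least min_users distinct users
    (a hypothetical stand-in for the privacy criterion)."""
    users_by_ngram = defaultdict(set)
    for user, query, _clicked in search_log:
        tokens = query.lower().split()
        for i in range(len(tokens) - n + 1):
            users_by_ngram[tuple(tokens[i:i + n])].add(user)
    return {gram for gram, users in users_by_ngram.items() if len(users) >= min_users}


def to_pattern(query, public, n=2, wildcard="*"):
    """Keep tokens covered by a public n-gram; collapse the rest into wildcards."""
    tokens = query.lower().split()
    keep = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in public:
            for j in range(i, i + n):
                keep[j] = True
    pattern = []
    for token, kept in zip(tokens, keep):
        piece = token if kept else wildcard
        if piece == wildcard and pattern and pattern[-1] == wildcard:
            continue  # merge adjacent wildcards into one
        pattern.append(piece)
    return " ".join(pattern)


def pattern_click_rates(search_log, public, n=2):
    """Aggregate a click rate per query pattern across the whole log."""
    totals = defaultdict(lambda: [0, 0])  # pattern -> [clicks, queries]
    for _user, query, clicked in search_log:
        entry = totals[to_pattern(query, public, n)]
        entry[0] += int(clicked)
        entry[1] += 1
    return {pattern: clicks / count for pattern, (clicks, count) in totals.items()}


# Invented log entries: (user, query, whether a result was clicked).
search_log = [
    ("u1", "invoice from acme corp", True),
    ("u2", "invoice from globex", False),
    ("u3", "invoice from initech llc", True),
    ("u1", "lunch with dana", False),
]

public = public_ngrams(search_log)
print(to_pattern("invoice from acme corp", public))
print(pattern_click_rates(search_log, public))
```

Here only the bigram "invoice from" is shared by more than one user, so the three invoice queries collapse into a single pattern whose statistics can be inspected without exposing any individual query. The abstract's final step, altering how results are generated when a pattern's statistics satisfy a performance criterion, is not modeled.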