Patents by Inventor Marc Alexander Najork

Marc Alexander Najork has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Dynamic Language Models for Continuously Evolving Content

Publication number: 20230401382

Abstract: Provided are systems and methods for incremental training of machine learning models to adapt to changes in an underlying data distribution. One example setting in which the techniques described herein may be beneficial is for incrementally training natural language models to enable the models to have or adapt to a dynamically changing vocabulary. Incremental training is provided as a feasible and inexpensive way of adapting machine learning models to evolving vocabulary without having to retrain them from scratch.

Type: Application

Filed: October 19, 2021

Publication date: December 14, 2023

Inventors: Spurthi Amba Hombaiah, Mingyang Zhang, Michael Bendersky, Tao Chen, Marc Alexander Najork
Layout-Aware Multimodal Pretraining for Multimodal Document Understanding

Publication number: 20230222285

Abstract: Systems and methods for document processing that can process and understand the layout, text size, text style, and multimedia of a document can generate more accurate and informed document representations. The layout of a document paired with text size and style can indicate what portions of a document are possibly more important, and the understanding of that importance can help with understanding of the document. Systems and methods utilizing a hierarchical framework that processes the block-level and the document-level of a document can capitalize on these indicators to generate a better document representation.

Type: Application

Filed: December 22, 2020

Publication date: July 13, 2023

Inventors: Mingyang Zhang, Cheng Li, Tao Chen, Spurthi Amba Hombaiah, Michael Bendersky, Marc Alexander Najork, Te-Lin Wu
AUTOMATIC FILE ORGANIZATION WITHIN A CLOUD STORAGE SYSTEM

Publication number: 20230177004

Abstract: Techniques are described herein for enabling more computationally efficient organization of files within a cloud storage system. A method includes: receiving information identifying a document and a set of folders; for each folder in the set of folders, using a trained model to predict a similarity measure between the folder and the document; for each folder in the set of folders, determining a score for the folder based on the predicted similarity measure for the folder; selecting a candidate folder from the set of folders using the scores of the folders within the set of folders; and providing, on a user interface, a selectable option to associate the document with the candidate folder.

Type: Application

Filed: December 7, 2021

Publication date: June 8, 2023

Inventors: Weize Kong, Mingyang Zhang, Michael Bendersky, Marc Alexander Najork, Mike Colagrosso, Brandon Vargo, Remy Burger
ADVERSARIAL BANDITS POLICY FOR CRAWLING HIGHLY DYNAMIC CONTENT

Publication number: 20230169128

Abstract: Techniques of generating recrawl policies for commercial offer pages include generating a multiple strategy approach using a number of different strategies. In some implementations, each strategy is an arm of a K-armed adversarial bandits algorithm with reinforcement learning. Moreover, in some implementations, the multiple strategy approach also uses a machine learning algorithm to estimate parameters such as a click rate, impression rate, and likelihood of price change, i.e., change rate, which was assumed known in the conventional approaches.

Type: Application

Filed: March 30, 2020

Publication date: June 1, 2023

Inventors: Michael Bendersky, Przemyslaw Gajda, Sergey Novikov, Marc Alexander Najork, Shuguang Han
TRAINING AND/OR UTILIZING A MODEL FOR PREDICTING MEASURES REFLECTING BOTH QUALITY AND POPULARITY OF CONTENT

Publication number: 20230094198

Abstract: Implementations relate to training a model that can be used to process values for defined features, where the values are specific to a user account, to generate a predicted user measure that reflects both popularity and quality of the user account. The model is trained based on losses that are each generated as a function of both a corresponding generated popularity measure and a corresponding generated quality measure of a corresponding training instance. Accordingly, the model can be trained to generate, based on values for a given user account, a single measure that reflects both quality and popularity of the given user account. Implementations are additionally or alternatively directed to utilizing such predicted user measures to restrict provisioning of content items that are from user accounts having respective predicted user measures that fail to satisfy a threshold.

Type: Application

Filed: December 5, 2022

Publication date: March 30, 2023

Inventors: Spurthi Amba Hombaiah, Vladimir Ofitserov, Mike Bendersky, Marc Alexander Najork
Training and/or utilizing a model for predicting measures reflecting both quality and popularity of content

Patent number: 11551150

Abstract: Implementations relate to training a model that can be used to process values for defined features, where the values are specific to a user account, to generate a predicted user measure that reflects both popularity and quality of the user account. The model is trained based on losses that are each generated as a function of both a corresponding generated popularity measure and a corresponding generated quality measure of a corresponding training instance. Accordingly, the model can be trained to generate, based on values for a given user account, a single measure that reflects both quality and popularity of the given user account. Implementations are additionally or alternatively directed to utilizing such predicted user measures to restrict provisioning of content items that are from user accounts having respective predicted user measures that fail to satisfy a threshold.

Type: Grant

Filed: July 6, 2020

Date of Patent: January 10, 2023

Assignee: GOOGLE LLC

Inventors: Spurthi Amba Hombaiah, Vladimir Ofitserov, Mike Bendersky, Marc Alexander Najork
Search and retrieval of structured information cards

Patent number: 11238058

Abstract: Methods, systems, apparatus, including computer programs encoded on computer storage medium, to facilitate identification of additional trigger-terms for a structured information card. In one aspect, the method includes actions of accessing data associated with a template for presenting structured information, wherein the accessed data references (i) a label term and (ii) a value. Other actions may include obtaining a candidate label term, identifying one or more entities that are associated with the label term, identifying one or more of the entities that are associated with the candidate label term, and for each particular entity of the one or more entities that are associated with the candidate label term, associating, with the candidate label term, (i) a label term that is associated with the particular entity, and (ii) the value associated with the label term.

Type: Grant

Filed: November 2, 2020

Date of Patent: February 1, 2022

Assignee: Google LLC

Inventors: Marc Alexander Najork, Sujith Ravi, Michael Bendersky, Peter Shao-sen Young, Timothy Youngjin Sohn, Mingyang Zhang, Thomas Nelson, Xuanhui Wang
TRAINING AND/OR UTILIZING A MODEL FOR PREDICTING MEASURES REFLECTING BOTH QUALITY AND POPULARITY OF CONTENT

Publication number: 20220004918

Abstract: Implementations relate to training a model that can be used to process values for defined features, where the values are specific to a user account, to generate a predicted user measure that reflects both popularity and quality of the user account. The model is trained based on losses that are each generated as a function of both a corresponding generated popularity measure and a corresponding generated quality measure of a corresponding training instance. Accordingly, the model can be trained to generate, based on values for a given user account, a single measure that reflects both quality and popularity of the given user account. Implementations are additionally or alternatively directed to utilizing such predicted user measures to restrict provisioning of content items that are from user accounts having respective predicted user measures that fail to satisfy a threshold.

Type: Application

Filed: July 6, 2020

Publication date: January 6, 2022

Inventors: Spurthi Amba Hombaiah, Vladimir Ofitserov, Mike Bendersky, Marc Alexander Najork
PROCESSING LARGE-SCALE TEXTUAL INPUTS USING NEURAL NETWORKS

Publication number: 20210374345

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a tuple of respective input sequences to generate an output. In one aspect, one of the systems includes a neural network comprising a plurality of encoder neural networks and a head neural network, each encoder neural network configured to: receive a respective input sequence from the tuple; process the respective input sequence using one or more encoder network layers to generate an encoded representation comprising a sequence of tokens; and process each of some or all of the tokens in the sequence of tokens using a projection layer to generate a lower-dimensional representation, and the head neural network configured to: receive lower-dimensional representations of a respective proper subset of the sequence of tokens generated by the encoder neural network; and process the lower-dimensional representations to generate the output.

Type: Application

Filed: June 1, 2021

Publication date: December 2, 2021

Inventors: Karthik Raman, Liu Yang, Mike Bendersky, Jiecao Chen, Marc Alexander Najork
TRAINING A RANKING MODEL

Publication number: 20210125108

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a ranking machine learning model. In one aspect, a method includes the actions of receiving training data for a ranking machine learning model, the training data including training examples, and each training example including data identifying: a search query, result documents from a result list for the search query, and a result document that was selected by a user from the result list, receiving position data for each training example in the training data, the position data identifying a respective position of the selected result document in the result list for the search query in the training example; determining, for each training example in the training data, a respective selection bias value; and determining a respective importance value for each training example from the selection bias value for the training example, the importance value.

Type: Application

Filed: October 24, 2016

Publication date: April 29, 2021

Applicant: Google LLC

Inventors: Donald Arthur Metzler, JR., Xuanhui Wang, Marc Alexander Najork, Michael Bendersky
Ranking search result documents

Patent number: 10970293

Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.

Type: Grant

Filed: August 26, 2019

Date of Patent: April 6, 2021

Assignee: GOOGLE LLC

Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
SEARCH AND RETRIEVAL OF STRUCTURED INFORMATION CARDS

Publication number: 20210049165

Abstract: Methods, systems, apparatus, including computer programs encoded on computer storage medium, to facilitate identification of additional trigger-terms for a structured information card. In one aspect, the method includes actions of accessing data associated with a template for presenting structured information, wherein the accessed data references (i) a label term and (ii) a value. Other actions may include obtaining a candidate label term, identifying one or more entities that are associated with the label term, identifying one or more of the entities that are associated with the candidate label term, and for each particular entity of the one or more entities that are associated with the candidate label term, associating, with the candidate label term, (i) a label term that is associated with the particular entity, and (ii) the value associated with the label term.

Type: Application

Filed: November 2, 2020

Publication date: February 18, 2021

Inventors: Marc Alexander Najork, Sujith Ravi, Michael Bendersky, Peter Shao-sen Young, Timothy Youngjin Sohn, Mingyang Zhang, Thomas Nelson, Xuanhui Wang
Search and retrieval of structured information cards

Patent number: 10824630

Abstract: Methods, systems, apparatus, including computer programs encoded on computer storage medium, to facilitate identification of additional trigger-terms for a structured information card. In one aspect, the method includes actions of accessing data associated with a template for presenting structured information, wherein the accessed data references (i) a label term and (ii) a value. Other actions may include obtaining a candidate label term, identifying one or more entities that are associated with the label term, identifying one or more of the entities that are associated with the candidate label term, and for each particular entity of the one or more entities that are associated with the candidate label term, associating, with the candidate label term, (i) a label term that is associated with the particular entity, and (ii) the value associated with the label term.

Type: Grant

Filed: October 26, 2016

Date of Patent: November 3, 2020

Assignee: GOOGLE LLC

Inventors: Marc Alexander Najork, Sujith Ravi, Michael Bendersky, Peter Shao-sen Young, Timothy Youngjin Sohn, Mingyang Zhang, Thomas Nelson, Xuanhui Wang
Generating and applying a trained structured machine learning model for determining a semantic label for content of a transient segment of a communication

Patent number: 10540610

Abstract: Methods, apparatus, and computer-readable media are provided for analyzing a cluster of communications, such as B2C emails, to generate a template for the cluster that defines transient segments and fixed segments of the cluster of communications. More particularly, methods, apparatus, and computer-readable media are provided for generating and/or applying a trained structured machine learning model for a generated template that can be used to determine, for one or more transient segments of subsequent communications, a corresponding probability that a given semantic label is the correct semantic label for extracted content of the transient segment(s).

Type: Grant

Filed: April 27, 2016

Date of Patent: January 21, 2020

Assignee: GOOGLE LLC

Inventors: Jie Yang, Amr Ahmed, Luis Garcia Pueyo, Mike Bendersky, Amitabh Saikia, Marc-Allen Cartright, Marc Alexander Najork, MyLinh Yang, Hui Tan, Weinan Zhang, Vanja Josifovski, Alexander J. Smola
RANKING SEARCH RESULT DOCUMENTS

Publication number: 20190377741

Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.

Type: Application

Filed: August 26, 2019

Publication date: December 12, 2019

Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
Ranking search results documents

Patent number: 10394832

Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.

Type: Grant

Filed: October 24, 2016

Date of Patent: August 27, 2019

Assignee: GOOGLE LLC

Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
SEARCH AND RETRIEVAL OF STRUCTURED INFORMATION CARDS

Publication number: 20180113865

Abstract: Methods, systems, apparatus, including computer programs encoded on computer storage medium, to facilitate identification of additional trigger-terms for a structured information card. In one aspect, the method includes actions of accessing data associated with a template for presenting structured information, wherein the accessed data references (i) a label term and (ii) a value. Other actions may include obtaining a candidate label term, identifying one or more entities that are associated with the label term, identifying one or more of the entities that are associated with the candidate label term, and for each particular entity of the one or more entities that are associated with the candidate label term, associating, with the candidate label term, (i) a label term that is associated with the particular entity, and (ii) the value associated with the label term.

Type: Application

Filed: October 26, 2016

Publication date: April 26, 2018

Inventors: Marc Alexander Najork, Sujith Ravi, Michael Bendersky, Peter Shao-sen Young, Timothy Youngjin Sohn, Mingyang Zhang, Thomas Nelson, Xuanhui Wang
RANKING SEARCH RESULT DOCUMENTS

Publication number: 20180113866

Abstract: Methods and apparatus related to using document feature(s) of a document that is responsive to a query, and optionally query feature(s) of the query, to determine a presentation characteristic for presenting a search result that corresponds to the document. In some implementations, measures associated with the document feature(s) and/or query feature(s) may be used to determine the presentation characteristic. The measures may be based on past interactions, by corresponding users, with other documents that share one or more of the document features with the document, where a plurality of the other documents are different from the document (and optionally each different from one another). In some implementations, the document and/or the other documents include, or are restricted to, documents that are access restricted.

Type: Application

Filed: October 24, 2016

Publication date: April 26, 2018

Inventors: Mike Bendersky, Marc Alexander Najork, Donald Metzler, Xuanhui Wang
Identifying query patterns and associated aggregate statistics among search queries

Patent number: 9953185

Abstract: In various implementations, a plurality of non-private n-grams that satisfy a privacy criterion may be identified within a search log of private search queries and corresponding post-search activity. A plurality of query patterns may be generated based on the plurality of non-private n-grams. Aggregate search activity statistics associated with each of the plurality of query patterns may be determined from the search log. Aggregate search activity statistics associated with each query pattern may be indicative of search activity associated with a plurality of private search queries in the search log that match the query pattern. In response to a determination that aggregate search activity statistics for a given query pattern satisfy a performance criterion, a methodology for generating data that is presented in response to search queries that match the given query pattern may be altered based on aggregate search activity statistics associated with the given query pattern.

Type: Grant

Filed: November 24, 2015

Date of Patent: April 24, 2018

Assignee: GOOGLE LLC

Inventors: Mike Bendersky, Donald Metzler, Marc Alexander Najork, Dor Naveh, Vlad Panait, Xuanhui Wang
IDENTIFYING QUERY PATTERNS AND ASSOCIATED AGGREGATE STATISTICS AMONG SEARCH QUERIES

Publication number: 20170147834

Abstract: In various implementations, a plurality of non-private n-grams that satisfy a privacy criterion may be identified within a search log of private search queries and corresponding post-search activity. A plurality of query patterns may be generated based on the plurality of non-private n-grams. Aggregate search activity statistics associated with each of the plurality of query patterns may be determined from the search log. Aggregate search activity statistics associated with each query pattern may be indicative of search activity associated with a plurality of private search queries in the search log that match the query pattern. In response to a determination that aggregate search activity statistics for a given query pattern satisfy a performance criterion, a methodology for generating data that is presented in response to search queries that match the given query pattern may be altered based on aggregate search activity statistics associated with the given query pattern.

Type: Application

Filed: November 24, 2015

Publication date: May 25, 2017

Inventors: Mike Bendersky, Donald Metzler, Marc Alexander Najork, Dor Naveh, Vlad Panait, Xuanhui Wang

1 2 next