Patents by Inventor Benoit Dumoulin
Benoit Dumoulin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7809715
Abstract: A method for handling abbreviations in web queries includes building a dictionary of a plurality of possible word expansions for a plurality of potential abbreviations related to query terms received or anticipated to be received by a search engine; accepting a query including an abbreviation; expanding the abbreviation into one of the plurality of word expansions if a probability that the expansion is correct is above a threshold value, wherein the probability is determined by taking into consideration a context of the abbreviation within the query, wherein the context includes at least anchor text; and sending the query with the expanded abbreviation to the search engine to generate a search results page related to the query.
Type: Grant
Filed: April 15, 2008
Date of Patent: October 5, 2010
Assignee: Yahoo! Inc.
Inventors: Xing Wei, Fuchun Peng, Benoit Dumoulin
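The thresholded expansion step described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `EXPANSIONS` dictionary, its probabilities, the `expand_query` helper, and the choice to take the best probability over the other query words as the context signal. It does not model anchor text and is not the patented method.

```python
# Hypothetical dictionary: abbreviation -> {expansion: {context word: probability}}.
# All entries and probabilities are made up for illustration.
EXPANSIONS = {
    "ny": {
        "new york": {"hotels": 0.9, "times": 0.95},
    },
    "dr": {
        "doctor": {"appointment": 0.85},
        "drive": {"address": 0.8},
    },
}


def expand_query(query, threshold=0.7, prior=0.1):
    """Expand an abbreviation only when its best expansion clears the threshold.

    The probability of an expansion is taken here as the highest probability
    associated with any other word in the query (an assumed, simplified
    stand-in for a real context model); unseen context words get `prior`.
    """
    tokens = query.lower().split()
    out = []
    for i, tok in enumerate(tokens):
        context = tokens[:i] + tokens[i + 1:]
        best, best_p = tok, 0.0
        for expansion, ctx_probs in EXPANSIONS.get(tok, {}).items():
            p = max((ctx_probs.get(c, prior) for c in context), default=prior)
            if p > best_p:
                best, best_p = expansion, p
        # Keep the original token unless the expansion is confident enough.
        out.append(best if best_p > threshold else tok)
    return " ".join(out)


print(expand_query("ny hotels"))       # new york hotels
print(expand_query("dr appointment"))  # doctor appointment
print(expand_query("dr who"))          # dr who (no expansion clears the threshold)
```

The threshold keeps ambiguous abbreviations such as "dr" untouched when the surrounding words give no evidence for either expansion.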
-
Publication number: 20100191740
Abstract: A system and method for ranking web searches with quantified semantic features. A query for a web search is received from a user. The query is segmented and tagged into one or more linguistic segments using linguistic analysis. At least some of the linguistic segments are tagged with a linguistic type. A query execution plan is generated comprising the linguistic segments and, for each of the linguistic segments tagged with a linguistic type, at least one tag attribute comprising at least one domain specific feature of the linguistic type. A search is performed for documents matching the query. Each of the documents is scored for each of the linguistic segments of the query execution plan using the tag attributes of the respective linguistic segment. The documents are ranked using a function that uses the scores of the documents. A ranked list of the documents is transmitted back to the user.
Type: Application
Filed: January 26, 2009
Publication date: July 29, 2010
Applicant: Yahoo! Inc.
Inventors: Yumao Lu, Benoit Dumoulin
-
Publication number: 20100191758
Abstract: A system and method for improved search relevance using proximity boosting. A query for a web search is received from a user, via a network, wherein the query comprises a plurality of query tokens. One or more concepts are identified in the query wherein each of the concepts comprises at least two query tokens. A relative concept strength is determined for each of the identified concepts. The query is then rewritten for submission to a search engine wherein for each of the one or more concepts, a syntax rule associated with the respective relative concept strength of the concept is applied to the query tokens comprising the concept such that the rewritten query represents the one or more concepts whereby the proximity of the one or more concepts in a search result returned by the search engine to the user in response to the rewritten query is boosted.
Type: Application
Filed: January 26, 2009
Publication date: July 29, 2010
Applicant: Yahoo! Inc.
Inventors: Fuchun Peng, Xing Wei, Yumao Lu, Xin Li, Donald Metzler, Hang Cui, Benoit Dumoulin
-
Publication number: 20100185623
Abstract: An aggregate ranking model is generated, which comprises a general ranking model and one or more topical training models. Each topical ranking model is associated with a topic, or topic class, and for use in ranking search result items determined to belong to the topic, or topic class. As one example, the topical ranking model is trained using a set of topical training data, e.g., training data determined to belong to the topic, or topic class, a general ranking model and a residue, or error, determined from a general ranking generated by the general ranking model for the topical training data, with the topical ranking model being trained to minimize the general ranking model's error in the aggregate ranking model.
Type: Application
Filed: January 15, 2009
Publication date: July 22, 2010
Inventors: Yumao Lu, Benoit Dumoulin
-
Publication number: 20100114878
Abstract: A method is provided for selecting relevant documents returned from a search query. When a search engine finds search terms in documents, the document score is based on the frequency of the occurrence of those terms, the category of the term, and the section of the document in which the term is found. Each (category type, document section) pair is assigned a weight that is used to modify the contribution of term frequency. The weights are determined in an offline process using historical data and human validation. Through this empirical process, the weight assignments are made to correlate high relevance scores with documents that humans would find relevant to a search query.
Type: Application
Filed: October 22, 2008
Publication date: May 6, 2010
Inventors: Yumao Lu, Benoit Dumoulin
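The weighting of term frequency by (category type, document section) pairs in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the `WEIGHTS` table, its values, the category names, and the `score_document` helper are all hypothetical, and the plain weighted sum stands in for whatever scoring function the application actually claims.

```python
from collections import Counter

# Hypothetical weights for (term category, document section) pairs.
# The abstract says such weights come from an offline process using
# historical data and human validation; these values are made up.
WEIGHTS = {
    ("product", "title"): 3.0,
    ("product", "body"): 1.0,
    ("location", "title"): 2.0,
    ("location", "body"): 0.5,
}


def score_document(doc_sections, query_terms, default_weight=1.0):
    """Sum term frequencies, each scaled by its (category, section) weight.

    doc_sections: dict mapping section name -> list of tokens
    query_terms:  dict mapping query term -> category
    """
    score = 0.0
    for section, tokens in doc_sections.items():
        counts = Counter(tokens)
        for term, category in query_terms.items():
            weight = WEIGHTS.get((category, section), default_weight)
            score += weight * counts[term]
    return score


doc = {
    "title": ["camera", "shop", "paris"],
    "body": ["camera", "camera", "lens", "paris"],
}
query = {"camera": "product", "paris": "location"}
print(score_document(doc, query))  # 7.5
```

Here a title match on a product term counts three times as much as a body match, so the same raw term frequency contributes differently depending on where the term appears.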
-
Publication number: 20100094835
Abstract: Techniques are described for automatically determining which terms in a search query may be augmented by contextually similar terms such that more relevant results can be displayed to a user. Contextually similar words are determined based on training data, including a web corpus and a query log. Once contextually similar words are determined, they may be inserted into a search query and used to find more relevant results. Consequently, documents that contain helpful information but may not have exact word matches may be found more readily by a search engine.
Type: Application
Filed: October 15, 2008
Publication date: April 15, 2010
Inventors: Yumao Lu, Benoit Dumoulin
-
Publication number: 20100036784
Abstract: The present invention is directed towards systems and methods for identifying high quality content in a social media environment. The method according to one embodiment of the present invention comprises retrieving a content item and retrieving a plurality of quality features associated with said content item wherein said quality features comprise intrinsic, usage and relationship features. The method then performs an analysis of said content item against said quality features and generates a quality score based on said analysis.
Type: Application
Filed: August 7, 2008
Publication date: February 11, 2010
Applicant: Yahoo! Inc.
Inventors: Gilad Mishne, Benoit Dumoulin, Aristides Gionis, Debora Donato, Yevgeny Agichtein
-
Publication number: 20090276381
Abstract: Techniques for automatically scoring submissions to an online question-and-answer submission system are disclosed. According to one such technique, an initial set of user submissions are scored by human operators and/or automated algorithmic mechanisms. The submissions and their accompanying scores are provided as training data to an automated machine learning mechanism. The machine learning mechanism processes the training data and automatically detects patterns in the provided submissions. The machine learning mechanism automatically correlates these patterns with the scores assigned to the submissions that match those patterns. As a result, the machine learning mechanism is trained. Thereafter, the machine learning mechanism processes unscored submissions. The machine learning mechanism automatically identifies, from among the patterns that the machine learning mechanism has already detected, one or more patterns that these submissions match.
Type: Application
Filed: June 24, 2009
Publication date: November 5, 2009
Inventors: Daniel Boies, Benoit Dumoulin, Remi Kwan
-
Publication number: 20090259643
Abstract: A method for normalizing query words in web search includes populating a dictionary with join and split candidates and corresponding joined and split words from an aggregate of query logs; determining a confidence score for join and split candidates, a highest confidence score for each being characterized in the dictionary as must-join and must-split, respectively; accepting queries with words amenable to being split or joined, or amenable to an addition or deletion of a hyphen or an apostrophe; generating, based on the accepted queries, split candidates obtained from the dictionary, and candidates of join, hyphen, or apostrophe algorithmically; and submitting to a search engine the generated possible candidates characterized as must-join or must-split in the dictionary, to improve search results returned in response to the queries; applying a language dictionary to generated candidates not characterized as must-split or must-join, to rank them, and submitting those highest-ranked to the search engine.
Type: Application
Filed: April 15, 2008
Publication date: October 15, 2009
Applicant: Yahoo! Inc.
Inventors: Fuchun Peng, George H. Mills, Benoit Dumoulin
-
Publication number: 20090259629
Abstract: A method for handling abbreviations in web queries includes building a dictionary of a plurality of possible word expansions for a plurality of potential abbreviations related to query terms received or anticipated to be received by a search engine; accepting a query including an abbreviation; expanding the abbreviation into one of the plurality of word expansions if a probability that the expansion is correct is above a threshold value, wherein the probability is determined by taking into consideration a context of the abbreviation within the query, wherein the context includes at least anchor text; and sending the query with the expanded abbreviation to the search engine to generate a search results page related to the query.
Type: Application
Filed: April 15, 2008
Publication date: October 15, 2009
Applicant: Yahoo! Inc.
Inventors: Xing Wei, Fuchun Peng, Benoit Dumoulin
-
Publication number: 20090248595
Abstract: Computer-enabled methods, apparatus, and computer-readable media are provided for verifying that a given network name, such as a URL, is an official, e.g., registered, approved, or otherwise officially recognized, network name that refers to or identifies a principal, such as a business. These techniques involve receiving a principal name and a given network name, receiving at least one feature attribute from at least one database of feature attributes, wherein the at least one feature attribute comprises a characteristic of the principal name or a characteristic of the network name, and invoking a logistic regression method to generate a probability, based upon the at least one feature attribute, that the given network name is an official network name for the principal name. The logistic regression method may include a gradient boosting tree model that generates the probability based upon the at least one feature attribute.
Type: Application
Filed: March 31, 2008
Publication date: October 1, 2009
Inventors: Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Benoit Dumoulin
-
Patent number: 7571145
Abstract: Techniques for automatically scoring submissions to an online question-and-answer submission system are disclosed. According to one such technique, an initial set of user submissions are scored by human operators and/or automated algorithmic mechanisms. The submissions and their accompanying scores are provided as training data to an automated machine learning mechanism. The machine learning mechanism processes the training data and automatically detects patterns in the provided submissions. The machine learning mechanism automatically correlates these patterns with the scores assigned to the submissions that match those patterns. As a result, the machine learning mechanism is trained. Thereafter, the machine learning mechanism processes unscored submissions. The machine learning mechanism automatically identifies, from among the patterns that the machine learning mechanism has already detected, one or more patterns that these submissions match.
Type: Grant
Filed: October 18, 2006
Date of Patent: August 4, 2009
Assignee: Yahoo! Inc.
Inventors: Daniel Boies, Benoit Dumoulin, Remi Kwan
-
Publication number: 20080154819
Abstract: Techniques for automatically scoring submissions to an online question-and-answer submission system are disclosed. According to one such technique, an initial set of user submissions are scored by human operators and/or automated algorithmic mechanisms. The submissions and their accompanying scores are provided as training data to an automated machine learning mechanism. The machine learning mechanism processes the training data and automatically detects patterns in the provided submissions. The machine learning mechanism automatically correlates these patterns with the scores assigned to the submissions that match those patterns. As a result, the machine learning mechanism is trained. Thereafter, the machine learning mechanism processes unscored submissions. The machine learning mechanism automatically identifies, from among the patterns that the machine learning mechanism has already detected, one or more patterns that these submissions match.
Type: Application
Filed: October 18, 2006
Publication date: June 26, 2008
Inventors: Daniel Boies, Benoit Dumoulin, Remi Kwan
-
Patent number: 7206389
Abstract: A computerized method is provided for electronically directing a call to a class, such that an utterance spoken by a speaker and received by a call-routing system is classified by the call-routing system as being associated with the class, such that the call-routing system includes a speech-recognition module, a feature-extraction module, and a classification module. The method includes extracting features from recognized speech; weighting elements of a feature vector with respective speech-recognition scores, wherein each weighting element is associated with one of the features; ranking classes to which the features are associated; and electronically directing the call to a highest-ranking class.
Type: Grant
Filed: January 7, 2004
Date of Patent: April 17, 2007
Assignee: Nuance Communications, Inc.
Inventors: Benoit Dumoulin, Dominic Lavoie, Real Tremblay, Ben Shahshahani, Remi Ken-Sho Kwan
-
Patent number: 6868381
Abstract: A speech recognition system having an input for receiving an input signal indicative of a spoken utterance that is indicative of at least one speech element. The system further includes a first processing unit operative for processing the input signal to derive from a speech recognition dictionary a speech model associated to a given speech element that constitutes a potential match to the at least one speech element. The system further comprises a second processing unit for generating a modified version of the speech model on the basis of the input signal. The system further provides a third processing unit for processing the input signal on the basis of the modified version of the speech model to generate a recognition result indicative of whether the modified version of the at least one speech model constitutes a match to the input signal.
Type: Grant
Filed: December 21, 1999
Date of Patent: March 15, 2005
Assignee: Nortel Networks Limited
Inventors: Stephen Douglas Peters, Daniel Boies, Benoit Dumoulin
-
Patent number: 6856957
Abstract: A technique for identifying one or more items from amongst a plurality of items in response to a spoken utterance is used to improve call routing and information retrieval systems which employ automatic speech recognition (ASR). An automatic speech recognizer is used to recognize the utterance, including generating a plurality of hypotheses for the utterance. A query element is then generated for use in identifying one or more items from amongst the plurality of items. The query element includes a set of values representing two or more of the hypotheses, each value corresponding to one of the words in the hypotheses. Each value in the query element is then weighted based on hypothesis confidence, word confidence, or both, as determined by the ASR process. The query element is then applied to the plurality of items to identify one or more items which satisfy the query.
Type: Grant
Filed: February 7, 2001
Date of Patent: February 15, 2005
Assignee: Nuance Communications
Inventor: Benoit Dumoulin
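The confidence-weighted query element in this abstract can be sketched in a few lines. The sketch makes several assumptions that are not in the source: the product of hypothesis confidence and word confidence as the weighting, the keyword-set representation of items, the summed-weight matching rule, and the helper names `build_query_element` and `rank_items`. The confidences are made-up values.

```python
from collections import defaultdict


def build_query_element(hypotheses):
    """Combine several ASR hypotheses into one weighted bag of words.

    hypotheses: list of (hypothesis_confidence, [(word, word_confidence), ...])
    Each word's value is the sum over hypotheses of
    hypothesis_confidence * word_confidence (one possible weighting).
    """
    element = defaultdict(float)
    for hyp_conf, words in hypotheses:
        for word, word_conf in words:
            element[word] += hyp_conf * word_conf
    return dict(element)


def rank_items(element, items):
    """Order items by the summed weights of the query words they contain."""
    scores = {
        name: sum(w for word, w in element.items() if word in keywords)
        for name, keywords in items.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Two hypothetical hypotheses for the same utterance.
n_best = [
    (0.5, [("check", 1.0), ("my", 1.0), ("balance", 0.5)]),
    (0.25, [("check", 1.0), ("my", 1.0), ("ballots", 0.5)]),
]
items = {
    "billing": {"balance", "bill", "payment"},
    "elections": {"ballots", "vote"},
}
element = build_query_element(n_best)
print(rank_items(element, items))  # ['billing', 'elections']
```

Keeping words from the lower-confidence hypothesis, at reduced weight, lets the query still reach the right item when the top hypothesis contains a misrecognized word.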
-
Patent number: 6502070
Abstract: An apparatus for normalizing speech feature elements in a signal derived from a spoken utterance. The apparatus includes an input, a processing unit and an output. The input receives speech feature elements transmitted over a channel that induces a channel specific distortion in the speech feature elements. The processing unit is coupled to the input and is operative for altering the speech feature elements to generate normalized speech feature elements. The normalized speech feature elements simulate a transmission of the speech feature elements over a reference channel that is other than the channel over which the transmission actually takes place. The apparatus can be used as a speech recognition pre-processing unit to reduce channel related variability in the signal on which speech recognition is to be performed.
Type: Grant
Filed: April 28, 2000
Date of Patent: December 31, 2002
Assignee: Nortel Networks Limited
Inventors: Daniel Boies, Benoit Dumoulin, Stephen Douglas Peters