Patents by Inventor Nick Pendar
Nick Pendar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9703823
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
Type: Grant
Filed: May 23, 2016
Date of Patent: July 11, 2017
Assignee: Groupon, Inc.
Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
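The first aspect in this abstract (decide whether a sample enters the reservoir, and only then ask the oracle for a quality estimate) can be illustrated with a minimal sketch. The abstract does not say how admission is decided, so the sketch assumes classic uniform reservoir sampling (Algorithm R); the `QualityReservoir` class, the `oracle` callable, and the `seed` parameter are hypothetical names introduced for illustration only.

```python
import random
from typing import Any, Callable, List, Tuple


class QualityReservoir:
    """Fixed-size sample reservoir that consults an oracle only on admission.

    Assumption: admission follows uniform reservoir sampling (Algorithm R).
    The patent abstract does not specify the admission rule.
    """

    def __init__(self, capacity: int, oracle: Callable[[Any], float], seed: int = 0):
        self.capacity = capacity
        self.oracle = oracle              # hypothetical quality-verification service
        self.items: List[Tuple[Any, float]] = []
        self.seen = 0                     # number of samples offered so far
        self._rng = random.Random(seed)

    def offer(self, sample: Any) -> bool:
        """Offer a new sample; return True if it was added to the reservoir."""
        self.seen += 1
        if len(self.items) < self.capacity:
            # Reservoir not yet full: always admit, and pay for one oracle call.
            self.items.append((sample, self.oracle(sample)))
            return True
        # Reservoir full: admit with probability capacity / seen.
        slot = self._rng.randrange(self.seen)
        if slot >= self.capacity:
            return False                  # rejected, so no oracle request is sent
        self.items[slot] = (sample, self.oracle(sample))
        return True
```

The design point the sketch tries to show is that the oracle, which is presumably the expensive step, is called once per admitted sample rather than once per incoming sample.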
-
Patent number: 9652527
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.
Type: Grant
Filed: June 30, 2016
Date of Patent: May 16, 2017
Assignee: Groupon, Inc.
Inventor: Nick Pendar
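To make the idea of query subsumption concrete, here is a minimal sketch under a strong simplifying assumption: each query is a plain conjunction (AND) of terms. In that model a query with fewer required terms matches every text that a stricter query matches, so subsumption reduces to a subset test. The abstract does not describe the actual calculus or the update rule; `Query`, `subsumes`, and `prune_subsumed` are hypothetical names, and dropping subsumed queries is only one possible way to "update" the classifying set.

```python
from typing import FrozenSet, Iterable, List

# Assumption: a query is a conjunction of terms, modelled as a frozenset.
Query = FrozenSet[str]


def subsumes(general: Query, specific: Query) -> bool:
    """True if every text matched by `specific` is also matched by `general`.

    For pure AND-queries this holds exactly when the terms required by
    `general` are a subset of the terms required by `specific`.
    """
    return general <= specific


def prune_subsumed(determining: Iterable[Query],
                   classifying: Iterable[Query]) -> List[Query]:
    """Drop classifying queries that some determining query subsumes.

    This is one illustrative update policy, not the patented one.
    """
    det = list(determining)
    return [c for c in classifying if not any(subsumes(d, c) for d in det)]


spa = frozenset({"spa"})
spa_massage = frozenset({"spa", "massage"})
pizza = frozenset({"pizza"})
remaining = prune_subsumed([spa], [spa_massage, pizza])
```

Here `spa` subsumes `spa_massage`, so only `pizza` remains in the classifying set.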
-
Patent number: 9600776
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
Type: Grant
Filed: November 22, 2013
Date of Patent: March 21, 2017
Assignee: Groupon, Inc.
Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
-
Publication number: 20170060993
Abstract: A system and method are disclosed for obtaining a plurality of unlabeled text documents; obtaining an initial concept; obtaining keywords from a knowledge source based on the initial concept; scoring the plurality of unlabeled documents based at least in part on the initial keywords; determining a categorization of the documents based on the scores; performing a first feature selection and creating a first vector space representation of each document in a first category and a second category, the first and second categories based on the scores, the first vector space representation serving as one or more labels for an associated unlabeled textual document; and generating the training set including a subset of the obtained unlabeled textual documents, the subset of the obtained unlabeled documents including documents belonging to the first category and documents belonging to the second category.
Type: Application
Filed: August 31, 2016
Publication date: March 2, 2017
Inventors: Nick Pendar, Zhuang Wang
-
Publication number: 20170031927
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.
Type: Application
Filed: June 30, 2016
Publication date: February 2, 2017
Inventor: Nick Pendar
-
Publication number: 20170024427
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
Type: Application
Filed: May 23, 2016
Publication date: January 26, 2017
Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
-
Publication number: 20160314201
Abstract: Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.
Type: Application
Filed: February 23, 2016
Publication date: October 27, 2016
Inventor: Nick Pendar
-
Patent number: 9411905
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.
Type: Grant
Filed: September 26, 2013
Date of Patent: August 9, 2016
Assignee: Groupon, Inc.
Inventor: Nick Pendar
-
Patent number: 9390112
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
Type: Grant
Filed: November 22, 2013
Date of Patent: July 12, 2016
Assignee: Groupon, Inc.
Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
-
Patent number: 9330167
Abstract: Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.
Type: Grant
Filed: May 13, 2013
Date of Patent: May 3, 2016
Assignee: Groupon, Inc.
Inventor: Nick Pendar
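The selection step described in this abstract (score each generated query for precision against a labeled corpus, then keep those that meet a threshold) can be sketched as follows. The sketch assumes that a query is a set of words that must all appear in a text, and that precision is the fraction of matched documents carrying the target label. The abstract does not define either; the function names, the sample corpus, and the 0.9 threshold are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Doc = Tuple[str, Set[str]]   # (text, labels); a hypothetical corpus record
Query = FrozenSet[str]       # assumption: all words must occur in the text


def precision(query: Query, label: str, corpus: List[Doc]) -> float:
    """Fraction of documents matched by `query` that carry `label`."""
    matched = [labels for text, labels in corpus
               if query <= set(text.lower().split())]
    if not matched:
        return 0.0           # a query that matches nothing earns no credit
    return sum(label in labels for labels in matched) / len(matched)


def select_queries(candidates: List[Query], label: str,
                   corpus: List[Doc], threshold: float = 0.9) -> List[Query]:
    """Keep the candidate queries whose precision meets the threshold."""
    return [q for q in candidates if precision(q, label, corpus) >= threshold]


corpus: List[Doc] = [
    ("deep tissue massage", {"massage"}),
    ("massage and facial", {"massage", "facial"}),
    ("car wash and wax", {"car wash"}),
    ("facial wax", {"facial"}),
]
candidates = [frozenset({"massage"}), frozenset({"and"})]
selected = select_queries(candidates, "massage", corpus)
```

In this toy corpus the query `{"massage"}` only matches documents labeled "massage", while `{"and"}` matches one labeled and one unlabeled document, so only the first survives the threshold.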
-
Patent number: 9235652
Abstract: Embodiments of the present invention provide systems, methods and computer readable media for optimizing a data integration process. In embodiments, a system can be configured to represent the processing of a data record that includes attributes, and to use that representation to determine an optimal processing of that data record. In embodiments, the system represents the processing of a data record as an operator graph comprising nodes and edges, where each node is an operator node that represents an operator for implementing at least one logical operation on at least one of the attributes and each edge between a pair of nodes represents the movement of data between the nodes. In embodiments, each operator node includes one or more operator metrics (e.g. operator cost metrics and operator quality metrics). In embodiments, the system determines optimal processing of the data record by determining a best path within the operator graph.
Type: Grant
Filed: March 8, 2013
Date of Patent: January 12, 2016
Assignee: Groupon, Inc.
Inventors: Shawn Ryan Jeffery, Nick Pendar, Matt DeLand, Liwen Sun, Rajat Bhattacharjee
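A minimal sketch of "best path within the operator graph" follows. It assumes the graph is acyclic, that each operator node carries one cost and one quality number, and that the objective is to minimise the sum of `cost - quality_weight * quality` along the path. The abstract names cost and quality metrics but does not say how they are combined, so the scoring rule, the `best_path` function, and the sample operators are all hypothetical.

```python
from functools import lru_cache
from typing import Dict, List, Optional, Tuple


def best_path(metrics: Dict[str, Tuple[float, float]],
              edges: Dict[str, List[str]],
              source: str, sink: str,
              quality_weight: float = 1.0) -> Optional[Tuple[float, List[str]]]:
    """Return (total score, node list) of the lowest-scoring source-to-sink path.

    Assumes the operator graph is a DAG; returns None if the sink is unreachable.
    """

    def score(node: str) -> float:
        cost, quality = metrics[node]
        return cost - quality_weight * quality   # assumed trade-off rule

    @lru_cache(maxsize=None)                     # each node is solved once
    def best_from(node: str):
        if node == sink:
            return score(node), (node,)
        options = [r for r in map(best_from, edges.get(node, ())) if r is not None]
        if not options:
            return None
        total, path = min(options)
        return score(node) + total, (node,) + path

    result = best_from(source)
    return None if result is None else (result[0], list(result[1]))


# Hypothetical operators: node -> (cost, quality)
metrics = {"read": (1.0, 0.0), "clean_fast": (1.0, 0.5),
           "clean_slow": (3.0, 1.0), "write": (1.0, 0.0)}
edges = {"read": ["clean_fast", "clean_slow"],
         "clean_fast": ["write"], "clean_slow": ["write"]}
cheap = best_path(metrics, edges, "read", "write")
careful = best_path(metrics, edges, "read", "write", quality_weight=5.0)
```

With the default weight the cheaper `clean_fast` operator is chosen; raising `quality_weight` makes the slower, higher-quality operator win, which is the kind of cost and quality trade-off the abstract alludes to.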
-
Patent number: 9122710
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for identifying a new business based on programmatically analyzing content received from online sources and, as a result, discovering one or more references to the business. In embodiments, the system stores historical data representing previously identified new businesses and then uses attributes of those businesses in search queries to receive related content. Additionally or alternatively, the system stores data representing online sources that historically provided content containing references to new businesses and then continues to access those sources for additional content. In embodiments, the system performs content analysis on structured and/or unstructured content.
Type: Grant
Filed: March 12, 2013
Date of Patent: September 1, 2015
Assignee: Groupon, Inc.
Inventors: Shawn Ryan Jeffery, Nick Pendar, Richard Clark Barber
-
Patent number: 8515956
Abstract: A method and system for clustering a plurality of data elements is provided. According to embodiments of the present invention, a bit vector is generated based on each of the data elements. Bit operations are used to group each data element into a cluster. Clustering may be performed by partition clustering or hierarchical clustering. Embodiments of the present invention cluster data elements such as text documents, audio files, video files, photos, or other data files.
Type: Grant
Filed: May 4, 2010
Date of Patent: August 20, 2013
Assignee: H5
Inventor: Nick Pendar
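As an illustration of clustering with bit vectors and bit operations, the sketch below encodes each text as an integer whose set bits mark the words it contains, measures similarity with bitwise AND and OR (Jaccard similarity over the set bits), and groups elements with a simple greedy partition scheme. The abstract does not specify the encoding, the similarity measure, or the grouping rule, so all three are assumptions, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def to_bits(text: str, vocab: Dict[str, int]) -> int:
    """Encode a text as a bit vector: one bit per distinct word (assumed encoding)."""
    bits = 0
    for word in set(text.lower().split()):
        index = vocab.setdefault(word, len(vocab))  # assign the next free bit
        bits |= 1 << index
    return bits


def popcount(x: int) -> int:
    return bin(x).count("1")


def jaccard(a: int, b: int) -> float:
    """Shared bits divided by total bits, using only AND, OR and popcount."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 1.0


def cluster(vectors: List[int], threshold: float = 0.5) -> List[List[int]]:
    """Greedy partition clustering: join the first cluster whose leader is similar enough."""
    clusters: List[Tuple[int, List[int]]] = []      # (leader bits, member indices)
    for i, v in enumerate(vectors):
        for leader, members in clusters:
            if jaccard(leader, v) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((v, [i]))               # start a new cluster
    return [members for _, members in clusters]


vocab: Dict[str, int] = {}
docs = ["red apple pie", "apple pie red", "blue ocean wave", "ocean wave"]
groups = cluster([to_bits(d, vocab) for d in docs])
```

For these four sample texts the two fruit documents end up in one group and the two ocean documents in another. Python integers are arbitrary precision, so the vocabulary can grow past 64 words without changing the code.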
-
Publication number: 20100287160
Abstract: A method and system for clustering a plurality of data elements is provided. According to embodiments of the present invention, a bit vector is generated based on each of the data elements. Bit operations are used to group each data element into a cluster. Clustering may be performed by partition clustering or hierarchical clustering. Embodiments of the present invention cluster data elements such as text documents, audio files, video files, photos, or other data files.
Type: Application
Filed: May 4, 2010
Publication date: November 11, 2010
Inventor: Nick Pendar