Patents by Inventor Nick Pendar

Nick Pendar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Automated dynamic data quality assessment

Patent number: 9703823

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.

Type: Grant

Filed: May 23, 2016

Date of Patent: July 11, 2017

Assignee: Groupon, Inc.

Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
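
Illustrative sketch (not taken from the patent): the first aspect in the abstract above, where an oracle is consulted only when a new sample is selected for the reservoir, can be approximated as follows. The selection rule used here is classic reservoir sampling (Algorithm R), which is an assumption, since the abstract does not say how the decision is made. The names `QualityReservoir`, `offer`, and the toy `oracle` callable are hypothetical.

```python
import random
from typing import Callable, List, Tuple


class QualityReservoir:
    """Fixed-size reservoir of (sample, quality estimate) pairs.

    Assumption: admission is decided by Algorithm R reservoir sampling.
    The abstract only says a sample "is determined to be added".
    """

    def __init__(self, capacity: int, oracle: Callable[[str], float], seed: int = 0):
        self.capacity = capacity
        self.oracle = oracle          # hypothetical quality-verification service
        self.items: List[Tuple[str, float]] = []
        self.seen = 0                 # number of samples offered so far
        self.rng = random.Random(seed)

    def offer(self, sample: str) -> bool:
        """Return True if the sample was admitted (and the oracle was called)."""
        self.seen += 1
        if len(self.items) < self.capacity:
            slot = len(self.items)    # reservoir not full yet: always admit
        else:
            slot = self.rng.randrange(self.seen)
            if slot >= self.capacity:
                return False          # not selected, so the oracle is never consulted
        estimate = self.oracle(sample)  # quality verification request
        if slot == len(self.items):
            self.items.append((sample, estimate))
        else:
            self.items[slot] = (sample, estimate)
        return True


# Usage: a toy oracle that scores non-empty records as good.
reservoir = QualityReservoir(capacity=3, oracle=lambda s: 1.0 if s.strip() else 0.0)
for i in range(20):
    reservoir.offer(f"record-{i}")
```

The point of the design is that oracle calls, which are presumably the expensive step, happen only for samples that actually enter the reservoir.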
Multi-term query subsumption for document classification

Patent number: 9652527

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.

Type: Grant

Filed: June 30, 2016

Date of Patent: May 16, 2017

Assignee: Groupon, Inc.

Inventor: Nick Pendar
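
Illustrative sketch (not taken from the patent): one simple way to read "query subsumption" is to treat a multi-term query as a conjunction of terms, so that a query subsumes another when every document matching the second also matches the first. Under that assumption, subsumption reduces to a subset test on term sets. The actual calculus in the patent may differ, and the names `subsumes` and `subsumed_queries` are hypothetical. How the classifying set is updated afterwards is not specified in the abstract, so the sketch only reports which classifying queries are subsumed.

```python
from typing import FrozenSet, Iterable, List

Query = FrozenSet[str]  # assumption: a query is an AND of its terms


def subsumes(general: Query, specific: Query) -> bool:
    """True if every document matching `specific` also matches `general`."""
    return general <= specific


def subsumed_queries(determining: Iterable[Query],
                     classifying: Iterable[Query]) -> List[Query]:
    """Classifying queries subsumed by at least one determining query."""
    determining = list(determining)
    return [c for c in classifying
            if any(subsumes(d, c) for d in determining)]


determining = [frozenset({"massage"})]
classifying = [frozenset({"massage", "swedish"}), frozenset({"yoga", "hot"})]
hits = subsumed_queries(determining, classifying)
# hits contains only the "massage AND swedish" query
```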
Automated adaptive data analysis using dynamic data quality assessment

Patent number: 9600776

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.

Type: Grant

Filed: November 22, 2013

Date of Patent: March 21, 2017

Assignee: Groupon, Inc.

Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
Creating a Training Data Set Based on Unlabeled Textual Data

Publication number: 20170060993

Abstract: A system and method are disclosed for obtaining a plurality of unlabeled text documents; obtaining an initial concept; obtaining keywords from a knowledge source based on the initial concept; scoring the plurality of unlabeled documents based at least in part on the initial keywords; determining a categorization of the documents based on the scores; performing a first feature selection and creating a first vector space representation of each document in a first category and a second category, the first and second categories based on the scores, the first vector space representation serving as one or more labels for an associated unlabeled textual document; and generating the training set including a subset of the obtained unlabeled textual documents, the subset of the obtained unlabeled documents including documents belonging to the first category and documents belonging to the second category.

Type: Application

Filed: August 31, 2016

Publication date: March 2, 2017

Inventors: Nick Pendar, Zhuang Wang
MULTI-TERM QUERY SUBSUMPTION FOR DOCUMENT CLASSIFICATION

Publication number: 20170031927

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.

Type: Application

Filed: June 30, 2016

Publication date: February 2, 2017

Inventor: Nick Pendar
AUTOMATED DYNAMIC DATA QUALITY ASSESSMENT

Publication number: 20170024427

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.

Type: Application

Filed: May 23, 2016

Publication date: January 26, 2017

Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
METHOD, APPARATUS, AND COMPUTER PROGRAM PRODUCT FOR CLASSIFICATION AND TAGGING OF TEXTUAL DATA

Publication number: 20160314201

Abstract: Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.

Type: Application

Filed: February 23, 2016

Publication date: October 27, 2016

Inventor: Nick Pendar
Multi-term query subsumption for document classification

Patent number: 9411905

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.

Type: Grant

Filed: September 26, 2013

Date of Patent: August 9, 2016

Assignee: Groupon, Inc.

Inventor: Nick Pendar
Automated dynamic data quality assessment

Patent number: 9390112

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.

Type: Grant

Filed: November 22, 2013

Date of Patent: July 12, 2016

Assignee: Groupon, Inc.

Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
Method, apparatus, and computer program product for classification and tagging of textual data

Patent number: 9330167

Abstract: Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.

Type: Grant

Filed: May 13, 2013

Date of Patent: May 3, 2016

Assignee: Groupon, Inc.

Inventor: Nick Pendar
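
Illustrative sketch (not taken from the patent): the abstract above describes scoring generated queries by precision and keeping those that satisfy a threshold. A minimal version, assuming a query is a set of terms that must all appear in a document and that precision is the fraction of matched documents carrying the target label, could look like this. The names `precision` and `select_queries` and the toy corpus are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

Doc = Tuple[Set[str], Set[str]]  # (tokens in the document, labels on the document)


def precision(query: FrozenSet[str], label: str, corpus: Sequence[Doc]) -> float:
    """Fraction of documents matched by `query` that carry `label`."""
    matched = [labels for tokens, labels in corpus if query <= tokens]
    if not matched:
        return 0.0  # assumption: a query that matches nothing scores zero
    return sum(label in labels for labels in matched) / len(matched)


def select_queries(candidates: Sequence[FrozenSet[str]], label: str,
                   corpus: Sequence[Doc], threshold: float) -> List[FrozenSet[str]]:
    """Keep candidate queries whose precision meets the threshold."""
    return [q for q in candidates if precision(q, label, corpus) >= threshold]


corpus = [
    ({"deep", "tissue", "massage"}, {"massage"}),
    ({"massage", "chair", "store"}, {"retail"}),
    ({"swedish", "massage"}, {"massage"}),
]
candidates = [frozenset({"massage"}), frozenset({"deep", "massage"})]
kept = select_queries(candidates, "massage", corpus, threshold=0.9)
# kept holds only the two-term query; "massage" alone also matches the retail listing
```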
Optimizing a data integration process

Patent number: 9235652

Abstract: Embodiments of the present invention provide systems, methods and computer readable media for optimizing a data integration process. In embodiments, a system can be configured to represent the processing of a data record that includes attributes, and to use that representation to determine an optimal processing of that data record. In embodiments, the system represents the processing of a data record as an operator graph comprising nodes and edges, where each node is an operator node that represents an operator for implementing at least one logical operation on at least one of the attributes and each edge between a pair of nodes represents the movement of data between the nodes. In embodiments, each operator node includes one or more operator metrics (e.g. operator cost metrics and operator quality metrics). In embodiments, the system determines optimal processing of the data record by determining a best path within the operator graph.

Type: Grant

Filed: March 8, 2013

Date of Patent: January 12, 2016

Assignee: Groupon, Inc.

Inventors: Shawn Ryan Jeffery, Nick Pendar, Matt DeLand, Liwen Sun, Rajat Bhattacharjee
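
Illustrative sketch (not taken from the patent): the abstract above describes an operator graph whose nodes carry cost and quality metrics, with the optimal processing found as a best path. The sketch below assumes "best" means lowest total operator cost and ignores quality metrics, which is a simplification. It uses Dijkstra's algorithm with costs attached to nodes and requires non-negative costs. The function name `best_path` and the operator names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def best_path(edges: Dict[str, List[str]], cost: Dict[str, float],
              source: str, sink: str) -> Tuple[float, List[str]]:
    """Cheapest source-to-sink path, where each operator node has a cost."""
    heap = [(cost[source], source, [source])]
    settled = set()
    while heap:
        total, node, path = heapq.heappop(heap)
        if node == sink:
            return total, path
        if node in settled:
            continue
        settled.add(node)
        for nxt in edges.get(node, []):
            if nxt not in settled:
                heapq.heappush(heap, (total + cost[nxt], nxt, path + [nxt]))
    raise ValueError("no path from source to sink")


# Two alternative geocoding operators between parsing and deduplication.
edges = {
    "parse": ["geocode_fast", "geocode_precise"],
    "geocode_fast": ["dedupe"],
    "geocode_precise": ["dedupe"],
    "dedupe": [],
}
cost = {"parse": 1.0, "geocode_fast": 2.0, "geocode_precise": 5.0, "dedupe": 1.0}
total, path = best_path(edges, cost, "parse", "dedupe")
# total == 4.0, path == ["parse", "geocode_fast", "dedupe"]
```

A fuller version would fold the quality metrics into the objective, for example by minimizing cost subject to a minimum quality, but the abstract does not say how the two are combined.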
Discovery of new business openings using web content analysis

Patent number: 9122710

Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for identifying a new business based on programmatically analyzing content received from online sources and, as a result, discovering one or more references to the business. In embodiments, the system stores historical data representing previously identified new businesses and then uses attributes of those businesses in search queries to receive related content. Additionally or alternatively, the system stores data representing online sources that historically provided content containing references to new businesses and then continues to access those sources for additional content. In embodiments, the system performs content analysis on structured and/or unstructured content.

Type: Grant

Filed: March 12, 2013

Date of Patent: September 1, 2015

Assignee: Groupon, Inc.

Inventors: Shawn Ryan Jeffery, Nick Pendar, Richard Clark Barber
Method and system for clustering datasets

Patent number: 8515956

Abstract: A method and system for clustering a plurality of data elements is provided. According to embodiments of the present invention, a bit vector is generated based on each of the data elements. Bit operations are used to group each data element into a cluster. Clustering may be performed by partition clustering or hierarchical clustering. Embodiments of the present invention cluster data elements such as text documents, audio files, video files, photos, or other data files.

Type: Grant

Filed: May 4, 2010

Date of Patent: August 20, 2013

Assignee: H5

Inventor: Nick Pendar
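
Illustrative sketch (not taken from the patent): the abstract above describes generating a bit vector per data element and grouping elements with bit operations. One way to picture this for text is to set one bit per vocabulary term, compare vectors with a Jaccard similarity computed from AND, OR, and population counts, and assign each element to the first cluster whose leader is similar enough. The leader-based partitioning, the Jaccard measure, and all names here (`to_bits`, `jaccard`, `leader_cluster`) are assumptions, not the patented method.

```python
from typing import Dict, Iterable, List, Sequence


def to_bits(tokens: Iterable[str], vocab: Dict[str, int]) -> int:
    """Encode a token collection as an integer bit vector over `vocab`."""
    bits = 0
    for tok in tokens:
        if tok in vocab:
            bits |= 1 << vocab[tok]
    return bits


def jaccard(a: int, b: int) -> float:
    """Jaccard similarity of two bit vectors using AND, OR and popcount."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


def leader_cluster(vectors: Sequence[int], threshold: float) -> List[List[int]]:
    """Greedy partition clustering: join the first leader within threshold."""
    leaders: List[int] = []
    clusters: List[List[int]] = []
    for idx, vec in enumerate(vectors):
        for k, leader in enumerate(leaders):
            if jaccard(vec, leader) >= threshold:
                clusters[k].append(idx)
                break
        else:
            leaders.append(vec)      # no close leader: start a new cluster
            clusters.append([idx])
    return clusters


vocab = {"a": 0, "b": 1, "c": 2, "d": 3}
docs = [["a", "b"], ["a", "b", "c"], ["d"]]
vectors = [to_bits(d, vocab) for d in docs]
clusters = leader_cluster(vectors, threshold=0.5)
# clusters == [[0, 1], [2]]
```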
METHOD AND SYSTEM FOR CLUSTERING DATASETS

Publication number: 20100287160

Abstract: A method and system for clustering a plurality of data elements is provided. According to embodiments of the present invention, a bit vector is generated based on each of the data elements. Bit operations are used to group each data element into a cluster. Clustering may be performed by partition clustering or hierarchical clustering. Embodiments of the present invention cluster data elements such as text documents, audio files, video files, photos, or other data files.

Type: Application

Filed: May 4, 2010

Publication date: November 11, 2010

Inventor: Nick Pendar
