Patents by Inventor Nick Pendar

Nick Pendar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9703823
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
    Type: Grant
    Filed: May 23, 2016
    Date of Patent: July 11, 2017
    Assignee: Groupon, Inc.
    Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
  • Patent number: 9652527
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: May 16, 2017
    Assignee: Groupon, Inc.
    Inventor: Nick Pendar
  • Patent number: 9600776
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
    Type: Grant
    Filed: November 22, 2013
    Date of Patent: March 21, 2017
    Assignee: Groupon, Inc.
    Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
  • Publication number: 20170060993
    Abstract: A system and method are disclosed for obtaining a plurality of unlabeled text documents; obtaining an initial concept; obtaining keywords from a knowledge source based on the initial concept; scoring the plurality of unlabeled documents based at least in part on the initial keywords; determining a categorization of the documents based on the scores; performing a first feature selection and creating a first vector space representation of each document in a first category and a second category, the first and second categories based on the scores, the first vector space representation serving as one or more labels for an associated unlabeled textual document; and generating the training set including a subset of the obtained unlabeled textual documents, the subset of the obtained unlabeled documents including documents belonging to the first category and documents belonging to the second category.
    Type: Application
    Filed: August 31, 2016
    Publication date: March 2, 2017
    Inventors: Nick Pendar, Zhuang Wang
  • Publication number: 20170031927
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.
    Type: Application
    Filed: June 30, 2016
    Publication date: February 2, 2017
    Inventor: Nick Pendar
  • Publication number: 20170024427
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
    Type: Application
    Filed: May 23, 2016
    Publication date: January 26, 2017
    Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
  • Publication number: 20160314201
    Abstract: Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.
    Type: Application
    Filed: February 23, 2016
    Publication date: October 27, 2016
    Inventor: Nick Pendar
  • Patent number: 9411905
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for generating an optimal classifying query set for categorizing and/or labeling textual data based on a query subsumption calculus to determine, given two queries, whether one of the queries subsumes another. In one aspect, a method includes generating a group of determining queries based on analyzing text within a document; receiving a group of classifying queries; and, for each determining query within the group of determining queries, determining whether at least one of the classifying queries is subsumed by the determining query; and updating the group of classifying queries in an instance in which the classifying query is subsumed by the determining query.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: August 9, 2016
    Assignee: Groupon, Inc.
    Inventor: Nick Pendar
  • Patent number: 9390112
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
    Type: Grant
    Filed: November 22, 2013
    Date of Patent: July 12, 2016
    Assignee: Groupon, Inc.
    Inventors: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
  • Patent number: 9330167
    Abstract: Provided herein are systems, methods and computer readable media for classification and tagging of textual data. An example method may include accessing a corpus comprising a plurality of documents, each document having one or more labels indicative of services offered by a merchant, generating a query based on extracted features and the documents, generating a precision score for at least a portion of the generated query and selecting a subset of the generated queries based on an assigned precision score satisfying a precision score threshold, the selected subset of the generated queries configured to provide an indication of one or more labels to be applied to machine readable text. A second example method, utilized for tagging machine readable text with unknown labels, may include assigning a label to textual portions of the machine readable text based on results of the application of the queries.
    Type: Grant
    Filed: May 13, 2013
    Date of Patent: May 3, 2016
    Assignee: Groupon, Inc.
    Inventor: Nick Pendar
  • Patent number: 9235652
    Abstract: Embodiments of the present invention provide systems, methods and computer readable media for optimizing a data integration process. In embodiments, a system can be configured to represent the processing of a data record that includes attributes, and to use that representation to determine an optimal processing of that data record. In embodiments, the system represents the processing of a data record as an operator graph comprising nodes and edges, where each node is an operator node that represents an operator for implementing at least one logical operation on at least one of the attributes and each edge between a pair of nodes represents the movement of data between the nodes. In embodiments, each operator node includes one or more operator metrics (e.g. operator cost metrics and operator quality metrics). In embodiments, the system determines optimal processing of the data record by determining a best path within the operator graph.
    Type: Grant
    Filed: March 8, 2013
    Date of Patent: January 12, 2016
    Assignee: Groupon, Inc.
    Inventors: Shawn Ryan Jeffery, Nick Pendar, Matt DeLand, Liwen Sun, Rajat Bhattacharjee
  • Patent number: 9122710
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for identifying a new business based on programmatically analyzing content received from online sources and, as a result, discovering one or more references to the business. In embodiments, the system stores historical data representing previously identified new businesses and then uses attributes of those businesses in search queries to receive related content. Additionally or alternatively, the system stores data representing online sources that historically provided content containing references to new businesses and then continues to access those sources for additional content. In embodiments, the system performs content analysis on structured and/or unstructured content.
    Type: Grant
    Filed: March 12, 2013
    Date of Patent: September 1, 2015
    Assignee: Groupon, Inc.
    Inventors: Shawn Ryan Jeffery, Nick Pendar, Richard Clark Barber
  • Patent number: 8515956
    Abstract: A method and system for clustering a plurality of data elements is provided. According to embodiments of the present invention, a bit vector is generated based on each of the data elements. Bit operations are used to group each data element into a cluster. Clustering may be performed by partition clustering or hierarchical clustering. Embodiments of the present invention cluster data elements such as text documents, audio files, video files, photos, or other data files.
    Type: Grant
    Filed: May 4, 2010
    Date of Patent: August 20, 2013
    Assignee: H5
    Inventor: Nick Pendar
  • Publication number: 20100287160
    Abstract: A method and system for clustering a plurality of data elements is provided. According to embodiments of the present invention, a bit vector is generated based on each of the data elements. Bit operations are used to group each data element into a cluster. Clustering may be performed by partition clustering or hierarchical clustering. Embodiments of the present invention cluster data elements such as text documents, audio files, video files, photos, or other data files.
    Type: Application
    Filed: May 4, 2010
    Publication date: November 11, 2010
    Inventor: Nick Pendar
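
As a rough illustration of the bit-vector clustering idea summarized in the last two entries above, the sketch below encodes each text document as an integer bit vector and groups documents using only bit operations. It is not the patented method: the abstract does not say how bit vectors are built or compared, so the one-bit-per-token encoding, the Jaccard similarity, the greedy "leader" partitioning, and every function name and the 0.5 threshold are assumptions made for this example.

```python
from typing import Dict, List


def build_vocab(docs: List[str]) -> Dict[str, int]:
    """Assign each distinct lowercase token a bit position (assumed encoding)."""
    vocab: Dict[str, int] = {}
    for doc in docs:
        for token in doc.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def to_bit_vector(doc: str, vocab: Dict[str, int]) -> int:
    """Encode a document as an int whose set bits mark the tokens it contains."""
    bits = 0
    for token in doc.lower().split():
        if token in vocab:
            bits |= 1 << vocab[token]
    return bits


def jaccard(a: int, b: int) -> float:
    """Jaccard similarity computed with AND, OR and a population count."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # two empty vectors are treated as identical
    return bin(a & b).count("1") / union


def leader_cluster(vectors: List[int], threshold: float = 0.5) -> List[List[int]]:
    """Greedy partition clustering (assumed): join the first cluster whose
    leader is similar enough, otherwise start a new cluster."""
    leaders: List[int] = []
    clusters: List[List[int]] = []
    for idx, vec in enumerate(vectors):
        for c, leader in enumerate(leaders):
            if jaccard(vec, leader) >= threshold:
                clusters[c].append(idx)
                break
        else:
            # No existing leader was close enough; this vector leads a new cluster.
            leaders.append(vec)
            clusters.append([idx])
    return clusters


docs = ["red apple pie", "red apple tart", "blue ocean wave", "blue ocean tide"]
vocab = build_vocab(docs)
vectors = [to_bit_vector(d, vocab) for d in docs]
clusters = leader_cluster(vectors, threshold=0.5)
print(clusters)
```

With these four sample documents, the first two share two of four distinct tokens, as do the last two, so each pair meets the 0.5 threshold and forms its own cluster. Storing the vectors as Python integers keeps the comparison to a handful of bitwise operations per pair, which is the property the abstract emphasizes.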