Patents by Inventor Philip Ogren

Philip Ogren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220343072
    Abstract: A natural language identity classifier system is described, which employs a supervised machine learning (ML) model to perform language identity classification on input text. The ML model takes, as input, non-lexicalized features of target text derived from subword tokenization of the text. Specifically, these non-lexicalized features are generated based on statistics determined for tokens identified for the input text. According to an embodiment, at least some of the non-lexicalized features are based on natural language-specific summary statistics that indicate how often tokens were found within a corpus for each natural language. Use of such summary statistics allows for generation of natural language specific conditional probability-based features.
    Type: Application
    Filed: April 22, 2021
    Publication date: October 27, 2022
    Inventor: Philip Ogren
  • Patent number: 10915233
    Abstract: The present disclosure describes techniques for entity classification and data enrichment of data sets. A data enrichment system is disclosed that can extract, repair, and enrich datasets, resulting in more precise entity resolution and classification for purposes of subsequent indexing and clustering. Disclosed techniques may include performing entity recognition to identify segments of interest that relate to an entity. Related data may be analyzed for classification, which can be used to transform the data for enrichment to its users.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: February 9, 2021
    Assignee: Oracle International Corporation
    Inventors: Alexander Sasha Stojanovic, Philip Ogren, Kevin L. Markey, Mark Kreider
  • Patent number: 10891272
    Abstract: The present disclosure relates generally to a data enrichment service that extracts, repairs, and enriches datasets, resulting in more precise entity resolution and correlation for purposes of subsequent indexing and clustering. The data enrichment service can include a visual recommendation engine and language for performing large-scale data preparation, repair, and enrichment of heterogeneous datasets. This enables the user to select and see how the recommended enrichments (e.g., transformations and repairs) will affect the user's data and make adjustments as needed. The data enrichment service can receive feedback from users through a user interface and can filter recommendations based on the user feedback.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: January 12, 2021
    Assignee: Oracle International Corporation
    Inventors: Alexander Sasha Stojanovic, Luis E. Rivas, Philip Ogren, Glenn Allen Murray
  • Patent number: 10885056
    Abstract: Techniques are disclosed for standardization of data. According to a first technique, standard representation terms are determined for to-be-standardized data using the to-be-standardized data itself and without using any external reference data. According to a second technique, a combination of the to-be-standardized data and an external reference is used to determine standard representation terms for the to-be-standardized data.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: January 5, 2021
    Assignee: Oracle International Corporation
    Inventors: Michael Malak, Luis E. Rivas, Mark L. Kreider, Philip Ogren, Robert James Oberbreckling
  • Patent number: 10482128
    Abstract: A string analysis tool for calculating a similarity metric between an input string and a plurality of strings in a collection to be searched. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the input string or plurality of strings to be searched) such that features from the strings may be eliminated from consideration when identifying candidate strings from the collection for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
    Type: Grant
    Filed: May 15, 2017
    Date of Patent: November 19, 2019
    Assignee: Oracle International Corporation
    Inventor: Philip Ogren
  • Publication number: 20190102441
    Abstract: Techniques are disclosed for standardization of data. According to a first technique, standard representation terms are determined for to-be-standardized data using the to-be-standardized data itself and without using any external reference data. According to a second technique, a combination of the to-be-standardized data and an external reference is used to determine standard representation terms for the to-be-standardized data.
    Type: Application
    Filed: September 25, 2018
    Publication date: April 4, 2019
    Applicant: Oracle International Corporation
    Inventors: Michael Malak, Luis E. Rivas, Mark L. Kreider, Philip Ogren, Robert James Oberbreckling
  • Publication number: 20180330015
    Abstract: A string analysis tool for calculating a similarity metric between an input string and a plurality of strings in a collection to be searched. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the input string or plurality of strings to be searched) such that features from the strings may be eliminated from consideration when identifying candidate strings from the collection for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
    Type: Application
    Filed: May 15, 2017
    Publication date: November 15, 2018
    Inventor: Philip Ogren
  • Publication number: 20160092474
    Abstract: The present disclosure relates generally to a data enrichment service that extracts, repairs, and enriches datasets, resulting in more precise entity resolution and correlation for purposes of subsequent indexing and clustering. The data enrichment service can include a visual recommendation engine and language for performing large-scale data preparation, repair, and enrichment of heterogeneous datasets. This enables the user to select and see how the recommended enrichments (e.g., transformations and repairs) will affect the user's data and make adjustments as needed. The data enrichment service can receive feedback from users through a user interface and can filter recommendations based on the user feedback.
    Type: Application
    Filed: September 24, 2015
    Publication date: March 31, 2016
    Inventors: Alexander Sasha Stojanovic, Luis E. Rivas, Philip Ogren, Glenn Allen Murray
  • Publication number: 20160092475
    Abstract: The present disclosure describes techniques for entity classification and data enrichment of data sets. A data enrichment system is disclosed that can extract, repair, and enrich datasets, resulting in more precise entity resolution and classification for purposes of subsequent indexing and clustering. Disclosed techniques may include performing entity recognition to identify segments of interest that relate to an entity. Related data may be analyzed for classification, which can be used to transform the data for enrichment to its users.
    Type: Application
    Filed: September 24, 2015
    Publication date: March 31, 2016
    Inventors: Alexander Sasha Stojanovic, Philip Ogren, Kevin L. Markey, Mark Kreider
  • Patent number: 9201869
    Abstract: Computer-based tools and methods for conversion of data from a first form to a second form without reference to the context of data to be converted. The conversion may be facilitated by matching source data with external information (e.g., public and/or private schema) that contain rules (e.g., context specific rules) for conversion of the data. The matching may be performed based on an optimized index string matching technique that may be operable to match source data to external information that is context dependent without specific identification of the context of either the source data or the external information identified. Accordingly, the conversion of data may be performed in an unsupervised machine learning environment.
    Type: Grant
    Filed: May 21, 2013
    Date of Patent: December 1, 2015
    Assignee: Oracle International Corporation
    Inventors: Philip Ogren, Luis Rivas, Edward A. Green
  • Patent number: 9070090
    Abstract: A string analysis tool for calculating a similarity metric between a source string and a plurality of target strings. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the source string or plurality of target strings) such that features from the strings may be eliminated from consideration when identifying target strings for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
    Type: Grant
    Filed: August 28, 2012
    Date of Patent: June 30, 2015
    Assignee: Oracle International Corporation
    Inventors: Philip Ogren, Luis Rivas, Edward A. Green
  • Publication number: 20140067363
    Abstract: Computer-based tools and methods for conversion of data from a first form to a second form without reference to the context of data to be converted. The conversion may be facilitated by matching source data with external information (e.g., public and/or private schema) that contain rules (e.g., context specific rules) for conversion of the data. The matching may be performed based on an optimized index string matching technique that may be operable to match source data to external information that is context dependent without specific identification of the context of either the source data or the external information identified. Accordingly, the conversion of data may be performed in an unsupervised machine learning environment.
    Type: Application
    Filed: May 21, 2013
    Publication date: March 6, 2014
    Inventors: Philip Ogren, Luis Rivas, Edward A. Green
  • Publication number: 20140067728
    Abstract: A string analysis tool for calculating a similarity metric between a source string and a plurality of target strings. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the source string or plurality of target strings) such that features from the strings may be eliminated from consideration when identifying target strings for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
    Type: Application
    Filed: August 28, 2012
    Publication date: March 6, 2014
    Applicant: Oracle International Corporation
    Inventors: Philip Ogren, Luis Rivas, Edward A. Green
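
Illustrative sketch (not taken from the patents): the string analysis abstracts above describe representing strings as features and eliminating features that cannot contribute to a similarity metric above a minimum threshold. The abstracts do not give the metric or the elimination rule, so the Python below substitutes a well-known stand-in, prefix filtering over Jaccard similarity on character bigrams. The function names (bigrams, build_index, search), the choice of bigrams, and the choice of Jaccard are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bigrams(s):
    """Represent a string as a set of character-bigram features (assumed feature type)."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def build_index(targets):
    """Inverted index: feature -> set of positions of target strings containing it."""
    index = defaultdict(set)
    for pos, target in enumerate(targets):
        for feature in bigrams(target):
            index[feature].add(pos)
    return index


def jaccard(a, b):
    """Jaccard similarity of two non-empty feature sets."""
    return len(a & b) / len(a | b)


def search(source, targets, index, min_sim):
    """Return (target, similarity) pairs with similarity >= min_sim (min_sim > 0).

    A target can only reach min_sim if it shares at least
    ceil(min_sim * n) of the source's n features. The (required - 1) most
    common source features can therefore be dropped from the index lookup:
    any qualifying target must still share one of the remaining features.
    """
    feats = bigrams(source)
    if not feats:
        return []
    # Small epsilon guards against float error pushing the ceiling too high.
    required = max(1, math.ceil(min_sim * len(feats) - 1e-9))
    # Rarest features first, so the most common ones are the ones eliminated.
    ordered = sorted(feats, key=lambda f: (len(index.get(f, ())), f))
    probe = ordered[:len(feats) - required + 1]

    candidates = set()
    for feature in probe:
        candidates |= index.get(feature, set())

    results = []
    for pos in sorted(candidates):
        sim = jaccard(feats, bigrams(targets[pos]))
        if sim >= min_sim:
            results.append((targets[pos], sim))
    return results


targets = ["oracle", "oracel", "orange", "banana"]
index = build_index(targets)
print(search("oracle", targets, index, 0.5))  # [('oracle', 1.0)]
```

With a threshold of 0.5, the two most common bigrams of "oracle" are never looked up, so "orange" is never scored. The full similarity metric is computed only for the candidates that survive the filter.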