Patents by Inventor Philip Ogren
Philip Ogren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220343072
Abstract: A natural language identity classifier system is described, which employs a supervised machine learning (ML) model to perform language identity classification on input text. The ML model takes, as input, non-lexicalized features of target text derived from subword tokenization of the text. Specifically, these non-lexicalized features are generated based on statistics determined for tokens identified for the input text. According to an embodiment, at least some of the non-lexicalized features are based on natural language-specific summary statistics that indicate how often tokens were found within a corpus for each natural language. Use of such summary statistics allows for generation of natural language-specific conditional probability-based features.
Type: Application
Filed: April 22, 2021
Publication date: October 27, 2022
Inventor: Philip Ogren
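To make the idea of non-lexicalized, per-language features concrete, here is a minimal Python sketch. It is an illustration only, not the method claimed in the application: the character-bigram `tokenize` function stands in for a real subword tokenizer, the names `corpus_stats` and `features` are hypothetical, and the conditional probability estimate assumes the per-language corpora are of comparable size. The resulting numeric features (rather than the tokens themselves) are what a supervised classifier would consume.

```python
from collections import Counter


def tokenize(text, n=2):
    """Stand-in for a real subword tokenizer: overlapping character n-grams."""
    t = text.lower()
    return [t[i:i + n] for i in range(len(t) - n + 1)]


def corpus_stats(corpora):
    """Count how often each token occurs in each language's corpus."""
    return {
        lang: Counter(tok for doc in docs for tok in tokenize(doc))
        for lang, docs in corpora.items()
    }


def features(text, stats):
    """Build per-language summary features that never expose token identity."""
    tokens = tokenize(text)
    n = max(len(tokens), 1)  # avoid division by zero on very short input
    feats = {}
    for lang, counts in stats.items():
        seen = 0    # tokens observed at least once in this language's corpus
        cond = 0.0  # running sum of estimated P(lang | token)
        for tok in tokens:
            total = sum(c[tok] for c in stats.values())
            if counts[tok] > 0:
                seen += 1
            if total:
                cond += counts[tok] / total
        feats[f"{lang}_seen_frac"] = seen / n
        feats[f"{lang}_mean_p_lang_given_tok"] = cond / n
    return feats


# Toy corpora, invented for this example.
corpora = {
    "en": ["the cat sat on the mat"],
    "es": ["el gato se sentó en la alfombra"],
}
stats = corpus_stats(corpora)
example = features("the cat", stats)
```

Because the feature names depend only on the language and the statistic, the feature vector has a fixed width no matter how large the token vocabulary is.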
-
Patent number: 10915233
Abstract: The present disclosure describes techniques for entity classification and data enrichment of data sets. A data enrichment system is disclosed that can extract, repair, and enrich datasets, resulting in more precise entity resolution and classification for purposes of subsequent indexing and clustering. Disclosed techniques may include performing entity recognition to identify segments of interest that relate to an entity. Related data may be analyzed for classification, which can be used to transform the data for enrichment to its users.
Type: Grant
Filed: September 24, 2015
Date of Patent: February 9, 2021
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Alexander Sasha Stojanovic, Philip Ogren, Kevin L. Markey, Mark Kreider
-
Patent number: 10891272
Abstract: The present disclosure relates generally to a data enrichment service that extracts, repairs, and enriches datasets, resulting in more precise entity resolution and correlation for purposes of subsequent indexing and clustering. The data enrichment service can include a visual recommendation engine and language for performing large-scale data preparation, repair, and enrichment of heterogeneous datasets. This enables the user to select and see how the recommended enrichments (e.g., transformations and repairs) will affect the user's data and make adjustments as needed. The data enrichment service can receive feedback from users through a user interface and can filter recommendations based on the user feedback.
Type: Grant
Filed: September 24, 2015
Date of Patent: January 12, 2021
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Alexander Sasha Stojanovic, Luis E. Rivas, Philip Ogren, Glenn Allen Murray
-
Patent number: 10885056
Abstract: Techniques are disclosed for standardization of data. According to a first technique, standard representation terms are determined for to-be-standardized data using the to-be-standardized data itself and without using any external reference data. According to a second technique, a combination of the to-be-standardized data and an external reference is used to determine standard representation terms for the to-be-standardized data.
Type: Grant
Filed: September 25, 2018
Date of Patent: January 5, 2021
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Michael Malak, Luis E. Rivas, Mark L. Kreider, Philip Ogren, Robert James Oberbreckling
-
Patent number: 10482128
Abstract: A string analysis tool for calculating a similarity metric between an input string and a plurality of strings in a collection to be searched. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the input string or plurality of strings to be searched) such that features from the strings may be eliminated from consideration when identifying candidate strings from the collection for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
Type: Grant
Filed: May 15, 2017
Date of Patent: November 19, 2019
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventor: Philip Ogren
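The general idea of eliminating features based on a minimum similarity threshold can be illustrated with a short Python sketch. This is not the patented method: it swaps in a well-known technique, prefix filtering over an inverted index with Jaccard similarity on character trigrams, and the function names (`ngrams`, `build_index`, `search`) and sample strings are hypothetical. The key observation is that if a candidate must share at least `required` features with the query, then any `len(q) - required + 1` query features are guaranteed to include at least one shared feature, so the remaining query features never need to be looked up.

```python
import math
from collections import defaultdict


def ngrams(s, n=3):
    """Return the set of character n-grams of a padded, lower-cased string."""
    pad = "#" * (n - 1)
    padded = f"{pad}{s.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(collection, n=3):
    """Map each feature to the ids of the strings that contain it."""
    index = defaultdict(set)
    feats = []
    for i, s in enumerate(collection):
        f = ngrams(s, n)
        feats.append(f)
        for g in f:
            index[g].add(i)
    return index, feats


def search(query, collection, index, feats, min_sim=0.5, n=3):
    """Return (string, Jaccard similarity) pairs with similarity >= min_sim.

    Assumes 0 < min_sim <= 1.
    """
    q = ngrams(query, n)
    # Jaccard >= min_sim implies at least this many shared features.
    # The small epsilon guards against floating-point overshoot in the product.
    required = math.ceil(min_sim * len(q) - 1e-9)
    # Probe the rarest features first; the features left out of the probe
    # list cannot, on their own, produce a match above the threshold.
    ordered = sorted(q, key=lambda g: (len(index.get(g, ())), g))
    probe = ordered[:len(q) - required + 1]
    candidates = set()
    for g in probe:
        candidates |= index.get(g, set())
    results = []
    for i in candidates:
        sim = len(q & feats[i]) / len(q | feats[i])
        if sim >= min_sim:
            results.append((collection[i], sim))
    return sorted(results, key=lambda r: -r[1])


# Invented sample data for the example.
collection = ["oracle corporation", "oracle corp", "orange county", "philip ogren"]
index, feats = build_index(collection)
matches = search("oracle corp.", collection, index, feats, min_sim=0.4)
```

Raising `min_sim` shrinks the probe list, so fewer index lookups and fewer full similarity computations are needed, which is the kind of saving that matters for large collections.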
-
Publication number: 20190102441
Abstract: Techniques are disclosed for standardization of data. According to a first technique, standard representation terms are determined for to-be-standardized data using the to-be-standardized data itself and without using any external reference data. According to a second technique, a combination of the to-be-standardized data and an external reference is used to determine standard representation terms for the to-be-standardized data.
Type: Application
Filed: September 25, 2018
Publication date: April 4, 2019
Applicant: Oracle International Corporation
Inventors: Michael Malak, Luis E. Rivas, Mark L. Kreider, Philip Ogren, Robert James Oberbreckling
-
Publication number: 20180330015
Abstract: A string analysis tool for calculating a similarity metric between an input string and a plurality of strings in a collection to be searched. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the input string or plurality of strings to be searched) such that features from the strings may be eliminated from consideration when identifying candidate strings from the collection for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
Type: Application
Filed: May 15, 2017
Publication date: November 15, 2018
Inventor: Philip Ogren
-
Publication number: 20160092474
Abstract: The present disclosure relates generally to a data enrichment service that extracts, repairs, and enriches datasets, resulting in more precise entity resolution and correlation for purposes of subsequent indexing and clustering. The data enrichment service can include a visual recommendation engine and language for performing large-scale data preparation, repair, and enrichment of heterogeneous datasets. This enables the user to select and see how the recommended enrichments (e.g., transformations and repairs) will affect the user's data and make adjustments as needed. The data enrichment service can receive feedback from users through a user interface and can filter recommendations based on the user feedback.
Type: Application
Filed: September 24, 2015
Publication date: March 31, 2016
Inventors: Alexander Sasha Stojanovic, Luis E. Rivas, Philip Ogren, Glenn Allen Murray
-
Publication number: 20160092475
Abstract: The present disclosure describes techniques for entity classification and data enrichment of data sets. A data enrichment system is disclosed that can extract, repair, and enrich datasets, resulting in more precise entity resolution and classification for purposes of subsequent indexing and clustering. Disclosed techniques may include performing entity recognition to identify segments of interest that relate to an entity. Related data may be analyzed for classification, which can be used to transform the data for enrichment to its users.
Type: Application
Filed: September 24, 2015
Publication date: March 31, 2016
Inventors: Alexander Sasha Stojanovic, Philip Ogren, Kevin L. Markey, Mark Kreider
-
Patent number: 9201869
Abstract: Computer-based tools and methods for conversion of data from a first form to a second form without reference to the context of data to be converted. The conversion may be facilitated by matching source data with external information (e.g., public and/or private schema) that contain rules (e.g., context specific rules) for conversion of the data. The matching may be performed based on an optimized index string matching technique that may be operable to match source data to external information that is context dependent without specific identification of the context of either the source data or the external information identified. Accordingly, the conversion of data may be performed in an unsupervised machine learning environment.
Type: Grant
Filed: May 21, 2013
Date of Patent: December 1, 2015
Assignee: Oracle International Corporation
Inventors: Philip Ogren, Luis Rivas, Edward A. Green
-
Scalable string matching as a component for unsupervised learning in semantic meta-model development
Patent number: 9070090
Abstract: A string analysis tool for calculating a similarity metric between a source string and a plurality of target strings. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the source string or plurality of target strings) such that features from the strings may be eliminated from consideration when identifying target strings for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
Type: Grant
Filed: August 28, 2012
Date of Patent: June 30, 2015
Assignee: Oracle International Corporation
Inventors: Philip Ogren, Luis Rivas, Edward A. Green
-
Publication number: 20140067363
Abstract: Computer-based tools and methods for conversion of data from a first form to a second form without reference to the context of data to be converted. The conversion may be facilitated by matching source data with external information (e.g., public and/or private schema) that contain rules (e.g., context specific rules) for conversion of the data. The matching may be performed based on an optimized index string matching technique that may be operable to match source data to external information that is context dependent without specific identification of the context of either the source data or the external information identified. Accordingly, the conversion of data may be performed in an unsupervised machine learning environment.
Type: Application
Filed: May 21, 2013
Publication date: March 6, 2014
Inventors: Philip Ogren, Luis Rivas, Edward A. Green
-
SCALABLE STRING MATCHING AS A COMPONENT FOR UNSUPERVISED LEARNING IN SEMANTIC META-MODEL DEVELOPMENT
Publication number: 20140067728
Abstract: A string analysis tool for calculating a similarity metric between a source string and a plurality of target strings. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the source string or plurality of target strings) such that features from the strings may be eliminated from consideration when identifying target strings for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
Type: Application
Filed: August 28, 2012
Publication date: March 6, 2014
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventors: Philip Ogren, Luis Rivas, Edward A. Green