Patents by Inventor Paul O'Hara
Paul O'Hara has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12205203
Abstract: Using approximated bin intervals to label the histograms provides clarity and allows for the histogram to be more intuitively understood. A dataset may comprise a plurality of records having a plurality of features including one or more continuous features. A selection of a continuous feature may be obtained. A bin width based on a number of bins and feature statistics of the continuous feature may be determined. An approximated bin interval range is determined by applying a bin mask based on the bin width to the feature statistics. An approximated bin width is determined based on the number of bins and the approximated bin interval range. Approximated bin intervals for the histogram are determined based on the approximated bin width. A histogram is generated having bins with intervals based on the approximated bin intervals.
Type: Grant
Filed: July 12, 2023
Date of Patent: January 21, 2025
Assignee: BUSINESS OBJECTS SOFTWARE LTD
Inventors: Paul O'Hara, Malte Christian Kaufmann, Esther Rodrigo Ortiz, Conor White
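The abstract above does not define the bin mask, so the following is only a rough sketch of the general idea, not the patented method. It assumes the feature statistics are the minimum and maximum, and that the mask is the power of ten at the magnitude of the raw bin width; the function name approximated_bin_edges and its parameters are hypothetical.

```python
import math


def approximated_bin_edges(min_value, max_value, num_bins):
    """Return num_bins + 1 rounded bin edges covering [min_value, max_value]."""
    if num_bins < 1 or max_value <= min_value:
        raise ValueError("need num_bins >= 1 and max_value > min_value")
    # Raw bin width from the feature statistics (min, max) and the bin count.
    raw_width = (max_value - min_value) / num_bins
    # Assumed "bin mask": the power of ten at the magnitude of the raw width.
    mask = 10.0 ** math.floor(math.log10(raw_width))
    # Approximated range: snap the minimum down and the maximum up to
    # multiples of the mask.
    lo = math.floor(min_value / mask) * mask
    hi = math.ceil(max_value / mask) * mask
    # Approximated bin width from the bin count and the approximated range.
    width = (hi - lo) / num_bins
    return [lo + i * width for i in range(num_bins + 1)]
```

Under these assumptions, approximated_bin_edges(3.2, 97.6, 5) gives the edges 0, 20, 40, 60, 80, 100 instead of raw edges such as 3.2, 22.08, 40.96. The sketch only rounds the range, so the resulting width is not guaranteed to be a round number for every bin count.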
-
Publication number: 20250013668
Abstract: Anomalies may be detected using a multiple machine learning model anomaly detection framework. A clustering model is trained using an unsupervised machine learning algorithm on a historical anomaly dataset. A plurality of clusters of records are determined by applying the historical anomaly dataset to the clustering model. Then it is determined whether each cluster of the plurality of clusters is an anomaly-type cluster or a normal-type cluster. The plurality of labels for the plurality of records are updated based on the particular record's cluster classification. Non-pure clusters are determined from among the plurality of clusters based on a purity threshold. A supervised machine learning model is trained for each of the non-pure clusters using the records in the given cluster and the labels for each of those records. Then, predictions of an anomaly are made using the clustering model and the supervised machine learning models.
Type: Application
Filed: June 26, 2024
Publication date: January 9, 2025
Inventors: Paul O'Hara, Ying Wu, Malte Christian Kaufmann
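The purity step in the abstract above can be illustrated with a small sketch. The abstract does not say how purity is computed, so this assumes it is the share of a cluster's records that carry the cluster's majority label; the function name classify_clusters and the default threshold of 0.9 are hypothetical. Cluster assignments are taken as given, so any clustering model could supply them.

```python
from collections import Counter, defaultdict


def classify_clusters(cluster_ids, labels, purity_threshold=0.9):
    """Map each cluster id to (majority_label, purity, is_pure)."""
    # Group the record labels by the cluster each record was assigned to.
    members = defaultdict(list)
    for cid, label in zip(cluster_ids, labels):
        members[cid].append(label)
    result = {}
    for cid, cluster_labels in members.items():
        # Assumed rule: the cluster type is its majority label, and purity
        # is the fraction of records that share that label.
        majority_label, majority_count = Counter(cluster_labels).most_common(1)[0]
        purity = majority_count / len(cluster_labels)
        result[cid] = (majority_label, purity, purity >= purity_threshold)
    return result
```

Clusters flagged as not pure would be the ones that get their own supervised model in the framework described; pure clusters could be predicted from the cluster type alone.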
-
Patent number: 12079196
Abstract: The present disclosure provides for accurate and efficient identification of candidate features. An input dataset comprising one or more continuous features and one or more categorical features is obtained. A number of categorical feature categories based on the one or more categorical features is determined. Record counts for each of the categorical feature categories are determined. Skew statistics for each category are determined based on the record counts for each of the categorical feature categories. Cardinality skew factors for each of the one or more categorical features are then determined based on the record counts and the skew statistics. A number of the one or more categorical features having the highest cardinality skew factors are selected from among the cardinality skew factors. Then, a top contributor deviation analysis is performed using the selected number of the categorical features having the highest cardinality skew factors.
Type: Grant
Filed: October 8, 2021
Date of Patent: September 3, 2024
Assignee: BUSINESS OBJECTS SOFTWARE LTD
Inventors: Paul O'Hara, Malte Christian Kaufmann, Alan McShane
-
Patent number: 12050628
Abstract: Anomalies may be detected using a multiple machine learning model anomaly detection framework. A clustering model is trained using an unsupervised machine learning algorithm on a historical anomaly dataset. A plurality of clusters of records are determined by applying the historical anomaly dataset to the clustering model. Then it is determined whether each cluster of the plurality of clusters is an anomaly-type cluster or a normal-type cluster. The plurality of labels for the plurality of records are updated based on the particular record's cluster classification. Non-pure clusters are determined from among the plurality of clusters based on a purity threshold. A supervised machine learning model is trained for each of the non-pure clusters using the records in the given cluster and the labels for each of those records. Then, predictions of an anomaly are made using the clustering model and the supervised machine learning models.
Type: Grant
Filed: July 6, 2023
Date of Patent: July 30, 2024
Assignee: BUSINESS OBJECTS SOFTWARE LTD
Inventors: Paul O'Hara, Ying Wu, Malte Christian Kaufmann
-
Publication number: 20240193462
Abstract: A system may obtain a plurality of historical feature contribution score (FCS) datasets, each historical FCS dataset comprising a first plurality of feature contribution scores and a size of the historical FCS dataset. The system may apply default feature contribution category classification (FCCC) parameters to the plurality of historical FCS datasets and may optimize the default FCCC parameters to produce a plurality of optimized FCCC parameters. The system may produce a training dataset comprising the optimized FCCC parameters and use the training dataset to train a machine learning model to apply the category classification labels. The system may apply the new FCS dataset to the machine learning model, the new FCS dataset comprising a second plurality of feature contribution scores and a size of the new FCS dataset, and provide the category classification labels for the new FCS dataset to a user interface.
Type: Application
Filed: November 29, 2022
Publication date: June 13, 2024
Inventor: Paul O'Hara
-
Patent number: 11928562
Abstract: A system and method include input of data records to a first trained predictive model to obtain a predicted value associated with each input data record. A model region is then associated with each of the input data records based on the first trained predictive model, the input data records and the predicted values. Enhanced input data records are generated by, for each model region, adding derived values of engineered features associated with the model region to input data records associated with the model region and default values of the engineered features associated with the model region to input training records not associated with the model region. The enhanced input data records are input to a second trained predictive model to obtain an enhanced predicted value associated with each input data record.
Type: Grant
Filed: September 16, 2020
Date of Patent: March 12, 2024
Assignee: BUSINESS OBJECTS SOFTWARE LIMITED
Inventors: Paul O'Hara, Ying Wu
-
Publication number: 20240062101
Abstract: A historical feature contribution score dataset comprising a number of sets of scores generated by a machine learning model may be obtained. Additional feature contribution score sets may be materialized such that the size of each additional feature contribution score set is based on a corresponding randomly selected value within a set-size range. A training dataset may be produced that includes feature contribution scores and corresponding classification labels extracted from the historical feature contribution score dataset and the additional feature contribution score sets. The classification labels may indicate an amount that the corresponding feature contribution scores contribute to a prediction of a target feature. A machine learning model may be trained to predict the classification labels using the training dataset. An input feature contribution score set may be applied to the machine learning model to obtain predicted classification labels.
Type: Application
Filed: August 17, 2022
Publication date: February 22, 2024
Inventor: Paul O'Hara
-
Publication number: 20240020896
Abstract: Using approximated bin intervals to label the histograms provides clarity and allows for the histogram to be more intuitively understood. A dataset may comprise a plurality of records having a plurality of features including one or more continuous features. A selection of a continuous feature may be obtained. A bin width based on a number of bins and feature statistics of the continuous feature may be determined. An approximated bin interval range is determined by applying a bin mask based on the bin width to the feature statistics. An approximated bin width is determined based on the number of bins and the approximated bin interval range. Approximated bin intervals for the histogram are determined based on the approximated bin width. A histogram is generated having bins with intervals based on the approximated bin intervals.
Type: Application
Filed: July 12, 2023
Publication date: January 18, 2024
Inventors: Paul O'Hara, Malte Christian Kaufmann, Esther Rodrigo Ortiz, Conor White
-
Patent number: 11734864
Abstract: Using approximated bin intervals to label the histograms provides clarity and allows for the histogram to be more intuitively understood. A dataset may comprise a plurality of records having a plurality of features including one or more continuous features. A selection of a continuous feature may be obtained. A bin width based on a number of bins and feature statistics of the continuous feature may be determined. An approximated bin interval range is determined by applying a bin mask based on the bin width to the feature statistics. An approximated bin width is determined based on the number of bins and the approximated bin interval range. Approximated bin intervals for the histogram are determined based on the approximated bin width. A histogram is generated having bins with intervals based on the approximated bin intervals.
Type: Grant
Filed: October 29, 2021
Date of Patent: August 22, 2023
Assignee: BUSINESS OBJECTS SOFTWARE LTD
Inventors: Paul O'Hara, Malte Christian Kaufmann, Esther Rodrigo Ortiz, Conor White
-
Patent number: 11727030
Abstract: The present disclosure involves systems, software, and computer implemented methods for automatically detecting hot areas in heat map visualizations. One example method includes identifying a two-dimensional heat map. The identified two-dimensional heat map is converted to a one-dimensional heat map. Cells of the one-dimensional heat map are clustered using a density-based clustering algorithm to generate at least one dense region of cells. A mean value of cells in each dense region is calculated and the dense regions are sorted by mean value in descending order. An approach for identifying hot areas is selected and the selected approach is used to identify at least one dense region as a hot area of the one-dimensional heat map.
Type: Grant
Filed: May 5, 2020
Date of Patent: August 15, 2023
Assignee: Business Objects Software Ltd.
Inventors: Ben Murphy, Ying Wu, Paul O'Hara, Emmet Norton, Malte Christian Kaufmann, Orla Cullen
-
Patent number: 11720579
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics based on a number of occurrences of each discrete value of the discrete feature in the data, determination of first summary statistics based on the determined statistics, determination of a dissimilarity for each discrete feature based on the first summary statistics and on the statistics determined for the discrete feature, determination of candidate discrete features based on the determined dissimilarities, determination, for each of the candidate discrete features, of second summary statistics based on values of a continuous feature associated with each discrete value of the candidate discrete feature, determination of a deviation score for each of the candidate discrete features based on the second summary statistics, and transmission of the candidate discrete features for display in association with the continuous feature based on the determined deviation scores.
Type: Grant
Filed: July 6, 2021
Date of Patent: August 8, 2023
Assignee: BUSINESS OBJECTS SOFTWARE LTD
Inventors: Paul O'Hara, Malte Christian Kaufmann, Alan McShane, Anirban Banerjee, Mark Ahern
-
Publication number: 20230216752
Abstract: Techniques for enabling secure access to data using data blocks are described. Computing device(s) can provide instruction(s) to a component associated with an entity, wherein the instruction(s) are associated with an identifier corresponding to a data block of a plurality of data blocks. The computing device(s) can receive, from the component, data associated with the component, wherein the data is associated with the identifier and is indicative of a state of the component. The computing device(s) can store the data in the data block and monitor, using rule(s), changes to the state of the component based at least partly on the data in the data block. As a result, techniques described herein enable near real-time (and in some examples, automatic) reporting and/or remediation for correcting changes to the state of the component using data that is securely accessed by use of data blocks.
Type: Application
Filed: March 10, 2023
Publication date: July 6, 2023
Inventors: Chad Campbell, Carroll Wayne Moon, Christopher James Carlson, Jeremy David Sublett, Paul O'Hara, David Ray Garza, David James Weatherford, Jason Aaron Graham, Jon Matthew Loflin, Kyle J. Wagner
-
Patent number: 11693879
Abstract: Systems and methods include reception of a set of data including continuous features and a discrete feature, each continuous feature associated with a plurality of values and the discrete feature associated with a plurality of discrete values, determine, for each continuous feature, a relationship factor representing a relationship between the discrete feature and the continuous feature based on the plurality of values associated with the continuous feature and the plurality of discrete values, identify one of the continuous features associated with a largest one of the determined relationship factors, generate, for each of the other features, a correlation factor representing a correlation between the continuous feature and the identified continuous feature, determine, for each of the continuous features other than the identified continuous feature, a composite relationship score based on the relationship factor and the correlation factor associated with the feature, and present a visualization associated wi
Type: Grant
Filed: May 19, 2021
Date of Patent: July 4, 2023
Assignee: BUSINESS OBJECTS SOFTWARE LTD.
Inventors: Paul O'Hara, Ying Wu, Jiazheng Li, Cathal McGovern, Malte Christian Kaufmann, Esther Rodrigo Ortiz, Kerry O'Connor, Michael Golden, Satinder Singh, Vlad Zat
-
Patent number: 11681715
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics for each discrete value of the discrete feature based on values of a continuous feature associated with the discrete value, determination, for each discrete feature, of first summary statistics based on the statistics determined for each discrete value of the discrete feature, determination, for each discrete feature, of a dissimilarity based on the first summary statistics determined for the discrete feature and on the statistics determined for each discrete value of the discrete feature, determination of candidate discrete features of the discrete features based on the determined dissimilarities, the candidate discrete features comprising less than all of the discrete features, determination, for each of the candidate discrete features, of second summary statistics based on values of the continuous feature associated with each discrete value of the candidate discrete feature, determine of a devi
Type: Grant
Filed: June 9, 2021
Date of Patent: June 20, 2023
Assignee: BUSINESS OBJECTS SOFTWARE LTD.
Inventors: Paul O'Hara, Malte Christian Kaufmann, Anirban Banerjee, Ian Denver, Alan McShane
-
Publication number: 20230133856
Abstract: Using approximated bin intervals to label the histograms provides clarity and allows for the histogram to be more intuitively understood. A dataset may comprise a plurality of records having a plurality of features including one or more continuous features. A selection of a continuous feature may be obtained. A bin width based on a number of bins and feature statistics of the continuous feature may be determined. An approximated bin interval range is determined by applying a bin mask based on the bin width to the feature statistics. An approximated bin width is determined based on the number of bins and the approximated bin interval range. Approximated bin intervals for the histogram are determined based on the approximated bin width. A histogram is generated having bins with intervals based on the approximated bin intervals.
Type: Application
Filed: October 29, 2021
Publication date: May 4, 2023
Inventors: Paul O'Hara, Malte Christian Kaufmann, Esther Rodrigo Ortiz, Conor White
-
Publication number: 20230113850
Abstract: The present disclosure provides for accurate and efficient identification of candidate features. An input dataset comprising one or more continuous features and one or more categorical features is obtained. A number of categorical feature categories based on the one or more categorical features is determined. Record counts for each of the categorical feature categories are determined. Skew statistics for each category are determined based on the record counts for each of the categorical feature categories. Cardinality skew factors for each of the one or more categorical features are then determined based on the record counts and the skew statistics. A number of the one or more categorical features having the highest cardinality skew factors are selected from among the cardinality skew factors. Then, a top contributor deviation analysis is performed using the selected number of the categorical features having the highest cardinality skew factors.
Type: Application
Filed: October 8, 2021
Publication date: April 13, 2023
Inventors: Paul O'Hara, Malte Christian Kaufmann, Alan McShane
-
Patent number: 11606270
Abstract: Techniques for enabling secure access to data using data blocks are described. Computing device(s) can provide instruction(s) to a component associated with an entity, wherein the instruction(s) are associated with an identifier corresponding to a data block of a plurality of data blocks. The computing device(s) can receive, from the component, data associated with the component, wherein the data is associated with the identifier and is indicative of a state of the component. The computing device(s) can store the data in the data block and monitor, using rule(s), changes to the state of the component based at least partly on the data in the data block. As a result, techniques described herein enable near real-time (and in some examples, automatic) reporting and/or remediation for correcting changes to the state of the component using data that is securely accessed by use of data blocks.
Type: Grant
Filed: April 19, 2021
Date of Patent: March 14, 2023
Assignee: CloudFit Software, LLC
Inventors: Chad Campbell, Carroll Wayne Moon, Christopher James Carlson, Jeremy David Sublett, Paul O'Hara, David Ray Garza, David James Weatherford, Jason Aaron Graham, Jon Matthew Loflin, Kyle Wagner
-
Publication number: 20230010992
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics based on a number of occurrences of each discrete value of the discrete feature in the data, determination of first summary statistics based on the determined statistics, determination of a dissimilarity for each discrete feature based on the first summary statistics and on the statistics determined for the discrete feature, determination of candidate discrete features based on the determined dissimilarities, determination, for each of the candidate discrete features, of second summary statistics based on values of a continuous feature associated with each discrete value of the candidate discrete feature, determination of a deviation score for each of the candidate discrete features based on the second summary statistics, and transmission of the candidate discrete features for display in association with the continuous feature based on the determined deviation scores.
Type: Application
Filed: July 6, 2021
Publication date: January 12, 2023
Inventors: Paul O'HARA, Malte Christian KAUFMANN, Alan McSHANE, Anirban BANERJEE, Mark AHERN
-
Publication number: 20220398246
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics for each discrete value of the discrete feature based on values of a continuous feature associated with the discrete value, determination, for each discrete feature, of first summary statistics based on the statistics determined for each discrete value of the discrete feature, determination, for each discrete feature, of a dissimilarity based on the first summary statistics determined for the discrete feature and on the statistics determined for each discrete value of the discrete feature, determination of candidate discrete features of the discrete features based on the determined dissimilarities, the candidate discrete features comprising less than all of the discrete features, determination, for each of the candidate discrete features, of second summary statistics based on values of the continuous feature associated with each discrete value of the candidate discrete feature, determine of a devi
Type: Application
Filed: June 9, 2021
Publication date: December 15, 2022
Inventors: Paul O'HARA, Malte Christian KAUFMANN, Anirban BANERJEE, Ian DENVER, Alan McSHANE
-
Publication number: 20220374450
Abstract: Systems and methods include reception of a set of data including continuous features and a discrete feature, each continuous feature associated with a plurality of values and the discrete feature associated with a plurality of discrete values, determine, for each continuous feature, a relationship factor representing a relationship between the discrete feature and the continuous feature based on the plurality of values associated with the continuous feature and the plurality of discrete values, identify one of the continuous features associated with a largest one of the determined relationship factors, generate, for each of the other features, a correlation factor representing a correlation between the continuous feature and the identified continuous feature, determine, for each of the continuous features other than the identified continuous feature, a composite relationship score based on the relationship factor and the correlation factor associated with the feature, and present a visualization associated wi
Type: Application
Filed: May 19, 2021
Publication date: November 24, 2022
Inventors: Paul O'HARA, Ying WU, Jiazheng LI, Cathal McGOVERN, Malte Christian KAUFMANN, Esther Rodrigo ORTIZ, Kerry O'CONNOR, Michael GOLDEN, Satinder SINGH