Patents by Inventor Alan McShane
Alan McShane has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11720579
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics based on a number of occurrences of each discrete value of the discrete feature in the data, determination of first summary statistics based on the determined statistics, determination of a dissimilarity for each discrete feature based on the first summary statistics and on the statistics determined for the discrete feature, determination of candidate discrete features based on the determined dissimilarities, determination, for each of the candidate discrete features, of second summary statistics based on values of a continuous feature associated with each discrete value of the candidate discrete feature, determination of a deviation score for each of the candidate discrete features based on the second summary statistics, and transmission of the candidate discrete features for display in association with the continuous feature based on the determined deviation scores.
Type: Grant
Filed: July 6, 2021
Date of Patent: August 8, 2023
Assignee: BUSINESS OBJECTS SOFTWARE LTD
Inventors: Paul O'Hara, Malte Christian Kaufmann, Alan McShane, Anirban Banerjee, Mark Ahern
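The abstract above names the stages of the method but not the formulas, so the following Python sketch is only one possible reading. Every function name and every statistic in it is an assumption made for illustration: the per-feature statistic is taken to be the number of distinct values, the first summary statistic is the mean of that number across features, the dissimilarity is the absolute distance from that mean, and the deviation score is the largest gap between a group mean and the overall mean of the continuous feature.

```python
from collections import Counter, defaultdict
from statistics import mean


def value_counts(rows, feature):
    """Count how often each discrete value of `feature` occurs."""
    return Counter(row[feature] for row in rows)


def dissimilarities(rows, features):
    """Distance of each feature's statistic from the summary statistic.

    Assumed statistic: number of distinct values per feature.
    Assumed summary statistic: mean of that number over all features.
    """
    stats = {f: len(value_counts(rows, f)) for f in features}
    summary = mean(stats.values())
    return {f: abs(stats[f] - summary) for f in features}


def deviation_score(rows, feature, target):
    """Largest gap between a group mean and the overall mean (assumed)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    overall = mean(row[target] for row in rows)
    return max(abs(mean(values) - overall) for values in groups.values())


def rank_features(rows, features, target, keep):
    """Keep the `keep` least dissimilar features, then rank by deviation."""
    diss = dissimilarities(rows, features)
    candidates = sorted(features, key=lambda f: diss[f])[:keep]
    scores = {f: deviation_score(rows, f, target) for f in candidates}
    return sorted(candidates, key=lambda f: scores[f], reverse=True)
```

In this reading, a feature with one distinct value per record (such as an order identifier) is filtered out before the more expensive per-group statistics are computed, which is the kind of pre-selection the abstract describes.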
-
Patent number: 11681715
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics for each discrete value of the discrete feature based on values of a continuous feature associated with the discrete value, determination, for each discrete feature, of first summary statistics based on the statistics determined for each discrete value of the discrete feature, determination, for each discrete feature, of a dissimilarity based on the first summary statistics determined for the discrete feature and on the statistics determined for each discrete value of the discrete feature, determination of candidate discrete features of the discrete features based on the determined dissimilarities, the candidate discrete features comprising less than all of the discrete features, determination, for each of the candidate discrete features, of second summary statistics based on values of the continuous feature associated with each discrete value of the candidate discrete feature, determination of a devi…
Type: Grant
Filed: June 9, 2021
Date of Patent: June 20, 2023
Assignee: BUSINESS OBJECTS SOFTWARE LTD.
Inventors: Paul O'Hara, Malte Christian Kaufmann, Anirban Banerjee, Ian Denver, Alan McShane
-
Patent number: 11675765
Abstract: A system and method including determining, for a specified target measure column of a first dataset including a plurality of records, the metadata of the first dataset, including a probability distribution for the specified target column and dimension scores for the dimensions for the first dataset conditioned on the specified target measure column, where the first dataset comprises a plurality of columns including the at least one target measure column and a plurality of non-numeric, dimension columns for the records of the first dataset; determining, for a subset of data of the first dataset based on one or more specified variables, dimension scores for the dimensions of the subset of data approximately derived from the determined metadata of the first dataset; and providing recommendations of top contributors based on the approximated dimension scores of dimensions of the subset of data.
Type: Grant
Filed: May 25, 2021
Date of Patent: June 13, 2023
Assignee: BUSINESS OBJECTS SOFTWARE LTD.
Inventors: Ying Wu, Malte Christian Kaufmann, Alan McShane, Anirban Banerjee, Gareth Maguire
-
Publication number: 20230113850
Abstract: The present disclosure provides for accurate and efficient identification of candidate features. An input dataset comprising one or more continuous features and one or more categorical features is obtained. A number of categorical feature categories based on the one or more categorical features is determined. Record counts for each of the categorical feature categories are determined. Skew statistics for each category are determined based on the record counts for each of the categorical feature categories. Cardinality skew factors for each of the one or more categorical features are then determined based on the record counts and the skew statistics. A number of the one or more categorical features having the highest cardinality skew factors are selected from among the cardinality skew factors. Then, a top contributor deviation analysis is performed using the selected number of the categorical features having the highest cardinality skew factors.
Type: Application
Filed: October 8, 2021
Publication date: April 13, 2023
Inventors: Paul O'Hara, Malte Christian Kaufmann, Alan McShane
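The abstract above does not define the skew statistic or the cardinality skew factor, so the Python sketch below substitutes simple stand-ins, all of them assumptions: the skew statistic of a category is its share of the records, and the factor of a feature is its largest share multiplied by its number of categories (equivalently, the largest record count divided by the mean record count). The function names are hypothetical.

```python
from collections import Counter


def cardinality_skew_factor(values):
    """Return an assumed skew factor for one categorical feature.

    A perfectly balanced feature scores 1.0; the score grows as one
    category dominates the record counts.
    """
    counts = Counter(values)  # record count per category
    if not counts:
        return 0.0
    total = sum(counts.values())
    shares = {cat: n / total for cat, n in counts.items()}  # assumed skew statistic
    return max(shares.values()) * len(counts)


def top_skewed(columns, n):
    """Select the n features with the highest assumed skew factors."""
    factors = {name: cardinality_skew_factor(vals) for name, vals in columns.items()}
    return sorted(factors, key=factors.get, reverse=True)[:n]
```

The selected features would then be passed to whatever top contributor deviation analysis is in use; that step is outside this sketch.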
-
Patent number: 11574019
Abstract: Techniques are described for integrating prediction capabilities from data management platforms into applications. Implementations employ a data science platform (DSP) that operates in conjunction with a data management solution (e.g., a data hub). The DSP can be used to orchestrate data pipelines using various machine learning (ML) algorithms and/or data preparation functions. The data hub can also provide various orchestration and data pipelining capabilities to receive and handle data from various types of data sources, such as databases, data warehouses, other data storage solutions, internet-of-things (IoT) platforms, social networks, and/or other data sources. In some examples, users such as data engineers and/or others may use the implementations described herein to handle the orchestration of data into a data management platform.
Type: Grant
Filed: September 18, 2018
Date of Patent: February 7, 2023
Assignee: Business Objects Software Ltd.
Inventors: Apoorva Kumar, Alan McShane
-
Publication number: 20230010992
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics based on a number of occurrences of each discrete value of the discrete feature in the data, determination of first summary statistics based on the determined statistics, determination of a dissimilarity for each discrete feature based on the first summary statistics and on the statistics determined for the discrete feature, determination of candidate discrete features based on the determined dissimilarities, determination, for each of the candidate discrete features, of second summary statistics based on values of a continuous feature associated with each discrete value of the candidate discrete feature, determination of a deviation score for each of the candidate discrete features based on the second summary statistics, and transmission of the candidate discrete features for display in association with the continuous feature based on the determined deviation scores.
Type: Application
Filed: July 6, 2021
Publication date: January 12, 2023
Inventors: Paul O'HARA, Malte Christian KAUFMANN, Alan McSHANE, Anirban BANERJEE, Mark AHERN
-
Publication number: 20220398246
Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics for each discrete value of the discrete feature based on values of a continuous feature associated with the discrete value, determination, for each discrete feature, of first summary statistics based on the statistics determined for each discrete value of the discrete feature, determination, for each discrete feature, of a dissimilarity based on the first summary statistics determined for the discrete feature and on the statistics determined for each discrete value of the discrete feature, determination of candidate discrete features of the discrete features based on the determined dissimilarities, the candidate discrete features comprising less than all of the discrete features, determination, for each of the candidate discrete features, of second summary statistics based on values of the continuous feature associated with each discrete value of the candidate discrete feature, determination of a devi…
Type: Application
Filed: June 9, 2021
Publication date: December 15, 2022
Inventors: Paul O'HARA, Malte Christian KAUFMANN, Anirban BANERJEE, Ian DENVER, Alan McSHANE
-
Publication number: 20220382729
Abstract: A system and method including determining, for a specified target measure column of a first dataset including a plurality of records, the metadata of the first dataset, including a probability distribution for the specified target column and dimension scores for the dimensions for the first dataset conditioned on the specified target measure column, where the first dataset comprises a plurality of columns including the at least one target measure column and a plurality of non-numeric, dimension columns for the records of the first dataset; determining, for a subset of data of the first dataset based on one or more specified variables, dimension scores for the dimensions of the subset of data approximately derived from the determined metadata of the first dataset; and providing recommendations of top contributors based on the approximated dimension scores of dimensions of the subset of data.
Type: Application
Filed: May 25, 2021
Publication date: December 1, 2022
Inventors: Ying Wu, Malte Christian Kaufmann, Alan McShane, Anirban Banerjee, Gareth Maguire
-
Patent number: 11461319
Abstract: Examples of dynamic database query efficiency improvement are provided herein. Query portions of a received database query can be identified as candidates for replacement. The candidates for replacement can be query portions that reduce the efficiency of the query. Alternative queries can be determined that include substitute query portion(s) in place of candidate(s) for replacement. An expected performance of the alternative queries can be determined. Based at least in part on the expected performance of the alternative queries, one or more alternative queries can be selected as replacement database queries for the received database query.
Type: Grant
Filed: October 6, 2014
Date of Patent: October 4, 2022
Assignee: Business Objects Software, Ltd.
Inventor: Alan McShane
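The flow in the abstract above (identify inefficient query portions, build alternative queries, estimate their performance, select a replacement) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented method: the rule table that maps an inefficient portion to a substitute, the plain substring matching, and the caller-supplied `estimate_cost` function are all hypothetical.

```python
def alternatives(query, rules):
    """Yield one alternative query per rule whose portion appears in `query`.

    `rules` is a hypothetical mapping from an inefficient query portion
    to its substitute; matching is plain substring search for brevity.
    """
    for portion, substitute in rules.items():
        if portion in query:
            yield query.replace(portion, substitute)


def choose_replacement(query, rules, estimate_cost):
    """Return the cheapest of the received query and its alternatives.

    `estimate_cost` is a caller-supplied function (an assumption here)
    that returns the expected cost of running a query; lower is better.
    """
    candidates = [query, *alternatives(query, rules)]
    return min(candidates, key=estimate_cost)
```

Keeping the received query among the candidates is a design choice of this sketch: if no alternative is expected to perform better, the query is returned unchanged.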
-
Patent number: 10789547
Abstract: Techniques are described for identifying an input training dataset stored within an underlying data platform; and transmitting instructions to the data platform, the instructions being executable by the data platform to train a predictive model based on the input training dataset by delegating one or more data processing operations to a plurality of nodes across the data platform.
Type: Grant
Filed: September 9, 2016
Date of Patent: September 29, 2020
Assignee: Business Objects Software Ltd.
Inventors: Alan McShane, Jacques Doan Huu, Ahmed Abdelrahman, Antoine Carme, Bertrand Lamy, Fadi Maali, Laya Ouologuem, Milena Caires, Nicolas Dulian, Erik Marcade
-
Publication number: 20200081916
Abstract: Techniques are described for integrating prediction capabilities from data management platforms into applications. Implementations employ a data science platform (DSP) that operates in conjunction with a data management solution (e.g., a data hub). The DSP can be used to orchestrate data pipelines using various machine learning (ML) algorithms and/or data preparation functions. The data hub can also provide various orchestration and data pipelining capabilities to receive and handle data from various types of data sources, such as databases, data warehouses, other data storage solutions, internet-of-things (IoT) platforms, social networks, and/or other data sources. In some examples, users such as data engineers and/or others may use the implementations described herein to handle the orchestration of data into a data management platform.
Type: Application
Filed: December 13, 2018
Publication date: March 12, 2020
Inventors: Alan McShane, Apoorva Kumar
-
Publication number: 20200004891
Abstract: Techniques are described for integrating prediction capabilities from data management platforms into applications. Implementations employ a data science platform (DSP) that operates in conjunction with a data management solution (e.g., a data hub). The DSP can be used to orchestrate data pipelines using various machine learning (ML) algorithms and/or data preparation functions. The data hub can also provide various orchestration and data pipelining capabilities to receive and handle data from various types of data sources, such as databases, data warehouses, other data storage solutions, internet-of-things (IoT) platforms, social networks, and/or other data sources. In some examples, users such as data engineers and/or others may use the implementations described herein to handle the orchestration of data into a data management platform.
Type: Application
Filed: September 18, 2018
Publication date: January 2, 2020
Inventors: Apoorva Kumar, Alan McShane
-
Patent number: 10305967
Abstract: Techniques are described for providing a unified client to interact with a distributed processing platform such as a Hadoop cluster. The unified client may include multiple sub-clients each of which is configured to interface with a particular subsystem of the distributed processing platform, such as MapReduce, Hive, Spark, and so forth. The unified client may be included in an application to provide, for the application, a single interface for communications between the application and the distributed processing platform during a unified communication session.
Type: Grant
Filed: September 9, 2016
Date of Patent: May 28, 2019
Assignee: Business Objects Software Ltd.
Inventors: Jacques Doan Huu, Alan McShane, Ahmed Abdelrahman, Fadi Maali, Milena Caires
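The unified client described above is, structurally, a single entry point that routes each request to a subsystem-specific sub-client. The Python sketch below shows only that routing idea. The class name, the `submit` method, and the use of plain callables as sub-clients are assumptions for illustration; no real Hadoop, Hive, or Spark API is called.

```python
class UnifiedClient:
    """Single interface that routes requests to per-subsystem sub-clients."""

    def __init__(self, sub_clients):
        # Hypothetical shape: subsystem name -> callable that handles a request.
        self._sub_clients = dict(sub_clients)

    def subsystems(self):
        """Return the names of the subsystems this client can reach."""
        return sorted(self._sub_clients)

    def submit(self, subsystem, request):
        """Forward `request` to the sub-client registered for `subsystem`."""
        try:
            handler = self._sub_clients[subsystem]
        except KeyError:
            raise ValueError(f"no sub-client registered for {subsystem!r}") from None
        return handler(request)
```

An application would construct one such client, for example `UnifiedClient({"hive": run_hive, "spark": run_spark})` with its own hypothetical handler functions, and use it for all platform communication during a session.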
-
Publication number: 20170262769
Abstract: Techniques are described for identifying an input training dataset stored within an underlying data platform; and transmitting instructions to the data platform, the instructions being executable by the data platform to train a predictive model based on the input training dataset by delegating one or more data processing operations to a plurality of nodes across the data platform.
Type: Application
Filed: September 9, 2016
Publication date: September 14, 2017
Inventors: Alan McShane, Jacques Doan Huu, Ahmed Abdelrahman, Antoine Carme, Bertrand Lamy, Fadi Maali, Laya Ouologuem, Milena Caires, Nicolas Dulian, Erik Marcade
-
Publication number: 20170264670
Abstract: Techniques are described for providing a unified client to interact with a distributed processing platform such as a Hadoop cluster. The unified client may include multiple sub-clients each of which is configured to interface with a particular subsystem of the distributed processing platform, such as MapReduce, Hive, Spark, and so forth. The unified client may be included in an application to provide, for the application, a single interface for communications between the application and the distributed processing platform during a unified communication session.
Type: Application
Filed: September 9, 2016
Publication date: September 14, 2017
Inventors: Jacques Doan Huu, Alan McShane, Ahmed Abdelrahman, Fadi Maali, Milena Caires
-
Publication number: 20160098448
Abstract: Examples of dynamic database query efficiency improvement are provided herein. Query portions of a received database query can be identified as candidates for replacement. The candidates for replacement can be query portions that reduce the efficiency of the query. Alternative queries can be determined that include substitute query portion(s) in place of candidate(s) for replacement. An expected performance of the alternative queries can be determined. Based at least in part on the expected performance of the alternative queries, one or more alternative queries can be selected as replacement database queries for the received database query.
Type: Application
Filed: October 6, 2014
Publication date: April 7, 2016
Inventor: Alan McShane