Patents by Inventor Shaikh Shahriar Quader
Shaikh Shahriar Quader has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240112074
Abstract: An embodiment of the present invention extracts information from a natural language query requesting performance of a task. A machine learning model determines a task that corresponds to the task requested by the natural language query based on the extracted information. A query is generated for retrieving data from a plurality of different data sources based on the extracted information. The data for the determined task is retrieved from the plurality of different data sources based on the generated query. The determined task is performed using the retrieved data. Present invention embodiments include a method, system, and computer program product for processing a natural language query in substantially the same manner described above.
Type: Application
Filed: September 30, 2022
Publication date: April 4, 2024
Inventors: Bryson Chisholm, Shikhar Kwatra, Shaikh Shahriar Quader, Ayesha Bhangu, Jack Zhang, Shabana Dhayananth, Tarandeep Kaur Randhawa
-
Publication number: 20240070522
Abstract: Providing a representative dataset from an initial dataset by accessing a dataset associated with a machine learning model, receiving input parameters associated with the representative dataset selection, the input parameters including an evaluation metric, determining a density of a plurality of datapoints associated with the dataset, training a first iteration of a machine learning model using a first data point selected according to the density, determining a first value of the evaluation metric for the first iteration of the machine learning model, generating a representative subset based on the first value of the evaluation metric, and providing the representative dataset and a final machine learning model trained using the representative dataset.
Type: Application
Filed: August 23, 2022
Publication date: February 29, 2024
Inventors: Shaikh Shahriar Quader, Aindrila Basak, Adrian Mahjour, Petr Novotny, Carlo Appugliese, Berthold Reinwald, Dheeraj Arremsetty
-
Patent number: 11853908
Abstract: Noisy labeled and unlabeled datapoint detection and rectification in a training dataset for machine-learning is facilitated by a processor(s) obtaining a training dataset for use in training a machine-learning model. The processor(s) applies ensemble machine-learning and a generative model to the training dataset to detect noisy labeled datapoints in the training dataset, and create a clean dataset with preliminary labels added for any unlabeled datapoints in the training dataset. Data-driven active learning and the clean dataset are used by the processor(s) to facilitate generating an active-learned dataset with true labels added for one or more selected datapoints of a datapoint pool including the detected noisy labeled datapoints and the unlabeled datapoints of the training dataset. The machine-learning model is trained by the processor(s) using, at least in part, the clean dataset and the active-learned dataset.
Type: Grant
Filed: May 13, 2020
Date of Patent: December 26, 2023
Assignee: International Business Machines Corporation
Inventors: Shaikh Shahriar Quader, Mona Nashaat Ali Elmowafy, Darrell Christopher Reimer
-
Patent number: 11816127
Abstract: A quality determination method, system, and computer program product that includes performing a dimensionality reduction on a high-dimensional dataset to form a dimensional-reduced dataset and determining, using a machine learning tool executed on a computing device, a quality of the dimensional-reduced dataset via a review of an extracted feature extracted from the dimensional-reduced dataset.
Type: Grant
Filed: February 26, 2021
Date of Patent: November 14, 2023
Assignee: International Business Machines Corporation
Inventors: Petr Novotny, Aindrila Basak, Shaikh Shahriar Quader, Horst Cornelius Samulowitz, Chad Marston
-
Publication number: 20230153566
Abstract: Classification of cell data includes obtaining a target dataset and an artificial intelligence (AI) model trained to identify relationship(s) between cells of a row and classify whether a focus cell of the row is erroneous based on the identified relationship(s), and applying the AI model to the target dataset to identify erroneous cell(s) thereof. The applying includes selecting a row of cells of the target dataset, inputting the selected row of cells to the AI model with an identification of a focus cell, the focus cell to be classified by the AI model, classifying the focus cell to obtain a classification of the focus cell, the classifying identifying whether the focus cell is erroneous, and outputting an indication of the classification of the focus cell.
Type: Application
Filed: November 18, 2021
Publication date: May 18, 2023
Inventors: Shaikh Shahriar Quader, Omar Al-Shamali, James Miller, Yannick Saillet, Albert Maier, Remus Lazar
-
Publication number: 20230078134
Abstract: Classification of erroneous cell data includes using at least one transformation function, the at least one transformation function determined based on correlations of observed cell data to correct cell data, to automatically generate training examples that correlate erroneous data values to correct data values as informed by the at least one transformation function; augmenting an initial training set of labeled training examples with the generated training examples to produce an augmented training set; and training a machine learning model using the augmented training set to classify observed cell data based on a comparison between the observed cell data and data that the machine learning model predicts.
Type: Application
Filed: November 7, 2022
Publication date: March 16, 2023
Inventors: Shaikh Shahriar Quader, Piotr Mierzejewski, Mona Nashaat Ali Elmowafy
-
Patent number: 11574250
Abstract: Classification of erroneous cell data includes performing unsupervised pre-training of a machine learning model to learn a bidirectional encoder representation of data cells, obtaining an initial training set, with labeled training examples that correlate observed cell data to correct cell data, for training the machine learning model to classify cell data, automatically augmenting the initial training set to produce an augmented training set, where the augmenting includes identifying patterns in the labeled training examples, generating transformation functions, and using the transformation functions, learning an augmentation strategy and automatically generating additional training examples correlating erroneous data values to correct data values, and training the machine learning model using the augmented training set.
Type: Grant
Filed: August 12, 2020
Date of Patent: February 7, 2023
Assignee: International Business Machines Corporation
Inventors: Shaikh Shahriar Quader, Piotr Mierzejewski, Mona Nashaat Ali Elmowafy
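Illustration (not part of the patent record): the augmentation step in this abstract can be pictured with a minimal sketch. It assumes two hand-written transformation functions, `swap_adjacent` and `drop_char`, which are hypothetical stand-ins; the abstract describes generating such functions from patterns in the labeled examples and learning an augmentation strategy, neither of which is attempted here.

```python
import random

# Hypothetical transformation functions: each maps a correct value to a
# plausible erroneous variant. In the abstract these are generated from
# patterns in the labeled examples; here they are written by hand.
def swap_adjacent(value, rng):
    if len(value) < 2:
        return value
    i = rng.randrange(len(value) - 1)
    return value[:i] + value[i + 1] + value[i] + value[i + 2:]

def drop_char(value, rng):
    if len(value) < 2:
        return value
    i = rng.randrange(len(value))
    return value[:i] + value[i + 1:]

def augment(examples, transforms, per_example=2, seed=0):
    """Return the initial (observed, correct) pairs plus generated pairs."""
    rng = random.Random(seed)
    augmented = list(examples)
    for _, correct in examples:
        for _ in range(per_example):
            transform = rng.choice(transforms)
            erroneous = transform(correct, rng)
            # Keep only variants that actually differ from the correct value.
            if erroneous != correct:
                augmented.append((erroneous, correct))
    return augmented

initial = [("Torotno", "Toronto"), ("Ottawa", "Ottawa")]
training_set = augment(initial, [swap_adjacent, drop_char])
```

Each generated pair maps an erroneous value back to its correct value, so the augmented set can be fed to whatever classifier is being trained.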
-
Patent number: 11500830
Abstract: A DBMS training subsystem trains a DBMS workload-manager model with training data identifying resources used to execute previous DBMS data-access requests. The subsystem integrates each request's high-level features and compile-time operations into a vector and clusters similar vectors into templates. The requests are divided into workloads each represented by a training histogram that describes the distribution of templates associated with the workload and identifies the total amounts and types of resources consumed when executing the entire workload.
Type: Grant
Filed: October 15, 2020
Date of Patent: November 15, 2022
Assignee: International Business Machines Corporation
Inventors: Shaikh Shahriar Quader, Nicolas Andres Jaramillo Duran, Sumona Mukhopadhyay, Emmanouil Papangelis, Marin Litoiu, David Kalmuk, Piotr Mierzejewski
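Illustration (not part of the patent record): a rough sketch of the two ideas in this abstract, grouping request vectors into templates and summarizing a workload as a histogram with resource totals. The greedy distance-threshold clustering, the `radius` parameter, and the resource names `cpu_ms` and `mem_mb` are assumptions made for the example; the abstract does not say which clustering method or resource types are used.

```python
from collections import Counter
import math

def assign_templates(vectors, radius):
    """Greedy clustering: a vector joins the first template whose center is
    within `radius`; otherwise it starts a new template."""
    centers, labels = [], []
    for vec in vectors:
        for idx, center in enumerate(centers):
            if math.dist(vec, center) <= radius:
                labels.append(idx)
                break
        else:
            centers.append(vec)
            labels.append(len(centers) - 1)
    return labels

def workload_histogram(labels, resources):
    """Template distribution plus total resources for one workload."""
    totals = Counter()
    for usage in resources:
        totals.update(usage)  # adds per-request amounts by resource type
    return {"templates": Counter(labels), "resources": dict(totals)}

# Hypothetical request vectors and per-request resource usage.
vectors = [(1.0, 0.0), (1.1, 0.1), (5.0, 5.0), (0.9, 0.0)]
resources = [{"cpu_ms": 10, "mem_mb": 4}, {"cpu_ms": 12, "mem_mb": 4},
             {"cpu_ms": 90, "mem_mb": 64}, {"cpu_ms": 9, "mem_mb": 4}]
labels = assign_templates(vectors, radius=0.5)
histogram = workload_histogram(labels, resources)
```

The resulting dictionary pairs the template counts for the workload with the summed resource usage, which is the shape of training record the abstract describes.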
-
Publication number: 20220292107
Abstract: A quality determination method, system, and computer program product that includes performing a dimensionality reduction on a high-dimensional dataset to form a dimensional-reduced dataset and determining, using a machine learning tool executed on a computing device, a quality of the dimensional-reduced dataset via a review of an extracted feature extracted from the dimensional-reduced dataset.
Type: Application
Filed: February 26, 2021
Publication date: September 15, 2022
Inventors: Petr Novotny, Aindrila Basak, Shaikh Shahriar Quader, Horst Cornelius Samulowitz, Chad Marston
-
Publication number: 20220237503
Abstract: Model data comprising a model object and model metadata is extracted from a trained model. The model data is integrated within a function executable from within a database system environment. The integrated function is deployed within the database system environment, the deploying activating the trained model for execution within the database system environment.
Type: Application
Filed: January 26, 2021
Publication date: July 28, 2022
Applicant: International Business Machines Corporation
Inventors: Carlo Appugliese, Dheeraj Arremsetty, Ravikumar Govindan, Rakshith Dasenahalli Lingaraju, Timothy Thomas Bohn, Shaikh Shahriar Quader, Carmen-Gabriela Stefanita, Ingo Schuster
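Illustration (not part of the patent record): one way to picture running a trained model from inside a database is a user-defined SQL function. The sketch below uses SQLite's `create_function` from the Python standard library purely as a stand-in for a database system environment; the linear "model", the table `readings`, and the function name `predict_score` are all hypothetical.

```python
import json
import pickle
import sqlite3

def extract_model_data(weights, bias, feature_names):
    """Bundle a serialized model object with its metadata."""
    return {"object": pickle.dumps({"weights": weights, "bias": bias}),
            "metadata": json.dumps({"features": feature_names})}

def deploy(conn, model_data, name="predict_score"):
    """Wrap the model in a SQL-callable function and register it."""
    model = pickle.loads(model_data["object"])
    n_args = len(json.loads(model_data["metadata"])["features"])

    def predict(*args):
        return sum(w * x for w, x in zip(model["weights"], args)) + model["bias"]

    conn.create_function(name, n_args, predict)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (x REAL, y REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [(1.0, 2.0), (3.0, 4.0)])
deploy(conn, extract_model_data([2.0, 0.5], 1.0, ["x", "y"]))
scores = [row[0] for row in conn.execute(
    "SELECT predict_score(x, y) FROM readings ORDER BY rowid")]
```

Once registered, the model is invoked by ordinary SQL, so scoring happens where the data lives rather than after exporting rows to a separate process.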
-
Publication number: 20220121633
Abstract: A DBMS training subsystem trains a DBMS workload-manager model with training data identifying resources used to execute previous DBMS data-access requests. The subsystem integrates each request's high-level features and compile-time operations into a vector and clusters similar vectors into templates. The requests are divided into workloads each represented by a training histogram that describes the distribution of templates associated with the workload and identifies the total amounts and types of resources consumed when executing the entire workload.
Type: Application
Filed: October 15, 2020
Publication date: April 21, 2022
Inventors: Shaikh Shahriar Quader, Nicolas Andres Jaramillo Duran, Sumona Mukhopadhyay, Emmanouil Papangelis, Marin Litoiu, David Kalmuk, Piotr Mierzejewski
-
Publication number: 20220075761
Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention can receive, by a computing device, a request to access a datapoint of a machine learning dataset contained in a database. Embodiments of the present invention can access, by the computing device, a virtual data frame that includes a schema which represents a structure of the machine learning dataset in the database. Embodiments of the present invention can retrieve, by the computing device, the datapoint of the machine learning dataset utilizing the virtual data frame and return, by the computing device, the retrieved datapoint in response to the request.
Type: Application
Filed: September 8, 2020
Publication date: March 10, 2022
Inventors: Petr Novotny, Hong Min, Shaikh Shahriar Quader
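Illustration (not part of the patent record): a minimal sketch of a data frame that holds only a schema and fetches individual datapoints from the database on request. The class name `VirtualDataFrame`, its `get` method, and the `samples` table are hypothetical, and SQLite stands in for the database; the table name is assumed to come from trusted code, since it is interpolated into the SQL text.

```python
import sqlite3

class VirtualDataFrame:
    """Holds only the schema; rows stay in the database until requested."""

    def __init__(self, conn, table):
        self.conn, self.table = conn, table
        # PRAGMA table_info returns one row per column: (cid, name, type, ...).
        info = conn.execute(f"PRAGMA table_info({table})").fetchall()
        self.schema = {row[1]: row[2] for row in info}

    def get(self, row_id):
        """Fetch a single datapoint by rowid, as a column-name dict."""
        cols = ", ".join(self.schema)
        row = self.conn.execute(
            f"SELECT {cols} FROM {self.table} WHERE rowid = ?", (row_id,)
        ).fetchone()
        return None if row is None else dict(zip(self.schema, row))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (feature REAL, label TEXT)")
conn.executemany("INSERT INTO samples VALUES (?, ?)", [(0.5, "a"), (1.5, "b")])
vdf = VirtualDataFrame(conn, "samples")
point = vdf.get(2)
```

Only the requested row crosses the database boundary, which is the property that makes this kind of frame useful for datasets too large to load in full.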
-
Publication number: 20220051126
Abstract: Classification of erroneous cell data includes performing unsupervised pre-training of a machine learning model to learn a bidirectional encoder representation of data cells, obtaining an initial training set, with labeled training examples that correlate observed cell data to correct cell data, for training the machine learning model to classify cell data, automatically augmenting the initial training set to produce an augmented training set, where the augmenting includes identifying patterns in the labeled training examples, generating transformation functions, and using the transformation functions, learning an augmentation strategy and automatically generating additional training examples correlating erroneous data values to correct data values, and training the machine learning model using the augmented training set.
Type: Application
Filed: August 12, 2020
Publication date: February 17, 2022
Inventors: Shaikh Shahriar Quader, Piotr Mierzejewski, Mona Nashaat Ali Elmowafy
-
Publication number: 20210357776
Abstract: Noisy labeled and unlabeled datapoint detection and rectification in a training dataset for machine-learning is facilitated by a processor(s) obtaining a training dataset for use in training a machine-learning model. The processor(s) applies ensemble machine-learning and a generative model to the training dataset to detect noisy labeled datapoints in the training dataset, and create a clean dataset with preliminary labels added for any unlabeled datapoints in the training dataset. Data-driven active learning and the clean dataset are used by the processor(s) to facilitate generating an active-learned dataset with true labels added for one or more selected datapoints of a datapoint pool including the detected noisy labeled datapoints and the unlabeled datapoints of the training dataset. The machine-learning model is trained by the processor(s) using, at least in part, the clean dataset and the active-learned dataset.
Type: Application
Filed: May 13, 2020
Publication date: November 18, 2021
Inventors: Shaikh Shahriar Quader, Mona Nashaat Ali Elmowafy, Darrell Christopher Reimer
-
Publication number: 20210209412
Abstract: A computer-implemented method includes: receiving, by a computing device, data comprising a labeled dataset and an unlabeled dataset; generating, by the computing device, a set of heuristics using the labeled dataset; generating, by the computing device, a vector of initial labels by labeling each point in the unlabeled dataset using the set of heuristics; generating, by the computing device, a refined set of heuristics using data-driven active learning; generating, by the computing device, a vector of training labels by automatically labeling each point in the unlabeled dataset using the refined set of heuristics; and outputting, by the computing device, the vector of training labels to a client device or a data repository.
Type: Application
Filed: January 2, 2020
Publication date: July 8, 2021
Inventors: Shaikh Shahriar Quader, Jean-François Puget, Mona Nashaat Ali Elmowafy
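Illustration (not part of the patent record): the first two steps of this method, deriving heuristics from labeled data and using them to produce initial labels, can be sketched as follows. The per-feature threshold rules and the majority vote are assumptions chosen for brevity, labels are assumed to be binary (0 or 1), and the refinement step using data-driven active learning is not shown.

```python
from statistics import mean

def build_heuristics(labeled):
    """One threshold rule per feature, from (features, label) pairs.

    Each rule is (feature index, threshold, high_is_positive), with the
    threshold halfway between the class means for that feature.
    """
    n_features = len(labeled[0][0])
    heuristics = []
    for j in range(n_features):
        pos = mean(x[j] for x, y in labeled if y == 1)
        neg = mean(x[j] for x, y in labeled if y == 0)
        heuristics.append((j, (pos + neg) / 2, pos >= neg))
    return heuristics

def label_points(points, heuristics):
    """Label each point by majority vote over the rules (ties go to 1)."""
    labels = []
    for x in points:
        positive_votes = sum(
            (x[j] >= threshold) == high_is_positive
            for j, threshold, high_is_positive in heuristics
        )
        labels.append(1 if 2 * positive_votes >= len(heuristics) else 0)
    return labels

# Hypothetical labeled and unlabeled datasets.
labeled = [((1.0, 10.0), 0), ((2.0, 12.0), 0), ((8.0, 30.0), 1), ((9.0, 28.0), 1)]
unlabeled = [(1.5, 11.0), (8.5, 25.0)]
initial_labels = label_points(unlabeled, build_heuristics(labeled))
```

In a fuller version, the initial labels would feed an active-learning loop that revises the rules before the final vector of training labels is produced.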