Patents by Inventor Orna Raz

Orna Raz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240362337
    Abstract: One or more systems, devices, computer program products and/or computer-implemented methods provided herein relate to risk assessment for artificial intelligence models, and more specifically, to the generation of customized risk scores and converted comparable scores. In an embodiment, the customized risk assessment scores can be based on a risk profile determined from risk assessment requirements and measurements of an artificial intelligence model. In another embodiment, one or more customized risk assessment scores can be converted to a converted risk assessment score that is comparable to a customized risk assessment score or another converted risk assessment score.
    Type: Application
    Filed: April 28, 2023
    Publication date: October 31, 2024
    Inventors: Abigail Goldsteen, Michael Hind, Jacquelyn Martino, David John Piorkowski, Orna Raz, John Thomas Richards, Moninder Singh, Marcel Zalmanovici
  • Publication number: 20240339112
    Abstract: Various systems and methods are presented regarding detecting data drift. The data of interest can be batches of utterances received at an interface (e.g., a chatbot). The batches of utterances can be compared with topics present in training data utilized to train a data classifier (e.g., an autoencoder), wherein topics identified in the batches of utterances that are not present in the training data can be considered to be novel topics. The greater the presence of novel topics in a batch of utterances, the greater the divergence of the batch of utterances from the content of the training data. The novel topics can be identified and subsequently applied to the training data such that the data classifier can be re-trained with the novel topics, thereby causing the data classifier to be contemporaneous with the novel topics. In an embodiment, the utterances can be short streams of text, symbols, and suchlike.
    Type: Application
    Filed: April 5, 2023
    Publication date: October 10, 2024
    Inventors: Ella Rabinovich, Matan Vetzler, Samuel Solomon Ackerman, Ateret Anaby-Tavor, Eitan Daniel Farchi, Orna Raz
  • Patent number: 12056580
    Abstract: A method, system and computer program product, the method comprising: creating a model representing underperforming cases; from a case collection having a total performance, and which comprises for each of a multiplicity of records: a value for each feature from a collection of features, a ground truth label and a prediction of a machine learning (ML) engine, obtaining one or more features; dividing the records into groups, based on values of the features in each record; for one group of the groups, calculating a performance parameter of the ML engine over the portion of the records associated with the group; subject to the performance parameter of the group being below the total performance by at least a predetermined threshold: determining a characteristic for the group; adding the characteristic of the group to the model; and providing the model to a user, thus indicating under-performing parts of the test collection.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: August 6, 2024
    Assignee: International Business Machines Corporation
    Inventors: Orna Raz, Marcel Zalmanovici, Aviad Zlotnick
  • Publication number: 20240202575
    Abstract: A computer hardware system includes a slice generator and a policy generator and performs the following. The slice generator slices a first dataset including true values and predicted values of a class variable into a plurality of slices each defining a plurality of observations within the first dataset. A first one and another one of the plurality of slices are selected, and a union of observations is generated by adding observations within the selected another one to observations within the selected first one of the plurality of slices. The selecting another one of the plurality of slices and the generating the union is repeated until a number of observations within the union reaches a predetermined value. Using the policy generator and after the number of observations within the union reaches the predetermined value, an error policy is generated. The predicted values were generated by a machine learning engine.
    Type: Application
    Filed: December 20, 2022
    Publication date: June 20, 2024
    Inventors: Samuel Solomon Ackerman, Orna Raz, Eitan Daniel Farchi, Marcel Zalmanovici
  • Publication number: 20230274169
    Abstract: An example system includes a processor to receive a data set. The processor can generate a data slice rule based on a data observation for a data point in the data set. The processor can generate an instance of data based on the generated data slice rule.
    Type: Application
    Filed: February 28, 2022
    Publication date: August 31, 2023
    Inventors: Orna Raz, George Kour, Ramasuri Narayanam, Samuel Solomon Ackerman, Marcel Zalmanovici
  • Patent number: 11734143
    Abstract: A method, apparatus and a product for determining a performance measurement of predictors. The method comprises: obtaining a dataset comprising data instances, each data instance being associated with a label; obtaining a predictor configured to provide a prediction of a label for a data instance; determining a plurality of data slices that are subsets of the dataset; computing, for each data slice in the plurality of data slices and based on an application of the predictor on each data instance that is mapped to the data slice, a performance measurement that is indicative of a successful label prediction for a data instance comprised by the data slice, thereby obtaining a plurality of performance measurements; based on the plurality of performance measurements, computing a performance measurement of the predictor over the dataset; and, if the performance measurement of the predictor is below a threshold, performing a mitigating action.
    Type: Grant
    Filed: April 10, 2020
    Date of Patent: August 22, 2023
    Assignee: International Business Machines Corporation
    Inventors: Orna Raz, Eitan Farchi, Marcel Zalmanovici
  • Publication number: 20230237343
    Abstract: An example system includes a processor to receive a test set, data slices, and a measure of interest. The processor can rank the data slices based on the test set, the data slices, and the set of measures of interest. The test set includes data points from the same feature space used to train a machine learning model. Each data slice is ranked according to generated slice grades representing unique information contribution of each data slice to the measure of interest with respect to the other data slices. The processor can then present the ranked data slices.
    Type: Application
    Filed: January 26, 2022
    Publication date: July 27, 2023
    Inventors: Orna Raz, Samuel Solomon Ackerman, Marcel Zalmanovici, Eitan Daniel Farchi, Ramasuri Narayanam
  • Publication number: 20230205847
    Abstract: Systems and methods for automatically identifying in a dataset insufficient data for learning, or records with anomalous combinations of feature values, by partition of numeric and/or categorical data space into human-interpretable regions are disclosed. The method comprises: receiving a dataset of numeric and/or categorical features with a plurality of observations; calculating observation density for each observation according to a distance or anomaly based metric, and receiving a density measurement; and partitioning the dataset along the numeric and/or categorical features according to the density measurement of each observation by a perpendicular cut along the feature spaces, receiving a map of a plurality of hyper-rectangular shapes representing various levels of density including empty spaces.
    Type: Application
    Filed: December 26, 2021
    Publication date: June 29, 2023
    Inventors: Samuel Solomon Ackerman, Orna Raz, Marcel Zalmanovici, Eitan Daniel Farchi, Avi Ziv
  • Patent number: 11676043
    Abstract: A mechanism is provided in a data processing system having a processor and a memory. The memory comprises instructions which are executed by the processor to cause the processor to implement a training system for finding an optimal surface for hierarchical classification task on an ontology. The training system receives a training data set and a hierarchical classification ontology data structure. The training system generates a neural network architecture based on the training data set and the hierarchical classification ontology data structure. The neural network architecture comprises an indicative layer, a parent tier (PT) output and a lower leaf tier (LLT) output. The training system trains the neural network architecture to classify the training data set to leaf nodes at the LLT output and parent nodes at the PT output. The indicative layer in the neural network architecture determines a surface that passes through each path from a root to a leaf node in the hierarchical ontology data structure.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: June 13, 2023
    Assignee: International Business Machines Corporation
    Inventors: Pathirage Dinindu Sujan Udayanga Perera, Orna Raz, Ramani Routray, Vivek Krishnamurthy, Sheng Hua Bao, Eitan D. Farchi
  • Publication number: 20230102152
    Abstract: A system, program product, and method for automatic detection of data drift in a data set are presented. The method includes determining changes to relations in the data set through generating baseline and production data sets. The method further includes generating a production data set with some inserted data distortion, and defining, for a plurality of features in the baseline data set, potential relations for participant features. The method also includes determining a first likelihood and a second likelihood of each potential relation in the baseline and production data sets, respectively, for the participant features. The method further includes comparing each first likelihood with each second likelihood, generating a comparison value that is compared with a threshold value, and determining, subject to the comparison value exceeding the threshold value, that the potential relation in the baseline data set does not describe a relation in the production data set.
    Type: Application
    Filed: September 24, 2021
    Publication date: March 30, 2023
    Inventors: Eliran Roffe, Samuel Solomon Ackerman, Eitan Daniel Farchi, Orna Raz
  • Patent number: 11568169
    Abstract: A method, apparatus and product for identifying data drifts.
    Type: Grant
    Filed: April 28, 2019
    Date of Patent: January 31, 2023
    Assignee: International Business Machines Corporation
    Inventors: Eitan Farchi, Orna Raz, Marcel Zalmanovici
  • Patent number: 11556847
    Abstract: A method, system and computer program product, the method comprising: obtaining computer code of an employed system comprising a plurality of components; obtaining data related to operating the plurality of components; based on the computer code and the data, identifying: a first component from the plurality of components, to be maintained; and a second component from the plurality of components, to be at least partly replaced by a machine learning component; and providing to a user an identification of the first component and the second component.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Eitan Daniel Farchi, Howard Michael Hess, Orna Raz
  • Patent number: 11556810
    Abstract: A method, computer system, and a computer program product for assessing a likelihood of success associated with developing at least one machine learning (ML) solution is provided. The present invention may include generating a set of questions based on a set of raw training data. The present invention may also include computing a feasibility score based on an answer corresponding with each question from the generated set of questions. The present invention may then include, in response to determining that the computed feasibility score satisfies a threshold, computing a level of effort associated with developing the at least one ML solution to address a problem. The present invention may further include presenting, to a user, a plurality of results associated with assessing the likelihood of success of the at least one ML solution.
    Type: Grant
    Filed: July 11, 2019
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Pathirage Dinindu Sujan Udayanga Perera, Orna Raz, Ramani Routray, Eitan Daniel Farchi
  • Patent number: 11514691
    Abstract: A computer system trains a machine learning model. A vector representation is generated for each document in a collection of documents. The documents are clustered based on the vector representations of the documents to produce a plurality of clusters. A training set is produced by selecting one or more documents from each cluster, wherein the selected documents represent a sample of the collection of documents to train the machine learning model. The machine learning model is trained by applying the training set to the machine learning model. Embodiments of the present invention further include a method and program product for training a machine learning model in substantially the same manner described above.
    Type: Grant
    Filed: June 12, 2019
    Date of Patent: November 29, 2022
    Assignee: International Business Machines Corporation
    Inventors: Pathirage D. S. U. Perera, Eitan D. Farchi, Orna Raz, Ramani Routray, Sheng Hua Bao, Marcel Zalmanovici
  • Patent number: 11514311
    Abstract: A method, apparatus and a computer program product for automated data slicing based on an Artificial Neural Network (ANN). The method comprising: obtaining an ANN, wherein the ANN is configured to provide a prediction for a data instance, wherein the ANN comprises a set of nodes having interconnections therebetween; determining an attribute vector based on a subset of the nodes of the ANN; determining, based on the attribute vector, a plurality of data slices; obtaining a testing dataset comprising testing data instances; computing, for each data slice, a performance measurement of the ANN over the data slice, wherein said computing is based on an application of the ANN on each testing data instance that is mapped to the data slice; and performing an action based on at least a portion of the performance measurements of the data slices.
    Type: Grant
    Filed: July 3, 2019
    Date of Patent: November 29, 2022
    Assignee: International Business Machines Corporation
    Inventors: Rachel Brill, Eitan Farchi, Orna Raz, Aviad Zlotnick
  • Patent number: 11481667
    Abstract: Embodiments of the present systems and methods may provide improved machine learning performance even though data drift has occurred. For example, a method may comprise providing a machine learning model in a computer system, operating the machine learning model using a first dataset to obtain results of the first dataset, operating the machine learning model using a second dataset to obtain results of the second dataset, performing statistical testing on a confidence distribution of results of the first dataset and of results of the second dataset to determine a difference in a result confidence distribution between the first dataset and the second dataset, and determining whether data included in the second dataset has data drift relative to the first dataset based on the difference in a result confidence distribution between the first dataset and the second dataset.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: October 25, 2022
    Assignee: International Business Machines Corporation
    Inventors: Orna Raz, Marcel Zalmanovici, Aviad Zlotnick
  • Patent number: 11409992
    Abstract: A method and a computer program product for identification and improvement of machine learning (ML) under-performance. The method comprises slicing data of an ML model based on a functional model representing requirements of a system utilizing the ML model. The functional model comprises a set of attributes and respective domains of values. Each data slice is associated with a different valuation of one or more attributes of the functional model. Each data instance of the ML model is mapped to one or more data slices, based on valuation of the attributes. A performance measurement of the ML model is computed for each data slice, based on an application of the ML model on each data instance that is mapped to the data slice. A determination of whether the ML model adheres to a target performance requirement may be performed based on the performance measurements of the data slices.
    Type: Grant
    Filed: June 10, 2019
    Date of Patent: August 9, 2022
    Assignee: International Business Machines Corporation
    Inventors: Rachel Brill, Eitan Farchi, Orna Raz, Aviad Zlotnick
  • Patent number: 11372905
    Abstract: From metadata corresponding to a narrative text, a first encoding is constructed, the first encoding comprising a standardized text string, the first encoding formed according to an encoding scheme. A specified portion of the standardized text string of the first encoding is marked as an anchor term. A correspondence between the first encoding and a second encoding is tested using the encoding scheme and a Natural Language Processing engine, responsive to finding the anchor term within the narrative text. The second encoding corresponds to a text window. The text window comprises a portion of the narrative text comprising an instance of the anchor term and a word within a predetermined distance from the instance. Responsive to the second encoding being identical to the first encoding, the narrative text is annotated, the annotating creating new data linking the narrative text with the second encoding.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: June 28, 2022
    Assignee: International Business Machines Corporation
    Inventors: Nakul Chakrapani, Ramani Routray, Pathirage Perera, Sheng Hua Bao, Orna Raz, Eitan Farchi
  • Publication number: 20220172124
    Abstract: A system and method for generating data slices for validating a classifier and validating the classifier. The classifier is trained using a training data set to train the underlying machine learning algorithm. Data is passed through the trained classifier to obtain results. The results are scored to determine the likelihood that the classifier correctly classified the data. Features are identified in the data set that can be used to validate the classifier. Based on the identified features at least one data slice in the data set is identified. The classifier is validated using the at least one data slice.
    Type: Application
    Filed: December 2, 2020
    Publication date: June 2, 2022
    Inventors: Orna Raz, Marcel Zalmanovici, Eitan Daniel Farchi, Raviv Gal, Avi Ziv
  • Patent number: 11334816
    Abstract: A mechanism is provided in a data processing system having a processor and a memory. The memory comprises instructions which are executed by the processor to cause the processor to implement a training system for finding an optimal surface for hierarchical classification task on an ontology. The training system receives a training data set and a hierarchical ontology data structure. A surface finding component executing within the training system selects a surface that passes through each path from a root to a leaf node in the hierarchical ontology data structure. The surface finding component determines a plurality of adjacent surfaces that differ from the selected component by one node. The surface finding component selects an optimal surface, based on the selected surface and the plurality of adjacent surfaces, that maximizes accuracy and coverage. The training system trains a classifier model for a cognitive system using the optimal surface and the training data set.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: May 17, 2022
    Assignee: International Business Machines Corporation
    Inventors: Eitan D. Farchi, Pathirage Perera, Orna Raz
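
Several of the abstracts above (for example, Patent numbers 12056580, 11734143 and 11409992) describe grouping records by feature values, measuring a model's performance on each group, and flagging groups that fall below the overall performance by some threshold. The following Python snippet is a minimal illustrative sketch of that general idea only, not the patented method. The function name `underperforming_slices`, the record keys `label` and `prediction`, the use of accuracy as the performance parameter, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def underperforming_slices(records, feature, threshold=0.1):
    """Flag groups whose accuracy trails overall accuracy by >= threshold.

    `records` is a list of dicts holding a ground-truth "label", a model
    "prediction", and feature values (hypothetical layout for this sketch).
    Returns (overall_accuracy, {feature_value: group_accuracy}).
    """
    # Overall accuracy across every record in the collection.
    overall = sum(r["label"] == r["prediction"] for r in records) / len(records)

    # Divide the records into groups by the value of the chosen feature.
    groups = defaultdict(list)
    for r in records:
        groups[r[feature]].append(r["label"] == r["prediction"])

    # Keep only the groups that underperform by at least the threshold.
    flagged = {}
    for value, hits in groups.items():
        accuracy = sum(hits) / len(hits)
        if overall - accuracy >= threshold:
            flagged[value] = accuracy
    return overall, flagged


# Made-up sample data: the model is right on "north" and mostly wrong on "south".
records = [
    {"region": "north", "label": 1, "prediction": 1},
    {"region": "north", "label": 0, "prediction": 0},
    {"region": "north", "label": 1, "prediction": 1},
    {"region": "south", "label": 1, "prediction": 0},
    {"region": "south", "label": 0, "prediction": 1},
    {"region": "south", "label": 1, "prediction": 1},
]

overall, flagged = underperforming_slices(records, "region", threshold=0.2)
print(f"overall accuracy: {overall:.2f}")
for value, accuracy in flagged.items():
    print(f"underperforming slice region={value}: accuracy {accuracy:.2f}")
```

In this sketch a single categorical feature defines the groups; the abstracts also mention slices built from several features or derived automatically, which this example does not attempt to cover.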