Patents by Inventor Naama Tepper

Naama Tepper has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11797516
    Abstract: Balancing an imbalanced dataset, by: Receiving a balancing policy and the imbalanced dataset. Performing initial adjustment of the imbalanced dataset to comply with the balancing policy, by: oversampling one or more underrepresented classes, and, if one or more of the classes are overrepresented, undersampling them. Operating a generative machine learning model to generate samples for the one or more underrepresented classes, based on the initially-adjusted dataset. Operating a machine learning classification model to label the generated samples with class labels corresponding to the one or more underrepresented classes. Selecting some of the generated samples which, according to the labeling, have a relatively high probability of preserving their class labels.
    Type: Grant
    Filed: May 12, 2021
    Date of Patent: October 24, 2023
    Assignee: International Business Machines Corporation
    Inventors: Naama Tepper, Esther Goldbraich, Boaz Carmeli, Naama Zwerdling, George Kour, Ateret Anaby Tavor
  • Patent number: 11526667
    Abstract: Embodiments of the present systems and methods may provide techniques for augmenting textual data that may be used for textual classification tasks. Embodiments of such techniques may provide the capability to synthesize labeled data to improve text classification tasks. Embodiments may be specifically useful when only a small amount of data is available, and provide improved performance in such cases. For example, in an embodiment, a method implemented in a computer system may comprise a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, and the method may comprise fine-tuning a language model using a training dataset, synthesizing a plurality of samples using the fine-tuned language model, filtering the plurality of synthesized samples, and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences.
    Type: Grant
    Filed: May 9, 2020
    Date of Patent: December 13, 2022
    Assignee: International Business Machines Corporation
    Inventors: Amir Kantor, Ateret Anaby Tavor, Boaz Carmeli, Esther Goldbraich, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling
  • Publication number: 20220374410
    Abstract: Balancing an imbalanced dataset, by: Receiving a balancing policy and the imbalanced dataset. Performing initial adjustment of the imbalanced dataset to comply with the balancing policy, by: oversampling one or more underrepresented classes, and, if one or more of the classes are overrepresented, undersampling them. Operating a generative machine learning model to generate samples for the one or more underrepresented classes, based on the initially-adjusted dataset. Operating a machine learning classification model to label the generated samples with class labels corresponding to the one or more underrepresented classes. Selecting some of the generated samples which, according to the labeling, have a relatively high probability of preserving their class labels.
    Type: Application
    Filed: May 12, 2021
    Publication date: November 24, 2022
    Inventors: Naama Tepper, Esther Goldbraich, Boaz Carmeli, Naama Zwerdling, George Kour, Ateret Anaby Tavor
  • Patent number: 11222058
    Abstract: Familiarity-based text classification framework selection is described. A list of participants in an electronic message thread is selected. For each pairing of participants, a familiarity score is determined based on a number of criteria. A familiarity model is formed based on multiple familiarity scores and a text classification framework for the electronic message thread is selected based on the familiarity model.
    Type: Grant
    Filed: December 13, 2017
    Date of Patent: January 11, 2022
    Assignee: International Business Machines Corporation
    Inventors: Ethan A. Geyer, Jonathan F. Brunn, Jonathan Dunne, Naama Tepper
  • Publication number: 20210350076
    Abstract: Embodiments of the present systems and methods may provide techniques for augmenting textual data that may be used for textual classification tasks. Embodiments of such techniques may provide the capability to synthesize labeled data to improve text classification tasks. Embodiments may be specifically useful when only a small amount of data is available, and provide improved performance in such cases. For example, in an embodiment, a method implemented in a computer system may comprise a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, and the method may comprise fine-tuning a language model using a training dataset, synthesizing a plurality of samples using the fine-tuned language model, filtering the plurality of synthesized samples, and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences.
    Type: Application
    Filed: May 9, 2020
    Publication date: November 11, 2021
    Inventors: Amir Kantor, Ateret Anaby Tavor, Boaz Carmeli, Esther Goldbraich, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling
  • Patent number: 11057230
    Abstract: A method, computer system, and computer program product for calculating a group chat segment duration is provided. The embodiment may include capturing a plurality of group chat messages from a chat message repository. The embodiment may also include determining a probability distribution based on analyzing the captured group chat messages over a time vector. The embodiment may further include calculating a time parameter based on the determined probability distribution. The embodiment may also include calculating a content parameter based on one or more relevant chat topics. The embodiment may further include calculating an attendee parameter based on a plurality of attendees and one or more attendee associations. The embodiment may also include determining a chat duration prediction based on the calculated time parameter, the calculated content parameter, and the calculated attendee parameter.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: July 6, 2021
    Assignee: International Business Machines Corporation
    Inventors: Jonathan F. Brunn, Rachael M. H. Dickens, Jonathan Dunne, Ethan A. Geyer, Liam S. Harpur, Bo Jiang, Andrew Penrose, Naama Tepper
  • Publication number: 20200084055
    Abstract: A method, computer system, and computer program product for calculating a group chat segment duration is provided. The embodiment may include capturing a plurality of group chat messages from a chat message repository. The embodiment may also include determining a probability distribution based on analyzing the captured group chat messages over a time vector. The embodiment may further include calculating a time parameter based on the determined probability distribution. The embodiment may also include calculating a content parameter based on one or more relevant chat topics. The embodiment may further include calculating an attendee parameter based on a plurality of attendees and one or more attendee associations. The embodiment may also include determining a chat duration prediction based on the calculated time parameter, the calculated content parameter, and the calculated attendee parameter.
    Type: Application
    Filed: November 15, 2019
    Publication date: March 12, 2020
    Inventors: Jonathan F. Brunn, Rachael M. H. Dickens, Jonathan Dunne, Ethan A. Geyer, Liam S. Harpur, Bo Jiang, Andrew Penrose, Naama Tepper
  • Patent number: 10574613
    Abstract: A method, computer system, and a computer program product for generating a chat summary personalized to a user is provided. The present invention may include receiving a plurality of input interactions associated with the user. The present invention may include determining a user profile based on the received plurality of input interactions, whereby the determined user profile includes a plurality of topics of interest. The present invention may include receiving a plurality of missed messages. The present invention may include determining a plurality of message clusters from the plurality of missed messages, whereby a topic is associated with each message cluster. The present invention may include ranking the determined plurality of message clusters based on comparing the topic associated with each message cluster to the plurality of topics of interest. The present invention may include presenting the ranked plurality of message clusters to the user.
    Type: Grant
    Filed: April 4, 2017
    Date of Patent: February 25, 2020
    Assignee: International Business Machines Corporation
    Inventors: Lior Leiba, Inbal Ronen, Naama Tepper
  • Patent number: 10541822
    Abstract: A method, computer system, and computer program product for calculating a group chat segment duration is provided. The embodiment may include capturing a plurality of group chat messages from a chat message repository. The embodiment may also include determining a probability distribution based on analyzing the captured group chat messages over a time vector. The embodiment may further include calculating a time parameter based on the determined probability distribution. The embodiment may also include calculating a content parameter based on one or more relevant chat topics. The embodiment may further include calculating an attendee parameter based on a plurality of attendees and one or more attendee associations. The embodiment may also include determining a chat duration prediction based on the calculated time parameter, the calculated content parameter, and the calculated attendee parameter.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: January 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Jonathan F. Brunn, Rachael M. H. Dickens, Jonathan Dunne, Ethan A. Geyer, Liam S. Harpur, Bo Jiang, Andrew Penrose, Naama Tepper
  • Publication number: 20190288964
    Abstract: Configuring a data output stream based on combined multi-source data streams by a) processing data from one or more data stream collators in accordance with predefined data pre-processing procedures, where the data are known to be associated with a given data stream source, b) processing the data using data group identification procedures to derive a data group distribution for the data stream source, c) processing the data using data segmentation procedures that relate to a data segmentation model, d) processing the data using data stream network identification procedures to identify network connections between data stream sources that are associated with the data, and to construct a model of the network connections, e) deriving, from output of any of steps b), c), and d), values for one or more attributes associated with the data stream source, and configuring a data output stream based on the attributes and the attribute values.
    Type: Application
    Filed: March 15, 2018
    Publication date: September 19, 2019
    Inventors: Jonathan Dunne, Anat Hashavit, Amir Nissan Cohen, Naama Tepper
  • Publication number: 20190179955
    Abstract: Familiarity-based text classification framework selection is described. A list of participants in an electronic message thread is selected. For each pairing of participants, a familiarity score is determined based on a number of criteria. A familiarity model is formed based on multiple familiarity scores and a text classification framework for the electronic message thread is selected based on the familiarity model.
    Type: Application
    Filed: December 13, 2017
    Publication date: June 13, 2019
    Inventors: Ethan A. Geyer, Jonathan F. Brunn, Jonathan Dunne, Naama Tepper
  • Publication number: 20190121907
    Abstract: Message grouping using temporal and multi-factor similarity includes grouping multiple messages of a corpus in a group messaging system into a number of message bursts. Each message burst includes a number of messages that have a temporal relationship. Multiple of the number of message bursts are grouped into a message cluster. The grouping is based on a similarity of the number of message bursts as defined by multiple features of the message bursts.
    Type: Application
    Filed: October 23, 2017
    Publication date: April 25, 2019
    Inventors: Jonathan F. Brunn, Daniel Dulaney, Ami Dewar, Ethan A. Geyer, Bo Jiang, Rachael Dickens, Scott E. Chapman, Thomas Blanchflower, Naama Tepper
  • Publication number: 20190103982
    Abstract: A method, computer system, and computer program product for calculating a group chat segment duration is provided. The embodiment may include capturing a plurality of group chat messages from a chat message repository. The embodiment may also include determining a probability distribution based on analyzing the captured group chat messages over a time vector. The embodiment may further include calculating a time parameter based on the determined probability distribution. The embodiment may also include calculating a content parameter based on one or more relevant chat topics. The embodiment may further include calculating an attendee parameter based on a plurality of attendees and one or more attendee associations. The embodiment may also include determining a chat duration prediction based on the calculated time parameter, the calculated content parameter, and the calculated attendee parameter.
    Type: Application
    Filed: September 29, 2017
    Publication date: April 4, 2019
    Inventors: Jonathan F. Brunn, Rachael M. H. Dickens, Jonathan Dunne, Ethan A. Geyer, Liam S. Harpur, Bo Jiang, Andrew Penrose, Naama Tepper
  • Publication number: 20180287981
    Abstract: A method, computer system, and a computer program product for generating a chat summary personalized to a user is provided. The present invention may include receiving a plurality of input interactions associated with the user. The present invention may include determining a user profile based on the received plurality of input interactions, whereby the determined user profile includes a plurality of topics of interest. The present invention may include receiving a plurality of missed messages. The present invention may include determining a plurality of message clusters from the plurality of missed messages, whereby a topic is associated with each message cluster. The present invention may include ranking the determined plurality of message clusters based on comparing the topic associated with each message cluster to the plurality of topics of interest. The present invention may include presenting the ranked plurality of message clusters to the user.
    Type: Application
    Filed: April 4, 2017
    Publication date: October 4, 2018
    Inventors: Lior Leiba, Inbal Ronen, Naama Tepper
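
As an illustration only, the ranking step described in the chat-summary abstract (comparing each message cluster's topic to the user's topics of interest) might be sketched as follows. This is not the patented method: the `MessageCluster` type, the `rank_clusters` function, and the idea of storing interests as numeric weights are all assumptions made for this sketch.

```python
from collections.abc import Mapping
from dataclasses import dataclass, field


@dataclass
class MessageCluster:
    """Hypothetical container: a group of missed messages sharing one topic."""
    topic: str
    messages: list[str] = field(default_factory=list)


def rank_clusters(
    clusters: list[MessageCluster],
    topics_of_interest: Mapping[str, float],
) -> list[MessageCluster]:
    """Return clusters ordered by how strongly their topic matches the profile.

    Assumption: the user profile is a mapping from topic to an interest
    weight; topics absent from the profile score 0.0 and sort last.
    """
    return sorted(
        clusters,
        key=lambda cluster: topics_of_interest.get(cluster.topic, 0.0),
        reverse=True,
    )


# Example data (invented for illustration).
missed = [
    MessageCluster("lunch", ["Pizza at noon?"]),
    MessageCluster("release", ["Build 42 is green", "Shipping Friday"]),
    MessageCluster("budget", ["Q3 numbers attached"]),
]
profile = {"release": 0.9, "budget": 0.4}

ranked = rank_clusters(missed, profile)
ranked_topics = [cluster.topic for cluster in ranked]
print(ranked_topics)
```

A real system would also need the earlier steps the abstract lists (building the profile from input interactions and clustering the missed messages), which this sketch takes as given.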