Patents by Inventor Yigal Shai Dayan

Yigal Shai Dayan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9218336
    Abstract: Constructing an automaton for automated analysis of agglutinative languages comprises: constructing an affix automaton for each of a plurality of affix types of an agglutinative language, where each of the affix types is associated with one or more affixes associated with a morphological concept; combining any of the affix automatons to form a plurality of template automatons, where each of the template automatons is patterned after any of a plurality of agglutination templates of any of the affix types for the language; and combining the template automatons into a master automaton.
    Type: Grant
    Filed: March 28, 2007
    Date of Patent: December 22, 2015
    Assignee: International Business Machines Corporation
    Inventors: Daniel Cohen, Yigal Shai Dayan, Josemina Marcella Magdalen, Victoria Mazel
  • Patent number: 8438010
    Abstract: A system for stemming words of Semitic languages, the system including an affix scanner configured to scan a word of a Semitic language for at least one affix according to a predefined scanning sequence and determine if at least one predefined scanning criterion is met, and a stemmer configured to remove the affix from the word if the predefined scanning criterion is met.
    Type: Grant
    Filed: December 6, 2007
    Date of Patent: May 7, 2013
    Assignee: International Business Machines Corporation
    Inventors: Daniel Cohen, Yigal Shai Dayan, Josemina Magdalen, Victoria Mazel
  • Patent number: 8165869
    Abstract: Illustrative embodiments provide a computer implemented method, apparatus, and computer program product for learning word segmentation from non-white space language corpora. In one illustrative embodiment, the computer implemented method receives text input characters and calculates a ratio-measure for each pair of characters in the input characters. The computer implemented method further determines whether the ratio-measure of each pair of characters is less than a predetermined threshold value. Responsive to determining the ratio-measure is less than the predetermined threshold value, and a local-minimum value, the computer implemented method further identifies the pair as a weak pair and breaks the weak pair of characters.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: April 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Josemina Marcella Magdalen, Yigal Shai Dayan, Victoria Mazel, Daniel Cohen
  • Patent number: 7917353
    Abstract: A hybrid n-gram/lexical analysis tokenization system including a lexicon and a hybrid tokenizer operative to perform both N-gram tokenization of a text and lexical analysis tokenization of a text using the lexicon, and to construct either of an index and a classifier from the results of both of the N-gram tokenization and the lexical analysis tokenization, where the hybrid tokenizer is implemented in at least one of computer hardware and computer software and is embodied within a computer-readable medium.
    Type: Grant
    Filed: March 29, 2007
    Date of Patent: March 29, 2011
    Assignee: International Business Machines Corporation
    Inventors: Yigal Shai Dayan, Josemina Marcella Magdalen, Victoria Mazel
  • Patent number: 7912703
    Abstract: Illustrated embodiments provide a computer implemented method, an apparatus, and a computer program product for unsupervised stemming schema learning and lexicon acquisition from corpora. In one illustrative embodiment, the computer implemented method obtains a corpus from corpora, analyzes the corpus to deduce a set of possible stemming schema and reviews and revises the set of possible stemming schema, to create a pruned set of stemming schema. The computer implemented method further deduces a lexicon from the corpus using the pruned set of stemming schema.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: March 22, 2011
    Assignee: International Business Machines Corporation
    Inventors: Josemina Marcella Magdalen, Yigal Shai Dayan, Victoria Mazel, Daniel Cohen
  • Publication number: 20090150140
    Abstract: A system for stemming words of Semitic languages, the system including an affix scanner configured to scan a word of a Semitic language for at least one affix according to a predefined scanning sequence and determine if at least one predefined scanning criterion is met, and a stemmer configured to remove the affix from the word if the predefined scanning criterion is met.
    Type: Application
    Filed: December 6, 2007
    Publication date: June 11, 2009
    Applicant: International Business Machines Corporation
    Inventors: Daniel Cohen, Yigal Shai Dayan, Josemina Magdalen, Victoria Mazel
  • Publication number: 20090150145
    Abstract: Illustrative embodiments provide a computer implemented method, apparatus, and computer program product for learning word segmentation from non-white space language corpora. In one illustrative embodiment, the computer implemented method receives text input characters and calculates a ratio-measure for each pair of characters in the input characters. The computer implemented method further determines whether the ratio-measure of each pair of characters is less than a predetermined threshold value. Responsive to determining the ratio-measure is less than the predetermined threshold value, and a local-minimum value, the computer implemented method further identifies the pair as a weak pair and breaks the weak pair of characters.
    Type: Application
    Filed: December 10, 2007
    Publication date: June 11, 2009
    Inventors: Josemina Marcella Magdalen, Yigal Shai Dayan, Victoria Mazel, Daniel Cohen
  • Publication number: 20090150415
    Abstract: Illustrated embodiments provide a computer implemented method, an apparatus, and a computer program product for unsupervised stemming schema learning and lexicon acquisition from corpora. In one illustrative embodiment, the computer implemented method obtains a corpus from corpora, analyzes the corpus to deduce a set of possible stemming schema and reviews and revises the set of possible stemming schema, to create a pruned set of stemming schema. The computer implemented method further deduces a lexicon from the corpus using the pruned set of stemming schema.
    Type: Application
    Filed: December 10, 2007
    Publication date: June 11, 2009
    Inventors: Josemina Marcella Magdalen, Yigal Shai Dayan, Victoria Mazel, Daniel Cohen
  • Publication number: 20080243478
    Abstract: A method for constructing an automaton for automated analysis of agglutinative languages, the method including constructing an affix automaton for each of a plurality of affix types of an agglutinative language, where each of the affix types is associated with one or more affixes associated with a morphological concept, combining any of the affix automatons to form a plurality of template automatons, where each of the template automatons is patterned after any of a plurality of agglutination templates of any of the affix types for the language, and combining the template automatons into a master automaton.
    Type: Application
    Filed: March 28, 2007
    Publication date: October 2, 2008
    Inventors: Daniel Cohen, Yigal Shai Dayan, Josemina Marcella Magdalen, Victoria Mazel
  • Publication number: 20080243487
    Abstract: A hybrid n-gram/lexical analysis tokenization system including a lexicon and a hybrid tokenizer operative to perform both N-gram tokenization of a text and lexical analysis tokenization of a text using the lexicon, and to construct either of an index and a classifier from the results of both of the N-gram tokenization and the lexical analysis tokenization, where the hybrid tokenizer is implemented in at least one of computer hardware and computer software and is embodied within a computer-readable medium.
    Type: Application
    Filed: March 29, 2007
    Publication date: October 2, 2008
    Applicant: International Business Machines Corporation
    Inventors: Yigal Shai Dayan, Josemina Marcella Magdalen, Victoria Mazel
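
As a rough illustration of the hybrid n-gram/lexical analysis tokenization described in the abstracts for patent 7917353 and publication 20080243487, the sketch below tokenizes each text two ways and builds one index from both results. It is not the patented implementation: the function names, the choice of character bigrams, whitespace word splitting, and the ("ngram", ...) / ("lex", ...) key scheme are all assumptions made for this example.

```python
from collections import defaultdict


def ngram_tokens(text, n=2):
    """Overlapping character n-grams of each whitespace-delimited word.

    Words shorter than n are kept whole (an assumption of this sketch).
    """
    tokens = []
    for word in text.lower().split():
        if len(word) < n:
            tokens.append(word)
        else:
            tokens.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return tokens


def lexical_tokens(text, lexicon):
    """Words of the text that appear in the lexicon (hypothetical stand-in
    for a real lexical analyzer)."""
    return [word for word in text.lower().split() if word in lexicon]


def build_hybrid_index(docs, lexicon, n=2):
    """Build one inverted index from both tokenizations.

    Keys are (kind, token) pairs so that n-gram and lexical tokens
    never collide; values are sets of document ids.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in ngram_tokens(text, n):
            index[("ngram", token)].add(doc_id)
        for token in lexical_tokens(text, lexicon):
            index[("lex", token)].add(doc_id)
    return dict(index)


# Usage: two tiny documents and a one-word lexicon (illustrative data only).
docs = {1: "red cat", 2: "cab"}
index = build_hybrid_index(docs, lexicon={"cat"})
```

Keeping the two token kinds under separate keys lets a lookup fall back to n-gram matches for words missing from the lexicon, which is one plausible reason to combine the two tokenizations.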