Patents by Inventor Xiaoqiang Luo

Xiaoqiang Luo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220253540
    Abstract: Techniques for securely storing and processing data for training data generation are provided. In one technique, multiple encrypted records are retrieved from a first persistent storage. For each encrypted record, that record is decrypted in memory to generate a decrypted record that comprises multiple attribute values. Then, based on the attribute values and a definition of multiple features of a machine-learned model, multiple feature values are generated and stored, along with a label, in a training instance, which is then stored in a second persistent storage. One or more machine learning techniques are used to train the machine-learned model based on training data that includes the training instances that are stored in the second persistent storage.
    Type: Application
    Filed: February 5, 2021
    Publication date: August 11, 2022
    Inventors: Yunpeng Xu, Tianhao Lu, Xiaoqiang Luo, Jiashuo Wang, Chencheng Wu
  • Patent number: 10698964
    Abstract: A method for automatically extracting and organizing information by a processing device from a plurality of data sources is provided. A natural language processing information extraction pipeline that includes an automatic detection of entities is applied to the data sources. Information about detected entities is identified by analyzing products of the natural language processing pipeline. Identified information is grouped into equivalence classes containing equivalent information. At least one displayable representation of the equivalence classes is created. An order in which the at least one displayable representation is displayed is computed. A combined representation of the equivalence classes that respects the order in which the displayable representation is displayed is produced.
    Type: Grant
    Filed: January 30, 2017
    Date of Patent: June 30, 2020
    Assignee: International Business Machines Corporation
    Inventors: Vittorio Castelli, Radu Florian, Xiaoqiang Luo, Hema Raghavan
  • Publication number: 20200097605
    Abstract: A system and method are provided for automatic identification, extraction, and validation of data pertaining to receiving entity events (REE). Feature (or attribute) values associated with web content are identified. The web content may contain news and features on current/past affairs. The identified feature values are considered by a rule-based or a machine-learned model and, based upon output of the model, a determination as to whether the set of data comprises a REE is made. If the determination is positive, then multiple data items are extracted from the set of data and, optionally, from other data from the source.
    Type: Application
    Filed: September 25, 2018
    Publication date: March 26, 2020
    Inventors: Jingyuan Liu, Xiaoqiang Luo, Tzu Ming Kuo, Marcello Oliva, Yunpeng Xu
  • Publication number: 20190197176
    Abstract: Techniques for identifying relationships between entities using machine learning are disclosed herein.
    Type: Application
    Filed: December 21, 2017
    Publication date: June 27, 2019
    Inventors: Xiaoqiang Luo, Yunpeng Xu, Marcello Oliva
  • Publication number: 20170140057
    Abstract: A method for automatically extracting and organizing information by a processing device from a plurality of data sources is provided. A natural language processing information extraction pipeline that includes an automatic detection of entities is applied to the data sources. Information about detected entities is identified by analyzing products of the natural language processing pipeline. Identified information is grouped into equivalence classes containing equivalent information. At least one displayable representation of the equivalence classes is created. An order in which the at least one displayable representation is displayed is computed. A combined representation of the equivalence classes that respects the order in which the displayable representation is displayed is produced.
    Type: Application
    Filed: January 30, 2017
    Publication date: May 18, 2017
    Inventors: Vittorio Castelli, Radu Florian, Xiaoqiang Luo, Hema Raghavan
  • Patent number: 9471559
    Abstract: Creating training data for a natural language processing system may comprise obtaining natural language input, the natural language input annotated with one or more important phrases; and generating training instances comprising a syntactic parse tree of nodes representing elements of the natural language input augmented with the annotated important phrases. In another aspect, a classifier may be trained based on the generated training instances. The classifier may be used to predict one or more potential important phrases in a query.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: October 18, 2016
    Assignee: International Business Machines Corporation
    Inventors: Vittorio Castelli, Radu Florian, Xiaoqiang Luo, Sameer Maskey, Hema Raghavan
  • Patent number: 8903707
    Abstract: A method, an apparatus and an article of manufacture for determining a dropped pronoun from a source language. The method includes collecting parallel sentences from a source and a target language, creating at least one word alignment between the parallel sentences in the source and the target language, mapping at least one pronoun from the target language sentence onto the source language sentence, computing at least one feature from the mapping, wherein the at least one feature is extracted from both the source language and the at least one pronoun projected from the target language, and using the at least one feature to train a classifier to predict position and spelling of at least one pronoun in the target language when the at least one pronoun is dropped in the source language.
    Type: Grant
    Filed: January 12, 2012
    Date of Patent: December 2, 2014
    Assignee: International Business Machines Corporation
    Inventors: Bing Zhao, Imed Zitouni, Xiaoqiang Luo, Vittorio Castelli
  • Publication number: 20140163962
    Abstract: Creating training data for a natural language processing system may comprise obtaining natural language input, the natural language input annotated with one or more important phrases; and generating training instances comprising a syntactic parse tree of nodes representing elements of the natural language input augmented with the annotated important phrases. In another aspect, a classifier may be trained based on the generated training instances. The classifier may be used to predict one or more potential important phrases in a query.
    Type: Application
    Filed: March 15, 2013
    Publication date: June 12, 2014
    Applicant: International Business Machines Corporation
    Inventors: Vittorio Castelli, Radu Florian, Xiaoqiang Luo, Sameer Maskey, Hema Raghavan
  • Patent number: 8620961
    Abstract: A Bell Tree data structure is provided to model the process of chaining the mentions, from one or more documents, into entities, tracking the entire process; where the data structure is used in an entity tracking process that produces multiple results ranked by a product of probability scores.
    Type: Grant
    Filed: May 5, 2008
    Date of Patent: December 31, 2013
    Assignee: International Business Machines Corporation
    Inventors: Abraham Ittycheriah, Hongyan Jing, Nandakishore Kambhatla, Xiaoqiang Luo, Salim E. Roukos
  • Publication number: 20130332450
    Abstract: A method for automatically extracting and organizing information by a processing device from a plurality of data sources is provided. A natural language processing information extraction pipeline that includes an automatic detection of entities is applied to the data sources. Information about detected entities is identified by analyzing products of the natural language processing pipeline. Identified information is grouped into equivalence classes containing equivalent information. At least one displayable representation of the equivalence classes is created. An order in which the at least one displayable representation is displayed is computed. A combined representation of the equivalence classes that respects the order in which the displayable representation is displayed is produced.
    Type: Application
    Filed: June 11, 2012
    Publication date: December 12, 2013
    Applicant: International Business Machines Corporation
    Inventors: Vittorio Castelli, Radu Florian, Xiaoqiang Luo, Hema Raghavan
  • Publication number: 20130185049
    Abstract: A method, an apparatus and an article of manufacture for determining a dropped pronoun from a source language. The method includes collecting parallel sentences from a source and a target language, creating at least one word alignment between the parallel sentences in the source and the target language, mapping at least one pronoun from the target language sentence onto the source language sentence, computing at least one feature from the mapping, wherein the at least one feature is extracted from both the source language and the at least one pronoun projected from the target language, and using the at least one feature to train a classifier to predict position and spelling of at least one pronoun in the target language when the at least one pronoun is dropped in the source language.
    Type: Application
    Filed: January 12, 2012
    Publication date: July 18, 2013
    Applicant: International Business Machines Corporation
    Inventors: Bing Zhao, Imed Zitouni, Xiaoqiang Luo, Vittorio Castelli
  • Patent number: 7464024
    Abstract: A parser is provided that parses a Chinese text stream at the character level and builds a syntactic structure of Chinese character sequences. A character-based syntactic parse tree contains word boundaries, part-of-speech tags, and phrasal structure information. Syntactic knowledge constrains the system when it determines word boundaries. A deterministic procedure is used to convert word-based parse trees into character-based trees. Character-level tags are derived from word-level part-of-speech tags and word-boundary information is encoded with a positional tag. Word-level parts-of-speech become a constituent label in character-based trees. A maximum entropy parser is then built and tested.
    Type: Grant
    Filed: April 16, 2004
    Date of Patent: December 9, 2008
    Assignee: International Business Machines Corporation
    Inventors: Xiaoqiang Luo, Robert Todd Ward
  • Publication number: 20080243888
    Abstract: A Bell Tree data structure is provided to model the process of chaining the mentions, from one or more documents, into entities, tracking the entire process; where the data structure is used in an entity tracking process that produces multiple results ranked by a product of probability scores.
    Type: Application
    Filed: May 5, 2008
    Publication date: October 2, 2008
    Inventors: Abraham Ittycheriah, Hongyan Jing, Nandakishore Kambhatla, Xiaoqiang Luo, Salim E. Roukos
  • Patent number: 7398274
    Abstract: A Bell Tree data structure is provided to model the process of chaining the mentions, from one or more documents, into entities, tracking the entire process; where the data structure is used in an entity tracking process that produces multiple results ranked by a product of probability scores.
    Type: Grant
    Filed: April 27, 2004
    Date of Patent: July 8, 2008
    Assignee: International Business Machines Corporation
    Inventors: Abraham Ittycheriah, Hongyan Jing, Nandakishore Kambhatla, Xiaoqiang Luo, Salim E. Roukos
  • Patent number: 7308400
    Abstract: An arrangement for adapting statistical parsers to new data using a mathematical transform, particularly a Markov transform. In particular, it is assumed that an initial statistical parser is available and a batch of new data is given. The initial model is mapped to a new model by a Markov matrix, each of whose rows sums to one. In the unsupervised setup, where “true” parses are missing, the transform matrix is obtained by maximizing the log likelihood of the parses of test data decoded using the model before adaptation. The proposed algorithm can be applied to supervised adaptation, as well.
    Type: Grant
    Filed: December 14, 2000
    Date of Patent: December 11, 2007
    Assignee: International Business Machines Corporation
    Inventors: Xiaoqiang Luo, Salim E. Roukos, Robert T. Ward
  • Publication number: 20050237227
    Abstract: A Bell Tree data structure is provided to model the process of chaining the mentions, from one or more documents, into entities, tracking the entire process; where the data structure is used in an entity tracking process that produces multiple results ranked by a product of probability scores.
    Type: Application
    Filed: April 27, 2004
    Publication date: October 27, 2005
    Applicant: International Business Machines Corporation
    Inventors: Abraham Ittycheriah, Hongyan Jing, Nandakishore Kambhatla, Xiaoqiang Luo, Salim Roukos
  • Publication number: 20050234707
    Abstract: A parser is provided that parses a Chinese text stream at the character level and builds a syntactic structure of Chinese character sequences. A character-based syntactic parse tree contains word boundaries, part-of-speech tags, and phrasal structure information. Syntactic knowledge constrains the system when it determines word boundaries. A deterministic procedure is used to convert word-based parse trees into character-based trees. Character-level tags are derived from word-level part-of-speech tags and word-boundary information is encoded with a positional tag. Word-level parts-of-speech become a constituent label in character-based trees. A maximum entropy parser is then built and tested.
    Type: Application
    Filed: April 16, 2004
    Publication date: October 20, 2005
    Applicant: International Business Machines Corporation
    Inventors: Xiaoqiang Luo, Robert Ward
  • Publication number: 20040111253
    Abstract: A method, computer program product, and data processing system for training a statistical parser by utilizing active learning techniques to reduce the size of the corpus of human-annotated training samples (e.g., sentences) needed is disclosed. According to a preferred embodiment of the present invention, the statistical parser under training is used to compare the grammatical structure of the samples according to the parser's current level of training. The samples are then divided into clusters, with each cluster representing samples having a similar structure as ascertained by the statistical parser. Uncertainty metrics are applied to the clustered samples to select samples from each cluster that reflect uncertainty in the statistical parser's grammatical model. These selected samples may then be annotated by a human trainer for training the statistical parser.
    Type: Application
    Filed: December 10, 2002
    Publication date: June 10, 2004
    Applicant: International Business Machines Corporation
    Inventors: Xiaoqiang Luo, Salim Roukos, Min Tang
  • Publication number: 20020111793
    Abstract: An arrangement for adapting statistical parsers to new data using a mathematical transform, particularly a Markov transform. In particular, it is assumed that an initial statistical parser is available and a batch of new data is given. The initial model is mapped to a new model by a Markov matrix, each of whose rows sums to one. In the unsupervised setup, where “true” parses are missing, the transform matrix is obtained by maximizing the log likelihood of the parses of test data decoded using the model before adaptation. The proposed algorithm can be applied to supervised adaptation, as well.
    Type: Application
    Filed: December 14, 2000
    Publication date: August 15, 2002
    Applicant: IBM Corporation
    Inventors: Xiaoqiang Luo, Salim E. Roukos, Robert T. Ward
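
The Markov transform described in the abstract above (an initial model mapped to a new model by a matrix whose rows each sum to one) can be illustrated with a toy example. The sketch below is not the patented procedure: it assumes the "model" is a single discrete distribution rather than a statistical parser, and it fits the matrix with a plain expectation-maximization update chosen here for illustration. All names (`apply_markov_transform`, `em_step`, `counts`) are hypothetical.

```python
import math


def apply_markov_transform(p, Q):
    """Map distribution p to p' where p'[j] = sum_i p[i] * Q[i][j].

    Q must be row-stochastic: non-negative entries, each row summing to one.
    """
    for row in Q:
        if min(row) < 0 or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of Q must be a probability distribution")
    return [sum(p[i] * Q[i][j] for i in range(len(p))) for j in range(len(Q[0]))]


def log_likelihood(counts, p):
    """Log likelihood of observed event counts under distribution p."""
    return sum(c * math.log(p[j]) for j, c in enumerate(counts) if c > 0)


def em_step(p, Q, counts):
    """One EM update of Q (assumed fitting method, not taken from the patent).

    Treats the original event i as hidden and the adapted event j as observed,
    so the update cannot decrease log_likelihood(counts, p transformed by Q).
    """
    adapted = apply_markov_transform(p, Q)
    new_Q = []
    for i in range(len(Q)):
        # Expected number of times original event i produced each observed event j.
        row = [
            counts[j] * p[i] * Q[i][j] / adapted[j] if adapted[j] > 0 else 0.0
            for j in range(len(Q[0]))
        ]
        total = sum(row)
        new_Q.append([x / total for x in row] if total > 0 else list(Q[i]))
    return new_Q


# Toy data: an initial model and event counts decoded from new-domain text.
p = [0.7, 0.2, 0.1]
Q0 = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
counts = [10, 30, 60]

adapted = apply_markov_transform(p, Q0)
before = log_likelihood(counts, adapted)
Q1 = em_step(p, Q0, counts)
after = log_likelihood(counts, apply_markov_transform(p, Q1))
print(before, after)
```

Because every row of the matrix is itself a probability distribution, the transformed model is guaranteed to remain a valid distribution, which is the property the row-sum constraint in the abstract provides.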