Patents by Inventor Ning-yi Xu

Ning-yi Xu has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8868470
    Abstract: Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as graphics processing units. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.
    Type: Grant
    Filed: November 9, 2010
    Date of Patent: October 21, 2014
    Assignee: Microsoft Corporation
    Inventors: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
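The partition-then-aggregate scheme this abstract describes can be sketched as an approximate distributed collapsed Gibbs sampler for LDA: each worker sweeps its partition against a frozen snapshot of the global word-topic counts and emits a local count delta, and the deltas are summed into the global count. Everything below is an illustrative toy, not the patented implementation; names, sizes, and the sequential loop standing in for parallel processors are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_TOPICS, N_VOCAB, ALPHA, BETA = 3, 20, 0.1, 0.01

# Toy corpus: 8 documents of random word ids (stand-ins for real data).
docs = [rng.integers(0, N_VOCAB, size=int(rng.integers(5, 15))).tolist()
        for _ in range(8)]
assign = [rng.integers(0, N_TOPICS, size=len(doc)).tolist() for doc in docs]

# Global word-topic counts built from the initial random assignment.
global_wt = np.zeros((N_VOCAB, N_TOPICS), dtype=int)
for doc, zs in zip(docs, assign):
    for w, t in zip(doc, zs):
        global_wt[w, t] += 1
global_t = global_wt.sum(axis=0)

def local_sweep(doc_ids):
    """One collapsed-Gibbs sweep over a partition, sampling against a
    frozen snapshot of the global counts and returning the partition's
    local count delta (the "local count" that gets aggregated)."""
    delta = np.zeros_like(global_wt)
    for d in doc_ids:
        doc_topic = np.bincount(assign[d], minlength=N_TOPICS)
        for i, w in enumerate(docs[d]):
            t_old = assign[d][i]
            doc_topic[t_old] -= 1
            # Collapsed conditional: p(t|rest) ∝ (n_dt+α)(n_wt+β)/(n_t+Vβ)
            p = (doc_topic + ALPHA) * (global_wt[w] + BETA) \
                / (global_t + N_VOCAB * BETA)
            t_new = int(rng.choice(N_TOPICS, p=p / p.sum()))
            assign[d][i] = t_new
            doc_topic[t_new] += 1
            delta[w, t_old] -= 1
            delta[w, t_new] += 1
    return delta

# Partition the corpus, sweep each partition (sequentially here, standing
# in for parallel processors), then aggregate local deltas globally.
partitions = [list(range(p, len(docs), 2)) for p in range(2)]
for _ in range(5):
    for delta in [local_sweep(part) for part in partitions]:
        global_wt += delta
    global_t = global_wt.sum(axis=0)
```

Aggregating deltas rather than absolute counts keeps the global table consistent with the current topic assignments even though each worker sampled from a stale snapshot of it.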
  • Patent number: 8583569
    Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: November 12, 2013
    Assignee: Microsoft Corporation
    Inventors: Ning-Yi Xu, Xiong-Fei Cai, Feng-Hsiung Hsu
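The streaming, histogram-based organization this abstract hints at can be emulated in software: quantized feature values stream past a bank of "processing engines" that accumulate per-feature weight histograms in lockstep, and the best threshold weak ranker is then read off with a suffix sum. This is a rough sketch of the approximated RankBoost structure, not the patented FPGA design; the per-document potential `pi`, the bin counts, and the PE striping are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N_DOCS, N_FEATURES, N_BINS, N_PES = 64, 8, 16, 4

# Feature values quantized into N_BINS integer bins (compression step);
# pi is a hypothetical per-document potential derived from the pairwise
# weight distribution.
X = rng.integers(0, N_BINS, size=(N_DOCS, N_FEATURES))
pi = rng.normal(size=N_DOCS)

# Each "processing engine" owns a stripe of features and updates its
# weight histograms in lockstep as documents stream past (SIMD-style).
hist = np.zeros((N_FEATURES, N_BINS))
for d in range(N_DOCS):          # one streamed document per cycle
    for pe in range(N_PES):      # engines would run in parallel on chip
        for f in range(pe, N_FEATURES, N_PES):
            hist[f, X[d, f]] += pi[d]

# r(f, θ) = Σ_d pi[d] · 1[x_f(d) ≥ θ]: a suffix sum over each histogram.
r = np.cumsum(hist[:, ::-1], axis=1)[:, ::-1]   # r[f, b] = Σ_{bin ≥ b}
best_f, best_bin = np.unravel_index(np.abs(r).argmax(), r.shape)
```

The point of the histogram layout is that one pass over the streamed data suffices to score every (feature, threshold) weak ranker, which is what makes the SIMD hardware mapping attractive.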
  • Patent number: 8301638
    Abstract: A method using a RankBoost-based algorithm to automatically select features for further ranking model training is provided. The method iteratively applies a set of ranking candidates to a training data set comprising a plurality of ranking objects having a known pairwise ranking order. Each round of iteration applies a weight distribution of ranking object pairs, yields a ranking result by each ranking candidate, identifies a favored ranking candidate for the round based on the ranking results, and updates the weight distribution to be used in the next iteration round by increasing the weights of ranking object pairs that are poorly ranked by the favored ranking candidate. The method then infers a target feature set from the favored ranking candidates identified in the iterations.
    Type: Grant
    Filed: September 25, 2008
    Date of Patent: October 30, 2012
    Assignee: Microsoft Corporation
    Inventors: Ning-Yi Xu, Feng-Hsiung Hsu, Rui Gao, Xiong-Fei Cai, Junyan Chen
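The round structure of this abstract translates almost directly into code: score every candidate against the weighted pairs, keep the best, up-weight the pairs it misordered, and collect the favored candidates as the target feature set. The toy data and the simple weight-doubling update below are illustrative stand-ins (RankBoost proper uses an exponential update with a computed step size), not the claimed method.

```python
import numpy as np

rng = np.random.default_rng(2)
N_OBJECTS, N_CANDIDATES, N_ROUNDS = 30, 10, 5

# Each candidate feature assigns a score to every object (toy values).
scores = rng.normal(size=(N_CANDIDATES, N_OBJECTS))
# Known pairwise order: object i should outrank object j whenever i < j.
pairs = [(i, j) for i in range(N_OBJECTS) for j in range(i + 1, N_OBJECTS)]

w = np.ones(len(pairs)) / len(pairs)    # weight distribution over pairs
selected = []
for _ in range(N_ROUNDS):
    # Weighted error of each candidate: total weight of misordered pairs.
    errors = []
    for c in range(N_CANDIDATES):
        miss = np.array([scores[c, i] <= scores[c, j] for i, j in pairs])
        errors.append((w * miss).sum())
    favored = int(np.argmin(errors))
    selected.append(favored)
    # Increase weights of pairs the favored candidate ranked poorly
    # (doubling is a crude stand-in for the exponential update).
    miss = np.array([scores[favored, i] <= scores[favored, j]
                     for i, j in pairs])
    w = np.where(miss, w * 2.0, w)
    w /= w.sum()

target_features = sorted(set(selected))
```

Because the weight mass concentrates on hard pairs, later rounds favor candidates that complement the ones already selected, which is what makes the favored set usable as a feature-selection result.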
  • Publication number: 20120117008
    Abstract: Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as graphics processing units. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.
    Type: Application
    Filed: November 9, 2010
    Publication date: May 10, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
  • Publication number: 20120092040
    Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.
    Type: Application
    Filed: December 22, 2011
    Publication date: April 19, 2012
    Applicant: Microsoft Corporation
    Inventors: Ning-Yi Xu, Feng-Hsiung Hsu, Xiong-Fei Cai
  • Patent number: 8131659
    Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and processing speed. A Field Programmable Gate Array (FPGA) is configured with hardware logic that performs computations associated with a neural network training algorithm, especially a Web relevance ranking algorithm such as LambdaRank. The training data is first processed and organized by a host computing device, and then streamed to the FPGA for direct, high-bandwidth access that increases training speed. Thus, large data sets such as those related to Web relevance ranking can be processed. The FPGA may include a processing element performing computations of a hidden layer of the neural network training algorithm. Parallel computing may be realized using a single instruction multiple data streams (SIMD) architecture with multiple arithmetic logic units in the FPGA.
    Type: Grant
    Filed: September 25, 2008
    Date of Patent: March 6, 2012
    Assignee: Microsoft Corporation
    Inventors: Ning-Yi Xu, Xiong-Fei Cai, Rui Gao, Jing Yan, Feng-Hsiung Hsu
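The division of labor this abstract describes (host prepares and streams, a processing element evaluates the hidden layer with parallel multiply-accumulate lanes) can be sketched in software. The network shape, the tanh activation, and the one-sample-per-cycle loop below are toy assumptions for illustration, not the patented hardware design.

```python
import numpy as np

rng = np.random.default_rng(3)
N_IN, N_HIDDEN = 16, 8

# Host-side step: preprocess and pack samples contiguously so the
# accelerator can stream them (hypothetical toy batch).
batch = rng.normal(size=(32, N_IN)).astype(np.float32)
W1 = rng.normal(scale=0.1, size=(N_IN, N_HIDDEN)).astype(np.float32)
b1 = np.zeros(N_HIDDEN, dtype=np.float32)
w2 = rng.normal(scale=0.1, size=N_HIDDEN).astype(np.float32)

def hidden_layer(x):
    """The computation a processing element would perform per streamed
    sample: N_HIDDEN multiply-accumulate lanes in lockstep (the SIMD
    arithmetic logic units), followed by a tanh activation."""
    return np.tanh(x @ W1 + b1)

# Stream one sample per "cycle" and score it through the hidden layer
# and a linear output neuron.
outputs = np.array([float(hidden_layer(x) @ w2) for x in batch])
```

Streaming matters here because the hidden-layer matrix product is bandwidth-bound: with the weights resident on chip, each sample needs to cross the host-FPGA link exactly once.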
  • Patent number: 8117137
    Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.
    Type: Grant
    Filed: April 19, 2007
    Date of Patent: February 14, 2012
    Assignee: Microsoft Corporation
    Inventors: Ning-yi Xu, Feng-Hsiung Hsu, Xiong-Fei Cai
  • Publication number: 20100076911
    Abstract: A method using a RankBoost-based algorithm to automatically select features for further ranking model training is provided. The method iteratively applies a set of ranking candidates to a training data set comprising a plurality of ranking objects having a known pairwise ranking order. Each round of iteration applies a weight distribution of ranking object pairs, yields a ranking result by each ranking candidate, identifies a favored ranking candidate for the round based on the ranking results, and updates the weight distribution to be used in the next iteration round by increasing the weights of ranking object pairs that are poorly ranked by the favored ranking candidate. The method then infers a target feature set from the favored ranking candidates identified in the iterations.
    Type: Application
    Filed: September 25, 2008
    Publication date: March 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Ning-Yi Xu, Junyan Chen, Rui Gao, Xiong-Fei Cai, Feng-Hsiung Hsu
  • Publication number: 20100076915
    Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and processing speed. A Field Programmable Gate Array (FPGA) is configured with hardware logic that performs computations associated with a neural network training algorithm, especially a Web relevance ranking algorithm such as LambdaRank. The training data is first processed and organized by a host computing device, and then streamed to the FPGA for direct, high-bandwidth access that increases training speed. Thus, large data sets such as those related to Web relevance ranking can be processed. The FPGA may include a processing element performing computations of a hidden layer of the neural network training algorithm. Parallel computing may be realized using a single instruction multiple data streams (SIMD) architecture with multiple arithmetic logic units in the FPGA.
    Type: Application
    Filed: September 25, 2008
    Publication date: March 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Ning-Yi Xu, Xiong-Fei Cai, Rui Gao, Jing Yan, Feng-Hsiung Hsu
  • Publication number: 20080262984
    Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.
    Type: Application
    Filed: April 19, 2007
    Publication date: October 23, 2008
    Applicant: Microsoft Corporation
    Inventors: Ning-yi Xu, Feng-Hsiung Hsu, Xiong-Fei Cai