Patents by Inventor Ning-yi Xu
Ning-yi Xu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8868470Abstract: Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as a graphics processing unit. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.Type: GrantFiled: November 9, 2010Date of Patent: October 21, 2014Assignee: Microsoft CorporationInventors: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
-
Patent number: 8583569Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.Type: GrantFiled: December 22, 2011Date of Patent: November 12, 2013Assignee: Microsoft CorporationInventors: Ning-Yi Xu, Xiong-Fei Cai, Feng-Hsiung Hsu
-
Patent number: 8301638Abstract: A method using a RankBoost-based algorithm to automatically select features for further ranking model training is provided. The method reiteratively applies a set of ranking candidates to a training data set comprising a plurality of ranking objects having a known pairwise ranking order. Each round of iteration applies a weight distribution of ranking object pairs, yields a ranking result by each ranking candidate, identifies a favored ranking candidate for the round based on the ranking results, and updates the weight distribution to be used in next iteration round by increasing weights of ranking object pairs that are poorly ranked by the favored ranking candidate. The method then infers a target feature set from the favored ranking candidates identified in the iterations.Type: GrantFiled: September 25, 2008Date of Patent: October 30, 2012Assignee: Microsoft CorporationInventors: Ning-Yi Xu, Feng-Hsiung Hsu, Rui Gao, Xiong-Fei Cai, Junyan Chen
-
Publication number: 20120117008Abstract: Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as a graphics processing unit. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.Type: ApplicationFiled: November 9, 2010Publication date: May 10, 2012Applicant: MICROSOFT CORPORATIONInventors: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
-
Publication number: 20120092040Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.Type: ApplicationFiled: December 22, 2011Publication date: April 19, 2012Applicant: Microsoft CorporationInventors: Ning-Yi Xu, Feng-Hsiung Hsu, Xiong-Fei Cai
-
Patent number: 8131659Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and processing speed. A Field Programmable Gate Array (FPGA) is configured to have a hardware logic performing computations associated with a neural network training algorithm, especially a Web relevance ranking algorithm such as LambaRank. The training data is first processed and organized by a host computing device, and then streamed to the FPGA for direct access by the FPGA to perform high-bandwidth computation with increased training speed. Thus, large data sets such as that related to Web relevance ranking can be processed. The FPGA may include a processing element performing computations of a hidden layer of the neural network training algorithm. Parallel computing may be realized using a single instruction multiple data streams (SIMD) architecture with multiple arithmetic logic units in the FPGA.Type: GrantFiled: September 25, 2008Date of Patent: March 6, 2012Assignee: Microsoft CorporationInventors: Ning-Yi Xu, Xiong-Fei Cai, Rui Gao, Jing Yan, Feng-Hsiung Hsu
-
Patent number: 8117137Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.Type: GrantFiled: April 19, 2007Date of Patent: February 14, 2012Assignee: Microsoft CorporationInventors: Ning-yi Xu, Feng-Hsiung Hsu, Xiong-Fei Cai
-
Publication number: 20100076911Abstract: A method using a RankBoost-based algorithm to automatically select features for further ranking model training is provided. The method reiteratively applies a set of ranking candidates to a training data set comprising a plurality of ranking objects having a known pairwise ranking order. Each round of iteration applies a weight distribution of ranking object pairs, yields a ranking result by each ranking candidate, identifies a favored ranking candidate for the round based on the ranking results, and updates the weight distribution to be used in next iteration round by increasing weights of ranking object pairs that are poorly ranked by the favored ranking candidate. The method then infers a target feature set from the favored ranking candidates identified in the iterations.Type: ApplicationFiled: September 25, 2008Publication date: March 25, 2010Applicant: Microsoft CorporationInventors: Ning-Yi Xu, Junyan Chen, Rui Gao, Xiong-Fei Cai, Feng-Hsiung Hsu
-
Publication number: 20100076915Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and processing speed. A Field Programmable Gate Array (FPGA) is configured to have a hardware logic performing computations associated with a neural network training algorithm, especially a Web relevance ranking algorithm such as LambaRank. The training data is first processed and organized by a host computing device, and then streamed to the FPGA for direct access by the FPGA to perform high-bandwidth computation with increased training speed. Thus, large data sets such as that related to Web relevance ranking can be processed. The FPGA may include a processing element performing computations of a hidden layer of the neural network training algorithm. Parallel computing may be realized using a single instruction multiple data streams (SIMD) architecture with multiple arithmetic logic units in the FPGA.Type: ApplicationFiled: September 25, 2008Publication date: March 25, 2010Applicant: Microsoft CorporationInventors: Ning-Yi Xu, Xiong-Fei Cai, Rui Gao, Jing Yan, Feng-Hsiung Hsu
-
Publication number: 20080262984Abstract: Accelerator systems and methods are disclosed that utilize FPGA technology to achieve better parallelism and flexibility. The accelerator system may be used to implement a relevance-ranking algorithm, such as RankBoost, for a training process. The algorithm and related data structures may be organized to enable streaming data access and, thus, increase the training speed. The data may be compressed to enable the system and method to be operable with larger data sets. At least a portion of the approximated RankBoost algorithm may be implemented as a single instruction multiple data streams (SIMD) architecture with multiple processing engines (PEs) in the FPGA. Thus, large data sets can be loaded on memories associated with an FPGA to increase the speed of the relevance ranking algorithm.Type: ApplicationFiled: April 19, 2007Publication date: October 23, 2008Applicant: Microsoft CorporationInventors: Ning-yi Xu, Feng-Hsiung Hsu, Xiong-Fei Cai