Patents by Inventor Yaoqing Gao
Yaoqing Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11573777
Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
Type: Grant
Filed: February 26, 2021
Date of Patent: February 7, 2023
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Reza Azimi, Cheng Xiang Feng, Kai-Ting Amy Wang, Yaoqing Gao, Ye Tian, Xiang Wang
-
Patent number: 11429359
Abstract: A method for improving the performance of applications executed within asynchronous processor architectures. In an embodiment, a method for improving execution time of compiled synchronized source code on an asynchronous processor architecture includes receiving, by a processing system, synchronized source code comprising synchronization instructions to synchronize execution of the synchronized source code on different pipelines of the asynchronous processor architecture. The method also includes analyzing, by the processing system, the synchronized source code to determine whether the synchronized source code includes a broken code condition.
Type: Grant
Filed: July 20, 2020
Date of Patent: August 30, 2022
Assignee: Huawei Technologies Co., Ltd.
Inventors: Ahmed Mohammed ElShafiey Mohammed Eltantawy, Yaoqing Gao, Christopher Rodrigues, Lijuan Hai
-
Patent number: 11221834
Abstract: Systems and methods for auto-tuning and compiling source code are provided. A first executable file is generated by compiling the source code in accordance with a first optimization scheme. Compiling reports, performance reports, and bottleneck information are generated for the first executable file. A second optimization scheme is generated, and a second executable file is generated by compiling the source code in accordance with the second optimization scheme. An optimized executable file is output based on the first executable file and the second executable file.
Type: Grant
Filed: March 23, 2020
Date of Patent: January 11, 2022
Assignee: Huawei Technologies Co., Ltd.
Inventors: Yaoqing Gao, Xuan Zhong, Peng Wu, Long Chen
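Illustration (not taken from the patent): the two-pass flow in the abstract above can be sketched in a few lines of Python. Everything here is an assumption made for the example, including the names Build, compile_and_profile, next_scheme and auto_tune, the flag names, and the invented cost model that stands in for a real compiler and profiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    scheme: tuple      # optimization flags used for this build (hypothetical names)
    runtime: float     # simulated runtime; a real tuner would measure this
    bottleneck: str    # dominant bottleneck reported for this build


def compile_and_profile(scheme):
    """Stand-in for compiling and profiling; the cost model is invented."""
    runtime = 10.0
    if "unroll" in scheme:
        runtime -= 2.0
    if "vectorize" in scheme:
        runtime -= 3.0
    bottleneck = "memory" if "vectorize" in scheme else "loop"
    return Build(tuple(scheme), runtime, bottleneck)


def next_scheme(build):
    """Derive a second scheme from the first build's bottleneck information."""
    remedy = {"loop": "vectorize", "memory": "prefetch"}
    return build.scheme + (remedy[build.bottleneck],)


def auto_tune(first_scheme):
    """Build twice and return whichever build ran faster."""
    first = compile_and_profile(first_scheme)
    second = compile_and_profile(next_scheme(first))
    return min(first, second, key=lambda b: b.runtime)


best = auto_tune(("unroll",))
print(best.scheme, best.runtime)
```

The sketch stops after two builds to mirror the abstract; a practical tuner would presumably iterate until the measured runtime stops improving.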
-
Patent number: 11188314
Abstract: Systems and methods for auto-tuning and compiling source code are provided. A first executable file is generated by compiling the source code in accordance with a first optimization scheme. Compiling reports, performance reports, and bottleneck information are generated for the first executable file. A second optimization scheme is generated, and a second executable file is generated by compiling the source code in accordance with the second optimization scheme. An optimized executable file is output based on the first executable file and the second executable file.
Type: Grant
Filed: March 23, 2020
Date of Patent: November 30, 2021
Assignee: Huawei Technologies Co., Ltd.
Inventors: Yaoqing Gao, Xuan Zhong, Peng Wu, Long Chen
-
Patent number: 11144290
Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
Type: Grant
Filed: September 13, 2019
Date of Patent: October 12, 2021
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Reza Azimi, Cheng Xiang Feng, Kai-Ting Amy Wang, Yaoqing Gao, Ye Tian, Xiang Wang
-
Publication number: 20210182041
Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
Type: Application
Filed: February 26, 2021
Publication date: June 17, 2021
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Reza AZIMI, Cheng Xiang FENG, Kai-Ting Amy WANG, Yaoqing GAO, Ye TIAN, Xiang WANG
-
Publication number: 20210081184
Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
Type: Application
Filed: September 13, 2019
Publication date: March 18, 2021
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Reza AZIMI, Kai-Ting Amy WANG, Yaoqing GAO, Ye TIAN, Xiang WANG, Cheng Xiang FENG
-
Publication number: 20210004213
Abstract: A method for improving the performance of applications executed within asynchronous processor architectures. In an embodiment, a method for improving execution time of compiled synchronized source code on an asynchronous processor architecture includes receiving, by a processing system, synchronized source code comprising synchronization instructions to synchronize execution of the synchronized source code on different pipelines of the asynchronous processor architecture. The method also includes analyzing, by the processing system, the synchronized source code to determine whether the synchronized source code includes a broken code condition.
Type: Application
Filed: July 20, 2020
Publication date: January 7, 2021
Inventors: Ahmed Mohammed ElShafiey Mohammed Eltantawy, Yaoqing Gao, Christopher Rodrigues, Lijuan Hai
-
Publication number: 20200293295
Abstract: Systems and methods for auto-tuning and compiling source code are provided. A first executable file is generated by compiling the source code in accordance with a first optimization scheme. Compiling reports, performance reports, and bottleneck information are generated for the first executable file. A second optimization scheme is generated, and a second executable file is generated by compiling the source code in accordance with the second optimization scheme. An optimized executable file is output based on the first executable file and the second executable file.
Type: Application
Filed: March 23, 2020
Publication date: September 17, 2020
Inventors: Yaoqing Gao, Xuan Zhong, Peng Wu, Long Chen
-
Publication number: 20200233882
Abstract: In some examples, a controller comprises a bucketization logic to receive a bucketization indication from a host processor, and in response to the bucketization indication, partition data stored in a memory of a storage device into buckets, wherein a first bucket of the buckets comprises data items that share a first common characteristic. The bucketization logic is to send data items of the first bucket to the host processor for processing by the host processor using a first code module configured for the first common characteristic of the first bucket.
Type: Application
Filed: January 18, 2019
Publication date: July 23, 2020
Inventors: Martin Ichilevici de Oliveira, Man Pok Ho, Jose Nelson Amaral, Kai-Ting Amy Wang, Yaoqing Gao, Bryan Chan
-
Patent number: 10216496
Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
Type: Grant
Filed: September 27, 2016
Date of Patent: February 26, 2019
Assignee: International Business Machines Corporation
Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
-
Patent number: 10095491
Abstract: Embodiments of the present invention provide a method, system and computer program product for the data splitting of recursive data structures. In one embodiment of the invention, a method for data splitting recursive data structures can be provided. The method can include identifying data objects of a recursive data structure type, such as a linked list, within source code, the recursive data structure type defining multiple different data fields. The method further can include grouping the data objects into some memory pool units, each of which can contain the same number of data objects. Each memory pool unit can be seen as an array of data objects. The method can include data splitting, which could be maximal array splitting in each different memory pool unit. Finally, the method can include three different approaches, including field padding, field padding and field splitting, to handle irregular field sizes in the data structure.
Type: Grant
Filed: July 28, 2015
Date of Patent: October 9, 2018
Assignee: International Business Machines Corporation
Inventors: Roch G. Archambault, Shimin Cui, Stephen Curial, Yaoqing Gao, Raul E. Silvera, Peng Zhao
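Illustration (not taken from the patent): a rough Python sketch of the layout idea in the abstract above, where linked-list nodes are grouped into fixed-size pool units and each field is kept in its own array inside the unit. The class names PoolUnit and SplitList, the two fields "value" and "next", and the (pool, slot) reference scheme are all assumptions made for this example; the handling of irregular field sizes mentioned in the abstract is not modelled.

```python
class PoolUnit:
    """One memory pool unit: a fixed number of objects, stored field by field."""

    def __init__(self, capacity):
        self.values = [None] * capacity  # the "value" field of every object
        self.nexts = [None] * capacity   # the "next" field of every object
        self.used = 0                    # number of slots already handed out


class SplitList:
    """Singly linked list whose nodes live in split pool units."""

    def __init__(self, pool_capacity=4):
        self.pool_capacity = pool_capacity
        self.pools = []
        self.head = None  # (pool index, slot) of the first node
        self.tail = None  # (pool index, slot) of the last node

    def _alloc(self):
        # Open a new pool unit when there is none yet or the last one is full.
        if not self.pools or self.pools[-1].used == self.pool_capacity:
            self.pools.append(PoolUnit(self.pool_capacity))
        pool = self.pools[-1]
        ref = (len(self.pools) - 1, pool.used)
        pool.used += 1
        return ref

    def append(self, value):
        pool_index, slot = ref = self._alloc()
        self.pools[pool_index].values[slot] = value
        if self.tail is None:
            self.head = ref
        else:
            tail_pool, tail_slot = self.tail
            self.pools[tail_pool].nexts[tail_slot] = ref
        self.tail = ref

    def __iter__(self):
        ref = self.head
        while ref is not None:
            pool_index, slot = ref
            yield self.pools[pool_index].values[slot]
            ref = self.pools[pool_index].nexts[slot]


lst = SplitList(pool_capacity=2)
for v in range(5):
    lst.append(v)
print(list(lst), len(lst.pools))
```

Keeping each field contiguous within a unit is what would let a traversal that reads only one field touch less memory; Python lists only mimic that layout, so the sketch shows the structure rather than any performance effect.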
-
Patent number: 10061568
Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
Type: Grant
Filed: December 21, 2017
Date of Patent: August 28, 2018
Assignee: International Business Machines Corporation
Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
-
Patent number: 9946523
Abstract: A code region of an application is instrumented by a multi-pass profiler with first annotations for generating profile data. The application is executed with the first annotations, wherein executing the application with the first annotations generates first profile data for the code region. The multi-pass profiler identifies, from the first profile data, the code region as a delinquent code region. The multi-pass profiler determines second annotations based, at least in part, on the first profile data and the at least one of the first annotations that defines the delinquent code region. The multi-pass profiler instruments, based on the first profile data, a code sub-region of the delinquent code region with the second annotations for generating profile data. The application is executed with second annotations, wherein executing the application with the second annotations generates second profile data for the code sub-region.
Type: Grant
Filed: July 6, 2010
Date of Patent: April 17, 2018
Assignee: International Business Machines Corporation
Inventors: Roch G. Archambault, Yaoqing Gao, Allan R. Martin, Mark P. Mendell, Raul E. Silvera, Graham Yiu
-
Publication number: 20180095736
Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
Type: Application
Filed: December 21, 2017
Publication date: April 5, 2018
Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
-
Publication number: 20180088917
Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
Type: Application
Filed: September 27, 2016
Publication date: March 29, 2018
Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
-
Patent number: 9798528
Abstract: A solution for cooperative data prefetching that enables software control of a memory-side data prefetch and/or a processor-side data prefetch is provided. In one embodiment, the invention provides a solution for generating an application, in which access to application data for the application is improved (e.g., optimized) in program code for the application. In particular, a push request, for performing a memory-side data prefetch of the application data, and a prefetch request, for performing a processor-side data prefetch, are added to the program code. The memory-side data prefetch results in the application data being copied from a first data store to a second data store that is faster than the first data store while the processor-side data prefetch results in the application data being copied from the second data store to a third data store that is faster than the second data store.
Type: Grant
Filed: September 13, 2006
Date of Patent: October 24, 2017
Assignee: International Business Machines Corporation
Inventors: Yaoqing Gao, Gheorghe C. Cascaval, Allan H. Kielstra, Robert B. Tremaine, Michael E. Wazlowski, Lixin Zhang
-
Patent number: 9727317
Abstract: A source code is pre-processed to form a pre-processed source code. The source code refers to an external code in a separate file, and the pre-processed source code creates a single file that includes the source code and the external code. The source code is profiled to create profile information identifying a hot portion having a first degree of hotness. A set of environment parameter values is determined to be applicable to a data processing system where the application will execute. At a remote optimizing compiler, a selection of a set of compiler options from a knowledgebase corresponding to the profile information and the set of environment parameter values is caused and an object code resulting from compiling the pre-processed source code using the set of compiler options is obtained. The object code is optimized according to the profile information and the set of environment parameter values.
Type: Grant
Filed: November 4, 2015
Date of Patent: August 8, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Yaoqing Gao, John R. MacMillan, Jeremiah S. Swan, Trong Truong, Kobimanalan Vinayagamoorthy
-
Publication number: 20170123773
Abstract: A source code is pre-processed to form a pre-processed source code. The source code refers to an external code in a separate file, and the pre-processed source code creates a single file that includes the source code and the external code. The source code is profiled to create profile information identifying a hot portion having a first degree of hotness. A set of environment parameter values is determined to be applicable to a data processing system where the application will execute. At a remote optimizing compiler, a selection of a set of compiler options from a knowledgebase corresponding to the profile information and the set of environment parameter values is caused and an object code resulting from compiling the pre-processed source code using the set of compiler options is obtained. The object code is optimized according to the profile information and the set of environment parameter values.
Type: Application
Filed: November 4, 2015
Publication date: May 4, 2017
Applicant: International Business Machines Corporation
Inventors: Yaoqing Gao, John R. MacMillan, Jeremiah S. Swan, Trong Truong, Kobimanalan Vinayagamoorthy
-
Patent number: 9632762
Abstract: A computer identifies one or more pairs of scalar statements and performs a cost analysis of operations of each of the one or more pairs of scalar statements to determine both a benefit and a cost of operations. The computer determines, based, at least in part, on the cost analysis, a gain for each of the one or more pairs of scalar statements. The computer creates based, at least in part, on the gain, a sorted list of each of the one or more pairs of scalar statements and selects a first pair from the sorted list. The computer issues a query to a hash table using a statement of the first pair and selects from results received from the query, a second pair. The computer then extends, based, at least in part, on the second pair, the first pair to create a pack.
Type: Grant
Filed: February 17, 2015
Date of Patent: April 25, 2017
Assignee: International Business Machines Corporation
Inventors: Ehsan Amiri, Christopher M. Barton, Yaoqing Gao, Denis M. Palmeiro, Raul E. Silvera
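Illustration (not taken from the patent): the pair-to-pack steps in the abstract above resemble superword-level parallelism packing and can be sketched in Python as follows. The function name build_pack, the statement labels, and the gain values are invented for the example; a Python dict stands in for the hash table, and the sketch extends the first pair in one direction only.

```python
def build_pack(pairs, gain):
    """Grow the most profitable statement pair into a longer pack.

    pairs: list of (left_stmt, right_stmt) tuples
    gain:  dict mapping each pair to benefit minus cost (hypothetical numbers)
    """
    # Keep only profitable pairs and sort them by gain, highest first.
    ranked = sorted((p for p in pairs if gain[p] > 0),
                    key=lambda p: gain[p], reverse=True)
    if not ranked:
        return []

    # Hash table keyed by a pair's left statement; because ranked is sorted,
    # setdefault keeps the highest-gain pair for each key.
    by_left = {}
    for pair in ranked:
        by_left.setdefault(pair[0], pair)

    # Start from the best pair and keep querying the table with the pack's
    # last statement to find a pair that extends it.
    pack = list(ranked[0])
    while pack[-1] in by_left:
        following = by_left[pack[-1]][1]
        if following in pack:  # guard against cycles
            break
        pack.append(following)
    return pack


pairs = [("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s5", "s6")]
gain = {("s1", "s2"): 4, ("s2", "s3"): 3, ("s3", "s4"): 2, ("s5", "s6"): -1}
print(build_pack(pairs, gain))
```

In this toy input the unprofitable pair ("s5", "s6") is dropped before sorting, so it never enters the table.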