Patents by Inventor Yaoqing Gao

Yaoqing Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11573777
    Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: February 7, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Reza Azimi, Cheng Xiang Feng, Kai-Ting Amy Wang, Yaoqing Gao, Ye Tian, Xiang Wang
  • Patent number: 11429359
    Abstract: A method for improving the performance of applications executed within asynchronous processor architectures. In an embodiment, a method for improving execution time of compiled synchronized source code on an asynchronous processor architecture includes receiving, by a processing system, synchronized source code comprising synchronization instructions to synchronize execution of the synchronized source code on different pipelines of the asynchronous processor architecture. The method also includes analyzing, by the processing system, the synchronized source code to determine whether the synchronized source code includes a broken code condition.
    Type: Grant
    Filed: July 20, 2020
    Date of Patent: August 30, 2022
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Ahmed Mohammed ElShafiey Mohammed Eltantawy, Yaoqing Gao, Christopher Rodrigues, Lijuan Hai
  • Patent number: 11221834
    Abstract: Systems and methods for auto-tuning and compiling source code are provided. A first executable file is generated by compiling the source code in accordance with a first optimization scheme. Compiling reports, performance reports, and bottleneck information are generated for the first executable file. A second optimization scheme is generated, and a second executable file is generated by compiling the source code in accordance with the second optimization scheme. An optimized executable file is output based on the first executable file and the second executable file.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: January 11, 2022
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Yaoqing Gao, Xuan Zhong, Peng Wu, Long Chen
  • Patent number: 11188314
    Abstract: Systems and methods for auto-tuning and compiling source code are provided. A first executable file is generated by compiling the source code in accordance with a first optimization scheme. Compiling reports, performance reports, and bottleneck information are generated for the first executable file. A second optimization scheme is generated, and a second executable file is generated by compiling the source code in accordance with the second optimization scheme. An optimized executable file is output based on the first executable file and the second executable file.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: November 30, 2021
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Yaoqing Gao, Xuan Zhong, Peng Wu, Long Chen
  • Patent number: 11144290
    Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
    Type: Grant
    Filed: September 13, 2019
    Date of Patent: October 12, 2021
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Reza Azimi, Cheng Xiang Feng, Kai-Ting Amy Wang, Yaoqing Gao, Ye Tian, Xiang Wang
  • Publication number: 20210182041
    Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
    Type: Application
    Filed: February 26, 2021
    Publication date: June 17, 2021
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Reza AZIMI, Cheng Xiang FENG, Kai-Ting Amy WANG, Yaoqing GAO, Ye TIAN, Xiang WANG
  • Publication number: 20210081184
    Abstract: A method includes analyzing a dataflow graph representing data dependencies between operators of a dataflow application to identify a plurality of candidate groups of the operators. Based on characteristics of a given hardware accelerator and the operators of a given candidate group of the plurality of candidate groups, determining whether the operators of the given candidate group are to be combined. In response to determining that the operators of the given candidate group are to be combined, retrieving executable binary code segments corresponding to the operators of the given candidate group, generating a unit of binary code including the executable binary code segments and metadata representing an execution control flow among the executable binary code segments, and dispatching the unit of code to the given hardware accelerator for execution of the unit of code.
    Type: Application
    Filed: September 13, 2019
    Publication date: March 18, 2021
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Reza AZIMI, Kai-Ting Amy WANG, Yaoqing GAO, Ye TIAN, Xiang WANG, Cheng Xiang FENG
  • Publication number: 20210004213
    Abstract: A method for improving the performance of applications executed within asynchronous processor architectures. In an embodiment, a method for improving execution time of compiled synchronized source code on an asynchronous processor architecture includes receiving, by a processing system, synchronized source code comprising synchronization instructions to synchronize execution of the synchronized source code on different pipelines of the asynchronous processor architecture. The method also includes analyzing, by the processing system, the synchronized source code to determine whether the synchronized source code includes a broken code condition.
    Type: Application
    Filed: July 20, 2020
    Publication date: January 7, 2021
    Inventors: Ahmed Mohammed ElShafiey Mohammed Eltantawy, Yaoqing Gao, Christopher Rodrigues, Lijuan Hai
  • Publication number: 20200293295
    Abstract: Systems and methods for auto-tuning and compiling source code are provided. A first executable file is generated by compiling the source code in accordance with a first optimization scheme. Compiling reports, performance reports, and bottleneck information are generated for the first executable file. A second optimization scheme is generated, and a second executable file is generated by compiling the source code in accordance with the second optimization scheme. An optimized executable file is output based on the first executable file and the second executable file.
    Type: Application
    Filed: March 23, 2020
    Publication date: September 17, 2020
    Inventors: Yaoqing Gao, Xuan Zhong, Peng Wu, Long Chen
  • Publication number: 20200233882
    Abstract: In some examples, a controller comprises a bucketization logic to receive a bucketization indication from a host processor, and in response to the bucketization indication, partition data stored in a memory of a storage device into buckets, wherein a first bucket of the buckets comprises data items that share a first common characteristic. The bucketization logic is to send data items of the first bucket to the host processor for processing by the host processor using a first code module configured for the first common characteristic of the first bucket.
    Type: Application
    Filed: January 18, 2019
    Publication date: July 23, 2020
    Inventors: Martin Ichilevici de Oliveira, Man Pok Ho, Jose Nelson Amaral, Kai-Ting Amy Wang, Yaoqing Gao, Bryan Chan
  • Patent number: 10216496
    Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
    Type: Grant
    Filed: September 27, 2016
    Date of Patent: February 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
  • Patent number: 10095491
    Abstract: Embodiments of the present invention provide a method, system and computer program product for the data splitting of recursive data structures. In one embodiment of the invention, a method for data splitting recursive data structures can be provided. The method can include identifying data objects of a recursive data structure type, such as a linked list, within source code, the recursive data structure type defining multiple different data fields. The method further can include grouping the data objects into some memory pool units, each of which can contain the same number of data objects. Each memory pool unit can be seen as an array of data objects. The method can include data splitting, which could be maximal array splitting in each different memory pool unit. Finally, the method can include three different approaches, including field padding, field padding and field splitting, to handle irregular field sizes in the data structure.
    Type: Grant
    Filed: July 28, 2015
    Date of Patent: October 9, 2018
    Assignee: International Business Machines Corporation
    Inventors: Roch G. Archambault, Shimin Cui, Stephen Curial, Yaoqing Gao, Raul E. Silvera, Peng Zhao
  • Patent number: 10061568
    Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: August 28, 2018
    Assignee: International Business Machines Corporation
    Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
  • Patent number: 9946523
    Abstract: A code region of an application is instrumented by a multi-pass profiler with first annotations for generating profile data. The application is executed with the first annotations, wherein executing the application with the first annotations generates first profile data for the code region. The multi-pass profiler identifies, from the first profile data, the code region as a delinquent code region. The multi-pass profiler determines second annotations based, at least in part, on the first profile data and the at least one of the first annotations that defines the delinquent code region. The multi-pass profiler instruments, based on the first profile data, a code sub-region of the delinquent code region with the second annotations for generating profile data. The application is executed with second annotations, wherein executing the application with the second annotations generates second profile data for the code sub-region.
    Type: Grant
    Filed: July 6, 2010
    Date of Patent: April 17, 2018
    Assignee: International Business Machines Corporation
    Inventors: Roch G. Archambault, Yaoqing Gao, Allan R. Martin, Mark P. Mendell, Raul E. Silvera, Graham Yiu
  • Publication number: 20180095736
    Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
    Type: Application
    Filed: December 21, 2017
    Publication date: April 5, 2018
    Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
  • Publication number: 20180088917
    Abstract: An approach to dynamic run-time alias checking comprising creating a main thread and a helper thread, computing an optimized first region of code in a rollback-only transactional memory associated with the main thread checking for one or more alias dependencies in an un-optimized first region of code, responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.
    Type: Application
    Filed: September 27, 2016
    Publication date: March 29, 2018
    Inventors: Yaoqing Gao, William G. O'Farrell, Denis Palmeiro
  • Patent number: 9798528
    Abstract: A solution for cooperative data prefetching that enables software control of a memory-side data prefetch and/or a processor-side data prefetch is provided. In one embodiment, the invention provides a solution for generating an application, in which access to application data for the application is improved (e.g., optimized) in program code for the application. In particular, a push request, for performing a memory-side data prefetch of the application data, and a prefetch request, for performing a processor-side data prefetch, are added to the program code. The memory-side data prefetch results in the application data being copied from a first data store to a second data store that is faster than the first data store while the processor-side data prefetch results in the application data being copied from the second data store to a third data store that is faster than the second data store.
    Type: Grant
    Filed: September 13, 2006
    Date of Patent: October 24, 2017
    Assignee: International Business Machines Corporation
    Inventors: Yaoqing Gao, Gheorghe C. Cascaval, Allan H. Kielstra, Robert B. Tremaine, Michael E. Wazlowski, Lixin Zhang
  • Patent number: 9727317
    Abstract: A source code is pre-processed to form a pre-processed source code. The source code refers to an external code in a separate file, and the pre-processed source code creates a single file that includes the source code and the external code. The source code is profiled to create profile information identifying a hot portion having a first degree of hotness. A set of environment parameter values is determined to be applicable to a data processing system where the application will execute. At a remote optimizing compiler, a selection of a set of compiler options from a knowledgebase corresponding to the profile information and the set of environment parameter values is caused and an object code resulting from compiling the pre-processed source code using the set of compiler options is obtained. The object code is optimized according to the profile information and the set of environment parameter values.
    Type: Grant
    Filed: November 4, 2015
    Date of Patent: August 8, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yaoqing Gao, John R. MacMillan, Jeremiah S. Swan, Trong Truong, Kobimanalan Vinayagamoorthy
  • Publication number: 20170123773
    Abstract: A source code is pre-processed to form a pre-processed source code. The source code refers to an external code in a separate file, and the pre-processed source code creates a single file that includes the source code and the external code. The source code is profiled to create profile information identifying a hot portion having a first degree of hotness. A set of environment parameter values is determined to be applicable to a data processing system where the application will execute. At a remote optimizing compiler, a selection of a set of compiler options from a knowledgebase corresponding to the profile information and the set of environment parameter values is caused and an object code resulting from compiling the pre-processed source code using the set of compiler options is obtained. The object code is optimized according to the profile information and the set of environment parameter values.
    Type: Application
    Filed: November 4, 2015
    Publication date: May 4, 2017
    Applicant: International Business Machines Corporation
    Inventors: Yaoqing Gao, John R. MacMillan, Jeremiah S. Swan, Trong Truong, Kobimanalan Vinayagamoorthy
  • Patent number: 9632762
    Abstract: A computer identifies one or more pairs of scalar statements and performs a cost analysis of operations of each of the one or more pairs of scalar statements to determine both a benefit and a cost of operations. The computer determines, based, at least in part, on the cost analysis, a gain for each of the one or more pairs of scalar statements. The computer creates based, at least in part, on the gain, a sorted list of each of the one or more pairs of scalar statements and selects a first pair from the sorted list. The computer issues a query to a hash table using a statement of the first pair and selects from results received from the query, a second pair. The computer then extends, based, at least in part, on the second pair, the first pair to create a pack.
    Type: Grant
    Filed: February 17, 2015
    Date of Patent: April 25, 2017
    Assignee: International Business Machines Corporation
    Inventors: Ehsan Amiri, Christopher M. Barton, Yaoqing Gao, Denis M. Palmeiro, Raul E. Silvera
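
The abstract of patent 9632762 above describes ranking pairs of scalar statements by gain and extending the best pair into a pack through a hash-table lookup. The following is a minimal sketch of that general idea in Python, not the patented implementation. The `Pair` class, its `benefit` and `cost` fields, the `build_pack` function, and the choice to keep extending until no profitable pair remains are all assumptions made for illustration; the abstract itself describes only a single extension step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """A candidate pair of scalar statements (hypothetical representation)."""
    left: str      # name of the first scalar statement
    right: str     # name of the second scalar statement
    benefit: int   # assumed estimate of the savings from packing the pair
    cost: int      # assumed estimate of the packing/unpacking overhead

    @property
    def gain(self) -> int:
        # Gain is modeled here as benefit minus cost (an assumption).
        return self.benefit - self.cost


def build_pack(pairs):
    """Seed a pack with the highest-gain pair, then extend it via a hash table."""
    # Sorted list of pairs, highest gain first (ties keep input order).
    ranked = sorted(pairs, key=lambda p: p.gain, reverse=True)
    if not ranked:
        return []

    # Hash table keyed by a pair's left statement, so a statement can be
    # used to look up the pairs that could continue a pack.
    by_left = {}
    for p in ranked:
        by_left.setdefault(p.left, []).append(p)

    first = ranked[0]
    pack = [first.left, first.right]

    # Query the table with the pack's last statement and extend the pack
    # with the best profitable match. Repeating this step is an assumption.
    while True:
        matches = [p for p in by_left.get(pack[-1], [])
                   if p.gain > 0 and p.right not in pack]
        if not matches:
            break
        pack.append(matches[0].right)
    return pack
```

Calling `build_pack` with pairs that chain through shared statements (for example `s1`/`s2`, `s2`/`s3`) yields one ordered pack, while pairs with non-positive gain are never used to extend it.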