Patents by Inventor Yazhou Zu
Yazhou Zu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12289196
Abstract: Generally disclosed herein is an approach for smart topology-aware link disabling and user job rescheduling strategies for online network repair of broken links in high performance networks used in supercomputers that are common in Machine Learning (ML) and High-Performance Computing (HPC) applications. While a disabled link is repaired online, user jobs may continue to run. The broken links may be detected as part of pre-flight checks before the user jobs run and/or during the job run time via a distributed failure detection and mitigation software stack which includes a centralized network controller and multiple agents running on each node. The network controller may ensure that the user jobs are rerouted to healthy links within the same network until the broken links are fixed and tested by the repair workflows, in which case the broken links are enabled again by the network controller for future user jobs.
Type: Grant
Filed: December 8, 2022
Date of Patent: April 29, 2025
Assignee: Google LLC
Inventors: Yazhou Zu, Alireza Ghaffarkhah, Dayou Du
-
Patent number: 12258588
Abstract: The present disclosure relates to fusion proteins including an iron-sulfur protein (e.g., cholesterol 7-desaturase (C7D)) and a ferredoxin reductase (e.g., KshB) that can efficiently convert cholesterol to 7-dehydrocholesterol (7-DHC) in vivo. Also disclosed herein are engineered microbes (e.g., E. coli) expressing the fusion proteins and methods of use thereof.
Type: Grant
Filed: October 18, 2024
Date of Patent: March 25, 2025
Assignee: Hangzhou Enhe Biotechnology Co., Ltd.
Inventors: Yazhou Zu, Zhenhua Pang, Igor Walter Bogorad
-
Publication number: 20250053810
Abstract: This disclosure generally provides solutions for improving the performance of a custom-built, packet-switched, TPU accelerator-side communication network. Specifically a set of solutions to improve the flow-control behavior by tuning the packet buffer queues in the on-chip router in the distributed training supercomputer network are described.
Type: Application
Filed: October 28, 2024
Publication date: February 13, 2025
Inventors: Xiangyu Dong, Kais Belgaied, Yazhou Zu
-
Publication number: 20250043252
Abstract: The present disclosure relates to fusion proteins including an iron-sulfur protein (e.g., cholesterol 7-desaturase (C7D)) and a ferredoxin reductase (e.g., KshB) that can efficiently convert cholesterol to 7-dehydrocholesterol (7-DHC) in vivo. Also disclosed herein are engineered microbes (e.g., E. coli) expressing the fusion proteins and methods of use thereof.
Type: Application
Filed: October 18, 2024
Publication date: February 6, 2025
Inventors: Yazhou Zu, Zhenhua Pang, Igor Walter Bogorad
-
Patent number: 12159225
Abstract: This disclosure generally provides solutions for improving the performance of a custom-built, packet-switched, TPU accelerator-side communication network. Specifically a set of solutions to improve the flow-control behavior by tuning the packet buffer queues in the on-chip router in the distributed training supercomputer network are described.
Type: Grant
Filed: December 29, 2020
Date of Patent: December 3, 2024
Assignee: Google LLC
Inventors: Xiangyu Dong, Kais Belgaied, Yazhou Zu
-
Publication number: 20240385873
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing preflight checks of a distributed computing system, are described. In one aspect, a method includes assigning a computing workload to a first subset of hardware accelerator machines each having one or more hardware accelerators. A preflight check on the first subset is performed before performing the computing workload to verify the functionality of each machine in the first subset. For each hardware accelerator machine of the first subset, a program code package is installed, including a task action based at least in part on characteristics of the computing workload. The task action including a sequence of operations is performed on the hardware accelerator machine to determine whether the task action fails. Whenever the task action fails, the computing workload is re-assigned to a second subset of hardware accelerator machines different from the first subset.
Type: Application
Filed: May 17, 2024
Publication date: November 21, 2024
Inventors: Jiafan Zhu, Jianqiao Liu, Xiangyu Dong, Xiao Zhang, Jikai Tang, Kexin Yang, Yong Zhao, Alireza Ghaffarkhah, Arash Rezaei, Dayou Du, Yazhou Zu, Xiangling Kong, Hoang-Vu Dang, Alexander Vadimovich Kolbasov
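Illustrative note: the preflight flow summarized in this abstract (run a task action on every machine in a subset, and re-assign the workload to a different subset if any action fails) can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the patented method: the names `run_preflight`, `assign_with_preflight`, and the boolean `task_action` callable are hypothetical, and the step of installing a program code package on each machine is omitted.

```python
from typing import Callable, List, Optional, Sequence

Machine = str                            # hypothetical machine identifier
TaskAction = Callable[[Machine], bool]   # hypothetical: returns True when the check passes


def run_preflight(machines: Sequence[Machine], task_action: TaskAction) -> List[Machine]:
    """Run the task action on each machine and return the machines that failed."""
    failed = []
    for machine in machines:
        try:
            ok = task_action(machine)
        except Exception:
            # Treat a crashing task action as a failed check.
            ok = False
        if not ok:
            failed.append(machine)
    return failed


def assign_with_preflight(
    candidate_subsets: Sequence[Sequence[Machine]],
    task_action: TaskAction,
) -> Optional[List[Machine]]:
    """Return the first subset whose machines all pass preflight, or None if none does."""
    for subset in candidate_subsets:
        if not run_preflight(subset, task_action):
            return list(subset)
        # Otherwise fall through: the workload is re-assigned to the next subset.
    return None
```

In this sketch the workload simply moves to the next candidate subset as a whole; a real scheduler could instead swap out only the failing machines.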
-
Patent number: 12020063
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing preflight checks of a distributed computing system, are described. In one aspect, a method includes assigning a computing workload to a first subset of hardware accelerator machines each having one or more hardware accelerators. A preflight check on the first subset is performed before performing the computing workload to verify the functionality of each machine in the first subset. For each hardware accelerator machine of the first subset, a program code package is installed, including a task action based at least in part on characteristics of the computing workload. The task action including a sequence of operations is performed on the hardware accelerator machine to determine whether the task action fails. Whenever the task action fails, the computing workload is re-assigned to a second subset of hardware accelerator machines different from the first subset.
Type: Grant
Filed: December 1, 2021
Date of Patent: June 25, 2024
Assignee: Google LLC
Inventors: Jiafan Zhu, Jianqiao Liu, Xiangyu Dong, Xiao Zhang, Jikai Tang, Kexin Yang, Yong Zhao, Alireza Ghaffarkhah, Arash Rezaei, Dayou Du, Yazhou Zu, Xiangling Kong, Hoang-Vu Dang, Alexander Vadimovich Kolbasov
-
Publication number: 20240195732
Abstract: Generally disclosed herein is an approach for optimizing routing strategy to tolerate faults in a toroidal network topology including, but not limited to, N-dimensional mesh, torus, and twisted torus. The approach may include balancing a load for a specified input traffic pattern operating offline or online. The approach may also include an optimization enhancement technique specifically applicable to symmetric, dynamically composable toroidal networks based on a set of centrally connected circuit switches.
Type: Application
Filed: December 8, 2022
Publication date: June 13, 2024
Inventors: Yazhou Zu, Brian Patrick Towles, Alireza Ghaffarkhah
-
Publication number: 20240195679
Abstract: Generally disclosed herein is an approach for smart topology-aware link disabling and user job rescheduling strategies for online network repair of broken links in high performance networks used in supercomputers that are common in Machine Learning (ML) and High-Performance Computing (HPC) applications. While a disabled link is repaired online, user jobs may continue to run. The broken links may be detected as part of pre-flight checks before the user jobs run and/or during the job run time via a distributed failure detection and mitigation software stack which includes a centralized network controller and multiple agents running on each node. The network controller may ensure that the user jobs are rerouted to healthy links within the same network until the broken links are fixed and tested by the repair workflows, in which case the broken links are enabled again by the network controller for future user jobs.
Type: Application
Filed: December 8, 2022
Publication date: June 13, 2024
Inventors: Yazhou Zu, Alireza Ghaffarkhah, Dayou Du
-
Publication number: 20230168919
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing preflight checks of a distributed computing system, are described. In one aspect, a method includes assigning a computing workload to a first subset of hardware accelerator machines each having one or more hardware accelerators. A preflight check on the first subset is performed before performing the computing workload to verify the functionality of each machine in the first subset. For each hardware accelerator machine of the first subset, a program code package is installed, including a task action based at least in part on characteristics of the computing workload. The task action including a sequence of operations is performed on the hardware accelerator machine to determine whether the task action fails. Whenever the task action fails, the computing workload is re-assigned to a second subset of hardware accelerator machines different from the first subset.
Type: Application
Filed: December 1, 2021
Publication date: June 1, 2023
Inventors: Jiafan Zhu, Jianqiao Liu, Xiangyu Dong, Xiao Zhang, Jikai Tang, Kexin Yang, Yong Zhao, Alireza Ghaffarkhah, Arash Rezaei, Dayou Du, Yazhou Zu, Xiangling Kong, Hoang-Vu Dang, Alexander Vadimovich Kolbasov
-
Patent number: 11630152
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Grant
Filed: March 4, 2021
Date of Patent: April 18, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu
-
Publication number: 20220114440
Abstract: This disclosure generally provides solutions for improving the performance of a custom-built, packet-switched, TPU accelerator-side communication network. Specifically a set of solutions to improve the flow-control behavior by tuning the packet buffer queues in the on-chip router in the distributed training supercomputer network are described.
Type: Application
Filed: December 29, 2020
Publication date: April 14, 2022
Inventors: Xiangyu Dong, Kais Belgaied, Yazhou Zu
-
Publication number: 20210270897
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Application
Filed: March 4, 2021
Publication date: September 2, 2021
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu
-
Patent number: 11002791
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Grant
Filed: May 14, 2020
Date of Patent: May 11, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu
-
Publication number: 20200300913
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Application
Filed: May 14, 2020
Publication date: September 24, 2020
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu
-
Patent number: 10690723
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Grant
Filed: April 30, 2019
Date of Patent: June 23, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu
-
Publication number: 20200158782
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Application
Filed: April 30, 2019
Publication date: May 21, 2020
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu
-
Patent number: 10649514
Abstract: A method and apparatus for managing processing power determine a supply voltage to supply to a processing unit, such as a central processing unit (CPU) or graphics processing unit (GPU), based on temperature inversion based voltage, frequency, temperature (VFT) data. The temperature inversion based VFT data includes supply voltages and corresponding operating temperatures that cause the processing unit's transistors to operate in a temperature inversion region. In one example, the temperature inversion based VFT data includes lower supply voltages and corresponding higher temperatures in a temperature inversion region of a processing unit. The temperature inversion based VFT data is based on an operating frequency of the processing unit. The apparatus and method adjust a supply voltage to the processing unit based on the temperature inversion based VFT data.
Type: Grant
Filed: September 23, 2016
Date of Patent: May 12, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Wei Huang, Yazhou Zu, Indrani Paul
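Illustrative note: the idea of choosing a supply voltage from VFT data (a lower voltage at a higher temperature, for a given operating frequency) can be pictured as a table lookup. The sketch below is only an assumption-laden illustration, not the patented apparatus: the table `VFT_TABLE`, its numeric values, and the function `supply_voltage` are all hypothetical.

```python
import bisect

# Hypothetical VFT data: operating frequency (MHz) ->
# rows of (minimum temperature in deg C, supply voltage in V), sorted by temperature.
# The voltages fall as temperature rises, mimicking a temperature inversion region.
VFT_TABLE = {
    1000: [(0, 0.90), (40, 0.88), (70, 0.86)],
    1500: [(0, 1.00), (40, 0.98), (70, 0.96)],
}


def supply_voltage(freq_mhz: int, temp_c: float) -> float:
    """Pick the voltage of the highest temperature row not exceeding temp_c."""
    rows = VFT_TABLE[freq_mhz]
    thresholds = [t for t, _ in rows]
    idx = bisect.bisect_right(thresholds, temp_c) - 1
    idx = max(idx, 0)  # clamp readings below the first row to the first row
    return rows[idx][1]
```

For example, under these made-up values `supply_voltage(1000, 85)` selects the 70 °C row, the lowest voltage listed for that frequency.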
-
Patent number: 10365327
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Grant
Filed: October 18, 2017
Date of Patent: July 30, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu
-
Publication number: 20190113572
Abstract: Techniques facilitating determination and correction of physical circuit event related errors of a hardware design are provided. A system can comprise a memory that stores computer executable components and a processor that executes computer executable components stored in the memory. The computer executable components can comprise a simulation component that injects a fault into a latch and a combination of logic of an emulated hardware design. The fault can be a biased fault injection that can mimic an error caused by a physical circuit event error vulnerability. The computer executable components can also comprise an observation component that determines one or more paths of the emulated hardware design that are vulnerable to physical circuit event related errors based on the biased fault injection.
Type: Application
Filed: October 18, 2017
Publication date: April 18, 2019
Inventors: Pradip Bose, Alper Buyuktosunoglu, Schuyler Eldridge, Karthik V. Swaminathan, Yazhou Zu