Patents by Inventor Gongyu Wang
Gongyu Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11899967
Abstract: Aspects of the present disclosure provide an aligned storage strategy for stripes within a long vector for a vector processor, such that the extra computation needed to track strides between input stripes and output stripes may be eliminated. As a result, the stripe locations are located in a more predictable memory access pattern such that memory access bandwidth may be improved and the tendency for memory error may be reduced.
Type: Grant
Filed: November 15, 2021
Date of Patent: February 13, 2024
Assignee: Lightmatter, Inc.
Inventors: Nicholas Moore, Gongyu Wang, Bradley Dobbie, Tyler J. Kenney, Ayon Basumallik
-
Patent number: 11860800
Abstract: A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. Compute kernels can be parsed into directed graphs and mapped to particular node or tile resources for execution. In an example, a branch-and-bound search algorithm can be used to perform the mapping. The algorithm can use a cost function to evaluate the resources based on capability, occupancy, or power consumption of the various node or tile resources.
Type: Grant
Filed: August 20, 2021
Date of Patent: January 2, 2024
Assignee: Micron Technology, Inc.
Inventors: Gongyu Wang, Jason Eckhardt
-
Patent number: 11829758
Abstract: Disclosed in some examples, are systems, methods, devices, and machine readable mediums which use improved dynamic programming algorithms to pack conditional branch instructions. Conditional code branches may be modeled as directed acyclic graphs (DAGs) which have a topological ordering. These DAGs may be used to construct a dynamic programming table to find a partial mapping of one path onto the other path using dynamic programming algorithms.
Type: Grant
Filed: March 13, 2023
Date of Patent: November 28, 2023
Assignee: Micron Technology, Inc.
Inventors: Skyler Arron Windh, Gongyu Wang
-
Publication number: 20230333997
Abstract: A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. Compute kernels can be parsed into directed graphs and mapped to particular node or tile resources for execution. In an example, a branch-and-bound search algorithm can be used to perform the mapping. The algorithm can use a cost function to evaluate the resources based on capability, occupancy, or power consumption of the various node or tile resources.
Type: Application
Filed: June 19, 2023
Publication date: October 19, 2023
Inventors: Gongyu Wang, Jason Eckhardt
-
Publication number: 20230214219
Abstract: Disclosed in some examples, are systems, methods, devices, and machine readable mediums which use improved dynamic programming algorithms to pack conditional branch instructions. Conditional code branches may be modeled as directed acyclic graphs (DAGs) which have a topological ordering. These DAGs may be used to construct a dynamic programming table to find a partial mapping of one path onto the other path using dynamic programming algorithms.
Type: Application
Filed: March 13, 2023
Publication date: July 6, 2023
Inventors: Skyler Arron Windh, Gongyu Wang
-
Patent number: 11675588
Abstract: A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. A first tile in a first node can include a processor with a processor output and a first register network configured to receive information from the processor output and information from one or more of the multiple other tiles in the first node. In response to an output instruction and a delay instruction, the register network can provide an output signal to one of the multiple other tiles in the first node. Based on the output instruction, the output signal can include one or the other of the information from the processor output and the information from one or more of the multiple other tiles in the first node. A timing characteristic of the output signal can depend on the delay instruction.
Type: Grant
Filed: August 20, 2021
Date of Patent: June 13, 2023
Assignee: Micron Technology, Inc.
Inventors: Douglas Vanesko, Tony M. Brewer, Gongyu Wang
-
Patent number: 11604650
Abstract: Disclosed in some examples, are systems, methods, devices, and machine readable mediums which use improved dynamic programming algorithms to pack conditional branch instructions. Conditional code branches may be modeled as directed acyclic graphs (DAGs) which have a topological ordering. These DAGs may be used to construct a dynamic programming table to find a partial mapping of one path onto the other path using dynamic programming algorithms.
Type: Grant
Filed: August 11, 2021
Date of Patent: March 14, 2023
Assignee: Micron Technology, Inc.
Inventors: Skyler Arron Windh, Gongyu Wang
-
Publication number: 20230067771
Abstract: A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. A first tile in a first node can include a processor with a processor output and a first register network configured to receive information from the processor output and information from one or more of the multiple other tiles in the first node. In response to an output instruction and a delay instruction, the register network can provide an output signal to one of the multiple other tiles in the first node. Based on the output instruction, the output signal can include one or the other of the information from the processor output and the information from one or more of the multiple other tiles in the first node. A timing characteristic of the output signal can depend on the delay instruction.
Type: Application
Filed: August 20, 2021
Publication date: March 2, 2023
Inventors: Douglas Vanesko, Tony M. Brewer, Gongyu Wang
-
Publication number: 20230059948
Abstract: A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. Compute kernels can be parsed into directed graphs and mapped to particular node or tile resources for execution. In an example, a branch-and-bound search algorithm can be used to perform the mapping. The algorithm can use a cost function to evaluate the resources based on capability, occupancy, or power consumption of the various node or tile resources.
Type: Application
Filed: August 20, 2021
Publication date: February 23, 2023
Inventors: Gongyu Wang, Jason Eckhardt
-
Publication number: 20230052450
Abstract: Disclosed in some examples, are systems, methods, devices, and machine readable mediums which use improved dynamic programming algorithms to pack conditional branch instructions. Conditional code branches may be modeled as directed acyclic graphs (DAGs) which have a topological ordering. These DAGs may be used to construct a dynamic programming table to find a partial mapping of one path onto the other path using dynamic programming algorithms.
Type: Application
Filed: August 11, 2021
Publication date: February 16, 2023
Inventors: Skyler Arron Windh, Gongyu Wang
-
Publication number: 20220155996
Abstract: Aspects of the present disclosure provide an aligned storage strategy for stripes within a long vector for a vector processor, such that the extra computation needed to track strides between input stripes and output stripes may be eliminated. As a result, the stripe locations are located in a more predictable memory access pattern such that memory access bandwidth may be improved and the tendency for memory error may be reduced.
Type: Application
Filed: November 15, 2021
Publication date: May 19, 2022
Applicant: Lightmatter, Inc.
Inventors: Nicholas Moore, Gongyu Wang, Bradley Dobbie, Tyler J. Kenney, Ayon Basumallik
-
Publication number: 20220156469
Abstract: Parallelization and pipelining techniques that can be applied to multi-core analog accelerators are described. The techniques described herein improve performance of matrix multiplication (e.g., tensor-tensor multiplication, matrix-matrix multiplication or matrix-vector multiplication). The parallelization and pipelining techniques developed by the inventors and described herein focus on maintaining a high utilization of the processing cores. A representative processing system includes an analog accelerator, a digital processor, and a controller. The controller is configured to control the analog accelerator to output data using linear operations and to control the digital processor to perform non-linear operations based on the output data.
Type: Application
Filed: November 15, 2021
Publication date: May 19, 2022
Applicant: Lightmatter, Inc.
Inventors: Gongyu Wang, Cansu Demirkiran, Nicholas Moore, Ayon Basumallik, Darius Bunandar
-
Publication number: 20220147280
Abstract: Aspects of the present disclosure are directed to an efficient data transfer strategy in which data transfer is scheduled based on a prediction of the internal memory utilization due to computational workload throughout its runtime. According to one aspect, the DMA transfer may be performed opportunistically: whenever internal buffer memory is available and the additional internal memory usage due to DMA transfer isn't interfering with the processor's ability to complete the workload. In some embodiments, an opportunistic transfer schedule may be found by solving an optimization problem.
Type: Application
Filed: November 9, 2021
Publication date: May 12, 2022
Applicant: Lightmatter, Inc.
Inventors: Darius Bunandar, Cansu Demirkiran, Gongyu Wang, Nicholas Moore, Ayon Basumallik