Patents by Inventor Yun Du
Yun Du has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250103550
Abstract: A system includes an array of reconfigurable units further including a plurality of configurable elements such as pattern memory units (PMUs), pattern compute units (PCUs), and communication agents. The system further includes a configuration module to provide configuration data to configure the PMUs and PCUs. The system further includes a compiler configured to generate a pipeline of a plurality of PCUs related to a dataflow graph, interleaved between a plurality of PMUs. Each PCU is coupled to perform calculations based on data received from a preceding PMU and store results of the calculations into a following PMU of the plurality of PMUs after a latency. The compiler is further configured to remove a PMU from the pipeline based on a comparison of the latencies of the PCUs. A corresponding method is also disclosed herein.
Type: Application
Filed: December 9, 2024
Publication date: March 27, 2025
Applicant: SambaNova Systems, Inc.
Inventors: Yun DU, Jianding LUO
-
Patent number: 12229215
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
Type: Grant
Filed: October 16, 2023
Date of Patent: February 18, 2025
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Gang Zhong, Fei Wei, Yibin Zhang, Jing Han, Hongjiang Shang, Elina Kamenetskaya, Minjie Huang, Alexei Vladimirovich Bourd, Chun Yu, Andrew Evan Gruber, Eric Demers
-
Patent number: 12229864
Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for runtime optimization of the shader execution flow. A graphics processor may obtain instruction execution data associated with a graphics workload, the instruction execution data including graphics data for a set of shader operations. The graphics processor may configure, at a first iteration, at least one predication value based on the instruction execution data including the graphics data for the set of shader operations. The graphics processor may adjust, at a second iteration, an execution flow of the graphics workload based on the configured at least one predication value, the execution flow of the graphics workload including the set of shader operations. The graphics processor may execute or refrain from executing, at the second iteration, each of the set of shader operations based on the adjusted execution flow of the graphics workload.
Type: Grant
Filed: August 5, 2022
Date of Patent: February 18, 2025
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Eric Demers, Andrew Evan Gruber, Chun Yu, Baoguang Yang, Chihong Zhang, Yuehai Du, Avinash Seetharamaiah, Jonnala Gadda Nagendra Kumar, Gang Zhong, Zilin Ying, Fei Wei
-
Patent number: 12189570
Abstract: A data processing system includes an array of reconfigurable units and a compiler configured to generate a pipeline of n computational nodes related to a dataflow graph, interleaved between n+1 buffers on the array of reconfigurable units. Each computational node is coupled to perform calculations based on data received from an immediately preceding buffer of the n+1 buffers and store results of the calculations into an immediately following buffer of the n+1 buffers after a latency. The compiler is further configured to remove a buffer of the n+1 buffers from the pipeline based on a comparison of the latencies of the computational nodes. A corresponding method is also disclosed herein.
Type: Grant
Filed: May 19, 2023
Date of Patent: January 7, 2025
Assignee: SambaNova Systems, Inc.
Inventors: Yun Du, Jianding Luo
-
Patent number: 12130452
Abstract: A grating adjustment apparatus includes a first electrode layer, a second electrode layer and a first substrate and a second substrate that are opposite to each other; the grating adjustment apparatus further includes a plurality of first driving lines, a plurality of second driving lines and a plurality of grating units arranged in the first direction, and is configured as: when the grating adjustment apparatus is powered on, the grating unit is capable of forming a light transmission unit and a shading unit, and opening positions and/or opening ratios of the grating unit are adjustable; and the plurality of grating units are divided into at least one group; for the grating units in the same group, at least two of the first sub-electrodes are electrically connected to different first driving lines, and at least two of the second sub-electrodes are electrically connected to different second driving lines.
Type: Grant
Filed: January 3, 2023
Date of Patent: October 29, 2024
Assignees: HEFEI BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., BOE TECHNOLOGY GROUP CO., LTD.
Inventors: Zhao Dong, Ru Zhou, Xiaoqing Peng, Yun Du, Hu Li, Donghui Wang, Ran An, Douqing Zhang
-
Publication number: 20240346396
Abstract: The present invention comprises learning and development content and programs operations management systems and methods. The present system offers a comprehensive data and requests management system to optimize the handling of learning requests across single or multiple decentralized learning and development teams or groups, integrate and improve the planning, management and assessment of learning and development projects, and synergistically improve and integrate the creation, implementation, practice, and refinement of content design and content design processes of learning and development team professionals and users.
Type: Application
Filed: December 9, 2023
Publication date: October 17, 2024
Inventors: Ryan Austin, Matthew Ryan Ball, Jason Primeau, Darren Card, Yun Du, Pratik Bidkar, Erick Alejandro Montanez Soda, Alejandro Ariztegui Abimerhi, Shruti Bhagwat
-
Patent number: 12067666
Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.
Type: Grant
Filed: May 18, 2022
Date of Patent: August 20, 2024
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Eric Demers, Andrew Evan Gruber, Chun Yu, Chihong Zhang, Baoguang Yang, Yuehai Du, Gang Zhong, Avinash Seetharamaiah, Jonnala Gadda Nagendra Kumar
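Illustration (not part of the patent record): the selection step described in this abstract can be sketched in a few lines of Python. Everything named here (ShaderVariant, select_shader, the two variant constants) is hypothetical, and the direction of the choice, picking the constant-memory variant when the run-time parameter fits, is an assumption that the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShaderVariant:
    """Hypothetical stand-in for a compiled shader program."""
    name: str
    storage: str  # "system_memory" or "constant_memory"


# Hypothetical variants corresponding to the "first" and "second" shader programs.
SYSTEM_VARIANT = ShaderVariant("first_shader", "system_memory")
CONSTANT_VARIANT = ShaderVariant("second_shader", "constant_memory")


def select_shader(runtime_param_size: int, constant_memory_size: int) -> ShaderVariant:
    """Pick a variant by comparing the run-time parameter with the constant memory size.

    Assumption: a parameter that is less than or equal to the constant memory
    size selects the constant-memory variant; anything larger falls back to
    the system-memory variant.
    """
    if runtime_param_size <= constant_memory_size:
        return CONSTANT_VARIANT
    return SYSTEM_VARIANT


if __name__ == "__main__":
    print(select_shader(256, 1024).storage)
    print(select_shader(4096, 1024).storage)
```

The comparison is inclusive because the abstract says "less than or equal to"; the sizes themselves are placeholder numbers.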
-
Patent number: 12056790
Abstract: The present disclosure relates to methods and apparatus for graphics processing. For example, disclosed techniques facilitate improving bindless state processing at a graphics processor. Aspects of the present disclosure can receive, at a graphics processor, a shader program including a preamble section and a main instructions section. Aspects of the present disclosure can also execute, with a scalar processor dedicated to processing preamble sections, instructions of the preamble section to implement a bindless mechanism for loading constant data associated with the shader program. Additionally, aspects of the present disclosure can distribute the main instructions section and the constant data to a streaming processor for executing the shader program.
Type: Grant
Filed: January 31, 2020
Date of Patent: August 6, 2024
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Andrew Evan Gruber, Chun Yu, Chihong Zhang, Thomas Edwin Frisinger, Richard Hammerstone, Zilin Ying, Heng Qi, Quanquan Xu, Sheng Gu
-
Patent number: 12056804
Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for fast incremental shared constants. In aspects, a CPU may determine/update shared constant data for a first draw call of a plurality of draw calls. The shared constant data, which may correspond to at least one shader, may be updated based on a draw call update for the first draw call. The CPU may communicate the updated shared constant data for the first draw call to a GPU. The GPU may receive, in at least one register, the updated shared constant data from the CPU and configure the at least one register based on the updated shared constant data corresponding to the draw call update of the first draw call of the plurality of draw calls.
Type: Grant
Filed: May 15, 2023
Date of Patent: August 6, 2024
Assignee: QUALCOMM Incorporated
Inventors: Thomas Edwin Frisinger, Richard Hammerstone, Andrew Evan Gruber, Gang Zhong, Yun Du, Jonnala Gadda Nagendra Kumar
-
Patent number: 12014006
Abstract: A touch display substrate is provided, including a central touch area and a routing area located around the central touch area, where the routing area is provided with isolation lines and a plurality of touch signal lines led out from the central touch area, the extension direction of the isolation lines is parallel to the extension direction of the touch signal lines, the touch signal lines include first touch signal lines arranged close to the isolation lines and second touch signal lines arranged far from the isolation lines, and the width of the first touch signal lines is greater than the width of the second touch signal lines. A touch display device and a touch control signal line distribution method are provided.
Type: Grant
Filed: April 28, 2021
Date of Patent: June 18, 2024
Assignees: Hefei Xinsheng Optoelectronics Technology Co., Ltd., BOE Technology Group Co., Ltd.
Inventors: Jiawei Xu, Yun Du, Zhao Dong, Wenjin Fan
-
Publication number: 20240168915
Abstract: A method for reducing latency and increasing throughput in a reconfigurable computing system includes receiving a compute graph for execution on a reconfigurable dataflow processor comprising a grid of compute units and grid of memory units interconnected with a switching array. The compute graph includes a node specifying an operation on a tensor. The node may be split into multiple nodes that each specify the operation on a distinctive portion of the tensor to produce a first modified compute graph. The first modified compute graph may be executed. In addition, the multiple nodes may be within a single meta-pipeline stage and may be processed in parallel. Furthermore, the compute graph may further comprise a separate node for gathering the distinctive portions of the tensor into a complete tensor, to produce a second modified compute graph.
Type: Application
Filed: May 25, 2023
Publication date: May 23, 2024
Applicant: SambaNova Systems, Inc.
Inventors: Yun DU, Gao DENG, Jianding LUO, Zhengyu CHEN
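Illustration (not part of the patent record): the split-then-gather idea in this abstract can be shown with a plain Python list standing in for a tensor. The helper names split_tensor and run_split_node are hypothetical, the operation is assumed to be elementwise so that each portion can be processed independently, and nothing here models the compute graph or the reconfigurable hardware itself.

```python
from typing import Callable, List, Sequence


def split_tensor(tensor: Sequence[float], parts: int) -> List[List[float]]:
    """Split a flat tensor into `parts` contiguous, non-overlapping portions."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(len(tensor), parts)
    chunks: List[List[float]] = []
    start = 0
    for i in range(parts):
        # The first `extra` portions take one additional element each.
        size = base + (1 if i < extra else 0)
        chunks.append(list(tensor[start:start + size]))
        start += size
    return chunks


def run_split_node(op: Callable[[float], float],
                   tensor: Sequence[float],
                   parts: int) -> List[float]:
    """Apply an elementwise op per portion, then gather the partial results."""
    chunks = split_tensor(tensor, parts)
    # Each inner list plays the role of one split node; the portions are
    # independent, so they could in principle be processed in parallel.
    partial = [[op(x) for x in chunk] for chunk in chunks]
    # Gather step: concatenate the portions back into a complete tensor.
    return [y for chunk in partial for y in chunk]


if __name__ == "__main__":
    print(run_split_node(lambda x: x * 2, [1, 2, 3, 4, 5], parts=2))
```

For an elementwise operation the gathered result matches what the unsplit node would produce; operations that mix elements across portions would need a different gather step.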
-
Publication number: 20240160034
Abstract: A grating adjustment apparatus includes a first electrode layer, a second electrode layer and a first substrate and a second substrate that are opposite to each other; the grating adjustment apparatus further includes a plurality of first driving lines, a plurality of second driving lines and a plurality of grating units arranged in the first direction, and is configured as: when the grating adjustment apparatus is powered on, the grating unit is capable of forming a light transmission unit and a shading unit, and opening positions and/or opening ratios of the grating unit are adjustable; and the plurality of grating units are divided into at least one group; for the grating units in the same group, at least two of the first sub-electrodes are electrically connected to different first driving lines, and at least two of the second sub-electrodes are electrically connected to different second driving lines.
Type: Application
Filed: January 3, 2023
Publication date: May 16, 2024
Applicants: HEFEI BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., BOE Technology Group Co., Ltd.
Inventors: Zhao Dong, Ru Zhou, Xiaoqing Peng, Yun Du, Hu Li, Donghui Wang, Ran An, Douqing Zhang
-
Patent number: 11954758
Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads. The selection may be based on one of a plurality of granularities. The graphics processor may execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.
Type: Grant
Filed: February 24, 2022
Date of Patent: April 9, 2024
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Andrew Evan Gruber, Zilin Ying, Chunling Hu, Baoguang Yang, Yang Xia, Gang Zhong, Chun Yu, Eric Demers
-
Publication number: 20240045546
Abstract: A touch display substrate is provided, including a central touch area and a routing area located around the central touch area, where the routing area is provided with isolation lines and a plurality of touch signal lines led out from the central touch area, the extension direction of the isolation lines is parallel to the extension direction of the touch signal lines, the touch signal lines include first touch signal lines arranged close to the isolation lines and second touch signal lines arranged far from the isolation lines, and the width of the first touch signal lines is greater than the width of the second touch signal lines. A touch display device and a touch control signal line distribution method are provided.
Type: Application
Filed: April 28, 2021
Publication date: February 8, 2024
Inventors: Jiawei XU, Yun DU, Zhao DONG, Wenjin FAN
-
Publication number: 20240046543
Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for runtime optimization of the shader execution flow. A graphics processor may obtain instruction execution data associated with a graphics workload, the instruction execution data including graphics data for a set of shader operations. The graphics processor may configure, at a first iteration, at least one predication value based on the instruction execution data including the graphics data for the set of shader operations. The graphics processor may adjust, at a second iteration, an execution flow of the graphics workload based on the configured at least one predication value, the execution flow of the graphics workload including the set of shader operations. The graphics processor may execute or refrain from executing, at the second iteration, each of the set of shader operations based on the adjusted execution flow of the graphics workload.
Type: Application
Filed: August 5, 2022
Publication date: February 8, 2024
Inventors: Yun DU, Eric DEMERS, Andrew Evan GRUBER, Chun YU, Baoguang YANG, Chihong ZHANG, Yuehai DU, Avinash SEETHARAMAIAH, Jonnala Gadda NAGENDRA KUMAR, Gang ZHONG, Zilin YING, Fei WEI
-
Publication number: 20240037183
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
Type: Application
Filed: October 16, 2023
Publication date: February 1, 2024
Inventors: Yun DU, Gang ZHONG, Fei WEI, Yibin ZHANG, Jing HAN, Hongjiang SHANG, Elina KAMENETSKAYA, Minjie HUANG, Alexei Vladimirovich BOURD, Chun YU, Andrew Evan GRUBER, Eric DEMERS
-
Publication number: 20230394738
Abstract: The present disclosure relates to methods and apparatus for graphics processing, e.g., a GPU. The apparatus may receive an image including a plurality of pixels associated with one or more workgroups and one or more pixel tiles, each of the workgroups and the pixel tiles including one or more pixels of the plurality of pixels. The apparatus may determine whether the one or more workgroups are misaligned with the one or more pixel tiles. The apparatus may determine a conversion order of the one or more workgroups when the one or more workgroups are misaligned with the one or more pixel tiles, the conversion order corresponding to a common multiple of one of the one or more workgroups and one of the one or more pixel tiles. The apparatus may convert each of the one or more workgroups based on the conversion order of the one or more workgroups.
Type: Application
Filed: November 9, 2020
Publication date: December 7, 2023
Inventors: Yibin ZHANG, Zilin YING, Yun DU, Heng QI, Jiexia YU, Yang YU, Andrew Evan GRUBER, Jian LIANG, Tao WANG, Alexei Vladimirovich BOURD, Gang ZHONG, Minjie HUANG
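Illustration (not part of the patent record): the "common multiple" mentioned in this abstract can be made concrete with a small one-dimensional Python sketch. The function names, the use of widths in pixels, and the definition of misalignment (neither width divides the other) are all assumptions made for this example; the abstract does not define how misalignment is detected or how the conversion order is derived.

```python
from math import gcd


def is_misaligned(workgroup_width: int, tile_width: int) -> bool:
    """Assumed test: misaligned when neither width evenly divides the other."""
    return (tile_width % workgroup_width != 0
            and workgroup_width % tile_width != 0)


def common_span(workgroup_width: int, tile_width: int) -> int:
    """Least common multiple of the two widths, in pixels.

    Over a span of this many pixels, workgroup boundaries and pixel-tile
    boundaries coincide again.
    """
    return workgroup_width * tile_width // gcd(workgroup_width, tile_width)


if __name__ == "__main__":
    wg, tile = 6, 16  # placeholder widths, not taken from the patent
    if is_misaligned(wg, tile):
        span = common_span(wg, tile)
        print(span, span // wg, span // tile)
```

With the placeholder widths 6 and 16, the span is 48 pixels, which covers 8 whole workgroups and 3 whole tiles.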
-
Publication number: 20230385231
Abstract: A data processing system includes an array of reconfigurable units and a compiler configured to generate a pipeline of n computational nodes related to a dataflow graph, interleaved between n+1 buffers on the array of reconfigurable units. Each computational node is coupled to perform calculations based on data received from an immediately preceding buffer of the n+1 buffers and store results of the calculations into an immediately following buffer of the n+1 buffers after a latency. The compiler is further configured to remove a buffer of the n+1 buffers from the pipeline based on a comparison of the latencies of the computational nodes. A corresponding method is also disclosed herein.
Type: Application
Filed: May 19, 2023
Publication date: November 30, 2023
Applicant: SambaNova Systems, Inc.
Inventors: Yun DU, Jianding LUO
-
Patent number: 11829439
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
Type: Grant
Filed: December 29, 2020
Date of Patent: November 28, 2023
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Gang Zhong, Fei Wei, Yibin Zhang, Jing Han, Hongjiang Shang, Elina Kamenetskaya, Minjie Huang, Alexei Vladimirovich Bourd, Chun Yu, Andrew Evan Gruber, Eric Demers
-
Publication number: 20230377240
Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.
Type: Application
Filed: May 18, 2022
Publication date: November 23, 2023
Inventors: Yun DU, Eric DEMERS, Andrew Evan GRUBER, Chun YU, Chihong ZHANG, Baoguang YANG, Yuehai DU, Gang ZHONG, Avinash SEETHARAMAIAH, Jonnala Gadda NAGENDRA KUMAR