Patents by Inventor Sushma Wokhlu
Sushma Wokhlu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220164115Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.Type: ApplicationFiled: December 6, 2021Publication date: May 26, 2022Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
-
Patent number: 11334355Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.Type: GrantFiled: May 4, 2017Date of Patent: May 17, 2022Assignee: Futurewei Technologies, Inc.Inventors: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
-
Patent number: 11194478Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.Type: GrantFiled: October 21, 2019Date of Patent: December 7, 2021Assignee: Futurewei Technologies, Inc.Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
-
Patent number: 10884756Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.Type: GrantFiled: May 18, 2020Date of Patent: January 5, 2021Assignee: Futurewei Technologies, Inc.Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
-
Publication number: 20200278869Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.Type: ApplicationFiled: May 18, 2020Publication date: September 3, 2020Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
-
Patent number: 10691463Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.Type: GrantFiled: July 26, 2016Date of Patent: June 23, 2020Assignee: Futurewei Technologies, Inc.Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
-
Publication number: 20200050376Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.Type: ApplicationFiled: October 21, 2019Publication date: February 13, 2020Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
-
Patent number: 10452287Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.Type: GrantFiled: June 24, 2016Date of Patent: October 22, 2019Assignee: FUTUREWEI TECHNOLOGIES, INC.Inventors: Sushma Wokhlu, Lee Dobson Mcfearin, Alan Gatherer, Hao Luan
-
Patent number: 10419501Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.Type: GrantFiled: December 3, 2015Date of Patent: September 17, 2019Assignee: Futurewei Technologies, Inc.Inventors: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
-
Patent number: 10366013Abstract: The present disclosure relates to a system and method of managing operation of a cache memory. The system and method assign each nested task a level, and each task within a nested level an instance. Using the assigned task levels and instances, the cache management module is able to determine which cache entries to evict from cache when space is needed, and which evicted cache entries to recover upon completion of preempting tasks.Type: GrantFiled: January 15, 2016Date of Patent: July 30, 2019Assignee: Futurewei Technologies, Inc.Inventors: Lee McFearin, Sushma Wokhlu, Alan Gatherer
-
Publication number: 20180321939Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.Type: ApplicationFiled: May 4, 2017Publication date: November 8, 2018Applicant: Futurewei Technologies, Inc.Inventors: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
-
Publication number: 20180300258Abstract: A method of operating a cache memory comprises receiving a first read or write command including at least a first address referring to first data and a first rank indicator associated with the first data, and in response to receiving the first read or write command, reading or writing the first data referenced by the first address, and storing the first rank indicator.Type: ApplicationFiled: April 13, 2017Publication date: October 18, 2018Applicant: Futurewei Technologies, Inc.Inventors: Sushma Wokhlu, Alex Elisa Chandra, Alan Gatherer
-
Patent number: 10042773Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.Type: GrantFiled: July 28, 2015Date of Patent: August 7, 2018Assignee: FUTUREWEI TECHNOLOGIES, INC.Inventors: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan
-
Patent number: 9983995Abstract: A cache and a method for operating a cache are disclosed. In an embodiment, the cache includes a cache controller, data cache and a delay write through cache (DWTC), wherein the data cache is separate and distinct from the DWTC, wherein cacheable write accesses are split into shareable cacheable write accesses and non-shareable cacheable write accesses, wherein the cacheable shareable write accesses are allocated only to the DWTC, and wherein the non-shareable cacheable write accesses are not allocated to the DWTC.Type: GrantFiled: April 18, 2016Date of Patent: May 29, 2018Assignee: Futurewei Technologies, Inc.Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
-
Publication number: 20170371570Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.Type: ApplicationFiled: June 24, 2016Publication date: December 28, 2017Inventors: Sushma Wokhlu, Lee Dobson Mcfearin, Alan Gatherer, Hao Luan
-
Publication number: 20170300414Abstract: A cache and a method for operating a cache are disclosed. In an embodiment, the cache includes a cache controller, data cache and a delay write through cache (DWTC), wherein the data cache is separate and distinct from the DWTC, wherein cacheable write accesses are split into shareable cacheable write accesses and non-shareable cacheable write accesses, wherein the cacheable shareable write accesses are allocated only to the DWTC, and wherein the non-shareable cacheable write accesses are not allocated to the DWTC.Type: ApplicationFiled: April 18, 2016Publication date: October 19, 2017Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
-
Publication number: 20170206173Abstract: The present disclosure relates to a system and method of managing operation of a cache memory. The system and method assign each nested task a level, and each task within a nested level an instance. Using the assigned task levels and instances, the cache management module is able to determine which cache entries to evict from cache when space is needed, and which evicted cache entries to recover upon completion of preempting tasks.Type: ApplicationFiled: January 15, 2016Publication date: July 20, 2017Inventors: Lee McFearin, Sushma Wokhlu, Alan Gatherer
-
Publication number: 20170163698Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.Type: ApplicationFiled: December 3, 2015Publication date: June 8, 2017Inventors: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
-
Publication number: 20170031829Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.Type: ApplicationFiled: July 28, 2015Publication date: February 2, 2017Inventors: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan
-
Publication number: 20170031689Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.Type: ApplicationFiled: July 26, 2016Publication date: February 2, 2017Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava