Patents by Inventor Sushma Wokhlu

Sushma Wokhlu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220164115
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Application
    Filed: December 6, 2021
    Publication date: May 26, 2022
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
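
A minimal C++ sketch of the DOM scheme described in the abstract above may help. This is a software analogy of the patented hardware, not the implementation: all names (DataOwnershipManager, MemoryController, verify) are invented for illustration, and the hardware's parallel access check is modeled as two calls whose results are combined before the data response is released.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct ReadRequest { uint32_t ce_id; uint64_t addr; };

// Central ownership check, off the data path (hypothetical sketch).
class DataOwnershipManager {
    std::unordered_map<uint64_t, uint32_t> owner_;  // per-address ownership, simplified
public:
    void setOwner(uint64_t addr, uint32_t ce_id) { owner_[addr] = ce_id; }
    bool verify(const ReadRequest& r) const {
        auto it = owner_.find(r.addr);
        return it != owner_.end() && it->second == r.ce_id;
    }
};

class MemoryController {
    std::unordered_map<uint64_t, uint64_t> mem_;
public:
    void write(uint64_t addr, uint64_t v) { mem_[addr] = v; }
    uint64_t fetch(uint64_t addr) { return mem_[addr]; }  // data-plane work
};

int main() {
    DataOwnershipManager dom;
    MemoryController mc;
    dom.setOwner(0x1000, /*ce_id=*/1);
    mc.write(0x1000, 42);

    ReadRequest req{1, 0x1000};
    // Conceptually these two proceed in parallel; the DOM never sits on
    // the data path, it only gates delivery of the finished response.
    uint64_t data = mc.fetch(req.addr);  // memory controller generates response
    bool ok = dom.verify(req);           // DOM verifies access rights
    if (ok) std::cout << "CE" << req.ce_id << " <- " << data << "\n";
    else    std::cout << "access fault\n";
}
```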
  • Patent number: 11334355
    Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.
    Type: Grant
    Filed: May 4, 2017
    Date of Patent: May 17, 2022
    Assignee: Futurewei Technologies, Inc.
    Inventors: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
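
The master/consumer split in the abstract above can also be sketched in software. In this minimal illustration, all names (Consumer, local_read_storage, pending) are hypothetical: the master partially decodes an instruction, stages its operand from memory into the consumer's local read storage, and forwards the instruction for the consumer to execute against the pre-staged data.

```cpp
#include <cstdint>
#include <deque>
#include <iostream>
#include <unordered_map>

struct Instruction { char op; uint64_t operand_addr; };

struct Consumer {
    std::deque<int64_t> local_read_storage;  // small, close-to-core buffer
    std::deque<Instruction> pending;         // instructions from the master
    void run() {
        while (!pending.empty()) {
            Instruction ins = pending.front(); pending.pop_front();
            // Operand was already delivered to local read storage.
            int64_t v = local_read_storage.front(); local_read_storage.pop_front();
            if (ins.op == '+') std::cout << "result: " << (v + v) << "\n";
        }
    }
};

int main() {
    std::unordered_map<uint64_t, int64_t> memory{{0x10, 21}};
    Consumer consumer;

    Instruction ins{'+', 0x10};
    // Master: partial decode shows the instruction needs memory data, so it
    // asks memory to deliver directly to the consumer's local read storage...
    consumer.local_read_storage.push_back(memory[ins.operand_addr]);
    // ...and forwards the instruction itself to the consumer.
    consumer.pending.push_back(ins);

    consumer.run();  // consumer executes using the pre-staged operand
}
```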
  • Patent number: 11194478
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: December 7, 2021
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Patent number: 10884756
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: January 5, 2021
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
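
The variable lane architecture above pairs a scalar control pipeline with vector lanes in memory. Below is a rough C++ sketch under that reading, with all names (ComputeNode, execute, the block partitioning) invented for illustration: the GPCU dispatches a memory-block address to each computing node, and each node runs its slice of the task independently.

```cpp
#include <iostream>
#include <vector>

// Hypothetical computing node: runs its portion of the task on its own
// block of the memory bank, independently of the other nodes.
struct ComputeNode {
    void execute(const std::vector<int>& bank, size_t base, size_t len, long& out) {
        out = 0;
        for (size_t i = base; i < base + len; ++i) out += bank[i];
    }
};

int main() {
    std::vector<int> memory_bank(16, 1);   // shared memory bank
    std::vector<ComputeNode> nodes(4);     // vector "lanes"
    std::vector<long> partial(nodes.size());

    // GPCU role: scalar pipeline that schedules the task and dispatches a
    // memory-block address (here, a base index) to each node.
    size_t block = memory_bank.size() / nodes.size();
    for (size_t n = 0; n < nodes.size(); ++n)
        nodes[n].execute(memory_bank, n * block, block, partial[n]);

    long total = 0;
    for (long p : partial) total += p;
    std::cout << "sum = " << total << "\n";  // prints 16
}
```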
  • Publication number: 20200278869
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Application
    Filed: May 18, 2020
    Publication date: September 3, 2020
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Patent number: 10691463
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Grant
    Filed: July 26, 2016
    Date of Patent: June 23, 2020
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Publication number: 20200050376
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Application
    Filed: October 21, 2019
    Publication date: February 13, 2020
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Patent number: 10452287
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: October 22, 2019
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Patent number: 10419501
    Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.
    Type: Grant
    Filed: December 3, 2015
    Date of Patent: September 17, 2019
    Assignee: Futurewei Technologies, Inc.
    Inventors: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
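
A short sketch of the DSU pipeline in the abstract above, with the address generator and data organization unit modeled as plain functions (all names and the strided-gather access pattern are assumptions for illustration): addresses are generated, the data organization unit gathers and reorders the values, and the result streams to the compute engine as a contiguous block.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Address generator: produces storage-unit addresses for the stream.
std::vector<size_t> addressGenerator(size_t base, size_t stride, size_t n) {
    std::vector<size_t> addrs;
    for (size_t i = 0; i < n; ++i) addrs.push_back(base + i * stride);
    return addrs;
}

// Data organization unit: gathers scattered values into the contiguous
// order the compute engine wants to consume.
std::vector<int> dataOrganizationUnit(const std::vector<int>& storage,
                                      const std::vector<size_t>& addrs) {
    std::vector<int> organized;
    for (size_t a : addrs) organized.push_back(storage[a]);
    return organized;
}

int main() {
    std::vector<int> storage{0, 10, 20, 30, 40, 50, 60, 70};
    // Gather every other element (stride 2), starting at index 1.
    auto addrs = addressGenerator(/*base=*/1, /*stride=*/2, /*n=*/4);
    auto stream = dataOrganizationUnit(storage, addrs);
    for (int v : stream) std::cout << v << ' ';  // 10 30 50 70
    std::cout << '\n';
}
```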
  • Patent number: 10366013
    Abstract: The present disclosure relates to a system and method of managing operation of a cache memory. The system and method assign each nested task a level, and each task within a nested level an instance. Using the assigned task levels and instances, the cache management module is able to determine which cache entries to evict from cache when space is needed, and which evicted cache entries to recover upon completion of preempting tasks.
    Type: Grant
    Filed: January 15, 2016
    Date of Patent: July 30, 2019
    Assignee: Futurewei Technologies, Inc.
    Inventors: Lee McFearin, Sushma Wokhlu, Alan Gatherer
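
The level/instance tagging above suggests a simple model: each cache entry carries the (nesting level, instance) of the task that owns it, eviction victimizes entries of preempted tasks, and those entries are recovered when the preempting task completes. The C++ sketch below is one hypothetical reading (Tag, Entry, taskDone are all invented names), not the patented policy.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

struct Tag { int level; int instance; };          // nesting level, instance
struct Entry { uint64_t addr; int value; Tag owner; };

struct Cache {
    std::vector<Entry> lines;    // live entries
    std::vector<Entry> evicted;  // kept aside for later recovery
    size_t capacity = 2;

    void insert(Entry e) {
        if (lines.size() == capacity) {
            // Victimize an entry owned by a shallower (preempted) task.
            // The sketch ignores the no-victim case for brevity.
            for (size_t i = 0; i < lines.size(); ++i) {
                if (lines[i].owner.level < e.owner.level) {
                    evicted.push_back(lines[i]);
                    lines.erase(lines.begin() + i);
                    break;
                }
            }
        }
        lines.push_back(e);
    }

    // Preempting task at `level` finished: drop its entries and recover
    // whatever it displaced.
    void taskDone(int level) {
        std::vector<Entry> kept;
        for (auto& e : lines) if (e.owner.level != level) kept.push_back(e);
        lines = kept;
        for (auto& e : evicted) lines.push_back(e);
        evicted.clear();
    }
};

int main() {
    Cache c;
    c.insert({0x1, 11, {0, 0}});  // outer task's data
    c.insert({0x2, 22, {0, 1}});
    c.insert({0x3, 33, {1, 0}});  // nested task preempts, evicts an entry
    c.taskDone(1);                // nested task done: recover evicted entry
    for (auto& e : c.lines) std::cout << std::hex << e.addr << ' ';
    std::cout << '\n';            // prints 2 1
}
```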
  • Publication number: 20180321939
    Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.
    Type: Application
    Filed: May 4, 2017
    Publication date: November 8, 2018
    Applicant: Futurewei Technologies, Inc.
    Inventors: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
  • Publication number: 20180300258
    Abstract: A method of operating a cache memory comprises receiving a first read or write command including at least a first address referring to first data and a first rank indicator associated with the first data, and in response to receiving the first read or write command, reading or writing the first data referenced by the first address, and storing the first rank indicator.
    Type: Application
    Filed: April 13, 2017
    Publication date: October 18, 2018
    Applicant: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alex Elisa Chandra, Alan Gatherer
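
One plausible use of the stored rank indicator above is as a replacement hint. The sketch below assumes exactly that (RankedCache, lowestRankVictim, and the evict-lowest-rank policy are all illustrative guesses, not the patented method): each write carries a rank that the cache stores alongside the data and can later consult when picking a victim.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct Line { int64_t data; uint8_t rank; };

class RankedCache {
    std::unordered_map<uint64_t, Line> lines_;
public:
    void write(uint64_t addr, int64_t data, uint8_t rank) {
        lines_[addr] = {data, rank};  // store the data and its rank indicator
    }
    bool read(uint64_t addr, int64_t& out) {
        auto it = lines_.find(addr);
        if (it == lines_.end()) return false;
        out = it->second.data;
        return true;
    }
    uint64_t lowestRankVictim() const {  // hypothetical replacement hint
        uint64_t victim = 0; int best = 256;
        for (auto& [addr, line] : lines_)
            if (line.rank < best) { best = line.rank; victim = addr; }
        return victim;
    }
};

int main() {
    RankedCache c;
    c.write(0xA0, 7, /*rank=*/3);
    c.write(0xB0, 9, /*rank=*/1);
    int64_t v;
    if (c.read(0xA0, v)) std::cout << "read " << v << "\n";
    std::cout << "evict candidate: 0x" << std::hex << c.lowestRankVictim() << "\n";
}
```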
  • Patent number: 10042773
    Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.
    Type: Grant
    Filed: July 28, 2015
    Date of Patent: August 7, 2018
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan
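
The advance cache allocation steps above are procedural enough to sketch end to end. In this minimal C++ illustration (AccessMessage, CoreCache, burstLoad are invented names; real hardware would do this with burst transactions, not vector copies), a message describing the job's future accesses drives a burst load into a dedicated portion of the selected core's cache before the job starts.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

struct AccessMessage { size_t base; size_t count; };  // future accesses

struct CoreCache {
    std::vector<int> dedicated;  // portion reserved for prefetched job data
};

// Memory burst request generated from the message: preload the dedicated
// cache portion before the job is started.
void burstLoad(const std::vector<int>& memory, const AccessMessage& msg,
               CoreCache& cache) {
    cache.dedicated.assign(memory.begin() + msg.base,
                           memory.begin() + msg.base + msg.count);
}

void runJob(const CoreCache& cache) {
    long sum = 0;
    for (int v : cache.dedicated) sum += v;  // job hits in cache, no misses
    std::cout << "job result: " << sum << "\n";
}

int main() {
    std::vector<int> memory(64, 2);
    CoreCache core0;                              // selected core's cache
    AccessMessage msg{/*base=*/8, /*count=*/16};  // received ahead of launch
    burstLoad(memory, msg, core0);                // perform the burst request
    runJob(core0);                                // start the selected job
}
```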
  • Patent number: 9983995
    Abstract: A cache and a method for operating a cache are disclosed. In an embodiment, the cache includes a cache controller, data cache and a delay write through cache (DWTC), wherein the data cache is separate and distinct from the DWTC, wherein cacheable write accesses are split into shareable cacheable write accesses and non-shareable cacheable write accesses, wherein the cacheable shareable write accesses are allocated only to the DWTC, and wherein the non-shareable cacheable write accesses are not allocated to the DWTC.
    Type: Grant
    Filed: April 18, 2016
    Date of Patent: May 29, 2018
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
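
The DWTC split above routes shareable and non-shareable cacheable writes to different structures. A minimal sketch of that routing, assuming invented names (Cacheish, cacheableWrite) and reducing both caches to labeled maps:

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

// Stand-in for either cache: just records writes under a label.
class Cacheish {
    std::unordered_map<uint64_t, int64_t> lines_;
    const char* name_;
public:
    explicit Cacheish(const char* n) : name_(n) {}
    void write(uint64_t addr, int64_t v) {
        lines_[addr] = v;
        std::cout << name_ << " <- [" << std::hex << addr << "] = "
                  << std::dec << v << "\n";
    }
};

int main() {
    Cacheish dataCache("data cache");
    Cacheish dwtc("DWTC");

    // The split: shareable cacheable writes are allocated only to the
    // DWTC; non-shareable ones go to the ordinary data cache.
    auto cacheableWrite = [&](uint64_t addr, int64_t v, bool shareable) {
        if (shareable) dwtc.write(addr, v);
        else           dataCache.write(addr, v);
    };

    cacheableWrite(0x100, 1, /*shareable=*/true);
    cacheableWrite(0x200, 2, /*shareable=*/false);
}
```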
  • Publication number: 20170371570
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Application
    Filed: June 24, 2016
    Publication date: December 28, 2017
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Publication number: 20170300414
    Abstract: A cache and a method for operating a cache are disclosed. In an embodiment, the cache includes a cache controller, data cache and a delay write through cache (DWTC), wherein the data cache is separate and distinct from the DWTC, wherein cacheable write accesses are split into shareable cacheable write accesses and non-shareable cacheable write accesses, wherein the cacheable shareable write accesses are allocated only to the DWTC, and wherein the non-shareable cacheable write accesses are not allocated to the DWTC.
    Type: Application
    Filed: April 18, 2016
    Publication date: October 19, 2017
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Publication number: 20170206173
    Abstract: The present disclosure relates to a system and method of managing operation of a cache memory. The system and method assign each nested task a level, and each task within a nested level an instance. Using the assigned task levels and instances, the cache management module is able to determine which cache entries to evict from cache when space is needed, and which evicted cache entries to recover upon completion of preempting tasks.
    Type: Application
    Filed: January 15, 2016
    Publication date: July 20, 2017
    Inventors: Lee McFearin, Sushma Wokhlu, Alan Gatherer
  • Publication number: 20170163698
    Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.
    Type: Application
    Filed: December 3, 2015
    Publication date: June 8, 2017
    Inventors: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
  • Publication number: 20170031689
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Application
    Filed: July 26, 2016
    Publication date: February 2, 2017
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Publication number: 20170031829
    Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.
    Type: Application
    Filed: July 28, 2015
    Publication date: February 2, 2017
    Inventors: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan