Patents by Inventor Sushma Wokhlu

Sushma Wokhlu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220164115
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Application
    Filed: December 6, 2021
    Publication date: May 26, 2022
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
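
A minimal C++ sketch of the DOM scheme described in the abstract above may help. This is a software analogy of the patented hardware, not the implementation: all names (DataOwnershipManager, MemoryController, verify) are invented for illustration, and the hardware's parallel access check is modeled as two calls whose results are combined before the data response is released.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct ReadRequest { uint32_t ce_id; uint64_t addr; };

// Central ownership check, off the data path (hypothetical sketch).
class DataOwnershipManager {
    std::unordered_map<uint64_t, uint32_t> owner_;  // per-address ownership, simplified
public:
    void setOwner(uint64_t addr, uint32_t ce_id) { owner_[addr] = ce_id; }
    bool verify(const ReadRequest& r) const {
        auto it = owner_.find(r.addr);
        return it != owner_.end() && it->second == r.ce_id;
    }
};

class MemoryController {
    std::unordered_map<uint64_t, uint64_t> mem_;
public:
    void write(uint64_t addr, uint64_t v) { mem_[addr] = v; }
    uint64_t fetch(uint64_t addr) { return mem_[addr]; }  // data-plane work
};

int main() {
    DataOwnershipManager dom;
    MemoryController mc;
    dom.setOwner(0x1000, /*ce_id=*/1);
    mc.write(0x1000, 42);

    ReadRequest req{1, 0x1000};
    // Conceptually these two proceed in parallel; the DOM never sits on
    // the data path, it only gates delivery of the finished response.
    uint64_t data = mc.fetch(req.addr);  // memory controller generates response
    bool ok = dom.verify(req);           // DOM verifies access rights
    if (ok) std::cout << "CE" << req.ce_id << " <- " << data << "\n";
    else    std::cout << "access fault\n";
}
```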
  • Patent number: 11334355
    Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.
    Type: Grant
    Filed: May 4, 2017
    Date of Patent: May 17, 2022
    Assignee: Futurewei Technologies, Inc.
    Inventors: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
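
The master/consumer split in the abstract above can also be sketched in software. In this minimal illustration, all names (Consumer, local_read_storage, pending) are hypothetical: the master partially decodes an instruction, stages its operand from memory into the consumer's local read storage, and forwards the instruction for the consumer to execute against the pre-staged data.

```cpp
#include <cstdint>
#include <deque>
#include <iostream>
#include <unordered_map>

struct Instruction { char op; uint64_t operand_addr; };

struct Consumer {
    std::deque<int64_t> local_read_storage;  // small, close-to-core buffer
    std::deque<Instruction> pending;         // instructions from the master
    void run() {
        while (!pending.empty()) {
            Instruction ins = pending.front(); pending.pop_front();
            // Operand was already delivered to local read storage.
            int64_t v = local_read_storage.front(); local_read_storage.pop_front();
            if (ins.op == '+') std::cout << "result: " << (v + v) << "\n";
        }
    }
};

int main() {
    std::unordered_map<uint64_t, int64_t> memory{{0x10, 21}};
    Consumer consumer;

    Instruction ins{'+', 0x10};
    // Master: partial decode shows the instruction needs memory data, so it
    // asks memory to deliver directly to the consumer's local read storage...
    consumer.local_read_storage.push_back(memory[ins.operand_addr]);
    // ...and forwards the instruction itself to the consumer.
    consumer.pending.push_back(ins);

    consumer.run();  // consumer executes using the pre-staged operand
}
```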
  • Patent number: 11194478
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: December 7, 2021
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Patent number: 10884756
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: January 5, 2021
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
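
The variable lane architecture above pairs a scalar control pipeline with vector lanes in memory. Below is a rough C++ sketch under that reading, with all names (ComputeNode, execute, the block partitioning) invented for illustration: the GPCU dispatches a memory-block address to each computing node, and each node runs its slice of the task independently.

```cpp
#include <iostream>
#include <vector>

// Hypothetical computing node: runs its portion of the task on its own
// block of the memory bank, independently of the other nodes.
struct ComputeNode {
    void execute(const std::vector<int>& bank, size_t base, size_t len, long& out) {
        out = 0;
        for (size_t i = base; i < base + len; ++i) out += bank[i];
    }
};

int main() {
    std::vector<int> memory_bank(16, 1);   // shared memory bank
    std::vector<ComputeNode> nodes(4);     // vector "lanes"
    std::vector<long> partial(nodes.size());

    // GPCU role: scalar pipeline that schedules the task and dispatches a
    // memory-block address (here, a base index) to each node.
    size_t block = memory_bank.size() / nodes.size();
    for (size_t n = 0; n < nodes.size(); ++n)
        nodes[n].execute(memory_bank, n * block, block, partial[n]);

    long total = 0;
    for (long p : partial) total += p;
    std::cout << "sum = " << total << "\n";  // prints 16
}
```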
  • Publication number: 20200278869
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Application
    Filed: May 18, 2020
    Publication date: September 3, 2020
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Patent number: 10691463
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Grant
    Filed: July 26, 2016
    Date of Patent: June 23, 2020
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Publication number: 20200050376
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Application
    Filed: October 21, 2019
    Publication date: February 13, 2020
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Patent number: 10452287
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: October 22, 2019
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Patent number: 10419501
    Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.
    Type: Grant
    Filed: December 3, 2015
    Date of Patent: September 17, 2019
    Assignee: Futurewei Technologies, Inc.
    Inventors: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
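
A short sketch of the DSU pipeline in the abstract above, with the address generator and data organization unit modeled as plain functions (all names and the strided-gather access pattern are assumptions for illustration): addresses are generated, the data organization unit gathers and reorders the values, and the result streams to the compute engine as a contiguous block.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Address generator: produces storage-unit addresses for the stream.
std::vector<size_t> addressGenerator(size_t base, size_t stride, size_t n) {
    std::vector<size_t> addrs;
    for (size_t i = 0; i < n; ++i) addrs.push_back(base + i * stride);
    return addrs;
}

// Data organization unit: gathers scattered values into the contiguous
// order the compute engine wants to consume.
std::vector<int> dataOrganizationUnit(const std::vector<int>& storage,
                                      const std::vector<size_t>& addrs) {
    std::vector<int> organized;
    for (size_t a : addrs) organized.push_back(storage[a]);
    return organized;
}

int main() {
    std::vector<int> storage{0, 10, 20, 30, 40, 50, 60, 70};
    // Gather every other element (stride 2), starting at index 1.
    auto addrs = addressGenerator(/*base=*/1, /*stride=*/2, /*n=*/4);
    auto stream = dataOrganizationUnit(storage, addrs);
    for (int v : stream) std::cout << v << ' ';  // 10 30 50 70
    std::cout << '\n';
}
```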
  • Patent number: 10366013
    Abstract: The present disclosure relates to a system and method of managing operation of a cache memory. The system and method assign each nested task a level, and each task within a nested level an instance. Using the assigned task levels and instances, the cache management module is able to determine which cache entries to evict from cache when space is needed, and which evicted cache entries to recover upon completion of preempting tasks.
    Type: Grant
    Filed: January 15, 2016
    Date of Patent: July 30, 2019
    Assignee: Futurewei Technologies, Inc.
    Inventors: Lee McFearin, Sushma Wokhlu, Alan Gatherer
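
The level/instance tagging above suggests a simple model: each cache entry carries the (nesting level, instance) of the task that owns it, eviction victimizes entries of preempted tasks, and those entries are recovered when the preempting task completes. The C++ sketch below is one hypothetical reading (Tag, Entry, taskDone are all invented names), not the patented policy.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

struct Tag { int level; int instance; };          // nesting level, instance
struct Entry { uint64_t addr; int value; Tag owner; };

struct Cache {
    std::vector<Entry> lines;    // live entries
    std::vector<Entry> evicted;  // kept aside for later recovery
    size_t capacity = 2;

    void insert(Entry e) {
        if (lines.size() == capacity) {
            // Victimize an entry owned by a shallower (preempted) task.
            // The sketch ignores the no-victim case for brevity.
            for (size_t i = 0; i < lines.size(); ++i) {
                if (lines[i].owner.level < e.owner.level) {
                    evicted.push_back(lines[i]);
                    lines.erase(lines.begin() + i);
                    break;
                }
            }
        }
        lines.push_back(e);
    }

    // Preempting task at `level` finished: drop its entries and recover
    // whatever it displaced.
    void taskDone(int level) {
        std::vector<Entry> kept;
        for (auto& e : lines) if (e.owner.level != level) kept.push_back(e);
        lines = kept;
        for (auto& e : evicted) lines.push_back(e);
        evicted.clear();
    }
};

int main() {
    Cache c;
    c.insert({0x1, 11, {0, 0}});  // outer task's data
    c.insert({0x2, 22, {0, 1}});
    c.insert({0x3, 33, {1, 0}});  // nested task preempts, evicts an entry
    c.taskDone(1);                // nested task done: recover evicted entry
    for (auto& e : c.lines) std::cout << std::hex << e.addr << ' ';
    std::cout << '\n';            // prints 2 1
}
```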
  • Publication number: 20180321939
    Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.
    Type: Application
    Filed: May 4, 2017
    Publication date: November 8, 2018
    Applicant: Futurewei Technologies, Inc.
    Inventors: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
  • Publication number: 20180300258
    Abstract: A method of operating a cache memory comprises receiving a first read or write command including at least a first address referring to first data and a first rank indicator associated with the first data, and in response to receiving the first read or write command, reading or writing the first data referenced by the first address, and storing the first rank indicator.
    Type: Application
    Filed: April 13, 2017
    Publication date: October 18, 2018
    Applicant: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alex Elisa Chandra, Alan Gatherer
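
One plausible use of the stored rank indicator above is as a replacement hint. The sketch below assumes exactly that (RankedCache, lowestRankVictim, and the evict-lowest-rank policy are all illustrative guesses, not the patented method): each write carries a rank that the cache stores alongside the data and can later consult when picking a victim.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct Line { int64_t data; uint8_t rank; };

class RankedCache {
    std::unordered_map<uint64_t, Line> lines_;
public:
    void write(uint64_t addr, int64_t data, uint8_t rank) {
        lines_[addr] = {data, rank};  // store the data and its rank indicator
    }
    bool read(uint64_t addr, int64_t& out) {
        auto it = lines_.find(addr);
        if (it == lines_.end()) return false;
        out = it->second.data;
        return true;
    }
    uint64_t lowestRankVictim() const {  // hypothetical replacement hint
        uint64_t victim = 0; int best = 256;
        for (auto& [addr, line] : lines_)
            if (line.rank < best) { best = line.rank; victim = addr; }
        return victim;
    }
};

int main() {
    RankedCache c;
    c.write(0xA0, 7, /*rank=*/3);
    c.write(0xB0, 9, /*rank=*/1);
    int64_t v;
    if (c.read(0xA0, v)) std::cout << "read " << v << "\n";
    std::cout << "evict candidate: 0x" << std::hex << c.lowestRankVictim() << "\n";
}
```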
  • Patent number: 10042773
    Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.
    Type: Grant
    Filed: July 28, 2015
    Date of Patent: August 7, 2018
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan
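
The advance cache allocation steps above are procedural enough to sketch end to end. In this minimal C++ illustration (AccessMessage, CoreCache, burstLoad are invented names; real hardware would do this with burst transactions, not vector copies), a message describing the job's future accesses drives a burst load into a dedicated portion of the selected core's cache before the job starts.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

struct AccessMessage { size_t base; size_t count; };  // future accesses

struct CoreCache {
    std::vector<int> dedicated;  // portion reserved for prefetched job data
};

// Memory burst request generated from the message: preload the dedicated
// cache portion before the job is started.
void burstLoad(const std::vector<int>& memory, const AccessMessage& msg,
               CoreCache& cache) {
    cache.dedicated.assign(memory.begin() + msg.base,
                           memory.begin() + msg.base + msg.count);
}

void runJob(const CoreCache& cache) {
    long sum = 0;
    for (int v : cache.dedicated) sum += v;  // job hits in cache, no misses
    std::cout << "job result: " << sum << "\n";
}

int main() {
    std::vector<int> memory(64, 2);
    CoreCache core0;                              // selected core's cache
    AccessMessage msg{/*base=*/8, /*count=*/16};  // received ahead of launch
    burstLoad(memory, msg, core0);                // perform the burst request
    runJob(core0);                                // start the selected job
}
```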
  • Patent number: 9983995
    Abstract: A cache and a method for operating a cache are disclosed. In an embodiment, the cache includes a cache controller, data cache and a delay write through cache (DWTC), wherein the data cache is separate and distinct from the DWTC, wherein cacheable write accesses are split into shareable cacheable write accesses and non-shareable cacheable write accesses, wherein the cacheable shareable write accesses are allocated only to the DWTC, and wherein the non-shareable cacheable write accesses are not allocated to the DWTC.
    Type: Grant
    Filed: April 18, 2016
    Date of Patent: May 29, 2018
    Assignee: Futurewei Technologies, Inc.
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
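
The DWTC split above routes shareable and non-shareable cacheable writes to different structures. A minimal sketch of that routing, assuming invented names (Cacheish, cacheableWrite) and reducing both caches to labeled maps:

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

// Stand-in for either cache: just records writes under a label.
class Cacheish {
    std::unordered_map<uint64_t, int64_t> lines_;
    const char* name_;
public:
    explicit Cacheish(const char* n) : name_(n) {}
    void write(uint64_t addr, int64_t v) {
        lines_[addr] = v;
        std::cout << name_ << " <- [" << std::hex << addr << "] = "
                  << std::dec << v << "\n";
    }
};

int main() {
    Cacheish dataCache("data cache");
    Cacheish dwtc("DWTC");

    // The split: shareable cacheable writes are allocated only to the
    // DWTC; non-shareable ones go to the ordinary data cache.
    auto cacheableWrite = [&](uint64_t addr, int64_t v, bool shareable) {
        if (shareable) dwtc.write(addr, v);
        else           dataCache.write(addr, v);
    };

    cacheableWrite(0x100, 1, /*shareable=*/true);
    cacheableWrite(0x200, 2, /*shareable=*/false);
}
```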
  • Publication number: 20170371570
    Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
    Type: Application
    Filed: June 24, 2016
    Publication date: December 28, 2017
    Inventors: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
  • Publication number: 20170300414
    Abstract: A cache and a method for operating a cache are disclosed. In an embodiment, the cache includes a cache controller, data cache and a delay write through cache (DWTC), wherein the data cache is separate and distinct from the DWTC, wherein cacheable write accesses are split into shareable cacheable write accesses and non-shareable cacheable write accesses, wherein the cacheable shareable write accesses are allocated only to the DWTC, and wherein the non-shareable cacheable write accesses are not allocated to the DWTC.
    Type: Application
    Filed: April 18, 2016
    Publication date: October 19, 2017
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Publication number: 20170206173
    Abstract: The present disclosure relates to a system and method of managing operation of a cache memory. The system and method assign each nested task a level, and each task within a nested level an instance. Using the assigned task levels and instances, the cache management module is able to determine which cache entries to evict from cache when space is needed, and which evicted cache entries to recover upon completion of preempting tasks.
    Type: Application
    Filed: January 15, 2016
    Publication date: July 20, 2017
    Inventors: Lee McFearin, Sushma Wokhlu, Alan Gatherer
  • Publication number: 20170163698
    Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.
    Type: Application
    Filed: December 3, 2015
    Publication date: June 8, 2017
    Inventors: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
  • Publication number: 20170031689
    Abstract: A system and method for variable lane architecture includes memory blocks located in a memory bank, one or more computing nodes forming a vector instruction pipeline for executing a task, each of the computing nodes located in the memory bank, each of the computing nodes executing a portion of the task independently of other ones of the computing nodes, and a global program controller unit (GPCU) forming a scalar instruction pipeline for executing the task, the GPCU configured to schedule instructions for the task at one or more of the computing nodes, the GPCU further configured to dispatch an address for the memory blocks used by each of the computing nodes to the computing nodes.
    Type: Application
    Filed: July 26, 2016
    Publication date: February 2, 2017
    Inventors: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
  • Publication number: 20170031829
    Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.
    Type: Application
    Filed: July 28, 2015
    Publication date: February 2, 2017
    Inventors: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan