Patents by Inventor Ganesh Suryanarayan Dasika

Ganesh Suryanarayan Dasika has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230305923
    Abstract: A memory system uses error detection codes to detect when errors have occurred in a region of memory. A count of the number of errors is kept and a notification is output in response to the number of errors satisfying a threshold value. The notification is an indication to a host (e.g., a program accessing or managing a machine learning system) that the threshold number of errors has been detected in the region of memory. As long as the number of errors that have been detected in the region of memory remains under the threshold number, no notification need be output to the host.
    Type: Application
    Filed: March 25, 2022
    Publication date: September 28, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sudhanva Gurumurthi, Ganesh Suryanarayan Dasika
  • Patent number: 11663814
    Abstract: The present disclosure advantageously provides a system and a method for skipping recurrent neural network (RNN) state updates using a skip predictor. Sequential input data are received and divided into sequences of input data values, each input data value being associated with a different time step for a pre-trained RNN model. At each time step, the hidden state vector for a prior time step is received from the pre-trained RNN model, and a determination, based on the input data value and the hidden state vector for at least one prior time step, is made whether to provide or not provide the input data value associated with the time step to the pre-trained RNN model for processing. When the input data value is not provided, the pre-trained RNN model does not update its hidden state vector. Importantly, the skip predictor is trained without retraining the pre-trained RNN model.
    Type: Grant
    Filed: April 22, 2020
    Date of Patent: May 30, 2023
    Assignee: Arm Limited
    Inventors: Urmish Ajit Thakker, Jin Tao, Ganesh Suryanarayan Dasika, Jesse Garrett Beu
  • Patent number: 11507841
    Abstract: The present disclosure advantageously provides a hardware accelerator for a natural language processing application including a first memory, a second memory, and a computing engine (CE). The first memory is configured to store a configurable NLM and a set of NLM fixed weights. The second memory is configured to store an ANN model, a set of ANN weights, a set of NLM delta weights, input data and output data. The set of NLM delta weights may be smaller than the set of NLM fixed weights, and each NLM delta weight corresponds to an NLM fixed weight. The CE is configured to execute the NLM, based on the input data, the set of NLM fixed weights and the set of NLM delta weights, to generate intermediate output data, and execute the ANN model, based on the intermediate output data and the set of ANN weights, to generate the output data.
    Type: Grant
    Filed: February 10, 2020
    Date of Patent: November 22, 2022
    Assignee: Arm Limited
    Inventors: Urmish Ajit Thakker, Ganesh Suryanarayan Dasika
  • Patent number: 11494188
    Abstract: A single instruction multiple thread (SIMT) processor includes execution circuitry, prefetch circuitry and prefetch strategy selection circuitry. The prefetch strategy selection circuitry serves to detect one or more characteristics of a stream of program instructions that are being executed to identify whether or not a given data access instruction within a program will be executed a plurality of times. The prefetch strategy to use is selected from a plurality of selectable prefetch strategies in dependence upon the detection of such characteristics.
    Type: Grant
    Filed: October 24, 2013
    Date of Patent: November 8, 2022
    Assignee: ARM Limited
    Inventors: Ganesh Suryanarayan Dasika, Rune Holm, David Hennah Mansell
  • Patent number: 11468305
    Abstract: The present disclosure advantageously provides a hybrid memory artificial neural network hardware accelerator that includes a communication bus interface, a static memory, a non-refreshed dynamic memory, a controller and a computing engine. The static memory stores at least a portion of an ANN model. The ANN model includes an input layer, one or more hidden layers and an output layer, ANN basis weights, input data and output data. The non-refreshed dynamic memory is configured to store ANN custom weights for the input, hidden and output layers, and output data. For each layer or layer portion, the computing engine generates the ANN custom weights based on the ANN basis weights, stores the ANN custom weights in the non-refreshed dynamic memory, executes the layer or layer portion, based on inputs and the ANN custom weights, to generate layer output data, and stores the layer output data.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: October 11, 2022
    Assignee: Arm Limited
    Inventors: Urmish Ajit Thakker, Shidhartha Das, Ganesh Suryanarayan Dasika
  • Publication number: 20210295137
    Abstract: The present disclosure advantageously provides a hybrid memory artificial neural network hardware accelerator that includes a communication bus interface, a static memory, a non-refreshed dynamic memory, a controller and a computing engine. The static memory stores at least a portion of an ANN model. The ANN model includes an input layer, one or more hidden layers and an output layer, ANN basis weights, input data and output data. The non-refreshed dynamic memory is configured to store ANN custom weights for the input, hidden and output layers, and output data. For each layer or layer portion, the computing engine generates the ANN custom weights based on the ANN basis weights, stores the ANN custom weights in the non-refreshed dynamic memory, executes the layer or layer portion, based on inputs and the ANN custom weights, to generate layer output data, and stores the layer output data.
    Type: Application
    Filed: March 18, 2020
    Publication date: September 23, 2021
    Applicant: Arm Limited
    Inventors: Urmish Ajit Thakker, Shidhartha Das, Ganesh Suryanarayan Dasika
  • Publication number: 20210248008
    Abstract: The present disclosure advantageously provides a hardware accelerator for a natural language processing application including a first memory, a second memory, and a computing engine (CE). The first memory is configured to store a configurable NLM and a set of NLM fixed weights. The second memory is configured to store an ANN model, a set of ANN weights, a set of NLM delta weights, input data and output data. The set of NLM delta weights may be smaller than the set of NLM fixed weights, and each NLM delta weight corresponds to an NLM fixed weight. The CE is configured to execute the NLM, based on the input data, the set of NLM fixed weights and the set of NLM delta weights, to generate intermediate output data, and execute the ANN model, based on the intermediate output data and the set of ANN weights, to generate the output data.
    Type: Application
    Filed: February 10, 2020
    Publication date: August 12, 2021
    Inventors: Urmish Ajit Thakker, Ganesh Suryanarayan Dasika
  • Publication number: 20210056422
    Abstract: The present disclosure advantageously provides a system and a method for skipping recurrent neural network (RNN) state updates using a skip predictor. Sequential input data are received and divided into sequences of input data values, each input data value being associated with a different time step for a pre-trained RNN model. At each time step, the hidden state vector for a prior time step is received from the pre-trained RNN model, and a determination, based on the input data value and the hidden state vector for at least one prior time step, is made whether to provide or not provide the input data value associated with the time step to the pre-trained RNN model for processing. When the input data value is not provided, the pre-trained RNN model does not update its hidden state vector. Importantly, the skip predictor is trained without retraining the pre-trained RNN model.
    Type: Application
    Filed: April 22, 2020
    Publication date: February 25, 2021
    Applicant: Arm Limited
    Inventors: Urmish Ajit Thakker, Jin Tao, Ganesh Suryanarayan Dasika, Jesse Garrett Beu
  • Patent number: 10607147
    Abstract: A method for estimating a number of occupants in a region comprises receiving a time series of sensor values detected over a period of time by a motion sensor sensing motion in the region. A spread parameter indicative of the spread of the sensor values is determined. The number of occupants in the region is estimated based on the spread parameter.
    Type: Grant
    Filed: June 15, 2016
    Date of Patent: March 31, 2020
    Assignee: ARM Limited
    Inventors: Yordan Petrov Raykov, Emre Özer, Ganesh Suryanarayan Dasika
  • Publication number: 20170364817
    Abstract: A method for estimating a number of occupants in a region comprises receiving a time series of sensor values detected over a period of time by a motion sensor sensing motion in the region. A spread parameter indicative of the spread of the sensor values is determined. The number of occupants in the region is estimated based on the spread parameter.
    Type: Application
    Filed: June 15, 2016
    Publication date: December 21, 2017
    Inventors: Yordan Petrov Raykov, Emre Özer, Ganesh Suryanarayan Dasika
  • Patent number: 9582419
    Abstract: A data processing device 100 comprises a plurality of storage circuits 130, 160, which store a plurality of data elements of b bits in an interleaved manner. The data processing device also comprises a consumer 110 with a number of lanes 120. The consumer is able to individually access each of the plurality of storage circuits 130, 160 in order to receive into the lanes 120 either a subset of the plurality of data elements or y bits of each of the plurality of data elements. The consumer 110 is also able to execute a common instruction on each of the plurality of lanes 120. The relationship between b and y is such that b is greater than y and is an integer multiple of y. Each of the plurality of storage circuits 130, 160 stores at most y bits of each of the data elements. Furthermore, each of the storage circuits 130, 160 stores at most y/b of the plurality of data elements. By carrying out the interleaving in this manner, the plurality of storage circuits 130, 160 comprise no more than b/y storage circuits.
    Type: Grant
    Filed: October 25, 2013
    Date of Patent: February 28, 2017
    Assignee: ARM Limited
    Inventors: Ganesh Suryanarayan Dasika, Rune Holm, Stephen John Hill
  • Patent number: 9037835
    Abstract: A data processing device includes processing circuitry 20 for executing a first memory access instruction to a first address of a memory device 40 and a second memory access instruction to a second address of the memory device 40, the first address being different from the second address. The data processing device also includes prefetching circuitry 30 for prefetching data from the memory device 40 based on a stride length 70 and instruction analysis circuitry 50 for determining a difference between the first address and the second address. Stride refining circuitry 60 is also provided to refine the stride length based on factors of the stride length and factors of the difference calculated by the instruction analysis circuitry 50.
    Type: Grant
    Filed: October 24, 2013
    Date of Patent: May 19, 2015
    Assignee: ARM Limited
    Inventors: Ganesh Suryanarayan Dasika, Rune Holm
  • Publication number: 20150134933
    Abstract: A data processing apparatus and method of data processing are disclosed. An instruction execution unit executes a sequence of program instructions, wherein execution of at least some of the program instructions initiates memory access requests to retrieve data values from a memory. A prefetch unit prefetches data values from the memory for storage in a cache unit before they are requested by the instruction execution unit. The prefetch unit is configured to perform a miss response comprising increasing a number of the future data values which it prefetches, when a memory access request specifies a pending data value which is already subject to prefetching but is not yet stored in the cache unit. The prefetch unit is also configured, in response to an inhibition condition being met, to temporarily inhibit the miss response for an inhibition period.
    Type: Application
    Filed: November 14, 2013
    Publication date: May 14, 2015
    Applicant: ARM Limited
    Inventors: Rune Holm, Ganesh Suryanarayan Dasika
  • Publication number: 20150121014
    Abstract: A data processing device includes processing circuitry 20 for executing a first memory access instruction to a first address of a memory device 40 and a second memory access instruction to a second address of the memory device 40, the first address being different from the second address. The data processing device also includes prefetching circuitry 30 for prefetching data from the memory device 40 based on a stride length 70 and instruction analysis circuitry 50 for determining a difference between the first address and the second address. Stride refining circuitry 60 is also provided to refine the stride length based on factors of the stride length and factors of the difference calculated by the instruction analysis circuitry 50.
    Type: Application
    Filed: October 24, 2013
    Publication date: April 30, 2015
    Applicant: ARM Limited
    Inventors: Ganesh Suryanarayan Dasika, Rune Holm
  • Publication number: 20150121038
    Abstract: A single instruction multiple thread (SIMT) processor 2 includes execution circuitry 6, prefetch circuitry 12 and prefetch strategy selection circuitry 14. The prefetch strategy selection circuitry serves to detect one or more characteristics of a stream of program instructions that are being executed to identify whether or not a given data access instruction within a program will be executed a plurality of times. The prefetch strategy to use is selected from a plurality of selectable prefetch strategies in dependence upon the detection of such characteristics.
    Type: Application
    Filed: October 24, 2013
    Publication date: April 30, 2015
    Applicant: ARM Limited
    Inventors: Ganesh Suryanarayan Dasika, Rune Holm, David Hennah Mansell
  • Publication number: 20150121019
    Abstract: A data processing device 100 comprises a plurality of storage circuits 130, 160, which store a plurality of data elements of b bits in an interleaved manner. The data processing device also comprises a consumer 110 with a number of lanes 120. The consumer is able to individually access each of the plurality of storage circuits 130, 160 in order to receive into the lanes 120 either a subset of the plurality of data elements or y bits of each of the plurality of data elements. The consumer 110 is also able to execute a common instruction on each of the plurality of lanes 120. The relationship between b and y is such that b is greater than y and is an integer multiple of y. Each of the plurality of storage circuits 130, 160 stores at most y bits of each of the data elements. Furthermore, each of the storage circuits 130, 160 stores at most y/b of the plurality of data elements. By carrying out the interleaving in this manner, the plurality of storage circuits 130, 160 comprise no more than b/y storage circuits.
    Type: Application
    Filed: October 25, 2013
    Publication date: April 30, 2015
    Applicant: ARM Limited
    Inventors: Ganesh Suryanarayan Dasika, Rune Holm, Stephen John Hill
  • Patent number: 8230277
    Abstract: Data storage control circuitry for controlling storage and retrieval of data in a data store in which data is stored in data blocks. A group data store stores data by grouping together blocks that have at least one faulty bit into groups of at least two blocks. For each group of blocks at least one of the blocks has a non-faulty bit for each of the bit locations in the blocks. A selector data store stores indicators for each group indicating which bits of the blocks within a group are the non-faulty bits. When storing data to a data block within a group, the data is stored in each of the blocks within the group. When retrieving data from a data block within a group, the data is read from respective bits of the blocks within the group as indicated by the indicators.
    Type: Grant
    Filed: April 4, 2011
    Date of Patent: July 24, 2012
    Assignees: ARM Limited, The Regents of the University of Michigan
    Inventors: Trevor Nigel Mudge, Ganesh Suryanarayan Dasika, David Andrew Roberts
  • Patent number: 8145960
    Abstract: Data storage control circuitry for controlling storage and retrieval of data in a data store in which data is stored in data blocks. A group data store stores data by grouping together blocks that have at least one faulty bit into groups of at least two blocks. For each group of blocks at least one of the blocks has a non-faulty bit for each of the bit locations in the blocks. A selector data store stores indicators for each group indicating which bits of the blocks within a group are the non-faulty bits. When storing data to a data block within a group, the data is stored in each of the blocks within the group. When retrieving data from a data block within a group, the data is read from respective bits of the blocks within the group as indicated by the indicators.
    Type: Grant
    Filed: July 2, 2007
    Date of Patent: March 27, 2012
    Assignees: ARM Limited, The Regents of the University of Michigan
    Inventors: Trevor Nigel Mudge, Ganesh Suryanarayan Dasika, David Andrew Roberts
  • Publication number: 20110185260
    Abstract: Data storage control circuitry for controlling storage and retrieval of data in a data store in which data is stored in data blocks. A group data store stores data by grouping together blocks that have at least one faulty bit into groups of at least two blocks. For each group of blocks at least one of the blocks has a non-faulty bit for each of the bit locations in the blocks. A selector data store stores indicators for each group indicating which bits of the blocks within a group are the non-faulty bits. When storing data to a data block within a group, the data is stored in each of the blocks within the group. When retrieving data from a data block within a group, the data is read from respective bits of the blocks within the group as indicated by the indicators.
    Type: Application
    Filed: April 4, 2011
    Publication date: July 28, 2011
    Inventors: Trevor Nigel Mudge, Ganesh Suryanarayan Dasika, David Andrew Roberts
  • Patent number: 7945811
    Abstract: To prevent short path errors from occurring in systems having error detection and recovery mechanisms, functional elements are combined to form compound functional units comprising at least two evaluation stages, each evaluation stage including at least one functional element. At least one functional element includes error detection/recovery circuitry. The flow of input values to the first evaluation stage in the compound functional unit is controlled so that the input values are changed at most every second clock cycle.
    Type: Grant
    Filed: October 2, 2008
    Date of Patent: May 17, 2011
    Assignees: ARM Limited, The Regents of the University of Michigan
    Inventors: David Michael Bull, Ganesh Suryanarayan Dasika
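
Illustrative sketch (not taken from any patent text): the abstract shared by patents 8145960 and 8230277 and publication 20110185260 describes grouping partially faulty blocks so that every bit position has at least one non-faulty copy, writing data to every block in the group, and reading each bit from the block named by a per-bit indicator. The Python below is one possible software model of that idea under stated assumptions. The 8-bit block width, the stuck-at fault model, and the names `Block`, `BlockGroup`, `stuck_mask`, `stuck_value` and `selectors` are all hypothetical choices made for this example, and the patented circuitry may work quite differently.

```python
from dataclasses import dataclass
from typing import List

WIDTH = 8  # bits per block; an assumption made for this sketch


@dataclass
class Block:
    """A storage block whose faulty bits are modelled as stuck-at values."""
    stuck_mask: int = 0   # 1 bits mark faulty bit positions (hypothetical model)
    stuck_value: int = 0  # value that the faulty positions always return
    cells: int = 0

    def write(self, data: int) -> None:
        # Healthy positions take the new data; faulty positions keep their stuck value.
        healthy = data & ~self.stuck_mask
        stuck = self.stuck_value & self.stuck_mask
        self.cells = (healthy | stuck) & ((1 << WIDTH) - 1)

    def read(self) -> int:
        return self.cells


class BlockGroup:
    """Group of faulty blocks plus per-bit indicators of which block to trust."""

    def __init__(self, blocks: List[Block]) -> None:
        self.blocks = blocks
        self.selectors: List[int] = []
        for bit in range(WIDTH):
            # Find the blocks whose copy of this bit position is non-faulty.
            good = [i for i, b in enumerate(blocks)
                    if not (b.stuck_mask >> bit) & 1]
            if not good:
                raise ValueError(f"no non-faulty copy of bit {bit} in this group")
            self.selectors.append(good[0])

    def write(self, data: int) -> None:
        # The data is stored in each of the blocks within the group.
        for block in self.blocks:
            block.write(data)

    def read(self) -> int:
        # Each bit is read from the block that its indicator points to.
        value = 0
        for bit, index in enumerate(self.selectors):
            value |= ((self.blocks[index].read() >> bit) & 1) << bit
        return value


# Two blocks with complementary faults: low nibble bad in one, high nibble bad in the other.
low_faulty = Block(stuck_mask=0b00001111, stuck_value=0b00000101)
high_faulty = Block(stuck_mask=0b11110000, stuck_value=0b10100000)
group = BlockGroup([low_faulty, high_faulty])
group.write(0b11001010)
recovered = group.read()
```

In this model neither block alone returns the written value, but the group does, because the indicators steer each bit position to a block whose copy of that bit is non-faulty. Picking the first non-faulty block for each position is an arbitrary choice made to keep the sketch short.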