Patents by Inventor Ganesh Suryanarayan Dasika
Ganesh Suryanarayan Dasika has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230305923Abstract: A memory system uses error detection codes to detect when errors have occurred in a region of memory. A count of the number of errors is kept and a notification is output in response to the number of errors satisfying a threshold value. The notification is an indication to a host (e.g., a program accessing or managing a machine learning system) that the threshold number of errors have been detected in the region of memory. As long as the number of errors that have been detected in the region of memory remains under the threshold number no notification need be output to the host.Type: ApplicationFiled: March 25, 2022Publication date: September 28, 2023Applicant: Advanced Micro Devices, Inc.Inventors: Sudhanva Gurumurthi, Ganesh Suryanarayan Dasika
-
Patent number: 11663814Abstract: The present disclosure advantageously provides a system and a method for skipping recurrent neural network (RNN) state updates using a skip predictor. Sequential input data are received and divided into sequences of input data values, each input data value being associated with a different time step for a pre-trained RNN model. At each time step, the hidden state vector for a prior time step is received from the pre-trained RNN model, and a determination, based on the input data value and the hidden state vector for at least one prior time step, is made whether to provide or not provide the input data value associated with the time step to the pre-trained RNN model for processing. When the input data value is not provided, the pre-trained RNN model does not update its hidden state vector. Importantly, the skip predictor is trained without retraining the pre-trained RNN model.Type: GrantFiled: April 22, 2020Date of Patent: May 30, 2023Assignee: Arm LimitedInventors: Urmish Ajit Thakker, Jin Tao, Ganesh Suryanarayan Dasika, Jesse Garrett Beu
-
Patent number: 11507841Abstract: The present disclosure advantageously provides a hardware accelerator for a natural language processing application including a first memory, a second memory, and a computing engine (CE). The first memory is configured to store a configurable NLM and a set of NLM fixed weights. The second memory is configured to store an ANN model, a set of ANN weights, a set of NLM delta weights, input data and output data. The set of NLM delta weights may be smaller than the set of NLM fixed weights, and each NLM delta weight corresponds to an NLM fixed weight. The CE is configured to execute the NLM, based on the input data, the set of NLM fixed weights and the set of NLM delta weights, to generate intermediate output data, and execute the ANN model, based on the intermediate output data and the set of ANN weights, to generate the output data.Type: GrantFiled: February 10, 2020Date of Patent: November 22, 2022Assignee: Arm LimitedInventors: Urmish Ajit Thakker, Ganesh Suryanarayan Dasika
-
Patent number: 11494188Abstract: A single instruction multiple thread (SIMT) processor includes execution circuitry, prefetch circuitry and prefetch strategy selection circuitry. The prefetch strategy selection circuitry serves to detect one or more characteristics of a stream of program instructions that are being executed to identify whether or not a given data access instruction within a program will be executed a plurality of times. The prefetch strategy to use is selected from a plurality of selectable prefetch strategies in dependence upon the detection of such detected characteristics.Type: GrantFiled: October 24, 2013Date of Patent: November 8, 2022Assignee: ARM LIMITEDInventors: Ganesh Suryanarayan Dasika, Rune Holm, David Hennah Mansell
-
Patent number: 11468305Abstract: The present disclosure advantageously provides a hybrid memory artificial neural network hardware accelerator that includes a communication bus interface, a static memory, a non-refreshed dynamic memory, a controller and a computing engine. The static memory stores at least a portion of an ANN model. The ANN model includes an input layer, one or more hidden layers and an output layer, ANN basis weights, input data and output data. The non-refreshed dynamic memory is configured to store ANN custom weights for the input, hidden and output layers, and output data. For each layer or layer portion, the computing engine generates the ANN custom weights based on the ANN basis weights, stores the ANN custom weights in the non-refreshed dynamic memory, executes the layer or layer portion, based on inputs and the ANN custom weights, to generate layer output data, and stores the layer output data.Type: GrantFiled: March 18, 2020Date of Patent: October 11, 2022Assignee: Arm LimitedInventors: Urmish Ajit Thakker, Shidhartha Das, Ganesh Suryanarayan Dasika
-
Publication number: 20210295137Abstract: The present disclosure advantageously provides a hybrid memory artificial neural network hardware accelerator that includes a communication bus interface, a static memory, a non-refreshed dynamic memory, a controller and a computing engine. The static memory stores at least a portion of an ANN model. The ANN model includes an input layer, one or more hidden layers and an output layer, ANN basis weights, input data and output data. The non-refreshed dynamic memory is configured to store ANN custom weights for the input, hidden and output layers, and output data. For each layer or layer portion, the computing engine generates the ANN custom weights based on the ANN basis weights, stores the ANN custom weights in the non-refreshed dynamic memory, executes the layer or layer portion, based on inputs and the ANN custom weights, to generate layer output data, and stores the layer output data.Type: ApplicationFiled: March 18, 2020Publication date: September 23, 2021Applicant: Arm LimitedInventors: Urmish Ajit Thakker, Shidhartha Das, Ganesh Suryanarayan Dasika
-
Publication number: 20210248008Abstract: The present disclosure advantageously provides a hardware accelerator for a natural language processing application including a first memory, a second memory, and a computing engine (CE). The first memory is configured to store a configurable NLM and a set of NLM fixed weights. The second memory is configured to store an ANN model, a set of ANN weights, a set of NLM delta weights, input data and output data. The set of NLM delta weights may be smaller than the set of NLM fixed weights, and each NLM delta weight corresponds to an NLM fixed weight. The CE is configured to execute the NLM, based on the input data, the set of NLM fixed weights and the set of NLM delta weights, to generate intermediate output data, and execute the ANN model, based on the intermediate output data and the set of ANN weights, to generate the output data.Type: ApplicationFiled: February 10, 2020Publication date: August 12, 2021Inventors: Urmish Ajit Thakker, Ganesh Suryanarayan Dasika
-
Publication number: 20210056422Abstract: The present disclosure advantageously provides a system and a method for skipping recurrent neural network (RNN) state updates using a skip predictor. Sequential input data are received and divided into sequences of input data values, each input data value being associated with a different time step for a pre-trained RNN model. At each time step, the hidden state vector for a prior time step is received from the pre-trained RNN model, and a determination, based on the input data value and the hidden state vector for at least one prior time step, is made whether to provide or not provide the input data value associated with the time step to the pre-trained RNN model for processing. When the input data value is not provided, the pre-trained RNN model does not update its hidden state vector. Importantly, the skip predictor is trained without retraining the pre-trained RNN model.Type: ApplicationFiled: April 22, 2020Publication date: February 25, 2021Applicant: Arm LimitedInventors: Urmish Ajit Thakker, Jin Tao, Ganesh Suryanarayan Dasika, Jesse Garrett Beu
-
Patent number: 10607147Abstract: A method for estimating a number of occupants in a region comprises receiving a time series of sensor values detected over a period of time by a motion sensor sensing motion in the region. A spread parameter indicative of the spread of the sensor values is determined. The number of occupants in the region is estimated based on the spread parameter.Type: GrantFiled: June 15, 2016Date of Patent: March 31, 2020Assignee: ARM LimitedInventors: Yordan Petrov Raykov, Emre Özer, Ganesh Suryanarayan Dasika
-
Publication number: 20170364817Abstract: A method for estimating a number of occupants in a region comprises receiving a time series of sensor values detected over a period of time by a motion sensor sensing motion in the region. A spread parameter indicative of the spread of the sensor values is determined. The number of occupants in the region is estimated based on the spread parameter.Type: ApplicationFiled: June 15, 2016Publication date: December 21, 2017Inventors: Yordan Petrov RAYKOV, Emre Özer, Ganesh Suryanarayan DASIKA
-
Patent number: 9582419Abstract: A data processing device 100 comprises a plurality of storage circuits 130, 160, which store a plurality of data elements of the bits in an interleaved manner. Data processing device also comprises a consumer 110 with a number of lanes 120. The consumer is able to individually access each of the plurality of storage circuits 130, 160 in order to receive into the lanes 120 either a subset of the plurality of data elements or y bits of each of the plurality of data elements. The consumer 110 is also able to execute a common instruction of each of the plurality of lanes 120. The relationship of the bits is such that b is greater than y and is an integer multiple of y. Each of the plurality of storage circuits 130, 160 stores at most y bits of each of the data elements. Furthermore, each of the storage circuits 130, 160 stores at most y/b of the plurality of data elements. By carrying out the interleaving in this manner, the plurality of storage circuits 130, 160 comprise no more than b/y storage circuits.Type: GrantFiled: October 25, 2013Date of Patent: February 28, 2017Assignee: ARM LimitedInventors: Ganesh Suryanarayan Dasika, Rune Holm, Stephen John Hill
-
Patent number: 9037835Abstract: A data processing device includes processing circuitry 20 for executing a first memory access instruction to a first address of a memory device 40 and a second memory access instruction to a second address of the memory device 40, the first address being different from the second address. The data processing device also includes prefetching circuitry 30 for prefetching data from the memory device 40 based on a stride length 70 and instruction analysis circuitry 50 for determining a difference between the first address and the second address. Stride refining circuitry 60 is also provided to refine the stride length based on factors of the stride length and factors of the difference calculated by the instruction analysis circuitry 50.Type: GrantFiled: October 24, 2013Date of Patent: May 19, 2015Assignee: ARM LimitedInventors: Ganesh Suryanarayan Dasika, Rune Holm
-
Publication number: 20150134933Abstract: A data processing apparatus and method of data processing are disclosed. An instruction execution unit executes a sequence of program instructions, wherein execution of at least some of the program instructions initiates memory access requests to retrieve data values from a memory. A prefetch unit prefetches data values from the memory for storage in a cache unit before they are requested by the instruction execution unit. The prefetch unit is configured to perform a miss response comprising increasing a number of the future data values which it prefetches, when a memory access request specifies a pending data value which is already subject to prefetching but is not yet stored in the cache unit. The prefetch unit is also configured, in response to an inhibition condition being met, to temporarily inhibit the miss response for an inhibition period.Type: ApplicationFiled: November 14, 2013Publication date: May 14, 2015Applicant: ARM LimitedInventors: Rune HOLM, Ganesh Suryanarayan Dasika
-
Publication number: 20150121014Abstract: A data processing device includes processing circuitry 20 for executing a first memory access instruction to a first address of a memory device 40 and a second memory access instruction to a second address of the memory device 40, the first address being different from the second address. The data processing device also includes prefetching circuitry 30 for prefetching data from the memory device 40 based on a stride length 70 and instruction analysis circuitry 50 for determining a difference between the first address and the second address. Stride refining circuitry 60 is also provided to refine the stride length based on factors of the stride length and factors of the difference calculated by the instruction analysis circuitry 50.Type: ApplicationFiled: October 24, 2013Publication date: April 30, 2015Applicant: ARM LIMITEDInventors: Ganesh Suryanarayan DASIKA, Rune HOLM
-
Publication number: 20150121038Abstract: A single instruction multiple thread (SIMT) processor 2 includes execution circuitry 6, prefetch circuitry 12 and prefetch strategy selection circuitry 14. The prefetch strategy selection circuitry serves to detect one or more characteristics of a stream of program instructions that are being executed to identify whether or not a given data access instruction within a program will be executed a plurality of times. The prefetch strategy to use is selected from a plurality of selectable prefetch strategy in dependence upon the detection of such characteristics.Type: ApplicationFiled: October 24, 2013Publication date: April 30, 2015Applicant: ARM LIMITEDInventors: Ganesh Suryanarayan DASIKA, Rune HOLM, David Hennah MANSELL
-
Publication number: 20150121019Abstract: A data processing device 100 comprises a plurality of storage circuits 130, 160, which store a plurality of data elements of the bits in an interleaved manner. Data processing device also comprises a consumer 110 with a number of lanes 120. The consumer is able to individually access each of the plurality of storage circuits 130, 160 in order to receive into the lanes 120 either a subset of the plurality of data elements or y bits of each of the plurality of data elements. The consumer 110 is also able to execute a common instruction of each of the plurality of lanes 120. The relationship of the bits is such that b is greater than y and is an integer multiple of y. Each of the plurality of storage circuits 130, 160 stores at most y bits of each of the data elements. Furthermore, each of the storage circuits 130, 160 stores at most y/b of the plurality of data elements. By carrying out the interleaving in this manner, the plurality of storage circuits 130, 160 comprise no more than b/y storage circuits.Type: ApplicationFiled: October 25, 2013Publication date: April 30, 2015Applicant: ARM LIMITEDInventors: Ganesh Suryanarayan DASIKA, Rune HOLM, Stephen John HILL
-
Patent number: 8230277Abstract: Data storage control circuitry for controlling storage and retrieval of data in a data store in which data is stored in data blocks. A group data store stores data by grouping together blocks that have at least one faulty bit into groups of at least two blocks. For each group of blocks at least one of the blocks has a non-faulty bit for each of the bit locations in the blocks. A selector data store stores indicators for each group indicating which bits of the blocks within a group are the non-faulty bits. When storing data to a data block within a group, the data is stored in each of the blocks within the group. When retrieving data from a data block within a group, the data is read from respective bits of the blocks within the group as indicated by the indicators.Type: GrantFiled: April 4, 2011Date of Patent: July 24, 2012Assignees: ARM Limited, The Regents of the University of MichiganInventors: Trevor Nigel Mudge, Ganesh Suryanarayan Dasika, David Andrew Roberts
-
Patent number: 8145960Abstract: Data storage control circuitry for controlling storage and retrieval of data in a data store in which data is stored in data blocks. A group data store stores data by grouping together blocks that have at least one faulty bit into groups of at least two blocks. For each group of blocks at least one of the blocks has a non-faulty bit for each of the bit locations in the blocks. A selector data store stores indicators for each group indicating which bits of the blocks within a group are the non-faulty bits. When storing data to a data block within a group, the data is stored in each of the blocks within the group. When retrieving data from a data block within a group, the data is read from respective bits of the blocks within the group as indicated by the indicators.Type: GrantFiled: July 2, 2007Date of Patent: March 27, 2012Assignees: ARM Limited, The Regents of the University of MichiganInventors: Trevor Nigel Mudge, Ganesh Suryanarayan Dasika, David Andrew Roberts
-
Publication number: 20110185260Abstract: Data storage control circuitry for controlling storage and retrieval of data in a data store in which data is stored in data blocks. A group data store stores data by grouping together blocks that have at least one faulty bit into groups of at least two blocks. For each group of blocks at least one of the blocks has a non-faulty bit for each of the bit locations in the blocks. A selector data store stores indicators for each group indicating which bits of the blocks within a group are the non-faulty bits. When storing data to a data block within a group, the data is stored in each of the blocks within the group. When retrieving data from a data block within a group, the data is read from respective bits of the blocks within the group as indicated by the indicators.Type: ApplicationFiled: April 4, 2011Publication date: July 28, 2011Inventors: Trevor Nigel Mudge, Ganesh Suryanarayan Dasika, David Andrew Roberts
-
Patent number: 7945811Abstract: To prevent short path errors from occurring in systems having error detection and recovery mechanisms, functional elements are combined to form compound functional units comprising at least two evaluation stages, each evaluation stage including at least one functional element. At least one functional element includes error detection/recovery circuitry. The flow of input values to the first evaluation stage in the compound functional unit is controlled so that the input values are changed at most every second clock cycle.Type: GrantFiled: October 2, 2008Date of Patent: May 17, 2011Assignees: ARM Limited, The Regents of the University of MichiganInventors: David Michael Bull, Ganesh Suryanarayan Dasika