Patents by Inventor Kiran Gunnam
Kiran Gunnam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11797830
Abstract: An apparatus includes a tensor compute cluster having a plurality of tensor compute units to process a plurality of sub-feature maps in a machine learning application and a tensor memory cluster having a plurality of tensor feature map memory units to store the plurality of sub-feature maps. The apparatus also includes circuitry to partition an input feature map into the plurality of sub-feature maps such that sparsity in each of the plurality of sub-feature maps satisfies a predetermined threshold, and assign each of the plurality of sub-feature maps to one of the plurality of tensor compute units and one of the plurality of tensor feature map memory units for processing in parallel.
Type: Grant
Filed: March 25, 2020
Date of Patent: October 24, 2023
Assignee: Western Digital Technologies, Inc.
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
-
Patent number: 11755683
Abstract: An apparatus includes a first tensor compute cluster configured to receive first input feature tensors, a second tensor compute cluster configured to receive second input feature tensors more sparse than the first input feature tensors, and a vector accelerator. The apparatus also includes circuitry configured to partition an input feature map into a plurality of input feature tensors based on a compression criteria and assign each of the plurality of input feature tensors to one of the first tensor compute cluster, the second tensor compute cluster, or the vector accelerator based upon at least one of parameters including a sparsity and an optimization parameter.
Type: Grant
Filed: December 23, 2019
Date of Patent: September 12, 2023
Assignee: Western Digital Technologies, Inc.
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
-
Patent number: 11462003
Abstract: A system with a multiplication circuit having a plurality of multipliers is disclosed. Each of the plurality of multipliers is configured to receive a data value and a weight value to generate a product value in a convolution operation of a machine learning application. The system also includes an accumulator configured to receive the product value from each of the plurality of multipliers and a register bank configured to store an output of the convolution operation. The accumulator is further configured to receive a portion of values stored in the register bank and combine the received portion of values with the product values to generate combined values. The register bank is further configured to replace the portion of values with the combined values.
Type: Grant
Filed: March 25, 2020
Date of Patent: October 4, 2022
Assignee: Western Digital Technologies, Inc.
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
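The read-combine-replace loop described in this abstract can be pictured with a short software sketch. This is only an illustration under stated assumptions, not the patented circuit: the function name `mac_step`, the `start` offset, and the choice of element-wise addition as the "combine" operation are all hypothetical.

```python
from typing import List


def mac_step(data: List[float], weights: List[float],
             register_bank: List[float], start: int) -> None:
    """Multiply data by weights, then fold the products into a slice of the bank.

    Assumption: "combine" means element-wise addition, and the portion read
    from the register bank is the contiguous slice beginning at `start`.
    """
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    # One product per (hypothetical) multiplier.
    products = [d * w for d, w in zip(data, weights)]
    end = start + len(products)
    if start < 0 or end > len(register_bank):
        raise ValueError("portion falls outside the register bank")
    # The accumulator reads a portion of the stored values ...
    portion = register_bank[start:end]
    # ... combines it with the new products ...
    combined = [p + q for p, q in zip(portion, products)]
    # ... and the register bank replaces that portion with the result.
    register_bank[start:end] = combined
```

Calling `mac_step` repeatedly on the same slice accumulates partial sums in place, which is the behavior the abstract attributes to the accumulator and register bank working together.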
-
Publication number: 20210303976
Abstract: An apparatus includes a tensor compute cluster having a plurality of tensor compute units to process a plurality of sub-feature maps in a machine learning application and a tensor memory cluster having a plurality of tensor feature map memory units to store the plurality of sub-feature maps. The apparatus also includes circuitry to partition an input feature map into the plurality of sub-feature maps such that sparsity in each of the plurality of sub-feature maps satisfies a predetermined threshold, and assign each of the plurality of sub-feature maps to one of the plurality of tensor compute units and one of the plurality of tensor feature map memory units for processing in parallel.
Type: Application
Filed: March 25, 2020
Publication date: September 30, 2021
Applicant: Western Digital Technologies, Inc.
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
-
Publication number: 20210303909
Abstract: A system with a multiplication circuit having a plurality of multipliers is disclosed. Each of the plurality of multipliers is configured to receive a data value and a weight value to generate a product value in a convolution operation of a machine learning application. The system also includes an accumulator configured to receive the product value from each of the plurality of multipliers and a register bank configured to store an output of the convolution operation. The accumulator is further configured to receive a portion of values stored in the register bank and combine the received portion of values with the product values to generate combined values. The register bank is further configured to replace the portion of values with the combined values.
Type: Application
Filed: March 25, 2020
Publication date: September 30, 2021
Applicant: Western Digital Technologies, Inc.
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
-
Publication number: 20210191733
Abstract: An apparatus includes a first tensor compute cluster configured to receive first input feature tensors, a second tensor compute cluster configured to receive second input feature tensors more sparse than the first input feature tensors, and a vector accelerator. The apparatus also includes circuitry configured to partition an input feature map into a plurality of input feature tensors based on a compression criteria and assign each of the plurality of input feature tensors to one of the first tensor compute cluster, the second tensor compute cluster, or the vector accelerator based upon at least one of parameters including a sparsity and an optimization parameter.
Type: Application
Filed: December 23, 2019
Publication date: June 24, 2021
Applicant: Western Digital Technologies, Inc.
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
-
Patent number: 10372530
Abstract: Systems and methods are provided for encoding data based on an LDPC code using various inversion mechanisms to obtain parity bits. In some embodiments, an LDPC encoder may compute parity bits using a speculative recursion and correction mechanism. In these embodiments, the LDPC encoder may initiate a recursion using at least one speculative value in place of the actual value for a parity component. The speculative values may then be corrected using a correction factor. In other embodiments, an LDPC encoder is provided that can perform a blockwise inversion mechanism. This mechanism may be used on LDPC codes with parity check matrices having a parity portion composed partially of a large triangular matrix. In still other embodiments, a generic LDPC encoder is provided. The generic LDPC encoder can implement a variety of different encoding techniques, such as different inversion mechanisms, and may be processor-based or finite state machine-based.
Type: Grant
Filed: May 26, 2017
Date of Patent: August 6, 2019
Assignee: Marvell International Ltd.
Inventors: Kiran Gunnam, Nedeljko Varnica
-
Patent number: 10223018
Abstract: The amount of remapping data in a file system of a memory device is reduced. In one aspect, for each request access, e.g., read or write operation, the memory cells of a primary physical address are evaluated. If the evaluation indicates the memory cells are good, the read or write operation proceeds. If the memory cells have a failure such as uncorrectable errors, the primary physical address is hashed to obtain an auxiliary physical address. If the auxiliary physical address is not available, the primary physical address can be hashed again to obtain another auxiliary physical address. In another aspect, per-page remapping is performed until a threshold number of bad pages in a block are detected, after which the entire block is remapped. In another aspect, pages of a block are remapped to auxiliary pages based on a block identifier.
Type: Grant
Filed: April 19, 2017
Date of Patent: March 5, 2019
Assignee: SanDisk Technologies LLC
Inventors: Kiran Gunnam, Robert Mateescu
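The first aspect of this abstract (evaluate the primary address, hash it on failure, hash again if the auxiliary address is unavailable) can be sketched as follows. Everything specific here is an assumption made for illustration: the function name `remap_address`, the use of SHA-256 with an attempt counter as the rehash, the `aux_base`/`aux_size` auxiliary region, and the `max_attempts` cap are not taken from the patent.

```python
import hashlib
from typing import Callable, Optional


def remap_address(primary: int,
                  is_usable: Callable[[int], bool],
                  aux_base: int,
                  aux_size: int,
                  max_attempts: int = 4) -> Optional[int]:
    """Return an address to use for `primary`, or None if none was found.

    Assumptions: `is_usable` stands in for the memory-cell evaluation, and
    rehashing is modeled by salting the hash input with an attempt counter.
    """
    # Good cells: the read or write proceeds at the primary address.
    if is_usable(primary):
        return primary
    # Failed cells: derive auxiliary addresses from the primary address alone,
    # so no per-address remapping table needs to be stored.
    for attempt in range(max_attempts):
        digest = hashlib.sha256(f"{primary}:{attempt}".encode()).digest()
        candidate = aux_base + int.from_bytes(digest[:8], "big") % aux_size
        if is_usable(candidate):
            return candidate
    return None
```

Because the auxiliary address is recomputed from the primary address on every access, the same failed address always resolves to the same location, which is one way to read the abstract's claim that the amount of stored remapping data is reduced.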
-
Publication number: 20180307431
Abstract: The amount of remapping data in a file system of a memory device is reduced. In one aspect, for each request access, e.g., read or write operation, the memory cells of a primary physical address are evaluated. If the evaluation indicates the memory cells are good, the read or write operation proceeds. If the memory cells have a failure such as uncorrectable errors, the primary physical address is hashed to obtain an auxiliary physical address. If the auxiliary physical address is not available, the primary physical address can be hashed again to obtain another auxiliary physical address. In another aspect, per-page remapping is performed until a threshold number of bad pages in a block are detected, after which the entire block is remapped. In another aspect, pages of a block are remapped to auxiliary pages based on a block identifier.
Type: Application
Filed: April 19, 2017
Publication date: October 25, 2018
Applicant: SanDisk Technologies LLC
Inventors: Kiran Gunnam, Robert Mateescu
-
Patent number: 9667272
Abstract: Systems and methods are provided for encoding data based on an LDPC code using various inversion mechanisms to obtain parity bits. In some embodiments, an LDPC encoder may compute parity bits using a speculative recursion and correction mechanism. In these embodiments, the LDPC encoder may initiate a recursion using at least one speculative value in place of the actual value for a parity component. The speculative values may then be corrected using a correction factor. In other embodiments, an LDPC encoder is provided that can perform a blockwise inversion mechanism. This mechanism may be used on LDPC codes with parity check matrices having a parity portion composed partially of a large triangular matrix. In still other embodiments, a generic LDPC encoder is provided. The generic LDPC encoder can implement a variety of different encoding techniques, such as different inversion mechanisms, and may be processor-based or finite state machine-based.
Type: Grant
Filed: July 16, 2015
Date of Patent: May 30, 2017
Assignee: Marvell International Ltd.
Inventors: Kiran Gunnam, Nedeljko Varnica
-
Patent number: 9256487
Abstract: Systems and methods are provided for selecting precisions during iterative decoding with a low-density parity check (LDPC) decoder in order to maximize LDPC code's performance in the error floor region. The selection of the precision of the messages may be done in such a way as to avoid catastrophic errors and to minimize the number of near-codeword errors during the decoding process. Another system and method to avoid catastrophic errors in the layered (serial) LDPC decoder is provided. Lastly, a system and method that select precisions and provide circuitry that optimizes the exchange of information between a soft-input, soft-output (SISO) channel detector and an error correction code (ECC) decoder for channels with memory is provided.
Type: Grant
Filed: May 21, 2014
Date of Patent: February 9, 2016
Assignee: Marvell International Ltd.
Inventors: Nedeljko Varnica, Gregory Burd, Kiran Gunnam
-
Patent number: 9088301
Abstract: Systems and methods are provided for encoding data based on an LDPC code using various inversion mechanisms to obtain parity bits. In some embodiments, an LDPC encoder may compute parity bits using a speculative recursion and correction mechanism. In these embodiments, the LDPC encoder may initiate a recursion using at least one speculative value in place of the actual value for a parity component. The speculative values may then be corrected using a correction factor. In other embodiments, an LDPC encoder is provided that can perform a blockwise inversion mechanism. This mechanism may be used on LDPC codes with parity check matrices having a parity portion composed partially of a large triangular matrix. In still other embodiments, a generic LDPC encoder is provided. The generic LDPC encoder can implement a variety of different encoding techniques, such as different inversion mechanisms, and may be processor-based or finite state machine-based.
Type: Grant
Filed: November 20, 2013
Date of Patent: July 21, 2015
Assignee: Marvell International Ltd.
Inventors: Kiran Gunnam, Nedeljko Varnica
-
Patent number: 8976876
Abstract: In one embodiment, a configurable communications system accommodates a plurality of different transmission word sizes. In a transmit path, the system inserts a number of padding bits corresponding to missing user-data bits onto the end of an input data sequence to generate a set of data having N bits. The N bits are interleaved and error-correction (EC) encoded to generate parity bits corresponding to an EC codeword. The parity bits are de-interleaved and multiplexed with the input data stream to generate a transmission word. In a receive path, a channel detector recovers channel values corresponding to the transmission word. Padding values, corresponding to the missing-bit locations, are inserted among the channel values. The resulting channel values are interleaved and EC decoded to recover the EC codeword. The data bits of the codeword are de-interleaved, and the padding bits corresponding to the missing channel values are discarded.
Type: Grant
Filed: October 25, 2010
Date of Patent: March 10, 2015
Assignee: LSI Corporation
Inventor: Kiran Gunnam
-
Patent number: 8918704
Abstract: Building and using sub-sets of configurations sets are provided to compute the check-nodes update by using a particular representation of the input messages, called here-after trellis-EMS (T-EMS). In a main aspect, the system provides a decoding method to compute dc output vectors of a non-binary parity-check (NBPC) equation decoding unit used for LDPC check codes defined in a NB space.
Type: Grant
Filed: March 12, 2013
Date of Patent: December 23, 2014
Inventors: David Declercq, Erbao Li, Kiran Gunnam
-
Patent number: 8782320
Abstract: In one embodiment, a multistage interconnection network (MIN) has two or more configurable stages, each stage having a plurality of switches. The network has one or more unused input terminals, each mapped using fixed switch connections to an unused output terminal. The network also has a set of used input terminals that are selectively mapped to a set of used output terminals based on values of control signals supplied to the stages. Each stage receives a different control signal, and each control signal is generated by cyclically shifting a control seed by a corresponding cyclic-shift value. Fixing the mappings of the unused terminals ensures that the used input terminals are not mapped to any unused output terminals. By storing only the control seed, memory requirements are reduced over networks that explicitly store individual control signals for all of the stages.
Type: Grant
Filed: November 9, 2010
Date of Patent: July 15, 2014
Assignee: LSI Corporation
Inventor: Kiran Gunnam
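The seed-based control generation in this abstract, where each stage's control signal is a cyclic shift of one stored seed, can be illustrated with a few lines of code. This is a minimal sketch under assumptions: the names `rotate_left` and `stage_controls`, the left-rotation direction, and the fixed bit `width` are hypothetical, and the switch fabric itself is not modeled.

```python
from typing import List


def rotate_left(bits: int, shift: int, width: int) -> int:
    """Cyclically shift a `width`-bit word left by `shift` positions."""
    mask = (1 << width) - 1
    bits &= mask
    shift %= width
    return ((bits << shift) | (bits >> (width - shift))) & mask


def stage_controls(seed: int, shifts: List[int], width: int) -> List[int]:
    """Derive one control word per stage from a single stored seed.

    Assumption: each stage's control signal is the seed rotated left by that
    stage's cyclic-shift value; the rotation direction is not from the source.
    """
    return [rotate_left(seed, s, width) for s in shifts]
```

Only `seed` and the per-stage shift values need to be kept, instead of one full control word per stage, which matches the memory saving the abstract describes.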
-
Patent number: 8769382
Abstract: Systems and methods are provided for selecting precisions during iterative decoding with a low-density parity check (LDPC) decoder in order to maximize LDPC code's performance in the error floor region. The selection of the precision of the messages may be done in such a way as to avoid catastrophic errors and to minimize the number of near-codeword errors during the decoding process. Another system and method to avoid catastrophic errors in the layered (serial) LDPC decoder is provided. Lastly, a system and method that select precisions and provide circuitry that optimizes the exchange of information between a soft-input, soft-output (SISO) channel detector and an error correction code (ECC) decoder for channels with memory is provided.
Type: Grant
Filed: September 11, 2012
Date of Patent: July 1, 2014
Assignee: Marvell International Ltd.
Inventors: Nedeljko Varnica, Gregory Burd, Kiran Gunnam
-
Patent number: 8768990
Abstract: In one embodiment, a reconfigurable cyclic shifter arrangement has first and second reconfigurable cyclic shifters connected in series that are each selectively and independently configurable to operate in any one of three different modes at a time. In a first mode, the reconfigurable cyclic shifter is configured as four 4×4 cyclic shifters to cyclically shift four sets of four input values. In a second mode, the reconfigurable cyclic shifter is configured as two 8×8 cyclic shifters to cyclically shift two sets of eight input values. In a third mode, the reconfigurable cyclic shifter is configured as one 16×16 cyclic shifter to cyclically shift one set of 16 input values. Because the first and second reconfigurable cyclic shifters are independently configurable, there are nine different configurations of the reconfigurable cyclic shifter arrangement.
Type: Grant
Filed: November 11, 2011
Date of Patent: July 1, 2014
Assignee: LSI Corporation
Inventors: Kiran Gunnam, Madhusudan Kalluri
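The three modes in this abstract, and the nine configurations obtained by chaining two shifters, can be modeled in software. The sketch below is only a behavioral illustration with hypothetical names (`reconfigurable_shift`, `shifter_arrangement`); it assumes each sub-shifter takes its own shift amount and rotates left, neither of which is stated in the abstract.

```python
from typing import List, Sequence, Tuple

# Mode number -> sub-shifter size: four 4x4, two 8x8, or one 16x16.
GROUP_SIZE = {1: 4, 2: 8, 3: 16}


def reconfigurable_shift(values: Sequence[int], mode: int,
                         shifts: Sequence[int]) -> List[int]:
    """Cyclically shift 16 values as independent groups chosen by `mode`.

    Assumption: one left-rotation amount is supplied per group.
    """
    if len(values) != 16:
        raise ValueError("expected exactly 16 input values")
    group = GROUP_SIZE[mode]
    if len(shifts) != 16 // group:
        raise ValueError("expected one shift amount per group")
    out: List[int] = []
    for g, s in enumerate(shifts):
        block = list(values[g * group:(g + 1) * group])
        s %= group
        out.extend(block[s:] + block[:s])
    return out


def shifter_arrangement(values: Sequence[int],
                        first: Tuple[int, Sequence[int]],
                        second: Tuple[int, Sequence[int]]) -> List[int]:
    """Two reconfigurable shifters in series, each with its own (mode, shifts)."""
    return reconfigurable_shift(reconfigurable_shift(values, *first), *second)
```

Since `first` and `second` each pick one of three modes independently, the pair covers 3 × 3 = 9 mode combinations, which is the count given in the abstract.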
-
Patent number: 8700976
Abstract: In one embodiment, a turbo equalizer has an LDPC decoder, a channel detector, and one or more adjustment blocks for recovering an LDPC codeword from a set of input samples. The decoder attempts to recover the codeword from an initial set of channel soft-output values and generates a set of extrinsic soft-output values, each corresponding to a bit of the codeword. If the decoder converges on a trapping set, then the channel detector performs detection on the set of input samples to generate a set of updated channel soft-output values, using the extrinsic soft-output values to improve the detection. The one or more adjustment blocks adjust at least one of (i) the extrinsic soft-output values before the channel detection and (ii) the updated channel soft-output values. Subsequent decoding is then performed on the updated and possibly-adjusted channel soft-output values to attempt to recover the codeword.
Type: Grant
Filed: August 12, 2009
Date of Patent: April 15, 2014
Assignee: LSI Corporation
Inventors: Kiran Gunnam, Shaohua Yang, Changyou Xu
-
Patent number: 8683306
Abstract: Various embodiments of the present invention provide systems and methods for data processing. For example, a data processing system is disclosed that includes a channel detector circuit. The channel detector circuit includes a branch metric calculator circuit that is operable to receive a number of violated checks from a preceding stage, and to scale an intrinsic branch metric using a scalar selected based at least in part on the number of violated checks to yield a scaled intrinsic branch metric.
Type: Grant
Filed: January 4, 2010
Date of Patent: March 25, 2014
Assignee: LSI Corporation
Inventors: Shaohua Yang, Weijun Tan, Zongwang Li, Kiran Gunnam
-
Patent number: 8683299
Abstract: In one embodiment, a turbo equalizer has a channel detector that receives equalized samples and generates channel soft-output values. An LDPC decoder attempts to decode the channel soft-output values to recover an LDPC-encoded codeword. If the decoder converges on a trapping set, then an adjustment block selects one or more of the equalized samples based on one or more specified conditions and adjusts the selected equalized samples. Selection may be performed by identifying the locations of unsatisfied check nodes of the last local decoder iteration and selecting the equalized samples that correspond to bit nodes of the LDPC-encoded codeword that are connected to the unsatisfied check nodes. Adjustment of the equalized samples may be performed using any combination of scaling, offsetting, and saturation. Channel detection is then performed using the adjusted equalized samples to generate an updated set of channel soft-output values, which are subsequently decoded by the decoder.
Type: Grant
Filed: August 12, 2009
Date of Patent: March 25, 2014
Assignee: LSI Corporation
Inventors: Kiran Gunnam, Shaohua Yang, Changyou Xu
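The selection and adjustment steps in this abstract can be sketched in a few lines. This is a simplified illustration, not the patented adjustment block: the function names, the binary parity-check matrix `H` given as lists of 0/1, the one-sample-per-bit mapping, and the default `scale`, `offset`, and `limit` values are all assumptions.

```python
from typing import List, Sequence


def bits_on_unsatisfied_checks(H: Sequence[Sequence[int]],
                               hard_bits: Sequence[int]) -> List[int]:
    """Indices of bit nodes connected to at least one unsatisfied check node."""
    selected = set()
    for row in H:
        # A check is unsatisfied when its parity over the hard decisions is odd.
        if sum(h & b for h, b in zip(row, hard_bits)) % 2 == 1:
            selected.update(j for j, h in enumerate(row) if h)
    return sorted(selected)


def adjust_samples(samples: Sequence[float], selected: Sequence[int],
                   scale: float = 0.5, offset: float = 0.0,
                   limit: float = 1.0) -> List[float]:
    """Scale, offset, then saturate only the selected samples.

    Assumption: each equalized sample corresponds to exactly one codeword bit.
    """
    adjusted = list(samples)
    for i in selected:
        value = samples[i] * scale + offset
        adjusted[i] = max(-limit, min(limit, value))
    return adjusted
```

In this reading, the adjusted samples would then be handed back to the channel detector for another detection pass, while samples not tied to unsatisfied checks pass through unchanged.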