Patents by Inventor Deepak Mathew
Deepak Mathew has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12159140
Abstract: An electronic device receives a single instruction to apply a neural network operation to a set of M-bit elements stored in one or more input vector registers to initiate a sequence of computational operations related to a neural network. In response to the single instruction, the electronic device implements the neural network operation on the set of M-bit elements to generate a set of P-bit elements by obtaining the set of M-bit elements from the one or more input vector registers, quantizing each of the set of M-bit elements from M bits to P bits, and packing the set of P-bit elements into an output vector register. P is smaller than M. In some embodiments, the neural network operation is a quantization operation including at least a multiplication with a quantization factor and an addition with a zero point.
Type: Grant
Filed: April 28, 2022
Date of Patent: December 3, 2024
Assignee: QUALCOMM Incorporated
Inventors: Srijesh Sudarsanan, Deepak Mathew, Marc Hoffman, Sundar Rajan Balasubramanian, Mansi Jain, James Lee, Gerald Sweeney
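The quantize-and-pack sequence described in this abstract can be illustrated with a minimal Python sketch. It is not the patented instruction: the function names, the signed 8-bit default for P, the saturation step, and the use of Python's `round` (which rounds ties to even) are all assumptions made for illustration.

```python
def quantize(elements, scale, zero_point, p_bits=8):
    """Quantize wide integers to signed P-bit values (hypothetical helper).

    Each element is multiplied by a quantization factor, offset by a zero
    point, and saturated to the signed P-bit range (saturation is an
    assumption; the abstract does not specify overflow handling).
    """
    lo, hi = -(1 << (p_bits - 1)), (1 << (p_bits - 1)) - 1
    out = []
    for x in elements:
        q = round(x * scale) + zero_point  # multiply, then add zero point
        out.append(max(lo, min(hi, q)))    # clamp to the P-bit range
    return out


def pack(values, p_bits=8):
    """Pack P-bit values into one integer, element 0 in the lowest bits."""
    mask = (1 << p_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * p_bits)  # two's-complement bit pattern
    return word


quantized = quantize([1000, -1000, 100000, 0], scale=0.01, zero_point=3)
packed = pack(quantized)
```

Here four wide values become four 8-bit values in a single 32-bit word, which stands in for the output vector register.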
-
Publication number: 20240104356
Abstract: Certain aspects of the present disclosure provide techniques and apparatus for quantized machine learning. A quantized input matrix is accessed at a layer of a neural network, and a first interim value is generated in an accumulator by performing matrix multiplication, using the accumulator, of the quantized input matrix and a quantized weight matrix associated with the layer of the neural network. The first interim value is normalized based at least in part on one or more leading sign bits of the first interim value, and the normalized first interim value is dequantized. A second interim value is generated by applying a rounded right-shift operation to the dequantized normalized first interim value, and activation data is generated by applying an activation function to the second interim value.
Type: Application
Filed: September 22, 2022
Publication date: March 28, 2024
Inventors: Srijesh SUDARSANAN, Deepak MATHEW, Marc HOFFMAN, Sundar Rajan BALASUBRAMANIAN, Gerald SWEENEY, Mansi JAIN, James LEE, Ankita NAYAK
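The steps named in this abstract (accumulate, normalize by leading sign bits, dequantize, rounded right shift, activate) can be sketched in Python as follows. This is one possible reading, not the claimed method: the 32-bit accumulator width, the Q15 fixed-point multiplier used for dequantization, the ReLU activation, and all function names are assumptions.

```python
def count_leading_sign_bits(x, width=32):
    """Count bits after the sign bit that repeat it (x must fit in `width` bits)."""
    magnitude = x if x >= 0 else ~x
    return width - 1 - magnitude.bit_length()


def rounded_right_shift(x, shift):
    """Arithmetic right shift with rounding to nearest (requires shift >= 1)."""
    return (x + (1 << (shift - 1))) >> shift


def quantized_layer(inputs, weights, multiplier_q15):
    """Integer matrix multiply followed by requantization and ReLU (a sketch)."""
    outputs = []
    for row in inputs:
        out_row = []
        for col in zip(*weights):                        # columns of the weight matrix
            acc = sum(a * w for a, w in zip(row, col))   # first interim value
            n = count_leading_sign_bits(acc)
            normalized = acc << n                        # use the full accumulator width
            dequantized = normalized * multiplier_q15    # assumed Q15 scale factor
            second = rounded_right_shift(dequantized, n + 15)  # undo both scalings
            out_row.append(max(0, second))               # assumed ReLU activation
        outputs.append(out_row)
    return outputs


result = quantized_layer([[1, 2, 3]], [[4, -4], [5, -5], [6, -6]], multiplier_q15=16384)
```

With a multiplier of 16384 (0.5 in Q15), the accumulator values 32 and -32 become 16 and, after ReLU, 0.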
-
Patent number: 11900111
Abstract: A device includes a vector register file, a memory, and a processor. The vector register file includes a plurality of vector registers. The memory is configured to store a permutation instruction. The processor is configured to access a periodicity parameter of the permutation instruction. The periodicity parameter indicates a count of a plurality of data sources that contain source data for the permutation instruction. The processor is also configured to execute the permutation instruction to, for each particular element of multiple elements of a first permutation result register of the plurality of vector registers, select a data source of the plurality of data sources based at least in part on the count of the plurality of data sources and populate the particular element based on a value in a corresponding element of the selected data source.
Type: Grant
Filed: September 24, 2021
Date of Patent: February 13, 2024
Assignee: QUALCOMM Incorporated
Inventors: Srijesh Sudarsanan, Deepak Mathew, Marc Hoffman, Gerald Sweeney, Sundar Rajan Balasubramanian, Hongfeng Dong, Yurong Sun, Seyedmehdi Sadeghzadeh
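One way to picture a periodicity-driven permutation is the Python sketch below. The selection rule (element index modulo the source count) and the choice of the same index as the "corresponding element" are assumptions; the abstract states only that the selection depends on the count of data sources.

```python
def permute(sources, periodicity):
    """Build a result register by cycling through `periodicity` data sources.

    Assumption: result element i comes from source (i mod periodicity),
    at the same element index i.
    """
    length = len(sources[0])
    return [sources[i % periodicity][i] for i in range(length)]


# Two hypothetical source registers of four elements each.
result = permute([[0, 1, 2, 3], [10, 11, 12, 13]], periodicity=2)
```

With two sources, even-indexed result elements come from the first register and odd-indexed elements from the second.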
-
Publication number: 20230350640
Abstract: A device includes a processor that includes a rotation vector register file, a second vector register file, and multiply-accumulate circuitry (MAC). The rotation vector register file includes a rotation vector register. The rotation vector register file is configured to rotate data in the rotation vector register. The second vector register file includes a source vector register. The MAC is configured to receive first input data from the rotation vector register file and second input data from the source vector register.
Type: Application
Filed: May 2, 2022
Publication date: November 2, 2023
Inventors: Sundar Rajan BALASUBRAMANIAN, Srijesh SUDARSANAN, Marc HOFFMAN, Deepak MATHEW, Gerald SWEENEY, James LEE, Mansi JAIN
-
Publication number: 20230351144
Abstract: This application is directed to using a single instruction to initiate a sequence of computational operations related to a neural network activation function. An electronic device receives a single instruction to apply a linear activation operation to a set of first elements stored in one or more input vector registers. In response to the single instruction, the linear activation operation is implemented on the set of first elements to generate a set of output elements. For each first element, the electronic device detects a sign value of the respective first element, selects a respective scalar from one or more scalars based on the sign value, and applies the linear activation operation on the respective first element based on the selected respective scalar and a bias value to generate a respective element of the set of output elements. The electronic device quantizes the set of output elements.
Type: Application
Filed: April 28, 2022
Publication date: November 2, 2023
Inventors: Srijesh SUDARSANAN, Deepak MATHEW, Marc HOFFMAN, Sundar Rajan BALASUBRAMANIAN, Mansi JAIN, James LEE, Gerald SWEENEY
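The sign-dependent linear activation in this abstract resembles a leaky or parametric ReLU. The Python sketch below is a hedged illustration: the formula `scalar * x + bias`, the two-scalar setup, and the signed 8-bit saturating quantization are assumptions, and the names are hypothetical.

```python
def linear_activation(elements, neg_scalar, pos_scalar, bias, p_bits=8):
    """Apply y = scalar * x + bias, choosing the scalar from the sign of x.

    The result is rounded and saturated to a signed P-bit range, which
    stands in for the final quantization step (an assumption).
    """
    lo, hi = -(1 << (p_bits - 1)), (1 << (p_bits - 1)) - 1
    out = []
    for x in elements:
        scalar = neg_scalar if x < 0 else pos_scalar  # select by sign
        y = scalar * x + bias                         # linear activation
        out.append(max(lo, min(hi, round(y))))        # quantize with saturation
    return out


activated = linear_activation([-8, 4, 1000], neg_scalar=0.25, pos_scalar=1.0, bias=1)
```

Negative inputs are scaled by 0.25 and positive inputs by 1.0, so the example behaves like a leaky ReLU with a bias of 1 before quantization.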
-
Publication number: 20230350678
Abstract: This application is directed to using a single instruction to initiate a sequence of computational operations related to a neural network. An electronic device receives a single instruction to apply a neural network operation to a set of M-bit elements stored in one or more input vector registers. In response to the single instruction, the electronic device implements the neural network operation on the set of M-bit elements to generate a set of P-bit elements by obtaining the set of M-bit elements from the one or more input vector registers, quantizing each of the set of M-bit elements from M bits to P bits, and packing the set of P-bit elements into an output vector register. P is smaller than M. In some embodiments, the neural network operation is a quantization operation including at least a multiplication with a quantization factor and an addition with a zero point.
Type: Application
Filed: April 28, 2022
Publication date: November 2, 2023
Inventors: Srijesh SUDARSANAN, Deepak MATHEW, Marc HOFFMAN, Sundar Rajan BALASUBRAMANIAN, Mansi JAIN, James LEE, Gerald SWEENEY
-
Publication number: 20230097103
Abstract: A device includes a memory configured to store a fast Fourier transform (FFT) instruction and parameters of the FFT instruction, a read-only memory including a phasor table, and a processor. The processor is configured to execute the FFT instruction to determine, based on the parameters of the FFT instruction, a start value and a step size. The processor is configured to execute the FFT instruction to access the phasor table according to the start value and the step size to obtain a set of twiddle values. The processor is also configured to execute the FFT instruction to compute, for each pair of input values in a set of input data, an output value based on the pair of input values and a twiddle value, of the set of twiddle values, that corresponds to that pair of input values.
Type: Application
Filed: September 24, 2021
Publication date: March 30, 2023
Inventors: Santosh Srivatsan Srinivasan, Marc Hoffman, Srijesh Sudarsanan, Deepak Mathew, Hongfeng Dong, Gerald Sweeney
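The strided phasor-table lookup in this abstract can be sketched in Python as follows. The table contents (complex roots of unity), the wrap-around indexing, and the output formula `a + w * b` are assumptions; the abstract says only that each output depends on a pair of inputs and its twiddle value.

```python
import cmath


def make_phasor_table(n):
    """Hypothetical read-only table of the n complex roots of unity."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]


def fetch_twiddles(table, start, step, count):
    """Read `count` twiddle values beginning at `start`, advancing by `step`."""
    return [table[(start + j * step) % len(table)] for j in range(count)]


def fft_step(pairs, table, start, step):
    """Combine each input pair (a, b) with its twiddle w as a + w * b."""
    twiddles = fetch_twiddles(table, start, step, len(pairs))
    return [a + w * b for (a, b), w in zip(pairs, twiddles)]


table = make_phasor_table(8)
outputs = fft_step([(1, 1)] * 4, table, start=0, step=2)
```

A start of 0 and a step of 2 read entries 0, 2, 4, and 6 of an 8-entry table, so one table can serve several FFT stages by changing only the two parameters.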
-
Publication number: 20230102564
Abstract: A device includes a vector register file, a memory, and a processor. The vector register file includes a plurality of vector registers. The memory is configured to store a permutation instruction. The processor is configured to access a periodicity parameter of the permutation instruction. The periodicity parameter indicates a count of a plurality of data sources that contain source data for the permutation instruction. The processor is also configured to execute the permutation instruction to, for each particular element of multiple elements of a first permutation result register of the plurality of vector registers, select a data source of the plurality of data sources based at least in part on the count of the plurality of data sources and populate the particular element based on a value in a corresponding element of the selected data source.
Type: Application
Filed: September 24, 2021
Publication date: March 30, 2023
Inventors: Srijesh SUDARSANAN, Deepak MATHEW, Marc HOFFMAN, Gerald SWEENEY, Sundar Rajan BALASUBRAMANIAN, Hongfeng DONG, Yurong SUN, Seyedmehdi SADEGHZADEH
-
Patent number: 10638346
Abstract: Methods, systems, and devices for wireless communication are described. A user equipment (UE) utilizing enhanced carrier aggregation (eCA) may identify a limit to the number of channel state feedback (CSF) processes it is capable of supporting. The UE may transmit an indication of this limit to a base station, which may configure the UE for channel state reporting, and send channel state reporting triggers according to the indicated limit. The UE's determination of the limit to the number of CSF processes may be based on various transmit or receive antenna configurations. A single trigger may correspond to reports covering multiple subframes and/or component carriers. The base station may also arrange the channel state reporting configuration to reduce the peak number of channel state reports that the UE processes during each subframe. The UE may also determine that a number of channel state processes needed to support channel state reporting in a subframe exceeds its capacity.
Type: Grant
Filed: August 11, 2016
Date of Patent: April 28, 2020
Assignee: QUALCOMM Incorporated
Inventors: Parvathanathan Subrahmanya, Qiang Shen, Aamod Dinkar Khandekar, Mariam Motamed, Wanshi Chen, Hanfang Pan, Deepak Mathew
-
Patent number: 10466967
Abstract: An apparatus includes one or more registers configured to store a vector of input values. The apparatus also includes a coefficient determination unit configured to, responsive to execution by a processor of a single instruction, select a plurality of piecewise analysis coefficients. The plurality of piecewise analysis coefficients includes one or more sets of piecewise analysis coefficients, and each set of piecewise analysis coefficients corresponds to an input value of the vector of input values. The apparatus further includes arithmetic logic circuitry configured to, responsive to the execution of at least the single instruction, determine estimated output values of a function based on the plurality of piecewise analysis coefficients and the vector of input values.
Type: Grant
Filed: July 29, 2016
Date of Patent: November 5, 2019
Assignee: QUALCOMM Incorporated
Inventors: Deepak Mathew, Ajay Anant Ingle, Yurong Sun, Jianming Zhu, Marc Hoffman
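Per-element coefficient selection of the kind this abstract describes can be illustrated with a piecewise-linear approximation in Python. The unit-width segments, the (slope, intercept) coefficient form, and the function names are assumptions; the abstract does not specify the order or layout of the coefficients.

```python
def build_coefficients(f, num_segments):
    """Precompute a (slope, intercept) pair for each unit-width segment [k, k+1)."""
    table = []
    for k in range(num_segments):
        slope = f(k + 1) - f(k)
        intercept = f(k) - slope * k
        table.append((slope, intercept))
    return table


def estimate(inputs, table):
    """Estimate f(x) per element; inputs are assumed to lie in [0, len(table))."""
    out = []
    for x in inputs:
        k = min(int(x), len(table) - 1)   # select the coefficient set for x
        slope, intercept = table[k]
        out.append(slope * x + intercept)  # evaluate the segment's line
    return out


table = build_coefficients(lambda x: x * x, num_segments=4)
estimates = estimate([0.5, 1.5, 2.0], table)
```

Each input picks its own coefficient set, so a whole vector can be estimated in one pass. In this example 1.5 maps to 2.5, close to the true square of 2.25.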
-
Publication number: 20180032311
Abstract: An apparatus includes one or more registers configured to store a vector of input values. The apparatus also includes a coefficient determination unit configured to, responsive to execution by a processor of a single instruction, select a plurality of piecewise analysis coefficients. The plurality of piecewise analysis coefficients includes one or more sets of piecewise analysis coefficients, and each set of piecewise analysis coefficients corresponds to an input value of the vector of input values. The apparatus further includes arithmetic logic circuitry configured to, responsive to the execution of at least the single instruction, determine estimated output values of a function based on the plurality of piecewise analysis coefficients and the vector of input values.
Type: Application
Filed: July 29, 2016
Publication date: February 1, 2018
Inventors: Deepak Mathew, Ajay Anant Ingle, Yurong Sun, Jianming Zhu, Marc Hoffman
-
Publication number: 20170094545
Abstract: Methods, systems, and devices for wireless communication are described. A user equipment (UE) utilizing enhanced carrier aggregation (eCA) may identify a limit to the number of channel state feedback (CSF) processes it is capable of supporting. The UE may transmit an indication of this limit to a base station, which may configure the UE for channel state reporting, and send channel state reporting triggers according to the indicated limit. The UE's determination of the limit to the number of CSF processes may be based on various transmit or receive antenna configurations. A single trigger may correspond to reports covering multiple subframes and/or component carriers. The base station may also arrange the channel state reporting configuration to reduce the peak number of channel state reports that the UE processes during each subframe. The UE may also determine that a number of channel state processes needed to support channel state reporting in a subframe exceeds its capacity.
Type: Application
Filed: August 11, 2016
Publication date: March 30, 2017
Inventors: Parvathanathan Subrahmanya, Qiang Shen, Aamod Dinkar Khandekar, Mariam Motamed, Wanshi Chen, Hanfang Pan, Deepak Mathew
-
Patent number: 9363749
Abstract: A system and method dynamically scale power consumed by the circuitry of an electronic device based on channel state and/or data rate. The electronic device then operates according to the power scaling. The scaling may be in accordance with an effective data rate, a number of multiple input multiple output (MIMO) layers, receiver type, a cell scenario, or a number of carriers. A number of MIMO layers can be predicted based on at least one of channel conditions or a channel quality index (CQI).
Type: Grant
Filed: August 15, 2013
Date of Patent: June 7, 2016
Assignee: QUALCOMM INCORPORATED
Inventors: Deepak Mathew, Garret Webster Shih, Jose Fridman, Robin Lee Brown
-
Patent number: 9342479
Abstract: Systems and methods of data extraction in a vector processor are disclosed. In a particular embodiment, a method of data extraction in a vector processor includes copying at least one data element to a source register of a permutation network. The method includes reordering multiple data elements of the source register, populating a destination register of the permutation network with the reordered data elements, and copying the reordered data elements from the destination register to a memory.
Type: Grant
Filed: August 23, 2012
Date of Patent: May 17, 2016
Assignee: QUALCOMM Incorporated
Inventors: Jose Fridman, Ajay Anant Ingle, Deepak Mathew, Marc M. Hoffman, Michael John Lopez
-
Patent number: 9268571
Abstract: A method includes selectively coupling a first address line of a plurality of address lines and a second address line of the plurality of address lines to a first element bank of a plurality of element banks of a vector register file according to a selection pattern. The method also includes accessing data stored within the first element bank that is selectively addressed by the first address line via a single read port.
Type: Grant
Filed: October 18, 2012
Date of Patent: February 23, 2016
Assignee: QUALCOMM Incorporated
Inventors: Ajay Anant Ingle, Marc M. Hoffman, Deepak Mathew
-
Patent number: 9130786
Abstract: An apparatus includes selection logic configured to select a first subset of a first set of samples stored at a first set of registers. The first subset includes a first sample stored at a first register of the first set of registers and further includes a second sample stored at a second register of the first set of registers. The apparatus further includes shift logic configured to shift a second set of samples stored at a second set of registers. The apparatus further includes a channel estimator configured to generate a first value associated with a channel estimate based on the first subset and further based on a second subset of the shifted second set of samples.
Type: Grant
Filed: March 15, 2013
Date of Patent: September 8, 2015
Assignee: Qualcomm Incorporated
Inventors: Deepak Mathew, Ajay Anant Ingle, Mao Zeng, Marc M. Hoffman
-
Publication number: 20150052330
Abstract: In a particular embodiment, a method includes executing a vector instruction at a processor. The vector instruction includes a vector input that includes a plurality of elements. Executing the vector instruction includes providing a first element of the plurality of elements as a first output. Executing the vector instruction further includes performing an arithmetic operation on the first element and a second element of the plurality of elements to provide a second output. Executing the vector instruction further includes storing the first output and the second output in an output vector.
Type: Application
Filed: August 14, 2013
Publication date: February 19, 2015
Applicant: QUALCOMM Incorporated
Inventors: Ajay Anant Ingle, Marc Murray Hoffman, Deepak Mathew, Mao Zeng
-
Publication number: 20140270017
Abstract: An apparatus includes selection logic configured to select a first subset of a first set of samples stored at a first set of registers. The first subset includes a first sample stored at a first register of the first set of registers and further includes a second sample stored at a second register of the first set of registers. The apparatus further includes shift logic configured to shift a second set of samples stored at a second set of registers. The apparatus further includes a channel estimator configured to generate a first value associated with a channel estimate based on the first subset and further based on a second subset of the shifted second set of samples.
Type: Application
Filed: March 15, 2013
Publication date: September 18, 2014
Applicant: QUALCOMM INCORPORATED
Inventors: Deepak Mathew, Ajay Anant Ingle, Mao Zeng, Marc M. Hoffman
-
Publication number: 20140281368
Abstract: An example method for executing multiple instructions in one or more slots includes receiving a packet including multiple instructions and executing the multiple instructions in one or more slots in a time shared manner. Each slot is associated with an execution data path or a memory data path. An example method for executing at least one instruction in a plurality of phases includes receiving a packet including an instruction, splitting the instruction into a plurality of phases, and executing the instruction in the plurality of phases.
Type: Application
Filed: March 14, 2013
Publication date: September 18, 2014
Applicant: QUALCOMM INCORPORATED
Inventors: Ajay Anant Ingle, Lucian Codrescu, David J. Hoyle, Jose Fridman, Marc M. Hoffman, Deepak Mathew
-
Publication number: 20140115227
Abstract: A method includes selectively coupling a first address line of a plurality of address lines and a second address line of the plurality of address lines to a first element bank of a plurality of element banks of a vector register file according to a selection pattern. The method also includes accessing data stored within the first element bank that is selectively addressed by the first address line via a single read port.
Type: Application
Filed: October 18, 2012
Publication date: April 24, 2014
Applicant: QUALCOMM Incorporated
Inventors: Ajay Anant Ingle, Marc M. Hoffman, Deepak Mathew