Patents by Inventor Venkateswara Madduri

Venkateswara Madduri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11809867
    Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: November 7, 2023
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
  • Publication number: 20230325241
    Abstract: Embodiments for allocating shared resources are disclosed. In an embodiment, an apparatus includes a core and a hardware rate selector. The hardware rate selector is to, in response to a first indication that demand for memory bandwidth from the core has reached a threshold value, determine a delay value to be used to limit allocation of memory bandwidth to the core. The hardware rate selector includes a controller having a first counter to count a second indication of demand for memory bandwidth from the first core and a second counter to count expirations of time windows. The first indication is based on a difference between the first counter value and the second counter value.
    Type: Application
    Filed: September 26, 2020
    Publication date: October 12, 2023
    Applicant: Intel Corporation
    Inventors: Andrew J. Herdrich, Yen-Cheng Liu, Venkateswara Madduri, Krishnakumar K. Ganapathy, Edwin Verplanke, Christopher Gianos, Hanna Alam, Joseph Nuzman, Larisa Novakovsky
  • Patent number: 11755323
    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction.
    Type: Grant
    Filed: February 15, 2022
    Date of Patent: September 12, 2023
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
  • Publication number: 20230098724
    Abstract: Techniques for copying a subset of status flags from a control and status register to a flags register in response to an instruction are described. An exemplary instruction includes a field for an opcode, the opcode to indicate execution circuitry is to copy from a first register a saturation flag value, an overflow value, and a carry value to a second register into one or more instructions of a different instruction set.
    Type: Application
    Filed: September 25, 2021
    Publication date: March 30, 2023
    Inventors: Vedvyas Shanbhogue, Robert Valentine, Mark Charney, Venkateswara Madduri
  • Patent number: 11573799
    Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
    Type: Grant
    Filed: April 9, 2021
    Date of Patent: February 7, 2023
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang
  • Publication number: 20220413861
    Abstract: Techniques for matrix multiplication are described. In some examples, a single instruction having a format of fields for an opcode, one or more fields to indicate a location of a source/destination operand, one or more fields to indicate a location of a first source operand, and one or more fields to indicate a location of a second source operand is used. The opcode is to indicate that execution circuitry is to: multiply values from corresponding data elements of the first and second sources, add a first subset of the multiplied values to a first value from the source/destination operand and store in a first data element position of the source/destination operand, and add a second subset of the multiplied values to a second value from the source/destination operand and store in a second data element position of the source/destination operand.
    Type: Application
    Filed: June 26, 2021
    Publication date: December 29, 2022
    Inventors: Venkateswara Madduri, Cristina Anderson, Robert Valentine, Mark Charney, Vedvyas Shanbhogue
  • Publication number: 20220326946
    Abstract: An apparatus and method for performing a transform on complex data.
    Type: Application
    Filed: January 31, 2022
    Publication date: October 13, 2022
    Applicant: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang
  • Publication number: 20220309005
    Abstract: Techniques for controlling bandwidth in a core are described. An exemplary core includes a memory bandwidth monitor per thread local to the core, each thread's local bandwidth monitor to at least allocate bandwidth for memory requests originating from the thread according to a class of service level stored in a field of quality of service (QoS) model-specific register (MSR), the class of service level pointed to by a class of service field in a platform quality of service MSR; and execution resources to support execution of at least one thread of the core.
    Type: Application
    Filed: March 27, 2021
    Publication date: September 29, 2022
    Inventors: Vedvyas Shanbhogue, Krishnakumar Ganapathy, Venkateswara Madduri, James Allen, James Coleman, Stephen Robinson
  • Publication number: 20220236991
    Abstract: An apparatus and method for performing a packed horizontal addition of words and doublewords. One embodiment of a processor includes a decoder to decode a packed horizontal add instruction which includes an opcode and one or more operands used to identify a plurality of packed words; a source register to store a plurality of packed words; execution circuitry to execute the decoded instruction, and a destination register to store a final result as a packed result word in a designated data element position. The execution circuitry includes operand selection circuitry to identify first and second packed words from the source register in accordance with the operands and opcode; adder circuitry to add the two packed words to generate a temporary sum; a temporary storage of at least 17 bits to store the temporary sum; and saturation circuitry to saturate the temporary sum if necessary to generate the final result.
    Type: Application
    Filed: February 14, 2022
    Publication date: July 28, 2022
    Applicant: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney
  • Publication number: 20220171624
    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction.
    Type: Application
    Filed: February 15, 2022
    Publication date: June 2, 2022
    Applicant: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
  • Publication number: 20220129267
    Abstract: An apparatus and method for performing right-shifting operations on packed quadword data. For example, one processor embodiment comprises a decoder to decode a right-shift instruction, a first source register to store a plurality of packed quadword data elements, and execution circuitry to execute the decoded right-shift instruction. The execution circuitry comprises shift circuitry with sign preservation logic to right-shift first and second packed quadword data elements in the first source register by an amount specified in an immediate value or in a control value in a second source register, the right-shifting to generate first and second right-shifted quadwords, the sign preservation logic to shift in the sign bit. The execution circuitry is to cause selection of 32 most significant bits of the first and second right-shifted quadwords to be written to 32 least significant bit positions of first and second quadword data element locations of a destination register.
    Type: Application
    Filed: November 3, 2021
    Publication date: April 28, 2022
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney
  • Publication number: 20220129273
    Abstract: An apparatus and method for performing signed multiplication of packed signed doublewords and accumulation with a signed quadword. For example, one exemplary processor comprises three registers and execution circuitry. The execution circuitry is to multiply first and second packed signed doubleword data elements from the first register with third and fourth packed signed doubleword data elements from the second register, respectively, to generate first and second temporary products. It is also to select first, second, third, and fourth signed doubleword data elements. It is also to combine the first temporary product with a first packed signed quadword value read from the third register to generate a first accumulated result and to combine the second temporary product with a second packed signed quadword value read from the third register to generate a second accumulated result. The third register is to store the results.
    Type: Application
    Filed: November 3, 2021
    Publication date: April 28, 2022
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal, Venkateswara Madduri
  • Publication number: 20220129268
    Abstract: An apparatus and method for performing right-shifting operations on packed quadword data. For example, one embodiment of a processor comprises a decoder to decode a right-shift instruction, a first source register to store a plurality of packed quadword data elements, and execution circuitry to execute the decoded right-shift instruction. The execution circuitry comprises shift circuitry with sign preservation logic to right-shift first and second packed quadword data elements in the first source register by an amount specified in an immediate value or in a control value in a second source register, the right-shifting to generate first and second right-shifted quadwords, the sign preservation logic to shift in the sign bit. The execution circuitry is to cause selection of 16 most significant bits of the first and second right-shifted quadwords to be written to 16 least significant bit regions of first and second quadword data element locations of a destination register.
    Type: Application
    Filed: November 3, 2021
    Publication date: April 28, 2022
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney
  • Patent number: 11256504
    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
  • Patent number: 11249754
    Abstract: An apparatus and method for performing a packed horizontal addition of words and doublewords. One embodiment of a processor includes a decoder to decode a packed horizontal add instruction which includes an opcode and one or more operands used to identify a plurality of packed words; a source register to store a plurality of packed words; execution circuitry to execute the decoded instruction, and a destination register to store a final result as a packed result word in a designated data element position. The execution circuitry includes operand selection circuitry to identify first and second packed words from the source register in accordance with the operands and opcode; adder circuitry to add the two packed words to generate a temporary sum; a temporary storage of at least 17 bits to store the temporary sum; and saturation circuitry to saturate the temporary sum if necessary to generate the final result.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: February 15, 2022
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney
  • Patent number: 11243765
    Abstract: Apparatus and method to transform complex data including a processor that comprises: multiplier circuitry to multiply packed complex N-bit data elements with packed complex M-bit data elements to generate at least four real products; adder circuitry to subtract a first real product from a second real product to generate a first temporary result, subtract a third real product from a fourth real product to generate a second temporary result, add the first temporary result to a first packed N-bit data element to generate a first pre-scaled result, subtract the first temporary result from the first packed N-bit data element to generate a second pre-scaled result, add the second temporary result to a second packed N-bit data element to generate a third pre-scaled result, and subtract the second temporary result from the second packed N-bit data element to generate a fourth pre-scaled result; and scaling circuitry to scale the pre-scaled results.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: February 8, 2022
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang
  • Publication number: 20210357215
    Abstract: An apparatus and method for performing dual concurrent multiplications, subtraction/addition, and accumulation of packed data elements.
    Type: Application
    Filed: July 20, 2021
    Publication date: November 18, 2021
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal
  • Publication number: 20210294604
    Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
    Type: Application
    Filed: April 9, 2021
    Publication date: September 23, 2021
    Applicant: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang
  • Patent number: 11074073
    Abstract: An apparatus and method for performing dual concurrent multiplications, subtraction/addition, and accumulation of packed data elements.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: July 27, 2021
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal
  • Patent number: 10977039
    Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
    Type: Grant
    Filed: November 1, 2019
    Date of Patent: April 13, 2021
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang
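
Two of the abstracts above describe arithmetic that is easy to misread in prose: the packed horizontal add with saturation (patent 11249754 and publication 20220236991) and the sign-preserving right shift of packed quadwords (publication 20220129267). The Python sketch below models each operation on a single data element. It is an illustration only, not the claimed circuitry or any actual instruction. The function names are hypothetical, the saturation is assumed to be signed 16-bit (the abstract does not say signed or unsigned), and the upper 32 bits of the destination element are assumed to be zero (the abstract does not specify them).

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1
MASK64 = (1 << 64) - 1


def saturate_int16(value):
    """Clamp an integer to the signed 16-bit range (assumed saturation mode)."""
    return max(INT16_MIN, min(INT16_MAX, value))


def horizontal_add_words(packed_words, first, second):
    """Add two signed 16-bit words selected from the same source register.

    The exact sum of two 16-bit words needs at most 17 bits, which matches
    the "temporary storage of at least 17 bits" named in the abstract.
    """
    temporary_sum = packed_words[first] + packed_words[second]
    return saturate_int16(temporary_sum)


def shift_right_keep_high32(quadword, amount):
    """Arithmetic right shift of a signed 64-bit value, keeping its 32 MSBs.

    The returned value is what would land in the 32 least significant bits
    of the destination quadword. Upper destination bits are assumed zero.
    """
    # Python's >> on int shifts in copies of the sign bit; mask to 64 bits.
    shifted = (quadword >> amount) & MASK64
    return shifted >> 32


if __name__ == "__main__":
    source = [30000, 10000, -5, 7]
    print(horizontal_add_words(source, 0, 1))   # sum exceeds 16 bits, so it saturates
    print(horizontal_add_words(source, 2, 3))   # sum fits, so it passes through
    print(hex(shift_right_keep_high32(-1, 4)))  # sign bit is shifted in
```

The 16-bit variant described in publication 20220129268 would follow the same pattern with a final shift of 48 instead of 32.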