Patents by Inventor Sridhar Samudrala

Sridhar Samudrala has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Double rounded combined floating-point multiply and add

Patent number: 9477441

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Grant

Filed: November 23, 2015

Date of Patent: October 25, 2016

Assignee: Intel Corporation

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
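
The numerical difference between a double rounded multiply-add (product rounded, then sum rounded) and a fused multiply-add (one rounding at the end) can be sketched as below. This is only an illustrative model of the arithmetic, not the hardware described in the abstract; the function names are hypothetical, and exact rational arithmetic stands in for the single-rounding reference.

```python
from fractions import Fraction


def double_rounded_multiply_add(a: float, b: float, c: float) -> float:
    """Round the product to double precision, then round the sum."""
    product = a * b      # first rounding: product becomes a double
    return product + c   # second rounding: sum becomes a double


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Reference model: compute a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # the only rounding step


# Operands chosen so that the exact product 1 - 2**-60 rounds to 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

double_rounded = double_rounded_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)
```

With these operands the double rounded result is 0.0, while the single-rounding result keeps the small residual -2**-60. That is why code written for separate multiply and add instructions cannot always be mapped onto a plain fused operation without changing its results.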

Combined floating point multiplier adder with intermediate rounding logic

Patent number: 9389871

Abstract: An error handling method includes identifying a code region eligible for cumulative multiply add (CMA) optimization and translating code region instructions into interpreter code instructions, which may include translating sequences of multiply add instructions in the code region instructions into fusion code including CMA instructions. Floating point (FP) exceptions generated by the fusion code may be monitored and at least a portion of the code region instructions may be re-translated to eliminate some or all fusion code if CMA intermediate rounding exceptions exceed a threshold.

Type: Grant

Filed: March 15, 2013

Date of Patent: July 12, 2016

Assignee: Intel Corporation

Inventors: Marc Lupon, Grigorios Magklis, Sridhar Samudrala, Raul Martinez, Kyriakos A. Stavrou, Enric Gibert Codina

Mechanism for facilitating dynamic and efficient fusion of computing instructions in software programs

Patent number: 9329848

Abstract: A mechanism is described for facilitating dynamic and efficient fusion of computing instructions according to one embodiment. A method of embodiments, as described herein, includes monitoring a software program for a program region having fusion candidate instructions for a fusion operation at a computing system; evaluating whether the macro operation of the candidate instructions is valuable to the software program; and performing the fusion operation if it is evaluated to be valuable.

Type: Grant

Filed: March 27, 2013

Date of Patent: May 3, 2016

Assignee: Intel Corporation

Inventors: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala

Double rounded combined floating-point multiply and add

Publication number: 20160077802

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Application

Filed: November 23, 2015

Publication date: March 17, 2016

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel

Double rounded combined floating-point multiply and add

Patent number: 9213523

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Grant

Filed: June 29, 2012

Date of Patent: December 15, 2015

Assignee: Intel Corporation

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel

Vector logical reduction operation implemented using swizzling on a semiconductor chip

Patent number: 9141386

Abstract: A semiconductor processor is described. The semiconductor processor includes logic circuitry to perform a logical reduction instruction. The logic circuitry has swizzle circuitry to swizzle a vector's elements so as to form a swizzle vector. The logic circuitry also has vector logic circuitry to perform a vector logic operation on said vector and said swizzle vector.

Type: Grant

Filed: September 24, 2010

Date of Patent: September 22, 2015

Assignee: Intel Corporation

Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver
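
One way to picture a logical reduction built from swizzles is the sketch below: the vector is repeatedly combined with a rotated copy of itself until every lane holds the reduction of all elements. The rotation pattern, the power-of-two length requirement, and the function names are assumptions made for illustration; the abstract does not specify the swizzle pattern.

```python
import operator
from typing import Callable, List


def swizzle(vec: List[int], shift: int) -> List[int]:
    """Form a swizzle vector by rotating the elements by `shift` lanes."""
    n = len(vec)
    return [vec[(i + shift) % n] for i in range(n)]


def logical_reduce(vec: List[int], op: Callable[[int, int], int]) -> int:
    """Reduce `vec` with a bitwise `op` in log2(len(vec)) swizzle steps."""
    n = len(vec)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two in this sketch")
    shift = n // 2
    while shift >= 1:
        swizzled = swizzle(vec, shift)
        # Vector logic operation on the vector and its swizzle vector.
        vec = [op(x, y) for x, y in zip(vec, swizzled)]
        shift //= 2
    return vec[0]  # every lane now holds the full reduction


or_result = logical_reduce([1, 2, 4, 8], operator.or_)
xor_result = logical_reduce([0b1010, 0b0110, 0b0001, 0b1111], operator.xor)
```

The approach relies on the operation being associative and commutative, which holds for AND, OR, and XOR.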

Functional unit for vector leading zeroes, vector trailing zeroes, vector operand 1s count and vector parity calculation

Patent number: 9092213

Abstract: A method of performing vector operations on a semiconductor chip is described. The method includes performing a first vector instruction with a vector functional unit implemented on the semiconductor chip and performing a second vector instruction with the vector functional unit. The first vector instruction is a vector multiply add instruction. The second vector instruction is a vector leading zeros count instruction.

Type: Grant

Filed: September 24, 2010

Date of Patent: July 28, 2015

Assignee: Intel Corporation

Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver, Eric W. Mahurin
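
The per-element results of the four operations named in the title can be sketched as follows. The 32-bit element width and all function names are assumptions for illustration; the sketch models only what each lane computes, not how the functional unit shares hardware with the multiply add path.

```python
from typing import Callable, List

WIDTH = 32  # assumed element width in bits


def leading_zeros(x: int, width: int = WIDTH) -> int:
    """Count zero bits above the highest set bit."""
    return width - x.bit_length()


def trailing_zeros(x: int, width: int = WIDTH) -> int:
    """Count zero bits below the lowest set bit (width if x is zero)."""
    return width if x == 0 else (x & -x).bit_length() - 1


def ones_count(x: int) -> int:
    """Count the set bits in x."""
    return bin(x).count("1")


def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return ones_count(x) & 1


def vector_op(fn: Callable[[int], int], vec: List[int]) -> List[int]:
    """Apply a per-element operation to every lane of the vector."""
    return [fn(x) for x in vec]


vec = [0, 1, 0x80000000, 0x00F0]
lz = vector_op(leading_zeros, vec)
tz = vector_op(trailing_zeros, vec)
```

Inputs are assumed to be unsigned values that fit in the element width.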

Virtual group policy based filtering within an overlay network

Publication number: 20150195137

Abstract: A virtual switch connected to at least one virtual machine of multiple virtual machines communicatively connected through an overlay network, receives a data packet, each of the virtual machines configured within a separate one of multiple virtual groups in the overlay network, the data packet comprising a packet header comprising at least one address. The virtual switch receives a virtual group identifier for the at least one address from at least one address resolution service returning the virtual group identifier and a resolved address for the at least one address, in response to an address resolution request for the at least one address. The virtual switch sends the data packet through the virtual switch to the resolved address only if the virtual group identifier is allowed according to a filtering policy applied by the virtual switch for a particular virtual group identified by the virtual group identifier.

Type: Application

Filed: January 6, 2014

Publication date: July 9, 2015

Inventors: Vivek Kashyap, Sridhar Samudrala, David L. Stevens, Jr.
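
The forwarding decision described in the abstract can be sketched roughly as below. Every class, field, and address here is hypothetical; the sketch assumes the address resolution service returns a (virtual group identifier, resolved address) pair and that the policy is a simple set of allowed group identifiers.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Packet:
    dst_address: str  # address taken from the packet header
    payload: bytes


class AddressResolutionService:
    """Hypothetical service mapping an address to (group id, resolved address)."""

    def __init__(self, table: Dict[str, Tuple[str, str]]) -> None:
        self._table = dict(table)

    def resolve(self, address: str) -> Tuple[str, str]:
        return self._table[address]


class VirtualSwitch:
    """Forwards a packet only if the destination's group passes the policy."""

    def __init__(self, resolver: AddressResolutionService,
                 allowed_groups: Set[str]) -> None:
        self.resolver = resolver
        self.allowed_groups = set(allowed_groups)

    def forward(self, packet: Packet) -> Optional[str]:
        group_id, resolved = self.resolver.resolve(packet.dst_address)
        if group_id not in self.allowed_groups:
            return None   # filtered: the packet is not sent
        return resolved   # address the packet would be sent to


resolver = AddressResolutionService({
    "10.0.0.5": ("web", "192.168.1.5"),
    "10.0.0.9": ("db", "192.168.1.9"),
})
switch = VirtualSwitch(resolver, allowed_groups={"web"})
allowed = switch.forward(Packet("10.0.0.5", b"hello"))
blocked = switch.forward(Packet("10.0.0.9", b"hello"))
```

Here the packet bound for the "web" group is forwarded to its resolved address, while the packet bound for the "db" group is dropped.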

Mechanism for facilitating dynamic and efficient fusion of computing instructions in software programs

Publication number: 20150026671

Abstract: A mechanism is described for facilitating dynamic and efficient fusion of computing instructions according to one embodiment. A method of embodiments, as described herein, includes monitoring a software program for a program region having fusion candidate instructions for a fusion operation at a computing system; evaluating whether the macro operation of the candidate instructions is valuable to the software program; and performing the fusion operation if it is evaluated to be valuable.

Type: Application

Filed: March 27, 2013

Publication date: January 22, 2015

Inventors: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala

Instruction and logic to provide vector blend and permute functionality

Publication number: 20140372727

Abstract: Vector blend and permute functionality are provided, responsive to instructions specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, a second vector register, and a third operand. Indices are read from fields in the second register. Each index has a first selector portion and a second selector portion. Corresponding unmasked vector elements are stored to fields of the destination register, wherein each vector element, responsive to the respective first selector portion having a first value, is copied to an intermediate vector from a corresponding data field of the first register, and responsive to the respective first selector portion having a second value, is copied to the intermediate vector from a corresponding data field of the third operand. Then unmasked data fields of the destination are replaced by data fields in the intermediate vector indexed by the corresponding second selector portions.

Type: Application

Filed: December 23, 2011

Publication date: December 18, 2014

Applicant: Intel Corporation

Inventors: Robert Valentine, Bret L. Toll, Jesus Corbal, Jeff G. Wiedemeier, Sridhar Samudrala
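
The two-stage behavior in the abstract (blend into an intermediate vector, then permute from it) can be sketched as follows. The representation of each index as a (source_select, position) pair, the list-based mask, and the choice to leave masked destination fields unchanged are all assumptions for illustration; the abstract does not give the bit-level encoding.

```python
from typing import List, Sequence, Tuple


def blend_permute(dest: Sequence[int],
                  src1: Sequence[int],
                  indices: Sequence[Tuple[int, int]],
                  src3: Sequence[int],
                  mask: Sequence[int]) -> List[int]:
    """Blend src1/src3 per lane, then permute the blended vector into dest."""
    # Stage 1: the first selector portion picks the source for each lane.
    intermediate = [
        src3[i] if source_select else src1[i]
        for i, (source_select, _) in enumerate(indices)
    ]
    # Stage 2: the second selector portion indexes the intermediate vector.
    # Masked lanes keep their old destination value (an assumption).
    return [
        intermediate[position] if unmasked else old
        for old, (_, position), unmasked in zip(dest, indices, mask)
    ]


result = blend_permute(
    dest=[0, 0, 0, 0],
    src1=[10, 11, 12, 13],
    indices=[(0, 3), (1, 2), (0, 1), (1, 0)],
    src3=[20, 21, 22, 23],
    mask=[1, 1, 0, 1],
)
```

In this example the intermediate vector is [10, 21, 12, 23], and the result is [23, 12, 0, 10]: lane 2 is masked and keeps its original value.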

Combined floating point multiplier adder with intermediate rounding logic

Publication number: 20140281419

Abstract: An error handling method includes identifying a code region eligible for cumulative multiply add (CMA) optimization and translating code region instructions into interpreter code instructions, which may include translating sequences of multiply add instructions in the code region instructions into fusion code including CMA instructions. Floating point (FP) exceptions generated by the fusion code may be monitored and at least a portion of the code region instructions may be re-translated to eliminate some or all fusion code if CMA intermediate rounding exceptions exceed a threshold.

Type: Application

Filed: March 15, 2013

Publication date: September 18, 2014

Applicant: Intel Corporation

Inventors: Marc Lupon, Grigorios Magklis, Sridhar Samudrala, Raul Martinez, Kyriakos A. Stavrou, Enric Gibert Codina

Vector friendly instruction format and execution thereof

Publication number: 20140149724

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

Type: Application

Filed: January 31, 2014

Publication date: May 29, 2014

Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount

Functional unit for vector integer multiply add instruction

Patent number: 8667042

Abstract: A vector functional unit implemented on a semiconductor chip to perform vector operations of dimension N is described. The vector functional unit includes N functional units. Each of the N functional units have logic circuitry to perform: a first integer multiply add instruction that presents highest ordered bits but not lowest ordered bits of a first integer multiply add calculation, and, a second integer multiply add instruction that presents lowest ordered bits but not highest ordered bits of a second integer multiply add calculation.

Type: Grant

Filed: September 24, 2010

Date of Patent: March 4, 2014

Assignee: Intel Corporation

Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver
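
The pair of instructions in the abstract, one presenting the high-order bits and one the low-order bits of an integer multiply add, can be modeled per lane as below. The 32-bit unsigned element width and the function names are assumptions for illustration only.

```python
from typing import Callable, List

WIDTH = 32                 # assumed element width in bits
MASK = (1 << WIDTH) - 1    # selects the low WIDTH bits


def multiply_add_low(a: int, b: int, c: int) -> int:
    """Return the lowest WIDTH bits of a*b + c."""
    return (a * b + c) & MASK


def multiply_add_high(a: int, b: int, c: int) -> int:
    """Return the next WIDTH bits above the low half of a*b + c."""
    return ((a * b + c) >> WIDTH) & MASK


def vector_multiply_add(fn: Callable[[int, int, int], int],
                        va: List[int], vb: List[int],
                        vc: List[int]) -> List[int]:
    """Apply one of the two variants to each of the N lanes."""
    return [fn(a, b, c) for a, b, c in zip(va, vb, vc)]


va = [3, 0xFFFFFFFF]
vb = [4, 0xFFFFFFFF]
vc = [5, 1]
low = vector_multiply_add(multiply_add_low, va, vb, vc)    # [17, 2]
high = vector_multiply_add(multiply_add_high, va, vb, vc)  # [0, 0xFFFFFFFE]
```

Issuing both variants on the same operands recovers the full double-width result without widening the lanes.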

Double rounded combined floating-point multiply and add

Publication number: 20140006467

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Application

Filed: June 29, 2012

Publication date: January 2, 2014

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel

Method and apparatus for controlling a MXCSR

Publication number: 20130326199

Abstract: Disclosed is an apparatus and method generally related to controlling a multimedia extension control and status register (MXCSR). A processor core may include a floating point unit (FPU) to perform arithmetic functions; and a multimedia extension control register (MXCR) to provide control bits to the FPU. Further an optimizer may be used to select a speculative multimedia extension status register (SPEC_MXSR) from a plurality of SPEC_MXSRs to update a multimedia extension status register (MXSR) based upon an instruction.

Type: Application

Filed: December 29, 2011

Publication date: December 5, 2013

Inventors: Grigorios Magklis, Josep M. Codina, Craig B. Zilles, Michael Neilly, Sridhar Samudrala, Alejandro Martinez Vicente, Polychronis Xekalakis, F. Jesus Sanchez, Marc Lupon, Georgios Tournavitis, Enric Gibert Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Fernando Latorre, Pedro Lopez, Carlos Madriles Gimeno, Pedro Marcuello, Raul Martinez, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou

Vector friendly instruction format and execution thereof

Publication number: 20130305020

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

Type: Application

Filed: September 30, 2011

Publication date: November 14, 2013

Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount

Systems, apparatuses, and methods for blending two source operands into a single destination using a writemask

Publication number: 20120254588

Abstract: Embodiments of systems, apparatuses, and methods for performing a blend instruction in a computer processor are described. In some embodiments, the execution of a blend instruction causes a data element-by-element selection of data elements of first and second source operands using the corresponding bit positions of a writemask as a selector between the first and second operands and storage of the selected data elements into the destination at the corresponding position in the destination.

Type: Application

Filed: April 1, 2011

Publication date: October 4, 2012

Inventors: Jesus Corbal San Adrian, Bret L. Toll, Robert C. Valentine, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Andrew Thomas Forsyth, Elmoustapha Ould-Ahmed-Vall, Dennis R. Bradford, Lisa K. Wu
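
The element-by-element selection described in the abstract can be sketched as below. The writemask is modeled as an integer whose bit i controls lane i, and a set bit is assumed to select the second source; both the polarity and the function name are assumptions for illustration.

```python
from typing import List, Sequence


def blend(src1: Sequence[int], src2: Sequence[int], writemask: int) -> List[int]:
    """Pick src2[i] where writemask bit i is set, otherwise src1[i]."""
    return [
        b if (writemask >> i) & 1 else a
        for i, (a, b) in enumerate(zip(src1, src2))
    ]


# Bits 0 and 2 are set, so lanes 0 and 2 come from the second source.
destination = blend([1, 2, 3, 4], [10, 20, 30, 40], writemask=0b0101)
```

Each selected element lands in the destination lane with the same position as its writemask bit, so no data moves across lanes.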

Systems, apparatuses, and methods for expanding a memory source into a destination register and compressing a source register into a destination memory location

Publication number: 20120254592

Abstract: Embodiments of systems, apparatuses, and methods for performing an expand and/or compress instruction in a computer processor are described. In some embodiments, the execution of an expand instruction causes the selection of elements from a source that are to be sparsely stored in a destination based on values of the writemask and store each selected data element of the source as a sparse data element into a destination location, wherein the destination locations correspond to each writemask bit position that indicates that the corresponding data element of the source is to be stored.

Type: Application

Filed: April 1, 2011

Publication date: October 4, 2012

Inventors: Jesus Corbal San Adrian, Roger Espasa Sans, Robert C. Valentine, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Andrew Thomas Forsyth, Victor W. Lee
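
A rough model of the expand and compress operations follows. It assumes that expand reads consecutive source elements and scatters them to the destination lanes whose writemask bit is set, that compress is the inverse packing operation, and that lanes with a clear bit keep their previous destination value; the function names and the integer writemask are hypothetical.

```python
from typing import List, Sequence


def expand(source: Sequence[int], dest: Sequence[int], writemask: int) -> List[int]:
    """Store consecutive source elements sparsely into the enabled lanes."""
    result = list(dest)
    next_source = 0
    for i in range(len(result)):
        if (writemask >> i) & 1:
            result[i] = source[next_source]
            next_source += 1
    return result


def compress(source: Sequence[int], writemask: int) -> List[int]:
    """Pack the enabled lanes of the source into consecutive elements."""
    return [x for i, x in enumerate(source) if (writemask >> i) & 1]


# Bits 0, 3, and 5 are set, so three source elements are placed sparsely.
expanded = expand([7, 8, 9], [0, 0, 0, 0, 0, 0], writemask=0b101001)
packed = compress(expanded, writemask=0b101001)
```

Applying compress with the same writemask to the expanded vector returns the original dense elements, which is the round trip between a packed memory layout and a sparse register layout.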

Functional unit for vector integer multiply add instruction

Publication number: 20120078992

Abstract: A vector functional unit implemented on a semiconductor chip to perform vector operations of dimension N is described. The vector functional unit includes N functional units. Each of the N functional units have logic circuitry to perform: a first integer multiply add instruction that presents highest ordered bits but not lowest ordered bits of a first integer multiply add calculation, and, a second integer multiply add instruction that presents lowest ordered bits but not highest ordered bits of a second integer multiply add calculation.

Type: Application

Filed: September 24, 2010

Publication date: March 29, 2012

Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver

Functional unit for vector leading zeroes, vector trailing zeroes, vector operand 1s count and vector parity calculation

Publication number: 20120079253

Abstract: A method of performing vector operations on a semiconductor chip is described. The method includes performing a first vector instruction with a vector functional unit implemented on the semiconductor chip and performing a second vector instruction with the vector functional unit. The first vector instruction is a vector multiply add instruction. The second vector instruction is a vector leading zeros count instruction.

Type: Application

Filed: September 24, 2010

Publication date: March 29, 2012

Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver, Eric W. Mahurin
