Patents by Inventor Milind B. Girkar
Milind B. Girkar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10445092
Abstract: A processor for performing a vector permute comprises: a source vector register to store a plurality of source data elements; a destination vector register to store a plurality of destination data elements; a control vector register to store a plurality of control data elements, each control data element corresponding to one of the destination data elements and including an N bit value indicating whether a source data element is to be copied to the corresponding destination data element; vector permute logic to compare the N bit value of each control data element to an N bit portion of an immediate to determine whether to copy a source data element to the corresponding destination data element, wherein if the N bit values match, then the vector permute logic is to identify a source data element using an index value included in the control data element.
Type: Grant
Filed: December 27, 2014
Date of Patent: October 15, 2019
Assignee: Intel Corporation
Inventors: Jesus Corbal San Adrian, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark J. Charney, Milind B. Girkar, Bret L. Toll, Roger Espasa, Guillem Sole, Jairo Balart, Brian Hickman
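As a rough software illustration of the permute described in this abstract, the sketch below copies a source element into a destination slot only when the control element's N-bit value matches the low N bits of an immediate. The bit layout of a control element (index in the low bits, N-bit value above it), the function name `masked_permute`, and the parameters `n_bits` and `idx_bits` are assumptions made for illustration; the abstract does not specify them.

```python
def masked_permute(src, dst, ctrl, imm, n_bits=2, idx_bits=4):
    """Software model of a tag-matched vector permute.

    Assumed (hypothetical) control element layout:
      bits [0, idx_bits)                 -> index into src
      bits [idx_bits, idx_bits + n_bits) -> N-bit value compared to the immediate
    """
    tag_mask = (1 << n_bits) - 1
    idx_mask = (1 << idx_bits) - 1
    imm_tag = imm & tag_mask              # N-bit portion of the immediate
    out = list(dst)                       # unmatched slots keep their old value
    for i, c in enumerate(ctrl):
        tag = (c >> idx_bits) & tag_mask
        if tag == imm_tag:
            out[i] = src[c & idx_mask]    # index taken from the control element
    return out


src = [10, 20, 30, 40]
dst = [-1, -1, -1, -1]
# Control elements: (tag=1, idx=3), (tag=2, idx=0), (tag=1, idx=1), (tag=0, idx=2)
ctrl = [(1 << 4) | 3, (2 << 4) | 0, (1 << 4) | 1, (0 << 4) | 2]
result = masked_permute(src, dst, ctrl, imm=0b01)
```

With the immediate's N-bit portion equal to 1, only the first and third destination slots are overwritten in this example.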
-
Patent number: 10423411
Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.
Type: Grant
Filed: September 26, 2015
Date of Patent: September 24, 2019
Assignee: Intel Corporation
Inventors: Asit K. Mishra, Edward T. Grochowski, Jonathan D. Pearce, Deborah T. Marr, Ehud Cohen, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal San Adrian, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Milind B. Girkar
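The result mask in this abstract can be modeled in a few lines of scalar code. The sketch below is only an illustration of the semantics (one mask element per position, set when that element equals any element of the other operand); the function name `match_any_mask` is hypothetical and the code says nothing about how the hardware computes the result.

```python
def match_any_mask(a, b):
    """Return one mask element per element of `a`:
    1 if that element equals any element of `b`, else 0."""
    b_values = set(b)
    return [1 if x in b_values else 0 for x in a]


a = [1, 2, 3, 4]
b = [4, 4, 9, 1]
mask_a = match_any_mask(a, b)   # mask relative to the first operand
mask_b = match_any_mask(b, a)   # mask relative to the second operand
```

Calling the function with the operands swapped gives the mask for the other source operand, which corresponds to the "at least one result mask operand" wording.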
-
Publication number: 20190286559
Abstract: In one embodiment, a processor comprises: at least one core formed on a die to execute instructions; a first memory controller to interface with an in-package memory; a second memory controller to interface with a platform memory to couple to the processor; and the in-package memory located within a package of the processor, where the in-package memory is to be identified as a more distant memory with respect to the at least one core than the platform memory. Other embodiments are described and claimed.
Type: Application
Filed: June 6, 2019
Publication date: September 19, 2019
Inventors: Avinash Sodani, Robert J. Kyanko, Richard J. Greco, Andreas Kleen, Milind B. Girkar, Christopher M. Cantalupo
-
Publication number: 20190278577
Abstract: Methods, apparatus, and system to optimize compilation of source code into vectorized compiled code, notwithstanding the presence of output dependencies which might otherwise preclude vectorization.
Type: Application
Filed: July 1, 2016
Publication date: September 12, 2019
Inventors: Mikhail PLOTNIKOV, Hideki IDO, Xinmin TIAN, Sergey PREIS, Milind B. GIRKAR, Maxim SHUTOV
-
Patent number: 10387156
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode, and execution hardware to execute the decoded instruction to continue a data speculative execution (DSX) and to determine that a DSX loop iteration is to be committed, commit speculative stores associated with the DSX loop iteration, and start a new DSX loop iteration.
Type: Grant
Filed: December 24, 2014
Date of Patent: August 20, 2019
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar
-
Patent number: 10387158
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode, and execution hardware to execute the decoded instruction inside a speculative execution (DSX) and rollback execution to a stored address and clear a DSX status indication in a DSX status register, and thereby abort the DSX.
Type: Grant
Filed: December 24, 2014
Date of Patent: August 20, 2019
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar
-
Patent number: 10346300
Abstract: In one embodiment, a processor comprises: at least one core formed on a die to execute instructions; a first memory controller to interface with an in-package memory; a second memory controller to interface with a platform memory to couple to the processor; and the in-package memory located within a package of the processor, where the in-package memory is to be identified as a more distant memory with respect to the at least one core than the platform memory. Other embodiments are described and claimed.
Type: Grant
Filed: June 21, 2017
Date of Patent: July 9, 2019
Assignee: Intel Corporation
Inventors: Avinash Sodani, Robert J. Kyanko, Richard J. Greco, Andreas Kleen, Milind B. Girkar, Christopher M. Cantalupo
-
Patent number: 10303525
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode and an operand to store a portion of a fallback address, execution hardware to execute the decoded instruction to initiate a data speculative execution (DSX) region by activating DSX tracking hardware to track speculative memory accesses and detect ordering violations in the DSX region, and storing the fallback address.
Type: Grant
Filed: December 24, 2014
Date of Patent: May 28, 2019
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar, Hideki Ido, Youfeng Wu, Cheng Wang
-
Publication number: 20190121637
Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.
Type: Application
Filed: October 24, 2018
Publication date: April 25, 2019
Inventors: JESUS CORBAL, ROBERT VALENTINE, ROMAN S. DUBTSOV, NIKITA A. SHUSTROV, MARK J. CHARNEY, DENNIS R. BRADFORD, MILIND B. GIRKAR, EDWARD T. GROCHOWSKI, THOMAS D. FLETCHER, WARREN E. FERGUSON
-
Publication number: 20190121644
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for DSX comprises execution hardware to execute instructions to begin and end a data speculative execution (DSX) and speculative instructions during the DSX, and DSX tracking hardware to track speculative memory accesses and detect ordering violations in a DSX of speculative instructions using a sequence number, addresses of instruction accesses, and whether an instruction being tracked is a write, and to trigger a mis-speculation upon an ordering violation.
Type: Application
Filed: December 24, 2014
Publication date: April 25, 2019
Inventors: Elmoustapha OULD-AHMED-VALL, Christopher J. HUGHES, Robert VALENTINE, Milind B. GIRKAR
-
Patent number: 10255072
Abstract: A processor of an aspect includes a decode unit to decode an instruction. The instruction is to explicitly specify a first architectural register and is to implicitly indicate at least a second architectural register. The second architectural register is implicitly to be at a higher register number than the first architectural register. The processor also includes an architectural register replacement unit coupled with the decode unit. The architectural register replacement unit is to replace the first architectural register with a third architectural register, and is to replace the second architectural register with a fourth architectural register. The third architectural register is to be at a lower register number than the first architectural register. The fourth architectural register is to be at a lower register number than the second architectural register. Other processors are also disclosed, as are methods and systems.
Type: Grant
Filed: July 1, 2016
Date of Patent: April 9, 2019
Assignee: Intel Corporation
Inventors: Mark J. Charney, Robert Valentine, Milind B. Girkar, Ashish Jha, Bret L. Toll, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal San Adrian, Jason W. Brandt
-
Patent number: 10175990
Abstract: According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception.
Type: Grant
Filed: May 20, 2013
Date of Patent: January 8, 2019
Assignee: Intel Corporation
Inventors: Christopher J. Hughes, Yen-Kuang (Y. K.) Chen, Mayank Bomb, Jason W. Brandt, Mark J. Buxton, Mark J. Charney, Srinivas Chennupaty, Jesus Corbal, Martin G. Dixon, Milind B. Girkar, Jonathan C. Hall, Hideki (Saito) Ido, Peter Lachner, Gilbert Neiger, Chris J. Newburn, Rajesh S. Parthasarathy, Bret L. Toll, Robert Valentine, Jeffrey G. Wiedemeier
-
Patent number: 10146535
Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.
Type: Grant
Filed: October 20, 2016
Date of Patent: December 4, 2018
Assignee: Intel Corporation
Inventors: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson
-
Patent number: 10114651
Abstract: According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception.
Type: Grant
Filed: January 4, 2018
Date of Patent: October 30, 2018
Assignee: Intel Corporation
Inventors: Christopher J. Hughes, Yen-Kuang (Y. K.) Chen, Mayank Bomb, Jason W. Brandt, Mark J. Buxton, Mark J. Charney, Srinivas Chennupaty, Jesus Corbal, Martin G. Dixon, Milind B. Girkar, Jonathan C. Hall, Hideki (Saito) Ido, Peter Lachner, Gilbert Neiger, Chris J. Newburn, Rajesh S. Parthasarathy, Bret L. Toll, Robert Valentine, Jeffrey G. Wiedemeier
-
Patent number: 10061583
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode, and execution hardware to execute the decoded instruction to reset data speculative execution (DSX) tracking hardware to track speculative memory accesses, clear a DSX status indication in a DSX status register, and commit all speculatively executed stores of the DSX region and thereby end a DSX region.
Type: Grant
Filed: December 24, 2014
Date of Patent: August 28, 2018
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar
-
Patent number: 10061589
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for DSX comprises decoder hardware to decode a class of instructions to support data speculative execution (DSX) including an instruction to begin a DSX, end a DSX, and speculative instructions to execute during a DSX, and execution hardware to speculatively execute decoded instructions that support DSX including the speculative instructions and update speculative instruction tracking hardware.
Type: Grant
Filed: December 24, 2014
Date of Patent: August 28, 2018
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar
-
Patent number: 10019262
Abstract: A processor comprises a plurality of vector registers, and an execution unit, operatively coupled to the plurality of vector registers, the execution unit comprising a logic circuit implementing a load instruction for loading, into two or more vector registers, two or more data items associated with a data structure stored in a memory, wherein each one of the two or more vector registers is to store a data item associated with a certain position number within the data structure.
Type: Grant
Filed: December 22, 2015
Date of Patent: July 10, 2018
Assignee: Intel Corporation
Inventors: Ashish Jha, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark J. Charney, Milind B. Girkar
-
Publication number: 20180181404
Abstract: In one example, a system for generating vector based selection control statements can include a processor to determine a vector cost of the selection control statement is below a scalar cost and determine the selection control statement is to be executed in a sorted order based on dependencies between branch instructions of the selection control statement. The processor can also determine a program ordering of labels of the selection control statement does not match a mathematical ordering of the labels and execute the selection control statement with a vector of values, wherein the selection control statement is to be executed based on a jump table and a sorted unique value technique, wherein the sorted unique value technique comprises selecting at least one of the plurality of branch instructions from the jump table.
Type: Application
Filed: December 28, 2016
Publication date: June 28, 2018
Applicant: Intel Corporation
Inventors: Hideki Saito Ido, Eric N. Garcia, Xinmin Tian, Milind B. Girkar, James Brodman
-
Patent number: 9996319
Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.
Type: Grant
Filed: December 23, 2015
Date of Patent: June 12, 2018
Assignee: Intel Corporation
Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
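The sequence of steps in this abstract (rounded sum, unrounded sum, difference, rounding of the difference) can be imitated in software with exact rational arithmetic. The sketch below is a model of the arithmetic only, assuming finite IEEE 754 double-precision inputs and round-to-nearest; the function name `add_low` is hypothetical, and Python's `fractions.Fraction` stands in for the unrounded intermediate that the hardware would hold internally.

```python
from fractions import Fraction


def add_low(a: float, b: float) -> float:
    """Model of an 'ADD low' result: the part of a + b lost to rounding.

    Assumes finite double-precision inputs and a sum that does not overflow.
    """
    add_value = a + b                      # rounded sum (the "ADD value")
    exact_sum = Fraction(a) + Fraction(b)  # unrounded sum, held exactly
    difference = exact_sum - Fraction(add_value)
    # float() rounds the exact difference to a double; normalization is
    # implicit in the conversion.
    return float(difference)


high = 1.0 + 2.0 ** -60            # rounds to 1.0 in double precision
low = add_low(1.0, 2.0 ** -60)     # recovers the discarded 2**-60
```

Under these assumptions the pair (`a + b`, `add_low(a, b)`) represents the sum exactly, which is the property double-double style arithmetic relies on.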
-
Patent number: 9996320
Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation and, responsive to the request: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product value with the third FP value to generate a first result value; rounds the first result value to generate an FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value with the third FP value to generate a second result value; and subtracts the FMA value from the second result value to obtain a third result value, which can then be normalized and rounded (the FMA low result); and sends the FMA low result to an application.
Type: Grant
Filed: December 23, 2015
Date of Patent: June 12, 2018
Assignee: Intel Corporation
Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
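The FMA low steps in this abstract can likewise be modeled with exact rational arithmetic. The sketch below assumes finite double-precision inputs, round-to-nearest, and a result within double range; `fma_low` is a hypothetical name, and `fractions.Fraction` is used in place of a hardware FMA so that the product and sum are formed without intermediate rounding.

```python
from fractions import Fraction


def fma_low(a: float, b: float, c: float) -> float:
    """Model of an 'FMA low' result: the rounding error of a*b + c.

    Assumes finite inputs and a result that fits in double range.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # unrounded a*b + c
    fma_value = float(exact)                         # single rounding, like an FMA
    difference = exact - Fraction(fma_value)
    # Unlike addition, this error may itself need rounding to fit a double.
    return float(difference)


x = 1.0 + 2.0 ** -30
# x*x = 1 + 2**-29 + 2**-60; the 2**-60 term is lost when rounding to double.
low = fma_low(x, x, 0.0)
```

The final rounding step matters here: the error of a fused multiply-add is not always representable in a single double, which is consistent with the abstract listing a separate normalize-and-round step.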