Patents by Inventor Paul Caprioli

Paul Caprioli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10241787
    Abstract: Embodiments of an invention for control transfer overrides are disclosed. In one embodiment, a processor includes an instruction unit to receive a control transfer instruction. The instruction unit includes a transfer override register to provide an alternative target for the control transfer instruction.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: March 26, 2019
    Assignee: Intel Corporation
    Inventor: Paul Caprioli
  • Patent number: 10209989
    Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.
    Type: Grant
    Filed: March 7, 2017
    Date of Patent: February 19, 2019
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Abhay S. Kanhere, Jeffrey J. Cook, Muawya M. Al-Otoom
  • Publication number: 20180290764
    Abstract: Some embodiments described herein relate to a drone landing platform. One or more sensors coupled to the drone landing platform can detect local conditions in the vicinity of the drone landing platform. A communications system can be operable to transmit information related to local conditions to a drone.
    Type: Application
    Filed: April 5, 2018
    Publication date: October 11, 2018
    Applicant: DroneTerminus LLC
    Inventors: Joseph Barry McMillian, Mark Messina, Paul Caprioli
  • Publication number: 20180285283
    Abstract: A processor includes a memory to store original code and a fingerprint data structure, which stores, in a way thereof, an entry including a physical address for a page and a stored fingerprint generated from the page of the original code. A core includes a translation protection data structure (TPDS) to detect modification to the page, wherein the core is to, upon execution of a translation check instruction included within a translated page code corresponding to the page, transmit, to the TPDS, a modification check request having the physical address of the page in the memory and the way of the fingerprint data structure. A hardware TPDS miss handler is coupled to the core and is to process a miss request received from the TPDS responsive to the physical address not being present in the TPDS.
    Type: Application
    Filed: March 31, 2017
    Publication date: October 4, 2018
    Inventors: Paul Caprioli, Jeffrey J. Cook
  • Patent number: 10089244
    Abstract: A processor includes a memory to store original code and a fingerprint data structure, which stores, in a way thereof, an entry including a physical address for a page and a stored fingerprint generated from the page of the original code. A core includes a translation protection data structure (TPDS) to detect modification to the page, wherein the core is to, upon execution of a translation check instruction included within a translated page code corresponding to the page, transmit, to the TPDS, a modification check request having the physical address of the page in the memory and the way of the fingerprint data structure. A hardware TPDS miss handler is coupled to the core and is to process a miss request received from the TPDS responsive to the physical address not being present in the TPDS.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: October 2, 2018
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Jeffrey J. Cook
  • Patent number: 10063569
    Abstract: Embodiments of an invention for custom protection against side channel attacks are disclosed. In one embodiment, a processor includes instruction hardware and execution hardware. The instruction hardware is to receive an instruction to provide for shielding code against side channel attacks, wherein the instruction includes a first operand to specify one of a plurality of levels of protection. The execution hardware is to execute the instruction, wherein execution of the instruction includes configuring the processor to provide a specified level of protection.
    Type: Grant
    Filed: March 24, 2015
    Date of Patent: August 28, 2018
    Assignee: Intel Corporation
    Inventor: Paul Caprioli
  • Patent number: 9934124
    Abstract: In an embodiment, a processor includes execution logic to execute binary translated (BT) code that is translated from native architecture (NA) code. The processor also includes processor trace (PT) logic to output trace information responsive to execution of a BT direct branch instruction in the BT code when the NA code includes an NA direct branch instruction that corresponds to the BT direct branch instruction. The trace information is to include an indication of an NA outcome associated with an execution of the NA direct branch instruction. The trace information is to be based on a BT outcome associated with the execution of the BT direct branch instruction. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 5, 2015
    Date of Patent: April 3, 2018
    Assignee: Intel Corporation
    Inventors: Furat F. Afram, Jeffrey J. Cook, Paul Caprioli
  • Publication number: 20180088921
    Abstract: Technologies for optimized binary translation include a computing device that determines a cost-benefit metric associated with each translated code block of a translation cache. The cost-benefit metric is indicative of translation cost and performance benefit associated with the translated code block. The translation cost may be determined by measuring translation time of the translated code block. The cost-benefit metric may be calculated using a weighted cost-benefit function based on an expected workload of the computing device. In response to determining to free space in the translation cache, the computing device determines whether to discard each translated code block as a function of the cost-benefit metric. In response to determining to free space in the translation cache, the computing device may increment an iteration count and skip each translated code block if the iteration count modulo the corresponding cost-benefit metric is non-zero. Other embodiments are described and claimed.
    Type: Application
    Filed: September 23, 2016
    Publication date: March 29, 2018
    Inventors: Paul Caprioli, Jeffrey J. Cook
  • Publication number: 20170242699
    Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.
    Type: Application
    Filed: March 7, 2017
    Publication date: August 24, 2017
    Inventors: Paul Caprioli, Abhay S. Kanhere, Jeffrey J. Cook, Muawya M. Al-Otoom
  • Publication number: 20170212825
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Application
    Filed: January 10, 2017
    Publication date: July 27, 2017
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Patent number: 9703948
    Abstract: A processor includes a decode unit to decode a return target restrictive return from procedure (RTR return) instruction. A return target restriction unit is responsive to the RTR return instruction to determine whether to restrict an attempt by the RTR return instruction to make a control flow transfer to an instruction at a return address corresponding to the RTR return instruction. The determination is based on compatibility of a type of the instruction at the return address with the RTR return instruction and based on compatibility of first return target restrictive information (RTR information) of the RTR return instruction with second RTR information of the instruction at the return address. A control flow transfer unit is responsive to the RTR return instruction to transfer control flow to the instruction at the return address when the return target restriction unit determines not to restrict the attempt.
    Type: Grant
    Filed: March 28, 2014
    Date of Patent: July 11, 2017
    Assignee: Intel Corporation
    Inventor: Paul Caprioli
  • Patent number: 9652234
    Abstract: A dynamic optimization of code for a processor-specific dynamic binary translation of hot code pages (e.g., frequently executed code pages) may be provided by a run-time translation layer. A method may be provided to use an instruction translation look-aside buffer (iTLB) to map original code pages and translated code pages. The method may comprise fetching an instruction from an original code page, and determining whether the fetched instruction is a first instruction of a new code page and whether the original code page is deprecated. If both determinations return yes, the method may further comprise fetching a next instruction from a translated code page. If either determination returns no, the method may further comprise decoding the instruction and fetching the next instruction from the original code page.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: May 16, 2017
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Martin G. Dixon, Brett L. Toll, Muawya M. Al-Otoom, Omar M. Shaikh
  • Publication number: 20170097826
    Abstract: Systems, apparatuses, and methods for improving transactional memory (TM) throughput using a TM region indicator (or color) are described. Through the use of TM region indicators, younger TM regions can have their instructions retired while waiting for older TM regions to commit.
    Type: Application
    Filed: December 16, 2016
    Publication date: April 6, 2017
    Inventors: Omar M. Shaikh, Ravi Rajwar, Paul Caprioli, Muawya M. Al-Otoom
  • Publication number: 20170097891
    Abstract: Systems, apparatuses, and methods for improving transactional memory (TM) throughput using a TM region indicator (or color) are described. Through the use of TM region indicators, younger TM regions can have their instructions retired while waiting for older TM regions to commit.
    Type: Application
    Filed: December 16, 2016
    Publication date: April 6, 2017
    Inventors: Omar M. Shaikh, Ravi Rajwar, Paul Caprioli, Muawya M. Al-Otoom
  • Publication number: 20170090927
    Abstract: Embodiments of an invention for control transfer instructions indicating intent to call or return are disclosed. In one embodiment, a processor includes a return target predictor, instruction hardware, and execution hardware. The instruction hardware is to receive a first instruction, a second instruction, and a third instruction, and the execution hardware is to execute the first instruction, the second instruction, and the third instruction. Execution of the first instruction is to store a first return address on a stack and to transfer control to a first target address. Execution of the second instruction is to store a second return address in the return target predictor and transfer control to a second target address. Execution of the third instruction is to transfer control to the second target address.
    Type: Application
    Filed: September 30, 2015
    Publication date: March 30, 2017
    Inventors: Paul Caprioli, Koichi Yamada, Tugrul Ince
  • Patent number: 9588766
    Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: March 7, 2017
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Abhay S. Kanhere, Jeffrey J. Cook, Muawya M. Al-Otoom
  • Publication number: 20170046140
    Abstract: State recovery methods and apparatus for computing platforms are disclosed. An example method includes inserting, with a processor, a first instruction into optimized code to cause a first portion of a register in a first state to be saved to memory before execution of a region of the optimized code, maintaining, with the processor, a first indication of a first manner in which the first portion of the register is to be restored in connection with a state recovery after execution of the region of the optimized code, and maintaining, with the processor, a second indication of a second manner in which a second portion of the register is to be restored in connection with the state recovery after execution of the region of the optimized code.
    Type: Application
    Filed: October 27, 2016
    Publication date: February 16, 2017
    Inventors: Abhay S. Kanhere, Saurabh Shukla, Suriya Subramanian, Paul Caprioli
  • Patent number: 9542191
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: January 10, 2017
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Publication number: 20160378498
    Abstract: Systems, methods, and apparatuses for last branch record support are described. In an embodiment, a hardware processor core comprises a hardware execution unit to execute a branch instruction, at least two last branch record (LBR) registers to store source and destination information of a branch taken during program execution, wherein an entry in an LBR register is to include an encoding of the branch, a write bit array to indicate which LBR register is architecturally correct, an architectural bit array to indicate when an LBR register has been written, and a plurality of top of stack pointers to indicate which LBR register in an LBR register stack is to be written.
    Type: Application
    Filed: June 27, 2015
    Publication date: December 29, 2016
    Inventors: Paul Caprioli, Koichi Yamada, Jason M. Agron, Jiwei Lu
  • Publication number: 20160378480
    Abstract: Embodiments for systems, methods, and apparatuses for improving performance of status dependent computations are detailed. In an embodiment, a hardware apparatus comprises decoder hardware to decode an instruction, operand retrieval hardware to retrieve data from at least one source operand associated with the instruction decoded by the decoder hardware, and execution hardware to execute the decoded instruction to generate a result including at least one status bit and to cause the result and at least one status bit to be stored in a single destination physical storage location, wherein the at least one status bit and result are accessible through a read of the single register.
    Type: Application
    Filed: June 27, 2015
    Publication date: December 29, 2016
    Inventors: Pavel G. Matveyev, Dmitry M. Maslennikov, Paul Caprioli, Gadi Haber
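
Illustrative note on publication 20180088921 (optimized binary translation): the abstract describes freeing space in a translation cache by incrementing an iteration count and skipping each translated code block whose cost-benefit metric does not divide that count. The Python sketch below is one possible reading of that rule, not the claimed implementation. The names TranslatedBlock, cost_benefit, and free_space are hypothetical, and the metric is assumed to be a positive integer where a larger value means the block is more worth keeping.

```python
from dataclasses import dataclass


@dataclass
class TranslatedBlock:
    # Hypothetical record for one translated code block in the cache.
    name: str
    cost_benefit: int  # assumed positive integer; larger means "more worth keeping"


def free_space(cache, iteration_count):
    """Run one eviction pass over the cache.

    Increments the iteration count, then skips (keeps) every block for which
    iteration_count % cost_benefit is non-zero and discards the rest.
    Returns (kept, discarded, new_iteration_count).
    """
    iteration_count += 1
    kept, discarded = [], []
    for block in cache:
        if iteration_count % block.cost_benefit != 0:
            kept.append(block)       # skipped on this pass
        else:
            discarded.append(block)  # metric divides the count: evict
    return kept, discarded, iteration_count


if __name__ == "__main__":
    cache = [TranslatedBlock("A", 1), TranslatedBlock("B", 2), TranslatedBlock("C", 3)]
    count = 0
    while cache:
        cache, evicted, count = free_space(cache, count)
        print(count, [b.name for b in evicted])
```

Under these assumptions a block with metric k survives k - 1 consecutive passes before it becomes eligible for eviction, so blocks that were expensive to translate or that yield a large performance benefit stay in the cache longer.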