Patents Examined by William V Nguyen
  • Patent number: 10831504
    Abstract: A method and circuit arrangement provide support for a hybrid pipeline that dynamically switches between out-of-order and in-order modes. The hybrid pipeline may selectively execute instructions from at least one instruction stream that require the high performance capabilities provided by out-of-order processing in the out-of-order mode. The hybrid pipeline may also execute instructions that have strict power requirements in the in-order mode where the in-order mode conserves more power compared to the out-of-order mode. Each stage in the hybrid pipeline may be activated and fully functional when the hybrid pipeline is in the out-of-order mode. However, stages in the hybrid pipeline not used for the in-order mode may be deactivated and bypassed by the instructions when the hybrid pipeline dynamically switches from the out-of-order mode to the in-order mode. The deactivated stages may then be reactivated when the hybrid pipeline dynamically switches from the in-order mode to the out-of-order mode.
    Type: Grant
    Filed: October 2, 2018
    Date of Patent: November 10, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Miguel Comparan, Andrew D. Hilton, Hans M. Jacobson, Brian M. Rogers, Robert A. Shearer, Ken V. Vu, Alfred T. Watson, III
  • Patent number: 10817295
    Abstract: A streaming multiprocessor (SM) includes a nanosleep (NS) unit configured to cause individual threads executing on the SM to sleep for a programmer-specified interval of time. For a given thread, the NS unit parses a NANOSLEEP instruction and extracts a sleep time. The NS unit then maps the sleep time to a single bit of a timer and causes the thread to sleep. When the timer bit changes, the sleep time expires, and the NS unit awakens the thread. The thread may then continue executing. The SM also includes a nanotrap (NT) unit configured to issue traps using a similar timing mechanism to that described above. For a given thread, the NT unit parses a NANOTRAP instruction and extracts a trap time. The NT unit then maps the trap time to a single bit of a timer. When the timer bit changes, the NT unit issues a trap.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: October 27, 2020
    Assignee: NVIDIA Corporation
    Inventors: Olivier Giroux, Peter Nelson, Jack Choquette, Ajay Sudarshan Tirumala
  • Patent number: 10802829
    Abstract: Aspects of the invention include tracking dependencies between instructions in an issue queue. The tracking includes, for each instruction in the issue queue, tracking a specific dependency on each of a threshold number of instructions most recently added to the issue queue prior to the instruction, tracking as a single group a dependency of the instruction on any instructions in the issue queue that are not in the threshold number of instructions, and tracking for each source register used by the instruction an indicator of whether its content is dependent on results from an instruction in the single group that has not finished execution. Based at least in part on detecting removal from the issue queue of an instruction in the single group that has issued and not finished execution, the method includes indicating that the instruction is ready for issuance or waiting for a notification that the removed instruction has finished execution.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: October 13, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joel A. Silberman, Balaram Sinharoy
  • Patent number: 10795680
    Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: October 6, 2020
    Assignee: Intel Corporation
    Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount
  • Patent number: 10747712
    Abstract: An integrated circuit includes a plurality of tiles. Each tile includes a processor, a switch including switching circuitry to forward data over data paths from other tiles to the processor and to switches of other tiles, and a switch memory that stores instruction streams that are able to operate independently for respective output ports of the switch.
    Type: Grant
    Filed: June 2, 2014
    Date of Patent: August 18, 2020
    Assignee: Massachusetts Institute of Technology
    Inventor: Anant Agarwal
  • Patent number: 10740107
    Abstract: Operation of a multi-slice processor that includes a plurality of execution slices and an instruction sequencing unit. Operation of such a multi-slice processor includes: receiving, at the instruction sequencing unit, a load instruction indicating load address data and a load data length; determining a previous store instruction in an issue queue such that store address data for the previous store instruction corresponds to the load address data, wherein the previous store instruction corresponds to a store data length; and generating, in dependence upon the store data length matching the load data length, an indication in the issue queue that indicates a dependency between the load instruction and the previous store instruction.
    Type: Grant
    Filed: June 1, 2016
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Salma Ayub, Joshua W. Bowman, Jeffrey C. Brownscheidle, Kurt A. Feiste, Dung Q. Nguyen, Salim A. Shah, Brian W. Thompto
  • Patent number: 10719316
    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
    Type: Grant
    Filed: November 9, 2017
    Date of Patent: July 21, 2020
    Assignee: INTEL CORPORATION
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
  • Patent number: 10649786
    Abstract: Embodiments are generally directed to a multithreaded processor for executing a plurality of threads, as well as an associated method and system. The multithreaded processor comprises a first control register configured to store a stack limit value, and instruction decode logic configured to, upon receiving a procedure entry instruction for a stack associated with a first thread, determine whether to throw a stack limit exception based on the stack limit value and a first predefined stack region size associated with the stack.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: May 12, 2020
    Assignee: Cisco Technology, Inc.
    Inventor: Donald E. Steiss
  • Patent number: 10552152
    Abstract: A data processing apparatus, and method of operation thereof, for executing instructions. The apparatus includes one or more host processors, each having a first processing unit, and a multi-level memory system. One or more levels of the memory system are tightly coupled to a corresponding second processing unit. At least one of the host processors includes an instruction scheduler that routes instructions selectively to at least one of the first and second processing units, dependent upon the availability of the processing units and the location, within the memory system, of data to be used when executing the instructions.
    Type: Grant
    Filed: May 27, 2016
    Date of Patent: February 4, 2020
    Assignee: Arm Limited
    Inventors: Jonathan Curtis Beard, Wendy Elsasser, Eric Van Hensbergen, Stephan Diestelhorst
  • Patent number: 10545797
    Abstract: A method for dynamically providing a status of a hardware thread/hardware resource independent of the operation of the hardware thread/hardware resource using an inter-thread communication protocol. A master hardware thread may be configured to communicate status requests to associated slave hardware threads and/or hardware resources. Each slave hardware thread/hardware resource may be configured with hardware logic configured to automatically determine status information for the slave hardware thread/hardware resource and communicate a status response to the master hardware thread without interrupting processing of the slave hardware thread/hardware resource.
    Type: Grant
    Filed: February 8, 2016
    Date of Patent: January 28, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
  • Patent number: 10534654
    Abstract: A circuit arrangement and program product for dynamically providing a status of a hardware thread/hardware resource independent of the operation of the hardware thread/hardware resource using an inter-thread communication protocol. A master hardware thread may be configured to communicate status requests to associated slave hardware threads and/or hardware resources. Each slave hardware thread/hardware resource may be configured with hardware logic configured to automatically determine status information for the slave hardware thread/hardware resource and communicate a status response to the master hardware thread without interrupting processing of the slave hardware thread/hardware resource.
    Type: Grant
    Filed: February 8, 2016
    Date of Patent: January 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
  • Patent number: 10528352
    Abstract: Blocking instruction fetching in a computer processor, includes: receiving a non-branching instruction to be executed by the computer processor; determining whether executing the non-branching instruction will cause a flush; and responsive to determining that executing the non-branching instruction will cause a flush, disabling instruction fetching for the computer processor for a time, including recoding the instruction such that the recoded instruction will be interpreted by an instruction fetch unit as an unconditional branch instruction.
    Type: Grant
    Filed: March 8, 2016
    Date of Patent: January 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Bryan G. Hickerson, Sheldon Levenstein, David S. Levitan, Albert J. Van Norstrand, Jr.
  • Patent number: 10514919
    Abstract: A data processing apparatus has processing circuitry for processing vector operands from a vector register store in response to vector micro-operations, some of which have control information identifying which data elements of the vector operands are selected for processing. Control circuitry detects vector micro-operations for which the control information specifies that a portion of the vector operand to be processed has no selected elements. If this is the case, then the control circuitry controls the processing circuitry to process a lower latency replacement micro-operation instead of the original micro-operation. This provides better performance than if a branch instruction is used to bypass the micro-operation if there are no selected elements.
    Type: Grant
    Filed: January 21, 2015
    Date of Patent: December 24, 2019
    Assignee: ARM Limited
    Inventors: Matthias Boettcher, Mbou Eyole-Monono, Giacomo Gabrielli
  • Patent number: 10474459
    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
    Type: Grant
    Filed: November 9, 2017
    Date of Patent: November 12, 2019
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
  • Patent number: 10467008
    Abstract: Methods and apparatus for identifying an effective address (EA) using an interrupt instruction tag (ITAG) in a multi-slice processor including receiving, by an instruction fetch unit of the processor, the interrupt ITAG; retrieving an effective address table (EAT) row from an EAT, wherein the EAT row comprises a range of EAs and a first ITAG of a range of ITAGs; accessing a processor instruction vector comprising a plurality of elements, each element corresponding to one of a plurality of ITAGs; applying a mask to the processor instruction vector to obtain a portion of the processor instruction vector that begins with an element corresponding to the first ITAG and is defined by an element corresponding to the interrupt ITAG; calculating an EA offset; and identifying the EA for the interrupt ITAG using the EA offset and the range of EAs in the retrieved EAT row.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: November 5, 2019
    Assignee: International Business Machines Corporation
    Inventors: David S. Levitan, Mehul Patel, Albert J. Van Norstrand, Jr., Phillip G. Williams
  • Patent number: 10467011
    Abstract: A processor of an aspect includes a decode unit to decode a thread pause instruction from a first thread. A back-end portion of the processor is coupled with the decode unit. The back-end portion of the processor, in response to the thread pause instruction, is to pause processing of subsequent instructions of the first thread for execution. The subsequent instructions occur after the thread pause instruction in program order. The back-end portion, in response to the thread pause instruction, is also to keep at least a majority of the back-end portion of the processor, empty of instructions of the first thread, except for the thread pause instruction, for a predetermined period of time. The majority may include a plurality of execution units and an instruction queue unit.
    Type: Grant
    Filed: July 21, 2014
    Date of Patent: November 5, 2019
    Assignee: Intel Corporation
    Inventors: Lihu Rappoport, Zeev Sperber, Michael Mishaeli, Stanislav Shwartsman, Lev Makovsky, Adi Yoaz, Ofer Levy
  • Patent number: 10459728
    Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.
    Type: Grant
    Filed: November 10, 2017
    Date of Patent: October 29, 2019
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
  • Patent number: 10387157
    Abstract: An instruction set conversion system and method is provided, which can convert guest instructions to host instructions for processor core execution. Through configuration, instruction sets supported by the processor core are easily expanded. A method for real-time conversion between host instruction addresses and guest instruction addresses is also provided, such that the processor core can directly read out the host instructions from a higher level cache, reducing the depth of a pipeline.
    Type: Grant
    Filed: November 26, 2014
    Date of Patent: August 20, 2019
    Assignee: SHANGHAI XINHAO MICROELECTRONICS CO. LTD.
    Inventor: Kenneth Chenghao Lin
  • Patent number: 10338920
    Abstract: A processor includes an execution unit to execute instructions to get data elements of the same type from multiple data structures packed in vector registers. The execution unit includes logic to extract data elements from specific positions within each data structure dependent on an instruction encoding. A vector GET3 instruction encoding specifies that data elements be extracted from the first, second, or third position in each XYZ-type data structure. A vector GET4 instruction encoding specifies that data elements be extracted from the first, second, third, or fourth position in each XYZW-type data structure and that the extracted data elements be placed in the upper or lower half of a destination vector. The execution unit includes logic to place the extracted data elements in contiguous locations in the destination vector. The execution unit includes logic to store the destination vector to a destination vector register specified in the instruction.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: July 2, 2019
    Assignee: Intel Corporation
    Inventor: Elmoustapha Ould-Ahmed-Vall
  • Patent number: 10324725
    Abstract: The disclosure provides a method and a system for identifying and replacing code translations that generate spurious fault events. In one embodiment the method includes executing a first set and a second set of native instructions, performing a third translation of a target instruction to form a third set of native instructions in response to a determination that a fault occurrence is attributed to a first translation, wherein the third set of native instructions is not the same as the second set of native instructions, and the third set of native instructions is not the same as the first set of native instructions, and executing the third set of native instructions.
    Type: Grant
    Filed: March 8, 2018
    Date of Patent: June 18, 2019
    Assignee: Nvidia Corporation
    Inventors: Nathan Tuck, David Dunn, Ross Segelken, Madhu Swarna