Patents Examined by Daniel H. Pan

Decimal load immediate instruction

Patent number: 10430185

Abstract: An instruction generates a value for use in processing within a computing environment. The instruction obtains a sign control associated with the instruction, and shifts an input value of the instruction in a specified direction by a selected amount to provide a result. The result is placed in a first designated location in a register, and the sign, which is based on the sign control, is placed in a second designated location of the register. The result and the sign provide a signed value to be used in processing within the computing environment.
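
The shift-and-sign behavior in this abstract can be illustrated with a minimal sketch. The abstract does not give the register layout, so the sketch assumes a packed-decimal result (one 4-bit digit per nibble) with the sign in the lowest nibble, a left shift counted in decimal digits, and the sign codes 0xC (plus) and 0xD (minus). The function name and parameters are hypothetical.

```python
# Assumed sign codes for a packed-decimal sign nibble (not stated in the abstract).
SIGN_PLUS, SIGN_MINUS = 0xC, 0xD


def decimal_load_immediate(value: int, shift: int, negative: bool) -> int:
    """Shift a non-negative magnitude left by `shift` decimal digits,
    pack it as BCD digits, and place a sign nibble in the lowest position."""
    shifted = value * 10 ** shift          # the "result" of the shift
    packed = 0
    for digit in str(shifted):             # one 4-bit nibble per decimal digit
        packed = (packed << 4) | int(digit)
    sign = SIGN_MINUS if negative else SIGN_PLUS
    return (packed << 4) | sign            # digits first, sign nibble last


print(hex(decimal_load_immediate(12, 2, True)))
```

Here the digits occupy the "first designated location" and the sign nibble the "second designated location"; a real encoding may place them differently.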

Type: Grant

Filed: November 8, 2017

Date of Patent: October 1, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Reid T. Copeland, Silvia Melitta Mueller

Method for a delayed branch implementation by using a front end track table

Patent number: 10417000

Abstract: A method for a delayed branch implementation by using a front end track table. The method includes receiving an incoming instruction sequence using a global front end, wherein the instruction sequence includes at least one branch, creating a delayed branch in response to receiving the one branch, and using a front end track table to track both the delayed branch and the one branch.

Type: Grant

Filed: October 13, 2017

Date of Patent: September 17, 2019

Assignee: Intel Corporation

Inventor: Mohammad Abdallah

Hazard detection of out-of-order execution of load and store instructions in processors without using real addresses

Patent number: 10417002

Abstract: Technical solutions are described for hazard detection of out-of-order execution of load and store instructions without using real addresses in a processing unit. An example includes an out-of-order load-store unit (LSU) for transferring data between memory and registers. The LSU detects a store-hit-load (SHL) in an out-of-order execution of instructions based only on effective addresses by: determining an effective address associated with a store instruction; determining whether a load instruction entry using said effective address is present in a load reorder queue; and indicating that a SHL has been detected based at least in part on determining that load instruction entry using said effective address is present in the load reorder queue. The LSU, in response to detecting the SHL, flushes instructions starting from a load instruction corresponding to the load instruction entry.
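
The store-hit-load check in this abstract can be sketched roughly as follows. The queue representation, the program-order `tag` field, and the rule that only loads younger than the store count as hits are assumptions made for illustration; the abstract only states that the lookup uses effective addresses and that the flush starts from the matching load.

```python
from dataclasses import dataclass


@dataclass
class LoadEntry:
    tag: int  # hypothetical program-order tag (larger means younger)
    ea: int   # effective address used by the already-executed load


def detect_shl(load_reorder_queue, store_tag, store_ea):
    """Return the tag of the load to flush from, or None if no SHL is found.

    Only effective addresses are compared; no real address is consulted.
    """
    hits = [e.tag for e in load_reorder_queue
            if e.ea == store_ea and e.tag > store_tag]  # assumed: younger loads only
    return min(hits) if hits else None  # flush starts at the oldest offending load


queue = [LoadEntry(5, 0x100), LoadEntry(7, 0x200), LoadEntry(9, 0x100)]
print(detect_shl(queue, store_tag=6, store_ea=0x100))
```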

Type: Grant

Filed: October 6, 2017

Date of Patent: September 17, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bryan Lloyd, Balaram Sinharoy, Shih-Hsiung S. Tung

Multi-multidimensional computer architecture for big data applications

Patent number: 10417005

Abstract: A data processing apparatus is provided comprising a front-end interface electronically coupled to a main processor. The front-end interface is configured to receive data stored in a repository, in particular an external storage and/or a network, determine whether the data is a single-access data or a multiple-access data by analyzing an access parameter designating the data, route the multiple-access data for processing by the main processor, and route the single-access data for pre-processing by the front-end interface and routing results of the pre-processing to the main processor.

Type: Grant

Filed: September 11, 2017

Date of Patent: September 17, 2019

Assignee: Huawei Technologies Co., Ltd.

Inventors: Uri Weiser, Tal Horowitz, Jintang Wang

Method and apparatus for detecting memory conflicts using distinguished memory addresses

Patent number: 10402201

Abstract: A method and apparatus for detecting potential memory conflicts in a parallel computing environment by executing two parallel program threads. The parallel program threads include special operands that are used by a processing core to identify memory addresses that have the potential for conflict. These memory addresses are combined into a composite access record for each thread. The composite access records are compared to each other in order to detect a potential memory conflict.
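
One way to picture a composite access record is as a fixed-width bitmask into which each distinguished address is hashed. This is an assumption for illustration: the abstract does not say how the record is encoded, and the modulo hash and 64-bit width below are hypothetical. With such an encoding the comparison can report false positives but never misses a shared address.

```python
def composite_record(addresses, bits=64):
    """Fold a thread's distinguished addresses into one bitmask record."""
    record = 0
    for addr in addresses:
        record |= 1 << (addr % bits)  # assumed hash: address modulo record width
    return record


def may_conflict(record_a, record_b):
    """Two threads potentially conflict if their records share any set bit."""
    return (record_a & record_b) != 0


thread_a = composite_record([0x10, 0x24])
thread_b = composite_record([0x24, 0x31])
print(may_conflict(thread_a, thread_b))
```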

Type: Grant

Filed: March 9, 2017

Date of Patent: September 3, 2019

Inventors: Joel Kevin Jones, Ananth Jasty

Executing load-store operations without address translation hardware per load-store unit port

Patent number: 10394558

Abstract: Technical solutions are described for out-of-order (OoO) execution of one or more instructions by a processing unit. An example includes receiving, by a load-store unit (LSU) of the processing unit, an OoO window of instructions including a plurality of instructions to be executed OoO, and issuing, by the LSU, instructions from the OoO window. The issuing includes selecting an instruction from the OoO window, the instruction using an effective address. Further, in response to the instruction being a load instruction, it is determined whether the effective address is present in an effective address directory (EAD). In response to the effective address being present in the EAD, the load instruction is issued using the effective address. Further, in response to the instruction being a store instruction, a real address mapped to the effective address is determined from an effective-real translation (ERT) table, and the store instruction is issued using the real address.
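
The issue decision described in this abstract can be sketched as a small function. The data structures are assumptions: the EAD is modeled as a set of effective addresses and the ERT as a dictionary from effective to real addresses. The abstract does not say what happens when a load misses in the EAD, so the sketch simply returns `None` in that case.

```python
def issue(instr, ead, ert):
    """Decide how one instruction from the OoO window is issued.

    instr: (kind, effective_address) tuple, a hypothetical encoding.
    ead:   set of effective addresses present in the EAD (assumed model).
    ert:   dict mapping effective address -> real address (assumed model).
    """
    kind, ea = instr
    if kind == "load":
        if ea in ead:
            return ("load", "EA", ea)    # loads issue with the effective address
        return None                       # EAD miss: behavior not covered here
    if kind == "store":
        return ("store", "RA", ert[ea])  # stores issue with the translated real address
    raise ValueError(f"unknown instruction kind: {kind}")


print(issue(("store", 0x20), ead={0x10}, ert={0x20: 0x9020}))
```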

Type: Grant

Filed: October 6, 2017

Date of Patent: August 27, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christopher Gonzalez, Bryan Lloyd, Balaram Sinharoy

Speculative and iterative execution of delayed data flow graphs

Patent number: 10394729

Abstract: A system for executing a data flow graph comprises: at least two first actors each comprising means for independently executing a computation of a same data set comprising at least one datum, and producing a quality descriptor of the data set, the execution of the computation by each of at least two first actors being triggered by a synchronization system; a third actor, comprising means for triggering the execution of the computation by each of at least two first actors, and initializing a clock configured to emit an interrupt signal when a duration has elapsed; a fourth actor, comprising means for executing, at the latest at the interrupt signal from the clock: the selection, from the set of at least two first actors having produced a quality descriptor, of the one whose descriptor exhibits the most favorable value; the transfer of the data set computed by the selected actor.

Type: Grant

Filed: September 22, 2015

Date of Patent: August 27, 2019

Assignee: COMMISSARIAT A L'ENERGIE ATOMIQUE ET AUX ENERGIES ALTERNATIVES

Inventors: Paul Dubrulle, Thierry Goubier, Stéphane Louise

Synchronizing a set of code branches

Patent number: 10387153

Abstract: Techniques for synchronizing a set of code branches are disclosed. A synchronization process is triggered by an event and/or a schedule. The synchronization process includes traversing each code branch, such that parent branches of a particular branch are “in sync” prior to being merged into the particular branch. In an embodiment, a hierarchical order for a set of branches is determined. The branch represented by the top node of the hierarchical order does not have any parents. A branch that is a child of the branch represented by the top node is in the second level of the hierarchical order. The branch in the second level is updated by incorporating the current state of the branch represented by the top node. Thereafter, each branch is iteratively updated by incorporating the current state of the branch's parent branch. Hence, changes to any parent branch are propagated through all its descendant branches.
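
The top-down propagation in this abstract amounts to visiting branches level by level so that every parent is up to date before its children are merged. The sketch below assumes each branch has at most one parent and models a branch's state as a set of change identifiers; both are simplifications, and the function names are hypothetical.

```python
from collections import deque


def sync_order(parent_of):
    """Return branches in hierarchical order: every parent before its children."""
    children, roots = {}, []
    for branch, parent in parent_of.items():
        if parent is None:
            roots.append(branch)              # top node: no parents
        else:
            children.setdefault(parent, []).append(branch)
    order, queue = [], deque(roots)
    while queue:                              # breadth-first, level by level
        branch = queue.popleft()
        order.append(branch)
        queue.extend(children.get(branch, []))
    return order


def synchronize(parent_of, state):
    """Merge each parent's current state into its child, top-down."""
    for branch in sync_order(parent_of):
        parent = parent_of[branch]
        if parent is not None:
            state[branch] |= state[parent]    # stand-in for a real merge
    return state


parents = {"main": None, "dev": "main", "feature": "dev"}
changes = {"main": {"a"}, "dev": {"b"}, "feature": {"c"}}
print(synchronize(parents, changes)["feature"])
```

Because `dev` is updated before `feature`, a change made on `main` reaches every descendant in a single pass.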

Type: Grant

Filed: November 27, 2017

Date of Patent: August 20, 2019

Assignee: Oracle International Corporation

Inventors: Maurizio Cimadamore, Brian Goetz

Techniques for capturing state information and performing actions for threads in a multi-threaded computing environment

Patent number: 10387161

Abstract: Techniques are disclosed for implementing an extensible, light-weight, flexible (ELF) processing platform that can efficiently capture state information from multiple threads during execution of instructions (e.g., an instance of a game). The ELF processing platform supports execution of multiple threads in a single process for parallel execution of multiple instances of the same or different program code or games. Upon capturing the state information, one or more threads may be executed in the ELF platform to compute one or more actions to perform at any state of execution by each of those threads. The threads can easily access the state information from a shared memory space and use the state information to implement rule-based and/or learning-based techniques for determining subsequent actions for execution for the threads.

Type: Grant

Filed: September 1, 2017

Date of Patent: August 20, 2019

Assignee: Facebook, Inc.

Inventors: Yuandong Tian, Qucheng Gong, Yuxin Wu

Processors, methods, systems, and instructions to load multiple data elements to destination storage locations other than packed data registers

Patent number: 10379855

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate a packed data register of the plurality of packed data registers that is to store a source packed memory address information. The source packed memory address information is to include a plurality of memory address information data elements. An execution unit is coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the instruction, is to load a plurality of data elements from a plurality of memory addresses that are each to correspond to a different one of the plurality of memory address information data elements, and store the plurality of loaded data elements in a destination storage location. The destination storage location does not include a register of the plurality of packed data registers.

Type: Grant

Filed: September 30, 2016

Date of Patent: August 13, 2019

Assignee: Intel Corporation

Inventors: William C. Hasenplaugh, Chris J. Newburn, Simon C. Steely, Jr., Samantika S. Sury

Processor prefetch throttling based on short streams

Patent number: 10379864

Abstract: In an embodiment, a processor comprises a prefetch history array and a prefetch circuit. The prefetch history array comprises a plurality of entries corresponding to prefetch addresses, each entry of the plurality of entries comprising a sublength value associated with a frequency that a stride is repeated. The prefetch circuit is to: for each entry of the plurality of entries, adjust the sublength value based on stride matches for an address of the entry; adjust a short stream counter based on the sublength values of the plurality of entries in the prefetch history array; determine whether the short stream counter has exceeded a throttling threshold; and in response to a determination that the short stream counter has exceeded the throttling threshold, throttle a prefetch level of the prefetch circuit. Other embodiments are described and claimed.
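
A rough software model of the throttling decision is sketched below. The update rules are assumptions: the abstract does not say how the sublength value changes on a stride match or mismatch, what counts as a "short" stream, or how the short stream counter is maintained, so the sketch increments on a match, resets on a mismatch, and recomputes the counter from the whole history. All names and thresholds are hypothetical.

```python
def update_entry(entry, address):
    """Adjust one history entry's sublength based on whether the stride repeats."""
    stride = address - entry["last_addr"]
    if stride == entry["stride"]:
        entry["sublength"] += 1       # stride repeated: the stream grows longer
    else:
        entry["stride"] = stride      # new stride: start counting again
        entry["sublength"] = 0
    entry["last_addr"] = address


def should_throttle(history, short_len=2, threshold=2):
    """Count entries whose streams are short and compare against a threshold."""
    short_stream_counter = sum(1 for e in history if e["sublength"] < short_len)
    return short_stream_counter > threshold


entry = {"last_addr": 100, "stride": 4, "sublength": 0}
update_entry(entry, 104)
print(entry["sublength"], should_throttle([entry]))
```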

Type: Grant

Filed: December 26, 2016

Date of Patent: August 13, 2019

Assignee: Intel Corporation

Inventors: Chunhui Zhang, Seth H. Pugsley, Mark J. Dechene

Optimize control-flow convergence on SIMD engine using divergence depth

Patent number: 10379869

Abstract: There are provided a system, a method and a computer program product for selecting an active data stream (a lane) while running SPMD (Single Program Multiple Data) code on a SIMD (Single Instruction Multiple Data) machine. The machine runs an instruction stream over input data streams. The machine increments lane depth counters of all active lanes upon the thread-PC reaching a branch operation. The machine updates the lane-PC of each active lane according to targets of the branch operation. The machine selects an active lane and activates only lanes whose lane-PCs match the thread-PC. The machine decrements the lane depth counters of the selected active lanes and updates the lane-PC of each active lane upon the instruction stream reaching a first instruction. The machine assigns the lane-PC of a lane with a largest lane depth counter value to the thread-PC and activates all lanes whose lane-PCs match the thread-PC.

Type: Grant

Filed: February 7, 2018

Date of Patent: August 13, 2019

Assignee: International Business Machines Corporation

Inventors: Gheorghe Almasi, Jose Moreira, Jessica H. Tseng, Peng Wu

Systems, apparatuses, and methods for setting an output mask in a destination writemask register from a source write mask register using an input writemask and immediate

Patent number: 10372450

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.

Type: Grant

Filed: July 11, 2017

Date of Patent: August 6, 2019

Assignee: Intel Corporation

Inventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan

Hardware processors and methods for tightly-coupled heterogeneous computing

Patent number: 10372668

Abstract: Methods and apparatuses relating to tightly-coupled heterogeneous computing are described. In one embodiment, a hardware processor includes a plurality of execution units in parallel, a switch to connect inputs of the plurality of execution units to outputs of a first buffer and a plurality of memory banks and connect inputs of the plurality of memory banks and a plurality of second buffers in parallel to outputs of the first buffer, the plurality of memory banks, and the plurality of execution units, and an offload engine with inputs connected to outputs of the plurality of second buffers.

Type: Grant

Filed: January 12, 2018

Date of Patent: August 6, 2019

Assignee: Intel Corporation

Inventors: Chang Yong Kang, Pierre Laurent, Hari K. Tadepalli, Prasad M. Ghatigar, T.J. O'Dwyer, Serge Zhilyaev

Non-default instruction handling within transaction

Patent number: 10365927

Abstract: Embodiments relate to non-default instruction handling within a transaction. An aspect includes entering a transaction, the transaction comprising a first plurality of instructions and a second plurality of instructions, wherein a default manner of handling of instructions in the transaction is one of atomic and non-atomic. Another aspect includes encountering a non-default specification instruction in the transaction, wherein the non-default specification instruction comprises a single instruction that specifies the second plurality of instructions of the transaction for handling in a non-default manner comprising one of atomic and non-atomic, wherein the non-default manner is different from the default manner. Another aspect includes handling the first plurality of instructions in the default manner. Yet another aspect includes handling the second plurality of instructions in the non-default manner.

Type: Grant

Filed: November 9, 2017

Date of Patent: July 30, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Maged M. Michael, Eric M. Schwarz, Valentina Salapura, Chung-Lung K. Shum

Spin loop delay instruction

Patent number: 10365929

Abstract: A Spin Loop Delay instruction. The instruction has a field associated therewith that indicates one or more conditions to be checked. Dispatching of the instruction is initially delayed. The instruction is subsequently dispatched based on a timeout, provided the instruction has not been previously dispatched based on meeting at least one condition of the one or more conditions to be checked.
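
The dispatch rule in this abstract (dispatch when a checked condition is met, otherwise on timeout) can be modeled in a few lines. The cycle-based timing and the representation of conditions as a set of cycle numbers are assumptions made purely for illustration.

```python
def dispatch_cycle(condition_met_at, timeout):
    """Return (cycle, reason) for when a delayed instruction is dispatched.

    condition_met_at: set of cycle numbers at which at least one of the
                      checked conditions holds (hypothetical model).
    timeout:          cycle at which the instruction is dispatched regardless.
    """
    for cycle in range(timeout):
        if cycle in condition_met_at:
            return cycle, "condition"  # dispatched early; the timeout no longer applies
    return timeout, "timeout"          # no condition was met in time


print(dispatch_cycle({3, 8}, timeout=10))
```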

Type: Grant

Filed: November 13, 2017

Date of Patent: July 30, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Fadi Y. Busaba, Christian Jacobi, Anthony Saporito, Eric M. Schwarz, Timothy J. Slegel

Fetch unit for predicting target for subroutine return instructions

Patent number: 10360037

Abstract: A fetch unit configured to, in response to detecting a subroutine call and link instruction, calculate and store a predicted target address for the corresponding subroutine return instruction in a prediction stack, and if certain conditions are met, also cause to be stored in the prediction stack a predicted target instruction bundle. The fetch unit is also configured to, in response to detecting a subroutine return instruction, use the predicted target address in the prediction stack to determine the address of the next instruction bundle to be fetched, and if certain conditions are met, cause any valid predicted target instruction bundle in the prediction stack to be the next bundle to be decoded.
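
The prediction stack in this abstract behaves like a bounded last-in, first-out store of return targets, optionally paired with a pre-fetched instruction bundle. The sketch below is a simplified software model: the stack depth, the fixed instruction size used to compute the return address, and the policy of discarding the oldest entry on overflow are all assumptions, and the "certain conditions" mentioned in the abstract are reduced to whether a bundle was supplied.

```python
class PredictionStack:
    """Bounded stack of (predicted_target_address, predicted_bundle) pairs."""

    def __init__(self, depth=8):
        self.depth = depth      # assumed capacity
        self.entries = []

    def on_call(self, call_pc, instr_size=4, bundle=None):
        """On a call-and-link, push the address following the call."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)                 # assumed: drop the oldest entry
        self.entries.append((call_pc + instr_size, bundle))

    def on_return(self):
        """On a return, pop the prediction; (None, None) if the stack is empty."""
        return self.entries.pop() if self.entries else (None, None)


stack = PredictionStack()
stack.on_call(0x100)
stack.on_call(0x200, bundle="B")
print(stack.on_return())
```

When a bundle is present in the popped entry, the fetch unit could pass it straight to decode instead of waiting for a fetch from the predicted address.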

Type: Grant

Filed: September 30, 2016

Date of Patent: July 23, 2019

Assignee: MIPS Tech, LLC

Inventor: Philip Day

Associating working sets and threads

Patent number: 10353736

Abstract: Associating working sets and threads is disclosed. An indication of a stalling event is received. In response to receiving the indication of the stalling event, a state of a processor associated with the stalling event is saved. At least one of an identifier of a guest thread running in the processor and a guest physical address referenced by the processor is obtained from the saved processor state.

Type: Grant

Filed: August 25, 2017

Date of Patent: July 16, 2019

Assignee: TidalScale, Inc.

Inventors: Isaac R. Nassi, Kleoni Ioannidou, David P. Reed, I-Chun Fang, Michael Berman, Mark Hill, Brian Moffet

Systems and methods for using error correction and pipelining techniques for an access triggered computer architecture

Patent number: 10353681

Abstract: A method for improving performance of an access triggered architecture for a computer implemented application is provided. The method first executes typical operations of the access triggered architecture according to an execution time, wherein the typical operations comprise: obtaining a dataset and an instruction set; and using the instruction set to transmit the dataset to a functional block associated with an operation, wherein the functional block performs the operation using the dataset to generate a revised dataset. The method further creates a pipeline of the typical operations to reduce the execution time of the typical operations, to create a reduced execution time; and executes the typical operations according to the reduced execution time, using the pipeline.

Type: Grant

Filed: August 28, 2017

Date of Patent: July 16, 2019

Assignee: HONEYWELL INTERNATIONAL INC.

Inventors: Thom Kreider, Jon Douglas Gilreath, Gary Warnica, Paul D. Kammann, Vince J. Gavagan, IV, Ronald E. Strong

Vector processor configured to operate on variable length vectors with asymmetric multi-threading

Patent number: 10339094

Abstract: A computer processor is disclosed. The computer processor comprises one or more processor resources. The computer processor further comprises a plurality of hardware thread units coupled to the one or more processor resources. The computer processor may be configured to permit simultaneous access to the one or more processor resources by only a subset of hardware thread units of the plurality of hardware thread units. The number of hardware threads in the subset may be less than the total number of hardware threads of the plurality of hardware thread units.

Type: Grant

Filed: May 19, 2015

Date of Patent: July 2, 2019

Assignee: OPTIMUM SEMICONDUCTOR TECHNOLOGIES, INC.

Inventors: Mayan Moudgill, Gary J. Nacer, C. John Glossner, Arthur Joseph Hoane, Paul Hurtley, Murugappan Senthilvelan, Pablo Balzola, Vitaly Kalashnikov, Sitij Agrawal
