Patents Examined by Jyoti Mehta
  • Patent number: 9658877
    Abstract: The disclosure relates generally to techniques, methods, and apparatus for controlling context switching at a central processing unit. Alternatively, methods and apparatus are provided for securing memory blocks. Alternatively, methods and apparatus are provided for enabling transactional processing using a multi-core device.
    Type: Grant
    Filed: August 23, 2010
    Date of Patent: May 23, 2017
    Assignee: EMPIRE TECHNOLOGY DEVELOPMENT LLC
    Inventor: James Barwick
  • Patent number: 9645949
    Abstract: Embodiments of the invention relate to a data processing apparatus including a processor adapted to operate under control of an executable comprising instructions, and in any of a plurality of operating modes including a non-privileged mode and a privileged mode, the apparatus comprising: means for storing a plurality of stacks; a first stack pointer register for storing a pointer to an address in a first of said stacks; a second stack pointer register for storing a pointer to an address in a second of said stacks, wherein said processing apparatus is adapted to use said second stack pointer when said processor is operating in either the non-privileged mode or the privileged mode; and means for transferring operation of said processor from the non-privileged mode to the privileged mode in response to at least one of said instructions. Embodiments of the invention also relate to a method of operating a data processing apparatus.
    Type: Grant
    Filed: May 27, 2009
    Date of Patent: May 9, 2017
    Assignee: Cambridge Consultants Ltd.
    Inventors: Alistair G. Morfey, Karl Leighton Swepson, Peter Giles Lloyd
  • Patent number: 9645866
    Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.
    Type: Grant
    Filed: September 16, 2011
    Date of Patent: May 9, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Alexei V. Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
  • Patent number: 9632778
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a system for packed loading and storing of distributed data. The system includes memory and a processing element configured to communicate with the memory. The processing element is configured to perform a method including fetching and decoding an instruction for execution by the processing element. Based on the instruction, a plurality of individually addressable data elements, which are narrower than a nominal width of register file elements in the processing element, is gathered from non-contiguous locations in the memory. The processing element packs and loads the data elements into register file elements of a register file entry based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: April 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
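    The gather-and-pack behaviour described in this abstract can be pictured with the minimal Python sketch below; the 8-bit element width, 64-bit register width, and little-endian packing order are assumptions of the example, not details taken from the patent.
```python
# Illustrative model of packed gather-and-load: narrow data elements fetched
# from non-contiguous memory addresses are packed so that several of them
# share a single wide register file element.

ELEMENT_BITS = 8          # width of each gathered data element (assumed)
REGISTER_BITS = 64        # nominal width of a register file element (assumed)
PER_REGISTER = REGISTER_BITS // ELEMENT_BITS

def packed_gather_load(memory, addresses):
    """Gather bytes at the given (non-contiguous) addresses and pack them
    into 64-bit register file elements, several bytes per element."""
    registers = []
    current, count = 0, 0
    for addr in addresses:
        element = memory[addr] & ((1 << ELEMENT_BITS) - 1)
        current |= element << (count * ELEMENT_BITS)   # pack little-endian
        count += 1
        if count == PER_REGISTER:
            registers.append(current)
            current, count = 0, 0
    if count:                                          # partially filled element
        registers.append(current)
    return registers

if __name__ == "__main__":
    mem = {0x10: 0xAA, 0x53: 0xBB, 0x9F: 0xCC, 0x200: 0xDD}
    regs = packed_gather_load(mem, [0x10, 0x53, 0x9F, 0x200])
    print([hex(r) for r in regs])   # all four bytes packed into one register element
```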
  • Patent number: 9632777
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a method for packed loading and storing of data distributed in a system that includes memory and a processing element. The method includes fetching and decoding an instruction for execution by the processing element. Based on the instruction, the processing element gathers a plurality of individually addressable data elements, which are narrower than a nominal width of register file elements in the processing element, from non-contiguous locations in the memory. The data elements are packed and loaded into register file elements of a register file entry by the processing element based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: April 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
  • Patent number: 9619228
    Abstract: The data processor includes a CPU operable to execute an instruction included in an instruction set. The instruction set includes a load instruction for reading data on a memory space. The data read according to the load instruction includes data of a format type having a data-read-branching-occurrence bit region. The CPU includes a data-read-branching control register, a data-read-branching address register, and a read-data-analyzing unit. On condition that a bit value showing the occurrence of data read branching has been set in the data-read-branching-occurrence bit region, and a value showing that the data-read-branching-occurrence bit remains valid has been set in the data-read-branching control register, switching between processes is performed by branching to an address stored in the data-read-branching address register.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: April 11, 2017
    Assignee: Renesas Electronics Corporation
    Inventors: Takafumi Yuasa, Hiroaki Nakata, Motoki Kimura, Kazushi Akie
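    A minimal behavioural sketch of the data-read-branching mechanism in this abstract is given below; the bit position of the occurrence flag, the register names, and the 4-byte fall-through increment are illustrative assumptions.
```python
# If loaded data carries a "branching occurrence" bit and the control register
# marks that bit as valid, execution branches to the address held in a
# dedicated branch-address register.

BRANCH_OCCURRENCE_BIT = 1 << 31   # assumed position of the occurrence bit

class Cpu:
    def __init__(self):
        self.pc = 0
        self.branch_control = 0     # 1 = occurrence bit is treated as valid
        self.branch_address = 0     # target used when branching is triggered

    def load(self, memory, address):
        data = memory[address]
        if (data & BRANCH_OCCURRENCE_BIT) and self.branch_control:
            self.pc = self.branch_address        # switch processes by branching
        else:
            self.pc += 4                         # fall through to next instruction
        return data & ~BRANCH_OCCURRENCE_BIT     # strip the control bit

if __name__ == "__main__":
    cpu = Cpu()
    cpu.branch_control = 1
    cpu.branch_address = 0x1000
    mem = {0x40: BRANCH_OCCURRENCE_BIT | 0x7F}
    value = cpu.load(mem, 0x40)
    print(hex(cpu.pc), hex(value))   # 0x1000 0x7f
```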
  • Patent number: 9612844
    Abstract: A method and apparatus are provided for executing instructions of a multi-threaded processor having multiple hardware threads (32, 34) with differing hardware resources. The method comprises receiving a plurality of streams of instructions (38, 44), determining which hardware threads are able to receive instructions for execution (40, 46), determining whether a thread found to be available for executing an instruction has the hardware resources required by that instruction (36), and executing the instruction in dependence on the result of that determination (50).
    Type: Grant
    Filed: January 18, 2010
    Date of Patent: April 4, 2017
    Assignee: Imagination Technologies Limited
    Inventor: Andrew Webber
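    The issue decision in this abstract can be sketched as below; the resource names, queue structure, and one-instruction-per-thread-per-cycle policy are assumptions of the example rather than details from the patent.
```python
# A thread's instruction is executed only if the thread can accept
# instructions and the hardware resources that instruction needs are free.

def issue(threads, free_resources):
    """Return the instructions issued this cycle under the availability checks."""
    issued = []
    for thread in threads:
        if not thread["ready"]:
            continue
        instr = thread["queue"][0] if thread["queue"] else None
        if instr is None:
            continue
        if instr["needs"] <= free_resources:      # all required resources free?
            free_resources -= instr["needs"]
            issued.append((thread["id"], instr["op"]))
            thread["queue"].pop(0)
    return issued

if __name__ == "__main__":
    threads = [
        {"id": 0, "ready": True,  "queue": [{"op": "fmul", "needs": {"fpu"}}]},
        {"id": 1, "ready": True,  "queue": [{"op": "add",  "needs": {"alu"}}]},
        {"id": 2, "ready": False, "queue": [{"op": "ld",   "needs": {"lsu"}}]},
    ]
    # Only thread 1 issues: no FPU is free and thread 2 is not ready.
    print(issue(threads, {"alu", "lsu"}))
```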
  • Patent number: 9606797
    Abstract: In one embodiment, the present invention includes a processor with a vector execution unit to execute a vector instruction on a vector having a plurality of individual data elements, where the vector instruction is of a first width and the vector execution unit is of a smaller width. The processor further includes a control logic coupled to the vector execution unit to compress a number of execution cycles consumed in execution of the vector instruction when at least some of the individual data elements are not to be operated on by the vector instruction. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: March 28, 2017
    Assignee: Intel Corporation
    Inventors: Aniruddha S. Vaidya, Anahita Shayesteh, Dong Hyuk Woo, Saikat Saharoy, Mani Azimi
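    The cycle-compression idea in this abstract is sketched below; the 16-lane vector, 4-lane execution unit, and mask encoding are assumptions chosen for illustration.
```python
# A wide vector instruction is executed on a narrower unit over several
# passes, and passes whose lanes are all masked off are skipped entirely.

VECTOR_LANES = 16      # logical vector width (assumed)
UNIT_LANES = 4         # physical execution unit width (assumed)

def execute_vector(op, src_a, src_b, mask):
    """Apply `op` lane-wise, skipping execution cycles whose lanes are all inactive."""
    result = list(src_a)            # inactive lanes keep their old value
    cycles = 0
    for base in range(0, VECTOR_LANES, UNIT_LANES):
        chunk = range(base, base + UNIT_LANES)
        if not any(mask[i] for i in chunk):
            continue                # compressed away: no cycle consumed
        cycles += 1
        for i in chunk:
            if mask[i]:
                result[i] = op(src_a[i], src_b[i])
    return result, cycles

if __name__ == "__main__":
    a = list(range(16))
    b = [10] * 16
    mask = [1, 1, 0, 0] + [0] * 8 + [0, 1, 1, 1]   # only first and last chunks active
    out, cycles = execute_vector(lambda x, y: x + y, a, b, mask)
    print(out, cycles)   # 2 cycles instead of 4
```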
  • Patent number: 9600289
    Abstract: Methods and processors for managing load-store dependencies in an out-of-order instruction pipeline. A load-store dependency predictor includes a table for storing entries for load-store pairs that have been found to be dependent and execute out of order. Each entry in the table includes hashed values to identify load and store operations. When a load or store operation is detected, the PC and an architectural register number are used to create a hashed value that can be used to uniquely identify the operation. Then, the load-store dependency predictor table is searched for any matching entries with the same hashed value.
    Type: Grant
    Filed: May 30, 2012
    Date of Patent: March 21, 2017
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, John H. Mylius, Gerard R. Williams, III, Suparn Vats
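    One way to picture the hashed load-store pair table in this abstract is the Python sketch below; the hash function, table size, and FIFO replacement are assumptions of the example, not details from the patent.
```python
# Entries are identified by a hash of an operation's PC and an architectural
# register number; a lookup checks for a matching hashed value.

TABLE_SIZE = 256   # assumed number of predictor entries

def hash_op(pc, arch_reg):
    """Combine PC and architectural register number into a small hashed tag."""
    return ((pc >> 2) ^ (arch_reg * 0x9E3779B1)) & 0xFFFF

class LoadStoreDependencyPredictor:
    def __init__(self):
        self.table = []   # list of (store_hash, load_hash) pairs found dependent

    def train(self, store_pc, store_reg, load_pc, load_reg):
        """Record a load-store pair that was observed to execute out of order."""
        entry = (hash_op(store_pc, store_reg), hash_op(load_pc, load_reg))
        if entry not in self.table:
            self.table.append(entry)
            if len(self.table) > TABLE_SIZE:
                self.table.pop(0)           # simple FIFO replacement (assumed)

    def load_must_wait(self, load_pc, load_reg):
        """Return True if the load matches a recorded dependent pair."""
        h = hash_op(load_pc, load_reg)
        return any(load_hash == h for _, load_hash in self.table)

if __name__ == "__main__":
    predictor = LoadStoreDependencyPredictor()
    predictor.train(store_pc=0x4000, store_reg=3, load_pc=0x4010, load_reg=3)
    print(predictor.load_must_wait(0x4010, 3))   # True: hold the load behind the store
    print(predictor.load_must_wait(0x5000, 7))   # False: no dependency recorded
```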
  • Patent number: 9582276
    Abstract: Methods and processors for enforcing an order of memory access requests in the presence of barriers in an out-of-order processor pipeline. A speculative color is assigned to instruction operations in the front-end of the processor pipeline, while the instruction operations are still in order. The instruction operations are placed in any of multiple reservation stations and then issued out-of-order from the reservation stations. When a barrier is encountered in the front-end, the speculative color is changed, and subsequent instruction operations are assigned the new speculative color. A core interface unit maintains an architectural color, and the architectural color is changed when a barrier retires. The core interface unit stalls instruction operations with a speculative color that does not match the architectural color.
    Type: Grant
    Filed: September 27, 2012
    Date of Patent: February 28, 2017
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, Gerard R. Williams, III
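    The speculative/architectural coloring in this abstract can be modelled roughly as below; the two-bit color width and the simple wrap-around increment are assumptions of the example.
```python
# The front end stamps each operation with the current speculative color and
# bumps the color at every barrier; the core interface holds an architectural
# color that advances when a barrier retires, and stalls any operation whose
# speculative color does not yet match it.

COLOR_BITS = 2   # assumed color width

class ColorTracker:
    def __init__(self):
        self.speculative_color = 0    # assigned in the in-order front end
        self.architectural_color = 0  # advanced as barriers retire

    def tag(self, op):
        """Stamp an operation with the current speculative color."""
        return {"op": op, "color": self.speculative_color}

    def barrier_dispatched(self):
        self.speculative_color = (self.speculative_color + 1) % (1 << COLOR_BITS)

    def barrier_retired(self):
        self.architectural_color = (self.architectural_color + 1) % (1 << COLOR_BITS)

    def can_issue_to_memory(self, tagged_op):
        """Operations behind an unretired barrier (color mismatch) must stall."""
        return tagged_op["color"] == self.architectural_color

if __name__ == "__main__":
    t = ColorTracker()
    before = t.tag("load A")
    t.barrier_dispatched()
    after = t.tag("load B")
    print(t.can_issue_to_memory(before), t.can_issue_to_memory(after))  # True False
    t.barrier_retired()
    print(t.can_issue_to_memory(after))                                 # True
```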
  • Patent number: 9582284
    Abstract: A method utilizes information provided by performance monitoring hardware to dynamically adjust the number of levels of speculative branch prediction allowed (typically 3 or 4 per thread) for a processor core. The information includes cycles-per-instruction (CPI) for the processor core and the number of memory accesses per unit time. If the CPI is below a CPI threshold and the number of memory accesses (NMA) per unit time is above a prescribed threshold, the number of levels of speculative branch prediction is reduced per thread for the processor core. Likewise, the number of levels of speculative branch prediction can be increased, from a low level back to the maximum allowed, if the CPI threshold is exceeded or the number of memory accesses per unit time is below the prescribed threshold.
    Type: Grant
    Filed: December 1, 2011
    Date of Patent: February 28, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert H. Bell, Jr., Wen-Tzer T. Chen
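    The throttling policy in this abstract lowers the speculation depth when the core is running efficiently but memory-intensively, and otherwise restores the maximum; the thresholds and depth limits in the sketch below are illustrative assumptions.
```python
# Adjust the per-thread speculative branch prediction depth from CPI and
# memory-access-rate measurements.

MAX_LEVELS = 4             # typical maximum levels of speculation per thread
MIN_LEVELS = 1             # assumed lower bound
CPI_THRESHOLD = 1.5        # assumed
NMA_THRESHOLD = 1_000_000  # memory accesses per unit time, assumed

def adjust_speculation_levels(current_levels, cpi, memory_accesses):
    """Return the new per-thread speculative branch prediction depth."""
    if cpi < CPI_THRESHOLD and memory_accesses > NMA_THRESHOLD:
        return max(MIN_LEVELS, current_levels - 1)   # memory-bound enough: reduce
    return MAX_LEVELS                                # otherwise allow maximum depth

if __name__ == "__main__":
    levels = MAX_LEVELS
    for cpi, nma in [(1.2, 2_000_000), (1.1, 3_000_000), (2.0, 500_000)]:
        levels = adjust_speculation_levels(levels, cpi, nma)
        print(levels)   # 3, 2, 4
```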
  • Patent number: 9582282
    Abstract: A data processing apparatus has prefetch circuitry for prefetching cache lines of instructions into an instruction cache. A prefetch lookup table is provided for storing prefetch entries, with each entry corresponding to a region of a memory address space and identifying at least one block of one or more cache lines within the corresponding region from which processing circuitry accessed an instruction on a previous occasion. When the processing circuitry executes an instruction from a new region, the prefetch circuitry looks up the table, and if the table stores a prefetch entry for the new region, the at least one block identified by that entry is prefetched into the cache.
    Type: Grant
    Filed: July 17, 2014
    Date of Patent: February 28, 2017
    Assignee: ARM Limited
    Inventors: Mitchell Bryan Hayenga, Christopher Daniel Emmons
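    The region-based prefetch table in this abstract can be sketched as below; the 4 KB region size, 64-byte block size, and bitmap representation are assumptions of the example.
```python
# Each table entry covers a region of the address space and remembers, as a
# bitmap, which cache-line blocks were touched the last time the region was
# executed; on re-entry those blocks are prefetched.

REGION_SIZE = 4096   # bytes per region (assumed)
BLOCK_SIZE = 64      # bytes per cache-line block (assumed)
BLOCKS_PER_REGION = REGION_SIZE // BLOCK_SIZE

class InstructionPrefetcher:
    def __init__(self):
        self.table = {}          # region base -> bitmap of accessed blocks
        self.current_region = None

    def on_instruction_fetch(self, address, prefetch):
        region = address - (address % REGION_SIZE)
        block = (address % REGION_SIZE) // BLOCK_SIZE
        if region != self.current_region:
            self.current_region = region
            # New region: prefetch every block recorded on the previous visit.
            for b in range(BLOCKS_PER_REGION):
                if self.table.get(region, 0) & (1 << b):
                    prefetch(region + b * BLOCK_SIZE)
        # Record this access for the next time the region is entered.
        self.table[region] = self.table.get(region, 0) | (1 << block)

if __name__ == "__main__":
    pf = InstructionPrefetcher()
    issued = []
    pf.on_instruction_fetch(0x1000, issued.append)   # first visit: nothing known
    pf.on_instruction_fetch(0x1080, issued.append)
    pf.on_instruction_fetch(0x2000, issued.append)   # leave the region
    pf.on_instruction_fetch(0x1004, issued.append)   # re-enter: prefetch recorded blocks
    print([hex(a) for a in issued])                  # ['0x1000', '0x1080']
```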
  • Patent number: 9569272
    Abstract: A method and device for digital data processing based on a data flow processing model are suitable for the execution, in a distributed manner on multiple calculation nodes, of multiple data processing operations modelled by directed graphs, where two different processing operations include at least one common calculation node. The device includes an identification processor configured to identify a coordination module for each chunk, starting from a valued directed multi-graph that is made up of the union of several distinct processing graphs, is divided into several valued directed sub-multi-graphs called chunks, and whose input and output nodes are buffer memory nodes of the multi-graph. Furthermore, each identified coordination module is configured to synchronize portions of processing operations that are to be executed in the chunk with which the respective coordination module is associated, independently of portions of processing operations that are to be executed in other chunks.
    Type: Grant
    Filed: July 9, 2010
    Date of Patent: February 14, 2017
    Assignee: Commissariat à l'énergie atomique et aux énergies alternatives
    Inventor: Yvain Thonnart
  • Patent number: 9552206
    Abstract: Traditionally, providing parallel processing within a multi-core system has been very difficult. Here, however, a system is provided where serial source code is automatically converted into parallel source code, and a processing cluster is reconfigured “on the fly” to accommodate the parallelized code based on an allocation of memory and compute resources. Thus, the processing cluster and its corresponding system programming tool provide a system that can perform parallel processing from a serial program that is transparent to a user. Generally, a control node connected to the address and data leads of a host processor uses messages to control the processing of data in a processing cluster. The cluster includes nodes of parallel processors, shared function memory, a global load/store, and hardware accelerators all connected to the control node by message busses. A crossbar data interconnect routes data to the cluster circuits separate from the message busses.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: January 24, 2017
    Assignee: Texas Instruments Incorporated
    Inventors: William M. Johnson, Murali S. Chinnakonda, Jeffrey L. Nye, Toshio Nagata, John W. Glotzbach, Hamid R. Sheikh, Ajay Jayaraj, Stephen Busch, Shalini Gupta, Robert J.P. Nychka, David H. Bartley, Ganesh Sundararajan
  • Patent number: 9542191
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: January 10, 2017
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
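    The counter-page profiling in this abstract can be pictured with the sketch below; the 4 KB page size, one counter per byte offset, and the hot-target threshold are assumptions, and the binary translation step itself is not modelled.
```python
# A shadow page of counters parallels a code page: every time a branch lands
# on a target inside the code page, the counter at the matching offset in the
# shadow page is incremented, and the counts are handed to a binary translator.

PAGE_SIZE = 4096   # assumed code page size in bytes

class CodePageProfiler:
    def __init__(self, code_page_base):
        self.code_page_base = code_page_base
        self.counters = [0] * PAGE_SIZE      # the "new page" of counters

    def on_branch(self, target_address):
        """Count branch targets that fall inside the profiled code page."""
        offset = target_address - self.code_page_base
        if 0 <= offset < PAGE_SIZE:
            self.counters[offset] += 1

    def hot_targets(self, threshold):
        """Offsets the binary translator might treat as hot entry points."""
        return [off for off, n in enumerate(self.counters) if n >= threshold]

if __name__ == "__main__":
    profiler = CodePageProfiler(code_page_base=0x40000)
    for _ in range(3):
        profiler.on_branch(0x40010)          # loop head inside the page
    profiler.on_branch(0x7FFF0)              # target outside the page, ignored
    print(profiler.hot_targets(threshold=2)) # [16]
```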
  • Patent number: 9529596
    Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, and apparatuses for scheduling instructions in a multi-strand out-of-order processor. For example, an apparatus for scheduling instructions in a multi-strand out-of-order processor includes an out-of-order instruction fetch unit to retrieve a plurality of interdependent instructions for execution from a multi-strand representation of a sequential program listing; an instruction scheduling unit to schedule the execution of the plurality of interdependent instructions based at least in part on operand synchronization bits encoded within each of the plurality of interdependent instructions; and a plurality of execution units to execute at least a subset of the plurality of interdependent instructions in parallel.
    Type: Grant
    Filed: July 1, 2011
    Date of Patent: December 27, 2016
    Assignee: Intel Corporation
    Inventors: Boris A. Babayan, Vladimir M. Pentkovski, Alexander V. Butuzov, Sergey Y. Shishlov, Alexey Y. Sivtsov, Nikolay E. Kosarev
  • Patent number: 9514094
    Abstract: There is provided a method for processing multiple sets of data concurrently in a statically scheduled pipelined stream processor by allowing a data set to enter the pipeline while another data set is being processed. Dedicated logic units enable independent control of each of the data sets being processed.
    Type: Grant
    Filed: July 10, 2012
    Date of Patent: December 6, 2016
    Assignee: MAXELER TECHNOLOGIES LTD
    Inventors: Oliver Pell, Itay Greenspon, James Barry Spooner, Robert Gwilym Dimond, Jacob Bower, Richard Berry
  • Patent number: 9495157
    Abstract: Embodiments relate to fingerprint-based branch prediction. An aspect includes, based on encountering a branch instruction during execution of software on a processor of a computer system, determining a fingerprint of the software, the fingerprint comprising a representation of a sequence of behavior that occurs in the processor while the software is executing. Another aspect includes, based on determining that a match for the fingerprint and the branch instruction is located in an entry in the prediction table, predicting the branch instruction according to the associated prediction field. Another aspect includes, based on determining that no match for the fingerprint and the branch instruction is located in an entry in the prediction table, creating a new entry in the prediction table for the fingerprint and the branch instruction.
    Type: Grant
    Filed: December 7, 2015
    Date of Patent: November 15, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Giles R. Frazier, Michael Karl Gschwind, Christian Jacobi, Anthony Saporito, Chung-Lung K. Shum
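    The prediction-table lookup in this abstract is sketched below; using a hash of recent branch outcomes as the fingerprint, the 16-bit history width, and the default-taken prediction are assumptions of the example.
```python
# A prediction table keyed by (fingerprint, branch address): a hit returns the
# stored prediction, a miss allocates a new entry.

class FingerprintBranchPredictor:
    def __init__(self):
        self.table = {}          # (fingerprint, branch_pc) -> predicted taken?
        self.history = 0         # recent branch outcomes, newest in bit 0

    def fingerprint(self):
        # One possible "sequence of behavior": recent outcomes (an assumption).
        return self.history & 0xFFFF

    def predict(self, branch_pc, default=True):
        key = (self.fingerprint(), branch_pc)
        if key in self.table:
            return self.table[key]           # match: use the stored prediction
        self.table[key] = default            # no match: create a new entry
        return default

    def update(self, branch_pc, taken):
        """Record the actual outcome for this fingerprint and fold it into history."""
        self.table[(self.fingerprint(), branch_pc)] = taken
        self.history = ((self.history << 1) | int(taken)) & 0xFFFF

if __name__ == "__main__":
    p = FingerprintBranchPredictor()
    print(p.predict(0x400))   # True (new entry, default prediction)
    p.update(0x400, taken=False)
    print(p.predict(0x400))   # False: same fingerprint now predicts not-taken
```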
  • Patent number: 9483243
    Abstract: A vector data access unit includes data access ordering circuitry for issuing data access requests indicated by elements of an earlier and a later vector instruction, one of which is a write instruction. An element indicating the next data access for each of the instructions is determined. The next data accesses for the earlier and the later instructions may be reordered. The next data access of the earlier instruction is selected if the position of the earlier instruction's next data element is less than or equal to the position of the later instruction's next data element minus a predetermined value. The next data access of the later instruction may be selected if the position of the earlier instruction's next data element is higher than the position of the later instruction's next data element minus a predetermined value. Thus, data accesses from the earlier and later instructions are partially interleaved.
    Type: Grant
    Filed: March 23, 2015
    Date of Patent: November 1, 2016
    Assignee: ARM Limited
    Inventor: Alastair David Reid
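    The position-comparison rule in this abstract can be exercised with the greedy merge below; the margin value, the access labels, and the simple two-stream merge loop are assumptions of the example.
```python
# Merge the element accesses of an earlier and a later vector instruction:
# the earlier instruction's next access is chosen while its element position
# is at most the later instruction's position minus a predetermined value,
# otherwise the later instruction's next access may proceed.

MARGIN = 2   # the "predetermined value" separating the two access streams (assumed)

def interleave(earlier_accesses, later_accesses, margin=MARGIN):
    """Return a merged access order obeying the position comparison rule."""
    order = []
    e, l = 0, 0   # next element position for the earlier / later instruction
    while e < len(earlier_accesses) or l < len(later_accesses):
        if e < len(earlier_accesses) and (l >= len(later_accesses) or e <= l - margin):
            order.append(("earlier", earlier_accesses[e]))
            e += 1
        elif l < len(later_accesses):
            order.append(("later", later_accesses[l]))
            l += 1
    return order

if __name__ == "__main__":
    earlier = [f"st[{i}]" for i in range(4)]   # earlier vector store
    later = [f"ld[{i}]" for i in range(4)]     # later vector load
    # Prints a merged order in which the two access streams are partially interleaved.
    for who, access in interleave(earlier, later):
        print(who, access)
```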
  • Patent number: 9430419
    Abstract: A data processing apparatus is provided with a plurality of processing units executing respective streams of program instructions corresponding to respective processing threads. Exception control circuitry controls exception processing for a group of the processing units in response to an exception triggering event. Each of the processing units moves only once, and in sequence, through the normal, in-exception, and done-exception states in response to a given exception event. The group of processing units moves in sequence through the normal, triggering, and completing states in response to the exception event. A counter value is used to track the number of processing units which have entered exception processing and then to track the number of processing units which have completed their exception processing.
    Type: Grant
    Filed: October 13, 2011
    Date of Patent: August 30, 2016
    Assignee: ARM Limited
    Inventors: Simon Jones, Joe Dominic Michael Tapply
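    The per-unit exception sequencing and counter in this abstract can be modelled as below; the state names follow the abstract, while the group size and the increment/decrement use of the single counter are assumptions about how the tracking might be realised.
```python
# Each processing unit moves once through normal -> in-exception ->
# done-exception for a given exception event, while a shared counter first
# tracks how many units have entered exception processing and then how many
# remain before the group has completed.

NORMAL, IN_EXCEPTION, DONE_EXCEPTION = "normal", "in-exception", "done-exception"

class ExceptionGroup:
    """Group of processing units sharing one exception-tracking counter."""
    def __init__(self, num_units):
        self.states = [NORMAL] * num_units
        self.counter = 0          # first counts units entering, then units remaining

    def trigger(self):
        """Exception triggering event: each unit enters exception processing once."""
        for i, state in enumerate(self.states):
            if state == NORMAL:
                self.states[i] = IN_EXCEPTION
                self.counter += 1      # track how many units have entered

    def unit_done(self, i):
        """A unit completes its exception handling and moves to done-exception."""
        if self.states[i] == IN_EXCEPTION:
            self.states[i] = DONE_EXCEPTION
            self.counter -= 1          # now tracks how many units are still busy

    def group_complete(self):
        return all(s == DONE_EXCEPTION for s in self.states) and self.counter == 0

if __name__ == "__main__":
    group = ExceptionGroup(num_units=4)
    group.trigger()
    for unit in range(4):
        group.unit_done(unit)
    print(group.states, group.group_complete())
```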