Patents Examined by Daniel H. Pan

Decimal load immediate instruction

Patent number: 10235170

Abstract: An instruction generates a value for use in processing within a computing environment. The instruction obtains a sign control associated with the instruction, and shifts an input value of the instruction in a specified direction by a selected amount to provide a result. The result is placed in a first designated location in a register, and the sign, which is based on the sign control, is placed in a second designated location of the register. The result and the sign provide a signed value to be used in processing within the computing environment.
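
As a rough illustration of the abstract above, the following Python sketch shifts an immediate value by a number of decimal digits and packs it into a register image with the sign in the lowest nibble. The function name, the sign codes 0xC and 0xD, the 31-digit register width, and the choice to model only a left shift are all assumptions made for this example; they are not taken from the patent.

```python
# Hypothetical sign codes (common packed-decimal convention, assumed here).
SIGN_PLUS = 0xC
SIGN_MINUS = 0xD


def decimal_load_immediate(value, shift_digits, negative, register_digits=31):
    """Return a packed-decimal register image as an integer.

    The shifted digits occupy the upper nibbles (the "first designated
    location") and the sign code occupies the lowest nibble (the "second
    designated location"). Only a left shift is modelled.
    """
    if value < 0 or shift_digits < 0:
        raise ValueError("value and shift_digits must be non-negative")

    # Shifting left by n decimal digits multiplies by 10**n.
    shifted = value * 10 ** shift_digits
    digits = str(shifted)
    if len(digits) > register_digits:
        raise OverflowError("shifted value does not fit in the register")

    # Pack one decimal digit per 4-bit nibble.
    packed = 0
    for ch in digits:
        packed = (packed << 4) | int(ch)

    # The sign control selects which sign code is appended.
    sign = SIGN_MINUS if negative else SIGN_PLUS
    return (packed << 4) | sign


if __name__ == "__main__":
    print(hex(decimal_load_immediate(12, 2, negative=False)))
```

Under these assumptions, loading the immediate 12 with a shift of two digits yields the digits 1200 followed by the positive sign nibble.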

Type: Grant

Filed: September 30, 2016

Date of Patent: March 19, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Reid T. Copeland, Silvia Melitta Mueller
Accessing data in multi-dimensional tensors

Patent number: 10228947

Abstract: Methods, systems, and apparatus, including an apparatus for processing an instruction for accessing an N-dimensional tensor, the apparatus including multiple tensor index elements and multiple dimension multiplier elements, where each of the dimension multiplier elements has a corresponding tensor index element. The apparatus includes one or more processors configured to obtain an instruction to access a particular element of an N-dimensional tensor, where the N-dimensional tensor has multiple elements arranged across each of the N dimensions, and where N is an integer that is equal to or greater than one; determine, using one or more tensor index elements of the multiple tensor index elements and one or more dimension multiplier elements of the multiple dimension multiplier elements, an address of the particular element; and output data indicating the determined address for accessing the particular element of the N-dimensional tensor.
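
The address computation described in this abstract can be sketched as a sum of index-times-multiplier products. The Python below is a minimal illustration only: the function names, the base address parameter, and the row-major layout used to derive the multipliers are assumptions for the example, not details from the patent.

```python
def row_major_multipliers(shape, element_size=1):
    """Derive one dimension multiplier per axis, assuming a row-major layout."""
    multipliers = []
    stride = element_size
    # Walk the axes from innermost to outermost, growing the stride.
    for extent in reversed(shape):
        multipliers.append(stride)
        stride *= extent
    return list(reversed(multipliers))


def tensor_element_address(base, indices, multipliers):
    """Combine each tensor index with its corresponding dimension multiplier."""
    if len(indices) != len(multipliers):
        raise ValueError("each index needs a corresponding multiplier")
    return base + sum(i * m for i, m in zip(indices, multipliers))


if __name__ == "__main__":
    mults = row_major_multipliers((2, 3, 4))
    print(mults, tensor_element_address(100, (1, 2, 3), mults))
```

For a 2 x 3 x 4 tensor of unit-sized elements, the assumed multipliers are 12, 4, and 1, so element (1, 2, 3) sits 23 elements past the base address.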

Type: Grant

Filed: December 15, 2017

Date of Patent: March 12, 2019

Assignee: Google LLC

Inventors: Dong Hyuk Woo, Andrew Everett Phelps
Supporting binary translation alias detection in an out-of-order processor

Patent number: 10228956

Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load is to access the memory location subsequent to an execution of the first load instruction. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabling of the protection field.

Type: Grant

Filed: September 30, 2016

Date of Patent: March 12, 2019

Assignee: Intel Corporation

Inventors: Vineeth Mekkat, Mark J. Dechene, Zhongying Zhang, Jason Agron, Sebastian Winkel
Architected store and verify guard word instructions

Patent number: 10229266

Abstract: Corruption of call stacks is detected by using guard words placed in the call stacks. A store guard word instruction is used to store a guard word on a stack frame of a caller routine, and a verify guard word instruction issued by one or more callee routines is used to verify the guard word is an expected value. If the guard word is an unexpected value, corruption is indicated.
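
The store-and-verify pattern in this abstract can be mimicked in software. The following Python sketch models a stack as a list of words; the function names, the exception type, the slot layout, and the guard value are hypothetical and stand in for the architected instructions, which the abstract does not specify in detail.

```python
class StackCorruption(Exception):
    """Raised when a guard word no longer holds the expected value."""


def store_guard_word(stack, slot, guard):
    # Stand-in for the "store guard word" instruction: the caller places
    # the guard word in its own stack frame.
    stack[slot] = guard


def verify_guard_word(stack, slot, expected):
    # Stand-in for the "verify guard word" instruction issued by a callee.
    if stack[slot] != expected:
        raise StackCorruption(f"guard word at slot {slot} was overwritten")


if __name__ == "__main__":
    GUARD = 0xDEADBEEF            # hypothetical guard value
    stack = [0] * 8
    store_guard_word(stack, 4, GUARD)

    # A callee writes one word past its four-word buffer (slots 0..3).
    for slot in range(5):
        stack[slot] = 0x41414141

    try:
        verify_guard_word(stack, 4, GUARD)
    except StackCorruption as err:
        print("corruption indicated:", err)
```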

Type: Grant

Filed: February 17, 2017

Date of Patent: March 12, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind
Central processing unit and arithmetic unit

Patent number: 10223110

Abstract: There is a need to provide a central processing unit capable of improving the resistance to power analysis attack without changing programs, lowering clock frequencies, and greatly redesigning a central processing unit of the related art. In a central processing unit, an arithmetic unit is capable of performing arithmetic operation using data irrelevant to data stored in a register group. A control unit allows the arithmetic unit to perform arithmetic processing corresponding to an incorporated instruction. At this time, the control unit allows the arithmetic unit to perform arithmetic processing using the irrelevant data during a first one-clock cycle.

Type: Grant

Filed: August 29, 2013

Date of Patent: March 5, 2019

Assignee: Renesas Electronics Corporation

Inventor: Minoru Saeki
Method and apparatus for supporting quasi-posted loads

Patent number: 10223121

Abstract: A processor includes a decoder, a data return buffer, and an execution unit. The decoder is to decode an instruction for a non-posted load into a decoded instruction for loading data from memory mapped input/output. The execution unit is for executing the decoded instruction. The execution unit is to start a timer, determine whether the timer exceeds a timeout threshold, allocate an entry in the data return buffer for the load, and determine whether an event arrived. The timer is to measure an amount of time taken to return the non-posted load instruction. The determination whether an event arrived is made in response to at least one of the allocation of the entry for the load, or a determination that the timer exceeds the timeout threshold.

Type: Grant

Filed: December 22, 2016

Date of Patent: March 5, 2019

Assignee: Intel Corporation

Inventors: Ido Ouziel, Raanan Sade, Jacob Doweck
Multi-level loops for computer processor control

Patent number: 10216246

Abstract: In an embodiment, a processor includes processing cores, and a central control unit to: concurrently execute an outer control loop and an inner control loop, wherein the outer control loop is to monitor the processor as a whole, and wherein the inner control loop is to monitor a first processing core included in the processor; determine, based on the outer control loop, a first control action for the first processing core included in the processor; determine, based on the inner control loop, a second control action for the first processing core included in the processor; based on a comparison of the first control action and the second control action, select one of the first control action and the second control action as a selected control action; and apply the selected control action to the first processing core. Other embodiments are described and claimed.

Type: Grant

Filed: September 30, 2016

Date of Patent: February 26, 2019

Assignee: Intel Corporation

Inventors: Doron Rajwan, Efraim Rotem, Eliezer Weissmann, Avinash N. Ananthakrishnan, Dorit Shapira
Instruction to cancel outstanding cache prefetches

Patent number: 10216635

Abstract: Techniques relate to handling outstanding cache miss prefetches. A processor pipeline recognizes that a prefetch canceling instruction is being executed. In response to recognizing that the prefetch canceling instruction is being executed, all outstanding prefetches are evaluated according to a criterion as set forth by the prefetch canceling instruction in order to select qualified prefetches. In response to evaluating, a cache subsystem is communicated with to cause canceling of the qualified prefetches that fit the criterion. In response to successful canceling of the qualified prefetches, a local cache is prevented from being updated from the qualified prefetches.

Type: Grant

Filed: December 22, 2016

Date of Patent: February 26, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Karl Gschwind, Maged M. Michael, Valentina Salapura, Eric M. Schwarz, Chung-Lung K. Shum
Fused adjacent memory stores

Patent number: 10216516

Abstract: A processing device includes a store instruction identification unit to identify a pair of store instructions among a plurality of instructions in an instruction queue. The pair of store instructions includes a first store instruction and a second store instruction. A first data of the first store instruction corresponds to a first memory region adjacent to a second memory region, and a second data of the second store instruction corresponds to the second memory region. The processing device further includes a store instruction fusion unit to fuse the first store instruction with the second store instruction, resulting in a fused store instruction.
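
A simplified software analogue of the identification and fusion steps might look like the Python below. It is only a sketch under stated assumptions: the `Store` record and `fuse_adjacent_stores` function are hypothetical names, only back-to-back queue entries are considered, and a pair qualifies only when the second store begins exactly where the first one ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    address: int   # start of the memory region written
    data: bytes    # bytes to write at that address


def fuse_adjacent_stores(queue):
    """Fuse consecutive stores whose memory regions are adjacent."""
    fused = []
    i = 0
    while i < len(queue):
        first = queue[i]
        if i + 1 < len(queue):
            second = queue[i + 1]
            # Identification: the second region starts where the first ends.
            if first.address + len(first.data) == second.address:
                # Fusion: one wider store replaces the pair.
                fused.append(Store(first.address, first.data + second.data))
                i += 2
                continue
        fused.append(first)
        i += 1
    return fused


if __name__ == "__main__":
    queue = [
        Store(0x100, b"\x01\x02"),
        Store(0x102, b"\x03\x04"),
        Store(0x200, b"\xff"),
    ]
    print(fuse_adjacent_stores(queue))
```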

Type: Grant

Filed: September 30, 2016

Date of Patent: February 26, 2019

Assignee: Intel Corporation

Inventors: Sebastian Winkel, Jamison D. Collins, Tyler Sondag
Streaming engine with stream metadata saving for context switching

Patent number: 10203958

Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. Stream metadata is stored in response to a stream store instruction. Stored stream metadata is restored to the stream engine in response to a stream restore instruction. An interrupt changes an open stream to a frozen state discarding stored stream data. A return from interrupt changes a frozen stream to an active state.

Type: Grant

Filed: June 28, 2017

Date of Patent: February 12, 2019

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Joseph Zbiciak, Timothy D. Anderson
Processor with improved alias queue and store collision detection to reduce memory violations and load replays

Patent number: 10203957

Abstract: A register alias table for a processor including an alias queue, load and store comparators, and dependency logic. Each entry of the alias queue stores instruction pointers of a pair of colliding load and store instructions that caused a memory violation and a valid value. The store comparator compares the instruction pointer of a subsequent store instruction with those stored in the alias queue, and if a match occurs, indicates that a store index of the subsequent store instruction is valid. The load comparator determines whether the instruction pointer of a subsequent load instruction matches an instruction pointer stored in the alias queue. If so, dependency logic provides a store index, if valid, as dependency information for the subsequent load instruction.
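
The lookup flow in this abstract can be approximated in software as follows. This Python sketch is an assumption-laden model, not the patented hardware: the class and method names are invented for illustration, the queue is unbounded, and a valid store index is represented as a value other than `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AliasEntry:
    load_ip: int                       # instruction pointer of the colliding load
    store_ip: int                      # instruction pointer of the colliding store
    valid: bool = True                 # entry valid value
    store_index: Optional[int] = None  # set once a matching store is seen


class AliasQueue:
    def __init__(self):
        self.entries: List[AliasEntry] = []

    def record_violation(self, load_ip, store_ip):
        """Remember a load/store pair that caused a memory violation."""
        self.entries.append(AliasEntry(load_ip, store_ip))

    def see_store(self, ip, store_index):
        """Store comparator: on a match, mark the store index as valid."""
        for entry in self.entries:
            if entry.valid and entry.store_ip == ip:
                entry.store_index = store_index

    def see_load(self, ip):
        """Load comparator plus dependency logic: return a store index, if valid."""
        for entry in self.entries:
            if entry.valid and entry.load_ip == ip and entry.store_index is not None:
                return entry.store_index
        return None


if __name__ == "__main__":
    queue = AliasQueue()
    queue.record_violation(load_ip=0x40, store_ip=0x30)
    queue.see_store(0x30, store_index=5)
    print(queue.see_load(0x40))
```

In this model, a later load at the recorded instruction pointer receives the index of the most recent matching store as its dependency, which is how the scheme would avoid repeating the earlier violation.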

Type: Grant

Filed: September 30, 2016

Date of Patent: February 12, 2019

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Xiaolong Fei
Write nullification

Patent number: 10198263

Abstract: Apparatus and methods are disclosed for nullifying one or more registers identified in a target field of a nullification instruction. In some examples of the disclosed technology, an apparatus can include memory and one or more block-based processor cores configured to fetch and execute a plurality of instruction blocks. One of the cores can include a control unit configured, based at least in part on receiving a nullification instruction, to obtain a register identification of at least one of a plurality of registers, based on a target field of the nullification instruction. A write to the at least one register associated with the register identification is nullified. The nullification instruction is in a first instruction block of the plurality of instruction blocks. Based on the nullified write to the at least one register, a subsequent instruction is executed from a second, different instruction block.

Type: Grant

Filed: March 3, 2016

Date of Patent: February 5, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith
Scatter reduction instruction

Patent number: 10191749

Abstract: Single Instruction, Multiple Data (SIMD) technologies are described. A processing device can include a processor core and a memory. The processor core can receive, from a software application, a request to perform an operation on a first set of variables that includes a first input value and a register value and perform the operation on a second set of variables that includes a second input value and the first register value. The processor core can vectorize the operation on the first set of variables and the second set of variables. The processor core can perform the operation on the first set of variables and the second set of variables in parallel to obtain a first operation value and a second operation value. The processor core can perform a horizontal add operation on the first operation value and the second operation value and write the result to memory.

Type: Grant

Filed: December 24, 2015

Date of Patent: January 29, 2019

Assignee: Intel Corporation

Inventors: Jun Jin, Elmoustapha Ould-Ahmed-Vall
Dynamic thread sharing in branch prediction structures

Patent number: 10185570

Abstract: Embodiments relate to multithreaded branch prediction. An aspect includes a system for dynamically evaluating how to share entries of a multithreaded branch prediction structure. The system includes a first-level branch target buffer coupled to a processor circuit. The processor circuit is configured to perform a method. The method includes receiving a search request to locate branch prediction information associated with the search request, and searching for an entry corresponding to the search request in the first-level branch prediction structure. The entry is not allowed based on a thread state of the entry indicating that the entry has caused a problem on a thread associated with the thread state.

Type: Grant

Filed: November 9, 2017

Date of Patent: January 22, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: James J. Bonanno, Daniel Lipetz, Brian R. Prasky, Anthony Saporito
Conflict mask generation

Patent number: 10185562

Abstract: Single Instruction, Multiple Data (SIMD) technologies are described. A processing device can include a processor core and a memory. The processor core can generate a first bitmap comprising a plurality of bits, where the plurality of bits includes a first bit that represents a first memory location. The processor core can determine that the value of the first bit is equal to the value of a second bit in the first bitmap. The processor core can determine the location of the second bit in relation to the first bit in the first bitmap. The processor core can generate a second bitmap including a third bit indicating that the first bit is the last bit in the first bitmap with the same value as the second bit.

Type: Grant

Filed: December 24, 2015

Date of Patent: January 22, 2019

Assignee: Intel Corporation

Inventors: Jun Jin, Elmoustapha Ould-Ahmed-Vall
Apparatus for information processing with loop cache and associated methods

Patent number: 10180839

Abstract: An apparatus includes a processor and a loop cache coupled to the processor. The loop cache provides to the processor instructions corresponding to a loop in the instructions. The loop cache includes a persistence counter.

Type: Grant

Filed: March 4, 2016

Date of Patent: January 15, 2019

Assignee: Silicon Laboratories Inc.

Inventors: Mark W. Johnson, Paul Zavalney, Marius Grannæs, Oeivind A. G. Loe
Multi-processor core three-dimensional (3D) integrated circuits (ICs) (3DICs), and related methods

Patent number: 10176147

Abstract: Multi-processor core three-dimensional (3D) integrated circuits (ICs) (3DICs) and related methods are disclosed. In aspects disclosed herein, ICs are provided that include a central processing unit (CPU) having multiple processor cores (“cores”) to improve performance. To further improve CPU performance, the multiple cores can also be designed to communicate with each other to offload workloads and/or share resources for parallel processing, but at a communication overhead associated with passing data through interconnects which have an associated latency. To mitigate this communication overhead inefficiency, aspects disclosed herein provide the CPU with its multiple cores in a 3DIC.

Type: Grant

Filed: March 7, 2017

Date of Patent: January 8, 2019

Assignee: QUALCOMM Incorporated

Inventors: Kambiz Samadi, Amin Ansari, Yang Du
Instruction predecoding

Patent number: 10176104

Abstract: An apparatus comprises processing circuitry, an instruction cache, decoding circuitry to decode program instructions fetched from the cache to generate macro-operations to be processed by the processing circuitry, and predecoding circuitry to perform a predecoding operation on a block of program instructions fetched from a data store to generate predecode information to be stored to the cache with the block of instructions. In one example the predecoding operation comprises generating information on how many macro-operations are to be generated by the decoding circuitry for a group of one or more program instructions. In another example the predecoding operation comprises generating information indicating whether at least one of a given subset of program instructions within the prefetched block is a branch instruction.

Type: Grant

Filed: September 30, 2016

Date of Patent: January 8, 2019

Assignee: ARM Limited

Inventors: Vasu Kudaravalli, Matthew Paul Elwood, Adam George, Muhammad Umar Farooq, Michael Filippo
Highly integrated scalable, flexible DSP megamodule architecture

Patent number: 10162641

Abstract: This invention implements a range of interesting technologies into a single block. Each DSP CPU has a streaming engine. The streaming engines include: a SE to L2 interface that can request 512 bits/cycle from L2; a loose binding between SE and L2 interface, to allow a single stream to peak at 1024 bits/cycle; one-way coherence where the SE sees all earlier writes cached in system, but not writes that occur after stream opens; full protection against single-bit data errors within its internal storage via single-bit parity with semi-automatic restart on parity error.

Type: Grant

Filed: February 10, 2017

Date of Patent: December 25, 2018

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Timothy D. Anderson, Joseph Zbiciak, Duc Quang Bui, Abhijeet A. Chachad, Kai Chirca, Naveen Bhoria, Matthew D. Pierson, Daniel Wu, Ramakrishnan Venkatasubramanian
Techniques for implementing barriers to efficiently support cumulativity in a weakly-ordered memory system

Patent number: 10162755

Abstract: A technique for operating a cache memory of a data processing system includes creating respective pollution vectors to track which of multiple concurrent threads executed by an associated processor core are currently polluted by a store operation resident in the cache memory. Dependencies in a dependency data structure of a store queue of the cache memory are set based on the pollution vectors to reduce unnecessary ordering effects. Store operations are dispatched from the store queue in accordance with the dependencies indicated by the dependency data structure.

Type: Grant

Filed: October 31, 2016

Date of Patent: December 25, 2018

Assignee: International Business Machines Corporation

Inventors: Guy L. Guthrie, Hugh Shen, William J. Starke, Derek E. Williams
