Patents Examined by Keith Vicary
-
Patent number: 10120692Abstract: A method of compressing a sequence of program instructions begins by examining a program instruction stream to identify a sequence of two or more instructions that meet a parameter. The identified sequence of two or more instructions is replaced by a selected type of layout instruction which is then compressed. A method of decompressing accesses an X-index and a Y-index together as a compressed value. The compressed value is decompressed to a selected type of layout instruction which is decoded and replaced with a sequence of two or more instructions. An apparatus for decompressing includes a storage subsystem configured for storing compressed instructions, wherein a compressed instruction comprises an X-index and a Y-index. A decompressor is configured for translating an X-index and Y-index accessed from the storage subsystem to a selected type of layout instruction which is decoded and replaced with a sequence of two or more instructions.Type: GrantFiled: July 28, 2011Date of Patent: November 6, 2018Assignee: QUALCOMM IncorporatedInventors: Sergei Larin, Lucian Codrescu, Anshuman Das Gupta
-
Patent number: 9996353Abstract: An approach is provided in which a mapper control unit receives first dispatch information corresponding to a first instruction that identifies a first register and a first register type. The mapper control unit dynamically configures a first history buffer entry to support the first register type and, in turn, stores content from the first register into the first history buffer entry. The mapper control unit then receives second dispatch information corresponding to a second instruction that identifies a second register and a second register type, which is different than the first register type. The mapper control unit dynamically configures a second history buffer entry to support the second register type and, in turn, stores content from the second register into the second history buffer entry.Type: GrantFiled: February 26, 2015Date of Patent: June 12, 2018Assignee: International Business Machines CorporationInventors: Michael J. Genden, Hung Q. Le, Dung Q. Nguyen, Kenneth L. Ward
-
Patent number: 9996360Abstract: A TRANSACTION ABORT instruction is used to abort a transaction that is executing in a computing environment. The TRANSACTION ABORT instruction includes at least one field used to specify a user-defined abort code that indicates the specific reason for aborting the transaction. Based on executing the TRANSACTION ABORT instruction, a condition code is provided that indicates whether re-execution of the transaction is recommended.Type: GrantFiled: August 9, 2016Date of Patent: June 12, 2018Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Timothy J. Slegel
-
Patent number: 9983883Abstract: A TRANSACTION ABORT instruction is used to abort a transaction that is executing in a computing environment. The TRANSACTION ABORT instruction includes at least one field used to specify a user-defined abort code that indicates the specific reason for aborting the transaction. Based on executing the TRANSACTION ABORT instruction, a condition code is provided that indicates whether re-execution of the transaction is recommended.Type: GrantFiled: August 9, 2016Date of Patent: May 29, 2018Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Timothy J. Slegel
-
Patent number: 9971601Abstract: Dynamic resource allocation is provided in which additional resources, such as additional architected registers, are provided to an instruction, if it is determined that resources in addition to those configured to be provided to the instruction are to be used for the particular instruction. An instruction to be executed is dispatched on a pipe of a pipeline and that pipe is configured to have a set number of architected registers for use by the instruction. However, if one or more other architected registers are needed, those additional architected registers are dynamically allocated to the instruction by assigning one or more source ports of an additional pipe to the instruction.Type: GrantFiled: February 13, 2015Date of Patent: May 15, 2018Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Gregory W. Alexander, Brian D. Barrick, Fadi Y. Busaba, Wen H. Li, Edward T. Malley
-
Patent number: 9959119Abstract: A computer processor including an instruction buffer configured to store at least one variable-length instruction having a bit bundle bounded by a head end and a tail end with a plurality of slots each defining a corresponding operation, wherein the plurality of slots and corresponding operations are logically partitioned into a plurality of distinct blocks with a first group of blocks extending from the head end of the bit bundle toward the tail end of the bit bundle and a second group of blocks extending from the tail end of the bit bundle toward the head end of the bit bundle, wherein the second group of blocks includes a tail end block disposed adjacent the tail end of the bit bundle. A decode stage is operably coupled to the instruction buffer and configured to process a given variable-length instruction stored by the instruction buffer by decoding at least one operation of a particular block belonging to the first group of blocks in parallel with decoding at least one operation of the tail end block.Type: GrantFiled: May 29, 2014Date of Patent: May 1, 2018Assignee: MILL COMPUTING, INC.Inventors: Roger Rawson Godard, Arthur David Kahlich, David Arthur Yost
-
Patent number: 9952866Abstract: A method of compressing a sequence of program instructions begins by examining a program instruction stream to identify a sequence of two or more instructions that meet a parameter. The identified sequence of two or more instructions is replaced by a selected type of layout instruction which is then compressed. A method of decompressing accesses an X-index and a Y-index together as a compressed value. The compressed value is decompressed to a selected type of layout instruction which is decoded and replaced with a sequence of two or more instructions. An apparatus for decompressing includes a storage subsystem configured for storing compressed instructions, wherein a compressed instruction comprises an X-index and a Y-index. A decompressor is configured for translating an X-index and Y-index accessed from the storage subsystem to a selected type of layout instruction which is decoded and replaced with a sequence of two or more instructions.Type: GrantFiled: July 28, 2011Date of Patent: April 24, 2018Assignee: QUALCOMM IncorporatedInventors: Sergei Larin, Lucian Codrescu, Anshuman Das Gupta
-
Patent number: 9946540Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.Type: GrantFiled: May 22, 2017Date of Patent: April 17, 2018Assignee: INTEL CORPORATIONInventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 9940139Abstract: A split level history buffer in a central processing unit is provided. A history buffer is split into a first portion and a second portion. An instruction fetch unit fetches and tags instructions. A register file stores tagged instructions. An execution unit generates results for tagged instructions. A first instruction is fetched, tagged, and stored in an entry of the register file. A second instruction is fetched and tagged, and then evicts the first instruction from the register file, such that the second instruction is stored in the entry of the register file. Subsequently, the first instruction is stored in an entry in the first portion of the history buffer. After a result for the first instruction is generated, the first instruction is moved from the first portion of the history buffer to the second portion of the history buffer.Type: GrantFiled: September 20, 2016Date of Patent: April 10, 2018Assignee: Internaitonal Business Machines CorporationInventors: Hung Q. Le, Dung Q. Nguyen, David R. Terry
-
Patent number: 9928072Abstract: A system includes a processor configured to: initiate atomic execution of a plurality of instruction units in a thread, starting with a beginning instruction unit in the plurality of instruction units, wherein the plurality of instruction units is not programmatically specified to be executed atomically; detect an atomicity terminating event during atomic execution of the plurality of instruction units, wherein the atomicity terminating event is triggered by a memory access by another processor; and establish an incidentally atomic sequence of instruction units based at least in part on detection of the atomicity terminating event, wherein the incidentally atomic sequence of instruction units correspond to a sequence of instruction units in the plurality of instruction units. The system further includes a memory coupled to the processor, configured to provide the processor with the plurality of instruction units.Type: GrantFiled: May 1, 2009Date of Patent: March 27, 2018Assignee: Azul Systems, Inc.Inventors: Gil Tene, Michael A. Wolf, Cliff N. Click, Jr.
-
Patent number: 9928071Abstract: A system includes a processor configured to: initiate atomic execution of a plurality of instruction units in a thread, starting with a beginning instruction unit in the plurality of instruction units, wherein the plurality of instruction units in the thread are not programmatically specified to be executed atomically; detect an atomicity terminating event during atomic execution of the plurality of instruction units, wherein the atomicity terminating event is triggered by a memory access by another processor; and commit at least some of the one or more memory modification instructions. The system further includes a memory coupled to the processor, configured to provide the processor with the plurality of instruction units.Type: GrantFiled: May 1, 2009Date of Patent: March 27, 2018Assignee: Azul Systems, Inc.Inventors: Gil Tene, Michael A. Wolf, Cliff N. Click, Jr.
-
Patent number: 9921832Abstract: A vector reduction instruction with non-unit strided access pattern is received and executed by the execution circuitry of a processor. In response to the instruction, the execution circuitry performs an associative reduction operation on data elements of a first vector register. Based on values of the mask register and a current element position being processed, the execution circuitry sequentially sets one or more data elements of the first vector register to a result, which is generated by the associative reduction operation applied to both a previous data element of the first vector register and a data clement of a third vector register. The previous data element is located more than one element position away from the current element position.Type: GrantFiled: December 28, 2012Date of Patent: March 20, 2018Assignee: Intel CorporationInventors: Albert Hartono, Jayashankar Bharadwaj, Nalini Vasudevan, Sara S. Baghsorkhi, Victor W. Lee, Daehyun Kim
-
Patent number: 9921847Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups.Type: GrantFiled: January 21, 2014Date of Patent: March 20, 2018Assignee: NVIDIA CorporationInventor: John Erik Lindholm
-
Patent number: 9910674Abstract: In the data processor in which a combination of multiple specific instructions is prohibited, an instruction set is employed that additionally defines that prohibition combination pattern as a separate instruction. With respect to the prohibition combination pattern additionally defined as the separate instruction, for example, in order to make a definition in such a manner that an instruction dispatch mechanism for the instruction set that is present before the additional definition is used as is, the instruction to be additionally defined by the prohibition combination pattern is limited to an instruction type that is the same as the instruction defined only with a latter-half code of the instruction in a case of an instruction set in which the instruction set that is present before the additional definition includes a prefix code.Type: GrantFiled: April 10, 2012Date of Patent: March 6, 2018Assignee: Renesas Electronics CorporationInventor: Fumio Arakawa
-
Patent number: 9910672Abstract: A method and load and store buffer for issuing a load instruction to a data cache. The method includes determining whether there are any unresolved store instructions in the store buffer that are older than the load instruction. If there is at least one unresolved store instruction in the store buffer older than the load instruction, it is determined whether the oldest unresolved store instruction in the store buffer is within a speculation window for the load instruction. If the oldest unresolved store instruction is within the speculation window for the load instruction, the load instruction is speculatively issued to the data cache. Otherwise, the load instruction is stalled until any unresolved store instructions outside the speculation window are resolved. The speculation window is a short window that defines a number of instructions or store instructions that immediately precede the load instruction.Type: GrantFiled: June 15, 2016Date of Patent: March 6, 2018Assignee: MIPS Tech, LLCInventors: Hugh Jackson, Anand Khot
-
Patent number: 9904616Abstract: Generating instructions, in particular for mailbox verification in a simulation environment. A sequence of instructions is received, as well as selection data representative of a plurality of commands including a special command. Repeatedly selecting one of the plurality of commands and outputting an instruction based on the selected command. The outputting of an instruction includes outputting a next instruction in the sequence of instructions if the selected command is the special command, and outputting an instruction associated with the command if the selected command is not the special command.Type: GrantFiled: November 1, 2012Date of Patent: February 27, 2018Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Joerg Deutschle, Ursel Hahn, Joerg Walter, Ernst-Dieter Weissenberger
-
Patent number: 9880842Abstract: A mechanism for tracking the control flow of instructions in an application and performing one or more optimizations of a processing device, based on the control flow of the instructions in the application, is disclosed. Control flow data is generated to indicate the control flow of blocks of instructions in the application. The control flow data may include annotations that indicate whether optimizations may be performed for different blocks of instructions. The control flow data may also be used to track the execution of the instructions to determine whether an instruction in a block of instructions is assigned to a thread, a process, and/or an execution core of a processor, and to determine whether errors have occurred during the execution of the instructions.Type: GrantFiled: March 15, 2013Date of Patent: January 30, 2018Assignee: Intel CorporationInventors: Jayaram Bobba, Ruchira Sasanka, Jeffrey J. Cook, Abhinav Das, Arvind Krishnaswamy, David J. Sager, Jason M. Agron
-
Patent number: 9851979Abstract: A split level history buffer in a central processing unit is provided. A first instruction and a second instruction are fetched, tagged, and the first instruction is stored an entry of a register file. The first instruction is evicted from the entry and the second instruction is stored in the entry. If the first instruction is evicted, then the first instruction is stored in a first portion of a history buffer. If a result for the first instruction is generated, then the first instruction is moved to a second portion of the history buffer and the result is stored with the first instruction in the second portion of the history buffer. If it is determined that a third instruction evicts the second instruction from the entry, then the second instruction is stored in the first portion of the history buffer.Type: GrantFiled: September 16, 2016Date of Patent: December 26, 2017Assignee: International Business Machines CorporationInventors: Hung Q. Le, Dung Q. Nguyen, David R. Terry
-
Patent number: 9841979Abstract: A method and corresponding apparatus for processing a shuffle instruction are provided. Shuffle units are configured in a hierarchical structure, and each of the shuffle units generates a shuffled data element array by performing shuffling on an input data element array. In the hierarchical structure, which includes an upper shuffle unit and a lower shuffle unit, the shuffled data element array output from the lower shuffle unit is input to the upper shuffle unit as a portion of the input data element array for the upper shuffle unit.Type: GrantFiled: July 14, 2014Date of Patent: December 12, 2017Assignee: Samsung Electronics Co., Ltd.Inventors: Keshava Prasad, Navneet Basutkar, Young Hwan Park, Ho Yang, Yeon Bok Lee
-
Patent number: 9817668Abstract: One embodiment of the present invention sets forth an approach for executing replay operations for divergent operations in a parallel processing subsystem. Specifically, the streaming multiprocessor (SM) includes a multistage pipeline configured to batch two or more replay operations for processing via replay loop. A logic element within the multistage pipeline detects whether the current pipeline stage is accessing a shared resource, such as loading data from a shared memory. If the threads are accessing data which are distributed across multiple cache lines, then the multistage pipeline batches two or more replay operations, where the replay operations are inserted into the pipeline back-to-back.Type: GrantFiled: December 16, 2011Date of Patent: November 14, 2017Assignee: NVIDIA CorporationInventors: Michael Fetterman, Jack Hilaire Choquette, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Stewart Glenn Carlton, Rajeshwaran Selvanesan, Douglas J. Hahn, Steven James Heinrich