Patents Examined by Keith Nielsen
-
Patent number: 9465613
Abstract: A method and circuit arrangement for selectively predicating an instruction in an instruction stream based upon a value corresponding to a predication register address indicated by a portion of an operand associated with the instruction. A first compare instruction in an instruction stream stores a compare result at a register address of a predication register. The register address of the predication register is stored in a portion of an operand associated with a second instruction, and during decoding of the second instruction, the predication register is accessed to determine a value stored at the register address of the predication register, and the second instruction is selectively predicated based on the value stored at the register address of the predication register.
Type: Grant
Filed: December 19, 2011
Date of Patent: October 11, 2016
Assignee: International Business Machines Corporation
Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
-
Patent number: 9396013
Abstract: A next-generation OS with a virtualization feature is executed as a user program on a first virtual processor by selecting, in response to a cause of a call for a host VMM, one of a guest status area (221) for executing a user program on a second virtual processor and a host status area (222) for executing the guest VMM, and by updating a guest status area (131) of a shadow VMCS for controlling a physical processor. Accordingly, without a decrease in performance of a virtual computer, the next-generation OS incorporating the virtualization feature is executed on a virtual server, and the next-generation OS and an existing OS are integrated on a single physical computer.
Type: Grant
Filed: February 27, 2015
Date of Patent: July 19, 2016
Assignee: Hitachi, Ltd.
Inventors: Toshiomi Moriki, Naoya Hattori, Yuji Tsushima
-
Patent number: 9367321
Abstract: The invention provides a processor comprising: an execution unit, and a thread scheduler configured to schedule a plurality of threads for execution by the execution unit in dependence on a respective runnable status for each thread. The execution unit is configured to execute thread scheduling instructions which manage the runnable statuses. The thread scheduling instructions include at least: one or more source event enable instructions each of which sets an event source to a mode in which it generates an event dependent on activity occurring at that source, and a wait instruction which sets one of said runnable statuses to suspended pending one of the events upon which continued execution of the respective thread depends. The continued execution comprises retrieval of a continuation point vector for the respective thread.
Type: Grant
Filed: March 14, 2007
Date of Patent: June 14, 2016
Assignee: XMOS LIMITED
Inventor: Michael David May
-
Patent number: 9335994
Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.
Type: Grant
Filed: December 4, 2014
Date of Patent: May 10, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9335993
Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.
Type: Grant
Filed: December 29, 2011
Date of Patent: May 10, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9335995
Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.
Type: Grant
Filed: December 4, 2014
Date of Patent: May 10, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9329861
Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.
Type: Grant
Filed: December 29, 2011
Date of Patent: May 3, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9286067
Abstract: A hierarchical barrier synchronization of cores and nodes on a multiprocessor system, in one aspect, may include providing by each of a plurality of threads on a chip, input bit signal to a respective bit in a register, in response to reaching a barrier; determining whether all of the plurality of threads reached the barrier by electrically tying bits of the register together and “AND”ing the input bit signals; determining whether only on-chip synchronization is needed or whether inter-node synchronization is needed; in response to determining that all of the plurality of threads on the chip reached the barrier, notifying the plurality of threads on the chip, if it is determined that only on-chip synchronization is needed; and after all of the plurality of threads on the chip reached the barrier, communicating the synchronization signal to outside of the chip, if it is determined that inter-node synchronization is needed.
Type: Grant
Filed: September 13, 2012
Date of Patent: March 15, 2016
Assignee: International Business Machines Corporation
Inventors: Valentina Salapura, Robert W. Wisniewski
-
Patent number: 9280345
Abstract: There is provided a processor comprising a plurality of registers, an acquisition unit, a calculation unit, a pipeline register, and a storage unit, wherein in a case in which a register indicated by source register information included in a second instruction and a register indicated by destination register information included in a first instruction match, and the second instruction or an instruction that precedes the second instruction designates the second instruction as the last instruction that uses the calculated value obtained in accordance with the first instruction, the storage unit does not store the calculated value stored in the pipeline register in a register indicated by destination register information included in the first instruction, and stores, in other cases, the calculated value stored in the pipeline register in the register indicated by the destination register information included in the first instruction.
Type: Grant
Filed: July 28, 2011
Date of Patent: March 8, 2016
Assignee: Canon Kabushiki Kaisha
Inventor: Akihiro Takamura
-
Patent number: 9223576
Abstract: A method of reducing a set of instructions for execution on a processor extracts information from a first instruction of the set of instructions. The method identifies unencoded space in one or more further instructions of the set of instructions and replaces the unencoded space of the one or more further instructions with the extracted information of the first instruction so as to form one or more amalgamated instructions. The method removes the first instruction from the set of instructions. A method of expanding a set of reduced instructions on a processor is also disclosed.
Type: Grant
Filed: December 29, 2011
Date of Patent: December 29, 2015
Assignee: QUALCOMM TECHNOLOGIES INTERNATIONAL, LTD.
Inventors: Peter Smith, David Richard Hargreaves
-
Patent number: 9201651
Abstract: A data processing apparatus is described which comprises processing circuitry responsive to data processing instructions to execute integer data processing operations and floating point data processing operations, a first set of integer registers useable by the processing circuitry in executing the integer data processing operations, and a second set of floating point registers useable by the processing circuitry in executing the floating point data processing operations.
Type: Grant
Filed: May 3, 2010
Date of Patent: December 1, 2015
Assignee: ARM Limited
Inventor: Simon John Craske
-
Patent number: 9195629
Abstract: A data transfer system includes: a plurality of processors; and a plurality of data transfer units that executes a data transfer from one processor to another processor via a plurality of input ports and a plurality of output ports. The data transfer unit includes: an arbitration unit that executes arbitration of conflicting data sent to a same next destination; and a strength information notification unit that sends strength information indicating a number of conflicts of the arbitrated conflicting data to the next destination. The arbitration unit decides a selection ratio, which is a ratio of selecting each of the input ports and receiving the conflicting data from the selected input port, according to a ratio between the input ports in relation to a magnitude of the number of conflicts indicated by the strength information received from each of the input ports.
Type: Grant
Filed: August 19, 2011
Date of Patent: November 24, 2015
Assignee: NEC CORPORATION
Inventor: Yasushi Kanoh
-
Patent number: 9176738
Abstract: Method and apparatus for fast decoding of microinstructions are disclosed. An integrated circuit is disclosed wherein microinstructions are queued for execution in an execution unit having multiple pipelines where each pipeline is configured to execute a set of supported microinstructions. The execution unit receives microinstruction data including an operation code (opcode) or a complex opcode. The execution unit executes the microinstruction multiple times wherein the microinstruction is executed at least once to get an address value and at least once to get a result of an operation. The execution unit processes complex opcodes by utilizing both a load/store support and a simple opcode support by splitting the complex opcode into load/store and simple opcode components and creating an internal source/destination between the two components.
Type: Grant
Filed: January 12, 2011
Date of Patent: November 3, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: Ganesh Venkataramanan, Emil Talpes
-
Patent number: 9141392
Abstract: A method includes determining a rate of resource occupancy of a constituent stage of an unbalanced instruction pipeline implemented in a processor through profiling an instruction code. The method also includes performing data processing at a maximum throughput at an optimum clock frequency based on the rate of resource occupancy.
Type: Grant
Filed: April 18, 2011
Date of Patent: September 22, 2015
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventor: Senthilkannan Chandrasekaran
-
Patent number: 9086889
Abstract: Techniques are disclosed relating to reducing the latency of restarting a pipeline in a processor that implements scouting. In one embodiment, the processor may reduce pipeline restart latency using two instruction fetch units that are configured to fetch and re-fetch instructions in parallel with one another. In some embodiments, the processor may reduce pipeline restart latency by initiating re-fetching instructions in response to determining that a commit operation is to be attempted with respect to one or more deferred instructions. In other embodiments, the processor may reduce pipeline restart latency by initiating re-fetching instructions in response to receiving an indication that a request for a set of data has been received by a cache, where the indication is sent by the cache before determining whether the data is present in the cache or not.
Type: Grant
Filed: April 27, 2010
Date of Patent: July 21, 2015
Assignee: Oracle International Corporation
Inventors: Martin Karlsson, Sherman H. Yip, Shailender Chaudhry
-
Patent number: 9047153
Abstract: Circuitry for stochastic computation includes processing nodes, including a first processing node and a second processing node, each configured to process an outcome stream having a plurality of outcomes, each outcome being in one of a plurality of states, wherein an outcome from said outcome stream is in a particular state with a particular probability; communication links configured to transmit outcome streams between pairs of said processing nodes; and a delay module on each of said communication links, said delay module configured to delay outcome streams traversing said communication link by an assigned delay; wherein said first and second processing nodes are connected by a plurality of data paths, at least one of which comprises a plurality of communication links, each of said data paths causing an aggregate delay to an outcome stream traversing said data path; wherein no two aggregate delays impose the same delay on an outcome stream.
Type: Grant
Filed: February 22, 2011
Date of Patent: June 2, 2015
Assignee: ANALOG DEVICES, INC.
Inventor: William Bradley
-
Patent number: 9009701
Abstract: A next-generation OS with a virtualization feature is executed as a user program on a first virtual processor by selecting, in response to a cause of a call for a host VMM, one of a guest status area (221) for executing a user program on a second virtual processor and a host status area (222) for executing the guest VMM, and by updating a guest status area (131) of a shadow VMCS for controlling a physical processor. Accordingly, without a decrease in performance of a virtual computer, the next-generation OS incorporating the virtualization feature is executed on a virtual server, and the next-generation OS and an existing OS are integrated on a single physical computer.
Type: Grant
Filed: June 17, 2008
Date of Patent: April 14, 2015
Assignee: Hitachi, Ltd.
Inventors: Toshiomi Moriki, Naoya Hattori, Yuji Tsushima
-
Patent number: 8959313
Abstract: Techniques are described for transmitting predicted output data on a processing element in a stream computing application instead of processing currently received input data. The stream computing application monitors the output of a processing element and determines whether its output is predictable, for example, if the previously transmitted output values are within a predefined range or if one or more input values correlate with the same one or more output values. The application may then generate a predicted output value to transmit from the processing element instead of transmitting a processed output value based on current input values. The predicted output value may be, for example, an average of the previously transmitted output values or a previously transmitted output value that was transmitted in response to a previously received input value that is similar to a currently received input value.
Type: Grant
Filed: July 26, 2011
Date of Patent: February 17, 2015
Assignee: International Business Machines Corporation
Inventors: John M. Santosuosso, Brandon W. Schulz
-
Patent number: 8954713
Abstract: Techniques are described for transmitting predicted output data on a processing element in a stream computing application instead of processing currently received input data. The stream computing application monitors the output of a processing element and determines whether its output is predictable, for example, if the previously transmitted output values are within a predefined range or if one or more input values correlate with the same one or more output values. The application may then generate a predicted output value to transmit from the processing element instead of transmitting a processed output value based on current input values. The predicted output value may be, for example, an average of the previously transmitted output values or a previously transmitted output value that was transmitted in response to a previously received input value that is similar to a currently received input value.
Type: Grant
Filed: November 20, 2012
Date of Patent: February 10, 2015
Assignee: International Business Machines Corporation
Inventors: John M. Santosuosso, Brandon W. Schulz
-
Patent number: 8904153
Abstract: Mechanisms for performing a scattered load operation are provided. With these mechanisms, an extended address is received in a cache memory of a processor. The extended address has a plurality of data element address portions that specify a plurality of data elements to be accessed using the single extended address. Each of the plurality of data element address portions is provided to corresponding data element selector logic units of the cache memory. Each data element selector logic unit in the cache memory selects a corresponding data element from a cache line buffer based on a corresponding data element address portion provided to the data element selector logic unit. Each data element selector logic unit outputs the corresponding data element for use by the processor.
Type: Grant
Filed: September 7, 2010
Date of Patent: December 2, 2014
Assignee: International Business Machines Corporation
Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, Valentina Salapura