Processing Control Patents (Class 712/220)
  • Patent number: 8762690
    Abstract: The described embodiments provide a processor for generating a result vector with incremented or decremented values from an input vector. During operation, the processor receives an input vector and a control vector. The processor then copies a value contained in a selected element of the input vector. The processor next generates the result vector, which involves writing an incremented or decremented value to the result vector, depending on the value of the control vector and the embodiment. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: June 24, 2014
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff, Jr.
  • Patent number: 8762320
    Abstract: According to one embodiment of the invention, software operating as a state machine may be implemented within a digital device to support out-of-order processing of events by the state machine. Upon execution of the software by a processor, the following operations are performed. First, a determination is made whether an incoming event is a transition, and if so, whether the transition is not a transition associated with the current state of the state machine, but rather, is out-of-order from a predetermined order of transitions supported by the state machine. Upon determining that the transition is out-of-order, a determination is made whether the transition is to a reachable state such as a state prior to the current state of the state machine or to a future state from the current state. If so, the transition is allowed to be undertaken.
    Type: Grant
    Filed: December 23, 2009
    Date of Patent: June 24, 2014
    Assignee: Drumright Group, LLC.
    Inventors: Michael Allen Latta, Christian W. Stassen, Himansu Desai
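    Illustrative sketch (not taken from the patent): the abstract above describes accepting a transition that does not start from the current state, provided its target is a reachable state. The class name, the `reachable` table, and the returned labels below are hypothetical and only show one way the check could be expressed.

```python
class OutOfOrderStateMachine:
    """Toy state machine that tolerates out-of-order transitions.

    `reachable` is an assumed lookup table: for each state, the set of
    prior or future states that may be entered directly from it.
    """

    def __init__(self, initial, reachable):
        self.current = initial
        self.reachable = reachable

    def handle(self, src, dst):
        # In-order case: the transition starts at the current state.
        if src == self.current:
            self.current = dst
            return "in-order"
        # Out-of-order case: accept only if the target is reachable.
        if dst in self.reachable.get(self.current, set()):
            self.current = dst
            return "out-of-order accepted"
        return "rejected"


reachable = {
    "idle": {"connecting", "connected"},
    "connecting": {"idle", "connected"},
    "connected": {"idle"},
}
sm = OutOfOrderStateMachine("idle", reachable)
first = sm.handle("connecting", "connected")   # arrives before idle->connecting
second = sm.handle("connected", "idle")        # matches the current state
third = sm.handle("connecting", "closed")      # target is not reachable
```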
  • Publication number: 20140173257
    Abstract: Methods, parallel computers, and computer program products for requesting shared variable directory (SVD) information from a plurality of threads in a parallel computer are provided. Embodiments include a runtime optimizer detecting that a first thread requires a plurality of updated SVD information associated with shared resource data stored in a plurality of memory partitions. Embodiments also include a runtime optimizer broadcasting, in response to detecting that the first thread requires the updated SVD information, a gather operation message header to the plurality of threads. The gather operation message header indicates an SVD key corresponding to the required updated SVD information and a local address associated with the first thread to receive a plurality of updated SVD information associated with the SVD key. Embodiments also include the runtime optimizer receiving at the local address, the plurality of updated SVD information from the plurality of threads.
    Type: Application
    Filed: December 18, 2012
    Publication date: June 19, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: CHARLES J. ARCHER, JAMES E. CAREY, PHILIP J. SANDERS, BRIAN E. SMITH
  • Patent number: 8754896
    Abstract: In an apparatus which includes a plurality of processing modules connected via a ring-shape bus, if a plurality of pieces of pipeline processing to be processed in a different order is allocated to a plurality of processing modules, the transfer efficiency may decrease when an amount of data transferred from one of the processing modules to a post-stage module exceeds a processing capacity of the post-stage module. Accordingly, a module positioned on the preceding side in the pipeline processing controls a transmission interval of processed data so that the post-stage module can receive the data processed by the preceding module.
    Type: Grant
    Filed: October 4, 2010
    Date of Patent: June 17, 2014
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hiroyasu Watanabe, Hirowo Inoue, Hisashi Ishikawa
  • Publication number: 20140164743
    Abstract: Systems and methods for scheduling instructions for execution on a multi-core processor reorder the execution of different threads to ensure that instructions specified as having localized memory access behavior are executed over one or more sequential clock cycles to benefit from memory access locality. At compile time, code sequences including memory access instructions that may be localized are delineated into separate batches. A scheduling unit ensures that multiple parallel threads are processed over one or more sequential scheduling cycles to execute the batched instructions. The scheduling unit waits to schedule execution of instructions that are not included in the particular batch until execution of the batched instructions is done so that memory access locality is maintained for the particular batch. In between the separate batches, instructions that are not included in a batch are scheduled so that threads executing non-batched instructions are also processed and not starved.
    Type: Application
    Filed: December 10, 2012
    Publication date: June 12, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Olivier GIROUX, Jack Hilaire CHOQUETTE, Xiaogang QIU, Robert J. STOLL
  • Patent number: 8751774
    Abstract: A system and method for controlling messaging between a first processor and a second processor is disclosed. The second processor controls one or more peripheral devices on behalf of a plurality of predetermined tasks being executed by the first processor. The system includes a message control module that receives an input message intended for the second processor from the first processor and maintains a message history based on the received input message and previously received input messages. The message history indicates which peripheral devices of the system are to be on and which tasks of the plurality of tasks requested the peripheral devices to be on. The message control module is further configured to generate an output message that includes output instructions for the second processor based on the message history and an output duration based on the message history. The second processor executes the output instructions.
    Type: Grant
    Filed: March 31, 2011
    Date of Patent: June 10, 2014
    Assignees: DENSO International America, Inc., Denso Corporation
    Inventors: Wan-ping Yang, Koji Shinoda, Hiroaki Shibata
  • Patent number: 8751833
    Abstract: A data processing apparatus is provided comprising first processing circuitry, second processing circuitry and shared processing circuitry. The first processing circuitry and second processing circuitry are configured to operate in different first and second power domains respectively and the shared processing circuitry is configured to operate in a shared power domain. The data processing apparatus forms a uni-processing environment for executing a single instruction stream in which either the first processing circuitry and the shared processing circuitry operate together to execute the instruction stream or the second processing circuitry and the shared processing circuitry operate together to execute the single instruction stream. Execution flow transfer circuitry is provided for transferring at least one bit of processing-state restoration information between the two hybrid processing units.
    Type: Grant
    Filed: April 30, 2010
    Date of Patent: June 10, 2014
    Assignee: ARM Limited
    Inventor: Stephen John Hill
  • Publication number: 20140157288
    Abstract: A method, apparatus and computer program product are therefore provided to enable context aware logging. In this regard, the method, apparatus, and computer program product may record events that occur in one or more applications, where the events are due to user input. These events may be associated with time values and data describing application contexts, such that the events may be used to generate an input log that also records application semantics and statuses. A variety of operations may be performed using this input log, including recreation of an application state by playing back the log, the ability to suspend or resume a user session, the ability to perform undo or pause operations, the ability to analyze user inputs to train or audit users, testing of users, troubleshooting of errors, and enabling multi-user collaboration.
    Type: Application
    Filed: December 5, 2012
    Publication date: June 5, 2014
    Applicant: MCKESSON FINANCIAL HOLDINGS
    Inventor: Eldon Wong
  • Publication number: 20140156972
    Abstract: In an embodiment, the present invention includes a processor having an execution logic to execute instructions and a control transfer termination (CTT) logic coupled to the execution logic. This logic is to cause a CTT fault to be raised if a target instruction of a control transfer instruction is not a CTT instruction. Other embodiments are described and claimed.
    Type: Application
    Filed: November 30, 2012
    Publication date: June 5, 2014
    Inventors: Vedyvas Shanbhogue, Jason W. Brandt, Uday R. Savagaonkar, Ravi L. Sahita
  • Patent number: 8738892
    Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies at least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.
    Type: Grant
    Filed: April 15, 2008
    Date of Patent: May 27, 2014
    Assignee: Panasonic Corporation
    Inventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
  • Publication number: 20140143523
    Abstract: In a processor core, high latency operations are tracked in entries of a data structure associated with an execution unit of the processor core. In the execution unit, execution of an instruction dependent on a high latency operation tracked by an entry of the data structure is speculatively finished prior to completion of the high latency operation. Speculatively finishing the instruction includes reporting an identifier of the entry to completion logic of the processor core and removing the instruction from an execution pipeline of the execution unit. The completion logic records dependence of the instruction on the high latency operation and commits execution results of the instruction to an architected state of the processor only after successful completion of the high latency operation.
    Type: Application
    Filed: November 16, 2012
    Publication date: May 22, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: SUNDEEP CHADHA, BRYAN LLOYD, DUNG Q. NGUYEN, DAVID S. RAY, BENJAMIN W. STOLT
  • Publication number: 20140136819
    Abstract: A processor includes a physical register file having physical registers and an execution unit to perform an arithmetic operation to generate a result mapped to a physical register, wherein the processor delays a write of the result to the physical register file until the result is qualified as valid. A method includes mapping the same physical register both to store load data of a load-execute operation and to subsequently store a result of an arithmetic operation of the load-execute operation, and writing the load data into the physical register. The method further includes, in a first clock cycle, executing the arithmetic operation to generate the result, and, in a second clock cycle, providing the result as a source operand for a dependent operation. The method includes, in a third clock cycle, enabling a write of the result to the physical register file responsive to the result qualifying as valid.
    Type: Application
    Filed: November 9, 2012
    Publication date: May 15, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Ganesh Venkataramanan, Debjit Das Sarma, Betty A. McDaniel, Gregory W. Smaus, Francesco Spadini
  • Patent number: 8725961
    Abstract: Disclosed are methods and devices, among which is a method for configuring an electronic device. In one embodiment, an electronic device may include one or more memory locations having stored values representative of the capabilities of the device. According to an example configuration method, a configuring system may access the device capabilities from the one or more memory locations and configure the device based on the accessed device capabilities.
    Type: Grant
    Filed: March 20, 2012
    Date of Patent: May 13, 2014
    Assignee: Micron Technology Inc.
    Inventor: Harold B Noyes
  • Patent number: 8725992
    Abstract: A programming language may include hint instructions that may notify a programming idiom accelerator that a programming idiom is coming. An idiom begin hint exposes the programming idiom to the programming idiom accelerator. Thus, the programming idiom accelerator need not perform pattern matching or other forms of analysis to recognize a sequence of instructions. Rather, the programmer may insert idiom hint instructions, such as an idiom begin hint, to expose the idiom to the programming idiom accelerator. Similarly, an idiom end hint may mark the end of the programming idiom.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 13, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8725993
    Abstract: Various systems, processes, products, and techniques may be used to manage thread transitions. In particular implementations, a system and process for managing thread transitions may include the ability to determine that a transition is to be made regarding the relative use of two data register sets and determine, based on the transition determination, whether to move thread data in at least one of the data register sets to second-level registers. The system and process may also include the ability to move the thread data from at least one data register set to second-level registers based on the move determination.
    Type: Grant
    Filed: February 23, 2011
    Date of Patent: May 13, 2014
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Susan E. Eisen, James A. Kahle, Hung Q. Le, Dung Q. Nguyen
  • Publication number: 20140129807
    Abstract: A system and method are described for providing hints to a processing unit that subsequent operations are likely. Responsively, the processing unit takes steps to prepare for the likely subsequent operations. Where the hints are more likely than not to be correct, the processing unit operates more efficiently. For example, in an embodiment, the processing unit consumes less power. In another embodiment, subsequent operations are performed more quickly because the processing unit is prepared to efficiently handle the subsequent operations.
    Type: Application
    Filed: November 7, 2012
    Publication date: May 8, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: David Conrad TANNENBAUM, Ming Y. SIU, Stuart F. OBERMAN, Colin SPRINKLE, Srinivasan IYER, Ian Chi Yan KWONG
  • Publication number: 20140129806
    Abstract: A method and apparatus for picking load or store instructions is presented. Some embodiments of the method include determining that the entry in the queue includes an instruction that is ready to be executed by the processor based on at least one instruction-based event and concurrently determining cancel conditions based on global events of the processor. Some embodiments also include selecting the instruction for execution when the cancel conditions are not satisfied.
    Type: Application
    Filed: November 8, 2012
    Publication date: May 8, 2014
    Inventor: David A. Kaplan
  • Patent number: 8719554
    Abstract: A processor includes an initiating hardware thread, which initiates a first assist hardware thread to execute a first code segment. Next, the initiating hardware thread sets an assist thread executing indicator in response to initiating the first assist hardware thread. The set assist thread executing indicator indicates whether assist hardware threads are executing. A second assist hardware thread initiates and begins executing a second code segment. In turn, the initiating hardware thread detects a change in the assist thread executing indicator, which signifies that both the first assist hardware thread and the second assist hardware thread terminated. As such, the initiating hardware thread evaluates assist hardware thread results in response to both of the assist hardware threads terminating.
    Type: Grant
    Filed: January 23, 2013
    Date of Patent: May 6, 2014
    Assignee: International Business Machines Corporation
    Inventors: Richard Louis Arndt, Giles Roger Frazier, Ronald P. Hall
  • Patent number: 8719553
    Abstract: A microprocessor pipeline arrangement 1 includes a plurality of functional units 2, 3, 4, 5 and 6. Each functional unit 2, 3, 4, 5, 6 also has access to a respective cache memory 7, 8, 9, 10, 11. Threads for processing are received by the first functional unit 2 from an external source 12, and output by an end functional unit 6 of the pipeline to an output target 13. If a thread encounters a cache-miss on its passage through the pipeline, the thread is allowed to continue to pass through the pipeline in the normal manner. However, when the thread reaches the end of the pipeline, it is sent via a loopback path 14 back to the beginning of the pipeline to be sent through the pipeline again. In this way, any thread that has not completed its processing on passing through the pipeline can be sent through the pipeline again to allow the processing of the thread to be completed.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: May 6, 2014
    Assignee: ARM Norway AS
    Inventors: Jorn Nystad, Frode Heggelund
  • Patent number: 8717586
    Abstract: An image processing apparatus which makes it possible to select a plurality of instructions at a time, and connect a plurality of documents together so that they can be processed as one document. The image processing apparatus has a reading unit, which reads an image on an original to generate image data, and performs processing according to an instruction defining reading processing to be performed, as well as processing on the generated image data. The selected plurality of instructions are analyzed, and based on the analysis result, the selected plurality of instructions are connected together to create a new instruction.
    Type: Grant
    Filed: May 31, 2011
    Date of Patent: May 6, 2014
    Assignee: Canon Kabushiki Kaisha
    Inventor: Shinichi Takano
  • Publication number: 20140122842
    Abstract: A processor with a register file mapper can use a hasher to improve the distribution of mappings within a mapping structure. The hasher generates a value based, at least in part, on a thread identifier and logical register identifier. The hash value is used as an index value into the mapping structure. The hashing algorithm is chosen to provide a more even distribution of mappings within the mapping structure, reducing the amount of data written from a first level register file to a second level register file.
    Type: Application
    Filed: October 31, 2012
    Publication date: May 1, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
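    Illustrative sketch (not taken from the application): the abstract above describes hashing a thread identifier together with a logical register identifier to index a mapping structure. The hash function, the `stride` constant, and the table size below are assumptions chosen only to show why mixing in the thread identifier spreads entries more evenly than indexing by register number alone.

```python
def naive_index(thread_id, logical_reg, num_entries=64):
    # Baseline for comparison: ignores the thread, so every thread's
    # register 5 competes for the same slot.
    return logical_reg % num_entries


def hashed_index(thread_id, logical_reg, num_entries=64, stride=17):
    # Hypothetical hash: offset each thread by a fixed stride so that the
    # same logical register in different threads maps to different slots.
    return (logical_reg + thread_id * stride) % num_entries


naive_slots = {naive_index(t, 5) for t in range(4)}
hashed_slots = {hashed_index(t, 5) for t in range(4)}
```

    With four threads all using logical register 5, the naive index produces a single slot while the hashed index produces four distinct slots.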
  • Publication number: 20140122838
    Abstract: One embodiment of the present invention enables threads executing on a processor to locally generate and execute work within that processor by way of work queues and command blocks. A device driver, as an initialization procedure for establishing memory objects that enable the threads to locally generate and execute work, generates a work queue, and sets a GP_GET pointer of the work queue to the first entry in the work queue. The device driver also, during the initialization procedure, sets a GP_PUT pointer of the work queue to the last free entry included in the work queue, thereby establishing a range of entries in the work queue into which new work generated by the threads can be loaded and subsequently executed by the processor. The threads then populate command blocks with generated work and point entries in the work queue to the command blocks to effect processor execution of the work stored in the command blocks.
    Type: Application
    Filed: October 26, 2012
    Publication date: May 1, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Ignacio LLAMAS, Craig Ross DUTTWEILER, Jeffrey A. BOLZ, Daniel Elliot WEXLER
  • Publication number: 20140122848
    Abstract: Systems and methods for instruction entity allocation and scheduling on multi-processors are provided. In at least one embodiment, a method for generating an execution schedule for a plurality of instruction entities for execution on a plurality of processing units comprises arranging the plurality of instruction entities into a sorted order and allocating instruction entities in the plurality of instruction entities to individual processing units in the plurality of processing units. The method further comprises scheduling instances of the instruction entities in scheduled time windows in the execution schedule, wherein the instances of the instruction entities are scheduled in scheduled time windows according to the sorted order of the plurality of instruction entities and organizing the execution schedule into execution groups.
    Type: Application
    Filed: October 31, 2012
    Publication date: May 1, 2014
    Applicant: HONEYWELL INTERNATIONAL INC.
    Inventors: Arvind Easwaran, Srivatsan Varadarajan
  • Patent number: 8713289
    Abstract: Emulation of source machine instructions is provided in which target machine CPU condition codes are employed to produce emulated condition code settings without the use, encoding or generation of branching instructions.
    Type: Grant
    Filed: January 30, 2007
    Date of Patent: April 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Reid T. Copeland, Patrick R. Doyle, Charles B. Hall, Andrew Johnson, Ali I. Sheikh
  • Patent number: 8713335
    Abstract: A parallel processing computing system includes an ordered set of m memory banks and a processor core. The ordered set of m memory banks includes a first and a last memory bank, wherein m is an integer greater than 1. The processor core implements n virtual processors, a pipeline having p ordered stages, including a memory operation stage, and a virtual processor selector function.
    Type: Grant
    Filed: July 19, 2013
    Date of Patent: April 29, 2014
    Assignee: Cognitive Electronics, Inc.
    Inventors: Andrew C. Felch, Richard H. Granger
  • Patent number: 8713290
    Abstract: A processor includes an initiating hardware thread, which initiates a first assist hardware thread to execute a first code segment. Next, the initiating hardware thread sets an assist thread executing indicator in response to initiating the first assist hardware thread. The set assist thread executing indicator indicates whether assist hardware threads are executing. A second assist hardware thread initiates and begins executing a second code segment. In turn, the initiating hardware thread detects a change in the assist thread executing indicator, which signifies that both the first assist hardware thread and the second assist hardware thread terminated. As such, the initiating hardware thread evaluates assist hardware thread results in response to both of the assist hardware threads terminating.
    Type: Grant
    Filed: September 20, 2010
    Date of Patent: April 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Richard Louis Arndt, Giles Roger Frazier, Ronald P. Hall
  • Patent number: 8713575
    Abstract: A data processing architecture includes multiple processors connected in series between a load balancer and reorder logic. The load balancer is configured to receive data and distribute the data across the processors. Appropriate ones of the processors are configured to process the data. The reorder logic is configured to receive the data processed by the processors, reorder the data, and output the reordered data.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: April 29, 2014
    Assignee: Juniper Networks, Inc.
    Inventors: John C Carney, Michael E Lipman
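    Illustrative sketch (not taken from the patent): the abstract above places reorder logic after the processors so that output leaves in the original order. The abstract does not say how ordering is tracked; the sequence numbers and the `ReorderBuffer` class below are assumptions showing one common way to do it.

```python
import heapq


class ReorderBuffer:
    """Releases items in sequence-number order, whatever order they arrive in."""

    def __init__(self):
        self._next_seq = 0   # next sequence number allowed to leave
        self._pending = []   # min-heap of (seq, item) waiting for earlier items

    def push(self, seq, item):
        """Accept a processed item; return the items that can now be emitted."""
        heapq.heappush(self._pending, (seq, item))
        released = []
        while self._pending and self._pending[0][0] == self._next_seq:
            released.append(heapq.heappop(self._pending)[1])
            self._next_seq += 1
        return released


rb = ReorderBuffer()
out_a = rb.push(1, "b")   # item 0 has not arrived yet, so nothing leaves
out_b = rb.push(0, "a")   # item 0 arrives and unblocks item 1
out_c = rb.push(2, "c")
```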
  • Patent number: 8713294
    Abstract: A method and system for providing a memory access check on a processor including the steps of detecting accesses to a memory device including level-1 cache using a wakeup unit. The method includes invalidating level-1 cache ranges corresponding to a guard page, and configuring a plurality of wakeup address compare (WAC) registers to allow access to selected WAC registers. The method selects one of the plurality of WAC registers, and sets up a WAC register related to the guard page. The method configures the wakeup unit to interrupt on access of the selected WAC register. The method detects access of the memory device using the wakeup unit when a guard page is violated. The method generates an interrupt to the core using the wakeup unit, and determines the source of the interrupt. The method detects the activated WAC registers assigned to the violated guard page, and initiates a response.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: April 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Thomas M. Gooding, David L. Satterfield, Burkhard Steinmacher-Burow
  • Publication number: 20140115301
    Abstract: The present disclosure provides a processor, and associated method, for performing parallel processing within a register. An exemplary processor may include a processing element having a compute unit and a register file. The register file includes a register that is divisible into lanes for parallel processing. The processor may further include a mask register and a predicate register. The mask register and the predicate register respectively include a number of mask bits and predicate bits equal to a maximum number of divisible lanes of the register. A state of the mask bits and predicate bits is set to respectively achieve enabling/disabling of the lanes from executing an instruction and conditional performance of an operation defined by the instruction. Further, the processor is operable to perform a reduction operation across the lanes of the processing element and/or generate an address for each of the lanes of the processing element.
    Type: Application
    Filed: January 10, 2013
    Publication date: April 24, 2014
    Applicant: Analog Devices Technology
    Inventors: Kaushal Sanghai, Michael G. Perkins, Andrew J. Higham
  • Publication number: 20140115302
    Abstract: According to an example embodiment, a processor such as a digital signal processor (DSP), is provided with a register acting as a predicate counter. The predicate counter may include more than two useful values, and in addition to acting as a condition for executing an instruction, may also keep track of nesting levels within a loop or conditional branch. In some cases, the predicate counter may be configured to operate in single-instruction, multiple data (SIMD) mode, or SIMD-within-a-register (SWAR) mode.
    Type: Application
    Filed: August 9, 2013
    Publication date: April 24, 2014
    Applicant: ANALOG DEVICES TECHNOLOGY
    Inventors: Andrew J. Higham, Boris Lemer, Kaushal Sanghai, Michael G. Perkins, John L. Redford, Michael S. Allen
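    Illustrative sketch (not taken from the application): the abstract above describes a predicate counter that both gates execution and tracks nesting depth. The scheme below is one plausible reading, assumed for illustration: zero means instructions execute, and a nonzero value counts how many nested conditional blocks deep the disabled region is. The class and method names are hypothetical.

```python
class PredicateCounter:
    """Counter-based predicate: 0 means execute, >0 means skip."""

    def __init__(self):
        self.count = 0

    @property
    def active(self):
        return self.count == 0

    def enter_if(self, condition):
        # Already disabled, or the condition fails: go one level deeper.
        if self.count > 0 or not condition:
            self.count += 1

    def exit_if(self):
        # Leaving a block that was disabled: come back up one level.
        if self.count > 0:
            self.count -= 1


pc = PredicateCounter()
trace = []
pc.enter_if(False); trace.append(pc.count)   # outer condition fails
pc.enter_if(True);  trace.append(pc.count)   # nested block inside a disabled region
pc.exit_if();       trace.append(pc.count)
pc.exit_if();       trace.append(pc.count)   # back to executing
```

    A single boolean flag could not tell the inner block's end from the outer block's end, which is why a counter with more than two values is useful here.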
  • Patent number: 8707016
    Abstract: A set of helper thread binaries is created to retrieve data used by a set of main thread binaries. The set of helper thread binaries and the set of main thread binaries are partitioned according to common instruction boundaries. As a first partition in the set of main thread binaries executes within a first core, a second partition in the set of helper thread binaries executes within a second core, thus “warming up” the cache in the second core. When the first partition of the main thread binaries completes execution, a second partition of the main thread binaries moves to the second core, and executes using the warmed up cache in the second core.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy
  • Patent number: 8700884
    Abstract: A processor in a data processing system executes a permutation instruction which identifies a first source register, at least one other source register, and a destination register. The first source register stores at least one in-range index value for the at least one other source register and at least one out-of-range index value for the at least one other source register. The at least one other source register stores a plurality of vector element values, wherein each in-range index value indicates which vector element value of the at least one other source register is to be stored into a corresponding vector element of the destination register. Each out-of-range index value is used to indicate which one of at least two predetermined constant values is to be stored into a corresponding vector element of the destination register. Partial table lookups using a permutation instruction shorten the time required to retrieve data.
    Type: Grant
    Filed: October 12, 2007
    Date of Patent: April 15, 2014
    Assignee: Freescale Semiconductor, Inc.
    Inventors: William C. Moyer, Imran Ahmed, Dan E. Tamir
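    Illustrative sketch (not taken from the patent): the abstract above describes a permutation in which in-range indices select source elements and out-of-range indices select one of at least two predetermined constants. How an out-of-range index picks its constant is not stated in the abstract; the rule below (indices in [n, 2n) give one constant, everything else gives the other) and the constant values are assumptions.

```python
def permute(indices, source, const_low=0x00, const_high=0xFF):
    """Software model of a permute with constant fill for out-of-range indices."""
    n = len(source)
    result = []
    for idx in indices:
        if 0 <= idx < n:
            result.append(source[idx])      # in-range: copy the selected element
        elif n <= idx < 2 * n:
            result.append(const_low)        # assumed rule: first out-of-range band
        else:
            result.append(const_high)       # assumed rule: any other index
    return result


table_part = [10, 20, 30, 40]
looked_up = permute([2, 0, 5, 9], table_part)
```

    Here `looked_up` is `[30, 10, 0, 255]`: two elements come from the table slice and two are filled with constants, which is what allows a large table lookup to be split into several partial lookups.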
  • Patent number: 8700887
    Abstract: A processor and a processor control method which efficiently perform an operation on data using a register are provided. The register may include a data type field and a data field. The processor may generate the data type bits and store the generated data type bits in the data type field.
    Type: Grant
    Filed: September 30, 2010
    Date of Patent: April 15, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Bernhard Egger, Dong-Hoon Yoo
  • Publication number: 20140101412
    Abstract: Systems and methods are provided for speculatively elevating a privilege level at which instructions are executed. In an embodiment, this is accomplished by identification of a privilege elevation instruction (e.g., SYSCALL) at an early pipeline stage and speculatively executing subsequent instructions with elevated privileges.
    Type: Application
    Filed: October 4, 2012
    Publication date: April 10, 2014
    Inventor: Ricardo RAMIREZ
  • Publication number: 20140101417
    Abstract: An information processing system records an execution of a program instruction. A determination is made that a thread has entered a program unit. Another determination is made that the thread is associated with at least one attribute that matches a set of thread recording criteria. An instruction recording mechanism for the thread is dynamically activated in response to the at least one attribute of the thread matching the set of thread recording criteria.
    Type: Application
    Filed: December 13, 2013
    Publication date: April 10, 2014
    Applicant: International Business Machines Corporation
    Inventors: Christopher D. FILACHEK, Mei Hui WANG, Joshua B. WISNIEWSKI
  • Publication number: 20140095839
    Abstract: A pipelined processing device includes: a pipeline controller configured to receive at least one instruction associated with an operation from each of a plurality of subcontrollers, and input the at least one instruction into a pipeline; and a pipeline counter configured to receive an active time value from each of the plurality of subcontrollers, the active time value indicating at least a portion of a time taken to process the at least one instruction, the pipeline controller configured to route the active time value to a shared pipeline storage for performance analysis.
    Type: Application
    Filed: December 3, 2013
    Publication date: April 3, 2014
    Applicant: International Business Machines Corporation
    Inventors: Ekaterina M. Ambroladze, Deanna Postles Dunn Berger, Michael Fee, Christine C. Jones, Arthur J. O'Neill, Diana Lynn Orf, Robert J. Sonnelitter
  • Publication number: 20140095831
    Abstract: An apparatus and method are described for performing efficient gather operations in a pipelined processor. For example, a processor according to one embodiment of the invention comprises: gather setup logic to execute one or more gather setup operations in anticipation of one or more gather operations, the gather setup operations to determine one or more addresses of vector data elements to be gathered by the gather operations; and gather logic to execute the one or more gather operations to gather the vector data elements using the one or more addresses determined by the gather setup operations.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Edward T. Grochowski, Dennis R. Bradford, George Z. Chrysos, Andrew T. Forsyth, Michael D. Upton, Lisa K. Wu
  • Publication number: 20140095840
    Abstract: A system serialization capability is provided to facilitate processing in those environments that allow multiple processors to update the same resources. The system serialization capability is used to facilitate processing in a multi-processing environment in which guests and hosts use locks to provide serialization. The system serialization capability includes a diagnose instruction which is issued after the host acquires a lock, eliminating the need for the guest to acquire the lock.
    Type: Application
    Filed: December 3, 2013
    Publication date: April 3, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lisa C. Heller
  • Publication number: 20140095837
    Abstract: A processor executes a mask update instruction to perform updates to a first mask register and a second mask register. A register file within the processor includes the first mask register and the second mask register. The processor includes execution circuitry to execute the mask update instruction. In response to the mask update instruction, the execution circuitry is to invert a given number of mask bits in the first mask register, and also to invert the given number of mask bits in the second mask register.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher Hughes
  • Publication number: 20140095841
    Abstract: A processor including a circuit unit includes a state information holding unit, a direction controller, a direction generator, and a direction execution unit. The state information holding unit holds state information indicating a state of the circuit unit. The direction controller decodes a first direction for generating a control direction that is contained in a program. The direction generator generates a second direction when the first direction decoded by the direction controller is a direction for generating the second direction for reading the state information from the state information holding unit. The direction execution unit reads the state information from the state information holding unit based on the second direction generated by the direction generator so as to store the state information in a register unit that is capable of being read from a program.
    Type: Application
    Filed: December 5, 2013
    Publication date: April 3, 2014
    Applicant: FUJITSU LIMITED
    Inventors: MASANORI DOI, Michiharu HARA, Iwao YAMAZAKI, Ryuichi SUNAYAMA
  • Publication number: 20140095838
    Abstract: A processor includes a processing unit including a storage module having stored thereon a physical reference list for storing identifications of physical registers that have been referenced by multiple logical registers, and a reclamation module for reclaiming physical registers to a free list based on a count of each of the physical registers on the physical reference list.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: VIJAYKUMAR VIJAY KADGI, JAMES D. HADLEY, AVINASH SODANI, MATTHEW C. MERTEN, MORRIS MARDEN, JOSEPH A. MCMAHON, GRACE C. LEE, LAURA A. KNAUTH, ROBERT S. CHAPPELL, FARIBORZ TABESH
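    The count-based reclamation described in this abstract can be sketched as a small reference-counting model. This is a simplified illustration, not the patented design: the class `RegisterReclaimer`, its method names, and the policy of freeing a register when its count reaches zero are assumptions.

```python
from collections import Counter


class RegisterReclaimer:
    """Toy model of a free list plus per-register reference counts."""

    def __init__(self, num_physical):
        self.free_list = list(range(num_physical))  # unallocated registers
        self.ref_counts = Counter()                 # physical reg -> references

    def allocate(self):
        """Take a register from the free list for one logical register."""
        reg = self.free_list.pop(0)
        self.ref_counts[reg] = 1
        return reg

    def share(self, reg):
        """Record that another logical register now maps to the same register."""
        self.ref_counts[reg] += 1

    def release(self, reg):
        """Drop one reference; reclaim the register when none remain (assumed policy)."""
        self.ref_counts[reg] -= 1
        if self.ref_counts[reg] == 0:
            del self.ref_counts[reg]
            self.free_list.append(reg)


if __name__ == "__main__":
    r = RegisterReclaimer(2)
    p = r.allocate()
    r.share(p)        # two logical registers reference p
    r.release(p)      # one reference left, p stays allocated
    r.release(p)      # last reference gone, p returns to the free list
    print(r.free_list)
```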
  • Patent number: 8689222
    Abstract: A method, a system and a computer program product for controlling the hardware priority of hardware threads in a data processing system. A Thread Priority Control (TPC) utility assigns a primary level and one or more secondary levels of hardware priority to a hardware thread. When a hardware thread initiates execution in the absence of a system call, the TPC utility enables execution based on the primary level. When the hardware thread initiates execution within a system call, the TPC utility dynamically adjusts execution from the primary level to the secondary level associated with the system call. The TPC utility adjusts hardware priority levels in order to: (a) raise the hardware priority of one hardware thread relative to another; (b) reduce energy consumed by the hardware thread; and (c) fulfill requirements of time critical hardware sections.
    Type: Grant
    Filed: October 30, 2008
    Date of Patent: April 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: Vaijayanthimala K. Anand, Joerg Droste, Bruce Mealey, Bret Ronald Olszewski
  • Publication number: 20140089640
    Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to execute a complex instruction that requires multiple instruction cycles to execute, and to enforce atomic execution of the complex instruction during a first portion of the multiple instruction cycles required to execute the complex instruction. The at least one of the execution units is further configured to enable execution of the complex instruction to be interrupted for execution of a different instruction by the at least one execution unit during execution of a second portion of the multiple instruction cycles. The first portion and the second portion are non-overlapping.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 27, 2014
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Horst Diewald, Johann Zipperer
  • Publication number: 20140089637
    Abstract: A technique for optimizing program instruction execution throughput in a central processing unit core (CPU). The CPU implements a simultaneous multithreading (SMT) operational mode wherein program instructions associated with at least two software threads are executed in parallel as hardware threads while sharing one or more hardware resources used by the CPU, such as cache memory, translation lookaside buffers, functional execution units, etc. As part of the SMT mode, the CPU implements an autothread (AT) operational mode. During the AT operational mode, a determination is made whether there is a resource conflict between the hardware threads that undermines instruction execution throughput. If a resource conflict is detected, the CPU adjusts the relative instruction execution rates of the hardware threads based on relative priorities of the software threads.
    Type: Application
    Filed: November 29, 2013
    Publication date: March 27, 2014
    Applicant: International Business Machines Corporation
    Inventors: Amit Merchant, Dipankar Sarma, Vaidyanathan Srinivasan
  • Publication number: 20140089642
    Abstract: One or more embodiments may provide a method for performing a replay. The method includes initiating execution of a program, the program having a plurality of sets of instructions, and each set of instructions has a number of chunks of instructions. The method also includes intercepting, by a virtual machine unit executing on a processor, an instruction of a chunk of the number of chunks before execution. The method further includes determining, by a replay module executing on the processor, whether the chunk is an active chunk, and responsive to the chunk being the active chunk, executing the instruction.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 27, 2014
    Inventors: Justin E. Gottschlich, Klaus Danne, Cristiano L. Pereira, Gilles A. Pokam, Rolf Kassa, Shiliang Hu, Tim Kranich
  • Publication number: 20140089641
    Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to repeatedly execute a first instruction based on a first field of the first instruction indicating that the first instruction is to be iteratively executed.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 27, 2014
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Horst Diewald, Johann Zipperer
  • Patent number: 8683181
    Abstract: An arithmetic processor includes a first pipeline unit configured to execute a first instruction that is input; a second pipeline unit configured to execute a second instruction that is input; a registration unit into which an aborted instruction is registered, the aborted instruction being the first instruction when the first pipeline unit is unable to complete the first instruction or the second instruction when the second pipeline unit is unable to complete the second instruction; a determination unit configured to make a determination as to which one of the first pipeline unit and the second pipeline unit is operating under a lower load; and an input unit configured to input, in the first pipeline unit or the second pipeline unit that is determined as operating under the lower load by the determination unit, the aborted instruction that is registered in the registration unit.
    Type: Grant
    Filed: December 9, 2010
    Date of Patent: March 25, 2014
    Assignee: Fujitsu Limited
    Inventor: Hideki Okawara
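    The re-input of aborted instructions described in this abstract can be sketched as follows. This is a software analogy under stated assumptions, not the hardware described in the patent: the names `Pipeline` and `reissue_aborted` are hypothetical, and load is approximated here by the number of queued instructions.

```python
from collections import deque


class Pipeline:
    """Toy pipeline whose load is the number of queued instructions (assumption)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()

    @property
    def load(self):
        return len(self.queue)


def reissue_aborted(aborted, pipe_a, pipe_b):
    """Move each registered aborted instruction into the less loaded pipeline."""
    while aborted:
        instr = aborted.popleft()
        # Ties go to pipe_a; the abstract does not specify a tie-break rule.
        target = pipe_a if pipe_a.load <= pipe_b.load else pipe_b
        target.queue.append(instr)


if __name__ == "__main__":
    a, b = Pipeline("first"), Pipeline("second")
    a.queue.extend(["i1", "i2"])
    aborted = deque(["x", "y", "z"])
    reissue_aborted(aborted, a, b)
    print(list(a.queue), list(b.queue))
```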
  • Publication number: 20140082331
    Abstract: A system and method for controlling processor instruction execution. In one example, a method for synchronizing a number of instructions performed by processors includes instructing a first processor to iteratively execute instructions via a first set of iterations until a predetermined time period has elapsed. A number of instructions executed in each iteration of the first set of iterations is less than a number of instructions executed in a prior iteration of the first set of iterations. The method also includes instructing a second processor to iteratively execute instructions via a second set of iterations until the predetermined time period has elapsed. A number of instructions executed in each iteration of the second set of iterations is less than a number of instructions executed in a prior iteration of the second set of iterations. The method includes determining whether additional instructions are to be executed.
    Type: Application
    Filed: September 14, 2012
    Publication date: March 20, 2014
    Applicant: General Electric Company
    Inventors: William David Smith, II, Safayet Nizam Uddin Ahmed, Jon Marc Diekema
  • Patent number: 8677362
    Abstract: Provided are an apparatus for reconfiguring a mapping method and a scheduling method in a reconfigurable multi-processor system. A single function is mapped to a reconfigurable processor. When a task is created in the reconfigurable multi-processor system, a function of the task is dynamically mapped to a host processor or a reconfigurable processor, thereby removing temporal sharing between functions on the reconfigurable processor and thus reducing the number of times reconfiguration is performed. The overhead of the reconfigurable processor is minimized and the reconfigurable processor is optimized for a dynamic multi-application environment.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: March 18, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Chae-Seok Im, Gyu-Sang Choi, Jung-Keun Park
  • Patent number: 8677102
    Abstract: An instruction fusion calculation device of the present invention includes an instruction fusion detection circuit, an instruction fusion circuit, and a calculator. The instruction fusion detection circuit determines whether or not a preceding instruction and a subsequent instruction that have a flow dependence relationship between them can be fused. The instruction fusion circuit fuses into one instruction the preceding instruction and the subsequent instruction that the instruction fusion detection circuit has determined can be fused. The calculator executes the fused instruction produced by the instruction fusion circuit to output the calculation result, and also outputs, as an intermediate result, at least one of the calculation results obtained by executing the preceding instruction and the subsequent instruction.
    Type: Grant
    Filed: May 18, 2010
    Date of Patent: March 18, 2014
    Assignee: NEC Corporation
    Inventor: Takahiko Uesugi
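    The fusion of two flow-dependent instructions described in this abstract can be sketched as follows. This is a minimal model, not the device in the patent: the `Instr` tuple format, the two supported operations, and the function names are assumptions made for illustration. A flow dependence is taken to mean that the subsequent instruction reads the register the preceding instruction writes.

```python
import operator
from typing import NamedTuple


class Instr(NamedTuple):
    op: str    # "add" or "mul" in this sketch
    dst: str   # destination register name
    src1: str  # first source register name
    src2: str  # second source register name


OPS = {"add": operator.add, "mul": operator.mul}


def has_flow_dependence(first, second):
    """True if the subsequent instruction reads the preceding one's result."""
    return first.dst in (second.src1, second.src2)


def execute_fused(first, second, regs):
    """Execute both instructions as one step.

    Returns (intermediate, final): the preceding instruction's result and
    the subsequent instruction's result.
    """
    intermediate = OPS[first.op](regs[first.src1], regs[first.src2])
    # The subsequent instruction sees the intermediate value in first.dst.
    view = dict(regs)
    view[first.dst] = intermediate
    final = OPS[second.op](view[second.src1], view[second.src2])
    return intermediate, final


if __name__ == "__main__":
    regs = {"r1": 2, "r2": 3, "r4": 4}
    first = Instr("add", "r3", "r1", "r2")   # r3 = r1 + r2
    second = Instr("mul", "r5", "r3", "r4")  # r5 = r3 * r4
    if has_flow_dependence(first, second):
        print(execute_fused(first, second, regs))
```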