Long Instruction Word Patents (Class 712/24)
  • Patent number: 10860321
    Abstract: An electronic device including a memory; and a processor configured to generate an instruction code based on a same opcode when the same opcode is used in one or more slots defined in the memory upon application compiling.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: December 8, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hwee Soo Kim, Hyuk Min Kwon, Won Jin Kim
  • Patent number: 10838728
    Abstract: Supplemental instruction dispatch may be used in some instances in a parallel slice processor to dispatch additional instructions, referred to as supplemental instructions, to supplemental instruction ports of execution slices and using primary instruction ports of one or more execution slices to supply one or more source operands for such supplemental instructions. In addition, in some instances, in lieu of or in addition to supplemental instruction dispatch, selective slice partitioning may be used to selectively partition groups of execution slices in a parallel slice processor based upon a threading mode within which such execution slices are executing.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: November 17, 2020
    Assignee: International Business Machines Corporation
    Inventors: Kurt A. Feiste, Christopher M. Mueller, Dung Q. Nguyen, Eula A. Tolentino, Tien T. Tran, Jing Zhang
  • Patent number: 10678724
    Abstract: Systems, methods, and apparatuses relating to in-network storage for a configurable spatial accelerator are described.
    Type: Grant
    Filed: December 29, 2018
    Date of Patent: June 9, 2020
    Assignee: Intel Corporation
    Inventors: Kermin ChoFleming, Simon Steely, Jr., Kent Glossop
  • Patent number: 10664269
    Abstract: A selected installed function of a multi-function instruction is hidden such that even though a processor is capable of performing the hidden installed function, the availability of the hidden function is hidden such that responsive to the multi-function instruction querying the availability of functions, only functions not hidden are reported as installed.
    Type: Grant
    Filed: December 8, 2017
    Date of Patent: May 26, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Damian L. Osisek, Timothy J. Slegel
  • Patent number: 10628143
    Abstract: Provided is a program development assist system, a program development assist method, and a non-transitory computer readable recording medium storing a program development assist program. The program development assist system includes: a shared variable extraction part that extracts, from the first source code that is described in the first programming language, shared variables that are variables shared by the first source code and the second source code that is described in the second programming language in a memory; and a display control part that causes a development screen of the second source code to display information indicating shared variables that are extracted by the shared variable extraction part.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: April 21, 2020
    Assignee: OMRON Corporation
    Inventors: Yoshimi Niwa, Taku Oya, Kei Yasuda
  • Patent number: 10540183
    Abstract: As disclosed herein a method, executed by a processor, for accelerated instruction execution includes retrieving an execute instruction including a register reference and a reference to a target instruction, retrieving the target instruction, decoding the execute instruction using an instruction pipeline, decoding the target instruction using the instruction pipeline, associating the register reference to the target instruction, and executing the target instruction using the register reference as a source operand modifier. The instruction pipeline is configured such that it allows the target instruction to continue processing without waiting for the register reference to be resolved. The contents of the referenced register may be retrieved in a later stage of the instruction pipeline, and the target instruction may be modified and executed. An apparatus corresponding to the described method is also disclosed herein.
    Type: Grant
    Filed: October 31, 2017
    Date of Patent: January 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, Fadi Y. Busaba, Brian W. Curran, David S. Hutton, Edward T. Malley, Brian R. Prasky, John G. Rell, Jr.
  • Patent number: 10523764
    Abstract: A synchronous packet-processing pipeline whose data paths are populated with data-plane stateful processing units (DSPUs) is provided. A DSPU is a programmable processor whose operations are synchronous with the dataflow of the packet-processing pipeline. A DSPU performs every computation with fixed latency. Each DSPU is capable of maintaining a set of states and performing its computations based on its maintained set of states. The programming of a DSPU determines how and when the DSPU updates one of its maintained states. Such programming may configure the DSPU to update the state based on its received packet data, or to change the state regardless of the received packet data.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: December 31, 2019
    Assignee: Barefoot Networks, Inc.
    Inventors: Anirudh Sivaraman Kaushalram, Mihai Budiu, Changhoon Kim
  • Patent number: 10514928
    Abstract: A data processing apparatus has control circuitry for detecting whether a first micro-operation to be processed by a first processing lane would give the same result as a second micro-operation processed by a second processing lane. If they would give the same result, then the first micro-operation is prevented from being processed by the first processing lane and the result of the second micro-operation is output as the result of the first micro-operation. This avoids duplication of processing, to save energy for example.
    Type: Grant
    Filed: March 20, 2015
    Date of Patent: December 24, 2019
    Assignee: ARM Limited
    Inventors: Isidoros Sideris, Daren Croxford, Andrew Burdass
  • Patent number: 10481892
    Abstract: A system for updating a multiple domain embedded system may include a processor that can identify a device associated with the embedded system and a driver that supports the device. The processor can determine a domain associated with the driver and a first configuration label of a first configuration of the multiple domain embedded system. The processor can also determine a second configuration label of a second configuration of the multiple domain embedded system, based on the first configuration label, an identification of the driver, and an identification of the device. Further, the processor can update the driver based on the second configuration label.
    Type: Grant
    Filed: April 2, 2013
    Date of Patent: November 19, 2019
    Assignee: HARMAN BECKER AUTOMOTIVE SYSTEMS GMBH
    Inventors: Markus Broghammer, Dirk Fries
  • Patent number: 10353681
    Abstract: A method for improving performance of an access triggered architecture for a computer implemented application is provided. The method first executes typical operations of the access triggered architecture according to an execution time, wherein the typical operations comprise: obtaining a dataset and an instruction set; and using the instruction set to transmit the dataset to a functional block associated with an operation, wherein the functional block performs the operation using the dataset to generate a revised dataset. The method further creates a pipeline of the typical operations to reduce the execution time of the typical operations, to create a reduced execution time; and executes the typical operations according to the reduced execution time, using the pipeline.
    Type: Grant
    Filed: August 28, 2017
    Date of Patent: July 16, 2019
    Assignee: HONEYWELL INTERNATIONAL INC.
    Inventors: Thom Kreider, Jon Douglas Gilreath, Gary Warnica, Paul D. Kammann, Vince J. Gavagan, IV, Ronald E. Strong
  • Patent number: 10355737
    Abstract: A touch screen controller (TSC) includes: a front end circuit configured to send a control signal to a touch panel and to receive a touch signal from the touch panel; an algorithm processing circuit configured to process source data generated based on the touch signal according to a predetermined algorithm; a memory configured to store the source data and result data obtained as a result of processing the source data at the algorithm processing circuit; and a bus configured to transfer data among the front end circuit, the algorithm processing circuit, and the memory. The algorithm processing circuit includes: a buffer configured to temporarily store the source data or the result data and shared by at least two circuits; and a special function register (SFR) configured to store a setting value necessary for an operation of the algorithm processing circuit.
    Type: Grant
    Filed: November 5, 2018
    Date of Patent: July 16, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hyung-Dal Kwon, Gyeong Min Ha, Ho-Suk Na
  • Patent number: 10289416
    Abstract: Embodiments of systems, apparatuses, and methods for lane-based strided gather are disclosed. In an embodiment, an apparatus includes a decoder to decode an instruction, wherein the instruction to include fields for indices of addresses to memory, and a packed data destination register operand; and execution circuitry to execute the decoded instruction to extract data elements of a defined number of types from memory using the indices of the instruction, and for each type, store the extracted data elements in one or more lanes of a packed data destination register dedicated to that type, wherein relative data elements between types are strided data elements apart.
    Type: Grant
    Filed: December 30, 2015
    Date of Patent: May 14, 2019
    Assignee: Intel Corporation
    Inventor: Elmoustapha Ould-Ahmed-Vall
  • Patent number: 10127043
    Abstract: A method and system for implementing very long instruction words (VLIW), the system operable to: receive a first very long instruction word (VLIW) including a set of slot instructions corresponding to a set of functional units, where: each slot instruction includes an opcode identifying an operation to be performed by the set of functional units and value fields related to the operation, where a dedicated subset of the value fields include dedicated bits dedicated to the slot instruction and an allocable subset of the value fields include allocable bits allocable to other slot instructions; identify the opcodes of each slot instruction; determine, based on the opcodes, which allocable bits are allocated to which slot instructions; and instruct each functional unit to perform an operation identified by a corresponding slot instruction using the corresponding dedicated bits and any allocable bits determined to be allocated to the slot instruction.
    Type: Grant
    Filed: October 19, 2016
    Date of Patent: November 13, 2018
    Assignee: Rex Computing, Inc.
    Inventors: Paul Michael Sebexen, Thomas Rex Sohmers
  • Patent number: 10102001
    Abstract: Supplemental instruction dispatch may be used in some instances in a parallel slice processor to dispatch additional instructions, referred to as supplemental instructions, to supplemental instruction ports of execution slices and using primary instruction ports of one or more execution slices to supply one or more source operands for such supplemental instructions. In addition, in some instances, in lieu of or in addition to supplemental instruction dispatch, selective slice partitioning may be used to selectively partition groups of execution slices in a parallel slice processor based upon a threading mode within which such execution slices are executing.
    Type: Grant
    Filed: March 28, 2018
    Date of Patent: October 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Kurt A. Feiste, Christopher M. Mueller, Dung Q. Nguyen, Eula A. Tolentino, Tien T. Tran, Jing Zhang
  • Patent number: 9985649
    Abstract: A technique for managing data storage applies both inline software compression and inline hardware compression in a data storage system, using both types of compression together. The data storage system applies inline software compression for compressing a first set of newly arriving data and applies inline hardware compression for compressing a second set of newly arriving data. Both sets of data are directed to a data object, and the data storage system compresses both sets of data without first storing uncompressed versions thereof in the data object.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: May 29, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Wai C. Yim
  • Patent number: 9977678
    Abstract: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signals that also serve as configuration control signals. The mode control signals are also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.
    Type: Grant
    Filed: January 12, 2015
    Date of Patent: May 22, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
  • Patent number: 9971602
    Abstract: A method of operating a processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signals that also serve as configuration control signals. The mode control signals are also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.
    Type: Grant
    Filed: May 28, 2015
    Date of Patent: May 15, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
  • Patent number: 9921755
    Abstract: System, method, and apparatus for integrated main memory (MM) and configurable coprocessor (CP) chip for processing subset of network functions. Chip supports external accesses to MM without additional latency from on-chip CP. On-chip memory scheduler resolves all bank conflicts and configurably load balances MM accesses. Instruction set and data on which the CP executes instructions are all disposed on-chip with no on-chip cache memory, thereby avoiding latency and coherency issues. Multiple independent and orthogonal threading domains used: a FIFO-based scheduling domain (SD) for the I/O; a multi-threaded processing domain for the CP. The CP is an array of independent, autonomous, unsequenced processing engines that process on-chip data tracked by SD of external CMD and reordered per FIFO CMD sequence before transmission.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: March 20, 2018
    Assignee: MoSys, Inc.
    Inventors: Michael J Miller, Jay B Patel, Michael J Morrison
  • Patent number: 9916163
    Abstract: A system for synchronizing parallel processing of a plurality of functional processing units (FPUs) includes: a first FPU and a first program counter to control timing of a first stream of program instructions issued to the first FPU by advancement of the first program counter; and a second FPU and a second program counter to control timing of a second stream of program instructions issued to the second FPU by advancement of the second program counter. The first FPU is in communication with the second FPU to synchronize the issuance of the first stream of program instructions to the second stream of program instructions, and the second FPU is in communication with the first FPU to synchronize the issuance of the second stream of program instructions to the first stream of program instructions.
    Type: Grant
    Filed: January 9, 2017
    Date of Patent: March 13, 2018
    Assignee: International Business Machines Corporation
    Inventor: Changhoan Kim
  • Patent number: 9851969
    Abstract: A selected installed function of a multi-function instruction is hidden such that even though a processor is capable of performing the hidden installed function, the availability of the hidden function is hidden such that responsive to the multi-function instruction querying the availability of functions, only functions not hidden are reported as installed.
    Type: Grant
    Filed: June 24, 2010
    Date of Patent: December 26, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Damian Leo Osisek, Timothy J. Slegel
  • Patent number: 9747038
    Abstract: Systems and methods are disclosed for a hybrid parallel-serial memory access by a system on chip (SoC). The SoC is electrically coupled to the memory by both a parallel access channel and a separate serial access channel. A request for access to the memory is received. In response to receiving the request to access the memory, a type of memory access is identified. A determination is then made whether to access the memory with the serial access channel. In response to the determination to access the memory with the serial access channel, a first portion of the memory is accessed with the parallel access channel, and a second portion of the memory is accessed with the serial access channel.
    Type: Grant
    Filed: December 2, 2015
    Date of Patent: August 29, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Javid Jaffari, Amin Ansari, Rodolfo Beraha
  • Patent number: 9626624
    Abstract: An inference task is performed using a computation device having a plurality of processing elements operable in parallel and connected via a connectivity system. Performing the task includes accepting at the device a specification of at least part of the inference task. The specification characterizes a plurality of variables and a plurality of factors, each factor being associated with a subset of the variables. Each of the processing elements is configured with data defining one or more of the plurality of factors. At each of the processing elements, computation associated with one of the factors is performed concurrently with other of the processing elements performing computation associated with different ones of the factors. Messages are exchanged via a connectivity system. The messages provide inputs and/or outputs to the processing elements for the computations associated with the factors and provide a result of performing of the at least the part of the inference task.
    Type: Grant
    Filed: July 20, 2011
    Date of Patent: April 18, 2017
    Assignee: ANALOG DEVICES, INC.
    Inventors: Jeffrey Bernstein, Benjamin Vigoda
  • Patent number: 9626191
    Abstract: One embodiment of the present invention sets forth a technique for performing a shaped access of a register file that includes a set of N registers, wherein N is greater than or equal to two. The technique involves, for at least one thread included in a group of threads, receiving a request to access a first amount of data from each register in the set of N registers, and configuring a crossbar to allow the at least one thread to access the first amount of data from each register in the set of N registers.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: April 18, 2017
    Assignee: NVIDIA Corporation
    Inventors: Jack Hilaire Choquette, Michael Fetterman, Shirish Gadre, Xiaogang Qiu, Omkar Paranjape, Anjana Rajendran, Stewart Glenn Carlton, Eric Lyell Hill, Rajeshwaran Selvanesan, Douglas J. Hahn
  • Patent number: 9600281
    Abstract: Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand.
    Type: Grant
    Filed: July 12, 2010
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, Valentina Salapura
  • Patent number: 9563585
    Abstract: Embodiments are provided for isolating Input/Output (I/O) execution by combining compiler and Operating System (OS) techniques. The embodiments include dedicating selected cores, in multicore or many-core processors, as I/O execution cores, and applying compiler-based analysis to classify I/O regions of program source codes so that the OS can schedule such regions onto the designated I/O cores. During the compilation of a program source code, each I/O operation region of the program source code is identified. During the execution of the compiled program source code, each I/O operation region is scheduled for execution on a preselected I/O core. The other regions of the compiled program source code are scheduled for execution on other cores.
    Type: Grant
    Filed: February 19, 2014
    Date of Patent: February 7, 2017
    Assignee: FUTUREWEI TECHNOLOGIES, INC.
    Inventors: Chen Tian, Handong Ye, Ziang Hu
  • Patent number: 9477474
    Abstract: Instructions grouped into instruction groups are optimized across group boundaries. Instruction sequences spanning multiple groups are optimized by retaining information relating to an instruction at the end of one instruction group to be co-optimized with an instruction at the beginning of a subsequent instruction group. This retained information is then used in optimization of one or more instructions of the subsequent group. Optimization may be performed across n group boundaries, where n is equal to two or greater. Additionally, optimization of instructions within a group may be performed, in addition to the optimizations across group boundaries.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: October 25, 2016
    Assignee: GLOBALFOUNDRIES Inc.
    Inventor: Michael K. Gschwind
  • Patent number: 9411983
    Abstract: In an embodiment of the present invention, a processor includes content storage logic to parse digital content into portions and to cause each portion to be stored into a corresponding page of a memory. The processor also includes protection logic to receive a write instruction having a destination address within the memory, and if the destination address is associated with a memory location that stores a portion of the digital content, erase the page associated with the memory location. If the destination address is associated with another memory location that does not store any of the digital content, the protection logic is to permit execution of the write instruction. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: August 9, 2016
    Assignee: Intel Corporation
    Inventors: Jayant Mangalampalli, Rajesh P. Banginwar
  • Patent number: 9344115
    Abstract: A method of compressing configuration data used in a reconfigurable processor including generating one piece of combined data by combining configuration data used at two or more cycles and generating a bit table indicating valid operations at each of the two or more cycles among operations included in the combined data.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: May 17, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Young-chul Cho, Do-hyung Kim, Suk-jin Kim, Si-hwa Lee
  • Patent number: 9304934
    Abstract: Register files for use in an out-of-order processor that have been divided into a plurality of sub-register files. The register files also have a plurality of buffers which are each associated with one of the sub-register files. Each buffer receives and stores write operations destined for the associated sub-register file which can be later issued to the sub-register file. Specifically, each clock cycle it is determined whether there is at least one write operation in the buffer that has not been issued to the associated sub-register file. If there is at least one write operation in the buffer that has not been issued to the associated sub-register file, one of the non-issued write operations is issued to the associated sub-register file.
    Type: Grant
    Filed: January 17, 2014
    Date of Patent: April 5, 2016
    Assignee: Imagination Technologies Limited
    Inventor: Hugh Jackson
  • Patent number: 9286074
    Abstract: An instruction compressing apparatus and method for a parallel processing computer such as a very long instruction word (VLIW) computer, are provided. The instruction compressing apparatus includes a bundle code generating unit, an instruction compressing unit, and an instruction converting unit. The bundle code generating unit may generate a bundle code in response to an input of instructions to be compressed. The bundle code may indicate whether a current instruction group is terminated, and also whether an instruction group following the current instruction group is a no-operation (NOP) instruction group. The instruction compressing unit may remove a NOP instruction and/or a NOP instruction group from the input instructions according to the generated bundle code. The instruction converting unit may include the generated bundle code in the remaining instructions which have not been removed by the instruction compressing unit.
    Type: Grant
    Filed: October 26, 2010
    Date of Patent: March 15, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Tai-Song Jin, Dong-Hoon Yoo, Bernhard Egger, Won-Sub Kim, Jin-Seok Lee, Sun-Hwa Kim, Hee-Jin Ahn
  • Patent number: 9256438
    Abstract: A computer processor pipeline has both an architectural register file and a working register file. The lifetime of an entry in the working register file is determined by a predetermined number of instructions passing through a specified stage in the pipeline after the location in the working register file is allocated for an instruction. The size of the working register file is selected based upon performance characteristics. A working register file credit indicator is coupled to the front end pipeline portion and to the back end pipeline portion. The working register file credit indicator is monitored to prevent a working register file overflow. When a location in the architectural register file is read early, the location is monitored to determine whether the location is written to prior to issuance of the instruction associated with the early read.
    Type: Grant
    Filed: January 15, 2009
    Date of Patent: February 9, 2016
    Assignee: ORACLE AMERICA, INC.
    Inventors: Shailender Chaudhry, Paul Caprioli, Marc Tremblay
  • Patent number: 9235418
    Abstract: A processor device includes a memory and a sequencer that is responsive to the memory. The sequencer supports very long instruction word (VLIW) type instructions and at least one VLIW instruction packet uses a number of operands during execution. The processor device further includes a plurality of instruction execution units responsive to the sequencer and a plurality of register files. Each of the plurality of register files includes a plurality of registers and the plurality of register files are coupled to the plurality of instruction execution units. Further, each of the plurality of register files includes a number of data read ports and the number of data read ports of each of the plurality of register files is less than the number of operands used by the at least one VLIW instruction packet.
    Type: Grant
    Filed: February 25, 2014
    Date of Patent: January 12, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Muhammad Ahmed, Erich James Plondke, Lucian Codrescu, William C. Anderson
  • Patent number: 9165165
    Abstract: A method for protecting a volatile memory against a virus, wherein: rights of writing, reading, or execution are assigned to certain areas of the memory; and a first list of opcodes authorized or forbidden as a content of the areas is associated with each of these areas.
    Type: Grant
    Filed: April 27, 2012
    Date of Patent: October 20, 2015
    Assignee: STMicroelectronics (Rousset) SAS
    Inventor: Yannick Teglia
  • Patent number: 9021236
    Abstract: Techniques are described for decoupling fetching of an instruction stored in a main program memory from earliest execution of the instruction. An indirect execution method and program instructions to support such execution are addressed. In addition, an improved indirect deferred execution processor (DXP) VLIW architecture is described which supports a scalable array of memory centric processor elements that do not require local load and store units.
    Type: Grant
    Filed: January 9, 2014
    Date of Patent: April 28, 2015
    Assignee: Altera Corporation
    Inventors: Gerald G. Pechanek, Stamatis Vassiliadis
  • Patent number: 9009365
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: April 14, 2015
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo Rodriguez, Marco Jacobs
  • Patent number: 9007382
    Abstract: A system and method of rendering three-dimensional (3D) graphics. The system for rendering 3D graphics may include a plurality of cores including a scratch pad memory, a first memory to perform a control flow, a second memory for loop acceleration, and a shared memory to interpolate with the plurality of cores.
    Type: Grant
    Filed: July 20, 2009
    Date of Patent: April 14, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kyoung June Min, Chan Min Park, Won Jong Lee, Dong-Hoon Yoo
  • Patent number: 9009506
    Abstract: Embodiments of a processing architecture are described. The architecture includes a fetch unit for fetching instructions from a data bus. A scheduler receives data from the fetch unit, creates a schedule, and allocates the data and schedule to a plurality of computational units. The scheduler also modifies voltage and frequency settings of the processing architecture to optimize power consumption and throughput of the system. The computational units include control units and execute units. The control units receive and decode the instructions and send the decoded instructions to execute units. The execute units then execute the instructions according to relevant software.
    Type: Grant
    Filed: March 1, 2012
    Date of Patent: April 14, 2015
    Assignee: NXP B.V.
    Inventors: Hamed Fatemi, Ajay Kapoor, Jose Pineda de Gyvez
  • Patent number: 8990543
    Abstract: In a particular embodiment, a method is disclosed that includes receiving an instruction packet including a first instruction and a second instruction that is dependent on the first instruction at a processor having a plurality of parallel execution pipelines, including a first execution pipeline and a second execution pipeline. The method further includes executing in parallel at least a portion of the first instruction and at least a portion of the second instruction. The method also includes selectively committing a second result of executing the at least a portion of the second instruction with the second execution pipeline based on a first result related to execution of the first instruction with the first execution pipeline.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: March 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Lucian Codrescu, Robert Allan Lester, Charles Joseph Tabony, Erich James Plondke, Mao Zeng, Suresh Venkumahanti, Ajay Anant Ingle
  • Patent number: 8954941
    Abstract: Method of generating respective instruction compaction schemes for subsets of instructions to be processed by a programmable processor, comprising the steps of a) receiving at least one input code sample representative for software to be executed on the programmable processor, the input code comprising a plurality of instructions defining a first set of instructions (S1), b) initializing a set of removed instructions as empty (S3), c) determining the most compact representation of the first set of instructions (S4) d) comparing the size of said most compact representation with a threshold value (S5), e) carrying out steps e1 to e3 if the size is larger than said threshold value, e1) determining which instruction of the first set of instructions has a highest coding cost (S6), e2) removing said instruction having the highest coding cost from the first set of instructions and (S7), e3) adding said instruction to the set of removed instructions (S8), f) repeating steps b-f, wherein the first set of instructions
    Type: Grant
    Filed: September 3, 2010
    Date of Patent: February 10, 2015
    Assignee: Intel Corporation
    Inventors: Hendrik Tjeerd Joannes Zwartenkot, Alexander Augusteijn, Yuanging Guo, Jürgen Von Oerthel, Jeroen Anton Johan Leijten, Erwan Yann Maurice Le Thenaff
  • Patent number: 8954714
    Abstract: An apparatus includes a processor. The processor includes two memories. The first memory stores one set of instructions. The second memory stores another set of instructions that are longer than the set of instructions in the first memory. An instruction in the set of instructions in the first memory is used as a pointer to a corresponding instruction in the set of instructions in the second memory.
    Type: Grant
    Filed: February 1, 2010
    Date of Patent: February 10, 2015
    Assignee: Altera Corporation
    Inventor: Steven Perry
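    The pointer scheme in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the table contents, the names `WIDE_STORE`, `PROGRAM`, and `expand`, and the use of list indices as pointers are all assumptions made for illustration.

```python
# Hypothetical sketch: short instructions act as pointers into a store of
# longer instructions. All names and table contents are assumptions.

# "Second memory": the longer instructions, modeled here as tuples.
WIDE_STORE = [
    ("add", "r1", "r2", "r3"),
    ("mul", "r4", "r1", "r1"),
    ("store", "r4", "0x10"),
]

# "First memory": short instructions, each one just an index (pointer)
# into WIDE_STORE.
PROGRAM = [0, 1, 0, 2]


def expand(program, wide_store):
    """Resolve each short instruction to the long instruction it points to."""
    return [wide_store[pointer] for pointer in program]


expanded = expand(PROGRAM, WIDE_STORE)
```

    In this sketch a long instruction that is reused (index 0) is stored once and referenced twice, which is the kind of saving a pointer scheme can offer when long instructions repeat.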
  • Publication number: 20150039856
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Application
    Filed: August 11, 2014
    Publication date: February 5, 2015
    Applicant: Altera Corporation
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
  • Patent number: 8898433
    Abstract: An apparatus having a buffer and a circuit is disclosed. The buffer may be configured to store a plurality of fetch sets. Each fetch set generally includes a prefix word and a plurality of instruction words. Each prefix word may include a plurality of symbols. Each symbol generally corresponds to a respective one of the instruction words. The circuit may be configured to (i) identify each of the symbols in each of the fetch sets having a predetermined value and (ii) parse the fetch sets into a plurality of execution sets in response to the symbols having the predetermined value.
    Type: Grant
    Filed: April 26, 2012
    Date of Patent: November 25, 2014
    Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.
    Inventors: Alexander Rabinovitch, Leonid Dubrovin
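    As a rough illustration of the parsing step in the abstract above, the sketch below splits fetch sets into execution sets. It rests on assumptions the abstract does not state: that the predetermined symbol value (`END` here) marks the last instruction word of an execution set, and that an execution set may span two fetch sets. The function name and data layout are hypothetical.

```python
# Hypothetical sketch of prefix-driven parsing. Assumption: a symbol equal
# to END marks the final instruction word of an execution set.
END = 1


def parse_fetch_sets(fetch_sets):
    """Split (prefix_symbols, instruction_words) pairs into execution sets."""
    execution_sets = []
    current = []
    for prefix, words in fetch_sets:
        if len(prefix) != len(words):
            raise ValueError("each instruction word needs exactly one symbol")
        for symbol, word in zip(prefix, words):
            current.append(word)
            if symbol == END:
                # The predetermined value closes the current execution set.
                execution_sets.append(current)
                current = []
    if current:
        # Trailing words with no END symbol are kept as a final set.
        execution_sets.append(current)
    return execution_sets


fetch_sets = [
    ([0, 1, 0, 0], ["a", "b", "c", "d"]),
    ([1, 1, 0, 1], ["e", "f", "g", "h"]),
]
result = parse_fetch_sets(fetch_sets)
```

    With this input the sketch yields four execution sets, one of which (`c`, `d`, `e`) crosses the boundary between the two fetch sets.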
  • Patent number: 8850557
    Abstract: Disclosed are a processor and processing method that provide non-hierarchical computer security enhancements for context states. The processor can comprise a context control unit that uses context identifier tags associated with corresponding contexts to control access by the contexts to context information (i.e., context states) contained in the processor's non-stackable and/or stackable registers. For example, in response to an access request, the context control unit can grant a specific context access to a register only when that register is tagged with a specific context identifier tag. If the register is tagged with another context identifier tag, the contents of the specific register are saved in a context save area of memory and the previous context states of the specific context are restored to the specific register before access can be granted.
    Type: Grant
    Filed: February 29, 2012
    Date of Patent: September 30, 2014
    Assignee: International Business Machines Corporation
    Inventors: Richard H. Boivie, William E. Hall, Guerney D. H. Hunt, Suzanne K. McIntosh, Mark F. Mergen, Marcel C. Rosu, David R. Safford, David C. Toll, Carl Lynn C. Karger
  • Patent number: 8850170
    Abstract: An apparatus and method for dynamically determining the execution mode of a reconfigurable array are provided. Performance information of a loop may be obtained before and/or during the execution of the loop. The performance information may be used to determine whether to operate the apparatus in a very long instruction word (VLIW) mode or in a coarse grained array (CGA) mode.
    Type: Grant
    Filed: August 25, 2011
    Date of Patent: September 30, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Bernhard Egger, Dong-Hoon Yoo, Tai-Song Jin, Won-Sub Kim, Min-Wook Ahn, Jin-Seok Lee, Hee-Jin Ahn
  • Publication number: 20140215183
    Abstract: A system and method for efficiently processing instructions in hardware parallel execution lanes within a processor. In response to a given divergent point within an identified loop, a compiler arranges instructions within the identified loop into very large instruction words (VLIW's). At least one VLIW includes instructions intermingled from different basic blocks between the given divergence point and a corresponding convergence point. The compiler generates code that, when executed, assigns at runtime instructions within a given VLIW to multiple parallel execution lanes within a target processor. The target processor includes a single instruction multiple data (SIMD) micro-architecture. The assignment for a given lane is based on branch direction found at runtime for the given lane at the given divergent point. The target processor includes a vector register for storing indications indicating which given instruction within a fetched VLIW for an associated lane to execute.
    Type: Application
    Filed: January 29, 2013
    Publication date: July 31, 2014
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventor: Reza Yazdani
  • Patent number: 8775777
    Abstract: Sourcing immediate values from a very long instruction word includes determining if a VLIW sub-instruction expansion condition exists. If the sub-instruction expansion condition exists, operation of a portion of a first arithmetic logic unit component is minimized. In addition, a part of a second arithmetic logic unit component is expanded by utilizing a block of a very long instruction word, which is normally utilized by the first arithmetic logic unit component, for the second arithmetic logic unit component if the sub-instruction expansion condition exists.
    Type: Grant
    Filed: August 15, 2007
    Date of Patent: July 8, 2014
    Assignee: NVIDIA Corporation
    Inventors: Tyson J. Bergland, Craig M. Okruhlica, Michael J. M. Toksvig, Justin M. Mahan, Edward A. Hutchins
  • Patent number: 8775147
    Abstract: An algorithm and architecture are disclosed for performing multi-argument associative operations. The algorithm and architecture can be used to schedule operations on multiple facilities for computations or can be used in the development of a model in a modeling environment. The algorithm and architecture resulting from the algorithm use the latency of the components that are used to process the associative operations. The algorithm minimizes the number of components necessary to produce an output of multi-argument associative operations and also can minimize the number of inputs each component receives.
    Type: Grant
    Filed: May 31, 2006
    Date of Patent: July 8, 2014
    Assignee: The MathWorks, Inc.
    Inventors: Alireza Pakyari, Brian K. Ogilvie
  • Patent number: 8769245
    Abstract: A very long instruction word (VLIW) processor and an apparatus with power management and a method of power management therefor are provided, consistent with the exemplary embodiments of the disclosure. The power management method includes the following steps. Valid instruction(s) and no operation (NOP) instruction(s) of an input instruction package are rearranged to output a transcoded instruction package, wherein the transcoded instruction package by the rearrangement has its NOP instruction(s) corresponding to at least one execution unit, which is to be placed in power reduction state, of a VLIW processor. Power reduction control is selectively performed on at least one execution unit corresponding to at least one NOP instruction of the transcoded instruction package according to the transcoded instruction package.
    Type: Grant
    Filed: May 20, 2011
    Date of Patent: July 1, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Hsien-Ching Hsieh, Po-Han Huang, Shing-Wu Tung
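    The rearrangement described in the abstract above can be sketched as follows. This is a simplified illustration, not the patented method: it assumes every execution unit can run every instruction (so instructions may move freely between slots), that instructions in one package are independent, and that the units to be powered down are the highest-numbered ones. The names `transcode` and `gated_units` are hypothetical.

```python
# Hypothetical sketch: pack valid instructions into the low-numbered slots so
# that the NOPs line up with the units chosen for power reduction.
NOP = "nop"


def transcode(package):
    """Return a rearranged package with all NOPs moved to the trailing slots."""
    valid = [op for op in package if op != NOP]
    return valid + [NOP] * (len(package) - len(valid))


def gated_units(transcoded):
    """Indices of execution units whose slot holds a NOP (candidates for gating)."""
    return [unit for unit, op in enumerate(transcoded) if op == NOP]


package = ["add", NOP, "mul", NOP]
transcoded = transcode(package)
units_to_gate = gated_units(transcoded)
```

    Grouping the NOPs on the same units across consecutive packages is what would let those units stay in a power reduction state for longer stretches, rather than toggling from package to package.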
  • Publication number: 20140181468
    Abstract: A processor device is disclosed and includes a memory and a sequencer that is responsive to the memory. The sequencer supports very long instruction word (VLIW) type instructions and at least one VLIW instruction packet uses a number of operands during execution. The processor device further includes a plurality of instruction execution units responsive to the sequencer and a plurality of register files. Each of the plurality of register files includes a plurality of registers and the plurality of register files are coupled to the plurality of instruction execution units. Further, each of the plurality of register files includes a number of data read ports and the number of data read ports of each of the plurality of register files is less than the number of operands used by the at least one VLIW instruction packet.
    Type: Application
    Filed: February 25, 2014
    Publication date: June 26, 2014
    Applicant: QUALCOMM Incorporated
    Inventors: Muhammad Ahmed, Erich James Plondke, Lucian Codrescu, William C. Anderson
  • Patent number: 8745359
    Abstract: A VLIW processor executes a very long instruction word containing a plurality of instructions, and executes a plurality of instruction streams at low cost. A processor executing a very long instruction word containing a plurality of instructions fetches concurrently the very long instruction words of up to M instruction streams, from N instruction caches including a plurality of memory banks to store the very long instruction words of the M instruction streams.
    Type: Grant
    Filed: February 3, 2009
    Date of Patent: June 3, 2014
    Assignee: NEC Corporation
    Inventor: Shohei Nomoto