Long Instruction Word Patents (Class 712/24)
-
Patent number: 11868804
Abstract: A processor comprises a computational array of computational elements and an instruction dispatch circuit. The computational elements receive data operands via data lanes extending along a first dimension, and process the operands based upon instructions received from the instruction dispatch circuit via instruction lanes extending along a second dimension. The instruction dispatch circuit receives raw instructions, and comprises an instruction dispatch unit (IDU) processor that processes a set of raw instructions to generate processed instructions for dispatch to the computational elements, where the number of processed instructions is not equal to the number of instructions of the set of raw instructions.
Type: Grant
Filed: November 18, 2020
Date of Patent: January 9, 2024
Assignee: Groq, Inc.
Inventors: Brian Lee Kurtz, Dinesh Maheshwari, James David Sprach
-
Patent number: 11366684
Abstract: In one approach, an import mechanism allows new hardware intrinsics to be utilized by writing or updating a library of source code, rather than specifically modifying the virtual machine for each new intrinsic. Thus, once the architecture is in place to allow the import mechanism to function, the virtual machine itself (e.g. the code which implements the virtual machine) no longer needs to be modified in order to allow new intrinsics to be utilized by end user programmers. Since source code is typically more convenient to write than the language used to implement the virtual machine and the risk of miscoding the virtual machine is minimized when introducing new intrinsics, the import mechanism described herein increases the efficiency at which new hardware intrinsics can be introduced.
Type: Grant
Filed: March 24, 2020
Date of Patent: June 21, 2022
Assignee: Oracle International Corporation
Inventors: John Robert Rose, Vladimir Ivanov
-
Patent number: 11354405
Abstract: First and second neighboring bit sequences containing machine language are determined to be (latently) separable. Such determination may be partly based on suitability for separation and partly based on environmental readiness, for example. If separability is determined, any of several response protocols may ensue. For example, one or both of the bit sequences may be moved, modified, or trapped as part of a moving target defense.
Type: Grant
Filed: July 6, 2021
Date of Patent: June 7, 2022
Assignee: Polyverse Corporation
Inventor: Mariusz G. Borsa
-
Patent number: 11327757
Abstract: In at least one embodiment, a processor includes architected and non-architected register files for buffering operands. The processor additionally includes an instruction fetch unit that fetches instructions to be executed and at least one execution unit. The at least one execution unit is configured to execute a first class of instructions that access operands in the architected register file and a second class of instructions that access operands in the non-architected register file. The processor also includes a mapper circuit that assigns physical registers to the instructions for buffering of operands. The processor additionally includes a dispatch circuit configured, based on detection of an instruction in one of the first and second classes of instructions for which correct operands do not reside in a respective one of the architected and non-architected register files, to automatically initiate transfer of operands between the architected and non-architected register files.
Type: Grant
Filed: December 14, 2020
Date of Patent: May 10, 2022
Assignee: International Business Machines Corporation
Inventors: Steven J. Battle, Kurt A. Feiste, Susan E. Eisen, Dung Q. Nguyen, Christian Gerhard Zoellin, Kent Li, Brian W. Thompto, Dhivya Jeganathan, Kenneth L. Ward, Brian D. Barrick
-
Patent number: 11307855
Abstract: Instructions have an opcode and at least one data operand, the opcode identifying a data processing operation to perform on the at least one data operand. For a register-provided-opcode instruction specifying at least one source register, at least part of the opcode is a register-provided opcode represented by a first portion of data stored in said at least one source register of the register-provided-opcode instruction, and the at least one data operand comprises data represented by a second portion of the data stored in the at least one source register. The register-provided opcode is used to select between different data processing operations supported for the same instruction encoding of the register-provided-opcode instruction.
Type: Grant
Filed: November 12, 2020
Date of Patent: April 19, 2022
Assignee: Arm Limited
Inventors: John Michael Horley, Simon John Craske
-
Patent number: 11210402
Abstract: A method includes receiving a processor design of a processor, receiving an application to be executed by the processor, and receiving a security policy. The method includes simulating the execution of the application on the processor to identify information flow violations generated by the application based on the security policy.
Type: Grant
Filed: October 2, 2018
Date of Patent: December 28, 2021
Assignees: Regents of the University of Minnesota, The Board of Trustees of the University of Illinois
Inventors: Hari Cherupalli, Rakesh Kumar, John Sartori, Henry Duwe
-
Patent number: 11188326
Abstract: Selected installed function of a multi-function instruction is hidden such that even though a processor is capable of performing the hidden installed function, the availability of the hidden function is hidden such that responsive to the multi-function instruction querying the availability of functions, only functions not hidden are reported as installed.
Type: Grant
Filed: March 18, 2020
Date of Patent: November 30, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dan F. Greiner, Damian L. Osisek, Timothy J. Slegel
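As a rough illustration of the query behavior this abstract describes, the sketch below models installed and hidden functions as bitmasks. The names query_functions, installed, and hidden, and the bitmask encoding itself, are assumptions made for illustration; the abstract does not specify any encoding.

```python
def query_functions(installed: int, hidden: int) -> int:
    """Return the bitmask of functions reported as installed.

    Bit i set in `installed` means function i is physically present;
    bit i set in `hidden` means its availability must not be reported.
    (Hypothetical encoding, not taken from the patent.)
    """
    return installed & ~hidden


# Functions 0, 1 and 3 are installed; function 1 is hidden.
reported = query_functions(installed=0b1011, hidden=0b0010)
print(bin(reported))  # prints 0b1001
```

In this model the hidden function still exists in the installed mask; only the reported view changes, which mirrors the distinction the abstract draws between capability and reported availability.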
-
Patent number: 10860321
Abstract: An electronic device including a memory; and a processor configured to generate an instruction code based on a same opcode when the same opcode is used in one or more slots defined in the memory upon application compiling.
Type: Grant
Filed: November 30, 2018
Date of Patent: December 8, 2020
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hwee Soo Kim, Hyuk Min Kwon, Won Jin Kim
-
Patent number: 10838728
Abstract: Supplemental instruction dispatch may be used in some instances in a parallel slice processor to dispatch additional instructions, referred to as supplemental instructions, to supplemental instruction ports of execution slices and using primary instruction ports of one or more execution slices to supply one or more source operands for such supplemental instructions. In addition, in some instances, in lieu of or in addition to supplemental instruction dispatch, selective slice partitioning may be used to selectively partition groups of execution slices in a parallel slice processor based upon a threading mode within which such execution slices are executing.
Type: Grant
Filed: September 24, 2018
Date of Patent: November 17, 2020
Assignee: International Business Machines Corporation
Inventors: Kurt A. Feiste, Christopher M. Mueller, Dung Q. Nguyen, Eula A. Tolentino, Tien T. Tran, Jing Zhang
-
Patent number: 10678724
Abstract: Systems, methods, and apparatuses relating to in-network storage for a configurable spatial accelerator are described.
Type: Grant
Filed: December 29, 2018
Date of Patent: June 9, 2020
Assignee: Intel Corporation
Inventors: Kermin ChoFleming, Simon Steely, Jr., Kent Glossop
-
Patent number: 10664269
Abstract: Selected installed function of a multi-function instruction is hidden such that even though a processor is capable of performing the hidden installed function, the availability of the hidden function is hidden such that responsive to the multi-function instruction querying the availability of functions, only functions not hidden are reported as installed.
Type: Grant
Filed: December 8, 2017
Date of Patent: May 26, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dan F. Greiner, Damian L. Osisek, Timothy J. Slegel
-
Patent number: 10628143
Abstract: Provided is a program development assist system, a program development assist method, and a non-transitory computer readable recording medium storing a program development assist program. The program development assist system includes: a shared variable extraction part that extracts, from the first source code that is described in the first programming language, shared variables that are variables shared by the first source code and the second source code that is described in the second programming language in a memory; and a display control part that causes a development screen of the second source code to display information indicating shared variables that are extracted by the shared variable extraction part.
Type: Grant
Filed: February 14, 2019
Date of Patent: April 21, 2020
Assignee: OMRON Corporation
Inventors: Yoshimi Niwa, Taku Oya, Kei Yasuda
-
Patent number: 10540183
Abstract: As disclosed herein a method, executed by a processor, for accelerated instruction execution includes retrieving an execute instruction including a register reference and a reference to a target instruction, retrieving the target instruction, decoding the execute instruction using an instruction pipeline, decoding the target instruction using the instruction pipeline, associating the register reference to the target instruction, and executing the target instruction using the register reference as a source operand modifier. The instruction pipeline is configured such that it allows the target instruction to continue processing without waiting for the register reference to be resolved. The contents of the referenced register may be retrieved in a later stage of the instruction pipeline, and the target instruction may be modified and executed. An apparatus corresponding to the described method is also disclosed herein.
Type: Grant
Filed: October 31, 2017
Date of Patent: January 21, 2020
Assignee: International Business Machines Corporation
Inventors: Khary J. Alexander, Fadi Y. Busaba, Brian W. Curran, David S. Hutton, Edward T. Malley, Brian R. Prasky, John G. Rell, Jr.
-
Patent number: 10523764
Abstract: A synchronous packet-processing pipeline whose data paths are populated with data-plane stateful processing units (DSPUs) is provided. A DSPU is a programmable processor whose operations are synchronous with the dataflow of the packet-processing pipeline. A DSPU performs every computation with fixed latency. Each DSPU is capable of maintaining a set of states and performing its computations based on its maintained set of states. The programming of a DSPU determines how and when the DSPU updates one of its maintained states. Such programming may configure the DSPU to update the state based on its received packet data, or to change the state regardless of the received packet data.
Type: Grant
Filed: September 24, 2015
Date of Patent: December 31, 2019
Assignee: Barefoot Networks, Inc.
Inventors: Anirudh Sivaraman Kaushalram, Mihai Budiu, Changhoon Kim
-
Patent number: 10514928
Abstract: A data processing apparatus has control circuitry for detecting whether a first micro-operation to be processed by a first processing lane would give the same result as a second micro-operation processed by a second processing lane. If they would give the same result, then the first micro-operation is prevented from being processed by the first processing lane and the result of the second micro-operation is output as the result of the first micro-operation. This avoids duplication of processing, to save energy for example.
Type: Grant
Filed: March 20, 2015
Date of Patent: December 24, 2019
Assignee: ARM Limited
Inventors: Isidoros Sideris, Daren Croxford, Andrew Burdass
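The idea in this abstract can be loosely modeled in software as result reuse across lanes. The sketch below is only an analogy, assuming that "would give the same result" is approximated by identical opcode and operands; the names OPS and run_lanes are hypothetical, and the patent describes hardware control circuitry rather than a dictionary lookup.

```python
import operator

# Hypothetical micro-operation table for the sketch.
OPS = {"add": operator.add, "mul": operator.mul}


def run_lanes(micro_ops):
    """Process one micro-op per lane, reusing results of identical micro-ops.

    Returns (results, executed), where `executed` counts the micro-ops
    that were actually computed rather than copied from another lane.
    """
    seen = {}      # (opcode, a, b) -> result already produced by some lane
    results = []
    executed = 0
    for opcode, a, b in micro_ops:
        key = (opcode, a, b)
        if key not in seen:
            # No lane has produced this result yet: compute it once.
            seen[key] = OPS[opcode](a, b)
            executed += 1
        # Otherwise the lane is skipped and the earlier result is forwarded.
        results.append(seen[key])
    return results, executed
```

For example, four lanes carrying ("add", 1, 2), ("mul", 3, 4), ("add", 1, 2), ("add", 1, 2) need only two actual computations in this model, which is the kind of saving the abstract attributes to avoiding duplicated processing.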
-
Patent number: 10481892
Abstract: A system for updating a multiple domain embedded system may include a processor that can identify a device associated with the embedded system and a driver that supports the device. The processor can determine a domain associated with the driver and a first configuration label of a first configuration of the multiple domain embedded system. The processor can also determine a second configuration label of a second configuration of the multiple domain embedded system, based on the first configuration label, an identification of the driver, and an identification of the device. Further, the processor can update the driver based on the second configuration label.
Type: Grant
Filed: April 2, 2013
Date of Patent: November 19, 2019
Assignee: HARMAN BECKER AUTOMOTIVE SYSTEMS GMBH
Inventors: Markus Broghammer, Dirk Fries
-
Patent number: 10355737
Abstract: A touch screen controller (TSC) includes: a front end circuit configured to send a control signal to a touch panel and to receive a touch signal from the touch panel; an algorithm processing circuit configured to process source data generated based on the touch signal according to a predetermined algorithm; a memory configured to store the source data and result data obtained as a result of processing the source data at the algorithm processing circuit; and a bus configured to transfer data among the front end circuit, the algorithm processing circuit, and the memory. The algorithm processing circuit includes: a buffer configured to temporarily store the source data or the result data and shared by at least two circuits; and a special function register (SFR) configured to store a setting value necessary for an operation of the algorithm processing circuit.
Type: Grant
Filed: November 5, 2018
Date of Patent: July 16, 2019
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hyung-Dal Kwon, Gyeong Min Ha, Ho-Suk Na
-
Patent number: 10353681
Abstract: A method for improving performance of an access triggered architecture for a computer implemented application is provided. The method first executes typical operations of the access triggered architecture according to an execution time, wherein the typical operations comprise: obtaining a dataset and an instruction set; and using the instruction set to transmit the dataset to a functional block associated with an operation, wherein the functional block performs the operation using the dataset to generate a revised dataset. The method further creates a pipeline of the typical operations to reduce the execution time of the typical operations, to create a reduced execution time; and executes the typical operations according to the reduced execution time, using the pipeline.
Type: Grant
Filed: August 28, 2017
Date of Patent: July 16, 2019
Assignee: HONEYWELL INTERNATIONAL INC.
Inventors: Thom Kreider, Jon Douglas Gilreath, Gary Warnica, Paul D. Kammann, Vince J. Gavagan, IV, Ronald E. Strong
-
Patent number: 10289416
Abstract: Embodiments of systems, apparatuses, and methods for lane-based strided gather are disclosed. In an embodiment, an apparatus includes a decoder to decode an instruction, wherein the instruction to include fields for indices of addresses to memory, and a packed data destination register operand; and execution circuitry to execute the decoded instruction to extract data elements of a defined number of types from memory using the indices of the instruction, and for each type, store the extracted data elements in one or more lanes of a packed data destination register dedicated to that type, wherein relative data elements between types are strided data elements apart.
Type: Grant
Filed: December 30, 2015
Date of Patent: May 14, 2019
Assignee: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall
-
Patent number: 10127043
Abstract: A method and system for implementing very long instruction words (VLIW), the system operable to: receive a first very long instruction word (VLIW) including a set of slot instructions corresponding to a set of functional units, where: each slot instruction includes an opcode identifying an operation to be performed by the set of functional units and value fields related to the operation, where a dedicated subset of the value fields include dedicated bits dedicated to the slot instruction and an allocable subset of the value fields include allocable bits allocable to other slot instructions; identify the opcodes of each slot instruction; determine, based on the opcodes, which allocable bits are allocated to which slot instructions; and instruct each functional unit to perform an operation identified by a corresponding slot instruction using the corresponding dedicated bits and any allocable bits determined to be allocated to the slot instruction.
Type: Grant
Filed: October 19, 2016
Date of Patent: November 13, 2018
Assignee: Rex Computing, Inc.
Inventors: Paul Michael Sebexen, Thomas Rex Sohmers
-
Patent number: 10102001
Abstract: Supplemental instruction dispatch may be used in some instances in a parallel slice processor to dispatch additional instructions, referred to as supplemental instructions, to supplemental instruction ports of execution slices and using primary instruction ports of one or more execution slices to supply one or more source operands for such supplemental instructions. In addition, in some instances, in lieu of or in addition to supplemental instruction dispatch, selective slice partitioning may be used to selectively partition groups of execution slices in a parallel slice processor based upon a threading mode within which such execution slices are executing.
Type: Grant
Filed: March 28, 2018
Date of Patent: October 16, 2018
Assignee: International Business Machines Corporation
Inventors: Kurt A. Feiste, Christopher M. Mueller, Dung Q. Nguyen, Eula A. Tolentino, Tien T. Tran, Jing Zhang
-
Patent number: 9985649
Abstract: A technique for managing data storage applies both inline software compression and inline hardware compression in a data storage system, using both types of compression together. The data storage system applies inline software compression for compressing a first set of newly arriving data and applies inline hardware compression for compressing a second set of newly arriving data. Both sets of data are directed to a data object, and the data storage system compresses both sets of data without first storing uncompressed versions thereof in the data object.
Type: Grant
Filed: June 29, 2016
Date of Patent: May 29, 2018
Assignee: EMC IP Holding Company LLC
Inventors: Ivan Bassov, Wai C. Yim
-
Patent number: 9977678
Abstract: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signal that also serves as a configuration control signal. The mode control signal is also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.
Type: Grant
Filed: January 12, 2015
Date of Patent: May 22, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
-
Patent number: 9971602
Abstract: A method of operating a processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signal that also serves as a configuration control signal. The mode control signal is also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.
Type: Grant
Filed: May 28, 2015
Date of Patent: May 15, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
-
Patent number: 9921755
Abstract: System, method, and apparatus for integrated main memory (MM) and configurable coprocessor (CP) chip for processing subset of network functions. Chip supports external accesses to MM without additional latency from on-chip CP. On-chip memory scheduler resolves all bank conflicts and configurably load balances MM accesses. Instruction set and data on which the CP executes instructions are all disposed on-chip with no on-chip cache memory, thereby avoiding latency and coherency issues. Multiple independent and orthogonal threading domains used: a FIFO-based scheduling domain (SD) for the I/O; a multi-threaded processing domain for the CP. The CP is an array of independent, autonomous, unsequenced processing engines that process on-chip data tracked by SD of external CMD and reordered per FIFO CMD sequence before transmission.
Type: Grant
Filed: September 30, 2015
Date of Patent: March 20, 2018
Assignee: MoSys, Inc.
Inventors: Michael J Miller, Jay B Patel, Michael J Morrison
-
Patent number: 9916163
Abstract: A system for synchronizing parallel processing of a plurality of functional processing units (FPU), a first FPU and a first program counter to control timing of a first stream of program instructions issued to the first FPU by advancement of the first program counter; a second FPU and a second program counter to control timing of a second stream of program instructions issued to the second FPU by advancement of the second program counter, the first FPU is in communication with a second FPU to synchronize the issuance of a first stream of program instructions to the second stream of program instructions and the second FPU is in communication with the first FPU to synchronize the issuance of the second stream program instructions to the first stream of program instructions.
Type: Grant
Filed: January 9, 2017
Date of Patent: March 13, 2018
Assignee: International Business Machines Corporation
Inventor: Changhoan Kim
-
Patent number: 9851969
Abstract: Selected installed function of a multi-function instruction is hidden such that even though a processor is capable of performing the hidden installed function, the availability of the hidden function is hidden such that responsive to the multi-function instruction querying the availability of functions, only functions not hidden are reported as installed.
Type: Grant
Filed: June 24, 2010
Date of Patent: December 26, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dan F. Greiner, Damian Leo Osisek, Timothy J. Slegel
-
Patent number: 9747038
Abstract: Systems and methods are disclosed for a hybrid parallel-serial memory access by a system on chip (SoC). The SoC is electrically coupled to the memory by both a parallel access channel and a separate serial access channel. A request for access to the memory is received. In response to receiving the request to access the memory, a type of memory access is identified. A determination is then made whether to access the memory with the serial access channel. In response to the determination to access the memory with the serial access channel, a first portion of the memory is accessed with the parallel access channel, and a second portion of the memory is accessed with the serial access channel.
Type: Grant
Filed: December 2, 2015
Date of Patent: August 29, 2017
Assignee: QUALCOMM Incorporated
Inventors: Javid Jaffari, Amin Ansari, Rodolfo Beraha
-
Patent number: 9626191
Abstract: One embodiment of the present invention sets forth a technique for performing a shaped access of a register file that includes a set of N registers, wherein N is greater than or equal to two. The technique involves, for at least one thread included in a group of threads, receiving a request to access a first amount of data from each register in the set of N registers, and configuring a crossbar to allow the at least one thread to access the first amount of data from each register in the set of N registers.
Type: Grant
Filed: December 22, 2011
Date of Patent: April 18, 2017
Assignee: NVIDIA Corporation
Inventors: Jack Hilaire Choquette, Michael Fetterman, Shirish Gadre, Xiaogang Qiu, Omkar Paranjape, Anjana Rajendran, Stewart Glenn Carlton, Eric Lyell Hill, Rajeshwaran Selvanesan, Douglas J. Hahn
-
Patent number: 9626624
Abstract: An inference task is performed using a computation device having a plurality of processing elements operable in parallel and connected via a connectivity system. Performing the task includes accepting at the device a specification of at least part of the inference task. The specification characterizes a plurality of variables and a plurality of factors, each factor being associated with a subset of the variables. Each of the processing elements is configured with data defining one or more of the plurality of factors. At each of the processing elements, computation associated with one of the factors is performed concurrently with other of the processing elements performing computation associated with different ones of the factors. Messages are exchanged via a connectivity system. The messages provide inputs and/or outputs to the processing elements for the computations associated with the factors and provide a result of performing of the at least the part of the inference task.
Type: Grant
Filed: July 20, 2011
Date of Patent: April 18, 2017
Assignee: ANALOG DEVICES, INC.
Inventors: Jeffrey Bernstein, Benjamin Vigoda
-
Patent number: 9600281
Abstract: Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand.
Type: Grant
Filed: July 12, 2010
Date of Patent: March 21, 2017
Assignee: International Business Machines Corporation
Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, Valentina Salapura
-
Patent number: 9563585
Abstract: Embodiments are provided for isolating Input/Output (I/O) execution by combining compiler and Operating System (OS) techniques. The embodiments include dedicating selected cores, in multicore or many-core processors, as I/O execution cores, and applying compiler-based analysis to classify I/O regions of program source codes so that the OS can schedule such regions onto the designated I/O cores. During the compilation of a program source code, each I/O operation region of the program source code is identified. During the execution of the compiled program source code, each I/O operation region is scheduled for execution on a preselected I/O core. The other regions of the compiled program source code are scheduled for execution on other cores.
Type: Grant
Filed: February 19, 2014
Date of Patent: February 7, 2017
Assignee: FUTUREWEI TECHNOLOGIES, INC.
Inventors: Chen Tian, Handong Ye, Ziang Hu
-
Patent number: 9477474
Abstract: Instructions grouped into instruction groups are optimized across group boundaries. Instruction sequences spanning multiple groups are optimized by retaining information relating to an instruction at the end of one instruction group to be co-optimized with an instruction at the beginning of a subsequent instruction group. This retained information is then used in optimization of one or more instructions of the subsequent group. Optimization may be performed across n group boundaries, where n is equal to two or greater. Additionally, optimization of instructions within a group may be performed, in addition to the optimizations across group boundaries.
Type: Grant
Filed: December 20, 2013
Date of Patent: October 25, 2016
Assignee: GLOBALFOUNDRIES Inc.
Inventor: Michael K. Gschwind
-
Patent number: 9411983
Abstract: In an embodiment of the present invention, a processor includes content storage logic to parse digital content into portions and to cause each portion to be stored into a corresponding page of a memory. The processor also includes protection logic to receive a write instruction having a destination address within the memory, and if the destination address is associated with a memory location that stores a portion of the digital content, erase the page associated with the memory location. If the destination address is associated with another memory location that does not store any of the digital content, the protection logic is to permit execution of the write instruction. Other embodiments are described and claimed.
Type: Grant
Filed: March 15, 2013
Date of Patent: August 9, 2016
Assignee: Intel Corporation
Inventors: Jayant Mangalampalli, Rajesh P. Banginwar
-
Patent number: 9344115
Abstract: A method of compressing configuration data used in a reconfigurable processor including generating one piece of combined data by combining configuration data used at two or more cycles and generating a bit table indicating valid operations at each of the two or more cycles among operations included in the combined data.
Type: Grant
Filed: March 27, 2015
Date of Patent: May 17, 2016
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Young-chul Cho, Do-hyung Kim, Suk-jin Kim, Si-hwa Lee
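The combine-plus-bit-table idea in this abstract can be sketched as follows. This is a minimal model under stated assumptions: each cycle's configuration is a list with one operation name per functional-unit slot and None for an idle slot, and cycles are only combined when they never place different operations in the same slot. The names combine and expand, and this representation, are hypothetical and not taken from the patent.

```python
from typing import List, Optional, Tuple

# One operation per functional-unit slot; None marks an idle slot.
Config = List[Optional[str]]


def combine(cycles: List[Config]) -> Tuple[Config, List[List[int]]]:
    """Merge per-cycle configurations into one combined word plus a bit table."""
    width = len(cycles[0])
    combined: Config = [None] * width
    for cfg in cycles:
        for slot, op in enumerate(cfg):
            if op is None:
                continue
            if combined[slot] is not None and combined[slot] != op:
                raise ValueError(f"slot {slot} holds conflicting operations")
            combined[slot] = op
    # bit_table[c][s] == 1 means the combined operation in slot s is valid at cycle c.
    bit_table = [[0 if op is None else 1 for op in cfg] for cfg in cycles]
    return combined, bit_table


def expand(combined: Config, bit_table: List[List[int]]) -> List[Config]:
    """Recover the per-cycle configurations from the combined word and bit table."""
    return [[op if bit else None for op, bit in zip(combined, row)]
            for row in bit_table]
```

In this model, two cycles such as ["add", None, "ld"] and [None, "mul", "ld"] collapse into a single combined word ["add", "mul", "ld"] with one small row of bits per cycle, and expand restores the original cycles.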
-
Patent number: 9304934
Abstract: Register files for use in an out-of-order processor that have been divided into a plurality of sub-register files. The register files also have a plurality of buffers which are each associated with one of the sub-register files. Each buffer receives and stores write operations destined for the associated sub-register file which can be later issued to the sub-register file. Specifically, each clock cycle it is determined whether there is at least one write operation in the buffer that has not been issued to the associated sub-register file. If there is at least one write operation in the buffer that has not been issued to the associated sub-register file, one of the non-issued write operations is issued to the associated sub-register file.
Type: Grant
Filed: January 17, 2014
Date of Patent: April 5, 2016
Assignee: Imagination Technologies Limited
Inventor: Hugh Jackson
-
Patent number: 9286074
Abstract: An instruction compressing apparatus and method for a parallel processing computer such as a very long instruction word (VLIW) computer, are provided. The instruction compressing apparatus includes a bundle code generating unit, an instruction compressing unit, and an instruction converting unit. The bundle code generating unit may generate a bundle code in response to an input of instructions to be compressed. The bundle code may indicate whether a current instruction group is terminated, and also whether an instruction group following the current instruction group is a no-operation (NOP) instruction group. The instruction compressing unit may remove a NOP instruction and/or a NOP instruction group from the input instructions according to the generated bundle code. The instruction converting unit may include the generated bundle code in the remaining instructions which have not been removed by the instruction compressing unit.
Type: Grant
Filed: October 26, 2010
Date of Patent: March 15, 2016
Assignee: Samsung Electronics Co., Ltd.
Inventors: Tai-Song Jin, Dong-Hoon Yoo, Bernhard Egger, Won-Sub Kim, Jin-Seok Lee, Sun-Hwa Kim, Hee-Jin Ahn
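A simplified software model of the bundle-code scheme described in this abstract might look like the sketch below. It assumes a two-flag bundle code per instruction (end of group, next group is a NOP group), ignores slot positions inside a group, and keeps one explicit "nop" whenever a NOP group cannot be folded into the preceding bundle code. The names compress and decompress and all of these choices are illustrative assumptions, not the patented encoding.

```python
from typing import List, Tuple

NOP = "nop"
# (opcode, end_of_group, next_group_is_nop): the two flags form the bundle code.
Packed = Tuple[str, bool, bool]


def compress(groups: List[List[str]]) -> List[Packed]:
    """Drop NOPs and NOP groups, recording group structure in bundle codes."""
    out: List[Packed] = []
    for group in groups:
        ops = [op for op in group if op != NOP]
        if not ops:
            # All-NOP group: fold it into the previous bundle code if that flag is free.
            if out and not out[-1][2]:
                op, end, _ = out[-1]
                out[-1] = (op, end, True)
                continue
            ops = [NOP]  # otherwise keep a single explicit NOP for the group
        for i, op in enumerate(ops):
            out.append((op, i == len(ops) - 1, False))
    return out


def decompress(packed: List[Packed]) -> List[List[str]]:
    """Rebuild the groups (without intra-group NOPs); [] stands for a NOP group."""
    groups: List[List[str]] = []
    current: List[str] = []
    for op, end, next_nop in packed:
        if op != NOP:
            current.append(op)
        if end:
            groups.append(current)
            current = []
            if next_nop:
                groups.append([])
    return groups
```

Because slot positions are discarded, this model only round-trips the sequence of groups and their non-NOP operations; a real decoder would also need to know which functional unit each surviving instruction targets.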
-
Patent number: 9256438Abstract: A computer processor pipeline has both an architectural register file and a working register file. The lifetime of an entry in the working register file is determined by a predetermined number of instructions passing through a specified stage in the pipeline after the location in the working register file is allocated for an instruction. The size of the working register file is selected based upon performance characteristics. A working register file credit indicator is coupled to the front end pipeline portion and to the back end pipeline portion. The working register file credit indicator is monitored to prevent a working register file overflow. When a location in the architectural register file is read early, the location is monitored to determine whether the location is written to prior to issuance of the instruction associated with the early read.Type: GrantFiled: January 15, 2009Date of Patent: February 9, 2016Assignee: ORACLE AMERICA, INC.Inventors: Shailender Chaudhry, Paul Caprioli, Marc Tremblay
-
Register files for a digital signal processor operating in an interleaved multi-threaded environment
Patent number: 9235418Abstract: A processor device includes a memory and a sequencer that is responsive to the memory. The sequencer supports very long instruction word (VLIW) type instructions and at least one VLIW instruction packet uses a number of operands during execution. The processor device further includes a plurality of instruction execution units responsive to the sequencer and a plurality of register files. Each of the plurality of register files includes a plurality of registers and the plurality of register files are coupled to the plurality of instruction execution units. Further, each of the plurality of register files includes a number of data read ports and the number of data read ports of each of the plurality of register files is less than the number of operands used by the at least one VLIW instruction packet.Type: GrantFiled: February 25, 2014Date of Patent: January 12, 2016Assignee: QUALCOMM IncorporatedInventors: Muhammad Ahmed, Erich James Plondke, Lucian Codrescu, William C. Anderson
-
Patent number: 9165165Abstract: A method for protecting a volatile memory against a virus, wherein: rights of writing, reading, or execution are assigned to certain areas of the memory; and a first list of opcodes authorized or forbidden as a content of the areas is associated with each of these areas.Type: GrantFiled: April 27, 2012Date of Patent: October 20, 2015Assignee: STMicroelectronics (Rousset) SASInventor: Yannick Teglia
-
Patent number: 9021236Abstract: Techniques are described for decoupling fetching of an instruction stored in a main program memory from earliest execution of the instruction. An indirect execution method and program instructions to support such execution are addressed. In addition, an improved indirect deferred execution processor (DXP) VLIW architecture is described which supports a scalable array of memory centric processor elements that do not require local load and store units.Type: GrantFiled: January 9, 2014Date of Patent: April 28, 2015Assignee: Altera CorporationInventors: Gerald G. Pechanek, Stamatis Vassiliadis
-
Patent number: 9009365Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.Type: GrantFiled: February 20, 2013Date of Patent: April 14, 2015Assignee: Altera CorporationInventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo Rodriguez, Marco Jacobs
-
Patent number: 9007382Abstract: A system and method of rendering three-dimensional (3D) graphics. The system for rendering 3D graphics may include a plurality of cores including a scratch pad memory, a first memory to perform a control flow, a second memory for loop acceleration, and a shared memory to interpolate with the plurality of cores.Type: GrantFiled: July 20, 2009Date of Patent: April 14, 2015Assignee: Samsung Electronics Co., Ltd.Inventors: Kyoung June Min, Chan Min Park, Won Jong Lee, Dong-Hoon Yoo
-
Patent number: 9009506Abstract: Embodiments of a processing architecture are described. The architecture includes a fetch unit for fetching instructions from a data bus. A scheduler receives data from the fetch unit, creates a schedule, and allocates the data and schedule to a plurality of computational units. The scheduler also modifies voltage and frequency settings of the processing architecture to optimize power consumption and throughput of the system. The computational units include control units and execute units. The control units receive and decode the instructions and send the decoded instructions to execute units. The execute units then execute the instructions according to relevant software.Type: GrantFiled: March 1, 2012Date of Patent: April 14, 2015Assignee: NXP B.V.Inventors: Hamed Fatemi, Ajay Kapoor, Jose Pineda de Gyvez
-
Patent number: 8990543Abstract: In a particular embodiment, a method is disclosed that includes receiving an instruction packet including a first instruction and a second instruction that is dependent on the first instruction at a processor having a plurality of parallel execution pipelines, including a first execution pipeline and a second execution pipeline. The method further includes executing in parallel at least a portion of the first instruction and at least a portion of the second instruction. The method also includes selectively committing a second result of executing the at least a portion of the second instruction with the second execution pipeline based on a first result related to execution of the first instruction with the first execution pipeline.Type: GrantFiled: March 11, 2008Date of Patent: March 24, 2015Assignee: QUALCOMM IncorporatedInventors: Lucian Codrescu, Robert Allan Lester, Charles Joseph Tabony, Erich James Plondke, Mao Zeng, Suresh Venkumahanti, Ajay Anant Ingle
-
Patent number: 8954941Abstract: Method of generating respective instruction compaction schemes for subsets of instructions to be processed by a programmable processor, comprising the steps of a) receiving at least one input code sample representative for software to be executed on the programmable processor, the input code comprising a plurality of instructions defining a first set of instructions (S1), b) initializing a set of removed instructions as empty (S3), c) determining the most compact representation of the first set of instructions (S4) d) comparing the size of said most compact representation with a threshold value (S5), e) carrying out steps e1 to e3 if the size is larger than said threshold value, e1) determining which instruction of the first set of instructions has a highest coding cost (S6), e2) removing said instruction having the highest coding cost from the first set of instructions and (S7), e3) adding said instruction to the set of removed instructions (S8), f) repeating steps b-f, wherein the first set of instructionsType: GrantFiled: September 3, 2010Date of Patent: February 10, 2015Assignee: Intel CorporationInventors: Hendrik Tjeerd Joannes Zwartenkot, Alexander Augusteijn, Yuanging Guo, Jürgen Von Oerthel, Jeroen Anton Johan Leijten, Erwan Yann Maurice Le Thenaff
-
Patent number: 8954714Abstract: An apparatus includes a processor. The processor includes two memories. The first memory stores one set of instructions. The second memory stores another set of instructions that are longer than the set of instructions in the first memory. An instruction in the set of instructions in the first memory is used as a pointer to a corresponding instruction in the set of instructions in the second memory.Type: GrantFiled: February 1, 2010Date of Patent: February 10, 2015Assignee: Altera CorporationInventor: Steven Perry
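A minimal sketch of the two-memory arrangement, assuming the short instructions are plain integer indices and the long instructions are opaque strings. The memory contents and the `fetch` helper are made up for illustration and are not taken from the patent.

```python
# Second memory: a small table of long (wide) instructions.
wide_memory = [
    "ALU0: add r1, r2 | ALU1: nop        | MEM: ld r5, [r6]",
    "ALU0: mul r3, r4 | ALU1: sub r7, r8 | MEM: nop",
]

# First memory: the program, stored as short instructions that each
# act as a pointer into the wide-instruction table.
narrow_memory = [1, 0, 1, 1]


def fetch(pc):
    """Return the long instruction selected by the short one at pc."""
    pointer = narrow_memory[pc]
    return wide_memory[pointer]


trace = [fetch(pc) for pc in range(len(narrow_memory))]
```

The program reuses the same long instruction several times but stores it only once, which is where the space saving in such a scheme would come from.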
-
Publication number: 20150039856Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.Type: ApplicationFiled: August 11, 2014Publication date: February 5, 2015Applicant: Altera CorporationInventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
-
Patent number: 8898433Abstract: An apparatus having a buffer and a circuit is disclosed. The buffer may be configured to store a plurality of fetch sets. Each fetch set generally includes a prefix word and a plurality of instruction words. Each prefix word may include a plurality of symbols. Each symbol generally corresponds to a respective one of the instruction words. The circuit may be configured to (i) identify each of the symbols in each of the fetch sets having a predetermined value and (ii) parse the fetch sets into a plurality of execution sets in response to the symbols having the predetermined value.Type: GrantFiled: April 26, 2012Date of Patent: November 25, 2014Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.Inventors: Alexander Rabinovitch, Leonid Dubrovin
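The parsing described in this abstract can be illustrated with a short sketch. It assumes each prefix symbol is a single bit and that the predetermined value 1 marks the last instruction word of an execution set; the abstract does not state either detail, and the names `END` and `parse_fetch_sets` are hypothetical.

```python
END = 1  # assumed predetermined symbol value: "last word of an execution set"


def parse_fetch_sets(fetch_sets):
    """Split buffered fetch sets into execution sets.

    fetch_sets: list of (prefix_symbols, instruction_words) pairs, where
    prefix_symbols[i] describes instruction_words[i].
    Returns (execution_sets, leftover_words).
    """
    execution_sets = []
    current = []
    for prefix, words in fetch_sets:
        for symbol, word in zip(prefix, words):
            current.append(word)
            if symbol == END:
                execution_sets.append(current)
                current = []
    # Words after the last END symbol wait for the next fetch set.
    return execution_sets, current


buffered = [
    ([0, 1, 0, 0], ["a", "b", "c", "d"]),
    ([1, 1, 0, 1], ["e", "f", "g", "h"]),
]
execution_sets, leftover = parse_fetch_sets(buffered)
```

Note that the second execution set in this example spans the boundary between the two fetch sets, which is why the parser carries `current` across iterations rather than resetting it per fetch set.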
-
Patent number: 8850557Abstract: Disclosed are a processor and processing method that provide non-hierarchical computer security enhancements for context states. The processor can comprise a context control unit that uses context identifier tags associated with corresponding contexts to control access by the contexts to context information (i.e., context states) contained in the processor's non-stackable and/or stackable registers. For example, in response to an access request, the context control unit can grant a specific context access to a register only when that register is tagged with a specific context identifier tag. If the register is tagged with another context identifier tag, the contents of the specific register are saved in a context save area of memory and the previous context states of the specific context are restored to the specific register before access can be granted.Type: GrantFiled: February 29, 2012Date of Patent: September 30, 2014Assignee: International Business Machines CorporationInventors: Richard H. Boivie, William E. Hall, Guerney D. H. Hunt, Suzanne K. McIntosh, Mark F. Mergen, Marcel C. Rosu, David R. Safford, David C. Toll, Carl Lynn C. Karger