Logic Operation Instruction Processing Patents (Class 712/223)
-
Publication number: 20110153997
Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) a first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable medium storing such an instruction are also disclosed.
Type: Application
Filed: December 22, 2009
Publication date: June 23, 2011
Inventors: Maxim Loktyukhin, Eric W. Mahurin, Bret L. Toll, Martin G. Dixon, Sean P. Mirkes, David L. Kreitzer, El Moustapha Ould-Ahmed-Vall, Vinodh Gopal
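The behaviour described in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the first range is the low-order bits below an index given by the instruction and that the second range is filled with zeros; the function name `zero_high_bits` and the 64-bit default width are hypothetical and do not come from the application.

```python
def zero_high_bits(source: int, start: int, width: int = 64) -> int:
    """Keep bits [0, start) of source in place and clear bits [start, width).

    Assumption: the "first range" is the low-order bits and the "second range"
    is all zeros. No shift is applied, so kept bits stay in their positions.
    """
    if not 0 <= start <= width:
        raise ValueError("start must lie between 0 and width")
    operand_mask = (1 << width) - 1   # limit the source to the operand width
    keep_mask = (1 << start) - 1      # ones in the positions that are copied
    return source & operand_mask & keep_mask
```

For example, `zero_high_bits(0b11011011, 4)` keeps the low four bits and returns `0b1011`.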
-
Publication number: 20110138156
Abstract: A method and associated processor suitable for executing machine instructions for evaluating a logical expression are provided. The approach suggested makes use of a memory and an extended set of instructions. The memory, which can be embodied in a general purpose register for example, is for storing information related to intermediate results obtained in evaluating the logical expression as well as a nesting level of sub-expressions in the logical expression being evaluated. The extended set of instructions allows for initializing and updating the information in that memory. A processor for executing the extended set of instructions is also provided along with a process for generating machine code making use of this extended set of instructions for evaluating a logical expression.
Type: Application
Filed: October 15, 2010
Publication date: June 9, 2011
Inventors: Tom AWAD, Martin LAURENCE, Martin FILTEAU
-
Patent number: 7917906
Abstract: Method and apparatus for allocating system resources for use by software processes in a computer-based system, such as a wide area network (WAN) comprising a data storage array. A first memory space provides a first bit indicator to indicate whether at least one system resource is available for use. A second memory space provides a second bit indicator to indicate whether a pending software process awaits availability of the system resource. The resource is allocated for use by the process in relation to a combinatorial operation upon the first and second bit indicators, preferably comprising a logical AND operation. The first and second memory spaces are preferably characterized as multi-bit registers. A free resource stack identifies available resources, and a process queue identifies pending processes waiting for released processes. The statuses of the respective stack and queue are reflected in the bits in the multi-bit registers.
Type: Grant
Filed: July 2, 2004
Date of Patent: March 29, 2011
Assignee: Seagate Technology LLC
Inventor: Michael D. Walker
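A minimal sketch of the AND-based matching described above follows. It assumes one bit per resource class, a free-resource stack and a pending-process queue per class, and at most one grant per class on each pass; the class `ResourceAllocator` and all of its member names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class ResourceAllocator:
    """Illustrative model: bit i of each register mirrors stack/queue i."""

    def __init__(self, classes: int) -> None:
        self.free_stacks: List[List[str]] = [[] for _ in range(classes)]
        self.wait_queues: List[Deque[str]] = [deque() for _ in range(classes)]

    def available_bits(self) -> int:
        # First multi-bit register: bit set when a free resource exists.
        return sum(1 << i for i, stack in enumerate(self.free_stacks) if stack)

    def waiting_bits(self) -> int:
        # Second multi-bit register: bit set when a process is pending.
        return sum(1 << i for i, queue in enumerate(self.wait_queues) if queue)

    def dispatch(self) -> List[Tuple[str, str]]:
        # Logical AND finds classes with both a free resource and a waiter.
        ready = self.available_bits() & self.waiting_bits()
        grants: List[Tuple[str, str]] = []
        index = 0
        while ready:
            if ready & 1:
                process = self.wait_queues[index].popleft()
                resource = self.free_stacks[index].pop()
                grants.append((process, resource))
            ready >>= 1
            index += 1
        return grants
```

With a free buffer in class 0 and waiters in classes 0 and 1, a single `dispatch()` call grants only the class 0 request, and the waiting register still shows bit 1 set afterwards.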
-
Publication number: 20110055516
Abstract: An innovative realization of computer hardware, software and firmware comprising a multiprocessor system wherein at least one processor can be configured to have a fixed instruction set and one or more processors can be statically or dynamically configured to implement a plurality of processor states in a plurality of technologies. The processor states may be instructions sets for the processors. The technologies may include programmable logic arrays.
Type: Application
Filed: August 20, 2010
Publication date: March 3, 2011
Applicant: FTL Systems Technology Corporation
Inventor: John C. Willis
-
Publication number: 20110035570
Abstract: A superscalar pipelined microprocessor includes a register set defined by an instruction set architecture of the microprocessor, execution units, and a store unit, coupled to the cache memory and distinct from the other execution units of the microprocessor. The store unit comprises an ALU. The store unit receives an instruction that specifies a source register of the register set and an operation to be performed on a source operand to generate a result. The store unit reads the source operand from the source register. The ALU performs the operation on the source operand to generate the result, rather than forwarding the source operand to any of the other execution units of the microprocessor to perform the operation on the source operand to generate the result. The store unit operatively writes the result to the cache memory.
Type: Application
Filed: October 30, 2009
Publication date: February 10, 2011
Inventors: Gerard M. Col, Colin Eddy, Rodney E. Hooker
-
Publication number: 20110035569
Abstract: A superscalar pipelined microprocessor includes a register set defined by its instruction set architecture, a cache memory, execution units, and a load unit, coupled to the cache memory and distinct from the other execution units. The load unit comprises an ALU. The load unit receives an instruction that specifies a memory address of a source operand, an operation to be performed on the source operand to generate a result, and a destination register of the register set to which the result is to be stored. The load unit reads the source operand from the cache memory. The ALU performs the operation on the source operand to generate the result, rather than forwarding the source operand to any of the other execution units of the microprocessor to perform the operation on the source operand to generate the result. The load unit outputs the result for subsequent retirement to the destination register.
Type: Application
Filed: October 30, 2009
Publication date: February 10, 2011
Inventors: Gerard M. Col, Colin Eddy, Rodney E. Hooker
-
Publication number: 20110010523
Abstract: A cascadable arithmetic and logic unit (ALU) which is configurable in function and interconnection. No decoding of commands is needed during execution of the algorithm. The ALU can be reconfigured at run time without any effect on surrounding ALUs, processing units or data streams. The volume of configuration data is very small, which has positive effects on the space required and the configuration speed. Broadcasting is supported through the internal bus systems in order to distribute large volumes of data rapidly and efficiently. The ALU is equipped with a power-saving mode to shut down power consumption completely. There is also a clock rate divider which makes it possible to operate the ALU at a slower clock rate. Special mechanisms are available for feedback on the internal states to the external controllers.
Type: Application
Filed: July 27, 2010
Publication date: January 13, 2011
Inventors: Martin VORBACH, Robert Münch
-
Publication number: 20100318773
Abstract: A computer system is operable to identify index elements in a vector index array that cannot be processed in parallel by calculating a complement modified bit matrix compare function between a first matrix filled with elements from the vector index array and a second matrix filled with the same elements from the vector index array.
Type: Application
Filed: June 11, 2009
Publication date: December 16, 2010
Applicant: Cray Inc.
Inventors: Terry D. Greyzck, William F. Long, Peter M. Klausler, Matthew F. Taylor
-
Publication number: 20100313000
Abstract: The present techniques provide an internal processor of a memory device configured to selectively execute instructions in parallel, for example. One such internal processor includes a plurality of arithmetic logic units (ALUs), each connected to conditional masking logic, and each configured to process conditional instructions. A condition instruction may be received by a sequencer of the memory device. Once the condition instruction is received, the sequencer may enable the conditional masking logic of the ALUs. The sequencer may toggle a signal to the conditional masking logic such that the masking logic masks certain instructions if a condition of the condition instruction has been met, and masks other instructions if the condition has not been met. In one embodiment, each ALU in the internal processor may selectively perform instructions in parallel.
Type: Application
Filed: June 4, 2009
Publication date: December 9, 2010
Applicant: MICRON TECHNOLOGY, INC.
Inventor: Robert Walker
-
Publication number: 20100312998
Abstract: Devices, systems, and methods of communicating information directly to a sequencer or a buffer in a memory device are provided. In some embodiments, instructions are sent directly from an external processor to a sequencer in the memory device, and the sequencer configures the instructions for an internal processor, such as one or more arithmetic logic units (ALUs) embedded on the memory device. Further, data to be operated on by the internal processor can be sent directly from the external processor to a buffer, and the sequencer can copy the data from the buffer to the internal processor. As power can be consumed each time a memory array is written to or read from, the direct communication of instructions and/or data can reduce the power consumed in writing to or reading from the memory array.
Type: Application
Filed: June 4, 2009
Publication date: December 9, 2010
Applicant: MICRON TECHNOLOGY, INC.
Inventor: Robert Walker
-
Publication number: 20100312999
Abstract: One or more of the present techniques provide a compute engine buffer configured to maneuver data and increase the efficiency of a compute engine. One such compute engine buffer is connected to a compute engine which performs operations on operands retrieved from the buffer, and stores results of the operations to the buffer. Such a compute engine buffer includes a compute buffer having storage units which may be electrically connected or isolated, based on the size of the operands to be stored and the configuration of the compute engine. The compute engine buffer further includes a data buffer, which may be a simple buffer. Operands may be copied to the data buffer before being copied to the compute buffer, which may save additional clock cycles for the compute engine, further increasing the compute engine efficiency.
Type: Application
Filed: June 4, 2009
Publication date: December 9, 2010
Applicant: MICRON TECHNOLOGY, INC.
Inventor: Robert Walker
-
Patent number: 7849466
Abstract: A multithreaded processor device is disclosed and includes a processor that is configured to execute a plurality of executable program threads and a mode control register. The mode control register includes a first data field to control a first execution mode of a first of the plurality of executable program threads and a second data field to control a second execution mode of a second of the plurality of executable program threads. In a particular embodiment, the first execution mode is a run mode and the second execution mode is a low power mode.
Type: Grant
Filed: July 12, 2005
Date of Patent: December 7, 2010
Assignee: QUALCOMM Incorporated
Inventors: Lucian Codrescu, Donald Robert Padgett, Erich Plondke, Taylor Simpson, Muhammad Ahmed, William C. Anderson, Sujat Jamil
-
Publication number: 20100299506
Abstract: A rotate then operate instruction having a T bit is fetched and executed wherein a first operand in a first register is rotated by an amount and a Boolean operation is performed on a selected portion of the rotated first operand and a second operand of a second register. If the T bit is ‘0’ the selected portion of the result of the Boolean operation is inserted into corresponding bits of a second operand of a second register. If the T bit is ‘1’, in addition to the inserted bits, the bits other than the selected portion of the rotated first operand are saved in the second register.
Type: Application
Filed: July 21, 2010
Publication date: November 25, 2010
Applicant: International Business Machines Corporation
Inventors: Dan F. Greiner, Timothy J. Slegel, Joachim von Buttlar
-
Publication number: 20100293358
Abstract: A dynamic processor-set management method provides for transferring a process from a shared processor set to a dedicated processor set when that process meets a first utilization-related criterion. The method also provides for transferring a process from a dedicated processor set to a shared processor set when that process meets a second utilization-related criterion. The processor sets are mapped to processor cores that execute the processes.
Type: Application
Filed: May 15, 2009
Publication date: November 18, 2010
Inventors: Ryohei Leo SAKAGUCHI, Daniel Edward Herington, Seiji Inokuchi
-
Publication number: 20100262805
Abstract: A processor has a central processing unit (CPU), a first CPU register set, a second CPU register set, a multiplexer logic for either coupling the first or the second CPU register set with the CPU, and control logic for controlling the multiplexer logic to switch from the first CPU register set to the second CPU register set upon receipt of at least one of a plurality of interrupt signals, wherein the at least one of a plurality of interrupt signals must meet a condition that is programmable within the control logic.
Type: Application
Filed: March 29, 2010
Publication date: October 14, 2010
Inventors: Robert Sean Justice, Tyler Nye Boddie, Joseph Triece
-
Publication number: 20100262747
Abstract: A multi-core bus termination apparatus includes a location array and a plurality of drivers. The location array generates a plurality of location signals that indicate locations on the bus of a corresponding plurality of nodes that are coupled to the bus, where the locations comprise either an internal location or a bus end location. Each of the plurality of drivers has one of the corresponding plurality of nodes, and controls how the one of the corresponding plurality of nodes is driven responsive to a state of a corresponding one of the plurality of location signals. Each of the plurality of drivers has configurable multi-core logic. The configurable multi-core logic enables pull-up logic and first pull-down logic if the state indicates the bus end location. The configurable multi-core logic disables the pull-up logic and enables the first pull-down logic and second pull-down logic if the state indicates the internal location.
Type: Application
Filed: April 14, 2009
Publication date: October 14, 2010
Applicant: VIA TECHNOLOGIES, INC.
Inventors: DARIUS D. GASKINS, JAMES R. LUNDBERG
-
Publication number: 20100223444
Abstract: A method and a device having a plurality of bit operations capability, the device includes: a first and a second registers and an instruction fetch circuit, and an arithmetic logic unit adapted to: calculate, during a first clock cycle, a position value representative of a position, within a first information vector, of a first bit of information that has a first value; and to multiply the position value by a multiplication factor to provide a first result and to alter the value of the first bit to a second value to provide an updated information vector, during the first clock cycle.
Type: Application
Filed: August 18, 2006
Publication date: September 2, 2010
Applicant: Freescale Semiconductor, Inc.
Inventors: Eran Glickman, Evgeni Ginzburg, Noam Sheffer
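The combined find, multiply and clear step in this abstract can be sketched in software as follows. The sketch assumes that the "first bit" is the least significant bit holding the value 1 and that the "second value" is 0; the function name `find_scale_clear` is hypothetical, and the single-cycle hardware behaviour is of course not modelled.

```python
from typing import Tuple


def find_scale_clear(vector: int, factor: int) -> Tuple[int, int]:
    """Return (position * factor, vector with that bit cleared).

    Assumption: the bit of interest is the lowest set bit of the vector.
    """
    if vector <= 0:
        raise ValueError("vector must contain at least one set bit")
    lowest = vector & -vector            # isolate the lowest set bit
    position = lowest.bit_length() - 1   # zero-based bit position
    updated = vector & (vector - 1)      # clear that bit
    return position * factor, updated
```

Calling `find_scale_clear(0b101000, 4)` finds the set bit at position 3, so it returns `(12, 0b100000)`. One plausible use is turning a pending-request bitmap into a table offset while acknowledging the request.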
-
Publication number: 20100217960
Abstract: A method for performing serial functions in parallel, where a datapath is divided into several independent stages, or pipeline stages, so that logical functions can be implemented in each pipeline stage concurrently. In an illustrative embodiment of the invention, a pipelined logic tree is described. This method allows for n-bits to be input to the system and n-bits to be output from the system concurrently.
Type: Application
Filed: April 29, 2009
Publication date: August 26, 2010
Applicant: AVALON MICROELECTRONICS, INC.
Inventor: Wally Haas
-
Patent number: 7783627
Abstract: An apparatus and method retrieves a database record from an in-memory database of a parallel computer system using a unique key. The parallel computer system performs a simultaneous search on each node of the computer system using the unique key and then utilizes a global combining network to combine the results from the searches of each node to efficiently and quickly search the entire database.
Type: Grant
Filed: July 30, 2007
Date of Patent: August 24, 2010
Assignee: International Business Machines Corporation
Inventors: Charles Jens Archer, Amanda Peters, Gary Ross Ricard, Albert Sidelnik, Brian Edward Smith
-
Publication number: 20100205602
Abstract: A thread scheduling mechanism is provided that flexibly enforces performance isolation of multiple threads to alleviate the effect of anti-cooperative execution behavior with respect to a shared resource, for example, hoarding a cache or pipeline, using the hardware capabilities of simultaneous multi-threaded (SMT) or multi-core processors. Given a plurality of threads running on at least two processors in at least one functional processor group, the occurrence of a rescheduling condition indicating anti-cooperative execution behavior is sensed, and, if present, at least one of the threads is rescheduled such that the first and second threads no longer execute in the same functional processor group at the same time.
Type: Application
Filed: April 26, 2010
Publication date: August 12, 2010
Applicant: VMWARE, INC.
Inventors: John R. ZEDLEWSKI, Carl A. WALDSPURGER
-
Publication number: 20100191938
Abstract: An information processing device including: a first arithmetic processing unit performing first arithmetic processing; a second arithmetic processing unit performing second arithmetic processing; input registers adapted to include a first input register allocated to the first arithmetic processing unit, and a second input register allocated to the second arithmetic processing unit; and output registers storing processing results of the first arithmetic processing unit and processing results of the second arithmetic processing unit, wherein, in each of given execution cycles, the first arithmetic processing unit performs the first arithmetic processing using stored data of the first input register and stores a processing result of the first arithmetic processing in the output registers and the second arithmetic processing unit performs the second arithmetic processing using stored data of the second input register and stores a processing result of the second arithmetic processing in the output registers.
Type: Application
Filed: January 29, 2010
Publication date: July 29, 2010
Applicant: SEIKO EPSON CORPORATION
Inventors: Hiroshi HASEGAWA, Fumio KOYAMA
-
Publication number: 20100185836
Abstract: An arithmetic-program conversion apparatus includes: a program storage section storing an arithmetic program describing a circuit by a logical expression including a plurality of input and output variables, and operators; if the expression has three input variables or more, an intermediate-variable generation section generating an intermediate variable for converting the expression into a plurality of binomials including input and output variables; if the intermediate variable is generated, an expression conversion section converting the logical expression into a plurality of binomials including a binomial for obtaining the intermediate variable and a binomial obtaining the output variable from the intermediate variable; if a plurality of binomials are generated, an expression update section updating the stored original expression; a bit-width determination section determining bit widths of the output, input, and intermediate variables of the expression; and a bit-width storage section storing the bit widths.
Type: Application
Filed: January 19, 2010
Publication date: July 22, 2010
Applicant: Sony Corporation
Inventor: Shota Hasegawa
-
Publication number: 20100185837
Abstract: A family of reconfigurable asynchronous logic elements that interact with their nearest neighbors permits reconfigurable implementation of circuits that are asynchronous at the bit level, rather than at the level of functional blocks. These elements pass information by means of tokens. Each cell is self-timed, and cells that are configured as interconnect perform at propagation delay speeds, so no hardware non-local connections are needed. A reconfigurable asynchronous logic element comprises a set of edges for communication with at least one neighboring cell, each edge having an input for receiving tokens from neighboring cells and an output for transferring tokens to at least one neighboring cell, circuitry configured to perform a logic operation utilizing received tokens as inputs and to produce an output token reflecting the result of the logic operation, and circuitry.
Type: Application
Filed: September 16, 2009
Publication date: July 22, 2010
Applicant: Massachusetts Institute of Technology
Inventors: David Allen Dalrymple, Erik Demaine, Neil Gershenfeld, Forrest Green, Ara Knaian
-
Patent number: 7761695
Abstract: A data processing circuit has a programmable processor (12a, b) with an instruction set that comprises a new type of instruction. This instruction has a first operand that refers to a string of bits, and a second operand that refers to a position in that string of bits. The programmable processor (12a, b) is arranged to execute this type of instruction by returning, as a result, a code that is indicative of a count of a number of bits that occurs from said position in the string of bits until the string of bits from said position deviates from a predetermined bit pattern. The instruction is particularly useful for use in programs that perform variable length decoding and/or decoding.
Type: Grant
Filed: September 15, 2005
Date of Patent: July 20, 2010
Assignee: Silicon Hive B.V.
Inventor: Kornelis Meinds
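As a rough software analogue of the counting instruction described above, the sketch below assumes that the predetermined bit pattern is a run of one repeated bit value and that the bit string is given as text of '0' and '1' characters read left to right; the function name `count_run` and its parameters are hypothetical.

```python
def count_run(bits: str, position: int, pattern_bit: str = "1") -> int:
    """Count bits from position until the string deviates from pattern_bit.

    Assumption: the "predetermined bit pattern" is a run of identical bits.
    """
    if not 0 <= position <= len(bits):
        raise ValueError("position outside the bit string")
    count = 0
    for bit in bits[position:]:
        if bit != pattern_bit:
            break                 # first deviation ends the run
        count += 1
    return count
```

For instance, `count_run("0011101", 2)` returns 3, which is the kind of result a decoder needs when reading a unary prefix in a variable-length code.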
-
Publication number: 20100180129
Abstract: An arrangement of arithmetic logic units carries out an operation on at least one operand, wherein the operation is determined by operation codes received by the arithmetic logic units. The operation codes and at least one operand are received on a first clock cycle. The result of the operation is output from at least one arithmetic logic unit to at least one further arithmetic logic unit. A result of the plurality of arithmetic logic units is then output on a next clock cycle.
Type: Application
Filed: December 18, 2009
Publication date: July 15, 2010
Applicant: STMicroelectronics R&D Ltd.
Inventor: David Smith
-
Patent number: 7752424
Abstract: A processor 2 is provided with the ability to execute program instructions in the form of Java bytecodes including a dedicated null checking instruction. The null checking instruction reads the top of stack value, compares this with a null value and jumps to an exception handling routine if the top of stack value equals the null value, otherwise the next program instruction is executed.
Type: Grant
Filed: August 8, 2007
Date of Patent: July 6, 2010
Assignee: ARM Limited
Inventor: Rodolph Gérard Jacques Ascanio Jean-Denis Perfetta
-
Publication number: 20100138774
Abstract: A computer-implemented method for processing multivariate data, comprising: inputting or receiving an alphanumeric expression comprising at least one process pointer, indicative of a gating process, a Boolean process or an external process; parsing the expression; executing the process indicated by the process pointer on multivariate data in a data file; and outputting output data comprising the multivariate data processed according to the expression.
Type: Application
Filed: October 30, 2007
Publication date: June 3, 2010
Inventors: Nicholas Daryl Crosbie, Vittorio Cordioli
-
Patent number: 7721072
Abstract: An information processing method includes generating a state transition diagram based on state transition information; displaying the state transition diagram; manipulating the displayed state transition diagram; updating the state transition information in accordance with how the state transition diagram has been manipulated; and storing a position of a state designated as a transition starting state by the manipulating step. When the position of the transition starting state has been specified by the manipulating step, the displaying step displays as a pointer an icon indicating that the position of the transition starting state has been specified.
Type: Grant
Filed: November 2, 2006
Date of Patent: May 18, 2010
Assignee: Sony Corporation
Inventors: Yasuhiro Watanabe, Shuichi Konami
-
Publication number: 20100122064
Abstract: A device may include a data processing logic cell field and one or more sequential CPUs. The logic cell field and the CPUs may be configured to be coupled to each other for data exchange. The data exchange may be in block form using lines leading to a cache memory. In a method for operating a reconfigurable unit having runtime-limited configurations, the configurations may be able to increase their maximum allowed runtime, e.g., by triggering a parallel counter. An increase in configuration runtime by the configurations may be suppressed in response to an interrupt.
Type: Application
Filed: September 30, 2009
Publication date: May 13, 2010
Inventor: MARTIN VORBACH
-
Publication number: 20100100714
Abstract: Systems and methods are provided for managing access to registers. A system may include a set of direct registers and a set of indirect registers. The indirect registers may be accessed through the direct registers, and the direct registers may provide various features to provide faster access to the indirect registers. One of the direct registers may indicate access modes for accessing the indirect registers. The access modes may include auto-increment, auto-decrement, auto-reset, and no change modes. Based on the access mode, the currently accessed address may be automatically modified after accessing the indirect register at the address.
Type: Application
Filed: October 18, 2008
Publication date: April 22, 2010
Applicant: Micron Technology, Inc.
Inventors: Harold B Noyes, Mark Jurenka, Gavin Huggins
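The four access modes named in this abstract can be modelled with a short sketch. The names `AccessMode` and `IndirectRegisterBank`, the wrap-around at the ends of the bank, and the reading of auto-reset as "return to a configured reset address" are all assumptions made for illustration rather than details taken from the application.

```python
from enum import Enum


class AccessMode(Enum):
    NO_CHANGE = 0
    AUTO_INCREMENT = 1
    AUTO_DECREMENT = 2
    AUTO_RESET = 3


class IndirectRegisterBank:
    """Indirect registers reached through direct address/mode/data registers."""

    def __init__(self, size: int, reset_address: int = 0) -> None:
        self.indirect = [0] * size          # the indirect register set
        self.address = 0                    # direct register: current address
        self.mode = AccessMode.NO_CHANGE    # direct register: access mode
        self.reset_address = reset_address  # assumed target of auto-reset

    def _update_address(self) -> None:
        size = len(self.indirect)
        if self.mode is AccessMode.AUTO_INCREMENT:
            self.address = (self.address + 1) % size
        elif self.mode is AccessMode.AUTO_DECREMENT:
            self.address = (self.address - 1) % size
        elif self.mode is AccessMode.AUTO_RESET:
            self.address = self.reset_address

    def read(self) -> int:
        value = self.indirect[self.address]
        self._update_address()              # address changes after the access
        return value

    def write(self, value: int) -> None:
        self.indirect[self.address] = value
        self._update_address()
```

With the mode set to `AUTO_INCREMENT`, three consecutive `write` calls fill addresses 0, 1 and 2 without the caller rewriting the address register between accesses, which is the speed-up the abstract points to.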
-
Publication number: 20100077187
Abstract: A system and method to execute a linear feedback-shift instruction is disclosed. In a particular embodiment the method includes executing an instruction at a processor by receiving source data and executing a bitwise logical operation on the source data and on reference data to generate intermediate data. The method further includes determining a parity value of the intermediate data, shifting the source data, and entering the parity value of the intermediate data into a data field of the shifted source data to produce resultant data.
Type: Application
Filed: September 23, 2008
Publication date: March 25, 2010
Applicant: QUALCOMM INCORPORATED
Inventors: Erich Plondke, Lucian Codrescu, Remi Gurski, Shankar Krithivasan
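The sequence of steps in this abstract maps naturally onto one step of a linear feedback shift register. The sketch below assumes that the bitwise logical operation is AND, that the shift is one position to the right, and that the parity bit enters at the most significant position; the function name `lfsr_step` and the 32-bit default width are hypothetical.

```python
def lfsr_step(source: int, reference: int, width: int = 32) -> int:
    """One feedback-shift step: AND, parity, shift right, insert parity at top.

    Assumptions: AND as the logical operation, right shift by one,
    parity inserted into the most significant bit.
    """
    source &= (1 << width) - 1                 # confine to the register width
    intermediate = source & reference          # select the tap bits
    parity = bin(intermediate).count("1") & 1  # XOR of all selected bits
    shifted = source >> 1
    return shifted | (parity << (width - 1))
```

With a 4-bit register, `lfsr_step(0b1011, 0b0001, 4)` selects one set tap, so the parity is 1 and the result is `0b1101`; with taps `0b0011` the parity is 0 and the result is `0b0101`.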
-
Publication number: 20100049952
Abstract: An apparatus for decreasing the likelihood of incorrectly forwarding store data includes a hash generator, which hashes J address bits to K hashed bits. The J address bits are a memory address specified by a load/store instruction, where K is an integer greater than zero and J is an integer greater than K. The apparatus also includes a comparator, which outputs a first value if L address bits specified by the load instruction match L address bits specified by the store instruction and K hashed bits of the load instruction match corresponding K hashed bits of the store instruction, and otherwise outputs a second value, where L is greater than zero. The apparatus also includes forwarding logic, which forwards data from the store instruction to the load instruction if the comparator outputs the first value and foregoes forwarding the data when the comparator outputs the second value.
Type: Application
Filed: August 25, 2008
Publication date: February 25, 2010
Applicant: VIA TECHNOLOGIES, INC.
Inventors: Colin Eddy, Rodney E. Hooker
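A minimal sketch of the comparison described above follows. It assumes that the L bits are the low-order address bits, that the J hashed bits are the bits directly above them, and that the hash is a simple XOR fold; the abstract does not specify the hash function, and the names `fold_hash` and `may_forward` as well as the default values of L, J and K are hypothetical.

```python
def fold_hash(bits: int, j: int, k: int) -> int:
    """XOR-fold the low j bits of `bits` down to k bits (assumed hash)."""
    bits &= (1 << j) - 1
    chunk_mask = (1 << k) - 1
    hashed = 0
    while bits:
        hashed ^= bits & chunk_mask
        bits >>= k
    return hashed


def may_forward(load_addr: int, store_addr: int,
                l: int = 12, j: int = 20, k: int = 4) -> bool:
    """True when the L low bits and the K hashed bits of both addresses match."""
    low_mask = (1 << l) - 1
    same_low = (load_addr & low_mask) == (store_addr & low_mask)
    same_hash = fold_hash(load_addr >> l, j, k) == fold_hash(store_addr >> l, j, k)
    return same_low and same_hash
```

Comparing only the low bits would wrongly match `0x12345678` with `0x12346678`; the hashed upper bits differ, so `may_forward` rejects that pair. Some aliasing remains, for example `0x12345678` and `0x12354678` hash identically under this fold, which is consistent with the abstract's wording of decreasing rather than eliminating incorrect forwarding.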
-
Publication number: 20100049951
Abstract: The described embodiments provide a processor for generating a result vector with shifted values. During operation, the processor receives a first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element position in the second input vector. The processor then writes the product of the base value and values from relevant elements in the first input vector into selected elements in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
Type: Application
Filed: August 14, 2009
Publication date: February 25, 2010
Applicant: APPLE INC.
Inventors: Jeffry E. Gonion, Keith E. Diefendorff
-
Publication number: 20100030837
Abstract: A method of modifying a group of full adder circuits to compute a Boolean function of a set number of input bits, each full adder circuit having first and second data inputs, a data output, a carry input and a carry output, the full adder circuits being interconnected so as to form a carry chain. The method comprises the steps of setting the first input of each full adder circuit to a same fixed value, connecting each respective input bit of the set number of input bits to the second input of a respective one of the full adder circuits and using the output of the carry chain of the array of full adder circuits as the result of the Boolean function.
Type: Application
Filed: June 26, 2009
Publication date: February 4, 2010
Inventor: Anthony STANSFIELD
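The carry-chain technique can be checked with a small bit-level model. The full-adder equations are standard; the initial carry-in value is an assumption, since the abstract does not state it, and the function names `full_adder` and `carry_chain` are hypothetical.

```python
from typing import Iterable, Tuple


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """Standard one-bit full adder returning (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def carry_chain(bits: Iterable[int], fixed: int, carry_in: int) -> int:
    """Feed `fixed` to every first input and one input bit to each second input.

    The final carry out of the chain is taken as the Boolean result.
    """
    carry = carry_in
    for bit in bits:
        _, carry = full_adder(fixed, bit, carry)
    return carry
```

With the fixed input at 0 each stage reduces to `carry_in AND bit`, so starting from a carry-in of 1 the chain computes the AND of all input bits. With the fixed input at 1 each stage reduces to `carry_in OR bit`, so starting from 0 the chain computes their OR.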
-
Publication number: 20090319759
Abstract: A method and apparatus for seamless frequency sequestering is herein described. In response to a frequency throttle event, controlling software, such as an OS, is provided access to a throttled amount of frequency associated with the frequency throttle event, while another amount of frequency is transparently sequestered for performance of non-controlling software tasks.
Type: Application
Filed: June 19, 2008
Publication date: December 24, 2009
Inventors: Michael A. Rothman, Vincent J. Zimmer
-
Patent number: 7627458
Abstract: A method is provided to automatically allocate resources of an integrated circuit (IC) to form multipliers in a given design to optimize the use of IC resources. Information about the multipliers in the design is extracted to place the multipliers into a priority order. The priority allows primitives in the IC, like DSP blocks, LUTs or MUXCYs, to be economically allocated to the multipliers. The ordering criteria can include: (1) a user defined criteria, (2) the number of primitives required to implement a multiplier, or (3) a size of the multiplier operands. This invention further optimally allocates LUTs and MUXCYs when DSP48 blocks are exhausted. The steps for generating a multiplier include: constructing a partial product matrix and minimizing the adders used in the multiplier by minimizing the size of support for the partial products. Either LUTs or MUXCYs are selected depending on the size of support determined.
Type: Grant
Filed: December 21, 2005
Date of Patent: December 1, 2009
Assignee: XILINX, Inc.
Inventors: David Nguyen Van Mau, Yassine Rjimati
-
Publication number: 20090292904
Abstract: An apparatus providing for a secure execution environment including a microprocessor and a secure non-volatile memory. The microprocessor executes non-secure application programs and a secure application program, where the non-secure application programs are accessed from a system memory via a system bus, and where the secure application program is executed in a secure execution mode. The microprocessor has secure watchdog logic that monitors environmental attributes corresponding to the microprocessor and to the secure application program, and that is configured to transfer program control to one of a plurality of event handlers within the secure application program. The secure non-volatile memory is coupled to the microprocessor via a private bus.
Type: Application
Filed: October 31, 2008
Publication date: November 26, 2009
Applicant: VIA TECHNOLOGIES, INC
Inventors: G. Glenn Henry, Terry Parks
-
Patent number: 7624251Abstract: One embodiment of the present invention provides a processor that is configured to execute load-swapped-partial instructions. An instruction fetch unit within the processor is configured to fetch the load-swapped-partial instruction to be executed. Note that the load-swapped-partial instruction specifies a source address in a memory, which is possibly an unaligned address. Furthermore, an execution unit within the processor is configured to execute the load-swapped-partial instruction. This involves loading a partial-vector-sized datum from a naturally-aligned memory region encompassing the source address.Type: GrantFiled: January 18, 2007Date of Patent: November 24, 2009Assignee: Apple Inc.Inventors: Jeffry E. Gonion, Keith E. Diefendorff
-
Patent number: 7620797Abstract: One embodiment of the present invention provides a processor which is configured to execute load-swapped instructions, which are possibly directed to an unaligned source address. The processor is configured to execute the load-swapped instruction by loading a vector from a naturally-aligned memory region encompassing the source address, and in doing so rotating the bytes of the vector to cause the byte at the specified source address to reside at the least-significant byte position within the vector for a little-endian memory transaction, or causing said byte to be positioned at the most-significant byte position within the vector for a big-endian memory transaction.Type: GrantFiled: November 1, 2006Date of Patent: November 17, 2009Assignee: Apple Inc.Inventors: Jeffry E. Gonion, Keith E. Diefendorff
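The little-endian case of the rotation described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the 16-byte vector width, the function name `load_swapped_le`, and the convention that list index 0 is the least-significant byte position are all assumptions made for illustration.

```python
VECTOR_BYTES = 16  # assumed vector width; the abstract does not give one


def load_swapped_le(memory: bytes, address: int) -> list[int]:
    """Hypothetical model of a little-endian load-swapped operation.

    Loads the naturally aligned vector that contains `address`, then
    rotates it so the byte at `address` lands at index 0, which this
    sketch treats as the least-significant byte position.
    """
    base = address & ~(VECTOR_BYTES - 1)   # start of the aligned region
    offset = address - base                # position of the byte inside it
    chunk = list(memory[base:base + VECTOR_BYTES])
    return chunk[offset:] + chunk[:offset]  # rotate, no bytes are dropped


memory = bytes(range(64))
print(load_swapped_le(memory, 21))
```

Note that only the aligned region is read, so bytes that precede the source address wrap around to the upper positions of the result.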
-
Patent number: 7610472Abstract: Some embodiments present a method of performing a variable shift operation. This method can be used by a microprocessor that does not allow variable shift operations for certain operand sizes. The method simulates a shift instruction that shifts an operand by a shift count. The method identifies a first shift command and a second shift command. The method computes a mask value. The mask value depends on whether the shift count is less than half of the operand size or greater than or equal to half of the operand size. The method uses the mask value to cause one of the first shift command and the second shift command to produce no shift. In some embodiments, the method allows for the shift count to be specified in bytes or in bits.Type: GrantFiled: June 5, 2006Date of Patent: October 27, 2009Assignee: Apple Inc.Inventors: Hyeonkuk Jeong, Paul Chang
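One way to read the mask idea in this abstract is sketched below. This is an interpretation, not the patented method: the 64-bit operand size, the two hypothetical shift primitives (one limited to counts below half the operand size, one limited to counts of at least half), and all names are assumptions.

```python
WIDTH = 64            # assumed operand size in bits
HALF = WIDTH // 2
FULL = (1 << WIDTH) - 1


def small_shift_left(x: int, n: int) -> int:
    """Hypothetical first shift command: only accepts counts below HALF."""
    assert 0 <= n < HALF
    return (x << n) & FULL


def large_shift_left(x: int, n: int) -> int:
    """Hypothetical second shift command: accepts 0 or counts >= HALF."""
    assert n == 0 or HALF <= n < WIDTH
    return (x << n) & FULL


def variable_shift_left(x: int, count: int) -> int:
    assert 0 <= count < WIDTH
    # The mask is all ones when count < HALF and zero otherwise.
    mask = -(count < HALF) & FULL
    # Exactly one of the two commands receives the real count;
    # the other receives 0 and therefore produces no shift.
    x = small_shift_left(x, count & mask)
    x = large_shift_left(x, count & (mask ^ FULL))
    return x


print(hex(variable_shift_left(0x1, 40)))
```

Both commands always execute, so the sequence contains no branch on the shift count; the mask alone decides which one has an effect.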
-
Publication number: 20090249041Abstract: A device and method for reducing the power consumption of an electronic device using a register file with a bypass mechanism. The width of a pulse controlling the word write operation may be extended twice as long so that the extended portion substantially overlaps a following word read pulse. The extension of the pulse width of the read operation may enable lowering the Vcc Min value for the electronic device and thus may lower the power consumption of the device.Type: ApplicationFiled: March 28, 2008Publication date: October 1, 2009Inventors: Satish Damaraju, Scott Siers, Omar Malik
-
Patent number: 7594099Abstract: A processor according to the present invention includes a decoding unit 20, an operation unit 40 and others. When the decoding unit 20 decodes Instruction vcchk, the operation unit 40 and the like judges whether vector condition flags VC0˜VC3 (110) of a condition flag register (CFR) 32 are all zero or not, and (i) sets condition flags C4 and C5 of the condition flag register (CFR) 32 to 1 and 0, respectively, when all of the vector condition flags VC0˜VC3 are zero, and (ii) sets the condition flags C4 and C5 to 0 and 1, respectively, when not all the vector condition flags are zero. Then, the vector condition flags VC0˜VC3 are stored in the condition flags C0˜C3.Type: GrantFiled: August 31, 2007Date of Patent: September 22, 2009Assignee: Panasonic CorporationInventors: Tetsuya Tanaka, Hazuki Okabayashi, Taketo Heishi, Hajime Ogawa, Tsuneyuki Suzuki, Tokuzo Kiyohara, Takeshi Tanaka, Hideshi Nishida, Masaki Maeda
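The flag updates that this abstract attributes to Instruction vcchk can be modeled in a few lines. The dictionary representation of the condition flag register and the function name are assumptions for illustration; only the flag names and the update rules come from the abstract.

```python
def vcchk(cfr: dict) -> dict:
    """Model of the vcchk flag updates on a condition flag register.

    `cfr` maps flag names ("VC0".."VC3", "C0".."C5") to 0 or 1.
    A new dictionary is returned; the input is left unchanged.
    """
    out = dict(cfr)
    all_zero = all(cfr[f"VC{i}"] == 0 for i in range(4))
    # C4/C5 report whether every vector condition flag is zero.
    out["C4"] = 1 if all_zero else 0
    out["C5"] = 0 if all_zero else 1
    # The vector condition flags are then copied into C0..C3.
    for i in range(4):
        out[f"C{i}"] = cfr[f"VC{i}"]
    return out


before = {"VC0": 0, "VC1": 1, "VC2": 0, "VC3": 0,
          "C0": 0, "C1": 0, "C2": 0, "C3": 0, "C4": 0, "C5": 0}
print(vcchk(before))
```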
-
Publication number: 20090222393Abstract: Systems and methods are disclosed for deciding a satisfiability problem with linear and non-linear operations by: encoding non-linear integer operations into encoded linear operations with Boolean constraints by Booleanizing and linearizing, combining the linear and encoded linear operations into a formula, solving the satisfiability of the formula using a solver, wherein the encoding and solving includes at least one of the following: (a) Booleanizing one of the non-linear operands by bit-wise structural decomposition; (b) linearizing a non-linear operator by selectively choosing one of the operands for Booleanization; (c) solving using an incremental lazy bounding refinement (LBR) procedure without re-encoding the formula, and verifying the linear and non-linear operations in computer software.Type: ApplicationFiled: December 9, 2008Publication date: September 3, 2009Applicant: NEC LABORATORIES AMERICA, INC.Inventor: Malay K. Ganai
-
Publication number: 20090187749Abstract: A bypass circuit is provided in a pipeline processor. A pipeline register is provided between an instruction execution stage and a write-back stage. The pipeline register stores a data validity flag and a WRITE control flag to control writing data into a general purpose register unit. The data retained in the pipeline register is allowed to be written back into the general purpose register unit when the WRITE control flag indicates “valid”. The pipeline register continues to retain the retained data even after the writing of the retained data into the general purpose register unit. The first pipeline register supplies the retained data to the second stage through the bypass circuit at the time of executing a subsequent instruction having data dependency on a preceding instruction.Type: ApplicationFiled: January 12, 2009Publication date: July 23, 2009Applicant: KABUSHIKI KAISHA TOSHIBAInventor: Jun TANABE
-
Patent number: 7565514Abstract: A processing system and method performs data processing operations in response to a single data processing instruction. At least two registers store data. First control circuitry compares data in respective corresponding fields of the at least two registers to create a plurality of condition values. Second control circuitry performs one or more predetermined logic operations on less than all of the plurality of condition values and on more than one condition value of the plurality of condition values to generate a condition code for each of the one or more predetermined logic operations. A condition code register stores the condition code for each of the one or more predetermined logic operations.Type: GrantFiled: April 28, 2006Date of Patent: July 21, 2009Assignee: Freescale Semiconductor, Inc.Inventor: William C. Moyer
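The two-step structure in this abstract (field-wise comparison, then logic over a subset of the resulting condition values) might look like the sketch below. The field layout (four 8-bit fields in a 32-bit register), the equality comparison, the particular OR and AND groupings, and all names are assumptions; the abstract does not specify them.

```python
def compare_fields(ra: int, rb: int, field_bits: int = 8, fields: int = 4) -> list[bool]:
    """Compare corresponding fields of two registers (equality assumed).

    Returns one condition value per field, least-significant field first.
    """
    mask = (1 << field_bits) - 1
    return [((ra >> (i * field_bits)) & mask) == ((rb >> (i * field_bits)) & mask)
            for i in range(fields)]


def condition_codes(conds: list[bool]) -> dict:
    """Combine subsets of the condition values into condition codes.

    Each code uses more than one, but fewer than all, of the values.
    The groupings below are hypothetical examples.
    """
    return {
        "low_any": conds[0] or conds[1],    # OR over the two low fields
        "high_all": conds[2] and conds[3],  # AND over the two high fields
    }


conds = compare_fields(0x11223344, 0x11AA3355)
print(conds, condition_codes(conds))
```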
-
Publication number: 20090177870Abstract: A method of providing wiring efficiency in a permute unit. Multiple selectors receive input data and shared control signals from multiple register files. The permute unit includes multiple multiplexors (MUXs) coupled to multiple logical AND gates. The multiple logical AND gates are coupled to multiple logical OR gates. The logical AND gates are physically separated from the logical OR gates. The logical AND gates receive input from one or more output data signals from the selectors. The logical OR gates combine the one or more output signals from the logical AND gates and provide output data from the permute unit.Type: ApplicationFiled: January 3, 2008Publication date: July 9, 2009Inventors: Bruce M. Fleischer, Hung C. Ngo, Jun Sawada
-
Publication number: 20090164762Abstract: A “code optimizer” provides various techniques for optimizing arbitrary XOR-based codes for encoding and/or decoding of data. Further, the optimization techniques enabled by the code optimizer do not depend on any underlying code structure. Therefore, the optimization techniques provided by the code optimizer are applicable to arbitrary codes with arbitrary redundancy. As such, the optimized XOR-based codes generated by the code optimizer are more flexible than specially designed codes, and allow for any desired level of fault tolerance. Typical uses of XOR-based codes include, for example, encoding and/or decoding data using redundant data packets for data transmission in real-time communications systems, encoding and/or decoding operations for storage systems such as RAID arrays, etc.Type: ApplicationFiled: December 20, 2007Publication date: June 25, 2009Applicant: MICROSOFT CORPORATIONInventors: Cheng Huang, Jin Li, Minghua Chen
-
Publication number: 20090138679Abstract: A processor including a Boolean logic unit, wherein the Boolean logic unit is operated for performing the short-circuit evaluation of a Normal Form Boolean expression/operation, a plurality of input/output interfaces in communication with the Boolean logic unit, wherein the plurality of input/output interfaces are operated for receiving a plurality of compiled Boolean expressions/operations and transmitting a plurality of compiled results, and a plurality of registers coupled to the plurality of input/output interface circuits, wherein the plurality of multi-bit registers include an instruction register, a first address register and a second address register.Type: ApplicationFiled: February 2, 2009Publication date: May 28, 2009Applicant: University of North Carolina at CharlotteInventor: Kenneth Elmon Koch, III
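Short-circuit evaluation of a normal-form Boolean expression, as mentioned in this abstract, can be illustrated in software. The sketch assumes conjunctive normal form and a list-of-clauses representation; both are illustrative choices and say nothing about how the claimed Boolean logic unit is built.

```python
def eval_cnf(clauses, values):
    """Short-circuit evaluation of a CNF expression.

    `clauses` is a list of clauses; each clause is a list of
    (variable_name, negated) pairs. `values` maps names to booleans.
    Returns (result, literals_examined) so the savings are visible.
    """
    examined = 0
    for clause in clauses:
        clause_true = False
        for name, negated in clause:
            examined += 1
            if values[name] != negated:   # literal evaluates to True
                clause_true = True
                break                      # rest of this clause is skipped
        if not clause_true:
            return False, examined         # remaining clauses are skipped
    return True, examined


# (a OR b) AND (NOT c) AND (a)
clauses = [[("a", False), ("b", False)], [("c", True)], [("a", False)]]
print(eval_cnf(clauses, {"a": True, "b": False, "c": True}))
```

In this example the second clause is false, so the third clause is never examined.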
-
Patent number: 7529918Abstract: A system and method for efficiently performing bit-field extraction and bit-field combination operations in a processor is provided. The system includes a plurality of general purpose registers, a plurality of predicate registers, and at least one execution unit configured to extract a plurality of bit fields from a source reservoir and to populate a plurality of destination lanes in response to a single instruction. In addition, the execution unit is configured to write supplied fill data into the source reservoir if the number of bits in the source reservoir is less than a predetermined number. In addition or alternatively, the system may include at least one execution unit configured to combine a plurality of bit fields from a plurality of source lanes into a continuous bit stream in response to a single instruction executable by the processor.Type: GrantFiled: December 22, 2006Date of Patent: May 5, 2009Assignee: Broadcom CorporationInventor: Mark Taunton
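The extraction half of this abstract (several bit fields pulled from a source reservoir into destination lanes, with a refill when the reservoir runs low) can be sketched as follows. The bit ordering (fields consumed from the least-significant end), the refill policy, and every name and parameter are assumptions for illustration.

```python
def extract_fields(reservoir, bits_available, widths, fill, fill_bits, threshold):
    """Hypothetical model of a multi-field extract with reservoir refill.

    reservoir      -- integer holding the unread bits, next bit at bit 0
    bits_available -- number of valid bits in the reservoir
    widths         -- width of each field to extract, one per lane
    fill/fill_bits -- supplied fill data and its length in bits
    threshold      -- refill when fewer than this many bits remain
    """
    assert sum(widths) <= bits_available
    lanes = []
    for w in widths:
        lanes.append(reservoir & ((1 << w) - 1))  # take the low w bits
        reservoir >>= w
        bits_available -= w
    if bits_available < threshold:
        # Append the fill data above the bits that are still unread.
        reservoir |= fill << bits_available
        bits_available += fill_bits
    return lanes, reservoir, bits_available


print(extract_fields(0b11010110, 8, [3, 2], fill=0xA, fill_bits=4, threshold=4))
```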
-
Publication number: 20090113174Abstract: A co-processor for efficiently decoding codewords encoded according to a Low Density Parity Check (LDPC) code, and arranged to efficiently execute an instruction to multiply the value of one operand with the sign of another operand, is disclosed. Logic circuitry is included in the co-processor to select between the value of a second operand, and an arithmetic inverse of the second operand value, in response to the sign bit of the first operand. This logic circuitry is arranged to operate according to 2's-complement integer arithmetic, by also including invert-and-increment circuitry to produce a 2's-complement inverse of the second operand. A comparator determines whether the second operand is at a maximum 2's-complement negative value, in which case the arithmetic inverse is selected to be a hard-wired maximum 2's-complement positive value.Type: ApplicationFiled: October 31, 2007Publication date: April 30, 2009Applicant: TEXAS INSTRUMENTS INCORPORATEDInventors: Tod David Wolf, Eric Biscondi, David John Hoyle
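The arithmetic behind this instruction (multiply one operand's value by the other's sign, with saturation at the most negative value) can be modeled on bit patterns. The 16-bit width and all names are assumptions; the abstract describes hardware logic, and this sketch only mirrors the selection it describes.

```python
BITS = 16                      # assumed operand width
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)         # sign-bit position
MAX_POS = SIGN - 1             # 0x7FFF, largest positive pattern
MIN_NEG = SIGN                 # 0x8000, most negative pattern


def from_signed(v: int) -> int:
    """Encode a Python int as a BITS-wide two's-complement pattern."""
    return v & MASK


def sign_multiply(a: int, b: int) -> int:
    """Return b multiplied by the sign of a, on two's-complement patterns.

    Only the sign bit of `a` is consulted, so a == 0 counts as positive.
    """
    if not (a & SIGN):
        return b                   # sign bit clear: pass b through
    if b == MIN_NEG:
        return MAX_POS             # -MIN_NEG is not representable: saturate
    return (~b + 1) & MASK         # invert and increment


print(hex(sign_multiply(from_signed(-3), from_signed(5))))
```

The saturation branch exists because inverting and incrementing the most negative pattern would return that same pattern rather than a positive value.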