Shifting Patents (Class 708/209)
  • Patent number: 10409592
    Abstract: An apparatus has processing circuitry comprising an L×M multiplier array. An instruction decoder associated with the processing circuitry supports a multiply-and-accumulate-product (MAP) instruction for generating at least one result element corresponding to a sum of respective E×F products of E-bit and F-bit portions of J-bit and K-bit operands respectively, where 1<E<J?L and 1<F<K?M. In response to the MAP instruction, the instruction decoder controls the processing circuitry to rearrange F-bit portions of the second K-bit operand to form a transformed K-bit operand, and to control the L×M multiplier array in dependence on the first J-bit operand and the transformed K-bit operand to add the respective E×F products using a subset of the adders used for accumulating partial products for a conventional multiplication.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: September 10, 2019
    Assignee: ARM Limited
    Inventors: Neil Burgess, David Raymond Lutz, Javier Diaz Bruguera
  • Patent number: 10366741
    Abstract: Circuitry comprises: a set of bit processing circuitries to apply two or more successive instances of bitwise processing to an ordered bit array; each bit processing circuitry for a given bit position within the ordered bit array comprising: bit shifting circuitry to selectively apply a bit shift of a respective input bit to a next bit processing circuitry in a first direction relative to the ordered bit array, in response to an active state of a bit shift control signal, the bit shifting circuitry not applying the bit shift in response to an inactive state of the bit shift control signal; and bit shift control circuitry to selectively allow or inhibit a bit shifting operation in response to one or more inhibit control signals; in which: the bit shift control circuitry is configured to selectively propagate an output inhibit control signal, indicating that a bit shifting operation should be inhibited, as an inhibit control signal to bit processing circuitry applying a next instance of the bitwise processing a
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: July 30, 2019
    Assignee: ARM Limited
    Inventors: Neil Burgess, Nigel John Stephens, Lee Evan Eisen, Jaime Ferragut Martinez-Vara De Rey
  • Patent number: 10318298
    Abstract: An apparatus and method for performing left-shifting operations on packed quadword data.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: June 11, 2019
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal
  • Patent number: 10296333
    Abstract: Vector single instruction multiple data (SIMD) shift and rotate instructions are provided specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, and a second vector register. Vector data fields of a first element size are duplicated. Duplicate vector data fields are stored as corresponding data fields of twice the first element size. Control logic receives an element size for performing a SIMD shift or rotation operation. Through selectors corresponding to a vector element, portions are selected from the duplicated data fields, the selectors corresponding to any particular vector element select all portions similarly from the duplicated data fields for that particular vector element responsive to the first element size, but selectors corresponding to any particular vector element select at least two portions from the duplicated data fields differently for that particular vector element responsive to a second element size.
    Type: Grant
    Filed: December 27, 2016
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: Asaf Rubinstein, Tom Aviram
  • Patent number: 10289382
    Abstract: An apparatus for mathematical manipulation is described allowing the selective combination of shifters to shift binary numbers of various widths. Selective combination allows on-the-fly adjustment of shifters from independent to coordinated shifting operations. Selective combination allows adjustable hardware-based shifting while saving space and resources. Multiple eight-bit shifters can be configured for a variety of operand widths, such as a 32-bit width, a 24-bit width, a 16-bit width, or an eight-bit width. Multiplexers route the appropriate input data to the appropriate shifters. Bidirectional shifting is configured through a selector tree, including both shift left and shift right operations. Opcodes configure the shifters for the desired type of shift and a shifted result is generated.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: May 14, 2019
    Assignee: Wave Computing, Inc.
    Inventor: Samit Chaudhuri
  • Patent number: 10216705
    Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.
    Type: Grant
    Filed: April 30, 2018
    Date of Patent: February 26, 2019
    Assignee: Google LLC
    Inventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
  • Patent number: 10162633
    Abstract: An apparatus has processing circuitry comprising multiplier circuitry for performing multiplication on a pair of input operands. In response to a shift instruction specifying at least one shift amount and a source operand comprising at least one data element, the source operand and a shift operand determined in dependence on the shift amount are provided as input operands to the multiplier circuitry and the multiplier circuitry is controlled to perform at least one multiplication which is equivalent to shifting a corresponding data element of the source operand by a number of bits specified by a corresponding shift amount to generate a shift result value.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: December 25, 2018
    Assignee: ARM Limited
    Inventors: François Christopher Jacques Botman, Thomas Christopher Grocutt
  • Patent number: 10126976
    Abstract: An address and a data size are provided to a rotator. The rotator stores, based on the address and the data size, a data element in a location having a defined number of positions. The data element includes one or more data units and the one or more data units are aligned correctly in one or more positions of the location based on a predefined position in the location to receive a selected data unit of the one or more data units. The rotator replicates a value of a chosen data unit of the one or more data units to one or more other positions of the location.
    Type: Grant
    Filed: February 17, 2017
    Date of Patent: November 13, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10108397
    Abstract: Embodiments of the inventive concept include a fast close path solution and circuit of a three path fused multiply-adder circuit. The fast close path circuit can include one or more compressors that can receive multiple operands and produce a result sum and a result carry. The close path circuit can include one or more leading zero anticipators (LZAs). The one or more LZAs can receive and process the result sum and the result carry. The close path circuit can include one or more adders. The one or more adders can receive and add the result sum and the result carry in parallel with the one or more LZAs processing the result sum and the result carry. Since the close path is the critical timing path, by performing the addition operations in parallel with the LZA and/or priority encode (PENC) operations, the logic depth and latency of the close path are reduced.
    Type: Grant
    Filed: February 1, 2016
    Date of Patent: October 23, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Ashraf Ahmed
  • Patent number: 10013258
    Abstract: Embodiments are directed to a method of adjusting an index, wherein the index identifies a location of an element within an array. The method includes executing, by a computer, a single instruction that adjusts a first parameter of the index to match a parameter of an array address. The single instruction further adjusts a second parameter of the index to match a parameter of the array element. The adjustment of the first parameter includes a sign extension.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: July 3, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 9959247
    Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.
    Type: Grant
    Filed: April 25, 2017
    Date of Patent: May 1, 2018
    Assignee: Google LLC
    Inventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
  • Patent number: 9959094
    Abstract: An arithmetic apparatus comprises a plurality of cascade-connected arithmetic units. Each of the plurality of arithmetic units comprises: a calculator configured to operate in one of a rotation mode of performing a rotation calculation, and a vectoring mode of calculating a rotation angle; and a holding unit configured to hold rotational direction information output from the calculator in the vectoring mode. In addition, when operating in the rotation mode, the calculator performs the rotation calculation on data input from an arithmetic unit in a preceding stage, based on the rotational direction information held in the holding unit.
    Type: Grant
    Filed: June 3, 2015
    Date of Patent: May 1, 2018
    Assignee: CANON KABUSHIKI KAISHA
    Inventors: Tadayoshi Nakayama, Koki Mitsunami
  • Patent number: 9933996
    Abstract: An apparatus for mathematical manipulation is described allowing the selective combination of shifters to shift binary numbers of various widths. Selective combination allows on-the-fly adjustment of shifters from independent to coordinated shifting operations. Selective combination allows adjustable hardware-based shifting while saving space and resources. Multiple eight-bit shifters can be configured for a variety of operand widths, such as a 32-bit width, a 24-bit width, a 16-bit width, or an eight-bit width. Multiplexers route the appropriate input data to the appropriate shifters. Opcodes configure the shifters for the desired type of shift and a shifted result is generated.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: April 3, 2018
    Assignee: Wave Computing, Inc.
    Inventor: Samit Chaudhuri
  • Patent number: 9933998
    Abstract: In a novel computation device, a plurality of partial product generators is communicatively coupled to a binary number multiplier. The binary number is partitioned in the computation device into non-overlapping subsets of binary bits and each subset is coupled to one of the plurality of partial product generators. Each partial product generator, upon receiving a subset of binary bits representing a number, generates a multiplication product of the number and a predetermined constant. The multiplication products from all partial product generators are summed to generate the final product between the predetermined constant and the binary number. The partial product generators are constructed by logic gates and wires connected the logic gates including a AND gate. The partial product generators are free of memory elements.
    Type: Grant
    Filed: February 6, 2017
    Date of Patent: April 3, 2018
    Inventors: Kuo-Tseng Tseng, Parkson Wong
  • Patent number: 9904511
    Abstract: An improved shifter design for high-speed data processors is described. The shifter may include a first stage, in which the input bits are shifted by increments of N bits where N>1, followed by a second stage, in which all bits are shifted by a residual amount. A pre-shift may be removed from an input to the shifter and replaced by a shift adder at the second stage to further increase the speed of the shifter.
    Type: Grant
    Filed: February 6, 2015
    Date of Patent: February 27, 2018
    Assignee: Cavium, Inc.
    Inventors: Nitin Mohan, Ilan Pragaspathy
  • Patent number: 9841979
    Abstract: A method and corresponding apparatus for processing a shuffle instruction are provided. Shuffle units are configured in a hierarchical structure, and each of the shuffle units generates a shuffled data element array by performing shuffling on an input data element array. In the hierarchical structure, which includes an upper shuffle unit and a lower shuffle unit, the shuffled data element array output from the lower shuffle unit is input to the upper shuffle unit as a portion of the input data element array for the upper shuffle unit.
    Type: Grant
    Filed: July 14, 2014
    Date of Patent: December 12, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Keshava Prasad, Navneet Basutkar, Young Hwan Park, Ho Yang, Yeon Bok Lee
  • Patent number: 9805819
    Abstract: Described herein are techniques, systems, and circuits for addressing image data according to blocks. For example, in some cases, the address space may be divided into high order address bits and low order address bits. In these cases, an address circuit may twist an address space by shifting the high order bits and low order bits of an address in a rightward direction, shifting the low order bits of the address in a leftward direction, and shifting the high order bits and the low order bits of the address in the leftward direction. The circuit may modify the address value and untwist the address space. For example, the untwisting may include shifting the high order bits and the low order bits of an address in the rightward direction, shifting the low order bits of the address in the rightward direction, and shifting the high order bits and the low order bits of the address in the leftward direction.
    Type: Grant
    Filed: October 14, 2016
    Date of Patent: October 31, 2017
    Assignee: Amazon Technologies, Inc.
    Inventor: Carl Ryan Kelso
  • Patent number: 9762365
    Abstract: It is possible to provide a radio communication terminal device and a radio transmission method which can improve reception performance of a CQI and a reference signal. A phase table storage unit stores a phase table which correlates the amount of cyclic shift to complex coefficients {w1, w2} to be multiplied on the reference signal. A complex coefficient multiplication unit reads out a complex coefficient corresponding to the amount of cyclic shift indicated by resource allocation information, from the phase table storage unit and multiplies the read-out complex coefficient on the reference signal so as to change the phase relationship between the reference signals in a slot.
    Type: Grant
    Filed: December 12, 2016
    Date of Patent: September 12, 2017
    Assignee: Sun Patent Trust
    Inventors: Tomofumi Takata, Daichi Imamura, Seigo Nakao, Sadaki Futagi, Takashi Iwai, Yoshihiko Ogawa
  • Patent number: 9665346
    Abstract: Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.
    Type: Grant
    Filed: October 28, 2014
    Date of Patent: May 30, 2017
    Assignee: International Business Machines Corporation
    Inventors: John B. Carter, Bruce G. Mealey, Karthick Rajamani, Eric E. Retter, Jeffrey A. Stuecheli
  • Patent number: 9658986
    Abstract: A data processing apparatus includes a computing unit that performs a matrix computation between data streams whose unit data is of a matrix format; a determining unit that for each matrix obtained by the matrix computation by the computing unit, determines based on the value of each element included in the matrix, an exponent value for expressing each element included in the matrix as a floating decimal point value; a converting unit that converts the value of each element into a significand value of the element, according to the exponent value determined by the determining unit; and an output unit that correlates and outputs the exponent value and each matrix after conversion in which the value of each element in the matrix has been converted by the converting unit.
    Type: Grant
    Filed: January 29, 2014
    Date of Patent: May 23, 2017
    Assignee: FUJITSU LIMITED
    Inventors: Yi Ge, Noboru Kobayashi, Hiroshi Hatano, Yasuhiro Oyama
  • Patent number: 9600194
    Abstract: An address and a data size are provided to a rotator. The rotator stores, based on the address and the data size, a data element in a location having a defined number of positions. The data element includes one or more data units and the one or more data units are aligned correctly in one or more positions of the location based on a predefined position in the location to receive a selected data unit of the one or more data units. The rotator replicates a value of a chosen data unit of the one or more data units to one or more other positions of the location.
    Type: Grant
    Filed: November 25, 2015
    Date of Patent: March 21, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 9554376
    Abstract: It is possible to provide a radio communication terminal device and a radio transmission method which can improve reception performance of a CQI and a reference signal. A phase table storage unit stores a phase table which correlates the amount of cyclic shift to complex coefficients {w1, w2} to be multiplied on the reference signal. A complex coefficient multiplication unit reads out a complex coefficient corresponding to the amount of cyclic shift indicated by resource allocation information, from the phase table storage unit and multiplies the read-out complex coefficient on the reference signal so as to change the phase relationship between the reference signals in a slot.
    Type: Grant
    Filed: March 1, 2016
    Date of Patent: January 24, 2017
    Assignee: Sun Patent Trust
    Inventors: Tomofumi Takata, Daichi Imamura, Seigo Nakao, Sadaki Futagi, Takashi Iwai, Yoshihiko Ogawa
  • Patent number: 9537510
    Abstract: A variable shifter includes: a plurality of shifters that cyclically shift input data having a plurality of bits or cyclically shifted data; and a control unit that selects a shift amount for each of the plurality of shifters in accordance with a predetermined cyclic shift amount. The number of types of the predetermined cyclic shift amount is smaller than the number of bits in the input data, each shifter selects one of a plurality of shift amounts in accordance with the predetermined cyclic shift amount, and the plurality of shift amounts have a combination of shift amounts that differ from one shifter to another.
    Type: Grant
    Filed: February 2, 2015
    Date of Patent: January 3, 2017
    Assignee: Panasonic Intellectual Property Managament Co., Ltd.
    Inventors: Hiroyuki Motozuka, Hiroyuki Yoshikawa
  • Patent number: 9529591
    Abstract: Vector single instruction multiple data (SIMD) shift and rotate instructions are provided specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, and a second vector register. Vector data fields of a first element size are duplicated. Duplicate vector data fields are stored as corresponding data fields of twice the first element size. Control logic receives an element size for performing a SIMD shift or rotation operation. Through selectors corresponding to a vector element, portions are selected from the duplicated data fields, the selectors corresponding to any particular vector element select all portions similarly from the duplicated data fields for that particular vector element responsive to the first element size, but selectors corresponding to any particular vector element select at least two portions from the duplicated data fields differently for that particular vector element responsive to a second element size.
    Type: Grant
    Filed: December 30, 2011
    Date of Patent: December 27, 2016
    Assignee: Intel Corporation
    Inventors: Asaf Rubinstein, Tom Aviram
  • Patent number: 9519483
    Abstract: A method and apparatus are described for generating flags in response to processing data during an execution pipeline cycle of a processor. The processor may include a multiplexer configured to generate valid bits for received data according to a designated data size, and a logic unit configured to control the generation of flags based on a shift or rotate operation command, the designated data size and information indicating how many bytes and bits to rotate or shift the data by. A carry flag may be used to extend the amount of bits supported by shift and rotate operations. A sign flag may be used to indicate whether a result is a positive or negative number. An overflow flag may be used to indicate that a data overflow exists, whereby there are not a sufficient number of bits to store the data.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: December 13, 2016
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Srikanth Arekapudi, Saurabh Gupta
  • Patent number: 9478312
    Abstract: Described herein are techniques, systems, and circuits for addressing image data according to blocks. For example, in some cases, the address space may be divided into high order address bits and low order address bits. In these cases, an address circuit may twist an address space by shifting the high order bits and low order bits of an address in a rightward direction, shifting the low order bits of the address in a leftward direction, and shifting the high order bits and the low order bits of the address in the leftward direction. The circuit may modify the address value and untwist the address space. For example, the untwisting may include shifting the high order bits and the low order bits of an address in the rightward direction, shifting the low order bits of the address in the rightward direction, and shifting the high order bits and the low order bits of the address in the leftward direction.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: October 25, 2016
    Assignee: Amazon Technologies, Inc.
    Inventor: Carl Ryan Kelso
  • Patent number: 9398617
    Abstract: Embodiments of the present invention relate to a method and an apparatus for processing random access in a wireless communication network, and a processing method of a user equipment and an apparatus. The method for processing random access in the communication network includes: the base station receives a first Zadoff-Chu sequence and a second Zadoff-Chu sequence that are sent by a user equipment, a du of the first Zadoff-Chu sequence is smaller than a du of the second Zadoff-Chu sequence; the base station estimates an error range for a round trip delay RTD of the user equipment according to the first Zadoff-Chu sequence, estimates, according to the second Zadoff-Chu sequence, the RTD within the error range for the RTD or a frequency offset of an uplink signal of the user equipment. The problem that the user equipment with a frequency offset accesses a network is solved.
    Type: Grant
    Filed: July 8, 2014
    Date of Patent: July 19, 2016
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Changyu Guo, Li Wan, Chunhui Le, Jing Li, Yan Liu
  • Patent number: 9317283
    Abstract: A processor may generate a result vector when executing a RunningShiftForDivide1P or RunningShiftForDivide2P instruction. For example, upon executing a RunningShiftForDivide1P/2P instruction, the processor may receive a first input vector and a second input vector. The processor then may record a base value from an element at a key element position in the first input vector. Next, when generating the result vector, for each active element in the result vector to the right of the key element position, the processor may generate a shifted base value using shift values from the second input vector. The processor then may correct the shifted base value when a predetermined condition is met. Next, the processor may set the element of the result vector equal to the shifted base value.
    Type: Grant
    Filed: December 17, 2012
    Date of Patent: April 19, 2016
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 9318813
    Abstract: A QRD processor for computing input signals in a receiver for wireless communication relies upon a combination of multi-dimensional Givens Rotations, Householder Reflections and conventional two-dimensional (2D) Givens Rotations, for computing the QRD of matrices. The proposed technique integrates the benefits of multi-dimensional annihilation capability of Householder reflections plus the low-complexity nature of the conventional 2D Givens rotations. Such integration increases throughput and reduces the hardware complexity, by first decreasing the number of rotation operations required and then by enabling their parallel execution. A pipelined architecture is presented (290) that uses un-rolled pipelined CORDIC processors (245a to 245d) iteratively to improve throughput and resource utilization, while reducing the gate count.
    Type: Grant
    Filed: May 24, 2010
    Date of Patent: April 19, 2016
    Assignee: MaxLinear, Inc.
    Inventors: Dimpesh Patel, Glenn Gulak, Mahdi Shabany
  • Patent number: 9189237
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: November 17, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Jr., Matthew Holliman, Eric L. Debes, Minerva M. Yeung, Huy V. Nguyen, Julien Sebot
  • Patent number: 9189238
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: November 17, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Jr., Matthew Holliman, Eric L. Debes, Minerva M. Yeung, Huy V. Nguyen, Julien Sebot
  • Patent number: 9182987
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: November 10, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Jr., Matthew Holliman, Eric L. Debes, Minerva M. Yeung, Huy V. Nguyen, Julien Sebot
  • Patent number: 9182985
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: November 5, 2012
    Date of Patent: November 10, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Jr., Matthew Holliman, Eric L. Debes, Minerva M. Yeung, Huy V. Nguyen, Julien Sebot
  • Patent number: 9182988
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: March 7, 2013
    Date of Patent: November 10, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Jr., Matthew Holliman, Eric L. Debes, Minerva M. Yeung, Huy V. Nguyen, Julien Sebot
  • Patent number: 9170815
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: October 27, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Jr., Matthew Holliman, Eric L. Debes, Minerva M. Yeung, Huy V. Nguyen, Julien Sebot
  • Patent number: 9170814
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: November 5, 2012
    Date of Patent: October 27, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Matthew Holliman, Eric L. Debes, Minerva M. Yeung, Huy V. Nguyen, Julien Sebot
  • Patent number: 9152420
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: October 6, 2015
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, William W. Macy, Jr., Matthew Holliman, Eric L. Debes, Minerva M. Yeung
  • Patent number: 9134953
    Abstract: Microprocessor shifter circuits utilizing butterfly and inverse butterfly circuits, and control circuits therefor, are provided. The same shifter circuits can also perform complex bit manipulations at high speeds, including butterfly and inverse butterfly operations, parallel extract and deposit operations, group operations, mix operations, permutation operations, as well as instructions executed by existing microprocessors, including shift right, shift left, rotate, extract, deposit and multimedia mix operations. The shifter circuits can be provided in various combinations to provide microprocessor functional units which perform a plurality of bit manipulation operations.
    Type: Grant
    Filed: October 9, 2012
    Date of Patent: September 15, 2015
    Assignee: Teleputers, LLC
    Inventors: Ruby B. Lee, Yedidya Hilewitz
  • Publication number: 20150149518
    Abstract: A barrel shifter uses a sign magnitude to 2's complement converter to generate decoder signals for its cascaded multiplexer selectors. The sign input receives the shift direction and the magnitude input receives the shift amount. The sign magnitude to 2's complement converter computes an output result as a 2's complement of the shift amount using the shift direction as a sign input, assigns a first portion (most significant bit half) of the output result to a first decoder signal, and assigns a second portion (least significant bit half) of the output result to a second decoder signal. This encoding scheme allows the decoder circuits to be relatively simple, for example, 3-to-8 decoders for an implementation adapted to shift a 64-bit operand value rather than the 4-to-9 decoder required in a conventional barrel shifter, leading to faster operation, less area, and reduced power consumption.
    Type: Application
    Filed: January 31, 2015
    Publication date: May 28, 2015
    Applicant: International Business Machines Corporation
    Inventor: Takeo Yasuda
  • Publication number: 20150134713
    Abstract: Examples of the present disclosure provide apparatuses and methods for performing division operations in a memory. An example apparatus comprises a first address space comprising a first number of memory cells coupled to a sense line and to a first number of select lines wherein the first address space stores a dividend value. A second address space comprises a second number of memory cells coupled to the sense line and to a second number of select lines wherein the second address space stores a divisor value. A third address space comprises a third number of memory cells coupled to the sense line and to a third number of select lines wherein the third address space stores a remainder value. Sensing circuitry can be configured to receive the dividend value and the divisor value, divide the dividend value by the divisor value, and store a remainder result in the third number of memory cells.
    Type: Application
    Filed: November 8, 2013
    Publication date: May 14, 2015
    Applicant: Micron Technology, Inc.
    Inventor: Kyle B. Wheeler
  • Patent number: 9021000
    Abstract: A barrel shifter uses a sign magnitude to 2's complement converter to generate decoder signals for its cascaded multiplexer selectors. The sign input receives the shift direction and the magnitude input receives the shift amount. The sign magnitude to 2's complement converter computes an output result as a 2's complement of the shift amount using the shift direction as a sign input, assigns a first portion (most significant bit half) of the output result to a first decoder signal, and assigns a second portion (least significant bit half) of the output result to a second decoder signal. This encoding scheme allows the decoder circuits to be relatively simple, for example, 3-to-8 decoders for an implementation adapted to shift a 64-bit operand value rather than the 4-to-9 decoder required in a conventional barrel shifter, leading to faster operation, less area, and reduced power consumption.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: April 28, 2015
    Assignee: International Business Machines Corporation
    Inventor: Takeo Yasuda
  • Patent number: 9015216
    Abstract: In one embodiment, a rotator, a mask generator, and circuitry configured to mask the rotated operand output by the rotator with the output mask generated by the mask generator perform a shift operation. The rotator is configured to rotate the input operand by the shift count. The mask generator is configured to generate an output mask by decoding a most significant bit (MSB) field of the shift count to generate a first mask, decoding a least significant bit (LSB) field of the shift count to generate a second mask, logically ANDing the bits of the second mask with the corresponding bit of the first mask and logically ORing the result with an adjacent bit of the first mask that is selected responsive to the shift direction.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: April 21, 2015
    Assignee: Apple Inc.
    Inventor: Honkai Tam
  • Publication number: 20150100612
    Abstract: A method and apparatus for processing numeric calculation are provided. The method includes determining a shift bit and an index bit that falls within an index range of a lookup table from among bits representing a divisor scaled up by an offset, obtaining a replacement value corresponding to an index value of the determined index bit by using the lookup table, multiplying a dividend scaled up by the offset by the obtained replacement value, and outputting a value corresponding to a division operation by correcting a scale of a result of the multiplication using a right shift operation.
    Type: Application
    Filed: July 11, 2014
    Publication date: April 9, 2015
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jong-hun LEE, Young-su MOON, Jung-uk CHO, Yong-min TAI, Do-hyung KIM, Si-hwa LEE
  • Patent number: 9002915
    Abstract: A circuit for shifting bussed data includes a first column of shift blocks, a compare block, and a second column of multiplexer blocks. The first column shifts the bussed data by a number of bits specified by first bits of a shift control input. The compare block determines the value of a second bit of the shift control input and creates an output reflecting that value. The second column has a control input coupled to the output of the compare block, shifts the data by one byte when the second bit of the shift control input has a first value, and does not shift the data when the second bit has a second value. The shift, compare, and multiplexer blocks can be substantially similar logic blocks programmable to perform any of these functions, can include N-bit data inputs and outputs, and can operate on the bussed data as an N-bit bus.
    Type: Grant
    Filed: April 2, 2009
    Date of Patent: April 7, 2015
    Assignee: Xilinx, Inc.
    Inventors: Steven P. Young, Brian C. Gaide
  • Publication number: 20150095388
    Abstract: Field programmable gate arrays (FPGA) contain, in addition to random logic, also other components, such as processing units, multiply-accumulate (MAC) units, analog circuits, and other elements, configurable with respect of the random logic, to enhance the capabilities of the FPGA. A circuit for a filed configurable MAC unit is provided to allow various configurations of ADD, SUBTRACT, MULTIPLY and SHIFT functions. Optionally, registered input and registered output support a multi-cycle path. A configuration of a constant facilitates the configuration of the circuit to perform infinite impulse response (IIR) and finite impulse response (FIR) functions in hardware.
    Type: Application
    Filed: December 10, 2013
    Publication date: April 2, 2015
    Applicant: Scaleo Chip
    Inventors: Loic Vezier, Farid Tahiri
  • Publication number: 20150088947
    Abstract: A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.
    Type: Application
    Filed: December 3, 2014
    Publication date: March 26, 2015
    Applicant: INTEL CORPORATION
    Inventors: Cristina S. Anderson, Zeev Sperber, Simon Rubanovich, Benny Eitan, Amit Gradstein
  • Publication number: 20150067010
    Abstract: An integrated circuit is provided that performs floating-point addition or subtraction operations involving at least three floating-point numbers. The floating-point numbers are pre-processed by dynamically extending the number of mantissa bits, determining the floating-point number with the biggest exponent, and shifting the mantissa of the other floating-point numbers to the right. Each extended mantissa has at least twice the number of bits of the mantissa entering the floating-point operation. The exact bit extension is dependent on the number of floating-point numbers to be added. The mantissas of all floating-point numbers with an exponent smaller than the biggest exponent are shifted to the right. The number of right shift bits is dependent on the difference between the biggest exponent and the respective floating-point exponent.
    Type: Application
    Filed: September 5, 2013
    Publication date: March 5, 2015
    Applicant: Altera Corporation
    Inventor: Tomasz Czajkowski
  • Patent number: 8972469
    Abstract: A system and method for efficiently rotating data in a processor for multiple operand sizes. A processor comprises a rotator configured to support multiple operand sizes. The rotator receives a rotate amount and an input operand with a size less than a maximum operand size supported by the processor. The rotator generates a mask with a same size as the received input operand. The mask comprises a number of asserted most-significant bits equal to the rotate amount. The remaining bits in the mask are deasserted. For a given rotation result bit position with an associated asserted mask bit, the rotator selects a value in the input operand at a bit position with a distance from the given result bit position equal to the rotate amount plus a difference between the maximum operand size supported by the processor and the input operand size.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: March 3, 2015
    Assignee: Apple Inc.
    Inventors: Fang Liu, Honkai (John) Tam
  • Publication number: 20150058389
    Abstract: Techniques are disclosed relating to performing extended multiplies without a carry flag. In one embodiment, an apparatus includes a multiply unit configured to perform multiplications of operands having a particular width. In this embodiment, the apparatus also includes multiple storage elements configured to store operands for the multiply unit. In this embodiment, each of the storage elements is configured to provide a portion of a stored operand that is less than an entirety of the stored operand in response to a control signal from the apparatus. In one embodiment, the apparatus is configured to perform a multiplication of given first and second operands having a width greater than the particular width by performing a sequence of multiply operations using the multiply unit, using portions of the stored operands and without using a carry flag between any of the sequence of multiply operations.
    Type: Application
    Filed: August 20, 2013
    Publication date: February 26, 2015
    Applicant: Apple Inc.
    Inventors: James S. Blomgren, Terence M. Potter
  • Publication number: 20150039662
    Abstract: A fused floating-point multiply-add element includes a multiplier that generates a product, and a shifter that shifts an addend within a narrow range. Interpreting logic analyzes the magnitude of the addend relative to the product and then causes logic arrays to position the shifted addend within the left, center, or right portions of a composite register depending in the magnitude of the addend relative to the product. The interpreting logic also forces other portions of the composite register to zero. When the addend is zero, the interpreting logic forces all portions of the composite register to zero. Final combining logic then adds the contents of the composite register to the product.
    Type: Application
    Filed: August 5, 2013
    Publication date: February 5, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Srinivasan IYER, David Conrad TANNENBAUM, Stuart F. OBERMAN, Ming (Michael) Y. SIU