Arithmetical Operation Patents (Class 708/490)
  • Patent number: 10461887
    Abstract: Methods and systems for blind detection. At the encoder, a code word is encoded using a polar coder, where the input vector includes a user equipment (UE)-specific frozen sequence in the frozen bit positions. At the decoder, a set of short listed channel candidates is generated based on decoding using the UE-specific frozen sequence.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: October 29, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Yiqun Ge, Wuxian Shi
  • Patent number: 10447983
    Abstract: A reciprocal approximation circuit has a first iteration circuit for generating an approximate reciprocal value of an operand. The operation of the first iteration circuit is controlled by two bits of the operand, which indicate a range in which the operand lies. The first iteration circuit uses hardware friendly initial values based on the two bits for generating the approximate reciprocal value. The reciprocal approximation circuit does not require any additional circuit for selecting an initial value for the first iteration circuit.
    Type: Grant
    Filed: November 15, 2017
    Date of Patent: October 15, 2019
    Assignee: NXP USA, INC.
    Inventor: Mahesh Chandra
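    The idea of choosing an initial estimate from two operand bits can be sketched in software as follows. This is only an illustration: the seed table, the use of Newton-Raphson iteration, and the name approx_reciprocal are assumptions, not details taken from the patent.

```python
# Hypothetical seed table: one initial estimate per quarter of [1, 2).
# All four values are multiples of 1/16, so each has a short binary pattern.
# They are illustrative choices, not the values used in the patent.
SEEDS = (0.875, 0.75, 0.625, 0.5625)


def approx_reciprocal(x: float, iterations: int = 3) -> float:
    """Approximate 1/x for a normalized operand with 1 <= x < 2."""
    if not 1.0 <= x < 2.0:
        raise ValueError("operand must be normalized to [1, 2)")
    # The two leading fraction bits of x say which quarter of [1, 2) it lies in.
    range_index = int((x - 1.0) * 4)
    y = SEEDS[range_index]
    # Newton-Raphson step for f(y) = 1/y - x:  y <- y * (2 - x * y).
    for _ in range(iterations):
        y = y * (2.0 - x * y)
    return y
```

    With these seeds the initial relative error is at most 12.5%, and each Newton-Raphson step squares it, so three steps are enough for roughly single-precision accuracy in this sketch.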
  • Patent number: 10437561
    Abstract: The invention relates to a stochastic-type microprocessor. In some embodiments, the microprocessor comprises an elementary stochastic computation module able to receive, as input, two random and independent binary input signals each representing a binary coding of two respective given input probability values, and able to generate, as output, a random binary output signal. The elementary module comprises: a programmable logic unit, able to combine two input signals to generate an output signal; an addressable memory, able to store an output probability value coded by an output signal generated by the logic unit; a first stochastic clock, able to produce a first clock signal; a second stochastic clock, able to produce a second clock signal.
    Type: Grant
    Filed: June 17, 2016
    Date of Patent: October 8, 2019
    Assignees: CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE, COLLEGE DE FRANCE
    Inventors: Jacques Droulez, Pierre Bessiere
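    As background for how two independent random bit signals can encode and combine probabilities, here is a minimal software sketch. It assumes the combining logic is a plain AND gate and uses a seeded pseudo-random generator; the function names are hypothetical and the patent's clocks and memory are not modeled.

```python
import random


def bitstream(probability: float, length: int, rng: random.Random) -> list[int]:
    """Random bit sequence in which each bit is 1 with the given probability."""
    return [1 if rng.random() < probability else 0 for _ in range(length)]


def estimate(stream: list[int]) -> float:
    """Decode a bitstream back to a probability: the fraction of 1 bits."""
    return sum(stream) / len(stream)


def stochastic_multiply(p_a: float, p_b: float,
                        length: int = 100_000, seed: int = 0) -> float:
    """Estimate p_a * p_b by AND-ing two independent bitstreams."""
    rng = random.Random(seed)
    a = bitstream(p_a, length, rng)
    b = bitstream(p_b, length, rng)
    # For independent inputs, (a AND b) is 1 with probability p_a * p_b.
    return estimate([x & y for x, y in zip(a, b)])
```

    The result is a statistical estimate, so its accuracy improves only with the square root of the stream length.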
  • Patent number: 10432357
    Abstract: Methods, systems, and devices that support an efficient sequence-based polar code description are described. In some cases, a wireless device (e.g., a user equipment (UE) or a base station) may transmit a codeword including a set of information bits encoded using a polar code or receive a codeword including a set of information bits encoded using a polar code. As described herein, the wireless device may determine the bit locations of the information bits in the polar code based on a partition assignment vector. Specifically, the wireless device may partition bit-channels for one or more stages of polarization and assign information bits to partitions based on the partition assignment vector. Once the bit locations of the information bits are determined, the wireless device may decode a received codeword or transmit an encoded codeword based on the determined bit locations of the information bits.
    Type: Grant
    Filed: November 20, 2017
    Date of Patent: October 1, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Yang Yang, Jing Jiang, Gabi Sarkis
  • Patent number: 10432937
    Abstract: A method for reducing the entropy of an original matrix includes a step using a wavelet transformation of the original matrix into a transformed matrix, a quantization coefficient corresponding to each wavelet level, the wavelet transformation being calculated in fixed decimal point using a first number of digits at least equal to 1, for example 3 digits, after the decimal point. Such a method is particularly advantageous for image compression.
    Type: Grant
    Filed: July 9, 2015
    Date of Patent: October 1, 2019
    Inventor: Jean-Claude Colin
  • Patent number: 10425186
    Abstract: Concepts and examples pertaining to combined coding design for efficient codeblock extension are described. A processor of a communication apparatus may combine channel polarization of a communication channel with a first coding scheme for first codeblocks of a smaller size to generate a second coding scheme. The processor may also code second codeblocks of a larger size using the second coding scheme.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: September 24, 2019
    Assignee: MEDIATEK INC.
    Inventors: Wei-De Wu, Mao-Ching Chiu
  • Patent number: 10341048
    Abstract: Embodiments of the present invention provide a channel encoding and decoding method and apparatus, where a channel encoding method includes: acquiring, by an encoder, an information bit index set; generating, by the encoder, a second bit vector according to a to-be-encoded first information bit and the information bit index set; and performing, by the encoder, Polar code encoding on the second bit vector to generate an encoded first code word. In technical solutions of the present invention, an encoder first acquires an information bit index set, generates a second bit vector according to a to-be-encoded first information bit and the information bit index set, and then performs Polar code encoding on the second bit vector to generate an encoded first code word.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: July 2, 2019
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Hui Shen, Bin Li
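    The two encoding steps described above (build a bit vector from the information bits and the index set, then apply the polar transform) can be sketched as follows. The sketch assumes frozen positions are set to 0 and omits the bit-reversal permutation; the function names are hypothetical.

```python
def build_input_vector(info_bits: list[int], info_index_set: list[int],
                       n: int) -> list[int]:
    """Place information bits at the given indices; all other (frozen) bits are 0."""
    if n < 1 or n & (n - 1):
        raise ValueError("code length must be a power of 2")
    if len(info_bits) != len(info_index_set):
        raise ValueError("one index is needed per information bit")
    u = [0] * n
    for bit, index in zip(info_bits, sorted(info_index_set)):
        u[index] = bit
    return u


def polar_transform(u: list[int]) -> list[int]:
    """Compute x = u * F^(kron n) over GF(2), with F = [[1, 0], [1, 1]]."""
    x = list(u)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]  # butterfly: upper branch gets the XOR
        step *= 2
    return x


def polar_encode(info_bits: list[int], info_index_set: list[int],
                 n: int) -> list[int]:
    return polar_transform(build_input_vector(info_bits, info_index_set, n))
```

    For example, polar_encode([1], [3], 4) places a single 1 in the last position, and the transform spreads it across the whole code word.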
  • Patent number: 10340944
    Abstract: An object of the invention is to speed up processing of adding floating-point numbers. A floating-point adder includes: a first register configured to store a first fixed-point number having a predetermined number of digits corresponding to a result of accumulation of a plurality of floating-point numbers; a first conversion unit configured to convert an input first floating-point number into a second fixed-point number having the predetermined number of digits; a second register configured to store the second fixed-point number; an adder configured to add the second fixed-point number stored in the second register and the first fixed-point number stored in the first register, and store a result of the addition in the first register as the first fixed-point number; and a second conversion unit configured to convert the first fixed-point number into a second floating-point number, and output the second floating-point number.
    Type: Grant
    Filed: February 2, 2016
    Date of Patent: July 2, 2019
    Assignee: Renesas Electronics Corporation
    Inventor: Katsunori Tanaka
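    The accumulate-in-fixed-point scheme described above can be sketched as follows. The 32 fractional bits and the function names are assumptions for illustration, and Python's unbounded integers stand in for the fixed-width registers of the patent.

```python
FRAC_BITS = 32  # assumed number of fractional bits in the fixed-point format


def to_fixed(value: float) -> int:
    """First conversion: floating point to a fixed-point integer."""
    return round(value * (1 << FRAC_BITS))


def to_float(fixed: int) -> float:
    """Second conversion: fixed-point accumulator back to floating point."""
    return fixed / (1 << FRAC_BITS)


def accumulate(values: list[float]) -> float:
    """Sum floats by adding their fixed-point forms in an integer accumulator."""
    accumulator = 0  # plays the role of the first register
    for value in values:
        accumulator += to_fixed(value)  # plain integer add, no alignment step
    return to_float(accumulator)
```

    Because every addition is an exact integer addition, the result does not depend on the order of the inputs, which is not true of ordinary floating-point summation.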
  • Patent number: 10324730
    Abstract: A computing device performs parallel computations using a set of thread processing units and a memory shuffle engine. The memory shuffle engine includes a register array to store an array of data elements retrieved from a memory buffer, and an array of input selectors. According to a first control signal, each input selector transfers at least a first data element from a corresponding subset of the register array, which is coupled to the input selector via input lines, to one or more corresponding thread processing units. According to a second control signal, each input selector transfers at least a second data element from another subset of the register array, which is coupled to another input selector via other input lines, to the one or more corresponding thread processing units.
    Type: Grant
    Filed: October 4, 2016
    Date of Patent: June 18, 2019
    Assignee: MediaTek, Inc.
    Inventors: Shou-Jen Lai, Pei-Kuei Tsung, Po-Chun Fan, Sung-Fang Tsai
  • Patent number: 10305514
    Abstract: There is described a multi-mode unrolled decoder. The decoder comprises a master code input configured to receive a polar encoded master code of length N carrying k information bits and N−k frozen bits, decoding resources comprising processing elements and memory elements connected in an unrolled architecture and defining an operation path between the master code input and an output, for decoding a polar encoded code word, at least one constituent code input configured to receive a polar encoded constituent code of length N/p carrying j information bits and N/p−j frozen bits, where p is a power of 2, and at least one input multiplexer provided in the operation path to selectively transmit N/p bits of one of the master code and the constituent code to a subset of the decoding resources.
    Type: Grant
    Filed: February 3, 2017
    Date of Patent: May 28, 2019
    Assignee: THE ROYAL INSTITUTION FOR THE ADVANCEMENT OF LEARNING/MCGILL UNIVERSITY
    Inventors: Pascal Giard, Gabi Sarkis, Warren Gross, Claude Thibeault
  • Patent number: 10235169
    Abstract: A computer program product for implementing a received add program counter immediate shift (ADDPCIS) instruction using a micro-coded or cracked sequence is provided. The computer program product includes a computer readable storage medium having program instructions embodied therewith. The program instructions are readable and executable by a processing circuit to cause the processing circuit to recognize register operand and integer terms associated with the ADDPCIS instruction, set a value of a target register associated with the ADDPCIS instruction in accordance with the integer term summed with another term by obtaining a next instruction address (NIA), moving an architecturally defined register file from a first temporary register to a general purpose register and adding a shifted immediate constant to a value stored in a second temporary register.
    Type: Grant
    Filed: June 10, 2016
    Date of Patent: March 19, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10237782
    Abstract: Hardware acceleration for batched sparse (BATS) codes is enabled. Hardware implementation of some timing-critical procedures can effectively offload computationally intensive overheads, for example, finite field arithmetic, Gaussian elimination, and belief propagation (BP) calculations, and this can be done without direct mapping of software codes to a hardware implementation. Suitable acceleration hardware may include pipelined multipliers configured to multiply input data with coefficients of a matrix associated with a random linear network code in a pipelined manner, addition components configured to add multiplier output to feedback data, and switches to direct data flows to and from memory components such that valid result data is not overwritten and such that feedback data corresponds to most recent valid result data. Acceleration hardware components (e.g., number and configuration) may be dynamically adjusted to modify BATS code parameters and adapt to changing network conditions.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: March 19, 2019
    Assignee: The Chinese University of Hong Kong
    Inventors: Shenghao Yang, Wai-ho Yeung, Tak-Ion Chao, Kin-Hong Lee, Chi-Iam Ho
  • Patent number: 10175946
    Abstract: An instruction to perform a sign operation of a plurality of sign operations configured for the instruction. The instruction is executed, and the executing includes selecting at least a portion of an input operand as a result to be placed in a select location. The selecting is based on a control of the instruction, in which the control indicates a user-defined size of the input operand to be selected as the result. A sign of the result is determined based on a plurality of criteria, including a value of the result, obtained based on the control of the instruction, having a first particular relationship or a second particular relationship with respect to a selected value. The result and the sign are stored in the select location to provide a signed output to be used in processing within the computing environment.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: January 8, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Reid T. Copeland, Silvia Melitta Mueller, Timothy J. Slegel
  • Patent number: 10146536
    Abstract: A method and apparatus for including in a processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: December 4, 2018
    Assignee: Intel Corporation
    Inventors: Rajiv Kapoor, Ronen Zohar, Mark Buxton, Zeev Sperber, Koby Gottlieb
  • Patent number: 10120680
    Abstract: Embodiments of systems, apparatuses, and methods for broadcast arithmetic in a processor are described.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: November 6, 2018
    Assignee: Intel Corporation
    Inventors: Rama Kishan V. Malladi, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 10091336
    Abstract: A method includes providing a cloud-side database storing data, an objects model of the data, and a user interface (UI) model of the data. The method further involves providing an instance of an application server coded in JavaScript, for example, in a Node.js cross-platform runtime environment. The instance of the application server coded in JavaScript includes the logic of an application coded to process the data. The application logic is executed (and data processed) on either the client-side or on the cloud-side. The execution of the application logic (and processing of the data) is dynamically switchable between the client-side and the cloud-side.
    Type: Grant
    Filed: December 21, 2015
    Date of Patent: October 2, 2018
    Assignee: SAP SE
    Inventors: Tim Kornmann, Rene Gross, Thomas Biesemann, Jens Kisker
  • Patent number: 10019232
    Abstract: An apparatus and method are provided for inhibiting roundoff error in a floating point argument reduction operation. The apparatus has reciprocal estimation circuitry that is responsive to a first floating point value to determine a second floating point value that is an estimated reciprocal of the first floating point value. During this determination, the second floating point value has both its magnitude and its error bound constrained in dependence on a specified value N. Argument reduction circuitry then performs an argument reduction operation using the first and second floating point values as inputs, in order to generate a third floating point value. The use of the specified value N to constrain both the magnitude and the error bound of the second floating point value causes roundoff error to be inhibited in the third floating point value that is generated by the argument reduction operation.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: July 10, 2018
    Assignee: ARM Limited
    Inventor: Jørn Nystad
  • Patent number: 9983851
    Abstract: A hardware circuit computes a checksum using a technique such as the Adler-32 checksum algorithm. The hardware circuit may include one or more serially-connected chains of adders followed by a modulus circuit. The modulus circuit produces a modulus value in N, where N is not an integer power of 2. In some examples, N is 65,521. In some examples, the modulus circuit may produce a modulus value modulo 2^16 and then correct that value to modulo N. In other examples, the modulus circuit may include selection logic that selects an appropriate integer multiple of 65,521 to determine the modulo 65,521 result directly.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: May 29, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Michael Baranchik, Svetlana Kantorovych, Ori Weber
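    A software sketch of Adler-32 with a reduction that first works modulo 2^16 and then corrects to modulo 65,521 is shown below. The folding trick relies on 2^16 being congruent to 15 modulo 65,521; it is one plausible reading of the correction step, not the circuit claimed in the patent, and the function names are hypothetical.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16


def mod_65521(x: int) -> int:
    """Reduce x modulo 65521 using 16-bit folding plus one correction."""
    # 2**16 = 65521 + 15, so hi * 2**16 + lo is congruent to hi * 15 + lo.
    while x >> 16:
        x = (x >> 16) * 15 + (x & 0xFFFF)
    # x now fits in 16 bits; at most one subtraction finishes the job.
    if x >= MOD_ADLER:
        x -= MOD_ADLER
    return x


def adler32(data: bytes) -> int:
    """Adler-32 checksum: two running sums, each kept modulo 65521."""
    a, b = 1, 0
    for byte in data:
        a = mod_65521(a + byte)
        b = mod_65521(b + a)
    return (b << 16) | a


if __name__ == "__main__":
    sample = b"Wikipedia"
    # Cross-check against the standard library implementation.
    print(hex(adler32(sample)), adler32(sample) == zlib.adler32(sample))
```

    The cross-check against zlib.adler32 is a convenient way to validate any alternative reduction strategy before committing it to hardware.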
  • Patent number: 9910826
    Abstract: Implementing a 1D stencil code via SIMD instructions on a computer with vector registers having N processing elements (PEs), among them a set of coefficient vector registers, a set of at most N data vector registers, and a set of result vector registers. The M stencil coefficients are loaded in a particular pattern into M+N−1 coefficient vector registers. Successive sets of N consecutive data values are received, and each data value of a set is loaded into all PEs of a data vector register of the set of data vector registers. The result vector registers accumulate sums of products of consecutive coefficient vector registers with corresponding data vector registers. The contents of any result vector register containing a sum of all coefficient vector register-data vector register products is output, and the result vector register is reused for accumulating.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: March 6, 2018
    Assignee: International Business Machines Corporation
    Inventors: Leopold Grinberg, Karen A. Magerlein
  • Patent number: 9885351
    Abstract: A pump system including a motor, a fluid pump powered by the motor, a user-interface, and a controller. The controller including a user-interface input electrically coupled to the user-interface, a serial communication input, a digital input having a plurality of digital input pins sharing a common ground pin, a processor, and a computer readable memory. The computer readable memory storing instructions that, when executed by the processor, cause the controller to receive an operating signal simultaneously from the serial communication input and the digital input, and control the motor based on one of the operating signal from the serial communication input and the operating signal from the digital input.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: February 6, 2018
    Assignee: Regal Beloit America, Inc.
    Inventor: Marc C. McKinzie
  • Patent number: 9880839
    Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline has an instruction fetch stage to fetch an instruction specifying multiple target resultant registers. The instruction execution pipeline has an instruction decode stage to decode the instruction. The instruction execution pipeline has a functional unit to prepare resultant content specific to each of the multiple target resultant registers. The instruction execution pipeline has a write-back stage to write back said resultant content specific to each of said multiple target resultant registers.
    Type: Grant
    Filed: April 24, 2014
    Date of Patent: January 30, 2018
    Assignee: INTEL CORPORATION
    Inventors: Wei-Yu Chen, Guei-Yuan Lueh, Subramaniam Maiyuran, Supratim Pal
  • Patent number: 9870404
    Abstract: When plural processing programs for generating post-processing data which is a source of services to be provided are present, a relationship between post-processing data and a data group which is a source of the post-processing data is managed. The processing units acquire pre-processing data, execute given processing, and generate post-processing data as a result of the processing. At an opportunity to acquire the pre-processing data, a process ID indicative of ordering of the acquisition, and not updated before and after the given processing is allocated to acquired pre-processing data. The generated post-processing data is stored, and in extracting the post-processing data satisfying the given data search condition, the post-processing data having a process ID equal to or before the process ID that is latest in the post-processing data and oldest among the respective processing units is extracted from the post-processing data that satisfies the data search condition.
    Type: Grant
    Filed: September 7, 2012
    Date of Patent: January 16, 2018
    Assignee: Hitachi, Ltd.
    Inventors: Yasushi Miyata, Shoji Kodama
  • Patent number: 9870200
    Abstract: Arithmetic logic circuitry is provided for performing a floating point arithmetic add/subtract operation on first and second floating point numbers. The method includes: generating a guard digit for the first or second number by transforming the first and second numbers by a compressing function; determining a result depending on the arithmetic operation, a sum of the transformed floating point numbers, and first and second differences of the transformed floating point numbers, and determining a corresponding result plus one by additionally adding a value of one to the result; generating injection values for rounding the final result; generating injection carry values based on the transformed first and second numbers and the injection values; and selecting the final result from the result, the result plus one, and a least significant digit, based on the injection carry values and the end around carry signals.
    Type: Grant
    Filed: November 17, 2016
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Steven R. Carlough, Klaus M. Kroener, Petra Leber, Cedric Lichtenau, Silvia M. Mueller
  • Patent number: 9804841
    Abstract: New instruction definitions for a packet add (PADD) operation and for a single instruction multiple add (SMAD) operation are disclosed. In addition, a new dedicated PADD logic device that performs the PADD operation in about one to two processor clock cycles is disclosed. Also, a new dedicated SMAD logic device that performs a single instruction multiple data add (SMAD) operation in about one to two clock cycles is disclosed.
    Type: Grant
    Filed: January 20, 2015
    Date of Patent: October 31, 2017
    Assignee: Intel Corporation
    Inventors: Corey Gee, Bapiraju Vinnakota, Saleem Mohammadali, Carl A. Alberola
  • Patent number: 9766857
    Abstract: An apparatus includes processing circuitry to perform one or more arithmetic operations for generating a result value based on at least one operand. For at least one arithmetic operation, the processing circuitry is responsive to programmable significance data indicative of a target significance for the result value, to generate the result value having the target significance. For example, this allows programmers to set a significance boundary for the arithmetic operation so that it is not necessary for the processing circuitry to calculate bit values having a significance outside the specified boundary, enabling a performance improvement.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: September 19, 2017
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
  • Patent number: 9760371
    Abstract: A method of an aspect includes receiving a packed data operation mask register arithmetic combination instruction. The packed data operation mask register arithmetic combination instruction indicates a first packed data operation mask register, indicates a second packed data operation mask register, and indicates a destination storage location. An arithmetic combination of at least a portion of bits of the first packed data operation mask register and at least a corresponding portion of bits of the second packed data operation mask register is stored in the destination storage location in response to the packed data operation mask register arithmetic combination instruction. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: September 12, 2017
    Assignee: Intel Corporation
    Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
  • Patent number: 9753693
    Abstract: A binary logic circuit is provided for determining a rounded value of px/q, where p and q are coprime constant integers with p<q and q≠2^i, i is any integer, and x is an integer variable between 0 and integer M where M≥2q, the binary logic circuit implementing in hardware the optimal solution of the multiply-add operation (ax + b)/2^k, where a, b and k are fixed integers.
    Type: Grant
    Filed: March 13, 2014
    Date of Patent: September 5, 2017
    Assignee: Imagination Technologies Limited
    Inventor: Thomas Rose
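    To make the multiply-add form concrete, the sketch below searches by brute force for integers a, b and k such that (a*x + b) >> k equals the rounded value of p*x/q for every x from 0 to M. Round-half-up is an assumed rounding mode, the search is exhaustive rather than the patent's optimal construction, and the function names are hypothetical.

```python
from typing import Optional


def round_half_up(p: int, q: int, x: int) -> int:
    """Reference result: p*x/q rounded to nearest, ties rounded up."""
    return (2 * p * x + q) // (2 * q)


def find_multiply_add(p: int, q: int, m: int,
                      max_k: int = 16) -> Optional[tuple[int, int, int]]:
    """Return (a, b, k) with (a*x + b) >> k == round(p*x/q) for all 0 <= x <= m."""
    for k in range(max_k + 1):
        base = (p << k) // q
        # The multiplier must approximate p * 2**k / q, so try both neighbours.
        for a in (base, base + 1):
            for b in range(1 << k):
                if all((a * x + b) >> k == round_half_up(p, q, x)
                       for x in range(m + 1)):
                    return a, b, k
    return None


if __name__ == "__main__":
    # Example: rounded division of an 8-bit value by 3.
    print(find_multiply_add(1, 3, 255))
```

    Because k is tried in increasing order, the first hit uses the smallest shift, which corresponds to the narrowest multiplier within this search.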
  • Patent number: 9748928
    Abstract: Digital signal processing (“DSP”) block circuitry on an integrated circuit (“IC”) is adapted for use, e.g., in multiple instances of the DSP block circuitry on the IC, for implementing finite-impulse-response (“FIR”) filters that are dynamically adjustable. Advantages of such DSP block circuitries may include an increase in performance and a reduction in logic and memory usage for multi-standard FIR filters.
    Type: Grant
    Filed: September 2, 2016
    Date of Patent: August 29, 2017
    Assignee: Altera Corporation
    Inventor: Volker Mauer
  • Patent number: 9696964
    Abstract: A floating point multiply add circuit 24 includes a multiplier 26 and an adder 28. The input operands A, B and C together with the result value all have a normal exponent value range, such as a range consistent with the IEEE Standard 754. The product value which is passed from the multiplier 26 to the adder 28 has an extended exponent value range that extends lower than the normal exponent value range. Shifters 48, 50 within the adder can take account of the extended exponent value range of the product as necessary in order to bring the result value back into the normal exponent value range.
    Type: Grant
    Filed: December 11, 2014
    Date of Patent: July 4, 2017
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Neil Burgess
  • Patent number: 9690543
    Abstract: A data processing system uses alignment circuitry to align input operands in accordance with a programmable significance parameter to form aligned input operands. The aligned input operands are supplied to arithmetic circuitry, such as an integer adder or an integer multiplier, where a result value is formed. The result value is stored in an output operand storage element, such as a result register. The programmable significance parameter is independent of the result value.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: June 27, 2017
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
  • Patent number: 9658828
    Abstract: Arithmetic logic circuitry is provided for performing a floating point arithmetic add/subtract operation on first and second floating point numbers. The method includes: generating a guard digit for the first or second number by transforming the first and second numbers by a compressing function; determining a result depending on the arithmetic operation, a sum of the transformed floating point numbers, and first and second differences of the transformed floating point numbers, and determining a corresponding result plus one by additionally adding a value of one to the result; generating injection values for rounding the final result; generating injection carry values based on the transformed first and second numbers and the injection values; and selecting the final result from the result, the result plus one, and a least significant digit, based on the injection carry values and the end around carry signals.
    Type: Grant
    Filed: October 2, 2015
    Date of Patent: May 23, 2017
    Assignee: International Business Machines Corporation
    Inventors: Steven R. Carlough, Klaus M. Kroener, Petra Leber, Cedric Lichtenau, Silvia M. Mueller
  • Patent number: 9569418
    Abstract: Converting data transformations entered in a spreadsheet program into a circuit representation of those transformations. The circuit representation can run independently of the spreadsheet program to transform input data into output data. In some cases the circuit representation is in the form of hardware, accepts and/or produces data streams, and/or the circuit and/or output data or data streams can be shared among multiple users and/or subscribers. Where data streams are processed, the transformations may include well-specified timing semantics, supporting operations that involve rate-based rate manipulation, value-based rate manipulation, and/or access to past cell values.
    Type: Grant
    Filed: June 27, 2014
    Date of Patent: February 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: Martin J. Hirzel, Rodric Rabbah, Philippe Suter, Olivier L. J. Tardieu, Mandana Vaziri
  • Patent number: 9563401
    Abstract: An extensible iterative multiplier design is provided. Embodiments provide cascaded 8-bit multipliers for simplifying the performance of multi-byte multiplications. Booth encoding is performed in the lowest order multiplier, with the result of the Booth encoding then provided to higher order multipliers. Additionally, multiply-add operations can be performed by initializing a partial product sum register. Configurable connections between the multipliers facilitate a variety of possible multiplication options, including the possibility of varying the width of the operands.
    Type: Grant
    Filed: December 7, 2013
    Date of Patent: February 7, 2017
    Assignee: Wave Computing, Inc.
    Inventors: Samit Chaudhuri, Radoslav Danilak
  • Patent number: 9501281
    Abstract: Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: November 22, 2016
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Wajdi K. Feghali, Gilbert M. Wolrich, Martin G. Dixon
  • Patent number: 9495166
    Abstract: Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: November 15, 2016
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Wajdi K. Feghali, Gilbert M. Wolrich, Martin G. Dixon
  • Patent number: 9495165
    Abstract: Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: November 15, 2016
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Wajdi K. Feghali, Gilbert M. Wolrich, Martin G. Dixon
  • Patent number: 9489343
    Abstract: Systems and methods for sparse matrix vector multiplication (SpMV) are disclosed. The systems and methods include a novel streaming reduction architecture for floating point accumulation and a novel on-chip cache design optimized for streaming compressed sparse row (CSR) matrices. The present disclosure is also directed to implementation of the reduction circuit and/or processing elements for SpMV processing into a personality for the Convey HC-1 computing device.
    Type: Grant
    Filed: October 7, 2014
    Date of Patent: November 8, 2016
    Assignee: University of South Carolina
    Inventor: Jason D. Bakos
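For readers unfamiliar with the compressed sparse row (CSR) format named in the abstract above, here is a minimal software sketch of sparse matrix-vector multiplication over CSR arrays. It illustrates only the data layout and the per-row accumulation. It does not model the patented streaming reduction circuit or cache, and the array names (`values`, `col_idx`, `row_ptr`) are conventional choices rather than terms from the patent.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x where A is stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    """
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0  # per-row floating-point accumulation
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    # A = [[1, 0, 2],
    #      [0, 0, 3],
    #      [4, 5, 0]]
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    col_idx = [0, 2, 2, 0, 1]
    row_ptr = [0, 2, 3, 5]
    print(csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
```

The inner loop is the accumulation that a hardware reduction circuit would have to keep fed as rows of varying length stream in.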
  • Patent number: 9438203
    Abstract: Digital signal processing (“DSP”) block circuitry on an integrated circuit (“IC”) is adapted for use, e.g., in multiple instances of the DSP block circuitry on the IC, for implementing finite-impulse-response (“FIR”) filters that are dynamically adjustable. Advantages of such DSP block circuitries may include an increase in performance and a reduction in logic and memory usage for multi-standard FIR filters.
    Type: Grant
    Filed: January 10, 2014
    Date of Patent: September 6, 2016
    Assignee: Altera Corporation
    Inventor: Volker Mauer
  • Patent number: 9396111
    Abstract: A consuming subsystem calculates information based on setup information from one or more other subsystems. Each of the one or more other subsystems generates a base version value that changes every time any of the setup information changes. The consuming subsystem caches information, including the base version values at the time the information was calculated by the consuming subsystem.
    Type: Grant
    Filed: January 10, 2014
    Date of Patent: July 19, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Jeffrey R. Anderson
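The version-value caching described in the abstract above can be sketched roughly as follows. The abstract says only that the base version values are cached alongside the calculated information. The sketch assumes those values are later compared to decide whether the cached result is still valid. The classes `Provider` and `Consumer` and the `"size"` setting are hypothetical.

```python
class Provider:
    """A subsystem whose version value changes whenever its setup changes."""

    def __init__(self, setup):
        self._setup = dict(setup)
        self.version = 0

    def set(self, key, value):
        if self._setup.get(key) != value:
            self._setup[key] = value
            self.version += 1  # any change to the setup bumps the version

    def get(self, key):
        return self._setup[key]


class Consumer:
    """Caches a derived value together with the versions it was computed from."""

    def __init__(self, providers):
        self.providers = providers
        self._cached = None
        self._cached_versions = None
        self.recomputations = 0

    def total(self):
        versions = tuple(p.version for p in self.providers)
        if versions != self._cached_versions:  # assumed validity check
            self._cached = sum(p.get("size") for p in self.providers)
            self._cached_versions = versions
            self.recomputations += 1
        return self._cached


if __name__ == "__main__":
    a, b = Provider({"size": 2}), Provider({"size": 3})
    c = Consumer([a, b])
    print(c.total(), c.total(), c.recomputations)
    a.set("size", 10)
    print(c.total(), c.recomputations)
```

Comparing a small tuple of version numbers is cheaper than comparing the setup data itself, which is the apparent motivation for the scheme.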
  • Patent number: 9383968
    Abstract: One embodiment of the present invention includes a method for simplifying arithmetic operations by detecting operands with elementary values such as zero or 1.0. Computer and graphics processing systems perform a great number of multiply-add operations. In a significant portion of these operations, the values of one or more of the operands are zero or 1.0. By detecting the occurrence of these elementary values, math operations can be greatly simplified, for example by eliminating multiply operations when one multiplicand is zero or 1.0 or eliminating add operations when one addend is zero. The simplified math operations resulting from detecting elementary valued operands provide significant savings in overhead power, dynamic processing power, and cycle time.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: July 5, 2016
    Assignee: NVIDIA Corporation
    Inventors: Daniel Finchelstein, David Conrad Tannenbaum, Srinivasan (Vasu) Iyer
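A software analogue of the elementary-value detection described above might look like the sketch below. It assumes finite operands and ignores the sign of zero; under IEEE 754, skipping a multiply by zero would give a different answer for infinities and NaNs. The function name `simplified_fma` and the returned list of executed operations are hypothetical, added only to make the skipped work visible.

```python
def simplified_fma(a: float, b: float, c: float):
    """Evaluate a * b + c, skipping operations made trivial by 0 or 1.0 operands.

    Returns (result, ops), where ops lists the arithmetic actually performed.
    Assumes finite inputs.
    """
    ops = []
    if a == 0.0 or b == 0.0:
        product = 0.0          # multiply eliminated: one multiplicand is zero
    elif a == 1.0:
        product = b            # multiply eliminated: multiplicand is 1.0
    elif b == 1.0:
        product = a
    else:
        product = a * b
        ops.append("mul")

    if c == 0.0:
        result = product       # add eliminated: addend is zero
    elif product == 0.0:
        result = c
    else:
        result = product + c
        ops.append("add")
    return result, ops


if __name__ == "__main__":
    for args in [(0.0, 5.0, 3.0), (1.0, 5.0, 0.0), (2.0, 3.0, 4.0)]:
        print(args, simplified_fma(*args))
```

In hardware the saving comes from not clocking the multiplier or adder; in this sketch the `ops` list simply records which units would have been used.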
  • Patent number: 9348963
    Abstract: A system and method for optimizing a design layout by identifying features for abutment where shapes of the features that trigger the abutment are overlapping or within a predefined proximity of each other. The abutment process is implemented for features that have overlapping pins or that will have overlapping pins when abutted. Connectivity of abutted features is analyzed for the overlapped pins; pins of one of the abutted features are swapped so that at least one overlapping set of horizontal pins is connected to a same net; and a pin of the abutted features can be shortened as necessary to prevent short-circuit between pins connected to different nets. The overlapping pins are then merged. Pins can be shortened by cutting the pin or by adjusting pin style or pin size.
    Type: Grant
    Filed: September 30, 2014
    Date of Patent: May 24, 2016
    Assignee: Cadence Design Systems, Inc.
    Inventors: Min-Ching Lin, Kenny Ferguson, Ming Yi Fang, SSU-Ping Ko
  • Patent number: 9268565
    Abstract: A method and apparatus for including in a processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.
    Type: Grant
    Filed: April 12, 2015
    Date of Patent: February 23, 2016
    Assignee: Intel Corporation
    Inventors: Rajiv Kapoor, Ronen Zohar, Mark Buxton, Zeev Sperber, Koby Gottlieb
  • Patent number: 9164727
    Abstract: This invention discloses an FPGA-based high-speed, low-latency floating-point accumulator and its implementation method. The floating-point accumulator of this invention comprises a floating-point adder unit, numerous intermediate result buffers, an input control unit and an output control unit. The implementation method grades the whole accumulation calculation process into levels so that accumulation calculations at different levels are executed in an interleaved manner and their intermediate results are stored by level; meanwhile, fully pipelined operation significantly improves the utilization rate of the internal floating-point adder and maintains relatively low latency to output of the final floating-point accumulation result.
    Type: Grant
    Filed: December 1, 2011
    Date of Patent: October 20, 2015
    Assignee: ZHEJIANG UNIVERSITY
    Inventors: Yaowu Chen, Longtao Yuan, Fan Zhou
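One way to picture accumulation with intermediate results held at different levels is the software sketch below. It is an interpretation and not the patented circuit: it assumes each level buffers at most one partial sum, and that two partial sums meeting at the same level are added and promoted to the next level. The function name `graded_accumulate` is hypothetical, and the adder's pipeline timing is not modelled.

```python
def graded_accumulate(values):
    """Sum `values` using per-level buffers of partial sums.

    levels[i] holds either None or one partial sum covering 2**i inputs.
    """
    levels = []
    for v in values:
        carry = v
        level = 0
        # Merge upward while the current level already holds a partial sum.
        while level < len(levels) and levels[level] is not None:
            carry = levels[level] + carry
            levels[level] = None
            level += 1
        if level == len(levels):
            levels.append(carry)
        else:
            levels[level] = carry

    # Drain whatever partial sums remain once the input stream ends.
    total = 0.0
    for partial in levels:
        if partial is not None:
            total += partial
    return total


if __name__ == "__main__":
    print(graded_accumulate([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Because additions at different levels are independent of one another, a pipelined adder could interleave them instead of waiting for each running sum to complete.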
  • Patent number: 9147462
    Abstract: It is an object to provide a memory device for which a complex manufacturing process is not necessary and whose power consumption can be suppressed and a signal processing circuit including the memory device. In a memory element including a phase-inversion element by which the phase of an input signal is inverted and the signal is output such as an inverter or a clocked inverter, a capacitor which holds data and a switching element which controls storing and releasing of electric charge in the capacitor are provided. For the switching element, a transistor including an oxide semiconductor in a channel formation region is used. The memory element is applied to a memory device such as a register or a cache memory included in a signal processing circuit.
    Type: Grant
    Filed: November 25, 2013
    Date of Patent: September 29, 2015
    Assignee: Semiconductor Energy Laboratory Co., Ltd.
    Inventors: Jun Koyama, Shunpei Yamazaki
  • Patent number: 9087002
    Abstract: An addition/subtraction hardware operator includes a plurality of addition/subtraction hardware modules and a plurality of transmission links between these modules, on one hand, and between inputs and outputs of the operator and these modules, on the other hand, according to a pre-determined structure for performing arithmetical calculations. At least a part of the addition/subtraction hardware modules and at least a part of the links between these modules can be configured by at least one programmable parameter, at least between a first configuration in which the operator finalizes a computation of real parts of fast Fourier transform coefficients, a second configuration in which the operator finalizes a computation of imaginary parts of fast Fourier transform coefficients, and a third configuration in which the operator carries out a computation of path metrics and survivors values of a Viterbi algorithm implementation.
    Type: Grant
    Filed: November 29, 2010
    Date of Patent: July 21, 2015
    Assignee: Commissariat à l'énergie atomique et aux énergies alternatives
    Inventors: Laurent Alaus, Dominique Noguet
  • Patent number: 9037626
    Abstract: A method and apparatus for including in a processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: May 19, 2015
    Assignee: Intel Corporation
    Inventors: Rajiv Kapoor, Ronen Zohar, Mark J. Buxton, Zeev Sperber, Koby Gottlieb
  • Patent number: 9037798
    Abstract: A system (200) and a method (100) of operating a computing device to perform memoization are disclosed. The method includes determining whether a result of a function is stored in a cache and, if so, retrieving the result from the cache and, if not, calculating the result and storing it in the cache. The method (100) includes transforming (104) by the computing device at least one selected from the input parameters and the output parameters of the function, the transforming being based on an analysis of the function and its input arguments to establish whether or not there is a possible relationship reflecting redundancy among the input parameters and output parameters of the function. The transforming may include at least one of: use of symmetry, scaling, linear shift, interchanging of variables, inversion, polynomial and/or trigonometric transformations, spectral or logical transformations, fuzzy transformations, and systematic arrangement of parameters.
    Type: Grant
    Filed: December 10, 2009
    Date of Patent: May 19, 2015
    Assignee: CSIR
    Inventor: Albert Anatolievich Lysko
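As a small illustration of memoization with an input transformation, the sketch below uses symmetry, one of the transformations listed in the abstract above. It assumes the wrapped function gives the same result for any ordering of its arguments, so the arguments are sorted into a canonical order before the cache lookup. The class name `SymmetricMemo` and the `misses` counter are hypothetical.

```python
import math


class SymmetricMemo:
    """Memoize a function that is symmetric in its arguments.

    Arguments are sorted before lookup so that f(a, b) and f(b, a)
    share a single cache entry.
    """

    def __init__(self, func):
        self.func = func
        self.cache = {}
        self.misses = 0

    def __call__(self, *args):
        key = tuple(sorted(args))  # transform inputs to a canonical form
        if key not in self.cache:
            self.cache[key] = self.func(*key)
            self.misses += 1
        return self.cache[key]


if __name__ == "__main__":
    dist = SymmetricMemo(math.hypot)
    print(dist(3.0, 4.0), dist(4.0, 3.0), dist.misses)
```

Without the sorting step, the two calls would occupy two cache entries and the function would be evaluated twice.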
  • Publication number: 20150120796
    Abstract: The standing wave simple math processor is a new system for doing simple math (addition). By utilizing standing waves and conventional connections, a charge of direct current is transferred from 2 adjacent standing waves to 1 final standing wave representing the output. In essence the two input waves function as operands in a math problem. The operator, in this sense, is an interconnection of DC current between the 3 standing waves, as well as a set of redistribution rules. The final solution is retrieved upon completion of the redistribution rules.
    Type: Application
    Filed: January 9, 2015
    Publication date: April 30, 2015
    Inventor: Seth John Winnipeg
  • Patent number: 9021004
    Abstract: A circuit arrangement and method couple a hardware-based pseudorandom number generator (PRNG) to an execution unit in such a manner that pseudorandom numbers generated by the PRNG may be selectively output to the execution unit for use as an operand during the execution of instructions by the execution unit. A PRNG may be coupled to an input of an operand multiplexer that outputs to an operand input of an execution unit so that operands provided by instructions supplied to the execution unit are selectively overridden with pseudorandom numbers generated by the PRNG. Furthermore, overridden operands provided by instructions supplied to the execution unit may be used as seed values for the PRNG.
    Type: Grant
    Filed: July 24, 2012
    Date of Patent: April 28, 2015
    Assignee: International Business Machines Corporation
    Inventors: Adam James Muff, Matthew Ray Tubbs
  • Patent number: 9015221
    Abstract: A method and apparatus for distributing objects. In one embodiment, the method comprises computing a modulus operand based on a number of objects to be distributed and a number of objects pertaining to a first category; computing a modulus operation based on a number of distributed objects and the modulus operand; and distributing a first object or a second object based on a result of computing the modulus operation.
    Type: Grant
    Filed: December 1, 2011
    Date of Patent: April 21, 2015
    Assignee: Vonage Network LLC
    Inventor: Domenic A. Cicchino
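The modulus-based distribution described above could be sketched as follows. The abstract does not say how the modulus operand is derived. This sketch assumes it is the integer quotient of the total number of objects by the number of first-category objects, and it adds a guard so the first category is never over-issued. The function name `distribute` and the labels "A" and "B" are hypothetical.

```python
def distribute(total: int, first_count: int):
    """Interleave `first_count` category-A objects among `total` objects.

    Returns a list of "A"/"B" labels in distribution order.
    """
    if not 0 < first_count <= total:
        raise ValueError("first_count must be between 1 and total")
    modulus = total // first_count  # assumed derivation of the modulus operand
    order = []
    first_left = first_count
    for distributed in range(total):
        # Pick the category from (objects distributed so far) mod (modulus operand).
        if first_left > 0 and distributed % modulus == 0:
            order.append("A")
            first_left -= 1
        else:
            order.append("B")
    return order


if __name__ == "__main__":
    print("".join(distribute(10, 2)))
```

The modulus test spreads first-category objects through the sequence rather than issuing them all at the start or the end.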