Sum Of Products Generation Patents (Class 708/603)
-
Patent number: 12136032
Abstract: A technique for stable and fast computation of a variance representing a confidence interval for an estimation result in an estimation apparatus using a neural network including an integrated layer that combines a dropout layer for dropping out part of input data and an FC layer for computing a weight is provided. When input data having a multivariate distribution is supplied to the integrated layer, a data analysis unit 30 determines, based on a numerical distribution of terms formed by respective products of each vector element of the input data and the weight, a data type of each vector element of output data from the integrated layer. An estimated confidence interval computation unit 20 applies an approximate computation method associated with the data type, to analytically compute a variance of each vector element of the output data from the integrated layer based on the input data to the integrated layer.
Type: Grant
Filed: November 14, 2017
Date of Patent: November 5, 2024
Assignee: DENSO IT LABORATORY, INC.
Inventor: Jingo Adachi
-
Patent number: 12079733
Abstract: A non-volatile memory structure capable of storing weights for layers of a deep neural network (DNN) and performing an inferencing operation within the structure is presented. An in-array multiplication can be performed between multi-bit valued inputs, or activations, for a layer of the DNN and multi-bit valued weights of the layer. Each bit of a weight value is stored in a binary valued memory cell of the memory array and each bit of the input is applied as a binary input to a word line of the array for the multiplication of the input with the weight. To perform a multiply and accumulate operation, the results of the multiplications are accumulated by adders connected to sense amplifiers along the bit lines of the array. The adders can be configured to multiple levels of precision, so that the same structure can accommodate weights and activations of 8-bit, 4-bit, and 2-bit precision.
Type: Grant
Filed: July 28, 2020
Date of Patent: September 3, 2024
Assignee: SanDisk Technologies LLC
Inventors: Tung Thanh Hoang, Won Ho Choi, Martin Lueker-Boden
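The bit-level multiply and accumulate described in this abstract can be illustrated in software. The following is a minimal sketch, not the patented circuit: it assumes unsigned operands, and the names `bit_plane` and `in_array_mac` are hypothetical. Each 1-bit input is ANDed with each 1-bit stored weight, the conducting cells are counted, and the counts are weighted by their combined bit position.

```python
def bit_plane(values, bit):
    """Return the given bit of every value as a list of 0/1 flags."""
    return [(v >> bit) & 1 for v in values]


def in_array_mac(inputs, weights, bits=4):
    """Unsigned dot product built only from 1-bit by 1-bit products.

    Assumption: operands are unsigned and fit in `bits` bits. The
    function name and argument layout are illustrative only.
    """
    total = 0
    for i in range(bits):                # input bit applied to the word lines
        x_bits = bit_plane(inputs, i)
        for j in range(bits):            # weight bit stored in binary cells
            w_bits = bit_plane(weights, j)
            # Count the cells where both the input bit and weight bit are 1.
            ones = sum(x & w for x, w in zip(x_bits, w_bits))
            # Weight the count by the combined significance of the two bits.
            total += ones << (i + j)
    return total


inputs = [3, 5, 15]
weights = [7, 2, 9]
print(in_array_mac(inputs, weights))  # 166, same as 3*7 + 5*2 + 15*9
```

Changing `bits` to 2, 4, or 8 mirrors the idea of one structure serving several precisions, although the hardware does this by reconfiguring adders rather than by looping.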
-
Patent number: 12008069
Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
Type: Grant
Filed: November 29, 2023
Date of Patent: June 11, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
-
Patent number: 12001953
Abstract: Techniques for mixed-precision data manipulation for neural network data computation are disclosed. A first left group comprising eight bytes of data and a first right group of eight bytes of data are obtained for computation using a processor. A second left group comprising eight bytes of data and a second right group of eight bytes of data are obtained. A sum of products is performed between the first left and right groups and the second left and right groups. The sum of products is performed on bytes of 8-bit integer data. A first result is based on a summation of eight values that are products of the first group's left eight bytes and the second group's left eight bytes. A second result is based on the summation of eight values that are products of the first group's left eight bytes and the second group's right eight bytes. Results are output.
Type: Grant
Filed: February 24, 2023
Date of Patent: June 4, 2024
Assignee: MIPS Tech, LLC
Inventors: James Hippisley Robinson, Sanjay Patel
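The two results named in this abstract can be sketched as a pair of eight-element dot products. This is a hedged illustration only: the abstract says "8-bit integer data" without stating signedness, so the sketch assumes signed two's-complement bytes, and the function names `to_int8`, `dot8`, and `mixed_sum_of_products` are hypothetical.

```python
def to_int8(byte):
    """Interpret a raw byte (0..255) as a signed 8-bit integer (assumption)."""
    return byte - 256 if byte > 127 else byte


def dot8(a, b):
    """Sum of eight byte-wise products of two eight-byte groups."""
    assert len(a) == 8 and len(b) == 8
    return sum(to_int8(x) * to_int8(y) for x, y in zip(a, b))


def mixed_sum_of_products(first_left, second_left, second_right):
    """Return the first and second results described in the abstract.

    first result  = first group's left bytes . second group's left bytes
    second result = first group's left bytes . second group's right bytes
    """
    return dot8(first_left, second_left), dot8(first_left, second_right)


first_left = bytes([1, 2, 3, 4, 5, 6, 7, 8])
second_left = bytes([1] * 8)
second_right = bytes([255] * 8)          # 255 is -1 when read as signed
print(mixed_sum_of_products(first_left, second_left, second_right))  # (36, -36)
```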
-
Patent number: 11947929
Abstract: An arithmetic device includes a comparison unit comparing voltage generated with charge stored in a storage unit with a threshold, and outputting an output signal at a timing when the voltage exceeds the threshold, and a timing extension unit extending an interval between timings at each of which the output signal is output.
Type: Grant
Filed: July 4, 2019
Date of Patent: April 2, 2024
Assignee: SONY CORPORATION
Inventor: Hiroyuki Yamagishi
-
Patent number: 11900184
Abstract: A multiply-accumulate device (10) includes: a comparison unit (18) that compares, with a threshold voltage, a voltage generated by an electric charge stored in a storage unit (14), and outputs an output signal at timing at which the voltage exceeds the threshold voltage; and a control circuit (110) that reduces, based on a predetermined set value, a charging current to the storage unit (14) from a plurality of input units (13) connected to the storage unit (14).
Type: Grant
Filed: July 5, 2019
Date of Patent: February 13, 2024
Assignee: Sony Group Corporation
Inventors: Yasushi Fujinami, Hiroyuki Yamagishi
-
Patent number: 11625224
Abstract: An apparatus includes a first holding unit and a second holding unit configured to hold first-type data and second-type data, respectively, a first operation unit configured to execute a first product-sum operation based on the first-type data, a branch unit configured to output an operation result of the first product-sum operation in parallel, a sampling unit configured to sample the operation result and to output a sampling result, and a second operation unit configured to execute a second product-sum operation based on the second-type data and the sampling result.
Type: Grant
Filed: April 17, 2019
Date of Patent: April 11, 2023
Assignee: CANON KABUSHIKI KAISHA
Inventors: Tsewei Chen, Masami Kato, Masahiro Ariizumi
-
Patent number: 11614919
Abstract: A circuit comprising a first term operation circuit, a second term operation circuit, a third term operation circuit, and a second calculation circuit. Each of the first and the second term operation circuits comprises multiple higher bit operation circuits, a lowest bit operation circuit, and a first calculation circuit. Each of the higher bit operation circuits selectively left-shifts a multiplicand by different bits, outputs the shifted multiplicand, determines a sign of the shifted multiplicand, and left-shifts the shifted multiplicand. The lowest bit operation circuit outputs the multiplicand, and determines a sign of the multiplicand. The first calculation circuit generates a term operation result. The third term operation circuit generates a third term operation result. The second calculation circuit adds the term operation results of the first and second term operation circuits and the third term operation result to generate a total operation result.
Type: Grant
Filed: August 13, 2020
Date of Patent: March 28, 2023
Assignee: REALTEK SEMICONDUCTOR CORPORATION
Inventor: Szu-Chun Chang
-
Patent number: 11615307
Abstract: Techniques for mixed-precision data manipulation for neural network data computation are disclosed. A first left group comprising eight bytes of data and a first right group of eight bytes of data are obtained for computation using a processor. A second left group comprising eight bytes of data and a second right group of eight bytes of data are obtained. A sum of products is performed between the first left and right groups and the second left and right groups. The sum of products is performed on bytes of 8-bit integer data. A first result is based on a summation of eight values that are products of the first group's left eight bytes and the second group's left eight bytes. A second result is based on the summation of eight values that are products of the first group's left eight bytes and the second group's right eight bytes. Results are output.
Type: Grant
Filed: August 5, 2020
Date of Patent: March 28, 2023
Assignee: MIPS Tech, LLC
Inventors: James Hippisley Robinson, Sanjay Patel
-
Patent number: 11500629
Abstract: A multiplying-and-accumulating (MAC) circuit includes a multiplying circuit and an adding circuit. The multiplying circuit includes a first multiplier and a second multiplier, and each of the first multiplier and the second multiplier performs a multiplying calculation for first input data with N bits and second input data with M bits to output multiplication result data with (N+M) bits (where “N” and “M” are natural numbers equal to or greater than one). The adding circuit includes an adder which performs an adding calculation for the multiplication result data of the first multiplier and the multiplication result data of the second multiplier to output addition result data with (N+M) bits.
Type: Grant
Filed: January 8, 2021
Date of Patent: November 15, 2022
Assignee: SK hynix Inc.
Inventor: Choung Ki Song
-
Patent number: 11487541
Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.
Type: Grant
Filed: November 30, 2020
Date of Patent: November 1, 2022
Assignee: Intel Corporation
Inventors: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson
-
Patent number: 11455143
Abstract: A device (e.g., an integrated circuit chip) includes a dot product processing component, a data alignment component, and an accumulator. The dot product processing component is configured to calculate a dot product of a first group of elements stored in a first storage unit with a second group of elements, wherein: each element of the first group of elements is represented using a first number of bits, each value of a group of values stored in the first storage unit is represented using a second number of bits greater than the first number of bits, and each value of the group of values is stored as split segments across more than one element of the elements of the first group of elements. The data alignment component is configured to receive results of the dot product processing component and modify one or more of the results of the dot product processing component. The accumulator is configured to sum outputs of the data alignment component to at least in part determine a sum of the group of values.
Type: Grant
Filed: May 7, 2020
Date of Patent: September 27, 2022
Assignee: Meta Platforms, Inc.
Inventors: Thomas Mark Ulrich, Krishnakumar Narayanan Nair, Ehsan Khish Ardestani Zadeh
-
Patent number: 11455142
Abstract: Embodiments for implementing a fused multiply-multiply-accumulate (“FMMA”) unit by one or more processors in a computing system. Mantissas for two products, an exponent difference of the two products serving as an alignment shift amount for a product of the two products having a smallest exponent, and an alignment shift amount for an addend relative to an alternative product of the two products having a larger exponent may be determined in parallel. The addend may be aligned relative to the alternative product having the larger exponent. The product having the smallest exponent may be aligned relative to the alternative product having the larger exponent according to the alignment shift amount.
Type: Grant
Filed: June 5, 2019
Date of Patent: September 27, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ankur Agrawal, Silvia Mueller, Kailash Gopalakrishnan, Bruce Fleischer, Balaram Sinharoy, Mingu Kang
-
Patent number: 11366663
Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
Type: Grant
Filed: November 9, 2018
Date of Patent: June 21, 2022
Assignee: Intel Corporation
Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
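The accumulate-into-single-precision behaviour in this abstract can be modelled in a few lines. The sketch below is an approximation under stated assumptions, not the instruction's defined semantics: it assumes IEEE binary16 sources (the abstract says only "16-bit floating-point"), two pairs per destination element, and rounding to single precision after every addition. The names `to_half`, `to_single`, and `dot_fp16_accumulate` are hypothetical.

```python
import struct


def to_half(x):
    """Round a Python float to IEEE binary16 (assumed source format)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x):
    """Round a Python float to IEEE binary32 (destination format)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_fp16_accumulate(dest, src1, src2, pairs_per_element=2):
    """Accumulate products of 16-bit pairs into single-precision elements.

    Assumption: each destination element consumes `pairs_per_element`
    consecutive source pairs, and the running sum is rounded to single
    precision after each addition. Real hardware may round differently.
    """
    out = []
    for i, acc in enumerate(dest):
        base = i * pairs_per_element
        for k in range(pairs_per_element):
            product = to_half(src1[base + k]) * to_half(src2[base + k])
            acc = to_single(acc + product)
        out.append(acc)
    return out


dest = [1.0, 0.5]
src1 = [1.5, 2.0, 0.25, 4.0]
src2 = [2.0, 0.5, 8.0, 0.5]
print(dot_fp16_accumulate(dest, src1, src2))  # [5.0, 4.5]
```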
-
Patent number: 11262982
Abstract: A computation circuit includes a plurality of processing elements and a common accumulator. The plurality of processing elements are sequentially coupled in series, and perform a multiply and accumulate (MAC) operation on a weight signal and at least one of two or more input signals received in each unit cycle. The common accumulator is sequentially and cyclically coupled to first to Kth processing elements among the plurality of processing elements, and configured to receive a computation value outputted from a processing element coupled thereto among the first to Kth processing elements, and store computation information. K is decided based on values of the two or more input signals and the number of guard bits included in one processing element.
Type: Grant
Filed: July 22, 2019
Date of Patent: March 1, 2022
Assignees: SK hynix Inc., SK Telecom Co., Ltd.
Inventors: Yong Sang Park, Seok Joong Hwang
-
Patent number: 11188305
Abstract: A computation device includes: a data multiplexer configured to output first high-order data as first output data and fifth output data, output first low-order data as third output data and seventh output data, output second high-order data as second output data, output second low-order data as fourth output data, output third high-order data, which is high-order data having a second bit number out of third input data, as sixth output data, and output third low-order data, which is low-order data having the second bit number out of the third input data, as eighth output data when a mode signal indicates a second computation mode; and first to fourth multipliers each of which multiplies two output data.
Type: Grant
Filed: May 11, 2018
Date of Patent: November 30, 2021
Assignees: Preferred Networks, Inc., Riken
Inventors: Junichiro Makino, Takayuki Muranushi, Miyuki Tsubouchi, Ken Namura
-
Patent number: 11163532
Abstract: A method may include obtaining a set of multivariate quadratic polynomials associated with a multivariate quadratic problem and generating an Ising Model connection weight matrix “W” and an Ising Model bias vector “b” based on the multivariate quadratic polynomials. The method may also include providing the matrix “W” and the vector “b” to an annealing system configured to solve problems written according to the Ising Model and obtaining an output from the annealing system that represents a set of integers. The method may also include using the set of integers as a solution to the multivariate quadratic problem.
Type: Grant
Filed: January 18, 2019
Date of Patent: November 2, 2021
Assignee: FUJITSU LIMITED
Inventors: Hart Montgomery, Arnab Roy, Ryuichi Ohori, Toshiya Shimizu, Takeshi Shimoyama, Jumpei Yamaguchi
-
Patent number: 10983755
Abstract: A transcendental calculation unit includes a configuration table that stores a set of constants and provides a selected one of the constants, a power series multiplier that iteratively develops a power series, a coefficient series multiplier and accumulator that develops an accumulated product of the power series and the constant, and a round and normalize stage that rounds the accumulated product and normalizes the rounded product.
Type: Grant
Filed: July 22, 2020
Date of Patent: April 20, 2021
Inventor: Mitchell K. Alsup
-
Patent number: 10970044
Abstract: A semiconductor device for performing a sum-of-product computation and an operating method thereof are provided. The semiconductor device includes an inputting circuit, a scaling circuit, a computing memory and an outputting circuit. The inputting circuit is used for receiving a plurality of inputting signals. The inputting signals are voltages or currents. The scaling circuit is connected to the inputting circuit for transforming the inputting signals into a plurality of compensated signals respectively. The compensated signals are voltages or currents. The computing memory is connected to the scaling circuit. The computing memory includes a plurality of computing cells and the compensated signals are applied to the computing cells respectively. The outputting circuit is connected to the computing memory for reading an outputting signal of the computing cells. The outputting signal is a voltage or a current.
Type: Grant
Filed: May 9, 2019
Date of Patent: April 6, 2021
Assignee: MACRONIX INTERNATIONAL CO., LTD.
Inventors: Ming-Hsiu Lee, Chao-Hung Wang
-
Patent number: 10963265
Abstract: Examples described herein include systems and methods which include an apparatus comprising a plurality of configurable logic units and a plurality of switches, with each switch being coupled to at least one configurable logic unit of the plurality of configurable logic units. The apparatus further includes an instruction register configured to provide respective switch instructions of a plurality of switch instructions to each switch based on a computation to be implemented among the plurality of configurable logic units. For example, the switch instructions may include allocating the plurality of configurable logic units to perform the computation and activating an input of the switch and an output of the switch to couple at least a first configurable logic unit and a second configurable logic unit. In various embodiments, configurable logic units can include arithmetic logic units (ALUs), bit manipulation units (BMUs), and multiplier-accumulator units (MACs).
Type: Grant
Filed: April 21, 2017
Date of Patent: March 30, 2021
Assignee: Micron Technology, Inc.
Inventors: Fa-Long Luo, Tamara Schmitz, Jeremy Chritz, Jaime Cummins
-
Patent number: 10936939
Abstract: An operation processing apparatus includes a memory and a processor coupled to the memory. The processor executes an operation according to an operation instruction, acquires statistical information for a distribution of bits in fixed point data after an execution of an operation for the fixed point data according to an acquisition instruction, and outputs the statistical information to a register designated by the acquisition instruction.
Type: Grant
Filed: February 14, 2019
Date of Patent: March 2, 2021
Assignee: FUJITSU LIMITED
Inventors: Mitsuru Tomono, Makiko Ito
-
Patent number: 10929134
Abstract: A processor to facilitate acceleration of instruction execution is disclosed. The processor includes a plurality of execution units (EUs), each including an instruction decode unit to decode an instruction into one or more operands and opcode defining an operation to be performed at an accelerator, a register file having a plurality of registers to store the one or more operands and an accelerator having programmable hardware to retrieve the one or more operands from the register file and perform the operation on the one or more operands.
Type: Grant
Filed: June 28, 2019
Date of Patent: February 23, 2021
Assignee: Intel Corporation
Inventors: Radhakrishna Sripada, Peter Yiannacouras, Josh Triplett, Nagabhushan Chitlur, Kalyan Kondapally
-
Patent number: 10802826
Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
Type: Grant
Filed: September 29, 2017
Date of Patent: October 13, 2020
Assignee: Intel Corporation
Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
-
Patent number: 10747501
Abstract: Providing efficient floating-point operations using matrix processors in processor-based systems is disclosed. In this regard, a matrix-processor-based device provides a matrix processor comprising a positive partial sum accumulator and a negative partial sum accumulator. As the matrix processor processes pairs of floating-point operands, the matrix processor calculates an intermediate product based on a first floating-point operand and a second floating-point operand and determines a sign of the intermediate product. Based on the sign, the matrix processor normalizes the intermediate product with a partial sum fraction of the positive partial sum accumulator or the negative partial sum accumulator, then adds the intermediate product to the positive sum accumulator or the negative sum accumulator.
Type: Grant
Filed: August 30, 2018
Date of Patent: August 18, 2020
Assignee: Qualcomm Incorporated
Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Robert Dreyer, Colin Beaton Verrilli, Koustav Bhattacharya
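The sign-steered accumulation in this abstract can be illustrated at a high level. The sketch below is deliberately simplified: it keeps only the routing of each product to a positive or negative accumulator and the final combination, and it omits the fraction normalization step that the abstract describes. The function name `dot_with_signed_accumulators` is hypothetical.

```python
def dot_with_signed_accumulators(a, b):
    """Dot product using separate positive and negative partial sums.

    Simplification (assumption): ordinary Python floats stand in for the
    hardware accumulators, and no explicit normalization is modelled.
    """
    positive_sum = 0.0   # accumulates magnitudes of non-negative products
    negative_sum = 0.0   # accumulates magnitudes of negative products
    for x, y in zip(a, b):
        product = x * y
        if product >= 0:
            positive_sum += product
        else:
            negative_sum += -product
    # Only one subtraction is needed, after all products are accumulated.
    return positive_sum - negative_sum


print(dot_with_signed_accumulators([1.0, -2.0, 3.0], [4.0, 5.0, -6.0]))  # -24.0
```

Keeping each accumulator single-signed means every accumulation step is a magnitude addition, which is one plausible reading of why the abstract separates the two sums.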
-
Patent number: 10719296
Abstract: A device for generating sum-of-products data includes an array of variable resistance cells, variable resistance cells in the array each comprising a programmable threshold transistor and a resistor connected in parallel, the array including n columns of cells including strings of series-connected cells and m rows of cells. Control and bias circuitry are coupled to the array, including logic for programming the programmable threshold transistors in the array with thresholds corresponding to values of a weight factor Wmn for the corresponding cell. Input drivers are coupled to corresponding ones of the m rows of cells, the input drivers selectively applying inputs Xm to rows m. Column drivers are configured to apply currents In to corresponding ones of the n columns of cells. Voltage sensing circuits are operatively coupled to the columns of cells.
Type: Grant
Filed: January 17, 2018
Date of Patent: July 21, 2020
Assignee: MACRONIX INTERNATIONAL CO., LTD.
Inventors: Feng-Min Lee, Yu-Yu Lin
-
Patent number: 10705839
Abstract: A processor having a decoder to decode an instruction to generate a decoded instruction; a first source register to store a first plurality of packed signed bytes; a second source register to store a second plurality of packed signed bytes; execution circuitry to execute the decoded instruction, the execution circuitry including: multiplier circuitry to multiply each packed signed byte from the first source register with a corresponding packed signed byte from the second source register to generate temporary products, adder circuitry to add a plurality of sets of the temporary products to generate a plurality of temporary sums; negation and extension circuitry to negate and extend each of the temporary sums to doubleword sums; and accumulation circuitry to add each of the doubleword sums to a doubleword from a third source register to generate final doubleword results; and a packed data destination register to store the final doubleword results.
Type: Grant
Filed: December 21, 2017
Date of Patent: July 7, 2020
Assignee: Intel Corporation
Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal
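The multiply, add, negate, and accumulate sequence in this abstract can be traced with a short model. The sketch assumes four signed bytes per doubleword lane and wrap-around (non-saturating) 32-bit accumulation; neither detail is stated in the abstract. The names `wrap32` and `negated_dot_accumulate` are hypothetical.

```python
def wrap32(value):
    """Wrap an integer to a signed 32-bit doubleword (assumed behaviour)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def negated_dot_accumulate(src1, src2, acc, group=4):
    """Model of the packed signed-byte pipeline described above.

    src1, src2: lists of signed bytes (-128..127)
    acc:        list of signed doublewords from the third source register
    group:      bytes summed per doubleword lane (assumption: 4)
    """
    # Multiplier circuitry: one temporary product per byte pair.
    products = [a * b for a, b in zip(src1, src2)]
    results = []
    for lane, previous in enumerate(acc):
        # Adder circuitry: sum one set of temporary products.
        temp_sum = sum(products[lane * group:(lane + 1) * group])
        # Negation and extension, then accumulation with the doubleword.
        results.append(wrap32(previous + (-temp_sum)))
    return results


src1 = [1, 2, 3, 4, -1, -2, -3, -4]
src2 = [1, 1, 1, 1, 2, 2, 2, 2]
print(negated_dot_accumulate(src1, src2, [100, 0]))  # [90, 20]
```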
-
Patent number: 10552154
Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers. A method comprises: multiplying selected imaginary and real data elements in first and second source registers to generate a plurality of imaginary products; adding a first subset of the plurality of imaginary products to generate a first temporary result and adding a second subset of the plurality of imaginary products to generate a second temporary result; negating the first temporary result to generate a third temporary result and the second temporary result to generate a fourth temporary result; accumulating the third temporary result with first data to generate a first final result and accumulating the fourth temporary result with second data to generate a second final result; and storing the first final result and second final.
Type: Grant
Filed: September 29, 2017
Date of Patent: February 4, 2020
Assignee: Intel Corporation
Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
-
Patent number: 10482156
Abstract: A special-purpose, hardware-based accelerator may include an input subsystem configured to receive first and second vectors as operands of a full dot-product operation. The accelerator may also include a sparsity-aware dot-product engine communicatively coupled to the input subsystem and configured to perform adaptive dot-product processing by: (1) identifying, within the first and second vectors, at least one zero-value element and (2) executing, in response to identifying the zero-value element, a reduced dot-product operation that excludes, relative to the full dot-product operation, at least one mathematical operation in which the zero-value element is an operand. The accelerator may also include an output subsystem that is communicatively coupled to the sparsity-aware dot-product engine and configured to send a result of the reduced dot-product operation to a storage subsystem. Various other accelerators, computing systems, and methods are also disclosed.
Type: Grant
Filed: December 29, 2017
Date of Patent: November 19, 2019
Assignee: Facebook, Inc.
Inventors: Abdulkadir Utku Diril, Jong Soo Park, Nadav Rotem, Mikhail Smelyanskiy
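The reduced dot-product operation in this abstract amounts to skipping any multiply whose operand is zero. A minimal software sketch follows; the function name `sparse_dot` and the returned multiply count are illustrative additions, not part of the patent.

```python
def sparse_dot(a, b):
    """Dot product that skips element pairs containing a zero.

    Returns (result, multiplies_performed) so the saving relative to
    the full dot product is visible. Both names are hypothetical.
    """
    total = 0
    performed = 0
    for x, y in zip(a, b):
        if x == 0 or y == 0:
            continue          # product would be zero, so exclude the operation
        total += x * y
        performed += 1
    return total, performed


a = [1, 0, 3, 0]
b = [5, 6, 0, 8]
print(sparse_dot(a, b))  # (5, 1): one multiply instead of four
```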
-
Patent number: 10459876
Abstract: A processing element (PE) of a systolic array can perform neural network computations in parallel on two or more sequential data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated in parallel. Based on the size of the input data set and an input data type, the systolic array can process a single data element or multiple data elements in parallel.
Type: Grant
Filed: January 31, 2018
Date of Patent: October 29, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Dana Michelle Vantrease, Ron Diamant
-
Patent number: 10249356
Abstract: A method of obtaining a dot product includes applying a programming signal to a number of capacitive memory devices coupled at a number of junctions formed between a number of row lines and a number of column lines. The programming signal defines a number of values within a matrix. The method further includes applying a vector signal. The vector signal defines a number of vector values to be applied to the capacitive memory devices.
Type: Grant
Filed: October 28, 2014
Date of Patent: April 2, 2019
Assignee: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventors: Ning Ge, John Paul Strachan, Jianhua Yang, Miao Hu
-
Patent number: 10228911
Abstract: An apparatus includes a plurality of arithmetic logic units each having an accumulator and an integer arithmetic unit that receives and performs integer arithmetic operations on integer inputs and accumulates integer results of a series of the integer arithmetic operations into the accumulator as an integer accumulated value. A register is programmable with an indication of a number of fractional bits of the integer accumulated values and an indication of a number of fractional bits of integer outputs. A first bit width of the accumulator is greater than twice a second bit width of the integer outputs. A plurality of adjustment units scale and saturate the first bit width integer accumulated values to generate the second bit width integer outputs based on the indications of the number of fractional bits of the integer accumulated values and outputs programmed into the register.
Type: Grant
Filed: April 5, 2016
Date of Patent: March 12, 2019
Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
Inventors: G. Glenn Henry, Terry Parks
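The scale-and-saturate adjustment in this abstract can be sketched as a shift followed by a clamp. The example assumes signed two's-complement values and truncating (floor) shifts; the abstract does not specify a rounding mode, and the name `scale_and_saturate` and its parameters are hypothetical.

```python
def scale_and_saturate(acc, acc_frac_bits, out_frac_bits, out_width):
    """Convert a wide fixed-point accumulator to a narrower output.

    acc:           integer accumulated value
    acc_frac_bits: number of fractional bits in the accumulator
    out_frac_bits: number of fractional bits wanted in the output
    out_width:     bit width of the signed output
    Assumption: scaling truncates toward negative infinity.
    """
    shift = acc_frac_bits - out_frac_bits
    # Scale: align the binary point of the accumulator with the output.
    scaled = acc >> shift if shift >= 0 else acc << -shift
    # Saturate: clamp to the representable signed range of the output.
    highest = (1 << (out_width - 1)) - 1
    lowest = -(1 << (out_width - 1))
    return max(lowest, min(highest, scaled))


print(scale_and_saturate(800, 8, 4, 8))    # 50: fits in 8 bits after scaling
print(scale_and_saturate(4660, 8, 4, 8))   # 127: saturated at the upper limit
print(scale_and_saturate(-4660, 8, 4, 8))  # -128: saturated at the lower limit
```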
-
Patent number: 10055383
Abstract: A circuit is provided. In an example, the circuit includes a memory array that includes a plurality of memory cells to store a matrix and a plurality of data lines coupled to the plurality of memory cells to provide a first set of values of the matrix. The circuit includes a multiplier coupled to the plurality of data lines to multiply the first set of values by a second set of values to produce a third set of values. A summing unit is included that is coupled to the multiplier to sum the third set of values to produce a sum. The circuit includes a shifting unit coupled to the summing unit to shift the sum and to add the shifted sum to a running total.
Type: Grant
Filed: April 28, 2017
Date of Patent: August 21, 2018
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Ali Shafiee Ardestani, Naveen Muralimanohar
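The multiply, sum, shift, and add flow in this abstract resembles a bit-serial dot product. The sketch below rests on an assumption the abstract does not state: that the second set of values is one bit slice of an unsigned input vector per step, so the shift amount equals the bit position. The name `shift_add_dot` is hypothetical.

```python
def shift_add_dot(matrix_row, vector, bits=8):
    """Dot product of a stored matrix row with a vector, one bit slice at a time.

    Assumption: `vector` holds unsigned integers that fit in `bits` bits,
    and each step multiplies the row by a single bit of every vector entry.
    """
    running_total = 0
    for bit in range(bits):
        # Second set of values: the current bit of each vector element.
        slice_bits = [(v >> bit) & 1 for v in vector]
        # Multiplier: first set (matrix values) times second set (bits).
        products = [m * s for m, s in zip(matrix_row, slice_bits)]
        # Summing unit: add the third set of values.
        partial_sum = sum(products)
        # Shifting unit: shift by the bit position, add to the running total.
        running_total += partial_sum << bit
    return running_total


print(shift_add_dot([3, 5, 7], [2, 4, 6]))  # 68, same as 3*2 + 5*4 + 7*6
```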
-
Patent number: 9893742
Abstract: An information processing method for a computer for data compression, the method includes: performing projective transformation of a first numeric string corresponding to an input signal into a second numeric string which contains more components than the first numeric string and having a sum of squares of components as a predetermined value by using a plurality of projective parameters; and generating a bit string in which bits indicating positive and negative signs of the respective components of an operation result obtained by a vector product operation of the second numeric string obtained by the projective transformation and an observation matrix are arranged.
Type: Grant
Filed: September 12, 2017
Date of Patent: February 13, 2018
Assignee: FUJITSU LIMITED
Inventor: Yui Noma
-
Patent number: 9710228
Abstract: Embodiments disclosed pertain to apparatuses, systems, and methods for performing multi-precision single instruction multiple data (SIMD) operations on integer, fixed point and floating point operands. Disclosed embodiments pertain to a circuit that is capable of performing concurrent multiply, fused multiply-add, rounding, saturation, and dot products on the above operand types. In addition, the circuit may facilitate 64-bit multiplication when Newton-Raphson, divide and square root operations are performed.
Type: Grant
Filed: December 29, 2014
Date of Patent: July 18, 2017
Assignee: Imagination Technologies Limited
Inventor: Leonard Rarick
-
Patent number: 9639362
Abstract: An integrated circuit device comprising at least one instruction processing module arranged to receive a bit-manipulation instruction, and in response to receiving the bit-manipulation instruction to select at least one bit from at least one source data register in accordance with a value of at least one control bit, select from candidate values a manipulation value for the at least one selected bit in accordance with a value of at least one further control bit, and store the selected manipulation value for the at least one selected bit in at least one output data register.
Type: Grant
Filed: March 30, 2011
Date of Patent: May 2, 2017
Assignee: NXP USA, INC.
Inventors: Noam Eshel-Goldman, Aviram Amir, Itzhak Barak, Amir Kleen
-
Patent number: 9513912
Abstract: Methods and controllers for executing an instruction set are provided. In one such method, executing an instruction set includes executing an instruction of one type in the instruction set, executing a context switch instruction, and executing an instruction of a second type in the instruction set. In one such controller, a single machine executes instructions in an instruction set with instructions having an operational code, and instructions that do not have an operational code.
Type: Grant
Filed: July 27, 2012
Date of Patent: December 6, 2016
Assignee: Micron Technology, Inc.
Inventors: Luca De Santis, Maria-Luisa Gallese, Emanuele Sirizotti, Walter Di-Francesco
-
Patent number: 9411554
Abstract: A signed multiplier circuit includes a two-dimensional array of substantially similar logic blocks. Each of the logic blocks is programmable to implement any of four multiply functions of first and second inputs, in which: the first and second inputs are both signed; the first and second inputs are both unsigned; the first input is signed and the second input is unsigned; and the first input is unsigned and the second input is signed. Each logic block includes rows and columns of sub-circuits, e.g., logical AND gates and full adders. One row and one column of each logic block include a programmably invertible AND gate, with the row and column being independently controlled. The ability to program the logic block to perform all four of these functions enables the combination of rows and columns of the logic blocks to build large signed multipliers of virtually any size.
Type: Grant
Filed: April 2, 2009
Date of Patent: August 9, 2016
Assignee: XILINX, INC.
Inventors: Steven P. Young, Brian C. Gaide
-
Patent number: 9280633
Abstract: A method of designing a content-addressable memory (CAM) includes associating a plurality of CAM cells with a summary circuit. The summary circuit includes a first level of logic gates and a second level of logic gates. The first level of logic gates have inputs each configured to receive an output of a corresponding one of the plurality of CAM cells. The second level of logic gates have inputs each configured to receive an output of a corresponding one of the first level of logic gates. Logic gates in at least one of the first level of logic gates or the second level of logic gates are selected to have an odd number of input pins so that an input pin and an output pin share a layout sub-slot.
Type: Grant
Filed: May 16, 2014
Date of Patent: March 8, 2016
Assignee: TAIWAN SEMICONDUCTOR MANUFACTURING COMPANY, LTD.
Inventors: Young Seog Kim, Kuoyuan Hsu, Jacklyn Chang
-
Patent number: 9207909
Abstract: Polynomial circuitry includes a respective partial product generator for each bit position of each term of a plurality of terms of a polynomial to be evaluated. A respective plurality of adders for each bit position adds partial products of a respective bit position across all of the plurality of terms to provide a respective bit-slice sum. Resulting bit-slice sums are offset from one another according to their respective bit positions. A final adder adds together the respective offset bit-slice sums to provide a result.
Type: Grant
Filed: March 8, 2013
Date of Patent: December 8, 2015
Assignee: Altera Corporation
Inventor: Martin Langhammer
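The bit-slice arrangement in this abstract can be illustrated in software. The following is only a minimal sketch, assuming each term is a product of two unsigned integers and that the first operand fits in `width` bits; the function name and parameters are hypothetical and do not come from the patent.

```python
def bit_slice_sum_of_products(terms, width):
    """Evaluate sum(x * y for x, y in terms) one bit position at a time.

    Assumes unsigned operands with every x < 2**width.
    """
    slice_sums = []
    for j in range(width):
        # Partial product of each term at bit position j: y if bit j of x is set.
        # Adding them across all terms gives the bit-slice sum for position j.
        slice_sum = sum(y if (x >> j) & 1 else 0 for x, y in terms)
        slice_sums.append(slice_sum)
    # Offset each bit-slice sum by its bit position, then do the final addition.
    return sum(s << j for j, s in enumerate(slice_sums))
```

Grouping by bit position first means only one shift per bit position is needed, rather than one per partial product.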
-
Patent number: 9170775
Abstract: A multiplier-accumulator (MAC) block can be programmed to operate in one or more modes. When the MAC block implements at least one multiply-and-accumulate operation, the accumulator value can be zeroed without introducing clock latency or initialized in one clock cycle. To zero the accumulator value, the most significant bits (MSBs) of data representing zero can be input to the MAC block and sent directly to the add-subtract-accumulate unit. Alternatively, dedicated configuration bits can be set to clear the contents of a pipeline register for input to the add-subtract-accumulate unit.
Type: Grant
Filed: January 7, 2010
Date of Patent: October 27, 2015
Assignee: Altera Corporation
Inventors: Leon Zheng, Martin Langhammer, Nitin Prasad, Greg Starr, Chiao Kai Hwang, Kumara Tharmalingam
-
Patent number: 8930435
Abstract: A method for computation, including defining a sequence of n bits that encodes an exponent d, such that no more than a specified number of successive bits in the sequence are the same, initializing first and second registers using a value of a base x that is to be exponentiated, whereby the first and second registers hold respective first and second values, which are successively updated during the computation, successively, for each bit in the sequence computing a product of the first and second values, depending on whether the bit is one or zero, selecting one of the first and second registers, and storing the product in the selected one of the registers, whereby the first and second registers hold respective first and second final values upon completion of the sequence, and returning x^d based on the first and second final values. Related apparatus and methods are also described.
Type: Grant
Filed: September 21, 2010
Date of Patent: January 6, 2015
Assignee: Cisco Technology Inc.
Inventors: Yaacov Belenky, Zeev Geyzel
-
Patent number: 8805916
Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect).
Type: Grant
Filed: March 3, 2009
Date of Patent: August 12, 2014
Assignee: Altera Corporation
Inventors: Martin Langhammer, Yi-Wen Lin, Keone Streicher
-
Patent number: 8793300
Abstract: A circuit for calculating a sum of products, each product having a q-bit binary operand and a k-bit binary operand, where k is a multiple of q, includes a q-input carry-save adder (CSA); a multiplexer (10) by input of the adder, having four k-bit channels respectively receiving the value 0, a first (Yi) of the k-bit operands, the second k-bit operand (M[63:0], mi), and the sum of the two k-bit operands, the output of a multiplexer of rank t (where t is between 0 and q-1) being taken into account by the adder with a t-bit left shift; and each multiplexer having first and second path selection inputs, the bits of a first of the q-bit operands being respectively supplied to the first selection inputs, and the bits of the second q-bit operand being respectively supplied to the second selection inputs.
Type: Grant
Filed: April 11, 2012
Date of Patent: July 29, 2014
Assignee: INSIDE Secure
Inventor: Michael Niel
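The multiplexer selection described in this abstract can be modeled in a few lines. This is a hedged sketch rather than the patented circuit: it assumes unsigned operands, replaces the carry-save adder with ordinary integer addition, and uses hypothetical names (`mux_sum_of_products`, `a`, `b`, `y`, `m`) that are not taken from the patent.

```python
def mux_sum_of_products(a, y, b, m, q):
    """Compute a*y + b*m, where a and b are q-bit unsigned operands.

    For each rank t, a four-channel multiplexer picks 0, y, m, or y + m
    according to bit t of a and bit t of b; the pick is added with a
    t-bit left shift.
    """
    # The four channels; y + m is computed once and shared by all ranks.
    channels = (0, y, m, y + m)
    total = 0
    for t in range(q):
        a_t = (a >> t) & 1  # first selection input of the rank-t multiplexer
        b_t = (b >> t) & 1  # second selection input of the rank-t multiplexer
        selected = channels[a_t | (b_t << 1)]
        total += selected << t
    return total
```

Because `y + m` is precomputed, both products are accumulated with q additions instead of 2q.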
-
Patent number: 8649508
Abstract: A system and method for implementing the Elliptic Curve scalar multiplication method in cryptography, where the Double Base Number System is expressed in decreasing order of exponents and further on using it to determine Elliptic curve scalar multiplication over a finite elliptic curve.
Type: Grant
Filed: September 29, 2008
Date of Patent: February 11, 2014
Assignee: Tata Consultancy Services Ltd.
Inventor: Natarajan Vijayarangan
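For readers unfamiliar with the Double Base Number System, the sketch below shows what such a representation looks like: an integer written as a sum of terms of the form 2^a * 3^b. It uses the generic greedy expansion, which is an assumption made for illustration and not the method claimed in the patent; in particular, the ordering of exponents mentioned in the abstract is not enforced, and no elliptic curve arithmetic is included. All function names are hypothetical.

```python
def largest_2_3_term(k):
    """Return (value, a, b) with value = 2**a * 3**b the largest such value <= k (k >= 1)."""
    best = (1, 0, 0)
    power_of_3, b = 1, 0
    while power_of_3 <= k:
        # Largest power of two that still fits next to this power of three.
        a = (k // power_of_3).bit_length() - 1
        value = power_of_3 << a
        if value > best[0]:
            best = (value, a, b)
        power_of_3 *= 3
        b += 1
    return best


def greedy_dbns(k):
    """Greedy double-base expansion: list of (a, b) with sum(2**a * 3**b) == k."""
    terms = []
    while k > 0:
        value, a, b = largest_2_3_term(k)
        terms.append((a, b))
        k -= value
    return terms
```

In a scalar multiplication setting, each term of such an expansion would correspond to a run of point doublings and triplings followed by one point addition, which is why short expansions are attractive.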
-
Patent number: 8645450
Abstract: Multiplier-accumulator circuitry includes circuitry for forming a plurality of partial products of multiplier and multiplicand inputs, carry-save adder circuitry for adding together the partial products and another input to produce intermediate sum and carry outputs, final adder circuitry for adding together the intermediate sum and carry outputs to produce a final output, and feedback circuitry for applying the final output (typically after some delay, e.g., due to registration of the final output) to the carry-save adder circuitry as said another input. The above circuitry may be implemented in so-called “hard IP” (intellectual property) of a field-programmable gate array (“FPGA”) integrated circuit device. If desired, any overflow from the accumulation performed by the above circuitry may be accumulated in “soft” accumulator-overflow circuitry that is implemented in the general-purpose programmable logic of the FPGA.
Type: Grant
Filed: March 2, 2007
Date of Patent: February 4, 2014
Assignee: Altera Corporation
Inventors: Kok Heng Choe, Tony K Ngai, Henry Y. Lui
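The data flow in this abstract (partial products and the fed-back accumulator reduced by carry-save addition, followed by one final carry-propagating add) can be sketched as follows. This is an illustrative model only, assuming unsigned operands and unbounded Python integers, so the overflow handling mentioned in the abstract is not modeled; the function names are hypothetical.

```python
def carry_save_add(x, y, z):
    """3:2 compression: return (sum_word, carry_word) with sum_word + carry_word == x + y + z."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word, carry_word


def multiply_accumulate(acc, a, b, width):
    """Return acc + a*b for unsigned a < 2**width using carry-save reduction."""
    # Partial products: b shifted by j for every set bit j of a.
    operands = [b << j for j in range(width) if (a >> j) & 1]
    # The previous final output is fed back as one more carry-save input.
    operands.append(acc)
    sum_word, carry_word = 0, 0
    for operand in operands:
        sum_word, carry_word = carry_save_add(sum_word, carry_word, operand)
    # Final adder: the only carry-propagating addition in the whole operation.
    return sum_word + carry_word
```

A running sum of products is then obtained by passing each result back in as `acc` for the next call.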
-
Patent number: 8543634
Abstract: A specialized processing block such as a DSP block may be enhanced by including direct connections that allow the block output to be directly connected to either the multiplier inputs or the adder inputs of another such block. A programmable integrated circuit device may include a plurality of such specialized processing blocks. The specialized processing block includes a multiplier having two multiplicand inputs and a product output, an adder having as one adder input the product output of the multiplier, and having a second adder input and an adder output, a direct-connect output of the adder output to a first other one of the specialized processing blocks, and a direct-connect input from a second other one of the specialized processing blocks. The direct-connect input connects a direct-connect output of that second other one of the specialized processing blocks to a first one of the multiplicand inputs.
Type: Grant
Filed: March 30, 2012
Date of Patent: September 24, 2013
Assignee: Altera Corporation
Inventors: Lei Xu, Volker Mauer, Steven Perry
-
Patent number: 8463837
Abstract: A method and apparatus for performing bi-linear interpolation and motion compensation including multiply-add operations and byte shuffle operations on packed data in a processor. In one embodiment, two or more lines of 2n+1 content byte elements may be shuffled to generate a first and second packed data respectively including at least a first and a second 4n byte elements including 2n-1 duplicated elements. A third packed data including sums of products is generated from the first packed data and packed byte coefficients by a multiply-add instruction. A fourth packed data including sums of products is generated from the second packed data and elements and packed byte coefficients by another multiply-add instruction. Corresponding sums of products of the third and fourth packed data are then summed, and may be rounded and averaged.
Type: Grant
Filed: October 17, 2003
Date of Patent: June 11, 2013
Assignee: Intel Corporation
Inventors: Yen-Kuang Chen, Minerva M. Yeung
-
Patent number: 8457309
Abstract: Apparatus for ciphering, including a non-volatile memory, which stores a number from which a private cryptographic key, having a complementary public cryptographic key, is derivable, wherein the number is shorter than the private cryptographic key, and a processor, which is configured to receive an instruction indicating that the private cryptographic key is to be applied to data and, responsively to the instruction, to compute the private cryptographic key using the stored number and to perform a cryptographic operation on the data using the private cryptographic key. Related apparatus and methods are also described.
Type: Grant
Filed: June 28, 2010
Date of Patent: June 4, 2013
Assignee: Cisco Technology, Inc.
Inventors: Yaacov Belenky, Yaakov (Jordan) Levy
-
Publication number: 20130097212
Abstract: Disclosed are new approaches to multi-dimensional filtering with a reduced number of memory reads and writes. In one embodiment, a filter includes first and second coefficients. A block of data having width and height each equal to the number of one of the first or second coefficients is read from a memory device. Arrays of values from the block are filtered using the first filter coefficients and the results filtered using the second coefficients. The final result may be optionally blended with another data value and written to a memory device. Registers store results of filtering with the first coefficients. The block of data may be read from a location including a source coordinate. The final result of filtering may be written to a destination coordinate obtained by rotating and/or mirroring the source coordinate. The orientation of arrays filtered using the first coefficients varies according to a rotation mode.
Type: Application
Filed: October 14, 2011
Publication date: April 18, 2013
Applicant: Vivante Corporation
Inventors: Mike M. Cai, Huiming Zhang
-
Publication number: 20130054666
Abstract: A method for predicting a value for a length of a future time interval in which a physical variable changes is described, in which at least one measured value for the length of a past time interval and an instantaneously measured value for a length of an instantaneous time interval are taken into account, m values for lengths of past time intervals being added. A first value precedes the instantaneously measured value by k-1, and an mth value precedes the instantaneously measured value by k-m. The m added values are divided by a value for a length of a past time interval which precedes the instantaneously measured value by k. A ratio of the mentioned values is formed. For determining the value to be predicted, an average error is initially added to the instantaneously measured value, forming a sum. The formed ratio is subsequently applied to this sum.
Type: Application
Filed: January 12, 2011
Publication date: February 28, 2013
Inventors: Eberhard Boehl, Bernd Becker, Bernard Pawlok