Patents Examined by Matthew D Sandifer
-
Patent number: 11568022
Abstract: Detailed are embodiments related to bit matrix multiplication in a processor. For example, in some embodiments a processor comprising: decode circuitry to decode an instruction having fields for an opcode, an identifier of a first source bit matrix, an identifier of a second source bit matrix, an identifier of a destination bit matrix, and an immediate; and execution circuitry to execute the decoded instruction to perform a multiplication of a matrix of S-bit elements of the identified first source bit matrix with S-bit elements of the identified second source bit matrix, wherein the multiplication and accumulation operations are selected by the operation selector and store a result of the matrix multiplication into the identified destination bit matrix, wherein S indicates a plural bit size is described.
Type: Grant
Filed: January 22, 2021
Date of Patent: January 31, 2023
Assignee: Intel Corporation
Inventors: Dmitry Y. Babokin, Kshitij A. Doshi, Vadim Sukhomlinov
-
Patent number: 11567730
Abstract: A planar fabrication charge transfer capacitor for coupling charge from a Unit Element (UE) generates a positive charge first output V_PP and a positive charge second output V_NP, the first output coupled to a positive charge line comprising a continuous first planar conductor, a continuous second planar conductor parallel to the first planar conductor, and a continuous third planar conductor parallel to the first planar conductor and second planar conductor, the charge transfer capacitor comprising, in sequence: a first co-planar conductor segment, the first planar conductor, a second co-planar conductor segment, the second planar conductor, a third co-planar conductor segment, the third planar conductor, and a fourth coplanar conductor segment, the first and third coplanar conductor segments capacitively edge coupled to the UE first output V_PP, the second and fourth coplanar conductor segments capacitively edge coupled to the UE second output V_NP.
Type: Grant
Filed: January 31, 2021
Date of Patent: January 31, 2023
Assignee: Ceremorphic, Inc.
Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
-
Patent number: 11561768
Abstract: Provided are a method and system for using a non-linear feedback shift register (NLFSR) for generating a pseudo-random sequence. The method may include generating, for an n-stage NLFSR that requires more than two taps to generate a maximal length pseudo-random sequence, a pseudo-random sequence using a feedback logical operation of only a first logic gate and a second logic gate. Two non-end taps suitable for providing an at least near-maximal length pseudo-random sequence are inputs for the first logic gate, an output of the first logic gate and an end tap are inputs for the second logic gate, and an output of the second logic gate is used as feedback to a first stage of the n-stage NLFSR.
Type: Grant
Filed: April 27, 2021
Date of Patent: January 24, 2023
Assignee: International Business Machines Corporation
Inventor: Andrew Johnson
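The two-gate feedback path described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented design: the gate types (AND for the first gate, XOR for the second), the tap positions, the register width, and the names `nlfsr_step` and `cycle_length` are all hypothetical choices made for the example.

```python
def nlfsr_step(state, n, tap_a, tap_b):
    """Advance an n-stage register by one step.

    state holds the register as an integer: bit 0 is the first stage and
    bit n-1 is the end stage. tap_a and tap_b are non-end tap positions
    (0 <= tap < n-1). Gate types are assumptions for this sketch.
    """
    a = (state >> tap_a) & 1
    b = (state >> tap_b) & 1
    end = (state >> (n - 1)) & 1
    gate1 = a & b            # first logic gate (assumed AND) on two non-end taps
    feedback = gate1 ^ end   # second logic gate (assumed XOR) with the end tap
    # Shift every stage toward the end and feed the result into the first stage.
    return ((state << 1) | feedback) & ((1 << n) - 1)


def cycle_length(seed, n, tap_a, tap_b):
    """Return the length of the state cycle that the register enters from seed."""
    seen = {}
    state = seed
    step = 0
    while state not in seen:
        seen[state] = step
        state = nlfsr_step(state, n, tap_a, tap_b)
        step += 1
    return step - seen[state]
```

Calling `cycle_length(1, 4, 1, 2)` measures how long the sequence runs before repeating for one hypothetical tap choice; trying different tap pairs is one way to look for a near-maximal length.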
-
Patent number: 11550545
Abstract: Techniques related to a low-power, low-memory multiply and accumulate (MAC) unit are described. In an example, the MAC unit performs a MAC operation that represents a multiplication of numbers. At least the bit representation of one number is compressed based on a quantization and a clustering of quantization values, whereby index bits are used instead of the actual bit representation. The index bits are loaded in an index buffer and a bit representation of another number is loaded in an input buffer. The index bits are used in a lookup to determine whether the corresponding bit representation and shift operations are applied to the input buffer based on this bit representation, followed by accumulation operations.
Type: Grant
Filed: November 6, 2020
Date of Patent: January 10, 2023
Assignee: SK hynix Inc.
Inventors: Fan Zhang, Aman Bhatia
-
Patent number: 11550543
Abstract: A semiconductor memory device includes a plurality of memory bank groups configured to be accessed in parallel; an internal memory bus configured to receive external data from outside the plurality of memory bank groups; and a first computation circuit configured to receive internal data from a first memory bank group of the plurality of memory bank groups during each first period of a plurality of first periods, receive the external data through the internal memory bus during each second period of a plurality of second periods, the second period being shorter than the first period, and perform a processing in memory (PIM) arithmetic operation on the internal data and the external data during each second period.
Type: Grant
Filed: November 21, 2019
Date of Patent: January 10, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Shinhaeng Kang, Seongil O
-
Patent number: 11550548
Abstract: Briefly, example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to facilitate and/or support one or more operations and/or techniques for an autonomous pseudo-random seed generator (APRSG) for embedded computing devices, which may include IoT-type devices, such as implemented in connection with one or more computing and/or communication networks and/or protocols.
Type: Grant
Filed: March 31, 2020
Date of Patent: January 10, 2023
Assignee: Arm Limited
Inventors: Andrew Neil Sloss, Christopher Neal Hinds, Hannah Marie Peeler, Gary Dale Carpenter
-
Patent number: 11544037
Abstract: An improved electronic mixed mode multiplier and accumulate circuit for artificial intelligence and computing system applications that perform vector-vector, vector-matrix and other multiply-accumulate computations. The circuit provided is a high resolution, high linearity, low area, low power multiply-accumulate (MAC) unit to interface with a memory device for storing computation output results. The MAC unit uses fewer current carrying elements, resulting in much lower integrated circuit area, and provides a tight matching between the current elements, thus preserving inherent linearity requirements due to current mode operation. Further, the MAC performs current scaling using switches and current division, where the current switches occupy minimum size transistors requiring a small area to implement, which renders it compatible with MRAM such as a magnetic tunnel junction device.
Type: Grant
Filed: April 30, 2020
Date of Patent: January 3, 2023
Assignee: International Business Machines Corporation
Inventors: Sudipto Chakraborty, Rajiv Joshi
-
Patent number: 11544349
Abstract: A method for implementing a neural network system in an integrated circuit includes presenting digital pulses to word line inputs of a matrix vector multiplier including a plurality of word lines, the word lines forming intersections with a plurality of summing bit lines, a programmable Vt transistor at each intersection having a gate connected to the intersecting word line, a source connected to a fixed potential and a drain connected to the intersecting summing bit line, each digital pulse having a pulse width proportional to an analog quantity. During a charge collection time frame charge collected on each of the summing bit lines from current flowing in the programmable Vt transistor is summed. During a pulse generating time frame digital pulses are generated having pulse widths proportional to the amount of charge that was collected on each summing bit line during the charge collection time frame.
Type: Grant
Filed: April 15, 2021
Date of Patent: January 3, 2023
Assignee: Microsemi SoC Corp.
Inventors: John L. McCollum, Jonathan W. Greene, Gregory William Bakker
-
Patent number: 11537361
Abstract: A processing unit and a method for multiplying at least two multiplicands. The multiplicands are present in an exponential notation, that is, each multiplicand is assigned an exponent and a base. The processing unit is configured to carry out a multiplication of the multiplicands and includes at least one bitshift unit, the bitshift unit shifting a binary number a specified number of places, in particular, to the left; an arithmetic unit, which carries out an addition of two input variables and a subtraction of two input variables; and a storage device. A computer program, which is configured to execute the method, and a machine-readable storage element, in which the computer program is stored, are also described.
Type: Grant
Filed: May 21, 2019
Date of Patent: December 27, 2022
Assignee: Robert Bosch GmbH
Inventor: Sebastian Vogel
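The general idea of multiplying numbers held in exponential notation with only an adder and a shifter can be illustrated in its simplest case. The sketch below assumes both multiplicands share base 2 and have non-negative integer exponents; the abstract does not state these restrictions, and the function name `multiply_powers_of_two` is hypothetical.

```python
def multiply_powers_of_two(exp_a, exp_b):
    """Return 2**exp_a * 2**exp_b using one addition and one left shift.

    Assumption for this sketch: base 2 and non-negative integer exponents.
    """
    total = exp_a + exp_b          # arithmetic unit: add the exponents
    if total < 0:
        raise ValueError("this sketch only handles non-negative exponent sums")
    return 1 << total              # bitshift unit: shift 1 left by the sum


product = multiply_powers_of_two(3, 4)  # 2**3 * 2**4
```

The multiplication itself never happens: adding exponents replaces it, which is why a shifter and an adder are sufficient in this restricted case.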
-
Patent number: 11526581
Abstract: A method of performing matrix computations includes receiving a compression-encoded matrix including a plurality of rows. Each row of the compression-encoded matrix has a plurality of defined element values and, for each such defined element value, a schedule tag indicating a schedule for using the defined element value in a scheduled matrix computation. The method further includes loading the plurality of rows of the compression-encoded matrix into a corresponding plurality of work memory banks, and providing decoded input data to a matrix computation module configured for performing the scheduled matrix computation. For each work memory bank, a next defined element value and a corresponding schedule tag are read. If the schedule tag meets a scheduling condition, the next defined element value is provided to the matrix computation module. Otherwise, a default element value is provided to the matrix computation module.
Type: Grant
Filed: October 30, 2020
Date of Patent: December 13, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shuayb M. Zarar, Amol Ashok Ambardekar, Jun Zhang
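The per-bank read loop in this abstract resembles decoding a sparse row on the fly. The sketch below is one possible reading, with assumptions labeled: the schedule tag is taken to be the step index at which the value is due, the scheduling condition is taken to be "tag equals the current step", and the default element value is zero. The name `decode_row` and the `(tag, value)` tuple layout are hypothetical.

```python
def decode_row(encoded_row, num_steps, default=0):
    """Expand one compression-encoded row into the values fed to the compute module.

    encoded_row is a list of (schedule_tag, value) pairs sorted by tag.
    Assumption: a tag meets the scheduling condition when it equals the step.
    """
    output = []
    position = 0  # index of the next defined element in this bank
    for step in range(num_steps):
        if position < len(encoded_row) and encoded_row[position][0] == step:
            # Scheduling condition met: provide the defined element value.
            output.append(encoded_row[position][1])
            position += 1
        else:
            # Otherwise provide the default element value.
            output.append(default)
    return output


dense = decode_row([(1, 5), (3, 7)], num_steps=5)
```

Only the defined values and their tags are stored, so the bank never holds the default entries explicitly.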
-
Patent number: 11520853
Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
Type: Grant
Filed: February 28, 2020
Date of Patent: December 6, 2022
Assignee: Meta Platforms, Inc.
Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
-
Patent number: 11520856
Abstract: In a system for performing tensor decomposition in a selective expansive and/or recursive manner, a tensor is decomposed into a specified number of components, and one or more tensor components are selected for further decomposition. For each selected component, the significant elements thereof are identified, and using the indices of the significant elements a sub-tensor is formed. In a subsequent iteration, each sub-tensor is decomposed into a respective specified number of components. Additional sub-tensors corresponding to the components generated in the subsequent iteration are formed, and these additional sub-tensors may be decomposed further in yet another iteration, until no additional components are selected. The mode of a sub-tensor can be decreased or increased prior to decomposition thereof. Components likely to reveal information about the data stored in the tensor can be selected for decomposition.
Type: Grant
Filed: November 2, 2020
Date of Patent: December 6, 2022
Assignee: Qualcomm Incorporated
Inventors: Muthu M. Baskaran, David Bruns-Smith, James Ezick, Richard A. Lethin
-
Patent number: 11520562
Abstract: A method comprising storing a plurality of entries, each entry of the plurality of entries associated with a portion of a range of input values, each entry of the plurality of entries comprising a set of coefficients defining a power series approximation; selecting a first entry of the plurality of entries based on a determination that a floating point input value is within a portion of the range of input values that is associated with the first entry; and calculating an output value by evaluating the power series approximation defined by the set of coefficients of the first entry at the floating point input value.
Type: Grant
Filed: August 30, 2019
Date of Patent: December 6, 2022
Assignee: Intel Corporation
Inventors: Brian J. Hickmann, Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin
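A table of per-segment coefficients followed by a polynomial evaluation can be sketched in a few lines. Everything specific here is an assumption made for illustration: the target function (`exp`), the range [0, 1), four equal segments, second-order Taylor coefficients expanded around each segment's center, and the names `build_exp_table` and `approx_exp`. The abstract does not say how the coefficients are derived or how the entry is selected.

```python
import math


def build_exp_table(lo, hi, segments):
    """Store one entry per segment: (center, coefficients of a power series)."""
    width = (hi - lo) / segments
    table = []
    for i in range(segments):
        center = lo + (i + 0.5) * width
        e = math.exp(center)
        # Taylor coefficients of exp around the center: e, e, e/2.
        table.append((center, (e, e, e / 2.0)))
    return table, width


def approx_exp(x, table, lo, width):
    """Select the entry whose segment contains x, then evaluate its series."""
    index = math.floor((x - lo) / width)
    if not 0 <= index < len(table):
        raise ValueError("input outside the tabulated range")
    center, coeffs = table[index]
    d = x - center
    result = 0.0
    for c in reversed(coeffs):   # Horner evaluation in powers of (x - center)
        result = result * d + c
    return result


table, width = build_exp_table(0.0, 1.0, 4)
estimate = approx_exp(0.3, table, 0.0, width)
```

Narrower segments or more coefficients per entry both reduce the error; the trade-off is table size against the number of multiply-add steps.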
-
Patent number: 11509291
Abstract: A zero-insertion FIR filter architecture for filtering a signal with a target band and a secondary band. Digital filter circuitry includes an L-tap FIR (finite impulse response) filter, with a number L filter tap elements (L=0, 1, 2, . . . (L-1)), each with an assigned coefficient from a defined coefficient sequence. The L-tap FIR filter is configurable with a defined zero-insertion coefficient sequence of a repeating sub-sequence of a nonzero coefficient followed by one or more zero-inserted coefficients, with a number Nj of nonzero coefficients, and a number Nk of zero-inserted coefficients, so that L=Nj+Nk. The L-tap FIR filter is configurable as an M-tap FIR filter with a nonzero coefficient sequence in which each of the L filter tap elements is assigned a non-zero coefficient, the M-tap FIR filter having an effective length of M=(Nj+Nk) non-zero coefficients.
Type: Grant
Filed: March 12, 2019
Date of Patent: November 22, 2022
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Jawaharlal Tangudu, Jaiganesh Balakrishnan
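The zero-insertion coefficient sequence can be made concrete with a short sketch. It assumes a fixed number of zeros after every nonzero coefficient and a plain direct-form FIR evaluation; the helper names `zero_insert` and `fir_filter` and the example coefficients are hypothetical and are not taken from the patent.

```python
def zero_insert(coeffs, zeros_per_coeff):
    """Build a tap sequence: each nonzero coefficient followed by inserted zeros."""
    taps = []
    for c in coeffs:
        taps.append(c)
        taps.extend([0] * zeros_per_coeff)
    return taps


def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k]."""
    output = []
    for n in range(len(samples)):
        acc = 0
        for k, t in enumerate(taps):
            if t != 0 and n - k >= 0:   # zero-inserted taps contribute nothing
                acc += t * samples[n - k]
        output.append(acc)
    return output


taps = zero_insert([1, 2, 3], zeros_per_coeff=1)    # Nj = 3 nonzero, Nk = 3 zeros
impulse_response = fir_filter([1, 0, 0, 0, 0, 0], taps)
```

With three nonzero coefficients and one zero after each, the tap count is L = Nj + Nk = 6, matching the relation stated in the abstract, while only Nj multiplications per output sample do useful work.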
-
Patent number: 11500631
Abstract: A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.
Type: Grant
Filed: May 20, 2020
Date of Patent: November 15, 2022
Assignee: Texas Instruments Incorporated
Inventors: Mujibur Rahman, Timothy David Anderson
-
Patent number: 11500964
Abstract: A device for computing the inner product of vectors includes a vector data arranger, a vector data pre-accumulator, a number converter, and a post-accumulator. The vector data arranger stores a first vector and sequentially outputs a plurality of vector data based on the first vector. The vector data pre-accumulator stores a second vector, receives each of the vector data, and pre-accumulates the second vector, so as to generate a plurality of accumulation results. The number converter and the post-accumulator receive and process all the accumulation results corresponding to each of the vector data to generate an inner product value. The present invention implements a lookup table with the vector data pre-accumulator and the number converter to increase calculation speed and reduce power consumption.
Type: Grant
Filed: October 21, 2020
Date of Patent: November 15, 2022
Assignee: NATIONAL CHUNG CHENG UNIVERSITY
Inventor: Tay-Jyi Lin
-
Patent number: 11500963
Abstract: A method of performing Principal Component Analysis (PCA) is provided. The method includes receiving, by a computing device, evolving data for processing/visualization. The method further includes reducing, by the computing device, a dimensionality of the evolving data using the PCA, wherein the PCA is performed on analog crossbar hardware. The method also includes using, by the computing device, the evolving data for visualization having the dimensionality thereof reduced by the principal component analysis for a further application.
Type: Grant
Filed: October 1, 2020
Date of Patent: November 15, 2022
Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, RAMOT AT TEL AVIV UNIVERSITY LTD.
Inventors: Shashanka Ubaru, Vasileios Kalantzis, Lior Horesh, Mark S. Squillante, Haim Avron
-
Patent number: 11494165
Abstract: An arithmetic circuit includes a LUT generation circuit (1) that, when coefficients c[n] (n=1, . . . , N) are paired two by two, outputs a value calculated for each of the pairs, and distributed arithmetic circuits (2-m) that calculate values z[m] that are sums of products of data x[m, n] of a data set X[m] containing M pairs of data x[m, n] and the coefficients c[n], in parallel for each of the M pairs. The distributed arithmetic circuit (2-m) includes binomial distributed arithmetic circuits that, for each of the pairs, calculate sums of products of a value obtained by pairing N data x[m, n] corresponding to the circuit two by two and a value obtained by pairing the coefficients c[n] two by two, and a figure matching circuit that matches a number of decimal figures of the sums with a predetermined number of decimal figures.
Type: Grant
Filed: December 18, 2018
Date of Patent: November 8, 2022
Assignees: NTT ELECTRONICS CORPORATION, NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Kenji Kawai, Ryo Awata, Kazuhito Takei, Masaaki Iizuka
-
Patent number: 11487846
Abstract: Embodiments relate to a neural processor circuit including a plurality of neural engine circuits, a data buffer, and a kernel fetcher circuit. At least one of the neural engine circuits is configured to receive matrix elements of a matrix as at least the portion of the input data from the data buffer over multiple processing cycles. The at least one neural engine circuit further receives vector elements of a vector from the kernel fetcher circuit, wherein each of the vector elements is extracted as a corresponding kernel to the at least one neural engine circuit in each of the processing cycles. The at least one neural engine circuit performs multiplication between the matrix and the vector as a convolution operation to produce at least one output channel of the output data.
Type: Grant
Filed: May 4, 2018
Date of Patent: November 1, 2022
Assignee: Apple Inc.
Inventors: Christopher L. Mills, Erik K. Norden, Sung Hee Park
-
Patent number: 11487506
Abstract: An aspect includes executing, by a binary based floating-point arithmetic unit of a processor, a calculation having two or more operands in hexadecimal format based on a hexadecimal floating-point (HFP) instruction and providing a condition code for a calculation result of the calculation. The floating-point arithmetic unit includes a condition code anticipator circuit that is configured to provide the condition code to the processor prior to availability of the calculation result.
Type: Grant
Filed: August 9, 2019
Date of Patent: November 1, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Silvia Melitta Mueller, Petra Leber, Kerstin Claudia Schelm, Cedric Lichtenau