Patents Examined by Andrew Caldwell
  • Patent number: 11966450
    Abstract: According to an embodiment, a calculation device includes a memory and one or more processors configured to update, for elements each associated with first and second variables, the first and second variables for each unit time, sequentially for the unit times and alternately between the first and second variables. In a calculation process for each unit time, the one or more processors are configured to: for each of the elements, update the first variable based on the second variable; update the second variable based on the first variables of the elements; when the first variable is smaller than a first value, change the first variable to a value of the first value or more and a threshold value or less; and when the first variable is greater than a second value, change the first variable to a value of the threshold value or more and the second value or less.
    Type: Grant
    Filed: February 25, 2021
    Date of Patent: April 23, 2024
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Hayato Goto
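The per-unit-time update in the abstract of patent 11966450 can be sketched roughly as follows. This is only an illustration: the function name `step`, the coupling matrix, the step size `dt`, and the choice to pull an out-of-range value back to the nearest bound are all assumptions, not details taken from the patent.

```python
def step(x, y, coupling, dt, lower, upper, threshold):
    """One unit-time update (illustrative names; not the patented method)."""
    assert lower <= threshold <= upper
    n = len(x)
    # Update each first variable from its own second variable.
    x = [x[i] + dt * y[i] for i in range(n)]
    # Update each second variable from the first variables of all elements.
    y = [y[i] + dt * sum(coupling[i][j] * x[j] for j in range(n))
         for i in range(n)]
    # Bring out-of-range first variables back into the permitted interval.
    for i in range(n):
        if x[i] < lower:
            x[i] = lower   # assumed choice; lies within [lower, threshold]
        elif x[i] > upper:
            x[i] = upper   # assumed choice; lies within [threshold, upper]
    return x, y
```

Any replacement value between the bound and the threshold would also fit the abstract's wording; the bound itself is simply the easiest choice to show.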
  • Patent number: 11960856
    Abstract: A system and/or an integrated circuit including a multiplier-accumulator execution pipeline which includes a plurality of MACs to implement a plurality of multiply and accumulate operations. A first memory stores filter weights having a Gaussian floating point (“GFP”) data format and a first bit length. A data format conversion circuitry includes circuitry to convert the filter weights from the GFP data format and the first bit length to filter weights having the data format and bit length that are different from the GFP data format and the first bit length. The converted filter weights are output to the MACs, wherein in operation, the MACs are configured to perform the plurality of multiply operations using (a) the input data and (b) the filter weights having the data format and bit length that are different from the GFP data format and the first bit length, respectively.
    Type: Grant
    Filed: January 4, 2021
    Date of Patent: April 16, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventor: Frederick A. Ware
  • Patent number: 11775303
    Abstract: Disclosed is a general-purpose computing accelerator which includes a memory including an instruction cache, a first executing unit performing a first computation operation, a second executing unit performing a second computation operation, an instruction fetching unit fetching an instruction stored in the instruction cache, a decoding unit that decodes the instruction, and a state control unit controlling a path of the instruction depending on an operation state of the second executing unit. The decoding unit provides the instruction to the first executing unit when the instruction is of a first type and provides the instruction to the state control unit when the instruction is of a second type. Depending on the operation state of the second executing unit, the state control unit provides the instruction of the second type to the second executing unit or stores the instruction of the second type as a register file in the memory.
    Type: Grant
    Filed: September 1, 2021
    Date of Patent: October 3, 2023
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventor: Jeongmin Yang
  • Patent number: 11762664
    Abstract: There is provided a data processing apparatus comprising decode circuitry responsive to receipt of a block of instructions to generate control signals indicative of each of the block of instructions, and to analyse the block of instructions to detect a potential hazard instruction. The data processing apparatus is provided with decode circuitry to encode information indicative of a clean restart point into the control signals associated with the potential hazard instruction. The data processing apparatus is provided with data processing circuitry to perform out-of-order execution of at least some of the block of instructions, and control circuitry responsive to a determination, at execution of the potential hazard instruction, that data values used as operands for the potential hazard instruction have been modified by out-of-order execution of a subsequent instruction, to restart execution from the clean restart point and to flush held data values from the data processing circuitry.
    Type: Grant
    Filed: January 5, 2022
    Date of Patent: September 19, 2023
    Assignee: Arm Limited
    Inventors: Yasuo Ishii, Michael David Achenbach, David Gum Lim, Abhishek Raja
  • Patent number: 11720356
    Abstract: An apparatus comprises an instruction decoder to decode instructions, processing circuitry to perform data processing in response to the instructions decoded by the instruction decoder, and memory attribute checking circuitry to check whether a memory access request issued by the processing circuitry satisfies access permissions specified in a plurality of memory attribute entries. Each memory attribute entry specifies access permissions for a corresponding address region of variable size within an address space. In response to a range checking instruction specifying address identifying parameters for identifying a first address and a second address, the instruction decoder controls the processing circuitry to set, in at least one software-accessible storage location, a status value indicative of whether the first address and the second address correspond to the same memory attribute entry.
    Type: Grant
    Filed: August 20, 2019
    Date of Patent: August 8, 2023
    Assignee: Arm Limited
    Inventor: Thomas Christopher Grocutt
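The range check described in the abstract of patent 11720356 can be modelled in software as a lookup of two addresses against a table of variable-size regions. The sketch below is a hedged illustration: the `REGIONS` table, its contents, and the function names are hypothetical, and a hardware implementation would not use a sorted list.

```python
from bisect import bisect_right

# Hypothetical attribute entries: (base, limit, permissions),
# sorted by base address and non-overlapping.
REGIONS = [(0x0000, 0x0FFF, "rx"), (0x1000, 0x17FF, "rw"), (0x4000, 0x7FFF, "r")]

def region_index(addr):
    """Return the index of the entry covering addr, or None if unmapped."""
    i = bisect_right([base for base, _, _ in REGIONS], addr) - 1
    if i >= 0 and addr <= REGIONS[i][1]:
        return i
    return None

def range_check(first, second):
    """Status value: True when both addresses map to the same entry."""
    a, b = region_index(first), region_index(second)
    return a is not None and a == b
```

For example, `range_check(0x1000, 0x17FF)` reports that the whole range shares one entry, while `range_check(0x0FFF, 0x1000)` does not because the two addresses straddle a region boundary.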
  • Patent number: 11720360
    Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor included in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: August 8, 2023
    Assignee: Apple Inc.
    Inventors: Jeff Gonion, John H. Kelm, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Gideon N. Levinsky, Richard F. Russo, Christopher M. Tsay
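The completion condition in the abstract of patent 11720360 amounts to a simple predicate: the barrier may be acknowledged once nothing outside the exclusion region is still outstanding. A minimal sketch, assuming outstanding operations are tracked as a list of target addresses and the exclusion region is an inclusive address pair (both assumptions made for illustration):

```python
def barrier_complete(outstanding, exclusion):
    """True when every still-outstanding load/store targets the exclusion region.

    outstanding: addresses of load/store operations not yet completed
    exclusion:   (low, high) inclusive bounds of the exclusion region
    """
    low, high = exclusion
    return all(low <= addr <= high for addr in outstanding)
```

With `exclusion = (0x8000, 0x8FFF)`, a pending store to `0x8010` does not block the response, whereas a pending load from `0x2000` does.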
  • Patent number: 11687347
    Abstract: A microprocessor and a method for issuing a load/store instruction are introduced. The microprocessor includes a decode/issue unit, a load/store queue, a scoreboard, and a load/store unit. The scoreboard includes a plurality of scoreboard entries, in which each scoreboard entry includes an unknown bit value and a count value, wherein the unknown bit value or the count value is set when instructions are issued. The decode/issue unit checks for WAR, WAW, and RAW data dependencies from the scoreboard and dispatches load/store instructions to the load/store queue with the recorded scoreboard values. The load/store queue is configured to resolve the data dependencies and dispatch the load/store instructions to the load/store unit for execution.
    Type: Grant
    Filed: May 25, 2021
    Date of Patent: June 27, 2023
    Assignee: ANDES TECHNOLOGY CORPORATION
    Inventor: Thang Minh Tran
  • Patent number: 11023208
    Abstract: A true random number generator includes a latch circuit, a noise circuit coupled to the latch circuit and an equalization circuit coupled to the inputs of the latch circuit, the equalization circuit being configured to maintain the latch circuit in a balanced state and to allow the latch circuit to resolve from a metastable state based on a timing control. A method of generating a random number output includes maintaining a latch circuit in a balanced state by turning on an equalization circuit coupled to the inputs of the latch circuit, coupling at least one noise source to the latch circuit, allowing the latch circuit to resolve from a metastable state by turning off the equalization circuit and repeatedly turning the equalization circuit on and off based on a timing control.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: June 1, 2021
    Assignee: International Business Machines Corporation
    Inventors: Chitra K. Subramanian, Ghavam G. Shahidi
  • Patent number: 11023205
    Abstract: Negative zero control for execution of an instruction. A process obtains an instruction to perform operation(s) using an input value. The instruction includes a negative zero control indicator indicating whether negative zero control is enabled for execution of the instruction. The process executes the instruction, the executing including performing the operation(s) using the input value to obtain a result having a sign, determining whether to control the sign of the result, the determining being based at least in part on the negative zero control indicator being set to a defined value, and performing further processing, as part of executing the instruction, based on the determining.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: June 1, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Cedric Lichtenau, Reid Copeland, Petra Leber, Silvia M. Mueller, Jonathan D. Bradbury, Xin Guo
  • Patent number: 11023209
    Abstract: An electric device for a hardware random number generator is provided. The hardware random number generator comprises: one or more bitcells which comprise a first pair of a first transistor and a first tunable resistor and a second pair of a second transistor and a second tunable resistor, with the first pair cross-coupled with the second pair.
    Type: Grant
    Filed: January 25, 2019
    Date of Patent: June 1, 2021
    Assignee: International Business Machines Corporation
    Inventor: Kangguo Cheng
  • Patent number: 11018692
    Abstract: Computer-implemented methods, systems, and devices to perform lossless compression of floating point format time-series data are disclosed. A first data value may be obtained in floating point format representative of an initial time-series parameter, for example, an output checkpoint of a computer simulation of a real-world event such as weather prediction or nuclear reaction simulation. A first predicted value may be determined representing the parameter at a first checkpoint time. A second data value may be obtained from the simulation. A prediction error may be calculated. Another predicted value may be generated for a next point in time and may be adjusted by the previously determined prediction error (e.g., to increase accuracy of the subsequent prediction). When a third data value is obtained, the adjusted prediction value may be used to generate a difference (e.g., XOR) for storing in a compressed data store to represent the third data value.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: May 25, 2021
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Anirban Nag, Naveen Muralimanohar, Paolo Faraboschi
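The predict-then-XOR idea in the abstract of patent 11018692 can be illustrated with a small round-trip sketch. The predictor used here (previous value plus the previous prediction error) is an assumption chosen for brevity, as are all function names; the patent does not specify this predictor. The residual words are left unpacked, so the sketch shows only the transform and not the entropy or bit-packing stage.

```python
import struct

def f2b(v):
    """Reinterpret a float as its 64-bit IEEE 754 bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", v))[0]

def b2f(b):
    """Reinterpret a 64-bit pattern as a float."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]

def compress(values):
    """Store the first value verbatim, then XOR residuals against predictions."""
    words = [f2b(values[0])]
    prev, err = values[0], 0.0
    for v in values[1:]:
        pred = prev + err            # assumed predictor, adjusted by last error
        words.append(f2b(v) ^ f2b(pred))
        err = v - pred               # prediction error feeds the next prediction
        prev = v
    return words

def decompress(words):
    """Replay the same predictions to recover the exact original values."""
    values = [b2f(words[0])]
    err = 0.0
    for w in words[1:]:
        pred = values[-1] + err
        v = b2f(w ^ f2b(pred))
        err = v - pred
        values.append(v)
    return values
```

Because the decompressor recomputes each prediction from already-recovered values, the round trip is bit-exact, and an accurate prediction yields a residual with many zero bits (exactly zero when the prediction is perfect).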
  • Patent number: 11016731
    Abstract: Disclosed embodiments relate to performing floating-point (FP) arithmetic. In one example, a processor is to decode an instruction specifying locations of first, second, and third floating-point (FP) operands and an opcode calling for accumulating a FP product of the first and second FP operands with the third FP operand, and execution circuitry to, in a first cycle, generate the FP product having a Fuzzy-Jbit format comprising a sign bit, a 9-bit exponent, and a 25-bit mantissa having two possible positions for a JBit and, in a second cycle, to accumulate the FP product with the third FP operand, while concurrently, based on Jbit positions of the FP product and the third FP operand, determining an exponent adjustment and a mantissa shift control of a result of the accumulation, wherein performing the exponent adjustment concurrently enhances an ability to perform the accumulation in one cycle.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: May 25, 2021
    Assignee: Intel Corporation
    Inventors: Amit Gradstein, Simon Rubanovich, Zeev Sperber
  • Patent number: 11010635
    Abstract: A method for processing electronic data includes the steps of transforming the electronic data to a matrix representation including a plurality of matrices; decomposing the matrix representation into a series of matrix approximations; and processing, with an approximation process, the plurality of matrices thereby obtaining a low-rank approximation of the plurality of matrices.
    Type: Grant
    Filed: September 10, 2018
    Date of Patent: May 18, 2021
    Assignee: City University of Hong Kong
    Inventors: Hing Cheung So, Wen-Jun Zeng, Jiayi Chen, Abdelhak M. Zoubir
  • Patent number: 11010131
    Abstract: An integrated circuit may include a floating-point adder. The adder may be implemented using a dual-path adder architecture having a near path and a far path. The near path may include a leading zero anticipator (LZA), a comparison circuit for comparing an exponent value to an LZA count, and associated circuitry for handling subnormal numbers. The far path may include a subtraction circuit for computing the difference between a received exponent value and a minimum exponent value, at least two shifters for shifting far greater and far lesser mantissa values in parallel, and associated circuitry for handling subnormal numbers. The adder may be dynamically configured to support a first mode that processes FP16 at inputs and outputs, a second mode that processes modified FP16′ inputs, and a third mode that processes FP16′ at inputs and outputs.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: May 18, 2021
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Bogdan Pasca
  • Patent number: 11010662
    Abstract: Massively parallel neural inference computing elements are provided. A plurality of multipliers is arranged in a plurality of equal-sized groups. Each of the plurality of multipliers is adapted to, in parallel, apply a weight to an input activation to generate an output. A plurality of adders is operatively coupled to one of the groups of multipliers. Each of the plurality of adders is adapted to, in parallel, add the outputs of the multipliers within its associated group to generate a partial sum. A plurality of function blocks is operatively coupled to one of the plurality of adders. Each of the plurality of function blocks is adapted to, in parallel, apply a function to the partial sum of its associated adder to generate an output value.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: May 18, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
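The multiplier, adder, and function-block arrangement in the abstract of patent 11010662 maps onto three data-parallel stages. The sketch below models them sequentially with list comprehensions; the function name `neural_tile` and its parameters are illustrative assumptions, and each comprehension stands in for hardware units that would operate in parallel.

```python
def neural_tile(weights, activations, group_size, fn):
    """Model grouped multipliers, one adder per group, one function block per adder."""
    # Multipliers: each applies a weight to an input activation.
    products = [w * a for w, a in zip(weights, activations)]
    # Adders: each sums the outputs of its own equal-sized multiplier group.
    partial_sums = [sum(products[i:i + group_size])
                    for i in range(0, len(products), group_size)]
    # Function blocks: each applies a function to its adder's partial sum.
    return [fn(s) for s in partial_sums]
```

For instance, `neural_tile([1, 2, 3, 4], [1, 1, -1, -1], 2, lambda s: max(s, 0))` forms two groups of two products and applies a ReLU-style function to each partial sum.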
  • Patent number: 11003985
    Abstract: Provided is a convolutional neural network system including a data selector configured to output an input value corresponding to a position of a sparse weight from among input values of input data on a basis of a sparse index indicating the position of a nonzero value in a sparse weight kernel, and a multiply-accumulate (MAC) computator configured to perform a convolution computation on the input value output from the data selector by using the sparse weight kernel.
    Type: Grant
    Filed: November 7, 2017
    Date of Patent: May 11, 2021
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Jin Kyu Kim, Byung Jo Kim, Seong Min Kim, Ju-Yeob Kim, Mi Young Lee, Joo Hyun Lee
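The data-selector idea in the abstract of patent 11003985 can be shown in one dimension: only the inputs at positions named by the sparse index reach the multiply-accumulate step. This is a simplified sketch under stated assumptions; the 1-D layout, the function name, and the parameter names are hypothetical and chosen only for illustration.

```python
def sparse_conv1d(inputs, kernel_size, sparse_index, nonzero_weights):
    """Slide a sparse kernel over inputs, touching only non-zero weight positions.

    sparse_index:    positions of non-zero values inside the kernel
    nonzero_weights: the weight stored at each of those positions
    """
    outputs = []
    for start in range(len(inputs) - kernel_size + 1):
        # Data selector: pick the inputs aligned with the non-zero weights.
        selected = [inputs[start + p] for p in sparse_index]
        # MAC: multiply the selected inputs by the weights and accumulate.
        outputs.append(sum(w * x for w, x in zip(nonzero_weights, selected)))
    return outputs
```

A kernel `[1, 0, -1]` is represented here as `sparse_index=[0, 2]` with `nonzero_weights=[1, -1]`, so the zero weight never costs a multiplication.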
  • Patent number: 10997277
    Abstract: An integrated circuit device such as a neural network accelerator can be programmed to select a numerical value based on a multinomial distribution. In various examples, the integrated circuit device can include an execution engine that includes multiple separate execution units. The multiple execution units can operate in parallel on different streams of data. For example, to make a selection based on a multinomial distribution, the execution units can be configured to perform cumulative sums on sets of numerical values, where the numerical values represent probabilities. In this example, to then obtain cumulative sums across the sets of numerical values, the largest values from the sets can be accumulated, and then added, in parallel to the sets. The resulting cumulative sum across all the numerical values can then be used to randomly select a specific index, which can provide a particular numerical value as the selected value.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: May 4, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Yu Zhou, Vignesh Vivekraja, Ron Diamant
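The chunked cumulative-sum scheme in the abstract of patent 10997277 can be sketched as follows: each set gets its own running sum, the final (largest) value of each set becomes an offset for the sets after it, and the resulting global cumulative sum is searched with a uniform random draw. Function names, the `chunk` parameter, and the use of `bisect` for the final lookup are assumptions made for this illustration.

```python
from bisect import bisect_right
from itertools import accumulate

def chunked_cumsum(probs, chunk):
    """Global cumulative sum built from independent per-set cumulative sums."""
    sets = [probs[i:i + chunk] for i in range(0, len(probs), chunk)]
    local = [list(accumulate(s)) for s in sets]     # independent per set
    totals = [c[-1] for c in local]                 # largest value of each set
    offsets = [0.0] + list(accumulate(totals))[:-1] # accumulated set totals
    return [v + off for c, off in zip(local, offsets) for v in c]

def select_index(probs, u, chunk=2):
    """Pick an index for a uniform draw u in [0, 1)."""
    cdf = chunked_cumsum(probs, chunk)
    return min(bisect_right(cdf, u * cdf[-1]), len(probs) - 1)
```

In use, `u` would come from a random source such as `random.random()`; scaling by `cdf[-1]` means the inputs need not be normalized.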
  • Patent number: 10997275
    Abstract: A method for an associative memory array includes storing each column of a matrix in an associated column of the associative memory array, where each bit in row j of the matrix is stored in row R-matrix-row-j of the array, storing a vector in each associated column, where a bit j from the vector is stored in an R-vector-bit-j row of the array. The method includes simultaneously activating a vector-matrix pair of rows R-vector-bit-j and R-matrix-row-j to concurrently receive a result of a Boolean function on all associated columns, using the results to calculate a product between the vector-matrix pair of rows, and writing the product to an R-product-j row in the array.
    Type: Grant
    Filed: March 23, 2017
    Date of Patent: May 4, 2021
    Assignee: GSI Technology Inc.
    Inventors: Avidan Akerib, Pat Lasserre
  • Patent number: 10990355
    Abstract: The present innovative solution solves the problem of generating pseudo-random numbers that have a practically infinite period, while requiring limited processing resources and operating significantly faster than known pseudo-random number generators. A sequence of pseudo-random numbers is created by a linear congruential generator using a large seed number and the sequence is used to create a big number. The big number is formed by raising each of at least two pseudo-random numbers and their sum to the same power. The big number is then selectively split into a sequence of aperiodic pseudo-random numbers which are output for use in any suitable application and for seeding the present generator.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: April 27, 2021
    Inventor: Panagiotis Andreadakis
  • Patent number: 10984074
    Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
    Type: Grant
    Filed: February 24, 2020
    Date of Patent: April 20, 2021
    Assignee: Intel Corporation
    Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
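The arithmetic performed by the sparse-dense matrix multiplication instruction in the abstract of patent 10984074 can be written as a short loop over the non-zero elements. The dictionary storage format for the sparse matrix and the function name are assumptions for this sketch; the patent's hardware encoding is not reproduced here.

```python
def sparse_dense_multiply_accumulate(sparse, dense, out):
    """Accumulate sparse @ dense into out, visiting only non-zero elements.

    sparse: {(m, k): value} for each non-zero element (assumed format)
    dense:  K x N list of rows
    out:    M x N list of rows, updated in place and returned
    """
    for (m, k), a in sparse.items():
        # Multiply the non-zero element by every element of dense row k and
        # add each product to the previous value at row m of the output.
        for n, b in enumerate(dense[k]):
            out[m][n] += a * b
    return out
```

Zero elements of the sparse source never appear in the loop, which is the saving the instruction is meant to capture.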