Patents Examined by Emily E. Larocque
-
Patent number: 11915117Abstract: A method for convolution in a convolutional neural network (CNN) is provided that includes accessing a coefficient value of a filter corresponding to an input feature map of a convolution layer of the CNN, and performing a block multiply accumulation operation on a block of data elements of the input feature map, the block of data elements corresponding to the coefficient value, wherein, for each data element of the block of data elements, a value of the data element is multiplied by the coefficient value and a result of the multiply is added to a corresponding data element in a corresponding output block of data elements comprised in an output feature map.Type: GrantFiled: May 24, 2021Date of Patent: February 27, 2024Assignee: Texas Instruments IncorporatedInventors: Manu Mathew, Kumar Desappan, Pramod Kumar Swami
-
Patent number: 11909422Abstract: A deep neural network (“DNN”) module compresses and decompresses neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit receives an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit receives a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion.Type: GrantFiled: November 11, 2022Date of Patent: February 20, 2024Assignee: Microsoft Technology Licensing, LLCInventors: Joseph Leon Corkery, Benjamin Eliot Lundell, Larry Marvin Wall, Chad Balling McBride, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Boris Bobrov
-
Patent number: 11907327Abstract: Provided is an arithmetic method of performing convolution operation in convolutional layers of a neutral network by calculating matrix products. The arithmetic method includes: determining, for each of the convolutional layers, whether an amount of input data to be inputted to the convolutional layer is smaller than or equal to a predetermined amount of data; selecting a first arithmetic mode and performing convolution operation in the first arithmetic mode, when the amount of input data is determined to be smaller than or equal to the predetermined amount of data in the determining; selecting a second arithmetic mode and performing convolution operation in the second arithmetic mode, when the amount of input data is determined to be larger than the predetermined amount of data in the determining; and outputting output data which is a result obtained by performing convolution operation.Type: GrantFiled: December 1, 2020Date of Patent: February 20, 2024Assignee: SOCIONEXT INC.Inventor: Makoto Yamakura
-
Patent number: 11907684Abstract: A system and method of generating a series of random number; from a source of random numbers in a computing system. Steps includes: loading a data loop (a looped array of stored values with an index) with random data from a source of random data; then repeating the following: reading a value from the data loop in relation to the index; operating on the multi-bit value thereby outputting a derived random number; and moving the index in relation to the looped array. The data loop may be a simple feedback loop which may be a shift register loaded by direct memory access (DMA). The operation may be performed by one or more arithmetic logic units (ALU) which may be fed by one or more data feeds and may perform XOR, Mask Generator, Data MUX, and/or MOD.Type: GrantFiled: February 15, 2022Date of Patent: February 20, 2024Assignee: CASSY HOLDINGS LLCInventor: Patrick D. Ross
-
Patent number: 11907681Abstract: A semiconductor device includes a dynamic reconfiguration processor that performs data processing for input data sequentially input and outputs the results of data processing sequentially as output data, an accelerator including a parallel arithmetic part that performs arithmetic operation in parallel between the output data from the dynamic reconfiguration processor and each of a plurality of predetermined data, and a data transfer unit that selects the plurality of arithmetic operation results by the accelerator in order and outputs them to the dynamic reconfiguration processor.Type: GrantFiled: January 5, 2022Date of Patent: February 20, 2024Assignee: RENESAS ELECTRONICS CORPORATIONInventors: Taro Fujii, Takao Toi, Teruhito Tanaka, Katsumi Togawa
-
Patent number: 11899742Abstract: A quantization parameter providing step of a quantization method is performed to provide a quantization parameter which includes a quantized input activation, a quantized weight and a splitting value. A parameter splitting step is performed to split the quantized weight and the quantized input activation into a plurality of grouped quantized weights and a plurality of grouped activations, respectively, according to the splitting value. A multiply-accumulate step is performed to execute a multiply-accumulate operation with one of the grouped quantized weights and one of the grouped activations, and then generate a convolution output. A convolution quantization step is performed to quantize the convolution output to a quantized convolution output according to a convolution target bit. A convolution merging step is performed to execute a partial-sum operation with the quantized convolution output according to the splitting value, and then generate an output activation.Type: GrantFiled: July 7, 2020Date of Patent: February 13, 2024Assignee: NATIONAL TSING HUA UNIVERSITYInventors: Kea-Tiong Tang, Wei-Chen Wei
-
Patent number: 11894822Abstract: A filter device includes: delay units serially connected to delay an input signal and output a delayed signal; multiplication units multiplying the delayed signal by a filter coefficient based on a predetermined value and a multiplying factor adjustment value; a coefficient adjustment unit that, when a multiplication result obtained by multiplying the predetermined value by the multiplying factor adjustment value exceeds a maximum value of a filter-coefficient representation range, divides the multiplication result exceeding the maximum value by the maximum value, and outputs a quotient of division as a coefficient adjustment value; a signal conversion unit outputting a signal obtained by adding after-filter-coefficient-multiplication signals outputted by the multiplication units and an adjusted signal obtained by adjusting a corresponding delayed signal using the coefficient adjustment value; and a division unit generating an output signal by dividing the signal outputted by the signal conversion unit by theType: GrantFiled: April 11, 2022Date of Patent: February 6, 2024Assignee: Mitsubishi Electric CorporationInventors: Yasutaka Yamashita, Shigenori Tani, Kazuma Kaneko, Shigeru Uchida
-
Patent number: 11886833Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.Type: GrantFiled: June 28, 2021Date of Patent: January 30, 2024Assignee: Microsoft Technology Licensing, LLCInventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung
-
Patent number: 11886491Abstract: A preconditioned conjugate gradient (PCG) solver, embedded in an electronic device to perform a simultaneous localization and mapping (SLAM) operation, includes an image database, a factor graph database, and a back-end processor, wherein the back-end processor is configured to receive an image from the image database to perform re-localization, receive, from the factor graph database, data for calculating six degrees of freedom (DoF)-related components, construct a matrix including the six degrees of freedom-related components based on the received data, and load and rearrange the matrix and a vector, to perform calculation on each block of each row of the matrix and the vector, then output first data, and shift second data to a location of the first data.Type: GrantFiled: May 21, 2021Date of Patent: January 30, 2024Assignees: SAMSUNG ELECTRONICS CO., LTD., University of Seoul Industry Cooperation FoundationInventors: Myungjae Jeon, Kichul Kim, Yongkyu Kim, Hongseok Lee, San Kim, Hyekwan Yun, Donggeun Lee
-
Patent number: 11886536Abstract: Methods and systems for performing a convolution transpose operation between an input tensor having a plurality of input elements and a filter comprising a plurality of filter weights. The method includes: dividing the filter into a plurality of sub-filters; performing, using hardware logic, a convolution operation between the input tensor and each of the plurality of sub-filters to generate a plurality of sub-output tensors, each sub-output tensor comprising a plurality of output elements; and interleaving, using hardware logic, the output elements of the plurality of sub-output tensors to form a final output tensor for the convolution transpose.Type: GrantFiled: January 12, 2023Date of Patent: January 30, 2024Assignee: Imagination Technologies LimitedInventors: Cagatay Dikici, Clifford Gibson, James Imber
-
Patent number: 11868740Abstract: A circuit includes a first full adder, a second full adder, a first half adder, a third full adder configured to receive a sum output signal of the first full adder, a sum output signal of the second full adder, and a sum output signal of the first half adder, a fourth full adder configured to receive a carry output signal of the first full adder, a carry output signal of the second full adder, and a carry output signal of the first half adder, a second half adder configured to receive a carry output signal of the third full adder and a sum output signal of the fourth full adder, and a third half adder configured to receive a carry output signal of the second half adder and a carry output signal of the fourth full adder.Type: GrantFiled: September 29, 2021Date of Patent: January 9, 2024Assignee: Postech Research and Business Development FoundationInventors: Seokhyeong Kang, Sunmean Kim, Sunghye Park, SungYun Lee
-
Patent number: 11868243Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing topological scheduling on a machine-learning accelerator having an array of tiles. One of the methods includes performing, at each time step of a plurality of time steps corresponding respectively to columns within each of a plurality of wide columns of the tile array, operations comprising: performing respective multiplications using tiles in a respective tile column for the time step, computing a respective output result for each respective tile column for the time step including computing a sum of results of the multiplications for the tile column, and storing the respective output result for the tile column in a particular output RAM having a location within the same tile column and on a row from which the output result will be read by a subsequent layer of the model.Type: GrantFiled: June 21, 2022Date of Patent: January 9, 2024Assignee: Google LLCInventor: Lukasz Lew
-
Patent number: 11868741Abstract: The present disclosure discloses a processing element and a neural processing device including the processing element. The processing element includes a weight register configured to store a weight, an input activation register configured to store input activation, a flexible multiplier configured to generate result data by performing a multiplication operation of the weight and the input activation by using a first multiplier of a first precision or using both the first multiplier and a second multiplier of the first precision in response to a calculation mode signal and a saturating adder configured to generate a partial sum by using the result data.Type: GrantFiled: June 15, 2022Date of Patent: January 9, 2024Assignee: Rebellions Inc.Inventors: Jaewan Bae, Jinwook Oh, Karim Charfi
-
Patent number: 11868742Abstract: Some embodiments provide methods and apparatus for quantum random number generation based on a single bit or multi bit Quanta Image Sensor (QIS) providing single-photon counting over a time interval for each of an array of pixels of the QIS, wherein random number data is generated based on the number of photons counted over the time interval for each of the pixels.Type: GrantFiled: February 9, 2023Date of Patent: January 9, 2024Assignees: ID QUANTIQUE SA, TRUSTEES OF DARTMOUTH COLLEGEInventors: Emna Amri, Yacine Felk, Damien Stucki, Jiaju Ma, Eric R. Fossum
-
Patent number: 11853718Abstract: In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, to evaluating functions, such as square root, reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation; which can include a LookUp Table (LUT). LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited precision multiplier goes to a full precision multiplier circuit that performs remaining multiplications required for iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor.Type: GrantFiled: January 16, 2023Date of Patent: December 26, 2023Assignee: Imagination Technologies LimitedInventor: Leonard Rarick
-
Patent number: 11853716Abstract: Methods and systems for determining whether an infinitely precise result of a reciprocal square root operation performed on an input floating point number is greater than a particular number in a first floating point precision. The method includes calculating the square of the particular number in a second lower floating point precision; calculating an error in the calculated square due to the second floating point precision; calculating a first delta value in the first floating point precision by calculating the square multiplied by the input floating point number less one; calculating a second delta value by calculating the error multiplied by the input floating point number plus the first delta value; and outputting an indication of whether the infinitely precise result of the reciprocal square root operation is greater than the particular number based on the second delta term.Type: GrantFiled: November 30, 2021Date of Patent: December 26, 2023Assignee: Imagination Technologies LimitedInventors: Casper Van Benthem, Sam Elliott
-
Patent number: 11853717Abstract: Embodiments of the present disclosure include systems and methods for accelerating processing based on sparsity for neural network hardware processors. An input manager determines a pair of non-zero values from a pair of data streams in a plurality of pairs of data streams and retrieve the pair of non-zero values from the pair of data streams. A multiplier performs a multiplication operation on the pair of non-zero values and generate a product of the pair of non-zero values. An accumulator manager receives the product of the pair of non-zero values from the multiplier and sends the product of the pair of non-zero values to a corresponding accumulator in a plurality of accumulators.Type: GrantFiled: January 14, 2021Date of Patent: December 26, 2023Assignee: Microsoft Technology Licensing, LLCInventors: Karthikeyan Avudaiyappan, Jeffrey Andrews
-
Patent number: 11842168Abstract: An electronic system includes a mapping circuit configured to receive input samples of a dataset within a defined range of values. The mapping circuit is configured to perform comparisons that compare each input sample to each of a plurality of comparison values selected from the defined range of values. For each comparison, the mapping circuit generates an indication value specifying whether the input sample used in the comparison is greater than or equal to the comparison value used in the comparison. The system includes an adder circuit configured to generate a sum of the indication values for each comparison value and a memory configured to maintain counts corresponding to the comparison values. The counts are updated by the respective sums. The system includes a threshold detection circuit configured to determine, for the dataset, a threshold value or threshold range based on the counts read from the memory.Type: GrantFiled: September 25, 2021Date of Patent: December 12, 2023Assignee: Xilinx, Inc.Inventors: Sai Lalith Chaitanya Ambatipudi, Vamsi Krishna Nalluri, Sandeep Jayant Sathe, Chaithanya Dudha, Krishna Kishore Bhagavatula
-
Patent number: 11842266Abstract: A processing-in-memory (PIM) device includes a plurality of multiplication/accumulation (MAC) operators and a plurality of memory banks. The MAC operators are included in each of a plurality of channels. Each of the plurality of MAC operators performs a MAC arithmetic operation using weight data of a weight matrix. The memory banks are included in each of the plurality of channels and are configured to transmit the weight data of the weight matrix to the plurality of MAC operators. The weight data arrayed in one row of the weight matrix are stored into one row of each of the plurality of memory banks.Type: GrantFiled: January 4, 2021Date of Patent: December 12, 2023Assignee: SK hynix Inc.Inventor: Choung Ki Song
-
Patent number: 11836461Abstract: An arithmetic apparatus includes input lines and multiply-accumulate devices. An electrical signal for an input value is input into each of the input lines within a predetermined input period. Multiplication units include a positive weight multiplication unit that generates a positive weight charge for a product value obtained by multiplying the input value by a positive weight value and/or a negative weight multiplication unit that generates a negative weight charge for a product value obtained by multiplying the input value by a negative weight value. They are configured such that a positive weight ratio that is a ratio of a sum total of the positive weight values to a sum total of absolute values of the weight values is any ratio of 0% to 100%. An output unit of the multiply-accumulate device accumulates the generated weight charges to output a multiply-accumulate signal representing a sum of the product values.Type: GrantFiled: January 17, 2020Date of Patent: December 5, 2023Assignee: Sony Group CorporationInventor: Hiroshi Yoshida