Shifting Patents (Class 708/209)

Processing unit with small footprint arithmetic logic unit

Patent number: 12217021

Abstract: A parallel processing unit employs an arithmetic logic unit (ALU) having a relatively small footprint, thereby reducing the overall power consumption and circuit area of the processing unit. To support the smaller footprint, the ALU includes multiple stages to execute operations corresponding to a received instruction. The ALU executes at least one operation at a precision indicated by the received instruction, and then reduces the resulting data of the at least one operation to a smaller size before providing the results to another stage of the ALU to continue execution of the instruction.

Type: Grant

Filed: July 7, 2023

Date of Patent: February 4, 2025

Assignee: Advanced Micro Devices, Inc.

Inventors: Bin He, Shubh Shah, Michael Mantor
Multibit shift instruction

Patent number: 12093688

Abstract: An article of manufacture includes a non-transitory machine-readable medium. The medium includes instructions that cause a processor to execute a shift instruction. The shift instruction is to cause a source data in memory to be shifted left or shifted right. The shift instruction is to include a source parameter and a bit size parameter. The processor is to execute the shift instruction through a shift of a first source word of the source data by the bit size parameter to yield a first intermediate word, a shift of a second source word of the source data by the bit size parameter to yield a second intermediate word and a first set of shifted-out bits, and through execution of a logical OR operation on the first intermediate word and the first set of shifted-out bits to yield a first result word.
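The word-by-word shift described in this abstract can be sketched as follows. This is a minimal illustration, not the patented instruction: the 16-bit word width, the little-endian word order, and the names `WORD_BITS` and `multiword_shift_left` are assumptions made for the example.

```python
WORD_BITS = 16                      # assumed word width, not stated in the abstract
WORD_MASK = (1 << WORD_BITS) - 1


def multiword_shift_left(words, bit_size):
    """Shift a little-endian list of words left by bit_size bits.

    Each word is shifted on its own; the bits shifted out of one word
    are ORed into the shifted neighbour, as the abstract describes.
    Bits shifted out of the most significant word are dropped.
    """
    if not 0 < bit_size < WORD_BITS:
        raise ValueError("bit_size must be between 1 and WORD_BITS - 1")
    result = []
    shifted_out = 0                 # bits carried in from the less significant word
    for word in words:
        intermediate = (word << bit_size) & WORD_MASK
        result.append(intermediate | shifted_out)
        shifted_out = word >> (WORD_BITS - bit_size)
    return result


print([hex(w) for w in multiword_shift_left([0x8001, 0x0001], 4)])
```

With the two-word input above, the top four bits of the low word reappear in the low bits of the high word.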

Type: Grant

Filed: November 17, 2022

Date of Patent: September 17, 2024

Assignee: Microchip Technology Incorporated

Inventors: Michael Catherwood, David Mickey, Ashish Desai, Jason Sachs, Calum Wilkie
Automatic generation of computation kernels for approximating elementary functions

Patent number: 12001311

Abstract: An apparatus for computing functions using polynomial-based approximation, comprising one or more processing circuitries configured for computing a polynomial-based approximant approximating a function by executing one or more iterations. Each iteration comprising computing the polynomial-based approximant using scaled fixed-point unit(s) according to a constructed set of coefficients, minimizing an approximation error of the computed polynomial-based approximant compared to the function while complying with one or more constraints selected from a group comprising at least: an accuracy, a compute graph size, a computation complexity, and a hardware utilization of the processing circuitry(s), adjusting one or more of the coefficients in case the approximation error is incompliant with the constraint(s) and initiating another iteration.

Type: Grant

Filed: January 6, 2022

Date of Patent: June 4, 2024

Assignee: Next Silicon Ltd

Inventor: Daniel Khankin
Flexible random access channel configurations

Patent number: 11979919

Abstract: Methods, systems, and devices for wireless communications are described. A user equipment (UE) may communicate with a base station in a wireless communications system. The UE may perform a random access procedure to communicate with the base station. The base station may configure the UE with different sets of preamble parameters for different types of UEs. The UE may generate a random access preamble based on the sets of preamble parameters according to the type of the UE. The base station may indicate a location of the base station to the UE. The UE may identify a location of the UE. The UE may determine a pre-compensation timing based on a distance from the two locations. The UE and the base station may transmit subsequent communication according to the random access preamble, the pre-compensation timing, or any combination thereof.

Type: Grant

Filed: September 17, 2021

Date of Patent: May 7, 2024

Assignee: QUALCOMM Incorporated

Inventors: Chiranjib Saha, Alberto Rico Alvarino, Le Liu, Umesh Phuyal, Kazuki Takeda
Normalized probability determination for character encoding

Patent number: 11973519

Abstract: Examples described herein relate to an apparatus comprising a central processing unit (CPU) and an encoding accelerator coupled to the CPU, the encoding accelerator comprising an entropy encoder to determine normalized probability of occurrence of a symbol in a set of characters using a normalized probability approximation circuitry, wherein the normalized probability approximation circuitry is to output the normalized probability of occurrence of a symbol in a set of characters for lossless compression. In some examples, the normalized probability approximation circuitry includes a shifter, adder, subtractor, or a comparator. In some examples, the normalized probability approximation circuitry is to determine normalized probability by performance of non-power of 2 division without computation by a Floating Point Unit (FPU). In some examples, the normalized probability approximation circuitry is to round the normalized probability to a decimal.
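One way to divide by a value that is not a power of two using only a shifter, a subtractor, and a comparator is restoring (shift-and-subtract) division. The sketch below is an assumption about how such a circuit could behave, not the method claimed in the patent; the function name, the `frac_bits` parameter, and the truncating rounding are hypothetical.

```python
def normalized_probability(count, total, frac_bits=8):
    """Return floor(count / total * 2**frac_bits) without a divide or an FPU.

    Only shifts, subtractions, and comparisons are used, mirroring the
    building blocks named in the abstract.
    """
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("need 0 <= count <= total and total > 0")
    remainder = count
    quotient = 0
    if remainder >= total:          # integer bit: only set when count == total
        remainder -= total
        quotient = 1
    for _ in range(frac_bits):      # one fractional bit per iteration
        remainder <<= 1
        quotient <<= 1
        if remainder >= total:
            remainder -= total
            quotient |= 1
    return quotient


print(normalized_probability(1, 3))   # probability 1/3 in 8 fractional bits
```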

Type: Grant

Filed: June 23, 2020

Date of Patent: April 30, 2024

Assignee: Intel Corporation

Inventors: Bhushan G. Parikh, Stephen T. Palermo
Memory device and method for shifting memory values

Patent number: 11955197

Abstract: A memory device comprising a cell field having memory cells, N bit lines, which are respectively connected to at least one of the memory cells of the cell field, N being a whole number greater than one, N sense amplifiers; a bit shift circuit, which has S switch element rows, S being a whole number greater than one and a row number in the range from zero to S-1 being assignable to each switch element row. Each switch element row includes at least one semiconductor switch element connected to one of the bit lines and one of the sense amplifiers. Switch elements of each row connect all bit lines, whose bit line number is smaller than or equal to N minus the row number, to sense amplifiers, so that the respective sense amplifier number is equal to the respective bit line number plus the row number.
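The routing rule in the last sentence of this abstract can be modeled in a few lines. This is a behavioral sketch only: it assumes bit lines and sense amplifiers are numbered from 1 to N, and the name `route_bit_lines` and the use of `None` for an unconnected amplifier are illustrative choices.

```python
def route_bit_lines(bit_values, row):
    """Model one active switch element row of the bit shift circuit.

    bit_values[b - 1] is the value on bit line b (1-based numbering assumed).
    Bit line b is connected to sense amplifier b + row for every
    b <= N - row; amplifiers with no connection are reported as None.
    """
    n = len(bit_values)
    if not 0 <= row < n:
        raise ValueError("row must be in the range 0 to N - 1")
    sense_amplifiers = [None] * n
    for bit_line in range(1, n - row + 1):
        sense_amplifiers[bit_line + row - 1] = bit_values[bit_line - 1]
    return sense_amplifiers


print(route_bit_lines([1, 0, 1, 1], 1))
```

Selecting row 0 leaves the values in place, and each higher row shifts them one more position.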

Type: Grant

Filed: May 17, 2022

Date of Patent: April 9, 2024

Assignee: ROBERT BOSCH GMBH

Inventors: Andre Guntoro, Chirag Sudarshan, Christian Weis, Leonardo Luiz Ecco, Taha Ibrahim Ibrahim Soliman, Norbert Wehn
Load store circuit with dedicated single or dual bit shift circuit and opcodes for low power accelerator processor

Patent number: 11847427

Abstract: Described examples include integrated circuits such as microcontrollers with a low energy accelerator processor circuit or other application specific integrated processor circuit including a load store circuit operative to perform load and store operations associated with at least one register and a low gate count shift circuit to selectively shift the data of the register by only an integer number of bits less than the register data width without using a barrel shifter for low power operation to support vector operations for FFT or filtering functions.

Type: Grant

Filed: August 31, 2015

Date of Patent: December 19, 2023

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Srinivas Lingam, Seok-Jun Lee
Apparatus for calculating and retaining a bound on error during floating-point operations and methods thereof

Patent number: 11797300

Abstract: The apparatus and method for calculating and retaining a bound on error during floating-point operations inserts an additional bounding field into the standard floating-point format that records the retained significant bits of the calculation with notification upon insufficient retention. The bounding field, accounting for both rounding and cancellation errors, includes the lost bits D Field and the accumulated rounding error R Field. The D Field states the number of bits in the floating-point representation that are no longer meaningful. The bounds on the represented real value are determined by the truncated floating-point value and the addition of the error determined by the number of lost bits. The true, real value is absolutely contained by these bounds. The allowable loss (optionally programmable) of significant digits provides a fail-safe, real-time notification of loss of significant digits. This allows representation of real numbers accurate to the last digit.

Type: Grant

Filed: May 31, 2021

Date of Patent: October 24, 2023

Inventor: Alan A. Jorgensen
Reconfigurable processor circuit architecture

Patent number: 11494331

Abstract: A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier shifter and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies and general sum of products use, for example.

Type: Grant

Filed: September 9, 2020

Date of Patent: November 8, 2022

Assignee: Cornami, Inc.

Inventors: Paul L. Master, Steven K. Knapp, Raymond J. Andraka, Alexei Beliaev, Martin A. Franz, Rene Meessen, Frederick Curtis Furtek
Floating-point dot-product hardware with wide multiply-adder tree for machine learning accelerators

Patent number: 11288040

Abstract: Systems, apparatuses and methods may provide for technology that conducts a first alignment between a plurality of floating-point numbers based on a first subset of exponent bits. The technology may also conduct, at least partially in parallel with the first alignment, a second alignment between the plurality of floating-point numbers based on a second subset of exponent bits, where the first subset of exponent bits are LSBs and the second subset of exponent bits are MSBs. In one example, the technology adds the aligned plurality of floating-point numbers to one another. With regard to the second alignment, the technology may also identify individual exponents of a plurality of floating-point numbers, identify a maximum exponent across the individual exponents, and conduct a subtraction of the individual exponents from the maximum exponent, where the subtraction is conducted from MSB to LSB.
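The idea of aligning operands to a maximum exponent in two partial shifts can be sketched as below. This is a simplified, sequential model under several assumptions: terms are integer (mantissa, exponent) pairs, the shift distance (not the raw exponent) is split into low and high bits, and the name `align_and_add` and the `low_bits` parameter are hypothetical. The hardware in the abstract performs the two alignments at least partially in parallel.

```python
def align_and_add(terms, low_bits=2):
    """Add (mantissa, exponent) pairs after aligning to the largest exponent.

    Each pair stands for mantissa * 2**exponent. The shift distance
    max_exp - exponent is applied in two steps: a fine shift taken from
    its least significant bits and a coarse shift taken from the rest.
    """
    max_exp = max(exp for _, exp in terms)
    low_mask = (1 << low_bits) - 1
    total = 0
    for mantissa, exp in terms:
        shift = max_exp - exp
        fine = shift & low_mask          # alignment from the low-order bits
        coarse = shift & ~low_mask       # alignment from the high-order bits
        total += (mantissa >> fine) >> coarse
    return total, max_exp


print(align_and_add([(8, 0), (4, 1), (1, 3)]))
```

The returned pair is the aligned sum and the shared exponent, so the represented value is `total * 2**max_exp`.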

Type: Grant

Filed: June 7, 2019

Date of Patent: March 29, 2022

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark Anders
Control of NAND flash memory for AI applications

Patent number: 11244718

Abstract: A system comprises: a three-dimensional flash memory comprising a plurality of cells; and a controller coupled to the three-dimensional flash memory, configured to: select a block of cells in the three-dimensional flash memory; perform a matrix multiplication on the matrix stored in the block of cells, including performing a vector multiplication in a single sensing step; and output a matrix multiplication result. A matrix is stored in the block of cells.

Type: Grant

Filed: September 8, 2020

Date of Patent: February 8, 2022

Inventors: Fei Xue, Dimin Niu, Shuangchen Li, Hongzhong Zheng
Data optimization method and integral prestack depth migration method

Patent number: 11209563

Abstract: A data optimization method and an integral prestack depth migration method are provided, including acquiring a target matrix to be optimized; generating a first sequence according to the target matrix; rarefying the first sequence according to a preset grid density to obtain a value position of each element of a second sequence, and working out a value of each element of the second sequence on the basis of the principle of least squares; performing interpolation on the second sequence to obtain a third sequence; calculating a target matrix corresponding to the third sequence; calculating an error between the target matrix to be optimized and the target matrix corresponding to the third sequence; recording, when the error is less than the first error threshold, the target matrix corresponding to the above second sequence as an optimized target matrix of the target matrix to be optimized.

Type: Grant

Filed: August 30, 2019

Date of Patent: December 28, 2021

Assignee: INSTITUTE OF GEOLOGY AND GEOPHYSICS

Inventors: Linong Liu, Hongwei Gao, Wei Liu, Jianfeng Zhang
Integration of automated compiler dataflow optimizations

Patent number: 11106438

Abstract: Various embodiments are generally directed to optimizing dataflow in automated transformation frameworks (e.g., compiler, runtime, etc.) for spatial architectures (e.g., Configurable Spatial Accelerator) that translate high-level user code into forms that use “streams” (e.g., Latency Insensitive Channels, line buffers) to reduce overhead, eliminate or improve the efficiency of redundant memory accesses, and improve overall throughput.

Type: Grant

Filed: March 27, 2020

Date of Patent: August 31, 2021

Assignee: INTEL CORPORATION

Inventors: Dounia Khaldi, Rakesh Krishnaiyer, Rajiv Deodhar, Daniel Woodworth, Joshua Cranmer, Kent Glossop
Function virtualization facility for blocking instruction function of a multi-function instruction of a virtual processor

Patent number: 11086624

Abstract: In a processor supporting execution of a plurality of functions of an instruction, an instruction blocking value is set for blocking one or more of the plurality of functions, such that an attempt to execute one of the blocked functions, will result in a program exception and the instruction will not execute, however the same instruction will be able to execute any of the functions that are not blocked functions.

Type: Grant

Filed: August 13, 2019

Date of Patent: August 10, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan Greiner, Damian Osisek, Timothy Slegel, Lisa Cranton Heller
Performing processing using hardware counters in a computer system

Patent number: 11029921

Abstract: Performing processing using hardware counters in a computer system includes storing, in association with greatest common divisor (GCD) processing of the system, a first variable in a first redundant binary representation and a second variable in a second redundant binary representation. Each such redundant binary representation includes a respective sum term and a respective carry term, and a numerical value being represented by a redundant binary representation is equal to a sum of the sum and carry terms of the redundant binary representation. The process performs redundant arithmetic operations of the GCD processing on the first and second variables using hardware counter(s), of the computer system, that take input values in redundant binary representation form and provide output values in redundant binary representation form. The process uses output of the redundant arithmetic operations of the GCD processing to obtain an output GCD of integer inputs to the GCD processing.
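A sum-and-carry representation of this kind is commonly produced by a 3:2 carry-save step. The sketch below assumes the "hardware counters" behave like such compressors, which the abstract does not state; the function names are hypothetical and only non-negative integers are handled.

```python
def carry_save_add(sum_term, carry_term, addend):
    """Add addend to a value held as (sum_term, carry_term).

    No carry is propagated across bit positions: each output bit
    depends only on the three input bits in the same position.
    """
    new_sum = sum_term ^ carry_term ^ addend
    majority = (sum_term & carry_term) | (sum_term & addend) | (carry_term & addend)
    return new_sum, majority << 1


def resolve(sum_term, carry_term):
    """Collapse the redundant form into an ordinary integer."""
    return sum_term + carry_term


s, c = carry_save_add(0b1011, 0b0110, 0b0101)
print(s, c, resolve(s, c))
```

The represented value is always the sum of the two terms, so the slow carry-propagating addition is needed only once, when the final result is read out.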

Type: Grant

Filed: February 14, 2019

Date of Patent: June 8, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Eric M. Schwarz, Silvia M. Mueller, Ulrich Mayer
Apparatuses, methods, and systems for configurable operand size operations in an operation configurable spatial accelerator

Patent number: 11029958

Abstract: Systems, methods, and apparatuses relating to configurable operand size operation circuitry in an operation configurable spatial accelerator are described.

Type: Grant

Filed: December 28, 2019

Date of Patent: June 8, 2021

Assignee: Intel Corporation

Inventors: Chuanjun Zhang, Kermin E. Chofleming
Bit processing involving bit-level permutation instructions or operations

Patent number: 11010159

Abstract: Apparatus comprises counter and bit-shift circuitry to provide a succession of processing stages each comprising a count operation stage and a corresponding bit-shift stage, each processing stage operating with respect to a set of contiguous n-bit groups of bit positions, where n is 1 for a first processing stage and n doubles from one processing stage in the succession of processing stages to a next processing stage in the succession of processing stages; each count operation stage being configured to generate, for a first set of alternate instances of the n-bit groups of bit positions, count values indicating a respective number of bits of a predetermined bit value in a mask data word; and each bit-shift stage being configured to generate a bit-shifted data word by bit-shifting bits of a data word to be processed, for a second set of alternate instances of the n-bit groups of bit positions complementary to the first set, by respective numbers of bit positions dependent upon the count values generated by the

Type: Grant

Filed: August 31, 2018

Date of Patent: May 18, 2021

Assignee: ARM LIMITED

Inventors: Xiaoyang Shen, Cedric Denis Robert Airaud, Luca Nassi, Damien Robin Martin
Permutation of measuring capacitors in a time-of-flight sensor

Patent number: 11002836

Abstract: A time of flight sensor device is capable of generating accurate propagation time information for emitted light pulses using a small number of measurement cycles by using multiple measuring capacitors to capture more return pulse information per pulse period. To mitigate the effects of mismatched measuring capacitors and reading paths, embodiments of the time of flight sensor device perform multiple measuring sequences per measurement operation, permutating the roles of the measuring capacitors for each of the measuring sequences. The data collected by the measuring capacitors for the multiple measuring sequences is then aggregated and used to compute the propagation time and corresponding distance. This technique yields accurate measurements despite mismatches between reading paths and measuring capacitors without the need to implement pixel-level calibration and compensation, thereby saving calibration time, memory space, and computing time.

Type: Grant

Filed: May 14, 2018

Date of Patent: May 11, 2021

Assignee: Rockwell Automation Technologies, Inc.

Inventor: Frederic Boutaud
Synchronization signal transmission and reception for radio system

Patent number: 10931497

Abstract: A user equipment comprises receiving circuitry configured to receive bitmap information indicating time domain positions, within a measurement window, of synchronization signal block(s) (SSB(s)) used for an intra and/or an inter-frequency measurement, the SSB(s) comprising at least a primary synchronization signal (PSS), a secondary synchronization signal (SSS), and a physical broadcast channel (PBCH), wherein the bitmap information comprises a bit string, and different lengths of the bit string are defined for different frequency bands.

Type: Grant

Filed: May 3, 2018

Date of Patent: February 23, 2021

Assignees: SHARP KABUSHIKI KAISHA, FG Innovation Company Limited

Inventors: Jia Sheng, Tatsushi Aiba, Toshizo Nogami
Reconfigurable segmented scalable shifter

Patent number: 10877729

Abstract: Systems and methods that provide reconfigurable shifter configurations supporting multiple instruction, multiple data (MIMD) are described. Shifters implemented according to embodiments support multiple data shifts with respect to an instance of data shifting, wherein multiple individual different data shifts are implemented at a time in parallel. Reconfigurable segmented scalable shifters of embodiments, in addition to being reconfigurable for scalability in supporting data shifting with respect to various bit lengths of data, are configured to support data shifting of differing bit lengths in parallel. The data shifters of embodiments implement segmentation for facilitating data shifting with respect to differing bit lengths. Different data shift commands may be provided with respect to each such segment, thereby facilitating multiple data shifts in parallel with respect to various bit lengths of data.

Type: Grant

Filed: January 31, 2019

Date of Patent: December 29, 2020

Assignee: Hong Kong Applied Science and Technology Research Institute Co., Ltd.

Inventors: Hing-Mo Lam, Man-Wai Kwan, Ching-Hong Leung, Kong-Chau Tsang
Three dimensional (3-D) look up table (LUT) used for gamut mapping in floating point format

Patent number: 10867580

Abstract: A data segmenter is configured to determine indices using numbers of most significant bits (MSBs) of fractional values of floating-point representations of component values of an input color that are selected based on exponent values of the floating-point representations. The component values are defined according to a source gamut. The data segmenter is also configured to determine offsets associated with the indices using subsets of the fractional values. An interpolator is configured to map the input color to an output color defined according to a destination gamut based on a location in a three-dimensional (3-D) look up table (LUT) indicated by the indices and offsets.

Type: Grant

Filed: January 24, 2019

Date of Patent: December 15, 2020

Assignee: ATI TECHNOLOGIES ULC

Inventor: Yuxin Chen
Random access channel access and validity procedures

Patent number: 10869336

Abstract: Random access channel access and validity procedures are disclosed. In one aspect, the medium access control (MAC) layer indicates multiple random access occasions (ROs) to user equipments (UEs) for random access transmissions. In such aspect, random access failure would only be declared if listen before talk (LBT) procedures for the random access transmission fail on all of the ROs indicated by the MAC layer. Similarly, in additional aspects, a UE will not apply a backoff value for any LBT failures for random access attempts that occur within an LBT time window. In further aspects, a UE may determine the validity of ROs that overlap with a discovery reference signal measurement timing configuration (DMTC) window. In such aspects, the UE may not use overlapping ROs or may determine a portion of the DMTC window that is not used for base station transmissions and declare the overlapping ROs with the unused portion valid.

Type: Grant

Filed: February 13, 2020

Date of Patent: December 15, 2020

Assignee: QUALCOMM Incorporated

Inventors: Pravjyot Singh Deogun, Xiaoxia Zhang, Ozcan Ozturk, Jing Sun, Kapil Bhattad, Ananta Narayanan Thyagarajan
Apparatus for hardware accelerated machine learning

Patent number: 10817802

Abstract: An architecture and associated techniques of an apparatus for hardware accelerated machine learning are disclosed. The architecture features multiple memory banks storing tensor data. The tensor data may be concurrently fetched by a number of execution units working in parallel. Each operational unit supports an instruction set specific to certain primitive operations for machine learning. An instruction decoder is employed to decode a machine learning instruction and reveal one or more of the primitive operations to be performed by the execution units, as well as the memory addresses of the operands of the primitive operations as stored in the memory banks. The primitive operations, upon being performed or executed by the execution units, may generate some output that can be saved into the memory banks. The fetching of the operands and the saving of the output may involve permutation and duplication of the data elements involved.

Type: Grant

Filed: May 5, 2017

Date of Patent: October 27, 2020

Assignee: Intel Corporation

Inventors: Jeremy Bruestle, Choong Ng
Neural network computing

Patent number: 10691410

Abstract: A method including receiving, by a processor, a computing instruction for a neural network, wherein the computing instruction for the neural network includes a computing rule for the neural network and a connection weight of the neural network, and the connection weight is a power of 2; and inputting, for a multiplication operation in the computing rule for the neural network, a source operand corresponding to the multiplication operation to a shift register, and performing a shift operation based on a connection weight corresponding to the multiplication operation, wherein the shift register outputs a target result operand as a result of the multiplication operation. The neural network uses a shift operation, and a neural network computing speed is increased.
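When every connection weight is a power of 2, each multiplication reduces to a shift. The sketch below illustrates that substitution for integer activations; the function names and the choice to store each weight as its signed exponent are assumptions for the example, not details from the patent.

```python
def shift_multiply(operand, weight_exponent):
    """Return operand * 2**weight_exponent using only a shift.

    A negative exponent shifts right, which truncates toward negative
    infinity for integer operands.
    """
    if weight_exponent >= 0:
        return operand << weight_exponent
    return operand >> -weight_exponent


def neuron_output(inputs, weight_exponents):
    """Weighted sum of inputs where every weight is a power of 2."""
    return sum(shift_multiply(x, e) for x, e in zip(inputs, weight_exponents))


# Weights 8, 1, and 1/4 are stored as the exponents 3, 0, and -2.
print(neuron_output([5, 7, 12], [3, 0, -2]))
```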

Type: Grant

Filed: July 18, 2018

Date of Patent: June 23, 2020

Assignee: Alibaba Group Holding Limited

Inventors: Cong Leng, Hao Li, Zesheng Dou, Shenghuo Zhu, Rong Jin
Methods and instructions for a 32-bit arithmetic support using 16-bit multiply and 32-bit addition

Patent number: 10656914

Abstract: Instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition without a barrel shifter. Illustrative instructions include operations that include receiving a first 32-bit operand, receiving a second 32-bit operand, shifting the second 32-bit operand right 16 or 15 bits to obtain a shifted second 32-bit operand, and adding the shifted second 32-bit operand and the first 32-bit operand to generate a 32-bit sum.
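A shift-then-add operation of this kind is enough to combine 16-bit partial products into a wider result. The sketch below is an illustration under assumptions: operands are unsigned, only the 16-bit shift is exercised in the multiply, and the names `add_shifted` and `mul32x16_high` are hypothetical rather than the patent's instruction mnemonics.

```python
MASK32 = 0xFFFFFFFF


def add_shifted(first, second, shift):
    """Add `second` shifted right by 15 or 16 bits to `first`, modulo 2**32."""
    if shift not in (15, 16):
        raise ValueError("only fixed shifts of 15 or 16 bits are modeled")
    return (first + (second >> shift)) & MASK32


def mul32x16_high(a, b):
    """Return (a * b) >> 16 for unsigned 32-bit a and unsigned 16-bit b.

    Only 16-bit by 16-bit multiplies are used; the low partial product
    is folded in with the fixed shift-and-add, so no barrel shifter is
    needed.
    """
    a_hi, a_lo = a >> 16, a & 0xFFFF
    return add_shifted(a_hi * b, a_lo * b, 16)


print(hex(mul32x16_high(0x12345678, 0xABCD)))
```

The identity used is a * b = (a_hi * b) * 2^16 + a_lo * b, so dropping the low 16 bits only affects the second partial product.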

Type: Grant

Filed: August 20, 2019

Date of Patent: May 19, 2020

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Srinivas Lingam, Seok-Jun Lee, Manish Goel
Apparatus and methods related to microcode instructions indicating instruction types

Patent number: 10606587

Abstract: The present disclosure includes apparatuses and methods related to microcode instructions. One example apparatus comprises a memory storing a set of microcode instructions. Each microcode instruction of the set can comprise a first field comprising a number of control data units, and a second field comprising a number of type select data units. Each microcode instruction of the set can have a particular instruction type defined by a value of the number of type select data units, and particular functions corresponding to the number of control data units are variable based on the particular instruction type.

Type: Grant

Filed: August 24, 2016

Date of Patent: March 31, 2020

Assignee: Micron Technology, Inc.

Inventors: Shawn Rosti, Timothy P. Finkbeiner
Permuting in a matrix-vector processor

Patent number: 10592583

Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.
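The behavior of the permute circuit can be modeled as a scatter: each control element names the output position of the matching input element. This is a software sketch with assumed names; it also assumes the control vector is a valid permutation of the positions, which the abstract does not require explicitly.

```python
def permute(input_vector, control_vector):
    """Place input_vector[i] at output position control_vector[i]."""
    if sorted(control_vector) != list(range(len(input_vector))):
        raise ValueError("control vector must be a permutation of the positions")
    output = [None] * len(input_vector)
    for element, position in zip(input_vector, control_vector):
        output[position] = element
    return output


print(permute(["a", "b", "c"], [2, 0, 1]))
```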

Type: Grant

Filed: February 25, 2019

Date of Patent: March 17, 2020

Assignee: Google LLC

Inventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
Arithmetic circuit and control method with full element permutation and element concatenate shift left

Patent number: 10592247

Abstract: An arithmetic circuit comprises first to N-th, N being an integer equal to or larger than two, element circuits respectively including: input circuits which input first operand data and second operand data; and element data selectors which select operand data of any one of the element circuits on the basis of a request element signal; and a data bus which supplies the operand data from the input circuits to the element data selectors. When a control signal is in a first state, the element data selectors select, on the basis of the request element signal included in the second operand data, the first operand data of any of the element circuits and output the first operand data.

Type: Grant

Filed: August 24, 2015

Date of Patent: March 17, 2020

Assignee: FUJITSU LIMITED

Inventor: Tomonori Tanaka
Processor to perform a bit range isolation instruction

Patent number: 10579379

Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) a first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and a machine-readable medium storing such an instruction are also disclosed.
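The effect of a bit range isolation can be sketched with a mask: bits inside the range keep their value and position, and all other bits take one fixed value (zero here). The function name, the inclusive `start` and `end` parameters, and the choice of zero for the cleared bits are assumptions for illustration.

```python
def isolate_bit_range(source, start, end):
    """Keep bits start..end (inclusive) of source in place; clear the rest.

    The kept bits are not shifted, matching the abstract's point that
    the range is not moved relative to the source operand.
    """
    if not 0 <= start <= end:
        raise ValueError("need 0 <= start <= end")
    mask = ((1 << (end - start + 1)) - 1) << start
    return source & mask


print(bin(isolate_bit_range(0b11111111, 2, 5)))
```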

Type: Grant

Filed: December 12, 2014

Date of Patent: March 3, 2020

Assignee: INTEL CORPORATION

Inventors: Maxim Loktyukhin, Eric W Mahurin, Bret L Toll, Martin G Dixon, Sean P Mirkes, David L Kreitzer, Elmoustapha Ould-Ahmed-Vall, Vinodh Gopal
System-on-chip (SoC) to perform a bit range isolation instruction

Patent number: 10579380

Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) a first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and a machine-readable medium storing such an instruction are also disclosed.

Type: Grant

Filed: December 12, 2014

Date of Patent: March 3, 2020

Assignee: INTEL CORPORATION

Inventors: Maxim Loktyukhin, Eric W Mahurin, Bret L Toll, Martin G Dixon, Sean P Mirkes, David L Kreitzer, Elmoustapha Ould-Ahmed-Vall, Vinodh Gopal
Block floating point computations using shared exponents

Patent number: 10579334

Abstract: A system for block floating point computation in a neural network receives a plurality of floating point numbers. An exponent value for an exponent portion of each floating point number of the plurality of floating point numbers is identified and mantissa portions of the floating point numbers are grouped. A shared exponent value of the grouped mantissa portions is selected according to the identified exponent values and then removed from the grouped mantissa portions to define multi-tiered shared exponent block floating point numbers. One or more dot product operations are performed on the grouped mantissa portions of the multi-tiered shared exponent block floating point numbers to obtain individual results. The individual results are shifted to generate a final dot product value, which is used to implement the neural network. The shared exponent block floating point computations reduce processing time with less reduction in system accuracy.
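A single-tier version of the shared-exponent idea can be sketched as follows. The patent describes multi-tiered shared exponents; this example uses one shared exponent per vector for brevity, and the names `to_block`, `block_dot`, and `mantissa_bits` are hypothetical.

```python
import math


def to_block(values, mantissa_bits=8):
    """Convert floats to integer mantissas that share one exponent.

    The shared exponent is derived from the largest exponent in the
    block, so smaller values lose low-order bits when rounded.
    """
    max_exp = max(math.frexp(v)[1] for v in values)
    shared_exp = max_exp - (mantissa_bits - 1)
    mantissas = [round(math.ldexp(v, -shared_exp)) for v in values]
    return mantissas, shared_exp


def block_dot(a, b, mantissa_bits=8):
    """Dot product on integer mantissas, scaled once at the end."""
    ma, ea = to_block(a, mantissa_bits)
    mb, eb = to_block(b, mantissa_bits)
    accumulator = sum(x * y for x, y in zip(ma, mb))   # integer-only work
    return math.ldexp(accumulator, ea + eb)            # final scaling step


print(block_dot([1.0, 2.0, 0.5], [4.0, 0.25, 1.0]))
```

The inner loop uses only integer multiplies and adds; the exponents are applied once to the accumulated result, which corresponds to the shift at the end of the abstract's description.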

Type: Grant

Filed: May 8, 2018

Date of Patent: March 3, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Daniel Lo, Eric Sen Chung
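
The shared-exponent idea in the abstract above can be sketched in a few lines. This is a simplified, single-tier illustration and not the multi-tiered scheme the patent describes. The helper names `to_block` and `block_dot` and the 8-bit mantissa width are assumptions made for the example.

```python
import math

def to_block(values, mant_bits=8):
    """Quantize floats to integer mantissas that share one exponent."""
    # math.frexp(v) returns (m, e) with v == m * 2**e and 0.5 <= |m| < 1.
    shared_exp = max(math.frexp(v)[1] for v in values)
    block_exp = shared_exp - mant_bits
    mantissas = [round(math.ldexp(v, -block_exp)) for v in values]
    return mantissas, block_exp

def block_dot(a, b, mant_bits=8):
    """Dot product on integer mantissas, rescaled once at the end."""
    ma, ea = to_block(a, mant_bits)
    mb, eb = to_block(b, mant_bits)
    acc = sum(x * y for x, y in zip(ma, mb))   # integer multiply-accumulate
    return math.ldexp(acc, ea + eb)            # single final shift by the exponents
```

Because the exponent is removed once per block, the inner loop uses only integer multiplies and adds. The final `ldexp` plays the role of the shift that the abstract applies to the individual results.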
Variable precision floating-point multiplier

Patent number: 10572222

Abstract: Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.

Type: Grant

Filed: June 25, 2019

Date of Patent: February 25, 2020

Assignee: Altera Corporation

Inventor: Martin Langhammer
Method for generating random access channel ZC sequence, and apparatus

Patent number: 10524292

Abstract: Embodiments provide a method for generating a random access channel ZC sequence, and an apparatus. A method for generating a random access channel ZC sequence includes: generating, by a base station, notification signaling, where the notification signaling instructs user equipment UE to generate a random access ZC sequence by using a second restricted set in a random access set; and sending, by the base station, the notification signaling to the UE, so that the UE generates the random access ZC sequence by using the second restricted set, where the random access set includes an unrestricted set, a first restricted set, and the second restricted set; and the second restricted set is a random access set that the UE needs to use when a Doppler frequency shift of the UE is greater than or equal to a first predetermined value.

Type: Grant

Filed: December 2, 2016

Date of Patent: December 31, 2019

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Qiang Wu, Zhiheng Guo, Jianqin Liu, Jianghua Liu, Leiming Zhang
Apparatus and method for communicating through random access

Patent number: 10512107

Abstract: Provided is a terminal, for example, user equipment (UE), that includes a processor and performs a random access (RA) procedure with a base station, for example, an eNodeB (E-UTRAN Node B, also known as Evolved Node B). The terminal may be at least temporarily embodied by the processor. The terminal may include a generator configured to generate a preamble sequence using a first sequence corresponding to a first root index based on a preamble index that is randomly selected, and a determiner configured to determine a second root index using the preamble index as an input value of a root index function. Further, the generator may be configured to generate a tag sequence using a second sequence corresponding to the second root index based on a tag index that is randomly selected.

Type: Grant

Filed: December 12, 2016

Date of Patent: December 17, 2019

Assignee: KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY

Inventors: Dan Keun Sung, Hong Shik Park, Han Seung Jang
Methods and instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition

Patent number: 10503474

Abstract: Instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition without a barrel shifter. Illustrative instructions include operations that include receiving a first 32-bit operand, receiving a second 32-bit operand, shifting the second 32-bit operand right 16 or 15 bits to obtain a shifted second 32-bit operand, and adding the shifted second 32-bit operand and the first 32-bit operand to generate a 32-bit sum.

Type: Grant

Filed: December 31, 2015

Date of Patent: December 10, 2019

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Srinivas Lingam, Seok-Jun Lee, Manish Goel
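
One plausible use of the "shift right by 16, then add" operation from the abstract above is computing the high 32 bits of an unsigned 32-bit product from 16-bit multiplies. The sketch below uses the well-known schoolbook decomposition and is not taken from the patent. The names `add_shifted` and `mulhi_u32` are hypothetical, and the signed 15-bit-shift variant is not covered.

```python
MASK32 = 0xFFFFFFFF

def add_shifted(a: int, b: int, shift: int = 16) -> int:
    """Model of a 32-bit add of a and (b >> shift), wrapping at 32 bits."""
    return (a + (b >> shift)) & MASK32

def mulhi_u32(u: int, v: int) -> int:
    """High 32 bits of u * v, using only 16x16 multiplies and 32-bit adds."""
    u0, u1 = u & 0xFFFF, u >> 16
    v0, v1 = v & 0xFFFF, v >> 16
    t = add_shifted(u1 * v0, u0 * v0)         # u1*v0 + (u0*v0 >> 16), fits in 32 bits
    w1 = (u0 * v1 + (t & 0xFFFF)) & MASK32    # second cross term plus low half of t
    hi = add_shifted(u1 * v1, t)              # u1*v1 + (t >> 16)
    return add_shifted(hi, w1)                # plus the carry out of the cross terms
```

Each intermediate value stays below 2**32, so every step maps onto a 16-bit multiply followed by a 32-bit add with a fixed 16-bit shift, with no general barrel shifter required.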
Apparatus and method for left-shifting packed quadwords and extracting packed doublewords

Patent number: 10496403

Abstract: An apparatus and method for performing right-shifting operations on packed quadword data.

Type: Grant

Filed: December 21, 2017

Date of Patent: December 3, 2019

Assignee: Intel Corporation

Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney
Apparatus and method for shifting quadwords and extracting packed words

Patent number: 10481910

Abstract: An apparatus and method for performing right-shifting operations on packed quadword data.

Type: Grant

Filed: September 29, 2017

Date of Patent: November 19, 2019

Assignee: Intel Corporation

Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
Vector processing for segmentation hash values calculation

Patent number: 10459961

Abstract: A system for segmenting an input data stream using vector processing, comprising a processor adapted to repeat the following steps throughout an input data stream to create a segmented data stream consisting of a plurality of segments: apply a rolling sequence over a sequence of consecutive data items of an input data stream, the rolling sequence includes a subset of consecutive data items of the sequence, calculate concurrently a plurality of partial hash values each by one of a plurality of processing pipelines of the processor, each for a respective one of a plurality of partial rolling sequences each including evenly spaced data items of the subset, determine compliance of each of the plurality of partial hash values with one or more respective partial segmentation criteria and designate the sequence as a variable size segment when at least some of the partial hash values comply with the respective partial segmentation criteria.

Type: Grant

Filed: August 2, 2017

Date of Patent: October 29, 2019

Assignee: Huawei Technologies Co., Ltd.

Inventors: Yehonatan David, Yair Toaff, Michael Hirsch
Multiply-and-accumulate-products instructions

Patent number: 10409592

Abstract: An apparatus has processing circuitry comprising an L×M multiplier array. An instruction decoder associated with the processing circuitry supports a multiply-and-accumulate-product (MAP) instruction for generating at least one result element corresponding to a sum of respective E×F products of E-bit and F-bit portions of J-bit and K-bit operands respectively, where 1<E<J≤L and 1<F<K≤M. In response to the MAP instruction, the instruction decoder controls the processing circuitry to rearrange F-bit portions of the second K-bit operand to form a transformed K-bit operand, and to control the L×M multiplier array in dependence on the first J-bit operand and the transformed K-bit operand to add the respective E×F products using a subset of the adders used for accumulating partial products for a conventional multiplication.

Type: Grant

Filed: April 24, 2017

Date of Patent: September 10, 2019

Assignee: ARM Limited

Inventors: Neil Burgess, David Raymond Lutz, Javier Diaz Bruguera
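
The arithmetic result that the MAP instruction in the abstract above produces can be stated as a short reference model. This sketch shows only the sum of portion-wise products. It does not model the operand rearrangement or the reuse of the multiplier array's adders. The function name `map_reference` and its parameters are hypothetical.

```python
def map_reference(a: int, b: int, e: int, f: int, count: int) -> int:
    """Sum of products of the E-bit portions of a and the F-bit portions of b.

    Portion i of a is bits [i*e, (i+1)*e); portion i of b is bits
    [i*f, (i+1)*f). `count` is the number of portion pairs to accumulate.
    """
    mask_e = (1 << e) - 1
    mask_f = (1 << f) - 1
    return sum(
        ((a >> (i * e)) & mask_e) * ((b >> (i * f)) & mask_f)
        for i in range(count)
    )
```

With `a = 0x0102`, `b = 0x0304`, and 8-bit portions, the model accumulates the products of the two low bytes and of the two high bytes.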
Bit processing

Patent number: 10366741

Abstract: Circuitry comprises: a set of bit processing circuitries to apply two or more successive instances of bitwise processing to an ordered bit array; each bit processing circuitry for a given bit position within the ordered bit array comprising: bit shifting circuitry to selectively apply a bit shift of a respective input bit to a next bit processing circuitry in a first direction relative to the ordered bit array, in response to an active state of a bit shift control signal, the bit shifting circuitry not applying the bit shift in response to an inactive state of the bit shift control signal; and bit shift control circuitry to selectively allow or inhibit a bit shifting operation in response to one or more inhibit control signals; in which: the bit shift control circuitry is configured to selectively propagate an output inhibit control signal, indicating that a bit shifting operation should be inhibited, as an inhibit control signal to bit processing circuitry applying a next instance of the bitwise processing a

Type: Grant

Filed: September 21, 2017

Date of Patent: July 30, 2019

Assignee: ARM Limited

Inventors: Neil Burgess, Nigel John Stephens, Lee Evan Eisen, Jaime Ferragut Martinez-Vara De Rey
Apparatus and method for shifting quadwords and extracting packed words

Patent number: 10318298

Abstract: An apparatus and method for performing left-shifting operations on packed quadword data.

Type: Grant

Filed: September 29, 2017

Date of Patent: June 11, 2019

Assignee: Intel Corporation

Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal
SIMD variable shift and rotate using control manipulation

Patent number: 10296333

Abstract: Vector single instruction multiple data (SIMD) shift and rotate instructions are provided specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, and a second vector register. Vector data fields of a first element size are duplicated. Duplicate vector data fields are stored as corresponding data fields of twice the first element size. Control logic receives an element size for performing a SIMD shift or rotation operation. Through selectors corresponding to a vector element, portions are selected from the duplicated data fields, the selectors corresponding to any particular vector element select all portions similarly from the duplicated data fields for that particular vector element responsive to the first element size, but selectors corresponding to any particular vector element select at least two portions from the duplicated data fields differently for that particular vector element responsive to a second element size.

Type: Grant

Filed: December 27, 2016

Date of Patent: May 21, 2019

Assignee: Intel Corporation

Inventors: Asaf Rubinstein, Tom Aviram
Selectively combinable directional shifters

Patent number: 10289382

Abstract: An apparatus for mathematical manipulation is described allowing the selective combination of shifters to shift binary numbers of various widths. Selective combination allows on-the-fly adjustment of shifters from independent to coordinated shifting operations. Selective combination allows adjustable hardware-based shifting while saving space and resources. Multiple eight-bit shifters can be configured for a variety of operand widths, such as a 32-bit width, a 24-bit width, a 16-bit width, or an eight-bit width. Multiplexers route the appropriate input data to the appropriate shifters. Bidirectional shifting is configured through a selector tree, including both shift left and shift right operations. Opcodes configure the shifters for the desired type of shift and a shifted result is generated.

Type: Grant

Filed: March 30, 2018

Date of Patent: May 14, 2019

Assignee: Wave Computing, Inc.

Inventor: Samit Chaudhuri
Permuting in a matrix-vector processor

Patent number: 10216705

Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.

Type: Grant

Filed: April 30, 2018

Date of Patent: February 26, 2019

Assignee: Google LLC

Inventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
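
The permutation described in the abstract above places each input element at the output position named by its control element. A minimal software sketch follows, assuming the control vector holds a valid permutation of the output indices. The function name `permute` is hypothetical.

```python
def permute(input_vec, control_vec):
    """Scatter: out[control_vec[i]] = input_vec[i] for every element i."""
    if len(input_vec) != len(control_vec):
        raise ValueError("input and control vectors must have equal length")
    out = [None] * len(input_vec)
    for elem, pos in zip(input_vec, control_vec):
        out[pos] = elem   # the control element selects the output position
    return out
```

If the control vector repeated an index, a later element would overwrite an earlier one and some output slot would stay `None`. The sketch leaves that case unhandled because the abstract does not say how the circuit treats it.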
Shift instruction

Patent number: 10162633

Abstract: An apparatus has processing circuitry comprising multiplier circuitry for performing multiplication on a pair of input operands. In response to a shift instruction specifying at least one shift amount and a source operand comprising at least one data element, the source operand and a shift operand determined in dependence on the shift amount are provided as input operands to the multiplier circuitry and the multiplier circuitry is controlled to perform at least one multiplication which is equivalent to shifting a corresponding data element of the source operand by a number of bits specified by a corresponding shift amount to generate a shift result value.

Type: Grant

Filed: April 24, 2017

Date of Patent: December 25, 2018

Assignee: ARM Limited

Inventors: François Christopher Jacques Botman, Thomas Christopher Grocutt
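
The equivalence the abstract above relies on is that shifting by n bits matches multiplying by a power of two. The sketch below shows it for unsigned elements of a fixed width. The function names and the 8-bit default width are assumptions made for the example.

```python
def shl_via_mul(x: int, n: int, width: int = 8) -> int:
    """Left shift: the low `width` bits of x * 2**n."""
    return (x * (1 << n)) & ((1 << width) - 1)

def shr_via_mul(x: int, n: int, width: int = 8) -> int:
    """Logical right shift: the high half of x * 2**(width - n)."""
    return (x * (1 << (width - n))) >> width
```

In both cases the "shift operand" fed to the multiplier is a one-hot value derived from the shift amount. A left shift reads the low half of the double-width product and a right shift reads the high half.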
Integrating sign extensions for loads

Patent number: 10126976

Abstract: An address and a data size are provided to a rotator. The rotator stores, based on the address and the data size, a data element in a location having a defined number of positions. The data element includes one or more data units and the one or more data units are aligned correctly in one or more positions of the location based on a predefined position in the location to receive a selected data unit of the one or more data units. The rotator replicates a value of a chosen data unit of the one or more data units to one or more other positions of the location.

Type: Grant

Filed: February 17, 2017

Date of Patent: November 13, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind
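
The combined align-and-extend step in the abstract above can be sketched at byte granularity. The example assumes big-endian data, right alignment within the destination, and that the replicated value is a fill byte derived from the sign bit of the most significant loaded byte. The name `load_sign_extended` and the 8-byte default width are hypothetical.

```python
def load_sign_extended(data: bytes, width: int = 8) -> bytes:
    """Right-align big-endian `data` in `width` bytes, replicating the sign."""
    if not 0 < len(data) <= width:
        raise ValueError("data must be 1..width bytes long")
    fill = b"\xff" if data[0] & 0x80 else b"\x00"   # sign of the top loaded byte
    return fill * (width - len(data)) + data        # replicate into the upper positions
```

Interpreting the result with `int.from_bytes(..., "big", signed=True)` gives the same value as the original narrow data. That equality is the property a sign-extending load has to preserve.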
Fast close path solution for a three-path fused multiply-add design

Patent number: 10108397

Abstract: Embodiments of the inventive concept include a fast close path solution and circuit of a three path fused multiply-adder circuit. The fast close path circuit can include one or more compressors that can receive multiple operands and produce a result sum and a result carry. The close path circuit can include one or more leading zero anticipators (LZAs). The one or more LZAs can receive and process the result sum and the result carry. The close path circuit can include one or more adders. The one or more adders can receive and add the result sum and the result carry in parallel with the one or more LZAs processing the result sum and the result carry. Since the close path is the critical timing path, by performing the addition operations in parallel with the LZA and/or priority encode (PENC) operations, the logic depth and latency of the close path are reduced.

Type: Grant

Filed: February 1, 2016

Date of Patent: October 23, 2018

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Ashraf Ahmed
Single instruction array index computation

Patent number: 10013258

Abstract: Embodiments are directed to a method of adjusting an index, wherein the index identifies a location of an element within an array. The method includes executing, by a computer, a single instruction that adjusts a first parameter of the index to match a parameter of an array address. The single instruction further adjusts a second parameter of the index to match a parameter of the array element. The adjustment of the first parameter includes a sign extension.

Type: Grant

Filed: September 29, 2014

Date of Patent: July 3, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind
Permuting in a matrix-vector processor

Patent number: 9959247

Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.

Type: Grant

Filed: April 25, 2017

Date of Patent: May 1, 2018

Assignee: Google LLC

Inventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
Arithmetic apparatus and control method of the same using cordic algorithm

Patent number: 9959094

Abstract: An arithmetic apparatus comprises a plurality of cascade-connected arithmetic units. Each of the plurality of arithmetic units comprises: a calculator configured to operate in one of a rotation mode of performing a rotation calculation, and a vectoring mode of calculating a rotation angle; and a holding unit configured to hold rotational direction information output from the calculator in the vectoring mode. In addition, when operating in the rotation mode, the calculator performs the rotation calculation on data input from an arithmetic unit in a preceding stage, based on the rotational direction information held in the holding unit.

Type: Grant

Filed: June 3, 2015

Date of Patent: May 1, 2018

Assignee: CANON KABUSHIKI KAISHA

Inventors: Tadayoshi Nakayama, Koki Mitsunami
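
The two modes named in the abstract above can be sketched with a textbook floating-point CORDIC. Vectoring mode records the direction chosen at each stage, and rotation mode replays those held directions on other data. This is an illustrative software model, not the patented hardware. The stage count `N` and all function names are assumptions.

```python
import math

N = 16                                         # number of cascaded stages (assumed)
ANGLES = [math.atan(2.0 ** -i) for i in range(N)]
GAIN = 1.0
for i in range(N):
    GAIN *= math.sqrt(1.0 + 4.0 ** -i)         # accumulated CORDIC gain

def vectoring(x: float, y: float):
    """Drive y toward 0 (requires x > 0); return (magnitude, angle, directions)."""
    dirs, angle = [], 0.0
    for i in range(N):
        d = -1 if y > 0 else 1                 # rotate toward the x axis
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        angle -= d * ANGLES[i]                 # applied rotation is the negative of the angle
        dirs.append(d)                         # "held" rotational direction information
    return x / GAIN, angle, dirs

def rotate_with(dirs, x: float, y: float):
    """Rotation mode: replay held directions on another vector."""
    for i, d in enumerate(dirs):
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
    return x / GAIN, y / GAIN
```

Running `vectoring(1.0, 1.0)` yields a magnitude close to the square root of 2 and an angle close to pi/4. Passing the returned directions to `rotate_with` rotates any other vector by the negative of that angle without recomputing it.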
