Arithmetic Operation Instruction Processing Patents (Class 712/221)

Floating point or vector (Class 712/222)

Accelerating data processing by offloading thread computation

Patent number: 12118397

Abstract: The present disclosure describes techniques for accelerating data processing by offloading thread computation. An application may be started based on creating and executing a process by a host, the process associated with a plurality of threads. Creating a plurality of computation threads on a storage device may be requested based on determining that the storage device represents a computational storage. The plurality of computation threads may be created based on preloading a plurality of libraries in the storage device. The plurality of libraries may comprise executable codes associated with the plurality of threads. Data processing associated with the plurality of threads may be offloaded to the storage device using the plurality of computation threads. Activities associated with the plurality of computation threads may be managed by the process.

Type: Grant

Filed: September 15, 2022

Date of Patent: October 15, 2024

Assignees: Lemon Inc., Beijing Youzhuju Network Technology Co. Ltd.

Inventors: Viacheslav Dubeyko, Jian Wang
Data processing circuit for neural network

Patent number: 12014264

Abstract: A data processing circuit is disclosed. The data processing circuit relates to the field of digital circuits, and includes a first computing circuit and an input control circuit. The first computing circuit includes one or more computing sub-circuits. Each computing sub-circuit includes a first addition operation circuit, a multiplication operation circuit, a first comparison operation circuit, and a first nonlinear operation circuit. The first nonlinear operation circuit includes at least one of an exponential operation circuit and a logarithmic operation circuit. The input control circuit is configured to: control the first computing circuit to read input data and an input parameter, and control, according to a received first instruction, the operation circuit in the computing sub-circuit included in the first computing circuit, to perform an operation on the input data and the input parameter.

Type: Grant

Filed: August 28, 2020

Date of Patent: June 18, 2024

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Zhanying He, Bin Xu, Honghui Yuan
Vector processor data storage

Patent number: 11899967

Abstract: Aspects of the present disclosure provide an aligned storage strategy for stripes within a long vector for a vector processor, such that the extra computation needed to track strides between input stripes and output stripes may be eliminated. As a result, the stripe locations are located in a more predictable memory access pattern such that memory access bandwidth may be improved and the tendency for memory error may be reduced.

Type: Grant

Filed: November 15, 2021

Date of Patent: February 13, 2024

Assignee: Lightmatter, Inc.

Inventors: Nicholas Moore, Gongyu Wang, Bradley Dobbie, Tyler J. Kenney, Ayon Basumallik
Systems and methods for performing nibble-sized operations on matrix elements

Patent number: 11886875

Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction, the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.

Type: Grant

Filed: December 26, 2018

Date of Patent: January 30, 2024

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Jonathan D. Pearce, Dan Baum, Guei-Yuan Lueh, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
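
The per-nibble behavior described in the abstract above can be illustrated with a small software sketch. It is not the patented circuit: the function names, the 8-bit default element width, and the choice to let each nibble wrap around without carrying into its neighbor are all assumptions made for illustration.

```python
import operator


def nibble_op(a: int, b: int, op, width: int = 8) -> int:
    """Apply op independently to each 4-bit partition of two width-bit elements.

    Assumption: each partition result is truncated to 4 bits, so nothing
    carries from one nibble into the next.
    """
    result = 0
    for shift in range(0, width, 4):
        na = (a >> shift) & 0xF          # nibble of the first source element
        nb = (b >> shift) & 0xF          # matching nibble of the second source
        result |= (op(na, nb) & 0xF) << shift
    return result


def matrix_nibble_op(src1, src2, op, width: int = 8):
    """Apply nibble_op to every pair of corresponding matrix elements."""
    return [
        [nibble_op(x, y, op, width) for x, y in zip(row1, row2)]
        for row1, row2 in zip(src1, src2)
    ]
```

For example, `matrix_nibble_op([[0x1F, 0x22]], [[0x21, 0x11]], operator.add)` adds the low and high nibbles of each element pair separately.
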
Padded vectorization with compile time known masks

Patent number: 11789734

Abstract: A computing system includes a processing unit and a memory storing instructions that, when executed by the processor, cause the processor to receive program source code in a compiler, identify in the program source code a set of operations for vectorizing, where each operation in the set of operations specifies a set of one or more operands, in response to identifying the set of operations, vectorize the set of operations by, based on the number of operations in the set of operations and a total number of lanes in a first vector register, generating a mask indicating a first unmasked lane and a first masked lane in the first vector register, based on the mask, generating a set of one or more instructions for loading into the first unmasked lane a first operand of a first operation of the set of operations, and loading the first operand into the first masked lane.

Type: Grant

Filed: August 9, 2019

Date of Patent: October 17, 2023

Assignee: Advanced Micro Devices, Inc.

Inventor: Anupama Rajesh Rasale
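
A minimal sketch of the padding idea in the abstract above, assuming the operations are simple element-wise additions and that the mask is a list of booleans (both assumptions; the abstract does not fix either). Lanes beyond the number of operations are masked and filled with a copy of the first operand, so the vector is fully populated but only the unmasked lanes contribute results.

```python
def make_mask(num_ops: int, num_lanes: int):
    """True marks an unmasked (active) lane, False a masked (padding) lane."""
    if not 0 < num_ops <= num_lanes:
        raise ValueError("num_ops must be between 1 and num_lanes")
    return [lane < num_ops for lane in range(num_lanes)]


def pack_operands(operands, num_lanes: int):
    """Place operands in unmasked lanes and pad masked lanes with operands[0]."""
    mask = make_mask(len(operands), num_lanes)
    lanes = [operands[i] if active else operands[0]
             for i, active in enumerate(mask)]
    return lanes, mask


def masked_add(a_ops, b_ops, num_lanes: int):
    """Add lane by lane and keep only the results from unmasked lanes."""
    a, mask = pack_operands(a_ops, num_lanes)
    b, _ = pack_operands(b_ops, num_lanes)
    return [x + y for x, y, active in zip(a, b, mask) if active]
```

Because the number of operations and the lane count are both known when the code is generated, the mask is a constant and needs no run-time computation.
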
Methods, apparatus, and articles of manufacture to increase data reuse for multiply and accumulate (MAC) operations

Patent number: 11789646

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.

Type: Grant

Filed: September 24, 2021

Date of Patent: October 17, 2023

Assignee: INTEL CORPORATION

Inventors: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
Processing with compact arithmetic processing element

Patent number: 11768659

Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).

Type: Grant

Filed: September 23, 2020

Date of Patent: September 26, 2023

Assignee: SINGULAR COMPUTING LLC

Inventor: Joseph Bates
Processing with compact arithmetic processing element

Patent number: 11768660

Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).

Type: Grant

Filed: January 26, 2023

Date of Patent: September 26, 2023

Assignee: SINGULAR COMPUTING LLC

Inventor: Joseph Bates
Method and apparatus for storing data, and computer device and storage medium thereof

Patent number: 11705923

Abstract: Disclosed are a method and apparatus for storing data. The method includes: acquiring data to be stored; converting the data to be stored from an initial data type to a target data type, a data length corresponding to the target data type being less than that corresponding to the initial data type; and storing the data to be stored of the target data type to a database. In the method according to the present disclosure, a storage space occupied by the data to be stored in the database is greatly reduced. In addition, the method according to the present disclosure is performed prior to lossy or lossless data compression storage of the data to be stored in the related art. That is, on the basis of a compression ratio when the data to be stored is stored in the related art, the present disclosure further improves a compression effect of the data to be stored by reducing the data length when the data to be stored is stored, and further saves storage resources of the database.

Type: Grant

Filed: November 20, 2020

Date of Patent: July 18, 2023

Assignees: ENVISION DIGITAL INTERNATIONAL PTE. LTD., SHANGHAI ENVISION DIGITAL CO., LTD.

Inventors: Li Lei, Hong Zhao, Xiaomeng Chen, Degang Ning
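
As a rough illustration of converting data to a shorter type before storage, the sketch below narrows 64-bit floats to 32-bit floats using Python's standard `struct` module. The choice of floating-point types is an assumption; the abstract only requires that the target type be shorter than the initial type.

```python
import struct


def narrow(values):
    """Encode values as little-endian 32-bit floats (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def widen(blob: bytes):
    """Decode 32-bit floats back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def stored_size_as_float64(values) -> int:
    """Size in bytes if the same values were kept as 64-bit floats."""
    return len(struct.pack(f"<{len(values)}d", *values))
```

Narrowing halves the stored size here, and any general-purpose compressor applied afterwards starts from the smaller representation. Values that are not exactly representable in the target type lose precision, so the conversion is only appropriate when that loss is acceptable.
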
Systems, methods, and apparatuses for dot product operations

Patent number: 11669326

Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions include computing a dot product of signed words and accumulating in a quadword data elements of a matrix pair. Additionally, in some instances, non-accumulating quadword data elements of the matrix pair are set to zero.

Type: Grant

Filed: December 29, 2017

Date of Patent: June 6, 2023

Assignee: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
Fully pipelined hardware operator logic circuit for converting human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point format representations

Patent number: 11635956

Abstract: A fully pipelined convertToBinaryFromDecimalCharacter hardware operator logic circuit configured to convert one or more human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point representations every clock cycle. The circuit converts decimal character sequence floating-point representations up to 28 decimal digits in length to IEEE 754 binary64, binary32, or binary16 floating-point format representations.

Type: Grant

Filed: December 18, 2021

Date of Patent: April 25, 2023

Inventor: Jerry D. Harthcock
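
A software analogue of the conversion described in the abstract above, using Python's built-in `float()` parser and the `struct` module. This is only a sketch of the input/output relationship, not the pipelined circuit: the function name is hypothetical, the 28-digit limit is not modeled, and narrowing through binary64 can in rare cases round differently from a direct conversion to binary32 or binary16.

```python
import struct

# struct format codes for the three IEEE 754 binary interchange formats
FORMAT_CODES = {"binary64": ">d", "binary32": ">f", "binary16": ">e"}


def decimal_to_binary(text: str, fmt: str = "binary64") -> str:
    """Return the big-endian bit pattern, as hex, for a decimal string."""
    value = float(text)  # parsed and rounded to binary64
    return struct.pack(FORMAT_CODES[fmt], value).hex()
```

For instance, `decimal_to_binary("1.0", "binary32")` yields the familiar pattern `3f800000`.
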
Data cache with prediction hints for cache hits

Patent number: 11620229

Abstract: Described is a data cache with prediction hints for a cache hit. The data cache includes a plurality of cache lines, where a cache line includes a data field, a tag field, and a prediction hint field. The prediction hint field is configured to store a prediction hint which directs alternate behavior for a cache hit against the cache line. The prediction hint field is integrated with the tag field or is integrated with a way predictor field.

Type: Grant

Filed: February 21, 2020

Date of Patent: April 4, 2023

Assignee: SiFive, Inc.

Inventors: John Ingalls, Josh Smith
Bypassing zero-value multiplications in a hardware multiplier

Patent number: 11614920

Abstract: A device (e.g., integrated circuit chip) includes a first operand register, a second operand register, a multiplication unit, and a hardware logic component. The first operand register is configured to store a first operand value. The second operand register is configured to store a second operand value. The multiplication unit is configured to at least multiply the first operand value with the second operand value. The hardware logic component is configured to detect whether a zero value is provided and in response to a detection that the zero value is being provided: cause an update of at least the first operand register to be disabled, and cause a result of a multiplication of the first operand value with the second operand value to be a zero-value result.

Type: Grant

Filed: May 7, 2020

Date of Patent: March 28, 2023

Assignee: Meta Platforms, Inc.

Inventors: Thomas Mark Ulrich, Abdulkadir Utku Diril, Zhao Wang
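
The zero-bypass behavior can be modeled in a few lines. In this sketch, which is an illustrative model rather than the claimed hardware, a zero on either input leaves both operand registers untouched and forces a zero result; the class name and the `register_updates` counter are hypothetical, and freezing both registers (rather than only the first) is an assumption.

```python
class ZeroBypassMultiplier:
    """Toy model of a multiplier that skips work when an input is zero."""

    def __init__(self):
        self.reg_a = 0               # first operand register
        self.reg_b = 0               # second operand register
        self.register_updates = 0    # counts how often the registers were written

    def multiply(self, a: int, b: int) -> int:
        if a == 0 or b == 0:
            # Zero detected: registers keep their previous values, so the
            # multiplier inputs do not toggle, and the result is forced to zero.
            return 0
        self.reg_a, self.reg_b = a, b
        self.register_updates += 1
        return self.reg_a * self.reg_b
```

Holding the registers steady is what saves switching activity in hardware; the model only counts the avoided updates.
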
Systems and methods for performing horizontal tile operations

Patent number: 11579883

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of a M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.

Type: Grant

Filed: September 14, 2018

Date of Patent: February 14, 2023

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Bret Toll, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
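
A minimal sketch of a horizontal tile operation, assuming each row of the M by N source matrix forms one group (so K equals M) and that the K destinations are the entries of a returned list. Both are assumptions; the abstract allows other groupings.

```python
from functools import reduce
import operator


def horizontal_tile_op(tile, op):
    """Reduce every row of the tile with op, yielding one result per row."""
    return [reduce(op, row) for row in tile]
```

With `operator.add` this gives row sums, and passing `max` instead gives the largest element of each row.
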
Human-machine-interface system comprising a convolutional neural network hardware accelerator

Patent number: 11567770

Abstract: A human-machine-interface system comprising: register-file-memory, configured to store input-data; a first-processing-element-slice, a second-processing-element-slice, and a controller. Each of the processing-slices comprises: a register configured to store register-data; and a processing-element configured to apply an arithmetic and logic operation on the register-data in order to provide convolution-output-data. The controller is configured to: load input-data from the register-file-memory into the first-register as the first-register-data; and load: (i) input-data from the register-file-memory, or (ii) the first-register-data from the first-register, into the second-register as the second-register-data.

Type: Grant

Filed: April 3, 2018

Date of Patent: January 31, 2023

Assignee: NXP B.V.

Inventors: Jose de Jesus Pineda de Gyvez, Hamed Fatemi, Gonzalo Moro Pérez, Hendrik Corporaal
Widening arithmetic in a data processing apparatus

Patent number: 11567763

Abstract: A data processing apparatus, a method of operating a data processing apparatus, a non-transitory computer readable storage medium, and an instruction are provided. The instruction specifies a first source register and a second source register. In response to the instruction control signals are generated, causing processing circuitry to perform a dot product operation. For this operation at least a first data element and a second data element are extracted from each of the first source register and the second source register, such that then at least first data element pairs and second data element pairs are multiplied together. The dot product operation is performed independently in each of multiple intra-register lanes across each of the first source register and the second source register. A widening operation with a large density of operations per instruction is thus provided.

Type: Grant

Filed: January 26, 2018

Date of Patent: January 31, 2023

Assignee: Arm Limited

Inventor: David Hennah Mansell
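
The lane-wise dot product in the abstract above can be sketched as follows, assuming each lane holds four narrow elements and that registers are modeled as flat Python lists (both assumptions). Python integers do not overflow, so the wider result type is implicit here.

```python
def widening_dot(src1, src2, elems_per_lane: int = 4):
    """Compute one dot product per lane over two equally sized registers."""
    if len(src1) != len(src2) or len(src1) % elems_per_lane:
        raise ValueError("registers must match and divide evenly into lanes")
    results = []
    for base in range(0, len(src1), elems_per_lane):
        # Multiply the element pairs of this lane and sum the products.
        lane_sum = sum(src1[base + i] * src2[base + i]
                       for i in range(elems_per_lane))
        results.append(lane_sum)
    return results
```

Four products of 8-bit values can exceed the 8-bit range many times over, which is why each lane's result needs a wider element than its inputs.
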
Apparatus and method to switch configurable logic units

Patent number: 11507531

Abstract: Examples described herein include systems and methods which include an apparatus comprising a plurality of configurable logic units and a plurality of switches, with each switch being coupled to at least one configurable logic unit of the plurality of configurable logic units. The apparatus further includes an instruction register configured to provide respective switch instructions of a plurality of switch instructions to each switch based on a computation to be implemented among the plurality of configurable logic units. For example, the switch instructions may include allocating the plurality of configurable logic units to perform the computation and activating an input of the switch and an output of the switch to couple at least a first configurable logic unit and a second configurable logic unit. In various embodiments, configurable logic units can include arithmetic logic units (ALUs), bit manipulation units (BMUs), and multiplier-accumulator units (MACs).

Type: Grant

Filed: February 25, 2021

Date of Patent: November 22, 2022

Assignee: MICRON TECHNOLOGY, INC.

Inventors: Fa-Long Luo, Tamara Schmitz, Jeremy Chritz, Jaime Cummins
Compute optimizations for low precision machine learning operations

Patent number: 11468541

Abstract: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple or mixed precisions and dynamic ranges.

Type: Grant

Filed: April 14, 2022

Date of Patent: October 11, 2022

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
Group load register of a graph streaming processor

Patent number: 11416261

Abstract: Methods, systems and apparatuses for graph streaming processing are disclosed. One method includes loading, by a group load register, a subset of an input tensor from a data cache, wherein the group load register provides the subset of the input tensor to all of a plurality of processors, loading, by a plurality of weight data registers, a plurality of weights of a weight tensor, wherein each of the weight data registers provides a weight to a single one of the plurality of processors, and performing, by the plurality of processors, a SOMAC (Sum-Of-Multiply-Accumulate) instruction, including simultaneously determining, by each of the plurality of processors, an instruction size of the SOMAC instruction, wherein the instruction size indicates a number of iterations that the SOMAC instruction is to be executed and is equal to a number of outputs within a subset of a plurality of output tensors.

Type: Grant

Filed: July 15, 2020

Date of Patent: August 16, 2022

Assignee: Blaize, Inc.

Inventors: Satyaki Koneru, Kamaraj Thangam, Sruthikesh Surineni
Systems and methods for performing duplicate detection instructions on 2D data

Patent number: 11294671

Abstract: Disclosed embodiments relate to systems and methods for performing duplicate detection instructions on two-dimensional (2D) data. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode and locations of a source matrix comprising M×N elements and a destination, the opcode to indicate execution circuitry is to use a plurality of comparators to discover duplicates in the source matrix, and store indications of locations of discovered duplicates in the destination. The execution circuitry is to execute the decoded instruction as per the opcode.

Type: Grant

Filed: December 26, 2018

Date of Patent: April 5, 2022

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Michael Espig, Dan Baum, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall
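
One way to picture the duplicate detection result is as a flag matrix of the same shape as the source. The sketch below assumes row-major ordering and flags every element that equals an earlier element; the abstract does not specify the encoding of the destination, so this layout is an assumption, and the nested comparison loop merely stands in for the bank of comparators.

```python
def detect_duplicates(src):
    """Return an M x N matrix of booleans marking repeated values."""
    cols = len(src[0])
    flat = [value for row in src for value in row]   # row-major order
    # An element is a duplicate if any earlier element holds the same value.
    flags = [any(flat[j] == flat[i] for j in range(i))
             for i in range(len(flat))]
    return [flags[r * cols:(r + 1) * cols] for r in range(len(src))]
```
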
Register-based matrix multiplication with multiple matrices per register

Patent number: 11288066

Abstract: Techniques for performing matrix multiplication in a data processing apparatus are disclosed, comprising apparatuses, matrix multiply instructions, methods of operating the apparatuses, and virtual machine implementations. Registers, each register for storing at least four data elements, are referenced by a matrix multiply instruction and in response to the matrix multiply instruction a matrix multiply operation is carried out. First and second matrices of data elements are extracted from first and second source registers, and plural dot product operations, acting on respective rows of the first matrix and respective columns of the second matrix are performed to generate a square matrix of result data elements, which is applied to a destination register. A higher computation density for a given number of register operands is achieved with respect to vector-by-element techniques.

Type: Grant

Filed: June 8, 2018

Date of Patent: March 29, 2022

Assignee: Arm Limited

Inventors: David Hennah Mansell, Rune Holm, Ian Michael Caulfield, Jelena Milanovic
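
To make the register-based scheme concrete, the sketch below treats a four-element register as one 2×2 matrix stored in row-major order and computes the four dot products of rows and columns. The layout and the single-matrix-per-register simplification are assumptions; the patent also covers registers holding several matrices.

```python
def matmul_2x2_registers(src1, src2):
    """Multiply two 2x2 matrices held as four-element registers."""
    a = [src1[0:2], src1[2:4]]   # rows of the first matrix
    b = [src2[0:2], src2[2:4]]   # rows of the second matrix
    # Each result element is the dot product of a row of a and a column of b.
    return [sum(a[i][k] * b[k][j] for k in range(2))
            for i in range(2) for j in range(2)]
```

Two source registers of four elements each yield eight multiplications, whereas a vector-by-element multiply over the same registers performs only four.
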
Hardware security to countermeasure side-channel attacks

Patent number: 11227071

Abstract: A method and an apparatus for hardware security to countermeasure side-channel attacks are provided. The method or apparatus may introduce at least one redundant or partial redundant computation having a similar power dissipation profile or an electromagnetic emission profile when compared to that of a genuine operation for cryptographic devices, and/or to reorder the iterations of operations in a different sequence. The redundant or partial redundant computation may be performed by using a different password key and/or a different raw data (e.g., plaintext). The presence of the redundant or partial redundant computation would make side-channel attacks difficult in the sense that genuine or redundant/partial redundant operations are difficult to be clearly identified, hence serving as a countermeasure for hardware security.

Type: Grant

Filed: March 19, 2018

Date of Patent: January 18, 2022

Assignee: Nanyang Technological University

Inventors: Kwen Siong Chong, Bah Hwee Gwee, Ali Akbar Pammu
Superimposing butterfly network controls for pattern combinations

Patent number: 11221982

Abstract: A multilayer butterfly network is shown that is operable to transform and align a plurality of fields from an input to an output data stream. Many transformations are possible with such a network which may include separate control of each multiplexer. This invention supports a limited set of multiplexer control signals, which enables a similarly limited set of data transformations. This limited capability is offset by the reduced complexity of the multiplexer control circuits. This invention uses precalculated inputs and simple combinatorial logic to generate control signals for the butterfly network. Controls are independent for each layer and therefore are dependent only on the input and output patterns. Controls for the layers can be calculated in parallel.

Type: Grant

Filed: August 20, 2019

Date of Patent: January 11, 2022

Assignee: Texas Instruments Incorporated

Inventors: Dheera Balasubramanian, Joseph Zbiciak, Sureshkumar Govindaraj
Systems and methods to perform floating-point addition with selected rounding

Patent number: 11175891

Abstract: Disclosed embodiments relate to performing floating-point addition with selected rounding. In one example, a processor includes circuitry to decode and execute an instruction specifying locations of first and second floating-point (FP) sources, and an opcode indicating the processor is to: bring the FP sources into alignment by shifting a mantissa of the smaller source FP operand to the right by a difference between their exponents, generating rounding controls based on any bits that escape; simultaneously generate a sum of the FP sources and of the FP sources plus one, the sums having a fuzzy-Jbit format having an additional Jbit into which a carry-out, if any, select one of the sums based on the rounding controls, and generate a result comprising a mantissa-wide number of most-significant bits of the selected sum, starting with the most significant non-zero Jbit.

Type: Grant

Filed: March 30, 2019

Date of Patent: November 16, 2021

Assignee: Intel Corporation

Inventors: Simon Rubanovich, Amit Gradstein, Zeev Sperber, Mrinmay Dutta
Providing exception stack management using stack panic fault exceptions in processor-based devices

Patent number: 11175926

Abstract: Providing exception stack management using stack panic fault exceptions in processor-based devices is disclosed. In this regard, a processor device defines a “stack panic fault exception” that may be raised upon execution of an exception handler store operation attempting to write state data into an exception stack, and provides a dedicated plurality of stack panic fault exception state registers in which stack panic fault exception state data may be saved. Upon detecting a first exception, the processor device transfers program control to an exception handler for the first exception. If a second exception occurs upon execution of a store operation in the exception handler, the processor device determines that the second exception should be handled as a stack panic fault exception, saves the stack panic fault exception state data in the stack panic fault exception state registers, and transfers program control to a stack panic fault exception handler.

Type: Grant

Filed: April 8, 2020

Date of Patent: November 16, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Thomas Andrew Sartorius, Michael Scott McIlvaine, James Norris Dieffenderfer, Aaron S. Giles
Compute optimizations for low precision machine learning operations

Patent number: 11138686

Abstract: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple precisions.

Type: Grant

Filed: June 19, 2019

Date of Patent: October 5, 2021

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
Processing in-memory architectures for performing logical operations

Patent number: 11126549

Abstract: In an example, a method includes identifying, using at least one processor, data portions of a plurality of distinct data objects stored in at least one memory which are to be processed using the same logical operation. The method may further include identifying a representation of an operand stored in at least one memory, the operand being to provide the logical operation and providing a logical engine with the operand. The data portions may be stored in a plurality of input data buffers, wherein each of the input data buffers comprises a data portion of a different data object. The logical operation may be carried out on each of the data portions using the logical engine, and the outputs for each data portion may be stored in a plurality of output data buffers, wherein each of the outputs comprises data derived from a different data object.

Type: Grant

Filed: March 31, 2016

Date of Patent: September 21, 2021

Assignee: Hewlett Packard Enterprise Development LP

Inventors: Naveen Muralimanohar, Ali Shafiee Ardestani
Contextual awareness associated with resources

Patent number: 11119818

Abstract: Contextual awareness associated with resources can be employed to facilitate controlling access to resources of a system, including function blocks. A resource manager component (RMC) can pre-load a defined number of respective versions of configuration parameter data associated with respective applications in each resource. With regard to each application, the RMC can associate a context value, unique for each application, with the respective versions of configuration parameter data associated with that application. When a current application is being changed to a next application, the RMC can write the context value associated with the next application to a context select component (CSC). Each resource can read the context value in the CSC, identify and retrieve the version of configuration parameter data associated with the next application based on the context value, and configure the function block based on the version of configuration parameter data.

Type: Grant

Filed: December 31, 2019

Date of Patent: September 14, 2021

Assignee: GE Aviation Systems, LLC

Inventors: Melanie Sue-Hanson Graffy, Colin Holmwood, Jon Marc Diekema
System and method for translating a guest instruction of a guest architecture into at least one host instruction of a host architecture

Patent number: 11099868

Abstract: A system and method are provided for translating a guest instruction of a guest architecture into at least one host instruction of a host architecture. The method comprises providing multiple representation states, each representation state providing a representation in the host architecture for at least one item of state from the guest architecture. A current representation state is then determined from amongst the multiple representation states, and the guest instruction is translated into at least one host instruction in dependence on the current representation state. Through the use of multiple representation states, it has been found that the efficiency of the code translation can be significantly increased, thereby giving rise to performance and energy consumption benefits.

Type: Grant

Filed: March 4, 2016

Date of Patent: August 24, 2021

Assignee: ARM LIMITED

Inventor: Edmund Thomas Grimley-Evans
Systems, methods, and apparatuses for heterogeneous computing

Patent number: 11093277

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

Type: Grant

Filed: June 26, 2020

Date of Patent: August 17, 2021

Assignee: Intel Corporation

Inventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
Configurable code fingerprint

Patent number: 11010276

Abstract: A method, computer program product, and system in which a processor defines a code fingerprint by obtaining parameters describing at least one of an event type or an event. The code fingerprint includes a first sequence. The processor loads the code fingerprint into a register accessible to the processor. Concurrent with executing a program, the processor obtains the code fingerprint from the register and identifies the code fingerprint in the program by comparing a second sequence in the program to the first sequence. Based on identifying the code fingerprint in the program, the processor alerts a runtime environment where the program is executing.

Type: Grant

Filed: October 2, 2019

Date of Patent: May 18, 2021

Assignee: International Business Machines Corporation

Inventors: Giles R. Frazier, Michael K. Gschwind, Christian Jacobi, Chung-Lung K. Shum
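
The sequence comparison described in the abstract of patent 11010276 above can be illustrated with a short Python sketch. It assumes the fingerprint is a list of event names and scans a recorded trace after the fact, whereas the abstract describes matching concurrently with execution; `find_fingerprint`, the event names, and the `alerts` list are hypothetical.

```python
def find_fingerprint(events, fingerprint):
    """Return the index where `fingerprint` first occurs in `events`, or -1."""
    n = len(fingerprint)
    if n == 0:
        return -1
    for start in range(len(events) - n + 1):
        # Compare a window of the program's events to the stored sequence.
        if events[start:start + n] == fingerprint:
            return start
    return -1


# Hypothetical "register" contents and program trace.
fingerprint_register = ["load", "branch", "store"]
trace = ["add", "load", "branch", "store", "mul"]

alerts = []
pos = find_fingerprint(trace, fingerprint_register)
if pos >= 0:
    alerts.append(pos)  # stands in for alerting the runtime environment
```
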
Apparatuses and methods involving disabling address pointers

Patent number: 11010323

Abstract: An apparatus in various embodiments is for use in a local area network and includes a discernment logic circuit and logic circuitry. The discernment logic circuit discerns whether a requested communications transaction received over the management communications bus from another of the logic nodes involves a first type of transaction or a second type of transaction, the second type of transaction having a plurality of commands associated with the requested communications transaction to convey respectively different parts of the requested communications transaction including an address part and a data part. The logic circuitry disables, in response to a reset of an address pointer in the one of the plurality of logic nodes and the requested communications transaction being the second type of transaction, the address pointer to mitigate a likelihood that the requested communications transaction is performed via the communication protocol while the address pointer for the second type of transaction is erroneous.

Type: Grant

Filed: June 28, 2019

Date of Patent: May 18, 2021

Assignee: NXP B.V.

Inventor: Gerrit Willem den Besten
Techniques for user-centric document summarization

Patent number: 10984027

Abstract: Disclosed techniques can generate content object summaries. Content of a content object can be parsed into a set of word groups. For each word group, at least one topic to which the word group pertains can be identified and it can be determined, via a user model, at least one weight of the plurality of weights corresponding to the topic(s). For each word group, a score can be determined for the word group based on the weight(s). A subset of the set of word groups can be selected based on the scores for the word group. A summary of the content object can be generated that includes the subset but that does not include one or more other word groups in the set of word groups that are not in the subset. At least part of the summary of the content object can be output.

Type: Grant

Filed: November 11, 2016

Date of Patent: April 20, 2021

Assignee: SRI International

Inventors: Girish Acharya, John Niekrasz, John Byrnes, Chih-Hung Yeh
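
The scoring and selection steps in the abstract of patent 10984027 above can be sketched as follows. This is a minimal illustration, assuming that word groups are sentences, that the user model is a plain topic-to-weight dictionary, and that a group's score is the sum of its topic weights; `summarize`, `topics_of`, and the sample data are hypothetical.

```python
def summarize(word_groups, topics_of, user_weights, keep):
    """Pick the `keep` highest-scoring word groups, preserving document order."""
    scores = []
    for group in word_groups:
        # Score each group from the user-model weights of its topics.
        scores.append(sum(user_weights.get(t, 0.0) for t in topics_of(group)))
    ranked = sorted(range(len(word_groups)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:keep])
    return [word_groups[i] for i in chosen]


def topics_of(sentence):
    """Toy topic tagger based on keywords (an assumption for this sketch)."""
    lowered = sentence.lower()
    topics = []
    if "cache" in lowered or "pipeline" in lowered:
        topics.append("hardware")
    if "lunch" in lowered:
        topics.append("logistics")
    return topics


sentences = [
    "The cache is 32 KB.",
    "Lunch was served at noon.",
    "The pipeline has 12 stages.",
]
weights = {"hardware": 0.9, "logistics": 0.1}
summary = summarize(sentences, topics_of, weights, keep=2)
```
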
Inline image preprocessing for convolution operations using a matrix multiplier on an integrated circuit

Patent number: 10984500

Abstract: An example preprocessor circuit for formatting image data into a plurality of streams of image samples includes: a plurality of memory banks configured to store the image data; multiplexer circuitry coupled to the memory banks; a first plurality of registers coupled to the multiplexer circuitry; a second plurality of registers coupled to the first plurality of registers, outputs of the second plurality of registers configured to provide the plurality of streams of image samples; bank address and control circuitry coupled to control inputs of the plurality of memory banks, the multiplexer circuitry, and the first plurality of registers; output control circuitry coupled to control inputs of the second plurality of registers; and a control state machine coupled to the bank address and control circuitry and the output control circuitry.

Type: Grant

Filed: September 19, 2019

Date of Patent: April 20, 2021

Assignee: XILINX, INC.

Inventors: Ashish Sirasao, Elliott Delaye, Aaron Ng, Ehsan Ghasemi
Clock control to increase robustness of a serial bus interface

Patent number: 10956356

Abstract: A computer system for performing control of an electronic control unit (ECU) having a processor for executing computer-readable instructions and a memory for maintaining the computer-executable instructions, the computer-executable instructions when executed by the processor perform the following functions. The functions include configuring a communication controller while operating in a secure mode; transitioning to an unsecure mode; executing a program in the unsecure mode that utilizes the communication controller; and in response to detecting a clock off request while a transmit buffer of the communication controller is not empty, inhibiting the clock off request until the transmit buffer is empty.

Type: Grant

Filed: November 27, 2019

Date of Patent: March 23, 2021

Assignee: Robert Bosch GmbH

Inventors: Sekar Kulandaivel, Shalabh Jain, Jorge Guajardo Merchan
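
The clock-off rule in the abstract of patent 10956356 above can be modeled in a few lines of Python. This is a sketch only: the `CommController` class is hypothetical, and it assumes that an inhibited request is remembered and honored once the transmit buffer drains, which the abstract does not spell out.

```python
from collections import deque


class CommController:
    """Hypothetical model of a communication controller with a gated clock."""

    def __init__(self):
        self.tx_buffer = deque()
        self.clock_on = True
        self.clock_off_pending = False

    def queue_frame(self, frame):
        self.tx_buffer.append(frame)

    def request_clock_off(self):
        # Inhibit the request while frames are still waiting to be sent.
        if self.tx_buffer:
            self.clock_off_pending = True
            return False
        self.clock_on = False
        return True

    def transmit_one(self):
        frame = self.tx_buffer.popleft()
        # Once the buffer is empty, a deferred request can take effect.
        if not self.tx_buffer and self.clock_off_pending:
            self.clock_off_pending = False
            self.clock_on = False
        return frame


ctrl = CommController()
ctrl.queue_frame("frame-A")
granted = ctrl.request_clock_off()   # buffer not empty, so inhibited
still_clocked = ctrl.clock_on
sent = ctrl.transmit_one()           # buffer drains, clock may stop
```
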
Streaming engine with flexible streaming engine template supporting differing number of nested loops with corresponding loop counts and loop offsets

Patent number: 10891231

Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements for the nested loops. A stream head register stores data elements next to be supplied to functional units for use as operands. A stream template specifies loop count and loop dimension for each nested loop. A format definition field in the stream template specifies the number of loops and the stream template bits devoted to the loop counts and loop dimensions. This permits the same bits of the stream template to be interpreted differently enabling trade off between the number of loops supported and the size of the loop counts and loop dimensions.

Type: Grant

Filed: July 1, 2019

Date of Patent: January 12, 2021

Assignee: Texas Instruments Incorporated

Inventor: Joseph Zbiciak
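
The nested-loop address generation in the abstract of patent 10891231 above can be sketched in Python. The sketch assumes each loop is described by a (count, stride) pair, with the stride standing in for the abstract's loop dimension, and it does not model the bit-level format definition field; `stream_addresses` and the sample template are hypothetical.

```python
def stream_addresses(base, loops):
    """Yield element addresses for nested loops.

    `loops` lists (count, stride) pairs from innermost to outermost.
    """
    def walk(level, offset):
        if level < 0:
            yield offset
            return
        count, stride = loops[level]
        for i in range(count):
            # Each loop level adds its own multiple of the stride.
            yield from walk(level - 1, offset + i * stride)

    yield from walk(len(loops) - 1, base)


# Two loops: 4 consecutive elements, repeated for 2 rows that are 16 apart.
addrs = list(stream_addresses(0x100, [(4, 1), (2, 16)]))
```

Because the loop list can have any length, the same function covers templates with differing numbers of nested loops.
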
Systems, apparatuses, and methods for arithmetic recurrence

Patent number: 10860315

Abstract: Embodiments of systems, apparatuses, and methods for broadcast arithmetic in a processor are described.

Type: Grant

Filed: September 24, 2018

Date of Patent: December 8, 2020

Assignee: Intel Corporation

Inventors: Rama Kishan V. Malladi, Elmoustapha Ould-Ahmed-Vall
Systems, apparatuses, and methods for broadcast arithmetic operations

Patent number: 10846087

Abstract: Embodiments of systems, apparatuses, and methods for instruction execution are described. In some embodiments, an instruction has fields for a first and a second source operand, and a destination operand. When executed, the instruction causes an arithmetic operation on broadcasted packed data elements of the first source operand and storage of results of each arithmetic operation in the destination operand, wherein the packed data elements of the first source operand to be broadcast are dictated by values of packed data elements stored in a second source operand, wherein the arithmetic operation is defined by the instruction.

Type: Grant

Filed: December 30, 2016

Date of Patent: November 24, 2020

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Jesus Corbal, Robert Valentine
Adaptive clocking scheme

Patent number: 10804906

Abstract: Adaptive clocking schemes for synchronized on-chip functional blocks are provided. The clocking schemes enable synchronous clocking which can be adapted according to changes in signal path propagation delay due to temperature, process, and voltage variations, for example. In embodiments, the clocking schemes allow for the capacity utilization of a logic path to be increased.

Type: Grant

Filed: June 25, 2018

Date of Patent: October 13, 2020

Assignee: AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE. LIMITED

Inventors: Paul Penzes, Mark Fullerton
Method, device, and medium for restoring text using index which associates coded text and positions thereof in text data

Patent number: 10803243

Abstract: A non-transitory computer-readable recording medium stores therein a data generation program that causes a computer to execute a program including: arranging a first morpheme in an order of a position of the first morpheme in text data by referring to an index generated by the text data, in which positions of a plurality of morphemes included in the text data are associated with each of the morphemes; and referring to relationship information indicating a relationship between morphemes, and when the first morpheme is a specific type having a relationship with a second morpheme, arranging the second morpheme in an order of a position of the second morpheme in the text data by referring to the index.

Type: Grant

Filed: March 13, 2019

Date of Patent: October 13, 2020

Assignee: FUJITSU LIMITED

Inventors: Masahiro Kataoka, Takahiro Okubo, Ryo Matsumura
Apparatus and method for adaptable and efficient lane-wise tensor processing

Patent number: 10776110

Abstract: An apparatus and method for performing efficient, adaptable tensor operations.

Type: Grant

Filed: September 29, 2018

Date of Patent: September 15, 2020

Assignee: Intel Corporation

Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
Color variant environment mapping for augmented reality

Patent number: 10607567

Abstract: An environment map, such as a cube map, can be obtained for a scene that is appropriate for the current lighting state. A grayscale image representation is generated that represents physical objects visible in the scene. The grayscale representation is provided to a device for rendering AR content. A color lookup table (LUT) is generated for coloring the grayscale image representation. The color LUT can be appropriate for the current lighting conditions of the scene. As the lighting state changes, such as over the course of a day, different color LUTs can be sent to the device for purposes of updating the environment map. The grayscale image representation, once colored, can serve as an environment map for purposes of creating reflection effects on AR content to be rendered with respect to a live view of the scene.

Type: Grant

Filed: March 16, 2018

Date of Patent: March 31, 2020

Assignee: AMAZON TECHNOLOGIES, INC.

Inventors: Richard Schritter, Sidharth Moudgil, Pratik Patel
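
The grayscale-plus-LUT idea in the abstract of patent 10607567 above can be illustrated with a tiny Python sketch. It assumes 8-bit gray levels and a 256-entry table of RGB tuples; `make_lut`, `colorize`, and the tint values are hypothetical, and a real color LUT need not be a simple linear tint.

```python
def make_lut(tint):
    """Build a 256-entry LUT that scales each gray level by an RGB tint."""
    r, g, b = tint
    return [(level * r // 255, level * g // 255, level * b // 255)
            for level in range(256)]


def colorize(gray_image, lut):
    """Color a grayscale image (rows of 0..255 values) through the LUT."""
    return [[lut[level] for level in row] for row in gray_image]


gray = [[0, 128], [255, 64]]

# The grayscale map is sent once; only the small LUT changes with lighting.
noon = colorize(gray, make_lut((255, 255, 255)))
dusk = colorize(gray, make_lut((255, 160, 96)))
```
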
Apparatus and method for processing input operand values

Patent number: 10579338

Abstract: An apparatus and method are provided for processing input operand values. The apparatus has a set of vector data storage elements, each vector data storage element providing a plurality of sections for storing data values. A plurality of lanes are considered to be provided within the set of storage elements, where each lane comprises a corresponding section from each vector data storage element. Processing circuitry is arranged to perform an arithmetic operation on an input operand value comprising a plurality of portions, by performing an independent arithmetic operation on each of the plurality of portions, in order to produce a result value comprising a plurality of result portions. Storage circuitry is arranged to store the result value within a selected lane of the plurality of lanes, such that each result portion is stored in a different vector data storage element within the corresponding section for the selected lane.

Type: Grant

Filed: December 6, 2017

Date of Patent: March 3, 2020

Assignee: ARM Limited

Inventors: Christopher Neal Hinds, Neil Burgess, David Raymond Lutz
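
The lane layout described in the abstract of patent 10579338 above can be sketched in Python. The sketch assumes 16-bit portions, addition as the arithmetic operation, and carries between portions simply dropped; these choices and the helper names are assumptions for illustration, not details from the patent.

```python
PORTION_BITS = 16
MASK = (1 << PORTION_BITS) - 1


def split(value, portions):
    """Split an integer into `portions` chunks, least significant first."""
    return [(value >> (PORTION_BITS * i)) & MASK for i in range(portions)]


def add_portions_independently(a, b, portions):
    """Add portion by portion with no carry passed between portions."""
    return [(x + y) & MASK
            for x, y in zip(split(a, portions), split(b, portions))]


def store_in_lane(registers, lane, result_portions):
    """Store portion i in section `lane` of vector register i."""
    for reg, portion in zip(registers, result_portions):
        reg[lane] = portion


# Three vector registers with four sections (lanes) each.
registers = [[0] * 4 for _ in range(3)]
result = add_portions_independently(0x0001_FFFF_0003, 0x0001_0001_0001, 3)
store_in_lane(registers, lane=2, result_portions=result)
```

The whole multi-portion result occupies one lane, spread across the registers, so other lanes remain free for other values.
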
Inline image preprocessing for convolution operations using a matrix multiplier on an integrated circuit

Patent number: 10460416

Abstract: An example preprocessor circuit for formatting image data into a plurality of streams of image samples includes: a plurality of memory banks configured to store the image data; multiplexer circuitry coupled to the memory banks; a first plurality of registers coupled to the multiplexer circuitry; a second plurality of registers coupled to the first plurality of registers, outputs of the second plurality of registers configured to provide the plurality of streams of image samples; and control circuitry configured to generate addresses for the plurality of memory banks, control the multiplexer circuitry to select among outputs of the plurality of memory banks, control the first plurality of registers to store outputs of the second plurality of multiplexers, and control the second plurality of registers to store outputs of the first plurality of registers.

Type: Grant

Filed: October 17, 2017

Date of Patent: October 29, 2019

Assignee: XILINX, INC.

Inventors: Ashish Sirasao, Elliott Delaye, Aaron Ng, Ehsan Ghasemi
Vector load and duplicate operations

Patent number: 10423413

Abstract: A method of loading and duplicating scalar data from a source into a destination register. The data may be duplicated in byte, half word, word or double word parts, according to a duplication pattern.

Type: Grant

Filed: July 9, 2014

Date of Patent: September 24, 2019

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Timothy David Anderson, Duc Quang Bui, Peter Richard Dent
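
The load-and-duplicate operation in the abstract of patent 10423413 above can be sketched in Python. This is an illustration under assumptions: little-endian memory, a 16-byte destination register, and a pattern that simply replicates one element across the whole register; `load_and_duplicate` is a hypothetical helper.

```python
import struct

# struct format codes for the element sizes named in the abstract.
ELEMENT_FORMATS = {"byte": "B", "half": "H", "word": "I", "double": "Q"}


def load_and_duplicate(memory, address, element, register_bytes=16):
    """Load one scalar from `memory` and replicate it across a register."""
    fmt = "<" + ELEMENT_FORMATS[element]
    size = struct.calcsize(fmt)
    (value,) = struct.unpack_from(fmt, memory, address)
    return [value] * (register_bytes // size)


memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])
halves = load_and_duplicate(memory, 2, "half")
words = load_and_duplicate(memory, 0, "word")
```
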
Signal processing device and method of performing a pack-insert operation

Patent number: 10402198

Abstract: A signal processing device comprising at least one control unit arranged to receive at least one pack-insert instruction, decode the received at least one pack-insert instruction, and output at least one pack-insert control signal in accordance with the received pack-insert instruction. The signal processing device further comprising at least one pack-insert component arranged to receive at least a first data block to be inserted into a sequence of data blocks to be output to at least one destination register, receive a plurality of further data blocks to be packed within the sequence of data blocks to be output to the at least one destination register, arrange the at least first data block and the plurality of further data blocks into a sequence of data blocks based at least partly on the at least one pack-insert control signal, and output the sequence of data blocks.

Type: Grant

Filed: June 18, 2013

Date of Patent: September 3, 2019

Assignee: NXP USA, Inc.

Inventors: Avi Gal, Fabrice Aidan, Noam Eshel-Goldman, Roy Glasner, Dmitry Lachover, Itay Peled
Inference based condition code generation

Patent number: 10379860

Abstract: A condition code can depend upon a numerical output of a floating point operation for a processing pipeline. A classification can be determined for the floating point operation of a received instruction. In response to the classification and using condition determination logic, a value can be calculated for the condition code by inferring from data that is available from the processing pipeline before the numerical output is available. The value for the condition code can be provided to branch decision logic of the processing pipeline.

Type: Grant

Filed: May 3, 2017

Date of Patent: August 13, 2019

Assignee: International Business Machines Corporation

Inventors: Steven R. Carlough, Son T. Dao, Petra Leber, Silvia M. Mueller
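
The inference step in the abstract of patent 10379860 above can be illustrated with a simple Python sketch. It assumes a condition code with four values and a classification in which only sign-manipulating operations allow the code to be inferred from the input operand; the operation names and `infer_condition_code` are hypothetical.

```python
import math


def infer_condition_code(op, x):
    """Return 'nan', 'zero', 'neg', 'pos', or None when the result is needed."""
    if op not in ("copy", "negate", "abs"):
        return None  # e.g. add or multiply: wait for the numerical output
    if math.isnan(x):
        return "nan"
    if x == 0.0:
        return "zero"
    # For these operations the result's sign follows from the input's sign.
    negative = x < 0.0
    if op == "negate":
        negative = not negative
    elif op == "abs":
        negative = False
    return "neg" if negative else "pos"


early = infer_condition_code("negate", 2.5)
deferred = infer_condition_code("multiply", 2.5)
```

In this sketch the branch logic could consume `early` before the negated value exists, while `deferred` signals that the operation falls outside the inferable class.
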
Inference based condition code generation

Patent number: 10379859

Abstract: A condition code can depend upon a numerical output of a floating point operation for a processing pipeline. A classification can be determined for the floating point operation of a received instruction. In response to the classification and using condition determination logic, a value can be calculated for the condition code by inferring from data that is available from the processing pipeline before the numerical output is available. The value for the condition code can be provided to branch decision logic of the processing pipeline.

Type: Grant

Filed: May 3, 2017

Date of Patent: August 13, 2019

Assignee: International Business Machines Corporation

Inventors: Steven R. Carlough, Son T. Dao, Petra Leber, Silvia M. Mueller
Sequence alignment method of vector processor

Patent number: 10372451

Abstract: A sequence alignment method that may be performed by a vector processor may include loading a sequence that is an instance of vector data including a plurality of elements, dividing the sequence into two groups, aligning respective elements of the groups to generate a sequence of sorted elements according to a single instruction multiple data mode, and iteratively performing an alignment operation based on a determination that each group in the sequence of sorted elements includes more than one element of the plurality of elements. Each iteration may include dividing each group to form new groups and aligning respective elements of each pair of adjacent new groups to generate a new sequence of sorted elements. The new sequence of a current iteration of the alignment operation may be transmitted as a data output, based on a determination that each new group does not include more than one element.

Type: Grant

Filed: November 3, 2017

Date of Patent: August 6, 2019

Assignee: Samsung Electronics Co., Ltd.

Inventors: Hyun Pil Kim, Hyun Woo Sim, Seong Woo Ahn
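
The divide-and-align loop in the abstract of patent 10372451 above resembles a bitonic merge, and the following Python sketch uses that technique as a stand-in. This is an assumption, not the patent's stated algorithm: it requires a power-of-two length and a bitonic input (ascending then descending), and the inner loop plays the role of one SIMD compare-exchange across lanes.

```python
def compare_exchange_merge(seq):
    """Sort a bitonic sequence by repeatedly halving groups."""
    data = list(seq)
    n = len(data)
    group = n
    while group > 1:
        half = group // 2
        for start in range(0, n, group):
            # Align element i of the first half with element i of the second.
            for i in range(start, start + half):
                lo, hi = data[i], data[i + half]
                if lo > hi:
                    data[i], data[i + half] = hi, lo
        group = half  # every group is divided into two new groups
    return data


sorted_seq = compare_exchange_merge([1, 4, 6, 8, 7, 5, 3, 2])
```

The loop stops when each group holds a single element, which mirrors the termination condition in the abstract.
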
Heterogeneous logic gate simulation using SIMD instructions

Patent number: 10331830

Abstract: Techniques for logic gate simulation. Program instructions may be executable by a processor to select logic gates from a netlist that specifies a gate-level representation of a digital circuit. Each logic gate may be assigned to a corresponding element position of a single-instruction, multiple-data (SIMD) shuffle or population count instruction, and at least two logic gates may specify different logic functions. Simulation-executable instructions including the SIMD shuffle or population count instruction may be generated. When executed, the simulation-executable instructions simulate the functionality of the selected logic gates. More particularly, execution of the SIMD shuffle or population count instruction may concurrently simulate operation of at least two logic gates that specify different logic functions.

Type: Grant

Filed: June 13, 2016

Date of Patent: June 25, 2019

Assignee: Apple Inc.

Inventor: Alex S. Teiche
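
The shuffle-based simulation in the abstract of patent 10331830 above can be sketched in Python. This is an illustrative model under assumptions: a byte shuffle is treated as a per-lane table lookup, four two-input gate types share one 16-entry table, and each lane's index encodes the gate type and its input bits. The table layout and function names are hypothetical.

```python
TRUTH = {
    "AND":  [0, 0, 0, 1],
    "OR":   [0, 1, 1, 1],
    "XOR":  [0, 1, 1, 0],
    "NAND": [1, 1, 1, 0],
}
GATE_ORDER = ["AND", "OR", "XOR", "NAND"]

# Concatenate the truth tables into one 16-entry shuffle table.
TABLE = [bit for name in GATE_ORDER for bit in TRUTH[name]]


def shuffle(table, indices):
    """Model of a SIMD shuffle: each output lane picks table[index]."""
    return [table[i] for i in indices]


def simulate(gates, a_bits, b_bits):
    """Evaluate one gate per lane, with different gate types side by side."""
    indices = [GATE_ORDER.index(g) * 4 + a * 2 + b
               for g, a, b in zip(gates, a_bits, b_bits)]
    return shuffle(TABLE, indices)


outputs = simulate(["AND", "OR", "XOR", "NAND"], [1, 0, 1, 1], [1, 0, 1, 1])
```

A single `shuffle` call evaluates all four lanes at once even though each lane holds a different logic function.
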
