Arithmetic Operation Instruction Processing Patents (Class 712/221)
  • Patent number: 11899967
    Abstract: Aspects of the present disclosure provide an aligned storage strategy for stripes within a long vector for a vector processor, such that the extra computation needed to track strides between input stripes and output stripes may be eliminated. As a result, the stripe locations are located in a more predictable memory access pattern such that memory access bandwidth may be improved and the tendency for memory error may be reduced.
    Type: Grant
    Filed: November 15, 2021
    Date of Patent: February 13, 2024
    Assignee: Lightmatter, Inc.
    Inventors: Nicholas Moore, Gongyu Wang, Bradley Dobbie, Tyler J. Kenney, Ayon Basumallik
  • Patent number: 11886875
    Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction, the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.
    Type: Grant
    Filed: December 26, 2018
    Date of Patent: January 30, 2024
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Jonathan D. Pearce, Dan Baum, Guei-Yuan Lueh, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
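    Illustration: the nibble-partitioned operation described in the abstract above can be sketched in software. This is a minimal sketch, not the patented circuit. It assumes 8-bit elements (two nibbles each) and addition as the per-partition operation; the names nibble_add and matrix_nibble_op are hypothetical.

```python
def nibble_add(a: int, b: int) -> int:
    """Add two 8-bit values nibble by nibble; carries never cross a nibble boundary."""
    low = ((a & 0x0F) + (b & 0x0F)) & 0x0F
    high = (((a >> 4) & 0x0F) + ((b >> 4) & 0x0F)) & 0x0F
    return (high << 4) | low


def matrix_nibble_op(src1, src2, op=nibble_add):
    """Apply `op` to each pair of corresponding elements of two equal-shaped matrices."""
    return [[op(x, y) for x, y in zip(row1, row2)] for row1, row2 in zip(src1, src2)]


if __name__ == "__main__":
    # 0x1F + 0x01: the low nibble wraps to 0 and does not carry into the high nibble.
    print(hex(nibble_add(0x1F, 0x01)))
```

    An ordinary byte addition of 0x1F and 0x01 would give 0x20; keeping each nibble independent gives 0x10 instead.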
  • Patent number: 11789734
    Abstract: A computing system includes a processing unit and a memory storing instructions that, when executed by the processor, cause the processor to receive program source code in a compiler, identify in the program source code a set of operations for vectorizing, where each operation in the set of operations specifies a set of one or more operands, in response to identifying the set of operations, vectorize the set of operations by, based on the number of operations in the set of operations and a total number of lanes in a first vector register, generating a mask indicating a first unmasked lane and a first masked lane in the first vector register, based on the mask, generating a set of one or more instructions for loading into the first unmasked lane a first operand of a first operation of the set of operations, and loading the first operand into the first masked lane.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: October 17, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Anupama Rajesh Rasale
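    Illustration: the mask generation described in the abstract above can be sketched as follows. This is a rough sketch under stated assumptions, not the compiler's actual algorithm. It assumes the operations fill the lowest lanes first and that every masked lane is padded with the first operand; lane_mask and pack_operands are hypothetical names.

```python
def lane_mask(num_ops: int, num_lanes: int) -> list:
    """Return one flag per lane: True for an unmasked (active) lane, False for a masked lane."""
    if not 0 < num_ops <= num_lanes:
        raise ValueError("num_ops must be between 1 and num_lanes")
    return [lane < num_ops for lane in range(num_lanes)]


def pack_operands(operands: list, num_lanes: int):
    """Load each operand into an unmasked lane and pad masked lanes with the first operand."""
    mask = lane_mask(len(operands), num_lanes)
    lanes = [operands[i] if active else operands[0] for i, active in enumerate(mask)]
    return lanes, mask


if __name__ == "__main__":
    # Three operations packed into a four-lane register leave one masked lane.
    print(pack_operands([5, 6, 7], 4))
```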
  • Patent number: 11789646
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: October 17, 2023
    Assignee: INTEL CORPORATION
    Inventors: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
  • Patent number: 11768660
    Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).
    Type: Grant
    Filed: January 26, 2023
    Date of Patent: September 26, 2023
    Assignee: SINGULAR COMPUTING LLC
    Inventor: Joseph Bates
  • Patent number: 11768659
    Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: September 26, 2023
    Assignee: SINGULAR COMPUTING LLC
    Inventor: Joseph Bates
  • Patent number: 11705923
    Abstract: Disclosed are a method and apparatus for storing data. The method includes: acquiring data to be stored; converting the data to be stored from an initial data type to a target data type, a data length corresponding to the target data type being less than that corresponding to the initial data type; and storing the data to be stored of the target data type to a database. In the method according to the present disclosure, a storage space occupied by the data to be stored in the database is greatly reduced. In addition, the method according to the present disclosure is performed prior to lossy or lossless data compression storage of the data to be stored in the related art. That is, on the basis of a compression ratio when the data to be stored is stored in the related art, the present disclosure further improves a compression effect of the data to be stored by reducing the data length when the data to be stored is stored, and further saves storage resources of the database.
    Type: Grant
    Filed: November 20, 2020
    Date of Patent: July 18, 2023
    Assignees: ENVISION DIGITAL INTERNATIONAL PTE. LTD., SHANGHAI ENVISION DIGITAL CO., LTD.
    Inventors: Li Lei, Hong Zhao, Xiaomeng Chen, Degang Ning
  • Patent number: 11669326
    Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a quadword data elements of a matrix pair. Additionally, in some instances, non-accumulating quadword data elements of the matrix pair are set to zero.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: June 6, 2023
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
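    Illustration: a dot product of signed words accumulated into a quadword, as described in the abstract above, might be modeled as below. This is only a sketch. It assumes 16-bit signed words, a 64-bit signed accumulator with wraparound, and four word pairs per quadword; the function names are hypothetical and the instruction's exact semantics are not taken from the patent.

```python
def wrap_signed(value: int, bits: int = 64) -> int:
    """Wrap an arbitrary Python integer into a two's-complement signed value of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def dot_words_accumulate(acc: int, a_words: list, b_words: list) -> int:
    """Multiply corresponding signed 16-bit words and add the products to a 64-bit accumulator."""
    for word in list(a_words) + list(b_words):
        if not -32768 <= word <= 32767:
            raise ValueError("operands must fit in a signed 16-bit word")
    return wrap_signed(acc + sum(x * y for x, y in zip(a_words, b_words)))


if __name__ == "__main__":
    # 10 + (1*5 + (-2)*6 + 3*7 + (-4)*8) = 10 + (-18) = -8
    print(dot_words_accumulate(10, [1, -2, 3, -4], [5, 6, 7, 8]))
```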
  • Patent number: 11635956
    Abstract: A fully pipelined convertToBinaryFromDecimalCharacter hardware operator logic circuit configured to convert one or more human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point representations every clock cycle. The circuit converts decimal character sequence floating-point representations up to 28 decimal digits in length to IEEE 754 binary64, binary32, or binary16 floating-point format representations.
    Type: Grant
    Filed: December 18, 2021
    Date of Patent: April 25, 2023
    Inventor: Jerry D. Harthcock
  • Patent number: 11620229
    Abstract: Described is a data cache with prediction hints for a cache hit. The data cache includes a plurality of cache lines, where a cache line includes a data field, a tag field, and a prediction hint field. The prediction hint field is configured to store a prediction hint which directs alternate behavior for a cache hit against the cache line. The prediction hint field is integrated with the tag field or is integrated with a way predictor field.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: April 4, 2023
    Assignee: SiFive, Inc.
    Inventors: John Ingalls, Josh Smith
  • Patent number: 11614920
    Abstract: A device (e.g., integrated circuit chip) includes a first operand register, a second operand register, a multiplication unit, and a hardware logic component. The first operand register is configured to store a first operand value. The second operand register is configured to store a second operand value. The multiplication unit is configured to at least multiply the first operand value with the second operand value. The hardware logic component is configured to detect whether a zero value is provided and in response to a detection that the zero value is being provided: cause an update of at least the first operand register to be disabled, and cause a result of a multiplication of the first operand value with the second operand value to be a zero-value result.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: March 28, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Thomas Mark Ulrich, Abdulkadir Utku Diril, Zhao Wang
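    Illustration: the zero-detection behavior described in the abstract above can be modeled in software. This is a behavioral sketch, not the hardware design. It assumes that a zero on either input disables the update of both operand registers (the abstract requires at least the first); the class name ZeroGatedMultiplier and its register_updates counter are hypothetical.

```python
class ZeroGatedMultiplier:
    """Behavioral model of a multiplier that skips operand-register updates on zero inputs."""

    def __init__(self):
        self.reg_a = 0
        self.reg_b = 0
        self.register_updates = 0  # counts how often the operand registers were written

    def step(self, a: int, b: int) -> int:
        if a == 0 or b == 0:
            # Zero detected: registers keep their previous values and the result is forced to 0.
            return 0
        self.reg_a, self.reg_b = a, b
        self.register_updates += 1
        return self.reg_a * self.reg_b


if __name__ == "__main__":
    m = ZeroGatedMultiplier()
    print(m.step(3, 4), m.step(0, 9), m.register_updates)
```

    Leaving the registers untouched on a zero input avoids toggling them, which is the kind of activity a hardware design would want to save.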
  • Patent number: 11579883
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of a M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: February 14, 2023
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Bret Toll, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
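    Illustration: a horizontal tile operation of the kind described in the abstract above can be sketched as a per-group reduction. This sketch assumes each of the K groups is one row of the source matrix (so K equals M) and that the operation is a binary reduction such as addition; horizontal_tile_op is a hypothetical name.

```python
from functools import reduce
import operator


def horizontal_tile_op(matrix, op=operator.add):
    """Reduce every row (group) of the matrix with `op`, giving one result per group."""
    width = len(matrix[0])
    if any(len(row) != width for row in matrix):
        raise ValueError("every group must contain the same number of elements")
    return [reduce(op, row) for row in matrix]


if __name__ == "__main__":
    tile = [[1, 2, 3], [4, 5, 6]]
    print(horizontal_tile_op(tile))        # row sums
    print(horizontal_tile_op(tile, max))   # row maxima
```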
  • Patent number: 11567763
    Abstract: A data processing apparatus, a method of operating a data processing apparatus, a non-transitory computer readable storage medium, and an instruction are provided. The instruction specifies a first source register and a second source register. In response to the instruction, control signals are generated, causing processing circuitry to perform a dot product operation. For this operation, at least a first data element and a second data element are extracted from each of the first source register and the second source register, such that at least first data element pairs and second data element pairs are multiplied together. The dot product operation is performed independently in each of multiple intra-register lanes across each of the first source register and the second source register. A widening operation with a large density of operations per instruction is thus provided.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: January 31, 2023
    Assignee: Arm Limited
    Inventor: David Hennah Mansell
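    Illustration: the per-lane dot product described in the abstract above can be sketched as follows. This is a minimal sketch that assumes four narrow elements per lane and an accumulator element per lane; Python integers stand in for the wider result type, and lane_dot_product is a hypothetical name.

```python
def lane_dot_product(src1, src2, acc, elems_per_lane=4):
    """Compute an independent dot product in each lane and add it to that lane's accumulator."""
    if len(src1) != len(src2) or len(src1) != len(acc) * elems_per_lane:
        raise ValueError("register shapes do not match the lane layout")
    result = []
    for lane, base in enumerate(acc):
        start = lane * elems_per_lane
        pairs = zip(src1[start:start + elems_per_lane], src2[start:start + elems_per_lane])
        result.append(base + sum(x * y for x, y in pairs))
    return result


if __name__ == "__main__":
    # Two lanes of four elements each: lane 0 sums 1..4, lane 1 sums 5..8 plus 10.
    print(lane_dot_product([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8, [0, 10]))
```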
  • Patent number: 11567770
    Abstract: A human-machine-interface system comprising: register-file-memory, configured to store input-data; a first-processing-element-slice, a second-processing-element-slice, and a controller. Each of the processing-slices comprises: a register configured to store register-data; and a processing-element configured to apply an arithmetic and logic operation on the register-data in order to provide convolution-output-data. The controller is configured to: load input-data from the register-file-memory into the first-register as the first-register-data; and load: (i) input-data from the register-file-memory, or (ii) the first-register-data from the first-register, into the second-register as the second-register-data.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: January 31, 2023
    Assignee: NXP B.V.
    Inventors: Jose de Jesus Pineda de Gyvez, Hamed Fatemi, Gonzalo Moro Pérez, Hendrik Corporaal
  • Patent number: 11507531
    Abstract: Examples described herein include systems and methods which include an apparatus comprising a plurality of configurable logic units and a plurality of switches, with each switch being coupled to at least one configurable logic unit of the plurality of configurable logic units. The apparatus further includes an instruction register configured to provide respective switch instructions of a plurality of switch instructions to each switch based on a computation to be implemented among the plurality of configurable logic units. For example, the switch instructions may include allocating the plurality of configurable logic units to perform the computation and activating an input of the switch and an output of the switch to couple at least a first configurable logic unit and a second configurable logic unit. In various embodiments, configurable logic units can include arithmetic logic units (ALUs), bit manipulation units (BMUs), and multiplier-accumulator units (MACs).
    Type: Grant
    Filed: February 25, 2021
    Date of Patent: November 22, 2022
    Assignee: MICRON TECHNOLOGY, INC.
    Inventors: Fa-Long Luo, Tamara Schmitz, Jeremy Chritz, Jaime Cummins
  • Patent number: 11468541
    Abstract: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple or mixed precisions and dynamic ranges.
    Type: Grant
    Filed: April 14, 2022
    Date of Patent: October 11, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Patent number: 11416261
    Abstract: Methods, systems and apparatuses for graph streaming processing are disclosed. One method includes loading, by a group load register, a subset of an input tensor from a data cache, wherein the group load register provides the subset of the input tensor to all of a plurality of processors, loading, by a plurality of weight data registers, a plurality of weights of a weight tensor, wherein each of the weight data registers provides a weight to a single one of the plurality of processors, and performing, by the plurality of processors, a SOMAC (Sum-Of-Multiply-Accumulate) instruction, including simultaneously determining, by each of the plurality of processors, an instruction size of the SOMAC instruction, wherein the instruction size indicates a number of iterations that the SOMAC instruction is to be executed and is equal to a number of outputs within a subset of a plurality of output tensors.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: August 16, 2022
    Assignee: Blaize, Inc.
    Inventors: Satyaki Koneru, Kamaraj Thangam, Sruthikesh Surineni
  • Patent number: 11294671
    Abstract: Disclosed embodiments relate to systems and methods for performing duplicate detection instructions on two-dimensional (2D) data. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode and locations of a source matrix comprising M×N elements and a destination, the opcode to indicate execution circuitry is to use a plurality of comparators to discover duplicates in the source matrix, and store indications of locations of discovered duplicates in the destination. The execution circuitry is to execute the decoded instruction as per the opcode.
    Type: Grant
    Filed: December 26, 2018
    Date of Patent: April 5, 2022
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Michael Espig, Dan Baum, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 11288066
    Abstract: Techniques for performing matrix multiplication in a data processing apparatus are disclosed, comprising apparatuses, matrix multiply instructions, methods of operating the apparatuses, and virtual machine implementations. Registers, each register for storing at least four data elements, are referenced by a matrix multiply instruction and in response to the matrix multiply instruction a matrix multiply operation is carried out. First and second matrices of data elements are extracted from first and second source registers, and plural dot product operations, acting on respective rows of the first matrix and respective columns of the second matrix are performed to generate a square matrix of result data elements, which is applied to a destination register. A higher computation density for a given number of register operands is achieved with respect to vector-by-element techniques.
    Type: Grant
    Filed: June 8, 2018
    Date of Patent: March 29, 2022
    Assignee: Arm Limited
    Inventors: David Hennah Mansell, Rune Holm, Ian Michael Caulfield, Jelena Milanovic
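    Illustration: the register-based matrix multiply described in the abstract above can be sketched for the smallest case. This sketch assumes each source register holds exactly four elements interpreted as a 2x2 matrix in row-major order, and that the result overwrites rather than accumulates into the destination; matmul_2x2_registers is a hypothetical name.

```python
def matmul_2x2_registers(reg_a, reg_b):
    """Treat two 4-element registers as row-major 2x2 matrices and return their product."""
    if len(reg_a) != 4 or len(reg_b) != 4:
        raise ValueError("each register must hold exactly four elements")
    result = []
    for row in range(2):
        for col in range(2):
            # Dot product of one row of the first matrix with one column of the second.
            result.append(sum(reg_a[row * 2 + k] * reg_b[k * 2 + col] for k in range(2)))
    return result


if __name__ == "__main__":
    print(matmul_2x2_registers([1, 2, 3, 4], [5, 6, 7, 8]))
```

    Two four-element source registers yield four dot products in one step, which is the higher computation density per register operand that the abstract contrasts with vector-by-element techniques.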
  • Patent number: 11227071
    Abstract: A method and an apparatus for hardware security to countermeasure side-channel attacks are provided. The method or apparatus may introduce at least one redundant or partial redundant computation having a similar power dissipation profile or an electromagnetic emission profile when compared to that of a genuine operation for cryptographic devices, and/or to reorder the iterations of operations in a different sequence. The redundant or partial redundant computation may be performed by using a different password key and/or a different raw data (e.g., plaintext). The presence of the redundant or partial redundant computation would make side-channel attacks difficult in the sense that genuine or redundant/partial redundant operations are difficult to be clearly identified, hence serving as a countermeasure for hardware security.
    Type: Grant
    Filed: March 19, 2018
    Date of Patent: January 18, 2022
    Assignee: Nanyang Technological University
    Inventors: Kwen Siong Chong, Bah Hwee Gwee, Ali Akbar Pammu
  • Patent number: 11221982
    Abstract: A multilayer butterfly network is shown that is operable to transform and align a plurality of fields from an input to an output data stream. Many transformations are possible with such a network which may include separate control of each multiplexer. This invention supports a limited set of multiplexer control signals, which enables a similarly limited set of data transformations. This limited capability is offset by the reduced complexity of the multiplexer control circuits. This invention uses precalculated inputs and simple combinatorial logic to generate control signals for the butterfly network. Controls are independent for each layer and therefore are dependent only on the input and output patterns. Controls for the layers can be calculated in parallel.
    Type: Grant
    Filed: August 20, 2019
    Date of Patent: January 11, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Dheera Balasubramanian, Joseph Zbiciak, Sureshkumar Govindaraj
  • Patent number: 11175891
    Abstract: Disclosed embodiments relate to performing floating-point addition with selected rounding. In one example, a processor includes circuitry to decode and execute an instruction specifying locations of first and second floating-point (FP) sources, and an opcode indicating the processor is to: bring the FP sources into alignment by shifting a mantissa of the smaller source FP operand to the right by a difference between their exponents, generating rounding controls based on any bits that escape; simultaneously generate a sum of the FP sources and of the FP sources plus one, the sums having a fuzzy-Jbit format having an additional Jbit into which a carry-out, if any, select one of the sums based on the rounding controls, and generate a result comprising a mantissa-wide number of most-significant bits of the selected sum, starting with the most significant non-zero Jbit.
    Type: Grant
    Filed: March 30, 2019
    Date of Patent: November 16, 2021
    Assignee: Intel Corporation
    Inventors: Simon Rubanovich, Amit Gradstein, Zeev Sperber, Mrinmay Dutta
  • Patent number: 11175926
    Abstract: Providing exception stack management using stack panic fault exceptions in processor-based devices is disclosed. In this regard, a processor device defines a “stack panic fault exception” that may be raised upon execution of an exception handler store operation attempting to write state data into an exception stack, and provides a dedicated plurality of stack panic fault exception state registers in which stack panic fault exception state data may be saved. Upon detecting a first exception, the processor device transfers program control to an exception handler for the first exception. If a second exception occurs upon execution of a store operation in the exception handler, the processor device determines that the second exception should be handled as a stack panic fault exception, saves the stack panic fault exception state data in the stack panic fault exception state registers, and transfers program control to a stack panic fault exception handler.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: November 16, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Thomas Andrew Sartorius, Michael Scott McIlvaine, James Norris Dieffenderfer, Aaron S. Giles
  • Patent number: 11138686
    Abstract: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple precisions.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: October 5, 2021
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Patent number: 11126549
    Abstract: In an example, a method includes identifying, using at least one processor, data portions of a plurality of distinct data objects stored in at least one memory which are to be processed using the same logical operation. The method may further include identifying a representation of an operand stored in at least one memory, the operand being to provide the logical operation and providing a logical engine with the operand. The data portions may be stored in a plurality of input data buffers, wherein each of the input data buffers comprises a data portion of a different data object. The logical operation may be carried out on each of the data portions using the logical engine, and the outputs for each data portion may be stored in a plurality of output data buffers, wherein each of the outputs comprises data derived from a different data object.
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: September 21, 2021
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Naveen Muralimanohar, Ali Shafiee Ardestani
  • Patent number: 11119818
    Abstract: Contextual awareness associated with resources can be employed to facilitate controlling access to resources of a system, including function blocks. A resource manager component (RMC) can pre-load a defined number of respective versions of configuration parameter data associated with respective applications in each resource. With regard to each application, the RMC can associate a context value, unique for each application, with the respective versions of configuration parameter data associated with that application. When a current application is being changed to a next application, the RMC can write the context value associated with the next application to a context select component (CSC). Each resource can read the context value in the CSC, identify and retrieve the version of configuration parameter data associated with the next application based on the context value, and configure the function block based on the version of configuration parameter data.
    Type: Grant
    Filed: December 31, 2019
    Date of Patent: September 14, 2021
    Assignee: GE Aviation Systems, LLC
    Inventors: Melanie Sue-Hanson Graffy, Colin Holmwood, Jon Marc Diekema
  • Patent number: 11099868
    Abstract: A system and method are provided for translating a guest instruction of a guest architecture into at least one host instruction of a host architecture. The method comprises providing multiple representation states, each representation state providing a representation in the host architecture for at least one item of state from the guest architecture. A current representation state is then determined from amongst the multiple representation states, and the guest instruction is translated into at least one host instruction in dependence on the current representation state. Through the use of multiple representation states, it has been found that the efficiency of the code translation can be significantly increased, thereby giving rise to performance and energy consumption benefits.
    Type: Grant
    Filed: March 4, 2016
    Date of Patent: August 24, 2021
    Assignee: ARM LIMITED
    Inventor: Edmund Thomas Grimley-Evans
  • Patent number: 11093277
    Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: August 17, 2021
    Assignee: Intel Corporation
    Inventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
  • Patent number: 11010276
    Abstract: A method, computer program product, and system performing a method that includes a processor defining a code fingerprint by obtaining parameters describing at least one of an event type or an event. The code fingerprint includes a first sequence. The processor loads the code fingerprint into a register accessible to the processor. Concurrent with executing a program, the processor obtains the code fingerprint from the register and identifies the code fingerprint in the program by comparing a second sequence in the program to the first sequence. Based on identifying the code fingerprint in the program, the processor alerts a runtime environment where the program is executing.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: May 18, 2021
    Assignee: International Business Machines Corporation
    Inventors: Giles R. Frazier, Michael K. Gschwind, Christian Jacobi, Chung-Lung K. Shum
  • Patent number: 11010323
    Abstract: An apparatus in various embodiments is for use in a local area network and includes a discernment logic circuit and logic circuitry. The discernment logic circuit discerns whether a requested communications transaction received over the management communications bus from another of the logic nodes involves a first type of transaction or a second type of transaction, the second type of transaction having a plurality of commands associated with the requested communications transaction to convey respectively different parts of the requested communications transaction including an address part and a data part. The logic circuitry disables, in response to a reset of an address pointer in the one of the plurality of logic nodes and the requested communications transaction being the second type of transaction, the address pointer to mitigate a likelihood that the requested communications transaction is performed via the communication protocol while the address pointer for the second type of transaction is erroneous.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: May 18, 2021
    Assignee: NXP B.V.
    Inventor: Gerrit Willem den Besten
  • Patent number: 10984500
    Abstract: An example preprocessor circuit for formatting image data into a plurality of streams of image samples includes: a plurality of memory banks configured to store the image data; multiplexer circuitry coupled to the memory banks; a first plurality of registers coupled to the multiplexer circuitry; a second plurality of registers coupled to the first plurality of registers, outputs of the second plurality of registers configured to provide the plurality of streams of image samples; bank address and control circuitry coupled to control inputs of the plurality of memory banks, the multiplexer circuitry, and the first plurality of registers; output control circuitry coupled to control inputs of the second plurality of registers; and a control state machine coupled to the bank address and control circuitry and the output control circuitry.
    Type: Grant
    Filed: September 19, 2019
    Date of Patent: April 20, 2021
    Assignee: XILINX, INC.
    Inventors: Ashish Sirasao, Elliott Delaye, Aaron Ng, Ehsan Ghasemi
  • Patent number: 10984027
    Abstract: Disclosed techniques can generate content object summaries. Content of a content object can be parsed into a set of word groups. For each word group, at least one topic to which the word group pertains can be identified and it can be determined, via a user model, at least one weight of the plurality of weights corresponding to the topic(s). For each word group, a score can be determined for the word group based on the weight(s). A subset of the set of word groups can be selected based on the scores for the word group. A summary of the content object can be generated that includes the subset but that does not include one or more other word groups in the set of word groups that are not in the subset. At least part of the summary of the content object can be output.
    Type: Grant
    Filed: November 11, 2016
    Date of Patent: April 20, 2021
    Assignee: SRI International
    Inventors: Girish Acharya, John Niekrasz, John Byrnes, Chih-Hung Yeh
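The scoring-and-selection flow in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the keyword-based topic lookup, and the weight values are all hypothetical stand-ins for the topic identification and user model the abstract mentions.

```python
def summarize(word_groups, topics_of, user_weights, keep):
    """Return the `keep` highest-scoring word groups in their original order."""
    scored = []
    for position, group in enumerate(word_groups):
        # Score = sum of the user-model weights of the topics the group pertains to.
        score = sum(user_weights.get(topic, 0.0) for topic in topics_of(group))
        scored.append((score, position, group))
    # Select the best-scoring subset, then restore document order.
    best = sorted(scored, key=lambda item: item[0], reverse=True)[:keep]
    return [group for _, _, group in sorted(best, key=lambda item: item[1])]


# Hypothetical topic identification: a plain keyword lookup.
KEYWORDS = {"budget": "finance", "latency": "performance", "lunch": "social"}


def keyword_topics(group):
    return {topic for word, topic in KEYWORDS.items() if word in group.lower()}
```

Word groups outside the selected subset are simply left out of the returned summary, matching the abstract's description of a summary that omits the unselected groups.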
  • Patent number: 10956356
    Abstract: A computer system for performing control of an electronic control unit (ECU) has a processor for executing computer-executable instructions and a memory for maintaining those instructions. When executed by the processor, the instructions perform functions that include configuring a communication controller while operating in a secure mode; transitioning to an unsecure mode; executing a program in the unsecure mode that utilizes the communication controller; and, in response to detecting a clock off request while a transmit buffer of the communication controller is not empty, inhibiting the clock off request until the transmit buffer is empty.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: March 23, 2021
    Assignee: Robert Bosch GmbH
    Inventors: Sekar Kulandaivel, Shalabh Jain, Jorge Guajardo Merchan
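The clock-off inhibition described above can be modeled in a few lines. The class and method names below are hypothetical, and the choice to honor a deferred request as soon as the buffer drains is an assumption about what "inhibiting ... until the transmit buffer is empty" implies.

```python
from collections import deque


class CommController:
    """Toy model of a communication controller with a transmit buffer."""

    def __init__(self):
        self.tx_buffer = deque()
        self.clock_on = True
        self.clock_off_pending = False

    def queue_frame(self, frame):
        self.tx_buffer.append(frame)

    def request_clock_off(self):
        """Return True if the clock was stopped, False if the request was inhibited."""
        if self.tx_buffer:
            # Frames are still waiting: hold the request instead of stopping the clock.
            self.clock_off_pending = True
            return False
        self.clock_on = False
        return True

    def transmit_one(self):
        """Send one queued frame; apply a held clock-off request once the buffer is empty."""
        frame = self.tx_buffer.popleft() if self.tx_buffer else None
        if not self.tx_buffer and self.clock_off_pending:
            self.clock_on = False
            self.clock_off_pending = False
        return frame
```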
  • Patent number: 10891231
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements for the nested loops. A stream head register stores data elements next to be supplied to functional units for use as operands. A stream template specifies loop count and loop dimension for each nested loop. A format definition field in the stream template specifies the number of loops and the stream template bits devoted to the loop counts and loop dimensions. This permits the same bits of the stream template to be interpreted differently enabling trade off between the number of loops supported and the size of the loop counts and loop dimensions.
    Type: Grant
    Filed: July 1, 2019
    Date of Patent: January 12, 2021
    Assignee: Texas Instruments Incorporated
    Inventor: Joseph Zbiciak
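Address generation from per-loop counts and dimensions, as described above, can be sketched as a nested-loop walk. This is an illustrative model only: it treats each loop dimension as an address stride, uses a hypothetical `(count, stride)` template layout, and does not model the format definition field or the stream head register.

```python
from itertools import product


def stream_addresses(base, loops):
    """Yield element addresses for nested loops.

    `loops` is a list of (count, stride) pairs, innermost loop first.
    """
    # itertools.product varies its last argument fastest, so pass the
    # outermost loop first and the innermost loop last.
    counts = [range(count) for count, _ in reversed(loops)]
    strides = [stride for _, stride in reversed(loops)]
    for indices in product(*counts):
        yield base + sum(i * s for i, s in zip(indices, strides))
```

For example, `list(stream_addresses(100, [(3, 1), (2, 10)]))` walks three consecutive elements, then jumps by 10 and walks three more: `[100, 101, 102, 110, 111, 112]`.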
  • Patent number: 10860315
    Abstract: Embodiments of systems, apparatuses, and methods for broadcast arithmetic in a processor are described.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: December 8, 2020
    Assignee: Intel Corporation
    Inventors: Rama Kishan V. Malladi, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 10846087
    Abstract: Embodiments of systems, apparatuses, and methods for instruction execution are described. In some embodiments, an instruction has fields for a first and a second source operand, and a destination operand. When executed, the instruction causes an arithmetic operation on broadcasted packed data elements of the first source operand and storage of results of each arithmetic operation in the destination operand, wherein the packed data elements of the first source operand to be broadcast are dictated by values of packed data elements stored in a second source operand, wherein the arithmetic operation is defined by the instruction.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: November 24, 2020
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Jesus Corbal, Robert Valentine
  • Patent number: 10804906
    Abstract: Adaptive clocking schemes for synchronized on-chip functional blocks are provided. The clocking schemes enable synchronous clocking which can be adapted according to changes in signal path propagation delay due to temperature, process, and voltage variations, for example. In embodiments, the clocking schemes allow for the capacity utilization of a logic path to be increased.
    Type: Grant
    Filed: June 25, 2018
    Date of Patent: October 13, 2020
    Assignee: AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE. LIMITED
    Inventors: Paul Penzes, Mark Fullerton
  • Patent number: 10803243
    Abstract: A non-transitory computer-readable recording medium stores therein a data generation program that causes a computer to execute a program including: arranging a first morpheme in an order of a position of the first morpheme in text data by referring to an index generated by the text data, in which positions of a plurality of morphemes included in the text data are associated with each of the morphemes; and referring to relationship information indicating a relationship between morphemes, and when the first morpheme is a specific type having a relationship with a second morpheme, arranging the second morpheme in an order of a position of the second morpheme in the text data by referring to the index.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: October 13, 2020
    Assignee: FUJITSU LIMITED
    Inventors: Masahiro Kataoka, Takahiro Okubo, Ryo Matsumura
  • Patent number: 10776110
    Abstract: An apparatus and method for performing efficient, adaptable tensor operations.
    Type: Grant
    Filed: September 29, 2018
    Date of Patent: September 15, 2020
    Assignee: Intel Corporation
    Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
  • Patent number: 10607567
    Abstract: An environment map, such as a cube map, can be obtained for a scene that is appropriate for the current lighting state. A grayscale image representation is generated that represents physical objects visible in the scene. The grayscale representation is provided to a device for rendering AR content. A color lookup table (LUT) is generated for coloring the grayscale image representation. The color LUT can be appropriate for the current lighting conditions of the scene. As the lighting state changes, such as over the course of a day, different color LUTs can be sent to the device for purposes of updating the environment map. The grayscale image representation, once colored, can serve as an environment map for purposes of creating reflection effects on AR content to be rendered with respect to a live view of the scene.
    Type: Grant
    Filed: March 16, 2018
    Date of Patent: March 31, 2020
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Richard Schritter, Sidharth Moudgil, Pratik Patel
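Coloring a grayscale representation with a lighting-specific lookup table, as described above, amounts to a per-pixel table lookup. The sketch below is illustrative only: the tiny image, the three gray levels, and the two LUTs are made-up values, and real environment maps would use full-size images and 256-entry tables.

```python
def colorize(gray_image, color_lut):
    """Map every gray level in a 2D image to an RGB tuple via a lookup table."""
    return [[color_lut[level] for level in row] for row in gray_image]


# Hypothetical data: one grayscale image, two LUTs for different lighting states.
GRAY = [[0, 1],
        [2, 1]]
NOON_LUT = {0: (20, 20, 20), 1: (128, 128, 120), 2: (255, 250, 230)}
DUSK_LUT = {0: (10, 5, 15), 1: (90, 60, 80), 2: (240, 150, 90)}
```

Because only the LUT changes with the lighting state, the same grayscale image can be recolored for noon or dusk without being resent.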
  • Patent number: 10579338
    Abstract: An apparatus and method are provided for processing input operand values. The apparatus has a set of vector data storage elements, each vector data storage element providing a plurality of sections for storing data values. A plurality of lanes are considered to be provided within the set of storage elements, where each lane comprises a corresponding section from each vector data storage element. Processing circuitry is arranged to perform an arithmetic operation on an input operand value comprising a plurality of portions, by performing an independent arithmetic operation on each of the plurality of portions, in order to produce a result value comprising a plurality of result portions. Storage circuitry is arranged to store the result value within a selected lane of the plurality of lanes, such that each result portion is stored in a different vector data storage element within the corresponding section for the selected lane.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: March 3, 2020
    Assignee: ARM Limited
    Inventors: Christopher Neal Hinds, Neil Burgess, David Raymond Lutz
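The lane layout described above, where each result portion lands in a different vector register but in the same lane, can be sketched with lists standing in for registers. The function names are hypothetical, and the choice of addition with per-portion wraparound (no carry between portions) is an assumption made only to keep the portions independent.

```python
def portionwise_add(a_portions, b_portions, bits=8):
    """Add corresponding portions independently; each wraps at `bits` bits."""
    mask = (1 << bits) - 1
    return [(a + b) & mask for a, b in zip(a_portions, b_portions)]


def store_in_lane(registers, lane, portions):
    """Write portion i into section `lane` of vector register i."""
    for register, portion in zip(registers, portions):
        register[lane] = portion


# Three vector registers, each with four sections (lanes).
registers = [[0] * 4 for _ in range(3)]
store_in_lane(registers, 2, portionwise_add([1, 2, 250], [4, 5, 10]))
# registers is now [[0, 0, 5, 0], [0, 0, 7, 0], [0, 0, 4, 0]]
```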
  • Patent number: 10460416
    Abstract: An example preprocessor circuit for formatting image data into a plurality of streams of image samples includes: a plurality of memory banks configured to store the image data; multiplexer circuitry coupled to the memory banks; a first plurality of registers coupled to the multiplexer circuitry; a second plurality of registers coupled to the first plurality of registers, outputs of the second plurality of registers configured to provide the plurality of streams of image samples; and control circuitry configured to generate addresses for the plurality of memory banks, control the multiplexer circuitry to select among outputs of the plurality of memory banks, control the first plurality of registers to store outputs of the second plurality of multiplexers, and control the second plurality of registers to store outputs of the first plurality of registers.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: October 29, 2019
    Assignee: XILINX, INC.
    Inventors: Ashish Sirasao, Elliott Delaye, Aaron Ng, Ehsan Ghasemi
  • Patent number: 10423413
    Abstract: A method of loading and duplicating scalar data from a source into a destination register. The data may be duplicated in byte, half word, word or double word parts, according to a duplication pattern.
    Type: Grant
    Filed: July 9, 2014
    Date of Patent: September 24, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Timothy David Anderson, Duc Quang Bui, Peter Richard Dent
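Duplicating a scalar across a destination register in byte, half word, word, or double word parts can be sketched with integer shifts. The function name and the default 64-bit register width are assumptions for illustration; the abstract does not specify a register size.

```python
def load_and_duplicate(scalar, element_bits, register_bits=64):
    """Replicate the low `element_bits` of `scalar` across a register-wide integer."""
    if register_bits % element_bits:
        raise ValueError("element size must divide the register size")
    element = scalar & ((1 << element_bits) - 1)
    result = 0
    for i in range(register_bits // element_bits):
        # Place one copy of the element at each element position.
        result |= element << (i * element_bits)
    return result
```

For example, `load_and_duplicate(0xAB, 8, 32)` yields `0xABABABAB`, and `load_and_duplicate(0x1234, 16)` yields `0x1234123412341234`.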
  • Patent number: 10402198
    Abstract: A signal processing device comprising at least one control unit arranged to receive at least one pack-insert instruction, decode the received at least one pack-insert instruction, and output at least one pack-insert control signal in accordance with the received pack-insert instruction. The signal processing device further comprising at least one pack-insert component arranged to receive at least a first data block to be inserted into a sequence of data blocks to be output to at least one destination register, receive a plurality of further data blocks to be packed within the sequence of data blocks to be output to the at least one destination register, arrange the at least first data block and the plurality of further data blocks into a sequence of data blocks based at least partly on the at least one pack-insert control signal, and output the sequence of data blocks.
    Type: Grant
    Filed: June 18, 2013
    Date of Patent: September 3, 2019
    Assignee: NXP USA, Inc.
    Inventors: Avi Gal, Fabrice Aidan, Noam Eshel-Goldman, Roy Glasner, Dmitry Lachover, Itay Peled
  • Patent number: 10379860
    Abstract: A condition code can depend upon a numerical output of a floating point operation for a processing pipeline. A classification can be determined for the floating point operation of a received instruction. In response to the classification and using condition determination logic, a value can be calculated for the condition code by inferring from data that is available from the processing pipeline before the numerical output is available. The value for the condition code can be provided to branch decision logic of the processing pipeline.
    Type: Grant
    Filed: May 3, 2017
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Steven R. Carlough, Son T. Dao, Petra Leber, Silvia M. Mueller
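One way to picture early condition-code determination is an operation class whose condition code follows from the operand alone. The sketch below is an assumption-laden illustration, not the patented logic: the operation names, the classification rule, and the code values (0 zero, 1 negative, 2 positive, 3 NaN) are all hypothetical.

```python
import math


def early_condition_code(op, x):
    """Return a condition code without computing the result, or None if that is not possible."""
    if op not in ("negate", "abs"):
        # Classification: for other operations, wait for the numerical output.
        return None
    if math.isnan(x):
        return 3
    if x == 0:
        return 0
    # Negation flips the operand's sign; absolute value is never negative.
    result_is_negative = (x > 0) if op == "negate" else False
    return 1 if result_is_negative else 2
```

A pipeline model could hand a non-`None` value to branch decision logic immediately and fall back to the computed result otherwise.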
  • Patent number: 10379859
    Abstract: A condition code can depend upon a numerical output of a floating point operation for a processing pipeline. A classification can be determined for the floating point operation of a received instruction. In response to the classification and using condition determination logic, a value can be calculated for the condition code by inferring from data that is available from the processing pipeline before the numerical output is available. The value for the condition code can be provided to branch decision logic of the processing pipeline.
    Type: Grant
    Filed: May 3, 2017
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Steven R. Carlough, Son T. Dao, Petra Leber, Silvia M. Mueller
  • Patent number: 10372451
    Abstract: A sequence alignment method that may be performed by a vector processor may include loading a sequence that is an instance of vector data including a plurality of elements, dividing the sequence into two groups, aligning respective elements of the groups to generate a sequence of sorted elements according to a single instruction multiple data mode, and iteratively performing an alignment operation based on a determination that each group in the sequence of sorted elements includes more than one element of the plurality of elements. Each iteration may include dividing each group to form new groups and aligning respective elements of each pair of adjacent new groups to generate a new sequence of sorted elements. The new sequence of a current iteration of the alignment operation may be transmitted as a data output, based on a determination that each new group does not include more than one element.
    Type: Grant
    Filed: November 3, 2017
    Date of Patent: August 6, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun Pil Kim, Hyun Woo Sim, Seong Woo Ahn
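The divide-and-align iteration described above resembles a bitonic merge, and the sketch below implements that reading. Identifying the method with a bitonic merge is an assumption, as is the requirement that the input be a bitonic sequence (ascending then descending) whose length is a power of two. The inner compare-exchange loop is what a vector processor would perform across lanes in one SIMD step.

```python
def bitonic_merge(sequence):
    """Sort a bitonic sequence by repeatedly halving groups and compare-exchanging."""
    data = list(sequence)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = n // 2
    while half >= 1:
        # Each block of 2 * half elements is a pair of adjacent groups.
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                # Align respective elements of the two groups.
                if data[i] > data[i + half]:
                    data[i], data[i + half] = data[i + half], data[i]
        half //= 2  # Divide each group to form new groups.
    return data
```

For the bitonic input `[1, 4, 6, 7, 8, 5, 3, 2]`, the three passes (group sizes 4, 2, 1) produce `[1, 2, 3, 4, 5, 6, 7, 8]`.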
  • Patent number: 10331830
    Abstract: Techniques for logic gate simulation. Program instructions may be executable by a processor to select logic gates from a netlist that specifies a gate-level representation of a digital circuit. Each logic gate may be assigned to a corresponding element position of a single-instruction, multiple-data (SIMD) shuffle or population count instruction, and at least two logic gates may specify different logic functions. Simulation-executable instructions including the SIMD shuffle or population count instruction may be generated. When executed, the simulation-executable instructions simulate the functionality of the selected logic gates. More particularly, execution of the SIMD shuffle or population count instruction may concurrently simulate operation of at least two logic gates that specify different logic functions.
    Type: Grant
    Filed: June 13, 2016
    Date of Patent: June 25, 2019
    Assignee: Apple Inc.
    Inventor: Alex S. Teiche
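Simulating gates of different types with a single shuffle can be pictured as one table lookup per lane. The encoding below is an assumption made for illustration, not necessarily the patent's: four 4-entry truth tables are packed into one 16-entry table, and each lane's index combines the gate type with its two input bits. The `shuffle` function emulates a byte-shuffle instruction in plain Python.

```python
# Truth tables indexed by (a * 2 + b), packed back to back.
TABLE = [0, 0, 0, 1,   # AND
         0, 1, 1, 1,   # OR
         0, 1, 1, 0,   # XOR
         1, 1, 1, 0]   # NAND
GATE_ID = {"AND": 0, "OR": 1, "XOR": 2, "NAND": 3}


def shuffle(table, indices):
    """Emulate a SIMD shuffle: every lane picks table[index] independently."""
    return [table[i & 0x0F] for i in indices]


def simulate(gates, a_bits, b_bits):
    """Evaluate one gate per lane; lanes may hold different logic functions."""
    indices = [GATE_ID[g] * 4 + a * 2 + b
               for g, a, b in zip(gates, a_bits, b_bits)]
    return shuffle(TABLE, indices)
```

Calling `simulate(["AND", "OR", "XOR", "NAND"], [1, 0, 1, 1], [1, 0, 1, 1])` evaluates four different gates in one shuffle and returns `[1, 0, 0, 0]`.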
  • Patent number: 10296342
    Abstract: Systems, methods, and apparatuses for executing an instruction are described. For example, an instruction includes at least an opcode, a field for a packed data source operand, and a field for a packed data destination operand. When executed, the instruction causes, for each data element position of the source operand, all values stored in preceding data element positions of the packed data source operand to be added to the value stored in that data element position, and the result of the addition to be stored into a corresponding data element position of the packed data destination operand.
    Type: Grant
    Filed: July 2, 2016
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: William M. Brown, Elmoustapha Ould-Ahmed-Vall
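The per-element operation described above is an inclusive prefix sum over the packed source. A minimal sketch follows; the function name, the 32-bit element width, and the wraparound on overflow are assumptions, since the abstract does not state element sizes or overflow behavior.

```python
from itertools import accumulate


def packed_prefix_sum(source, bits=32):
    """Return a list where element i is the sum of source[0..i], wrapped to `bits` bits."""
    mask = (1 << bits) - 1
    return [total & mask for total in accumulate(source)]
```

For example, `packed_prefix_sum([1, 2, 3, 4])` returns `[1, 3, 6, 10]`.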
  • Patent number: 10282169
    Abstract: Techniques are disclosed relating to floating-point operations with down-conversion. In some embodiments, a floating-point unit is configured to perform fused multiply-addition operations based on first and second different instruction types. In some embodiments, the first instruction type specifies a result in a first floating-point format and the second instruction type specifies fused multiply addition of input operands in the first floating-point format to generate a result in a second, lower-precision floating-point format. For example, the first format may be a 32-bit format and the second format may be a 16-bit format. In some embodiments, the floating-point unit includes rounding circuitry, exponent circuitry, and/or increment circuitry configured to generate signals for the second instruction type in the same pipeline stage as for the first instruction type. In some embodiments, disclosed techniques may reduce the number of pipeline stages included in the floating-point circuitry.
    Type: Grant
    Filed: April 6, 2016
    Date of Patent: May 7, 2019
    Assignee: Apple Inc.
    Inventors: Liang-Kai Wang, Terence M. Potter, Andrew M. Havlir, Yu Sun, Nicolas X. Pena, Xiao-Long Wu, Christopher A. Burns
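The two instruction types described above can be approximated in software by rounding the same multiply-add to different formats. This is a rough numerical sketch, not a model of the hardware: the function names are hypothetical, the intermediate sum is held in a Python double (so it can differ from a true single-rounding fused operation in rare cases), and values outside the 16-bit range raise `OverflowError` instead of saturating.

```python
import struct


def to_f32(x):
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fma_f32(a, b, c):
    """First instruction type: 32-bit inputs, 32-bit result."""
    return to_f32(to_f32(a) * to_f32(b) + to_f32(c))


def fma_f32_to_f16(a, b, c):
    """Second instruction type: 32-bit inputs, result rounded to 16 bits."""
    return to_f16(to_f32(a) * to_f32(b) + to_f32(c))
```

The down-converting variant rounds once to the narrower format rather than rounding to 32 bits first and converting afterwards, which is the behavior the second instruction type is meant to provide in a single operation.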