Patents Examined by Daniel H. Pan
  • Patent number: 10853078
    Abstract: A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LPT tracks information to compare an address of a store or snoop microoperation with entries in the LICM and re-loads a load microoperation of a matching entry.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: December 1, 2020
    Assignee: INTEL CORPORATION
    Inventors: Vineeth Mekkat, Mark Dechene, Zhongying Zhang, John Faistl, Janghaeng Lee, Hou-Jen Ko, Sebastian Winkel, Oleg Margulis
  • Patent number: 10838724
    Abstract: Methods, systems, and apparatus, including an apparatus for processing an instruction for accessing a N-dimensional tensor, the apparatus including multiple tensor index elements and multiple dimension multiplier elements, where each of the dimension multiplier elements has a corresponding tensor index element. The apparatus includes one or more processors configured to obtain an instruction to access a particular element of a N-dimensional tensor, where the N-dimensional tensor has multiple elements arranged across each of the N dimensions, and where N is an integer that is equal to or greater than one; determine, using one or more tensor index elements of the multiple tensor index elements and one or more dimension multiplier elements of the multiple dimension multiplier elements, an address of the particular element; and output data indicating the determined address for accessing the particular element of the N-dimensional tensor.
    Type: Grant
    Filed: March 11, 2019
    Date of Patent: November 17, 2020
    Assignee: Google LLC
    Inventors: Dong Hyuk Woo, Andrew Everett Phelps
  • Patent number: 10831491
    Abstract: The present disclosure is directed to systems and methods for mitigating or eliminating the effectiveness of a side channel attack, such as a Spectre type attack, by limiting the ability of a user-level branch prediction inquiry to access system-level branch prediction data. The branch prediction data stored in the BTB may be apportioned into a plurality of BTB data portions. BTB control circuitry identifies the initiator of a received branch prediction inquiry. Based on the identity of the branch prediction inquiry initiator, the BTB control circuitry causes BTB look-up circuitry to selectively search one or more of the plurality of BTB data portions.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: November 10, 2020
    Assignee: Intel Corporation
    Inventors: Vadim Sukhomlinov, Kshitij Doshi
  • Patent number: 10824433
    Abstract: An array-based inference engine includes a plurality of processing tiles arranged in a two-dimensional array of a plurality of rows and a plurality of columns. Each processing tile comprises at least one or more of an on-chip memory (OCM) configured to load and maintain data from the input data stream for local access by components in the processing tile and further configured to maintain and output result of the ML operation performed by the processing tile as an output data stream. The array includes a first processing unit (POD) configured to perform a dense and/or regular computation task of the ML operation on the data in the OCM. The array also includes a second processing unit/element (PE) configured to perform a sparse and/or irregular computation task of the ML operation on the data in the OCM and/or from the POD.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: November 3, 2020
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen
  • Patent number: 10810131
    Abstract: A streaming engine employed in a digital data processor may specify a fixed read-only data stream defined by plural nested loops. An address generator produces address of data elements for the nested loops. A steam head register stores data elements next to be supplied to functional units for use as operands. A stream template register independently specifies a linear address or a circular address mode for each of the nested loops.
    Type: Grant
    Filed: June 11, 2019
    Date of Patent: October 20, 2020
    Assignee: Texas Instruments Incorporated
    Inventor: Joseph Zbiciak
  • Patent number: 10789070
    Abstract: Techniques for synchronizing a set of code branches are disclosed. A synchronization process is triggered by an event and/or a schedule. The synchronization process includes traversing each code branch, such that parent branches of a particular branch are “in sync” prior to being merged into the particular branch. In an embodiment, a hierarchical order for a set of branches is determined. The branch represented by the top node of the hierarchical order does not have any parents. A branch that is a child of the branch represented by the top node is in the second level of the hierarchical order. The branch in the second level is updated by incorporating the current state of the branch represented by the top node. Thereafter, each branch is iteratively updated by incorporating the current state of the branch's parent branch. Hence, changes to any parent branch are propagated through all its descendant branches.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: September 29, 2020
    Assignee: Oracle International Corporation
    Inventors: Maurizio Cimadamore, Brian Goetz
  • Patent number: 10783000
    Abstract: Associating working sets and threads is disclosed. An indication of a stalling event is received. In response to receiving the indication of the stalling event, a state of a processor associated with the stalling event is saved. At least one of an identifier of a guest thread running in the processor and a guest physical address referenced by the processor is obtained from the saved processor state.
    Type: Grant
    Filed: May 8, 2019
    Date of Patent: September 22, 2020
    Assignee: TidalScale, Inc.
    Inventors: Isaac R. Nassi, Kleoni Ioannidou, David P. Reed, I-Chun Fang, Michael Berman, Mark Hill, Brian Moffet
  • Patent number: 10776113
    Abstract: Technical solutions are described for out-of-order (OoO) execution of one or more instructions by a processing unit includes receiving, by a load-store unit (LSU) of the processing unit, an OoO window of instructions including a plurality of instructions to be executed OoO, and issuing, by the LSU, instructions from the OoO window. The issuing includes selecting an instruction from the OoO window, the instruction using an effective address. Further, in response to the instruction being a load instruction, it is determined whether the effective address is present in an effective address directory (EAD). In response to the effective address being present in the EAD, the load instruction is issued using the effective address. Further, in response to the instruction being a store instruction, a real address mapped to the effective address is determined from an effective-real translation (ERT) table, and the store instruction is issued using the real address.
    Type: Grant
    Filed: June 21, 2019
    Date of Patent: September 15, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christopher Gonzalez, Bryan Lloyd, Balaram Sinharoy
  • Patent number: 10768930
    Abstract: A method provides for decoding, in a microprocessor, an instruction into data identifying a first register, a second register, an immediate value, and an opcode identifier. The opcode identifier is interpreted as indicating that an arithmetic operation is to be performed on the first register and the second register, and that the microprocessor is to perform a change of control operation in response to the addition of the first register and the second register causing overflow or underflow. The change of control operation is to a location in a program determined based on the immediate value. A processor can be provided with a decoder and other supporting circuitry to implement such method. Overflow/underflow can be checked on word boundaries of a double-word operation.
    Type: Grant
    Filed: February 2, 2015
    Date of Patent: September 8, 2020
    Assignee: MIPS Tech, LLC
    Inventor: Ranganathan Sudhakar
  • Patent number: 10768933
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A steam head register stores data elements next to be supplied to functional units for use as operands. Stream metadata is stored in response to a stream store instruction. Stored stream metadata is restored to the stream engine in response to a stream restore instruction. An interrupt changes an open stream to a frozen state discarding stored stream data. A return from interrupt changes a frozen stream to an active state.
    Type: Grant
    Filed: February 12, 2019
    Date of Patent: September 8, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Joseph Zbiciak, Timothy D. Anderson
  • Patent number: 10761850
    Abstract: Disclosed embodiments relate to look up table operations implemented in a digital data processor. A look up table read instruction recalls data elements of a specified data size from table(s) and stores recalled data elements in successive slots in a destination register. Disclosed embodiments promote data elements to a larger size with selected sign or zero extension. A source operand register stores vector offsets from a table start address. A destination operand stores the results of the look up table read. The look up table instruction implies a base address register and a configuration register. The base address register stores a table base address. The configuration register sets various look up table read operation parameters.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: September 1, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Duc Bui, Dheera Balasubramanian, Naveen Bhoria, Sahithi Krishna
  • Patent number: 10754657
    Abstract: An apparatus includes a memory and a processor. The memory may be configured to store a directed acyclic graph. The processor may be configured to (i) receive a command to run the directed acyclic graph, (ii) parse the directed acyclic graph into a data flow including one or more operators, (iii) schedule the operators in one or more data paths, and (iv) generate one or more output vectors by processing one or more input vectors in the data paths. The processor generally comprises a plurality of hardware engines. The data paths may be implemented with the hardware engines. The hardware engines may operate in parallel to each other.
    Type: Grant
    Filed: April 11, 2019
    Date of Patent: August 25, 2020
    Assignee: Ambarella International LP
    Inventors: Leslie D. Kohn, Robert C. Kunz
  • Patent number: 10754652
    Abstract: A processor includes: an address generating unit that, when an instruction decoded by a decoding unit is an instruction to execute arithmetic processing on a plurality of operand sets each including a plurality of operands that are objects of the arithmetic processing, in parallel a plurality of times, generates an address set corresponding to each of the operand sets of the arithmetic processing for each time, based on a certain address displacement with respect to the plurality of operands included in each of the operand sets; a plurality of instruction queues that hold the generated address sets corresponding to the respective operand sets, in correspondence to respective processing units; and a plurality of processing units that perform the arithmetic processing in parallel on the operand sets obtained based on the respective address sets outputted by the plurality of instruction queues.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: August 25, 2020
    Assignee: FUJITSU LIMITED
    Inventors: Shuji Yamamura, Takumi Maruyama, Masato Nakagawa, Masahiro Kuramoto
  • Patent number: 10754655
    Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry to: determine a dynamic convergence point in a conditional branch of set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is a taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: August 25, 2020
    Assignee: Intel Corporation
    Inventors: Adarsh Chauhan, Hong Wang, Jayesh Gaur, Zeev Sperber, Sumeet Bandishte, Lihu Rappoport, Stanislav Shwartsman, Kamil Garifullin, Sreenivas Subramoney, Adi Yoaz
  • Patent number: 10747531
    Abstract: An example core for a data processing engine (DPE) includes a register file, a processor, coupled to the register file. The processor includes a multiply-accumulate (MAC) circuit, and permute circuitry coupled between the register file and the MAC circuit, the permute circuitry configured to concatenate at least one pair of outputs of the register file to provide at least one input to the MAC circuit. The core further includes an instruction decoder, coupled to the processor, configured to decode a very large instruction word (VLIW) to set a plurality of parameters of the processor, the plurality of parameters including first parameters of the permute circuitry and second parameters of the MAC circuit.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: August 18, 2020
    Assignee: XILINX, INC.
    Inventors: Jan Langer, Baris Ozgul, Juan J. Noguera Serra, Goran HK Bilski, Tim Tuan
  • Patent number: 10747636
    Abstract: This invention is a streaming engine employed in a digital signal processor. A fixed data stream sequence is specified by a control register. The streaming engine fetches stream data ahead of use by a central processing unit and stores it in a stream buffer. Upon occurrence of a fault reading data from memory, the streaming engine identifies the data element triggering the fault preferably storing this address in a fault address register. The streaming engine defers signaling the fault to the central processing unit until this data element is used as an operand. If the data element is never used by the central processing unit, the streaming engine never signals the fault. The streaming engine preferably stores data identifying the fault in a fault source register. The fault address register and the fault source register are preferably extended control registers accessible only via a debugger.
    Type: Grant
    Filed: August 27, 2018
    Date of Patent: August 18, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Joseph Zbiciak, Timothy D Anderson, Duc Bui, Kai Chirca
  • Patent number: 10740126
    Abstract: Methods for supporting wide and efficient front-end operation with guest architecture emulation are disclosed. As a part of a method for supporting wide and efficient front-end operation, upon receiving a request to fetch a first far taken branch instruction, a cache line that includes the first far taken branch instruction, a next cache line and a cache line located at the target of the first far taken branch instruction is read. Based on information that is accessed from a data table, the cache line and either the next cache line or the cache line located at the target is fetched in a single cycle.
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: August 11, 2020
    Assignee: Intel Corporation
    Inventors: Mohammad Abdallah, Ankur Groen, Erika Gunadi, Mandeep Singh, Ravishankar Rao
  • Patent number: 10740220
    Abstract: Performing breakpoint detection via a cache includes detecting an occurrence of a memory access and identifying whether any cache line of the cache matches an address associated with the memory access. When a cache line does match the address associated with the memory access no breakpoint was encountered. When no cache line matches the address associated with the memory access embodiments identify whether any cache line matches the address associated with the memory access when one or more flag bits are ignored. When a cache line does match the address associated with the memory access when the one or more flag bits are ignored, embodiment perform a check for whether a breakpoint was encountered. Otherwise, embodiments process a cache miss.
    Type: Grant
    Filed: June 27, 2018
    Date of Patent: August 11, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventor: Jordi Mola
  • Patent number: 10732980
    Abstract: An apparatus and method are provided for controlling use of a register cache. The apparatus has decode circuitry for decoding instructions retrieved from memory, execution circuitry to execute the decoded instructions in order to perform operations on data values, and a register file having a plurality of registers for storing the data values to be operated on by the execution circuitry. Further, a register cache is provided that comprises a plurality of entries, and is arranged to cache a subset of the data values. Each entry is arranged to cache a data value and an indication of the register associated with that cached data value. Prefetch circuitry is then used to prefetch data values from the register file into the register cache. Further, operand analysis circuitry derives source operand information for an instruction fetched from memory, at least prior to the decode circuitry completing decoding of that instruction.
    Type: Grant
    Filed: June 26, 2018
    Date of Patent: August 4, 2020
    Assignee: ARM Limited
    Inventors: Luca Scalabrino, Frederic Jean Denis Arsanto, Claire Aupetit
  • Patent number: 10705842
    Abstract: Methods and apparatuses relating to high-performance authenticated encryption are described.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: July 7, 2020
    Assignee: INTEL CORPORATION
    Inventors: Vikram Suresh, Sanu Mathew, Sudhir Satpathy, Vinodh Gopal