Patents Examined by Daniel H. Pan
  • Patent number: 11182208
    Abstract: Embodiments involving core-to-core offload are detailed herein. For example, a processor core comprising performance monitoring circuitry to monitor performance of the core, an offload phase tracker to maintain status information about at least an availability of a second core to act as a helper core for the first core, decode circuitry to decode an instruction having fields for at least an opcode to indicate that a task offload start operation is to be performed, and execution circuitry to execute the decoded instruction to cause a transmission of an offload start request to at least the second core, the offload start request including one or more of: an identifier of the first core, a location of where the second core can find the task to perform, an identifier of the second core, an instruction pointer from the code that the task is a proper subset of, a requesting core state, and a requesting core state location, is described.
    Type: Grant
    Filed: June 29, 2019
    Date of Patent: November 23, 2021
    Assignee: INTEL CORPORATION
    Inventor: Elmoustapha Ould-Ahmed-Vall
  • Patent number: 11182214
    Abstract: Various examples are disclosed for predictive allocation of computing resources based on the predicted location of a user. A computing environment can generate a predictive usage model that predicts a location of a user and allocate computing resources, such as VDI sessions or VMs, to a host device that optimizes latency to the predicted location.
    Type: Grant
    Filed: June 25, 2019
    Date of Patent: November 23, 2021
    Assignee: VMware, Inc.
    Inventors: Erich Peter Stuntebeck, Ravish Chawla, Kar Fai Tse
  • Patent number: 11169800
    Abstract: An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
    Type: Grant
    Filed: October 18, 2019
    Date of Patent: November 9, 2021
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Mark Charney, Raanan Sade, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Roman S. Dubtsov
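The two-operation split described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which partial product belongs to which operation, so the assignment below (a*c and a*d first, then -b*d and b*c) is an assumption, and the function name is hypothetical.

```python
def complex_multiply_two_ops(x, y):
    """Multiply two complex numbers given as (real, imag) pairs.

    Assumed split: each operation yields one term of the real component
    and one term of the imaginary component of the result.
    """
    a, b = x  # first complex number: a + b*i
    c, d = y  # second complex number: c + d*i

    # First operation: first term of the real and imaginary components.
    real_term_1 = a * c
    imag_term_1 = a * d

    # Second operation: second term of the real and imaginary components.
    real_term_2 = -(b * d)
    imag_term_2 = b * c

    return (real_term_1 + real_term_2, imag_term_1 + imag_term_2)


if __name__ == "__main__":
    # Cross-check against Python's built-in complex arithmetic.
    product = complex(1, 2) * complex(3, 4)
    print(complex_multiply_two_ops((1, 2), (3, 4)), product)
```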
  • Patent number: 11169813
    Abstract: Methods, systems, and devices for data processing are described. In some systems, data pipelines may be implemented to handle data processing jobs. To improve data pipeline flexibility, the systems may use separate pipeline and policy declarations. For example, a pipeline server may receive both a pipeline definition defining a first set of data operations to perform and a policy definition including instructions for performing a second set of data operations, where the first set of data operations is a subset of the second set. The server may execute a data pipeline based on a trigger (e.g., a scheduled trigger, a received message, etc.). To execute the pipeline, the server may layer the policy definition into the pipeline definition when creating an execution plan. The server may execute the execution plan by performing a number of jobs using a set of resources and plugins according to the policy definition.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: November 9, 2021
    Assignee: Ketch Kloud, Inc.
    Inventors: Seth Yates, Yacov Salomon, Vivek Vaidya
  • Patent number: 11157278
    Abstract: A digital data processor includes an instruction memory storing instructions each specifying a data processing operation and at least one data operand field, an instruction decoder coupled to the instruction memory for sequentially recalling instructions from the instruction memory and determining the data processing operation and the at least one data operand, and at least one operational unit coupled to a data register file and to the instruction decoder to perform a data processing operation upon at least one operand corresponding to an instruction decoded by the instruction decoder and to store results of the data processing operation. The operational unit is configured to increment histogram values in response to a histogram instruction by incrementing a bin entry at a specified location in a specified number of at least one histogram.
    Type: Grant
    Filed: September 13, 2019
    Date of Patent: October 26, 2021
    Assignee: Texas Instruments Incorporated
    Inventors: Naveen Bhoria, Duc Bui, Rama Venkatasubramanian, Dheera Balasubramanian Samudrala, Alan Davis
  • Patent number: 11144815
    Abstract: A system includes a memory, a processor, and an accelerator circuit. The accelerator circuit includes an internal memory, an input circuit block, a filter circuit block, a post-processing circuit block, and an output circuit block to concurrently perform tasks of a neural network application assigned to the accelerator circuit by the processor.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: October 12, 2021
    Assignee: Optimum Semiconductor Technologies Inc.
    Inventors: Mayan Moudgill, John Glossner
  • Patent number: 11144322
    Abstract: A system includes a memory and multiple processors. The memory further includes a shared section and a non-shared section. The processors further include at least a first processor and a second processor, both of which have read-only access to the shared section of the memory. The first processor and the second processor are operable to execute shared code stored in the shared section of the memory, and execute non-shared code stored in a first sub-section and a second sub-section of the non-shared section, respectively. The first processor and the second processor execute the shared code according to a first scheduler and a second scheduler, respectively. The first scheduler operates independently of the second scheduler.
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: October 12, 2021
    Assignee: MediaTek Inc.
    Inventors: Hsiao Tzu Feng, Chia-Wei Chang, Li-San Yao
  • Patent number: 11144364
    Abstract: Recovering microprocessor logical register values by: partitioning a register mapper by logical register type; providing a plurality of recovery ports; assigning a logical register type to a recovery port; receiving a restore required instruction; and mapping SRB (save and restore buffer) values to the register mapper by logical register type.
    Type: Grant
    Filed: January 25, 2019
    Date of Patent: October 12, 2021
    Assignee: International Business Machines Corporation
    Inventors: Steven J. Battle, Brandon R. Goddard, Dung Q. Nguyen, Joshua W. Bowman, Brian D. Barrick, Susan E. Eisen, David S. Walder, Cliff Kucharski
  • Patent number: 11119776
    Abstract: A stream of data is accessed from a memory system using a stream of addresses generated in a first mode of operating a streaming engine in response to executing a first stream instruction. A block cache management operation is performed on a cache in the memory using a block of addresses generated in a second mode of operating the streaming engine in response to executing a second stream instruction.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: September 14, 2021
    Assignee: Texas Instruments Incorporated
    Inventors: Joseph Raymond Michael Zbiciak, Timothy David Anderson, Jonathan (Son) Hung Tran, Kai Chirca, Daniel Wu, Abhijeet Ashok Chachad, David M. Thompson
  • Patent number: 11119786
    Abstract: Embodiments for automating multidimensional elasticity for streaming applications in a computing environment. Each operator in a streaming application may be identified and assigned into one of a variety of groups according to similar performance metrics. One or more threading models may be adjusted for one or more of the groups to one or more different regions of the streaming application.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: September 14, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Xiang Ni, Scott Schneider, Kun-Lung Wu
  • Patent number: 11119787
    Abstract: Systems and methods for non-intrusive hardware profiling are provided. In some cases, integrated circuit devices can be manufactured without native support for performance measurement and/or debugging capabilities, thereby limiting visibility into the integrated circuit device. Understanding the timing of operations can help to determine whether the hardware of the device is operating correctly and, when the device is not operating correctly, provide information that can be used to debug the device. In order to measure execution time of various tasks performed by the integrated circuit device, program instructions may be inserted to generate notifications that provide tracing information, including timestamps, for operations executed by the integrated circuit device.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: September 14, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Mohammad El-Shabani, Ron Diamant, Samuel Jacob, Ilya Minkin, Richard John Heaton
  • Patent number: 11119779
    Abstract: A streaming engine employed in a digital data processor specifies fixed first and second read-only data streams. Corresponding stream address generators produce addresses of data elements of the two streams. Corresponding stream head registers store data elements next to be supplied to functional units for use as operands. The two streams share two memory ports. A toggling preference of stream to port ensures fair allocation. The arbiters permit one stream to borrow the other's interface when the other interface is idle. Thus one stream may issue two memory requests, one from each memory port, if the other stream is idle. This spreads the bandwidth demand for each stream across both interfaces, ensuring neither interface becomes a bottleneck.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: September 14, 2021
    Assignee: Texas Instruments Incorporated
    Inventors: Joseph Zbiciak, Timothy Anderson
  • Patent number: 11113223
    Abstract: Examples herein describe techniques for communicating between data processing engines (DPEs) in an array of DPEs. In one embodiment, the array is a 2D array where each of the DPEs includes one or more cores. In addition to the cores, the DPEs can include streaming interconnects which transmit streaming data using two different modes: circuit switching and packet switching. Circuit switching establishes reserved point-to-point communication paths between endpoints in the interconnect which routes data in a deterministic manner. Packet switching, in contrast, transmits streaming data that includes headers for routing data within the interconnect in a non-deterministic manner. In one embodiment, the streaming interconnects can have one or more ports configured to perform circuit switching and one or more ports configured to perform packet switching.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: September 7, 2021
    Assignee: XILINX, INC.
    Inventors: Peter McColgan, Goran H K Bilski, Juan J. Noguera Serra, Jan Langer, Baris Ozgul, David Clarke
  • Patent number: 11113057
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read-only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. The streaming engine fetches stream data ahead of use by the central processing unit core in a stream buffer constructed like a cache. The stream buffer cache includes plural cache lines, each including tag bits, at least one valid bit and data bits. Cache lines are allocated to store newly fetched stream data. Cache lines are deallocated upon consumption of the data by a central processing unit core functional unit. Instructions preferably include operand fields with a first subset of codings corresponding to registers, a stream read-only operand coding and a stream read-and-advance operand coding.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: September 7, 2021
    Assignee: Texas Instruments Incorporated
    Inventor: Joseph Zbiciak
  • Patent number: 11113062
    Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a pad value indicator. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. A padded stream vector is formed that includes a specified pad value without accessing the pad value from system memory.
    Type: Grant
    Filed: May 23, 2019
    Date of Patent: September 7, 2021
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Asheesh Bhardwaj, Timothy David Anderson, Son Hung Tran
  • Patent number: 11113063
    Abstract: According to one general aspect, an apparatus may include a main branch target buffer (main-BTB). The apparatus may include a micro-BTB separate from and smaller than the main-BTB, and configured to produce prediction information associated with a branching instruction. The apparatus may include a micro-BTB confidence counter configured to measure a correctness of the prediction information produced by the micro-BTB. The apparatus may further include a micro-BTB misprediction rate counter configured to measure a rate of mispredictions produced by the micro-BTB. The apparatus may also include a micro-BTB enablement circuit configured to enable a usage of the micro-BTB's prediction information, based, at least in part, upon the values of the micro-BTB confidence counter and the micro-BTB misprediction rate counter.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: September 7, 2021
    Inventors: James David Dundas, Xiaoxin Fan, Shashank Nemawarkar, Madhu Saravana Sibi Govindan
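As a rough illustration of the enablement logic in the abstract above, the sketch below gates use of a micro-BTB on two counters. The class name, the saturating-counter width, both thresholds, and the rule that combines the counters are all assumptions made for illustration; the abstract only states that enablement depends at least in part on both counter values.

```python
from dataclasses import dataclass


@dataclass
class MicroBtbGate:
    """Toy model of a micro-BTB enablement decision (all parameters hypothetical)."""

    confidence: int = 0              # saturating confidence counter
    mispredicts: int = 0             # number of wrong micro-BTB predictions
    predictions: int = 0             # total micro-BTB predictions observed
    confidence_max: int = 7          # assumed 3-bit saturating counter
    confidence_threshold: int = 4    # assumed minimum confidence to enable
    max_mispredict_rate: float = 0.25  # assumed maximum tolerated rate

    def record(self, correct: bool) -> None:
        """Update both counters after a micro-BTB prediction resolves."""
        self.predictions += 1
        if correct:
            self.confidence = min(self.confidence + 1, self.confidence_max)
        else:
            self.mispredicts += 1
            self.confidence = max(self.confidence - 1, 0)

    def enabled(self) -> bool:
        """Use micro-BTB predictions only when both counters look healthy."""
        if self.predictions == 0:
            return False
        rate = self.mispredicts / self.predictions
        return (self.confidence >= self.confidence_threshold
                and rate <= self.max_mispredict_rate)


if __name__ == "__main__":
    gate = MicroBtbGate()
    for outcome in [True] * 5 + [False] * 3:
        gate.record(outcome)
        print(outcome, gate.enabled())
```

Requiring both conditions is one possible design choice; a real enablement circuit could weigh the counters differently.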
  • Patent number: 11099933
    Abstract: Disclosed embodiments relate to a streaming engine employed in, for example, a digital signal processor. A fixed data stream sequence including plural nested loops is specified by a control register. The streaming engine includes an address generator producing addresses of data elements and a stream head register storing data elements next to be supplied as operands. The streaming engine fetches stream data ahead of use by the central processing unit core in a stream buffer. Parity bits are formed upon storage of data in the stream buffer and are stored with the corresponding data. Upon transfer to the stream head register, a second parity is calculated and compared with the stored parity. The streaming engine signals a parity fault if the parities do not match. The streaming engine preferably restarts fetching the data stream at the data element generating a parity fault.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: August 24, 2021
    Assignee: Texas Instruments Incorporated
    Inventors: Joseph Zbiciak, Timothy Anderson
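The store-then-recheck parity scheme in the abstract above can be sketched in software as follows. This is a toy model under assumptions: single-bit even parity per data word, a list standing in for the stream buffer, and hypothetical names (`StreamBuffer`, `ParityFault`, `flip_bit`). The abstract does not specify the parity width or layout.

```python
class ParityFault(Exception):
    """Raised when recomputed parity disagrees with the stored parity."""

    def __init__(self, index: int):
        super().__init__(f"parity fault at stream element {index}")
        self.index = index  # element at which fetching could be restarted


def parity(word: int) -> int:
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


class StreamBuffer:
    def __init__(self):
        self._entries = []  # list of (data word, stored parity bit)

    def store(self, word: int) -> None:
        # Parity is formed when data is stored and kept alongside the data.
        self._entries.append((word, parity(word)))

    def flip_bit(self, index: int, bit: int) -> None:
        # Helper for the demonstration only: corrupt data, keep stored parity.
        word, stored = self._entries[index]
        self._entries[index] = (word ^ (1 << bit), stored)

    def transfer_to_head(self, index: int) -> int:
        # A second parity is computed on transfer and compared with the first.
        word, stored = self._entries[index]
        if parity(word) != stored:
            raise ParityFault(index)
        return word


if __name__ == "__main__":
    buf = StreamBuffer()
    buf.store(0b1011)
    print(buf.transfer_to_head(0))
    buf.flip_bit(0, 0)
    try:
        buf.transfer_to_head(0)
    except ParityFault as fault:
        print("restart fetch at element", fault.index)
```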
  • Patent number: 11086628
    Abstract: A system and method for load queue (LDQ) and store queue (STQ) entry allocations at address generation time that maintains age-order of instructions is described. In particular, writing LDQ and STQ entries is delayed until address generation time. This allows the load and store operations to dispatch, and younger operations (which may not be store and load operations) to also dispatch and execute their instructions. The address generation of the load or store operation is held at an address generation scheduler queue (AGSQ) until a load or store queue entry is available for the operation. The tracking of load queue entries or store queue entries is effectively being done in the AGSQ instead of at the decode engine. The LDQ and STQ depth is not visible from a decode engine's perspective, which increases the effective processing and queue depth.
    Type: Grant
    Filed: August 15, 2016
    Date of Patent: August 10, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventor: John M. King
  • Patent number: 11086626
    Abstract: Circuitry comprises decode circuitry to decode program instructions including producer instructions and consumer instructions, a consumer instruction requiring, as an input operand, a result generated by execution of a producer instruction; and execution circuitry to execute the program instructions; in which: the decode circuitry is configured to control operation of the execution circuitry in response to hint data associated with a given producer instruction and indicating, for the given producer instruction, a number of consumer instructions which require, as an input operand, a result generated by the given producer instruction.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: August 10, 2021
    Assignee: Arm Limited
    Inventors: Roko Grubisic, Giacomo Gabrielli, Matthew James Horsnell, Syed Ali Mustafa Zaidi
  • Patent number: 11086631
    Abstract: Techniques are disclosed relating to the handling of exceptions generated by illegal instructions in a processor. In an embodiment, a processor may be configured to fetch instructions defined according to an instruction set architecture (ISA). The ISA may include a set of uncompressed instructions and a set of compressed instructions. The processor may further be configured to, upon detecting a given one of the set of compressed instructions, cause a copy of the given compressed instruction to be saved and convert the given compressed instruction to a corresponding given uncompressed instruction. The processor may also be configured to detect that the given uncompressed instruction is illegal and was converted from the given compressed instruction, and based at least in part on these, cause an illegal instruction exception to be generated using the copy of the given compressed instruction.
    Type: Grant
    Filed: October 23, 2019
    Date of Patent: August 10, 2021
    Assignee: Western Digital Technologies, Inc.
    Inventors: Robert T. Golla, Matthew B. Smittle
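The exception flow in the last abstract above (save the compressed encoding, expand it, and report the saved copy if the expanded form turns out to be illegal) can be sketched as below. The encodings, the expansion table, and the set of legal instructions are toy values invented for illustration and are not meant to match any real ISA; the function and exception names are likewise hypothetical.

```python
class IllegalInstruction(Exception):
    """Carries the encoding that should be reported with the exception."""

    def __init__(self, encoding: int):
        super().__init__(f"illegal instruction: {encoding:#x}")
        self.encoding = encoding


# Hypothetical expansion table: compressed encoding -> uncompressed encoding.
EXPAND = {0xA1: 0x1000, 0xA2: 0x2000}

# Hypothetical set of legal uncompressed encodings.
LEGAL = {0x1000}


def decode(encoding: int, compressed: bool) -> int:
    """Return the uncompressed encoding, or raise IllegalInstruction."""
    if compressed:
        saved_copy = encoding            # keep a copy of the compressed form
        uncompressed = EXPAND[encoding]  # convert to the uncompressed form
    else:
        saved_copy = None
        uncompressed = encoding

    if uncompressed not in LEGAL:
        # If the illegal instruction came from a compressed one, report the
        # saved compressed copy rather than the expanded encoding.
        reported = saved_copy if saved_copy is not None else uncompressed
        raise IllegalInstruction(reported)
    return uncompressed


if __name__ == "__main__":
    print(hex(decode(0xA1, compressed=True)))
    try:
        decode(0xA2, compressed=True)
    except IllegalInstruction as exc:
        print("reported encoding:", hex(exc.encoding))
```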