Patents Examined by Daniel H. Pan
-
Patent number: 11216301
Abstract: A method for enabling scheduling of processes in a processing system having at least one processor and associated hardware resources, at least one of the hardware resources being shared by at least two of the processes. The method is characterized by controlling execution of a process based on a usage bound of the number of allowable accesses, by the process, to a shared hardware resource by halting execution of the process when the number of allowable accesses has been reached, and enabling idle mode or start of execution of a next process. In this way, costly hardware overprovisioning and/or the need for shutting down processor cores can be avoided. By controlling execution of a process based on a usage bound of the number of allowable accesses to a shared hardware resource, instead of simply dividing CPU time between processes, highly efficient shared-resource-based process scheduling can be achieved.
Type: Grant
Filed: April 12, 2016
Date of Patent: January 4, 2022
Assignee: Telefonaktiebolaget LM Ericsson (publ)
Inventors: Marcus Jägemar, Sigrid Eldh, Andreas Ermedahl
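The scheduling idea in this abstract can be sketched in software: a process runs until it exhausts its bound of allowed accesses to a shared resource, then halts so the next process (or idle mode) can run. This is a minimal illustrative model; the class and function names are assumptions, not from the patent.

```python
# Sketch of usage-bound scheduling: halt a process once it reaches its bound
# of allowed accesses to a shared hardware resource. Illustrative only.

class Process:
    def __init__(self, name, usage_bound):
        self.name = name
        self.usage_bound = usage_bound  # max allowed shared-resource accesses
        self.accesses = 0

def run_until_bound(process, access_trace):
    """Execute accesses from the trace until the usage bound is reached."""
    executed = []
    for access in access_trace:
        if process.accesses >= process.usage_bound:
            break  # halt: bound reached; scheduler can start the next process
        process.accesses += 1
        executed.append(access)
    return executed

p = Process("A", usage_bound=3)
print(run_until_bound(p, ["r0", "r1", "r2", "r3", "r4"]))  # ['r0', 'r1', 'r2']
```

The point of the patent is that the bound is on shared-resource accesses, not CPU time, so a memory-hungry process is throttled before it can saturate a shared bus or cache.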
-
Patent number: 11210094
Abstract: Systems and methods for minimally intrusive instruction pointer-aware processing resource activity profiling are disclosed. In one embodiment, a graphics processor includes a grouping of processing resources and control logic that is associated with the grouping of processing resources. The control logic is configured to sample a state of at least one processing resource of the grouping of processing resources and to determine activity data from the state with the activity data including at least one of stalls and reason counts for stalling activity, instruction types, pipeline utilization, thread utilization, and shader activity.
Type: Grant
Filed: September 27, 2019
Date of Patent: December 28, 2021
Assignee: Intel Corporation
Inventors: Michael Cole, Alexandr Kurylev, Subramaniam Maiyuran, Vikranth Vemulapalli, Sriharsha Vadlamani, Piotr Reiter
-
Patent number: 11210098
Abstract: Techniques related to executing instructions by a processor comprising receiving a first instruction for execution, determining a first latency value based on an expected amount of time needed for the first instruction to be executed, storing the first latency value in a writeback queue, beginning execution of the first instruction on the instruction execution pipeline, adjusting the latency value based on an amount of time passed since beginning execution of the first instruction, outputting a first result of the first instruction based on the latency value, receiving a second instruction, determining that the second instruction is a variable latency instruction, storing a ready value indicating that a second result of the second instruction is not ready in the writeback queue, beginning execution of the second instruction on the instruction execution pipeline, updating the ready value to indicate that the second result is ready, and outputting the second result.
Type: Grant
Filed: April 15, 2019
Date of Patent: December 28, 2021
Assignee: Texas Instruments Incorporated
Inventor: Timothy D. Anderson
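The mechanism above mixes two kinds of writeback-queue entries: fixed-latency instructions whose latency counter is decremented each cycle, and variable-latency instructions tracked by a ready flag set on completion. The following sketch models that behavior in software; all names are illustrative assumptions.

```python
# Sketch of a writeback queue holding fixed-latency entries (counted down
# each cycle) and variable-latency entries (ready flag set on completion).

class WritebackQueue:
    def __init__(self):
        self.entries = []

    def add_fixed(self, instr, latency):
        # fixed latency: result is ready after `latency` cycles
        self.entries.append({"instr": instr, "latency": latency, "ready": False})

    def add_variable(self, instr):
        # variable latency: ready flag is updated when execution completes
        self.entries.append({"instr": instr, "latency": None, "ready": False})

    def complete(self, instr):
        """Mark a variable-latency instruction's result as ready."""
        for e in self.entries:
            if e["instr"] == instr and e["latency"] is None:
                e["ready"] = True

    def tick(self):
        """One cycle: decrement fixed latencies, output results that are ready."""
        for e in self.entries:
            if e["latency"] is not None:
                e["latency"] -= 1
                if e["latency"] <= 0:
                    e["ready"] = True
        done = [e["instr"] for e in self.entries if e["ready"]]
        self.entries = [e for e in self.entries if not e["ready"]]
        return done

q = WritebackQueue()
q.add_fixed("add", latency=2)
q.add_variable("load")
print(q.tick())      # [] -- 'add' has a cycle left, 'load' is not ready
q.complete("load")
print(q.tick())      # ['add', 'load']
```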
-
Patent number: 11204801
Abstract: Systems and methods for scheduling thread order to improve cache efficiency are disclosed. In one embodiment, a graphics processor includes processing resources and schedule and dispatch logic to schedule and dispatch threads to the processing resources. The schedule and dispatch logic is configured to receive threads, to schedule and dispatch the threads based on a forward thread dispatch having a forward thread order, and to determine whether to disable a reversing of a thread order upon completion of at least a portion of the forward thread dispatch including a completion or ending of a draw call or a dispatch.
Type: Grant
Filed: November 14, 2019
Date of Patent: December 21, 2021
Assignee: Intel Corporation
Inventors: Justin DeCell, Saurabh Sharma
-
Patent number: 11204799
Abstract: A semiconductor device capable of suppressing performance degradation and systems using the same are provided. The semiconductor device includes a plurality of processors CPU1 and CPU2, a scheduling device 10 (ID1) connected to the processors CPU1 and CPU2 for controlling the processors CPU1 and CPU2 to execute a plurality of tasks in real time, memories 17 and 18 accessed by the processors CPU1 and CPU2 to store data by executing the tasks, and access monitor circuits 15 for monitoring accesses to the memories by the processors CPU1 and CPU2. When an access to the memory is detected by the access monitor circuit 15, the data stored in the memory 18 is transferred based on the destination information of the data stored in the memory 18.
Type: Grant
Filed: September 18, 2019
Date of Patent: December 21, 2021
Assignee: RENESAS ELECTRONICS CORPORATION
Inventor: Yasuo Sasaki
-
Patent number: 11182208
Abstract: Embodiments involving core-to-core offload are detailed herein. For example, a processor core comprising performance monitoring circuitry to monitor performance of the core, an offload phase tracker to maintain status information about at least an availability of a second core to act as a helper core for the first core, decode circuitry to decode an instruction having fields for at least an opcode to indicate a task offload operation is to be performed, and execution circuitry to execute the decoded instruction to cause transmission of an offload start request to at least the second core, the offload start request including one or more of: an identifier of the first core, a location of where the second core can find the task to perform, an identifier of the second core, an instruction pointer from the code that the task is a proper subset of, a requesting core state, and a requesting core state location.
Type: Grant
Filed: June 29, 2019
Date of Patent: November 23, 2021
Assignee: INTEL CORPORATION
Inventor: Elmoustapha Ould-Ahmed-Vall
-
Patent number: 11182214
Abstract: Various examples are disclosed for predictive allocation of computing resources based on the predicted location of a user. A computing environment can generate a predictive usage model that predicts a location of a user and allocate computing resources, such as VDI sessions or VMs, to a host device that optimizes latency to the predicted location.
Type: Grant
Filed: June 25, 2019
Date of Patent: November 23, 2021
Assignee: VMware, Inc.
Inventors: Erich Peter Stuntebeck, Ravish Chawla, Kar Fai Tse
-
Patent number: 11169800
Abstract: An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
Type: Grant
Filed: October 18, 2019
Date of Patent: November 9, 2021
Assignee: Intel Corporation
Inventors: Robert Valentine, Mark Charney, Raanan Sade, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Roman S. Dubtsov
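The two-operation split described in this abstract follows from the algebra of complex multiplication: (a + bi)(c + di) = (ac − bd) + (ad + bc)i, so each of the two operations can produce one term of the real part and one term of the imaginary part. The sketch below shows one possible assignment of terms to operations; which term goes to which operation is an assumption, not specified by the abstract.

```python
# Sketch: split a complex multiply (a+bi)*(c+di) into two operations, each
# producing one term of the real component and one of the imaginary component.

def op1(a, b, c, d):
    # first term of the real part (a*c) and of the imaginary part (a*d)
    return a * c, a * d

def op2(a, b, c, d):
    # second term of the real part (-b*d) and of the imaginary part (b*c)
    return -b * d, b * c

def complex_mul(a, b, c, d):
    r1, i1 = op1(a, b, c, d)
    r2, i2 = op2(a, b, c, d)
    return r1 + r2, i1 + i2  # (real, imaginary)

print(complex_mul(1, 2, 3, 4))  # (-5, 10): (1+2i)(3+4i) = -5+10i
```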
-
Patent number: 11169813
Abstract: Methods, systems, and devices for data processing are described. In some systems, data pipelines may be implemented to handle data processing jobs. To improve data pipeline flexibility, the systems may use separate pipeline and policy declarations. For example, a pipeline server may receive both a pipeline definition defining a first set of data operations to perform and a policy definition including instructions for performing a second set of data operations, where the first set of data operations is a subset of the second set. The server may execute a data pipeline based on a trigger (e.g., a scheduled trigger, a received message, etc.). To execute the pipeline, the server may layer the policy definition into the pipeline definition when creating an execution plan. The server may execute the execution plan by performing a number of jobs using a set of resources and plugins according to the policy definition.
Type: Grant
Filed: July 30, 2019
Date of Patent: November 9, 2021
Assignee: Ketch Kloud, Inc.
Inventors: Seth Yates, Yacov Salomon, Vivek Vaidya
-
Patent number: 11157278
Abstract: A digital data processor includes an instruction memory storing instructions each specifying a data processing operation and at least one data operand field, an instruction decoder coupled to the instruction memory for sequentially recalling instructions from the instruction memory and determining the data processing operation and the at least one data operand, and at least one operational unit coupled to a data register file and to an instruction decoder to perform a data processing operation upon at least one operand corresponding to an instruction decoded by the instruction decoder and storing results of the data processing operation. The operational unit is configured to increment histogram values in response to a histogram instruction by incrementing a bin entry at a specified location in a specified number of at least one histogram.
Type: Grant
Filed: September 13, 2019
Date of Patent: October 26, 2021
Assignee: Texas Instruments Incorporated
Inventors: Naveen Bhoria, Duc Bui, Rama Venkatasubramanian, Dheera Balasubramanian Samudrala, Alan Davis
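The effect of the histogram instruction can be modeled in a few lines: for each input value, the bin entry at that value's position in a selected histogram is incremented. This is a software sketch of the observable behavior only; the patent describes a hardware instruction, and the function and variable names here are assumptions.

```python
# Sketch of a histogram-increment operation: bump the bin entry at each
# specified location in the selected histogram.

def histogram_increment(histograms, hist_index, bin_locations):
    """Increment, in histogram `hist_index`, the entry for each bin location."""
    for b in bin_locations:
        histograms[hist_index][b] += 1
    return histograms

hists = [[0] * 8, [0] * 8]            # two 8-bin histograms
histogram_increment(hists, 0, [2, 2, 5])
print(hists[0])  # [0, 0, 2, 0, 0, 1, 0, 0]
```

Doing this in a single instruction matters for workloads like image processing, where building histograms one scalar add at a time is a common bottleneck.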
-
Patent number: 11144322
Abstract: A system includes a memory and multiple processors. The memory further includes a shared section and a non-shared section. The processors further include at least a first processor and a second processor, both of which have read-only access to the shared section of the memory. The first processor and the second processor are operable to execute shared code stored in the shared section of the memory, and execute non-shared code stored in a first sub-section and a second sub-section of the non-shared section, respectively. The first processor and the second processor execute the shared code according to a first scheduler and a second scheduler, respectively. The first scheduler operates independently of the second scheduler.
Type: Grant
Filed: November 5, 2019
Date of Patent: October 12, 2021
Assignee: MediaTek Inc.
Inventors: Hsiao Tzu Feng, Chia-Wei Chang, Li-San Yao
-
Patent number: 11144364
Abstract: Recovering microprocessor logical register values by: partitioning a register mapper by logical register type; providing a plurality of recovery ports; assigning a logical register type to a recovery port; receiving a restore required instruction; and mapping SRB (save and restore buffer) values to the register mapper by logical register type.
Type: Grant
Filed: January 25, 2019
Date of Patent: October 12, 2021
Assignee: International Business Machines Corporation
Inventors: Steven J. Battle, Brandon R. Goddard, Dung Q. Nguyen, Joshua W. Bowman, Brian D. Barrick, Susan E. Eisen, David S. Walder, Cliff Kucharski
-
Patent number: 11144815
Abstract: A system includes a memory, a processor, and an accelerator circuit. The accelerator circuit includes an internal memory, an input circuit block, a filter circuit block, a post-processing circuit block, and an output circuit block to concurrently perform tasks of a neural network application assigned to the accelerator circuit by the processor.
Type: Grant
Filed: December 3, 2018
Date of Patent: October 12, 2021
Assignee: Optimum Semiconductor Technologies Inc.
Inventors: Mayan Moudgill, John Glossner
-
Patent number: 11119787
Abstract: Systems and methods for non-intrusive hardware profiling are provided. In some cases integrated circuit devices can be manufactured without native support for performance measurement and/or debugging capabilities, thereby limiting visibility into the integrated circuit device. Understanding the timing of operations can help to determine whether the hardware of the device is operating correctly and, when the device is not operating correctly, provide information that can be used to debug the device. In order to measure execution time of various tasks performed by the integrated circuit device, program instructions may be inserted to generate notifications that provide tracing information, including timestamps, for operations executed by the integrated circuit device.
Type: Grant
Filed: March 28, 2019
Date of Patent: September 14, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Mohammad El-Shabani, Ron Diamant, Samuel Jacob, Ilya Minkin, Richard John Heaton
-
Patent number: 11119786
Abstract: Embodiments for automating multidimensional elasticity for streaming applications in a computing environment. Each operator in a streaming application may be identified and assigned into one of a variety of groups according to similar performance metrics. One or more threading models may be adjusted for one or more of the groups to one or more different regions of the streaming application.
Type: Grant
Filed: May 30, 2019
Date of Patent: September 14, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Xiang Ni, Scott Schneider, Kun-Lung Wu
-
Patent number: 11119779
Abstract: A streaming engine employed in a digital data processor specifies fixed first and second read-only data streams. Corresponding stream address generators produce addresses of data elements of the two streams. Corresponding stream head registers store data elements next to be supplied to functional units for use as operands. The two streams share two memory ports. A toggling preference of stream to port ensures fair allocation. The arbiters permit one stream to borrow the other's interface when the other interface is idle. Thus one stream may issue two memory requests, one from each memory port, if the other stream is idle. This spreads the bandwidth demand for each stream across both interfaces, ensuring neither interface becomes a bottleneck.
Type: Grant
Filed: March 20, 2020
Date of Patent: September 14, 2021
Assignee: Texas Instruments Incorporated
Inventors: Joseph Zbiciak, Timothy Anderson
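The port-sharing policy in this abstract — toggled preference for fairness, plus borrowing an idle interface — can be sketched as a simple per-cycle arbitration function. This is one plausible reading of the policy, with all names invented for illustration.

```python
# Sketch of two-port arbitration between two streams: fair split by a toggled
# preference when both are active; an active stream may use both ports when
# the other stream is idle. Illustrative model only.

def arbitrate(stream_a_active, stream_b_active, prefer_a_port0):
    """Return (port0_owner, port1_owner) for one cycle; None means idle."""
    if stream_a_active and stream_b_active:
        # both active: toggled preference decides which stream gets which port
        return ("A", "B") if prefer_a_port0 else ("B", "A")
    if stream_a_active:
        return ("A", "A")  # A borrows B's idle interface: two requests this cycle
    if stream_b_active:
        return ("B", "B")
    return (None, None)

print(arbitrate(True, True, True))    # ('A', 'B')
print(arbitrate(True, False, True))   # ('A', 'A')
```

Toggling `prefer_a_port0` every cycle would give the fair long-run allocation the abstract describes.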
-
Patent number: 11119776
Abstract: A stream of data is accessed from a memory system using a stream of addresses generated in a first mode of operating a streaming engine in response to executing a first stream instruction. A block cache management operation is performed on a cache in the memory using a block of addresses generated in a second mode of operating the streaming engine in response to executing a second stream instruction.
Type: Grant
Filed: March 20, 2020
Date of Patent: September 14, 2021
Assignee: Texas Instruments Incorporated
Inventors: Joseph Raymond Michael Zbiciak, Timothy David Anderson, Jonathan (Son) Hung Tran, Kai Chirca, Daniel Wu, Abhijeet Ashok Chachad, David M. Thompson
-
Patent number: 11113057
Abstract: A streaming engine employed in a digital data processor specifies a fixed read-only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. The streaming engine fetches stream data ahead of use by the central processing unit core in a stream buffer constructed like a cache. The stream buffer cache includes plural cache lines, each including tag bits, at least one valid bit, and data bits. Cache lines are allocated to store newly fetched stream data. Cache lines are deallocated upon consumption of the data by a central processing unit core functional unit. Instructions preferably include operand fields with a first subset of codings corresponding to registers, a stream read-only operand coding, and a stream read and advance operand coding.
Type: Grant
Filed: March 4, 2020
Date of Patent: September 7, 2021
Assignee: Texas Instruments Incorporated
Inventor: Joseph Zbiciak
-
Patent number: 11113223
Abstract: Examples herein describe techniques for communicating between data processing engines in an array of data processing engines. In one embodiment, the array is a 2D array where each of the DPEs includes one or more cores. In addition to the cores, the data processing engines can include streaming interconnects which transmit streaming data using two different modes: circuit switching and packet switching. Circuit switching establishes reserved point-to-point communication paths between endpoints in the interconnect which routes data in a deterministic manner. Packet switching, in contrast, transmits streaming data that includes headers for routing data within the interconnect in a non-deterministic manner. In one embodiment, the streaming interconnects can have one or more ports configured to perform circuit switching and one or more ports configured to perform packet switching.
Type: Grant
Filed: April 3, 2018
Date of Patent: September 7, 2021
Assignee: XILINX, INC.
Inventors: Peter McColgan, Goran H K Bilski, Juan J. Noguera Serra, Jan Langer, Baris Ozgul, David Clarke
-
Patent number: 11113062
Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a pad value indicator. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. A padded stream vector is formed that includes a specified pad value without accessing the pad value from system memory.
Type: Grant
Filed: May 23, 2019
Date of Patent: September 7, 2021
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Asheesh Bhardwaj, Timothy David Anderson, Son Hung Tran
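The padding behavior in this abstract can be modeled simply: vectors are filled from fetched data, and lanes past the end of the defined array are filled with a generated pad value rather than a value loaded from memory. The sketch below is an illustrative software model; the function and parameter names are assumptions.

```python
# Sketch of forming padded stream vectors: data fills each vector's lanes,
# and trailing lanes are filled with a pad value that is generated by the
# streaming engine, not fetched from system memory.

def form_vectors(data, vector_len, pad_value):
    """Split data into vectors of vector_len, padding the tail with pad_value."""
    vectors = []
    for i in range(0, len(data), vector_len):
        chunk = data[i:i + vector_len]
        if len(chunk) < vector_len:
            chunk = chunk + [pad_value] * (vector_len - len(chunk))  # pad lanes
        vectors.append(chunk)
    return vectors

print(form_vectors([1, 2, 3, 4, 5], vector_len=4, pad_value=0))
# [[1, 2, 3, 4], [5, 0, 0, 0]]
```

Generating the pad value instead of loading it avoids wasted memory traffic at array boundaries, which is common in tiled matrix or image kernels.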