Patents Examined by George Giroux
-
Patent number: 10592466
Abstract: A GPU architecture employs a crossbar switch to preferentially store operand vectors in a compressed form allowing reduction in the number of memory circuits that must be activated during an operand fetch and to allow existing execution units to be used for scalar execution. Scalar execution can be performed during branch divergence.
Type: Grant
Filed: May 12, 2016
Date of Patent: March 17, 2020
Assignee: Wisconsin Alumni Research Foundation
Inventors: Nam Sung Kim, Zhenhong Liu
-
Patent number: 10572261
Abstract: A task identifier-based mechanism is configured to temporarily disable a dual-issue capability of one or more threads in a superscalar simultaneous multi-threaded core. The core executes a first thread and a second thread which are each provided with a dual-issue capability wherein up to two instructions may be issued in parallel. In response to a task identifier being received that is indicative of a task requiring an improved level of determinism, the dual-issue capability of at least one of the first thread or the second thread is temporarily disabled.
Type: Grant
Filed: January 6, 2016
Date of Patent: February 25, 2020
Assignee: NXP USA, Inc.
Inventors: Alistair Paul Robertson, James Andrew Collier Scobie
-
Patent number: 10564965
Abstract: Compare string processing via inline decode-based micro-operations expansion. An instruction, which is to perform a compare string operation, is decoded. The decoding provides a sequence of operations to perform the compare string operation. The sequence of operations includes a first load to boundary operation to load a first set of data up to a specified boundary of memory and a second load to boundary operation to load a second set of data. The first set of data and the second set of data are loaded as part of the compare string operation.
Type: Grant
Filed: March 3, 2017
Date of Patent: February 18, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Michael K. Gschwind
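The "load to boundary" idea in this abstract can be illustrated in software. The following is a minimal sketch, not the patented micro-operation sequence: it assumes a power-of-two boundary (4096 bytes by default), and the names `bytes_to_boundary` and `compare_strings` are hypothetical helpers introduced only for illustration.

```python
def bytes_to_boundary(address: int, boundary: int, max_len: int) -> int:
    """Bytes loadable from `address` without crossing the next
    `boundary`-aligned address, capped at `max_len`."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    distance = boundary - (address % boundary)
    return min(distance, max_len)


def compare_strings(memory: bytes, a: int, b: int, length: int,
                    boundary: int = 4096) -> int:
    """Compare `length` bytes at offsets `a` and `b` of `memory`.

    Each step loads both operands only up to the nearer boundary, so no
    single load crosses a boundary. Returns the index of the first
    differing byte, or -1 if the ranges are equal.
    """
    offset = 0
    while offset < length:
        remaining = length - offset
        chunk = min(bytes_to_boundary(a + offset, boundary, remaining),
                    bytes_to_boundary(b + offset, boundary, remaining))
        first = memory[a + offset:a + offset + chunk]
        second = memory[b + offset:b + offset + chunk]
        if first != second:
            # Locate the exact differing byte inside this chunk.
            for i in range(chunk):
                if first[i] != second[i]:
                    return offset + i
        offset += chunk
    return -1
```

Chunking on the nearer of the two boundaries is a design choice of this sketch; it keeps both loads inside their current aligned block at every step.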
-
Patent number: 10564967
Abstract: Move string processing via inline decode-based micro-operations expansion. An instruction is obtained, and the instruction, which is to perform a move string operation, is decoded. The decoding provides a sequence of operations to perform the move string operation. The sequence of operations includes a load to boundary operation to load an amount of data up to a specified boundary of memory. The data is to be loaded as part of the move string operation.
Type: Grant
Filed: March 3, 2017
Date of Patent: February 18, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Michael K. Gschwind
-
Patent number: 10558460
Abstract: Systems and techniques are disclosed for general purpose register dynamic allocation based on latency associated with instructions in processor threads. A streaming processor can include general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocation being based on execution latencies of instructions included in the threads.
Type: Grant
Filed: December 14, 2016
Date of Patent: February 11, 2020
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Liang Han, Lin Chen, Chihong Zhang, Hongjiang Shang, Jing Wu, Zilin Ying, Chun Yu, Guofang Jiao, Andrew Gruber, Eric Demers
-
Patent number: 10552151
Abstract: An apparatus including a memory and a circuit. The memory may be configured to store a multidimensional array of data values. The circuit may be configured to (i) fetch a plurality of data vectors from the memory, where each of the data vectors comprises a plurality of the data values, (ii) calculate a plurality of modification values based on the data values, (iii) calculate a first value of a first window based on the data values, and (iv) calculate a second value of a second window by adding to the first value of the first window a next one of the modification values and subtracting from the first value of the first window a previous one of the modification values. The second window generally overlaps the first window in the multidimensional array along a particular axis.
Type: Grant
Filed: December 8, 2016
Date of Patent: February 4, 2020
Assignee: Ambarella, Inc.
Inventors: Wen Wan Yang, Peter Verplaetse
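The incremental window update in steps (iii) and (iv) of this abstract is the familiar running-sum technique. The sketch below is a software illustration under stated assumptions, not the claimed circuit: it assumes a two-dimensional array, a window that spans every row, and per-column sums as the "modification values". The function name `sliding_window_sums` is hypothetical.

```python
def sliding_window_sums(rows, width):
    """Sum of each `width`-column window as it slides along the column axis.

    `rows` is a 2-D list; the window covers all rows (an assumption of
    this sketch). Each new window value reuses the previous one.
    """
    if not rows or width <= 0 or width > len(rows[0]):
        raise ValueError("width must be between 1 and the number of columns")
    # Modification values: one sum per column of the array.
    mods = [sum(column) for column in zip(*rows)]
    # First window value is computed directly from the data.
    value = sum(mods[:width])
    sums = [value]
    # Each later window adds the next column sum and drops the oldest one.
    for nxt in range(width, len(mods)):
        value = value + mods[nxt] - mods[nxt - width]
        sums.append(value)
    return sums
```

For example, `sliding_window_sums([[1, 2, 3, 4], [5, 6, 7, 8]], 2)` yields `[14, 18, 22]`: after the first window, each value costs one addition and one subtraction regardless of the window width.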
-
Patent number: 10545764
Abstract: A data processing apparatus comprises register rename circuitry for mapping architectural register specifiers specified by instructions to physical registers to be accessed in response to the instructions. Available register control circuitry controls which physical registers are available for mapping to an architectural register specifier by the register rename circuitry. For at least one group of two or more physical registers, the available register control circuitry controls availability of the registers based on a group tracking indication indicative of whether there is at least one pending access to any of the physical registers in the group.
Type: Grant
Filed: March 28, 2016
Date of Patent: January 28, 2020
Assignee: ARM Limited
Inventors: Luca Scalabrino, Frederic Jean Denis Arsanto, Thomas Gilles Tarridec, Cedric Denis Robert Airaud
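Group-level availability tracking, as this abstract describes it, can be modeled with one indication per group of physical registers rather than one per register. The sketch below is only a behavioral model: the use of a pending-access counter as the "group tracking indication", the fixed group size, and the class name `GroupTracker` are all assumptions made for illustration.

```python
class GroupTracker:
    """Tracks pending accesses per group of physical registers."""

    def __init__(self, num_registers: int, group_size: int):
        if group_size < 2 or num_registers % group_size:
            raise ValueError("registers must divide evenly into groups of 2+")
        self.group_size = group_size
        # One pending-access count per group (assumed representation).
        self.pending = [0] * (num_registers // group_size)

    def group_of(self, register: int) -> int:
        return register // self.group_size

    def begin_access(self, register: int) -> None:
        self.pending[self.group_of(register)] += 1

    def end_access(self, register: int) -> None:
        group = self.group_of(register)
        if self.pending[group] == 0:
            raise RuntimeError("no pending access to complete")
        self.pending[group] -= 1

    def group_available(self, group: int) -> bool:
        # A group's registers may be remapped only when no access to any
        # register in the group is still pending.
        return self.pending[group] == 0
```

The trade-off in this model is coarser reclamation in exchange for less tracking state: a single pending access keeps every register in its group unavailable.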
-
Patent number: 10540183
Abstract: As disclosed herein a method, executed by a processor, for accelerated instruction execution includes retrieving an execute instruction including a register reference and a reference to a target instruction, retrieving the target instruction, decoding the execute instruction using an instruction pipeline, decoding the target instruction using the instruction pipeline, associating the register reference to the target instruction, and executing the target instruction using the register reference as a source operand modifier. The instruction pipeline is configured such that it allows the target instruction to continue processing without waiting for the register reference to be resolved. The contents of the referenced register may be retrieved in a later stage of the instruction pipeline, and the target instruction may be modified and executed. An apparatus corresponding to the described method is also disclosed herein.
Type: Grant
Filed: October 31, 2017
Date of Patent: January 21, 2020
Assignee: International Business Machines Corporation
Inventors: Khary J. Alexander, Fadi Y. Busaba, Brian W. Curran, David S. Hutton, Edward T. Malley, Brian R. Prasky, John G. Rell, Jr.
-
Patent number: 10534644
Abstract: Described herein are systems and methods for implementing a processor-local (e.g., a CPU-local) storage mechanism. An exemplary system includes a plurality of processors executing an operating system, the operating system including a processor local storage mechanism, wherein each processor accesses data unique to the processor based on the processor local storage mechanism. Each of the plurality of processors of the system may have controlled access to the resource and each of the processors is dedicated to one of a plurality of tasks of an application. The application including the plurality of tasks may be replicated using the processor local storage mechanism, wherein each of the tasks of the replicated application includes an affinity to one of the plurality of processors.
Type: Grant
Filed: June 25, 2009
Date of Patent: January 14, 2020
Assignee: Wind River Systems, Inc.
Inventors: Andrew Gaiarsa, Maarten Koning
-
Patent number: 10534611
Abstract: Embodiments relate to branch prediction using a pattern history table (PHT) that is indexed using a global path vector (GPV). An aspect includes receiving a search address by a branch prediction logic that is in communication with the PHT and the GPV. Another aspect includes starting with the search address, simultaneously determining a plurality of branch predictions by the branch prediction logic based on the PHT, wherein the plurality of branch predictions comprises one of: (i) at least one not taken prediction and a single taken prediction, and (ii) a plurality of not taken predictions. Another aspect includes updating the GPV by shifting an instruction identifier of a branch instruction associated with a taken prediction into the GPV, wherein the GPV is not updated based on any not taken prediction.
Type: Grant
Filed: July 31, 2014
Date of Patent: January 14, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: James J. Bonanno, Matthias D. Heizmann, Daniel Lipetz, Brian R. Prasky
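The GPV update rule in this abstract (shift in an identifier only for a taken prediction) can be sketched as follows. The vector width, the identifier width, and the XOR-based table index are assumptions of this sketch and are not stated in the abstract; `update_gpv` and `pht_index` are hypothetical names.

```python
GPV_BITS = 16  # assumed width of the global path vector
ID_BITS = 2    # assumed width of the per-branch instruction identifier


def update_gpv(gpv: int, predictions) -> int:
    """Apply a batch of (instruction_id, taken) predictions to the GPV.

    Only a taken prediction shifts its identifier into the vector;
    not-taken predictions leave the vector unchanged.
    """
    gpv_mask = (1 << GPV_BITS) - 1
    id_mask = (1 << ID_BITS) - 1
    for instruction_id, taken in predictions:
        if taken:
            gpv = ((gpv << ID_BITS) | (instruction_id & id_mask)) & gpv_mask
    return gpv


def pht_index(search_address: int, gpv: int, table_size: int) -> int:
    """Index the pattern history table from the address and path history.

    XOR folding is an assumed hash, chosen here only for illustration.
    """
    return (search_address ^ gpv) % table_size
```

Because not-taken branches never enter the vector, the history in this model records the path of taken branches only, which is what lets several not-taken predictions share one lookup.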
-
Patent number: 10515206
Abstract: A signature module calculates a signature during the execution of a program by a central processing unit based on program instructions to the central processing unit, and stores the signature in a signature register of the signature module. The signature module includes: a calculation unit configured to generate a signature value based on program instructions executed on the central processing unit; and an instruction information interface configured to receive at least one item of instruction information from the central processing unit which indicates whether an instruction currently being executed by the central processing unit was jumped to indirectly or directly.
Type: Grant
Filed: March 1, 2012
Date of Patent: December 24, 2019
Assignee: Infineon Technologies AG
Inventors: Berndt Gammel, Stefan Mangard, Steffen Sonnekalb
-
Patent number: 10510022
Abstract: Systems and methods for machine learning, models, and related explainability and interpretability are provided. A computing device determines a contribution of a feature to a predicted value. A feature computation dataset is defined based on a selected next selection vector. A prediction value is computed for each observation vector included in the feature computation dataset using a trained predictive model. An expected value is computed for the selected next selection vector based on the prediction values. The feature computation dataset is at least a partial copy of a training dataset with each variable value replaced in each observation vector included in the feature computation dataset based on the selected next selection vector. Each replaced variable value is replaced with a value included in a predefined query for a respective variable. A Shapley estimate value is computed for each variable.
Type: Grant
Filed: June 25, 2019
Date of Patent: December 17, 2019
Assignee: SAS INSTITUTE INC.
Inventors: Ricky Dee Tharrington, Jr., Xin Jiang Hunt, Ralph Walter Abbey
-
Patent number: 10496410
Abstract: A processor includes a core, a hardware prefetcher, and a prefetcher control module. The hardware prefetcher includes logic to make speculative prefetch requests, through a memory subsystem, for elements for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to selectively suppress, based on a hardware-prefetch suppression instruction executed by the core, a speculative prefetch request to be made by the hardware prefetcher.
Type: Grant
Filed: December 23, 2014
Date of Patent: December 3, 2019
Assignee: Intel Corporation
Inventors: Alexander F. Heinecke, Christopher J. Hughes, Daehyun Kim, Jong Soo Park
-
Patent number: 10474461
Abstract: A method of determining an execution order of memory operations performed by a processor includes executing at least one single-instruction, multiple-data (SIMD) scatter operation by the processor to store data to a memory. The method further includes executing one or more instructions by the processor to determine the execution order of a set of memory operations. The set of memory operations includes the at least one SIMD scatter operation.
Type: Grant
Filed: September 22, 2016
Date of Patent: November 12, 2019
Assignee: QUALCOMM Incorporated
Inventors: Eric Mahurin, Lucian Codrescu
-
Patent number: 10474471
Abstract: One or more embodiments may provide a method for performing a replay. The method includes initiating execution of a program, the program having a plurality of sets of instructions, and each set of instructions has a number of chunks of instructions. The method also includes intercepting, by a virtual machine unit executing on a processor, an instruction of a chunk of the number of chunks before execution. The method further includes determining, by a replay module executing on the processor, whether the chunk is an active chunk, and responsive to the chunk being the active chunk, executing the instruction.
Type: Grant
Filed: April 18, 2016
Date of Patent: November 12, 2019
Assignee: Intel Corporation
Inventors: Justin E. Gottschlich, Klaus Danne, Cristiano L. Pereira, Gilles A. Pokam, Rolf Kassa, Shiliang Hu, Tim Kranich
-
Patent number: 10474463
Abstract: An apparatus and method are described for down-converting from a source operand to a destination operand with masking. For example, a method according to one embodiment includes the following operations: reading a source operand value to be down-converted from a first value to a down-converted value and stored in a destination location; reading each mask register bit stored in a mask register, the mask register bit(s) indicating whether to perform a masking operation or a conversion operation on the source operand value; if the mask register bit(s) indicates that a masking operation is to be performed, then performing a specified masking operation and storing the results of the masking operation in the destination location; and if the mask register bit indicates that a masking operation is not to be performed, then down-converting the source operand value and storing the down-converted value in the specified destination location.
Type: Grant
Filed: December 23, 2011
Date of Patent: November 12, 2019
Assignee: INTEL CORPORATION
Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Tal Uliel, Jesus Corbal, Zeev Sperber, Amit Gradstein
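The per-element choice between masking and down-conversion described in this abstract can be modeled in a few lines. This sketch makes several assumptions that the abstract does not state: a set mask bit selects conversion and a clear bit selects masking, down-conversion is a saturating narrowing to signed 8 bits, and the masking operation either keeps the old destination element or zeroes it. The name `down_convert_masked` is hypothetical.

```python
def down_convert_masked(source, mask, dest, zeroing=False):
    """Return the new destination vector after a masked down-conversion.

    Bit i of `mask` set:   element i is down-converted (saturated to the
                           signed 8-bit range, an assumed conversion).
    Bit i of `mask` clear: element i is masked, meaning zeroed when
                           `zeroing` is true, otherwise left as it was.
    """
    if len(source) != len(dest):
        raise ValueError("source and destination must have equal length")
    result = list(dest)
    for i, value in enumerate(source):
        if (mask >> i) & 1:
            result[i] = max(-128, min(127, value))
        elif zeroing:
            result[i] = 0
    return result
```

With source `[300, -5, 7, -1000]`, mask `0b0101`, and destination `[9, 9, 9, 9]`, the merging form gives `[127, 9, 7, 9]` and the zeroing form gives `[127, 0, 7, 0]`.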
-
Patent number: 10452416
Abstract: An interactive virtualization management system provides an assessment of proposed or existing virtualization schemes. A Virtual Technology Overhead Profile (VTOP) is created for each of a variety of configurations of host computer systems and virtualization technologies by measuring the overhead experienced under a variety of conditions. The multi-variate overhead profile corresponding to each target configuration being evaluated is used by the virtualization management system to determine the overhead that is to be expected on the target system, based on the particular set of conditions at the target system. Based on these overhead estimates, and the parameters of the jobs assigned to each virtual machine on each target system, the resultant overall performance of the target system for meeting the performance criteria of each of the jobs in each virtual machine is determined, and over-committed virtual machines and computer systems are identified.
Type: Grant
Filed: January 17, 2014
Date of Patent: October 22, 2019
Assignee: Riverbed Technology, Inc.
Inventors: Yiping Ding, David Carter, Shankar Ananthanarayanan
-
Patent number: 10445091
Abstract: In an embodiment, an apparatus includes a first buffer, a second buffer, and a control circuit. The control circuit may be configured to receive a first plurality of instructions included in a program. The control circuit may also be configured to store each of the first plurality of instructions in an entry of a first number of entries in the first buffer, arranged in the first number of entries dependent upon a received order. The control circuit may be further configured to select a second plurality of instructions from the first buffer. The second plurality of instructions may be selected dependent upon a program order. The control circuit may be configured to store each of the second plurality of instructions in an entry of a second number of entries in the second buffer, arranged in the second number of entries dependent upon the program order.
Type: Grant
Filed: March 30, 2016
Date of Patent: October 15, 2019
Assignee: Apple Inc.
Inventor: Brett S. Feero
-
Patent number: 10437598
Abstract: A method and an apparatus are provided for selecting between a plurality of instruction sets available to a microprocessor. An instruction fetch address is supplied. At least one predetermined bit of the instruction fetch address is used to select between the instruction sets. Once an instruction set has been selected, instructions may be fetched and decoded with a decoding scheme appropriate to the instruction set.
Type: Grant
Filed: February 9, 2007
Date of Patent: October 8, 2019
Assignee: MIPS Tech, LLC
Inventor: Andrew Webber
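Selecting an instruction set from a bit of the fetch address, as this abstract describes, can be sketched as below. Which bit is used, the names of the two instruction sets, and the choice to clear the selector bit before addressing memory are all assumptions of this sketch; `select_instruction_set` and `fetch` are hypothetical names.

```python
ISA_SELECT_BIT = 0  # assumed position of the predetermined selector bit
DECODERS = {0: "standard", 1: "compressed"}  # placeholder instruction set names


def select_instruction_set(fetch_address: int,
                           select_bit: int = ISA_SELECT_BIT) -> int:
    """Return the instruction set number encoded in the fetch address."""
    return (fetch_address >> select_bit) & 1


def fetch(fetch_address: int, select_bit: int = ISA_SELECT_BIT):
    """Return (memory address, decoder name) for a fetch address.

    The selector bit is cleared to form the memory address, which is an
    assumption of this sketch rather than something the abstract states.
    """
    isa = select_instruction_set(fetch_address, select_bit)
    memory_address = fetch_address & ~(1 << select_bit)
    return memory_address, DECODERS[isa]
```

Under these assumptions, `fetch(0x1001)` returns `(0x1000, "compressed")` and `fetch(0x1000)` returns `(0x1000, "standard")`, so a branch target alone determines the decoding scheme.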
-
Patent number: 10430342
Abstract: An apparatus includes a buffer configured to store a plurality of instructions previously fetched from a memory, wherein each instruction of the plurality of instructions may be included in a respective thread of a plurality of threads. The apparatus also includes control circuitry configured to select a given thread of the plurality of threads dependent upon a number of instructions in the buffer that are included in the given thread. The control circuitry is also configured to fetch a respective instruction corresponding to the given thread from the memory, and to store the respective instruction in the buffer.
Type: Grant
Filed: November 18, 2015
Date of Patent: October 1, 2019
Assignee: Oracle International Corporation
Inventors: Yuan Chou, Gideon Levinsky, Manish Shah, Robert Golla, Matthew Smittle
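One way to read "select a given thread dependent upon a number of instructions in the buffer" is to fetch next for the thread with the fewest buffered instructions. That policy is an assumption of this sketch; the abstract only says the selection depends on the count. The function name `select_thread` and the `(thread_id, instruction)` buffer layout are hypothetical.

```python
from collections import Counter


def select_thread(buffer, thread_ids):
    """Pick the thread to fetch for next.

    `buffer` holds (thread_id, instruction) pairs already fetched.
    The thread with the fewest buffered instructions wins (an assumed
    policy); ties go to the lowest thread id.
    """
    counts = Counter(thread for thread, _ in buffer)
    return min(thread_ids, key=lambda thread: (counts[thread], thread))
```

For instance, with buffer `[(0, "add"), (0, "sub"), (1, "ld")]` and threads `[0, 1, 2]`, thread 2 is selected because it has nothing buffered.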