Instruction Fetching Patents (Class 712/205)
  • Patent number: 11940928
    Abstract: Devices and techniques for parking threads in a barrel processor for managing cache eviction requests are described herein. A barrel processor includes eviction circuitry and is configured to perform operations to: (a) detect a thread that includes a memory access operation, the thread entering a memory request pipeline of the barrel processor; (b) determine that a data cache line has to be evicted from a data cache for the thread to perform the memory access operation; (c) copy the thread into a park queue; (d) evict a data cache line from the data cache; (e) identify an empty cycle in the memory request pipeline; (f) schedule the thread to execute during the empty cycle; and (g) remove the thread from the park queue.
    Type: Grant
    Filed: August 29, 2022
    Date of Patent: March 26, 2024
    Assignee: Micron Technology, Inc.
    Inventor: Christopher Baronne
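Steps (a) through (g) of the abstract for 11940928 can be pictured with a small cycle-by-cycle model. This is only a sketch under loud assumptions: the function name `simulate`, the per-cycle `slots` list, and the idea that an eviction always finishes before the next empty cycle are all hypothetical simplifications, not details taken from the patent.

```python
from collections import deque


def simulate(slots, needs_eviction):
    """Return the thread executed in each cycle (None means a bubble).

    slots: thread id entering the memory request pipeline in each cycle,
           or None for an empty cycle (hypothetical input format).
    needs_eviction: set of thread ids whose memory access first requires
           evicting a data cache line.
    """
    park_queue = deque()
    executed = []
    for thread in slots:
        if thread is None:
            # Empty cycle: schedule a parked thread here, if any, and
            # remove it from the park queue.
            executed.append(park_queue.popleft() if park_queue else None)
        elif thread in needs_eviction:
            # Park the thread while the eviction happens. Assumption: the
            # eviction has completed by the time the thread is rescheduled.
            park_queue.append(thread)
            executed.append(None)
        else:
            executed.append(thread)
    return executed
```

For example, `simulate(["t0", "t1", None, "t2", None], {"t1"})` parks `t1` in the second cycle and replays it in the first empty cycle that follows.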
  • Patent number: 11915001
    Abstract: A neural processor and a method for fetching instructions thereof are provided. The neural processor includes a local memory in which weights, input activations, and partial sums are stored, a processing unit configured to compute the weights, the input activations, and the partial sums, and a local memory load unit configured to load the weights, the input activations, and the partial sums from the local memory into the processing unit, wherein the local memory load unit includes an instruction fetch unit configured to fetch instructions included in a program of the local memory load unit for loading any one of the weights, the input activations, or the partial sums from the local memory, and an instruction execution unit configured to generate control signals for executing instructions fetched by the instruction fetch unit.
    Type: Grant
    Filed: September 28, 2023
    Date of Patent: February 27, 2024
    Assignee: Rebellions Inc.
    Inventor: Minhoo Kang
  • Patent number: 11900124
    Abstract: Various embodiments are disclosed of a multiprocessor system with processing elements optimized for high performance and low power dissipation and an associated method of programming the processing elements. Each processing element may comprise a fetch unit and a plurality of address generator units and a plurality of pipelined datapaths. The fetch unit may be configured to receive a multi-part instruction, wherein the multi-part instruction includes a plurality of fields. First and second address generator units may generate, based on different fields of the multi-part instruction, addresses from which to retrieve first and second data for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction. The execution units may perform operations using a single pipeline or multiple pipelines based on third and fourth fields of the multi-part instruction.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: February 13, 2024
    Assignee: Coherent Logix, Incorporated
    Inventors: Michael B Doerr, Carl S. Dobbs, Michael B. Solka, Michael R. Trocino, Kenneth R. Faulkner, Keith M. Bindloss, Sumeer Arya, John Mark Beardslee, David A. Gibson
  • Patent number: 11902390
    Abstract: A method obtains service request information identifying computing device nodes invoked by users. Based on the service request information, sets of computing device nodes are identified, each set of computing device nodes includes computing device nodes invoked simultaneously or sequentially by one of the users. Communities are further identified based on a probability measure that is a measure of a probability of co-occurrence of two sets of computing device nodes. Each community has sets of computing device nodes each having the probability measure over a probability threshold in relation to at least one other set of computing device nodes in the community. Solutions are predicted for provision of services of the sets of computing device nodes of the communities. Each predicted solution for provision of services relates to a community and is determined based on shared knowledge of predicted solutions for provision of services relating to other communities.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: February 13, 2024
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Arindam Banerjee, Saravanan M
  • Patent number: 11842196
    Abstract: Obsoleting values stored in registers in a processor based on processing obsolescent register-encoded instructions is disclosed. The processor is configured to support execution of read and/or write instructions that include obsolescence encoding indicating that one or more of its source and/or target register operands are to be obsoleted by the processor. A register encoded as obsolescent means the data value stored in such register will not be used by subsequent instructions in an instruction stream, and thus does not need to be retained. Thus, such register can be set as being in an obsolescent state so that the data value stored in such register can be ignored to improve performance. As one example, data values for registers having an obsolescent state can be ignored and thus not stored in a saved context for a process being switched out, thus conserving memory and improving processing time for a process switch.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: December 12, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Thomas Andrew Sartorius, Thomas Philip Speier, Michael Scott McIlvaine, James Norris Dieffenderfer, Rodney Wayne Smith
  • Patent number: 11829768
    Abstract: The disclosure provides a method for scheduling an out-of-order queue. The method includes: adding a highest bit before each address in a reorder buffer (ROB) or in a branch reorder buffer (B-ROB), in which the addresses are entered by instructions in the out-of-order queue; adding a highest bit for a read pointer (roqhead) of the ROB or B-ROB; performing an exclusive-OR (XOR) operation on the highest bit for the roqhead and the highest bit for each of the addresses entered by two instructions to be compared, and determining addresses after the XOR operation as age information of the two instructions; and comparing the age information to determine the oldest instruction in the queue for execution in response to scheduling the out-of-order queue.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: November 28, 2023
    Assignee: BEIJING VCORE TECHNOLOGY CO., LTD.
    Inventor: Dandan Huan
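The XOR-based age comparison in the abstract for 11829768 can be illustrated with a few lines of integer arithmetic. This is a minimal sketch, assuming a power-of-two ROB whose pointers carry one extra "wrap" bit above the index bits; the names `age_key`, `oldest`, and `index_bits` are hypothetical.

```python
def age_key(entry_ptr: int, head_ptr: int, index_bits: int) -> int:
    """Return a key that orders in-flight ROB entries from oldest to youngest."""
    wrap_mask = 1 << index_bits  # the added highest bit
    # XOR the entry's highest bit with the read pointer's highest bit;
    # the low index bits pass through unchanged.
    return entry_ptr ^ (head_ptr & wrap_mask)


def oldest(entry_ptrs, head_ptr, index_bits):
    """Pick the oldest entry pointer among those being compared."""
    return min(entry_ptrs, key=lambda p: age_key(p, head_ptr, index_bits))
```

With an 8-entry buffer (`index_bits=3`) and the read pointer at `0b0110`, the entries `0b0111` and `0b1001` get keys 7 and 9, so `0b0111` is reported as older even though the wrapped entry has the smaller raw index.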
  • Patent number: 11811883
    Abstract: In one aspect, a computer system for vehicle configuration verification, and/or detecting unauthorized vehicle modification may be provided. In some exemplary embodiments, the computer system may include a processor and a non-transitory, tangible, computer-readable storage medium having instructions stored thereon that, in response to execution by the processor, cause the processor to perform operations including: (1) receiving a vehicle image, including a vehicle identifier and at least one software module; (2) calculating a configuration hash value of the at least one software module; generating a first data block including the configuration hash value, a first index value, the vehicle identifier, and a digital signature; (3) storing the first data block in a memory; and/or (4) transmitting the first data block to any number of network participants using a distributed network to facilitate vehicle software configuration verification.
    Type: Grant
    Filed: May 25, 2022
    Date of Patent: November 7, 2023
    Assignee: STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY
    Inventors: Matthew Lewis Floyd, Leroy Luther Smith, Jr., Brittney Benzio, Nathan Barnard, Shannon Marie Lowry
  • Patent number: 11782718
    Abstract: Techniques related to executing a plurality of instructions by a processor comprising receiving a first instruction configured to cause the processor to output a first data value to a first address in a first data cache, outputting, by the processor, the first data value to a second address in a second data cache, receiving a second instruction configured to cause a streaming engine associated with the processor to prefetch data from the first data cache, determining that the first data value has not been outputted from the second data cache to the first data cache, stalling execution of the second instruction, receiving an indication, from the second data cache, that the first data value has been output from the second data cache to the first data cache, and resuming execution of the second instruction based on the received indication.
    Type: Grant
    Filed: March 30, 2021
    Date of Patent: October 10, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Naveen Bhoria, Kai Chirca, Timothy D. Anderson, Duc Bui, Abhijeet A. Chachad, Son Hung Tran
  • Patent number: 11755320
    Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A plurality of linear algebra operations is repeated in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.
    Type: Grant
    Filed: September 21, 2021
    Date of Patent: September 12, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner
  • Patent number: 11755322
    Abstract: Disclosed embodiments relate to methods of using a processor to load and duplicate scalar data from a source into a destination register. The data may be duplicated in byte, half word, word or double word parts, according to a duplication pattern.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: September 12, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Duc Quang Bui, Peter Richard Dent
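The load-and-duplicate behaviour described for 11755322 can be sketched in software as follows. The function name, the 64-byte default register width, and the use of `bytes` to stand in for a register are assumptions made for illustration only.

```python
def load_and_duplicate(source: bytes, element_size: int, register_size: int = 64) -> bytes:
    """Fill a register-sized buffer with copies of the first element of source.

    element_size is 1, 2, 4 or 8 bytes (byte, half word, word, double word).
    """
    if element_size not in (1, 2, 4, 8):
        raise ValueError("element_size must be 1, 2, 4 or 8")
    if len(source) < element_size:
        raise ValueError("source is shorter than one element")
    element = source[:element_size]
    # Repeat the element until the destination register is full.
    return element * (register_size // element_size)
```

Calling `load_and_duplicate(b"\x01\x02\x03", 2, 8)` duplicates the half word `01 02` four times into an 8-byte destination.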
  • Patent number: 11750510
    Abstract: The present disclosure discloses an FPGA device for implementing a network-on-chip transmission bandwidth expansion function, and relates to the technical field of FPGAs. When a predefined functional module with built-in hardcore IP nodes is integrated in an FPGA bare die, soft-core IP nodes are configured and formed by using logical resource modules in the FPGA bare die and are connected to the hardcore IP nodes to form an NOC network structure, so as to increase nodes and expand the transmission bandwidth of the predefined functional module. On the other hand, the soft-core IP nodes can be additionally connected to input and output signals in the predefined functional module and also can expand the transmission bandwidth of the predefined functional module.
    Type: Grant
    Filed: April 21, 2021
    Date of Patent: September 5, 2023
    Assignee: WUXI ESIONTECH CO., LTD.
    Inventors: Yanfeng Xu, Yueer Shan, Jicong Fan, Yanfei Zhang, Hua Yan
  • Patent number: 11734075
    Abstract: Data format conversion processing of an accelerator accessed by a processor of a computing environment is reduced. The processor and accelerator use different data formats, and the accelerator is configured to perform an input conversion to convert data from a processor data format to an accelerator data format prior to performing an operation using the data, and an output conversion to convert resultant data from accelerator data format back to processor data format after performing the operation. The reducing includes determining that adjoining operations of a process to run on the processor and accelerator are to be performed by the accelerator, where the adjoining operations include a source operation and destination operation. Further, the reducing includes identifying for removal output data format conversion of output data of the source operation for input to the destination operation, and input data format conversion of the input to the destination operation.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: August 22, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Qi Liang, Yi Xuan Zhang, Gui Yu Jiang
  • Patent number: 11681530
    Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: June 20, 2023
    Assignee: Intel Corporation
    Inventors: Regev Shemy, Zeev Sperber, Wajdi Feghali, Vinodh Gopal, Amit Gradstein, Simon Rubanovich, Sean Gulley, Ilya Albrekht, Jacob Doweck, Jose Yallouz, Ittai Anati
  • Patent number: 11663004
    Abstract: An instruction to perform converting and scaling operations is provided. Execution of the instruction includes converting an input value in one format to provide a converted result in another format. The converted result is scaled to provide a scaled result. A result obtained from the scaled result is placed in a selected location. Further, an instruction to perform scaling and converting operations is provided. Execution of the instruction includes scaling an input value in one format to provide a scaled result and converting the scaled result from the one format to provide a converted result in another format. A result obtained from the converted result is placed in a selected location.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: May 30, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric Mark Schwarz, Kerstin Claudia Schelm, Petra Leber, Silvia Melitta Mueller, Reid Copeland, Xin Guo, Cedric Lichtenau
  • Patent number: 11615038
    Abstract: A gateway for use in a computing system to interface a host with the subsystem for acting as a work accelerator to the host, the gateway having: an accelerator interface for connection to the subsystem to enable transfer of batches of data between the subsystem and the gateway; a data connection interface for connection to external storage for exchanging data between the gateway and storage; a gateway interface for connection to at least one second gateway; a memory interface connected to a local memory associated with the gateway; and a streaming engine for controlling the streaming of batches of data into and out of the gateway in response to pre-compiled data exchange synchronisation points attained by the subsystem, wherein the streaming of batches of data are selectively via at least one of the accelerator interface, data connection interface, gateway interface and memory interface.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: March 28, 2023
    Assignee: Graphcore Limited
    Inventors: Ola Tørudbakken, Brian Manula, Harald Høeg
  • Patent number: 11601282
    Abstract: A computer system for verifying vehicle software configuration may be provided. The computer system may include a processor and a non-transitory, tangible, computer-readable storage medium having instructions stored thereon that, in response to execution by the processor, cause the processor to: (1) transmit, to a vehicle computing system, an authentication request including a hash algorithm specification; (2) receive, from the vehicle computing system, a current configuration hash value and a vehicle identifier; (3) retrieve a trusted data block from a memory based upon the vehicle identifier, the trusted data block including a stored configuration hash value and a smart contract code segment; (4) execute the smart contract code segment, the smart contract code segment including a failsafe code segment; and/or (5) transmit the authentication response to the vehicle computing system, and cause the vehicle computing system to execute the failsafe code segment.
    Type: Grant
    Filed: October 26, 2020
    Date of Patent: March 7, 2023
    Assignee: STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY
    Inventors: Matthew Lewis Floyd, Leroy Luther Smith, Jr., Brittney Benzio, Nathan Barnard, Shannon Marie Lowry
  • Patent number: 11579889
    Abstract: A processing system 2 includes a processing pipeline 12, 14, 16, 18, 28 which includes fetch circuitry 12 for fetching instructions to be executed from a memory 6, 8. Buffer control circuitry 34 is responsive to a programmable trigger, such as explicit hint instructions delimiting an instruction burst, or predetermined configuration data specifying parameters of a burst together with a synchronising instruction, to trigger the buffer control circuitry to stall a stallable portion of the processing pipeline (e.g. issue circuitry 16), to accumulate within one or more buffers 30, 32 fetched instructions starting from a predetermined starting instruction, and, when those instructions have been accumulated, to restart the stallable portion of the pipeline.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: February 14, 2023
    Assignee: ARM LIMITED
    Inventors: Jatin Bhartia, Kauser Yakub Johar, Antony John Penton
  • Patent number: 11567772
    Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: January 31, 2023
    Assignee: Intel Corporation
    Inventors: Regev Shemy, Zeev Sperber, Wajdi Feghali, Vinodh Gopal, Amit Gradstein, Simon Rubanovich, Sean Gulley, Ilya Albrekht, Jacob Doweck, Jose Yallouz, Ittai Anati
  • Patent number: 11531572
    Abstract: Disclosed are various implementations of approaches for reassigning hosts between computing clusters. A computing cluster assigned to a first queue is identified. The first queue can include a first list of identifiers of computing clusters with insufficient resources for a respective workload. A host machine assigned to a second queue can then be identified. The second queue can include a second list of identifiers of host machines in an idle state. A command can then be sent to the host machine to migrate to the computing cluster. Finally, the host machine can be removed from the second queue.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: December 20, 2022
    Assignee: VMWARE, INC.
    Inventors: Sabareesh Subramaniam, Dragos Misca, Pranshu Jain, Arpitha Dondemadahalli Ramegowda
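The two-queue flow in the abstract for 11531572 might be sketched as below. The function `reassign_hosts` and the `send_migrate` callback are hypothetical, and removing a cluster from the first queue after it receives one host is an assumption the abstract does not state.

```python
from collections import deque


def reassign_hosts(starved_clusters: deque, idle_hosts: deque, send_migrate):
    """Pair resource-starved clusters with idle hosts until either queue is empty."""
    assignments = []
    while starved_clusters and idle_hosts:
        cluster = starved_clusters.popleft()  # assumption: one host per queued cluster
        host = idle_hosts[0]                  # identify an idle host
        send_migrate(host, cluster)           # command the host to migrate
        idle_hosts.popleft()                  # finally, remove it from the idle queue
        assignments.append((host, cluster))
    return assignments
```

Passing a callback keeps the sketch independent of any particular cluster-management API.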
  • Patent number: 11513966
    Abstract: An apparatus has processing circuitry, load tracking circuitry and value prediction circuitry. In response to an actual value of first target data becoming available for a value-predicted load operation, it is determined whether the actual value matches the predicted value of the first target data determined by the value prediction circuitry, and whether the tracking information indicates that, for a given younger load operation issued before the actual value of the first target data was available, there is a risk of second target data associated with that given load operation having changed after having been loaded. Independent of whether the addresses of the value-predicted load operation and younger load operation correspond, at least the given load operation is re-processed when the value prediction is correct and the tracking information indicates there is a risk of the second target data having changed after being loaded. This protects against ordering violations.
    Type: Grant
    Filed: March 22, 2021
    Date of Patent: November 29, 2022
    Assignee: Arm Limited
    Inventor: Abhishek Raja
  • Patent number: 11487341
    Abstract: Systems and techniques for improving the performance of circuits while adapting to dynamic voltage drops caused by the execution of noisy instructions (e.g. high power consuming instructions) are provided. The performance is improved by slowing down the frequency of operation selectively for types of noisy instructions. An example technique controls a clock by detecting an instruction of a predetermined noisy type that is predicted to have a predefined noise characteristic (e.g. a high level of noise generated on the voltage rails of a circuit due to greater amount of current drawn by the instruction), and, responsive to the detecting, decreasing a frequency of the clock. The detecting occurs before execution of the instruction. The changing of the frequency in accordance with instruction type enables the circuits to be operated at high frequencies even if some of the workloads include instructions for which the frequency of operation is slowed down.
    Type: Grant
    Filed: July 2, 2019
    Date of Patent: November 1, 2022
    Assignee: NVIDIA CORPORATION
    Inventors: Aniket Naik, Tezaswi Raja, Kevin Wilder, Rajeshwaran Selvanesan, Divya Ramakrishnan, Daniel Rodriguez, Benjamin Faulkner, Raj Jayakar, Fei (Walter) Li
  • Patent number: 11481390
    Abstract: Methods and systems are provided for converting a loop (e.g., a cursor loop) to a declarative Structured Query Language (SQL) query that invokes a custom aggregate function. The loop includes a select query and a loop body that includes a program fragment that can be evaluated over a result set of the select query one row at a time. The system verifies that the loop body does not modify a persistent state of the database. A custom aggregate function that expresses the loop body is automatically constructed according to a contract. An aggregate class comprising aggregation methods of the contract is used to construct the aggregate function based on results of static analysis. The select query is automatically rewritten to form a declarative SQL query that invokes the custom aggregate function. The declarative SQL query may be executed by a database management system (DBMS) SQL server.
    Type: Grant
    Filed: July 24, 2020
    Date of Patent: October 25, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Karthik Saligrama Ramachandra, Surabhi Gupta, Sanket Jayant Purandare
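The loop-to-aggregate rewrite described for 11481390 can be shown by hand on a toy example. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the DBMS, which is an assumption for illustration; the table `orders`, the class `BonusAgg`, and both functions are hypothetical, and the rewrite here is done manually rather than automatically.

```python
import sqlite3


def loop_version(conn):
    """Cursor-style loop: evaluate the loop body one row at a time."""
    total = 0
    for (amount,) in conn.execute("SELECT amount FROM orders"):
        if amount > 100:
            total += amount // 10
    return total


class BonusAgg:
    """Aggregate class whose step() carries the same logic as the loop body."""

    def __init__(self):
        self.total = 0

    def step(self, amount):
        if amount > 100:
            self.total += amount // 10

    def finalize(self):
        return self.total


def declarative_version(conn):
    """Same computation as one declarative query invoking the custom aggregate."""
    conn.create_aggregate("bonus", 1, BonusAgg)
    return conn.execute("SELECT bonus(amount) FROM orders").fetchone()[0]
```

The loop body here does not depend on row order and does not write to the database, which is what makes the hand rewrite safe in this toy case.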
  • Patent number: 11468168
    Abstract: Systems, apparatuses, and methods for efficient handling of subroutine epilogues. When an indirect control transfer instruction corresponding to a procedure return for a subroutine is identified, the return address and a signature are retrieved from one or more of a return address stack and the memory stack. An authenticator generates a signature based on at least a portion of the retrieved return address. While the signature is being generated, instruction processing speculatively continues. No instructions are permitted to commit yet. The generated signature is later compared to a copy of the signature generated earlier during the corresponding procedure call. A mismatch causes an exception.
    Type: Grant
    Filed: April 11, 2017
    Date of Patent: October 11, 2022
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Ian D. Kountanis, Douglas C. Holman, Sean M. Reynolds, Richard F. Russo
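The call-time signing and return-time check in the abstract for 11468168 can be sketched in software. Everything here is an assumption for illustration: the HMAC-SHA-256 construction, the 4-byte truncation, the key handling, and the function names are hypothetical, and the speculative execution that overlaps signature generation in hardware is not modelled.

```python
import hashlib
import hmac


def sign(return_address: int, key: bytes) -> bytes:
    """Derive a short signature from the return address (hypothetical scheme)."""
    msg = return_address.to_bytes(8, "little")
    return hmac.new(key, msg, hashlib.sha256).digest()[:4]


def on_call(stack: list, return_address: int, key: bytes) -> None:
    """At a procedure call, save the return address with its signature."""
    stack.append((return_address, sign(return_address, key)))


def on_return(stack: list, key: bytes) -> int:
    """At a procedure return, regenerate the signature and compare."""
    return_address, stored = stack.pop()
    if not hmac.compare_digest(sign(return_address, key), stored):
        # A mismatch corresponds to the exception described in the abstract.
        raise RuntimeError("return address signature mismatch")
    return return_address
```

If the saved return address is overwritten between `on_call` and `on_return`, the regenerated signature no longer matches the stored one and the return is rejected.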
  • Patent number: 11455155
    Abstract: A computer system comprises a work accelerator and a gateway for the transfer of data to the accelerator from external storage. The accelerator executes a first compiled code sequence to perform computations on data transferred to the accelerator from the gateway. The first compiled code sequence comprises a synchronisation instruction indicating a barrier between a compute phase in which the compute instructions are executed and an exchange phase, wherein execution of the synchronisation instruction causes an indication of a pre-compiled data exchange synchronisation point to be transferred to the gateway. The gateway comprises a streaming engine storing a second compiled code sequence in the form of a set of data transfer instructions executable by the streaming engine to perform data transfer operations to stream data through the gateway in the exchange phase, wherein the first and second compiled code sequences are generated as a related set at compile time.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: September 27, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Ola Torudbakken, Daniel John Pelham Wilkinson, Brian Manula, Harald Hoeg
  • Patent number: 11449317
    Abstract: Implementations of the disclosure provide systems and methods for identifying, in view of a first control flow graph associated with a first code fragment and a second control flow graph associated with a second code fragment, a first set of sections of the first code fragment and a second set of sections of the second code fragment, such that each section of the first set of sections has a corresponding section of the second set of sections. A first section of the first set of sections is identified, where the first section is not syntactically equivalent to a corresponding second section of the second set of sections. Responsive to determining that the first section is not syntactically equivalent to the corresponding second section, it is found that the first code fragment is not semantically equivalent to the second code fragment.
    Type: Grant
    Filed: August 20, 2019
    Date of Patent: September 20, 2022
    Assignee: Red Hat, Inc.
    Inventors: Viktor Malik, Tomas Glozar
  • Patent number: 11436055
    Abstract: A first command is fetched for execution on a GPU. Dependency information for the first command, which indicates a number of parent commands that the first command depends on, is determined. The first command is inserted into an execution graph based on the dependency information. The execution graph defines an order of execution for plural commands including the first command. The number of parent commands are configured to be executed on the GPU before executing the first command. A wait count for the first command, which indicates the number of parent commands of the first command, is determined based on the execution graph. The first command is inserted into cache memory in response to determining that the wait count for the first command is zero or that each of the number of parent commands the first command depends on has already been inserted into the cache memory.
    Type: Grant
    Filed: November 19, 2019
    Date of Patent: September 6, 2022
    Assignee: Apple Inc.
    Inventors: Kutty Banerjee, Michael Imbrogno
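The wait-count bookkeeping in the abstract for 11436055 resembles a dependency counter on a graph. The sketch below models only that bookkeeping, not the cache-insertion policy or any GPU interface; the class `ExecutionGraph` and its methods are hypothetical, and it assumes parents are added before their children.

```python
class ExecutionGraph:
    """Tracks, for each command, how many parent commands are still unfinished."""

    def __init__(self):
        self.children = {}    # command -> commands that depend on it
        self.wait_count = {}  # command -> number of unfinished parents
        self.completed = set()

    def add(self, cmd, parents=()):
        """Insert a command using its dependency information."""
        self.children.setdefault(cmd, [])
        pending = [p for p in parents if p not in self.completed]
        for parent in pending:
            self.children.setdefault(parent, []).append(cmd)
        self.wait_count[cmd] = len(pending)

    def ready(self):
        """Commands whose wait count is zero and that have not run yet."""
        return [c for c, n in self.wait_count.items()
                if n == 0 and c not in self.completed]

    def complete(self, cmd):
        """Mark a command finished and decrement its children's wait counts."""
        self.completed.add(cmd)
        for child in self.children[cmd]:
            self.wait_count[child] -= 1
```

A command with two parents starts with a wait count of 2 and becomes ready only after both parents have been completed.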
  • Patent number: 11436166
    Abstract: A processor comprises an execution unit operable to execute programs to perform processing operations, and one or more slave accelerators each operable to perform respective processing operations under the control of the execution unit. The execution unit includes a message generation circuit that generates messages to cause a slave accelerator to perform a processing operation. The message generation circuit fetches data values for including in a message or messages to be sent to a slave accelerator into local storage of the message generation circuit pending the inclusion of those data values in a message that is sent to a slave accelerator, and retrieves the data value or values from the local storage, and sends a message including the retrieved data value or values to the slave accelerator.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: September 6, 2022
    Assignee: Arm Limited
    Inventor: Emil Lambrache
  • Patent number: 11422816
    Abstract: A computer-implemented method is disclosed. The method can comprise: monitoring utilization of a cloud architecture component that is being used by a component utilizer; determining, via a machine learning model, a pattern of usage of the cloud architecture component based on the monitoring; determining, based on the pattern of usage, a first time period when the cloud architecture component is excessively used by the component utilizer and a second time period when the cloud resource is scantily used by the component utilizer; and orchestrating, based on the first and second time periods, a scaling of the cloud architecture immediately before a subsequent iteration of the pattern of usage by the component utilizer.
    Type: Grant
    Filed: August 9, 2021
    Date of Patent: August 23, 2022
    Assignee: Capital One Services, LLC
    Inventors: Eric Barnum, Anthony Reynolds, Bryan Pinos, Joseph Krasinskas
  • Patent number: 11385873
    Abstract: Systems, apparatuses and methods may provide for technology that determines that a control loop is to be executed for an unspecified number of iterations and automatically forces the control loop to be executed for a fixed number of iterations in addition to the unspecified number of iterations, where execution of the control loop for the fixed number of iterations is conducted in parallel. In one example, the technology also removes one or more dataflow tokens associated with the execution of the control loop for the fixed number of iterations.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: July 12, 2022
    Assignee: Intel Corporation
    Inventor: Kermin ChoFleming
  • Patent number: 11385894
    Abstract: A processor circuit is provided. The processor circuit includes an instruction decode unit, an instruction detector, an address generator and a data buffer. The instruction decode unit is configured to decode a load instruction to generate a decoding result. The instruction detector, coupled to the instruction decode unit, is configured to detect if the load instruction is in a load-use scenario. The address generator, coupled to the instruction decode unit, is configured to generate a first address requested by the load instruction according to the decoding result. The data buffer is coupled to the instruction detector and the address generator. When the instruction detector detects that the load instruction is in the load-use scenario, the data buffer is configured to store the first address generated from the address generator, and store data requested by the load instruction according to the first address.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: July 12, 2022
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventors: Yen-Ju Lu, Chao-Wei Huang
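The abstract for 11385894 does not define "load-use scenario"; the sketch below assumes the common meaning, in which the instruction immediately after a load reads the register the load writes. The `Instr` tuple and `is_load_use` function are hypothetical names for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                 # e.g. "load", "add"
    dest: Optional[str]     # destination register, if any
    srcs: Tuple[str, ...]   # source registers


def is_load_use(program, i):
    """True if program[i] is a load whose result is read by the next instruction."""
    instr = program[i]
    if instr.op != "load" or i + 1 >= len(program):
        return False
    return instr.dest in program[i + 1].srcs
```

A detector of this kind would flag `load r1` followed by `add r3, r1, r4`, but not a load whose result is first consumed several instructions later.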
  • Patent number: 11379259
    Abstract: A system includes determination of whether a current number of active worker threads of a client application is less than a maximum active worker thread limit, retrieval, if the number of active worker threads is less than the maximum active worker thread limit, of a first job associated with a first context from a job pool, determination of whether an inactive worker thread is associated with the first context, and, if an inactive worker thread is associated with the first context, execution of the first job on the inactive worker thread.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: July 5, 2022
    Assignee: SAP SE
    Inventor: Johnson Wong
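The dispatch decision in the abstract for 11379259 can be sketched without real threads. The function `dispatch`, the `(job, context)` tuples, and the dictionary of inactive workers are hypothetical; what happens when no inactive worker matches the context is not stated in the abstract, so the sketch simply reports `None` for the worker.

```python
def dispatch(active_count, max_active, job_pool, inactive_workers):
    """Decide which job to run next and on which inactive worker.

    job_pool: list of (job, context) pairs.
    inactive_workers: dict mapping a context to an inactive worker id.
    Returns (job, worker), or None if nothing is dispatched.
    """
    # Only dispatch while below the maximum active worker thread limit.
    if active_count >= max_active or not job_pool:
        return None
    job, context = job_pool.pop(0)
    # Prefer an inactive worker already associated with the job's context.
    worker = inactive_workers.pop(context, None)
    return job, worker
```

Reusing a worker tied to the same context is the point of the lookup; a caller would need its own policy for the `None` case.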
  • Patent number: 11360686
    Abstract: An apparatus to facilitate copying surface data is disclosed. The apparatus includes copy engine hardware to receive a command to access surface data from a source location in memory to a destination location in the memory, divide the surface data into a plurality of surface data sub-blocks, process the surface data sub-blocks to calculate virtual addresses to which accesses to the memory are to be performed and perform the memory accesses.
    Type: Grant
    Filed: January 12, 2021
    Date of Patent: June 14, 2022
    Assignee: Intel Corporation
    Inventors: Prasoonkumar Surti, Nilay Mistry
  • Patent number: 11354155
    Abstract: A system and method for operating fewer servers near maximum capacity as opposed to operating more servers at low capacity is disclosed. Computational tasks are made as small as possible to be completed within the available capacity of the servers. Computational tasks that are similar may be distributed to the same computing node (including a processor) to improve RAM utilization. Additionally, workloads may be scheduled onto multicore processors to maximize the average number of processing cores utilized per clock cycle.
    Type: Grant
    Filed: September 16, 2019
    Date of Patent: June 7, 2022
    Assignee: United Services Automobile Association (USAA)
    Inventors: Nathan Lee Post, Bryan J. Osterkamp, William Preston Culbertson, II, Ryan Thomas Russell, Ashley Raine Philbrick
  • Patent number: 11349669
    Abstract: In one aspect, a computer system for vehicle configuration verification, and/or detecting unauthorized vehicle modification may be provided. In some exemplary embodiments, the computer system may include a processor and a non-transitory, tangible, computer-readable storage medium having instructions stored thereon that, in response to execution by the processor, cause the processor to perform operations including: (1) receiving a vehicle image, including a vehicle identifier and at least one software module; (2) calculating a configuration hash value of the at least one software module; generating a first data block including the configuration hash value, a first index value, the vehicle identifier, and a digital signature; (3) storing the first data block in a memory; and/or (4) transmitting the first data block to any number of network participants using a distributed network to facilitate vehicle software configuration verification.
    Type: Grant
    Filed: July 3, 2018
    Date of Patent: May 31, 2022
    Assignee: STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY
    Inventors: Matthew Lewis Floyd, Leroy Luther Smith, Jr., Brittney Benzio, Nathan Barnard, Shannon Marie Lowry
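The data block described in the abstract of patent 11349669 can be illustrated with a short sketch. All function and field names are hypothetical. An HMAC over a shared key stands in for the digital signature, which is an assumption: the abstract does not name a signature scheme, and a real deployment would more likely use an asymmetric one.

```python
import hashlib
import hmac
import json


def configuration_hash(modules):
    """Hash each software module, then hash the concatenated digests."""
    outer = hashlib.sha256()
    for module_bytes in modules:
        outer.update(hashlib.sha256(module_bytes).digest())
    return outer.hexdigest()


def make_block(vehicle_id, modules, index, key):
    """Build a block holding the hash, index, vehicle identifier, and signature."""
    block = {
        "index": index,
        "vehicle_id": vehicle_id,
        "config_hash": configuration_hash(modules),
    }
    payload = json.dumps(block, sort_keys=True).encode()
    # Stand-in for a digital signature (assumption): HMAC-SHA256 over the payload.
    block["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return block


def verify_block(block, key):
    """Recompute the signature over every field except the signature itself."""
    unsigned = {k: v for k, v in block.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, block["signature"])
```

A participant that receives the block can call `verify_block` and then compare `config_hash` against the hash of the software it actually observes on the vehicle.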
  • Patent number: 11343177
    Abstract: Technologies for quality of service based throttling in a fabric architecture include a network node of a plurality of network nodes interconnected across the fabric architecture via an interconnect fabric. The network node includes a host fabric interface (HFI) configured to facilitate the transmission of data to/from the network node, monitor quality of service levels of resources of the network node used to process and transmit the data, and detect a throttling condition based on a result of the monitored quality of service levels. The HFI is further configured to generate and transmit a throttling message to one or more of the interconnected network nodes in response to having detected a throttling condition. The HFI is additionally configured to receive a throttling message from another of the network nodes and perform a throttling action on one or more of the resources based on the received throttling message. Other embodiments are described herein.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: May 24, 2022
    Assignee: Intel Corporation
    Inventors: Francesc Guim Bernat, Karthik Kumar, Thomas Willhalm, Raj Ramanujan, Brian Slechta
  • Patent number: 11327755
    Abstract: In one embodiment, a processor comprises a decoder to decode a first instruction, the first instruction comprising an opcode and at least one parameter, the opcode to identify the first instruction as an instruction associated with an indirect branch, the at least one parameter indicative of whether the indirect branch is allowed; and circuitry to generate an error message based on the at least one parameter.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: May 10, 2022
    Assignee: Intel Corporation
    Inventors: Kekai Hu, Ke Sun, Rodrigo Branco
  • Patent number: 11321122
    Abstract: The embodiments of the present disclosure provide a method, an apparatus, a device and a medium for processing topological relation of tasks. The method includes: extracting at least one execution element from each of processing tasks based on a topological relation recognition rule; determining a dependency relation among the processing tasks according to content of the execution element of each processing task; and determining a topological relation among the processing tasks according to the dependency relation among the processing tasks.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: May 3, 2022
    Assignee: Apollo Intelligent Driving Technology (Beijing) Co., Ltd.
    Inventors: Chao Zhang, Zhuo Chen, Liming Xia, Weifeng Yao, Jiankang Xin, Chengliang Deng
  • Patent number: 11256518
    Abstract: Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry.
    Type: Grant
    Filed: October 9, 2019
    Date of Patent: February 22, 2022
    Assignee: Apple Inc.
    Inventors: Liang-Kai Wang, Robert D. Kenney, Terence M. Potter, Vinod Reddy Nalamalapu, Sivayya V. Ayinala
  • Patent number: 11194724
    Abstract: Systems and methods for improved process caching through iterative feedback are disclosed. In embodiments, a computer implemented method comprises retrieving updated metadata of a process to be executed, wherein the updated metadata includes information regarding cache misses from a prior execution of the process; automatically modifying a setting of a data stream control register based on the updated metadata; automatically setting a hint at a data cache block touch module; performing an initial execution of the process after the steps of retrieving the updated metadata, automatically modifying the setting of the data stream control register, and automatically setting the hint at the data cache block touch module; and modifying the updated metadata of the process after the execution of the process based on cache miss statistical data gathered during the execution of the process, to produce newly updated metadata.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: December 7, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mauro Sergio Martins Rodrigues, Rafael Camarda Silva Folco, Daniel Battaiola Kreling, Breno H. Leitao
  • Patent number: 11169805
    Abstract: A processor including a logic unit configured to execute multiple instructions, each being one of a speculative instruction or an architectural instruction, is provided. The processor also includes a split cache comprising multiple lines, each line including data accessed by an instruction and copied into the split cache from a memory address, wherein a line is identified as a speculative line for the speculative instruction, and as an architectural line for the architectural instruction. The processor includes a cache manager configured to select a number of speculative lines allocated in the split cache. The cache manager prevents an architectural line from being replaced by a speculative line based on a number of speculative lines allotted in the split cache, and manages the number of speculative lines to be allocated in the split cache based on the number of speculative lines relative to a number of architectural lines.
    Type: Grant
    Filed: April 30, 2018
    Date of Patent: November 9, 2021
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Michael Bozich Calhoun, Divakar Chitturi
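The replacement rule in the abstract of patent 11169805 can be modeled in software. This is a rough sketch, not the patented circuit. The class `SplitCache`, the fixed `max_speculative` quota, the oldest-first eviction order, and the preference for evicting speculative lines first are all assumptions made for illustration. The sketch also assumes `capacity` is at least 1.

```python
from collections import OrderedDict


class SplitCache:
    def __init__(self, capacity, max_speculative):
        self.capacity = capacity                # total lines (assumed >= 1)
        self.max_speculative = max_speculative  # quota of speculative lines
        self.lines = OrderedDict()              # addr -> (data, is_speculative)

    def speculative_count(self):
        return sum(1 for _, spec in self.lines.values() if spec)

    def _evict_oldest(self, speculative):
        """Evict the oldest line of the given kind; return True on success."""
        for addr, (_, spec) in self.lines.items():
            if spec == speculative:
                del self.lines[addr]
                return True  # return at once, so the iterator is never resumed
        return False

    def insert(self, addr, data, speculative):
        """Cache a line; return False if a speculative fill is refused."""
        self.lines.pop(addr, None)  # a refill replaces any existing copy
        if speculative and self.speculative_count() >= self.max_speculative:
            # Quota reached: only another speculative line may make room,
            # so architectural lines are protected from this fill.
            if not self._evict_oldest(speculative=True):
                return False
        elif len(self.lines) >= self.capacity:
            # Assumption: evict speculative lines before architectural ones.
            if not self._evict_oldest(speculative=True):
                self._evict_oldest(speculative=False)
        self.lines[addr] = (data, speculative)
        return True
```

With `capacity=3` and `max_speculative=1`, a second speculative fill displaces the first speculative line and leaves both architectural lines in place.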
  • Patent number: 11119766
    Abstract: Provided are techniques for a hardware accelerator with locally stored macros. A plurality of macros are stored in a lookup memory of a hardware accelerator. In response to receiving an operation code, the operation code is mapped to one or more macros of the plurality of macros, wherein each of the one or more macros includes micro-instructions. Each of the micro-instructions of the one or more macros is routed to a function block of a plurality of function blocks. Each of the micro-instructions is processed with the plurality of function blocks. Data from the processing of each of the micro-instructions is stored in an accelerator memory of the hardware accelerator. The data is moved from the accelerator memory to a host memory.
    Type: Grant
    Filed: December 6, 2018
    Date of Patent: September 14, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael J. Healy, Jason A. Viehland, Jeffrey H. Derby, Diana L. Orf
  • Patent number: 11093225
    Abstract: A high parallelism computing system and instruction scheduling method thereof are disclosed. The computing system comprises: an instruction reading and distribution module for reading a plurality of types of instructions in a specific order, and distributing the acquired instructions to corresponding function modules according to the types; an internal buffer for buffering data and instructions for performing computation; a plurality of function modules each of which sequentially executes instructions of the present type distributed by the instruction reading and distribution module and reads the data from the internal buffer; and wherein the specific order is obtained by topologically sorting the instructions according to a directed acyclic graph consisting of the types and dependency relationships.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: August 17, 2021
    Assignee: Xilinx, Inc.
    Inventors: Qian Yu, Lingzhi Sui, Shaoxia Fang, Junbin Wang, Yi Shan
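The ordering step in the abstract of patent 11093225 can be sketched with Kahn's algorithm, a standard topological sort. Kahn's algorithm is swapped in here for illustration; the abstract does not say which sorting method is used. The function names, instruction names, and type labels are hypothetical.

```python
from collections import defaultdict, deque


def topological_order(instructions, deps):
    """Order instructions so every prerequisite comes first.

    instructions: list of (name, type) pairs.
    deps: dict mapping a name to the set of names it depends on.
    """
    indegree = {name: 0 for name, _ in instructions}
    dependents = defaultdict(list)
    for name, prereqs in deps.items():
        for prereq in prereqs:
            indegree[name] += 1
            dependents[prereq].append(name)

    ready = deque(name for name, _ in instructions if indegree[name] == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(instructions):
        raise ValueError("dependency graph contains a cycle")
    return order


def distribute(instructions, order):
    """Send each instruction, in sorted order, to the queue for its type."""
    kind = dict(instructions)
    queues = defaultdict(list)
    for name in order:
        queues[kind[name]].append(name)
    return dict(queues)
```

Each per-type queue then plays the role of one function module that executes its instructions sequentially.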
  • Patent number: 11080199
    Abstract: Embodiments of the invention are directed towards computer-implemented methods and systems for determining an oldest logical memory address. The method includes creating an M number of miss request registers and an N number of stations in a load/store unit of the processor. In response to load requests from target instructions, a processor detects each L1 cache miss. The processor stores data related to each L1 cache miss in a respective miss request register. The data includes an age of each L1 cache miss and a portion of a logical memory address of the requested load. The processor stores the entire logical memory addresses of the requested loads in respective stations based on an age of the load requests. The processor transmits the oldest logical memory address that is stored at the stations.
    Type: Grant
    Filed: March 7, 2019
    Date of Patent: August 3, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yossi Shapira, Jonathan Hsieh, Michael Cadigan, Jr., Jane Bartik, Taylor J Pritchard
  • Patent number: 11030344
    Abstract: An apparatus and method are provided for controlling use of bounded pointers. The apparatus includes storage to store bounded pointers, where each bounded pointer comprises a pointer value and associated attributes, with the associated attributes including range information indicative of an allowable range of addresses when using the pointer value. Processing circuitry is used to perform a signing operation on an input bounded pointer in order to generate an output bounded pointer in which a signature generated by the signing operation is contained within the output bounded pointer in place of specified bits of the input bounded pointer. In addition, the associated attributes include signing information which is set by the processing circuitry within the output bounded pointer to identify that the output bounded pointer has been signed. Such an approach provides increased resilience to control flow integrity attacks when using bounded pointers.
    Type: Grant
    Filed: December 23, 2016
    Date of Patent: June 8, 2021
    Assignee: ARM Limited
    Inventors: Graeme Peter Barnes, Richard Roy Grisenthwaite
  • Patent number: 10977274
    Abstract: In one example, a system and method for replication and recovery of protected resources may include one or more vendor neutral components that identify a corresponding vendor specific replication and/or recovery tool. The vendor specific tool is then executed to obtain replication data related to the protected logical entity. The replication data is formatted in a vendor neutral format, and forwarded to a target site over a data transport mechanism. The target site can then reformat the replication data into the appropriate vendor specific formats required on the target site (which may not be the same vendor or vendor formats on the source site), and proceed to recover and/or replicate the protected resources.
    Type: Grant
    Filed: October 5, 2017
    Date of Patent: April 13, 2021
    Assignee: Sungard Availability Services, LP
    Inventors: Amol Manvar, Krunal Jain, Nandkumar Mane, Rahul Rege
  • Patent number: 10970073
    Abstract: The present disclosure provides a method, computer system and computer program product for branch optimization. According to the method, execution possibilities of instruction blocks corresponding to at least one branch in a program can be determined. Then, the instruction blocks can be loaded according to the execution possibilities.
    Type: Grant
    Filed: October 2, 2018
    Date of Patent: April 6, 2021
    Assignee: International Business Machines Corporation
    Inventors: Qian Ren, Shan Gao, Xiao Hai Ma, Li Gao, Ting Ting Tang, Bin Chen, Zhuo Hua Li
  • Patent number: 10956086
    Abstract: A memory controller circuit is disclosed which is coupleable to a first memory circuit, such as DRAM, and includes: a first memory control circuit to read from or write to the first memory circuit; a second memory circuit, such as SRAM; a second memory control circuit adapted to read from the second memory circuit in response to a read request when the requested data is stored in the second memory circuit, and otherwise to transfer the read request to the first memory control circuit; predetermined atomic operations circuitry; and programmable atomic operations circuitry adapted to perform at least one programmable atomic operation. The second memory control circuit also transfers a received programmable atomic operation request to the programmable atomic operations circuitry and sets a hazard bit for a cache line of the second memory circuit.
    Type: Grant
    Filed: January 28, 2019
    Date of Patent: March 23, 2021
    Assignee: Micron Technology, Inc.
    Inventor: Tony M. Brewer
  • Patent number: 10949207
    Abstract: Embodiments of processors, methods, and systems for a processor core supporting a heterogeneous system instruction set architecture are described. In an embodiment, a processor includes an instruction decoder and an exception generation circuit. The exception generation circuit is to, in response to the instruction decoder receiving an unsupported instruction, generate an exception and report an instruction classification value of the unsupported instruction.
    Type: Grant
    Filed: September 29, 2018
    Date of Patent: March 16, 2021
    Assignee: Intel Corporation
    Inventors: Toby Opferman, Russell C. Arnold, Vedvyas Shanbhogue
  • Patent number: 10951516
    Abstract: Technologies for quality of service based throttling in a fabric architecture include a network node of a plurality of network nodes interconnected across the fabric architecture via an interconnect fabric. The network node includes a host fabric interface (HFI) configured to facilitate the transmission of data to/from the network node, monitor quality of service levels of resources of the network node used to process and transmit the data, and detect a throttling condition based on a result of the monitored quality of service levels. The HFI is further configured to generate and transmit a throttling message to one or more of the interconnected network nodes in response to having detected a throttling condition. The HFI is additionally configured to receive a throttling message from another of the network nodes and perform a throttling action on one or more of the resources based on the received throttling message. Other embodiments are described herein.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: March 16, 2021
    Assignee: Intel Corporation
    Inventors: Francesc Guim Bernat, Karthik Kumar, Thomas Willhalm, Raj K. Ramanujan, Brian J. Slechta
  • Patent number: 10922063
    Abstract: A computer system comprises a work accelerator and a gateway for the transfer of data to the accelerator from external storage. The accelerator executes a first compiled code sequence to perform computations on data transferred to the accelerator from the gateway. The first compiled code sequence comprises a synchronisation instruction indicating a barrier between a compute phase in which the compute instructions are executed and an exchange phase, wherein execution of the synchronisation instruction causes an indication of a pre-compiled data exchange synchronisation point to be transferred to the gateway. The gateway comprises a streaming engine storing a second compiled code sequence in the form of a set of data transfer instructions executable by the streaming engine to perform data transfer operations to stream data through the gateway in the exchange phase, wherein the first and second compiled code sequences are generated as a related set at compile time.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: February 16, 2021
    Assignee: Graphcore Limited
    Inventors: Ola Tørudbakken, Daniel John Pelham Wilkinson, Brian Manula, Harald Høeg