Context Preserving (e.g., Context Swapping, Checkpointing, Register Windowing) Patents (Class 712/228)
-
Patent number: 12165686
Abstract: A mechanism where the locked pages are saved and restored by a hardware accelerator which is transparent to the OS. Prior to standby entry, the OS puts all DMA capable devices in the lowest-powered device low-power state after disabling bus mastering. The OS flushes all pageable memory to an NVM (in segments that are kept in self-refresh) and provides a list of pinned and locked pages in the DRAM to a power management controller (p-unit). The p-unit checks for all Bus Mastering DMA to be turned off and checks if a next OS timer wake event (TNTE) is greater than a threshold, to decide whether to enable or disable PASR or MPSM in Standby. If the conditions are met, the p-unit triggers a hardware accelerator to consolidate the pinned and locked pages in the DRAM to certain segment(s) of the DRAM during standby states, making it transparent to the OS.
Type: Grant
Filed: February 17, 2021
Date of Patent: December 10, 2024
Assignee: Intel Corporation
Inventors: Nivedha Krishnakumar, Virendra Vikramsinh Adsure, Jaya Jeyaseelan, Nadav Bonen, Barnes Cooper, Toby Opferman, Vijay Bahirji, Chia-Hung Kuo
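The standby decision that the abstract of patent 12165686 attributes to the p-unit can be sketched as a simple predicate. This is only an illustration under stated assumptions: the `Device` type, the function name `should_enable_pasr`, and the millisecond units are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Device:
    """Hypothetical model of a DMA-capable device."""
    name: str
    bus_mastering_enabled: bool


def should_enable_pasr(devices: Iterable[Device],
                       next_timer_event_ms: float,
                       threshold_ms: float) -> bool:
    """Return True only when both conditions named in the abstract hold:
    all bus-mastering DMA is off, and the next OS timer wake event (TNTE)
    is greater than the threshold."""
    all_dma_off = all(not d.bus_mastering_enabled for d in devices)
    return all_dma_off and next_timer_event_ms > threshold_ms
```

When the predicate is true, the abstract says the p-unit triggers the hardware accelerator to consolidate the pinned and locked pages; that consolidation step is not modeled here.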
-
Patent number: 12159143
Abstract: A system and method of processing instructions may comprise an application processing domain (APD) and a metadata processing domain (MTD). The APD may comprise an application processor executing instructions and providing related information to the MTD. The MTD may comprise a tag processing unit (TPU) having a cache of policy-based rules enforced by the MTD. The TPU may determine, based on policies being enforced and metadata tags and operands associated with the instructions, that the instructions are allowed to execute (i.e., are valid). The TPU may write, if the instructions are valid, the metadata tags to a queue. The queue may (i) receive operation output information from the application processing domain, (ii) receive, from the TPU, the metadata tags, (iii) output, responsive to receiving the metadata tags, resulting information indicative of the operation output information and the metadata tags; and (iv) permit the resulting information to be written to memory.
Type: Grant
Filed: July 21, 2023
Date of Patent: December 3, 2024
Assignee: The Charles Stark Draper Laboratory
Inventors: Steve E. Milburn, Eli Boling, Andre DeHon, Andrew B. Sutherland, Gregory T. Sullivan
-
Patent number: 12147840
Abstract: Provided are computer program product, system, and method for using a machine learning module to determine a group of execution paths of program code and a computational resource allocation to use to execute the group of execution paths. Information on activity steps in program code and a system load of a system in which the program code is executed are provided as inputs to a resource allocation machine learning module. The resource allocation machine learning module processes the provided inputs to output computational resource allocations for execution paths of activity steps in the program code to execute in parallel, including memory and processing resource allocations optimized according to an optimization criteria. The outputted computational resource allocations are allocated to execute the activity steps in the execution paths in parallel.
Type: Grant
Filed: September 3, 2021
Date of Patent: November 19, 2024
Assignee: International Business Machines Corporation
Inventors: Venkata Vara Prasad Karri, Sri Harsha Varada, Sarbajit K. Rakshit, Tirumala Vasu Padisetti
-
Patent number: 12135993
Abstract: In an embodiment, a local memory dedicated to one or more hardware accelerators in a system may include at least two portions: a volatile portion and a non-volatile portion. Data that is reused from iteration to iteration of the hardware accelerator (e.g. constants, instruction words, etc.) may be stored in the non-volatile portion. Data that varies from iteration to iteration may be stored in the volatile portion. Both the local memory and the hardware accelerators may be powered down between iterations, saving power. The non-volatile portion need only be initialized at a first iteration, allowing the amount of time that the hardware accelerators and the local memory are powered up to be lessened for subsequent iterations since the reused data need not be reloaded in the subsequent iterations.
Type: Grant
Filed: May 23, 2023
Date of Patent: November 5, 2024
Assignee: Apple Inc.
Inventors: Paolo Di Febbo, Yohan Rajan, Chaminda Nalaka Vidanagamachchi, Anthony Ghannoum
-
Patent number: 12137059
Abstract: A layer 2 (L2) switch receives session information and destination information included in upstream communication transmitted from a network device, and compresses the received session information and destination information. Then, the L2 switch stores compressed information that has been compressed, into a memory unit that stores a session table to be referred to when downstream communication is received.
Type: Grant
Filed: July 1, 2020
Date of Patent: November 5, 2024
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Yuki Takei, Masayuki Nishiki, Masato Nishiguchi
-
Patent number: 12112197
Abstract: A processor has a register bank to which software writes descriptors specifying tasks to be processed by a hardware pipeline. The register bank includes a plurality of register sets, each for holding the descriptor of a task. The processor includes a first selector operable to connect the execution logic to a selected one of the register sets and thereby enable the software to write successive ones of said descriptors to different ones of said register sets. The processor also includes a second selector operable to connect the hardware pipeline to a selected one of the register sets. The processor further comprises control circuitry configured to control the hardware pipeline to begin processing a current task based on the descriptor in a current one of the register sets while the software is writing the descriptor of another task to another of the register sets.
Type: Grant
Filed: September 27, 2022
Date of Patent: October 8, 2024
Assignee: Imagination Technologies Limited
Inventors: Michael John Livesley, Ian King, Alistair Goudie
-
Patent number: 12106106
Abstract: Embodiments for memory bandwidth monitoring extensible counters are described. In embodiments, an apparatus includes memory bandwidth monitoring hardware to monitor an event and a shared cache to be shared by multiple cores. At least one of the cores is to execute multiple threads and includes at least three registers. The first register is programmable by software to store a thread identifier of one of the threads and an event identifier of the event during execution of the thread. At least one value of the event identifier corresponds to a shared cache miss. The second register is to provide to the software a second value corresponding to a number of bits available to represent the count. The third register is to provide to the software a count of occurrences of the event and an indicator to indicate whether the count reached a maximum count representable by the number of bits.
Type: Grant
Filed: December 25, 2020
Date of Patent: October 1, 2024
Assignee: Intel Corporation
Inventors: Andrew J. Herdrich, Jason W. Brandt
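The count and indicator pair described for the third register in patent 12106106 can be modeled as a saturating counter. The sketch below is an assumption-laden illustration: the class name `EventCounter`, its attributes, and the choice to hold the count at the maximum once it is reached are hypothetical and are not taken from the patent.

```python
class EventCounter:
    """Hypothetical software model of a width-limited event counter."""

    def __init__(self, width_bits: int) -> None:
        # Number of bits available to represent the count (the value the
        # abstract says the second register reports to software).
        self.width_bits = width_bits
        self.max_count = (1 << width_bits) - 1
        self.count = 0
        self.reached_max = False  # the indicator exposed with the count

    def record(self, occurrences: int = 1) -> None:
        """Add occurrences; hold at max_count and set the indicator
        once the maximum representable count is reached (assumption)."""
        total = self.count + occurrences
        if total >= self.max_count:
            self.count = self.max_count
            self.reached_max = True
        else:
            self.count = total
```

Software reading such a counter would treat a set indicator as a sign that the reported count is only a lower bound.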
-
Patent number: 12101514
Abstract: An illustrative interactive content provider system transmits a live video stream to an interactive content player device, and, during the transmitting of the live video stream, provides an executable data object to the interactive content player device. The executable data object includes an interactive content instance configured to be presented within a 3D virtual playing area that bounds at least one virtual object of the interactive content instance. The interactive content provider system directs the interactive content player device to execute the executable data object by overlaying the 3D virtual playing area of the interactive content instance onto a presentation of the live video stream. Corresponding methods and systems are disclosed for both the interactive content provider system and the interactive content player device.
Type: Grant
Filed: October 21, 2021
Date of Patent: September 24, 2024
Assignee: Verizon Patent and Licensing Inc.
Inventors: Mohammad Raheel Khalid, William Robert Davey, Vito Joseph Messina, Carl L. Keifer, III, Oliver S. Castaneda, Scott David Brown
-
Patent number: 12086705
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes at least one processor to perform operations to implement a neural network and compute logic to accelerate neural network computations.
Type: Grant
Filed: December 29, 2017
Date of Patent: September 10, 2024
Assignee: Intel Corporation
Inventors: Amit Bleiweiss, Abhishek Venkatesh, Gokce Keskin, John Gierach, Oguz Elibol, Tomer Bar-On, Huma Abidi, Devan Burke, Jaikrishnan Menon, Eriko Nurvitadhi, Pruthvi Gowda Thorehosur Appajigowda, Travis T. Schluessler, Dhawal Srivastava, Nishant Patel, Anil Thomas
-
Context switching method and system for swapping contexts between register sets based on thread halt
Patent number: 12073221
Abstract: A context switching system includes a processor and a scheduler. The processor is configured to execute a first thread. A first context associated with the first thread is stored in a register set of the processor. While the first thread is being executed, the scheduler is configured to select a second thread from a set of threads, and receive and store a second context associated with the second thread in a register set of the scheduler. The second thread is to be scheduled for execution after the first thread. The scheduler is further configured to swap the first and second contexts when the execution of the first thread is halted, thereby executing the context switching. Further, the processor is configured to execute the second thread based on the second context. While the second thread is being executed, the first context is stored in the data memory.
Type: Grant
Filed: July 14, 2020
Date of Patent: August 27, 2024
Assignee: NXP USA, INC.
Inventors: Arvind Kaushik, Jeroen Coninx, Nishant Jain
-
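The register-set swap described in the abstract of patent 12073221 can be illustrated with a small software model. This is a sketch only: the class `ContextSwitcher`, its method names, and the representation of a context as a dictionary are hypothetical and are not the patent's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Context = Dict[str, int]  # hypothetical: register name -> value


@dataclass
class ContextSwitcher:
    processor_regs: Context                    # context of the running thread
    scheduler_regs: Optional[Context] = None   # preloaded context of the next thread
    data_memory: List[Context] = field(default_factory=list)

    def preload(self, next_context: Context) -> None:
        """While the first thread runs, stage the second thread's context
        in the scheduler's register set."""
        self.scheduler_regs = dict(next_context)

    def on_halt(self) -> None:
        """When the running thread halts, swap the two register sets,
        then move the outgoing context to data memory."""
        if self.scheduler_regs is None:
            raise RuntimeError("no context has been preloaded")
        self.processor_regs, self.scheduler_regs = (
            self.scheduler_regs, self.processor_regs)
        self.data_memory.append(self.scheduler_regs)
        self.scheduler_regs = None
```

The point of the arrangement is that the write of the outgoing context to memory happens after the swap, so it overlaps with execution of the second thread instead of delaying it.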
Patent number: 12067641
Abstract: One embodiment provides a parallel processor comprising a memory interface and a processing array coupled with the memory interface. The processing array is configured to address memory accessed via the memory interface via a virtual address mapping and includes circuitry to resolve a page fault for the virtual address mapping, wherein each of the multiple compute blocks is separately preemptable.
Type: Grant
Filed: May 20, 2022
Date of Patent: August 20, 2024
Assignee: Intel Corporation
Inventors: Altug Koker, Ingo Wald, David Puffer, Subramaniam M. Maiyuran, Prasoonkumar Surti, Balaji Vembu, Guei-Yuan Lueh, Murali Ramadoss, Abhishek R. Appu, Joydeep Ray
-
Patent number: 12045322
Abstract: Embodiments protect a computer application from being exploited by an attacker, while the application code is executed by a speculative execution engine having vulnerabilities. Embodiments are directed to systems that, prior to execution of the application by a speculative execution engine, locate a sequence of instructions of the application in which the speculative execution engine executes the instructions out of sequence. For example, the sequence of instructions may be an “if-then” code block. The systems determine a disposition that forces the speculative execution engine to execute the instructions in sequence. For example, the disposition may be adding a fence instruction to the sequence of instructions. During execution of the application code by the speculative execution engine, the systems change the sequence of instructions based on the disposition. The systems execute the changed sequence of instructions in place of the located sequence of instructions to prevent an attack on the application.
Type: Grant
Filed: January 11, 2019
Date of Patent: July 23, 2024
Assignee: Virsec System, Inc.
Inventor: Satya V. Gupta
-
Patent number: 12026511
Abstract: A method for performing opportunistic write-back discard of single-use vector register values. The method includes executing instructions of a GPU in a default mode, detecting a beginning of a single-use section that includes instructions that produce single-use vector register values, and executing instructions in a single-use mode. The method includes discarding the write-back of a single-use vector register value if the single-use value gets forwarded either via a bypass path or via register file cache. The method includes inserting hint instructions into an executable program code that demarcates single-use sections. A system includes a microprocessor to execute instructions in the default mode. The microprocessor detects a beginning and an ending of a single-use section that includes instructions that produce single-use vector register values.
Type: Grant
Filed: September 9, 2021
Date of Patent: July 2, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Abhay Raj Kumar Gupta, Wilson Wai Lun Fung
-
Patent number: 11995442
Abstract: A processor includes a register file having a plurality of register file addresses, a processing unit, configured to perform processing in accordance with a configuration defined by information stored in the register file, and an instruction sequencer. The instruction sequencer is configured to control the processing unit by retrieving a sequence of instructions from a memory, in which each instruction includes an opcode, and a subset of the instructions includes a data portion. For each instruction in the sequence of instructions, the instruction sequencer performs an action defined by the opcode. The action for the subset of the opcodes includes writing the data portion to a register file address defined by the opcode. The sequence of instructions includes variable length instructions.
Type: Grant
Filed: April 7, 2022
Date of Patent: May 28, 2024
Assignee: NXP B.V.
Inventors: Paul Wielage, Mathias Martinus van Ansem, Jose de Jesus Pineda de Gyvez, Hamed Fatemi
-
Patent number: 11966726
Abstract: A method, computer program product, and computer system are provided. An enhanced compiler identifies instructions for execution among them, instructions directed to an inner computation unit of a CPU core. In response to identifying instructions directed to the inner computation unit, locating in a system call table a system call to indicate a begin of an executable code block of instructions that are directed to the inner computation unit of the CPU core. The enhanced compiler searches the system hardware registry for the parameter corresponding to the inner computation unit of the CPU core. The system call is inserted as an interrupt instruction in the compiler output at the begin of the executable code block of instructions that are directed to the inner computation unit of the CPU core. The enhanced compiler executable code output is saved for later selection by a scheduler of an operating system.
Type: Grant
Filed: February 25, 2022
Date of Patent: April 23, 2024
Assignee: International Business Machines Corporation
Inventors: Zheng Chen, Jiu Fu Guo, Gui HaoChen, Chaofan Qiu
-
Patent number: 11940930
Abstract: Methods, apparatus, systems and articles of manufacture to facilitate atomic operation in victim cache are disclosed. An example system includes a first cache storage to store a first set of data; a second cache storage to store a second set of data that has been evicted from the first cache storage; and a storage queue coupled to the first cache storage and the second cache storage, the storage queue including: an arithmetic component to: receive the second set of data from the second cache storage in response to a memory operation; and perform an arithmetic operation on the second set of data to produce a third set of data; and an arbitration manager to store the third set of data in the second cache storage.
Type: Grant
Filed: July 28, 2022
Date of Patent: March 26, 2024
Assignee: Texas Instruments Incorporated
Inventors: Naveen Bhoria, Timothy David Anderson, Pete Michael Hippleheuser
-
Patent number: 11928132
Abstract: Provided are a database processing method and apparatus, and a computer readable storage medium. The database processing method comprises: after a lock wait is generated, writing lock wait related information into a lock wait log.
Type: Grant
Filed: April 21, 2020
Date of Patent: March 12, 2024
Assignee: XI'AN ZHONGXING NEW SOFTWARE CO., LTD.
Inventors: Pin Lin, Yan Ding, Qinyuan Lu, Chen Qi, Yifang Yu, Pei Zhao
-
Patent number: 11907758
Abstract: Disclosed in the present disclosure is an out-of-order data generation method. The method comprises: creating a plurality of threads; instructing all threads to acquire transmission permission in a manner of acquisition after random delay, determining, after any thread acquires the transmission permission, a thread as the current thread, and instructing the current thread to drive currently generated data and a corresponding data ID to an AXI bus for reading by a receiving end, so as to implement an out-of-order reading test on the basis of the data and corresponding data identifier that are read by the receiving end; and after sending, by the current thread, of the currently generated data and the corresponding data identifier ends, recycling the transmission permission, and returning to execute the step of instructing the all threads to acquire the transmission permission in the manner of acquisition after the random delay.
Type: Grant
Filed: October 29, 2021
Date of Patent: February 20, 2024
Assignee: INSPUR SUZHOU INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Xiangke Wang, Meng Yang, Kai Liu
-
Patent number: 11900175
Abstract: The embodiments of the disclosure relate to a computing device, a computing equipment, and a programmable scheduling method for data loading and execution, and relate to the field of computers. The computing device is coupled to a first computing core and a first memory. The computing device includes a scratchpad memory, a second computing core, a first hardware queue, a second hardware queue and a synchronization unit. The second computing core is configured for acceleration in a specific field. The first hardware queue receives a load request from the first computing core. The second hardware queue receives an execution request from the first computing core. The synchronization unit is configured to make the triggering of the load request and the execution request cooperate with each other. In this manner, flexibility, throughput, and overall performance can be enhanced.
Type: Grant
Filed: November 11, 2021
Date of Patent: February 13, 2024
Assignee: Shanghai Biren Technology Co., Ltd
Inventors: Zhou Hong, YuFei Zhang, ChengKun Sun, Lin Chen
-
Patent number: 11892950
Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
Type: Grant
Filed: July 15, 2022
Date of Patent: February 6, 2024
Assignee: INTEL CORPORATION
Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
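The hit-rate rule in the abstract of patent 11892950 reduces to a single comparison. The function below is a minimal sketch; the name `prefetch_target`, the string return values, and the treatment of zero accesses as a hit rate of 0 are assumptions made for illustration.

```python
def prefetch_target(l1_hits: int, l1_accesses: int, threshold: float) -> str:
    """Choose the cache level a prefetch may fill, following the abstract:
    hit rate >= threshold limits the prefetch to L3, otherwise it may go
    to L1. With no accesses recorded, the hit rate is taken as 0.0
    (an assumption, not stated in the patent)."""
    hit_rate = l1_hits / l1_accesses if l1_accesses else 0.0
    return "L3" if hit_rate >= threshold else "L1"
```

The rule keeps prefetched data out of an L1 cache that is already serving most requests, and lets prefetches into L1 only when its hit rate is low.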
-
Patent number: 11886881
Abstract: Apparatuses and methods are provided, relating to the control of data processing in devices which comprise both decoupled access-execute processing circuitry and prefetch circuitry. Control of the access portion of the decoupled access-execute processing circuitry may be dependent on a performance metric of the prefetch circuitry. Alternatively or in addition, control of the prefetch circuitry may be dependent on a performance metric of the access portion.
Type: Grant
Filed: December 21, 2020
Date of Patent: January 30, 2024
Assignee: Arm Limited
Inventors: Mbou Eyole, Michiel Willem Van Tol, Stefanos Kaxiras
-
Patent number: 11868774
Abstract: A processor with fault generating circuitry responsive to detecting that a processor write is to a stack location that is write protected, such as for storing a return address at the stack location.
Type: Grant
Filed: September 8, 2021
Date of Patent: January 9, 2024
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Erik Newton Shreve, Eric Thierry Peeters, Per Torstein Roine
-
Patent number: 11868779
Abstract: Aspects of the invention include a computer-implemented method of updating metadata prediction tables. The computer-implemented method includes establishing, in the metadata prediction tables, a prediction of how a set of instructions will resolve and identifying that the set of instructions is completed. The computer-implemented method also includes determining, upon completion of the set of instructions, whether prediction update queues (PUQs) associated with the set of instructions indicate that the set of instructions resolved in one of a plurality of prescribed manners relative to the prediction and deciding that the metadata prediction tables are candidates to be updated based on the PUQs indicating that the set of instructions resolved in one of the plurality of prescribed manners.
Type: Grant
Filed: September 9, 2021
Date of Patent: January 9, 2024
Assignee: International Business Machines Corporation
Inventors: James Raymond Cuffney, Adam Benjamin Collura, James Bonanno, Brian Robert Prasky, Edward Thomas Malley, Suman Amugothu
-
Patent number: 11847508
Abstract: Convergence of threads executing common code sections is facilitated using instructions inserted at strategic locations in computer code sections. The inserted instructions enable the threads in a warp or other group to cooperate with a thread scheduler to promote thread convergence.
Type: Grant
Filed: August 11, 2022
Date of Patent: December 19, 2023
Assignee: NVIDIA CORP.
Inventors: Daniel Robert Johnson, Jack Choquette, Olivier Giroux, Michael Patrick McKeown, Mark Stephenson, Sana Damani
-
Patent number: 11830101
Abstract: To suspend the processing for a group of one or more execution threads currently executing a shader program for an output being generated by a graphics processor, the issuing of shader program instructions for execution by the group of one or more execution threads is stopped, and any outstanding register-content affecting transactions for the group of one or more execution threads are allowed to complete. Once all outstanding register-content affecting transactions for the group of one or more execution threads have completed, the content of the registers associated with the threads of the group of one or more execution threads, and a set of state information for the group of one or more execution threads, including at least an indication of the last instruction in the shader program that was executed for the threads of the group of one or more execution threads, are stored to memory.
Type: Grant
Filed: July 17, 2020
Date of Patent: November 28, 2023
Assignee: Arm Limited
Inventor: Olof Henrik Uhrenholt
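The suspend sequence in the abstract of patent 11830101 (stop issuing, let outstanding transactions finish, then save registers and state) can be sketched as follows. The `ThreadGroup` type, the `suspend` function, and the use of a dictionary as "memory" are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ThreadGroup:
    """Hypothetical model of a group of shader execution threads."""
    registers: Dict[str, int]
    last_instruction: int              # index of the last executed instruction
    outstanding_transactions: int = 0  # register-content affecting transactions
    issuing: bool = True


def suspend(group: ThreadGroup, memory: dict, key: str) -> bool:
    """Try to suspend the group. Returns False while transactions that
    affect register content are still outstanding, True once the state
    has been stored to memory."""
    group.issuing = False                   # step 1: stop issuing instructions
    if group.outstanding_transactions > 0:  # step 2: wait for completion
        return False
    memory[key] = {                         # step 3: save registers and state
        "registers": dict(group.registers),
        "last_instruction": group.last_instruction,
    }
    return True
```

Saving only after the outstanding transactions complete ensures the stored register content is final, so the group can later resume from the recorded instruction.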
-
Patent number: 11817162
Abstract: Disclosed are hardware configurations of the Register Aliasing Table (RAT) which are suitable for use in structures such as modern microprocessors, microcontrollers, CPUs, etc. that use pipelining, perform multi-command operations, and prevent Write After Read (WAR), Write After Write (WAW), and Read After Write (RAW) dependencies. The Register Aliasing Table provides a circuit which consumes less energy, uses less space and has low latency compared to the applications in the state of the art.
Type: Grant
Filed: August 5, 2020
Date of Patent: November 14, 2023
Inventors: Oguz Ergin, Ilker Polat
-
Patent number: 11782720
Abstract: A processor system with micro-threading control by a hardware-accelerated kernel thread and the scheduling methods thereof are provided. The processor system comprises a plurality of processor cores and a mutex processing unit connected with the plurality of processor cores. Each processor core provides a kernel thread and a plurality of user threads for concurrent execution, and each processor core comprises a kernel trigger module configured to monitor a set of trigger conditions and generate a kernel triggering indicator to activate the kernel thread in the processor core. The mutex processing unit is configured to receive a plurality of mutex requests from each processor core, and broadcast a plurality of mutex responses to each processor core. Each of the plurality of mutex requests is configured to create a mutex response that affects an execution status of at least one user thread in the plurality of processor cores.
Type: Grant
Filed: October 18, 2021
Date of Patent: October 10, 2023
Inventor: Ronald Chi-Chun Hui
-
Patent number: 11768790
Abstract: An integrated circuit including control/configure circuitry which interfaces with a plurality of interconnected MACs and/or one or more rows of interconnected MACs. The control/configure circuitry may include a plurality of control/configure circuits, each control/configure circuit interfaces with at least one MAC pipeline, wherein each pipeline includes a plurality of linearly connected multiplier-accumulator circuits. Each control/configure circuit may include one or more of (i) a configurable input data signal path to provide data to the MACs of the pipeline during the execution sequence(s) and (ii) a configurable output data path for the output data generated by execution sequence (i.e., input data that was processed via the multiplier-accumulator circuits of the pipeline).
Type: Grant
Filed: August 16, 2022
Date of Patent: September 26, 2023
Assignee: Flex Logix Technologies, Inc.
Inventors: Frederick A. Ware, Cheng C. Wang
-
Patent number: 11762664
Abstract: There is provided a data processing apparatus comprising decode circuitry responsive to receipt of a block of instructions to generate control signals indicative of each of the block of instructions, and to analyse the block of instructions to detect a potential hazard instruction. The data processing apparatus is provided with decode circuitry to encode information indicative of a clean restart point into the control signals associated with the potential hazard instruction. The data processing apparatus is provided with data processing circuitry to perform out-of-order execution of at least some of the block of instructions, and control circuitry responsive to a determination, at execution of the potential hazard instruction, that data values used as operands for the potential hazard instruction have been modified by out-of-order execution of a subsequent instruction, to restart execution from the clean restart point and to flush held data values from the data processing circuitry.
Type: Grant
Filed: January 5, 2022
Date of Patent: September 19, 2023
Assignee: Arm Limited
Inventors: Yasuo Ishii, Michael David Achenbach, David Gum Lim, Abhishek Raja
-
Patent number: 11755361
Abstract: A system, method, and apparatus are provided for handling communications with external communication channel hardware devices by a processor executing event-based programming code to interface a plurality of virtual machines with the external communication channel hardware devices by providing the processor with an event latch for storing hardware events received from the external communication channel hardware devices, with a timer circuit that generates a sequence of timer interrupt signals, and with a masking circuit that masks the hardware events stored in the event latch with an event mask in response to each timer interrupt signal, where each event mask is associated with a different virtual machine running on the processor such that each virtual machine is allowed to communicate only on a masked subset of the hardware events specified by the event mask to ensure freedom from interference between the plurality of virtual machines when communicating with the external communication channel hardware devices.
Type: Grant
Filed: October 15, 2021
Date of Patent: September 12, 2023
Assignee: NXP B.V.
Inventors: Brian Christopher Kahne, Michael Andrew Fischer, Robert Anthony McGowan
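The per-virtual-machine event masking described in the abstract of patent 11755361 can be modeled with bitwise operations. In the sketch below, the function name `masked_events`, the encoding of events as bits of an integer, and the round-robin choice of virtual machine on each timer tick are all assumptions made for illustration; the patent abstract does not specify them.

```python
from typing import List, Tuple


def masked_events(event_latch: int, vm_masks: List[int], tick: int) -> Tuple[int, int]:
    """On a timer interrupt, select a virtual machine (round-robin by tick,
    an assumption) and return (vm_index, visible_events), where
    visible_events is the latch contents filtered by that VM's mask."""
    vm_index = tick % len(vm_masks)
    return vm_index, event_latch & vm_masks[vm_index]
```

With masks that do not overlap, no hardware event is ever visible to more than one virtual machine, which is one way to read the abstract's "freedom from interference".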
-
Patent number: 11748024
Abstract: An apparatus includes a register block including a plurality of register groups; at least one processing circuit operating based on first data stored in the register block; and a register manager that receives second data from a host, receives a copy request for at least one register group from at least one processing circuit, and copies third data as at least a portion of the second data to at least one register group in response to the copy request.
Type: Grant
Filed: January 11, 2021
Date of Patent: September 5, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Sungrae Lee
-
Patent number: 11726525
Abstract: An example docking station includes a network interface controller. The network interface controller is to communicatively couple the docking station to a network. The docking station also includes a controller to manage the docking station. The docking station includes a hub communicatively coupled to the network interface controller and the controller. The hub is to communicatively couple to a computing device. The controller is to instruct the hub to use the computing device as a master based on the computing device being communicatively coupled to the hub and to instruct the hub to use the controller as the master based on the computing device not being communicatively coupled to the hub.
Type: Grant
Filed: September 24, 2018
Date of Patent: August 15, 2023
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Roger D. Benson
-
Patent number: 11726789
Abstract: Embodiments of a multithreaded processor and a method of assigning blocks of register files for hardware threads of multithreaded processors are disclosed.
Type: Grant
Filed: January 27, 2022
Date of Patent: August 15, 2023
Assignee: NXP B.V.
Inventor: Michael Andrew Fischer
-
Patent number: 11720469
Abstract: A computer-implemented method, a computer system and a computer program product customize generation and application of stress test conditions in a processor core. The method includes receiving a workload at the processor core, where the workload includes a plurality of instructions and the processor core comprises a plurality of macros. The method also includes obtaining macro performance data for each macro in the plurality of macros from the processor core. The method further includes determining a switching activity level for each macro in the plurality of macros when each instruction in the plurality of instructions is run based on the macro performance data. Lastly, the method includes generating a stressmark comprising the plurality of instructions in the workload, where the stressmark is associated with a macro in the plurality of macros when the switching activity level for the macro is above a minimum threshold.
Type: Grant
Filed: November 11, 2022
Date of Patent: August 8, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Karthik V Swaminathan, Ramon Bertran Monfort, Alper Buyuktosunoglu, Pradip Bose
-
Patent number: 11693661
Abstract: Techniques related to executing a plurality of instructions by a processor comprising receiving a first instruction for execution on an instruction execution pipeline, beginning execution of the first instruction, receiving one or more second instructions for execution on the instruction execution pipeline, the one or more second instructions associated with a higher priority task than the first instruction, storing a register state associated with the execution of the first instruction in one or more registers of a capture queue associated with the instruction execution pipeline, copying the register state from the capture queue to a memory, determining that the one or more second instructions have been executed, copying the register state from the memory to the one or more registers of the capture queue, and restoring the register state to the instruction execution pipeline from the capture queue.
Type: Grant
Filed: April 27, 2021
Date of Patent: July 4, 2023
Assignee: Texas Instruments Incorporated
Inventors: Timothy D. Anderson, Joseph Zbiciak, Kai Chirca
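The save and restore sequence in this abstract can be pictured with a small software model. The sketch below is illustrative only: the class `CapturePipeline`, its dictionary-based registers, and the task identifier are hypothetical stand-ins, not anything taken from the patent.

```python
class CapturePipeline:
    """Toy model of a pipeline with a capture queue (all names are hypothetical)."""

    def __init__(self):
        self.registers = {}      # live register state of the pipeline
        self.capture_queue = {}  # registers of the capture queue
        self.memory = {}         # memory that holds saved state per task

    def preempt(self, task_id):
        # Store the in-flight register state in the capture queue,
        # then copy it from the capture queue to memory.
        self.capture_queue = dict(self.registers)
        self.memory[task_id] = dict(self.capture_queue)
        self.registers = {}
        self.capture_queue = {}

    def resume(self, task_id):
        # Copy the saved state from memory back into the capture queue,
        # then restore the pipeline registers from the capture queue.
        self.capture_queue = self.memory.pop(task_id)
        self.registers = dict(self.capture_queue)


pipe = CapturePipeline()
pipe.registers = {"r0": 7, "r1": 9}  # first instruction has begun executing
pipe.preempt("low")                  # higher-priority instructions arrive
pipe.registers = {"r0": 100}         # higher-priority instructions execute
pipe.resume("low")                   # they have finished; restore the first task
restored = pipe.registers
```

After `resume`, `restored` holds the register values the first instruction had when it was interrupted.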
-
Patent number: 11687487
Abstract: Systems and methods are described for updating text files for a processing pipeline without restarting the processing pipeline. A processing pipeline may include a frontend thread and a backend thread. The frontend thread of the processing pipeline may generate transformed data using the text file. A backend thread of the processing pipeline may periodically determine whether an updated text file has been uploaded. The backend thread can determine that an updated text file has been uploaded and cause the frontend thread to pause generating transformed data. The backend thread can validate the updated text file by comparing the text file and the updated text file. Based on validating the updated text file, the backend thread can cause the frontend thread to resume transforming data using the updated text file.
Type: Grant
Filed: March 11, 2021
Date of Patent: June 27, 2023
Assignee: Splunk Inc.
Inventors: Bashar Abdul-Jawad, Sonal Bablani, Gayathri Pandyaram, Katlyn Parvin, Michael Peterson, Sergey Sergeev
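A rough sketch of the pause, validate, and resume flow follows. Several details are assumptions made for illustration: the `key=value` file format, the validation rule (an update must keep every existing key), and the choice to resume with the old file when validation fails. The class and function names are hypothetical.

```python
import threading


def parse_rules(text):
    """Parse 'key=value' lines into a dict (the file format is an assumption)."""
    return dict(line.split("=", 1) for line in text.splitlines() if line.strip())


class ReloadablePipeline:
    def __init__(self, text):
        self.text = text
        self.rules = parse_rules(text)
        self.resumed = threading.Event()
        self.resumed.set()  # the frontend may run

    def transform(self, record):
        """Frontend: blocks while the backend has paused the pipeline."""
        self.resumed.wait()
        return self.rules.get(record, record)

    def check_for_update(self, uploaded_text):
        """Backend: called periodically with the latest uploaded file contents."""
        if uploaded_text == self.text:
            return False  # nothing new was uploaded
        self.resumed.clear()  # pause the frontend
        try:
            old, new = parse_rules(self.text), parse_rules(uploaded_text)
            # Hypothetical validation rule: the update must keep every existing key.
            valid = set(old) <= set(new)
            if valid:
                self.text, self.rules = uploaded_text, new
        finally:
            self.resumed.set()  # resume the frontend
        return valid


pipeline = ReloadablePipeline("a=1\nb=2")
accepted = pipeline.check_for_update("a=10\nb=2\nc=3")  # keeps keys a and b
rejected = pipeline.check_for_update("a=5")             # drops keys b and c
```

The `threading.Event` acts as the pause gate, so the frontend never needs to be restarted; it simply waits until the backend sets the event again.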
-
Patent number: 11663328
Abstract: An apparatus, system, and method for detecting compromised firmware in a non-volatile storage device. A control bus of a non-volatile storage device is monitored. The non-volatile storage device includes a processor and electronic components coupled to the control bus. Signal traffic on the control bus is analyzed for events and/or triggers related to storage operations initiated on the control bus by the processor. Storage operations include one or more commands directed to at least one of the electronic components. If the latency for the storage operation satisfies an alert threshold a host is notified of compromised firmware.
Type: Grant
Filed: April 22, 2022
Date of Patent: May 30, 2023
Assignee: Western Digital Technologies, Inc.
Inventors: Judah Gamliel Hahn, Shay Benisty, Ariel Navon
-
Patent number: 11663010
Abstract: A system and method for a virtual processor base/virtual execution context arrangement. The disclosed arrangement utilizes chiplets comprising core logic and defined instruction sets. The chiplets are adapted to operate in conjunction with one or more active execution contexts to enable the execution of particular processes. In particular, the defined instruction sets include instructions for processor debugging. The system and method support the compartmentalization of such debugging instructions so as to provide enhanced processor and process security.
Type: Grant
Filed: March 8, 2021
Date of Patent: May 30, 2023
Assignee: UNISYS CORPORATION
Inventors: Andrew Ward Beale, David Strong
-
Patent number: 11656796
Abstract: A data processor includes a fabric-attached memory (FAM) interface for coupling to a data fabric and fulfilling memory access instructions. A requestor-side adaptive consistency controller coupled to the FAM interface requests notifications from a fabric manager for the fabric-attached memory regarding changes in requestors authorized to access a FAM region which the data processor is authorized to access. If a notification indicates that more than one requestor is authorized to access the FAM region, fences are activated for selected memory access instructions in a local application.
Type: Grant
Filed: March 31, 2021
Date of Patent: May 23, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Sergey Blagodurov, Brandon K. Potter, Johnathan Alsop
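One way to picture the notification-driven fence decision is the sketch below. It rests on assumptions the abstract does not state: that the "selected" instructions are stores, that fences are switched off again when the region returns to a single requestor, and that a fence can be modeled as an extra operation in the issued stream. All names are hypothetical.

```python
class AdaptiveConsistencyController:
    """Requestor-side controller for one FAM region (hypothetical model)."""

    FENCED_OPS = {"store"}  # assumption: only stores are "selected" for fencing

    def __init__(self):
        self.fences_active = False

    def on_notification(self, authorized_requestors):
        # The fabric manager reports the requestors currently authorized for the
        # region. Fences are activated when more than one requestor shares it.
        # Deactivating them when sharing ends is an assumption.
        self.fences_active = len(set(authorized_requestors)) > 1

    def issue(self, op):
        # Return the operations actually sent to the fabric for one access.
        if self.fences_active and op in self.FENCED_OPS:
            return [op, "fence"]
        return [op]


controller = AdaptiveConsistencyController()
private_store = controller.issue("store")         # region not shared yet
controller.on_notification({"node_a", "node_b"})  # a second requestor is authorized
shared_store = controller.issue("store")
shared_load = controller.issue("load")
```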
-
Patent number: 11656909
Abstract: A tensor accelerator includes two tile execution units and a bidirectional queue. Each of the tile execution units includes a buffer, a plurality of arithmetic logic units, a network, and a selector. The buffer includes a plurality of memory cells. The network is coupled to the plurality of memory cells. The selector is coupled to the network and the plurality of arithmetic logic units. The bidirectional queue is coupled between the selectors of the tile execution units.
Type: Grant
Filed: April 15, 2021
Date of Patent: May 23, 2023
Assignee: National Taiwan University
Inventors: Shao-Yi Chien, Yu-Sheng Lin, Wei-Chao Chen
-
Patent number: 11650902
Abstract: Disclosed examples to perform instruction-level graphics processing unit (GPU) profiling based on binary instrumentation include: accessing, via a GPU driver executed by a processor, binary code generated by a GPU compiler based on application programming interface (API)-based code provided by an application; accessing, via the GPU driver executed by the processor, instrumented binary code, the instrumented binary code generated by a binary instrumentation module that inserts profiling instructions in the binary code based on an instrumentation schema provided by a profiling application; and providing, via the GPU driver executed by the processor, the instrumented binary code from the GPU driver to a GPU, the instrumented binary code structured to cause the GPU to collect and store profiling data in a memory based on the profiling instructions while executing the instrumented binary code.
Type: Grant
Filed: November 8, 2017
Date of Patent: May 16, 2023
Assignee: Intel Corporation
Inventors: Konstantin Levit-Gurevich, Aleksey Alekseev, Michael Berezalsky, Sion Berkowits, Julia Fedorova, Anton V. Gorshkov, Sunpyo Hong, Noam Itzhaki, Arik Narkis
-
Patent number: 11604605
Abstract: A memory controller circuit is disclosed which is coupleable to a first memory circuit, such as DRAM, and includes: a first memory control circuit to read from or write to the first memory circuit; a second memory circuit, such as SRAM; a second memory control circuit adapted to read from the second memory circuit in response to a read request when the requested data is stored in the second memory circuit, and otherwise to transfer the read request to the first memory control circuit; predetermined atomic operations circuitry; and programmable atomic operations circuitry adapted to perform at least one programmable atomic operation. The second memory control circuit also transfers a received programmable atomic operation request to the programmable atomic operations circuitry and sets a hazard bit for a cache line of the second memory circuit.
Type: Grant
Filed: February 8, 2021
Date of Patent: March 14, 2023
Assignee: Micron Technology, Inc.
Inventor: Tony M. Brewer
-
Patent number: 11586443
Abstract: Devices and techniques for thread-based processor halting are described herein. A processor monitors control-status register (CSR) values that correspond to a halt condition for a thread. The processor then compares the halt condition to a current state of the thread and halts in response to the current state of the thread meeting the halt condition.
Type: Grant
Filed: October 20, 2020
Date of Patent: February 21, 2023
Assignee: Micron Technology, Inc.
Inventors: Christopher Baronne, Dean E. Walker
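The compare-and-halt idea can be illustrated with a minimal sketch. The CSR field names (`halt_enable`, `halt_thread`, `halt_pc`) and the notion that a thread's state is its ID plus program counter are assumptions made for the example; the abstract does not specify what the halt condition contains.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    thread_id: int
    pc: int  # program counter, used here as the thread's "current state"


def meets_halt_condition(csr, state):
    """Compare the halt condition held in CSR values with the thread's state."""
    return (
        csr["halt_enable"]
        and csr["halt_thread"] == state.thread_id
        and csr["halt_pc"] == state.pc
    )


def run_thread(csr, thread_id, program_length):
    """Step a thread through a program, halting if the condition is met."""
    state = ThreadState(thread_id, pc=0)
    while state.pc < program_length:
        if meets_halt_condition(csr, state):
            return ("halted", state.pc)
        state.pc += 1  # stand-in for executing one instruction
    return ("finished", state.pc)


csr = {"halt_enable": True, "halt_thread": 1, "halt_pc": 3}
result_thread_1 = run_thread(csr, thread_id=1, program_length=10)
result_thread_2 = run_thread(csr, thread_id=2, program_length=10)
```

Only thread 1 stops, because the condition names that thread; thread 2 runs to completion under the same CSR values.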
-
Patent number: 11586444
Abstract: A pipeline processing unit includes a fetch unit that fetches the instruction for the thread having an execution right, a decoding unit that decodes the instruction fetched by the fetch unit, and a computation execution unit that executes the instruction decoded by the decoding unit. When the WAIT instruction for the thread having the execution right is executed, an instruction holding unit holds instruction fetch information on a processing target instruction to be processed immediately after the WAIT instruction. An execution target thread selection unit selects a thread to be executed based on a wait command and, in response to a wait state started from the execution of the WAIT instruction being canceled, processes the processing target instruction from decoding thereof based on the instruction fetch information on the processing target instruction held in the instruction holding unit.
Type: Grant
Filed: June 10, 2021
Date of Patent: February 21, 2023
Assignee: SANKEN ELECTRIC CO., LTD.
Inventors: Kazuhiro Mima, Hitomi Shishido
-
Patent number: 11579806
Abstract: Portions of configuration state registers in-memory. An instruction is obtained, and a determination is made that the instruction accesses a configuration state register. A portion of the configuration state register is in-memory and another portion of the configuration state register is in-processor. Processing associated with the configuration state register is performed. The performing processing is based on a type of access and whether the portion or the other portion is being accessed.
Type: Grant
Filed: April 7, 2021
Date of Patent: February 14, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Patent number: 11550546
Abstract: A processing apparatus having a programmable circuit including a plurality of ALUs, comprises a holding unit which holds configuration information for switching the programmable circuit from a first circuit setting to a second circuit setting, and timing information; and an updating unit which updates each ALU so as to switch the programmable circuit from the first circuit setting to the second circuit setting, wherein in switching from the first circuit setting to the second circuit setting after the programmable circuit has executed the first data processing, the updating unit, using the timing information, updates the first ALU at a timing at which last data of the first data processing is output from the first ALU, and updates the second ALU at a timing at which the last data is output from the second ALU.
Type: Grant
Filed: April 3, 2020
Date of Patent: January 10, 2023
Assignee: CANON KABUSHIKI KAISHA
Inventors: Kazuma Sakato, Yohei Horikawa
-
Patent number: 11531544
Abstract: The system creates, in a scheduler data structure, a first entry for a consumer instruction associated with a logical register ID. The first entry includes: a scheduler entry ID; a physical register ID allocated for the logical register ID; a checkpoint ID; one or more scheduler entry IDs for one or more prior producer instructions; and a release field which indicates whether to early release a physical register. The system updates a register alias table entry to include the scheduler entry ID and the checkpoint ID of the consumer instruction. The system receives the scheduler entry ID and a checkpoint ID for a respective prior producer instruction. Responsive to determining that the received checkpoint ID does not match the checkpoint ID associated with the consumer instruction, the system sets a release field to indicate that a physical register is to remain allocated.
Type: Grant
Filed: July 29, 2021
Date of Patent: December 20, 2022
Assignee: Hewlett Packard Enterprise Development LP
Inventor: Sanyam Mehta
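The scheduler entry and the checkpoint comparison can be sketched as a small data structure. This is a loose reading of the abstract: the field names, the boolean encoding of the release field, and its default of "early release allowed" are all assumptions, and the register alias table is reduced to a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerEntry:
    entry_id: int
    physical_reg: int   # physical register allocated for the logical register
    checkpoint_id: int
    producer_entry_ids: list = field(default_factory=list)
    early_release: bool = True  # the release field; this default is an assumption


def on_producer_info(entry, producer_entry_id, producer_checkpoint_id):
    """Update the release field when a prior producer's IDs are received."""
    if (producer_entry_id in entry.producer_entry_ids
            and producer_checkpoint_id != entry.checkpoint_id):
        # Checkpoint IDs differ, so the physical register stays allocated.
        entry.early_release = False
    return entry.early_release


# Register alias table: logical register ID -> (scheduler entry ID, checkpoint ID).
rat = {}
consumer = SchedulerEntry(entry_id=5, physical_reg=40, checkpoint_id=2,
                          producer_entry_ids=[3, 4])
rat["r1"] = (consumer.entry_id, consumer.checkpoint_id)

after_matching = on_producer_info(consumer, producer_entry_id=3,
                                  producer_checkpoint_id=2)
after_mismatch = on_producer_info(consumer, producer_entry_id=4,
                                  producer_checkpoint_id=1)
```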
-
Patent number: 11442881
Abstract: An integrated circuit including control/configure circuitry which interfaces with a plurality of interconnected (e.g., serially) multiplier-accumulator circuits and/or one or more rows of interconnected (e.g., serially) multiplier-accumulator circuits. The control/configure circuitry may include a plurality of control/configure circuits, each control/configure circuit interfaces with at least one multi-bit MAC execution pipeline, wherein each pipeline includes a plurality of interconnected (e.g., serially) multiplier-accumulator circuits. Each control/configure circuit may include one or more (or all) of (i) a configurable input data signal path to provide data to the MACs of the pipeline during the execution sequence(s), (ii) a configurable accumulation data path for the ongoing/accumulating MAC accumulation totals generated by the MACs during an execution sequence, and (iii) a configurable output data path for the output data generated by execution sequence (i.e.
Type: Grant
Filed: March 25, 2021
Date of Patent: September 13, 2022
Assignee: Flex Logix Technologies, Inc.
Inventors: Frederick A. Ware, Cheng C. Wang
-
Patent number: 11366720
Abstract: In one embodiment, a method includes generating a handle that references a checkpoint for a service, sending the handle to the service, wherein the handle is configured to be used by the service to store one or more states of the service in the checkpoint, determining that the service needs to be restarted, restarting the service, accessing the handle for the checkpoint, and sending the handle for the checkpoint to the restarted service, wherein the handle for the checkpoint is configured to be used by the restarted service to restore the one or more states.
Type: Grant
Filed: August 1, 2019
Date of Patent: June 21, 2022
Assignee: Facebook Technologies, LLC.
Inventors: Vadim Victor Spivak, Bernhard Poess
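A minimal sketch of the handle-based flow, assuming an in-process checkpoint store and a supervisor that simply constructs a new service object on restart. `CheckpointStore`, `CounterService`, and the use of a random hex string as the handle are hypothetical choices for illustration.

```python
import uuid


class CheckpointStore:
    """Holds checkpoints, each addressed by an opaque handle (hypothetical)."""

    def __init__(self):
        self._checkpoints = {}

    def create(self):
        handle = uuid.uuid4().hex  # generate a handle that references a checkpoint
        self._checkpoints[handle] = {}
        return handle

    def save(self, handle, state):
        self._checkpoints[handle] = dict(state)

    def load(self, handle):
        return dict(self._checkpoints[handle])


class CounterService:
    def __init__(self, store, handle, restore=False):
        self.store, self.handle = store, handle
        # A restarted service uses the handle to restore its earlier state.
        self.state = store.load(handle) if restore else {}

    def set(self, key, value):
        self.state[key] = value
        self.store.save(self.handle, self.state)  # store state in the checkpoint


store = CheckpointStore()
handle = store.create()                # supervisor generates the handle
service = CounterService(store, handle)  # and sends it to the service
service.set("frames_seen", 42)

# Supervisor decides the service must restart, then sends the same handle again.
service = CounterService(store, handle, restore=True)
restored_state = service.state
```

Because the supervisor, not the service, owns the handle, the restarted instance needs no knowledge of where its predecessor kept its state.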
-
Patent number: 11347539
Abstract: In an apparatus (2) with transactional memory support, a predetermined type of transaction start instruction or a subsequent instruction following the predetermined type of transaction start instruction triggers capture of a lock identifier which identifies a lock variable for controlling exclusive access to at least one resource. In response to a predetermined type of transaction end instruction which follows the predetermined type of transaction start instruction, the lock variable is checked and commitment of results of speculatively executed instructions of the transaction is prevented or deferred when the lock variable indicates that another thread holds the exclusive access to the target resource. This approach can improve performance when executing transactions in a transactional memory based system.
Type: Grant
Filed: August 30, 2018
Date of Patent: May 31, 2022
Assignee: Arm Limited
Inventors: Matthew James Horsnell, Stephan Diestelhorst
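The check-at-commit behaviour can be modeled in a few lines. In this sketch, memory and lock variables are plain dictionaries, a lock value of `True` means another thread holds exclusive access, and deferral is modeled by calling `end()` again later; these are simplifying assumptions, and the class name is hypothetical.

```python
class LockCheckedTransaction:
    """Toy transaction that checks a captured lock variable before committing."""

    def __init__(self, memory, locks, lock_id):
        self.memory = memory
        self.locks = locks
        self.lock_id = lock_id  # lock identifier captured at transaction start
        self.speculative_writes = {}

    def write(self, address, value):
        # Results stay speculative until the transaction ends successfully.
        self.speculative_writes[address] = value

    def end(self):
        # At transaction end, check the lock variable. If another thread holds
        # it, do not commit; the caller may retry later (deferred commit).
        if self.locks.get(self.lock_id, False):
            return False
        self.memory.update(self.speculative_writes)
        return True


memory = {"x": 0}
locks = {"resource_lock": True}  # another thread currently holds the lock
tx = LockCheckedTransaction(memory, locks, "resource_lock")
tx.write("x", 1)
first_attempt = tx.end()
value_while_locked = memory["x"]

locks["resource_lock"] = False   # the other thread releases the lock
second_attempt = tx.end()
value_after_commit = memory["x"]
```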