Patents Examined by Jacob Petranek
-
Patent number: 12242846
Abstract: An apparatus to facilitate supporting 8-bit floating point format operands in a computing architecture is disclosed. The apparatus includes a processor comprising: a decoder to decode an instruction fetched for execution into a decoded instruction, wherein the decoded instruction is a matrix instruction that operates on 8-bit floating point operands to cause the processor to perform a parallel dot product operation; a controller to schedule the decoded instruction and provide input data for the 8-bit floating point operands in accordance with an 8-bit floating data format indicated by the decoded instruction; and systolic dot product circuitry to execute the decoded instruction using systolic layers, each systolic layer comprising one or more sets of interconnected multipliers, shifters, and adders, each set of multipliers, shifters, and adders to generate a dot product of the 8-bit floating point operands.
Type: Grant
Filed: March 27, 2024
Date of Patent: March 4, 2025
Assignee: Intel Corporation
Inventors: Naveen Mellempudi, Subramaniam Maiyuran, Varghese George, Fangwen Fu, Shuai Mu, Supratim Pal, Wei Xiong
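To illustrate the kind of operation this abstract describes (not the patented circuit), the sketch below decodes a hypothetical E4M3-style 8-bit float (1 sign, 4 exponent, 3 mantissa bits, bias 7) and accumulates a dot product in single precision; the format parameters and function names are assumptions chosen for the example.

```cpp
#include <cstdint>
#include <cmath>
#include <cstdio>

// Simplified decode of an 8-bit float in an assumed E4M3-style layout.
float fp8_e4m3_to_float(uint8_t v) {
    int sign = (v >> 7) & 0x1;
    int exp  = (v >> 3) & 0xF;
    int man  = v & 0x7;
    float value;
    if (exp == 0)                         // subnormal: no implicit leading 1
        value = std::ldexp(man / 8.0f, 1 - 7);
    else                                  // normal: implicit leading 1
        value = std::ldexp(1.0f + man / 8.0f, exp - 7);
    return sign ? -value : value;
}

// Dot product of two FP8 vectors with a wider (float) accumulator,
// mirroring the multiply-and-accumulate on 8-bit operands.
float fp8_dot(const uint8_t* a, const uint8_t* b, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc += fp8_e4m3_to_float(a[i]) * fp8_e4m3_to_float(b[i]);
    return acc;
}

int main() {
    uint8_t a[4] = {0x38, 0x40, 0x44, 0x48};   // roughly 1.0, 2.0, 3.0, 4.0
    uint8_t b[4] = {0x38, 0x38, 0x38, 0x38};   // roughly 1.0 each
    std::printf("dot = %f\n", fp8_dot(a, b, 4));  // prints 10.0
    return 0;
}
```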
-
Patent number: 12235792
Abstract: An apparatus and method for temperature-constrained frequency control and scheduling. For example, one embodiment of a processor comprises: a plurality of cores; power management circuitry to control a frequency of each core of the plurality of cores based, at least in part, on a temperature associated with one or more cores of the plurality of cores, the power management circuitry comprising: a temperature limit-driven frequency controller to determine a first frequency limit value based on a temperature of a corresponding core reaching a first threshold; frequency prediction hardware logic to predict a temperature-constrained frequency of the corresponding core based on the first frequency limit value and an initial frequency limit value; and performance determination hardware logic to determine a new performance value for the corresponding core based on the temperature-constrained frequency, the new performance value to be provided to a task scheduler.
Type: Grant
Filed: March 30, 2023
Date of Patent: February 25, 2025
Assignee: Intel Corporation
Inventors: Jianwei Dai, Somvir Singh Dahiya, Mahesh Kumar P, Stephen H. Gunther, Sapumal Wijeratne, Mark Gallina
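A minimal software sketch of the control flow the abstract outlines, purely as illustration: a first frequency limit derived when a core temperature crosses a threshold, a predicted temperature-constrained frequency blended from the first and initial limits, and a value that would be handed to a scheduler. The thresholds, units, and the simple back-off/averaging policies are assumptions, not the patented logic.

```cpp
#include <algorithm>
#include <cstdio>

struct CoreThermalState {
    double temperature_c;
    double initial_freq_limit_mhz;
};

// Step 1: derive a first frequency limit once the core temperature reaches a
// threshold (a simple proportional back-off stands in for the controller).
double temperature_limited_frequency(const CoreThermalState& s,
                                     double threshold_c,
                                     double backoff_mhz_per_c) {
    if (s.temperature_c < threshold_c)
        return s.initial_freq_limit_mhz;
    double over = s.temperature_c - threshold_c;
    return std::max(400.0, s.initial_freq_limit_mhz - over * backoff_mhz_per_c);
}

// Step 2: predict a temperature-constrained frequency from the first limit
// and the initial limit (a plain average stands in for the prediction logic).
double predict_constrained_frequency(double first_limit_mhz,
                                     double initial_limit_mhz) {
    return 0.5 * (first_limit_mhz + initial_limit_mhz);
}

int main() {
    CoreThermalState core{92.0, 3600.0};
    double first = temperature_limited_frequency(core, 85.0, 50.0);
    double predicted = predict_constrained_frequency(first, core.initial_freq_limit_mhz);
    // Step 3: a performance value derived from the constrained frequency
    // would be provided to the task scheduler.
    std::printf("first limit %.0f MHz, predicted %.0f MHz\n", first, predicted);
    return 0;
}
```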
-
Patent number: 12236237
Abstract: Processor cores using content object identifiers for routing and computation are disclosed. One method includes executing a complex computation using a set of processing cores. The method includes routing a set of content objects using a set of content object identifiers and executing a set of instructions. The set of instructions are defined using a set of operand identifiers. The operand identifiers represent content object identifiers in the set of content object identifiers. The content objects can be routed according to a named data networking (NDN) or content-centric networking (CCN) paradigm with the content object identifiers mentioned above serving as the names for the computation data being routed by the network.
Type: Grant
Filed: March 31, 2023
Date of Patent: February 25, 2025
Assignee: Tenstorrent Inc.
Inventors: Davor Capalija, Ljubisa Bajic, Jasmina Vasiljevic, Yongbum Kim
-
Patent number: 12235791
Abstract: Methods and apparatus relating to loop driven region based frontend translation control for performant and secure data-space guided micro-sequencing are described. In an embodiment, Data-space Translation Logic (DTL) circuitry receives a static input and a dynamic input and generates one or more outputs based at least in part on the static input and the dynamic input. A frontend counter generates a count value for the dynamic input based at least in part on an incremented/decremented counter value and a next counter value from the DTL circuitry. The DTL circuitry is capable to receive a new dynamic input prior to consumption of the one or more outputs. Other embodiments are also disclosed and claimed.
Type: Grant
Filed: August 23, 2021
Date of Patent: February 25, 2025
Assignee: Intel Corporation
Inventors: Kameswar Subramaniam, Christopher Russell
-
Patent number: 12229557
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
Type: Grant
Filed: March 11, 2024
Date of Patent: February 18, 2025
Assignee: Apple Inc.
Inventors: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
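For intuition only, here is a toy model of such a predictor as a table of 2-bit saturating counters: a failing atomic trains the entry toward "do not forward", so a younger load to the same location is not fed store data that would later have to be flushed. The table size, indexing by instruction address, and thresholds are assumptions for the sketch, not the patented design.

```cpp
#include <cstdint>
#include <cstdio>

struct AtomicPredictor {
    static constexpr int kEntries = 256;
    uint8_t counter[kEntries] = {};   // 0..3; >= 2 means "predict success"

    int index(uint64_t pc) const { return static_cast<int>(pc) & (kEntries - 1); }

    // Prediction consulted when a subsequent load might consume the atomic's store data.
    bool predict_success(uint64_t pc) const { return counter[index(pc)] >= 2; }

    // Train the counter with the actual outcome of the atomic operation.
    void update(uint64_t pc, bool succeeded) {
        uint8_t& c = counter[index(pc)];
        if (succeeded) { if (c < 3) ++c; }
        else           { if (c > 0) --c; }
    }
};

int main() {
    AtomicPredictor pred;
    uint64_t atomic_pc = 0x4010;

    // A failing atomic biases the predictor against forwarding its store data.
    pred.update(atomic_pc, false);
    std::printf("forward store data? %s\n",
                pred.predict_success(atomic_pc) ? "yes" : "no");

    pred.update(atomic_pc, true);
    pred.update(atomic_pc, true);
    std::printf("forward store data? %s\n",
                pred.predict_success(atomic_pc) ? "yes" : "no");
    return 0;
}
```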
-
Patent number: 12229079
Abstract: A computing system can include a first system on chip (SoC) and a second SoC. Each SoC can comprise a memory in which the SoC publishes state information. For the first SoC, the state information can correspond to a set of tasks being performed by the first SoC, where the first SoC utilizes a plurality of computational components to perform the set of tasks. The second SoC can directly access the memory of the first SoC to dynamically read the state information published by the first SoC. In a backup role, the second SoC maintains a subset of its computational components in a low power state. When the second SoC detects a trigger while reading the state information published in the first memory of the first SoC, the second SoC powers the subset of computational components to take over the set of tasks.
Type: Grant
Filed: May 10, 2023
Date of Patent: February 18, 2025
Assignee: Mercedes-Benz Group AG
Inventor: Francois Piednoel
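A rough software analogue of the backup role described above, for illustration only: the backup polls state the primary publishes in memory and powers up its compute components when a trigger condition is seen. The shared struct, the heartbeat/fault trigger condition, and the function names are all assumptions made for this sketch.

```cpp
#include <atomic>
#include <cstdint>
#include <cstdio>

// Stand-in for the state information the primary SoC publishes in its memory.
struct PublishedState {
    std::atomic<uint64_t> heartbeat{0};   // incremented by the primary SoC
    std::atomic<bool>     fault{false};   // set when the primary detects a fault
};

// Trigger check performed by the backup SoC while reading the primary's state.
bool trigger_detected(const PublishedState& s, uint64_t last_heartbeat) {
    return s.fault.load() || s.heartbeat.load() == last_heartbeat;   // fault or stalled heartbeat
}

void power_up_and_take_over() {
    std::printf("backup SoC: powering compute components, taking over tasks\n");
}

int main() {
    PublishedState state;          // stands in for the primary SoC's memory
    state.heartbeat.store(41);

    uint64_t last = 41;            // backup's previous heartbeat reading
    // One polling iteration of the backup SoC's monitor loop.
    if (trigger_detected(state, last)) {
        power_up_and_take_over();
    }
    return 0;
}
```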
-
Patent number: 12217052
Abstract: A method for executing a machine code using a microprocessor includes, after an operation of decoding a current loaded instruction, constructing a mask from the signals generated by an instruction decoder in response to decoding of the current loaded instruction by the decoder. The constructed mask varies as a function of the current loaded instruction. Subsequently, before an operation of decoding a next loaded instruction, the next loaded instruction is unmasked using the constructed mask.
Type: Grant
Filed: March 23, 2022
Date of Patent: February 4, 2025
Assignee: Commissariat à l'Energie Atomique et aux Energies Alternatives
Inventors: Gaëtan Leplus, Olivier Savry
-
Patent number: 12210872
Abstract: A neural processing device, a processing element included therein and a method for operating various formats of the neural processing device are provided. The neural processing device includes at least one neural processor, a shared memory shared by the at least one neural processor, and a global interconnection configured to transmit data between the at least one neural processor and the shared memory, wherein each of the at least one neural processor comprises at least one processing element, each of the at least one processing element receives an input in a first format and thereby performs an operation, and receives an input in a second format that is different from the first format and thereby performs an operation if a format conversion signal is received, and the first format and the second format have a same number of bits.
Type: Grant
Filed: March 12, 2024
Date of Patent: January 28, 2025
Assignee: Rebellions Inc.
Inventors: Karim Charfi, Jinwook Oh
-
Patent number: 12210478
Abstract: Methods and systems for executing an application data flow graph on a set of computational nodes are disclosed. The computational nodes can each include a programmable controller from a set of programmable controllers, a memory from a set of memories, a network interface unit from a set of network interface units, and an endpoint from a set of endpoints. A disclosed method comprises configuring the programmable controllers with instructions. The method also comprises independently and asynchronously executing the instructions using the set of programmable controllers in response to a set of events exchanged between the programmable controllers themselves, between the programmable controllers and the network interface units, and between the programmable controllers and the set of endpoints. The method also comprises transitioning data in the set of memories on the computational nodes in accordance with the application data flow graph and in response to the execution of the instructions.
Type: Grant
Filed: May 11, 2023
Date of Patent: January 28, 2025
Assignee: Tenstorrent Inc.
Inventors: Ivan Matosevic, Davor Capalija, Jasmina Vasiljevic, Utku Aydonat, S. Alexander Chin, Djordje Maksimovic, Ljubisa Bajic
-
Patent number: 12204899
Abstract: Stochastic hyperdimensional arithmetic computing is provided. Hyperdimensional computing (HDC) is a neurally-inspired computation model working based on the observation that the human brain operates on high-dimensional representations of data, called hypervectors. Although HDC is powerful in reasoning and association of the abstract information, it is weak on feature extraction from complex data. Consequently, most existing HDC solutions rely on expensive pre-processing algorithms for feature extraction. This disclosure proposes StocHD, a novel end-to-end hyperdimensional system that supports accurate, efficient, and robust learning over raw data. StocHD expands HDC functionality to the computing area by mathematically defining stochastic arithmetic over HDC hypervectors. StocHD enables an entire learning application (including feature extractor) to process using HDC data representation, enabling uniform, efficient, robust, and highly parallel computation.
Type: Grant
Filed: May 13, 2022
Date of Patent: January 21, 2025
Assignee: The Regents of the University of California
Inventor: Mohsen Imani
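As a generic illustration of stochastic arithmetic over hypervectors (not the StocHD formulation itself), the sketch below encodes a value in [0, 1] as the density of 1s in a high-dimensional binary vector, multiplies via element-wise AND, and performs scaled addition via a per-element multiplexer. The dimension and encoding are assumptions chosen for the example.

```cpp
#include <cstdint>
#include <cstdio>
#include <random>
#include <vector>

constexpr int D = 10000;                 // assumed hypervector dimension
using Hypervector = std::vector<uint8_t>;

// Encode a value in [0, 1] as the probability that each element is 1.
Hypervector encode(double value, std::mt19937& rng) {
    std::bernoulli_distribution bit(value);
    Hypervector hv(D);
    for (int i = 0; i < D; ++i) hv[i] = bit(rng) ? 1 : 0;
    return hv;
}

// Decode back to a value: the fraction of 1s across the hypervector.
double decode(const Hypervector& hv) {
    long ones = 0;
    for (uint8_t b : hv) ones += b;
    return static_cast<double>(ones) / D;
}

// Stochastic multiply: element-wise AND.
Hypervector multiply(const Hypervector& a, const Hypervector& b) {
    Hypervector out(D);
    for (int i = 0; i < D; ++i) out[i] = a[i] & b[i];
    return out;
}

// Stochastic scaled add: select from a or b with probability 1/2, giving (a + b) / 2.
Hypervector scaled_add(const Hypervector& a, const Hypervector& b, std::mt19937& rng) {
    std::bernoulli_distribution sel(0.5);
    Hypervector out(D);
    for (int i = 0; i < D; ++i) out[i] = sel(rng) ? a[i] : b[i];
    return out;
}

int main() {
    std::mt19937 rng(7);
    Hypervector x = encode(0.6, rng), y = encode(0.5, rng);
    std::printf("x*y     ~ %.3f (expected 0.30)\n", decode(multiply(x, y)));
    std::printf("(x+y)/2 ~ %.3f (expected 0.55)\n", decode(scaled_add(x, y, rng)));
    return 0;
}
```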
-
Patent number: 12204904
Abstract: Described herein are systems and methods for dynamic designation of instructions as sensitive. For example, some methods include detecting that a first instruction of a first process has been designated as a sensitive instruction; checking whether a sensitive handling enable indicator in a process state register storing a state of the first process is enabled; responsive to detection of the sensitive instruction and enablement of the sensitive handling enable indicator, invoking a constraint for execution of the first instruction; executing the first instruction subject to the constraint; and executing a second instruction of the first process without the constraint.
Type: Grant
Filed: February 7, 2022
Date of Patent: January 21, 2025
Assignee: Marvell Asia Pte, Ltd.
Inventor: Shubhendu Sekhar Mukherjee
-
Processing elements array that includes delay queues between processing elements to hold shared data
Patent number: 12190224
Abstract: A processing element architecture adapted to a convolution comprises a plurality of processing elements and a delayed queue circuit. The plurality of processing elements includes a first processing element and a second processing element, wherein the first processing element and the second processing element perform the convolution according to a shared datum at least. The delayed queue circuit connects to the first processing element and connects to the second processing element. The delayed queue circuit receives the shared datum sent by the first processing element, and sends the shared datum to the second processing element after receiving the shared datum and waiting for a time interval.
Type: Grant
Filed: December 29, 2020
Date of Patent: January 7, 2025
Assignee: Industrial Technology Research Institute
Inventors: Yao-Hua Chen, Yu-Xiang Yen, Wan-Shan Hsieh, Chih-Tsun Huang, Juin-Ming Lu, Jing-Jia Liou
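As a behavioral illustration of the delay-queue idea (not the patented circuit), the sketch below models a fixed-depth pipeline between two processing elements: a shared datum emitted by the first PE appears at the second PE's input a fixed number of cycles later. The depth and datum type are assumptions for the example.

```cpp
#include <cstdio>
#include <optional>
#include <string>
#include <vector>

class DelayQueue {
public:
    explicit DelayQueue(size_t depth) : slots_(depth) {}

    // Advance one cycle: shift the pipeline, inject the new (optional) datum,
    // and return whatever falls out of the far end toward the second PE.
    std::optional<int> tick(std::optional<int> in) {
        std::optional<int> out = slots_.back();
        for (size_t i = slots_.size() - 1; i > 0; --i) slots_[i] = slots_[i - 1];
        slots_[0] = in;
        return out;
    }

private:
    std::vector<std::optional<int>> slots_;
};

int main() {
    DelayQueue q(3);   // the second PE sees the first PE's datum three cycles later
    for (int cycle = 0; cycle < 6; ++cycle) {
        std::optional<int> in = (cycle == 0) ? std::optional<int>(42) : std::nullopt;
        std::optional<int> out = q.tick(in);
        std::printf("cycle %d: second PE input = %s\n", cycle,
                    out ? std::to_string(*out).c_str() : "-");
    }
    return 0;
}
```
-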
Patent number: 12164917
Abstract: A system including one or more processors configured to receive a transpose instruction indicating to transpose a source matrix to a result matrix, provide data elements of the source matrix to input switching circuits, reorder the data elements using the input switching circuits, provide the data elements from the input switching circuits to one or more lanes of a datapath, provide the data elements from the datapath to output switching circuits, undo the reordering of the data elements using the output switching circuits, and provide the data elements from the output switching circuits to a result matrix. Each respective lane of the datapath receiving data elements receives multiple data elements directed to different respective non-overlapping portions of the lane.
Type: Grant
Filed: May 17, 2023
Date of Patent: December 10, 2024
Assignee: Google LLC
Inventors: Vinayak Anand Gokhale, Matthew Leever Hedlund, Matthew William Ashcraft, Indranil Chakraborty
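To make the three stages in the abstract concrete, here is a small software model for a 4x4 matrix: an input switch steers elements onto datapath lanes, the lanes pass them through unchanged, and an output switch writes them back so the result is the transpose. The matrix size and the specific permutation are assumptions; this illustrates the effect, not the patented switching circuits.

```cpp
#include <array>
#include <cstdio>

constexpr int N = 4;
using Matrix = std::array<std::array<int, N>, N>;

Matrix transpose_via_lanes(const Matrix& src) {
    // Input switching: element (r, c) is steered onto lane c, slot r.
    std::array<std::array<int, N>, N> lanes{};
    for (int r = 0; r < N; ++r)
        for (int c = 0; c < N; ++c)
            lanes[c][r] = src[r][c];

    // Datapath: each lane carries its slots unchanged in this sketch.

    // Output switching: lane l, slot s lands at row l, column s of the result,
    // which is exactly the transposed position.
    Matrix dst{};
    for (int l = 0; l < N; ++l)
        for (int s = 0; s < N; ++s)
            dst[l][s] = lanes[l][s];
    return dst;
}

int main() {
    Matrix m{};
    for (int r = 0; r < N; ++r)
        for (int c = 0; c < N; ++c)
            m[r][c] = r * N + c;
    Matrix t = transpose_via_lanes(m);
    std::printf("m[1][3] = %d, t[3][1] = %d\n", m[1][3], t[3][1]);  // both 7
    return 0;
}
```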
-
Patent number: 12153926
Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
Type: Grant
Filed: December 21, 2023
Date of Patent: November 26, 2024
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Michael T. Clark, Marius Evers, William L. Walker, Paul Moyer, Jay Fleischman, Jagadish B. Kotra
-
Patent number: 12153920
Abstract: Systems, methods, and apparatuses relating to instructions to multiply values of one are described.
Type: Grant
Filed: December 13, 2019
Date of Patent: November 26, 2024
Assignee: Intel Corporation
Inventors: Mohamed Elmalaki, Elmoustapha Ould-Ahmed-Vall
-
Patent number: 12153923
Abstract: A supplemental computing system can provide card services while saving processing power of a data center for other tasks. For example, the supplemental computing system described herein can include a processor and a memory that includes instructions that are executable by the processor to perform operations. The operations can include receiving a first subset of card requests. The operations can further include performing at least one servicing task to a card request resulting in an altered card request. Additionally, the operations can include selecting, for each altered card request in the first subset, a secondary card processor from at least one secondary card processor. The operations can also include transforming the altered card request into a secondary card processor specific card request suitable for the selected secondary card processor. The operations can include submitting the secondary card processor specific card request to the selected secondary card processor.
Type: Grant
Filed: August 3, 2023
Date of Patent: November 26, 2024
Assignee: Truist Bank
Inventors: Naga Mrudula Kalyani Chitturi, Glenn S. Bruce, Manikandan Dhanabalan, Gopinath Rajagopal, Harish Dindi, Vijay Srinivasan, Jay Poole
-
Patent number: 12153927
Abstract: Merging branch target buffer entries includes maintaining, in a branch target buffer, an entry corresponding to a first branch instruction, where the entry identifies a first branch target address for the first branch instruction and a second branch target address for a second branch instruction; and accessing, based on the first branch instruction, the entry.
Type: Grant
Filed: June 1, 2020
Date of Patent: November 26, 2024
Assignee: Advanced Micro Devices, Inc.
Inventors: Thomas Clouqueur, Marius Evers, Aparna Mandke, Steven R. Havlir, Robert Cohen, Anthony Jarvis
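A minimal data-structure sketch of the merged-entry idea, for illustration only: one BTB entry keyed by the first branch also records the second branch's address and target, so a single lookup yields both. The map-based storage and field names are assumptions, not the patented structure.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <unordered_map>

// One entry holds targets for two branches.
struct MergedBtbEntry {
    uint64_t first_target;
    uint64_t second_branch_pc;
    uint64_t second_target;
};

int main() {
    std::unordered_map<uint64_t, MergedBtbEntry> btb;

    // Maintain an entry for the first branch that also identifies the
    // second branch's target address.
    btb[0x1000] = MergedBtbEntry{0x2000, 0x1008, 0x3000};

    // Accessing the entry based on the first branch returns both targets,
    // letting the front end cover two branches with one lookup.
    auto it = btb.find(0x1000);
    if (it != btb.end()) {
        std::printf("first target  = 0x%" PRIx64 "\n", it->second.first_target);
        std::printf("second target = 0x%" PRIx64 " (branch at 0x%" PRIx64 ")\n",
                    it->second.second_target, it->second.second_branch_pc);
    }
    return 0;
}
```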
-
Patent number: 12141584
Abstract: Disclosed herein are embodiments related to a power efficient multi-bit storage system. In one configuration, the multi-bit storage system includes a first storage circuit, a second storage circuit, a prediction circuit, and a clock gating circuit. In one aspect, the first storage circuit updates a first output bit according to a first input bit, in response to a trigger signal, and the second storage circuit updates a second output bit according to a second input bit, in response to the trigger signal. In one aspect, the prediction circuit generates a trigger enable signal indicating whether at least one of the first output bit or the second output bit is predicted to change a state. In one aspect, the clock gating circuit generates the trigger signal based on the trigger enable signal.
Type: Grant
Filed: July 7, 2022
Date of Patent: November 12, 2024
Assignee: Taiwan Semiconductor Manufacturing Company Limited
Inventors: Kai-Chi Huang, Chi-Lin Liu, Wei-Hsiang Ma, Shang-Chih Hsieh
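A behavioral sketch of the same idea in software, purely illustrative: the trigger enable is asserted only when an input bit differs from the currently stored output bit, so the storage elements are clocked only when a state change is predicted. The two-bit width and structure are assumptions for the example.

```cpp
#include <cstdio>

struct TwoBitStorage {
    bool out0 = false;
    bool out1 = false;

    // Prediction circuit: would either output bit change if clocked now?
    bool trigger_enable(bool in0, bool in1) const {
        return (in0 != out0) || (in1 != out1);
    }

    // Clock-gating circuit: generate the trigger only when enabled,
    // and the storage circuits update on that trigger.
    bool clock(bool in0, bool in1) {
        bool trigger = trigger_enable(in0, in1);
        if (trigger) { out0 = in0; out1 = in1; }
        return trigger;    // true if the storage was actually clocked
    }
};

int main() {
    TwoBitStorage regs;
    std::printf("clocked: %d\n", regs.clock(true, false));  // state changes -> 1
    std::printf("clocked: %d\n", regs.clock(true, false));  // no change -> gated, 0
    return 0;
}
```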
-
Patent number: 12135679
Abstract: In an embodiment a system on chip includes at least one master device, at least one slave device, a connection interface configured to route signals between the at least one master device and the at least one slave device, the connection interface configured to operate according to configuration parameters, and a configuration bus connected to the connection interface, wherein the configuration bus is configured to deliver new configuration parameters to the connection interface so as to adapt operation of the connection interface.
Type: Grant
Filed: June 8, 2022
Date of Patent: November 5, 2024
Assignee: STMicroelectronics S.r.l.
Inventors: Antonino Mondello, Salvatore Pisasale
-
Patent number: 12130915
Abstract: Systems, methods, and apparatuses relating to microarchitectural mechanisms for the prevention of side-channel attacks are disclosed herein. In one embodiment, a processor core includes an instruction fetch circuit to fetch instructions; a branch target buffer comprising a plurality of entries that each include a thread identification (TID) and a privilege level bit; and a branch predictor, coupled to the instruction fetch circuit and the branch target buffer, to predict a target instruction corresponding to a branch instruction based on at least one entry of the plurality of entries in the branch target buffer, and cause the target instruction to be fetched by the instruction fetch circuit.
Type: Grant
Filed: February 1, 2022
Date of Patent: October 29, 2024
Assignee: Intel Corporation
Inventors: Robert S. Chappell, Jared W. Stark, IV, Joseph Nuzman, Stephen Robinson, Jason W. Brandt
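To show how tagging BTB entries can limit cross-context prediction, here is an illustrative sketch (not the patented mechanism): each entry carries a thread ID and a privilege bit, and a lookup only returns a target when both match the requesting context, so one thread or privilege level cannot steer another's branch predictions. The field names and map-based storage are assumptions for the example.

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>

struct TaggedBtbEntry {
    uint64_t target;
    uint8_t  tid;            // hardware thread that created the entry
    bool     supervisor;     // privilege level bit
};

class TaggedBtb {
public:
    void insert(uint64_t branch_pc, TaggedBtbEntry e) { table_[branch_pc] = e; }

    // Predict only from entries created by the same thread at the same privilege level.
    bool predict(uint64_t branch_pc, uint8_t tid, bool supervisor,
                 uint64_t& target) const {
        auto it = table_.find(branch_pc);
        if (it == table_.end()) return false;
        if (it->second.tid != tid || it->second.supervisor != supervisor) return false;
        target = it->second.target;
        return true;
    }

private:
    std::unordered_map<uint64_t, TaggedBtbEntry> table_;
};

int main() {
    TaggedBtb btb;
    btb.insert(0x1000, TaggedBtbEntry{0x2000, /*tid=*/0, /*supervisor=*/false});

    uint64_t target = 0;
    bool hit_same  = btb.predict(0x1000, 0, false, target);  // same context: hit
    bool hit_other = btb.predict(0x1000, 1, false, target);  // other thread: miss
    std::printf("same context hit=%d, other thread hit=%d\n", hit_same, hit_other);
    return 0;
}
```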