Data Or Operand Accessing, E.g., Operand Prefetch, Operand Bypass (epo) Patents (Class 712/E9.046)
-
Patent number: 12260218
Abstract: There is provided an apparatus and method for data processing. The apparatus comprises post decode cracking circuitry responsive to receipt of decoded instructions from decode circuitry of a processing pipeline, to crack the decoded instructions into micro-operations to be processed by processing circuitry of the processing pipeline. The post decode cracking circuitry is responsive to receipt of a decoded instruction suitable for cracking into a plurality of micro-operations including at least one pair of micro-operations having a producer-consumer data dependency, to generate the plurality of micro-operations including a producer micro-operation and a consumer micro-operation, and to assign a transfer register to transfer data between the producer micro-operation and the consumer micro-operation.
Type: Grant
Filed: June 28, 2023
Date of Patent: March 25, 2025
Assignee: Arm Limited
Inventors: Quentin Éric Nouvel, Luca Nassi, Nicola Piano, Albin Pierrick Tonnerre, Geoffray Matthieu Lacourba
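The cracking scheme described above can be modeled in a few lines of Python. The micro-op names, the choice of a load-pair instruction as the crackable case, and the idea of drawing the transfer register from a reserved pool are illustrative assumptions, not details from the patent:

```python
# Behavioral sketch of post-decode cracking: a load-pair instruction is
# cracked into a producer micro-op and a consumer micro-op, linked by a
# transfer register drawn from a reserved pool (names are hypothetical).
TRANSFER_REGS = ["xfer0", "xfer1"]

def crack(decoded):
    """Crack a decoded instruction into micro-ops; dependent pairs
    communicate through an assigned transfer register."""
    if decoded["op"] != "load_pair":
        return [decoded]          # not crackable: pass through unchanged
    xfer = TRANSFER_REGS[0]       # assign a free transfer register
    producer = {"op": "load_wide", "dst": xfer, "addr": decoded["addr"]}
    consumer = {"op": "split_pair", "src": xfer,
                "dsts": decoded["dsts"]}  # consumes the producer's result
    return [producer, consumer]

uops = crack({"op": "load_pair", "addr": 0x1000, "dsts": ["r0", "r1"]})
```

The key property is visible in the output: the producer's destination and the consumer's source are the same transfer register, so the dependency never occupies an architectural register.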
-
Patent number: 12242855
Abstract: In an embodiment, a processor includes a buffer in an interface unit. The buffer may be used to accumulate coprocessor instructions to be transmitted to a coprocessor. In an embodiment, the processor issues the coprocessor instructions to the buffer when ready to be issued to the coprocessor. The interface unit may accumulate the coprocessor instructions in the buffer, generating a bundle of instructions. The bundle may be closed based on various predetermined conditions and then the bundle may be transmitted to the coprocessor. If a sequence of coprocessor instructions appears consecutively in a program, the rate at which the instructions are provided to the coprocessor (on average) at least matches the rate at which the coprocessor consumes the instructions, in an embodiment.
Type: Grant
Filed: July 28, 2023
Date of Patent: March 4, 2025
Assignee: Apple Inc.
Inventors: Aditya Kesiraju, Brett S. Feero, Nikhil Gupta, Viney Gautam
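The bundling behavior can be sketched as follows. The bundle size of 4 and the explicit flush call (standing in for the patent's "various predetermined conditions", such as a fence or timeout) are assumptions for illustration:

```python
# Sketch of the interface-unit buffer: coprocessor instructions accumulate
# into a bundle, which is closed (and "transmitted") when it fills or when
# a flush condition occurs. Size and closing conditions are assumptions.
BUNDLE_SIZE = 4

class InterfaceUnit:
    def __init__(self):
        self.bundle = []
        self.sent = []    # bundles transmitted to the coprocessor

    def issue(self, insn):
        self.bundle.append(insn)
        if len(self.bundle) == BUNDLE_SIZE:  # one closing condition: full
            self.flush()

    def flush(self):      # also invoked on e.g. a fence or timeout
        if self.bundle:
            self.sent.append(self.bundle)
            self.bundle = []

iu = InterfaceUnit()
for i in range(5):
    iu.issue(f"cop_op{i}")
iu.flush()
```

Five issued instructions yield one full bundle of four plus a partial bundle of one, which is the amortization the abstract describes: consecutive coprocessor instructions travel in groups rather than one per transfer.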
-
Patent number: 12223325
Abstract: A data processor is disclosed in which groups of execution threads comprising a thread group can execute a set of instructions in lockstep, and in which a plurality of execution lanes can perform processing operations for the execution threads. In response to an execution thread issuing circuit determining that a portion of active threads of a first thread group and a portion of active threads of a second thread group use different execution lanes of the plurality of execution lanes, the execution thread issuing circuit issues both the portion of active threads of the first thread group and the portion of active threads of the second thread group for execution. This can have the effect of increasing data processor efficiency, thereby increasing throughput and reducing latency.
Type: Grant
Filed: July 24, 2023
Date of Patent: February 11, 2025
Assignee: Arm Limited
Inventors: Daren Croxford, Isidoros Sideris
-
Patent number: 12197921
Abstract: A method comprises fetching, by fetch circuitry, an encoded XOR3PP instruction comprising at least one opcode, a first source identifier to identify a first register, a second source identifier to identify a second register, a third source identifier to identify a third register, and a fourth source identifier to identify a fourth operand, wherein the first register is to store a first value, the second register is to store a second value, and the third register is to store a third value; decoding, by decode circuitry, the encoded XOR3PP instruction to generate a decoded XOR3PP instruction; and executing, by execution circuitry, the decoded XOR3PP instruction to determine a first rotational value and a second rotational value, perform a rotate operation on at least a portion of the first value based on the first rotational value to generate a rotated third value, perform an XOR operation on at least a portion of the first value, at least a portion of the second value, and the rotated third value to generate…
Type: Grant
Filed: December 22, 2022
Date of Patent: January 14, 2025
Assignee: Intel Corporation
Inventors: Santosh Ghosh, Christoph Dobraunig, Manoj Sastry
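The core of the instruction — a three-input XOR where one operand is rotated — can be modeled directly. The 64-bit width, the rotate direction, and which operand gets rotated are assumptions (the abstract above is truncated on exactly this point):

```python
# Behavioral sketch of a three-input XOR with rotation, in the spirit of
# the XOR3PP abstract. Operand width (64 bits), rotate direction (right),
# and the rotated operand (the third) are illustrative assumptions.
def ror64(value, amount):
    """Rotate a 64-bit value right by `amount` bits."""
    amount %= 64
    return ((value >> amount) | (value << (64 - amount))) & (2**64 - 1)

def xor3_rotated(a, b, c, rot):
    """XOR a, b, and c-rotated-right-by-rot in a single 'instruction'."""
    return a ^ b ^ ror64(c, rot)

result = xor3_rotated(0x0F, 0xF0, 0x1, 1)
```

Fusing the rotate and the three-way XOR into one instruction saves the separate rotate and intermediate XOR a compiler would otherwise emit, which is why such instructions appear in cryptographic permutation kernels.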
-
Patent number: 12189989
Abstract: A memory system includes a plurality of memory devices configuring a plurality of ways, and a memory controller communicating with the plurality of memory devices through a channel, wherein each of the plurality of memory devices includes a device queue, and wherein the device queue queues a plurality of controller commands inputted from the memory controller.
Type: Grant
Filed: July 27, 2021
Date of Patent: January 7, 2025
Assignee: SK hynix Inc.
Inventor: Byoung Sung You
-
Patent number: 12182447
Abstract: Methods, systems, and devices for the dynamic selection of cores for processing responses are described. A memory sub-system can receive, from a host system, a read command to retrieve data. The memory sub-system can include a first core and a second core. The first core can process the read command based on receiving the read command. The first core can identify the second core for processing a read response associated with the read command. The first core can issue an internal command to retrieve the data from a memory device of the memory sub-system. The internal command can include an indication of the second core selected to process the read response.
Type: Grant
Filed: January 20, 2023
Date of Patent: December 31, 2024
Assignee: Micron Technology, Inc.
Inventors: Mark Ish, Yun Li, Scheheresade Virani, John Paul Traver, Ning Zhao
-
Patent number: 12175245
Abstract: A data processing apparatus comprises an instruction decoder and processing circuitry. The processing circuitry is configured to perform a load-with-substitution operation in response to the instruction decoder decoding a load-with-substitution instruction specifying an address and a destination register. In the load-with-substitution operation, the processing circuitry is configured to issue a request to obtain target data corresponding to the address from one or more caches. In response to the request hitting in a given cache belonging to a subset of the one or more caches, the processing circuitry is configured to provide to the destination register of the load-with-substitution instruction the target data obtained from the given cache.
Type: Grant
Filed: February 13, 2023
Date of Patent: December 24, 2024
Assignee: Arm Limited
Inventor: Eric Ola Harald Liljedahl
-
Patent number: 12164438
Abstract: In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.
Type: Grant
Filed: September 5, 2023
Date of Patent: December 10, 2024
Assignee: Texas Instruments Incorporated
Inventors: Duc Quang Bui, Joseph Raymond Michael Zbiciak
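The predicated loop-exit pattern above is easy to illustrate in scalar Python. The vector width of four and the sample data are made up; the point is the exit condition, which tests the predicate rather than a separate trip count:

```python
# Sketch of the predicated loop-exit pattern: each iteration accesses a
# data vector and its predicate; the loop repeats while the predicate
# marks at least one valid lane and exits when no lanes are valid.
vectors    = [[1, 2, 3, 4], [5, 6, 0, 0], [0, 0, 0, 0]]
predicates = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]

total = 0
for vec, pred in zip(vectors, predicates):
    if not any(pred):            # no valid elements: exit the loop
        break
    # operate only on the lanes the predicate marks valid
    total += sum(v for v, p in zip(vec, pred) if p)
```

This lets a vectorized loop handle a trailing partial vector without a scalar epilogue: the final, all-invalid predicate doubles as the termination signal.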
-
Patent number: 12153932
Abstract: Examples include techniques for an in-network acceleration of a parallel prefix-scan operation. Examples include configuring registers of a node included in a plurality of nodes on a same semiconductor package. The registers to be configured responsive to receiving an instruction that indicates a logical tree to map to a network topology that includes the node. The instruction associated with a prefix-scan operation to be executed by at least a portion of the plurality of nodes.
Type: Grant
Filed: December 21, 2020
Date of Patent: November 26, 2024
Assignee: Intel Corporation
Inventors: Ankit More, Fabrizio Petrini, Robert Pawlowski, Shruti Sharma, Sowmya Pitchaimoorthy
-
Patent number: 12149565
Abstract: A distributed computing cluster includes first, second, and third pluralities of computer systems. A first computer of the first plurality applies a first transformation pipeline to a stream of data to generate output data, and transmits the output data to a computer of the second plurality, which is distinct from the first plurality. A second computer of the second plurality applies a second transformation pipeline. The second transformation pipeline includes a first storage transformation. A third computer of the third plurality stores a representation of a distributed computational graph (DCG), which includes a representation of a portion of the second transformation pipeline. The third computer processes the representation of the DCG, and determines whether the second transformation pipeline includes a storage transformation. The third computer monitors the second transformation pipeline, and in response, causes a fourth computer of the third plurality to apply a second storage transformation.
Type: Grant
Filed: July 21, 2024
Date of Patent: November 19, 2024
Assignee: QOMPLX LLC
Inventors: Jason Crabtree, Andrew Sellers
-
Patent number: 12143425
Abstract: A system for predictive analysis of very large data sets using a distributed computational graph has been developed. Data receipt software receives streaming data from one or more sources. In a batch data pathway, data formalization software formats input data for storage. A batch event analysis server inspects stored data for trends, situations, or knowledge. Aggregated data is passed to message handler software. System sanity software receives status information from message handler and optimizes system performance. In the streaming pathway, transformation pipeline software manipulates the data stream, provides results back to the system, receives directives from the system sanity and retrain software.
Type: Grant
Filed: July 21, 2024
Date of Patent: November 12, 2024
Assignee: QOMPLX LLC
Inventors: Jason Crabtree, Andrew Sellers
-
Patent number: 12143424
Abstract: A system for predictive analysis of very large data sets using a distributed computational graph has been developed. Data receipt software receives streaming data from one or more sources. In a batch data pathway, data formalization software formats input data for storage. A batch event analysis server inspects stored data for trends, situations, or knowledge. Aggregated data is passed to message handler software. System sanity software receives status information from message handler and optimizes system performance. In the streaming pathway, transformation pipeline software manipulates the data stream, provides results back to the system, receives directives from the system sanity and retrain software.
Type: Grant
Filed: July 21, 2024
Date of Patent: November 12, 2024
Assignee: QOMPLX LLC
Inventors: Jason Crabtree, Andrew Sellers
-
Patent number: 12137123
Abstract: A system for predictive analysis of very large data sets using a distributed computational graph has been developed. Data receipt software receives streaming data from one or more sources. In a batch data pathway, data formalization software formats input data for storage. A batch event analysis server inspects stored data for trends, situations, or knowledge. Aggregated data is passed to message handler software. System sanity software receives status information from message handler and optimizes system performance. In the streaming pathway, transformation pipeline software manipulates the data stream, provides results back to the system, receives directives from the system sanity and retrain software.
Type: Grant
Filed: July 21, 2024
Date of Patent: November 5, 2024
Assignee: QOMPLX LLC
Inventors: Jason Crabtree, Andrew Sellers
-
Patent number: 12124368
Abstract: A storage device includes a non-volatile memory including a plurality of memory blocks. The storage device performs an alignment operation in response to receipt of an align command. The alignment operation converts a received logical address of a logical segment into a physical address and allocates the physical address to a physical block address corresponding to a free block. The storage device is further configured to perform garbage collection in units of the physical block address that indicates one memory block.
Type: Grant
Filed: May 10, 2022
Date of Patent: October 22, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Byoung-Geun Kim, In-Hwan Doh, Joo-Young Hwang, Seung-Uk Shin, Min-Seok Ko, Jae-Yoon Choi
-
Patent number: 12112168
Abstract: A method for processing multiple transactions converted from a single transaction related to each of a plurality of threads is provided. The method is performed by a processor including at least one core and includes: converting a first transaction related to at least one of the plurality of threads into a plurality of second transactions; transferring, by a load-store unit (LSU) of the core, the plurality of second transactions to a subordinate or a cache; receiving, by the LSU, a plurality of data units related to the second transactions from the subordinate or the cache; and merging, by the LSU, the plurality of data units, in which the LSU is configured to further transfer interleaving deactivation information that causes the subordinate or the cache to deactivate interleaving.
Type: Grant
Filed: April 24, 2024
Date of Patent: October 8, 2024
Assignee: MetisX CO., Ltd.
Inventors: Kwang Sun Lee, Do Hun Kim, Kee Bum Shin
-
Patent number: 12039363
Abstract: An example method may include responsive to receiving, by a processing device, an interrupt deferral instruction requesting that interrupts be deferred, disabling delivery of interrupts by the processing device, receiving one or more interrupt requests subsequent to disabling delivery of interrupts, and responsive to determining that a deferral termination criterion is satisfied, delivering one or more interrupts, wherein each of the one or more interrupts is specified by a respective one of the interrupt requests. The method may further include receiving a resume interrupt delivery instruction requesting that deferred and subsequent interrupts be delivered, wherein the deferral termination criterion is satisfied in response to receiving the resume interrupt delivery instruction. The method may further include, responsive to receiving the resume interrupt delivery instruction, enabling delivery of the one or more interrupts and subsequent interrupts by the processing device.
Type: Grant
Filed: June 29, 2022
Date of Patent: July 16, 2024
Assignee: Red Hat, Inc.
Inventor: Michael Tsirkin
-
Patent number: 12020064
Abstract: Devices and techniques to reschedule a memory request that has failed when a thread is executing in a processor are described herein. When a memory request for a thread is denied at a point in the execution pipeline of the processor beyond a thread rescheduling point, the thread can be placed into a memory response path of the processor. An indicator that a register write-back will not occur for the thread can also be provided. Then, the thread can be rescheduled with other threads in the memory response path.
Type: Grant
Filed: October 20, 2020
Date of Patent: June 25, 2024
Assignee: Micron Technology, Inc.
Inventors: Chris Baronne, Dean E. Walker, John Amelio
-
Patent number: 11948655
Abstract: Methods, systems, and devices for indicating a blocked repair operation are described. A first indication of whether an address of a memory device is valid may be stored. After the first indication is stored, a command for accessing the address may be processed. Based on processing the command, a second indication of whether the address is valid may be obtained, and a determination of whether to perform or prevent a repair operation for repairing the address may be made based on the first indication and the second indication. A third indication of whether the repair operation was performed or prevented may be stored.
Type: Grant
Filed: April 21, 2022
Date of Patent: April 2, 2024
Assignee: Micron Technology, Inc.
Inventors: Seth A. Eichmeyer, Christopher G. Wieduwilt, Matthew D. Jenkinson, Matthew A. Prather
-
Patent number: 11922055
Abstract: Apparatus and method for managing data in a processing system, such as but not limited to a data storage device such as a solid-state drive (SSD). A ferroelectric stack register memory has a first arrangement of ferroelectric memory cells (FMEs) of a first construction and a second arrangement of FMEs of a different, second construction arranged to provide respective cache lines for use by a controller, such as a programmable processor. A pointer mechanism is configured to provide pointers to point to each of the respective cache lines based on a time sequence of operation of the processor. Data sets can be migrated to the different arrangements by the controller as required based on the different operational characteristics of the respective FME constructions. The FMEs may be non-volatile and read-destructive. Refresh circuitry can be selectively enacted under different operational modes.
Type: Grant
Filed: April 27, 2022
Date of Patent: March 5, 2024
Assignee: SEAGATE TECHNOLOGY LLC
Inventors: Jon D. Trantham, Praveen Viraraghavan, John W. Dykes, Ian J. Gilbert, Sangita Shreedharan Kalarickal, Matthew J. Totin, Mohamad El-Batal, Darshana H. Mehta
-
Patent number: 11915045
Abstract: In at least some embodiments, a store-type operation is received and buffered within a store queue entry of a store queue associated with a cache memory of a processor core capable of executing multiple simultaneous hardware threads. A thread identifier indicating a particular hardware thread among the multiple hardware threads that issued the store-type operation is recorded. An indication of whether the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the hardware thread is also maintained. While the indication indicates the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the particular hardware thread, the store queue extends a duration of a store gathering window applicable to the store queue entry. For example, the duration may be extended by decreasing a rate at which the store gathering window applicable to the store queue entry ends.
Type: Grant
Filed: June 18, 2021
Date of Patent: February 27, 2024
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
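The window-extension idea — the newest entry's gathering window counts down more slowly — can be sketched numerically. The initial window of 8 cycles and the half-rate decrement are assumptions chosen to make the effect visible:

```python
# Sketch of store gathering with an extensible window: the most recently
# allocated entry for a thread decrements its window at half rate, so it
# stays open to gather further stores. Window size and rates are assumed.
class StoreQueueEntry:
    def __init__(self, thread_id, window=8):
        self.thread_id = thread_id
        self.window = window        # cycles until gathering closes
        self.most_recent = True     # most recently allocated for thread

    def tick(self):
        # half-rate countdown while most recent: the window lasts longer
        # for the entry still likely to receive gatherable stores
        self.window -= 0.5 if self.most_recent else 1

entry = StoreQueueEntry(thread_id=0)
for _ in range(8):
    entry.tick()                       # 8 cycles at half rate
remaining_while_recent = entry.window  # window only half consumed
entry.most_recent = False              # a newer entry was allocated
for _ in range(4):
    entry.tick()                       # full-rate countdown now
```

Gathering adjacent stores into one cache write reduces write-port pressure; extending the window only for the newest entry targets the extension at the entry where the next store is most likely to land.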
-
Patent number: 11907124
Abstract: Aspects include using a shadow copy of a level 1 (L1) cache in a cache hierarchy. A method includes maintaining the shadow copy of the L1 cache in the cache hierarchy. The maintaining includes updating the shadow copy of the L1 cache with memory content changes to the L1 cache a number of pipeline cycles after the L1 cache is updated with the memory content changes.
Type: Grant
Filed: March 31, 2022
Date of Patent: February 20, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Yair Fried, Aaron Tsai, Eyal Naor, Christian Jacobi, Timothy Bronson, Chung-Lung K. Shum
-
Patent number: 11893393
Abstract: A microprocessor system comprises a computational array and a hardware arbiter. The computational array includes a plurality of computation units. Each of the plurality of computation units operates on a corresponding value addressed from memory. The hardware arbiter is configured to control issuing of at least one memory request for one or more of the corresponding values addressed from the memory for the computation units. The hardware arbiter is also configured to schedule a control signal to be issued based on the issuing of the memory requests.
Type: Grant
Filed: October 22, 2021
Date of Patent: February 6, 2024
Assignee: Tesla, Inc.
Inventors: Emil Talpes, Peter Joseph Bannon, Kevin Altair Hurd
-
Patent number: 11829643
Abstract: A memory controller system (and method of pre-scheduling memory transactions) for a storage device comprising a linked-list controller; a plurality of command buffers to store read commands or write commands; and an arbiter to issue commands. Each command buffer contains variables set by the linked-list controller. The linked-list controller is configured to execute commands in sequence independent of logical command buffer sequence. The command buffer is configured to support read commands with a maximum number of write commands. The linked-list controller is configured to merge multiple write commands that are going to the same address, and to snarf read commands from write commands if both commands are going to the same address; the read commands that are snarfed are loaded into a separate command buffer. The variables contained in each of the command buffers indicate status and dependency of the command buffer to create a link forming a command sequence.
Type: Grant
Filed: December 27, 2021
Date of Patent: November 28, 2023
Assignee: SKYECHIP SDN BHD
Inventors: Chee Hak Teh, Yu Ying Ong, Weng Li Leow, Muhamad Aidil Bin Jazmi
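Two of the controller's optimizations — merging same-address writes and letting reads "snarf" data from pending writes — can be sketched with simple dictionaries. These structures are illustrative; they are not the patent's linked-list buffer layout:

```python
# Sketch of write merging and read snarfing: a write to an address already
# pending merges into it (last data wins), and a read to a pending write's
# address is satisfied directly from that write, never reaching memory.
pending_writes = {}   # address -> data of the newest pending write
snarfed_reads = []    # reads satisfied directly from pending writes

def issue_write(addr, data):
    pending_writes[addr] = data   # same-address writes merge: last wins

def issue_read(addr):
    if addr in pending_writes:    # snarf from the matching pending write
        snarfed_reads.append((addr, pending_writes[addr]))
        return pending_writes[addr]
    return None                   # would go to the memory device here

issue_write(0x100, b"old")
issue_write(0x100, b"new")        # merged with the earlier write
value = issue_read(0x100)         # snarfed, never reaches memory
```

Both tricks trade a little bookkeeping for fewer actual memory transactions, which is the point of pre-scheduling commands in buffers before the arbiter issues them.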
-
Patent number: 11803390
Abstract: There is provided an apparatus, method and medium. The apparatus comprises processing circuitry to perform data processing in response to decoded instructions and prediction circuitry to generate a prediction of a number of iterations of a fetching process. The fetching process is used to control fetching of data or instructions to be used in processing operations that are predicted to be performed by the processing circuitry. The processing circuitry is configured to tolerate performing one or more unnecessary iterations of the fetching process following an over-prediction of the number of iterations and, for at least one prediction, to determine a class of a plurality of prediction classes, each of which corresponds to a range of numbers of iterations. The prediction circuitry is also arranged to signal a predetermined number of iterations associated with the class to the processing circuitry to trigger at least the predetermined number of iterations of the fetching process.
Type: Grant
Filed: July 1, 2022
Date of Patent: October 31, 2023
Assignee: Arm Limited
Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Thibaut Elie Lanois
-
Patent number: 11789642
Abstract: A dispatch element interfaces with a host processor and dispatches threads to one or more tiles of a hybrid threading fabric. Data structures in memory to be used by a tile may be identified by a starting address and a size, included as parameters provided by the host. The dispatch element sends a command to a memory interface to transfer the identified data to the tile that will use the data. Thus, when the tile begins processing the thread, the data is already available in local memory of the tile and does not need to be accessed from the memory controller. Data may be transferred by the dispatch element while the tile is performing operations for another thread, increasing the percentage of operations performed by the tile that are performing useful work and reducing the percentage that are merely retrieving data.
Type: Grant
Filed: June 28, 2021
Date of Patent: October 17, 2023
Assignee: Micron Technology, Inc.
Inventors: Douglas Vanesko, Bryan Hornung, Tony M. Brewer
-
Patent number: 11789836
Abstract: A system to implement debugging for a multi-threaded processor is provided. The system includes a hardware thread scheduler configured to schedule processing of data, and a plurality of schedulers, each configured to schedule a given pipeline for processing instructions. The system further includes a debug control configured to control at least one of the plurality of schedulers to halt, step, or resume the given pipeline of the at least one of the plurality of schedulers for the data to enable debugging thereof. The system further includes a plurality of hardware accelerators configured to implement a series of tasks in accordance with a schedule provided by a respective scheduler in accordance with a command from the debug control. Each of the plurality of hardware accelerators is coupled to at least one of the plurality of schedulers to execute the instructions for the given pipeline and to a shared memory.
Type: Grant
Filed: August 31, 2021
Date of Patent: October 17, 2023
Assignee: Texas Instruments Incorporated
Inventors: Niraj Nandan, Hetul Sanghvi, Mihir Mody, Gary Cooper, Anthony Lell
-
Patent number: 11755331
Abstract: A processor includes a processing pipeline, a plurality of result-storage elements, and writeback logic. The processing pipeline is configured to process program operations and to write, to a result storage, up to a predefined maximal number of results of the processed program operations per clock cycle. The result-storage elements are configured to store respective ones of the results. The writeback logic is configured to (i) detect a writeback conflict event in which the processing pipeline produces simultaneous results that exceed the predefined maximal number of results, for writing to the result storage, in a same clock cycle, (ii) in response to detecting the writeback conflict event, to temporarily store at least a given result, from among the simultaneous results, in a given result-storage element, and (iii) to subsequently write the temporarily-stored given result from the given result-storage element to the result storage.
Type: Grant
Filed: July 11, 2021
Date of Patent: September 12, 2023
Assignee: APPLE INC.
Inventors: Skanda K Srinivasa, Christopher S Thomas
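The conflict-handling scheme can be sketched as a per-cycle drain with a side buffer. The write limit of 2 per cycle is an assumption; the patent only requires some predefined maximum:

```python
# Sketch of writeback conflict handling: at most MAX_WRITES results reach
# the result storage per cycle; excess results are parked in side
# result-storage elements and drained on later cycles. Limit is assumed.
MAX_WRITES = 2

result_storage = []    # the architected result storage
parked = []            # temporary result-storage elements

def writeback_cycle(produced):
    # previously parked results drain first, then this cycle's results
    queue = parked + produced
    written, overflow = queue[:MAX_WRITES], queue[MAX_WRITES:]
    result_storage.extend(written)
    parked[:] = overflow    # conflict: hold the excess for later cycles

writeback_cycle(["r1", "r2", "r3"])   # 3 results, only 2 write ports
writeback_cycle([])                    # parked "r3" drains this cycle
```

The alternative to parking results is stalling the pipeline until a write port frees up; a small set of result-storage elements absorbs the burst instead, keeping the pipeline moving.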
-
Patent number: 11755328
Abstract: In an embodiment, a processor includes a buffer in an interface unit. The buffer may be used to accumulate coprocessor instructions to be transmitted to a coprocessor. In an embodiment, the processor issues the coprocessor instructions to the buffer when ready to be issued to the coprocessor. The interface unit may accumulate the coprocessor instructions in the buffer, generating a bundle of instructions. The bundle may be closed based on various predetermined conditions and then the bundle may be transmitted to the coprocessor. If a sequence of coprocessor instructions appears consecutively in a program, the rate at which the instructions are provided to the coprocessor (on average) at least matches the rate at which the coprocessor consumes the instructions, in an embodiment.
Type: Grant
Filed: November 16, 2021
Date of Patent: September 12, 2023
Assignee: Apple Inc.
Inventors: Aditya Kesiraju, Brett S. Feero, Nikhil Gupta, Viney Gautam
-
Patent number: 11748270
Abstract: In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.
Type: Grant
Filed: November 21, 2022
Date of Patent: September 5, 2023
Assignee: Texas Instruments Incorporated
Inventors: Duc Quang Bui, Joseph Raymond Michael Zbiciak
-
Patent number: 11748108
Abstract: Example embodiments of the present application provide an instruction executing method and apparatus, an electronic device, and a computer-readable storage medium that may be applied in the field of artificial intelligence. The instruction executing method may include: executing an instruction sequence that includes memory instructions and non-memory instructions, the instructions in the sequence starting to be executed in order; determining that execution of a first memory instruction needs to be completed before a second memory instruction starts to be executed, the second memory instruction being a next memory instruction following the first memory instruction in the instruction sequence; and executing non-memory instructions between the first memory instruction and the second memory instruction without executing the second memory instruction, during a cycle of executing the first memory instruction.
Type: Grant
Filed: March 24, 2021
Date of Patent: September 5, 2023
Assignees: Beijing Baidu Netcom Science and Technology Co., LTD.; Kunlunxin Technology (Beijing) Company Limited
Inventors: Yingnan Xu, Jian Ouyang, Xueliang Du, Kang An
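The overlap the abstract describes can be shown with a toy event trace: non-memory instructions execute while the first memory instruction is outstanding, and the second memory instruction waits for the first to complete. The four-instruction program and the single-outstanding-request limit are assumptions for illustration:

```python
# Sketch of overlapping non-memory work with an outstanding memory
# instruction: the next memory instruction must wait for the first to
# complete, but independent ALU work in between proceeds immediately.
program = [("load", "A"), ("add", "B"), ("mul", "C"), ("store", "D")]

trace = []
outstanding = None           # at most one memory instruction in flight
for op, name in program:
    if op in ("load", "store"):
        if outstanding:      # second memory op: first must complete now
            trace.append(("complete", outstanding))
        trace.append(("issue", name))
        outstanding = name
    else:
        trace.append(("execute", name))  # overlaps the pending memory op
```

In the trace, B and C execute between A's issue and A's completion, which is exactly the latency-hiding the method aims at.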
-
Patent number: 11734919
Abstract: A flexible computer architecture for performing digital image analysis is described herein. In some examples, the computer architecture can include a distributed messaging platform (DMP) for receiving images from cameras and storing the images in a first queue. The computer architecture can also include a first container for receiving the images from the first queue, applying an image analysis model to the images, and transmitting the image analysis result to the DMP for storage in a second queue. Additionally, the computer architecture can include a second container for receiving the image analysis result from the second queue, performing a post-processing operation on the image analysis result, and transmitting the post-processing result to the DMP for storage in a third queue. The computer architecture can further include an output container for receiving the post-processing result from the third queue and generating an alert notification based on the post-processing result.
Type: Grant
Filed: November 16, 2022
Date of Patent: August 22, 2023
Assignee: SAS Institute, Inc.
Inventors: Daniele Cazzari, Hardi Desai, Allen Joseph Langlois, Jonathan Walker, Thomas Tuning, Saurabh Mishra, Varunraj Valsaraj
-
Patent number: 11720619
Abstract: Data processing apparatuses, methods and computer programs are disclosed. A range definition register is arranged to store a range specifier and filtering operations are performed with respect to a specified transaction by reference to the range definition register. The range definition register stores the range specifier in a format comprising a significand and an exponent, wherein a range of data identifiers is at least partially defined by the range specifier. When the specified transaction is with respect to a data identifier within the range of data identifiers, the filtering operations performed are dependent on attribute data associated with the range of data identifiers.
Type: Grant
Filed: November 16, 2020
Date of Patent: August 8, 2023
Assignee: Arm Limited
Inventors: François Christopher Jacques Botman, Thomas Christopher Grocutt, Bradley John Smith
-
Patent number: 11714608
Abstract: A processing device used in a bus, for executing a programming language function of a central processing unit (CPU), comprises a receiving circuit, for receiving a joint command from the CPU, to assist the CPU to execute the programming language function, wherein the joint command comprises an extended read command and an extended write command; a transmitting circuit, coupled to the receiving circuit, for transmitting the extended read command to a slave device, to receive a first response message via the receiving circuit in response to the extended read command from the slave device, wherein the first response message comprises at least one data read by the slave device from a memory block; and a writing circuit, coupled to the receiving circuit and transmitting circuit, for writing the at least one data into a destination address corresponding to the programming language function according to the extended write command.
Type: Grant
Filed: January 10, 2022
Date of Patent: August 1, 2023
Assignee: Realtek Semiconductor Corp.
Inventor: Yuefeng Chen
-
Patent number: 11675943
Abstract: Embodiments are directed towards a method to create a reconfigurable interconnect framework in an integrated circuit. The method includes accessing a configuration template directed toward the reconfigurable interconnect framework, editing parameters of the configuration template, functionally combining the configuration template with a plurality of modules from an IP library to produce a register transfer level (RTL) circuit model, generating at least one automated test-bench function, and generating at least one logic synthesis script. Editing parameters of the configuration template includes confirming a first number of output ports of a reconfigurable stream switch and confirming a second number of input ports of the reconfigurable stream switch. Each output port and each input port has a respective architectural composition. The output port architectural composition is defined by a plurality of N data paths including A data outputs and B control outputs.
Type: Grant
Filed: November 10, 2020
Date of Patent: June 13, 2023
Assignees: STMICROELECTRONICS S.r.l.; STMICROELECTRONICS INTERNATIONAL N.V.
Inventors: Thomas Boesch, Giuseppe Desoli
-
Patent number: 11620222
Abstract: A method for performing an atomic memory operation may include receiving an atomic input, receiving an address for an atomic memory location, and performing an atomic operation on the atomic memory location based on the atomic input, wherein performing the atomic operation may include performing a first operation on a first portion of the atomic input, and performing a second operation, which may be different from the first operation, on a second portion of the atomic input. The method may further include storing a result of the first operation in a first portion of the atomic memory location, and storing a result of the second operation in a second portion of the atomic memory location. The method may further include returning an original content of the first portion of the atomic memory location concatenated with an original content of the second portion of the atomic memory location.
Type: Grant
Filed: October 30, 2020
Date of Patent: April 4, 2023
Inventors: David C. Tannenbaum, Raun M. Krisch, Christopher P. Frascati
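The split-operation idea can be sketched in a few lines. This is an illustrative model, not the claimed hardware: the choice of ADD for the low half and MAX for the high half, the 16-bit split, and the dict-as-memory are all assumptions for demonstration.

```python
MASK16 = 0xFFFF

def split_atomic(memory, addr, atomic_input):
    """Apply ADD to the low 16 bits and MAX to the high 16 bits of one
    32-bit memory word, returning the original packed contents."""
    original = memory[addr]
    old_lo, old_hi = original & MASK16, (original >> 16) & MASK16
    in_lo, in_hi = atomic_input & MASK16, (atomic_input >> 16) & MASK16

    new_lo = (old_lo + in_lo) & MASK16   # first operation on first portion
    new_hi = max(old_hi, in_hi)          # second, different operation
    memory[addr] = (new_hi << 16) | new_lo

    # Return the original content of both portions, concatenated.
    return (old_hi << 16) | old_lo

mem = {0x100: (5 << 16) | 10}            # high half 5, low half 10
old = split_atomic(mem, 0x100, (3 << 16) | 7)
```

In real hardware the whole read-modify-write would execute as a single indivisible memory transaction; the return value mirrors the usual fetch-and-op convention of handing back the pre-operation contents.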
-
Patent number: 11614942
Abstract: Devices and techniques for short-thread rescheduling in a processor are described herein. When an instruction for a thread completes, a result is produced. The condition that the same thread is scheduled in a next execution slot and that the next instruction of the thread will use the result can be detected. In response to this condition, the result can be provided directly to an execution unit for the next instruction.
Type: Grant
Filed: October 20, 2020
Date of Patent: March 28, 2023
Assignee: Micron Technology, Inc.
Inventors: Christopher Baronne, Dean E. Walker
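The forwarding condition above can be modeled as a small scheduler loop. This is a behavioral sketch under assumed semantics (micro-ops as tuples, an ALU stood in by `+ 1`); none of it comes from the patent itself.

```python
def execute(slots, regfile):
    """slots: list of (thread_id, dest_reg, src_reg) micro-ops in issue order.
    Returns how many operands were forwarded without a register-file read."""
    bypassed = 0
    last = None  # (thread_id, dest_reg, value) of the just-completed op
    for tid, dest, src in slots:
        # Detect: same thread in the next slot AND its next instruction
        # reads the register that was just produced.
        if last and last[0] == tid and last[1] == src:
            operand = last[2]          # forward result straight to the ALU
            bypassed += 1
        else:
            operand = regfile.get(src, 0)  # normal register-file read
        value = operand + 1            # stand-in for the actual ALU work
        regfile[dest] = value
        last = (tid, dest, value)
    return bypassed

regs = {"r1": 41}
n = execute([("t0", "r2", "r1"),       # t0 produces r2
             ("t0", "r3", "r2"),       # same thread, consumes r2 -> bypass
             ("t1", "r4", "r2")],      # different thread -> regfile read
            regs)
```

Only the middle op meets both conditions, so exactly one operand bypasses the register file.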
-
Patent number: 11599625
Abstract: Methods, systems, and devices for techniques for instruction perturbation for improved device security are described. A device may assign a set of executable instructions to an instruction packet based on a parameter associated with the instruction packet, and each executable instruction of the set of executable instructions may be independent from other executable instructions of the set of executable instructions. The device may select an order of the set of executable instructions based on a slot instruction rule associated with the device, and each executable instruction of the set of executable instructions may correspond to a respective slot associated with memory of the device. The device may modify the order of the set of executable instructions in a memory hierarchy post pre-decode based on the slot instruction rule and process the set of executable instructions of the instruction packet based on the modified order.
Type: Grant
Filed: January 28, 2021
Date of Patent: March 7, 2023
Assignee: QUALCOMM Incorporated
Inventors: Arvind Krishnaswamy, Suresh Kumar Venkumahanti, Charles Tabony
-
Patent number: 11593024
Abstract: A request can be provided from a front-end of a memory sub-system to a processing device of the memory sub-system, and the request can be deleted from a buffer of the front-end of the memory sub-system. Responsive to deleting the request from the buffer, a first quantity of requests in the buffer is determined, and a second quantity of outstanding requests in the back-end of the memory sub-system is determined. Responsive to deleting the request from the buffer and providing the request to the processing device, whether to provide a response to a host is determined based on a comparison of the second quantity of outstanding requests to a threshold, wherein the response includes an indication of the quantity of requests in the buffer and of outstanding requests in the back-end of the memory sub-system.
Type: Grant
Filed: August 30, 2021
Date of Patent: February 28, 2023
Assignee: Micron Technology, Inc.
Inventor: Laurent Isenegger
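A rough sketch of the reporting decision the abstract describes: after forwarding a request out of the front-end buffer, report buffer depth and back-end occupancy to the host only when occupancy crosses a threshold. The class shape and threshold semantics are assumptions for illustration.

```python
from collections import deque

class FrontEnd:
    def __init__(self, threshold):
        self.buffer = deque()
        self.outstanding = 0          # requests in flight in the back-end
        self.threshold = threshold

    def submit(self, request):
        self.buffer.append(request)

    def forward(self):
        """Delete one request from the buffer, hand it to the back-end,
        and decide whether to send a status response to the host."""
        self.buffer.popleft()         # request now belongs to the back-end
        self.outstanding += 1
        if self.outstanding > self.threshold:
            return {"buffered": len(self.buffer),
                    "outstanding": self.outstanding}
        return None                   # no response needed yet

fe = FrontEnd(threshold=1)
fe.submit("read-A"); fe.submit("read-B")
first = fe.forward()    # outstanding becomes 1: at the threshold, stay quiet
second = fe.forward()   # outstanding becomes 2: above threshold, report
```

The effect is a form of backpressure signaling: the host only pays for a status response when the back-end is actually getting congested.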
-
Patent number: 11586430
Abstract: Methods and apparatus for distribution and execution of instructions in a distributed computing environment are disclosed. An example apparatus includes memory; first instructions; and processor circuitry to execute the first instructions to manage an instruction queue. The instruction queue includes indications of second instructions to be executed at a component server. The processor circuitry is to add a first indication of a corresponding one of the second instructions to the instruction queue. The first indication is to identify: (1) a location of the second instruction and (2) a format of the second instruction. In response to a second indication that the second instruction has been executed, the processor circuitry is to remove the first indication from the instruction queue.
Type: Grant
Filed: October 25, 2021
Date of Patent: February 21, 2023
Assignee: VMware, Inc.
Inventors: Dimitar Ivanov, Martin Draganchev, Bryan Paul Halter, Nikola Atanasov, James Harrison
-
Patent number: 11573796
Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.
Type: Grant
Filed: August 11, 2021
Date of Patent: February 7, 2023
Assignee: Micron Technology, Inc.
Inventor: Tony M. Brewer
-
Patent number: 11531548
Abstract: Embodiments for fast perfect issue of dependent instructions in a distributed issue queue system. Producer information of a producer instruction is inserted in a lookup entry in a lookup table, the lookup entry being allocated to a register. It is determined that the register corresponding to the lookup entry is a source for a dependent instruction. Responsive to storing the dependent instruction in an issue queue, the producer information is stored in a back-to-back entry of a back-to-back wakeup table, the back-to-back entry corresponding to the dependent instruction. The producer instruction is issued which causes the producer information of the producer instruction to be sent to the back-to-back wakeup table. It is determined that there is a match between the producer information and the back-to-back entry for the dependent instruction, and the dependent instruction is caused to issue based on the match.
Type: Grant
Filed: June 25, 2021
Date of Patent: December 20, 2022
Assignee: International Business Machines Corporation
Inventors: Brian D. Barrick, Dung Q. Nguyen, Brian W. Thompto, Tu-An T. Nguyen, Salma Ayub
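The lookup-table-plus-wakeup-table flow can be modeled in miniature. This is a hypothetical software analogue, assuming instruction tags as strings and a single issue queue; the real design is distributed hardware.

```python
class IssueQueue:
    def __init__(self):
        self.lookup = {}      # dest register -> tag of in-flight producer
        self.b2b = {}         # dependent tag -> producer tag it waits on
        self.ready = []       # dependents woken and eligible to issue

    def insert(self, tag, dest, srcs):
        """Store an instruction; record a back-to-back entry if a source
        register is still being produced by an in-flight instruction."""
        for src in srcs:
            if src in self.lookup:
                self.b2b[tag] = self.lookup[src]   # back-to-back wakeup entry
        self.lookup[dest] = tag                    # this op now produces dest

    def issue(self, tag):
        """Producer issues: its producer information is broadcast to the
        wakeup table, and matching dependents are woken to issue next."""
        woken = [dep for dep, prod in self.b2b.items() if prod == tag]
        for dep in woken:
            del self.b2b[dep]
            self.ready.append(dep)
        return woken

iq = IssueQueue()
iq.insert("mul1", dest="r5", srcs=["r1"])
iq.insert("add1", dest="r6", srcs=["r5"])   # depends on mul1 via r5
woken = iq.issue("mul1")
```

The "perfect issue" payoff is that the dependent does not rescan its sources: its wakeup entry was precomputed at insert time, so a single tag match releases it.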
-
Patent number: 11494315
Abstract: An arbiter for use with a plurality of request signals is presented. The arbiter includes a sequence identifier to identify an order between the plurality of request signals. The arbiter provides a plurality of output signals in which each output signal is associated with a request signal. When the request signals are provided in a sequential order the output signals are provided in the identified sequential order. When the request signals are provided substantially at the same time the output signals are provided in an arbitrary sequential order. A corresponding signal arbitration method and an electronic circuit comprising the arbiter are also presented.
Type: Grant
Filed: April 20, 2021
Date of Patent: November 8, 2022
Assignee: Dialog Semiconductor B.V.
Inventor: Paulus Augustinus Joanna Janssens
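A simplified sketch of the claimed behavior, under the assumption that the sequence identifier amounts to a per-request arrival timestamp: distinct arrival times fix the grant order, while exact ties fall back to an order that is arbitrary from the requesters' point of view.

```python
def arbitrate(requests):
    """requests: list of (name, arrival_time) pairs. Returns grant order.
    A stable sort preserves arrival order for distinct times; requests
    arriving simultaneously keep their (arbitrary) list position."""
    return [name for name, _ in sorted(requests, key=lambda r: r[1])]

order = arbitrate([("req_b", 2),    # arrives second...
                   ("req_a", 1),    # ...after req_a
                   ("req_c", 2)])   # ties with req_b: arbitrary order
```

Here `req_a` is granted first because it arrived first, and the tied pair is granted in an order the requesters cannot rely on, matching the abstract's two cases.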
-
Patent number: 11467827
Abstract: A method for computing includes providing software source code defining a processing pipeline including multiple, sequential stages of parallel computations, in which a plurality of processors apply a computational task to data read from a buffer. A static code analysis is applied to the software source code so as to break the computational task into multiple, independent work units, and to define an index space in which the work units are identified by respective indexes. Based on the static code analysis, mapping parameters that define a mapping between the index space and addresses in the buffer are computed, indicating by the mapping the respective ranges of the data to which the work units are to be applied. The source code is compiled so that the processors execute the work units identified by the respective indexes while accessing the data in the buffer in accordance with the mapping.
Type: Grant
Filed: April 6, 2021
Date of Patent: October 11, 2022
Assignee: HABANA LABS LTD.
Inventors: Michael Zuckerman, Tzachi Cohen, Doron Singer, Ron Shalev, Amos Goldman
-
Patent number: 11372972
Abstract: The present disclosure is directed to systems and methods for detecting side-channel exploit attacks such as Spectre and Meltdown. Performance monitoring circuitry includes first counter circuitry to monitor CPU cache misses and second counter circuitry to monitor DTLB load misses. Upon detecting an excessive number of cache misses and/or load misses, the performance monitoring circuitry transfers the first and second counter circuitry data to control circuitry. The control circuitry determines a CPU cache miss to DTLB load miss ratio for each of a plurality of temporal intervals. The control circuitry then identifies, determines, and/or detects a pattern or trend in the CPU cache miss to DTLB load miss ratio. Upon detecting a deviation from the identified CPU cache miss to DTLB load miss ratio pattern or trend indicative of a potential side-channel exploit attack, the control circuitry generates an output to alert a system user or system administrator.
Type: Grant
Filed: December 27, 2018
Date of Patent: June 28, 2022
Assignee: Intel Corporation
Inventors: Paul Carlson, Rahuldeva Ghosh, Baiju Patel, Zhong Chen
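An illustrative sketch of the detection idea: compute the cache-miss to DTLB-load-miss ratio per interval and flag intervals that deviate sharply from the running baseline. The deviation factor and sample values are invented for demonstration; the patented circuitry works on hardware counter data.

```python
def detect_anomalies(samples, deviation_factor=3.0):
    """samples: list of (cache_misses, dtlb_load_misses) per time interval.
    Returns the indexes of intervals whose miss ratio spikes above the
    running-average baseline by more than deviation_factor."""
    alerts = []
    history = []
    for i, (cache, dtlb) in enumerate(samples):
        ratio = cache / max(dtlb, 1)       # guard against a zero divisor
        if history:
            baseline = sum(history) / len(history)
            if ratio > baseline * deviation_factor:
                alerts.append(i)           # possible Spectre/Meltdown probing
        history.append(ratio)
    return alerts

normal = [(100, 50), (120, 60), (110, 55)]   # steady ratio around 2.0
attack = [(5000, 40)]                        # many cache misses, few DTLB misses
flagged = detect_anomalies(normal + attack)
```

The intuition: Spectre/Meltdown-style probing hammers the cache with misses while staying inside a small working set of pages, so cache misses climb far faster than DTLB misses and the ratio breaks its trend.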
-
Patent number: 9830158
Abstract: One embodiment of the present invention sets forth a technique for speculatively issuing instructions to allow a processing pipeline to continue to process some instructions during rollback of other instructions. A scheduler circuit issues instructions for execution assuming that, several cycles later, when the instructions reach the multithreaded execution units, dependencies between the instructions will be resolved, resources will be available, operand data will be available, and other conditions will not prevent execution of the instructions. When a rollback condition exists at the point of execution for an instruction for a particular thread group, the instruction is not dispatched to the multithreaded execution units. However, other instructions issued by the scheduler circuit for execution by different thread groups, and for which a rollback condition does not exist, are executed by the multithreaded execution units.
Type: Grant
Filed: November 4, 2011
Date of Patent: November 28, 2017
Assignee: NVIDIA CORPORATION
Inventors: Jack Hilaire Choquette, Olivier Giroux, Robert J. Stoll, Xiaogang Qiu
-
Patent number: 8484421
Abstract: Embodiments of the present disclosure provide a system on a chip (SOC) comprising a processing core, and a cache including a cache instruction port, a cache data port, and a port utilization circuitry configured to selectively fetch instructions through the cache instruction port and selectively pre-fetch instructions through the cache data port. Other embodiments are also described and claimed.
Type: Grant
Filed: November 23, 2009
Date of Patent: July 9, 2013
Assignee: Marvell Israel (M.I.S.L) Ltd.
Inventors: Tarek Rohana, Adi Habusha, Gil Stoler