Patents by Inventor Samantika S. Sury

Samantika S. Sury has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Remote atomic operations in multi-socket systems

Patent number: 11537520

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.

Type: Grant

Filed: October 5, 2021

Date of Patent: December 27, 2022

Assignee: Intel Corporation

Inventors: Doddaballapur N. Jayasimha, Samantika S. Sury, Christopher J. Hughes, Jonas Svennebring, Yen-Cheng Liu, Stephen R. Van Doren, David A. Koufaty
Spatial and temporal merging of remote atomic operations

Patent number: 11500636

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.

Type: Grant

Filed: February 24, 2020

Date of Patent: November 15, 2022

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson
ADAPTIVE REMOTE ATOMICS

Publication number: 20220206945

Abstract: Disclosed embodiments relate to atomic memory operations. In one example, an apparatus includes multiple processor cores, a cache hierarchy, a local execution unit, and a remote execution unit, and an adaptive remote atomic operation unit. The cache hierarchy includes a local cache at a first level and a shared cache at a second level. The local execution unit is to perform an atomic operation at the first level if the local cache is a storing a cache line including data for the atomic operation. The remote execution unit is to perform the atomic operation at the second level. The adaptive remote atomic operation unit is to determine whether to perform the first atomic operation at the first level or at the second level and whether to copy the cache line from the shared cache to the local cache.

Type: Application

Filed: December 25, 2020

Publication date: June 30, 2022

Applicant: Intel Corporation

Inventors: Carl J. Beckmann, Samantika S. Sury, Christopher J. Hughes, Lingxiang Xiang, Rahul Agrawal
CACHE PROBE TRANSACTION FILTERING

Publication number: 20220107897

Abstract: Examples described herein relate to circuitry to selectively disable cache snoop operations issued by a particular processor or its cache manager based on data in a memory address range, to be accessed by the particular processor, having been flushed from one or more other cache devices accessible to other processors. At or after completion of flushing or scrubbing data in the memory address range to memory, the particular processor or its cache manager do not issue snoop operations for accesses to the memory address range. In response to an access by some other device to the memory address range, the processor or cache manager may resume issuing snoop operations.

Type: Application

Filed: December 15, 2021

Publication date: April 7, 2022

Inventors: David KEPPEL, Swapna RAJ, Kermin CHOFLEMING, Samantika S. SURY
REMOTE ATOMIC OPERATIONS IN MULTI-SOCKET SYSTEMS

Publication number: 20220091983

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.

Type: Application

Filed: October 5, 2021

Publication date: March 24, 2022

Inventors: Doddaballapur N. Jayasimha, Samantika S. Sury, Christopher J. Hughes, Jonas Svennebring, Yen-Cheng Liu, Stephen R. Van Doren, David A. Koufaty
Remote atomic operations in multi-socket systems

Patent number: 11138112

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.

Type: Grant

Filed: April 11, 2019

Date of Patent: October 5, 2021

Assignee: Intel Corporation

Inventors: Doddaballapur N. Jayasimha, Samantika S. Sury, Christopher J. Hughes, Jonas Svennebring, Yen-Cheng Liu, Stephen R. Van Doren, David A. Koufaty
TECHNIQUES FOR NEAR DATA ACCELERATION FOR A MULTI-CORE ARCHITECTURE

Publication number: 20210224213

Abstract: Examples include techniques for near data acceleration for a multi-core architecture. A near data processor included in a memory controller of a processor may access data maintained in a memory device coupled with the near data processor via one or more memory channels responsive to a work request to execute a kernel, an application or a loop routine using the accessed data to generate values. The near data processor provides an indication to the requestor of the work request that values have been generated.

Type: Application

Filed: March 19, 2021

Publication date: July 22, 2021

Inventors: Swapna RAJ, Samantika S. SURY, Kermin CHOFLEMING, Simon C. STEELY, JR.
Processors, methods, systems, and instructions to load multiple data elements to destination storage locations other than packed data registers

Patent number: 11068264

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate a packed data register of the plurality of packed data registers that is to store a source packed memory address information. The source packed memory address information is to include a plurality of memory address information data elements. An execution unit is coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the instruction, is to load a plurality of data elements from a plurality of memory addresses that are each to correspond to a different one of the plurality of memory address information data elements, and store the plurality of loaded data elements in a destination storage location. The destination storage location does not include a register of the plurality of packed data registers.

Type: Grant

Filed: August 9, 2019

Date of Patent: July 20, 2021

Assignee: Intel Corporation

Inventors: William C. Hasenplaugh, Chris J. Newburn, Simon C. Steely, Jr., Samantika S. Sury
SPATIAL AND TEMPORAL MERGING OF REMOTE ATOMIC OPERATIONS

Publication number: 20200319886

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.

Type: Application

Filed: February 24, 2020

Publication date: October 8, 2020

Applicant: Intel Corporation

Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson
SOFTWARE-TRANSPARENT HARDWARE PREDICTOR FOR CORE-TO-CORE DATA TRANSFER OPTIMIZATION

Publication number: 20200285578

Abstract: Apparatus, method, and system for implementing a software-transparent hardware predictor for core-to-core data communication optimization are described herein. An embodiment of the apparatus includes a plurality of hardware processor cores each including a private cache; a shared cache that is communicatively coupled to and shared by the plurality of hardware processor cores; and a predictor circuit. The predictor circuit is to track activities relating to a plurality of monitored cache lines in the private cache of a producer hardware processor core (producer core) and to enable a cache line push operation upon determining a target hardware processor core (target core) based on the tracked activities. An execution of the cache line push operation is to cause a plurality of unmonitored cache lines in the private cache of the producer core to be moved to the private cache of the target core.

Type: Application

Filed: March 18, 2020

Publication date: September 10, 2020

Applicant: Intel Corporation

Inventors: Ren Wang, Joseph Nuzman, Samantika S. Sury, Andrew J. Herdrich, Namakkal N. Venkatesan, Anil Vasudevan, Tsung-Yuan C. Tai, Niall D. McDonnell
PROGRAMMABLE ADDRESS RANGE ENGINE FOR LARGER REGION SIZES

Publication number: 20200233814

Abstract: Examples described herein relate to a computing system supporting custom page sized ranges for an application to map contiguous memory regions instead of many smaller sized pages. An application can request a custom range size. An operating system can allocate a contiguous physical memory region to a virtual address range by specifying a custom range sizes that are larger or smaller than the normal general page sizes. Virtual-to-physical address translation can occur using an address range circuitry and translation lookaside buffer in parallel. The address range circuitry can determine if a custom entry is available to use to identify a physical address translation for the virtual address. Physical address translation can be performed by transforming the virtual address in some examples.

Type: Application

Filed: February 10, 2020

Publication date: July 23, 2020

Inventors: Farah E. FARGO, Mitchell DIAMOND, David KEPPEL, Samantika S. SURY, Binh PHAM, Shobha VISSAPRAGADA
Software-transparent hardware predictor for core-to-core data transfer optimization

Patent number: 10635590

Abstract: Apparatus, method, and system for implementing a software-transparent hardware predictor for core-to-core data communication optimization are described herein. An embodiment of the apparatus includes a plurality of hardware processor cores each including a private cache; a shared cache that is communicatively coupled to and shared by the plurality of hardware processor cores; and a predictor circuit. The predictor circuit is to track activities relating to a plurality of monitored cache lines in the private cache of a producer hardware processor core (producer core) and to enable a cache line push operation upon determining a target hardware processor core (target core) based on the tracked activities. An execution of the cache line push operation is to cause a plurality of unmonitored cache lines in the private cache of the producer core to be moved to the private cache of the target core.

Type: Grant

Filed: September 29, 2017

Date of Patent: April 28, 2020

Assignee: Intel Corporation

Inventors: Ren Wang, Joseph Nuzman, Samantika S. Sury, Andrew J. Herdrich, Namakkal N. Venkatesan, Anil Vasudevan, Tsung-Yuan C. Tai, Niall D. McDonnell
Spatial and temporal merging of remote atomic operations

Patent number: 10572260

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.

Type: Grant

Filed: December 29, 2017

Date of Patent: February 25, 2020

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson
PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS TO LOAD MULTIPLE DATA ELEMENTS TO DESTINATION STORAGE LOCATIONS OTHER THAN PACKED DATA REGISTERS

Publication number: 20190384601

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate a packed data register of the plurality of packed data registers that is to store a source packed memory address information. The source packed memory address information is to include a plurality of memory address information data elements. An execution unit is coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the instruction, is to load a plurality of data elements from a plurality of memory addresses that are each to correspond to a different one of the plurality of memory address information data elements, and store the plurality of loaded data elements in a destination storage location. The destination storage location does not include a register of the plurality of packed data registers.

Type: Application

Filed: August 9, 2019

Publication date: December 19, 2019

Inventors: William C. Hasenplaugh, Chris J. Newburn, Simon C. Steely, JR., Samantika S. Sury
Processors, methods, and systems for a configurable spatial accelerator with transactional and replay features

Patent number: 10445234

Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In an embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an atomic operation when an incoming operand set arrives at the plurality of processing elements.

Type: Grant

Filed: July 1, 2017

Date of Patent: October 15, 2019

Assignee: Intel Corporation

Inventors: Kermin Fleming, Kent D. Glossop, Simon C. Steely, Jr., Samantika S. Sury
Synchronization logic for memory requests

Patent number: 10430252

Abstract: In an embodiment, a processor includes a plurality of cores and synchronization logic. The synchronization logic includes circuitry to: receive a first memory request and a second memory request; determine whether the second memory request is in contention with the first memory request; and in response to a determination that the second memory request is in contention with the first memory request, process the second memory request using a non-blocking cache coherence protocol. Other embodiments are described and claimed.

Type: Grant

Filed: November 15, 2018

Date of Patent: October 1, 2019

Assignee: Intel Corporation

Inventors: Samantika S. Sury, Robert G. Blankenship, Simon C. Steely, Jr.
Processors, methods, and systems for a configurable spatial accelerator with memory system performance, power reduction, and atomics support features

Patent number: 10387319

Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.

Type: Grant

Filed: July 1, 2017

Date of Patent: August 20, 2019

Assignee: Intel Corporation

Inventors: Michael C. Adler, Chiachen Chou, Neal C. Crago, Kermin Fleming, Kent D. Glossop, Aamer Jaleel, Pratik M. Marolia, Simon C. Steely, Jr., Samantika S. Sury
Processors, methods, systems, and instructions to load multiple data elements to destination storage locations other than packed data registers

Patent number: 10379855

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate a packed data register of the plurality of packed data registers that is to store a source packed memory address information. The source packed memory address information is to include a plurality of memory address information data elements. An execution unit is coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the instruction, is to load a plurality of data elements from a plurality of memory addresses that are each to correspond to a different one of the plurality of memory address information data elements, and store the plurality of loaded data elements in a destination storage location. The destination storage location does not include a register of the plurality of packed data registers.

Type: Grant

Filed: September 30, 2016

Date of Patent: August 13, 2019

Assignee: Intel Corporation

Inventors: William C. Hasenplaugh, Chris J. Newburn, Simon C. Steely, Jr., Samantika S. Sury
REMOTE ATOMIC OPERATIONS IN MULTI-SOCKET SYSTEMS

Publication number: 20190243761

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.

Type: Application

Filed: April 11, 2019

Publication date: August 8, 2019

Applicant: Intel Corporation

Inventors: Doddaballapur N. Jayasimha, Samantika S. Sury, Christopher J. Hughes, Jonas Svennebring, Yen-Cheng Liu, Stephen R. Van Doren, David A. Koufaty
SPATIAL AND TEMPORAL MERGING OF REMOTE ATOMIC OPERATIONS

Publication number: 20190205139

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.

Type: Application

Filed: December 29, 2017

Publication date: July 4, 2019

Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson

1 2 3 next