Patents by Inventor Robert Pawlowski
Robert Pawlowski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11960922
Abstract: In an embodiment, a processor comprises: an execution circuit to execute instructions; at least one cache memory coupled to the execution circuit; and a table storage element coupled to the at least one cache memory, the table storage element to store a plurality of entries each to store object metadata of an object used in a code sequence. The processor is to use the object metadata to provide user space multi-object transactional atomic operation of the code sequence. Other embodiments are described and claimed.
Type: Grant
Filed: September 24, 2020
Date of Patent: April 16, 2024
Assignee: Intel Corporation
Inventors: Joshua B. Fryman, Jason M. Howard, Ibrahim Hur, Robert Pawlowski
-
Publication number: 20240119015
Abstract: Systems, apparatuses and methods may provide for technology that detects a condition in which a plurality of atomic instructions target a common address and different bit positions in a mask, generates a combined read-lock request for the plurality of atomic instructions in response to the condition, and sends the combined read-lock request to a lock buffer coupled to a memory device associated with the common address.
Type: Application
Filed: August 30, 2023
Publication date: April 11, 2024
Inventors: Shruti Sharma, Robert Pawlowski
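The combining idea in this abstract can be illustrated in software: several bit-set operations aimed at the same word are coalesced into a single locked read-modify-write rather than one lock acquisition per operation. This is only an illustrative sketch under assumed semantics, not the patented hardware design; all names here are invented.

```python
import threading

class CombiningLockBuffer:
    """Coalesces bit-set updates targeting one address into a single locked RMW."""
    def __init__(self):
        self.memory = {}           # address -> integer word acting as a bitmask
        self.lock = threading.Lock()

    def atomic_set_bits(self, address, bit_positions):
        # Combine all requested bit positions into one mask first,
        # then take the lock once instead of once per atomic instruction.
        combined_mask = 0
        for bit in bit_positions:
            combined_mask |= 1 << bit
        with self.lock:
            self.memory[address] = self.memory.get(address, 0) | combined_mask
            return self.memory[address]

buf = CombiningLockBuffer()
buf.atomic_set_bits(0x40, [0, 3, 5])   # one lock acquisition covers three bit-set ops
```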
-
Publication number: 20240069921
Abstract: Technology described herein provides a dynamically reconfigurable processing core. The technology includes a plurality of pipelines comprising a core, where the core is reconfigurable into one of a plurality of core modes, a core network to provide inter-pipeline connections for the pipelines, and logic to receive a morph instruction including a target core mode from an application running on the core, determine a present core state for the core, and morph, based on the present core state, the core to the target core mode. In embodiments, to morph the core, the logic is to select, based on the target core mode, which inter-pipeline connections are active, where each pipeline includes at least one multiplexor via which the inter-pipeline connections are selected to be active. In embodiments, to morph the core, the logic is further to select, based on the target core mode, which memory access paths are active.
Type: Application
Filed: September 29, 2023
Publication date: February 29, 2024
Inventors: Scott Cline, Robert Pawlowski, Joshua Fryman, Ivan Ganev, Vincent Cave, Sebastian Szkoda, Fabio Checconi
-
Publication number: 20240045829
Abstract: Techniques for multi-dimensional network sorted array merging. A first switch of a plurality of switches of an apparatus may receive a first element of a first array and a first element of a second array. The first switch may determine that the first element of the first array is less than the first element of the second array. The first switch may cause the first element of the first array to be stored as a first element of an output array.
Type: Application
Filed: April 5, 2023
Publication date: February 8, 2024
Applicant: Intel Corporation
Inventors: Robert Pawlowski, Sriram Aananthakrishnan
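The switch behavior described in the abstract is the comparison step of a classic sorted merge: repeatedly compare the leading elements of two sorted arrays and emit the smaller one. A minimal software analogue (not the in-network hardware implementation):

```python
def merge_sorted(a, b):
    """Merge two sorted lists by repeatedly emitting the smaller head element,
    mirroring the compare-and-forward role of a switch in the network."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])   # one input exhausted: forward the remainder unchanged
    out.extend(b[j:])
    return out
```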
-
Publication number: 20240028555
Abstract: Techniques for multi-dimensional network sorted array intersection. A first switch of a plurality of switches of an apparatus may receive a first element of a first array from a first compute tile of the plurality of compute tiles and a first element of a second array from a second compute tile of the plurality of compute tiles. The first switch may determine that the first element of the first array is equal to the first element of the second array. The first switch may cause the first element of the first array to be stored as a first element of an output array, the output array to comprise an intersection of the first array and the second array.
Type: Application
Filed: September 29, 2023
Publication date: January 25, 2024
Inventors: Robert Pawlowski, Sriram Aananthakrishnan, Shruti Sharma
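The equality test the abstract describes is the core of a two-pointer intersection of sorted arrays: matching heads are emitted, and the smaller head is otherwise discarded. A software sketch of the idea (the patent covers a hardware, in-network realization):

```python
def intersect_sorted(a, b):
    """Intersect two sorted lists: emit elements whose heads compare equal,
    advance past the smaller head otherwise."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])    # equal heads: element belongs to the intersection
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1              # a's head cannot appear in b: skip it
        else:
            j += 1
    return out
```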
-
Publication number: 20240020253
Abstract: Systems, apparatuses and methods may provide for technology that detects a plurality of sub-instruction requests from a first memory engine in a plurality of memory engines, wherein the plurality of sub-instruction requests are associated with a direct memory access (DMA) data type conversion request from a first pipeline, wherein each sub-instruction request corresponds to a data element in the DMA data type conversion request, and wherein the first memory engine is to correspond to the first pipeline, decodes the plurality of sub-instruction requests to identify one or more arguments, loads a source array from a dynamic random access memory (DRAM) in a plurality of DRAMs, wherein the operation engine is to correspond to the DRAM, and conducts a conversion of the source array from a first data type to a second data type in accordance with the one or more arguments.
Type: Application
Filed: September 29, 2023
Publication date: January 18, 2024
Inventors: Shruti Sharma, Robert Pawlowski, Fabio Checconi, Jesmin Jahan Tithi
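A DMA data-type conversion of the kind described above amounts to: load a packed source array, convert each element, and repack it in the destination format. A minimal sketch using Python's struct format codes (e.g. 'i' = int32, 'f' = float32) to stand in for the hardware's data types; this is an assumption-laden illustration, not the patented mechanism:

```python
import struct

def convert_array(source_bytes, src_fmt, dst_fmt, convert=float):
    """Unpack a packed source array in its original element format,
    convert each element, and repack it in the destination format."""
    count = len(source_bytes) // struct.calcsize(src_fmt)
    values = struct.unpack(f"{count}{src_fmt}", source_bytes)
    return struct.pack(f"{count}{dst_fmt}", *(convert(v) for v in values))

# int32 source array -> float32 destination array
packed_ints = struct.pack("3i", 1, 2, 3)
packed_floats = convert_array(packed_ints, "i", "f")
```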
-
Publication number: 20230333998
Abstract: Systems, apparatuses and methods may provide for technology that includes a plurality of memory engines corresponding to a plurality of pipelines, wherein each memory engine in the plurality of memory engines is adjacent to a pipeline in the plurality of pipelines, and wherein a first memory engine is to request one or more direct memory access (DMA) operations associated with a first pipeline, and a plurality of operation engines corresponding to a plurality of dynamic random access memories (DRAMs), wherein each operation engine in the plurality of operation engines is adjacent to a DRAM in the plurality of DRAMs, and wherein one or more of the plurality of operation engines is to conduct the one or more DMA operations based on one or more bitmaps.
Type: Application
Filed: May 5, 2023
Publication date: October 19, 2023
Inventors: Shruti Sharma, Robert Pawlowski, Fabio Checconi, Jesmin Jahan Tithi
-
Publication number: 20230315451
Abstract: Systems, apparatuses and methods may provide for technology that detects, by an operation engine, a plurality of sub-instruction requests from a first memory engine in a plurality of memory engines, wherein the plurality of sub-instruction requests are associated with a direct memory access (DMA) bitmap manipulation request from a first pipeline, wherein each sub-instruction request corresponds to a data element in the DMA bitmap manipulation request, and wherein the first memory engine is to correspond to the first pipeline. The technology also detects, by the operation engine, one or more arguments in the plurality of sub-instruction requests, sends, by the operation engine, one or more load requests to a DRAM in the plurality of DRAMs in accordance with the one or more arguments, and sends, by the operation engine, one or more store requests to the DRAM in accordance with the one or more arguments, wherein the operation engine is to correspond to the DRAM.
Type: Application
Filed: May 31, 2023
Publication date: October 5, 2023
Inventors: Shruti Sharma, Robert Pawlowski, Fabio Checconi, Jesmin Jahan Tithi
-
Patent number: 11630691
Abstract: Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system comprises a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines; a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations, and then accumulate the generated results; and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
Type: Grant
Filed: August 24, 2021
Date of Patent: April 18, 2023
Assignee: Intel Corporation
Inventors: Robert Pawlowski, Ankit More, Jason M. Howard, Joshua B. Fryman, Tina C. Zhong, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cave, Sriram Aananthakrishnan, Bharadwaj Krishnamurthy
-
Publication number: 20230095207
Abstract: A memory architecture may provide support for any number of direct memory access (DMA) operations at least partially independent of the CPU coupled to the memory. DMA operations may involve data movement between two or more memory locations and may involve minor computations. At least some DMA operations may include any number of atomic functions, and at least some of the atomic functions may include a corresponding return value. A system includes a first direct memory access (DMA) engine to request a DMA operation. The DMA operation includes an atomic operation. The system also includes a second DMA engine to receive a return value associated with the atomic operation and store the return value at a source memory.
Type: Application
Filed: September 24, 2021
Publication date: March 30, 2023
Inventors: Robert Pawlowski, Fabio Checconi, Fabrizio Petrini
-
Publication number: 20220414038
Abstract: Systems, methods, and apparatuses for direct memory access instruction set architecture support for flexible dense compute using a reconfigurable spatial array are described.
Type: Application
Filed: June 25, 2021
Publication date: December 29, 2022
Inventors: Robert Pawlowski, Bharadwaj Krishnamurthy, Shruti Sharma, Byoungchan Oh, Jing Fang, Daniel Klowden, Jason Howard, Joshua Fryman
-
Publication number: 20220413855
Abstract: Techniques for operating on an indirect memory access instruction, where the instruction accesses a memory location via at least one indirect address. A pipeline processes the instruction and a memory operation engine generates a first access to the at least one indirect address and a second access to a target address determined by the at least one indirect address. A cache memory used with the pipeline and the memory operation engine caches pointers. In response to a cache hit when executing the indirect memory access instruction, operations dereference a pointer to obtain the at least one indirect address, not set a cache bit, and return data for the instruction without storing the data in the cache memory; and in response to a cache miss, operations set the cache bit, obtain, and store a cache line for a missed pointer, and return data without storing the data in the cache memory.
Type: Application
Filed: June 25, 2021
Publication date: December 29, 2022
Inventors: Robert Pawlowski, Sriram Aananthakrishnan, Jason Howard, Joshua Fryman
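The caching policy described above, where pointer words are cached because they are reused across indirect accesses, while the final data loads bypass the cache, can be sketched in software. All class and method names here are invented for illustration; this models only the policy, not the pipeline or cache-bit mechanics:

```python
class PointerCache:
    """Model of indirect loads with pointer-only caching: the pointer word
    is cached on a miss, but the dereferenced data is never cached."""
    def __init__(self, memory):
        self.memory = memory    # address -> value (pointers and data alike)
        self.cache = {}         # holds cached pointer words only

    def indirect_load(self, pointer_address):
        if pointer_address in self.cache:
            target = self.cache[pointer_address]        # hit: reuse cached pointer
        else:
            target = self.memory[pointer_address]       # miss: fetch pointer...
            self.cache[pointer_address] = target        # ...and cache it
        return self.memory[target]                      # data load bypasses the cache
```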
-
Patent number: 11360809
Abstract: Embodiments of apparatuses, methods, and systems for scheduling tasks to hardware threads are described. In an embodiment, a processor includes multiple hardware threads and a task manager. The task manager is to issue a task to a hardware thread. The task manager includes a hardware task queue to store a descriptor for the task. The descriptor is to include a field to store a value to indicate whether the task is a single task, a collection of iterative tasks, or a linked list of tasks.
Type: Grant
Filed: June 29, 2018
Date of Patent: June 14, 2022
Assignee: Intel Corporation
Inventors: William Paul Griffin, Joshua Fryman, Jason Howard, Sang Phill Park, Robert Pawlowski, Michael Abbott, Scott Cline, Samkit Jain, Ankit More, Vincent Cave, Fabrizio Petrini, Ivan Ganev
-
Publication number: 20220100508
Abstract: Embodiments of apparatuses and methods for copying and operating on matrix elements are described. In embodiments, an apparatus includes a hardware instruction decoder to decode a single instruction and execution circuitry, coupled to the hardware instruction decoder, to perform one or more operations corresponding to the single instruction. The single instruction has a first operand to reference a base address of a first representation of a source matrix and a second operand to reference a base address of a second representation of a destination matrix. The one or more operations include copying elements of the source matrix to corresponding locations in the destination matrix and filling empty elements of the destination matrix with a single value.
Type: Application
Filed: December 25, 2020
Publication date: March 31, 2022
Applicant: Intel Corporation
Inventors: Robert Pawlowski, Ankit More, Vincent Cave, Sriram Aananthakrishnan, Jason M. Howard, Joshua B. Fryman
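The copy-and-fill operation this single instruction performs can be expressed compactly in software: copy the source matrix into the corresponding locations of a (possibly larger) destination and fill the remaining elements with one value. A sketch with invented names, illustrating the semantics rather than the hardware:

```python
def copy_and_fill(source, dest_shape, fill_value):
    """Copy source matrix elements into the top-left of a destination matrix
    of dest_shape; fill every remaining (empty) element with fill_value."""
    rows, cols = dest_shape
    dest = [[fill_value] * cols for _ in range(rows)]   # start fully filled
    for r, row in enumerate(source):
        for c, value in enumerate(row):
            dest[r][c] = value                          # overwrite with copied elements
    return dest
```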
-
Publication number: 20220091987
Abstract: In an embodiment, a processor comprises: an execution circuit to execute instructions; at least one cache memory coupled to the execution circuit; and a table storage element coupled to the at least one cache memory, the table storage element to store a plurality of entries each to store object metadata of an object used in a code sequence. The processor is to use the object metadata to provide user space multi-object transactional atomic operation of the code sequence. Other embodiments are described and claimed.
Type: Application
Filed: September 24, 2020
Publication date: March 24, 2022
Inventors: Joshua B. Fryman, Jason M. Howard, Ibrahim Hur, Robert Pawlowski
-
Publication number: 20210409265
Abstract: Examples described herein relate to a first group of core nodes to couple with a group of switch nodes and a second group of core nodes to couple with the group of switch nodes, wherein: a core node of the first or second group of core nodes includes circuitry to execute one or more message passing instructions that indicate a configuration of a network to transmit data toward two or more endpoint core nodes and a switch node of the group of switch nodes includes circuitry to execute one or more message passing instructions that indicate the configuration to transmit data toward the two or more endpoint core nodes.
Type: Application
Filed: September 13, 2021
Publication date: December 30, 2021
Inventors: Robert Pawlowski, Vincent Cave, Shruti Sharma, Fabrizio Petrini, Joshua B. Fryman, Ankit More
-
Publication number: 20210389984
Abstract: Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system comprises a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines; a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations, and then accumulate the generated results; and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
Type: Application
Filed: August 24, 2021
Publication date: December 16, 2021
Inventors: Robert Pawlowski, Ankit More, Jason M. Howard, Joshua B. Fryman, Tina C. Zhong, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cave, Sriram Aananthakrishnan, Bharadwaj Krishnamurthy
-
Patent number: 11106494
Abstract: Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system comprises a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines; a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations, and then accumulate the generated results; and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
Type: Grant
Filed: September 28, 2018
Date of Patent: August 31, 2021
Assignee: Intel Corporation
Inventors: Robert Pawlowski, Ankit More, Jason M. Howard, Joshua B. Fryman, Tina C. Zhong, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cave, Sriram Aananthakrishnan, Bharadwaj Krishnamurthy
-
Patent number: 11061742
Abstract: In one embodiment, a first processor core includes: a plurality of execution pipelines each to execute instructions of one or more threads; a plurality of pipeline barrier circuits coupled to the plurality of execution pipelines, each of the plurality of pipeline barrier circuits associated with one of the plurality of execution pipelines to maintain status information for a plurality of barrier groups, each of the plurality of barrier groups formed of at least two threads; and a core barrier circuit to control operation of the plurality of pipeline barrier circuits and to inform the plurality of pipeline barrier circuits when a first barrier has been reached by a first barrier group of the plurality of barrier groups. Other embodiments are described and claimed.
Type: Grant
Filed: June 27, 2018
Date of Patent: July 13, 2021
Assignee: Intel Corporation
Inventors: Robert Pawlowski, Ankit More, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cavé, Sriram Aananthakrishnan, Jason M. Howard, Joshua B. Fryman
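A barrier group, as described above, is a set of threads that synchronize with each other independently of other groups: no member proceeds past the barrier until every member has reached it. A software analogue using Python's threading.Barrier (the patent covers dedicated per-pipeline barrier circuits, not this software mechanism):

```python
import threading

results = []

def worker(name, barrier):
    # Every member records arrival, then blocks until the whole group arrives.
    results.append(f"{name} arrived")
    barrier.wait()
    results.append(f"{name} released")

# One barrier group of two threads; a second group would get its own Barrier.
group_a = threading.Barrier(2)
threads = [threading.Thread(target=worker, args=(f"a{i}", group_a)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Both "arrived" entries always precede any "released" entry.
```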
-
Publication number: 20210149683
Abstract: Examples include techniques for an in-network acceleration of a parallel prefix-scan operation. Examples include configuring registers of a node included in a plurality of nodes on a same semiconductor package. The registers to be configured responsive to receiving an instruction that indicates a logical tree to map to a network topology that includes the node. The instruction associated with a prefix-scan operation to be executed by at least a portion of the plurality of nodes.
Type: Application
Filed: December 21, 2020
Publication date: May 20, 2021
Inventors: Ankit More, Fabrizio Petrini, Robert Pawlowski, Shruti Sharma, Sowmya Pitchaimoorthy
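A prefix scan computes, for each position i, the combination of all elements up to and including i. The logical-tree mapping in the abstract exploits the fact that the scan parallelizes over a tree of nodes exchanging partial results; one well-known parallel scheme is Hillis-Steele doubling, sketched below in sequential Python (this is a standard algorithm used for illustration, not the specific mapping the patent claims):

```python
def prefix_scan(values, op=lambda a, b: a + b):
    """Inclusive prefix scan via Hillis-Steele doubling: at step d, element i
    combines with element i - 2**d, halving the remaining work each round."""
    out = list(values)
    step = 1
    while step < len(out):
        out = [out[i] if i < step else op(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out
```

With addition as the operator, `prefix_scan([1, 2, 3, 4])` yields the running sums `[1, 3, 6, 10]`; any associative operator (max, bitwise OR, etc.) works the same way.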