Patents by Inventor Robert S. Blackmore

Robert S. Blackmore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Performing an asynchronous memory move (AMM) via execution of AMM store instruction within the instruction set architecture

Patent number: 7958327

Abstract: A data processing system with a processor and memory includes an instruction set architecture (ISA) that provides an asynchronous memory move (AMM) store (ST) instruction. When the processor executes the AMM ST instruction, the processor performs a series of functions that initiate an asynchronous memory move (AMM) operation. The AMM ST instruction moves data from a first memory location having a first real address to a second memory location having a second real address by first performing a move of the data in virtual address space, utilizing a source effective address that is memory mapped to the first memory location and a destination effective address that is memory mapped to the second memory location. When the move is completed in the virtual address space, the AMM operation performs the physical move of the data to the second memory location outside the processor core, without processor involvement.

Type: Grant

Filed: February 1, 2008

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
Specialized memory move barrier operations

Patent number: 7941627

Abstract: An instruction set architecture (ISA) includes an asynchronous memory move (AMM) synchronization (SYNC) instruction. When a processor of a data processing system executes the AMM SYNC instruction, the processor prevents an AMM operation generated by a subsequently received/executed AMM ST instruction from proceeding with the data move portion of the AMM operation within the memory subsystem until completion of all ongoing memory access operations within the memory subsystem and fabric. The AMM operation does not wait for a normal barrier operation. The processor forwards the information relevant to initiate the AMM operation to an asynchronous memory mover logic, and signals the logic to not proceed with the AMM operation until signaled of the completion of the AMM SYNC.

Type: Grant

Filed: February 1, 2008

Date of Patent: May 10, 2011

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
Termination of in-flight asynchronous memory move

Patent number: 7937570

Abstract: A data processing system has a processor, a memory, and an instruction set architecture (ISA) that includes: an asynchronous memory mover (AMM) store (ST) instruction that initiates an asynchronous memory move operation that moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) first performing a move of the data in virtual address space utilizing a source effective address and a destination effective address; and (b) when the move is completed, completing a physical move of the data to the second memory location, independent of the processor. The ISA further provides an AMM terminate ST instruction for stopping an ongoing AMM operation before completion of the AMM operation, and a LD CMP instruction for checking a status of an AMM operation.

Type: Grant

Filed: February 1, 2008

Date of Patent: May 3, 2011

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Ronald N. Kalla, Chulho Kim, Balaram Sinharoy, Hanhong Xue
Handling of address conflicts during asynchronous memory move operations

Patent number: 7930504

Abstract: A method within a data processing system in which a processor handles conflicts, which occur during performance by an asynchronous memory mover of an asynchronous memory move (AMM) operation. The asynchronous memory mover performs an asynchronous memory move (AMM) operation by which the actual data is moved from a source to a destination memory location, independent of the processor. The memory mover sets a flag bit to indicate that the asynchronous memory mover is currently performing an AMM operation at the memory. When the processor receives a memory access operation, the processor checks the value of the flag bit before issuing the new memory access operation, and checks the associated address of the AMM operation to determine possible address conflicts. The processor then evaluates and responds to address conflicts to prevent corruption of data during an AMM operation.
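
The flag-and-address check described in this abstract can be illustrated with a minimal sketch. It is not the patented hardware logic: the `ActiveMove` record, the `check_access` function, and the specific policy for what counts as a conflict (any access to the destination range, or a write to the source range) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActiveMove:
    """Hypothetical record of an in-flight asynchronous memory move."""
    src: int     # first byte of the source range
    dst: int     # first byte of the destination range
    length: int  # number of bytes being moved


def overlaps(start_a: int, len_a: int, start_b: int, len_b: int) -> bool:
    """Return True if the half-open ranges [start, start + len) intersect."""
    return start_a < start_b + len_b and start_b < start_a + len_a


def check_access(active: Optional[ActiveMove], addr: int, size: int,
                 is_write: bool) -> str:
    """Decide whether a new memory access may be issued.

    `active is None` plays the role of a cleared flag bit: no move is in
    progress, so no address comparison is needed.
    """
    if active is None:
        return "issue"
    hits_src = overlaps(addr, size, active.src, active.length)
    hits_dst = overlaps(addr, size, active.dst, active.length)
    # Assumed policy: the destination is unstable until the move finishes,
    # and the source must not be modified while it is being copied.
    if hits_dst or (hits_src and is_write):
        return "conflict"
    return "issue"
```

How a real processor responds to a detected conflict (stalling, retrying, or cancelling the move) is not specified in the abstract and is left out of the sketch.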

Type: Grant

Filed: February 1, 2008

Date of Patent: April 19, 2011

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
Allocating a global shared memory

Patent number: 7925842

Abstract: A method of operating a data processing system includes each of multiple tasks within a parallel job executing on multiple nodes of the data processing system issuing a system call to request allocation of backing storage in physical memory for global shared memory accessible to all of the multiple tasks within the parallel job, where the global shared memory is in a global address space defined by a range of effective addresses. Each task among the multiple tasks receives an indication that the allocation requested by the system call was successful only if the global address space for that task was previously reserved and backing storage for the global shared memory has not already been allocated.
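
The success condition stated in this abstract (allocation succeeds only if the address space was previously reserved and backing storage has not already been allocated) can be sketched as a small per-task state machine. The class and method names below are hypothetical and do not correspond to any real system call interface.

```python
class TaskGlobalAddressSpace:
    """Hypothetical per-task view of a global shared memory region."""

    def __init__(self) -> None:
        self.reserved_range = None  # (start effective address, length) or None
        self.backed = False         # True once backing storage is allocated

    def reserve(self, start: int, length: int) -> None:
        """Reserve a range of effective addresses without backing storage."""
        self.reserved_range = (start, length)

    def allocate(self) -> bool:
        """Request backing storage; mirrors the condition in the abstract."""
        if self.reserved_range is None or self.backed:
            return False
        self.backed = True
        return True
```

A task that calls `allocate()` before `reserve()`, or calls it twice, receives `False` in this sketch.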

Type: Grant

Filed: December 18, 2007

Date of Patent: April 12, 2011

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Ramakrishnan Rajamony, William J. Starke
Reserving a global address space

Patent number: 7921261

Abstract: A method of operating a data processing system includes each of multiple tasks within a parallel job executing on multiple nodes of the data processing system issuing a respective system call to request reservation, without allocation of backing storage in physical memory, of a global address space defined by a range of effective addresses as global shared memory accessible to all of the multiple tasks within the parallel job. At least two of the tasks within the parallel job allocate global address spaces including a same effective address.

Type: Grant

Filed: December 18, 2007

Date of Patent: April 5, 2011

Assignee: International Business Machines Corporation

Inventors: Robert S. Blackmore, Ramakrishnan Rajamony
Method for enabling direct prefetching of data during asychronous memory move operation

Patent number: 7921275

Abstract: While an asynchronous memory move (AMM) operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers cache injection by the AMM mover of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache, the next cache lines in the sequence of data to the L2 cache, and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetch data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. Also, the memory controller places moved data into only a subset of the available cache lines of the upper level cache.
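
The tiered forwarding described in this abstract can be sketched as a simple partition of the moved cache lines. The function name and the per-level counts (`l2_count`, `l3_count`) are assumptions chosen for illustration; the abstract does not state how many lines go to each cache level.

```python
def distribute_lines(lines, l2_count=2, l3_count=4):
    """Split a sequence of cache lines across cache levels and memory.

    The first line goes to the prefetch engine and L1, the next
    `l2_count` lines to L2, the following `l3_count` lines to L3, and
    everything else to the destination memory location.
    """
    l2_end = 1 + l2_count
    l3_end = l2_end + l3_count
    return {
        "L1": lines[:1],
        "L2": lines[1:l2_end],
        "L3": lines[l2_end:l3_end],
        "memory": lines[l3_end:],
    }
```

Keeping the per-level counts small reflects the last sentence of the abstract: only a subset of the upper-level cache lines is used for moved data.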

Type: Grant

Filed: February 1, 2008

Date of Patent: April 5, 2011

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
EFFICIENT PIPELINING OF RDMA FOR COMMUNICATIONS

Publication number: 20110078410

Abstract: Disclosed are a method of and system for multiple party communications in a processing system including multiple processing subsystems. Each of the processing subsystems includes a central processing unit and one or more network adapters for connecting said each processing subsystem to the other processing subsystems. A multitude of nodes are established or created, and each of these nodes is associated with one of the processing subsystems. A first aspect of the invention involves pipelined communication using RDMA among three nodes, where the first node breaks up a large communication into multiple parts and sends these parts one after the other to the second node using RDMA, and the second node in turn absorbs and forwards each of these parts to a third node before all parts of the communication arrive from the first node.
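
The three-node pipeline in this abstract can be simulated with generators, where each part is forwarded as soon as it arrives rather than after the whole message is received. No real RDMA transfer takes place here; the function names and the shared `log` list are hypothetical and only show the ordering of events.

```python
def split_message(payload: bytes, part_size: int):
    """First node: break a large message into fixed-size parts."""
    for offset in range(0, len(payload), part_size):
        yield payload[offset:offset + part_size]


def relay(parts, log):
    """Second node: absorb each part and forward it immediately."""
    for index, part in enumerate(parts):
        log.append(("node2 received", index))
        yield part  # forwarded before later parts have arrived


def receive(parts, log) -> bytes:
    """Third node: reassemble the forwarded parts."""
    buffer = bytearray()
    for index, part in enumerate(parts):
        log.append(("node3 received", index))
        buffer += part
    return bytes(buffer)
```

Chaining the three as `receive(relay(split_message(data, 4), log), log)` produces a log in which the third node receives each part before the second node receives the next one.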

Type: Application

Filed: July 17, 2006

Publication date: March 31, 2011

Applicant: International Business Machines Corporation

Inventors: Robert S. Blackmore, Rama K. Govindaraju, Peter H. Hochschild, Chulho Kim, Rajeev Sivaram, Richard R. Treumann, Hanhong Xue
Mechanism to provide reliability through packet drop detection

Patent number: 7877436

Abstract: A method and a data processing system for completing checkpoint processing of a distributed job with local tasks communicating with other remote tasks via a host fabric interface (HFI) and assigned HFI window. Each HFI window has a send count and a receive count, which track GSM messages that are sent from and received at the HFI window. When a checkpoint is initiated by a master task, each local task forwards the send count and the receive count to the master task. The master task sums the respective counts and then compares the totals to each other. When the send count total is equal to the receive count total, the tasks are permitted to continue processing. However, when the send count total is not equal to the receive count total, the master task notifies each task of the job to roll back to a previous checkpoint or kill the job execution.
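
The master task's comparison can be sketched in a few lines. The function name and the representation of each task's report as a `(send_count, receive_count)` pair are assumptions for illustration.

```python
def checkpoint_decision(window_counts):
    """Compare total sends with total receives across all HFI windows.

    `window_counts` is an iterable of (send_count, receive_count) pairs,
    one per task. Equal totals mean no message is still in flight or lost.
    """
    pairs = list(window_counts)
    total_sent = sum(sent for sent, _ in pairs)
    total_received = sum(received for _, received in pairs)
    if total_sent == total_received:
        return "continue"
    # The abstract allows either a rollback or killing the job here.
    return "rollback_or_kill"
```

Note that individual windows need not balance; only the job-wide totals are compared.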

Type: Grant

Filed: February 1, 2008

Date of Patent: January 25, 2011

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
Mechanism to perform debugging of global shared memory (GSM) operations

Patent number: 7873879

Abstract: A host fabric interface (HFI) enables debugging of global shared memory (GSM) operations received at a local node from a network fabric. The local node has a memory management unit (MMU), which provides an effective address to real address (EA-to-RA) translation table that is utilized by the HFI to evaluate when EAs of GSM operations/data from a received GSM packet are memory-mapped to RAs of the local memory. The HFI retrieves the EA associated with a GSM operation/data within a received GSM packet. The HFI forwards the EA to the MMU, which determines when the EA is mapped to RAs within the local memory for the local task. The HFI processing logic enables processing of the GSM packet only when the EA of the GSM operation/data within the GSM packet is an EA that has a local RA translation. Non-matching EAs result in an error condition that requires debugging.
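
The translation check can be sketched with a dictionary standing in for the EA-to-RA translation table. The page size, the table layout (effective page number mapped to real page number), and the function name are assumptions; the abstract does not describe the table format.

```python
PAGE_SIZE = 4096  # assumed page size for this illustration


def translate_ea(translation_table: dict, ea: int):
    """Return the real address for `ea`, or None if no local mapping exists.

    `translation_table` maps effective page numbers to real page numbers.
    A None result corresponds to the error condition in the abstract:
    the packet is not processed and the mismatch is flagged for debugging.
    """
    page, offset = divmod(ea, PAGE_SIZE)
    if page not in translation_table:
        return None
    return translation_table[page] * PAGE_SIZE + offset
```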

Type: Grant

Filed: February 1, 2008

Date of Patent: January 18, 2011

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
USER LEVEL MESSAGE BROADCAST MECHANISM IN DISTRIBUTED COMPUTING ENVIRONMENT

Publication number: 20100269027

Abstract: A data processing system is programmed to provide a method for enabling user-level one-to-all message/messaging (OTAM) broadcast within a distributed parallel computing environment in which multiple threads of a single job execute on different processing nodes across a network. The method comprises: generating one or more messages for transmission to at least one other processing node accessible via a network, where the messages are generated by/for a first thread executing at the data processing system (first processing node) and the other processing node executes one or more second threads of a same parallel job as the first thread. An OTAM broadcast is transmitted via a host fabric interface (HFI) of the data processing system as a one-to-all broadcast on the network, whereby the messages are transmitted to a cluster of processing nodes across the network that execute threads of the same parallel job as the first thread.

Type: Application

Filed: April 16, 2009

Publication date: October 21, 2010

Applicant: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore
Half RDMA and half FIFO operations

Patent number: 7813369

Abstract: In a multinode data processing system in which nodes exchange information over a network or through a switch, a structure and mechanism is provided within the realm of Remote Direct Memory Access (RDMA) operations in which DMA operations are present on one side of the transfer but not the other. On the side in which the transfer is not carried out in DMA fashion, transfer processing is carried out under program control; this is in contrast to the transfer on the DMA side, which is characteristically carried out in hardware. These combination processes are useful in programming situations where RDMA is carried out to or from contiguous locations in memory on one side and where memory locations on the other side are noncontiguous. This split mode of transfer is provided both for read and for write operations.

Type: Grant

Filed: December 20, 2004

Date of Patent: October 12, 2010

Assignee: International Business Machines Corporation

Inventors: Robert S. Blackmore, Fu Chung Chang, Piyush Chaudhary, Kevin J. Gildea, Jason E. Goscinski, Rama K. Govindaraju, Donald G. Grice, Leonard W. Helmer, Jr., Patricia E. Heywood, Peter H. Hochschild, John S. Houston, Chulho Kim, Steven J. Martin
Mechanism to provide software guaranteed reliability for GSM operations

Patent number: 7797588

Abstract: In a global shared memory (GSM) environment, an initiating task at a first node with a host fabric interface (HFI) uses epochs to provide reliability of transmission of packets via a network fabric to a target task. The HFI generates a packet for the initiating task addressed to the target task, and automatically inserts a current epoch of the initiating task into the packet. A copy of the current epoch is maintained by the target task, which accepts for processing only packets having the correct epoch, unless the packet is tagged for guaranteed-once delivery. When a packet delivery is accepted, the target task sends a notification to the initiating task. If the initiating task does not receive the notification of delivery for the issued packet, the initiating task updates the epoch at both the target node and the initiating node and re-transmits the packet.
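
The epoch scheme can be sketched as a two-party simulation. The class names, the `drop` parameter used to simulate a lost packet, and the choice to bump the epoch by one are assumptions; the guaranteed-once delivery tag mentioned in the abstract is omitted.

```python
class TargetTask:
    """Hypothetical target that accepts only packets with its current epoch."""

    def __init__(self) -> None:
        self.epoch = 0
        self.delivered = []

    def receive(self, packet_epoch: int, payload) -> bool:
        if packet_epoch != self.epoch:
            return False  # stale or unexpected epoch: packet is rejected
        self.delivered.append(payload)
        return True       # stands in for the delivery notification


class InitiatingTask:
    """Hypothetical initiator that retransmits under a new epoch."""

    def __init__(self, target: TargetTask) -> None:
        self.epoch = 0
        self.target = target

    def send(self, payload, drop: bool = False) -> str:
        # `drop=True` simulates a packet lost in the network fabric.
        if not drop and self.target.receive(self.epoch, payload):
            return "delivered"
        # No notification arrived: update the epoch on both sides, resend.
        self.epoch += 1
        self.target.epoch = self.epoch
        self.target.receive(self.epoch, payload)
        return "retransmitted"
```

Because the epoch changes before the retransmission, a delayed copy of the original packet that arrives later carries the old epoch and is rejected, which prevents duplicate delivery in this sketch.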

Type: Grant

Filed: February 1, 2008

Date of Patent: September 14, 2010

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Hanhong Xue
Sharing lock mechanism between protocol layers

Patent number: 7689992

Abstract: Shared locks are employed for controlling a thread which extends across more than one protocol layer in a data processing system. A counter is used as part of a data structure which makes it possible to implement shared locks across multiple layers. The use of shared locks avoids the processing overhead usually associated with lock acquisition and release. The thread which is controlled may be initiated in either an upper layer protocol or in a lower layer.
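
One way to picture a lock shared between layers with a counter is an owner-plus-depth structure, similar in spirit to a reentrant lock. This is only an illustrative guess at the general idea; the abstract does not describe the actual data structure, and the class below is hypothetical.

```python
import threading


class SharedLayerLock:
    """Hypothetical lock shared by an upper and a lower protocol layer.

    The first layer entered by a thread takes the underlying lock; any
    further layer entered by the same thread only increments a counter.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owner = None  # identifier of the thread holding the lock
        self._depth = 0     # number of layers currently inside the lock

    def acquire(self) -> None:
        me = threading.get_ident()
        if self._owner == me:
            self._depth += 1  # already held by this thread: count only
            return
        self._lock.acquire()
        self._owner = me
        self._depth = 1

    def release(self) -> None:
        if self._owner != threading.get_ident():
            raise RuntimeError("lock released by a thread that does not hold it")
        self._depth -= 1
        if self._depth == 0:
            self._owner = None
            self._lock.release()
```

The saving comes from the second `acquire` and the first `release` touching only the counter, so a call that crosses both layers pays for one real lock operation pair instead of two.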

Type: Grant

Filed: June 25, 2004

Date of Patent: March 30, 2010

Assignee: International Business Machines Corporation

Inventors: Robert S. Blackmore, Su-Hsuan Huang, Chulho Kim, Richard R. Treumann, Hanhong Xue
INTERRUPT HANDLING USING SIMULTANEOUS MULTI-THREADING

Publication number: 20090271549

Abstract: Disclosed are a method, information processing system, and computer readable medium for managing interrupts. The method includes placing at least one physical processor of an information processing system in a simultaneous multi-threading mode. At least a first logical processor and a second logical processor associated with the at least one physical processor are partitioned. The first logical processor is assigned to manage interrupts and the second logical processor is assigned to dispatch runnable user threads.

Type: Application

Filed: February 16, 2009

Publication date: October 29, 2009

Applicant: International Business Machines Corp.

Inventors: Robert S. Blackmore, Rama K. Govindaraju, Peter H. Hochschild
FACILITATING INTRA-NODE DATA TRANSFER IN COLLECTIVE COMMUNICATIONS, AND METHODS THEREFOR

Publication number: 20090210635

Abstract: Intra-node data transfer in collective communications is facilitated. A memory object of one task of a collective communication is concurrently attached to the address spaces of a plurality of other tasks of the communication. Those tasks that attach the memory object can access the memory object as if it were their own. Data can be directly written into or read from an application data structure of the memory object by the attaching tasks without copying the data to/from shared memory.

Type: Application

Filed: May 5, 2009

Publication date: August 20, 2009

Applicant: International Business Machines Corporation

Inventors: Robert S. Blackmore, Bin Jia, Richard R. Treumann
COMPLETION OF ASYNCHRONOUS MEMORY MOVE IN THE PRESENCE OF A BARRIER OPERATION

Publication number: 20090198963

Abstract: A method within a data processing system by which a processor executes an asynchronous memory move (AMM) store (ST) instruction to complete a corresponding AMM operation in parallel with an ongoing (not yet completed), previously issued barrier operation. The processor receives the AMM ST instruction after executing the barrier operation (or SYNC instruction) and before the completion of the barrier operation or SYNC on the system fabric. The processor continues executing the AMM ST instruction, which performs a move in virtual address space and then triggers the generation of the AMM operation. The AMM operation proceeds while the barrier operation continues, independent of the processor. The processor stops further execution of all other memory access requests, excluding AMM ST instructions that are received after the barrier operation, but before completion of the barrier operation.

Type: Application

Filed: February 1, 2008

Publication date: August 6, 2009

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
Notification by Task of Completion of GSM Operations at Target Node

Publication number: 20090199182

Abstract: A method for providing global notification of completion of a global shared memory (GSM) operation during processing by a target task executing at a target node of a distributed system. The distributed system has at least one other node on which an initiating task that generated the GSM operation is homed. The target task receives the GSM operation from the initiating task, via a host fabric interface (HFI) window assigned to the target task. The task initiates execution of the GSM operation on the target node. The task detects completion of the execution of the GSM operation on the target node, and issues a global notification to at least the initiating task. The global notification indicates the completion of the execution of the GSM operation to one or more tasks of a single job distributed across multiple processing nodes.

Type: Application

Filed: February 1, 2008

Publication date: August 6, 2009

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
Mechanisms to Order Global Shared Memory Operations

Publication number: 20090199200

Abstract: A method and data processing system for performing fence operations within a global shared memory (GSM) environment having a local task executing on a processor and providing GSM commands for processing by a host fabric interface (HFI) window that is allocated to the task. The HFI window has one or more registers for use during local fence operations. A first register tracks a first count of task-issued GSM commands, and a second register tracks a second count of GSM operations being processed by the HFI. The processing logic detects a locally-issued fence operation, and responds by performing a series of operations, including: automatically stopping the task from issuing additional GSM commands; monitoring for completion of all the task-issued GSM commands at the HFI; and triggering a resumption of issuance of GSM commands by the task when the completion of all previous task-issued GSM commands is registered by the HFI.
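
The fence sequence (stop issuing, wait for outstanding commands, resume) can be sketched with two counters. The class name and the interpretation of the two registers as an issued count and a completed count are assumptions; the abstract describes the second register only as tracking operations being processed by the HFI.

```python
class HfiWindowFence:
    """Hypothetical model of the fence counters in an HFI window."""

    def __init__(self) -> None:
        self.issued = 0      # commands issued by the task
        self.completed = 0   # commands the HFI has finished processing
        self.fenced = False  # True while a fence blocks new commands

    def issue(self) -> bool:
        """Try to issue a GSM command; refused while a fence is pending."""
        if self.fenced:
            return False
        self.issued += 1
        return True

    def fence(self) -> None:
        """Start a local fence: block new commands until all complete."""
        self.fenced = True
        self._maybe_release()

    def complete(self) -> None:
        """Record that the HFI finished one previously issued command."""
        self.completed += 1
        self._maybe_release()

    def _maybe_release(self) -> None:
        if self.fenced and self.completed == self.issued:
            self.fenced = False  # all earlier commands done: resume issuing
```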

Type: Application

Filed: February 1, 2008

Publication date: August 6, 2009

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
METHOD FOR ENABLING DIRECT PREFETCHING OF DATA DURING ASYCHRONOUS MEMORY MOVE OPERATION

Publication number: 20090198908

Abstract: While an AMM operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers a cache injection by the AMM mover (or memory controller) of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache. The memory controller also forwards the next cache lines in the sequence of data to the L2 cache and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetch data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. Also, the memory controller does not overrun the upper caches, by placing moved data into only a subset of the available cache lines of the upper level cache.

Type: Application

Filed: February 1, 2008

Publication date: August 6, 2009

Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
