Patents by Inventor Robert S. Blackmore

Robert S. Blackmore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7958327
    Abstract: A data processing system with a processor and memory includes an instruction set architecture (ISA) that provides an asynchronous memory move (AMM) store (ST) instruction. When the processor executes the AMM ST instruction, the processor performs a series of functions, which initiates an asynchronous memory move (AMM) operation. The AMM ST instruction moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) performing a move of the data in virtual address space utilizing a source effective address that is memory mapped to the first memory location and a destination effective address that is memory mapped to the second memory location. When the move is completed in the virtual address space, the AMM operation performs the physical move of the data to the second memory location outside the processor core, without processor involvement.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: June 7, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 7941627
    Abstract: An instruction set architecture (ISA) includes an asynchronous memory move (AMM) synchronization (SYNC) instruction. When processor of a data processing system executes the AMM SYNC instruction, the processor prevents an AMM operation generated by a subsequently received/executed AMM ST instruction from proceeding with the data move portion of the AMM operation within the memory subsystem until completion of all ongoing memory access operations within the memory subsystem and fabric. The AMM operation does not wait for a normal barrier operation. The processor forwards the information relevant to initiate the AMM operation to an asynchronous memory mover logic, and signals the logic to not proceed with the AMM operation until signaled of the completion of the AMM SYNC.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 10, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 7937570
    Abstract: A data processing system has a processor, a memory, and an instruction set architecture (ISA) that includes: an asynchronous memory mover (AMM) store (ST) instruction that initiates an asynchronous memory move operation that moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) first performing a move of the data in virtual address space utilizing a source effective address a destination effective address; and (b) when the move is completed, completing a physical move of the data to the second memory location, independent of the processor. The ISA further provides an AMM terminate ST instruction for stopping an ongoing AMM operation before completion of the AMM operation, and a LD CMP instruction for checking a status of an AMM operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 3, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Ronald N. Kalla, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 7930504
    Abstract: A method within a data processing system in which a processor handles conflicts, which occur during performance by an asynchronous memory mover of an asynchronous memory move (AMM) operation. The asynchronous memory mover performs an asynchronous memory move (AMM) operation by which the actual data is moved from a source to a destination memory location, independent of the processor. The memory mover sets a flag bit to indicate that the asynchronous memory mover is currently performing an AMM operation at the memory. When the processor receives a memory access operation, the processor checks the value of the flag bit before issuing the new memory access operation, and checks the associated address of the AMM operation to determine possible address conflicts. The processor then evaluates and responds to address conflicts to prevent corruption of data during an AMM operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 19, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 7925842
    Abstract: A method of operating a data processing system includes each of multiple tasks within a parallel job executing on multiple nodes of the data processing system issuing a system call to request allocation of backing storage in physical memory for global shared memory accessible to all of the multiple tasks within the parallel job, where the global shared memory is in a global address space defined by a range of effective addresses. Each task among the multiple tasks receives an indication that the allocation requested by the system call was successful only if the global address space for that task was previously reserved and backing storage for the global shared memory has not already been allocated.
    Type: Grant
    Filed: December 18, 2007
    Date of Patent: April 12, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Ramakrishnan Rajamony, William J. Starke
  • Patent number: 7921261
    Abstract: A method of operating a data processing system includes each of multiple tasks within a parallel job executing on multiple nodes of the data processing system issuing a respective system call to request reservation, without allocation of backing storage in physical memory, of a global address space defined by a range of effective addresses as global shared memory accessible to all of the multiple tasks within the parallel job. At least two of the tasks within the parallel job allocate global address spaces including a same effective address.
    Type: Grant
    Filed: December 18, 2007
    Date of Patent: April 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: Robert S. Blackmore, Ramakrishnan Rajamony
  • Patent number: 7921275
    Abstract: While an asynchronous memory move (AMM) operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers cache injection by the AMM mover of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache, the next cache lines in the sequence of data to the L2 cache, and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetch data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. Also, the memory controller places moved data into only a subset of the available cache lines of the upper level cache.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20110078410
    Abstract: Disclosed are a method of and system for multiple party communications in a processing system including multiple processing subsystems. Each of the processing subsystems includes a central processing unit and one or more network adapters for connecting said each processing subsystem to the other processing subsystems. A multitude of nodes are established or created, and each of these nodes is associated with one of the processing subsystems. A first aspect of the invention involves pipelined communication using RDMA among three nodes, where the first node breaks up a large communication into multiple parts and sends these parts one after the other to the second node using RDMA, and the second node in turn absorbs and forwards each of these parts to a third node before all parts of the communication arrive from the first node.
    Type: Application
    Filed: July 17, 2006
    Publication date: March 31, 2011
    Applicant: International Business Machines Corporation
    Inventors: Robert S. Blackmore, Rama K. Govindaraju, Peter H. Hochschild, Chulho Kim, Rajeev Sivaram, Richard R. Treumann, Hanhong Xue
  • Patent number: 7877436
    Abstract: A method and a data processing system for completing checkpoint processing of a distributed job with local tasks communicating with other remote tasks via a host fabric interface (HFI) and assigned HFI window. Each HFI window has a send count and a receive count, which tracks GSM messages that are sent from and received at the HFI window. When a checkpoint is initiated by a master task, each local task forwards the send count and the receive count to the master task. The master task sums the respective counts and then compares the totals to each other. When the send count total is equal to the receive count total, the tasks are permitted to continue processing. However, when the send count total is not equal to the receive count total, the master task notifies each task of the job to rollback to a previous checkpoint or kill the job execution.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: January 25, 2011
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
  • Patent number: 7873879
    Abstract: A host fabric interface (HFI) enables debugging of global shared memory (GSM) operations received at a local node from a network fabric. The local node has a memory management unit (MMU), which provides an effective address to real address (EA-to-RA) translation table that is utilized by the HFI to evaluate when EAs of GSM operations/data from a received GSM packet is memory-mapped to RAs of the local memory. The HFI retrieves the EA associated with a GSM operation/data within a received GSM packet. The HFI forwards the EA to the MMU, which determines when the EA is mapped to RAs within the local memory for the local task. The HFI processing logic enables processing of the GSM packet only when the EA of the GSM operation/data within the GSM packet is an EA that has a local RA translation. Non-matching EAs result in an error condition that requires debugging.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: January 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
  • Publication number: 20100269027
    Abstract: A data processing system is programmed to provide a method for enabling user-level one-to-all message/messaging (OTAM) broadcast within a distributed parallel computing environment in which multiple threads of a single job execute on different processing nodes across a network. The method comprises: generating one or more messages for transmission to at least one other processing node accessible via a network, where the messages are generated by/for a first thread executing at the data processing system (first processing node) and the other processing node executes one or more second threads of a same parallel job as the first thread. An OTAM broadcast is transmitting via a host fabric interface (HFI) of the data processing system as a one-to-all broadcast on the network, whereby the messages are transmitted to a cluster of processing nodes across the network that execute threads of the same parallel job as the first thread.
    Type: Application
    Filed: April 16, 2009
    Publication date: October 21, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore
  • Patent number: 7813369
    Abstract: In a multinode data processing system in which nodes exchange information over a network or through a switch, a structure and mechanism is provided within the realm of Remote Direct Memory Access (RDMA) operations in which DMA operations are present on one side of the transfer but not the other. On the side in which the transfer is not carried out in DMA fashion, transfer processing is carried out under program control; this is in contrast to the transfer on the DMA side which is characteristically carried out in hardware. Usage of these combination processes is useful in programming situations where RDMA is carried out to or from contiguous locations in memory on one side and where memory locations on the other side is noncontiguous. This split mode of transfer is provided both for read and for write operations.
    Type: Grant
    Filed: December 20, 2004
    Date of Patent: October 12, 2010
    Assignee: International Business Machines Corporation
    Inventors: Robert S. Blackmore, Fu Chung Chang, Piyush Chaudhary, Kevin J. Gildea, Jason E. Goscinski, Rama K. Govindaraju, Donald G. Grice, Leonard W. Helmer, Jr., Patricia E. Heywood, Peter H. Hochschild, John S. Houston, Chulho Kim, Steven J. Martin
  • Patent number: 7797588
    Abstract: In a global shared memory (GSM) environment, an initiating task at a first node with a host fabric interface (HFI) uses epochs to provide reliability of transmission of packets via a network fabric to a target task. The HFI generates a packet for the initiating task addressed to the target task, and automatically inserts a current epoch of the initiating task into the packet. A copy of the current epoch is maintained by the target task, which accepts for processing only packets having the correct epoch, unless the packet is tagged for guaranteed-once delivery. When a packet delivery is accepted, the target task sends a notification to the initiating task. If the initiating task does not receive the notification of delivery for the issued packet, the initiating task updates the epoch at both the target node and the initiating node and re-transmits the packet.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: September 14, 2010
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Hanhong Xue
  • Patent number: 7689992
    Abstract: Shared locks are employed for controlling a thread which extends across more than one protocol layer in a data processing system. The use of a counter is used as part of a data structure which makes it possible to implement shared locks across multiple layers. The use of shared locks avoids the processing overhead usually associated with lock acquisition and release. The thread which is controlled may be initiated in either an upper layer protocol or in a lower layer.
    Type: Grant
    Filed: June 25, 2004
    Date of Patent: March 30, 2010
    Assignee: International Business Machines Corporation
    Inventors: Robert S. Blackmore, Su-Hsuan Huang, Chulho Kim, Richard R. Treumann, Hanhong Xue
  • Publication number: 20090271549
    Abstract: Disclosed are a method, information processing system, and computer readable medium for managing interrupts. The method includes placing at least one physical processor of an information processing system in a simultaneous multi-threading mode. At least a first logical processor and a second logical processor associated with the at least one physical processor are partitioned. The first logical processor is assigned to manage interrupts and the second logical processor is assigned to dispatch runnable user threads.
    Type: Application
    Filed: February 16, 2009
    Publication date: October 29, 2009
    Applicant: International Business Machines Corp.
    Inventors: ROBERT S. BLACKMORE, Rama K. Govindaraju, Peter H. Hochschild
  • Publication number: 20090210635
    Abstract: Intra-node data transfer in collective communications is facilitated. A memory object of one task of a collective communication is concurrently attached to the address spaces of a plurality of other tasks of the communication. Those tasks that attach the memory object can access the memory object as if it was their own. Data can be directly written into or read from an application data structure of the memory object by the attaching tasks without copying the data to/from shared memory.
    Type: Application
    Filed: May 5, 2009
    Publication date: August 20, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert S. BLACKMORE, Bin JIA, Richard R. TREUMANN
  • Publication number: 20090198963
    Abstract: A method within a data processing system by which a processor executes an asynchronous memory move (AMM) store (ST) instruction to complete a corresponding AMM operation in parallel with an ongoing (not yet completed), previously issued barrier operation. The processor receives the AMM ST instruction after executing the barrier operation (or SYNC instruction) and before the completion of the barrier operation or SYNC on the system fabric. The processor continues executing the AMM ST instruction, which performs a move in virtual address space and then triggers the generation of the AMM operation. The AMM operation proceeds while the barrier operation continues, independent of the processor. The processor stops further execution of all other memory access requests, excluding AMM ST instructions that are received after the barrier operation, but before completion of the barrier operation.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090199182
    Abstract: A method for providing global notification of completion of a global shared memory (GSM) operation during processing by a target task executing at a target node of a distributed system. The distributed system has at least one other node on which an initiating task that generated the GSM operation is homed. The target task receives the GSM operation from the initiating task, via a host fabric interface (HFI) window assigned to the target task. The task initiates execution of the GSM operation on the target node. The task detects completion of the execution of the GSM operation on the target node, and issues a global notification to at least the initiating task. The global notification indicates the completion of the execution of the GSM operation to one or more tasks of a single job distributed across multiple processing nodes.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
  • Publication number: 20090199200
    Abstract: A method and data processing system for performing fence operations within a global shared memory (GSM) environment having a local task executing on a processor and providing GSM commands for processing by a host fabric interface (HFI) window that is allocated to the task. The HFI window has one or more registers for use during local fence operations. A first register tracks a first count of task-issued GSM commands, and a second register tracks a second count of GSM operations being processed by the HFI. The processing logic detects a locally-issued fence operation, and responds by performing a series of operations, including: automatically stopping the task from issuing additional GSM commands; monitoring for completion of all the task-issued GSM commands at the HFI; and triggering a resumption of issuance of GSM commands by the task when the completion of all previous task-issued GSM commands is registered by the HFI.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
  • Publication number: 20090198908
    Abstract: While an AMM operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers a cache injection by the AMM mover (or memory controller) of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache. The memory controller also forwards the next cache lines in the sequence of data to the L2 cache and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetch data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. Also, the memory controller does not overrun the upper caches, by placing moved data into only a subset of the available cache lines of the upper level cache.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: RAVI K. ARIMILLI, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue