Patents by Inventor Volker Strumpen

Volker Strumpen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7933940
    Abstract: Parallel prefix circuits for computing a cyclic segmented prefix operation with a mesh topology are disclosed. In one embodiment of the present invention, the elements (prefix nodes) of the mesh are arranged in row-major order. Values are accumulated toward the center of the mesh and partial results are propagated outward from the center of the mesh to complete the cyclic segmented prefix operation. This embodiment has been shown to be time-optimal. In another embodiment of the present invention, the prefix nodes are arranged such that the prefix node corresponding to the last element in the array is located at the center of the array. This alternative embodiment is not only time-optimal when accounting for wire-lengths (and therefore propagation delays), but it is also asympotically optimal in terms of minimizing the number of segmented prefix operators.
    Type: Grant
    Filed: April 20, 2006
    Date of Patent: April 26, 2011
    Assignee: International Business Machines Corporation
    Inventors: Matteo Frigo, Volker Strumpen
  • Publication number: 20100122034
    Abstract: A tile for use in a tiled storage array provides re-organization of values within the tile array without requiring sophisticated global control. The tiles operate to move a requested value to a front-most storage element of the tile array according to a global systolic clock. The previous occupant of the front-most location is moved or swapped backward according to the systolic clock, and the new occupant is moved forward according to the systolic clock, according to the operation of the tiles, while providing for multiple in-flight access requests within the tile array. The placement heuristic that moves the values is determined according to the position of the tiles within the array and the behavior of the tiles. The movement of the values can be performed via only next-neighbor connections of adjacent tiles within the tile array.
    Type: Application
    Filed: November 13, 2008
    Publication date: May 13, 2010
    Applicant: International Business Machines Corporation
    Inventors: Volker Strumpen, Matteo Frigo
  • Publication number: 20100122012
    Abstract: Systolic networks within a tiled storage array provide for movement of requested values to a front-most tile, while making space for the requested values at the front-most tile by moving other values away. A first and second information pathway provide different linear pathways through the tiles. The movement of other values, requests for values and responses to requests is controlled according to a clocking logic that governs the movement on the first and second information pathways according to a systolic duty cycle. The first information pathway may be a move-to-front network of a spiral cache, crossing the spiral push-back network which forms the push-back network. The systolic duty cycle may be a three-phase duty cycle, or a two-phase duty cycle may be provided if the storage tiles support a push-back swap operation.
    Type: Application
    Filed: December 17, 2009
    Publication date: May 13, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fadi H. Gebara, Jeremy D. Schaub, Volker Strumpen
  • Publication number: 20100122031
    Abstract: A spiral cache memory provides low access latency for frequently-accessed values by self-organizing to always move a requested value to a front-most storage tile of the spiral. If the spiral cache needs to eject a value to make space for a value moved to the front-most tile, space is made by ejecting a value from the cache to a backing store. A buffer along with flow control logic is used to prevent overflow of writes of ejected values to the generally slow backing store. The tiles in the spiral cache may be single storage locations or be organized as some form of cache memory such as direct-mapped or set-associative caches. Power consumption of the spiral cache can be reduced by dividing the cache into an active and inactive partition, which can be adjusted on a per-tile basis. Tile-generated or global power-down decisions can set the size of the partitions.
    Type: Application
    Filed: November 13, 2008
    Publication date: May 13, 2010
    Applicant: International Business Machines Corporation
    Inventors: Volker Strumpen, Matteo Frigo
  • Publication number: 20100122033
    Abstract: An integrated memory system with a spiral cache responds to requests for values at a first external interface coupled to a particular storage location in the cache in a time period determined by the proximity of the requested values to the particular storage location. The cache supports multiple outstanding in-flight requests directed to the same address using an issue table that tracks multiple outstanding requests and control logic that applies the multiple requests to the same address in the order received by the cache memory. The cache also includes a backing store request table that tracks push-back write operations issued from the cache memory when the cache memory is full and a new value is provided from the external interface, and the control logic to prevent multiple copies of the same value from being loaded into the cache or a copy being loaded before a pending push-back has been completed.
    Type: Application
    Filed: December 17, 2009
    Publication date: May 13, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fadi H. Gebara, Jeremy D. Schaub, Volker Strumpen
  • Publication number: 20100122057
    Abstract: A tiled storage array provides reduction in access latency for frequently-accessed values by re-organizing to always move a requested value to a front-most storage element of array. The previous occupant of the front-most location is moved backward according to a systolic pulse, and the new occupant is moved forward according to the systolic pulse, preserving the uniqueness of the stored values within the array, and providing for multiple in-flight access requests within the array. The placement heuristic that moves the values according to the systolic pulse can be implemented by control logic within identical tiles, so that the placement heuristic moves the values according to the position of the tiles within the array. The movement of the values can be performed via only next-neighbor connections of adjacent tiles within the array.
    Type: Application
    Filed: November 13, 2008
    Publication date: May 13, 2010
    Applicant: International Business Machines Corporation
    Inventors: Volker Strumpen, Matteo Frigo
  • Publication number: 20100122035
    Abstract: A spiral cache memory provides reduction in access latency for frequently-accessed values by self-organizing to always move a requested value to a front-most central storage element of the spiral. The occupant of the central location is swapped backward, which continues backward through the spiral until an empty location is swapped-to, or the last displaced value is cast out of the last location in the spiral. The elements in the spiral may be cache memories or single elements. The resulting cache memory is self-organizing and for the one-dimensional implementation has a worst-case access time proportional to N, where N is the number of tiles in the spiral. A k-dimensional spiral cache has a worst-case access time proportional to N1/k. Further, a spiral cache system provides a basis for a non-inclusive system of cache memory, which reduces the amount of space and power consumed by a cache memory of a given size.
    Type: Application
    Filed: November 13, 2008
    Publication date: May 13, 2010
    Applicant: International Business Machines Corporation
    Inventors: Volker Strumpen, Matteo Frigo
  • Publication number: 20090326840
    Abstract: A method, system and computer program product for generating device fingerprints and authenticating devices uses initial states of internal storage cells after each of a number multiple power cycles for each of a number of device temperatures to generate a device fingerprint. The device fingerprint may include pairs of expected values for each of the internal storage cells and a corresponding probability that the storage cell will assume the expected value. Storage cells that have expected values varying over the multiple temperatures may be excluded from the fingerprint. A device is authenticated by a similarity algorithm that uses a match of the expected values from a known fingerprint with power-up values from an unknown device, weighting the comparisons by the probability for each cell to compute a similarity measure.
    Type: Application
    Filed: June 26, 2008
    Publication date: December 31, 2009
    Applicant: International Business Machines Corporation
    Inventors: Fadi H. Gebara, Joonsoo Kim, Jeremy D. Schaub, Volker Strumpen
  • Publication number: 20070260663
    Abstract: Parallel prefix circuits for computing a cyclic segmented prefix operation with a mesh topology are disclosed. In one embodiment of the present invention, the elements (prefix nodes) of the mesh are arranged in row-major order. Values are accumulated toward the center of the mesh and partial results are propagated outward from the center of the mesh to complete the cyclic segmented prefix operation. This embodiment has been shown to be time-optimal. In another embodiment of the present invention, the prefix nodes are arranged such that the prefix node corresponding to the last element in the array is located at the center of the array. This alternative embodiment is not only time-optimal when accounting for wire-lengths (and therefore propagation delays), but it is also asympotically optimal in terms of minimizing the number of segmented prefix operators.
    Type: Application
    Filed: April 20, 2006
    Publication date: November 8, 2007
    Inventors: Matteo Frigo, Volker Strumpen
  • Publication number: 20060230409
    Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new threads and having a novel operational semantics. If a hardware thread is available to shepherd a forked thread, the fork and join instructions have thread creation and termination/synchronization semantics, respectively. If no hardware thread is available, however, the fork and join instructions assume subroutine call and return semantics respectively. The link register of the processor is used to determine whether a given join instruction should be treated as a thread synchronization operation or as a return from subroutine operation.
    Type: Application
    Filed: April 7, 2005
    Publication date: October 12, 2006
    Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen
  • Publication number: 20060230408
    Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.
    Type: Application
    Filed: April 7, 2005
    Publication date: October 12, 2006
    Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen