Patents by Inventor Steffen Kosinski

Steffen Kosinski has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10558255
    Abstract: Systems, apparatuses and methods may provide for firmware access wrapper technology that includes a plurality of input registers communicatively coupled to a hardware power controller, a plurality of output registers communicatively coupled to the hardware power controller, and a processor communicatively coupled to the input registers and the output registers. The processor may include configurable logic to identify a control policy change with respect to the hardware power controller, detect input signal information in one or more of the input registers, and conduct a modification of one or more values in the output registers based on the control policy change and the input signal information.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: February 11, 2020
    Assignee: Intel Corporation
    Inventors: Jonathan E. Schmidt, Joerg Hartung, Steffen Kosinski, Jack Cummings
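    Illustrative sketch: The abstract above describes configurable logic that watches input registers tied to a hardware power controller and rewrites output registers when the control policy changes. The C sketch below is a minimal rendering of that idea; the register count, the policy encoding, and the function names are assumptions made for illustration, not details taken from the patent.

      /* Hypothetical firmware access wrapper state: banks of input and output
       * registers around a hardware power controller, plus the active policy. */
      #include <stdint.h>
      #include <stddef.h>

      #define NUM_REGS 8

      struct fw_wrapper {
          volatile uint32_t in[NUM_REGS];   /* input registers from the power controller     */
          volatile uint32_t out[NUM_REGS];  /* output registers driving the power controller */
          uint32_t policy;                  /* currently applied control policy              */
      };

      /* Recompute the output register values when a control policy change is
       * identified, taking the latest input signal information into account. */
      static void apply_policy_change(struct fw_wrapper *w, uint32_t new_policy)
      {
          if (new_policy == w->policy)
              return;                          /* no control policy change to apply */

          for (size_t i = 0; i < NUM_REGS; i++) {
              uint32_t in_val = w->in[i];      /* detect input signal information */
              /* Hypothetical transform: combine the input value with the policy. */
              w->out[i] = (in_val & 0xFFFFu) | (new_policy << 16);
          }
          w->policy = new_policy;
      }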
  • Publication number: 20180188798
    Abstract: Systems, apparatuses and methods may provide for firmware access wrapper technology that includes a plurality of input registers communicatively coupled to a hardware power controller, a plurality of output registers communicatively coupled to the hardware power controller, and a processor communicatively coupled to the input registers and the output registers. The processor may include configurable logic to identify a control policy change with respect to the hardware power controller, detect input signal information in one or more of the input registers, and conduct a modification of one or more values in the output registers based on the control policy change and the input signal information.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Inventors: Jonathan E. Schmidt, Joerg Hartung, Steffen Kosinski, Jack Cummings
  • Patent number: 9973417
    Abstract: Methods related to communication between and within nodes in a high performance computing system are presented. Processing time for message exchange between a processing unit and a network interface controller in a node is reduced. Resources required to manage application state in the network interface controller are minimized. In the network interface controller, multiple contexts are multiplexed into one physical Direct Memory Access engine. Virtual to physical address translation in the network interface controller is accelerated by using a plurality of independent caches, with each level of the page table hierarchy cached in an independent cache. A memory management scheme for data structures distributed between the processing unit and the network interface controller is provided. The state required to implement end-to-end reliability is reduced by limiting the transmit sequence number space to the currently in-flight messages.
    Type: Grant
    Filed: September 2, 2016
    Date of Patent: May 15, 2018
    Assignee: Intel Corporation
    Inventors: Keith D. Underwood, Steffen Kosinski, Jaroslaw Topp, Jan Norden, Michael Redeker
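    Illustrative sketch: Of the mechanisms listed in the abstract above, the per-level translation caching is the easiest to show compactly: each level of the page table hierarchy gets its own small, independent cache, so a miss low in the hierarchy can still reuse cached upper-level entries. The C sketch below assumes a 4-level, 9-bits-per-level layout, small direct-mapped caches, and a stubbed page-table read; none of those specifics come from the patent.

      #include <stdint.h>
      #include <stdbool.h>

      #define LEVELS        4
      #define CACHE_ENTRIES 64

      struct level_cache_entry {
          bool     valid;
          uint64_t tag;   /* virtual-address bits that select this level's entry */
          uint64_t pa;    /* physical address of the next-level table (or page)  */
      };

      /* One small, independent cache per page-table level. */
      static struct level_cache_entry cache[LEVELS][CACHE_ENTRIES];

      /* Stub standing in for a page-table read from host memory. */
      static uint64_t read_pte(uint64_t table_pa, unsigned index)
      {
          return table_pa + ((uint64_t)index << 12);   /* fake next-level address */
      }

      static uint64_t translate(uint64_t va, uint64_t root_pa)
      {
          uint64_t table_pa = root_pa;

          for (int level = 0; level < LEVELS; level++) {
              unsigned shift = 39 - 9 * level;          /* 9 index bits per level  */
              uint64_t tag   = va >> shift;             /* all VA bits above shift */
              unsigned slot  = (unsigned)(tag % CACHE_ENTRIES);
              struct level_cache_entry *e = &cache[level][slot];

              if (e->valid && e->tag == tag) {
                  table_pa = e->pa;                     /* hit in this level's cache */
                  continue;
              }
              table_pa = read_pte(table_pa, (unsigned)(tag & 0x1FF));  /* walk one level */
              e->valid = true;                          /* install for later lookups */
              e->tag   = tag;
              e->pa    = table_pa;
          }
          return table_pa | (va & 0xFFF);               /* append the page offset */
      }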
  • Publication number: 20170054633
    Abstract: Methods related to communication between and within nodes in a high performance computing system are presented. Processing time for message exchange between a processing unit and a network interface controller in a node is reduced. Resources required to manage application state in the network interface controller are minimized. In the network interface controller, multiple contexts are multiplexed into one physical Direct Memory Access engine. Virtual to physical address translation in the network interface controller is accelerated by using a plurality of independent caches, with each level of the page table hierarchy cached in an independent cache. A memory management scheme for data structures distributed between the processing unit and the network interface controller is provided. The state required to implement end-to-end reliability is reduced by limiting the transmit sequence number space to the currently in-flight messages.
    Type: Application
    Filed: September 2, 2016
    Publication date: February 23, 2017
    Inventors: Keith D. Underwood, Steffen Kosinski, Jaroslaw Topp, Jan Uerpmann, Michael Redeker
  • Patent number: 9558121
    Abstract: A virtually tagged cache may be configured to index virtual address entries in the cache into lockable sets based on a page offset value. When a memory operation misses on the virtually tagged cache, only the one set of virtual address entries with the same page offset may be locked. Thereafter, this general lock may be released and only an address stored in the physical tag array matching the physical address and a virtual address in the virtual tag array corresponding to the matching address stored in the physical tag array may be locked to reduce the amount and duration of locked addresses. The machine may be stalled only if a particular memory address request hits and/or tries to access one or more entries in a locked set. Devices, systems, methods, and computer readable media are provided.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: January 31, 2017
    Assignee: Intel Corporation
    Inventors: Li-Gao Zei, Fernando Latorre, Steffen Kosinski, Jaroslaw Topp, Varun Mohandru, Lutz Naethke
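    Illustrative sketch: The abstract above keys a temporary lock on a single set of virtual-tag entries selected by the page offset of the missing address, and stalls a later request only if it touches a locked set. The C sketch below assumes 64-byte lines, 4 KiB pages, and a simple per-set lock flag; the follow-up step of narrowing the lock to one matching entry is only noted in a comment.

      #include <stdint.h>
      #include <stdbool.h>

      #define NUM_SETS 64                 /* page-offset bits 11:6 pick the set */

      static bool set_locked[NUM_SETS];

      static unsigned set_index(uint64_t vaddr)
      {
          return (unsigned)((vaddr & 0xFFF) >> 6);   /* index by page offset only */
      }

      /* On a miss in the virtually tagged cache, lock only the one set whose
       * entries share the page offset of the missing address. */
      static void lock_set_on_miss(uint64_t miss_vaddr)
      {
          set_locked[set_index(miss_vaddr)] = true;
      }

      /* Once the physical-tag lookup identifies the single matching entry, the
       * broad set lock is released and only that entry stays locked (not shown). */
      static void release_set(uint64_t miss_vaddr)
      {
          set_locked[set_index(miss_vaddr)] = false;
      }

      /* A new memory request stalls only if it hits or accesses a locked set. */
      static bool must_stall(uint64_t req_vaddr)
      {
          return set_locked[set_index(req_vaddr)];
      }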
  • Patent number: 9507725
    Abstract: A bit or other vector may be used to identify whether an address range entered into an intermediate buffer corresponds to most recently updated data associated with the address range. A bit or other vector may also be used to identify whether an address range entered into an intermediate buffer overlaps with an address range of data that is to be loaded. A processing device may then determine whether to obtain data that is to be loaded entirely from a cache, entirely from an intermediate buffer which temporarily buffers data destined for a cache until the cache is ready to accept the data, or from both the cache and the intermediate buffer depending on the particular vector settings. Systems, devices, methods, and computer readable media are provided.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: November 29, 2016
    Assignee: Intel Corporation
    Inventors: Steffen Kosinski, Fernando Latorre, Niranjan Cooray, Stanislav Shwartsman, Ethan Kalifon, Varun Mohandru, Pedro Lopez, Tom Aviram-Rosenfeld, Jaroslaw Topp, Li-Gao Zei
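    Illustrative sketch: The decision described above, serving a load entirely from the cache, entirely from the intermediate buffer, or from both, can be reduced to a few per-entry flags. The C sketch below uses single booleans where the patent describes bits or vectors over an address range; the flag names and the enum are assumptions for illustration only.

      #include <stdbool.h>

      enum load_source { FROM_CACHE, FROM_BUFFER, FROM_BOTH };

      struct buffer_entry_state {
          bool overlaps_load;   /* buffered address range overlaps the load range  */
          bool most_recent;     /* buffered data is the newest copy for that range */
          bool covers_load;     /* buffered range fully covers the load range      */
      };

      static enum load_source pick_source(struct buffer_entry_state s)
      {
          if (!s.overlaps_load || !s.most_recent)
              return FROM_CACHE;        /* the cache already holds the freshest data */
          if (s.covers_load)
              return FROM_BUFFER;       /* the intermediate buffer alone suffices    */
          return FROM_BOTH;             /* merge buffer bytes with cache bytes       */
      }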
  • Patent number: 9436651
    Abstract: Methods related to communication between and within nodes in a high performance computing system are presented. Processing time for message exchange between a processing unit and a network interface controller in a node is reduced. Resources required to manage application state in the network interface controller are minimized. In the network interface controller, multiple contexts are multiplexed into one physical Direct Memory Access engine. Virtual to physical address translation in the network interface controller is accelerated by using a plurality of independent caches, with each level of the page table hierarchy cached in an independent cache. A memory management scheme for data structures distributed between the processing unit and the network interface controller is provided. The state required to implement end-to-end reliability is reduced by limiting the transmit sequence number space to the currently in-flight messages.
    Type: Grant
    Filed: December 9, 2010
    Date of Patent: September 6, 2016
    Assignee: Intel Corporation
    Inventors: Keith D. Underwood, Steffen Kosinski, Jaroslaw Topp, Jan Uerpmann, Michael Redeker
  • Patent number: 9311239
    Abstract: A system and method to implement a tag structure for a cache memory that includes a multi-way, set-associative translation lookaside buffer. The tag structure may store vectors in an L1 tag array to enable an L1 tag lookup that has fewer bits per entry and consumes less power. The vectors may identify entries in a translation lookaside buffer tag array. When a virtual memory address associated with a memory access instruction hits in the translation lookaside buffer, the translation lookaside buffer may generate a vector identifying the set and the way of the translation lookaside buffer entry that matched. This vector may then be compared to a group of vectors stored in a set of the L1 tag arrays to determine whether the virtual memory address hits in the L1 cache.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: April 12, 2016
    Assignee: Intel Corporation
    Inventors: Niranjan Cooray, Steffen Kosinski, Rami May, Doron Gershon, Jaroslaw Topp, Varun Mohandru
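    Illustrative sketch: The abstract above replaces a wide L1 physical-tag compare with a compare against a compact identifier of the TLB entry that translated each line. The C sketch below encodes the hitting TLB (set, way) as a small integer and matches it against the identifiers stored with each L1 way; the sizes and the encoding are assumptions, not values from the patent.

      #include <stdint.h>
      #include <stdbool.h>

      #define TLB_WAYS 4
      #define L1_WAYS  8

      /* Encode the TLB entry that hit as a compact identifier (0 = no hit). */
      static uint16_t tlb_hit_vector(unsigned tlb_set, unsigned tlb_way)
      {
          return (uint16_t)(tlb_set * TLB_WAYS + tlb_way + 1);
      }

      /* Each L1 way in the indexed set stores the identifier of the TLB entry
       * that mapped its line; a matching identifier implies the full physical
       * tags would have matched, so no wide tag compare is needed. */
      static bool l1_lookup(const uint16_t l1_set_vectors[L1_WAYS],
                            uint16_t hit_vector, unsigned *way_out)
      {
          for (unsigned w = 0; w < L1_WAYS; w++) {
              if (hit_vector != 0 && l1_set_vectors[w] == hit_vector) {
                  *way_out = w;
                  return true;          /* L1 hit for this translation */
              }
          }
          return false;                 /* L1 miss (or no TLB hit vector) */
      }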
  • Patent number: 9280474
    Abstract: A system and method for adaptive data prefetching in a processor enables adaptive modification of parameters associated with a prefetch operation. A stride pattern in successive addresses of a memory operation may be detected, including determining a stride length (L). Prefetching of memory operations may be based on a prefetch address determined from a base memory address, the stride length L, and a prefetch distance (D). A number of prefetch misses may be counted at a miss prefetch count (C). Based on the value of the miss prefetch count C, the prefetch distance D may be modified. As a result of adaptive modification of the prefetch distance D, an improved rate of cache hits may be realized.
    Type: Grant
    Filed: January 3, 2013
    Date of Patent: March 8, 2016
    Assignee: Intel Corporation
    Inventors: Demos Pavlou, Pedro Lopez, Mirem Hyuseinova, Fernando Latorre, Steffen Kosinski, Ralf Goettsche, Varun K. Mohandru
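    Illustrative sketch: The abstract above reads almost directly as a control loop: detect a stride length L, prefetch the address L x D ahead, count prefetch-related misses in C, and adjust the prefetch distance D from C. The C sketch below uses a confirmed-stride stream and a simple miss threshold; the threshold, the distance cap, and the structure names are assumptions for illustration, not values from the patent.

      #include <stdint.h>

      #define MISS_THRESHOLD 4
      #define MAX_DISTANCE   16

      struct stride_prefetcher {
          uint64_t last_addr;    /* last demand address seen for this stream */
          int64_t  stride;       /* detected stride length L                 */
          unsigned distance;     /* prefetch distance D (in strides)         */
          unsigned miss_count;   /* miss prefetch count C                    */
      };

      /* Placeholder for handing a prefetch request to the memory system. */
      static void issue_prefetch(uint64_t addr) { (void)addr; }

      static void on_demand_access(struct stride_prefetcher *p, uint64_t addr,
                                   int was_cache_miss)
      {
          int64_t new_stride = (int64_t)addr - (int64_t)p->last_addr;

          if (new_stride != 0 && new_stride == p->stride) {
              /* Stride confirmed: prefetch D strides ahead of the current access. */
              issue_prefetch(addr + (uint64_t)(p->stride * (int64_t)p->distance));

              /* Misses on a covered stream suggest prefetches arrive too late: grow D. */
              if (was_cache_miss && ++p->miss_count >= MISS_THRESHOLD) {
                  if (p->distance < MAX_DISTANCE)
                      p->distance++;
                  p->miss_count = 0;
              }
          } else {
              p->stride = new_stride;    /* (re)train on the newly observed stride */
          }
          p->last_addr = addr;
      }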
  • Publication number: 20150220436
    Abstract: A system and method to implement a tag structure for a cache memory that includes a multi-way, set-associative translation lookaside buffer. The tag structure may store vectors in an L1 tag array to enable an L1 tag lookup that has fewer bits per entry and consumes less power. The vectors may identify entries in a translation lookaside buffer tag array. When a virtual memory address associated with a memory access instruction hits in the translation lookaside buffer, the translation lookaside buffer may generate a vector identifying the set and the way of the translation lookaside buffer entry that matched. This vector may then be compared to a group of vectors stored in a set of the L1 tag arrays to determine whether the virtual memory address hits in the L1 cache.
    Type: Application
    Filed: March 14, 2013
    Publication date: August 6, 2015
    Applicant: Intel Corporation
    Inventors: Niranjan Cooray, Steffen Kosinski, Rami May, Doron Gershon, Jaroslaw Topp, Varun Mohandru
  • Publication number: 20150143057
    Abstract: A system and method for adaptive data prefetching in a processor enables adaptive modification of parameters associated with a prefetch operation. A stride pattern in successive addresses of a memory operation may be detected, including determining a stride length (L). Prefetching of memory operations may be based on a prefetch address determined from a base memory address, the stride length L, and a prefetch distance (D). A number of prefetch misses may be counted at a miss prefetch count (C). Based on the value of the miss prefetch count C, the prefetch distance D may be modified. As a result of adaptive modification of the prefetch distance D, an improved rate of cache hits may be realized.
    Type: Application
    Filed: January 3, 2013
    Publication date: May 21, 2015
    Inventors: Demos Pavlou, Pedro Lopez, Mirem Hyuseinova, Fernando Latorre, Steffen Kosinski, Ralf Goettsche, Varun K. Mohandru
  • Publication number: 20140189238
    Abstract: A virtually tagged cache may be configured to index virtual address entries in the cache into lockable sets based on a page offset value. When a memory operation misses on the virtually tagged cache, only the one set of virtual address entries with the same page offset may be locked. Thereafter, this general lock may be released and only an address stored in the physical tag array matching the physical address and a virtual address in the virtual tag array corresponding to the matching address stored in the physical tag array may be locked to reduce the amount and duration of locked addresses. The machine may be stalled only if a particular memory address request hits and/or tries to access one or more entries in a locked set. Devices, systems, methods, and computer readable media are provided.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Li-Gao Zei, Fernando Latorre, Steffen Kosinski, Jaroslaw Topp, Varun Mohandru, Lutz Naethke
  • Publication number: 20140189250
    Abstract: A bit or other vector may be used to identify whether an address range entered into an intermediate buffer corresponds to most recently updated data associated with the address range. A bit or other vector may also be used to identify whether an address range entered into an intermediate buffer overlaps with an address range of data that is to be loaded. A processing device may then determine whether to obtain data that is to be loaded entirely from a cache, entirely from an intermediate buffer which temporarily buffers data destined for a cache until the cache is ready to accept the data, or from both the cache and the intermediate buffer depending on the particular vector settings. Systems, devices, methods, and computer readable media are provided.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Steffen Kosinski, Fernando Latorre, Niranjan Cooray, Stanislav Shwartsman, Ethan Kalifon, Varun Mohandru, Pedro Lopez, Tom Aviram-Rosenfeld, Jaroslaw Topp, Li-Gao Zei
  • Publication number: 20130246552
    Abstract: Methods related to communication between and within nodes in a high performance computing system are presented. Processing time for message exchange between a processing unit and a network interface controller in a node is reduced. Resources required to manage application state in the network interface controller are minimized. In the network interface controller, multiple contexts are multiplexed into one physical Direct Memory Access engine. Virtual to physical address translation in the network interface controller is accelerated by using a plurality of independent caches, with each level of the page table hierarchy cached in an independent cache. A memory management scheme for data structures distributed between the processing unit and the network interface controller is provided. The state required to implement end-to-end reliability is reduced by limiting the transmit sequence number space to the currently in-flight messages.
    Type: Application
    Filed: December 9, 2010
    Publication date: September 19, 2013
    Inventors: Keith D. Underwood, Steffen Kosinski, Jaroslaw Topp, Jan Uerpmann, Michael Redeker
  • Patent number: 8386594
    Abstract: An embodiment may include network controller circuitry to be included in a first host computer that includes a host processor to execute an operating system environment. The circuitry may initiate, at least in part, one or more checkpoints of, at least in part, one or more states associated with, at least in part, the operating system environment and network traffic between the first host computer and a second host computer. The circuitry also may coordinate, at least in part, respective execution, at least in part, of the one or more checkpoints with respective execution of one or more other respective checkpoints of the second host computer. Of course, many alternatives, variations, and modifications are possible without departing from this embodiment.
    Type: Grant
    Filed: February 11, 2010
    Date of Patent: February 26, 2013
    Assignee: Intel Corporation
    Inventors: Keith D. Underwood, David N. Lombard, Jan Uerpmann, Steffen Kosinski
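    Illustrative sketch: The abstract above describes network controller circuitry that captures host and traffic state and coordinates its checkpoints with a peer host. The C sketch below shows one plausible handshake, quiescing traffic, capturing state, exchanging readiness with the peer, and resuming; the message types and helper functions are assumptions for illustration only, not the protocol claimed in the patent.

      #include <stdbool.h>

      enum ckpt_msg { CKPT_READY, CKPT_FAILED };

      /* Placeholders standing in for NIC-level operations. */
      static void pause_network_traffic(void)      { }
      static void capture_local_state(void)        { }
      static void send_to_peer(enum ckpt_msg m)    { (void)m; }
      static enum ckpt_msg wait_for_peer(void)     { return CKPT_READY; }
      static void resume_network_traffic(void)     { }

      /* Take one checkpoint coordinated with the peer host's checkpoint. */
      static bool coordinated_checkpoint(void)
      {
          pause_network_traffic();      /* stop new traffic so captured state is stable */
          capture_local_state();        /* snapshot OS-environment and NIC state        */

          send_to_peer(CKPT_READY);     /* report that the local checkpoint is complete */
          bool ok = (wait_for_peer() == CKPT_READY);   /* wait for the peer's checkpoint */

          resume_network_traffic();     /* resume once coordination is done */
          return ok;
      }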
  • Publication number: 20110196950
    Abstract: An embodiment may include network controller circuitry to be included in a first host computer that includes a host processor to execute an operating system environment. The circuitry may initiate, at least in part, one or more checkpoints of, at least in part, one or more states associated with, at least in part, the operating system environment and network traffic between the first host computer and a second host computer. The circuitry also may coordinate, at least in part, respective execution, at least in part, of the one or more checkpoints with respective execution of one or more other respective checkpoints of the second host computer. Of course, many alternatives, variations, and modifications are possible without departing from this embodiment.
    Type: Application
    Filed: February 11, 2010
    Publication date: August 11, 2011
    Inventors: Keith D. Underwood, David N. Lombard, Jan Uerpmann, Steffen Kosinski