Patents by Inventor Jaroslaw Topp

Jaroslaw Topp has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9436651
    Abstract: Methods related to communication between and within nodes in a high performance computing system are presented. Processing time for message exchange between a processing unit and a network interface controller in a node is reduced. Resources required to manage application state in the network interface controller are minimized. In the network interface controller, multiple contexts are multiplexed into one physical Direct Memory Access engine. Virtual-to-physical address translation in the network interface controller is accelerated by using a plurality of independent caches, with each level of the page table hierarchy cached in an independent cache. A memory management scheme for data structures distributed between the processing unit and the network interface controller is provided. The state required to implement end-to-end reliability is reduced by limiting the transmit sequence number space to the currently in-flight messages.
    Type: Grant
    Filed: December 9, 2010
    Date of Patent: September 6, 2016
    Assignee: Intel Corporation
    Inventors: Keith D. Underwood, Steffen Kosinski, Jaroslaw Topp, Jan Uerpmann, Michael Redeker
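    The abstract above mentions accelerating virtual-to-physical address translation in the network interface controller by giving each level of the page table hierarchy its own independent cache. As a rough illustration only, the C++ sketch below models that idea under assumed parameters (a four-level, 9-bits-per-level radix page table); the class and method names are hypothetical and not taken from the patent.

    ```cpp
    #include <array>
    #include <cstdint>
    #include <optional>
    #include <unordered_map>
    #include <utility>

    // Hypothetical model: one small cache per page-table level, so a page walk
    // can skip the upper levels whenever a lower-level entry is already cached.
    class PerLevelWalkCache {
    public:
        static constexpr int kLevels = 4;          // e.g. PML4, PDPT, PD, PT
        static constexpr int kBitsPerLevel = 9;

        // Remember the physical address of the table entry that maps this VPN
        // prefix at the given level (0 = leaf level, kLevels-1 = root level).
        void insert(int level, uint64_t vpn, uint64_t entry_pa) {
            cache_[level][prefix(level, vpn)] = entry_pa;
        }

        // Return the entry cached at the deepest level that hits, so the walker
        // only has to fetch the page-table levels below it from memory.
        std::optional<std::pair<int, uint64_t>> lookup(uint64_t vpn) const {
            for (int level = 0; level < kLevels; ++level) {   // deepest first
                auto it = cache_[level].find(prefix(level, vpn));
                if (it != cache_[level].end())
                    return std::make_pair(level, it->second);
            }
            return std::nullopt;
        }

    private:
        // Keep only the VPN bits that select the entry at this level.
        static uint64_t prefix(int level, uint64_t vpn) {
            return vpn >> (level * kBitsPerLevel);
        }

        std::array<std::unordered_map<uint64_t, uint64_t>, kLevels> cache_;
    };
    ```

    Because each level has its own cache, a hit at any level shortens the walk and the caches can be sized independently; this is one plausible reading of the abstract, not the claimed structure.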
  • Patent number: 9367477
    Abstract: A processor includes a core with logic to execute a translated instruction. The translated instruction is translated from an instruction stored in a memory location. The processor further includes a translation lookaside buffer including logic to store translation indicators from a physical map. Each translation indicator indicates whether a corresponding memory location includes translated code to be protected. The processor further includes a translation indicator agent including logic to determine whether the buffer indicates that the memory location has been modified subsequent to translation of the instruction.
    Type: Grant
    Filed: September 24, 2014
    Date of Patent: June 14, 2016
    Assignee: Intel Corporation
    Inventors: Jaroslaw Topp, Niranjan L. Cooray, Fernando LaTorre
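    The protection mechanism in this abstract can be illustrated in miniature: a per-physical-page indicator marks pages whose contents have been translated, and any later store to a marked page means the translation can no longer be trusted. The following C++ sketch models only that bookkeeping; the names are hypothetical and the hardware agent described in the patent is not modeled.

    ```cpp
    #include <cstdint>
    #include <unordered_set>

    // Hypothetical software model of per-page translation indicators.
    class TranslationIndicatorMap {
    public:
        static constexpr int kPageShift = 12;               // assume 4 KiB pages

        // Record that translated code was generated from this physical page.
        void mark_translated(uint64_t phys_page) { protected_pages_.insert(phys_page); }

        // Called on every committed store; returns true if the store touched a
        // page holding translated code, so its translations must be discarded.
        bool note_store(uint64_t phys_addr) {
            uint64_t page = phys_addr >> kPageShift;
            if (protected_pages_.count(page)) {
                modified_pages_.insert(page);
                return true;
            }
            return false;
        }

        // Checked before running a translated block: safe only if none of the
        // block's source pages were written after translation.
        bool translation_still_valid(uint64_t src_phys_page) const {
            return modified_pages_.count(src_phys_page) == 0;
        }

    private:
        std::unordered_set<uint64_t> protected_pages_;   // pages with translations
        std::unordered_set<uint64_t> modified_pages_;    // protected pages later written
    };
    ```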
  • Patent number: 9311239
    Abstract: A system and method to implement a tag structure for a cache memory that includes a multi-way, set-associative translation lookaside buffer. The tag structure may store vectors in an L1 tag array to enable an L1 tag lookup that has fewer bits per entry and consumes less power. The vectors may identify entries in a translation lookaside buffer tag array. When a virtual memory address associated with a memory access instruction hits in the translation lookaside buffer, the translation lookaside buffer may generate a vector identifying the set and the way of the translation lookaside buffer entry that matched. This vector may then be compared to a group of vectors stored in a set of the L1 tag arrays to determine whether the virtual memory address hits in the L1 cache.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: April 12, 2016
    Assignee: Intel Corporation
    Inventors: Niranjan Cooray, Steffen Kosinski, Rami May, Doron Gershon, Jaroslaw Topp, Varun Mohandru
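    As a rough illustration of the vector-based tag compare described above: once the TLB reports which set and way matched, that small (set, way) pair can stand in for the full physical tag in the L1 tag array, so the per-way compare is only a few bits wide. The C++ sketch below assumes an 8-way L1 and single-byte set/way fields; none of these parameters or names come from the patent.

    ```cpp
    #include <cstdint>

    // Each L1 tag entry stores the (set, way) of the TLB entry that translated
    // the line's address instead of a full physical tag.
    struct TlbVector {
        uint8_t set;     // TLB set that translated the address
        uint8_t way;     // matching way within that set
        bool    valid;
    };

    inline bool same_vector(const TlbVector& a, const TlbVector& b) {
        return a.valid && b.valid && a.set == b.set && a.way == b.way;
    }

    constexpr int kL1Ways = 8;   // assumed L1 associativity

    // Compare the vector produced by the TLB hit against every way of one L1 set.
    int l1_tag_lookup(const TlbVector l1_tags[kL1Ways], const TlbVector& tlb_hit) {
        for (int way = 0; way < kL1Ways; ++way)
            if (same_vector(l1_tags[way], tlb_hit))
                return way;      // L1 hit in this way
        return -1;               // miss (or the referenced TLB entry was replaced)
    }
    ```

    One consequence worth noting: when a TLB entry is evicted or refilled, L1 tags that point at it no longer identify a valid translation, so those lines must be invalidated or re-tagged; the abstract does not spell out how this is handled, so it is left out of the sketch.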
  • Publication number: 20160085685
    Abstract: A processor includes a core with logic to execute a translated instruction. The translated instruction is translated from an instruction stored in a memory location. The processor further includes a translation lookaside buffer including logic to store translation indicators from a physical map. Each translation indicator indicates whether a corresponding memory location includes translated code to be protected. The processor further includes a translation indicator agent including logic to determine whether the buffer indicates that the memory location has been modified subsequent to translation of the instruction.
    Type: Application
    Filed: September 24, 2014
    Publication date: March 24, 2016
    Inventors: Jaroslaw Topp, Niranjan L. Cooray, Fernando LaTorre
  • Publication number: 20150378731
    Abstract: Various embodiments of the invention are described, including: (1) a method and apparatus for intelligently allocating threads within a binary translation system; (2) data cache way prediction guided by binary translation code morphing software; (3) fast interpreter hardware support on the data-side; (4) out-of-order retirement; (5) decoupled load retirement in an atomic OOO processor; (6) handling transactional and atomic memory in an out-of-order binary translation based processor; and (7) speculative memory management in a binary translation based out-of-order processor.
    Type: Application
    Filed: June 30, 2014
    Publication date: December 31, 2015
    Inventors: Patrick P. Lai, Ethan Schuchman, David Keppel, Denis M. Khartikov, Polychronis Xekalakis, Joshua B. Fryman, Allan D. Knies, Naveen Neelakantam, Gregor Stellpflug, John H. Kelm, Mirem Hyuseinova, Demos Pavlou, Jaroslaw Topp
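    Item (2) in this list, data cache way prediction guided by the binary translation software, can be illustrated generically. The sketch below is a textbook PC-indexed way predictor and says nothing about how the patented code morphing software would actually steer it; the table size, associativity, and all names are assumptions.

    ```cpp
    #include <cstdint>

    // Generic PC-indexed way predictor: guess which way of a set-associative
    // data cache a load will hit so only that way is probed first.
    constexpr int kPredEntries = 256;   // assumed predictor size
    constexpr int kWays = 8;            // assumed cache associativity

    struct WayPredictor {
        uint8_t predicted_way[kPredEntries] = {};

        int predict(uint64_t load_pc) const {
            return predicted_way[load_pc % kPredEntries];
        }

        // Train with the way that actually hit; in the patent's setting the
        // binary translation software could presumably also seed such hints.
        void train(uint64_t load_pc, int actual_way) {
            predicted_way[load_pc % kPredEntries] = static_cast<uint8_t>(actual_way % kWays);
        }
    };
    ```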
  • Patent number: 9158705
    Abstract: A processing device implementing stride-based translation lookaside buffer (TLB) prefetching with adaptive offset is disclosed. A processing device of the disclosure includes a data prefetcher to generate a data prefetch address based on a linear address, a stride, or a prefetch distance, the data prefetch address associated with a data prefetch request, and a TLB prefetch address computation component to generate a TLB prefetch address based on the linear address, the stride, the prefetch distance, or an adaptive offset. The processing device also includes a cross page detection component to determine that the data prefetch address or the TLB prefetch address crosses a page boundary associated with the linear address, and cause a TLB prefetch request to be written to a TLB request queue, the TLB prefetch request for translation of an address of a linear page number (LPN) based on the data prefetch address or the TLB prefetch address.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: October 13, 2015
    Assignee: Intel Corporation
    Inventors: Jaroslaw Topp, Pedro Lopez, Fernando Latorre, Demos Pavlou, Thang Vu
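    The abstract describes the prefetcher flow in enough detail to sketch the arithmetic: advance the demand linear address by stride times prefetch distance to get the data prefetch address, add an adaptive offset to get the TLB prefetch address, and enqueue a TLB prefetch only when one of those addresses falls on a different page than the demand access. The C++ below is a minimal model of that flow; the exact formula, the page size, and the queue type are assumptions rather than the patented design.

    ```cpp
    #include <cstdint>
    #include <queue>

    constexpr uint64_t kPageShift = 12;   // assume 4 KiB pages

    struct TlbPrefetchRequest { uint64_t linear_page_number; };

    void maybe_prefetch_tlb(uint64_t linear_addr, int64_t stride, int64_t distance,
                            int64_t adaptive_offset,
                            std::queue<TlbPrefetchRequest>& tlb_request_queue) {
        uint64_t data_prefetch_addr = linear_addr + stride * distance;
        uint64_t tlb_prefetch_addr  = data_prefetch_addr + adaptive_offset;

        uint64_t demand_page = linear_addr        >> kPageShift;
        uint64_t data_page   = data_prefetch_addr >> kPageShift;
        uint64_t tlb_page    = tlb_prefetch_addr  >> kPageShift;

        // Cross-page detection: the demand page is being translated anyway, so
        // only a prefetch address on a *different* page is worth a TLB request.
        if (tlb_page != demand_page)
            tlb_request_queue.push({tlb_page});
        else if (data_page != demand_page)
            tlb_request_queue.push({data_page});
    }
    ```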
  • Publication number: 20150220436
    Abstract: A system and method to implement a tag structure for a cache memory that includes a multi-way, set-associative translation lookaside buffer. The tag structure may store vectors in an L1 tag array to enable an L1 tag lookup that has fewer bits per entry and consumes less power. The vectors may identify entries in a translation lookaside buffer tag array. When a virtual memory address associated with a memory access instruction hits in the translation lookaside buffer, the translation lookaside buffer may generate a vector identifying the set and the way of the translation lookaside buffer entry that matched. This vector may then be compared to a group of vectors stored in a set of the L1 tag arrays to determine whether the virtual memory address hits in the L1 cache.
    Type: Application
    Filed: March 14, 2013
    Publication date: August 6, 2015
    Applicant: Intel Corporation
    Inventors: Niranjan Cooray, Steffen Kosinski, Rami May, Doron Gershon, Jaroslaw Topp, Varun Mohandru
  • Publication number: 20150178090
    Abstract: A system includes a processor with a front end to receive an instruction stream reordered by a software scheduler and including a plurality of memory operations and alias information indicating how a given memory operation may be evaluated. Furthermore, the processor includes a hardware scheduler to reorder, in hardware, the instruction stream for out-of-order execution. In addition, the processor includes a calculation module to determine, for a given memory operation and based upon the alias information, a checking range of memory atoms subsequent to the given memory operation and a virtual order of the memory operation. The virtual order indicates an original ordering of the instructions. The processor also includes an alias unit to reorder the instruction stream, determine whether the hardware reordering caused an error, and determine whether the software reordering caused an error based upon the checking range and the virtual order.
    Type: Application
    Filed: December 23, 2013
    Publication date: June 25, 2015
    Inventors: Rainer Theuer, Arun Raman, Jaroslaw Topp, Rakesh Ranjan, Sebastian Winkel, Gregor Stellpflug, Ulrich Bretthauer
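    A simplified model may make the checking range and virtual order concrete. In the sketch below, each memory operation carries its original (virtual) order and a per-operation checking range; scanning the as-executed stream, a reordering is flagged as erroneous when two operations inside a checking range touch the same address in the opposite of their virtual order with at least one store involved. The exact conflict rule, the names, and the whole-address comparison are assumptions; only the idea of a bounded, virtual-order-based check comes from the abstract.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct MemOp {
        uint64_t address;          // byte address (full overlap assumed)
        bool     is_store;
        uint32_t virtual_order;    // position in the original program order
        uint32_t checking_range;   // how many following operations to check
    };

    // `executed` is the stream in the order it actually ran, after both the
    // software and the hardware reordering.  Returns true if any pair inside a
    // checking range conflicts and ran in the wrong virtual order.
    bool reordering_caused_error(const std::vector<MemOp>& executed) {
        for (std::size_t i = 0; i < executed.size(); ++i) {
            const MemOp& earlier = executed[i];
            std::size_t limit = std::min(executed.size(),
                                         i + 1 + static_cast<std::size_t>(earlier.checking_range));
            for (std::size_t j = i + 1; j < limit; ++j) {
                const MemOp& later = executed[j];
                bool conflict = earlier.address == later.address &&
                                (earlier.is_store || later.is_store);
                // Executed in this order, but originally the other way around.
                if (conflict && later.virtual_order < earlier.virtual_order)
                    return true;
            }
        }
        return false;
    }
    ```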
  • Patent number: 9009413
    Abstract: A processor includes a processor core including an execution unit to execute instructions, and a cache memory. The cache memory includes a controller to update each of a plurality of stale indicators in response to a lazy flush instruction. Each stale indicator is associated with respective data, and each updated stale indicator is to indicate that the respective data is stale. The cache memory also includes a plurality of cache lines. Each cache line is to store corresponding data and a foreground tag that includes a respective virtual address associated with the corresponding data, and that includes the associated stale indicator. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: April 14, 2015
    Assignee: Intel Corporation
    Inventors: Varun K. Mohandru, Fernando Latorre, Niranjan L. Cooray, Pedro Lopez, Naveen Neelakantam, Li-Gao Zei, Rami May, Jaroslaw Topp, Thomas Gaertner
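    The lazy flush idea above can be illustrated with a small software model: instead of writing back or invalidating every line when virtual-to-physical mappings may have changed, the controller marks every line stale in one step, and a later hit on a stale line is treated as needing re-verification before the data is used. The C++ sketch below models only that bookkeeping; the line format, the re-verification path, and all names are assumptions, not the claimed hardware.

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct CacheLine {
        uint64_t virtual_tag = 0;   // "foreground" tag: virtual address bits
        bool     valid = false;
        bool     stale = false;     // set by lazy flush, cleared after a re-check
        // data payload omitted
    };

    class LazyFlushCache {
    public:
        explicit LazyFlushCache(std::size_t num_lines) : lines_(num_lines) {}

        // The lazy flush: mark everything stale instead of invalidating, so
        // lines whose mapping turns out unchanged can keep their data.
        void lazy_flush() {
            for (auto& line : lines_) line.stale = true;
        }

        // A non-stale hit is usable directly; a stale hit is reported as a miss
        // here so the caller re-verifies the translation and, if it still
        // matches, clears the stale bit rather than refetching the data.
        bool lookup(uint64_t vtag, std::size_t index) const {
            const CacheLine& line = lines_[index % lines_.size()];
            return line.valid && !line.stale && line.virtual_tag == vtag;
        }

    private:
        std::vector<CacheLine> lines_;
    };
    ```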
  • Publication number: 20140281351
    Abstract: A processing device implementing stride-based translation lookaside buffer (TLB) prefetching with adaptive offset is disclosed. A processing device of the disclosure includes a data prefetcher to generate a data prefetch address based on a linear address, a stride, or a prefetch distance, the data prefetch address associated with a data prefetch request, and a TLB prefetch address computation component to generate a TLB prefetch address based on the linear address, the stride, the prefetch distance, or an adaptive offset. The processing device also includes a cross page detection component to determine that the data prefetch address or the TLB prefetch address crosses a page boundary associated with the linear address, and cause a TLB prefetch request to be written to a TLB request queue, the TLB prefetch request for translation of an address of a linear page number (LPN) based on the data prefetch address or the TLB prefetch address.
    Type: Application
    Filed: March 13, 2013
    Publication date: September 18, 2014
    Inventors: Jaroslaw Topp, Pedro Lopez, Fernando Latorre, Demos Pavlou, Thang Vu
  • Publication number: 20140189238
    Abstract: A virtually tagged cache may be configured to index virtual address entries in the cache into lockable sets based on a page offset value. When a memory operation misses in the virtually tagged cache, only the one set of virtual address entries with the same page offset may be locked. Thereafter, this general lock may be released, and only the entry in the physical tag array that matches the physical address, together with the corresponding virtual address entry in the virtual tag array, may remain locked, reducing the number and duration of locked addresses. The machine may be stalled only if a particular memory address request hits and/or tries to access one or more entries in a locked set. Devices, systems, methods, and computer readable media are provided.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Li-Gao Zei, Fernando Latorre, Steffen Kosinski, Jaroslaw Topp, Varun Mohandru, Lutz Naethke
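    Because the set index is taken from page-offset bits that virtual and physical addresses share, every alias of a line lands in the same lockable set, which is why locking a single set during a miss is sufficient. The C++ sketch below models only the first, coarse stage of that two-stage locking (the per-entry lock retained after the physical tag match is omitted); the set count, index bits, and names are assumptions.

    ```cpp
    #include <cstdint>
    #include <vector>

    // One coarse lock per set, with sets indexed by page-offset bits [11:6]
    // (64-byte lines and 4 KiB pages assumed), so virtual aliases of the same
    // physical line always map to the same lockable set.
    class PageOffsetLocks {
    public:
        static constexpr int kSets = 64;

        PageOffsetLocks() : set_locked_(kSets, false) {}

        static int set_of(uint64_t vaddr) {
            return static_cast<int>((vaddr >> 6) & (kSets - 1));
        }

        // Stage 1: a miss locks only the set that shares the page offset.
        void lock_set(uint64_t vaddr)   { set_locked_[set_of(vaddr)] = true; }
        void unlock_set(uint64_t vaddr) { set_locked_[set_of(vaddr)] = false; }

        // Later requests stall only when they touch a locked set.
        bool must_stall(uint64_t vaddr) const { return set_locked_[set_of(vaddr)]; }

    private:
        std::vector<bool> set_locked_;
    };
    ```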
  • Publication number: 20140181388
    Abstract: A processor includes a processor core including an execution unit to execute instructions, and a cache memory. The cache memory includes a controller to update each of a plurality of stale indicators in response to a lazy flush instruction. Each stale indicator is associated with respective data, and each updated stale indicator is to indicate that the respective data is stale. The cache memory also includes a plurality of cache lines. Each cache line is to store corresponding data and a foreground tag that includes a respective virtual address associated with the corresponding data, and that includes the associated stale indicator. Other embodiments are described and claimed.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Inventors: Varun K. Mohandru, Fernando Latorre, Niranjan L. Cooray, Pedro Lopez, Naveen Neelakantam, Li-Gao Zei, Rami May, Jaroslaw Topp, Thomas Gaertner
  • Publication number: 20130246552
    Abstract: Methods related to communication between and within nodes in a high performance computing system are presented. Processing time for message exchange between a processing unit and a network interface controller in a node is reduced. Resources required to manage application state in the network interface controller are minimized. In the network interface controller, multiple contexts are multiplexed into one physical Direct Memory Access engine. Virtual-to-physical address translation in the network interface controller is accelerated by using a plurality of independent caches, with each level of the page table hierarchy cached in an independent cache. A memory management scheme for data structures distributed between the processing unit and the network interface controller is provided. The state required to implement end-to-end reliability is reduced by limiting the transmit sequence number space to the currently in-flight messages.
    Type: Application
    Filed: December 9, 2010
    Publication date: September 19, 2013
    Inventors: Keith D. Underwood, Steffen Kosinski, Jaroslaw Topp, Jan Uerpmann, Michael Redeker