Patents by Inventor Joshua B. Fryman

Joshua B. Fryman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200233819
    Abstract: An apparatus is described. The apparatus includes a rank of memory chips to couple to a memory channel. The memory channel is characterized as having eight transfers of eight bits of raw data per burst access. The rank of memory chips has first, second, and third X4 memory chips. The X4 memory chips conform to a JEDEC dual data rate (DDR) memory interface specification. The first and second X4 memory chips are to couple to an eight-bit raw data portion of the memory channel's data bus. The third X4 memory chip is to couple to an error correction coding (ECC) information portion of the memory channel's data bus.
    Type: Application
    Filed: March 27, 2020
    Publication date: July 23, 2020
    Inventors: Byoungchan Oh, Sai Dheeraj Polagani, Joshua B. Fryman
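
The widths implied by this abstract work out to 64 raw data bits and 32 ECC bits per burst: the two X4 chips together drive the 8-bit data portion on each of the eight transfers, while the third X4 chip drives 4 ECC bits per transfer. The short C sketch below only reproduces that arithmetic; the constant names are illustrative and not taken from the filing.

```c
#include <stdio.h>

/* Back-of-the-envelope widths suggested by the abstract of publication
 * 20200233819; these names are illustrative assumptions, not claim language. */
enum {
    TRANSFERS_PER_BURST = 8,   /* eight transfers per burst access             */
    DATA_BITS_PER_XFER  = 8,   /* two X4 chips together drive 8 raw data bits  */
    ECC_BITS_PER_XFER   = 4,   /* the third X4 chip drives 4 ECC bits          */
};

int main(void) {
    int data_bits = TRANSFERS_PER_BURST * DATA_BITS_PER_XFER;  /* 64 */
    int ecc_bits  = TRANSFERS_PER_BURST * ECC_BITS_PER_XFER;   /* 32 */
    printf("raw data per burst: %d bits, ECC per burst: %d bits\n",
           data_bits, ecc_bits);
    return 0;
}
```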
  • Patent number: 10684858
    Abstract: Disclosed embodiments relate to an indirect memory fetch (IMF) unit. In one example, an apparatus includes circuitry to fetch and decode an instruction specifying a sparse operand array including N operands, and an index array including N contiguously-addressed indices. The apparatus further includes a processing engine associated with an IMF unit to respond to the decoded instruction by initializing the IMF unit to fetch the N operands in order, probing the IMF unit to determine that a fetched operand is ready to retrieve, retrieving the fetched operand from the IMF unit, and repeating the probing and retrieving until all N operands have been retrieved. The IMF unit, independent of the processing engine, is to fetch the N contiguously-addressed indices from the index array, use the N fetched indices to calculate memory addresses for the N operands, and issue a plurality of read requests to fetch the N operands in order.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: June 16, 2020
    Assignee: Intel Corporation
    Inventors: Stijn Eyerman, Wim Heirman, Kristof Du Bois, Ibrahim Hur, Joshua B. Fryman
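
For orientation, the access pattern that the IMF unit offloads is an indexed gather over a sparse operand array. The sketch below is a minimal software-only rendering of that pattern, with hypothetical array names and element types; it is not the patented hardware mechanism, which performs the index fetch, address calculation, and read issue independently of the processing engine.

```c
#include <stddef.h>

/* Software view of the pattern described in the abstract: read N
 * contiguously-addressed indices, then fetch the N sparse operands they
 * address, in order.  Names and types here are illustrative assumptions. */
void gather_operands(const double *sparse_array,
                     const size_t *index_array,   /* N contiguous indices */
                     double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* each index selects one operand; the IMF unit would issue the
         * equivalent read requests and present operands as they become ready */
        out[i] = sparse_array[index_array[i]];
    }
}
```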
  • Publication number: 20200174929
    Abstract: In one embodiment, an apparatus includes a memory access circuit to receive memory access instructions and provide at least some of the memory access instructions to a memory subsystem for execution. The memory access circuit may have a conversion circuit to convert a first memory access instruction to a first subline memory access instruction, e.g., based at least in part on an access history for the first memory access instruction. Other embodiments are described and claimed.
    Type: Application
    Filed: November 29, 2018
    Publication date: June 4, 2020
    Inventors: Wim Heirman, Stijn Eyerman, Kristof Du Bois, Ibrahim Hur, Joshua B. Fryman
  • Publication number: 20200104164
    Abstract: Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system includes a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines; a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations and then accumulate the generated results; and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
    Type: Application
    Filed: September 28, 2018
    Publication date: April 2, 2020
    Inventors: Robert Pawlowski, Ankit More, Jason M. Howard, Joshua B. Fryman, Tina C. Zhong, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cavé, Sriram Aananthakrishnan, Bharadwaj Krishnamurthy
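
The reduction optimization mentioned in the abstract, multiple threads producing partial results of a commutative operation followed by accumulation of those results, corresponds to a familiar software pattern. The sketch below shows that pattern with plain POSIX threads; the thread count, data, and function names are illustrative assumptions, not details from the filing.

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4                   /* illustrative thread count */
#define N 1024

static double values[N];
static double partial[NUM_THREADS];     /* one partial result per thread */

/* Each thread reduces its own slice; because addition is commutative, the
 * partial results can be accumulated afterwards in any order. */
static void *reduce_slice(void *arg) {
    long t = (long)arg;
    double sum = 0.0;
    for (int i = t * (N / NUM_THREADS); i < (t + 1) * (N / NUM_THREADS); i++)
        sum += values[i];
    partial[t] = sum;
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++)
        values[i] = 1.0;
    pthread_t threads[NUM_THREADS];
    for (long t = 0; t < NUM_THREADS; t++)
        pthread_create(&threads[t], NULL, reduce_slice, (void *)t);
    double total = 0.0;
    for (int t = 0; t < NUM_THREADS; t++) {
        pthread_join(threads[t], NULL);
        total += partial[t];            /* accumulate the generated results */
    }
    printf("total = %f\n", total);      /* prints 1024.000000 */
    return 0;
}
```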
  • Publication number: 20200004602
    Abstract: In one embodiment, a first processor core includes: a plurality of execution pipelines each to execute instructions of one or more threads; a plurality of pipeline barrier circuits coupled to the plurality of execution pipelines, each of the plurality of pipeline barrier circuits associated with one of the plurality of execution pipelines to maintain status information for a plurality of barrier groups, each of the plurality of barrier groups formed of at least two threads; and a core barrier circuit to control operation of the plurality of pipeline barrier circuits and to inform the plurality of pipeline barrier circuits when a first barrier has been reached by a first barrier group of the plurality of barrier groups. Other embodiments are described and claimed.
    Type: Application
    Filed: June 27, 2018
    Publication date: January 2, 2020
    Inventors: Robert Pawlowski, Ankit More, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cavé, Sriram Aananthakrishnan, Jason M. Howard, Joshua B. Fryman
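
As a point of reference, the software-visible behavior that these per-pipeline and core barrier circuits track for a barrier group resembles an ordinary thread barrier over a group of two or more threads. The following POSIX sketch illustrates only that semantics, under an assumed group size of four; it is not the patented circuit.

```c
#include <pthread.h>
#include <stdio.h>

#define GROUP_SIZE 4            /* a barrier group of four threads (assumed) */

static pthread_barrier_t group_barrier;

static void *worker(void *arg) {
    long id = (long)arg;
    printf("thread %ld reached the barrier\n", id);
    /* every thread in the group blocks here until all GROUP_SIZE arrive,
     * which is the condition the hardware barrier status would track */
    pthread_barrier_wait(&group_barrier);
    printf("thread %ld released\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[GROUP_SIZE];
    pthread_barrier_init(&group_barrier, NULL, GROUP_SIZE);
    for (long i = 0; i < GROUP_SIZE; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);
    for (int i = 0; i < GROUP_SIZE; i++)
        pthread_join(threads[i], NULL);
    pthread_barrier_destroy(&group_barrier);
    return 0;
}
```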
  • Publication number: 20190369998
    Abstract: Disclosed embodiments relate to an indirect memory fetch (IMF) unit. In one example, an apparatus includes circuitry to fetch and decode an instruction specifying a sparse operand array including N operands, and an index array including N contiguously-addressed indices. The apparatus further includes a processing engine associated with an IMF unit to respond to the decoded instruction by initializing the IMF unit to fetch the N operands in order, probing the IMF unit to determine that a fetched operand is ready to retrieve, retrieving the fetched operand from the IMF unit, and repeating the probing and retrieving until all N operands have been retrieved. The IMF unit, independent of the processing engine, is to fetch the N contiguously-addressed indices from the index array, use the N fetched indices to calculate memory addresses for the N operands, and issue a plurality of read requests to fetch the N operands in order.
    Type: Application
    Filed: June 1, 2018
    Publication date: December 5, 2019
    Inventors: Stijn Eyerman, Wim Heirman, Kristof Du Bois, Ibrahim Hur, Joshua B. Fryman
  • Publication number: 20190303159
    Abstract: Disclosed embodiments relate to an instruction set architecture to facilitate energy-efficient computing for exascale architectures. In one embodiment, a processor includes a plurality of accelerator cores, each having a corresponding instruction set architecture (ISA); a fetch circuit to fetch one or more instructions specifying one of the accelerator cores; a decode circuit to decode the one or more fetched instructions; and an issue circuit to translate the one or more decoded instructions into the ISA corresponding to the specified accelerator core, collate the one or more translated instructions into an instruction packet, and issue the instruction packet to the specified accelerator core, wherein the plurality of accelerator cores comprises a memory engine (MENG), a collective engine (CENG), a queue engine (QENG), and a chain management unit (CMU).
    Type: Application
    Filed: March 29, 2018
    Publication date: October 3, 2019
    Inventors: Joshua B. Fryman, Jason M. Howard, Priyanka Suresh, Banu Meenakshi Nagasundaram, Srikanth Dakshinamoorthy, Ankit More, Robert Pawlowski, Samkit Jain, Pranav Yeolekar, Avinash M. Seegehalli, Surhud Khare, Dinesh Somasekhar, David S. Dunning, Romain E. Cledat, William Paul Griffin, Bhavitavya B. Bhadviya, Ivan B. Ganev
  • Patent number: 10409763
    Abstract: Various embodiments of the invention are described, including: (1) a method and apparatus for intelligently allocating threads within a binary translation system; (2) data cache way prediction guided by binary translation code morphing software; (3) fast interpreter hardware support on the data side; (4) out-of-order retirement; (5) decoupled load retirement in an atomic OOO processor; (6) handling transactional and atomic memory in an out-of-order binary translation based processor; and (7) speculative memory management in a binary translation based out-of-order processor.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: September 10, 2019
    Assignee: Intel Corporation
    Inventors: Patrick P. Lai, Ethan Schuchman, David Keppel, Denis M. Khartikov, Polychronis Xekalakis, Joshua B. Fryman, Allan D. Knies, Naveen Neelakantam, Gregor Stellpflug, John H. Kelm, Mirem Hyuseinova Seidahmedova, Demos Pavlou, Jaroslaw Topp
  • Patent number: 10296338
    Abstract: In one embodiment, a processor includes: an accelerator associated with a first address space; a core associated with a second address space and including an alternate address space configuration register to store configuration information to enable the core to execute instructions from the first address space; and a control logic to configure the core based in part on information in the alternate address space configuration register. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: Brent R. Boswell, Banu Meenakshi Nagasundaram, Michael D. Abbott, Srikanth Dakshinamoorthy, Jason M. Howard, Joshua B. Fryman
  • Publication number: 20190042613
    Abstract: Methods, apparatus, systems and articles of manufacture to build a storage architecture for graph data are disclosed herein. Disclosed example apparatus include a neighbor identifier to identify respective sets of neighboring vertices of a graph. The neighboring vertices included in the respective sets are adjacent to respective ones of a plurality of vertices of the graph, and the respective sets of neighboring vertices are represented as respective lists of neighboring vertex identifiers. The apparatus also includes an element creator to create, in a cache memory, an array of elements that are unpopulated. The array elements have lengths equal to a length of a cache line. In addition, the apparatus includes an element populator to populate the elements with neighboring vertex identifiers. Each of the elements stores neighboring vertex identifiers of respective ones of the lists of neighboring vertex identifiers.
    Type: Application
    Filed: March 30, 2018
    Publication date: February 7, 2019
    Inventors: Stijn Eyerman, Jason M. Howard, Ibrahim Hur, Ivan B. Ganev, Fabrizio Petrini, Joshua B. Fryman
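
The cache-line-sized, initially unpopulated elements described in the abstract can be pictured with a small layout sketch. The structure and sizes below (a 64-byte line, 32-bit vertex identifiers, and the helper name) are assumptions made for illustration only.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CACHE_LINE_BYTES 64                                     /* assumed cache-line length      */
#define IDS_PER_ELEMENT  (CACHE_LINE_BYTES / sizeof(uint32_t))  /* 16 identifiers per element     */

/* One array element sized to a cache line, holding part of one vertex's
 * neighbor-identifier list as the abstract describes.  Illustrative only. */
typedef struct {
    uint32_t neighbor_ids[IDS_PER_ELEMENT];
} neighbor_element_t;

/* Populate one element from a vertex's neighbor list (hypothetical helper). */
void populate_element(neighbor_element_t *elem,
                      const uint32_t *neighbors, size_t count)
{
    memset(elem, 0, sizeof(*elem));       /* element starts unpopulated        */
    if (count > IDS_PER_ELEMENT)
        count = IDS_PER_ELEMENT;          /* keep to one cache line's worth    */
    memcpy(elem->neighbor_ids, neighbors, count * sizeof(uint32_t));
}

int main(void) {
    uint32_t neighbors[] = { 3, 7, 9 };   /* one vertex's adjacent vertex ids  */
    neighbor_element_t elem;
    populate_element(&elem, neighbors, 3);
    printf("first stored neighbor id: %u\n", elem.neighbor_ids[0]);
    return 0;
}
```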
  • Publication number: 20180285252
    Abstract: Devices, systems, and methods for optimizing memory access bandwidth when processing low-spatial-locality data are disclosed and described. A system memory is divided into a plurality of memory subsections, where each memory subsection is communicatively coupled to a memory controller by an independent memory channel. Memory access requests from a processor are thereby sent by the memory controller to only the appropriate memory subsection.
    Type: Application
    Filed: April 1, 2017
    Publication date: October 4, 2018
    Applicant: Intel Corporation
    Inventors: Kon-Woo Kwon, Vivek Kozhikkottu, Sang Phill Park, Ankit More, William P. Griffin, Robert Pawlowski, Jason M. Howard, Joshua B. Fryman
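
One way to picture a request reaching only the appropriate memory subsection is a simple address-to-subsection decode. The subsection count and size below are assumptions chosen for illustration; the abstract does not specify them.

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_SUBSECTIONS   4                      /* assumed number of memory subsections */
#define SUBSECTION_BYTES  (256u * 1024 * 1024)   /* assumed 256 MiB per subsection       */

/* Decode which subsection (and therefore which independent channel) owns a
 * given physical address.  Illustrative sketch only. */
unsigned subsection_for(uint64_t phys_addr) {
    return (unsigned)((phys_addr / SUBSECTION_BYTES) % NUM_SUBSECTIONS);
}

int main(void) {
    uint64_t addr = 0x1234ABCDu;
    printf("address 0x%llx -> subsection %u\n",
           (unsigned long long)addr, subsection_for(addr));
    return 0;
}
```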
  • Publication number: 20180165203
    Abstract: In one embodiment, a processor includes: an accelerator associated with a first address space; a core associated with a second address space and including an alternate address space configuration register to store configuration information to enable the core to execute instructions from the first address space; and a control logic to configure the core based in part on information in the alternate address space configuration register. Other embodiments are described and claimed.
    Type: Application
    Filed: December 9, 2016
    Publication date: June 14, 2018
    Inventors: Brent R. Boswell, Banu Meenakshi Nagasundaram, Michael D. Abbott, Srikanth Dakshinamoorthy, Jason M. Howard, Joshua B. Fryman
  • Patent number: 9477628
    Abstract: A collective communication apparatus and method for parallel computing systems. For example, one embodiment of an apparatus comprises a plurality of processor elements (PEs); collective interconnect logic to dynamically form a virtual collective interconnect (VCI) between the PEs at runtime without global communication among all of the PEs, the VCI defining a logical topology between the PEs in which each PE is directly communicatively coupled to only a subset of the remaining PEs; and execution logic to execute collective operations across the PEs, wherein one or more of the PEs receive first results from a first portion of the subset of the remaining PEs, perform a portion of the collective operations, and provide second results to a second portion of the subset of the remaining PEs.
    Type: Grant
    Filed: September 28, 2013
    Date of Patent: October 25, 2016
    Assignee: Intel Corporation
    Inventors: Allan D. Knies, David Pardo Keppel, Dong Hyuk Woo, Joshua B. Fryman
  • Patent number: 9405724
    Abstract: A reconfigurable tree apparatus with a bypass mode and a method of using the reconfigurable tree apparatus are disclosed. The reconfigurable tree apparatus uses a short-circuit register to selectively designate participating agents for such operations as barriers, multicast, and reductions. The reconfigurable tree apparatus enables an agent to initiate a barrier, multicast, or reduction operation, leaving software to determine the participating agents for each operation. Although the reconfigurable tree apparatus is implemented using a small number of wires, multiple in-flight barrier, multicast, and reduction operations can take place. The method and apparatus have low complexity and easy reconfigurability, and provide the energy savings necessary for future exascale machines.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: August 2, 2016
    Assignee: Intel Corporation
    Inventors: Jianping Xu, Asit K. Mishra, Joshua B. Fryman, David S. Dunning
  • Publication number: 20150378731
    Abstract: Various embodiments of the invention are described, including: (1) a method and apparatus for intelligently allocating threads within a binary translation system; (2) data cache way prediction guided by binary translation code morphing software; (3) fast interpreter hardware support on the data side; (4) out-of-order retirement; (5) decoupled load retirement in an atomic OOO processor; (6) handling transactional and atomic memory in an out-of-order binary translation based processor; and (7) speculative memory management in a binary translation based out-of-order processor.
    Type: Application
    Filed: June 30, 2014
    Publication date: December 31, 2015
    Inventors: Patrick P. Lai, Ethan Schuchman, David Keppel, Denis M. Khartikov, Polychronis Xekalakis, Joshua B. Fryman, Allan D. Knies, Naveen Neelakantam, Gregor Stellpflug, John H. Kelm, Mirem Hyuseinova, Demos Pavlou, Jaroslaw Topp
  • Publication number: 20150095542
    Abstract: A collective communication apparatus and method for parallel computing systems. For example, one embodiment of an apparatus comprises a plurality of processor elements (PEs); collective interconnect logic to dynamically form a virtual collective interconnect (VCI) between the PEs at runtime without global communication among all of the PEs, the VCI defining a logical topology between the PEs in which each PE is directly communicatively coupled to only a subset of the remaining PEs; and execution logic to execute collective operations across the PEs, wherein one or more of the PEs receive first results from a first portion of the subset of the remaining PEs, perform a portion of the collective operations, and provide second results to a second portion of the subset of the remaining PEs.
    Type: Application
    Filed: September 28, 2013
    Publication date: April 2, 2015
    Inventors: Allan D. Knies, David Pardo Keppel, Dong Hyuk Woo, Joshua B. Fryman
  • Publication number: 20150006849
    Abstract: A reconfigurable tree apparatus with a bypass mode and a method of using the reconfigurable tree apparatus are disclosed. The reconfigurable tree apparatus uses a short-circuit register to selectively designate participating agents for such operations as barriers, multicast, and reductions. The reconfigurable tree apparatus enables an agent to initiate a barrier, multicast, or reduction operation, leaving software to determine the participating agents for each operation. Although the reconfigurable tree apparatus is implemented using a small number of wires, multiple in-flight barrier, multicast, and reduction operations can take place. The method and apparatus have low complexity and easy reconfigurability, and provide the energy savings necessary for future exascale machines.
    Type: Application
    Filed: June 28, 2013
    Publication date: January 1, 2015
    Inventors: Jianping Xu, Asit K. Mishra, Joshua B. Fryman, David S. Dunning
  • Publication number: 20140258685
    Abstract: A processor may be built with cores that execute only a partial set of the instructions needed to be fully backward compatible. Thus, in some embodiments, power consumption may be reduced by providing partial cores that execute only certain instructions and not others. The instructions that are not supported may be handled in other, more energy-efficient ways, so that the overall processor, including the partial core, may be fully backward compatible.
    Type: Application
    Filed: December 30, 2011
    Publication date: September 11, 2014
    Inventors: Srihari Makineni, Steven R. King, Alexander Redkin, Joshua B. Fryman, Ravishankar Iyer, Pavel S. Smirnov, Dmitry Gusev, Dmitri Pavlov
  • Publication number: 20140095896
    Abstract: A processor includes at least one power domain, each power domain including at least one core that switchably receives a power supply from a voltage regulator and switchably receives a clock signal from a clock source; a cache; and at least one control register having stored thereon data indicating power management states of the at least one power domain and the cache.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Nicholas P. Carter, Joshua B. Fryman, Robert C. Knauerhase, Aditya B. Agrawal, Josep Torrellas
  • Patent number: 8601242
    Abstract: A technique to perform a fast compare-exchange operation is disclosed. More specifically, a machine-readable medium, processor, and system are described that implement a fast compare-exchange operation as well as a cache line mark operation that enables the fast compare-exchange operation.
    Type: Grant
    Filed: December 18, 2009
    Date of Patent: December 3, 2013
    Assignee: Intel Corporation
    Inventors: Joshua B. Fryman, Andrew Thomas Forsyth, Edward Grochowski
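
For context, the conventional compare-exchange operation that this patent's fast variant and cache line mark operation relate to can be written with C11 atomics. The sketch below shows only that baseline semantics, with a hypothetical helper function; it is not the patented technique.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Baseline compare-exchange loop using C11 atomics.  The patent describes a
 * faster hardware variant plus a cache line mark operation; this sketch shows
 * only the conventional semantics for comparison. */
int atomic_add_if_below(atomic_int *counter, int limit) {
    int observed = atomic_load(counter);
    while (observed < limit) {
        /* try to swap in observed+1 only if *counter still equals observed */
        if (atomic_compare_exchange_weak(counter, &observed, observed + 1))
            return 1;            /* success: the increment was performed      */
        /* on failure 'observed' is reloaded automatically; loop and retry    */
    }
    return 0;                    /* limit reached, nothing changed            */
}

int main(void) {
    atomic_int counter = 0;
    while (atomic_add_if_below(&counter, 5))
        ;
    printf("final counter: %d\n", atomic_load(&counter));   /* prints 5 */
    return 0;
}
```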