Patents by Inventor KERMIN E. FLEMING

KERMIN E. FLEMING has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11307873
    Abstract: Systems, methods, and apparatuses relating to unstructured data flow in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a data path having a first branch and a second branch, and the data path comprises at least one processing element; a switch circuit comprising a switch control input to receive a first switch control value to couple an input of the switch circuit to the first branch and a second switch control value to couple the input of the switch circuit to the second branch; a pick circuit comprising a pick control input to receive a first pick control value to couple an output of the pick circuit to the first branch and a second pick control value to couple the output of the pick circuit to a third branch of the data path; a predicate propagation processing element to output a first edge predicate value and a second edge predicate value based on (e.g.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: April 19, 2022
    Assignee: Intel Corporation
    Inventors: Pablo Halpern, Kermin E. Fleming, James Sukha
  • Publication number: 20220107911
    Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
    Type: Application
    Filed: December 14, 2021
    Publication date: April 7, 2022
    Inventors: Kermin E. FLEMING, Simon C. STEELY, Kent D. GLOSSOP, Mitchell DIAMOND, Benjamin KEEN, Dennis BRADFORD, Fabrizio Petrini, Barry TANNENBAUM, Yongzhi ZHANG
  • Patent number: 10515049
    Abstract: Methods and apparatuses relating to distributed memory hazard detection and error recovery are described. In one embodiment, a memory circuit includes a memory interface circuit to service memory requests from a spatial array of processing elements for data stored in a plurality of cache banks; and a hazard detection circuit in each of the plurality of cache banks, wherein a first hazard detection circuit for a speculative memory load request from the memory interface circuit, that is marked with a potential dynamic data dependency, to an address within a first cache bank of the first hazard detection circuit, is to mark the address for tracking of other memory requests to the address, store data from the address in speculative completion storage, and send the data from the speculative completion storage to the spatial array of processing elements when a memory dependency token is received for the speculative memory load request.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: December 24, 2019
    Assignee: intel corporation
    Inventors: Kermin E. Fleming, Simon C. Steely, Kent D. Glossop
  • Patent number: 10445098
    Abstract: Methods and apparatuses relating to privileged configuration in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; and a configuration controller coupled to a first subset and a second, different subset of the plurality of processing elements, the first subset having an output coupled to an input of the second, different subset, wherein the configuration controller is to configure the interconnect network between the first subset and the second, different subset of the plurality of processing elements to not allow communication on the interconnect network between the first subset and the second, different subset when a privilege bit is set to a first value and to allow communication on the interconnect network between the first subset and the second, different subset of the plurality of processing elements when the privilege bit is set to a second value.
    Type: Grant
    Filed: September 30, 2017
    Date of Patent: October 15, 2019
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Simon C. Steely, Kent D. Glossop
  • Patent number: 10445250
    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a second operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements.
    Type: Grant
    Filed: December 30, 2017
    Date of Patent: October 15, 2019
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Kent D. Glossop, Simon C. Steely
  • Publication number: 20190303153
    Abstract: Systems, methods, and apparatuses relating to unstructured data flow in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a data path having a first branch and a second branch, and the data path comprising at least one processing element; a switch circuit comprising a switch control input to receive a first switch control value to couple an input of the switch circuit to the first branch and a second switch control value to couple the input of the switch circuit to the second branch; a pick circuit comprising a pick control input to receive a first pick control value to couple an output of the pick circuit to the first branch and a second pick control value to couple the output of the pick circuit to a third branch of the data path; a predicate propagation processing element to output a first edge predicate value and a second edge predicate value based on (e.g.
    Type: Application
    Filed: April 3, 2018
    Publication date: October 3, 2019
    Inventors: PABLO HALPERN, KERMIN E. FLEMING, JAMES SUKHA
  • Patent number: 10417175
    Abstract: Methods and apparatuses relating to consistency in an accelerator are described. In one embodiment, request address file (RAF) circuits are coupled to a spatial array by a first network, a memory is coupled to the RAF circuits by a second network, a RAF circuit is to not issue, into the second network, a request to the memory marked with a program order dependency on a previous request until receiving a first token generated by completion of the previous request to the memory by another RAF circuit, and a second RAF circuit is to not issue, into the second network, a second request to the memory marked with a program order dependency on a first request until receiving a second token sent by a first RAF circuit when a predetermined time period has lapsed since the first request was issued by the first RAF circuit into the second network.
    Type: Grant
    Filed: December 30, 2017
    Date of Patent: September 17, 2019
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Simon C. Steely, Jr., Kent D. Glossop
  • Patent number: 10380063
    Abstract: Systems, methods, and apparatuses relating to a sequencer dataflow operator of a configurable spatial accelerator are described. In one embodiment, an interconnect network between a plurality of processing elements receives an input of a dataflow graph comprising a plurality of nodes forming a loop construct, wherein the dataflow graph is overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements and at least one dataflow operator controlled by a sequencer dataflow operator of the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements and the sequencer dataflow operator generates control signals for the at least one dataflow operator in the plurality of processing elements.
    Type: Grant
    Filed: September 30, 2017
    Date of Patent: August 13, 2019
    Assignee: Intel Corporation
    Inventors: Jinjie Tang, Kermin E. Fleming, Simon C. Steely, Kent D. Glossop, Jim Sukha
  • Publication number: 20190205263
    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a second operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements.
    Type: Application
    Filed: December 30, 2017
    Publication date: July 4, 2019
    Inventors: Kermin E. Fleming, Kent D. Glossop, Simon C. Steely
  • Publication number: 20190205284
    Abstract: Methods and apparatuses relating to consistency in an accelerator are described. In one embodiment, request address file (RAF) circuits are coupled to a spatial array by a first network, a memory is coupled to the RAF circuits by a second network, a RAF circuit is to not issue, into the second network, a request to the memory marked with a program order dependency on a previous request until receiving a first token generated by completion of the previous request to the memory by another RAF circuit, and a second RAF circuit is to not issue, into the second network, a second request to the memory marked with a program order dependency on a first request until receiving a second token sent by a first RAF circuit when a predetermined time period has lapsed since the first request was issued by the first RAF circuit into the second network.
    Type: Application
    Filed: December 30, 2017
    Publication date: July 4, 2019
    Inventors: Kermin E. Fleming, Simon C. Steely, Jr., Kent D. Glossop
  • Publication number: 20190102338
    Abstract: Systems, methods, and apparatuses relating to a sequencer dataflow operator of a configurable spatial accelerator are described. In one embodiment, an interconnect network between a plurality of processing elements receives an input of a dataflow graph comprising a plurality of nodes forming a loop construct, wherein the dataflow graph is overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements and at least one dataflow operator controlled by a sequencer dataflow operator of the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements and the sequencer dataflow operator generates control signals for the at least one dataflow operator in the plurality of processing elements.
    Type: Application
    Filed: September 30, 2017
    Publication date: April 4, 2019
    Inventors: JINJIE TANG, KERMIN E. FLEMING, SIMON C. STEELY, KENT D. GLOSSOP, JIM SUKHA
  • Publication number: 20190102179
    Abstract: Methods and apparatuses relating to privileged configuration in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; and a configuration controller coupled to a first subset and a second, different subset of the plurality of processing elements, the first subset having an output coupled to an input of the second, different subset, wherein the configuration controller is to configure the interconnect network between the first subset and the second, different subset of the plurality of processing elements to not allow communication on the interconnect network between the first subset and the second, different subset when a privilege bit is set to a first value and to allow communication on the interconnect network between the first subset and the second, different subset of the plurality of processing elements when the privilege bit is set to a second value.
    Type: Application
    Filed: September 30, 2017
    Publication date: April 4, 2019
    Inventors: KERMIN E. FLEMING, SIMON C. STEELY, KENT D. GLOSSOP