Patents by Inventor Kermin E. Fleming, Jr.

Kermin E. Fleming, Jr. has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11593295
    Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop, Mitchell Diamond, Benjamin Keen, Dennis Bradford, Fabrizio Petrini, Barry Tannenbaum, Yongzhi Zhang
  • Patent number: 11200186
    Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
    Type: Grant
    Filed: June 30, 2018
    Date of Patent: December 14, 2021
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop, Mitchell Diamond, Benjamin Keen, Dennis Bradford, Fabrizio Petrini, Barry Tannenbaum, Yongzhi Zhang
  • Patent number: 10891240
    Abstract: Systems, methods, and apparatuses relating to low latency communications in a configurable spatial accelerator are described.
    Type: Grant
    Filed: June 30, 2018
    Date of Patent: January 12, 2021
    Assignee: Intel Corporation
    Inventors: Suresh Mathew, Mitchell Diamond, Kermin E. Fleming, Jr.
  • Patent number: 10853073
    Abstract: Systems, methods, and apparatuses relating to conditional operations in a configurable spatial accelerator are described.
    Type: Grant
    Filed: June 30, 2018
    Date of Patent: December 1, 2020
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Ping Zou, Mitchell Diamond, Benjamin Keen
  • Patent number: 10565134
    Abstract: Systems, methods, and apparatuses relating to multicast in a configurable spatial accelerator are described. In one embodiment, an accelerator includes a first output buffer of a first processing element coupled to a first input buffer of a second processing element and a second input buffer of a third processing element; and the first processing element determines that it was able to complete a transmission in a previous cycle when the first processing element observed for both the second processing element and the third processing element that either a speculation value was set to a value to indicate a dataflow token was stored in its input buffer (e.g., as indicated by a reception value (e.g., bit)) or a backpressure value was set to a value to indicate that storage is to be available in its input buffer before dequeuing the dataflow token from the first output buffer.
    Type: Grant
    Filed: December 30, 2017
    Date of Patent: February 18, 2020
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Ping Zou, Mitchell Diamond
  • Patent number: 10564980
    Abstract: Systems, methods, and apparatuses relating to conditional queues in a configurable spatial accelerator are described.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: February 18, 2020
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Ping Zou, Mitchell Diamond, Benjamin Keen
  • Patent number: 10558575
    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform a second operation when an incoming operand set arrives at the plurality of processing elements.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: February 11, 2020
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Kent D. Glossop, Simon C. Steely, Jr., Jinjie Tang, Alan G. Gara
  • Publication number: 20200004538
    Abstract: Systems, methods, and apparatuses relating to conditional operations in a configurable spatial accelerator are described.
    Type: Application
    Filed: June 30, 2018
    Publication date: January 2, 2020
    Inventors: Kermin E. Fleming, Jr., Ping Zou, Mitchell Diamond, Benjamin Keen
  • Publication number: 20200004690
    Abstract: Systems, methods, and apparatuses relating to low latency communications in a configurable spatial accelerator are described.
    Type: Application
    Filed: June 30, 2018
    Publication date: January 2, 2020
    Inventors: Suresh Mathew, Mitchell Diamond, Kermin E. Fleming, Jr.
  • Patent number: 10459866
    Abstract: Systems, methods, and apparatuses relating to integrated control and data processing in a configurable spatial accelerator are described.
    Type: Grant
    Filed: June 30, 2018
    Date of Patent: October 29, 2019
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Mitchell Diamond, Ping Zou, Benjamin Keen
  • Publication number: 20190303263
    Abstract: Systems, methods, and apparatuses relating to integrated performance monitoring in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first performance monitoring circuit coupled to a first proper subset of processing elements by a network to receive at least one monitoring value from each of the first plurality of the processing elements, generate a first aggregated monitoring value based on the at least one monitoring value from each of the first plurality of the processing elements, and send the first aggregated monitoring value to a performance manager circuit on a different network when a first threshold value is exceeded by the first aggregated monitoring value; and the performance manager circuit is to perform an action based on the first aggregated monitoring value.
    Type: Application
    Filed: March 30, 2018
    Publication date: October 3, 2019
    Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Jinjie Tang
  • Publication number: 20190303168
    Abstract: Systems, methods, and apparatuses relating to conditional queues in a configurable spatial accelerator are described.
    Type: Application
    Filed: April 3, 2018
    Publication date: October 3, 2019
    Inventors: Kermin E. Fleming, Jr., Ping Zou, Mitchell Diamond, Benjamin Keen
  • Publication number: 20190303297
    Abstract: Systems, methods, and apparatuses relating to remote memory access in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first memory interface circuit coupled to a first processing element and a cache, the first memory interface circuit to issue a memory request to the cache, the memory request comprising a field to identify a second memory interface circuit as a receiver of data for the memory request; and the second memory interface circuit coupled to a second processing element and the cache, the second memory interface circuit to send a credit return value to the first memory interface circuit, to cause the first memory interface circuit to mark the memory request as complete, when the data for the memory request arrives at the second memory interface circuit and a completion configuration register of the second memory interface circuit is set to a remote response value.
    Type: Application
    Filed: April 2, 2018
    Publication date: October 3, 2019
    Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop
  • Patent number: 10402168
    Abstract: A floating point multiply-add unit having inputs coupled to receive a floating point multiplier data element, a floating point multiplicand data element, and a floating point addend data element. The multiply-add unit including a mantissa multiplier to multiply a mantissa of the multiplier data element and a mantissa of the multiplicand data element to calculate a mantissa product. The mantissa multiplier including a most significant bit portion to calculate most significant bits of the mantissa product, and a least significant bit portion to calculate least significant bits of the mantissa product. The mantissa multiplier has a plurality of different possible sizes of the least significant bit portion. Energy consumption reduction logic to selectively reduce energy consumption of the least significant bit portion, but not the most significant bit portion, to cause the least significant bit portion to not calculate the least significant bits of the mantissa product.
    Type: Grant
    Filed: October 1, 2016
    Date of Patent: September 3, 2019
    Assignee: Intel Corporation
    Inventors: William C. Hasenplaugh, Kermin E. Fleming, Jr., Tryggve Fossum, Simon C. Steely, Jr.
  • Publication number: 20190205269
    Abstract: Systems, methods, and apparatuses relating to multicast in a configurable spatial accelerator are described. In one embodiment, an accelerator includes a first output buffer of a first processing element coupled to a first input buffer of a second processing element and a second input buffer of a third processing element; and the first processing element determines that it was able to complete a transmission in a previous cycle when the first processing element observed for both the second processing element and the third processing element that either a speculation value was set to a value to indicate a dataflow token was stored in its input buffer (e.g., as indicated by a reception value (e.g., bit)) or a backpressure value was set to a value to indicate that storage is to be available in its input buffer before dequeuing the dataflow token from the first output buffer.
    Type: Application
    Filed: December 30, 2017
    Publication date: July 4, 2019
    Inventors: Kermin E. Fleming, Jr., Ping Zou, Mitchell Diamond
  • Publication number: 20190101952
    Abstract: Methods and apparatuses relating to configurable clock gating in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; and a configuration controller, coupled to a first processing element and a second processing element of the plurality of processing elements and the first processing element having an output coupled to an input of the second processing element, to configure the second processing element to clock gate at least one clocked component of the second processing element, and configure the first processing element to send a reenable signal on the interconnect network to the second processing element to reenable the at least one clocked component of the second processing element when data is to be sent from the first processing element to the second processing element.
    Type: Application
    Filed: September 30, 2017
    Publication date: April 4, 2019
    Inventors: Mitchell Diamond, Benjamin Keen, Kermin E. Fleming, Jr.
  • Publication number: 20190042513
    Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
    Type: Application
    Filed: June 30, 2018
    Publication date: February 7, 2019
    Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop, Mitchell Diamond, Benjamin Keen, Dennis Bradford, Fabrizio Petrini, Barry Tannenbaum, Yongzhi Zhang
  • Publication number: 20180189231
    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform a second operation when an incoming operand set arrives at the plurality of processing elements.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Inventors: Kermin E. Fleming, Jr., Kent D. Glossop, Simon C. Steely, Jr., Jinjie Tang, Alan G. Gara
  • Publication number: 20180095728
    Abstract: A floating point multiply-add unit having inputs coupled to receive a floating point multiplier data element, a floating point multiplicand data element, and a floating point addend data element. The multiply-add unit including a mantissa multiplier to multiply a mantissa of the multiplier data element and a mantissa of the multiplicand data element to calculate a mantissa product. The mantissa multiplier including a most significant bit portion to calculate most significant bits of the mantissa product, and a least significant bit portion to calculate least significant bits of the mantissa product. The mantissa multiplier has a plurality of different possible sizes of the least significant bit portion. Energy consumption reduction logic to selectively reduce energy consumption of the least significant bit portion, but not the most significant bit portion, to cause the least significant bit portion to not calculate the least significant bits of the mantissa product.
    Type: Application
    Filed: October 1, 2016
    Publication date: April 5, 2018
    Applicant: Intel Corporation
    Inventors: William C. Hasenplaugh, Kermin E. Fleming, Jr., Tryggve Fossum, Simon C. Steely, Jr.