Patents by Inventor John Nicol
John Nicol has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10719470
Abstract: Techniques are disclosed for data manipulation. Data is obtained from a first switching element where the first switching element is controlled by a first circular buffer. Data is sent to a second switching element where the second switching element is controlled by a second circular buffer. Data is controlled by a third switching element that is controlled by a third circular buffer. The third switching element hierarchically controls the first switching element and the second switching element. Data is routed through a fourth switching element that is controlled by a fourth circular buffer. The circular buffers are statically scheduled. The obtaining data from a first switching element and the sending the data to a second switching element includes a direct memory access (DMA). The switching elements can operate as a master controller or as a slave device. The switching elements can comprise clusters within an asynchronous reconfigurable fabric.
Type: Grant
Filed: September 22, 2017
Date of Patent: July 21, 2020
Assignee: Wave Computing, Inc.
Inventor: Christopher John Nicol
-
Publication number: 20200167309
Abstract: Techniques for reconfigurable fabric configuration using spatial and temporal routing are disclosed. A plurality of clusters within a reconfigurable fabric is allocated, where the plurality of clusters is configured to execute one or more functions. A first spatial routing and a first temporal routing through the reconfigurable fabric are calculated. A second spatial routing and a second temporal routing through the reconfigurable fabric are calculated. The first and second spatial routings and the first and second temporal routings are optimized. The one or more functions are executed using routings that were optimized. The first spatial routing and the second spatial routing enable a logical connection for data transfer between at least two clusters of the plurality of clusters. The optimizing places routing instructions in clusters along a routing path within the reconfigurable fabric. The routing instructions are placed in unused cluster control instruction locations to enable spatial routing.
Type: Application
Filed: November 27, 2019
Publication date: May 28, 2020
Inventor: Christopher John Nicol
-
Patent number: 10656911
Abstract: Techniques are disclosed for power conservation. A plurality of processing elements and a plurality of instructions are configured. The plurality of processing elements is controlled by instructions contained in a plurality of circular buffers. The plurality of processing elements can comprise a data flow processor. A first processing element, from the plurality of interconnected processing elements, is set into a sleep state by a first instruction from the plurality of instructions. The first processing element is woken from the sleep state as a result of valid data being presented to the first processing element. A subsection of the plurality of interconnected processing elements is also set into a sleep state based on the first processing element being set into a sleep state.
Type: Grant
Filed: February 11, 2019
Date of Patent: May 19, 2020
Assignee: Wave Computing, Inc.
Inventor: Christopher John Nicol
-
Patent number: 10659396
Abstract: Techniques are disclosed for managing data within a reconfigurable computing environment. In a multiple processing element environment, such as a mesh network or other suitable topology, there is an inherent need to pass data between processing elements. Subtasks are divided among multiple processing elements. The output resulting from the subtasks is then merged by a downstream processing element. In such cases, a join operation can be used to combine data from multiple upstream processing elements. A control agent executes on each processing element. A memory buffer is disposed between upstream processing elements and the downstream processing element. The downstream processing element is configured to automatically perform an operation based on the availability of valid data from the upstream processing elements.
Type: Grant
Filed: June 28, 2018
Date of Patent: May 19, 2020
Assignee: Wave Computing, Inc.
Inventor: Christopher John Nicol
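The join behavior described in this abstract can be illustrated with a small software sketch. This is not the patented hardware design; it only models the stated idea that a downstream element fires once valid data is available from every upstream element. The class name `JoinElement`, its `push` method, and the use of one FIFO per upstream source are assumptions made for illustration.

```python
from collections import deque


class JoinElement:
    """Hypothetical downstream element that joins data from several upstream sources."""

    def __init__(self, n_upstream, combine):
        # One FIFO per upstream element stands in for the memory buffer
        # placed between the upstream and downstream elements.
        self.buffers = [deque() for _ in range(n_upstream)]
        self.combine = combine
        self.results = []

    def push(self, upstream_index, value):
        """An upstream element presents one valid data item."""
        self.buffers[upstream_index].append(value)
        self._fire_if_ready()

    def _fire_if_ready(self):
        # Fire only while every upstream buffer holds at least one item.
        while all(self.buffers):
            operands = [buf.popleft() for buf in self.buffers]
            self.results.append(self.combine(operands))


join = JoinElement(2, sum)
join.push(0, 1)    # only one source has data, nothing fires
join.push(0, 2)
join.push(1, 10)   # both sources ready: 1 + 10
join.push(1, 20)   # both sources ready again: 2 + 20
print(join.results)
```

In this sketch no explicit scheduler is involved: the arrival of data is what triggers the combine step, which is the property the abstract emphasizes.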
-
Patent number: 10592444
Abstract: A plurality of software programmable processors is disclosed. The software programmable processors are controlled by rotating circular buffers. A first processor and a second processor within the plurality of software programmable processors are individually programmable. The first processor within the plurality of software programmable processors is coupled to neighbor processors within the plurality of software programmable processors. The first processor sends and receives data from the neighbor processors. The first processor and the second processor are configured to operate on a common instruction cycle. An output of the first processor from a first instruction cycle is an input to the second processor on a subsequent instruction cycle.
Type: Grant
Filed: March 3, 2017
Date of Patent: March 17, 2020
Assignee: Wave Computing, Inc.
Inventors: Christopher John Nicol, Samit Chaudhuri, Radoslav Danilak
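As a rough illustration of the mechanism in this abstract, the sketch below models two processors that are each driven by a rotating circular buffer of instructions and that step on a shared cycle, with the first processor's output from one cycle feeding the second processor on the next. The names `Processor` and `run_common_cycle`, and the choice of Python callables as "instructions", are assumptions for illustration and do not come from the patent.

```python
from collections import deque


class Processor:
    """Toy processor whose behavior is set by a rotating circular buffer."""

    def __init__(self, instructions):
        self.buffer = deque(instructions)  # circular buffer of callables

    def step(self, value):
        instruction = self.buffer[0]
        self.buffer.rotate(-1)  # rotate so the next instruction is at the head
        return instruction(value)


def run_common_cycle(first, second, inputs):
    """Step both processors once per cycle; `second` sees `first`'s prior output."""
    results = []
    pending = None  # output of `first` from the previous cycle
    for value in list(inputs) + [None]:  # one extra cycle drains the pipeline
        if pending is not None:
            results.append(second.step(pending))
        pending = first.step(value) if value is not None else None
    return results


first = Processor([lambda x: x + 1, lambda x: x * 2])
second = Processor([lambda x: x * 10, lambda x: -x])
print(run_common_cycle(first, second, [1, 2, 3]))
```

Because each buffer rotates independently, the two processors can carry different programs of different lengths while still advancing in lockstep.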
-
Patent number: 10564929
Abstract: A combination of memory units and dataflow processing units is disclosed for computation. A first memory unit is interposed between a first dataflow processing unit and a second dataflow processing unit. Operations for a dataflow graph are allocated across the first dataflow processing unit and the second dataflow processing unit. The first memory unit passes data between the first dataflow processing unit and the second dataflow processing unit to execute the dataflow graph. The first memory unit is a high bandwidth, shared memory device including a hybrid memory cube. The first dataflow processing unit and second dataflow processing unit include a plurality of circular buffers containing instructions for controlling data transfer between the first dataflow processing unit and second dataflow processing unit. Additional dataflow processing units and additional memory units are included for additional functionality and efficiency.
Type: Grant
Filed: August 1, 2017
Date of Patent: February 18, 2020
Assignee: Wave Computing, Inc.
Inventors: Christopher John Nicol, Derek William Meyer
-
Patent number: 10505704
Abstract: Disclosed embodiments provide an interface circuit for the transfer of data from a synchronous circuit to an asynchronous circuit. Data from the synchronous circuit is received into a memory in the interface circuit. The data in the memory is then sent to the asynchronous circuit based on an instruction in a circular buffer that is part of the interface circuit. Processing elements within the interface circuit execute instructions contained within the circular buffer. The circular buffer rotates to provide new instructions to the processing elements. Flow control paces the data from the synchronous circuit to the asynchronous circuit.
Type: Grant
Filed: August 1, 2016
Date of Patent: December 10, 2019
Assignee: Wave Computing, Inc.
Inventor: Christopher John Nicol
-
Patent number: 10437728
Abstract: Circular buffers containing instructions that enable the execution of operations on logical elements are described where data in the circular buffers is swapped to storage. The instructions comprise a branchless instruction set. Data stored in circular buffers is paged in and out to a second level memory. State information for each logical element is also saved and restored using paging memory. Instructions are provided to logical elements, such as processing elements, via circular buffers. The instructions enable a group of processing elements to perform operations implementing a desired functionality. That functionality is changed by updating the circular buffers with new instructions that are transferred from paging memory. The previous instructions can be saved off in paging memory before the new instructions are copied over to the circular buffers. This enables the hardware to be rapidly reconfigured amongst multiple functions.
Type: Grant
Filed: September 10, 2018
Date of Patent: October 8, 2019
Assignee: Wave Computing, Inc.
Inventor: Christopher John Nicol
-
Publication number: 20190279038
Abstract: Techniques are disclosed for data flow graph node parallel update for machine learning. A first plurality of processing elements is configured to implement a portion of a data flow graph. The nodes include at least one variable node and implement part of a neural network. A second plurality of processing elements is configured to implement a second portion of the data flow graph. These nodes include at least one additional variable node and implement an additional part of the neural network. Training data is issued to the first plurality of processing elements. The training data is used to update variables within the at least one variable node. Additional variables are updated within the at least one additional variable node. The updating includes forwarding training data from the first plurality to the second plurality. The neural network is trained based on the variables that were updated and the additional variables.
Type: Application
Filed: May 27, 2019
Publication date: September 12, 2019
Inventor: Christopher John Nicol
-
Publication number: 20190279086
Abstract: Techniques are disclosed for data flow graph node update for machine learning. A plurality of processing elements is configured within a reconfigurable fabric to implement a data flow graph. The nodes of the data flow graph include one or more variable nodes, and the data flow graph implements a neural network. N copies of a variable contained in a variable node are issued, where the N copies are used for distribution within the data flow graph, and where N is an integer greater than or equal to one and less than or equal to the total number of nodes in the graph. The N copies of a variable are distributed within the data flow graph. The neural network is updated based on the N copies of a variable. Results from the distribution are averaged. The averaging includes parallel training of different data for machine learning.
Type: Application
Filed: May 27, 2019
Publication date: September 12, 2019
Inventors: Christopher John Nicol, Lin Zhong
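One possible reading of the copy, distribute, and average steps in this abstract is ordinary data-parallel training with parameter averaging. The sketch below follows that reading only; the abstract does not specify the update rule. The function names, the single scalar variable, the squared-error gradient, and the learning rate are all assumptions chosen to keep the example small.

```python
def mean_squared_gradient(w, shard):
    """Gradient of the mean of (w - x)^2 over one data shard (illustrative loss)."""
    return sum(2 * (w - x) for x in shard) / len(shard)


def distribute_and_average(variable, shards, grad_fn, lr=0.5):
    """Issue one copy of `variable` per shard, update each copy, then average."""
    copies = [variable for _ in shards]          # N copies of the variable
    updated = [
        copy - lr * grad_fn(copy, shard)         # each copy trains on different data
        for copy, shard in zip(copies, shards)
    ]
    return sum(updated) / len(updated)           # averaged result


new_value = distribute_and_average(0.0, [[1.0, 1.0], [3.0, 3.0]], mean_squared_gradient)
print(new_value)
```

Here N equals the number of shards; in the abstract N is bounded only by the number of nodes in the graph.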
-
Patent number: 10374605
Abstract: Techniques are disclosed for designing a reconfigurable fabric. The reconfigurable fabric is designed using logical elements, configurable connections between and among the logical elements, and rotating circular buffers. The circular buffers contain configuration instructions. The configuration instructions control connections between and among logical elements. The logical elements change operation based on the instructions that rotate through the circular buffers. Clusters of logical elements are interconnected by a switching fabric. Each cluster contains processing elements, storage elements, and switching elements. A circular buffer within a cluster contains multiple switching instructions to control the flow of data throughout the switching fabric. The circular buffer provides a pipelined execution of switching instructions for the implementation of multiple functions.
Type: Grant
Filed: October 31, 2018
Date of Patent: August 6, 2019
Assignee: Wave Computing, Inc.
Inventor: Christopher John Nicol
-
Patent number: 10374981
Abstract: An interface circuit is disclosed for the transfer of data from a synchronous circuit, with multiple source elements, to an asynchronous circuit. Data from the synchronous circuit is received into a memory in the interface circuit. The data in the memory is then sent to the asynchronous circuit based on an instruction in a circular buffer that is part of the interface circuit. Processing elements within the interface circuit execute instructions contained within the circular buffer. The circular buffer rotates to provide new instructions to the processing elements. Flow control paces the data from the synchronous circuit to the asynchronous circuit.
Type: Grant
Filed: August 2, 2016
Date of Patent: August 6, 2019
Assignee: Wave Computing, Inc.
Inventor: Christopher John Nicol
-
Publication number: 20190228037
Abstract: Techniques are disclosed for checkpointing data flow graph computation for machine learning. Processing elements within a reconfigurable fabric are configured to implement a data flow graph. Nodes of the data flow graph can include variable nodes. The processing elements are loaded with process agents. Valid data is executed by a first process agent. The first process agent corresponds to a starting node of the data flow graph. Invalid data is sent to the first process agent. The invalid data initiates a checkpoint operation for the data flow graph. Invalid data is propagated from the starting node of the data flow graph to other nodes within the data flow graph. The variable nodes are paused upon receiving invalid data. Paused variable nodes within the data flow graph are restarted by issuing a run command, and valid data is sent to the starting node of the data flow graph.
Type: Application
Filed: March 29, 2019
Publication date: July 25, 2019
Inventors: Christopher John Nicol, Keith Mark Evans, Mehran Ramezani
-
Publication number: 20190228340
Abstract: Techniques are disclosed for data manipulation that enables data flow graph computation for machine learning. A plurality of processing elements within a reconfigurable fabric is configured to implement a data flow graph. The nodes of the data flow graph include variable nodes. The plurality of processing elements is initialized with a plurality of process agents. A first set of buffers is initialized for a first process agent, where the first process agent corresponds to a starting node of the data flow graph. A fire signal is issued for the starting node based on the first set of buffers being initialized. Results of operations are collected by a further process agent following receipt of the fire signal. The data flow graph computation can be paused by loading invalid data or by withholding new data from entering the data flow graph. The pausing can be controlled by an execution manager.
Type: Application
Filed: March 29, 2019
Publication date: July 25, 2019
Inventor: Christopher John Nicol
-
Publication number: 20190197018
Abstract: Techniques are disclosed for dynamic reconfiguration using data transfer control. Clusters on a reconfigurable fabric are accessed to implement a logical operation. The logical operation can include a Boolean operation, a matrix operation, a tensor operation, etc. Clusters from the plurality of clusters are provisioned for implementation of a first agent on the reconfigurable fabric. The clusters can include quads. The one or more clusters provisioned for the first agent include a first data transfer control block. Additional clusters from the plurality of clusters are provisioned for implementation of a second agent on the reconfigurable fabric. The additional clusters provisioned for the second agent include a second data transfer control block. The logical operation is performed using the first agent. Control information is transferred from the first data transfer control block to the second data transfer control block.
Type: Application
Filed: March 1, 2019
Publication date: June 27, 2019
Inventors: Keith Mark Evans, Christopher John Nicol, Mehran Ramezani
-
Publication number: 20190171416
Abstract: Techniques are disclosed for power conservation. A plurality of processing elements and a plurality of instructions are configured. The plurality of processing elements is controlled by instructions contained in a plurality of circular buffers. The plurality of processing elements can comprise a data flow processor. A first processing element, from the plurality of interconnected processing elements, is set into a sleep state by a first instruction from the plurality of instructions. The first processing element is woken from the sleep state as a result of valid data being presented to the first processing element. A subsection of the plurality of interconnected processing elements is also set into a sleep state based on the first processing element being set into a sleep state.
Type: Application
Filed: February 11, 2019
Publication date: June 6, 2019
Inventor: Christopher John Nicol
-
Publication number: 20190138373
Abstract: Techniques are disclosed for multithreaded data flow processing within a reconfigurable fabric. Code is obtained for performing data manipulation within a reconfigurable fabric. The code is segmented into a plurality of data manipulation operations. A first segment from the segmenting is allocated to a first set of processing elements within a plurality of processing elements comprising a reconfigurable fabric. A second segment from the segmenting is allocated to a second set of processing elements within the reconfigurable fabric. The first segment is executed on the first set of processing elements while the second segment is executed on the second set of processing elements. The first kernel and the second kernel comprise multithreaded operation.
Type: Application
Filed: December 28, 2018
Publication date: May 9, 2019
Inventors: Christopher John Nicol, Derek William Meyer
-
Publication number: 20190130269
Abstract: Techniques are disclosed for pipelined tensor manipulation within a reconfigurable fabric. A tensor is obtained for processing on a reconfigurable fabric comprised of a plurality of processing elements. The tensor is applied as input to a pipeline of agents running on the plurality of processing elements. The tensor is sectioned into subsections. A first subsection from the one or more subsections is applied to a first agent in the pipeline of agents. A first result is calculated by the first agent for the first subsection. The first result is output to a second agent in the pipeline of agents. A second result is calculated, by the second agent, based on the first result. A subsection done indication is sent, by the second agent, to the first agent, when the calculating the second result is accomplished. The second result is output to a third agent in the pipeline of agents.
Type: Application
Filed: December 4, 2018
Publication date: May 2, 2019
Inventor: Christopher John Nicol
-
Publication number: 20190130291
Abstract: Techniques are disclosed for dynamic reconfiguration with partially resident agents. A plurality of clusters on a reconfigurable fabric is accessed to implement a logical operation. Two or more clusters from the plurality of clusters are provisioned for implementation of a first agent on the reconfigurable fabric wherein the first agent is comprised of an agent control unit and an agent kernel. The logical operation is executed, on the two or more clusters of the reconfigurable fabric, using the first agent. The agent kernel is removed from the reconfigurable fabric while leaving the agent control unit resident on the reconfigurable fabric. The agent control unit is further used to buffer data incoming to the first agent. The agent control unit is further used to provide data to logic downstream from the first agent. A second agent provides data incoming to the first agent.
Type: Application
Filed: December 21, 2018
Publication date: May 2, 2019
Inventor: Christopher John Nicol
-
Publication number: 20190130270
Abstract: Techniques are disclosed for tensor manipulation within a reconfigurable fabric using pointers. A first tensor is obtained for processing on a reconfigurable fabric comprised of a plurality of processing, storage, and switching elements. A first agent is deployed on one or more of the plurality of processing elements of the reconfigurable fabric. The first tensor is manipulated by the first agent. The results of the manipulating the first tensor are stored in a storage element external from the first agent. A pointer is provided to a second agent deployed on one or more of the plurality of processing elements of the reconfigurable fabric, wherein the pointer identifies an address of the storage element at which the first tensor is stored. A transfer buffer is used between the first agent and the second agent within the reconfigurable fabric to facilitate tensor transfers between the first agent and the second agent.
Type: Application
Filed: December 4, 2018
Publication date: May 2, 2019
Inventors: Christopher John Nicol, David Jay O'Shea