Patents by Inventor Samit Chaudhuri
Samit Chaudhuri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10592444Abstract: A plurality of software programmable processors is disclosed. The software programmable processors are controlled by rotating circular buffers. A first processor and a second processor within the plurality of software programmable processors are individually programmable. The first processor within the plurality of software programmable processors is coupled to neighbor processors within the plurality of software programmable processors. The first processor sends and receives data from the neighbor processors. The first processor and the second processor are configured to operate on a common instruction cycle. An output of the first processor from a first instruction cycle is an input to the second processor on a subsequent instruction cycle.Type: GrantFiled: March 3, 2017Date of Patent: March 17, 2020Assignee: Wave Computing, Inc.Inventors: Christopher John Nicol, Samit Chaudhuri, Radoslav Danilak
-
Patent number: 10452452Abstract: Disclosed techniques utilize a satisfiability solver for allocation and/or configuration of resources in a reconfigurable fabric of processing elements. A dataflow graph is an input provided to a toolchain that includes a satisfiability solver. The satisfiability solver operates on subsets of interconnected nodes within a dataflow graph to derive a solution. The solution is trimmed by removing artifacts and unnecessary parts. The solutions of subsets are then used as an input to additional subsets of nodes within the dataflow graph in an iterative process to derive a complete solution. The satisfiability solver technique uses adaptive windowing in both the time dimension and the spatial dimensions of the dataflow graph. Processing elements and routing elements within the reconfigurable fabric are configured based on the complete solution. Data computation is performed based on the dataflow graph using the processing elements and the routing resources.Type: GrantFiled: April 16, 2018Date of Patent: October 22, 2019Assignee: Wave Computing, Inc.Inventors: Asmus Hetzel, Samit Chaudhuri
-
Patent number: 10289382Abstract: An apparatus for mathematical manipulation is described allowing the selective combination of shifters to shift binary numbers of various widths. Selective combination allows on-the-fly adjustment of shifters from independent to coordinated shifting operations. Selective combination allows adjustable hardware-based shifting while saving space and resources. Multiple eight-bit shifters can be configured for a variety of operand widths, such as a 32-bit width, a 24-bit width, a 16-bit width, or an eight-bit width. Multiplexers route the appropriate input data to the appropriate shifters. Bidirectional shifting is configured through a selector tree, including both shift left and shift right operations. Opcodes configure the shifters for the desired type of shift and a shifted result is generated.Type: GrantFiled: March 30, 2018Date of Patent: May 14, 2019Assignee: Wave Computing, Inc.Inventor: Samit Chaudhuri
-
Publication number: 20190042941Abstract: Techniques are disclosed for reconfigurable fabric operation linkage. A first function to be performed on a reconfigurable fabric is determined, where the first function is performed on a first cluster within the reconfigurable fabric. A distance is calculated from the first cluster to a second cluster that receives output from the first function on the first cluster. A time duration is calculated for the output from the first function to travel to the second cluster. A first set of instructions for the first function is allocated to the first cluster based on the distance and the time duration. The allocating the first set of instructions is accomplished using a satisfiability solver technique including constructing a set of mapping constraints and building a satisfiability model. The satisfiability solver technique includes a Boolean satisfiability problem solving technique. The satisfiability model is solved and a solution is stored.Type: ApplicationFiled: August 3, 2018Publication date: February 7, 2019Inventors: Henrik Esbensen, Peter Ramyalal Suaris, Samit Chaudhuri
-
Publication number: 20180341734Abstract: Systems and methods are disclosed for computing resource configuration based on flow graph translation. First, a high-level description of logic circuitry is obtained and translated to generate a flow graph representing sequential operations. Using the flow graph, similar processing elements in an array are interchangeably configured to perform computational, communication, and storage tasks as needed. The sequential operations are executed using the array of interchangeable processing elements. Data is provided from the storage elements through the communication elements to the computational elements. Computational results are stored in the storage elements. Outputs from some of the computational elements provide inputs to other computational elements. Execution of the instructions can be controlled with time stepping. The processors are reconfigured as needed, based on changes to the flow graph, on subsequent time steps.Type: ApplicationFiled: August 1, 2018Publication date: November 29, 2018Inventors: Samit Chaudhuri, Henrik Esbensen, Kenneth Shiring, Peter Ramyalal Suaris
-
Publication number: 20180300181Abstract: Disclosed techniques utilize a satisfiability solver for allocation and/or configuration of resources in a reconfigurable fabric of processing elements. A dataflow graph is an input provided to a toolchain that includes a satisfiability solver. The satisfiability solver operates on subsets of interconnected nodes within a dataflow graph to derive a solution. The solution is trimmed by removing artifacts and unnecessary parts. The solutions of subsets are then used as an input to additional subsets of nodes within the dataflow graph in an iterative process to derive a complete solution. The satisfiability solver technique uses adaptive windowing in both the time dimension and the spatial dimensions of the dataflow graph. Processing elements and routing elements within the reconfigurable fabric are configured based on the complete solution. Data computation is performed based on the dataflow graph using the processing elements and the routing resources.Type: ApplicationFiled: April 16, 2018Publication date: October 18, 2018Inventors: Asmus Hetzel, Samit Chaudhuri
-
Publication number: 20180225089Abstract: An apparatus for mathematical manipulation is described allowing the selective combination of shifters to shift binary numbers of various widths. Selective combination allows on-the-fly adjustment of shifters from independent to coordinated shifting operations. Selective combination allows adjustable hardware-based shifting while saving space and resources. Multiple eight-bit shifters can be configured for a variety of operand widths, such as a 32-bit width, a 24-bit width, a 16-bit width, or an eight-bit width. Multiplexers route the appropriate input data to the appropriate shifters. Bidirectional shifting is configured through a selector tree, including both shift left and shift right operations. Op codes configure the shifters for the desired type of shift and a shifted result is generated.Type: ApplicationFiled: March 30, 2018Publication date: August 9, 2018Inventor: Samit Chaudhuri
-
Patent number: 10042966Abstract: Systems and methods are disclosed for computing resource allocation based on flow graph translation. First, a high-level description of logic circuitry is obtained and translated to generate a flow graph representing sequential operations. Using the flow graph, similar processing elements in an array are interchangeably allocated to perform computational, communication, and storage tasks as needed. The sequential operations are executed using the array of interchangeable processing elements. Data is provided from the storage elements through the communication elements to the computational elements. Computational results are stored in the storage elements. Outputs from some of the computational elements provide inputs to other computational elements. Execution of the instructions can be controlled with time stepping. The processors are reallocated as needed, based on changes to the flow graph.Type: GrantFiled: October 30, 2015Date of Patent: August 7, 2018Assignee: Wave Computing, Inc.Inventors: Samit Chaudhuri, Henrik Esbensen, Kenneth Shiring, Peter Ramyalal Suaris
-
Patent number: 9933996Abstract: An apparatus for mathematical manipulation is described allowing the selective combination of shifters to shift binary numbers of various widths. Selective combination allows on-the-fly adjustment of shifters from independent to coordinated shifting operations. Selective combination allows adjustable hardware-based shifting while saving space and resources. Multiple eight-bit shifters can be configured for a variety of operand widths, such as a 32-bit width, a 24-bit width, a 16-bit width, or an eight-bit width. Multiplexers route the appropriate input data to the appropriate shifters. Opcodes configure the shifters for the desired type of shift and a shifted result is generated.Type: GrantFiled: December 20, 2013Date of Patent: April 3, 2018Assignee: Wave Computing, Inc.Inventor: Samit Chaudhuri
-
Publication number: 20170177517Abstract: A plurality of software programmable processors is disclosed. The software programmable processors are controlled by rotating circular buffers. A first processor and a second processor within the plurality of software programmable processors are individually programmable. The first processor within the plurality of software programmable processors is coupled to neighbor processors within the plurality of software programmable processors. The first processor sends and receives data from the neighbor processors. The first processor and the second processor are configured to operate on a common instruction cycle. An output of the first processor from a first instruction cycle is an input to the second processor on a subsequent instruction cycle.Type: ApplicationFiled: March 3, 2017Publication date: June 22, 2017Inventors: Christopher John Nicol, Samit Chaudhuri, Radoslav Danilak
-
Patent number: 9588773Abstract: A processing device is provided. A cluster includes a plurality of groups of processing elements. A multi-word device is connected to the processing elements within the groups. Each processing element in a particular group is in communication with all other processing elements within the particular group, and only one of the processing elements within other groups in the cluster. Each processing element is limited to operations in which input bits can be processed and an output obtained without reference to other bits. The multi-word device is configured to cooperate with at least two other processing elements to perform processing that requires reference to other bits to obtain a result.Type: GrantFiled: January 7, 2014Date of Patent: March 7, 2017Assignee: Wave Computing, Inc.Inventors: Christopher John Nicol, Samit Chaudhuri, Radoslav Danilak
-
Patent number: 9563401Abstract: An extensible iterative multiplier design is provided. Embodiments provide cascaded 8-bit multipliers for simplifying the performance of multi-byte multiplications. Booth encoding is performed in the lowest order multiplier, with the result of the Booth encoding then provided to higher order multipliers. Additionally, multiply-add operations can be performed by initializing a partial product sum register. Configurable connections between the multipliers facilitate a variety of possible multiplication options, including the possibility of varying the width of the operands.Type: GrantFiled: December 7, 2013Date of Patent: February 7, 2017Assignee: Wave Computing, Inc.Inventors: Samit Chaudhuri, Radoslav Danilak
-
Publication number: 20160321039Abstract: Technology mapping onto code fragments and related concepts are disclosed. Program descriptions are obtained in a high-level language. One or more intrinsic libraries containing modules are obtained. The modules correspond to sections of code intended for execution on the special purpose hardware. The high-level program description is analyzed to determine locations of one or more cuts within the program. The cuts represent portions of the high-level code that are eligible for replacement by one or more modules from intrinsic libraries. A matching process is used to find modules that are suitable replacements for the high level code. Once the replacements are made, additional verification and/or validation are performed by compiler checking and/or execution tests.Type: ApplicationFiled: April 29, 2016Publication date: November 3, 2016Inventors: Samit Chaudhuri, Andrew William Fox, Tigran Sargsyan
-
Publication number: 20160125118Abstract: Systems and methods are disclosed for computing resource allocation based on flow graph translation. First, a high-level description of logic circuitry is obtained and translated to generate a flow graph representing sequential operations. Using the flow graph, similar processing elements in an array are interchangeably allocated to perform computational, communication, and storage tasks as needed. The sequential operations are executed using the array of interchangeable processing elements. Data is provided from the storage elements through the communication elements to the computational elements. Computational results are stored in the storage elements. Outputs from some of the computational elements provide inputs to other computational elements. Execution of the instructions can be controlled with time stepping. The processors are reallocated as needed, based on changes to the flow graph.Type: ApplicationFiled: October 30, 2015Publication date: May 5, 2016Inventors: Samit Chaudhuri, Henrik Esbensen, Kenneth Shiring, Peter Ramyalal Suaris
-
Publication number: 20140195779Abstract: A processing device is provided. A cluster includes a plurality of groups of processing elements. A multi-word device is connected to the processing elements within the groups. Each processing element in a particular group is in communication with all other processing elements within the particular group, and only one of the processing elements within other groups in the cluster. Each processing element is limited to operations in which input bits can be processed and an output obtained without reference to other bits. The multi-word device is configured to cooperate with at least two other processing elements to perform processing that requires reference to other bits to obtain a result.Type: ApplicationFiled: January 7, 2014Publication date: July 10, 2014Applicant: Wave SemiconductorInventors: Christopher John Nicol, Samit Chaudhuri, Radoslav Danilak
-
Publication number: 20140181164Abstract: An apparatus for mathematical manipulation is described allowing the selective combination of shifters to shift binary numbers of various widths. Selective combination allows on-the-fly adjustment of shifters from independent to coordinated shifting operations. Selective combination allows adjustable hardware-based shifting while saving space and resources. Multiple eight-bit shifters can be configured for a variety of operand widths, such as a 32-bit width, a 24-bit width, a 16-bit width, or an eight-bit width. Multiplexers route the appropriate input data to the appropriate shifters. Opcodes configure the shifters for the desired type of shift and a shifted result is generated.Type: ApplicationFiled: December 20, 2013Publication date: June 26, 2014Applicant: Wave Semiconductor, Inc.Inventor: Samit Chaudhuri
-
Publication number: 20140164457Abstract: An extensible iterative multiplier design is provided. Embodiments provide cascaded 8-bit multipliers for simplifying the performance of multi-byte multiplications. Booth encoding is performed in the lowest order multiplier, with the result of the Booth encoding then provided to higher order multipliers. Additionally, multiply-add operations can be performed by initializing a partial product sum register. Configurable connections between the multipliers facilitate a variety of possible multiplication options, including the possibility of varying the width of the operands.Type: ApplicationFiled: December 7, 2013Publication date: June 12, 2014Applicant: Wave Semiconductor, Inc.Inventors: Samit Chaudhuri, Radoslav Danilak
-
Patent number: 8707240Abstract: Techniques are disclosed for improving bit slice placement and wiring. Some embodiments include swapping cells to improve routing. An alternative embodiment includes copying wiring from a first bit slice to a second bit slice. Another embodiment includes copying blocks or cells from a first bit slice to a second bit slice. Further, the wiring from the first bit slice may be copied to the second bit slice.Type: GrantFiled: June 5, 2009Date of Patent: April 22, 2014Assignee: Synopsys, Inc.Inventors: Darren Faulkner, Alan Cheuk-Ming Lam, Samit Chaudhuri, Aditya Shiledar
-
Patent number: 8434047Abstract: A method of optimizing clock-gated circuitry in an integrated circuit (IC) design is provided. A plurality of signals which feed into enable inputs of a plurality of clock gates is determined, where the clock gates gate a plurality of sequential elements in the IC design. Combinational logic which is shared among the plurality of signals is identified. The clock-gated circuitry is transformed into multiple levels of clock-gating circuitry based on the shared combinational logic.Type: GrantFiled: January 25, 2011Date of Patent: April 30, 2013Assignee: Synopsys, Inc.Inventors: Yunjian (William) Jiang, Arvind Srinivasan, Joy Banerjee, Yinghua Li, Partha Das, Samit Chaudhuri
-
Patent number: 7958476Abstract: A power optimization method of deriving gated circuitry in an integrated circuit (IC) is provided. A design description of the IC is received and analyzed. A state machine is identified based on the analysis. One or more candidate blocks are determined to be capable of being disabled. At least one of the candidate blocks is selected based on one or more states of the state machine. A gating circuit is inserted for gating the selected candidate block(s). In another embodiment of power optimization, one or more state machines are identified and a synthesized netlist is generated. One or more candidate blocks in the synthesized netlist are determined to be capable of being disabled. At least one of the candidate blocks is selected based on one or more states in the state machine, and a gating circuit is inserted for gating the selected candidate block(s).Type: GrantFiled: July 9, 2008Date of Patent: June 7, 2011Assignee: Magma Design Automation, Inc.Inventors: Yunjian (William) Jiang, Arvind Srinivasan, Samit Chaudhuri