Patents by Inventor Amit Ramchandran
Amit Ramchandran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7340562
Abstract: A distributed data cache includes a number of cache memory units or register files each having a number of cache lines. Data buses are connected with the cache memory units. Each data bus is connected with a different cache line from each cache memory unit. A number of data address generators are connected with a memory unit and the data buses. The data address generators retrieve data values from the memory unit and communicate the data values to the data buses without latency. The data address generators are adapted to simultaneously communicate each of the data values to a different data bus without latency. The cache memory units are adapted to simultaneously load data values from the data buses, with each data value loaded into a different cache line without latency.
Type: Grant
Filed: July 24, 2003
Date of Patent: March 4, 2008
Assignee: NVIDIA Corporation
Inventor: Amit Ramchandran
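The following C sketch models the parallel broadcast this abstract describes. The sizes and names (NUM_UNITS, NUM_LINES, dag_fetch, units_load) are illustrative assumptions, not taken from the patent, and real hardware would latch everything in a single cycle rather than in loops.

```c
#include <stdio.h>

#define NUM_UNITS 4   /* cache memory units / register files */
#define NUM_LINES 4   /* cache lines per unit; also the number of data buses */

static int memory[64];                   /* backing memory unit */
static int cache[NUM_UNITS][NUM_LINES];  /* cache[u][l]: unit u, line l */
static int bus[NUM_LINES];               /* bus b feeds line b of every unit */

/* Data address generators: each one fetches one value from memory and
   drives a distinct bus, so all buses carry fresh data in the same step. */
static void dag_fetch(const int addr[NUM_LINES]) {
    for (int b = 0; b < NUM_LINES; b++)
        bus[b] = memory[addr[b]];
}

/* Every unit loads from every bus in parallel; bus b lands in line b,
   so each value goes into a different cache line. */
static void units_load(void) {
    for (int u = 0; u < NUM_UNITS; u++)
        for (int b = 0; b < NUM_LINES; b++)
            cache[u][b] = bus[b];
}

int main(void) {
    for (int i = 0; i < 64; i++) memory[i] = i * 10;
    int addrs[NUM_LINES] = {3, 7, 11, 15};
    dag_fetch(addrs);  /* generators drive all buses at once */
    units_load();      /* units latch all buses at once      */
    printf("unit 2, line 1 = %d\n", cache[2][1]);  /* prints 70 */
    return 0;
}
```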
-
Publication number: 20070294511
Abstract: One embodiment of the present invention includes a heterogeneous, high-performance, scalable processor having at least one W-type sub-processor capable of processing W bits in parallel, W being an integer value, and at least one N-type sub-processor capable of processing N bits in parallel, N being an integer value smaller than W by a factor of two. The processor further includes a shared bus coupling the at least one W-type sub-processor and the at least one N-type sub-processor, and shared memory coupled to the at least one W-type sub-processor and the at least one N-type sub-processor, wherein the W-type sub-processor rearranges memory to accommodate execution of applications, allowing for fast operations.
Type: Application
Filed: August 30, 2007
Publication date: December 20, 2007
Applicant: 3PLUS1 TECHNOLOGY, INC.
Inventors: Amit Ramchandran, John Hauser
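A minimal C sketch of the W/N split under the stated "smaller by a factor of two" relationship, assuming W = 32 and N = 16. The rearrangement shown, swapping halfwords so the narrow sub-processor sees its operands contiguously, is one invented illustration of "rearranges memory", not the patent's specific scheme.

```c
#include <stdint.h>
#include <stdio.h>

static uint8_t shared_mem[16];   /* memory shared by both sub-processors */

/* W-type: operates on W = 32 bits (one 4-byte word) per step. Here it
   "rearranges memory": swapping the halfwords of each word so the N-type
   sub-processor can stream its 16-bit operands contiguously. */
static void w_rearrange(void) {
    for (int i = 0; i < 16; i += 4) {
        uint8_t t0 = shared_mem[i], t1 = shared_mem[i + 1];
        shared_mem[i]     = shared_mem[i + 2];
        shared_mem[i + 1] = shared_mem[i + 3];
        shared_mem[i + 2] = t0;
        shared_mem[i + 3] = t1;
    }
}

/* N-type: operates on N = 16 bits per step, consuming the shared memory
   as little-endian halfwords. */
static uint32_t n_sum16(void) {
    uint32_t acc = 0;
    for (int i = 0; i < 16; i += 2)
        acc += (uint32_t)(shared_mem[i] | (shared_mem[i + 1] << 8));
    return acc;
}

int main(void) {
    for (int i = 0; i < 16; i++) shared_mem[i] = (uint8_t)i;
    w_rearrange();                    /* W-type prepares the layout */
    printf("sum = %u\n", n_sum16());  /* N-type consumes 16-bit words */
    return 0;
}
```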
-
Publication number: 20070271415
Abstract: The present invention includes an adaptable high-performance node (RXN) with several features that enable it to provide high performance along with adaptability. A preferred embodiment of the RXN includes a run-time configurable data path and control path. The RXN supports multi-precision arithmetic, including 8-, 16-, 24-, and 32-bit codes. Data flow can be reconfigured to minimize register accesses for different operations. For example, multiply-accumulate operations can be performed with minimal, or no, register stores by reconfiguration of the data path. Predetermined kernels can be configured during a setup phase so that the RXN can efficiently execute, e.g., discrete cosine transform (DCT), fast Fourier transform (FFT), and other operations. Other features are provided.
Type: Application
Filed: May 3, 2007
Publication date: November 22, 2007
Inventor: Amit Ramchandran
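The multiply-accumulate point lends itself to a short sketch. The C model below contrasts a path that writes the accumulator back to the register file every iteration with a fused path that holds it in the data path and stores once; the store counter is an invented stand-in for the register traffic the RXN avoids.

```c
#include <stdint.h>
#include <stdio.h>

/* Counter standing in for the register-file stores the RXN avoids. */
static int register_stores = 0;

/* Baseline: the accumulator is written back every step. */
static int32_t mac_via_regfile(const int16_t *a, const int16_t *b, int n) {
    int32_t regfile_acc = 0;
    for (int i = 0; i < n; i++) {
        regfile_acc += (int32_t)a[i] * b[i];
        register_stores++;           /* one store per iteration */
    }
    return regfile_acc;
}

/* Reconfigured path: the accumulator stays inside the MAC loop; only the
   final result touches the register file. */
static int32_t mac_fused(const int16_t *a, const int16_t *b, int n) {
    int32_t acc = 0;                 /* held in the data path */
    for (int i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];
    register_stores++;               /* single store at the end */
    return acc;
}

int main(void) {
    int16_t a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    int16_t b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    register_stores = 0;
    printf("%d (stores: %d)\n", mac_via_regfile(a, b, 8), register_stores);
    register_stores = 0;
    printf("%d (stores: %d)\n", mac_fused(a, b, 8), register_stores);
    return 0;
}
```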
-
Publication number: 20070198901
Abstract: One embodiment of the present invention includes a heterogeneous, high-performance, scalable processor including at least one W-type sub-processor capable of processing W bits, or more, in parallel, W being an integer value; at least one N-type sub-processor capable of processing N bits in parallel, N being an integer value smaller than W; a shared bus coupling the at least one W-type sub-processor and the at least one N-type sub-processor; and at least one Galois Field (GF) MAC coupled to communicate with the W-type sub-processor and the N-type sub-processor, wherein the W-type sub-processor rearranges bytes in transit to or from memory to accommodate execution of applications, allowing for fast operations.
Type: Application
Filed: April 10, 2007
Publication date: August 23, 2007
Inventors: Amit Ramchandran, John Hauser, Adam Lins
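The abstract names a Galois Field MAC without fixing the field, so the sketch below assumes GF(2^8) with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) purely as a familiar example; in a characteristic-2 field, "accumulate" is XOR.

```c
#include <stdint.h>
#include <stdio.h>

/* Carry-less multiply in GF(2^8), reducing modulo 0x11B as bits shift
   out of the top of the byte. */
static uint8_t gf_mul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;                            /* add (XOR) term */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));  /* reduce */
        b >>= 1;
    }
    return p;
}

/* GF MAC: multiply element-wise and accumulate with XOR. */
static uint8_t gf_mac(const uint8_t *a, const uint8_t *b, int n) {
    uint8_t acc = 0;
    for (int i = 0; i < n; i++)
        acc ^= gf_mul(a[i], b[i]);
    return acc;
}

int main(void) {
    uint8_t a[4] = {0x57, 0x02, 0x10, 0x01};
    uint8_t b[4] = {0x83, 0x04, 0x20, 0xFF};
    printf("GF MAC = 0x%02X\n", gf_mac(a, b, 4));
    return 0;
}
```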
-
Patent number: 7249242
Abstract: Input pipeline registers are provided at inputs to functional units and data paths in an adaptive computing machine. Input pipeline registers are used to hold last-accessed values and to immediately place commonly needed constant values, such as a zero or one, onto inputs and data lines. This approach can reduce the time to obtain data values and conserve power by avoiding slower and more complex memory or storage accesses. Another embodiment of the invention allows data values to be obtained earlier during pipelined execution of instructions. For example, in a three-stage fetch-decode-execute type of reduced instruction set computer (RISC), a data value can be ready from a prior instruction at the decode or execute stage of a subsequent instruction.
Type: Grant
Filed: July 23, 2003
Date of Patent: July 24, 2007
Assignee: NVIDIA Corporation
Inventor: Amit Ramchandran
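A small C model of the input-register behavior described above, with invented names (input_reg, reg_load, reg_preset): the register holds the last-accessed value for reuse, and a common constant can be placed on it directly, skipping a memory access entirely.

```c
#include <stdio.h>

static int memory_reads = 0;   /* stands in for slow storage accesses */

typedef struct {
    int value;   /* last-accessed value held at the functional-unit input */
} input_reg;

/* Normal path: fetch from memory into the input register (costly). */
static void reg_load(input_reg *r, const int *mem, int addr) {
    r->value = mem[addr];
    memory_reads++;
}

/* Fast path: place a common constant directly on the input line. */
static void reg_preset(input_reg *r, int constant) {
    r->value = constant;             /* no memory traffic at all */
}

int main(void) {
    int mem[4] = {10, 20, 30, 40};
    input_reg a = {0}, b = {0};

    reg_load(&a, mem, 2);            /* a = 30, one memory read   */
    reg_preset(&b, 1);               /* b = 1, zero memory reads  */
    printf("a*b = %d, reads = %d\n", a.value * b.value, memory_reads);

    /* Reuse: a still holds 30, so the next op needs no new read. */
    printf("a+a = %d, reads = %d\n", a.value + a.value, memory_reads);
    return 0;
}
```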
-
Publication number: 20070150656
Abstract: A method for compressing a set of instructions in an adaptive computing machine includes identifying frequently executed instructions, inserting before them an explicit caching instruction that associates the identified instructions with an index value, and replacing at least one instance of the frequently executed instructions following the explicit caching instruction with a compressed instruction referencing the index value. One or more instructions can be identified for compression, including groups of consecutive or non-consecutive instructions. The explicit caching instruction directs a node in the adaptive computing machine to store instructions in an instruction storage unit in association with an index value. Instructions stored in the storage unit are retrievable by reference to the index value.
Type: Application
Filed: March 7, 2007
Publication date: June 28, 2007
Applicant: QuickSilver Technology, Inc.
Inventor: Amit Ramchandran
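A C sketch of the compression pass itself, under invented 32-bit encodings (TAG_CACHE, TAG_REF): it finds the most frequent instruction, inserts an explicit caching instruction before its first occurrence, and replaces later occurrences with one-word references. Because every word here has the same width, the sketch shows only the association and replacement mechanics; in real hardware the saving comes from the compressed reference being a shorter encoding.

```c
#include <stdint.h>
#include <stdio.h>

#define TAG_CACHE 0xC1000000u  /* cache the next word under the low-nibble index */
#define TAG_REF   0xC2000000u  /* compressed instruction: replay the cached word */

/* Compress: find the most frequent instruction, emit an explicit caching
   instruction before its first occurrence, and replace every later
   occurrence with a one-word reference. Returns the new length. */
static int compress(const uint32_t *in, int n, uint32_t *out) {
    uint32_t hot = in[0];
    int best = 0;
    for (int i = 0; i < n; i++) {       /* simple O(n^2) frequency scan */
        int c = 0;
        for (int j = 0; j < n; j++)
            if (in[j] == in[i]) c++;
        if (c > best) { best = c; hot = in[i]; }
    }
    int m = 0, seen = 0;
    for (int i = 0; i < n; i++) {
        if (in[i] != hot) { out[m++] = in[i]; continue; }
        if (!seen) {                    /* first occurrence: cache it */
            out[m++] = TAG_CACHE | 0;
            out[m++] = hot;
            seen = 1;
        } else {
            out[m++] = TAG_REF | 0;     /* later occurrences compress */
        }
    }
    return m;
}

int main(void) {
    uint32_t in[] = {0xA1, 0xB2, 0xA1, 0xC3, 0xA1};
    uint32_t out[8];
    int m = compress(in, 5, out);
    for (int i = 0; i < m; i++) printf("%08X\n", (unsigned)out[i]);
    return 0;
}
```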
-
Patent number: 7194605
Abstract: A method for compressing a set of instructions in an adaptive computing machine includes identifying frequently executed instructions, inserting before them an explicit caching instruction that associates the identified instructions with an index value, and replacing at least one instance of the frequently executed instructions following the explicit caching instruction with a compressed instruction referencing the index value. One or more instructions can be identified for compression, including groups of consecutive or non-consecutive instructions. The explicit caching instruction directs a node in the adaptive computing machine to store instructions in an instruction storage unit in association with an index value. Instructions stored in the storage unit are retrievable by reference to the index value.
Type: Grant
Filed: July 24, 2003
Date of Patent: March 20, 2007
Assignee: NVIDIA Corporation
Inventor: Amit Ramchandran
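The same method sketched from the node's decode side, using the same invented TAG encodings as the compressor sketch above: on TAG_CACHE the node stores the next word in its instruction storage unit under the index and executes it; on TAG_REF it replays the stored word.

```c
#include <stdint.h>
#include <stdio.h>

#define TAG_MASK  0xFF000000u
#define TAG_CACHE 0xC1000000u  /* store the following word under the index */
#define TAG_REF   0xC2000000u  /* replay the word stored under the index   */

static uint32_t storage[16];   /* the node's instruction storage unit */

static void execute(uint32_t insn) { printf("exec %08X\n", (unsigned)insn); }

static void run(const uint32_t *stream, int len) {
    for (int pc = 0; pc < len; pc++) {
        uint32_t w = stream[pc], idx = w & 0xFu;
        if ((w & TAG_MASK) == TAG_CACHE) {
            storage[idx] = stream[++pc];  /* associate index with instruction */
            execute(storage[idx]);
        } else if ((w & TAG_MASK) == TAG_REF) {
            execute(storage[idx]);        /* expand the compressed reference */
        } else {
            execute(w);                   /* ordinary instruction */
        }
    }
}

int main(void) {
    /* The frequent word 0x00ADD123 is cached once, then replayed twice. */
    uint32_t stream[] = {
        TAG_CACHE | 0, 0x00ADD123,
        0x00555555,
        TAG_REF | 0,
        TAG_REF | 0,
    };
    run(stream, 5);
    return 0;
}
```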
-
Publication number: 20060026578
Abstract: One embodiment of the present invention includes a heterogeneous, high-performance, scalable processor having at least one W-type sub-processor capable of processing W bits or greater in parallel, W being an integer value, and at least one N-type sub-processor capable of processing N bits in parallel, N being an integer value smaller than W. A scenario compiler is included in a hierarchical flow of compilation and is used with other compilation and assembler blocks to generate binary code from different types of code, allowing for efficient processing across the sub-processors while maintaining low power consumption when the binary code is executed.
Type: Application
Filed: August 2, 2005
Publication date: February 2, 2006
Inventors: Amit Ramchandran, John Hauser
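The hierarchical compilation flow is only named here, not specified, so the sketch below is a loose toy: a "scenario" pass that routes each operation to the W- or N-type sub-processor by operand width, and an assembler pass that packs that choice into a binary word. All stages, encodings, and names are invented.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct { uint8_t opcode; uint8_t width; } op_t;  /* width in bits */

/* Scenario pass: route wide ops to the W-type sub-processor (assumed
   32-bit) and narrow ops to the N-type (assumed 16-bit or less). */
static char classify(op_t op) { return op.width > 16 ? 'W' : 'N'; }

/* Assembler pass: pack the target (1 bit) and opcode into a 16-bit word. */
static uint16_t assemble(op_t op) {
    uint16_t target = (classify(op) == 'W') ? 1 : 0;
    return (uint16_t)((target << 15) | op.opcode);
}

int main(void) {
    op_t program[] = { {0x12, 32}, {0x34, 16}, {0x56, 8} };
    for (int i = 0; i < 3; i++)
        printf("op %02X -> %c-type, word %04X\n",
               (unsigned)program[i].opcode, classify(program[i]),
               (unsigned)assemble(program[i]));
    return 0;
}
```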
-
Publication number: 20060015703
Abstract: One embodiment of the present invention includes a heterogeneous, high-performance, scalable processor having at least one W-type sub-processor capable of processing W bits in parallel, W being an integer value, and at least one N-type sub-processor capable of processing N bits in parallel, N being an integer value smaller than W by a factor of two. The processor further includes a shared bus coupling the at least one W-type sub-processor and the at least one N-type sub-processor, and shared memory coupled to the at least one W-type sub-processor and the at least one N-type sub-processor, wherein the W-type sub-processor rearranges memory to accommodate execution of applications, allowing for fast operations.
Type: Application
Filed: July 12, 2005
Publication date: January 19, 2006
Inventors: Amit Ramchandran, John Hauser
-
Publication number: 20040168044
Abstract: Input pipeline registers are provided at inputs to functional units and data paths in an adaptive computing machine. Input pipeline registers are used to hold last-accessed values and to immediately place commonly needed constant values, such as a zero or one, onto inputs and data lines. This approach can reduce the time to obtain data values and conserve power by avoiding slower and more complex memory or storage accesses. Another embodiment of the invention allows data values to be obtained earlier during pipelined execution of instructions. For example, in a three-stage fetch-decode-execute type of reduced instruction set computer (RISC), a data value can be ready from a prior instruction at the decode or execute stage of a subsequent instruction.
Type: Application
Filed: July 23, 2003
Publication date: August 26, 2004
Applicant: QuickSilver Technology, Inc.
Inventor: Amit Ramchandran
-
Publication number: 20040133745
Abstract: The present invention includes an adaptable high-performance node (RXN) with several features that enable it to provide high performance along with adaptability. A preferred embodiment of the RXN includes a run-time configurable data path and control path. The RXN supports multi-precision arithmetic, including 8-, 16-, 24-, and 32-bit codes. Data flow can be reconfigured to minimize register accesses for different operations. For example, multiply-accumulate operations can be performed with minimal, or no, register stores by reconfiguration of the data path. Predetermined kernels can be configured during a setup phase so that the RXN can efficiently execute, e.g., discrete cosine transform (DCT), fast Fourier transform (FFT), and other operations. Other features are provided.
Type: Application
Filed: July 23, 2003
Publication date: July 8, 2004
Applicant: QuickSilver Technology, Inc.
Inventor: Amit Ramchandran
-
Publication number: 20040093479
Abstract: A method for compressing a set of instructions in an adaptive computing machine includes identifying frequently executed instructions, inserting before them an explicit caching instruction that associates the identified instructions with an index value, and replacing at least one instance of the frequently executed instructions following the explicit caching instruction with a compressed instruction referencing the index value. One or more instructions can be identified for compression, including groups of consecutive or non-consecutive instructions. The explicit caching instruction directs a node in the adaptive computing machine to store instructions in an instruction storage unit in association with an index value. Instructions stored in the storage unit are retrievable by reference to the index value.
Type: Application
Filed: July 24, 2003
Publication date: May 13, 2004
Applicant: QuickSilver Technology, Inc.
Inventor: Amit Ramchandran
-
Publication number: 20040093465
Abstract: A distributed data cache includes a number of cache memory units or register files each having a number of cache lines. Data buses are connected with the cache memory units. Each data bus is connected with a different cache line from each cache memory unit. A number of data address generators are connected with a memory unit and the data buses. The data address generators retrieve data values from the memory unit and communicate the data values to the data buses without latency. The data address generators are adapted to simultaneously communicate each of the data values to a different data bus without latency. The cache memory units are adapted to simultaneously load data values from the data buses, with each data value loaded into a different cache line without latency.
Type: Application
Filed: July 24, 2003
Publication date: May 13, 2004
Applicant: QuickSilver Technology, Inc.
Inventor: Amit Ramchandran