Patents by Inventor Sharad Vasantrao Chole
Sharad Vasantrao Chole has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250013259Abstract: Disclosed are semiconductor devices that implement relaxed clock forwarding between logic blocks. In one embodiment the system includes a set of logic blocks forming a first processing path. Another set of logic blocks form additional processing paths. A clock is configured to forward the data between the logic blocks asynchronously in the first processing path in which the data is asynchronously forwarded between logic blocks. This forwarding is asynchronous with the additional processing paths data and clock. The ends or last logic block in each path can be synchronized using a synchronizer component. The synchronizer can be a plurality of asynchronous FIFOs. In one embodiment, logic blocks form a matrix and the processing paths are along the rows or columns of the matrix.Type: ApplicationFiled: June 28, 2024Publication date: January 9, 2025Inventors: Siyad Chih-Hua Ma, Shang-Tse Chuang, Sharad Vasantrao Chole
-
Patent number: 12182717Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.Type: GrantFiled: October 18, 2021Date of Patent: December 31, 2024Assignee: Expedera, Inc.Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
-
Publication number: 20240428046Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized matrix processor circuits can improve performance. To perform all these matrix operations, the neural processing circuits must be quickly and efficiently supplied with data to process or else the matrix processor circuits end up idle or spending large amounts of time loading weight matrix data.Type: ApplicationFiled: June 21, 2023Publication date: December 26, 2024Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
-
Publication number: 20240427839Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized matrix processor circuits can improve performance. Thus, this document discloses apparatus and methods for efficiently processing matrix operations.Type: ApplicationFiled: June 21, 2023Publication date: December 26, 2024Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
-
Patent number: 12141226Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can greatly increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized Matrix Processor circuits can improve performance. But a neural network is more than a collection of matrix operations; it is a set of specifically coordinated matrix operations with complex data dependencies. Without proper coordination, Matrix Processor circuits may end up idle or spending large amounts of time loading in different weight matrix data.Type: GrantFiled: April 5, 2019Date of Patent: November 12, 2024Assignee: Expedera, Inc.Inventors: Siyad Chih-Hua Ma, Shang-Tse Chuang, Sharad Vasantrao Chole
-
Publication number: 20240320496Abstract: Artificial intelligence is an extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can greatly increase computational performance. Specifically, artificial intelligence generally requires a large flow of data from different types of memory. To maximize the process of a multilayer neural network, the reordering of data onto and out of a neural network processor, the computations by the matrix of processing elements within the neural network processor, and the synchronization of these activities are reordered.Type: ApplicationFiled: June 7, 2024Publication date: September 26, 2024Inventors: Ramteja Tadishetti, Steven Twu, Arthur Chang, Sharad Vasantrao Chole
-
Publication number: 20240265234Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is very computationally intensive field. Fortunately, many of the required calculations can be performed in parallel such that specialized processors can greatly increase computation performance. In particular, Graphics Processor Units (GPUs) are often used in artificial intelligence. Although GPUs have helped, they are not ideal for artificial intelligence. Specifically, GPUs are used to compute matrix operations in one direction with a pipelined architecture. However, artificial intelligence is a field that uses both forward propagation computations and back propagation calculations. To efficiently perform artificial intelligence calculations, a symmetric matrix processing element is introduced. The symmetric matrix processing element can perform forward propagation and backward propagation calculations just as easily.Type: ApplicationFiled: April 17, 2024Publication date: August 8, 2024Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
-
Publication number: 20240242078Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.Type: ApplicationFiled: October 18, 2021Publication date: July 18, 2024Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
-
Patent number: 11983616Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is very computationally intensive field. Fortunately, many of the required calculations can be performed in parallel such that specialized processors can greatly increase computation performance. In particular, Graphics Processor Units (GPUs) are often used in artificial intelligence. Although GPUs have helped, they are not ideal for artificial intelligence. Specifically, GPUs are used to compute matrix operations in one direction with a pipelined architecture. However, artificial intelligence is a field that uses both forward propagation computations and back propagation calculations. To efficiently perform artificial intelligence calculations, a symmetric matrix processing element is introduced. The symmetric matrix processing element can perform forward propagation and backward propagation calculations just as easily.Type: GrantFiled: October 1, 2018Date of Patent: May 14, 2024Assignee: Expedera, Inc.Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
-
Publication number: 20240152761Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming field. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance for AI applications. Specifically, artificial intelligence generally requires large numbers of matrix operations such that specialized matrix processor circuits can greatly improve performance. To efficiently execute all these matrix operations, the matrix processor circuits must be quickly and efficiently supplied with a stream of data and instructions to process or else the matrix processor circuits end up idle. Thus, this document discloses packet architecture for efficiently creating and supplying neural network processors with work packets to process.Type: ApplicationFiled: October 20, 2022Publication date: May 9, 2024Applicant: Expedera, Inc.Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
-
Patent number: 11151416Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.Type: GrantFiled: September 11, 2019Date of Patent: October 19, 2021Assignee: Expedera, Inc.Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
-
Publication number: 20210073585Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.Type: ApplicationFiled: September 11, 2019Publication date: March 11, 2021Applicant: Expedera, Inc.Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
-
Publication number: 20200371835Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is an extremely computationally intensive field such that performing artificial intelligence calculations can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence applications can be performed in parallel such that specialized linear algebra matrix processors can greatly increase computational performance. But even with linear algebra matrix processors; performance can be limited due to complex data dependencies. Without proper coordination, linear algebra matrix processors may end up idle or spending large amounts of time moving data around. Thus, this document discloses methods for efficiently scheduling linear algebra matrix processors.Type: ApplicationFiled: May 7, 2020Publication date: November 26, 2020Applicant: Expedera, Inc.Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
-
Publication number: 20200226201Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized Matrix Processor circuits can improve performance. But a neural network is more than a collection of matrix operations; it is a set of specifically coordinated matrix operations with complex data dependencies. Without proper coordination, Matrix Processor circuits may end up idle or spending large amounts of time loading in different weight matrix data.Type: ApplicationFiled: April 5, 2019Publication date: July 16, 2020Applicant: Expedera, Inc.Inventors: Siyad Chih-Hua Ma, Shang-Tse Chuang, Sharad Vasantrao Chole
-
Publication number: 20200104669Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is very computationally intensive field. Fortunately, many of the required calculations can be performed in parallel such that specialized processors can great increase computation performance. In particular, Graphics Processor Units (GPUs) are often used in artificial intelligence. Although GPUs have helped, they are not ideal for artificial intelligence. Specifically, GPUs are used to compute matrix operations in one direction with a pipelined architecture. However, artificial intelligence is a field that uses both forward propagation computations and back propagation calculations. To efficiently perform artificial intelligence calculations, a symmetric matrix processing element is introduced. The symmetric matrix processing element can perform forward propagation and backward propagation calculations just as easily.Type: ApplicationFiled: October 1, 2018Publication date: April 2, 2020Applicant: Expedera, Inc.Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
-
Patent number: 9965211Abstract: Provided are a method, a non-transitory computer-readable storage device and an apparatus for managing use of a shared memory buffer that is partitioned into multiple banks and that stores incoming data received at multiple inputs in accordance with a multi-slice architecture. A particular bank is allocated to a corresponding slice. Received respective data packets are associated with corresponding slices based on which respective inputs they are received. Determine, based on a state of the shared memory buffer, to transfer contents of all occupied cells of the particular bank. Writes to the bank are stopped, contents of occupied cells are transferred to cells of one or more other banks associated with the particular bank's slice, information is stored indicating where the contents have been transferred, and the particular bank is returned to a shared pool after transferring is completed.Type: GrantFiled: September 8, 2016Date of Patent: May 8, 2018Assignee: Cisco Technology, Inc.Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Georges Akis, Felice Bonardi, Rong Pan
-
Publication number: 20180067683Abstract: Provided are a method, a non-transitory computer-readable storage device and an apparatus for managing use of a shared memory buffer that is partitioned into multiple banks and that stores incoming data received at multiple inputs in accordance with a multi-slice architecture. A particular bank is allocated to a corresponding slice. Received respective data packets are associated with corresponding slices based on which respective inputs they are received. Determine, based on a state of the shared memory buffer, to transfer contents of all occupied cells of the particular bank. Writes to the bank are stopped, contents of occupied cells are transferred to cells of one or more other banks associated with the particular bank's slice, information is stored indicating where the contents have been transferred, and the particular bank is returned to a shared pool after transferring is completed.Type: ApplicationFiled: September 8, 2016Publication date: March 8, 2018Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Georges Akis, Felice Bonardi, Rong Pan