Patents by Inventor Sharad Vasantrao Chole

Sharad Vasantrao Chole has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250013259
    Abstract: Disclosed are semiconductor devices that implement relaxed clock forwarding between logic blocks. In one embodiment the system includes a set of logic blocks forming a first processing path. Another set of logic blocks form additional processing paths. A clock is configured to forward the data between the logic blocks asynchronously in the first processing path in which the data is asynchronously forwarded between logic blocks. This forwarding is asynchronous with the additional processing paths data and clock. The ends or last logic block in each path can be synchronized using a synchronizer component. The synchronizer can be a plurality of asynchronous FIFOs. In one embodiment, logic blocks form a matrix and the processing paths are along the rows or columns of the matrix.
    Type: Application
    Filed: June 28, 2024
    Publication date: January 9, 2025
    Inventors: Siyad Chih-Hua Ma, Shang-Tse Chuang, Sharad Vasantrao Chole
  • Patent number: 12182717
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.
    Type: Grant
    Filed: October 18, 2021
    Date of Patent: December 31, 2024
    Assignee: Expedera, Inc.
    Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
  • Publication number: 20240428046
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized matrix processor circuits can improve performance. To perform all these matrix operations, the neural processing circuits must be quickly and efficiently supplied with data to process or else the matrix processor circuits end up idle or spending large amounts of time loading weight matrix data.
    Type: Application
    Filed: June 21, 2023
    Publication date: December 26, 2024
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
  • Publication number: 20240427839
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized matrix processor circuits can improve performance. Thus, this document discloses apparatus and methods for efficiently processing matrix operations.
    Type: Application
    Filed: June 21, 2023
    Publication date: December 26, 2024
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
  • Patent number: 12141226
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can greatly increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized Matrix Processor circuits can improve performance. But a neural network is more than a collection of matrix operations; it is a set of specifically coordinated matrix operations with complex data dependencies. Without proper coordination, Matrix Processor circuits may end up idle or spending large amounts of time loading in different weight matrix data.
    Type: Grant
    Filed: April 5, 2019
    Date of Patent: November 12, 2024
    Assignee: Expedera, Inc.
    Inventors: Siyad Chih-Hua Ma, Shang-Tse Chuang, Sharad Vasantrao Chole
  • Publication number: 20240320496
    Abstract: Artificial intelligence is an extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can greatly increase computational performance. Specifically, artificial intelligence generally requires a large flow of data from different types of memory. To maximize the process of a multilayer neural network, the reordering of data onto and out of a neural network processor, the computations by the matrix of processing elements within the neural network processor, and the synchronization of these activities are reordered.
    Type: Application
    Filed: June 7, 2024
    Publication date: September 26, 2024
    Inventors: Ramteja Tadishetti, Steven Twu, Arthur Chang, Sharad Vasantrao Chole
  • Publication number: 20240265234
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is very computationally intensive field. Fortunately, many of the required calculations can be performed in parallel such that specialized processors can greatly increase computation performance. In particular, Graphics Processor Units (GPUs) are often used in artificial intelligence. Although GPUs have helped, they are not ideal for artificial intelligence. Specifically, GPUs are used to compute matrix operations in one direction with a pipelined architecture. However, artificial intelligence is a field that uses both forward propagation computations and back propagation calculations. To efficiently perform artificial intelligence calculations, a symmetric matrix processing element is introduced. The symmetric matrix processing element can perform forward propagation and backward propagation calculations just as easily.
    Type: Application
    Filed: April 17, 2024
    Publication date: August 8, 2024
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
  • Publication number: 20240242078
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.
    Type: Application
    Filed: October 18, 2021
    Publication date: July 18, 2024
    Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
  • Patent number: 11983616
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is very computationally intensive field. Fortunately, many of the required calculations can be performed in parallel such that specialized processors can greatly increase computation performance. In particular, Graphics Processor Units (GPUs) are often used in artificial intelligence. Although GPUs have helped, they are not ideal for artificial intelligence. Specifically, GPUs are used to compute matrix operations in one direction with a pipelined architecture. However, artificial intelligence is a field that uses both forward propagation computations and back propagation calculations. To efficiently perform artificial intelligence calculations, a symmetric matrix processing element is introduced. The symmetric matrix processing element can perform forward propagation and backward propagation calculations just as easily.
    Type: Grant
    Filed: October 1, 2018
    Date of Patent: May 14, 2024
    Assignee: Expedera, Inc.
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
  • Publication number: 20240152761
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming field. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance for AI applications. Specifically, artificial intelligence generally requires large numbers of matrix operations such that specialized matrix processor circuits can greatly improve performance. To efficiently execute all these matrix operations, the matrix processor circuits must be quickly and efficiently supplied with a stream of data and instructions to process or else the matrix processor circuits end up idle. Thus, this document discloses packet architecture for efficiently creating and supplying neural network processors with work packets to process.
    Type: Application
    Filed: October 20, 2022
    Publication date: May 9, 2024
    Applicant: Expedera, Inc.
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
  • Patent number: 11151416
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: October 19, 2021
    Assignee: Expedera, Inc.
    Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
  • Publication number: 20210073585
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. One of the most important applications for artificial intelligence is object recognition and classification from digital images. Convolutional neural networks have proven to be a very effective tool for object recognition and classification from digital images. However, convolutional neural networks are extremely computationally intensive thus requiring high-performance processors, significant computation time, and significant energy consumption. To reduce the computation time and energy consumption a “cone of dependency” and “cone of influence” processing techniques are disclosed. These two techniques arrange the computations required in a manner that minimizes memory accesses such that computations may be performed in local cache memory. These techniques significantly reduce the time to perform the computations and the energy consumed by the hardware implementing a convolutional neural network.
    Type: Application
    Filed: September 11, 2019
    Publication date: March 11, 2021
    Applicant: Expedera, Inc.
    Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
  • Publication number: 20200371835
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is an extremely computationally intensive field such that performing artificial intelligence calculations can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence applications can be performed in parallel such that specialized linear algebra matrix processors can greatly increase computational performance. But even with linear algebra matrix processors; performance can be limited due to complex data dependencies. Without proper coordination, linear algebra matrix processors may end up idle or spending large amounts of time moving data around. Thus, this document discloses methods for efficiently scheduling linear algebra matrix processors.
    Type: Application
    Filed: May 7, 2020
    Publication date: November 26, 2020
    Applicant: Expedera, Inc.
    Inventors: Shang-Tse Chuang, Sharad Vasantrao Chole, Siyad Chih-Hua Ma
  • Publication number: 20200226201
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can great increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized Matrix Processor circuits can improve performance. But a neural network is more than a collection of matrix operations; it is a set of specifically coordinated matrix operations with complex data dependencies. Without proper coordination, Matrix Processor circuits may end up idle or spending large amounts of time loading in different weight matrix data.
    Type: Application
    Filed: April 5, 2019
    Publication date: July 16, 2020
    Applicant: Expedera, Inc.
    Inventors: Siyad Chih-Hua Ma, Shang-Tse Chuang, Sharad Vasantrao Chole
  • Publication number: 20200104669
    Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is very computationally intensive field. Fortunately, many of the required calculations can be performed in parallel such that specialized processors can great increase computation performance. In particular, Graphics Processor Units (GPUs) are often used in artificial intelligence. Although GPUs have helped, they are not ideal for artificial intelligence. Specifically, GPUs are used to compute matrix operations in one direction with a pipelined architecture. However, artificial intelligence is a field that uses both forward propagation computations and back propagation calculations. To efficiently perform artificial intelligence calculations, a symmetric matrix processing element is introduced. The symmetric matrix processing element can perform forward propagation and backward propagation calculations just as easily.
    Type: Application
    Filed: October 1, 2018
    Publication date: April 2, 2020
    Applicant: Expedera, Inc.
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Siyad Chih-Hua Ma
  • Patent number: 9965211
    Abstract: Provided are a method, a non-transitory computer-readable storage device and an apparatus for managing use of a shared memory buffer that is partitioned into multiple banks and that stores incoming data received at multiple inputs in accordance with a multi-slice architecture. A particular bank is allocated to a corresponding slice. Received respective data packets are associated with corresponding slices based on which respective inputs they are received. Determine, based on a state of the shared memory buffer, to transfer contents of all occupied cells of the particular bank. Writes to the bank are stopped, contents of occupied cells are transferred to cells of one or more other banks associated with the particular bank's slice, information is stored indicating where the contents have been transferred, and the particular bank is returned to a shared pool after transferring is completed.
    Type: Grant
    Filed: September 8, 2016
    Date of Patent: May 8, 2018
    Assignee: Cisco Technology, Inc.
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Georges Akis, Felice Bonardi, Rong Pan
  • Publication number: 20180067683
    Abstract: Provided are a method, a non-transitory computer-readable storage device and an apparatus for managing use of a shared memory buffer that is partitioned into multiple banks and that stores incoming data received at multiple inputs in accordance with a multi-slice architecture. A particular bank is allocated to a corresponding slice. Received respective data packets are associated with corresponding slices based on which respective inputs they are received. Determine, based on a state of the shared memory buffer, to transfer contents of all occupied cells of the particular bank. Writes to the bank are stopped, contents of occupied cells are transferred to cells of one or more other banks associated with the particular bank's slice, information is stored indicating where the contents have been transferred, and the particular bank is returned to a shared pool after transferring is completed.
    Type: Application
    Filed: September 8, 2016
    Publication date: March 8, 2018
    Inventors: Sharad Vasantrao Chole, Shang-Tse Chuang, Georges Akis, Felice Bonardi, Rong Pan