Patents by Inventor Ashish Sirasao

Ashish Sirasao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10990826
    Abstract: Detecting objects in video may include receiving object detections for a plurality of selected frames of a video from a still image detector, wherein the plurality of selected frames are non-adjacent frames of the video, propagating the object detections from the plurality of selected frames to sequential frames of the video adjacent to the plurality of selected frames based on a distance metric and vector flow data for the sequential frames, suppressing false positive object detections from the propagating, and outputting resulting object detections for the sequential frames of the video.
    Type: Grant
    Filed: March 20, 2019
    Date of Patent: April 27, 2021
    Assignee: Xilinx, Inc.
    Inventors: Mujib Haider, Venkata V. Dhanikonda, Ashish Sirasao
  • Patent number: 10990736
    Abstract: Implementing a circuit design can include detecting, using computer hardware, a re-convergent section of a circuit design, masking, using the computer hardware, a sequential circuit element of the re-convergent section located between a start and an end of the re-convergent section, and performing, using the computer hardware, an optimization operation on combinatorial logic of the re-convergent section to create optimized combinatorial logic. Using the computer hardware, the optimized combinatorial logic of the re-convergent section can be mapped. Further, the re-convergent section can be modified subsequent to the mapping to match timing of the re-convergent section prior to the masking.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: April 27, 2021
    Assignee: Xilinx, Inc.
    Inventors: Chaithanya Dudha, Satyaprakash Pareek, Krishna Garlapati, Ashish Sirasao
  • Patent number: 10984500
    Abstract: An example preprocessor circuit for formatting image data into a plurality of streams of image samples includes: a plurality of memory banks configured to store the image data; multiplexer circuitry coupled to the memory banks; a first plurality of registers coupled to the multiplexer circuitry; a second plurality of registers coupled to the first plurality of registers, outputs of the second plurality of registers configured to provide the plurality of streams of image samples; bank address and control circuitry coupled to control inputs of the plurality of memory banks, the multiplexer circuitry, and the first plurality of registers; output control circuitry coupled to control inputs of the second plurality of registers; and a control state machine coupled to the bank address and control circuitry and the output control circuitry.
    Type: Grant
    Filed: September 19, 2019
    Date of Patent: April 20, 2021
    Assignee: XILINX, INC.
    Inventors: Ashish Sirasao, Elliott Delaye, Aaron Ng, Ehsan Ghasemi
  • Patent number: 10943039
    Abstract: An example multiply accumulate (MACC) circuit includes: a multiply-accumulator having an accumulator output register; a quantizer, coupled to the multiply accumulator; and a control circuit coupled to the multiply-accumulator and the quantizer, the control circuit configured to provide control data to the quantizer, the control data indicative of a most-significant bit (MSB) to least significant bit (LSB) range for selecting bit indices from the accumulator output register.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: March 9, 2021
    Assignee: XILINX, INC.
    Inventors: Ashish Sirasao, Elliott Delaye, Sean Settle, Zhao Ma, Ehsan Ghasemi, Xiao Teng, Aaron Ng, Jindrich Zejda
  • Patent number: 10936311
    Abstract: Disclosed approaches for multiplying a sparse matrix by dense a vector or matrix include first memory banks for storage of column indices, second memory banks for storage of row indices, and third memory banks for storage of non-zero values of a sparse matrix. A pairing circuit distributes an input stream of vector elements across first first-in-first-out (FIFO) buffers according to the buffered column indices. Multiplication circuitry multiplies vector elements output from the first FIFO buffers by corresponding ones of the non-zero values from the third memory banks, and stores products in second FIFO buffers. Row-aligner circuitry organize the products output from the second FIFO buffers into third FIFO buffers according to row indices in the second memory banks. Accumulation circuitry accumulates respective totals from products output from the third FIFO buffers.
    Type: Grant
    Filed: July 9, 2019
    Date of Patent: March 2, 2021
    Assignee: Xilinx, Inc.
    Inventors: Ling Liu, Yifei Zhou, Xiao Teng, Ashish Sirasao, Chuanhua Song, Aaron Ng
  • Patent number: 10824434
    Abstract: Examples described herein relate to dynamically structured single instruction, multiple data (SIMD) instructions, and systems and circuits implementing such dynamically structured SIMD instructions. An example is a method for processing data. A first SIMD structure is determined by a processor. A characteristic of the first SIMD structure is altered by the processor to obtain a second SIMD structure. An indication of the second SIMD structure is communicated from the processor to a numerical engine. Data is packed by the numerical engine into an SIMD instruction according to the second SIMD structure. The SIMD instruction is transmitted from the numerical engine.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: November 3, 2020
    Assignee: XILINX, INC.
    Inventors: Sean Settle, Ehsan Ghasemi, Ashish Sirasao, Ralph D. Wittig
  • Patent number: 10747502
    Abstract: Circuits and method for multiplying floating point operands. An exponent adder circuit sums a first exponent and a second exponent and generates an output exponent. A mantissa multiplier circuit multiplies a first mantissa and a second mantissa and generates an output mantissa. A first conversion circuit converts the output exponent and output mantissa into a fixed point number. An accumulator circuit sums contents of an accumulation register and the fixed point number into an accumulated value and stores the accumulated value in the accumulation register.
    Type: Grant
    Filed: September 19, 2018
    Date of Patent: August 18, 2020
    Assignee: Xilinx, Inc.
    Inventors: Satyaprakash Pareek, Anup Hosangadi, Bing Tian, Ashish Sirasao, Yao Fu, Oscar Fernando C. Fernandez, Michael Wu, Christopher H. Dick
  • Patent number: 10726175
    Abstract: A memory optimization method includes identifying, within a circuit design, a memory having an arithmetic operator at an output side and/or an input side of the memory. The memory may include a read-only memory (ROM). In some examples, an input of the arithmetic operator includes a constant value. In some embodiments, the memory optimization method further includes absorbing a function of the arithmetic operator into the memory. By way of example, the absorbing the function includes modifying contents of the memory based on the function of the arithmetic operator to provide an updated memory and removing the arithmetic operator from the circuit design.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: July 28, 2020
    Assignee: Xilinx, Inc.
    Inventors: Chaithanya Dudha, Satyaprakash Pareek, Bing Tian, Ashish Sirasao
  • Patent number: 10678983
    Abstract: Local retiming for a circuit design includes determining, using computer hardware, a load of a synchronous circuit element within the circuit design tagged for forward retiming, traversing, using the computer hardware, each input of the load backward through the circuit design until a sequential circuit element or a primary input is reached, and adding, using the computer hardware, each synchronous circuit element encountered in the traversing to a forward retiming list. In response to determining that forward retiming criteria is met for the forward retiming list, the computer hardware modifies the circuit design by creating a new synchronous circuit element at an output of the load.
    Type: Grant
    Filed: May 23, 2018
    Date of Patent: June 9, 2020
    Assignee: Xilinx, Inc.
    Inventors: Shangzhi Sun, Chaithanya Dudha, Bing Tian, Ashish Sirasao
  • Patent number: 10678509
    Abstract: An example multiply accumulate (MACC) circuit includes a multiply-accumulator having an accumulator output register, a scaler, coupled to the multiply accumulator, and a control circuit coupled to the multiply-accumulator and the scaler. The control circuit is configured to provide control data to the scaler, the control data indicative of: a most-significant bit (MSB) to least significant bit (LSB) range for selecting bit indices from the accumulator output register for implementing a first right shift; a multiplier; and a second right shift.
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: June 9, 2020
    Assignee: XILINX, INC.
    Inventors: Sean Settle, Elliott Delaye, Aaron Ng, Ehsan Ghasemi, Ashish Sirasao, Xiao Teng, Jindrich Zejda
  • Publication number: 20200089472
    Abstract: Circuits and method for multiplying floating point operands. An exponent adder circuit sums a first exponent and a second exponent and generates an output exponent. A mantissa multiplier circuit multiplies a first mantissa and a second mantissa and generates an output mantissa. A first conversion circuit converts the output exponent and output mantissa into a fixed point number. An accumulator circuit sums contents of an accumulation register and the fixed point number into an accumulated value and stores the accumulated value in the accumulation register.
    Type: Application
    Filed: September 19, 2018
    Publication date: March 19, 2020
    Applicant: Xilinx, Inc.
    Inventors: Satyaprakash Pareek, Anup Hosangadi, Bing Tian, Ashish Sirasao, Yao Fu, Oscar Fernando C. Fernandez, Michael Wu, Christopher H. Dick
  • Patent number: 10572409
    Abstract: A memory arrangement can store a matrix of matrix data elements specified as index-value pairs that indicate row and column indices and associated values. First split-and-merge circuitry is coupled between the memory arrangement and a first set of FIFO buffers for reading the matrix data elements from the memory arrangement and putting the matrix data elements in the first set of FIFO buffers based on column indices. A pairing circuit is configured to read vector data elements, pair the vector data elements with the matrix data elements, and put the paired matrix and vector data elements in a second set of FIFO buffers based on column indices. Second split-and-merge circuitry is configured to read paired matrix and vector data elements from the second set of FIFO buffers and put the paired matrix and vector data elements in a third set of FIFO buffers based on row indices.
    Type: Grant
    Filed: May 10, 2018
    Date of Patent: February 25, 2020
    Assignee: XILINX, INC.
    Inventors: Jindrich Zejda, Ling Liu, Yifei Zhou, Ashish Sirasao
  • Patent number: 10572225
    Abstract: A and a request generator circuit is configured to read data elements of a three-dimensional (3-D) input feature map (IFM) from a memory and store a subset of the data elements in one of a plurality of N line buffers. Each line buffer is configured for storage of M data elements. A pixel iterator circuit is coupled to the line buffers and is configured to generate a sequence of addresses for reading the stored data elements from the line buffers based on a sequence of IFM height values and a sequence of IFM width values.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: February 25, 2020
    Assignee: XILINX, INC.
    Inventors: Ehsan Ghasemi, Elliott Delaye, Ashish Sirasao, Sean Settle
  • Patent number: 10515135
    Abstract: Methods and apparatus are described for performing data-intensive compute algorithms, such as fast massively parallel general matrix multiplication (GEMM), using a particular data format for both storing data to and reading data from memory. This data format may be utilized for arbitrarily-sized input matrices for GEMM implemented on a finite-size GEMM accelerator in the form of a rectangular compute array of digital signal processing (DSP) elements or similar compute cores. This data format solves the issue of double data rate (DDR) dynamic random access memory (DRAM) bandwidth by allowing both linear DDR addressing and single cycle loading of data into the compute array, avoiding input/output (I/O) and/or DDR bottlenecks.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: December 24, 2019
    Assignee: XILINX, INC.
    Inventors: Jindrich Zejda, Elliott Delaye, Aaron Ng, Ashish Sirasao, Yongjun Wu
  • Patent number: 10460416
    Abstract: An example preprocessor circuit for formatting image data into a plurality of streams of image samples includes: a plurality of memory banks configured to store the image data; multiplexer circuitry coupled to the memory banks; a first plurality of registers coupled to the multiplexer circuitry; a second plurality of registers coupled to the first plurality of registers, outputs of the second plurality of registers configured to provide the plurality of streams of image samples; and control circuitry configured to generate addresses for the plurality of memory banks, control the multiplexer circuitry to select among outputs of the plurality of memory banks, control the first plurality of registers to store outputs of the second plurality of multiplexers, and control the second plurality of registers to store outputs of the first plurality of registers.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: October 29, 2019
    Assignee: XILINX, INC.
    Inventors: Ashish Sirasao, Elliott Delaye, Aaron Ng, Ehsan Ghasemi
  • Patent number: 10430539
    Abstract: Methods and apparatus relating generally to synthesis are described. In such a method, a directed graph for a circuit design is generated. A cascaded chain is identified in the directed graph with a timing violation. A pipeline register stage of the cascaded chain is moved (or added) to remove the timing violation. The circuit design is transformed to provide a netlist including the pipeline register stage.
    Type: Grant
    Filed: December 16, 2016
    Date of Patent: October 1, 2019
    Assignee: XILINX, INC.
    Inventors: Chaithanya Dudha, Zhao Ma, Krishna Garlapati, Ashish Sirasao
  • Patent number: 10411709
    Abstract: Disclosed circuits and methods include N line buffers. Each line buffer is configured for storage of M data elements of a three-dimensional (3-D) input feature map (IFM). A request generator circuit is coupled to the N line buffers and to a memory configured for storage of the 3-D IFM. The request generator circuit is divides the 3-D IFM into a plurality of IFM sub-volumes based on values of N, M, and dimensions of the 3-D IFM. The request generator circuit reads from the memory, data elements at addresses of an unprocessed one of the IFM sub-volumes and stores the data elements of the unprocessed one of the IFM sub-volumes in the N line buffers. In response to a completion signal, the request generator circuit repeats the reading of an unprocessed one of the IFM sub-volumes and storing the data elements in the N line buffers.
    Type: Grant
    Filed: July 25, 2018
    Date of Patent: September 10, 2019
    Assignee: XILINX, INC.
    Inventors: Ehsan Ghasemi, Elliott Delaye, Ashish Sirasao
  • Patent number: 10354733
    Abstract: Methods and apparatus are described for partitioning and reordering block-based matrix multiplications for high-speed data streaming in general matrix multiplication (GEMM), which may be implemented by a programmable integrated circuit (IC). By preloading and hierarchically caching the blocks, examples of the present disclosure reduce the double data rate (DDR) memory intake bandwidth for software-defined GEMM accelerators.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: July 16, 2019
    Assignee: XILINX, INC.
    Inventors: Jindrich Zejda, Elliott Delaye, Ashish Sirasao, Yongjun Wu, Aaron Ng
  • Patent number: 10331836
    Abstract: Implementing a circuit design can include determining a chain of a plurality of loop elements of a circuit design, wherein each loop element includes a bit select node configured to perform a bit assignment operation and a corresponding address calculation node, wherein the address calculation nodes use a common variable to calculate a starting bit location provided to the corresponding bit select node. In response to the determining, the chain is replicated resulting in one chain for each value of the common variable and transforming each chain into a plurality of wires. A multiplexer is inserted into the circuit design. The plurality of wires for each chain is coupled to inputs of the multiplexer and the common variable is provided to the multiplexer as a select signal.
    Type: Grant
    Filed: October 11, 2017
    Date of Patent: June 25, 2019
    Assignee: XILINX, INC.
    Inventors: Anup Hosangadi, Sumanta Datta, Aman Gayasen, Ashish Sirasao
  • Patent number: 10303833
    Abstract: Parallelizing operations for implementing a circuit design can include dividing, using a processor, the circuit design into a plurality of partitions, wherein each partition is stored as a separate file, for each partition, generating, using the processor, a timing arc file specifying boundary delays for the partition, and generating, using the processor, a partition design file specifying interfaces of the partitions. Using the processor, a plurality of processes executing in parallel can be initiated. Each process is adapted to operate on a selected partition using the partition design file and the timing arc files for the other partitions to generate an updated file for the selected partition.
    Type: Grant
    Filed: February 9, 2017
    Date of Patent: May 28, 2019
    Assignee: XILINX, INC.
    Inventors: Aman Gayasen, Surya Pratik Saha, Elliott Delaye, Shangzhi Sun, Ashish Sirasao