Patents Assigned to DEEP VISION, INC.
-
Patent number: 11941440
Abstract: A method includes: dequeuing a signal primitive from a signaling command queue in the set of command queues, the signal primitive pointing to a waiting command queue; in response to the signal primitive pointing to the waiting command queue, incrementing a number of pending signal primitives in the signal-wait counter matrix; dequeuing a wait primitive from the waiting command queue, the wait primitive pointing to the signaling command queue; in response to the wait primitive pointing to the signaling command queue, accessing the register to read the number of pending signal primitives; in response to the number of pending signal primitives indicating at least one pending signal primitive: decrementing the number of pending signal primitives; and dequeuing an instruction from the waiting command queue; and dispatching a control signal representing the instruction to a resource.
Type: Grant
Filed: October 25, 2022
Date of Patent: March 26, 2024
Assignee: Deep Vision Inc.
Inventors: Mohamed Shahim, Sreenivas Aerra Reddy, Raju Datla, Lava Kumar Bokam, Suresh Kumar Vennam, Sameek Banerjee
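Illustrative sketch (not part of the patent record): the signal/wait handshake in this abstract can be modeled in software as a matrix of counters indexed by (signaling queue, waiting queue). The class name, the tuple-based command encoding, and the round-robin polling loop are assumptions made for illustration. As a simplification, the sketch leaves a blocked wait primitive at the head of its queue instead of dequeuing it first.

```python
from collections import deque


class SignalWaitScheduler:
    """Toy model of signal/wait synchronization between command queues."""

    def __init__(self, num_queues):
        self.queues = [deque() for _ in range(num_queues)]
        # counters[s][w]: pending signals sent by queue s to queue w.
        self.counters = [[0] * num_queues for _ in range(num_queues)]
        self.dispatched = []  # (queue index, instruction) in dispatch order

    def step(self, q):
        """Process the head of queue q; return True if progress was made."""
        queue = self.queues[q]
        if not queue:
            return False
        kind, arg = queue[0]
        if kind == "signal":
            # arg is the waiting queue this signal points to.
            queue.popleft()
            self.counters[q][arg] += 1
            return True
        if kind == "wait":
            # arg is the signaling queue this wait points to.
            if self.counters[arg][q] == 0:
                return False  # no pending signal: queue q stays blocked
            queue.popleft()
            self.counters[arg][q] -= 1
            return True
        # Any other command is an instruction dispatched to a resource.
        queue.popleft()
        self.dispatched.append((q, arg))
        return True

    def run(self):
        """Poll the queues round-robin until none can make progress."""
        progress = True
        while progress:
            progress = False
            for q in range(len(self.queues)):
                progress = self.step(q) or progress


sched = SignalWaitScheduler(2)
sched.queues[0].extend([("instr", "load"), ("signal", 1)])
sched.queues[1].extend([("wait", 0), ("instr", "compute")])
sched.run()
```

In this usage, queue 1 cannot dispatch "compute" until queue 0 has dispatched "load" and issued its signal.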
-
Patent number: 11763158
Abstract: A method includes, for each floating-point layer in a set of floating-point layers: calculating a set of input activations and a set of output activations of the floating-point layer; converting the floating-point layer to a low-bit-width layer; calculating a set of low-bit-width output activations based on the set of input activations; and calculating a per-layer deviation statistic of the low-bit-width layer. The method also includes ordering the set of low-bit-width layers based on the per-layer deviation statistic of each low-bit-width layer.
Type: Grant
Filed: December 4, 2020
Date of Patent: September 19, 2023
Assignee: Deep Vision Inc.
Inventors: Wajahat Qadeer, Rehan Hameed, Satyanarayana Raju Uppalapati, Abhilash Bharath Ghanore, Kasanagottu Sai Ram
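Illustrative sketch (not part of the patent record): the per-layer procedure in this abstract can be approximated as below. The abstract does not name the deviation statistic, the quantization scheme, or the sort direction. Mean squared error, symmetric uniform weight quantization, bias-free dense layers, and most-deviating-first ordering are all assumptions chosen for illustration, as are the function names.

```python
def quantize(values, bits=8):
    """Symmetric uniform quantization followed by dequantization."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def dense(weights, inputs):
    """Output activations of a bias-free dense layer (one row per output)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def deviation(reference, approx):
    """Mean squared error between float and low-bit-width outputs."""
    return sum((r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)


def rank_layers(layers, inputs, bits=8):
    """Return (name, deviation) pairs, most-deviating layer first."""
    stats = []
    x = inputs
    for name, weights in layers:
        y_float = dense(weights, x)                # float output activations
        q_weights = [quantize(row, bits) for row in weights]
        y_low = dense(q_weights, x)                # same float input activations
        stats.append((name, deviation(y_float, y_low)))
        x = y_float                                # feed float activations forward
    return sorted(stats, key=lambda s: s[1], reverse=True)


layers = [
    ("a", [[1.0, -1.0], [1.0, 1.0]]),
    ("b", [[0.9, 0.1], [0.35, -0.8]]),
]
ranking = rank_layers(layers, [1.0, 2.0], bits=4)
```

Here layer "a" has weights that survive 4-bit quantization essentially unchanged, so layer "b" ranks first.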
-
Patent number: 11734006
Abstract: Disclosed herein is a processor for deep learning. In one embodiment, the processor comprises: a load and store unit configured to load and store image pixel data and stencil data; a register unit, implementing a banked register file, configured to: load and store a subset of the image pixel data from the load and store unit, and concurrently provide access to image pixel values stored in a register file entry of the banked register file, wherein the subset of the image pixel data comprises the image pixel values stored in the register file entry; and a plurality of arithmetic logic units configured to concurrently perform one or more operations on the image pixel values stored in the register file entry and corresponding stencil data of the stencil data.
Type: Grant
Filed: July 19, 2022
Date of Patent: August 22, 2023
Assignee: Deep Vision, Inc.
Inventors: Wajahat Qadeer, Rehan Hameed
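Illustrative sketch (not part of the patent record): a software analogy for the kind of stencil computation this processor targets. The function name and the mapping of output columns to ALU lanes are assumptions, and the sequential loop only stands in for work the claimed hardware would perform concurrently.

```python
def apply_stencil(image, stencil):
    """Valid-mode 2D stencil: each output is a weighted sum over a window."""
    kh, kw = len(stencil), len(stencil[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        # Stand-in for one register file entry: the pixel rows that this
        # output row needs.
        window_rows = image[r:r + kh]
        # Each output column plays the role of one ALU lane reading the
        # same entry; hardware would evaluate the lanes concurrently.
        out.append([
            sum(window_rows[i][c + j] * stencil[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ])
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = apply_stencil(image, [[1, 0], [0, 1]])  # [[6, 8], [12, 14]]
```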
-
Patent number: 11714651
Abstract: A tensor traversal engine in a processor system comprising a source memory component and a destination memory component, the tensor traversal engine comprising: a control signal register storing a control signal for a strided data transfer operation from the source memory component to the destination memory component, the control signal comprising an initial source address, an initial destination address, a first source stride length in a first dimension, and a first source stride count in the first dimension; a source address register communicatively coupled to the control signal register; a destination address register communicatively coupled to the control signal register; a first source stride counter communicatively coupled to the control signal register; and control logic communicatively coupled to the control signal register, the source address register, and the first source stride counter.
Type: Grant
Filed: May 26, 2021
Date of Patent: August 1, 2023
Assignee: Deep Vision Inc.
Inventors: Mohamed Shahim, Raju Datla, Abhilash Bharath Ghanore, Lava Kumar Bokam, Suresh Kumar Vennam, Rajashekar Reddy Ereddy
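Illustrative sketch (not part of the patent record): a one-dimensional strided transfer driven by the four control-signal fields named in this abstract. The field and function names are hypothetical, memories are modeled as Python lists, and the contiguous destination write is an assumption, since the abstract specifies only source strides.

```python
from dataclasses import dataclass


@dataclass
class ControlSignal:
    src_addr: int    # initial source address
    dst_addr: int    # initial destination address
    src_stride: int  # first source stride length
    src_count: int   # first source stride count


def strided_transfer(src, dst, ctrl):
    """Copy ctrl.src_count elements, stepping the source by ctrl.src_stride."""
    src_reg = ctrl.src_addr            # models the source address register
    dst_reg = ctrl.dst_addr            # models the destination address register
    for _ in range(ctrl.src_count):    # models the source stride counter
        dst[dst_reg] = src[src_reg]
        src_reg += ctrl.src_stride
        dst_reg += 1                   # assumption: contiguous destination
    return dst


source = list(range(12))
dest = strided_transfer(source, [0] * 4, ControlSignal(1, 0, 3, 4))
# dest == [1, 4, 7, 10]
```

Extending the sketch to further dimensions would add one stride length and stride count pair per dimension, with nested counters.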
-
Patent number: 11550586
Abstract: A tensor traversal engine in a processor system comprising a source memory component and a destination memory component, the tensor traversal engine comprising: a control signal register storing a control signal for a strided data transfer operation from the source memory component to the destination memory component, the control signal comprising an initial source address, an initial destination address, a first source stride length in a first dimension, and a first source stride count in the first dimension; a source address register communicatively coupled to the control signal register; a destination address register communicatively coupled to the control signal register; a first source stride counter communicatively coupled to the control signal register; and control logic communicatively coupled to the control signal register, the source address register, and the first source stride counter.
Type: Grant
Filed: May 26, 2021
Date of Patent: January 10, 2023
Assignee: Deep Vision Inc.
Inventors: Mohamed Shahim, Raju Datla, Rehan Hameed, Shilpa Kallem
-
Patent number: 11526767
Abstract: A broadcast subsystem of a processor system includes: a set of broadcast buses, each broadcast bus in the set of broadcast buses electrically coupled to a subset of primary memory units in the set of primary memory units; a primary memory unit queue: configured to store a first set of data transfer requests associated with the set of primary memory units; and electrically coupled to the data buffer; and a broadcast scheduler: electrically coupled to the primary memory unit queue; electrically coupled to the set of broadcast buses; and configured to transfer source data from the data buffer to a target subset of primary memory units in the set of primary memory units via the set of broadcast buses based on the set of data transfer requests stored in the primary memory unit queue.
Type: Grant
Filed: August 30, 2021
Date of Patent: December 13, 2022
Assignee: Deep Vision Inc.
Inventors: Raju Datla, Mohamed Shahim, Suresh Kumar Vennam, Sreenivas Aerra Reddy
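Illustrative sketch (not part of the patent record): a toy model of a scheduler that drains a request queue and delivers buffered data to a target subset of memory units over the buses that reach them. The request layout (buffer index, destination address, target unit set), the dictionary-based memories, and the function name are hypothetical.

```python
from collections import deque


def broadcast(requests, data_buffer, bus_to_units, num_units):
    """Deliver each request's data to its target units, one bus at a time."""
    memories = [dict() for _ in range(num_units)]  # address -> data per unit
    queue = deque(requests)                        # primary memory unit queue
    while queue:
        buf_index, dst_addr, targets = queue.popleft()
        data = data_buffer[buf_index]
        for bus, units in bus_to_units.items():
            # A bus reaches only the memory units it is coupled to.
            for unit in targets & units:
                memories[unit][dst_addr] = data
    return memories


buses = {0: {0, 1}, 1: {2, 3}}
requests = [(0, 16, {0, 2, 3}), (1, 32, {1})]
mems = broadcast(requests, ["A", "B"], buses, num_units=4)
# mems == [{16: "A"}, {32: "B"}, {16: "A"}, {16: "A"}]
```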
-
Patent number: 11513847
Abstract: A method includes: dequeuing a signal primitive from a signaling command queue in the set of command queues, the signal primitive pointing to a waiting command queue; in response to the signal primitive pointing to the waiting command queue, incrementing a number of pending signal primitives in the signal-wait counter matrix; dequeuing a wait primitive from the waiting command queue, the wait primitive pointing to the signaling command queue; in response to the wait primitive pointing to the signaling command queue, accessing the register to read the number of pending signal primitives; in response to the number of pending signal primitives indicating at least one pending signal primitive: decrementing the number of pending signal primitives; and dequeuing an instruction from the waiting command queue; and dispatching a control signal representing the instruction to a resource.
Type: Grant
Filed: March 24, 2021
Date of Patent: November 29, 2022
Assignee: Deep Vision Inc.
Inventors: Mohamed Shahim, Sreenivas Aerra Reddy, Raju Datla, Lava Kumar Bokam, Suresh Kumar Vennam, Sameek Banerjee
-
Patent number: 11436014
Abstract: Disclosed herein is a processor for deep learning. In one embodiment, the processor comprises: a load and store unit configured to load and store image pixel data and stencil data; a register unit, implementing a banked register file, configured to: load and store a subset of the image pixel data from the load and store unit, and concurrently provide access to image pixel values stored in a register file entry of the banked register file, wherein the subset of the image pixel data comprises the image pixel values stored in the register file entry; and a plurality of arithmetic logic units configured to concurrently perform one or more operations on the image pixel values stored in the register file entry and corresponding stencil data of the stencil data.
Type: Grant
Filed: June 23, 2021
Date of Patent: September 6, 2022
Assignee: Deep Vision, Inc.
Inventors: Wajahat Qadeer, Rehan Hameed
-
Patent number: 11080056
Abstract: Disclosed herein is a processor for deep learning. In one embodiment, the processor comprises: a load and store unit configured to load and store image pixel data and stencil data; a register unit, implementing a banked register file, configured to: load and store a subset of the image pixel data from the load and store unit, and concurrently provide access to image pixel values stored in a register file entry of the banked register file, wherein the subset of the image pixel data comprises the image pixel values stored in the register file entry; and a plurality of arithmetic logic units configured to concurrently perform one or more operations on the image pixel values stored in the register file entry and corresponding stencil data of the stencil data.
Type: Grant
Filed: October 31, 2019
Date of Patent: August 3, 2021
Assignee: Deep Vision, Inc.
Inventors: Wajahat Qadeer, Rehan Hameed
-
Patent number: 10474464
Abstract: Disclosed herein is a processor for deep learning. In one embodiment, the processor comprises: a load and store unit configured to load and store image pixel data and stencil data; a register unit, implementing a banked register file, configured to: load and store a subset of the image pixel data from the load and store unit, and concurrently provide access to image pixel values stored in a register file entry of the banked register file, wherein the subset of the image pixel data comprises the image pixel values stored in the register file entry; and a plurality of arithmetic logic units configured to concurrently perform one or more operations on the image pixel values stored in the register file entry and corresponding stencil data of the stencil data.
Type: Grant
Filed: July 3, 2018
Date of Patent: November 12, 2019
Assignee: DEEP VISION, INC.
Inventors: Wajahat Qadeer, Rehan Hameed