Patents by Inventor Hisham Cholakkal

Hisham Cholakkal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240193404
    Abstract: An edge computing system, computer readable storage medium and method for object detection, including processing circuitry. The processing circuitry is configured with a hybrid CNN and vision transformer backbone network in an object detection deep learning network. The backbone network receives an image, and includes a first convolutional encoder to extract local features from feature maps of the image, a second stage having consecutive second convolutional encoders, a positional encoding layer, split depth-wise transpose attention (SDTA) encoders, consecutive convolutional encoders, a third stage and a fourth stage SDTA encoder. Each of the SDTA encoders performs multi-headed self-attention by applying a dot product operation across channel dimensions in order to compute cross-covariance across channels to generate attention feature maps.
    Type: Application
    Filed: December 9, 2022
    Publication date: June 13, 2024
    Applicant: Mohamed bin Zayed University of Artificial Intelligence
    Inventors: Muhammad MAAZ, Abdelrahman SHAKER, Hisham CHOLAKKAL, Salman KHAN, Syed Waqas ZAMIR, Rao Muhammad ANWER, Fahad Shahbaz KHAN
  • Publication number: 20240161334
    Abstract: A system, method, computer readable storage medium for a computer vision system includes at least one video camera, and video processor circuitry. The method includes inputting a stream of video data and generating a sequence of image frames, segmenting and tracking, by the video analysis apparatus, object instances in the stream of video data, including receiving the sequence of image frames, analyzing the sequence of image frames using a video instance segmentation transformer to obtain a video instance mask sequence from the sequence of image frames, the transformer having a backbone network, a transformer encoder-decoder, and an instance matching and segmentation block. The encoder contains a multi-scale spatio-temporal split attention module to capture spatio-temporal feature relationships at multiple scales across multiple frames. The decoder contains a temporal attention block for enhancing a temporal consistency of transformer queries. The method includes displaying the video instance mask sequence.
    Type: Application
    Filed: November 9, 2022
    Publication date: May 16, 2024
    Applicant: Mohamed bin Zayed University of Artificial Intelligence
    Inventors: Omkar THAWAKAR, Sanath NARAYAN, Hisham CHOLAKKAL, Rao Muhammad ANWER, Muhammad HARIS, Salman KHAN, Fahad KHAN
  • Publication number: 20240161360
    Abstract: An apparatus, computer readable storage medium and method of generating a diverse set of images from few-shot images, includes a parameter input receiving values for control parameters to control an extent to which each reference image impacts a newly generated image. The apparatus involves an image generation deep learning network for generating an image for each of the values for the control parameters. The deep learning network has an encoder, a transformer-based fusion block, and a decoder. The transformer-based fusion block includes a mapping network that computes meta-weights from features extracted from the reference images and the control parameters, and a cross-attention block to generate modulation weights based on the meta-weights. An output displays high-quality and diverse images generated based on the values for the control parameter.
    Type: Application
    Filed: November 9, 2022
    Publication date: May 16, 2024
    Applicant: Mohamed bin Zayed University of Artificial Intelligence
    Inventors: Amandeep KUMAR, Ankan Kumar BHUNIA, Hisham CHOLAKKAL, Sanath NARAYAN, Rao Muhammad ANWER, Fahad KHAN
  • Publication number: 20240153308
    Abstract: A video system and method for person search includes video cameras for capturing video images, a display device, and a computer system. The computer system includes a deep learning network to determine person images, from among the video images, matching a target query person. The deep learning network has a person detection branch, a person re-identification branch, and an attention-aware relation mixer connected to the person detection branch and to the person re-identification branch. The attention-aware relation mixer includes a relation mixer having a spatial and channel mixer that performs spatial attention followed by spatial mixing (tokenized multi-layered perceptron) and channel attention followed by channel mixing (channel multi-layered perceptron), and a joint spatio-channel attention layer that utilizes 3D attention weights to modulate 3D spatio-channel region of interest features and aggregate the features with output of the relation mixer.
    Type: Application
    Filed: November 9, 2022
    Publication date: May 9, 2024
    Applicant: Mohamed bin Zayed University of Artificial Intelligence
    Inventors: Mustansar FIAZ, Hisham CHOLAKKAL, Sanath NARAYAN, Rao Muhammad ANWER, Fahad KHAN
  • Publication number: 20230316603
    Abstract: A system and computer readable storage medium for automated handwriting generation, including a text input device for inputting a text query having at least one textual word string, an image input device for inputting a handwriting sample with characters in a writing style of a user, and a computer implemented deep learning transformer model including an encoder network and a decoder network, each of which is a hybrid of convolution and multi-head self-attention networks. The encoder produces a sequence of style feature embeddings from the input handwriting sample. The decoder takes the sequence of style feature embeddings in order to convert the at least one textual word string into a generated handwritten image having substantially the same writing style as the handwriting sample. An output device outputs the generated handwriting image.
    Type: Application
    Filed: July 19, 2022
    Publication date: October 5, 2023
    Applicant: Mohamed bin Zayed University of Artificial Intelligence
    Inventors: Ankan Kumar BHUNIA, Salman KHAN, Hisham CHOLAKKAL, Rao Muhammad ANWER, Fahad KHAN
  • Patent number: 11756244
    Abstract: A system and computer readable storage medium for automated handwriting generation, including a text input device for inputting a text query having at least one textual word string, an image input device for inputting a handwriting sample with characters in a writing style of a user, and a computer implemented deep learning transformer model including an encoder network and a decoder network, each of which is a hybrid of convolution and multi-head self-attention networks. The encoder produces a sequence of style feature embeddings from the input handwriting sample. The decoder takes the sequence of style feature embeddings in order to convert the at least one textual word string into a generated handwritten image having substantially the same writing style as the handwriting sample. An output device outputs the generated handwriting image.
    Type: Grant
    Filed: July 19, 2022
    Date of Patent: September 12, 2023
    Assignee: Mohamed bin Zayed University of Artificial Intelligence
    Inventors: Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Khan
  • Patent number: 11244188
    Abstract: This disclosure relates to improved techniques for performing computer vision functions, including common object detection and instance segmentation. The techniques described herein utilize neural network architectures to perform these functions in various types of images, such as natural images, UAV images, satellite images, and other images. The neural network architecture can include a dense location regression network that performs object localization and segmentation functions, at least in part, by generating offset information for multiple sub-regions of candidate object proposals, and utilizing this dense offset information to derive final predictions for locations of target objects. The neural network architecture also can include a discriminative region-of-interest (RoI) pooling network that performs classification of the localized objects, at least in part, by sampling various sub-regions of candidate proposals and performing adaptive weighting to obtain discriminative features.
    Type: Grant
    Filed: April 10, 2020
    Date of Patent: February 8, 2022
    Assignee: Inception Institute of Artificial Intelligence, Ltd.
    Inventors: Hisham Cholakkal, Jiale Cao, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
  • Publication number: 20210342579
    Abstract: A method for identifying a hand pose in a vehicle involves identifying a hand image for a hand in the vehicle by extraction from a vehicle image of the vehicle. A plurality of contextual images of the hand image is obtained based on the single point. Each of the plurality of contextual images is processed using one or more layers of a neural network to obtain a plurality of contextual features associated with the hand image. A hand pose associated with the hand is identified based on the plurality of contextual features using a classifier model.
    Type: Application
    Filed: August 27, 2019
    Publication date: November 4, 2021
    Inventors: Hisham CHOLAKKAL, Sanath NARAYAN, Arjun JAIN, Shuaib AHMED, Amit BHATKAL, Mallikarjun BYRASANDRA RAMALINGA REDDY, Apurbaa MALLIK
  • Publication number: 20210319242
    Abstract: This disclosure relates to improved techniques for performing computer vision functions, including common object detection and instance segmentation. The techniques described herein utilize neural network architectures to perform these functions in various types of images, such as natural images, UAV images, satellite images, and other images. The neural network architecture can include a dense location regression network that performs object localization and segmentation functions, at least in part, by generating offset information for multiple sub-regions of candidate object proposals, and utilizing this dense offset information to derive final predictions for locations of target objects. The neural network architecture also can include a discriminative region-of-interest (RoI) pooling network that performs classification of the localized objects, at least in part, by sampling various sub-regions of candidate proposals and performing adaptive weighting to obtain discriminative features.
    Type: Application
    Filed: April 10, 2020
    Publication date: October 14, 2021
    Inventors: Hisham Cholakkal, Jiale Cao, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
  • Patent number: 10453197
    Abstract: This disclosure relates to improved techniques for performing computer vision functions including common object counting and instance segmentation. The techniques described herein utilize a neural network architecture to perform these functions. The neural network architecture can be trained using image-level supervision techniques that utilize a loss function to jointly train an image classification branch and a density branch of the neural network architecture. The neural network architecture constructs per-category density maps that can be used to generate analysis information comprising global object counts and locations of objects in images.
    Type: Grant
    Filed: February 18, 2019
    Date of Patent: October 22, 2019
    Assignee: Inception Institute of Artificial Intelligence, Ltd.
    Inventors: Hisham Cholakkal, Guolei Sun, Fahad Shahbaz Khan, Ling Shao
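
Illustrative note on the last entry (Patent number 10453197): the abstract says per-category density maps can yield global object counts and object locations. The sketch below shows one common way such maps are read, and it is not taken from the patent: the count is assumed to be the rounded sum of a map, and locations are assumed to be local maxima above a threshold. The function names, the threshold value, and the toy maps are all hypothetical.

```python
from typing import Dict, List, Tuple

DensityMap = List[List[float]]  # one 2D grid of density values per category


def global_counts(density_maps: Dict[str, DensityMap]) -> Dict[str, int]:
    """Estimate a per-category count as the rounded sum of each density map.

    Assumption: the map integrates to the object count, as is typical for
    density-map counting; the patent abstract does not spell this out.
    """
    return {
        category: round(sum(sum(row) for row in grid))
        for category, grid in density_maps.items()
    }


def peak_locations(grid: DensityMap, threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, col) cells above `threshold` that are >= their 4-neighbours."""
    peaks = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value <= threshold:
                continue
            neighbours = []
            if r > 0:
                neighbours.append(grid[r - 1][c])
            if r + 1 < len(grid):
                neighbours.append(grid[r + 1][c])
            if c > 0:
                neighbours.append(row[c - 1])
            if c + 1 < len(row):
                neighbours.append(row[c + 1])
            if all(value >= n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Toy, hand-made maps (not real model output).
maps = {
    "person": [[0.0, 0.1, 0.0],
               [0.1, 0.8, 0.0],
               [0.0, 0.0, 1.0]],
    "dog": [[0.0, 0.0],
            [0.0, 0.0]],
}

counts = global_counts(maps)
person_peaks = peak_locations(maps["person"])
print(counts)
print(person_peaks)
```

On these toy maps, the "person" map sums to about 2 and has two peaks, while the empty "dog" map gives a count of 0.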