Patents by Inventor Muhammad Zeeshan Zia

Muhammad Zeeshan Zia has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11947343
    Abstract: A system and method for optimizing an industrial assembly process in an industrial environment is disclosed. The system operates on an artificial intelligence (AI)-based conversational/GUI platform, where it receives user commands related to industrial assembly process improvement queries. By analyzing the received user commands, the system identifies the type of industrial assembly process mentioned by extracting relevant keywords or other attributes. Using a trained AI-based classification table, the system determines performance attributes associated with the identified type of process. The system leverages various sources such as domain knowledge, organization-specific knowledge bases, data from tools and internet-based services, and statistical measurements from the industrial environment. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: September 5, 2023
    Date of Patent: April 2, 2024
    Assignee: Retrocausal, Inc.
    Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Andrey Konin
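
A minimal, illustrative sketch of the kind of keyword-driven routing the abstract above describes: a user command is mapped to a process type, and that type's performance attributes are looked up in a classification table. All keywords, process types, and attribute names are invented for illustration and do not come from the patent.

```python
# Toy "classification table" mapping process types to performance attributes.
CLASSIFICATION_TABLE = {
    "manual_assembly": ["cycle_time", "first_pass_yield", "ergonomic_risk"],
    "automated_assembly": ["throughput", "downtime", "defect_rate"],
}

# Keywords that suggest each process type (illustrative placeholders).
KEYWORDS = {
    "manual_assembly": {"operator", "workstation", "hand tool"},
    "automated_assembly": {"robot", "conveyor", "plc"},
}

def identify_process_type(command: str) -> str:
    """Pick the process type whose keywords best match the user command."""
    text = command.lower()
    scores = {ptype: sum(kw in text for kw in kws) for ptype, kws in KEYWORDS.items()}
    return max(scores, key=scores.get)

def performance_attributes(command: str) -> list[str]:
    """Look up the attributes associated with the identified process type."""
    return CLASSIFICATION_TABLE[identify_process_type(command)]

print(performance_attributes("Why is the robot conveyor line slowing down?"))
# -> ['throughput', 'downtime', 'defect_rate']
```
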
  • Patent number: 11941080
    Abstract: A system and method for learning human activities from video demonstrations using video augmentation is disclosed. The method includes receiving original videos from one or more data sources. The method includes processing the received original videos using one or more video augmentation techniques to generate a set of augmented videos. Further, the method includes generating a set of training videos by combining the received original videos with the generated set of augmented videos. Also, the method includes generating a deep learning model for the received original videos based on the generated set of training videos. Further, the method includes learning the one or more human activities performed in the received original videos by deploying the generated deep learning model. The method includes outputting the learnt one or more human activities performed in the original videos. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 20, 2021
    Date of Patent: March 26, 2024
    Assignee: Retrocausal, Inc.
    Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sanjay Haresh, Sateesh Kumar
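
A minimal sketch of the training-set construction the abstract above describes, assuming videos are NumPy arrays of shape (frames, height, width, channels). The specific augmentations (horizontal flip, brightness jitter, temporal crop) are illustrative stand-ins; the patent does not commit to particular techniques.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(video: np.ndarray) -> list[np.ndarray]:
    """Produce a few augmented variants of one video (illustrative choices)."""
    flipped = video[:, :, ::-1]                                 # horizontal flip
    brighter = np.clip(video * rng.uniform(0.7, 1.3), 0, 255)   # brightness jitter
    t = video.shape[0]
    cropped = video[t // 4 : 3 * t // 4]                        # temporal crop
    return [flipped, brighter, cropped]

# One toy "original video": 16 frames of 8x8 RGB noise.
originals = [rng.integers(0, 256, (16, 8, 8, 3)).astype(np.float32)]

# Training set = the originals combined with all augmented variants.
training_videos = list(originals)
for v in originals:
    training_videos.extend(augment(v))

print(len(training_videos))  # 1 original + 3 augmented = 4
```
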
  • Publication number: 20220383638
    Abstract: A system and method for determining sub-activities in videos and segmenting the videos is disclosed. The method includes extracting one or more batches from one or more videos and extracting one or more features from a set of frames associated with the one or more batches. The method further includes generating a set of predicted codes and determining a cross-entropy loss, a temporal coherence loss, and a final loss. Further, the method includes categorizing the set of frames into one or more predefined clusters and generating one or more segmented videos based on the categorized set of frames, the determined final loss, and the set of predicted codes by using an activity determination-based ML model. The method includes outputting the generated one or more segmented videos on a user interface screen of one or more electronic devices associated with one or more users. (An illustrative sketch follows this entry.)
    Type: Application
    Filed: May 25, 2022
    Publication date: December 1, 2022
    Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sateesh Kumar, Sanjay Haresh, Awais Ahmed, Hamza Khan, Muhammad Shakeeb Hussain Siddiqui
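
A minimal sketch of the loss computation the abstract above outlines: a cross-entropy term over predicted codes plus a temporal-coherence term that pulls embeddings of adjacent frames together, combined into a final loss. The form of each term and the 0.1 weight are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D, K = 32, 16, 4                     # frames, embedding dim, clusters

embeddings = rng.normal(size=(T, D))    # per-frame feature embeddings
logits = rng.normal(size=(T, K))        # per-frame cluster predictions
codes = rng.integers(0, K, size=T)      # target codes (e.g., from clustering)

def cross_entropy(logits: np.ndarray, targets: np.ndarray) -> float:
    """Mean cross-entropy between softmax(logits) and integer targets."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

def temporal_coherence(emb: np.ndarray) -> float:
    """Mean squared distance between embeddings of consecutive frames."""
    return float(np.mean(np.sum((emb[1:] - emb[:-1]) ** 2, axis=1)))

ce = cross_entropy(logits, codes)
tc = temporal_coherence(embeddings)
final_loss = ce + 0.1 * tc              # 0.1 is an illustrative weight
print(ce, tc, final_loss)
```
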
  • Publication number: 20220374653
    Abstract: A system and method for learning human activities from video demonstrations using video augmentation is disclosed. The method includes receiving original videos from one or more data sources. The method includes processing the received original videos using one or more video augmentation techniques to generate a set of augmented videos. Further, the method includes generating a set of training videos by combining the received original videos with the generated set of augmented videos. Also, the method includes generating a deep learning model for the received original videos based on the generated set of training videos. Further, the method includes learning the one or more human activities performed in the received original videos by deploying the generated deep learning model. The method includes outputting the learnt one or more human activities performed in the original videos.
    Type: Application
    Filed: May 20, 2021
    Publication date: November 24, 2022
    Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sanjay Haresh, Sateesh Kumar
  • Patent number: 11368756
    Abstract: A system and method for correlating video frames in a computing environment. The method includes receiving first video data and second video data from one or more data sources. The method further includes encoding the received first video data and the second video data using a machine learning network. Further, the method includes generating first embedding video data and second embedding video data corresponding to the received first video data and the received second video data. Additionally, the method includes determining a contrastive IDM temporal regularization value for the first video data and the second video data. The method further includes determining a temporal alignment loss between the first video data and the second video data. Also, the method includes determining correlated video frames between the first video data and the second video data based on the determined temporal alignment loss and the determined contrastive IDM temporal regularization value. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: June 21, 2022
    Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sanjay Haresh, Sateesh Kumar, Shahram Najam Syed
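
A minimal sketch, under stated assumptions, of the two quantities the abstract above names: a contrastive IDM temporal regularizer that pulls temporally close frames together and pushes temporally distant frames apart by a margin, and a temporal alignment loss approximated here by a soft-nearest-neighbour distance between the two videos' embeddings. The window, margin, and weighting are illustrative, not the patent's values.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 20, 8
x = rng.normal(size=(T, D))             # frame embeddings of video 1
y = rng.normal(size=(T, D))             # frame embeddings of video 2

def contrastive_idm(emb: np.ndarray, window: int = 2, margin: float = 4.0) -> float:
    """Pull frames within `window` together; push farther frames past `margin`."""
    loss, count = 0.0, 0
    for i in range(len(emb)):
        for j in range(i + 1, len(emb)):
            d2 = np.sum((emb[i] - emb[j]) ** 2)
            if j - i <= window:          # temporally close: pull together
                loss += d2
            else:                        # temporally far: push apart
                loss += max(0.0, margin - d2)
            count += 1
    return loss / count

def alignment_loss(x: np.ndarray, y: np.ndarray) -> float:
    """Expected distance from each frame of x to its soft nearest neighbour in y."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)        # (T, T) distances
    w = np.exp(-d2) / np.exp(-d2).sum(axis=1, keepdims=True)   # soft assignment
    return float(np.mean((w * d2).sum(axis=1)))

total = alignment_loss(x, y) + 0.5 * (contrastive_idm(x) + contrastive_idm(y))
print(total)
```
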
  • Patent number: 11216656
    Abstract: A system and method for management and evaluation of one or more human activities is disclosed. The method includes receiving live videos from data sources. The live videos comprise an activity performed by a human, and the activity comprises actions performed by the human. Further, the method includes detecting the actions performed by the human in the live videos using a neural network model. The method further includes generating a procedural instruction set for the activity performed by the human. Also, the method includes validating the quality of the identified actions performed by the human using the generated procedural instruction set. Furthermore, the method includes detecting anomalies in the actions performed by the human based on the results of validation. Additionally, the method includes generating rectifiable solutions for the detected anomalies. Moreover, the method includes outputting the rectifiable solutions on a user interface of a user device. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: January 4, 2022
    Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Andrey Konin
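
A minimal sketch of validating detected actions against a procedural instruction set and emitting rectifiable solutions, as the abstract above describes. The step names, the in-order matching scheme, and the suggested-fix messages are illustrative placeholders, not the patent's representation.

```python
# Expected ordered steps for the activity (illustrative).
INSTRUCTION_SET = ["pick_part", "insert_part", "fasten_screw", "inspect"]

def validate(detected: list[str]) -> list[str]:
    """Flag instruction-set steps not found in order; suggest a fix for each."""
    anomalies = []
    cursor = 0
    for step in INSTRUCTION_SET:
        if step in detected[cursor:]:
            # Step found at or after the current position: advance past it.
            cursor = detected.index(step, cursor) + 1
        else:
            anomalies.append(f"Missing step '{step}': perform it before continuing.")
    return anomalies

print(validate(["pick_part", "fasten_screw", "inspect"]))
# -> ["Missing step 'insert_part': perform it before continuing."]
```
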
  • Patent number: 11132845
    Abstract: A method for object recognition includes, at a computing device, receiving an image of a real-world object. An identity of the real-world object is recognized using an object recognition model trained on a plurality of computer-generated training images. A digital augmentation model corresponding to the real-world object is retrieved, the digital augmentation model including a set of augmentation-specific instructions. A pose of the digital augmentation model is aligned with a pose of the real-world object. An augmentation is provided, the augmentation associated with the real-world object and specified by the augmentation-specific instructions. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: September 28, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Harpreet Singh Sawhney, Andrey Konin, Bilha-Catherine W. Githinji, Amol Ashok Ambardekar, William Douglas Guyman, Muhammad Zeeshan Zia, Ning Xu, Sheng Kai Tang, Pedro Urbina Escos
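
A minimal sketch of the recognition-to-augmentation pipeline from the abstract above, with the recognizer stubbed out and pose alignment reduced to a 2D rotation plus translation for brevity. The object name and the stored "augmentation-specific instruction" are invented for illustration.

```python
import numpy as np

# Object id -> (anchor points in the model's frame, augmentation instruction).
AUGMENTATION_MODELS = {
    "coffee_maker": (np.array([[0.0, 0.0], [1.0, 0.0]]), "Highlight the water tank."),
}

def recognize(image) -> str:
    """Stand-in for the trained object recognition model."""
    return "coffee_maker"

def align(points: np.ndarray, angle: float, offset: np.ndarray) -> np.ndarray:
    """Rotate model anchor points by `angle` and translate to the object's pose."""
    c, s = np.cos(angle), np.sin(angle)
    return points @ np.array([[c, -s], [s, c]]).T + offset

image = None                            # placeholder input image
obj = recognize(image)
model_points, instruction = AUGMENTATION_MODELS[obj]
placed = align(model_points, angle=np.pi / 2, offset=np.array([2.0, 1.0]))
print(obj, placed.round(2), instruction)
```
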
  • Patent number: 11106949
    Abstract: A computing device, including a processor configured to receive a first video including a plurality of frames. For each frame, the processor may determine that a target region of the frame includes a target object. The processor may determine a surrounding region within which the target region is located. The surrounding region may be smaller than the frame. The processor may identify one or more features located in the surrounding region. From the one or more features, the processor may generate one or more manipulated object identifiers. For each of a plurality of pairs of frames, the processor may determine a respective manipulated object movement between a first manipulated object identifier of the first frame and a second manipulated object identifier of the second frame. The processor may classify at least one action performed in the first video based on the plurality of manipulated object movements. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: March 22, 2019
    Date of Patent: August 31, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Muhammad Zeeshan Zia, Federica Bogo, Harpreet Singh Sawhney, Huseyin Coskun, Bugra Tekin
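
A minimal sketch of classifying an action from manipulated-object movements, in the spirit of the abstract above: the centroid of the detected target region is tracked across frames, the movement between each pair of consecutive frames is computed, and the accumulated motion determines the action. The thresholds and action labels are illustrative.

```python
import numpy as np

# One (x, y) centroid of the target region per frame (toy trajectory).
centroids = np.array([[10, 50], [10, 42], [11, 30], [12, 18], [12, 10]], float)

# Movement between each consecutive pair of frames, then the net motion.
movements = centroids[1:] - centroids[:-1]
net = movements.sum(axis=0)

def classify(net_motion: np.ndarray) -> str:
    """Map accumulated motion to an action label (illustrative rules)."""
    if net_motion[1] < -20:             # large upward motion in image coordinates
        return "lift"
    if net_motion[1] > 20:              # large downward motion
        return "place"
    return "hold"

print(classify(net))                    # -> "lift"
```
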
  • Patent number: 11030458
    Abstract: The disclosure herein describes training a machine learning model to recognize a real-world object based on generated virtual scene variations associated with a model of the real-world object. A digitized three-dimensional (3D) model representing the real-world object is obtained and a virtual scene is built around the 3D model. A plurality of virtual scene variations is generated by varying one or more characteristics. Each virtual scene variation is generated to include a label identifying the 3D model in the virtual scene variation. A machine learning model may be trained based on the plurality of virtual scene variations. The use of generated digital assets to train the machine learning model greatly decreases the time and cost requirements of creating training assets and provides training quality benefits based on the quantity and quality of variations that may be generated, as well as the completeness of information included in each generated digital asset. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: June 8, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Muhammad Zeeshan Zia, Emanuel Shalev, Jonathan C. Hanzelka, Harpreet S. Sawhney, Pedro U. Escos, Michael J. Ebstyne
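
A minimal sketch of generating labeled virtual-scene variations around one 3D model, per the abstract above. The varied characteristics (lighting, camera azimuth, background) and their value ranges are assumptions; in practice a renderer would consume these records to produce training images.

```python
import random

random.seed(0)

def scene_variations(model_id: str, n: int) -> list[dict]:
    """Sample n scene variations, each labeled with the 3D model's identity."""
    scenes = []
    for _ in range(n):
        scenes.append({
            "label": model_id,                            # ground-truth label
            "light_intensity": random.uniform(0.2, 1.0),  # varied characteristic
            "camera_azimuth_deg": random.uniform(0, 360),
            "background": random.choice(["workshop", "office", "outdoor"]),
        })
    return scenes

training_set = scene_variations("drill_v2", n=1000)
print(training_set[0])
```
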
  • Patent number: 11017690
    Abstract: A system for building computational models of a goal-driven task from demonstration is disclosed. A task recording subsystem receives a recorded video file or recorded sensor data representative of an expert demonstration of a task. An instructor authoring tool generates one or more sub-activity proposals and enables an instructor to specify one or more sub-activity labels upon modification of the one or more sub-activity proposals into one or more sub-tasks. A task learning subsystem learns the one or more sub-tasks represented in the demonstration of the task and builds an activity model to predict and locate the task being performed in the recorded video file. A task evaluation subsystem evaluates a live video representative of the task, generates at least one performance description statistic, identifies the type of activity step executed by the one or more actors, and provides activity guidance feedback in real-time to the one or more actors. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: May 25, 2021
    Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Andrey Konin, Sanjay Haresh, Sateesh Kumar
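
A minimal sketch of one plausible way to produce the sub-activity proposals the abstract above mentions: split a demonstration where consecutive frame embeddings change sharply, then let an instructor attach labels to the resulting segments. The embeddings, the threshold, and the labels are toy placeholders, not the patent's mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
# 30 frames whose embeddings jump at frames 10 and 20 (two true boundaries).
emb = np.concatenate([rng.normal(0, 0.1, (10, 4)),
                      rng.normal(3, 0.1, (10, 4)),
                      rng.normal(6, 0.1, (10, 4))])

# Propose a boundary wherever consecutive embeddings differ sharply.
jumps = np.linalg.norm(np.diff(emb, axis=0), axis=1)
boundaries = np.where(jumps > 1.0)[0] + 1
proposals = np.split(np.arange(len(emb)), boundaries)

# The instructor assigns a label to each proposed sub-task.
labels = ["pick", "align", "fasten"]
for label, frames in zip(labels, proposals):
    print(label, frames[0], "-", frames[-1])
```
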
  • Publication number: 20200372715
    Abstract: A method for object recognition includes, at a computing device, receiving an image of a real-world object. An identity of the real-world object is recognized using an object recognition model trained on a plurality of computer-generated training images. A digital augmentation model corresponding to the real-world object is retrieved, the digital augmentation model including a set of augmentation-specific instructions. A pose of the digital augmentation model is aligned with a pose of the real-world object. An augmentation is provided, the augmentation associated with the real-world object and specified by the augmentation-specific instructions.
    Type: Application
    Filed: May 22, 2019
    Publication date: November 26, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Harpreet Singh Sawhney, Andrey Konin, Bilha-Catherine W. Githinji, Amol Ashok Ambardekar, William Douglas Guyman, Muhammad Zeeshan Zia, Ning Xu, Sheng Kai Tang, Pedro Urbina Escos
  • Patent number: 10832084
    Abstract: A method for estimating dense 3D geometric correspondences between two input point clouds by employing a 3D convolutional neural network (CNN) architecture is presented. The method includes, during a training phase, transforming the two input point clouds into truncated distance function voxel grid representations, feeding the truncated distance function voxel grid representations into individual feature extraction layers with tied weights, extracting low-level features from a first feature extraction layer, extracting high-level features from a second feature extraction layer, normalizing the extracted low-level features and high-level features, and applying deep supervision of multiple contrastive losses and multiple hard negative mining modules at the first and second feature extraction layers. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: November 10, 2020
    Assignee: NEC Corporation
    Inventors: Quoc-Huy Tran, Mohammed E. Fathy Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
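
A minimal sketch of the truncated distance function (TDF) voxel-grid representation the abstract above describes: each voxel stores the distance from its center to the nearest point of the cloud, clipped at a truncation radius. Grid resolution and truncation value are illustrative, and the 3D CNN that consumes the grid is omitted.

```python
import numpy as np

def tdf_grid(points: np.ndarray, res: int = 16, trunc: float = 0.2) -> np.ndarray:
    """Voxelize a point cloud into a (res, res, res) truncated distance grid."""
    lo, hi = points.min(0), points.max(0)
    axes = [np.linspace(lo[d], hi[d], res) for d in range(3)]
    centers = np.stack(np.meshgrid(*axes, indexing="ij"), -1).reshape(-1, 3)
    # Brute-force nearest-point distance for every voxel center.
    d = np.linalg.norm(centers[:, None, :] - points[None, :, :], axis=-1).min(1)
    # Truncate: distances beyond `trunc` carry no extra information.
    return np.minimum(d, trunc).reshape(res, res, res)

cloud = np.random.default_rng(0).uniform(size=(200, 3))
grid = tdf_grid(cloud)
print(grid.shape, grid.min(), grid.max())
```
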
  • Publication number: 20200302245
    Abstract: A computing device, including a processor configured to receive a first video including a plurality of frames. For each frame, the processor may determine that a target region of the frame includes a target object. The processor may determine a surrounding region within which the target region is located. The surrounding region may be smaller than the frame. The processor may identify one or more features located in the surrounding region. From the one or more features, the processor may generate one or more manipulated object identifiers. For each of a plurality of pairs of frames, the processor may determine a respective manipulated object movement between a first manipulated object identifier of the first frame and a second manipulated object identifier of the second frame. The processor may classify at least one action performed in the first video based on the plurality of manipulated object movements.
    Type: Application
    Filed: March 22, 2019
    Publication date: September 24, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Muhammad Zeeshan Zia, Federica Bogo, Harpreet Singh Sawhney, Huseyin Coskun, Bugra Tekin
  • Patent number: 10762359
    Abstract: Systems and methods for detecting traffic scenarios include an image capturing device which captures two or more images of an area of a traffic environment, each image having a different view of vehicles and a road in the traffic environment. A hierarchical feature extractor concurrently extracts features, including geometric features and semantic features, at multiple neural network layers from each of the images, estimates correspondences between the semantic features of each of the images, and refines the estimated correspondences with correspondences between the geometric features of each of the images to generate refined correspondence estimates. A traffic localization module uses the refined correspondence estimates to determine the locations of vehicles in the environment in three dimensions and automatically determine a traffic scenario according to those locations. A notification device generates a notification of the traffic scenario. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: July 6, 2018
    Date of Patent: September 1, 2020
    Assignee: NEC Corporation
    Inventors: Quoc-Huy Tran, Mohammed E. F. Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
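
A minimal sketch of refining correspondences in the spirit of the abstract above: candidate matches come from coarse semantic descriptors, then are re-ranked by geometric descriptor distance. Both descriptor sets are random stand-ins for the CNN features the patent describes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
sem1, sem2 = rng.normal(size=(n, 32)), rng.normal(size=(n, 32))   # semantic features
geo1, geo2 = rng.normal(size=(n, 16)), rng.normal(size=(n, 16))   # geometric features

def refined_matches(k: int = 5) -> list[tuple[int, int]]:
    """Semantic top-k candidates per point, re-ranked by geometric distance."""
    matches = []
    for i in range(n):
        sem_d = np.linalg.norm(sem2 - sem1[i], axis=1)
        candidates = np.argsort(sem_d)[:k]                 # top-k semantic matches
        geo_d = np.linalg.norm(geo2[candidates] - geo1[i], axis=1)
        matches.append((i, int(candidates[np.argmin(geo_d)])))  # geometric re-rank
    return matches

print(refined_matches()[:3])
```
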
  • Patent number: 10679075
    Abstract: Systems and methods for correspondence estimation and flexible ground modeling include communicating two-dimensional (2D) images of an environment to a correspondence estimation module, including a first image and a second image captured by an image capturing device. First features, including geometric features and semantic features, are hierarchically extracted from the first image with a first convolutional neural network (CNN) according to activation map weights, and second features, including geometric features and semantic features, are hierarchically extracted from the second image with a second CNN according to the activation map weights. Correspondences between the first features and the second features are estimated, including hierarchical fusing of geometric correspondences and semantic correspondences. A 3-dimensional (3D) model of a terrain is estimated using the estimated correspondences belonging to the terrain surface. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: July 6, 2018
    Date of Patent: June 9, 2020
    Assignee: NEC Corporation
    Inventors: Quoc-Huy Tran, Mohammed E. F. Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
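
A minimal sketch of flexible ground modeling under an assumed surface model: a quadratic height field z = f(x, y) is fitted by least squares to 3D points taken to lie on the terrain. The patent does not fix the surface model; the quadratic basis is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic terrain points with a known quadratic surface plus noise.
x, y = rng.uniform(-10, 10, 200), rng.uniform(-10, 10, 200)
z = 0.02 * x**2 - 0.01 * x * y + 0.1 * y + 1.0 + rng.normal(0, 0.05, 200)

# Design matrix for z = a*x^2 + b*xy + c*y^2 + d*x + e*y + f.
A = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)

print(coeffs.round(3))   # recovers approximately [0.02, -0.01, 0, 0, 0.1, 1.0]
```
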
  • Publication number: 20200089954
    Abstract: The disclosure herein describes training a machine learning model to recognize a real-world object based on generated virtual scene variations associated with a model of the real-world object. A digitized three-dimensional (3D) model representing the real-world object is obtained and a virtual scene is built around the 3D model. A plurality of virtual scene variations is generated by varying one or more characteristics. Each virtual scene variation is generated to include a label identifying the 3D model in the virtual scene variation. A machine learning model may be trained based on the plurality of virtual scene variations. The use of generated digital assets to train the machine learning model greatly decreases the time and cost requirements of creating training assets and provides training quality benefits based on the quantity and quality of variations that may be generated, as well as the completeness of information included in each generated digital asset.
    Type: Application
    Filed: September 14, 2018
    Publication date: March 19, 2020
    Inventors: Muhammad Zeeshan Zia, Emanuel Shalev, Jonathan C. Hanzelka, Harpreet S. Sawhney, Pedro U. Escos, Michael J. Ebstyne
  • Publication number: 20200058156
    Abstract: A method for estimating dense 3D geometric correspondences between two input point clouds by employing a 3D convolutional neural network (CNN) architecture is presented. The method includes, during a training phase, transforming the two input point clouds into truncated distance function voxel grid representations, feeding the truncated distance function voxel grid representations into individual feature extraction layers with tied weights, extracting low-level features from a first feature extraction layer, extracting high-level features from a second feature extraction layer, normalizing the extracted low-level features and high-level features, and applying deep supervision of multiple contrastive losses and multiple hard negative mining modules at the first and second feature extraction layers.
    Type: Application
    Filed: July 30, 2019
    Publication date: February 20, 2020
    Inventors: Quoc-Huy Tran, Mohammed E. Fathy Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
  • Patent number: 10331974
    Abstract: An action recognition system and method are provided. The action recognition system includes an image capture device configured to capture an actual image depicting an object. The action recognition system includes a processor configured to render, based on a set of 3D CAD models, synthetic images with corresponding intermediate shape concept labels. The processor is configured to form a multi-layer CNN which jointly models multiple intermediate shape concepts, based on the rendered synthetic images. The processor is configured to perform an intra-class appearance variation-aware and occlusion-aware 3D object parsing on the actual image by applying the CNN thereto to generate an image pair including a 2D and 3D geometric structure of the object. The processor is configured to control a device to perform a response action in response to an identification of an action performed by the object, wherein the identification of the action is based on the image pair. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: June 25, 2019
    Assignee: NEC Corporation
    Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Manmohan Chandraker, Chi Li
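
A minimal sketch of jointly modeling intermediate shape concepts, as the abstract above describes: each layer of a small network is scored against its own concept label, and the per-layer losses are summed into one training objective. Layer sizes, the concept targets, the mean-squared losses, and the forward-only computation (no training loop) are all illustrative simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 64))                       # input image features

# Two layers, each supervised by an intermediate concept label.
layers = [rng.normal(size=(64, 32)), rng.normal(size=(32, 16))]
concept_targets = [rng.normal(size=(1, 32)),       # e.g., a viewpoint concept
                   rng.normal(size=(1, 16))]       # e.g., a keypoint concept

total_loss, h = 0.0, x
for W, target in zip(layers, concept_targets):
    h = np.tanh(h @ W)                             # forward through one layer
    total_loss += np.mean((h - target) ** 2)       # per-layer concept loss

print(total_loss)                                  # sum of all concept losses
```
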
  • Patent number: 10289935
    Abstract: A system and method are provided for driving assistance. The system includes an image capture device configured to capture an actual image relative to an outward view from a motor vehicle and depicting an object. The system further includes a processor configured to render, based on a set of 3D CAD models, synthetic images with corresponding intermediate shape concept labels. The processor is further configured to form a multi-layer CNN which jointly models multiple intermediate shape concepts, based on the rendered synthetic images. The processor is also configured to perform an intra-class appearance variation-aware and occlusion-aware 3D object parsing on the actual image by applying the CNN to the actual image to output an image pair including a 2D and 3D geometric structure of the object. The processor is additionally configured to perform an action to mitigate a likelihood of harm involving the motor vehicle, based on the image pair.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: May 14, 2019
    Assignee: NEC Corporation
    Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Manmohan Chandraker, Chi Li
  • Patent number: 10289936
    Abstract: A surveillance system and method are provided. The surveillance system includes an image capture device configured to capture an actual image of a target area depicting an object. The surveillance system further includes a processor. The processor is configured to render, based on a set of 3D Computer Aided Design (CAD) models, synthetic images with intermediate shape corresponding concept labels. The processor is further configured to form a multi-layer Convolutional Neural Network (CNN) which jointly models multiple intermediate shape concepts, based on the rendered synthetic images. The processor is also configured to perform an intra-class appearance variation-aware and occlusion-aware 3D object parsing on the actual image by applying the CNN to the actual image to generate an image pair including a 2D and 3D geometric structure of the object depicted in the actual image. The surveillance system further includes a display device configured to display the image pair.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: May 14, 2019
    Assignee: NEC Corporation
    Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Manmohan Chandraker, Chi Li