Patents by Inventor Muhammad Zeeshan Zia
Muhammad Zeeshan Zia has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11947343
Abstract: A system and method for optimizing an industrial assembly process in an industrial environment is disclosed. The system operates on an artificial intelligence (AI) based conversational/GUI platform, where it receives user commands related to industrial assembly process improvement queries. By analyzing the received user commands, the system identifies the type of industrial assembly process mentioned by extracting relevant keywords or other attributes. Using a trained AI-based classification table, the system determines performance attributes associated with the identified type of process. The system leverages various sources such as domain knowledge, organization-specific knowledge bases, data from tools/internet-based services, and statistical measurements from the industrial environment.
Type: Grant
Filed: September 5, 2023
Date of Patent: April 2, 2024
Assignee: Retrocausal, Inc.
Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Andrey Konin
-
Patent number: 11941080
Abstract: A system and method for learning human activities from video demonstrations using video augmentation is disclosed. The method includes receiving original videos from one or more data sources. The method includes processing the received original videos using one or more video augmentation techniques to generate a set of augmented videos. Further, the method includes generating a set of training videos by combining the received original videos with the generated set of augmented videos. Also, the method includes generating a deep learning model for the received original videos based on the generated set of training videos. Further, the method includes learning the one or more human activities performed in the received original videos by deploying the generated deep learning model. The method includes outputting the learnt one or more human activities performed in the original videos.
Type: Grant
Filed: May 20, 2021
Date of Patent: March 26, 2024
Assignee: Retrocausal, Inc.
Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sanjay Haresh, Sateesh Kumar
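The augmentation-then-combine step can be sketched concretely: generate augmented copies of each original video and merge them into one training set. The specific augmentations below (horizontal flip, brightness jitter) are generic examples, not the patent's augmentation list.

```python
import numpy as np

def augment_video(video: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply simple augmentations to a video of shape (frames, H, W, channels)
    with values in [0, 1]. Flip and brightness jitter are illustrative only."""
    out = video.copy()
    if rng.random() < 0.5:
        out = out[:, :, ::-1, :]      # horizontal flip of every frame
    gain = rng.uniform(0.8, 1.2)      # brightness jitter
    return np.clip(out * gain, 0.0, 1.0)

def build_training_set(originals: list[np.ndarray], n_aug: int = 2,
                       seed: int = 0) -> list[np.ndarray]:
    """Combine the original videos with generated augmented copies,
    as the abstract describes."""
    rng = np.random.default_rng(seed)
    training = list(originals)
    for video in originals:
        training.extend(augment_video(video, rng) for _ in range(n_aug))
    return training
```

The deep learning model would then be trained on `build_training_set(...)` rather than on the originals alone.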
-
Publication number: 20220383638
Abstract: A system and method for determining sub-activities in videos and segmenting the videos is disclosed. The method includes extracting one or more batches from one or more videos and extracting one or more features from a set of frames associated with the one or more batches. The method further includes generating a set of predicted codes and determining a cross-entropy loss, a temporal coherence loss, and a final loss. Further, the method includes categorizing the set of frames into one or more predefined clusters and generating one or more segmented videos based on the categorized set of frames, the determined final loss, and the set of predicted codes by using an activity determination-based ML model. The method includes outputting the generated one or more segmented videos on a user interface screen of one or more electronic devices associated with one or more users.
Type: Application
Filed: May 25, 2022
Publication date: December 1, 2022
Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sateesh Kumar, Sanjay Haresh, Awais Ahmed, Hamza Khan, Muhammad Shakeeb Hussain Siddiqui
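The loss structure named in the abstract (a per-frame cross-entropy against predicted codes, a temporal coherence term, and a combined final loss) can be sketched as follows. The softmax form of the coherence term and the 0.1 weight are hypothetical choices for illustration.

```python
import numpy as np

def frame_losses(logits: np.ndarray, codes: np.ndarray):
    """Sketch of the combined objective. logits: (T, K) raw cluster scores
    per frame; codes: (T,) integer predicted codes used as pseudo-labels.
    Returns (cross_entropy, temporal_coherence, final_loss)."""
    # numerically stable softmax over the K clusters
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    T = len(codes)
    # cross-entropy of each frame against its predicted code
    ce = -np.mean(np.log(p[np.arange(T), codes] + 1e-9))
    # temporal coherence: adjacent frames should predict similar distributions
    tc = np.mean(np.sum((p[1:] - p[:-1]) ** 2, axis=1))
    final = ce + 0.1 * tc  # 0.1 is an illustrative weighting, not the patent's
    return ce, tc, final
```

Frames would then be assigned to clusters (e.g. by `p.argmax(axis=1)`) and contiguous runs of the same cluster emitted as video segments.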
-
Publication number: 20220374653
Abstract: A system and method for learning human activities from video demonstrations using video augmentation is disclosed. The method includes receiving original videos from one or more data sources. The method includes processing the received original videos using one or more video augmentation techniques to generate a set of augmented videos. Further, the method includes generating a set of training videos by combining the received original videos with the generated set of augmented videos. Also, the method includes generating a deep learning model for the received original videos based on the generated set of training videos. Further, the method includes learning the one or more human activities performed in the received original videos by deploying the generated deep learning model. The method includes outputting the learnt one or more human activities performed in the original videos.
Type: Application
Filed: May 20, 2021
Publication date: November 24, 2022
Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sanjay Haresh, Sateesh Kumar
-
Patent number: 11368756
Abstract: A system and method for correlating video frames in a computing environment. The method includes receiving first video data and second video data from one or more data sources. The method further includes encoding the received first video data and the second video data using a machine learning network. Further, the method includes generating first embedding video data and second embedding video data corresponding to the received first video data and the received second video data. Additionally, the method includes determining a contrastive IDM temporal regularization value for the first video data and the second video data. The method further includes determining a temporal alignment loss between the first video data and the second video data. Also, the method includes determining correlated video frames between the first video data and the second video data based on the determined temporal alignment loss and the determined contrastive IDM temporal regularization value.
Type: Grant
Filed: March 26, 2021
Date of Patent: June 21, 2022
Inventors: Quoc-Huy Tran, Muhammad Zeeshan Zia, Andrey Konin, Sanjay Haresh, Sateesh Kumar, Shahram Najam Syed
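The contrastive IDM (index-difference) regularizer mentioned in the abstract can be sketched on a single video's frame embeddings: frames close in time are pulled together, and frames far apart in time are pushed at least a margin apart. The window and margin values are illustrative assumptions, not figures from the patent.

```python
import numpy as np

def contrastive_idm(emb: np.ndarray, window: int = 2, margin: float = 1.0) -> float:
    """Sketch of a contrastive IDM temporal regularizer.
    emb: (T, D) frame embeddings. Pairs with |i - j| <= window are
    attracted (their squared distance is penalized); more distant pairs
    are repelled up to `margin`. Brute-force and unoptimized."""
    T = len(emb)
    total, count = 0.0, 0
    for i in range(T):
        for j in range(i + 1, T):
            d2 = float(np.sum((emb[i] - emb[j]) ** 2))
            if j - i <= window:
                total += d2                        # attract temporal neighbors
            else:
                total += max(0.0, margin - d2)     # repel distant frames
            count += 1
    return total / count
```

In the patented method this value is combined with a temporal alignment loss across the two videos; the alignment component is omitted here for brevity.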
-
Patent number: 11216656
Abstract: A system and method for management and evaluation of one or more human activities is disclosed. The method includes receiving live videos from data sources. The live videos comprise an activity performed by a human. The activity comprises actions performed by the human. Further, the method includes detecting the actions performed by the human in the live videos using a neural network model. The method further includes generating a procedural instruction set for the activity performed by the human. Also, the method includes validating the quality of the detected actions performed by the human using the generated procedural instruction set. Furthermore, the method includes detecting anomalies in the actions performed by the human based on the results of validation. Additionally, the method includes generating rectifiable solutions for the detected anomalies. Moreover, the method includes outputting the rectifiable solutions on a user interface of a user device.
Type: Grant
Filed: June 21, 2021
Date of Patent: January 4, 2022
Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Andrey Konin
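The validation step (check detected actions against a procedural instruction set and flag anomalies) can be sketched with plain sequence checks. The step names are hypothetical examples; the patent's anomaly detection would operate on richer action representations.

```python
def validate_actions(detected: list[str], instruction_set: list[str]) -> list[str]:
    """Sketch of validating a detected action sequence against a
    procedural instruction set: report missing steps and steps
    performed out of the prescribed order as anomalies."""
    anomalies = []
    # any prescribed step that was never observed is missing
    for step in instruction_set:
        if step not in detected:
            anomalies.append(f"missing step: {step}")
    # compare the observed ordering against the prescribed ordering
    expected = [s for s in instruction_set if s in detected]
    observed = [s for s in detected if s in instruction_set]
    if observed != expected:
        anomalies.append("steps performed out of order")
    return anomalies
```

Each anomaly string would then be mapped to a rectifiable solution (e.g. "repeat the skipped step") for display on the user device.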
-
Patent number: 11132845
Abstract: A method for object recognition includes, at a computing device, receiving an image of a real-world object. An identity of the real-world object is recognized using an object recognition model trained on a plurality of computer-generated training images. A digital augmentation model corresponding to the real-world object is retrieved, the digital augmentation model including a set of augmentation-specific instructions. A pose of the digital augmentation model is aligned with a pose of the real-world object. An augmentation is provided, the augmentation associated with the real-world object and specified by the augmentation-specific instructions.
Type: Grant
Filed: May 22, 2019
Date of Patent: September 28, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Harpreet Singh Sawhney, Andrey Konin, Bilha-Catherine W. Githinji, Amol Ashok Ambardekar, William Douglas Guyman, Muhammad Zeeshan Zia, Ning Xu, Sheng Kai Tang, Pedro Urbina Escos
-
Patent number: 11106949
Abstract: A computing device, including a processor configured to receive a first video including a plurality of frames. For each frame, the processor may determine that a target region of the frame includes a target object. The processor may determine a surrounding region within which the target region is located. The surrounding region may be smaller than the frame. The processor may identify one or more features located in the surrounding region. From the one or more features, the processor may generate one or more manipulated object identifiers. For each of a plurality of pairs of frames, the processor may determine a respective manipulated object movement between a first manipulated object identifier of the first frame and a second manipulated object identifier of the second frame. The processor may classify at least one action performed in the first video based on the plurality of manipulated object movements.
Type: Grant
Filed: March 22, 2019
Date of Patent: August 31, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Muhammad Zeeshan Zia, Federica Bogo, Harpreet Singh Sawhney, Huseyin Coskun, Bugra Tekin
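The final step (classify an action from the per-frame-pair movements of the manipulated object) can be sketched by reducing the object identifier to a 2D centroid per frame and thresholding the total displacement. The two action labels and the threshold are invented for illustration.

```python
import numpy as np

def classify_action(centroids: np.ndarray, threshold: float = 1.0) -> str:
    """Sketch of action classification from manipulated-object movement.
    centroids: (T, 2) object center per frame. Movement is computed
    between consecutive frame pairs; the clip is labeled by the total
    displacement. Labels and threshold are hypothetical examples."""
    movements = np.linalg.norm(np.diff(centroids, axis=0), axis=1)
    return "moving_object" if movements.sum() > threshold else "static_object"
```

A real classifier would feed the movement sequence to a learned model rather than a fixed threshold; the sketch only shows the frame-pair movement computation.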
-
Patent number: 11030458
Abstract: The disclosure herein describes training a machine learning model to recognize a real-world object based on generated virtual scene variations associated with a model of the real-world object. A digitized three-dimensional (3D) model representing the real-world object is obtained and a virtual scene is built around the 3D model. A plurality of virtual scene variations is generated by varying one or more characteristics. Each virtual scene variation is generated to include a label identifying the 3D model in the virtual scene variation. A machine learning model may be trained based on the plurality of virtual scene variations. The use of generated digital assets to train the machine learning model greatly decreases the time and cost requirements of creating training assets and provides training quality benefits based on the quantity and quality of variations that may be generated, as well as the completeness of information included in each generated digital asset.
Type: Grant
Filed: September 14, 2018
Date of Patent: June 8, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Muhammad Zeeshan Zia, Emanuel Shalev, Jonathan C. Hanzelka, Harpreet S. Sawhney, Pedro U. Escos, Michael J. Ebstyne
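The variation-generation loop described above (vary scene characteristics around a 3D model, keeping a label on every variation) can be sketched as a simple domain-randomization routine. The specific characteristics varied here (lighting, camera yaw, background) are generic examples, not the patent's list.

```python
import random

def generate_scene_variations(model_id: str, n: int, seed: int = 0) -> list[dict]:
    """Sketch of generating labeled virtual scene variations around a 3D
    model. Each variation records randomized scene characteristics and
    the label identifying the 3D model, as the abstract describes."""
    rng = random.Random(seed)
    variations = []
    for _ in range(n):
        variations.append({
            "label": model_id,                      # every variation keeps its label
            "lighting": rng.uniform(0.2, 1.0),      # scene brightness
            "camera_yaw_deg": rng.uniform(0, 360),  # viewpoint around the model
            "background": rng.choice(["office", "warehouse", "outdoor"]),
        })
    return variations
```

A renderer would turn each parameter dictionary into a training image; the automatically attached label is what makes the synthetic set usable for supervised training without manual annotation.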
-
Patent number: 11017690
Abstract: A system for building computational models of a goal-driven task from demonstration is disclosed. A task recording subsystem receives a recorded video file or recorded sensor data representative of an expert demonstration for a task. An instructor authoring tool generates one or more sub-activity proposals and enables an instructor to specify one or more sub-activity labels upon modification of the one or more sub-activity proposals into one or more sub-tasks. A task learning subsystem learns the one or more sub-tasks represented in the demonstration of the task and builds an activity model to predict and locate the task being performed in the recorded video file. A task evaluation subsystem evaluates a live video representative of the task, generates at least one performance description statistic, identifies a type of activity step executed by the one or more actors, and provides activity guidance feedback in real time to the one or more actors.
Type: Grant
Filed: December 18, 2020
Date of Patent: May 25, 2021
Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Andrey Konin, Sanjay Haresh, Sateesh Kumar
-
Publication number: 20200372715
Abstract: A method for object recognition includes, at a computing device, receiving an image of a real-world object. An identity of the real-world object is recognized using an object recognition model trained on a plurality of computer-generated training images. A digital augmentation model corresponding to the real-world object is retrieved, the digital augmentation model including a set of augmentation-specific instructions. A pose of the digital augmentation model is aligned with a pose of the real-world object. An augmentation is provided, the augmentation associated with the real-world object and specified by the augmentation-specific instructions.
Type: Application
Filed: May 22, 2019
Publication date: November 26, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Harpreet Singh Sawhney, Andrey Konin, Bilha-Catherine W. Githinji, Amol Ashok Ambardekar, William Douglas Guyman, Muhammad Zeeshan Zia, Ning Xu, Sheng Kai Tang, Pedro Urbina Escos
-
Patent number: 10832084
Abstract: A method for estimating dense 3D geometric correspondences between two input point clouds by employing a 3D convolutional neural network (CNN) architecture is presented. The method includes, during a training phase, transforming the two input point clouds into truncated distance function voxel grid representations, feeding the truncated distance function voxel grid representations into individual feature extraction layers with tied weights, extracting low-level features from a first feature extraction layer, extracting high-level features from a second feature extraction layer, normalizing the extracted low-level features and high-level features, and applying deep supervision of multiple contrastive losses and multiple hard negative mining modules at the first and second feature extraction layers.
Type: Grant
Filed: July 30, 2019
Date of Patent: November 10, 2020
Assignee: NEC Corporation
Inventors: Quoc-Huy Tran, Mohammed E. Fathy Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
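The first step of the pipeline, converting a point cloud into a truncated distance function (TDF) voxel grid, can be sketched directly: each voxel stores the distance from its center to the nearest point, clipped at a truncation value. This brute-force version is for illustration only; the grid size and truncation value are assumed, not taken from the patent.

```python
import numpy as np

def tdf_voxel_grid(points: np.ndarray, grid_size: int = 8,
                   truncation: float = 0.25) -> np.ndarray:
    """Sketch of a truncated distance function voxelization.
    points: (N, 3) coordinates in the unit cube. Returns a
    (grid_size, grid_size, grid_size) array of truncated distances."""
    # voxel-center coordinates along one axis
    centers = (np.arange(grid_size) + 0.5) / grid_size
    gx, gy, gz = np.meshgrid(centers, centers, centers, indexing="ij")
    voxel_centers = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)
    # brute-force distance from every voxel center to its nearest point
    dists = np.linalg.norm(voxel_centers[:, None, :] - points[None, :, :],
                           axis=-1).min(axis=1)
    # truncate so distant empty space carries no gradient signal
    return np.minimum(dists, truncation).reshape(grid_size, grid_size, grid_size)
```

Two such grids would then be fed into the Siamese (tied-weight) 3D CNN feature extraction layers described in the abstract.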
-
Publication number: 20200302245
Abstract: A computing device, including a processor configured to receive a first video including a plurality of frames. For each frame, the processor may determine that a target region of the frame includes a target object. The processor may determine a surrounding region within which the target region is located. The surrounding region may be smaller than the frame. The processor may identify one or more features located in the surrounding region. From the one or more features, the processor may generate one or more manipulated object identifiers. For each of a plurality of pairs of frames, the processor may determine a respective manipulated object movement between a first manipulated object identifier of the first frame and a second manipulated object identifier of the second frame. The processor may classify at least one action performed in the first video based on the plurality of manipulated object movements.
Type: Application
Filed: March 22, 2019
Publication date: September 24, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Muhammad Zeeshan Zia, Federica Bogo, Harpreet Singh Sawhney, Huseyin Coskun, Bugra Tekin
-
Patent number: 10762359
Abstract: Systems and methods for detecting traffic scenarios include an image capturing device which captures two or more images of an area of a traffic environment, with each image having a different view of vehicles and a road in the traffic environment. A hierarchical feature extractor concurrently extracts features at multiple neural network layers from each of the images, with the features including geometric features and semantic features, estimates correspondences between semantic features for each of the images, and refines the estimated correspondences with correspondences between the geometric features of each of the images to generate refined correspondence estimates. A traffic localization module uses the refined correspondence estimates to determine locations of vehicles in the environment in three dimensions to automatically determine a traffic scenario according to the locations of vehicles. A notification device generates a notification of the traffic scenario.
Type: Grant
Filed: July 6, 2018
Date of Patent: September 1, 2020
Assignee: NEC Corporation
Inventors: Quoc-Huy Tran, Mohammed E. F. Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
-
Patent number: 10679075
Abstract: Systems and methods for correspondence estimation and flexible ground modeling include communicating two-dimensional (2D) images of an environment to a correspondence estimation module, including a first image and a second image captured by an image capturing device. First features, including geometric features and semantic features, are hierarchically extracted from the first image with a first convolutional neural network (CNN) according to activation map weights, and second features, including geometric features and semantic features, are hierarchically extracted from the second image with a second CNN according to the activation map weights. Correspondences between the first features and the second features are estimated, including hierarchical fusing of geometric correspondences and semantic correspondences. A 3-dimensional (3D) model of a terrain is estimated using the estimated correspondences belonging to the terrain surface.
Type: Grant
Filed: July 6, 2018
Date of Patent: June 9, 2020
Assignee: NEC Corporation
Inventors: Quoc-Huy Tran, Mohammed E. F. Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
-
Publication number: 20200089954
Abstract: The disclosure herein describes training a machine learning model to recognize a real-world object based on generated virtual scene variations associated with a model of the real-world object. A digitized three-dimensional (3D) model representing the real-world object is obtained and a virtual scene is built around the 3D model. A plurality of virtual scene variations is generated by varying one or more characteristics. Each virtual scene variation is generated to include a label identifying the 3D model in the virtual scene variation. A machine learning model may be trained based on the plurality of virtual scene variations. The use of generated digital assets to train the machine learning model greatly decreases the time and cost requirements of creating training assets and provides training quality benefits based on the quantity and quality of variations that may be generated, as well as the completeness of information included in each generated digital asset.
Type: Application
Filed: September 14, 2018
Publication date: March 19, 2020
Inventors: Muhammad Zeeshan Zia, Emanuel Shalev, Jonathan C. Hanzelka, Harpreet S. Sawhney, Pedro U. Escos, Michael J. Ebstyne
-
Publication number: 20200058156
Abstract: A method for estimating dense 3D geometric correspondences between two input point clouds by employing a 3D convolutional neural network (CNN) architecture is presented. The method includes, during a training phase, transforming the two input point clouds into truncated distance function voxel grid representations, feeding the truncated distance function voxel grid representations into individual feature extraction layers with tied weights, extracting low-level features from a first feature extraction layer, extracting high-level features from a second feature extraction layer, normalizing the extracted low-level features and high-level features, and applying deep supervision of multiple contrastive losses and multiple hard negative mining modules at the first and second feature extraction layers.
Type: Application
Filed: July 30, 2019
Publication date: February 20, 2020
Inventors: Quoc-Huy Tran, Mohammed E. Fathy Salem, Muhammad Zeeshan Zia, Paul Vernaza, Manmohan Chandraker
-
Patent number: 10331974
Abstract: An action recognition system and method are provided. The action recognition system includes an image capture device configured to capture an actual image depicting an object. The action recognition system includes a processor configured to render, based on a set of 3D CAD models, synthetic images with corresponding intermediate shape concept labels. The processor is configured to form a multi-layer CNN which jointly models multiple intermediate shape concepts, based on the rendered synthetic images. The processor is configured to perform an intra-class appearance variation-aware and occlusion-aware 3D object parsing on the actual image by applying the CNN thereto to generate an image pair including a 2D and 3D geometric structure of the object. The processor is configured to control a device to perform a response action in response to an identification of an action performed by the object, wherein the identification of the action is based on the image pair.
Type: Grant
Filed: September 20, 2017
Date of Patent: June 25, 2019
Assignee: NEC Corporation
Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Manmohan Chandraker, Chi Li
-
Patent number: 10289935
Abstract: A system and method are provided for driving assistance. The system includes an image capture device configured to capture an actual image relative to an outward view from a motor vehicle and depicting an object. The system further includes a processor configured to render, based on a set of 3D CAD models, synthetic images with corresponding intermediate shape concept labels. The processor is further configured to form a multi-layer CNN which jointly models multiple intermediate shape concepts, based on the rendered synthetic images. The processor is also configured to perform an intra-class appearance variation-aware and occlusion-aware 3D object parsing on the actual image by applying the CNN to the actual image to output an image pair including a 2D and 3D geometric structure of the object. The processor is additionally configured to perform an action to mitigate a likelihood of harm involving the motor vehicle, based on the image pair.
Type: Grant
Filed: September 20, 2017
Date of Patent: May 14, 2019
Assignee: NEC Corporation
Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Manmohan Chandraker, Chi Li
-
Patent number: 10289936
Abstract: A surveillance system and method are provided. The surveillance system includes an image capture device configured to capture an actual image of a target area depicting an object. The surveillance system further includes a processor. The processor is configured to render, based on a set of 3D Computer Aided Design (CAD) models, synthetic images with corresponding intermediate shape concept labels. The processor is further configured to form a multi-layer Convolutional Neural Network (CNN) which jointly models multiple intermediate shape concepts, based on the rendered synthetic images. The processor is also configured to perform an intra-class appearance variation-aware and occlusion-aware 3D object parsing on the actual image by applying the CNN to the actual image to generate an image pair including a 2D and 3D geometric structure of the object depicted in the actual image. The surveillance system further includes a display device configured to display the image pair.
Type: Grant
Filed: September 20, 2017
Date of Patent: May 14, 2019
Assignee: NEC Corporation
Inventors: Muhammad Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Manmohan Chandraker, Chi Li