Patents by Inventor Hamidreza VAEZI JOZE

Hamidreza VAEZI JOZE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ADAPTIVE MULTI-SCALE FACE AND BODY DETECTOR

Publication number: 20230291993

Abstract: Systems and methods are provided for determining faces and bodies of people in an image by adaptively scaling images and by iteratively using a deep neural network for inferencing. A camera captures an image including faces and bodies of people. A face/body determiner determines faces and bodies of people appearing in the image by resizing the image into a predetermined pixel dimension as input to the deep neural network. A region cropper determines a crop region associated with a low level of confidence in detecting faces and bodies that are too small to determine with an acceptable level of confidence. The region cropper resizes the crop region into the predetermined pixel dimension as input to the deep neural network. The face and body determiner determines other faces and bodies appearing in the resized crop region. An aggregator aggregates locations of the determined faces and bodies in the image.

Type: Application

Filed: May 19, 2023

Publication date: September 14, 2023

Applicant: Microsoft Technology Licensing, LLC

Inventors: Hamidreza VAEZI JOZE, Zehua WEI
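
The flow described in the abstract above (detect on the resized full image, crop around low-confidence detections, resize and detect again, then aggregate) might be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the `Detector` interface, the 0.5 threshold, the padding value, the 320-pixel input size, and the pass limit are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


# Hypothetical detector interface: given a region of the original image and the
# fixed network input size, return detections in network-input coordinates.
Detector = Callable[[Box, int], List[Detection]]


def to_image_coords(box: Box, region: Box, size: int) -> Box:
    """Map a box from the size x size network input back into the original image."""
    rx0, ry0, rx1, ry1 = region
    sx, sy = (rx1 - rx0) / size, (ry1 - ry0) / size
    x0, y0, x1, y1 = box
    return (rx0 + x0 * sx, ry0 + y0 * sy, rx0 + x1 * sx, ry0 + y1 * sy)


def crop_region(boxes: List[Box], pad: float, full: Box) -> Box:
    """Padded bounding region around low-confidence boxes, clipped to the image."""
    return (
        max(full[0], min(b[0] for b in boxes) - pad),
        max(full[1], min(b[1] for b in boxes) - pad),
        min(full[2], max(b[2] for b in boxes) + pad),
        min(full[3], max(b[3] for b in boxes) + pad),
    )


def detect_multiscale(detect: Detector, full: Box, size: int = 320,
                      threshold: float = 0.5, pad: float = 30.0,
                      max_passes: int = 3) -> List[Detection]:
    """Run the detector repeatedly, zooming into regions it was unsure about."""
    region, accepted = full, []
    for _ in range(max_passes):
        found = [Detection(to_image_coords(d.box, region, size), d.score)
                 for d in detect(region, size)]
        # Aggregate confident detections in original-image coordinates.
        accepted.extend(d for d in found if d.score >= threshold)
        uncertain = [d.box for d in found if d.score < threshold]
        if not uncertain:
            break
        next_region = crop_region(uncertain, pad, full)
        if next_region == region:  # cannot zoom in any further
            break
        region = next_region
    return accepted
```

The sketch does not remove duplicates when a later pass finds a person who was already accepted in an earlier pass; a real aggregator would likely need some form of overlap suppression.
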
Adaptive multi-scale face and body detector

Patent number: 11700445

Abstract: Systems and methods are provided for determining faces and bodies of people in an image by adaptively scaling images and by iteratively using a deep neural network for inferencing. A camera captures an image including faces and bodies of people. A face/body determiner determines faces and bodies of people appearing in the image by resizing the image into a predetermined pixel dimension as input to the deep neural network. A region cropper determines a crop region associated with a low level of confidence in detecting faces and bodies that are too small to determine with an acceptable level of confidence. The region cropper resizes the crop region into the predetermined pixel dimension as input to the deep neural network. The face and body determiner determines other faces and bodies appearing in the resized crop region. An aggregator aggregates locations of the determined faces and bodies in the image.

Type: Grant

Filed: October 28, 2021

Date of Patent: July 11, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Hamidreza Vaezi Joze, Zehua Wei
Adaptive Token Sampling for Efficient Transformer

Publication number: 20230153379

Abstract: A transformer is described herein for using transformer-based technology to process data items (e.g., image items). The transformer increases the efficiency of the transformer-based technology by using a modified attention component. In operation, the modified attention component accepts embedding vectors that represent a plurality of item tokens, together with a classification token. A first stage of the modified attention component generates original attention information based on the embedding vectors. A second stage generates score information based on a portion of the original attention information that pertains to the classification token. A third stage produces modified attention information by removing attention values from the original attention information, as guided by a sampling operation that is performed on the score information. The second and third stages do not rely on machine-trained values, which expedites the deployment of these functions in existing transformers.

Type: Application

Filed: November 14, 2021

Publication date: May 18, 2023

Applicant: Microsoft Technology Licensing, LLC

Inventors: Mohsen FAYYAZ, Soroush ABBASI KOOHPAYEGANI, Eric Chris Wolfgang SOMMERLADE, Hamidreza VAEZI JOZE
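
The second and third stages described in the abstract above can be illustrated with a small parameter-free sketch. It assumes details the abstract does not state: the classification token occupies row and column 0 of the attention matrix, the score is simply the normalised classification-token row, sampling is inverse-transform sampling at evenly spaced quantiles, and "removing attention values" means dropping the rows of unselected tokens. The function names are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate
from typing import List, Tuple

Matrix = List[List[float]]


def token_scores(attention: Matrix) -> List[float]:
    """Stage 2: normalise the classification-token row, excluding its own column."""
    cls_row = attention[0][1:]
    total = sum(cls_row)
    return [a / total for a in cls_row]


def sample_tokens(scores: List[float], k: int) -> List[int]:
    """Pick tokens by inverse-transform sampling at k evenly spaced quantiles.

    Returns sorted, de-duplicated row indices into the attention matrix
    (offset by 1 because row 0 is the classification token).
    """
    cdf = list(accumulate(scores))
    picks = set()
    for i in range(k):
        u = (2 * i + 1) / (2 * k)
        j = min(bisect_left(cdf, u), len(cdf) - 1)  # guard against rounding
        picks.add(j + 1)
    return sorted(picks)


def prune_attention(attention: Matrix, k: int) -> Tuple[List[int], Matrix]:
    """Stage 3: keep the classification row plus the rows of sampled tokens."""
    keep = [0] + sample_tokens(token_scores(attention), k)
    return keep, [attention[r] for r in keep]
```

Because several quantiles can land on the same high-scoring token, the number of rows kept may be smaller than `k`; none of the steps uses learned weights.
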
ADAPTIVE MULTI-SCALE FACE AND BODY DETECTOR

Publication number: 20230133854

Abstract: Systems and methods are provided for determining faces and bodies of people in an image by adaptively scaling images and by iteratively using a deep neural network for inferencing. A camera captures an image including faces and bodies of people. A face/body determiner determines faces and bodies of people appearing in the image by resizing the image into a predetermined pixel dimension as input to the deep neural network. A region cropper determines a crop region associated with a low level of confidence in detecting faces and bodies that are too small to determine with an acceptable level of confidence. The region cropper resizes the crop region into the predetermined pixel dimension as input to the deep neural network. The face and body determiner determines other faces and bodies appearing in the resized crop region. An aggregator aggregates locations of the determined faces and bodies in the image.

Type: Application

Filed: October 28, 2021

Publication date: May 4, 2023

Applicant: Microsoft Technology Licensing, LLC

Inventors: Hamidreza VAEZI JOZE, Zehua WEI
SURFACE SPECTRAL REFLECTION ESTIMATION IN COMPUTER VISION

Publication number: 20220405521

Abstract: An image processor receives first image data representing an image. The first image data comprises a plurality of color values corresponding to a plurality of pixels in the image. The image processor determines, using a trained machine learning model, second image data based on the first image data. The second image data comprises surface spectral reflection values corresponding to the plurality of pixels in the image, where the surface spectral reflection values are distributed across a plurality of wavelengths of visible light in the image. The image processor then performs at least one image processing operation with respect to the image using the second image data.

Type: Application

Filed: June 21, 2021

Publication date: December 22, 2022

Applicant: Microsoft Technology Licensing, LLC

Inventor: Hamidreza Vaezi JOZE
Simultaneously correcting image degradations of multiple types in an image of a face

Patent number: 11526972

Abstract: Described herein are technologies related to correcting image degradations in images. An image is received, and values for features that are usable to correct for image degradation associated with blur, noise, and low light conditions are generated by separate encoders based upon the received image. A fusion network learned by way of network architecture search fuses these values, and the fused values are employed to generate an improved image, such that the image degradations associated with blur, noise, and low light conditions are simultaneously corrected.

Type: Grant

Filed: February 1, 2021

Date of Patent: December 13, 2022

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Hamidreza Vaezi Joze, Rajeev Yasarla
NEURAL NETWORK TARGET FEATURE DETECTION

Publication number: 20220358332

Abstract: A method of training a neural network for detecting target features in images is described. The neural network is trained using a first data set that includes labeled images, where at least some of the labeled images have subjects with labeled features, including: dividing each of the labeled images of the first data set into a respective plurality of tiles, and generating, for each of the plurality of tiles, a plurality of feature anchors that indicate target features within the corresponding tile. Target features that correspond to the plurality of feature anchors are detected in a second data set of unlabeled images. Images of the second data set having target features that were not detected are labeled. A third data set that includes the first data set and the labeled images of the second data set is generated. The neural network is trained using the third data set.

Type: Application

Filed: May 7, 2021

Publication date: November 10, 2022

Applicant: Microsoft Technology Licensing, LLC

Inventors: Hamidreza Vaezi JOZE, Vivek PRADEEP, Karthik VIJAYAN
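
The tiling step in the abstract above could look roughly like the sketch below. The abstract does not say how feature anchors are represented, so this example assumes, purely for illustration, that a labeled feature is a pixel coordinate and that each tile records the features falling inside it in tile-local coordinates. `tile_features` and its parameters are hypothetical names.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinate


def tile_features(width: int, height: int, tile: int,
                  features: List[Point]) -> Dict[Tuple[int, int], List[Point]]:
    """Group labeled feature points by (row, col) tile, in tile-local coordinates."""
    cols = -(-width // tile)   # ceiling division: partial tiles at the edge count
    rows = -(-height // tile)
    tiles: Dict[Tuple[int, int], List[Point]] = {
        (r, c): [] for r in range(rows) for c in range(cols)
    }
    for x, y in features:
        if 0 <= x < width and 0 <= y < height:  # ignore labels outside the image
            tiles[(y // tile, x // tile)].append((x % tile, y % tile))
    return tiles
```

For a 100 x 60 image with 50-pixel tiles, a feature at (75, 55) would be assigned to the tile in row 1, column 1 at local position (25, 5).
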
SIMULTANEOUSLY CORRECTING IMAGE DEGRADATIONS OF MULTIPLE TYPES IN AN IMAGE OF A FACE

Publication number: 20220245776

Abstract: Described herein are technologies related to correcting image degradations in images. An image is received, and values for features that are usable to correct for image degradation associated with blur, noise, and low light conditions are generated by separate encoders based upon the received image. A fusion network learned by way of network architecture search fuses these values, and the fused values are employed to generate an improved image, such that the image degradations associated with blur, noise, and low light conditions are simultaneously corrected.

Type: Application

Filed: February 1, 2021

Publication date: August 4, 2022

Inventors: Hamidreza VAEZI JOZE, Rajeev YASARLA
Location matched small segment fingerprint reader

Patent number: 11354934

Abstract: Electronic fingerprint readers are often used for security such as log-in authentication for the identification of a user for selective access to a computing system. As computing devices shrink in overall size and with downward pressure on device pricing, smaller and less expensive fingerprint readers are increasingly desired. While whole fingerprint readers have the greatest accuracy in user identification, the whole fingerprint is often not required for user identification. Often, only a portion of the user's fingerprint is required to adequately identify the user and thus a small segment fingerprint reader may be sufficient for user authentication. However, the smaller the sensing area of the small-segment fingerprint reader, the more likely that the fingerprint reader misidentifies the user or fails to collect sufficient information to identify the user. Systems and methods for improving identification accuracy of small-segment fingerprint readers are disclosed in detail herein.

Type: Grant

Filed: May 3, 2018

Date of Patent: June 7, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventor: Hamidreza Vaezi Joze
Face-speech bridging by cycle video/audio reconstruction

Patent number: 10931976

Abstract: In an embodiment described herein, a method for face-speech bridging by cycle video/audio reconstruction is described. The method comprises encoding audio data and video data via mutual autoencoders that comprise an audio autoencoder and a video autoencoder, wherein the mutual autoencoders share a common space with corresponding embeddings derived by each of the audio autoencoder and the video autoencoder. Additionally, the method comprises substituting embeddings from a non-corrupted modality for corresponding corrupted embeddings in a corrupted modality in real-time based at least in part on corrupted audio data or corrupted video data. The method also comprises synthesizing reconstructed audio data and reconstructed video data based on, at least in part, the substituted embeddings.

Type: Grant

Filed: October 14, 2019

Date of Patent: February 23, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Hamidreza Vaezi Joze, Hassan Akbari
Video recognition using multiple modalities

Patent number: 10929676

Abstract: Implementations described herein disclose a multi-modality video recognition system. Specifically, the multi-modality video recognition system is configured to train a plurality of classifier networks, each of the classifier networks trained with a different one of the plurality of video streams, wherein each of the plurality of different classifier networks includes multiple intermediate layers, determine correlation matrices of related intermediate layers of each of the plurality of the different classifier networks, and align the correlation matrices of the related intermediate layers of each of the plurality of the different classifier networks.

Type: Grant

Filed: February 27, 2019

Date of Patent: February 23, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Hamidreza Vaezi Joze, Mahdi Abavisani
Human action data set generation in a machine learning system

Patent number: 10679044

Abstract: Methods, apparatuses, and computer-readable mediums for generating human action data sets are disclosed by the present disclosure. In an aspect, an apparatus may receive a set of reference images, where each of the images within the set of reference images includes a person, and a background image. The apparatus may identify body parts of the person from the set of reference images and generate a transformed skeleton image by mapping each of the body parts of the person to corresponding skeleton parts of a target skeleton. The apparatus may generate a mask of the transformed skeleton image. The apparatus may generate, using machine learning, a frame of the person formed according to the target skeleton within the background image.

Type: Grant

Filed: March 23, 2018

Date of Patent: June 9, 2020

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Hamidreza Vaezi Joze, Ilya Zharkov, Vivek Pradeep, Mehran Khodabandeh
VIDEO RECOGNITION USING MULTIPLE MODALITIES

Publication number: 20200143169

Abstract: Implementations described herein disclose a multi-modality video recognition system. Specifically, the multi-modality video recognition system is configured to train a plurality of classifier networks, each of the classifier networks trained with a different one of the plurality of video streams, wherein each of the plurality of different classifier networks includes multiple intermediate layers, determine correlation matrices of related intermediate layers of each of the plurality of the different classifier networks, and align the correlation matrices of the related intermediate layers of each of the plurality of the different classifier networks.

Type: Application

Filed: February 27, 2019

Publication date: May 7, 2020

Inventors: Hamidreza VAEZI JOZE, Mahdi ABAVISANI
LOCATION MATCHED SMALL SEGMENT FINGERPRINT READER

Publication number: 20190340414

Abstract: Electronic fingerprint readers are often used for security such as log-in authentication for the identification of a user for selective access to a computing system. As computing devices shrink in overall size and with downward pressure on device pricing, smaller and less expensive fingerprint readers are increasingly desired. While whole fingerprint readers have the greatest accuracy in user identification, the whole fingerprint is often not required for user identification. Often, only a portion of the user's fingerprint is required to adequately identify the user and thus a small segment fingerprint reader may be sufficient for user authentication. However, the smaller the sensing area of the small-segment fingerprint reader, the more likely that the fingerprint reader misidentifies the user or fails to collect sufficient information to identify the user. Systems and methods for improving identification accuracy of small-segment fingerprint readers are disclosed in detail herein.

Type: Application

Filed: May 3, 2018

Publication date: November 7, 2019

Inventor: Hamidreza VAEZI JOZE
HUMAN ACTION DATA SET GENERATION IN A MACHINE LEARNING SYSTEM

Publication number: 20190294871

Abstract: Methods, apparatuses, and computer-readable mediums for generating human action data sets are disclosed by the present disclosure. In an aspect, an apparatus may receive a set of reference images, where each of the images within the set of reference images includes a person, and a background image. The apparatus may identify body parts of the person from the set of reference images and generate a transformed skeleton image by mapping each of the body parts of the person to corresponding skeleton parts of a target skeleton. The apparatus may generate a mask of the transformed skeleton image. The apparatus may generate, using machine learning, a frame of the person formed according to the target skeleton within the background image.

Type: Application

Filed: March 23, 2018

Publication date: September 26, 2019

Inventors: Hamidreza VAEZI JOZE, Ilya ZHARKOV, Vivek PRADEEP, Mehran KHODABANDEH