Patents by Inventor Hamidreza VAEZI JOZE

Hamidreza VAEZI JOZE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230291993
    Abstract: Systems and methods are provided for determining faces and bodies of people in an image by adaptively scaling images and by iteratively using a deep neural network for inferencing. A camera captures an image including faces and bodies of people. A face/body determiner determines faces and bodies of people appearing in the image by resizing the image into a predetermined pixel dimension as input to the deep neural network. A region cropper determines a crop region associated with a low level of confidence in detecting faces and bodies that are too small to determine with an acceptable level of confidence. The region cropper resizes the crop region into the predetermined pixel dimension as input to the deep neural network. The face and body determiner determines other faces and bodies appearing in the resized crop region. An aggregator aggregates locations of the determined faces and bodies in the image.
    Type: Application
    Filed: May 19, 2023
    Publication date: September 14, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza VAEZI JOZE, Zehua WEI
  • Patent number: 11700445
    Abstract: Systems and methods are provided for determining faces and bodies of people in an image by adaptively scaling images and by iteratively using a deep neural network for inferencing. A camera captures an image including faces and bodies of people. A face/body determiner determines faces and bodies of people appearing in the image by resizing the image into a predetermined pixel dimension as input to the deep neural network. A region cropper determines a crop region associated with a low level of confidence in detecting faces and bodies that are too small to determine with an acceptable level of confidence. The region cropper resizes the crop region into the predetermined pixel dimension as input to the deep neural network. The face and body determiner determines other faces and bodies appearing in the resized crop region. An aggregator aggregates locations of the determined faces and bodies in the image.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: July 11, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza Vaezi Joze, Zehua Wei
  • Publication number: 20230153379
    Abstract: A transformer is described herein for using transformer-based technology to process data items (e.g., image items). The transformer increases the efficiency of the transformer-based technology by using a modified attention component. In operation, the modified attention component accepts embedding vectors that represent a plurality of item tokens, together with a classification token. A first stage of the modified attention component generates original attention information based on the embedding vectors. A second stage generates score information based on a portion of the original attention information that pertains to the classification token. A third stage produces modified attention information by removing attention values from the original attention information, as guided by a sampling operation that is performed on the score information. The second and third stages do not rely on machine-trained values, which expedites the deployment of these functions in existing transformers.
    Type: Application
    Filed: November 14, 2021
    Publication date: May 18, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Mohsen FAYYAZ, Soroush ABBASI KOOHPAYEGANI, Eric Chris Wolfgang SOMMERLADE, Hamidreza VAEZI JOZE
  • Publication number: 20230133854
    Abstract: Systems and methods are provided for determining faces and bodies of people in an image by adaptively scaling images and by iteratively using a deep neural network for inferencing. A camera captures an image including faces and bodies of people. A face/body determiner determines faces and bodies of people appearing in the image by resizing the image into a predetermined pixel dimension as input to the deep neural network. A region cropper determines a crop region associated with a low level of confidence in detecting faces and bodies that are too small to determine with an acceptable level of confidence. The region cropper resizes the crop region into the predetermined pixel dimension as input to the deep neural network. The face and body determiner determines other faces and bodies appearing in the resized crop region. An aggregator aggregates locations of the determined faces and bodies in the image.
    Type: Application
    Filed: October 28, 2021
    Publication date: May 4, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza VAEZI JOZE, Zehua WEI
  • Publication number: 20220405521
    Abstract: An image processor receives first image data representing an image. The first image data comprises a plurality of color values corresponding to a plurality of pixels in the image. The image processor determines, using a trained machine learning model, second image data based on the first image data. The second image data comprises surface spectral reflection values corresponding to the plurality of pixels in the image, where the surface spectral reflection values are distributed across a plurality of wavelengths of visible light in the image. The image processor then performs at least one image processing operation with respect to the image using the second image data.
    Type: Application
    Filed: June 21, 2021
    Publication date: December 22, 2022
    Applicant: Microsoft Technology Licensing, LLC
    Inventor: Hamidreza VAEZI JOZE
  • Patent number: 11526972
    Abstract: Described herein are technologies related to correcting image degradations in images. An image is received, and values for features that are usable to correct for image degradation associated with blur, noise, and low light conditions are generated by separate encoders based upon the received image. A fusion network learned by way of network architecture search fuses these values, and the fused values are employed to generate an improved image, such that the image degradations associated with blur, noise, and low light conditions are simultaneously corrected.
    Type: Grant
    Filed: February 1, 2021
    Date of Patent: December 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza Vaezi Joze, Rajeev Yasarla
  • Publication number: 20220358332
    Abstract: A method of training a neural network for detecting target features in images is described. The neural network is trained using a first data set that includes labeled images, where at least some of the labeled images have subjects with labeled features, including: dividing each of the labeled images of the first data set into a respective plurality of tiles, and generating, for each of the plurality of tiles, a plurality of feature anchors that indicate target features within the corresponding tile. Target features that correspond to the plurality of feature anchors are detected in a second data set of unlabeled images. Images of the second data set having target features that were not detected are labeled. A third data set that includes the first data set and the labeled images of the second data set is generated. The neural network is trained using the third data set.
    Type: Application
    Filed: May 7, 2021
    Publication date: November 10, 2022
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza VAEZI JOZE, Vivek PRADEEP, Karthik VIJAYAN
  • Publication number: 20220245776
    Abstract: Described herein are technologies related to correcting image degradations in images. An image is received, and values for features that are usable to correct for image degradation associated with blur, noise, and low light conditions are generated by separate encoders based upon the received image. A fusion network learned by way of network architecture search fuses these values, and the fused values are employed to generate an improved image, such that the image degradations associated with blur, noise, and low light conditions are simultaneously corrected.
    Type: Application
    Filed: February 1, 2021
    Publication date: August 4, 2022
    Inventors: Hamidreza VAEZI JOZE, Rajeev YASARLA
  • Patent number: 11354934
    Abstract: Electronic fingerprint readers are often used for security such as log-in authentication for the identification of a user for selective access to a computing system. As computing devices shrink in overall size and with downward pressure on device pricing, smaller and less expensive fingerprint readers are increasingly desired. While whole fingerprint readers have the greatest accuracy in user identification, the whole fingerprint is often not required for user identification. Often, only a portion of the user's fingerprint is required to adequately identify the user and thus a small segment fingerprint reader may be sufficient for user authentication. However, the smaller the sensing area of the small-segment fingerprint reader, the more likely that the fingerprint reader misidentifies the user or fails to collect sufficient information to identify the user. Systems and methods for improving identification accuracy of small-segment fingerprint readers are disclosed in detail herein.
    Type: Grant
    Filed: May 3, 2018
    Date of Patent: June 7, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Hamidreza Vaezi Joze
  • Patent number: 10931976
    Abstract: In an embodiment described herein, a method for face-speech bridging by cycle video/audio reconstruction is described. The method comprises encoding audio data and video data via mutual autoencoders that comprise an audio autoencoder and a video autoencoder, wherein the mutual autoencoders share a common space with corresponding embeddings derived by each of the audio autoencoder and the video autoencoder. Additionally, the method comprises substituting embeddings from a non-corrupted modality for corresponding corrupted embeddings in a corrupted modality in real-time based at least in part on corrupted audio data or corrupted video data. The method also comprises synthesizing reconstructed audio data and reconstructed video data based on, at least in part, the substituted embeddings.
    Type: Grant
    Filed: October 14, 2019
    Date of Patent: February 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza Vaezi Joze, Hassan Akbari
  • Patent number: 10929676
    Abstract: Implementations described herein disclose a multi-modality video recognition system. Specifically, the multi-modality video recognition system is configured to train a plurality of classifier networks, each of the classifier networks trained with a different one of the plurality of video streams, wherein each of the plurality of different classifier networks includes multiple intermediate layers, determine correlation matrices of related intermediate layers of each of the plurality of the different classifier networks, and align the correlation matrices of the related intermediate layers of each of the plurality of the different classifier networks.
    Type: Grant
    Filed: February 27, 2019
    Date of Patent: February 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza Vaezi Joze, Mahdi Abavisani
  • Patent number: 10679044
    Abstract: Methods, apparatuses, and computer-readable mediums for generating human action data sets are disclosed by the present disclosure. In an aspect, an apparatus may receive a set of reference images, where each of the images within the set of reference images includes a person, and a background image. The apparatus may identify body parts of the person from the set of reference images and generate a transformed skeleton image by mapping each of the body parts of the person to corresponding skeleton parts of a target skeleton. The apparatus may generate a mask of the transformed skeleton image. The apparatus may generate, using machine learning, a frame of the person formed according to the target skeleton within the background image.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: June 9, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Hamidreza Vaezi Joze, Ilya Zharkov, Vivek Pradeep, Mehran Khodabandeh
  • Publication number: 20200143169
    Abstract: Implementations described herein disclose a multi-modality video recognition system. Specifically, the multi-modality video recognition system is configured to train a plurality of classifier networks, each of the classifier networks trained with a different one of the plurality of video streams, wherein each of the plurality of different classifier networks includes multiple intermediate layers, determine correlation matrices of related intermediate layers of each of the plurality of the different classifier networks, and align the correlation matrices of the related intermediate layers of each of the plurality of the different classifier networks.
    Type: Application
    Filed: February 27, 2019
    Publication date: May 7, 2020
    Inventors: Hamidreza VAEZI JOZE, Mahdi ABAVISANI
  • Publication number: 20190340414
    Abstract: Electronic fingerprint readers are often used for security such as log-in authentication for the identification of a user for selective access to a computing system. As computing devices shrink in overall size and with downward pressure on device pricing, smaller and less expensive fingerprint readers are increasingly desired. While whole fingerprint readers have the greatest accuracy in user identification, the whole fingerprint is often not required for user identification. Often, only a portion of the user's fingerprint is required to adequately identify the user and thus a small segment fingerprint reader may be sufficient for user authentication. However, the smaller the sensing area of the small-segment fingerprint reader, the more likely that the fingerprint reader misidentifies the user or fails to collect sufficient information to identify the user. Systems and methods for improving identification accuracy of small-segment fingerprint readers are disclosed in detail herein.
    Type: Application
    Filed: May 3, 2018
    Publication date: November 7, 2019
    Inventor: Hamidreza VAEZI JOZE
  • Publication number: 20190294871
    Abstract: Methods, apparatuses, and computer-readable mediums for generating human action data sets are disclosed by the present disclosure. In an aspect, an apparatus may receive a set of reference images, where each of the images within the set of reference images includes a person, and a background image. The apparatus may identify body parts of the person from the set of reference images and generate a transformed skeleton image by mapping each of the body parts of the person to corresponding skeleton parts of a target skeleton. The apparatus may generate a mask of the transformed skeleton image. The apparatus may generate, using machine learning, a frame of the person formed according to the target skeleton within the background image.
    Type: Application
    Filed: March 23, 2018
    Publication date: September 26, 2019
    Inventors: Hamidreza VAEZI JOZE, Ilya ZHARKOV, Vivek PRADEEP, Mehran KHODABANDEH
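
Illustrative sketch (not taken from the patent text): the adaptive-scaling abstract shared by patent 11700445 and publications 20230291993 and 20230133854 describes a detect, crop, re-detect, aggregate loop. The Python below is one rough way to read that loop, under stated assumptions. Everything named here is hypothetical: the `Detection` type, the `detect` callback (standing in for "resize the region to the fixed input size and run the deep neural network", returning boxes normalized to that region), the confidence `threshold`, and the padding `margin`. Building the crop region as the padded union of low-confidence boxes is also an assumption; the abstract does not say how the region is chosen.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Detection:
    box: Box           # normalized to a region when returned by `detect`
    confidence: float  # detector score in [0, 1]


def to_image_coords(norm_box: Box, region: Box) -> Box:
    """Map a box normalized to `region` back into image pixel coordinates."""
    rx, ry, rw, rh = region
    nx, ny, nw, nh = norm_box
    return (rx + nx * rw, ry + ny * rh, nw * rw, nh * rh)


def crop_region(uncertain: List[Detection], image_size: Tuple[int, int],
                margin: float = 1.0) -> Box:
    """Padded union of the low-confidence boxes, clamped to the image (assumed rule)."""
    width, height = image_size
    x0 = min(d.box[0] for d in uncertain)
    y0 = min(d.box[1] for d in uncertain)
    x1 = max(d.box[0] + d.box[2] for d in uncertain)
    y1 = max(d.box[1] + d.box[3] for d in uncertain)
    pad = margin * max(max(d.box[2], d.box[3]) for d in uncertain)
    x0, y0 = max(0.0, x0 - pad), max(0.0, y0 - pad)
    x1, y1 = min(float(width), x1 + pad), min(float(height), y1 + pad)
    return (x0, y0, x1 - x0, y1 - y0)


def detect_adaptive(detect: Callable[[Box], List[Detection]],
                    image_size: Tuple[int, int],
                    threshold: float = 0.5,
                    max_passes: int = 2) -> List[Detection]:
    """Run the detector on the whole image, then again on low-confidence crops."""
    width, height = image_size
    region: Box = (0.0, 0.0, float(width), float(height))
    accepted: List[Detection] = []
    for _ in range(max_passes):
        found = [Detection(to_image_coords(d.box, region), d.confidence)
                 for d in detect(region)]
        # Aggregate confident detections in image coordinates.
        # Deduplication across passes (e.g. non-maximum suppression) is omitted.
        accepted.extend(d for d in found if d.confidence >= threshold)
        uncertain = [d for d in found if d.confidence < threshold]
        if not uncertain:
            break
        # Zoom in: small subjects occupy more pixels once the crop is rescaled.
        region = crop_region(uncertain, image_size)
    return accepted
```

In this sketch a small, low-scoring subject is given a second chance at a larger effective scale, because the crop is rescaled to the same fixed input size as the full image. A real system would supply its own detector and would likely merge overlapping boxes found in different passes.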