Patents by Inventor Mostafa El-Khamy

Mostafa El-Khamy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11645869
    Abstract: A system to recognize objects in an image includes an object detection network that outputs a first hierarchical-calculated feature for a detected object. A face alignment regression network determines a regression loss for alignment parameters based on the first hierarchical-calculated feature. A detection box regression network determines a regression loss for detected boxes based on the first hierarchical-calculated feature. The object detection network further includes a weighted loss generator to generate a weighted loss for the first hierarchical-calculated feature, the regression loss for the alignment parameters, and the regression loss for the detected boxes. A backpropagator backpropagates the generated weighted loss.
    Type: Grant
    Filed: March 3, 2020
    Date of Patent: May 9, 2023
    Inventors: Mostafa El-Khamy, Arvind Yedla, Marcel Nassar, Jungwon Lee
  • Publication number: 20230139004
    Abstract: A method includes receiving a binary annotation of source text; performing a close operation on the binary annotation to generate a closed annotation using an initial kernel size; defining one or more contours in the closed annotation using one or more bounding boxes, respectively; determining a subset of the one or more contours for which a percentage of area occupied by text within a corresponding bounding box exceeds a threshold; and generating a final annotation of the source text based on the subset of the one or more contours.
    Type: Application
    Filed: March 23, 2022
    Publication date: May 4, 2023
    Inventors: Andrea D. Kang, Jinhong Wu, Mostafa El-Khamy
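The steps in this abstract (morphological close, per-contour bounding boxes, area-ratio filtering) can be sketched in pure Python. Everything below is an illustrative assumption, not the patent's implementation: the binary mask is a list of rows, the closing kernel is a square of radius k, and contours are 4-connected components.

```python
# Sketch: close a binary text mask, box each contour, keep boxes whose
# text-area ratio exceeds a threshold. All names/sizes are illustrative.

def dilate(mask, k=1):
    """A cell becomes 1 if any cell within distance k is 1."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if any(mask[y][x]
                   for y in range(max(0, i - k), min(h, i + k + 1))
                   for x in range(max(0, j - k), min(w, j + k + 1))):
                out[i][j] = 1
    return out

def erode(mask, k=1):
    """A cell stays 1 only if every cell within distance k is 1."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if all(mask[y][x]
                   for y in range(max(0, i - k), min(h, i + k + 1))
                   for x in range(max(0, j - k), min(w, j + k + 1))):
                out[i][j] = 1
    return out

def close_op(mask, k=1):
    # Morphological close = dilation followed by erosion.
    return erode(dilate(mask, k), k)

def bounding_boxes(mask):
    """(top, left, bottom, right) box of each 4-connected component."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                stack, t, l, b, r = [(i, j)], i, j, i, j
                seen[i][j] = True
                while stack:
                    y, x = stack.pop()
                    t, l, b, r = min(t, y), min(l, x), max(b, y), max(r, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                boxes.append((t, l, b, r))
    return boxes

def refine_annotation(binary, threshold=0.5, k=1):
    """Keep contours whose boxes are mostly filled by original text pixels."""
    kept = []
    for (t, l, b, r) in bounding_boxes(close_op(binary, k)):
        box_area = (b - t + 1) * (r - l + 1)
        text_area = sum(binary[y][x] for y in range(t, b + 1) for x in range(l, r + 1))
        if text_area / box_area > threshold:
            kept.append((t, l, b, r))
    return kept
```

In practice a library such as OpenCV (`cv2.morphologyEx`, `cv2.findContours`, `cv2.boundingRect`) would replace the hand-rolled morphology and flood fill; the loop structure above only mirrors the order of operations the abstract describes.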
  • Publication number: 20230123254
    Abstract: A method for computing a dominant class of a scene includes: receiving an input image of a scene; generating a segmentation map of the input image, the segmentation map being labeled with a plurality of corresponding classes of a plurality of classes; computing a plurality of area ratios based on the segmentation map, each of the area ratios corresponding to a different class of the plurality of classes of the segmentation map; and outputting a detected dominant class of the scene based on a plurality of ranked labels based on the area ratios.
    Type: Application
    Filed: December 16, 2022
    Publication date: April 20, 2023
    Inventors: Qingfeng Liu, Mostafa El-Khamy, Rama Mythili Vadali, Tae-ui Kim, Andrea Kang, Dongwoon Bai, Jungwon Lee, Maiyuran Wijay, Jaewon Yoo
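The area-ratio ranking in this abstract reduces to a few lines once a segmentation map exists. A minimal sketch, assuming the map is a list of label rows (the representation and function name are mine, not the patent's):

```python
# Sketch: compute per-class area ratios of a segmentation map, rank the
# labels by ratio, and return the top-ranked (dominant) class.
from collections import Counter

def dominant_class(segmentation_map):
    counts = Counter(label for row in segmentation_map for label in row)
    total = sum(counts.values())
    ratios = {label: n / total for label, n in counts.items()}
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    return ranked[0], ratios
```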
  • Publication number: 20230116893
    Abstract: A method of depth detection based on a plurality of video frames includes receiving a plurality of input frames including a first input frame, a second input frame, and a third input frame respectively corresponding to different capture times, convolving the first to third input frames to generate a first feature map, a second feature map, and a third feature map corresponding to the different capture times, calculating a temporal attention map based on the first to third feature maps, the temporal attention map including a plurality of weights corresponding to different pairs of feature maps from among the first to third feature maps, each weight of the plurality of weights indicating a similarity level of a corresponding pair of feature maps, and applying the temporal attention map to the first to third feature maps to generate a feature map with temporal attention.
    Type: Application
    Filed: December 13, 2022
    Publication date: April 13, 2023
    Inventors: Haoyu Ren, Mostafa El Khamy, Jungwon Lee
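The core of this abstract is a map of weights, each measuring how similar a pair of per-frame feature maps is. A toy sketch, with feature maps flattened to vectors, cosine similarity standing in for the patent's (unspecified here) similarity measure, and similarity taken against one reference frame rather than all pairs:

```python
# Sketch: weight each frame's features by their similarity to a reference
# frame's features, then re-scale the features by those weights.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def temporal_attention(feature_maps, reference=-1):
    """Returns (weights, feature maps with temporal attention applied)."""
    ref = feature_maps[reference]
    weights = [cosine(f, ref) for f in feature_maps]
    attended = [[w * x for x in f] for w, f in zip(weights, feature_maps)]
    return weights, attended
```

The intent is that frames whose features disagree with the reference (e.g. due to motion or occlusion) contribute less to the aggregated feature map.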
  • Publication number: 20230104127
    Abstract: A system and a method are disclosed for receiving an input image, using a domain invariant machine learning model to compute an output based on the input image, wherein the domain invariant machine learning model is trained using domain invariant regularization, and displaying information based on the output.
    Type: Application
    Filed: September 14, 2022
    Publication date: April 6, 2023
    Inventors: Behnam BABAGHOLAMI MOHAMADABADI, Mostafa EL-KHAMY, Kee-Bong SONG
  • Patent number: 11620555
    Abstract: A method and system are herein disclosed. The method includes developing a joint latent variable model having a first variable, a second variable, and a joint latent variable representing common information between the first and second variables, generating a variational posterior of the joint latent variable model, training the variational posterior, and performing inference of the first variable from the second variable based on the variational posterior.
    Type: Grant
    Filed: April 3, 2019
    Date of Patent: April 4, 2023
    Inventors: Jongha Ryu, Yoo Jin Choi, Mostafa El-Khamy, Jungwon Lee
  • Patent number: 11615317
    Abstract: A system and method for operating a neural network. In some embodiments, the neural network includes a variational autoencoder, and the training of the neural network includes training the variational autoencoder with a plurality of samples of a first random variable; and a plurality of samples of a second random variable, the plurality of samples of the first random variable and the plurality of samples of the second random variable being unpaired, the training of the neural network including updating weights in the neural network based on a first loss function, the first loss function being based on a measure of deviation from consistency between: a conditional generation path from the first random variable to the second random variable, and a conditional generation path from the second random variable to the first random variable.
    Type: Grant
    Filed: May 28, 2020
    Date of Patent: March 28, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yoo Jin Choi, Jongha Ryu, Mostafa El-Khamy, Jungwon Lee, Young-Han Kim
  • Patent number: 11599979
    Abstract: A method and an apparatus are provided. The method includes receiving a video with a first plurality of frames having a first resolution; generating a plurality of warped frames from the first plurality of frames based on a first type of motion compensation; generating a second plurality of frames having a second resolution, wherein the second resolution is of higher resolution than the first resolution, wherein each of the second plurality of frames having the second resolution is derived from a subset of the plurality of warped frames using a convolutional network; and generating a third plurality of frames having the second resolution based on a second type of motion compensation, wherein each of the third plurality of frames having the second resolution is derived by fusing a subset of the second plurality of frames.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: March 7, 2023
    Inventors: Mostafa El-Khamy, Haoyu Ren, Jungwon Lee
  • Publication number: 20230050573
    Abstract: Apparatuses and methods are provided for training a feature extraction model by determining a loss function for use in unsupervised image segmentation. A method includes determining a clustering loss from an image; determining a weakly supervised contrastive loss of the image using cluster pseudo labels based on the clustering loss; and determining the loss function based on the clustering loss and the weakly supervised contrastive loss.
    Type: Application
    Filed: May 26, 2022
    Publication date: February 16, 2023
    Inventors: Qingfeng LIU, Mostafa EL-KHAMY, Yuewei YANG
  • Publication number: 20230006692
    Abstract: A method and apparatus for variable rate compression with a conditional autoencoder is herein provided. According to one embodiment, a method for compression includes receiving a first image and a first scheme as inputs for an autoencoder network; determining a first Lagrange multiplier based on the first scheme; and using the first image and the first Lagrange multiplier as inputs, computing a second image from the autoencoder network. The autoencoder network is trained using a plurality of Lagrange multipliers and a second image as training inputs.
    Type: Application
    Filed: September 9, 2022
    Publication date: January 5, 2023
    Inventors: Yoo Jin CHOI, Mostafa El-Khamy, Jungwon Lee
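The objective behind this abstract is the standard rate-distortion Lagrangian, with the twist that a single autoencoder is trained across many Lagrange multipliers and also receives the multiplier as a conditioning input, so one model covers multiple rate points. A hedged sketch of just the loss side (the distortion/rate values and function names are illustrative; the patent's networks are not reproduced here):

```python
# Sketch: rate-distortion Lagrangian L = D + lambda * R, enumerated over
# several Lagrange multipliers as in multi-rate training.

def rd_loss(distortion, rate, lagrange_multiplier):
    """Rate-distortion trade-off: larger lambda penalizes rate more."""
    return distortion + lagrange_multiplier * rate

def training_losses(examples, multipliers):
    """One loss term per (example, lambda) pair; in training, lambda is
    also fed to the conditional autoencoder as an input."""
    return [rd_loss(d, r, lam) for (d, r) in examples for lam in multipliers]
```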
  • Patent number: 11532154
    Abstract: A method for computing a dominant class of a scene includes: receiving an input image of a scene; generating a segmentation map of the input image, the segmentation map being labeled with a plurality of corresponding classes of a plurality of classes; computing a plurality of area ratios based on the segmentation map, each of the area ratios corresponding to a different class of the plurality of classes of the segmentation map; and outputting a detected dominant class of the scene based on a plurality of ranked labels based on the area ratios.
    Type: Grant
    Filed: February 17, 2021
    Date of Patent: December 20, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Qingfeng Liu, Mostafa El-Khamy, Rama Mythili Vadali, Tae-ui Kim, Andrea Kang, Dongwoon Bai, Jungwon Lee, Maiyuran Wijay, Jaewon Yoo
  • Patent number: 11527005
    Abstract: A method of depth detection based on a plurality of video frames includes receiving a plurality of input frames including a first input frame, a second input frame, and a third input frame respectively corresponding to different capture times, convolving the first to third input frames to generate a first feature map, a second feature map, and a third feature map corresponding to the different capture times, calculating a temporal attention map based on the first to third feature maps, the temporal attention map including a plurality of weights corresponding to different pairs of feature maps from among the first to third feature maps, each weight of the plurality of weights indicating a similarity level of a corresponding pair of feature maps, and applying the temporal attention map to the first to third feature maps to generate a feature map with temporal attention.
    Type: Grant
    Filed: April 6, 2020
    Date of Patent: December 13, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Haoyu Ren, Mostafa El-Khamy, Jungwon Lee
  • Patent number: 11526970
    Abstract: A system and method for processing an input video while maintaining temporal consistency across video frames is provided. The method includes converting the input video from a first frame rate to a second frame rate, wherein the second frame rate is a faster frame rate than the first frame rate; generating processed frames of the input video at the second frame rate; and aggregating the processed frames using temporal sliding window aggregation to yield a processed output video at a third frame rate.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: December 13, 2022
    Inventors: Mostafa El-Khamy, Ryan Szeto, Jungwon Lee
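The three stages in this abstract (frame-rate up-conversion, per-frame processing, temporal sliding-window aggregation) can be sketched with scalars standing in for frames. The naive repeat-based rate conversion and mean aggregation are simplifying assumptions; the point is only the pipeline shape, in which averaging over a temporal window suppresses per-frame flicker.

```python
# Sketch: convert to a faster frame rate, then aggregate each frame with
# its temporal neighborhood via a sliding window.

def to_higher_rate(frames, factor):
    """Naive rate conversion by repeating each frame `factor` times."""
    return [f for f in frames for _ in range(factor)]

def sliding_window_aggregate(frames, window):
    """Average each frame with its `window`-sized temporal neighborhood."""
    half = window // 2
    out = []
    for i in range(len(frames)):
        lo, hi = max(0, i - half), min(len(frames), i + half + 1)
        out.append(sum(frames[lo:hi]) / (hi - lo))
    return out
```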
  • Publication number: 20220391632
    Abstract: A computer vision (CV) training system, includes: a supervised learning system to estimate a supervision output from one or more input images according to a target CV application, and to determine a supervised loss according to the supervision output and a ground-truth of the supervision output; an unsupervised learning system to determine an unsupervised loss according to the supervision output and the one or more input images; a weakly supervised learning system to determine a weakly supervised loss according to the supervision output and a weak label corresponding to the one or more input images; and a joint optimizer to concurrently optimize the supervised loss, the unsupervised loss, and the weakly supervised loss.
    Type: Application
    Filed: August 17, 2022
    Publication date: December 8, 2022
    Inventors: Haoyu Ren, Mostafa El-Khamy, Jungwon Lee, Aman Raj
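The "concurrently optimize" step in this abstract amounts to minimizing one combined objective over the three loss terms. A minimal sketch; the weighting coefficients are my assumption, since the abstract does not specify how the terms are balanced:

```python
# Sketch: joint objective over supervised, unsupervised, and weakly
# supervised losses, so one optimizer step updates against all three.

def joint_loss(supervised, unsupervised, weakly_supervised,
               w_sup=1.0, w_unsup=1.0, w_weak=1.0):
    return (w_sup * supervised
            + w_unsup * unsupervised
            + w_weak * weakly_supervised)
```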
  • Patent number: 11521634
    Abstract: A method for performing echo cancellation includes: receiving a far-end signal from a far-end device at a near-end device; recording a microphone signal at the near-end device including: a near-end signal; and an echo signal corresponding to the far-end signal; extracting far-end features from the far-end signal; extracting microphone features from the microphone signal; computing estimated near-end features by supplying the microphone features and the far-end features to an acoustic echo cancellation module including: an echo estimator including a first stack of a recurrent neural network configured to compute estimated echo features based on the far-end features; and a near-end estimator including a second stack of the recurrent neural network configured to compute the estimated near-end features based on an output of the first stack and the microphone signal; computing an estimated near-end signal from the estimated near-end features; and transmitting the estimated near-end signal to the far-end device.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: December 6, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Amin Fazeli, Mostafa El-Khamy, Jungwon Lee
  • Patent number: 11461998
    Abstract: Some aspects of embodiments of the present disclosure relate to using a boundary aware loss function to train a machine learning model for computing semantic segmentation maps from input images. Some aspects of embodiments of the present disclosure relate to deep convolutional neural networks (DCNNs) for computing semantic segmentation maps from input images, where the DCNNs include a box filtering layer configured to box filter input feature maps computed from the input images before supplying box filtered feature maps to an atrous spatial pyramidal pooling (ASPP) layer. Some aspects of embodiments of the present disclosure relate to a selective ASPP layer configured to weight the outputs of an ASPP layer in accordance with attention feature maps.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: October 4, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Qingfeng Liu, Mostafa El-Khamy, Dongwoon Bai, Jungwon Lee
  • Patent number: 11456007
    Abstract: A method and system for providing end-to-end multi-task denoising for joint signal distortion ratio (SDR) and perceptual evaluation of speech quality (PESQ) optimization is herein disclosed. According to one embodiment, a method includes receiving a noisy signal, generating a denoised output signal, determining a signal distortion ratio (SDR) loss function based on the denoised output signal, determining a perceptual evaluation of speech quality (PESQ) loss function based on the denoised output signal, and optimizing an overall loss function based on the PESQ loss function and the SDR loss function.
    Type: Grant
    Filed: June 25, 2019
    Date of Patent: September 27, 2022
    Inventors: Jaeyoung Kim, Mostafa El-Khamy, Jungwon Lee
  • Publication number: 20220300819
    Abstract: Apparatuses and methods of manufacturing same, systems, and methods are described. In one aspect, a method includes generating a convolutional neural network (CNN) by training a CNN having a plurality of convolutional layers, and performing cascade training on the trained CNN. The cascade training includes an iterative process of a plurality of stages, in which each stage includes inserting a residual block (ResBlock) and training the CNN with the inserted ResBlock.
    Type: Application
    Filed: May 27, 2022
    Publication date: September 22, 2022
    Inventors: Haoyu REN, Mostafa EL-KHAMY, Jungwon LEE
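The cascade-training loop this abstract describes has a simple skeleton: train a base CNN, then repeatedly insert a residual block and retrain the deepened network. In the sketch below the model is just a list of layer names and `train` is a stub; both are assumptions standing in for an actual architecture and optimization pass.

```python
# Sketch: cascade training as an iterate of (insert ResBlock, retrain).

def train(layers):
    # Stand-in for a full optimization pass over the current architecture.
    return list(layers)

def cascade_train(base_layers, num_stages):
    model = train(base_layers)          # train the base CNN first
    for _ in range(num_stages):
        model.append("ResBlock")        # stage: insert a residual block...
        model = train(model)            # ...then retrain with it in place
    return model
```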
  • Publication number: 20220301296
    Abstract: A system and a method to train a neural network are disclosed. A first image is weakly and strongly augmented. The first image, the weakly and strongly augmented first images are input into a feature extractor to obtain augmented features. Each weakly augmented first image is input to a corresponding first expert head to determine a supervised loss for each weakly augmented first image. Each strongly augmented first image is input to a corresponding second expert head to determine a diversity loss for each strongly augmented first image. The feature extractor is trained to minimize the supervised loss on weakly augmented first images and to minimize a multi-expert consensus loss on strongly augmented first images. Each first expert head is trained to minimize the supervised loss for each weakly augmented first image, and each second expert head is trained to minimize the diversity loss for each strongly augmented first image.
    Type: Application
    Filed: February 17, 2022
    Publication date: September 22, 2022
    Inventors: Behnam BABAGHOLAMI MOHAMADABADI, Qingfeng LIU, Mostafa EL-KHAMY, Jungwon LEE
  • Publication number: 20220301128
    Abstract: A method of image processing includes: determining a first feature, wherein the first feature has a dimensionality D1; determining a second feature, wherein the second feature has a dimensionality D2 and is based on an output of a feature extraction network; generating a third feature by processing the first feature, the third feature having a dimensionality D3; generating a guidance by processing the second feature, the guidance having the dimensionality D3; generating a filter output by applying a deep guided filter (DGF) to the third feature using the guidance; generating a map based on the filter output; and outputting a processed image based on the map.
    Type: Application
    Filed: December 27, 2021
    Publication date: September 22, 2022
    Inventors: Qingfeng LIU, Hai SU, Mostafa EL-KHAMY