Patents by Inventor Quanfu Fan

Quanfu Fan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11915474
    Abstract: Techniques and apparatus for analyzing visual content using a visual transformer are described. An example technique includes generating a first set of tokens based on a visual content item. Each token in the first set of tokens is associated with a regional feature from a different region of a plurality of regions of the visual content item. A second set of tokens is generated based on the visual content item. Each token in the second set of tokens is associated with a local feature from one of the plurality of regions of the visual content item. At least one feature map is generated for the visual content item, based on analyzing the first set of tokens and the second set of tokens separately using a hierarchical vision transformer. At least one vision task is performed based on the at least one feature map.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: February 27, 2024
    Assignee: International Business Machines Corporation
    Inventors: Richard Chen, Rameswar Panda, Quanfu Fan
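
The two-level tokenization the abstract describes can be sketched in a few lines of numpy. This is a minimal illustration, not the patented implementation: the name `make_tokens`, the mean-pooled regional feature, and the use of raw pixels as local features are all simplifying assumptions.

```python
import numpy as np

def make_tokens(image, region_size):
    """Split an image into regions; emit one regional token (here, the
    mean feature of the region) plus local tokens (one per position
    inside the region) for each region."""
    H, W, C = image.shape
    rh, rw = region_size
    regional, local = [], []
    for i in range(0, H, rh):
        for j in range(0, W, rw):
            region = image[i:i + rh, j:j + rw]
            regional.append(region.mean(axis=(0, 1)))  # one regional token
            local.append(region.reshape(-1, C))        # local tokens
    return np.stack(regional), np.stack(local)

img = np.random.rand(8, 8, 3)
reg, loc = make_tokens(img, (4, 4))
print(reg.shape)  # (4, 3): one token per region
print(loc.shape)  # (4, 16, 3): 16 local tokens per region
```

In the patent, the two token sets would then be analyzed separately by a hierarchical vision transformer to produce feature maps; the sketch only covers the token construction.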
  • Patent number: 11875489
    Abstract: A hybrid-distance adversarial patch generator can be trained to generate a hybrid adversarial patch effective at multiple distances. The hybrid patch can be inserted into multiple sample images, each depicting an object, to simulate inclusion of the hybrid patch at multiple distances. The multiple sample images can then be used to train an object detection model to detect the objects.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Quanfu Fan, Sijia Liu, Richard Chen, Rameswar Panda
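
The multi-distance simulation step can be illustrated with a toy helper (the name `insert_patch` and the nearest-neighbour rescaling are assumptions chosen for brevity): rescaling the patch before pasting it into a sample image stands in for viewing the patch from different distances.

```python
import numpy as np

def insert_patch(image, patch, scale, top_left):
    """Paste a (possibly rescaled) patch into an image to simulate
    viewing it at a given distance: smaller scale ~ farther away."""
    step = max(1, int(round(1 / scale)))   # naive nearest-neighbour downscale
    small = patch[::step, ::step]
    out = image.copy()
    r, c = top_left
    h, w = small.shape[:2]
    out[r:r + h, c:c + w] = small
    return out

patch = np.ones((8, 8, 3))
scene = np.zeros((32, 32, 3))
near = insert_patch(scene, patch, 1.0, (0, 0))  # close range: full size
far = insert_patch(scene, patch, 0.5, (0, 0))   # long range: half size
print(near.sum(), far.sum())  # 192.0 48.0
```

Images produced at several scales like this would then feed the training of the detection model, per the abstract.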
  • Publication number: 20230386197
    Abstract: Techniques and apparatus for analyzing visual content using a visual transformer are described. An example technique includes generating a first set of tokens based on a visual content item. Each token in the first set of tokens is associated with a regional feature from a different region of a plurality of regions of the visual content item. A second set of tokens is generated based on the visual content item. Each token in the second set of tokens is associated with a local feature from one of the plurality of regions of the visual content item. At least one feature map is generated for the visual content item, based on analyzing the first set of tokens and the second set of tokens separately using a hierarchical vision transformer. At least one vision task is performed based on the at least one feature map.
    Type: Application
    Filed: May 31, 2022
    Publication date: November 30, 2023
    Inventors: Richard CHEN, Rameswar PANDA, Quanfu FAN
  • Publication number: 20230288354
    Abstract: Methods and systems for performing electron microscopy are provided. Microscopy images of candidate sub-regions at different magnification levels are captured and provided to a sub-region quality assessment application trained to output a quality score for each candidate sub-region. From the quality scores, group-level features for the larger-magnification images are determined using a group-level feature extraction application. The quality scores for the candidate sub-regions and the group-level features are provided to a trained Q-learning network that identifies, among the candidate sub-regions, the next sub-region for capturing a micrograph image, where reinforcement learning may be used with the Q-learning network for such identification, for example using a decisional cost.
    Type: Application
    Filed: March 8, 2023
    Publication date: September 14, 2023
    Inventors: Michael Cianfrocco, Quanfu Fan, Yilai Li, Seychelle Vos
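
The selection step can be caricatured without a trained network. In this sketch a greedy rule stands in for the Q-learning network: each candidate's value is its quality score plus a group-level bonus minus a decisional (move) cost, with epsilon-greedy exploration mirroring the reinforcement-learning setup. All names and the scoring rule are illustrative assumptions.

```python
import random

def pick_next_subregion(quality_scores, group_feature, move_costs, epsilon=0.1):
    """Greedy stand-in for the Q-network: value = quality + group-level
    bonus - decisional cost; epsilon-greedy adds exploration."""
    values = [q + group_feature - c for q, c in zip(quality_scores, move_costs)]
    if random.random() < epsilon:
        return random.randrange(len(values))              # explore
    return max(range(len(values)), key=values.__getitem__)  # exploit

random.seed(0)
idx = pick_next_subregion([0.2, 0.9, 0.5], group_feature=0.1,
                          move_costs=[0.0, 0.3, 0.0], epsilon=0.0)
print(idx)  # candidate 1 still wins: 0.9 + 0.1 - 0.3 = 0.7
```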
  • Patent number: 11663443
    Abstract: Techniques are described for reducing the number of parameters of a deep neural network model. According to one or more embodiments, a device can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a structure extraction component that determines a number of input nodes associated with a fully connected layer of a deep neural network model. The computer executable components can further comprise a transformation component that replaces the fully connected layer with a number of sparsely connected sublayers, wherein the sparsely connected sublayers have fewer connections than the fully connected layer, and wherein the number of sparsely connected sublayers is determined based on a defined decrease to the number of input nodes.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: May 30, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan Gutfreund, Quanfu Fan, Abhijit S. Mudigonda
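
The connection savings can be made concrete by viewing each sparsely connected sublayer as a block-diagonal weight matrix: each block connects one group of inputs to one group of outputs, and everything off the diagonal is zero (no connection). This is a schematic reading of the abstract, not the patented transformation.

```python
import numpy as np

def block_diagonal(blocks):
    """Assemble the weight matrix of a sparsely connected sublayer
    from per-group blocks; off-diagonal entries stay zero."""
    n = sum(b.shape[0] for b in blocks)
    m = sum(b.shape[1] for b in blocks)
    W = np.zeros((n, m))
    r = c = 0
    for b in blocks:
        W[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return W

# A 16x16 fully connected layer has 256 connections; four 4x4 blocks
# give a sparsely connected layer with only 64.
W = block_diagonal([np.ones((4, 4)) for _ in range(4)])
print(W.shape, int(np.count_nonzero(W)))  # (16, 16) 64
```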
  • Patent number: 11651206
    Abstract: Embodiments of the present invention are directed to a computer-implemented method for multiscale representation of input data. A non-limiting example of the computer-implemented method includes a processor receiving an original input. The processor downsamples the original input into a downscaled input. The processor runs a first convolutional neural network (“CNN”) on the downscaled input. The processor runs a second CNN on the original input, where the second CNN has fewer layers than the first CNN. The processor merges the output of the first CNN with the output of the second CNN and provides a result following the merging of the outputs.
    Type: Grant
    Filed: June 27, 2018
    Date of Patent: May 16, 2023
    Assignee: International Business Machines Corporation
    Inventors: Quanfu Fan, Richard Chen
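
The downscale / run-two-networks / merge flow can be sketched with placeholder networks. Everything here is a stand-in: `downsample` is plain 2x2 average pooling, `upsample` is nearest-neighbour, and the lambdas play the roles of the deep and shallow CNNs.

```python
import numpy as np

def downsample(x, factor=2):
    """2x2 average pooling as a simple downscaler."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(x, factor=2):
    """Nearest-neighbour upscaling back to the original resolution."""
    return np.kron(x, np.ones((factor, factor)))

def multiscale(x, deep_net, shallow_net):
    """Deep network on the cheap low-resolution copy, shallower network
    on the full-resolution input, then merge the two outputs."""
    low = deep_net(downsample(x))   # most layers run at low resolution
    high = shallow_net(x)           # few layers preserve fine detail
    return upsample(low) + high     # merge the outputs

x = np.arange(16.0).reshape(4, 4)
out = multiscale(x, deep_net=lambda t: t * 2, shallow_net=lambda t: t)
print(out.shape)  # (4, 4)
```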
  • Publication number: 20230004754
    Abstract: Adversarial patches can be inserted into sample pictures by an adversarial image generator to realistically depict adversarial images. The adversarial image generator can be utilized to train an adversarial patch generator by inserting generated patches into sample pictures, and submitting the resulting adversarial images to object detection models. This way, the adversarial patch generator can be trained to generate patches capable of defeating object detection models.
    Type: Application
    Filed: June 30, 2021
    Publication date: January 5, 2023
    Inventors: Quanfu Fan, Sijia Liu, Gaoyuan Zhang, Kaidi Xu
  • Publication number: 20230005111
    Abstract: A hybrid-distance adversarial patch generator can be trained to generate a hybrid adversarial patch effective at multiple distances. The hybrid patch can be inserted into multiple sample images, each depicting an object, to simulate inclusion of the hybrid patch at multiple distances. The multiple sample images can then be used to train an object detection model to detect the objects.
    Type: Application
    Filed: June 30, 2021
    Publication date: January 5, 2023
    Inventors: Quanfu Fan, Sijia Liu, Richard Chen, Rameswar Panda
  • Publication number: 20220292285
    Abstract: One embodiment of the invention provides a method for video recognition. The method comprises receiving an input video comprising a sequence of video segments over a plurality of data modalities. The method further comprises, for a video segment of the sequence, selecting one or more data modalities based on data representing the video segment. Each data modality selected is optimal for video recognition of the video segment. The method further comprises, for each data modality selected, providing at least one data input representing the video segment over the data modality selected to a machine learning model corresponding to the data modality selected, and generating a first type of prediction representative of the video segment via the machine learning model. The method further comprises determining a second type of prediction representative of the entire input video by aggregating all of the first-type predictions generated.
    Type: Application
    Filed: March 11, 2021
    Publication date: September 15, 2022
    Inventors: Rameswar Panda, Richard Chen, Quanfu Fan, Rogerio Schmidt Feris
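
The per-segment modality selection and video-level aggregation can be sketched directly. The selector policy, the dictionary-of-models layout, and averaging as the aggregation rule are all simplifying assumptions.

```python
def recognize_video(segments, select, models):
    """Per segment: a selector picks the modality expected to be most
    informative; only that modality's model runs. Segment predictions
    are then aggregated into a video-level prediction."""
    per_segment = []
    for seg in segments:
        modality = select(seg)                       # policy stand-in
        per_segment.append(models[modality](seg[modality]))
    return sum(per_segment) / len(per_segment), per_segment

segments = [{"rgb": 0.9, "audio": 0.2}, {"rgb": 0.1, "audio": 0.8}]
pick_stronger = lambda s: max(s, key=s.get)          # strongest signal wins
models = {"rgb": lambda x: x, "audio": lambda x: x}  # identity 'models'
video_pred, seg_preds = recognize_video(segments, pick_stronger, models)
print(round(video_pred, 2))  # 0.85: mean of per-segment predictions 0.9, 0.8
```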
  • Patent number: 11443069
    Abstract: An illustrative embodiment includes a method for protecting a machine learning model. The method includes: determining concept-level interpretability of respective units within the model; determining sensitivity of the respective units within the model to an adversarial attack; identifying units within the model which are both interpretable and sensitive to the adversarial attack; and enhancing defense against the adversarial attack by masking at least a portion of the units identified as both interpretable and sensitive to the adversarial attack.
    Type: Grant
    Filed: September 3, 2019
    Date of Patent: September 13, 2022
    Assignee: International Business Machines Corporation
    Inventors: Sijia Liu, Quanfu Fan, Gaoyuan Zhang, Chuang Gan
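
The masking rule itself is simple to state in code: mask exactly the units that are both interpretable and attack-sensitive, leave the rest active. The dictionary representation of units is an assumption for illustration.

```python
def defend(units):
    """Return an activity mask: False (masked) only for units that are
    both concept-interpretable and sensitive to the adversarial attack."""
    return [not (u["interpretable"] and u["sensitive"]) for u in units]

units = [
    {"interpretable": True,  "sensitive": True},   # masked
    {"interpretable": True,  "sensitive": False},  # kept
    {"interpretable": False, "sensitive": True},   # kept
]
print(defend(units))  # [False, True, True]
```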
  • Patent number: 11348336
    Abstract: Systems and methods for performing video understanding and analysis. Sets of feature maps for high resolution images and low resolution images in a time sequence of images are combined into combined sets of feature maps each having N feature maps. A time sequence of temporally aggregated sets of feature maps is created for each combined set of feature maps by: selecting a selected combined set of feature maps corresponding to an image at time “t” in the time sequence of images; applying, by channel-wise multiplication, a feature map weighting vector to a number of combined sets of feature maps that are temporally adjacent to the selected combined set of feature maps; and summing elements of the number of combined sets of feature maps into a temporally aggregated set of feature maps. The time sequence of temporally aggregated sets of feature maps is processed to perform video understanding processing.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: May 31, 2022
    Assignee: International Business Machines Corporation
    Inventors: Quanfu Fan, Richard Chen, Sijia Liu, Hildegard Kuehne
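
The temporal-aggregation step can be sketched with numpy. For brevity this sketch uses one scalar weight per temporal offset rather than a full per-channel weighting vector, and clamps at the sequence ends; both are assumptions.

```python
import numpy as np

def temporal_aggregate(maps, weights):
    """For each time t, weight the temporally adjacent combined
    feature-map sets and sum them into one aggregated set.
    maps: (T, N, H, W); weights: odd-length, centred on t."""
    T = maps.shape[0]
    k = len(weights) // 2
    out = np.zeros_like(maps)
    for t in range(T):
        for d, w in zip(range(-k, k + 1), weights):
            s = min(max(t + d, 0), T - 1)  # clamp at sequence ends
            out[t] += w * maps[s]          # weight, then sum
    return out

maps = np.ones((4, 2, 3, 3))
agg = temporal_aggregate(maps, weights=np.array([0.25, 0.5, 0.25]))
print(agg.shape)        # (4, 2, 3, 3)
print(agg[1, 0, 0, 0])  # 1.0: weights sum to one, inputs are all ones
```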
  • Patent number: 11227215
    Abstract: Mechanisms are provided for generating an adversarial perturbation attack sensitivity (APAS) visualization. The mechanisms receive a natural input dataset and a corresponding adversarial attack input dataset, where the adversarial attack input dataset comprises perturbations intended to cause a misclassification by a computer model. The mechanisms determine a sensitivity measure of the computer model to the perturbations in the adversarial attack input dataset based on a processing of the natural input dataset and corresponding adversarial attack input dataset by the computer model. The mechanisms generate a classification activation map (CAM) for the computer model based on results of the processing and a sensitivity overlay based on the sensitivity measure. The sensitivity overlay graphically represents different classifications of perturbation sensitivities.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: January 18, 2022
    Assignee: International Business Machines Corporation
    Inventors: Sijia Liu, Quanfu Fan, Chuang Gan, Dakuo Wang
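
The sensitivity-overlay idea can be sketched per pixel: measure how much the model's response changes between the natural and adversarial inputs, then bucket that change into sensitivity classes laid over the CAM. The difference-based sensitivity measure and the two-threshold bucketing are assumptions.

```python
import numpy as np

def sensitivity_overlay(cam, natural, adversarial, thresholds=(0.1, 0.3)):
    """Per-pixel sensitivity = |adversarial - natural| response;
    digitize into classes (0=low, 1=medium, 2=high) for the overlay."""
    sens = np.abs(adversarial - natural)
    classes = np.digitize(sens, thresholds)
    return cam, classes

cam = np.zeros((2, 2))
nat = np.array([[0.0, 0.0], [0.5, 0.5]])
adv = np.array([[0.05, 0.2], [0.5, 1.0]])
_, classes = sensitivity_overlay(cam, nat, adv)
print(classes)  # [[0 1]
                #  [0 2]]
```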
  • Patent number: 11195024
    Abstract: Provided are embodiments including a computer-implemented method for performing recognition. The computer-implemented method includes receiving video data, and performing, at a pre-attention prediction module, a pre-attention prediction for the video data to generate first prediction priors. The computer-implemented method also includes receiving, at a dual attention module, data including the video data and data from the pre-attention prediction to generate attention maps, wherein the attention maps indicate a region of interest of a frame of the video data, wherein the dual attention module generates enhanced feature representations, and performing, at a post-attention prediction module, a post-attention prediction from data from the dual attention module based at least in part on the enhanced feature representations. Also provided are embodiments for a system and a computer program product for performing recognition.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: December 7, 2021
    Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Quanfu Fan, Dan Gutfreund, Tete Xiao, Bolei Zhou
  • Publication number: 20210357651
    Abstract: Systems and methods for performing video understanding and analysis. Sets of feature maps for high resolution images and low resolution images in a time sequence of images are combined into combined sets of feature maps each having N feature maps. A time sequence of temporally aggregated sets of feature maps is created for each combined set of feature maps by: selecting a selected combined set of feature maps corresponding to an image at time “t” in the time sequence of images; applying, by channel-wise multiplication, a feature map weighting vector to a number of combined sets of feature maps that are temporally adjacent to the selected combined set of feature maps; and summing elements of the number of combined sets of feature maps into a temporally aggregated set of feature maps. The time sequence of temporally aggregated sets of feature maps is processed to perform video understanding processing.
    Type: Application
    Filed: May 13, 2020
    Publication date: November 18, 2021
    Inventors: Quanfu FAN, Richard CHEN, Sijia LIU, Hildegard KUEHNE
  • Patent number: 11176439
    Abstract: Technologies for providing convolutional neural networks are described. An analysis component determines an initial convolutional layer in a network architecture of a convolutional neural network and one or more subsequent convolutional layers in the network architecture. A replacement component replaces original convolutional kernels in the initial convolutional layer with initial sparse convolutional kernels, and replaces subsequent convolutional kernels in one or more subsequent convolutional layers with complementary sparse convolutional kernels. The complementary sparse kernels have a complementary pattern with respect to sparse kernels of a previous convolutional layer. Analyzing the network architecture and a trained model of a convolutional neural network can determine the original convolutional kernels and replace those kernels with sparse kernels based on similarity and/or weight in an initial layer, with sparse complementary kernels used in subsequent layers.
    Type: Grant
    Filed: December 1, 2017
    Date of Patent: November 16, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Richard Chen, Quanfu Fan
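
The complementary-pattern idea can be shown with a checkerboard: the initial layer's sparse kernel keeps one set of positions, the next layer's complementary kernel keeps exactly the others, so together the two layers cover every kernel position. The checkerboard choice is an illustrative assumption.

```python
import numpy as np

def sparse_kernel(k, pattern):
    """Zero out kernel entries outside the given boolean pattern."""
    return k * pattern

k = np.arange(9.0).reshape(3, 3)
checker = (np.indices((3, 3)).sum(axis=0) % 2 == 0)  # sparse pattern
comp = ~checker                                      # complementary pattern
layer1 = sparse_kernel(k, checker)  # initial layer: sparse kernel
layer2 = sparse_kernel(k, comp)     # subsequent layer: complementary kernel
# Together the two patterns cover every kernel position exactly once:
print(np.array_equal(layer1 + layer2, k))  # True
```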
  • Patent number: 11080542
    Abstract: Image data is convolved with one or more kernels, and one or more corresponding feature maps are generated. Region of interest maps are extracted from the one or more feature maps, and pooled based on one or more features selected as selective features. Pooling generates a feature vector with dimensionality less than a dimensionality associated with the one or more feature maps. The feature vector is flattened and input as a layer in a neural network. The neural network outputs a classification associated with an object in the image data.
    Type: Grant
    Filed: July 27, 2018
    Date of Patent: August 3, 2021
    Assignee: International Business Machines Corporation
    Inventors: Quanfu Fan, Richard Chen
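
The selective pooling step can be sketched as: crop the region of interest from the feature maps, keep only the channels chosen as selective features, and max-pool each kept channel into one scalar, giving a vector far smaller than the full feature-map stack. The helper name and max pooling are assumptions.

```python
import numpy as np

def selective_roi_pool(feature_maps, roi, selected):
    """Crop the ROI from each feature map, keep only the selected
    channels, and max-pool each into a single value."""
    r0, r1, c0, c1 = roi
    crop = feature_maps[selected, r0:r1, c0:c1]  # (len(selected), h, w)
    return crop.max(axis=(1, 2))

fmaps = np.random.rand(8, 6, 6)  # 8 channels from convolution
vec = selective_roi_pool(fmaps, roi=(1, 4, 1, 4), selected=[0, 3, 5])
print(vec.shape)  # (3,): far smaller than the 8x6x6 feature maps
```

The flattened vector would then enter the neural network as a layer, per the abstract.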
  • Publication number: 20210064785
    Abstract: An illustrative embodiment includes a method for protecting a machine learning model. The method includes: determining concept-level interpretability of respective units within the model; determining sensitivity of the respective units within the model to an adversarial attack; identifying units within the model which are both interpretable and sensitive to the adversarial attack; and enhancing defense against the adversarial attack by masking at least a portion of the units identified as both interpretable and sensitive to the adversarial attack.
    Type: Application
    Filed: September 3, 2019
    Publication date: March 4, 2021
    Inventors: Sijia Liu, Quanfu Fan, Gaoyuan Zhang, Chuang Gan
  • Publication number: 20200285952
    Abstract: Mechanisms are provided for generating an adversarial perturbation attack sensitivity (APAS) visualization. The mechanisms receive a natural input dataset and a corresponding adversarial attack input dataset, where the adversarial attack input dataset comprises perturbations intended to cause a misclassification by a computer model. The mechanisms determine a sensitivity measure of the computer model to the perturbations in the adversarial attack input dataset based on a processing of the natural input dataset and corresponding adversarial attack input dataset by the computer model. The mechanisms generate a classification activation map (CAM) for the computer model based on results of the processing and a sensitivity overlay based on the sensitivity measure. The sensitivity overlay graphically represents different classifications of perturbation sensitivities.
    Type: Application
    Filed: March 8, 2019
    Publication date: September 10, 2020
    Inventors: Sijia Liu, Quanfu Fan, Chuang Gan, Dakuo Wang
  • Patent number: 10740659
    Abstract: Techniques facilitating generation of a fused kernel that can approximate a full kernel of a convolutional neural network are provided. In one example, a computer-implemented method comprises determining a first pattern of samples of a first sample matrix and a second pattern of samples of a second sample matrix. The first sample matrix can be representative of a sparse kernel, and the second sample matrix can be representative of a complementary kernel. The first pattern and second pattern can be complementary to one another. The computer-implemented method also comprises generating a fused kernel based on a combination of features of the sparse kernel and features of the complementary kernel that are combined according to a fusing approach and training the fused kernel.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Richard Chen, Quanfu Fan, Marco Pistoia, Toyotaro Suzumura
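
One simple fusing approach consistent with complementary sample patterns is summation: because the two kernels are non-zero on disjoint positions, each position of the fused kernel takes its value from exactly one of them. The abstract leaves the fusing approach open, so treat this as one possible reading.

```python
import numpy as np

def fuse(sparse_k, comp_k):
    """Fuse a sparse kernel and its complementary kernel by summing;
    valid only when their non-zero patterns are complementary."""
    assert not np.any((sparse_k != 0) & (comp_k != 0)), \
        "patterns must be complementary"
    return sparse_k + comp_k

pattern = np.array([[1, 0], [0, 1]], dtype=float)
sparse_k = pattern * 2.0        # samples on the first pattern
comp_k = (1 - pattern) * 3.0    # samples on the complementary pattern
print(fuse(sparse_k, comp_k))   # [[2. 3.]
                                #  [3. 2.]]
```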
  • Publication number: 20200242507
    Abstract: A computing system is configured to learn data-augmentations from unlabeled media. The system includes an extracting unit and an embedding unit. The extracting unit is configured to receive media data that includes moving images of an object and audio generated by the object. The extracting unit extracts an image frame of the object among the moving images and extracts an audio segment from the audio. The embedding unit is configured to generate first embeddings of the image frame and second embeddings of the audio segment, and to concatenate the first and second embeddings together to generate concatenated embeddings. The computing system labels the media data based at least in part on the concatenated embeddings.
    Type: Application
    Filed: January 25, 2019
    Publication date: July 30, 2020
    Inventors: Chuang Gan, Quanfu Fan, Sijia Liu, Rogerio Schmidt Feris
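
The embed-and-concatenate step can be sketched directly: embed the extracted image frame and audio segment separately, concatenate the two embeddings, and label the clip from the result. The slicing "embedders" and the threshold classifier are toy stand-ins for the learned components.

```python
import numpy as np

def label_media(frame, audio, embed_img, embed_aud, classify):
    """Concatenate visual and audio embeddings, then label the clip."""
    emb = np.concatenate([embed_img(frame), embed_aud(audio)])
    return classify(emb), emb

frame = np.ones(16)
audio = np.ones(8)
embed_img = lambda x: x[:4]  # toy 4-d visual embedding
embed_aud = lambda x: x[:2]  # toy 2-d audio embedding
classify = lambda e: "speech" if e.sum() > 3 else "silence"
label, emb = label_media(frame, audio, embed_img, embed_aud, classify)
print(emb.shape, label)  # (6,) speech
```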