Patents by Inventor Quanfu Fan
Quanfu Fan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11915474
Abstract: Techniques and apparatus for analyzing visual content using a visual transformer are described. An example technique includes generating a first set of tokens based on a visual content item. Each token in the first set of tokens is associated with a regional feature from a different region of a plurality of regions of the visual content item. A second set of tokens is generated based on the visual content item. Each token in the second set of tokens is associated with a local feature from one of the plurality of regions of the visual content item. At least one feature map is generated for the visual content item, based on analyzing the first set of tokens and the second set of tokens separately using a hierarchical vision transformer. At least one vision task is performed based on the at least one feature map.
Type: Grant
Filed: May 31, 2022
Date of Patent: February 27, 2024
Assignee: International Business Machines Corporation
Inventors: Richard Chen, Rameswar Panda, Quanfu Fan
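As a rough illustration of the two token sets this abstract describes, the sketch below splits a toy image into regions and derives a coarse regional token per region alongside fine per-pixel local tokens. The region size, the mean-pooling choice, and all function names are illustrative assumptions, not the patented method.

```python
def split_into_regions(image, region_size):
    """Split a 2D image (list of rows) into square regions."""
    h = len(image)
    regions = []
    for r in range(0, h, region_size):
        for c in range(0, len(image[0]), region_size):
            regions.append([row[c:c + region_size]
                            for row in image[r:r + region_size]])
    return regions

def regional_token(region):
    """One coarse token per region: here, the mean value (an assumption)."""
    vals = [v for row in region for v in row]
    return sum(vals) / len(vals)

def local_tokens(region):
    """Fine tokens: one per pixel inside a region."""
    return [v for row in region for v in row]

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
regions = split_into_regions(image, 2)
first_set = [regional_token(r) for r in regions]   # regional features, one per region
second_set = [local_tokens(r) for r in regions]    # local features within each region
print(first_set)  # → [3.5, 5.5, 11.5, 13.5]
```

In the patent's setting the two sets would then be processed separately by a hierarchical vision transformer; here they are simply materialized to show the regional/local split.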
-
Patent number: 11875489
Abstract: A hybrid-distance adversarial patch generator can be trained to generate a hybrid adversarial patch effective at multiple distances. The hybrid patch can be inserted into multiple sample images, each depicting an object, to simulate inclusion of the hybrid patch at multiple distances. The multiple sample images can then be used to train an object detection model to detect the objects.
Type: Grant
Filed: June 30, 2021
Date of Patent: January 16, 2024
Assignee: International Business Machines Corporation
Inventors: Quanfu Fan, Sijia Liu, Richard Chen, Rameswar Panda
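The multi-distance simulation the abstract describes can be approximated by pasting the same patch into images at several scales, since a patch seen from farther away covers fewer pixels. The nearest-neighbour downscaling and the function names below are assumptions for illustration, not the patented generator.

```python
def scale_patch(patch, factor):
    """Nearest-neighbour downscale: simulates the patch viewed from farther away."""
    return [row[::factor] for row in patch[::factor]]

def insert_patch(image, patch, top, left):
    """Paste the patch into a copy of the image at (top, left)."""
    out = [row[:] for row in image]
    for i, prow in enumerate(patch):
        for j, v in enumerate(prow):
            out[top + i][left + j] = v
    return out

image = [[0] * 6 for _ in range(6)]
patch = [[9] * 4 for _ in range(4)]
near = insert_patch(image, patch, 0, 0)                 # full size: close distance
far = insert_patch(image, scale_patch(patch, 2), 0, 0)  # half size: farther away
print(sum(v == 9 for row in far for v in row))  # → 4
```

Training would then expose an object detector to both `near` and `far` composites so a single hybrid patch stays effective across distances.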
-
Publication number: 20230386197
Abstract: Techniques and apparatus for analyzing visual content using a visual transformer are described. An example technique includes generating a first set of tokens based on a visual content item. Each token in the first set of tokens is associated with a regional feature from a different region of a plurality of regions of the visual content item. A second set of tokens is generated based on the visual content item. Each token in the second set of tokens is associated with a local feature from one of the plurality of regions of the visual content item. At least one feature map is generated for the visual content item, based on analyzing the first set of tokens and the second set of tokens separately using a hierarchical vision transformer. At least one vision task is performed based on the at least one feature map.
Type: Application
Filed: May 31, 2022
Publication date: November 30, 2023
Inventors: Richard Chen, Rameswar Panda, Quanfu Fan
-
Publication number: 20230288354
Abstract: Methods and systems for performing electron microscopy are provided. Microscopy images of candidate sub-regions at different magnification levels are captured and provided to a trained sub-region quality assessment application trained to output a quality score for each candidate sub-region. From the quality scores, group-level features for the larger magnification images are determined using a group-level feature extraction application. The quality scores for the candidate sub-regions and the group-level extraction features are provided to a trained Q-learning network that identifies a next sub-region amongst the candidate sub-regions for capturing a micrograph image, where reinforcement learning may be used with the Q-learning network for such identification, for example using a decisional cost.
Type: Application
Filed: March 8, 2023
Publication date: September 14, 2023
Inventors: Michael Cianfrocco, Quanfu Fan, Yilai Li, Seychelle Vos
-
Patent number: 11663443
Abstract: Techniques are described for reducing the number of parameters of a deep neural network model. According to one or more embodiments, a device can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a structure extraction component that determines a number of input nodes associated with a fully connected layer of a deep neural network model. The computer executable components can further comprise a transformation component that replaces the fully connected layer with a number of sparsely connected sublayers, wherein the sparsely connected sublayers have fewer connections than the fully connected layer, and wherein the number of sparsely connected sublayers is determined based on a defined decrease to the number of input nodes.
Type: Grant
Filed: November 21, 2018
Date of Patent: May 30, 2023
Assignee: International Business Machines Corporation
Inventors: Dan Gutfreund, Quanfu Fan, Abhijit S. Mudigonda
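The parameter saving from replacing a fully connected layer with sparsely connected sublayers can be seen with a quick count. Grouped connectivity, where each sublayer connects only a slice of inputs to a slice of outputs, is used here as one plausible sparsity pattern; the patent's actual sublayer structure may differ.

```python
def dense_params(n_in, n_out):
    """Weight count of a fully connected layer (bias omitted for simplicity)."""
    return n_in * n_out

def grouped_params(n_in, n_out, groups):
    """Weight count when split into `groups` sparsely connected sublayers:
    each sublayer connects n_in/groups inputs to n_out/groups outputs."""
    return groups * (n_in // groups) * (n_out // groups)

print(dense_params(512, 512))       # → 262144
print(grouped_params(512, 512, 8))  # → 32768, an 8x reduction
```

The reduction factor equals the number of groups, which is why the abstract ties the number of sublayers to a defined decrease in connections.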
-
Patent number: 11651206
Abstract: Embodiments of the present invention are directed to a computer-implemented method for multiscale representation of input data. A non-limiting example of the computer-implemented method includes a processor receiving an original input. The processor downsamples the original input into a downscaled input. The processor runs a first convolutional neural network (“CNN”) on the downscaled input. The processor runs a second CNN on the original input, where the second CNN has fewer layers than the first CNN. The processor merges the output of the first CNN with the output of the second CNN and provides a result following the merging of the outputs.
Type: Grant
Filed: June 27, 2018
Date of Patent: May 16, 2023
Assignee: International Business Machines Corporation
Inventors: Quanfu Fan, Richard Chen
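The downsample / dual-branch / merge flow in this abstract can be sketched on a 1D signal. The two "CNNs" below are trivial stand-ins (the deep branch on the downscaled input, the shallow branch on the original), and the average-pooling, repeat-upsampling, and elementwise-sum merge are all illustrative assumptions.

```python
def downsample(x, factor):
    """Average pooling to build the low-resolution input."""
    return [sum(x[i:i + factor]) / factor for i in range(0, len(x), factor)]

def shallow_cnn(x):
    """Stand-in for the second CNN (fewer layers, full-resolution input)."""
    return [v * 2 for v in x]

def deep_cnn(x):
    """Stand-in for the first CNN (more layers, downscaled input)."""
    return [v + 1 for v in x]

def merge(full_res_out, low_res_out, factor):
    """Upsample the coarse branch by repetition, then sum elementwise."""
    up = [v for v in low_res_out for _ in range(factor)]
    return [a + b for a, b in zip(full_res_out, up)]

x = [1, 3, 5, 7]
result = merge(shallow_cnn(x), deep_cnn(downsample(x, 2)), 2)
print(result)  # → [5.0, 9.0, 17.0, 21.0]
```

The point of the pattern is cost: the expensive deep branch runs on fewer pixels, while the cheap shallow branch preserves full-resolution detail for the merge.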
-
Publication number: 20230004754
Abstract: Adversarial patches can be inserted into sample pictures by an adversarial image generator to realistically depict adversarial images. The adversarial image generator can be utilized to train an adversarial patch generator by inserting generated patches into sample pictures, and submitting the resulting adversarial images to object detection models. This way, the adversarial patch generator can be trained to generate patches capable of defeating object detection models.
Type: Application
Filed: June 30, 2021
Publication date: January 5, 2023
Inventors: Quanfu Fan, Sijia Liu, Gaoyuan Zhang, Kaidi Xu
-
Publication number: 20230005111
Abstract: A hybrid-distance adversarial patch generator can be trained to generate a hybrid adversarial patch effective at multiple distances. The hybrid patch can be inserted into multiple sample images, each depicting an object, to simulate inclusion of the hybrid patch at multiple distances. The multiple sample images can then be used to train an object detection model to detect the objects.
Type: Application
Filed: June 30, 2021
Publication date: January 5, 2023
Inventors: Quanfu Fan, Sijia Liu, Richard Chen, Rameswar Panda
-
Publication number: 20220292285
Abstract: One embodiment of the invention provides a method for video recognition. The method comprises receiving an input video comprising a sequence of video segments over a plurality of data modalities. The method further comprises, for a video segment of the sequence, selecting one or more data modalities based on data representing the video segment. Each data modality selected is optimal for video recognition of the video segment. The method further comprises, for each data modality selected, providing at least one data input representing the video segment over the data modality selected to a machine learning model corresponding to the data modality selected, and generating a first type of prediction representative of the video segment via the machine learning model. The method further comprises determining a second type of prediction representative of the entire input video by aggregating all first type of predictions generated.
Type: Application
Filed: March 11, 2021
Publication date: September 15, 2022
Inventors: Rameswar Panda, Richard Chen, Quanfu Fan, Rogerio Schmidt Feris
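The two-level prediction scheme in this abstract can be sketched as follows: pick a modality per segment, run that modality's model to get a segment-level (first-type) prediction, then aggregate into a video-level (second-type) prediction. The signal-strength selector, the toy per-modality models, and the averaging aggregation are all illustrative assumptions.

```python
def select_modality(segment):
    """Pick the modality with the strongest signal (an assumed selection rule)."""
    return max(segment, key=segment.get)

# Toy stand-ins for the per-modality machine learning models.
models = {
    "rgb":   lambda seg: {"running": 0.9, "jumping": 0.1},
    "audio": lambda seg: {"running": 0.4, "jumping": 0.6},
}

# Each segment carries a per-modality signal-strength score.
segments = [
    {"rgb": 0.8, "audio": 0.2},
    {"rgb": 0.3, "audio": 0.7},
]

# First-type predictions: one per segment, via the selected modality's model.
per_segment = [models[select_modality(s)](s) for s in segments]

# Second-type prediction: aggregate over the whole video by averaging scores.
video_pred = {k: sum(p[k] for p in per_segment) / len(per_segment)
              for k in per_segment[0]}
print(max(video_pred, key=video_pred.get))  # → running
```

The efficiency argument behind such schemes is that cheap modalities (e.g. audio) can be chosen for segments where they suffice, reserving expensive ones for segments that need them.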
-
Patent number: 11443069
Abstract: An illustrative embodiment includes a method for protecting a machine learning model. The method includes: determining concept-level interpretability of respective units within the model; determining sensitivity of the respective units within the model to an adversarial attack; identifying units within the model which are both interpretable and sensitive to the adversarial attack; and enhancing defense against the adversarial attack by masking at least a portion of the units identified as both interpretable and sensitive to the adversarial attack.
Type: Grant
Filed: September 3, 2019
Date of Patent: September 13, 2022
Assignee: International Business Machines Corporation
Inventors: Sijia Liu, Quanfu Fan, Gaoyuan Zhang, Chuang Gan
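The selection step in this abstract amounts to intersecting two per-unit scores. The sketch below assumes each unit already has interpretability and sensitivity scores and applies illustrative thresholds; how those scores are computed, and the threshold values, are assumptions, not the patented procedure.

```python
# Hypothetical per-unit scores; in practice these would come from
# concept-level interpretability analysis and adversarial sensitivity tests.
units = [
    {"name": "u1", "interpretability": 0.9, "sensitivity": 0.8},
    {"name": "u2", "interpretability": 0.2, "sensitivity": 0.9},
    {"name": "u3", "interpretability": 0.8, "sensitivity": 0.1},
]

T_INTERP, T_SENS = 0.5, 0.5  # illustrative thresholds

# Mask only units that are BOTH interpretable and attack-sensitive.
masked = [u["name"] for u in units
          if u["interpretability"] > T_INTERP and u["sensitivity"] > T_SENS]
print(masked)  # → ['u1']
```

Masking only the intersection keeps most of the model intact while removing the units an attacker can most readily exploit.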
-
Patent number: 11348336
Abstract: Systems and methods for performing video understanding and analysis. Sets of feature maps for high resolution images and low resolution images in a time sequence of images are combined into combined sets of feature maps each having N feature maps. A time sequence of temporally aggregated sets of feature maps is created for each combined set of feature maps by: selecting a selected combined set of feature maps corresponding to an image at time “t” in the time sequence of images; applying, by channel-wise multiplication, a feature map weighting vector to a number of combined sets of feature maps that are temporally adjacent to the selected combined set of feature maps; and summing elements of the number of combined sets of feature maps into a temporally aggregated set of feature maps. The time sequence of temporally aggregated sets of feature maps is processed to perform video understanding processing.
Type: Grant
Filed: May 13, 2020
Date of Patent: May 31, 2022
Assignee: International Business Machines Corporation
Inventors: Quanfu Fan, Richard Chen, Sijia Liu, Hildegard Kuehne
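The core aggregation step, channel-wise weighting of temporally adjacent feature maps followed by a sum, can be sketched directly. Feature maps are reduced to per-channel values here, and the symmetric weight window is an illustrative assumption.

```python
def aggregate(feature_maps, t, weights):
    """Weight temporally adjacent feature maps channel-wise, then sum.

    feature_maps: list over time of per-channel feature values.
    weights: one per-channel weight vector for each temporal offset,
             centered on time t (window size = len(weights)).
    """
    half = len(weights) // 2
    agg = [0.0] * len(feature_maps[t])
    for w_vec, off in zip(weights, range(-half, half + 1)):
        fm = feature_maps[t + off]
        for c in range(len(agg)):
            agg[c] += w_vec[c] * fm[c]  # channel-wise multiplication
    return agg

maps = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]         # times t-1, t, t+1; 2 channels
weights = [[0.25, 0.25], [0.5, 0.5], [0.25, 0.25]]  # per-offset, per-channel
print(aggregate(maps, 1, weights))  # → [3.0, 4.0]
```

Because the weighting is per channel, each channel can learn its own temporal receptive profile, which is the point of using a weighting vector rather than a scalar.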
-
Patent number: 11227215
Abstract: Mechanisms are provided for generating an adversarial perturbation attack sensitivity (APAS) visualization. The mechanisms receive a natural input dataset and a corresponding adversarial attack input dataset, where the adversarial attack input dataset comprises perturbations intended to cause a misclassification by a computer model. The mechanisms determine a sensitivity measure of the computer model to the perturbations in the adversarial attack input dataset based on a processing of the natural input dataset and corresponding adversarial attack input dataset by the computer model. The mechanisms generate a classification activation map (CAM) for the computer model based on results of the processing and a sensitivity overlay based on the sensitivity measure. The sensitivity overlay graphically represents different classifications of perturbation sensitivities.
Type: Grant
Filed: March 8, 2019
Date of Patent: January 18, 2022
Assignee: International Business Machines Corporation
Inventors: Sijia Liu, Quanfu Fan, Chuang Gan, Dakuo Wang
-
Patent number: 11195024
Abstract: Provided are embodiments including a computer-implemented method for performing recognition. The computer-implemented method includes receiving video data, and performing, at a pre-attention prediction module, a pre-attention prediction for the video data to generate first prediction priors. The computer-implemented method also includes receiving, at a dual attention module, data including the video data and data from the pre-attention prediction to generate attention maps, wherein the attention maps indicate a region of interest of a frame of the video data, wherein the dual attention module generates enhanced feature representations, and performing, at a post-attention prediction module, a post-attention prediction from data from the dual attention module based at least in part on the enhanced feature representation. Also provided are embodiments for a system and a computer program product for performing recognition.
Type: Grant
Filed: July 10, 2020
Date of Patent: December 7, 2021
Assignees: International Business Machines Corporation, Massachusetts Institute of Technology
Inventors: Quanfu Fan, Dan Gutfreund, Tete Xiao, Bolei Zhou
-
Publication number: 20210357651
Abstract: Systems and methods for performing video understanding and analysis. Sets of feature maps for high resolution images and low resolution images in a time sequence of images are combined into combined sets of feature maps each having N feature maps. A time sequence of temporally aggregated sets of feature maps is created for each combined set of feature maps by: selecting a selected combined set of feature maps corresponding to an image at time “t” in the time sequence of images; applying, by channel-wise multiplication, a feature map weighting vector to a number of combined sets of feature maps that are temporally adjacent to the selected combined set of feature maps; and summing elements of the number of combined sets of feature maps into a temporally aggregated set of feature maps. The time sequence of temporally aggregated sets of feature maps is processed to perform video understanding processing.
Type: Application
Filed: May 13, 2020
Publication date: November 18, 2021
Inventors: Quanfu Fan, Richard Chen, Sijia Liu, Hildegard Kuehne
-
Patent number: 11176439
Abstract: Technologies for providing convolutional neural networks are described. An analysis component determines an initial convolutional layer in a network architecture of a convolutional neural network and one or more subsequent convolutional layers in the network architecture. A replacement component replaces original convolutional kernels in the initial convolutional layer with initial sparse convolutional kernels, and replaces subsequent convolutional kernels in one or more subsequent convolutional layers with complementary sparse convolutional kernels. The complementary sparse kernels have a complementary pattern with respect to sparse kernels of a previous convolutional layer. Analyzing the network architecture and a trained model of a convolutional neural network can determine the original convolutional kernels and replace those kernels with sparse kernels based on similarity and/or weight in an initial layer, with sparse complementary kernels used in subsequent layers.
Type: Grant
Filed: December 1, 2017
Date of Patent: November 16, 2021
Assignee: International Business Machines Corporation
Inventors: Richard Chen, Quanfu Fan
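The complementary-pattern idea in this abstract can be shown with binary kernel masks: one layer keeps a subset of kernel positions, the next layer keeps exactly the positions the first skipped. The checkerboard mask below is an illustrative assumption; the patent derives its patterns from kernel similarity and weights.

```python
def sparse_kernel(kernel, mask):
    """Zero out kernel entries where the mask is 0."""
    return [[v * m for v, m in zip(krow, mrow)]
            for krow, mrow in zip(kernel, mask)]

def complementary_mask(mask):
    """Flip the pattern so the next layer samples positions this one skips."""
    return [[1 - m for m in row] for row in mask]

mask = [[1, 0, 1],
        [0, 1, 0],
        [1, 0, 1]]
comp = complementary_mask(mask)

# Together the two patterns cover every kernel position exactly once.
cover = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(mask, comp)]
print(cover)  # → [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Stacking layers with complementary patterns is what lets the sparsified network retain full spatial coverage despite each layer touching only half the positions.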
-
Patent number: 11080542
Abstract: An image data is convolved with one or more kernels and corresponding one or more feature maps generated. Region of interest maps are extracted from the one or more feature maps, and pooled based on one or more features selected as selective features. Pooling generates a feature vector with dimensionality less than a dimensionality associated with the one or more feature maps. The feature vector is flattened and input as a layer in a neural network. The neural network outputs a classification associated with an object in the image data.
Type: Grant
Filed: July 27, 2018
Date of Patent: August 3, 2021
Assignee: International Business Machines Corporation
Inventors: Quanfu Fan, Richard Chen
-
Publication number: 20210064785
Abstract: An illustrative embodiment includes a method for protecting a machine learning model. The method includes: determining concept-level interpretability of respective units within the model; determining sensitivity of the respective units within the model to an adversarial attack; identifying units within the model which are both interpretable and sensitive to the adversarial attack; and enhancing defense against the adversarial attack by masking at least a portion of the units identified as both interpretable and sensitive to the adversarial attack.
Type: Application
Filed: September 3, 2019
Publication date: March 4, 2021
Inventors: Sijia Liu, Quanfu Fan, Gaoyuan Zhang, Chuang Gan
-
Publication number: 20200285952
Abstract: Mechanisms are provided for generating an adversarial perturbation attack sensitivity (APAS) visualization. The mechanisms receive a natural input dataset and a corresponding adversarial attack input dataset, where the adversarial attack input dataset comprises perturbations intended to cause a misclassification by a computer model. The mechanisms determine a sensitivity measure of the computer model to the perturbations in the adversarial attack input dataset based on a processing of the natural input dataset and corresponding adversarial attack input dataset by the computer model. The mechanisms generate a classification activation map (CAM) for the computer model based on results of the processing and a sensitivity overlay based on the sensitivity measure. The sensitivity overlay graphically represents different classifications of perturbation sensitivities.
Type: Application
Filed: March 8, 2019
Publication date: September 10, 2020
Inventors: Sijia Liu, Quanfu Fan, Chuang Gan, Dakuo Wang
-
Patent number: 10740659
Abstract: Techniques facilitating generation of a fused kernel that can approximate a full kernel of a convolutional neural network are provided. In one example, a computer-implemented method comprises determining a first pattern of samples of a first sample matrix and a second pattern of samples of a second sample matrix. The first sample matrix can be representative of a sparse kernel, and the second sample matrix can be representative of a complementary kernel. The first pattern and second pattern can be complementary to one another. The computer-implemented method also comprises generating a fused kernel based on a combination of features of the sparse kernel and features of the complementary kernel that are combined according to a fusing approach and training the fused kernel.
Type: Grant
Filed: December 14, 2017
Date of Patent: August 11, 2020
Assignee: International Business Machines Corporation
Inventors: Richard Chen, Quanfu Fan, Marco Pistoia, Toyotaro Suzumura
-
Publication number: 20200242507
Abstract: A computing system is configured to learn data-augmentations from unlabeled media. The system includes an extracting unit and an embedding unit. The extracting unit is configured to receive media data that includes moving images of an object and audio generated by the object. The extracting unit extracts an image frame of the object among the moving images and extracts an audio segment from the audio. The embedding unit is configured to generate first embeddings of the image frame and second embeddings of the audio segment, and to concatenate the first and second embeddings together to generate concatenated embeddings. The computing system labels the media data based at least in part on the concatenated embeddings.
Type: Application
Filed: January 25, 2019
Publication date: July 30, 2020
Inventors: Chuang Gan, Quanfu Fan, Sijia Liu, Rogerio Schmidt Feris
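The extract / embed / concatenate pipeline in this abstract can be sketched with toy embeddings. The per-row-mean visual embedding and the mean-plus-peak audio embedding are purely illustrative stand-ins for learned embedding networks.

```python
def embed_frame(frame):
    """Toy visual embedding of an image frame: per-row means (an assumption)."""
    return [sum(row) / len(row) for row in frame]

def embed_audio(segment):
    """Toy audio embedding of a segment: overall mean and peak (an assumption)."""
    return [sum(segment) / len(segment), max(segment)]

frame = [[0.2, 0.4],
         [0.6, 0.8]]        # extracted image frame
audio = [0.1, 0.5, 0.3]     # extracted audio segment

# Concatenate the first (visual) and second (audio) embeddings.
joint = embed_frame(frame) + embed_audio(audio)
print(len(joint))  # → 4
```

A downstream labeler would then operate on `joint`, so that both what the object looks like and what it sounds like inform the label assigned to the unlabeled media.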