Patents by Inventor Chuang Gan
Chuang Gan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240127001
Abstract: Techniques for audio understanding using fixed language models are provided. In one aspect, a system for performing audio understanding tasks includes: a fixed text embedder for, on receipt of a prompt sequence having (e.g., from 0-10) demonstrations of an audio understanding task followed by a new question, converting the prompt sequence into text embeddings; a pretrained audio encoder for converting the prompt sequence into audio embeddings; and a fixed autoregressive language model for answering the new question using the text embeddings and the audio embeddings. A method for performing audio understanding tasks is also provided.
Type: Application
Filed: October 12, 2022
Publication date: April 18, 2024
Inventors: Kaizhi Qian, Yang Zhang, Chuang Gan, Bo Wu, Zhenfang Chen
-
Publication number: 20240111950
Abstract: A computer-implemented method for fine-grained referring expression comprehension is provided. The computer-implemented method includes receiving, at a processor, a textual expression and an image as inputs and executing, at the processor, fine-grained referring expression comprehension. The executing includes decomposing the textual expression into different textual modules, extracting visual regional proposals from the image, using language-guided graph neural networks to mine fine-grained object relations from the visual regional proposals and aggregating different matching similarities between the different textual modules and the fine-grained object relations.
Type: Application
Filed: September 28, 2022
Publication date: April 4, 2024
Inventors: Zhenfang Chen, Chuang Gan, Bo Wu, Dakuo Wang
-
Publication number: 20240095435
Abstract: A method, system, and computer program product for circuit design automation. The method identifies a set of circuit components for a proposed circuit design. A subset of circuit components is selected to generate an initial topology for the proposed circuit design. A set of subsequent topologies are iteratively generated by a heuristic search algorithm based on the subset of circuit components and the initial topology. A set of valid topologies of the set of subsequent topologies are determined by a circuit simulator based on the subset of circuit components and a set of connections within the set of subsequent topologies. The method generates the proposed circuit design from the set of valid topologies.
Type: Application
Filed: September 15, 2022
Publication date: March 21, 2024
Inventors: Shun Zhang, Xin Zhang, Shaoze Fan, Ningyuan Cao, Jing Li, Xiaoxiao Guo, Chuang Gan
-
Patent number: 11928156
Abstract: Obtain, at a computing device, a segment of computer code. With a classification module of a machine learning system executing on the computing device, determine a required annotation category for the segment of computer code. With an annotation generation module of the machine learning system executing on the computing device, generate a natural language annotation of the segment of computer code based on the segment of computer code and the required annotation category. Provide the natural language annotation to a user interface for display adjacent the segment of computer code.
Type: Grant
Filed: November 3, 2020
Date of Patent: March 12, 2024
Assignee: International Business Machines Corporation
Inventors: Dakuo Wang, Lingfei Wu, Xuye Liu, Yi Wang, Chuang Gan, Jing Xu, Xue Ying Zhang, Jun Wang, Jing James Xu
-
Publication number: 20240037940
Abstract: A computer vision temporal action localization (TAL) computing tool and operations are provided. The TAL computing tool receives a coarse temporal bounding box, having a first start point and a first end point, for an action in the input video data, and a first set of logits, where each logit corresponds to a potential classification of the action in the input video data. The TAL computing tool executes a first engine on the coarse temporal bounding box to generate a second set of logits, and a second engine on the first set of logits to generate a refined temporal bounding box having a second start point and a second end point. The TAL computing tool performs the computer vision temporal action localization operation based on the second set of logits and the refined temporal bounding box to specify a temporal segment of the input video data corresponding to an action represented in the input video data, and a corresponding classification of the action represented in the temporal segment.
Type: Application
Filed: July 28, 2022
Publication date: February 1, 2024
Inventors: Bo Wu, Chuang Gan, Pin-Yu Chen, Yang Zhang, Xin Zhang
-
Publication number: 20240004443
Abstract: Described aspects include a system for optimizing performance of a functional circuit unit, a method of optimizing performance of a functional circuit unit, and a computer program product. In one embodiment, the system may include a functional circuit unit having an associated cooling device and power converter, one or more sensors for the functional circuit unit, the one or more sensors including a power sensor and a temperature sensor, and a first machine learning model. The first machine learning model may be adapted to receive temperature data and power data from the one or more sensors, and to generate control signals for the cooling device and the power converter to optimize performance of the functional circuit unit.
Type: Application
Filed: June 29, 2022
Publication date: January 4, 2024
Inventors: Xin Zhang, Shun Zhang, Shaoze Fan, Xiaoxiao Guo, Chuang Gan
-
Patent number: 11854305
Abstract: A bi-directional spatial-temporal transformer neural network (BDSTT) is trained to predict original coordinates of a skeletal joint in a specific frame through relative relationships of the skeletal joint to other joints and to the state of the skeletal joint in other frames. Obtain a plurality of frames comprising coordinates of the skeletal joint and coordinates of other joints. Produce a spatially masked frame by masking the original coordinates of the skeletal joint. Provide the specific frame, the spatially masked frame, and at least one more frame to a coordinate prediction head of the BDSTT. Obtain, from the coordinate prediction head, a prediction of coordinates for the skeletal joint. Adjust parameters of the BDSTT until a mean-squared error, between the prediction of coordinates for the skeletal joint and the original coordinates of the skeletal joint, converges.
Type: Grant
Filed: May 9, 2021
Date of Patent: December 26, 2023
Assignee: International Business Machines Corporation
Inventors: Bo Wu, Chuang Gan, Dakuo Wang, Kaizhi Qian
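As a rough illustration of the masking and error steps named in the abstract above, the following sketch masks one joint in a frame and computes the mean-squared error between a predicted and an original coordinate. The function names, the 3-D tuple layout, and the zero mask value are assumptions made for this example; the transformer itself and its training loop are not shown, and nothing here is taken from the patent's implementation.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]  # assumed (x, y, z) layout for one joint


def mask_joint(frame: Sequence[Coord], joint: int,
               mask_value: Coord = (0.0, 0.0, 0.0)) -> List[Coord]:
    """Return a copy of the frame with one joint's coordinates replaced.

    The zero mask value is a placeholder choice for this sketch.
    """
    masked = list(frame)
    masked[joint] = mask_value
    return masked


def mean_squared_error(prediction: Sequence[float],
                       original: Sequence[float]) -> float:
    """Mean of squared differences between predicted and original coordinates."""
    if len(prediction) != len(original):
        raise ValueError("prediction and original must have the same length")
    return sum((p - o) ** 2 for p, o in zip(prediction, original)) / len(original)


# Hypothetical three-joint frame; joint 1 is the one being masked.
frame = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
masked_frame = mask_joint(frame, joint=1)

# A stand-in prediction for the masked joint (a trained model would supply this).
predicted = (1.0, 2.0, 5.0)
loss = mean_squared_error(predicted, frame[1])
```

In a training loop, a loss like this would be minimized until it stops changing meaningfully, which is the convergence condition the abstract describes.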
-
Publication number: 20230401435
Abstract: An output layer is removed from a pre-trained neural network model and a neural capacitance probe unit with multiple layers is incorporated on top of one or more bottom layers of the pre-trained neural network model. The neural capacitance probe unit is randomly initialized and a modified neural network model is trained by fine-tuning the one or more bottom layers on a target dataset for a maximum number of epochs, the modified neural network model comprising the neural capacitance probe unit incorporated with multiple layers on top of the one or more bottom layers of the pre-trained neural network model. An adjacency matrix is obtained from the initialized neural capacitance probe unit and a neural capacitance metric is computed using the adjacency matrix. An active model is selected using the neural capacitance metric and a machine learning system is configured using the active model.
Type: Application
Filed: June 13, 2022
Publication date: December 14, 2023
Inventors: Pin-Yu Chen, Tejaswini Pedapati, Bo Wu, Chuang Gan, Chunheng Jiang, Jianxi Gao
-
Publication number: 20230394846
Abstract: A vehicle light signal detection and recognition method, system, and computer program product include bounding, using a coarse attention module, one or more regions of an image of an automobile including at least one of a brake light and a signal light generated by automobile signals which include illuminated sections to generate one or more bounded regions, removing, using a fine attention module, noise from the one or more bounded regions to generate one or more noise-free bounded regions, and identifying the at least one of the brake light and the signal light from the one or more noise-free bounded regions.
Type: Application
Filed: August 7, 2023
Publication date: December 7, 2023
Inventors: Bo Wu, Chuang Gan, Yang Zhang, Dakuo Wang
-
Publication number: 20230368510
Abstract: A system may include a memory and a processor in communication with the memory. The processor may be configured to perform operations. The operations may include receiving an input, extracting features from the input, and mining object relations using the features. The operations may include determining feature vectors using the object relations and generating, using the feature vectors, an output indicating a target region, wherein the target region corresponds to the input.
Type: Application
Filed: May 13, 2022
Publication date: November 16, 2023
Inventors: Zhenfang Chen, Chuang Gan, Bo Wu, Pin-Yu Chen
-
Publication number: 20230368529
Abstract: One or more computer processors improve action recognition by removing inference introduced by visual appearances of objects within a received video segment. The one or more computer processors extract appearance information and structure information from a received video segment. The one or more computer processors calculate a factual inference (TE) for the received video segment utilizing the extracted appearance information and structure information. The one or more computer processors calculate a counterfactual debiasing inference (NDE) for the received video segment. The one or more computer processors calculate a total indirect effect (TIE) by subtracting the calculated counterfactual debiased inference from the calculated factual inference. The one or more computer processors action recognize the received video segment by selecting a classification result associated with a highest calculated TIE.
Type: Application
Filed: May 10, 2022
Publication date: November 16, 2023
Inventors: Bo Wu, Chuang Gan, Pin-Yu Chen, Zhenfang Chen, Dakuo Wang
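The final two steps of the abstract above (TIE = TE - NDE per class, then picking the class with the highest TIE) can be sketched in a few lines. The per-class score lists, the label names, and the function names are hypothetical values chosen for illustration; how the factual and counterfactual scores are produced is not shown and is not taken from the patent.

```python
from typing import List, Sequence


def total_indirect_effect(te: Sequence[float], nde: Sequence[float]) -> List[float]:
    """Subtract the counterfactual score (NDE) from the factual score (TE) per class."""
    if len(te) != len(nde):
        raise ValueError("te and nde must have one score per class")
    return [t - n for t, n in zip(te, nde)]


def classify(te: Sequence[float], nde: Sequence[float],
             labels: Sequence[str]) -> str:
    """Return the label whose total indirect effect is highest."""
    tie = total_indirect_effect(te, nde)
    best = max(range(len(tie)), key=lambda i: tie[i])
    return labels[best]


# Hypothetical scores for three action classes.
labels = ["open door", "close door", "pour water"]
factual = [2.0, 1.5, 0.5]          # TE: appearance + structure information
counterfactual = [1.8, 0.4, 0.3]   # NDE: the appearance-driven part to remove

predicted_label = classify(factual, counterfactual, labels)
```

With these made-up numbers, the class with the highest factual score is not the one selected, because most of its score is attributed to the counterfactual term and is subtracted away.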
-
Patent number: 11816889
Abstract: Unsupervised learning for video classification. One or more features from one or more video clips are extracted using a spatial-temporal encoder. The one or more extracted features are processed, using a video instance discrimination task, to generate a classification label, the classification label indicating whether two of the video clips are from a same video. The one or more extracted features are processed, using a pair-wise speed discrimination task, to generate a comparison label, the comparison label indicating a relative playback speed between two given video clips. A search is performed in a video database for a video that is similar to a given video based on the comparison label.
Type: Grant
Filed: March 29, 2021
Date of Patent: November 14, 2023
Assignee: International Business Machines Corporation
Inventors: Chuang Gan, Dakuo Wang, Antonio Jose Jimeno Yepes, Bo Wu
-
Publication number: 20230360364
Abstract: Mechanisms are provided for performing machine learning (ML) training of a ML action recognition computer model which involves processing an original input dataset to generate an object feature bank comprising object feature data structures for a plurality of different objects. For an input video, a verb data structure and an original object data structure are generated and a candidate object feature data structure is selected from the object feature bank for generation of pseudo composition (PC) training data. The PC training data is generated based on the selected candidate object feature data structure and comprises a combination of the verb data structure and the candidate object feature data structure. The PC training data represents a combination of an action and an object not represented in the original input dataset. ML training of the ML action recognition computer model is performed based on an unseen combination comprising the PC training data.
Type: Application
Filed: May 5, 2022
Publication date: November 9, 2023
Inventors: Bo Wu, Chuang Gan, Pin-Yu Chen, Xin Zhang
-
Publication number: 20230360642
Abstract: One or more computer processors obtain an initial subnetwork at a target sparsity and an initial pruning mask from a pre-trained self-supervised learning (SSL) speech model. The one or more computer processors finetune the initial subnetwork, comprising: the one or more computer processors zero out one or more masked weights in the initial subnetwork specified by the initial pruning mask; the one or more computer processors train a new subnetwork from the zeroed out subnetwork; the one or more computer processors prune one or more weights of lowest magnitude in the new subnetwork regardless of network structure to satisfy the target sparsity. The one or more computer processors classify an audio segment with the finetuned subnetwork.
Type: Application
Filed: May 9, 2022
Publication date: November 9, 2023
Inventors: Cheng-I Lai, Yang Zhang, Kaizhi Qian, Chuang Gan, James R. Glass, Alexander Haojan Liu
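The two weight-level operations mentioned in the abstract above, zeroing out masked weights and pruning the lowest-magnitude weights to reach a target sparsity, can be sketched on a flat list of weights. This is a minimal sketch under stated assumptions: the flat-list representation, the function names, and the example values are all hypothetical, the training step between the two operations is omitted, and a real speech model would use tensors rather than Python lists.

```python
from typing import List, Sequence


def apply_mask(weights: Sequence[float], keep: Sequence[bool]) -> List[float]:
    """Zero out every weight whose mask entry is False."""
    return [w if k else 0.0 for w, k in zip(weights, keep)]


def magnitude_mask(weights: Sequence[float], sparsity: float) -> List[bool]:
    """Build a keep-mask that prunes the lowest-magnitude fraction of weights.

    Weights are ranked globally by absolute value, ignoring any layer or
    block structure (unstructured pruning).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(weights))]


# Hypothetical weights after a round of training.
weights = [0.5, -0.1, 0.05, -0.9, 0.3, 0.02]
keep = magnitude_mask(weights, sparsity=0.5)
sparse_weights = apply_mask(weights, keep)
```

Because the mask is recomputed from the current magnitudes, a weight that was zeroed earlier can in principle be kept in a later round if training has grown it back, while a different weight is pruned instead.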
-
Patent number: 11790181
Abstract: A current observation expressed in natural language is received. Entities in the current observation are extracted. A relevant historical observation is retrieved, which has at least one of the entities in common with the current observation. The current observation and the relevant historical observation are combined as observations. The observations and a template list specifying a list of verb phrases to be filled-in with at least some of the entities are input to a neural network, which can output the template list of the verb phrases filled-in with said at least some of the entities. The neural network can include an attention mechanism. A reward associated with the neural network's output can be received and fed back to the neural network for retraining the neural network.
Type: Grant
Filed: August 19, 2020
Date of Patent: October 17, 2023
Assignee: International Business Machines Corporation
Inventors: Xiaoxiao Guo, Mo Yu, Yupeng Gao, Chuang Gan, Shiyu Chang, Murray Scott Campbell
-
Publication number: 20230306738
Abstract: According to one embodiment, a method, computer system, and computer program product for identifying one or more intrinsic physical properties of one or more objects is provided. The present invention may include identifying one or more objects in a video set, extracting observable physical properties of the identified one or more objects from the video set, including one or more trajectories, and inferring, by a property-based graph neural network, intrinsic properties of the one or more objects based on the trajectories.
Type: Application
Filed: March 24, 2022
Publication date: September 28, 2023
Inventors: Zhenfang Chen, Chuang Gan, Bo Wu, Dakuo Wang
-
Patent number: 11763084
Abstract: A method comprises receiving a new data set; identifying at least one prior data set of a plurality of prior data sets that matches the new data set; generating a natural language data science problem statement for the new data set based on information associated with the at least one prior data set that matches the new data set; outputting the generated natural language data science problem statement for user verification; and in response to receiving user input verifying the generated natural language data science problem statement, generating one or more AutoAI configuration settings for the new data set based on one or more AutoAI configuration settings associated with the at least one prior data set that matches the new data set.
Type: Grant
Filed: August 10, 2020
Date of Patent: September 19, 2023
Assignee: International Business Machines Corporation
Inventors: Dakuo Wang, Arunima Chaudhary, Chuang Gan, Mo Yu, Qian Pan, Sijia Liu, Daniel Karl I. Weidele, Abel Valente
-
Patent number: 11741722
Abstract: A vehicle light signal detection and recognition method, system, and computer program product include bounding, using a coarse attention module, one or more regions of an image of an automobile including at least one of a brake light and a signal light generated by automobile signals which include illuminated sections to generate one or more bounded regions, removing, using a fine attention module, noise from the one or more bounded regions to generate one or more noise-free bounded regions, and identifying the at least one of the brake light and the signal light from the one or more noise-free bounded regions.
Type: Grant
Filed: September 4, 2020
Date of Patent: August 29, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bo Wu, Chuang Gan, Yang Zhang, Dakuo Wang
-
Patent number: 11736423
Abstract: Systems, computer-implemented methods, and/or computer program products facilitating a process to identify and respond to a primary electronic message are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a determination component that can determine that a primary electronic message has not received a response electronic message. An analysis component can generate a generated electronic message addressing the informational or emotional content of the primary electronic message. In one or more embodiments, an updating component can update the analytical model based on one or more feedbacks to the generated electronic message, where the analytical model can remain active while being updated.
Type: Grant
Filed: May 4, 2021
Date of Patent: August 22, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dakuo Wang, Mo Yu, Chuang Gan, Bo Wu
-
Patent number: 11727686
Abstract: Systems and techniques that facilitate few-shot temporal action localization based on graph convolutional networks are provided. In one or more embodiments, a graph component can generate a graph that models a support set of temporal action classifications. Nodes of the graph can correspond to respective temporal action classifications in the support set. Edges of the graph can correspond to similarities between the respective temporal action classifications. In various embodiments, a convolution component can perform a convolution on the graph, such that the nodes of the graph output respective matching scores indicating levels of match between the respective temporal action classifications and an action to be classified. In various embodiments, an instantiation component can input into the nodes respective input vectors based on a proposed feature vector representing the action to be classified.
Type: Grant
Filed: September 21, 2021
Date of Patent: August 15, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Chuang Gan, Ming Tan, Yang Zhang, Dakuo Wang