Patents by Inventor Yi Ke Wu

Yi Ke Wu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11763544
    Abstract: In an approach to augmenting a caption dataset by leveraging a denoising autoencoder to sample and generate additional captions from the ground truth captions, one or more computer processors generate a plurality of new captions utilizing an autoencoder fed with one or more noisy captions, wherein the autoencoder is trained with a dataset comprising a plurality of ground truth captions. The one or more computer processors calculate an importance weight for each new caption in the plurality of generated new captions as compared to a plurality of associated ground truth captions based on a consensus metric. The one or more computer processors train a caption model with the generated plurality of new captions and associated calculated weights.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: September 19, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shiwan Zhao, Hao Kai Zhang, Yi Ke Wu, Zhong Su
  • Patent number: 11651522
    Abstract: In an approach to improving the image captioning performance of low-resource languages by leveraging multimodal inputs, one or more computer processors encode an image utilizing an image encoder, wherein the image is contained within a triplet comprising the image, one or more high-resource captions, and one or more low-resource captions. The one or more computer processors generate one or more high-resource captions utilizing the encoded image and the triplet inputted into a high-resource decoder. The one or more computer processors encode the one or more generated high-resource captions utilizing a high-resource encoder. The one or more computer processors add adaptive cycle consistency constraints on a set of calculated attention weights associated with the triplet. The one or more computer processors generate one or more low-resource captions by simultaneously inputting the encoded image, the encoded high-resource caption, and the triplet into a trained low-resource decoder.
    Type: Grant
    Filed: July 8, 2020
    Date of Patent: May 16, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shiwan Zhao, Yi Ke Wu, Hao Kai Zhang, Zhong Su
  • Patent number: 11334769
    Abstract: In an approach to augmenting caption datasets, one or more computer processors sample a ratio lambda from a probability distribution based on a pair of datapoints contained in a dataset, wherein each datapoint in the pair comprises an image and an associated caption. The one or more computer processors extend the dataset by generating one or more new datapoints based on the sampled ratio lambda for each pair of datapoints in the dataset, wherein the sampled ratio lambda incorporates an interpolation of features associated with the pair of datapoints into the generated new datapoints. The one or more computer processors identify one or more objects contained within a subsequent image utilizing an image model trained with the extended dataset. The one or more computer processors generate a subsequent caption for the one or more identified objects contained within the subsequent image utilizing a language-generating model trained with the extended dataset.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: May 17, 2022
    Assignee: International Business Machines Corporation
    Inventors: Shiwan Zhao, Yi Ke Wu, Hao Kai Zhang, Zhong Su
  • Publication number: 20220012544
    Abstract: In an approach to augmenting caption datasets, one or more computer processors sample a ratio lambda from a probability distribution based on a pair of datapoints contained in a dataset, wherein each datapoint in the pair comprises an image and an associated caption. The one or more computer processors extend the dataset by generating one or more new datapoints based on the sampled ratio lambda for each pair of datapoints in the dataset, wherein the sampled ratio lambda incorporates an interpolation of features associated with the pair of datapoints into the generated new datapoints. The one or more computer processors identify one or more objects contained within a subsequent image utilizing an image model trained with the extended dataset. The one or more computer processors generate a subsequent caption for the one or more identified objects contained within the subsequent image utilizing a language-generating model trained with the extended dataset.
    Type: Application
    Filed: July 7, 2020
    Publication date: January 13, 2022
    Inventors: Shiwan Zhao, Yi Ke Wu, Hao Kai Zhang, Zhong Su
  • Publication number: 20220012919
    Abstract: In an approach to improving the image captioning performance of low-resource languages by leveraging multimodal inputs, one or more computer processors encode an image utilizing an image encoder, wherein the image is contained within a triplet comprising the image, one or more high-resource captions, and one or more low-resource captions. The one or more computer processors generate one or more high-resource captions utilizing the encoded image and the triplet inputted into a high-resource decoder. The one or more computer processors encode the one or more generated high-resource captions utilizing a high-resource encoder. The one or more computer processors add adaptive cycle consistency constraints on a set of calculated attention weights associated with the triplet. The one or more computer processors generate one or more low-resource captions by simultaneously inputting the encoded image, the encoded high-resource caption, and the triplet into a trained low-resource decoder.
    Type: Application
    Filed: July 8, 2020
    Publication date: January 13, 2022
    Inventors: Shiwan Zhao, Yi Ke Wu, Hao Kai Zhang, Zhong Su
  • Publication number: 20220012534
    Abstract: In an approach to augmenting a caption dataset by leveraging a denoising autoencoder to sample and generate additional captions from the ground truth captions, one or more computer processors generate a plurality of new captions utilizing an autoencoder fed with one or more noisy captions, wherein the autoencoder is trained with a dataset comprising a plurality of ground truth captions. The one or more computer processors calculate an importance weight for each new caption in the plurality of generated new captions as compared to a plurality of associated ground truth captions based on a consensus metric. The one or more computer processors train a caption model with the generated plurality of new captions and associated calculated weights.
    Type: Application
    Filed: July 7, 2020
    Publication date: January 13, 2022
    Inventors: Shiwan Zhao, Hao Kai Zhang, Yi Ke Wu, Zhong Su
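
The caption-augmentation pipeline of patent 11763544 (noisy captions fed to a denoising autoencoder, then consensus-based importance weights) can be illustrated with a toy sketch. This is a hypothetical illustration, not the patented implementation: the function names are invented, the trained autoencoder is elided (the noisy strings stand in for its reconstructions), and Jaccard word overlap is assumed as one possible consensus metric.

```python
import random

def add_noise(caption, drop_prob=0.2, rng=None):
    """Corrupt a caption by randomly dropping words (word-dropout noise)."""
    rng = rng or random.Random(0)
    words = caption.split()
    kept = [w for w in words if rng.random() > drop_prob]
    return " ".join(kept) if kept else caption

def consensus_weight(candidate, ground_truths):
    """Importance weight for a generated caption: mean Jaccard word overlap
    with the associated ground-truth captions (an assumed consensus metric)."""
    cand = set(candidate.lower().split())
    scores = []
    for gt in ground_truths:
        ref = set(gt.lower().split())
        scores.append(len(cand & ref) / len(cand | ref))
    return sum(scores) / len(scores)

# Toy ground-truth captions for one image.
ground_truths = ["a dog runs on the beach", "a brown dog running near the sea"]
rng = random.Random(42)

# In the patent, a trained denoising autoencoder would reconstruct the noisy
# captions into new candidates; this sketch skips the model and treats the
# noisy strings as the generated captions directly.
candidates = [add_noise(c, rng=rng) for c in ground_truths]
weighted = [(c, consensus_weight(c, ground_truths)) for c in candidates]
```

Each (caption, weight) pair would then weight that caption's loss term when training the downstream caption model.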
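
The cycle-consistency idea in patent 11651522 constrains attention weights across the triplet. One plausible reading, sketched below as an assumption rather than the patented formulation: the attention path composed through the high-resource caption (low-resource tokens over high-resource tokens, times high-resource tokens over image regions) should agree with the low-resource decoder's direct attention over image regions, penalized by mean squared error.

```python
def matmul(A, B):
    """Multiply two matrices represented as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def cycle_consistency_loss(attn_lr_over_hr, attn_hr_over_img, attn_lr_over_img):
    """Mean squared error between the composed attention path
    (low-resource tokens -> high-resource tokens -> image regions)
    and the direct low-resource -> image attention."""
    composed = matmul(attn_lr_over_hr, attn_hr_over_img)
    n = len(composed) * len(composed[0])
    return sum(
        (c - d) ** 2
        for crow, drow in zip(composed, attn_lr_over_img)
        for c, d in zip(crow, drow)
    ) / n
```

When the composed path reproduces the direct attention exactly, the loss is zero; during training the term would be added (with an adaptive weight) to the captioning loss.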
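
The lambda-interpolation augmentation of patent 11334769 resembles mixup-style data augmentation. A minimal sketch under that assumption follows; the Beta distribution, the `mixup_pair` name, and interpolating only numeric feature vectors (captions would be handled analogously, e.g. in embedding space) are all illustrative choices, not details taken from the patent.

```python
import random

def mixup_pair(feat_a, feat_b, alpha=0.2, rng=None):
    """Sample a ratio lambda from Beta(alpha, alpha) and linearly interpolate
    two feature vectors: mixed = lam * a + (1 - lam) * b."""
    rng = rng or random.Random(0)
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1 - lam) * b for a, b in zip(feat_a, feat_b)]
    return mixed, lam

# Interpolate the image features of one pair of datapoints to extend the dataset.
mixed, lam = mixup_pair([0.0, 1.0, 2.0], [2.0, 1.0, 0.0], rng=random.Random(7))
```

Repeating this over many sampled pairs yields the extended dataset on which the image model and the language-generating model are then trained.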