Patents by Inventor Hao Tan

Hao Tan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12646507
    Abstract: The disclosed method generates helpful training data for a language model, for example, a model implementing a punctuation restoration task, for real-world ASR texts. The method applies reinforcement learning to a generative AI model, which generates additional data to train the language model. The method allows the generative AI model to learn from real-world ASR text to generate more effective training examples based on gradient feedback from the language model.
    Type: Grant
    Filed: July 12, 2023
    Date of Patent: June 2, 2026
    Assignee: Adobe Inc.
    Inventors: Viet Dac Lai, Trung Bui, Seunghyun Yoon, Quan Tran, Hao Tan, Hanieh Deilamsalehy, Abel Salinas, Franck Dernoncourt
  • Publication number: 20260146284
    Abstract: Provided is a technique relating to codon optimization.
    Type: Application
    Filed: November 24, 2023
    Publication date: May 28, 2026
    Applicant: NANJING GENSCRIPT BIOTECH CO., LTD.
    Inventors: Long FAN, Yuzhuo HE, Lihua ZHANG, Hong LI, Hao TAN, Zhiwei CHEN
  • Publication number: 20260148477
    Abstract: In some embodiments, a computing system receives an input image of a target in a first view. The computing system creates a single-view feature representation of the target using a trained single-view reconstruction model based on the input image. The computing system generates a multi-view feature representation of the target using a pre-trained generative model based on the single-view feature representation. The computing system determines a 3-dimensional (3D) representation of the target based on the multi-view feature representation using a neural volume rendering algorithm. The computing system generates one or more output images of the target in one or more views based on the 3D representation of the target.
    Type: Application
    Filed: November 25, 2024
    Publication date: May 28, 2026
    Inventors: Jimei Yang, Zhenzhen Weng, Zhan Xu, Yang Zhou, Jingyuan Liu, Hao Tan
  • Publication number: 20260149234
    Abstract: An intracavity frequency-doubling laser device includes a first mirror and a second mirror defining a resonance cavity, a gain medium to produce a first lasing light in response to an external pump beam received from outside the cavity, a nonlinear frequency-doubling optical element to generate a second lasing light in response to the first lasing light, a birefringent waveplate to control phase properties of the first and second lasing lights, and a fused quartz element positioned at the exit of the resonance cavity. The second mirror is formed as a relatively thick film on the fused quartz and is controlled to create an output beam of the second lasing light with great accuracy. The four components within the resonance cavity are positioned back-to-back and held in place using an optical glue (forming a compact, miniaturized laser device).
    Type: Application
    Filed: January 9, 2025
    Publication date: May 28, 2026
    Applicant: Fuzhou Photop Optics Co., Ltd.
    Inventors: Yanli Wang, Jian Ding, Yi Huang, Hao Tan, Zhe Liu, Xu Jia, Guanglong Yu, Lei Lin
  • Patent number: 12603958
    Abstract: A system and a method of large-scale networking extended cascaded microphones including: a plurality of cascaded microphones configured for picking up a sound and completing an audio processing according to a reference audio signal received from the cascaded Hub of a superior device after the sound is picked up, and sending processed sound data to the cascaded Hub of the superior device; a cascaded Hub configured for comparing energies after the sound data uploaded by one or more cascaded microphones of a subordinate device are received, and sending the sound data with a maximum energy to the conference integrated machine of the superior device; and a conference integrated machine configured for completing an audio processing according to an audio signal received from a PC machine as a reference audio signal after the sound is picked up.
    Type: Grant
    Filed: July 3, 2024
    Date of Patent: April 14, 2026
    Assignee: SHEN ZHEN PROITAV TECHNOLOGY CO. LTD.
    Inventors: Hao Tan, Junjie Xie, Jia Cao
  • Publication number: 20260073692
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate spatial-temporal positional encodings. For example, the disclosed systems generate a noised token by adding noise to an embedding of a frame of a video. Moreover, the disclosed systems generate a spatial embedding for a token using a centered two-dimensional coordinate map. Further, the disclosed systems generate temporal embeddings for the token from a timestamp of the token in the video. Further, the disclosed systems generate a denoised token by removing noise from the noised token according to spatial-temporal positional encodings that include the spatial embedding and the temporal embedding via a diffusion model. Additionally, the disclosed systems modify parameters of the diffusion model based on a comparison of the denoised token and the token.
    Type: Application
    Filed: October 29, 2024
    Publication date: March 12, 2026
    Inventors: Jianming Zhang, Zhifei Zhang, Wei-An Lin, Hao Tan
  • Publication number: 20260073580
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate an image or a video from a text prompt. For example, the disclosed systems receive a text prompt and generate text tokens from the text prompt. Moreover, the disclosed systems generate combined tokens by combining the text tokens with noised tokens. Further, the disclosed systems generate denoised tokens by removing noise from the noised tokens in a manner that incorporates a context indicated by the text tokens and further generate an image or video from the denoised tokens.
    Type: Application
    Filed: October 29, 2024
    Publication date: March 12, 2026
    Inventors: Kai Zhang, Jianming Zhang, Sai Bi, Zexiang Xu, Hao Tan, Wei-An Lin
  • Publication number: 20260045041
    Abstract: In implementation of techniques for generating meshes by decoding volume representations, a computing device implements a mesh generation system to receive digital images depicting an object from different angles. The mesh generation system generates a volume representation of the object using a transformer model based on the digital images. By decoding information from the volume representation using an algorithm, the mesh generation system then generates a mesh of the object from the volume representation. The mesh generation system then presents the mesh of the object in a user interface.
    Type: Application
    Filed: August 8, 2024
    Publication date: February 12, 2026
    Applicants: Adobe Inc., THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Kai Zhang, Zexiang Xu, Xinyue Wei, Valentin Mathieu Deschaintre, Sai Bi, Kalyan Krishna Sunkavalli, Hao Tan, Fujun Luan, Hao Su
  • Publication number: 20260024278
    Abstract: A method, apparatus, non-transitory computer readable medium, and system for image generation may include obtaining a first image depicting a first view of an object, generating a second image depicting a second view of the object based on the first image, and generating a third image depicting a third view of the object based on the first image, where the third view is structurally consistent with the second view.
    Type: Application
    Filed: July 18, 2024
    Publication date: January 22, 2026
    Inventors: Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi
  • Publication number: 20260017471
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for training a multilingual large language model to embed text into an embedding space of a vision language model comprising a text encoder for a first language and a vision encoder. In particular, in some embodiments, the disclosed systems generate, utilizing the vision encoder, image embeddings for images. Additionally, in some embodiments, the disclosed systems generate, utilizing the multilingual large language model, text embeddings for text in languages other than the first language. Furthermore, in some embodiments, the disclosed systems determine similarity metrics between the image embeddings for the images and the text embeddings for the text. Moreover, in some embodiments, the disclosed systems adjust parameters of the multilingual large language model to reduce an output of a contrastive loss function based on the similarity metrics without adjusting parameters of the vision encoder.
    Type: Application
    Filed: July 12, 2024
    Publication date: January 15, 2026
    Inventors: Handong Zhao, Tracy King, Kushal Kafle, Rohith Reddy Katikireddy, Sanat Sharma, Scott Cohen, Seunghyun Yoon, Trung Bui, Tushar Vatsa, Venkata Naveen Kumar Yadav Marri, Wei-ting Hsu, Hao Tan, Fangzheng Wu, Amine Ben Khalifa, Ajinkya Gorakhnath Kale
  • Patent number: 12524954
    Abstract: Systems and methods for generating a 3D model from a single input image are described. Embodiments are configured to obtain an input image and camera view information corresponding to the input image; encode the input image to obtain 2D features comprising a plurality of 2D tokens corresponding to patches of the input image; decode the 2D features based on the camera view information to obtain 3D features comprising a plurality of 3D tokens corresponding to regions of a 3D representation; and generate a 3D model of the input image based on the 3D features.
    Type: Grant
    Filed: September 5, 2023
    Date of Patent: January 13, 2026
    Assignee: ADOBE INC.
    Inventors: Hao Tan, Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan K. Sunkavalli, Trung Huu Bui
  • Publication number: 20250390713
    Abstract: In some embodiments, a computing system receives an input prompt describing a 3-dimensional (3D) object. The computing system generates one or more levels of latent features based on the input prompt using a latent diffusion model. The computing system decodes the one or more levels of latent features to generate a 3D shape representation using a hierarchical autoencoder. The computing system generates an output shape based on the 3D shape representation.
    Type: Application
    Filed: June 25, 2024
    Publication date: December 25, 2025
    Inventors: Yang Zhou, Yicong Hong, Sai Bi, Kai Zhang, Hao Tan, Feng Liu, Difan Liu
  • Publication number: 20250336154
    Abstract: In implementation of techniques for three-dimensional reconstructions based on Gaussian primitives, a computing device implements a reconstruction system to receive a first digital image depicting an object from a first angle and a second digital image depicting the object from a second angle. The reconstruction system segments the first digital image and the second digital image into patches. The reconstruction system then generates, using a machine learning model, three-dimensional Gaussian primitives that predict parameters of points of the object in a three-dimensional space that correspond on a per-pixel basis to pixels of the patches. The reconstruction system then forms a three-dimensional reconstruction of the object for display in a user interface by merging the three-dimensional Gaussian primitives.
    Type: Application
    Filed: April 25, 2024
    Publication date: October 30, 2025
    Applicant: Adobe Inc.
    Inventors: Kai Zhang, Hao Tan, Sai Bi, Zexiang Xu, Nanxuan Zhao, Kalyan Krishna Sunkavalli
  • Publication number: 20250322528
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for hierarchical entity segmentation. In particular, in one or more embodiments, the disclosed systems receive a digital image comprising a plurality of object entities. In addition, in some embodiments, the disclosed systems generate, utilizing a segmentation model comprising parameters generated according to pseudo-labels indicating hierarchies of segmentation masks for a set of training digital images, a hierarchical segmentation indicating hierarchical relations of the plurality of object entities of the digital image. Moreover, in some embodiments, the disclosed systems generate, for the digital image, a segmentation map from the hierarchical segmentation of the plurality of object entities.
    Type: Application
    Filed: April 11, 2024
    Publication date: October 16, 2025
    Inventors: Jiuxiang Gu, Jason Wen Yong Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Tong Sun, Shengcao Cao
  • Publication number: 20250314895
    Abstract: A pair of smart glasses described herein includes a temple arm and a removable temple tip. The temple tip is configured to be removably attached to a distal end of the temple arm and the temple tip is configured to be removed by a wearer of the pair of smart glasses. The temple tip includes a battery and an electrical connection configured for transferring power to an electrical component of the pair of smart glasses.
    Type: Application
    Filed: February 24, 2025
    Publication date: October 9, 2025
    Inventors: Tianren Xu, Chuck Consorte, Jason Howard, Karthik Kadirvel, Jun Ho Lee, Gregory Alan Roberts, Bradley Spare, Hao Tan
  • Patent number: 12400384
    Abstract: Embodiments are disclosed for reflowing documents to display semantically related content. The method may include receiving a request to view a document that includes body text and one or more images. A trimodal document relationship model identifies relationships between segments of the body text and the one or more images. A linearized view of the document is generated based on the relationships and the linearized view is caused to be displayed on a user device.
    Type: Grant
    Filed: September 1, 2023
    Date of Patent: August 26, 2025
    Assignee: Adobe Inc.
    Inventors: Christopher Tensmeyer, Fuxiao Liu, Hao Tan, Ani Nenkova
  • Publication number: 20250265831
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and implementing a vision-language model using masked distillation and contrastive image-text training. In particular, in one or more embodiments, the disclosed systems generate, utilizing a vision encoder, an image embedding from a masked digital image comprising a digital image with one or more masked patches. In some embodiments, the disclosed systems generate, utilizing a text encoder, a text embedding from a masked text phrase. In one or more embodiments, the disclosed systems generate, utilizing the vision-language model from the image embedding and the text embedding, a predicted text reconstruction of the text description and a predicted image reconstruction of the digital image.
    Type: Application
    Filed: February 16, 2024
    Publication date: August 21, 2025
    Inventors: Simon Jenni, Sepehr Sameni, Kushal Kafle, Hao Tan
  • Publication number: 20250209278
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for identifying speaker names in transcripts. In particular, in one or more embodiments, the disclosed systems determine, from a set of sentences in a textual transcript of a dialogue, a first sentence spoken by a first speaker and a second sentence spoken by a second speaker. Additionally, in some embodiments, the disclosed systems generate a first feature representation for the first sentence and a second feature representation for the second sentence. Moreover, in some embodiments, the disclosed systems determine a speaker name for at least one of the first sentence or the second sentence by comparing each of the first feature representation and the second feature representation with a name representation for a name spoken in at least one of the first sentence or the second sentence.
    Type: Application
    Filed: December 20, 2023
    Publication date: June 26, 2025
    Inventors: Minh Nguyen, Franck Dernoncourt, Hanieh Deilamsalehy, Hao Tan, Quan Tran, Seunghyun Yoon, Trung Bui
  • Patent number: 12291515
    Abstract: The present invention discloses a pyrrolidine derivative or its optically active isomer, or a pharmaceutically acceptable salt thereof, which is useful as an NAMPT inhibitor, and useful as a potential agent for the chemotherapy of a variety of diseases associated with abnormal NAD+ expression. The pyrrolidine derivative has pyrrolidine as a parent structure, to which pyridinylurea (or substituted pyridinylurea) is attached by an intermediate aliphatic chain, and a side arylformyl (or heterocyclylformyl) group is attached. This structure is an optimized structure of the NAMPT inhibitor FK866, in which the acrylamido group is replaced by a urea structure to increase the water solubility of the compound. Moreover, the difficulty in synthesis is reduced accordingly, which is conducive to the subsequent industrial production.
    Type: Grant
    Filed: December 13, 2022
    Date of Patent: May 6, 2025
    Assignee: Rushi Biotech (Hangzhou) Co., Ltd
    Inventors: Zheming Wang, Hao Tan
  • Publication number: 20250104349
    Abstract: A method, apparatus, non-transitory computer readable medium, and system for 3D model generation include obtaining a plurality of input images depicting an object and a set of 3D position embeddings, where each of the plurality of input images depicts the object from a different perspective, encoding the plurality of input images to obtain a plurality of 2D features corresponding to the plurality of input images, respectively, generating 3D features based on the plurality of 2D features and the set of 3D position embeddings, and generating a 3D model of the object based on the 3D features.
    Type: Application
    Filed: September 24, 2024
    Publication date: March 27, 2025
    Inventors: Sai Bi, Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan K. Sunkavalli