Patents by Inventor Sheng Zhao

Sheng Zhao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12222907
    Abstract: Embodiments of the present disclosure provide a method, apparatus, device and storage medium for data processing. If a kernel module sends a data calling request to a userspace process, first modification time information and second modification time information of data corresponding to the data calling request are obtained by the userspace process. The first and second modification time information are used to indicate modification time information of the data in the kernel module and in a file service end, respectively. The first and second modification time information are compared by the userspace process. If the first and second modification time information are inconsistent, a verification invalidation result is returned to the kernel module. The data in the kernel module is invalidated by the kernel module according to the verification invalidation result. The data in the file service is synchronized by the userspace process to the kernel module.
    Type: Grant
    Filed: June 3, 2024
    Date of Patent: February 11, 2025
    Assignee: BEIJING VOLCANO ENGINE TECHNOLOGY CO., LTD.
    Inventors: Jiachen Zhang, Qiming Guan, Yongji Xie, Peng Li, Haiyu Wang, Sheng Zhao, Zewen Jin, Liming Wang, Tianci Zhang, Jinfeng Yang, Wen Chai
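
A minimal sketch of the validation flow described in the abstract above (patent 12222907), assuming simplified stand-in types: the userspace process compares the kernel module's recorded modification time with the file service's, and on a mismatch invalidates the kernel copy and synchronizes it from the file service. The names (KernelCache, FileService, handle_call_request) are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class KernelCache:
    """Stands in for the kernel module's cached copy of one data item."""
    data: bytes
    mtime: float  # first modification time information (kernel side)


@dataclass
class FileService:
    """Stands in for the file service end holding the authoritative copy."""
    data: bytes
    mtime: float  # second modification time information (file service side)


def handle_call_request(cache: KernelCache, backend: FileService) -> bytes:
    # The userspace process compares the two pieces of modification time information.
    if cache.mtime != backend.mtime:
        # Verification invalidation result: the kernel module drops its stale copy...
        cache.data = b""
        # ...and the userspace process synchronizes the file-service data back to it.
        cache.data, cache.mtime = backend.data, backend.mtime
    return cache.data


print(handle_call_request(KernelCache(b"old", 1.0), FileService(b"new", 2.0)))  # b'new'
```
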
  • Publication number: 20250045252
    Abstract: Embodiments of the present disclosure provide a method, apparatus, device and storage medium for data processing. If a kernel module sends a data calling request to a userspace process, first modification time information and second modification time information of data corresponding to the data calling request are obtained by the userspace process. The first and second modification time information are used to indicate modification time information of the data in the kernel module and in a file service end, respectively. The first and second modification time information are compared by the userspace process. If the first and second modification time information are inconsistent, a verification invalidation result is returned to the kernel module. The data in the kernel module is invalidated by the kernel module according to the verification invalidation result. The data in the file service is synchronized by the userspace process to the kernel module.
    Type: Application
    Filed: June 3, 2024
    Publication date: February 6, 2025
    Inventors: Jiachen ZHANG, Qiming Guan, Yongji Xie, Peng Li, Haiyu Wang, Sheng Zhao, Zewen Jin, Liming Wang, Tianci Zhang, Jinfeng Yang, Wen Chai
  • Publication number: 20250045249
    Abstract: Embodiments of the present disclosure provide a method, an apparatus, a device and a storage medium for data request processing. The method is applied to a filesystem in userspace that includes a kernel module and a userspace process. The method includes: obtaining a data request list in the kernel module, where the data request list includes a plurality of data requests to be processed and the virtual address information corresponding to each of the data requests; querying, for each data request, the physical address information corresponding to that request's virtual address information; querying the target cached data corresponding to the data request from a plurality of cached data stored in the kernel module; and accessing the target cached data and processing the data request.
    Type: Application
    Filed: June 3, 2024
    Publication date: February 6, 2025
    Inventors: Qiming GUAN, Jiachen ZHANG, Yongji XIE, Peng LI, Haiyu WANG, Sheng ZHAO, Zewen JIN, Liming WANG, Tianci ZHANG, Jinfeng YANG, Wen CHAI
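
An illustrative sketch of the request flow summarized in the entry above (publication 20250045249), not the actual implementation: walk a data request list, translate each request's virtual address to a physical address, and serve it from cached data held by the kernel module. The page_table and cache dictionaries are hypothetical stand-ins for kernel data structures.

```python
requests = [{"id": 1, "vaddr": 0x1000}, {"id": 2, "vaddr": 0x2000}]   # data request list
page_table = {0x1000: 0x9000, 0x2000: 0xA000}                         # virtual -> physical
cache = {0x9000: b"block A", 0xA000: b"block B"}                      # physical -> cached data

for req in requests:
    paddr = page_table[req["vaddr"]]   # query physical address information for the request
    target = cache[paddr]              # query the target cached data
    print(f"request {req['id']}: served {target!r} from {hex(paddr)}")
```
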
  • Publication number: 20250002906
    Abstract: The present invention provides a DNA-targeting RNA comprising a single-guide RNA (sgRNA) and a ribonucleotide sequence rich in adenine ribonucleotides, and its use in gene editing, and a method for improving the efficiency of sgRNA-mediated gene editing, comprising a step of adding a ribonucleotide sequence rich in adenine ribonucleotides at the 3′ end of the sgRNA.
    Type: Application
    Filed: September 11, 2024
    Publication date: January 2, 2025
    Applicant: uBriGene (MA) Biosciences Inc.
    Inventors: Xiulian SUN, Sheng ZHAO, Xiangyang ZHANG
  • Publication number: 20240425529
    Abstract: Disclosed is a boronic acid derivative. Provided are a compound of formula (I) or a pharmaceutically acceptable salt, solvate, polymorph or isomer thereof, a pharmaceutical composition containing the same, and the use thereof in the treatment of LMP7-related diseases.
    Type: Application
    Filed: October 13, 2022
    Publication date: December 26, 2024
    Inventors: Yeliu WANG, Chang LU, Sheng ZHAO, Jijun LI, Weinan HE, Hongjuan LI
  • Publication number: 20240371356
    Abstract: Systems and methods are provided for generating a lightweight, high-quality streaming text-to-speech (TTS) system. For example, some disclosed systems are configured to obtain a first model comprising one or more layers of a convolutional neural network. Each layer of the convolutional neural network is configured to generate a new output from a previous input. The systems also obtain a second model comprising a recurrent neural network. Subsequent to obtaining the first model and the second model, the systems are configured to compile the one or more layers of the convolutional neural network and the recurrent neural network in a parallel architecture to generate a machine learning module such that each model of the machine learning module is configured to receive input simultaneously.
    Type: Application
    Filed: January 18, 2022
    Publication date: November 7, 2024
    Inventors: Jinzhu LI, Sheng ZHAO, Guangyu WU, Yulin LI, Yanqing LIU
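
A minimal PyTorch sketch of the parallel arrangement described in the entry above (publication 20240371356), under assumed dimensions: a stack of convolutional layers and a recurrent network are placed side by side so both receive the same input. The layer sizes and the way the two outputs are merged are illustrative choices, not taken from the patent.

```python
import torch
import torch.nn as nn


class ParallelTTSModule(nn.Module):
    def __init__(self, in_dim=80, hidden=128):
        super().__init__()
        # First model: one or more layers of a convolutional neural network.
        self.conv = nn.Sequential(
            nn.Conv1d(in_dim, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
        )
        # Second model: a recurrent neural network.
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)

    def forward(self, x):                           # x: (batch, time, in_dim)
        conv_out = self.conv(x.transpose(1, 2)).transpose(1, 2)
        rnn_out, _ = self.rnn(x)                    # both branches receive the same input
        return conv_out + rnn_out                   # one illustrative way to merge them


frames = torch.randn(2, 50, 80)                     # e.g. 50 acoustic frames per utterance
print(ParallelTTSModule()(frames).shape)            # torch.Size([2, 50, 128])
```
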
  • Publication number: 20240320451
    Abstract: Intelligent content is generated automatically using a system of computers including a user device and a cloud-based component that processes the user information. The system performs a process that includes receiving an input document and parsing the input document to generate inputs for a natural language generation model using a text analysis model. The natural language generation model generates one or more candidate presentation scripts based on the inputs. A presentation script is selected from the candidate presentation scripts and displayed. A text-to-speech model may be used to generate a synthesized audio presentation of the presentation script. A final presentation may be generated that includes a visual display of the input document and the corresponding audio presentation in sync with the visual display.
    Type: Application
    Filed: June 6, 2024
    Publication date: September 26, 2024
    Inventors: Ji LI, Konstantin SELESKEROV, Huey-Ru TSAI, Muin Barkatali MOMIN, Ramya TRIDANDAPANI, Sindhu Vigasini JAMBUNATHAN, Amit SRIVASTAVA, Derek Martin JOHNSON, Gencheng WU, Sheng ZHAO, Xinfeng CHEN, Bohan LI
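
A high-level sketch of the pipeline summarized in the entry above (publication 20240320451, and the related grant 12032922 below): parse an input document, generate candidate presentation scripts, select one, and pair synthesized narration with the document's visual sections. Every function here is a hypothetical placeholder, not the system's real API.

```python
def parse_document(doc: str) -> list[str]:
    """Text-analysis step: split the input document into per-section inputs."""
    return [p for p in doc.split("\n\n") if p.strip()]


def generate_candidates(section: str) -> list[str]:
    """Stand-in for the natural language generation model."""
    return [f"Let's look at: {section}", f"This part covers {section.lower()}"]


def select_script(candidates: list[str]) -> str:
    """Pick one candidate; a real system might rank them with a learned score."""
    return candidates[0]


def text_to_speech(script: str) -> bytes:
    """Placeholder for the TTS model producing narration audio."""
    return script.encode("utf-8")


document = "Quarterly results\n\nNext year's roadmap"
presentation = [
    {"visual": sec, "audio": text_to_speech(select_script(generate_candidates(sec)))}
    for sec in parse_document(document)
]
print([slide["visual"] for slide in presentation])
```
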
  • Publication number: 20240233706
    Abstract: According to implementations of the subject matter described herein, a solution is proposed for text to speech. In this solution, an initial phoneme sequence corresponding to text is generated, the initial phoneme sequence comprising feature representations of a plurality of phonemes. A first phoneme sequence is generated by inserting a feature representation of an additional phoneme into the initial phoneme sequence, the additional phoneme being related to a characteristic of spontaneous speech. The duration of a phoneme among the plurality of phonemes and the additional phoneme is determined by using an expert model corresponding to the phoneme, and a second phoneme sequence is generated based on the first phoneme sequence. Spontaneous-style speech corresponding to the text is determined based on the second phoneme sequence. In this way, spontaneous-style speech with more varying rhythms can be generated based on spontaneous-style additional phonemes and multiple expert models.
    Type: Application
    Filed: May 23, 2022
    Publication date: July 11, 2024
    Inventors: Xu Tan, Tao Qin, Sheng Zhao, Tie-Yan Liu
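
A toy sketch of the phoneme-level flow described in the entry above (publication 20240233706), with invented data: a spontaneous-speech phoneme (here a filled pause) is inserted into the initial sequence, and a per-phoneme "expert", reduced to a duration lookup table, assigns durations to form the second sequence.

```python
initial_sequence = ["HH", "AH", "L", "OW"]     # feature representations, simplified to labels
additional_phoneme = "UH_FILLED_PAUSE"         # related to a characteristic of spontaneous speech

# First phoneme sequence: the initial sequence with the additional phoneme inserted.
first_sequence = initial_sequence[:2] + [additional_phoneme] + initial_sequence[2:]

# One "expert model" per phoneme, reduced here to a lookup of typical durations (ms).
duration_experts = {"HH": 60, "AH": 90, "L": 70, "OW": 120, "UH_FILLED_PAUSE": 250}

# Second phoneme sequence: each phoneme paired with its expert-predicted duration.
second_sequence = [(p, duration_experts[p]) for p in first_sequence]
print(second_sequence)
```
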
  • Patent number: 12032922
    Abstract: Intelligent content is generated automatically using a system of computers including a user device and a cloud-based component that processes the user information. The system performs a process that includes receiving an input document and parsing the input document to generate inputs for a natural language generation model using a text analysis model. The natural language generation model generates one or more candidate presentation scripts based on the inputs. A presentation script is selected from the candidate presentation scripts and displayed. A text-to-speech model may be used to generate a synthesized audio presentation of the presentation script. A final presentation may be generated that includes a visual display of the input document and the corresponding audio presentation in sync with the visual display.
    Type: Grant
    Filed: May 12, 2021
    Date of Patent: July 9, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ji Li, Konstantin Seleskerov, Huey-Ru Tsai, Muin Barkatali Momin, Ramya Tridandapani, Sindhu Vigasini Jambunathan, Amit Srivastava, Derek Martin Johnson, Gencheng Wu, Sheng Zhao, Xinfeng Chen, Bohan Li
  • Patent number: 11977167
    Abstract: An improved, efficient method for mapping world points from an environment (e.g., points generated by a LIDAR sensor of an autonomous vehicle) to locations (e.g., pixels) within rolling-shutter images taken of the environment is provided. This improved method allows for accurate localization of the world point in a rolling-shutter image via an iterative process that converges in very few iterations. The method poses the localization process as an iterative process for determining the time, within the rolling-shutter exposure period of the image, at which the world point was imaged by the camera. The method reduces the number of times the world point is projected into the normalized space of the camera image, often converging in three or fewer iterations.
    Type: Grant
    Filed: November 25, 2020
    Date of Patent: May 7, 2024
    Assignee: Waymo LLC
    Inventors: Sheng Zhao, Nicholas Lloyd Armstrong-Crews, Volker Grabe
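
A simplified sketch of the iterative idea in the entry above (patent 11977167), under invented assumptions (linear camera motion, a pinhole model with no rotation, and a proportional row-to-time mapping): guess a time within the rolling-shutter readout, project the world point with the pose at that time, read off the image row, map the row back to its capture time, and repeat until the time stops changing.

```python
import numpy as np

FX = FY = 500.0            # focal lengths (pixels)
CX, CY = 320.0, 240.0      # principal point
ROWS, READOUT = 480, 0.03  # image rows and rolling-shutter readout time (s)


def camera_position(t):
    """Hypothetical camera motion: moving along +x at 1 m/s during readout."""
    return np.array([1.0 * t, 0.0, 0.0])


def project(world_point, t):
    """Pinhole projection of the world point using the camera pose at time t."""
    p = world_point - camera_position(t)          # camera frame (rotation omitted for brevity)
    return FX * p[0] / p[2] + CX, FY * p[1] / p[2] + CY


def localize(world_point, iters=3):
    t = READOUT / 2.0                             # start mid-exposure
    for _ in range(iters):
        u, v = project(world_point, t)            # project with the pose at time t
        t = np.clip(v / ROWS, 0.0, 1.0) * READOUT # image row -> capture time
    return (u, v), t


pixel, t_capture = localize(np.array([0.5, 0.2, 5.0]))
print(pixel, t_capture)                           # settles after very few iterations
```
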
  • Patent number: 11959774
    Abstract: Systems and methods for extrinsic calibration of vehicle-mounted sensors are provided. One example method involves obtaining first sensor data collected by a first sensor and a second sensor while a vehicle is aligned in a first yaw direction. The method also involves obtaining second sensor data collected by the first sensor and the second sensor while the vehicle is aligned in a second yaw direction. The method also involves determining, based on the first sensor data and the second sensor data, (i) first pitch and roll misalignments of the first sensor relative to the vehicle and (ii) second pitch and roll misalignments of the second sensor relative to the first sensor. The method also involves determining third pitch and roll misalignments of the second sensor relative to the vehicle based on (i) the first pitch and roll misalignments and (ii) the second pitch and roll misalignments.
    Type: Grant
    Filed: November 17, 2020
    Date of Patent: April 16, 2024
    Assignee: Waymo LLC
    Inventors: Sheng Zhao, Damien Dusha, Craig Robinson, Volker Grabe
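
A small sketch of the final composition step described in the entry above (patent 11959774): once the first sensor's pitch/roll misalignment relative to the vehicle and the second sensor's misalignment relative to the first sensor are known, the second sensor's misalignment relative to the vehicle follows by composing the two rotations. The angles are made up, and the estimation of the misalignments themselves (from the two yaw orientations) is not shown.

```python
from scipy.spatial.transform import Rotation as R

# First pitch and roll misalignments: sensor 1 relative to the vehicle (degrees).
sensor1_from_vehicle = R.from_euler("xyz", [0.4, -0.2, 0.0], degrees=True)
# Second pitch and roll misalignments: sensor 2 relative to sensor 1.
sensor2_from_sensor1 = R.from_euler("xyz", [-0.1, 0.3, 0.0], degrees=True)

# Third pitch and roll misalignments: sensor 2 relative to the vehicle.
sensor2_from_vehicle = sensor2_from_sensor1 * sensor1_from_vehicle
roll, pitch, _ = sensor2_from_vehicle.as_euler("xyz", degrees=True)
print(f"sensor 2 vs. vehicle: roll {roll:.2f} deg, pitch {pitch:.2f} deg")
```
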
  • Publication number: 20240013790
    Abstract: A method and system for enhancing pronunciation during a speech, the method including receiving audio data, the audio data including a speech, performing at least one of acoustic scoring and language scoring on the speech, determining a pronunciation score of one or more words of the speech based on the acoustic scoring and the language scoring, determining that the pronunciation score for a word does not satisfy a threshold score, responsive to determining that the pronunciation score does not satisfy the threshold score, identifying the word as mispronounced, and responsive to identifying the word as mispronounced, outputting the word and the pronunciation score thereof.
    Type: Application
    Filed: May 28, 2021
    Publication date: January 11, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Runnan LI, Sheng ZHAO, Amit SRIVASTAVA, Huakai LIAO, Ana PARRA, Tapan BOHRA, Akshay MALLIPEDDI, Siliang KANG, Lisha MA, Yinhe WEI
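
A toy sketch of the decision logic in the entry above (publication 20240013790): combine an acoustic score and a language score into a pronunciation score per word and flag words whose score does not satisfy a threshold as mispronounced. The scores, weighting, and threshold are placeholders, not the patent's.

```python
THRESHOLD = 0.6


def pronunciation_score(acoustic: float, language: float) -> float:
    return 0.7 * acoustic + 0.3 * language        # illustrative weighting of the two scores


speech = {"synthesis": (0.9, 0.8), "anemone": (0.4, 0.7)}   # word -> (acoustic, language)
for word, (acoustic, language) in speech.items():
    score = pronunciation_score(acoustic, language)
    if score < THRESHOLD:                         # pronunciation score does not satisfy the threshold
        print(f"mispronounced: {word} ({score:.2f})")
```
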
  • Publication number: 20230360242
    Abstract: An electronic device tracks its motion in an environment while building a three-dimensional visual representation of the environment that is used to correct drift in the tracked motion. A motion tracking module estimates poses of the electronic device based on feature descriptors corresponding to the visual appearance of spatial features of objects in the environment. A mapping module builds a three-dimensional visual representation of the environment based on a stored plurality of maps, and feature descriptors and estimated device poses received from the motion tracking module. The mapping module provides the three-dimensional visual representation of the environment to a localization module, which identifies correspondences between stored and observed feature descriptors. The localization module performs a loop closure by minimizing the discrepancies between matching feature descriptors to compute a localized pose.
    Type: Application
    Filed: July 20, 2023
    Publication date: November 9, 2023
    Inventors: Esha Nerurkar, Simon Lynen, Sheng Zhao
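
A heavily simplified sketch of the correspondence-and-correction idea in the entry above (publication 20230360242, and the related grant 11734846 below): match observed feature descriptors against stored ones, then correct the estimated pose so the discrepancy between matched landmark positions is minimized. Here the pose is reduced to a 2-D translation and matching to exact descriptor equality, both purely for illustration.

```python
import numpy as np

# Stored map: descriptor -> landmark position; observations: descriptor -> position
# predicted from the drifting tracked pose.
stored = {"desc_a": np.array([1.0, 2.0]), "desc_b": np.array([3.0, 1.0])}
observed = {"desc_a": np.array([1.2, 2.1]), "desc_b": np.array([3.2, 1.1])}

# Identify correspondences between stored and observed feature descriptors.
matches = [(stored[d], observed[d]) for d in stored.keys() & observed.keys()]

# Least-squares translation that minimizes the discrepancies (here, the mean residual).
drift = np.mean([obs - ref for ref, obs in matches], axis=0)
localized_pose_correction = -drift
print("apply correction:", localized_pose_correction)   # ~[-0.2, -0.1]
```
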
  • Publication number: 20230298567
    Abstract: Implementations of the subject matter described herein provide a solution for speech synthesis and speech recognition. In this solution, a Text to Speech (TTS) model and an Automatic Speech Recognition (ASR) model supporting at least one language are obtained. The TTS model and the ASR model are adjusted, based on a first set of paired data in a target language, to support the target language. The TTS model is optimized based on the first set of paired data and a first set of synthesized paired data in the target language generated by the ASR model while the ASR model is optimized based on the first set of paired data and a second set of synthesized paired data in the target language generated by the TTS model. As such, the solution can provide TTS and ASR models with high accuracy for languages lacking training data by using less training data.
    Type: Application
    Filed: May 13, 2021
    Publication date: September 21, 2023
    Inventors: Xu Tan, Tao Qin, Jun-Wei Gan, Sheng Zhao, Tie-Yan Liu
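
A schematic sketch of the mutual-training loop in the entry above (publication 20230298567), with trivial stub models standing in for real TTS and ASR networks: both models are first adapted with a small set of real target-language pairs, then each is further optimized on synthetic pairs produced by the other. Nothing below reflects the actual training code.

```python
class StubTTS:
    def fit(self, pairs): self.seen = len(pairs)
    def synthesize(self, text): return f"<audio for {text!r}>"


class StubASR:
    def fit(self, pairs): self.seen = len(pairs)
    def transcribe(self, audio): return f"<text for {audio!r}>"


def train_round(tts, asr, paired, unpaired_text, unpaired_audio):
    tts.fit(paired); asr.fit(paired)                              # adapt to the target language
    tts_synth = [(t, tts.synthesize(t)) for t in unpaired_text]   # pairs synthesized by TTS
    asr_synth = [(asr.transcribe(a), a) for a in unpaired_audio]  # pairs synthesized by ASR
    tts.fit(paired + asr_synth)                                   # TTS optimized with ASR's pairs
    asr.fit(paired + tts_synth)                                   # ASR optimized with TTS's pairs


tts, asr = StubTTS(), StubASR()
train_round(tts, asr, [("hello", "<audio>")], ["good morning"], ["<clip 1>"])
print(tts.seen, asr.seen)   # 2 2
```
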
  • Patent number: 11734846
    Abstract: An electronic device tracks its motion in an environment while building a three-dimensional visual representation of the environment that is used to correct drift in the tracked motion. A motion tracking module estimates poses of the electronic device based on feature descriptors corresponding to the visual appearance of spatial features of objects in the environment. A mapping module builds a three-dimensional visual representation of the environment based on a stored plurality of maps, and feature descriptors and estimated device poses received from the motion tracking module. The mapping module provides the three-dimensional visual representation of the environment to a localization module, which identifies correspondences between stored and observed feature descriptors. The localization module performs a loop closure by minimizing the discrepancies between matching feature descriptors to compute a localized pose.
    Type: Grant
    Filed: May 15, 2020
    Date of Patent: August 22, 2023
    Assignee: GOOGLE LLC
    Inventors: Esha Nerurkar, Simon Lynen, Sheng Zhao
  • Patent number: 11686841
    Abstract: A radar system and a terminal device are provided. The radar system includes a controller and at least two radar modules directly or indirectly connected to the controller. The at least two radar modules include a first radar module and a second radar module, and the first radar module and the second radar module implement time division multiplexing of the controller in a digital domain. Compared with an existing radar system, the radar system in this application can provide more transmit channels, more receive channels, and a larger antenna array size when the two radar systems include the same quantity of controllers.
    Type: Grant
    Filed: December 27, 2021
    Date of Patent: June 27, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Baopeng Wang, Wei Jiang, Sheng Zhao, Zhenjun Ren
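
A toy scheduling sketch for the time division multiplexing described in the entry above (patent 11686841): two radar modules take turns using the shared controller in fixed digital-domain time slots. The slot length and module names are invented for illustration.

```python
SLOT_MS = 10
modules = ["first radar module", "second radar module"]


def controller_schedule(n_slots: int):
    """Assign each digital-domain processing slot to one radar module in turn."""
    return [(slot * SLOT_MS, modules[slot % len(modules)]) for slot in range(n_slots)]


for start_ms, module in controller_schedule(4):
    print(f"t={start_ms:>2} ms: controller serves the {module}")
```
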
  • Publication number: 20230150518
    Abstract: The described aspects and implementations enable efficient calibration of a sensing system of an autonomous vehicle (AV). In one implementation, disclosed is a method and a system to perform the method, the system including the sensing system configured to collect sensing data and a data processing system, operatively coupled to the sensing system. The data processing system is configured to identify reference point(s) in an environment of the AV, determine multiple estimated locations of the reference point(s), and adjust parameters of the sensing system based on a loss function representative of differences of the estimated locations.
    Type: Application
    Filed: November 15, 2021
    Publication date: May 18, 2023
    Inventors: Sheng Zhao, Antonio Teran Espinoza, Volker Grabe, Changchang Wu
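
A compact sketch of the calibration idea in the entry above (publication 20230150518), under invented assumptions (a single scalar yaw misalignment, synthetic range-bearing measurements, and a grid search instead of a real optimizer): the same reference point is estimated from several vehicle poses, a loss is formed from the differences between those estimates, and the sensor parameter is adjusted to minimize it.

```python
import numpy as np


def rot(a):
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])


TRUE_POINT = np.array([10.0, 5.0])   # reference point location, map frame
TRUE_OFFSET = 0.03                   # the unknown sensor yaw misalignment (rad)
poses = [((0.0, 0.0), 0.0), ((4.0, -2.0), 0.8), ((-3.0, 3.0), -1.2)]  # (position, heading)


def measure(position, heading):
    """Synthetic sensor-frame observation of the reference point."""
    return rot(-(heading + TRUE_OFFSET)) @ (TRUE_POINT - np.asarray(position))


def estimates(offset_guess):
    """Map-frame estimated locations of the reference point under a candidate offset."""
    return np.array([rot(h + offset_guess) @ measure(p, h) + np.asarray(p) for p, h in poses])


def loss(offset_guess):
    pts = estimates(offset_guess)
    return np.sum((pts - pts.mean(axis=0)) ** 2)   # differences of the estimated locations


candidates = np.linspace(-0.1, 0.1, 401)           # grid search stands in for the optimizer
best = candidates[int(np.argmin([loss(c) for c in candidates]))]
print(f"recovered yaw misalignment ≈ {best:.4f} rad")   # ≈ 0.0300
```
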
  • Patent number: 11600261
    Abstract: Systems are configured for generating spectrogram data characterized by a voice timbre of a target speaker and a prosody style of a source speaker by converting a waveform of source speaker data to phonetic posteriorgram (PPG) data, extracting additional prosody features from the source speaker data, and generating a spectrogram based on the PPG data and the extracted prosody features. The systems are configured to train and utilize a machine learning model for generating spectrogram data and for training a neural text-to-speech model with the generated spectrogram data.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: March 7, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Shifeng Pan, Lei He, Yulin Li, Sheng Zhao, Chunling Ma
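
A minimal PyTorch sketch of the data flow in the entry above (patent 11600261), under assumed dimensions: phonetic posteriorgram (PPG) frames from the source speaker are concatenated with extracted prosody features and mapped to spectrogram frames. The tiny linear network is a placeholder for the trained conversion model.

```python
import torch
import torch.nn as nn

PPG_DIM, PROSODY_DIM, MEL_DIM = 144, 2, 80


class PPGToSpectrogram(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(PPG_DIM + PROSODY_DIM, 256), nn.ReLU(), nn.Linear(256, MEL_DIM)
        )

    def forward(self, ppg, prosody):         # (batch, time, PPG_DIM), (batch, time, PROSODY_DIM)
        return self.net(torch.cat([ppg, prosody], dim=-1))


ppg = torch.randn(1, 120, PPG_DIM)           # PPG frames derived from the source waveform
prosody = torch.randn(1, 120, PROSODY_DIM)   # e.g. log-F0 and energy per frame
mel = PPGToSpectrogram()(ppg, prosody)
print(mel.shape)                             # torch.Size([1, 120, 80]) spectrogram frames
```
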
  • Publication number: 20220415314
    Abstract: Novel solutions for speech recognition provide contextual spelling correction (CSC) for automatic speech recognition (ASR). Disclosed examples include receiving an audio stream; performing an ASR process on the audio stream to produce an ASR hypothesis; receiving a context list; and, based on at least the ASR hypothesis and the context list, performing spelling correction to produce an output text sequence. A CSC model is used on top of an ASR model, precluding the need to change the original ASR model. This permits run-time user customization based on contextual data, even for large context lists. Some examples include filtering ASR hypotheses for the audio stream and, based on at least the filtering of the ASR hypotheses, determining whether to trigger spelling correction for the ASR hypothesis. Some examples include generating text-to-speech (TTS) audio using preprocessed transcriptions with context phrases to train the CSC model.
    Type: Application
    Filed: August 31, 2022
    Publication date: December 29, 2022
    Inventors: Xiaoqiang WANG, Yanqing LIU, Sheng ZHAO, Jinyu LI
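
A toy post-processing sketch of the arrangement in the entry above (publication 20220415314): a contextual spelling correction step sits on top of an unchanged ASR output and rewrites the hypothesis using a user-supplied context list (for example, contact names). Matching by string similarity with difflib is purely an illustrative stand-in for the trained CSC model.

```python
import difflib


def contextual_spelling_correction(asr_hypothesis: str, context_list: list[str]) -> str:
    corrected = []
    for word in asr_hypothesis.split():
        # Swap a word for the closest context phrase when they are similar enough.
        match = difflib.get_close_matches(word, context_list, n=1, cutoff=0.6)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


context = ["Anjali", "Kubernetes"]   # run-time user customization via the context list
print(contextual_spelling_correction("email anjaly about the rollout", context))
# -> "email Anjali about the rollout"
```
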
  • Publication number: 20220366153
    Abstract: Intelligent content is generated automatically using a system of computers including a user device and a cloud-based component that processes the user information. The system performs a process that includes receiving an input document and parsing the input document to generate inputs for a natural language generation model using a text analysis model. The natural language generation model generates one or more candidate presentation scripts based on the inputs. A presentation script is selected from the candidate presentation scripts and displayed. A text-to-speech model may be used to generate a synthesized audio presentation of the presentation script. A final presentation may be generated that includes a visual display of the input document and the corresponding audio presentation in sync with the visual display.
    Type: Application
    Filed: May 12, 2021
    Publication date: November 17, 2022
    Inventors: Ji LI, Konstantin SELESKEROV, Huey-Ru TSAI, Muin Barkatali MOMIN, Ramya TRIDANDAPANI, Sindhu Vigasini JAMBUNATHAN, Amit SRIVASTAVA, Derek Martin JOHNSON, Gencheng WU, Sheng ZHAO, Xinfeng CHEN, Bohan LI