Patents by Inventor Zhenhao Ge

Zhenhao Ge has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11978472
    Abstract: A system for processing and presenting a conversation includes a sensor, a processor, and a presenter. The sensor is configured to capture an audio-form conversation. The processor is configured to automatically transform the audio-form conversation into a transformed conversation. The transformed conversation includes a synchronized text, wherein the synchronized text is synchronized with the audio-form conversation. The presenter is configured to present the transformed conversation including the synchronized text and the audio-form conversation. The presenter is further configured to present the transformed conversation to be navigable, searchable, assignable, editable, and shareable.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: May 7, 2024
    Assignee: Otter.ai, Inc.
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Gelei Chen, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Publication number: 20240087574
    Abstract: Computer-implemented method and system for receiving and processing one or more moment-associating elements. For example, the computer-implemented method includes receiving the one or more moment-associating elements, transforming the one or more moment-associating elements into one or more pieces of moment-associating information, and transmitting at least one piece of the one or more pieces of moment-associating information.
    Type: Application
    Filed: November 20, 2023
    Publication date: March 14, 2024
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Patent number: 11915685
    Abstract: Techniques are described for training neural networks on variable length datasets. The numeric representation of the length of each training sample is randomly perturbed to yield a pseudo-length, and the samples sorted by pseudo-length to achieve lower zero padding rate (ZPR) than completely randomized batching (thus saving computation time) yet higher randomness than strictly sorted batching (thus achieving better model performance than strictly sorted batching).
    Type: Grant
    Filed: March 23, 2023
    Date of Patent: February 27, 2024
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Zhenhao Ge, Lakshmish Kaushik, Saket Kumar, Masanori Omote
  • Patent number: 11869508
    Abstract: Computer-implemented method and system for receiving and processing one or more moment-associating elements. For example, the computer-implemented method includes receiving the one or more moment-associating elements, transforming the one or more moment-associating elements into one or more pieces of moment-associating information, and transmitting at least one piece of the one or more pieces of moment-associating information.
    Type: Grant
    Filed: April 28, 2021
    Date of Patent: January 9, 2024
    Assignee: Otter.ai, Inc.
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Patent number: 11790912
    Abstract: A wake-up word for a digital assistant may be specified by a user to trigger the digital assistant to respond to the wake-up word, with the user providing one or more initial pronunciations of the wake-up word. The wake-up word may be unique, or at least not determined beforehand by a device manufacturer or developer of the digital assistant. The initial pronunciation(s) of the keyword may then be augmented with other potential pronunciations of the wake-up word that might be provided in the future, and those other potential pronunciations may then be pruned down to a threshold number of other potential pronunciations. One or more recordings of the initial pronunciation(s) of the wake-up may then be used to train a phoneme recognizer model to better recognize future instances of the wake-up word being spoken by the user or another person using the initial pronunciation or other potential pronunciations.
    Type: Grant
    Filed: January 3, 2022
    Date of Patent: October 17, 2023
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Lakshmish Kaushik, Zhenhao Ge, Xiaoyu Liu
  • Publication number: 20230326452
    Abstract: Techniques are described for training neural networks on variable length datasets. The numeric representation of the length of each training sample is randomly perturbed to yield a pseudo-length, and the samples sorted by pseudo-length to achieve lower zero padding rate (ZPR) than completely randomized batching (thus saving computation time) yet higher randomness than strictly sorted batching (thus achieving better model performance than strictly sorted batching).
    Type: Application
    Filed: March 23, 2023
    Publication date: October 12, 2023
    Inventors: Zhenhao Ge, Lakshmish Kaushik, Saket Kumar, Masanori Omote
  • Patent number: 11615782
    Abstract: Techniques are described for training neural networks on variable length datasets. The numeric representation of the length of each training sample is randomly perturbed to yield a pseudo-length, and the samples sorted by pseudo-length to achieve lower zero padding rate (ZPR) than completely randomized batching (thus saving computation time) yet higher randomness than strictly sorted batching (thus achieving better model performance than strictly sorted batching).
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: March 28, 2023
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Zhenhao Ge, Lakshmish Kaushik, Saket Kumar, Masanori Omote
  • Publication number: 20220148569
    Abstract: Techniques are described for training neural networks on variable length datasets. The numeric representation of the length of each training sample is randomly perturbed to yield a pseudo-length, and the samples sorted by pseudo-length to achieve lower zero padding rate (ZPR) than completely randomized batching (thus saving computation time) yet higher randomness than strictly sorted batching (thus achieving better model performance than strictly sorted batching).
    Type: Application
    Filed: November 30, 2020
    Publication date: May 12, 2022
    Inventors: Zhenhao Ge, Lakshmish Kaushik, Saket Kumar, Masanori Omote
  • Publication number: 20220130384
    Abstract: A wake-up word for a digital assistant may be specified by a user to trigger the digital assistant to respond to the wake-up word, with the user providing one or more initial pronunciations of the wake-up word. The wake-up word may be unique, or at least not determined beforehand by a device manufacturer or developer of the digital assistant. The initial pronunciation(s) of the keyword may then be augmented with other potential pronunciations of the wake-up word that might be provided in the future, and those other potential pronunciations may then be pruned down to a threshold number of other potential pronunciations. One or more recordings of the initial pronunciation(s) of the wake-up may then be used to train a phoneme recognizer model to better recognize future instances of the wake-up word being spoken by the user or another person using the initial pronunciation or other potential pronunciations.
    Type: Application
    Filed: January 3, 2022
    Publication date: April 28, 2022
    Inventors: Lakshmish Kaushik, Zhenhao Ge
  • Patent number: 11217245
    Abstract: A wake-up word for a digital assistant may be specified by a user to trigger the digital assistant to respond to the wake-up word, with the user providing one or more initial pronunciations of the wake-up word. The wake-up word may be unique, or at least not determined beforehand by a device manufacturer or developer of the digital assistant. The initial pronunciation(s) of the keyword may then be augmented with other potential pronunciations of the wake-up word that might be provided in the future, and those other potential pronunciations may then be pruned down to a threshold number of other potential pronunciations. One or more recordings of the initial pronunciation(s) of the wake-up may then be used to train a phoneme recognizer model to better recognize future instances of the wake-up word being spoken by the user or another person using the initial pronunciation or other potential pronunciations.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: January 4, 2022
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Lakshmish Kaushik, Zhenhao Ge
  • Publication number: 20210327454
    Abstract: A system for processing and presenting a conversation includes a sensor, a processor, and a presenter. The sensor is configured to capture an audio-form conversation. The processor is configured to automatically transform the audio-form conversation into a transformed conversation. The transformed conversation includes a synchronized text, wherein the synchronized text is synchronized with the audio-form conversation. The presenter is configured to present the transformed conversation including the synchronized text and the audio-form conversation. The presenter is further configured to present the transformed conversation to be navigable, searchable, assignable, editable, and shareable.
    Type: Application
    Filed: March 23, 2021
    Publication date: October 21, 2021
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Gelei Chen, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Publication number: 20210319797
    Abstract: Computer-implemented method and system for receiving and processing one or more moment-associating elements. For example, the computer-implemented method includes receiving the one or more moment-associating elements, transforming the one or more moment-associating elements into one or more pieces of moment-associating information, and transmitting at least one piece of the one or more pieces of moment-associating information.
    Type: Application
    Filed: April 28, 2021
    Publication date: October 14, 2021
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Patent number: 11100943
    Abstract: A system for processing and presenting a conversation includes a sensor, a processor, and a presenter. The sensor is configured to capture an audio-form conversation. The processor is configured to automatically transform the audio-form conversation into a transformed conversation. The transformed conversation includes a synchronized text, wherein the synchronized text is synchronized with the audio-form conversation. The presenter is configured to present the transformed conversation including the synchronized text and the audio-form conversation. The presenter is further configured to present the transformed conversation to be navigable, searchable, assignable, editable, and shareable.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: August 24, 2021
    Assignee: Otter.ai, Inc.
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Gelei Chen, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Patent number: 11024316
    Abstract: Computer-implemented method and system for receiving and processing one or more moment-associating elements. For example, the computer-implemented method includes receiving the one or more moment-associating elements, transforming the one or more moment-associating elements into one or more pieces of moment-associating information, and transmitting at least one piece of the one or more pieces of moment-associating information.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: June 1, 2021
    Assignee: Otter.ai, Inc.
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Publication number: 20210129031
    Abstract: The performance of a player of a computer game is noted and the player accorded a latency handicap based thereon. The latency handicap is used to slow down play of the computer game, preferably only during times of high player activity. The latency handicap can be reduced over time or owing to improvement in the player's performance.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 6, 2021
    Inventors: Joshua M. Eads, Matthew D. Bennett, Brendan Matera Rehon, Mahdi Azmandian, Zhenhao Ge, Todd Tokubo
  • Publication number: 20210065699
    Abstract: A wake-up word for a digital assistant may be specified by a user to trigger the digital assistant to respond to the wake-up word, with the user providing one or more initial pronunciations of the wake-up word. The wake-up word may be unique, or at least not determined beforehand by a device manufacturer or developer of the digital assistant. The initial pronunciation(s) of the keyword may then be augmented with other potential pronunciations of the wake-up word that might be provided in the future, and those other potential pronunciations may then be pruned down to a threshold number of other potential pronunciations. One or more recordings of the initial pronunciation(s) of the wake-up may then be used to train a phoneme recognizer model to better recognize future instances of the wake-up word being spoken by the user or another person using the initial pronunciation or other potential pronunciations.
    Type: Application
    Filed: August 29, 2019
    Publication date: March 4, 2021
    Inventors: Lakshmish Kaushik, Zhenhao Ge
  • Patent number: 10755718
    Abstract: A method for classifying speakers includes: receiving, by a speaker recognition system including a processor and memory, input audio including speech from a speaker; extracting, by the speaker recognition system, a plurality of speech frames containing voiced speech from the input audio; computing, by the speaker recognition system, a plurality of features for each of the speech frames of the input audio; computing, by the speaker recognition system, a plurality of recognition scores for the plurality of features; computing, by the speaker recognition system, a speaker classification result in accordance with the recognition scores; and outputting, by the speaker recognition system, the speaker classification result.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: August 25, 2020
    Inventors: Zhenhao Ge, Ananth N. Iyer, Srinath Cheluvaraja, Ram Sundaram, Aravind Ganapathiraju
  • Patent number: 10535000
    Abstract: A method for training a neural network of a neural network based speaker classifier for use in speaker change detection. The method comprises: a) preprocessing input speech data; b) extracting a plurality of feature frames from the preprocessed input speech data; c) normalizing the extracted feature frames of each speaker within the preprocessed input speech data with each speaker's mean and variance; d) concatenating the normalized feature frames to form overlapped longer frames having a frame length and a hop size; e) inputting the overlapped longer frames to the neural network based speaker classifier; and f) training the neural network through forward-backward propagation.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: January 14, 2020
    Inventors: Zhenhao Ge, Ananth Nagaraja Iyer, Srinath Cheluvaraja, Aravind Ganapathiraju
  • Publication number: 20180158463
    Abstract: A method for classifying speakers includes: receiving, by a speaker recognition system including a processor and memory, input audio including speech from a speaker; extracting, by the speaker recognition system, a plurality of speech frames containing voiced speech from the input audio; computing, by the speaker recognition system, a plurality of features for each of the speech frames of the input audio; computing, by the speaker recognition system, a plurality of recognition scores for the plurality of features; computing, by the speaker recognition system, a speaker classification result in accordance with the recognition scores; and outputting, by the speaker recognition system, the speaker classification result.
    Type: Application
    Filed: December 7, 2017
    Publication date: June 7, 2018
    Inventors: Zhenhao Ge, Ananth N. Iyer, Srinath Cheluvaraja, Ram Sundaram, Aravind Ganapathiraju
  • Publication number: 20180039888
    Abstract: A method for training a neural network of a neural network based speaker classifier for use in speaker change detection. The method comprises: a) preprocessing input speech data; b) extracting a plurality of feature frames from the preprocessed input speech data; c) normalizing the extracted feature frames of each speaker within the preprocessed input speech data with each speaker's mean and variance; d) concatenating the normalized feature frames to form overlapped longer frames having a frame length and a hop size; e) inputting the overlapped longer frames to the neural network based speaker classifier; and f) training the neural network through forward-backward propagation.
    Type: Application
    Filed: October 6, 2017
    Publication date: February 8, 2018
    Inventors: Zhenhao Ge, Ananth Nagaraja Iyer, Srinath Cheluvaraja, Aravind Ganapathiraju
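
Illustrative sketch: several abstracts above (for example, Patent number 11915685) describe sorting training samples by a randomly perturbed "pseudo-length" to trade off zero padding rate (ZPR) against batch randomness. The Python below is a minimal sketch of that idea, not the patented implementation. The function names, the uniform noise model, the `perturbation` parameter, and the ZPR formula (padded positions divided by total positions, with each batch padded to its longest sample) are all assumptions made for illustration.

```python
import random


def pseudo_length_batches(lengths, batch_size, perturbation, seed=0):
    """Group sample indices into batches after sorting by perturbed length.

    Assumption: the pseudo-length is the true length plus uniform noise in
    [-perturbation, +perturbation]. The abstracts do not specify a noise model.
    """
    rng = random.Random(seed)
    pseudo = [n + rng.uniform(-perturbation, perturbation) for n in lengths]
    # Sort indices by pseudo-length, then cut the ordering into batches.
    order = sorted(range(len(lengths)), key=lambda i: pseudo[i])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


def zero_padding_rate(lengths, batches):
    """Fraction of padded positions when each batch is padded to its longest sample."""
    padded = sum(max(lengths[i] for i in batch) * len(batch) for batch in batches)
    real = sum(lengths[i] for batch in batches for i in batch)
    return (padded - real) / padded
```

With `perturbation=0` this reduces to strictly sorted batching. Larger values move the batches toward fully random grouping, which generally raises the padding rate while increasing randomness between epochs if the seed is varied.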