Patents Examined by Bharatkumar S Shah
  • Patent number: 11295761
    Abstract: The present disclosure provides a method for constructing a voice detection model and a voice endpoint detection system, and belongs to the technical field of voice recognition. In the method, audio data is first collected and a mixed voice is synthesized; feature extraction is performed on the mixed voice to obtain a 62-dimensional feature, and the feature is then input to a recurrent neural network (RNN) model for training to obtain a voice detection model. The voice endpoint detection system includes a collecting unit, a calculating unit, a transmitting unit, and a terminal, the collecting unit being electrically connected to the calculating unit, and the calculating unit and the terminal being respectively connected to the transmitting unit. The voice detection model can be applied to a real-time conference communication device.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: April 5, 2022
    Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.
    Inventors: Zehuang Fang, Yuanxun Kang, Wanjian Feng
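The flow this abstract describes (extract a fixed 62-dimensional feature per frame, feed it to an RNN that scores voice activity) can be sketched in miniature. This is an illustrative stand-in rather than the patented model: the feature extractor and the one-unit recurrent scorer below are invented for the example.

```python
import math
import random

def extract_features(frame, dim=62):
    # Hypothetical stand-in for the 62-dimensional feature in the abstract
    # (e.g. band energies plus spectral statistics): here, absolute-value
    # sums over `dim` equal slices of the frame.
    step = max(1, len(frame) // dim)
    return [sum(abs(s) for s in frame[i * step:(i + 1) * step]) for i in range(dim)]

class TinyRNNVad:
    # One-unit recurrent scorer: h_t = tanh(w_x . x_t + w_h * h_{t-1} + b).
    # A trained model would learn these weights; here they are random.
    def __init__(self, dim=62, seed=0):
        rng = random.Random(seed)
        self.w_x = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        self.w_h = 0.5
        self.b = 0.0
        self.h = 0.0

    def step(self, x):
        z = sum(w * v for w, v in zip(self.w_x, x)) + self.w_h * self.h + self.b
        self.h = math.tanh(z)
        # Map the hidden state to a speech probability for this frame.
        return 1.0 / (1.0 + math.exp(-self.h))
```

The recurrent state `h` is what lets the detector smooth over single noisy frames, which matters for endpointing in a real-time conference device.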
  • Patent number: 11295725
    Abstract: A method of self-training WaveNet includes receiving a plurality of recorded speech samples and training a first autoregressive neural network using the plurality of recorded speech samples. The trained first autoregressive neural network is configured to output synthetic speech as an audible representation of a text input. The method further includes generating a plurality of synthetic speech samples using the trained first autoregressive neural network. The method additionally includes training a second autoregressive neural network using the plurality of synthetic speech samples from the trained first autoregressive neural network and distilling the trained second autoregressive neural network into a feedforward neural network.
    Type: Grant
    Filed: July 9, 2020
    Date of Patent: April 5, 2022
    Assignee: Google LLC
    Inventors: Manish Sharma, Tom Marius Kenter, Robert Clark
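The four-stage pipeline (train an autoregressive teacher on recordings, generate synthetic samples from it, train a second autoregressive model on those, then distill) can be illustrated with a toy AR(1) predictor standing in for WaveNet. Every function here is invented for the sketch; only the pipeline shape follows the abstract.

```python
def fit_ar1(samples):
    # Least-squares fit of x_t ~= a * x_{t-1}: a stand-in for training an
    # autoregressive network on a corpus of samples.
    num = sum(x1 * x0 for x0, x1 in zip(samples, samples[1:]))
    den = sum(x0 * x0 for x0 in samples[:-1])
    return num / den if den else 0.0

def generate(a, seed_value, n):
    # Autoregressive sampling: each output feeds back as the next input.
    out, x = [], seed_value
    for _ in range(n):
        x = a * x
        out.append(x)
    return out

# 1) Train the first autoregressive model on "recorded" speech samples.
recorded = [1.0, 0.9, 0.81, 0.729, 0.6561]
teacher = fit_ar1(recorded)
# 2) Generate a plurality of synthetic samples with the trained model.
synthetic = generate(teacher, 1.0, 100)
# 3) Train a second autoregressive model purely on the synthetic samples.
student_ar = fit_ar1(synthetic)
# 4) "Distill" by fitting a final (here, one-weight feedforward) predictor
#    to the second model's generated outputs.
distilled = fit_ar1(generate(student_ar, 1.0, 100))
```

In the real method the distillation target is a parallel feedforward network that avoids sample-by-sample generation at inference time; the toy model only shows how each stage consumes the previous stage's outputs.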
  • Patent number: 11295723
    Abstract: A voice synthesis method includes: supplying a first trained model with control data including phonetic identifier data to generate a series of frequency spectra of harmonic components; supplying a second trained model with the control data to generate a waveform signal representative of non-harmonic components; and generating a voice signal including the harmonic components and the non-harmonic components based on the series of frequency spectra of the harmonic components generated by the first trained model and the waveform signal representative of the non-harmonic components generated by the second trained model.
    Type: Grant
    Filed: May 28, 2020
    Date of Patent: April 5, 2022
    Assignee: YAMAHA CORPORATION
    Inventors: Ryunosuke Daido, Masahiro Shimizu
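The two-branch synthesis the abstract describes (one model producing harmonic spectra, another producing a non-harmonic waveform, then summing) can be sketched with sinusoids plus noise. The function names and parameter choices are invented for the illustration.

```python
import math
import random

def harmonic_waveform(f0, amps, sr, n):
    # Harmonic component: a sum of sinusoids at integer multiples of f0,
    # standing in for the frequency spectra from the first trained model.
    return [sum(a * math.sin(2 * math.pi * f0 * (k + 1) * t / sr)
                for k, a in enumerate(amps))
            for t in range(n)]

def aperiodic_waveform(n, level, seed=0):
    # Non-harmonic component: bounded noise, standing in for the waveform
    # signal produced by the second trained model.
    rng = random.Random(seed)
    return [rng.uniform(-level, level) for _ in range(n)]

def synthesize(f0, amps, noise_level, sr=16000, n=160):
    h = harmonic_waveform(f0, amps, sr, n)
    a = aperiodic_waveform(n, noise_level)
    # The final voice signal combines both components sample by sample.
    return [x + y for x, y in zip(h, a)]
```

Splitting voiced (harmonic) and breathy/fricative (non-harmonic) energy into separate models mirrors how the two have very different statistical structure.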
  • Patent number: 11288450
    Abstract: A method includes receiving a set of documents related to data discovery issues, wherein at least a first data discovery issue is unrelated to a second data discovery issue. The method further includes generating a map of terms and words for the set of documents that correspond to concepts. The method further includes providing a user interface that includes a search analytics tool to a user associated with the first data discovery issue.
    Type: Grant
    Filed: August 6, 2020
    Date of Patent: March 29, 2022
    Assignee: Casepoint LLC
    Inventor: Vishalkumar Rajpara
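The "map of terms and words that correspond to concepts" can be pictured as an inverted index from concepts to the documents mentioning their terms. This is a minimal sketch under invented names; the patent's actual mapping and analytics tooling are not specified at this level.

```python
from collections import defaultdict

def build_concept_map(documents, concept_terms):
    # documents: {doc_id: text}; concept_terms: hypothetical
    # {concept: [terms]} dictionary. The map records which documents
    # mention at least one term for each concept.
    index = defaultdict(set)
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        for concept, terms in concept_terms.items():
            if words & set(t.lower() for t in terms):
                index[concept].add(doc_id)
    return dict(index)
```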
  • Patent number: 11282535
    Abstract: Disclosed is an electronic apparatus. The electronic apparatus includes a storage for storing a plurality of filters trained in a plurality of convolutional neural networks (CNNs) respectively and a processor configured to acquire a first spectrogram corresponding to a damaged audio signal, input the first spectrogram to a CNN corresponding to each frequency band to apply the plurality of filters trained in the plurality of CNNs respectively, acquire a second spectrogram by merging output values of the CNNs to which the plurality of filters are applied, and acquire an audio signal reconstructed based on the second spectrogram.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: March 22, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-Hyun Choo, Anton Porov, Jong-Hoon Jeong, Ho-Sang Sung, Eun-Mi Oh, Jong-Youb Ryu
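The reconstruction flow (split the first spectrogram into frequency bands, apply a per-band trained filter, merge the outputs into a second spectrogram) can be sketched with a trivial per-band gain standing in for each CNN. All names here are invented for the example.

```python
def split_bands(spectrogram, n_bands):
    # spectrogram: list of frequency rows, each row = values over time.
    size = len(spectrogram) // n_bands
    return [spectrogram[i * size:(i + 1) * size] for i in range(n_bands)]

def band_filter(band, gain):
    # Stand-in for a trained per-band CNN: a single learned gain.
    return [[v * gain for v in row] for row in band]

def reconstruct(spectrogram, gains):
    bands = split_bands(spectrogram, len(gains))
    filtered = [band_filter(b, g) for b, g in zip(bands, gains)]
    # Merge the per-band outputs back into a full (second) spectrogram,
    # from which the time-domain audio would then be reconstructed.
    return [row for band in filtered for row in band]
```

Training a separate filter per band lets each one specialize, since damage (e.g. lost high frequencies after lossy coding) is rarely uniform across the spectrum.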
  • Patent number: 11282496
    Abstract: A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The device may also select a particular source that is associated with the requested speech output. The device may also generate speech having particular voice characteristics of a particular voice assigned to the particular source.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: March 22, 2022
    Assignee: Google LLC
    Inventors: Ioannis Agiomyrgiannakis, Fergus James Henderson
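The source-to-voice assignment is essentially a routing table consulted at speech-request time. A minimal sketch, with invented names and a tagged string in place of real synthesis:

```python
class SourceVoiceRouter:
    # Hypothetical router assigning a distinct voice to each output
    # source (an app, the OS, a screen area, or a GUI object).
    def __init__(self):
        self.voices = {}

    def assign(self, source, voice):
        self.voices[source] = voice

    def speak(self, source, text):
        # Select the voice assigned to the requesting source, falling
        # back to a default; a real device would synthesize audio with
        # that voice's characteristics.
        voice = self.voices.get(source, "default")
        return f"[{voice}] {text}"
```

Distinct voices per source let a user tell by ear whether, say, a notification came from the mail app or the operating system.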
  • Patent number: 11264046
    Abstract: A speech signal leveling system and method include generating an output signal by applying a frequency-dependent or frequency-independent controllable gain to an input signal, the gain being dependent on a gain control signal, and generating at least one speech detection signal indicative of voice components contained in the input signal. The system and method further include generating the gain control signal based on the input signal and the at least one speech detection signal, and controlling the controllable gain to amplify or attenuate the input signal to have a predetermined mean, maximum, or absolute peak signal level as long as voice components are detected in the input signal.
    Type: Grant
    Filed: July 17, 2018
    Date of Patent: March 1, 2022
    Inventor: Markus E Christoph
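The leveling loop described above (adapt a gain toward a target level, but only while voice is detected) is a classic automatic-gain-control pattern. A minimal sketch under invented names, using a target RMS level and a smoothed gain update:

```python
def level_speech(frames, is_voice, target_rms, gain=1.0, rate=0.1):
    # frames: list of sample lists; is_voice: per-frame speech detection
    # flags (the "speech detection signal"). The gain only adapts while
    # voice components are detected, so pauses and noise do not pump it.
    out = []
    for frame, voiced in zip(frames, is_voice):
        if voiced:
            rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
            if rms > 0:
                desired = target_rms / rms
                gain += rate * (desired - gain)  # smooth gain update
        out.append([s * gain for s in frame])    # apply current gain
    return out, gain
```

Freezing the gain outside voiced frames is what distinguishes speech leveling from plain AGC, which would otherwise amplify background noise during pauses.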
  • Patent number: 11257497
    Abstract: The present disclosure provides a voice wake-up processing method, an apparatus and a storage medium. After acquiring voice wake-up signals collected by audio input devices in at least two audio zones, an electronic device may use the amplitudes of those signals to determine the to-be-woken-up audio zones and correct the to-be-woken-up audio zone identified by a voice engine. This avoids waking up every audio zone whose audio input device collected wake-up signals produced by the same user, and thus improves the accuracy of the electronic device's voice wake-up result. The present disclosure thereby addresses the technical problem that a vehicle-mounted terminal has low voice wake-up accuracy due to an insufficient degree of sound isolation between its audio zones.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: February 22, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Hanying Peng, Nengjun Ouyang
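The amplitude-based correction can be sketched as: when several zones hear the same wake word, trust the zone with the loudest signal over the voice engine's pick. The function and its inputs are invented for the illustration.

```python
def correct_woken_zone(engine_zone, zone_amplitudes):
    # zone_amplitudes: {zone_id: peak amplitude of its wake-up signal}.
    # The speaker is normally closest to the microphone that recorded
    # the loudest signal, so that zone overrides the engine's choice and
    # only one zone wakes even if several microphones heard the user.
    loudest = max(zone_amplitudes, key=zone_amplitudes.get)
    return loudest if engine_zone != loudest else engine_zone
```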
  • Patent number: 11250845
    Abstract: A virtual assistance system for a vehicle is provided. The virtual assistance system includes one or more processors, one or more memory modules communicatively coupled to the one or more processors, an input device communicatively coupled to the one or more processors, an output device communicatively coupled to the one or more processors, and machine readable instructions stored in the one or more memory modules. The virtual assistance system receives, through the input device, a request for an item from a user, retrieves a location of an entity associated with the item, calculates an estimated arrival time when the vehicle will arrive at the location, compares the estimated arrival time with a predetermined time associated with the entity, and instructs the output device to generate a delivery prompt related to the item in response to comparing the estimated arrival time with the predetermined time.
    Type: Grant
    Filed: September 5, 2018
    Date of Patent: February 15, 2022
    Assignee: TOYOTA CONNECTED NORTH AMERICA, INC.
    Inventor: Brian M. Kursar
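The decision at the heart of the system (compare the estimated arrival time with the entity's predetermined time, then emit a prompt) reduces to a small comparison. A sketch with invented names and prompt wording:

```python
def delivery_prompt(eta_minutes, closes_in_minutes, item):
    # Compare the vehicle's estimated arrival against the entity's
    # predetermined time (here, minutes until closing) and produce the
    # output device's prompt text.
    if eta_minutes <= closes_in_minutes:
        return f"You can pick up {item} on arrival."
    return f"The store will be closed before you arrive; reorder {item}?"
```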
  • Patent number: 11250835
    Abstract: A method for changing the audio response voice of a voice control system based on an activation command of a user. The method includes: the user requesting the change of the activation command; the user inputting the new activation command; determining the gender of the new activation command; setting the audio response voice in accordance with the gender in response to the gender being unambiguously masculine or unambiguously feminine; retaining the present audio response voice in response to the gender being neither unambiguously masculine nor unambiguously feminine; and applying the new activation command.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: February 15, 2022
    Inventors: Mark Pleschka, Spyros Kousidis, Sebastian Varges, Zeno Wolze, Kim Maurice Cedziwoda
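The voice-selection rule can be sketched directly: update the response voice only when the new activation command's gender is unambiguous, otherwise keep the present voice. The name lists used to classify gender are invented for the example.

```python
def update_response_voice(current_voice, new_command, masculine, feminine):
    # masculine/feminine: hypothetical name lists used to classify the
    # gender of the new activation command.
    name = new_command.lower()
    if name in masculine and name not in feminine:
        return "male"        # unambiguously masculine command
    if name in feminine and name not in masculine:
        return "female"      # unambiguously feminine command
    return current_voice     # ambiguous: retain the present voice
```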
  • Patent number: 11250839
    Abstract: In non-limiting examples of the present disclosure, systems, methods and devices for training conversational language models are presented. An embedding library may be generated and maintained. Exemplary target inputs and associated intent types may be received. The target inputs may be encoded into contextual embeddings. The embeddings may be added to the embedding library. When a conversational entity receives a new natural language input, that new input may be encoded into a contextual embedding. The new embedding may be added to the embedding library. A similarity score model may be applied to the new embedding and one or more embeddings for the exemplary target inputs. Similarity scores may be calculated based on the application of the similarity score model. A response may be generated by the conversational entity for an intent type for which a similarity score exceeds a threshold value.
    Type: Grant
    Filed: April 16, 2020
    Date of Patent: February 15, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tien Widya Suwandy, David Shigeru Taniguchi, Cezary Antoni Marcjan, Hung-chih Yang
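The matching step the abstract describes (score a new input's embedding against the library of exemplary-input embeddings, then respond for an intent whose score exceeds a threshold) can be sketched with cosine similarity. The encoder that would produce the contextual embeddings is assumed, not shown.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def match_intent(new_embedding, library, threshold=0.8):
    # library: {intent_type: [embeddings of exemplary target inputs]}.
    # Return the intent whose exemplar scores highest against the new
    # input's embedding, provided the score exceeds the threshold.
    best_intent, best_score = None, threshold
    for intent, embeddings in library.items():
        for emb in embeddings:
            score = cosine(new_embedding, emb)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent  # None when no similarity exceeds the threshold
```

Keeping exemplars as embeddings rather than raw text means new intents can be added by encoding a few examples, without retraining the language model.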
  • Patent number: 11238872
    Abstract: A method and apparatus for managing agent interactions with customers of an enterprise are disclosed. The method includes generating a value representative of an emotional state of a customer engaged in an ongoing interaction with a virtual agent (VA) associated with the enterprise. The value is generated based, at least in part, on one or more inputs provided by the customer during the ongoing interaction. The value is compared with a predefined emotional threshold range to determine whether the emotional state of the customer is a non-neutral state. The ongoing interaction is deflected to one of a human agent and a specialized VA capable of empathetically handling the ongoing interaction if it is determined that the emotional state of the customer is the non-neutral state.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: February 1, 2022
    Assignee: [24]7.ai, Inc.
    Inventors: Pallipuram V. Kannan, Anand Sinha
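The deflection logic reduces to a range check on the emotional-state value. A minimal sketch; the neutral range and the rule for choosing between a human agent and a specialized VA are assumptions invented for the example, as the abstract does not specify them.

```python
def route_interaction(emotion_value, neutral_range=(-0.2, 0.2)):
    lo, hi = neutral_range
    if lo <= emotion_value <= hi:
        return "virtual_agent"       # neutral: the VA keeps handling it
    # Non-neutral: deflect to an empathetic handler (assumed split:
    # negative states to a human, positive ones to a specialized VA).
    return "human_agent" if emotion_value < lo else "specialized_va"
```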
  • Patent number: 11238846
    Abstract: [Problem] To gather more effective information efficiently for the correction of device behavior. [Solution] Provided is an information processing device that includes an output control unit that controls, on the basis of a gathered operation history of a device, an output of a reproduced scene pertaining to a behavior that the device has executed on the basis of context information, and a communication unit that receives feedback input by a judge regarding the output reproduced scene. Further provided is an information processing device that includes a communication unit that receives information on a reproduced scene pertaining to a behavior that the device has executed on the basis of context information, and a playback unit that plays back the reproduced scene, wherein the communication unit transmits feedback input by a judge regarding the reproduced scene.
    Type: Grant
    Filed: April 17, 2018
    Date of Patent: February 1, 2022
    Assignee: SONY CORPORATION
    Inventors: Junki Ohmura, Hiroaki Ogawa, Kana Nishikawa, Keisuke Touyama, Shinobu Kuriya, Yasushi Tsuruta
  • Patent number: 11217270
    Abstract: Disclosed is a method for generating training data for training a filled pause detecting model, and a device therefor, which execute on-device artificial intelligence (AI) and/or machine learning algorithms in a 5G communication environment. The method includes acquiring acoustic data including first speech data including a filled pause, second speech data not including a filled pause, and noise, generating a plurality of noise data based on the acoustic data, and generating first training data including a plurality of filled pauses and second training data not including filled pauses by synthesizing the plurality of noise data with the first speech data and the second speech data. According to the present disclosure, training data for training a filled pause detecting model in a simulated noise environment can be generated, and filled pause detection performance for speech data generated in an actual noise environment can be enhanced.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: January 4, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Yun Jin Lee, Jaehun Choi
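The data-generation step (synthesize noise into clips with and without filled pauses to produce positive and negative training sets) can be sketched with simple additive mixing. The function names and the fixed mixing gain are invented for the illustration.

```python
def mix(speech, noise, noise_gain):
    # Additive synthesis of a noisy copy: the noise is tiled or truncated
    # to the speech length and scaled by noise_gain before adding.
    return [s + noise_gain * noise[i % len(noise)] for i, s in enumerate(speech)]

def make_training_data(filled_pause_clips, clean_clips, noises, noise_gain=0.1):
    # Cross each speech clip with each noise signal, yielding first
    # training data (with filled pauses) and second training data
    # (without), both under simulated noise conditions.
    first = [mix(c, n, noise_gain) for c in filled_pause_clips for n in noises]
    second = [mix(c, n, noise_gain) for c in clean_clips for n in noises]
    return first, second
```

Crossing every clip with every noise multiplies the corpus size, which is the point: the detector sees each filled pause under many noise conditions.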
  • Patent number: 11211059
    Abstract: Disclosed herein is an artificial intelligence apparatus for recognizing speech with multiple languages, including a microphone, and a processor configured to obtain, via the microphone, speech data including speech of a user with multiple languages, calculate a word recognition reliability of each word in the obtained speech data using an acoustic model of a main language, calculate a word recognition reliability of each word in the obtained speech data using an acoustic model of at least one sub language, select a language having a highest word recognition reliability for each word, convert the speech data into text in consideration of a word recognition result corresponding to the selected language for each word, and generate a speech recognition result corresponding to the speech data using the converted text.
    Type: Grant
    Filed: November 11, 2019
    Date of Patent: December 28, 2021
    Assignee: LG ELECTRONICS INC.
    Inventors: Jaehong Kim, Hyoeun Kim
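The per-word language selection can be sketched as an argmax over the languages' reliability scores at each word position. The per-language hypotheses and scores would come from the acoustic models; here they are supplied directly, and all names are invented.

```python
def transcribe_multilingual(words, scores_by_language):
    # words: {language: [word hypotheses per position]};
    # scores_by_language: {language: [per-word recognition reliabilities]}.
    # For each word position, keep the hypothesis from the language with
    # the highest word recognition reliability.
    n = len(next(iter(words.values())))
    transcript = []
    for i in range(n):
        best = max(scores_by_language, key=lambda lang: scores_by_language[lang][i])
        transcript.append(words[best][i])
    return " ".join(transcript)
```

Selecting per word (rather than per utterance) is what handles code-switching, where a speaker mixes languages mid-sentence.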
  • Patent number: 11211045
    Abstract: Provided is an artificial intelligence apparatus for predicting a performance of a voice recognition model in a user environment, including: a memory configured to store a performance prediction model; and a processor configured to: obtain first controlled environment data including first controlled environment factors corresponding to a first controlled voice recognition environment and a first controlled voice recognition performance of a target voice recognition model in the first controlled voice recognition environment; obtain first user environment factors corresponding to a first user environment, in which the performance is to be predicted; predict, using the performance prediction model, a first user voice recognition performance of the target voice recognition model in the first user environment from the obtained first controlled environment data and the first user environment factors; and output the predicted first user voice recognition performance.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: December 28, 2021
    Assignee: LG ELECTRONICS INC.
    Inventor: Jonghoon Chae
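The prediction step can be pictured as adjusting the measured controlled-environment performance by the differences between controlled and user environment factors. A linear model is an assumption invented for this sketch; the patent's performance prediction model is not specified at this level.

```python
def predict_performance(controlled_perf, controlled_factors, user_factors, weights):
    # Hypothetical linear performance prediction model: shift the
    # controlled-environment performance by weighted differences between
    # the user environment factors and the controlled environment factors.
    delta = sum(w * (u - c)
                for w, c, u in zip(weights, controlled_factors, user_factors))
    return controlled_perf + delta
```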
  • Patent number: 11195545
    Abstract: A device to perform end-of-utterance detection includes a speaker vector extractor configured to receive a frame of an audio signal and to generate a speaker vector that corresponds to the frame. The device also includes an end-of-utterance detector configured to process the speaker vector and to generate an indicator that indicates whether the frame corresponds to an end of an utterance of a particular speaker.
    Type: Grant
    Filed: October 18, 2019
    Date of Patent: December 7, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Hye Jin Jang, Kyu Woong Hwang, Sungrack Yun, Janghoon Cho
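The detector's core idea (frame-wise speaker vectors, flagged when they stop matching the target speaker) can be sketched with cosine similarity against an enrolled vector. The extractor that would produce the speaker vectors is assumed, and the names are invented.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def end_of_utterance(speaker_vectors, target_vector, threshold=0.5):
    # Flag the first frame whose speaker vector no longer matches the
    # particular speaker (similarity drops below the threshold),
    # indicating that speaker's utterance has ended.
    for i, vec in enumerate(speaker_vectors):
        if cosine(vec, target_vector) < threshold:
            return i
    return None  # the speaker is still talking
```

Conditioning endpointing on *who* is speaking, rather than on silence alone, avoids cutting off the target speaker when someone else interjects.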
  • Patent number: 11195535
    Abstract: A voice recognition device includes a memory and a processor including hardware. The processor is configured to extract a feature of input voice data and set a duration of a silent state after transition of the voice data to the silent state. The duration is used for determining that an input of the voice data is completed.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: December 7, 2021
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Fumio Wada
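The completion rule (declare the input finished once the signal has stayed silent for the configured duration) can be sketched as a counter over per-frame energies. The names and the frame-count representation of the duration are invented for the example.

```python
def utterance_complete(frame_energies, threshold, silence_frames_needed):
    # Declare the voice input complete once energy stays below the
    # threshold for the configured duration (counted here in frames).
    silent = 0
    for i, e in enumerate(frame_energies):
        silent = silent + 1 if e < threshold else 0
        if silent >= silence_frames_needed:
            return i  # frame index where completion is decided
    return None  # input not yet complete
```

Making the duration a tunable parameter (rather than a fixed constant) is the abstract's point: slow speakers need a longer window than fast ones.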
  • Patent number: 11189278
    Abstract: A method, performed by a device, of providing a response message to a user input includes obtaining location information of the device; executing a service providing agent corresponding to the location information; receiving a speech input from a user; generating the response message based on the received speech input, the response message being related to a service provided by the executed service providing agent; and displaying the generated response message, wherein the executed service providing agent generates the response message using a model trained using an artificial intelligence (AI) algorithm, the trained model being one from among a plurality of trained models each corresponding to a respective service from among a plurality of services provided by a respective service providing agent from among a plurality of service providing agents, and wherein the trained model corresponds to the executed service providing agent.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: November 30, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hyungrai Oh, Hyeonmok Ko, Silas Jeon
  • Patent number: 11170173
    Abstract: A method, system and computer program product for improving the understanding of chat transcript data. Chat transcripts are analyzed to classify the utterances into intents and identify products discussed in the chat transcripts. The data of the chat transcripts are divided into categories of utterances associated with products and intents by applying tags to the chat transcripts. The categories of utterances associated with products and intents are then clustered based on sentence similarity. Once the utterances are grouped, a representative utterance is extracted from each cluster, where the representative utterance is the utterance with the highest semantic similarity to the other utterances in the cluster. In this manner, users are provided a more accurate guide to the underlying meaning of the chat transcript data, improving its understanding more efficiently and accurately than current chat transcript analysis tools.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: November 9, 2021
    Assignee: International Business Machines Corporation
    Inventors: Jennifer A. Mallette, Steven W. Jones, Vivek Salve, Jia Liu
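The representative-utterance step (pick the utterance with the highest semantic similarity to the rest of its cluster) is a medoid selection. A minimal sketch using word-overlap similarity as a stand-in for the semantic model; both functions are invented for the example.

```python
def jaccard(a, b):
    # Simple word-overlap similarity, standing in for a trained
    # semantic similarity model.
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def representative_utterance(cluster, similarity=jaccard):
    # Pick the utterance with the highest total similarity to the other
    # utterances in the cluster (a medoid under the given similarity).
    best, best_total = None, float("-inf")
    for u in cluster:
        total = sum(similarity(u, v) for v in cluster if v is not u)
        if total > best_total:
            best, best_total = u, total
    return best
```

Returning an actual utterance from the transcript, rather than a generated summary, keeps the guide faithful to what customers really said.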