Patents Examined by Bharatkumar S Shah
-
Patent number: 11295761
Abstract: The present disclosure discloses a method for constructing a voice detection model and a voice endpoint detection system, and belongs to the technical field of voice recognition. In the method for constructing a voice detection model, audio data is first collected and a mixed voice is synthesized, feature extraction is performed on the mixed voice to obtain a 62-dimensional feature, and the 62-dimensional feature is then input to a recurrent neural network (RNN) model for training to obtain a voice detection model. The voice endpoint detection system includes a collecting unit, a calculating unit, a transmitting unit, and a terminal, the collecting unit being electrically connected to the calculating unit, and the calculating unit and the terminal being respectively connected to the transmitting unit. The voice detection model can be applied to a real-time conference communication device.
Type: Grant
Filed: May 11, 2020
Date of Patent: April 5, 2022
Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.
Inventors: Zehuang Fang, Yuanxun Kang, Wanjian Feng
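The per-frame detection loop in this abstract can be sketched in miniature. The patent's 62-dimensional feature and trained RNN are not specified here, so the sketch below substitutes a single energy feature and a hand-set one-unit Elman recurrence; all weights and thresholds are illustrative assumptions, not values from the patent.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one audio frame (stand-in feature)."""
    return sum(s * s for s in frame) / len(frame)

def rnn_vad(frames, w_in=4.0, w_rec=0.5, bias=-1.0, threshold=0.5):
    """Run a one-unit recurrent scorer over frames; return per-frame
    speech/non-speech flags, mimicking the RNN-based detector's role."""
    h = 0.0
    flags = []
    for frame in frames:
        x = frame_energy(frame)
        h = math.tanh(w_in * x + w_rec * h + bias)   # recurrent state
        p = 1.0 / (1.0 + math.exp(-4.0 * h))         # squash to (0, 1)
        flags.append(p > threshold)
    return flags

silence = [0.0] * 160
speech = [0.8, -0.7] * 80
flags = rnn_vad([silence, speech, speech, silence])
```

The recurrence gives the detector a short memory, so an isolated loud click is less likely to flip the decision than sustained speech energy.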
-
Patent number: 11295725
Abstract: A method of self-training WaveNet includes receiving a plurality of recorded speech samples and training a first autoregressive neural network using the plurality of recorded speech samples. The trained first autoregressive neural network is configured to output synthetic speech as an audible representation of a text input. The method further includes generating a plurality of synthetic speech samples using the trained first autoregressive neural network. The method additionally includes training a second autoregressive neural network using the plurality of synthetic speech samples from the trained first autoregressive neural network and distilling the trained second autoregressive neural network into a feedforward neural network.
Type: Grant
Filed: July 9, 2020
Date of Patent: April 5, 2022
Assignee: Google LLC
Inventors: Manish Sharma, Tom Marius Kenter, Robert Clark
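The four-step pipeline described above (train a teacher, generate synthetic samples, retrain, distill into a student) can be shown as runnable control flow. The "models" below are trivial mean predictors standing in for WaveNet-style networks; everything about them is an illustrative assumption, only the ordering of the steps follows the abstract.

```python
import random

class MeanModel:
    """Stand-in for an autoregressive net: predicts the training mean."""
    def __init__(self):
        self.mean = 0.0
    def train(self, samples):
        flat = [s for sample in samples for s in sample]
        self.mean = sum(flat) / len(flat)
    def generate(self, length, seed=0):
        rng = random.Random(seed)
        return [self.mean + 0.1 * (2 * rng.random() - 1) for _ in range(length)]

def self_train(recorded, n_synthetic=4, length=16):
    teacher = MeanModel()
    teacher.train(recorded)                                   # step 1: train on recorded speech
    synthetic = [teacher.generate(length, seed=i)             # step 2: generate synthetic samples
                 for i in range(n_synthetic)]
    second = MeanModel()
    second.train(synthetic)                                   # step 3: train on synthetic samples
    student = MeanModel()
    student.train([second.generate(length, seed=i)            # step 4: distill by matching outputs
                   for i in range(n_synthetic)])
    return student

student = self_train([[0.5] * 16, [0.7] * 16])
```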
-
Patent number: 11295723
Abstract: A voice synthesis method includes: supplying a first trained model with control data including phonetic identifier data to generate a series of frequency spectra of harmonic components; supplying a second trained model with the control data to generate a waveform signal representative of non-harmonic components; and generating a voice signal including the harmonic components and the non-harmonic components based on the series of frequency spectra of the harmonic components generated by the first trained model and the waveform signal representative of the non-harmonic components generated by the second trained model.
Type: Grant
Filed: May 28, 2020
Date of Patent: April 5, 2022
Assignee: YAMAHA CORPORATION
Inventors: Ryunosuke Daido, Masahiro Shimizu
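The two-branch structure of this abstract (a harmonic branch and a non-harmonic branch mixed into one voice signal) can be sketched without the trained models: a sum of sinusoids stands in for the harmonic spectra and white noise for the non-harmonic waveform. The harmonic amplitudes, noise level, and sample rate below are illustrative assumptions.

```python
import math
import random

SAMPLE_RATE = 16000  # assumed sample rate for the sketch

def harmonic_branch(f0, amplitudes, n_samples):
    """Sum of harmonics of f0, weighted by per-harmonic amplitudes."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        s = sum(a * math.sin(2 * math.pi * f0 * (k + 1) * t)
                for k, a in enumerate(amplitudes))
        out.append(s)
    return out

def noise_branch(level, n_samples, seed=0):
    """White noise standing in for the non-harmonic waveform."""
    rng = random.Random(seed)
    return [level * (2 * rng.random() - 1) for _ in range(n_samples)]

def synthesize(f0, amplitudes, noise_level, n_samples):
    """Mix the two branches into the output voice signal."""
    h = harmonic_branch(f0, amplitudes, n_samples)
    n = noise_branch(noise_level, n_samples)
    return [a + b for a, b in zip(h, n)]

voice = synthesize(220.0, [0.5, 0.25, 0.125], 0.01, 256)
```

Separating the branches is what lets each model specialize: the harmonic model only has to capture pitched structure, while the noise model captures breathiness and fricatives.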
-
Patent number: 11288450
Abstract: A method includes receiving a set of documents related to data discovery issues, wherein at least a first data discovery issue is unrelated to a second data discovery issue. The method further includes generating a map of terms and words for the set of documents that correspond to concepts. The method further includes providing a user interface that includes a search analytics tool to a user associated with the first data discovery issue.
Type: Grant
Filed: August 6, 2020
Date of Patent: March 29, 2022
Assignee: Casepoint LLC
Inventor: Vishalkumar Rajpara
-
Patent number: 11282535
Abstract: Disclosed is an electronic apparatus. The electronic apparatus includes a storage for storing a plurality of filters trained in a plurality of convolutional neural networks (CNNs) respectively, and a processor configured to acquire a first spectrogram corresponding to a damaged audio signal, input the first spectrogram to a CNN corresponding to each frequency band to apply the plurality of filters trained in the plurality of CNNs respectively, acquire a second spectrogram by merging output values of the CNNs to which the plurality of filters are applied, and acquire an audio signal reconstructed based on the second spectrogram.
Type: Grant
Filed: July 19, 2018
Date of Patent: March 22, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ki-Hyun Choo, Anton Porov, Jong-Hoon Jeong, Ho-Sang Sung, Eun-Mi Oh, Jong-Youb Ryu
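The band-wise flow in this abstract (split the spectrogram by frequency band, filter each band with its own trained network, merge back) can be sketched with plain lists. A per-band gain stands in for each trained CNN; the band edges and gains are illustrative assumptions.

```python
def split_bands(spectrogram, band_edges):
    """spectrogram: list of frames, each a list of bin magnitudes.
    Returns one sub-spectrogram per (lo, hi) bin range."""
    return [[frame[lo:hi] for frame in spectrogram] for lo, hi in band_edges]

def apply_band_filter(band, gain):
    """Stand-in for one trained CNN: scale every bin in the band."""
    return [[gain * b for b in frame] for frame in band]

def merge_bands(bands):
    """Concatenate per-band frames back into full-width frames."""
    n_frames = len(bands[0])
    return [sum((band[i] for band in bands), []) for i in range(n_frames)]

spec = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]   # 2 frames, 4 bins
edges = [(0, 2), (2, 4)]                              # low band, high band
bands = split_bands(spec, edges)
filtered = [apply_band_filter(b, g) for b, g in zip(bands, [1.0, 0.5])]
restored = merge_bands(filtered)
```

The merge step is what produces the "second spectrogram" of the abstract, from which a time-domain signal would then be reconstructed.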
-
Patent number: 11282496
Abstract: A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The device may also select a particular source that is associated with the requested speech output. The device may also generate speech having particular voice characteristics of a particular voice assigned to the particular source.
Type: Grant
Filed: June 12, 2020
Date of Patent: March 22, 2022
Assignee: Google LLC
Inventors: Ioannis Agiomyrgiannakis, Fergus James Henderson
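The assignment-and-lookup described above reduces to a small mapping. The source names and voice labels below are illustrative assumptions; a real device would map sources to full synthesis parameter sets rather than strings.

```python
def assign_voices(sources, voices):
    """Assign distinct voices round-robin over the identified sources."""
    return {src: voices[i % len(voices)] for i, src in enumerate(sources)}

def speak(request_source, voice_map, default_voice="neutral"):
    """Select the voice characteristics for a speech request's source."""
    return voice_map.get(request_source, default_voice)

voice_map = assign_voices(
    ["navigation_app", "operating_system", "notification_bar"],
    ["low_pitch", "high_pitch", "fast_rate"],
)
chosen = speak("navigation_app", voice_map)
```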
-
Patent number: 11264046
Abstract: A speech signal leveling system and method include generating an output signal by applying a frequency-dependent or frequency-independent controllable gain to an input signal, the gain being dependent on a gain control signal, and generating at least one speech detection signal indicative of voice components contained in the input signal. The system and method further include generating the gain control signal based on the input signal and the at least one speech detection signal, and controlling the controllable-gain block to amplify or attenuate the input signal to a predetermined mean, maximum, or absolute peak signal level as long as voice components are detected in the input signal.
Type: Grant
Filed: July 17, 2018
Date of Patent: March 1, 2022
Inventor: Markus E Christoph
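The leveling behavior described (gain adapts toward a target level only while speech is detected, and holds otherwise) can be sketched as a short loop. Detection here is a simple amplitude gate, and the target level, smoothing factor, and gate threshold are assumptions for illustration.

```python
def frame_level(frame):
    """Absolute peak level of one frame."""
    return max(abs(s) for s in frame)

def level_speech(frames, target=0.5, alpha=0.5, gate=0.05):
    """Return (output_frames, gains). The gain moves toward
    target/level on speech frames and is frozen on non-speech frames,
    so pauses and noise do not get pumped up."""
    gain = 1.0
    out, gains = [], []
    for frame in frames:
        lvl = frame_level(frame)
        if lvl > gate:                            # speech detected
            desired = target / lvl
            gain = (1 - alpha) * gain + alpha * desired
        out.append([gain * s for s in frame])
        gains.append(gain)
    return out, gains

out, gains = level_speech([[0.25] * 4, [0.25] * 4, [0.01] * 4])
```

Freezing the gain during non-speech is the key point of the abstract: a leveler that keeps adapting during silence would amplify background noise.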
-
Patent number: 11257497
Abstract: The present disclosure provides a voice wake-up processing method, an apparatus, and a storage medium. After acquiring voice wake-up signals collected by audio input devices in at least two audio zones, an electronic device may use the amplitudes of those signals to determine the to-be-woken-up audio zone and correct the audio zone identified by the voice engine. This avoids waking up every audio zone whose audio input device collected a wake-up signal produced by the same user, and thereby improves the accuracy of the voice wake-up result. The present disclosure can thus address the low voice wake-up accuracy of vehicle-mounted terminals caused by insufficient sound isolation between their audio zones.
Type: Grant
Filed: December 23, 2019
Date of Patent: February 22, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Hanying Peng, Nengjun Ouyang
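The amplitude-based correction step can be sketched as an argmax over zones: the zone whose microphone captured the loudest wake-up signal overrides the engine's candidate, so only one zone wakes. The zone names and amplitude values are illustrative assumptions.

```python
def correct_wake_zone(engine_zone, zone_amplitudes):
    """Return the zone to wake: the loudest zone wins over the
    engine's candidate when its signal amplitude is strictly higher."""
    loudest = max(zone_amplitudes, key=zone_amplitudes.get)
    if zone_amplitudes[loudest] > zone_amplitudes.get(engine_zone, 0.0):
        return loudest
    return engine_zone

# The driver speaks; leaked sound also reaches other zones' microphones.
amps = {"driver": 0.9, "front-passenger": 0.4, "rear-left": 0.2}
woken = correct_wake_zone("front-passenger", amps)
```

Even if sound leakage caused the engine to nominate the front-passenger zone, the amplitude check reassigns the wake-up to the driver's zone.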
-
Patent number: 11250845
Abstract: A virtual assistance system for a vehicle is provided. The virtual assistance system includes one or more processors, one or more memory modules communicatively coupled to the one or more processors, an input device communicatively coupled to the one or more processors, an output device communicatively coupled to the one or more processors, and machine readable instructions stored in the one or more memory modules. The virtual assistance system receives, through the input device, a request for an item from a user, retrieves a location of an entity associated with the item, calculates an estimated arrival time when the vehicle will arrive at the location, compares the estimated arrival time with a predetermined time associated with the entity, and instructs the output device to generate a delivery prompt related to the item in response to comparing the estimated arrival time with the predetermined time.
Type: Grant
Filed: September 5, 2018
Date of Patent: February 15, 2022
Assignee: TOYOTA CONNECTED NORTH AMERICA, INC.
Inventor: Brian M. Kursar
-
Patent number: 11250835
Abstract: A method for changing the audio response voice of a voice control system based on an activation command of a user. The method includes the user requesting the change of the activation command, the user inputting the new activation command, the gender of the new activation command being determined, setting the audio response voice in accordance with the gender in response to the gender being unambiguously masculine or unambiguously feminine, retaining the present audio response voice in response to the gender being neither unambiguously masculine nor unambiguously feminine, and applying the new activation command.
Type: Grant
Filed: December 19, 2017
Date of Patent: February 15, 2022
Inventors: Mark Pleschka, Spyros Kousidis, Sebastian Varges, Zeno Wolze, Kim Maurice Cedziwoda
-
Patent number: 11250839
Abstract: In non-limiting examples of the present disclosure, systems, methods and devices for training conversational language models are presented. An embedding library may be generated and maintained. Exemplary target inputs and associated intent types may be received. The target inputs may be encoded into contextual embeddings. The embeddings may be added to the embedding library. When a conversational entity receives a new natural language input, that new input may be encoded into a contextual embedding. The new embedding may be added to the embedding library. A similarity score model may be applied to the new embedding and one or more embeddings for the exemplary target inputs. Similarity scores may be calculated based on the application of the similarity score model. A response may be generated by the conversational entity for an intent type for which a similarity score exceeds a threshold value.
Type: Grant
Filed: April 16, 2020
Date of Patent: February 15, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Tien Widya Suwandy, David Shigeru Taniguchi, Cezary Antoni Marcjan, Hung-chih Yang
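The matching step in this abstract can be sketched directly: compare a new input's embedding to stored target embeddings with a similarity score and respond only when a score clears a threshold. Cosine similarity, the toy 3-dimensional embeddings, the intent names, and the 0.8 threshold are all illustrative assumptions, not the patent's similarity score model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def match_intent(new_embedding, library, threshold=0.8):
    """library: {intent_type: embedding}. Return (intent, score) for
    the best-scoring intent above threshold, else None."""
    intent, emb = max(library.items(),
                      key=lambda kv: cosine(new_embedding, kv[1]))
    score = cosine(new_embedding, emb)
    return (intent, score) if score > threshold else None

library = {"greeting": [1.0, 0.0, 0.1], "refund": [0.0, 1.0, 0.2]}
hit = match_intent([0.9, 0.1, 0.1], library)
```

Returning `None` below threshold matters in practice: it gives the conversational entity a signal to fall back to clarification instead of firing a wrong intent.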
-
Patent number: 11238872
Abstract: A method and apparatus for managing agent interactions with customers of an enterprise are disclosed. The method includes generating a value representative of an emotional state of a customer engaged in an ongoing interaction with a virtual agent (VA) associated with the enterprise. The value is generated based, at least in part, on one or more inputs provided by the customer during the ongoing interaction. The value is compared with a predefined emotional threshold range to determine whether the emotional state of the customer is a non-neutral state. The ongoing interaction is deflected to one of a human agent and a specialized VA capable of empathetically handling the ongoing interaction if it is determined that the emotional state of the customer is the non-neutral state.
Type: Grant
Filed: November 20, 2018
Date of Patent: February 1, 2022
Assignee: [24]7.ai, Inc.
Inventors: Pallipuram V. Kannan, Anand Sinha
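The threshold-range deflection rule can be sketched end to end. The scoring below is a toy keyword count and the neutral range of [-0.2, 0.2] is an assumption; the patent's value generation is not specified in this abstract.

```python
# Illustrative sentiment word lists (assumed, not from the patent).
NEGATIVE = {"angry", "terrible", "useless", "refund"}
POSITIVE = {"great", "thanks", "perfect"}

def emotion_value(utterances):
    """Average per-utterance score: positive hits minus negative hits."""
    score = 0
    for u in utterances:
        words = set(u.lower().split())
        score += len(words & POSITIVE) - len(words & NEGATIVE)
    return score / max(len(utterances), 1)

def route(utterances, neutral=(-0.2, 0.2)):
    """Keep the interaction with the VA while the value stays in the
    neutral range; deflect to a human agent when it turns negative."""
    v = emotion_value(utterances)
    if neutral[0] <= v <= neutral[1]:
        return "virtual_agent"
    return "human_agent" if v < neutral[0] else "virtual_agent"

handoff = route(["this is useless", "i am angry"])
```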
-
Patent number: 11238846
Abstract: [Problem] To gather more useful information efficiently for correcting device behavior. [Solution] Provided is an information processing device that includes an output control unit that controls, on the basis of a gathered operation history of a device, an output of a reproduced scene pertaining to a behavior which the device has executed on the basis of context information; and a communication unit that receives feedback input by a judge regarding the reproduced scene thus output. Further provided is an information processing device that includes a communication unit that receives information on a reproduced scene pertaining to a behavior which the device has executed on the basis of context information; and a playback unit that plays back the reproduced scene, wherein the communication unit transmits feedback input by a judge regarding the reproduced scene.
Type: Grant
Filed: April 17, 2018
Date of Patent: February 1, 2022
Assignee: SONY CORPORATION
Inventors: Junki Ohmura, Hiroaki Ogawa, Kana Nishikawa, Keisuke Touyama, Shinobu Kuriya, Yasushi Tsuruta
-
Patent number: 11217270
Abstract: Disclosed is a method for generating training data for training a filled pause detecting model, and a device therefor, which run on-device artificial intelligence (AI) and/or machine learning algorithms in a 5G communication environment. The method includes acquiring acoustic data including first speech data including a filled pause, second speech data not including a filled pause, and noise; generating a plurality of noise data based on the acoustic data; and generating first training data including a plurality of filled pauses and second training data not including a plurality of filled pauses by synthesizing the plurality of noise data with the first speech data and the second speech data. According to the present disclosure, training data for training a filled pause detecting model in a simulated noise environment can be generated, and filled pause detection performance for speech data generated in an actual noise environment can be enhanced.
Type: Grant
Filed: March 4, 2020
Date of Patent: January 4, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Yun Jin Lee, Jaehun Choi
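The data-generation step above (mix each clean speech sample, with and without filled pauses, against several generated noise signals to produce positive and negative training sets) can be sketched as follows. White noise at a fixed amplitude stands in for the generated noise data, and the mixing gain and sample values are illustrative assumptions.

```python
import random

def make_noise(n_samples, amplitude, seed=0):
    """Generate one white-noise signal (stand-in for the noise data)."""
    rng = random.Random(seed)
    return [amplitude * (2 * rng.random() - 1) for _ in range(n_samples)]

def mix(speech, noise, noise_gain=0.5):
    """Additively mix noise into a speech sample of the same length."""
    return [s + noise_gain * n for s, n in zip(speech, noise)]

def build_training_data(filled_pause_speech, clean_speech, n_variants=3):
    """Return (positives, negatives): each speech sample combined with
    several independently generated noise signals."""
    positives, negatives = [], []
    for seed in range(n_variants):
        noise = make_noise(len(filled_pause_speech), 0.2, seed=seed)
        positives.append(mix(filled_pause_speech, noise))
        noise = make_noise(len(clean_speech), 0.2, seed=100 + seed)
        negatives.append(mix(clean_speech, noise))
    return positives, negatives

pos, neg = build_training_data([0.3] * 8, [0.1] * 8)
```

Generating several noise variants per utterance is a standard augmentation trick: the detector sees the same filled pause under different noise conditions, which is what the abstract credits for the improved real-environment performance.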
-
Patent number: 11211059
Abstract: Disclosed herein is an artificial intelligence apparatus for recognizing speech in multiple languages, including a microphone and a processor configured to: obtain, via the microphone, speech data including speech of a user in multiple languages; calculate a word recognition reliability for each word in the obtained speech data using an acoustic model of a main language; calculate a word recognition reliability for each word in the obtained speech data using an acoustic model of at least one sub language; select, for each word, the language having the highest word recognition reliability; convert the speech data into text in consideration of the word recognition result corresponding to the selected language for each word; and generate a speech recognition result corresponding to the speech data using the converted text.
Type: Grant
Filed: November 11, 2019
Date of Patent: December 28, 2021
Assignee: LG ELECTRONICS INC.
Inventors: Jaehong Kim, Hyoeun Kim
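The word-level selection step can be sketched as a per-position argmax over languages: each acoustic model scores every word position, and the most reliable hypothesis wins at each position. The word hypotheses and reliability numbers below are illustrative assumptions, and real systems would also need to align the hypotheses, which the sketch takes as given.

```python
def pick_per_word(hypotheses):
    """hypotheses: {language: [(word, reliability), ...]}, aligned by
    position. Return [(word, language)] choosing the most reliable
    hypothesis at each word position."""
    languages = list(hypotheses)
    n_words = len(hypotheses[languages[0]])
    result = []
    for i in range(n_words):
        best_lang = max(languages, key=lambda l: hypotheses[l][i][1])
        result.append((hypotheses[best_lang][i][0], best_lang))
    return result

# An English main model and a Korean sub model each decode two words.
hyps = {
    "en": [("play", 0.95), ("norae", 0.40)],
    "ko": [("peulle", 0.30), ("norae", 0.90)],
}
mixed = pick_per_word(hyps)
```

The point of per-word (rather than per-utterance) selection is code-switched speech: a single sentence can mix languages, and each word is transcribed by whichever model was most confident about it.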
-
Patent number: 11211045
Abstract: Provided is an artificial intelligence apparatus for predicting the performance of a voice recognition model in a user environment, including: a memory configured to store a performance prediction model; and a processor configured to: obtain first controlled environment data including first controlled environment factors corresponding to a first controlled voice recognition environment and a first controlled voice recognition performance of a target voice recognition model in the first controlled voice recognition environment; obtain first user environment factors corresponding to a first user voice recognition environment, in which the performance is to be predicted; predict, using the performance prediction model, a first user voice recognition performance of the target voice recognition model in the first user voice recognition environment from the obtained first controlled environment data and the first user environment factors; and output the predicted first user voice recognition performance.
Type: Grant
Filed: May 29, 2019
Date of Patent: December 28, 2021
Assignee: LG ELECTRONICS INC.
Inventor: Jonghoon Chae
-
Patent number: 11195545
Abstract: A device to perform end-of-utterance detection includes a speaker vector extractor configured to receive a frame of an audio signal and to generate a speaker vector that corresponds to the frame. The device also includes an end-of-utterance detector configured to process the speaker vector and to generate an indicator that indicates whether the frame corresponds to an end of an utterance of a particular speaker.
Type: Grant
Filed: October 18, 2019
Date of Patent: December 7, 2021
Assignee: QUALCOMM Incorporated
Inventors: Hye Jin Jang, Kyu Woong Hwang, Sungrack Yun, Janghoon Cho
-
Patent number: 11195535
Abstract: A voice recognition device includes a memory and a processor including hardware. The processor is configured to extract a feature of input voice data and set a duration of a silent state after transition of the voice data to the silent state. The duration is used for determining that an input of the voice data is completed.
Type: Grant
Filed: September 6, 2019
Date of Patent: December 7, 2021
Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventor: Fumio Wada
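The completion rule above (declare the input finished only once silence has persisted for a set duration, where that duration can be tuned from the extracted features) can be sketched as a trailing-silence counter. The 20 ms frame length and the example durations are illustrative assumptions.

```python
FRAME_MS = 20  # assumed frame length

def input_complete(frame_is_silent, silence_duration_ms):
    """frame_is_silent: per-frame booleans in time order. Return True
    once an unbroken run of silence reaches silence_duration_ms."""
    needed = silence_duration_ms // FRAME_MS
    run = 0
    for silent in frame_is_silent:
        run = run + 1 if silent else 0   # reset on any speech frame
        if run >= needed:
            return True
    return False

# 100 ms of speech followed by 200 ms of silence.
frames = [False] * 5 + [True] * 10
```

A longer required duration tolerates hesitant speakers at the cost of a slower response, which is exactly why the device sets the duration from the extracted features rather than hard-coding it.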
-
Patent number: 11189278
Abstract: A method, performed by a device, of providing a response message to a user input includes obtaining location information of the device; executing a service providing agent corresponding to the location information; receiving a speech input from a user; generating the response message based on the received speech input, the response message being related to a service provided by the executed service providing agent; and displaying the generated response message, wherein the executed service providing agent generates the response message using a model trained using an artificial intelligence (AI) algorithm, the trained model being one from among a plurality of trained models each corresponding to a respective service from among a plurality of services provided by a respective service providing agent from among a plurality of service providing agents, and wherein the trained model corresponds to the executed service providing agent.
Type: Grant
Filed: March 26, 2019
Date of Patent: November 30, 2021
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hyungrai Oh, Hyeonmok Ko, Silas Jeon
-
Patent number: 11170173
Abstract: A method, system and computer program product for improving the understanding of chat transcript data. Chat transcripts are analyzed to classify the utterances into intents and identify products discussed in the chat transcripts. The data of the chat transcripts are divided into categories of utterances associated with products and intents by applying tags to the chat transcripts. The categories of utterances associated with products and intents are then clustered into clusters based on sentence similarity. Once the utterances are grouped, a representative utterance is extracted from a cluster, where the representative utterance is an utterance that has the highest semantic similarity to the utterances in the cluster. In this manner, users will be provided a more accurate guide as to the underlying meaning of the chat transcript data thereby improving the understanding of the chat transcript data more efficiently and accurately than current chat transcript analysis tools.
Type: Grant
Filed: February 5, 2019
Date of Patent: November 9, 2021
Assignee: International Business Machines Corporation
Inventors: Jennifer A. Mallette, Steven W. Jones, Vivek Salve, Jia Liu
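The representative-utterance step above is a medoid selection: within a cluster, pick the utterance with the highest average similarity to the others. Real systems would use sentence embeddings; the word-overlap (Jaccard) similarity below is an assumption for illustration, not the patent's semantic measure.

```python
def jaccard(a, b):
    """Word-overlap similarity between two utterances."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def representative(cluster):
    """Return the utterance most similar, on average, to the others
    in the cluster (the medoid under the chosen similarity)."""
    def avg_sim(u):
        others = [v for v in cluster if v is not u]
        return sum(jaccard(u, v) for v in others) / len(others)
    return max(cluster, key=avg_sim)

cluster = [
    "my order never arrived",
    "the order never arrived at my house",
    "order arrived broken",
]
rep = representative(cluster)
```

Choosing an actual member of the cluster (rather than, say, an averaged embedding) keeps the summary readable: the user sees a real customer sentence as the label for the whole group.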