Patents Examined by Athar N Pasha
-
Patent number: 11830482
Abstract: Embodiments of the present disclosure relate to a method and an apparatus for speech interaction, and a computer readable storage medium. The method may include determining text information corresponding to a received speech signal. The method also includes obtaining label information of the text information by labeling elements in the text information. In addition, the method further includes determining first intention information of the text information based on the label information. The method further includes determining a semantic of the text information based on the first intention information and the label information.
Type: Grant
Filed: June 8, 2020
Date of Patent: November 28, 2023
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
Inventors: Zhen Wu, Yufang Wu, Hua Liang, Jiaxiang Ge, Xingyuan Peng, Jinfeng Bai, Lei Jia
-
Patent number: 11823660
Abstract: Embodiments of the present disclosure disclose a method, apparatus and device for training a network, and a storage medium, relate to the field of artificial intelligence technology such as deep learning and speech analysis. A semantic prediction network comprises: an encoder network and at least one decoder network; and a particular solution is: acquiring a first speech feature of a target speech sample; the target speech sample being a synthesized speech sample or a real speech sample, the synthesized speech sample being attached with a sample syllable label and a semantic label comprising a value of the domain, and the real speech sample being attached with a sample syllable label; and jointly training an initial semantic prediction network and a syllable classification network using the first speech feature of the target speech sample, to obtain a trained semantic prediction network.
Type: Grant
Filed: June 21, 2021
Date of Patent: November 21, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Li Chen, Saisai Zou
-
Patent number: 11806213
Abstract: A speech transmission compensation apparatus that assists discrimination of speech heard by a user, includes: one or more computers each including a memory and a processor configured to: accept input of a speech signal, detect a specific type of sound in the speech signal, analyze an acoustic characteristic of the specific type of sound in the speech signal and output the acoustic characteristic; accept input of the acoustic characteristic being output by the memory and the processor, generate a vibration signal of a duration corresponding to the acoustic characteristic and output the vibration signal; and accept input of the vibration signal being output by the memory and the processor and provide the user with vibration for the duration on the basis of the vibration signal.
Type: Grant
Filed: April 30, 2020
Date of Patent: November 7, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Asuka Ono, Momoko Nakatani, Ai Nakane, Yoko Ishii
-
Patent number: 11803705
Abstract: An information processing apparatus includes a memory and a processor. The memory stores a trouble category into which a trouble occurring in a product is classified and a design-element category into which a design element causing the trouble and included in the product is classified in mutual association, and stores a design-element category into which a design element included in a product is classified and a design-requirement category into which a design requirement of the design element is classified in association with each other.
Type: Grant
Filed: December 7, 2020
Date of Patent: October 31, 2023
Assignee: FUJIFILM Business Innovation Corp.
Inventors: Makoto Fuchigami, Yasuaki Miyazawa, Mari Horie, Hiroshi Murano, Nobukazu Takahashi, Kimihiro Wakabayashi, Masaki Suda, Eiji Ooyama, Hiroaki Murai, Masahiro Ishino
-
Patent number: 11798548
Abstract: The present application discloses a speech recognition method, apparatus, device and readable storage medium, and relates to the technical field of artificial intelligence. A specific implementation includes: an electronic device recognizes a speech signal to obtain a first text; if a first pinyin sequence corresponding to the first text exists in the database, the electronic device uses a correct text corresponding to the first pinyin sequence as the speech recognition result; otherwise, the electronic device performs a fuzzy matching on the first pinyin sequence to obtain multiple second pinyin sequences and second texts corresponding to the second pinyin sequences, and selects the speech recognition result from the multiple second texts.
Type: Grant
Filed: December 21, 2020
Date of Patent: October 24, 2023
Assignee: APOLLO INTELLIGENT CONNECTIVITY (BEIJING) TECHNOLOGY CO., LTD.
Inventors: Yi Zhou, Qie Yin, Long Zhang, Zhen Chen
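The exact-lookup-then-fuzzy-fallback flow described in the abstract of patent 11798548 might be sketched roughly as follows. This is an illustration only, not the patented method: the database contents, the `resolve` function, the use of `difflib` string similarity as the fuzzy matcher, and the rule of picking the most similar candidate are all assumptions made for the sketch. The abstract does not say how the result is selected from the second texts.

```python
from difflib import get_close_matches

# Hypothetical database mapping a pinyin sequence to its correct text.
PINYIN_DB = {
    "da kai kong tiao": "打开空调",
    "da kai che chuang": "打开车窗",
    "guan bi kong tiao": "关闭空调",
}


def resolve(first_pinyin, db=PINYIN_DB, cutoff=0.6):
    """Return the correct text for a pinyin sequence, or None if nothing is close."""
    # Exact hit: the first pinyin sequence exists in the database.
    if first_pinyin in db:
        return db[first_pinyin]
    # Otherwise fuzzy-match to get several candidate ("second") pinyin sequences.
    candidates = get_close_matches(first_pinyin, list(db), n=3, cutoff=cutoff)
    if not candidates:
        return None
    # Assumed selection rule: take the most similar candidate (listed first).
    return db[candidates[0]]
```

Here `resolve("da kai kong diao")` misses the exact lookup and falls back to the closest stored sequence, while an unrelated input yields `None`.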
-
Patent number: 11776544
Abstract: An embodiment of the present invention provides an artificial intelligence (AI) apparatus for recognizing a speech of a user, the artificial intelligence apparatus includes a memory to store a speech recognition model and a processor to obtain a speech signal for a user speech, to convert the speech signal into a text using the speech recognition model, to measure a confidence level for the conversion, to perform a control operation corresponding to the converted text if the measured confidence level is greater than or equal to a reference value, and to provide feedback for the conversion if the measured confidence level is less than the reference value.
Type: Grant
Filed: May 18, 2022
Date of Patent: October 3, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Jaehong Kim, Hyoeun Kim, Hangil Jeong, Heeyeon Choi
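The confidence gate in the abstract of patent 11776544 comes down to one comparison against a reference value. A minimal sketch, assuming a hypothetical `handle_recognition` function, an arbitrary reference value of 0.8, and made-up return labels (none of which come from the patent):

```python
def handle_recognition(text, confidence, reference=0.8):
    """Decide what to do with a recognized text given its confidence level."""
    # At or above the reference value: perform the control operation.
    if confidence >= reference:
        return ("execute", text)
    # Below the reference value: give feedback about the conversion instead.
    return ("feedback", f"Did you mean: {text!r}?")
```

Note that the boundary case (confidence equal to the reference value) executes, matching the "greater than or equal to" wording of the abstract.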
-
Patent number: 11769488
Abstract: A system and method invoke virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.
Type: Grant
Filed: March 3, 2022
Date of Patent: September 26, 2023
Assignee: SoundHound AI IP, LLC
Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
-
Patent number: 11727914
Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
Type: Grant
Filed: December 24, 2021
Date of Patent: August 15, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
-
Patent number: 11710476
Abstract: A voice recognition system includes a microphone configured to receive one or more spoken dialogue commands from a user in a voice recognition session. The system also includes a processor in communication with the microphone. The processor is configured to receive one or more audio files associated with one or more audio events associated with the voice recognition system, execute the one or more audio files in a voice recognition session in an audio event, and output a log report indicating a result of the audio events with the voice recognition session.
Type: Grant
Filed: April 27, 2020
Date of Patent: July 25, 2023
Assignee: ROBERT BOSCH GMBH
Inventors: Xiaowei Zhou, Pongtep Angkititrakul
-
Patent number: 11705130
Abstract: An example method includes, at an electronic device: receiving an indication of a notification; in accordance with receiving the indication of the notification: obtaining one or more data streams from one or more sensors; determining, based on the one or more data streams, whether a user associated with the electronic device is speaking; and in accordance with a determination that the user is not speaking: causing an output associated with the notification to be provided.
Type: Grant
Filed: November 12, 2021
Date of Patent: July 18, 2023
Assignee: Apple Inc.
Inventors: William M. York, Rebecca P. Fish, Gagan A. Gupta, Xinyuan Huang, Heriberto Nieto, Benjamin S. Phipps, Kurt Piersol
-
Patent number: 11669680
Abstract: A set of sentences within a natural language text document are parsed, generating a word-level graph corresponding to a sentence in the set of sentences. Within the word-level graph using a trained entity identification model, a set of entity candidates are identified. From a set of graphs modelling relationships between portions of the set of sentences, a set of embeddings is generated. From a set of pairs of embeddings in the set of embeddings using a set of deconvolution layers, a set of links between entity candidates within the set of entity candidates is extracted. From the set of links and the set of entity candidates, an output graph modelling linkages between portions of the set of sentences within the natural language text document is generated.
Type: Grant
Filed: February 2, 2021
Date of Patent: June 6, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lingfei Wu, Tengfei Ma, Tian Gao, Xiaojie Guo
-
Patent number: 11669740
Abstract: Systems and methods for training a machine-learning model for named-entity recognition. A rule graph is constructed including a plurality of nodes each corresponding to a different labeling rule of a set of labeling rules (including a set of seeding rules of known labeling accuracy and a plurality of candidate rules of unknown labeling accuracy). The nodes are coupled to other nodes based on which rules exhibit the highest semantic similarity. A labeling accuracy metric is estimated for each candidate rule by propagating a labeling confidence metric through the rule graph from the seeding rules to each candidate rule. A subset of labeling rules is then identified by ranking the rules by their labeling confidence metric. The identified subset of labeling rules is applied to unlabeled data to generate a set of weakly labeled named entities and the machine-learning model is trained based on the set of weakly labeled named entities.
Type: Grant
Filed: February 25, 2021
Date of Patent: June 6, 2023
Assignee: Robert Bosch GmbH
Inventors: Xinyan Zhao, Haibo Ding, Zhe Feng
-
Patent number: 11657817
Abstract: Implementations set forth relate to suggesting an alternate interface modality when an automated assistant and/or a user is expected to not understand a particular interaction between the user and the automated assistant. In some instances, the automated assistant can pre-emptively determine that a forthcoming and/or ongoing interaction between a user and an automated assistant may experience interference. Based on this determination, the automated assistant can provide an indication that the interaction may not be successful and/or that the user should interact with the automated assistant through a different modality. For example, the automated assistant can render a keyboard interface at a portable computing device when the automated assistant determines that an audio interface of the portable computing device is experiencing interference.
Type: Grant
Filed: November 20, 2020
Date of Patent: May 23, 2023
Assignee: GOOGLE LLC
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 11646019
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses N-best lists of decoded hypotheses, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
Type: Grant
Filed: July 27, 2021
Date of Patent: May 9, 2023
Assignee: Google LLC
Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan
-
Patent number: 11646013
Abstract: In some examples, a user, either a customer or potential customer of a business, engages in conversations with a virtual assistant (VA) provided by the business. The virtual assistant (VA) is further supported by one or more human assistants (HA), if needed. In embodiments, to facilitate seamless transitions between a VA and a HA, when needed, an intelligent decision maker (IDM) is provided. The IDM receives a user question and a proposed answer to the question from a VA, evaluates the proposed answer in the context of the conversation, and determines if the proposed answer requires further review by an HA. In response to a determination that the proposed answer requires further review, the IDM sends the proposed answer to an HA, and, in response to an indication by the HA, takes further action in the conversation.
Type: Grant
Filed: December 30, 2019
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Khoi-Nguyen Dao Tran, Jingshi Li, Mukesh Kumar Mohania, Jaysen Ollerenshaw
-
Patent number: 11640829
Abstract: Aspects of the subject disclosure may include, for example, a device that includes a processing system having a processor and a memory that stores executable instructions that, when executed by the processing system, facilitate performance of operations, where the operations include determining parameters for adapting audio in the content to the device, wherein the device renders the content, and wherein the parameters are based on semantic metadata embedded in the content, adapting the audio in the content based on the parameters, and rendering the content, as adapted by the parameters, to represent a semantic in the semantic metadata. Other embodiments are disclosed.
Type: Grant
Filed: May 27, 2021
Date of Patent: May 2, 2023
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Eric Zavesky, Jason DeCuir, Robert Gratz
-
Patent number: 11631413
Abstract: An electronic apparatus is provided. The electronic apparatus includes a memory and a processor configured to control the electronic apparatus to: classify a plurality of input data into a plurality of types to store in the memory, determine at least one among the input data of the classified plurality of types based on a voice command being recognized among the input data, and provide response information corresponding to the voice command based on the input data of the determined type.
Type: Grant
Filed: June 29, 2022
Date of Patent: April 18, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Inchul Hwang, Hyeonmok Ko, Munjo Kim, Hojung Lee
-
Patent number: 11620515
Abstract: Systems and methods are provided that employ knowledge distillation under a multi-task learning setting. In some embodiments, the systems and methods are implemented with a larger teacher model and a smaller student model, each of which comprise one or more shared layers and a plurality of task layers for performing multiple tasks. During training of the teacher model, its shared layers are initialized, and then the teacher model is multi-task refined. The teacher model predicts teacher logits. During training of the student model, its shared layers are initialized. Knowledge distillation is employed to transfer knowledge from the teacher model to the student model by the student model updating its shared layers and task layers, for example, according to the teacher logits of the teacher model. Other features are also provided.
Type: Grant
Filed: December 16, 2019
Date of Patent: April 4, 2023
Assignee: salesforce.com, inc.
Inventors: Linqing Liu, Caiming Xiong
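To make "updating ... according to the teacher logits" in the abstract of patent 11620515 more concrete, here is one common way a distillation signal can be computed for a single task: the KL divergence between temperature-softened teacher and student distributions. This is a generic sketch, not the loss claimed in the patent. The function names, the temperature value, and the choice of KL divergence are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions for one example."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. In a multi-task setting, one such term per task layer could be summed, which is again an assumption rather than something the abstract states.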
-
Patent number: 11604984
Abstract: A system comprising a first computing apparatus in communication with multiple second computing apparatuses. The first computing apparatus may obtain a plurality of first trained machine learning models for a task from the multiple second computing apparatuses. At least a portion of parameter values of the plurality of first trained machine learning models may be different from each other. The first computing apparatus may also obtain a plurality of training samples. The first computing apparatus may further determine, based on the plurality of training samples, a second trained machine learning model by learning from the plurality of first trained machine learning models.
Type: Grant
Filed: November 18, 2019
Date of Patent: March 14, 2023
Assignee: SHANGHAI UNITED IMAGING INTELLIGENCE CO., LTD.
Inventors: Abhishek Sharma, Arun Innanje, Ziyan Wu, Shanhui Sun, Terrence Chen
-
Patent number: 11593633
Abstract: Systems, methods, and computer-readable storage devices are disclosed for improved real-time audio processing. One method including: constructing a deep neural network model, including a plurality of at least one-bit neurons, configured to output a predicted label of audio data, the plurality of at least one-bit neurons arranged in a plurality of layers, including at least one hidden layer, and being connected by a plurality of connections, each connection having at least a one-bit weight, wherein one or both of the plurality of at least one-bit neurons and the plurality of connections have a reduced bit precision; receiving a training data set, the training data set including audio data; training the deep neural network model using the training data set; and outputting a trained deep neural network model configured to output a predicted label of real-time audio data.
Type: Grant
Filed: April 13, 2018
Date of Patent: February 28, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ivan Jelev Tashev, Shuayb M Zarar, Matthai Philipose, Jong Hwan Ko