Patents Examined by Feng-Tzer Tzeng
-
Patent number: 11967335
Abstract: An augmented reality (AR) device, such as AR glasses, may include a microphone array. The sensitivity of the microphone array can be directed to a target by beamforming, which includes combining the audio of each microphone of the array in a particular way based on a location of the target. The present disclosure describes systems and methods to determine the location of the target based on a gaze of a user and beamform the audio accordingly. This eye-tracked beamforming (i.e., foveated beamforming) can be used by AR applications to enhance sounds from a gaze direction and to suppress sounds from other directions. Additionally, the gaze information can be used to help visualize the results of an AR application, such as speech-to-text.
Type: Grant
Filed: September 3, 2021
Date of Patent: April 23, 2024
Assignee: Google LLC
Inventors: Ruofei Du, Hendrik Wagenaar, Alex Olwal
-
Patent number: 11967336
Abstract: A computing device according to an embodiment is provided with one or more processors and a memory storing one or more programs executed by the one or more processors. The computing device includes a standby state video generating module that generates a standby state video in which a person in a video is in a standby state, a speech state video generating module that generates a speech state video in which a person in a video is in a speech state based on a source of speech content, and a video reproducing module that reproduces the standby state video, and generates a synthesized speech video by synthesizing the standby state video being reproduced and the speech state video.
Type: Grant
Filed: December 22, 2020
Date of Patent: April 23, 2024
Assignee: DEEPBRAIN AI INC.
Inventor: Doohyun Kim
-
Patent number: 11961512
Abstract: An artificial intelligence (AI) system using a machine learning algorithm such as deep learning, and an application thereof are provided. A method of providing, by a device, a voice assistance service includes obtaining a voice input of a user, receiving certain context information from at least one peripheral device, generating first query information from the received context information and the voice input, generating second query information including noise information by inputting the first query information into a noise learning model, transmitting the generated second query information to a server, receiving, from the server, response information obtained based on the transmitted second query information, generating a response message by removing response information corresponding to the noise information from the received response information, and outputting the response message.
Type: Grant
Filed: July 26, 2021
Date of Patent: April 16, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Wonjong Choi, Soofeel Kim, Yewon Park, Jina Ham
-
Patent number: 11955135
Abstract: Eyewear having a speech to moving lips algorithm that receives and translates speech and utterances of a person viewed through the eyewear, and then displays an overlay of moving lips corresponding to the speech and utterances on a mask of the viewed person. A database having text to moving lips information is utilized to translate the speech and generate the moving lips in near-real time with little latency. This translation provides the deaf/hearing impaired users the ability to understand and communicate with the person viewed through the eyewear when they are wearing a mask. The translation may include automatic speech recognition (ASR) and natural language understanding (NLU) as a sound recognition engine.
Type: Grant
Filed: August 23, 2021
Date of Patent: April 9, 2024
Assignee: Snap Inc.
Inventor: Kathleen Worthington McMahon
-
Patent number: 11954446
Abstract: Disclosed herein are methods and systems for automatically generating transactional dialog flows for a virtual assistant based on a set of predefined (labeled) transactional flows defining user interaction paths to accomplish one or more tasks. The virtual assistant is represented by a tree structure comprising a plurality of dialog nodes where each transactional flow comprises a respective subset of the nodes descending from a parent node to one or more completion nodes. New transactional flows may be generated automatically based on similarity with the predefined transactional flows, by selecting a candidate node and one of the predefined transactional flows comprising a completion node most similar to the candidate completion node and further selecting a candidate parent node most similar to the parent node of the selected predefined transactional flow. The new transactional flows may then be generated to comprise all descendant nodes of the candidate parent node.
Type: Grant
Filed: April 8, 2021
Date of Patent: April 9, 2024
Assignee: International Business Machines Corporation
Inventors: Sergey Zeltyn, Avi Yaeli
-
Patent number: 11941356
Abstract: Embodiments described herein propose a densely connected Transformer architecture in which each Transformer layer takes advantage of all previous layers. Specifically, the input for each Transformer layer comes from the outputs of all its preceding layers; and the output information of each layer will be incorporated in all its subsequent layers. In this way, an L-layer Transformer network will have L(L+1)/2 connections. The dense connection thus allows the linguistic information learned by the lower layer to be directly propagated to all upper layers and encourages feature reuse throughout the network. Each layer is thus directly optimized from the loss function in the fashion of implicit deep supervision.
Type: Grant
Filed: October 26, 2020
Date of Patent: March 26, 2024
Assignee: Salesforce, Inc.
Inventors: Linqing Liu, Caiming Xiong
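The L(L+1)/2 figure in the abstract above can be checked with a short count. This is only an illustrative sketch, not the patented architecture: it assumes that layer k (counting from 1) receives one input from the embedding output and one from each of the k-1 Transformer layers before it, so layer k contributes k connections. The function name `dense_connection_count` is hypothetical.

```python
def dense_connection_count(num_layers: int) -> int:
    """Count inputs across all layers of a densely connected stack.

    Assumption (not stated in the abstract): layer k receives k inputs,
    namely the embedding output plus the outputs of layers 1..k-1.
    """
    if num_layers < 0:
        raise ValueError("num_layers must be non-negative")
    return sum(range(1, num_layers + 1))


if __name__ == "__main__":
    # The explicit count agrees with the closed form L(L+1)/2.
    for num_layers in (1, 2, 6, 12):
        closed_form = num_layers * (num_layers + 1) // 2
        print(num_layers, dense_connection_count(num_layers), closed_form)
```

Under that counting assumption, the sum 1 + 2 + ... + L equals the closed form quoted in the abstract.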
-
Patent number: 11942078
Abstract: A computer-implemented method is provided for improving the recognition accuracy of digital speech. The method includes receiving the digital speech. The method further includes splitting the digital speech into overlapping chunks. The method also includes computing a bidirectional encoder embedding of each of the overlapping chunks to obtain bidirectional encoder embeddings. The method additionally includes combining the bidirectional encoder embeddings. The method further includes interpreting, by a speech recognition system, the digital speech using the combined bidirectional encoder embeddings.
Type: Grant
Filed: February 26, 2021
Date of Patent: March 26, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: George Andrei Saon
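The chunk-splitting step named in the abstract above might look like the following minimal sketch. It covers only the splitting; the encoder embedding and the combination of embeddings are not shown. The function name `split_overlapping` and the parameters `chunk_size` and `hop` are hypothetical, and the abstract does not specify chunk length or overlap.

```python
from typing import List, Sequence


def split_overlapping(samples: Sequence[float], chunk_size: int, hop: int) -> List[Sequence[float]]:
    """Split a sample sequence into chunks that overlap by chunk_size - hop.

    Assumption: a fixed chunk size and a fixed hop; the final chunk may be
    shorter than chunk_size when the input length does not line up.
    """
    if chunk_size <= 0 or not 0 < hop <= chunk_size:
        raise ValueError("require chunk_size > 0 and 0 < hop <= chunk_size")
    chunks = []
    start = 0
    while True:
        chunks.append(samples[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the input.
        if start + chunk_size >= len(samples):
            break
        start += hop
    return chunks


if __name__ == "__main__":
    # Ten samples, chunks of four, hop of two: neighbours share two samples.
    for chunk in split_overlapping(list(range(10)), chunk_size=4, hop=2):
        print(chunk)
```

A hop smaller than the chunk size is what produces the overlap; setting `hop == chunk_size` would give non-overlapping chunks.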
-
Patent number: 11934952
Abstract: Embodiments described herein provide natural language processing (NLP) systems and methods that utilize energy-based models (EBMs) to compute an exponentially-weighted energy-like term in the loss function to train an NLP classifier. Specifically, noise contrastive estimation (NCE) procedures are applied together with the EBM-based loss objectives for training the NLPs.
Type: Grant
Filed: December 16, 2020
Date of Patent: March 19, 2024
Assignee: Salesforce, Inc.
Inventors: Tianxing He, Ehsan Hosseini-Asl, Bryan McCann, Caiming Xiong
-
Patent number: 11928434
Abstract: A method for text generation, relating to the field of natural language processing, includes: obtaining corpus data; labeling the corpus data to obtain a first constraint element; obtaining a first generation target; and generating a first text matching the first generation target by inputting the corpus data and the first constraint element into a generation model.
Type: Grant
Filed: August 9, 2021
Date of Patent: March 12, 2024
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang
-
Patent number: 11922966
Abstract: A signal separation device for acquiring a source signal from a mixed signal observed by a plurality of sensors includes: a database that stores feature information of a clean signal; separation matrix calculation means for repeatedly performing processes of, based on a separated signal obtained by multiplication of a mixed signal converted into a time-frequency representation by a separation matrix and on the feature information stored in the database, calculating a parameter to be used for an objective function for optimizing the separation matrix, and calculating a separation matrix for minimizing the objective function using the parameter; and output means for outputting a separated signal calculated using the optimized separation matrix obtained by the separation matrix calculation means.
Type: Grant
Filed: October 1, 2019
Date of Patent: March 5, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventor: Hiroshi Sawada
-
Patent number: 11914964
Abstract: The present application discloses a method and apparatus for training a semantic representation model, a device and a computer storage medium, which relates to the field of natural language processing technologies in artificial intelligence.
Type: Grant
Filed: March 22, 2021
Date of Patent: February 27, 2024
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Shuohuan Wang, Jiaxiang Liu, Xuan Ouyang, Yu Sun, Hua Wu, Haifeng Wang
-
Patent number: 11908460
Abstract: Disclosed herein are techniques for using a generative adversarial network (GAN) to train a semantic parser of a dialog system. A method described herein involves accessing seed data that includes seed tuples. Each seed tuple includes a respective seed utterance and a respective seed logical form corresponding to the respective seed utterance. The method further includes training a semantic parser and a discriminator in a GAN. The semantic parser learns to map utterances to logical forms based on output from the discriminator, and the discriminator learns to recognize authentic logical forms based on output from the semantic parser. The semantic parser may then be integrated into a dialog system.
Type: Grant
Filed: August 13, 2020
Date of Patent: February 20, 2024
Assignee: Oracle International Corporation
Inventors: Thanh Long Duong, Mark Edward Johnson
-
Patent number: 11900950
Abstract: The present disclosure provides a bit allocation method and apparatus for an audio signal. The bit allocation method for an audio signal includes: obtaining T audio signals in a current frame, where T is a positive integer; determining a first audio signal set based on the T audio signals, where the first audio signal set includes M audio signals, M is a positive integer, T≥M; determining M priorities of the M audio signals in the first audio signal set; and performing bit allocation on the M audio signals based on the M priorities of the M audio signals.
Type: Grant
Filed: October 28, 2022
Date of Patent: February 13, 2024
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Yuan Gao, Jiance Ding, Bin Wang
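One way to picture priority-based bit allocation is the sketch below. The abstract above does not say how priorities map to bits, so the rule used here (shares proportional to positive integer priority weights, with leftover bits handed to the highest-priority signals first) is purely an assumption for illustration, and the name `allocate_bits` is hypothetical.

```python
from typing import List


def allocate_bits(priorities: List[int], total_bits: int) -> List[int]:
    """Divide total_bits among signals in proportion to their priorities.

    Assumption: priorities are positive integers and a larger value means
    a more important signal. This proportional rule is illustrative only.
    """
    if not priorities or any(p <= 0 for p in priorities):
        raise ValueError("priorities must be a non-empty list of positive integers")
    if total_bits < 0:
        raise ValueError("total_bits must be non-negative")
    weight_sum = sum(priorities)
    # Floor of each proportional share; the floors lose fewer bits than there are signals.
    shares = [total_bits * p // weight_sum for p in priorities]
    leftover = total_bits - sum(shares)
    # Hand the remaining bits, one each, to the highest-priority signals.
    by_priority = sorted(range(len(priorities)), key=lambda i: priorities[i], reverse=True)
    for index in by_priority[:leftover]:
        shares[index] += 1
    return shares


if __name__ == "__main__":
    print(allocate_bits([3, 2, 1], 100))
```

The returned shares always sum to `total_bits`, so no bits of the frame budget are lost to rounding.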
-
Patent number: 11900260
Abstract: Methods, devices and processor-readable media for an integrated teacher-student machine learning system. One or more teacher-student modules are trained as part of the teacher neural network training. Each student sub-network uses a portion of the teacher neural network to generate an intermediate feature map, then provides the intermediate feature map to a student sub-network to generate inferences. The student sub-network may use a feature enhancement block to map the intermediate feature map to a subsequent feature map. A compression block may be used to compress intermediate feature map data for transmission in some embodiments.
Type: Grant
Filed: March 5, 2020
Date of Patent: February 13, 2024
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Deepak Sridhar, Juwei Lu
-
Patent number: 11895062
Abstract: A system for responding to external requests received at an internal bot network is provided. The system may include an internal bot network including a plurality of internal bots. The plurality of internal bots may interact with a plurality of external bots. The system may include a translation layer/barrier. Each bot included in the plurality of internal bots and the plurality of external bots may be resident on one or more hardware processors. Each external bot may communicate using its own unique set of external specifications. Each internal bot may communicate using a universal set of internal specifications and is prevented from communicating using a set of external specifications. The translation layer/barrier may intercept requests and responses between the plurality of internal bots and the plurality of external bots. The translation layer/barrier reformats intercepted requests and responses to correspond to the set of specifications specific to the receiving bot.
Type: Grant
Filed: June 30, 2021
Date of Patent: February 6, 2024
Assignee: Bank of America Corporation
Inventors: Ramakrishna R. Yannam, Priyank R. Shah, Castigliana Cimpian, Sushil Golani
-
Patent number: 11887590
Abstract: Methods and devices for enabling and disabling applications using voice are described herein. In some embodiments, an individual may speak an utterance to their electronic device, which may send audio data representing the utterance to a backend system. The backend system may generate text data representing the utterance, and may determine that an intent of the utterance was for an application to be enabled or disabled for their user account on the backend system. If, for instance, the intent was to enable the application, the backend system may receive one or more rules for performing functionalities of the application, as well as one or more sample templates of sample utterances and sample responses that future utterances may use when requesting the application. Furthermore, one or more invocation phrases that may be used within the future utterances to invoke the application may be received, along with slot values for the sample templates.
Type: Grant
Filed: September 24, 2020
Date of Patent: January 30, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Shaman D'Souza, Ian Suttle, Srikanth Nori, Rajiv Reddy, Amol Kanitkar, Tina Orooji
-
Patent number: 11881227
Abstract: A method, executed by a processor for compressing an audio signal in multiple layers, may comprise: (a) restoring, in a highest layer, an input audio signal as a first signal; (b) restoring, in at least one intermediate layer, a signal obtained by subtracting an upsampled signal, which is obtained by upsampling the audio signal restored in the highest layer or an immediately previous intermediate layer, from the input audio signal as a second signal; and (c) restoring, in a lowest layer, a signal obtained by subtracting an upsampled signal, which is obtained by upsampling the audio signal restored in an intermediate layer immediately before the lowest layer, from the input audio signal as a third signal, wherein the first signal, the second signal, and the third signal are combined to output a final restoration audio signal.
Type: Grant
Filed: January 13, 2023
Date of Patent: January 23, 2024
Assignees: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, INDUSTRY-ACADEMIC COOPERATION FOUNDATION, YONSEI UNIVERSITY
Inventors: In Seon Jang, Seung Kwon Beack, Jong Mo Sung, Tae Jin Lee, Woo Taek Lim, Byeong Ho Cho, Hong Goo Kang, Ji Hyun Lee, Chan Woo Lee, Hyung Seob Lim
-
Patent number: 11875791
Abstract: Systems and methods for processing audio signals are disclosed. In one implementation, a system may include at least one microphone configured to capture sounds from an environment of a user; and at least one processor. The processor may be programmed to receive at least one audio signal representative of at least part of the sounds captured by the microphone; identify at least one word in the at least one audio signal; and in response to identifying the at least one word, cause feedback to be provided to the user.
Type: Grant
Filed: May 20, 2021
Date of Patent: January 16, 2024
Assignee: ORCAM TECHNOLOGIES LTD.
Inventors: Yonatan Wexler, Amnon Shashua, Nir Sancho, Roi Nathan, Tal Rosenwein, Oren Tadmor
-
Patent number: 11868343
Abstract: A method utilizes data discovery to support visual analysis of datasets. A user selects a data source, and the method presents a natural language interface for analysis of the data source. A user specifies an incomplete natural language command directed to the data source, and the method correlates words in the incomplete natural language command with a data field in the data source. The method determines the data type of the data field and a range of data values for the data field. According to the data type and the range of data values, the method presents one or more autocompletion options for the incomplete natural language command. Each option includes respective text and a respective corresponding visual graphic. The user selects one of the autocompletion options, and the method forms a complete natural language command. The method then displays a data visualization according to the complete natural language command.
Type: Grant
Filed: September 2, 2020
Date of Patent: January 9, 2024
Assignee: Tableau Software, LLC
Inventors: Vidya Raghavan Setlur, Md Enamul Hoque Prince, Dae Hyun Kim, Angel Xuan Chang
-
Patent number: 11861318
Abstract: An electronic device according to various embodiments of the present invention may include a memory, and at least one processor, wherein the at least one processor may be configured to obtain a sentence based on a user input, based on obtaining the sentence, determine at least one persona, convert the sentence to at least one sentence having a style corresponding to the at least one persona, using a neural network, and provide the converted at least one sentence.
Type: Grant
Filed: December 9, 2019
Date of Patent: January 2, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventor: Eunchung Noh