Patents Examined by Paras D Shah
-
Patent number: 12131750
Abstract: A method for enhancing detection of synthetic voice data is provided that includes converting, by an electronic device, monophonic voice data into stereophonic voice data. The stereophonic voice data includes a first channel signal and a second channel signal. Moreover, the method includes decomposing, by a trained machine learning model, the stereophonic voice data into a mid-signal and a side signal. The method also includes determining artifacts indicative of synthetic generation in the structured and secondary artifacts, calculating, based on the determined artifacts, a probability score reflecting the likelihood the monophonic voice data was synthetically generated, and comparing the probability score against a threshold value. When the probability score satisfies the threshold value, there is a high likelihood that the monophonic voice data includes synthetic artifacts, and an alert is generated indicating the monophonic voice data is potentially fraudulent.
Type: Grant
Filed: May 10, 2024
Date of Patent: October 29, 2024
Assignee: Daon Technology
Inventors: Raphael A. Rodriguez, Olena Mizynchuk, Davyd Mizynchuk
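A minimal sketch of the mid/side decomposition and thresholding flow named in this abstract. The trained machine learning model and the actual artifact analysis are not reproduced; score_artifacts() is a hypothetical placeholder heuristic, not the patented detector.

```python
import numpy as np

def mid_side(stereo: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """stereo: array of shape (num_samples, 2) -> (mid, side) signals."""
    left, right = stereo[:, 0], stereo[:, 1]
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    return mid, side

def score_artifacts(mid: np.ndarray, side: np.ndarray) -> float:
    # Placeholder heuristic: relative energy of the side signal.
    return float(np.sum(side ** 2) / (np.sum(mid ** 2) + 1e-12))

def is_potentially_synthetic(stereo: np.ndarray, threshold: float = 0.1) -> bool:
    # An alert would be generated when the score satisfies the threshold.
    mid, side = mid_side(stereo)
    return score_artifacts(mid, side) >= threshold
```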
-
Patent number: 12131730
Abstract: A keyword is extracted robustly despite a voice recognition result including an error. A model storage unit 10 stores a keyword extraction model that accepts word vector representations of a plurality of words as an input and extracts and outputs a word vector representation of a word to be extracted as a keyword. A speech detection unit 11 detects a speech part from a voice signal. A voice recognition unit 12 executes voice recognition on the speech part of the voice signal and outputs a confusion network which is a voice recognition result. A word vector representation generating unit 13 generates a word vector representation including reliability of voice recognition with regard to each candidate word for each confusion set. A keyword extraction unit 14 inputs the word vector representation of the candidate word to the keyword extraction model in descending order of the reliability and obtains the word vector representation of the keyword.
Type: Grant
Filed: November 19, 2019
Date of Patent: October 29, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Takashi Nakamura, Tomohiro Tanaka
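A sketch of the data flow only: each confusion set holds (word, reliability) candidates that are fed to a keyword-extraction model in descending order of reliability. The embed() and keyword_model() callables are hypothetical placeholders for the word-vector generator and trained model.

```python
from typing import Callable

ConfusionSet = list[tuple[str, float]]  # (candidate word, recognition reliability)

def extract_keywords(confusion_network: list[ConfusionSet],
                     embed: Callable[[str, float], list[float]],
                     keyword_model: Callable[[list[float]], bool]) -> list[str]:
    keywords = []
    for confusion_set in confusion_network:
        # Feed candidates in descending order of reliability, as in the abstract.
        for word, reliability in sorted(confusion_set, key=lambda c: -c[1]):
            vector = embed(word, reliability)  # word vector including reliability
            if keyword_model(vector):          # model flags this candidate as a keyword
                keywords.append(word)
                break
    return keywords
```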
-
Patent number: 12125473
Abstract: Embodiments of this disclosure disclose a speech recognition method, apparatus, and device, and a storage medium. The method in the embodiments of this disclosure includes: adjusting a probability of a relationship between at least one pair of elements in a language recognition model according to a probability of the relationship between the at least one pair of elements in a textual segment; inputting a to-be-recognized speech into a speech recognition model including the language recognition model; and determining, according to the adjusted probability of the relationship between the at least one pair of elements in the language recognition model, a sequence of elements corresponding to the to-be-recognized speech as a speech recognition result.
Type: Grant
Filed: March 4, 2021
Date of Patent: October 22, 2024
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventor: Tao Li
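A hedged illustration of the adjustment idea: one way to realize "adjusting a probability of a relationship between elements" is to interpolate a base bigram probability with a bigram probability estimated from the textual segment. The interpolation weight and bigram framing are assumptions for illustration, not the patented method.

```python
from collections import Counter

def segment_bigram_probs(segment: list[str]) -> dict[tuple[str, str], float]:
    # Estimate element-pair probabilities from the textual segment.
    bigrams = Counter(zip(segment, segment[1:]))
    unigrams = Counter(segment[:-1])
    return {(a, b): n / unigrams[a] for (a, b), n in bigrams.items()}

def adjust_lm(base_probs: dict[tuple[str, str], float],
              segment: list[str], weight: float = 0.3) -> dict[tuple[str, str], float]:
    seg_probs = segment_bigram_probs(segment)
    adjusted = dict(base_probs)
    for pair, p_seg in seg_probs.items():
        p_base = base_probs.get(pair, 0.0)
        adjusted[pair] = (1.0 - weight) * p_base + weight * p_seg
    return adjusted
```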
-
Patent number: 12118309
Abstract: A method converts from compressed language to natural language. The method includes receiving an element string. The element string is in a compressed language format and is extracted from a document in a structured document language. The method includes tokenizing the element string to form multiple element tokens, generating a token set from the element tokens, and generating a name string from multiple token sets. The name string is in a natural language format.
Type: Grant
Filed: September 28, 2021
Date of Patent: October 15, 2024
Assignee: Intuit Inc.
Inventors: Esmé Manandise, Glenn Carter Scott
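A sketch of the tokenize-then-expand flow described in the abstract. The abbreviation dictionary and the example element string are invented for illustration; the patented token-set and name-string generation is not reproduced.

```python
import re

ABBREVIATIONS = {"amt": "amount", "txbl": "taxable", "int": "interest"}  # hypothetical

def tokenize_element(element: str) -> list[str]:
    # Split camelCase element names into lowercase-led tokens and digit runs.
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|\d+", element)]

def to_natural_language(element: str) -> str:
    tokens = tokenize_element(element)
    words = [ABBREVIATIONS.get(t, t) for t in tokens]
    return " ".join(words)

print(to_natural_language("TxblIntAmt"))  # -> "taxable interest amount"
```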
-
Patent number: 12112741
Abstract: A method, computer program product, and computing system for defining a model representative of a plurality of acoustic variations to a speech signal, thus defining a plurality of time-varying spectral modifications. The plurality of time-varying spectral modifications may be applied to a reference signal using a filtering operation, thus generating a time-varying spectrally-augmented signal.
Type: Grant
Filed: February 18, 2021
Date of Patent: October 8, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Patrick A. Naylor, Dushyant Sharma, Uwe Helmut Jost, William F Ganong, III
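A minimal sketch of applying time-varying spectral modifications to a reference signal by filtering it frame by frame. The per-frame FIR filters stand in for the modeled acoustic variations; the patent's model of those variations is not reproduced here.

```python
import numpy as np

def augment(reference: np.ndarray, frame_filters: list[np.ndarray],
            frame_len: int) -> np.ndarray:
    """Apply one short FIR filter per frame of the reference signal."""
    out = np.zeros_like(reference, dtype=float)
    for i, fir in enumerate(frame_filters):
        start, stop = i * frame_len, min((i + 1) * frame_len, len(reference))
        if start >= stop:
            break
        out[start:stop] = np.convolve(reference[start:stop], fir, mode="same")
    return out

# Example: mildly varying 3-tap filters per 20 ms frame at 16 kHz.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
filters = [np.array([0.05, 1.0, 0.05]) * (1.0 + 0.1 * rng.standard_normal())
           for _ in range(50)]
augmented = augment(signal, filters, frame_len=320)
```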
-
Patent number: 12062360
Abstract: The present invention addresses the issue of effectively reducing the input load related to a voice trigger. There is provided an information processing device comprising a registration control unit that dynamically controls registration of startup phrases used as start triggers of a voice interaction session, in which the registration control unit temporarily additionally registers at least one of the startup phrases based on input voice. There is also provided an information processing method comprising dynamically controlling, by a processor, registration of startup phrases used as start triggers of a voice interaction session, in which the controlling further includes temporarily additionally registering at least one of the startup phrases based on input voice.
Type: Grant
Filed: March 12, 2019
Date of Patent: August 13, 2024
Assignee: SONY CORPORATION
Inventors: Hiro Iwase, Yuhei Taki, Kunihito Sawai
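A sketch of dynamically registering temporary startup phrases. The registry class, expiry policy, and example phrases are illustrative assumptions, not the patented control logic.

```python
import time

class StartupPhraseRegistry:
    def __init__(self, permanent: set[str]):
        self.permanent = set(permanent)
        self.temporary: dict[str, float] = {}  # phrase -> expiry timestamp

    def register_temporary(self, phrase: str, ttl_seconds: float = 60.0) -> None:
        # Temporarily additionally register a phrase derived from input voice.
        self.temporary[phrase] = time.time() + ttl_seconds

    def is_trigger(self, phrase: str) -> bool:
        # Drop expired temporary registrations before checking.
        now = time.time()
        self.temporary = {p: t for p, t in self.temporary.items() if t > now}
        return phrase in self.permanent or phrase in self.temporary

registry = StartupPhraseRegistry({"hello agent"})
registry.register_temporary("tell me more")   # added based on recent input voice
assert registry.is_trigger("tell me more")
```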
-
Patent number: 12062365
Abstract: An apparatus for training a dialogue summary model according to an embodiment includes a parameter transferer configured to transfer one or more learning parameter values of a pre-trained natural language processing model to a sequence-to-sequence-based dialogue summary model, and a model trainer configured to train the dialogue summary model by using the transferred learning parameter values as initial values for learning parameters of each of an encoder and a decoder in the dialogue summary model.
Type: Grant
Filed: October 29, 2021
Date of Patent: August 13, 2024
Assignee: SAMSUNG SDS CO., LTD.
Inventors: Hyun Jae Lee, Hyun Jin Choi, Jae Woong Yun, Ju Dong Kim, Bong Kyu Hwang, Seong Ho Joe, Young June Gwon
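A hedged sketch of the parameter-transfer idea using plain dictionaries of arrays as stand-ins for model parameters: matching entries from the pre-trained model are copied into both the encoder and decoder as initial values. Matching parameters by name and shape is an illustrative assumption.

```python
import numpy as np

def transfer_parameters(pretrained: dict[str, np.ndarray],
                        encoder: dict[str, np.ndarray],
                        decoder: dict[str, np.ndarray]) -> None:
    """Copy matching pre-trained values into encoder and decoder as initial values."""
    for component in (encoder, decoder):
        for name in component:
            if name in pretrained and pretrained[name].shape == component[name].shape:
                component[name] = pretrained[name].copy()

# Training of the dialogue summary model would then proceed from these
# initialized encoder/decoder parameters.
```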
-
Patent number: 12020693
Abstract: An electronic device includes an audio sensor, a memory, and at least one processor coupled to the audio sensor and the memory. The at least one processor is configured to receive, via the audio sensor, an audio input. The at least one processor is further configured to perform, using an automatic speech recognition (ASR) model and an entity prediction model, out-of-vocabulary prediction of an entity. The at least one processor is further configured to receive an ASR hypothesis including the predicted entity. The at least one processor is further configured to output text including the predicted entity.
Type: Grant
Filed: January 27, 2021
Date of Patent: June 25, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Suhel Jaber, Anil Yadav, Melvin Lobo, Sukrat Gupta
-
Patent number: 11996118
Abstract: An important task in several wellness applications is detection of emotional valence from speech. Two types of features of speech signals are used to detect valence: acoustic features and text features. Acoustic features are derived from short frames of speech, while text features are derived from the text transcription. The present disclosure provides systems and methods that determine the effect of text on acoustic features. Acoustic features of speech segments carrying emotion words are treated differently from other segments that do not carry such words. Only specific speech segments of the input speech signal, selected based on a dictionary specific to a language, are considered to assess emotional valence. A model (or trained classifier) trained for a specific language, either by including the acoustic features of the emotion-related words or by omitting them, is used by the system for determining emotional valence in an input speech signal.
Type: Grant
Filed: October 19, 2021
Date of Patent: May 28, 2024
Assignee: TATA CONSULTANCY SERVICES LIMITED
Inventors: Ramesh Kumar Ramakrishnan, Venkata Subramanian Viraraghavan, Rahul Dasharath Gavas, Sachin Patel, Gauri Deshpande
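A minimal sketch of the segment-selection step only: keep the speech segments whose transcribed text contains words from a language-specific emotion dictionary, then pass their acoustic features to a trained valence classifier. The dictionary, segment structure, and classifier are placeholders.

```python
EMOTION_WORDS_EN = {"happy", "sad", "angry", "wonderful", "terrible"}  # hypothetical

def select_emotion_segments(segments: list[dict]) -> list[dict]:
    """Each segment: {'text': str, 'features': list[float]}."""
    selected = []
    for seg in segments:
        words = set(seg["text"].lower().split())
        if words & EMOTION_WORDS_EN:   # segment carries an emotion word
            selected.append(seg)
    return selected

def predict_valence(segments: list[dict], classifier) -> list[float]:
    # Only the selected segments' acoustic features are scored for valence.
    return [classifier(seg["features"]) for seg in select_emotion_segments(segments)]
```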
-
Patent number: 11977855
Abstract: The Zero User Interface (UI)-based automatic speech translation system and method can solve problems such as the procedural inconvenience of inputting speech signals and the malfunction of speech recognition due to crosstalk when users who speak different languages have a face-to-face conversation. The system includes an automatic speech translation server, speaker terminals and a counterpart terminal. The automatic speech translation server selects a speech signal of a speaker among multiple speech signals received from speaker terminals connected to an automatic speech translation service and transmits a result of translating the speech signal of the speaker into a target language to a counterpart terminal.
Type: Grant
Filed: November 9, 2021
Date of Patent: May 7, 2024
Assignee: Electronics and Telecommunications Research Institute
Inventors: Sang Hun Kim, Seung Yun, Min Kyu Lee, Joon Gyu Maeng, Dong Hyun Kim
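A sketch of one plausible server-side selection rule: among the speech signals received from connected speaker terminals, pick the one with the highest short-term energy as the active speaker and hand it to translation. Energy-based selection is an assumption for illustration, not necessarily the patented criterion.

```python
import numpy as np

def select_speaker(signals: dict[str, np.ndarray]) -> str:
    # Choose the terminal whose signal has the highest mean energy.
    return max(signals, key=lambda terminal: float(np.mean(signals[terminal] ** 2)))

def translate_for_counterpart(signals: dict[str, np.ndarray], translate) -> str:
    speaker_id = select_speaker(signals)
    return translate(signals[speaker_id])  # result sent to the counterpart terminal
```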
-
Patent number: 11967326
Abstract: Disclosed are a Digital Enhanced Cordless Telecommunications (DECT) base station, a handset and a communication system. The DECT base station is installed with at least two different voice codecs and is configured to sequentially judge whether the voice codecs of the DECT base station match voice codecs of a handset according to a first codec information data packet sent by the handset, and when a matching is successful, send response information to the handset to perform voice communication with the handset through the matched voice codec. In this way, the DECT base station can guarantee normal voice communication between the DECT base station and the handset even when multiple voice codecs are installed in the DECT base station.
Type: Grant
Filed: September 30, 2020
Date of Patent: April 23, 2024
Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.
Inventors: Wanjian Feng, Zhipeng Lin
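An illustrative sketch of the codec-matching handshake described above: the base station walks its installed codecs in order, checks them against the codec list carried in the handset's first codec information packet, and responds with the first match. The message format and codec names are invented for illustration; this is not the DECT specification.

```python
BASE_STATION_CODECS = ["G.722", "G.726", "OPUS"]  # at least two installed codecs

def match_codec(handset_codec_packet: dict) -> dict:
    handset_codecs = set(handset_codec_packet["codecs"])
    for codec in BASE_STATION_CODECS:              # sequential judgement
        if codec in handset_codecs:
            return {"status": "ok", "codec": codec}   # response information
    return {"status": "no_match"}

print(match_codec({"codecs": ["G.726", "PCMA"]}))  # -> {'status': 'ok', 'codec': 'G.726'}
```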
-
Patent number: 11967332
Abstract: A computer-implemented method for correcting muffled speech caused by facial coverings is disclosed. The computer-implemented method includes monitoring a user's speech for speech distortion. The computer-implemented method further includes determining that the user's speech is distorted. The computer-implemented method further includes determining that a cause of the user's speech distortion is based, at least in part, on a presence of a particular type of facial covering. The computer-implemented method further includes automatically correcting the speech distortion of the user based, at least in part, on the particular type of facial covering causing the speech distortion.
Type: Grant
Filed: September 17, 2021
Date of Patent: April 23, 2024
Assignee: International Business Machines Corporation
Inventors: Girmaw Abebe Tadesse, Michael S. Gordon, Komminist Weldemariam
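A sketch of the correction step only: once the facial-covering type has been determined, apply a covering-specific gain to the attenuated high-frequency band. The covering classes, cutoff frequency, and gains are illustrative assumptions, not values from the patent.

```python
import numpy as np

COVERING_GAIN_DB = {"cloth": 4.0, "surgical": 3.0, "n95": 6.0}  # hypothetical

def correct_muffled_speech(signal: np.ndarray, sample_rate: int,
                           covering: str, cutoff_hz: float = 2000.0) -> np.ndarray:
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    gain = 10 ** (COVERING_GAIN_DB.get(covering, 0.0) / 20.0)
    spectrum[freqs >= cutoff_hz] *= gain            # boost the muffled band
    return np.fft.irfft(spectrum, n=len(signal))
```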
-
Patent number: 11961531
Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
Type: Grant
Filed: May 5, 2022
Date of Patent: April 16, 2024
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
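A hedged sketch of the controller logic: choose analysis and synthesis channel counts so that the first processor's output, at a configuration-dependent sampling rate, is resampled to the fixed rate the second audio processor expects. The proportional channel-count rule is a common QMF-style convention used only as an illustration.

```python
def filterbank_channels(config_rate_hz: int, fixed_rate_hz: int,
                        base_channels: int = 64) -> tuple[int, int]:
    # Analysis channels scale with the configured rate; synthesis channels
    # stay proportional to the fixed processing rate of the second processor.
    analysis = max(1, round(base_channels * config_rate_hz / fixed_rate_hz))
    synthesis = base_channels
    return analysis, synthesis

print(filterbank_channels(24000, 48000))  # -> (32, 64)
```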
-
Patent number: 11955026
Abstract: A method, computer program product, and computer system for public speaking guidance is provided. A processor retrieves speaker data regarding a speech made by a user. A processor separates the speaker data into one or more speaker modalities. A processor extracts one or more speaker features from the speaker data for the one or more speaker modalities. A processor generates a performance classification based on the one or more speaker features. A processor sends to the user guidance regarding the speech based on the performance classification.
Type: Grant
Filed: September 26, 2019
Date of Patent: April 9, 2024
Assignee: International Business Machines Corporation
Inventors: Cheng-Fang Lin, Ching-Chun Liu, Ting-Chieh Yu, Yu-Siang Chen, Ryan Young
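A sketch of the pipeline named in the abstract: split speaker data into modalities, extract features per modality, classify performance, and return guidance. The feature extractors, classifier labels, and guidance table are illustrative placeholders.

```python
def guidance_for_speech(speaker_data: dict, extractors: dict, classifier) -> str:
    features = {}
    for modality, raw in speaker_data.items():      # e.g. "audio", "text", "video"
        features[modality] = extractors[modality](raw)
    performance = classifier(features)              # e.g. "pace_too_fast"
    guidance = {"pace_too_fast": "Slow down and pause between key points.",
                "low_volume": "Project your voice toward the back of the room."}
    return guidance.get(performance, "Good job; keep practicing.")
```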
-
Patent number: 11914965
Abstract: Disclosed systems relate to generating questions from text. In an example, a method includes forming a first semantic tree from a first reference text and a second semantic tree from a second reference text. The method includes identifying a set of semantic nodes that are in the first semantic tree but not in the second semantic tree. The method includes forming a first syntactic tree for the first reference text and a second syntactic tree for the second reference text. The method includes identifying a set of syntactic nodes that are in the first syntactic tree but not in the second syntactic tree. The method includes mapping the set of semantic nodes to the set of syntactic nodes by identifying a correspondence between a semantic node and a syntactic node, forming a question fragment from a normalized word, and providing the question fragment to a user device.
Type: Grant
Filed: July 30, 2021
Date of Patent: February 27, 2024
Assignee: Oracle International Corporation
Inventor: Boris Galitsky
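A minimal sketch of the node-difference step: represent each tree as a set of labeled nodes, take the nodes present in the first tree but not the second, and turn a corresponding node into a question fragment. Tree construction, node mapping, and normalization are simplified placeholders here.

```python
def node_difference(first_tree: set[str], second_tree: set[str]) -> set[str]:
    return first_tree - second_tree

def question_fragment(semantic_nodes: set[str], syntactic_nodes: set[str]) -> str | None:
    # Map semantic nodes to syntactic nodes by simple label correspondence.
    for label in semantic_nodes:
        if label in syntactic_nodes:
            return f"What about {label.lower()}?"  # fragment from the normalized word
    return None

sem_diff = node_difference({"Acquisition", "Price"}, {"Acquisition"})
syn_diff = node_difference({"Price", "NP"}, {"NP"})
print(question_fragment(sem_diff, syn_diff))  # -> "What about price?"
```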
-
Patent number: 11907820
Abstract: Techniques are provided for improving computers as tools for assisting in negotiations. Specifically, techniques are provided for using a trained machine learning system to predict the likelihood that a party to a negotiation intends to comply with terms that are under consideration. In some negotiations, each party of a negotiation may use the techniques described herein to determine terms to offer the other party. In such situations, both parties may be both terms-receiving parties and terms-offering parties. By using a trained machine learning system to predict the intent of a party, the possibility of human bias is significantly reduced, allowing proposed terms to be based more on objective facts and predictive indicators rather than the prejudices of the agents that have been delegated the responsibility of proposing terms.
Type: Grant
Filed: November 28, 2018
Date of Patent: February 20, 2024
Assignee: LendingClub Corporation
Inventor: Ramaswamy Alagianambi
-
Patent number: 11862183
Abstract: An audio signal encoding and decoding method using a neural network model, a method of training the neural network model, and an encoder and decoder performing the methods are disclosed. The encoding method includes computing the first feature information of an input signal using a recurrent encoding model, computing an output signal from the first feature information using a recurrent decoding model, calculating a residual signal by subtracting the output signal from the input signal, computing the second feature information of the residual signal using a nonrecurrent encoding model, and converting the first feature information and the second feature information to a bitstream.
Type: Grant
Filed: July 6, 2021
Date of Patent: January 2, 2024
Assignee: Electronics and Telecommunications Research Institute
Inventors: Jongmo Sung, Seung Kwon Beack, Mi Suk Lee, Tae Jin Lee, Woo-taek Lim, Inseon Jang
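A structural sketch of the two-stage coding flow in the abstract: a recurrent model codes the input, the decoded output is subtracted to form a residual, a non-recurrent model codes the residual, and both feature sets go into the bitstream. The four model functions are hypothetical stand-ins, not the trained networks from the patent.

```python
import numpy as np

def encode(x: np.ndarray, recurrent_enc, recurrent_dec, nonrecurrent_enc) -> dict:
    first_features = recurrent_enc(x)               # first feature information
    reconstruction = recurrent_dec(first_features)  # output of the recurrent decoder
    residual = x - reconstruction                   # residual signal
    second_features = nonrecurrent_enc(residual)    # second feature information
    return {"first": first_features, "second": second_features}  # -> bitstream

# Toy stand-ins: coarse quantization for the "recurrent" stage,
# finer quantization of the residual for the "non-recurrent" stage.
coarse = lambda x: np.round(x, 1)
bitstream = encode(np.array([0.12, -0.47, 0.33]),
                   recurrent_enc=coarse, recurrent_dec=lambda f: f,
                   nonrecurrent_enc=lambda r: np.round(r, 3))
```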
-
Patent number: 11853695
Abstract: Data processing apparatus comprises a data memory; a selection controller comprising a computer processor; and a digital interface between a control process implemented by the selection controller and a text handling process implemented by the computer processor or another processor; in which: the selection controller is configured to provide a text document from the data memory to the text handling process to identify one or more characteristics of words in the text document; the selection controller is configured to provide user selection of one or more of the words in the text document to be substituted and of one or more target characteristics; and the selection controller is configured to request from the text handling process a set of one or more substitute words for the selected words such that the substitute words comply with the selected one or more of the target characteristics.
Type: Grant
Filed: January 12, 2021
Date of Patent: December 26, 2023
Assignee: SONY CORPORATION
Inventor: Michael Anslow
-
Patent number: 11854571
Abstract: Apparatuses and methods of transmitting and receiving a speech signal. The method of transmitting a speech signal includes extracting low frequency feature information from an input speech signal by using a first feature extracting network; and transmitting a speech signal corresponding to the low frequency feature information to a receiving end. The method of receiving a speech signal includes receiving a first speech signal transmitted by a transmitting end; extracting low frequency feature information from the first speech signal and recovering high frequency feature information based on the low frequency feature information, by using a second feature extracting network; and outputting a second speech signal including the low frequency feature information and the high frequency feature information.
Type: Grant
Filed: November 27, 2020
Date of Patent: December 26, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Liang Wen, Lizhong Wang, Anxi Yi, Chao Min, Liangxi Yao
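A hedged sketch of the transmit/receive split described above: the sender keeps only the low-frequency content of the speech signal, and the receiver fills in high-frequency content from it. The spectral-folding "recovery" here is purely a placeholder for the second feature-extracting network.

```python
import numpy as np

def transmit_low_band(signal: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    spectrum = np.fft.rfft(signal)
    cutoff = int(len(spectrum) * keep_ratio)
    spectrum[cutoff:] = 0.0                      # drop high-frequency content
    return np.fft.irfft(spectrum, n=len(signal))

def receive_and_extend(low_band: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    spectrum = np.fft.rfft(low_band)
    cutoff = int(len(spectrum) * keep_ratio)
    low = spectrum[:cutoff]
    # Placeholder "recovery": fold a damped mirror of the low band into the
    # empty high band instead of using a learned network.
    spectrum[cutoff:] = np.resize(0.25 * np.conj(low[::-1]), len(spectrum) - cutoff)
    return np.fft.irfft(spectrum, n=len(low_band))
```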
-
Patent number: 11847417
Abstract: In some examples, data-driven social media analytics application synthesis may include generating, for each social media analytics application of a plurality of social media analytics applications, a corpus, performing term normalization, and generating a normalized corpus. An actor, an action and an object may be generated for each social media analytics application, which may be mapped into an embedding space. A semantic cohesion network may be generated for each social media analytics application, and a pair-wise semantic cohesion may be determined to identify semantically cohesive groups. A new social media analytics application may be synthesized based on the identified semantically cohesive groups.
Type: Grant
Filed: March 12, 2021
Date of Patent: December 19, 2023
Assignee: ACCENTURE GLOBAL SOLUTIONS LIMITED
Inventors: Janardan Misra, Vikrant Kaulgud, Sanjay Podder
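A hedged sketch of the pair-wise cohesion step: given an embedding for each application's (actor, action, object) description, compute cosine similarities and group applications whose pair-wise cohesion clears a threshold. The embeddings, threshold, and greedy grouping are illustrative assumptions.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def cohesive_groups(embeddings: dict[str, np.ndarray],
                    threshold: float = 0.8) -> list[set[str]]:
    groups: list[set[str]] = []
    for app, vec in embeddings.items():
        for group in groups:
            # Join a group only if pair-wise cohesion with every member clears the threshold.
            if all(cosine(vec, embeddings[other]) >= threshold for other in group):
                group.add(app)
                break
        else:
            groups.append({app})
    return groups
```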