Patents Examined by Huyen X. Vo
-
Patent number: 12073837
Abstract: Speech detection can be achieved by identifying a speech segment within an audio segment using image classification. An audio segment of radio communications is obtained. An audio sub-segment within the audio segment is extracted. A sampled histogram is generated of a plurality of sampled values across a sampled time window of the audio sub-segment. A two-dimensional image is generated that represents a two-dimensional mapping of the sampled histogram along a first dimension and a predefined histogram along a second dimension that is orthogonal to the first dimension. The two-dimensional image is provided to an image classifier previously trained using the predefined histogram. An output is received from the image classifier based on the two-dimensional image. The output indicates whether the audio sub-segment contains speech.
Type: Grant
Filed: June 7, 2022
Date of Patent: August 27, 2024
Assignees: The Boeing Company, University of Washington
Inventors: Stephen Gregory Dame, Les Eugene Atlas
-
Patent number: 12067977
Abstract: The present disclosure discloses a speech recognition method and apparatus, and relates to the field of speech and deep learning technologies. A specific implementation scheme involves: acquiring candidate recognition results with first N recognition scores outputted by a speech recognition model for to-be-recognized speech, N being a positive integer greater than 1; scoring the N candidate recognition results based on pronunciation similarities between candidate recognition results and pre-collected popular entities, to obtain similarity scores of the candidate recognition results; and integrating the recognition scores and the similarity scores of the candidate recognition results to determine a recognition result corresponding to the to-be-recognized speech from the N candidate recognition results. The present disclosure can improve recognition accuracy.
Type: Grant
Filed: March 2, 2022
Date of Patent: August 20, 2024
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Liao Zhang, Yinlou Zhao, Zhengxiang Jiang, Xiaoyin Fu, Wei Wei
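The rescoring idea in this abstract (combine each N-best candidate's recognition score with a similarity score against known entities, then pick the best) can be illustrated with a minimal sketch. This is not the patented method: plain string similarity from Python's `difflib` stands in for pronunciation similarity, and the names `similarity_to_entities`, `rescore`, the `weight` parameter, and the sample data are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity_to_entities(candidate, entities):
    """Best string similarity between a candidate and any entity.

    Assumption: character-level similarity is only a stand-in for the
    pronunciation similarity the abstract describes.
    """
    return max(
        (SequenceMatcher(None, candidate, entity).ratio() for entity in entities),
        default=0.0,
    )


def rescore(candidates, entities, weight=0.5):
    """Pick the candidate text with the highest combined score.

    candidates: list of (text, recognition_score) pairs, i.e. the N-best list.
    weight: hypothetical mixing factor between the two scores.
    """
    def combined(item):
        text, recognition_score = item
        similarity = similarity_to_entities(text, entities)
        return (1 - weight) * recognition_score + weight * similarity

    return max(candidates, key=combined)[0]


# Illustrative data only: the recognizer slightly prefers the misspelling.
n_best = [("play the beetles", 0.52), ("play the beatles", 0.48)]
popular_entities = ["the beatles", "taylor swift"]
best = rescore(n_best, popular_entities)
print(best)
```

With these sample values the similarity term outweighs the small gap in recognition scores, so the candidate that matches a known entity is selected.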
-
Patent number: 12051424
Abstract: An audio processing apparatus 100 is an apparatus for generating training data in speaker recognition. The audio processing apparatus 100 includes a data acquisition unit configured to acquire an audio signal that is a source of the training data as sample data, and a data generation unit configured to execute signal processing on the acquired sample data and to generate a new audio signal as the training data whose similarity with the sample data is within a set range.
Type: Grant
Filed: October 25, 2018
Date of Patent: July 30, 2024
Assignee: NEC CORPORATION
Inventors: Hitoshi Yamamoto, Takafumi Koshinaka
-
Patent number: 12050887
Abstract: Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has advantages of a fast computation speed and high efficiency.
Type: Grant
Filed: December 11, 2020
Date of Patent: July 30, 2024
Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD.
Inventors: Tianshi Chen, Shaoli Liu, Zai Wang, Shuai Hu
-
Patent number: 12045567
Abstract: A system determines service controls for organizations. The system receives documents from external systems representing reports storing information describing service controls for external services. The service controls are represented using natural language text. The system encodes the service controls using a natural language model to generate encoded service controls. The system determines similarity scores for pairs of service controls. The system determines one or more representative service controls for a category of external services based on similarity scores of pairs of the service controls. The system stores a mapping from categories of external services to representative service controls determined from the set of service controls corresponding to the category of external services. The system uses the mapping for determining representative service controls for external services corresponding to a set of services used by an organization.
Type: Grant
Filed: March 21, 2022
Date of Patent: July 23, 2024
Assignee: INTERSTICE LABS, INC.
Inventor: Aleksandr Olegovich Gonopolskiy
-
Patent number: 12032580
Abstract: An application can instantiate a parsing framework, provide an input stream, attach observers, and initiate parsing, which inverts control to the parsing framework. The parsing framework can have an observer manager, a parser controller, and parsers. The observer manager manages observer design patterns from which the observers are instantiated. The parser controller determines which parser would be appropriate for parsing the input stream and instantiates the appropriate parser(s). The parser controller gets the callbacks from the parsers and communicates outcomes to the observer manager. The observer manager determines which of the observers is to be notified, generates parsing notifications accordingly, and dispatches the parsing notifications directly to the observers. The application can be any application that needs parsing in an electronic information exchange platform.
Type: Grant
Filed: April 6, 2022
Date of Patent: July 9, 2024
Assignee: OPEN TEXT GXS ULC
Inventors: Phil Hanson, Kris Loia
-
Patent number: 12032908
Abstract: A system determines service controls for organizations. The system receives documents from external systems representing reports storing information describing service controls for external services. The service controls are represented using natural language text. The system encodes the service controls using a natural language model to generate encoded service controls. The system determines similarity scores for pairs of service controls. The system determines one or more representative service controls for a category of external services based on similarity scores of pairs of the service controls. The system stores a mapping from categories of external services to representative service controls determined from the set of service controls corresponding to the category of external services. The system uses the mapping for determining representative service controls for external services corresponding to a set of services used by an organization.
Type: Grant
Filed: March 21, 2022
Date of Patent: July 9, 2024
Assignee: INTERSTICE LABS, INC.
Inventor: Aleksandr Olegovich Gonopolskiy
-
Patent number: 12027165
Abstract: A non-transitory computer readable medium stores computer executable instructions which, when executed by at least one processor, cause the at least one processor to acquire a speech signal of speech of a user; perform a signal processing on the speech signal to acquire at least one feature of the speech of the user; and control display of information, related to each of one or more first candidate converters having a feature corresponding to the at least one feature, to present the one or more first candidate converters for selection by the user.
Type: Grant
Filed: July 9, 2021
Date of Patent: July 2, 2024
Assignee: GREE, INC.
Inventor: Akihiko Shirai
-
Patent number: 12009009
Abstract: A system that can capture a user's voice sample and, based on a comparison with other voice samples stored in a database, determine the existence of one or more musculoskeletal conditions within the user's body. The analysis of the voice sample includes determining various voice characteristics of the sample against those that are in the database-stored samples to determine matches with conditions and their associated severities. The results are presented via an application that can present location and severity on a 3D avatar model.
Type: Grant
Filed: March 11, 2023
Date of Patent: June 11, 2024
Assignee: Sonaphi LLC
Inventors: Brandon Muzik, Mark Hinds, Robert D. Fish
-
Patent number: 11990118
Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.
Type: Grant
Filed: June 6, 2023
Date of Patent: May 21, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote
-
Patent number: 11989219
Abstract: Techniques for disambiguating which profile, of multiple profiles, is to be used to respond to a user input are described. A device located in a communal space (e.g., a hotel room or suite of rooms, conference room, hospital room, etc.) may be associated with a device profile and a user profile of a user presently occupying the communal space. When the user inputs a command to the device (either by text or speech), a system associated with the device determines the profiles (e.g., a device profile and a user profile) associated with the device. The system determines one or more policies associated with the device. The one or more policies may correspond to rules for disambiguating which profile to use to execute with respect to the user input. Using the one or more policies, the system determines which profile is to be used, and causes a speechlet component to execute using information specific to the determined profile.
Type: Grant
Filed: May 30, 2023
Date of Patent: May 21, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Rebecca Joy Lopdrup Miller, Dick Clarence Hardt, Joseph Jessup, Yu Bao, Gonzalo Alvarez Barrio, Liron Torres
-
Patent number: 11983489
Abstract: A method and computer system are provided for generating a text summary. An input text is processed by a model that comprises a set of attention heads. The model is trained to generate abstractive summaries of text documents. A subset of the attention heads are identified. For each attention head in the subset, a portion is identified from the input text that is used to generate the abstractive summary. For each sentence of the input text, a fractional size is calculated for an intersection of the portion and the respective sentence relative to the respective sentence. A subset of the sentences is then determined, where the respective fractional size of each sentence in the subset meets a first threshold. An extractive summary of the input text is then generated from the subset of sentences.
Type: Grant
Filed: June 23, 2023
Date of Patent: May 14, 2024
Assignee: Intuit Inc.
Inventors: Natalie Bar Eliyahu, Ido Farhi, Adi Shalev, Oren Dar
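The sentence-selection step in this abstract (keep a sentence when the attended portion covers a large enough fraction of it) can be sketched as follows. This is a simplified illustration, not the patented method: it assumes the portions from the selected attention heads have already been merged into one set of token indices, and the function name `extractive_summary`, the `threshold` default, and the sample data are hypothetical.

```python
def extractive_summary(sentences, attended_tokens, threshold=0.5):
    """Select sentences whose attended fraction meets the threshold.

    sentences: list of token lists, in document order.
    attended_tokens: set of document-level token indices (assumed to be the
        merged portions identified from the selected attention heads).
    threshold: hypothetical minimum fraction of a sentence that must be attended.
    """
    selected = []
    offset = 0  # index of the current sentence's first token in the document
    for tokens in sentences:
        indices = set(range(offset, offset + len(tokens)))
        # Fraction of this sentence covered by the attended portion.
        fraction = len(indices & attended_tokens) / len(tokens) if tokens else 0.0
        if fraction >= threshold:
            selected.append(" ".join(tokens))
        offset += len(tokens)
    return selected


# Illustrative data only.
doc = [
    ["Revenue", "rose", "sharply", "."],
    ["The", "weather", "was", "mild", "."],
    ["Costs", "fell", "."],
]
attended = {0, 1, 2, 5, 9, 10}
summary = extractive_summary(doc, attended)
print(summary)
```

Here the first and third sentences are covered at 3/4 and 2/3 respectively and are kept, while the second is covered at only 1/5 and is dropped.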
-
Patent number: 11978447
Abstract: The present disclosure provides a speech interaction method, apparatus, device and computer storage medium and relates to the field of artificial intelligence. A specific implementation solution is as follows: performing speech recognition and demand analysis for a first speech instruction input by a user; performing demand prediction for the first speech instruction if the demand analysis fails, to obtain at least one demand expression; returning at least one of the demand expression to the user in a form of a question; performing a service response with a demand analysis result corresponding to the demand expression confirmed by the user, if a second speech instruction confirming at least one of the demand expression is received from the user. The present disclosure can efficiently improve the user's interaction efficiency and enhance the user's experience.
Type: Grant
Filed: September 17, 2020
Date of Patent: May 7, 2024
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Haifeng Wang, Jizhou Huang
-
Patent number: 11972764
Abstract: Systems and methods for providing audio data from an initially invoked automated assistant to a subsequently invoked automated assistant. An initially invoked automated assistant may be invoked by a user utterance, followed by audio data that includes a query. The query is provided to a secondary automated assistant for processing. Subsequently, the user can submit a query that is related to the first query. In response, the initially invoked automated assistant provides the query to the secondary automated assistant in lieu of providing the query to other secondary automated assistants based on similarity between the first query and the subsequent query.
Type: Grant
Filed: November 23, 2021
Date of Patent: April 30, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Matthew Sharifi
-
Patent number: 11967308
Abstract: Disclosed is an electronic device including a processor and a memory operatively connected to the processor and storing a language model. The electronic device may enter data into the language model, generate an embedding vector in the input embedding layer, add position information to the embedding vector in the positional encoding layer, branch the embedding vector based on domain information, normalize the branched embedding vectors, enter the normalized embedding vectors into the multi-head attention layer, enter output data of the multi-head attention layer into the first layer, normalize pieces of output data of the first layer, enter the normalized pieces of output data of the first layer into the feed-forward layer, enter output data of the feed-forward layer into the second layer and normalize pieces of output data of the second layer, and enter the normalized pieces of output data of the second layer into the linearization layer and the softmax layer to obtain result data.
Type: Grant
Filed: July 8, 2021
Date of Patent: April 23, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Taewoo Lee, Taegyoon Kang, Hogyeong Kim, Minjoong Lee, Seokyeong Jung, Jiseung Jeong
-
Patent number: 11961521
Abstract: Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for providing voice control using multiple digital assistants. In some embodiments, a voice platform operates to receive a voice input from a user. The voice platform selects a digital assistant from a plurality of digital assistants based on a trigger word. The voice platform then generates an intent from the voice input using the selected digital assistant. The voice platform then transmits the intent to a media device for processing.
Type: Grant
Filed: March 23, 2023
Date of Patent: April 16, 2024
Assignee: Roku, Inc.
Inventors: Anthony John Wood, David Stern, Gregory Mack Garner
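The flow in this abstract (choose an assistant by trigger word, then have that assistant turn the remaining input into an intent) can be sketched in a few lines. This is only an illustration of the dispatch pattern, assuming the voice input has already been transcribed to text; the `route` function, the assistant names, the trigger words, and the intent format are all hypothetical.

```python
def route(utterance, assistants):
    """Return (assistant_name, intent) for the trigger word that starts the utterance.

    assistants: dict mapping a trigger word to (assistant_name, intent_function).
    Returns (None, None) when no trigger word matches.
    """
    text = utterance.lower().strip()
    for trigger, (name, to_intent) in assistants.items():
        if text.startswith(trigger + " "):
            # Strip the trigger word and let the selected assistant build the intent.
            command = text[len(trigger):].strip()
            return name, to_intent(command)
    return None, None


# Hypothetical assistants: each one maps a command string to an intent dict.
assistants = {
    "alpha": ("Assistant A", lambda cmd: {"action": "search", "query": cmd}),
    "bravo": ("Assistant B", lambda cmd: {"action": "play", "title": cmd}),
}

name, intent = route("Bravo the evening news", assistants)
print(name, intent)
```

In a real system the resulting intent would then be sent to the device that acts on it; that transport step is left out here.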
-
Patent number: 11948553
Abstract: Embodiments described herein provide for audio processing operations that evaluate characteristics of audio signals that are independent of the speaker's voice. A neural network architecture trains and applies discriminatory neural networks tasked with modeling and classifying speaker-independent characteristics. The task-specific models generate or extract feature vectors from input audio data based on the trained embedding extraction models. The embeddings from the task-specific models are concatenated to form a deep-phoneprint vector for the input audio signal. The DP vector is a low dimensional representation of each of the speaker-independent characteristics of the audio signal and applied in various downstream operations.
Type: Grant
Filed: March 4, 2021
Date of Patent: April 2, 2024
Assignee: Pindrop Security, Inc.
Inventors: Kedar Phatak, Elie Khoury
-
Patent number: 11935508
Abstract: The present document relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), and to digital effect processors, e.g. so-called exciters, where generation of harmonic distortion adds brightness to the processed signal. In particular, a system configured to generate a high frequency component of a signal from a low frequency component of the signal is described. The system may comprise an analysis filter bank (501) configured to provide a set of analysis subband signals from the low frequency component of the signal; wherein the set of analysis subband signals comprises at least two analysis subband signals; wherein the analysis filter bank (501) has a frequency resolution of Δf.
Type: Grant
Filed: March 31, 2023
Date of Patent: March 19, 2024
Assignee: DOLBY INTERNATIONAL AB
Inventors: Per Ekstrand, Lars Villemoes, Per Hedelin
-
Patent number: 11935525
Abstract: Systems and methods for utilizing microphone array information for acoustic modeling are disclosed. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.
Type: Grant
Filed: June 8, 2020
Date of Patent: March 19, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Shiva Kumar Sundaram, Minhua Wu, Anirudh Raju, Spyridon Matsoukas, Arindam Mandal, Kenichi Kumatani
-
Patent number: 11914626
Abstract: Techniques are disclosed relating to implementing a machine learning approach to cross-language translation and search. In certain embodiments, a method may include receiving a plurality of characters of a first language that are unsegmented and grouping the plurality of characters into multiple groups. The method also includes determining a set of word tokens based on one or more transliterations of the multiple groups and one or more translations of the multiple groups to a second language. Further, the method includes generating one or more word token solution sets by querying an index file using the one or more word tokens. The method also includes determining whether the index file references an entity name corresponding to the plurality of characters of the first language based on comparing the one or more token solution sets with the index file.
Type: Grant
Filed: March 22, 2021
Date of Patent: February 27, 2024
Assignee: PAYPAL, INC.
Inventors: Rushik Upadhyay, Dhamodharan Lakshmipathy, Nandhini Ramesh, Aditya Kaulagi