Patents Examined by Huyen X. Vo
  • Patent number: 12073837
    Abstract: Speech detection can be achieved by identifying a speech segment within an audio segment using image classification. An audio segment of radio communications is obtained. An audio sub-segment within the audio segment is extracted. A sampled histogram is generated of a plurality of sampled values across a sampled time window of the audio sub-segment. A two-dimensional image is generated that represents a two-dimensional mapping of the sampled histogram along a first dimension and a predefined histogram along a second dimension that is orthogonal to the first dimension. The two-dimensional image is provided to an image classifier previously trained using the predefined histogram. An output is received from the image classifier based on the two-dimensional image. The output indicates whether the audio sub-segment contains speech.
    Type: Grant
    Filed: June 7, 2022
    Date of Patent: August 27, 2024
    Assignees: The Boeing Company, University of Washington
    Inventors: Stephen Gregory Dame, Les Eugene Atlas
  • Patent number: 12067977
    Abstract: The present disclosure discloses a speech recognition method and apparatus, and relates to the field of speech and deep learning technologies. A specific implementation scheme involves: acquiring candidate recognition results with first N recognition scores outputted by a speech recognition model for to-be-recognized speech, N being a positive integer greater than 1; scoring the N candidate recognition results based on pronunciation similarities between candidate recognition results and pre-collected popular entities, to obtain similarity scores of the candidate recognition results; and integrating the recognition scores and the similarity scores of the candidate recognition results to determine a recognition result corresponding to the to-be-recognized speech from the N candidate recognition results. The present disclosure can improve recognition accuracy.
    Type: Grant
    Filed: March 2, 2022
    Date of Patent: August 20, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Liao Zhang, Yinlou Zhao, Zhengxiang Jiang, Xiaoyin Fu, Wei Wei
  • Patent number: 12051424
    Abstract: An audio processing apparatus 100 is an apparatus for generating training data for speaker recognition. The audio processing apparatus 100 includes a data acquisition unit configured to acquire an audio signal that is a source of the training data as sample data, and a data generation unit configured to execute signal processing on the acquired sample data and to generate, as the training data, a new audio signal whose similarity with the sample data is within a set range.
    Type: Grant
    Filed: October 25, 2018
    Date of Patent: July 30, 2024
    Assignee: NEC CORPORATION
    Inventors: Hitoshi Yamamoto, Takafumi Koshinaka
  • Patent number: 12050887
    Abstract: Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has advantages of a fast computation speed and high efficiency.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: July 30, 2024
    Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD.
    Inventors: Tianshi Chen, Shaoli Liu, Zai Wang, Shuai Hu
  • Patent number: 12045567
    Abstract: A system determines service controls for organizations. The system receives documents from external systems representing reports storing information describing service controls for external services. The service controls are represented using natural language text. The system encodes the service controls using a natural language model to generate encoded service controls. The system determines similarity scores for pairs of service controls. The system determines one or more representative service controls for a category of external services based on similarity scores of pairs of the service controls. The system stores a mapping from categories of external services to representative service controls determined from the set of service controls corresponding to the category of external services. The system uses the mapping for determining representative service controls for external services corresponding to a set of services used by an organization.
    Type: Grant
    Filed: March 21, 2022
    Date of Patent: July 23, 2024
    Assignee: INTERSTICE LABS, INC.
    Inventor: Aleksandr Olegovich Gonopolskiy
  • Patent number: 12032580
    Abstract: An application can instantiate a parsing framework, provide an input stream, attach observers, and initiate parsing, which inverts control to the parsing framework. The parsing framework can have an observer manager, a parser controller, and parsers. The observer manager manages observer design patterns from which the observers are instantiated. The parser controller determines which parser would be appropriate for parsing the input stream and instantiates the appropriate parser(s). The parser controller gets the callbacks from the parsers and communicates outcomes to the observer manager. The observer manager determines which of the observers is to be notified, generates parsing notifications accordingly, and dispatches the parsing notifications directly to the observers. The application can be any application that needs parsing in an electronic information exchange platform.
    Type: Grant
    Filed: April 6, 2022
    Date of Patent: July 9, 2024
    Assignee: OPEN TEXT GXS ULC
    Inventors: Phil Hanson, Kris Loia
  • Patent number: 12032908
    Abstract: A system determines service controls for organizations. The system receives documents from external systems representing reports storing information describing service controls for external services. The service controls are represented using natural language text. The system encodes the service controls using a natural language model to generate encoded service controls. The system determines similarity scores for pairs of service controls. The system determines one or more representative service controls for a category of external services based on similarity scores of pairs of the service controls. The system stores a mapping from categories of external services to representative service controls determined from the set of service controls corresponding to the category of external services. The system uses the mapping for determining representative service controls for external services corresponding to a set of services used by an organization.
    Type: Grant
    Filed: March 21, 2022
    Date of Patent: July 9, 2024
    Assignee: INTERSTICE LABS, INC.
    Inventor: Aleksandr Olegovich Gonopolskiy
  • Patent number: 12027165
    Abstract: A non-transitory computer readable medium stores computer executable instructions which, when executed by at least one processor, cause the at least one processor to acquire a speech signal of speech of a user; perform a signal processing on the speech signal to acquire at least one feature of the speech of the user; and control display of information, related to each of one or more first candidate converters having a feature corresponding to the at least one feature, to present the one or more first candidate converters for selection by the user.
    Type: Grant
    Filed: July 9, 2021
    Date of Patent: July 2, 2024
    Assignee: GREE, INC.
    Inventor: Akihiko Shirai
  • Patent number: 12009009
    Abstract: A system that can capture a user's voice sample and, based on a comparison with other voice samples stored in a database, determine the existence of one or more musculoskeletal conditions within the user's body. The analysis of the voice sample includes comparing various voice characteristics of the sample against those in the database-stored samples to determine matches with conditions and their associated severities. The results are presented via an application that can present location and severity on a 3D avatar model.
    Type: Grant
    Filed: March 11, 2023
    Date of Patent: June 11, 2024
    Assignee: Sonaphi LLC
    Inventors: Brandon Muzik, Mark Hinds, Robert D. Fish
  • Patent number: 11990118
    Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.
    Type: Grant
    Filed: June 6, 2023
    Date of Patent: May 21, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote
  • Patent number: 11989219
    Abstract: Techniques for disambiguating which profile, of multiple profiles, is to be used to respond to a user input are described. A device located in a communal space (e.g., a hotel room or suite of rooms, conference room, hospital room, etc.) may be associated with a device profile and a user profile of a user presently occupying the communal space. When the user inputs a command to the device (either by text or speech), a system associated with the device determines the profiles (e.g., a device profile and a user profile) associated with the device. The system determines one or more policies associated with the device. The one or more policies may correspond to rules for disambiguating which profile to use to execute with respect to the user input. Using the one or more policies, the system determines which profile is to be used, and causes a speechlet component to execute using information specific to the determined profile.
    Type: Grant
    Filed: May 30, 2023
    Date of Patent: May 21, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Rebecca Joy Lopdrup Miller, Dick Clarence Hardt, Joseph Jessup, Yu Bao, Gonzalo Alvarez Barrio, Liron Torres
  • Patent number: 11983489
    Abstract: A method and computer system are provided for generating a text summary. An input text is processed by a model that comprises a set of attention heads. The model is trained to generate abstractive summaries of text documents. A subset of the attention heads are identified. For each attention head in the subset, a portion is identified from the input text that is used to generate the abstractive summary. For each sentence of the input text, a fractional size is calculated for an intersection of the portion and the respective sentence relative to the respective sentence. A subset of the sentences is then determined, where the respective fractional size of each sentence in the subset meets a first threshold. An extractive summary of the input text is then generated from the subset of sentences.
    Type: Grant
    Filed: June 23, 2023
    Date of Patent: May 14, 2024
    Assignee: Intuit Inc.
    Inventors: Natalie Bar Eliyahu, Ido Farhi, Adi Shalev, Oren Dar
  • Patent number: 11978447
    Abstract: The present disclosure provides a speech interaction method, apparatus, device and computer storage medium and relates to the field of artificial intelligence. A specific implementation solution is as follows: performing speech recognition and demand analysis for a first speech instruction input by a user; performing demand prediction for the first speech instruction if the demand analysis fails, to obtain at least one demand expression; returning at least one of the demand expression to the user in a form of a question; performing a service response with a demand analysis result corresponding to the demand expression confirmed by the user, if a second speech instruction confirming at least one of the demand expression is received from the user. The present disclosure can efficiently improve the user's interaction efficiency and enhance the user's experience.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: May 7, 2024
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Haifeng Wang, Jizhou Huang
  • Patent number: 11972764
    Abstract: Systems and methods for providing audio data, from an initially invoked automated assistant to a subsequently invoked automated assistant. An initially invoked automated assistant may be invoked by a user utterance, followed by audio data that includes a query. The query is provided to a secondary automated assistant for processing. Subsequently, the user can submit a query that is related to the first query. In response, the initially invoked automated assistant provides the query to the secondary automated assistant in lieu of providing the query to other secondary automated assistants based on similarity between the first query and the subsequent query.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: April 30, 2024
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Patent number: 11967308
    Abstract: Disclosed is an electronic device including a processor and a memory operatively connected to the processor and storing a language model. The electronic device may enter data into the language model, generate an embedding vector in the input embedding layer, add position information to the embedding vector in the positional encoding layer, branch the embedding vector based on domain information, normalize the branched embedding vectors, enter the normalized embedding vectors into the multi-head attention layer, enter output data of the multi-head attention layer into the first layer, normalize pieces of output data of the first layer, enter the normalized pieces of output data of the first layer into the feed-forward layer, enter output data of the feed-forward layer into the second layer and normalize pieces of output data of the second layer, and enter the normalized pieces of output data of the second layer into the linearization layer and the softmax layer to obtain result data.
    Type: Grant
    Filed: July 8, 2021
    Date of Patent: April 23, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Taewoo Lee, Taegyoon Kang, Hogyeong Kim, Minjoong Lee, Seokyeong Jung, Jiseung Jeong
  • Patent number: 11961521
    Abstract: Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for providing voice control using multiple digital assistants. In some embodiments, a voice platform operates to receive a voice input from a user. The voice platform selects a digital assistant from a plurality of digital assistants based on a trigger word. The voice platform then generates an intent from the voice input using the selected digital assistant. The voice platform then transmits the intent to a media device for processing.
    Type: Grant
    Filed: March 23, 2023
    Date of Patent: April 16, 2024
    Assignee: Roku, Inc.
    Inventors: Anthony John Wood, David Stern, Gregory Mack Garner
  • Patent number: 11948553
    Abstract: Embodiments described herein provide for audio processing operations that evaluate characteristics of audio signals that are independent of the speaker's voice. A neural network architecture trains and applies discriminatory neural networks tasked with modeling and classifying speaker-independent characteristics. The task-specific models generate or extract feature vectors from input audio data based on the trained embedding extraction models. The embeddings from the task-specific models are concatenated to form a deep-phoneprint (DP) vector for the input audio signal. The DP vector is a low dimensional representation of each of the speaker-independent characteristics of the audio signal and is applied in various downstream operations.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: April 2, 2024
    Assignee: Pindrop Security, Inc.
    Inventors: Kedar Phatak, Elie Khoury
  • Patent number: 11935508
    Abstract: The present document relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), and to digital effect processors, e.g. so-called exciters, where generation of harmonic distortion adds brightness to the processed signal. In particular, a system configured to generate a high frequency component of a signal from a low frequency component of the signal is described. The system may comprise an analysis filter bank (501) configured to provide a set of analysis subband signals from the low frequency component of the signal; wherein the set of analysis subband signals comprises at least two analysis subband signals; wherein the analysis filter bank (501) has a frequency resolution of Δf.
    Type: Grant
    Filed: March 31, 2023
    Date of Patent: March 19, 2024
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Per Ekstrand, Lars Villemoes, Per Hedelin
  • Patent number: 11935525
    Abstract: Systems and methods for utilizing microphone array information for acoustic modeling are disclosed. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: March 19, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Shiva Kumar Sundaram, Minhua Wu, Anirudh Raju, Spyridon Matsoukas, Arindam Mandal, Kenichi Kumatani
  • Patent number: 11914626
    Abstract: Techniques are disclosed relating to implementing a machine learning approach to cross-language translation and search. In certain embodiments, a method may include receiving a plurality of characters of a first language that are unsegmented and grouping the plurality of characters into multiple groups. The method also includes determining a set of word tokens based on one or more transliterations of the multiple groups and one or more translations of the multiple groups to a second language. Further, the method includes generating one or more word token solution sets by querying an index file using the one or more word tokens. The method also includes determining whether the index file references an entity name corresponding to the plurality of characters of the first language based on comparing the one or more token solution sets with the index file.
    Type: Grant
    Filed: March 22, 2021
    Date of Patent: February 27, 2024
    Assignee: PAYPAL, INC.
    Inventors: Rushik Upadhyay, Dhamodharan Lakshmipathy, Nandhini Ramesh, Aditya Kaulagi
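
Illustrative note on patent 11983489 (text summary generation): the sentence-selection step in that abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. It assumes the portions identified from the attention heads are already available as half-open character spans, that the portions from all selected heads are pooled together, and that a threshold of 0.5 is reasonable. The names `overlap_fraction` and `extractive_summary` are hypothetical.

```python
from typing import List, Tuple

# Half-open character interval [start, end) into the input text.
Span = Tuple[int, int]


def overlap_fraction(sentence: Span, portions: List[Span]) -> float:
    """Fraction of the sentence's characters covered by any attended portion.

    Assumption: portions from all selected attention heads are pooled, and
    overlapping portions are counted only once.
    """
    start, end = sentence
    length = end - start
    if length <= 0:
        return 0.0
    covered = [False] * length
    for p_start, p_end in portions:
        lo = max(start, p_start)
        hi = min(end, p_end)
        for i in range(lo, hi):  # empty range when there is no overlap
            covered[i - start] = True
    return sum(covered) / length


def extractive_summary(
    text: str,
    sentences: List[Span],
    portions: List[Span],
    threshold: float = 0.5,  # assumed value; the abstract only says "a first threshold"
) -> str:
    """Join, in original order, the sentences whose covered fraction meets the threshold."""
    picked = [
        text[s:e]
        for (s, e) in sentences
        if overlap_fraction((s, e), portions) >= threshold
    ]
    return " ".join(picked)
```

Dividing by the sentence length rather than by the portion length matches the abstract's wording that the intersection is measured "relative to the respective sentence", so a long attended portion does not penalize a short sentence it fully covers.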