Patents Examined by Ibrahim Siddo
  • Patent number: 12198701
    Abstract: A conversation support system is provided at an utterance place where utterance is delivered to a plurality of persons. The persons are each an utterer having a possibility of uttering and/or a performer having a possibility of marking. The conversation support system includes a hardware processor and a marking motion catcher. The hardware processor obtains voice data of an utterance made by an utterer and received by a voice receiver, and manages the voice data on a voice timeline. The marking motion catcher catches a marking motion by which a marker is given to the utterance. The hardware processor manages the marking motion on a marking timeline, and links the marking motion with the utterance on a same timeline.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: January 14, 2025
    Assignee: KONICA MINOLTA, INC.
    Inventors: Keita Saito, Aran Suzuki, Masaharu Harashima
  • Patent number: 12197484
    Abstract: A system and method for generating a multi-label classifier of textual data is presented. The method includes training a plurality of single-label classifiers, wherein each of the plurality of single-label classifiers is trained to classify textual data to a single predefined revenue-based label; and training the multi-label classifier using labeled data output by the plurality of single-label classifiers, wherein the multi-label classifier is trained to classify textual data to a vector including revenue-based labels and their respective classification scores.
    Type: Grant
    Filed: March 28, 2022
    Date of Patent: January 14, 2025
    Assignee: GONG.io Ltd.
    Inventors: Igal Grinis, Inbal Horev, Raquel Sitman, Omri Allouche
  • Patent number: 12198681
    Abstract: Techniques for personalized batch and streaming speech-to-text transcription of audio reduce the error rate of automatic speech recognition (ASR) systems in transcribing rare and out-of-vocabulary words. The techniques achieve personalization of connectionist temporal classification (CTC) models by using adaptive boosting to perform biasing at the level of sub-words. In addition to boosting, the techniques encompass a phone alignment network to bias sub-word predictions towards rare long-tail words and out-of-vocabulary words. A technical benefit of the techniques is that the accuracy of speech-to-text transcription of rare and out-of-vocabulary words in a custom vocabulary by an automatic speech recognition (ASR) system can be improved without having to train the ASR system on the custom vocabulary. Instead, the techniques allow the same ASR system trained on a base vocabulary to realize the accuracy improvements for different custom vocabularies spanning different domains.
    Type: Grant
    Filed: September 30, 2022
    Date of Patent: January 14, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Monica Lakshmi Sunkara, Srikanth Ronanki, Sravan Babu Bodapati, Jeffrey John Farris, Katrin Kirchhoff, Vivek Govindan, Yide Zou, Mohit Narendra Gupta, Silviu Mihai Burz
  • Patent number: 12190060
    Abstract: The present disclosure presents a generative model configured to receive input regarding an item in two different modalities, such as text data and non-text data (including, for example, image or audio data), in order to generate output regarding the item that is determined based on a combination of both modalities' input. Specific relative positional and token type embeddings may be employed in an encoder portion of an encoder-decoder arrangement. An associated decoder may be trained to generate new text corresponding to diverse tasks based on the encoded representation of the two inputs as generated within the encoder. For example, the decoder may be utilized to generate attributes regarding the input item, auto-complete or auto-correct a title or description of the item, among other uses.
    Type: Grant
    Filed: September 30, 2022
    Date of Patent: January 7, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Amirhossein Tavanaei, Karim Bouyarmane, Ismail Baha Tutar
  • Patent number: 12175198
    Abstract: A method of document processing is provided. An implementation includes: obtaining target text information and target layout information of a target document, where the target text information includes target text included in the target document and character position information of the target text, and the target layout information is used to characterize the region where text in the target document is located; fusing the target text information and the target layout information to obtain first multimodal information of the target document; and inputting the first multimodal information into an intelligent document comprehension model, and obtaining at least one target word in the target document and at least one feature vector corresponding to the at least one target word output by the intelligent document comprehension model, where each target word is related to semantics of the target document.
    Type: Grant
    Filed: September 23, 2022
    Date of Patent: December 24, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventor: Yingqi Sun
  • Patent number: 12175977
    Abstract: Systems and processes for operating a digital assistant are provided. In one example, a method includes receiving a first speech input from a user. The method further includes identifying context information and determining a user intent based on the first speech input and the context information. The method further includes determining whether the user intent is to perform a task using a searching process or an object managing process. The searching process is configured to search data, and the object managing process is configured to manage objects. The method further includes, in accordance with a determination that the user intent is to perform the task using the searching process, performing the task using the searching process; and in accordance with a determination that the user intent is to perform the task using the object managing process, performing the task using the object managing process.
    Type: Grant
    Filed: April 19, 2023
    Date of Patent: December 24, 2024
    Assignee: Apple Inc.
    Inventors: Aram D. Kudurshian, Bronwyn Jones, Elizabeth Caroline Furches Cranfill, Harry J. Saddler
  • Patent number: 12165636
    Abstract: Devices and techniques are generally described for inference reduction in natural language processing using semantic similarity-based caching. In various examples, first automatic speech recognition (ASR) data representing a first natural language input may be determined. A cache may be searched using the first ASR data. A first skill associated with the first ASR data may be determined from the cache. In some examples, first intent data representing a semantic interpretation of the first natural language input data may be determined by using a first natural language process associated with the first skill.
    Type: Grant
    Filed: November 10, 2022
    Date of Patent: December 10, 2024
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Kiana Hajebi, Vivek Yadav, Pradeep Natarajan
  • Patent number: 12159640
    Abstract: Provided is an encoding method according to various example embodiments and an encoder performing the method. The encoding method includes outputting a linear prediction (LP) coefficients bitstream and a residual signal by performing a linear prediction analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, using a first neural network module, outputting a first bitstream obtained by quantizing the first latent signal, using a quantization module, outputting a second latent signal obtained by encoding an aperiodic component of the residual signal, using the first neural network module, and outputting a second bitstream obtained by quantizing the second latent signal, using the quantization module, wherein the aperiodic component of the residual signal is calculated based on a periodic component of the residual signal decoded from the quantized first latent signal output by de-quantizing the first bitstream.
    Type: Grant
    Filed: August 9, 2022
    Date of Patent: December 3, 2024
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Jongmo Sung, Seung Kwon Beack, Tae Jin Lee, Woo-taek Lim, Inseon Jang
  • Patent number: 12153889
    Abstract: Systems, devices, and methods of the present invention involve discourse trees. In an example, a method involves generating a discourse tree. The method includes identifying, from the discourse tree, a central entity that is associated with a rhetorical relation of type elaboration and corresponds to a topic node that identifies a central entity of the text. The method includes determining a subset of elementary discourse units of the discourse tree that are associated with the central entity. The method includes forming generalized phrases from the subset of elementary discourse units. The method includes forming tuples from the generalized phrases, where a tuple is an ordered set of words in normal form. The method involves, responsive to successfully converting an elementary discourse unit associated with an identified tuple into a logical representation, updating an ontology with an entity from the identified tuple.
    Type: Grant
    Filed: December 14, 2023
    Date of Patent: November 26, 2024
    Assignee: Oracle International Corporation
    Inventor: Boris Galitsky
  • Patent number: 12153893
    Abstract: A method and system for providing tone detection for a content may include receiving a request to detect a tone for a content, retrieving user data and data about the content, detecting a content environment for the content based on at least one of the user data and the data about the content, detecting the tone for the content based on the content and the content environment, inputting the content and the detected tone into a machine-learning (ML) model for modifying the tone from the detected tone to a modified tone, obtaining at least one rephrased content segment as an output from the ML model, the rephrased content segment modifying the tone of the content from the detected tone to the modified tone, and providing at least one of the detected tone or the at least one rephrased content segment for display.
    Type: Grant
    Filed: January 25, 2022
    Date of Patent: November 26, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tomasz Lukasz Religa, Zhang Li, Christine Lauren Mayer, Max Wang, Huitian Jiao, Weixin Cai, Cheng Yang, Christie Chan, Siqing Chen
  • Patent number: 12154590
    Abstract: A sound data processing method of a sound data processing device, the sound data processing device including a processing unit configured to acquire sound data of a target by input and to process the sound data, the sound data processing method including: a step of generating, by using acquired normal sound data of the target, simulated abnormal sound data that becomes a simulated abnormal sound of the target; and a step of performing machine learning by using the acquired normal sound data and the generated simulated abnormal sound data as learning sound data, and generating a learning model for determining an abnormal sound of the sound data of the target to perform abnormal sound detection.
    Type: Grant
    Filed: October 18, 2023
    Date of Patent: November 26, 2024
    Assignee: Panasonic Intellectual Property Management Co., Ltd.
    Inventor: Ryota Fujii
  • Patent number: 12149914
    Abstract: A method, computer program product, and computing system for obtaining machine vision encounter information using one or more machine vision systems. Audio encounter information may be obtained using a plurality of audio acquisition devices of an audio recording system. The audio encounter information may be encoded using an audio codec. The encoding of the audio encounter information by the audio codec may be adapted based upon, at least in part, the machine vision encounter information.
    Type: Grant
    Filed: February 11, 2022
    Date of Patent: November 19, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dushyant Sharma, Patrick A. Naylor, Uwe Helmut Jost
  • Patent number: 12142278
    Abstract: Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for providing augmented reality content in association with travel. The program and method provide for receiving, by a messaging application, a request to perform a scan operation in association with an image captured by a device camera; determining a travel parameter associated with the request and an attribute of an object depicted in the image; selecting an augmented reality content item based on at least one of the travel parameter or the attribute, the augmented reality content item being configured to present augmented reality content based on speech input; receiving the speech input; obtaining at least one of a transcription or translation of the speech input; and presenting the augmented reality content item, including the transcription or translation, in association with the image.
    Type: Grant
    Filed: August 30, 2023
    Date of Patent: November 12, 2024
    Assignee: Snap Inc.
    Inventors: Virginia Drummond, Ilteris Kaan Canberk, Jean Luo, Alek Matthiessen, Celia Nicole Mourkogiannis
  • Patent number: 12141534
    Abstract: Techniques for personalizing an AI automated conversational system are provided. In one aspect, a method for personalizing an automated conversational system includes: making predictions of a familiarity of a user with concepts needed to understand a standard output utterance based on the familiarity of an aggregate of users and a background knowledge model of the concepts and related concepts, wherein the standard output utterance assumes that the concepts are known; and giving, by the automated conversational system, an output utterance that is tailored to the user given the predictions. For instance, the automated conversational system can give the standard output utterance to the user when it is predicted that the user is familiar with the concepts, or a nonstandard output utterance when it is predicted that the user is unfamiliar with at least one of the concepts.
    Type: Grant
    Filed: December 30, 2021
    Date of Patent: November 12, 2024
    Assignee: International Business Machines Corporation
    Inventors: Robert John Moore, Eric Young Liu, Shun Jiang, Chung-hao Tan, Lei Huang, Guangjie Ren, Sungeun An
  • Patent number: 12141503
    Abstract: Systems and methods to implement commands based on selection sequences to a user interface are disclosed. Exemplary implementations may: store, in electronic storage, a library of terms utterable by users that facilitate implementation of intended results; obtain audio information representing sounds captured by a client computing platform; detect the spoken terms uttered by the user present within the audio information; determine whether the spoken terms detected are included in the library of terms; responsive to determination that the spoken terms are not included in the library of terms, effectuate presentation of an error message via the user interface; record a selection sequence that the user performs subsequent to the presentation of the error message that causes a result; correlate the selection sequence with the spoken terms based on the selection sequence recorded subsequent to the error message to generate a correlation; and store the correlation to the electronic storage.
    Type: Grant
    Filed: November 28, 2023
    Date of Patent: November 12, 2024
    Assignee: Suki AI, Inc.
    Inventors: Jatin Chhugani, Ganesh Satish Mallya, Alan Diec, Vamsi Reddy Chagari, Sudheer Tumu, Nithyanand Kota, Maneesh Dewan
  • Patent number: 12131741
    Abstract: An audio transmission method includes: a second device sends, to a first device, noise energy data at the second device end and transmission efficiency data; the first device determines a first bit rate based on the noise energy data, and determines a second bit rate based on the transmission efficiency data; the first device encodes an audio stream based on the lower of the first bit rate and the second bit rate to obtain audio data; and the first device sends the audio data obtained by encoding to the second device.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: October 29, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Dong Shi, Jie Su
  • Patent number: 12131736
    Abstract: In some implementations, a recording device may obtain a settings configuration associated with deactivating an audio recording function or an audio processing function of the recording device, wherein the settings configuration indicates one or more deactivation events. The recording device may obtain first audio content associated with the recording device for identifying audio prompts associated with causing the recording device to perform one or more actions. The recording device may detect a deactivation event of the one or more deactivation events. The recording device may refrain from obtaining audio content based on detecting the deactivation event and until an activation event is detected. The recording device may obtain second audio content associated with the recording device based on detecting the activation event.
    Type: Grant
    Filed: July 5, 2022
    Date of Patent: October 29, 2024
    Assignee: Capital One Services, LLC
    Inventors: Jeremy Goodsitt, Galen Rafferty, Samuel Sharpe, Grant Eden, Austin Walters, Anh Truong, Christopher Wallace
  • Patent number: 12118999
    Abstract: Systems and processes for selectively processing and responding to a spoken user input are provided. In one example, audio input containing a spoken user input can be received at a user device. The spoken user input can be identified from the audio input by identifying start and end-points of the spoken user input. It can be determined whether or not the spoken user input was intended for a virtual assistant based on contextual information. The determination can be made using a rule-based system or a probabilistic system. If it is determined that the spoken user input was intended for the virtual assistant, the spoken user input can be processed and an appropriate response can be generated. If it is instead determined that the spoken user input was not intended for the virtual assistant, the spoken user input can be ignored and/or no response can be generated.
    Type: Grant
    Filed: August 7, 2023
    Date of Patent: October 15, 2024
    Assignee: Apple Inc.
    Inventors: Philippe P. Piernot, Justin G. Binder
  • Patent number: 12112754
    Abstract: Implementations relate to an automated assistant that can respond to communications received via a third party application and/or other third party communication modality. The automated assistant can determine that the user is participating in multiple different conversations via multiple different third party communication services. In some implementations, conversations can be processed to identify particular features of the conversations. When the automated assistant is invoked to provide input to a conversation, the automated assistant can compare the input to the identified conversation features in order to select the particular conversation that is most relevant to the input. In this way, the automated assistant can assist with any of multiple disparate conversations that are each occurring via a different third party application.
    Type: Grant
    Filed: November 20, 2023
    Date of Patent: October 8, 2024
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Patent number: 12094467
    Abstract: Techniques are described for providing emulation software used to emulate voice assistant-enabled devices, and further for a platform used to perform and monitor large scale load tests against fleets of voice assistant-enabled device emulators. The emulation software broadly provides a collection of software libraries for creating emulated instances of a wide range of voice assistant-enabled device types capable of interacting with voice assistant services provided by a cloud provider network. The emulation software further includes software interfaces and observable data streams that enable developers to configure and extend emulated device capabilities, to obtain debugging and performance information, and the like. The described load testing platform further enables developers to test the performance of voice assistant-related technologies (including e.g., device features, voice assistant apps, third-party services, etc.) at scale.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: September 17, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Diyu Zhu, Puneeth Simha Kadaba Sathya Kumar, Saso Crnugelj-Gale, Chenyuan Wang, Shilpi Nair, Anirudh Daga