Recognition Patents (Class 704/231)
  • Patent number: 11954441
    Abstract: A device and method for generating article markup information are provided. The method for generating article markup information includes the following. Segmentation processing is performed on an article to generate a segmentation result. Named entity recognition is performed on the segmentation result to generate a first recognition result. Whether the segmentation result includes any word in an expansion list is determined. Expanded entity classification conversion is performed on the first recognition result to generate a second recognition result. The second recognition result and the segmentation result are used as markup information.
    Type: Grant
    Filed: January 4, 2022
    Date of Patent: April 9, 2024
    Assignee: Acer Incorporated
    Inventors: Yi-Chun Lin, Yueh-Yarng Tsai, Pin-Cyuan Lin, Ke-Han Pan, Sheng-Wei Chu
  • Patent number: 11947784
    Abstract: An electronic device with a touch-sensitive display, one or more processors, and memory detects a first user input. In response to detecting the first user input, the device displays on the touch-sensitive display a user interface screen including a first affordance and a second affordance. The device detects a second user input including a contact on the touch-sensitive display. In accordance with a determination that the contact corresponds to selection of the first affordance, the electronic device is caused to turn off. In accordance with a determination that the contact corresponds to selection of the second affordance, the device causes initiation of a telephone call to a determined number.
    Type: Grant
    Filed: June 16, 2021
    Date of Patent: April 2, 2024
    Assignee: Apple Inc.
    Inventors: Aled Hywel Williams, Jonathan P. Ive, Bronwyn Jones, Ieyuki Kawashima, Kevin Lynch, Natalia Maric, Andreas E. Schobel, Molly Pray Wiebe
  • Patent number: 11948554
    Abstract: The acoustic feature extraction means 82 extracts an acoustic feature, using predetermined parameters, from an acoustic pattern obtained as a result of processing on an acoustic signal. The language vector calculation means 83 calculates a language vector from a given label that represents an attribute of a source of the acoustic signal and that is associated with the acoustic pattern. The similarity calculation means 84 calculates a similarity between the acoustic feature and the language vector. The parameter update means 85 learns parameters so that the similarity becomes larger, and updates the predetermined parameters to the parameters obtained by learning.
    Type: Grant
    Filed: September 20, 2018
    Date of Patent: April 2, 2024
    Assignee: NEC CORPORATION
    Inventors: Tatsuya Komatsu, Reishi Kondo, Sakiko Mishima
  • Patent number: 11942975
    Abstract: An apparatus for correcting an input signal is configured for receiving the input signal, the received input signal comprising a series of input values. The apparatus is configured for matching a series of template values to the series of input values by warping the series of template values and the series of input values relatively to each other so as to assign one or more template values to one or more input values, wherein the series of template values represents an approximation of a noise signal that is expected to be comprised in the input signal. The apparatus is configured for obtaining a series of corrected input values based on a mismatch between the input values and their respective assigned template values. The apparatus is configured for providing a corrected signal based on the series of corrected input values.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: March 26, 2024
    Assignee: INFINEON TECHNOLOGIES AG
    Inventors: Alessandra Fusco, Christian Bretthauer
  • Patent number: 11935520
    Abstract: A method and system for identifying the beginning and ending of songs via a machine learning analysis. A machine learning model analyzes streaming audio (such as a radio broadcast) in overlapping, 3-second samples. Each sample is labeled into groups such as “song,” “talk,” “commercial” and “transition.” Based on the location of the transition samples, an exact second a given song begins and ends in the audio stream is derivable. The model further identifies when two songs shift between one another.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: March 19, 2024
    Assignee: Auddia Inc.
    Inventors: Peter Shoebridge, Jeffrey Thramann, Pablo Calderon Rodriguez
  • Patent number: 11929064
    Abstract: A method for detecting a hotword includes receiving a sequence of input frames that characterize streaming audio captured by a user device and generating a probability score indicating a presence of a hotword in the streaming audio using a memorized neural network. The network includes sequentially-stacked single value decomposition filter (SVDF) layers and each SVDF layer includes at least one neuron. Each neuron includes a respective memory component, a first stage configured to perform filtering on audio features of each input frame individually and output to the memory component, and a second stage configured to perform filtering on all the filtered audio features residing in the respective memory component. The method also includes determining whether the probability score satisfies a hotword detection threshold and initiating a wake-up process on the user device for processing additional terms.
    Type: Grant
    Filed: January 9, 2023
    Date of Patent: March 12, 2024
    Assignee: Google LLC
    Inventors: Raziel Alvarez Guevara, Hyun Jin Park
  • Patent number: 11922948
    Abstract: Implementations of the subject technology provide systems and methods for multi-mode voice triggering for audio devices. An audio device may store multiple voice recognition models, each trained to detect a single corresponding trigger phrase. So that the audio device can detect a specific one of the multiple trigger phrases without consuming the processing and/or power resources to run a voice recognition model that can differentiate between different trigger phrases, the audio device pre-loads a selected one of the voice recognition models for an expected trigger phrase into a processor of the audio device. The audio device may select the one of the voice recognition models for the expected trigger phrase based on a type of a companion device that is communicatively coupled to the audio device.
    Type: Grant
    Filed: April 21, 2023
    Date of Patent: March 5, 2024
    Assignee: Apple Inc.
    Inventors: Dersheet C. Mehta, Dinesh Garg, Sham Anton Koli, Kerry J. Kopp, Hans Bernhard
  • Patent number: 11915113
    Abstract: According to principles described herein, a system applies Active Learning methodology to multiple models simultaneously. The system includes a means to distribute the sample selection algorithm across large pools of unlabeled data and an automatic model training deployed on hardware matched to the model type that scales to large volumes of data without consuming all resources.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: February 27, 2024
    Assignee: Verint Americas Inc.
    Inventor: Ian Roy Beaver
  • Patent number: 11908480
    Abstract: This disclosure proposes systems and methods for processing natural language inputs using data associated with multiple language recognition contexts (LRC). A system using multiple LRCs can receive input data from a device, identify a first identifier associated with the device, and further identify second identifiers associated with the first identifier and representing candidate users of the device. The system can access language processing data used for natural language processing for the LRCs corresponding to each of the first and second identifiers, and process the input data using the language processing data at one or more stages of automatic speech recognition, natural language understanding, entity resolution, and/or command execution. User recognition can reduce the number of candidate users, and thus the amount of data used to process the input data. Dynamic arbitration can select from between competing hypotheses representing the first identifier and a second identifier, respectively.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: February 20, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Da Teng, Adrian Evans, Naresh Narayanan
  • Patent number: 11908034
    Abstract: A computer system can receive requests for transport from computing devices of users while the users ride a transit vehicle. The system can determine a rate of travel of the transit vehicle based on location data received from the computing device of a user riding the transit vehicle. Based at least in part on the rate of travel of the transit vehicle, the system can determine a first estimated time of arrival (ETA) of the user to the start location. The system can further receive location data from computing devices associated with available vehicles within a proximity of a start location of the user, and select one of vehicles to service the request when the ETA of the vehicle is within a threshold amount of time of the first ETA.
    Type: Grant
    Filed: September 14, 2021
    Date of Patent: February 20, 2024
    Assignee: Uber Technologies, Inc.
    Inventors: Nuri Kim, Christopher Haugli, Rachel Lin, Hasrat Godil, Jeffrey Wolski, Amos Barreto
  • Patent number: 11900007
    Abstract: A display apparatus includes: a display; an input interface; and a processor in connection with the display and the input interface and configured to: upon detecting access of a first power amplifier device, output a high-level Hotplug signal at a Hotplug port of the display apparatus; monitor whether a common-mode data packet from the first power amplifier device is received within a first preset duration; in response to the common-mode data packet being received within the first preset duration, send a heartbeat packet to the first power amplifier device, and monitor whether a heartbeat response is received within a second preset duration; in response to the heartbeat response being received within the second preset duration, determine that the first power amplifier device supports e-ARC function; and in response to the heartbeat response being not received within the second preset duration, determine that the first power amplifier device supports ARC function.
    Type: Grant
    Filed: February 24, 2023
    Date of Patent: February 13, 2024
    Assignee: Hisense Visual Technology Co., Ltd.
    Inventors: Haiying Wang, Pingguang Lu, Shanliang Xu, Xianzhuo Sun, Yanli Wu
  • Patent number: 11902467
    Abstract: A method is provided. The method comprises a computer performing receiving notice of an incoming inquiry originated by a customer device. The method further comprises the computer performing instantiating a session with the customer device, the session performing comprising the computer combining chatbot functionality and flow control functionality. The method further comprises the computer performing a prompt to the customer device, the prompt comprising a set of valid selections. The method further comprises the computer performing receiving a selection from the customer device. The method further comprises the computer performing, based on the received selection, at least one of providing information to the customer device and routing the inquiry.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: February 13, 2024
    Assignee: INTRADO CORPORATION
    Inventors: Prabhat Dahal, Geoff Finch, Steven Heithoff, Mike Lindner, Sarath Ravindran, Mayank Sawala
  • Patent number: 11900929
    Abstract: According to an embodiment of the disclosure, an electronic device may include a speaker, a microphone, a wireless communication circuit, and at least one processor connected to the speaker, the microphone, and the wireless communication circuit. The at least one processor may be configured to: in response to a user's voice command received through the microphone, perform a task corresponding to the voice command, based on an information amount contained in a result of the task, determine a type of the result to be visually appropriate or auditorily appropriate, and based on the type of the result, determine a device for providing the result as a screen device or a speaker.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: February 13, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yongho Kim, Hyunjin Kim, Sunah Kim, Dayoung Lee, Jaeyoung Lee, Jungkun Lee
  • Patent number: 11887620
    Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.
    Type: Grant
    Filed: January 27, 2020
    Date of Patent: January 30, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
  • Patent number: 11869481
    Abstract: Systems and methods are directed to improving speech signal recognition. A method includes obtaining a spatial audio signal; separating a continuous speech signal and a corresponding directivity flag signal for a sound source direction from the spatial audio signal; and combining the continuous speech signal with the corresponding directivity flag signal for the sound source direction to generate a speech activation detection signal for the sound source direction. Because the speech activation detection signal of the sound source direction is obtained by combining the continuous speech signal with the directivity flag signal of the sound source direction, the speech activation detection signal has directivity, reducing the interference from continuous speech signal in other sound source directions.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: January 9, 2024
    Assignee: Alibaba Group Holding Limited
    Inventor: Yong Liu
  • Patent number: 11862167
    Abstract: A spoken dialogue device includes a recognition unit that recognizes an acquired user speech, a barge-in speech control unit that determines whether to engage a barge-in speech, a dialogue control unit that outputs a system response to a user based on a recognition result of the user speech other than the barge-in speech determined not to be engaged by the barge-in speech control unit, a response generation unit that generates a system speech based on the system response, and an output unit that outputs a system speech. When each user speech element included in the user speech corresponds to a predetermined morpheme included in the immediately previous system speech and does not correspond to a response candidate to the immediately previous system speech by a user, the barge-in speech control unit does not engage at least the user speech element.
    Type: Grant
    Filed: January 14, 2020
    Date of Patent: January 2, 2024
    Assignee: NTT DOCOMO, INC.
    Inventors: Mariko Chiba, Taichi Asami
  • Patent number: 11861264
    Abstract: A portable terminal device in an information processing system and method includes a camera and a microphone. Data of obtained images and voice are transmitted to a server that identifies operations to be executed based on the received voice and image data. The server transmits an identification of one or more results of the plurality of operations to the portable terminal device. When the portable terminal device receives only one result from the server, an operation corresponding to the one result is executed, and when a plurality of results is received, the portable terminal device displays information corresponding to the plurality of results as candidates. Additional voice is captured for selecting one of the plurality of results during the displaying of the information. A determination of one result from the plurality of results is made based on the captured voice, and an operation corresponding to the determined result is executed.
    Type: Grant
    Filed: October 20, 2022
    Date of Patent: January 2, 2024
    Assignee: Maxell, Ltd.
    Inventors: Motoyuki Suzuki, Hideo Nishijima
  • Patent number: 11849908
    Abstract: A method of providing an intelligent voice recognition model includes obtaining space type information about a placement area of the voice recognition device, extracting space feature information from the space type information, and generating a predetermined voice recognition model matched to the extracted space feature information. At least one device implementing the method of providing the intelligent voice recognition model may be associated with an artificial intelligence module, an unmanned aerial vehicle (UAV), a robot, an augmented reality (AR) device, a virtual reality (VR) device, devices related to 5G services, and the like.
    Type: Grant
    Filed: June 5, 2019
    Date of Patent: December 26, 2023
    Assignee: LG Electronics Inc.
    Inventor: Jonghoon Chae
  • Patent number: 11854549
    Abstract: Systems and methods for distinguishing valid voice commands from false voice commands in an interactive media guidance application. In some aspects, the interactive media guidance application receives, at a user device, a signature sound sequence. The interactive media guidance application determines, using control circuitry, based on the signature sound sequence, a threshold gain for the current location of the user device. The interactive media guidance application receives, at the user device, a voice command. The interactive media guidance application determines, using the control circuitry, based on the voice command, a gain for the voice command. The interactive media guidance application determines, using the control circuitry, whether the gain for the voice command is different from the threshold gain. Based on determining that the gain for the voice command is different from the threshold gain, the interactive media guidance application executes, using the control circuitry, the voice command.
    Type: Grant
    Filed: December 14, 2022
    Date of Patent: December 26, 2023
    Assignee: Rovi Guides, Inc.
    Inventors: Edison Lin, Rowena Young, Lauren Palmateer
  • Patent number: 11848018
    Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the plurality of words of the spoken utterance. The neural network-based utterance classifier is trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further includes determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: December 19, 2023
    Assignee: Google LLC
    Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
  • Patent number: 11848007
    Abstract: A display apparatus includes a display, a voice input receiver, a memory, a communication circuitry and a processor. The processor is configured to control the display to display at least one first identifier corresponding to at least one first component on a first area in the screen during a first time such that one of the at least one first identifier is selectable by a first user voice input, and control the display to display at least one second identifier corresponding to at least one second component on a second area in the screen during a second time different from the first time, such that one of the at least one second identifier is selectable by a second user voice input.
    Type: Grant
    Filed: July 13, 2022
    Date of Patent: December 19, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kyeonga Han, Soungmin Yoo
  • Patent number: 11848029
    Abstract: A method for detecting an audio signal, the method comprises: obtaining a speech segment and a non-speech segment of an audio signal to be detected, extracting a first audio feature of the speech segment and a second audio feature of the non-speech segment, detecting the first audio feature using a predetermined speech segment detection model to obtain a first detection score, detecting the second audio feature using a predetermined non-speech segment detection model to obtain a second detection score, and determining whether the audio signal belongs to a target audio based on the first detection score and the second detection score.
    Type: Grant
    Filed: May 21, 2021
    Date of Patent: December 19, 2023
    Assignee: BEIJING XIAOMI PINECONE ELECTRONICS CO., LTD.
    Inventors: Yifeng Wang, Guodu Cai, Shuo Yang, Lihan Li, Peng Gao
  • Patent number: 11823685
    Abstract: A method includes receiving acoustic features of a first utterance spoken by a first user that speaks with typical speech and processing the acoustic features of the first utterance using a general speech recognizer to generate a first transcription of the first utterance. The operations also include analyzing the first transcription of the first utterance to identify one or more bias terms in the first transcription and biasing an alternative speech recognizer on the one or more bias terms identified in the first transcription. The operations also include receiving acoustic features of a second utterance spoken by a second user that speaks with atypical speech and processing, using the alternative speech recognizer biased on the one or more terms identified in the first transcription, the acoustic features of the second utterance to generate a second transcription of the second utterance.
    Type: Grant
    Filed: January 25, 2023
    Date of Patent: November 21, 2023
    Assignee: Google LLC
    Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar
  • Patent number: 11817097
    Abstract: An electronic apparatus is provided. The electronic apparatus includes a communicator, a memory, and a processor connected to the communicator and the memory and configured to control the electronic apparatus. The processor is configured to, by executing at least one command stored in the memory, based on a user input for executing an assistant service being received, transmit information on a user voice acquired by the electronic apparatus to a plurality of servers providing different assistant services through the communicator, and based on a plurality of response information being received from the plurality of servers, provide a response on the user voice based on at least one of the plurality of response information. The plurality of servers provide the assistant service using an artificial intelligence agent.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: November 14, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Wonnam Jang, Sooyeon Kim, Sungrae Jo
  • Patent number: 11817101
    Abstract: A system, method and computer program are provided for generating customized text representations of audio commands. A first speech recognition module may be used for generating a first text representation of an audio command based on a general language grammar. A second speech recognition module may be used for generating a second text representation of the audio command, the second module including a custom language grammar that may include contacts for a particular user. Entity extraction is applied to the second text representation and the entities are checked against a file containing personal language. If the entities are found in the user-specific language, the two text representations may be fused into a combined text representation and named entity recognition may be performed again to extract further entities.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: November 14, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wilson Hsu, Kaheer Suleman, Joshua Pantony
  • Patent number: 11816224
    Abstract: In general, various aspects of the present disclosure provide methods, apparatuses, systems, computing devices, computing entities, and/or the like for addressing a modified risk rating identifying a risk to an entity of having computer-implemented functionality provided by a vendor integrated with a computing system of the entity.
    Type: Grant
    Filed: October 31, 2022
    Date of Patent: November 14, 2023
    Assignee: OneTrust, LLC
    Inventors: Jason L. Sabourin, Shiven Patel
  • Patent number: 11803561
    Abstract: Documents may be maintained in a repository and retrieved based on searches that specify labels as criteria. Documents may be associated with groups of labels identified as topics. Searches may be performed using binary-encoded matrices specifying relationships between documents and topics, topics and labels and differential information indicating differences between topics and labels associated with documents. An initial result estimate may be based on forming a product of a documents-topics matrix and a topics-labels matrix. The initial estimate may be corrected by applying the differential information.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: October 31, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: William Nathan John Hurst, Timothy Daniel Cole
  • Patent number: 11785136
    Abstract: A method and system are provided for audio quality feedback during live transmission from a source that is received at multiple audience devices. The method carried out at a server includes: obtaining audio information of an audio signal as received by at least some of the audience devices in a transmission session; classifying one or more subsets of the audience devices by one or more common factors per subset; and analyzing the obtained audio information from the audience devices in conjunction with the classifications of the subsets of the audience devices to determine one or more common factors that affect received audio quality at an identified subset of the audience devices classified by the one or more common factors. The method provides feedback of the one or more common factors to at least one of the audience devices in the identified subset or to the source device, or to both.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: October 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Jenny Jing He, Adrian Kyte, Joseph R Winchester, Cheng Fang Wang, Ping Xiao
  • Patent number: 11783224
    Abstract: One embodiment of the invention provides a method of training a chatbot. The method comprises identifying one or more chat logs that exhibit a trait. The method further comprises identifying one or more labels associated with the trait based on the one or more chat logs. The method further comprises training the chatbot to generate a response that models the trait based on the one or more chat logs. The method further comprises labeling the chatbot with the one or more labels.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: October 10, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Abhijit Mishra, Enara C. Vijil, Seema Nagar, Kuntal Dey
  • Patent number: 11783314
    Abstract: In one embodiment, a method includes receiving, by a payment service system (PSS), a payment request from a payment application executing on a device of a sender. The payment request includes a recipient identifier corresponding to a recipient. The sender and recipient have financial accounts associated with the PSS. The method includes determining a similarity score based on a comparison of contact records associated with the device of the sender and contact records associated with a device of the recipient. The method includes, responsive to determining that the similarity score does not satisfy a threshold similarity score, transmitting a confirmation request to the device of the sender to confirm an identity of the recipient. The method includes receiving, from the device of the sender, an approval response to the confirmation request. The method includes authorizing a payment transaction associated with the payment request based on the approval response.
    Type: Grant
    Filed: October 8, 2020
    Date of Patent: October 10, 2023
    Assignee: Block, Inc.
    Inventors: Brian Grassadonia, Ayokunle Omojola, Michael Moring, Robert Andersen, Daniele Perito, Kristopher Stipech
  • Patent number: 11765430
    Abstract: Example apparatus disclosed herein include a return path data classifier to classify a first viewing period associated with segments of return path data received from a set top box into tuning classifications based on the segments of the return path data; calculate a total reported tuning duration for the first viewing period when the first viewing period is classified as live or playback tuning; and compare the total reported tuning duration to a duration threshold to determine whether the segments of return path data associated with the first viewing period are valid. The example apparatus also includes a return path data rectifier to rectify missing tuning data associated with a second viewing period based on tuning data included in the segments of return path data associated with the first viewing period when the segments of the return path data associated with the first viewing period are determined to be valid.
    Type: Grant
    Filed: November 16, 2020
    Date of Patent: September 19, 2023
    Assignee: The Nielsen Company (US), LLC
    Inventors: Balachander Shankar, Jonathan Sullivan, Molly Poppie, John Charles Coughlin, Paul Chimenti, Rachel Worth Olson, Samantha M. Mowrer, David J. Kurzynski, Remy Spoentgen, Christine Heiss, Shuangxing Chen
  • Patent number: 11762494
    Abstract: A system and method for identifying a user of a device includes comparing audio received by a device with acoustic fingerprint information to identify a user of the device. Image data, video data and other data may also be used in the identification of the user. Once the user is identified, operation of the device may be customized based on the user. Further, once the user is identified, data can be associated with the user, for example, usage data, location data, gender data, age data, dominant hand data of the user, and other data. This data can then be used to further customize the operation of the device to the specific user.
    Type: Grant
    Filed: October 20, 2020
    Date of Patent: September 19, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael David Dumont, Jonathan White Keljo, Levon Dolbakian, Srinivasan Sridharan, Arnaud Marie Froment, Nadim Awad, Kenneth Paul Kiraly
  • Patent number: 11763398
    Abstract: The present technology extends to methods, systems, and computer program products for expanding semantic classes via user feedback. Aspects of the technology learn how a set of labels can be expanded from user-generated tags. Text labels applied by human reviewers to digital content can be inspected and compared to one another. When a threshold of human-generated text tags contain similar terminology, the set of labels can be expanded to define a representation of the similar terminology. Similar terminology can include terms that originate from the same base term, are synonyms, are more specific terms related to a general term category, etc. Similar terminology can be consolidated into a defining term that is used to generate a new (more granular) label or a new top level label. Accordingly, new semantic classes can be discovered from user-generated feedback. New semantic classes can provide a more granular representation of content item classification.
    Type: Grant
    Filed: June 7, 2022
    Date of Patent: September 19, 2023
    Assignee: DISCORD INC.
    Inventors: Michele Banko, Alok Puranik, Taylor Rhyne
  • Patent number: 11755637
    Abstract: The decoder network includes multiple decoders trained to generate different types of summaries. The lower layers of the multiple decoders are shared. The upper layers of the multiple decoders do not overlap. The multiple decoders generate probability distributions. A gating mechanism combines the probability distributions of the multiple decoders into a probability distribution of the decoder network. Words in the summary are selected based on the probability distribution of the decoder network.
    Type: Grant
    Filed: January 10, 2022
    Date of Patent: September 12, 2023
    Assignee: Salesforce, Inc.
    Inventors: Tanya Goyal, Wojciech Kryscinski, Nazneen Rajani
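    Illustrative note: the gating step in the abstract above can be sketched as a weighted mixture of the per-decoder word distributions. This is a minimal sketch, not the patented implementation. The softmax gate and the names `softmax`, `combine_decoders`, and `pick_word` are assumptions made for illustration; the abstract only says that a gating mechanism combines the distributions.

```python
import math


def softmax(scores):
    """Turn arbitrary gate scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_decoders(decoder_dists, gate_logits):
    """Mix per-decoder word distributions using gate weights.

    decoder_dists: one probability distribution over the vocabulary per decoder.
    gate_logits:   one unnormalized gate score per decoder (hypothetical input).
    """
    weights = softmax(gate_logits)
    vocab_size = len(decoder_dists[0])
    return [
        sum(w * dist[i] for w, dist in zip(weights, decoder_dists))
        for i in range(vocab_size)
    ]


def pick_word(vocab, mixed):
    """Greedy selection: return the word with the highest mixed probability."""
    best = max(range(len(vocab)), key=lambda i: mixed[i])
    return vocab[best]
```

    Because the gate weights sum to 1 and each decoder output is a probability distribution, the mixture is itself a probability distribution over the vocabulary.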
  • Patent number: 11755701
    Abstract: Methods, apparatus and systems for biometric authentication based on an audio signal are provided. The audio signal comprises a representation of a voice signal of a user conducted via at least part of a user's skeleton. Further embodiments may relate to biometric authentication based upon a combination of a bone-conducted audio signal, or a bone-conducted voice biometric process, with an air-conducted voice signal.
    Type: Grant
    Filed: July 6, 2018
    Date of Patent: September 12, 2023
    Assignee: Cirrus Logic Inc.
    Inventor: John Paul Lesso
  • Patent number: 11755652
    Abstract: An information-processing device includes an acquisition unit that acquires input data corresponding to spoken words input to a user terminal, and response data output from a dialog processing device that performs processing according to the input data. A generation unit generates guidance information on how to use the dialog processing device based on the input data and the response data. An output unit outputs the guidance information generated by the generation unit to the user terminal.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: September 12, 2023
    Assignee: NTT DOCOMO, INC.
    Inventors: Kousuke Kadono, Yuuki Saitou, Youhei Oono, Yuichiro Segawa
  • Patent number: 11749285
    Abstract: This disclosure describes transcribing speech using audio, image, and other data. A system is described that includes an audio capture system configured to capture audio data associated with a plurality of speakers, an image capture system configured to capture images of one or more of the plurality of speakers, and a speech processing engine. The speech processing engine may be configured to recognize a plurality of speech segments in the audio data, identify, for each speech segment of the plurality of speech segments and based on the images, a speaker associated with the speech segment, transcribe each of the plurality of speech segments to produce a transcription of the plurality of speech segments including, for each speech segment in the plurality of speech segments, an indication of the speaker associated with the speech segment, and analyze the transcription to produce additional data derived from the transcription.
    Type: Grant
    Filed: January 14, 2022
    Date of Patent: September 5, 2023
    Assignee: META PLATFORMS TECHNOLOGIES, LLC
    Inventors: Vincent Charles Cheung, Chengxuan Bai, Yating Sheng
  • Patent number: 11748423
    Abstract: A method is disclosed in which one or more pieces of user information are obtained. The one or more pieces of user information are indicative of at least one attribute of a user and/or include at least one piece of information associated with the user. In the method, one or more pieces of user probability information are determined based on the one or more pieces of user information. The one or more pieces of user probability information are indicative of a probability that the one or more pieces of user information are linked to the user. One user identity is determined based on the one or more pieces of user probability information. A corresponding apparatus, computer program and system are further disclosed.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: September 5, 2023
    Assignee: HUMADA HOLDINGS INC.
    Inventors: Andreas Berger, Hans-Martin Hellebrand
  • Patent number: 11749284
    Abstract: Implementations are directed to dynamically adapting which assistant on-device model(s) are locally stored at assistant devices of an assistant device group and/or dynamically adapting the assistant processing role(s) of the assistant device(s) of the assistant device group. In some of those implementations, the corresponding on-device model(s) and/or corresponding processing role(s), for each of the assistant devices of the group, is determined based on collectively considering individual processing capabilities of the assistant devices of the group. Implementations are additionally or alternatively directed to cooperatively utilizing assistant devices of a group, and their associated post-adaptation on-device model(s) and/or post-adaptation processing role(s), in cooperatively processing assistant requests that are directed to any one of the assistant devices of the group.
    Type: Grant
    Filed: November 13, 2020
    Date of Patent: September 5, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11749257
    Abstract: A method for evaluating a speech forced alignment model, an electronic device, and a storage medium are provided. The method includes: according to each audio segment in a test set and a text corresponding to each audio segment, acquiring, by using a speech forced alignment model to be evaluated, a phoneme sequence corresponding to each audio segment and a predicted start time and a predicted end time of each phoneme in the phoneme sequence; for each phoneme, obtaining a time accuracy score of the phoneme according to the predicted start time and the predicted end time of the phoneme and a predetermined reference start time and a predetermined reference end time of the phoneme; and determining a time accuracy score of said speech forced alignment model according to the time accuracy score of each phoneme.
    Type: Grant
    Filed: March 6, 2023
    Date of Patent: September 5, 2023
    Assignee: BEIJING CENTURY TAL EDUCATION TECHNOLOGY CO., LTD.
    Inventors: Lizhao Guo, Song Yang, Junfeng Yuan
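    Illustrative note: the abstract above scores each phoneme from its predicted and reference start and end times, then derives a model-level score from the per-phoneme scores. The sketch below is one possible reading under stated assumptions: the per-phoneme score is taken to be the overlap of the two time intervals divided by their combined span, and the model score is taken to be the mean. Neither formula is given in the abstract, and the function names are hypothetical.

```python
def phoneme_time_score(pred_start, pred_end, ref_start, ref_end):
    """Overlap of predicted and reference intervals divided by their combined span.

    Returns 1.0 for a perfect match and 0.0 when the intervals do not overlap.
    (Assumed metric; the abstract does not specify the formula.)
    """
    overlap = max(0.0, min(pred_end, ref_end) - max(pred_start, ref_start))
    span = max(pred_end, ref_end) - min(pred_start, ref_start)
    return overlap / span if span > 0 else 0.0


def model_time_score(phonemes):
    """Average the per-phoneme scores over a test set.

    phonemes: iterable of (pred_start, pred_end, ref_start, ref_end) tuples.
    (Averaging is an assumption; other aggregations are possible.)
    """
    scores = [phoneme_time_score(*p) for p in phonemes]
    return sum(scores) / len(scores) if scores else 0.0
```

    For example, a phoneme predicted at 0.0 to 1.0 s against a reference of 0.5 to 1.5 s overlaps for 0.5 s across a 1.5 s span, giving a score of one third.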
  • Patent number: 11741951
    Abstract: One embodiment provides a method, including: detecting, at an input device of an information handling device, voice input; determining, using a processor, whether the voice input corresponds to a voice command; identifying, responsive to determining that the voice input corresponds to a voice command, that the voice command is associated with an enabled voice command; determining, using a processor, whether a characteristic of the voice command corresponds to a predetermined input characteristic; and performing, responsive to identifying that the voice command is associated with the enabled voice command and responsive to determining that the characteristic corresponds to the predetermined input characteristic, an action corresponding to the enabled voice command. Other aspects are described and claimed.
    Type: Grant
    Filed: February 22, 2019
    Date of Patent: August 29, 2023
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: John Weldon Nicholson, Daryl Cromer, Howard Locker
  • Patent number: 11741986
    Abstract: A method includes obtaining, by an electronic device, an audio segment comprising one or more audio events of a target subject. The method also includes extracting, by the electronic device, audio embeddings from the one or more audio events using an embedding model, the embedding model comprising a trained machine learning model. The method further includes comparing, by the electronic device, the extracted audio embeddings with a match profile of the target subject, the match profile generated during an enrollment stage. The method also includes generating, by the electronic device, a label for the audio segment based on whether or not the extracted audio embeddings match the match profile, wherein the label enables correlation of the audio segment with the target subject for monitoring a health condition of the target subject.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: August 29, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Korosh Vatanparvar, Tousif Ahmed, Viswam Nathan, Ebrahim Nematihosseinabadi, Md Mahbubur Rahman, Jilong Kuang, Jun Gao
  • Patent number: 11741303
    Abstract: Introduced here are computer programs and associated computer-implemented techniques for discovering the presence of filler words through tokenization of a transcript derived from audio content. When audio content is obtained by a media production platform, the audio content can be converted into text content as part of a speech-to-text operation. The text content can then be tokenized and labeled using a Natural Language Processing (NLP) library. Tokenizing/labeling may be performed in accordance with a series of rules associated with filler words. At a high level, these rules may examine the text content (and associated tokens/labels) to determine whether patterns, relationships, verbatim, and context indicate that a term is a filler word. Any filler words that are discovered in the text content can be identified as such so that appropriate action(s) can be taken.
    Type: Grant
    Filed: November 10, 2020
    Date of Patent: August 29, 2023
    Assignee: Descript, Inc.
    Inventors: Alexandre de Brébisson, Antoine d'Andigné
  • Patent number: 11727927
    Abstract: Embodiments of the present disclosure disclose a view-based voice interaction method, an apparatus, a server, a terminal and a medium. The method includes: obtaining voice information of a user and voice-action description information of a voice-operable element in a currently displayed view on a terminal; obtaining operational intention of the user by performing semantic recognition on the voice information of the user according to view description information of the voice-operable element; locating a sequence of actions matched with the operational intention of the user in the voice-action list according to the voice-action description information; and delivering the sequence of actions to the terminal for performing.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: August 15, 2023
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhou Shen, Dai Tan, Sheng Lv, Kaifang Wu, Yudong Li
  • Patent number: 11724403
    Abstract: A system, method and computer-readable storage devices are provided for processing natural language commands, such as commands to a robotic arm, using a Tag & Parse approach to semantic parsing. The system first assigns semantic tags to each word in a sentence and then parses the tag sequence into a semantic tree. The system can use a statistical approach for tagging, parsing, and reference resolution. Each stage can produce multiple hypotheses, which are re-ranked using spatial validation. Then the system selects a most likely hypothesis after spatial validation, and generates or outputs a command. In the case of a robotic arm, the command is output in Robot Control Language (RCL).
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: August 15, 2023
    Assignees: HYUNDAI MOTOR COMPANY, KIA CORPORATION
    Inventors: Svetlana Stoyanchev, Srinivas Bangalore, John Chen, Hyuckchul Jung
  • Patent number: 11721344
    Abstract: A system and method are disclosed for generating a teleconference space for two or more communication devices using a computer coupled with a database and comprising a processor and memory. The computer generates a teleconference space and transmits requests to join the teleconference space to the two or more communication devices. The computer stores in memory identification information, and audiovisual data associated with one or more users, for each of the two or more communication devices. The computer stores audio transcription data, transmitted to the computer by each of the two or more communication devices and associated with one or more communication device users, in the computer memory. The computer merges the audio transcription data from each of the two or more communication devices into a master audio transcript, and transmits the master audio transcript to each of the two or more communication devices.
    Type: Grant
    Filed: January 25, 2023
    Date of Patent: August 8, 2023
    Assignee: Nextiva, Inc.
    Inventors: Tomas Gorny, Jean-Baptiste Martinoli, Tracy Conrad, Lukas Gorny
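    Illustrative note: the merge step in the abstract above combines per-device transcription data into one master transcript. A minimal sketch follows, assuming each transcript segment carries a start timestamp and that each device's segments are already in time order. The `Segment` type, its fields, and `merge_transcripts` are hypothetical, and the abstract does not state how segments are ordered.

```python
import heapq
from typing import NamedTuple


class Segment(NamedTuple):
    start: float   # seconds from the start of the conference (assumed field)
    speaker: str   # device or user identifier (assumed field)
    text: str      # transcribed words


def merge_transcripts(per_device):
    """Merge time-sorted per-device segment lists into one master transcript.

    per_device: list of lists of Segment, each inner list sorted by start time.
    """
    return list(heapq.merge(*per_device, key=lambda seg: seg.start))
```

    `heapq.merge` interleaves the sorted inputs without re-sorting the whole set; if a device could report segments out of order, sorting each list first would be the safer choice.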
  • Patent number: 11714833
    Abstract: Exemplary embodiments pertain to new techniques for classifying or labeling organized data. A major impediment to implementing high-quality machine learning is the lack of readily accessible labeled data. In some cases, data can be classified using a classifier, but these solutions can be inaccurate and slow. Exemplary embodiments address the problem of obtaining accurate labeled data in a timely manner by applying a classifier configured to operate on character-level embeddings. Among other advantages, this can help the classifier to recognize information contained within a data unit, such as a cell of a table. The classifier may operate within the organizational structure of the data, such as by operating across a particular row or column of a table. Because data within a particular row or column is often temporally organized (e.g., transactions that are logged in chronological order), row- or column-based approaches can yield more accurate results.
    Type: Grant
    Filed: September 10, 2021
    Date of Patent: August 1, 2023
    Assignee: Capital One Services, LLC
    Inventors: Jeremy Edward Goodsitt, Austin Grant Walters, Anh Truong
  • Patent number: 11710486
    Abstract: A virtual environment platform may receive, from a user device, a request to access a virtual reality (VR) environment and may verify, based on the request, a user of the user device to allow the user device access to the VR environment. The virtual environment platform may receive, after verifying the user of the user device, user voice input and user handwritten input from the user device. The virtual environment platform may generate processed user speech by processing the user voice input, wherein a characteristic of the processed user speech and a corresponding characteristic of the user voice input are different and may generate formatted user text by processing the user handwritten input, wherein the formatted user text is machine-encoded text. The virtual environment platform may cause the processed user speech to be audibly presented and the formatted user text to be visually presented in the VR environment.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: July 25, 2023
    Assignee: Capital One Services, LLC
    Inventors: Austin Walters, Jeremy Goodsitt, Fardin Abdi Taghi Abad, Vincent Pham, Kenneth Taylor
  • Patent number: 11705127
    Abstract: Coordinating signal processing among computing devices in a voice-driven computing environment is provided. A first and second digital assistant can detect an input audio signal, perform a signal quality check, and provide indications that the first and second digital assistants are operational to process the input audio signal. A system can select the first digital assistant for further processing. The system can receive, from the first digital assistant, data packets including a command. The system can generate, for a network connected device selected from a plurality of network connected devices, an action data structure based on the data packets, and transmit the action data structure to the selected network connected device.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: July 18, 2023
    Assignee: GOOGLE LLC
    Inventors: Anshul Kothari, Gaurav Bhaya, Tarun Jain
  • Patent number: 11700335
    Abstract: A system may provide for the generation of spatial audio for audiovisual conferences, video conferences, etc. (referred to herein simply as “conferences”). Spatial audio may include audio encoding and/or decoding techniques in which a sound source may be specified at a location, such as on a two-dimensional plane and/or within a three-dimensional field, and/or in which a direction or target for a given sound source may be specified. A conference participant's position within a conference user interface (“UI”) may be set as the source of sound associated with the conference participant, such that different conference participants may be associated with different sound source positions within the conference UI.
    Type: Grant
    Filed: September 7, 2021
    Date of Patent: July 11, 2023
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: Pierre Seigneurbieux