Recognition Patents (Class 704/231)
  • Patent number: 10741181
    Abstract: Speech recognition is performed on a received utterance to determine a plurality of candidate text representations of the utterance, including a primary text representation and one or more alternative text representations. Natural language processing is performed on the primary text representation to determine a plurality of candidate actionable intents, including a primary actionable intent and one or more alternative actionable intents. A result is determined based on the primary actionable intent. The result is provided to the user. A recognition correction trigger is detected. In response to detecting the recognition correction trigger, a set of alternative intent affordances and a set of alternative text affordances are concurrently displayed.
    Type: Grant
    Filed: May 14, 2019
    Date of Patent: August 11, 2020
    Assignee: Apple Inc.
    Inventors: Ashish Garg, Harry J. Saddler, Shweta Grampurohit, Robert A. Walker, Rushin N. Shah, Matthew S. Seigel, Matthias Paulik
  • Patent number: 10740370
    Abstract: Embodiments of the present invention provide a system for implementing multi-turn dialogs. The system performs a method that includes receiving a series of user utterances, generating a series of responsive system utterances, and labeling the series of responsive system utterances to generate training data for training a dialog management policy. The labeling includes executing a reward function at each turn of a dialog, in which for each turn of the dialog the reward function is configured to output a reward value that is based at least in part on the accuracy of the responsive system utterance of the turn and on the number of dialog turns elapsed.
    Type: Grant
    Filed: May 24, 2019
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Murray S. Campbell, Miao Liu, Biplav Srivastava
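
The per-turn labeling above can be sketched as a reward that rises with response accuracy and falls with elapsed turns. This is a minimal illustration; the linear form and the penalty weight are assumptions, not the patent's actual reward function.

```python
def turn_reward(accuracy: float, turns_elapsed: int, turn_penalty: float = 0.25) -> float:
    """Label a dialog turn for policy training: accuracy credit minus a
    penalty that grows with the number of dialog turns elapsed so far."""
    return accuracy - turn_penalty * turns_elapsed
```

A perfectly accurate response on the first turn earns the full reward, while the same response after several turns is discounted, pushing the trained policy toward shorter dialogs.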
  • Patent number: 10741170
    Abstract: A speech recognition method comprises: generating, based on a preset speech knowledge source, a search space that comprises preset client information and is used for decoding a speech signal; extracting a characteristic vector sequence of a to-be-recognized speech signal; calculating a probability that each characteristic vector corresponds to each basic unit of the search space; and executing a decoding operation in the search space by using the probability as an input to obtain a word sequence corresponding to the characteristic vector sequence.
    Type: Grant
    Filed: May 3, 2018
    Date of Patent: August 11, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Xiaohui Li, Hongyan Li
  • Patent number: 10733975
    Abstract: An out-of-service (OOS) sentence generating method includes: training models based on a target utterance template of a target service and a target sentence generated from the target utterance template; generating a similar utterance template that is similar to the target utterance template based on a trained model, among the trained models, and a sentence generated from an utterance template of another service; and generating a similar sentence that is similar to the target sentence based on another trained model, among the trained models, and the similar utterance template.
    Type: Grant
    Filed: March 6, 2018
    Date of Patent: August 4, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Young-Seok Kim, Sang Hyun Yoo, Jehun Jeon, Junhwi Choi
  • Patent number: 10733497
    Abstract: A system and method determine a classification by simulating a human user. The system and method translate an input segment such as speech into an output segment such as text and represent the frequency of words and phrases in the textual segment as an input vector. The system and method process the input vector and generate a plurality of intents and a plurality of sub-entities. The processing of the multiple intents and sub-entities generates a second plurality of intents and sub-entities that represent a species classification. The system and method select an instance of an evolutionary model as a result of the recognition of one or more predefined semantically relevant words and phrases detected in the input vector.
    Type: Grant
    Filed: June 25, 2019
    Date of Patent: August 4, 2020
    Assignee: PROGRESSIVE CASUALTY INSURANCE COMPANY
    Inventors: Craig S. Sesnowitz, Geoffrey S. McCormack, Rama Rao Panguluri, Robert R. Wagner
  • Patent number: 10725733
    Abstract: There is provided an information processing apparatus to further mitigate the effect of a delay associated with recognizing a gesture, the information processing apparatus including: an acquisition unit that acquires, on a basis of first input information corresponding to a detection result of a gesture, a prediction result of an operation corresponding to the gesture that is input subsequently; and a control unit that controls a process related to an acquisition of second input information associated with the first input information, in accordance with the prediction result of the operation.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: July 28, 2020
    Assignee: SONY CORPORATION
    Inventors: Yusuke Nakagawa, Shinichi Kawano
  • Patent number: 10720148
    Abstract: A method for a dialogue system includes establishing a dialogue session between an application executing on a server and a remote machine. The dialogue session includes one or more utterances received from a user at the remote machine. A natural language processing machine identifies a request associated with a computer-readable representation of an utterance. A dialogue expansion machine generates a plurality of alternative actions for responding to the request. A previously-trained machine learning confidence model assesses a confidence score for each alternative. If a highest confidence score for a top alternative does not satisfy a threshold, the plurality of alternatives including the top alternative are transmitted to a remote machine (which may be the same remote machine or a different remote machine) for review by a human reviewer. After the dialogue system and/or the human reviewer select an alternative, computer-readable instructions defining the selected alternative are executed.
    Type: Grant
    Filed: July 16, 2018
    Date of Patent: July 21, 2020
    Assignee: Semantic Machines, Inc.
    Inventors: Jesse Daniel Eskes Rusak, David Leo Wright Hall, Jason Andrew Wolfe, Daniel Lawrence Roth, Daniel Klein, Jordan Rian Cohen
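
The confidence-gated escalation flow above can be sketched as: execute the top alternative when its model confidence meets a threshold, otherwise send the ranked alternatives to a human reviewer. The function name, tuple shapes, and threshold value are illustrative assumptions.

```python
def route_alternatives(alternatives, threshold=0.8):
    """alternatives: list of (action, confidence) pairs from the confidence model.
    Returns ("execute", action) or ("review", [actions ranked by confidence])."""
    ranked = sorted(alternatives, key=lambda pair: pair[1], reverse=True)
    top_action, top_confidence = ranked[0]
    if top_confidence >= threshold:
        return ("execute", top_action)
    # Low confidence: escalate every alternative, top one included.
    return ("review", [action for action, _ in ranked])
```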
  • Patent number: 10714144
    Abstract: Systems and methods for tagging video content are disclosed. A method includes: receiving a video stream from a user computer device, the video stream including audio data and video data; determining a candidate audio tag based on analyzing the audio data; establishing an audio confidence score of the candidate audio tag based on the analyzing of the audio data; determining a candidate video tag based on analyzing the video data; establishing a video confidence score of the candidate video tag based on the analyzing of the video data; determining a correlation factor of the candidate audio tag relative to the candidate video tag; and assigning a tag to a portion in the video stream based on the correlation factor exceeding a correlation threshold value and at least one of the audio confidence score exceeding an audio threshold value and the video confidence score exceeding a video threshold value.
    Type: Grant
    Filed: November 6, 2017
    Date of Patent: July 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mark P. Delaney, Robert H. Grant, Trudy L. Hewitt, Martin A. Oberhofer
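
The tag-assignment condition described above combines the cross-modal correlation factor with the per-modality confidence scores. A minimal sketch, with illustrative threshold values:

```python
def should_assign_tag(audio_confidence, video_confidence, correlation,
                      correlation_threshold=0.7,
                      audio_threshold=0.6,
                      video_threshold=0.6):
    """Assign the tag only when the audio/video tags correlate strongly enough
    AND at least one modality is individually confident."""
    return (correlation > correlation_threshold
            and (audio_confidence > audio_threshold
                 or video_confidence > video_threshold))
```

Note that high confidence in both modalities is not sufficient on its own: if the two candidate tags do not correlate (e.g. "dog barking" audio against "city skyline" video), no tag is assigned.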
  • Patent number: 10714096
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on that evaluation, and providing a representation of the hotword suitability score for display to the user.
    Type: Grant
    Filed: May 16, 2018
    Date of Patent: July 14, 2020
    Assignee: Google LLC
    Inventors: Andrew Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
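
Scoring a candidate hotword against predetermined criteria can be sketched by expressing each criterion as a predicate over the transcription. The fraction-passed scoring and the example criteria are assumptions, not the patent's actual criteria.

```python
def hotword_suitability(transcription, criteria):
    """Score in [0, 1]: the fraction of predetermined criteria the
    candidate hotword passes."""
    passed = sum(1 for criterion in criteria if criterion(transcription))
    return passed / len(criteria)
```

In practice the criteria might check properties such as length, syllable count, or rarity in everyday speech, so that easily confusable hotwords score low before deployment.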
  • Patent number: 10715214
    Abstract: Methods and apparatus to monitor a media presentation are disclosed. An example system includes a monitoring device to monitor a media presentation and generate research data identifying the media. A bridge device includes a housing dimensioned to receive the monitoring device, a receiver carried by the housing to receive a first audio signal via a wireless data connection from an audio source device using a wireless communication protocol, the first audio signal associated with the media. The bridge device includes an audio emitter to emit the audio signal for receipt by the monitoring device, and a transmitter to transmit the audio signal to an audio receiver device using the wireless communication protocol.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: July 14, 2020
    Assignee: The Nielsen Company (US), LLC
    Inventors: William K. Krug, James Zhang
  • Patent number: 10706874
    Abstract: An audio signal is obtained by a user terminal. The audio signal is divided into a plurality of short-time energy frames based on a frequency of a predetermined voice signal. Energy of each short-time energy frame is determined. Based on the energy of each short-time energy frame, whether the audio signal includes a voice signal is determined.
    Type: Grant
    Filed: April 10, 2019
    Date of Patent: July 7, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Lei Jiao, Yanchu Guan, Xiaodong Zeng, Feng Lin
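
The short-time energy check above is a classic voice-activity detection scheme and can be sketched directly. Frame length, threshold, and the minimum-voiced-frames rule are assumptions; the abstract derives the frame division from the frequency of a predetermined voice signal.

```python
def frame_signal(samples, frame_len):
    """Split the audio sample sequence into fixed-length short-time frames."""
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]

def frame_energy(frame):
    """Short-time energy: sum of squared sample amplitudes in the frame."""
    return sum(s * s for s in frame)

def contains_voice(samples, frame_len=160, energy_threshold=0.5, min_voiced_frames=3):
    """Declare that the audio includes a voice signal when enough frames
    exceed the energy threshold."""
    voiced = sum(1 for f in frame_signal(samples, frame_len)
                 if frame_energy(f) >= energy_threshold)
    return voiced >= min_voiced_frames
```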
  • Patent number: 10685358
    Abstract: Consistent with the disclosed embodiments, systems and methods are provided herein for autonomously generating a thoughtful gesture for a customer. In one example implementation of the disclosed technology, a method is provided that includes receiving incoming customer dialogue and determining, based on the customer dialogue, customer information including one or more of: customer preferences, customer biographical information, and customer current life circumstances. The method also includes generating, based on the customer information, gesture-specific information-eliciting utterances for additional dialogue with the customer and identifying one or more response opportunities based on additional incoming customer dialogue responsive to sending the gesture-specific information-eliciting utterances to the customer. Further, the method includes generating a thoughtful gesture based on the identified one or more response opportunities and outputting, for presentation to the customer, the thoughtful gesture.
    Type: Grant
    Filed: June 4, 2018
    Date of Patent: June 16, 2020
    Assignee: CAPITAL ONE SERVICES, LLC
    Inventors: Alexandra Coman, Erik T. Mueller, Margaret Mayer
  • Patent number: 10677881
    Abstract: The present invention provides an improved method of pedestrian dead reckoning (PDR) and device for performing PDR. An image of the environment to be traversed by the pedestrian is used to define states associated with the PDR environment. The image is utilized to constrain error in the estimated location of the pedestrian. Pairs of states are identified and the characteristics of the defined states are utilized to define the probabilities of transitioning between states of the pair. Possible pedestrian events which may be observed are also defined and for each pair of states, the possibility of detecting an event given the state transition is determined. After detecting a series of events, the transition probabilities and the event probabilities are utilized to determine a state probability that the pedestrian is in a particular state at each particular time. Utilizing these state probabilities an estimated location is provided at each time.
    Type: Grant
    Filed: March 29, 2016
    Date of Patent: June 9, 2020
    Inventor: Aaron D. Magid
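
The state-probability update above resembles a hidden-Markov-model forward step: map-derived transition probabilities combine with the likelihood of the detected pedestrian event, then normalize. All data structures and names here are assumptions for illustration.

```python
def update_state_probs(state_probs, transition, event_likelihood):
    """One update step. state_probs: {state: P(state)};
    transition: {prev_state: {next_state: P}};
    event_likelihood: {state: P(detected event | state)}.
    Returns the normalized posterior over states."""
    unnormalized = {
        s: event_likelihood[s] * sum(state_probs[p] * transition[p][s]
                                     for p in state_probs)
        for s in state_probs
    }
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}
```

Repeating this update over the series of detected events yields, at each time, the state probability used to estimate the pedestrian's location.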
  • Patent number: 10679625
    Abstract: A system and method for temporarily disabling keyword detection to avoid detection of machine-generated keywords. A local device may operate two keyword detectors. The first keyword detector operates on input audio data received by a microphone to capture keywords uttered by a user. In these instances, the keyword may be detected by the first detector and the audio data may be transmitted to a remote device for processing. The remote device may generate output audio data to be sent to the local device. The local device may process the output audio data to determine that it also includes the keyword. The device may then disable the first keyword detector while the output audio data is played back by an audio speaker of the local device. Thus the local device may avoid detection of a keyword originating from the output audio. The first keyword detector may be reactivated after a time interval during which the keyword might be detectable in the output audio.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: June 9, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Christopher Wayne Lockhart, Matthew Joseph Cole, Xulei Liu
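
The temporary-disable logic above can be sketched as a small gate: when output audio known to contain the keyword is about to play, the first detector is switched off and re-enabled only after the playback interval. The timing API is an assumption.

```python
class KeywordGate:
    """Suppress keyword triggers while the device's own speaker could
    replay the keyword."""

    def __init__(self):
        self.enabled = True
        self.reenable_at = 0.0

    def on_playback_with_keyword(self, now: float, playback_seconds: float) -> None:
        """Disable detection for the window in which the keyword may occur
        in the output audio."""
        self.enabled = False
        self.reenable_at = now + playback_seconds

    def should_trigger(self, now: float, keyword_heard: bool) -> bool:
        """Re-enable once the interval has elapsed, then report triggers."""
        if not self.enabled and now >= self.reenable_at:
            self.enabled = True
        return self.enabled and keyword_heard
```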
  • Patent number: 10678828
    Abstract: A neural network-based classifier system can receive a query including a media signal and, in response, provide an indication that the query corresponds to a specified media type or media class. The neural network-based classifier system can select and apply various models to facilitate media classification. In an example embodiment, a query can be analyzed for various characteristics, such as a noise profile, before it is input to the network-based classifier. If the query has greater than a specified threshold noise characteristic, then a successful classification can be unlikely and a classification process based on the query can be terminated before computational resources are expended. Query signals that meet or exceed a threshold condition can be provided to the network-based classifier for media classification. In an example embodiment, a remote device or a central media classifier circuit can determine a noise profile for a query.
    Type: Grant
    Filed: June 17, 2016
    Date of Patent: June 9, 2020
    Assignee: GRACENOTE, INC.
    Inventors: Jason Cramer, Markus K. Cremer, Phillip Popp, Cameron Aubrey Summers
  • Patent number: 10678499
    Abstract: An audio interface device, comprising: an interface unit and a wireless unit, wherein the interface unit is configured to relay a first audio signal transmitted between a microphone and a communication device and a second audio signal transmitted between the communication device and a speaker, and to route first audio data related to the first audio signal and second audio data related to the second audio signal to the wireless unit; and wherein the wireless unit is configured to transmit the first audio data and the second audio data to a remote audio device.
    Type: Grant
    Filed: January 30, 2019
    Date of Patent: June 9, 2020
    Assignee: i2x GmbH
    Inventors: Claudio Martay, Michael Brehm, Evgenii Khamukhin
  • Patent number: 10679648
    Abstract: Various embodiments relate to detecting at least one of a conversation, the presence of others, and the identity of others during presentation of digital content on a computing device. When another person is detected, one or more actions may be taken with respect to the digital content. For example, the digital content may be minimized, moved, resized or otherwise modified.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: June 9, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Arthur Charles Tomlin, Dave Hill, Jonathan Paulovich, Evan Michael Keibler, Jason Scott, Cameron G. Brown, Thomas Forsythe, Jeffrey A. Kohler, Brian Murphy
  • Patent number: 10672414
    Abstract: Systems, methods, and computer-readable storage devices are disclosed for improved real-time audio processing. One method including: receiving audio data including a plurality of frames having a plurality of frequency bins; calculating, for each frequency bin, an approximate speech signal estimation based on the plurality of frames; calculating, for each approximate speech signal estimation, a clean speech estimation and at least one additional target including an ideal ratio mask using a trained neural network model; and calculating, for each frequency bin, a final clean speech estimation using the calculated at least one additional target including the calculated ideal ratio mask and the calculated clean speech estimation.
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: June 2, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ivan Jelev Tashev, Shuayb M Zarar, Yan-Hui Tu, Chin-Hui Lee, Han Zhao
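
The per-frequency-bin estimation above can be sketched in two steps: a ratio mask (values in [0, 1]) scales the noisy magnitudes, and the masked estimate is fused with the network's direct clean-speech estimate. The mask values and the linear fusion rule are stand-in assumptions for the trained model's outputs.

```python
def apply_ratio_mask(noisy_magnitudes, mask):
    """Clean-speech estimate per bin: noisy magnitude scaled by the mask gain."""
    return [m * g for m, g in zip(noisy_magnitudes, mask)]

def fuse_estimates(direct_estimate, masked_estimate, weight=0.5):
    """Final per-bin clean-speech estimate as a weighted blend of the
    direct estimate and the mask-based estimate."""
    return [weight * d + (1.0 - weight) * m
            for d, m in zip(direct_estimate, masked_estimate)]
```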
  • Patent number: 10672400
    Abstract: An electronic device is attachable to and detachable from an information processing device. The information processing device has a mode control portion that switches between an operation state, in which background processing is executed in a standby mode while the display of a display portion is stopped, and a low power consumption state from which the information processing device can be quickly returned to the operation state. The electronic device has a light emission portion disposed so as to be visibly recognized while the display of the display portion is stopped, and a lighting control portion that lights the light emission portion in the standby mode when a predetermined event to be notified to a user occurs in the background processing.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: June 2, 2020
    Assignee: LENOVO (SINGAPORE) PTE. LTD.
    Inventors: Yasumichi Tsukamoto, Seiji Yamasaki, Munefumi Nakata
  • Patent number: 10664522
    Abstract: One embodiment provides a method, including: utilizing at least one processor to execute computer code that performs the steps of: using an electronic device to engage in an interactive session between a user and a virtual assistant; receiving, at the electronic device, audio input from the user, wherein the audio input comprises a problem-solving query corresponding to a request by the user for assistance in solving a problem related to at least one object; parsing the audio input to identify at least one annotated video file corresponding to the at least one object and the problem-solving query; determining a state of the object and a location in the at least one annotated video file corresponding to the state of the object; and providing, to the user and based on the location in the at least one annotated video file, instructional output related to the problem-solving query.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: May 26, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sampath Dechu, Neelamadhav Gantayat, Pratyush Kumar, Senthil Kumar Kumarasamy Mani
  • Patent number: 10657956
    Abstract: There is provided an information processing device to further improve the operability of user interfaces that use a voice as an input, the information processing device including: an acquiring unit configured to acquire context information in a period related to collection of a voice; and a control unit configured to cause a predetermined output unit to output output information related to the collection of the voice in a mode corresponding to the acquired context information.
    Type: Grant
    Filed: March 28, 2017
    Date of Patent: May 19, 2020
    Assignee: SONY CORPORATION
    Inventors: Yusuke Nakagawa, Shinichi Kawano, Yuhei Taki, Ayumi Kato
  • Patent number: 10657969
    Abstract: An identity verification method and an identity verification apparatus based on a voiceprint are provided. The identity verification method based on a voiceprint includes: receiving an unknown voice; extracting a voiceprint of the unknown voice using a neural network-based voiceprint extractor which is obtained through pre-training; concatenating the extracted voiceprint with a pre-stored voiceprint to obtain a concatenated voiceprint; and performing judgment on the concatenated voiceprint using a pre-trained classification model, to verify whether the extracted voiceprint and the pre-stored voiceprint are from a same person. With the identity verification method and the identity verification apparatus, a holographic voiceprint of the speaker can be extracted from a short voice segment, such that the verification result is more robust.
    Type: Grant
    Filed: January 9, 2018
    Date of Patent: May 19, 2020
    Assignee: FUJITSU LIMITED
    Inventors: Ziqiang Shi, Liu Liu, Rujie Liu
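
The last two verification steps above can be sketched as: concatenate the two fixed-length voiceprint embeddings and hand the result to a pre-trained binary classifier. The classifier here is a caller-supplied stub, since the patent's model is a trained network not reproduced here.

```python
def concatenate_voiceprints(extracted, stored):
    """Classifier input: the extracted voiceprint followed by the enrolled one."""
    return list(extracted) + list(stored)

def same_speaker(concatenated, classify, threshold=0.5):
    """classify maps the concatenated vector to a same-person score in [0, 1];
    scores at or above the threshold verify the identity."""
    return classify(concatenated) >= threshold
```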
  • Patent number: 10657963
    Abstract: Provided is a user command processing method and system to provide and adjust an operation of a device by analyzing a presentation of a user speech. The user command processing method includes managing at least one pre-defined operation to be performed according to a user command, a plurality of options being preset in relation to each of the at least one pre-defined operation, receiving a user command at least including a voice input received from a user, selecting an operation corresponding to a keyword extracted from the voice input, determining at least one option corresponding to the extracted keyword among a plurality of options preset in relation to the selected operation, according to a presentation of the voice input, and performing the selected operation in association with the determined at least one option.
    Type: Grant
    Filed: May 2, 2018
    Date of Patent: May 19, 2020
    Assignees: NAVER Corporation, LINE Corporation
    Inventors: Seijin Cha, Eonjoung Choi
  • Patent number: 10657174
    Abstract: The present invention relates to providing identification information in response to an audio segment using a first mode of operation including receiving an audio segment and sending the audio segment to a remote server and receiving, from the remote server, identification information relating to the audio segment, and a second mode of operation of receiving an audio segment and using stored information to obtain identification information relating to the received audio segment, without sending the audio segment to the remote server. The present invention further includes using identification information from the remote server and using local identification information and selecting either identification information from the remote server or local identification information based on selection criteria, and generating an output based on the selected identification information.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: May 19, 2020
    Assignee: SoundHound, Inc.
    Inventors: Aaron Master, Bernard Mont-Reynaud, Keyvan Mohajer, Timothy Stonehocker
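
The selection between remote and local identification results can be sketched with each result as an (identification, confidence) pair. Preferring a sufficiently confident remote answer is an illustrative policy, not the patent's exact selection criteria.

```python
def select_identification(remote, local, remote_confidence_floor=0.6):
    """remote/local: (identification, confidence) or None.
    Return the chosen identification, or None if neither mode produced one."""
    if remote is not None and remote[1] >= remote_confidence_floor:
        return remote[0]
    if local is not None:
        return local[0]
    return None
```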
  • Patent number: 10650811
    Abstract: Disclosed in various examples are methods, systems, and machine-readable mediums for providing improved computer implemented speech recognition by detecting and correcting speech recognition errors during a speech session. The system recognizes repeated speech commands from a user in a speech session that are similar or identical to each other. To correct these repeated errors, the system creates a customized language model that is then utilized by the language modeler to produce a refined prediction of the meaning of the repeated speech commands. The custom language model may comprise clusters of similar past predictions of speech commands from the speech session of the user.
    Type: Grant
    Filed: March 13, 2018
    Date of Patent: May 12, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Meryem Pinar Donmez Ediz, Ranjitha Gurunath Kulkarni, Shuangyu Chang, Nitin Kamra
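
The trigger for building the custom language model is the detection of repeated, similar commands within a speech session. A minimal sketch using string similarity as the repeat signal; `difflib` and the threshold value are assumptions standing in for the system's actual similarity measure.

```python
from difflib import SequenceMatcher

def is_repeated_command(session_history, new_command, threshold=0.85):
    """True when the new command closely matches a recent one, suggesting
    the user is repeating a misrecognized request."""
    return any(
        SequenceMatcher(None, past, new_command).ratio() >= threshold
        for past in session_history
    )
```

On a positive result, the system would cluster the similar past predictions into a customized language model and re-decode the repeated command against it.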
  • Patent number: 10650802
    Abstract: A voice recognition method is provided that includes extracting a first speech from the sound collected with a microphone connected to a voice processing device, and calculating a recognition result for the first speech and the confidence level of the first speech. The method also includes performing a speech for a repetition request based on the calculated confidence level of the first speech, and extracting with the microphone a second speech obtained through the repetition request. The method further includes calculating a recognition result for the second speech and the confidence level of the second speech, and generating a recognition result from the recognition result for the first speech and the recognition result for the second speech, based on the calculated confidence level of the second speech.
    Type: Grant
    Filed: June 27, 2018
    Date of Patent: May 12, 2020
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Yuji Kunitake, Yusaku Ota
  • Patent number: 10643611
    Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.
    Type: Grant
    Filed: March 28, 2018
    Date of Patent: May 5, 2020
    Assignee: Apple Inc.
    Inventor: Aram M. Lindahl
  • Patent number: 10636425
    Abstract: Among other things, requests are received from voice assistant devices expressed in accordance with different corresponding protocols of one or more voice assistant frameworks. Each of the requests represents a voiced input by a user to the corresponding voice assistant device. The received requests are re-expressed in accordance with a common request protocol. Based on the received requests, responses to the requests are expressed in accordance with a common response protocol. Each of the responses is re-expressed according to a protocol of the framework with respect to which the corresponding request was expressed. The responses are sent to the voice assistant devices for presentation to the users.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: April 28, 2020
    Assignee: Voicify, LLC
    Inventors: Robert T. Naughton, Nicholas G. Laidlaw, Alexander M. Dunn, Jeffrey K. McMahon
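
Re-expressing framework-specific requests in a common protocol can be sketched as a per-framework field map applied in both directions. The field names are hypothetical; real voice-assistant payloads differ.

```python
def to_common_request(framework_request, field_map):
    """Re-express a framework-specific request in the common request protocol.
    field_map: {common_field: framework_specific_field}."""
    return {common: framework_request[specific]
            for common, specific in field_map.items()}

def to_framework_response(common_response, field_map):
    """Invert the mapping: re-express a common-protocol response according to
    the protocol of the framework the request arrived on."""
    return {specific: common_response[common]
            for common, specific in field_map.items()}
```

The same common-protocol handler can then serve every framework, with only the field maps differing per voice assistant.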
  • Patent number: 10635701
    Abstract: A neural network-based classifier system can receive a query including a media signal and, in response, provide an indication that the query corresponds to a specified media type or media class. The neural network-based classifier system can select and apply various models to facilitate media classification. In an example embodiment, a query can be analyzed for various characteristics, such as a noise profile, before it is input to the network-based classifier. If the query has greater than a specified threshold noise characteristic, then a successful classification can be unlikely and a classification process based on the query can be terminated before computational resources are expended. Query signals that meet or exceed a threshold condition can be provided to the network-based classifier for media classification. In an example embodiment, a remote device or a central media classifier circuit can determine a noise profile for a query.
    Type: Grant
    Filed: June 17, 2016
    Date of Patent: April 28, 2020
    Assignee: GRACENOTE, INC.
    Inventors: Jason Cramer, Markus K. Cremer, Phillip Popp, Cameron Aubrey Summers
  • Patent number: 10629187
    Abstract: Systems and methods are described herein for providing media guidance. Control circuitry may receive a first voice input and access a database of topics to identify a first topic associated with the first voice input. A user interface may generate a first response to the first voice input, and subsequent to generating the first response, the control circuitry may receive a second voice input. The control circuitry may determine a match between the second voice input and an interruption input such as a period of silence or a keyword or phrase such as "Ahh", "Umm", or "Hmm". The user interface may generate a second response that is associated with a second topic related to the first topic. By interrupting the conversation and changing the subject from time to time, media guidance systems can appear to be more intelligent and human.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: April 21, 2020
    Assignee: Rovi Guides, Inc.
    Inventors: Charles Dawes, Walter R. Klappert
  • Patent number: 10628570
    Abstract: Described herein are methods and systems for secure communication of private audio data in a zero user interface computing environment. A server receives text generated from a first digital audio bitstream, the digital audio bitstream corresponding to speech captured by a zero user interface computing device from a user. The server analyzes the text to extract a set of keywords from the text. The server determines whether information responsive to the keywords comprises private data related to the user. If the information responsive to the set of keywords comprises private data: the server generates a text response to the set of keywords that includes the private data relating to the user, determines a personal audio playback device associated with the user, and transmits the generated text response to the personal audio playback device for playback as a second digital audio bitstream.
    Type: Grant
    Filed: May 15, 2018
    Date of Patent: April 21, 2020
    Assignee: FMR LLC
    Inventors: Michael Quinn, Adam Schouela, Adrian Ronayne, Emily Elwell, Aaron Montford
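
The privacy-aware routing decision above can be sketched as: if the information responsive to the extracted keywords is private, the response goes to the user's personal audio playback device rather than the shared zero-UI speaker. The simple set intersection is an assumption standing in for the server's private-data determination.

```python
def choose_playback_target(keywords, private_topics, personal_device, shared_device):
    """Return the device that should receive the synthesized audio response."""
    if set(keywords) & set(private_topics):
        return personal_device
    return shared_device
```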
  • Patent number: 10621391
    Abstract: Provided are a method and an apparatus for acquiring a semantic fragment of a query based on artificial intelligence, and a terminal. The method includes pre-processing the query and determining a first main word and a semantic fragment set included in the query; determining an association degree between each semantic fragment in the semantic fragment set and the first main word according to historical retrieval data; and filtering the semantic fragment set according to the association degree to determine an object semantic fragment set corresponding to the query.
    Type: Grant
    Filed: December 26, 2017
    Date of Patent: April 14, 2020
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventor: Yufang Wu
  • Patent number: 10616475
    Abstract: The present disclosure provides a photo-taking prompting method and apparatus, and a non-volatile computer storage medium. A user's image information is collected while the user frames a shot in the viewfinder, the user's face posture information is obtained from the image information, face posture information of a preset photo-taking template is compared with the user's face posture information, and the user is prompted to adjust the face posture according to the comparison result. The technical solutions provided by embodiments of the present disclosure can prompt the user to adjust face posture during framing, thereby providing guidance on the user's face posture and solving the prior-art problem of being unable to provide photo-taking guidance while the user frames the shot.
    Type: Grant
    Filed: November 13, 2015
    Date of Patent: April 7, 2020
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Fujian Wang, Fuguo Zhu, Errui Ding, Long Gong, Yafeng Deng
  • Patent number: 10614265
    Abstract: An apparatus for correcting a character string in a text according to an embodiment includes a first converter, a first output unit, a second converter, an estimation unit, and a second output unit. The first converter recognizes a first speech of a first speaker and converts the first speech to a first text. The first output unit outputs a first caption image indicating the first text. The second converter recognizes a second speech of a second speaker for correcting a character string to be corrected in the first text, and converts the second speech to a second text. The estimation unit estimates the character string to be corrected, based on text matching between the first text and the second text. The second output unit outputs a second caption image indicating that the character string to be corrected is to be replaced with the second text.
    Type: Grant
    Filed: December 21, 2016
    Date of Patent: April 7, 2020
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Taira Ashikawa, Masayuki Ashikawa
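The "text matching" estimation step above can be sketched with Python's `difflib`: slide over the first transcript and pick the span most similar to the correcting utterance. The span-length heuristic and function name are assumptions for illustration.

```python
import difflib

def estimate_target(first_text, correction):
    # Slide over first_text word spans the same length as the correction
    # and return the span most similar to it.
    words = first_text.split()
    n = len(correction.split())
    best, best_ratio = None, 0.0
    for i in range(len(words) - n + 1):
        span = " ".join(words[i:i + n])
        r = difflib.SequenceMatcher(None, span, correction).ratio()
        if r > best_ratio:
            best, best_ratio = span, r
    return best
```

For example, if the first transcript reads "meet at the stadium tonight" and the speaker's correction is "station", the misrecognized word "stadium" is selected as the string to replace.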
  • Patent number: 10607606
    Abstract: In one aspect, a first device includes at least one processor and storage accessible to the at least one processor. The storage bears instructions executable by the at least one processor to execute a digital assistant, receive input for the digital assistant to perform a task, determine the task indicated in the input, determine based on that task whether to use a second device for processing the input, and transmit at least a portion of the input to the second device. The instructions are also executable to, responsive to a determination not to use the second device for processing the input, execute the task at the first device using the digital assistant without receiving a response from the second device.
    Type: Grant
    Filed: June 19, 2017
    Date of Patent: March 31, 2020
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: John Weldon Nicholson, Daryl Cromer, Mir Farooq Ali, David Alexander Schwarz
  • Patent number: 10607631
    Abstract: The purpose of the present invention is to reduce leakage of an output signal between band-pass filters and in the time axis direction. A signal input part 1 vectorizes an input signal x(n). An extended impulse response matrix generation part 2 generates an extended impulse response matrix He in which impulse response vectors, whose elements are the impulse response sequences of the band-pass filters, are extended in the time axis direction. An expansion coefficient calculation part 3 calculates an expansion coefficient vector ŷ(n) using an input signal vector x(n) and the extended impulse response matrix He. A signal output part 4 outputs at least one of the expansion coefficients of ŷ(n) that corresponds to the center vector of the extended impulse response matrix He.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: March 31, 2020
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Suehiro Shimauchi, Kana Eguchi, Tsutomu Yabuuchi, Kazuhiro Yoshida, Osamu Mizuno
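A sketch of one plausible reading of the construction above: stack time-shifted copies of each band-pass filter's impulse response as columns of an extended matrix He, then recover expansion coefficients for an input frame by least squares. This is an interpretation for illustration, not the patent's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
L, n_bands, n_shifts = 16, 3, 5
# Toy band-pass filter impulse responses (one row per band).
impulse_responses = rng.standard_normal((n_bands, 4))

cols = []
for h in impulse_responses:
    for shift in range(n_shifts):
        # Each column is one impulse response shifted along the time axis.
        col = np.zeros(L)
        col[shift:shift + len(h)] = h
        cols.append(col)
He = np.stack(cols, axis=1)                      # L x (n_bands * n_shifts)

x = rng.standard_normal(L)                       # input signal frame
y_hat, *_ = np.linalg.lstsq(He, x, rcond=None)   # expansion coefficient vector
```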
  • Patent number: 10599766
    Abstract: A dimensionality analysis method, system, and computer program product include defining a grammar that describes an admissible relationship between quantities in a data set, discovering symbolic expressions that intrinsically account for a dimensionality analysis based on the grammar, conducting a search that determines which valid expressions in the data set satisfy the grammar, and selecting the expression that fits the data set while minimizing a measure of the expression's complexity.
    Type: Grant
    Filed: December 15, 2017
    Date of Patent: March 24, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Oki Gunawan, Lior Horesh, Giacomo Nannicini, Wang Zhou
  • Patent number: 10600408
    Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: March 24, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Andrew Smith, Christopher Schindler, Karthik Ramakrishnan, Rohit Prasad, Michael George, Rafal Kuklinski
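The conformity check described above reduces to comparing the persisted speech-quality indicator with the style of the skill's output and re-rendering on mismatch. A minimal sketch; the content/indicator shapes and `enforce_quality` name are hypothetical.

```python
def enforce_quality(content, quality):
    # content: e.g. {"text": "...", "style": "normal"}
    # quality: persisted indicator of the user's speech, e.g. "whispered".
    if content.get("style") == quality:
        # Already conforms: output without further manipulation.
        return content
    # Does not conform: manipulate (e.g. re-synthesize as a whisper).
    adjusted = dict(content)
    adjusted["style"] = quality
    return adjusted
```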
  • Patent number: 10593346
    Abstract: The present disclosure generally relates to processing speech or text using rank-reduced token representation. In one example process, speech input is received. A sequence of candidate words corresponding to the speech input is determined. The sequence of candidate words includes a current word and one or more previous words. A vector representation of the current word is determined from a set of trained parameters. A number of parameters in the set of trained parameters varies as a function of one or more linguistic characteristics of the current word. Using the vector representation of the current word, a probability of a next word given the current word and the one or more previous words is determined. A text representation of the speech input is displayed based on the determined probability.
    Type: Grant
    Filed: March 15, 2017
    Date of Patent: March 17, 2020
    Assignee: Apple Inc.
    Inventors: Christophe J. Van Gysel, Yi Su, Xiaochuan Niu, Ilya Oparin
  • Patent number: 10593347
    Abstract: A portable electronic device includes an audio input device and a processor. The processor is configured to obtain audio input data including a noise signal having an audio feature through the audio input device, to filter the audio input data using a neural network model to generate first audio output data, and to filter the first audio output data without using the neural network model to generate second audio output data. The first audio output data has a first changed audio feature corresponding to the audio feature and the second audio output data has a second changed audio feature corresponding to the audio feature.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: March 17, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Soon Ho Baek, Han Gil Moon, Ki Ho Cho, Gang Youl Kim, Jin Soo Park
  • Patent number: 10593326
    Abstract: Systems, methods, and devices for location-based, context-driven speech recognition are disclosed. A mobile or stationary computing device can include position-locating functionality for determining the particular physical location of the computing device. Once the physical location of the computing device is determined, a context related to that particular physical location can be identified. The context related to the particular physical location can include information regarding objects or experiences a user might encounter while in that particular physical location. The context can then be used to determine a delimited or constrained speech recognition vocabulary subset based on the range of experiences a user might encounter within that context.
    Type: Grant
    Filed: April 25, 2013
    Date of Patent: March 17, 2020
    Assignee: SENSORY, INCORPORATED
    Inventor: William Teasley
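The constrained-vocabulary idea above can be sketched as a lookup from location context to an allowed word set; the contexts and words here are invented examples, not from the patent.

```python
# Hypothetical mapping from location context to a vocabulary subset that
# the recognizer would be constrained to in that context.
VOCAB_BY_CONTEXT = {
    "museum": {"exhibit", "gallery", "hours", "tickets"},
    "airport": {"gate", "boarding", "departure", "baggage"},
}
DEFAULT_VOCAB = {"help", "directions"}

def active_vocabulary(context):
    # Unknown contexts fall back to a small general-purpose vocabulary.
    return VOCAB_BY_CONTEXT.get(context, set()) | DEFAULT_VOCAB
```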
  • Patent number: 10592784
    Abstract: A system and method to perform detection based on sensor fusion includes obtaining data from two or more sensors of different types. The method also includes extracting features from the data from the two or more sensors and processing the features to obtain a vector associated with each of the two or more sensors. The method further includes concatenating the two or more vectors obtained from the two or more sensors to obtain a fused vector, and performing the detection based on the fused vector.
    Type: Grant
    Filed: February 20, 2018
    Date of Patent: March 17, 2020
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Shuqing Zeng, Wei Tong, Yasen Hu, Mohannad Murad, David R. Petrucci, Gregg R. Kittinger
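The fusion step above is, at its core, a concatenation of per-sensor feature vectors; a NumPy sketch follows, with sensor names and dimensions chosen purely for illustration.

```python
import numpy as np

def fuse(feature_vectors):
    # Concatenate per-sensor feature vectors into one fused vector,
    # which a downstream detector would consume.
    return np.concatenate(feature_vectors)

camera = np.array([0.2, 0.9])        # e.g. features from a camera
radar = np.array([1.5, 0.1, 0.3])    # e.g. features from a radar
fused = fuse([camera, radar])
```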
  • Patent number: 10592767
    Abstract: Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, a scorer, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates bounding boxes for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects; each score reflects how well the corresponding identified object is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.
    Type: Grant
    Filed: January 29, 2018
    Date of Patent: March 17, 2020
    Assignee: salesforce.com, inc.
    Inventors: Alexander Richard Trott, Caiming Xiong, Richard Socher
  • Patent number: 10586531
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech recognition by generating a neural network output from an audio data input sequence, where the neural network output characterizes words spoken in the audio data input sequence. One of the methods includes, for each of the audio data inputs, providing a current audio data input sequence that comprises the audio data input and the audio data inputs preceding the audio data input in the audio data input sequence to a convolutional subnetwork comprising a plurality of dilated convolutional neural network layers, wherein the convolutional subnetwork is configured to, for each of the plurality of audio data inputs: receive the current audio data input sequence for the audio data input, and process the current audio data input sequence to generate an alternative representation for the audio data input.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: March 10, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals, Lasse Espeholt
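A toy illustration of the causal dilated convolution that the abstract's convolutional subnetwork is built from (pure NumPy; the kernel size and dilation schedule are examples, not the patented architecture).

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    # 1-D causal convolution: each output depends only on current and
    # past inputs, spaced `dilation` samples apart (left zero-padding).
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[i + pad - j * dilation] for j in range(k))
                     for i in range(len(x))])

# Stacking layers with dilations 1, 2, 4, ... grows the receptive field
# exponentially: with kernel size 2, dilations [1, 2, 4, 8] cover 16 samples.
```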
  • Patent number: 10573294
    Abstract: Embodiments of the present disclosure provide a speech recognition method based on artificial intelligence, and a terminal. The method includes obtaining speech data to be recognized; processing the speech data to be recognized using a trained sub-band energy normalized acoustic model, to determine a normalized energy feature corresponding to each time-frequency unit in the speech data to be recognized; and determining text data corresponding to the speech data to be recognized according to the normalized energy feature corresponding to each time-frequency unit.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: February 25, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Mingming Chen, Xiangang Li, Jue Sun
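One simple form that per-band energy normalization can take (an assumed sketch in the spirit of PCEN-style front ends, not the patent's trained model): divide each time-frequency unit's energy by a smoothed per-band estimate.

```python
import numpy as np

def normalize_bands(E, alpha=0.9, eps=1e-6):
    # E: (frames, bands) energy matrix. Track an exponentially smoothed
    # per-band energy estimate and normalize each time-frequency unit by it.
    M = np.zeros_like(E)
    m = E[0]
    for t in range(E.shape[0]):
        m = alpha * m + (1 - alpha) * E[t]
        M[t] = m
    return E / (M + eps)
```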
  • Patent number: 10573317
    Abstract: An electronic device and method are disclosed herein. The electronic device implements the method, including: receiving a first speech, and extracting a first text from the received first speech, in response to detecting that extraction of the first text includes errors such that a request associated with the first speech is unprocessable, storing the extracted first text, receiving a second speech and extracting a second text from the received second speech, in response to detecting that the request is processable using the extracted second text, detecting whether a similarity between the first and second texts is greater than a similarity threshold, and whether the second speech is received within a predetermined time duration of receiving the first speech, and when the similarity is greater than the threshold, and the first and second speech signals are received within the time duration, storing the first text in association with the second text.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: February 25, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yongwook Kim, Jamin Goo, Gangheok Kim, Dongkyu Lee
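The two gating conditions above, text similarity above a threshold and the second utterance arriving within a time window, can be sketched with `difflib`; the threshold and window values are illustrative assumptions.

```python
import difflib

def is_retry(first_text, second_text, first_ts, second_ts,
             sim_threshold=0.75, window_s=10.0):
    # Similarity of the failed transcript to the successful one.
    sim = difflib.SequenceMatcher(None, first_text, second_text).ratio()
    # Treat the pair as a retry only if both conditions hold.
    return sim > sim_threshold and (second_ts - first_ts) <= window_s
```

When both conditions hold, the first (failed) text can be stored in association with the second, as the abstract describes.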
  • Patent number: 10574827
    Abstract: A method and apparatus for sharing documents during a conference call are disclosed. One example method may include initiating a document sharing operation during a conference call conducted between at least two participants. The method may also include transferring the document from one of the two participants to another of the two participants, and recording at least one action performed on the document by the participants during the conference call.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: February 25, 2020
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Hendryanto Rilantono, Myron P. Sojka
  • Patent number: 10573048
    Abstract: One or more computing devices, systems, and/or methods for emotional reaction sharing are provided. For example, a client device captures video of a user viewing content, such as a live stream video. Landmark points, corresponding to facial features of the user, are identified and provided to a user reaction distribution service that evaluates the landmark points to identify a facial expression of the user, such as a crying facial expression. The facial expression, such as landmark points that can be applied to a three-dimensional model of an avatar to recreate the facial expression, is provided to client devices of users viewing the content, such as a second client device. The second client device applies the landmark points of the facial expression to a bone structure mapping and a muscle movement mapping to create an expressive avatar having the facial expression for display to a second user.
    Type: Grant
    Filed: July 25, 2016
    Date of Patent: February 25, 2020
    Assignee: Oath Inc.
    Inventors: Bin Ni, Gregory Davis Choi, Adam Bryan Mathes
  • Patent number: 10565455
    Abstract: Disclosed are systems and devices for facilitating audiovisual communication. According to embodiments, there may be provided a communication device including or otherwise functionally associated with a video camera to acquire visual information from a scene and to convert the visual information into a video stream. A video processing circuit may (a) receive the video stream, (b) extract user-related features from the video stream, and (c) detect and characterize one or more user actions. A controller receives the video processing circuit output and, responsive to an indication of one or more detected actions within the video stream, triggers a communication session between the communication device and an addressee device, wherein different addressee devices may be associated with different detected actions.
    Type: Grant
    Filed: July 26, 2015
    Date of Patent: February 18, 2020
    Assignee: ANTS TECHNOLOGY (HK) LIMITED
    Inventors: Ron Fridental, Ilya Blayvas, Lili Zhao
  • Patent number: 10565990
    Abstract: Described herein are systems, methods, and apparatus for determining audio context between an audio source and an audio sink and selecting signal profiles based at least in part on that audio context. The signal profiles may include noise cancellation which is configured to facilitate operation within the audio context. Audio context may include user-to-user and user-to-device communications.
    Type: Grant
    Filed: July 28, 2017
    Date of Patent: February 18, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Stephen M. Polansky, Matthew P. Bell, Yuzo Watanabe