Abstract: The disclosed computer-implemented method may include receiving input voice data synchronous with a visual state of a user interface of a third-party application, generating multiple sentence alternatives for the received input voice data, identifying a best sentence of the multiple sentence alternatives, executing a dialog script for the third-party application using the best sentence, the dialog script generating a response to the received input voice data comprising output voice data and a corresponding visual response, and providing the visual response and the output voice data to the third-party application, the third-party application playing the output voice data synchronously with updating the user interface based on the visual response. Various other methods, systems, and computer-readable media are also disclosed.
Abstract: A voice support server is used to provide voice control functionality to a third-party application that does not natively support voice control. The voice support server implements a domain specific to the third-party application that maintains a domain-specific language model (DLM) reflecting the functionality of that application. The DLM comprises a plurality of intent patterns corresponding to different commands, and their possible variations, that may be issued by the user, and maps each intent pattern to a corresponding action to be performed by the third-party application. Received audio data is analyzed to determine one or more user utterances, which are transcribed and compared to the intent patterns of the DLM to identify an intent corresponding to the user utterance. The voice support server may then transmit instructions to the third-party application to perform the action corresponding to the identified intent.
Type:
Grant
Filed:
January 24, 2020
Date of Patent:
March 19, 2024
Assignee:
Alan AI, Inc.
Inventors:
Andrey Ryabov, Anna Miroshnichenko, Evgeny Yusov, Alex Sotnikov
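The intent matching described in the second abstract above (transcribing an utterance, comparing it to the DLM's intent patterns, and mapping a match to an application action) can be sketched in Python. The pattern syntax, intent names, and slot names below are illustrative assumptions, not the patent's actual implementation:

```python
import re

# Hypothetical sketch of a domain-specific language model (DLM):
# each intent pattern covers a command and its phrasing variations,
# and maps to an action name the third-party application understands.
INTENT_PATTERNS = [
    (re.compile(r"^(?:please )?(?:open|show) (?:the )?(?P<screen>\w+)(?: screen)?$"),
     "NAVIGATE"),
    (re.compile(r"^(?:please )?add (?P<item>.+) to (?:my )?cart$"),
     "ADD_TO_CART"),
]

def match_intent(transcript: str):
    """Compare a transcribed utterance against the DLM intent patterns.

    Returns (action, slots) for the first matching pattern, or None
    when no intent corresponds to the utterance.
    """
    text = transcript.strip().lower()
    for pattern, action in INTENT_PATTERNS:
        match = pattern.match(text)
        if match:
            return action, match.groupdict()
    return None

# A matched intent yields the action (plus slot values) that the server
# would send back to the third-party application as instructions.
print(match_intent("open the settings screen"))
# → ('NAVIGATE', {'screen': 'settings'})
```

Regular expressions stand in here for whatever pattern representation the DLM actually uses; the point is only the lookup from utterance variations to a single application action.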
Abstract: A mobile device includes a plurality of physical buttons for providing programmable control inputs to the mobile device. A processor within the mobile device is programmed with instructions to cause the mobile device to respond to a first input via a first of the plurality of physical buttons by activating voice recognition by the mobile device and providing an indication on a status bar on a screen of the mobile device. The mobile device accepts voice inputs and processes the voice inputs locally and via interactions with a remote server to provide contextualized information within an application executing on the mobile device.
Type:
Grant
Filed:
January 2, 2018
Date of Patent:
December 31, 2019
Assignee:
Alan AI, Inc.
Inventors:
Andrey Ryabov, Ramu Sunkara, James Shelburne, Sergey Yuryev
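A minimal sketch of the button-driven voice activation described in the abstract above, assuming hypothetical button identifiers and status-bar text; a real device would also process accepted voice input locally and via a remote server:

```python
from dataclasses import dataclass, field

@dataclass
class MobileDevice:
    """Toy model of the described device: a designated physical button
    toggles voice recognition and surfaces an on-screen indication."""
    listening: bool = False
    status_bar: str = ""          # indication shown while listening
    transcript_log: list = field(default_factory=list)

    def on_button_press(self, button: str) -> None:
        # A first input via the designated physical button activates
        # voice recognition and updates the status bar; pressing it
        # again deactivates recognition. "voice" is an assumed name.
        if button == "voice":
            self.listening = not self.listening
            self.status_bar = "Listening..." if self.listening else ""

    def on_voice_input(self, audio_text: str) -> None:
        # Voice input is accepted only while recognition is active.
        # Here we just log it; the real device would combine local
        # processing with a remote server to contextualize the result.
        if self.listening:
            self.transcript_log.append(audio_text)

device = MobileDevice()
device.on_button_press("voice")
device.on_voice_input("show my schedule")
print(device.status_bar, device.transcript_log)
# → Listening... ['show my schedule']
```

The toggle-plus-indicator shape mirrors the claim language (first input activates recognition and shows an indication); everything else, including the logged transcript, is scaffolding for the example.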