Patents Assigned to AppTek
-
Patent number: 11302300
Abstract: A system and method enable one to set a target duration of a desired synthesized utterance without removing or adding spoken content. Without changing the spoken text, the voice characteristics may be kept the same or substantially the same. Silence adjustment and interpolation may be used to alter the duration while preserving speech characteristics. Speech may be translated prior to a vocoder step, pursuant to which the translated speech is constrained by the original audio duration, while mimicking the speech characteristics of the original speech.
Type: Grant
Filed: November 19, 2020
Date of Patent: April 12, 2022
Assignee: Applications Technology (AppTek), LLC
Inventors: Nick Rossenbach, Mudar Yaghi
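The interpolation step named in the abstract can be illustrated with a minimal sketch, assuming the synthesized speech is represented as a mel-spectrogram array; the function name, array shapes, and linear-interpolation choice are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def stretch_to_frames(mel: np.ndarray, target_frames: int) -> np.ndarray:
    """Linearly interpolate a (frames, n_mels) spectrogram to target_frames,
    altering the duration while keeping the spectral (voice) characteristics."""
    src_frames = mel.shape[0]
    # Map each target frame back to a fractional position in the source.
    positions = np.linspace(0, src_frames - 1, target_frames)
    lower = np.floor(positions).astype(int)
    upper = np.minimum(lower + 1, src_frames - 1)
    frac = (positions - lower)[:, None]
    # Blend neighboring source frames according to the fractional position.
    return (1 - frac) * mel[lower] + frac * mel[upper]

# Example: stretch a 100-frame spectrogram to a 120-frame target duration.
mel = np.random.rand(100, 80)
stretched = stretch_to_frames(mel, 120)
```

A vocoder run on the stretched features would then produce audio of the target duration; silence adjustment (lengthening or shortening pauses) could be applied before this step.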
-
Publication number: 20210298711
Abstract: A mobile device application prompts and conducts audio and/or video tests using a microphone on a smartphone, tablet or laptop to record and analyze a patient's speech, cough, breathing and other sounds in order to diagnose the patient with COVID-19 or another ailment, or to determine that the patient is within normal ranges not indicative of disease. The tests and protocols use program instructions, AI processing and other automated tools to improve the speed and reliability of the testing.
Type: Application
Filed: March 25, 2021
Publication date: September 30, 2021
Applicant: Applications Technology (AppTek), LLC
Inventors: Shahnaz MIRI, Yasar Torres YAGHI, Fernando PAGAN, Mudar YAGHI, Sanjeev KHUDANPUR, Jan TRMAL, Hassan SAWAF, Jintao JIANG, Mazda EBRAHIMI
-
Publication number: 20210151028
Abstract: A system and method enable one to set a target duration of a desired synthesized utterance without removing or adding spoken content. Without changing the spoken text, the voice characteristics may be kept the same or substantially the same. Silence adjustment and interpolation may be used to alter the duration while preserving speech characteristics. Speech may be translated prior to a vocoder step, pursuant to which the translated speech is constrained by the original audio duration, while mimicking the speech characteristics of the original speech.
Type: Application
Filed: November 19, 2020
Publication date: May 20, 2021
Applicant: Applications Technology (AppTek), LLC
Inventors: Nick ROSSENBACH, Mudar YAGHI
-
Publication number: 20210100453
Abstract: A system for determining whether a user has a concussion includes: a mobile device including a camera, microphone, display, an application program running on the mobile device, and a network connection, and a back end processing server. The mobile device further includes a processor that is configured to execute program instructions associated with the application that cause the mobile device to: collect audio in response to specific questions from the user using the microphone, and transmit the audio to the back end processing server. The back end processing server is configured to execute program instructions to: receive the audio stream from the user, store the audio stream, process the audio stream to recognize speech within the audio stream, and compare parameters associated with the speech recognition with prior speech for the user collected on a prior occasion in response to the same questions.
Type: Application
Filed: December 17, 2020
Publication date: April 8, 2021
Applicant: Applications Technology (AppTek), LLC
Inventors: Mudar YAGHI, Yasar YAGHI, Darius FERDOWS, Jintao JIANG
-
Patent number: 10939821
Abstract: A mobile device is programmed with an application that uses the mobile device's camera, accelerometer and microphone to enable a parent, coach or player to use it as a tool to diagnose a concussion. The tool may diagnose concussion on the basis of one or multiple factors that are scored, for example the player's balance, eye movement, speech responses to questions, button pressing response time, and other information about the location of the impact. A mobile device may be equipped with speech recognition and voice prompting to enable a concussion examination of a player to be administered by another player or coach to the injured player without significant effort by the injured player or helper. Each test may be scored, by itself or against one or more baselines for the injured player to develop an overall score and likelihood of a concussion. When the coach thinks there is a concussion, he/she can use the application to help find a doctor.
Type: Grant
Filed: March 20, 2018
Date of Patent: March 9, 2021
Assignee: Applications Technology (AppTek), LLC
Inventors: Mudar Yaghi, Yasar Yaghi, Darius Ferdows, Jintao Jiang
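The multi-factor scoring against per-player baselines described above can be sketched as follows; the test names, baseline values, and weights are entirely hypothetical placeholders, not values from the patent:

```python
# Hypothetical baselines for one player: fractional scores for most tests,
# milliseconds for the button-pressing reaction test.
BASELINES = {"balance": 0.9, "eye_movement": 0.85, "speech": 0.88, "reaction_ms": 250}
WEIGHTS = {"balance": 0.3, "eye_movement": 0.3, "speech": 0.2, "reaction_ms": 0.2}

def concussion_score(results: dict) -> float:
    """Combine per-test deviations from baseline into an overall 0-1 score;
    higher values indicate greater deviation and thus higher likelihood."""
    total = 0.0
    for test, weight in WEIGHTS.items():
        baseline = BASELINES[test]
        if test == "reaction_ms":
            # Slower (larger) reaction times than baseline indicate impairment.
            deviation = max(0.0, (results[test] - baseline) / baseline)
        else:
            # Lower scores than baseline indicate impairment.
            deviation = max(0.0, (baseline - results[test]) / baseline)
        total += weight * min(deviation, 1.0)
    return total

# Example: a player performing below baseline on several tests.
score = concussion_score(
    {"balance": 0.6, "eye_movement": 0.7, "speech": 0.85, "reaction_ms": 400})
```

A score of 0.0 means the player matched or exceeded every baseline; how a given score maps to "likely concussion" would be a calibration decision outside this sketch.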
-
Publication number: 20200364402
Abstract: A subtitle segmentation system employs a neural network model to find good segment boundaries. The model may be trained on millions of professionally segmented subtitles, and implicitly learns from data the underlying guidelines that professionals use. For controlling different characteristics of the output subtitles, the neural model may be combined with a number of heuristic features. To find the best segmentation according to the model combination, a dedicated beam search decoder may be implemented. The segmentation system incorporates a trained neural model comprising a word embedding layer, at least two bi-directional LSTM layers, a softmax layer and program instructions for segmenting text into subtitles.
Type: Application
Filed: May 18, 2020
Publication date: November 19, 2020
Applicant: Applications Technology (AppTek), LLC
Inventors: Patrick WILKEN, Evgeny MATUSOV
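The combination of neural boundary probabilities with heuristic features inside a beam search decoder can be sketched as follows; here the per-word boundary probabilities stand in for the BiLSTM model's output, and the line-length penalty is one example heuristic — the specific penalty and parameter values are assumptions, not the patent's:

```python
import math

def segment(words, boundary_probs, max_len=42, beam_size=4, len_weight=1.0):
    """Beam search over subtitle segmentations. boundary_probs[i] is the
    (stubbed) model probability of a subtitle boundary after words[i]; the
    hypothesis score combines boundary log-probabilities with a heuristic
    penalty for lines longer than max_len characters."""
    beams = [(0.0, 0, [])]  # (score, start index of current line, boundaries)
    for i in range(1, len(words) + 1):
        expanded = []
        for score, start, bounds in beams:
            p = boundary_probs[i - 1]
            line = " ".join(words[start:i])
            penalty = len_weight * max(0, len(line) - max_len)
            # Option 1: close the current line with a boundary after word i.
            expanded.append((score + math.log(max(p, 1e-9)) - penalty,
                             i, bounds + [i]))
            # Option 2: keep extending the line (forced boundary at the end).
            if i < len(words):
                expanded.append((score + math.log(max(1.0 - p, 1e-9)),
                                 start, bounds))
        # Prune to the highest-scoring hypotheses.
        beams = sorted(expanded, key=lambda b: b[0], reverse=True)[:beam_size]
    lines, start = [], 0
    for b in beams[0][2]:
        lines.append(" ".join(words[start:b]))
        start = b
    return lines
```

In the full system the probabilities would come from the trained embedding + BiLSTM + softmax stack rather than being supplied directly.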
-
Publication number: 20200226327
Abstract: A system for translating speech from at least two source languages into another target language provides direct speech to target language translation. The target text is converted to speech in the target language through a TTS system. The system simplifies the speech recognition and translation process by providing direct translation, includes mechanisms that facilitate mixed-language source speech translation, and punctuates output text streams in the target language. In some embodiments, it also allows translation of speech into the target language to reflect the voice of the speaker of the source speech, based on characteristics of the source language speech and the speaker's voice, and produces subtitled data in the target language corresponding to the source speech. The system uses models having been trained using (i) encoder-decoder architectures with attention mechanisms and training data using TTS and (ii) parallel text training data in more than two different languages.
Type: Application
Filed: January 13, 2020
Publication date: July 16, 2020
Applicant: Applications Technology (AppTek), LLC
Inventors: Evgeny MATUSOV, Jintao JIANG, Mudar YAGHI
-
Publication number: 20200170510
Abstract: A system and method dispense medication through a smart pill that holds at least one medication and a mobile device capable of communicating wirelessly with the smart pill. The mobile device includes a user interface for communicating with a patient and at least one application program that includes program instructions for measuring an aspect of a patient's physiology. A processor on the mobile device is configured to execute a program for monitoring and communicating with the smart pill, causing at least one of the applications to execute and test the patient, and on the basis of the outcome of the testing, issue a signal for timing and amount of medication release from the smart pill to the patient. The application programs may include a cognitive test, an eye test, a balance test, and a reaction test which may use the display, camera, speaker and microphone of the mobile device.
Type: Application
Filed: October 29, 2019
Publication date: June 4, 2020
Applicant: AppTek, Inc.
Inventors: Darius FERDOWS, Yasar YAGHI, Mudar YAGHI, Fernando PAGAN, Charbel MOUSSA
-
Patent number: 10255910
Abstract: Deep Neural Networks (DNN) are time shifted relative to one another and trained. The time-shifted networks may then be combined to improve recognition accuracy. The approach is based on an automatic speech recognition (ASR) system using DNN and using time shifted features. Initially, a regular ASR model is trained to produce a first trained DNN. Then a top layer (e.g., SoftMax layer) and the last hidden layer (e.g., Sigmoid) are fine-tuned with the same data set but with a feature window left- and right-shifted to create respective second and third left-shifted and right-shifted DNNs. From these three DNN networks, four combination networks may be generated: left- and right-shifted, left-shifted and centered, centered and right-shifted, and left-shifted, centered, and right-shifted. The centered networks are used to perform the initial (first-pass) ASR. Then the other six networks are used to perform rescoring.
Type: Grant
Filed: September 18, 2017
Date of Patent: April 9, 2019
Assignee: AppTek, Inc.
Inventors: Mudar Yaghi, Hassan Sawaf, Jinato Jiang
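One plausible way to combine the shifted networks' outputs for rescoring is a weighted log-linear average of their frame-level log-posteriors; this sketch assumes that representation and equal weights, neither of which is specified by the abstract:

```python
import numpy as np

def log_softmax(x):
    # Per-frame normalization to log-probabilities.
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

def combine_posteriors(centered, left, right, weights=(1/3, 1/3, 1/3)):
    """Weighted log-linear combination of frame-level log-posteriors from the
    centered, left-shifted and right-shifted networks, renormalized per frame.
    The combined scores can then rescore first-pass hypotheses."""
    stacked = np.stack([centered, left, right])   # (3, frames, classes)
    w = np.array(weights)[:, None, None]
    return log_softmax((w * stacked).sum(axis=0))

# Example: three networks' log-posteriors over 5 frames and 10 classes.
rng = np.random.default_rng(0)
c, l, r = (log_softmax(rng.random((5, 10))) for _ in range(3))
combined = combine_posteriors(c, l, r)
```

The four combination networks named in the abstract would correspond to different weight vectors (e.g., zero weight on the network being left out).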
-
Patent number: 10235991
Abstract: A hybrid frame, phone, diphone, morpheme, and word-level Deep Neural Networks (DNN) approach in model training and applications is based on training a regular ASR system, which can be based on Gaussian Mixture Models (GMM) or DNN. All the training data (in the format of features) are aligned with the transcripts in terms of phonemes and words with the timing information, and new features are formed in terms of phonemes, diphones, morphemes, and up to words. Regular ASR produces a result lattice with timing information for each word. A feature is then extracted and sent to the word-level DNN for scoring. Phoneme features are sent to corresponding DNNs for training. Scores are combined to form the word-level scores, a rescored lattice and a new recognition result.
Type: Grant
Filed: August 9, 2017
Date of Patent: March 19, 2019
Assignee: AppTek, Inc.
Inventors: Jintao Jiang, Hassan Sawaf, Mudar Yaghi
-
Publication number: 20180263496
Abstract: A mobile device is programmed with an application that uses the mobile device's camera, accelerometer and microphone to enable a parent, coach or player to use it as a tool to diagnose a concussion. The tool may diagnose concussion on the basis of one or multiple factors that are scored, for example the player's balance, eye movement, speech responses to questions, button pressing response time, and other information about the location of the impact. A mobile device may be equipped with speech recognition and voice prompting to enable a concussion examination of a player to be administered by another player or coach to the injured player without significant effort by the injured player or helper. Each test may be scored, by itself or against one or more baselines for the injured player to develop an overall score and likelihood of a concussion. When the coach thinks there is a concussion, he/she can use the application to help find a doctor.
Type: Application
Filed: March 20, 2018
Publication date: September 20, 2018
Applicant: Apptek, Inc.
Inventor: Jintao JIANG
-
Patent number: 10074363
Abstract: Phoneme images are created for keywords and audio files. The keyword images and audio file images are used to identify keywords within the audio file when the phoneme images match. Confidence scores may be determined corresponding to the match. Audio around the keywords may be stored and processed with an automatic speech recognition (ASR) program to verify the keyword match and provide textual and audio context to where the keyword appears within speech.
Type: Grant
Filed: November 11, 2016
Date of Patent: September 11, 2018
Assignee: Apptek, Inc.
Inventors: Jintao Jiang, Mudar Yaghi
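The image-matching idea can be sketched by sliding a keyword's phoneme-posterior "image" (frames × phonemes) across the audio's posterior image and scoring each offset; the use of normalized correlation as the confidence measure is an assumption for illustration, not the patent's stated metric:

```python
import numpy as np

def find_keyword(audio_post, keyword_post, threshold=0.9):
    """Slide the keyword's phoneme-posterior image across the audio's image;
    return the best-matching offset, a normalized correlation as the
    confidence score, and whether the score clears the threshold."""
    k = keyword_post.shape[0]
    best_offset, best_score = -1, -1.0
    for offset in range(audio_post.shape[0] - k + 1):
        window = audio_post[offset:offset + k]
        # Cosine similarity between the two phoneme images.
        score = (window * keyword_post).sum() / (
            np.linalg.norm(window) * np.linalg.norm(keyword_post) + 1e-9)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score, best_score >= threshold

# Example: embed the keyword image at frame 10 of a 30-frame audio image.
rng = np.random.default_rng(0)
keyword = rng.random((5, 20))
audio = rng.random((30, 20))
audio[10:15] = keyword
offset, confidence, hit = find_keyword(audio, keyword)
```

A hit would then trigger the verification step described in the abstract: running ASR on the audio around the matched offset to confirm the keyword in context.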
-
Publication number: 20180082677
Abstract: Deep Neural Networks (DNN) are time shifted relative to one another and trained. The time-shifted networks may then be combined to improve recognition accuracy. The approach is based on an automatic speech recognition (ASR) system using DNN and using time shifted features. Initially, a regular ASR model is trained to produce a first trained DNN. Then a top layer (e.g., SoftMax layer) and the last hidden layer (e.g., Sigmoid) are fine-tuned with the same data set but with a feature window left- and right-shifted to create respective second and third left-shifted and right-shifted DNNs. From these three DNN networks, four combination networks may be generated: left- and right-shifted, left-shifted and centered, centered and right-shifted, and left-shifted, centered, and right-shifted. The centered networks are used to perform the initial (first-pass) ASR. Then the other six networks are used to perform rescoring.
Type: Application
Filed: September 18, 2017
Publication date: March 22, 2018
Applicant: Apptek, Inc.
Inventors: Mudar YAGHI, Hassan SAWAF, Jinato JIANG
-
Publication number: 20180047385
Abstract: An approach of hybrid frame, phone, diphone, morpheme, and word-level Deep Neural Networks (DNN) in model training and applications is described. The approach can be applied to many applications. The approach is based on a regular ASR system, which can be based on Gaussian Mixture Models (GMM) or DNN. In the first step, a regular ASR model is trained. All the training data (in the format of features) are aligned with the transcripts in terms of phonemes and words with the timing information. Feature normalization can be applied for these new features. Based on the alignment timing information, new features are formed in terms of phonemes, diphones, morphemes, and up to words. A first pass regular speech recognition is performed, and the result lattice is produced. In the lattice, there is the timing information for each word. A feature is then extracted and sent to the word-level DNN for scoring.
Type: Application
Filed: August 9, 2017
Publication date: February 15, 2018
Applicant: Apptek, Inc.
Inventors: Jintao JIANG, Hassan SAWAF, Mudar YAGHI
-
Publication number: 20170133038
Abstract: Phoneme images are created for keywords and audio files. The keyword images and audio file images are used to identify keywords within the audio file when the phoneme images match. Confidence scores may be determined corresponding to the match. Audio around the keywords may be stored and processed with an automatic speech recognition (ASR) program to verify the keyword match and provide textual and audio context to where the keyword appears within speech.
Type: Application
Filed: November 11, 2016
Publication date: May 11, 2017
Applicant: Apptek, Inc.
Inventors: Jintao JIANG, Mudar YAGHI
-
Publication number: 20100179803
Abstract: A system and method for hybrid machine translation are based on a statistical transfer approach using statistical and linguistic features. The system and method may be used to translate from one language into another. The system may include at least one database, a rule-based translation module, a statistical translation module and a hybrid machine translation engine. The database(s) store source and target text, rule-based language models and statistical language models. The rule-based translation module translates source text based on the rule-based language models. The statistical translation module translates source text based on the statistical language models. A hybrid machine translation engine, having a maximum entropy algorithm, is coupled to the rule-based translation module and the statistical translation module and is capable of translating source text into target text based on the rule-based and statistical language models.
Type: Application
Filed: October 26, 2009
Publication date: July 15, 2010
Applicant: AppTek
Inventors: Hassan SAWAF, Mohammad Shihadah, Mudar Yaghi
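The maximum entropy combination of rule-based and statistical signals can be sketched as a log-linear model over candidate translations; the feature names, weights, and example values below are hypothetical, chosen only to show the scoring mechanics:

```python
import math

# Hypothetical feature weights (lambdas) for the log-linear combination.
FEATURE_WEIGHTS = {"rule_based_score": 0.4, "statistical_lm": 0.4, "length_ratio": 0.2}

def hybrid_score(features: dict) -> float:
    """exp(sum_i lambda_i * f_i): the unnormalized maximum-entropy score
    combining rule-based and statistical features for one candidate."""
    return math.exp(sum(FEATURE_WEIGHTS[name] * value
                        for name, value in features.items()))

def best_translation(candidates):
    """Pick the candidate whose features yield the highest combined score."""
    return max(candidates, key=lambda c: hybrid_score(c["features"]))

# Example: two candidate translations with (hypothetical) log-scale features.
candidates = [
    {"text": "translation A",
     "features": {"rule_based_score": -1.2, "statistical_lm": -0.8, "length_ratio": -0.1}},
    {"text": "translation B",
     "features": {"rule_based_score": -0.5, "statistical_lm": -1.0, "length_ratio": -0.2}},
]
best = best_translation(candidates)
```

In the patented system the rule-based and statistical modules would each supply their own features, and the weights would be estimated from training data rather than fixed by hand.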