Patents Examined by Bhavesh M. Mehta
  • Patent number: 11443761
    Abstract: A technique, suitable for real-time processing, is disclosed for pitch tracking by detection of glottal excitation epochs in a speech signal. It uses the Hilbert envelope to enhance the saliency of the glottal excitation epochs and to reduce the ripples due to the vocal tract filter. The processing comprises the steps of dynamic range compression, calculation of the Hilbert envelope, and epoch marking. The Hilbert envelope is calculated using the output of an FIR-filter-based Hilbert transformer and the delay-compensated signal. The epoch marking uses a dynamic peak detector with fast rise and slow fall and nonlinear smoothing to further enhance the saliency of the epochs, followed by a differentiator or a Teager energy operator, and amplitude-duration thresholding. The technique is meant for use in speech codecs, voice conversion, speech and speaker recognition, diagnosis of voice disorders, speech training aids, and other applications involving pitch estimation.
    Type: Grant
    Filed: August 3, 2019
    Date of Patent: September 13, 2022
    Inventors: Prem Chand Pandey, Hirak Dasgupta, Nataraj Kathriki Shambulingappa
  • Patent number: 11380343
    Abstract: A method for encoding an audio signal, comprising using one or more algorithms operating on a processor to filter the audio signal into two output signals, wherein each output signal has a sampling rate that is equal to a sampling rate of the audio signal, and wherein one of the output signals includes high frequency data. Using one or more algorithms operating on the processor to window the high frequency data by selecting a set of the high frequency data. Using one or more algorithms operating on the processor to determine a set of linear predictive coding (LPC) coefficients for the windowed data. Using one or more algorithms operating on the processor to generate energy scale values for the windowed data. Using one or more algorithms operating on the processor to generate an encoded high frequency bitstream.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: July 5, 2022
    Assignee: IMMERSION NETWORKS, INC.
    Inventors: James David Johnston, King Wei Hor
  • Patent number: 11189302
    Abstract: A speech emotion detection system may obtain to-be-detected speech data. The system may generate speech frames based on framing processing and the to-be-detected speech data. The system may extract speech features corresponding to the speech frames to form a speech feature matrix corresponding to the to-be-detected speech data. The system may input the speech feature matrix to an emotion state probability detection model. The system may generate, based on the speech feature matrix and the emotion state probability detection model, an emotion state probability matrix corresponding to the to-be-detected speech data. The system may input the emotion state probability matrix and the speech feature matrix to an emotion state transition model. The system may generate an emotion state sequence based on the emotional state probability matrix, the speech feature matrix, and the emotional state transition model. The system may determine an emotion state based on the emotion state sequence.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: November 30, 2021
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventor: Haibo Liu
  • Patent number: 11164582
    Abstract: Set forth is a motorized computing device that selectively navigates to a user according to content of a spoken utterance directed at the motorized computing device. The motorized computing device can modify operations of one or more motors of the motorized computing device according to whether the user provided a spoken utterance while the one or more motors are operating. The motorized computing device can render content according to interactions between the user and an automated assistant. For instance, when the automated assistant is requested to provide graphical content for the user, the motorized computing device can navigate to the user in order to present the content to the user. However, in some implementations, when the user requests audio content, the motorized computing device can bypass navigating to the user when the motorized computing device is within a distance from the user for audibly rendering the audio content.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: November 2, 2021
    Assignee: GOOGLE LLC
    Inventors: Scott Stanford, Keun-Young Park, Vitalii Tomkiv, Hideaki Matsui, Angad Sidhu
  • Patent number: 11132509
    Abstract: A speech interface device is configured to perform natural language understanding (NLU) processing in a manner that optimizes the use of resources on the speech interface device. In an example process, a domain classifier(s) is used to generate domain classifier scores associated with multiple candidate domains, and the candidate domains can then be evaluated, one candidate domain at a time, in accordance with the domain classifier scores (e.g., starting with a highest scoring candidate domain). For each candidate domain undergoing the evaluation, input data is processed by that domain's NLU model(s), and, as soon as a domain-specific NLU model(s) produces an NLU result with a confidence score that satisfies a threshold confidence score, the evaluation can be stopped for any remaining candidate domains.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: September 28, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Stanislaw Ignacy Pasko, Ross William McGowan, Aliaksei Kuzmin, Rui Liu
  • Patent number: 10991362
    Abstract: Provided is a target speech signal extraction method for robust speech recognition including: receiving information on a direction of arrival of the target speech source with respect to the microphones; generating a nullformer by using the information on the direction of arrival of the target speech source to remove the target speech signal from the input signals and to estimate noise; setting a real output of the target speech source using an adaptive vector as a first channel and setting a dummy output by the nullformer as a remaining channel; setting a cost function for minimizing dependency between the real output of the target speech source and the dummy output using the nullformer by performing independent component analysis (ICA) or independent vector analysis (IVA); setting an auxiliary function to the cost function; and estimating the target speech signal by using the cost function and the auxiliary function.
    Type: Grant
    Filed: April 15, 2020
    Date of Patent: April 27, 2021
    Assignee: INDUSTRY-UNIVERSITY COOPERATION FOUNDATION SOGANG UNIVERSITY
    Inventors: Hyung Min Park, Seoyoung Lee, Seung-Yun Kim, Byung Joon Cho, Uihyeop Shin
  • Patent number: 10930277
    Abstract: A voice interaction architecture has a hands-free, electronic voice controlled assistant that permits users to verbally request information from cloud services. Since the assistant relies primarily, if not exclusively, on voice interactions, configuring the assistant for the first time may pose a challenge, particularly to a novice user who is unfamiliar with network settings (such as Wi-Fi access keys). The architecture supports several approaches to configuring the voice controlled assistant that may be accomplished without much or any user input, thereby promoting a positive out-of-box experience for the user. More particularly, these approaches involve use of audible or optical signals to configure the voice controlled assistant.
    Type: Grant
    Filed: August 12, 2016
    Date of Patent: February 23, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Tony David, Parag Garg
  • Patent number: 10923101
    Abstract: A system, a computer program product, and a method for controlling synthesized speech output on a voice-controlled device. The voice-controlled device recognizes that speech input is being received and outputs synthesized speech based on the speech input. While the synthesized speech is being output, audio is captured. In response to the voice-controlled device recognizing the captured audio as speech, the outputting of synthesized speech is paused. Otherwise, in response to the captured audio not being recognized as speech and being above a settable background noise threshold, the outputting of synthesized speech is also paused. The paused output of synthesized speech is resumed when the pause is within a settable pause timeframe.
    Type: Grant
    Filed: December 26, 2017
    Date of Patent: February 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Shang Qing Guo, Jonathan Lenchner
  • Patent number: 10909315
    Abstract: A syntax analysis method and apparatus are disclosed. The method includes: obtaining a source language sentence that is a translation of a target language sentence (S110); determining instances of state transition for the target language sentence according to the source language sentence and a correspondence between words of the target language sentence and words of the source language sentence (S120); and generating a syntax tree of the target language sentence according to the instances of state transition for the target language sentence (S130). The syntax analysis method and apparatus can improve efficiency of syntax analysis.
    Type: Grant
    Filed: January 17, 2018
    Date of Patent: February 2, 2021
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zhaopeng Tu, Xiao Chen, Wenbin Jiang
  • Patent number: 10909977
    Abstract: A disclosed method includes monitoring an audio signal energy level while having a plurality of signal processing components deactivated and activating at least one signal processing component in response to a detected change in the audio signal energy level. The method may include activating and running a voice activity detector on the audio signal in response to the detected change, where the voice activity detector is the at least one signal processing component. The method may further include activating and running a noise suppressor only if a noise estimator determines that noise suppression is required. The method may activate and run a noise type classifier to determine the noise type based on information received from the noise estimator and may select a noise suppressor algorithm, from a group of available noise suppressor algorithms, where the selected noise suppressor algorithm is the most power consumption efficient.
    Type: Grant
    Filed: May 11, 2018
    Date of Patent: February 2, 2021
    Assignee: Google Technology Holdings LLC
    Inventors: Plamen A. Ivanov, Kevin J. Bastyr, Joel A. Clark, Mark A. Jasiuk, Tenkasi V. Ramabadran, Jincheng Wu
  • Patent number: 10909992
    Abstract: The lossless coding method includes selecting one of a first coding method and a second coding method, based on a range in which a quantization index of energy is represented, and coding the quantization index by using the selected coding method. The lossless decoding method includes determining a coding method of a differential quantization index of energy included in a bitstream and decoding the differential quantization index by using one of a first decoding method and a second decoding method based on a range in which a quantization index of energy is represented, in response to the determined coding method.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: February 2, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Ki-hyun Choo
  • Patent number: 10902855
    Abstract: An electronic device includes one or more processors, an audio interface, operable with the one or more processors, and a voice interface engine. The audio interface receives first acoustic signals identifying a control operation for the one or more processors. The one or more processors cause the audio interface to exchange second acoustic signals with at least one other electronic device, thereby negotiating which device will perform the control operation.
    Type: Grant
    Filed: May 8, 2017
    Date of Patent: January 26, 2021
    Assignee: Motorola Mobility LLC
    Inventors: Jun-Ki Min, Sudhir Vissa, Nikhil Ambha Madhusudhana, Vivek Tyagi, Mir Farooq Ali
  • Patent number: 10878203
    Abstract: The present disclosure provides a translation system enabling a translation of a web page by an alteration of a website. The translation system comprises: a translation request receiving unit for receiving a translation request from a client device, the translation request including the URL of a web page in which text in a first language is displayed; a translating unit for translating the text in the first language included in the web page indicated by the URL into text in a second language by referring to a bilingual database storing words and phrases in the first language associated with words and phrases in the second language constituting translated words and phrases of the words and phrases in the first language; and a translation sending unit for sending the translated text in the second language to the client device.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: December 29, 2020
    Assignee: Wovn Technologies, Inc.
    Inventors: Takaharu Hayashi, Jeffrey Thomas Sandford
  • Patent number: 10811023
    Abstract: The present document relates to time-alignment of encoded data of an audio encoder with associated metadata, such as spectral band replication (SBR) metadata. An audio decoder configured to determine a reconstructed frame of an audio signal from an access unit of a received data stream is described. The access unit comprises waveform data and metadata, wherein the waveform data and the metadata are associated with the same reconstructed frame of the audio signal. The audio decoder comprises a waveform processing path configured to generate a plurality of waveform subband signals from the waveform data, and a metadata processing path configured to generate decoded metadata from the metadata.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: October 20, 2020
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Heiko Purnhagen, Jens Popp
  • Patent number: 10628710
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: April 21, 2020
    Assignee: Google LLC
    Inventors: Sergey Ioffe, Corinna Cortes
  • Patent number: 10607101
    Abstract: Images in bitonal formats often include watermarks, stamps, or other patterns and artifacts. These patterned artifacts may be represented as a series of geometric points, dots, and/or dashes in the general shape of the original pattern. These patterned artifacts make other processes such as optical character recognition (OCR) difficult or impossible when items or pixels of interest are also found within the pattern of such artifact(s). Current patterned artifact removal solutions use methods of erosion to minimize the unwanted patterned artifact. However, such methods also erode the pixels/items of interest which, in turn, cause failures in other processes, such as OCR, that are desired to be carried out on or with the image.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: March 31, 2020
    Assignee: Revenue Management Solutions, LLC
    Inventor: Jay Wagner
  • Patent number: 10599907
    Abstract: A display panel for fingerprint recognition and a display device are disclosed. The display panel for fingerprint recognition includes: a driving circuit backboard; a plurality of electroluminescent units disposed on the driving circuit backboard in a form of array; a plurality of infrared luminescent units disposed on the driving circuit backboard in a form of array; a protection cover plate; and a plurality of infrared photosensitive induction units disposed in a form of array between the protection cover plate and a film on which the infrared luminescent units are located.
    Type: Grant
    Filed: February 10, 2017
    Date of Patent: March 24, 2020
    Assignee: BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Rui Xu, Xue Dong, Jing Lv, Haisheng Wang, Yingming Liu, Xiaoliang Ding, Changfeng Li, Yannan Jia, Lijun Zhao, Yuzhen Guo, Pengpeng Wang, Yanling Han, Wei Liu, Chun-Wei Wu
  • Patent number: 10600182
    Abstract: An image processing apparatus includes an image obtaining unit configured to obtain a time-sequential image obtained in order of a start image, a plurality of intermediate images, and an end image, an image selecting unit configured to select a plurality of combinations of images including the start image, at least one of the plurality of intermediate images, and the end image, and a conformable image determining unit configured to determine a conformable combination from among the plurality of combinations based on image quality of the images included in each of the plurality of combinations and a similarity between the images included in each of the plurality of combinations.
    Type: Grant
    Filed: January 23, 2018
    Date of Patent: March 24, 2020
    Assignee: Canon Kabushiki Kaisha
    Inventor: Keita Nakagomi
  • Patent number: 10595062
    Abstract: A method of encapsulating an encoded bitstream representing one or more images, the encapsulated bitstream comprising a data part and a metadata part. The method including providing image item information identifying a portion of the data part representing a sub-image or an image of a single image; providing image description information comprising parameters including display parameters and/or transformation operators relating to one or more images and outputting said bitstream together with said provided information as an encapsulated data file, wherein the image description information is stored in the metadata part.
    Type: Grant
    Filed: February 9, 2016
    Date of Patent: March 17, 2020
    Assignee: Canon Kabushiki Kaisha
    Inventors: Frédéric Maze, Franck Denoual, Cyril Concolato, Jean Le Feuvre
  • Patent number: 10579868
    Abstract: A system for recognition of objects from ink elements on a computing device is provided. The computing device comprises a processor, a memory and at least one non-transitory computer readable medium for recognizing content under control of the processor. The at least one non-transitory computer readable medium is configured to determine a perimeter of an ink element stored in a memory of the computing device, determine a plurality of pen units for the ink element based on the determined ink element perimeter, determine at least one stroke representing a path through two or more of the pen units, and cause recognition of one or more objects represented by the ink element using the determined at least one stroke.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: March 3, 2020
    Assignee: MYSCRIPT
    Inventors: Thibault Lelore, Udit Roy
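
Illustrative note on patent 11132509 (listed above): its abstract describes evaluating candidate domains one at a time, in order of domain classifier score, and stopping as soon as a domain-specific NLU model returns a result whose confidence satisfies a threshold. The following is a minimal sketch of that control flow only, not the patented implementation. The function name `evaluate_domains`, the `NluModel` type, the `(result, confidence)` return shape, and the default threshold of 0.8 are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical model type: maps an utterance to (NLU result, confidence score).
NluModel = Callable[[str], Tuple[str, float]]


def evaluate_domains(
    utterance: str,
    domain_scores: Dict[str, float],
    models: Dict[str, NluModel],
    threshold: float = 0.8,  # assumed value; the abstract names no number
) -> Optional[Tuple[str, str, float]]:
    """Run domain NLU models in descending classifier-score order.

    Returns (domain, result, confidence) for the first model whose
    confidence meets the threshold, or None if no model does.
    """
    # Highest-scoring candidate domain is evaluated first.
    ordered = sorted(domain_scores, key=domain_scores.get, reverse=True)
    for domain in ordered:
        result, confidence = models[domain](utterance)
        if confidence >= threshold:
            # Early exit: remaining candidate domains are never evaluated.
            return domain, result, confidence
    return None
```

In this sketch the saving comes from the early return: models for lower-ranked domains are invoked only when every higher-ranked domain failed to meet the threshold.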