Pitch Patents (Class 704/207)
  • Patent number: 11038787
    Abstract: In accordance with an example embodiment of the present invention, disclosed is a method and an apparatus thereof for selecting a packet loss concealment procedure for a lost audio frame of a received audio signal. A method for selecting a packet loss concealment procedure comprises detecting an audio type of a received audio frame and determining a packet loss concealment procedure based on the audio type. In the method, detecting an audio type comprises determining a stability of a spectral envelope of signals of received audio frames.
    Type: Grant
    Filed: October 1, 2019
    Date of Patent: June 15, 2021
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventor: Stefan Bruhn
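The abstract above hinges on one measurable quantity: the stability of the spectral envelope across received frames. The following is a minimal sketch of that idea only; the band split, the log-energy envelope, the threshold, and all function names are assumptions for illustration, not the patented algorithm.

```python
import numpy as np

def spectral_envelope_stability(frames, n_bands=8):
    """Score how stable per-band log energies are across recent frames.

    frames: 2D array (n_frames, frame_len) of time-domain audio frames.
    Returns a score in (0, 1]; higher means a more stationary spectral
    envelope (music-like), lower means speech-like variation.
    """
    envs = []
    for frame in frames:
        spec = np.abs(np.fft.rfft(frame))
        bands = np.array_split(spec, n_bands)
        envs.append([np.log10(np.sum(b ** 2) + 1e-12) for b in bands])
    envs = np.asarray(envs)
    # Mean absolute frame-to-frame envelope change, squashed into (0, 1].
    delta = np.mean(np.abs(np.diff(envs, axis=0)))
    return 1.0 / (1.0 + delta)

def select_plc_procedure(frames, threshold=0.5):
    """Pick a concealment procedure from the detected audio type."""
    if spectral_envelope_stability(frames) > threshold:
        return "music_plc"   # stable envelope: frame-repetition style PLC
    return "speech_plc"      # unstable envelope: pitch-based speech PLC
```

A steady tone yields a stability score near 1 and routes to the music-style procedure; frames whose envelope jumps between loud and silent route to the speech-style procedure.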
  • Patent number: 11011160
    Abstract: A computerized system for transforming recorded speech into a derived expression of intent includes: (1) a text classification module comparing a transcription of at least a portion of recorded speech against a text classifier to generate a first set of one or more representations of potential intents based upon such comparison; (2) a phonetics classification module comparing a phonetic transcription of at least a portion of the recorded speech against a phonetics classifier to generate a second set of one or more representations of potential intents based upon such comparison; (3) an audio classification module comparing an audio version of at least a portion of the recorded speech with an audio classifier to generate a third set of one or more representations of potential intents based upon such comparison; and (4) a discriminator module for receiving the first, second and third sets of the one or more representations of potential intents and generating at least
    Type: Grant
    Filed: January 19, 2018
    Date of Patent: May 18, 2021
    Assignee: OPEN WATER DEVELOPMENT LLC
    Inventor: Moshe Villaizan
  • Patent number: 10984813
    Abstract: A method and an apparatus for detecting correctness of a pitch period, where the method for detecting correctness of a pitch period includes determining, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, where the initial pitch period is obtained by performing open-loop detection on the input signal, determining, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal, and determining correctness of the initial pitch period according to the pitch period correctness decision parameter. Hence, the method and apparatus for detecting correctness of the pitch period improve, based on a relatively less complex algorithm, accuracy of detecting correctness of the pitch period.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: April 20, 2021
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Fengyan Qi, Lei Miao
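The Huawei method maps a time-domain pitch period to a frequency-domain bin and evaluates a decision parameter there. A hedged illustration of that mapping follows; the decision parameter used here (local peak height relative to the strongest spectral line) is an invented stand-in, since the abstract does not disclose the actual parameter.

```python
import numpy as np

def pitch_period_is_plausible(signal, t0, fs=8000, n_fft=1024, tol_bins=2):
    """Check an open-loop pitch period t0 (in samples) against the spectrum.

    A period of t0 samples corresponds to a pitch frequency of fs/t0 Hz,
    which maps to FFT bin (fs/t0) * n_fft / fs = n_fft / t0.  If the
    amplitude spectrum has a strong line near that bin, accept the period.
    """
    spec = np.abs(np.fft.rfft(signal, n_fft))
    pitch_bin = int(round(n_fft / t0))
    lo = max(pitch_bin - tol_bins, 0)
    hi = min(pitch_bin + tol_bins + 1, len(spec))
    local_peak = spec[lo:hi].max()
    # Illustrative correctness decision parameter: peak ratio >= 0.5.
    return bool(local_peak >= 0.5 * spec[1:].max())
```

For a 200 Hz tone sampled at 8 kHz, the true period of 40 samples lands on a strong spectral line and passes, while a spurious open-loop estimate such as 17 samples lands on leakage-level energy and fails.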
  • Patent number: 10971191
    Abstract: A generally diverse set of audiovisual clips is sourced from one or more repositories for use in preparing a coordinated audiovisual work. In some cases, audiovisual clips are retrieved using tags such as user-assigned hashtags or metadata. Pre-existing associations of such tags can be used as hints that certain audiovisual clips are likely to share correspondence with an audio signal encoding of a particular song or other audio baseline. Clips are evaluated for computationally determined correspondence with an audio baseline track. In general, comparisons of audio power spectra, of rhythmic features, tempo, pitch sequences and other extracted audio features may be used to establish correspondence. For clips exhibiting a desired level of correspondence, computationally determined temporal alignments of individual clips with the baseline audio track are used to prepare a coordinated audiovisual work that mixes the selected audiovisual clips with the audio track.
    Type: Grant
    Filed: June 15, 2015
    Date of Patent: April 6, 2021
    Inventors: Mark T. Godfrey, Turner Evan Kirk, Ian S. Simon, Nick Kruge
  • Patent number: 10938366
    Abstract: A volume level meter has a housing that is mounted on a microphone and connected to a pop filter positioned in front of a vocalist, adjacent to the microphone. The display faces the vocalist and is arranged on the housing so that it indicates the volume level of audio signals received from the microphone. The vocalist can see the indicators on the display and know the volume level of the audio signal from the microphone, which allows the vocalist to monitor the indicators and control their vocal volume accordingly. In this way, the vocalist can reduce fluctuations in vocal volume that might otherwise distort the audio signal.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: March 2, 2021
    Inventors: Joseph N Griffin, Corey D Chapman
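The core computation behind such a meter is mapping a block of samples to a number of lit indicator segments. A hypothetical sketch (the dB range, segment count, and function name are all assumptions):

```python
import math

def level_indicator(samples, segments=10):
    """Map a block of samples in [-1, 1] to a number of lit display segments."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0
    db = 20 * math.log10(rms) if rms > 0 else -96.0  # dBFS, floored for silence
    # Map -60 dBFS .. 0 dBFS linearly onto 0 .. segments lit indicators.
    lit = int(round((max(db, -60.0) + 60.0) / 60.0 * segments))
    return min(max(lit, 0), segments)
```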
  • Patent number: 10924193
    Abstract: Embodiments include techniques for transmitting and receiving radio frequency (RF) signals, where the techniques include generating, via a digital-to-analog converter (DAC), a frequency signal, and filtering the frequency signal to produce a first filtered signal and a second filtered signal. The techniques also include transmitting the second filtered signal to a device under test, and filtering the second filtered signal into a sub-signal having one or more components. The techniques include mixing the first filtered signal with the sub-signal to produce a first mixed signal, subsequently mixing the first mixed signal with an output signal received from the device under test to produce a second mixed signal, and converting the second mixed signal for analysis.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: February 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Mohit Kapur, Muir Kumph
  • Patent number: 10902841
    Abstract: Systems, methods, and computer program products for customizing and delivering contextually relevant, artificially synthesized, voiced content that is targeted toward the individual behaviors, viewing habits, experiences, and preferences of each user accessing the content of a content provider. A network-accessible profile service collects and analyzes user profile data and recommends contextually applicable voices based on the user's profile data. As a user inputs a request to access voiced content maintained by a content provider, or otherwise triggers such content, the voiced content delivered to the user is a modified version comprising artificially synthesized human speech that delivers the dialogue of the voiced content in a manner imitating the sounds and speech patterns of the recommended voice.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Su Liu, Eric J. Rozner, Inseok Hwang, Chungkuk Yoo
  • Patent number: 10878800
    Abstract: According to one aspect of the present disclosure, a computer-implemented method for changing a voice interacting with a user can be provided. Identity information for a user can be received and analyzed to identify the user. Voice change information for the user, indicating changes that help the user understand the voice, can be retrieved. A change can then be made to the voice based on the retrieved information, and the changed voice can be provided to the user.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: December 29, 2020
    Assignee: Capital One Services, LLC
    Inventors: Anh Truong, Mark Watson, Jeremy Goodsitt, Vincent Pham, Fardin Abdi Taghi Abad, Kate Key, Austin Walters, Reza Farivar
  • Patent number: 10867525
    Abstract: Computer-implemented systems and methods are provided for automatically generating recitation items. For example, a computer performing the recitation item generation can receive one or more text sets that each includes one or more texts. The computer can determine a value for each text set using one or more metrics, such as a vocabulary difficulty metric, a syntactic complexity metric, a phoneme distribution metric, a phonetic difficulty metric, and a prosody distribution metric. Then the computer can select a final text set based on the value associated with each text set. The selected final text set can be used as the recitation items for a speaking assessment test.
    Type: Grant
    Filed: February 13, 2018
    Date of Patent: December 15, 2020
    Assignee: Educational Testing Service
    Inventors: Su-Youn Yoon, Lei Chen, Keelan Evanini, Klaus Zechner
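The selection step described above (score each candidate text set with several metrics, keep the best) can be sketched generically. The metric functions and weighting scheme here are placeholders; the patent names specific metrics (vocabulary difficulty, syntactic complexity, phoneme distribution, etc.) whose definitions are not given in the abstract.

```python
def select_recitation_set(text_sets, metrics, weights=None):
    """Score each candidate text set with several metrics and keep the best.

    text_sets: list of text sets, each a list of strings.
    metrics:   list of functions, each mapping a text set to a score.
    weights:   optional per-metric weights (defaults to equal weighting).
    """
    weights = weights or [1.0] * len(metrics)

    def value(text_set):
        return sum(w * m(text_set) for w, m in zip(weights, metrics))

    return max(text_sets, key=value)
```

With a single toy metric favoring longer sets, the longer candidate wins; a real deployment would plug in the difficulty and distribution metrics the abstract lists.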
  • Patent number: 10861482
    Abstract: Temporal regions of a time-based media program that contain spoken dialog in a language that is dubbed from a primary language are identified automatically. A primary language audio track of the media program is compared with an alternate language audio track. Closely similar regions are assumed not to contain dubbed dialog, while the temporal inverse of the similar regions are candidate regions for containing dubbed speech. The candidate regions are provided to a dub validator to facilitate locating each region to be validated without having to play back or search the entire time-based media program. Corresponding regions of the primary and alternate language tracks that are closely similar and that contain voice activity are candidate regions of forced narrative, and the temporal locations of these regions may be used by a validator to facilitate rapid validation of forced narrative in the program.
    Type: Grant
    Filed: October 12, 2018
    Date of Patent: December 8, 2020
    Assignee: Avid Technology, Inc.
    Inventors: Jacob B. Garland, Vedantha G. Hothur
  • Patent number: 10854217
    Abstract: A wind noise filtering device includes a mixer, an extraction unit, a decision unit, a wind noise filter and an output module. The mixer receives a source sound and outputs an input audio. The extraction unit is electrically connected to the mixer to receive the input audio, the extraction unit performs feature extraction on the input audio to generate a plurality of feature data. The decision unit is electrically connected to the extraction unit to receive the feature data, the decision unit outputs a decision signal according to the plurality of feature data. The wind noise filter is electrically connected to the decision unit to receive the decision signal and is controlled to be turned on or off by the decision signal. The output module is electrically connected to the wind noise filter and the mixer to output an output audio according to the input audio or the filtered audio.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: December 1, 2020
    Assignee: COMPAL ELECTRONICS, INC.
    Inventor: Chung-Han Lin
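The decision unit described above toggles the wind noise filter from extracted features. A common cue, used here purely as an illustrative assumption, is that wind noise concentrates energy below a few hundred hertz; the specific features and thresholds in the patent are not disclosed in the abstract.

```python
import numpy as np

def wind_decision(block, fs=16000, cutoff_hz=300.0, ratio_thresh=0.8):
    """Decide whether the wind-noise filter should be on for this block.

    Flags the block when sub-cutoff energy dominates the total energy.
    """
    spec = np.abs(np.fft.rfft(block)) ** 2
    freqs = np.fft.rfftfreq(len(block), d=1.0 / fs)
    low = spec[freqs < cutoff_hz].sum()
    return bool(low / (spec.sum() + 1e-12) > ratio_thresh)

def process_block(block, apply_filter):
    """Route the block through the filter only when the decision says so."""
    return apply_filter(block) if wind_decision(block) else block
```

A 50 Hz rumble trips the decision and is filtered; a 2 kHz tone passes through unmodified, matching the output-module behavior of emitting either the input audio or the filtered audio.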
  • Patent number: 10847179
    Abstract: The present disclosure provides a method, an apparatus and a device for recognizing voice endpoints. In the method of the present disclosure, a start point recognition model and a finish point recognition model are obtained by training a recurrent neural network with a start point training set and a finish point training set, respectively. A voice start point frame among the audio frames is recognized according to the acoustic features of the audio frames and the start point recognition model, thereby keeping the accuracy of start point frame recognition as high as possible without affecting the delay time of finish point frame recognition; and a voice finish point frame among the audio frames is recognized according to the acoustic features of the audio frames and the finish point recognition model.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: November 24, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Chao Li, Weixin Zhu
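Once the two models emit per-frame scores, turning them into endpoint frames is a small amount of glue logic. A hedged sketch, assuming each model outputs a per-frame probability and using an invented hangover scheme to confirm the finish point:

```python
def find_voice_endpoints(start_scores, finish_scores, th=0.6, hang=3):
    """Locate start/finish frames from two per-frame model score streams.

    start_scores[i]:  start-model probability that frame i begins speech.
    finish_scores[i]: finish-model probability that frame i ends speech.
    Returns (start_frame, finish_frame); either may be None if not found.
    The finish point is confirmed only after `hang` consecutive high
    finish scores, trading a little delay for robustness.
    """
    start = next((i for i, p in enumerate(start_scores) if p >= th), None)
    if start is None:
        return None, None
    run = 0
    for i in range(start + 1, len(finish_scores)):
        run = run + 1 if finish_scores[i] >= th else 0
        if run >= hang:
            return start, i - hang + 1   # first frame of the confirmed run
    return start, None
```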
  • Patent number: 10838881
    Abstract: A device management server computer (“server”) is programmed to manage a plurality of input devices and output devices in a physical room. The server is programmed to analyze media data capturing actions performed by a user in real time as a participant in the physical room, determine from the analysis how the user would like to connect at least one of the input devices and one of the output devices, and enable the connection. The server is programmed to interpret the actions and derive commands for connecting two or more devices based on predetermined data regarding the input devices and output devices and rules for referring to and connecting these devices.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: November 17, 2020
    Assignee: XIO RESEARCH, INC.
    Inventors: Aditya Vempaty, Robert Smith, Shom Ponoth, Sharad Sundararajan, Ravindranath Kokku, Robert Hutter, Satya Nitta
  • Patent number: 10818311
    Abstract: An auditory selection method based on a memory and attention model, including: step S1, encoding an original speech signal into a time-frequency matrix; step S2, encoding and transforming the time-frequency matrix to convert the matrix into a speech vector; step S3, using a long-term memory unit to store a speaker and a speech vector corresponding to the speaker; step S4, obtaining a speech vector corresponding to a target speaker, and separating a target speech from the original speech signal through an attention selection model. A storage device includes a plurality of programs stored in the storage device. The plurality of programs are configured to be loaded by a processor and execute the auditory selection method based on the memory and attention model. A processing unit includes the processor and the storage device.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: October 27, 2020
    Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jiaming Xu, Jing Shi, Bo Xu
  • Patent number: 10818308
    Abstract: Systems, devices, media, and methods are presented for converting sounds in an audio stream. The systems and methods receive an audio conversion request initiating conversion of one or more sound characteristics of an audio stream from a first state to a second state. The systems and methods access an audio conversion model associated with an audio signature for the second state. The audio stream is converted based on the audio conversion model and an audio construct is compiled from the converted audio stream and a base audio segment. The compiled audio construct is presented at a client device.
    Type: Grant
    Filed: April 27, 2018
    Date of Patent: October 27, 2020
    Assignee: Snap Inc.
    Inventor: Wei Chu
  • Patent number: 10789938
    Abstract: A speech synthesis method and device. The method comprises: determining the language types of a statement to be synthesized; determining base models, comprising spectrum parameter models and fundamental frequency parameter models, that correspond to the language types; determining a target timbre, performing adaptive transformation on the spectrum parameter models based on the target timbre, and training the statement to be synthesized based on the adapted spectrum parameter models to generate spectrum parameters; training the statement to be synthesized based on the fundamental frequency parameter models to generate fundamental frequency parameters, and adjusting the fundamental frequency parameters based on the target timbre; and synthesizing the statement into a target speech based on the spectrum parameters and the adjusted fundamental frequency parameters.
    Type: Grant
    Filed: September 5, 2016
    Date of Patent: September 29, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
    Inventors: Hao Li, Yongguo Kang
  • Patent number: 10777215
    Abstract: A method and system for enhancing a speech signal is provided herein. The method may include the following steps: obtaining an original video, wherein the original video includes a sequence of original input images showing a face of at least one human speaker, and an original soundtrack synchronized with said sequence of images; and processing, using a computer processor, the original video, to yield an enhanced speech signal of said at least one human speaker, by detecting sounds that are acoustically unrelated to the speech of the at least one human speaker, based on visual data derived from the sequence of original input images.
    Type: Grant
    Filed: November 11, 2019
    Date of Patent: September 15, 2020
    Assignee: Yissum Research Development Company of The Hebrew University of Jerusalem Ltd.
    Inventors: Shmuel Peleg, Asaph Shamir, Tavi Halperin, Aviv Gabbay, Ariel Ephrat
  • Patent number: 10762907
    Abstract: An apparatus for improving a transition from a concealed audio signal portion is provided. The apparatus includes a processor configured to generate a decoded audio signal portion of the audio signal. The processor is configured to generate the decoded audio signal portion using the first sub-portion of the first audio signal portion and using the second audio signal portion or a second sub-portion of the second audio signal portion, such that for each sample of two or more samples of the second audio signal portion, the sample position of that sample is equal to the sample position of one of the samples of the decoded audio signal portion.
    Type: Grant
    Filed: July 27, 2018
    Date of Patent: September 1, 2020
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Adrian Tomasek, Jérémie LeComte
  • Patent number: 10755718
    Abstract: A method for classifying speakers includes: receiving, by a speaker recognition system including a processor and memory, input audio including speech from a speaker; extracting, by the speaker recognition system, a plurality of speech frames containing voiced speech from the input audio; computing, by the speaker recognition system, a plurality of features for each of the speech frames of the input audio; computing, by the speaker recognition system, a plurality of recognition scores for the plurality of features; computing, by the speaker recognition system, a speaker classification result in accordance with the recognition scores; and outputting, by the speaker recognition system, the speaker classification result.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: August 25, 2020
    Inventors: Zhenhao Ge, Ananth N. Iyer, Srinath Cheluvaraja, Ram Sundaram, Aravind Ganapathiraju
  • Patent number: 10715173
    Abstract: A method for partitioning input vectors for coding is presented. The method comprises obtaining an input vector. The input vector is segmented, in a non-recursive manner, into an integer number, NSEG, of input vector segments. A representation of a respective relative energy difference between the parts of the input vector on each side of each boundary between the input vector segments is determined, in a recursive manner. The input vector segments and the representations of the relative energy differences are provided for individual coding. Partitioning units and computer programs for partitioning of input vectors for coding, as well as positional encoders, are presented.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: July 14, 2020
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventors: Tomas Jansson Toftgård, Volodya Grancharov, Jonas Svedberg
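The two ingredients named in the abstract, a non-recursive equal split into NSEG segments and a per-boundary relative energy difference, can be sketched as follows. The log-ratio representation of the energy difference is an assumption; the patent's actual representation (and its recursive computation order) is not given in the abstract.

```python
import numpy as np

def partition_vector(x, nseg):
    """Split x into nseg contiguous, near-equal segments and compute a
    relative energy difference at each internal segment boundary.

    Returns (segments, diffs) where diffs[k] is half the log2 ratio of
    total energy left vs. right of boundary k (illustrative choice).
    """
    arr = np.asarray(x, dtype=float)
    segs = np.array_split(arr, nseg)                  # non-recursive split
    bounds = np.cumsum([len(s) for s in segs])[:-1]   # internal boundaries
    diffs = []
    for b in bounds:
        e_left = np.sum(arr[:b] ** 2) + 1e-12
        e_right = np.sum(arr[b:] ** 2) + 1e-12
        diffs.append(0.5 * np.log2(e_left / e_right))
    return segs, diffs
```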
  • Patent number: 10684683
    Abstract: Technologies for natural language interactions with virtual personal assistant systems include a computing device configured to capture audio input, distort the audio input to produce a number of distorted audio variants, and perform speech recognition on the audio input and the distorted audio variants. The computing device selects a result from a large number of potential speech recognition results based on contextual information. The computing device may measure a user's engagement level by using an eye tracking sensor to determine whether the user is visually focused on an avatar rendered by the virtual personal assistant. The avatar may be rendered in a disengaged state, a ready state, or an engaged state based on the user engagement level. The avatar may be rendered as semitransparent in the disengaged state, and the transparency may be reduced in the ready state or the engaged state. Other embodiments are described and claimed.
    Type: Grant
    Filed: January 25, 2019
    Date of Patent: June 16, 2020
    Assignee: Intel Corporation
    Inventor: William C. Deleeuw
  • Patent number: 10686465
    Abstract: An improved mixed oscillator-and-external-excitation model, together with methods for estimating the model parameters, for evaluating model quality, and for combining it with methods known in the art, is disclosed. The improvement over existing oscillators allows the model to receive, as an input, all except the most recent point in the acquired data. Model stability is achieved through a process that includes restoring data unavailable to the decoder from the optimal model parameters and using metrics to select a stable restored model output. The present invention is effective for very low bit-rate coding/compression and decoding/decompression of digital signals, including digitized speech, audio, and image data, and for analysis, detection, and classification of signals. Operations can be performed in real time, and parameterization can be achieved at a user-specified level of compression.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: June 16, 2020
    Assignee: Luce Communications
    Inventors: Irina Gorodnitsky, Anton Yen
  • Patent number: 10679632
    Abstract: An apparatus for decoding an audio signal includes a receiving interface, wherein the receiving interface is configured to receive a first frame and a second frame. Moreover, the apparatus includes a noise level tracing unit for determining noise level information being represented in a tracing domain. Furthermore, the apparatus includes a first reconstruction unit for reconstructing a third audio signal portion of the audio signal depending on the noise level information and a second reconstruction unit for reconstructing a fourth audio signal portion depending on noise level information being represented in the second reconstruction domain.
    Type: Grant
    Filed: January 24, 2018
    Date of Patent: June 9, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Michael Schnabel, Goran Markovic, Ralph Sperschneider, Jérémie Lecomte, Christian Helmrich
  • Patent number: 10679256
    Abstract: A content server uses a form of artificial intelligence such as machine learning to identify audio content with musicological characteristics. The content server obtains an indication of a music item presented by a client device and obtains reference music features describing musicological characteristics of the music item. The content server identifies candidate audio content associated with candidate music features. The candidate music features are determined by analyzing acoustic features of the candidate audio content and mapping the acoustic features to music features according to a music feature model. Acoustic features quantify low-level properties of the candidate audio content. One of the candidate audio content items is selected according to comparisons between the candidate music features of the candidate audio advertisements and the reference music features of the music item. The selected audio content is provided to the client device for presentation.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: June 9, 2020
    Assignee: Pandora Media, LLC
    Inventors: Christopher Irwin, Shriram Bharath, Andrew J. Asman
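The final comparison step, selecting the candidate whose music features best match the reference features, reduces to a nearest-neighbor search in feature space. A sketch using cosine similarity, which is an assumption; the abstract does not specify the comparison function.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_content(reference_features, candidates):
    """candidates: list of (content_id, feature_vector) pairs.
    Returns the id of the candidate most similar to the reference."""
    return max(candidates, key=lambda c: cosine(reference_features, c[1]))[0]
```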
  • Patent number: 10679644
    Abstract: A method, a system, and a computer program product are provided for interpreting low amplitude speech and transmitting amplified speech to a remote communication device. At least one computing device receives sensor data from multiple sensors. The sensor data is associated with the low amplitude speech. At least one of the at least one computing device analyzes the sensor data to map the sensor data to at least one syllable resulting in a string of one or more words. An electronic representation of the string of the one or more words may be generated and transmitted to a remote communication device for producing the amplified speech from the electronic representation.
    Type: Grant
    Filed: August 15, 2019
    Date of Patent: June 9, 2020
    Assignee: International Business Machines Corporation
    Inventors: Sarbajit K. Rakshit, Martin G. Keen, James E. Bostick, John M. Ganci, Jr.
  • Patent number: 10665253
    Abstract: Voice activity detection (VAD) is an enabling technology for a variety of speech-based applications. Herein disclosed is a robust VAD algorithm that is also language independent. Rather than classifying short segments of the audio as either “speech” or “silence”, the VAD as disclosed herein employs a soft-decision mechanism: it outputs a speech-presence probability, which is based on a variety of characteristics.
    Type: Grant
    Filed: April 23, 2018
    Date of Patent: May 26, 2020
    Assignee: VERINT SYSTEMS LTD.
    Inventor: Ron Wein
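The distinguishing feature here is the soft decision: a probability per segment instead of a hard speech/silence label. A minimal sketch of that output shape, using only frame energy squashed through a logistic function; the real system combines many characteristics, and the noise floor and slope constants below are assumptions.

```python
import math

def speech_presence_probability(frame, noise_rms=0.01, slope=6.0):
    """Soft-decision VAD: map an SNR estimate to a probability in (0, 1).

    frame:     iterable of samples in [-1, 1].
    noise_rms: assumed background noise level (would be tracked online).
    """
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    snr_db = 20 * math.log10((rms + 1e-12) / noise_rms)
    # Logistic squashing: 0 dB SNR -> 0.5, higher SNR -> closer to 1.
    return 1.0 / (1.0 + math.exp(-snr_db / slope))
```

Downstream consumers can then threshold, smooth, or accumulate these probabilities per application instead of inheriting a hard binary decision.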
  • Patent number: 10650837
    Abstract: Network communication speech handling systems are provided herein. In one example, a method of processing audio signals by a network communications handling node is provided. The method includes processing an audio signal to determine a pitch cycle property associated with the audio signal, determining transfer times for encoded segments of the audio signal based at least in part on the pitch cycle property, and transferring packets comprising one or more encoded segments for delivery to a target node in accordance with the transfer time.
    Type: Grant
    Filed: August 29, 2017
    Date of Patent: May 12, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Karsten Vandborg Sørensen, Sriram Srinivasan, Koen Bernard Vos
  • Patent number: 10643631
    Abstract: The present invention reduces encoding distortion in frequency domain encoding compared to conventional techniques, and obtains LSP parameters that correspond to quantized LSP parameters for the preceding frame and are to be used in time domain encoding from coefficients equivalent to linear prediction coefficients resulting from frequency domain encoding. When p is an integer equal to or greater than 1, a linear prediction coefficient sequence which is obtained by linear prediction analysis of audio signals in a predetermined time segment is represented as a[1], a[2], . . . , a[p], and ω[1], ω[2], . . . , ω[p] are a frequency domain parameter sequence derived from the linear prediction coefficient sequence a[1], a[2], . . . , a[p]. An LSP linear transformation unit (300) determines the value of each converted frequency domain parameter ˜ω[i] (i=1, 2, . . . , p) in a converted frequency domain parameter sequence ˜ω[1], ˜ω[2], . . . , ˜ω[p] using the frequency domain parameter sequence ω[1], ω[2], . . .
    Type: Grant
    Filed: October 15, 2019
    Date of Patent: May 5, 2020
    Assignees: Nippon Telegraph and Telephone Corporation, The University of Tokyo
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Ryosuke Sugiura
  • Patent number: 10607167
    Abstract: Systems and methods are shown for routing task objects to multiple agents. They involve receiving and storing real-time sensor data for multiple agents and, for each received task, creating a task object representing the task and placing it in an input buffer. For each task object in the input buffer, the system utilizes the real-time sensor data to identify one or more of the multiple agents as suitable for assignment to the task, applies a routing strategy to further select one of the identified agents based on the sensor data, and routes the task object from the input buffer to a workbin corresponding to the selected agent.
    Type: Grant
    Filed: October 13, 2015
    Date of Patent: March 31, 2020
    Inventors: Herbert Willi Artur Ristock, Adrian Lee-Kwen, David Beilis, Christopher Connolly, Liyuan Qiao, Merijn te Booij, James Kraeutler
  • Patent number: 10580434
    Abstract: An information presentation apparatus includes an acquisition unit and a presentation unit. The acquisition unit acquires activity information on activities of people in a group including multiple people having a conversation about a specific theme. The presentation unit presents an advice regarding progress of the conversation in accordance with a situation of the conversation defined based on the activity information acquired by the acquisition unit.
    Type: Grant
    Filed: September 4, 2018
    Date of Patent: March 3, 2020
    Assignee: FUJI XEROX CO., LTD.
    Inventors: Kiichiro Arikawa, Daisuke Yasuoka
  • Patent number: 10560410
    Abstract: A method, system, and software for communicating between a sender and a recipient via a personalized message. The communication can be in the form of a customized audio, video, or multimedia content. The content may be found in a database or uploaded by a user. Further, the content may be edited or customized in a variety of manners. The communications can take place between two users or among a number of users.
    Type: Grant
    Filed: May 24, 2019
    Date of Patent: February 11, 2020
    Assignee: Audiobyte LLC
    Inventors: Scott Guthery, Richard Van Den Bosch, Brian Vo, Nolan Leung, Andrew Blacker, John Van Suchtelen
  • Patent number: 10535356
    Abstract: An apparatus for encoding a multi-channel signal having at least two channels is provided. The apparatus includes a time-spectral converter, converting sequences of blocks of sample values of the two channels into a frequency domain representation having sequences of blocks of spectral values for the two channels, a block of sampling values having an associated input sampling rate, a block of spectral values of the sequences of blocks that has spectral values up to a maximum input frequency related to the input sampling rate; a multi-channel processor to obtain a result sequence of blocks of spectral values having information related to the two channels; a spectral domain resampler to obtain a resampled sequence of blocks of spectral values; a spectral-time converter for converting the resampled sequence of blocks into a time domain representation; and a core encoder for encoding the output sequence of blocks to obtain an encoded multi-channel signal.
    Type: Grant
    Filed: November 22, 2017
    Date of Patent: January 14, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Guillaume Fuchs, Emmanuel Ravelli, Markus Multrus, Markus Schnell, Stefan Doehla, Martin Dietz, Goran Markovic, Eleni Fotopoulou, Stefan Bayer, Wolfgang Jaegers
  • Patent number: 10529347
    Abstract: According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal. The method further includes generating a noise component based on a noise parameter. Finally, the method includes adjusting a phase of the high-frequency reconstructed signal and obtaining a time-domain reconstructed audio signal by combining the decoded baseband audio signal and the combined high-frequency signal to obtain a time-domain reconstructed audio signal.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: January 7, 2020
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
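The copy-up reconstruction this abstract outlines (replicating low-band subbands into the high band, adjusting the envelope, and adding a noise component) can be sketched as follows. The per-bin gain array, noise parameter, and seeded noise source are illustrative assumptions, not the patented decoder:

```python
import random

def reconstruct_highband(low_bins, num_high, gains, noise_level, seed=0):
    """Replicate consecutive low-band subband magnitudes into the high band,
    scale each copied bin to the transmitted envelope gain, and add a noise
    component controlled by a noise parameter."""
    rng = random.Random(seed)
    high = []
    for i in range(num_high):
        copied = low_bins[i % len(low_bins)]            # band replication
        shaped = copied * gains[i]                      # envelope adjustment
        shaped += noise_level * rng.uniform(-1.0, 1.0)  # noise component
        high.append(shaped)
    return high
```

With the noise parameter set to zero, the output is just the envelope-shaped copy of the low band.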
  • Patent number: 10529355
    Abstract: A method, a system, and a computer program product are provided for interpreting low amplitude speech and transmitting amplified speech to a remote communication device. At least one computing device receives sensor data from multiple sensors. The sensor data is associated with the low amplitude speech. At least one of the at least one computing device analyzes the sensor data to map the sensor data to at least one syllable resulting in a string of one or more words. An electronic representation of the string of the one or more words may be generated and transmitted to a remote communication device for producing the amplified speech from the electronic representation.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: January 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Sarbajit K. Rakshit, Martin G. Keen, James E. Bostick, John M. Ganci, Jr.
  • Patent number: 10516982
    Abstract: An example system comprising: a processing resource; and a memory resource storing machine readable instructions executable to cause the processing resource to: receive a Bluetooth Low Energy (BLE) signal transmitted from a user device; generate, from the BLE signal, a BLE moving pattern of the user device, wherein the BLE moving pattern is generated at a different entity than an entity that transmits the BLE signal; track an object carrying the user device via visual information of the object such that a visual moving pattern of the object is generated from the tracking; determine the visual moving pattern matches the BLE moving pattern; and assign, responsive to the determination, an identity obtained from the user device to the object being tracked via the visual information.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: December 24, 2019
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Yurong Jiang, Kyu Han Kim, Puneet Jain, Xiaochen Liu
  • Patent number: 10509913
    Abstract: An image forming system includes a concealment word registration unit, an ID information acquisition unit, a concealment word managing unit, and an image forming unit. The concealment word registration unit registers a concealment word associated with a user. The ID information acquisition unit obtains creator ID information and execution person ID information from a print job; the creator ID information identifies a user as a creator of a document file, and the execution person ID information identifies a user as an execution person of the image formation process. The concealment word managing unit uses the concealment words to determine a concealment region. The concealment words include the concealment word associated with the user identified by the creator ID information and the concealment word associated with the user identified by the execution person ID information. The image forming unit executes a masking process that makes the concealment region illegible.
    Type: Grant
    Filed: October 31, 2017
    Date of Patent: December 17, 2019
    Assignee: KYOCERA Document Solutions Inc.
    Inventor: Takayuki Mashimo
  • Patent number: 10504032
    Abstract: This disclosure is directed to an apparatus for intelligent matching of disparate input signals received from disparate input signal systems in a complex computing network for establishing targeted communication to a computing device associated with the intelligently matched disparate input signals.
    Type: Grant
    Filed: March 27, 2017
    Date of Patent: December 10, 2019
    Assignee: Research Now Group, LLC
    Inventors: Melanie D. Courtright, Vincent P. Derobertis, Michael D. Bigby, William C. Robinson, Greg Ellis, Heidi D. E. Wilton, John R. Rothwell, Jeremy S. Antoniuk
  • Patent number: 10504539
    Abstract: An audio processing device or method includes an audio transducer operable to receive audio input and generate an audio signal based on the audio input. The audio processing device or method also includes an audio signal processor operable to extract local features from the audio signal, such as Power-Normalized Coefficients (PNCC) of the audio signal. The audio signal processor also is operable to extract global features from the audio signal, such as chroma features and harmonicity features. A neural network is provided to determine a probability that a target audio is present in the audio signal based on the local and global features. In particular, the neural network is trained to output a value indicating whether the target audio is present and locally dominant in the audio signal.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: December 10, 2019
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta
  • Patent number: 10482892
    Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in time domain and detecting a lack of low frequency energy in the speech or audio signal in frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
    Type: Grant
    Filed: July 28, 2017
    Date of Patent: November 19, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Yang Gao, Fengyan Qi
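The time-domain side of the very short pitch detection this abstract describes, searching for pitch correlations at lags below the conventional minimum pitch limitation, can be sketched in pure Python. The 0.8 correlation threshold and the smallest-qualifying-lag rule are illustrative assumptions; the frequency-domain low-energy check the abstract also mentions is omitted for brevity:

```python
import math

def normalized_corr(x, lag):
    """Normalized autocorrelation of x at a given lag."""
    n = len(x) - lag
    num = sum(x[i] * x[i + lag] for i in range(n))
    den = math.sqrt(sum(x[i] ** 2 for i in range(n)) *
                    sum(x[i + lag] ** 2 for i in range(n)))
    return num / den if den > 0 else 0.0

def detect_short_pitch(x, min_lag, conventional_min_lag, threshold=0.8):
    """Scan lags below the conventional minimum pitch limitation and return
    the shortest lag whose normalized correlation clears the threshold."""
    for lag in range(min_lag, conventional_min_lag):
        if normalized_corr(x, lag) >= threshold:
            return lag
    return None
```

A sinusoid with period 8 samples, shorter than an assumed conventional minimum lag of 20, is detected at lag 8.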
  • Patent number: 10477313
    Abstract: An audio signal processing apparatus comprises a receiver (403) receiving an audio signal sampled at a first sampling frequency, the audio signal having a maximum frequency below half the first sampling frequency by a first frequency margin. A filter bank (405) generates subband signals for the digital audio signal using overlapping sub-filters. A first frequency shifter (407) applies a frequency shift to at least one subband of the set of subbands and a decimator (409) decimates the subband signals by a decimation factor resulting in a decimated sampling frequency being at least twice a bandwidth of each of the overlapping sub-filters. The frequency shift for a subband is arranged to shift the subband to a frequency interval being a multiple of a frequency interval from zero to half the decimated sample frequency. The subband may be individually processed and the processed subbands may subsequently be combined to generate a full band output signal.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: November 12, 2019
    Assignee: Koninklijke Philips N.V.
    Inventors: Cornelis Pieter Janse, Leonardus Cornelis Antonius Van Stuivenberg
  • Patent number: 10460744
    Abstract: Methods, systems, and media for voice communication are provided. In some embodiments, a system for voice communication is provided, the system including: a first audio sensor that captures an acoustic input; and generates a first audio signal based on the acoustic input, wherein the first audio sensor is positioned between a first surface and a second surface of a textile structure. In some embodiments, the first audio sensor is positioned in a region located between the first surface and the second surface of the textile structure. In some embodiments, the first audio sensor is positioned in a passage located between the first surface and the second surface of the textile structure.
    Type: Grant
    Filed: February 4, 2016
    Date of Patent: October 29, 2019
    Inventor: Xinxiao Zeng
  • Patent number: 10453479
    Abstract: A system-effected method for synthesizing, or recognizing, speech that includes a sequence of expressive speech utterances. The method can be computer-implemented and can include system-generating a speech signal embodying the sequence of expressive speech utterances. Other possible steps include: system-marking the speech signal with a pitch marker indicating a pitch change at or near a first zero amplitude crossing point of the speech signal following a glottal closure point, at a minimum, at a maximum or at another location; system-marking the speech signal with at least one further pitch marker; system-aligning a sequence of prosodically marked text with the pitch-marked speech signal according to the pitch markers; and system-outputting the aligned text or the aligned speech signal, respectively. Computerized systems and stored programs for implementing method embodiments of the invention are also disclosed.
    Type: Grant
    Filed: September 21, 2012
    Date of Patent: October 22, 2019
    Assignee: LESSAC TECHNOLOGIES, INC.
    Inventors: Reiner Wilhelms-Tricarico, Brian Mottershead, Rattima Nitisaroj, Michael Baumgartner, John B. Reichenbach, Gary A. Marple
  • Patent number: 10431236
    Abstract: Aspects of the present disclosure relate to dynamic pitch adjustment of inbound audio to improve speech recognition. Inbound audio may be received. Upon receiving the inbound audio, clusters of speech input may be detected within the received inbound audio. An average pitch may be detected from the inbound audio, using either subparts of the inbound audio or one or more of the detected speech clusters. A determination may be made using, among other things, the average pitch. Based on this determination, the pitch of the inbound audio may be adjusted. The adjusted input may then be passed to a speech recognition component.
    Type: Grant
    Filed: November 15, 2017
    Date of Patent: October 1, 2019
    Assignee: Sphero, Inc.
    Inventor: Carly Gloge
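The decision logic this abstract sketches, averaging pitch over detected speech clusters and adjusting the inbound audio when that average falls outside the range a recognizer handles well, can be illustrated as below. The 100-250 Hz target range and the ratio-based adjustment factor are illustrative assumptions, not values from the patent:

```python
def average_pitch(cluster_pitches):
    """Mean pitch over detected speech clusters; zero entries mark
    unvoiced clusters and are excluded."""
    voiced = [p for p in cluster_pitches if p > 0]
    return sum(voiced) / len(voiced) if voiced else 0.0

def adjustment_factor(avg_pitch, target_low=100.0, target_high=250.0):
    """Scale factor that shifts an out-of-range average pitch toward an
    assumed recognizer-friendly range (bounds are illustrative)."""
    if avg_pitch <= 0:
        return 1.0
    if avg_pitch < target_low:
        return target_low / avg_pitch
    if avg_pitch > target_high:
        return target_high / avg_pitch
    return 1.0
```

A factor of 1.0 means the audio is passed to the speech recognition component unchanged.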
  • Patent number: 10431243
    Abstract: This invention provides a signal processing apparatus for improving the speech determination accuracy in an input sound. The signal processing apparatus includes a transformer that transforms an input signal into an amplitude component signal in a frequency domain, a calculator that calculates a norm of a change in the amplitude component signal in a frequency direction, an accumulator that accumulates the norm of the change in the amplitude component signal in the frequency direction calculated by the calculator, and an analyzer that analyzes speech in the input signal in accordance with an accumulated value of the norm of the change in the amplitude component signal in the frequency direction calculated by the accumulator.
    Type: Grant
    Filed: March 27, 2014
    Date of Patent: October 1, 2019
    Assignee: NEC CORPORATION
    Inventors: Masanori Kato, Akihiko Sugiyama
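The core quantity in this abstract, the norm of the change in the amplitude spectrum along the frequency direction, accumulated across frames, is straightforward to sketch. Using the L1 norm here is an illustrative choice; the patent does not commit to a particular norm in this summary:

```python
def spectral_change_norm(amplitudes):
    """L1 norm of the frequency-direction change in an amplitude spectrum."""
    return sum(abs(amplitudes[k + 1] - amplitudes[k])
               for k in range(len(amplitudes) - 1))

def accumulate(frames):
    """Accumulate the change norm over a sequence of spectral frames;
    speech tends to produce larger accumulated values than flat noise."""
    return sum(spectral_change_norm(f) for f in frames)
```

A spectrally flat frame contributes nothing to the accumulator, which is what makes the accumulated value useful for speech determination.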
  • Patent number: 10418959
    Abstract: The noise suppression apparatus disclosed is capable of suppressing pulse noise in an input signal even in a situation in which the level of the input signal changes. The pulse noise mixed in the input signal is suppressed, and linear prediction coefficients for the input signal are derived by linear prediction analysis. A prediction residual signal is then calculated from the input signal using the linear prediction coefficients. A threshold value is calculated based on the signal level of the input signal, and the signal level of the prediction residual signal is compared with the threshold value. A limit control is performed on the prediction residual signal depending on a result of the comparison, and an output signal is generated, using the linear prediction coefficients, from the prediction residual signal having been subjected to the limit control.
    Type: Grant
    Filed: April 10, 2018
    Date of Patent: September 17, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventor: Shinichi Yuzuriha
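The chain in this abstract (linear prediction analysis, residual computation, threshold limiting) can be sketched in pure Python. The Levinson-Durbin recursion and the hard clipping used for the limit control are standard textbook choices assumed here for illustration, not the patented apparatus:

```python
def autocorr(x, order):
    """Autocorrelation values r[0..order] of a signal block."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation -> LPC coefficients
    a[0..order] with a[0] = 1."""
    a = [0.0] * (order + 1)
    a[0] = 1.0
    e = r[0]
    for m in range(1, order + 1):
        k = -sum(a[j] * r[m - j] for j in range(m)) / e
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        e *= (1 - k * k)
    return a

def residual(x, a):
    """Prediction residual: the input filtered by A(z)."""
    p = len(a) - 1
    return [sum(a[j] * x[n - j] for j in range(p + 1))
            for n in range(p, len(x))]

def limit(res, threshold):
    """Limit control: clip residual samples that exceed the threshold."""
    return [max(-threshold, min(threshold, v)) for v in res]
```

For a first-order autoregressive signal x[n] = 0.9 x[n-1], the recovered coefficient is close to -0.9, so the residual is nearly zero except where pulse noise disturbs the prediction.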
  • Patent number: 10403307
    Abstract: A pitch detection method. Such a pitch detection method may use an M-PWVT-TEO algorithm to detect a pitch value from a speech signal and apply a partial auto-correlation to the current signal with the pitch value to compensate for the delay of the pitch value. Also, the pitch detection method may apply a full auto-correlation to the speech signal where the pitch value is not detected to recover onsets of the speech signal.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: September 3, 2019
    Assignee: OmniSpeech LLC
    Inventor: Vahid Khanagha
  • Patent number: 10397687
    Abstract: Embodiments of the invention determine a speech estimate using a bone conduction sensor or accelerometer, without employing voice activity detection gating of speech estimation. Speech estimation is based either exclusively on the bone conduction signal, or is performed in combination with a microphone signal. The speech estimate is then used to condition an output signal of the microphone. There are multiple use cases for speech processing in audio devices.
    Type: Grant
    Filed: June 15, 2018
    Date of Patent: August 27, 2019
    Assignee: Cirrus Logic, Inc.
    Inventors: David Leigh Watts, Brenton Robert Steele, Thomas Ivan Harvey, Vitaliy Sapozhnykov
  • Patent number: 10376197
    Abstract: The present invention relates to a method for detecting a personality consciousness code of an individual, comprising: a) storing reference voice characteristics of different persons that represent acoustic information as expressed by human voice in a form of a time-to-frequency component relation; b) classifying the acoustic information into 12 different personality consciousness codes by using a support vector machine that analyzes said acoustic information; c) receiving data indicative of a sound energy generated by the voice of said individual; d) performing spectral analysis of said received sound energy in order to obtain voice characteristics from an electronic representation of said sound energy; and e) comparing said obtained voice characteristics with the reference voice characteristics, determining the personality consciousness code of said individual by using the support vector machine, and using the obtained voice characteristics to determine the level of consciousness.
    Type: Grant
    Filed: May 11, 2016
    Date of Patent: August 13, 2019
    Inventor: Penina Ohana Lubelchick
  • Patent number: 10373515
    Abstract: A determination regarding whether to intervene in a dialog to provide system-initiated assistive information involves monitoring a dialog between at least two participants and capturing data from a dialog environment containing at least one of the participants. The captured data represent the content of the dialog and physiological data for one or more participants. Assistive information relevant to the dialog and participants is identified, and the captured data are used to determine an intervention index of delivering the assistive information to one or more participants during the dialog. This intervention index is then used to determine whether or not to intervene in the dialog to deliver the assistive information to one or more participants.
    Type: Grant
    Filed: January 4, 2017
    Date of Patent: August 6, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Helena Morgado Corelli, Rodrigo Braga Fernandes, Rodrigo Laiola Guimaraes, Rafael Sorana de Matos, Julio Nogima, Rogerio Cesar Barbosa dos Santos Silva, Thiago Luiz de Barcelos Vital
  • Patent number: 10366709
    Abstract: A sound discriminating device capable of correctly discriminating a cry or other given sounds is provided. When a sound is input from a sound input unit, a feature amount extracting unit extracts a differential value between the amplitude of a fundamental frequency of the input sound and the amplitude of the second harmonic of the fundamental frequency as a feature amount of the input sound. A likelihood calculating unit calculates a likelihood between an acoustic model set for which a feature amount is known and the extracted feature amount. A result output unit determines whether or not the input sound is the given sound based on the result of the likelihood calculation.
    Type: Grant
    Filed: April 4, 2017
    Date of Patent: July 30, 2019
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Kazue Kaneko
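The feature this abstract extracts, the difference between the amplitude at the fundamental frequency and the amplitude at its second harmonic, can be computed with a single-bin DFT at each of the two frequencies. The naive projection below (rather than, say, a Goertzel filter) is an illustrative choice, and the likelihood comparison against an acoustic model is omitted:

```python
import math
import cmath

def bin_amplitude(x, freq, fs):
    """Amplitude of a single frequency component via a naive one-bin DFT."""
    N = len(x)
    acc = sum(x[n] * cmath.exp(-2j * math.pi * freq * n / fs)
              for n in range(N))
    return 2 * abs(acc) / N

def harmonic_diff_feature(x, f0, fs):
    """Feature amount: fundamental amplitude minus second-harmonic amplitude."""
    return bin_amplitude(x, f0, fs) - bin_amplitude(x, 2 * f0, fs)
```

For a signal with a unit-amplitude 400 Hz fundamental and a 0.25-amplitude second harmonic at an 8 kHz sampling rate, the feature is 0.75.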