Patents by Inventor Osamu Ichikawa

Osamu Ichikawa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210165214
    Abstract: A head mounted system includes an eye imaging device configured to image an eye of a wearer of the head mounted system, a periphery imaging device configured to image a periphery of the wearer, a display device configured to display, toward the wearer, a peripheral image of the periphery of the wearer imaged by the periphery imaging device, and a control device configured to adjust display of the peripheral image on the display device based on an eye image of the eye of the wearer imaged by the eye imaging device.
    Type: Application
    Filed: December 1, 2020
    Publication date: June 3, 2021
    Inventors: Hiroshi Hosokawa, Osamu Nomura, Shinichi Yamashita, Hiroshi Yoshioka, Takeshi Ichikawa
  • Patent number: 10839791
    Abstract: A method is provided for training a neural network-based (NN-based) acoustic model. The method includes receiving, by a processor, the neural network-based (NN-based) acoustic model, trained by a one-hot scheme and having an input layer, a set of middle layers, and an original output layer. At least each of the middle layers subsequent to a first one of the middle layers have trained parameters. The method further includes stacking, by the processor, a new output layer on the original output layer of the NN-based acoustic model to form a new NN-based acoustic model. The new output layer has a same size as the original output layer. The method also includes retraining, by the processor, only the new output layer and the original output layer of the new NN-based acoustic model in the one-hot scheme, with the trained parameters of middle layers subsequent to at least the first one being fixed.
    Type: Grant
    Filed: June 27, 2018
    Date of Patent: November 17, 2020
    Assignee: International Business Machines Corporation
    Inventors: Osamu Ichikawa, Takashi Fukuda
  • Patent number: 10832661
    Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information from a neural network having periodic indications and components of a frequency spectrum of the audio signal data inputted thereto. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa, Bhuvana Ramabhadran
  • Patent number: 10783882
    Abstract: Acoustic change is detected by a method including preparing a first Gaussian Mixture Model (GMM) trained with first audio data of first speech sound from a speaker at a first distance from an audio interface and a second GMM generated from the first GMM using second audio data of second speech sound from the speaker at a second distance from the audio interface; calculating a first output of the first GMM and a second output of the second GMM by inputting obtained third audio data into the first GMM and the second GMM; and transmitting a notification in response to determining at least that a difference between the first output and the second output exceeds a threshold. Each Gaussian distribution of the second GMM has a mean obtained by shifting a mean of a corresponding Gaussian distribution of the first GMM by a common channel bias.
    Type: Grant
    Filed: January 3, 2018
    Date of Patent: September 22, 2020
    Assignee: International Business Machines Corporation
    Inventors: Osamu Ichikawa, Gakuto Kurata, Takashi Fukuda
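Note on patent 10783882: the comparison described in this abstract can be sketched with a one-dimensional mixture. This is an illustrative sketch only, not the patented implementation. The function names, the use of average log-likelihood as each GMM's "output", the signed difference (second minus first), and the way the common bias is estimated from the second audio data are all assumptions made for this example.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture."""
    density = sum(
        w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )
    return math.log(density)


def average_output(data, weights, means, variances):
    """Assumed GMM 'output': mean log-likelihood over the data."""
    return sum(gmm_log_likelihood(x, weights, means, variances) for x in data) / len(data)


def estimate_common_bias(second_data, weights, means):
    """Hypothetical estimator: data mean minus the mixture mean of the first GMM."""
    mixture_mean = sum(w * m for w, m in zip(weights, means))
    return sum(second_data) / len(second_data) - mixture_mean


def shift_means(means, bias):
    """Second GMM: every component mean moved by the same channel bias."""
    return [m + bias for m in means]


def acoustic_change_detected(data, weights, means, variances, bias, threshold):
    """True when the shifted GMM explains the data better by more than threshold."""
    first = average_output(data, weights, means, variances)
    second = average_output(data, weights, shift_means(means, bias), variances)
    return (second - first) > threshold
```

With means `[0.0, 4.0]` and a bias of `2.0`, data clustered near 2 and 6 would trigger the notification, while data near 0 and 4 would not.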
  • Patent number: 10726326
    Abstract: A method for learning a neural network having a plurality of filters for extracting local features performed by a computing device is disclosed. The computing device calculates a plurality of projection parameter sets by analyzing one or more training data. The plurality of the projection parameter sets define a projection of each training data into a new space and each projection parameter set has a same size as the filters in the neural network. At least part of the plurality of the projection parameter sets is set as initial parameters of at least part of the plurality of the filters in the neural network for training.
    Type: Grant
    Filed: February 24, 2016
    Date of Patent: July 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa
  • Patent number: 10726828
    Abstract: A method, computer system, and a computer program product for generating a plurality of voice data having a particular speaking style are provided. The present invention may include preparing a plurality of original voice data corresponding to at least one word or at least one phrase. The present invention may also include attenuating a low frequency component and a high frequency component in the prepared plurality of original voice data. The present invention may then include reducing power at a beginning and an end of the prepared plurality of original voice data. The present invention may further include storing a plurality of resultant voice data obtained after the attenuating and the reducing.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: July 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa, Gakuto Kurata, Masayuki Suzuki
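Note on patent 10726828: the two signal operations named in this abstract (attenuating low and high frequency components, then reducing power at the beginning and end) might look roughly like the sketch below. The naive DFT, the bin-based band edges, the fixed gain of 0.1, and the linear fade are assumptions chosen to keep the example short; the abstract does not specify any of them.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (adequate for short signals)."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
            for k in range(n)]


def idft_real(spectrum):
    """Inverse DFT, keeping only the real part of each sample."""
    n = len(spectrum)
    return [(sum(c * cmath.exp(2j * math.pi * k * i / n)
                 for k, c in enumerate(spectrum)) / n).real
            for i in range(n)]


def attenuate_band_edges(signal, low_bin, high_bin, gain=0.1):
    """Scale components below low_bin or above high_bin by gain."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        freq_bin = min(k, n - k)  # fold negative-frequency bins onto positive ones
        if freq_bin < low_bin or freq_bin > high_bin:
            spectrum[k] *= gain
    return idft_real(spectrum)


def fade_edges(signal, fade_len):
    """Linearly ramp the amplitude up at the start and down at the end."""
    n = len(signal)
    out = list(signal)
    for i in range(fade_len):
        ramp = i / fade_len
        out[i] *= ramp
        out[n - 1 - i] *= ramp
    return out


def restyle(signal, low_bin, high_bin, fade_len, gain=0.1):
    """Apply the attenuation first, then the edge fade."""
    return fade_edges(attenuate_band_edges(signal, low_bin, high_bin, gain), fade_len)
```

The resulting samples would then be stored as the "resultant voice data"; storage is left out of the sketch.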
  • Patent number: 10586529
    Abstract: A computer-implemented method for processing a speech signal, includes: identifying speech segments in an input speech signal; calculating an upper variance and a lower variance, the upper variance being a variance of upper spectra larger than a criteria among speech spectra corresponding to frames in the speech segments, the lower variance being a variance of lower spectra smaller than a criteria among the speech spectra corresponding to the frames in the speech segments; determining whether the input speech signal is a special input speech signal using a difference between the upper variance and the lower variance; and performing speech recognition of the input speech signal which has been determined to be the special input speech signal, using a special acoustic model for the special input speech signal.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: March 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Osamu Ichikawa, Takashi Fukuda, Gakuto Kurata, Bhuvana Ramabhadran
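Note on patent 10586529: the upper and lower variance in this abstract can be illustrated on a list of per-frame spectral values. In this sketch the criterion defaults to the mean of the values and the decision is a simple threshold on the absolute difference; both choices, and the function names, are assumptions rather than details taken from the patent.

```python
from statistics import mean, pvariance


def upper_lower_variance(frame_values, criterion=None):
    """Return (upper_variance, lower_variance) for per-frame spectral values.

    frame_values: one spectral value per speech frame (e.g. log power in a band).
    criterion: split point; defaults to the mean of the values (an assumption).
    """
    if criterion is None:
        criterion = mean(frame_values)
    upper = [v for v in frame_values if v > criterion]
    lower = [v for v in frame_values if v < criterion]
    # pvariance needs at least one value; treat an empty side as zero variance.
    upper_var = pvariance(upper) if upper else 0.0
    lower_var = pvariance(lower) if lower else 0.0
    return upper_var, lower_var


def is_special_input(frame_values, threshold):
    """Flag the signal when the two variances differ by more than threshold."""
    upper_var, lower_var = upper_lower_variance(frame_values)
    return abs(upper_var - lower_var) > threshold
```

A signal flagged this way would then be routed to the special acoustic model; model selection is outside the scope of the sketch.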
  • Publication number: 20200058297
    Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information from a neural network having periodic indications and components of a frequency spectrum of the audio signal data inputted thereto. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
    Type: Application
    Filed: October 28, 2019
    Publication date: February 20, 2020
    Inventors: Takashi Fukuda, Osamu Ichikawa, Bhuvana Ramabhadran
  • Publication number: 20200034703
    Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
    Type: Application
    Filed: July 27, 2018
    Publication date: January 30, 2020
    Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
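Note on publication 20200034703: training a student on soft labels from several teachers can be sketched as follows. The linear-softmax student, the averaging of the per-teacher cross-entropy losses, and the plain gradient step are assumptions for illustration; the teachers here are hypothetical callables that return a probability list for the common input.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(soft_label, predicted):
    return -sum(t * math.log(p) for t, p in zip(soft_label, predicted))


def collect_soft_labels(teachers, x):
    """Feed the same input to every teacher and keep each soft label."""
    return [teacher(x) for teacher in teachers]


def student_logits(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def multi_teacher_loss(weights, x, soft_labels):
    """Average cross-entropy of the student against every teacher's soft label."""
    predicted = softmax(student_logits(weights, x))
    return sum(cross_entropy(t, predicted) for t in soft_labels) / len(soft_labels)


def train_step(weights, x, soft_labels, lr=0.5):
    """One gradient step for a linear-softmax student."""
    predicted = softmax(student_logits(weights, x))
    # Averaging the per-teacher gradients (p - t) equals p minus the mean soft label.
    mean_label = [sum(t[c] for t in soft_labels) / len(soft_labels)
                  for c in range(len(predicted))]
    return [[w - lr * (p - t) * xi for w, xi in zip(row, x)]
            for row, p, t in zip(weights, predicted, mean_label)]
```

Averaging is only one way to combine the soft labels; the abstract does not say how the plurality of outputs is weighted.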
  • Publication number: 20200034702
    Abstract: A student neural network may be trained by a computer-implemented method, including: selecting a teacher neural network among a plurality of teacher neural networks, inputting an input data to the selected teacher neural network to obtain a soft label output generated by the selected teacher neural network, and training a student neural network with at least the input data and the soft label output from the selected teacher neural network.
    Type: Application
    Filed: July 27, 2018
    Publication date: January 30, 2020
    Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
  • Patent number: 10546238
    Abstract: A technique for training a neural network including an input layer, one or more hidden layers and an output layer, in which the trained neural network can be used to perform a task such as speech recognition. In the technique, a base of the neural network having at least a pre-trained hidden layer is prepared. A parameter set associated with one pre-trained hidden layer in the neural network is decomposed into a plurality of new parameter sets. The number of hidden layers in the neural network is increased by using the plurality of the new parameter sets. Pre-training for the neural network is performed.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: January 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa
  • Publication number: 20200005769
    Abstract: A method is provided for training a neural network-based (NN-based) acoustic model. The method includes receiving, by a processor, the neural network-based (NN-based) acoustic model, trained by a one-hot scheme and having an input layer, a set of middle layers, and an original output layer. At least each of the middle layers subsequent to a first one of the middle layers have trained parameters. The method further includes stacking, by the processor, a new output layer on the original output layer of the NN-based acoustic model to form a new NN-based acoustic model. The new output layer has a same size as the original output layer. The method also includes retraining, by the processor, only the new output layer and the original output layer of the new NN-based acoustic model in the one-hot scheme, with the trained parameters of middle layers subsequent to at least the first one being fixed.
    Type: Application
    Filed: June 27, 2018
    Publication date: January 2, 2020
    Inventors: Osamu Ichikawa, Takashi Fukuda
  • Publication number: 20190378006
    Abstract: A technique for constructing a model supporting a plurality of domains is disclosed. In the technique, a plurality of teacher models, each of which is specialized for different one of the plurality of the domains, is prepared. A plurality of training data collections, each of which is collected for different one of the plurality of the domains, is obtained. A plurality of soft label sets is generated by inputting each training data in the plurality of the training data collections into corresponding one of the plurality of the teacher models. A student model is trained using the plurality of the soft label sets.
    Type: Application
    Filed: June 8, 2018
    Publication date: December 12, 2019
    Inventors: Takashi Fukuda, Osamu Ichikawa, Samuel Thomas, Bhuvana Ramabhadran
  • Patent number: 10460723
    Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method includes estimating sound identification information from a neural network having periodic indications and components of a frequency spectrum of an audio signal data inputted thereto. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
    Type: Grant
    Filed: May 30, 2018
    Date of Patent: October 29, 2019
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa, Bhuvana Ramabhadran
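Note on patent 10460723: the initial isolation step in this abstract amounts to zeroing a block of first-layer weights before training starts. The sketch below shows one way that initialization could look. The column layout (spectrum inputs first, periodic-indication inputs last), the row layout (first nodes before second nodes), and the random range are assumptions for this example.

```python
import random


def init_first_layer(n_first, n_second, n_spectrum, n_periodic, seed=0):
    """Return a weight matrix with one row per first-layer node.

    Columns 0..n_spectrum-1 are the frequency-spectrum inputs; the remaining
    n_periodic columns are the periodic-indication inputs (assumed layout).
    Rows 0..n_first-1 are the "first nodes"; the rest are the "second nodes".
    """
    rng = random.Random(seed)
    n_inputs = n_spectrum + n_periodic
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_first + n_second)]
    # Isolate the periodic indications from the first nodes at the start of training.
    for row in weights[:n_first]:
        for col in range(n_spectrum, n_inputs):
            row[col] = 0.0
    return weights
```

Only the weights named in the abstract are zeroed here; how the second nodes are initialized is left as ordinary random values.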
  • Patent number: 10373607
    Abstract: A method, for testing words defined in a pronunciation lexicon used in an automatic speech recognition (ASR) system, is provided. The method includes: obtaining test sentences which can be accepted by a language model used in the ASR system. The test sentences cover words defined in the pronunciation lexicon. The method further includes obtaining variations of speech data corresponding to each test sentence, and obtaining a plurality of texts by recognizing the variations of speech data, or a plurality of texts generated by recognizing the variation of speech data. The method also includes constructing a word graph, using the plurality of texts, for each test sentence, where each word in the word graph corresponds to each word defined in the pronunciation lexicon; and determining whether or not all or parts of words in a test sentence are present in a path of the word graph derived from the test sentence.
    Type: Grant
    Filed: June 13, 2017
    Date of Patent: August 6, 2019
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa, Futoshi Iwama
  • Publication number: 20190220747
    Abstract: A technique for training a neural network including an input layer, one or more hidden layers and an output layer, in which the trained neural network can be used to perform a task such as speech recognition. In the technique, a base of the neural network having at least a pre-trained hidden layer is prepared. A parameter set associated with one pre-trained hidden layer in the neural network is decomposed into a plurality of new parameter sets. The number of hidden layers in the neural network is increased by using the plurality of the new parameter sets. Pre-training for the neural network is performed.
    Type: Application
    Filed: April 9, 2019
    Publication date: July 18, 2019
    Inventors: Takashi Fukuda, Osamu Ichikawa
  • Patent number: 10346125
    Abstract: A method, a system, and a computer program product detect a clipping event in audio signals. The method includes digitalizing audio signals having limited frequency bands, at a sampling frequency which is greater than two times as large as the maximum frequency component of the audio signal; and detecting a clipping event of the audio signals, based on magnitudes of spectrum in a bandwidth which is greater than or equal to the limited frequency band. The sampling frequency may be greater than or equal to three times as large as the maximum frequency component of the audio signal. The detection of a clipping event may include determining, for each frame, whether or not a sum or average of the magnitudes of spectrum at the bandwidth which is greater than or equal to the limited frequency band is larger than a predetermined threshold.
    Type: Grant
    Filed: August 18, 2015
    Date of Patent: July 9, 2019
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa
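Note on patent 10346125: the per-frame test in this abstract relies on clipping adding harmonics above the band the signal was limited to, which oversampling makes visible. A minimal sketch, assuming a naive DFT, a band limit expressed as a bin index, and a sum (rather than an average) of out-of-band magnitudes; the function names and the example frame are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (fine for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def out_of_band_magnitude(frame, band_limit_bin):
    """Sum of spectral magnitudes at or above the band limit."""
    return sum(magnitude_spectrum(frame)[band_limit_bin:])


def is_clipped(frame, band_limit_bin, threshold):
    """Flag the frame when the out-of-band magnitude exceeds the threshold."""
    return out_of_band_magnitude(frame, band_limit_bin) > threshold


# Example frame: 64 samples, band limit at bin 8, so the sampling rate is
# eight times the highest in-band frequency.
N = 64
clean = [math.sin(2 * math.pi * 4 * i / N) for i in range(N)]
clipped = [max(-0.5, min(0.5, x)) for x in clean]
```

In this example the clean tone has essentially no energy at or above bin 8, while the clipped copy gains odd harmonics there (bin 12 and above), so a modest threshold separates the two.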
  • Publication number: 20190206394
    Abstract: Acoustic change is detected by a method including preparing a first Gaussian Mixture Model (GMM) trained with first audio data of first speech sound from a speaker at a first distance from an audio interface and a second GMM generated from the first GMM using second audio data of second speech sound from the speaker at a second distance from the audio interface; calculating a first output of the first GMM and a second output of the second GMM by inputting obtained third audio data into the first GMM and the second GMM; and transmitting a notification in response to determining at least that a difference between the first output and the second output exceeds a threshold. Each Gaussian distribution of the second GMM has a mean obtained by shifting a mean of a corresponding Gaussian distribution of the first GMM by a common channel bias.
    Type: Application
    Filed: January 3, 2018
    Publication date: July 4, 2019
    Inventors: Osamu Ichikawa, Gakuto Kurata, Takashi Fukuda
  • Publication number: 20190080684
    Abstract: A computer-implemented method for processing a speech signal, includes: identifying speech segments in an input speech signal; calculating an upper variance and a lower variance, the upper variance being a variance of upper spectra larger than a criteria among speech spectra corresponding to frames in the speech segments, the lower variance being a variance of lower spectra smaller than a criteria among the speech spectra corresponding to the frames in the speech segments; determining whether the input speech signal is a special input speech signal using a difference between the upper variance and the lower variance; and performing speech recognition of the input speech signal which has been determined to be the special input speech signal, using a special acoustic model for the special input speech signal.
    Type: Application
    Filed: September 14, 2017
    Publication date: March 14, 2019
    Inventors: Osamu Ichikawa, Takashi Fukuda, Gakuto Kurata, Bhuvana Ramabhadran
  • Patent number: 10217456
    Abstract: A method and system for generating training data for a target domain using speech data of a source domain. The training data generation method including: reading out a Gaussian mixture model (GMM) of a target domain trained with a clean speech data set of the target domain; mapping, by referring to the GMM of the target domain, a set of source domain speech data received as an input to the set of target domain speech data on a basis of a channel characteristic of the target domain speech data; and adding a noise of the target domain to the mapped set of source domain speech data to output a set of pseudo target domain speech data.
    Type: Grant
    Filed: April 14, 2014
    Date of Patent: February 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Osamu Ichikawa, Steven J Rennie