Patents by Inventor Kazuhiro Nakadai

Kazuhiro Nakadai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240096330
    Abstract: A speech recognition device includes: an acquisition part, acquiring a speech signal; a speech feature amount calculation part, calculating a speech feature amount; a first speech recognition part, based on the speech feature amount, performing speech recognition using a learned first E2E model, attaching a first tag to a vocabulary portion of a specific class in text that is a recognition result, and outputting the same; a second speech recognition part, based on the speech feature amount, performing speech recognition using a learned second E2E model, attaching a second tag to a vocabulary portion of a specific class in a phoneme that is a recognition result, and outputting the same; a phoneme replacement part, replacing a vocabulary with the first tag with a phoneme with the second tag; and an output part, converting the phoneme with the second tag into text and outputting the same.
    Type: Application
    Filed: August 22, 2023
    Publication date: March 21, 2024
    Applicant: Honda Motor Co., Ltd.
    Inventors: Yui Sudo, Kazuhiro Nakadai, Kazuya Hata
  • Publication number: 20240071379
    Abstract: The speech recognition that is disclosed analyzes an acoustic feature for each subframe of an audio signal; provides a first model configured to determine a hidden state for each frame consisting of multiple subframes on the basis of the acoustic feature; provides a second model configured to determine a hidden state for each frame consisting of multiple subframes on the basis of the acoustic feature; and provides a third model configured to determine an utterance content on the basis of a sequence of the hidden states of each block consisting of multiple frames belonging to a voice segment.
    Type: Application
    Filed: August 29, 2022
    Publication date: February 29, 2024
    Inventors: Yui Sudo, Kazuhiro Nakadai, Muhammad Shakeel
  • Patent number: 11818557
    Abstract: A spatial normalization unit generates a normalized spectrum by normalizing an orientation component of a microphone array for a target direction included in a spectrum of an acoustic signal acquired from each of a plurality of microphones forming the microphone array into an orientation component for a predetermined standard direction. A mask function estimating unit determines a mask function used for extracting a component of a target sound source arriving in the target direction on the basis of the normalized spectrum using a machine learning model. A mask processing unit estimates the component of the target sound source installed in the target direction by applying the mask function to the acoustic signal.
    Type: Grant
    Filed: February 22, 2022
    Date of Patent: November 14, 2023
    Assignees: HONDA MOTOR CO., LTD., OSAKA UNIVERSITY
    Inventors: Kazuhiro Nakadai, Ryu Takeda
  • Patent number: 11755832
    Abstract: A voice recognition part performs voice recognition on a voice data and generates a first text which is a text indicating an utterance content. A text acquisition part acquires a second text which is a text indicating an utterance content according to an operation. A display processing part moves a position of a display text displayed on a display part, displays a text of at least one of the first text and the second text as a display text in a free region generated by the movement, and when fixing of a display position of the second text is instructed according to an operation, fixes the second text as a fixed text at a predetermined display position and displays the second text on the display part.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: September 12, 2023
    Assignee: Honda Motor Co., Ltd.
    Inventors: Naoaki Sumida, Masaki Nakatsuka, Kazuhiro Nakadai, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
  • Publication number: 20230252996
    Abstract: In a conversation support device, a first voice recognition unit performs voice recognition processing on the basis of a voice signal and defines partial section text information for each partial section that is a part of an utterance section, a second voice recognition unit performs voice recognition processing on the basis of the voice signal and defines utterance section text information for each utterance section, an information integration unit integrates the partial section text information into the utterance section text information to generate integration text information, and an output processing unit outputs the integration text information to the display unit after outputting the partial section text information to the display unit.
    Type: Application
    Filed: December 14, 2022
    Publication date: August 10, 2023
    Inventors: Kazuhiro Nakadai, Masayuki Takigahira, Naoaki Sumida, Masaki Nakatsuka, Kazuya Maura, Kyosuke Hineno, Takehito Shimizu
  • Publication number: 20230076123
    Abstract: A storage unit stores a first transfer function representing a transfer characteristic of a sound from a sound source for each sound source direction, a sound source direction estimating unit calculates a conversion coefficient of an acoustic signal for each channel in a frequency domain and a spatial spectrum for each sound source direction on the basis of the first transfer function and estimates a sound source direction in which the spatial spectrum becomes a maximum as an estimated sound source direction, a transfer function estimating unit estimates a transfer function for the estimated sound source direction as a second transfer function by normalizing the conversion coefficients among channels, and a transfer function updating unit updates the first transfer function for the estimated sound source direction using the second transfer function.
    Type: Application
    Filed: August 31, 2022
    Publication date: March 9, 2023
    Inventors: Kazuhiro Nakadai, Masayuki Takigahira, Hirofumi Nakajima
  • Patent number: 11594238
    Abstract: An acoustic signal processing device calculates a signal waveform that a microphone receives when at least one of a sound source and the microphone is moving. The acoustic signal processing device includes a coefficient calculation unit configured to model a steering coefficient g_{k,m} representing how much an amplitude of a sound source signal emitted at an m-th discrete time, where m is an integer between 1 and M and M is a length of the sound source signal, is transferred to an amplitude of a signal that the microphone receives at a k-th discrete time, where k is an integer between 1 and K and K is a length of a recording signal, using N-order Fourier series expansion where N is an integer of 1 or more, and a recording signal calculation unit configured to calculate the signal waveform that the microphone receives using the modeled steering coefficient g_{k,m}.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: February 28, 2023
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Hirofumi Nakajima
  • Publication number: 20220286775
    Abstract: A spatial normalization unit generates a normalized spectrum by normalizing an orientation component of a microphone array for a target direction included in a spectrum of an acoustic signal acquired from each of a plurality of microphones forming the microphone array into an orientation component for a predetermined standard direction. A mask function estimating unit determines a mask function used for extracting a component of a target sound source arriving in the target direction on the basis of the normalized spectrum using a machine learning model. A mask processing unit estimates the component of the target sound source installed in the target direction by applying the mask function to the acoustic signal.
    Type: Application
    Filed: February 22, 2022
    Publication date: September 8, 2022
    Inventors: Kazuhiro Nakadai, Ryu Takeda
  • Patent number: 11373355
    Abstract: An acoustic scene reconstruction device includes: a sound source localization and separation unit configured to perform sound source localization and sound source separation from a collected sound signal; an identification unit configured to identify a kind of a sound source contained in the sound signal; an analysis processing unit configured to estimate a position of the sound source based on a result obtained through the sound source localization and the sound source separation and a result obtained through the identification, select a separation sound and generate visualization information; and a visualization processing unit configured to generate an image in which an image corresponding to the sound source is displayed at the estimated position of the sound source by using the visualization information and the separation sound and generate a sound in which the separation sound is reproduced at the estimated position of the sound source.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: June 28, 2022
    Assignee: HONDA MOTOR CO., LTD.
    Inventor: Kazuhiro Nakadai
  • Publication number: 20220138578
    Abstract: A node pruning device for a network model in which a plurality of layers are continuously connected includes: a node activation section configured to select a node to be pruned on the basis of a score function that represents importance of a node; an inter-layer pairing section configured to prune an input connected to a node pruned at an output of a previous layer; a bypass setting section configured to provide a bypass connection between an input and an output of a layer and not to prune the bypass connection; and a pruning execution section configured to prune the nodes with the same pruning rate for each layer.
    Type: Application
    Filed: October 26, 2021
    Publication date: May 5, 2022
    Inventors: Kazuhiro Nakadai, Yosuke Fukumoto, Ryu Takeda
  • Publication number: 20220101852
    Abstract: A speech recognition portion generates utterance text representing utterance content by performing a speech recognition process on speech data. A topic analysis portion identifies a word or a phrase of a prescribed topic and a numerical value having a prescribed positional relationship with the word or the phrase from the utterance text. A display processing portion causes a display portion to display display information in which the numerical value or a numerical value derived from the numerical value is shown as a display value in association with the utterance text.
    Type: Application
    Filed: September 20, 2021
    Publication date: March 31, 2022
    Inventors: Kazuhiro Nakadai, Naoaki Sumida, Masaki Nakatsuka, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
  • Publication number: 20220100959
    Abstract: A topic analysis portion extracts a word or a phrase of a prescribed topic from utterance text representing utterance content. A search portion searches for reference text related to the topic in a storage portion in which an utterance history including previous utterance text is saved. A display processing portion outputs the utterance text and related information about the reference text in association with each other to a display portion.
    Type: Application
    Filed: September 22, 2021
    Publication date: March 31, 2022
    Inventors: Kazuhiro Nakadai, Naoaki Sumida, Masaki Nakatsuka, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
  • Publication number: 20210303787
    Abstract: A voice recognition part performs voice recognition on a voice data and generates a first text which is a text indicating an utterance content. A text acquisition part acquires a second text which is a text indicating an utterance content according to an operation. A display processing part moves a position of a display text displayed on a display part, displays a text of at least one of the first text and the second text as a display text in a free region generated by the movement, and when fixing of a display position of the second text is instructed according to an operation, fixes the second text as a fixed text at a predetermined display position and displays the second text on the display part.
    Type: Application
    Filed: March 29, 2021
    Publication date: September 30, 2021
    Applicant: Honda Motor Co., Ltd.
    Inventors: Naoaki Sumida, Masaki Nakatsuka, Kazuhiro Nakadai, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
  • Publication number: 20210304755
    Abstract: A voice recognition part performs voice recognition on a voice data and generates an utterance text which is a text indicating an utterance content. A display processing part moves a position of a display text displayed on a display part, displays the utterance text as a display text in a free region generated by the movement, and fixes the display text in a section of which fixing of a display position is instructed according to an operation as a fixed text at a predetermined display position to display on the display part.
    Type: Application
    Filed: March 29, 2021
    Publication date: September 30, 2021
    Applicant: Honda Motor Co., Ltd.
    Inventors: Naoaki Sumida, Masaki Nakatsuka, Kazuhiro Nakadai, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
  • Publication number: 20210304767
    Abstract: Provided are a meeting support system, a meeting support method, and a program. The meeting support system includes a meeting support device used by a first participant and a terminal used by a second participant. The meeting support device includes an acquisition unit acquiring utterance information of the first participant, a display unit displaying at least the utterance information of the first participant, and a processing unit determining whether an utterance of the first participant is interrupted when acquiring a wait request from the terminal and changing display of the display unit according to the wait request when it is determined that the utterance of the first participant is interrupted.
    Type: Application
    Filed: March 29, 2021
    Publication date: September 30, 2021
    Applicant: Honda Motor Co., Ltd.
    Inventors: Naoaki Sumida, Masaki Nakatsuka, Kazuhiro Nakadai, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
  • Patent number: 11076250
    Abstract: A microphone array position estimation device includes an estimation unit that estimates a position X of a microphone array for maximizing a simultaneous probability P(X,S,Z) of X, S, and Z through repeated estimation of S and X when the position of the microphone array constituted by M (M is an integer of 1 or greater) microphones is set to X (= (X_1^T, ..., X_M^T)^T, where T indicates transposition), spectrums of sound source signals output by N (N is an integer of 1 or greater) sound sources are set to S (the set of S_nft over all n, f, and t, where f is a frequency bin and t is a frame index), and spectrums of recorded signals collected by the microphone array are set to Z (the set of Z_ft over all f and t).
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: July 27, 2021
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Katsuhiro Dan, Katsutoshi Itoyama, Kenji Nishida
  • Patent number: 10966024
    Abstract: A sound source localization device includes: a sound receiving unit that includes two or more microphones; and a sound source localization unit that transforms a sound signal received by each of the microphones into a frequency domain, models a steering vector through Fourier series expansion of an N-th (here, N is an integer equal to or larger than “1”) order for the transformed sound signal of the frequency domain for each of the microphones, calculates a steering vector of an arbitrary angle using the modeled steering vector, and performs localization of a sound source using the calculated steering vector of the arbitrary angle.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: March 30, 2021
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Hirofumi Nakajima
  • Patent number: 10917720
    Abstract: A sound source localization device includes an acquisition unit configured to acquire acoustic signals of M channels (M is an integer equal to or greater than one), a phase difference information calculator configured to perform a short-time Fourier transform on the acoustic signals of M channels and to convert a time domain into a frequency domain including phase information, and an estimator configured to input phase information of the acoustic signals subjected to the short-time Fourier transform to a deep learning machine and to perform sound source localization of the acoustic signals using the deep learning machine where input follows a von Mises distribution.
    Type: Grant
    Filed: February 20, 2020
    Date of Patent: February 9, 2021
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Shungo Masaki, Ryosuke Kojima, Osamu Sugiyama, Katsutoshi Itoyama, Kenji Nishida
  • Patent number: 10869148
    Abstract: An audio processing device includes: a sound source localizing unit configured to determine a localized sound source direction, which is a direction of a sound source, on the basis of audio signals of a plurality of channels acquired from M (here, M is an integer equal to or greater than “3”) sound receiving units of which positions are different from each other; and a sound source position estimating unit configured to, for each set of two sound receiving units, estimate a midpoint of a segment perpendicular to both of half lines directed in estimated sound source directions, which are directions from the sound receiving units to an estimated sound source position of the sound source, as the estimated sound source position.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: December 15, 2020
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Daniel Patrik Gabriel
  • Patent number: 10863271
    Abstract: An acoustic signal processing device includes an acoustic signal processing unit configured to calculate a spectrum of each acoustic signal and a steering vector having m elements on the basis of m acoustic signals converted into m digital signals by sampling m analog signals representing sounds collected by m microphones (m is an integer of 1 or more and M or less, and M is an integer of 2 or more), and to estimate a sampling frequency ω_m in the sampling on the basis of the spectrum, the steering vector, and a sampling frequency ω_ideal that is a predetermined value.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: December 8, 2020
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Katsutoshi Itoyama, Kazuhiro Nakadai
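
The abstract of patent 10869148 above describes estimating a sound source position as the midpoint of a segment perpendicular to both of two half lines, each directed from a sound receiving unit toward the source. The geometry behind that one step can be illustrated with a minimal sketch. This is not the patented implementation: the function name `midpoint_of_common_perpendicular`, its parameters, and the choice to treat each half line as a full line (no clamping to the forward direction) are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two 3-D vectors."""
    return sum(x * y for x, y in zip(a, b))


def midpoint_of_common_perpendicular(
    p1: Vec3, d1: Vec3, p2: Vec3, d2: Vec3, eps: float = 1e-12
) -> Vec3:
    """Midpoint of the shortest segment joining two lines.

    Line 1 is p1 + s*d1 and line 2 is p2 + t*d2 (hypothetical names):
    p1, p2 stand for receiver positions, d1, d2 for the localized
    directions. The lines are treated as infinite (an assumption).
    """
    w = tuple(a - b for a, b in zip(p1, p2))  # vector from p2 to p1
    a = _dot(d1, d1)
    b = _dot(d1, d2)
    c = _dot(d2, d2)
    d = _dot(d1, w)
    e = _dot(d2, w)
    denom = a * c - b * b  # zero when the directions are parallel
    if abs(denom) < eps:
        raise ValueError("directions are parallel; no unique midpoint")
    s = (b * e - c * d) / denom  # parameter of closest point on line 1
    t = (a * e - b * d) / denom  # parameter of closest point on line 2
    q1 = tuple(p + s * v for p, v in zip(p1, d1))
    q2 = tuple(p + t * v for p, v in zip(p2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(q1, q2))


if __name__ == "__main__":
    # Two skew lines: one along the x-axis, one parallel to the y-axis at z = 1.
    print(midpoint_of_common_perpendicular(
        (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
        (0.0, -1.0, 1.0), (0.0, 1.0, 0.0),
    ))
```

When the two direction estimates actually intersect, the segment has zero length and the midpoint is the intersection itself; with noisy estimates the lines are usually skew, and the midpoint gives a single compromise point for that pair of receivers.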