Patents by Inventor Atsuo Hiroe

Atsuo Hiroe has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240155290
    Abstract: The present technology relates to a signal processing apparatus, a signal processing method, and a program that make it possible to improve precision of target sound extraction. A signal processing apparatus includes a reference signal generating section that generates a reference signal corresponding to a target sound on the basis of a mixed sound signal which is recorded with multiple microphones arranged at different positions and is a mixture of the target sound and a non-target sound, and a sound source extracting section that extracts, from the mixed sound signal of one frame or multiple frames, a signal of one frame which is similar to the reference signal and in which the target sound is more enhanced. The present technology can be applied to a signal processing apparatus.
    Type: Application
    Filed: January 13, 2022
    Publication date: May 9, 2024
    Inventor: Atsuo Hiroe
  • Publication number: 20230005488
    Abstract: Provided is a signal processing device including a main speech detection unit configured to detect, by using a neural network, whether or not a signal input to a sound collection device assigned to each of at least two speakers includes a main speech that is a voice of the corresponding speaker, and output frame information indicating presence or absence of the main speech.
    Type: Application
    Filed: December 10, 2020
    Publication date: January 5, 2023
    Inventor: Atsuo Hiroe
  • Publication number: 20220189498
    Abstract: A signal processing device includes: an input unit to which a microphone signal including a mixed sound in which a target sound and a sound other than the target sound are mixed and a one-dimensional time-series signal acquired by an auxiliary sensor and synchronized with the target sound are input; and a sound source extraction unit that extracts a target sound signal corresponding to the target sound from the microphone signal on the basis of the one-dimensional time-series signal.
    Type: Application
    Filed: February 10, 2020
    Publication date: June 16, 2022
    Inventor: Atsuo Hiroe
  • Patent number: 11158334
    Abstract: In a case where two microphones are used, sound source direction estimation of a plurality of sound sources can be performed with high accuracy. For this purpose, an inter-microphone phase difference is calculated for every frequency band in a microphone pair including two microphones that are installed apart from each other by a predetermined distance. Furthermore, for every frequency band in the microphone pair, a single sound source mask indicating whether or not a component of the frequency band is a single sound source is calculated. Then, the calculated inter-microphone phase difference and the calculated single sound source mask are input as feature quantities to a multi-label classifier, and a direction label associated with a sound source direction is output for the feature quantities.
    Type: Grant
    Filed: January 28, 2019
    Date of Patent: October 26, 2021
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Patent number: 11049493
    Abstract: [Problem] With conventional technology, it is impossible to appropriately support spoken dialog that is carried out in multiple languages.
    Type: Grant
    Filed: July 24, 2017
    Date of Patent: June 29, 2021
    Assignee: National Institute of Information and Communications Technology
    Inventors: Atsuo Hiroe, Takuma Okamoto
  • Patent number: 11024286
    Abstract: In order to solve a conventional problem that, after a series of dialog between a user and a spoken dialog device has progressed to some extent, that user or another user cannot see or recognize a previous dialog status, a cross-lingual spoken dialog system is provided wherein, in a case in which an instruction from a user terminal is received by a pairing server, dialog information stored in a storage medium is transmitted to the user terminal. Accordingly, even after a series of dialog between a user and the spoken dialog device has progressed to some extent, that user or another user can see or recognize a previous dialog status.
    Type: Grant
    Filed: November 6, 2017
    Date of Patent: June 1, 2021
    Assignee: National Institute of Information and Communications Technology
    Inventors: Atsuo Hiroe, Takuma Okamoto, Yutaka Kidawara
  • Publication number: 20210020190
    Abstract: In a case where two microphones are used, sound source direction estimation of a plurality of sound sources can be performed with high accuracy. For this purpose, an inter-microphone phase difference is calculated for every frequency band in a microphone pair including two microphones that are installed apart from each other by a predetermined distance. Furthermore, for every frequency band in the microphone pair, a single sound source mask indicating whether or not a component of the frequency band is a single sound source is calculated. Then, the calculated inter-microphone phase difference and the calculated single sound source mask are input as feature quantities to a multi-label classifier, and a direction label associated with a sound source direction is output for the feature quantities.
    Type: Application
    Filed: January 28, 2019
    Publication date: January 21, 2021
    Inventor: Atsuo Hiroe
  • Publication number: 20200066254
    Abstract: In order to solve a conventional problem that, after a series of dialog between a user and a spoken dialog device has progressed to some extent, that user or another user cannot see or recognize a previous dialog status, a cross-lingual spoken dialog system is provided wherein, in a case in which an instruction from a user terminal is received by a pairing server, dialog information stored in a storage medium is transmitted to the user terminal. Accordingly, even after a series of dialog between a user and the spoken dialog device has progressed to some extent, that user or another user can see or recognize a previous dialog status.
    Type: Application
    Filed: November 6, 2017
    Publication date: February 27, 2020
    Inventors: Atsuo Hiroe, Takuma Okamoto, Yutaka Kidawara
  • Patent number: 10475440
    Abstract: There is provided an apparatus and a method for rapidly extracting a target sound from a sound signal in which a variety of sounds generated from a plurality of sound sources are mixed. There is a voice recognition unit including a tracking unit for detecting a sound source direction and a voice segment to execute a sound source extraction process, and a voice recognition unit for inputting a sound source extraction result to execute a voice recognition process. In the tracking unit, a segment being created management unit that creates and manages a voice segment per unit of sound source sequentially detects a sound source direction, sequentially updates a voice segment estimated by connecting detection results in the time direction, creates an extraction filter for a sound source extraction after a predetermined time has elapsed, and sequentially creates a sound source extraction result by sequentially applying the extraction filter to an input voice signal.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: November 12, 2019
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Publication number: 20190172444
    Abstract: [Problem] With conventional technology, it is impossible to appropriately support spoken dialog that is carried out in multiple languages.
    Type: Application
    Filed: July 24, 2017
    Publication date: June 6, 2019
    Inventors: Atsuo Hiroe, Takuma Okamoto
  • Patent number: 10013998
    Abstract: A device and a method for determining a speech segment with a high degree of accuracy from a sound signal in which different sounds coexist are provided. Directional points indicating the direction of arrival of the sound signal are connected in the temporal direction, and a speech segment is detected. In this configuration, pattern classification is performed in accordance with directional characteristics with respect to the direction of arrival, and a directionality pattern and a null beam pattern are generated from the classification results. An average null beam pattern is also generated by calculating the average of the null beam patterns at a time when a non-speech-like signal is input.
    Type: Grant
    Filed: January 27, 2015
    Date of Patent: July 3, 2018
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Publication number: 20170047079
    Abstract: A device and a method for determining a speech segment with a high degree of accuracy from a sound signal in which different sounds coexist are provided. Directional points indicating the direction of arrival of the sound signal are connected in the temporal direction, and a speech segment is detected. In this configuration, pattern classification is performed in accordance with directional characteristics with respect to the direction of arrival, and a directionality pattern and a null beam pattern are generated from the classification results. An average null beam pattern is also generated by calculating the average of the null beam patterns at a time when a non-speech-like signal is input.
    Type: Application
    Filed: January 27, 2015
    Publication date: February 16, 2017
    Inventor: Atsuo Hiroe
  • Patent number: 9361907
    Abstract: An apparatus including a direction estimation unit detecting one or more direction points indicating a sound source direction of a sound signal for each of blocks divided in a predetermined time unit, and a direction tracking unit connecting the direction points to each other between the blocks and detecting a section in which a sound is active.
    Type: Grant
    Filed: January 11, 2012
    Date of Patent: June 7, 2016
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Patent number: 9357298
    Abstract: A sound signal processing apparatus includes an observed signal analysis unit that receives, as an observed signal, a sound signal for channels obtained by a sound signal input unit formed of microphones and estimates a sound direction and a sound segment of a target sound, which is the sound to be extracted, and a sound source extraction unit that receives the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracts the sound signal for the target sound. The observed signal analysis unit includes a short-time Fourier transform unit that generates an observed signal in the time-frequency domain by applying a short-time Fourier transform to the received sound signal for the channels, and a direction/segment estimation unit that receives the observed signal generated by the short-time Fourier transform unit and detects the sound direction and sound segment of the target sound.
    Type: Grant
    Filed: March 21, 2014
    Date of Patent: May 31, 2016
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Patent number: 9318124
    Abstract: There is provided a sound signal processing device in which an observation signal analysis unit receives multi-channel sound signals acquired by a sound signal input unit and estimates a sound direction and a sound segment of a target sound to be extracted, and a sound source extraction unit receives the sound direction and the sound segment of the target sound and extracts a sound signal of the target sound. By applying a short-time Fourier transform to the incoming multi-channel sound signals, the device generates an observation signal in the time-frequency domain and detects the sound direction and the sound segment of the target sound. Further, based on the sound direction and the sound segment of the target sound, the device generates a reference signal corresponding to a time envelope indicating changes in the target sound's volume in the time direction, and extracts the signal of the target sound using the reference signal.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: April 19, 2016
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Publication number: 20160005394
    Abstract: There is provided an apparatus and a method for rapidly extracting a target sound from a sound signal in which a variety of sounds generated from a plurality of sound sources are mixed. There is a voice recognition unit including a tracking unit for detecting a sound source direction and a voice segment to execute a sound source extraction process, and a voice recognition unit for inputting a sound source extraction result to execute a voice recognition process. In the tracking unit, a segment being created management unit that creates and manages a voice segment per unit of sound source sequentially detects a sound source direction, sequentially updates a voice segment estimated by connecting detection results in the time direction, creates an extraction filter for a sound source extraction after a predetermined time has elapsed, and sequentially creates a sound source extraction result by sequentially applying the extraction filter to an input voice signal.
    Type: Application
    Filed: December 20, 2013
    Publication date: January 7, 2016
    Inventor: Atsuo Hiroe
  • Publication number: 20140328487
    Abstract: A sound signal processing apparatus includes an observed signal analysis unit that receives, as an observed signal, a sound signal for channels obtained by a sound signal input unit formed of microphones and estimates a sound direction and a sound segment of a target sound, which is the sound to be extracted, and a sound source extraction unit that receives the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracts the sound signal for the target sound. The observed signal analysis unit includes a short-time Fourier transform unit that generates an observed signal in the time-frequency domain by applying a short-time Fourier transform to the received sound signal for the channels, and a direction/segment estimation unit that receives the observed signal generated by the short-time Fourier transform unit and detects the sound direction and sound segment of the target sound.
    Type: Application
    Filed: March 21, 2014
    Publication date: November 6, 2014
    Applicant: Sony Corporation
    Inventor: Atsuo Hiroe
  • Patent number: 8818001
    Abstract: A signal processing apparatus includes: a separation processing unit that generates observed signals in the time frequency domain by performing the short-time Fourier transform on mixed signals as outputs, which are acquired from a plurality of sound sources by a plurality of sensors, and generates sound source separation results corresponding to the sound sources by a linear filtering process on the observed signals. The separation processing unit has a linear filtering process section that performs the linear filtering process on the observed signals so as to generate separated signals corresponding to the respective sound sources, an all-null spatial filtering section that applies an all-null spatial filter to generate signals filtered with the all-null spatial filter (spatially filtered signals) in which the acquired sounds in null directions are removed, and a frequency filtering section that performs a filtering process by inputting the separated signals and the spatially filtered signals.
    Type: Grant
    Filed: November 11, 2010
    Date of Patent: August 26, 2014
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Patent number: 8577054
    Abstract: A signal processing apparatus includes a source separation module for producing respective separation signals corresponding to a plurality of sound sources by applying an ICA (Independent Component Analysis) to observation signals produced based on mixture signals from the sound sources, which are taken by source separation microphones, to thereby execute a separation process of the mixture signals, and a signal projection-back module for receiving observation signals of projection-back target microphones and the separation signals produced by the source separation module, and for producing projection-back signals as respective separation signals corresponding to the sound sources, which are taken by the projection-back target microphones. The signal projection-back module produces the projection-back signals by receiving the observation signals of the projection-back target microphones which differ from the source separation microphones.
    Type: Grant
    Filed: March 22, 2010
    Date of Patent: November 5, 2013
    Assignee: Sony Corporation
    Inventor: Atsuo Hiroe
  • Patent number: 8566094
    Abstract: An apparatus, method and program for performing a speech recognition process utilizing contextual information that comprises an estimation of the intention of an utterance of a user. The recognition process includes calculating a pre-score based on observed contextual information according to intention models which correspond to a plurality of types of intention information and combining the pre-scoring results with acoustic and linguistic scores to obtain an improved recognition or comprehension of the intent of a user utterance.
    Type: Grant
    Filed: August 10, 2011
    Date of Patent: October 22, 2013
    Assignee: Sony Corporation
    Inventors: Katsuki Minamino, Atsuo Hiroe, Yoshinori Maeda, Satoshi Asakawa
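
The score combination described in the abstract of patent 8566094 can be illustrated with a minimal sketch. It is not the patented method: it assumes all scores are log-probabilities, that the pre-score is the log of a context-dependent prior for each intention, and that the three scores are combined by a weighted sum. The function names, weights, and example values are hypothetical.

```python
import math


def combine_scores(acoustic, linguistic, pre_score, w_ling=1.0, w_pre=1.0):
    """Weighted sum of log-domain scores (an assumed combination rule)."""
    return acoustic + w_ling * linguistic + w_pre * pre_score


def pick_intention(candidates, context_priors):
    """Return the (intention, total score) pair with the highest total.

    candidates: intention -> (acoustic log-score, linguistic log-score)
    context_priors: intention -> probability given the observed context
    """
    best, best_score = None, -math.inf
    for intention, (acoustic, linguistic) in candidates.items():
        # Pre-score: log of the context-dependent prior for this intention.
        pre = math.log(context_priors[intention])
        total = combine_scores(acoustic, linguistic, pre)
        if total > best_score:
            best, best_score = intention, total
    return best, best_score


# Hypothetical scores for two candidate intentions.
candidates = {
    "play_music": (-10.0, -3.0),
    "set_alarm": (-9.5, -3.2),
}
uniform = {"play_music": 0.5, "set_alarm": 0.5}
contextual = {"play_music": 0.8, "set_alarm": 0.2}

print(pick_intention(candidates, uniform)[0])
print(pick_intention(candidates, contextual)[0])
```

With a uniform prior the acoustic and linguistic scores alone decide the result; with the contextual prior the pre-score shifts the decision toward the intention that the context favors.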