Patents by Inventor Kazuhito Koishida
Kazuhito Koishida has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11244696
Abstract: Example speech enhancement systems include a spatio-temporal residual network configured to receive video data containing a target speaker and extract visual features from the video data, an autoencoder configured to receive input of an audio spectrogram and extract audio features from the audio spectrogram, and a squeeze-excitation fusion block configured to receive input of visual features from a layer of the spatio-temporal residual network and input of audio features from a layer of the autoencoder, and to provide an output to the decoder of the autoencoder. The decoder is configured to output a mask configured based upon the fusion of audio features and visual features by the squeeze-excitation fusion block, and the instructions are executable to apply the mask to the audio spectrogram to generate an enhanced magnitude spectrogram, and to reconstruct an enhanced waveform from the enhanced magnitude spectrogram.
Type: Grant
Filed: February 5, 2020
Date of Patent: February 8, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kazuhito Koishida, Michael Iuzzolino
-
Publication number: 20220012470
Abstract: An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.
Type: Application
Filed: September 27, 2021
Publication date: January 13, 2022
Applicant: Microsoft Technology Licensing, LLC
Inventors: Kazuhito KOISHIDA, Alexander A. POPOV, Uros BATRICEVIC, Steven Nabil BATHICHE
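
The arbitration described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Assistant` class, its method names, and the choice to stay silent on a tied score are all assumptions made for illustration, and how the scores and the disengagement metric are computed is left out because the abstract does not say.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assistant:
    """Hypothetical arbitration state for one intelligent assistant."""

    blocking_threshold: float           # assumed tunable value
    engaged_user: Optional[str] = None  # user this assistant is currently serving

    def should_respond(self, user: str, self_score: float, remote_score: float) -> bool:
        """Decide whether this assistant answers `user`."""
        # While engaged with one user, responses to all other users are blocked.
        if self.engaged_user is not None and user != self.engaged_user:
            return False
        # Respond only when the local score beats the score from the other assistant.
        # A tie yields no response here; the abstract does not cover that case.
        if self_score > remote_score:
            self.engaged_user = user
            return True
        return False

    def update_disengagement(self, user: str, metric: float) -> None:
        """Lift the block once the engaged user's metric exceeds the threshold."""
        if user == self.engaged_user and metric > self.blocking_threshold:
            self.engaged_user = None
```

In this sketch an assistant that wins the comparison for one user ignores everyone else until `update_disengagement` reports a metric above `blocking_threshold` for that user.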
-
Patent number: 11194998
Abstract: An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.
Type: Grant
Filed: July 24, 2017
Date of Patent: December 7, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kazuhito Koishida, Alexander A Popov, Uros Batricevic, Steven Nabil Bathiche
-
Publication number: 20210134312
Abstract: Example speech enhancement systems include a spatio-temporal residual network configured to receive video data containing a target speaker and extract visual features from the video data, an autoencoder configured to receive input of an audio spectrogram and extract audio features from the audio spectrogram, and a squeeze-excitation fusion block configured to receive input of visual features from a layer of the spatio-temporal residual network and input of audio features from a layer of the autoencoder, and to provide an output to the decoder of the autoencoder. The decoder is configured to output a mask configured based upon the fusion of audio features and visual features by the squeeze-excitation fusion block, and the instructions are executable to apply the mask to the audio spectrogram to generate an enhanced magnitude spectrogram, and to reconstruct an enhanced waveform from the enhanced magnitude spectrogram.
Type: Application
Filed: February 5, 2020
Publication date: May 6, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Kazuhito KOISHIDA, Michael IUZZOLINO
-
Patent number: 10721594
Abstract: Mobile devices provide a variety of techniques for presenting messages from sources to a user. However, when the message pertains to the presence of the user at a location, the available communications techniques may exhibit deficiencies, e.g., reliance on the memory of the source and/or user of the existence and content of a message between its initiation and the user's visit to the location, or reliance on the communication accessibility of the user, the device, and/or the source during the user's location visit. Presented herein are techniques for enabling a mobile device, at a first time, to receive a request to present an audio message during the presence of the user at a location; and, at a second time, detecting the presence of the user at the location, and presenting the audio message to the user, optionally without awaiting a request from the user to present the message.
Type: Grant
Filed: June 26, 2014
Date of Patent: July 21, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Raja Bose, Hiroshi Horii, Jonathan Lester, Ruchita Bhargava, Kazuhito Koishida, Michelle L. Holtmann, Christina Chen
-
Patent number: 10564713
Abstract: Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.
Type: Grant
Filed: January 9, 2019
Date of Patent: February 18, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida
-
Publication number: 20190212810
Abstract: Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.
Type: Application
Filed: January 9, 2019
Publication date: July 11, 2019
Inventors: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida
-
Patent number: 10203751
Abstract: Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.
Type: Grant
Filed: May 11, 2016
Date of Patent: February 12, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida
-
Publication number: 20180293221
Abstract: A method to execute computer-actionable directives conveyed in human speech comprises: receiving audio data recording speech from one or more speakers; converting the audio data into a linguistic representation of the recorded speech; detecting a target corresponding to the linguistic representation; committing to the data structure language data associated with the detected target and based on the linguistic representation; parsing the data structure to identify one or more of the computer-actionable directives; and submitting the one or more of the computer-actionable directives to the computer for processing.
Type: Application
Filed: June 11, 2018
Publication date: October 11, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Erich-Soren FINKELSTEIN, Han Yee Mimi FUNG, Aleksandar UZELAC, Oz SOLOMON, Keith Coleman HEROLD, Vivek PRADEEP, Zongyi LIU, Kazuhito KOISHIDA, Haithem ALBADAWI, Steven Nabil BATHICHE, Christopher Lance NUESMEYER, Michelle Lynn HOLTMANN, Christopher Brian QUIRK, Pablo Luis SALA
-
Publication number: 20180233140
Abstract: Intelligent assistant systems, methods and computing devices are disclosed for identifying a speaker change. A method comprises receiving audio input comprising a speech fragment. A first voice model is trained with a first sub-fragment from the speech fragment. A second voice model is trained with a second sub-fragment from the speech fragment. The first sub-fragment is analyzed with the second voice model to yield a first confidence value. The second sub-fragment is analyzed with the first voice model to yield a second confidence value. Based at least on the first and second confidence values, the method determines if a speaker of the first sub-fragment is the speaker of the second sub-fragment.
Type: Application
Filed: July 11, 2017
Publication date: August 16, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Kazuhito KOISHIDA, Uros BATRICEVIC
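
The cross-scoring idea in this abstract can be sketched in a few lines of Python. The abstract does not name a model type, a confidence measure, or a decision rule, so everything concrete here is an assumption: each "voice model" is a diagonal Gaussian over feature frames, each confidence value is an average log-likelihood, and the decision compares the weaker of the two confidences with a hypothetical `threshold`.

```python
import math
from statistics import fmean, pvariance


def train_voice_model(frames):
    """Fit a diagonal Gaussian: one (mean, variance) pair per feature dimension."""
    dims = list(zip(*frames))  # transpose: one tuple of values per dimension
    # The small floor keeps the variance strictly positive.
    return [(fmean(d), pvariance(d) + 1e-6) for d in dims]


def confidence(frames, model):
    """Average per-frame log-likelihood of `frames` under `model`."""
    total = 0.0
    for frame in frames:
        for x, (mu, var) in zip(frame, model):
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total / len(frames)


def same_speaker(sub1, sub2, threshold):
    """Cross-score two sub-fragments; `threshold` is an assumed tuning value."""
    model1 = train_voice_model(sub1)
    model2 = train_voice_model(sub2)
    first_confidence = confidence(sub1, model2)   # first sub-fragment, second model
    second_confidence = confidence(sub2, model1)  # second sub-fragment, first model
    # Assumed rule: both cross-scores must clear the threshold.
    return min(first_confidence, second_confidence) >= threshold
```

Each sub-fragment is passed as a list of equal-length feature vectors. Scoring each sub-fragment against the model trained on the other one means a speaker change lowers both confidence values at once.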
-
Publication number: 20180233142
Abstract: An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.
Type: Application
Filed: July 24, 2017
Publication date: August 16, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Kazuhito KOISHIDA, Alexander A. POPOV, Uros BATRICEVIC, Steven Nabil BATHICHE
-
Patent number: 9864431
Abstract: Computer systems, methods, and storage media for changing the state of an application by detecting neurological user intent data associated with a particular operation of a particular application state, and changing the application state so as to enable execution of the particular operation as intended by the user. The application state is automatically changed to align with the intended operation, as determined by received neurological user intent data, so that the intended operation is performed. Some embodiments relate to a computer system creating or updating a state machine, through a training process, to change the state of an application according to detected neurological data.
Type: Grant
Filed: May 11, 2016
Date of Patent: January 9, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Cem Keskin, David Kim, Bill Chau, Jaeyoun Kim, Kazuhito Koishida, Khuram Shahid
-
Publication number: 20170329392
Abstract: Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.
Type: Application
Filed: May 11, 2016
Publication date: November 16, 2017
Inventors: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida
-
Publication number: 20170329404
Abstract: Computer systems, methods, and storage media for changing the state of an application by detecting neurological user intent data associated with a particular operation of a particular application state, and changing the application state so as to enable execution of the particular operation as intended by the user. The application state is automatically changed to align with the intended operation, as determined by received neurological user intent data, so that the intended operation is performed. Some embodiments relate to a computer system creating or updating a state machine, through a training process, to change the state of an application according to detected neurological data.
Type: Application
Filed: May 11, 2016
Publication date: November 16, 2017
Inventors: Cem Keskin, David Kim, Bill Chau, Jaeyoun Kim, Kazuhito Koishida, Khuram Shahid
-
Patent number: 9817100
Abstract: An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
Type: Grant
Filed: August 19, 2016
Date of Patent: November 14, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri
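
The per-frame flow in this abstract (activity check, one candidate angle per active microphone pair, selection of a final angle) can be sketched in Python under heavy simplification. None of the specifics below come from the patent: the sketch swaps the phase analysis for a plain time-domain cross-correlation, assumes a linear array with a far-field source, uses a mean-energy threshold as the activity test, picks the median as the final angle, and omits tracking over time. All function names and default values are hypothetical.

```python
import math
from statistics import median

SPEED_OF_SOUND = 343.0  # metres per second, assumed room-temperature value


def is_active(frame, energy_threshold):
    """Treat a microphone as active when its mean frame energy is high enough."""
    return sum(x * x for x in frame) / len(frame) > energy_threshold


def best_lag(a, b, max_lag):
    """Lag in samples by which b trails a, found by maximising cross-correlation."""
    def corr(lag):
        return sum(a[i] * b[i + lag] for i in range(len(a)) if 0 <= i + lag < len(b))
    return max(range(-max_lag, max_lag + 1), key=corr)


def candidate_angle(a, b, spacing_m, sample_rate):
    """Far-field arrival angle in degrees for one microphone pair."""
    max_lag = math.ceil(spacing_m / SPEED_OF_SOUND * sample_rate)
    delay_s = best_lag(a, b, max_lag) / sample_rate
    sine = max(-1.0, min(1.0, delay_s * SPEED_OF_SOUND / spacing_m))
    return math.degrees(math.asin(sine))


def frame_angle(frames, positions_m, sample_rate, energy_threshold=1e-4):
    """Collect candidate angles from all active pairs and return their median."""
    candidates = []
    for i in range(len(frames)):
        for j in range(i + 1, len(frames)):
            if is_active(frames[i], energy_threshold) and is_active(frames[j], energy_threshold):
                spacing = positions_m[j] - positions_m[i]
                candidates.append(candidate_angle(frames[i], frames[j], spacing, sample_rate))
    return median(candidates) if candidates else None
```

Here `frames` holds one list of samples per microphone and `positions_m` gives each microphone's position along the array axis in increasing order. Pairs with an inactive microphone contribute no candidate, and `frame_angle` returns `None` when no pair is active.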
-
Publication number: 20170323220
Abstract: Technologies are described herein for modifying the modality of a computing device based upon a user's brain activity. A machine learning classifier is trained using data that identifies a modality for operating a computing device and data identifying brain activity of a user of the computing device. Once trained, the machine learning classifier can select a mode of operation for the computing device based upon a user's current brain activity and, potentially, other biological data. The computing device can then be operated in accordance with the selected modality. An application programming interface can also expose an interface through which an operating system and application programs executing on the computing device can obtain data identifying the modality selected by the machine learning classifier. Through the use of this data, the operating system and application programs can modify their mode of operation to be most suitable for the user's current mental state.
Type: Application
Filed: May 9, 2016
Publication date: November 9, 2017
Inventors: John C. Gordon, Kazuhito Koishida
-
Patent number: 9741354
Abstract: An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.
Type: Grant
Filed: April 29, 2016
Date of Patent: August 22, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen
-
Publication number: 20170052245
Abstract: An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
Type: Application
Filed: August 19, 2016
Publication date: February 23, 2017
Inventors: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri
-
Patent number: 9435873
Abstract: An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
Type: Grant
Filed: July 14, 2011
Date of Patent: September 6, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri
-
Publication number: 20160247515
Abstract: An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.
Type: Application
Filed: April 29, 2016
Publication date: August 25, 2016
Applicant: Microsoft Technology Licensing, LLC
Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen