Patents by Inventor Ross G. Cutler

Ross G. Cutler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ADAPTIVE ENHANCEMENT OF AUDIO OR VIDEO SIGNALS

Publication number: 20240428768

Abstract: This document relates to distributed teleconferencing. Some implementations can employ adaptive audio or video enhancement to address scenarios where audio enhancement can tend to remove desirable sounds. For instance, adaptive audio enhancement can involve detecting the presence of a sound, such as clapping, and modifying audio enhancement so that the sound is retained in an enhanced audio signal. Adaptive video processing can involve detecting the presence of the sound and adding a graphical identifier to a video signal that conveys the presence of that sound.

Type: Application

Filed: June 20, 2023

Publication date: December 26, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. CUTLER, Harishchandra DUBEY, Vishak GOPAL
PERSONALIZED IMAGE OR VIDEO ENHANCEMENT

Publication number: 20240428380

Abstract: This document relates to personalized image or video processing. For example, the disclosed implementations can identify a designated user of a computing device that participates in a video call with other users. When another person appears in a video feed captured by the computing device, the other person can be removed. This can avoid distractions that can be caused, for example, by family members or pets that inadvertently walk into the field of view while a designated user is participating in a video call. Similar techniques can be employed to remove people other than designated users from still images.

Type: Application

Filed: June 20, 2023

Publication date: December 26, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventor: Ross G. CUTLER
ACTIVE SPEAKER DETECTION USING DISTRIBUTED DEVICES

Publication number: 20240428803

Abstract: This document relates to active speaker detection using distributed devices. For example, the disclosed implementations can employ personal devices of one or more users to detect when those users are speaking during a call with other users. Then, a camera on the personal device can be employed to obtain a front-facing view of the user, which can be provided to other call participants. In some cases, a microphone and/or camera on the user's device are employed to detect when the user is actively speaking.

Type: Application

Filed: June 20, 2023

Publication date: December 26, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventor: Ross G. CUTLER
DISTRIBUTED TELECONFERENCING USING ADAPTIVE MICROPHONE SELECTION

Publication number: 20240406621

Abstract: This document relates to distributed devices teleconferencing. Some implementations can employ adaptive microphone selection based on signal characteristics such as signal-to-noise ratios or speech quality, and/or based on a microphone affinity approach. The selected microphone signals can be synchronized and mixed to generate a playback signal that is sent to a remote device. Further implementations can perform proximity-based mixing, where microphone signals received from devices in a particular room can be omitted from playback signals transmitted to other devices in the same room. These techniques can allow enhanced call quality for teleconferencing sessions where co-located users can employ their own devices to participate in a call with other users.

Type: Application

Filed: May 31, 2023

Publication date: December 5, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. CUTLER, Hong Wang SODOMA, Robert Andreas AICHNER, Vinod PRAKASH, Warren Michael LAM
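As an illustration of the ideas in the abstract above (publication 20240406621), the following is a minimal sketch of microphone selection by signal-to-noise ratio, followed by proximity-based mixing. It is not the method claimed in the application. The function names, the `top_k` parameter, the per-microphone power estimates, and the room map are all assumptions made for this example, and the frames are assumed to be already synchronized.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)


def select_microphones(mics, top_k=2):
    """Return the ids of the top_k microphones ranked by SNR.

    mics maps a microphone id to a (signal_power, noise_power) pair.
    Both the ranking rule and top_k are assumptions for illustration.
    """
    ranked = sorted(mics, key=lambda m: snr_db(*mics[m]), reverse=True)
    return ranked[:top_k]


def mix_for_listener(frames, selected, room_of, listener):
    """Average the selected frames, omitting microphones in the listener's room.

    frames maps a microphone id to a list of samples (assumed synchronized).
    room_of maps each device id to a room label (hypothetical data).
    """
    sources = [m for m in selected if room_of[m] != room_of[listener]]
    length = len(next(iter(frames.values())))
    if not sources:
        return [0.0] * length  # nothing to play back from other rooms
    return [sum(frames[m][i] for m in sources) / len(sources)
            for i in range(length)]
```

For example, with microphones `a` and `b` in one room and `c` in another, a listener on device `b` would receive only the signal from `c`, which avoids replaying sound the listener can already hear directly.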
DYNAMIC SPEECH ENHANCEMENT COMPONENT OPTIMIZATION

Publication number: 20240005939

Abstract: Systems, methods, and computer-readable storage devices are disclosed for personalizing speech enhancement components without enrollment in speech communication systems. One method including: receiving audio data, the audio data including speech, and the audio data to be processed by at least one speech enhancement component; determining, without requiring a user to enroll, whether the speech of the audio data includes one or both of near-field speech and far-field speech; and changing one or more of the at least one speech enhancement component based on determining the speech of the audio data includes one or both of near-field speech and far-field speech.

Type: Application

Filed: June 30, 2022

Publication date: January 4, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventor: Ross G. CUTLER
DYNAMIC SPEECH ENHANCEMENT COMPONENT OPTIMIZATION

Publication number: 20230419986

Abstract: Systems, methods, and computer-readable storage devices are disclosed for optimizing speech enhancement components to use in speech communication systems using non-intrusive speech quality assessment. One method including: receiving audio data, the audio data including speech; and the audio data having been processed by at least one speech enhancement component; detecting a first quality of the speech of the audio data using a trained non-intrusive speech quality assessment (NISQA) model, the trained NISQA model trained to detect quality of speech automatically; and changing one or more of the at least one speech enhancement component based on the detected first quality of the speech.

Type: Application

Filed: June 24, 2022

Publication date: December 28, 2023

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. CUTLER, William D. FALLAS CORDERO
DYNAMIC SPEECH ENHANCEMENT COMPONENT OPTIMIZATION

Publication number: 20230419987

Abstract: Systems, methods, and computer-readable storage devices are disclosed for optimizing speech enhancement components to use in speech communication systems using non-intrusive speech quality assessment. One method including: receiving, from a computing device over a network, audio data, the audio data including speech; detecting a first quality of the speech of the audio data using a trained non-intrusive speech quality assessment (NISQA) model, the trained NISQA model trained to detect quality of speech automatically; determining whether the computing device is a low-quality endpoint based on the first quality of speech of the audio data; and transferring, from the computing device over the network, at least one speech enhancement component to at least one server device when the computing device is determined to be a low-quality endpoint.

Type: Application

Filed: December 1, 2022

Publication date: December 28, 2023

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. CUTLER, William D. FALLAS CORDERO
Artificially generated speech for a communication session

Patent number: 10930262

Abstract: A device for communicating with a remote device is disclosed, which includes a processor and a memory in communication with the processor. The memory includes executable instructions that, when executed, cause the processor to control the device to perform functions of establishing, via a communication network, a communication session with the remote device; capturing a speech spoken by a user and generating audio data representing the captured speech by the user; encoding the audio data for transmission to the remote device via the communication network; converting the audio data to text data representing the captured speech; and transmitting, during the communication session, the encoded audio data and the text data to the remote device via the communication network. The device thus can provide the text data representing the captured speech when a quality of the encoded audio signal received by the remote device is below a predetermined level.

Type: Grant

Filed: September 30, 2018

Date of Patent: February 23, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham
ARTIFICIALLY GENERATED SPEECH FOR A COMMUNICATION SESSION

Publication number: 20190073993

Abstract: A device is disclosed, which includes a processor and a memory in communication with the processor. The memory includes executable instructions that, when executed by the processor, cause the processor to control the device to perform functions of capturing a speech by a user; generating audio data representing the captured speech by a user; generating, based on the audio data, text data representing at least a portion of the captured speech; and transmitting, via a communication channel, the audio data and text data to the remote device. The device thus can provide the text data representing the captured speech when a quality of the audio signal received by the remote device is below a predetermined level.

Type: Application

Filed: October 31, 2018

Publication date: March 7, 2019

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham
ARTIFICIALLY GENERATED SPEECH FOR A COMMUNICATION SESSION

Publication number: 20190035383

Abstract: A device for communicating with a remote device is disclosed, which includes a processor and a memory in communication with the processor. The memory includes executable instructions that, when executed, cause the processor to control the device to perform functions of establishing, via a communication network, a communication session with the remote device; capturing a speech spoken by a user and generating audio data representing the captured speech by the user; encoding the audio data for transmission to the remote device via the communication network; converting the audio data to text data representing the captured speech; and transmitting, during the communication session, the encoded audio data and the text data to the remote device via the communication network. The device thus can provide the text data representing the captured speech when a quality of the encoded audio signal received by the remote device is below a predetermined level.

Type: Application

Filed: September 30, 2018

Publication date: January 31, 2019

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham
PRIVACY IMAGE GENERATION

Publication number: 20180365809

Abstract: A privacy image generation system may use a light field camera that includes an array of cameras, or an RGBZ camera(s), to capture images and display images according to a selected privacy mode. The privacy mode may include a blur background mode that can be automatically selected based on the meeting type, participants, location, and device type. A region of interest and/or an object(s) of interest (e.g. one or more persons in a foreground) is determined and the privacy image generation system is configured to clearly show the region/object of interest and obscure or replace the background by combining multiple images. The displayed image includes the region/object(s) of interest clearly shown (e.g. in focus) and any objects in a background of the combined image shown having a limited depth of field (e.g. blurry/not in focus) and/or blurred due to the combination of the multiple images.

Type: Application

Filed: August 24, 2018

Publication date: December 20, 2018

Inventors: Ross G. Cutler, Ramin Mehran
DUAL NETWORK INTERFACE IMPLEMENTATION IN MULTIPATH NETWORKING

Publication number: 20180367446

Abstract: Technologies are described for enhancement of call quality in online communications through deployment of two or more network interface devices. Endpoint-to-endpoint communications, or multiple-endpoint communications managed by a multipoint control unit (MCU), may be facilitated using two or more network interface devices on either or both ends of a communication path. Received signals may be aggregated to improve signal quality. Network interface devices may be integrated into an endpoint, provided as external modules, or made available through a combination of two endpoints (e.g., a computer connected to an online communication speaker phone). Network interface device configuration and activation may be performed automatically, for seamless operation that is transparent to a user.

Type: Application

Filed: June 16, 2017

Publication date: December 20, 2018

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Ross G. CUTLER
Artificially generated speech for a communication session

Patent number: 10147415

Abstract: Content is received at a receiving equipment from a transmitting user terminal over a network in a communication session between a transmitting user and a receiving user. The received content comprises audio data representing speech spoken by a voice of the transmitting user, and further comprises text data generated from speech spoken by the voice of the transmitting user during the communication session. At the receiving equipment, at least a portion of the received text data is converted to artificially-generated audible speech based on a model of the transmitting user's voice stored at the receiving equipment (and in embodiments in dependence on the receive audio quality). The received audio data and the artificially-generated speech are supplied to be played out to the receiving user through one or more speakers.

Type: Grant

Filed: February 2, 2017

Date of Patent: December 4, 2018

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham
Artificially generated speech for a communication session

Publication number: 20180218727

Abstract: Content is received at a receiving equipment from a transmitting user terminal over a network in a communication session between a transmitting user and a receiving user. The received content comprises audio data representing speech spoken by a voice of the transmitting user, and further comprises text data generated from speech spoken by the voice of the transmitting user during the communication session. At the receiving equipment, at least a portion of the received text data is converted to artificially-generated audible speech based on a model of the transmitting user's voice stored at the receiving equipment (and in embodiments in dependence on the receive audio quality). The received audio data and the artificially-generated speech are supplied to be played out to the receiving user through one or more speakers.

Type: Application

Filed: February 2, 2017

Publication date: August 2, 2018

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham
PRIVACY CAMERA

Publication number: 20170301067

Abstract: A privacy camera, such as a light field camera that includes an array of cameras or an RGBZ camera(s), is used to capture images and display images according to a selected privacy mode. The privacy mode may include a blur background mode and a background replacement mode and can be automatically selected based on the meeting type, participants, location, and device type. A region of interest and/or an object(s) of interest (e.g. one or more persons in a foreground) is determined and the privacy camera is configured to clearly show the region/object of interest and obscure or replace the background according to the selected privacy mode. The displayed image includes the region/object(s) of interest clearly shown (e.g. in focus) and any objects in a background of the combined image shown having a limited depth of field (e.g. blurry/not in focus) and/or the background replaced with another image and/or fill.

Type: Application

Filed: June 30, 2017

Publication date: October 19, 2017

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. Cutler, Ramin Mehran
Boundary binaural microphone array

Patent number: 9516417

Abstract: A boundary binaural microphone array includes a pair of microphones spaced from one another by a distance between approximately 5 cm and 30 cm. The boundary binaural microphone array has a structural support that locates the microphones no more than approximately 4 cm off of a surface upon which the array is placed. The microphones are separated by a sound barrier that provides an interaural level difference in the amplitudes of the sound signals sensed by the two microphones.

Type: Grant

Filed: January 2, 2013

Date of Patent: December 6, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventor: Ross G. Cutler
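The abstract above (patent 9516417) mentions an interaural level difference between the two microphone signals. As a rough illustration of that quantity only, the sketch below computes a level difference in decibels from the RMS amplitudes of two sample buffers. The function names and the use of RMS over a whole buffer are assumptions for this example; the patent abstract does not specify how the difference is measured.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def interaural_level_difference_db(left, right):
    """Level difference between two channels in decibels.

    Positive values mean the left channel is louder. Both buffers are
    assumed to be non-silent, so neither RMS value is zero.
    """
    return 20.0 * math.log10(rms(left) / rms(right))
```

A left channel with twice the amplitude of the right channel gives a difference of about 6 dB, since 20 * log10(2) is roughly 6.02.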
Satellite microphones for improved speaker detection and zoom

Patent number: 9071895

Abstract: Architecture for exploiting satellite microphones and employing other techniques of conference room camera/microphone systems to significantly improve the true positive rate (reduce false positives) in sound source localization (SSL). Techniques for realizing the improvement include using an LED emitter to determine the precise location of the satellite microphones on a table, using the base SSL and external sounds to determine the approximate location of the satellite microphone on the table, using the satellite microphone phase to improve the SSL performance, using the satellite microphone amplitude to improve the active speaker detector (ASD) performance, and using the satellite microphones to estimate camera zoom.

Type: Grant

Filed: November 19, 2012

Date of Patent: June 30, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventor: Ross G. Cutler
Automatic video framing

Patent number: 8773499

Abstract: A dynamically adjustable framed view of occupants in a room is captured through an automatic framing system. The system employs a camera system, including a pan/tilt/zoom (PTZ) camera and one or more depth cameras, to automatically locate occupants in a room and adjust the PTZ camera's pan, tilt, and zoom settings to focus in on the occupants and center them in the main video frame. The depth cameras may distinguish between occupants and inanimate objects and adaptively determine the location of the occupants in the room. The PTZ camera may be calibrated with the depth cameras in order to use the location information determined by the depth cameras to automatically center the occupants in the main video frame for a framed view. Additionally, the system may track position changes in the room and may dynamically adjust and update the framed view when changes occur.

Type: Grant

Filed: June 24, 2011

Date of Patent: July 8, 2014

Assignee: Microsoft Corporation

Inventors: Josh Watson, Simone Leorin, Ross G. Cutler
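To make the framing idea in the abstract above (patent 8773499) more concrete, here is a minimal sketch that turns occupant directions into pan, tilt, and field-of-view settings. It is an illustration under stated assumptions, not the patented system: occupant positions are assumed to be already expressed as (azimuth, elevation) angles in degrees relative to the PTZ camera, zoom is represented as a horizontal field of view, and `margin_deg` and `max_fov_deg` are hypothetical parameters.

```python
def frame_occupants(occupants, margin_deg=5.0, max_fov_deg=70.0):
    """Compute (pan, tilt, horizontal_fov) in degrees that frames all occupants.

    occupants is a list of (azimuth_deg, elevation_deg) pairs, assumed to
    come from a depth camera already calibrated to the PTZ camera.
    Returns None when the room is empty.
    """
    if not occupants:
        return None
    azimuths = [az for az, _ in occupants]
    elevations = [el for _, el in occupants]
    # Aim at the center of the occupants' angular extent.
    pan = (min(azimuths) + max(azimuths)) / 2.0
    tilt = (min(elevations) + max(elevations)) / 2.0
    # Widen the view enough to cover everyone plus a margin on each side,
    # capped at the widest field of view the lens is assumed to support.
    fov = min(max_fov_deg, (max(azimuths) - min(azimuths)) + 2.0 * margin_deg)
    return pan, tilt, fov
```

Calling the function again whenever the occupant list changes would give the dynamic updating the abstract describes; a real system would likely also smooth the output to avoid jittery camera motion.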
BOUNDARY BINAURAL MICROPHONE ARRAY

Publication number: 20140185814

Abstract: A boundary binaural microphone array includes a pair of microphones spaced from one another by a distance between approximately 5 cm and 30 cm. The boundary binaural microphone array has a structural support that locates the microphones no more than approximately 4 cm off of a surface upon which the array is placed. The microphones are separated by a sound barrier that provides an interaural level difference in the amplitudes of the sound signals sensed by the two microphones.

Type: Application

Filed: January 2, 2013

Publication date: July 3, 2014

Applicant: MICROSOFT CORPORATION

Inventor: Ross G. Cutler
Capture device movement compensation for speaker indexing

Patent number: 8749650

Abstract: Embodiments of the invention compensate for the movement of a meeting capture device during a live meeting when performing speaker indexing of a recorded meeting. In one example, a first position of a capture device is determined. A second position of the capture device is determined after the capture device has been moved from the first position to the second position. The movement data associated with movement of the capture device from the first position to the second position is determined. The movement data is outputted and used in speaker indexing of the recorded meeting.

Type: Grant

Filed: December 7, 2012

Date of Patent: June 10, 2014

Assignee: Microsoft Corporation

Inventor: Ross G. Cutler
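The abstract above (patent 8749650) describes using movement data to keep speaker indexing consistent after the capture device moves. A minimal sketch of that idea follows, under the simplifying assumption that the movement is a pure rotation about the vertical axis; the abstract does not limit the movement to this case, and the function name and angle convention are hypothetical.

```python
def compensate_azimuth(measured_azimuth_deg, device_rotation_deg):
    """Map a speaker direction measured after the device rotated back to
    the device's original reference frame.

    measured_azimuth_deg: speaker direction relative to the moved device.
    device_rotation_deg: rotation from the first to the second position,
    using the same sign convention as the azimuth.
    """
    return (measured_azimuth_deg + device_rotation_deg) % 360


def same_speaker(azimuth_a_deg, azimuth_b_deg, tolerance_deg=10.0):
    """Treat two compensated directions as one speaker if they are close."""
    diff = abs(azimuth_a_deg - azimuth_b_deg) % 360
    return min(diff, 360 - diff) <= tolerance_deg
```

With this correction, a speaker indexed at 20 degrees before the device was rotated by 30 degrees, and then measured at 350 degrees afterward, is still attributed to the same direction.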
