Patents by Inventor Ross G. Cutler

Ross G. Cutler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8717402
    Abstract: Speakers are identified based on sound origination detection through use of infrared detection of satellite microphones, estimation of distance between satellite microphones and base unit utilizing captured audio, and/or estimation of satellite microphone orientation utilizing captured audio. Multiple sound source localization results are combined to enhance sound source localization and/or active speaker detection accuracy.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: May 6, 2014
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8620653
    Abstract: Architecture that uses near-end speech detection and far-end energy level detection to notify a user when a local microphone and/or speaker that the user is using is muted. A voice activity detector is employed to detect the presence of near-end speech, sense the existing mute state of the near-end microphone, and then notify the user when the current microphone is muted. Separately or in combination therewith, received far-end voice signals are detected, the associated energy level is computed, the existing mute state of the near-end audio speaker is sensed, and the user is notified when the speaker is muted and/or at a reduced volume setting. These determinations enhance the user experience when the architecture is employed for communications sessions where participants connect via different communications modalities: the user is automatically notified of the audio device state, rather than attempting to contribute only to find that a microphone or speaker was muted.
    Type: Grant
    Filed: June 18, 2009
    Date of Patent: December 31, 2013
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8614734
    Abstract: Techniques to detect a display device are described. An apparatus may include a video camera operative to receive video information for an image, and a microphone operative to receive audio information for an image. The apparatus may further include a monitor detection module communicatively coupled to the video camera and the microphone, where the monitor detection module is operative to detect a temporal watermark signal displayed by the monitor within the image, and determine a location for the monitor within the image based on the detection. The apparatus may also include an active speaker detector module communicatively coupled to the monitor detection module, where the active speaker detector module is operative to exclude false positives caused by the monitor. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 4, 2012
    Date of Patent: December 24, 2013
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8526632
    Abstract: A camera speakerphone having a microphone array may be used for videoconferencing. Example microphone array designs described herein may be used to perform Sound Source Localization (SSL) and improve audio quality of captured audio. In one example, an omni-directional camera speakerphone includes a base having a speaker and at least one microphone. A neck is coupled to the base which is coupled to a head. The head includes an omni-directional camera and at least one microphone.
    Type: Grant
    Filed: June 28, 2007
    Date of Patent: September 3, 2013
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8510110
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Grant
    Filed: July 11, 2012
    Date of Patent: August 13, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8488745
    Abstract: Architecture that employs a signal (e.g., audible or inaudible sounds) to detect if endpoints of a communications session are sufficiently close to each other to induce echo, and then control (e.g., muting) is applied to one or more of the endpoints to prevent echo. The signals can be played and detected from the endpoints or a central conferencing component such as a multipoint control unit (MCU). The MCU can provide support for legacy endpoints as well. When echo is detected, the offending endpoint(s) can be controlled to mute one or more onboard devices such as a speaker or microphone. The device(s) can be muted from a remote component or for a local component or locally by the endpoint user. A notification can be sent that notifies the endpoint user that the mute operation has been applied or should be applied to one or more of the local devices.
    Type: Grant
    Filed: June 17, 2009
    Date of Patent: July 16, 2013
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8432432
    Abstract: Eye gaze reduction may be provided. First, a location of a near-end camera relative to a near-end screen may be determined. Next, based upon the determined location of the near-end camera, a location may be determined for a video window on the near-end screen. The determined location for the video window may be configured to reduce an eye gaze error in a near-end image transmitted to a far-end device from the near-end camera. Then video data from a far-end camera corresponding to the far-end device may be received and rendered in the video window at the determined location for the video window on the near-end screen.
    Type: Grant
    Filed: December 3, 2010
    Date of Patent: April 30, 2013
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8416715
    Abstract: Gaze tracking or other interest indications are used during a video conference to determine one or more audio sources that are of interest to one or more participants to the video conference, such as by determining a conversation from among multiple conversations that a subset of participants are participating in or listening to, for enhancing the audio experience of one or more of the participants.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: April 9, 2013
    Assignee: Microsoft Corporation
    Inventors: Daniel A. Rosenfeld, Zicheng Liu, Ross G. Cutler, Philip A. Chou, Christian Huitema, Kori Quinn
  • Publication number: 20120327179
    Abstract: A dynamically adjustable framed view of occupants in a room is captured through an automatic framing system. The system employs a camera system, including a pan/tilt/zoom (PTZ) camera and one or more depth cameras, to automatically locate occupants in a room and adjust the PTZ camera's pan, tilt, and zoom settings to focus in on the occupants and center them in the main video frame. The depth cameras may distinguish between occupants and inanimate objects and adaptively determine the location of the occupants in the room. The PTZ camera may be calibrated with the depth cameras in order to use the location information determined by the depth cameras to automatically center the occupants in the main video frame for a framed view. Additionally, the system may track position changes in the room and may dynamically adjust and update the framed view when changes occur.
    Type: Application
    Filed: June 24, 2011
    Publication date: December 27, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Josh Watson, Simone Leorin, Ross G. Cutler
  • Patent number: 8340267
    Abstract: The claimed subject matter relates to an architecture that can preprocess audio portions of communications in order to enrich multiparty communication sessions or environments. In particular, the architecture can provide both a public channel for public communications that are received by substantially all connected parties and can further provide a private channel for private communications that are received by a selected subset of all connected parties. Most particularly, the architecture can apply an audio transform to communications that occur during the multiparty communication session based upon a target audience of the communication. By way of illustration, the architecture can apply a whisper transform to private communications, an emotion transform based upon relationships, an ambience or spatial transform based upon physical locations, or a pace transform based upon lack of presence.
    Type: Grant
    Filed: February 5, 2009
    Date of Patent: December 25, 2012
    Assignee: Microsoft Corporation
    Inventors: Dinei A. Florencio, Alejandro Acero, William Buxton, Phillip A. Chou, Ross G. Cutler, Jason Garms, Christian Huitema, Kori M. Quinn, Daniel Allen Rosenfeld, Zhengyou Zhang
  • Patent number: 8330787
    Abstract: Embodiments of the invention compensate for the movement of a meeting capture device during a live meeting when performing speaker indexing of a recorded meeting. In one example, a first position of a capture device is determined. A second position of the capture device is determined after the capture device has been moved from the first position to the second position. The movement data associated with movement of the capture device from the first position to the second position is determined. The movement data is outputted and used in speaker indexing of the recorded meeting.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: December 11, 2012
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8314829
    Abstract: Architecture for exploiting satellite microphones and employing other techniques of conference room camera/microphone systems to significantly improve the true positive rate (reduce false positives) in sound source localization (SSL). Techniques for realizing the improvement include using an LED emitter to determine the precise location of the satellite microphones on a table, using the base SSL and external sounds to determine the approximate location of the satellite microphone on the table, using the satellite microphone phase to improve the SSL performance, using the satellite microphone amplitude to improve the active speaker detector (ASD) performance, and using the satellite microphones to estimate camera zoom.
    Type: Grant
    Filed: August 12, 2008
    Date of Patent: November 20, 2012
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Publication number: 20120278077
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Application
    Filed: July 11, 2012
    Publication date: November 1, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8300080
    Abstract: Techniques to detect a display device are described. An apparatus may include a video camera operative to receive video information for an image, and a microphone operative to receive audio information for an image. The apparatus may further include a monitor detection module communicatively coupled to the video camera and the microphone, where the monitor detection module is operative to detect a temporal watermark signal displayed by the monitor within the image, and determine a location for the monitor within the image based on the detection. The apparatus may also include an active speaker detector module communicatively coupled to the monitor detection module, where the active speaker detector module is operative to exclude false positives caused by the monitor. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: October 30, 2012
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Publication number: 20120218371
    Abstract: Speakers are identified based on sound origination detection through use of infrared detection of satellite microphones, estimation of distance between satellite microphones and base unit utilizing captured audio, and/or estimation of satellite microphone orientation utilizing captured audio. Multiple sound source localization results are combined to enhance sound source localization and/or active speaker detection accuracy.
    Type: Application
    Filed: May 1, 2012
    Publication date: August 30, 2012
    Applicant: MICROSOFT CORPORATION
    Inventor: Ross G. Cutler
  • Patent number: 8245043
    Abstract: An audio start service method for enabling and scheduling ad hoc distributed meetings. Only a short (in some embodiments less than or equal to about 32 bits) unique device identification is needed to enable distributed meeting devices participating in the meeting to rendezvous at a common rendezvous network address. Once the participants know the unique meeting network address they can take part in the meeting, while others can join or leave the meeting. The data string is each device's unique identification that is encoded into an inaudible watermark and continuously exchanged between devices over the telephone network. A first distributed meeting device requests a network address from a distributed meeting server. This unique meeting network address then is sent to an audio start service that identifies “buddies” of the first device and sends out meeting invitations and the network address to other devices so they can join the meeting.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: August 14, 2012
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8234113
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: July 31, 2012
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Publication number: 20120140023
    Abstract: Eye gaze reduction may be provided. First, a location of a near-end camera relative to a near-end screen may be determined. Next, based upon the determined location of the near-end camera, a location may be determined for a video window on the near-end screen. The determined location for the video window may be configured to reduce an eye gaze error in a near-end image transmitted to a far-end device from the near-end camera. Then video data from a far-end camera corresponding to the far-end device may be received and rendered in the video window at the determined location for the video window on the near-end screen.
    Type: Application
    Filed: December 3, 2010
    Publication date: June 7, 2012
    Applicant: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8189807
    Abstract: Speakers are identified based on sound origination detection through use of infrared detection of satellite microphones, estimation of distance between satellite microphones and base unit utilizing captured audio, and/or estimation of satellite microphone orientation utilizing captured audio. Multiple sound source localization results are combined to enhance sound source localization and/or active speaker detection accuracy.
    Type: Grant
    Filed: June 27, 2008
    Date of Patent: May 29, 2012
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
  • Patent number: 8165416
    Abstract: A region of interest may be determined using any or all of sound source location, multi-person detection, and active speaker detection. A weighted mean may be determined using the region of interest and a set of backlight weight regions, or, only the set of backlight weight regions if a region of interest could not be found. The image mean is compared to a target value to determine if the image mean is greater than or less than the target value within a predetermined threshold. If the image mean is greater than the predetermined target value plus the predetermined threshold value, the gain and exposure are decreased. If the image mean is less than the predetermined target value minus the predetermined threshold value, the gain and exposure are increased.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: April 24, 2012
    Assignee: Microsoft Corporation
    Inventor: Ross G. Cutler
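
The exposure-control loop described in the abstract for patent 8165416 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the patented implementation: the function names, the representation of a region as a list of pixel intensities paired with a weight, and the fixed step size are all hypothetical. It also assumes that gain and exposure are lowered when the weighted mean is too bright and raised when it is too dark.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: (pixel intensities, weight) for one region.
Region = Tuple[Sequence[float], float]


def weighted_image_mean(backlight_regions: Sequence[Region],
                        roi: Optional[Region] = None) -> float:
    """Weighted mean over the backlight regions, plus the ROI if one was found."""
    regions = list(backlight_regions)
    if roi is not None:
        regions.append(roi)
    total_weight = sum(weight for _, weight in regions)
    if total_weight == 0:
        raise ValueError("region weights must not sum to zero")
    weighted_sum = sum(weight * (sum(pixels) / len(pixels))
                       for pixels, weight in regions)
    return weighted_sum / total_weight


def adjust_gain_exposure(image_mean: float, target: float, threshold: float,
                         gain: float, exposure: float,
                         step: float = 1.0) -> Tuple[float, float]:
    """Move gain and exposure toward the target; 'step' is an assumed increment."""
    if image_mean > target + threshold:
        # Too bright: decrease gain and exposure.
        return gain - step, exposure - step
    if image_mean < target - threshold:
        # Too dark: increase gain and exposure.
        return gain + step, exposure + step
    # Within the threshold band: leave the settings alone.
    return gain, exposure


if __name__ == "__main__":
    backlight = [([100.0, 100.0], 1.0)]
    speaker_roi = ([200.0, 200.0], 3.0)
    mean = weighted_image_mean(backlight, speaker_roi)
    print(mean, adjust_gain_exposure(mean, 128.0, 10.0, 4.0, 8.0))
```

Giving the region of interest a larger weight than the backlight regions, as in the usage example, biases the exposure toward the detected speaker; the actual weights used in any real system are not stated in the abstract.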