Patents by Inventor Sumit Basu

Sumit Basu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and process for speaking in a two-way voice communication without talking using a set of speech selection menus

Patent number: 7443962

Abstract: A system and process for enabling a communication device having computing capability, a user interface and display, to conduct two-way voice communications between a user and a remote party over a communication link in such a manner that the remote party speaks but the user does not, is presented. In general, a series of menus listing potential responses is displayed on the display of the communication device. In addition, there are a plurality of backchanneling responses provided that the user can select. These responses are employed by the user to communicate with the remote party, rather than speaking. This is accomplished by the user selecting one of the available responses. Once a selection has been made, a pre-recorded voice snippet corresponding to the selected response is accessed. The accessed voice snippet is then played back and transmitted to the remote party over the communication link.

Type: Grant

Filed: November 3, 2003

Date of Patent: October 28, 2008

Assignee: Microsoft Corporation

Inventor: Sumit Basu
ACTIVE LEARNING FRAMEWORK FOR AUTOMATIC FIELD EXTRACTION FROM NETWORK TRAFFIC

Publication number: 20080140589

Abstract: An active learning framework is provided to extract information from particular fields from a variety of protocols. Extraction is performed in an unknown protocol, in which the user presents the system with a small number of labeled instances. The system then automatically generates an abundance of features and negative examples. A boosting approach is then used for feature selection and classifier combination. The system then displays its results for the user to correct and/or add new examples. The process can be iterated until the user is satisfied with the performance of the extraction capabilities provided by the classifiers generated by the system.

Type: Application

Filed: December 6, 2006

Publication date: June 12, 2008

Applicant: MICROSOFT CORPORATION

Inventors: Sumit Basu, Karthik Gopalratnam, John David Dunagan, Jiahe Helen Wang
CREATING MUSIC VIA CONCATENATIVE SYNTHESIS

Publication number: 20070289432

Abstract: A “Concatenative Synthesizer” applies concatenative synthesis to create a musical output from a database of musical notes and an input musical score (such as a MIDI score or other computer readable musical score format). In various embodiments, the musical output is either a music score, or an analog or digital audio file. This musical output is constructed by evaluating the database of musical notes to identify sets of candidate notes for each note of the input musical score. An “optimal path” through candidate notes is identified by minimizing an overall cost function through the candidate notes relative to the input musical score. The musical output is then constructed by concatenating the selected candidate notes. In further embodiments, the database of musical notes is generated from any desired musical genre, performer, performance, or instrument. Furthermore, notes in the database may be modified to better fit notes of the input musical score.

Type: Application

Filed: June 15, 2006

Publication date: December 20, 2007

Applicant: Microsoft Corporation

Inventors: Sumit Basu, Ian Simon, David Salesin, Maneesh Agrawala, Adil Sherwani, Chad Gibson
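
Note: the "optimal path" step described in the abstract above can be illustrated with a small dynamic-programming sketch. This is not the patented method. The function name `best_note_path`, the note representation as `(pitch, source)` tuples, and both cost functions are assumptions made up for the example; the abstract only says that an overall cost function is minimized through the candidate notes relative to the input score.

```python
def best_note_path(candidates, target_cost, join_cost):
    """Pick one candidate per score note so the summed cost is minimal.

    candidates[i] is the list of database notes considered for score note i.
    target_cost(i, note) scores how well a note fits score position i.
    join_cost(prev, note) scores how well two chosen notes concatenate.
    Both cost functions are hypothetical stand-ins.
    """
    # best[i][j]: cheapest total cost of any path ending at candidates[i][j]
    best = [[target_cost(0, c) for c in candidates[0]]]
    back = []  # back[i-1][j]: index of the predecessor chosen for candidates[i][j]
    for i in range(1, len(candidates)):
        row, ptr = [], []
        for c in candidates[i]:
            costs = [best[-1][k] + join_cost(p, c)
                     for k, p in enumerate(candidates[i - 1])]
            k = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k] + target_cost(i, c))
            ptr.append(k)
        best.append(row)
        back.append(ptr)

    # Trace the cheapest path back from the last score note.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return total, [candidates[i][j] for i, j in enumerate(path)]


# Toy data: score pitches and candidate notes tagged with the recording they came from.
score = [60, 62, 64]
candidates = [
    [(60, "a"), (60, "b")],
    [(62, "b"), (65, "a")],
    [(64, "b"), (64, "a")],
]
total, chosen = best_note_path(
    candidates,
    target_cost=lambda i, note: abs(note[0] - score[i]),      # pitch mismatch
    join_cost=lambda prev, note: 0 if prev[1] == note[1] else 1,  # source change
)
```

The selected notes would then be concatenated in order to form the output, as the abstract describes.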
ECHO CANCELLATION FOR CHANNELS WITH UNKNOWN TIME-VARYING GAIN

Publication number: 20070286230

Abstract: After an initial training session, a “Dynamic Echo Canceller” (DEC) provides echo cancellation where only access to an input signal and a composite output signal are available, and the input signal is subjected to an unknown variable gain function. In one embodiment, the DEC uses echo cancellation to provide a “clean” copy of a second input signal where only a first input signal and a composite of the first and second input signal is available. An example is a “black box” amplifier coupled to a microphone and a phone line, with access to only the microphone input and a combined output signal where it is desired to retrieve a clean copy of a remote caller signal from the combined output. The DEC is applicable to many fields, including: signal separation; cancellation of echoes caused by impedance mismatches, periodic electrical noise, acoustic echoes caused by acoustic coupling, etc.

Type: Application

Filed: June 10, 2006

Publication date: December 13, 2007

Applicant: MICROSOFT CORPORATION

Inventor: Sumit Basu
Metadata-based song creation and editing

Publication number: 20070261535

Abstract: Relating higher-level descriptive musical metadata to lower-level musical elements to enable creation of a song map, song model, backing track, or the like. The musical elements are queried based on input metadata to create a set of musical elements of varying types such as notes, chords, song structures, and the like. The set of musical elements is provided to a user for selection of particular musical elements. The selected musical elements represent the song model.

Type: Application

Filed: May 1, 2006

Publication date: November 15, 2007

Applicant: Microsoft Corporation

Inventors: Adil Sherwani, Chad Gibson, Sumit Basu
METHODS OF ESTABLISHING A COMMUNICATIONS LINK USING PERCEPTUAL SENSING OF A USER'S PRESENCE

Publication number: 20070229652

Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infrared sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.

Type: Application

Filed: May 30, 2007

Publication date: October 4, 2007

Inventors: Julian Center, Christopher Wren, Sumit Basu, Evgeniy Gusyatin
Methods of establishing a communications link using perceptual sensing of a user's presence

Patent number: 7242421

Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infrared sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.

Type: Grant

Filed: November 13, 2001

Date of Patent: July 10, 2007

Assignee: Perceptive Network Technologies, Inc.

Inventors: Julian L. Center, Jr., Christopher R. Wren, Sumit Basu, Evgeniy Gusyatin
Aligning and mixing songs of arbitrary genres

Patent number: 7220911

Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.

Type: Grant

Filed: May 3, 2006

Date of Patent: May 22, 2007

Assignee: Microsoft Corporation

Inventor: Sumit Basu
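
Note: the alignment search described in the abstract above (frame-based energy, then a search over time scalings and time shifts with no BPM estimate) can be sketched roughly as follows. This is a simplified illustration, not the patented method: it correlates whole energy envelopes rather than energy peaks, uses nearest-neighbour resampling, and all function names and parameters are assumptions.

```python
def frame_energy(samples, frame_len):
    """Sum of squared samples over consecutive non-overlapping frames."""
    return [sum(x * x for x in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def resample(seq, scale):
    """Nearest-neighbour time scaling of an envelope (scale > 1 stretches it)."""
    n = max(1, round(len(seq) * scale))
    return [seq[min(len(seq) - 1, int(i / scale))] for i in range(n)]


def best_alignment(energy_a, energy_b, scales, max_shift):
    """Return (score, scale, shift) maximizing mean energy correlation.

    energy_b is time-scaled by each candidate scale and slid along energy_a
    by 0..max_shift frames; the score is the mean product over the overlap.
    """
    best = None
    for scale in scales:
        scaled = resample(energy_b, scale)
        for shift in range(max_shift + 1):
            overlap = min(len(energy_a) - shift, len(scaled))
            if overlap <= 0:
                continue
            score = sum(energy_a[shift + i] * scaled[i]
                        for i in range(overlap)) / overlap
            if best is None or score > best[0]:
                best = (score, scale, shift)
    return best


# Toy envelopes: song B played at half speed lines up with song A from frame 2.
energy_a = [0, 0, 5, 5, 0, 0, 2, 2, 7, 7, 0, 0]
energy_b = [5, 0, 2, 7]
score, scale, shift = best_alignment(energy_a, energy_b, [0.5, 1.0, 2.0], 4)
```

Once a scale and shift are chosen, blending the two signals with those parameters is the remaining step the abstract mentions.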
Monitoring, mining, and classifying electronically recordable conversations

Publication number: 20070067159

Abstract: Conversations that take place over an electronically recordable channel are analyzed by constructing a set of features from the speech of two participants in the conversation. The set of features is applied to a model or a plurality of models to determine the likelihood of the set of features for each model. These likelihoods are then used to classify the conversation into categories, provide real-time monitoring of the conversation, and/or identify anomalous conversations.

Type: Application

Filed: September 2, 2005

Publication date: March 22, 2007

Applicant: Microsoft Corporation

Inventors: Sumit Basu, Mauricio de la Fuente
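
Note: the abstract above describes scoring a feature set against one or more models and using the likelihoods to classify a conversation or flag it as anomalous. The sketch below shows that flow under loud assumptions: the abstract does not name a model family, so independent Gaussians per feature are assumed here, and the feature meanings, model names, and threshold are all hypothetical.

```python
import math


def log_likelihood(features, model):
    """Log-likelihood of a feature vector under independent Gaussians.

    model is a list of (mean, variance) pairs, one per feature (an assumed form).
    """
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
               for x, (mean, var) in zip(features, model))


def classify(features, models, anomaly_threshold):
    """Return (best_category, per-model scores, is_anomalous)."""
    scores = {name: log_likelihood(features, m) for name, m in models.items()}
    best = max(scores, key=scores.get)
    # A conversation that fits no model well is treated as anomalous.
    return best, scores, scores[best] < anomaly_threshold


# Hypothetical features: fraction of time each of the two participants speaks.
models = {
    "sales": [(0.7, 0.01), (0.3, 0.01)],
    "support": [(0.3, 0.01), (0.7, 0.01)],
}
category, scores, anomalous = classify([0.68, 0.33], models, anomaly_threshold=-10.0)
```

Re-running `classify` on features computed over a sliding window would be one way to approximate the real-time monitoring the abstract mentions.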
ALIGNING AND MIXING SONGS OF ARBITRARY GENRES

Publication number: 20060192478

Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.

Type: Application

Filed: May 3, 2006

Publication date: August 31, 2006

Applicant: Microsoft Corporation

Inventor: Sumit Basu
System and process for increasing the apparent resolution of a display

Publication number: 20060165314

Abstract: A system and process for creating an apparently higher resolution image on a display exhibiting a lower resolution is presented. The basic idea is to make multiple decimated versions of an image at different offsets in a smooth path (all of which will contain different bits of detail), and then animate through the resulting decimated images (i.e., show them in rapid succession). The viewer sees what looks like a higher-resolution image moving in a smooth path. The viewer sees this since the human eye is capable of integrating details over the continuous motion. Thus, images such as text enjoy an enhanced legibility.

Type: Application

Filed: January 21, 2005

Publication date: July 27, 2006

Applicant: Microsoft Corporation

Inventors: Sumit Basu, Patrick Baudisch
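
Note: the idea in the abstract above (several decimated versions of an image taken at different offsets, shown in rapid succession) can be illustrated with a minimal sketch. It assumes plain point sampling with no low-pass filtering and a short cyclic path of offsets; the function names and the path are illustrative only.

```python
def decimate(image, factor, dx, dy):
    """Keep every factor-th pixel, starting at column dx and row dy."""
    return [row[dx::factor] for row in image[dy::factor]]


def animation_frames(image, factor, path):
    """One decimated frame per (dx, dy) offset along the path."""
    return [decimate(image, factor, dx % factor, dy % factor)
            for dx, dy in path]


# A 4x4 "high-resolution" image shown on a 2x2 display.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
# A small closed loop of one-pixel steps, so consecutive frames differ only slightly.
frames = animation_frames(image, 2, [(0, 0), (1, 0), (1, 1), (0, 1)])
```

Each frame carries a different quarter of the original pixels, which is the "different bits of detail" the abstract refers to; cycling through the frames is what lets the viewer integrate them.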
Palette-based classifying and synthesizing of auditory information

Publication number: 20060167692

Abstract: The subject invention leverages spectral “palettes” or representations of an input sequence to provide recognition and/or synthesizing of a class of data. The class can include, but is not limited to, individual events, distributions of events, and/or environments relating to the input sequence. The representations are compressed versions of the data that utilize a substantially smaller amount of system resources to store and/or manipulate. Segments of the palettes are employed to facilitate in reconstruction of an event occurring in the input sequence. This provides an efficient means to recognize events, even when they occur in complex environments. The palettes themselves are constructed or “trained” utilizing any number of data compression techniques such as, for example, epitomes, vector quantization, and/or Huffman codes and the like.

Type: Application

Filed: January 24, 2005

Publication date: July 27, 2006

Applicant: Microsoft Corporation

Inventors: Sumit Basu, Nebojsa Jojic, Ashish Kapoor
System and method for aligning and mixing songs of arbitrary genres

Patent number: 7081582

Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.

Type: Grant

Filed: June 30, 2004

Date of Patent: July 25, 2006

Assignee: Microsoft Corporation

Inventor: Sumit Basu
System and method for video browsing using a cluster index

Publication number: 20060120624

Abstract: A “Video Browser” provides an intuitive user interface for indexing, and interactive visual browsing, of particular elements within a video recording. In general, the Video Browser operates by first generating a set of one or more mosaic images from the video recording. In one embodiment, these mosaics are further clustered using an adjustable similarity threshold. User selection of a particular video mosaic then initiates a playback of corresponding video frames. However, in contrast to conventional mosaicing schemes which simply play back the set of frames used to construct the mosaic, the Video Browser provides a playback of only those individual frames within which a particular point selected within the image mosaic was observed. Consequently, user selection of a point in one of the image mosaics serves to provide a targeted playback of only those frames of interest, rather than playing back the entire image sequence used to generate the mosaic.

Type: Application

Filed: December 8, 2004

Publication date: June 8, 2006

Applicant: Microsoft Corporation

Inventors: Nebojsa Jojic, Sumit Basu
System and method for aligning and mixing songs of arbitrary genres

Publication number: 20060000344

Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.

Type: Application

Filed: June 30, 2004

Publication date: January 5, 2006

Applicant: Microsoft Corporation

Inventor: Sumit Basu
System and process for controlling a shared display given inputs from multiple users using multiple input modalities

Publication number: 20050198578

Abstract: A system and process for controlling common information displays, referred to as shared displays, is presented. The system and process allows multiple modes of input using a set of modules that accept and display data from a variety of sources. Input modules are able to understand data from a single mode of communication and to be able to generate messages as output accordingly. An optional translation module takes discrete message units and converts them into commands or requests that can be processed by a logic module. The logic module includes any application that is running on the shared display. A layout module lays out the information output by the logic module and a display module takes the layout data and converts the information to a form that can be readily displayed on a display device.

Type: Application

Filed: January 15, 2004

Publication date: September 8, 2005

Inventors: Maneesh Agrawala, Sumit Basu, Steven Drucker, Ronald Logan, Trausti Kristjansson, Tim Paek, Kentaro Toyama, Andrew Wilson
Hypertext navigation for shared displays

Publication number: 20050193328

Abstract: A browsing system and method for browsing allows multiple users to access and view hypertext documents on a shared display. A browsing system includes a hypertext document converter configured to convert a component in a hypertext document to include alternate component activation tags. A hypertext display controller controls a display module to display the converted component in the hypertext document. The input processor receives and processes an input signal related to the alternate component activation tag from at least one of the plurality of input devices. The browsing system activates the converted component of the hypertext document upon receiving the input signal.

Type: Application

Filed: February 27, 2004

Publication date: September 1, 2005

Applicant: Microsoft Corporation

Inventors: Maneesh Agrawala, Sumit Basu, Steven Drucker, Ronald Logan, Trausti Kristjansson, Tim Paek, Kentaro Toyama, Andrew Wilson
System & process for speaking in a two-way voice communication without talking using a set of speech selection menus

Publication number: 20050094781

Abstract: A system and process for enabling a communication device having computing capability, a user interface and display, to conduct two-way voice communications between a user and a remote party over a communication link in such a manner that the remote party speaks but the user does not, is presented. In general, a series of menus listing potential responses is displayed on the display of the communication device. In addition, there are a plurality of backchanneling responses provided that the user can select. These responses are employed by the user to communicate with the remote party, rather than speaking. This is accomplished by the user selecting one of the available responses. Once a selection has been made, a pre-recorded voice snippet corresponding to the selected response is accessed. The accessed voice snippet is then played back and transmitted to the remote party over the communication link.

Type: Application

Filed: November 3, 2003

Publication date: May 5, 2005

Applicant: Microsoft Corporation

Inventor: Sumit Basu
Methods of establishing a communications link using perceptual sensing of a user's presence

Publication number: 20020163572

Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infra-red sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.

Type: Application

Filed: November 13, 2001

Publication date: November 7, 2002

Inventors: Julian L. Center, Christopher R. Wren, Sumit Basu, Evgeniy Gusyatin
Method of extending image-based face recognition systems to utilize multi-view image sequences and audio information

Publication number: 20020113687

Abstract: A biometric identification method of identifying a person combines facial identification steps with audio identification steps. In order to reduce vulnerability of a recognition system to deception using photographs or even three-dimensional masks or replicas, the system uses a sequence of images to verify that lips and chin are moving as a predetermined sequence of sounds are uttered by a person who desires to be identified. In order to compensate for variations in speed of making the utterance, a dynamic time warping algorithm is used to normalize length of the input utterance to match the length of a model utterance previously stored for the person. In order to prevent deception based on two-dimensional images, preferably two cameras pointed in different directions are used for facial recognition.

Type: Application

Filed: November 13, 2001

Publication date: August 22, 2002

Inventors: Julian L. Center, Christopher R. Wren, Sumit Basu
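
Note: the abstract above mentions dynamic time warping to normalize the length of an input utterance to a stored model utterance. A textbook version of that technique is sketched below on one-dimensional feature sequences. The patent's actual features and distance measure are not given in the abstract, so the absolute-difference distance, the averaging step, and the function names are assumptions.

```python
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping: return (total cost, alignment path)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist(a[i - 1], b[j - 1]) + min(
                cost[i - 1][j],      # input frame repeats against the same model frame
                cost[i][j - 1],      # model frame repeats against the same input frame
                cost[i - 1][j - 1],  # both advance
            )
    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def warp_to_model_length(input_seq, model_seq):
    """Average the input frames aligned to each model frame."""
    _, path = dtw_path(input_seq, model_seq)
    buckets = [[] for _ in model_seq]
    for i, j in path:
        buckets[j].append(input_seq[i])
    return [sum(b) / len(b) for b in buckets]


# A slowly spoken input (6 frames) warped onto a 3-frame model utterance.
total, path = dtw_path([1, 1, 2, 2, 3, 3], [1, 2, 3])
warped = warp_to_model_length([1, 1, 2, 2, 3, 3], [1, 2, 3])
```

The warped sequence has the same length as the model utterance, so the two can be compared frame by frame regardless of speaking speed.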
