Patents by Inventor Sumit Basu

Sumit Basu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7443962
    Abstract: A system and process for enabling a communication device having computing capability, a user interface and display, to conduct two-way voice communications between a user and a remote party over a communication link in such a manner that the remote party speaks but the user does not, is presented. In general, a series of menus listing potential responses is displayed on the display of the communication device. In addition, there are a plurality of backchanneling responses provided that the user can select. These responses are employed by the user to communicate with the remote party, rather than speaking. This is accomplished by the user selecting one of the available responses. Once a selection has been made, a pre-recorded voice snippet corresponding to the selected response is accessed. The accessed voice snippet is then played back and transmitted to the remote party over the communication link.
    Type: Grant
    Filed: November 3, 2003
    Date of Patent: October 28, 2008
    Assignee: Microsoft Corporation
    Inventor: Sumit Basu
  • Publication number: 20080140589
    Abstract: An active learning framework is provided to extract information from particular fields from a variety of protocols. Extraction is performed in an unknown protocol, in which the user presents the system with a small number of labeled instances. The system then automatically generates an abundance of features and negative examples. A boosting approach is then used for feature selection and classifier combination. The system then displays its results for the user to correct and/or add new examples. The process can be iterated until the user is satisfied with the performance of the extraction capabilities provided by the classifiers generated by the system.
    Type: Application
    Filed: December 6, 2006
    Publication date: June 12, 2008
    Applicant: Microsoft Corporation
    Inventors: Sumit Basu, Karthik Gopalratnam, John David Dunagan, Jiahe Helen Wang
  • Publication number: 20070289432
    Abstract: A “Concatenative Synthesizer” applies concatenative synthesis to create a musical output from a database of musical notes and an input musical score (such as a MIDI score or other computer readable musical score format). In various embodiments, the musical output is either a music score, or an analog or digital audio file. This musical output is constructed by evaluating the database of musical notes to identify sets of candidate notes for each note of the input musical score. An “optimal path” through candidate notes is identified by minimizing an overall cost function through the candidate notes relative to the input musical score. The musical output is then constructed by concatenating the selected candidate notes. In further embodiments, the database of musical notes is generated from any desired musical genre, performer, performance, or instrument. Furthermore, notes in the database may be modified to better fit notes of the input musical score.
    Type: Application
    Filed: June 15, 2006
    Publication date: December 20, 2007
    Applicant: Microsoft Corporation
    Inventors: Sumit Basu, Ian Simon, David Salesin, Maneesh Agrawala, Adil Sherwani, Chad Gibson
  • Publication number: 20070286230
    Abstract: After an initial training session, a “Dynamic Echo Canceller” (DEC) provides echo cancellation where only access to an input signal and a composite output signal are available, and the input signal is subjected to an unknown variable gain function. In one embodiment, the DEC uses echo cancellation to provide a “clean” copy of a second input signal where only a first input signal and a composite of the first and second input signal is available. An example is a “black box” amplifier coupled to a microphone and a phone line, with access to only the microphone input and a combined output signal where it is desired to retrieve a clean copy of a remote caller signal from the combined output. The DEC is applicable to many fields, including: signal separation; cancellation of echoes caused by impedance mismatches, periodic electrical noise, acoustic echoes caused by acoustic coupling, etc.
    Type: Application
    Filed: June 10, 2006
    Publication date: December 13, 2007
    Applicant: Microsoft Corporation
    Inventor: Sumit Basu
  • Publication number: 20070261535
    Abstract: Relating higher-level descriptive musical metadata to lower-level musical elements to enable creation of a song map, song model, backing track, or the like. The musical elements are queried based on input metadata to create a set of musical elements of varying types such as notes, chords, song structures, and the like. The set of musical elements is provided to a user for selection of particular musical elements. The selected musical elements represent the song model.
    Type: Application
    Filed: May 1, 2006
    Publication date: November 15, 2007
    Applicant: Microsoft Corporation
    Inventors: Adil Sherwani, Chad Gibson, Sumit Basu
  • Publication number: 20070229652
    Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infrared sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.
    Type: Application
    Filed: May 30, 2007
    Publication date: October 4, 2007
    Inventors: Julian Center, Christopher Wren, Sumit Basu, Evgeniy Gusyatin
  • Patent number: 7242421
    Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infrared sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.
    Type: Grant
    Filed: November 13, 2001
    Date of Patent: July 10, 2007
    Assignee: Perceptive Network Technologies, Inc.
    Inventors: Julian L. Center, Jr., Christopher R. Wren, Sumit Basu, Evgeniy Gusyatin
  • Patent number: 7220911
    Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
    Type: Grant
    Filed: May 3, 2006
    Date of Patent: May 22, 2007
    Assignee: Microsoft Corporation
    Inventor: Sumit Basu
  • Publication number: 20070067159
    Abstract: Conversations that take place over an electronically recordable channel are analyzed by constructing a set of features from the speech of two participants in the conversation. The set of features is applied to a model or a plurality of models to determine the likelihood of the set of features for each model. These likelihoods are then used to classify the conversation into categories, provide real-time monitoring of the conversation, and/or identify anomalous conversations.
    Type: Application
    Filed: September 2, 2005
    Publication date: March 22, 2007
    Applicant: Microsoft Corporation
    Inventors: Sumit Basu, Mauricio de la Fuente
  • Publication number: 20060192478
    Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
    Type: Application
    Filed: May 3, 2006
    Publication date: August 31, 2006
    Applicant: Microsoft Corporation
    Inventor: Sumit Basu
  • Publication number: 20060165314
    Abstract: A system and process for creating an apparently higher resolution image on a display exhibiting a lower resolution is presented. The basic idea is to make multiple decimated versions of an image at different offsets in a smooth path (all of which will contain different bits of detail), and then animate through the resulting decimated images (i.e., show them in rapid succession). The viewer sees what looks like a higher-resolution image moving in a smooth path. The viewer sees this since the human eye is capable of integrating details over the continuous motion. Thus, images such as text enjoy an enhanced legibility.
    Type: Application
    Filed: January 21, 2005
    Publication date: July 27, 2006
    Applicant: Microsoft Corporation
    Inventors: Sumit Basu, Patrick Baudisch
  • Publication number: 20060167692
    Abstract: The subject invention leverages spectral “palettes” or representations of an input sequence to provide recognition and/or synthesizing of a class of data. The class can include, but is not limited to, individual events, distributions of events, and/or environments relating to the input sequence. The representations are compressed versions of the data that utilize a substantially smaller amount of system resources to store and/or manipulate. Segments of the palettes are employed to facilitate in reconstruction of an event occurring in the input sequence. This provides an efficient means to recognize events, even when they occur in complex environments. The palettes themselves are constructed or “trained” utilizing any number of data compression techniques such as, for example, epitomes, vector quantization, and/or Huffman codes and the like.
    Type: Application
    Filed: January 24, 2005
    Publication date: July 27, 2006
    Applicant: Microsoft Corporation
    Inventors: Sumit Basu, Nebojsa Jojic, Ashish Kapoor
  • Patent number: 7081582
    Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: July 25, 2006
    Assignee: Microsoft Corporation
    Inventor: Sumit Basu
  • Publication number: 20060120624
    Abstract: A “Video Browser” provides an intuitive user interface for indexing, and interactive visual browsing, of particular elements within a video recording. In general, the Video Browser operates by first generating a set of one or more mosaic images from the video recording. In one embodiment, these mosaics are further clustered using an adjustable similarity threshold. User selection of a particular video mosaic then initiates a playback of corresponding video frames. However, in contrast to conventional mosaicing schemes which simply play back the set of frames used to construct the mosaic, the Video Browser provides a playback of only those individual frames within which a particular point selected within the image mosaic was observed. Consequently, user selection of a point in one of the image mosaics serves to provide a targeted playback of only those frames of interest, rather than playing back the entire image sequence used to generate the mosaic.
    Type: Application
    Filed: December 8, 2004
    Publication date: June 8, 2006
    Applicant: Microsoft Corporation
    Inventors: Nebojsa Jojic, Sumit Basu
  • Publication number: 20060000344
    Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
    Type: Application
    Filed: June 30, 2004
    Publication date: January 5, 2006
    Applicant: Microsoft Corporation
    Inventor: Sumit Basu
  • Publication number: 20050198578
    Abstract: A system and process for controlling common information displays, referred to as shared displays, is presented. The system and process allow multiple modes of input using a set of modules that accept and display data from a variety of sources. Input modules are able to understand data from a single mode of communication and to generate messages as output accordingly. An optional translation module takes discrete message units and converts them into commands or requests that can be processed by a logic module. The logic module includes any application that is running on the shared display. A layout module lays out the information output by the logic module and a display module takes the layout data and converts the information to a form that can be readily displayed on a display device.
    Type: Application
    Filed: January 15, 2004
    Publication date: September 8, 2005
    Inventors: Maneesh Agrawala, Sumit Basu, Steven Drucker, Ronald Logan, Trausti Kristjansson, Tim Paek, Kentaro Toyama, Andrew Wilson
  • Publication number: 20050193328
    Abstract: A browsing system and method for browsing allows multiple users to access and view hypertext documents on a shared display. A browsing system includes a hypertext document converter configured to convert a component in a hypertext document to include alternate component activation tags. A hypertext display controller controls a display module to display the converted component in the hypertext document. The input processor receives and processes an input signal related to the alternate component activation tag from at least one of the plurality of input devices. The browsing system activates the converted component of the hypertext document upon receiving the input signal.
    Type: Application
    Filed: February 27, 2004
    Publication date: September 1, 2005
    Applicant: Microsoft Corporation
    Inventors: Maneesh Agrawala, Sumit Basu, Steven Drucker, Ronald Logan, Trausti Kristjansson, Tim Paek, Kentaro Toyama, Andrew Wilson
  • Publication number: 20050094781
    Abstract: A system and process for enabling a communication device having computing capability, a user interface and display, to conduct two-way voice communications between a user and a remote party over a communication link in such a manner that the remote party speaks but the user does not, is presented. In general, a series of menus listing potential responses is displayed on the display of the communication device. In addition, there are a plurality of backchanneling responses provided that the user can select. These responses are employed by the user to communicate with the remote party, rather than speaking. This is accomplished by the user selecting one of the available responses. Once a selection has been made, a pre-recorded voice snippet corresponding to the selected response is accessed. The accessed voice snippet is then played back and transmitted to the remote party over the communication link.
    Type: Application
    Filed: November 3, 2003
    Publication date: May 5, 2005
    Applicant: Microsoft Corporation
    Inventor: Sumit Basu
  • Publication number: 20020163572
    Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infra-red sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.
    Type: Application
    Filed: November 13, 2001
    Publication date: November 7, 2002
    Inventors: Julian L. Center, Christopher R. Wren, Sumit Basu, Evgeniy Gusyatin
  • Publication number: 20020113687
    Abstract: A biometric identification method of identifying a person combines facial identification steps with audio identification steps. In order to reduce vulnerability of a recognition system to deception using photographs or even three-dimensional masks or replicas, the system uses a sequence of images to verify that lips and chin are moving as a predetermined sequence of sounds are uttered by a person who desires to be identified. In order to compensate for variations in speed of making the utterance, a dynamic time warping algorithm is used to normalize length of the input utterance to match the length of a model utterance previously stored for the person. In order to prevent deception based on two-dimensional images, preferably two cameras pointed in different directions are used for facial recognition.
    Type: Application
    Filed: November 13, 2001
    Publication date: August 22, 2002
    Inventors: Julian L. Center, Christopher R. Wren, Sumit Basu
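
The abstract of publication 20020113687 above mentions a dynamic time warping algorithm for matching an input utterance to a stored model utterance despite differences in speaking speed. The following is a minimal sketch of the textbook dynamic time warping cost, not the method claimed in that application. The function name dtw_distance, the use of one scalar feature per frame, and the absolute-difference local cost are all assumptions made for illustration.

```python
from typing import Sequence


def dtw_distance(query: Sequence[float], model: Sequence[float]) -> float:
    """Textbook DTW cost between two 1-D feature sequences (illustrative only)."""
    if len(query) == 0 or len(model) == 0:
        raise ValueError("both sequences must be non-empty")
    n, m = len(query), len(model)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of query[:i] with model[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: absolute difference of scalar features.
            local = abs(query[i - 1] - model[j - 1])
            # A frame may be stretched (repeat a query or model frame) or matched.
            cost[i][j] = local + min(
                cost[i - 1][j],      # model frame j spans several query frames
                cost[i][j - 1],      # query frame i spans several model frames
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]
```

With this sketch, a sequence uttered at half speed, such as [1, 2, 3] stretched to [1, 1, 2, 2, 3, 3], aligns with the original at zero cost, which illustrates how warping can compensate for variations in speed.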