Patents by Inventor Sumit Basu
Sumit Basu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7443962
Abstract: A system and process for enabling a communication device having computing capability, a user interface and display, to conduct two-way voice communications between a user and a remote party over a communication link in such a manner that the remote party speaks but the user does not, is presented. In general, a series of menus listing potential responses is displayed on the display of the communication device. In addition, there are a plurality of backchanneling responses provided that the user can select. These responses are employed by the user to communicate with the remote party, rather than speaking. This is accomplished by the user selecting one of the available responses. Once a selection has been made, a pre-recorded voice snippet corresponding to the selected response is accessed. The accessed voice snippet is then played back and transmitted to the remote party over the communication link.
Type: Grant
Filed: November 3, 2003
Date of Patent: October 28, 2008
Assignee: Microsoft Corporation
Inventor: Sumit Basu
-
Publication number: 20080140589
Abstract: An active learning framework is provided to extract information from particular fields from a variety of protocols. Extraction is performed in an unknown protocol, in which the user presents the system with a small number of labeled instances. The system then automatically generates an abundance of features and negative examples. A boosting approach is then used for feature selection and classifier combination. The system then displays its results for the user to correct and/or add new examples. The process can be iterated until the user is satisfied with the performance of the extraction capabilities provided by the classifiers generated by the system.
Type: Application
Filed: December 6, 2006
Publication date: June 12, 2008
Applicant: MICROSOFT CORPORATION
Inventors: Sumit Basu, Karthik Gopalratnam, John David Dunagan, Jiahe Helen Wang
-
Publication number: 20070289432
Abstract: A “Concatenative Synthesizer” applies concatenative synthesis to create a musical output from a database of musical notes and an input musical score (such as a MIDI score or other computer readable musical score format). In various embodiments, the musical output is either a music score, or an analog or digital audio file. This musical output is constructed by evaluating the database of musical notes to identify sets of candidate notes for each note of the input musical score. An “optimal path” through candidate notes is identified by minimizing an overall cost function through the candidate notes relative to the input musical score. The musical output is then constructed by concatenating the selected candidate notes. In further embodiments, the database of musical notes is generated from any desired musical genre, performer, performance, or instrument. Furthermore, notes in the database may be modified to better fit notes of the input musical score.
Type: Application
Filed: June 15, 2006
Publication date: December 20, 2007
Applicant: Microsoft Corporation
Inventors: Sumit Basu, Ian Simon, David Salesin, Maneesh Agrawala, Adil Sherwani, Chad Gibson
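The "optimal path" step in the abstract above can be illustrated with a small dynamic-programming sketch. This is not the method claimed in the application: the note representation, the `target_cost` and `join_cost` functions, and their weights are all hypothetical, chosen only to show how one candidate per score note might be selected by minimizing a summed cost. It assumes a non-empty score with at least one candidate per note.

```python
from typing import List, Tuple

# Hypothetical note representation: (MIDI pitch, duration in beats).
Note = Tuple[int, float]


def target_cost(candidate: Note, target: Note) -> float:
    """Hypothetical cost of using a database note to realize a score note."""
    return abs(candidate[0] - target[0]) + abs(candidate[1] - target[1])


def join_cost(prev: Note, nxt: Note) -> float:
    """Hypothetical cost of concatenating two database notes."""
    return 0.1 * abs(prev[0] - nxt[0])


def optimal_path(score: List[Note], candidates: List[List[Note]]) -> List[Note]:
    """Pick one candidate per score note so that the summed cost is minimal."""
    # best[i][j] = (lowest cost of a path ending at candidates[i][j], back-pointer)
    best = [[(target_cost(c, score[0]), -1) for c in candidates[0]]]
    for i in range(1, len(score)):
        row = []
        for c in candidates[i]:
            prev_cost, prev_idx = min(
                (best[i - 1][k][0] + join_cost(p, c), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((prev_cost + target_cost(c, score[i]), prev_idx))
        best.append(row)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(score) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]
```

Concatenating the audio for the returned notes would then give the musical output; that step is omitted here.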
-
Publication number: 20070286230
Abstract: After an initial training session, a “Dynamic Echo Canceller” (DEC) provides echo cancellation where only access to an input signal and a composite output signal are available, and the input signal is subjected to an unknown variable gain function. In one embodiment, the DEC uses echo cancellation to provide a “clean” copy of a second input signal where only a first input signal and a composite of the first and second input signal is available. An example is a “black box” amplifier coupled to a microphone and a phone line, with access to only the microphone input and a combined output signal where it is desired to retrieve a clean copy of a remote caller signal from the combined output. The DEC is applicable to many fields, including: signal separation; cancellation of echoes caused by impedance mismatches, periodic electrical noise, acoustic echoes caused by acoustic coupling, etc.
Type: Application
Filed: June 10, 2006
Publication date: December 13, 2007
Applicant: MICROSOFT CORPORATION
Inventor: Sumit Basu
-
Publication number: 20070261535
Abstract: Relating higher-level descriptive musical metadata to lower-level musical elements to enable creation of a song map, song model, backing track, or the like. The musical elements are queried based on input metadata to create a set of musical elements of varying types such as notes, chords, song structures, and the like. The set of musical elements is provided to a user for selection of particular musical elements. The selected musical elements represent the song model.
Type: Application
Filed: May 1, 2006
Publication date: November 15, 2007
Applicant: Microsoft Corporation
Inventors: Adil Sherwani, Chad Gibson, Sumit Basu
-
Publication number: 20070229652
Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infrared sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.
Type: Application
Filed: May 30, 2007
Publication date: October 4, 2007
Inventors: Julian Center, Christopher Wren, Sumit Basu, Evgeniy Gusyatin
-
Patent number: 7242421
Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infrared sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.
Type: Grant
Filed: November 13, 2001
Date of Patent: July 10, 2007
Assignee: Perceptive Network Technologies, Inc.
Inventors: Julian L. Center, Jr., Christopher R. Wren, Sumit Basu, Evgeniy Gusyatin
-
Patent number: 7220911
Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
Type: Grant
Filed: May 3, 2006
Date of Patent: May 22, 2007
Assignee: Microsoft Corporation
Inventor: Sumit Basu
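A rough sketch of the two stages the abstract names, frame-based energy followed by a search over time scalings and time shifts, might look like the following. It is a simplification under stated assumptions rather than the patented procedure: it correlates whole energy envelopes instead of isolated peaks, uses crude nearest-neighbour resampling, and every function name and parameter (`frame_energy`, `resample`, `best_alignment`, `scales`, `max_shift`) is hypothetical.

```python
from typing import List, Sequence, Tuple


def frame_energy(signal: Sequence[float], frame_len: int) -> List[float]:
    """Sum of squared samples in each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def resample(env: Sequence[float], scale: float) -> List[float]:
    """Nearest-neighbour time scaling of an energy envelope (crude stand-in)."""
    n = max(1, int(round(len(env) * scale)))
    return [env[min(len(env) - 1, int(i / scale))] for i in range(n)]


def best_alignment(
    env_a: Sequence[float],
    env_b: Sequence[float],
    scales: Sequence[float],
    max_shift: int,
) -> Tuple[float, int, float]:
    """Return the (scale, shift, score) with the highest envelope correlation."""
    best = (1.0, 0, float("-inf"))
    for s in scales:
        scaled = resample(env_b, s)
        for shift in range(max_shift + 1):
            # Unnormalized correlation over the overlapping frames.
            score = sum(a * b for a, b in zip(env_a[shift:], scaled))
            if score > best[2]:
                best = (s, shift, score)
    return best
```

Note that no tempo estimate appears anywhere in the search, which mirrors the abstract's point that a BPM never needs to be computed. Blending the two songs with the chosen scale and shift is left out of the sketch.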
-
Publication number: 20070067159
Abstract: Conversations that take place over an electronically recordable channel are analyzed by constructing a set of features from the speech of two participants in the conversation. The set of features is applied to a model or a plurality of models to determine the likelihood of the set of features for each model. These likelihoods are then used to classify the conversation into categories, provide real-time monitoring of the conversation, and/or identify anomalous conversations.
Type: Application
Filed: September 2, 2005
Publication date: March 22, 2007
Applicant: Microsoft Corporation
Inventors: Sumit Basu, Mauricio de la Fuente
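The likelihood-based classification described above can be sketched as follows. The abstract does not say what kind of models are used, so the diagonal Gaussian models, the feature values, the category names, and the `anomaly_threshold` parameter are all assumptions made for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical model: one (mean, variance) pair per feature.
Model = List[Tuple[float, float]]


def log_likelihood(features: Sequence[float], model: Model) -> float:
    """Log-likelihood of the features under an independent Gaussian model."""
    total = 0.0
    for x, (mean, var) in zip(features, model):
        total += -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return total


def classify(
    features: Sequence[float],
    models: Dict[str, Model],
    anomaly_threshold: float,
) -> Tuple[str, bool, Dict[str, float]]:
    """Return the best-scoring category, an anomaly flag, and all scores."""
    scores = {name: log_likelihood(features, m) for name, m in models.items()}
    best = max(scores, key=scores.get)
    # Flag the conversation when even the best model explains it poorly.
    return best, scores[best] < anomaly_threshold, scores
```

Calling `classify` repeatedly on features from a sliding window would be one way to approximate the real-time monitoring the abstract mentions.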
-
Publication number: 20060192478
Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
Type: Application
Filed: May 3, 2006
Publication date: August 31, 2006
Applicant: Microsoft Corporation
Inventor: Sumit Basu
-
Publication number: 20060167692
Abstract: The subject invention leverages spectral “palettes” or representations of an input sequence to provide recognition and/or synthesizing of a class of data. The class can include, but is not limited to, individual events, distributions of events, and/or environments relating to the input sequence. The representations are compressed versions of the data that utilize a substantially smaller amount of system resources to store and/or manipulate. Segments of the palettes are employed to facilitate in reconstruction of an event occurring in the input sequence. This provides an efficient means to recognize events, even when they occur in complex environments. The palettes themselves are constructed or “trained” utilizing any number of data compression techniques such as, for example, epitomes, vector quantization, and/or Huffman codes and the like.
Type: Application
Filed: January 24, 2005
Publication date: July 27, 2006
Applicant: Microsoft Corporation
Inventors: Sumit Basu, Nebojsa Jojic, Ashish Kapoor
-
Publication number: 20060165314
Abstract: A system and process for creating an apparently higher resolution image on a display exhibiting a lower resolution is presented. The basic idea is to make multiple decimated versions of an image at different offsets in a smooth path (all of which will contain different bits of detail), and then animate through the resulting decimated images (i.e., show them in rapid succession). The viewer sees what looks like a higher-resolution image moving in a smooth path. The viewer sees this since the human eye is capable of integrating details over the continuous motion. Thus, images such as text enjoy an enhanced legibility.
Type: Application
Filed: January 21, 2005
Publication date: July 27, 2006
Applicant: Microsoft Corporation
Inventors: Sumit Basu, Patrick Baudisch
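The core idea, decimating the same image at several offsets so that each low-resolution frame keeps different detail, can be shown with a minimal sketch. It assumes an image stored as a list of pixel rows and plain subsampling with no low-pass filtering, which a practical implementation would probably add; the function names and the offset path are hypothetical.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]


def decimate(image: Image, factor: int, dx: int, dy: int) -> Image:
    """Keep every `factor`-th pixel, starting at offset (dx, dy)."""
    return [row[dx::factor] for row in image[dy::factor]]


def decimated_frames(
    image: Image, factor: int, offsets: Sequence[Tuple[int, int]]
) -> List[Image]:
    """One decimated frame per offset along a (hypothetical) smooth path."""
    return [decimate(image, factor, dx, dy) for dx, dy in offsets]


# A 4x4 test image whose pixel values are simply 0..15.
image = [[4 * r + c for c in range(4)] for r in range(4)]
frames = decimated_frames(image, 2, [(0, 0), (1, 0), (1, 1), (0, 1)])
```

Each of the four frames samples a different quarter of the original pixels, so showing them in rapid succession exposes all of the detail over time.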
-
Patent number: 7081582
Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
Type: Grant
Filed: June 30, 2004
Date of Patent: July 25, 2006
Assignee: Microsoft Corporation
Inventor: Sumit Basu
-
Publication number: 20060120624
Abstract: A “Video Browser” provides an intuitive user interface for indexing, and interactive visual browsing, of particular elements within a video recording. In general, the Video Browser operates by first generating a set of one or more mosaic images from the video recording. In one embodiment, these mosaics are further clustered using an adjustable similarity threshold. User selection of a particular video mosaic then initiates a playback of corresponding video frames. However, in contrast to conventional mosaicing schemes which simply play back the set of frames used to construct the mosaic, the Video Browser provides a playback of only those individual frames within which a particular point selected within the image mosaic was observed. Consequently, user selection of a point in one of the image mosaics serves to provide a targeted playback of only those frames of interest, rather than playing back the entire image sequence used to generate the mosaic.
Type: Application
Filed: December 8, 2004
Publication date: June 8, 2006
Applicant: Microsoft Corporation
Inventors: Nebojsa Jojic, Sumit Basu
-
Publication number: 20060000344
Abstract: A “music mixer”, as described herein, provides a capability for automatically mixing arbitrary pieces of music, regardless of whether the music being mixed is of the same music genre, and regardless of whether that music has strong beat structures. In automatically determining potential mixes of two or more songs, the music mixer first computes a frame-based energy for each song. Using the computed frame-based energies, the music mixer then computes one or more potentially optimal alignments of the digital signals representing each song based on correlating peaks of the computed energies across a range of time scalings and time shifts without the need to ever compute or evaluate a beats-per-minute (BPM) for any of the songs. Then, once one of the potentially optimal time-scalings and time-shifts has been selected, the songs are then simply blended together using those parameters.
Type: Application
Filed: June 30, 2004
Publication date: January 5, 2006
Applicant: Microsoft Corporation
Inventor: Sumit Basu
-
Publication number: 20050198578
Abstract: A system and process for controlling common information displays, referred to as shared displays, is presented. The system and process allows multiple modes of input using a set of modules that accept and display data from a variety of sources. Input modules are able to understand data from a single mode of communication and to be able to generate messages as output accordingly. An optional translation module takes discrete message units and converts them into commands or requests that can be processed by a logic module. The logic module includes any application that is running on the shared display. A layout module lays out the information output by the logic module and a display module takes the layout data and converts the information to a form that can be readily displayed on a display device.
Type: Application
Filed: January 15, 2004
Publication date: September 8, 2005
Inventors: Maneesh Agrawala, Sumit Basu, Steven Drucker, Ronald Logan, Trausti Kristjansson, Tim Paek, Kentaro Toyama, Andrew Wilson
-
Publication number: 20050193328
Abstract: A browsing system and method for browsing allows multiple users to access and view hypertext documents on a shared display. A browsing system includes a hypertext document converter configured to convert a component in a hypertext document to include alternate component activation tags. A hypertext display controller controls a display module to display the converted component in the hypertext document. The input processor receives and processes an input signal related to the alternate component activation tag from at least one of the plurality of input devices. The browsing system activates the converted component of the hypertext document upon receiving the input signal.
Type: Application
Filed: February 27, 2004
Publication date: September 1, 2005
Applicant: Microsoft Corporation
Inventors: Maneesh Agrawala, Sumit Basu, Steven Drucker, Ronald Logan, Trausti Kristjansson, Tim Paek, Kentaro Toyama, Andrew Wilson
-
Publication number: 20050094781
Abstract: A system and process for enabling a communication device having computing capability, a user interface and display, to conduct two-way voice communications between a user and a remote party over a communication link in such a manner that the remote party speaks but the user does not, is presented. In general, a series of menus listing potential responses is displayed on the display of the communication device. In addition, there are a plurality of backchanneling responses provided that the user can select. These responses are employed by the user to communicate with the remote party, rather than speaking. This is accomplished by the user selecting one of the available responses. Once a selection has been made, a pre-recorded voice snippet corresponding to the selected response is accessed. The accessed voice snippet is then played back and transmitted to the remote party over the communication link.
Type: Application
Filed: November 3, 2003
Publication date: May 5, 2005
Applicant: Microsoft Corporation
Inventor: Sumit Basu
-
Publication number: 20020163572
Abstract: A method of establishing a communications link uses automatic sensing of a computer user's presence and activity state to record user attributes in a form accessible to other computers in a communications network. Such automatic sensing may include keyboard/mouse monitors, cameras with associated image processing algorithms, speech detectors, RF radiation detectors, and infra-red sensors. Preferably, the attribute recording is done in a server process which can be accessed by other computer programs. A first application of this method is to inform persons at remote locations whether the party to be called is available to receive a call. A second application of the method is to use a Connection Agent to determine whether all desired participants for a conference, or at least a quorum of them, are present and available, so that the conference can be started.
Type: Application
Filed: November 13, 2001
Publication date: November 7, 2002
Inventors: Julian L. Center, Christopher R. Wren, Sumit Basu, Evgeniy Gusyatin
-
Publication number: 20020113687
Abstract: A biometric identification method of identifying a person combines facial identification steps with audio identification steps. In order to reduce vulnerability of a recognition system to deception using photographs or even three-dimensional masks or replicas, the system uses a sequence of images to verify that lips and chin are moving as a predetermined sequence of sounds are uttered by a person who desires to be identified. In order to compensate for variations in speed of making the utterance, a dynamic time warping algorithm is used to normalize length of the input utterance to match the length of a model utterance previously stored for the person. In order to prevent deception based on two-dimensional images, preferably two cameras pointed in different directions are used for facial recognition.
Type: Application
Filed: November 13, 2001
Publication date: August 22, 2002
Inventors: Julian L. Center, Christopher R. Wren, Sumit Basu
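The dynamic time warping step mentioned above can be illustrated with the textbook algorithm. The sketch below assumes each utterance is a sequence of scalar feature values (real systems would use feature vectors) and that both sequences are non-empty; `dtw_align` is a hypothetical name, and the application does not specify this exact formulation. It returns the alignment cost together with the input utterance warped to the length of the stored model utterance.

```python
from typing import List, Sequence, Tuple


def dtw_align(
    inp: Sequence[float], model: Sequence[float]
) -> Tuple[float, List[float]]:
    """Classic DTW: return (total cost, input warped to the model's length)."""
    n, m = len(inp), len(model)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of inp[:i] with model[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(inp[i - 1] - model[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )

    # Backtrack, recording one aligned input frame per model frame.
    warped: List[float] = [0.0] * m
    i, j = n, m
    while i > 0 and j > 0:
        warped[j - 1] = inp[i - 1]
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    return cost[n][m], warped
```

A slowly spoken input such as `[1, 1, 2, 3, 3]` aligns to a stored model `[1, 2, 3]` at zero cost, which is the kind of speed variation the abstract says the warping compensates for.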