Patents by Inventor Duanpei Wu

Duanpei Wu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20070156924
    Abstract: Disclosed are video conferencing systems, devices, architectures, and methods for transcoding, transrating, and the like, to facilitate video streaming in a distributed arrangement. An exemplary translator in accordance with embodiments can include: an input configured to receive a first video stream in a first format, the first video stream being from a first media switch, the first media switch being associated with a first stream group having one or more first endpoints; and an output configured to provide a second video stream in a second format, the second video stream being sent to a second media switch, the second media switch being associated with a second stream group having one or more second endpoints, whereby the translator is configured to convert from the first to the second format.
    Type: Application
    Filed: January 3, 2006
    Publication date: July 5, 2007
    Applicant: Cisco Technology, Inc.
    Inventors: Thiyagesan Ramalingam, Francis Viggiano, Nermin Ismail, Walter Friedrich, Duanpei Wu, Shantanu Sarkar
  • Publication number: 20070153712
    Abstract: Disclosed are video conferencing systems, devices, architectures, and methods for using media notifications to coordinate switching between video in a distributed arrangement. An exemplary media switch in accordance with embodiments can include: a first interface configured for a first type communication with an endpoint; a second interface configured for the first type communication with another media switch, the second interface being configured to receive a first video stream having a first characteristic and a second video stream having a second characteristic; a third interface configured for a second type communication with a stream controller, the stream controller being configured to provide a notification; and a fourth interface configured for the second type communication with a controlling server, whereby the media switch is configured to re-target an active stream in response to the notification or a difference between the first and second characteristics.
    Type: Application
    Filed: January 5, 2006
    Publication date: July 5, 2007
    Applicant: Cisco Technology, Inc.
    Inventors: Steven Fry, Thiyagesan Ramalingam, Nermin Ismail, Walter Friedrich, Duanpei Wu
  • Publication number: 20070064901
    Abstract: According to an embodiment of the present invention, an apparatus for performing video conferencing is provided that includes an I-frame injector element operable to intercept I-frame requests from one or more end points and to attempt to service the I-frame requests such that at least a portion of the requests are prevented from propagating back to an originating sender. In more specific embodiments, when a receiver endpoint sends a fast video update (FVU) request upstream, it is intercepted by the I-frame injector element and rather than passing the FVU request to the sender the I-frame injector element replaces a next P-frame from the sender with an I-frame, whereby the I-frame is constructed so that when decoded, it matches the P-frame that it replaced. In still more detailed embodiments, the I-frame injector element operates in one of three modes that are associated with bandwidth parameters.
    Type: Application
    Filed: August 24, 2005
    Publication date: March 22, 2007
    Inventors: Randall Baird, Scott Firestone, Luke Surazski, Duanpei Wu
  • Patent number: 7084898
    Abstract: An audio mixer on a first device receives one or more incoming audio streams. Each of the one or more incoming audio streams has an associated timestamp. The audio mixer generates a mixed audio stream from the one or more incoming audio streams. The audio mixer determines differences in the time base of each of the one or more incoming audio streams and the time base for the mixed audio stream. The audio mixer generates mapping parameters associated with the determined differences and transforms the timestamp of each of the one or more incoming audio streams to a corresponding output timestamp associated with the mixed audio stream according to the mapping parameters. The mapping parameters are provided to a video mixer for similar processing and transformation such that the mixed audio stream is in synchronization with a mixed video stream.
    Type: Grant
    Filed: November 18, 2003
    Date of Patent: August 1, 2006
    Assignee: Cisco Technology, Inc.
    Inventors: Scott S. Firestone, Walter R. Friedrich, Nermin M. Ismail, Keith A. Lantz, Shantanu Sarkar, Luke K. Surazski, Duanpei Wu
  • Patent number: 6989856
    Abstract: A method for executing a video conference is provided that includes receiving one or more audio streams associated with a video conference from one or more end points and determining an active speaker associated with one of the end points. Audio information associated with the active speaker may be received at one or more media switches. One or more video streams may be suppressed except for a selected video stream associated with the active speaker, the selected video stream propagating to one or more of the media switches during the video conference. The selected video stream may be replicated such that it may be communicated to one or more of the end points associated with a selected one of the media switches.
    Type: Grant
    Filed: November 6, 2003
    Date of Patent: January 24, 2006
    Assignee: Cisco Technology, Inc.
    Inventors: Scott S. Firestone, Walter R. Friedrich, Nermin M. Ismail, Keith A. Lantz, Shantanu Sarkar, Luke K. Surazski, Duanpei Wu
  • Publication number: 20050248652
    Abstract: A method for executing a video conference is provided that includes receiving one or more audio streams associated with a video conference from one or more end points and determining an active speaker associated with one of the end points. Audio information associated with the active speaker may be received at one or more media switches. One or more video streams may be suppressed except for a selected video stream associated with the active speaker, the selected video stream propagating to one or more of the media switches during the video conference. The selected video stream may be replicated such that it may be communicated to one or more of the end points associated with a selected one of the media switches.
    Type: Application
    Filed: July 12, 2005
    Publication date: November 10, 2005
    Inventors: Scott Firestone, Walter Friedrich, Nermin Ismail, Keith Lantz, Shantanu Sarkar, Luke Surazski, Duanpei Wu
  • Publication number: 20050078170
    Abstract: A method for executing a video conference is provided that includes receiving one or more audio streams associated with a video conference from one or more end points and determining an active speaker associated with one of the end points. Audio information associated with the active speaker may be received at one or more media switches. One or more video streams may be suppressed except for a selected video stream associated with the active speaker, the selected video stream propagating to one or more of the media switches during the video conference. The selected video stream may be replicated such that it may be communicated to one or more of the end points associated with a selected one of the media switches.
    Type: Application
    Filed: October 8, 2003
    Publication date: April 14, 2005
    Inventors: Scott Firestone, Walter Friedrich, Nermin Ismail, Keith Lantz, Shantanu Sarkar, Luke Surazski, Duanpei Wu
  • Publication number: 20050078171
    Abstract: A method for executing a video conference is provided that includes receiving one or more audio streams associated with a video conference from one or more end points and determining an active speaker associated with one of the end points. Audio information associated with the active speaker may be received at one or more media switches. One or more video streams may be suppressed except for a selected video stream associated with the active speaker, the selected video stream propagating to one or more of the media switches during the video conference. The selected video stream may be replicated such that it may be communicated to one or more of the end points associated with a selected one of the media switches.
    Type: Application
    Filed: November 6, 2003
    Publication date: April 14, 2005
    Inventors: Scott Firestone, Walter Friedrich, Nermin Ismail, Keith Lantz, Shantanu Sarkar, Luke Surazski, Duanpei Wu
  • Patent number: 6826528
    Abstract: A method for implementing a noise suppressor in a speech recognition system comprises a filter bank for separating source speech data into discrete frequency sub-bands to generate filtered channel energy, and a noise suppressor for weighting the frequency sub-bands to improve the signal-to-noise ratio of the resultant noise-suppressed channel energy. The noise suppressor preferably includes a noise calculator for calculating background noise values, a speech energy calculator for calculating speech energy values for each channel of the filter bank, and a weighting module for applying calculated weighting values to the projected channel energy to generate the noise-suppressed channel energy.
    Type: Grant
    Filed: October 18, 2000
    Date of Patent: November 30, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Xavier Menendez-Pidal
  • Patent number: 6778959
    Abstract: A system and method for speech verification using out-of-vocabulary models includes a speech recognizer that has a model bank with system vocabulary word models, a garbage model, and one or more noise models. The model bank may reject an utterance or other sound as an invalid vocabulary word when the model bank identifies the utterance or other sound as corresponding to the garbage model or the noise models. Initial noise models may be selectively combined into a pre-determined number of final noise model clusters to effectively reduce the number of noise models that are utilized by the model bank of the speech recognizer to verify system vocabulary words.
    Type: Grant
    Filed: October 18, 2000
    Date of Patent: August 17, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Lex Olorenshaw, Xavier Menendez-Pidal, Ruxin Chen
  • Patent number: 6751588
    Abstract: A method for performing microphone conversions in a speech recognition system comprises a speech module that simultaneously captures an identical input signal using both an original microphone and a final microphone. The original microphone is also used to record an original training database. The final microphone is also used to capture input signals during normal use of the speech recognition system. A characterization module then analyzes the recorded identical input signal to generate characterization values that are subsequently utilized by a conversion module to convert the original training database into a final training database. A training program then uses the final training database to train a recognizer in the speech module in order to optimally perform a speech recognition process, in accordance with the present invention.
    Type: Grant
    Filed: November 23, 1999
    Date of Patent: June 15, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Miyuki Tanaka, Duanpei Wu
  • Patent number: 6718302
    Abstract: A method for utilizing validity constraints in a speech endpoint detector comprises a validity manager that may utilize a pulse width module to validate utterances that include a plurality of energy pulses during a certain time period. The validity manager also may utilize a minimum power module to ensure that speech energy below a pre-determined level is not classified as a valid utterance. In addition, the validity manager may use a duration module to ensure that valid utterances fall within a specified duration. Finally, the validity manager may utilize a short-utterance minimum power module to specifically distinguish an utterance of short duration from background noise based on the energy level of the short utterance.
    Type: Grant
    Filed: January 12, 2000
    Date of Patent: April 6, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
  • Patent number: 6473735
    Abstract: The present invention comprises a system and method for speech verification using a confidence measure that includes a speech verifier which compares a differential score for a recognized word to a predetermined threshold value, where a recognized word is the word model that produced the highest recognition score. In one embodiment, a single threshold is used for each word in a vocabulary. In another embodiment, each word model has an associated threshold, so that a differential score for a recognized word is compared to a unique threshold associated with that word. In a further embodiment, pairs of confused words in the vocabulary are dealt with separately. If a confused word is the recognized word, the speech verifier compares the differential score to a threshold that depends on the word model that produced the next-highest recognition score. Different values for the various thresholds may maximize rejection accuracy or recognition accuracy.
    Type: Grant
    Filed: April 20, 2000
    Date of Patent: October 29, 2002
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Xavier Menendez-Pidal, Lex Olorenshaw, Ruxin Chen
  • Patent number: 6272460
    Abstract: A method for implementing a speech verification system for use in a noisy environment comprises the steps of generating a confidence index for an utterance using a speech verifier, and controlling the speech verifier with a processor, wherein the utterance contains frames of sound energy. The speech verifier includes a noise suppressor, a pitch detector, and a confidence determiner. The noise suppressor suppresses noise in each frame in the utterance by summing a frequency spectrum for each frame with frequency spectra of a selected number of previous frames to produce a spectral sum. The pitch detector applies a spectral comb window to each spectral sum to produce correlation values for each frame in the utterance. The pitch detector also applies an alternate spectral comb window to each spectral sum to produce alternate correlation values for each frame in the utterance. The confidence determiner evaluates the correlation values to produce a frame confidence measure for each frame in the utterance.
    Type: Grant
    Filed: March 8, 1999
    Date of Patent: August 7, 2001
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Lex Olorenshaw
  • Patent number: 6230122
    Abstract: A method for effectively suppressing background noise in a speech detection system comprises a filter bank for separating source speech data into discrete frequency sub-bands to generate filtered channel energy, and a noise suppressor for weighting the frequency sub-bands to improve the signal-to-noise ratio of the resultant noise-suppressed channel energy. The noise suppressor preferably includes a subspace module for using a Karhunen-Loeve transformation to create a subspace based on the background noise, a projection module for generating projected channel energy by projecting the filtered channel energy onto the created subspace, and a weighting module for applying calculated weighting values to the projected channel energy to generate the noise-suppressed channel energy.
    Type: Grant
    Filed: October 21, 1998
    Date of Patent: May 8, 2001
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Mariscela Amador-Hernandez
  • Patent number: 6216103
    Abstract: A method for implementing a speech recognition system for use during conditions with background noise includes the steps of calculating, in real-time, sequential short-term delta energy parameters for speech energy from a spoken utterance, determining threshold values in the speech energy, and identifying a beginning point and an ending point for the spoken utterance based on the relationship between the threshold values and the short-term delta energy parameters.
    Type: Grant
    Filed: October 20, 1997
    Date of Patent: April 10, 2001
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
  • Patent number: 6173258
    Abstract: A method for reducing noise distortions in a speech recognition system comprises a feature extractor that includes a noise-suppressor, one or more time cosine transforms, and a normalizer. The noise-suppressor preferably performs a spectral subtraction process early in the feature extraction procedure. The time cosine transforms preferably operate in a centered-mode to each perform a transformation in the time domain. The normalizer calculates and utilizes normalization values to generate normalized features for speech recognition. The calculated normalization values preferably include mean values, left variances and right variances.
    Type: Grant
    Filed: October 22, 1998
    Date of Patent: January 9, 2001
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Miyuki Tanaka, Ruxin Chen, Duanpei Wu
  • Patent number: 6006186
    Abstract: A method and an apparatus for a parameter sharing speech recognition system are provided. Speech signals are received into a processor of a speech recognition system. The speech signals are processed using a speech recognition system hosting a shared hidden Markov model (HMM) produced by generating a number of phoneme models, some of which are shared. The phoneme models are generated by retaining as a separate phoneme model any triphone model having a number of trained frames available that exceeds a prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having a common biphone exceed the prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having an equivalent effect on a phonemic context exceed the prespecified threshold.
    Type: Grant
    Filed: October 16, 1997
    Date of Patent: December 21, 1999
    Assignees: Sony Corporation, Sony Electronics, Inc.
    Inventors: Ruxin Chen, Miyuki Tanaka, Duanpei Wu, Lex S. Olorenshaw
  • Patent number: 5748843
    Abstract: Apparatus and method for speech recognition control of apparel manufacture equipment, such as a sewing machine, is provided. The invention includes a device for recognizing and translating an operator's verbal command into a digital control signal; a communication device such as a microphone for inputting the operator's verbal command into the recognizing and translating device; and interfacing means for presenting the digital control signal to the apparel manufacture equipment in a form recognized and accepted by the equipment. The method for voice control of apparel manufacture equipment according to the present invention comprises the steps of receiving an operator's verbal command; recognizing and translating the verbal command into a digital control signal; and routing this digital control signal to the apparel manufacture equipment in a form recognized by and actable upon by the equipment. An infrared light linkage may be employed to transmit commands from an operator to the machine's control circuitry.
    Type: Grant
    Filed: November 8, 1994
    Date of Patent: May 5, 1998
    Assignee: Clemson University
    Inventors: John C. Peck, Randy Rowland, Duanpei Wu
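
The threshold comparison described in the abstract of patent 6473735 above can be illustrated with a short sketch. This is not the patented implementation. The abstract does not define the differential score, so the sketch assumes it is the gap between the highest and next-highest recognition scores. The function name `verify` and the parameters `default_threshold`, `word_thresholds`, and `pair_thresholds` are hypothetical names chosen for illustration.

```python
from typing import Dict, Optional, Tuple


def verify(
    scores: Dict[str, float],
    default_threshold: float,
    word_thresholds: Optional[Dict[str, float]] = None,
    pair_thresholds: Optional[Dict[Tuple[str, str], float]] = None,
) -> Tuple[str, bool]:
    """Return (recognized_word, accepted) for one utterance.

    scores maps each word model to its recognition score.
    ASSUMPTION: the differential score is best score minus runner-up score.
    """
    if len(scores) < 2:
        raise ValueError("need at least two word models to form a differential score")

    # Rank word models from highest to lowest recognition score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (runner_up, runner_up_score) = ranked[0], ranked[1]
    differential = best_score - runner_up_score

    # Single threshold shared by the whole vocabulary.
    threshold = default_threshold

    # Per-word threshold, if one is registered for the recognized word.
    if word_thresholds and best in word_thresholds:
        threshold = word_thresholds[best]

    # Confused-pair threshold: depends on which model came second.
    if pair_thresholds and (best, runner_up) in pair_thresholds:
        threshold = pair_thresholds[(best, runner_up)]

    return best, differential >= threshold


if __name__ == "__main__":
    example_scores = {"yes": 10.0, "no": 4.0, "go": 3.5}
    print(verify(example_scores, default_threshold=5.0))
    print(verify(example_scores, 5.0, pair_thresholds={("yes", "no"): 8.0}))
```

In this sketch a confused-pair threshold takes precedence over a per-word threshold, which in turn takes precedence over the shared default. That ordering is a design choice made here, not something the abstract specifies.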