Patents by Inventor Robert Boman

Robert Boman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9064161
    Abstract: A system and method for detecting the presence of known or unknown objects based on visual features is disclosed. In the preferred embodiment, the system is a checkout system for detecting items of merchandise on a shopping cart. The merchandise checkout system preferably includes a feature extractor for extracting visual features from a plurality of images; a motion detector configured to detect one or more groups of the visual features present in at least two of the plurality of images; a classifier to classify each of said groups of the visual features based on one or more parameters, wherein each of the one or more parameters is associated with one of said groups of visual features; and an alarm configured to generate an alert if the one or more parameters for any of said groups of the visual features satisfy one or more classification criteria.
    Type: Grant
    Filed: June 8, 2007
    Date of Patent: June 23, 2015
    Assignee: Datalogic ADC, Inc.
    Inventors: Robert Boman, Luis Goncalves, James Ostrowski
  • Publication number: 20110286628
    Abstract: A method of organizing a set of recognition models of known objects stored in a database of an object recognition system includes determining a classification model for each known object and grouping the classification models into multiple classification model groups. Each classification model group identifies a portion of the database that contains the recognition models of the known objects having classification models that are members of the classification model group. The method also includes computing a representative classification model for each classification model group. Each representative classification model is derived from the classification models that are members of the classification model group. When a target object is to be recognized, the representative classification models are compared to a classification model of the target object to enable selection of a subset of the recognition models of the known objects for comparison to a recognition model of the target object.
    Type: Application
    Filed: May 13, 2011
    Publication date: November 24, 2011
    Inventors: Luis F. Goncalves, Jim Ostrowski, Robert Boman
  • Patent number: 7324943
    Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.
    Type: Grant
    Filed: October 2, 2003
    Date of Patent: January 29, 2008
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua
  • Publication number: 20050240324
    Abstract: The vehicular monitoring system obtains audio information from the speech of occupants within the vehicle and then processes that audio information to extract information about the behavior of the vehicle occupants. Using the behavioral information, the system then assesses whether said behavior is in compliance with a predefined set of rules. Severe violations are reported to a third party via cellular telephone or other means. Less severe violations are recorded in a log file that is subsequently uploaded to a networked computer system for review and analysis. The behavioral rules used to assess violations may be modified by the administrative user.
    Type: Application
    Filed: April 26, 2004
    Publication date: October 27, 2005
    Inventors: Robert Boman, Roland Kuhn, Brian Hanson
  • Publication number: 20050228663
    Abstract: A media production system includes a textual alignment module aligning multiple speech recordings to textual lines of a script based on speech recognition results. A navigation module responds to user navigation selections respective of the textual lines of the script by communicating to the user corresponding, line-specific portions of the multiple speech recordings. An editing module responds to user associations of multiple speech recordings with textual lines by accumulating line-specific portions of the multiple speech recordings in a combination recording based on at least one of relationships of textual lines in the script to the combination recording, and temporal alignments between the multiple speech recordings and the combination recording.
    Type: Application
    Filed: March 31, 2004
    Publication date: October 13, 2005
    Inventors: Robert Boman, Patrick Nguyen, Jean-Claude Junqua
  • Publication number: 20050148339
    Abstract: A personal item monitoring system includes a monitor having a transmitter and a receiver located therein. At least one radio identification tag is adapted to be coupled to a personal item. Alternatively, the radio identification tag may be pre-installed into the personal item. The monitor emits a radio frequency received by the radio frequency identification tag, and the radio frequency identification tag emits a responding signal if within a detection range. The monitor then alerts a user if the radio identification tag leaves the range of detection.
    Type: Application
    Filed: January 6, 2004
    Publication date: July 7, 2005
    Inventors: Robert Boman, Brian Hanson
  • Publication number: 20050114357
    Abstract: An indexing system for tagging a media stream is provided. The indexing system includes a plurality of inputs for defining at least one tag. A tagging system assigns the tag to the media stream. A tag analysis system selectively distributes tags for review and editing by members of a collaborative group. A tag database stores the tag and the media stream. A retrieval architecture can search the database using the tags.
    Type: Application
    Filed: November 20, 2003
    Publication date: May 26, 2005
    Inventors: Rathinavelu Chengalvarayan, Philippe Morin, Robert Boman, Ted Applebaum
  • Patent number: 6895257
    Abstract: Personalized agent services are provided in a personal messaging device, such as a cellular telephone or personal digital assistant, through services of a speech recognizer that converts speech into text and a text-to-speech synthesizer that converts text to speech. Both recognizer and synthesizer may be server-based or locally deployed within the device. The user dictates an e-mail message which is converted to text and stored. The stored text is sent back to the user as text or as synthesized speech, to allow the user to edit the message and correct transcription errors before sending as e-mail. The system includes a summarization module that prepares short summaries of incoming e-mail and voice mail. The user may access these summaries, and retrieve and organize email and voice mail using speech commands.
    Type: Grant
    Filed: February 18, 2002
    Date of Patent: May 17, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Robert Boman, Kirill Stoimenov, Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6889189
    Abstract: System speakers are switched to function as sound input transducers to improve recognizer performance and to support recognizer features. A crossbar switch is selectively activated, either manually or under software control, to allow system loudspeakers to function as sound input transducers that supplement the recognition system microphone or microphone array. Using loudspeakers as “microphones” improves speech recognition in noisy environments, thus attaining better recognition performance with little added system cost. The loudspeakers, positioned in physically separate locations, also provide spatial information that can be used to determine the location of the person speaking and thereby offer different functionality for different persons. Acoustic models are selected based on environmental and vehicle operating conditions and may be adapted dynamically using ambient information obtained using the loudspeakers as sound input transducers.
    Type: Grant
    Filed: September 26, 2003
    Date of Patent: May 3, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Robert Boman, Luca Rigazio, Brian Hanson, Rathinavelu Chengalvarayan
  • Publication number: 20050075881
    Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.
    Type: Application
    Filed: October 2, 2003
    Publication date: April 7, 2005
    Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua
  • Publication number: 20050071159
    Abstract: System speakers are switched to function as sound input transducers to improve recognizer performance and to support recognizer features. A crossbar switch is selectively activated, either manually or under software control, to allow system loudspeakers to function as sound input transducers that supplement the recognition system microphone or microphone array. Using loudspeakers as “microphones” improves speech recognition in noisy environments, thus attaining better recognition performance with little added system cost. The loudspeakers, positioned in physically separate locations, also provide spatial information that can be used to determine the location of the person speaking and thereby offer different functionality for different persons. Acoustic models are selected based on environmental and vehicle operating conditions and may be adapted dynamically using ambient information obtained using the loudspeakers as sound input transducers.
    Type: Application
    Filed: September 26, 2003
    Publication date: March 31, 2005
    Inventors: Robert Boman, Luca Rigazio, Brian Hanson, Rathinavelu Chengalvarayan
  • Publication number: 20050010411
    Abstract: A speech data mining system for use in generating a rich transcription having utility in call center management includes a speech differentiation module differentiating between speech of interacting speakers, and a speech recognition module improving automatic recognition of speech of one speaker based on interaction with another speaker employed as a reference speaker. A transcript generation module generates a rich transcript based on recognized speech of the speakers. Focused, interactive language models improve recognition of a customer on a low quality channel using context extracted from speech of a call center operator on a high quality channel with a speech model adapted to the operator. Mined speech data includes number of interaction turns, customer frustration phrases, operator politeness, interruptions, and/or contexts extracted from speech recognition results, such as topics, complaints, solutions, and resolutions.
    Type: Application
    Filed: July 9, 2003
    Publication date: January 13, 2005
    Inventors: Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
  • Patent number: 6697778
    Abstract: Client speaker locations in a speaker space are used to generate speech models for comparison with test speaker data or test speaker speech models. The speaker space can be constructed using training speakers that are entirely separate from the population of client speakers, or from client speakers, or from a mix of training and client speakers. Reestimation of the speaker space based on client environment information is also provided to improve the likelihood that the client data will fall within the speaker space. During enrollment of the clients into the speaker space, additional client speech can be obtained when predetermined conditions are met. The speaker distribution can also be used in the client enrollment step.
    Type: Grant
    Filed: July 5, 2000
    Date of Patent: February 24, 2004
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Olivier Thyes, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
  • Patent number: 6691091
    Abstract: A noise adaptation system and method provide for noise adaptation in a speech recognition system. The method includes the steps of generating a reference model based on a training speech signal, and compensating the reference model for additive noise in the cepstral domain. The reference model is also compensated for convolutional noise in the cepstral domain. In one embodiment, the convolutional noise is compensated for by estimating a convolutional bias between the reference model and a target speech signal. The estimated convolutional bias is transformed with a channel adaptation matrix, and the transformed convolutional bias is added to the reference model in the cepstral domain.
    Type: Grant
    Filed: July 31, 2000
    Date of Patent: February 10, 2004
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Christophe Cerisara, Luca Rigazio, Robert Boman, Jean-Claude Junqua
  • Publication number: 20030157968
    Abstract: Personalized agent services are provided in a personal messaging device, such as a cellular telephone or personal digital assistant, through services of a speech recognizer that converts speech into text and a text-to-speech synthesizer that converts text to speech. Both recognizer and synthesizer may be server-based or locally deployed within the device. The user dictates an e-mail message which is converted to text and stored. The stored text is sent back to the user as text or as synthesized speech, to allow the user to edit the message and correct transcription errors before sending as e-mail. The system includes a summarization module that prepares short summaries of incoming e-mail and voice mail. The user may access these summaries, and retrieve and organize email and voice mail using speech commands.
    Type: Application
    Filed: February 18, 2002
    Publication date: August 21, 2003
    Inventors: Robert Boman, Kirill Stoimenov, Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6529872
    Abstract: The improved noise adaptation technique employs a linear or non-linear transformation to the set of Jacobian matrices corresponding to an initial noise condition. An α-adaptation parameter or artificial intelligence operation is employed in a linear or non-linear way to increase the adaptation bias added to the speech models. This corrects shortcomings of conventional Jacobian adaptation, which tends to underestimate the effect of noise. The improved adaptation technique is further enhanced by a reduced dimensionality, principal component analysis technique that reduces the computational burden, making the adaptation technique beneficial in embedded recognition systems.
    Type: Grant
    Filed: April 18, 2000
    Date of Patent: March 4, 2003
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Christophe Cerisara, Luca Rigazio, Robert Boman, Jean-Claude Junqua
  • Patent number: 6480819
    Abstract: A method and apparatus are provided to enable a user watching and/or listening to a program to search for new information in a stream of telecommunications data. The apparatus includes a voice recognition system that recognizes the user's request and causes a search to be performed in the long stream of data of at least one other telecommunication channel. The system includes a storage device for storing and processing the request. Upon recognition of the request, the incoming signal or signals are scanned for matches with the request. Upon finding a match between the request and the incoming signal, information related to the data is brought to the viewer's attention. This can be accomplished either by changing the viewer's station or by bringing a split-screen display forward on the display.
    Type: Grant
    Filed: February 25, 1999
    Date of Patent: November 12, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Robert Boman, Jean-Claude Junqua
  • Patent number: 6141644
    Abstract: Speech models are constructed and trained upon the speech of known client speakers (and also impostor speakers, in the case of speaker verification). Parameters from these models are concatenated to define supervectors and a linear transformation upon these supervectors results in a dimensionality reduction yielding a low-dimensional space called eigenspace. The training speakers are then represented as points or distributions in eigenspace. Thereafter, new speech data from the test speaker is placed into eigenspace through a similar linear transformation and the proximity in eigenspace of the test speaker to the training speakers serves to authenticate or identify the test speaker.
    Type: Grant
    Filed: September 4, 1998
    Date of Patent: October 31, 2000
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
  • Patent number: 5946649
    Abstract: The present invention eliminates injection noise in speech produced by esophageal speakers. A speech input signal is digitized. One copy of the digitized signal is used for analysis and the other is passed through a gain switch to an amplifier as output. A Fast Fourier Transform (FFT) and a mean value of the digitized speech input signal are calculated. The FFT is passed through a morphological filter to produce a filtered spectrum. An occurrence of injection noise is detected by calculating a derivative of the filtered spectrum and determining from the mean value and the derivative a location and value of a largest peak and a second largest peak in the filtered spectrum. If the largest peak is lower in frequency than the second largest peak, and if all points above 2 kHz are less than the mean, then an occurrence of injection noise has been detected.
    Type: Grant
    Filed: April 16, 1997
    Date of Patent: August 31, 1999
    Assignee: Technology Research Association of Medical Welfare Apparatus
    Inventors: Hector Raul Javkin, Michael Galler, Nancy Niedzielski, Robert Boman
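
The decision rule in the abstract of patent 5946649 can be illustrated with a rough Python sketch. This is not the patented implementation: it starts from an already computed magnitude spectrum rather than an FFT, it assumes the morphological filter is a flat 1-D opening, and it takes the mean over the filtered spectrum, none of which the abstract specifies. All function names, the filter width, and the bin spacing are hypothetical.

```python
def erode(values, width):
    """Flat 1-D erosion: each point becomes the minimum of its neighborhood."""
    half = width // 2
    n = len(values)
    return [min(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(values, width):
    """Flat 1-D dilation: each point becomes the maximum of its neighborhood."""
    half = width // 2
    n = len(values)
    return [max(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def morphological_open(values, width=3):
    """Opening (erosion then dilation); assumed choice of morphological filter."""
    return dilate(erode(values, width), width)


def find_peaks(spectrum):
    """Indices where the discrete derivative goes from positive to non-positive."""
    deriv = [spectrum[i + 1] - spectrum[i] for i in range(len(spectrum) - 1)]
    return [i for i in range(1, len(spectrum) - 1)
            if deriv[i - 1] > 0 and deriv[i] <= 0]


def is_injection_noise(spectrum, bin_hz, mean_value, cutoff_hz=2000.0):
    """Apply the two conditions stated in the abstract to a filtered spectrum."""
    peaks = find_peaks(spectrum)
    if len(peaks) < 2:
        return False
    ranked = sorted(peaks, key=lambda i: spectrum[i], reverse=True)
    largest, second = ranked[0], ranked[1]
    # Condition 1: the largest peak lies at a lower frequency than the second largest.
    largest_is_lower = largest < second
    # Condition 2: every point above the cutoff frequency is below the mean.
    high_band_quiet = all(v < mean_value
                          for i, v in enumerate(spectrum) if i * bin_hz > cutoff_hz)
    return largest_is_lower and high_band_quiet


def detect(raw_spectrum, bin_hz):
    """Filter a magnitude spectrum, then test it (mean over the filtered spectrum)."""
    filtered = morphological_open(raw_spectrum)
    mean_value = sum(filtered) / len(filtered)
    return is_injection_noise(filtered, bin_hz, mean_value)


# Made-up 16-bin spectra with 250 Hz spacing, for illustration only.
low_dominant = [0.1, 4, 10, 4, 0.1, 3, 6, 3] + [0.1] * 8
high_dominant = [0.1, 3, 6, 3, 0.1, 4, 10, 4] + [0.1] * 8
print(detect(low_dominant, 250.0), detect(high_dominant, 250.0))
```

In the first made-up spectrum the strongest peak sits below the second strongest and the band above 2 kHz is quiet, so both conditions hold; in the second the peak order is reversed and the test fails.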