Patents by Inventor Yan Ming Cheng
Yan Ming Cheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11367092
Abstract: A method of extracting price text from an image set includes: obtaining input data comprising (i) a plurality of images depicting shelves supporting products, and (ii) for each of the images, a set of text regions and corresponding price text strings; registering the images to a common frame of reference; identifying a subset of the text regions having overlapping locations in the common frame of reference; selecting one of the text regions from the subset; and presenting the price text string corresponding to the one of the text regions for further processing.
Type: Grant
Filed: May 1, 2017
Date of Patent: June 21, 2022
Assignee: Symbol Technologies, LLC
Inventors: Yan Zhang, Robert E. Beach, Bo Fu, Yan-Ming Cheng, Jordan Varley, Iaacov Coby Segall
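The overlap-and-select step described in the price text abstract above can be illustrated with a minimal sketch. It assumes the text regions have already been registered to the common frame of reference and are axis-aligned boxes. The `TextRegion` type, the `confidence` field, and the "pick the highest-confidence region" rule are hypothetical choices made for illustration; the abstract does not say how the one region is selected.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextRegion:
    # Box in the common frame of reference: (x_min, y_min, x_max, y_max).
    box: Tuple[float, float, float, float]
    price_text: str
    confidence: float  # hypothetical OCR confidence, not named in the abstract


def overlaps(a: TextRegion, b: TextRegion) -> bool:
    """True when the two boxes share a region of non-zero area."""
    ax0, ay0, ax1, ay1 = a.box
    bx0, by0, bx1, by1 = b.box
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def group_overlapping(regions: List[TextRegion]) -> List[List[TextRegion]]:
    """Collect regions into subsets whose members overlap, directly or via a chain."""
    groups: List[List[TextRegion]] = []
    for region in regions:
        touching = [g for g in groups if any(overlaps(region, m) for m in g)]
        untouched = [g for g in groups if not any(overlaps(region, m) for m in g)]
        merged = [region] + [m for g in touching for m in g]
        groups = untouched + [merged]
    return groups


def select_price_texts(regions: List[TextRegion]) -> List[str]:
    """Pick one price string per overlapping subset (assumed rule: highest confidence)."""
    return [
        max(group, key=lambda r: r.confidence).price_text
        for group in group_overlapping(regions)
    ]
```

Two detections of the same shelf label seen in different images collapse into one subset, and only one of their price strings is passed on for further processing.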
-
Patent number: 10664229
Abstract: A method, apparatus, and electronic device for voice navigation are disclosed. A voice input mechanism 310 may receive a verbal input from a user to a voice user interface program invisible to the user. A processor 104 may identify in a graphical user interface (GUI) a set of GUI items. The processor 104 may convert the set of GUI items to a set of voice searchable indices 400. The processor 104 may correlate a matching GUI item of the set of GUI items to a phonemic representation of the verbal input.
Type: Grant
Filed: August 20, 2014
Date of Patent: May 26, 2020
Assignee: Google LLC
Inventors: Yan Ming Cheng, Changxue Ma, Theodore Mazurkiewicz
-
Publication number: 20200118063
Abstract: A method of object status detection for objects supported by a shelf, from shelf image data, includes: obtaining a plurality of images of a shelf, each image including an indication of a gap on the shelf between the objects; registering the images to a common frame of reference; identifying a subset of the gaps having overlapping locations in the common frame of reference; generating a consolidated gap indication from the subset; obtaining reference data including (i) identifiers for the objects and (ii) prescribed locations for the objects within the common frame of reference; based on a comparison of the consolidated gap indication with the reference data, selecting a target object identifier from the reference data; and generating and presenting a status notification for the target product identifier.
Type: Application
Filed: May 1, 2018
Publication date: April 16, 2020
Inventors: Bo Fu, Yan Zhang, Yan-Ming Cheng, Jordan K. Varley, Robert E. Beach, Iaacov Coby Segall, Richard Jeffrey Rzeszutek, Michael Ramputi
-
Publication number: 20180315065
Abstract: A method of extracting price text from an image set includes: obtaining input data comprising (i) a plurality of images depicting shelves supporting products, and (ii) for each of the images, a set of text regions and corresponding price text strings; registering the images to a common frame of reference; identifying a subset of the text regions having overlapping locations in the common frame of reference; selecting one of the text regions from the subset; and presenting the price text string corresponding to the one of the text regions for further processing.
Type: Application
Filed: May 1, 2017
Publication date: November 1, 2018
Inventors: Yan Zhang, Robert E. Beach, Bo Fu, Yan-Ming Cheng, Jordan Varley, Iaacov Coby Segall
-
Patent number: 9465796
Abstract: A computing device obtains an incomplete semantic map of a predefined space. The incomplete semantic map includes static landmarks. The computing device receives a set of natural language instructions including a sequence of semantically directive clauses, processes the sequence of semantically directive clauses, decodes one of an action and a path in the set of natural language instructions using an optimization process and based on the incomplete semantic map. In response to the decoding, the computing device inserts a newly identified landmark into the incomplete semantic map.
Type: Grant
Filed: December 1, 2014
Date of Patent: October 11, 2016
Assignee: Symbol Technologies, LLC
Inventor: Yan-Ming Cheng
-
Publication number: 20160154791
Abstract: A computing device obtains an incomplete semantic map of a predefined space. The incomplete semantic map includes static landmarks. The computing device receives a set of natural language instructions including a sequence of semantically directive clauses, processes the sequence of semantically directive clauses, decodes one of an action and a path in the set of natural language instructions using an optimization process and based on the incomplete semantic map. In response to the decoding, the computing device inserts a newly identified landmark into the incomplete semantic map.
Type: Application
Filed: December 1, 2014
Publication date: June 2, 2016
Inventor: YAN-MING CHENG
-
Patent number: 9081868
Abstract: A search system will receive a voice query and use speech recognition with a predefined vocabulary to generate a textual transcription of the voice query. Queries are sent to a text search engine, retrieving multiple web page results for each of these initial text queries. The collection of the keywords is extracted from the resulting web pages and is phonetically indexed to form a voice query dependent and phonetically searchable index database. Finally, a phonetically-based voice search engine is used to search the original voice query against the voice query dependent and phonetically searchable index database to find the keywords and/or key phrases that best match what was originally spoken. The keywords and/or key phrases that best match what was originally spoken are then used as a final text query for a search engine. Search results from the final text query are then presented to the user.
Type: Grant
Filed: December 16, 2009
Date of Patent: July 14, 2015
Assignee: GOOGLE TECHNOLOGY HOLDINGS LLC
Inventors: Fan Zhang, Yan-Ming Cheng, Changxue Ma, James R. Talley
-
Publication number: 20150057917
Abstract: Warping vectors of an image and audio are used to determine visual and verbal interaction effectiveness. A probability of successful placement of an unmanned vehicle is determined based on placement policies and the visual and verbal interaction effectiveness. A direction of movement is then determined that maximizes the probability of successful placement. Instructions are issued to move the unmanned vehicle towards the direction that maximizes the probability of successful placement.
Type: Application
Filed: August 21, 2013
Publication date: February 26, 2015
Applicant: MOTOROLA SOLUTIONS, INC.
Inventor: YAN-MING CHENG
-
Patent number: 8914289
Abstract: A method for parsing a verbal expression received from a user to determine whether or not the expression contains a multiple-goal command is described. Specifically, known techniques are applied to extract terms from the verbal expression. The extracted terms are assigned to categories. If two or more terms are found in the parsed verbal expression that are in associated categories and that do not overlap one another temporally, then the confidence levels of these terms are compared. If the confidence levels are similar, then the terms may be parallel entries in the verbal expression and may represent multiple goals. If a multiple-goal command is found, then the command is either presented to the user for review and possible editing or is executed. If the parsed multiple-goal command is presented to the user for review, then the presentation can be made via any appropriate interface including voice and text interfaces.
Type: Grant
Filed: December 16, 2009
Date of Patent: December 16, 2014
Assignee: Symbol Technologies, Inc.
Inventors: Changxue Ma, Yan-Ming Cheng
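The multiple-goal test described in the abstract above (associated categories, no temporal overlap, similar confidence) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Term` type, the `ASSOCIATED_CATEGORIES` table, the rule that a category is associated with itself, and the `max_confidence_gap` threshold are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass
class Term:
    text: str
    category: str
    start: float       # start time of the term in the utterance, in seconds
    end: float         # end time of the term, in seconds
    confidence: float  # recognizer confidence in [0, 1]


# Hypothetical table of category pairs treated as "associated".
ASSOCIATED_CATEGORIES = {frozenset({"artist", "album"})}


def associated(cat_a: str, cat_b: str) -> bool:
    """Assumption: a category is associated with itself and with listed partners."""
    return cat_a == cat_b or frozenset({cat_a, cat_b}) in ASSOCIATED_CATEGORIES


def temporally_disjoint(a: Term, b: Term) -> bool:
    """True when the two terms occupy non-overlapping time spans."""
    return a.end <= b.start or b.end <= a.start


def find_parallel_terms(
    terms: List[Term], max_confidence_gap: float = 0.15
) -> List[Tuple[str, str]]:
    """Return pairs of terms that may be parallel entries, i.e. multiple goals."""
    pairs = []
    for a, b in combinations(terms, 2):
        if (
            associated(a.category, b.category)
            and temporally_disjoint(a, b)
            and abs(a.confidence - b.confidence) <= max_confidence_gap
        ):
            pairs.append((a.text, b.text))
    return pairs
```

Two competing recognition hypotheses for the same stretch of audio overlap in time and are therefore never reported as separate goals; only terms spoken at different times with comparable confidence are paired.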
-
Publication number: 20140358903
Abstract: A method, apparatus, and electronic device for voice navigation are disclosed. A voice input mechanism 310 may receive a verbal input from a user to a voice user interface program invisible to the user. A processor 104 may identify in a graphical user interface (GUI) a set of GUI items. The processor 104 may convert the set of GUI items to a set of voice searchable indices 400. The processor 104 may correlate a matching GUI item of the set of GUI items to a phonemic representation of the verbal input.
Type: Application
Filed: August 20, 2014
Publication date: December 4, 2014
Inventors: Yan Ming Cheng, Changxue Ma, Theodore Mazurkiewicz
-
Patent number: 8442823
Abstract: A method of performing a search of a database of speakers, includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.
Type: Grant
Filed: October 19, 2010
Date of Patent: May 14, 2013
Assignee: Motorola Solutions, Inc.
Inventors: Woojay Jeon, Yan-Ming Cheng, Changxue Ma, Dusan Macho
-
Publication number: 20110145214
Abstract: A search system will receive a voice query and use speech recognition with a predefined vocabulary to generate a textual transcription of the voice query. Queries are sent to a text search engine, retrieving multiple web page results for each of these initial text queries. The collection of the keywords is extracted from the resulting web pages and is phonetically indexed to form a voice query dependent and phonetically searchable index database. Finally, a phonetically-based voice search engine is used to search the original voice query against the voice query dependent and phonetically searchable index database to find the keywords and/or key phrases that best match what was originally spoken. The keywords and/or key phrases that best match what was originally spoken are then used as a final text query for a search engine. Search results from the final text query are then presented to the user.
Type: Application
Filed: December 16, 2009
Publication date: June 16, 2011
Applicant: MOTOROLA, INC.
Inventors: Fan Zhang, Yan-Ming Cheng, Changxue Ma, James R. Talley
-
Publication number: 20110144996
Abstract: Disclosed is a method for parsing a verbal expression received from a user to determine whether or not the expression contains a multiple-goal command. Specifically, known techniques are applied to extract terms from the verbal expression. The extracted terms are assigned to categories. If two or more terms are found in the parsed verbal expression that are in associated categories and that do not overlap one another temporally, then the confidence levels of these terms are compared. If the confidence levels are similar, then the terms may be parallel entries in the verbal expression and may represent multiple goals. If a multiple-goal command is found, then the command is either presented to the user for review and possible editing or is executed. If the parsed multiple-goal command is presented to the user for review, then the presentation can be made via any appropriate interface including voice and text interfaces.
Type: Application
Filed: December 16, 2009
Publication date: June 16, 2011
Applicant: MOTOROLA, INC.
Inventors: Changxue Ma, Yan-Ming Cheng
-
Patent number: 7818170
Abstract: A method for distributed voice searching may include receiving a search query from a user of the mobile communication device, generating a lattice of coarse linguistic representations from speech parts in the search query, extracting query features from the generated lattice of coarse linguistic representations, generating coarse search feature vectors based on the extracted query features, performing a coarse search using the generated coarse search feature vectors and transmitting the generated coarse search feature vectors to a remote voice search processing unit, receiving remote resultant web indices from the remote voice search processing unit, generating a lattice of fine linguistic representations from speech parts in the search query, generating fine search feature vectors from the lattice of fine linguistic representations, performing a fine search using the coarse search results, the remote resultant web indices and the generated fine search feature vectors, and displaying the fine search results t
Type: Grant
Filed: April 10, 2007
Date of Patent: October 19, 2010
Assignee: Motorola, Inc.
Inventor: Yan Ming Cheng
-
Publication number: 20100145971
Abstract: A method and apparatus for generating a query from multimedia content is provided herein. During operation a query generator (101) will receive multi-media content and separate the multi-media content into at least a video portion and an audio portion. A query will be generated based on both the video portion and the audio portion. The query may comprise a single query based on both the video and audio portion, or the query may comprise a “bundle” of queries. The bundle of queries contains at least a query for the video portion, and a query for the audio portion of the multimedia event.
Type: Application
Filed: December 8, 2008
Publication date: June 10, 2010
Applicant: MOTOROLA, INC.
Inventors: Yan-Ming Cheng, John Richard Kane
-
Publication number: 20090131021
Abstract: A method (400, 500) of propagating an alert (116). The alert can be received on a first communication device (102). The alert can be associated with data indicating at least one peer-to-peer propagation parameter. Further, the alert can be automatically communicated from the first communication device to at least a second communication device in accordance with the peer-to-peer propagation parameter via peer-to-peer communications.
Type: Application
Filed: November 16, 2007
Publication date: May 21, 2009
Applicant: MOTOROLA, INC.
Inventors: Jerome O. Vogedes, Daniel A. Baudino, Charles P. Binzel, Yan Ming Cheng, Steven J. Nowlan, Jorge L. Perdomo, W. Garland Phillips
-
Patent number: 7471775
Abstract: A method and apparatus (100) for updating a voice tag comprising N stored voice tag phoneme sequences includes a function (110) for determining (205) an accepted stored voice tag phoneme sequence for an utterance, a function (140) for extracting (210) a current set of M phoneme sequences having highest likelihoods of representing the utterance, a function (160) for updating (215) a reference histogram associated with the accepted voice tag, and a function (160) for updating (225) the voice tag with N selected phoneme sequences that are selected from the current set of M phoneme sequences and the set of N voice tag phoneme sequences, wherein the N selected phoneme sequences have phoneme histograms most closely matching the reference histogram. The method and apparatus (100) also generates a voice tag using some functions (110, 140, 160) that are common with the method and apparatus to update the voice tag, such as the extracting (410) of the current set of M phoneme sequences.
Type: Grant
Filed: June 30, 2005
Date of Patent: December 30, 2008
Assignee: Motorola, Inc.
Inventor: Yan Ming Cheng
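The voice tag update in the abstract above keeps the N phoneme sequences whose phoneme histograms most closely match a reference histogram. A minimal sketch of that selection follows. The abstract does not specify how the reference histogram is updated or how closeness is measured, so the running phoneme count and the L1 distance used here are assumptions, and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

PhonemeSeq = Tuple[str, ...]


def phoneme_histogram(sequence: PhonemeSeq) -> Dict[str, float]:
    """Relative frequency of each phoneme in a non-empty sequence."""
    counts = Counter(sequence)
    total = sum(counts.values())
    return {phoneme: n / total for phoneme, n in counts.items()}


def l1_distance(h1: Dict[str, float], h2: Dict[str, float]) -> float:
    """Assumed closeness measure: sum of absolute differences over all phonemes."""
    return sum(abs(h1.get(p, 0.0) - h2.get(p, 0.0)) for p in set(h1) | set(h2))


def update_voice_tag(
    stored: List[PhonemeSeq],
    current: List[PhonemeSeq],
    reference_counts: Counter,
) -> List[PhonemeSeq]:
    """Return N = len(stored) sequences chosen from the stored and current sets.

    reference_counts is updated in place with the phonemes of the current
    M sequences (an assumed update rule), then normalized into the reference
    histogram used for ranking.
    """
    for seq in current:
        reference_counts.update(seq)
    total = sum(reference_counts.values())
    reference = {p: n / total for p, n in reference_counts.items()}

    # Pool both sets, dropping duplicates while preserving order.
    pool = list(dict.fromkeys(stored + current))
    pool.sort(key=lambda seq: l1_distance(phoneme_histogram(seq), reference))
    return pool[: len(stored)]
```

Because the reference accumulates evidence across utterances, a stored sequence that stops resembling what the user actually says is gradually displaced by better-matching candidates.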
-
Publication number: 20080256033
Abstract: A method for distributed voice searching may include receiving a search query from a user of the mobile communication device, generating a lattice of coarse linguistic representations from speech parts in the search query, extracting query features from the generated lattice of coarse linguistic representations, generating coarse search feature vectors based on the extracted query features, performing a coarse search using the generated coarse search feature vectors and transmitting the generated coarse search feature vectors to a remote voice search processing unit, receiving remote resultant web indices from the remote voice search processing unit, generating a lattice of fine linguistic representations from speech parts in the search query, generating fine search feature vectors from the lattice of fine linguistic representations, performing a fine search using the coarse search results, the remote resultant web indices and the generated fine search feature vectors, and displaying the fine search results t
Type: Application
Filed: April 10, 2007
Publication date: October 16, 2008
Applicant: Motorola, Inc.
Inventor: Yan Ming CHENG
-
Publication number: 20080215490
Abstract: A method, apparatus, and electronic device for optimizing content acquisition are disclosed. A memory may store usage of a previous set of media content by the mobile device. An input/output device may receive a request for a current set of media content. A processor may create a user profile based on the usage and provides a first recommendation of a first digital rights agreement based on the user profile.
Type: Application
Filed: March 3, 2007
Publication date: September 4, 2008
Applicant: Motorola, Inc.
Inventors: Jason N. Howard, Alfred N. Danial, Scott B. Davis, Thomas J. MacTavish, Yan Ming Cheng, Thomas J. Weigert
-
Publication number: 20080162128
Abstract: One provides (101) a plurality of frames of sampled audio content and then processes (102) that plurality of frames using a speech recognition search process that comprises, at least in part, determining whether to search each subword boundary contained within each frame on a frame-by-frame basis. These teachings will also readily accommodate determining whether to search each word boundary contained within each frame on a frame-by-frame basis.
Type: Application
Filed: December 29, 2006
Publication date: July 3, 2008
Applicant: MOTOROLA, INC.
Inventor: Yan Ming Cheng