Patents by Inventor CHIEH-CHI KAO
CHIEH-CHI KAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240071408
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
Type: Application
Filed: September 8, 2023
Publication date: February 29, 2024
Inventors: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
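The dual-detector arrangement the abstract describes can be sketched in a few lines. This is a hypothetical illustration only: the event names, thresholds, and stand-in scoring functions below are invented, not from the patent. The key point it demonstrates is that both AED components consume the *same* acoustic feature data, and the device reports the union of their detections.

```python
def predefined_aed(features, threshold=0.5):
    # Stand-in scorer for the fixed, predetermined event set.
    scores = {"glass_break": features[0], "smoke_alarm": features[1]}
    return [event for event, s in scores.items() if s >= threshold]

def custom_aed(features, enrolled_events, threshold=0.5):
    # Stand-in scorer for user-enrolled custom events; in this toy model
    # each enrolled event maps to an index in the shared feature vector.
    return [event for event, idx in enrolled_events.items()
            if features[idx] >= threshold]

def detect_events(features, enrolled_events):
    # Both components receive the same acoustic feature data, and the
    # device outputs predetermined and/or custom events that fired.
    return predefined_aed(features) + custom_aed(features, enrolled_events)

events = detect_events([0.9, 0.2, 0.7], {"doorbell": 2})
# events -> ["glass_break", "doorbell"]
```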
-
Patent number: 11790932
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
Type: Grant
Filed: December 10, 2021
Date of Patent: October 17, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
-
Publication number: 20230186939
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
Type: Application
Filed: December 10, 2021
Publication date: June 15, 2023
Inventors: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
-
Patent number: 11302329
Abstract: A system may include an acoustic event detection component for detecting acoustic events, which may be non-speech sounds. Upon detection of a command to detect a new sound, a device may prompt a user to cause occurrence of the sound one or more times. The acoustic event detection component may then be reconfigured, using audio data corresponding to the occurrences, to detect future occurrences of the event.
Type: Grant
Filed: June 29, 2020
Date of Patent: April 12, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Spyridon Matsoukas, Venkata Naga Krishna Chaitanya Puvvada, Chao Wang, Chieh-Chi Kao
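One simple way to realize the enrollment step the abstract describes is template matching: average the feature vectors from the user's few example occurrences into a template, then flag future audio whose features are close to it. This is a minimal sketch under that assumption; the actual patent's reconfiguration method, the cosine-similarity measure, and the threshold are stand-ins.

```python
import math

def enroll(example_features):
    # Average the feature vectors from the user's example occurrences
    # into a single template for the newly enrolled sound.
    n = len(example_features)
    dim = len(example_features[0])
    return [sum(f[d] for f in example_features) / n for d in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def detect(features, template, threshold=0.9):
    # A future occurrence is detected when its features are sufficiently
    # similar to the enrolled template.
    return cosine(features, template) >= threshold

template = enroll([[1.0, 0.1], [0.9, 0.2]])
```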
-
Patent number: 11069352
Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.
Type: Grant
Filed: February 18, 2019
Date of Patent: July 20, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Qingming Tang, Ming Sun, Chieh-Chi Kao, Chao Wang, Viktor Rozgic
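The hierarchical two-stage idea can be illustrated with a toy model: a local per-segment scorer, plus a second stage that folds in history. Here the second stage is approximated by an exponential moving average over segment scores; this is only an illustrative stand-in for the contextual model the abstract describes, and the `media_score` field and `alpha` smoothing factor are invented for the example.

```python
def segment_score(segment):
    # Stand-in for the locally run model's per-segment media score.
    return segment["media_score"]

def media_presence(segments, alpha=0.6):
    # Toy second stage: an exponential moving average over per-segment
    # scores supplies the historical/contextual signal, yielding a final
    # likelihood that the latest segment contains media-source sound.
    smoothed = None
    for seg in segments:
        s = segment_score(seg)
        smoothed = s if smoothed is None else alpha * s + (1 - alpha) * smoothed
    return smoothed

score = media_presence([{"media_score": 0.2},
                        {"media_score": 0.8},
                        {"media_score": 0.9}])
```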
-
Patent number: 10803885
Abstract: An audio event detection system that processes audio data into audio feature data and processes the audio feature data using pre-configured candidate interval lengths to identify top candidate regions of the feature data that may include an audio event. The feature data from the top candidate regions are then scored by a classifier, where the score indicates a likelihood that the candidate region corresponds to a desired audio event. The scores are compared to a threshold, and if the threshold is satisfied, the top scoring candidate region is determined to include an audio event.
Type: Grant
Filed: June 29, 2018
Date of Patent: October 13, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Chieh-Chi Kao, Chao Wang, Weiran Wang, Ming Sun
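The propose-then-classify pipeline in this abstract can be sketched as follows. The proposal stage here simply keeps the windows with the highest mean frame score, and the classifier is a stand-in lambda; both are assumptions for illustration, not the patent's actual models. Interval lengths are pre-configured, as the abstract states.

```python
def candidate_regions(frame_scores, interval_lengths, top_k=2):
    # Enumerate windows of each pre-configured interval length and keep
    # the top-k candidates by mean frame score (a toy proposal stage).
    candidates = []
    for length in interval_lengths:
        for start in range(len(frame_scores) - length + 1):
            window = frame_scores[start:start + length]
            candidates.append((sum(window) / length, start, start + length))
    return sorted(candidates, reverse=True)[:top_k]

def detect_event(frame_scores, interval_lengths, classifier, threshold=0.7):
    # Score the top candidate regions with the classifier and report an
    # event if the best score satisfies the threshold.
    top = candidate_regions(frame_scores, interval_lengths)
    best = max(classifier(frame_scores[s:e]) for _, s, e in top)
    return best >= threshold

hit = detect_event(
    [0.1, 0.9, 0.95, 0.2, 0.1],
    interval_lengths=[2, 3],
    classifier=lambda region: sum(region) / len(region),  # stand-in scorer
)
```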
-
Patent number: 10769500
Abstract: A system and method for active learning in which a sensor obtains data from a scene, including a set of images containing objects. A memory stores active-learning data, including an object detector trained to detect objects in images. A processor in communication with the memory detects a semantic class and a location of at least one object in an image selected from the set of images, using the object detector to produce a detection metric that combines the detector's uncertainty about the object's semantic class (classification) with its uncertainty about the object's location (localization). An output interface or display device, in communication with the processor, displays the image for human labeling when the detection metric is above a threshold.
Type: Grant
Filed: August 31, 2017
Date of Patent: September 8, 2020
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Teng-Yok Lee, Chieh-Chi Kao, Pradeep Sen, Ming-Yu Liu
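The detection metric described here combines two uncertainties. A minimal sketch, assuming entropy for the classification term and the coordinate variance of perturbed box predictions for the localization term (both are illustrative choices, not necessarily the patent's formulation):

```python
import math

def classification_uncertainty(probs):
    # Entropy of the detector's class posterior for the object.
    return -sum(p * math.log(p) for p in probs if p > 0)

def localization_uncertainty(box_samples):
    # Spread of the predicted box coordinates (x1, y1, x2, y2) across
    # several perturbed predictions: total per-coordinate variance.
    n = len(box_samples)
    means = [sum(b[d] for b in box_samples) / n for d in range(4)]
    return sum(sum((b[d] - means[d]) ** 2 for b in box_samples) / n
               for d in range(4))

def needs_human_label(probs, box_samples, threshold=1.0):
    # The image is routed to a human annotator when the combined
    # detection metric is above the threshold.
    metric = (classification_uncertainty(probs)
              + localization_uncertainty(box_samples))
    return metric >= threshold

flag = needs_human_label([0.5, 0.5], [[0, 0, 10, 10], [2, 2, 12, 12]])
```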
-
Patent number: 10417524
Abstract: An image processing system includes a memory to store a classifier and a set of labeled images for training the classifier, wherein each labeled image is labeled as either a positive image that includes an object of a specific type or a negative image that does not include the object of the specific type, wherein the set of labeled images has a first ratio of the positive images to the negative images. The system includes an input interface to receive a set of input images, a processor to determine a second ratio of the positive images, to classify the input images into positive and negative images to produce a set of classified images, and to select a subset of the classified images having the second ratio of the positive images to the negative images, and an output interface to render the subset of the input images for labeling.
Type: Grant
Filed: July 11, 2017
Date of Patent: September 17, 2019
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Chen Feng, Ming-Yu Liu, Chieh-Chi Kao, Teng-Yok Lee
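The ratio-controlled subset selection can be sketched directly. Given images already classified as positive or negative, pick counts of each so that the subset presented for labeling matches a target positive-to-negative ratio. The function name, inputs, and rounding rule below are assumptions made for this illustration.

```python
def select_balanced_subset(classified, target_ratio, total):
    # classified: list of (image_id, is_positive) predictions.
    # Choose n_pos positives and (total - n_pos) negatives so the
    # labeling subset has roughly target_ratio positives per negative.
    n_pos = round(total * target_ratio / (1 + target_ratio))
    positives = [img for img, is_pos in classified if is_pos]
    negatives = [img for img, is_pos in classified if not is_pos]
    return positives[:n_pos] + negatives[:total - n_pos]

subset = select_balanced_subset(
    [("a", True), ("b", True), ("c", False), ("d", False), ("e", False)],
    target_ratio=1.0,  # one positive per negative, an illustrative target
    total=4,
)
# subset -> ["a", "b", "c", "d"]
```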
-
Patent number: 10418957
Abstract: An audio event detection system that subsamples input audio data using a series of recurrent neural networks to create data of a coarser time scale than the audio data. Data frames corresponding to the coarser time scale may then be upsampled to data frames that match the finer time scale of the original audio data frames. The resulting data frames are then scored with a classifier to determine a likelihood that the individual frames correspond to an audio event. Each frame is then weighted by its score and a composite weighted frame is created by summing the weighted frames and dividing by the cumulative score. The composite weighted frame is then scored by the classifier. The resulting score is taken as an overall score indicating a likelihood that the input audio data includes an audio event.
Type: Grant
Filed: June 29, 2018
Date of Patent: September 17, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Weiran Wang, Chao Wang, Chieh-Chi Kao
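The composite-frame step is spelled out precisely in the abstract: weight each frame by its score, sum the weighted frames, divide by the cumulative score, and re-score the result. A minimal sketch of exactly that arithmetic, with a toy mean-value classifier standing in for the real one (the RNN subsample/upsample stages are omitted):

```python
def composite_score(frames, classifier):
    # Score each frame, weight it by its score, and form the composite
    # frame as the weighted sum divided by the cumulative score, as the
    # abstract describes. The composite is then re-scored and that score
    # is the overall event likelihood.
    scores = [classifier(f) for f in frames]
    total = sum(scores)
    dim = len(frames[0])
    composite = [sum(s * f[d] for s, f in zip(scores, frames)) / total
                 for d in range(dim)]
    return classifier(composite)

# Toy classifier: a frame's score is the mean of its feature values.
overall = composite_score([[0.2, 0.4], [0.8, 0.6]],
                          lambda f: sum(f) / len(f))
```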
-
Publication number: 20190065908
Abstract: A system and method for active learning in which a sensor obtains data from a scene, including a set of images containing objects. A memory stores active-learning data, including an object detector trained to detect objects in images. A processor in communication with the memory detects a semantic class and a location of at least one object in an image selected from the set of images, using the object detector to produce a detection metric that combines the detector's uncertainty about the object's semantic class (classification) with its uncertainty about the object's location (localization). An output interface or display device, in communication with the processor, displays the image for human labeling when the detection metric is above a threshold.
Type: Application
Filed: August 31, 2017
Publication date: February 28, 2019
Inventors: Teng-Yok Lee, Chieh-Chi Kao, Pradeep Sen, Ming-Yu Liu
-
Publication number: 20180232601
Abstract: An image processing system includes a memory to store a classifier and a set of labeled images for training the classifier, wherein each labeled image is labeled as either a positive image that includes an object of a specific type or a negative image that does not include the object of the specific type, wherein the set of labeled images has a first ratio of the positive images to the negative images. The system includes an input interface to receive a set of input images, a processor to determine a second ratio of the positive images, to classify the input images into positive and negative images to produce a set of classified images, and to select a subset of the classified images having the second ratio of the positive images to the negative images, and an output interface to render the subset of the input images for labeling.
Type: Application
Filed: July 11, 2017
Publication date: August 16, 2018
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Chen Feng, Ming-Yu Liu, Chieh-Chi Kao, Teng-Yok Lee
-
Publication number: 20180144241
Abstract: A method for training a neuron network using a processor in communication with a memory includes determining features of a signal using the neuron network, determining an uncertainty measure of the features for classifying the signal, reconstructing the signal from the features using a decoder neuron network to produce a reconstructed signal, comparing the reconstructed signal with the signal to produce a reconstruction error, combining the uncertainty measure with the reconstruction error to produce a rank of the signal for a necessity of a manual labeling, labeling the signal according to the rank to produce the labeled signal, and training the neuron network and the decoder neuron network using the labeled signal.
Type: Application
Filed: November 22, 2016
Publication date: May 24, 2018
Inventors: Ming-Yu Liu, Chieh-Chi Kao
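The ranking rule here combines classification uncertainty with autoencoder reconstruction error. A minimal sketch under illustrative assumptions: uncertainty is taken as one minus the maximum class probability, reconstruction error as squared distance, and the encoder/decoder are toy stand-ins rather than trained networks.

```python
def rank_for_labeling(signals, encoder, decoder):
    # Rank unlabeled signals by (classification uncertainty +
    # reconstruction error); the highest-ranked signals are the ones
    # most in need of manual labeling.
    ranked = []
    for sig in signals:
        features, probs = encoder(sig)
        uncertainty = 1.0 - max(probs)           # 1 - max posterior
        recon = decoder(features)
        error = sum((a - b) ** 2 for a, b in zip(recon, sig))
        ranked.append((uncertainty + error, sig))
    return [sig for _, sig in sorted(ranked, reverse=True)]

def toy_encoder(sig):
    # Features = the signal itself; the posterior is peaked when the
    # signal's values agree and flat when they spread apart.
    spread = max(sig) - min(sig)
    return sig, [1.0 - spread, spread]

order = rank_for_labeling(
    [[0.5, 0.5], [0.1, 0.9]],
    toy_encoder,
    lambda feats: feats,  # identity "decoder": zero reconstruction error
)
```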
-
Patent number: 9329677
Abstract: A social system and process for bringing a virtual social network into real life. The system gathers and analyzes social messages of at least one interlocutor from a virtual social network to generate at least one recommended topic, allowing a user to converse with the interlocutor using the recommended topic. It then captures and analyzes the interlocutor's speech, behavior, and/or physiological responses during the conversation to estimate the interlocutor's emotional state, from which the user can judge whether the interlocutor is interested in the recommended topic. In this way, social messages from the virtual network are brought into real life, increasing conversation topics between people.
Type: Grant
Filed: March 15, 2012
Date of Patent: May 3, 2016
Assignee: National Taiwan University
Inventors: Shao-Yi Chien, Jui-Hsin Lai, Jhe-Yi Lin, Min-Yian Su, Po-Chen Wu, Chieh-Chi Kao
-
Publication number: 20130169680
Abstract: A social system and process for bringing a virtual social network into real life. The system gathers and analyzes social messages of at least one interlocutor from a virtual social network to generate at least one recommended topic, allowing a user to converse with the interlocutor using the recommended topic. It then captures and analyzes the interlocutor's speech, behavior, and/or physiological responses during the conversation to estimate the interlocutor's emotional state, from which the user can judge whether the interlocutor is interested in the recommended topic. In this way, social messages from the virtual network are brought into real life, increasing conversation topics between people.
Type: Application
Filed: March 15, 2012
Publication date: July 4, 2013
Applicant: National Taiwan University
Inventors: Shao-Yi Chien, Jui-Hsin Lai, Jhe-Yi Lin, Min-Yian Su, Po-Chen Wu, Chieh-Chi Kao