Patents by Inventor CHIEH-CHI KAO

CHIEH-CHI KAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240071408
    Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
    Type: Application
    Filed: September 8, 2023
    Publication date: February 29, 2024
    Inventors: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
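The dual-detector arrangement the abstract describes can be sketched as follows. This is an illustrative sketch only, not the patented implementation: two AED components consume the same acoustic feature vector, one covering a fixed label set and one covering user-enrolled custom events, and their per-event scores are merged before thresholding. The function names, stub detectors, and threshold value are all invented for the example.

```python
from typing import Callable, Dict, List

def detect_events(
    features: List[float],
    predetermined_aed: Callable[[List[float]], Dict[str, float]],
    custom_aed: Callable[[List[float]], Dict[str, float]],
    threshold: float = 0.5,
) -> List[str]:
    """Run both AED components on the same feature data; report events over threshold."""
    scores: Dict[str, float] = {}
    scores.update(predetermined_aed(features))  # predetermined set, e.g. glass break
    scores.update(custom_aed(features))         # user-configured custom events
    return sorted(event for event, score in scores.items() if score >= threshold)
```

The key point the sketch captures is that neither component re-extracts features: both receive the same acoustic feature data and perform their own task-specific scoring.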
  • Patent number: 11790932
    Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
    Type: Grant
    Filed: December 10, 2021
    Date of Patent: October 17, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
  • Publication number: 20230186939
    Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
    Type: Application
    Filed: December 10, 2021
    Publication date: June 15, 2023
    Inventors: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
  • Patent number: 11302329
    Abstract: A system may include an acoustic event detection component for detecting acoustic events, which may be non-speech sounds. Upon detection of a command to detect a new sound, a device may prompt a user to cause occurrence of the sound one or more times. The acoustic event detection component may then be reconfigured, using audio data corresponding to the occurrences, to detect future occurrences of the event.
    Type: Grant
    Filed: June 29, 2020
    Date of Patent: April 12, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Spyridon Matsoukas, Venkata Naga Krishna Chaitanya Puvvada, Chao Wang, Chieh-Chi Kao
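The enrollment flow in this abstract can be sketched in a few lines. This is a hypothetical illustration, not the patented method: the user causes the new sound to occur a few times, the captured examples are embedded and averaged into a prototype, and future audio is matched against that prototype by cosine similarity. The embedding representation and the similarity threshold are assumptions made for the example.

```python
import math
from typing import List

def enroll(example_embeddings: List[List[float]]) -> List[float]:
    """Average the user-provided example embeddings into a single prototype."""
    dim = len(example_embeddings[0])
    n = len(example_embeddings)
    return [sum(e[i] for e in example_embeddings) / n for i in range(dim)]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def matches(prototype: List[float], embedding: List[float], threshold: float = 0.8) -> bool:
    """Flag a future occurrence of the enrolled sound when similarity is high."""
    return cosine(prototype, embedding) >= threshold
```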
  • Patent number: 11069352
    Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.
    Type: Grant
    Filed: February 18, 2019
    Date of Patent: July 20, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Qingming Tang, Ming Sun, Chieh-Chi Kao, Chao Wang, Viktor Rozgic
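The two-level idea in the abstract, a local per-segment model plus a second stage that folds in historical context, can be sketched as below. The rolling window and the simple moving-average "context model" are stand-ins chosen for illustration; the patent's actual hierarchical models are not reproduced here.

```python
from collections import deque
from typing import Callable

class MediaPresenceDetector:
    """Sketch: local segment scores smoothed with recent history."""

    def __init__(self, segment_model: Callable[[float], float],
                 history: int = 5, threshold: float = 0.5):
        self.segment_model = segment_model    # locally operating per-segment scorer
        self.scores = deque(maxlen=history)   # rolling historical context
        self.threshold = threshold

    def update(self, segment: float) -> bool:
        """Score one audio segment, combine with history, return the decision."""
        self.scores.append(self.segment_model(segment))
        contextual_score = sum(self.scores) / len(self.scores)
        return contextual_score >= self.threshold
```

The design point is that a single noisy segment cannot flip the decision by itself; the contextual stage supplies the historical information the abstract refers to.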
  • Patent number: 10803885
    Abstract: An audio event detection system that processes audio data into audio feature data and processes the audio feature data using pre-configured candidate interval lengths to identify top candidate regions of the feature data that may include an audio event. The feature data from the top candidate regions are then scored by a classifier, where the score indicates a likelihood that the candidate region corresponds to a desired audio event. The scores are compared to a threshold, and if the threshold is satisfied, the top scoring candidate region is determined to include an audio event.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: October 13, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Chieh-Chi Kao, Chao Wang, Weiran Wang, Ming Sun
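The proposal-then-score pipeline in the abstract can be sketched as follows. This is an illustrative sketch, not the patented system: candidate regions of several pre-configured interval lengths are enumerated over the feature sequence, each is scored by a classifier, and the top-scoring region is kept only if it clears the threshold. The exhaustive sliding enumeration is an assumption for the example.

```python
from typing import Callable, List, Optional, Tuple

def top_candidate(
    features: List[float],
    interval_lengths: List[int],
    classifier: Callable[[List[float]], float],
    threshold: float,
) -> Optional[Tuple[int, int, float]]:
    """Return (start, length, score) of the best region if it clears threshold, else None."""
    candidates = []
    for length in interval_lengths:               # pre-configured candidate lengths
        for start in range(0, len(features) - length + 1):
            region = features[start:start + length]
            candidates.append((start, length, classifier(region)))
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c[2])    # top-scoring candidate region
    return best if best[2] >= threshold else None
```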
  • Patent number: 10769500
    Abstract: An active learning system in which a sensor obtains data from a scene, including a set of images containing objects. A memory stores active learning data, including an object detector trained to detect objects in images. A processor in communication with the memory is configured to detect the semantic class and location of at least one object in an image selected from the set of images, using the object detector to produce a detection metric as a combination of the detector's uncertainty about the object's semantic class (classification) and its uncertainty about the object's location in the image (localization). An output interface or display device in communication with the processor displays the image for human labeling when the detection metric is above a threshold.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: September 8, 2020
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Teng-Yok Lee, Chieh-Chi Kao, Pradeep Sen, Ming-Yu Liu
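The selection rule in the abstract, combining classification and localization uncertainty into one detection metric, can be sketched as below. Using Shannon entropy for the classification part and box-coordinate variance for the localization part is an assumption made for illustration; the patent's exact combination is not reproduced here.

```python
import math
from typing import List

def classification_uncertainty(class_probs: List[float]) -> float:
    """Shannon entropy of the detector's predicted class distribution."""
    return -sum(p * math.log(p) for p in class_probs if p > 0)

def localization_uncertainty(box_samples: List[List[float]]) -> float:
    """Mean per-coordinate variance across sampled box predictions."""
    n, dims = len(box_samples), len(box_samples[0])
    total_var = 0.0
    for d in range(dims):
        vals = [b[d] for b in box_samples]
        mean = sum(vals) / n
        total_var += sum((v - mean) ** 2 for v in vals) / n
    return total_var / dims

def needs_human_label(class_probs: List[float],
                      box_samples: List[List[float]],
                      threshold: float) -> bool:
    """Send the image for human labeling when the combined metric is high."""
    metric = classification_uncertainty(class_probs) + localization_uncertainty(box_samples)
    return metric > threshold
```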
  • Patent number: 10417524
    Abstract: An image processing system includes a memory to store a classifier and a set of labeled images for training the classifier, wherein each labeled image is labeled as either a positive image that includes an object of a specific type or a negative image that does not include the object of the specific type, wherein the set of labeled images has a first ratio of the positive images to the negative images. The system includes an input interface to receive a set of input images, a processor to determine a second ratio of the positive images, to classify the input images into positive and negative images to produce a set of classified images, and to select a subset of the classified images having the second ratio of the positive images to the negative images, and an output interface to render the subset of the input images for labeling.
    Type: Grant
    Filed: July 11, 2017
    Date of Patent: September 17, 2019
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Chen Feng, Ming-Yu Liu, Chieh-Chi Kao, Teng-Yok Lee
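The selection step this abstract describes can be sketched in one short function. After the classifier splits the input images into positives and negatives, a subset is drawn so its positive-to-negative ratio matches a target (the "second ratio"). The greedy head-of-list draw below is an illustrative choice, not the patented procedure.

```python
from typing import List, Tuple, Any

def select_subset(classified: List[Tuple[Any, bool]],
                  positive_ratio: float,
                  subset_size: int) -> List[Any]:
    """classified: (item, is_positive) pairs. Return a subset matching the ratio."""
    positives = [item for item, is_pos in classified if is_pos]
    negatives = [item for item, is_pos in classified if not is_pos]
    n_pos = min(len(positives), round(subset_size * positive_ratio))
    n_neg = min(len(negatives), subset_size - n_pos)
    return positives[:n_pos] + negatives[:n_neg]   # rendered for labeling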
  • Patent number: 10418957
    Abstract: An audio event detection system that subsamples input audio data using a series of recurrent neural networks to create data of a coarser time scale than the audio data. Data frames corresponding to the coarser time scale may then be upsampled to data frames that match the finer time scale of the original audio data frames. The resulting data frames are then scored with a classifier to determine a likelihood that the individual frames correspond to an audio event. Each frame is then weighted by its score and a composite weighted frame is created by summing the weighted frames and dividing by the cumulative score. The composite weighted frame is then scored by the classifier. The resulting score is taken as an overall score indicating a likelihood that the input audio data includes an audio event.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: September 17, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Weiran Wang, Chao Wang, Chieh-Chi Kao
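The scheme in this abstract can be sketched in simplified form. The recurrent neural networks are replaced here by plain average-pooling for the subsampling step, and frames are scalars for brevity; the score-weighted composite frame and the final classifier pass follow the text. All function names are invented for illustration.

```python
from typing import Callable, List

def subsample(frames: List[float], factor: int) -> List[float]:
    """Coarsen the time scale by averaging groups of frames (RNN stand-in)."""
    return [sum(frames[i:i + factor]) / factor
            for i in range(0, len(frames) - factor + 1, factor)]

def upsample(coarse: List[float], factor: int) -> List[float]:
    """Repeat each coarse frame to restore the original, finer time scale."""
    return [c for c in coarse for _ in range(factor)]

def composite_score(frames: List[float],
                    classifier: Callable[[float], float]) -> float:
    """Weight frames by score, sum, normalize by cumulative score, rescore."""
    scores = [classifier(f) for f in frames]
    total = sum(scores)
    if total == 0:
        return 0.0
    weighted_frame = sum(s * f for s, f in zip(scores, frames)) / total
    return classifier(weighted_frame)   # overall score for the input audio
```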
  • Publication number: 20190065908
    Abstract: An active learning system in which a sensor obtains data from a scene, including a set of images containing objects. A memory stores active learning data, including an object detector trained to detect objects in images. A processor in communication with the memory is configured to detect the semantic class and location of at least one object in an image selected from the set of images, using the object detector to produce a detection metric as a combination of the detector's uncertainty about the object's semantic class (classification) and its uncertainty about the object's location in the image (localization). An output interface or display device in communication with the processor displays the image for human labeling when the detection metric is above a threshold.
    Type: Application
    Filed: August 31, 2017
    Publication date: February 28, 2019
    Inventors: Teng-Yok Lee, Chieh-Chi Kao, Pradeep Sen, Ming-Yu Liu
  • Publication number: 20180232601
    Abstract: An image processing system includes a memory to store a classifier and a set of labeled images for training the classifier, wherein each labeled image is labeled as either a positive image that includes an object of a specific type or a negative image that does not include the object of the specific type, wherein the set of labeled images has a first ratio of the positive images to the negative images. The system includes an input interface to receive a set of input images, a processor to determine a second ratio of the positive images, to classify the input images into positive and negative images to produce a set of classified images, and to select a subset of the classified images having the second ratio of the positive images to the negative images, and an output interface to render the subset of the input images for labeling.
    Type: Application
    Filed: July 11, 2017
    Publication date: August 16, 2018
    Applicant: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Chen Feng, Ming-Yu Liu, Chieh-Chi Kao, Teng-Yok Lee
  • Publication number: 20180144241
    Abstract: A method for training a neural network using a processor in communication with a memory includes determining features of a signal using the neural network, determining an uncertainty measure of the features for classifying the signal, reconstructing the signal from the features using a decoder neural network to produce a reconstructed signal, comparing the reconstructed signal with the signal to produce a reconstruction error, combining the uncertainty measure with the reconstruction error to produce a rank of the signal's necessity for manual labeling, labeling the signal according to the rank to produce a labeled signal, and training the neural network and the decoder neural network using the labeled signal.
    Type: Application
    Filed: November 22, 2016
    Publication date: May 24, 2018
    Inventors: Ming-Yu Liu, Chieh-Chi Kao
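The ranking rule this abstract describes can be sketched as a single combined score. Entropy for the uncertainty measure and squared error for the reconstruction error are stand-ins chosen for illustration; the patent's actual measures are not reproduced here.

```python
import math
from typing import List

def labeling_rank(class_probs: List[float],
                  signal: List[float],
                  reconstructed: List[float]) -> float:
    """Higher rank => more worth sending to a human for manual labeling."""
    # Uncertainty of the features for classifying the signal (entropy stand-in).
    uncertainty = -sum(p * math.log(p) for p in class_probs if p > 0)
    # Reconstruction error between the signal and the decoder's output.
    recon_error = sum((a - b) ** 2 for a, b in zip(signal, reconstructed))
    return uncertainty + recon_error
```

Samples the network is both unsure about and unable to reconstruct rank highest, which is the intuition behind combining the two terms.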
  • Patent number: 9329677
    Abstract: The present invention relates to a social system and process for bringing a virtual social network into real life. The system gathers and analyzes social messages of at least one interlocutor from a virtual social network to generate at least one recommended topic, allows a user to converse with the interlocutor using the recommended topic, and then captures and analyzes the interlocutor's speech, behavior, and/or physiological responses during the conversation to generate an emotional state of the interlocutor. From that emotional state, the user can determine whether the interlocutor is interested in the recommended topic. Social messages from the virtual network are thereby brought into real life, increasing conversation topics between people.
    Type: Grant
    Filed: March 15, 2012
    Date of Patent: May 3, 2016
    Assignee: National Taiwan University
    Inventors: Shao-Yi Chien, Jui-Hsin Lai, Jhe-Yi Lin, Min-Yian Su, Po-Chen Wu, Chieh-Chi Kao
  • Publication number: 20130169680
    Abstract: The present invention relates to a social system and process for bringing a virtual social network into real life. The system gathers and analyzes social messages of at least one interlocutor from a virtual social network to generate at least one recommended topic, allows a user to converse with the interlocutor using the recommended topic, and then captures and analyzes the interlocutor's speech, behavior, and/or physiological responses during the conversation to generate an emotional state of the interlocutor. From that emotional state, the user can determine whether the interlocutor is interested in the recommended topic. Social messages from the virtual network are thereby brought into real life, increasing conversation topics between people.
    Type: Application
    Filed: March 15, 2012
    Publication date: July 4, 2013
    Applicant: National Taiwan University
    Inventors: Shao-Yi Chien, Jui-Hsin Lai, Jhe-Yi Lin, Min-Yian Su, Po-Chen Wu, Chieh-Chi Kao