Patents Examined by Fan S. Tsang
-
Patent number: 12142257
Abstract: Systems and methods are provided for providing emotion-based text to speech. The systems and methods perform operations comprising accessing a text string; storing a plurality of embeddings associated with a plurality of speakers, a first embedding for a first speaker being associated with a first emotion and a second embedding for a second speaker of the plurality of speakers being associated with a second emotion; selecting the first speaker to speak one or more words of the text string; determining that the one or more words are associated with the second emotion; generating, based on the first embedding and the second embedding, a third embedding for the first speaker associated with the second emotion; and applying the third embedding and the text string to a vocoder to generate an audio stream comprising the one or more words being spoken by the first speaker with the second emotion.
Type: Grant
Filed: February 8, 2022
Date of Patent: November 12, 2024
Assignee: SNAP INC.
Inventors: Liron Harazi, Jacob Assa, Alan Bekker
-
Patent number: 12141501
Abstract: An example implementation involves a computing device transmitting, via a local area network, a command that instructs a playback device to play a particular audio signal. The example implementation also involves the computing device receiving data indicating a detected audio signal corresponding to playback of the particular audio signal by the playback device, where the detected audio signal includes a portion of the particular audio signal. The implementation further involves the computing device obtaining data indicating a predetermined audio characteristic and determining an audio processing algorithm based on the detected audio signal and the predetermined audio characteristic. The example implementation involves causing the playback device to apply the determined audio processing algorithm when playing audio via at least one speaker.
Type: Grant
Filed: April 7, 2023
Date of Patent: November 12, 2024
Assignee: Sonos, Inc.
Inventor: Timothy W. Sheen
-
Patent number: 12141528
Abstract: A text mining system providing NLP and NLU capabilities is operable to perform, at a first processing layer, a first operation on input data to produce metadata about the input data. At a second processing layer, a rules module applies a composite AI extraction rule to further process the input data. The composite AI extraction rule has a rule condition that leverages the metadata from the first operation and a rule action that involves a second operation. Other composite AI extraction rules involving multiple text mining operations may also be applied. For instance, a rule may specify using the tonality of a document from a sentiment analysis to classify the document according to a relevant taxonomy. Another rule may specify classifying documents of a particular type under a specific category. In this way, new/enhanced information about the input data can be deduced, validated, and/or enriched.
Type: Grant
Filed: October 22, 2021
Date of Patent: November 12, 2024
Assignee: OPEN TEXT CORPORATION
Inventors: Paul O'Hagan, Isidre Royo Bonnin, Robert Kapitan, Ravinder Reddy Yeddla, Renaud Levert
-
Patent number: 12131096
Abstract: Example techniques related to a sub-index of a media index. An example implementation may involve maintaining, on a mobile device, a first index of audio tracks associated with a particular user profile, the audio tracks indexed in the first index consisting of a particular subset of audio tracks that are indexed in a second index. Based on receiving the input data indicating the search query, the mobile device searches, within the first index, for audio tracks corresponding to the search query. If the audio tracks corresponding to the search query are not found in the first index, the mobile device sends, to one or more servers of the cloud service, a request to search the second index for audio tracks corresponding to the search query.
Type: Grant
Filed: September 25, 2023
Date of Patent: October 29, 2024
Assignee: Sonos, Inc.
Inventors: Amber Brown, Diane Roberts
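Illustrative note: the local-first lookup with a cloud fallback described in this abstract might be sketched as follows. This is only a minimal sketch under stated assumptions, not the patented implementation: the function name `search_tracks`, the substring matching rule, and the `search_cloud` callback are all hypothetical and do not come from the abstract.

```python
from typing import Callable, List, Tuple


def search_tracks(
    query: str,
    local_index: List[str],
    search_cloud: Callable[[str], List[str]],
) -> Tuple[List[str], str]:
    """Search the on-device sub-index first; ask the cloud only on a miss.

    `local_index` stands in for the first (per-profile) index and
    `search_cloud` stands in for a request against the second index.
    Both are hypothetical placeholders.
    """
    needle = query.lower()
    # Case-insensitive substring match is an assumed matching rule.
    hits = [title for title in local_index if needle in title.lower()]
    if hits:
        return hits, "local"
    # Nothing found locally: fall back to the larger second index.
    return search_cloud(query), "cloud"
```

With this shape, a query that is satisfied by the on-device index never triggers a network request; only a local miss does.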
-
Patent number: 12125052
Abstract: A model as a service system in which call participants' emotional or mental state is recognized by the system based on aspects of voice-based audio data collected from the call participants' communication devices, in particular the state of a company's customers. The system provides an algorithm for use by a company employee during an ongoing communication with a customer of the company. The voice-based data is recorded from the participants' communication devices, and sent to a server that analyzes the recordings for characteristics that reflect the current emotional or mental state of call participants, particularly the state of the customer. The characteristics are used to generate an algorithm that provides to a company participant suggestions for modifying aspects of their voice communication with the customer in real- or near real-time.
Type: Grant
Filed: December 2, 2021
Date of Patent: October 22, 2024
Assignee: Cogito Corporation
Inventor: Skyler Place
-
Patent number: 12119003
Abstract: A computing device of a communication network may generate a first connection context that can manage a first voice connection between a first device and a second device. A signal may be received over the first connection context indicative of a request by the first device. In response, the computing device may connect the first device to a second connection context that can manage a second voice connection between the first device and an automated service such as a communication bot. The first device can transmit natural language requests to the automated service. The automated service facilitates execution of the natural language requests by the computing device.
Type: Grant
Filed: December 21, 2023
Date of Patent: October 15, 2024
Assignee: LIVEPERSON, INC.
Inventor: Thorsten Ohrstrom Sandgren
-
Patent number: 12113780
Abstract: An innovative system for transmitting encrypted 1-bit audio over an Ethernet network comprises using an omni-directional micro-electrical-mechanical system acoustic sensor element to provide an analog input signal to a sigma-delta modulator that then creates a pulse density modulated 1-bit data stream, at an audio oversampling rate, to a first input of a first exclusive-or (XOR) logic gate. The second input of the XOR logic gate is simultaneously presented with a first pseudo-random 1-bit data stream, at the same audio oversampling rate, thereby resulting in an encrypted pulse density modulated (PDM) 1-bit data stream at the output of the XOR logic gate. The encrypted PDM 1-bit data stream is clocked into a first-in first-out (FIFO) memory at the audio oversampling rate and is clocked out of the first FIFO memory as Ethernet PDM frame data packages at a predetermined Ethernet PHY transfer rate.
Type: Grant
Filed: September 16, 2022
Date of Patent: October 8, 2024
Assignee: Crestron Electronics, Inc.
Inventor: Philip L. Kirkpatrick
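Illustrative note: the XOR step in this abstract can be shown in software, although the patent describes hardware logic. The sketch below is an assumption-laden illustration only: `keystream` and `xor_bits` are hypothetical names, and Python's `random.Random` is used purely as a stand-in for the unspecified pseudo-random 1-bit source (it is not cryptographically secure). The FIFO and Ethernet framing stages are not modeled.

```python
import random
from typing import List


def keystream(seed: int, n: int) -> List[int]:
    """Return n pseudo-random bits from a seeded generator.

    Stand-in for the pseudo-random 1-bit stream; the real generator
    is not specified in the abstract.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(n)]


def xor_bits(bits: List[int], key: List[int]) -> List[int]:
    """XOR two equal-length bit sequences sample by sample."""
    if len(bits) != len(key):
        raise ValueError("bit stream and keystream must have equal length")
    return [b ^ k for b, k in zip(bits, key)]


# A short made-up PDM sample sequence (one bit per oversampled tick).
pdm = [1, 0, 1, 1, 0, 0, 1, 0]
key = keystream(seed=1234, n=len(pdm))

encrypted = xor_bits(pdm, key)
# XOR is its own inverse: applying the same keystream recovers the input.
decrypted = xor_bits(encrypted, key)
```

Because XOR is self-inverse, a receiver that regenerates the same keystream from the same seed can recover the original PDM bits with the same operation.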
-
Patent number: 12108228
Abstract: A voice processing system includes: a plurality of microphone-speaker devices; a voice acquirer that acquires audio data from each of the microphone-speaker devices; a voice transmitter that transmits the audio data acquired by the voice acquirer to other microphone-speaker devices; a determination processor that determines whether or not a predetermined condition is met with respect to a factor that affects progress of a conference; a notification processor that causes, when the predetermined condition is met, a microphone-speaker device selected from among the plurality of microphone-speaker devices depending on the factor that affects the progress of the conference to provide specific information related to the predetermined condition.
Type: Grant
Filed: February 15, 2022
Date of Patent: October 1, 2024
Assignee: SHARP KABUSHIKI KAISHA
Inventors: Tatsuya Nishio, Maaki Shozu
-
Patent number: 12094464
Abstract: An utterance analysis device including: a storage that stores a plurality of pieces of related information each relating to one of a plurality of categories; a control circuit that receives utterance data of an utterer in order of time series, and analyzes content of the utterance data by using a plurality of first likelihoods, which are each values for identifying a possibility that the acquired utterance data corresponds to each category; and a display processor that displays, under control of the control circuit, display data including link information indicating an association for displaying related information relating to the category of the utterance data from the storage.
Type: Grant
Filed: December 17, 2021
Date of Patent: September 17, 2024
Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
Inventor: Natsuki Saeki
-
Patent number: 12093602
Abstract: Example techniques relate to playback queue subscriptions. An example implementation involves a computing system receiving, from a first computing device associated with a first user account, an instruction to enable subscription to a first playback queue associated with a first media playback system. In response to the instruction, the computing system enables second user accounts to subscribe to the first playback queue. The second user accounts are registered with respective second media playback systems in respective second households. The computing system receives, from a particular second media playback system, a request to subscribe to the first playback queue; and in response, (i) sends one or more messages that update a control interface of the first control device to display a subscriber indication and (ii) sends one or more messages that populate a second playback queue of the particular second media playback system with audio tracks of the first playback queue.
Type: Grant
Filed: November 20, 2023
Date of Patent: September 17, 2024
Assignee: Sonos, Inc.
Inventors: Chris Bierbower, Philippe Vossel
-
Patent number: 12087284
Abstract: An appliance can include a microphone transducer, a processor, and a memory storing instructions. The appliance is configured to receive an audio signal at the microphone transducer and to detect an utterance in the audio signal. The appliance is further configured to classify a speech mode based on the utterance. The appliance is further configured to determine conditions of an environment of the appliance. The appliance is further configured to select at least one of a playback volume or a speech output mode from a plurality of speech output modes based on the classification, and the conditions of the environment of the appliance. The appliance is further configured to adapt the playback volume and/or mode of played-back speech according to the speech output mode. The appliance may be configured to synthesize speech according to the speech output mode, or to modify synthesized speech according to the speech output mode.
Type: Grant
Filed: September 28, 2022
Date of Patent: September 10, 2024
Assignee: Apple Inc.
Inventors: Narimene Lezzoum, Sylvain J. Choisel, Richard Powell, Ashrith Deshpande, Ameya Joshi
-
Patent number: 12086503
Abstract: Media content episodes are received. Using machine learning, one or more media segments of interest are identified in each of the media content episodes based at least in part on an analysis of content included in a corresponding audio content episode. Each of the identified media segments is associated with one or more automatically determined tags. Using machine learning, a recommended media segment is selected for a specific user from the identified media segments based at least in part on attributes of the specific user and the automatically determined tags of the identified media segments. The recommended media segment is automatically provided in a media segment feed.
Type: Grant
Filed: February 1, 2023
Date of Patent: September 10, 2024
Assignee: Spotify AB
Inventors: Doug Imbruce, Diego Fernando Lorenzo Casabuena Gonzalez, Oluseye Ojumu
-
Patent number: 12087323
Abstract: This application relates to a device and a method for voice-based trauma screening using deep learning. The device and method for voice-based trauma screening using deep learning screen for trauma through voices that may be obtained in a non-contact manner without limitations of space or situation. In one aspect, the device includes a memory configured to store at least one program and a processor configured to perform an operation by executing the at least one program. The processor can obtain voice data, pre-process the voice data, convert pre-processed voice data into image data, and input the image data to a deep learning model and obtain a trauma result value as an output value of the deep learning model.
Type: Grant
Filed: November 16, 2021
Date of Patent: September 10, 2024
Assignee: EMOCOG CO., LTD.
Inventors: Yoo Hun Noh, Eui Chul Lee, Na Hye Kim, So Eui Kim, Ji Won Mok, Su Gyeong Yu, Na Yeon Han
-
Patent number: 12073146
Abstract: Example techniques relate to prioritizing media content requests. An example implementation involves a computing system receiving an explicit request to play back a playlist on one or more playback devices of a media playback system. The computing system causes the playback devices to play back a given audio track of the playlist. While the playback devices are playing back first tracks of the playlist, the computing system receives one or more implicit requests for second audio tracks in the playlist. While the playback devices are playing back the second audio tracks of the playlist, the computing system receives an explicit request to play back audio content on a mobile device. The computing system determines that the request to play back the audio content on the mobile device is a higher priority than the requests for second audio tracks and switches playback from the playback devices to the mobile device.
Type: Grant
Filed: August 8, 2022
Date of Patent: August 27, 2024
Assignee: Sonos, Inc.
Inventor: Keith Corbin
-
Patent number: 12072924
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining breakpoints in a media item. Methods can include determining a candidate set of breakpoints within a media item. A machine learning model is used to generate a score for each particular candidate breakpoint in the set of candidate breakpoints based on presentation features of the media item. A subset of candidate breakpoints is selected from the set of candidate breakpoints based on the score. A final set of breakpoints is selected from the subset of candidate breakpoints based on a combination of the score for each particular candidate breakpoint and a location of the particular candidate breakpoint relative to a different candidate breakpoint. The final set of breakpoints is stored in a database and during playback of the media item, a digital component is presented when the media item reaches a stored breakpoint.
Type: Grant
Filed: December 29, 2022
Date of Patent: August 27, 2024
Assignee: Google LLC
Inventors: Wenbo Zhang, Son Khanh Pham, Karthik Prabhakar
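Illustrative note: one plausible reading of the two-stage selection in this abstract is a top-k cut by score followed by a greedy spacing filter. The sketch below assumes exactly that; the abstract does not say how score and location are combined, so `select_breakpoints`, `top_k`, and `min_gap` are hypothetical, and the scores are taken as given rather than produced by a model.

```python
from typing import List, Tuple

Breakpoint = Tuple[float, float]  # (time in seconds, model score)


def select_breakpoints(
    candidates: List[Breakpoint], top_k: int, min_gap: float
) -> List[Breakpoint]:
    """Pick a final set of breakpoints from scored candidates.

    Stage 1 keeps the `top_k` highest-scoring candidates (the subset).
    Stage 2 walks that subset from best to worst and keeps a candidate
    only if it lies at least `min_gap` seconds from every breakpoint
    already kept (an assumed way to combine score and location).
    """
    subset = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    final: List[Breakpoint] = []
    for time_s, score in subset:
        if all(abs(time_s - kept_time) >= min_gap for kept_time, _ in final):
            final.append((time_s, score))
    # Return in playback order so they can be stored and consulted in sequence.
    return sorted(final)
```

For example, with candidates at 10 s, 12 s, 60 s, and 100 s, `top_k=3` and `min_gap=30`, the 100 s candidate is dropped in stage 1 if it has the lowest score, and the 12 s candidate is dropped in stage 2 for being too close to a higher-scoring one at 10 s.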
-
Patent number: 12056458
Abstract: A computer device acquires a semantic association graph associated with n source statements belonging to different modals. The semantic association graph includes n semantic nodes of the different modals, a first connecting edge used for connecting the semantic nodes of a same modal and a second connecting edge used for connecting the semantic nodes of different modals. The computer device extracts a plurality of first word vectors from the semantic association graph. The device encodes the plurality of first word vectors to obtain n encoded feature vectors. The device also decodes the n encoded feature vectors to obtain a translated target statement.
Type: Grant
Filed: April 12, 2022
Date of Patent: August 6, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Fandong Meng, Yongjing Yin, Jinsong Su, Jie Zhou
-
Patent number: 12051443
Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for identifying that a first audio stream includes first, second, and third sources of audio. A computing system identifies that a second audio stream includes the first, second, and third sources of audio. The computing system determines that the first and second sources of audio are part of a first conversation. The computing system generates a third audio stream that combines the first source of audio from the first audio stream, the first source of audio from the second audio stream, the second source of audio from the first audio stream, and the second source of audio from the second audio stream, and diminishes the third source of audio from the first audio stream, and the third source of audio from the second audio stream.
Type: Grant
Filed: August 19, 2022
Date of Patent: July 30, 2024
Assignee: Google LLC
Inventors: Dimitri Kanevsky, Golan Pundak
-
Patent number: 12033628
Abstract: A wireless audio device is provided. The wireless audio device includes an audio receiving circuit, an audio output circuit, an acceleration sensor, a communication circuit, a processor, and a memory. The memory may store instructions that, when executed by the processor, cause the wireless audio device to detect an utterance of a user of the wireless audio device by using the acceleration sensor, enter a dialog mode in which at least some of ambient sounds received by the audio receiving circuit are output through the audio output circuit, in response to detecting the utterance of the user, and end the dialog mode if no voice is detected for a specified time or longer by using the audio receiving circuit in the dialog mode.
Type: Grant
Filed: December 8, 2021
Date of Patent: July 9, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hoseon Shin, Chulmin Lee, Youngwoo Lee
-
Patent number: 12026427
Abstract: Example techniques involve identification of device groups. In an example implementation, a mobile device displays, via a control application, a synchrony group control including controls to select playback devices for a synchrony group. The mobile device receives input data representing a command to create a new synchrony group, the input data including input data representing selection of two or more playback devices for a new synchrony group. In response, the mobile device forms the synchrony group by receiving input data indicating a particular group identification for the new synchrony group, determining that the particular group identification is unique among other synchrony groups, and sending data representing instructions to the playback devices to form the new synchrony group with the particular group identification. In response to forming the new synchrony group, the mobile device updates an interface for the media playback system to indicate the new synchrony group.
Type: Grant
Filed: September 12, 2022
Date of Patent: July 2, 2024
Assignee: Sonos, Inc.
Inventor: Arthur L. Coburn, IV
-
Patent number: 12020713
Abstract: A method for spatial audio signal encoding comprising: obtaining, for a first frame, a plurality of audio direction parameters, wherein each parameter comprises an elevation value and an azimuth value and wherein each parameter has an ordered position; determining whether, for a preceding frame, any of the plurality of audio direction parameters was differentially encoded based on a difference between the preceding frame parameter elevation value and a further preceding frame parameter elevation value and the preceding frame parameter azimuth value and a further preceding frame parameter azimuth value; generating, for any audio direction parameter which was not differentially encoded in the considered preceding frame, a differential parameter value based on a difference between the frame parameter elevation value and a preceding frame parameter elevation value and a difference between the frame parameter azimuth value and a preceding frame parameter azimuth value; generating for each of the plurality of audio
Type: Grant
Filed: July 27, 2020
Date of Patent: June 25, 2024
Assignee: Nokia Technologies Oy
Inventor: Adriana Vasilache