Sound Editing Patents (Class 704/278)
  • Patent number: 10929091
    Abstract: This disclosure concerns the playback of audio content, e.g. in the form of music. More particularly, the disclosure concerns the playback of streamed audio. In one example embodiment, there is a method of operating an electronic device for dynamically controlling a playlist including one or several audio items. A request to adjust an energy level (e.g. a tempo) associated with the playlist is received. In response to receiving this request, the playlist is adjusted in accordance with the requested energy level (e.g., the tempo).
    Type: Grant
    Filed: September 18, 2017
    Date of Patent: February 23, 2021
    Assignee: SPOTIFY AB
    Inventors: Souheil Medaghri Alaoui, Miles Lennon, Kieran Del Pasqua
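To make the abstract's "adjust the playlist to a requested energy level" idea concrete, here is a minimal Python sketch. The track data, the `bpm` field, and the filter-and-sort rule are all hypothetical illustrations of the general technique, not Spotify's claimed method.

```python
# Illustrative only: track data and selection rule are hypothetical.
def adjust_playlist(playlist, target_bpm, tolerance=15):
    """Keep tracks whose tempo is within `tolerance` BPM of the requested
    energy level, ordered by closeness to the target tempo."""
    kept = [t for t in playlist if abs(t["bpm"] - target_bpm) <= tolerance]
    return sorted(kept, key=lambda t: abs(t["bpm"] - target_bpm))

playlist = [
    {"title": "A", "bpm": 90},
    {"title": "B", "bpm": 128},
    {"title": "C", "bpm": 120},
    {"title": "D", "bpm": 60},
]
print([t["title"] for t in adjust_playlist(playlist, target_bpm=125)])  # ['B', 'C']
```

A real system would likely re-rank by a learned energy score rather than raw tempo distance, but the request-then-reorder shape is the same.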
  • Patent number: 10886028
    Abstract: Techniques for presenting alternative hypotheses for medical facts may include identifying, using at least one statistical fact extraction model, a plurality of alternative hypotheses for a medical fact to be extracted from a portion of text documenting a patient encounter. At least two of the alternative hypotheses may be selected, and the selected hypotheses may be presented to a user documenting the patient encounter.
    Type: Grant
    Filed: February 2, 2018
    Date of Patent: January 5, 2021
    Assignee: Nuance Communications, Inc.
    Inventor: Girija Yegnanarayanan
  • Patent number: 10854190
    Abstract: Various embodiments of the present disclosure evaluate transcription accuracy. In some implementations, the system normalizes a first transcription of an audio file and a baseline transcription of the audio file. The baseline transcription can be used as an accurate transcription of the audio file. The system can further determine an error rate of the first transcription by aligning each portion of the first transcription with the portion of the baseline transcription, and assigning a label to each portion based on a comparison of the portion of the first transcription with the portion of the baseline transcription.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: December 1, 2020
    Assignee: UNITED SERVICES AUTOMOBILE ASSOCIATION (USAA)
    Inventors: Michael J. Szentes, Carlos Chavez, Robert E. Lewis, Nicholas S. Walker
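The align-and-label evaluation the abstract describes is, in its simplest form, a word error rate computed by Levenshtein alignment against the baseline transcription. Below is a generic sketch of that standard technique, not USAA's specific implementation.

```python
def word_error_rate(hypothesis, baseline):
    """Word error rate: (substitutions + insertions + deletions) divided by
    the length of the baseline, via dynamic-programming edit distance."""
    hyp, ref = hypothesis.lower().split(), baseline.lower().split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # match or substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat", "the cat sat on the mat"))  # 0.5
```

Lowercasing both strings stands in for the "normalization" step the abstract mentions; production systems also strip punctuation and expand numerals.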
  • Patent number: 10845976
    Abstract: Approaches provide for navigating or otherwise interacting with content in response to input from a user, including voice inputs, device inputs, and gesture inputs, among others, such that a user can quickly and easily navigate to different levels of detail of content. This can include, for example, presenting content (e.g., images, multimedia, text, etc.) in a particular layout, and/or highlighting, emphasizing, animating, or otherwise altering the appearance and/or arrangement of the interface elements used to present the content based on a current level of detail, where the current level of detail can be determined by data selection criteria associated with a magnification level and other such data. As a user interacts with the computing device, for example, by providing a zoom input, values of the selection criteria can be updated, which can be used to filter and/or select content for presentation.
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: November 24, 2020
    Assignee: IMMERSIVE SYSTEMS INC.
    Inventors: Jason Simmons, Maksim Galkin
  • Patent number: 10798271
    Abstract: In various embodiments, a subtitle timing application detects timing errors between subtitles and shot changes. In operation, the subtitle timing application determines that a temporal edge associated with a subtitle does not satisfy a timing guideline based on a shot change. The shot change occurs within a sequence of frames of an audiovisual program. The subtitle timing application then determines a new temporal edge that satisfies the timing guideline relative to the shot change. Subsequently, the subtitle timing application causes a modification to a temporal location of the subtitle within the sequence of frames based on the new temporal edge. Advantageously, the modification to the subtitle improves a quality of a viewing experience for a viewer. Notably, by automatically detecting timing errors, the subtitle timing application facilitates proper and efficient re-scheduling of subtitles that are not optimally timed with shot changes.
    Type: Grant
    Filed: January 5, 2018
    Date of Patent: October 6, 2020
    Assignee: NETFLIX, INC.
    Inventors: Murthy Parthasarathi, Andrew Swan, Yadong Wang, Thomas E. Mack
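The core operation the abstract describes (moving a subtitle's temporal edge so it satisfies a timing guideline relative to a shot change) can be sketched in a few lines. The 12-frame window and snap-to-cut rule below are hypothetical stand-ins for whatever guideline the application actually enforces.

```python
def snap_edge(edge_frame, shot_change_frame, snap_window=12):
    """Hypothetical guideline: a subtitle edge that falls within `snap_window`
    frames of a shot change (but not exactly on it) is moved onto the shot
    change; edges already far enough away are left unchanged."""
    if 0 < abs(edge_frame - shot_change_frame) < snap_window:
        return shot_change_frame
    return edge_frame

print(snap_edge(105, 100))  # 100: five frames from the cut, so it snaps
print(snap_edge(130, 100))  # 130: outside the window, unchanged
```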
  • Patent number: 10777095
    Abstract: A method to develop pronunciation and intonation proficiency of English using an electronic interface, includes: preparing video bites each having an English language sound clip; preparing a script of the sound clip, wherein the script is partially marked in accordance with a predetermined rule of a pronunciation and intonation rhythm; displaying a circle on a screen of the electronic interface, wherein the circle has an illuminant movably provided along the circle, wherein the circle is serially partitioned to first to fourth quadrants; selectively playing on the screen the sound clip and the script adjacent to the circle; and synchronizing the sound clip to the illuminant in accordance with the predetermined rule, wherein an angular velocity of the illuminant moving along the circle accelerates and decelerates in the first quadrant and substantially remains constant in the second and third quadrants.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: September 15, 2020
    Inventor: Il Sung Bang
  • Patent number: 10712998
    Abstract: There is provided an information processing device to improve communication between a user and a person speaking to the user by specifying speaking motion information indicating a motion of a surrounding person speaking to the user for whom information from the surroundings is auditorily or visually restricted, the information processing device including: a detecting unit configured to detect a speaking motion of a surrounding person speaking to a user using a device that auditorily or visually restricts information from surroundings; and a specifying unit configured to specify speaking motion information indicating the speaking motion on a basis of monitored surrounding information in a case in which the speaking motion is detected.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: July 14, 2020
    Assignee: SONY CORPORATION
    Inventor: Ryouhei Yasuda
  • Patent number: 10685667
    Abstract: In aspects, systems, methods, apparatuses and computer-readable storage media implementing embodiments for mixing audio content based on a plurality of user generated recordings (UGRs) are disclosed. In embodiments, the mixing comprises: receiving a plurality of UGRs, each UGR of the plurality of UGRs comprising at least audio content; determining a correlation between samples of audio content associated with at least two UGRs of the plurality of UGRs; generating one or more clusters comprising samples of the audio content identified as having a relationship based on the determined correlations; synchronizing, for each of the one or more clusters, the samples of the audio content to produce synchronized audio content for each of the one or more clusters; normalizing, for each of the one or more clusters, the synchronized audio content to produce normalized audio content; and mixing, for each of the one or more clusters, the normalized audio content.
    Type: Grant
    Filed: June 12, 2018
    Date of Patent: June 16, 2020
    Assignee: FOUNDATION FOR RESEARCH AND TECHNOLOGY—HELLAS (FORTH)
    Inventors: Nikolaos Stefanakis, Athanasios Mouchtaris
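Two of the steps in the abstract, correlating recordings to find their relative offset and normalizing a cluster's audio, can be illustrated with a toy cross-correlation search over sample lists. This is a generic sketch of the technique, not the FORTH implementation; real systems correlate spectral features, not raw samples.

```python
def best_lag(a, b, max_lag=8):
    """Lag of recording b relative to a that maximizes cross-correlation;
    a positive lag means b's content starts `lag` samples later in a."""
    def corr(lag):
        lo, hi = max(lag, 0), min(len(a), len(b) + lag)
        return sum(a[i] * b[i - lag] for i in range(lo, hi))
    return max(range(-max_lag, max_lag + 1), key=corr)

def normalize(samples, peak=1.0):
    """Peak-normalize one cluster's synchronized audio."""
    m = max(abs(s) for s in samples)
    return [s * peak / m for s in samples] if m else list(samples)

a = [0, 0, 1, 2, 3, 0, 0]
b = [1, 2, 3, 0, 0, 0, 0]      # same content, starting two samples earlier
print(best_lag(a, b))           # 2
print(normalize([0.5, -0.25]))  # [1.0, -0.5]
```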
  • Patent number: 10672399
    Abstract: Techniques are provided for creating a mapping that maps locations in audio data (e.g., an audio book) to corresponding locations in text data (e.g., an e-book). Techniques are provided for using a mapping between audio data and text data, whether the mapping is created automatically or manually. A mapping may be used for bookmark switching where a bookmark established in one version of a digital work (e.g., e-book) is used to identify a corresponding location with another version of the digital work (e.g., an audio book). Alternatively, the mapping may be used to play audio that corresponds to text selected by a user. Alternatively, the mapping may be used to automatically highlight text in response to audio that corresponds to the text being played. Alternatively, the mapping may be used to determine where an annotation created in one media context (e.g., audio) will be consumed in another media context.
    Type: Grant
    Filed: October 6, 2011
    Date of Patent: June 2, 2020
    Assignee: APPLE INC.
    Inventors: Alan C. Cannistraro, Gregory S. Robbin, Casey M. Dougherty, Raymond Walsh, Melissa Breglio Hajj
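The bookmark-switching use case above amounts to a lookup in a sorted table of anchor pairs. Here is a minimal sketch assuming a mapping of (text offset in characters, audio time in seconds) pairs; the anchor values are illustrative, and Apple's actual mapping format is not specified in the abstract.

```python
import bisect

# Hypothetical anchors: (text_offset_chars, audio_time_secs), sorted by offset.
mapping = [(0, 0.0), (120, 7.5), (300, 19.2), (560, 34.0)]

def audio_position(text_offset):
    """Return the audio timestamp of the nearest preceding anchor, which is
    how an e-book bookmark could land at the matching spot in the audio book."""
    offsets = [o for o, _ in mapping]
    i = bisect.bisect_right(offsets, text_offset) - 1
    return mapping[max(i, 0)][1]

print(audio_position(150))  # 7.5
```

The reverse direction (audio time to text offset) is the same search over the second column, which is what text highlighting during playback needs.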
  • Patent number: 10656901
    Abstract: A media item that was presented in media players of computing devices at a first audio level may be identified, each of the media players having a corresponding user of a first set of users. A second audio level value corresponding to an amplitude setting selected by a user of the set of users during playback of the media item may be determined for each of the media players. An audio level difference (ALD) value for each of the media players may be determined based on a corresponding second audio level value. A second audio level value for an amplitude setting to be provided for the media item in response to a request of a second user to play the media item may be determined based on determined ALD values.
    Type: Grant
    Filed: December 13, 2017
    Date of Patent: May 19, 2020
    Assignee: GOOGLE LLC
    Inventor: Christian Weitenberner
  • Patent number: 10579326
    Abstract: A control device is provided which mixes and records two types of audio signals processed under standards different from each other; in particular, an audio signal of ASIO standard and an audio signal of WDM standard. An audio interface is connected to a computer, and an audio signal is input to the computer. A mixer module of the computer mixes an audio signal which is effect-processed by an ASIO application and an audio signal reproduced by a WDM application, and outputs the mixed audio signal to the audio interface and to the WDM application for sound recording. The user operates a screen displayed on an operation panel to switch between presence and absence of effect process and presence and absence of mixing.
    Type: Grant
    Filed: January 18, 2017
    Date of Patent: March 3, 2020
    Assignee: TEAC Corporation
    Inventor: Kaname Hayasaka
  • Patent number: 10575119
    Abstract: Methods and systems are provided for visualizing spatial audio using determined properties for time segments of the spatial audio. Such properties include the position sound is coming from, intensity of the sound, focus of the sound, and color of the sound at a time segment of the spatial audio. These properties can be determined by analyzing the time segment of the spatial audio. Upon determining these properties, the properties are used in rendering a visualization of the sound with attributes based on the properties of the sound(s) at the time segment of the spatial audio.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: February 25, 2020
    Assignee: Adobe Inc.
    Inventors: Stephen Joseph DiVerdi, Yaniv De Ridder
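Two of the properties the abstract lists, intensity and position, can be approximated for a stereo time segment with RMS energy and a channel-balance pan estimate. This is a deliberately crude stand-in for true spatial (e.g. ambisonic) analysis, and not Adobe's method.

```python
import math

def segment_properties(left, right):
    """Per-segment intensity as the RMS over both channels, and pan position
    from the channel energy balance (-1 = fully left, +1 = fully right)."""
    def rms(xs):
        return math.sqrt(sum(x * x for x in xs) / len(xs))
    l, r = rms(left), rms(right)
    pan = (r - l) / (r + l) if (r + l) else 0.0
    return {"intensity": rms(left + right), "pan": pan}

print(segment_properties([0.0] * 4, [0.5, -0.5, 0.5, -0.5])["pan"])  # 1.0
```

A renderer would then map intensity to the visualization's size or brightness and pan to its screen position, per the abstract.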
  • Patent number: 10565435
    Abstract: A method for determining a video-related emotion and a method of generating data for learning video-related emotions include separating an input video into a video stream and an audio stream; analyzing the audio stream to detect a music section; extracting at least one video clip matching the music section; extracting emotion information from the music section; tagging the video clip with the extracted emotion information and outputting the video clip; learning video-related emotions by using the at least one video clip tagged with the emotion information to generate a video-related emotion classification model; and determining an emotion related to an input query video by using the video-related emotion classification model to provide the emotion.
    Type: Grant
    Filed: May 30, 2018
    Date of Patent: February 18, 2020
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Jee Hyun Park, Jung Hyun Kim, Yong Seok Seo, Won Young Yoo, Dong Hyuck Im
  • Patent number: 10489450
    Abstract: Implementations generally relate to selecting soundtracks. In some implementations, a method includes determining one or more sound mood attributes of one or more soundtracks, where the one or more sound mood attributes are based on one or more sound characteristics. The method further includes determining one or more visual mood attributes of one or more visual media items, where the one or more visual mood attributes are based on one or more visual characteristics. The method further includes selecting one or more of the soundtracks based on the one or more sound mood attributes and the one or more visual mood attributes. The method further includes generating an association among the one or more selected soundtracks and the one or more visual media items, wherein the association enables the one or more selected soundtracks to be played while the one or more visual media items are displayed.
    Type: Grant
    Filed: February 26, 2015
    Date of Patent: November 26, 2019
    Assignee: Google LLC
    Inventor: Ryan James Lothian
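Matching sound mood attributes against visual mood attributes can be sketched as a set-overlap score. The track names and mood labels below are invented for illustration; the patent leaves the attribute vocabulary and scoring open.

```python
def best_soundtrack(soundtracks, visual_moods):
    """Score each soundtrack by how many of its sound mood attributes
    overlap the visual media items' mood attributes; return the best."""
    def score(track):
        return len(set(track["moods"]) & set(visual_moods))
    return max(soundtracks, key=score)["title"]

tracks = [
    {"title": "calm_piano", "moods": {"calm", "warm"}},
    {"title": "upbeat_pop", "moods": {"happy", "energetic"}},
]
print(best_soundtrack(tracks, {"happy", "bright"}))  # upbeat_pop
```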
  • Patent number: 10417279
    Abstract: Systems and methods are provided for curating playlists of content for provisioning and presenting to users a seamless cross fade experience from one piece of content to the next within the playlist. In embodiments, information that identifies portions of content without audio or video data may be maintained. Further, metadata may be generated that identifies cross fade points for the content in response to receiving input from a user device of a user. In an embodiment, each cross fade point may identify a time window of the content for interleaving with other content. In accordance with at least one embodiment, the metadata may be transmitted based at least in part on a selection of the metadata for the content.
    Type: Grant
    Filed: December 7, 2015
    Date of Patent: September 17, 2019
    Assignee: Amazon Technologies, Inc.
    Inventor: Jonathan Beech
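Within the time window a cross fade point identifies, the interleave itself is typically a linear fade: the outgoing content ramps down while the incoming content ramps up. A minimal sketch over raw sample lists (not Amazon's implementation):

```python
def crossfade(tail, head):
    """Linear equal-length crossfade: `tail` fades out while `head` fades in.
    Both windows must cover the same number of samples."""
    n = len(tail)
    assert len(head) == n
    return [tail[i] * (1 - i / n) + head[i] * (i / n) for i in range(n)]

mixed = crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0])
print(mixed)  # [1.0, 0.75, 0.5, 0.25]
```

Equal-power (cosine) ramps are the usual refinement, since linear fades dip in perceived loudness at the midpoint.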
  • Patent number: 10397525
    Abstract: In a pilotless flying object detection system, a masking area setter sets a masking area to be excluded from detection of a pilotless flying object which appears in a captured image of a monitoring area, based on audio collected by a microphone array. An object detector detects the pilotless flying object based on the audio collected by the microphone array and the masking area set by the masking area setter. An output controller superimposes sound source visual information, which indicates the volume of a sound at a sound source position, at the sound source position of the pilotless flying object in the captured image and displays the result on a first monitor in a case where the pilotless flying object is detected in an area other than the masking area.
    Type: Grant
    Filed: March 9, 2017
    Date of Patent: August 27, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Hiroyuki Matsumoto, Shintaro Yoshikuni, Masanari Miyamoto
  • Patent number: 10332506
    Abstract: Disclosed are systems and methods for improving interactions with and between computers in content searching, generating, hosting and/or providing systems supported by or configured with personal computing devices, servers and/or platforms. The systems interact to identify and retrieve data within or across platforms, which can be used to improve the quality of data used in processing interactions between or among processors in such systems. The disclosed systems and methods provide for automatic creation of a formatted, readable transcript of multimedia content, which is derived, extracted, determined, or otherwise identified from the multimedia content. The formatted, readable transcript can be utilized to increase accuracy and efficiency in search engine optimization, as well as identification of relevant digital content available for communication to a user.
    Type: Grant
    Filed: September 2, 2015
    Date of Patent: June 25, 2019
    Assignee: OATH INC.
    Inventors: Aasish Pappu, Amanda Stent
  • Patent number: 10334142
    Abstract: In some examples, a system receives a color sample comprising a color measurement of a proper subset of a gamut of colors printable by a printer, and computes a forward transform value and a reverse transform value based on a color profile calculated from a profiling chart comprising a set of estimated color samples calculated based on the received color sample, the forward and reverse transform values to convert between colorimetry values and color values for the printer. The system provides an adjusted color profile for the printer based on an original color profile for the printer and the computing, wherein the original color profile for the printer is associated with a substrate.
    Type: Grant
    Filed: March 14, 2017
    Date of Patent: June 25, 2019
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Peter Morovic, Jan Morovic
  • Patent number: 10318637
    Abstract: An editing method facilitates the task of adding background sound to speech-containing audio data so as to augment the listening experience. The editing method is executed by a processor in a computing device and comprises obtaining characterization data that characterizes time segments in the audio data by at least one of topic and sentiment; deriving, for a respective time segment in the audio data and based on the characterization data, a desired property of a background sound to be added to the audio data in the respective time segment, and providing the desired property for the respective time segment so as to enable the audio data to be combined, within the respective time segment, with background sound having the desired property. The background sound may be selected and added automatically or by manual user intervention.
    Type: Grant
    Filed: May 13, 2017
    Date of Patent: June 11, 2019
    Assignee: SONY MOBILE COMMUNICATIONS INC.
    Inventor: Ola Thörn
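Deriving a "desired property" of background sound from a segment's topic/sentiment characterization can be modeled as a lookup in a rule table. The topics, sentiments, and property fields below are invented examples; the patent leaves the characterization vocabulary open.

```python
# Hypothetical rule table mapping (topic, sentiment) to a desired
# background-sound property; all labels here are illustrative.
RULES = {
    ("travel", "positive"): {"style": "upbeat", "tempo_bpm": 120},
    ("travel", "negative"): {"style": "ambient", "tempo_bpm": 70},
    ("news", "neutral"): {"style": "minimal", "tempo_bpm": 80},
}

def desired_property(segment):
    """Look up the desired background-sound property for one time segment,
    falling back to an unobtrusive default."""
    key = (segment["topic"], segment["sentiment"])
    return RULES.get(key, {"style": "minimal", "tempo_bpm": 80})

print(desired_property({"topic": "travel", "sentiment": "positive"}))
```

The returned property is what a later stage (automatic or manual, per the abstract) would use to pick an actual background track for that segment.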
  • Patent number: 10194199
    Abstract: Methods, systems, and computer program products that automatically categorize and/or assign ratings to content (video and audio content) uploaded by individuals who want to broadcast the content to others via a communications network, such as an IPTV network, are provided. When an individual uploads content to a network, a network service automatically extracts an audio stream from the uploaded content. Words in the extracted audio stream are identified. For each identified word, a preexisting library of selected words is queried to determine if a match exists between words in the library and words in the extracted audio stream. The selected words in the library are associated with a particular content category or content rating. If a match exists between an identified word and a word in the library, the uploaded content is assigned a content category and/or rating associated with the matched word.
    Type: Grant
    Filed: May 1, 2017
    Date of Patent: January 29, 2019
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Ke Yu
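The query-the-library step the abstract describes reduces to checking transcribed words against a dictionary of selected words, each tied to a category or rating. The library contents below are invented; only the lookup shape reflects the abstract.

```python
# Hypothetical word library; each selected word maps to a content rating.
LIBRARY = {"explosion": "action", "giggle": "family", "gore": "mature"}

def rate_content(transcript_words):
    """Assign the rating of the first library word found in the words
    identified from the extracted audio stream; default to 'unrated'."""
    for word in transcript_words:
        if word in LIBRARY:
            return LIBRARY[word]
    return "unrated"

print(rate_content(["a", "huge", "explosion", "shook"]))  # action
```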
  • Patent number: 10109286
    Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: October 23, 2018
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masatsune Tamura, Masahiro Morita
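As a grossly simplified picture of the phase modulator's role, one watermark bit per pitch mark could be carried as a polarity flip of the pulse signal, which the ear largely ignores but a detector can read back. This is a hedged illustration of phase-based watermarking in general, not Toshiba's modulation scheme.

```python
def modulate_pulse(pulse, watermark_bit):
    """Toy watermark step: flip the pulse's polarity (a 180-degree phase
    shift) at a pitch mark when the watermark bit is 1, and pass the
    pulse through unchanged when the bit is 0."""
    return [-s for s in pulse] if watermark_bit else list(pulse)

print(modulate_pulse([0.2, 0.5, -0.1], 1))  # [-0.2, -0.5, 0.1]
```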
  • Patent number: 9942748
    Abstract: Embodiments of the present invention provide a service provisioning system and method, a mobile edge application server and support node. The system includes: at least one mobile edge application server (MEAS) and at least one mobile edge application server support function (MEAS-SF), where the MEAS is deployed at an access network side; and the MEAS-SF is deployed at a core network side, connected to one or more MEAS. In the service provisioning system provided in the embodiment, services that are provided by an SP are deployed in the MEAS. When the MEAS can provide the user equipment with a service requested in a service request, the MEAS directly and locally generates service data corresponding to the service request. Therefore, the user equipment directly obtains required service data from an RAN side, which avoids data congestion between an RAN and a CN and saves network resources.
    Type: Grant
    Filed: August 21, 2015
    Date of Patent: April 10, 2018
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zhiming Zhu, Weihua Liu, Mingrong Cao
  • Patent number: 9841879
    Abstract: A computing device can include a recognition mode interface utilizing graphical elements, such as virtual fireflies, to indicate recognized or identified objects. The fireflies can be animated to move across a display, and the fireflies can create bounding boxes around visual representations of objects as the objects are recognized. In some cases, the object might be of a type that has specific meaning or information to be conveyed to a user. In such cases, the fireflies might be displayed with a particular size, shape, or color to convey that information. The fireflies also can be configured to form shapes or patterns in order to convey other types of information to a user, such as where audio is being recognized, light is sufficient for image capture, and the like. Other types of information can be conveyed as well via altering characteristics of the fireflies.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: December 12, 2017
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Timothy Thomas Gray, Forrest Elliott
  • Patent number: 9838731
    Abstract: Video clips may be automatically edited to be synchronized for accompaniment by audio tracks. A preliminary version of a video clip may be made up from stored video content. Occurrences of video events within the preliminary version may be determined. A first audio track may include audio event markers. A first revised version of the video clip may be synchronized so that moments within the video clip corresponding to occurrences of video events are aligned with moments within the first audio track corresponding to audio event markers. Presentation of an audio mixing option may be effectuated on a graphical user interface of a video application for selection by a user. The audio mixing option may define volume at which the first audio track is played as accompaniment for the video clip.
    Type: Grant
    Filed: April 7, 2016
    Date of Patent: December 5, 2017
    Assignee: GoPro, Inc.
    Inventor: Joven Matias
  • Patent number: 9766854
    Abstract: This disclosure concerns the playback of audio content, e.g. in the form of music. More particularly, the disclosure concerns the playback of streamed audio. In one example embodiment, there is a method of operating an electronic device for dynamically controlling a playlist including one or several audio items. A request to adjust an energy level (e.g. a tempo) associated with the playlist is received. In response to receiving this request, the playlist is adjusted in accordance with the requested energy level (e.g., the tempo).
    Type: Grant
    Filed: August 28, 2015
    Date of Patent: September 19, 2017
    Assignee: SPOTIFY AB
    Inventors: Souheil Medaghri Alaoui, Miles Lennon, Kieran Del Pasqua
  • Patent number: 9736337
    Abstract: One example relates to a print system for adjusting a color profile. The print system can comprise a system comprising a memory for storing computer executable instructions and a processing unit for accessing the memory and executing the computer executable instructions. The computer executable instructions can comprise a profile transformer to receive a color sample comprising a color measurement of a proper subset of a gamut of colors printable with ink deposited at a printer. The profile transformer can also provide an adjusted color profile for the printer based on (i) an original color profile associated with a substrate for the printer and (ii) the color sample.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: August 15, 2017
    Assignee: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
    Inventors: Peter Morovic, Jan Morovic
  • Patent number: 9674051
    Abstract: An address generation section (111) receives an acquisition request including a file name of a target content and a device ID of a device (113) as a place where the target content is stored, from an application execution section (112) that executes a viewing application. Then, the address generation section (111) specifies the current file path and IP address of the target content in content information and device information each managed by a management section (107) on the basis of the received acquisition request, and generates an acquisition address for acquiring the target content from the device (113) on the basis of the specified file path and IP address.
    Type: Grant
    Filed: October 16, 2013
    Date of Patent: June 6, 2017
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventor: Shigehiro Iida
  • Patent number: 9626956
    Abstract: A method and a device that preprocess a speech signal are disclosed, which include extracting at least one frame corresponding to a speech recognition range from frames included in a speech signal, generating a supplementary frame to supplement speech recognition with respect to the speech recognition range based on the at least one extracted frame, and outputting a preprocessed speech signal including the supplementary frame along with the frames of the speech signal.
    Type: Grant
    Filed: April 7, 2015
    Date of Patent: April 18, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Hodong Lee
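One plausible way to "generate a supplementary frame" from extracted neighbours is simple linear interpolation of their feature values. The abstract does not say how Samsung's method synthesizes the frame, so treat this purely as a sketch of the idea.

```python
def supplementary_frame(prev_frame, next_frame):
    """Hypothetical supplement: synthesize an extra frame by linearly
    interpolating the feature values of two neighbouring extracted frames."""
    return [(a + b) / 2 for a, b in zip(prev_frame, next_frame)]

print(supplementary_frame([1.0, 2.0], [3.0, 4.0]))  # [2.0, 3.0]
```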
  • Patent number: 9558746
    Abstract: This invention describes methods for implementing human speech recognition. The methods described here use sub-events, sounds between spaces (typically a fully spoken word), that are then compared with a library of sub-events. Each sub-event is packaged as an individual unit with its own speech recognition function. This invention illustrates how this model can be used as a Large Vocabulary Speech Recognition System.
    Type: Grant
    Filed: September 30, 2014
    Date of Patent: January 31, 2017
    Inventor: Darrell Poirier
  • Patent number: 9544649
    Abstract: A device and method are presently disclosed. The computer-implemented method includes, at an electronic device with a touch-sensitive display, displaying a still image on the touch-sensitive display, while displaying the still image, detecting user's finger contact with the touch-sensitive display, and in response to detecting the user's finger contact, video recording the still image.
    Type: Grant
    Filed: February 25, 2014
    Date of Patent: January 10, 2017
    Assignee: Aniya's Production Company
    Inventors: Damon Wayans, James Cahall
  • Patent number: 9483459
    Abstract: A system is configured to receive a first string corresponding to an interpretation of a natural-language user voice entry; provide a representation of the first string as feedback to the natural-language user voice entry; receive, based on the feedback, a second string corresponding to a natural-language corrective user entry, where the natural-language corrective user entry may correspond to a correction to the natural-language user voice entry; parse the second string into one or more tokens; determine at least one corrective instruction from the one or more tokens of the second string; generate, from at least a portion of each of the first and second strings and based on the at least one corrective instruction, candidate corrected user entries; select a corrected user entry from the candidate corrected user entries; and output the selected, corrected user entry.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: November 1, 2016
    Assignee: Google Inc.
    Inventors: Michael D Riley, Johan Schalkwyk, Cyril Georges Luc Allauzen, Ciprian Ioan Chelba, Edward Oscar Benson
  • Patent number: 9472177
    Abstract: A music application guides a user with some musical experience through the steps of creating and editing a musical enhancement file that enhances and plays in synchronicity with an audio signal of an original artist's recorded performance. This enables others, perhaps with lesser musical ability than the original artist, to play-along with the original artist by following melodic, chordal, rhythmic, and verbal prompts. The music application accounts for differences in the timing of the performance from a standard tempo by guiding the user through the process of creating a tempo map for the performance and by associating the tempo map with MIDI information of the enhancement file. Enhancements may contain MIDI information, audio signal information, and/or video signal information which may be played back in synchronicity with the recorded performance to provide an aural and visual aid to others playing-along who may have less musical experience.
    Type: Grant
    Filed: December 27, 2013
    Date of Patent: October 18, 2016
    Assignee: Family Systems, Ltd.
    Inventors: Brian Reynolds, William B. Hudak
  • Patent number: 9449611
    Abstract: A computer readable medium containing computer executable instructions is described for extracting a reference representation from a mixture representation that comprises the reference representation and a residual representation wherein the reference representation, the mixture representation, and the residual representation are representations of collections of acoustical waves stored on computer readable media.
    Type: Grant
    Filed: October 1, 2012
    Date of Patent: September 20, 2016
    Assignee: AUDIONAMIX
    Inventors: Pierre Leveau, Xabier Jaureguiberry
  • Patent number: 9423944
    Abstract: A method for adjusting the sound volume of media clips using volume adjuster lines is provided. The volume adjuster lines are individually set for each clip based on the intrinsic, or absolute, volume values of the clip. In some embodiments, the volume adjuster lines are set for each clip based on the peak value or a calculated loudness equivalent of the clip. A user can move the volume adjuster line to set the absolute sound level of a clip. The volume adjuster lines can be hidden in some embodiments. In these embodiments, dragging on any portion of a clip is treated as dragging on the volume adjuster line. Some embodiments provide a deformable volume adjuster line, or curve. In these embodiments, a single audio clip can have several different volume adjuster lines for different sections of the clip where the volume adjuster line for each section is individually adjustable.
    Type: Grant
    Filed: September 6, 2011
    Date of Patent: August 23, 2016
    Assignee: APPLE INC.
    Inventor: Aaron M. Eppolito
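The key idea above is that the adjuster line encodes an absolute output level, so the applied gain is derived from each clip's own intrinsic peak. A minimal sketch under that assumption (function and parameter names are invented, not Apple's API), including per-section lines for different parts of one clip:

```python
def adjuster_gain(samples, target_peak):
    """Gain that brings a clip's intrinsic peak to an absolute target level."""
    peak = max(abs(s) for s in samples)
    return target_peak / peak if peak else 1.0

def apply_sections(samples, sections):
    """Apply individually adjustable lines per section.
    sections: list of (start, end, target_peak)."""
    out = list(samples)
    for start, end, target in sections:
        g = adjuster_gain(out[start:end], target)
        for i in range(start, end):
            out[i] *= g
    return out

clip = [0.1, 0.5, -0.25, 0.5]
print(adjuster_gain(clip, 1.0))                          # 2.0
print(apply_sections(clip, [(0, 2, 1.0), (2, 4, 0.5)]))  # [0.2, 1.0, -0.25, 0.5]
```

Because the gain depends on the clip's own peak, dragging the line to the same height on two clips yields the same absolute output level, not the same relative boost.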
  • Patent number: 9418652
    Abstract: Systems and methods for modifying a computer-based speech recognition system. A speech utterance is processed with the computer-based speech recognition system using a set of internal representations, which may comprise parameters for recognizing speech in a speech utterance, such as parameters of an acoustic model and/or a language model. The computer-based speech recognition system may perform a first task in response to the processed speech utterance. The utterance may also be provided to a human who performs a second task based on the utterance. Data indicative of the first task, performed by the computer system, is compared to data indicative of a second task, performed by the human in response to the speech utterance. Based on the comparison, the set of internal representations may be updated or modified to improve the speech recognition performance and capabilities of the speech recognition system.
    Type: Grant
    Filed: January 30, 2015
    Date of Patent: August 16, 2016
    Assignee: Next IT Corporation
    Inventor: Charles C Wooters
  • Patent number: 9378731
    Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer.
    Type: Grant
    Filed: April 22, 2015
    Date of Patent: June 28, 2016
    Assignee: Google Inc.
    Inventors: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
  • Patent number: 9361887
    Abstract: Systems and methods of providing text related to utterances, and gathering voice data in response to the text, are provided herein. In various implementations, an identification token that identifies a first file for a voice data collection campaign, and a second file for a session script may be received from a natural language processing training device. The first file and the second file may be used to configure the mobile application to display a sequence of screens, each of the sequence of screens containing text of at least one utterance specified in the voice data collection campaign. Voice data may be received from the natural language processing training device in response to user interaction with the text of the at least one utterance. The voice data and the text may be stored in a transcription library.
    Type: Grant
    Filed: September 7, 2015
    Date of Patent: June 7, 2016
    Assignee: VoiceBox Technologies Corporation
    Inventors: Daniela Braga, Faraz Romani, Ahmad Khamis Elshenawy, Michael Kennewick
  • Patent number: 9355683
    Abstract: Provided are a method and apparatus thereof for setting a marker within audio information, the method including: receiving the audio information including a silent portion and a non-silent portion; receiving a selection for a selected marker insertion point; determining, based on the received selection and received audio information, whether the selected marker insertion point occurs during the non-silent portion; and if the selected marker insertion point occurs during the non-silent portion, determining a time of the silent portion, and setting the marker to correspond to the determined time of the silent portion.
    Type: Grant
    Filed: July 30, 2010
    Date of Patent: May 31, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Jung-dae Kim
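The marker-placement logic above can be sketched with per-frame level values: if the requested insertion point falls in a non-silent frame, the marker is moved to the time of a silent portion. This is a minimal illustration under invented names, not Samsung's implementation:

```python
def snap_marker(levels, point, threshold=0.01):
    """Return a marker position: the requested point if it is silent,
    otherwise the nearest silent frame (or the point itself if none exist)."""
    if levels[point] < threshold:      # selected point is already silent
        return point
    silent = [i for i, v in enumerate(levels) if v < threshold]
    if not silent:
        return point
    return min(silent, key=lambda i: abs(i - point))

levels = [0.8, 0.9, 0.7, 0.0, 0.0, 0.8]
print(snap_marker(levels, 1))   # 3 (moved to nearest silent frame)
print(snap_marker(levels, 3))   # 3 (already silent, unchanged)
```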
  • Patent number: 9343053
    Abstract: A method of adding sound effects to movies, comprising: opening a file comprising audio and video tracks on a computing device comprising a display and touch panel input mode; running the video track on the display; selecting an audio sound suitable to a displayed frame from an audio sounds library; and adding audio effects to said selected audio sound using hand gestures on displayed art effects.
    Type: Grant
    Filed: May 7, 2014
    Date of Patent: May 17, 2016
    Assignee: SOUND IN MOTION
    Inventor: Zion Harel
  • Patent number: 9286900
    Abstract: An audio codec in a baseband processor may be utilized for mixing audio signals received at a plurality of data sampling rates. The mixed audio signals may be upsampled to a very large sampling rate, and then downsampled to a specified sampling rate that is compatible with a Bluetooth-enabled device by utilizing an interpolator in the audio codec. The downsampled signals may be communicated to Bluetooth-enabled devices, such as Bluetooth headsets, or Bluetooth-enabled devices with a USB interface. The interpolator may be a linear interpolator for which the audio codec may enable generation of triggering and/or coefficient signals based on the specified output sampling rate. An interpolation coefficient may be generated based on a base value associated with the specified output sampling rate. The audio codec may enable selecting the specified output sampling rate from a plurality of rates.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: March 15, 2016
    Assignee: Broadcom Corporation
    Inventors: Hongwei Kong, Nelson Sollenberger, Li Fung Chang, Claude Hayek, Taiyi Cheng
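Linear-interpolation resampling, as named in the abstract, computes each output sample from the two neighbouring input samples, with the interpolation coefficient given by the fractional position implied by the rate ratio. A minimal software sketch (the patent describes a hardware interpolator; names here are invented):

```python
def resample_linear(samples, in_rate, out_rate):
    """Convert sample rate by linear interpolation between neighbours."""
    n_out = int(len(samples) * out_rate / in_rate)
    out = []
    for i in range(n_out):
        pos = i * in_rate / out_rate       # fractional position in the input
        j = int(pos)
        frac = pos - j                     # interpolation coefficient
        nxt = samples[j + 1] if j + 1 < len(samples) else samples[j]
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

print(resample_linear([0, 2, 4, 6], 4, 8))   # [0, 1.0, 2, 3.0, 4, 5.0, 6, 6.0]
print(resample_linear([0, 1, 2, 3], 4, 2))   # [0, 2]
```

Upsampling to a very high intermediate rate before downsampling, as the abstract describes, reduces the error that plain linear interpolation introduces at low ratios.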
  • Patent number: 9251805
    Abstract: An object of the present invention is to process the speech of a particular speaker. The present invention provides a technique for collecting speech, analyzing the collected speech to extract the features of the speech, grouping the speech, or text corresponding to the speech, on the basis of the extracted features, presenting the result of the grouping to a user, and when one or more of the groups is selected by the user, enhancing, reducing, or cancelling the speech of a speaker associated with the selected group.
    Type: Grant
    Filed: December 2, 2013
    Date of Patent: February 2, 2016
    Assignee: International Business Machines Corporation
    Inventors: Taku Aratsu, Masami Tada, Akihiko Takajo, Takahito Tashiro
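Grouping speech by extracted features, as described above, could be sketched as clustering segments whose features (e.g. mean pitch) fall close together. This is an illustrative greedy grouping under invented names, not IBM's method:

```python
def group_by_feature(segments, tolerance=10.0):
    """Group speech segments by feature similarity.
    segments: list of (segment_id, feature), e.g. mean pitch in Hz.
    A segment joins the first group whose running mean is within tolerance."""
    groups = []                               # each entry: [mean, [ids]]
    for seg_id, feat in segments:
        for g in groups:
            if abs(g[0] - feat) <= tolerance:
                g[1].append(seg_id)
                g[0] += (feat - g[0]) / len(g[1])   # update running mean
                break
        else:
            groups.append([feat, [seg_id]])
    return [(round(mean, 1), ids) for mean, ids in groups]

segs = [("a", 120.0), ("b", 210.0), ("c", 125.0), ("d", 205.0)]
print(group_by_feature(segs))   # [(122.5, ['a', 'c']), (207.5, ['b', 'd'])]
```

Each resulting group approximates one speaker; once a group is selected, its member segments can be enhanced, reduced, or cancelled.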
  • Patent number: 9230561
    Abstract: A system and method of creating a customized multi-media message to a recipient is disclosed. The multi-media message is created by a sender and contains an animated entity that delivers an audible message. The sender chooses the animated entity from a plurality of animated entities. The system receives a text message from the sender and receives a sender audio message associated with the text message. The sender audio message is associated with the chosen animated entity to create the multi-media message. The multi-media message is delivered by the animated entity using the sender audio message as the voice, wherein the mouth movements of the animated entity conform to the sender audio message.
    Type: Grant
    Filed: August 27, 2013
    Date of Patent: January 5, 2016
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Joern Ostermann, Mehmet Reha Civanlar, Barbara Buda, Claudio Lande
  • Patent number: 9202469
    Abstract: A technique for recording dictation, meetings, lectures, and other events includes automatically segmenting an audio recording into portions by detecting speech transitions within the recording and selectively identifying certain portions of the recording as noteworthy. Noteworthy audio portions are displayed to a user for selective playback. The user can navigate to different noteworthy audio portions while ignoring other portions. Each noteworthy audio portion starts and ends with a speech transition. Thus, the improved technique typically captures noteworthy topics from beginning to end, thereby reducing or avoiding the need for users to have to search for the beginnings and ends of relevant topics manually.
    Type: Grant
    Filed: September 16, 2014
    Date of Patent: December 1, 2015
    Assignee: Citrix Systems, Inc.
    Inventors: Yogesh Moorjani, Ryan Warren Kasper, Ashish V. Thapliyal, Ajay Kumar, Abhinav Kuruvadi Ramesh Babu, Elizabeth Thapliyal, James Kalbach, Margaret Dianne Cramer
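Segmenting a recording at speech transitions, as this abstract describes, can be illustrated with a simple energy-based pause detector: a segment starts when per-frame energy rises above a silence threshold and ends when it falls back below it. A minimal sketch (invented names, not Citrix's detector):

```python
def segment(energies, silence=0.05):
    """Split a recording into (start, end) frame ranges at speech
    transitions, i.e. wherever the energy crosses the silence threshold."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e >= silence and start is None:
            start = i                       # transition: silence -> speech
        elif e < silence and start is not None:
            segments.append((start, i))     # transition: speech -> silence
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments

print(segment([0.0, 0.6, 0.7, 0.0, 0.0, 0.8, 0.9, 0.8]))   # [(1, 3), (5, 8)]
```

Because each portion starts and ends at a transition, a noteworthy topic is captured whole, which is the navigation benefit the abstract claims.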
  • Patent number: 9203966
    Abstract: A method and device are provided for modifying a compounded voice message having at least one first voice component. The method includes a step of obtaining at least one second voice component, a step of updating at least one item of information belonging to a group of items of information associated with the compounded voice message as a function of the at least one second voice component and a step of making available the compounded voice message comprising the at least one first and second voice components, and the group of items of information associated with the compounded voice message. The compounded voice message is intended to be consulted by at least one recipient user.
    Type: Grant
    Filed: September 26, 2012
    Date of Patent: December 1, 2015
    Assignee: FRANCE TELECOM
    Inventor: Ghislain Moncomble
  • Patent number: 9201580
    Abstract: Sound alignment user interface techniques are described. In one or more implementations, a user interface is output having a first representation of sound data generated from a first sound signal and a second representation of sound data generated from a second sound signal. One or more inputs are received, via interaction with the user interface, that indicate that a first point in time in the first representation corresponds to a second point in time in the second representation. Aligned sound data is generated from the sound data from the first and second sound signals based at least in part on correspondence of the first point in time in the sound data generated from the first sound signal to the second point in time in the sound data generated from the second sound signal.
    Type: Grant
    Filed: November 13, 2012
    Date of Patent: December 1, 2015
    Assignee: Adobe Systems Incorporated
    Inventors: Brian John King, Gautham J. Mysore, Paris Smaragdis
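With two user-marked correspondence points, the alignment above reduces to a linear time mapping between the signals: a stretch factor and an offset. A minimal sketch under that assumption (invented names, not Adobe's algorithm):

```python
def mapping(p, q):
    """Derive a linear time mapping from two user-marked correspondences.
    p = (t1_a, t2_a) and q = (t1_b, t2_b) pair a point in signal 1 with the
    matching point in signal 2. Returns (stretch, offset) such that
    t1 = stretch * t2 + offset."""
    (a1, a2), (b1, b2) = p, q
    stretch = (b1 - a1) / (b2 - a2)
    offset = a1 - stretch * a2
    return stretch, offset

# Signal 2's t=0.0 s matches signal 1's t=1.0 s; its t=1.0 s matches t=3.0 s.
s, o = mapping((1.0, 0.0), (3.0, 1.0))
print(s, o)   # 2.0 1.0
```

With a single marked point the stretch is fixed at 1 and only the offset applies; additional points allow piecewise mappings.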
  • Patent number: 9159363
    Abstract: Systems and methods are disclosed to adjust the loudness or another audio attribute for one or more audio clips. Intra-track audio levels can automatically be equalized, for example, to achieve a homogeneous audio level for all clips within an audio track. Information about such audio adjustments may be identified and stored as information without destructively altering the underlying clip content. For example, keyframes may define changes to a fader that will be applied at different points along a track's timeline to achieve the audio adjustments when the track is played. An audio editing application can provide a feature for automatically determining appropriate keyframes, allow manual editing of keyframes, and use keyframes to display control curves that represent graphically the time-based adjustments made to track-specific faders, play test audio output, and output combined audio, among other things.
    Type: Grant
    Filed: April 2, 2010
    Date of Patent: October 13, 2015
    Assignee: Adobe Systems Incorporated
    Inventors: Holger Classen, Sven Duwenhorst
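Equalizing intra-track levels non-destructively, as described above, amounts to computing a per-clip gain that is stored as adjustment information (e.g. fader keyframes) rather than written into the samples. A minimal RMS-based sketch with invented names:

```python
import math

def equalize_gains(clips, target_rms=0.5):
    """Per-clip gains that bring every clip to the same RMS level.
    The gains would be stored as keyframe adjustments, leaving the
    underlying clip content untouched."""
    gains = []
    for clip in clips:
        rms = math.sqrt(sum(s * s for s in clip) / len(clip))
        gains.append(target_rms / rms if rms else 1.0)
    return gains

clips = [[0.5, -0.5], [0.25, -0.25]]
print(equalize_gains(clips))   # [1.0, 2.0]
```

Applying the stored gains at playback yields a homogeneous level across the track while the original audio remains editable.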
  • Patent number: 9153234
    Abstract: A speech recognition apparatus includes: a recognition device that recognizes a speech of a user and generates a speech character string; a display device that displays the speech character string; a reception device that receives an input of a correction character string, which is used for correction of the speech character string, through an operation portion; and a correction device that corrects the speech character string by using the correction character string.
    Type: Grant
    Filed: March 19, 2013
    Date of Patent: October 6, 2015
    Assignee: DENSO CORPORATION
    Inventors: Toru Nada, Kiyotaka Taguchi, Makoto Manabe, Shinji Hatanaka, Norio Sanma, Makoto Obayashi, Akira Yoshizawa
  • Patent number: 9064558
    Abstract: A recording and/or reproducing apparatus includes a microphone, a semiconductor memory, an operating section and a controller. An output signal from the microphone is written in the semiconductor memory and the written signals are read out from the semiconductor memory. The operating section performs input processing for writing a digital signal outputted by an analog/digital converter, reading out the digital signal stored in the semiconductor memory and for erasing the digital signal stored in the semiconductor memory. The control section controls the writing of the microphone output signal in the semiconductor memory based on an input from the operating section and the readout of the digital signal stored in the semiconductor memory.
    Type: Grant
    Filed: May 30, 2013
    Date of Patent: June 23, 2015
    Assignee: Sony Corporation
    Inventor: Kenichi Iida
  • Patent number: 9066049
    Abstract: Provided in some embodiments is a computer implemented method that includes providing script data including script words indicative of dialogue words to be spoken, providing recorded dialogue audio data corresponding to at least a portion of the dialogue words to be spoken, wherein the recorded dialogue audio data includes timecodes associated with recorded audio dialogue words, matching at least some of the script words to corresponding recorded audio dialogue words to determine alignment points, determining that a set of unmatched script words are accurate based on the matching of at least some of the script words matched to corresponding recorded audio dialogue words, generating time-aligned script data including the script words and their corresponding timecodes and the set of unmatched script words determined to be accurate based on the matching of at least some of the script words matched to corresponding recorded audio dialogue words.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: June 23, 2015
    Assignee: Adobe Systems Incorporated
    Inventors: Jerry R. Scoggins, II, Walter W. Chang, David A. Kuspa
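The matching step above pairs script words, in order, with recognized dialogue words that carry timecodes; script words without a match are kept but flagged. A minimal greedy in-order sketch (invented names, not Adobe's matcher):

```python
def align_script(script_words, recognized):
    """Match script words in order against recognized (word, timecode)
    pairs from the recorded dialogue. Matched words become alignment
    points with timecodes; unmatched words carry None."""
    aligned, j = [], 0
    for w in script_words:
        k = j
        while k < len(recognized) and recognized[k][0] != w:
            k += 1                         # scan forward for an in-order match
        if k < len(recognized):
            aligned.append((w, recognized[k][1]))
            j = k + 1
        else:
            aligned.append((w, None))      # no timecode found for this word
    return aligned

rec = [("hello", 0.0), ("there", 0.4), ("world", 0.9)]
print(align_script(["hello", "big", "world"], rec))
# [('hello', 0.0), ('big', None), ('world', 0.9)]
```

An unmatched word bracketed by two matched neighbours, as in this example, is the kind of word the patent's method could still accept as accurate based on its surrounding matches.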
  • Patent number: 9058876
    Abstract: A resistive random access memory integrated circuit for use as a mass storage media and adapted for bulk erase by substantially simultaneously switching all memory cells to one of at least two possible resistive states. Bulk switching is accomplished by biasing all bottom electrodes within an erase area to a voltage lower than that of the top electrodes, wherein the erase area can comprise the entire memory array of the integrated circuit or else a partial array. Alternatively the erase area may be a single row and, upon receiving the erase command, the row address is advanced automatically and the erase step is repeated until the entire array has been erased.
    Type: Grant
    Filed: June 21, 2013
    Date of Patent: June 16, 2015
    Assignee: 4D-S, LTD
    Inventors: Lee Cleveland, Franz Michael Schuette