Patents Issued on December 24, 2020
  • Publication number: 20200402487
    Abstract: Systems, devices, and methods for encoding digital representations of musical compositions are described. Various components of a musical composition that are defined in modern music theory, such as notes and bars, are encoded as respective hierarchically-dependent data objects in a data file. The hierarchically-dependent data objects encode the musical composition in a tree-like data structure with modular nodes and adjustable relationships between nodes. Note start times and beat start times are encoded independently of one another and characterized by a timing relationship that captures the expressiveness imbued when notes and beats are not precisely synchronized. Musical variations that preserve the timing relationship between the notes and beats of the original composition are also generated and encoded.
    Type: Application
    Filed: January 28, 2020
    Publication date: December 24, 2020
    Inventors: Colin P. Williams, Gregory Gabrenya
  • Publication number: 20200402488
    Abstract: Systems, devices, and methods for encoding the harmonic structure of a musical composition in a digital data structure are described. Tonal and rhythmic commonalities are identified across the musical bars that make up a musical composition. Individual bars of the musical composition are each analyzed to characterize their respective harmonic fingerprints in various forms, and the respective harmonic fingerprints are compared to sort the musical bars into harmonic equivalence categories. Isomorphic mappings between hierarchical data structures that encode the musical composition based on musicality and harmony, respectively, are also described. The systems, devices, and methods for encoding the harmonic structure of a musical composition in a digital data structure have broad applicability in computer-based composition and variation of music.
    Type: Application
    Filed: January 28, 2020
    Publication date: December 24, 2020
    Inventor: Colin P. Williams
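
The sorting of bars into harmonic equivalence categories described in the abstract above can be illustrated with a minimal sketch. The abstract does not define the harmonic fingerprint, so the sketch assumes one simple possibility: the set of pitch classes sounding in a bar. The function names and the MIDI-number input are illustrative assumptions, not the applicant's method.

```python
from collections import defaultdict


def harmonic_fingerprint(bar_notes):
    """Assumed fingerprint: the set of pitch classes (0-11) in a bar of MIDI notes."""
    return frozenset(note % 12 for note in bar_notes)


def group_bars(bars):
    """Group bar indices whose fingerprints are identical."""
    groups = defaultdict(list)
    for index, bar in enumerate(bars):
        groups[harmonic_fingerprint(bar)].append(index)
    return sorted(groups.values())


# Hypothetical four-bar piece: bars 0, 1 and 3 are all C major in different voicings.
bars = [[60, 64, 67], [72, 76, 79], [62, 65, 69], [48, 52, 55, 60]]
print(group_bars(bars))
```

Under this assumption, octave and doubling differences are ignored, so differently voiced bars with the same chord land in the same category.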
  • Publication number: 20200402489
    Abstract: Provided is a mechanism allowing improvement of the degree of freedom in production when multiple users collaboratively produce a music track via a network. An information processing apparatus includes a control unit configured to receive multitrack data containing multiple pieces of track data generated by different users, edit the multitrack data, and transmit the edited multitrack data.
    Type: Application
    Filed: October 10, 2018
    Publication date: December 24, 2020
    Applicant: SONY CORPORATION
    Inventors: Junichirou SAKATA, Keisuke SAITOU, Keiichiro YAMADA, Misao SATO, Kouichi SUNAGA, Takuya OGURA
  • Publication number: 20200402490
    Abstract: Various aspects include systems and approaches for providing audio performance capabilities with one or more far field microphones. One aspect includes a method of controlling a speaker system with at least one far field microphone that is coupled with a separate display device. The method can include: receiving a user command to initiate an audio performance mode; initiating audio playback of an audio performance file at a transducer at the speaker system; initiating video playback including musical performance guidance associated with the audio performance file at the display device; receiving a user generated acoustic signal at the at least one far field microphone after initiating the audio playback and the video playback; comparing the user generated acoustic signal with a reference acoustic signal; and providing feedback about the comparison to the user.
    Type: Application
    Filed: June 20, 2019
    Publication date: December 24, 2020
    Inventor: Gregg Michael Duthaler
  • Publication number: 20200402491
    Abstract: A reverberating percussion instrument has a resonating chamber of a completely smoothly contoured configuration to facilitate generation of long-duration pitched reverberations, and a plurality of hollow and substantially mutually parallel vibratable tubes of different lengths, each of which being suspended from an element of the resonating chamber by a tensioned filament at each longitudinal end thereof. Reverberated and resonating musical sounds are directed forwardly from the resonating chamber in response to selective vibration of one or more of the tubes.
    Type: Application
    Filed: February 27, 2019
    Publication date: December 24, 2020
    Applicant: SoundFreq LTD
    Inventors: Ortal Pelleg, Ido Luxman
  • Publication number: 20200402492
    Abstract: Rotor noise cancellation through the use of mechanical means for a personal aerial drone vehicle. Active noise cancellation is achieved by creating an antiphase amplitude wave by modulation of the propeller blades, by utilizing embedded magnets through an electromagnetic coil encircling the propeller blades. A noise level sensor signals the rotor control system to adjust the frequency of the electromagnetic field surrounding the rotor and control the speed of the rotor. An additional method comprises incorporating a phase-locked loop within the control system configured to determine the frequencies corresponding to the rotors and generate corrective audio signals to achieve active noise cancellation.
    Type: Application
    Filed: June 24, 2019
    Publication date: December 24, 2020
    Inventor: ALAN RICHARD GREENBERG
  • Publication number: 20200402493
    Abstract: In one aspect, a method includes receiving an input signal captured by one or more first sensors associated with an active noise reduction (ANR) device, and processing the input signal using a first filter disposed in an ANR signal path to generate a first signal for an acoustic transducer of the ANR device. The input signal is processed in a pass-through signal path disposed in parallel with the ANR signal path to generate a second signal for the acoustic transducer, wherein the pass-through signal path allows a portion of the input signal to pass through to the acoustic transducer in accordance with a variable gain. One or more second sensors detect an existence of a condition likely to cause instability in the pass-through signal path, and in response, the variable gain is adjusted. A driver signal for the acoustic transducer is generated using an output based on the adjusted gain.
    Type: Application
    Filed: June 20, 2019
    Publication date: December 24, 2020
    Inventor: Christopher A. Barnes
  • Publication number: 20200402494
    Abstract: The signal processing apparatus includes a noise canceling processing unit capable of connecting one or more input portions and one or more output portions, performs a noise canceling processing by connecting a plurality.
    Type: Application
    Filed: March 8, 2019
    Publication date: December 24, 2020
    Inventors: SHIGETOSHI HAYASHI, KOHEI ASADA, SHINPEI TSUCHIYA, KAZUNOBU OOKURI
  • Publication number: 20200402495
    Abstract: Disclosed herein is a method of producing an ultrasound that includes defining a set of criteria for an ultrasound emitter comprising a plate. The set of criteria includes a power output criterion, a frequency criterion and number of nodes for a resonance mode of the plate, a focus criterion, and a durability criterion. The method includes determining an outline and a thickness range for the plate, based on the set of criteria. The method includes using topology optimization to determine internodal zone dimensions for the plate, based on the set of criteria, the outline, and the thickness range. The method includes manufacturing the plate according to the internodal zone dimensions.
    Type: Application
    Filed: June 24, 2019
    Publication date: December 24, 2020
    Inventors: John Z. Lin, Wayne Cooper, Kane M. Mordaunt
  • Publication number: 20200402496
    Abstract: A reverberation adding apparatus includes: a plurality of paths that constitutes an output channel or a plurality of output channels; and a convolution operation unit that convolves an impulse response for each of the paths, in which the impulse response is formed by combining a plurality of reverberation pattern blocks in a time axis direction, the reverberation adding apparatus using the reverberation pattern blocks common to the plurality of convolution operation units.
    Type: Application
    Filed: January 8, 2019
    Publication date: December 24, 2020
    Applicant: Sony Corporation
    Inventor: Yuji Tsuchida
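
The idea in the abstract above, building each path's impulse response by joining shared reverberation pattern blocks along the time axis and then convolving, can be sketched as follows. The block names, block contents, and path layout are hypothetical, and the direct-form convolution is a plain textbook implementation rather than the apparatus described in the application.

```python
def build_impulse_response(block_ids, blocks):
    """Concatenate shared pattern blocks in time order to form one impulse response."""
    impulse_response = []
    for block_id in block_ids:
        impulse_response.extend(blocks[block_id])
    return impulse_response


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Hypothetical pattern blocks shared by every path.
blocks = {"early": [1.0, 0.5], "tail": [0.25, 0.125]}
# Each path reuses the same blocks in a different order.
paths = {"left": ["early", "tail"], "right": ["tail", "early"]}

dry = [1.0, 0.0, 1.0]
wet = {name: convolve(dry, build_impulse_response(ids, blocks))
       for name, ids in paths.items()}
print(wet["left"])
```

Because the paths differ only in how they order the common blocks, the block data is stored once and referenced by every path.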
  • Publication number: 20200402497
    Abstract: Systems and methods for generating audio data in accordance with embodiments of the invention are illustrated. One embodiment includes a method for generating audio data. The method includes steps for generating a plurality of style tokens from a set of audio inputs, generating an input feature vector based on the plurality of style tokens and a set of text features, and generating audio data (e.g., a spectrogram, audio waveforms, etc.) based on the input feature vector.
    Type: Application
    Filed: June 24, 2020
    Publication date: December 24, 2020
    Applicant: Replicant Solutions, Inc.
    Inventors: Zak Semenov, John Meade, Alessandro Marin, Alexander L. De Souza, Benjamin Gleitzman, Meghna Suresh
  • Publication number: 20200402498
    Abstract: [Problem] There are proposed an information processing apparatus, an information processing method, and a program, which are capable of learning a meaning corresponding to a speech recognition result of a first speech adaptively to a determination result as to whether or not a second speech is a restatement of the first speech. [Solution] An information processing apparatus including: a learning unit configured to learn, based on a determination result as to whether or not a second speech collected at second timing after first timing is a restatement of a first speech collected at the first timing, a meaning corresponding to a speech recognition result of the first speech.
    Type: Application
    Filed: November 30, 2018
    Publication date: December 24, 2020
    Applicant: Sony Corporation
    Inventors: Shinichi KAWANO, Hiro IWASE, Yuhei TAKI
  • Publication number: 20200402499
    Abstract: Systems and methods for detecting speech activity. The system includes an audio source and an electronic processor. The electronic processor is configured to receive a first audio signal from the audio source, buffer the first audio signal, add random noise to the buffered first audio signal, and filter the first audio stream to create a filtered signal. The electronic processor then determines a signal entropy of each frame of the filtered signal, determines an average signal entropy of a first plurality of frames of the filtered signal occurring at a beginning of the filtered signal, and compares the signal entropy of each frame of the filtered signal to the average signal entropy. Based on the comparison, the electronic processor determines a first speech endpoint located in a first frame of the filtered signal.
    Type: Application
    Filed: June 21, 2019
    Publication date: December 24, 2020
    Inventors: Pongtep Angkititrakul, HyeongSik Kim
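
The steps listed in the abstract above (add random noise, filter, compute per-frame entropy, compare against the average entropy of the leading frames) can be sketched roughly as below. Everything specific here is an assumption made for illustration: the histogram-based entropy, the pre-emphasis filter, the frame length, the threshold margin, and all function names. The abstract does not state which entropy measure, filter, or decision rule the application uses.

```python
import math
import random


def frame_entropy(frame, bins=32, lo=-1.0, hi=1.0):
    """Shannon entropy, in bits, of a histogram of the frame's sample amplitudes."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in frame:
        clipped = min(max(x, lo), hi - 1e-12)  # keep the index inside the histogram
        counts[int((clipped - lo) / width)] += 1
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def first_speech_frame(signal, frame_len=160, lead_frames=5, margin=0.5,
                       noise=1e-3, seed=0):
    """Return the index of the first frame whose entropy exceeds the baseline, or None."""
    rng = random.Random(seed)
    # Add low-level random noise so silent frames still have a defined entropy.
    noisy = [x + rng.uniform(-noise, noise) for x in signal]
    # Assumed filter: first-order pre-emphasis.
    filtered = [noisy[0]] + [noisy[i] - 0.97 * noisy[i - 1]
                             for i in range(1, len(noisy))]
    frames = [filtered[i:i + frame_len]
              for i in range(0, len(filtered) - frame_len + 1, frame_len)]
    entropies = [frame_entropy(f) for f in frames]
    # Baseline: average entropy of the frames at the beginning of the signal.
    baseline = sum(entropies[:lead_frames]) / lead_frames
    for index, h in enumerate(entropies):
        if h > baseline + margin:  # assumed decision rule
            return index
    return None


# Synthetic input: ten silent frames followed by ten frames of a 440 Hz tone at 8 kHz.
silence = [0.0] * 1600
tone = [0.9 * math.sin(2 * math.pi * 440 * t / 8000) for t in range(1600)]
print(first_speech_frame(silence + tone))
```

The sketch assumes the recording begins with non-speech audio, since the leading frames define the baseline that later frames are compared against.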
  • Publication number: 20200402500
    Abstract: A method and device for generating a speech recognition model are provided. The method includes: obtaining training samples, wherein each training sample includes a speech frame sequence and a labeled text sequence; training the encoder by using the speech frame sequence as an input feature and using speech encoded frames of the speech frame sequence as an output feature; training the decoder by using the speech encoded frames as a first input feature and using the labeled text sequence as a first output feature, and obtaining a current prediction text sequence; and training the decoder again by using the speech encoded frames as a second input feature and using a sequence as a second output feature, wherein the sequence is obtained by sampling the labeled text sequence and the current prediction text sequence based on a preset probability.
    Type: Application
    Filed: September 3, 2020
    Publication date: December 24, 2020
    Inventors: Yuanyuan Zhao, Jie Li, Xiaorui Wang, Yan Li
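
The final step of the abstract above, obtaining a sequence by sampling between the labeled text and the current prediction with a preset probability, might look like the sketch below. It assumes the choice is made independently per token position and that both sequences have the same length; the abstract does not say either way, and the function name is hypothetical.

```python
import random


def mix_sequences(labeled, predicted, p_label, rng):
    """Per position, keep the labeled token with probability p_label, else the predicted one."""
    if len(labeled) != len(predicted):
        raise ValueError("sequences must have the same length")
    return [lab if rng.random() < p_label else pred
            for lab, pred in zip(labeled, predicted)]


labeled = ["turn", "on", "the", "light"]
predicted = ["turn", "in", "the", "night"]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(mix_sequences(labeled, predicted, 0.7, rng))
```

With `p_label` at 1.0 the result is the labeled sequence, and at 0.0 it is the model's own prediction; values in between expose the decoder to some of its own mistakes during the second training pass.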
  • Publication number: 20200402501
    Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.
    Type: Application
    Filed: April 30, 2020
    Publication date: December 24, 2020
    Applicant: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Golan Pundak, Tara N. Sainath, Antoine Jean Bruguier
  • Publication number: 20200402502
    Abstract: In one example of the disclosure, microphone data indicative of a user spoken phrase is captured utilizing the microphone at a communication apparatus. At least a portion of the microphone data is sent to a set of computing devices. A response phrase determined at a virtual assistant service is received from each of the computing devices. A preferred response phrase is identified among the set of received response phrases according to a preference rule. The preferred response phrase is caused to be output via a speaker at the communication apparatus.
    Type: Application
    Filed: June 21, 2016
    Publication date: December 24, 2020
    Applicant: Hewlett-Packard Development Company, L.P.
    Inventors: David H. Hanes, John Michael Main, Jon R. Dory
  • Publication number: 20200402503
    Abstract: An information processing apparatus includes a processor. The processor is configured to identify, from a character string recognition result for a form, a form feature that indicates at least a field in which the form is used or an attribute of a filling-out person filling out the form, accumulate past correction tendencies for character string recognition results for forms having respective identified form features, and obtain a correction tendency for a form having a form feature that is the same as the identified form feature from among the accumulated correction tendencies, and perform control to display a candidate correct expression for the character string recognition result for the form in accordance with the obtained correction tendency.
    Type: Application
    Filed: October 30, 2019
    Publication date: December 24, 2020
    Applicant: FUJI XEROX CO., LTD.
    Inventor: Mami IWANARI
  • Publication number: 20200402504
    Abstract: Provided herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for enabling Do Not Disturb functionality in voice responsive devices. An example embodiment operates by: enabling a user to configure Do Not Disturb settings for a voice responsive device; while (a) the Do Not Disturb functionality is activated for the voice responsive device, and (b) within a Do Not Disturb time period specified by the Do Not Disturb settings: disabling one or more microphones; receiving an unambiguous trigger; responsive to receiving the unambiguous trigger, enabling the microphone(s); receiving a voice command; and processing the voice command. An example of an unambiguous trigger may be the user pressing a talk button (either a physical or digital button) on a remote control associated with the voice responsive device.
    Type: Application
    Filed: March 4, 2020
    Publication date: December 24, 2020
    Inventors: ALI M. VASSIGH, SHUBHADA HEBBAR, CHRISTOPHER JAMES TEGETHOFF
  • Publication number: 20200402505
    Abstract: Provided are a trigger recognition model generating method for a robot and a robot to which the method is applied. A trigger recognition model generating method comprises obtaining an input text which expresses a voice trigger, obtaining a first set of voice triggers by voice synthesis from the input text, obtaining a second set of voice triggers by applying a first filter in accordance with an environmental factor to the first set of voice triggers, obtaining a third set of voice triggers by applying a second filter in accordance with a mechanism characteristic of the robot to the second set of voice triggers, and applying the first, second, and third sets of voice triggers to the trigger recognition model as learning data for the voice trigger. By doing this, a trigger recognition model which is capable of recognizing a new trigger is generated.
    Type: Application
    Filed: April 20, 2020
    Publication date: December 24, 2020
    Applicant: LG ELECTRONICS INC.
    Inventor: Yongjin PARK
  • Publication number: 20200402506
    Abstract: A response device includes a data acquirer configured to acquire data to which metadata is added, a database register configured to generate a tag on the basis of the metadata and register the generated tag in a database in association with the data to which the metadata is added, an utterance content interpreter configured to interpret content of an utterance of a user, a searcher configured to search the database using the tag included in the utterance content when an intention to search for the data has been interpreted by the utterance content interpreter, and a responder configured to cause an outputter to output information according to a search result of the searcher.
    Type: Application
    Filed: June 17, 2020
    Publication date: December 24, 2020
    Inventors: Kenzo Yoneyama, Tsutomu Ogawa
  • Publication number: 20200402507
    Abstract: Training and/or utilizing a single neural network model to generate, at each of a plurality of assistant turns of a dialog session between a user and an automated assistant, a corresponding automated assistant natural language response and/or a corresponding automated assistant action. For example, at a given assistant turn of a dialog session, both a corresponding natural language response and a corresponding action can be generated jointly and based directly on output generated using the single neural network model. The corresponding response and/or corresponding action can be generated based on processing, using the neural network model, dialog history and a plurality of discrete resources. For example, the neural network model can be used to generate a response and/or action on a token-by-token basis.
    Type: Application
    Filed: June 24, 2020
    Publication date: December 24, 2020
    Inventors: Arvind Neelakantan, Daniel Duckworth, Ben Goodrich, Vishaal Prasad, Chinnadhurai Sankar, Semih Yavuz
  • Publication number: 20200402508
    Abstract: According to examples, an apparatus may include a processor and a memory on which are stored machine readable instructions that when executed by the processor cause the processor to access a voice command file pertaining to a data file and to send the voice command file to at least one of a server and a voice services provider, in which the voice services provider is to translate the voice command file to a workflow task message and to send the workflow task message to the server. The instructions may also cause the processor to receive the workflow task message from the server and execute a workflow task on the data file corresponding to the received workflow task message.
    Type: Application
    Filed: March 23, 2018
    Publication date: December 24, 2020
    Inventor: Marcus Allen Thomas
  • Publication number: 20200402509
    Abstract: A function execution instruction system includes a function execution instruction unit configured to instruct execution of one or more functions, a sentence input unit configured to input a sentence, an execution function determination unit configured to determine a function the execution of which is instructed on the basis of an input sentence, a time information extraction unit configured to extract time information indicating a time from the input sentence, and a time specification unit configured to, in accordance with a determined function, specify a time used for the execution of the function on the basis of extracted time information, wherein the function execution instruction unit instructs the execution of the determined function, which uses a specified time.
    Type: Application
    Filed: December 4, 2018
    Publication date: December 24, 2020
    Applicant: NTT DOCOMO, INC.
    Inventors: Hiroshi FUJIMOTO, Kousuke KADONO
  • Publication number: 20200402510
    Abstract: Disclosed herein is a calendar-based information management system. The system may include a touchscreen display for displaying digital content and receiving a touch input, a speaker to produce an audible output, a communication device to communicate with an external electronic device, a GUI engine for generating the GUI presented through the touchscreen, a user management engine to manage a plurality of user profiles, a speech recognition engine to perform speech recognition, a command processing engine to process at least one command, a storage device to retrievably store a plurality of calendar event data and at least one digital data, a category management engine to manage a plurality of categories associated with the plurality of calendar event data, a notification engine to generate at least one notification based on the plurality of calendar event data and a document and image repository engine to process at least one digital document.
    Type: Application
    Filed: September 8, 2020
    Publication date: December 24, 2020
    Inventor: Carrie D. Sluiter
  • Publication number: 20200402511
    Abstract: A method for processing speech data for a speech event, wherein the speech data comprises a visible component and an audible component. The method comprises identifying a first visible feature within the visible component that corresponds to a predetermined visible speech feature and determining a first time corresponding to the occurrence of the first visible feature during the speech event. The method further comprises determining a measurement of a characteristic of the audible component at a second time during the speech event, which has a predefined temporal relationship to the first time at which the first visible feature occurred, and using the determined measurement of a characteristic at the second time to output an evaluation of an attribute, with which the predetermined visible speech feature is associated.
    Type: Application
    Filed: June 19, 2020
    Publication date: December 24, 2020
    Applicant: University of Tartu
    Inventors: Gholamreza Anbarjafari, Kadir Aktas
  • Publication number: 20200402512
    Abstract: A method includes receiving a speech input from a user and obtaining context metadata associated with the speech input. The method also includes generating a raw speech recognition result corresponding to the speech input and selecting a list of one or more denormalizers to apply to the generated raw speech recognition result based on the context metadata associated with the speech input. The generated raw speech recognition result includes normalized text. The method also includes denormalizing the generated raw speech recognition result into denormalized text by applying the list of the one or more denormalizers in sequence to the generated raw speech recognition result.
    Type: Application
    Filed: September 1, 2020
    Publication date: December 24, 2020
    Applicant: Google LLC
    Inventors: Assaf Hurwitz Michaely, Petar Aleksic, Pedro J. Moreno Mengibar
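
The selection and sequential application of denormalizers described in the abstract above can be illustrated with a small sketch. The three denormalizers, the `application` metadata key, and the context names are all hypothetical stand-ins chosen for the example; the abstract does not specify what the denormalizers do or how context maps to them.

```python
# Hypothetical, deliberately tiny number vocabulary.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}


def words_to_digits(text):
    """Replace spelled-out numbers with digits."""
    return " ".join(NUMBER_WORDS.get(word, word) for word in text.split())


def capitalize_first(text):
    """Capitalize the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


def add_period(text):
    """Append a period unless one is already present."""
    return text if text.endswith(".") else text + "."


# Assumed mapping from context metadata to an ordered list of denormalizers.
DENORMALIZERS_BY_CONTEXT = {
    "dictation": [words_to_digits, capitalize_first, add_period],
    "search": [words_to_digits],
}


def denormalize(raw_text, context):
    """Apply the denormalizers selected by the context, in order."""
    for step in DENORMALIZERS_BY_CONTEXT.get(context.get("application"), []):
        raw_text = step(raw_text)
    return raw_text


print(denormalize("set a timer for five minutes", {"application": "dictation"}))
print(denormalize("set a timer for five minutes", {"application": "search"}))
```

Keeping each denormalizer as an independent function makes the per-context list easy to reorder or extend without touching the recognizer output itself.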
  • Publication number: 20200402513
    Abstract: The present disclosure provides a Bluetooth speaker base, a method and a system for controlling a Bluetooth speaker base. The method includes: acquiring voice data, and determining whether the voice data includes a wake-up word, when positions of the Bluetooth speaker base and a Bluetooth speaker satisfy a preset condition; controlling the Bluetooth speaker base to enter a wake-up recognition state, and compressing the voice data based on a compression ratio, when the voice data includes the wake-up word; and sending the voice data compressed to a mobile terminal through a first profile, to cause the mobile terminal to decompress the voice data received, send the voice data decompressed to a server for voice recognition to obtain audio data, and send the audio data to the Bluetooth speaker for playback through a second profile.
    Type: Application
    Filed: March 9, 2020
    Publication date: December 24, 2020
    Inventors: Xujie ZHU, Jingran LI, Chao TIAN, Shoukuan WANG, Lili WANG, Wei LIU
  • Publication number: 20200402514
    Abstract: The present disclosure proposes a speech chip and an electronic device. The speech chip includes: a peripheral interface connected to a speech receiver and configured to receive a speech signal; a bus matrix connected to the peripheral interface; a first processor connected to the bus matrix and configured to determine whether the speech signal contains a wake-up word according to the speech signal; a second processor connected to the bus matrix and configured to perform signal denoising and speech recognition on the speech signal; and a memory array connected to the bus matrix.
    Type: Application
    Filed: April 29, 2020
    Publication date: December 24, 2020
    Inventors: Xiaoping YAN, Chao TIAN
  • Publication number: 20200402515
    Abstract: Features are disclosed for performing functions in response to user requests based on contextual data regarding prior user requests. Users may engage in conversations with a computing device in order to initiate some function or obtain some information. A dialog manager may manage the conversations and store contextual data regarding one or more of the conversations. Processing and responding to subsequent conversations may benefit from the previously stored contextual data by, e.g., reducing the amount of information that a user must provide if the user has already provided the information in the context of a prior conversation. Additional information associated with performing functions responsive to user requests may be shared among applications, further improving efficiency and enhancing the user experience.
    Type: Application
    Filed: July 2, 2020
    Publication date: December 24, 2020
    Inventors: Nishant Kumar, David Robert Thomas, Sumedha Arvind Kshirsagar, Vikas Jain, Jeff Bradley Beal, Ajay Gopalakrishnan, Shishir Sridhar Bharathi
  • Publication number: 20200402516
    Abstract: Aspects of the present invention disclose a method for preventing adversarial audio attacks through detecting and isolating inconsistencies utilizing beamforming techniques and IoT devices. The method includes one or more processors identifying an audio command received by a listening device. The method further includes determining a source location of the audio command utilizing a sensor array of the listening device. The method further includes determining a location of a user in relation to the listening device based on data of an Internet of Things (IoT) device. The method further includes determining an inconsistency between the determined source location and the determined location of the user based at least in part on data of the sensor array and data of the IoT device.
    Type: Application
    Filed: June 18, 2019
    Publication date: December 24, 2020
    Inventors: Craig M. Trim, Michael Bender, Zachary A. Silverstein, Martin G. Keen
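
The consistency check in the abstract above can be pictured with a very small sketch. It assumes both locations are reduced to bearings in degrees relative to the listening device, and that a fixed tolerance decides what counts as inconsistent; the abstract specifies neither, and the function names and the 20-degree default are illustrative only.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two bearings, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def is_inconsistent(source_bearing_deg, user_bearing_deg, tolerance_deg=20.0):
    """Flag a command whose estimated source direction is far from the user's direction."""
    return angular_difference(source_bearing_deg, user_bearing_deg) > tolerance_deg


# Hypothetical readings: the sensor array hears the command from 90 degrees,
# while an IoT device places the user at 270 degrees.
print(is_inconsistent(90.0, 270.0))
```

Taking the difference modulo 360 keeps bearings on either side of north, such as 350 and 10 degrees, from being treated as far apart.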
  • Publication number: 20200402517
    Abstract: Example implementations are directed to maximizing the accuracy of command recognition in a noisy environment, such as a factory shop floor, by providing appropriate parameters and configurations to a speech recognition algorithm and a denoising algorithm based on an operator condition, such as the identified user and the location. Through the example implementations described herein, machine processes can be controlled through properly configured speech recognition and denoising algorithms despite having a surrounding noisy environment.
    Type: Application
    Filed: June 21, 2019
    Publication date: December 24, 2020
    Inventors: Yusuke SHOMURA, Yasutaka SERIZAWA, Sudhanshu GAUR
  • Publication number: 20200402518
    Abstract: Methods and apparatus for decoding a compressed Higher Order Ambisonics (HOA) representation of a sound or soundfield. The method may include receiving a bitstream containing the compressed HOA representation and decoding, based on a determination that there are multiple layers, the compressed HOA representation from the bitstream to obtain a sequence of decoded HOA representations. A first subset of the sequence of decoded HOA representations is determined based only on corresponding ambient HOA components. A second subset of the sequence of decoded HOA representations is determined based on corresponding ambient HOA components and corresponding predominant sound components.
    Type: Application
    Filed: June 3, 2020
    Publication date: December 24, 2020
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Sven KORDON, Alexander KRUEGER, Oliver WUEBBOLT
  • Publication number: 20200402519
    Abstract: In general, techniques are described by which to code scaled spatial components. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store a bitstream including an encoded foreground audio signal and a corresponding quantized spatial component. The one or more processors may perform psychoacoustic audio decoding with respect to the encoded foreground audio signal to obtain a foreground audio signal, and determine, when performing psychoacoustic audio decoding, a bit allocation for the encoded foreground audio signal. The one or more processors may dequantize the quantized spatial component to obtain a scaled spatial component, and descale, based on the bit allocation, the scaled spatial component to obtain a spatial component. The one or more processors may reconstruct, based on the foreground audio signal and the spatial component, scene-based audio data.
    Type: Application
    Filed: June 22, 2020
    Publication date: December 24, 2020
    Inventors: Ferdinando Olivieri, Taher Shahbazi Mirzahasanloo, Nils Günther Peters
  • Publication number: 20200402520
    Abstract: An audio post-processor for post-processing an audio signal having a time-variable high frequency gain information as side information includes: a band extractor for extracting a high frequency band of the audio signal and a low frequency band of the audio signal; a high band processor for performing a time-variable modification of the high frequency band in accordance with the time-variable high frequency gain information to obtain a processed high frequency band; and a combiner for combining the processed high frequency band and the low frequency band. Furthermore, a pre-processor is illustrated.
    Type: Application
    Filed: June 4, 2020
    Publication date: December 24, 2020
    Inventors: Florin Ghido, Sascha Disch, Jürgen Herre, Alexander Adami, Franz Reutelhuber
  • Publication number: 20200402521
    Abstract: In general, various aspects of the techniques described in this disclosure are directed to performing psychoacoustic audio coding based on operating conditions. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may be configured to store the encoded scene-based audio data. The one or more processors may be configured to obtain an operating condition of the device for decoding the encoded scene-based audio data and perform, based on the operating condition, psychoacoustic audio decoding with respect to the encoded scene-based audio data to obtain ambisonic transport format audio data. The one or more processors may also be configured to perform spatial audio decoding with respect to the ambisonic transport format audio data to obtain scene-based audio data.
    Type: Application
    Filed: June 22, 2020
    Publication date: December 24, 2020
    Inventors: Ferdinando Olivieri, Taher Shahbazi Mirzahasanloo, Nils Günther Peters
  • Publication number: 20200402522
    Abstract: In general, techniques are described for quantizing spatial components based on bit allocations determined for psychoacoustic audio coding. A device comprising a memory and one or more processors may perform the techniques. The memory may store a bitstream including an encoded foreground audio signal and a corresponding quantized spatial component. The one or more processors may perform psychoacoustic audio decoding with respect to the encoded foreground audio signal to obtain a foreground audio signal, and determine, when performing the psychoacoustic audio decoding, a first bit allocation for the encoded foreground audio signal. The one or more processors may also determine, based on the first bit allocation, a second bit allocation, and dequantize, based on the second bit allocation, the quantized spatial component to obtain a spatial component. The one or more processors may reconstruct, based on the foreground audio signal and the spatial component, scene-based audio data.
    Type: Application
    Filed: June 22, 2020
    Publication date: December 24, 2020
    Inventors: Ferdinando Olivieri, Taher Shahbazi Mirzahasanloo, Nils Günther Peters
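    As a rough illustration of deriving a second bit allocation from a first one and then dequantizing with it, consider the sketch below. The derivation rule (a fixed total budget minus the audio bits), the value range, and all names are hypothetical; the abstract does not state how the second allocation is computed.

```python
def derive_spatial_bits(audio_bits, total_bits=16, min_bits=2):
    """Hypothetical rule: the spatial component gets whatever the audio left over."""
    return max(min_bits, total_bits - audio_bits)


def dequantize(index, bits, lo=-1.0, hi=1.0):
    """Uniform dequantizer mapping an index in [0, 2**bits - 1] onto [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + (hi - lo) * index / levels


spatial_bits = derive_spatial_bits(audio_bits=12)   # second allocation from the first
value = dequantize(index=15, bits=spatial_bits)     # highest index maps to hi
```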
  • Publication number: 20200402523
    Abstract: In general, techniques are described for psychoacoustic audio coding of ambisonic audio data. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store the bitstream that includes an encoded audio object and a corresponding spatial component that defines spatial characteristics of the encoded foreground audio signal. The encoded foreground audio signal may include a coded gain and a coded shape. The one or more processors may perform a gain and shape synthesis with respect to the coded gain and the coded shape to obtain a foreground audio signal, and reconstruct, based on the foreground audio signal and the spatial component, the ambisonic audio data.
    Type: Application
    Filed: June 22, 2020
    Publication date: December 24, 2020
    Inventors: Ferdinando Olivieri, Taher Shahbazi Mirzahasanloo, Nils Günther Peters
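    Gain and shape synthesis is commonly understood as multiplying a unit-norm shape vector by a scalar gain. The sketch below shows that general idea only; the function names are illustrative, and the actual coding of the gain and shape in the application is not reproduced here.

```python
import math


def gain_shape_analysis(x):
    """Split a vector into its Euclidean norm (gain) and unit-norm shape."""
    gain = math.sqrt(sum(v * v for v in x))
    shape = [v / gain for v in x] if gain > 0 else [0.0] * len(x)
    return gain, shape


def gain_shape_synthesis(gain, shape):
    """Rebuild the signal vector by scaling the shape with the gain."""
    return [gain * s for s in shape]


gain, shape = gain_shape_analysis([3.0, 4.0])
restored = gain_shape_synthesis(gain, shape)   # approximately [3.0, 4.0]
```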
  • Publication number: 20200402524
    Abstract: Efficient assignment of bit numbers is performed even under a low bit rate condition. A quantizer 12 obtains a quantized spectral sequence from a frequency spectral sequence. An integer transformer 13 obtains a unified quantized spectral sequence by obtaining, by a bijective transformation, a transformed integer for each of the sets, each being made up of integer values, obtained from the quantized spectral sequence. An integer encoder 15 obtains an integer code by encoding the unified quantized spectral sequence using a bit assignment sequence. An object-to-be-encoded estimator 18 obtains an estimated unified spectral sequence from the frequency spectral sequence by a transformation which is performed by the integer transformer 13 or a transformation that approximates the magnitude relationship between values before and after the above transformation. A bit assigner 14 obtains a bit assignment sequence and a bit assignment code from the estimated unified spectral sequence.
    Type: Application
    Filed: February 19, 2019
    Publication date: December 24, 2020
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryosuke SUGIURA, Yutaka KAMAMOTO, Takehiro MORIYA
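    To make the idea of a bijective transformation from a set of integers to a single integer concrete, the sketch below uses the Cantor pairing function on two non-negative integers. This is a well-known bijection chosen for illustration; it is not claimed to be the transformation used by the integer transformer in the application.

```python
import math


def pair(a, b):
    """Cantor pairing: maps two non-negative integers to one, bijectively."""
    s = a + b
    return s * (s + 1) // 2 + b


def unpair(z):
    """Inverse of pair(): recovers (a, b) from the unified integer z."""
    w = (math.isqrt(8 * z + 1) - 1) // 2   # index of the diagonal containing z
    b = z - w * (w + 1) // 2
    return w - b, b


unified = [pair(a, b) for a, b in [(0, 0), (1, 0), (0, 1), (3, 2)]]
restored = [unpair(z) for z in unified]
```

    Because the mapping is invertible, a decoder that receives only the unified integer can recover both original values exactly.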
  • Publication number: 20200402525
    Abstract: A method obtains a first sound signal representative of a first sound, including a first spectrum envelope contour and a first reference spectrum envelope contour; obtains a second sound signal, representative of a second sound differing in sound characteristics from the first sound, including a second spectrum envelope contour and a second reference spectrum envelope contour; generates a synthesis spectrum envelope contour by transforming the first spectrum envelope contour based on a first difference between the first spectrum envelope contour and the first reference spectrum envelope contour at a first time point of the first sound signal, and a second difference between the second spectrum envelope contour and the second reference spectrum envelope contour at a second time point of the second sound signal; and generates a third sound signal representative of the first sound that has been transformed using the generated synthesis spectrum envelope contour.
    Type: Application
    Filed: September 8, 2020
    Publication date: December 24, 2020
    Inventors: Ryunosuke DAIDO, Hiraku KAYAMA
  • Publication number: 20200402526
    Abstract: A speech extraction method based on the supervised learning auditory attention includes: converting an original overlapping speech signal into a two-dimensional time-frequency signal representation by a short-time Fourier transform to obtain a first overlapping speech signal; performing a first sparsification on the first overlapping speech signal, mapping intensity information of a time-frequency unit of the first overlapping speech signal to preset D intensity levels, and performing a second sparsification on the first overlapping speech signal based on information of the preset D intensity levels to obtain a second overlapping speech signal; converting the second overlapping speech signal into a pulse signal by a time coding method; extracting a target pulse from the pulse signal by a trained target pulse extraction network; converting the target pulse into a time-frequency representation of the target speech to obtain the target speech by an inverse short-time Fourier transform.
    Type: Application
    Filed: April 19, 2019
    Publication date: December 24, 2020
    Applicant: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jiaming XU, Yating HUANG, Bo XU
  • Publication number: 20200402527
    Abstract: What sound has occurred can be confirmed even under a restriction in which the transmittable traffic is small. An abnormal sound detection system including an artificial sound creating function is configured. The abnormal sound detection system includes a statistic calculation unit configured to calculate a statistic set expressing sizes of a direct current component, an alternating current component, and a noise component in an amplitude time series at each of the frequencies of a sound inputted at a terminal, a statistic transmitting unit configured to transmit the statistic set from the terminal to a server, a statistic receiving unit configured to receive the statistic set in the server, and an artificial sound reproducing unit configured to reproduce a cyclostationary artificial sound based on the statistic set received in the server.
    Type: Application
    Filed: June 4, 2020
    Publication date: December 24, 2020
    Inventor: Yohei KAWAGUCHI
  • Publication number: 20200402528
    Abstract: [Overview] [Problem to be Solved] To provide an information processing apparatus that makes it possible to stably secure time for receiving audio data when reproducing audio data while receiving audio data. [Solution] An information processing apparatus including: an audio buffer unit; a reproduction time calculation unit; a position decision unit; and an insertion unit. The audio buffer unit retains first audio data that have not been reproduced in the first audio data received from another apparatus via a transmission path. The reproduction time calculation unit calculates a reproduction time of second audio data on the basis of at least any of a state of the first audio data retained in the audio buffer unit or a state of the transmission path. The second audio data are to be inserted and reproduced while the first audio data are being reproduced. The position decision unit decides an insertion position of the second audio data in the first audio data.
    Type: Application
    Filed: August 31, 2018
    Publication date: December 24, 2020
    Inventors: DAISUKE FUKUNAGA, YOSHIKI TANAKA, HISAHIRO SUGANUMA
  • Publication number: 20200402529
    Abstract: In general, techniques are described by which to correlate scene-based audio data for psychoacoustic audio coding. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store a bitstream including a plurality of encoded correlated components of a soundfield represented by scene-based audio data. The one or more processors may perform psychoacoustic audio decoding with respect to one or more of the plurality of encoded correlated components to obtain a plurality of correlated components, and obtain, from the bitstream, an indication representative of how the one or more of the plurality of correlated components were reordered in the bitstream. The one or more processors may reorder, based on the indication, the plurality of correlated components to obtain a plurality of reordered components, and reconstruct, based on the plurality of reordered components, the scene-based audio data.
    Type: Application
    Filed: June 22, 2020
    Publication date: December 24, 2020
    Inventors: Ferdinando Olivieri, Taher Shahbazi Mirzahasanloo, Nils Günther Peters
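    The decoder-side reordering step can be pictured as applying the inverse of a permutation signaled in the bitstream. In the sketch below, the form of the indication (a list of original indices) and the component labels are assumptions made for illustration.

```python
def reorder(components, order):
    """Undo a reordering, where order[i] is the original index of components[i]."""
    restored = [None] * len(components)
    for position, original_index in enumerate(order):
        restored[original_index] = components[position]
    return restored


original = ["W", "X", "Y", "Z"]
order = [2, 0, 3, 1]                          # hypothetical signaled indication
transmitted = [original[i] for i in order]    # encoder-side reordering
recovered = reorder(transmitted, order)       # equals original
```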
  • Publication number: 20200402530
    Abstract: An apparatus for generating a score signal representing the quality of an audio or video signal supplied to the apparatus is proposed. The apparatus comprises: an input for supplying an audio or video signal, and a computing unit implementing a neural network, the computing unit being supplied with the audio or video signal and producing a score signal that represents at least one predefined quality parameter of the supplied audio or video signal, the neural network being set up by being trained with training data of a specific transmission standard and/or codec used for generating the audio or video data.
    Type: Application
    Filed: June 21, 2019
    Publication date: December 24, 2020
    Inventor: Baris GÜZELARSLAN
  • Publication number: 20200402531
    Abstract: [Object] To provide a technology for further improving the recording density of data. [Solving Means] A magnetic recording medium according to the present technology includes: a base material; and a magnetic layer, in which the magnetic recording medium has a tape shape that is long in a longitudinal direction and short in a width direction, the magnetic layer includes a data band and a servo band, a data signal being written to the data band, the data band being long in the longitudinal direction, a servo signal being written to the servo band, the servo band being long in the longitudinal direction, the degree of perpendicular orientation of the magnetic layer being 65% or more, a full width at half maximum of an isolated waveform in a reproduced waveform of the servo signal is 195 nm or less, the magnetic layer has a thickness of 90 nm or less, and the base material has a thickness of 4.2 μm or less.
    Type: Application
    Filed: November 19, 2018
    Publication date: December 24, 2020
    Inventors: Minoru YAMAGA, Takanobu IWAMA, Jun TAKAHASHI
  • Publication number: 20200402532
    Abstract: A current-assisted magnetic recording write head has an electrically conductive layer in the write gap between the write pole and the trailing shield. Electrical circuitry directs current between the write pole and the trailing shield, through the conductive layer in the write gap. The current through the conductive layer generates an Ampere field substantially orthogonal to the magnetization in the write pole to assist magnetization switching of the write pole. The conductive layer is wider in the cross-track direction than the trailing edge of the write pole and may extend beyond the write pole side gaps so as to be in contact with both the side shields and the trailing shield. The conductive layer may have substantially the same along-the-track thickness across its width or it may have a thicker central region at the write pole trailing edge and thinner side regions.
    Type: Application
    Filed: September 2, 2020
    Publication date: December 24, 2020
    Inventors: Muhammad ASIF BASHIR, Venkatesh CHEMBROLU, Alexander GONCHAROV, Petrus Antonius VAN DER HEIJDEN, Yingjian CHEN
  • Publication number: 20200402533
    Abstract: [Object] To provide technologies such as an orientation device capable of increasing strength of a magnetic field in a transport path. [Solving Means] An orientation device according to the present technology includes a transport path, a permanent magnet portion, and a yoke portion. The transport path allows a base on which a magnetic coating film containing magnetic powder has been formed to pass through the transport path along a transport direction. The permanent magnet portion includes a plurality of first permanent magnets, and a plurality of second permanent magnets that is opposed to the plurality of first permanent magnets across the transport path in a vertical direction that is vertical to the transport direction in a manner that opposite poles face each other, the permanent magnet portion vertically orienting particles of the magnetic powder by applying a magnetic field to the magnetic coating film on the base that passes through the transport path.
    Type: Application
    Filed: March 6, 2019
    Publication date: December 24, 2020
    Inventors: Eiji NAKASHIO, Hidetoshi SAKUMA, Shuhei MATSUYA, Hidetoshi NISHIYAMA, Jun SASAKI
  • Publication number: 20200402534
    Abstract: Systems and methods, e.g., optical apparatuses, for digital optical information storage systems that improve the speed, signal to noise, controllability, and data storage density for fluorescent and reflective multilayer optical data storage media. The systems and methods include an optical system for a reading beam of a data channel from a moving single or multi-layer or otherwise 3-dimensional optical information storage medium that comprises at least one optical element characterized by restricting the field of view (FOV) of the reading beam on an associated image plane to 0.3 to 2 Airy disk diameters in a first direction.
    Type: Application
    Filed: June 24, 2020
    Publication date: December 24, 2020
    Inventors: Kenneth D. Singer, Irina Shiyanovskaya, Asher Sussman, Thomas Milster, Young Sik Kim
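    For a sense of scale, the field-of-view range of 0.3 to 2 Airy disk diameters can be computed from the standard diffraction formula, diameter = 1.22 × wavelength / numerical aperture. The wavelength and numerical aperture below are hypothetical example values, not figures from the application, and the relevant aperture in practice would be the one on the image side.

```python
def airy_disk_diameter(wavelength_m, numerical_aperture):
    """Diameter of the Airy disk (first dark ring) for a circular aperture."""
    return 1.22 * wavelength_m / numerical_aperture


def fov_limits(wavelength_m, numerical_aperture):
    """Lower and upper field-of-view bounds of 0.3 and 2 Airy disk diameters."""
    d = airy_disk_diameter(wavelength_m, numerical_aperture)
    return 0.3 * d, 2.0 * d


# Assumed example values: 405 nm wavelength, NA = 0.85
lower, upper = fov_limits(405e-9, 0.85)
```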
  • Publication number: 20200402535
    Abstract: Control apparatus to control operation of a data buffer to which data items are written according to a write pointer which advances in position in response to an input data item rate and from which data items are read according to a read pointer which advances in position in response to an output data item rate, comprises: a detector configured to detect an occupancy difference between a current buffer occupancy and a target buffer occupancy, in which the current buffer occupancy represents a difference between the read and write pointers; an output data item interpolator configured to interpolate a data item at an interpolated data buffer location displaced by a read offset displacement from a data buffer location pointed to by the read pointer; and output control circuitry configured, in response to a current occupancy difference exceeding a threshold occupancy difference, to change the read pointer from an initial read pointer to a target read pointer by a change amount so as to reduce the occupancy differ
    Type: Application
    Filed: February 15, 2019
    Publication date: December 24, 2020
    Applicant: Sony Corporation
    Inventor: Stephen Mark KEATING
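    The buffer-control ideas in this abstract (occupancy as the distance between write and read pointers, interpolation at a fractional read offset, and a read-pointer correction once the occupancy error passes a threshold) can be sketched as follows. The circular-buffer layout, linear interpolation, and the single-step correction are simplifying assumptions, not details taken from the application.

```python
def occupancy(write_ptr, read_ptr, size):
    """Number of unread items in a circular buffer of the given size."""
    return (write_ptr - read_ptr) % size


def interpolate(buffer, read_ptr, offset):
    """Linearly interpolate at read_ptr + offset, with 0 <= offset < 1."""
    size = len(buffer)
    a = buffer[read_ptr % size]
    b = buffer[(read_ptr + 1) % size]
    return a + offset * (b - a)


def adjust_read_pointer(read_ptr, write_ptr, size, target, threshold):
    """Move the read pointer to the target occupancy if the error is too large."""
    diff = occupancy(write_ptr, read_ptr, size) - target
    if abs(diff) > threshold:
        return (read_ptr + diff) % size   # assumed one-step correction
    return read_ptr


new_read = adjust_read_pointer(read_ptr=1, write_ptr=6, size=8, target=2, threshold=1)
sample = interpolate([0.0, 10.0, 20.0], read_ptr=1, offset=0.25)
```

    A practical design would likely spread the pointer change over several output items, using the interpolator to avoid audible or visible discontinuities; the one-step jump here only keeps the example short.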
  • Publication number: 20200402536
    Abstract: A method for multi-track file sharing and editing, comprising receiving from client devices audio records documenting audio tracks of a multi-track project, and, upon receiving at least one of the audio records: performing a temporal synchronization of the respective audio record with previously received audio tracks, updating an interactive graphical interface that includes concentric track control elements to share the common center with the respective concentric control element and to reflect the temporal synchronization, each of the plurality of concentric control elements having bar elements arranged in a sequence and associated with the audio segments of one of the audio tracks, identifying at least one user selection indicative of a group of the audio segments of one of the bar elements, and editing the group of the audio segments according to user editing instructions.
    Type: Application
    Filed: November 12, 2018
    Publication date: December 24, 2020
    Applicant: Musico Ltd.
    Inventors: Jhonatan Oved Eliyahu PISTINER, Barak Daniel INBAR, Eyal COHEN