COMPUTER-BASED SYSTEMS, DEVICES, AND METHODS FOR GENERATING MUSICAL COMPOSITIONS THAT ARE SYNCHRONIZED TO VIDEO
Computer-based systems, devices, and methods for generating musical compositions that are purposefully synchronized with video are described. A video timeline is defined with various time-markers that demarcate specific events in the video. A music timeline is generated based on the video timeline. The music timeline preserves the various time-markers from the video timeline. A computer-based musical composition system generates a musical composition based on the music timeline. The musical composition includes various musical events that align, synchronize, or coincide with the time-markers such that when the video and musical composition are played together the musical events align, synchronize, or coincide with the demarcated events in the video.
The present systems, devices, and methods generally relate to computer-generated music, and particularly relate to automatically generating musical compositions with musical events that are synchronized to events in video.
BACKGROUND
Description of the Related Art
Composing Musical Compositions
A musical composition may be characterized by sequences of sequential, simultaneous, and/or overlapping notes that are partitioned into one or more tracks. Starting with an original musical composition, a new musical composition or “variation” can be composed by manipulating the “elements” (e.g., notes, bars, tracks, arrangement, etc.) of the original composition. As examples, different notes may be played at the original times, the original notes may be played at different times, and/or different notes may be played at different times. Further refinements can be made based on many other factors, such as changes in musical key and scale, different choices of chords, different choices of instruments, different orchestration, changes in tempo, the imposition of various audio effects, changes to the sound levels in the mix, and so on.
In order to compose a new musical composition (or variation) based on an original or previous musical composition, it is typically helpful to have a clear characterization of the elements of the original musical composition. In addition to notes, bars, tracks, and arrangements, “segments” are also important elements of a musical composition. In this context, the term “segment” (or “musical segment”) is used to refer to a particular sequence of bars (i.e., a subset of serially-adjacent bars) that represents or corresponds to a particular section or portion of a musical composition. A musical segment may include, for example, an intro, a verse, a pre-chorus, a chorus, a bridge, a middle8, a solo, or an outro. The section or portion of a musical composition that corresponds to a “segment” may be defined, for example, by strict rules of musical theory and/or based on the sound or theme of the musical composition.
Digital Audio File Formats
While it is common for human musicians to communicate musical compositions in the form of sheet music, it is notably uncommon for computers to do so. Computers typically store and communicate music in well-established digital audio file formats, such as .mid, .wav, or .mp3 (just to name a few), that are designed to facilitate communication between electronic instruments and other devices by allowing for the efficient movement of musical waveforms over computer networks. In a digital audio file format, audio data is typically encoded in one of various audio coding formats (which may be compressed or uncompressed) and either provided as a raw bitstream or, more commonly, embedded in a container or wrapper format.
BRIEF SUMMARY
A computer-implemented method of synchronizing a musical composition with a video may be summarized as including: preparing a video that begins at a video start time and ends at a video stop time; establishing a first time-marker corresponding to a first event in the video, wherein the first time-marker demarcates a time in between the video start time and the video stop time at which the first event occurs in the video; and generating a musical composition, wherein generating the musical composition includes synchronizing a first musical event of the musical composition with the first time-marker. Generating the musical composition may further include synchronizing a start time of the musical composition with the video start time. Generating the musical composition may further include synchronizing a stop time of the musical composition with the video stop time.
Generating the musical composition may further include: generating a timeline for the musical composition, the timeline comprising: a start time for the musical composition; a stop time for the musical composition; and the first time-marker in between the start time for the musical composition and the stop time for the musical composition; and generating the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, and ends at the stop time for the musical composition. The method may further include: generating a video timeline for the video, wherein the video timeline includes the video start time, the video stop time, and the first time-marker in between the video start time and the video stop time, and wherein generating the musical composition based on the timeline includes aligning the timeline for the musical composition with the video timeline to synchronize the first musical event with the first event in the video at the first time-marker.
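By way of illustrative example only (the names and values below are hypothetical and are not part of the described systems, devices, and methods), a music timeline carrying a start time, a stop time, and one or more intermediate time-markers might be sketched as follows:

```python
from dataclasses import dataclass, field

@dataclass
class Timeline:
    """A music timeline derived from a video timeline (times in seconds)."""
    start_time: float  # coincides with the video start time
    stop_time: float   # coincides with the video stop time
    time_markers: list = field(default_factory=list)  # times of demarcated events

    def add_marker(self, t: float) -> None:
        # A time-marker demarcates a time strictly in between the
        # start time and the stop time.
        if not (self.start_time < t < self.stop_time):
            raise ValueError("time-marker must lie between start and stop times")
        self.time_markers.append(t)

# Example: a 30-second video with one demarcated event at 12.5 s.
timeline = Timeline(start_time=0.0, stop_time=30.0)
timeline.add_marker(12.5)
```

A musical composition generated from such a timeline would then begin at `start_time`, place a musical event at each entry of `time_markers`, and end at `stop_time`.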
The method may further include: establishing a second time-marker corresponding to a second event in the video, wherein the second time-marker demarcates a time in between the video start time and the video stop time at which the second event occurs in the video, and wherein generating the musical composition further includes synchronizing a second musical event of the musical composition with the second time-marker. The method may further include: establishing at least one additional time-marker, each respective additional time-marker corresponding to a respective additional event in the video, wherein each respective additional time-marker demarcates a respective time in between the video start time and the video stop time at which a respective additional event occurs in the video, and wherein generating the musical composition further includes synchronizing a respective additional musical event of the musical composition with each respective additional time-marker. Generating the musical composition may further include: generating a timeline for the musical composition, the timeline comprising: a start time for the musical composition that coincides with the video start time; a stop time for the musical composition that coincides with the video stop time; the first time-marker in between the start time for the musical composition and the stop time for the musical composition; the second time-marker in between the start time for the musical composition and the stop time for the musical composition; and each additional time-marker in between the start time for the musical composition and the stop time for the musical composition; and generating the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, includes the second musical event at the second time-marker, includes each respective additional musical event at each respective additional time-marker, and ends at the stop time for the musical composition.
Establishing a first time-marker corresponding to a first event in the video may include establishing the first time-marker corresponding to a first event selected from a group consisting of: a change of a view in the video, a change of a scene in the video, a change of a background in the video, a change of a foreground in the video, an action performed by a character in the video, an action performed by an object in the video, a period preceding a particular action in the video, a period preceding a particular event in the video, a period succeeding a particular action in the video, a period succeeding a particular event in the video, a change in a character depicted in the video, an introduction of a new character depicted in the video, a change in an object depicted in the video, and an introduction of a new object depicted in the video. Synchronizing a first musical event of the musical composition with the first time-marker may include synchronizing, with the first time-marker, a musical event selected from a group consisting of: a particular note, a particular note sequence, a key modulation, a chord progression, a particular instrument, a change in instrumentation, a change in beats per minute, a change in timing, an acceleration, a deceleration, and a change in volume.
Generating a musical composition may include automatically generating the musical composition by a computer-based musical composition system, and automatically generating the musical composition by the computer-based musical composition system may include: defining a start time of the musical composition to at least approximately coincide with the video start time, defining a stop time of the musical composition to at least approximately coincide with the video stop time, defining the first musical event of the musical composition to coincide with the first time-marker, and generating music that includes the first musical event at the first time-marker and continuously connects between the start time of the musical composition and the stop time of the musical composition.
Generating a musical composition may include solving, by a computer-based musical composition system, a constraint satisfaction problem. In this case, synchronizing a first musical event of the musical composition with the first time-marker may include providing, in a formulation of the constraint satisfaction problem, a constraint that specifies that the first musical event of the musical composition is synchronized with the first time-marker.
A system for synchronizing a musical composition with a video may be summarized as including: at least one processor; and a non-transitory processor-readable storage medium communicatively coupled to the at least one processor, the non-transitory processor-readable storage medium storing processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to: prepare a video that begins at a video start time and ends at a video stop time; establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video; and generate a musical composition that synchronizes a first musical event with the first time-marker. The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, may cause the at least one processor to: generate a timeline for the musical composition, the timeline comprising: a start time for the musical composition; a stop time for the musical composition; and the first time-marker in between the start time for the musical composition and the stop time for the musical composition; and generate the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, and ends at the stop time for the musical composition. 
The non-transitory processor-readable storage medium may further store processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a video timeline for the video, wherein the video timeline includes the video start time, the video stop time, and the first time-marker in between the video start time and the video stop time, and wherein the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate the musical composition based on the timeline, may cause the at least one processor to align the timeline for the musical composition with the video timeline to synchronize the first musical event with the first event in the video at the first time-marker.
The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video, may further cause the at least one processor to establish a second time-marker that demarcates a time in between the video start time and the video stop time at which a second event occurs in the video. The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, may further cause the at least one processor to generate the musical composition with a second musical event synchronized with the second time-marker.
The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, may cause the at least one processor to: define a constraint satisfaction problem, the constraint satisfaction problem including: a first constraint that specifies that the first musical event of the musical composition is synchronized with the first time-marker; and a second constraint that specifies that the second musical event of the musical composition is synchronized with the second time-marker; and solve the constraint satisfaction problem.
A computer program product may be summarized as including: processor-executable instructions and/or data that, when the computer program product is stored in a non-transitory processor-readable storage medium and executed by at least one processor communicatively coupled to the non-transitory processor-readable storage medium, cause the at least one processor to: prepare a video that begins at a video start time and ends at a video stop time; establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video; and generate a musical composition that synchronizes a first musical event with the first time-marker. The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, may cause the at least one processor to: generate a timeline for the musical composition, the timeline comprising: a start time for the musical composition; a stop time for the musical composition; and the first time-marker in between the start time for the musical composition and the stop time for the musical composition; and generate the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, and ends at the stop time for the musical composition. The non-transitory processor-readable storage medium may further store processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a video timeline for the video, wherein the video timeline includes the video start time, the video stop time, and the first time-marker in between the video start time and the video stop time. 
The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate the musical composition based on the timeline, may cause the at least one processor to align the timeline for the musical composition with the video timeline to synchronize the first musical event with the first event in the video at the first time-marker.
The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video, may further cause the at least one processor to establish a second time-marker that demarcates a time in between the video start time and the video stop time at which a second event occurs in the video. The processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, may further cause the at least one processor to generate the musical composition with a second musical event synchronized with the second time-marker.
The various elements and acts depicted in the drawings are provided for illustrative purposes to support the detailed description. Unless the specific context requires otherwise, the sizes, shapes, and relative positions of the illustrated elements and acts are not necessarily shown to scale and are not necessarily intended to convey any information or limitation. In general, identical reference numbers are used to identify similar elements or acts.
The following description sets forth specific details in order to illustrate and provide an understanding of the various implementations and embodiments of the present systems, devices, and methods. A person of skill in the art will appreciate that some of the specific details described herein may be omitted or modified in alternative implementations and embodiments, and that the various implementations and embodiments described herein may be combined with each other and/or with other methods, components, materials, etc. in order to produce further implementations and embodiments.
In some instances, well-known structures and/or processes associated with computer systems and data processing have not been shown or provided in detail in order to avoid unnecessarily complicating or obscuring the descriptions of the implementations and embodiments.
Unless the specific context requires otherwise, throughout this specification and the appended claims the term “comprise” and variations thereof, such as “comprises” and “comprising,” are used in an open, inclusive sense to mean “including, but not limited to.”
Unless the specific context requires otherwise, throughout this specification and the appended claims the singular forms “a,” “an,” and “the” include plural referents. For example, reference to “an embodiment” and “the embodiment” include “embodiments” and “the embodiments,” respectively, and reference to “an implementation” and “the implementation” include “implementations” and “the implementations,” respectively. Similarly, the term “or” is generally employed in its broadest sense to mean “and/or” unless the specific context clearly dictates otherwise.
The headings and Abstract of the Disclosure are provided for convenience only and are not intended, and should not be construed, to interpret the scope or meaning of the present systems, devices, and methods.
The various embodiments described herein provide systems, devices, and methods for computer-based generation of musical compositions in which one or more musical event(s) is/are deliberately synchronized with one or more corresponding event(s) in a video. As a result, when the video and musical composition are played together in synchronization, the music and the video accentuate one another (particularly at the synchronized events) to enhance the experience beyond that which is achieved by either the music or the video on its own.
In some implementations, the present systems, devices, and methods provide new musical compositions that are automatically generated and specifically tailored to provide distinct musical events at predetermined points in time (i.e., at time points throughout the duration of the musical composition) that align or synchronize with events that occur in a video. In other words, starting with video material, a timeline may be generated that captures a length or duration of the video and also various time-markers throughout the length or duration of the video. The timeline may then be used as the basis for a new musical composition to be generated automatically by a computer-based musical composition system. In automatically generating the new musical composition based on the timeline, the computer-based musical composition system may purposefully align distinct musical events in the new musical composition with identified events in the video such that, when both the video and the new musical composition are played together, the video and the new musical composition accent one another.
In some applications, the new musical composition may be purposefully generated to accent the video. For example, if the video corresponds to an advertisement or commercial, or a scene from a film, movie, or television show, then the deliberately synchronized musical events of the musical composition may impart greater effect or significance to the corresponding events in the video.
Throughout this specification and the appended claims, a musical variation is considered a form of musical composition and the term “musical composition” (as in, for example, “computer-generated musical composition” and “computer-based musical composition system”) is used to include musical variations.
Systems, devices, and methods for encoding musical compositions in hierarchical data structures of the form Music[Segments{ }, barsPerSegment{ }] are described in U.S. Pat. No. 10,629,176, filed Jun. 21, 2019 and entitled “Systems, Devices, and Methods for Digital Representations of Music” (hereinafter “Hum Patent”), which is incorporated by reference herein in its entirety.
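Purely as an illustrative sketch (the actual encoding is defined in the Hum Patent; the field names and values below are hypothetical), a hierarchical structure of the Music[Segments{ }, barsPerSegment{ }] form might be approximated as nested data:

```python
# Hypothetical nested representation of a musical composition;
# segment names and bar counts are illustrative only.
music = {
    "segments": ["intro", "verse", "chorus", "verse", "chorus", "outro"],
    "bars_per_segment": {
        "intro": 4,
        "verse": 8,
        "chorus": 8,
        "outro": 4,
    },
}

# Total bar count, counting each occurrence of a segment in order.
total_bars = sum(music["bars_per_segment"][s] for s in music["segments"])
```

Such a representation makes the composition's segments and their serially-adjacent bars directly addressable, which is convenient when musical events must later be pinned to particular times.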
Systems, devices, and methods for automatically identifying the musical segments of a musical composition and which can facilitate encoding musical compositions (or even simply undifferentiated sequences of musical bars) into the Music[Segments{ }, barsPerSegment{ }] form described above are described in U.S. Pat. No. 11,024,274, filed Jan. 28, 2020 and entitled “Systems, Devices, and Methods for Segmenting a Musical Composition into Musical Segments” (hereinafter “Segmentation Patent”), which is incorporated herein by reference in its entirety.
Systems, devices, and methods for identifying harmonic structure in digital data structures and for mapping the Music[Segments{ }, barsPerSegment{ }] data structure into an isomorphic HarmonicStructure[Segments{ }, harmonicSequencePerSegment{ }] data structure are described in U.S. patent application Ser. No. 16/775,254, filed Jan. 28, 2020 and entitled “Systems, Devices, and Methods for Harmonic Structure in Digital Representations of Music” (hereinafter “Harmony Patent”), which is incorporated herein by reference in its entirety.
Returning to
At 101, a video is prepared. Depending on the specific implementation, the video may be prepared by a computer-based musical composition system or by a different computer system (e.g., a computer-based video production system). The video may include footage or recordings of real life, animation, computer graphics, or any combination of the foregoing. The video may include a single continuous scene or multiple scenes edited together. The video may or may not include audio. In any case, the video prepared at 101 generally begins at a video start time and ends at a video stop time.
At 102, a computer system (either the same computer system that prepares the video at 101 or a different computer system) establishes a first time-marker corresponding to a first event in the video prepared at 101. The first time-marker demarcates a time in between the video start time and the video stop time at which the first event occurs in the video. For example, the video may generally be characterized by a video timeline that begins at the video start time, ends at the video stop time, and connects linearly and continuously in between the video start time and the video stop time. The first time-marker may correspond to a point on the video timeline, in between the video start time and the video stop time, at which the first event occurs in the video. Depending on the specific implementation, the first event in the video may correspond to any occurrence in the video to which significance is to be imparted. Examples of suitable events that may correspond to the first event include, without limitation: a change of a view in the video, a change of a scene in the video, a change of a background in the video, a change of a foreground in the video, an action performed by a character in the video, an action performed by an object in the video, a period preceding a particular action in the video, a period preceding a particular event in the video, a period succeeding a particular action in the video, a period succeeding a particular event in the video, a change in a character depicted in the video, an introduction of a new character depicted in the video, a change in an object depicted in the video, and an introduction of a new object depicted in the video.
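As a minimal sketch of act 102 (the event kinds and times below are hypothetical annotations, not outputs of any particular detection method), establishing time-markers for demarcated events on a video timeline might look like:

```python
# Hypothetical events annotated or detected in a 60-second video.
video_start, video_stop = 0.0, 60.0
events = [
    {"time": 8.0,  "kind": "scene_change"},
    {"time": 21.5, "kind": "new_character_introduced"},
    {"time": 47.0, "kind": "action_by_character"},
]

# Each time-marker demarcates the time at which its event occurs,
# strictly in between the video start time and the video stop time.
time_markers = [e["time"] for e in events
                if video_start < e["time"] < video_stop]
```

The resulting list of time-markers, together with the video start and stop times, fully characterizes the video timeline used in the subsequent acts.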
At 103, a computer-based musical composition system generates a musical composition. As previously described, the same computer-based musical composition system may prepare the video at 101 and establish the first time-marker at 102, or the computer-based musical composition system may receive the video (or receive just video timeline information that characterizes the timeline of the video) from a different computer system prior to act 103.
In accordance with the present systems, devices, and methods, when the computer-based musical composition system generates a musical composition at 103, the computer-based musical composition system may perform or execute sub-act 131 of method 100. At 131, the computer-based musical composition system synchronizes (i.e., as an element of generating the musical composition) a first musical event of the musical composition with the first time-marker from 102. Depending on the specific implementation, the first musical event may correspond to any occurrence or element in the musical composition that may impart significance to the first time-marker. Examples of suitable musical events that may correspond to the first musical event include, without limitation: a particular note, a particular note sequence, a key modulation, a chord progression, a particular instrument, a change in instrumentation, a change in beats per minute, a change in timing (e.g., an abrupt change in timing), an acceleration, a deceleration, and a change in volume (i.e., either an abrupt change in volume or the beginning/end of a crescendo/decrescendo).
The computer-based musical composition system may generate, at 103, the musical composition substantially automatically by employing, among other things, the teachings of Hum Patent, Segmentation Patent, and Harmony Patent. Automatically generating the musical composition by the computer-based musical composition system may include, for example: defining a start time of the musical composition to at least approximately coincide with the video start time; defining a stop time of the musical composition to at least approximately coincide with the video stop time; defining the first musical event of the musical composition to coincide with the first time-marker; and generating music that includes the first musical event at the first time-marker and continuously connects between the start time of the musical composition and the stop time of the musical composition.
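One simple way to make a musical event land exactly on a time-marker, offered here only as an illustrative assumption (the description above does not prescribe this technique), is to choose a tempo near a desired value such that a whole number of bars elapses precisely at the marker:

```python
def tempo_for_marker(marker_s: float, beats_per_bar: int = 4,
                     target_bpm: float = 120.0) -> float:
    """Choose a tempo near target_bpm so that a whole number of bars
    elapses exactly at marker_s, letting a musical event (e.g., a key
    modulation) begin precisely on the time-marker. All names here are
    hypothetical helpers for illustration."""
    bar_seconds_target = 60.0 / target_bpm * beats_per_bar  # bar length at target tempo
    n_bars = max(1, round(marker_s / bar_seconds_target))   # whole bars before the marker
    bar_seconds = marker_s / n_bars                         # adjusted bar length
    return 60.0 * beats_per_bar / bar_seconds               # adjusted tempo in BPM

# A time-marker at 12.5 s, 4/4 bars, target tempo near 120 BPM:
bpm = tempo_for_marker(12.5)
```

With the adjusted tempo, a bar boundary coincides with the time-marker, so a musical event placed at the start of that bar is synchronized with the demarcated event in the video.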
In some implementations, at least a portion of generating the musical composition by the computer-based musical composition system at 103 may include defining and solving, by the computer-based musical composition system, a constraint satisfaction problem, or other optimization problem such as a planning problem and/or a scheduling problem. Examples of defining and solving a constraint satisfaction problem as part of the automatic generation of a musical composition are described in U.S. patent application Ser. No. 17/361,414, filed Jun. 29, 2021 and entitled “Computer-Based Systems, Devices, And Methods For Generating Aesthetic Chord Progressions And Key Modulations In Musical Compositions”, which is incorporated herein by reference in its entirety. In accordance with the present systems, devices, and methods, a constraint satisfaction problem may be defined to include constraints that impose various synchronizations between time-markers and musical events in a musical composition. That is, the constraint satisfaction problem may include: a first constraint that specifies that a first musical event is synchronized, aligned, or coincident with a first time-marker; a second constraint that specifies that a second musical event is synchronized, aligned, or coincident with a second time-marker; and any number of additional constraints that each respectively specify that a respective additional musical event is synchronized, aligned, or coincident with a respective additional time-marker. In some implementations, a constraint may specify that a stop time of the musical composition synchronizes with a video stop time (or more generally, that a stop time of the musical composition occurs at a specific point in time or that a duration or length of the musical composition is a specific value).
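As an illustrative sketch only (the decision variables, domains, and brute-force search below are hypothetical simplifications; the referenced application describes actual formulations), a constraint satisfaction problem whose constraints pin musical events to time-markers might look like:

```python
from itertools import product

# Hypothetical decision variables: the bar index at which each musical
# event occurs, with bars of a fixed length bar_s seconds.
bar_s = 2.0
markers = [8.0, 24.0]   # first and second time-markers (seconds)
n_bars = 16             # composition length: 32 s

def satisfies(assignment):
    # Constraint i: musical event i begins on the bar boundary that
    # coincides with time-marker i.
    return all(b * bar_s == m for b, m in zip(assignment, markers))

# Brute-force search over all assignments of events to bars.
solution = next(a for a in product(range(n_bars), repeat=len(markers))
                if satisfies(a))
```

Here the solver assigns the first musical event to bar 4 (8.0 s) and the second to bar 12 (24.0 s); a practical system would of course carry many more variables (notes, chords, instrumentation) and constraints, and would use a dedicated solver rather than exhaustive search.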
In some implementations, the length or duration of the musical composition may equal the length or duration of the video. That is, in some implementations, when the computer-based musical composition system generates the musical composition at 103 and synchronizes the first musical event with the first time-marker at 131, the computer-based musical composition system may synchronize a start time of the musical composition with the video start time and the computer-based musical composition system may synchronize a stop time of the musical composition with the video stop time. In some implementations, the length or duration of the musical composition may be shorter or longer than the length or duration of the video, in which case the computer-based musical composition system may either synchronize a start time of the musical composition with the video start time or synchronize a stop time of the musical composition with the video stop time. In some implementations, the start time of the musical composition may not be synchronized with the video start time and/or the stop time of the musical composition may not be synchronized with the video stop time. In some implementations, the video may include more than one time-marker.
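A small sketch of this anchoring choice (the function name and signature are hypothetical, for illustration only): when the music's duration differs from the video's, the system may synchronize either the start times or the stop times.

```python
def music_timeline_bounds(video_start: float, video_stop: float,
                          music_duration: float, anchor: str = "start"):
    """Return (music_start, music_stop) on the video's clock.
    anchor='start' synchronizes the start times; anchor='stop'
    synchronizes the stop times."""
    if anchor == "start":
        return video_start, video_start + music_duration
    elif anchor == "stop":
        return video_stop - music_duration, video_stop
    raise ValueError("anchor must be 'start' or 'stop'")

# A 25-second composition against a 30-second video, stop-synchronized:
start, stop = music_timeline_bounds(0.0, 30.0, 25.0, anchor="stop")
```

When the durations are equal, the two anchoring modes coincide and both endpoints are synchronized.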
Similar to method 100, in method 200 a video is prepared at 101 and a first time-marker is established at 102. However, method 200 further includes act 221 and, optionally, act 222. At 221, a computer system (e.g., the same computer system that performs act 102) establishes a second time-marker corresponding to a second event in the video prepared at 101. The second time-marker demarcates a time in between the video start time and the video stop time at which the second event occurs in the video. For example, if the video is generally characterized by a video timeline that begins at the video start time, ends at the video stop time, and connects linearly and continuously in between the video start time and the video stop time, then the first time-marker may correspond to a point on the video timeline, in between the video start time and the video stop time, at which the first event occurs in the video and the second time-marker may correspond to a point on the video timeline, in between the video start time and the video stop time, at which the second event occurs in the video. Similar to the first event in the video, the second event in the video may correspond to any occurrence in the video to which significance is to be imparted.
In some implementations, method 200 may further include optional act 222. At 222, a computer system (e.g., the same computer system that performs acts 102 and 221) establishes at least one additional time-marker (e.g., any number of additional time-markers), where each additional time-marker corresponds to a respective additional event in the video prepared at 101. Each respective additional time-marker may demarcate a respective time in between the video start time and the video stop time at which a respective additional event occurs in the video. For example, if the video is generally characterized by a video timeline that begins at the video start time, ends at the video stop time, and connects linearly and continuously in between the video start time and the video stop time, then the first time-marker may correspond to a point on the video timeline, in between the video start time and the video stop time, at which the first event occurs in the video, the second time-marker may correspond to a point on the video timeline, in between the video start time and the video stop time, at which the second event occurs in the video, and each respective additional time-marker may correspond to a respective point on the video timeline, in between the video start time and the video stop time, at which a respective additional event occurs in the video. Similar to the first event and the second event in the video, each respective additional event in the video may correspond to a respective occurrence in the video to which significance is to be imparted.
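The video timeline with its time-markers (acts 102, 221, and 222) can be pictured as a simple data structure. The following Python sketch is illustrative only and forms no part of the claimed subject matter; the class and field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class TimeMarker:
    time: float  # seconds from the video start time
    event: str   # the event in the video that the marker demarcates

@dataclass
class VideoTimeline:
    start: float                 # video start time
    stop: float                  # video stop time
    markers: list = field(default_factory=list)

    def add_marker(self, time: float, event: str) -> None:
        # Each time-marker must fall in between the start and stop times.
        if not (self.start < time < self.stop):
            raise ValueError("time-marker must lie between start and stop")
        self.markers.append(TimeMarker(time, event))

timeline = VideoTimeline(start=0.0, stop=90.0)
timeline.add_marker(12.5, "scene change")   # first time-marker (act 102)
timeline.add_marker(47.0, "new character")  # second time-marker (act 221)
```

Any number of additional markers (act 222) can be appended the same way, each validated against the video start and stop times.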
Similar to method 100, in method 200 a musical composition is generated at 103 and the act of generating the musical composition at 103 includes the sub-act, 131, of synchronizing a first musical event in the musical composition with the first time-marker from act 102. However, in method 200 the act of generating the musical composition at 103 further includes sub-act 231 and, optionally, sub-act 232. At 231, the computer-based musical composition system synchronizes (i.e., as an element of generating the musical composition at 103) a second musical event of the musical composition with the second time-marker from 221. Depending on the specific implementation, the second musical event may correspond to any occurrence or element in the musical composition that may impart significance to the second time-marker.
Generally, an implementation of method 200 may only include optional sub-act 232 when the implementation of method 200 includes optional act 222. That is, when an implementation of method 200 includes, at 222, establishing at least one additional time-marker with each additional time-marker corresponding to a respective additional event in the video, then in such implementation of method 200, at 232, the computer-based musical composition system synchronizes (i.e., as an element of generating the musical composition at 103) a respective additional musical event of the musical composition with each respective additional time-marker from 222. Depending on the specific implementation, each respective additional musical event may correspond to a respective occurrence or element in the musical composition that may impart significance to a respective additional time-marker.
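The pairing performed by sub-acts 131, 231, and 232 — one musical event per time-marker — can be sketched informally as follows. The function name and the particular event labels are hypothetical; the labels are drawn from the kinds of musical events the specification elsewhere mentions (e.g., a key modulation, a change in instrumentation):

```python
def schedule_musical_events(marker_times, event_for_marker):
    """Pair each time-marker with the musical event intended to impart
    significance to it (sub-acts 131, 231, and 232), ordered by time.
    Raises KeyError if a marker has no musical event assigned."""
    return [(t, event_for_marker[t]) for t in sorted(marker_times)]

# Hypothetical choices of musical events for two time-markers.
events = {12.5: "key modulation", 47.0: "change in instrumentation"}
plan = schedule_musical_events([47.0, 12.5], events)
```

Here `plan` is a time-ordered schedule that a composition system could consult while generating music, so that each chosen musical event lands at its corresponding time-marker.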
Similar to method 100, in method 300 a video is prepared at 101, a first time-marker is established at 102, and a musical composition is generated at 103. However, method 300 further includes sub-acts 331 and 332 that provide additional detail about an exemplary implementation of generating the musical composition at 103. Specifically, at 331 the computer-based musical composition system generates a timeline for the musical composition (i.e., a “music timeline”). The music timeline generally includes a start time for the musical composition, a stop time for the musical composition, and the first time-marker in between the start time for the musical composition and the stop time for the musical composition. Per sub-act 131 of method 100, the position of the first time-marker in the music timeline may be synchronized with a first event in the video. For example, if the video is generally characterized by a video timeline then the music timeline and the video timeline may be aligned or synchronized such that a position of the first time-marker is aligned, synchronized, or coincident in both the music timeline and the video timeline. In some implementations, the start time of the musical composition in the music timeline may be aligned, synchronized, or coincident with the video start time in the video timeline and/or the stop time of the musical composition in the music timeline may be aligned, synchronized, or coincident with the video stop time in the video timeline.
At 332, the computer-based musical composition system generates the musical composition based, at least in part, on the music timeline. That is, the computer-based musical composition system generates a musical composition that begins at the start time for the musical composition, includes the first musical event at the first time-marker, and ends at the stop time for the musical composition, all in accordance with the music timeline.
In implementations of the present systems, devices, and methods in which multiple time-markers are established in a video (each corresponding to a respective one of multiple events in the video) and multiple musical events are desired in the musical composition being generated (i.e., in implementations of method 200), then method 300 may be extended to include additional time-markers. For example, in addition to the start time for the musical composition, the stop time for the musical composition, and the first time-marker in between the start time for the musical composition and the stop time for the musical composition, the music timeline generated by the computer-based musical composition system at 331 may further include a second time-marker in between the start time for the musical composition and the stop time for the musical composition (corresponding to the second time-marker from act 221 of method 200) and any number of additional time-markers in between the start time for the musical composition and the stop time for the musical composition (each additional time-marker corresponding to a respective additional time-marker from act 222 of method 200). In this case, when the computer-based musical composition system generates the musical composition based, at least in part, on the music timeline at 332 of method 300, the musical composition may (in addition to beginning at the start time of the musical composition, stopping at the stop time of the musical composition, and including a first musical event at the first time-marker in between the start time of the musical composition and the stop time of the musical composition) include a second musical event at the second time-marker (per sub-act 231 of method 200) and a respective additional musical event at each respective additional time-marker in the music timeline (per sub-act 232 of method 200).
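One way to picture sub-act 331 with multiple markers — a music timeline that preserves every time-marker from the video timeline — is the following sketch. The function name, and the choice to express the music timeline relative to the composition's own start time, are assumptions for illustration only:

```python
def music_timeline_from_video(video_start, video_stop, marker_times):
    # Build a music timeline (sub-act 331) expressed relative to the
    # composition start: the composition begins at 0.0, ends when the
    # video ends, and preserves every time-marker at its corresponding
    # offset in between.
    return {
        "start": 0.0,
        "stop": video_stop - video_start,
        "markers": sorted(t - video_start for t in marker_times),
    }

# A video running from t = 5.0 s to t = 95.0 s with two time-markers.
mt = music_timeline_from_video(5.0, 95.0, [52.0, 17.5])
```

The resulting timeline spans 90 seconds with markers at 12.5 s and 47.0 s into the composition, preserving the timing information that originated in the video.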
Similar to method 300, in method 400 a video is prepared at 101, a first time-marker is established at 102, and a musical composition is generated at 103. Generating the musical composition at 103 includes generating a timeline for the musical composition (i.e., a music timeline) at 331 and generating the musical composition based, at least in part, on the music timeline at 332. However, method 400 further includes act 411 and sub-act 431.
At 411, a video timeline is generated (e.g., by the computer-based musical composition system, or by a different computer system such as a video production computer system). The video timeline includes the video start time, the video stop time, and the first time-marker in between the video start time and the video stop time, where the first time-marker is aligned, synchronized, or coincident with the first event in the video.
Sub-act 431 provides an example of how, at 332, the musical composition may be generated based, at least in part, on the music timeline. In this example, at 431 the computer-based musical composition system aligns the timeline for the musical composition (generated at 331) with the video timeline (generated at 411) to synchronize the first musical event with the first event in the video at the first time-marker.
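In practice, alignment at 431 may involve reconciling the continuous video timeline with music's discrete rhythmic grid. As one hedged illustration (the snap-to-nearest-beat strategy is an assumption, not a requirement of the method), a musical event could be placed on the beat boundary closest to each time-marker:

```python
def nearest_beat(time_s, bpm):
    # Snap a time-marker to the nearest beat boundary at the given tempo,
    # so that a musical event lands exactly on a beat while remaining
    # aligned, synchronized, or coincident with the marker.
    beat = 60.0 / bpm  # duration of one beat in seconds
    return round(time_s / beat) * beat

# At 120 bpm a beat lasts 0.5 s, so a time-marker at 12.6 s into the
# music timeline snaps to the beat at 12.5 s.
snapped = nearest_beat(12.6, 120)
```

A composition system could also work the other way around, adjusting the tempo so that a beat falls exactly on the time-marker; either approach keeps the musical event coincident with the demarcated event in the video.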
Exemplary implementation 500 also includes an illustrative musical composition 502 generated by a computer-based musical composition system in accordance with the present systems, devices, and methods. Musical composition 502 includes a music timeline 520 that is based on video timeline 510. That is, music timeline 520 includes a start time 521 that aligns, synchronizes, or coincides with video start time 511, a stop time 523 that aligns, synchronizes, or coincides with video stop time 513, and a first time-marker 522 that aligns, synchronizes, or coincides with first time-marker 512 in video timeline 510. In accordance with the present systems, devices, and methods, musical composition 502 purposefully includes a first musical event 540 (in this case, a particular dotted half-note that stands out among a string of quarter notes) that aligns, synchronizes, or coincides with first time-marker 522 in music timeline 520. In this way, timing information that originates in video 501 is captured and preserved in video timeline 510, through music timeline 520, and into musical composition 502.
Throughout this specification and the appended claims, the term “first” and related similar terms, such as “second,” “third,” and the like, are often used to identify or distinguish one element or object from other elements or objects (as in, for example, “first note” and “first bar”). Unless the specific context requires otherwise, such uses of the term “first,” and related similar terms such as “second,” “third,” and the like, should be construed only as distinguishing identifiers and not construed as indicating any particular order, sequence, chronology, or priority for the corresponding element(s) or object(s). For example, unless the specific context requires otherwise, the term “first note” simply refers to one particular note among other notes and does not necessarily require that such one particular note be positioned ahead of or before any other note in a sequence of notes; thus, a “first note” of a musical composition or bar is one particular note from the musical composition or bar and not necessarily the lead or chronologically-first note of the musical composition or bar.
The various implementations described herein often make reference to “computer-based,” “computer-implemented,” “at least one processor,” “a non-transitory processor-readable storage medium,” and similar computer-oriented terms. A person of skill in the art will appreciate that the present systems, devices, and methods may be implemented using or in association with a wide range of different hardware configurations, including localized hardware configurations (e.g., a desktop computer, laptop, smartphone, or similar) and/or distributed hardware configurations that employ hardware resources located remotely relative to one another and communicatively coupled through a network, such as a cellular network or the internet. For the purpose of illustration, exemplary computer systems suitable for implementing the present systems, devices, and methods are provided in
Computer-based musical composition system 600 includes at least one processor 601, a non-transitory processor-readable storage medium or “system memory” 602, and a system bus 610 that communicatively couples various system components including the system memory 602 to the processor(s) 601. Computer-based musical composition system 600 is at times referred to in the singular herein, but this is not intended to limit the implementations to a single system, since in certain implementations there will be more than one system or other networked computing device(s) involved. Non-limiting examples of commercially available processors include, but are not limited to: Core microprocessors from Intel Corporation, U.S.A., PowerPC microprocessors from IBM, ARM processors from a variety of manufacturers, Sparc microprocessors from Sun Microsystems, Inc., PA-RISC series microprocessors from Hewlett-Packard Company, and 68xxx series microprocessors from Motorola Corporation.
The processor(s) 601 of computer-based musical composition system 600 may be any logic processing unit, such as one or more central processing units (CPUs), microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), and/or the like. Unless described otherwise, the construction and operation of the various blocks shown in
The system bus 610 in the computer-based musical composition system 600 may employ any known bus structures or architectures, including a memory bus with memory controller, a peripheral bus, and/or a local bus. The system memory 602 includes read-only memory (“ROM”) 621 and random access memory (“RAM”) 622. A basic input/output system (“BIOS”) 623, which may or may not form part of the ROM 621, may contain basic routines that help transfer information between elements within computer-based musical composition system 600, such as during start-up. Some implementations may employ separate buses for data, instructions and power.
Computer-based musical composition system 600 (e.g., system memory 602 thereof) may include one or more solid state memories, for instance, a Flash memory or solid state drive (SSD), which provides nonvolatile storage of processor-executable instructions, data structures, program modules and other data for computer-based musical composition system 600. Although not illustrated in
Program modules in computer-based musical composition system 600 may be stored in system memory 602, such as an operating system 624, one or more application programs 625, program data 626, other programs or modules 627, and drivers 628.
The system memory 602 in computer-based musical composition system 600 may also include one or more communications program(s) 629, for example, a server and/or a Web client or browser for permitting computer-based musical composition system 600 to access and exchange data with other systems such as user computing systems, Web sites on the Internet, corporate intranets, or other networks as described below. The communications program(s) 629 in the depicted implementation may be markup language based, such as Hypertext Markup Language (HTML), Extensible Markup Language (XML) or Wireless Markup Language (WML), and may operate with markup languages that use syntactically delimited characters added to the data of a document to represent the structure of the document. A number of servers and/or Web clients or browsers are commercially available such as those from Google (Chrome), Mozilla (Firefox), Apple (Safari), and Microsoft (Internet Explorer).
While shown in
Computer-based musical composition system 600 may include one or more interface(s) to enable and provide interactions with a user, peripheral device(s), and/or one or more additional processor-based computer system(s). As an example, computer-based musical composition system 600 includes interface 630 to enable and provide interactions with a user of computer-based musical composition system 600. A user of computer-based musical composition system 600 may enter commands, instructions, data, and/or information via, for example, input devices such as computer mouse 631 and keyboard 632. Other input devices may include a microphone, joystick, touch screen, game pad, tablet, scanner, biometric scanning device, wearable input device, and the like. These and other input devices (i.e., “I/O devices”) are communicatively coupled to processor(s) 601 through interface 630, which may include one or more universal serial bus (“USB”) interface(s) that communicatively couples user input to the system bus 610, although other interfaces such as a parallel port, a game port or a wireless interface or a serial port may be used. A user of computer-based musical composition system 600 may also receive information output by computer-based musical composition system 600 through interface 630, such as visual information displayed by a display 633 and/or audio information output by one or more speaker(s) 634. Display 633 may, in some implementations, include a touch screen.
As another example of an interface, computer-based musical composition system 600 includes network interface 640 to enable computer-based musical composition system 600 to operate in a networked environment using one or more of the logical connections to communicate with one or more remote computers, servers and/or devices (collectively, the “Cloud” 641) via one or more communications channels. These logical connections may facilitate any known method of permitting computers to communicate, such as through one or more LANs and/or WANs, such as the Internet, and/or cellular communications networks. Such networking environments are well known in wired and wireless enterprise-wide computer networks, intranets, extranets, the Internet, and other types of communication networks including telecommunications networks, cellular networks, paging networks, and other mobile networks.
When used in a networking environment, network interface 640 may include one or more wired or wireless communications interfaces, such as network interface controllers, cellular radios, WI-FI radios, and/or Bluetooth radios for establishing communications with the Cloud 641, for instance, the Internet or a cellular network.
In a networked environment, program modules, application programs or data, or portions thereof, can be stored in a server computing system (not shown). Those skilled in the relevant art will recognize that the network connections shown in
For convenience, processor(s) 601, system memory 602, interface 630, and network interface 640 are illustrated as communicatively coupled to each other via the system bus 610, thereby providing connectivity between the above-described components. In alternative implementations, the above-described components may be communicatively coupled in a different manner than illustrated in
In accordance with the present systems, devices, and methods, computer-based musical composition system 600 may be used to implement or in association with any or all of methods 100, 200, 300, and/or 400 described herein and/or to encode, manipulate, vary, and/or generate any or all of the musical compositions described herein. Generally, computer-based musical composition system 600 may be deployed or leveraged to generate musical compositions with purposefully positioned (in terms of time) musical events that align, synchronize, or coincide with events in a corresponding video as described throughout this specification and the appended claims. Where the descriptions of methods 100, 200, 300, and 400 make reference to an act being performed by at least one processor or more generally by a computer-based musical composition system, such act may be performed by processor(s) 601 and/or system memory 602 of computer system 600.
Computer system 600 is an illustrative example of a system for performing all or portions of the various methods described herein, the system comprising at least one processor 601, at least one non-transitory processor-readable storage medium 602 communicatively coupled to the at least one processor 601 (e.g., by system bus 610), and the various other hardware and software components illustrated in
Throughout this specification and the appended claims, the term “computer program product” is used to refer to a package, combination, or collection of software comprising processor-executable instructions and/or data that may be accessed by (e.g., through a network such as cloud 641) or distributed to and installed on (e.g., stored in a local non-transitory processor-readable storage medium such as system memory 602) a computer system (e.g., computer system 600) in order to enable certain functionality (e.g., application(s), program(s), and/or module(s)) to be executed, performed, or carried out by the computer system.
Throughout this specification and the appended claims, reference is often made to musical compositions being “automatically” generated/composed by computer-based algorithms, software, and/or artificial intelligence (AI) techniques. A person of skill in the art will appreciate that a wide range of algorithms and techniques may be employed in computer-generated music, including without limitation: algorithms based on mathematical models (e.g., stochastic processes), algorithms that characterize music as a language with a distinct grammar set and construct compositions within the corresponding grammar rules, algorithms that employ translational models to map a collection of non-musical data into a musical composition, evolutionary methods of musical composition based on genetic algorithms, and/or machine learning-based (or AI-based) algorithms that analyze prior compositions to extract patterns and rules and then apply those patterns and rules in new compositions. These and other algorithms may be advantageously adapted to exploit the features and techniques enabled by the digital representations of music described herein.
The various implementations described herein improve the functioning of computer systems for the specific practical application of generating musical compositions that accent or otherwise impart a greater effect or significance in or to video content. Extracting a video timeline from a video and adding time-markers that demarcate particular events in the video enables a music timeline with matching time-markers to be constructed and used as the basis for a computer-generated musical composition. By building such a “video-inspired” music timeline and then composing music to accommodate the timeline (as opposed to trying to build a video timeline that matches an existing music timeline), the resulting musical composition generated may sound more natural (less forced/contrived) and the resulting video+music combination may synchronize to a better degree and effect.
Throughout this specification and the appended claims the term “communicative” as in “communicative coupling” and in variants such as “communicatively coupled,” is generally used to refer to any engineered arrangement for transferring and/or exchanging information. For example, a communicative coupling may be achieved through a variety of different media and/or forms of communicative pathways, including without limitation: electrically conductive pathways (e.g., electrically conductive wires, electrically conductive traces), magnetic pathways (e.g., magnetic media), wireless signal transfer (e.g., radio frequency antennae), and/or optical pathways (e.g., optical fiber). Exemplary communicative couplings include, but are not limited to: electrical couplings, magnetic couplings, radio frequency couplings, and/or optical couplings.
Throughout this specification and the appended claims, infinitive verb forms are often used. Examples include, without limitation: “to encode,” “to provide,” “to store,” and the like. Unless the specific context requires otherwise, such infinitive verb forms are used in an open, inclusive sense, that is as “to, at least, encode,” “to, at least, provide,” “to, at least, store,” and so on.
This specification, including the drawings and the abstract, is not intended to be an exhaustive or limiting description of all implementations and embodiments of the present systems, devices, and methods. A person of skill in the art will appreciate that the various descriptions and drawings provided may be modified without departing from the spirit and scope of the disclosure. In particular, the teachings herein are not intended to be limited by or to the illustrative examples of computer systems and computing environments provided.
This specification provides various implementations and embodiments in the form of block diagrams, schematics, flowcharts, and examples. A person skilled in the art will understand that any function and/or operation within such block diagrams, schematics, flowcharts, or examples can be implemented, individually and/or collectively, by a wide range of hardware, software, and/or firmware. For example, the various embodiments disclosed herein, in whole or in part, can be equivalently implemented in one or more: application-specific integrated circuit(s) (i.e., ASICs); standard integrated circuit(s); computer program(s) executed by any number of computers (e.g., program(s) running on any number of computer systems); program(s) executed by any number of controllers (e.g., microcontrollers); and/or program(s) executed by any number of processors (e.g., microprocessors, central processing units, graphical processing units), as well as in firmware, and in any combination of the foregoing.
Throughout this specification and the appended claims, a “memory” or “storage medium” is a processor-readable medium that is an electronic, magnetic, optical, electromagnetic, infrared, semiconductor, or other physical device or means that contains or stores processor data, data objects, logic, instructions, and/or programs. When data, data objects, logic, instructions, and/or programs are implemented as software and stored in a memory or storage medium, such can be stored in any suitable processor-readable medium for use by any suitable processor-related instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the data, data objects, logic, instructions, and/or programs from the memory or storage medium and perform various acts or manipulations (i.e., processing steps) thereon and/or in response thereto. Thus, a “non-transitory processor-readable storage medium” can be any element that stores the data, data objects, logic, instructions, and/or programs for use by or in connection with the instruction execution system, apparatus, and/or device. As specific non-limiting examples, the processor-readable medium can be: a portable computer diskette (magnetic, compact flash card, secure digital, or the like), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM, EEPROM, or Flash memory), a portable compact disc read-only memory (CDROM), digital tape, and/or any other non-transitory medium.
The claims of the disclosure are below. This disclosure is intended to support, enable, and illustrate the claims but is not intended to limit the scope of the claims to any specific implementations or embodiments. In general, the claims should be construed to include all possible implementations and embodiments along with the full scope of equivalents to which such claims are entitled.
Claims
1. A computer-implemented method of synchronizing a musical composition with a video, the method comprising:
- preparing a video that begins at a video start time and ends at a video stop time;
- establishing a first time-marker corresponding to a first event in the video, wherein the first time-marker demarcates a time in between the video start time and the video stop time at which the first event occurs in the video; and
- generating a musical composition, wherein generating the musical composition includes synchronizing a first musical event of the musical composition with the first time-marker.
2. The method of claim 1 wherein generating the musical composition further includes synchronizing a start time of the musical composition with the video start time.
3. The method of claim 1 wherein generating the musical composition further includes synchronizing a stop time of the musical composition with the video stop time.
4. The method of claim 1 wherein generating the musical composition further includes:
- generating a timeline for the musical composition, the timeline comprising: a start time for the musical composition; a stop time for the musical composition; and the first time-marker in between the start time for the musical composition and the stop time for the musical composition;
- and
- generating the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, and ends at the stop time for the musical composition.
5. The method of claim 4, further comprising:
- generating a video timeline for the video, wherein the video timeline includes the video start time, the video stop time, and the first time-marker in between the video start time and the video stop time, and wherein generating the musical composition based on the timeline includes aligning the timeline for the musical composition with the video timeline to synchronize the first musical event with the first event in the video at the first time-marker.
6. The method of claim 1, further comprising:
- establishing at least one additional time-marker, each respective additional time-marker corresponding to a respective additional event in the video, wherein each respective additional time-marker demarcates a respective time in between the video start time and the video stop time at which a respective additional event occurs in the video, and wherein generating the musical composition further includes synchronizing a respective additional musical event of the musical composition with each respective additional time-marker.
7. The method of claim 6 wherein generating the musical composition further includes:
- generating a timeline for the musical composition, the timeline comprising: a start time for the musical composition that coincides with the video start time; a stop time for the musical composition that coincides with the video stop time; the first time-marker in between the start time for the musical composition and the stop time for the musical composition; and each additional time-marker in between the start time for the musical composition and the stop time for the musical composition;
- and
- generating the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, includes each respective additional musical event at each respective additional time-marker, and ends at the stop time for the musical composition.
8. The method of claim 1 wherein establishing a first time-marker corresponding to a first event in the video includes establishing the first time-marker corresponding to a first event selected from a group consisting of: a change of a view in the video, a change of a scene in the video, a change of a background in the video, a change of a foreground in the video, an action performed by a character in the video, an action performed by an object in the video, a period preceding a particular action in the video, a period preceding a particular event in the video, a period succeeding a particular action in the video, a period succeeding a particular event in the video, a change in a character depicted in the video, an introduction of a new character depicted in the video, a change in an object depicted in the video, and an introduction of a new object depicted in the video.
9. The method of claim 1 wherein synchronizing a first musical event of the musical composition with the first time-marker includes synchronizing, with the first time-marker, a musical event selected from a group consisting of: a particular note, a particular note sequence, a key modulation, a chord progression, a particular instrument, a change in instrumentation, a change in beats per minute, a change in timing, an acceleration, a deceleration, and a change in volume.
10. The method of claim 1 wherein generating a musical composition includes automatically generating the musical composition by a computer-based musical composition system, and wherein automatically generating the musical composition by the computer-based musical composition system includes defining a start time of the musical composition to at least approximately coincide with the video start time, defining a stop time of the musical composition to at least approximately coincide with the video stop time, defining the first musical event of the musical composition to coincide with the first time-marker, and generating music that includes the first musical event at the first time-marker and continuously connects between the start time of the musical composition and the stop time of the musical composition.
11. The method of claim 1 wherein generating a musical composition includes solving, by a computer-based musical composition system, a constraint satisfaction problem, and wherein synchronizing a first musical event of the musical composition with the first time-marker includes providing, in a formulation of the constraint satisfaction problem, a constraint that specifies that the first musical event of the musical composition is synchronized with the first time-marker.
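Claim 11 recites solving a constraint satisfaction problem (CSP) whose formulation includes a constraint pinning a musical event to a time-marker. As a minimal sketch, assuming a toy beat-grid representation and hypothetical variable names (the claims do not specify any encoding), such a formulation might be brute-forced as follows:

```python
# Toy CSP sketch: onset beats for two musical events are the variables;
# one constraint synchronizes the first event with a time-marker.
from itertools import product

BEAT = 0.5    # assumed beat length in seconds (hypothetical)
MARKER = 4.0  # assumed first time-marker, in seconds

# Domains: candidate onset-beat indices for each musical event.
domains = {"event_a": range(0, 17), "event_b": range(0, 17)}

def satisfies(assignment):
    # Constraint 1: the first musical event coincides with the time-marker.
    if assignment["event_a"] * BEAT != MARKER:
        return False
    # Constraint 2: the second event occurs after the first.
    return assignment["event_b"] > assignment["event_a"]

solutions = [
    dict(zip(domains, values))
    for values in product(*domains.values())
    if satisfies(dict(zip(domains, values)))
]
print(solutions[0])  # {'event_a': 8, 'event_b': 9}
```

A production system would use a real CSP solver rather than exhaustive enumeration, but the shape of the formulation — variables, domains, and a synchronization constraint — is the same.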
12. A system for synchronizing a musical composition with a video, the system comprising:
- at least one processor; and
- a non-transitory processor-readable storage medium communicatively coupled to the at least one processor, the non-transitory processor-readable storage medium storing processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to:
- prepare a video that begins at a video start time and ends at a video stop time;
- establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video; and
- generate a musical composition that synchronizes a first musical event with the first time-marker.
13. The system of claim 12 wherein the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, cause the at least one processor to:
- generate a timeline for the musical composition, the timeline comprising: a start time for the musical composition; a stop time for the musical composition; and the first time-marker in between the start time for the musical composition and the stop time for the musical composition; and
- generate the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, and ends at the stop time for the musical composition.
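The generation step recited above — begin at the timeline's start time, place the first musical event at the first time-marker, end at the stop time — can be sketched with an assumed note representation (onset time plus MIDI pitch; both the representation and the function name are hypothetical):

```python
# Minimal sketch: render a note list from a music timeline so that the
# piece spans start-to-stop and a synchronized event lands on each marker.

def generate_composition(timeline, event_pitch=72):
    notes = [{"onset": timeline["start"], "pitch": 60}]        # opening note
    for marker in timeline["markers"]:
        notes.append({"onset": marker, "pitch": event_pitch})  # synchronized event
    notes.append({"onset": timeline["stop"], "pitch": 60})     # closing note
    return notes

piece = generate_composition({"start": 0.0, "stop": 30.0, "markers": [12.0]})
print([n["onset"] for n in piece])  # [0.0, 12.0, 30.0]
```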
14. The system of claim 13 wherein the non-transitory processor-readable storage medium further stores processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a video timeline for the video, wherein the video timeline includes the video start time, the video stop time, and the first time-marker in between the video start time and the video stop time, and wherein the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate the musical composition based on the timeline, cause the at least one processor to align the timeline for the musical composition with the video timeline to synchronize the first musical event with the first event in the video at the first time-marker.
15. The system of claim 12 wherein:
- the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video, further cause the at least one processor to establish a second time-marker that demarcates a time in between the video start time and the video stop time at which a second event occurs in the video; and
- the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, further cause the at least one processor to generate the musical composition with a second musical event synchronized with the second time-marker.
16. The system of claim 15 wherein the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, cause the at least one processor to:
- define a constraint satisfaction problem, the constraint satisfaction problem including: a first constraint that specifies that the first musical event of the musical composition is synchronized with the first time-marker; and a second constraint that specifies that the second musical event of the musical composition is synchronized with the second time-marker; and
- solve the constraint satisfaction problem.
17. A computer program product comprising:
- processor-executable instructions and/or data that, when the computer program product is stored in a non-transitory processor-readable storage medium and executed by at least one processor communicatively coupled to the non-transitory processor-readable storage medium, cause the at least one processor to:
- prepare a video that begins at a video start time and ends at a video stop time;
- establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video; and
- generate a musical composition that synchronizes a first musical event with the first time-marker.
18. The computer program product of claim 17 wherein the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, cause the at least one processor to:
- generate a timeline for the musical composition, the timeline comprising: a start time for the musical composition; a stop time for the musical composition; and the first time-marker in between the start time for the musical composition and the stop time for the musical composition; and
- generate the musical composition based on the timeline, wherein the musical composition begins at the start time for the musical composition, includes the first musical event at the first time-marker, and ends at the stop time for the musical composition.
19. The computer program product of claim 18 wherein the non-transitory processor-readable storage medium further stores processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a video timeline for the video, wherein the video timeline includes the video start time, the video stop time, and the first time-marker in between the video start time and the video stop time, and wherein the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate the musical composition based on the timeline, cause the at least one processor to align the timeline for the musical composition with the video timeline to synchronize the first musical event with the first event in the video at the first time-marker.
20. The computer program product of claim 17 wherein:
- the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to establish a first time-marker that demarcates a time in between the video start time and the video stop time at which a first event occurs in the video, further cause the at least one processor to establish a second time-marker that demarcates a time in between the video start time and the video stop time at which a second event occurs in the video; and
- the processor-executable instructions and/or data that, when executed by the at least one processor, cause the at least one processor to generate a musical composition that synchronizes a first musical event with the first time-marker, further cause the at least one processor to generate the musical composition with a second musical event synchronized with the second time-marker.
Type: Application
Filed: Jun 29, 2021
Publication Date: Dec 30, 2021
Inventor: Colin P. Williams (Half Moon Bay, CA)
Application Number: 17/361,594