METHOD, SYSTEM, AND NON-TRANSITORY MACHINE-READABLE MEDIUM FOR CONTROLLING A DISPLAY IN A FIRST MEDIUM BY ANALYSIS OF CONTEMPORANEOUSLY ACCESSIBLE CONTENT SOURCES

A display in a first medium is controlled in response to commands generated by analyzing a second, contemporaneously available medium, such as a video accompanied by a soundtrack having components in first and second domains. A transform is applied to signals in the first domain to generate signals in the second domain. The second domain signals are ordered according to a rule and used to produce a command signal or signals comprising time-varying commands that vary at least one parameter of the video signal.

Description
CROSS REFERENCE TO RELATED APPLICATIONS

This patent application claims priority of Provisional Patent Application 61/648,593 filed May 18, 2012, Provisional Patent Application 61/670,754 filed Jul. 12, 2012, Provisional Patent Application 61/705,051 filed Sep. 24, 2012, Provisional Patent Application 61/771,629 filed Mar. 1, 2013, Provisional Patent Application 61/771,646 filed Mar. 1, 2013, Provisional Patent Application 61/771,690 filed Mar. 1, 2013, and Provisional Patent Application 61/771,704 filed Mar. 1, 2013, the disclosures of which are each incorporated by reference herein in their entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present subject matter relates to the presentation of a display in a first medium, having parameters controlled by analysis, as by applying a transform, of a contemporaneously available source in a second medium, such as a soundtrack, live performance, or audio input.

2. Related Art

An example of a first medium played concurrently with a second medium is a video display accompanied by music. In the prior art, video displays have been synchronized to music tracks.

An early, well-known, display is the “visualizer” on media players on personal computers. A preprogrammed visual pattern is modulated by the music, and variations in the video display are synchronized to the music. However, in this application, there is no display in the absence of the music. The music cannot synchronize an independent video program.

There are other applications which enable a user to manually assemble portions of various video and photographic sources into a composition in synchronism with one or more pieces of music. These typically require considerable training and practice before the user is able to produce results of a high quality.

A product called “PhotoCinema” marketed by a Japanese company called Digital Stage allows for a fairly sophisticated slide show to be created and viewed on the screen of a personal computer. Digital images stored on a personal computer can be presented in a variety of sequences, and individual images in a sequence can be zoomed. A chain of multiple images can be made to move from left to right, or from top to bottom, across the computer screen. Music can be selected to accompany the slide show.

However, the video program produced is simply based on arbitrary selections by a user. While the music accompanies a slideshow, the synchronization between the audio and visual components of the presentation is prepared in advance of performance. Parameters of the synchronization are not determined dynamically.

A video jockey (VJ) is generally a person who mixes a variety of video sources together to create a unique video image for display at large club events or other venues. Automated content manipulation could be provided in the alternative or in addition. A typical mix of images would be some pre-mixed DVDs of video images from previous events, abstract images such as proprietary visualizations, and live images from a video camera directed at the VJ or dancers in the audience, together with overlaying of text, for example, to display the name of the event, the VJ's name, or messages input by the VJ. The images from the respective sources are mixed by the VJ using video mixer/switcher hardware, which controls the overlay of the separate sources on a single display depending on the selected input source and fading transitions between the sources, much like audio mixers. While overlay of images can be provided, there is no mathematical relationship between the audio sources and the construction of the video display.

A product called “Avenue 4” by Resolume is a fully-featured professional VJ software tool, allowing elaborate mixing and manipulation of video sources. The complexity and steep learning curve of such a program make it unrealistic as a consumer tool. With such a system, while many effects and manipulations are possible, much practice is needed before sufficient mastery can be achieved to be able to work quickly. Streamlining and simplifying the complexity of operation would facilitate making this tool accessible to VJs of varying skill levels.

U.S. Pat. No. 8,402,356 discloses systems, methods, and apparatus for collecting data and presenting media to a user. The systems generally include a data gathering module associated with an electronic device. The data gathering module communicates gathered data to a management module, which manages at least one user profile based on the gathered data. The management module may select media for presentation to a user based on the user profile, and the selected media may be displayed to the user via a media output device co-located with the user, such as a display of the user's mobile electronic device or a television, computer, billboard, or other display. Related methods are also provided.

United States Published Patent Application No. 20110283865 discloses a system and method for visual representation of sound, wherein the system and method obtain sound information from a multimedia content. The system generates an icon and a directional indicator, each based on the sound information obtained. The sound information typically includes various attributes that can be mapped to various properties of the display elements such as the icon and directional indicator in order to provide details relating to the sound via a visual display. This system effectively “illustrates” sounds so that particular video cues may be given a one-to-one correspondence with particular sounds. However, no video program is generated based on the sound analysis.

United States Published Patent Application No. 20070292832 discloses a system for creating sound using visual images. Various controls and features are provided for the selection, editing, and arrangement of the visual images and tones used to create a sound presentation. Visual image characteristics such as shape, speed of movement, direction of movement, quantity, location, etc. can be set by a user. However, playback of a first medium is not synchronized to analysis of characteristics of a second medium.

Also, these systems do not allow for cooperative control of generating a video program. Another area in which capability has been limited is audience interaction and the provision of displays constructed for particular users.

The prior art regarding communication with multiple users whose devices are coupled to receive a display does not enable the performance of the present subject matter.

U.S. Pat. No. 7,796,162 discloses a system in which one set of cameras generates multiple synchronized camera views for broadcast from a live venue activity to remote viewers. A user chooses which view to follow. However, there is no provision for varying the sets of images sent to users, and no uploading capability for users.

SUMMARY

The present subject matter relates to a method, system, and non-transitory programmed medium for presentation of a display in a first medium such as a video display having parameters controlled by analysis of a contemporaneously available source in a second medium, such as a soundtrack, live performance, or audio input.

In one form, a display in a first medium is controlled in response to commands generated by analyzing a second, contemporaneously available medium. A first medium is played for distribution by a communications link to a plurality of users. A contemporaneously available medium, e.g., a sound track, having components in first and second domains is analyzed. A transform is applied to signals in the first domain to generate signals in the second domain, such as generating signals in the frequency domain based on an audio signal amplitude waveform. The second domain signals are ordered according to a rule. The ordering of the second domain signals is used to produce a command signal or signals comprising time-varying commands to vary at least one parameter of the video signal. Parameters may include pixelation, color saturation, contrast, or others.

In a further form of the system, the second medium is derived from a composite of social interactions in which a plurality of users communicates with a central server. Communications are monitored and responded to in order to construct a signal comprising the second medium.

BRIEF DESCRIPTION OF THE DRAWINGS

The present subject matter may be further understood by reference to the following description taken in connection with the following drawings:

FIG. 1 is an illustration of a venue employing the method and apparatus of the present subject matter;

FIG. 2 is a block diagram illustrating one embodiment of hardware for implementing the system illustrated in FIG. 1;

FIG. 3, consisting of FIGS. 3A, 3B, and 3C, is an illustration of a signal transformed from a time domain signal to a frequency domain signal and of data derived from the frequency domain signal;

FIG. 4 is a block diagram illustrating a rule-based circuit for generating control signals based on data derived from the signal in a second domain, in this embodiment a frequency domain signal;

FIG. 5 is a block diagram illustrating generation of a video program in the context of a set of collaborative social functions used as a second medium;

FIG. 6 is a flow diagram illustrating one form of the operation of the system illustrated in FIG. 5;

FIG. 7 illustrates correlation data for controlling production of control signals;

FIG. 8 is a block diagram of a system for mapping audio components to visual display controls.

DETAILED DESCRIPTION

The present subject matter may be used to enhance the experience of an audience at an event by providing variations that have not been available before in displays in a first medium, e.g., a video display. A first medium may comprise a program presented on a video display screen. A second medium may comprise an audio soundtrack or an audio performance. Many characteristics of the video display may be varied in accordance with analysis of the second medium.

A large screen video display has many different characteristics. Characteristics include hue, intensity, pixelation, color saturation, and RGB values. Control signals may be applied to command values of these and other characteristics. A second medium may comprise a soundtrack. In the prior art, the second medium generally accompanies the video display without affecting it. In accordance with the present subject matter, the second medium is analyzed to measure components and to generate command signals from the components based on a rule. The command signals are applied sequentially in time in order to control the video display.

FIG. 1 is an illustration of a venue 10 comprising a system 2 in accordance with the present subject matter. The venue 10 may include a performance stage 12, audience area 14, a control room 16, and a sound system 18 which may interact with the control room 16 in a conventional manner and which may also be coupled to a processing system as further described below. In order to enhance the experience of perceiving music, a video program 20 shown on a display 22 is provided in conjunction with a sound source 28. The sound source 28 may comprise a prerecorded program coupled to the sound system 18. In the present illustration, the sound source 28 comprises a live performance provided by a performer or performers 29. In one preferred form the display 22 is a screen 24 that comprises a backdrop for the performance stage 12. The display 22 could comprise an array 25 of monitors over which an image is distributed. The display 22 could alternatively comprise a plurality of identical displays in other locations.

In order to provide a more complete experience, the video program 20 can include matter which is synchronized with components of a performance. Components to which variations in the video program 20 are synchronized are provided from a synchronizing source 30. The synchronizing source 30 may receive an input from the sound system 18. The synchronizing source 30 provides signals from which synchronizing command signals will be generated. In a commonly used embodiment, it will be desirable to synchronize the video program to the songs or the music being played. In such an embodiment, the synchronizing source 30 is coupled to the sound system 18. A sound system 18 need not necessarily be operating on music. Other audio sources could include spoken words. Other sounds could also be used. For example, the synchronizing source 30 could be responding to sounds of car engines at a race track.

Sources need not necessarily be audio sources. Non-audio sources include phenomena that may be sensed. These sources could include ocean waves or vibration of molecules displaying nuclear magnetic resonance.

The display 22 is driven by a video interface 42. A video processor 40 provides signals for display. The video processor 40 may be coupled to and interact with one or more of the following components. A content database 50 may contain a library of video clips, still images, color patterns, and other content selectable for display. A portable device 60 could comprise a smartphone, tablet computer, or another portable device that may come into existence in the future. Another video source is a video camera 70. A control circuit 80 may be provided for selecting sources or commanding particular actions. The control circuit 80 may be commanded by an operator, who for purposes of the present description is described as a video jockey or VJ 84.

One preferred embodiment of the video processor 40 which may be utilized to perform functions of the present subject matter is illustrated in FIG. 2. FIG. 2 is a block diagrammatic representation which may be embodied in any of a number of monolithic integrated circuits or other components. The video processor 40 includes a data bus 104 which carries communications between various modules. For purposes of the present description, various subsystems of the video processor are illustrated as discrete components. However, these subsystems could be embodied in one microcircuit chip or distributed over various components within or without the video processor 40.

A central processing unit (CPU) 110 is programmed to process data and generate commands to select content for a video program. A RAM 112 is used to facilitate CPU 110 operations. A transform generator 114 responds to the synchronizing source 30 (FIG. 1) to produce intelligence for invoking commands to apply to video content. The transform generator 114 takes intelligence from the synchronizing source 30 and transforms it into data which has a meaningful relationship to parameters which will operate on the video display. In one preferred embodiment, the transform generator 114 produces a Fast Fourier Transform (FFT). The data is recognized and processed by an audio analyzer 116, which measures values indicative of the signal in a first domain. The term “audio” analyzer is used because the synchronizing source 30 will often comprise a sound source; however, sources other than audio may be provided by the synchronizing source 30. The outputs of a second domain analyzer 118, described with respect to FIG. 4, are coupled to a signal generator 140. In a further form, described below, the signal in the first domain is analyzed to provide further control signals.

FIG. 3 consists of FIGS. 3A, 3B, and 3C. FIG. 3A is an illustration of an audio signal 200 having content in the time domain and the frequency domain. For simplicity of illustration, only three frequency components 202, 204, and 206 are shown as components of the audio signal 200; common audio signals have a far greater range of frequencies. The signal 200 is represented in the time domain by a waveform 210, which represents the composite of the components and is displayed as amplitude versus time. The Fourier Transform provides a representation of the signal in the frequency domain using the signal 200 as an input. The result is seen in waveform 220, which plots the amplitude of the components versus their frequencies.
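
A minimal sketch of this time-domain to frequency-domain transform, in Python with NumPy, may aid understanding; the three component frequencies, their amplitudes, and the sample rate are illustrative assumptions rather than values from the disclosure:

```python
import numpy as np

# Sketch of FIG. 3A: a composite time-domain waveform (210) and its
# frequency-domain representation (220). Component frequencies,
# amplitudes, and sample rate are illustrative assumptions.
SAMPLE_RATE = 44100                          # samples per second
t = np.arange(0, 0.1, 1.0 / SAMPLE_RATE)     # 100 ms of audio

signal = (1.0 * np.sin(2 * np.pi * 110 * t)      # component 202 (bass)
          + 0.6 * np.sin(2 * np.pi * 440 * t)    # component 204 (mid)
          + 0.3 * np.sin(2 * np.pi * 2200 * t))  # component 206 (treble)

# Waveform 220: magnitude of components versus their frequencies.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1.0 / SAMPLE_RATE)
```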

FIG. 3B illustrates a first domain analysis. In the present embodiment, this comprises a basic RMS amplitude analysis of the audio signal. RMS amplitude is a measure of overall perceived “loudness.” Amplitude points 230 are derived periodically.
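
The RMS analysis lends itself to a short sketch as well; the window length below is an assumption, chosen only to derive amplitude points at a regular period:

```python
import numpy as np

def rms_amplitude(samples: np.ndarray, window: int = 1024) -> np.ndarray:
    """Derive periodic RMS amplitude points (230) from an audio signal,
    one value per non-overlapping window of samples."""
    n = len(samples) // window
    frames = samples[:n * window].reshape(n, window)
    return np.sqrt(np.mean(frames ** 2, axis=1))
```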

FIG. 3C illustrates data generation in an embodiment in which video control signals are based on the frequency content of signals received by the second domain analyzer 118 (FIG. 2). The second domain analyzer 118 comprises discriminators and filters to resolve the continuous waveform 220 into “bins” 240, each having a particular frequency width. In effect, a bar graph is generated with each bar comprising one bin 240.

In one form, the bins 240 of adjacent frequencies are manipulated according to a rule to develop an amplitude vector for the bass, mid-range, and treble portions of the spectrum of music in a performance. Each of these broad frequency bins 240 is coupled to the signal generator 140. The rule utilized to produce a control signal for operating on the video signal is to produce a time-varying amplitude value in accordance with outputs corresponding to a current value of a selected bin or bins in the second domain analyzer 118.

Preselected bins 240 are combined to develop an amplitude vector for each of the bass, mid-range, and treble portions of the spectrum of music. The amplitude vectors provide respective values for three control signals, one for each of three video characteristics.

In one embodiment, audio is analyzed in order to derive values that are applied to various modifying functions on the video signal. A windowed-FFT analysis of the audio is performed, and bins are defined by the cumulative amplitude of a range of frequencies. Bins may have overlapping boundaries. For example, all the amplitude energy from 50 Hz to 400 Hz defines the range for one bin, and a single amplitude value is produced which represents the amount of low-frequency energy in each window. Another bin may collect values for the energy from 300 Hz to 700 Hz. Further bins may be similarly defined. This produces a time-varying set of single-valued controls that can be mapped onto one or many control parameters. The control parameters embody various functions which dynamically manipulate the video signal at the same time.
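
As a sketch of this windowed analysis, the following Python (assuming NumPy) collects a cumulative amplitude per overlapping bin per window; only the first two bin boundaries come from the text, and the remaining bins, window length, and hop size are assumptions:

```python
import numpy as np

# The 50-400 Hz and 300-700 Hz boundaries come from the text; the
# remaining bins, window length, and hop size are assumptions.
BINS = [(50, 400), (300, 700), (600, 2000), (1800, 8000)]  # Hz

def bin_values(window: np.ndarray, sample_rate: int) -> list[float]:
    """One windowed-FFT analysis step: a single cumulative amplitude
    value per (possibly overlapping) bin for this window."""
    spectrum = np.abs(np.fft.rfft(window))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate)
    return [float(spectrum[(freqs >= lo) & (freqs < hi)].sum())
            for lo, hi in BINS]

def control_streams(samples: np.ndarray, sample_rate: int,
                    window: int = 2048, hop: int = 1024) -> np.ndarray:
    """Slide the analysis window over the audio, yielding a time-varying
    set of single-valued controls (one column per bin)."""
    steps = range(0, len(samples) - window, hop)
    return np.array([bin_values(samples[i:i + window], sample_rate)
                     for i in steps])
```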

The degree of visual effects to be applied to the video clips and stills is selected in accordance with the time-varying amplitude of respective control signals. The time-varying amplitudes are mapped onto the selected parameters. In one form, the time-varying amount of energy in a bass bin 240 is applied to determine the amount of pixelation applied to a video source. Visual effects that could be controlled include pixelation, tiling, pan and zoom, sepia tone, and distortion.
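
A minimal illustration of such a mapping follows; the normalization and block-size range are assumptions, and any monotonic mapping from bin energy to effect strength would follow the same pattern:

```python
def pixelation_amount(bass_energy: float, max_energy: float,
                      max_block: int = 32) -> int:
    """Map the time-varying bass-bin energy onto a pixelation block size;
    a block size of 1 leaves the source unmodified."""
    level = min(bass_energy / max_energy, 1.0)
    return 1 + int(level * (max_block - 1))
```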

The production of the effects which contribute to the video program 20 (FIG. 1) needs to be achieved in time to provide the video program in synchronism with the synchronizing source 30. In the case of recorded music that is played back to an audience, the synchronizing source 30 looks at the audio input ahead of the actual audio playback. A nominal, satisfactory lead time is on the order of tenths of a second. Transformation from the first domain into the second domain and generation of control signals must be performed in time to produce the desired effect at the same time that the analyzed music exits the sound system.
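
One way to realize this lead time is sketched below, reusing the bin_values function from the earlier sketch; the 0.3-second lead is an assumed value within the stated range of tenths of a second:

```python
LEAD_SECONDS = 0.3  # assumed lead time, within "tenths of a second"

def analyze_ahead(samples, sample_rate, playback_index, window=2048):
    """Analyze audio slightly ahead of the playback cursor so that control
    signals are ready when the analyzed music exits the sound system."""
    lead = int(LEAD_SECONDS * sample_rate)
    start = playback_index + lead
    return bin_values(samples[start:start + window], sample_rate)
```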

The video processor 40 (FIG. 1) measures basic RMS amplitude. The RMS amplitude signal may be coupled to determine when and how often to change from one video source clip or image to the next.

A minimum and a maximum time are selected that must elapse before changing video sources. Once the minimum time has elapsed, a sound level is selected to trigger a change in video source. The trigger may be produced in response to a signal crossing a preselected amplitude threshold; alternatively, an input circuit may resolve a selected peak in the RMS amplitude value to trigger the change in source. If the maximum time elapses without a trigger stimulus, a “timeout” signal triggers the change in video source anyway.
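
The trigger logic can be sketched as a small state machine; the particular minimum time, maximum time, and threshold values below are illustrative assumptions:

```python
import time

class SourceChangeTrigger:
    """No change before the minimum time; a change on an RMS threshold
    crossing after it; a forced "timeout" change at the maximum time.
    The default times and threshold are illustrative assumptions."""

    def __init__(self, min_seconds: float = 2.0, max_seconds: float = 10.0,
                 threshold: float = 0.5):
        self.min_seconds = min_seconds
        self.max_seconds = max_seconds
        self.threshold = threshold
        self.last_change = time.monotonic()

    def should_change(self, rms_level: float) -> bool:
        elapsed = time.monotonic() - self.last_change
        if elapsed < self.min_seconds:
            return False                   # too soon to change sources
        if rms_level >= self.threshold or elapsed >= self.max_seconds:
            self.last_change = time.monotonic()
            return True                    # threshold spike or timeout
        return False
```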

The video database of moving and still images may be preprogrammed into the database 50 (FIG. 1). Alternatively, the VJ 84 may select material from a given storage location or may access media through a real-time search on a search engine or on tags of other data sources.

A resulting composite video/audio composition may be saved to a standard QuickTime file. Streaming, either to an external monitor, or to a streaming host on the Internet, may provide for live sharing.

FIG. 4 is a block diagram illustrating a rule-based circuit for generating control signals based on data derived from the signal in a second domain, which in the present illustration is the frequency domain. The Fourier transform generator 114 provides an input signal in the second domain to the second domain analyzer 118. The second domain analyzer 118 includes an arithmetic unit 250 to produce the signals represented as bins 240 in FIG. 3C. The signals are separated within frequency ranges, and a signal indicating an amplitude per frequency range per timeslot is stored in a data memory 260. A clock circuit 265 clocks the data memory 260 to provide one set of amplitudes for a current time into a control signal register 270. There are many techniques for providing a gain control signal. In the present embodiment, the control signal register 270 addresses a lookup table 274, which provides respective outputs to a gain control amplifier 276. The gain control amplifier 276 provides the parameter control signals to the video processor 40.
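
A software analogue of this register-and-lookup-table arrangement might look as follows; the table contents (a square-law gain curve over 256 levels) are an assumption, since the disclosure leaves the table's mapping open:

```python
import numpy as np

# A stand-in for the control signal register 270 and lookup table 274:
# each stored amplitude (0-255) indexes a precomputed gain curve. The
# square-law curve is an assumption; the mapping is left open above.
LOOKUP_TABLE = (np.linspace(0.0, 1.0, 256) ** 2).tolist()

def parameter_control_signals(register: list[int]) -> list[float]:
    """Convert one clocked set of per-bin amplitudes into gain values
    for the video processor's parameter inputs."""
    return [LOOKUP_TABLE[min(max(v, 0), 255)] for v in register]
```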

FIG. 5 is a block diagram illustrating an embodiment in which the second medium comprises a set of collaborative social functions. In this embodiment, individual devices 300-1 through 300-n such as smart phones or tablet computers each store an application, or app. Each portable device contains its own music library 302 and a program memory 304 storing an app 306.

Each portable device 300 includes a microprocessor 310. The portable devices 300 each interact via a communications link 330 with a video processor 340. The communications link 330 may comprise the Internet, telephone connections, satellite communications, and other forms of communication alone or in combination. The video processor 340 includes a data bus 342 and a CPU 344. The interaction with the video processor 340 is shown for convenience and is not essential. A separate processor could be used to interact with the portable devices 300.

The present system uses an indication of the similarity between the contents of one user's music library 302 and that of another user. The similarity is a value based on metrics associated with each entry in a music library 302. Many known functions exist for characterizing the types of music that are stored, generating a profile of the musical tastes of one user that can be compared to the musical tastes of other users. For example, at http://www.InstantEncore.com, users are informed of other users who have similar musical tastes. Also, characterizations are provided to indicate musical compatibility with other users who have lesser degrees of similarity in musical tastes. Characterization may be performed in the microprocessor 310 of a device 300 in accordance with the app 306. Alternatively, the app 306 may be uploaded to the CPU 344 for performing the characterization.
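
One plausible realization of such a similarity indication is sketched below as the cosine similarity of per-genre track counts; both the metric vector and the choice of cosine similarity are assumptions, as the text leaves the characterization function open:

```python
import numpy as np

def library_similarity(metrics_a: np.ndarray, metrics_b: np.ndarray) -> float:
    """Cosine similarity of two users' music library metric vectors;
    a value near 1.0 indicates similar musical tastes."""
    denom = np.linalg.norm(metrics_a) * np.linalg.norm(metrics_b)
    return float(np.dot(metrics_a, metrics_b) / denom) if denom else 0.0

# Hypothetical per-genre track counts:
# [rock, jazz, classical, electronic, hip-hop]
a = np.array([120, 5, 30, 80, 10])
b = np.array([90, 10, 5, 100, 40])
print(library_similarity(a, b))
```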

The video processor 340 comprises a data analyzer 346. The data analyzer 346 registers similarity metrics data in a manner similar to the audio analyzer 116 (FIG. 2) registering frequency data. The similarity metrics data is monitored and used for application to a rule-based signal generation means.

FIG. 6 is a flow diagram illustrating one form of the operation of the system of FIG. 5. At block 400, users use the app 306 to connect a user's portable device 300 to the video processor 340. These connections are generally made at different times; actions at block 400 are ongoing and need not occur at any particular time. At block 402, the app 306 uploads the music library metric for the respective portable device 300 to the video processor 340. The upload may be initiated by a user request or may be programmed into the app 306 for automatic execution. The video processor 340 collects music metrics for selected users. At block 404, the video processor 340 selects metrics signals to compare. At block 406, the video processor 340 performs a comparison of the selected metrics and stores values of correlations. At block 408, a time-varying signal based on a correlation function is calculated. At block 410, a time-varying signal indicative of varying correlation values versus time is produced. At block 420, a rule-based control signal is generated based on the correlation-versus-time output.

FIG. 7 illustrates the correlation data produced by the circuit of FIG. 5 and the method of FIG. 6. The correlations that are compared can be grouped in any desired preselected manner. In FIG. 7, bins 524 of correlation factors are created, and the number of comparisons producing correlations within the width of each bin is registered. The output of the analyzer 346 (FIG. 5) is read periodically to provide current values for the bins 524.
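
This registration of comparisons into bins is, in effect, a histogram over correlation values, as the following sketch shows; the bin count and value range are assumptions:

```python
import numpy as np

def correlation_bins(correlations: np.ndarray, n_bins: int = 10):
    """Group pairwise correlation values into bins 524, registering the
    number of comparisons falling within the width of each bin."""
    counts, edges = np.histogram(correlations, bins=n_bins, range=(0.0, 1.0))
    return counts, edges
```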

FIG. 8 is a block diagram of a system for mapping audio components to visual display controls. FIG. 8 is illustrated as being embodied in analog hardware. However, software solutions, as in the programming for the CPU 110 (FIG. 2), may be supplied. In this case, analog signals are converted to digital signals for processing. A transform generator 602 provides an amplitude versus frequency waveform 604 for a sound input. The waveform 604 is provided to a discriminator circuit 610, which measures amplitude within a preselected frequency width. In this manner, bins, e.g., the bins 240 in FIG. 4, are generated. The output of the discriminator circuit 610 is provided to a value register 614. The value register 614 may provide signals in parallel to a control signal generator 620. The control signal generator 620 converts signals from the value register 614 into inputs usable by a video processor 640. The video processor 640 may include a program 642 that selects video effects in response to input signals. One example of a program 642 is VFX VJ Software made by MixVibes. Visual effects in the VFX VJ Software are commanded through a graphical user interface (GUI) 646. In accordance with the present subject matter, the control signal generator 620 produces data streams which are interfaced to provide input signals corresponding to selection of options in the GUI 646.
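
A minimal sketch of the control signal generator 620 routing named bin values to effect controls follows; the callback interface and the named setters are hypothetical stand-ins, as the actual GUI inputs of the VFX VJ Software are not described in this text:

```python
from typing import Callable, Dict

def dispatch_controls(values: Dict[str, float],
                      controls: Dict[str, Callable[[float], None]]) -> None:
    """Route each named bin value from the value register to the callback
    that selects the corresponding effect option."""
    for name, value in values.items():
        if name in controls:
            controls[name](value)

# Usage, with hypothetical effect setters:
# dispatch_controls({"bass": 0.8, "treble": 0.2},
#                   {"bass": set_pixelation, "treble": set_saturation})
```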

The previous description is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the spirit or scope of the invention. For example, one or more elements can be rearranged and/or combined, or additional elements may be added. A wide range of systems may be provided consistent with the principles and novel features disclosed herein.

Claims

1. A method for controlling a display in a first medium by analysis of a selected contemporaneously available content source comprising:

providing a presentation in the first medium having at least one characteristic subject to control;
providing a source in a second medium, the source having components in at least first and second domains;
applying a transform to a source in the second medium to transform a signal in the first domain into a signal in the second domain;
defining bins each collecting measured values of signals in a selected range of the signal in the second domain;
producing a bin value corresponding to a function of the integrated value of the measured signals in each selected range over a selected time period; and
applying the bin value to control at least one characteristic of the presentation in the first medium.

2. A method according to claim 1 wherein the step of applying the bin value to control at least one characteristic of the presentation in the first medium comprises providing a control signal generator and generating with said control signal generator a characteristic control signal in correspondence to each respective bin value.

3. A method according to claim 2 wherein defining bins comprises selecting ranges that may have overlapping boundaries.

4. A method according to claim 3 further comprising reading said second source signal at a preselected number of clock periods in advance of a program point and calculating the control signals in advance of the program point.

5. A method according to claim 4 wherein said first medium comprises a video program, and said second medium comprises audio.

6. A method according to claim 5 wherein the first domain comprises amplitude and the second domain comprises frequency.

7. A non-transitory machine-readable medium for execution on a digital processor, which when executed causes the processor to perform the steps of:

monitoring a source in a second medium contemporaneously available with a presentation in a first medium, the source having components in at least first and second domains;
applying a transform to a source in the second medium to transform a signal in the first domain into a signal in the second domain;
measuring selected values of selected ranges of the signal in the second domain;
producing a bin value corresponding to each selected range; and
applying the bin value to control at least one characteristic of the presentation in the first medium.

8. A non-transitory machine-readable medium according to claim 7 that causes the processor to perform the further steps of generating a plurality of bin values, applying each bin value to a control circuit, generating a control signal having a value corresponding to an amplitude of each bin value, and controlling each of a plurality of respective characteristics of said presentation in the first medium.

9. A system for controlling a display in a first medium by analysis of contemporaneously available content sources comprising:

a synchronizing source receiving a program from a source in a second medium, the source having components in at least first and second domains;
a transform generator applying a transform to a source in the second medium to transform a signal in the first domain into a signal in the second domain;
a signal generator resolving the second domain signal into a plurality of bin values each indicative of amplitude in a preselected range of the second domain signal;
a control signal generator receiving each bin value and comprising means for producing a control signal in correspondence with each bin value; and
a control circuit coupled to vary the value of characteristics of a program signal in the first medium in correspondence with a respective control signal.

10. A system according to claim 9 wherein the system is coupled to receive a video program in a first medium and contemporaneously available audio content in a second medium.

11. A system according to claim 9 wherein the system is coupled to receive inputs from a second medium comprising portable interactive devices in an audience, wherein the first domain is a parameter based on one parameter of information describing a user generated by an app and wherein the second domain is an amplitude generated in correspondence with the number of occurrences of the parameter within each of a preselected number of ranges.

Patent History
Publication number: 20130308051
Type: Application
Filed: May 15, 2013
Publication Date: Nov 21, 2013
Inventors: ANDREW MILBURN (LOS ANGELES, CA), THOMAS HAJDU (SANTA BARBARA, CA)
Application Number: 13/895,274
Classifications
Current U.S. Class: Audio To Video (348/515)
International Classification: H04N 5/04 (20060101);