INTERFACES FOR DIGITAL MEDIA PROCESSING
APIs discussed herein promote efficient and timely interoperability between hardware and software components within the media processing pipelines of media content players. A PhysMemDataStructure API facilitates a hardware component's direct access to information within a memory used by a software component, to enable the hardware component to use direct memory access techniques to obtain the contents of the memory, instead of using processor cycles to execute copy commands. The PhysMemDataStructure API exposes one or more fields of data structures associated with units of media content stored in a memory used by a software component, and the exposed fields store information about the physical properties of the memory locations of the units of media content. SyncHelper APIs are used for obtaining information from, and passing information to, hardware components, which information is used to adjust the hardware components' timing for preparing media samples of synchronously-presentable media content streams.
This application is a continuation of U.S. Ser. No. 14/107,529, filed Dec. 16, 2013, entitled, “INTERFACES FOR DIGITAL MEDIA PROCESSING”, which is a continuation of U.S. Ser. No. 11/824,720, filed Jun. 30, 2007, entitled, “INTERFACES FOR DIGITAL MEDIA PROCESSING”, now U.S. Pat. No. 8,612,643, issued Dec. 17, 2013, which are incorporated herein by reference in their entirety.
BACKGROUND
Digital media presentations are composed of sequenced sets of media content such as video, audio, images, text, and/or graphics. When media content players render and/or present such sequenced sets of media content to users, they are referred to as streams of media content. Some media content players are configured to concurrently render and present more than one independently-controlled stream of media content (for example, a main movie along with features such as a director's commentary, actor biographies, or advertising). Such media content players may also be capable of rendering and presenting user-selectable visible or audible objects (for example, various menus, games, special effects, or other options) concurrently with one or more streams of media content.
Any type of device in the form of software, hardware, firmware, or any combination thereof may be a media content player. Devices such as optical media players (for example, DVD players), computers, and other electronic devices that provide access to large amounts of relatively inexpensive, portable or otherwise accessible data storage are particularly well positioned to meet consumer demand for digital media presentations having significant play durations.
It is common for various entities to supply different software and hardware components of media content players, and such components are expected to successfully interoperate in environments having limited processing and memory resources. It is therefore desirable to provide techniques for ensuring resource-efficient, relatively glitch-free play of digital media presentations, including the accurate synchronization of concurrently presentable streams of media content, on all types of media content players and architectures thereof.
SUMMARY
Digital media processing techniques and interfaces (such as application programming interfaces (“APIs”)) discussed herein promote efficient, consistent interoperability between hardware and software components within a media processing pipeline associated with a media content player.
Generally, a media processing pipeline is responsible for receiving sets of media content from media sources such as optical disks, hard drives, network locations, and other possible sources, and performing processing tasks to prepare the sets of media content for presentation to a user as one or more media content streams of a digital media presentation such as a movie, television program, audio program, or other presentation. Sets of media content are referred to as “clips,” with one clip generally received from one media source. Discrete portions of clips read from a particular media source are referred to herein as media content units, which are generally demultiplexed, decompressed, decoded, and/or decrypted. After being demultiplexed, such media content units are referred to herein as media samples. It will be appreciated, however, that the naming convention(s) used herein is/are for illustrative purposes only, and that any desired naming conventions may be used.
A media processing pipeline includes components such as media source readers, demultiplexers, decoders, decrypters, and the like, which are implemented in hardware or software or a combination thereof. Frameworks such as the Microsoft® DirectShow™ multimedia framework may be used to implement a media processing pipeline. It will be appreciated, however, that any now known or later developed framework may be used to implement a media processing pipeline.
Information (such as information about the media content itself and/or presentation of the media content to a user) is exchanged at boundaries between software components and hardware components in a media processing pipeline. In one information exchange scenario, information within a memory (the term memory can encompass any type of computer-readable storage medium) used by a software component is usable by a hardware component. In another information exchange scenario, a hardware component modifies its operation based on information ascertained by a software component, or vice-versa.
One exemplary technique and interface discussed herein—referred to for discussion purposes as the “PhysMemDataStructure” interface—is configured for operation at a boundary between a software component and a hardware component of a media processing pipeline to facilitate the hardware component's direct access of information from a memory used by the software component, instead of using instructions/processor cycles to copy the information. The PhysMemDataStructure interface exposes to the hardware component one or more fields of data structures associated with units of media content (which are to be processed by the hardware component) stored in a memory used by the software component. The fields of the data structures store information about the physical properties of the memory where individual units of media content are located. Examples of such physical properties include but are not limited to type of memory, memory block size, locations of read/write pointers to memory locations, and offset locations of media content units with respect to such memory pointers. To further enhance the efficient use of memory resources, the software component may store units of media content in a ring buffer. To achieve still further memory and processing efficiencies, virtual memory may be used to duplicate the beginning portion of the ring buffer at the ending portion of the physical memory ring buffer.
Other exemplary techniques and interfaces discussed herein, referred to for discussion purposes as the “SyncHelper” interfaces, are configured to facilitate information exchange between hardware components and software components, which may be used to adjust timing (to maintain perceived synchronization between two media content streams, for example) or other operational aspects of the hardware or software components. One SyncHelper interface discussed herein, referred to as the “GetDecodeTimes” interface, provides information about a particular media content unit or media sample being rendered by a hardware component (such as a demultiplexer, decoder, or renderer) at a particular point in time. The provided information includes the elapsed amount of the play duration of the digital media presentation at the particular point in time, as well as the elapsed amount of the play duration of the clip from which the media sample was derived. Another SyncHelper interface, referred to as the “SyncToSTC” interface, facilitates synchronization of various concurrently presentable media content streams. In an exemplary scenario, the SyncToSTC interface ascertains (that is, either requests/receives or calculates) a difference between two values of the elapsed amount of the play duration of the digital media presentation returned by the GetDecodeTimes interface, and instructs one or more hardware components to adjust timing (for example, adjust the rate of a timing signal or adjust which media sample is being decoded or both) based on the ascertained difference.
This Summary is provided to introduce a selection of concepts in a simplified form. The concepts are further described in the Detailed Description section. Elements or steps other than those described in this Summary are possible, and no element or step is necessarily required. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended for use as an aid in determining the scope of the claimed subject matter. The claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
The predictable and relatively glitch-free play of a digital media presentation is often dependent on the efficient use of limited computing resources of the media content player, especially memory and processor resources. Glitches and inefficiencies can arise in various situations, especially when information is transferred between hardware components and software components operating in a media processing pipeline. In one scenario, inefficiencies may arise when information is transferred between a memory used by a software component and a memory used by a hardware component—it is desirable to minimize the processing and/or memory resources used in memory access transactions. In another scenario, glitches in the play of the media content stream(s) and/or user-perceived loss of synchronization may occur when multiple media content streams are prepared by separate hardware components for concurrent presentation to a user and appropriate information is not available to the hardware components to ensure operational synchronization—it is desirable to provide information to the hardware components for use in adjusting the timing for performing certain processing tasks.
Various techniques and application programming interfaces (“APIs”) are discussed herein that operate at a boundary between a software component and a hardware component, to expose information usable by the hardware component to enhance the efficiency, accuracy and interoperability of the components operating in the media processing pipeline of a media content player.
Turning now to the drawings, where like numerals designate like components,
As shown, Presentation System 100 includes a media content manager 102, an interactive content (“IC”) manager 104, a presentation manager 106, a timing signal management block 108, and a mixer/renderer 110. In general, design choices dictate how specific functions of Presentation System 100 are implemented. Such functions may be implemented using hardware, software, or firmware, or combinations thereof.
In operation, Presentation System 100 handles interactive multimedia presentation content (“Presentation Content”) 120. Presentation Content 120 includes a media content component (“media component”) 122 and an interactive content component (“IC component”) 124. Media component 122 and IC component 124 are generally, but need not be, handled as separate streams, by media content manager 102 and IC manager 104, respectively.
Presentation System 100 facilitates presentation of Presentation Content 120 to a user (not shown) as played presentation 127. Played presentation 127 represents the visible and/or audible information associated with Presentation Content 120 that is produced by mixer/renderer 110 and receivable by the user via devices such as displays or speakers (not shown). For discussion purposes, it is assumed that Presentation Content 120 and played presentation 127 represent aspects of high-definition DVD movie content, in any format. It will be appreciated, however, that Presentation Content 120 and Played Presentation 127 may be configured for presenting any type of presentation of media content now known or later developed.
Media component 122 represents one or more sequences (generally, time-ordered) of video, audio, images, text, and/or graphics presentable to users as media content streams (media content streams 308 and 328 are shown and discussed further below, in connection with
A movie generally has one or more versions (a version for mature audiences, and a version for younger audiences, for example); one or more titles 131 with one or more chapters (not shown) associated with each title (titles are discussed further below, in connection with presentation manager 106); one or more audio tracks (for example, the movie may be played in one or more languages, with or without subtitles); and extra features such as director's commentary, additional footage, actor biographies, advertising, trailers, and the like. It will be appreciated that distinctions between titles and chapters are purely logical distinctions. For example, a single perceived media segment could be part of a single title/chapter, or could be made up of multiple titles/chapters. It is up to the content authoring source to determine the applicable logical distinctions.
Sets of sequences of video, audio, images, text, and/or graphics that form aspects of media component 122 are commonly referred to as clips 123 (clips 123 are shown within media component 122 and playlist 128, and are also referred to in
Media data 132 is data associated with media component 122 that has been prepared for rendering by media content manager 102 and transmitted to mixer/renderer 110. Media data 132 generally includes, for each active clip 123, a rendering of a portion of the clip.
Referring again to Presentation Content 120, IC component 124 includes interactive objects 125, which are user-selectable visible or audible objects, optionally presentable concurrently with media component 122, along with any instructions (shown as applications 155) for presenting the visible or audible objects. Examples of interactive objects include, among other things, video samples or clips, audio samples or clips, images, graphics, text, and combinations thereof.
Applications 155 provide the mechanism by which Presentation System 100 presents interactive objects 125 to a user. Applications 155 represent any signal processing method or stored instruction(s) that electronically control predetermined operations on data.
IC manager 104 includes one or more instruction handling engines 181, which receive, interpret, and arrange for execution of commands associated with applications 155. As execution of applications 155 progresses and user input 150 is received, behavior within played presentation 127 may be triggered. Execution of certain instructions of application 155, labeled as “input from ICM” 190, may facilitate communication or interoperability with other functionality or components within Presentation System 100. As shown, input 190 is received by media content manager 102 (discussed further below, in connection with
Interactive content data (“IC data”) 134 is data associated with IC component 124 that has been prepared for rendering by IC manager 104 and transmitted to mixer/renderer 110.
Timing signal management block 108 (discussed further below, in connection with
Mixer/renderer 110 renders media data 132 in a video plane (not shown), and renders IC data 134 in a graphics plane (not shown). The graphics plane is generally, but not necessarily, overlaid onto the video plane to produce played presentation 127 for the user.
Presentation manager 106, which is configured for communication with media content manager 102, IC manager 104, mixer/renderer 110, and timing signal management block 108, facilitates handling of Presentation Content 120 and presentation of played presentation 127 to the user. Presentation manager 106 has access to a playlist 128. Playlist 128 includes, among other things, a time-ordered sequence of clips 123 and applications 155 (including interactive objects 125) that are presentable to a user. The clips 123 and applications 155/interactive objects 125 may be arranged to form one or more titles 131. As discussed above, it is possible for more than one independently-controlled title/media content stream to be concurrently played to a user. Such concurrently played streams may be indicated on playlist 128, or serendipitous user input may cause concurrent play of media content streams.
Presentation manager 106 uses playlist 128 to ascertain a presentation timeline 130 for a particular media presentation (a title 131 in the case of a movie), which generally has a predetermined play duration representing the particular amount of time in which the title is presentable to a user. Representations of amounts of specific elapsed times within the play duration are often referred to as “title times”. Because a title may be played once or more than once (in a looping fashion, for example), the play duration is determined based on one iteration of the title. Conceptually, presentation timeline 130 indicates the title times when specific clips 123 and applications 155 are presentable to a user (although as indicated, it is not generally known when user inputs starting and stopping the play of some specific clips may occur). Specific clips 123 also generally have predetermined play durations representing the particular amounts of time for presenting the clip. Representations of amounts of specific elapsed times within the clip play durations are often referred to as “presentation times”. Each individually-presentable portion of a clip (which may for discussion purposes be referred to as a “media sample,” although any desired naming convention may be used) has an associated, pre-determined presentation time within the play duration of the clip. To avoid user-perceptible glitches in the presentation of media content, one or more upcoming media samples are prepared for presentation in advance of the scheduled/pre-determined presentation time.
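The relationship between title times and presentation times described above can be illustrated with a minimal sketch. The `ScheduledClip` structure, its field names, and the `active_clips` helper are assumptions introduced for discussion purposes only; they are not part of any interface described herein. The sketch assumes each clip is placed on the presentation timeline at a known title time and plays at normal speed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScheduledClip:
    """A clip placed on the presentation timeline (hypothetical structure)."""
    name: str
    title_start: float    # title time at which the clip begins, in seconds
    play_duration: float  # play duration of the clip, in seconds


def active_clips(timeline: List[ScheduledClip], title_time: float) -> List[Tuple[str, float]]:
    """Return (clip name, presentation time within the clip) for each clip
    that is presentable at the given title time."""
    result = []
    for clip in timeline:
        presentation_time = title_time - clip.title_start
        if 0.0 <= presentation_time < clip.play_duration:
            result.append((clip.name, presentation_time))
    return result


timeline = [
    ScheduledClip("main_movie", title_start=0.0, play_duration=120.0),
    ScheduledClip("commentary", title_start=30.0, play_duration=60.0),
]
print(active_clips(timeline, 45.0))
```

At a title time of 45 seconds, both hypothetical clips are active, and each has its own presentation time measured from the start of that clip.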
To better illustrate the play of a particular clip and timing/times associated therewith, it is useful to use playlist 128 and/or presentation timeline 130 to ascertain one or more media content timelines (“media timeline(s)”) 142. With continuing reference to
A current elapsed play time 209 (that is, the title time of the digital media presentation with which the clip is associated) is shown on media timeline 142. Media sample 250 is being presented to a user at current elapsed play time 209. As shown, current elapsed play time 209 coincides with a particular media sample presentation time 202, although such coinciding is not necessary. A next presentable media sample presentation time 214 is also shown. Next presentable media sample presentation time 214 is used to determine the media sample, and/or the media sample presentation time, that should next be prepared for presentation to a user (as shown, next processable media sample 270 is to be prepared for presentation). It will be appreciated that the next presentable media sample/presentation time may be the next consecutive media sample/presentation time based on playlist 128, or may be a media sample/presentation time one or more media samples/presentation times 202 away from the media sample/presentation time associated with current elapsed play time 209. There are various ways to ascertain the next presentable media sample/media sample presentation time, which are not discussed in detail herein. Generally, however, a predicted elapsed play time 220 (that is, predicted title time of the play duration of the digital media presentation) and the corresponding next presentable media sample/presentation time are ascertained. Information such as the play speed, media frame rate 207, and other information may be used to determine the predicted elapsed play time and/or locate the particular media sample presentation time/media sample.
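One possible way to ascertain a predicted elapsed play time and the corresponding next presentable media sample is sketched below. This is only one of the various possible approaches and is not a method prescribed herein. It assumes a constant frame rate, a constant play speed, and a fixed preparation lead time; the function name and all parameter names are hypothetical. Exact fractions are used to avoid rounding effects at frame boundaries.

```python
import math
from fractions import Fraction


def predict_next_sample(current_title_time, play_speed, lead_time,
                        clip_title_start, frame_rate):
    """Return (predicted title time, sample index, sample presentation time).

    current_title_time: current elapsed play time, in seconds
    play_speed:         1 for normal speed, 2 for double speed, and so on
    lead_time:          how far ahead (wall-clock seconds) samples are prepared
    clip_title_start:   title time at which the clip begins (assumed known)
    frame_rate:         media frame rate, in samples per second
    """
    predicted_title_time = current_title_time + play_speed * lead_time
    clip_time = predicted_title_time - clip_title_start
    # First sample whose presentation time is at or after the predicted time.
    index = max(0, math.ceil(clip_time * frame_rate))
    return predicted_title_time, index, index / frame_rate


predicted, index, sample_time = predict_next_sample(
    current_title_time=Fraction(101, 10),  # 10.1 s
    play_speed=Fraction(1),
    lead_time=Fraction(1, 2),
    clip_title_start=Fraction(0),
    frame_rate=Fraction(24),
)
print(float(predicted), index, float(sample_time))
```

At higher play speeds the predicted title time moves further ahead for the same lead time, so the selected sample may be several samples away from the one currently being presented.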
Referring again to
With continuing reference to
Media content manager 102 is responsible for preparing upcoming individually-presentable portions of clips, such as next processable media sample(s) 270 shown in
It will be appreciated that media content manager 102 may have a dynamic processing load based on the identity and scheduling (pre-determined or based on serendipitous user input 150) of the various clips 123 comprising media component 122 and/or IC component 124. Generally, it is desirable for media processing pipelines to consume no more than 10-15% of the processing resources (for example, CPU cycles) of Presentation System 100.
Large amounts of processing resources can be consumed when information is transferred between memory locations using traditional copy transactions such as memory-to-memory copies, and the over-use of processing resources for copy transactions has the potential to cause glitches in the play of a digital media presentation. Yet, it is often desirable to transfer information between memories used by different components of media processing pipelines, especially between memories used by software components and memories used by hardware components. Hardware components are used, among other reasons, to accelerate media content processing.
Contemporaneously preparing for presentation upcoming portions of two or more clips can also consume large amounts of computing resources such as memory and processor cycles in a manner that is not easily predictable, and can further exacerbate the potential for glitches in the play of digital media content. Moreover, memory and/or processing resources required to prepare a particular portion of a clip for presentation (and thus times for such preparation) are not always constant from sample-to-sample or clip-to-clip. Some factors that affect required resources and preparation times are associated with the media content itself (including but not limited to factors such as media unit/sample size, media source/location, encoding or decoding parameters, and encryption parameters). Other factors that affect required resources are associated with the media content player (for example, media processing pipeline architecture, dynamic processing loads, and other features of media content player architecture), while still other factors that affect required resources are associated with user input (user-selected media content, content formats, or play speeds, for example).
With continuing reference to
As shown, a software-hardware boundary 403 is indicated by a dashed line: components on the left side of boundary 403 are primarily software-based components (or portions of components implemented using software), and components on the right side of boundary 403 are primarily hardware-based components (or portions of components implemented using hardware or firmware or a combination thereof). An exemplary architecture includes a software-based media source reader 402 having access to a first memory 430 from which a hardware-based component can directly read; a hardware-based demultiplexer (“demux”) 404 generally having access to one or more blocks of memory (shown and referred to as a second memory 433 for discussion purposes); one or more hardware-based decoders/renderers 490 also generally having access to one or more blocks of memory (shown and referred to as second memory 433 for discussion purposes); and application programming interfaces 408, which include a PhysMemDataStructure API 410, Sniffer/Callback APIs 422, and SyncHelper APIs 416 including GetDecodeTimes API 418 and SyncToSTC API 440.
Media source reader 402 is responsible for receiving (via data push or pull techniques) individually-presentable portions of clips (referred to for discussion purposes as media units 407) from a particular media source, storing the received media units 407 in memory 430, and for passing data regarding the stored media units 407 downstream (to demux 404 or decoders/renderers 490, for example). In one possible implementation, data is passed downstream to demux 404 using data structures. In the context of a Microsoft® DirectShow™ framework, for example, media units 407 are wrapped in data structures referred to as IMediaSample objects (IMediaSample refers to an interface that the objects implement; the objects may be referred to as Media Samples). Often IMediaSample objects are constrained to a fixed size allocation at initialization time, and depending on sizes of media content units, may not be used to their full extent. Using a ring buffer 420 as discussed below enables more efficient use of memory.
Memory 430 represents any computer-readable medium (computer-readable media are discussed further below, in connection with
Demux 404 is responsive to receive media units 407 (such as next processable media sample(s) 270, shown in
Decoders/renderers 490 are responsible for receiving demultiplexed media units, referred to for discussion purposes as media samples 409 (MPEG-2 samples, for example), and for using generally well-known techniques for unscrambling/unencrypting the demultiplexed media samples to produce media data 132 associated with a particular media content stream 308, 328. Although a one-to-one relationship between media sources, demultiplexers, and decoders/renderers is shown, it will be appreciated that any arrangement of any number of such components (along with additional components) is possible, and that such components may be shared between media processing pipeline1 302 and media processing pipeline2 320.
APIs 408 are provided to enhance the interoperability of software components and hardware components within a media processing pipeline, and to promote the efficient use of memory and processing resources of Presentation System 100. In one possible implementation, APIs 408 are sets of computer-executable instructions encoded on computer-readable storage media that may be executed during operation of Presentation System 100 and/or accessed by authors of instructions for media processing components 306 and 326. Generally, APIs 408 are configured to perform aspects of the method(s) shown and discussed further below in connection with
PhysMemDataStructure API 410 is configured to generalize the support of memory 430 that can be directly consumed by hardware components such as demux 404 and decoders/renderers 490. In one possible implementation (in the context of a media processing pipeline having a DirectShow™ framework, for example) media units 407 wrapped in IMediaSample objects are allocated, by means of an implementation of an IMemAllocator object, to storage locations within hardware-allocated memory block 432. The allocator may be provided using input pin 411 of demux 404, for example: output pin 401 would query input pin 411, so that demux 404 can provide memory with properties usable/needed by the hardware. Information about such storage locations (such as the type of memory; a size of a memory block; a location of a pointer to the memory; and an offset location of a storage location of a particular media unit with respect to a pointer to the memory) is exposed to hardware components such as demux 404 and decoders/renderers 490 by PhysMemDataStructure API 410. Hardware components are thereby able to directly access/retrieve information within hardware-allocated memory block 432 (via direct memory access techniques, for example), instead of using instructions and processor cycles to copy the information.
Exemplary pseudo-code usable for implementing PhysMemDataStructure API 410 in the context of media processing pipelines 302, 320 and/or media processing components 306, 326 is shown below.
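For illustration only, the kind of information exposed about a storage location can be modeled as a small record. The sketch below is not the PhysMemDataStructure API itself; the class name `PhysMemInfo`, its field names, the memory type string, and the `dma_range` helper are all assumptions made for discussion purposes. It shows how a hardware component could derive the physical address range of a media unit from the exposed fields, so that the unit can be fetched by direct memory access rather than copied by the processor.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PhysMemInfo:
    """Hypothetical record of the physical properties of one media unit's storage."""
    memory_type: str         # kind of memory; the value used below is an assumption
    block_size: int          # size of the hardware-allocated memory block, in bytes
    base_physical_addr: int  # physical address referred to by the block pointer
    unit_offset: int         # offset of the media unit from the base pointer, in bytes
    unit_length: int         # length of the media unit, in bytes

    def dma_range(self) -> Tuple[int, int]:
        """Return (physical start address, length) for a direct memory access read."""
        if self.unit_offset < 0 or self.unit_length <= 0:
            raise ValueError("offset must be non-negative and length positive")
        if self.unit_offset + self.unit_length > self.block_size:
            raise ValueError("media unit extends past the end of the memory block")
        return self.base_physical_addr + self.unit_offset, self.unit_length


info = PhysMemInfo(
    memory_type="contiguous_physical",
    block_size=4096,
    base_physical_addr=0x80000000,
    unit_offset=1024,
    unit_length=512,
)
start, length = info.dma_range()
print(hex(start), length)
```

In this simplified model a media unit must lie entirely within the block; units that wrap around the end of a ring buffer are outside its scope.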
Sniffer/Callback APIs 422 are used to provide access by software-based elements of Presentation System 100 to certain media samples 409 (for example, “HLI,” “ADV,” and “NAV” packets multiplexed in a high-definition DVD program stream) that have been parsed by demux 404 and/or media data 132 that has been decoded/rendered by decoders/renderers 490. In one possible implementation, a DirectShow™ framework filter is connected to output pin 421 of demux 404 or an output pin (not shown) of decoders/renderers 490, and this filter is used to support the Sniffer/Callback APIs 422.
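The callback pattern described above may be sketched as a pass-through filter that notifies registered software listeners when samples of selected types go by. The sketch is not a DirectShow™ filter and does not reproduce the Sniffer/Callback APIs; the class `SnifferFilter`, its method names, and the representation of a sample as a (type, payload) pair are assumptions made for illustration.

```python
from collections import defaultdict


class SnifferFilter:
    """Hypothetical pass-through filter that reports selected packet types to software."""

    def __init__(self):
        # Maps a packet type (for example "NAV") to the callbacks registered for it.
        self._callbacks = defaultdict(list)

    def register(self, packet_type, callback):
        """Ask to be notified whenever a sample of packet_type passes through."""
        self._callbacks[packet_type].append(callback)

    def on_sample(self, packet_type, payload):
        """Invoke matching callbacks, then hand the sample on unchanged."""
        for callback in self._callbacks.get(packet_type, []):
            callback(packet_type, payload)
        return payload


seen = []
sniffer = SnifferFilter()
sniffer.register("NAV", lambda kind, data: seen.append((kind, data)))

for kind, data in [("VIDEO", b"\x00\x01"), ("NAV", b"nav-1"), ("HLI", b"hli-1")]:
    sniffer.on_sample(kind, data)

print(seen)
```

Only the sample whose type was registered reaches the listener; all samples continue downstream unmodified.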
Exemplary pseudo-code usable for implementing a Sniffer/Callback API that will detect certain types of media samples 409 or media data 132 in the context of media processing pipelines 302, 320 and/or media processing components 306, 326 is shown below.
In return, the callback renderer calls
SyncHelper APIs 416 are configured to facilitate information exchange usable to maintain perceived synchronization between media content streams 308 and 328. GetDecodeTimes API 418 is configured to provide status notifications about certain times (such as title times 209 and media sample presentation times 202) associated with times at which certain media samples (for example, media units 407 or media samples 409 deemed to be next processable media samples 270) are being prepared for presentation by a hardware component (such as demux 404 or one or more decoders/renderers 490). Information provided via the SyncToSTC API 440 may be used, among other things, to adjust timing signals 350 and/or 370 based on differences in title times 209 returned by GetDecodeTimes API 418 from different decoders/renderers (or other hardware components) processing synchronously presentable media samples.
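The exchange described above can be illustrated with a simplified model. The sketch below does not reproduce the SyncHelper APIs; `FakeDecoder` stands in for a hardware decoder/renderer, and the function names, the `correction_window` and `max_rate_change` parameters, and the proportional correction rule are all assumptions. Only the rate of the timing signal is adjusted here; adjusting which media sample is being decoded is not modeled.

```python
from dataclasses import dataclass


@dataclass
class DecodeTimes:
    title_time: float         # elapsed play duration of the presentation, in seconds
    presentation_time: float  # elapsed play duration of the clip, in seconds


class FakeDecoder:
    """Stand-in for a hardware component; real hardware would report these values."""

    def __init__(self, title_time, clip_title_start, clock_rate=1.0):
        self.title_time = title_time
        self.clip_title_start = clip_title_start
        self.clock_rate = clock_rate

    def get_decode_times(self):
        return DecodeTimes(self.title_time, self.title_time - self.clip_title_start)

    def advance(self, wall_seconds):
        """Advance the decoder's notion of title time at its current clock rate."""
        self.title_time += wall_seconds * self.clock_rate


def sync_to_stc(reference, follower, correction_window=1.0, max_rate_change=0.05):
    """Nudge the follower's clock rate so its title time converges on the reference.

    Returns the title-time difference (reference minus follower) that was observed.
    """
    delta = (reference.get_decode_times().title_time
             - follower.get_decode_times().title_time)
    rate_change = delta / correction_window
    # Limit the correction so that the rate change stays small.
    rate_change = max(-max_rate_change, min(max_rate_change, rate_change))
    follower.clock_rate = reference.clock_rate + rate_change
    return delta


reference = FakeDecoder(title_time=10.0, clip_title_start=0.0)
follower = FakeDecoder(title_time=9.98, clip_title_start=2.0)
sync_to_stc(reference, follower)
reference.advance(1.0)
follower.advance(1.0)
print(reference.title_time, follower.title_time)
```

With a small initial difference, one correction window is enough for the two title times to converge in this model; larger differences are corrected gradually because the rate change is clamped.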
Exemplary pseudo-code usable for implementing SyncHelper APIs 416 is shown below.
With continuing reference to
The processes illustrated in
Referring to the method shown in the flowchart of
In the context of media processing components 306, 326 of media processing pipelines 302, 320, respectively, hardware-allocated memory block 432 may be implemented as ring buffer 420 to enhance the efficient use of memory and processing resources. Ring buffer 420 can be viewed as having blocks that may be separately allocated, via media source reader 402 (or other components of media processing pipelines 302 or 320), for storing media units 407. The offset of each media unit 407 stored in ring buffer 420 is known, and can be expressed relative to the values of one or more pointers to locations within ring buffer 420, such as a beginning of memory (“BOM”) pointer 435, an end of memory (“EOM”) pointer 437, a beginning of used memory pointer (“BUMP”) 453, and/or an end of used memory pointer (“EUMP”) 455. As demux 404 or another hardware component obtains representations of media units 407 from ring buffer 420, BUMP 453 and/or EUMP 455 may be moved accordingly. Because media units 407 may be obtained and released out of order, a list of offsets of media units 407 within ring buffer 420 may be maintained to ensure that BUMP 453 and EUMP 455 are not permitted to bypass each other.
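The bookkeeping described above can be sketched as follows. This is a simplified model rather than the implementation contemplated herein: the class and method names are hypothetical, positions are tracked as ever-increasing logical counters (the physical offset is the position modulo the buffer size), and the placement of units that straddle the end of the buffer is not addressed. The sketch shows how a list of outstanding media units keeps the beginning-of-used-memory pointer from advancing past a unit that has not yet been released, even when units are released out of order.

```python
class RingBufferAllocator:
    """Hypothetical allocator tracking used space with BUMP and EUMP positions."""

    def __init__(self, size):
        self.size = size
        self.bump = 0     # logical position of the beginning of used memory
        self.eump = 0     # logical position of the end of used memory
        self._units = {}  # start position -> [length, released]; insertion-ordered

    def free_bytes(self):
        return self.size - (self.eump - self.bump)

    def allocate(self, length):
        """Reserve space for a media unit; return its start position or None if full."""
        if length <= 0 or length > self.free_bytes():
            return None
        start = self.eump
        self._units[start] = [length, False]
        self.eump += length
        return start

    def release(self, start):
        """Mark a unit as released; advance BUMP only over released units at the head."""
        self._units[start][1] = True
        while self._units:
            head_start, (length, released) = next(iter(self._units.items()))
            if not released:
                break
            del self._units[head_start]
            self.bump = head_start + length

    def physical_offset(self, position):
        """Offset within the physical buffer for a logical position."""
        return position % self.size


ring = RingBufferAllocator(100)
a = ring.allocate(40)
b = ring.allocate(40)
ring.release(b)           # released out of order: BUMP cannot move yet
print(ring.bump, ring.free_bytes())
ring.release(a)           # head released: BUMP moves past both units
print(ring.bump, ring.free_bytes())
```

Because BUMP only advances across a contiguous run of released units, BUMP can never pass EUMP, and space held by an unreleased unit is never reused.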
To further enhance memory use and processing efficiencies, virtual memory may be used to duplicate one or more memory blocks from the beginning of ring buffer 420 to the end of ring buffer 420. As shown, duplicate BOM block 450 (which is a duplicate of beginning-of-memory “BOM” block 450) is implemented using virtual memory, and is logically located after end-of-memory “EOM” block 441. This use of virtual memory is referred to as the “auto-wrap” function, because it is especially useful when breaking up a larger block of memory to be used in a ring buffer fashion with read and write pointers. Use of the auto-wrap function is optional—generally the provider of demux 404 can choose to provide memory that does not map twice and the media processing pipeline will still work, but may make less efficient use of memory. In such a ring buffer implementation there is the special case that the piece of memory that “wraps around” to the beginning of the buffer may require special treatment. For example, copying or otherwise obtaining the information in the portion of memory that wraps around may require two transactions—one transaction to retrieve the information in the end of the buffer, and another transaction to retrieve the information in the beginning of the buffer. Thus, it is usually difficult to take full advantage of the ring buffer size. Use of virtual memory as described above avoids the need to either allocate extra memory or skip to the end of the ring buffer (both result in inefficient use of memory) when the information size is too large to fit at the end of the ring buffer.
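The difference between the two-transaction case and the “auto-wrap” case can be illustrated with a small simulation. The sketch below does not perform a real double mapping of physical memory; it only models the addressing effect, in which an address in the second half of a double-sized virtual region refers to the same storage as the corresponding address in the first half. The class and method names are hypothetical.

```python
class PhysicalRing:
    """A physical ring buffer accessed without any double mapping."""

    def __init__(self, contents):
        self.data = bytearray(contents)
        self.size = len(self.data)

    def read_two_transactions(self, offset, length):
        """A read that wraps must be split into two separate transfers."""
        first = min(length, self.size - offset)
        parts = [bytes(self.data[offset:offset + first])]
        if first < length:
            parts.append(bytes(self.data[:length - first]))
        return parts


class AutoWrapView:
    """Simulated virtual region of twice the ring size whose second half
    aliases the first half, so a wrapping read looks contiguous."""

    def __init__(self, ring):
        self.ring = ring

    def read(self, offset, length):
        if not (0 <= offset < self.ring.size and 0 <= length <= self.ring.size):
            raise ValueError("read must start inside the ring and fit within one lap")
        # Virtual address v refers to physical address v modulo the ring size.
        return bytes(self.ring.data[(offset + i) % self.ring.size]
                     for i in range(length))


ring = PhysicalRing(b"ABCDEFGH")
print(ring.read_two_transactions(6, 4))  # two pieces: end of buffer, then beginning
print(AutoWrapView(ring).read(6, 4))     # one contiguous piece
```

In an actual implementation the aliasing would be provided by the operating system's virtual memory facilities, so that a hardware or software reader sees a single contiguous range without any per-byte address arithmetic.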
Exemplary code usable (for Microsoft® Windows CE 6.0 operating system software, although any operating system using virtual memory may be used) for implementing an “auto-wrap” feature that maps a physical piece of memory twice to a double-sized virtual memory region is shown below.
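The following Python sketch is not the Windows CE listing referred to above. It is offered only as an illustration of the effect of the auto-wrap function, and it simulates the double mapping with index arithmetic rather than with actual virtual memory mappings. The class name `AutoWrapView` and its methods are hypothetical. A buffer of `size` bytes is presented as a logical region of twice that size, in which addresses `a` and `a + size` refer to the same byte, so that a unit crossing the end of the buffer can be written or read in a single transaction.

```python
class AutoWrapView:
    """Simulates a physical block that appears twice in a doubled region."""

    def __init__(self, size):
        self.size = size
        self._phys = bytearray(size)  # the single underlying block

    def _check(self, addr, length):
        # Valid logical addresses are 0 .. 2*size - 1 (assumed for this sketch).
        if addr < 0 or length < 0 or addr + length > 2 * self.size:
            raise ValueError("span lies outside the doubled region")

    def write(self, addr, data):
        """Write `data` at logical address `addr` in one call."""
        self._check(addr, len(data))
        for i, byte in enumerate(data):
            self._phys[(addr + i) % self.size] = byte

    def read(self, addr, length):
        """Read `length` bytes at logical address `addr` in one call."""
        self._check(addr, length)
        return bytes(self._phys[(addr + i) % self.size] for i in range(length))


view = AutoWrapView(8)
view.write(6, b"ABCD")        # crosses the end of the 8-byte block
wrapped = view.read(6, 4)     # one read, no split into two transactions
```

With a real double mapping, the modulo arithmetic is performed by the memory management hardware, and the caller simply uses a contiguous address range.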
Referring again to the flowchart of
In the context of media processing components 306, 326 implemented using DirectShow™ frameworks, media source reader 402 uses data structures such as IMediaSampleObjects to provide all or some of the following information to downstream hardware components: pointers to memory 430 and/or hardware-allocated memory block 432; size of memory 430 and/or hardware-allocated memory block 432; start and stop times of media units 407; flag(s); and any other desired information. Advantageously, information regarding properties of memory blocks of ring buffer 420 allocated by media source reader 402 for access by demux 404 (and other hardware components) is exposed via PhysMemDataStructure API 410, which may also be provided by a data structure (or fields thereof) such as the IMediaSampleObject. Physical memory information derived by demux 404 and other hardware components from the PhysMemDataStructure API 410 is used to directly access the storage locations of individual media content units 407 within ring buffer 420, largely obviating the need for processor-intensive copy transactions such as “memcopy” transactions. Information regarding properties of hardware-allocated memory block 432 that is exposed via the PhysMemDataStructure API 410 includes, but is not limited to: the type of memory 432; a size of a memory block of the memory; a location of one or more pointers 437, 435, 453, or 455 to the memory; and an offset location of a particular media unit 407 with respect to one or more pointers to the memory.
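The kinds of fields listed above can be pictured with a short Python sketch. It is not the PhysMemDataStructure API itself; the names `MemoryType`, `PhysMemInfo`, and `transfer_descriptor` are hypothetical, the memory types are those enumerated in the claims, and the addresses are invented for illustration. The sketch shows how a hardware component could derive a source address and length for a direct transfer from the exposed fields, without the processor copying the unit.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    CONTIGUOUS_PHYSICAL = "contiguous physical"
    LOCKED_SCATTER_GATHER = "locked scatter-gathered physical"
    UNLOCKED_SCATTER_GATHER = "unlocked scatter-gathered physical"
    CACHED_VIRTUAL = "cached virtual"
    UNCACHED_VIRTUAL = "uncached virtual"


@dataclass(frozen=True)
class PhysMemInfo:
    """Hypothetical per-unit record of physical memory properties."""
    memory_type: MemoryType
    block_size: int    # size of the memory block holding the unit
    bom: int           # beginning-of-memory pointer (illustrative address)
    eom: int           # end-of-memory pointer (illustrative address)
    unit_offset: int   # offset of the media unit relative to `bom`
    unit_length: int   # length of the media unit in bytes


def transfer_descriptor(info):
    """Return (source_address, length) for a direct transfer of the unit."""
    if info.memory_type is not MemoryType.CONTIGUOUS_PHYSICAL:
        # Other memory types would need page lists or address translation,
        # which this sketch does not model.
        raise NotImplementedError("only contiguous physical memory is modeled")
    return info.bom + info.unit_offset, info.unit_length
```

A record of this kind would accompany each media unit passed downstream, so that the receiving component works from the unit's physical location rather than from a copy of its contents.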
Referring to the method shown in the flowchart of
Generally, software-based components of Presentation System 100 (such as aspects of presentation manager 106) are aware of currently playable clips 123. In the context of media processing components 306, 326 of media processing pipelines 302, 320, respectively, it is possible to use Sniffer/Callback APIs 422 to identify specific media units 407 and/or media samples 409 being processed by demux 404 and/or decoders/renderers 490.
As indicated at block 610, certain information is ascertained at a first time—the first time associated with when the media sample from the first clip is undergoing preparation for presentation by a first hardware component, such as demux 404 or decoder/renderer 490 within media processing pipeline1 302. The following information is ascertained at block 610: an elapsed amount of the play duration of the digital media presentation, and an elapsed amount of the play duration of the first clip.
As indicated at block 612, certain information is ascertained at a second time, the second time being associated with when the media sample from the second clip is undergoing preparation for presentation by a second hardware component, such as demux 404 or decoder/renderer 490 within media processing pipeline2 320. The following information is ascertained at block 612: an elapsed amount of the play duration of the digital media presentation, and an elapsed amount of the play duration of the second clip.
As discussed above in connection with the exemplary media timeline shown in
At block 614, the difference between the elapsed amount of the play duration of the digital media presentation calculated at block 610 and the elapsed amount of the play duration of the digital media presentation calculated at block 612 is ascertained, and, as indicated at block 616, is usable to adjust timing of the hardware components for preparing and/or presenting media samples.
In the context of media processing components 306, 326 of media processing pipelines 302, 320, respectively, the SyncToSTC API 440 is configured to use information obtained via the GetDecodeTimes API 418 to synchronize various media content streams from different hardware components, by applying deltas (based on the difference, ascertained at block 614, between the elapsed amounts of the play duration of the digital media presentation) to processing times and/or timing signals, such as timing signals 350 and 370. It will be appreciated that the SyncToSTC API 440 can also be used to synchronize media content streams with other playback constraints (for example, as defined by a playlist).
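The calculation of blocks 610 through 616 can be sketched in Python as follows. The sketch is illustrative only; the names `DecodeTimes`, `title_time_delta`, and `apply_delta` are hypothetical and are not the signatures of the GetDecodeTimes or SyncToSTC APIs. The use of integer milliseconds, the sign convention, and the choice to adjust the second pipeline are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodeTimes:
    """Values ascertained while a hardware component prepares a sample."""
    title_elapsed_ms: int  # elapsed play duration of the presentation
    clip_elapsed_ms: int   # elapsed play duration of the clip


def title_time_delta(first, second):
    """Block 614: difference between the two elapsed presentation times."""
    return first.title_elapsed_ms - second.title_elapsed_ms


def apply_delta(scheduled_ms, delta_ms):
    """Block 616: shift a scheduled processing time by the delta."""
    return scheduled_ms + delta_ms


# Block 610: values reported for the sample from the first clip.
first = DecodeTimes(title_elapsed_ms=10_000, clip_elapsed_ms=2_000)
# Block 612: values reported for the sample from the second clip.
second = DecodeTimes(title_elapsed_ms=10_040, clip_elapsed_ms=500)

delta = title_time_delta(first, second)
adjusted = apply_delta(10_500, delta)  # second pipeline's next scheduled time
```

Here the second pipeline reports a presentation time 40 ms ahead of the first, so its next scheduled processing time is moved by that amount to bring the two streams back into alignment.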
With continued reference to
As shown, operating environment 700 includes or accesses components of a computing unit, including one or more processors 702, computer-readable media 704, and computer programs 706. Processor(s) 702 is/are responsive to computer-readable media 704 and to computer programs 706. Processor(s) 702 may be physical or virtual processors, and may execute instructions at the assembly, compiled, or machine-level to perform a particular process. Such instructions may be created using source code or any other known computer program design tool.
Computer-readable media 704 represent any number and combination of local or remote devices, in any form, now known or later developed, capable of recording, storing, or transmitting computer-readable data, such as the instructions executable by processor 702. In particular, computer-readable media 704 may be, or may include, a semiconductor memory (such as a read only memory (“ROM”), any type of programmable ROM (“PROM”), a random access memory (“RAM”), or a flash memory, for example); a magnetic storage device (such as a floppy disk drive, a hard disk drive, a magnetic drum, a magnetic tape, or a magneto-optical disk); an optical storage device (such as any type of compact disk or digital versatile disk); a bubble memory; a cache memory; a core memory; a holographic memory; a memory stick; a paper tape; a punch card; or any combination thereof. Computer-readable media 704 may also include transmission media and data associated therewith. Examples of transmission media/data include, but are not limited to, data embodied in any form of wire line or wireless transmission, such as packetized or non-packetized data carried by a modulated carrier signal. The above notwithstanding, computer-readable media 704 does not include any form of propagated data signal.
Computer programs 706 represent any signal processing methods or stored instructions that electronically control predetermined operations on data. In general, computer programs 706 are computer-executable instructions implemented as software components according to well-known practices for component-based software development, and encoded in computer-readable media (such as computer-readable media 704). Computer programs may be combined or distributed in various ways.
Storage 714 includes additional or different computer-readable media associated specifically with operating environment 700, such as an optical disc or other portable medium (optical discs are handled by optional optical disc drive 716). One or more internal buses 720, which are well-known and widely available elements, may be used to carry data, addresses, control signals and other information within, to, or from operating environment 700 or elements thereof.
Input interface(s) 708 provide input to operating environment 700. Input may be collected using any type of now known or later-developed interface, such as a user interface. User interfaces may be touch-input devices such as remote controls, displays, mice, pens, styluses, trackballs, keyboards, microphones, scanning devices, and all types of devices that are used to input data.
Output interface(s) 710 provide output from operating environment 700. Examples of output interface(s) 710 include displays, printers, speakers, drives (such as optical disc drive 716 and other disc drives or storage media), and the like.
External communication interface(s) 712 are available to enhance the ability of operating environment 700 to receive information from, or to transmit information to, another entity via a communication medium such as a channel signal, a data signal, or a computer-readable medium. External communication interface(s) 712 may be, or may include, elements such as cable modems, data terminal equipment, media players, data storage devices, personal digital assistants, or any other device or component/combination thereof, along with associated network support devices and/or software or interfaces.
On client-side 802, one or more clients 806, which may be implemented in hardware, software, firmware, or any combination thereof, are responsive to client data stores 808. Client data stores 808 may be computer-readable media 704, employed to store information local to clients 806. On server-side 804, one or more servers 810 are responsive to server data stores 812. Like client data stores 808, server data stores 812 may include one or more computer-readable media 704, employed to store information local to servers 810.
Various aspects of a presentation system that is used to present interactive content to a user synchronously with media content have been described. It will be understood, however, that all of the described components of the presentation system need not be used, nor must the components, when used, be present concurrently. Functions/components described in the context of Presentation System 100 as being computer programs are not limited to implementation by any specific embodiments of computer programs. Rather, functions are processes that convey or transform data, and may generally be implemented by, or executed in, hardware, software, firmware, or any combination thereof.
Although the subject matter herein has been described in language specific to structural features and/or methodological acts, it is also to be understood that the subject matter defined in the claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
It will further be understood that when one element is indicated as being responsive to another element, the elements may be directly or indirectly coupled. Connections depicted herein may be logical or physical in practice to achieve a coupling or communicative interface between elements. Connections may be implemented, among other ways, as inter-process communications among software processes, or inter-machine communications among networked computers.
The word “exemplary” is used herein to mean serving as an example, instance, or illustration. Any implementation or aspect thereof described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other implementations or aspects thereof.
As it is understood that embodiments other than the specific embodiments described above may be devised without departing from the spirit and scope of the appended claims, it is intended that the scope of the subject matter herein will be governed by the following claims.
Claims
1. A method for preparing media content as a media presentation by a software-based media player including a decoder/renderer, the media content receivable from a media source as a plurality of media content units, the method comprising:
- identifying a portion of a memory having blocks that are configured to be separately allocated, the portion allocated for storing media content units comprising individually-presentable portions of clips received from the media source, the individually-presentable portions of clips being multiplexed into a single program stream comprising the media presentation received at a hardware component, the hardware component providing scheduling information, responsively to a timing signal, to the decoder/renderer to enable time-based synchronization of the individually-presentable portions of clips;
- identifying a plurality of media content units received from the media source;
- identifying storage locations and offsets within the memory in which each of the plurality of media content units has been stored in the allocated portion of the memory;
- forming data structures associated with each of the plurality of media content units, the data structures each having a field for storing information about the storage location of a particular media content unit;
- arranging for exposure of the data structures to the hardware component, the information about the storage locations of the particular media content units obtained from the data structures usable by the hardware component to directly access the particular media content units from the memory without using a central processing unit; and
- after a particular media content unit has been accessed, releasing the particular media content unit from the storage location.
2. The method according to claim 1, wherein the portion of the memory allocated for storing media content units received from the media source is selected from the group consisting of: a contiguous physical memory, a locked scatter-gathered physical memory, an unlocked scatter-gathered physical memory, a cached virtual memory, and an uncached virtual memory, and wherein the information about the storage locations exposed by the data structure is selected from the group consisting of: a type of the memory; a size of a memory block of the memory; a location of a pointer to the memory; and an offset location of the storage location with respect to a pointer to the memory.
3. The method according to claim 1, wherein the method is performed within a media processing pipeline comprising a filter graph having a plurality of filters, each filter having one or more input pins and one or more output pins.
4. The method according to claim 3, wherein the hardware component comprises one of the plurality of filters, and wherein the hardware component is selected from the group consisting of: a demultiplexer, a decoder, and a renderer.
5. The method according to claim 3, wherein the method is performed by a software component comprising one of the plurality of filters, the software component having an input pin configured to receive media content units and an output pin configured for communication with the hardware component.
6. The method according to claim 5, wherein the method step of forming data structures associated with each of the plurality of media content units, the data structures each having a field for storing information about the storage locations of a particular media content unit, comprises:
- issuing, by the software component, a call to an application programming interface (“API”) configured to receive information about a storage location of a particular media content unit and to populate the field of a corresponding data structure with the information about the storage location; and
- receiving, by the software component, a response from the application programming interface exposing the field of the corresponding data structure with the information about the storage location.
7. The method according to claim 6, wherein the step of arranging for exposing data structures to a hardware component comprises transmitting, on the output pin configured for communication with the hardware component, the data structure with the field with the information about the storage location exposed.
8. A method for preparing media content as a media presentation by a software-based media player including a decoder/renderer, the media content receivable from a media source as a plurality of media content units, the method comprising:
- identifying a portion of a memory for storing media content units comprising individually-presentable portions of clips received from the media source, the individually-presentable portions of clips being multiplexed into a single program stream comprising the media presentation received at a hardware component, the hardware component providing scheduling information, responsively to a timing signal, to the decoder/renderer to enable time-based synchronization of the individually-presentable portions of clips;
- identifying a first media content unit received from the media source;
- identifying a first storage location in which the first media content unit has been stored in the portion of the memory;
- forming a first data structure associated with the first media content unit, the first data structure having a field for storing information about the storage location;
- arranging for exposing the first data structure to the hardware component, the information about the storage location obtained from the first data structure usable by the hardware component to directly access the first media content unit from the memory without using a central processing unit;
- wherein the step of forming a first data structure comprises: issuing a call to an application programming interface (“API”) configured to receive information about the storage location and to populate the field of the first data structure with the information about the storage location; and receiving a response from the application programming interface exposing the field of the first data structure with the information about the storage location.
9. The method according to claim 8, wherein the method is performed within a media processing pipeline comprising a filter graph having a plurality of filters, each filter having one or more input pins and one or more output pins.
10. The method according to claim 9, wherein the method is performed by a software component comprising one of the plurality of filters, the software component having an input pin configured to receive media content units and an output pin configured for communication with the hardware component.
11. The method according to claim 10, wherein the step of arranging for exposing the first data structure to a hardware component comprises transmitting, on the output pin configured for communication with the hardware component, the first data structure with the field of the first data structure exposed.
12. The method according to claim 8, wherein the portion of the memory allocated for storing media content units received from the media source is arranged as a ring buffer, the ring buffer having a beginning memory block and an ending memory block.
13. The method according to claim 12, wherein the beginning memory block is duplicated, using virtual memory, after the ending memory block.
14. The method according to claim 13, wherein the method further comprises the step of:
- when the storage location would include the beginning memory block and the ending memory block, storing a first portion of the first media content unit in the ending memory block and a second portion of the first media content unit in the beginning memory block duplicated using virtual memory.
15. The method according to claim 14, wherein after a particular media content unit has been accessed, the particular media content unit is released from the storage location.
16. The method according to claim 12, wherein the ring buffer is implemented using a begin pointer for referencing the beginning of used memory in the ring buffer and an end pointer for referencing the end of used memory in the ring buffer, and wherein the step of identifying a first storage location for the first media content unit in the allocated portion of the memory comprises identifying an offset of the first media content unit within the ring buffer, the offset specified relative to the begin pointer or the end pointer or both.
17. The method according to claim 16, the method further comprising:
- after the first media content unit has been transmitted, releasing the first media content unit from the first storage location by moving the begin pointer or the end pointer or both.
18. The method according to claim 16, further comprising:
- maintaining a list of offset positions including storage locations of all media content units stored in the ring buffer; and
- using the list of offset positions to ensure that the begin pointer and the end pointer do not bypass each other when media content units are released from the storage locations.
19. An architecture for a media processing pipeline for use in a media presentation system, the architecture comprising:
- a software component configured to receive media content units and to arrange for storage of the received media content units in a first memory; and
- a hardware component having a second memory, the hardware component responsive to the software component and configured to transfer representations of media content units stored in the first memory to the second memory, wherein said software component is operable to form a first data structure, associated with a first media content unit, having a field for storing information about a first storage location, by issuing a call to an application programming interface (“API”) configured to receive information about the first storage location, and to populate the field, and to receive a response from the application programming interface exposing the field of the first data structure with the information about the first storage location, wherein the API is operable at a boundary between said software component and said hardware component, the API configured to expose, at the request of said software component, the at least one data structure field having information about the first storage location of the first media content unit within the first memory to said hardware component, the exposed data structure field enabling said hardware component to directly transfer representations of media content units stored in the first memory to the second memory without using a central processing unit.
20. The architecture according to claim 19, wherein the software component and the hardware component comprise filters in a filter graph, the software component has an output pin that connects to an input pin of the hardware component, and the API is responsive to a call from the output pin of the software component to expose the data structure field to the input pin of the hardware component.
Type: Application
Filed: May 1, 2015
Publication Date: Sep 24, 2015
Inventors: Rajasekaran Rangarajan (Kirkland, WA), Martin Regen (Bavaria), Richard Gains Russell (Sammamish, WA)
Application Number: 14/702,300