Distributed And Automated Video Encoding And Delivery System
At a user or client site, a user initially visits a website hosted by a partner, and following an initial interaction with the website, the user is provided with various software from an administrator website. The software enables the user to upload information, such as a video file, to the partner website. The information is encoded in a format that was pre-selected by the partner so that the delivery process is streamlined for the partner. In addition, for further efficiency, the delivery process from the user to the partner is implemented substantially simultaneously with the encoding.
This application claims the benefit of U.S. patent application Ser. No. 11/903,770, filed on Sep. 24, 2007, entitled “Distributed and Automated Video Encoding and Delivery System” which claims the benefit of U.S. provisional patent application No. 60/847,296, filed on Sep. 25, 2006.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to digital video encoding and distribution.
2. Description of the Related Art
Video encoding is time consuming and complicated, and is often a multi-step process that involves using more than one tool. This is a major challenge to video aggregators who receive videos in all possible formats. Video tapes, DVDs, CDs, and all kinds of media files are submitted daily by the thousands world wide. These video submissions then have to be converted and “normalized” to one format for editing, broadcasting and publishing.
Some tools currently on the market are in the form of software and hardware applications for post-production facilities. These tools encode video material after a tape or file has been received by the facility. They are complicated to use and require an expert staff to operate them. Furthermore, they do nothing to help ease the process for people submitting their content.
BRIEF SUMMARY OF THE INVENTION
As the number of videos submitted to news agencies, viral video shows and Internet video sites increases, it is advantageous to design an automated video conversion and distribution system over the Internet or similar network that enables users to easily submit video while at the same time delivering the proper format to these media companies. Further, a self-scalable system can be obtained if the transfer and conversion process is performed at the submitting computer as opposed to the receiving company's servers. In accordance with an embodiment of the invention, automatic web encoding (AWE) is provided using a web framework of plug-ins that allow users to submit video and audio content while simultaneously converting and delivering the output to a local or remote location on a computer network. Features of embodiments of the invention may include individual encoding and decoding filters for all file formats, individual image and audio processing filters, multiplexing and de-multiplexing features, IP transmission protocol output filters (FTP, UDP, HTTP, etc.), and so forth.
It will be appreciated that delivering a video as it is being encoded can be more efficient than performing the encoding first and then transmitting the file. But greater advantage can be achieved by adjusting key encoding parameters dynamically to maintain throughput and optimize for quality.
FIGS. 3 and 3A-3D are schematic diagrams of various aspects involved with the AWE platform.
Embodiments of the present invention are described herein in the context of digital video encoding and distribution. Those of ordinary skill in the art will realize that the following detailed description of the present invention is illustrative only and is not intended to be in any way limiting. Other embodiments of the present invention will readily suggest themselves to such skilled persons having the benefit of this disclosure. Reference will now be made in detail to implementations of the present invention as illustrated in the accompanying drawings. The same reference indicators will be used throughout the drawings and the following detailed description to refer to the same or like parts.
In the interest of clarity, not all of the routine features of the implementations described herein are shown and described. It will, of course, be appreciated that in the development of any such actual implementation, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the art having the benefit of this disclosure.
In the context of the present invention, the term “network” includes local area networks, wide area networks, the Internet, cable television systems, telephone systems, wireless telecommunications systems, fiber optic networks, ATM networks, frame relay networks, satellite communications systems, and the like. Such networks are well known in the art and consequently are not further described here.
In accordance with one embodiment of the present invention, the components, processes and/or data structures may be implemented using C or C++ programs running on high performance computers (such as an Enterprise 2000™ server running Sun Solaris™ as its operating system). The Enterprise 2000™ server and Sun Solaris™ operating system are products available from Sun Microsystems, Inc. of Mountain View, Calif. Different implementations may be used and may include other types of operating systems, computing platforms, computer programs, firmware, computer languages and/or general-purpose machines. In addition, those of ordinary skill in the art will recognize that devices of a less general purpose nature, such as hardwired devices, processors and microprocessors, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), or the like, may also be used without departing from the scope and spirit of the inventive concepts disclosed herein.
The automatic web encoding (AWE) platform is a distributed video encoding and delivery application that is designed for use, for example, by digital video aggregators and distributors (referred to herein as partners). AWE includes a web plug-in whereby partners can customize the look and feel of the plug-in and specify the video encoding format and parameters in which they wish to receive video submissions from the users. Partners customize these settings by creating profiles that are stored at a service provider or administrator site. The profiles are applied to each instance of the web (AWE) plug-in in a partner's web page. At the user or client site, a user initially visits a website hosted by the partner, and following an initial interaction with the website, the user is provided with various software from an administrator website. The software enables the user to upload information, such as a video file, to the partner website. The information is encoded in a format that was pre-selected by the partner so that the delivery process is streamlined for the partner. In addition, for further efficiency, the delivery process from the user to the partner is implemented substantially simultaneously with the encoding.
As illustrated diagrammatically in
With reference to FIGS. 3 and 3A-D, each partner 104 (multiple partners are contemplated) creates an account on the service provider or administrator website 302 and hosts its own website 303 accessible by its users (submitting computers 102,
Before running, the AWE plug-in component verifies that the partner web page 303 in which it is embedded belongs to a licensed partner 104. To achieve this, licensed partners 104 receive the HTML object tag 402 to include in their web page. When one of the partner's users 102 opens web page 303, a call is made to the web service of the service provider or administrator 108 to initiate a license verification process. When the web service receives the call, it extracts the URL of the calling page and uses it to look up the domain name in the partner database and extract a partner ID. If no valid partner ID is found, the call fails. When a valid partner ID is retrieved for a domain, the web service returns an object tag 404 that contains the code to embed the AWE plug-in component and its initialization parameters. At this point the component is instantiated and initialized with the parameters in the object tag 404, and it calls another web service at the service provider or administrator 108 website, termed OCXVerificationService. If validation succeeds, the method returns true and fills in the partner ID data member with a valid partner ID.
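The domain lookup performed by the web service can be illustrated with a minimal Python sketch. It is not the described implementation: the `PARTNERS` table, the function names `verify_license` and `handle_call`, and the example domains are all hypothetical and serve only to show the URL-to-domain-to-partner-ID step.

```python
from urllib.parse import urlparse

# Hypothetical partner database: licensed domain name -> partner ID.
PARTNERS = {
    "videos.example.com": "partner-001",
    "news.example.org": "partner-002",
}


def verify_license(calling_url):
    """Return the partner ID for the calling page's domain, or None if unlicensed."""
    host = urlparse(calling_url).hostname
    if host is None:
        return None
    return PARTNERS.get(host.lower())


def handle_call(calling_url):
    """Mimic the web service: the call fails unless a valid partner ID is found."""
    partner_id = verify_license(calling_url)
    if partner_id is None:
        return {"ok": False, "partner_id": None}
    return {"ok": True, "partner_id": partner_id}
```

In this sketch a failed lookup simply returns a result with `ok` set to false; how the actual service reports failure is not specified in the text.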
After successful license validation, the AWE plug-in component receives a valid partner ID which is used to retrieve the “look and feel” 304 and encoding 306 profiles from the service provider or administrator server 305 or associated database. This is accomplished by calling a web service in the service provider or administrator website, which is termed OCXWebService.
The AWE plug-in makes a call to the OCXWebService, which in turn retrieves the profile information in XML form for the supplied partner ID. The AWE plug-in component then uses the supplied profile data to initialize itself.
The AWE plug-in is a multi-dialog application with color customization that embeds in web page 303. The look and feel of these dialogs upon instantiation at the user 102 location are fully customizable, as selected by the partner 104 and stored at the service provider or administrator server 305 or an associated database. The color, shape and number of buttons and features that appear available to a user are chosen by a company partner and specified in the look and feel profile 304. The look and feel profile 304 is obtained by the AWE plug-in via web services in XML format. Upon initialization, the AWE plug-in uses the look and feel data to create its dialogs according to the look and feel profile 304.
Encoding profile parameters are stored in the encoding profile 306. Encoding parameters are applied to an encoding object when a video is submitted. The encoding profile 306 is retrieved via web services from the company website. Encoding profiles specify, for example, the file format, video codec and audio codec to be applied to the file to be delivered, as well as more specific parameters such as frame rate and frame resolution.
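As an illustration of how a client might read such a profile, the following Python sketch parses an XML document into a typed structure. The text does not give the actual schema, so the element names, the `partnerId` attribute, the `EncodingProfile` class and the sample values are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical profile document; the real XML schema is not given in the text.
SAMPLE_PROFILE = """
<encodingProfile partnerId="partner-001">
  <fileFormat>mp4</fileFormat>
  <videoCodec>mpeg4</videoCodec>
  <audioCodec>aac</audioCodec>
  <frameRate>29.97</frameRate>
  <width>640</width>
  <height>480</height>
</encodingProfile>
"""


@dataclass
class EncodingProfile:
    partner_id: str
    file_format: str
    video_codec: str
    audio_codec: str
    frame_rate: float
    width: int
    height: int


def parse_profile(xml_text):
    """Parse the (assumed) profile XML into an EncodingProfile."""
    root = ET.fromstring(xml_text.strip())

    def text(tag):
        # Fail loudly if a required element is absent or empty.
        node = root.find(tag)
        if node is None or node.text is None:
            raise ValueError(f"missing <{tag}> in encoding profile")
        return node.text.strip()

    return EncodingProfile(
        partner_id=root.get("partnerId", ""),
        file_format=text("fileFormat"),
        video_codec=text("videoCodec"),
        audio_codec=text("audioCodec"),
        frame_rate=float(text("frameRate")),
        width=int(text("width")),
        height=int(text("height")),
    )
```

Calling `parse_profile(SAMPLE_PROFILE)` yields an object whose fields could then be applied to an encoding object when a video is submitted.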
With respect to encoding, video creator objects in the form of libraries or software components create specific video types. For example, MPEGCreator objects are used to produce MPEG-1, MPEG-2 and MPEG-4 video files with audio in PCM, AAC, MP3 and MPL2. WMVCreator objects create videos in WMV7, WMV8 and WMV9 formats with WMA and WMA Professional audio codecs. When a video encoding process begins (details provided below), other dialog windows within the AWE plug-in update the user 102 on encoding progress and elapsed time and give a preview of the resulting file. The encoding process is described in the flow diagram of
While encoding is performed, an encoded video packet is delivered to receiving or upload server 309 (
The AWE plug-in is a client application downloaded on the user computer 102 and having many components that get called as needed to perform a particular task, as can be seen from
The dialogs are configured to guide the user through the submission process in an intuitive manner. For instance, first, a user 102 selects the video source. Second, the user 102 selects the file, capture, or import depending on the device from which the source video originates. Third, videos can be reviewed and the beginning, end, or other portions can be edited or “trimmed”. Then the user can submit the video. When the user initiates submission, encoding of the video into the format pre-selected by the partner 104 and indicated in the encoding profile that was downloaded to the user takes place automatically. Encoding takes place at the user location, and is performed substantially simultaneously with delivery, such that while a portion of the data stream, for example a packet or a GOP (group of pictures) or some other increment of information, is being delivered, a succeeding portion is being encoded in preparation for delivery. The user waits for the completion while watching a progress dialog which updates a progress bar and the elapsed time. When the submission is complete, a success dialog appears and gives the user the opportunity to make another submission or to terminate the application.
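The overlap between encoding and delivery described above can be sketched as a two-stage pipeline. The Python example below is only an illustration under stated assumptions: `zlib` stands in for a video encoder, the `send` callback stands in for the network transfer, and the function name `encode_and_deliver` and the buffer size of two packets are hypothetical choices.

```python
import queue
import threading
import zlib


def encode_and_deliver(chunks, send):
    """Encode chunks in one thread while another thread delivers finished packets."""
    packets = queue.Queue(maxsize=2)  # small buffer keeps the encoder just ahead of the sender
    done = object()  # sentinel marking the end of the stream

    def encoder():
        for chunk in chunks:
            # zlib is a stand-in for the real video encoder (assumption).
            packets.put(zlib.compress(chunk))
        packets.put(done)

    def sender():
        while True:
            packet = packets.get()
            if packet is done:
                break
            send(packet)  # deliver one packet while the encoder works on the next

    workers = [threading.Thread(target=encoder), threading.Thread(target=sender)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

The bounded queue means the encoder blocks when it gets too far ahead of delivery, which is one simple way to keep the two stages working on adjacent portions of the stream.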
Dynamic Encoding Process
Digital video is the representation of a sequence of images and sound in a binary format. This binary representation produces files that are relatively large, on the order of a couple of gigabytes per minute of video, and so they are often compressed.
Video compression reduces the number of bits required to represent a sequence of images and sound in a binary file. This reduction in file size requires that computations be performed at both the compression and decompression stages.
Several algorithms are used in video compression today. Some are applied to individual pictures by themselves (intra-picture compression), and others take advantage of similarities between adjacent frames in the sequence (inter-picture compression).
Intra-picture compression or coding simply compresses each frame or field, in a sequence of frames, as a self-contained compressed picture. Several methods are available today to produce lossless or lossy compressed pictures, including JPEG, DCT, RLE and LZW. The images produced by this compression method are referred to as I frames.
Inter-picture compression or coding exploits the similarities between successive pictures in a sequence. Picture differences between frames are computed and transmitted instead of the full frame information. This method potentially saves a great deal of redundant picture information. There are often picture elements that remain static across a sequence of images and there are other elements that retain their shape but change position due to motion. Identifying such image objects and computing their change in position across a sequence of frames is known as motion estimation. Images produced with this type of compression are called P or B frames. P frames are only dependent on the prior frame whereas B frames depend on both prior and subsequent frames.
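The idea of transmitting picture differences instead of full frames can be illustrated with a deliberately simplified Python sketch. It uses plain per-pixel differencing on flat lists of pixel values and performs no motion estimation, so it is not how an actual codec builds P or B frames; the function names `encode_delta` and `apply_delta` are hypothetical.

```python
def encode_delta(prev, curr):
    """Return (index, new_value) pairs for the pixels that changed since prev."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of pixels")
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev)
    for index, value in delta:
        frame[index] = value
    return frame
```

When most of the picture is static, the list of changes is much shorter than the frame itself, which is the redundancy that inter-picture coding exploits.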
Video encoding and compression algorithms often include a combination of these types of compressed images, but the image sequence starts with an I frame and is typically followed by a combination of P or B frames. A group of pictures (GOP) is then defined as a sequence of frames that starts with an I frame, which is followed by a combination of B and P frames. In a compressed video stream, a GOP is delimited by two I frames. The number of P and B frames in the GOP could be constant and periodic, which produces a constant bit rate (CBR) video stream, or it could be variable and non-periodic, which produces a variable bit rate (VBR) video stream.
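The GOP definition given here, a run of frames that begins with an I frame and extends until the next I frame, can be made concrete with a short Python sketch. Representing a stream as a string of the letters I, P and B, and the function name `split_into_gops`, are assumptions made only for this illustration.

```python
def split_into_gops(frame_types):
    """Split a string of frame types such as 'IBBPIPP' into GOPs.

    Each GOP starts at an I frame and runs until the next I frame.
    """
    gops = []
    for frame_type in frame_types:
        if frame_type not in "IPB":
            raise ValueError(f"unknown frame type: {frame_type!r}")
        if frame_type == "I":
            gops.append("I")  # an I frame opens a new GOP
        elif not gops:
            raise ValueError("sequence must start with an I frame")
        else:
            gops[-1] += frame_type  # P and B frames extend the current GOP
    return gops
```

A stream in which every GOP returned by this function has the same length corresponds to the constant, periodic case described above; differing lengths correspond to the variable case.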
With CBR, the quality of the picture can vary across individual frames as complex images across the GOP can introduce artifacts. On the other hand, VBR could produce a constant image quality as algorithms that interpret the complexity of the images in the sequence can be used to determine what compression algorithm is best.
The Dynamic Encoding Protocol (DEP) is essentially a method for encoding and transmitting compressed video packets that adjusts the encoding according to the connection speed to deliver sustained video payloads while preserving image quality. DEP takes advantage of common video compression techniques and uses any available digital file transfer protocols or mechanisms.
With reference to
Thus, to ensure sustained video packet transmission while simultaneously encoding such packets in the data stream, a packet of compressed video data of size ζ or a multiple of size ζ should be produced in the time interval Δt or multiple time interval of Δt.
Additionally a number N of frames can be set by the user to be transmitted in the time Δt or multiple of time Δt with a pre-determined quality index I. The higher the value of I, the greater the image quality, resulting in a greater file size. A GOP (group of pictures) can be interpreted as the minimum data packet to be transmitted. The number of frames inside the GOP is defined by the user or is determined automatically by the system.
For a data packet that contains a GOP with a fixed number of frames, the size in bytes can be determined by using the equation below:
Where the GOP size in turn is the sum of the sizes of the P, B and I frames that make it. Calling the sum of the sizes of all of the P frames p and the sum of the sizes of all of the B frames β and the size of the I frame γ, we obtain:
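Written out from the definitions just given (p for the summed P frame sizes, β for the summed B frame sizes, and γ for the I frame size), the GOP size would take the following form. The symbol S_GOP is an assumption introduced here to denote the GOP size in bytes; it is not taken from the original equations.

```latex
\[
  S_{\mathrm{GOP}} = p + \beta + \gamma
\]
```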
By the GOP definition it is known that there is only one I frame. Its size and resolution can vary, but they can be set to be constant or constrained, for example, to be no greater than half of the original frame size. The I frame size γ can therefore be treated as a constant and thus a known parameter. Also, B and P frames are relatively close in size, although the quality of the B frames is generally better than that of the P frames because B frames draw on both prior and subsequent frames in their creation. Therefore, the number of B frames included in a GOP, if any, can be incorporated into the quality coefficient I. These assumptions produce the simplified equation below:
To produce a constant bit rate stream that is optimized for a given connection or file transfer rate, a GOP must contain a number of P (or a combination of B and P) frames whose sizes add up to p as shown in the above equation. The number of P (or a combination of B and P) frames can be pre-set by the user or it can be calculated by the system. Equation (4) can be simplified using (1) such that:
Additionally, the quality of the interpolated frames themselves (P or B) is affected by the algorithms that create the vectors used in the motion estimation process. In general, algorithms that produce accurate motion vectors are time consuming and a balance must be reached when deciding how many interpolated frames are to be used in the GOP. Here, a quality motion estimation parameter μ is introduced where:
And if we solve for p to find the optimum number of interpolated frames to be included in the GOP we have:
The minimum interpolated frame (P or B) size in bytes is related to how much the picture has changed in adjacent frames. If the picture has not changed much, only the motion vector information will be sent in the video stream to represent that frame, reducing the overall file size.
Based on the above, it can be seen that the dynamic encoding process (DEP) has the advantage of producing a file with relatively constant video quality while being substantially simultaneously delivered through a network connection. If the network connection is a fast connection, the time used to compress the video is short. In this case, there is not enough time to compress the video much, which produces relatively larger file sizes with higher quality. But since the video file is being transferred through a fast connection, the file transfers will not be slowed down and the overall video throughput can be sustained at its maximum capacity. If on the other hand the network connection is slow, there will be more time available to compress the file, which yields a smaller file size. Conventionally, higher compression ratios produce lower quality video in general because the algorithms for producing good quality compression are time consuming and often avoided. But in the present case, the system has the time to use complex algorithms that produce high quality compressed video whose file sizes can be substantially reduced. Since the file sizes will be reduced substantially, the files can be delivered faster through a slow network connection.
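The trade-off described above, in which a slower link leaves more time to compress each packet, can be sketched in Python as follows. This is an illustration only and not the DEP equations themselves: `zlib` levels stand in for encoder effort settings, the `ESTIMATED_SECONDS` cost table is a hypothetical model, and the function names `pick_level` and `encode_chunk` are assumptions.

```python
import zlib

# Hypothetical cost model: estimated seconds needed to compress one chunk
# at each zlib level (a higher level is slower but gives smaller output).
ESTIMATED_SECONDS = {1: 0.01, 6: 0.05, 9: 0.20}


def pick_level(packet_bytes, link_bytes_per_second):
    """Pick the highest compression level that fits within the time the
    link needs to deliver one packet of packet_bytes."""
    budget = packet_bytes / link_bytes_per_second
    affordable = [level for level, cost in ESTIMATED_SECONDS.items() if cost <= budget]
    # On a very fast link nothing fits the budget, so fall back to the cheapest level.
    return max(affordable) if affordable else min(ESTIMATED_SECONDS)


def encode_chunk(chunk, packet_bytes, link_bytes_per_second):
    """Compress one chunk with the effort level the link speed allows."""
    level = pick_level(packet_bytes, link_bytes_per_second)
    return level, zlib.compress(chunk, level)
```

With a 100,000-byte packet, a link of 100,000 bytes per second leaves a full second per packet and selects the most thorough level, while a link one thousand times faster selects the cheapest one.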
Using the above determinations, and with reference to
The above are exemplary modes of carrying out the invention and are not intended to be limiting. It will be apparent to those of ordinary skill in the art that modifications thereto can be made without departure from the spirit and scope of the invention as set forth in the following claims.
Claims
1-13. (canceled)
14. A method for facilitating delivery of content from a user to a partner over a network, the method comprising:
- hosting an administrator website, including storing a partner encoding profile containing partner encoding information;
- hosting a partner website or application accessible by a user device;
- providing access to the partner encoding information through the partner website or application;
- delivering the partner encoding information to the user device; and
- receiving from the user device user content encoded in accordance with the partner encoding information.
15. The method of claim 14, further comprising:
- providing the partner with access to the partner encoding profile to enable the partner to customize the encoding information therein.
16. The method of claim 14, wherein the encoding information relates to one or more of video codecs, video encoding parameters, video stream parameters, quality parameters, bitrates, constant or variable bitrates, frame size and resolution, and frame rates.
17. The method of claim 14, wherein the partner encoding information includes software operative to enable the delivery of files from the user device to a partner-designated destination.
18. The method of claim 14, wherein the receiving occurs substantially simultaneously with encoding the content at the user device.
19. The method of claim 14, wherein the encoding of the user content in accordance with the partner encoding information is performed using a dynamic encoding protocol that adjusts encoding according to a quality parameter or connection speed.
20. A method for obtaining content from a user, the method comprising:
- storing at an administrator website a partner encoding profile containing encoding information for the partner;
- providing a user device with access to the partner encoding information through a partner website or application;
- delivering the partner encoding information to the user device; and
- receiving at the partner or application user content encoded in accordance with the partner encoding information.
21. The method of claim 20, further comprising:
- providing the partner with access to the partner encoding profile to enable the partner to customize the encoding information therein.
22. The method of claim 20, wherein the encoding information relates to one or more of video codecs, video encoding parameters, video stream parameters, quality parameters, bitrates, constant or variable bitrates, frame size and resolution, and frame rates.
23. The method of claim 20, wherein the partner encoding information includes software operative to enable the delivery of files from the user device to a partner-designated destination.
24. The method of claim 20, wherein the receiving occurs substantially simultaneously with encoding the content at the user device.
25. The method of claim 20, wherein the encoding of the user content in accordance with the partner encoding information is performed using a dynamic encoding protocol that adjusts encoding according to a quality parameter or connection speed.
26. A system for facilitating delivery of content from a user to a partner over a network, the system comprising:
- one or more administrator servers operable to: host an administrator website, including storing a partner encoding profile containing partner encoding information; host a partner website or application accessible by a user device; provide access to the partner encoding information through the partner website or application; deliver the partner encoding information to the user device; and receive from the user device user content encoded in accordance with the partner encoding information.
27. The system of claim 26, wherein the one or more administrator servers are further operable to provide the partner with access to the partner encoding profile to enable the partner to customize the encoding information therein.
28. The system of claim 26, wherein the encoding information relates to one or more of video codecs, video encoding parameters, video stream parameters, quality parameters, bitrates, constant or variable bitrates, frame size and resolution, and frame rates.
29. The system of claim 26, wherein the partner encoding information includes software operative to enable the delivery of files from the user device to a partner-designated destination.
30. The system of claim 26, wherein the receiving occurs substantially simultaneously with encoding the content at the user device.
31. The system of claim 26, wherein the encoding of the user content in accordance with the partner encoding information is performed using a dynamic encoding protocol that adjusts encoding according to a quality parameter or connection speed.
32. A nontransitory program storage device readable by a machine, embodying a program of instructions executable by the machine to perform a method, the method comprising:
- hosting an administrator website, including storing a partner encoding profile containing partner encoding information;
- hosting a partner website or application accessible by a user device;
- providing access to the partner encoding information through the partner website or application;
- delivering the partner encoding information to the user device; and
- receiving from the user device user content encoded in accordance with the partner encoding information.
33. The device of claim 32, wherein the method further comprises:
- providing the partner with access to the partner encoding profile to enable the partner to customize the encoding information therein.
34. The device of claim 32, wherein the encoding information relates to one or more of video codecs, video encoding parameters, video stream parameters, quality parameters, bitrates, constant or variable bitrates, frame size and resolution, and frame rates.
35. The device of claim 32, wherein the receiving occurs substantially simultaneously with encoding the content at the user device.
36. The device of claim 32, wherein the encoding of the user content in accordance with the partner encoding information is performed using a dynamic encoding protocol that adjusts encoding according to connection speed.
37. A nontransitory program storage device readable by a machine, embodying a program of instructions executable by the machine to perform a method, the method comprising:
- storing at an administrator website a partner encoding profile containing encoding information for the partner;
- providing a user device with access to the partner encoding information through a partner website or application;
- delivering the partner encoding information to the user device; and
- receiving at the partner website or application content encoded in accordance with the partner encoding information.
38. The device of claim 37, wherein the encoding information relates to one or more of video codecs, video encoding parameters, video stream parameters, quality parameters, bitrates, constant or variable bitrates, frame size and resolution, and frame rates.
39. The device of claim 37, wherein the receiving occurs substantially simultaneously with encoding the content at the user device.
40. The device of claim 37, wherein the encoding of the user content in accordance with the partner encoding information is performed using a dynamic encoding protocol that adjusts encoding according to a quality parameter or connection speed.
41. A method for sending information comprising:
- acquiring digital information in a first format;
- encoding the acquired information into a data stream of a second format; and
- transmitting portions of the data stream during the encoding.
42. The method of claim 41, wherein said encoding comprises:
- determining a latency period for transmission of a packet from a source to a destination;
- adjusting the encoding based on said latency period.
43. A nontransitory program storage device readable by a machine, embodying a program of instructions executable by the machine to perform a method, the method comprising:
- acquiring digital information in a first format;
- encoding the acquired information into a data stream of a second format; and
- transmitting portions of the data stream during the encoding.
44. The device of claim 43, wherein said encoding comprises:
- determining a latency period for transmission of a packet from a source to a destination;
- adjusting the encoding based on said latency period.
Type: Application
Filed: May 1, 2012
Publication Date: May 9, 2013
Inventors: Jaime Arturo Valenzuela (Redlands, CA), Hassan Hamid Wharton-Ali (Pacific Palisades, CA)
Application Number: 13/461,015
International Classification: H04N 7/26 (20060101); H04N 7/36 (20060101);