DYNAMIC TRANSRATING BASED ON AUDIO ANALYSIS OF MULTIMEDIA CONTENT

- VIXS SYSTEMS, INC.

Exemplary techniques for modifying multimedia data based on content are disclosed. One technique comprises determining whether a first portion of multimedia content of multimedia data has a first content characteristic and performing one or more content actions associated with the first content characteristic when the first portion of the multimedia content is determined to have the first content characteristic, wherein the one or more content actions modify a first portion of the multimedia data associated with the first portion of the multimedia content.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 11/237,435 (Attorney Docket No. 1459-VIXS084), entitled “SYSTEM AND METHOD FOR DYNAMIC TRANSRATING BASED ON CONTENT” and filed on Sep. 28, 2005, the entirety of which is incorporated by reference herein.

The present application is related to co-pending U.S. patent application Ser. No. ______ (Attorney Docket No. 1459-VIXS084C3), entitled “DYNAMIC TRANSRATING BASED ON OPTICAL CHARACTER RECOGNITION ANALYSIS OF MULTIMEDIA CONTENT” and filed on even date herewith.

The present application is related to co-pending U.S. patent application Ser. No. 11/522,141 (Attorney Docket No. 1459-VIXS084C), entitled “SYSTEM AND METHOD FOR TRANSRATING BASED ON MULTIMEDIA PROGRAM TYPE” and filed on Sep. 15, 2006.

FIELD OF THE DISCLOSURE

The present disclosure relates generally to data processing and more particularly to processing multimedia information.

BACKGROUND

Current trends in multimedia content distribution are directed to the storage of multimedia content for subsequent access or distribution. Presently, many households utilize personal video recorders (PVRs), also referred to as digital video recorders (DVRs), to store multimedia content received from a terrestrial broadcast as digital data. This data then may be accessed at a later date for display or transmission to another device, such as a cell phone or a portable video player. Current multimedia storage solutions face a choice between content quality and storage space. As such, these conventional solutions either elect to indiscriminately reduce content quality, thereby reducing the space required to store the data, or they elect to retain content quality, thereby limiting the amount of data that may be stored in a cost-effective manner. Accordingly, an improved technique for processing multimedia data for storage or distribution would be advantageous.

BRIEF DESCRIPTION OF THE DRAWINGS

The purpose and advantages of the present disclosure will be apparent to those of ordinary skill in the art from the following detailed description in conjunction with the appended drawings in which like reference characters are used to indicate like elements, and in which:

FIGS. 1 and 2 are block diagrams illustrating exemplary multimedia processing systems in accordance with at least one embodiment of the present disclosure.

FIG. 3 is a block diagram illustrating an exemplary implementation of a transrating system in accordance with at least one embodiment of the present disclosure.

FIG. 4 is a flow diagram illustrating an exemplary method for dynamic transrating in accordance with at least one embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE DISCLOSURE

The following description is intended to convey a thorough understanding of the present disclosure by providing a number of specific embodiments and details involving modifying multimedia content based on one or more rule sets associated with content characteristics. It is understood, however, that the present disclosure is not limited to these specific embodiments and details, which are exemplary only. It is further understood that one possessing ordinary skill in the art, in light of known systems and methods, would appreciate the use of the disclosure for its intended purposes and benefits in any number of alternative embodiments, depending upon specific design and other needs.

FIGS. 1-4 illustrate exemplary techniques for modifying multimedia content based on rule sets associated with one or more content characteristics. In at least one embodiment, multimedia data representing, for example, a program is received. Based on program information associated with the multimedia data, a rule template is identified. The rule template includes one or more rules, each rule being represented by, for example, a content characteristic and one or more content actions associated with the content characteristic. The rule template then is applied to the multimedia data so as to modify the multimedia data. In one embodiment, the rule template is applied by processing the multimedia data using some or all of the applicable rules of the rule template, where the multimedia content of the multimedia data is analyzed to determine if the content characteristics of one or more rules are present, and if so, one or more of the content actions associated with the identified content characteristics may be performed.

Referring to FIG. 1, an exemplary multimedia processing system 100 is illustrated in accordance with at least one embodiment of the present disclosure. As exemplarily depicted, the system 100 can include a multimedia processing device 102, a content provider 104 and one or more storage devices 106. The multimedia processing device 102, in one embodiment, includes a transrater 110 and a rules template database 112. Implementations of the multimedia processing device 102 may include, for example, a set top box, a personal video recorder (PVR), a television tuner card, a video card, and the like. The content provider 104 can include, for example, a satellite video feed, a cable television head end, a digital versatile disk (DVD) drive, and the like. The storage device 106 may include, for example, memory, a hard disc, a DVD drive, and the like.

In operation, the multimedia processing device 102 receives multimedia data 108 from the content provider 104, where the multimedia data may be provided in, for example, an MPEG data stream format. Program information 114 associated with the multimedia data 108 also can be provided with the multimedia data 108 or as a separate transmission. The program information 114 provides an indication or description of the programmatic details of the multimedia data 108. To illustrate, the multimedia data 108 may represent, for example, a particular football game and the program information 114 therefore can identify the multimedia data generally as a sports program, more particularly as a football program, and more specifically as, for example, a NFL® football program for a particular television network (e.g., Monday Night Football®). The program information 114 can include electronic program guide (EPG) information or information transmitted as closed captioning information during vertical blanking intervals.

Based on the program information 114, the transrater 110 identifies a particular rules template from the rules template database 112 that is applicable to the program type of the multimedia data 108. Using the example provided above, the rules template selected may be a rules template that is applicable to sports programs in general, a rules template that is applicable to football games, or a rules template that is applicable to the particular type of football game (e.g., a college football game or a football game program provided by a particular television network). After identifying the appropriate rules template, the transrater 110 analyzes one or more portions of the content of the multimedia data 108 to determine if one or more content characteristics identified by the rules of the rules template are present in an analyzed portion. To illustrate, the rules template can include a rule that provides that if the analyzed content portion includes a change in average audio volume that is greater than a given threshold (one example of a content characteristic), then the bit rate of the content portion is reduced by a provided amount (one example of a content action). This rule may be utilized, for example, to identify the presence of a commercial (which often is preceded by a change in volume), and if so present, the bit rate of the multimedia data representing the commercial content may be reduced so as to reduce the overall amount of multimedia data without materially affecting the multimedia content of the program that a viewer is likely to care about (i.e., non-commercial content).
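The change-in-volume rule described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the function names, the use of mean absolute sample amplitude as the "average volume", and the default threshold and reduction values are all assumptions made for the example.

```python
from statistics import mean


def average_volume(samples):
    """Mean absolute amplitude of a window of audio samples (assumed volume measure)."""
    return mean(abs(s) for s in samples)


def volume_change_exceeds(prev_window, next_window, threshold):
    """Content characteristic: change in average volume greater than a threshold."""
    change = abs(average_volume(next_window) - average_volume(prev_window))
    return change > threshold


def apply_commercial_rule(prev_window, next_window, bit_rate,
                          threshold=0.2, reduction=0.5):
    """Content action: scale the bit rate down when the characteristic is present.

    threshold and reduction are hypothetical values chosen for illustration.
    """
    if volume_change_exceeds(prev_window, next_window, threshold):
        return bit_rate * reduction
    return bit_rate
```

Here the two windows would be consecutive stretches of decoded audio; a rule template would supply the threshold and the reduction amount rather than hard-coding them.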

If the content characteristic of an applied rule is present in the analyzed portion of the multimedia content, the transrater 110 may perform one or more content actions associated with the rule with respect to the multimedia content. For example, using the change-in-volume content characteristic described above, a corresponding content action can include, for example, a reduction in the bit-rate of the multimedia data representing the commercial content. As a result, the data representing the commercial content can be reduced without materially affecting the user's enjoyment of the program because users typically do not pay as much attention to commercials as they do the remainder of the program and the commercials therefore do not need to be of the same or similar quality as the rest of the program.

The transrater 110 thus can apply the rules template to the multimedia data by analyzing the multimedia content of the data in view of some or all of the rules of the template, thereby generating modified multimedia data 116 from the received multimedia data 108. The modified multimedia data 116 then may be provided for storage in the storage device 106 for subsequent access.

Referring to FIG. 2, another exemplary multimedia processing system 200 is illustrated in accordance with at least one embodiment of the present disclosure. As illustrated, the system 200 includes the multimedia processing device 102 having an input connected to a storage device 206 and an output connected to at least one multimedia device 210 via a network 204. In the illustrated example, the transrater 110 may access multimedia data 208 stored in the storage device 206, identify the appropriate rules template from the template database 112 using program information 214 associated with the multimedia data 208, and apply the identified template to the multimedia content of the multimedia data 208 to generate modified multimedia data 216 as described above. Further, in addition to, or instead of, providing the modified multimedia data 216 for storage in a local storage device, the modified multimedia data 216 may be transmitted for storage in the multimedia device 210 via the network 204, where the network 204 may include, for example, a wireless network, the Internet, a universal serial bus (USB), and the like. Accordingly, the modified multimedia data 216 subsequently may be accessed by the multimedia device 210 for processing for display or for transmission to another device.

It will be appreciated that data storage limitations of the multimedia device 210 and/or bandwidth limitations of the network 204 may require additional consideration when transrating the input multimedia data 208 to generate the output modified multimedia data 216 so as to comply with these limitations. Accordingly, in at least one embodiment, the template database 112 may include templates indexed not only by, for example, program type, but also by one or more characteristics of the multimedia device 210 and/or the network 204. For example, a given news program may have a plurality of different rules templates that can be applied, where some rules templates are directed to portable multimedia devices that have limited storage and other rules templates are directed to multimedia devices or storage devices that have less limited storage or higher-bandwidth network connections. Those rules templates directed to portable devices can have, for example, rules that have more aggressive data-reducing content actions, whereas those rules templates directed to high-capacity devices can have, for example, rules that are less aggressive with respect to data-reduction and focus more on total image quality.
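One way to picture a template database indexed by both program type and device characteristics is a mapping keyed on the pair. The sketch below is illustrative only; the key names ("portable", "high_capacity") and the scaling values are hypothetical and do not come from the disclosure.

```python
# Hypothetical templates keyed by (program type, device class).
TEMPLATES = {
    # More aggressive data reduction for storage-limited portable devices.
    ("news", "portable"): {"bit_rate_scale": 0.4, "resolution_scale": 0.5},
    # Less aggressive reduction, favoring image quality, for high-capacity devices.
    ("news", "high_capacity"): {"bit_rate_scale": 0.9, "resolution_scale": 1.0},
}


def lookup_template(program_type, device_class):
    """Return the template for this program type and device class, or None."""
    return TEMPLATES.get((program_type, device_class))
```

The same news program thus maps to different templates depending on where the modified data is headed.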

Table 1 below provides a non-limiting list of exemplary rules used to process the content of multimedia data.

TABLE 1: Exemplary Rule Sets

| Rule Name | Content Characteristic Description | Content Action(s) |
| --- | --- | --- |
| Commercial Detect | Change in average volume > threshold | Insert commercial index into multimedia data; Reduce bit rate for duration of identified commercial content; Reduce resolution for duration of identified commercial content |
| Score Change | OCR analysis of portion of image representing score box indicates change in text (and therefore change in score) | Increase audio volume of content for time period encompassing the score change; Increase bit-rate of content for time period encompassing the score change; Increase resolution of content for time period encompassing the score change |
| Goal | Audio content includes the voiced word “goal” | Increase bit rate of content for time period encompassing the goal |
| Game in Play | Detect time period having a yellow line in image frame indicating line of scrimmage in football game | Increase bit rate of content for time period |
| Talking Head Box | Unconditional | Decrease bit rate for screen portion used to display news anchor |
| Stock Ticker | Unconditional | Decrease resolution for screen portion used to display stock prices |

Referring to FIG. 3, an exemplary implementation of the transrater 110 of FIGS. 1 and 2 is illustrated in accordance with at least one embodiment of the present disclosure. The exemplary transrater 300 includes a rules table identifier module 302, a rules table buffer 304, a content analyzer 306, an input data buffer 308, a transcoder 310, an output data buffer 312, a system layer formatter 314 and an indexer 316.

In operation, multimedia data 322 is received and buffered in the input data buffer 308. Program information 324 associated with the input multimedia data 322 is provided to the table identifier module 302. Based on the program information 324, the table identifier 302 indexes the rules table database 112 to identify an appropriate rules table 326 to apply to the incoming multimedia data 322. The identified rules table 326 can be provided for storage in the table buffer 304 for use by the content analyzer 306. Alternately, an indicator (e.g., an address or pointer) to the identified rules table 326 may be provided to the content analyzer 306.

In at least one embodiment, the table identifier 302 has access to electronic programming guide (EPG) information 318 so that the table identifier 302 may identify one or more program types of the incoming multimedia data 322 and identify the rules table 326 accordingly. In at least one embodiment, multiple rules tables may be appropriately applied to the multimedia data 322. In such instances, the table identifier module 302 can select the more appropriate template to apply to the multimedia data 322, where the more appropriate template typically is the template aligned with the most specific program type. For example, the EPG information 318 may identify the incoming multimedia data 322 as being associated with a sports program in general and a soccer game program specifically. The table identifier module 302 therefore may identify a rules template associated with soccer game programs in particular. If such template is not available, the table identifier module 302 alternatively may select a rules template associated with sports programs in general.

Moreover, in one embodiment, when no rules template is identified based on specific program information, the table identifier 302 may select a default rules template. For example, if the multimedia data 322 represents a news broadcast from a particular television network for which there is no corresponding rules template in the template database 112, the table identifier module 302 may select a default template that may be generally associated with, for example, the type of multimedia device 210 that is expected to receive the resulting modified multimedia data.
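The selection behavior described above (prefer the template for the most specific program type, fall back to a more general one, then to a default) can be sketched as a simple ordered search. The function name and the representation of templates as a dictionary keyed by program type are assumptions made for this example.

```python
def select_template(program_types, templates, default):
    """Pick the template for the most specific matching program type.

    program_types is assumed to be ordered from most specific to most
    general, e.g. ["soccer", "sports"]. If none has a template, the
    default template is returned.
    """
    for program_type in program_types:
        if program_type in templates:
            return templates[program_type]
    return default
```

For instance, with EPG information yielding `["soccer", "sports"]` and only a sports template in the database, the general sports template is chosen; with neither available, the default template is used.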

An exemplary implementation of the rules template 326 is depicted by FIG. 3. As illustrated, some or all of the rules templates of the template database 112 may include one or more rules (e.g., rules entries 330-333, also identified as Rule 1-Rule N), each rule having a content characteristic descriptor (e.g., content characteristic descriptors CC1-CCN for rule entries 330-333, respectively) and one or more content action descriptors (CA) associated with each content characteristic descriptor. The content characteristic descriptors typically represent a content characteristic which, if found in an analyzed content portion, results in the performance (or the avoidance of the performance) of one or more content actions represented by the one or more content action descriptors. The content characteristic descriptors typically include information describing a characteristic of the multimedia content (such as, for example, a descriptor indicating whether an optical character recognition analysis of the top portion of successive frames indicates that the score of a game has changed). The content action descriptors may include, for example, microcode, a pointer to a memory location storing a routine for performing the one or more content actions, or information used in processing the multimedia data (such as, for example, a quantization scaling factor or a resolution scaling factor to be applied by the transcoder 310).

Each rule further may include a link field 336 to indicate if the rule is linked to any other rules and a link type field 338 to indicate the type of link (e.g., an AND relationship or an OR relationship). For example, Rule 1 may be linked to Rule 2 in an AND relationship whereby if the content characteristic of Rule 1 is found the content characteristic of Rule 2 also must be found before the content actions of Rule 1 can be performed. As another example, Rule 1 may be linked to Rule 2 in an OR relationship whereby if the content characteristic of Rule 1 is identified as present in the analyzed content portion, Rule 2 is not to be applied to the content portion.
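A rule entry with a link field and a link type field, and the AND/OR semantics just described, might be modeled as below. This is a sketch under stated assumptions: the `Rule` class, its field names, the string action descriptors, and the example "black_frame" characteristic are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    name: str
    characteristic: Callable[[dict], bool]  # content characteristic test
    actions: List[str]                      # content action descriptors
    link: Optional[str] = None              # name of the linked rule, if any
    link_type: Optional[str] = None         # "AND" or "OR"


def actions_for_portion(rules, portion):
    """Collect the content actions to perform for one analyzed content portion."""
    found = {r.name: r.characteristic(portion) for r in rules}

    # OR link: when this rule's characteristic is present, the linked
    # rule is not applied to the portion.
    skipped = {r.link for r in rules
               if r.link_type == "OR" and r.link and found[r.name]}

    actions = []
    for r in rules:
        if r.name in skipped or not found[r.name]:
            continue
        # AND link: the linked rule's characteristic must also be present
        # before this rule's actions can be performed.
        if r.link_type == "AND" and not found.get(r.link, False):
            continue
        actions.extend(r.actions)
    return actions
```

The portion is represented here as a dictionary of precomputed analysis results purely to keep the example short.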

Although the exemplary rules described above have a condition (the presence of the content characteristic) before the corresponding one or more content actions can be performed, in at least one embodiment one or more of the rules may be unconditional rules whose content action(s) are always performed without a corresponding condition being met. For example, for multimedia content representing a news program, the image portion of successive frames that represents, for example, a stock ticker can be transrated so as to automatically reduce the image portion's resolution without an analysis of the content of the image portion.

The content analyzer 306, in one embodiment, analyzes the multimedia content of the multimedia data 322 in view of some or all of the rules of the rule template 326. Accordingly, in one embodiment, the content analyzer 306 obtains rule information from the table buffer 304 (or, alternatively, from the template database 112 directly) and analyzes the content of the multimedia data 322 to determine if content characteristics associated with the applied rules are present in one or more portions of the multimedia content.

In some instances, the content analyzer 306 can analyze the multimedia data 322 for certain content characteristics while the multimedia data 322 is in encoded form. To illustrate, an exemplary content characteristic to be identified can be an amount of motion between successive frames that is greater than a certain threshold. In this case, the content analyzer 306 may analyze, for example, the motion vector information of the encoded multimedia data 322 to determine if there is substantial motion between image frames. However, in other instances, identifying certain content characteristics can require that the multimedia data 322 be in decoded form. For example, in one embodiment the content characteristic may be the identification of the word “goal” in the audio content of the multimedia data 322. In this case, the content analyzer 306 typically would access decoded audio information to perform an audio analysis for the word “goal”. Accordingly, the transcoder 310 may decode some or all of the multimedia data 322 and store the decoded multimedia data in a frame buffer 340 (exemplarily illustrated as part of the data buffer 308). The content analyzer 306 then may access the decoded multimedia data in the frame buffer 340 to perform the content analysis.
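The encoded-domain motion check can be illustrated with a small function that thresholds the mean motion vector magnitude of a frame. The disclosure does not specify how motion is measured, so the use of the mean magnitude, the (dx, dy) tuple representation, and the function name are assumptions for this sketch.

```python
from math import hypot


def has_substantial_motion(motion_vectors, threshold):
    """Return True when the mean motion vector magnitude exceeds threshold.

    motion_vectors is assumed to be a list of (dx, dy) pairs, one per
    block, already extracted from the encoded stream.
    """
    if not motion_vectors:
        return False
    total = sum(hypot(dx, dy) for dx, dy in motion_vectors)
    return total / len(motion_vectors) > threshold
```

Because the vectors are read from the encoded data, a check of this kind would not require decoding the frames themselves.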

The portion of the content of the multimedia data 322 analyzed for any particular content characteristic typically is dependent on the particular characteristic. To illustrate, the content characteristic of a change in displayed text representing, for example, a score or a stock value may be identified by an OCR analysis of a certain segment of two successive image frames. Thus, the content portion for this characteristic would be two or more frames. As another example, the content characteristic may be the presence of a yellow line indicating the line of scrimmage in a football game. In this instance, the yellow line may be detected by analyzing a particular portion of a single image frame (e.g., the center column of an image frame). In this case, the analyzed content portion can include the center column of the image frame.
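As a rough illustration of analyzing only the center column of a single decoded frame, the sketch below looks for any yellow pixel in that column. The color thresholds, the frame representation (rows of (r, g, b) tuples), and the function names are assumptions chosen for the example; a practical detector would likely be more robust.

```python
def is_yellow(pixel, min_rg=180, max_b=100):
    """Crude yellow test on an (r, g, b) pixel; thresholds are hypothetical."""
    r, g, b = pixel
    return r >= min_rg and g >= min_rg and b <= max_b


def center_column_has_yellow(frame):
    """Check only the center column of the frame for a yellow pixel.

    frame is assumed to be a list of rows, each row a list of (r, g, b)
    tuples of equal length.
    """
    if not frame or not frame[0]:
        return False
    center = len(frame[0]) // 2
    return any(is_yellow(row[center]) for row in frame)
```

Restricting the analysis to one column keeps the analyzed content portion small, in line with the example in the text.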

In the event that the content analyzer 306 detects that a content characteristic of a rule is present in an analyzed portion of the multimedia data 322, the content analyzer 306 provides the transcoder 310 an indication of the one or more content actions to be performed. The transcoder 310 then processes some or all of the multimedia data 322 in accordance with the content actions to generate modified multimedia data 342. The multimedia data 342 then may be stored in the outgoing data buffer 312 before it is formatted for transmission as, for example, an MPEG program stream 346 by the system layer formatter 314.

In at least one embodiment, a content action to be performed includes embedding a content characteristic index in the modified multimedia data 342, where the content characteristic index identifies the corresponding portion of multimedia data 342 as representing multimedia content having the indicated content characteristic. For example, the content analyzer 306 may analyze the audio content of the multimedia data 322 to identify rapid increases in the average volume of the audio. In the event that such an average volume increase is found, the corresponding content action can include inserting a content characteristic index that identifies the multimedia data portion representing, for example, the next thirty seconds of content as a commercial. Thus, the multimedia data 342 subsequently can be rapidly searched to identify the data associated with commercial content and this content may be filtered by, for example, removing the commercial content, reducing the audio volume of the commercial content, reducing the resolution and/or bit rate of the commercial content, and the like. Alternately, the content action can include the creation of a separate index table to a location of the commercial content.

Accordingly, upon identifying a content characteristic present in a portion of the multimedia content that has a corresponding content index action, the content analyzer 306 may provide index information to the indexer 316 which then manages the insertion of the appropriate content characteristic index into the multimedia data 342 using, for example, the system layer formatter 314.
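The separate index table alternative can be pictured as a list of labeled time ranges that is searched later to locate, for example, commercial content. The entry layout, the function names, and the fixed thirty-second default duration (borrowed from the example in the text) are assumptions for this sketch.

```python
def build_index(detections, duration=30.0):
    """Build index entries from (time_in_seconds, label) detections.

    Each entry marks the content from the detection time through
    `duration` seconds later as having the labeled characteristic.
    """
    return [{"start": t, "end": t + duration, "label": label}
            for t, label in detections]


def portions_with_label(index, label):
    """Return the (start, end) ranges indexed with the given label."""
    return [(e["start"], e["end"]) for e in index if e["label"] == label]
```

A later pass could use the returned ranges to remove, mute, or further transrate the indexed portions without re-analyzing the content.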

Referring to FIG. 4, an exemplary dynamic transrating method 400 is illustrated in accordance with at least one embodiment of the present disclosure. The method 400 initiates at step 402 whereby multimedia data is received and temporarily buffered. At step 404, the appropriate rules template for the multimedia data is selected based on program information, such as, for example, program type, associated with the multimedia data. As noted above, there can be a number of rules templates appropriate for application to the multimedia data. In such instances, the rules template more aligned with the program characteristics (e.g., a rules template specifically for a football game rather than a general sports program template) is selected. In the event that a specific rules template is not available or appropriate, in one embodiment a default general rules template may be selected.

At step 406, a rule from the selected rules template is accessed and a portion of the content of the multimedia data is analyzed to determine whether the content characteristic associated with the rule is present in the analyzed content portion. If the content characteristic is not present in the analyzed content portion, the method 400 continues to step 410. Otherwise, when the content characteristic is present, one or more of the content actions associated with the rule are performed at step 408. In one embodiment, rules may be linked using logical operations such as AND operations and/or OR operations. Accordingly, if so linked, the content actions of the rule may not be performed at step 408 until the conditions of other linked rules are confirmed.

At step 410 the rules template is checked to determine if the last rule to be applied has been applied. If not, the method 400 repeats steps 406-410 for the next rule to be applied. If it is determined at step 412 that content portions of the multimedia data have yet to be analyzed, the method 400 repeats steps 406-410 to apply the appropriate rules to the next multimedia content portion to be analyzed.

At step 414 the resulting modified multimedia data is provided for storage in a storage device, such as a hard disc or a DVD disc, or provided for transmission to one or more multimedia devices, such as a cellular phone or PDA, via a network. As a result, the original multimedia data may be modified so as to reduce its size while retaining suitable content quality.
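The rule-application loop of method 400 can be summarized in a short sketch. Template selection (step 404) and rule linking are omitted for brevity, and the representation of rules as (predicate, list of action functions) pairs and of content portions as dictionaries is an assumption made for illustration only.

```python
def transrate(portions, rules):
    """Apply each rule to each content portion and return the modified portions.

    rules is a list of (characteristic, actions) pairs, where characteristic
    is a predicate on a portion and each action maps a portion to a
    modified portion.
    """
    modified = []
    for portion in portions:                   # step 412: next content portion
        for characteristic, actions in rules:  # steps 406-410: next rule
            if characteristic(portion):        # step 406: characteristic present?
                for action in actions:         # step 408: perform content actions
                    portion = action(portion)
        modified.append(portion)
    return modified                            # step 414: store or transmit


def halve_bit_rate(portion):
    """Example content action: return a copy with the bit rate halved."""
    return {**portion, "bit_rate": portion["bit_rate"] // 2}
```

With a single rule such as `(lambda p: p["commercial"], [halve_bit_rate])`, only the portions flagged as commercial content have their bit rate reduced; all other portions pass through unchanged.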

Other embodiments, uses, and advantages of the present disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. The specification and drawings should be considered exemplary only, and the scope of the disclosure is accordingly intended to be limited only by the following claims and equivalents thereof.

Claims

1. In a multimedia processing device, a method comprising:

performing an audio analysis of audio content of multimedia data for a specified sound, the multimedia data representing multimedia content for a program; and
modifying the multimedia data in response to determining the specified sound is present in the audio content.

2. The method of claim 1, wherein the specified sound is a spoken word.

3. The method of claim 2, wherein the program comprises a sports program and the spoken word is associated with an event in the sports program.

4. The method of claim 1, wherein modifying the multimedia data comprises changing an audio volume for a portion of the multimedia data that is associated with the specified sound.

5. The method of claim 1, wherein modifying the multimedia data comprises changing at least one of a bit rate or a resolution for a portion of the multimedia data that is associated with the specified sound.

6. The method of claim 1, wherein modifying the multimedia data comprises inserting an identifier into a portion of the multimedia data that is associated with the specified sound, the identifier identifying the portion of the multimedia data as being associated with the specified sound.

7. The method of claim 1, further comprising:

identifying a select template of a plurality of templates based on the program, wherein the select template comprises a rule specifying an audio analysis for the specified sound; and
wherein performing the audio analysis comprises performing the audio analysis based on the select template.

8. The method of claim 7, wherein the select template is identified based on received electronic programming guide information associated with the program.

9. A multimedia processing device comprising:

a content analyzer to perform an audio analysis of audio content of multimedia data for a specified sound, the multimedia data representing multimedia content for a program; and
a transcoder to modify the multimedia data in response to determining the specified sound is present in the audio content.

10. The multimedia processing device of claim 9, wherein the specified sound is a spoken word.

11. The multimedia processing device of claim 10, wherein the program comprises a sports program and the spoken word is associated with an event in the sports program.

12. The multimedia processing device of claim 9, wherein the transcoder is to modify the multimedia data by changing an audio volume for a portion of the multimedia data that is associated with the specified sound.

13. The multimedia processing device of claim 9, wherein the transcoder is to modify the multimedia data by changing at least one of a bit rate or a resolution for a portion of the multimedia data that is associated with the specified sound.

14. The multimedia processing device of claim 9, further comprising:

a system layer formatter to insert an identifier into a portion of the multimedia data that is associated with the specified sound, the identifier identifying the portion of the multimedia data as being associated with the specified sound.

15. The multimedia processing device of claim 9, further comprising:

a rules table identifier module to identify a select template of a plurality of templates based on the program, wherein the select template comprises a rule specifying an audio analysis for the specified sound; and
wherein the content analyzer performs the audio analysis based on the select template.

16. In a multimedia processing device, a method comprising:

receiving multimedia data representing multimedia content for a program;
identifying a select template of a plurality of templates based on the program, wherein the select template comprises a plurality of rules, each rule comprising a characteristic and one or more actions to be performed by the multimedia processing device in association with the characteristic, wherein the plurality of rules includes a select rule comprising a characteristic representing a specified sound; and
performing an audio analysis of the multimedia data for the specified sound responsive to identifying the select rule in the select template.

17. The method of claim 16, further comprising:

performing the one or more actions associated with the select rule when a portion of the multimedia data is determined to have the specified sound.

18. The method of claim 17, wherein the one or more actions associated with the select rule comprise an action to change at least one of a bit rate or a resolution of at least the portion of the multimedia data.

19. The method of claim 17, wherein the one or more actions associated with the select rule comprise an action to change a volume for at least the portion of the multimedia data.

20. The method of claim 17, wherein the one or more actions associated with the select rule comprise an action to insert an identifier into the portion of the multimedia data, the identifier identifying the portion of the multimedia data as having the specified sound.

Patent History
Publication number: 20100145488
Type: Application
Filed: Feb 17, 2010
Publication Date: Jun 10, 2010
Applicant: VIXS SYSTEMS, INC. (Toronto)
Inventor: Indra Laksono (Richmond Hill)
Application Number: 12/707,398
Classifications
Current U.S. Class: Digital Audio Data Processing System (700/94)
International Classification: G06F 17/00 (20060101);