SYSTEMS AND METHODS FOR MEDIA ANNOTATION, SELECTION AND DISPLAY OF BACKGROUND DATA

Info

Publication number: 20120106924
Type: Application
Filed: Jan 6, 2012
Publication Date: May 3, 2012
Applicant:
Inventors: Richard H. Krukar (Albuquerque, NM), Luis M. Ortiz (Albuquerque, NM), Kermit D. Lopez (Albuquerque, NM)
Application Number: 13/345,404

Abstract

Video content is a time varying presentation of scenes or video frames. Each frame can contain a number of scene elements such as actors, foreground items, background items, or other items. A person enjoying video content can select a scene element by specifying a screen coordinate while the video content plays. Frame specification data identifies the specific frame or scene being displayed when the coordinate is selected. The coordinate in combination with the frame specification data is sufficient to identify the scene element that the person has chosen. Information about the scene element can then be presented to the person. An annotation database can relate the scene elements to the frame specification data and coordinates.

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This patent application is a continuation of U.S. patent application Ser. No. 12/976,148, entitled “Flick Intel Annotation Methods and Systems,” which was filed on Dec. 22, 2010 and which is incorporated herein by reference in its entirety. U.S. patent application Ser. No. 12/976,148 in turn claims the priority and benefit of U.S. provisional patent application 61/291,837, entitled “Systems and Methods for obtaining background data associated with a movie, show, or live sporting event”, filed on Dec. 31, 2009 and of U.S. Provisional Patent Application No. 61/419,268, filed Dec. 3, 2010, entitled “FlickIntel Annotation Systems and Webcast Infrastructure”. This patent application therefore claims priority to U.S. Provisional Patent Applications 61/291,837 and 61/419,268, which are herein incorporated by reference.

TECHNICAL FIELD

Embodiments relate to video content, video displays, and video compositing. Embodiments also relate to computer systems, user input devices, databases, and computer networks.

BACKGROUND OF THE INVENTION

People have watched video content on televisions and other audio-visual devices for decades. They have also used gaming systems, personal computers, handheld devices, and other devices to enjoy interactive content. They often have questions about the places, people, and things appearing as the video content is displayed, and about the music they hear. Databases containing information about the content, such as the actors in a scene or the music being played, already exist and provide users with the ability to learn more.

The existing database solutions provide information about elements appearing in a movie or scene, but only in a very general way. A person curious about a scene element can obtain information about the scene and hope that the information mentions the scene element in which the person is interested. Systems and methods that provide people with the ability to select a specific scene element and to obtain information about only that element are needed.

BRIEF SUMMARY

The following summary is provided to facilitate an understanding of some of the innovative features unique to the embodiments and is not intended to be a full description. A full appreciation of the various aspects of the embodiments can be gained by taking the entire specification, claims, drawings, and abstract as a whole.

It is therefore an aspect of the embodiments that a media device can provide video content to a display device and that a person can view the video content as it is presented on the display device. A series of scenes or a time varying series of frames along with any audio dialog, music, or sound effects are examples of video content.

It is another aspect of the embodiments that the person can choose a coordinate on the display device. A coordinate can be chosen with a pointing device or any other form of user input by which the person can indicate a spot on the display device and select that spot. Frame specification data can be generated when the person chooses the coordinate. The frame specification data can identify a specific scene or frame within the video content.

It is yet another aspect of the embodiments to provide an element identifier based on the coordinate and the frame specification data. Element identifiers are uniquely associated with scene elements. The element identifier can be obtained by querying an annotation database that relates element identifiers to coordinates and frame specification data. The element identifier can also be provided by a human worker who views the scene or frame, looks to the coordinate, and reports what appears at that location.

A number of embodiments, preferred and alternative, are disclosed herein. For example, in one embodiment, a system can be implemented, which includes an annotation module that automatically annotates content including video content comprising a plurality of frames; an annotation database that stores at least one element identifier, wherein the annotation database communicates with the annotation module; and at least one element identifier provided by the annotation database in response to a query comprising frame specification data and a coordinate, wherein the frame specification data identifies which frame among the plurality of frames was displayed when the coordinate was selected. In yet another embodiment, a display device can display the video content comprising the plurality of frames, and a pointing device can select the coordinate on the display device. In still other embodiments, a media device can provide the video content to the display device.

In yet other embodiments, the aforementioned frame specification data can include a timestamp and a media tag. The media tag can identify the video content, and the timestamp can identify at least one frame among the plurality of frames of the video content, wherein the at least one frame is displayed when the coordinate is selected. In other embodiments, an additional data server can produce element data based on the at least one element identifier, and a data presentation can display the element data.

In still other embodiments, the at least one element identifier can correspond to an item for sale and the data presentation comprises an offer to purchase the item. In other embodiments, the at least one element identifier can correspond to a person and the data presentation can provide information about that person. In still other embodiments, the at least one element identifier can correspond to a song and the data presentation can comprise an offer to purchase a copy of the song. In still other embodiments, the at least one element identifier can correspond to a location and the data presentation can comprise travel information for reaching the location.

In another embodiment, a method can be implemented, which includes the steps of, for example, automatically annotating, via an annotation module, content including video content comprising a plurality of frames; storing, in an annotation database that communicates with the annotation module, at least one element identifier; and providing the at least one element identifier from the annotation database in response to a query comprising frame specification data and a coordinate, wherein the frame specification data identifies which frame among the plurality of frames was displayed when the coordinate was selected.

In another embodiment, steps or operations can be provided for displaying, via a display device, the video content comprising the plurality of frames; and selecting, via a pointing device, the coordinate on the display device. In still other embodiments, a step or operation can be implemented for providing the video content to the display device from a media device. In still other embodiments of the aforementioned method, the frame specification data comprises a timestamp and a media tag, wherein the media tag identifies the video content, wherein the timestamp identifies at least one frame among the plurality of frames of the video content, and wherein the at least one frame is displayed when the coordinate is selected.

In other embodiments, steps can be implemented for providing an additional data server that produces element data based on the at least one element identifier; and providing a data presentation that displays the element data. In yet other embodiments of such a method, the at least one element identifier can correspond to an item for sale and the data presentation comprises an offer to purchase the item.

In still other embodiments of such a method, the at least one element identifier can correspond to a person and the data presentation provides information about that person. In yet another embodiment of such a method, the at least one element identifier can correspond to a song and the data presentation comprises an offer to purchase a copy of the song. In yet other embodiments, the at least one element identifier can correspond to a location and the data presentation comprises travel information for reaching the location.

In still other embodiments, a processor-readable medium can store code representing instructions to cause a processor to perform a process. Such code can comprise code to, for example: automatically annotate, via an annotation module, content including video content comprising a plurality of frames; store, in an annotation database that communicates with the annotation module, at least one element identifier; and provide the at least one element identifier from the annotation database in response to a query comprising frame specification data and a coordinate, wherein the frame specification data identifies which frame among the plurality of frames was displayed when the coordinate was selected.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying figures, in which like reference numerals refer to identical or functionally similar elements throughout the separate views and which are incorporated in and form a part of the specification, further illustrate aspects of the embodiments and, together with the background, brief summary, and detailed description serve to explain the principles of the embodiments.

FIG. 1 illustrates element data being presented on a second display in response to the selection of a scene element on a first display in accordance with aspects of certain embodiments;

FIG. 2 illustrates an annotation database providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments;

FIG. 3 illustrates an annotation service providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments; and

FIG. 4 illustrates an annotated content stream passing to a media device such that the media device produces element data in accordance with aspects of certain embodiments.

DETAILED DESCRIPTION

The particular values and configurations discussed in these non-limiting examples can be varied and are cited merely to illustrate at least one embodiment and are not intended to limit the scope thereof. In general, the figures are not to scale.

The embodiments will now be described more fully hereinafter with reference to the accompanying drawings, in which illustrative embodiments of the invention are shown. The embodiments disclosed herein can be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

Video content is a time varying presentation of scenes or video frames. Each frame can contain a number of scene elements such as actors, foreground items, background items, or other items. A person enjoying video content can select a scene element by specifying a screen coordinate while the video content plays. Frame specification data identifies the specific frame or scene being displayed when the coordinate is selected. The coordinate in combination with the frame specification data is sufficient to identify the scene element that the person has chosen. Information about the scene element can then be presented to the person. An annotation database can associate scene elements with frame specification data and coordinates.

FIG. 1 illustrates element data being presented on a second display 119 in response to the selection of a scene element on a display 101 in accordance with aspects of certain embodiments. A media device 104 passes video content to the display 101 to be viewed by a person. The person can manipulate a selection device 112 to choose a coordinate 102 on the display 101. The coordinate can then be passed to the media device 104. In some embodiments the selection device can detect the coordinate 105. For example, the selection device 112 can detect the locations of emitters 106 and infer the screen position being pointed at from those emitter locations. In other embodiments the display 101 can detect the coordinate 103. For example, the selection device can emit a light beam that the display device detects. Other common coordinate selection means include mice, trackballs, and touch sensors. More advanced pointing means can observe the person's body or eyes to thereby determine a coordinate. Clicking a button or some other action can generate an event indicating that a scene element is chosen.

The media device 104 can generate a selection packet 107 that includes frame specification data and the coordinate 102. The frame specification data is data that is sufficient to identify a specific frame or scene. For example, the frame specification data can be a media tag 108 and a timestamp 109. The media tag 108 can identify a particular movie, show, sporting event, advertisement, video clip, scene or other unit of video content. The timestamp 109 specifies a time within the video content. In combination, a media tag and timestamp can specify a particular frame from amongst all the frames of video content that have ever been produced.
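As an illustration only, the selection packet described above might be represented as follows. This is a minimal sketch: the field names, the media tag format, the use of seconds for the timestamp, and the normalization of the coordinate to the range 0 to 1 are all assumptions made for the example and are not specified by the embodiments.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class SelectionPacket:
    """Media tag and timestamp (identifying the frame) plus the chosen coordinate."""
    media_tag: str      # identifies the unit of video content (hypothetical format)
    timestamp_s: float  # playback time within the content, in seconds (assumed unit)
    x: float            # horizontal coordinate, normalized to 0..1 (assumption)
    y: float            # vertical coordinate, normalized to 0..1 (assumption)


def make_selection_packet(media_tag, position_s, pixel_x, pixel_y, width, height):
    """Build a packet from the playback position and a pixel coordinate on the display."""
    if not (0 <= pixel_x < width and 0 <= pixel_y < height):
        raise ValueError("coordinate lies outside the display")
    return SelectionPacket(media_tag, position_s, pixel_x / width, pixel_y / height)


# Example: the person clicks pixel (960, 270) on a 1920x1080 display.
packet = make_selection_packet("movie:example-title", 754.2, 960, 270, 1920, 1080)
print(json.dumps(asdict(packet)))
```

Normalizing the coordinate is one way to keep the packet independent of the resolution of the display on which the selection was made; an implementation could equally carry pixel coordinates together with the display size.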

The selection packet 107 can be formed into a query for an annotation database 111. The annotation database 111 can contain associations between element identifiers and frame specification data and coordinates. As such, the annotation database 111 can produce an element identifier 113 in response to the query. The element identifier 113 can identify a person 114, an item 115, music 116, a place 117, or something else.
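The query against the annotation database can be sketched with an in-memory SQLite table. The schema below is hypothetical: it assumes that each annotation stores a media tag, a time range, and a rectangular screen region in normalized coordinates, and that element identifiers are strings of the form "kind:name". The embodiments do not prescribe any particular schema or database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE annotations (
           media_tag TEXT, t_start REAL, t_end REAL,
           x_min REAL, y_min REAL, x_max REAL, y_max REAL,
           element_id TEXT)"""
)
# Illustrative rows only; the tags, times, regions, and identifiers are made up.
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("movie:example-title", 750.0, 760.0, 0.40, 0.10, 0.60, 0.50, "person:lead-actor"),
        ("movie:example-title", 750.0, 790.0, 0.05, 0.60, 0.30, 0.95, "item:red-sedan"),
    ],
)


def find_element(conn, media_tag, timestamp, x, y):
    """Return the element identifier at (x, y) in the frame shown at `timestamp`, or None."""
    row = conn.execute(
        """SELECT element_id FROM annotations
           WHERE media_tag = ?
             AND ? BETWEEN t_start AND t_end
             AND ? BETWEEN x_min AND x_max
             AND ? BETWEEN y_min AND y_max
           LIMIT 1""",
        (media_tag, timestamp, x, y),
    ).fetchone()
    return row[0] if row else None


print(find_element(conn, "movie:example-title", 754.2, 0.5, 0.25))
```

If annotated regions can overlap, a real system would need a rule for choosing among several matches (for example, preferring the smallest region); this sketch simply returns the first match.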

The element identifier 113 can then be passed to another server 118 that responds by producing element data for presentation to the person. Examples of element data include, but are not limited to: statistics on a person such as an athlete; a picture of a person, object or place; an offer for purchase of an item, service, or song; and links to other media in which a person, item, or place appears.
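A minimal sketch of how such a server might map an element identifier to element data follows. It assumes, purely for illustration, that identifiers take the form "kind:name" and that the kind determines the type of presentation; the function name, the lookup table, and the sample details are hypothetical.

```python
# Presentation chosen per element kind; the kinds mirror the examples in the text.
PRESENTATION_BY_KIND = {
    "item": "offer to purchase the item",
    "person": "information about the person",
    "song": "offer to purchase a copy of the song",
    "location": "travel information for reaching the location",
}


def produce_element_data(element_id, details_by_id):
    """Return element data for one identifier of the assumed form 'kind:name'."""
    kind, separator, _name = element_id.partition(":")
    if not separator or kind not in PRESENTATION_BY_KIND:
        raise ValueError(f"unrecognized element identifier: {element_id!r}")
    return {
        "element_id": element_id,
        "presentation": PRESENTATION_BY_KIND[kind],
        "details": details_by_id.get(element_id, {}),
    }


# Made-up details standing in for whatever the additional data server stores.
details_by_id = {"song:opening-theme": {"title": "Opening Theme", "artist": "Example Artist"}}
print(produce_element_data("song:opening-theme", details_by_id))
```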

FIG. 2 illustrates an annotation database 111 providing element identifiers 211 in response to a person selecting scene elements in accordance with aspects of the embodiments. An annotation service/module 202 can produce annotated content 203 by annotating content 201. An annotation module is a device, algorithm, program, or other means that automatically annotates content. Image recognition algorithms can locate items within scenes and frames and thereby automatically provide annotation data. An annotation service is a service provider that annotates content. An annotation service provider can employ both human workers and annotation modules.

Annotation is a process wherein scene elements, each having an element identifier, are associated with media tags and space time ranges. A space time range identifies a range of times and positions at which a scene element appears. For example, a car can sit unmoving during an entire scene. The element identifier can specify the make, model, color, and trim level of the car, the media tag can identify a movie containing the scene, and the space time range can specify the time range of the movie scene and the location of the car within the scene.
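One way an annotation module could derive space time ranges is to merge per-frame detections of the same scene element into a single range. The sketch below assumes that some image recognition step, not shown, has already produced detections as (timestamp, element identifier, bounding box) tuples; the `max_gap` threshold, the class name, and the sample identifier are hypothetical choices for the example.

```python
from dataclasses import dataclass


@dataclass
class SpaceTimeRange:
    element_id: str
    t_start: float
    t_end: float
    box: tuple  # (x_min, y_min, x_max, y_max), normalized coordinates (assumption)


def union(a, b):
    """Smallest box that contains both boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_detections(detections, max_gap=0.5):
    """Merge detections of one element into a range when they are at most max_gap seconds apart."""
    open_ranges = {}  # element_id -> range still being extended
    finished = []
    for t, element_id, box in sorted(detections, key=lambda d: d[0]):
        current = open_ranges.get(element_id)
        if current is not None and t - current.t_end <= max_gap:
            current.t_end = t
            current.box = union(current.box, box)
        else:
            if current is not None:
                finished.append(current)
            open_ranges[element_id] = SpaceTimeRange(element_id, t, t, box)
    finished.extend(open_ranges.values())
    return sorted(finished, key=lambda r: (r.t_start, r.element_id))


# A parked car detected twice in one scene and once again in a later scene.
detections = [
    (0.0, "item:example-car", (0.10, 0.60, 0.30, 0.90)),
    (0.4, "item:example-car", (0.12, 0.60, 0.32, 0.90)),
    (5.0, "item:example-car", (0.50, 0.50, 0.70, 0.80)),
]
for r in merge_detections(detections):
    print(r)
```

Using the union of the bounding boxes is a deliberate simplification: it suits an unmoving element such as the parked car, while a fast-moving element would be better served by shorter ranges or per-frame regions.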

The content 201 can be passed to a media device 104 that produces a media stream 207 for presentation on a display device 206. A person 205 watching the display device 206 can use a selection device 112 to choose a coordinate on the display device 206. A selection packet 107 containing the coordinate and some frame specification data can then be passed to the annotation database 111 which responds by identifying the scene element 211. An additional data server 118 can produce element data 212 for that identified scene element 211. The element data 212 can then be presented to the person.

FIG. 3 illustrates an annotation service providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments. The embodiment of FIG. 3 differs from that of FIG. 2 in that the content 201 is not necessarily annotated before being viewed by the person 205. The selection packet 107 is passed to the annotation service 301 where a human worker 302 or annotation module 303 determines what scene element the person 205 selected and creates a new annotation entry for incorporation into the annotation database 111.
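The on-demand flow of FIG. 3 can be sketched as a lookup with a fallback. In this hypothetical example the annotation database is a plain list of entries, and `annotate` stands in for the human worker 302 or annotation module 303; the entry layout, the stub annotator, and the size of the region it records are assumptions.

```python
def lookup(entries, packet):
    """Return the element identifier whose space time range contains the selection."""
    for e in entries:
        x_min, y_min, x_max, y_max = e["box"]
        if (e["media_tag"] == packet["media_tag"]
                and e["t_start"] <= packet["timestamp"] <= e["t_end"]
                and x_min <= packet["x"] <= x_max
                and y_min <= packet["y"] <= y_max):
            return e["element_id"]
    return None


def resolve_on_demand(entries, packet, annotate):
    """Query the database; on a miss, ask the annotation service and store its answer."""
    element_id = lookup(entries, packet)
    if element_id is None:
        entry = annotate(packet)  # human worker or annotation module
        if entry is None:
            return None
        entries.append(entry)     # new annotation entry for later selections
        element_id = entry["element_id"]
    return element_id


def stub_annotator(packet):
    """Stand-in for the annotation service (hypothetical); records a small region."""
    return {
        "media_tag": packet["media_tag"],
        "t_start": packet["timestamp"] - 1.0,
        "t_end": packet["timestamp"] + 1.0,
        "box": (packet["x"] - 0.05, packet["y"] - 0.05,
                packet["x"] + 0.05, packet["y"] + 0.05),
        "element_id": "item:wrist-watch",
    }


entries = []
packet = {"media_tag": "show:example-episode", "timestamp": 120.0, "x": 0.42, "y": 0.58}
first = resolve_on_demand(entries, packet, stub_annotator)
print(first, len(entries))
```

Because the new entry is stored, a later selection of the same element by this or another viewer can be answered from the database without involving the annotation service again.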

FIG. 4 illustrates an annotated content stream 401 passing to a media device 104 such that the media device 104 produces element data 407 in accordance with aspects of certain embodiments. Annotated content, such as annotated content 203 of FIG. 2, can be passed as an annotated content stream 401 to the media device 104. The annotated content stream 401 can include a content stream 402, element stream 403, and element data 406. The media device 104 can then pass the content for presentation on the display 206 and store the element data 406 and the data in the element stream 403. The data in the element stream 403 can be formed into an annotation database with the possible exception that no media tag is needed. No media tag is needed because all the annotations refer only to the content stream 402. As such, the element stream 403 is illustrated as containing only space time ranges 404 and element identifiers 405.

The media device 104, having assembled an annotation database and having stored element data 406, can produce element data 407 for a scene element selected by a person 205 without querying remote databases or accessing remote resources.
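A sketch of this local lookup is shown below. Because every annotation in the element stream refers to the one content stream, the entries carry no media tag. The class name, the tuple layout of the element stream, and the sample data are hypothetical.

```python
class LocalAnnotationIndex:
    """Annotation data held by the media device for a single content stream."""

    def __init__(self, element_stream, element_data):
        # element_stream: iterable of
        #   (t_start, t_end, (x_min, y_min, x_max, y_max), element_id)
        # element_data: mapping of element_id -> data to present
        self._ranges = list(element_stream)
        self._element_data = dict(element_data)

    def element_data_at(self, timestamp, x, y):
        """Return stored element data for the selection, without any remote query."""
        for t_start, t_end, (x_min, y_min, x_max, y_max), element_id in self._ranges:
            if t_start <= timestamp <= t_end and x_min <= x <= x_max and y_min <= y <= y_max:
                return self._element_data.get(element_id)
        return None


# Made-up element stream and element data delivered alongside the content.
index = LocalAnnotationIndex(
    element_stream=[(30.0, 45.0, (0.2, 0.2, 0.4, 0.8), "person:guest-star")],
    element_data={"person:guest-star": {"name": "Example Performer"}},
)
print(index.element_data_at(31.5, 0.3, 0.5))
```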

Note that in practice, the content stream 402, element stream 403, and element data 406 can be transferred separately or in combination as streaming data. Means for transferring content, annotations, and element data include TV signals and storage devices such as DVD disks or data disks. Furthermore, the element data 406 can be passed to the media device 104 or can be stored and accessed on a remote server.

It will be appreciated that variations of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. Also, various presently unforeseen or unanticipated alternatives, modifications, variations or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

As will be appreciated by one skilled in the art, the present invention can be embodied as a method, data processing system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects, all generally referred to herein as a “circuit” or “module.” Furthermore, the present invention may take the form of a computer program product on a computer-usable storage medium having computer-usable program code embodied in the medium. Any suitable computer readable medium may be utilized including hard disks, USB Flash Drives, DVDs, CD-ROMs, optical storage devices, magnetic storage devices, etc.

Computer program code for carrying out operations of the present invention may be written in an object oriented programming language (e.g., Java, C++, etc.). The computer program code, however, for carrying out operations of the present invention may also be written in conventional procedural programming languages such as the “C” programming language, in a visually oriented programming environment such as, for example, VisualBasic, or in functional programming languages such as LISP or Erlang.

The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer. In the latter scenario, the remote computer may be connected to the user's computer through a local area network (LAN), a wide area network (WAN), a wireless data network (e.g., WiFi, WiMAX, 802.xx), or a cellular network, or the connection may be made to an external computer via most third party supported networks (for example, through the Internet using an Internet Service Provider).

The invention is described in part below with reference to flowchart illustrations and/or block diagrams of methods, systems, computer program products, and data structures according to embodiments of the invention. It will be understood that each block of the illustrations, and combinations of blocks, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function/act specified in the block or blocks.

Note that computer program instructions and other processor-readable media discussed herein may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the block or blocks.

Based on the foregoing, it can be appreciated that a number of embodiments, preferred and alternative, are disclosed herein. For example, in one embodiment, a system can be implemented, which includes an annotation module that automatically annotates content including video content comprising a plurality of frames; an annotation database that stores at least one element identifier, wherein the annotation database communicates with the annotation module; and at least one element identifier provided by the annotation database in response to a query comprising frame specification data and a coordinate, wherein the frame specification data identifies which frame among the plurality of frames was displayed when the coordinate was selected. In yet another embodiment, a display device can display the video content comprising the plurality of frames, and a pointing device can select the coordinate on the display device. In still other embodiments, a media device can provide the video content to the display device.

In yet other embodiments, the aforementioned frame specification data can include a timestamp and a media tag. The media tag can identify the video content, and the timestamp can identify at least one frame among the plurality of frames of the video content, wherein the at least one frame is displayed when the coordinate is selected. In other embodiments, an additional data server can produce element data based on the at least one element identifier, and a data presentation can display the element data.

In still other embodiments, the at least one element identifier can correspond to an item for sale and the data presentation comprises an offer to purchase the item. In other embodiments, the at least one element identifier can correspond to a person and the data presentation can provide information about that person. In still other embodiments, the at least one element identifier can correspond to a song and the data presentation can comprise an offer to purchase a copy of the song. In still other embodiments, the at least one element identifier can correspond to a location and the data presentation can comprise travel information for reaching the location.

In another embodiment, a method can be implemented, which includes the steps of, for example, automatically annotating, via an annotation module, content including video content comprising a plurality of frames; storing, in an annotation database that communicates with the annotation module, at least one element identifier; and providing the at least one element identifier from the annotation database in response to a query comprising frame specification data and a coordinate, wherein the frame specification data identifies which frame among the plurality of frames was displayed when the coordinate was selected.

In another embodiment, steps or operations can be provided for displaying, via a display device, the video content comprising the plurality of frames; and selecting, via a pointing device, the coordinate on the display device. In still other embodiments, a step or operation can be implemented for providing the video content to the display device from a media device. In still other embodiments of the aforementioned method, the frame specification data comprises a timestamp and a media tag, wherein the media tag identifies the video content, wherein the timestamp identifies at least one frame among the plurality of frames of the video content, and wherein the at least one frame is displayed when the coordinate is selected. In other embodiments, steps can be implemented for providing an additional data server that produces element data based on the at least one element identifier; and providing a data presentation that displays the element data. In yet other embodiments of such a method, the at least one element identifier can correspond to an item for sale and the data presentation comprises an offer to purchase the item. In still other embodiments of such a method, the at least one element identifier can correspond to a person and the data presentation provides information about that person. In yet another embodiment of such a method, the at least one element identifier can correspond to a song and the data presentation comprises an offer to purchase a copy of the song. In yet other embodiments, the at least one element identifier can correspond to a location and the data presentation comprises travel information for reaching the location.

In still other embodiments, a processor-readable medium can store code representing instructions to cause a processor to perform a process. Such code can comprise code to, for example: automatically annotate, via an annotation module, content including video content comprising a plurality of frames; store, in an annotation database that communicates with the annotation module, at least one element identifier; and provide the at least one element identifier from the annotation database in response to a query comprising frame specification data and a coordinate, wherein the frame specification data identifies which frame among the plurality of frames was displayed when the coordinate was selected.

It will be appreciated that variations of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. Also, various presently unforeseen or unanticipated alternatives, modifications, variations or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.

Claims

1. A system, comprising:

an annotation module that automatically annotates content including video content comprising a plurality of frames;

an annotation database that stores at least one element identifier, and wherein said annotation database communicates with said annotation module; and

at least one element identifier provided by said annotation database in response to a query comprising frame specification data and a coordinate, wherein said frame specification data identifies which frame among said plurality of frames was displayed when said coordinate was selected.

2. The system of claim 1 further comprising:

a display device that displays said video content comprising said plurality of frames; and

a pointing device for selecting said coordinate on said display device.

3. The system of claim 2 further comprising a media device that provides said video content to said display device.

4. The system of claim 1 wherein said frame specification data comprises a timestamp and a media tag, wherein said media tag identifies said video content, wherein said timestamp identifies at least one frame among said plurality of frames of said video content and wherein said at least one frame is displayed when said coordinate is selected.

5. The system of claim 1 further comprising:

an additional data server that produces element data based on said at least one element identifier; and

a data presentation that displays said element data.

6. The system of claim 1 wherein said at least one element identifier corresponds to an item for sale and said data presentation comprises an offer to purchase said item.

7. The system of claim 1 wherein said at least one element identifier corresponds to a person and said data presentation provides information about that person.

8. The system of claim 1 wherein said at least one element identifier corresponds to a song and said data presentation comprises an offer to purchase a copy of said song.

9. The system of claim 1 wherein said at least one element identifier corresponds to a location and said data presentation comprises travel information for reaching said location.

10. A method, comprising:

automatically annotating via an annotation module, content including video content comprising a plurality of frames;

storing in an annotation database that communicates with said annotation module, at least one element identifier; and

providing said at least one element identifier from said annotation database in response to a query comprising frame specification data and a coordinate, wherein said frame specification data identifies which frames among said plurality of frames were displayed when said coordinate is selected.

11. The method of claim 10 further comprising:

displaying via a display device, said video content comprising said plurality of frames; and

selecting via a pointing device, said coordinate on said display device.

12. The method of claim 11 further comprising providing said video content to said display device from a media device.

13. The method of claim 10 wherein said frame specification data comprises a timestamp and a media tag, wherein said media tag identifies said video content, wherein said timestamp identifies at least one frame among said plurality of frames of said video content and wherein said at least one frame is displayed when said coordinate is selected.

14. The method of claim 10 further comprising:

providing an additional data server that produces element data based on said at least one element identifier; and

providing a data presentation that displays said element data.

15. The method of claim 10 wherein said at least one element identifier corresponds to an item for sale and said data presentation comprises an offer to purchase said item.

16. The method of claim 10 wherein said at least one element identifier corresponds to a person and said data presentation provides information about that person.

17. The method of claim 10 wherein said at least one element identifier corresponds to a song and said data presentation comprises an offer to purchase a copy of said song.

18. The method of claim 10 wherein said at least one element identifier corresponds to a location and said data presentation comprises travel information for reaching said location.

19. A processor-readable medium storing code representing instructions to cause a processor to perform a process, said code comprising code to:

automatically annotate via an annotation module, content including video content comprising a plurality of frames;

store in an annotation database that communicates with said annotation module, at least one element identifier; and

provide said at least one element identifier from said annotation database in response to a query comprising frame specification data and a coordinate, wherein said frame specification data identifies which frames among said plurality of frames were displayed when said coordinate is selected.

20. The processor-readable medium of claim 19, wherein said frame specification data comprises a timestamp and a media tag, wherein said media tag identifies said video content, wherein said timestamp identifies at least one frame among said plurality of frames of said video content and wherein said at least one frame is displayed when said coordinate is selected.