TECHNIQUE AND APPARATUS FOR ANALYZING VIDEO AND DIALOG TO BUILD VIEWING CONTEXT

Info

Publication number: 20130346144
Type: Application
Filed: Aug 25, 2011
Publication Date: Dec 26, 2013
Applicant: Intel Corporation (Santa Clara, CA)
Inventors: Bran Ferren (Beverly Hills, CA), Dimitri Negroponte (Los Angeles, CA), Eric Lawrence Angelson (New York, NY), Cory J. Booth (Beaverton, OR), Genevieve Bell (Portland, OR)
Application Number: 13/819,295

Abstract

A media processing device may include a processing component and a viewing context builder operative on the processing component. The viewing context builder may analyze media content comprising an audio stream, a video stream, and/or a closed captioning stream from a selected channel; extract context relevant data from the analyzed media content; and build a viewing preference profile from the context relevant data. Other embodiments are described and claimed.

Description

RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application Ser. No. 61/377,588 filed Aug. 27, 2010, which is incorporated herein by reference in its entirety.

BACKGROUND

Many content providers have attempted to determine what content is of interest to content consumers. Knowing what content is of interest can help target advertising, suggest similar content to the consumer, and support other functions. Content providers may include some metadata with their content streams, and the metadata can be used in the determination of the content of interest. The metadata may include such information as a category of content, e.g. sports, news, comedy, drama; primary actors; a title; and a channel. However, the content that is of interest to a consumer may not be reflected or captured by the metadata. Accordingly, there may be a need for improved techniques to solve these and other problems.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates one embodiment of a media processing system.

FIG. 2 illustrates one embodiment of a media processing component.

FIG. 3 illustrates one embodiment of a viewing context builder component.

FIG. 4 illustrates one embodiment of a media content analyzer component.

FIG. 5 illustrates one example of media content.

FIG. 6 illustrates one embodiment of a logic flow.

FIG. 7 illustrates one embodiment of a computing architecture.

FIG. 8 illustrates one embodiment of a communications architecture.

DETAILED DESCRIPTION

Consumer electronics, processing systems and communications systems are converging. For instance, consumer electronics such as digital televisions and media centers are evolving to include processing capabilities typically found on a computer and communications capabilities typically found in mobile devices. As such, heterogeneous consumer electronics continue to evolve into a single integrated system, sometimes referred to as a “digital home system.”

A digital home system may be arranged to provide a compelling entertainment environment in which a user can move seamlessly between television viewing, internet access, and home media management in various embodiments. In some embodiments, a single flexible and dynamic interface may allow a user to find the television programming that they wish to view, acquire the information that they seek from the Web, or enjoy personal audio files, photos, and movies. The system may also facilitate enhanced television viewing, enable collaborative interaction with family and friends, and securely execute financial transactions. A digital home system may provide these features while retaining the familiar design sensibilities and ease-of-use of a traditional television.

In various embodiments, a digital home system may address common deficiencies associated with current entertainment systems in which access to television programming, the internet, and personal media requires operation of three separate interfaces. For example, a unified interface of the digital home system may incorporate physical and graphical elements tied to an easily understood underlying organizational framework, making a home entertainment experience more interesting, compelling, engaging, and efficient. A unified interface may combine the best aspects of the three integrated paradigms, e.g., those of television, internet, and computers. For example, elements such as animation, information-rich displays, and video and audio cues from traditional televisions and television menus may be incorporated into the unified interface. Similarly, seamless integration of different forms of content and communications mechanisms from traditional internet experiences, allowing links from one form of content to another and providing tools such as messaging and video conferencing, may also be incorporated. And from computers, point-and-click mechanisms that allow effective navigation of complex information spaces may also be part of the unified interface of the digital home system in various embodiments.

The digital home system may utilize, in some embodiments, a visual display such as a television display as a navigation device. Using the display in combination with any number of remote control devices, a user can carry out complex tasks in fulfilling and transformative ways. The digital home system may include familiar mechanisms such as on-screen programming guides, innovative technologies that facilitate navigation via natural motions and gestures, and context-sensitivity that understands the user and the options available to the user. Together, these may make the digital home system experience intuitive and efficient while empowering the user to utilize multiple devices in a seamlessly integrated way.

For a typical television-viewing, media-perusing, and web-browsing home user, the digital home system may be arranged to provide a unified home entertainment experience, allowing the user to freely navigate through television, media, and internet offerings from a traditional viewing position (such as a sofa) using a unified interface. In some embodiments, the unified interface integrates the information provided by a diverse array of devices and services into the existing television or other display in a functionally seamless and easily understood manner.

The digital home system may include, in various embodiments, multi-axis integrated on-screen navigation, allowing the display screen to be used for navigation as well as for the presentation of content. In some embodiments, the digital home system may also include a user interface engine operative to provide context-sensitive features and overlays intelligently integrated with the underlying content and adaptive to the viewing environment. A family of remote controls and other input/output devices may also be incorporated into the digital home system in various embodiments to further enhance the intuitive user interactions, ease of use and overall quality of the system. The embodiments are not limited in this context.

Various embodiments are directed to a system and method of building viewing context from media content. Embodiments may analyze an audio stream, a video stream, and/or a closed captioning stream for context relevant data, such as keywords, actor or personality names, locations, topics, historical time periods, music and so forth. The context relevant data may be used to build a viewing preference profile for a content consumer that aggregates the context relevant data. The viewing preference profile may be used to identify content that may also be of interest to the content consumer, target advertising, build a personal brand channel, and so forth. As a result, the embodiments can improve upon the data available for building viewer profiles and enhance a viewing experience for a content consumer.

Reference is now made to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding thereof. It may be evident, however, that the novel embodiments can be practiced without these specific details. In other instances, well known structures and devices are shown in block diagram form in order to facilitate a description thereof. The intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the claimed subject matter.

FIG. 1 illustrates a block diagram for a media processing system 100. The media processing system 100 is generally directed to performing media processing operations for media content in accordance with any associated control signaling necessary for presenting media content on an output device. In one embodiment, the media processing system 100 is particularly arranged to provide media content from disparate media sources to viewers in a home environment, such as a digital home system, for example. However, the media processing system 100 may be suitable for any use scenarios involving presentation and display of media content. Although the media processing system 100 shown in FIG. 1 has a limited number of elements in a certain topology, it may be appreciated that the media processing system 100 may include more or fewer elements in alternate topologies as desired for a given implementation. The embodiments are not limited in this context.

In various embodiments, various elements of the media processing system 100 may communicate, manage, or process information in accordance with one or more protocols. A protocol may comprise a set of predefined rules or instructions for managing communication among nodes. A protocol may be defined by one or more standards as promulgated by a standards organization, such as the International Telecommunication Union (ITU), the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC), the Institute of Electrical and Electronics Engineers (IEEE), the Internet Engineering Task Force (IETF), the Moving Picture Experts Group (MPEG), and so forth. For example, the described embodiments may be arranged to operate in accordance with standards for media processing, such as the National Television System Committee (NTSC) standards, the Advanced Television Systems Committee (ATSC) standards, the Phase Alternating Line (PAL) standards, the MPEG-1 standard, the MPEG-2 standard, the MPEG-4 standard, the OpenCable standard, the Society of Motion Picture and Television Engineers (SMPTE) Video-Codec (VC-1) standards, the ITU-T H.263 and H.264 standards, and others. Another example may include various Digital Video Broadcasting (DVB) standards, such as the Digital Video Broadcasting Terrestrial (DVB-T) broadcasting standard, the DVB Satellite (DVB-S) broadcasting standard, the DVB Cable (DVB-C) broadcasting standard, and others. DVB is a suite of internationally accepted open standards for digital television. DVB standards are maintained by the DVB Project, an international industry consortium, and they are published by a Joint Technical Committee (JTC) of the European Telecommunications Standards Institute (ETSI), the European Committee for Electrotechnical Standardization (CENELEC) and the European Broadcasting Union (EBU). The embodiments are not limited in this context.

In various embodiments, elements of the media processing system 100 may be arranged to communicate, manage or process different types of information, such as media information and control information. Examples of media information may generally include any data or signals representing multimedia content meant for a user, such as media content, voice information, video information, audio information, image information, textual information, numerical information, alphanumeric symbols, graphics, and so forth. Control information may refer to any data or signals representing commands, instructions, control directives or control words meant for an automated system. For example, control information may be used to route media information through a system, to establish a connection between devices, instruct a device to process the media information in a predetermined manner, monitor or communicate status, perform synchronization, and so forth. The embodiments are not limited in this context.

In various embodiments, media processing system 100 may be implemented as a wired communication system, a wireless communication system, or a combination of both. Although media processing system 100 may be illustrated using a particular communications media by way of example, it may be appreciated that the principles and techniques discussed herein may be implemented using any type of communication media and accompanying technology. The embodiments are not limited in this context.

When implemented as a wired system, for example, the media processing system 100 may include one or more elements arranged to communicate information over one or more wired communications media. Examples of wired communications media may include a wire, cable, printed circuit board (PCB), backplane, switch fabric, semiconductor material, twisted-pair wire, co-axial cable, fiber optics, and so forth. The wired communications media may be connected to a device using an input/output (I/O) adapter. The I/O adapter may be arranged to operate with any suitable technique for controlling information signals between elements using a desired set of communications protocols, services or operating procedures. The I/O adapter may also include the appropriate physical connectors to connect the I/O adapter with a corresponding communications medium. Examples of an I/O adapter may include a network interface, a network interface card (NIC), disc controller, video controller, audio controller, and so forth. The embodiments are not limited in this context.

When implemented as a wireless system, for example, the media processing system 100 may include one or more wireless elements arranged to communicate information over one or more types of wireless communication media. An example of wireless communication media may include portions of a wireless spectrum, such as the RF spectrum. The wireless elements may include components and interfaces suitable for communicating information signals over the designated wireless spectrum, such as one or more antennas, wireless transmitters, receivers, transmitters/receivers (“transceivers”), amplifiers, filters, control logic, and so forth. The embodiments are not limited in this context.

In the illustrated embodiment shown in FIG. 1, the media processing system 100 may comprise a media processing device 110. The media processing system 100 may further comprise one or more input devices 102-a, one or more output devices 104-b, and one or more media sources 106-c. The media processing device 110 may be communicatively coupled to the input devices 102-a, the output devices 104-b, and the media sources 106-c via respective wireless or wired communications connections 108-d, 110-e and 112-f.

It is worthy to note that “a” and “b” and “c” and similar designators as used herein are intended to be variables representing any positive integer. Thus, for example, if an implementation sets a value for a=5, then a complete set of input devices 102-a may include input devices 102-1, 102-2, 102-3, 102-4 and 102-5. The embodiments are not limited in this context.

In various embodiments, the media processing system 100 may include one or more input devices 102-a. In general, each input device 102-a may comprise any component or device capable of providing information to the media processing device 110. Examples of input devices 102-a may include without limitation remote controls, pointing devices, keyboards, keypads, trackballs, trackpads, touchscreens, joysticks, game controllers, sensors, biometric sensors, thermal sensors, motion sensors, directional sensors, microphones, microphone arrays, video cameras, video camera arrays, global positioning system devices, mobile computing devices, laptop computers, desktop computers, handheld computing devices, tablet computing devices, netbook computing devices, smart phones, cellular telephones, wearable computers, and so forth. The embodiments are not limited in this context.

In various embodiments, the media processing system 100 may include one or more output devices 104-b. An output device 104-b may comprise any electronic device capable of reproducing, rendering or presenting media content for consumption by a human being. Examples of output devices 104-b may include without limitation a display, an analog display, a digital display, a television display, a projector and screen, a computer display, audio speakers, headphones, a printing device, lighting systems, warning systems, mobile computing devices, laptop computers, desktop computers, handheld computing devices, tablet computing devices, netbook computing devices and so forth. The embodiments are not limited in this context.

While various embodiments refer to input devices 102-a providing information to media processing device 110 and output devices 104-b receiving information from media processing device 110, it should be understood that one or more of the input devices 102-a and output devices 104-b may allow for the exchange of information to and from media processing device 110 via their respective connections 108-d and 110-e. For example, one or more of input devices 102-a may be operative to provide information to media processing device 110 and to receive information from media processing device 110. In various embodiments, one or more of output devices 104-b may be operative to receive information from media processing device 110 and may also be operative to provide information to media processing device 110. Similarly, there may be a bi-directional exchange between the media processing device 110 and media sources 106-c. For instance, a media source 106-c may be operative to provide media information to the media processing device 110 and to receive information from the media processing device 110. An example of this would be a video on demand (VOD) application implemented by the media processing device 110. The embodiments are not limited in this context.

In one embodiment, for example, the media processing system 100 may include a display 104-1. The display 104-1 may comprise any analog or digital display capable of presenting media information received from media sources 106-c. The display 104-1 may display the media information at a defined format resolution. In various embodiments, for example, the incoming video signals received from media sources 106-c may have a native format, sometimes referred to as a visual resolution format. Examples of a visual resolution format include a digital television (DTV) format, high definition television (HDTV), progressive format, computer display formats, and so forth. For example, the media information may be encoded with a vertical resolution format ranging from 480 visible lines per frame to 1080 visible lines per frame, and a horizontal resolution format ranging from 640 visible pixels per line to 1920 visible pixels per line. In one embodiment, for example, the media information may be encoded in an HDTV video signal having a visual resolution format of 720 progressive (720p), which refers to 1280 horizontal pixels and 720 vertical pixels (1280×720). In another example, the media information may have a visual resolution format corresponding to various computer display formats, such as a video graphics array (VGA) format resolution (640×480), an extended graphics array (XGA) format resolution (1024×768), a super XGA (SXGA) format resolution (1280×1024), an ultra XGA (UXGA) format resolution (1600×1200), and so forth. The type of displays and format resolutions may vary in accordance with a given set of design or performance constraints, and the embodiments are not limited in this context.

In various embodiments, the media processing system 100 may include one or more media sources 106-c. Media sources 106-c may comprise any media source capable of sourcing or delivering media information and/or control information to media processing device 110. More particularly, media sources 106-c may comprise any media source capable of sourcing or delivering digital audio and/or video (AV) signals to media processing device 110. Examples of media sources 106-c may include any hardware or software element capable of storing and/or delivering media information, such as a digital video recorder (DVR), a personal video recorder (PVR), a digital versatile disc (DVD) device, a video home system (VHS) device, a digital VHS device, a disk drive, a hard drive, an optical disc drive, a universal serial bus (USB) flash drive, a memory card, a secure digital (SD) memory card, a mass storage device, a flash drive, a computer, a gaming console, a compact disc (CD) player, computer-readable or machine-readable memory, a digital camera, camcorder, video surveillance system, teleconferencing system, telephone system, medical and measuring instruments, scanner system, copier system, television system, digital television system, set top boxes, personal video recorders, server systems, computer systems, personal computer systems, smart phones, tablets, notebooks, handheld computers, wearable computers, portable media players (PMP), portable media recorders (PMR), digital audio devices (e.g., MP3 players), digital media servers and so forth. Other examples of media sources 106-c may include media distribution systems that provide broadcast or streaming analog or digital AV signals to media processing device 110. Examples of media distribution systems may include, for example, Over The Air (OTA) broadcast systems, terrestrial cable systems (CATV), satellite broadcast systems, and so forth. It is worthy to note that media sources 106-c may be internal or external to media processing device 110, depending upon a given implementation. The embodiments are not limited in this context.

In various embodiments, the media processing system 100 may include one or more media processing devices 110. The media processing device 110 may comprise any electronic device arranged to receive, process, manage, and/or present media information received from media sources 106-c. In general, the media processing device 110 may include, among other elements, a processing system, a processing sub-system, a processor, a computer, a device, an encoder, a decoder, a coder/decoder (codec), a filtering device (e.g., graphic scaling device, deblocking filtering device), a transformation device, an entertainment system, a display, or any other processing or communications architecture. The embodiments are not limited in this context.

The media processing device 110 may execute processing operations or logic for the media processing system 100 using a processing component 112. The processing component 112 may comprise various hardware elements, software elements, or a combination of both. Examples of hardware elements may include devices, components, processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. Examples of software elements may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an embodiment is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given implementation.

The media processing device 110 may execute communications operations or logic for the media processing system 100 using communications component 120. The communications component 120 may implement any well-known communications techniques and protocols, such as techniques suitable for use with packet-switched networks (e.g., public networks such as the Internet, private networks such as an enterprise intranet, and so forth), circuit-switched networks (e.g., the public switched telephone network), or a combination of packet-switched networks and circuit-switched networks (with suitable gateways and translators). The communications component 120 may include various types of standard communication elements, such as one or more communications interfaces, network interfaces, network interface cards (NIC), radios, wireless transmitters/receivers (transceivers), wired and/or wireless communication media, infra-red transceivers, serial interfaces, parallel interfaces, bus interfaces, physical connectors, and so forth. By way of example, and not limitation, the communication media used by the communications component 120 include wired communications media and wireless communications media, as previously described.

In various embodiments, the media processing device 110 may comprise a viewing context builder 114. Viewing context builder 114 is shown as part of media processing device 110 for purposes of illustration and not limitation. It should be understood that viewing context builder 114 could be located in other devices, components or nodes of media processing system 100 in various embodiments and still fall within the described embodiments.

While many digital streams include some forms of metadata about a media content product, the metadata may be limited in the information it includes, and in some cases, may not be available at all. In particular, metadata provided from the content source may be limited to the product as a whole, and may not address segments within the product. For example, in a local news program, the viewer may be interested in the local crime reports but not in the sports segment, a distinction that the metadata does not make. Further, metadata from different content sources may be inconsistent with each other, making data aggregation more challenging. For example, for a movie that takes place in outer space and has a horror component, one media source may classify the movie as “science fiction” while another may classify the movie as “horror”.

Viewing context builder 114 may analyze media content received from media sources 106-c and/or stored on media processing device 110 for data that is relevant to viewer context. Context relevant data may include information about content that provides some insight or indication as to what a content consumer, e.g. a viewer, prefers. Context relevant data may include any kind of information that can be gleaned from audio streams, video streams, closed caption streams, and other streams that may be included in the digital media stream. Context relevant data may include, for example but not limited to: keywords, locations or settings, actors or personalities, genres, time period, subject, music, and so forth. While context relevant data may include data from the metadata, it may also go beyond that information to provide a richer context for a viewer. An example of a viewing context builder 114 is described in more detail below with respect to FIG. 3.
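By way of illustration only, context relevant data of the kinds listed above might be held in a simple record. The following Python sketch is an assumption for explanatory purposes and is not a structure defined by the embodiments; the class name `ContextRelevantData`, its field names, and the helper `merge_genres` are all hypothetical. It also shows one way labels from differing metadata sources could be kept side by side rather than treated as conflicting.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContextRelevantData:
    """Hypothetical record; fields mirror the categories listed in the text."""
    keywords: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)
    personalities: List[str] = field(default_factory=list)
    genres: List[str] = field(default_factory=list)
    time_period: Optional[str] = None
    subject: Optional[str] = None
    music: List[str] = field(default_factory=list)


def merge_genres(record: ContextRelevantData, metadata_genres: List[str]) -> None:
    """Add source-supplied genre labels, normalized to lowercase, skipping duplicates."""
    for genre in metadata_genres:
        label = genre.strip().lower()
        if label and label not in record.genres:
            record.genres.append(label)


# Data gleaned from the streams, plus labels from two different media sources.
record = ContextRelevantData(keywords=["spaceship", "alien"], locations=["outer space"])
merge_genres(record, ["Science Fiction"])  # label supplied by one media source
merge_genres(record, ["Horror"])           # label supplied by another media source
print(record.genres)
```

In this sketch both genre labels are retained, so the record reflects the movie as both “science fiction” and “horror”.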

FIG. 2 illustrates a block diagram for a media processing system 200 that may be the same or similar to media processing system 100 of FIG. 1 where like elements are similarly numbered. The media processing system 200 may comprise a sample digital home system implementation that is arranged to provide media content from disparate media sources to viewers in a home, office, or room environment. Although the media processing system 200 shown in FIG. 2 has a limited number of elements in a certain topology, it may be appreciated that the media processing system 200 may include more or fewer elements in alternate topologies as desired for a given implementation. The embodiments are not limited in this context.

In the illustrated embodiment shown in FIG. 2, the media processing system 200 may comprise a media processing device 110, input device 102-1, output devices 104-1, 104-2 and 104-3, and one or more media sources 106 (not shown). The media processing device 110 may be communicatively coupled to the input device 102-1, the output devices 104-1, 104-2 and 104-3, and the media sources 106 via respective wireless or wired communications connections 108-2, 110-1, 110-2 and 110-3. For purposes of illustration, the one or more media sources 106 of FIG. 2 (not shown) are part of, or integrated into, media processing device 110. Other embodiments are described and claimed.

In various embodiments, media processing device 110 may comprise a set-top box, digital media hub, media server, or other suitable processing device arranged to control the digital home system 200. While shown as a separate component in FIG. 2, it should be understood that media processing device 110 may be arranged as part of output device 104-1 or any other suitable component of system 200 in some embodiments. Output device 104-1 may comprise a display arranged to display information received from media processing device 110 over connection 110-1 in some embodiments. Embodiments of output device 104-1 may include without limitation a display, an analog display, a digital display, a television display, a projector and screen, a computer display, audio speakers, headphones, a printing device, lighting systems, warning systems, mobile computing devices, laptop computers, desktop computers, handheld computing devices, tablet computing devices, netbook computing devices and so forth. In various embodiments, output devices 104-2 and 104-3 may comprise speakers arranged to reproduce audio or other acoustic signals received from media processing device 110 over connections 110-2 and 110-3 respectively. Input device 102-1 may comprise a remote control, smart phone, or other suitable processing device capable of communicating with media processing device 110, output device 104-1 or any other device in the digital home system 200. Together, the components, nodes or devices of media processing system 200 may form or comprise one example embodiment of a digital home entertainment system. The embodiments are not limited to the type, number or arrangement of components illustrated in FIG. 2.

FIG. 3 illustrates an embodiment of a viewing context builder 300. Viewing context builder 300 may be a representative example of viewing context builder 114. Viewing context builder 300 may include one or more components or modules to provide the functionality described herein. In an embodiment, for example, viewing context builder 300 may include a media content analyzer 310, a context relevant data extractor 320, and a viewing preference profile builder 330. The embodiments are not limited to the type, number or arrangement of components illustrated in FIG. 3.
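As a minimal sketch of how the three components named above might be chained, the following Python example passes closed caption text through an analyzer, an extractor and a profile builder. All class and method names are hypothetical, and the logic inside each stage is a deliberately simple stand-in (whitespace tokenizing, a small stop-word list, and keyword counting); it is an assumption for illustration and not the method of any embodiment.

```python
from collections import Counter
from typing import List


class MediaContentAnalyzer:
    """Stand-in for analyzer 310: splits caption text into lowercase words."""

    def analyze(self, caption_text: str) -> List[str]:
        stripped = (w.strip('.,!?"').lower() for w in caption_text.split())
        return [w for w in stripped if w]


class ContextRelevantDataExtractor:
    """Stand-in for extractor 320: keeps words outside a small stop-word list."""

    STOP_WORDS = {"the", "a", "an", "of", "and", "in", "is", "to"}

    def extract(self, words: List[str]) -> List[str]:
        return [w for w in words if w not in self.STOP_WORDS]


class ViewingPreferenceProfileBuilder:
    """Stand-in for builder 330: aggregates keyword counts across programs."""

    def __init__(self) -> None:
        self.profile: Counter = Counter()

    def update(self, keywords: List[str]) -> None:
        self.profile.update(keywords)


class ViewingContextBuilder:
    """Chains the three stages in the order analyzer, extractor, profile builder."""

    def __init__(self) -> None:
        self.analyzer = MediaContentAnalyzer()
        self.extractor = ContextRelevantDataExtractor()
        self.profile_builder = ViewingPreferenceProfileBuilder()

    def process(self, caption_text: str) -> Counter:
        words = self.analyzer.analyze(caption_text)
        keywords = self.extractor.extract(words)
        self.profile_builder.update(keywords)
        return self.profile_builder.profile


builder = ViewingContextBuilder()
builder.process("The lion is a predator.")
profile = builder.process("The hippopotamus and the lion.")
print(profile.most_common(1))  # [('lion', 2)]
```

Keeping each stage behind its own small interface means any one stand-in could be swapped for a more capable implementation without changing the others.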

Media content analyzer 310 may analyze one or more streams of media content. Analyzing may include, for example, parsing the closed captioning stream to retrieve individual words, performing speech recognition on the audio stream, and/or performing various forms of video analysis on the video stream. Video analysis may include, for example, performing a “screen scrape” from an image in the video stream and identifying various elements, such as text, logos, faces, locations and so forth, in the screen scrape. Video analysis may also include, for example, object recognition, person or face recognition, foreground/background distinction, motion detection, and so forth. Media content analyzer 310 may generally identify components within the streams that potentially convey context relevant information. An example of a media content analyzer 310 is described further with respect to FIG. 4.

Context relevant data extractor 320 (“data extractor 320”) may receive analyzed media content from media content analyzer 310 and may identify the components that convey context relevant information.

The components of a closed captioning stream, for example, are words. The words usually reflect the spoken dialog in the audio stream, and may also provide indications as to what non-verbal audible events are taking place, such as “soft music playing” or “footsteps approaching.” Data extractor 320 may perform information retrieval or text mining on the parsed text to identify statistically “important” words or phrases. The statistically important words are most likely to provide some indication of content and context. Important words may generally be words that occur rarely in a particular corpus of text.

By way of background, one method of text mining is term frequency-inverse document frequency (tf-idf). In tf-idf, the importance of a word increases with the number of times it appears in the document (its frequency), offset by the frequency of the word in the corpus, or collection, of documents. A word such as “the” is a high-frequency word in all English documents, and would have a very low importance. However, a word such as “hippopotamus” is infrequently encountered and would likely be a more important word, if it appears several times in one document. Other methods of information retrieval and text mining may be used.

For embodiments, the closed captioning stream for one movie or television program may be considered a document, while the collection of closed captioning over a set of movies and/or television programs may be considered a corpus. A corpus may be comprised, for example, of the closed captioning of movies of a particular genre, e.g. science fiction movies. Data extractor 320 may, therefore, examine the frequency of words in the closed captioning stream of a television program, compare the frequency to the frequency of the words in the corpus and determine important words. The important words may be considered to be context relevant data in the form of keywords. For example, in a nature program about animals of Africa, the words “hippopotamus”, “lion”, “predator” and so forth may occur relatively frequently and may be determined to be keywords.
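
By way of illustration only, the keyword determination described above may be sketched as follows. The sketch treats one program's parsed closed captioning as the document and a small list of programs as the corpus. The function names `tf_idf` and `top_keywords`, the sample captions, and the add-one smoothing of the inverse document frequency are assumptions made for the example; the smoothing is one of several common tf-idf variants and is not prescribed by the embodiments.

```python
import math
from collections import Counter

def tf_idf(document, corpus):
    """Score each word of `document` (a list of words) against `corpus`
    (a list of documents, each itself a list of words)."""
    counts = Counter(document)
    total = len(document)
    n_docs = len(corpus)
    scores = {}
    for word, count in counts.items():
        tf = count / total  # term frequency within this document
        # Document frequency: how many corpus documents contain the word.
        df = sum(1 for doc in corpus if word in doc)
        # Assumed smoothing: add one to numerator and denominator.
        idf = math.log((1 + n_docs) / (1 + df))
        scores[word] = tf * idf
    return scores

def top_keywords(document, corpus, k=3):
    """Return the k highest scoring words, ties broken alphabetically."""
    scores = tf_idf(document, corpus)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]

# Hypothetical caption text for three programs.
nature = "the hippopotamus and the lion the predator stalks the hippopotamus".split()
corpus = [
    nature,
    "the doctor and the nurse called the code".split(),
    "the team won the match and the cup".split(),
]
print(top_keywords(nature, corpus))  # ['hippopotamus', 'lion', 'predator']
```

In this sample, “the” and “and” occur in every document of the corpus and therefore score zero, while “hippopotamus” occurs twice in a single document and scores highest.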

In an embodiment, the closed captioning stream may include non-dialog related text, such as text descriptions of sound effects or music. When a particular piece of music is identified, data extractor 320 may use that identification to present options to the viewer, such as a link to an online music store where the piece may be purchased. The genre and/or artist of the music may also be considered context relevant data. A viewer, for example, may show a preference for movies that play country music as part of the soundtrack.

In some cases, a closed captioning stream may not be available, for example, with a live television broadcast. In an embodiment, data extractor 320 may receive recognized speech from media content analyzer 310. Data extractor 320 may perform similar information retrieval or text mining operations on the recognized speech to extract keywords.

In an embodiment, data extractor 320 may receive components of a video stream from media content analyzer 310. Components of a video stream may include, for example, recognized text, a logo, a face, a background object, a foreground object, a color, a pattern, or any other visual feature that can be distinguished in a video image or stream. From the components, data extractor 320 may extract context relevant data. For example, from text, data extractor 320 may extract keywords; names of people; location names; dates; time periods, e.g. “last year” or “Victorian times”; and so forth. From a logo, data extractor 320 may identify a corporate or organization entity. From a face, data extractor 320 may use a facial recognition application to identify the person, such as an actor, a politician, a television personality, a public figure, or other individuals. From a background object, such as a landmark, data extractor 320 may identify, for example, a location from the landmark. A date may be identified from text or from a calendar graphic. From a foreground object, data extractor 320 may identify a type of object, such as a car, and identify that object as context relevant data. For any extracted data item, data extractor 320 may perform additional statistical analysis, analogous to tf-idf or other information mining techniques, to determine whether the extracted component is statistically likely to be indicative of viewer preference. The embodiments are not limited to these examples.

The context relevant data from data extractor 320 may be used as input to a search engine. The search engine may return information and links from sources external to media processing device 110 that are related to the context relevant data. The search engine may also search other channels of media content for context information containing the search terms.

Viewing preference profile builder 330 (“profile builder 330”) may receive the extracted context relevant data from data extractor 320. Profile builder 330 may generate and update a viewing preference profile 340 for the viewer or viewers using media processing device 110. Viewing preference profile 340 may count or statistically aggregate the extracted context relevant data over a period of time, e.g. a day, a week, a month and so forth. Viewing preference profile 340 may include a list, table or other data structure that reflects the viewing habits of the viewers. For example, viewing preference profile 340 may store information such as: 80% of the television programs watched contain the keywords: “cardiac”, “arrest”, “ER”, “doctor”, and “code blue,” if a viewer watches a lot of medical drama programs. Viewing preference profile 340 may include information such as: 60% of the movies watched take place in the time period “19th century”. Information from different data types may be combined, such as: 65% of news segments watched contain video of the President and the keyword “economy”.
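
As a minimal sketch of the kind of aggregation described above, the following computes, for each keyword, the fraction of watched programs in which the keyword appeared. The function name `keyword_shares` and the sample viewing history are hypothetical, and the representation of a program as a set of keywords is an assumption made for the example.

```python
from collections import Counter

def keyword_shares(programs):
    """Given one collection of extracted keywords per watched program,
    return the fraction of programs in which each keyword appeared."""
    if not programs:
        return {}
    seen = Counter()
    for keywords in programs:
        # Count a keyword at most once per program.
        seen.update(set(keywords))
    return {word: n / len(programs) for word, n in seen.items()}

# Hypothetical viewing history: four medical dramas and one news segment.
watched = [
    {"cardiac", "doctor", "ER"},
    {"doctor", "code blue"},
    {"doctor", "ER"},
    {"cardiac", "doctor"},
    {"economy"},
]
shares = keyword_shares(watched)
print(shares["doctor"])  # 0.8
```

A share of 0.8 for “doctor” corresponds to a profile entry of the form “80% of the programs watched contain the keyword”.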

Profile builder 330 may compare the context relevant data to data about concurrently available programming. This comparison may help distinguish between content that a viewer is consuming because “there's nothing else on” and content that a viewer is choosing over other content that the viewer has also indicated an interest in. The context relevant data may be analyzed, aggregated or processed in many other ways beyond the examples listed herein to form viewing preference profile 340.

Profile builder 330 may weight context related data differently, depending, for example, on how frequently a particular context arises. If a viewer watches two to three soccer matches a week, and watches a basketball game once a month, then “soccer match” as a context may be weighted more heavily than “basketball game” in the preference profile, indicating that the viewer is more interested in soccer than basketball. Similarly, recently watched contexts may have more weight than contexts that have not been encountered for a period of time.
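
One possible way to combine frequency and recency into a single weight is an exponential decay, sketched below. The decay model, the function name `context_weight`, and the 30-day half-life are all assumptions chosen for illustration; the embodiments do not specify a particular weighting formula.

```python
import math

def context_weight(days_ago_list, half_life_days=30.0):
    """Sum one decayed contribution per viewing. A viewing that is
    `half_life_days` old contributes half as much as one from today."""
    decay = math.log(2) / half_life_days
    return sum(math.exp(-decay * days) for days in days_ago_list)

# Hypothetical history: days elapsed since each viewing.
soccer = [1, 3, 6, 8, 10, 13, 15, 17, 20, 22, 24, 27]  # about three a week
basketball = [12]                                      # about once a month

print(context_weight(soccer) > context_weight(basketball))  # True
```

Under this model, frequent viewings accumulate weight, while a context that has not been encountered for some time gradually loses weight without being removed outright.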

FIG. 4 illustrates an example of a media content analyzer 400. Media content analyzer 400 may be a representative example of media content analyzer 310. Media content analyzer 400 may include one or more components or modules to provide the functionality described herein. In an embodiment, for example, media content analyzer 400 may include a closed captioning parser 410, a speech recognizer 420, and a video analyzer 430. The embodiments are not limited to type, number or arrangement of components illustrated in FIG. 4.

Closed captioning parser 410 may receive the closed captioning stream from media content and parse the stream for individual words. Capturing context relevant data from closed captioning may be desirable because parsing text is computationally inexpensive, in particular, when compared to speech recognition or video analytics. In an embodiment, closed captioning parser 410 may remove “filler” words, such as articles, prepositions and pronouns, prior to providing the parsed words to data extractor 320. In an embodiment, closed captioning parser 410 may be used in conjunction with a search engine, where the search engine may search the parsed closed captioning streams for the search terms to assist the searcher in finding content of interest.
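
The parsing and “filler” word removal described above may be sketched as follows. The stop list `FILLER_WORDS` is a deliberately tiny, hypothetical list, and the function name `parse_captions` is an assumption; a practical parser would likely use a fuller list of articles, prepositions and pronouns.

```python
import re

# Hypothetical, abbreviated stop list of articles, prepositions and pronouns.
FILLER_WORDS = {
    "a", "an", "the",
    "of", "in", "on", "to", "at",
    "i", "you", "he", "she", "it", "we", "they",
}

def parse_captions(caption_text):
    """Split caption text into lowercase words and drop filler words."""
    words = re.findall(r"[a-z']+", caption_text.lower())
    return [w for w in words if w not in FILLER_WORDS]

print(parse_captions("The doctor ran to the ER."))  # ['doctor', 'ran', 'er']
```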

Closed captioning parser 410 may recognize non-verbal items in the closed captioning stream, such as sound effect notations and musical information. In an embodiment, the non-verbal items may be identified as non-verbal text data. In an embodiment, multiple channels of closed captioning streams may be parsed in parallel by closed captioning parser 410.
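
The following sketch separates non-verbal items from dialog in a single caption line. It assumes that non-verbal notations are enclosed in square brackets or parentheses, which is a common captioning convention but is not stated in this description; the function name `split_caption` is likewise hypothetical.

```python
import re

# Assumed convention: non-verbal notations appear in [brackets] or (parentheses).
NON_VERBAL = re.compile(r"\[([^\]]+)\]|\(([^)]+)\)")

def split_caption(line):
    """Return (dialog, non_verbal_items) for one caption line."""
    # Each match yields a pair of groups; exactly one of them is non-empty.
    items = [a or b for a, b in NON_VERBAL.findall(line)]
    # Remove the notations and collapse the remaining whitespace.
    dialog = " ".join(NON_VERBAL.sub(" ", line).split())
    return dialog, [item.strip().lower() for item in items]

print(split_caption("[Soft music playing] Who is there? (footsteps approaching)"))
# ('Who is there?', ['soft music playing', 'footsteps approaching'])
```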

Speech recognizer 420 may receive the audio stream from media content and perform speech recognition, e.g. speech to text, on the audio stream to retrieve individual words. Once the individual words are identified, speech recognizer 420 may function similarly to closed captioning parser 410, for example, by removing “filler” words.

Video analyzer 430 may be comprised of one or more components or modules to provide the functionality described herein. For example, video analyzer 430 may include a screen scraper 432, an object recognizer 434, a text recognizer 436, a face recognizer 438, and additional other recognizers 440. Video analyzer 430 may receive the video stream from media content and perform various video analytical functions on the video stream.

Screen scraper 432, for example, may capture and analyze one or more still images that form part of the video stream. Analyzing may include, for example, operating on text with an optical character recognition (OCR) function. Screen scraper 432 may, independently of or in conjunction with the other video analyzer 430 components, identify potential context relevant data from a “screen scrape”. For example, screen scraper 432 may use known or future image and video analytics to recognize the presence of text, faces, specific objects, logos, patterns, dates, people, or any other visual element of the screen scrape. In an embodiment, screen scraper 432 may analyze the screen scrape to isolate objects, but may not further identify what types of objects they are.

Object recognizer 434 may analyze the video stream directly, or from the screen scrape to identify discrete objects. Discrete objects tend to have a consistent border and colors as compared to a background. In some cases, object recognizer 434 may be able to recognize and identify specific categories or types of objects, such as people, vehicles, animals, landmark buildings, e.g. the Eiffel Tower, plants, bodies of water, and so forth. Object recognizer 434 may use various definitions, rule sets, or comparison images to identify an object. For example, a person may be, simplistically, recognized by a round or oval shape (the head) above a wider rectangular shape (the torso). A vehicle such as a car may be recognized by a rectangular shape with a trapezoidal shape on top. Object recognizer 434 may identify graphics, such as logos, icons, trademarks, or symbols.

Text recognizer 436 may analyze the video stream directly, or from the screen scrape to identify text on the screen. Text recognizer 436 may, for example, compare objects to the shapes of alphanumeric characters to determine whether the object is text. Text recognizer 436 may then perform optical character recognition on the identified text to convert the graphical representation of text to text. Text recognizer 436 may therefore be able to retrieve text from, for example, sub-titles, signs, bill-boards, text “crawl” areas, image or video captions, and so forth. Once the text is converted, text recognizer 436 may function similarly to closed captioning parser 410, for example, by removing “filler” words.

Face recognizer 438 may analyze the video stream directly, from the screen scrape, or from object recognizer 434 to recognize faces on the screen. Face recognizer 438 may use known or future-developed facial identification and recognition techniques. Face recognizer 438, or object recognizer 434, may identify a face, for example, from an object having an ovoid shape having eye-shapes roughly aligned in the horizontal, a nose shape below and between the eye shapes, and a horizontal mouth-shape below and centered on the nose shape. Once an object is recognized as being a face, face recognizer 438 may perform face recognition to identify the specific person having the face. Face recognizer 438 may access local or remote storage sites that include faces linked with names, in particular of prominent people, including news sites, entertainment sites, government sites and so forth. Face recognizer 438 may perform the face recognition itself, or may provide the face to an external service, receiving the name of the recognized person as a result. Face recognizer 438 may identify people such as actors, news reporters or presenters, politicians, business people, musicians, athletes, or other people that may be prominently figured in media content.

Video analyzer 430 may include other content recognizers 440 to isolate and/or identify content that is present in a video stream. Other content recognizers 440 may, for example, recognize colors, color schemes, patterns, lighting qualities, and so forth. Viewers may have preferences for brightly lit scenes, or warm color palettes, for example, perhaps subconsciously, that viewing preference profile builder 330 may include as a context relevant factor.

The various components of video analyzer 430 may be separate components, or may be joined in any combination to provide the functionality described herein. The embodiments are not limited to the examples.

FIG. 5 illustrates an example of a “screen scrape” 500 and of some of the context relevant data that may be extracted therefrom. Screen scrape 500 may be of a news program, for example. Screen scrape 500 may include a person object 502, e.g. the news presenter, a logo graphic 504, e.g. the news station or channel logo, and a date field 506. Date field 506 is shown as text, but could also be a calendar graphic. Screen scrape 500 may also include a text crawl area 508, a background object 510, an icon or graphic 512, a caption 514, and an inset area with a face 516.

Media content analyzer 310, 400 may use various video analysis components to analyze screen scrape 500. For example, object recognizer 434 may identify a person (or face) object 502, a background object 510, graphic objects 504 and 512, and a face object 516. Text recognizer 436 may identify the text areas 506, 508 and 514, and may convert the graphical text to plain text, for example, via optical character recognition. Face recognizer 438 may attempt to identify the people of objects 502 and 516.

Data extractor 320 may receive the components from the analyzed screen scrape, and identify and extract context relevant data. Data extractor 320 may additionally receive dialog information from speech recognizer 420 and/or closed captioning parser 410. For example, data extractor 320 may identify background object 510 as one of the great pyramids in Egypt, and extract the context relevant data of “Egypt”. From the dialog that may be describing the end to a drought in the desert in Egypt, and the text in text object 514, keywords such as “rain”, “desert” and “drought” may be extracted as context relevant data. Face 516 may be identified as an official of the Egyptian government, and the name of the official may be extracted as context relevant data.

The different elements of context relevant data that are extracted can further serve to identify or rule out other elements as being context related. In the pictured example, the location “Egypt”, the Egyptian official, and keyword “desert” support a context such as “news about Egypt”, while the potential keywords “championship” and “kitten” from text 508 may be discarded, as they appear too infrequently to be considered important.

The extracted context relevant data may be received by viewing preference profile builder 330. Viewing preference profile builder 330 may add the context relevant data to viewing preference profile 340, either directly, or after some statistical analysis. If, for example, there is no data in the profile that relates to deserts, droughts, and/or Egypt, those topics may be added to the profile but may be assigned a low weight so as not to unduly influence the existing preferences. If “Egypt” occurs frequently in a viewing context, then the re-occurrence of “Egypt” may add some weight to the context as a preference.
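
A minimal sketch of such an update follows, assuming the profile is held as a mapping from context to numeric weight. The function name `update_profile` and the values of `initial_weight` and `increment` are hypothetical; they merely illustrate a low starting weight for new topics and added weight on re-occurrence.

```python
def update_profile(profile, contexts, initial_weight=0.1, increment=1.0):
    """Add newly seen contexts at a low weight and reinforce known ones."""
    for context in contexts:
        if context in profile:
            profile[context] += increment      # re-occurrence adds weight
        else:
            profile[context] = initial_weight  # new topic starts low
    return profile

profile = {"egypt": 2.0}
update_profile(profile, ["egypt", "drought"])
print(profile)  # {'egypt': 3.0, 'drought': 0.1}
```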

In an embodiment, media processing device 110 may use viewing preference profile 340 in several ways. For example, media content known to have similar or related contexts to the contexts in the viewing preference profile may be suggested to the viewer as being similar to other media content the viewer has enjoyed. Media processing device 110 may be in communication with other media processing devices, for example, those identified as “friends” of the viewer, and may compare viewing preference profiles. Where context elements overlap between profiles, media processing device 110 may determine other context elements in the friend's profile not present in viewing preference profile 340. Media processing device 110 may then identify media content that contains those other context elements and suggest them to the viewer as “People who watched X, also watched Y”, where “X” reflects media content with context in common with the viewer, and “Y” reflects media content with context not found in viewing preference profile 340.
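
The profile comparison described above may be sketched with set operations, assuming each profile is reduced to a set of context elements and that a catalog maps each title to its context elements. The function name `suggest_from_friend`, the catalog structure, and the sample titles are hypothetical.

```python
def suggest_from_friend(own_profile, friend_profile, catalog):
    """Suggest titles carrying context elements that the friend has and
    the viewer lacks, provided the two profiles overlap at all."""
    if not own_profile & friend_profile:
        return []  # no common context, so no basis for a suggestion
    novel = friend_profile - own_profile
    return sorted(title for title, contexts in catalog.items()
                  if contexts & novel)

own = {"egypt", "desert", "soccer"}
friend = {"egypt", "archaeology", "jazz"}
catalog = {
    "Program A": {"egypt", "archaeology"},
    "Program B": {"soccer"},
    "Program C": {"jazz"},
}
print(suggest_from_friend(own, friend, catalog))  # ['Program A', 'Program C']
```

Here the shared element “egypt” plays the role of “X”, and the titles returned carry the friend's elements that play the role of “Y”.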

Embodiments may perform analysis and context relevant data extraction on a selected channel, e.g. the channel that media processing device 110 is tuned to. Embodiments may also perform at least some forms of analysis and context relevant data extraction on multiple channels in parallel. For example, closed captioning parsing requires relatively little processing power, compared to video analytics, and may provide sufficient context to provide some concurrent content suggestions to a viewer.
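
Parsing several channels concurrently may be structured as in the sketch below, which uses a thread pool from the Python standard library. The function names and the crude word-length filter standing in for keyword extraction are assumptions for the example. In CPython, threads mainly help when the work waits on input, such as reading caption streams; a process pool may be a better fit for heavier computation.

```python
from concurrent.futures import ThreadPoolExecutor

def keywords_for_channel(caption_text):
    """Stand-in for caption parsing: keep distinct words longer than 3 letters."""
    return sorted({w for w in caption_text.lower().split() if len(w) > 3})

def scan_channels(captions_by_channel, max_workers=4):
    """Parse the captions of every channel concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            channel: pool.submit(keywords_for_channel, text)
            for channel, text in captions_by_channel.items()
        }
        # Collect each channel's result once its task has finished.
        return {channel: f.result() for channel, f in futures.items()}

result = scan_channels({"7": "the lion hunts", "9": "rain ends drought"})
print(result)  # {'7': ['hunts', 'lion'], '9': ['drought', 'ends', 'rain']}
```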

Included herein is a set of flow charts representative of exemplary methodologies for performing novel aspects of the disclosed architecture. While, for purposes of simplicity of explanation, the one or more methodologies shown herein, for example, in the form of a flow chart or flow diagram, are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance therewith, occur in a different order and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all acts illustrated in a methodology may be required for a novel implementation.

FIG. 6 illustrates one embodiment of a logic flow 600. The logic flow 600 may be representative of some or all of the operations executed by one or more embodiments described herein.

In the illustrated embodiment shown in FIG. 6, the logic flow 600 may analyze media content at block 602. For example, media content analyzer 310 may analyze video, audio, and/or closed captioning streams in media content to identify “objects” that may convey context. The result of analysis may include words, objects, faces, music, and image characteristics, for example.

One or more analysis techniques may be performed. Examples of media content analysis may include, for example, text parsing, speech recognition, optical character recognition, face recognition, object recognition, location detection, and so forth.

The logic flow 600 may extract context relevant data from the analyzed media content at block 604. For example, context relevant data extractor 320 may examine text and use information mining or text mining to determine what words convey contextual meaning and what words may be ignored. The text may be received from parsed text, speech recognized text, and optical character recognized text, for example. Context relevant data extractor 320 may examine objects identified in the video stream to determine what objects they are, and whether that type of object conveys context. Context relevant data extractor 320 may examine two or more types of analyzed media content in conjunction to identify context relevant data. The embodiments are not limited to this example.

The logic flow 600 may build a viewing preference profile from the context relevant data at block 606. For example, viewing preference profile builder 330 may receive keywords, locations, people, and other context relevant data from context relevant data extractor 320. Viewing preference profile builder 330 may aggregate the context relevant data, and may weight the data as well. For example, a keyword may be counted each time it is encountered, and the keywords having the highest counts may be considered to indicate the most preferred context. Relationships among keywords, or other context relevant data may be used to generate more complex profiles. For example, if a group of keywords often appears together, perhaps indicating a topic, and with a specific actor, then a preference for content on that topic that includes the actor may be indicated. The embodiments are not limited to this example.
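
The counting and relationship tracking at block 606 may be sketched as follows. Keywords are counted each time they are encountered, and pairs of items are counted once per program in which they appear together. The function name `build_profile` and the `actor:` prefix used to tag a recognized person are hypothetical conventions adopted for the example.

```python
from collections import Counter
from itertools import combinations

def build_profile(programs):
    """Return (item_counts, pair_counts) over a list of programs, where
    each program is a list of extracted context relevant items."""
    item_counts = Counter()
    pair_counts = Counter()
    for items in programs:
        item_counts.update(items)  # count every encounter
        # Count each unordered pair once per program, in sorted order.
        pair_counts.update(combinations(sorted(set(items)), 2))
    return item_counts, pair_counts

programs = [
    ["heist", "vault", "actor:X"],
    ["heist", "vault"],
    ["heist", "actor:X"],
]
item_counts, pair_counts = build_profile(programs)
print(item_counts.most_common(1))        # [('heist', 3)]
print(pair_counts[("actor:X", "heist")]) # 2
```

A pair with a high count, such as a topic keyword together with a specific actor, may indicate the more complex preference described above.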

The logic flow 600 may use the viewing preference profile to identify related media content at block 608. For example, media processing device 110 may compare context relevant data from other media content to the viewing preference profile to identify media content that is similar to a preferred context. Media processing device 110 may provide some or all of the viewing preference profile to an advertising service to obtain advertising that is targeted to the viewer's interests, according to the viewing preference profile. Media processing device 110 may use the viewing preference profile to generate a custom personal brand channel for the viewer that collects media content having similar context into one channel for the viewer. The viewing preference profile may be used to identify content that another viewer may find of interest, when the viewing preference profiles of the two viewers contain some contexts that overlap. The embodiments are not limited to this example.
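
As one illustration of the comparison at block 608, the sketch below scores each candidate by the fraction of the profile's total weight that its context elements cover, and ranks candidates by that score. The scoring rule, the function names `match_score` and `rank_content`, and the sample weights are assumptions; other similarity measures could equally be used.

```python
def match_score(profile_weights, content_contexts):
    """Fraction of the profile's total weight covered by the content."""
    total = sum(profile_weights.values())
    if total == 0:
        return 0.0
    hit = sum(weight for context, weight in profile_weights.items()
              if context in content_contexts)
    return hit / total

def rank_content(profile_weights, catalog):
    """Order titles from best to worst match, ties broken by title."""
    return sorted(
        catalog,
        key=lambda title: (-match_score(profile_weights, catalog[title]), title),
    )

profile_weights = {"soccer": 6.0, "egypt": 3.0, "jazz": 1.0}
catalog = {
    "Program A": {"soccer", "egypt"},
    "Program B": {"jazz"},
    "Program C": {"cooking"},
}
print(rank_content(profile_weights, catalog))
# ['Program A', 'Program B', 'Program C']
```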

Embodiments may thus provide a richer viewing context with finer granularity compared to a context constructed merely from content provider supplied metadata about media content.

FIG. 7 illustrates an embodiment of an exemplary computing architecture 700 suitable for implementing various embodiments as previously described. As used in this application, the terms “system” and “device” and “component” are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution, examples of which are provided by the exemplary computing architecture 700. For example, a component can be, but is not limited to being, a process running on a processor, a processor, a hard disk drive, multiple storage drives (of optical and/or magnetic storage medium), an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a component. One or more components can reside within a process and/or thread of execution, and a component can be localized on one computer and/or distributed between two or more computers. Further, components may be communicatively coupled to each other by various types of communications media to coordinate operations. The coordination may involve the uni-directional or bi-directional exchange of information. For instance, the components may communicate information in the form of signals communicated over the communications media. The information can be implemented as signals allocated to various signal lines. In such allocations, each message is a signal. Further embodiments, however, may alternatively employ data messages. Such data messages may be sent across various connections. Exemplary connections include parallel interfaces, serial interfaces, and bus interfaces.

In one embodiment, the computing architecture 700 may comprise or be implemented as part of an electronic device. Examples of an electronic device may include without limitation a mobile device, a personal digital assistant, a mobile computing device, a smart phone, a cellular telephone, a handset, a one-way pager, a two-way pager, a messaging device, a computer, a personal computer (PC), a desktop computer, a laptop computer, a notebook computer, a handheld computer, a tablet computer, a server, a server array or server farm, a web server, a network server, an Internet server, a work station, a mini-computer, a main frame computer, a supercomputer, a network appliance, a web appliance, a distributed computing system, multiprocessor systems, processor-based systems, consumer electronics, programmable consumer electronics, television, digital television, set top box, wireless access point, base station, subscriber station, mobile subscriber center, radio network controller, router, hub, gateway, bridge, switch, machine, or combination thereof. The embodiments are not limited in this context.

The computing architecture 700 includes various common computing elements, such as one or more processors, co-processors, memory units, chipsets, controllers, peripherals, interfaces, oscillators, timing devices, video cards, audio cards, multimedia input/output (I/O) components, and so forth. The embodiments, however, are not limited to implementation by the computing architecture 700.

As shown in FIG. 7, the computing architecture 700 comprises a processing unit 704, a system memory 706 and a system bus 708. The processing unit 704 can be any of various commercially available processors. Dual microprocessors and other multi processor architectures may also be employed as the processing unit 704. The system bus 708 provides an interface for system components including, but not limited to, the system memory 706 to the processing unit 704. The system bus 708 can be any of several types of bus structure that may further interconnect to a memory bus (with or without a memory controller), a peripheral bus, and a local bus using any of a variety of commercially available bus architectures.

The computing architecture 700 may comprise or implement various articles of manufacture. An article of manufacture may comprise a computer-readable storage medium to store various forms of programming logic. Examples of a computer-readable storage medium may include any tangible media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. Examples of programming logic may include executable computer program instructions implemented using any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, object-oriented code, visual code, and the like.

The system memory 706 may include various types of computer-readable storage media in the form of one or more higher speed memory units, such as read-only memory (ROM), random-access memory (RAM), dynamic RAM (DRAM), Double-Data-Rate DRAM (DDRAM), synchronous DRAM (SDRAM), static RAM (SRAM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, polymer memory such as ferroelectric polymer memory, ovonic memory, phase change or ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, magnetic or optical cards, or any other type of media suitable for storing information. In the illustrated embodiment shown in FIG. 7, the system memory 706 can include non-volatile memory 710 and/or volatile memory 712. A basic input/output system (BIOS) can be stored in the non-volatile memory 710.

The computer 702 may include various types of computer-readable storage media in the form of one or more lower speed memory units, including an internal hard disk drive (HDD) 714, a magnetic floppy disk drive (FDD) 716 to read from or write to a removable magnetic disk 718, and an optical disk drive 720 to read from or write to a removable optical disk 722 (e.g., a CD-ROM or DVD). The HDD 714, FDD 716 and optical disk drive 720 can be connected to the system bus 708 by a HDD interface 724, an FDD interface 726 and an optical drive interface 728, respectively. The HDD interface 724 for external drive implementations can include at least one or both of Universal Serial Bus (USB) and IEEE 1394 interface technologies.

The drives and associated computer-readable media provide volatile and/or nonvolatile storage of data, data structures, computer-executable instructions, and so forth. For example, a number of program modules can be stored in the drives and memory units 710, 712, including an operating system 730, one or more application programs 732, other program modules 734, and program data 736.

The one or more application programs 732, other program modules 734, and program data 736 can include, for example, the viewing context builder 300, the media content analyzer 310, the context relevant data extractor 320, the viewing preference profile builder 330, and the viewing preference profile 340, among others.

A user can enter commands and information into the computer 702 through one or more wire/wireless input devices, for example, a keyboard 738 and a pointing device, such as a mouse 740. Other input devices may include a microphone, an infra-red (IR) remote control, a joystick, a game pad, a stylus pen, touch screen, or the like. These and other input devices are often connected to the processing unit 704 through an input device interface 742 that is coupled to the system bus 708, but can be connected by other interfaces such as a parallel port, IEEE 1394 serial port, a game port, a USB port, an IR interface, and so forth.

A monitor 744 or other type of display device is also connected to the system bus 708 via an interface, such as a video adaptor 746. In addition to the monitor 744, a computer typically includes other peripheral output devices, such as speakers, printers, and so forth.

The computer 702 may operate in a networked environment using logical connections via wire and/or wireless communications to one or more remote computers, such as a remote computer 748. The remote computer 748 can be a workstation, a server computer, a router, a personal computer, portable computer, microprocessor-based entertainment appliance, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer 702, although, for purposes of brevity, only a memory/storage device 750 is illustrated. The logical connections depicted include wire/wireless connectivity to a local area network (LAN) 752 and/or larger networks, for example, a wide area network (WAN) 754. Such LAN and WAN networking environments are commonplace in offices and companies, and facilitate enterprise-wide computer networks, such as intranets, all of which may connect to a global communications network, for example, the Internet.

When used in a LAN networking environment, the computer 702 is connected to the LAN 752 through a wire and/or wireless communication network interface or adaptor 756. The adaptor 756 can facilitate wire and/or wireless communications to the LAN 752, which may also include a wireless access point disposed thereon for communicating with the wireless functionality of the adaptor 756.

When used in a WAN networking environment, the computer 702 can include a modem 758, can be connected to a communications server on the WAN 754, or can have other means for establishing communications over the WAN 754, such as by way of the Internet. The modem 758, which can be internal or external and a wire and/or wireless device, connects to the system bus 708 via the input device interface 742. In a networked environment, program modules depicted relative to the computer 702, or portions thereof, can be stored in the remote memory/storage device 750. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers can be used.

The computer 702 is operable to communicate with wire and wireless devices or entities using the IEEE 802 family of standards, such as wireless devices operatively disposed in wireless communication (e.g., IEEE 802.11 over-the-air modulation techniques) with, for example, a printer, scanner, desktop and/or portable computer, personal digital assistant (PDA), communications satellite, any piece of equipment or location associated with a wirelessly detectable tag (e.g., a kiosk, news stand, restroom), and telephone. This includes at least Wi-Fi (or Wireless Fidelity), WiMax, and Bluetooth™ wireless technologies. Thus, the communication can be a predefined structure as with a conventional network or simply an ad hoc communication between at least two devices. Wi-Fi networks use radio technologies called IEEE 802.11x (a, b, g, n, etc.) to provide secure, reliable, fast wireless connectivity. A Wi-Fi network can be used to connect computers to each other, to the Internet, and to wire networks (which use IEEE 802.3-related media and functions).

FIG. 8 illustrates a block diagram of an exemplary communications architecture 800 suitable for implementing various embodiments as previously described. The communications architecture 800 includes various common communications elements, such as a transmitter, receiver, transceiver, radio, network interface, baseband processor, antenna, amplifiers, filters, and so forth. The embodiments, however, are not limited to implementation by the communications architecture 800.

As shown in FIG. 8, the communications architecture 800 includes one or more clients 802 and servers 804. The clients 802 may implement the client systems 310, 400. The servers 804 may implement the server system 330. The clients 802 and the servers 804 are operatively connected to one or more respective client data stores 808 and server data stores 810 that can be employed to store information local to the respective clients 802 and servers 804, such as cookies and/or associated contextual information.

The clients 802 and the servers 804 may communicate information between each other using a communications framework 806. The communications framework 806 may implement any well-known communications techniques and protocols, such as those described with reference to systems 300, 400 and 700. The communications framework 806 may be implemented as a packet-switched network (e.g., public networks such as the Internet, private networks such as an enterprise intranet, and so forth), a circuit-switched network (e.g., the public switched telephone network), or a combination of a packet-switched network and a circuit-switched network (with suitable gateways and translators).

Some embodiments may be described using the expression “one embodiment” or “an embodiment” along with their derivatives. These terms mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment. Further, some embodiments may be described using the expression “coupled” and “connected” along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, some embodiments may be described using the terms “connected” and/or “coupled” to indicate that two or more elements are in direct physical or electrical contact with each other. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.

It is emphasized that the Abstract of the Disclosure is provided to allow a reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein,” respectively. Moreover, the terms “first,” “second,” “third,” and so forth, are used merely as labels, and are not intended to impose numerical requirements on their objects.

What has been described above includes examples of the disclosed architecture. It is, of course, not possible to describe every conceivable combination of components and/or methodologies, but one of ordinary skill in the art may recognize that many further combinations and permutations are possible. Accordingly, the novel architecture is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.

Claims

1. An apparatus, comprising:

a processing component;

a viewing context builder operative on the processing component to: analyze media content comprising one or more of an audio stream, a video stream, and a closed captioning stream from a selected channel; extract context relevant data from the analyzed media content; and build a viewing preference profile from the context relevant data.

2. The apparatus of claim 1, wherein context relevant data comprises keywords, and the viewing context builder is further operative to:

analyze media content by parsing dialog in the closed captioning stream; and

identify keywords in the parsed dialog.

3. The apparatus of claim 1, wherein context relevant data comprises keywords, and the viewing context builder is further operative to:

analyze media content by performing speech recognition on dialog in the audio stream; and

identify keywords from the recognized speech.

4. The apparatus of claim 1, the viewing context builder further operative to:

analyze media content from a plurality of channels;

extract context relevant data from the analyzed media content; and

identify media content that may be of interest according to the viewing preference profile and the context relevant data extracted from the selected channel and the plurality of channels.

5. The apparatus of claim 1, the viewing context builder further operative to:

analyze the video stream; and

extract from the analyzed video stream content relevant data comprising one or more of: a keyword, a graphic, a location, a person, a date, a background object, a foreground object, a pattern, a color palette, and a lighting quality.

6. The apparatus of claim 5, wherein extracting relevant data from the analyzed video stream comprises at least one of:

extracting a keyword from text recognized from the video stream;

extracting a logo, icon, trademark, or symbol from a static graphic in the video stream;

extracting a location from one of: text and a background object;

extracting a person by identifying an object in the video stream as a face, and performing facial recognition on the face to identify the person; and

extracting a date from one of text and a graphic.

7. The apparatus of claim 1, wherein the context relevant data is external to media metadata included in the media content by a media source.

8. The apparatus of claim 1, comprising a digital display to present control information for the apparatus.

9. A method, comprising:

analyzing media content comprising at least one of an audio stream, a video stream, and a closed captioning stream from a selected channel;

extracting context relevant data from the analyzed media content;

building a viewing preference profile from the context relevant data; and

identifying media content that includes context relevant data similar to the viewing preference profile.
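By way of illustration only, the building and identifying steps recited above might be sketched as follows. The sketch assumes the context relevant data has already been reduced to lists of keywords, represents the viewing preference profile as keyword counts, and uses cosine similarity as the measure of "similar"; none of these choices is specified by the disclosure, and the function names build_profile, cosine_similarity, and rank_candidates are hypothetical.

```python
from collections import Counter
from math import sqrt


def build_profile(keyword_lists):
    """Accumulate keywords extracted from each viewed program into one profile."""
    profile = Counter()
    for keywords in keyword_lists:
        profile.update(keywords)
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two keyword-count mappings (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(profile, candidates):
    """Order candidate titles from most to least similar to the profile.

    `candidates` maps a title to the keywords extracted from that content
    (an assumed, illustrative data shape).
    """
    scored = [
        (cosine_similarity(profile, Counter(keywords)), title)
        for title, keywords in candidates.items()
    ]
    return [title for _, title in sorted(scored, key=lambda pair: -pair[0])]


if __name__ == "__main__":
    profile = build_profile([["sailing", "regatta"], ["sailing", "weather"]])
    candidates = {
        "Cooking Hour": ["recipe", "oven"],
        "Harbor Report": ["sailing", "weather", "tide"],
    }
    print(rank_candidates(profile, candidates))
```

In this sketch, content sharing no keywords with the profile scores zero and sorts last; other representations (for example, weighted keywords or non-keyword data such as persons or locations) could be substituted without changing the overall flow.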

10. The method of claim 9, wherein context relevant data comprises keywords, further comprising:

analyzing media content by parsing dialog in the closed captioning stream; and

identifying keywords in the parsed dialog.

11. The method of claim 10, further comprising:

searching the parsed closed captioning stream for a search term; and

identifying media content associated with a closed captioning stream that contains the search term as a search result.
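As a non-limiting illustration, the searching and identifying steps above might be sketched as follows. The sketch assumes each parsed closed captioning stream is available as plain text keyed by a content title, and that a match means a case-insensitive whole-word occurrence of the search term; the function name search_captions and the data shape are hypothetical.

```python
import re


def search_captions(caption_streams, search_term):
    """Return titles whose parsed caption text contains the search term.

    `caption_streams` maps a title to its parsed caption text (an assumed
    shape). Matching is case-insensitive and anchored at word boundaries,
    which suits terms that begin and end with a letter or digit.
    """
    pattern = re.compile(r"\b" + re.escape(search_term) + r"\b", re.IGNORECASE)
    return [
        title
        for title, text in caption_streams.items()
        if pattern.search(text)
    ]


if __name__ == "__main__":
    streams = {
        "News at Six": "The regatta starts tomorrow in the harbor.",
        "Cooking Hour": "Preheat the oven before you begin.",
    }
    print(search_captions(streams, "Regatta"))
```

Whole-word matching is a design choice of the sketch, made so that a short term does not match inside longer, unrelated words; a substring or stemmed match could be used instead.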

12. The method of claim 10, further comprising:

identifying keywords by:

comparing the frequency of occurrence of a word in the dialog to the frequency of the same word in a corpus of dialogs; and

identifying a word as a keyword when the frequency of occurrence of the word in the dialog is high relative to the frequency of the word in the corpus of dialogs.
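For illustration only, the frequency comparison above might be sketched as follows. The claim does not state how "high relative to" is measured; the sketch assumes a simple ratio of relative frequencies compared against a threshold, with add-one smoothing so that words absent from the corpus do not cause division by zero. The names tokenize and identify_keywords and the parameters ratio_threshold and min_count are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def identify_keywords(dialog, corpus_dialogs, ratio_threshold=5.0, min_count=2):
    """Return words whose relative frequency in `dialog` is high versus the corpus.

    ratio_threshold and min_count are illustrative tuning values, not values
    taken from the disclosure.
    """
    dialog_counts = Counter(tokenize(dialog))
    corpus_counts = Counter()
    for corpus_dialog in corpus_dialogs:
        corpus_counts.update(tokenize(corpus_dialog))

    dialog_total = sum(dialog_counts.values())
    corpus_total = sum(corpus_counts.values())
    vocabulary_size = len(set(dialog_counts) | set(corpus_counts))

    keywords = []
    for word, count in dialog_counts.items():
        if count < min_count:
            continue  # ignore words seen too rarely to judge
        dialog_freq = count / dialog_total
        # Add-one smoothing keeps the corpus frequency nonzero for unseen words.
        corpus_freq = (corpus_counts[word] + 1) / (corpus_total + vocabulary_size)
        if dialog_freq / corpus_freq >= ratio_threshold:
            keywords.append(word)
    return sorted(keywords)


if __name__ == "__main__":
    dialog = "the regatta begins and the regatta fleet sails the harbor"
    corpus = [
        "the cat sat on the mat and the dog ran",
        "the rain and the wind and the cold",
    ]
    print(identify_keywords(dialog, corpus))
```

With the sample inputs shown, a common word such as "the" is frequent in both the dialog and the corpus and is not selected, while "regatta" is frequent only in the dialog and is selected.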

13. The method of claim 9, wherein context relevant data comprises keywords, further comprising:

analyzing media content by performing speech recognition on dialog in the audio stream; and

identifying keywords from the recognized speech.

14. The method of claim 9, further comprising:

analyzing media content from a plurality of channels;

extracting context relevant data from the analyzed media content; and

identifying media content that may be of interest according to the viewing preference profile and the context relevant data extracted from the selected channel and the plurality of channels.

15. The method of claim 9, further comprising:

analyzing the video stream; and

extracting from the analyzed video stream content relevant data comprising at least one of: a keyword, a graphic, a location, a person, a date, a background object, a foreground object, a pattern, a color palette, and a lighting quality.

16. The method of claim 15, wherein extracting relevant data from the analyzed video stream comprises at least one of:

extracting a keyword from text recognized from the video stream;

extracting a logo, icon, trademark, or symbol from a static graphic in the video stream;

extracting a location from one of: text and a background object;

extracting a person by identifying an object in the video stream as a face, and performing facial recognition on the face to identify the person; and

extracting a date from one of text and a static graphic.

17. The method of claim 9, wherein the context relevant data is external to media metadata included in the media content by a media source.

18. An article comprising a computer-readable storage medium containing instructions that when executed cause a system to:

analyze media content comprising at least one of an audio stream, a video stream, and a closed captioning stream from a selected channel;

extract context relevant data from the analyzed media content;

build a viewing preference profile from the context relevant data; and

search for media content including one or more elements of the context relevant data.

19. The article of claim 18, wherein context relevant data comprises keywords, the article further comprising instructions that when executed cause the system to:

analyze media content by parsing dialog in the closed captioning stream;

analyze media content by performing speech recognition on dialog in the audio stream; and

identify keywords in one of the parsed dialog and the recognized speech.

20. The article of claim 19, further comprising instructions that when executed cause the system to:

identify keywords by:

comparing the frequency of occurrence of a word in the dialog to the frequency of the same word in a corpus of dialogs; and

identifying a word as a keyword when the frequency of occurrence of the word in the dialog is high relative to the frequency of the word in the corpus of dialogs.

21. The article of claim 18, further comprising instructions that when executed cause the system to:

search a plurality of closed captioning streams for a search term; and

identify media content associated with a closed captioning stream that contains the search term as a search result.

22. The article of claim 18, further comprising instructions that when executed cause the system to:

analyze the video stream; and

extract from the analyzed video stream content relevant data comprising at least one of: a keyword, a graphic, a location, a person, a date, a background object, a foreground object, a pattern, a color palette, and a lighting quality.

23. The article of claim 22, further comprising instructions that when executed cause the system to:

extract a keyword from text recognized from the video stream;

extract a logo, icon, trademark, or symbol from a static graphic in the video stream;

extract a location from one of: text and a background object;

extract a person by identifying an object in the video stream as a face, and performing facial recognition on the face to identify the person; and

extract a date from one of text and a static graphic.

24. The article of claim 18, further comprising instructions that when executed cause the system to:

analyze media content from a plurality of channels;

extract context relevant data from the analyzed media content; and

identify media content that may be of interest according to the viewing preference profile and the context relevant data extracted from the selected channel and the plurality of channels.

25. The article of claim 18, wherein the context relevant data is external to media metadata included in the media content by a media source.