MEDIA SHARING DURING A VIDEO CALL

Video call devices having corresponding methods and non-transitory computer-readable media comprise: a video input interface configured to receive first video information; an audio input interface configured to receive first audio information; a transmitter configured to transmit first signals during a video call, wherein the first signals represent the first video information and the first audio information; a receiver configured to receive second signals during the video call, wherein the second signals represent second video information and second audio information; a video output interface configured to provide the second video information; an audio output interface configured to provide the second audio information; wherein the transmitter is further configured to transmit third signals during the video call, wherein the third signals represent at least one of media content and a hyperlink, wherein the hyperlink indicates a location of the media content.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit of U.S. Provisional Patent Application Ser. No. 61/485,229 entitled “MEDIA SHARING DURING A VIDEO CALL,” filed May 12, 2011, the disclosure thereof incorporated by reference herein in its entirety.

This application claims benefit of U.S. Provisional Patent Application Ser. No. 61/485,233 entitled “WIRELESS NETWORK DEVICE CONFIGURATION USING TWO-DIMENSIONAL PATTERNS,” filed May 12, 2011, the disclosure thereof incorporated by reference herein in its entirety.

This application claims benefit of U.S. Provisional Patent Application Ser. No. 61/485,237 entitled “SMART REMOTE CONTROL DEVICES FOR VIDEO CALLING,” filed May 12, 2011, the disclosure thereof incorporated by reference herein in its entirety.

This application is related to U.S. Patent Application Serial No. (to be assigned, Attorney Docket No. TLY003001), entitled “WIRELESS NETWORK DEVICE CONFIGURATION USING TWO-DIMENSIONAL PATTERNS,” filed TBD, the disclosure thereof incorporated by reference herein in its entirety.

This application is related to U.S. Patent Application Serial No. (to be assigned, Attorney Docket No. TLY004001), entitled “SMART REMOTE CONTROL DEVICES FOR CONTROLLING VIDEO CALL DEVICES,” filed TBD, the disclosure thereof incorporated by reference herein in its entirety.

FIELD

The present disclosure relates generally to video calling. More particularly, the present disclosure relates to sharing media during a video call.

BACKGROUND

The traditional use of television has been for passive consumption of content. The content is mostly television programming (live as well as on-demand) and outputs of other local devices such as media players (for example, DVD, CD, and VCR devices), video game devices, and the like. But despite the availability of large, high-resolution television screens, few solutions allow the use of these television screens for video calling. One solution involves connecting a computer to a webcam, speakers, microphone, and the television screen, installing and executing video calling software on the computer, and controlling the computer using a keyboard and mouse. Users generally avoid such tedious tasks.

Similar problems plague interactive media sharing such as sharing photos online. In order to share photos, most people prepare a web album, and email links to those albums to other parties. However, this technique does not allow the involved parties to step through the album in a synchronized manner while sharing verbal comments and the like. In addition, the steps of uploading photo albums and emailing links to the albums generally require a web browser or other dedicated software and a computer.

Other solutions involve “pure” screen sharing. According to these solutions, one user's screen is compressed and transported to other participants. Such screen sharing often suffers from video compression artifacts, especially if the available data rate for communication fluctuates. Furthermore, such solutions fail to deliver media to all participants in their native, pristine format. These solutions also fail to provide synchronized playback such that all users have an almost identical contemporaneous experience while playing the media.

SUMMARY

In general, in one aspect, an embodiment features a video call device comprising: a video input interface configured to receive first video information; an audio input interface configured to receive first audio information; a transmitter configured to transmit first signals during a video call, wherein the first signals represent the first video information and the first audio information; a receiver configured to receive second signals during the video call, wherein the second signals represent second video information and second audio information; a video output interface configured to provide the second video information; an audio output interface configured to provide the second audio information; wherein the transmitter is further configured to transmit third signals during the video call, wherein the third signals represent at least one of media content and a hyperlink, wherein the hyperlink indicates a location of the media content.

Embodiments of the video call device can include one or more of the following features. Some embodiments comprise an encoder/decoder (CODEC); wherein the receiver is further configured to receive fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the first video information; wherein the CODEC is configured to transcode the first video information according to the desired quality prior to the first signals being transmitted by the transmitter. Some embodiments comprise a media interface configured to receive the media content; wherein the third signals represent the media content. Some embodiments comprise an encoder/decoder (CODEC); wherein the receiver is further configured to receive fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the media content; wherein the CODEC is configured to transcode the media content according to the desired quality prior to the third signals being transmitted by the transmitter. In some embodiments, the media interface comprises at least one of: an SD card interface; a USB interface; and a mass storage interface. Some embodiments comprise a processor configured to generate one or more first playback synchronization commands, wherein the first playback synchronization commands include timing information for playback of the media content; wherein the transmitter is further configured to transmit fourth signals during the video call, wherein the fourth signals represent the one or more first playback synchronization commands. 
In some embodiments, the receiver is further configured to receive fifth signals during the video call, wherein the fifth signals represent one or more second playback synchronization commands; and the processor is further configured to control playback of the media content according to the one or more second playback synchronization commands. In some embodiments, the playback synchronization commands represent at least one of: a file transfer status for the media content; a playback position for the media content; and a time of a modification of the media content by a user. Some embodiments comprise one or more cameras configured to provide the first video information to the video input interface; and one or more microphones configured to provide the first audio information to the audio input interface.

In general, in one aspect, an embodiment features a method comprising: receiving first video information; receiving first audio information; transmitting first signals during a video call, wherein the first signals represent the first video information and the first audio information; receiving second signals during the video call, wherein the second signals represent second video information and second audio information; providing the second video information; providing the second audio information; and transmitting third signals during the video call, wherein the third signals represent at least one of media content and a hyperlink, wherein the hyperlink indicates a location of the media content.

Embodiments of the method can include one or more of the following features. Some embodiments comprise receiving fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the first video information; and transcoding the first video information according to the desired quality prior to transmitting the first signals. Some embodiments comprise receiving the media content; wherein the third signals represent the media content. Some embodiments comprise receiving fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the media content; and transcoding the media content according to the desired quality prior to transmitting the third signals. Some embodiments comprise generating one or more first playback synchronization commands, wherein the first playback synchronization commands include timing information for playback of the media content; and transmitting fourth signals during the video call, wherein the fourth signals represent the one or more first playback synchronization commands. Some embodiments comprise receiving fifth signals during the video call, wherein the fifth signals represent one or more second playback synchronization commands; and controlling playback of the media content according to the one or more second playback synchronization commands. In some embodiments, the playback synchronization commands represent at least one of: a file transfer status for the media content; a playback position for the media content; and a time of a modification of the media content by a user.

In general, in one aspect, an embodiment features non-transitory computer-readable media embodying instructions executable by a computer to perform functions comprising: receiving first video information and first audio information; causing transmission of first signals during a video call, wherein the first signals represent the first video information and the first audio information; providing second video information and second audio information based on second signals received during the video call, wherein the second signals represent the second video information and the second audio information; and causing transmission of third signals during the video call, wherein the third signals represent at least one of media content and a hyperlink, wherein the hyperlink indicates a location of the media content.

Embodiments of the non-transitory computer-readable media can include one or more of the following features. In some embodiments, the functions further comprise: receiving a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the first video information; and transcoding the first video information according to the desired quality prior to causing transmission of the first signals. In some embodiments, the functions further comprise: receiving the media content; wherein the third signals represent the media content. In some embodiments, the functions further comprise: receiving a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the media content; and transcoding the media content according to the desired quality prior to causing transmission of the third signals. In some embodiments, the functions further comprise: generating one or more first playback synchronization commands, wherein the first playback synchronization commands include timing information for playback of the media content; and causing transmission of fourth signals during the video call, wherein the fourth signals represent the one or more first playback synchronization commands. In some embodiments, the functions further comprise: receiving one or more second playback synchronization commands; and controlling playback of the media content according to the one or more second playback synchronization commands. In some embodiments, the playback synchronization commands represent at least one of: a file transfer status for the media content; a playback position for the media content; and a time of a modification of the media content by a user.

The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 shows elements of a video calling system according to one embodiment.

FIG. 2 shows elements of a video calling device of FIG. 1 according to one embodiment.

FIG. 3 shows a process for the video calling system of FIG. 1 according to an embodiment where a first video call device shares local media content with a second video call device during a video call.

FIG. 4 shows a process for the video calling system of FIG. 1 according to an embodiment where two video call devices share media content stored at a remote location during a video call.

FIG. 5 shows a process for the video calling system of FIG. 1 according to an embodiment where a first video call device transcodes video and shared local media content for a second video call device during a video call.

The leading digit(s) of each reference numeral used in this specification indicates the number of the drawing in which the reference numeral first appears.

DETAILED DESCRIPTION

The described embodiments provide media sharing during a video call while not requiring a personal computer. The video calls and media sharing can be point-to-point or multi-point. These calls are not limited to video calls, and can include voice-only or video-only calls as well. Embodiments ensure that the shared media are rendered (that is, played back to the participants) such that the playback timing, as well as the apparent quality of the media, is nearly identical for all participants. Before describing these aspects, an example video call device is described.

Video Call Device

FIG. 1 shows elements of a video calling system 100 according to one embodiment. Although in the described embodiments the elements of the video calling system 100 are presented in one arrangement, other embodiments may feature other arrangements. For example, elements of the video calling system 100 can be implemented in hardware, software, or combinations thereof.

Referring to FIG. 1, the video calling system 100 includes N video call devices 102A and 102B through 102N connected by a network 108. The network 108 can be implemented as a wide-area network such as the Internet, a local-area network (LAN), or the like. While various embodiments are described with respect to network communications, they also apply to devices employing other forms of data communications such as direct links and the like.

In the embodiment of FIG. 1, the video call devices 102 do not include display screens or speakers. Therefore each video call device 102 is connected to a respective television set (TV) 106A and 106B through 106N. In other embodiments, one or more of the video call devices 102 includes a display screen and speakers, so one or more television sets 106 are not required. In FIG. 1, each video call device 102 is controlled by one or more respective users, for example using one or more respective remote controls (RC) 110.

FIG. 2 shows elements of a video call device 102 of FIG. 1 according to one embodiment. Although in the described embodiments the elements of video call device 102 are presented in one arrangement, other embodiments may feature other arrangements. For example, elements of video call device 102 can be implemented in hardware, software, or combinations thereof.

Referring to FIG. 2, the video call device 102 includes an audio-visual (AV) interface (I/F) 202, a network adapter 204, a media interface 206, and a remote control (RC) interface 208. The video call device 102 also includes a processor or central processing unit (CPU) 210, a graphical processing unit (GPU) 212, a memory 214, a coder/decoder (CODEC) 218, a multiplexer (MUX) 220, and a clock 222.

The AV interface 202 includes a video input interface (Video In) 224, an audio input interface (Audio In) 226, a video output interface (Video Out) 228, and an audio output interface (Audio Out) 230. The video input interface 224 can be connected to one or more video capture devices such as a camera 232 or the like. Camera 232 can be implemented as a wide-angle camera that sees the whole room. The audio input interface 226 can be connected to one or more audio capture devices such as a microphone 234 or the like. Microphone 234 can be implemented as a noise-cancelling microphone. In some embodiments, video call device 102 includes one or more cameras 232 and/or one or more microphones 234. For example, multiple cameras 232 can be included to generate three-dimensional (3D) video. As another example, multiple microphones 234 can be included so that beamforming techniques can be used to isolate conversations from background noise.

The video output interface 228 can be connected to a display screen such as that of a television set 106. The audio output interface 230 can be connected to one or more speakers such as those of a television set 106. Alternatively, the video output interface 228 and/or the audio output interface 230 can be connected to the audio-visual inputs of a home theater system or the like. The video output interface 228 and the audio output interface 230 can employ any appropriate connection, for example such as Digital Visual Interface (DVI), High-Definition Multimedia Interface (HDMI), and the like.

The network adapter 204 includes a wireless network adapter 236 and a wired network adapter 238. In some embodiments, network adapter 204 includes additional communication interfaces, for example including Bluetooth communication interfaces and the like.

The wireless network adapter 236 includes a transmitter (TX) 240 to transmit wireless signals and a receiver (RX) 242 to receive wireless signals, and is connected to one or more antennas 244. In some embodiments, wireless network adapter 236 is compliant with all or part of IEEE standard 802.11, including draft and approved amendments such as 802.11-1997, 802.11a, 802.11b, 802.11g, 802.11-2007, 802.11n, 802.11-2012, and 802.11ac. For example, the wireless network adapter 236 can allow Wi-Fi connections, for example to a router, to other Wi-Fi devices such as smartphones and computers, and the like.

The wired network adapter 238 includes a transmitter (TX) 246 to transmit wired signals and a receiver (RX) 248 to receive wired signals, and is connected to a wired network interface 250. In some embodiments, wired network adapter 238 is compliant with all or part of IEEE standard 802.3, including draft and approved amendments.

The disclosed video call devices 102 are capable of peer-to-peer (P2P) audio/video communication. Using P2P technology, two video call devices 102 can be connected to each other by one or more networks such that data packets can flow between them. The video call devices 102 can be located anywhere in the world, so long as they are connected by networks 108 such as the Internet. The video call devices 102 can employ multiple communication channels between participants. One channel carries the primary video stream of the video call. Another channel carries the primary audio stream of the video call. A command channel carries commands such as camera commands (for example, pan, tilt, and zoom) and the like. The command channel can also carry synchronization commands to ensure synchronized media playback across multiple sites. Additional channels can be employed for other tasks such as media sharing and the like.
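The channel structure described above can be sketched as follows. This sketch is illustrative only; the JSON framing, channel names, and field names are assumptions for illustration and are not part of the disclosure.

```python
import json

# Hypothetical channel identifiers; the disclosure names the video, audio,
# and command channels but does not specify a wire format.
CHANNEL_VIDEO = "video"
CHANNEL_AUDIO = "audio"
CHANNEL_COMMAND = "command"

def encode_command(command, **params):
    """Frame a command (for example, a camera pan/tilt/zoom command or a
    playback synchronization command) for the command channel."""
    return json.dumps({"channel": CHANNEL_COMMAND,
                       "command": command,
                       "params": params}).encode("utf-8")

def decode_command(frame):
    """Recover the command and its parameters from a received frame."""
    msg = json.loads(frame.decode("utf-8"))
    return msg["command"], msg["params"]
```

A camera command such as a 15-degree pan would round-trip through this framing unchanged.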

Some available P2P technologies provide multiple communication channels for each video call device 102. The video call device 102 can employ the provided channels and/or channels established outside the chosen P2P technology. P2P technologies generally provide network address translation (NAT) traversal for their channels. The video call devices 102 described herein can provide NAT traversal for channels established outside the chosen P2P technology.

The media interface 206 receives local media content from external sources, and provides that media content to one or both of processors 210 and 212. In the embodiment of FIG. 2, the media interface 206 includes a Secure Digital (SD) interface 252, a Universal Serial Bus (USB) interface 254, and a mass storage interface 216. Other embodiments can include other interfaces.

The SD interface 252 receives SD cards, and provides media content stored thereon to the CPU 210 and the GPU 212. The USB interface 254 receives USB devices such as USB memory sticks, USB-cabled devices, and the like, and provides media content from those devices to the CPU 210 and the GPU 212. The USB interface 254 can also receive input devices such as USB dongles for wireless keyboards, wireless pointing devices, and the like. The mass storage interface 216 allows for connection to mass storage devices such as external solid-state drives, disk drives, and the like, and provides media content stored thereon to the CPU 210 and the GPU 212.

The remote control (RC) interface 208 receives wireless signals such as infrared signals from remote control devices for controlling the video call device 102. In some embodiments, the video call device 102 can be controlled by a wireless device via the wireless network adapter 236.

The CPU 210 handles general processing functions, while the GPU 212 handles graphic processing functions. In some embodiments, the CPU 210 handles graphic processing functions as well, so the GPU 212 is not required. The CPU 210 receives a time base from clock 222. The memory 214 can be implemented as semiconductor memory and the like.

The CODEC 218 provides encoding, decoding, and transcoding of the audio and video data handled by the video call device 102. In some embodiments, the CODEC 218 is compliant with one or more standards such as the H.264 standard and the like.
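The transcoding behavior recited in the Summary, in which the CODEC 218 transcodes content according to a link partner media quality request, can be sketched as follows. The field names (`max_width`, `max_height`, `max_bitrate`) are assumptions for illustration; the disclosure states only that content is transcoded according to the desired quality.

```python
def select_transcode_params(source, request):
    """Choose CODEC output parameters no higher than either the source
    parameters or the link partner's requested quality ceiling."""
    return {
        "width":   min(source["width"],   request["max_width"]),
        "height":  min(source["height"],  request["max_height"]),
        "bitrate": min(source["bitrate"], request["max_bitrate"]),
    }
```

For example, a 1080p source would be transcoded down to 720p when the link partner requests at most 720p.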

The MUX 220 allows audio and video to be exchanged via the AV interface 202, a virtual interface 256, or both. The MUX 220 allows any of the inputs and outputs to be switched with virtual inputs and outputs. For example, audio and video can be provided to and/or from other local devices such as smartphones, portable cameras, document cameras, computer displays of external computers, and the like.

Media Sharing

In some embodiments, the described video call devices 102 provide for sharing of arbitrary media content during the video call. In some cases the media content is provided by a video call device 102. In other cases, the media content is stored at a remote location, and a video call device 102 provides a hyperlink that indicates that location. The hyperlink can include a uniform resource locator (URL), Internet protocol (IP) address, or the like. Any type of media content can be shared. Examples of media content that can be shared include photos, documents in various formats, document snapshots, screen snapshots, video files and streams, audio files and streams, and the like. Video call participants can provide audio and/or video commentary during the video call while sharing the media content.

The described video call devices 102 allow users to share photos during a video call. For example, a user can prepare a playlist of photos during or before a video call. During the video call, the user can manually step through the playlist, thereby deciding the sequence and pace of sharing the photos in real time. Alternatively, the user can prepare the playlist with the desired sequence and share the photos such that the sequence of photos advances automatically. The photos can be transferred as files during the video call.
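The playlist stepping described above can be sketched as follows. The class and method names are assumptions for illustration; the disclosure states only that a user can step through a playlist manually or have it advance automatically.

```python
class PhotoPlaylist:
    """Minimal sketch of a shareable photo playlist with manual stepping;
    the user controls the sequence and pace in real time."""
    def __init__(self, photos):
        self.photos = list(photos)
        self.position = 0

    def current(self):
        return self.photos[self.position]

    def step_forward(self):
        # Stop at the last photo rather than wrapping around.
        if self.position < len(self.photos) - 1:
            self.position += 1
        return self.current()

    def step_back(self):
        if self.position > 0:
            self.position -= 1
        return self.current()
```

An automatic-advance mode could be layered on top by calling `step_forward` on a timer.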

The described video call devices 102 allow users to share documents during a video call. For example, a user can choose documents in various formats for sharing. The formats can include text and binary formats, from simple text-only documents to documents that include text and graphics. The formats can include portable formats, webpage formats, and the like. The documents can be transferred as files during the video call.

The described video call devices 102 allow users to share document snapshots during a video call. The documents can include the documents mentioned above, webpages, and the like. A snapshot can be an image representing a document. The image can be in any format. For example, in the case of a webpage, the snapshot can be a bit-map recording of the rendered webpage content. The user can use the web browser during or before the call to take snapshots of web content to share in the call. The document snapshots can be transferred as files during the video call.

The described video call devices 102 allow users to share video files and streams during a video call. The video files and streams can be recorded using camera 232. Alternatively, the video files and streams can be imported into the video call devices 102 from other sources, such as memory cards, other physical devices, networks such as the Internet, and the like. The video files and streams can be transferred as files or streamed during the video call.

The described video call devices 102 allow users to share audio files and streams during a video call. The audio files and streams can be recorded using microphone 234. Alternatively, the audio files and streams can be imported into the video call devices 102 from other sources, such as memory cards, other physical devices, networks such as the Internet, and the like. The audio files can be transferred as files or streamed during the video call.

The described video call devices 102 allow users to share applications and screens during a video call. For example, during execution of an application a live computer screen can be shared as a video or one or more snapshots. In many cases, sharing an application is more useful than sharing documents produced by the application. One advantage is that the other users need not execute, or even possess, the application. Another advantage is that the document views produced by the application are the same for all users. For example, it is more useful to share a view of a small portion of a spreadsheet than to share the spreadsheet file, have all users execute the spreadsheet application, and then have all users navigate to the same place in the spreadsheet. As another example, it is simpler to share views of a secure webpage than to have all of the users log on to the web site and find the same view of the same webpage.

The described video call devices 102 allow users to share playlists of media content. The playlist can include only one type of media, or multiple types of media. The media content can be shared consecutively, and in some cases, simultaneously. For example, a playlist of photos can be accompanied by music. The playlist can include media files, media streams, and hyperlinks to media content. URLs to photo playlists can also be shared, such that photos are fetched and rendered in a synchronized manner for all participants during the call. The order of stepping through the media can be chosen by any participant during the call.

The described video call devices 102 allow users to share media content using hyperlinks for the media content during a video call. For example, the hyperlink can be the URL of a webpage, which is then both fetched and rendered in a synchronized manner for all participants during the video call. URLs to playlists can also be shared, such that playlist media content are fetched and rendered in a synchronized manner for all participants during the call. The order of stepping through the media content can be chosen by any participant during the call. The hyperlinks can be transferred as files during the video call.

The media content shared during the video call can originate from many sources. For example, a participant can insert an SD card storing a media file into the SD interface 252 of a video call device 102. A participant can obtain a media file from an external device such as a smartphone, computer, or the like using the USB interface 254 of a video call device 102, the wired network adapter 238 or wireless network adapter 236 of a video call device 102, or the like. A participant can record a media file using the camera 232 of a video call device 102. A participant can download a media file from the Internet into a video call device 102. A participant can generate snapshots of web pages while browsing the web on a video call device 102. A participant can share a URL for media files stored in a network such as the Internet. Of course, other sources can be used.

Synchronized Media Playback

In some embodiments, the described video call devices 102 also provide synchronized media playback such that all participants of a video call have an almost identical contemporaneous experience while playing the media content. During a video call, media files can be downloaded or streamed from the Internet to video call devices 102 or transmitted from one video call device 102 to another with the goal of playing them back in a synchronized manner. The video call devices 102 employ synchronization algorithms to keep track of what data are available at what video call devices 102 at a certain time. To minimize latencies, some files are transmitted (or fetched) ahead of time. Data are pre-fetched (or pre-transmitted) in anticipation that those data will be required as the playback advances. The synchronization algorithms also consider how much network throughput is available, while not adversely affecting the audio/video packet flow.
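The pre-fetch behavior described above can be sketched as follows. The byte-budget model is an assumption for illustration; the disclosure states only that data are pre-fetched in anticipation of playback while respecting available network throughput and not disturbing the audio/video packet flow.

```python
def plan_prefetch(playlist_sizes, position, budget_bytes):
    """Pick the upcoming playlist items (beyond the current one) to fetch
    ahead of time, without exceeding the throughput budget left over
    after the primary audio/video streams are served."""
    plan, used = [], 0
    for index in range(position + 1, len(playlist_sizes)):
        size = playlist_sizes[index]
        if used + size > budget_bytes:
            break  # fetching this item would crowd out the A/V streams
        plan.append(index)
        used += size
    return plan
```

With item sizes of 100, 200, 300, and 400 bytes, playback at item 0, and a 600-byte budget, only the next two items would be pre-fetched.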

Synchronization scenarios are described below. In sharing files such as photos, synchronization involves determining when all participants have obtained the file. In sharing streams and video files, synchronization involves matching playback positions in the streams.
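The stream-position matching mentioned above can be sketched as follows. The clock terms are assumptions for illustration; the disclosure states only that playback positions in the streams are matched, and a real implementation would also need to account for clock skew between endpoints.

```python
def synchronized_position(peer_position, peer_timestamp, local_time):
    """Estimate where local playback should be so it matches a peer's
    stream: the peer's reported playback position plus the time elapsed
    since the peer reported it (all values in seconds)."""
    return peer_position + (local_time - peer_timestamp)
```

For example, if a peer reported position 10.0 s half a second ago, local playback should seek to 10.5 s.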

The media can originate not only from video call participants' end-points, but also from remote sites such as Internet servers. In the latter case, each participant's video call device 102 fetches the media using the same URLs in a synchronized manner such that the media can be played back in a synchronized manner.

In some embodiments, other events can be synchronized during a video call. For example, when a video call participant modifies the shared media content, the modification can be synchronized so that it is rendered simultaneously for all video call participants. Example modifications can include rotating an image after it is shared, changing the cursor position in a video (jogging), zooming in on an image or video, marking up the shared media content, and the like.
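The modification-event synchronization described above can be sketched as follows. The event fields are assumptions for illustration; the disclosure states only that modifications such as rotating, jogging, zooming, and markup are rendered simultaneously for all participants.

```python
def make_modification_event(media_id, action, timestamp, **detail):
    """Describe a user's modification of shared media content so the far
    end can render the same change at the same playback time."""
    return {"media_id": media_id, "action": action,
            "timestamp": timestamp, "detail": detail}

def apply_modification(state, event):
    """Apply a received modification event to the local rendering state,
    keyed by the media item it modifies."""
    state.setdefault(event["media_id"], []).append(
        (event["timestamp"], event["action"], event["detail"]))
    return state
```

A rotation of a shared photo, for instance, would be sent once and applied identically at every endpoint.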

FIG. 3 shows a process 300 for the video calling system 100 of FIG. 1 according to an embodiment where a first video call device 102A shares local media content with a second video call device 102B during a video call. Although in the described embodiments the elements of process 300 are presented in one arrangement, other embodiments may feature other arrangements. For example, in various embodiments, some or all of the elements of process 300 can be executed in a different order, concurrently, and the like. Also, some elements of process 300 may not be performed, or may not be executed immediately after each other. For clarity, only two video call devices 102A,B are shown in FIG. 3. However, it should be understood that more than two video call devices 102 can participate in process 300.

Referring to FIG. 3, the video call devices 102A,B conduct a video call. In particular, the first video call device 102A receives first audio information and first video information AV1 at 302, for example from a local camera 232 and microphone 234. The second video call device 102B receives second audio information and second video information AV2 at 304. The video call devices 102A,B exchange the first and second audio and video information AV1 and AV2 at 306. The first video call device 102A renders the second audio and video information AV2 at 308, for example on television set 106A. The second video call device 102B renders the first audio and video information AV1 at 310, for example on television set 106B. This exchange can continue for the remainder of process 300.

At 312, the first video call device 102A receives local media content. For example, the first video call device 102A can receive a photo stored on an SD card inserted in SD interface 252. At 314, the first video call device 102A sends the media content to the second video call device 102B. While the media content is being transferred, the displays at both ends of the video call show the status of the transfer.

At 316, when the media content is available at video call device 102B, the video call devices 102 exchange one or more synchronization commands over the command channel to indicate completion of the transfer. For example, at one video call device 102, the processor 210 generates one or more synchronization commands, and either transmitter 240 or transmitter 246 transmits signals representing the one or more synchronization commands. At the other video call device 102, either receiver 242 or receiver 248 receives the signals representing the one or more synchronization commands, and processor 210 controls the playback of the media content according to the one or more synchronization commands. In response to the commands, the video call devices 102 render the media content at the same time. For example, the video call devices 102 render a photo simultaneously. As another example, the video call devices 102 begin playback of a video file simultaneously. In particular, video call device 102A renders the media content at 318, and video call device 102B renders the media content at 320.
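The completion handshake at 316 can be modeled simply: each endpoint reports when its copy of the media content is available, and a common render instant is scheduled only once all reports are in. This is a hedged sketch; the function name and the fixed command-latency margin are assumptions, not from the disclosure.

```python
def render_time_when_all_ready(ready_times, command_margin):
    """ready_times maps each device to the wall-clock time its transfer
    completed, or None if the transfer is still in progress. Returns a
    common render instant, padded by command_margin so the final
    synchronization command can reach every endpoint, or None if any
    device is not yet ready."""
    if any(t is None for t in ready_times.values()):
        return None
    return max(ready_times.values()) + command_margin
```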

In FIG. 3, a two-way exchange of synchronization commands is shown. In other embodiments, a single synchronization command can be sent. For example, the second video call device 102B can send a single synchronization command to the first video call device 102A when the transfer of the media content to the second video call device 102B is complete, is sufficiently complete to begin playback, or the like. For example, rendering of a video can begin when a sufficient amount of the video has been transferred rather than waiting for the transfer to complete. As another example, when the media content includes multiple photos, rendering of one photo can begin while subsequent photos are being transferred.
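The "sufficiently complete to begin playback" test can be made precise with a simple rate argument: if the remaining bytes will download no slower than playback consumes the file, playback started now never stalls. The sketch below assumes constant download and playback rates; the function and parameter names are illustrative, not from the disclosure.

```python
def sufficiently_complete(received_bytes, total_bytes, download_rate, playback_rate):
    """True if playback can start now without stalling: the time to fetch
    the remaining bytes must not exceed the time to play the whole file
    (both rates in bytes per second, assumed constant)."""
    remaining_download_s = (total_bytes - received_bytes) / download_rate
    total_playback_s = total_bytes / playback_rate
    return remaining_download_s <= total_playback_s
```

For example, with 2 MB of a 10 MB video received, a 1 MB/s download rate, and a 0.5 MB/s playback bitrate, the remaining 8 seconds of download fit comfortably within the 20-second playback, so the single synchronization command could be sent early.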

FIG. 4 shows a process 400 for the video calling system 100 of FIG. 1 according to an embodiment where two video call devices 102A,B share media content stored at a remote location during a video call. Although in the described embodiments the elements of process 400 are presented in one arrangement, other embodiments may feature other arrangements. For example, in various embodiments, some or all of the elements of process 400 can be executed in a different order, concurrently, and the like. Also, some elements of process 400 may not be performed, and elements may not be executed immediately after one another. For clarity, only two video call devices 102A,B are shown in FIG. 4. However, it should be understood that more than two video call devices 102 can participate in process 400.

Referring to FIG. 4, the video call devices 102A,B conduct a video call. In particular, the first video call device 102A receives first audio information and first video information AV1 at 402, for example from a local camera 232 and microphone 234. The second video call device 102B receives second audio information and second video information AV2 at 404. The video call devices 102A,B exchange the first and second audio and video information AV1 and AV2 at 406. The first video call device 102A renders the second audio and video information AV2 at 408, for example on television set 106A. The second video call device 102B renders the first audio and video information AV1 at 410, for example on television set 106B. This exchange can continue for the remainder of process 400.

At 412, the first video call device 102A receives a hyperlink such as a URL. For example, a user of the first video call device 102A can input the hyperlink using a remote control or wireless keyboard. At 414, the first video call device 102A sends the hyperlink to the second video call device 102B. Each video call device 102 uses the hyperlink to independently get the media content from a network server 430 indicated by the hyperlink. In particular, at 416 the first video call device 102A uses the hyperlink to get the media content, and at 418 the second video call device 102B uses the hyperlink to get the media content.
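The hyperlink path at 414-418 means each endpoint retrieves the media itself rather than relaying the bytes through the sender. A minimal sketch, with `fetch` standing in for a device's HTTP client and the URL purely illustrative:

```python
def fetch_at_all_endpoints(url, devices, fetch):
    """Every video call device fetches the same URL independently; the
    synchronized render can begin once all endpoints hold the media.
    `fetch` is injected so any HTTP client can be used."""
    content = {dev: fetch(url) for dev in devices}
    all_ready = all(c is not None for c in content.values())
    return content, all_ready
```

A design consequence worth noting: because each device pulls from the server directly, the sender's uplink carries only the short hyperlink, not the media bytes, which keeps bandwidth free for the audio/video packet flow.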

At 420, when the media content is available at both video call devices 102A,B, the video call devices 102 exchange one or more synchronization commands over the command channel to indicate completion of the transfer. In response to the commands, the video call devices 102 render the media content at the same time. For example, the video call devices 102 render a photo simultaneously. As another example, the video call devices 102 begin playback of a video file simultaneously. In particular, video call device 102A renders the media content at 422, and video call device 102B renders the media content at 424.

Media Quality Preservation

In some embodiments, the described video call devices 102 preserve the apparent quality of media shared during a video call. One illustrative example is the case of photo sharing. In some current photo sharing approaches, a photo slideshow is treated as a video. That is, the photos are compressed using a video compression technique and streamed to other participants. However, this approach introduces undesirable video compression artifacts. For example, each photo appears blurry at first, and is then enhanced over time. As another example, hardware video encoders are often forced to insert a key frame (also called an intra frame, intra-coded frame, or I-frame) at regular intervals. In the case of a photo, this is equivalent to communicating the photo again from scratch, which can cause the photo's appearance to cycle repeatedly from blurry to crisp and back to blurry. In contrast, described embodiments send the photo files to all participants so the photos appear the same to all participants, including the sender.

The described video call devices 102 also consider the resolution of the display devices employed by video call participants, both for the primary video stream and for media sharing. During setup of a video call, each video call device 102 informs the other video call devices 102 of its display device resolution. During the video call, the video call devices 102 employ transcoding to generate video streams of appropriate resolutions, thereby preserving video quality while reducing bandwidth usage.
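The resolution negotiation can be reduced to picking a transcode target no larger than the receiver's display while preserving the source aspect ratio. A sketch under those assumptions (the function name is hypothetical, not from the disclosure):

```python
def negotiated_stream_size(source, display):
    """Scale the source down (never up) to fit the receiver's display,
    preserving aspect ratio: quality is kept while no bandwidth is
    spent on pixels the display cannot show. Sizes are (width, height)."""
    sw, sh = source
    dw, dh = display
    scale = min(dw / sw, dh / sh, 1.0)  # 1.0 cap prevents upscaling
    return (round(sw * scale), round(sh * scale))
```

So a 1080p source sent to a 720p display is transcoded down to 1280x720, while the same source sent to a 4K display is passed through at its native 1920x1080.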

FIG. 5 shows a process 500 for the video calling system 100 of FIG. 1 according to an embodiment where a first video call device 102A transcodes video and shared local media content for a second video call device 102B during a video call. Although in the described embodiments the elements of process 500 are presented in one arrangement, other embodiments may feature other arrangements. For example, in various embodiments, some or all of the elements of process 500 can be executed in a different order, concurrently, and the like. Also, some elements of process 500 may not be performed, and elements may not be executed immediately after one another. For clarity, only two video call devices 102A,B are shown in FIG. 5. However, it should be understood that more than two video call devices 102 can participate in process 500.

Referring to FIG. 5, at 502, the second video call device 102B sends a link partner media quality request to the first video call device 102A. The link partner media quality request indicates a desired quality for video call video and the media content. For example, the link partner media quality request can indicate the resolution of the display device connected to the second video call device 102B.

The video call devices 102A,B then conduct a video call. In particular, the first video call device 102A receives first audio information and first video information AV1 at 504, for example from a local camera 232 and microphone 234. The second video call device 102B receives second audio information and second video information AV2 at 506. The video call devices 102A,B exchange the first and second audio and video information AV1 and AV2 at 510. The first video call device 102A renders the second audio and video information AV2 at 512, for example on television set 106A. The second video call device 102B renders the first audio and video information AV1 at 514, for example on television set 106B. However, the first video call device 102A transcodes the first video information according to the desired quality at 508 prior to sending the first video information to the second video call device 102B. In particular, CODEC 218 of the first video call device 102A transcodes the first video information. This exchange can continue for the remainder of process 500.

At 516, the first video call device 102A receives local media content. For example, the first video call device 102A can receive a photo stored on an SD card inserted in SD interface 252. At 518, the first video call device 102A transcodes the media content according to the desired quality prior to sending the media content to the second video call device 102B. In particular, CODEC 218 of the first video call device 102A transcodes the media content. At 520, the first video call device 102A sends the transcoded media content to the second video call device 102B. While the media content is being transferred, the displays at both ends of the video call show the status of the transfer.

At 522, when the transcoded media content is available at video call device 102B, the video call devices 102 exchange one or more synchronization commands over the command channel to indicate completion of the transfer. In response to the commands, the video call devices 102 render the media content at the same time. For example, the video call devices 102 render a photo simultaneously. As another example, the video call devices 102 begin playback of a video file simultaneously. In particular, video call device 102A renders the media content at 524, and video call device 102B renders the transcoded media content at 526.

In FIG. 5, a two-way exchange of synchronization commands is shown. In other embodiments, a single synchronization command can be sent. For example, the second video call device 102B can send a single synchronization command to the first video call device 102A when the transfer of the transcoded media content to the second video call device 102B is complete, is sufficiently complete to begin playback, or the like.

Embodiments of the disclosure can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. Embodiments of the disclosure can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method steps of the disclosure can be performed by a programmable processor executing a program of instructions to perform functions of the disclosure by operating on input data and generating output. The disclosure can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer will include one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).

A number of implementations of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. Accordingly, other implementations are within the scope of the following claims.

Claims

1. A video call device comprising:

a video input interface configured to receive first video information;
an audio input interface configured to receive first audio information;
a transmitter configured to transmit first signals during a video call, wherein the first signals represent the first video information and the first audio information;
a receiver configured to receive second signals during the video call, wherein the second signals represent second video information and second audio information;
a video output interface configured to provide the second video information;
an audio output interface configured to provide the second audio information;
wherein the transmitter is further configured to transmit third signals during the video call, wherein the third signals represent at least one of media content and a hyperlink, wherein the hyperlink indicates a location of the media content.

2. The video call device of claim 1, further comprising:

an encoder/decoder (CODEC);
wherein the receiver is further configured to receive fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the first video information;
wherein the CODEC is configured to transcode the first video information according to the desired quality prior to the first signals being transmitted by the transmitter.

3. The video call device of claim 1, further comprising:

a media interface configured to receive the media content;
wherein the third signals represent the media content.

4. The video call device of claim 3, further comprising:

an encoder/decoder (CODEC);
wherein the receiver is further configured to receive fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the media content;
wherein the CODEC is configured to transcode the media content according to the desired quality prior to the third signals being transmitted by the transmitter.

5. The video call device of claim 3, wherein the media interface comprises at least one of:

an SD card interface;
a USB interface; and
a mass storage interface.

6. The video call device of claim 1, further comprising:

a processor configured to generate one or more first playback synchronization commands, wherein the first playback synchronization commands include timing information for playback of the media content;
wherein the transmitter is further configured to transmit fourth signals during the video call, wherein the fourth signals represent the one or more first playback synchronization commands.

7. The video call device of claim 6, wherein:

the receiver is further configured to receive fifth signals during the video call, wherein the fifth signals represent one or more second playback synchronization commands; and
the processor is further configured to control playback of the media content according to the one or more second playback synchronization commands.

8. The video call device of claim 7, wherein the playback synchronization commands represent at least one of:

a file transfer status for the media content;
a playback position for the media content; and
a time of a modification of the media content by a user.

9. The video call device of claim 1, further comprising:

one or more cameras configured to provide the first video information to the video input interface; and
one or more microphones configured to provide the first audio information to the audio input interface.

10. A method comprising:

receiving first video information;
receiving first audio information;
transmitting first signals during a video call, wherein the first signals represent the first video information and the first audio information;
receiving second signals during the video call, wherein the second signals represent second video information and second audio information;
providing the second video information;
providing the second audio information; and
transmitting third signals during the video call, wherein the third signals represent at least one of media content and a hyperlink, wherein the hyperlink indicates a location of the media content.

11. The method of claim 10, further comprising:

receiving fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the first video information; and
transcoding the first video information according to the desired quality prior to transmitting the first signals.

12. The method of claim 10, further comprising:

receiving the media content;
wherein the third signals represent the media content.

13. The method of claim 12, further comprising:

receiving fourth signals, wherein the fourth signals represent a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the media content; and
transcoding the media content according to the desired quality prior to transmitting the third signals.

14. The method of claim 10, further comprising:

generating one or more first playback synchronization commands, wherein the first playback synchronization commands include timing information for playback of the media content; and
transmitting fourth signals during the video call, wherein the fourth signals represent the one or more first playback synchronization commands.

15. The method of claim 14, further comprising:

receiving fifth signals during the video call, wherein the fifth signals represent one or more second playback synchronization commands; and
controlling playback of the media content according to the one or more second playback synchronization commands.

16. The method of claim 15, wherein the playback synchronization commands represent at least one of:

a file transfer status for the media content;
a playback position for the media content; and
a time of a modification of the media content by a user.

17. Non-transitory computer-readable media embodying instructions executable by a computer to perform functions comprising:

receiving first video information and first audio information;
causing transmission of first signals during a video call, wherein the first signals represent the first video information and the first audio information;
providing second video information and second audio information based on second signals received during the video call, wherein the second signals represent the second video information and the second audio information;
causing transmission of third signals during the video call, wherein the third signals represent at least one of media content and a hyperlink, wherein the hyperlink indicates a location of the media content.

18. The non-transitory computer-readable media of claim 17, wherein the functions further comprise:

receiving a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the first video information; and
transcoding the first video information according to the desired quality prior to causing transmission of the first signals.

19. The non-transitory computer-readable media of claim 17, wherein the functions further comprise:

receiving the media content;
wherein the third signals represent the media content.

20. The non-transitory computer-readable media of claim 19, wherein the functions further comprise:

receiving a link partner media quality request, wherein the link partner media quality request indicates a desired quality for the media content; and
transcoding the media content according to the desired quality prior to causing transmission of the third signals.

21. The non-transitory computer-readable media of claim 17, wherein the functions further comprise:

generating one or more first playback synchronization commands, wherein the first playback synchronization commands include timing information for playback of the media content; and
causing transmission of fourth signals during the video call, wherein the fourth signals represent the one or more first playback synchronization commands.

22. The non-transitory computer-readable media of claim 21, wherein the functions further comprise:

receiving one or more second playback synchronization commands; and
controlling playback of the media content according to the one or more second playback synchronization commands.

23. The non-transitory computer-readable media of claim 22, wherein the playback synchronization commands represent at least one of:

a file transfer status for the media content;
a playback position for the media content; and
a time of a modification of the media content by a user.
Patent History
Publication number: 20120287231
Type: Application
Filed: May 13, 2012
Publication Date: Nov 15, 2012
Inventors: Sreekanth Ravi (Atherton, CA), Sudhakar Ravi (Atherton, CA), Jeremy Zullo (Livermore, CA), Aditya Mavlankar (Foster City, CA), Gaurav Gupta (Palo Alto, CA)
Application Number: 13/470,336
Classifications
Current U.S. Class: Transmission Control (e.g., Resolution Or Quality) (348/14.12); 348/E07.081; 348/E05.009
International Classification: H04N 7/14 (20060101); H04N 5/04 (20060101);