Method of relaying digital video & audio data via a communications medium

Info

Publication number: 20010056575
Type: Application
Filed: Apr 5, 2001
Publication Date: Dec 27, 2001
Inventors: Winston Ser-Tuen Wei (Singapore), Zhong Hua Xu (Singapore)
Application Number: 09827059

Abstract

A method of creating a hybrid file provides at least a first file and a second file, each file having contents in a file type that is different from the file types of all the other files. The method then reads and interprets the at least first and second files, and creates a hybrid file that contains the contents of the at least first and second files, the hybrid file having a hybrid file type that is different from the file types of the at least first and second files.

Description

RELATED CASES

[0001] This is a continuation-in-part of application Ser. No. 09/053,878, filed Apr. 1, 1998 and entitled “A Method of Relaying Digital Video & Audio Data Via A Communications Medium”.

FIELD OF THE INVENTION

[0002] The present invention relates to a method of relaying digital video and audio data over a communications medium and relates particularly but not exclusively to such in an Internet environment.

DESCRIPTION OF PRIOR ART

[0003] Hitherto, when digital motion video data has been relayed on a communication medium such as the Internet, it has been usual for the duration of the motion video to be kept to a minimum because of the slow speed of the Internet transmission process and the very large size of the files necessary to transmit the video content. Because of the slow speed of the Internet, such as 28.6 k or 56 k, and the large file size, it has meant that a motion video image of image size of say 60×60 pixels, and 15 seconds duration can take up to 10 minutes or more to be downloaded. This is generally considered unacceptable. The resultant image quality is also generally considered unacceptable. Moreover, the width and height size is kept as small as possible to reduce the file size. Further, the frame rate is sometimes reduced from NTSC 30 frames per second or PAL 25 frames per second, to around 15 frames per second to again reduce the file size. In some instances, the picture detail of the video image is traded off by having low resolution images and consequential relatively smaller file sizes. Generally, transmission of motion video via the Internet has not been entirely satisfactory owing to the large transmission times needed to transmit the entire video content. The problem is further exacerbated if audio is to accompany the motion video transmission as the audio files are in general “wave” files and digital files in this format also have a very large file size.

[0004] It is generally acknowledged that the large file size of a video file can be reduced by a suitable compression codec such as an MPEG (Motion Picture Expert Group) codec or other codecs. The “wave” files can be similarly suitably compressed. In some motion video applications, video data and audio file data are interleaved, such as in the AVI Video for Windows file format. In general, these techniques have not elevated video transmission over the Internet to a position where it can be considered as an efficient commercial option.

[0005] Several attempts have been made to provide enhanced video transmission over the Internet. For example, U.S. Pat. No. 5,132,792 describes a system that uses a frame dropping process to remove predetermined frames from a transmission. In general, the more frames that are dropped, the more efficiently the video signals are transmitted; however, this tends to deteriorate video quality by providing jerky video movement. Accordingly, this solution has limited application.

SUMMARY OF THE INVENTION

[0006] The present invention attempts to overcome one or more of the aforementioned problems by a completely different technique. In this technique one or more frames are relayed together with data of filters and parameters therefor to be applied to the one or more frames. When received, the data of one or more frames is then self generating over many frames with filters such as ZOOM, PAN, FADE etc., applied over those frames and this results in an apparent motion video image in the final video. As a consequence of the relayed data, the resultant motion video is generated at receivers' personal computers. This substantially shortens the transmission time. An example of the invention has particular application in relaying motion video and audio data via the Internet. The video and audio content is particularly suited for providing motion video information in relation to real estate, where motion video of buildings/apartments can be received by intending purchasers to obtain a first hand viewing before making a physical site inspection or a purchase. In the case of use in a real estate environment, the system also enables a 2-D floor plan of the real estate to be provided and a walk through path provided on the 2-D floor plan indicating the path taken by a person walking through the real estate as the video plays. This, in turn, gives an apparent indication of the position in the real estate at any given instant of the viewed video. Whilst the invention has particular application in conveying video information concerning real estate, the invention should not be considered limited thereto as it has application in many other fields.

[0007] According to a first broad aspect of the present invention there is provided a method of relaying digital video data, said method involving providing a set of digital data of one frame representing a scene of a video image, assembling data of one or more filters and parameters therefor to be applied to the digital data over a plurality of repeated frames of that scene, relaying the digital data and the data of the filters and parameters therefor from one location to another location over a communications medium, and at said another location receiving and reproducing the digital data over the plurality of repeated frames with the one or more video filters and parameters therefor applied whereby to generate at the another location an apparent video movie over those reproduced video frames and whereby the transmission time over the transmission medium is less than that required to transmit digital data of a video movie of the same content if digital data of all of those reproduced video frames were transmitted.

[0008] The present invention also provides a method of creating a hybrid file, in which at least a first file and a second file are provided, each file having contents in a file type that is different from the file types of all the other files. The method then reads and interprets the at least first and second files, and creates a hybrid file that contains the contents of the at least first and second files, the hybrid file having a hybrid file type that is different from the file types of the at least first and second files.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] In order that the invention can be more clearly ascertained a preferred example for use in a real estate environment will now be described with reference to the accompanying drawings wherein:

[0010] FIG. 1 is a block schematic diagram showing functional descriptions of processes involved;

[0011] FIGS. 2a through 2e are a series of diagrams time synchronised to one another showing:-

[0012] In FIG. 2a, a type of video time-line for three scenes;

[0013] In FIG. 2b, a time duration for each of the scenes;

[0014] In FIG. 2c, a duration, commencement and end times for various transition filters;

[0015] In FIG. 2d, a time-line representation of further filters; and

[0016] In FIG. 2e, timing points T1 and T2 of the end of scene 1 and start of scene 2, and the end of scene 2 and the start of scene 3;

[0017] FIG. 3 is a 2-D floor plan of real estate and a walk through route-of-path therein;

[0018] FIG. 4 is a diagram of a viewing preview window on a receiving computer;

[0019] FIGS. 5, 6 and 7 show diagrams of the image sizes of the scenes 1, 2 and 3;

[0020] FIGS. 8 and 9 show interconnected functional diagrams of a “production system” and connection with the Internet; and

[0021] FIGS. 10 through 12 show functional diagrams of a user connecting with the Internet and downloading video and audio files and subsequently reproducing the video/audio at the user's PC.

[0022] FIG. 13 illustrates how a plurality of files having different file types are merged to create a single hybrid file according to the present invention.

[0023] FIG. 14 illustrates the format in which different files can be represented under the present invention.

[0024] FIG. 15 is a flowchart illustrating how a plurality of files having different file types are merged to create a single hybrid file according to the present invention.

[0025] FIG. 16 illustrates a hybrid file according to the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0026] Referring firstly to FIG. 1 it can be seen that the preferred method for use in a real estate environment is implemented in a system which comprises three primary blocks comprising a production system 1, the Internet 2 as a communication medium, and one or more receiving computers 3. Whilst FIG. 1 shows only video encoding in the production system 1, it should be understood that both video and audio can be encoded and used in the method. In a broader sense, only video may be encoded and transmitted without audio. A video server 20 is provided as a means to store the video/audio data relating to individual items of real estate which can be accessed by receiving computers 3.

[0027] Reference will firstly be made to FIGS. 2a through 2e, and FIGS. 3 through 7 in order to obtain an understanding of the system. Reference will then be made again to FIG. 1, and then to FIGS. 8 through 12 for a detailed explanation of other aspects of the system.

[0028] Firstly, it should be understood that the motion video image which a user observes at a monitor of a receiving computer 3 is generated at the receiving computer 3. There is no corresponding video at the video server 20 but rather information which can be utilised to enable a receiving computer 3 to self generate a motion video sequence from still frame images.

[0029] In a very simple system, if a video sequence to be observed by a person on a monitor of a receiving computer 3 is to be, say, 20 seconds long, the video sequence could be made up from a single still frame picture image.

[0030] In other words, if the video frame rate is 15 frames per second then the video could be made up from 15×20 identical images. A video of that nature would not appear to have any motion when viewed on a monitor of a receiving computer 3 because all frames would be identical in content. For this video it is only necessary to transmit one frame of the picture frame image and then to self generate the remaining frames at the receiving computer. Accordingly, in the simple system referred to above, digital data representative of the image of a single frame can be relayed over the transmission medium (the Internet 2) and received in a receiving computer 3, and then the computer 3 will self generate the video locally. The video image then viewed by a viewer on the monitor screen of a receiving computer 3 would appear as a stationary image for the 20 second duration of the video.
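The frame count in this simple case follows directly from the frame rate and the duration. The following is a minimal sketch only; the helper names `frame_count` and `self_generate`, and the use of a file name to stand in for the image data, are assumptions made for illustration and do not come from the specification.

```python
def frame_count(fps: int, duration_s: float) -> int:
    """Number of frames the receiver must display for the sequence."""
    return round(fps * duration_s)


def self_generate(still_frame, fps: int, duration_s: float) -> list:
    """Repeat one transmitted still frame for the whole duration.

    Hypothetical helper: only `still_frame` crosses the network; the
    repeated frames are produced locally at the receiving computer.
    """
    return [still_frame] * frame_count(fps, duration_s)


# "hallway.jpg" is a placeholder for the single relayed image.
frames = self_generate("hallway.jpg", fps=15, duration_s=20)
print(len(frames))  # 15 x 20 = 300 frames from one transmitted image
```

With no filter applied, every generated frame is the same object, which is why the viewer would see a stationary image.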

[0031] If motion is to be observed in the video content then an apparent moving video image can be created by the self generation at the receiving computer 3 if that receiving computer 3 knows how to manipulate the video image of the single still frame for each particular frame of the 20 second period duration.

[0032] Accordingly, when a video image is created for relaying over the transmission medium by the production system 1, it is also produced with information concerning the way the image is to be manipulated. Throughout the specification we refer to the way in which the video image is manipulated by the use of term “filters”. These filters can comprise one or more of ZOOM, PAN, TILT, DISTORT, MORPH, FADE etc and also transition effect filters, as will be discussed in relation to a more complex example.

[0033] If it is assumed that a ZOOM filter is applied to the image of each of the frames in the video sequence, and that the zooming progresses from one frame to the next frame etc over the duration of the video sequence, then an observer of a monitor screen on the receiving computer 3 will have the impression of an apparent motion video sequence because the image will appear to be changing.
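One way to picture a progressive ZOOM is as a crop box that shrinks from frame to frame, with each crop then scaled back up to the preview window. The sketch below assumes a centred zoom whose factor is interpolated linearly between the first and last frame; that interpolation rule and the function name `zoom_crops` are illustrative assumptions, not the algorithm of the specification.

```python
def zoom_crops(width, height, n_frames, end_scale):
    """Return one (left, top, right, bottom) crop box per frame.

    Assumption: the zoom factor runs linearly from 1.0 (whole image)
    to `end_scale` (> 1.0 zooms in) and the crop stays centred.
    """
    boxes = []
    for i in range(n_frames):
        # Progress through the sequence, 0.0 at the first frame, 1.0 at the last.
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        scale = 1.0 + t * (end_scale - 1.0)
        crop_w, crop_h = width / scale, height / scale
        left = (width - crop_w) / 2
        top = (height - crop_h) / 2
        boxes.append((left, top, left + crop_w, top + crop_h))
    return boxes


boxes = zoom_crops(180, 180, n_frames=5, end_scale=2.0)
print(boxes[0])   # whole image on the first frame
print(boxes[-1])  # central half-size region on the last frame
```

Because each frame uses a slightly different crop of the same still image, the displayed picture changes from frame to frame even though only one image was relayed.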

[0034] Accordingly, it can be appreciated therefore, that the time taken to relay the single frame image, together with information concerning the duration of the proposed video sequence, and information concerning any filters to be applied, will be far less than that required to relay a complete motion video sequence from the video server 20 where each frame is relayed one after the other. Thus, with the system that is proposed, the relaying time over the transmission medium is substantially less than that required to relay digital data of a motion video movie of the same content if digital data of all those reproduced video frames were transmitted.

[0035] If the video image of a still frame is compressed according to a particular codec, further efficiencies can be obtained with regard to the relaying time over the communications medium.

[0036] The codec used for compression of the still frame image of the scene can be any particular codec and is not limited solely to MPEG or JPEG or the like. However, it is preferred to use a JPEG compression codec as it is readily available and operates reasonably satisfactorily.

[0037] FIG. 2a shows a type of video time-line representation of three scenes. It can be seen that scene 1 is an image of a hallway, scene 2 an image of a living room, and scene 3 an image of a bedroom. In FIG. 3 there is shown a floor plan in 2-D format of real estate in the form of an apartment. Scene 1, being the hallway, is diagrammatically shown as having a picture image plane represented by plane A. Scene 2, being a living room, is represented by a picture image plane B. Scene 3, being a bedroom, is represented by a picture image plane C. Thus, in order to provide an apparent video of the apartment, three picture frame images A, B and C are provided to provide a total video sequence of a duration of 11 seconds. This is shown in FIG. 2b. The hallway part of the total video sequence has a duration of 5 seconds. The living room part of the video has a duration of 2 seconds. The bedroom part of the total video sequence has a duration of 4 seconds. The time-line shown in FIG. 2a shows only the first and last frames of each of the individual video parts. It also shows that the video part for the living room abuts the video part for the hallway, and that the bedroom part of the video abuts the living room part of the video.
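The scene durations given above fix the total length and the points at which one scene gives way to the next. A minimal sketch of that arithmetic follows; the variable names and the `scene_at` helper are assumptions introduced for illustration.

```python
from itertools import accumulate

# (scene name, duration in seconds) as described for FIG. 2b.
SCENES = [("hallway", 5.0), ("living room", 2.0), ("bedroom", 4.0)]

ends = list(accumulate(d for _, d in SCENES))  # end time of each scene
starts = [0.0] + ends[:-1]                     # start time of each scene
total_duration = ends[-1]                      # 5 + 2 + 4 = 11 seconds
transition_points = ends[:-1]                  # scene boundaries: 5 s and 7 s


def scene_at(t):
    """Return the name of the scene showing at time t (seconds)."""
    if not 0.0 <= t <= total_duration:
        raise ValueError("time outside the video sequence")
    for (name, _), end in zip(SCENES, ends):
        if t < end:
            return name
    return SCENES[-1][0]  # t equals the very end of the sequence


print(total_duration, transition_points, scene_at(6.0))
```

Because the scenes abut one another, each boundary is both the end of one scene and the start of the next.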

[0038] FIG. 2c shows a transition in the form of a “RIGHT/LEFT SLIDE WIPE” at the end of the hallway video and the start of the living room video. It also shows a similar transition at the end of the living room video, and the start of the bedroom video. The time durations of the transitions have been arbitrarily shown as 1 second and 0.75 seconds respectively. The transitions could occur over any given time period. Desirably, for a “SLIDE WIPE TRANSITION” which is from right to left, the apparent wipe speed should approximate the apparent speed of a person walking through the apartment along a route-of-path P (see FIG. 3). Thus, in the examples, shown in FIG. 2c with the particular transitions shown, the transition from the living room to the bedroom is slightly faster than the transition from the hallway to the living room. This will have the apparent effect in the video of a person walking quicker at the transition between the living room and the bedroom than at the transition between the hallway and the living room. The transitions are interpreted herein as certain filters.

[0039] FIG. 2d shows that for a 2.5 second period of the hallway video there is a ZOOM IN filter applied and that for the remaining 2.5 second period of that video, there is a PAN filter applied. The PAN filter also extends for a period of 1 second into the living room video. It next shows a ZOOM IN filter for the remaining 1 second period of the living room video and that that ZOOM IN filter is applied for a further 0.5 second period over the bedroom video. It also shows that for a further 1.5 second period of the bedroom video, a ZOOM OUT filter is applied and that for the remaining 2 second period of the bedroom video there is a PAN left to right filter applied.
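Laid end to end against the 5, 2 and 4 second scenes, the filter periods described for FIG. 2d cover the whole 11 second sequence. The sketch below tabulates them as absolute start and end times and looks up which filter is active at a given instant; the tuple layout and the `active_filter` helper are assumptions for illustration only.

```python
# (filter name, start time, end time) in seconds from the start of the video.
FILTERS = [
    ("ZOOM IN", 0.0, 2.5),             # first 2.5 s of the hallway
    ("PAN", 2.5, 6.0),                 # rest of hallway + 1 s of living room
    ("ZOOM IN", 6.0, 7.5),             # rest of living room + 0.5 s of bedroom
    ("ZOOM OUT", 7.5, 9.0),            # next 1.5 s of the bedroom
    ("PAN LEFT TO RIGHT", 9.0, 11.0),  # final 2 s of the bedroom
]


def active_filter(t):
    """Return (name, progress), progress running from 0.0 towards 1.0."""
    for name, start, end in FILTERS:
        if start <= t < end:
            return name, (t - start) / (end - start)
    raise ValueError("no filter scheduled at this time")


print(active_filter(1.25))  # ('ZOOM IN', 0.5)
```

The progress value is what a receiver could feed into a filter to decide how far the zoom or pan has advanced on a given frame.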

[0040] Referring now to FIG. 4 it can be seen that a preview window 4 is available on the monitor screen of a receiving computer 3. Typically, this preview window can have any size and any aspect ratio. It has been arbitrarily shown as having a height of 180 and a width of 180 pixels but of course any desired size and aspect ratio could be utilised. FIG. 5 shows the size of the image for the hallway image which has a height of 180 and a width of 1900 pixels. These dimensions are also arbitrary but because the height 180 equals the height 180 of the preview window 4, the height images correspond with what will be displayed in the preview window 4.

[0041] Similar arrangements are shown in FIGS. 6 and 7 for the images of the living room and of the bedroom. In each case the height is still 180 whereas the lengths are 1400, and 1700 respectively. Each of these images is available at each frame in the total video sequence for the hallway video part, the living room video part, and the bedroom video part respectively. Because the widths are greater than the preview window width then it is possible to provide relative movement between the preview window 4 and the images of the hallway, the living room, and the bedroom, to give an apparent movement by applying a PAN filter. If the height of the images of each of the hallway video, living room video and bedroom video were greater than 180 it would be possible to PAN up or DOWN as well. If PANNING is effected past the sides and or top or bottom of the images then there will be MASKS appearing in the preview window which is undesirable. The intention is to provide an apparent continuous video through the preview window 4 and any MASKS which would occur as a result of over PANNING would destroy the apparent video motion.
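The condition for avoiding MASKS can be stated as a bound on the horizontal offset of the preview window within the image: the offset must stay between 0 and the image width minus the window width. The sketch below clamps a linearly interpolated PAN offset to that range; the function name `pan_offset` and the linear interpolation are assumptions for illustration.

```python
def pan_offset(progress, image_w, window_w, start_x=0, end_x=None):
    """Left edge of the preview window within the image, in pixels.

    `progress` runs from 0.0 to 1.0 over the PAN. The result is clamped
    so the window never leaves the image, which would expose a MASK.
    """
    max_x = image_w - window_w  # furthest left edge that still fits
    if end_x is None:
        end_x = max_x
    x = start_x + progress * (end_x - start_x)
    return int(min(max(x, 0), max_x))


# Hallway image 1900 wide, preview window 180 wide: at most 1720 pixels of travel.
print(pan_offset(0.5, 1900, 180))              # 860
print(pan_offset(1.0, 1900, 180, end_x=2000))  # clamped to 1720
```

The same bound applies vertically when the image is taller than the preview window.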

[0042] Summarising, the preview window has particular dimensions for width and height, and the particular images in each frame for the hallway part of the video, the living room part of the video, and the bedroom part of the video have a much greater width than the width of the preview window. Accordingly, it is possible for a relative PANNING motion to appear through the preview window 4 by scanning across the respective images. As the frames within each of the video sections for the hallway, the living room, and the bedroom are respectively identical, there will appear through the preview window an image which represents a ZOOMING down the hallway, then a PAN from left to right with a SLIDE WIPE transition between scene 1 and scene 2 representing movement into the living room. There will then be a further ZOOM IN in the living room followed by continuous ZOOMING IN to the bedroom, with a SLIDE WIPE transition between the living room and the bedroom. There will then be a ZOOM OUT in the bedroom followed by a further PAN left to right.

[0043] All of the above can be created at the receiving computer 3 by relaying a single picture frame image of each of the hallway scene, the living room scene, and the bedroom scene. Thus, only three single frame images need to be relayed via the communications medium together with information concerning the duration of each of the video components, together with the particular filters which are applied, and the start and end times of the filters. Thus, it can be seen that one or more filters are applied together with parameters therefor and that these are utilized at the receiving computer 3 to self generate an apparent video from only three single frame images. All the missing frames are reconstituted at the receiving computer 3 according to the parameters of the filters and the parameters associated with each of the respective frame images ie the time duration for each of the picture frame images in the video sequence.
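The relayed information described above (still images, their durations, and filters with start and end times) can be pictured as a small data structure. The following sketch is illustrative only: the class and field names are assumptions, the image strings are placeholders for compressed image data, and the 15 frames per second rate is borrowed from the simple example given earlier rather than stated for this sequence.

```python
from dataclasses import dataclass


@dataclass
class Filter:
    name: str
    start_s: float
    end_s: float


@dataclass
class Scene:
    image: str         # placeholder for one still picture frame image
    duration_s: float  # how long this scene runs in the apparent video


@dataclass
class Payload:
    scenes: list
    filters: list
    fps: int = 15      # assumed playback rate

    def frames_transmitted(self) -> int:
        return len(self.scenes)

    def frames_to_generate(self) -> int:
        """Frames the receiving computer reconstitutes locally."""
        return sum(round(s.duration_s * self.fps) for s in self.scenes)


payload = Payload(
    scenes=[Scene("hallway", 5.0), Scene("living room", 2.0), Scene("bedroom", 4.0)],
    filters=[Filter("ZOOM IN", 0.0, 2.5), Filter("PAN", 2.5, 6.0)],
)
print(payload.frames_transmitted(), payload.frames_to_generate())  # 3 165
```

At the assumed rate, three relayed images stand in for 165 displayed frames.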

[0044] Returning now to FIG. 3 it can be seen that an indicator arrowhead Z is shown on the route-of-path P. The arrowhead Z is caused to move along the route-of-path P as viewed on the monitor screen of a receiving computer 3 to provide an indication of approximately where a viewer is in the apartment. In other words, there is a 2-D floor plan of the apartment shown in FIG. 3 with a route-of-path P shown thereon and an arrowhead Z to indicate the approximate position at any given instant during the playing of a video as observed through the preview window 4. FIG. 2e shows transition times T1 and T2 representing the transitions between the hallway part of the video and the living room part of the video, and the living room part of the video and the bedroom part of the video respectively. These timing points T1 and T2 are used to tell controlling software which moves the arrowhead Z along the route-of-path P that it is to change the direction at the transition time T1 or T2 etc. Thus, as a viewer plays a video and operates conventional video player stop, start, pause, rewind buttons which can be displayed on the monitor of the receiving computer 3 concurrently with the video, the video can be made to play, start or stop, pause etc. and the arrowhead Z will assume a position along the route-of-path P which approximates the position of a person on the route-of-path P at that time. Normally, the arrowhead Z moves continuously along the route-of-path P in accordance with the playing of the video.
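The movement of the arrowhead Z can be sketched as interpolation along a polyline whose corner points are reached at the transition times. In the sketch below the vertex coordinates are invented, the times assume T1 = 5 s and T2 = 7 s from the scene durations, and uniform speed within each straight segment is an assumption; none of these details are taken from the specification.

```python
from bisect import bisect_right


def arrow_position(t, vertices, times):
    """Position of the arrowhead on the polyline at playback time t.

    `vertices[i]` is reached at `times[i]`; movement within a segment
    is assumed to be at uniform speed.
    """
    t = min(max(t, times[0]), times[-1])  # clamp to the video duration
    # Index of the segment containing t (the last segment for t at the end).
    i = min(bisect_right(times, t) - 1, len(vertices) - 2)
    f = (t - times[i]) / (times[i + 1] - times[i])
    (x0, y0), (x1, y1) = vertices[i], vertices[i + 1]
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))


# Hypothetical three-segment route; direction changes at T1 = 5 s and T2 = 7 s.
vertices = [(0, 0), (100, 0), (100, 40), (180, 40)]
times = [0.0, 5.0, 7.0, 11.0]

print(arrow_position(2.5, vertices, times))  # halfway down the first segment
print(arrow_position(6.0, vertices, times))  # halfway along the second segment
```

Because the position depends only on the current playback time, pausing or rewinding the video places the arrowhead consistently without extra state.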

[0045] Clearly, an apparent video can be made up from a number of different single frame images representing each of the rooms of an apartment or indeed any other area. A 2-D floor plan of the area can be provided together with a route-of-path which indicates the path of a person walking through the area. The single picture frame images can be related to the position of the person walking through the area, and by applying various filters there can be provided an apparent video of the 2-D area.

[0046] It should also be realised that the video can be of a 3-D volume depending on the particular images of the single picture frames and the particular PANS and other filters applied thereto. Thus, there can be an apparent three dimensional movement. In this case, other 2-D representations can be provided to indicate vertical planes, and a route-of-path can be shown on those planes in a similar manner to the route-of-path on the 2-D plan format shown in FIG. 3. Thus, on the monitor screen of a receiving computer 3, there may be provided several diagrams similar to FIG. 3 but representing plan views, and side elevational views etc. with appropriate route-of-paths drawn thereon together with appropriate arrows on the routes-of-paths. The picture images can be assembled into a digital file together with the parameters relating thereto, including the timing of the duration of each of the video components for each of the still frames and the various filters applied thereto. In the preferred embodiment we store this information in a file with a .MSD file extension. The transition timing information relating to T1, T2 etc. as referred to in relation to FIG. 2e is stored in a file with a .TRK extension. Data representative of the floor plan in 2-D format as shown in FIG. 3 can be represented in a vector, bitmap or other compatible format as a file with a .MGM file extension. These file extensions have arbitrarily been provided by ourselves; however, any file extensions can be utilized. Typically, the video picture images for each frame are compressed according to the JPEG codec, although they may be uncompressed images such as in a bitmap file or other file.

[0047] Thus, when a receiving computer 3 makes contact with the video server 20 via the Internet, it accesses these three file types and then at the receiving computer 3 a video is generated from the still picture frame images. If desired, audio may accompany the video .MSD files. This audio may be compressed, if desired, according to a particular codec. Thus, when received at the receiving computer 3, it can be uncompressed by that codec.

[0048] Returning now to FIG. 1, and with particular reference to the production system block 1, it can be seen that there is an editing block 10. Here, the particular still frame images used to create the entire video are assembled. In the case of the examples shown in relation to FIGS. 2a through 2e there are three scenes of picture images. These scene images are then presented to the video encoding block 15 where they are compressed according to a particular codec such as the JPEG compression codec. The video server block 20 is provided to store all the data files necessary for the video display including the .MSD files, the .MGM files and the .TRK files. Audio files may also be included. The video server transmits all the files to a receiving computer 3 when a receiving computer 3 requests the playing of a video from the Internet address site of the server 20. When the request has been made from the receiving computer 3 to the video server 20, the downloading and decoding is identified in the receiving computer 3 in decoding block 30. The receiving computer 3 will then wait for a request from a reply block 40 to self generate a playable motion video from the digital data which has been transmitted. This is shown in the self generation block 50. The video and audio processor block 60 enables the video and audio from the self generated motion video to be displayed on the monitor of the receiving computer 3. As discussed before, a video player arrangement can be provided on the monitor screen to simulate the normal buttons of a video recorder/player such as stop, play, pause, rewind etc. At the same time as the video and audio processor 60 is processing data, the draw walk-through-route block 65 processes the 2-D floor plan information from the .MGM file and displays the route-of-path thereon. It also processes the .TRK file which controls points of transition of movement of a position indicator in the shape of the arrowhead Z over the route-of-path P. The draw walk-through-route block 65 can also check to see if the video has played in its entirety such that the end block 68 can automatically return the video to its start point for subsequent replaying.

[0049] Referring now to FIG. 8 it can be seen that there are two subsection blocks being create .MSD file 5, and create .TRK file 7. In the create .MSD file block, it is necessary to define some of the parameters relating to the video information which is to be assembled. These are, project name (project name .MSD), preview window display area eg 180×180, 240×180 etc. which represents the pixel size. This all occurs within the block 70 which is a sub-block of the create .MSD file 5.

[0050] It is then necessary to import the still picture scenes in JPEG, TIFF, BITMAP, or other formats to a work space. This occurs in block 75 which is a further sub-block of the create .MSD file 5. It is then necessary to specify the duration of the video parts, and to apply effects to the still picture scene images such as ZOOM IN, ZOOM OUT, PAN, FADE, MORPH, and the like, together with the various start times and end times for those effects. It is also necessary to apply transition filters if they are required, between the ends of the video portions made from each of the still picture images. These transitions may be PAN left to right, PAN right to left, PAN up and down, PAN down up, and other known transition filters. All of these filters are assembled in the block 80 which is a sub-block of the create .MSD file 5.

[0051] In the create .TRK file block 7, a 2-D floor plan is firstly opened in the sub-block 85. The 2-D floor plan can be a vector or bitmap based drawing and has been provided with a .MGM file extension as an arbitrary extension.

[0052] A route-of-path P is then drawn on the 2-D floor plan as a polyline that is made up of one or more straight lines. The example in FIG. 3 shows the walk through route-of-path P having three straight lines. As the route-of-path P is to correspond to particular time instances relative to the picture images in the .MSD files, it is necessary to manipulate the route-of-path so that it is fine tuned and in synchronisation with the resultant video. At present this is done by trial and error by manipulating the length of the individual straight lines in the route-of-path and the particular angles of intersection. The timings T1 and T2 etc. at the change of direction points are also manipulated slightly so that they approximately correspond with the resultant video image. This synchronisation is performed in block 95 in the create .TRK file 7. The resultant information is then saved in a .TRK file in block 97.

[0053] FIG. 9 shows a flow diagram of the functions associated with video encoding block 15 where the picture data from the create .MSD file block 5 is saved in an MSD file block 100. The resultant picture data, with the video effects and timings in the .MSD file is then brought together with the information in the .TRK file and copied into the video server 20 in block 110. The information is saved in hard disk in the video server 20 and can be accessed via the Internet using TCP/IP protocol.

[0054] In the video encoding block 15, the still picture scene images are compiled with effects and timings with a JPEG (Joint Photographic Expert Group) codec compression scheme. This will reduce the .MSD file size by a factor of about 25. It will then be saved as a .MSD file for viewing via the Internet. In addition, it is possible at this time to merge a plurality of files having different file types into a hybrid file, as will be explained in greater detail hereinbelow in connection with FIGS. 13-15. In the video server block 20, three kinds of files are copied to specific directories. These files are the .MGM files, .MSD files, and the .TRK files. Within the video server 20, a program is run to facilitate relaying of the three types of files of each video to the receiving computers as requested by using TCP/IP protocol. A multi-thread system can be utilized to connect each of the receiving computers 3 to its own thread to facilitate requests from the receiving computers 3. This will increase the efficiency of the video server 20 in serving the receiving computers 3. Typically the video server 20 can be connected to the Internet using an ISDN line (eg 64K to 384K bit/sec) or a higher capacity bandwidth line such as a fibre optic line, which can have data transfer rates up to megabits/sec.
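A thread-per-connection file server of the kind described can be sketched with the Python standard library. Everything specific below is an assumption made for illustration: the one-line request format, the in-memory `FILES` store, the file names and contents, and the helper names `start_server` and `fetch`. It is not the server program of the specification.

```python
import socket
import socketserver
import threading

# Hypothetical in-memory store standing in for the server's directories.
FILES = {
    "demo.MSD": b"scene images + filters",
    "demo.TRK": b"T1=5.0;T2=7.0",
    "demo.MGM": b"floor plan",
}


class FileRequestHandler(socketserver.StreamRequestHandler):
    """Serve one request: the client sends a file name on one line and
    the server replies with the bytes. This wire format is invented."""

    def handle(self):
        name = self.rfile.readline().decode("ascii").strip()
        self.wfile.write(FILES.get(name, b""))


def start_server():
    """Start a TCP server that handles each connection in its own thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), FileRequestHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(address, name):
    """Request one file from the server and return its bytes."""
    with socket.create_connection(address) as conn:
        conn.sendall(name.encode("ascii") + b"\n")
        conn.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

A receiving computer would call `fetch` three times, once each for the .MGM, .TRK and .MSD files; because every connection gets its own thread, one slow client does not hold up the others.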

[0055] Referring now to FIG. 10 there is shown a breakdown of the decoding block 30 which occurs within the receiving computer 3. Typically, the receiving computers 3 are connected to the Internet via a modem dial-up or LAN based connection. The interacting software with the Media Server program residing in the receiving computer 3 is an Internet Browser Plug-in which is compatible with Microsoft Internet Explorer 3 and 4 and Netscape 3 and 4. It requests the necessary files ie .MGM, .TRK and .MSD files respectively from the server 20 using TCP/IP.

[0056] In the decoding block 30 (FIG. 10) there are two paths. One path is for audio and the other path is for video. The audio is processed within the blocks 200 and 220 and the video is processed within the blocks 210 and 230. The audio file is pre-stored in the plug-in directory of a receiving computer 3 when the Internet plug-in is installed or updated, so that, for audio, an audio file is opened in block 200 and is assembled ready for playing in a buffer in block 220. The video is unpacked in block 210 and is decompressed using the JPEG codec in block 230.

[0057] FIG. 11 is a flow diagram of the self generation block 50 shown in FIG. 1. The 2-D floor plan is drawn onto the Internet browser window using the downloaded Internet plug-in written in C++ as it is received from the video server 20. Within the plug-in, there are interfaces that allow the user of a receiving computer 3 to manipulate the 2-D floor plan, for example to magnify or de-magnify it, and that provide Play, Pause, and Stop functions within the block 235. The plug-in interface enables activation of the video by clicking either the play button or by directly clicking on the route-of-path P polyline while still receiving unpacked .MSD files. The .MSD files will be decompressed and all the necessary frames to create the video segment will be produced in the self generation block 50 in accordance with the sub-block 250. The audio component begins playing concurrently with any video and is synchronised with block 270 and block 280.

[0058] FIG. 12 shows a flow diagram of the floor walk-through block 65 and shows that the display route is provided as a decision process at block 300 and the arrow head Z is drawn on the 2-D floor plan in block 320. Interrogation is made at decision 330 whether the video has been completed in its entirety. If it has, the audio playback and audio buffer are closed in block 340. If it has not, the process loops again.

[0059] Whilst the above system has been described for a particular application in relation to real estate, it should be appreciated that it can be applied to other areas and fields and is not limited solely to real estate areas.

[0060] It should also be appreciated that audio files such as music files are loaded during installation of the plug-in. Any audio files can be updated from time to time by simply reloading the plug-in once again. Many audio files such as music files can be loaded with the plug-in, and can be arranged to be randomly selected during playing of a video, to provide variation to the accompaniment of the video.

[0061] Particulars of some algorithms associated with certain filters are set out below:

[0062] ZOOM IN Effect

[0063] Suppose:

[0064] 1. Effect time: 0˜T

[0065] 2. At time 0: Image is I(0,0,W,H)

[0066] W is the width of still picture

[0067] H is the height of still picture

[0068] 3. At time T: Image is I(xT,yT,wT,hT)

[0069] xT is left position

[0070] yT is top position

[0071] wT is width

[0072] hT is height

[0073] Then at time t (from 0 to T): Image is I(x,y,w,h)

x=xT*t/T

y=yT*t/T

w=W−(W−wT)*t/T

h=H−(H−hT)*t/T
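The zoom-in interpolation above can be sketched in a few lines of Python. This is an illustrative sketch only, not the plug-in's actual code; the function name `zoom_in_rect` and the argument order are assumptions made for the example.

```python
def zoom_in_rect(t, T, W, H, xT, yT, wT, hT):
    """Return the rectangle (x, y, w, h) of the still picture shown at time t.

    At t = 0 the whole picture (0, 0, W, H) is shown; at t = T the target
    rectangle (xT, yT, wT, hT) is shown. Values in between are interpolated
    linearly, as in the formulas above.
    """
    if T <= 0 or not 0 <= t <= T:
        raise ValueError("t must lie in [0, T] with T > 0")
    f = t / T                 # fraction of the effect time that has elapsed
    x = xT * f                # left edge moves from 0 to xT
    y = yT * f                # top edge moves from 0 to yT
    w = W - (W - wT) * f      # width shrinks from W to wT
    h = H - (H - hT) * f      # height shrinks from H to hT
    return x, y, w, h
```

For a 640×480 picture zooming to the rectangle (160, 120, 320, 240), the halfway point is `zoom_in_rect(5, 10, 640, 480, 160, 120, 320, 240)`, which gives the rectangle (80, 60, 480, 360).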

[0074] ZOOM OUT Effect

[0075] Suppose:

[0076] 1. Effect time: 0˜T

[0077] 2. At time 0: Image is I(xT,yT,wT,hT)

[0078] xT is left position

[0079] yT is top position

[0080] wT is width

[0081] hT is height.

[0082] 3. At time T: Image is I(0,0,W,H)

[0083] W is the width of still picture

[0084] H is the height of still picture

[0085] Then at time t (from 0 to T): Image is I(x,y,w,h)

x=xT*(T−t)/T

y=yT*(T−t)/T

w=W−(W−wT)*(T−t)/T

h=H−(H−hT)*(T−t)/T
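The zoom-out formulas are the zoom-in formulas with the remaining time (T−t) in place of the elapsed time t. A minimal Python sketch under that reading follows; the name `zoom_out_rect` is an assumption made for the example.

```python
def zoom_out_rect(t, T, W, H, xT, yT, wT, hT):
    """Return the rectangle (x, y, w, h) of the still picture shown at time t.

    At t = 0 the starting rectangle (xT, yT, wT, hT) is shown; at t = T the
    whole picture (0, 0, W, H) is shown.
    """
    if T <= 0 or not 0 <= t <= T:
        raise ValueError("t must lie in [0, T] with T > 0")
    g = (T - t) / T           # fraction of the effect time still remaining
    x = xT * g                # left edge moves from xT back to 0
    y = yT * g                # top edge moves from yT back to 0
    w = W - (W - wT) * g      # width grows from wT to W
    h = H - (H - hT) * g      # height grows from hT to H
    return x, y, w, h
```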

[0086] FADE Effect (One Picture Fades in and Another Fades Out)

[0087] Suppose:

[0088] 1. Effect time: 0˜T

[0089] 2. At time 0: Image is I(0,0,W,H)

[0090] W is the width of the still picture I

[0091] H is the height of the still picture I

[0092] 3. At time T: Image is I′(0,0,W,H)

[0093] Then at time t (from 0 to T): Image is i(0,0,W,H)

[0094] p(x,y) is the pixel value (colour or grayscale) in position (x,y)

[0095] if I and I′ are 8-bit (256 colour or grayscale) images

p(x,y)≅(pI(x,y)*(T−t)+pI′(x,y)*t)/T

[0096] else if I and I′ are true colour images

p(x,y)_R=(pI(x,y)_R*(T−t)+pI′(x,y)_R*t)/T

p(x,y)_G=(pI(x,y)_G*(T−t)+pI′(x,y)_G*t)/T

p(x,y)_B=(pI(x,y)_B*(T−t)+pI′(x,y)_B*t)/T

[0097] Last formula can be improved to reduce calculation time
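A minimal Python sketch of the fade follows. The weights are arranged so that the blended frame equals I at t = 0 and I′ at t = T, matching the boundary conditions stated in items 2 and 3 of the fade description. The names `fade_pixel` and `fade_frame`, the representation of an image as a list of rows of (R, G, B) tuples, and the use of integer division are assumptions made for the example.

```python
def fade_pixel(p_i, p_i2, t, T):
    """Blend one 8-bit channel value of picture I with that of picture I'."""
    # Weight of I falls from 1 to 0 while the weight of I' rises from 0 to 1.
    return (p_i * (T - t) + p_i2 * t) // T


def fade_frame(img_i, img_i2, t, T):
    """Blend two true colour images of equal size, channel by channel."""
    return [
        [
            tuple(fade_pixel(a, b, t, T) for a, b in zip(px_i, px_i2))
            for px_i, px_i2 in zip(row_i, row_i2)
        ]
        for row_i, row_i2 in zip(img_i, img_i2)
    ]
```

Using integer time steps and integer division keeps the per-pixel work to two multiplications, one addition and one division, which is one simple way to keep the calculation cheap.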

[0098] PAN Effect (From Left to Right)

[0099] Suppose:

[0100] 1. Effect time: 0˜T

[0101] 2. At time 0: Image is I(0,0,W,H)

[0102] W is the width of the still picture I

[0103] H is the height of the still picture I

[0104] 3. At time T: Image is I′(0,0,W,H)

[0105] Then at time t (from 0 to T): Image is i(0,0,W,H)

[0106] p(x,y) is the pixel value (colour or grayscale) in position (x,y)

[0107] if x+W*t/T<W

p(x,y)=pI(x+W*t/T,y)

[0108] else

p(x,y)=pI′(x+W*t/T−W, y)
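The pan rule can be sketched in Python as follows. Images are represented here as lists of rows of pixel values, the horizontal shift W*t/T is rounded down to a whole pixel, and the names `pan_pixel` and `pan_frame` are assumptions made for the example.

```python
def pan_pixel(img_i, img_i2, x, y, t, T):
    """Return the pixel shown at (x, y) at time t of a left-to-right pan."""
    W = len(img_i[0])
    src = x + W * t // T           # column to read, shifted by W*t/T
    if src < W:
        return img_i[y][src]       # still inside the first picture I
    return img_i2[y][src - W]      # has moved on into the second picture I'


def pan_frame(img_i, img_i2, t, T):
    """Build the whole intermediate frame for time t."""
    H, W = len(img_i), len(img_i[0])
    return [[pan_pixel(img_i, img_i2, x, y, t, T) for x in range(W)]
            for y in range(H)]
```

With the one-row pictures `[[1, 2, 3, 4]]` and `[[5, 6, 7, 8]]` and T = 4, the frame at t = 1 is `[[2, 3, 4, 5]]`: the first picture has slid one pixel and the second picture has begun to enter.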

[0109] Perspective Display Effect:

[0110] Sometimes we display the picture on the screen with just a size change; however, we can also display the picture in rational distortion mode to produce a perspective effect.

[0111] Suppose:

[0112] Image is I(0,0,W,H)

[0113] W is the width of the still picture

[0114] H is the height of the still picture

[0115] The coordinates of the four corners are (xs[4], ys[4])

xs[0]=0, ys[0]=0

xs[1]=W, ys[1]=0

xs[2]=W, ys[2]=H

xs[3]=0, ys[3]=H

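[0116] The display area is a four-point polygon (xd[4],yd[4])

[0117] The image is then transformed from (xs[4],ys[4]) to (xd[4],yd[4]).

[0118] [Figure: the source rectangle (xs[4],ys[4]) in the left plane and the display polygon (xd[4],yd[4]) in the right plane]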

[0119] For a point p(x,y) in the left plane that maps to a point p′(x′,y′) in the right plane, whose coordinates are written x1 and y1 below, we can list the simultaneous equations:

x1=x+(a[0]*(x−xs[0])+a[1]*(y−ys[0]))/(a[2]*(x−xs[0])+a[3]*(y−ys[0])+10.0)

y1=y+(a[4]*(x−xs[0])+a[5]*(y−ys[0]))/(a[2]*(x−xs[0])+a[3]*(y−ys[0])+10.0)

[0120] a[6] is the array of six coefficients found by solving the simultaneous equations.

[0121] If we display the picture with only a size change, the last formula can be simplified because the display area is a rectangle (X′,Y′,W′,H′):

x1=X′+x*W′/W

y1=Y′+y*H′/H
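The size-change case can be sketched in Python as below, assuming that x1 follows the same form as y1 (an offset plus a scaled coordinate). The function name `scale_point` and the parameter names `Xd`, `Yd`, `Wd`, `Hd` (standing for X′, Y′, W′, H′) are assumptions made for the example.

```python
def scale_point(x, y, W, H, Xd, Yd, Wd, Hd):
    """Map a point of the W x H picture into the display rectangle.

    (Xd, Yd) is the top-left corner of the display rectangle and
    (Wd, Hd) is its width and height.
    """
    x1 = Xd + x * Wd / W   # horizontal offset plus horizontally scaled x
    y1 = Yd + y * Hd / H   # vertical offset plus vertically scaled y
    return x1, y1
```

For a 640×480 picture shown in a 320×240 rectangle placed at (10, 20), the picture centre (320, 240) maps to (170, 140).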

[0122] Other filters can be applied and the algorithms associated with these other filters can be determined according to the intended nature of the filters and an understanding of the way the algorithms above have been provided.

[0123] The present invention further enhances the efficiency of the transfer of video information over the Internet or other communications medium by merging a plurality of files having different file types to create a single hybrid file that can be transmitted over the communications medium. This hybrid file will contain all the information of the original files in a format that is different from the original formats of each of these original files. In other words, the format of the information in the hybrid file will be different from all the formats that the original files could have been presented in. To view or use the hybrid file, the user only needs to download a single plug-in. This plug-in will direct the user's web browser to load the selected hybrid file for processing.

[0124] The use of this hybrid file allows the user to save time and increases the convenience to the user. For example, the user will not need to download the various plug-ins to display the different file formats of the original files. In addition, the user can utilize a single “click” of the input device (e.g., mouse) to download the hybrid file (and hence all of its merged information) across a communication medium, since the hybrid file eliminates the need to transfer multiple file formats over slow communication mediums such as the Internet. This also eliminates the disadvantage of having to view multiple files using multiple plug-ins on a single Windows viewer in a web browser.

[0125] FIGS. 13-16 further illustrate how the hybrid file is created and utilized according to one possible embodiment of the present invention. First, as shown in FIG. 13, a plurality of files 500 having different file formats (e.g., raster, vector, text, audio, and others) are provided sequentially (i.e., one at a time) to a transformation engine 510, which translates and combines the various file types into a single highly compressed hybrid file 520 that allows for quick transmission over traditional communication mediums.

[0126] FIG. 14 illustrates a conventional file format for any of the original files 500. The file 500 contains a header 530 and a footer 540. The header 530 typically stores information about the file type (e.g., JPEG, DXF, WAV, MIDI, AVI, TXT, etc.), and project information (e.g., date of creation, project title, remarks, etc.). The header 530 can also store data regarding the version, and the object list (e.g., indicating the size of the object file, starting point/byte for reading first object, ending point/byte before starting second object, etc.). The footer 540 typically stores information about the end of the file, and prevents the program from reading past the end of the file, which would result in an invalid file format. The contents 550 of that file 500 are provided between the header 530 and the footer 540.

[0127] FIG. 15 illustrates how the transformation engine 510 creates the hybrid file 520. The transformation engine 510 can be implemented in the form of computer software, and located in the video encoding block 15 in FIG. 1. The files 500 are processed sequentially by the transformation engine 510. In step 560, the transformation engine 510 reads the header 530 and the footer 540 of a first file 500. In step 570, the transformation engine 510 reads the contents 550 of the first file 500. The program will allocate a temporary memory (e.g., a RAM) to store the original file data with its original object list for the first file 500. In step 580, the transformation engine 510 interprets the contents 550 of the first file 500. The contents 550 are interpreted using the header information. The program will identify the file type and will load the object list or library for the first file 500.

[0128] Next, in step 585, the interpreted contents 550 of the first file 500 are processed and stored in a temporary memory. As the program reads first file 500, it will allocate a temporary memory in the operating system (e.g., Microsoft Windows) to store the interpreted data or contents of the first file 500. As described above, there are generally five types of files (Raster, Vector, Text, Audio, and others) that are transformed into a single hybrid file 520. Before each type of file can be interpreted, the program will need to know those objects in each file type that need to be interpreted or imported. The object lists for each individual file need to be stored prior to interpreting the file. For new file formats that need to be packed into the hybrid file 520, it is necessary to process and store the new object list for each such new file format to the program. For example, for a Vector file, the program will import commonly used vector formats such as .dxf (drawing exchange format) or .wmf (Windows Metafile). If a file in the temporary memory is a vector file, the program will match the object list or library of that file with the program's predetermined object list or library for .dxf interpret/import, and replace the objects of the file in memory with the names of objects already pre-packaged with the downloaded plug-in, thereby achieving compression. This is because the .dxf format is a general vector format and thus stores redundant data (such as versioning information) to achieve compatibility with other vector/CADD programs.
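The object-name substitution described above can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the program's actual method: the library contents, the names `PLUGIN_LIBRARY` and `substitute_objects`, and the representation of each vector object as a text string are all hypothetical.

```python
# Hypothetical library of objects assumed to be pre-packaged with the plug-in.
PLUGIN_LIBRARY = {
    "DOOR_STD": "LINE 0,0 0,90; ARC 0,0 90 0 90",
    "WINDOW_STD": "LINE 0,0 120,0; LINE 0,10 120,10",
}


def substitute_objects(objects, library=PLUGIN_LIBRARY):
    """Replace every object that the library already holds with its name.

    Returns a list of ("ref", name) entries for known objects and
    ("raw", definition) entries for objects that must be sent in full.
    """
    by_definition = {definition: name for name, definition in library.items()}
    packed = []
    for obj in objects:
        if obj in by_definition:
            packed.append(("ref", by_definition[obj]))   # short name only
        else:
            packed.append(("raw", obj))                  # full definition
    return packed
```

Only the short names travel with the file for objects the plug-in already has, which is where the size saving in this sketch comes from.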

[0129] If a file in the temporary memory is an image, the program will apply compression to the image to reduce its size while retaining its image quality.

[0130] Background (audio) music is provided by playing audio files that were stored on the user's local hard disk when the plug-in was initially installed. Another possible mode of audio is “text-to-speech”, where the commentary text is stored as part of the hybrid file and converted to speech (commentary of the video) at the user's computer. This will greatly reduce the audio data size from a wave file of, for example, 1 MB to a mere 2 KB text file to be packed with the hybrid file for transmission.

[0131] In step 590, it is determined whether more files 500 need to be merged. If so, processing returns to step 560. In other words, the program will process the next file 500 in line until there are no more files to be processed. If all files 500 have been read, then in step 600, all the read files 500 are merged by tightly integrating and compressing the files 500. During this step, the program determines the sequencing of the files 500 that have been placed into the temporary memory. Then, in step 605, the program overlays user-defined hotspots that link to other elements such as hyperlinks, among others.

[0132] In step 610, a header 650 is written for the hybrid file 520. This header 650 can include a brief description of the hybrid file 520, such as the publisher's information, description of the contents of the hybrid file, etc. Then in step 620, the contents 660 of the hybrid file 520 are written. In this step, the program will write the merged files with the hotspot overlay into a single hybrid file, store the hybrid file into permanent storage (e.g., floppy disk, hard disk, etc.), and then close the hybrid file 520 by putting an indicator to signify EOF (end-of-file) in the footer 665. See FIG. 16. The contents 660 in the hybrid file 520 can have a .MGM file type. The header 650 is textual information to allow for indexing, searching and customization for additional functionality on the .MGM file format.
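The sequence of merging, compressing, writing a header, writing the contents and closing with an end-of-file indicator can be sketched in Python as follows. The byte layout shown is purely hypothetical and is not the .MGM format, whose details are not given here: the signature `HYB1`, the footer `EOF!`, the JSON header, the use of zlib, and the names `pack_hybrid` and `unpack_hybrid` are all assumptions made for the example.

```python
import json
import struct
import zlib

MAGIC = b"HYB1"      # hypothetical signature at the start of the file
EOF_MARK = b"EOF!"   # hypothetical end-of-file indicator in the footer


def pack_hybrid(parts, description, hotspots=()):
    """Merge (file_type, data) pairs into one compressed hybrid byte string."""
    header = json.dumps({
        "description": description,
        "parts": [{"type": ftype, "size": len(data)} for ftype, data in parts],
        "hotspots": list(hotspots),
    }).encode("utf-8")
    # Merge the parts in sequence, then compress them together.
    contents = zlib.compress(b"".join(data for _, data in parts))
    lengths = struct.pack(">II", len(header), len(contents))
    return MAGIC + lengths + header + contents + EOF_MARK


def unpack_hybrid(blob):
    """Read a hybrid byte string back into its header and original parts."""
    if blob[:4] != MAGIC or blob[-4:] != EOF_MARK:
        raise ValueError("not a hybrid file")
    hlen, clen = struct.unpack(">II", blob[4:12])
    header = json.loads(blob[12:12 + hlen].decode("utf-8"))
    merged = zlib.decompress(blob[12 + hlen:12 + hlen + clen])
    parts, pos = [], 0
    for entry in header["parts"]:
        parts.append((entry["type"], merged[pos:pos + entry["size"]]))
        pos += entry["size"]
    return header, parts
```

Keeping the header as text mirrors the point that the header is textual information usable for indexing and searching; the footer check lets a reader detect a truncated file before interpreting the contents.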

[0133] As used in the present invention, this hybrid file 520 can contain video data, audio data, raster images (e.g., photographs, illustrations, sketches), vector images (e.g., CADD, architectural designs, product design models, etc.), and textual information (e.g., product information/specifications, descriptions). These are merely illustrative examples, since this hybrid file 520 can be used in numerous applications. Some applications include, but are not limited to, (1) interactive video walkthroughs of real estate properties or other processes, (2) showcase for a product or process, and (3) providing a commentary or presentation of a property, process, or product.

[0134] This hybrid file 520 is then relayed across the communications medium and downloaded at the receiving computer(s) 3. A decoding engine is located at the decoding block 30 in FIG. 1 and operates to read and interpret the contents 660 of the hybrid file 520. The decoding engine (i.e., the plug-in) reads the hybrid file 520 into memory, and replaces the names of the objects with the actual objects themselves. After this, the contents 660 are rendered or displayed at the display with the hotspot overlay. To read and interpret the contents 660, the user at the receiving computer 3 will need to download only one plug-in program.
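The step in which the decoding engine replaces object names with the actual objects can be sketched in Python as below. The entry format (a "ref" or "raw" tag followed by a value), the library contents and the name `expand_objects` are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical objects assumed to be installed together with the plug-in.
INSTALLED_OBJECTS = {
    "DOOR_STD": "LINE 0,0 0,90; ARC 0,0 90 0 90",
}


def expand_objects(entries, library=INSTALLED_OBJECTS):
    """Turn ("ref", name) entries back into full object definitions."""
    objects = []
    for kind, value in entries:
        if kind == "ref":
            objects.append(library[value])   # look the object up by name
        elif kind == "raw":
            objects.append(value)            # definition was sent in full
        else:
            raise ValueError("unknown entry kind: " + repr(kind))
    return objects
```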

[0135] When used in the video context of the present invention, the video data in this hybrid file 520 can be zoomable and interactive when displayed. The ability to zoom can be built into the viewer (i.e., the plug-in). The user simply clicks the zoom button and defines the area to zoom (by way of click-and-drag), and the plug-in will enlarge the user-defined area and display the enlarged area. The interactivity is based on the hotspot overlay. The user can click on predefined spots or areas and be led to predefined links which can contain images and/or more information, thereby achieving a degree of interactivity. As a result, the hybrid file 520 has the ability to define hotspots to trigger external components (e.g., web pages, music, etc.) and the display of different segments or frames in the apparent video (.MSD). In addition, to provide a synchronized commentary walkthrough video during production, a development tool that records the timing of the route of path with the apparent video is provided to add commentary text or speech to narrate the walkthrough. This will reduce the transmission that is otherwise needed for conventional audio files (.wav), which are typically large files. Aside from speech, synthesized background music can be stored in MIDI format, which is relatively smaller in size as compared to a .wav file.
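Hotspot interactivity of this kind can be illustrated with a simple hit test in Python. This is a sketch under stated assumptions: hotspots are taken to be axis-aligned rectangles, and the class `Hotspot`, the function `hit_test` and the example targets are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hotspot:
    x: int        # left edge of the hotspot on the displayed frame
    y: int        # top edge
    w: int        # width
    h: int        # height
    target: str   # what a click triggers, e.g. a web page or a video segment

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.w
                and self.y <= py < self.y + self.h)


def hit_test(hotspots: Sequence[Hotspot], px: int, py: int) -> Optional[str]:
    """Return the target of the first hotspot under the click, if any."""
    for spot in hotspots:
        if spot.contains(px, py):
            return spot.target
    return None
```

A click that lands on no hotspot returns `None`, so the viewer can treat it as an ordinary click (for example, the start of a click-and-drag zoom selection).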

[0136] Modifications may be made to the example above as would be apparent to persons skilled in the art of video image manipulation, and to persons skilled in the art of Internet home page design and set-up. These and other modifications may be made without departing from the ambit of the invention, the nature of which is to be determined from the foregoing description and the appended claims.

Claims

1. A method of creating a hybrid file, comprising:

providing at least a first file and a second file, each file having contents in a file type that is different from the file types of all the other files;

reading and interpreting the at least first and second files; and

creating a hybrid file that contains the contents of the at least first and second files, the hybrid file having a hybrid file type that is different from the file types of the at least first and second files.

2. The method of claim 1, wherein the step of reading and interpreting the at least first and second files includes:

reading and interpreting the contents of the first file;

storing the interpreted contents of the first file in a temporary location;

reading and interpreting the contents of the second file; and

storing the interpreted contents of the second file in the temporary location.

3. The method of claim 2, wherein the step of creating a hybrid file includes:

merging the interpreted contents of the first and second files.

4. The method of claim 3, wherein the step of creating a hybrid file further includes:

compressing the merged contents of the first and second files.

5. The method of claim 3, wherein the step of creating a hybrid file further includes:

creating a header for the hybrid file.

6. The method of claim 1, wherein the step of reading and interpreting the at least first and second files includes:

reading and interpreting information from headers of the at least first and second files.

7. The method of claim 1, wherein the step of reading and interpreting the at least first and second files includes:

loading object lists for the first and second files.

8. A method of transmitting a plurality of different files across a communication medium from a first location to a second location, comprising:

providing at least a first file and a second file, each file having contents in a file type that is different from the file types of all the other files;

reading and interpreting the at least first and second files;

creating a hybrid file that contains the contents of the at least first and second files, the hybrid file having a hybrid file type that is different from the file types of the at least first and second files; and

transmitting the hybrid file across a communication medium to the second location.

9. The method of claim 8, wherein the step of reading and interpreting the at least first and second files includes:

reading and interpreting the contents of the first file;

storing the interpreted contents of the first file in a temporary location;

reading and interpreting the contents of the second file; and

storing the interpreted contents of the second file in the temporary location.

10. The method of claim 9, wherein the step of creating a hybrid file includes:

merging the interpreted contents of the first and second files.

11. The method of claim 10, wherein the step of creating a hybrid file further includes:

compressing the merged contents of the first and second files.

12. The method of claim 10, wherein the step of creating a hybrid file further includes:

creating a header for the hybrid file.

13. The method of claim 8, wherein the step of reading and interpreting the at least first and second files includes:

reading and interpreting information from headers of the at least first and second files.

14. The method of claim 8, wherein the step of reading and interpreting the at least first and second files includes:

loading object lists for the first and second files.

15. The method of claim 8, further including:

loading, at the second location, a plug-in for the file type for the hybrid file.

16. The method of claim 10, wherein the step of creating a hybrid file includes:

storing the hybrid file into a permanent storage device.

17. The method of claim 10, wherein the step of creating a hybrid file includes:

overlaying user-defined hotspots that link to other elements.