Abstract: A method of decoding video data includes receiving a plurality of blocks of video data, wherein each block of the plurality of blocks is encoded using a respective affine motion model of a plurality of affine motion models, and decoding the plurality of blocks of video data using the same affine motion field derivation process for each of the plurality of affine motion models.
Filed: June 17, 2019
Date of Patent: May 4, 2021
Inventors: Han Huang, Wei-Jung Chien, Marta Karczewicz
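A minimal sketch of the kind of unified derivation the abstract describes, using the common convention that a 4-parameter affine model (two control-point motion vectors) is first converted to the 6-parameter three-CPMV form, so a single motion-field derivation serves both models. The function names and the conversion formula are illustrative assumptions, not the claimed implementation:

```python
def derive_mv(cpmvs, w, h, x, y):
    """Derive the motion vector at position (x, y) of a w x h block from
    three control-point MVs: top-left v0, top-right v1, bottom-left v2."""
    (v0x, v0y), (v1x, v1y), (v2x, v2y) = cpmvs
    mvx = v0x + (v1x - v0x) * x / w + (v2x - v0x) * y / h
    mvy = v0y + (v1y - v0y) * x / w + (v2y - v0y) * y / h
    return mvx, mvy

def to_three_cpmvs(cpmvs, w, h):
    """Unify the models: a 4-parameter model (two CPMVs) is converted to
    the three-CPMV form, so one derivation process serves both models."""
    if len(cpmvs) == 3:
        return list(cpmvs)
    (v0x, v0y), (v1x, v1y) = cpmvs
    # The rotation/zoom implied by v0 and v1 fixes the bottom-left CPMV.
    v2x = v0x - (v1y - v0y) * h / w
    v2y = v0y + (v1x - v0x) * h / w
    return [(v0x, v0y), (v1x, v1y), (v2x, v2y)]
```

With this conversion in place, the decoder runs `derive_mv` identically regardless of which affine model a block was encoded with.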
Abstract: Embodiments of the present application relate to a prediction mode selection method performed at a video encoding device, including: acquiring a first optimal intra-frame prediction mode of a downsampled unit, the downsampled unit being obtained by downsampling the image frame to which a target prediction unit (PU) belongs and then dividing the downsampled image frame, and the first optimal intra-frame prediction mode being obtained by performing precoding analysis on the downsampled unit; adding a candidate intra-frame prediction mode to a candidate mode set according to the first optimal intra-frame prediction mode; adding a candidate intra-frame prediction mode to the candidate mode set according to a second optimal intra-frame prediction mode of an adjacent PU corresponding to the target PU; and determining an optimal intra-frame prediction mode of the current PU according to the prediction residuals and encoding cost values corresponding to the candidate intra-frame prediction modes in the candidate mode set.
Filed: April 2, 2019
Date of Patent: April 20, 2021
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Abstract: A method for decoding a video according to the present invention may comprise: decoding information indicating whether illumination compensation is performed for a current block, determining an illumination compensation parameter of the current block when the information indicates that the illumination compensation is performed for the current block, obtaining a prediction block by performing inter prediction for the current block, and performing the illumination compensation on the current block using the illumination compensation parameter.
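Illumination compensation of this kind is commonly modeled as a linear transform `cur ≈ a * ref + b` whose parameters are derived from neighboring reconstructed samples. The least-squares derivation below is one plausible reading; an actual codec may derive the parameters differently:

```python
def ic_params(neighbor_cur, neighbor_ref):
    """Estimate linear illumination-compensation parameters (a, b) such
    that cur ≈ a * ref + b, fitted over neighboring reconstructed
    samples of the current and reference blocks (least squares)."""
    n = len(neighbor_ref)
    sx = sum(neighbor_ref)
    sy = sum(neighbor_cur)
    sxx = sum(v * v for v in neighbor_ref)
    sxy = sum(r * c for r, c in zip(neighbor_ref, neighbor_cur))
    denom = n * sxx - sx * sx
    if denom == 0:
        return 1.0, (sy - sx) / n  # fall back to an offset-only model
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b

def apply_ic(pred_block, a, b):
    """Apply the compensation to each sample of the inter prediction."""
    return [a * p + b for p in pred_block]
```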
Abstract: An image display control device displays, on a monitor, images of the periphery of a vehicle that are captured by a left camera and a right camera installed in the vehicle. The device includes: a vehicle detection unit which detects another vehicle from the images captured by the left camera and the right camera; an enlargement unit which, when the other vehicle is detected by the vehicle detection unit, subjects the images captured by the left camera and the right camera to enlargement processing on the basis of a delay time the images take to be displayed on the monitor; and an output unit which outputs the images subjected to the enlargement processing by the enlargement unit to the monitor.
Abstract: Systems and methods are provided for improving the accuracy and efficiency of context-adaptive binary arithmetic coding (CABAC) by adaptively selecting a context model specific to the characteristics of a coding unit (CU), such as the size, dimensions (height and/or width), type (luma or chroma), and/or flag type (cu_palette_flag or pred_mode_flag) of the CU. The systems and methods comprise: determining a characteristic of the CU; determining whether the characteristic of the CU meets a corresponding threshold; and, upon determining that the characteristic meets the threshold, selecting a first context model, or, upon determining that it fails to meet the threshold, selecting a second context model.
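The threshold test reduces to a small selection function. In this sketch the characteristic is CU area and the threshold value is invented for illustration; the dict-based CU representation is likewise a stand-in:

```python
def select_context(cu, thresholds):
    """Select a CABAC context model index based on a CU characteristic.

    Here the characteristic is the CU area (width * height); the first
    context model (index 0) is chosen when the characteristic meets the
    threshold, the second (index 1) otherwise.
    """
    characteristic = cu["width"] * cu["height"]
    threshold = thresholds["size"]
    return 0 if characteristic >= threshold else 1
```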
Abstract: One variation of a method for monitoring occupancy in a work area includes, at a sensor block: transitioning from an inactive state into an active state when an output of a motion sensor indicates motion in a work area; during a scan cycle in the active state, recording an image through an optical sensor at a time, detecting a set of humans in the image, detecting a second set of human effects in the image, predicting a second set of humans who occupy the work area but are absent from the image based on the second set of human effects, and estimating a total occupancy in the work area at the time based on the set of humans and the second set of humans; and transmitting the total occupancy to a remote computer system to update a scheduler for the work area.
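The occupancy estimate itself is a simple combination of the two detections. In this sketch, `effect_to_humans` stands in for whatever predictor maps a detected human effect (a jacket on a chair, a laptop on a desk) to an inferred occupant count:

```python
def estimate_occupancy(humans_detected, effects_detected, effect_to_humans):
    """Total occupancy = humans visible in the image plus humans
    inferred from human effects detected in the image."""
    inferred = sum(effect_to_humans(e) for e in effects_detected)
    return len(humans_detected) + inferred
```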
Abstract: A method for processing a video stream prior to encoding, the video stream potentially comprising a film grain, the method comprising: measuring a film grain intensity in the video stream; obtaining at least one encoding rate information item associated with the video stream, in order to determine a pair of respective values for the grain intensity and encoding rate; comparing the pair of values with predetermined respective threshold values in order to categorize the video stream with respect to pairs of predetermined values of grain intensity and rate; and selecting a film grain management strategy among at least four combinations based on the categorization of the video stream.
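The two thresholded quantities yield four categories, one strategy each. The threshold values and strategy names below are illustrative assumptions, not values from the patent:

```python
def grain_strategy(grain_intensity, bitrate,
                   grain_threshold=4.0, rate_threshold=2_000_000):
    """Categorize the stream by its (grain intensity, encoding rate)
    pair against two thresholds and pick one of four management
    strategies for the film grain."""
    strong_grain = grain_intensity >= grain_threshold
    high_rate = bitrate >= rate_threshold
    if strong_grain and high_rate:
        return "keep-grain"              # enough rate to encode the grain
    if strong_grain:
        return "denoise-and-synthesize"  # strip grain, re-add at decode
    if high_rate:
        return "light-denoise"
    return "passthrough"
```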
Abstract: An example device for processing video data includes a memory configured to store video data; and one or more processors implemented in circuitry and configured to determine to extract a motion constrained tile sets (MCTS) sub-bitstream from an original bitstream including the video data based at least in part on information of an MCTS extraction information set (MCTS-EIS) supplemental enhancement information (SEI) message; and in response to determining to extract the MCTS sub-bitstream, omit all SEI network abstraction layer (NAL) units that contain non-MCTS-nested SEI messages from inclusion in the extracted MCTS sub-bitstream, regardless of a value of a NAL unit header layer identifier value for the non-MCTS-nested SEI messages.
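The filtering rule can be sketched as a pass over the NAL units of the original bitstream. NAL units are modeled as dicts for illustration; real extraction parses NAL unit headers and SEI payloads:

```python
def extract_mcts_sub_bitstream(nal_units):
    """When extracting an MCTS sub-bitstream, omit every SEI NAL unit
    that carries a non-MCTS-nested SEI message, regardless of the NAL
    unit header layer identifier; all other NAL units pass through."""
    out = []
    for nal in nal_units:
        if nal["type"] == "SEI" and not nal.get("mcts_nested", False):
            continue  # dropped irrespective of nuh_layer_id
        out.append(nal)
    return out
```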
Abstract: There is provided an image processing device including a far-infrared acquisition unit that acquires a far-infrared image, a first extraction unit that extracts a plurality of first markers having a first temperature from the far-infrared image, and a far-infrared specification unit that specifies a position of each of a plurality of second markers having a second temperature in the far-infrared image based on a geometric relationship among the plurality of first markers.
Abstract: An apparatus includes a video capture device, an audio capture device and a processor. The video capture device may be configured to generate a plurality of video frames. The audio capture device may be configured to capture audio. The processor may be configured to perform video operations to detect objects in the video frames, extract data about the objects based on characteristics of the objects determined using the video operations, detect whether an event has occurred based on the characteristics of the objects, determine a permission status based on the captured audio and generate a video stream based on the video frames. The video stream may be generated only if the permission status allows the video stream. The captured audio may be monitored after the event has been detected to determine whether the permission status allows the video stream.
Abstract: Attributes of vegetables or biologics are derived by use of color imaging sensors and relative spectral-band analysis. Enabled smart phones, or dedicated single-pixel or focal-plane instruments for crop applications, quickly report the biological condition of vegetables or other organics by providing an augmented view or a relative RGB quantification of the inspected items. Disclosed embodiments are well suited for analyzing the health and needs of living plants or crops. Ratios of observed wide-band red, green, and blue are compared on a relative basis. While food shopping, a consumer with an enabled smart phone may view a collection of produce and see each piece displayed in a manner disclosing a quality ranking; the consumer may thus view produce through a smartphone camera and quickly evaluate its relative quality. Novel approaches are used to associate the calculated data with the original source imagery.
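A sketch of the relative-band comparison, assuming normalized band ratios and using relative green content as a crude health/quality proxy. The metric is illustrative, not the patented ranking:

```python
def band_ratios(r, g, b):
    """Express each wide band relative to the total intensity so that
    comparisons are independent of overall brightness."""
    total = r + g + b
    return r / total, g / total, b / total

def greenness_rank(produce):
    """Rank (name, (R, G, B)) produce items by relative green content,
    highest first, as a stand-in quality metric."""
    return sorted(produce,
                  key=lambda item: band_ratios(*item[1])[1],
                  reverse=True)
```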
Abstract: A system for forming an image (110) of a substantially translucent specimen (102) has an illuminator (108) configured to variably illuminate the specimen from a plurality of angles of illumination such that (a) when each angle (495) at a given point on the specimen is mapped to a point (445) on a plane (420) perpendicular to an optical axis (490), the points on the plane have an increasing density (e.g. FIGS. 4, 11C, 11E, 12C, 12E, 13A, 14A, 14C, 14E, 15A, 15C, 15E) towards an axial position on the plane; or (b) the illumination angles are arranged with a substantially regular pattern in a polar coordinate system (FIG. 13A,13B) defined by a radial coordinate that depends on the magnitude of the distance from an optical axis and an angular coordinate corresponding to the orientation of the angle relative to the optical axis.
Abstract: A method for non-uniform mapping of quantization matrix coefficients between different sizes of quantization matrices in image/video coding includes obtaining a first quantization matrix and identifying a second quantization matrix to be formed therefrom. The second quantization matrix is a factor of two larger than the first quantization matrix. The second quantization matrix is populated with values from the first matrix through non-uniform mapping of the first quantization matrix. Non-uniform mapping to populate the second quantization matrix includes directly mapping values of all or a portion of the first quantization matrix into the upper-left portion of the second quantization matrix and mapping up-sampled values of the first quantization matrix into the remaining portion of the second quantization matrix.
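One plausible reading of the mapping, sketched below: the small matrix is copied verbatim into the upper-left quadrant of the doubled matrix, and the rest is filled by nearest-neighbor up-sampling of the small matrix. The exact split between the directly mapped and up-sampled regions is an assumption:

```python
def upscale_qmatrix(q):
    """Populate a 2n x 2n quantization matrix from an n x n one:
    direct mapping in the upper-left quadrant, nearest-neighbor
    up-sampling elsewhere."""
    n = len(q)
    m = 2 * n
    out = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i < n and j < n:
                out[i][j] = q[i][j]            # direct mapping
            else:
                out[i][j] = q[i // 2][j // 2]  # up-sampled mapping
    return out
```

Direct mapping preserves the exact low-frequency coefficients, where quantization accuracy matters most, while up-sampling cheaply fills the higher-frequency remainder.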
Abstract: A near-eye display system includes a display panel to display a near-eye lightfield frame comprising an array of elemental images and an eye tracking component to track a pose of a user's eye. The system further includes a lenslet array and a rendering component to adjust the focal points of the array of elemental images in the integral lightfield frame based on the pose of the user's eye. A method of operation of the near-eye display system includes determining, using an eye tracking component of the near-eye display system, a first pose of a user's eye and determining a desired focal point for an array of elemental images forming an integral lightfield frame based on the first pose of the user's eye. The method further includes changing the focal length of light projecting out of a lenslet array based on the first pose of the user's eye.
Abstract: An example device for coding video data includes a memory configured to store video data, and one or more processors implemented in circuitry and configured to code a first motion vector difference (MVD) representing a difference between a first motion vector of a current block of video data predicted using affine prediction and a first motion vector predictor (MVP) for the first motion vector, predict a second MVD from the first MVD for a second motion vector of the current block, and code the current block using affine prediction according to the first motion vector and the second motion vector. Predicting the second MVD from the first MVD in this manner may reduce the bitrate of a bitstream including coded video data, as well as improve processing efficiency.
Filed: October 1, 2018
Date of Patent: December 1, 2020
Inventors: Kai Zhang, Jianle Chen, Xiang Li, Wei-Jung Chien, Yi-Wen Chen, Li Zhang, Marta Karczewicz
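A decoder-side sketch of the MVD prediction: the second MVD is predicted from the first, so only a small residual needs to be coded. Using the first MVD directly as the predictor is an assumption of this sketch:

```python
def decode_second_mv(mvp1, mvd1, mvp2, mvd2_residual):
    """Reconstruct two motion vectors (x, y tuples) where the second
    MVD is predicted from the first and only its residual is coded."""
    mv1 = (mvp1[0] + mvd1[0], mvp1[1] + mvd1[1])
    # Predict the second MVD from the first, then add the coded residual.
    mvd2 = (mvd1[0] + mvd2_residual[0], mvd1[1] + mvd2_residual[1])
    mv2 = (mvp2[0] + mvd2[0], mvp2[1] + mvd2[1])
    return mv1, mv2
```

When the control-point motions of an affine block are similar, the residual is near zero and cheaper to entropy-code than a full second MVD.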
Abstract: A technique for processing video includes receiving a pixel array, such as a block or layer of video content, as well as a mask that distinguishes masked, “don't-care” pixels in the pixel array from unmasked, “care” pixels. The technique encodes the pixel array by taking into consideration the care pixels only, without regard for the don't-care pixels. An encoder operating in this manner can produce a simplified encoding of the pixel array, which represents the care pixels to any desired level of precision, without regard for errors in the don't-care pixels, which are irrelevant to reconstruction. Further embodiments apply a polynomial transform in place of a frequency transform for encoding partially-masked video content, and/or video content meeting other suitable criteria.
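A one-dimensional sketch of encoding care pixels only: a line is fitted over the pixels the mask marks as "care", and don't-care pixels simply take whatever value the fit produces. The linear model stands in for the frequency or polynomial transform of the actual technique:

```python
def fit_care_pixels(values, mask):
    """Least-squares line fit over care pixels only (mask truthy);
    don't-care pixels have no influence on the fit and are
    reconstructed with whatever the line happens to produce."""
    pts = [(i, values[i]) for i, m in enumerate(mask) if m]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom if denom else 0.0
    b = (sy - a * sx) / n
    return [a * i + b for i in range(len(values))]
```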
Abstract: Vehicle safety is enhanced by providing a video system incorporating multiple cameras providing separate video feeds that are stitched together by a controller to provide a composite video viewable by the operator that changes in real time along with changes to the speed and direction of the vehicle.
Abstract: A decoder for decoding a data stream into which media data is coded has a mode switch configured to activate a low-complexity mode or a high-efficiency mode depending on the data stream, an entropy decoding engine configured to retrieve each symbol of a sequence of symbols by entropy decoding using a selected one of a plurality of entropy decoding schemes, a desymbolizer configured to desymbolize the sequence of symbols to obtain a sequence of syntax elements, and a reconstructor configured to reconstruct the media data based on the sequence of syntax elements, the selection of the entropy decoding scheme depending on whether the low-complexity mode or the high-efficiency mode is activated.
Filed: November 25, 2019
Date of Patent: October 27, 2020
Assignee: GE VIDEO COMPRESSION, LLC
Inventors: Valeri George, Benjamin Bross, Heiner Kirchhoffer, Detlev Marpe, Tung Nguyen, Matthias Preiss, Mischa Siekmann, Jan Stegemann, Thomas Wiegand, Christian Bartnik
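A control-flow sketch of the mode switch: a flag read from the data stream activates one mode, and every subsequent symbol is retrieved with the scheme that mode selects. The scheme implementations here are trivial placeholders, not real entropy decoders:

```python
def activate_mode(data_stream_flag):
    """Mode switch: the data stream determines whether the
    low-complexity or the high-efficiency mode is active."""
    return "low-complexity" if data_stream_flag == 0 else "high-efficiency"

def decode_symbols(bitstream_chunks, mode, schemes):
    """Retrieve each symbol using the entropy decoding scheme selected
    by the active mode. `schemes` maps mode name -> decoding function
    (placeholder functions stand in for the real schemes)."""
    decode = schemes[mode]
    return [decode(chunk) for chunk in bitstream_chunks]
```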
Abstract: Systems and methods to calibrate an imaging sensor may include an enclosure with a controlled environment, a light source illuminating the environment, and a particle source emitting desired particles at desired concentrations into the environment. An imaging sensor, which may be associated with an aerial vehicle, may be placed within the enclosure, and the imaging sensor may capture imaging data within the controlled environment. In addition, the imaging data may be processed to determine current spectral characteristics of the imaging sensor. Based on the environment, light, and particle properties within the environment, the imaging sensor may be calibrated to exhibit nominal or desired spectral characteristics.