KEYWORDS: Video compression, Video, Video coding, Computer programming, RGB color model, Image compression, Data conversion, Motion estimation, Cameras, Wavelets
Most consumer digital color cameras capture video using a single chip. Single chip cameras do not capture
RGB triples for every pixel, but a subsampled version with only one color component per pixel (e.g. Bayer
pattern). Conventionally, a full-resolution video is constructed from the Bayer pattern by demosaicing before
being converted to the YUV domain for compression. To lower the encoding complexity, we propose in
this work a novel color space conversion in the pre-processing step. Compared to the conventional method,
the proposed scheme reduces the encoding complexity by almost half. Moreover, it improves the reconstructed
video quality by up to 1.5 dB in CPSNR when H.264/AVC is used for compression. To further lower the
encoding complexity, we additionally use our Wyner-Ziv video coder for compression. Again, we observe in our
experiments a similar gain of the proposed scheme over the conventional one.
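As a rough illustration of two terms used in this abstract, the sketch below subsamples an RGB frame to one color component per pixel and computes CPSNR as a PSNR whose mean squared error is pooled over all three color channels. The RGGB layout, the 8-bit peak value of 255, the nested-list image format, and the function names `bayer_mosaic` and `cpsnr` are assumptions made for this example; the abstract does not specify them, and this is not the authors' proposed color space conversion.

```python
import math


def bayer_mosaic(rgb):
    """Keep one color component per pixel, assuming an RGGB Bayer layout.

    rgb is a list of rows; each row is a list of (R, G, B) tuples.
    """
    mosaic = []
    for y, row in enumerate(rgb):
        out_row = []
        for x, (r, g, b) in enumerate(row):
            if y % 2 == 0:
                # Even rows alternate R, G, R, G, ...
                out_row.append(r if x % 2 == 0 else g)
            else:
                # Odd rows alternate G, B, G, B, ...
                out_row.append(g if x % 2 == 0 else b)
        mosaic.append(out_row)
    return mosaic


def cpsnr(reference, reconstructed, peak=255.0):
    """PSNR in dB with the squared error pooled over all color channels."""
    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for ref_px, rec_px in zip(ref_row, rec_row):
            for ref_c, rec_c in zip(ref_px, rec_px):
                squared_error += (ref_c - rec_c) ** 2
                count += 1
    if count == 0:
        raise ValueError("empty image")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A gain of 1.5 dB in this measure corresponds to a reduction of the pooled mean squared error by a factor of about 1.41.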
We present a new video coding scheme that uses several reference frames for improved motion-compensated prediction. The reference pictures are warped versions of the previously decoded frame, obtained by applying polynomial motion compensation. In contrast to global motion compensation, where typically one motion model is transmitted, we show that in the general case more than one motion model is of benefit in terms of coding efficiency. To determine the multiple motion models, we employ a robust clustering method based on the iterative application of the least median of squares estimator. The approach is incorporated into an H.263-based video codec and embedded into a rate-constrained motion estimation and macroblock mode decision framework. It is demonstrated that adaptive multiple reference picture coding in general improves rate-distortion performance. PSNR gains of 1.2 dB in comparison to the H.263 codec for the high global and local motion sequence Stefan and 1 dB for the sequence Mobile and Calendar, which contains no global motion, are reported. These PSNR gains correspond to bit-rate savings of 21 percent and 30 percent compared to the H.263 codec, respectively. The average number of motion models selected by the encoder for our test sequences is between 1 and 7, depending on the actual bit rate.
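The least median of squares estimator mentioned above can be sketched as follows: repeatedly fit a motion model to a minimal random sample of correspondences and keep the model whose median squared residual over all correspondences is smallest. This is a minimal sketch under stated assumptions, not the authors' clustering method: it uses a six-parameter affine model as the simplest polynomial motion model, a fixed number of random trials, and hypothetical function names (`fit_affine`, `lmeds_affine`).

```python
import random
from statistics import median


def _det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def _solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule; None if singular."""
    d = _det3(m)
    if abs(d) < 1e-12:
        return None
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for i in range(3):
            replaced[i][col] = rhs[i]
        solution.append(_det3(replaced) / d)
    return solution


def fit_affine(sample):
    """Fit x2 = a1*x + a2*y + a3, y2 = a4*x + a5*y + a6 to three
    correspondences ((x, y), (x2, y2)); None if the points are collinear."""
    m = [[x, y, 1.0] for (x, y), _ in sample]
    ax = _solve3(m, [target[0] for _, target in sample])
    ay = _solve3(m, [target[1] for _, target in sample])
    if ax is None or ay is None:
        return None
    return ax + ay


def residual_sq(model, correspondence):
    """Squared distance between the predicted and observed target point."""
    (x, y), (x2, y2) = correspondence
    a1, a2, a3, a4, a5, a6 = model
    return (a1 * x + a2 * y + a3 - x2) ** 2 + (a4 * x + a5 * y + a6 - y2) ** 2


def lmeds_affine(correspondences, trials=200, seed=0):
    """Return (model, median squared residual) of the best random trial."""
    rng = random.Random(seed)
    best_model, best_median = None, float("inf")
    for _ in range(trials):
        model = fit_affine(rng.sample(correspondences, 3))
        if model is None:
            continue  # degenerate (collinear) sample
        med = median(residual_sq(model, c) for c in correspondences)
        if med < best_median:
            best_model, best_median = model, med
    return best_model, best_median
```

One plausible way to obtain several models, in the spirit of the iterative application described in the abstract, is to remove the correspondences that fit the first model well and run the estimator again on the remainder; the details of that loop are not given in the abstract and are left out here.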
In this paper, we present a new approach to robust 3D rigid body motion estimation and scene structure recovery using an epipolar corridor. In comparison to traditional two-stage approaches, we do not rely on independent establishment of feature point correspondences and subsequent computation of 3D motion parameters, but iteratively feed the motion parameters back to the point correspondence estimator, restricting the search space to an epipolar corridor. As the iterations proceed, we narrow the width of the corridor and reach a stable solution with all point correspondences obeying the epipolar line constraint. The least median of squares estimator is integrated into the 3D motion parameter estimation framework to deal with the multi-motion problem. The position of the feature points along the epipolar line finally leads to structure recovery from motion. Experimental results using real and synthetic image sequence data show the ability of the approach to robustly estimate 3D motion parameters.
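The corridor restriction can be illustrated with a small sketch: given a fundamental matrix estimated from the current motion parameters, a candidate match in the second image is accepted only if it lies within a given distance of the epipolar line of the point in the first image. The representation of the matrix as nested lists, the `half_width` parameter, and the function names are assumptions for this example; the abstract does not describe how the authors implement the test.

```python
import math


def epipolar_line(F, point):
    """Line l' = F * (x, y, 1)^T in the second image, as (a, b, c)
    with a*x' + b*y' + c = 0. F is a 3x3 matrix given as a list of rows."""
    x, y = point
    return tuple(F[r][0] * x + F[r][1] * y + F[r][2] for r in range(3))


def distance_to_line(line, point):
    """Perpendicular distance from a 2D point to the line (a, b, c)."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * point[0] + b * point[1] + c) / norm


def in_corridor(F, point1, point2, half_width):
    """True if point2 lies within half_width of the epipolar line of point1."""
    return distance_to_line(epipolar_line(F, point1), point2) <= half_width
```

For a pure horizontal camera translation, one valid fundamental matrix is `[[0, 0, 0], [0, 0, -1], [0, 1, 0]]`, for which the epipolar line of `(x, y)` is the row `y' = y`. Calling `in_corridor` with a smaller `half_width` on each iteration mirrors the narrowing of the corridor described above.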
The draft international standard ITU-T H.263 is closely related to the well-known and widely used ITU-T Recommendation H.261. However, H.263 provides the same subjective image quality at less than half the bit rate. In this paper, we investigate to what extent single enhancements of H.263 contribute to this performance gain, and consider the trade-off between quality and complexity. Based on the test sequence “Foreman”, H.263 in its default and optional coding modes is compared to H.261 on the basis of rate-distortion curves at bit rates up to 128 kbps. At 64 kbps, the performance gain of H.263 in its default mode compared to H.261 is approximately 2 dB PSNR. This improvement is achieved with only a small increase in complexity, and is mainly due to more accurate motion compensation with half-pel accuracy. Considering the trade-off between quality and complexity, the combination of the optional coding modes “Advanced prediction mode” and “PB-frames mode” seems to be a good compromise, resulting in an additional performance gain of 1 dB PSNR at 64 kbps. The “Syntax-based arithmetic coding mode”, on the other hand, offers only a very small performance gain (0.2 dB PSNR at 64 kbps) for its increased computational complexity. Results from profiling an H.263 software codec are presented in order to support complexity considerations of the optional coding modes.
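Half-pel motion compensation, which the abstract names as the main source of the gain, reads prediction samples at positions halfway between integer pixels. The sketch below uses bilinear interpolation with rounding offsets and integer division, as the scheme is commonly described for H.263. The exact rounding, the lack of border handling, and the function name `half_pel_sample` are assumptions for this example and should be checked against the Recommendation.

```python
def half_pel_sample(frame, x2, y2):
    """Return the reference sample at a half-pel position.

    frame is a list of rows of integer sample values. (x2, y2) are
    non-negative coordinates in half-pel units, so (2*x, 2*y) is the
    integer pixel (x, y). No border handling: the caller must keep the
    position, and the neighbors it needs, inside the frame.
    """
    x, y = x2 // 2, y2 // 2
    frac_x, frac_y = x2 % 2, y2 % 2
    a = frame[y][x]
    if frac_x == 0 and frac_y == 0:
        return a  # integer position, no interpolation
    if frac_y == 0:
        # Halfway between two horizontal neighbors
        return (a + frame[y][x + 1] + 1) // 2
    if frac_x == 0:
        # Halfway between two vertical neighbors
        return (a + frame[y + 1][x] + 1) // 2
    # Center of four neighboring pixels
    return (a + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1] + 2) // 4
```

The interpolation costs only a few additions and a shift per sample, which is consistent with the abstract's observation that the improvement comes with only a small increase in complexity.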