Definition

Digital video watermarking is a technology for embedding information into digital video data and retrieving it from that data.

A variety of robust and fragile video watermarking methods have been proposed to address illegal copying and proof of ownership and to identify manipulations. Although broad claims have been made about the robustness of various digital watermarking methods, combined or non-linear geometric transformations remain difficult to handle. The methods can be divided into techniques that work on compressed data and techniques that work on uncompressed data. In general, video watermarking hides a watermark by modifying some characteristics of the video, based on the following concepts:

  1. Spatial domain approach, also called native domain: embedding and detection are performed on spatial pixel values (luminance, chrominance, color space) or on the overall video frame characteristics,

  2. Feature or salient point watermarking, which modifies geometric properties of the video frames,

  3. Frequency domain techniques, where the spatial values are transformed, for example with the Discrete Cosine Transform (DCT), the Fast Fourier Transform (FFT), wavelets, or fractals,

  4. Quantization index modulation (QIM) watermarking,

  5. Format-specific approaches, such as watermarking of structure elements like the Facial Animation Parameters of MPEG-4 or motion vectors.
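As an illustration of item 4, the following is a minimal sketch of scalar QIM on a list of coefficients. It assumes two interleaved quantizer lattices separated by half a step; the function names, the step size of 8, and the sample values are hypothetical and not taken from any cited method.

```python
def qim_embed(value, bit, step=8.0):
    """Move `value` to the nearest point of the lattice that encodes `bit`."""
    offset = bit * step / 2.0  # lattice for bit 1 is shifted by half a step
    return round((value - offset) / step) * step + offset


def qim_extract(value, step=8.0):
    """Return the bit whose lattice has the point closest to `value`."""
    d0 = abs(value - qim_embed(value, 0, step))
    d1 = abs(value - qim_embed(value, 1, step))
    return 0 if d0 <= d1 else 1


# Hypothetical host coefficients (e.g. transform coefficients of one frame).
coefficients = [103.2, 57.9, -12.4, 240.0]
message = [1, 0, 1, 1]

marked = [qim_embed(c, b) for c, b in zip(coefficients, message)]
noisy = [m + 1.5 for m in marked]          # distortion below step / 4
recovered = [qim_extract(v) for v in noisy]
```

In this sketch a bit survives any distortion smaller than a quarter of the step, while the embedding itself changes a coefficient by at most half a step, so the step size trades robustness against visibility.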

The robust watermarking approaches usually spread the watermarking information redundantly over the overall signal representation in a non-invertible manner to enforce identification or verification of ownership or to annotate the video. For example, the message is spread and encoded in the 2D FFT frequencies of each video frame. In most cases, before each frame is transformed to the frequency domain, the frame data is converted from, e.g., RGB space to YCbCr space. Generically, YCbCr space consists of a luminance component Y and two color difference components Cb and Cr. The Y component contains the luminance, that is, the black-and-white image information, while Cb represents the difference between B and Y and Cr represents the difference between R and Y. In YCbCr space most of the frame information is in the Y component. This representation is also used during MPEG compression. The MPEG algorithm removes large portions of the Cb and Cr components without noticeably damaging the frame quality. It compresses the Y component less aggressively, since Y has more effect on the quality of the compressed frames; most watermarking schemes also use the Y component.

To resist estimation attacks, the watermark signal should be designed adaptively to the characteristics of the overall video frame sequence. Two risks have to be balanced: an equal or similar watermarking pattern in every frame could allow an estimation attack based on the slight visual differences between adjacent frames, whereas a different watermarking pattern in every frame could allow an estimation attack based on the similarities between visually similar frames; see for example [1].
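The color space conversion described above can be sketched as follows. This is a minimal per-pixel example that assumes the ITU-R BT.601 luma weights in a full-range form with chroma centered at 128; MPEG systems commonly use a limited-range variant, and clipping to the valid range is omitted. The helper `shift_luma` is hypothetical and only shows that a change applied to Y leaves Cb and Cr untouched.

```python
KR, KG, KB = 0.299, 0.587, 0.114  # BT.601 luma weights (assumed here)


def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr); Cb ~ B - Y, Cr ~ R - Y."""
    y = KR * r + KG * g + KB * b
    cb = 128.0 + 0.5 * (b - y) / (1.0 - KB)
    cr = 128.0 + 0.5 * (r - y) / (1.0 - KR)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr (no clipping to 0..255)."""
    r = y + 2.0 * (1.0 - KR) * (cr - 128.0)
    b = y + 2.0 * (1.0 - KB) * (cb - 128.0)
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def shift_luma(rgb, delta):
    """Hypothetical helper: change only the Y component of a pixel."""
    y, cb, cr = rgb_to_ycbcr(*rgb)
    return ycbcr_to_rgb(y + delta, cb, cr)


original = (200, 30, 90)
marked = shift_luma(original, 2.0)  # a luminance-only modification
```

A real scheme would apply the modification to transform coefficients of the Y plane rather than to single pixels; the sketch only isolates the color space step.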

Data rates are measured in bits per frame. If the watermark is embedded in the compressed domain, for example in MPEG video, the embedded data rate is counted in bits per I, B, and P frame or per group of pictures (GOP).
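The relation between these units can be made concrete with a small calculation. The GOP pattern, the per-frame-type capacities, and the frame rate below are purely hypothetical values chosen for illustration.

```python
def gop_payload_bits(gop_pattern, bits_per_type):
    """Sum the watermark payload over all frames of one GOP."""
    return sum(bits_per_type[frame_type] for frame_type in gop_pattern)


gop = "IBBPBBPBBPBB"                    # assumed 12-frame GOP structure
capacity = {"I": 64, "P": 16, "B": 0}   # assumed bits per frame type

bits_per_gop = gop_payload_bits(gop, capacity)
frames_per_second = 25                  # assumed frame rate
bits_per_second = bits_per_gop * frames_per_second / len(gop)
```

With these assumed numbers the payload is 112 bits per GOP, which corresponds to roughly 233 bits per second of video.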

Fragile video watermarking is relevant for verifying the authenticity and integrity of the data. A fragile watermark signal can be spread over the whole video into manipulation-sensitive elements such as the least significant bits (LSBs) in order to detect changed and manipulated regions. The watermark for authentication purposes is often designed in an invertible (reversible) manner to allow reproduction of the original; see for example the analysis of the approaches in [2]. Several fragile watermarking techniques exist to recognize video manipulations. At present most fragile watermarks are very sensitive and can detect almost any change in pixel values. Only a few approaches address so-called content-fragile watermarking, which is relevant in applications where several post-production editing processes are allowed. For example, [3] addresses the recognition of changes in the video frame sequence, with the possibility of reproducing the original frame sequence, in a scheme called self-watermarking of frame-pairs.
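A minimal sketch of an LSB-based fragile watermark that localizes manipulated regions is given below. It is not one of the cited schemes and it is not invertible: each block stores a keyed hash of its own upper seven bit planes in its LSB plane. The block size, the key, and the function names are assumptions, and the frame dimensions are assumed to be multiples of the block size.

```python
import hashlib

BLOCK = 4  # block edge length in pixels (hypothetical choice)


def _block_bits(frame, by, bx, key):
    """Keyed hash of the 7 upper bits of one block, as BLOCK*BLOCK bits."""
    msb = bytes(frame[y][x] >> 1
                for y in range(by, by + BLOCK)
                for x in range(bx, bx + BLOCK))
    digest = hashlib.sha256(key + f"{by},{bx}".encode() + msb).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(BLOCK * BLOCK)]


def embed(frame, key):
    """Return a copy of `frame` whose LSBs carry the per-block hash bits."""
    out = [row[:] for row in frame]
    for by in range(0, len(frame), BLOCK):
        for bx in range(0, len(frame[0]), BLOCK):
            bits = iter(_block_bits(frame, by, bx, key))
            for y in range(by, by + BLOCK):
                for x in range(bx, bx + BLOCK):
                    out[y][x] = (frame[y][x] & 0xFE) | next(bits)
    return out


def tampered_blocks(frame, key):
    """List the (row, column) origins of blocks whose LSBs do not verify."""
    bad = []
    for by in range(0, len(frame), BLOCK):
        for bx in range(0, len(frame[0]), BLOCK):
            found = [frame[y][x] & 1
                     for y in range(by, by + BLOCK)
                     for x in range(bx, bx + BLOCK)]
            if found != _block_bits(frame, by, bx, key):
                bad.append((by, bx))
    return bad


# Hypothetical 8x8 luminance frame with values in 0..255.
frame = [[(7 * x + 13 * y) % 256 for x in range(8)] for y in range(8)]
marked = embed(frame, b"secret")
tampered = [row[:] for row in marked]
tampered[5][6] ^= 1  # smallest possible manipulation: flip one LSB
```

Here `tampered_blocks(marked, b"secret")` returns an empty list, while the single flipped bit in `tampered` is reported for the block starting at row 4, column 4. Changes to the upper bit planes are detected with high probability rather than with certainty, because only 16 hash bits are stored per block in this sketch.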

A further idea, for example from [4], is to embed a visual content feature M into the video frame with a robust watermarking method. The content-fragile watermarking approach for video authentication tries to extract the frame characteristics relevant to human perception, called content. One approach is to determine the edge characteristics of each single video frame and to transform them into a feature code for the content-fragile digital watermark. The edge characteristics of a frame reflect the frame content very well, because they allow the identification of object structures and of homogeneity in the video. Dittmann [4] uses the Canny edge detector, which is described as the most efficient edge separator, and describes several strategies for generating and verifying the edge-based feature codes. From a general perspective, the content feature M can be embedded directly or used as a seed to generate the watermarking pattern itself.
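The second option, using the content feature as a seed, can be sketched as follows. Note that this example replaces the Canny detector with a plain thresholded gradient to stay self-contained, so it is not the method of [4]; the block size, threshold, key, and function names are assumptions.

```python
import hashlib
import random


def edge_feature(frame, block=4, threshold=40):
    """One bit per block: 1 if the block contains a strong gradient."""
    h, w = len(frame), len(frame[0])
    bits = []
    for by in range(0, h - 1, block):
        for bx in range(0, w - 1, block):
            strong = any(
                abs(frame[y][x + 1] - frame[y][x])
                + abs(frame[y + 1][x] - frame[y][x]) > threshold
                for y in range(by, min(by + block, h - 1))
                for x in range(bx, min(bx + block, w - 1)))
            bits.append(int(strong))
    return bits


def pattern_from_feature(bits, key, length):
    """Derive a +1/-1 watermark pattern from the feature bits and a key."""
    digest = hashlib.sha256(key + bytes(bits)).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.choice((-1, 1)) for _ in range(length)]


# Hypothetical 8x8 frame with a vertical edge in the middle.
frame = [[50] * 4 + [200] * 4 for _ in range(8)]
brighter = [[p + 5 for p in row] for row in frame]  # allowed processing
edited = [row[:] for row in frame]
edited[6][6] = 0                                     # content change

feature = edge_feature(frame)
pattern = pattern_from_feature(feature, b"secret", 16)
```

In this sketch a uniform brightness change leaves the feature bits, and therefore the derived pattern, unchanged, whereas the new structure in `edited` alters the feature, so the pattern recomputed by a verifier would no longer be the one that was embedded.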

Benchmarking of Video Watermarks

The available benchmarking suites mainly cover image and audio watermarks and neglect video-specific aspects. An evaluation of the sensitivity of video watermarking to collusion attacks, together with potential solutions, can be found in [1]. Video-specific aspects of video watermarks for cinema applications are summarized in [5]. Various features were analyzed, including robustness to unintentional attacks such as MPEG compression, transcoding, analog-to-digital and digital-to-analog conversions, standard conversions (PAL – NTSC), and changes of geometry.