
Multimedia Source Compression


Research topics:

Wideband speech coding for mobile communications
Hee Thong How
Wideband speech coding, which is of increasing interest today, is intended for compressing speech or audio signals of 7 kHz bandwidth, sampled at 16 kHz. As shown in Figure \ref{wideband_spectrum}, extending the lower band edge down to 50 Hz from the 300 Hz lower boundary of narrowband codecs increases naturalness, presence and comfort. At the high end of the spectrum, extending the upper band edge to 7000 Hz increases intelligibility and makes it easier to differentiate between fricative sounds. The resulting improvement in speech quality will find applications in videoconferencing, loudspeaker telephony and high-definition television.


Figure 1: Typical energy spectrum of a wideband speech signal.
\label{wideband_spectrum}

Hee Thong's research considers various techniques of coding wideband speech at bit rates between 8 and 32 kbit/s. These techniques include the state-of-the-art Code Excited Linear Prediction (CELP) algorithms, which have been applied successfully in narrowband speech coding. Both forward- and backward-adaptive CELP schemes were investigated. Currently, we are investigating the feasibility of employing sinusoidal transform coding (STC) techniques in wideband speech coding. It is also of interest to study the application of the wavelet transform in the context of wideband speech coding, especially in the intermediate bit rate range between 16 and 64 kbit/s, where the computational complexity of CELP codecs becomes excessive.

Mobile Video networking
Dr. Peter Cherriman
The wireless communicators of the near future are likely to be capable of capturing and transmitting video signals, and hence video telephony may gradually replace voice-only mobile communications. In order to achieve this goal, a range of proprietary and standards-based video codecs have been proposed. However, most of these video codecs were designed for standard phone lines, not for mobile radio communication. In a mobile radio system the signal strength can vary rapidly, so the video codecs need to be made more robust to errors before they can operate as part of a wireless communications device.

Most of the existing standard video codecs were designed for achieving high compression ratios of up to 400:1. This is generally achieved by transmitting only the 'difference' between consecutive video frames, and using techniques such as run-length and Huffman coding for removing the predictable information. However, if some information is lost or corrupted, the decoder can misinterpret the following data, inflicting artifacts, which may persist for several seconds. Alternatively, if the difference between two consecutive video frames was someone moving their hand, and this information is lost, the decoded video could show an additional hand 'floating in the air', and again, the effects could last for several seconds. In order to recover from artifacts, the video encoder regularly transmits certain video frames in full, without motion prediction; these are referred to as INTRA-coded frames. However, transmitting a video frame in full, without motion prediction, requires more bits than transmitting the video scene 'changes' since the last frame. Hence in low bitrate applications the number of INTRA-coded frames is limited, which extends the time for which artifacts persist. Moreover, if an INTRA-coded frame is itself lost or corrupted, the decoded video could become worse rather than better.
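The effect of a lost difference frame can be illustrated with a toy example. The sketch below treats a frame as a short list of pixel values, codes the frame difference with simple run-length coding, and then shows what a decoder reconstructs when one difference frame never arrives. The helper names and the pixel values are illustrative assumptions; real codecs such as H.261 or H.263 operate on motion-compensated, transformed blocks instead of raw pixel differences.

```python
from typing import List, Tuple


def frame_difference(prev: List[int], curr: List[int]) -> List[int]:
    """Per-pixel difference between two frames of equal length."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of pixels")
    return [c - p for p, c in zip(prev, curr)]


def run_length_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse runs of equal values into (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into a flat list."""
    out: List[int] = []
    for value, count in runs:
        out.extend([value] * count)
    return out


def apply_difference(reference: List[int], diff: List[int]) -> List[int]:
    """Decoder side: add a difference frame to the stored reference frame."""
    return [r + d for r, d in zip(reference, diff)]


# Hypothetical 8-pixel frames: a bright object appears, then moves right.
frame_1 = [10, 10, 10, 10, 10, 10, 10, 10]
frame_2 = [10, 10, 10, 50, 50, 10, 10, 10]
frame_3 = [10, 10, 10, 10, 50, 50, 10, 10]

diff_12 = frame_difference(frame_1, frame_2)
runs_12 = run_length_encode(diff_12)
diff_23 = frame_difference(frame_2, frame_3)

# Error-free decoding recovers frame 2 exactly.
decoded_2 = apply_difference(frame_1, run_length_decode(runs_12))

# If diff_12 is lost, the decoder still holds frame 1 as its reference,
# so applying diff_23 yields a frame that matches neither frame 2 nor 3.
damaged_3 = apply_difference(frame_1, diff_23)
```

In this toy case `damaged_3` contains a spurious dark pixel and only part of the moved object, and the mismatch carries over into every later difference frame until a frame is sent in full.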

The simplest solution for protecting the video information is to use powerful error-correction coding, in order to improve the robustness against transmission errors. However, this technique has some disadvantages, since strong error correction can severely reduce the available bitrate for the video codec and reduce the video quality. If the error correction scheme is designed to cope with short periods of total signal loss, then the error correction can add significant delay to the video system, which may cause the video to appear jerky, and may lead to the loss of synchronisation with the associated audio information.
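The bitrate penalty of error correction can be made concrete with a small calculation. The sketch below assumes a fixed channel bitrate and a channel code of rate k/n, meaning k information bits are carried per n transmitted bits. The function name and the 64 kbit/s channel figure are illustrative choices, not values taken from the work described here.

```python
from fractions import Fraction


def video_bitrate(channel_bitrate_bps: int, code_rate: Fraction) -> Fraction:
    """Bitrate left for the video codec after channel coding.

    code_rate is k/n: k information bits carried per n transmitted bits.
    """
    if not 0 < code_rate <= 1:
        raise ValueError("code rate must lie in (0, 1]")
    return channel_bitrate_bps * code_rate


channel = 64_000  # hypothetical channel bitrate in bit/s

# A weaker rate-3/4 code leaves three quarters of the channel for video.
weak_protection = video_bitrate(channel, Fraction(3, 4))

# A stronger rate-1/2 code halves the bitrate available to the video codec.
strong_protection = video_bitrate(channel, Fraction(1, 2))
```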

The best solution is to use error correction coding in conjunction with other techniques, in order to overcome short periods of total signal loss. During Peter's PhD we investigated how we could improve the robustness of two standard video codecs, namely H.261 and H.263. We designed a system, which we found to be robust to lost information and errors. The system divided the encoded information of the video codec into packets. If a packet was corrupted or failed to arrive at the decoder, the decoder signalled this back to the transmitter, which instructed the video encoder to retransmit the information from the corrupted packets in later video frames. To prevent these retransmissions from increasing the bitrate, the video packetiser controlled the bitrate of the video codec so that the overall bitrate remained constant. The performance of this intelligent packetisation scheme can be judged from the example shown in Figure~\ref{fig:peter1}.

Figure 2: An example of our error-resilient packet video system, showing four consecutive video frames. The network has corrupted some of the transmitted packets, causing part of the video image to freeze, as shown in frames 26 and 27. However, the system soon recovers and replenishes the affected areas, as shown in frame 28. When these video frames are shown at the full speed of 10 frames per second, the loss of the packets is almost undetectable.
\label{fig:peter1}
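The idea of retransmitting lost information within a constant bitrate can be sketched as a per-frame bit budget that is filled with retransmissions first and fresh updates second. This is a simplified illustration under assumed names (`fill_frame_budget`, block identifiers, block sizes and a 1000-bit budget are all hypothetical); it is not the packetisation algorithm of the system described above.

```python
from typing import List, Tuple

Block = Tuple[str, int]  # (block identifier, coded size in bits)


def fill_frame_budget(
    retransmit: List[Block], fresh: List[Block], budget_bits: int
) -> Tuple[List[str], List[str], int]:
    """Fill one frame's fixed bit budget, serving retransmissions first.

    Returns the identifiers sent, the identifiers deferred to a later
    frame, and the number of bits actually used.
    """
    sent: List[str] = []
    deferred: List[str] = []
    used = 0
    for block_id, size_bits in retransmit + fresh:
        if used + size_bits <= budget_bits:
            sent.append(block_id)
            used += size_bits
        else:
            deferred.append(block_id)
    return sent, deferred, used


BUDGET = 1000  # hypothetical bits available per video frame

# Frame n: nothing to retransmit, three fresh blocks fit in the budget.
sent_n, deferred_n, used_n = fill_frame_budget(
    [], [("A", 300), ("B", 300), ("C", 300)], BUDGET
)

# Assumed decoder feedback: the packet carrying block B was corrupted.
lost = [("B", 300)]

# Frame n+1: block B is resent first, so less room remains for fresh blocks.
sent_next, deferred_next, used_next = fill_frame_budget(
    lost, [("D", 400), ("E", 400)], BUDGET
)
```

Because the budget per frame is fixed, the retransmitted block displaces a fresh update (block E here) and does not raise the total bitrate.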

During his PhD, Peter also investigated various network techniques for improving the performance of packet video systems. A particularly effective technique was the use of multi-mode modulation transceivers. These transceivers are capable of reconfiguring themselves in response to time-variant channel conditions. For example, when the signal strength is poor, a more robust but lower bitrate modulation scheme is used. However, when the signal strength is high, for example because the mobile receiver is close to the transmitter, the transceiver can switch to a higher-order modulation scheme, which supports a significantly higher bitrate, providing for either higher quality video telephony or higher bitrate internet access.
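The mode-switching behaviour described above can be sketched as a threshold lookup on the estimated channel quality. The set of modulation schemes and the SNR switching thresholds below are placeholder assumptions chosen for illustration; the text does not state which modes or thresholds the actual transceivers used. Only the bits-per-symbol figures are standard properties of the named schemes.

```python
from typing import NamedTuple, Sequence


class Mode(NamedTuple):
    name: str
    bits_per_symbol: int
    min_snr_db: float  # switching threshold (placeholder value)


# Ordered from most robust to highest throughput.
# The thresholds are illustrative placeholders, not measured values.
MODES = (
    Mode("BPSK", 1, float("-inf")),  # fallback mode, always allowed
    Mode("QPSK", 2, 10.0),
    Mode("16QAM", 4, 17.0),
    Mode("64QAM", 6, 24.0),
)


def select_mode(snr_db: float, modes: Sequence[Mode] = MODES) -> Mode:
    """Pick the highest-order mode whose threshold the channel SNR meets."""
    chosen = modes[0]
    for mode in modes:
        if snr_db >= mode.min_snr_db:
            chosen = mode
    return chosen


def bitrate_bps(mode: Mode, symbol_rate_baud: int) -> int:
    """Raw bitrate before channel coding for a given symbol rate."""
    return mode.bits_per_symbol * symbol_rate_baud


SYMBOL_RATE = 100_000  # hypothetical symbol rate in symbols per second

poor_channel = select_mode(3.0)    # falls back to the most robust mode
good_channel = select_mode(30.0)   # selects the highest-order mode
rate_gain = bitrate_bps(good_channel, SYMBOL_RATE) // bitrate_bps(
    poor_channel, SYMBOL_RATE
)
```

With these placeholder modes, the same symbol rate carries six times as many bits under good channel conditions as in the fallback mode.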

Upon completing his PhD, Peter joined an industrial project funded by the UK's Virtual Centre of Excellence in Mobile and Personal Communications, or Mobile VCE for short. It is a collaborative partnership between about 25 of the world's most prominent mobile communications companies and seven UK universities, each having long-standing specialist expertise in relevant areas. Its formation was endorsed by the Department of Trade and Industry.

The Communications group was awarded funding to conduct research in Intelligent Wireless Video Communications under the auspices of the Terminals Working Group in the Mobile VCE.

Every three months all the academic partners and representatives from some of the industrial sponsors meet to discuss the progress of the research. These meetings provide invaluable industrial feedback for our research, and the discussions often open up new areas of research.

Video over wireless interactive and broadcast environments
Dr. Chee Siong Lee
The objective of Chee Siong's research is to investigate various video
coding issues within wireless distributive and interactive
environments. Areas of research include:
  • Error resilience study of MPEG-2 video parameters
  • Digital Video Broadcasting (DVB) transmission aspects for terrestrial and satellite schemes
  • Stereoscopic video coding within the MPEG-4 video coding framework
  • Use of wavelets within stereoscopic video coding, and optimization of the wavelet-based coding scheme in a rate-distortion sense
  • System aspects of the proposed stereoscopic video coding system as well as of conventional video coding systems

With third-generation mobile communication systems promising users the possibility of multimedia communication, research has intensified towards the use of various standard video codecs, such as MPEG-2, MPEG-4 and H.263, in providing the video content of multimedia communication. However, due to their extensive use of entropy coding in achieving compression, these video codecs perform poorly when their video bitstreams are corrupted during transmission. Since mobile radio channels are known to be hostile, the chance of a bitstream being received in error is higher than over other communication channels. As such, apart from designing robust communication systems that attempt to present an error-free compressed video bitstream to the video decoder, researchers have also concentrated their efforts on designing more robust video codecs. A recent example is the MPEG-4 video codec, whose compressed bitstream structure carries certain redundant information that helps the video decoder to resynchronise itself as quickly as possible once a decoding error has been detected.

In our work, we have explored the feasibility of providing broadcast video services to mobile users. The video codec used in our study was the MPEG-2 video codec. The MPEG-2 video bitstream was subjected to a rigorous bit error sensitivity investigation, in order to assist in contriving various error protection schemes for wireless broadcast video transmission. Turbo-coded performance enhancement of the Pan-European terrestrial Digital Video Broadcast (DVB) system was proposed for transmission over mobile channels to receivers on the move. The turbo codec was shown to provide substantial performance advantages over conventional convolutional coding both in terms of bit error rate and video quality.

In order to increase the resilience of the video codec, we have also experimented with a technique known as data partitioning. However, our experiments suggested that multi-class data partitioning did not result in error resilience improvements, since a high proportion of relatively sensitive video bits had to be relegated to the lower integrity transmission subchannel, when invoking a powerful low-rate channel codec in the high-integrity protection class. Nonetheless, DVB transmission to mobile receivers is feasible, when using turbo-coded OFDM transceivers at realistic power-budget requirements under the investigated highly dispersive fading channel conditions.

With the technology for coding and conveying conventional two-dimensional video reaching maturity, our effort is now directed towards three-dimensional video communication for mobile users. This type of service is useful in applications where spatial depth information is important to the user. One particular example is telemedicine. Within this framework, we would like to investigate the use of wavelets as well as discrete cosine transform based codecs such as MPEG-4 in providing efficient compression for three-dimensional video signals. System aspects such as transmission and network layer issues are also of interest.


University of Southampton: