Tuesday, June 4, 2019
H.264 Video Streaming System on Embedded Platform
ABSTRACT

The adoption of technological products like digital television and video conferencing has made video streaming an active research area.

This report presents the integration of a video streamer module into a baseline H.264/AVC encoder running on a TMSDM6446EVM embedded platform. The main objective of this project is to achieve real-time streaming of baseline H.264/AVC video over a local area network (LAN) as part of a video surveillance system.

The encoding of baseline H.264/AVC and the hardware components of the platform are discussed first. Various streaming protocols are studied in order to implement the video streamer on the DM6446 board. The multi-threaded encoder application is used to encode raw video frames into H.264/AVC format and write them to a file. For video streaming, the open source Live555 MediaServer was used to stream video data to a remote VLC client over the LAN.

Initially, file streaming was implemented from PC to PC. After successful implementation on the PC, the video streamer was ported to the board. The steps involved in porting the Live555 application are also described in the report. Both unicast and multicast file streaming were implemented in the video streamer.

Due to the problems of file streaming, the live streaming approach was adopted. Several methodologies for integrating the video streamer and the encoder program are discussed. Modifications were made to both the encoder program and the Live555 application to achieve live streaming of H.264/AVC video. Results of both file and live streaming are shown in this report. The implemented video streamer module will be used as a base module of the video surveillance system.

Chapter 1 Introduction

1.1. Background

Significant breakthroughs have been made over the last few years in the area of digital video compression technologies. Applications making use of these technologies have therefore become prevalent and continue to be active research topics today. For example, digital television and video conferencing are some of the applications that are now commonly encountered in daily life. One application of interest here is a video camera surveillance system, which can enhance the security of consumers' business and home environments.

In typical surveillance systems, the captured video is sent over cable networks to be monitored and stored at remote stations. As the captured raw video contains a large amount of data, it is advantageous to compress the data before it is transferred over the network. One compression technique that is suitable for this type of application is the H.264 coding standard.

H.264 coding is better suited to video streaming than other coding techniques because it is more robust to data loss and has higher coding efficiency, both of which are important when streaming over a shared Local Area Network. With the increasing acceptance of H.264 coding and the availability of embedded systems with high computing power, a digital video surveillance system based on H.264 on an embedded platform is a feasible and potentially more cost-effective system.

Implementing an H.264 video streaming system on an embedded platform is a logical extension of video surveillance systems, which are still typically implemented on stations with high computing power (e.g. PCs). In an embedded version, a Digital Signal Processor (DSP) forms the core of the embedded system and executes the intensive signal processing algorithms.
Current embedded systems typically also include network features which enable the implementation of data streaming applications. To facilitate data streaming, a number of network protocol standards have also been defined and are currently used for digital video applications.

1.2. Objective and Scope

The objective of this final year project is to implement a video surveillance system based on the H.264 coding standard running on an embedded platform. Such a system covers an extensive range of functionality and would require an extensive amount of development time if implemented from scratch. Hence this project focuses on the data streaming aspect of a video surveillance system.

After some initial investigation and experimentation, it was decided to confine the main scope of the project to developing a live streaming H.264-based video system running on a DM6446 EVM development platform. The work to be performed progressively is broken down as follows:

1. Familiarization with the open source live555 streaming media server
Due to the complexity of implementing the various standard protocols needed for multimedia streaming, the live555 media server program is used as a base to implement the streaming of the H.264-based video data.

2. Streaming of a stored H.264 file over the network
The live555 program is then modified to support streaming of a raw encoded H.264 file from the DM6446 EVM board over the network. Knowledge of the H.264 coding standard is necessary in order to parse the file stream before streaming it over the network.

3. Modifying a demo version of an encoder program and integrating it with live555 to achieve live streaming
The demo encoder was modified to send encoded video data to the Live555 program, which performs the necessary packetization for streaming over the network.
Since data is passed from one process to another, various inter-process communication techniques were studied and employed in this project.

1.3. Resources

The resources used for this project are as follows:

1. DM6446 (DaVinci) Evaluation Module
2. SWANN C500 Professional CCTV Camera Solution 400 TV Lines CCD Color Camera
3. LCD Display
4. IR Remote Control
5. TI Davinci demo version of MontaVista Linux Pro v4.0
6. A Personal Workstation with Centos v5.0
7. VLC Player v.0.9.8a as client
8. Open source live555 program (downloaded from www.live555.com)

The system setup of this project is shown below.

1.4. Report Organization

This report consists of 8 chapters.

Chapter 1 introduces the motivation behind the embedded video streaming system and defines the scope of the project.

Chapter 2 presents the video literature review of the H.264/AVC video coding technique and the various streaming protocols which are to be implemented in the project.

Chapter 3 presents the hardware literature review of the platform used in the project. The architecture, memory management, inter-process communication and the software tools are also discussed in this chapter.

Chapter 4 explains the execution of the encoder program on the DM6446EVM board. The interaction of the various threads in this multi-threaded application is also discussed in order to fully understand the encoder program.

Chapter 5 gives an overview of the Live555 MediaServer, which is used as a base to implement the video streamer module on the board. Adding support for unicast and multicast streaming, porting live555 to the board and receiving the video stream on a remote VLC client are explained in this chapter.

Chapter 6 explains the limitations of file streaming and the move towards a live streaming system.
Various integration methodologies and the modifications to both the encoder program and the live555 program are shown as well.

Chapter 7 summarizes the implementation results of file and live streaming and analyses their performance.

Chapter 8 gives the conclusion by stating the current limitations and problems and the scope for future implementation.

Chapter 2 Video Literature Review

2.1. H.264/AVC Video Codec Overview

H.264 is the most advanced and latest video coding technique. Although there are many video coding schemes like H.26x and MPEG, H.264/AVC introduces many improvements and tools for coding efficiency and error resiliency. This chapter briefly discusses the network aspect of the video coding technique. It also covers the error resiliency needed for transmission of video data over the network. For a more detailed explanation of H.264/AVC, refer to Appendix A.

2.1.1. Network Abstraction Layer (NAL)

The aim of the NAL is to ensure that the data coming from the Video Coding Layer (VCL) is network worthy, so that the data can be used by numerous systems. The NAL facilitates the mapping of H.264/AVC VCL data to different transport layers such as:
* RTP/IP real-time streaming over wired and wireless mediums
* Different storage file formats such as MP4, MMS and AVI

The concepts of the NAL and the error robustness techniques of H.264/AVC are discussed in the following parts of the report.

NAL Units

The encoded data from the VCL are packed into NAL units. A NAL unit represents a packet which is made up of a certain number of bytes. The first byte of the NAL unit is called the header byte, which indicates the data type of the NAL unit. The remaining bytes make up the payload data of the NAL unit.

The NAL unit structure provides for different transport systems, namely packet-oriented and bit stream-oriented.
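As a minimal sketch of how the header byte can be read, the Python snippet below splits it into the three bit fields that the H.264 specification defines for it: a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc and a 5-bit nal_unit_type. The function name parse_nal_header and the small lookup table are illustrative assumptions and are not taken from the project code.

```python
# Illustrative sketch: decode the one-byte H.264 NAL unit header.
# parse_nal_header and NAL_TYPE_NAMES are hypothetical names, not project code.

NAL_TYPE_NAMES = {
    1: "non-IDR slice (VCL)",
    5: "IDR slice (VCL)",
    6: "SEI (non-VCL)",
    7: "SPS (non-VCL)",
    8: "PPS (non-VCL)",
}


def parse_nal_header(header_byte):
    """Split the header byte into its three bit fields."""
    return {
        "forbidden_zero_bit": (header_byte >> 7) & 0x01,  # must be 0
        "nal_ref_idc": (header_byte >> 5) & 0x03,         # 0 means not used for reference
        "nal_unit_type": header_byte & 0x1F,              # kind of payload that follows
    }


if __name__ == "__main__":
    for byte in (0x67, 0x68, 0x65, 0x41):
        fields = parse_nal_header(byte)
        name = NAL_TYPE_NAMES.get(fields["nal_unit_type"], "other")
        print(hex(byte), fields, name)
```

Only a handful of unit types are named here; a streamer would mainly need to tell parameter sets and slices apart.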
To cater for bit stream-oriented transport systems like MPEG-2, the NAL units are organized into byte stream format. These units are prefixed by a specific start code prefix of three bytes, namely 0x000001. The start code prefix indicates the start of each NAL unit and hence defines the boundaries of the units.

For packet-oriented transport systems, the encoded video data are transported via packets defined by the transport protocols. Hence, the boundaries of the NAL units are known without having to include the start code prefix. The details of the packetization of NAL units are discussed in later sections of the report.

NAL units are further categorized into two types:
* VCL units, which contain the encoded video data
* Non-VCL units, which contain additional information such as parameter sets, the important header information. They also carry supplemental enhancement information (SEI), which contains timing information and other data that increase the usability of the decoded video signal.

Access Units

A group of NAL units which adheres to a certain form is called an access unit. When one access unit is decoded, one decoded picture is formed. Table 1 below explains the functions of the NAL units derived from the access units.

Data/Error Robustness Techniques

H.264/AVC has several techniques to mitigate error/data loss, which is an essential quality for streaming applications. The techniques are as follows:

Parameter sets contain information that applies to a large number of VCL NAL units. There are two kinds of parameter sets:
* Sequence Parameter Set (SPS): information pertaining to a sequence of encoded pictures
* Picture Parameter Set (PPS): information pertaining to one or more individual pictures

These parameters rarely change, so they need not be transmitted repeatedly, which saves overhead.
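To tie the byte stream format to the parameter sets just described, the following Python sketch splits a byte stream on the three-byte start code prefix and reports the type of each unit, so that the SPS and PPS can be located. The helper name split_nal_units and the sample bytes are made up for illustration. The sketch also assumes that zero bytes immediately before a start code are padding (for example the extra zero of a four-byte 0x00000001 prefix) and drops them.

```python
# Illustrative sketch: split an H.264 byte stream into NAL units using the
# three-byte start code prefix 0x000001. split_nal_units is a hypothetical helper.

START_CODE = b"\x00\x00\x01"


def split_nal_units(stream):
    """Return the NAL units (header byte plus payload) found in a byte stream."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # Assumption: zero bytes just before the next start code are padding,
        # so they are stripped from the end of the current unit.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


if __name__ == "__main__":
    # Made-up payload bytes; only the start codes and header bytes matter here.
    sample = (b"\x00\x00\x00\x01\x67\x42\x00\x1e"   # SPS (type 7)
              b"\x00\x00\x00\x01\x68\xce\x38\x80"   # PPS (type 8)
              b"\x00\x00\x01\x65\x88\x84")          # IDR slice (type 5)
    for unit in split_nal_units(sample):
        print("type", unit[0] & 0x1F, "length", len(unit))
```

A file streamer could use a scan of this kind to find unit boundaries before handing each unit to the packetizer, since packet-oriented transports carry the units without the start code prefix.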
The parameter sets can be sent in-band, carried in the same channel as the VCL NAL units. They can also be sent out-of-band using a reliable transport protocol. This enhances the resiliency towards data and error loss.

Flexible Macroblock Ordering (FMO): FMO maps the macroblocks to different slice groups. In the event of any slice group loss, the missing data is masked by interpolating from the other slice groups.

Redundant Slices (RS): A redundant representation of the picture can be stored in the redundant slices. If the original slice is lost, the decoder can make use of the redundant slices to recover it.

These techniques make the H.264/AVC codec more robust and resilient towards data and error loss.

2.1.2. Profiles and Levels

A profile of a codec is defined as the set of features identified to meet certain specifications of the intended applications. For the H.264/AVC codec, it is defined as a set of features identified to generate a conforming bit stream. A level imposes restrictions on some key parameters of the bit stream.

In H.264/AVC, there are three profiles, namely Baseline, Main and Extended. Figure 5 shows the relationship between these profiles. The Baseline profile is the most likely to be used by network cameras and encoders as it requires limited computing resources. It is well suited to supporting real-time streaming applications on an embedded platform.

2.2. Overview of Video Streaming

In earlier systems, accessing video data across a network used the download-and-play approach. In this approach, the client had to wait until the whole video was downloaded to the media player before play-out began. To combat the long initial play-out delay, the concept of streaming was introduced.

Streaming allows the client to play out the earlier part of the video data while the remaining part is still being transferred.
The major advantage of the streaming concept is that the video data need not be stored on the client's computer, unlike the traditional download-and-play approach. This reduces the long initial play-out delay experienced by the client.

Streaming adopts the traditional client/server model. The client connects to the listening server and requests video data. The server sends the video data to the client for play-out.

2.2.1. Types of Streaming

There are three different types of video streaming: pre-recorded/file streaming, live/real-time streaming and interactive streaming.
* Pre-recorded/file streaming: The encoded video is stored in a file and the system streams the file over the network. A major drawback is the long initial play-out delay (10-15 s) experienced by the client.
* Live/real-time streaming: The encoded video is streamed over the network directly without being stored in a file. The initial play-out delay is reduced. Care must be taken to ensure that the play-out rate does not exceed the sending rate, which may result in a jerky picture. On the other hand, if the sending rate is too slow, the packets arriving at the client may be dropped, causing the picture to freeze. The timing requirement for the end-to-end delay is more stringent in this scenario.
* Interactive streaming: Like live streaming, the video is streamed directly over the network. The system responds to the user's control inputs, such as rewind, pause, stop, play and forward, for the particular video stream.

In this project, both pre-recorded and live streaming are implemented. Some interactive streaming controls, such as stop and play, are also part of the system.

2.2.2. Video Streaming System Modules

Video Source

The purpose of the video source is to capture the raw video sequence. The CCTV camera is used as the video source in this project.
Most cameras are analogue, and they are connected to the encoding station via video connections. This project makes use of only one video source due to the limited number of video connections on the encoding station. The raw video sequence is then passed on to the encoding station.

Encoding Station

The encoding station digitizes and encodes the raw video sequence into the desired format. In the actual system, the encoding into H.264/AVC format is done by the DM6446 board. Since hardware encoding is CPU intensive, it forms the bottleneck of the whole streaming system. The H.264 video is passed on to the video streamer server module of the system.

Video Streaming and WebServer

The role of the video streaming server is to packetize the H.264/AVC video for streaming over the network. It serves the requests from individual clients and needs to support the total bandwidth requirements of the video streams requested by clients. The WebServer offers a URL link which connects to the video streaming server. For this project, the video streaming server module is embedded inside the DM6446 board and serves every individual client's requests.

Video Player

The video player acts as a client, connecting to and requesting video data from the video streaming server. Once the video data is received, the video player buffers the data for a while and then begins play-out. The video player used for this project is the VideoLAN (VLC) Player. It has the relevant H.264/AVC codec, so it can decode and play the H.264/AVC video data.

2.2.3. Unicast vs Multicast

There are two key delivery techniques employed in streaming media distribution.

Unicast transmission is the sending of data to one particular network destination host over a packet-switched network. It establishes a two-way point-to-point connection between client and server. The client communicates directly with the server via this connection.
The drawback is that every connection receives a separate video stream, which uses up network bandwidth rapidly.

Multicast transmission is the sending of only one copy of the data over the network so that many clients can receive it simultaneously. In video streaming, it is more cost-effective to send a single copy of the video data over the network so as to conserve network bandwidth. Since multicast is not connection-oriented, the clients cannot control the streams that they receive.

In this project, unicast transmission is used to stream encoded video over the network. The client connects directly to the DM6446 board, from which it gets the encoded video data. The project can easily be extended to multicast transmission.

2.3. Streaming Protocols

When streaming video content over a network, a number of network protocols are used. These protocols are defined by the Internet Engineering Task Force (IETF) and the Internet Society (IS) and documented in Request for Comments (RFC) documents. These standards are adopted by many developers today.

In this project, the same standards are employed in order to stream H.264/AVC content over a simple Local Area Network (LAN). The following sections discuss the various protocols that were studied in the course of this project.

2.3.1. Real-Time Streaming Protocol (RTSP)

The most commonly used application layer protocol for controlling media streaming is RTSP. RTSP acts as a control protocol for media streaming servers. It establishes a connection between the two end points of the system and controls media sessions. Clients issue VCR-like commands such as play and pause to control the real-time playback of media streams from the servers. However, this protocol is not involved in the transport of the media stream over the network. For this project, RTSP version 1.0 is used.

RTSP States

Like the Hypertext Transfer Protocol (HTTP), RTSP contains several methods. They are OPTIONS, DESCRIBE, SETUP, PLAY, PAUSE, RECORD and TEARDOWN.
These commands are sent using the RTSP URL. The default port number used by this protocol is 554. An example of such a URL is rtsp://

OPTIONS: An OPTIONS request returns the types of request that the server will accept. An example of the request is

OPTIONS rtsp://155.69.148.136:554/test.264 RTSP/1.0
CSeq: 1\r\n
User-Agent: VLC media player

The CSeq parameter keeps track of the number of requests sent to the server and is incremented every time a new request is issued. The User-Agent header identifies the client making the request.

DESCRIBE: This method gets the description of the presentation or media object identified by the request URL from the server. An example of such a request is

DESCRIBE rtsp://155.69.148.138:554/test.264 RTSP/1.0
CSeq: 2\r\n
Accept: application/sdp\r\n
User-Agent: VLC media player

The Accept header describes the formats understood by the client. All the initialization information of the media resource must be present in the description returned for the DESCRIBE method.

SETUP: This method specifies the transport mechanism to be used for the media stream. A typical example is

SETUP rtsp://155.69.148.138:554/test.264 RTSP/1.0
CSeq: 3\r\n
Transport: RTP/AVP;unicast;client_port=1200-1201
User-Agent: VLC media player

The Transport header specifies the transport mechanism to be used. In this case, the Real-time Transport Protocol is used in a unicast manner. The relevant client port numbers are also given; they are chosen by the client. Since RTSP is a stateful protocol, a session is created upon successful acknowledgement of this method.

PLAY: This method requests the server to start sending the data via the transport mechanism stated in the SETUP method. The request has the same form as for the other methods, except for the headers

Session: 6
Range: npt=0.000-\r\n

The Session header specifies the unique session ID. This is important because the server may establish several sessions and the ID keeps track of them.
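The four requests shown so far share one textual shape: a request line, a CSeq header, optional further headers, and a blank line that ends the request. A minimal Python sketch of that shape is given below. The helper build_rtsp_request is a hypothetical name and is not part of Live555 or VLC; the address is the one used in the examples above.

```python
# Illustrative sketch: format RTSP/1.0 requests as plain text.
# build_rtsp_request is a hypothetical helper, not part of Live555 or VLC.


def build_rtsp_request(method, url, cseq, headers=None):
    """Return an RTSP/1.0 request: request line, CSeq, extra headers, blank line."""
    lines = ["%s %s RTSP/1.0" % (method, url), "CSeq: %d" % cseq]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    # Every line ends with CRLF and an empty line terminates the request.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    url = "rtsp://155.69.148.138:554/test.264"  # address taken from the examples above
    print(build_rtsp_request("OPTIONS", url, 1))
    print(build_rtsp_request("DESCRIBE", url, 2, {"Accept": "application/sdp"}))
    print(build_rtsp_request("SETUP", url, 3,
                             {"Transport": "RTP/AVP;unicast;client_port=1200-1201"}))
    print(build_rtsp_request("PLAY", url, 4, {"Session": "6", "Range": "npt=0.000-"}))
```

The CSeq value is passed in by the caller here; a real client would increment it for every request it issues, as described for the OPTIONS method.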
The Range header sets the play time to the beginning and plays until the end of the range.

PAUSE: This method informs the server to pause sending of the media stream. Once the PAUSE request is sent, the Range header captures the position at which the media stream is paused. When a PLAY request is sent again, the client resumes playing from the current position of the media stream as specified in the Range header.

RTSP Status Codes

Whenever the client sends a request message to the server, the server forms a corresponding response message to be sent to the client. The response codes are similar to those of HTTP, as both are in ASCII text. They are as follows:

200 OK
301 Redirection
405 Method Not Allowed
451 Parameter Not Understood
454 Session Not Found
457 Invalid Range
461 Unsupported Transport
462 Destination Unreachable

These are some of the RTSP status codes. There are many others, but the codes mentioned above are the important ones in the context of this project.

2.3.2. Real-time Transport Protocol (RTP)

RTP defines a packet structure which is used for transporting media streams over the network. It is a transport layer protocol, but developers view it as part of the application layer protocol stack. This protocol facilitates jitter compensation and detection of out-of-sequence arrival of data, which is common for transmission over an IP network. For the transmission of media data over the network, it is important that packets arrive in a timely manner, as the application is loss tolerant but not delay tolerant. Due to the high latency of the Transmission Control Protocol (TCP) in establishing connections, RTP is often built on top of the User Datagram Protocol (UDP). RTP also supports multicast transmission of data.

RTP is also a stateful protocol, as a session is established before data can be packed into RTP packets and sent over the network. The session contains the IP address of the destination and the RTP port number, which is usually an even number.
The following section explains the RTP packet structure used for transmission.

RTP Packet Structure

The RTP packet header, which is placed in front of the media data, is shown below.

The minimum size of the RTP header is 12 bytes. Optional extension information may be present after the header. The fields of the header are:
* V (Version) (2 bits): indicates the version number of the protocol. The version used in this project is 2.
* P (Padding) (1 bit): indicates whether there is padding, which can be used by encryption algorithms.
* X (Extension) (1 bit): indicates whether there is extension information between the header and the payload data.
* CC (CSRC Count) (4 bits): indicates the number of CSRC identifiers.
* M (Marker) (1 bit): used by the application to indicate that the data has specific relevance from the perspective of the application. In this project, the M bit marks the end of the video data.
* PT (Payload Type) (7 bits): indicates the type of payload data carried by the packet. H.264 is used for this project.
* Sequence Number (16 bits): incremented by one for every RTP packet. It is used to detect packet loss and out-of-sequence packet arrival. Based on this information, the application can take appropriate action to correct them.
* Timestamp (32 bits): receivers use this information to play samples at the correct intervals. Each stream has independent timestamps.
* SSRC (32 bits): uniquely identifies the source of the stream.
* CSRC: the contributing sources of a stream assembled from different sources are enumerated according to their source IDs.

This project does not use the extension field in the packet header, so it is not explained in this report. Once this header is added to the payload data, the packet is sent over the network to the client to be played. The table below summarizes the RTP payload types; the highlighted region is of interest in this project.

Table 2 Payload Types of RTP Packets
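The field layout listed above can be sketched in a few lines of Python using the standard struct module. The helper names pack_rtp_header and unpack_rtp_header are hypothetical, and payload type 96 is an assumed dynamic payload type chosen for illustration, not a value stated in the report. The sketch covers only the fixed 12-byte header, with no padding, extension or CSRC entries.

```python
# Illustrative sketch: pack and unpack the fixed 12-byte RTP header.
# pack_rtp_header / unpack_rtp_header are hypothetical helpers, and payload
# type 96 is an assumed dynamic value, not one taken from the report.
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build a header with no padding, no extension and no CSRC entries."""
    byte0 = RTP_VERSION << 6                      # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(header):
    """Decode the first 12 bytes of an RTP packet into named fields."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", header[:12])
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 0x01,
        "extension": (byte0 >> 4) & 0x01,
        "csrc_count": byte0 & 0x0F,
        "marker": byte1 >> 7,
        "payload_type": byte1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    header = pack_rtp_header(96, seq=1, timestamp=90000, ssrc=0x12345678, marker=True)
    print(len(header), unpack_rtp_header(header))
```

Masking the sequence number to 16 bits makes it wrap around naturally, which is how a receiver expects the counter to behave when detecting loss and reordering.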
2.3.3. RTP Control Protocol (RTCP)

RTCP is a sister protocol which is used in conjunction with RTP. It provides out-of-band statistical and control information for the RTP session. This provides a certain Quality of Service (QoS) for the transmission of video data over the network.

The primary functions of RTCP are:
* To gather statistical information about the quality of the media stream during an RTP session. This data is sent to the session media source and its participants. The source can exploit this information for adaptive media encoding and to detect transmission errors.
* To provide canonical end point identifiers (CNAME) to all its session participants. This allows unique identification of end points across different application instances and serves as a third-party monitoring tool.
* To send RTCP reports to all its session participants. Doing so increases the traffic bandwidth proportionally. In order to avoid congestion, RTCP has bandwidth management techniques to use only 5% of the total session bandwidth.

RTCP statistical data is sent on odd-numbered ports. For instance, if the RTP port number is 196, then RTCP uses 197 as its port number. There is no default port number assigned to RTCP.

RTCP Message Types

RTCP sends several types of packets that are different from RTP packets. They are sender report, receiver report, source description and bye.

Sender Report (SR): Sent periodically by senders to report the transmission and reception statistics of RTP packets sent in a period of time. It also includes the sender's SSRC and the sender's packet count. The timestamp of the RTP packet is also sent to allow the receiver to synchronize the RTP packets. The bandwidth required for SR is 25% of the RTCP bandwidth.

Receiver Report (RR): Reports the QoS to other receivers and senders. Information such as the highest sequence number received, the inter-arrival jitter of RTP packets and the fraction of packets lost further describes the QoS of the transmitted media streams.
The bandwidth required for RR is 75% of the RTCP bandwidth.

Source Description (SDES): Sends the CNAME to the session participants. Additional information such as the name and address of the owner of the source can also be sent.

End of Participation (BYE): The source sends a BYE message to indicate that it is shutting down the stream. It serves as an announcement that a particular end point is leaving the conference.

Further RTCP Considerations

This protocol is important for ensuring that QoS standards are achieved. The acceptable interval between these reports is less than one minute. In large applications, the interval may increase because of the RTCP bandwidth control mechanism, and the statistical reporting on the quality of the media stream then becomes inaccurate.

Since no long delays are introduced between the reports in this project, RTCP is adopted to incorporate a certain level of QoS when streaming H.264/AVC video over the embedded platform.

2.3.4. Session Description Protocol (SDP)

The Session Description Protocol is a standard for describing streaming media initialization parameters. These parameters describe sessions for session announcement, session invitation and parameter negotiation. This protocol can be used together with RTSP. As described in the previous sections of this chapter, SDP is used in the DESCRIBE state of RTSP to get the session's media initialization parameters. SDP is extensible to include different media types and formats.

SDP Syntax

The session is described by attribute/value pairs. The syntax of SDP is summarized below.

In this project, the use of SDP is important for streaming because the client is VLC Media Player. If the streaming is done via RTSP, VLC expects an SDP description from the server in order to set up the session and facilitate the playback of the streaming media.

Chapter 3 Hardware Literature Review

3.1. Introduction to the Texas Instruments DM6446EVM DaVinci™

The development of this project is based on the DM6446EVM board.
It is necessary to understand the hardware and software aspects of this board. The DM6446 board has an ARM processor operating at a clock speed of up to 300 MHz and a C64x Digital Signal Processor operating at a clock speed of up to 600 MHz.

3.1.1. Key Features of DM6446

The key features shown above are:
* 1 video port which supports composite or S-Video
* 4 video DAC outputs: component, RGB, composite
* 256 MB of DDR2 DRAM
* UART, Media Card port (SD, xD, SM, MS, MMC cards)
* 16 MB of non-volatile Flash memory, 64 MB NAND Flash, 4 MB SRAM
* USB2 interface
* 10/100 Mbps Ethernet interface
* Configurable boot load options
* IR remote interface, real-time clock via MSP430

3.1.2. DM6446EVM Architecture

The architecture of the DM6446 board is organized into several subsystems. By knowing the architecture of the DM6446, the developer can design and build an application module on the board's underlying architecture.

The DM6446 has three subsystems, which are connected to the underlying hardware peripherals. This provides a decoupled architecture which allows developers to implement their applications on a particular subsystem without having to modify the other subsystems. Some of the subsystems are discussed in the next sections.

ARM Subsystem

The ARM subsystem is responsible for the master control of the DM6446 board. It handles the system-level initialization, configuration, user interface, connectivity functions and control of the DSP subsystem. The ARM has a larger program memory space and better context switching capabilities, and hence it is better suited to handling the complex and multiple tasks of the system.

DSP Subsystem

The DSP subsystem is mainly responsible for encoding the raw captured video frames into the desired format. It performs the number-crunching operations needed to achieve the desired compression.
It works together with the Video Imaging Coprocessor to compress the video frames.Video Imaging Coprocessor (VICP)The VICP is a signal processing library which contains various software algorithms that execute on VICP hardware accelerator. It helps the DSP by taking over computation of varied intensive tasks. Since hardware implementation of number cruH.264 Video Streaming System on Embedded PlatformH.264 Video Streaming System on Embedded PlatformABSTRACTThe adoption of technological products like digital television and video conferencing has made video streaming an active research area.This report presents the integration of a video streamer module into a baseline H.264/AVC encoder running a TMSDM6446EVM embedded platform. The main objective of this project is to achieve real-time streaming of the baseline H.264/AVC video over a local area network (LAN) which is a part of the surveillance video system.T he encoding of baseline H.264/AVC and the hardware components of the platform are first discussed. Various streaming protocols are studied in order to implement the video streamer on the DM6446 board. The multi-threaded application encoder program is used to encode raw video frames into H.264/AVC format onto a file. For the video streaming, open source Live555 MediaServer was used to stream video data to a remote VLC client over LAN.Initially, file streaming was implemented from PC to PC. Upon successfully implementation on PC, the video streamer was ported to the board. The steps involved in porting the Live555 application were also described in the report. Both unicast and multicast file streaming were implemented in the video streamer.Due to the problems of file streaming, the live streaming approach was adopted. Several methodologies were discussed in integrating the video streamer and the encoder program. Modification was made both the encoder program and the Live555 applicatio n to achieve live streaming of H.264/AVC video. 
Results of both file and live streaming are shown in this report. The implemented video streamer module will be used as a base module of the video surveillance system.

Chapter 1 Introduction

1.1. Background

Significant breakthroughs have been made over the last few years in the area of digital video compression technologies. As a result, applications making use of these technologies have become prevalent and continue to be active research topics today. For example, digital television and video conferencing are applications that are now commonly encountered in our daily lives. One application of interest here is a video camera surveillance system which can enhance the security of consumers' business and home environments.

In typical surveillance systems, the captured video is sent over a cable network to be monitored and stored at remote stations. As the captured raw video contains a large amount of data, it is advantageous to compress the data before it is transferred over the network. One compression technique that is suitable for this type of application is the H.264 coding standard.

H.264 is better suited to video streaming than other coding techniques because it is more robust to data loss and offers higher coding efficiency, both of which are important when streaming over a shared Local Area Network. Given the increasing acceptance of H.264 coding and the availability of embedded systems with high computing power, a digital video surveillance system based on H.264 on an embedded platform is a feasible and potentially more cost-effective system.

Implementing an H.264 video streaming system on an embedded platform is a logical extension of video surveillance systems, which are still typically implemented using high computing power stations (e.g. PC).
In an embedded version, a Digital Signal Processor (DSP) forms the core of the embedded system and executes the intensive signal processing algorithms. Current embedded systems typically also include network features which enable the implementation of data streaming applications. To facilitate data streaming, a number of network protocol standards have also been defined and are currently used for digital video applications.

1.2. Objective and Scope

The objective of this final year project is to implement a video surveillance system based on the H.264 coding standard running on an embedded platform. Such a system covers an extensive range of functionality and would require an extensive amount of development time if implemented from scratch. Hence this project focuses on the data streaming aspect of a video surveillance system.

After some initial investigation and experimentation, it was decided to confine the main scope of the project to developing a live streaming H.264 based video system running on a DM6446 EVM development platform. The work, to be performed progressively, is broken down as follows:

1. Familiarization with the open source live555 streaming media server
Due to the complexity of implementing the various standard protocols needed for multimedia streaming, the live555 media server program is used as a base to implement the streaming of the H.264 based video data.

2. Streaming of a stored H.264 file over the network
The live555 program is then modified to support streaming of a raw encoded H.264 file from the DM6446 EVM board over the network. Knowledge of the H.264 coding standard is necessary in order to parse the file stream before streaming it over the network.

3. Modifying a demo version of an encoder program and integrating it with live555 to achieve live streaming
The demo encoder was modified to send encoded video data to the Live555 program, which performs the necessary packetization for streaming over the network.
Since data is passed from one process to another, various inter-process communication techniques were studied and used in this project.

1.3. Resources

The resources used for this project are as follows:

1. DM6446 (DaVinci) Evaluation Module
2. SWANN C500 Professional CCTV Camera Solution 400 TV Lines CCD Color Camera
3. LCD Display
4. IR Remote Control
5. TI DaVinci demo version of MontaVista Linux Pro v4.0
6. A personal workstation with CentOS v5.0
7. VLC player v0.9.8a as client
8. Open source live555 program (downloaded from www.live555.com)

The system setup of this project is shown below.

1.4. Report Organization

This report consists of 8 chapters.

Chapter 1 introduces the motivation behind an embedded video streaming system and defines the scope of the project.

Chapter 2 reviews the literature on the H.264/AVC video coding technique and the various streaming protocols which are implemented in the project.

Chapter 3 reviews the hardware platform used in the project. The architecture, memory management, inter-process communication and the software tools are also discussed in this chapter.

Chapter 4 explains the execution of the encoder program on the DM6446EVM board. The interaction of the various threads in this multi-threaded application is also discussed in order to fully understand the encoder program.

Chapter 5 gives an overview of the Live555 MediaServer, which is used as a base to implement the video streamer module on the board. Adding support for unicast and multicast streaming, porting live555 to the board and receiving the video stream on a remote VLC client are explained in this chapter.

Chapter 6 explains the limitations of file streaming and the move towards a live streaming system.
Various integration methodologies and modifications to both the encoder program and the live555 program are shown as well.

Chapter 7 summarizes the implementation results of file and live streaming and analyses their performance.

Chapter 8 gives the conclusion, stating the current limitations and problems and the scope for future implementation.

Chapter 2 Video Literature Review

2.1. H.264/AVC Video Codec Overview

H.264 is the most advanced and latest video coding technique. Although there are many video coding schemes such as H.26x and MPEG, H.264/AVC introduces many improvements and tools for coding efficiency and error resiliency. This chapter briefly discusses the network aspect of the video coding technique. It also covers the error resiliency needed for transmission of video data over the network. For a more detailed explanation of H.264/AVC, refer to Appendix A.

2.1.1. Network Abstraction Layer (NAL)

The aim of the NAL is to ensure that the data coming from the Video Coding Layer (VCL) is network worthy so that it can be used by numerous systems. The NAL facilitates the mapping of H.264/AVC VCL data to different transport layers such as

* RTP/IP real-time streaming over wired and wireless media
* Different storage file formats such as MP4, MMS and AVI

The concepts of the NAL and the error robustness techniques of H.264/AVC are discussed in the following parts of the report.

NAL Units

The encoded data from the VCL are packed into NAL units. A NAL unit represents a packet which is made up of a certain number of bytes. The first byte of the NAL unit is called the header byte and indicates the data type of the NAL unit. The remaining bytes make up the payload data of the NAL unit.

The NAL unit structure provides for two kinds of transport systems, namely packet-oriented and bit stream-oriented. To cater for bit stream-oriented transport systems like MPEG-2, the NAL units are organized into byte stream format.
These units are prefixed by a specific three-byte start code prefix, namely 0x000001. The start code prefix indicates the start of each NAL unit and hence defines the boundaries of the units.

For packet-oriented transport systems, the encoded video data are transported via packets defined by the transport protocols. Hence, the boundaries of the NAL units are known without having to include the start code prefix. The details of the packetization of NAL units are discussed in later sections of the report.

NAL units are further categorized into two types:

* VCL units, which comprise encoded video data
* Non-VCL units, which comprise additional information such as parameter sets (the important header information) and supplemental enhancement information (SEI), which contains timing information and other data that increase the usability of the decoded video signal

Access units

A group of NAL units which adheres to a certain form is called an access unit. When one access unit is decoded, one decoded picture is formed. In Table 1 below, the functions of the NAL units derived from the access units are explained.

Data/Error robustness techniques

H.264/AVC has several techniques to mitigate error/data loss, which is an essential quality for streaming applications. The techniques are as follows:

* Parameter sets: contain information that applies to a large number of VCL NAL units. There are two kinds of parameter sets:
  * Sequence Parameter Set (SPS): information pertaining to a sequence of encoded pictures
  * Picture Parameter Set (PPS): information pertaining to one or more individual pictures
  These parameters hardly change, so they need not be transmitted repeatedly, which saves overhead. The parameter sets can be sent in-band, carried in the same channel as the VCL NAL units, or out-of-band using a reliable transport protocol. This enhances the resiliency towards data loss and errors.
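
The start code framing and the NAL header byte described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the parser used in the project: the function names and the sample bytes are hypothetical, while the header bit layout (1-bit forbidden_zero_bit, 2-bit nal_ref_idc, 5-bit nal_unit_type) and the type values 7 (SPS) and 8 (PPS) follow the H.264 specification.

```python
from typing import Dict, List

START_CODE = b"\x00\x00\x01"  # three-byte start code prefix of the byte stream format

# A few nal_unit_type values defined by the H.264 specification.
NAL_TYPE_NAMES = {1: "non-IDR slice", 5: "IDR slice", 6: "SEI", 7: "SPS", 8: "PPS"}


def split_nal_units(stream: bytes) -> List[bytes]:
    """Split a byte stream into NAL units at each 0x000001 start code prefix."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # A four-byte start code (0x00000001) leaves one zero byte at the end of
        # the previous unit; a NAL unit never ends in 0x00, so stripping is safe.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


def parse_nal_header(unit: bytes) -> Dict[str, int]:
    """Decode the one-byte NAL unit header."""
    header = unit[0]
    return {
        "forbidden_zero_bit": header >> 7,
        "nal_ref_idc": (header >> 5) & 0x03,
        "nal_unit_type": header & 0x1F,
    }


if __name__ == "__main__":
    # Hypothetical sample: an SPS, a PPS and an IDR slice with dummy payloads.
    sample = (b"\x00\x00\x00\x01\x67\x42\x00\x1e"
              b"\x00\x00\x01\x68\xce\x38\x80"
              b"\x00\x00\x01\x65\x88\x84")
    for unit in split_nal_units(sample):
        nal_type = parse_nal_header(unit)["nal_unit_type"]
        print(nal_type, NAL_TYPE_NAMES.get(nal_type, "other"), len(unit), "bytes")
```

A streamer working on a stored file would typically use a split of this kind to find unit boundaries before handing each unit to the packetizer, since packet-oriented transport does not carry the start code prefix.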
* Flexible Macroblock Ordering (FMO): FMO maps the macroblocks to different slice groups. In the event of the loss of any slice group, the missing data is concealed by interpolating from the other slice groups.
* Redundant Slices (RS): A redundant representation of the picture can be stored in the redundant slices. If the original slice is lost, the decoder can make use of the redundant slices to recover it.

These techniques make the H.264/AVC codec more robust and resilient towards data loss and errors.

2.1.2. Profiles and Levels

A profile of a codec is defined as the set of features identified to meet the specifications of its intended applications. For the H.264/AVC codec, it is defined as a set of features identified to generate a conforming bit stream. A level imposes restrictions on some key parameters of the bit stream.

In H.264/AVC, there are three profiles, namely Baseline, Main and Extended. Figure 5 shows the relationship between these profiles. The Baseline profile is the most likely to be used by network cameras and encoders as it requires limited computing resources. It is well suited to supporting real-time streaming applications on an embedded platform.

2.2. Overview of Video Streaming

In earlier systems, accessing video data across a network used the download-and-play approach. In this approach, the client had to wait until the whole video was downloaded to the media player before play out began. To combat the long initial play out delay, the concept of streaming was introduced.

Streaming allows the client to play out the earlier part of the video data whilst the remaining part is still being transferred. The major advantage of streaming is that the video data need not be stored on the client's computer, unlike the traditional download-and-play approach. This reduces the long initial play out delay experienced by the client.

Streaming adopts the traditional client/server model.
The client connects to the listening server and requests video data. The server sends the video data to the client for play out.

2.2.1. Types of Streaming

There are three different types of video streaming: pre-recorded/file streaming, live/real-time streaming and interactive streaming.

* Pre-recorded/file streaming: The encoded video is stored in a file and the system streams the file over the network. A major drawback is the long initial play out delay (10-15 s) experienced by the client.
* Live/real-time streaming: The encoded video is streamed over the network directly without being stored in a file. The initial play out delay is reduced. Care must be taken to ensure that the play out rate does not exceed the sending rate, which may result in a jerky picture. On the other hand, if the sending rate is too slow, the packets arriving at the client may be dropped, causing the picture to freeze. The timing requirement for the end-to-end delay is more stringent in this scenario.
* Interactive streaming: Like live streaming, the video is streamed directly over the network. The user can issue control inputs such as rewind, pause, stop, play and forward for a particular video stream, and the system should respond in accordance with those inputs.

In this project, both pre-recorded and live streaming are implemented. Some interactive streaming controls, such as stop and play, are also part of the system.

2.2.2. Video Streaming System Modules

Video Source

The purpose of the video source is to capture the raw video sequence. The CCTV camera is used as the video source in this project. Most cameras provide analogue signals, which are fed to the encoding station via video connections. This project makes use of only one video source due to the limited video connections on the encoding station.
The raw video sequence is then passed on to the encoding station.

Encoding Station

The encoding station digitizes the raw video sequence and encodes it into the desired format. In the actual system, the encoding into the H.264/AVC format is done by the DM6446 board. Since the hardware encoding is computationally intensive, it forms the bottleneck of the whole streaming system. The H.264 video is passed on to the video streaming server module of the system.

Video Streaming and WebServer

The role of the video streaming server is to packetize the H.264/AVC video to be streamed over the network. It serves the requests from individual clients and needs to support the total bandwidth requirements of the video streams requested by clients. The WebServer offers a URL link which connects to the video streaming server. For this project, the video streaming server module is embedded inside the DM6446 board and serves every individual client's requests.

Video Player

The video player acts as a client, connecting to and requesting video data from the video streaming server. Once the video data is received, the video player buffers the data for a while and then begins play out. The video player used for this project is the VideoLAN (VLC) Player. It has the relevant H.264/AVC codec so that it can decode and play the H.264/AVC video data.

2.2.3. Unicast vs Multicast

There are two key delivery techniques employed in streaming media distribution.

Unicast transmission is the sending of data to one particular network destination host over a packet switched network. It establishes a two-way point-to-point connection between client and server. The client communicates directly with the server via this connection. The drawback is that every connection receives a separate video stream, which uses up network bandwidth rapidly.

Multicast transmission is the sending of only one copy of the data via the network so that many clients can receive it simultaneously.
In video streaming, it is more cost effective to send a single copy of the video data over the network so as to conserve network bandwidth. Since multicast is not connection oriented, the clients cannot control the streams that they receive.

In this project, unicast transmission is used to stream encoded video over the network. The client connects directly to the DM6446 board, from which it gets the encoded video data. The project can easily be extended to multicast transmission.

2.3. Streaming Protocols

When streaming video content over a network, a number of network protocols are used. These protocols are defined by the Internet Engineering Task Force (IETF) and the Internet Society (ISOC) and documented in Request for Comments (RFC) documents. These standards are adopted by many developers today.

In this project, the same standards are employed in order to stream H.264/AVC content over a simple Local Area Network (LAN). The following sections discuss the various protocols that were studied in the course of this project.

2.3.1. Real-Time Streaming Protocol (RTSP)

The most commonly used application layer protocol for streaming control is RTSP. RTSP acts as a control protocol for media streaming servers. It establishes a connection between two end points of the system and controls media sessions. Clients issue VCR-like commands such as play and pause to control the real-time playback of media streams from the servers. However, this protocol is not involved in the transport of the media stream itself. For this project, RTSP version 1.0 is used.

RTSP States

Like the Hyper Text Transfer Protocol (HTTP), RTSP contains several methods: OPTIONS, DESCRIBE, SETUP, PLAY, PAUSE, RECORD and TEARDOWN. These commands are sent using the RTSP URL. The default port number used by this protocol is 554. An example of such a URL is rtsp://

* OPTIONS: An OPTIONS request returns the types of request that the server will accept.
An example of the request is

OPTIONS rtsp://155.69.148.136:554/test.264 RTSP/1.0
CSeq: 1
User-Agent: VLC media player

The CSeq parameter keeps track of the number of requests sent to the server and is incremented every time a new request is issued. The User-Agent header identifies the client making the request.

* DESCRIBE: This method retrieves the description of the presentation or media object identified by the request URL from the server. An example of such a request is

DESCRIBE rtsp://155.69.148.138:554/test.264 RTSP/1.0
CSeq: 2
Accept: application/sdp
User-Agent: VLC media player

The Accept header describes the formats understood by the client. All the initialization information of the media resource must be present in the description returned for the DESCRIBE method.

* SETUP: This method specifies the transport mechanism to be used for the media stream. A typical example is

SETUP rtsp://155.69.148.138:554/test.264 RTSP/1.0
CSeq: 3
Transport: RTP/AVP;unicast;client_port=1200-1201
User-Agent: VLC media player

The Transport header specifies the transport mechanism to be used. In this case, the Real-time Transport Protocol is used in a unicast manner. The client port pair on which the client will receive the stream is also given; it is chosen by the client. Since RTSP is a stateful protocol, a session is created upon successful acknowledgement of this method.

* PLAY: This method requests the server to start sending data via the transport mechanism stated in the SETUP method. The request has the same form as the other methods except for

Session: 6
Range: npt=0.000-

The Session header specifies the unique session id. This is important as the server may establish various sessions and this id keeps track of them. The Range header positions the play time at the beginning and plays until the end of the range.

* PAUSE: This method informs the server to pause sending of the media stream. Once the PAUSE request is sent, the Range header captures the position at which the media stream is paused.
When a PLAY request is sent again, the client resumes playing from the current position of the media stream as specified in the Range header.

RTSP Status Codes

Whenever the client sends a request message to the server, the server forms an equivalent response message to be sent to the client. The response codes are similar to those of HTTP, as both are in ASCII text. They are as follows:

200 OK
301 Moved Permanently
405 Method Not Allowed
451 Parameter Not Understood
454 Session Not Found
457 Invalid Range
461 Unsupported Transport
462 Destination Unreachable

These are some of the RTSP status codes. There are many others, but the codes mentioned above are the ones of importance in the context of this project.

2.3.2. Real-time Transport Protocol (RTP)

RTP defines a packet structure which is used for transporting media streams over the network. It is regarded as a transport protocol, although developers often view it as part of the application layer protocol stack. This protocol facilitates jitter compensation and the detection of out-of-sequence arrival of data, which is common for transmission over IP networks. For the transmission of media data over the network, it is important that packets arrive in a timely manner, as media playback is loss tolerant but not delay tolerant. Due to the high latency of the Transmission Control Protocol (TCP) in establishing connections, RTP is often built on top of the User Datagram Protocol (UDP). RTP also supports multicast transmission of data.

RTP is also a stateful protocol, as a session is established before data can be packed into RTP packets and sent over the network. The session contains the IP address of the destination and the port number for RTP, which is usually an even number. The following section explains the packet structure of RTP which is used for transmission.

RTP Packet Structure

The figure below shows an RTP packet header, which is prepended to the media data.

The minimum size of the RTP header is 12 bytes. Optional extension information may be present after the header information.
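
As a rough illustration of the fixed 12-byte header, the sketch below packs and unpacks it with Python's struct module. It is a minimal sketch under stated assumptions rather than the Live555 implementation: the function names are hypothetical, no CSRC entries or extension are included, and payload type 96 is assumed here only because H.264 is carried on a dynamic payload type (96 to 127) that is negotiated per session.

```python
import struct

RTP_VERSION = 2
H264_PAYLOAD_TYPE = 96  # assumption: a dynamic payload type chosen for H.264


def build_rtp_header(seq: int, timestamp: int, ssrc: int,
                     marker: bool = False,
                     payload_type: int = H264_PAYLOAD_TYPE) -> bytes:
    """Pack the fixed RTP header: V=2, P=0, X=0, CC=0, then M, PT, seq, ts, SSRC."""
    byte0 = RTP_VERSION << 6  # padding, extension and CSRC count are all zero
    byte1 = (0x80 if marker else 0x00) | (payload_type & 0x7F)
    # "!" selects network byte order with no alignment padding: 1+1+2+4+4 = 12 bytes.
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet: bytes) -> dict:
    """Unpack the first 12 bytes of an RTP packet into named fields."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 0x1,
        "extension": (byte0 >> 4) & 0x1,
        "csrc_count": byte0 & 0x0F,
        "marker": byte1 >> 7,
        "payload_type": byte1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    header = build_rtp_header(seq=1, timestamp=90000, ssrc=0x12345678, marker=True)
    print(len(header), parse_rtp_header(header))
```

A sender would place the payload directly after these 12 bytes and increment the sequence number by one for each packet, which is what lets the receiver detect loss and reordering.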
The fields of the header are:

* V (Version) (2 bits): indicates the version number of the protocol. The version used in this project is 2.
* P (Padding) (1 bit): indicates whether there is padding, which can be used by encryption algorithms.
* X (Extension) (1 bit): indicates whether there is extension information between the header and the payload data.
* CC (CSRC Count) (4 bits): indicates the number of CSRC identifiers.
* M (Marker) (1 bit): used by the application to indicate that the data has specific relevance from the application's perspective. In this project, setting the M bit marks the end of the video data.
* PT (Payload Type) (7 bits): indicates the type of payload data carried by the packet. H.264 is used for this project.
* Sequence number (16 bits): incremented by one for every RTP packet. It is used to detect packet loss and out-of-sequence packet arrival. Based on this information, the application can take appropriate action to correct them.
* Time Stamp (32 bits): receivers use this information to play samples at the correct intervals of time. Each stream has independent time stamps.
* SSRC (32 bits): uniquely identifies the source of the stream.
* CSRC: when a stream combines several sources, the contributing sources are enumerated by their source IDs.

This project does not involve the use of the Extension field in the packet header, and hence it is not explained in this report. Once this header information is prepended to the payload data, the packet is sent over the network to the client to be played. The table below summarizes the payload types of RTP; the highlighted region is of interest in this project.

Table 2 Payload Types of RTP Packets

2.3.3. RTP Control Protocol (RTCP)

RTCP is a sister protocol which is used in conjunction with RTP. It provides out-of-band statistical and control information for the RTP session.
This supports a certain level of Quality of Service (QoS) for the transmission of video data over the network.

The primary functions of RTCP are:

* To gather statistical information about the quality of the media stream during an RTP session. This data is sent to the session media source and its participants. The source can exploit this information for adaptive media encoding and to detect transmission errors.
* To provide canonical end point identifiers (CNAME) to all its session participants. This allows unique identification of end points across different application instances and serves as a third party monitoring tool.
* To send RTCP reports to all its session participants. By doing so, the traffic bandwidth increases proportionally. In order to avoid congestion, RTCP has bandwidth management techniques that limit it to 5% of the total session bandwidth.

RTCP statistical data is sent on odd numbered ports. For instance, if the RTP port number is 196, then RTCP will use 197 as its port number. There is no default port number assigned to RTCP.

RTCP Message Types

RTCP sends several types of packets, which are different from RTP packets. They are sender report, receiver report, source description and bye.

* Sender Report (SR): Sent periodically by senders to report the transmission and reception statistics of the RTP packets sent in a period of time. It also includes the sender's SSRC and the sender's packet count information. The timestamp of the RTP packet is also sent to allow the receiver to synchronize the RTP packets. The bandwidth required for SR is 25% of the RTCP bandwidth.
* Receiver Report (RR): Reports the QoS to other receivers and senders. Information such as the highest sequence number received, the inter-arrival jitter of RTP packets and the fraction of packets lost further describes the QoS of the transmitted media streams. The bandwidth required for RR is 75% of the RTCP bandwidth.
* Source Description (SDES): Sends the CNAME to its session participants.
Additional information such as the name and address of the owner of the source can also be sent.
* End of Participation (BYE): The source sends a BYE message to indicate that it is shutting down the stream. It serves as an announcement that a particular end point is leaving the conference.

Further RTCP Consideration

This protocol is important to ensure that QoS standards are achieved. Reports should be sent at intervals of less than one minute. In large applications, the interval between reports may grow because of the RTCP bandwidth control mechanism, and the statistical reporting on the quality of the media stream then becomes inaccurate.

Since no long delays are introduced between the reports in this project, RTCP is adopted to incorporate a certain level of QoS when streaming H.264/AVC video over the embedded platform.

2.3.4. Session Description Protocol (SDP)

The Session Description Protocol is a standard for describing streaming media initialization parameters. These initializations describe the sessions for session announcement, session invitation and parameter negotiation. This protocol can be used together with RTSP. As seen in the previous sections of this chapter, SDP is used in the DESCRIBE state of RTSP to get the session's media initialization parameters. SDP is scalable to include different media types and formats.

SDP Syntax

The session is described by attribute/value pairs. The syntax of SDP is summarized below.

In this project, the use of SDP is important in streaming as the client is the VLC Media Player. If the streaming is done via RTSP, VLC expects an SDP description from the server in order to set up the session and facilitate the playback of the streaming media.

Chapter 3 Hardware Literature Review

3.1. Introduction to Texas Instruments DM6446EVM DaVinci(TM)

The development of this project is based on the DM6446EVM board. It is necessary to understand the hardware and software aspects of this board.
The DM6446 board has an ARM processor operating at a clock speed of up to 300 MHz and a C64x Digital Signal Processor operating at a clock speed of up to 600 MHz.

3.1.1. Key Features of DM6446

The key features shown in the figure above are:

* 1 video port which supports composite or S-Video
* 4 video DAC outputs: component, RGB, composite
* 256 MB of DDR2 DRAM
* UART, Media Card interface (SD, xD, SM, MS, MMC cards)
* 16 MB of non-volatile Flash memory, 64 MB NAND Flash, 4 MB SRAM
* USB2 interface
* 10/100 Mbps Ethernet interface
* Configurable boot load options
* IR Remote Interface, real time clock via MSP430

3.1.2. DM6446EVM Architecture

The architecture of the DM6446 board is organized into several subsystems. By knowing the architecture of the DM6446, the developer can design and build an application module on the board's underlying architecture.

The architecture figure shows that the DM6446 has three subsystems, which are connected to the underlying hardware peripherals. This provides a decoupled architecture which allows developers to implement their applications on a particular subsystem without having to modify the other subsystems. Some of the subsystems are discussed in the next sections.

ARM Subsystem

The ARM subsystem is responsible for the master control of the DM6446 board. It handles the system-level initialization, configuration, user interface, connectivity functions and control of the DSP subsystem. The ARM has a larger program memory space and better context switching capabilities, and hence it is better suited to handling the complex and multiple tasks of the system.

DSP Subsystem

The DSP subsystem is mainly responsible for encoding the raw captured video frames into the desired format. It performs several number crunching operations in order to achieve the desired compression. It works together with the Video Imaging Coprocessor to compress the video frames.

Video Imaging Coprocessor (VICP)

The VICP is a hardware accelerator with an associated signal processing library, which contains various software algorithms that execute on the accelerator. It helps the DSP by taking over the computation of various intensive tasks. Since hardware implementation of number cru