Tristan Matthews	0a329cc	2013-07-17 13:20:14 -0400	[diff] [blame]	1	/* $Id$ */
				2	/*
				3	* Copyright (C) 2008-2011 Teluu Inc. (http://www.teluu.com)
				4	*
				5	* This program is free software; you can redistribute it and/or modify
				6	* it under the terms of the GNU General Public License as published by
				7	* the Free Software Foundation; either version 2 of the License, or
				8	* (at your option) any later version.
				9	*
				10	* This program is distributed in the hope that it will be useful,
				11	* but WITHOUT ANY WARRANTY; without even the implied warranty of
				12	* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
				13	* GNU General Public License for more details.
				14	*
				15	* You should have received a copy of the GNU General Public License
				16	* along with this program; if not, write to the Free Software
				17	* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
				18	*/
				19
				20
				21	/**
				22
				23	@defgroup nat_intro Introduction to Network Address Translation (NAT) and NAT Traversal
				24	@brief This page describes NAT, the problems it causes, and the available solutions
				25
				26
				27
				28	\section into Introduction to NAT
				29
				30
				31	NAT (Network Address Translation) is a mechanism where a device performs
				32	modifications to the TCP/IP address/port number of a packet and maps the
				33	IP address from one realm to another (usually from private IP address to
				34	public IP address and vice versa). This works by the NAT device allocating
				35	a temporary port number on the public side of the NAT upon forwarding
				36	an outbound packet from the internal host towards the Internet, maintaining
				37	this mapping for some predefined time, and forwarding the inbound packets
				38	received from the Internet on this public port back to the internal host.
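
The mapping step described above can be sketched as a small lookup table.
This is only an illustration under our own assumptions, not how any particular
NAT device is implemented: the names (nat_table, nat_outbound, nat_inbound),
the table size and the starting port are all hypothetical, and the binding
timer is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAT_MAX_BINDINGS 64      // hypothetical table size
#define NAT_FIRST_PORT   40000   // hypothetical start of the public port pool

// One binding: internal address/port <-> port on the public side of the NAT.
typedef struct {
    uint32_t int_addr;   // internal (private) IPv4 address, host byte order
    uint16_t int_port;   // internal source port
    uint16_t pub_port;   // port allocated on the public side
    int      in_use;
} nat_binding;

typedef struct {
    nat_binding entries[NAT_MAX_BINDINGS];
    uint16_t    next_port;
} nat_table;

void nat_init(nat_table *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < NAT_MAX_BINDINGS; i++)
        t->entries[i].in_use = 0;
    t->next_port = NAT_FIRST_PORT;
}

// Outbound packet: reuse the existing binding or allocate a new public port.
// Returns the public port, or 0 when the table is full.
uint16_t nat_outbound(nat_table *t, uint32_t int_addr, uint16_t int_port)
{
    assert(t != NULL);
    nat_binding *free_slot = NULL;
    for (size_t i = 0; i < NAT_MAX_BINDINGS; i++) {
        nat_binding *b = &t->entries[i];
        if (b->in_use && b->int_addr == int_addr && b->int_port == int_port)
            return b->pub_port;
        if (!b->in_use && free_slot == NULL)
            free_slot = b;
    }
    if (free_slot == NULL)
        return 0;
    free_slot->int_addr = int_addr;
    free_slot->int_port = int_port;
    free_slot->pub_port = t->next_port++;
    free_slot->in_use = 1;
    return free_slot->pub_port;
}

// Inbound packet on pub_port: find the internal host to forward to.
// Returns 1 and fills the outputs if a binding exists, 0 to drop the packet.
int nat_inbound(const nat_table *t, uint16_t pub_port,
                uint32_t *int_addr, uint16_t *int_port)
{
    assert(t != NULL && int_addr != NULL && int_port != NULL);
    for (size_t i = 0; i < NAT_MAX_BINDINGS; i++) {
        const nat_binding *b = &t->entries[i];
        if (b->in_use && b->pub_port == pub_port) {
            *int_addr = b->int_addr;
            *int_port = b->int_port;
            return 1;
        }
    }
    return 0;
}
```

Note that nat_inbound() can only succeed after nat_outbound() has created a
binding, which is the "mapping can only be created from the inside" property.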
				39
				40
				41	NAT devices are installed primarily to alleviate the exhaustion of IPv4
				42	address space by allowing multiple hosts to share a public/Internet address.
				43	Also, due to its mapping nature (i.e. a mapping can only be created by
				44	a transmission from an internal host), a NAT device is often
				45	installed even when IPv4 address exhaustion is not a problem (for example
				46	when there is only one host at home), to provide some sort of security/shield
				47	for the internal hosts against threats from the Internet.
				48
				49
				50	Despite the fact that NAT provides some shielding for the internal network,
				51	one must distinguish a NAT solution from a firewall solution. NAT is not
				52	a firewall solution. A firewall is a security solution designed to enforce
				53	the security policy of an organization, while NAT is a connectivity solution
				54	to allow multiple hosts to use a single public IP address. Understandably
				55	both functionalities are difficult to separate at times, since many
				56	(typically consumer) products claim to do both with the same device and
				57	simply label the device a NAT box. But we do want to make this distinction
				58	rather clear, as PJNATH is a NAT traversal helper and not a firewall bypass
				59	solution (yet).
				60
				61
				62
				63	\section problems The NAT traversal problems
				64
				65
				66	While NAT works well for typical client-server communications (such as
				67	web and email), since it's always the client that initiates the conversation
				68	and normally the client doesn't need to maintain the connection for a long time,
				69	installing a NAT causes major problems for peer-to-peer communication,
				70	such as (and especially) VoIP. These problems will be explained in more detail
				71	below.
				72
				73
				74	\subsection peer_addr Peer address problem
				75
				76
				77	In VoIP, normally we want the media (audio and video) to flow directly
				78	between the clients, since relaying is costly (both in terms of bandwidth
				79	cost for service provider, and additional latency introduced by relaying).
				80	To do this, each client informs the other client of its media transport
				81	address by sending it via the VoIP signaling path, and the other side then
				82	sends its media to this transport address.
				83
				84
				85	And there lies the problem. If the client software is not NAT-aware, then
				86	it would send its private IP address to the other client, and the other
				87	client would not be able to send media to this address.
				88
				89
				90	Traditionally this was solved by using STUN. With this mechanism, the client
				91	first finds out its public IP address/port by querying a STUN server, then
				92	sends this public address instead of its private address to the other
				93	client. When both sides are using this mechanism, they can then send media
				94	packets to these addresses, thereby creating a mapping in the NAT (also
				95	called opening a "hole", hence this mechanism is also popularly called
				96	"hole punching") and both can then communicate with each other.
				97
				98
				99	But this mechanism does not work in all cases, as will be explained below.
				100
				101
				102
				103	\subsection hairpin Hairpinning behavior
				104
				105
				106	Hairpinning is a behavior where a NAT device forwards packets from a host in
				107	the internal network (let's call it host A) back to some other host (host B) in
				108	the same internal network, when it detects that the (public IP address)
				109	destination of the packet is actually a mapped IP address that was created
				110	for the internal host (host B). This is a desirable behavior of a NAT,
				111	but unfortunately not all NAT devices support this.
				112
				113
				114	Lacking this behavior, two (internal) hosts behind the same NAT will not
				115	be able to communicate with each other if they exchange their public
				116	addresses (resolved by STUN above) with each other.
				117
				118
				119
				120	\subsection symmetric Symmetric behavior
				121
				122
				123	NAT devices don't behave uniformly and people have been trying to classify
				124	their behavior into different classes. Traditionally NAT devices are
				125	classified into Full Cone, Restricted Cone, Port Restricted Cone, and
				126	Symmetric types, according to <A HREF="http://www.ietf.org/rfc/rfc3489.txt">RFC 3489</A>
				127	section 5. A more recent method of classification, as explained by
				128	<A HREF="http://www.ietf.org/rfc/rfc4787.txt">RFC 4787</A>, divides
				129	the NAT behavioral types into two attributes: the mapping behavior
				130	attribute and the filtering behavior attribute. Each attribute can be
				131	one of three types: <i>Endpoint-Independent</i>, <i>Address-Dependent</i>,
				132	or <i>Address and Port-Dependent</i>. With this new classification method,
				133	a Symmetric NAT actually is an Address and Port-Dependent mapping NAT.
				134
				135
				136	Among these types, the Symmetric type is the hardest one to work with.
				137	The problem is that the NAT allocates a different mapping (for the same
				138	internal host) for the communication to the STUN server and the
				139	communication to the other (external) hosts, so the IP address/port that
				140	one host reports to the other is meaningless for the recipient
				141	since this is not the actual IP address/port mapping that the NAT device
				142	creates. The result is that when the recipient host tries to send a packet to
				143	this address, the NAT device drops the packet since it does not
				144	recognize the sender of the packet as an "authorized" host to send
				145	to this address.
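
To make the difference concrete, here is a toy model of the two mapping
behaviors for a single internal host. Everything in it (sym_nat, sym_nat_map,
the table size) is hypothetical and only meant as a sketch: with
Endpoint-Independent mapping the port learned from the STUN server is the same
one used towards the peer, while with Address and Port-Dependent (symmetric)
mapping it is not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYM_MAX_BINDINGS 16   // hypothetical table size

typedef enum {
    MAP_ENDPOINT_INDEPENDENT,
    MAP_ADDR_AND_PORT_DEPENDENT   // "symmetric" in RFC 3489 terms
} map_mode;

typedef struct {
    uint16_t int_port;   // internal source port
    uint32_t dst_addr;   // destination that created the binding
    uint16_t dst_port;
    uint16_t pub_port;   // port allocated on the public side
} sym_binding;

typedef struct {
    map_mode    mode;
    sym_binding b[SYM_MAX_BINDINGS];
    size_t      count;
    uint16_t    next_port;
} sym_nat;

// Returns the public port the NAT uses for a packet sent from int_port
// to dst_addr:dst_port, creating a binding if needed (0 if the table is full).
uint16_t sym_nat_map(sym_nat *n, uint16_t int_port,
                     uint32_t dst_addr, uint16_t dst_port)
{
    assert(n != NULL);
    for (size_t i = 0; i < n->count; i++) {
        const sym_binding *e = &n->b[i];
        if (e->int_port != int_port)
            continue;
        // Endpoint-independent: any destination reuses the binding.
        // Address and port-dependent: the destination must match too.
        if (n->mode == MAP_ENDPOINT_INDEPENDENT ||
            (e->dst_addr == dst_addr && e->dst_port == dst_port))
            return e->pub_port;
    }
    if (n->count == SYM_MAX_BINDINGS)
        return 0;
    sym_binding *e = &n->b[n->count++];
    e->int_port = int_port;
    e->dst_addr = dst_addr;
    e->dst_port = dst_port;
    e->pub_port = n->next_port++;
    return e->pub_port;
}
```

In the symmetric case the address advertised to the peer (the one seen by the
STUN server) never matches the binding that is created when media is actually
sent to the peer.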
				146
				147
				148	There are two solutions for this. First, we could make the client
				149	smarter by switching transmission of the media to the source address of
				150	the received media packets. This would work since normally clients use a well
				151	known trick called symmetric RTP, where they use one socket for both
				152	transmitting and receiving RTP/media packets. We also use this
				153	mechanism in PJMEDIA media transport. But this solution only works
				154	if a client behind a symmetric NAT is not communicating with another
				155	client behind either symmetric NAT or port-restricted NAT.
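
The "switch to the source address" idea can be sketched as follows. This is
not the PJMEDIA implementation; media_session and media_on_rx are hypothetical
names, and a real transport would presumably validate the incoming packets
before re-targeting (that check is our assumption and is omitted here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t addr;   // IPv4 address, host byte order
    uint16_t port;
} media_addr;

typedef struct {
    media_addr remote;      // where we currently send RTP
    unsigned   switch_cnt;  // how many times the target was changed
} media_session;

// Call for every received RTP packet with its source address.
// Returns 1 if the send target was switched to that source, 0 otherwise.
int media_on_rx(media_session *s, media_addr src)
{
    assert(s != NULL);
    if (s->remote.addr == src.addr && s->remote.port == src.port)
        return 0;
    s->remote = src;
    s->switch_cnt++;
    return 1;
}
```

The session starts with the address received in signaling (possibly a private
or stale one) and corrects itself once the first packet from the peer arrives.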
				156
				157
				158	The second solution is to use media relay, but as has been mentioned
				159	above, relaying is costly, both in terms of bandwidth cost for service
				160	provider and additional latency introduced by relaying.
				161
				162
				163
				164	\subsection binding_timeout Binding timeout
				165
				166	When a NAT device creates a binding (a public-private IP address
				167	mapping), it will associate a timer with it. The timer is used to
				168	destroy the binding once there is no activity/traffic associated with
				169	the binding. Because of this, a NAT-aware application that wishes to
				170	keep the binding open must periodically send outbound packets,
				171	a mechanism known as keep-alive, or otherwise it will ultimately
				172	lose the binding and be unable to receive incoming packets from the Internet.
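
A keep-alive scheduler can be as simple as the sketch below. The names
(keepalive, ka_on_tx, ka_is_due) are hypothetical, and the interval is
something the application has to pick so that it is shorter than the NAT's
binding timer, which is generally not known in advance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t interval_ms;  // keep-alive period chosen by the application
    uint64_t last_tx_ms;   // time of the last outbound packet
} keepalive;

// Record that an outbound packet (media or keep-alive) was sent at now_ms.
void ka_on_tx(keepalive *ka, uint64_t now_ms)
{
    assert(ka != NULL);
    ka->last_tx_ms = now_ms;
}

// Returns 1 when a keep-alive packet should be sent at now_ms.
// now_ms is assumed to come from a monotonic clock.
int ka_is_due(const keepalive *ka, uint64_t now_ms)
{
    assert(ka != NULL);
    return now_ms - ka->last_tx_ms >= ka->interval_ms;
}
```

Because any outbound packet refreshes the binding, ka_on_tx() is called for
regular traffic as well, so keep-alives are only sent during silence.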
				173
				174
				175	\section solutions The NAT traversal solutions
				176
				177
				178	\subsection stun Old STUN (RFC 3489)
				179
				180	The original STUN (Simple Traversal of User Datagram Protocol (UDP)
				181	Through Network Address Translators (NATs)) as defined by
				182	<A HREF="http://www.ietf.org/rfc/rfc3489.txt">RFC 3489</A>
				183	(published in 2003, but the work was started as early as 2001) was
				184	meant to be a standalone, standards-based solution for the NAT
				185	connectivity problems above. It is equipped with a NAT type detection
				186	algorithm and methods to hole-punch the NAT in order to let traffic
				187	get through, and has been proven to be quite successful in
				188	traversing many types of NATs, hence it has gained a lot of popularity
				189	as a simple and effective NAT traversal solution.
				190
				191	But since then the smart people at the IETF have realized that STUN alone
				192	is not going to be enough. Besides the fact that a STUN-only solution cannot
				193	handle symmetric-to-symmetric or symmetric-to-port-restricted connections,
				194	people have also discovered that NAT behavior can change for different
				195	traffic (or for the same traffic over time), so it was concluded that
				196	NAT type detection could produce unreliable results and one should not
				197	rely too much on it.
				198
				199	Because of this, STUN has since shifted to a different strategy.
				200	Instead of attempting to provide a standalone solution, it now provides
				201	a partial solution and a framework to build other (STUN-based) protocols
				202	on top of it, such as TURN and ICE.
				203
				204
				205	\subsection stunbis STUN/STUNbis (RFC 5389)
				206
				207	The Session Traversal Utilities for NAT (STUN) is the further development
				208	of the old STUN. While it still provides a mechanism for a client to
				209	query its public/mapped address from a STUN server, it has deprecated
				210	the use of NAT type detection, and now it serves as a framework to build
				211	other protocols on top of it (such as TURN and ICE).
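
For illustration, the address query boils down to sending a 20-byte Binding
Request and decoding the XOR-MAPPED-ADDRESS attribute in the response. The
sketch below follows the RFC 5389 wire format (message type 0x0001, magic
cookie 0x2112A442, 12-byte transaction ID); the function names are our own
and this is not the PJNATH API. Attribute parsing, retransmission and IPv6
are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STUN_HDR_LEN          20
#define STUN_BINDING_REQUEST  0x0001
#define STUN_MAGIC_COOKIE     0x2112A442u

// Write a Binding Request header with no attributes into buf.
// tsx_id is a caller-supplied 12-byte transaction ID (should be random).
// Returns the number of bytes written, or 0 if buf is too small.
size_t stun_build_binding_request(uint8_t *buf, size_t buf_len,
                                  const uint8_t tsx_id[12])
{
    assert(buf != NULL && tsx_id != NULL);
    if (buf_len < STUN_HDR_LEN)
        return 0;
    buf[0] = (uint8_t)(STUN_BINDING_REQUEST >> 8);
    buf[1] = (uint8_t)(STUN_BINDING_REQUEST & 0xFF);
    buf[2] = 0;   // message length: no attributes follow
    buf[3] = 0;
    buf[4] = (uint8_t)(STUN_MAGIC_COOKIE >> 24);
    buf[5] = (uint8_t)(STUN_MAGIC_COOKIE >> 16);
    buf[6] = (uint8_t)(STUN_MAGIC_COOKIE >> 8);
    buf[7] = (uint8_t)(STUN_MAGIC_COOKIE);
    memcpy(buf + 8, tsx_id, 12);
    return STUN_HDR_LEN;
}

// Decode the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute:
// reserved(1) family(1) x-port(2) x-address(4), all in network byte order.
// Returns 1 on success and fills addr/port in host byte order.
int stun_decode_xor_mapped_v4(const uint8_t val[8],
                              uint32_t *addr, uint16_t *port)
{
    assert(val != NULL && addr != NULL && port != NULL);
    if (val[1] != 0x01)   // address family 0x01 = IPv4
        return 0;
    uint16_t xport = (uint16_t)((val[2] << 8) | val[3]);
    uint32_t xaddr = ((uint32_t)val[4] << 24) | ((uint32_t)val[5] << 16) |
                     ((uint32_t)val[6] << 8)  |  (uint32_t)val[7];
    // The port is XOR-ed with the top 16 bits of the magic cookie,
    // the address with the whole cookie.
    *port = (uint16_t)(xport ^ (STUN_MAGIC_COOKIE >> 16));
    *addr = xaddr ^ STUN_MAGIC_COOKIE;
    return 1;
}
```

The decoded address/port is the public mapping that the client would then
advertise to the remote peer.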
				212
				213
				214	\subsection midcom_turn Old TURN (draft-rosenberg-midcom-turn)
				215
				216	Traversal Using Relay NAT (TURN), a standards-based effort started as early
				217	as November 2001, was meant to be the complementary method for the
				218	(old) STUN to complete the solution. The original idea was for the host to use
				219	STUN to detect the NAT type, and when it found that the NAT type was
				220	symmetric it would use TURN to relay the traffic. But as stated above,
				221	this approach was deemed to be unreliable, and now the preferred way to use
				222	TURN (and it's a new TURN specification as well) is to combine it with ICE.
				223
				224
				225	\subsection turn TURN (draft-ietf-behave-turn)
				226
				227	Traversal Using Relays around NAT (TURN) is the latest development of TURN.
				228	While the protocol details have changed a lot, the objective is still
				229	the same, that is to provide relaying control for the application.
				230	As mentioned above, preferably TURN should be used with ICE since relaying
				231	is costly in terms of both bandwidth and latency, hence it should be used
				232	as a last resort.
				233
				234
				235	\subsection b2bua B2BUA approach
				236
				237	A SIP Back-to-Back User Agent (B2BUA) is a SIP entity that sits in the
				238	middle of SIP traffic and acts as a SIP user agent on both call legs.
				239	The primary motivations to have a B2BUA are to be able to provision
				240	the call (e.g. billing, enforcing policy) and to help with NAT traversal
				241	for the clients. Normally a B2BUA would be equipped with media relaying
				242	or otherwise it wouldn't be very useful.
				243
				244	Products that fall into this category include SIP Session Border
				245	Controllers (SBCs); PBXs such as Asterisk are technically B2BUAs
				246	as well.
				247
				248	The benefit of B2BUA with regard to helping NAT traversal is that it does not
				249	require any modifications to the client to make it go through NATs.
				250	And since basically it is a relay, it should be able to traverse
				251	symmetric NAT successfully.
				252
				253	However, since it is a relay, the usual relaying drawbacks apply,
				254	namely the bandwidth and latency issues. Moreover, since a B2BUA acts
				255	as a user agent on both call legs (i.e. it terminates the SIP
				256	signaling/call on one leg, although it creates another call on the other
				257	leg), it may also introduce serious issues with end-to-end SIP signaling.
				258
				259
				260	\subsection alg ALG approach
				261
				262	Nowadays many NAT devices (such as consumer ADSL routers) are equipped
				263	with intelligence to inspect and fix VoIP traffic in an effort to help
				264	it with NAT traversal. This feature is called Application Layer
				265	Gateway (ALG) intelligence. The idea is since the NAT device knows about
				266	the mapping, it might as well try to fix the application traffic so that
				267	the traffic could better traverse the NAT. Some tricks that are
				268	performed include for example replacing the private IP addresses/ports
				269	in the SIP/SDP packet with the mapped public address/port of the host
				270	that sends the packet.
				271
				272	Despite many claims about its usefulness, in reality this has given us
				273	more problems than it fixes. Too many devices such as these break the
				274	SIP signaling and, in more advanced cases, ICE negotiation. Some
				275	examples of bad situations that we have encountered in the past:
				276
				277	 - The NAT device alters the Via address/port fields in the SIP response
				278	   message, making the response fail to pass SIP response verification
				279	   as defined by the SIP RFC.
				280	 - In other cases, the modifications in the Via headers of the SIP
				281	   response hide important information from the SIP server,
				282	   namely the actual IP address/port of the client as seen by the SIP
				283	   server.
				284	 - Modifications in the Contact URI of REGISTER request/response make
				285	   the client unable to detect its registered binding.
				286	 - Modifications in the IP addresses/ports in SDP cause ICE
				287	   negotiation to fail with ice-mismatch status.
				288	 - The complexity of the ALG processing in itself seems to have caused
				289	   the device to behave erratically with managing the address bindings
				290	   (e.g. it creates a new binding for the second packet sent by the
				291	   client, even when the previous packet was sent just a second ago, or
				292	   it just sends inbound packets to the wrong host).
				293
				294
				295	Many man-months of effort have been spent just to troubleshoot issues
				296	caused by these ALG (mal)functions, and as it adds complexity to
				297	the problem rather than solving it, in general we do not like this
				298	approach at all and would prefer it to go away.
				299
				300
				301	\subsection upnp UPnP
				302
				303	Universal Plug and Play (UPnP) is a set of protocol specifications
				304	to control network appliances, and one of its specifications covers
				305	controlling NAT devices. With this protocol, a client can instruct the
				306	NAT device to open a port on the NAT's public side and use this port
				307	for its communication. UPnP has gained popularity due to its
				308	simplicity, and one can expect it to be available on the majority of
				309	NAT devices.
				310
				311	The drawback of UPnP is that, since it uses multicast in its communication,
				312	it will only allow a client to control one NAT device that is in the
				313	same multicast domain. While this normally is not a problem in
				314	household installations (where people normally only have one NAT
				315	router), it will not work if the client is behind a cascaded router
				316	installation. Moreover, UPnP has serious security issues due to
				317	its lack of authentication, so it's probably not the preferred solution
				318	for organizations.
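
The multicast dependency comes from the discovery step: the client locates
the gateway by sending an SSDP M-SEARCH request to the multicast group
239.255.255.250 on UDP port 1900, which only reaches devices on the local
segment. The sketch below merely formats such a request; the function name
is hypothetical, and sending the packet, parsing the reply and the
subsequent port mapping request are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SSDP_MCAST_ADDR "239.255.255.250"
#define SSDP_PORT       1900

// Format an SSDP M-SEARCH request looking for an Internet Gateway Device.
// mx_seconds is the maximum response delay the client asks devices to honor.
// Returns the message length, or 0 if buf is too small.
size_t ssdp_build_msearch(char *buf, size_t buf_len, int mx_seconds)
{
    assert(buf != NULL);
    int n = snprintf(buf, buf_len,
                     "M-SEARCH * HTTP/1.1\r\n"
                     "HOST: %s:%d\r\n"
                     "MAN: \"ssdp:discover\"\r\n"
                     "MX: %d\r\n"
                     "ST: urn:schemas-upnp-org:device:InternetGatewayDevice:1\r\n"
                     "\r\n",
                     SSDP_MCAST_ADDR, SSDP_PORT, mx_seconds);
    if (n < 0 || (size_t)n >= buf_len)
        return 0;
    return (size_t)n;
}
```

A NAT that sits behind another router never sees this request from hosts
beyond its own segment, which is why cascaded installations fail.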
				319
				320	\subsection other Other solutions
				321
				322	Other solutions to NAT traversal include:
				323
				324	- SOCKS, which supports UDP protocol since SOCKS5.
				325
				326
				327
				328	\section ice ICE Solution - The Protocol that Works Harder
				329
				330	A new protocol is being standardized (it's in Working Group Last Call/WGLC
				331	stage at the time this article was written) by the IETF, called
				332	Interactive Connectivity Establishment (ICE). ICE is the ultimate
				333	weapon a client can have in its NAT traversal arsenal,
				334	as it promises that if there is indeed a path for two clients
				335	to communicate, then ICE will find this path. And if there is
				336	more than one path over which the clients can communicate, ICE will
				337	use the best/most efficient one.
				338
				339	ICE works by combining several protocols (such as STUN and TURN)
				340	and offering several candidate paths for the communication,
				341	thereby maximising the chance of success, while at the same time
				342	prioritizing the candidates, so that the more
				343	expensive alternative (namely relay) will only be used as the last
				344	resort when all else fails. The ICE negotiation process involves several
				345	stages:
				346
				347	 - candidate gathering, where the client finds out all the possible
				348	   addresses that it can use for the communication. It may find
				349	   three types of candidates: host candidate to represent its
				350	   physical NICs, server reflexive candidate for the address that
				351	   has been resolved from STUN, and relay candidate for the address
				352	   that the client has allocated from a TURN relay.
				353	 - prioritizing these candidates. Typically the relay candidate will
				354	   have the lowest priority since it's the most expensive.
				355	 - encoding these candidates, sending them to the remote peer, and
				356	   negotiating them with offer-answer.
				357	 - pairing the candidates, where it pairs every local candidate
				358	   with every remote candidate that it receives from the remote peer.
				359	 - checking the connectivity for each candidate pair.
				360	 - concluding the result. Since every possible path combination is
				361	   checked, if there is a path to communicate ICE will find it.
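
The prioritizing and pairing steps can be illustrated with the formulas that
were eventually published in RFC 5245. That the ICE draft current when this
article was written uses exactly the same numbers is an assumption on our
part, and the function names below are hypothetical, not the PJNATH API.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CAND_HOST, CAND_SRFLX, CAND_RELAY } cand_type;

// Type preferences recommended by RFC 5245: host highest, relay lowest.
static uint32_t type_pref(cand_type t)
{
    switch (t) {
    case CAND_HOST:  return 126;
    case CAND_SRFLX: return 100;
    default:         return 0;     // CAND_RELAY
    }
}

// Candidate priority:
//   2^24 * type_pref + 2^8 * local_pref + (256 - component_id)
// local_pref is 0..65535, comp_id is 1..256 (1 = RTP, 2 = RTCP).
uint32_t cand_priority(cand_type t, uint32_t local_pref, uint32_t comp_id)
{
    assert(local_pref <= 65535 && comp_id >= 1 && comp_id <= 256);
    return (type_pref(t) << 24) + (local_pref << 8) + (256 - comp_id);
}

// Pair priority, where g is the priority of the controlling agent's
// candidate and d the priority of the controlled agent's candidate:
//   2^32 * MIN(g,d) + 2 * MAX(g,d) + (g > d ? 1 : 0)
uint64_t pair_priority(uint32_t g, uint32_t d)
{
    uint64_t lo = g < d ? g : d;
    uint64_t hi = g < d ? d : g;
    return (lo << 32) + 2 * hi + (g > d ? 1 : 0);
}
```

Because the minimum of the two priorities dominates the pair priority, any
pair that involves a relay candidate is checked after the direct pairs, which
matches the "relay as the last resort" behavior described above.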
				362
				363
				364	There are many benefits of ICE:
				365
				366	 - it's standards-based.
				367	 - it works where STUN works (and more).
				368	 - unlike a standalone STUN solution, it solves the hairpinning issue,
				369	   since it also offers host candidates.
				370	 - just like relaying solutions, it works with symmetric NATs. But unlike
				371	   plain relaying, relay is only used as the last resort, thereby
				372	   minimizing the bandwidth and latency issues of relaying.
				373	 - it offers a generic framework for offering and checking address
				374	   candidates. While the ICE core standard only talks about using STUN
				375	   and TURN, implementors can add more types of candidates in the ICE
				376	   offer, for example UDP over TCP or HTTP relays, or even UPnP
				377	   candidates, and this could be done transparently for the remote
				378	   peer, hence it's compatible and usable even when the remote peer
				379	   does not support these.
				380	 - it also adds some kind of security, particularly against DoS attacks,
				381	   since a media address must be acknowledged before it can be used.
				382
				383
				384	Having said that, ICE is a complex protocol to implement, making
				385	interoperability an issue, and at the time of writing we don't see
				386	many implementations of it yet. Fortunately, PJNATH is one of
				387	the first, and hence more mature, ICE implementations, first released
				388	in mid-2007, and we have been testing our implementation at
				389	<A HREF="http://www.sipit.net">SIP Interoperability Test (SIPit)</A>
				390	events regularly, so hopefully we are one of the most stable as well.
				391
				392
				393	\section pjnath PJNATH - The building blocks for an effective NAT traversal solution
				394
				395	PJSIP NAT Helper (PJNATH) is a library which contains the implementation
				396	of standards-based NAT traversal solutions. PJNATH can be used as a
				397	stand-alone library for your software, or you may use the PJSUA-LIB library,
				398	a very high level library integrating PJSIP, PJMEDIA, and PJNATH into
				399	simple-to-use APIs.
				400
				401	PJNATH has the following features:
				402
				403	 - STUNbis implementation, providing both a ready-to-use STUN-aware socket
				404	   and a framework to implement higher level STUN-based protocols such as
				405	   TURN and ICE.
				406	 - NAT type detection, useful for troubleshooting purposes.
				407	 - TURN implementation.
				408	 - ICE implementation.
				409
				410
				411	More protocols will be implemented in the future.
				412
				413	Go back to \ref index.
				414
				415	*/