/* $Id: doc_nat.h 3553 2011-05-05 06:14:19Z nanang $ */
/*
 * Copyright (C) 2008-2011 Teluu Inc. (http://www.teluu.com)
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */


/**

@defgroup nat_intro Introduction to Network Address Translation (NAT) and NAT Traversal
@brief This page describes NAT, the problems caused by it, and the solutions.



\section into Introduction to NAT


NAT (Network Address Translation) is a mechanism where a device modifies
the TCP/IP address/port number of a packet and maps the IP address from
one realm to another (usually from a private IP address to a public IP
address and vice versa). The NAT device allocates a temporary port number
on its public side when it forwards an outbound packet from the internal
host towards the Internet, maintains this mapping for some predefined
time, and forwards inbound packets received from the Internet on this
public port back to the internal host.

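To make the mapping mechanism concrete, here is a minimal sketch of a toy
NAT table. Everything in it (the <tt>toy_nat</tt> names, the table size, the
starting port 40000) is made up for illustration, and it ignores binding
expiry and the destination of the packet, which real NAT devices track.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NAT_MAX_BINDINGS 16
#define TOY_NAT_FIRST_PORT   40000

// One binding: internal (private) address/port <-> public port.
typedef struct {
    uint32_t int_addr;   // private IPv4 address, host byte order
    uint16_t int_port;   // private source port
    uint16_t pub_port;   // port allocated on the public side
} toy_binding;

typedef struct {
    toy_binding bindings[TOY_NAT_MAX_BINDINGS];
    size_t      count;
} toy_nat;

// Outbound packet: reuse the binding for this internal address/port,
// or allocate the next public port. Returns 0 when the table is full.
uint16_t toy_nat_outbound(toy_nat *nat, uint32_t int_addr, uint16_t int_port)
{
    assert(nat != NULL);
    for (size_t i = 0; i < nat->count; ++i) {
        if (nat->bindings[i].int_addr == int_addr &&
            nat->bindings[i].int_port == int_port)
            return nat->bindings[i].pub_port;
    }
    if (nat->count == TOY_NAT_MAX_BINDINGS)
        return 0;
    toy_binding *b = &nat->bindings[nat->count];
    b->int_addr = int_addr;
    b->int_port = int_port;
    b->pub_port = (uint16_t)(TOY_NAT_FIRST_PORT + nat->count);
    nat->count++;
    return b->pub_port;
}

// Inbound packet: find which internal host owns this public port.
// Returns NULL when no binding exists, i.e. the packet is dropped.
const toy_binding *toy_nat_inbound(const toy_nat *nat, uint16_t pub_port)
{
    assert(nat != NULL);
    for (size_t i = 0; i < nat->count; ++i) {
        if (nat->bindings[i].pub_port == pub_port)
            return &nat->bindings[i];
    }
    return NULL;
}
```

Note that an inbound lookup can only succeed after an outbound packet has
created the binding, which is the property the following paragraphs rely on.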

NAT devices are installed primarily to alleviate the exhaustion of the IPv4
address space by allowing multiple hosts to share a public/Internet address.
Also, due to its mapping nature (i.e. a mapping can only be created by
a transmission from an internal host), a NAT device is often installed
even when IPv4 address exhaustion is not a problem (for example when
there is only one host at home), to provide some sort of security/shield
for the internal hosts against threats from the Internet.


Despite the fact that NAT provides some shielding for the internal network,
one must distinguish a NAT solution from a firewall solution. NAT is not
a firewall solution. A firewall is a security solution designed to enforce
the security policy of an organization, while NAT is a connectivity solution
that allows multiple hosts to use a single public IP address. Understandably,
the two functions are difficult to separate at times, since many
(typically consumer) products claim to do both in the same device and
simply label the device a "NAT box". But we do want to make this distinction
clear, as PJNATH is a NAT traversal helper and not a firewall bypass
solution (yet).



\section problems The NAT traversal problems


While NAT works well for typical client-server communications (such as
web and email), since it's always the client that initiates the conversation
and the client normally doesn't need to maintain the connection for a long
time, installing a NAT causes major problems for peer-to-peer communication,
such as (and especially) VoIP. These problems are explained in more detail
below.


\subsection peer_addr Peer address problem


In VoIP, we normally want the media (audio and video) to flow directly
between the clients, since relaying is costly (both in terms of bandwidth
cost for the service provider and the additional latency introduced by
relaying). To do this, each client informs the other client of its media
transport address by sending it via the VoIP signaling path, and the other
side sends its media to this transport address.


And there lies the problem. If the client software is not NAT aware, then
it sends its private IP address to the other client, and the other
client is not able to send media to this address.


Traditionally this was solved by using STUN. With this mechanism, the client
first finds out its public IP address/port by querying a STUN server, then
sends this public address instead of its private address to the other
client. When both sides use this mechanism, they can then send media
packets to these addresses, thereby creating a mapping in the NAT (also
called opening a "hole", hence this mechanism is also popularly called
"hole punching"), and both can then communicate with each other.


But this mechanism does not work in all cases, as will be explained below.



\subsection hairpin Hairpinning behavior


Hairpinning is a behavior where a NAT device forwards packets from a host in
the internal network (let's call it host A) back to some other host (host B)
in the same internal network, when it detects that the (public IP address)
destination of the packet is actually a mapped IP address that was created
for the internal host (host B). This is a desirable behavior of a NAT,
but unfortunately not all NAT devices support it.


Lacking this behavior, two (internal) hosts behind the same NAT will not
be able to communicate with each other if they exchange their public
addresses (resolved by STUN as described above) with each other.



\subsection symmetric Symmetric behavior


NAT devices don't behave uniformly, and people have been trying to classify
their behavior into different classes. Traditionally NAT devices are
classified into Full Cone, Restricted Cone, Port Restricted Cone, and
Symmetric types, according to <A HREF="http://www.ietf.org/rfc/rfc3489.txt">RFC 3489</A>
section 5. A more recent method of classification, as explained by
<A HREF="http://www.ietf.org/rfc/rfc4787.txt">RFC 4787</A>, describes
NAT behavior with two attributes: the mapping behavior attribute and
the filtering behavior attribute. Each attribute can be one of three
types: <i>Endpoint-Independent</i>, <i>Address-Dependent</i>,
or <i>Address and Port-Dependent</i>. With this new classification method,
a Symmetric NAT is an Address and Port-Dependent mapping NAT.

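The relation between the two classification schemes can be sketched as a
small lookup. The enum and function names below are hypothetical, and the
correspondence is only approximate: an Address-Dependent mapping has no
clean RFC 3489 name, so the sketch reports it as unclassified.

```c
#include <assert.h>

// RFC 4787 behavior types, used for both the mapping and filtering attribute.
typedef enum {
    NAT_ENDPOINT_INDEPENDENT,
    NAT_ADDRESS_DEPENDENT,
    NAT_ADDRESS_AND_PORT_DEPENDENT
} nat_behavior;

// Traditional RFC 3489 names.
typedef enum {
    LEGACY_FULL_CONE,
    LEGACY_RESTRICTED_CONE,
    LEGACY_PORT_RESTRICTED_CONE,
    LEGACY_SYMMETRIC,
    LEGACY_UNCLASSIFIED
} legacy_nat_type;

// Map an RFC 4787 (mapping, filtering) pair to the closest RFC 3489 name.
legacy_nat_type legacy_nat_name(nat_behavior mapping, nat_behavior filtering)
{
    if (mapping == NAT_ADDRESS_AND_PORT_DEPENDENT)
        return LEGACY_SYMMETRIC;
    if (mapping != NAT_ENDPOINT_INDEPENDENT)
        return LEGACY_UNCLASSIFIED;

    // Endpoint-Independent mapping: the filtering rule picks the cone type.
    switch (filtering) {
    case NAT_ENDPOINT_INDEPENDENT:       return LEGACY_FULL_CONE;
    case NAT_ADDRESS_DEPENDENT:          return LEGACY_RESTRICTED_CONE;
    case NAT_ADDRESS_AND_PORT_DEPENDENT: return LEGACY_PORT_RESTRICTED_CONE;
    }
    return LEGACY_UNCLASSIFIED;
}
```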

Among these types, the Symmetric type is the hardest one to work with.
The problem is that the NAT allocates a different mapping (for the same
internal host) for the communication to the STUN server and for the
communication to other (external) hosts, so the IP address/port that
one host sends to the other is meaningless to the recipient,
since it is not the actual IP address/port mapping that the NAT device
creates. As a result, when the recipient host tries to send a packet to
this address, the NAT device drops the packet since it does not
recognize the sender of the packet as a host "authorized" to send
to this address.

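The mismatch can be reproduced with a toy model in which the only difference
between the two NATs is whether the destination is part of the mapping key.
This is a sketch under simplifying assumptions: a single internal host, no
timers, and hypothetical names (<tt>sym_nat</tt>, <tt>sym_nat_map</tt>) and
port numbers throughout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SYM_MAX 32

typedef struct {
    uint16_t int_port;   // internal source port
    uint32_t dst_addr;   // destination the mapping was created for
    uint16_t dst_port;
    uint16_t pub_port;   // allocated public port
} sym_entry;

typedef struct {
    bool      per_destination;   // true: Address and Port-Dependent mapping
    sym_entry entries[SYM_MAX];
    size_t    count;
} sym_nat;

// Return the public port used for a packet sent from int_port to the given
// destination, creating a mapping if needed. Returns 0 if the table is full.
uint16_t sym_nat_map(sym_nat *nat, uint16_t int_port,
                     uint32_t dst_addr, uint16_t dst_port)
{
    assert(nat != NULL);
    for (size_t i = 0; i < nat->count; ++i) {
        const sym_entry *e = &nat->entries[i];
        if (e->int_port != int_port)
            continue;
        // A cone-style NAT reuses the mapping for any destination;
        // a symmetric NAT only reuses it for the same destination.
        if (!nat->per_destination ||
            (e->dst_addr == dst_addr && e->dst_port == dst_port))
            return e->pub_port;
    }
    if (nat->count == SYM_MAX)
        return 0;
    sym_entry *e = &nat->entries[nat->count];
    e->int_port = int_port;
    e->dst_addr = dst_addr;
    e->dst_port = dst_port;
    e->pub_port = (uint16_t)(50000 + nat->count);
    nat->count++;
    return e->pub_port;
}
```

With <tt>per_destination</tt> set, the port seen by the STUN server and the
port used towards the peer differ, so the address learned from STUN is of no
use to the peer.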

There are two solutions for this. First, we could make the client
smarter by switching transmission of the media to the source address of
the received media packets. This works since clients normally use a
well-known trick called symmetric RTP, where they use one socket for both
transmitting and receiving RTP/media packets. We also use this
mechanism in the PJMEDIA media transport. But this solution only works
if a client behind a symmetric NAT is not communicating with another
client behind either a symmetric NAT or a port-restricted NAT.

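The address switching itself is small. The sketch below is not the PJMEDIA
implementation; the <tt>media_session</tt> type and <tt>media_on_rx</tt>
function are hypothetical, and a production version would normally wait for
several packets from the new source before switching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t addr;   // IPv4 address, host byte order
    uint16_t port;
} media_addr;

typedef struct {
    media_addr remote;   // where we currently send RTP
} media_session;

// Call for every received RTP packet. If the packet's source differs from
// the address we are sending to, switch to the observed source.
// Returns true when the remote address was changed.
bool media_on_rx(media_session *s, media_addr src)
{
    assert(s != NULL);
    if (s->remote.addr == src.addr && s->remote.port == src.port)
        return false;
    s->remote = src;
    return true;
}
```

For example, a session that was told to send to a private address taken from
the signaling would switch to the public address/port it actually receives
packets from.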

The second solution is to use a media relay, but as mentioned
above, relaying is costly, both in terms of bandwidth cost for the service
provider and the additional latency introduced by relaying.



\subsection binding_timeout Binding timeout

When a NAT device creates a binding (a public-private IP address
mapping), it associates a timer with it. The timer is used to
destroy the binding once there is no activity/traffic associated with
the binding. Because of this, a NAT-aware application that wishes to
keep the binding open must periodically send outbound packets,
a mechanism known as keep-alive; otherwise it will ultimately
lose the binding and be unable to receive incoming packets from the Internet.

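The keep-alive decision can be sketched as below. The names are
hypothetical, the clock is assumed to be a monotonic millisecond counter
supplied by the caller, and the interval is an assumption the application
must choose to be shorter than the (unknown) NAT binding timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t interval_ms;   // keep-alive period, below the NAT timeout
    uint64_t last_tx_ms;    // time of the last outbound packet
} keepalive;

// Record any outbound packet (media or keep-alive): both refresh the binding.
void keepalive_on_tx(keepalive *ka, uint64_t now_ms)
{
    assert(ka != NULL);
    ka->last_tx_ms = now_ms;
}

// True when nothing has been sent for a full interval, so a keep-alive
// packet should be sent now. Assumes now_ms never runs backwards.
bool keepalive_due(const keepalive *ka, uint64_t now_ms)
{
    assert(ka != NULL);
    return now_ms - ka->last_tx_ms >= ka->interval_ms;
}
```

Tracking the last transmission rather than the last keep-alive avoids
sending redundant keep-alive packets while media is already flowing.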

\section solutions The NAT traversal solutions


\subsection stun Old STUN (RFC 3489)

The original STUN (Simple Traversal of User Datagram Protocol (UDP)
Through Network Address Translators (NATs)) as defined by
<A HREF="http://www.ietf.org/rfc/rfc3489.txt">RFC 3489</A>
(published in 2003, but the work was started as early as 2001) was
meant to be a standalone, standards-based solution for the NAT
connectivity problems above. It is equipped with a NAT type detection
algorithm and methods to hole-punch the NAT in order to let traffic
get through, and it has proven to be quite successful in
traversing many types of NATs, hence it has gained a lot of popularity
as a simple and effective NAT traversal solution.

But since then the smart people at the IETF have realized that STUN alone
is not going to be enough. Besides the fact that STUN by its nature cannot
solve the symmetric-to-symmetric or symmetric-to-port-restricted connection,
people have also discovered that NAT behavior can change for different
traffic (or for the same traffic over time). It was therefore concluded that
NAT type detection can produce unreliable results, and that one should not
rely too much on it.

Because of this, STUN has since moved to a different strategy.
Instead of attempting to provide a standalone solution, it now provides
a partial solution and a framework to build other (STUN-based) protocols
on top of it, such as TURN and ICE.


\subsection stunbis STUN/STUNbis (RFC 5389)

The Session Traversal Utilities for NAT (STUN) is the further development
of the old STUN. While it still provides a mechanism for a client to
query its public/mapped address from a STUN server, it has deprecated
the use of NAT type detection, and it now serves as a framework to build
other protocols on top of it (such as TURN and ICE).

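To show what the address query looks like on the wire, the sketch below
builds the 20-byte RFC 5389 Binding Request header and decodes the value of
an IPv4 XOR-MAPPED-ADDRESS attribute. It does not use the PJNATH API; the
function names are hypothetical, and attribute parsing, retransmission and
message integrity are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STUN_MAGIC_COOKIE 0x2112A442u
#define STUN_BINDING_REQ  0x0001u
#define STUN_HEADER_LEN   20

// Write a Binding Request header with no attributes into buf.
// tsx_id is the caller's 96-bit transaction ID (it should be random).
void stun_write_binding_request(uint8_t buf[STUN_HEADER_LEN],
                                const uint8_t tsx_id[12])
{
    assert(buf != NULL && tsx_id != NULL);
    buf[0] = (uint8_t)(STUN_BINDING_REQ >> 8);   // message type
    buf[1] = (uint8_t)(STUN_BINDING_REQ & 0xFF);
    buf[2] = 0;                                  // message length: no attributes
    buf[3] = 0;
    buf[4] = (uint8_t)(STUN_MAGIC_COOKIE >> 24); // magic cookie
    buf[5] = (uint8_t)(STUN_MAGIC_COOKIE >> 16);
    buf[6] = (uint8_t)(STUN_MAGIC_COOKIE >> 8);
    buf[7] = (uint8_t)(STUN_MAGIC_COOKIE);
    memcpy(buf + 8, tsx_id, 12);                 // transaction ID
}

// Decode the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute.
// addr and port are returned in host byte order.
// Returns 0 on success, -1 if the address family is not IPv4.
int stun_decode_xor_mapped_v4(const uint8_t value[8],
                              uint32_t *addr, uint16_t *port)
{
    assert(value != NULL && addr != NULL && port != NULL);
    if (value[1] != 0x01)
        return -1;
    uint16_t xport = (uint16_t)((value[2] << 8) | value[3]);
    uint32_t xaddr = ((uint32_t)value[4] << 24) | ((uint32_t)value[5] << 16) |
                     ((uint32_t)value[6] << 8)  |  (uint32_t)value[7];
    // The port is XOR-ed with the top 16 bits of the magic cookie,
    // the address with the whole cookie.
    *port = (uint16_t)(xport ^ (STUN_MAGIC_COOKIE >> 16));
    *addr = xaddr ^ STUN_MAGIC_COOKIE;
    return 0;
}
```

For instance, the attribute value bytes 00 01 A1 47 E1 12 A6 43 decode to
address 192.0.2.1 and port 32853.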

\subsection midcom_turn Old TURN (draft-rosenberg-midcom-turn)

Traversal Using Relay NAT (TURN), a standards-based effort started as early
as November 2001, was meant to be the complementary method for the
(old) STUN to complete the solution. The original idea was for the host to
use STUN to detect the NAT type, and when it found that the NAT type was
symmetric, to use TURN to relay the traffic. But as stated above,
this approach was deemed unreliable, and now the preferred way to use
TURN (and it's a new TURN specification as well) is to combine it with ICE.


\subsection turn TURN (draft-ietf-behave-turn)

Traversal Using Relays around NAT (TURN) is the latest development of TURN.
While the protocol details have changed a lot, the objective is still
the same, that is to provide relaying control for the application.
As mentioned above, TURN should preferably be used with ICE: since relaying
is costly in terms of both bandwidth and latency, it should be used
as the last resort.


\subsection b2bua B2BUA approach

A SIP Back-to-Back User Agent (B2BUA) is a SIP entity that sits in the
middle of SIP traffic and acts as a SIP user agent on both call legs.
The primary motivations for having a B2BUA are to be able to provision
the call (e.g. billing, enforcing policy) and to help with NAT traversal
for the clients. Normally a B2BUA is equipped with media relaying,
as otherwise it wouldn't be very useful.

Products that fall into this category include SIP Session Border
Controllers (SBC); PBXs such as Asterisk are technically B2BUAs
as well.

The benefit of a B2BUA with regard to NAT traversal is that it does not
require any modifications to the client to make it go through NATs.
And since it is basically a relay, it should be able to traverse
symmetric NATs successfully.

However, since it is a relay, the usual relaying drawbacks apply,
namely the bandwidth and latency issues. Moreover, since a B2BUA acts
as a user agent on both call legs (i.e. it terminates the SIP
signaling/call on one leg, although it creates another call on the other
leg), it may also introduce serious issues with end-to-end SIP signaling.


\subsection alg ALG approach

Nowadays many NAT devices (such as consumer ADSL routers) are equipped
with intelligence to inspect and fix VoIP traffic in an effort to help
it with NAT traversal. This feature is called Application Layer
Gateway (ALG) intelligence. The idea is that since the NAT device knows
about the mapping, it might as well try to fix the application traffic so
that the traffic can better traverse the NAT. The tricks that are
performed include, for example, replacing the private IP addresses/ports
in the SIP/SDP packet with the mapped public address/port of the host
that sends the packet.

Despite many claims about its usefulness, in reality this has given us
more problems than it fixes. Too many such devices break the
SIP signaling, and in more advanced cases, ICE negotiation. Some
examples of bad situations that we have encountered in the past:

 - The NAT device alters the Via address/port fields in the SIP response
   message, making the response fail the SIP response verification
   defined by the SIP RFC.
 - In other cases, the modifications to the Via headers of the SIP
   response hide important information from the SIP server,
   namely the actual IP address/port of the client as seen by the SIP
   server.
 - Modifications to the Contact URI of the REGISTER request/response make
   the client unable to detect its registered binding.
 - Modifications to the IP addresses/ports in the SDP cause ICE
   negotiation to fail with ice-mismatch status.
 - The complexity of the ALG processing itself seems to have caused
   the device to behave erratically in managing the address bindings
   (e.g. it creates a new binding for the second packet sent by the
   client, even when the previous packet was sent just a second ago, or
   it just sends an inbound packet to the wrong host).


Many man-months of effort have been spent just troubleshooting issues
caused by these (mal)functioning ALGs, and as the approach adds complexity
to the problem rather than solving it, in general we do not like it
at all and would prefer it to go away.


\subsection upnp UPnP

Universal Plug and Play (UPnP) is a set of protocol specifications
for controlling network appliances, one of which covers
controlling NAT devices. With this protocol, a client can instruct the
NAT device to open a port on the NAT's public side and use this port
for its communication. UPnP has gained popularity due to its
simplicity, and one can expect it to be available on the majority of
NAT devices.

The drawback of UPnP is that, since it uses multicast in its communication,
it only allows a client to control a NAT device that is in the
same multicast domain. While this is normally not a problem in
household installations (where people normally have only one NAT
router), it will not work if the client is behind a cascaded router
installation. Moreover, UPnP has serious security issues due to
its lack of authentication, so it's probably not the preferred solution
for organizations.

\subsection other Other solutions

Other solutions to NAT traversal include:

 - SOCKS, which has supported the UDP protocol since SOCKS5.



\section ice ICE Solution - The Protocol that Works Harder

A new protocol is being standardized (it's in Working Group Last Call/WGLC
stage at the time this article was written) by the IETF, called
Interactive Connectivity Establishment (ICE). ICE is the ultimate
weapon a client can have in its NAT traversal arsenal,
as it promises that if there is indeed a path for two clients
to communicate, then ICE will find this path. And if there is
more than one path over which the clients can communicate, ICE will
use the best/most efficient one.

ICE works by combining several protocols (such as STUN and TURN)
and offering several candidate paths for the communication,
thereby maximising the chance of success. At the same time it
can prioritize the candidates, so that the more
expensive alternative (namely relay) is only used as the last
resort when all else fails. The ICE negotiation process involves several
stages:

 - candidate gathering, where the client finds out all the possible
   addresses that it can use for the communication. It may find
   three types of candidates: host candidates to represent its
   physical NICs, server reflexive candidates for the addresses that
   have been resolved from STUN, and relay candidates for the addresses
   that the client has allocated from a TURN relay.
 - prioritizing these candidates. Typically the relay candidate will
   have the lowest priority since it's the most expensive.
 - encoding these candidates, sending them to the remote peer, and
   negotiating them with offer-answer.
 - pairing the candidates, where it pairs every local candidate
   with every remote candidate that it receives from the remote peer.
 - checking the connectivity of each candidate pair.
 - concluding the result. Since every possible path combination is
   checked, if there is a path to communicate, ICE will find it.

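The prioritizing and pairing stages above can be sketched with the priority
formulas from the ICE specification as later published in RFC 5245
(section 4.1.2 for candidates, section 5.7.2 for pairs); the draft current
when this article was written may have differed. The function names are
hypothetical and this is not the PJNATH API.

```c
#include <assert.h>
#include <stdint.h>

// Type preferences recommended by RFC 5245: higher means more preferred.
enum {
    ICE_PREF_HOST  = 126,
    ICE_PREF_PRFLX = 110,   // peer reflexive, discovered during the checks
    ICE_PREF_SRFLX = 100,
    ICE_PREF_RELAY = 0
};

// priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)
uint32_t ice_candidate_priority(uint32_t type_pref, uint32_t local_pref,
                                uint32_t component_id)
{
    assert(type_pref <= 126 && local_pref <= 65535);
    assert(component_id >= 1 && component_id <= 256);
    return (type_pref << 24) + (local_pref << 8) + (256 - component_id);
}

// Pair priority, where g is the priority of the controlling agent's
// candidate and d that of the controlled agent's candidate:
// 2^32 * min(g, d) + 2 * max(g, d) + (g > d ? 1 : 0)
uint64_t ice_pair_priority(uint32_t g, uint32_t d)
{
    uint64_t lo = g < d ? g : d;
    uint64_t hi = g < d ? d : g;
    return (lo << 32) + 2 * hi + (g > d ? 1 : 0);
}
```

With these preferences a relay candidate always sorts below host and server
reflexive candidates, which is how the relay ends up as the last resort.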

There are many benefits of ICE:

 - it's standards-based.
 - it works where STUN works (and more).
 - unlike a standalone STUN solution, it solves the hairpinning issue,
   since it also offers host candidates.
 - just like relaying solutions, it works with symmetric NATs. But unlike
   plain relaying, the relay is only used as the last resort, thereby
   minimizing the bandwidth and latency issues of relaying.
 - it offers a generic framework for offering and checking address
   candidates. While the ICE core standard only talks about using STUN
   and TURN, implementors can add more types of candidates to the ICE
   offer, for example UDP over TCP or HTTP relays, or even UPnP
   candidates, and this can be done transparently for the remote
   peer, hence it's compatible and usable even when the remote peer
   does not support these.
 - it also adds some security, particularly against DoS attacks,
   since a media address must be acknowledged before it can be used.


Having said that, ICE is a complex protocol to implement, making
interoperability an issue, and at the time of writing we don't see
many implementations of it yet. Fortunately, PJNATH is one of
the earliest and hence more mature ICE implementations, first released
in mid-2007, and we have been testing our implementation at
<A HREF="http://www.sipit.net">SIP Interoperability Test (SIPit)</A>
events regularly, so hopefully it is one of the most stable as well.


\section pjnath PJNATH - The building blocks for effective NAT traversal solution

PJSIP NAT Helper (PJNATH) is a library which contains implementations
of standards-based NAT traversal solutions. PJNATH can be used as a
stand-alone library for your software, or you may use the PJSUA-LIB library,
a very high-level library integrating PJSIP, PJMEDIA, and PJNATH into
simple-to-use APIs.

PJNATH has the following features:

 - STUNbis implementation, providing both a ready-to-use STUN-aware socket
   and a framework to implement higher-level STUN-based protocols such as
   TURN and ICE.
 - NAT type detection, useful for troubleshooting purposes.
 - TURN implementation.
 - ICE implementation.


More protocols will be implemented in the future.

Go back to \ref index.
414
415 */