Assorted notes about udns (library).

UDP-only mode
~~~~~~~~~~~~~

First of all, since udns is (currently) UDP-only, there are some
shortcomings.

It assumes that a reply will fit into a UDP buffer. With adoption of EDNS0,
and general robustness of IP stacks, in most cases it's not an issue. But
in some cases there may be problems:

 - if an RRset is "very large", so that it does not fit even in a buffer of
   the size requested by the library (current default is 4096; some servers
   limit it further), we will not see the reply, or will only see a
   "damaged" reply (depending on the server).

 - many DNS servers ignore EDNS0 option requests. In this case, no matter
   which buffer size the udns library requests, such servers' replies are
   limited to 512 bytes (standard pre-EDNS0 DNS packet size). (Udns falls
   back to a non-EDNS0 query if the EDNS0-enabled one received a FORMERR or
   NOTIMPL error).

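For reference, here is roughly what "requesting a buffer size" means on the
wire: the query carries an OPT pseudo-record (RFC 6891) in its additional
section, and the CLASS field of that record holds the size. This is only a
sketch, assuming EDNS version 0 and no options; put_opt_rr() is a made-up
name, not a udns function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of udns: write an EDNS0 OPT pseudo-RR
 * (RFC 6891) advertising udp_size as our receive buffer size.
 * Returns the number of bytes written (always 11), or 0 if buf is
 * too small. */
size_t put_opt_rr(unsigned char *buf, size_t buflen, unsigned udp_size)
{
  assert(buf != NULL);
  if (buflen < 11)
    return 0;
  buf[0] = 0;                       /* owner name: root (empty) */
  buf[1] = 0; buf[2] = 41;          /* TYPE = OPT (41) */
  buf[3] = (udp_size >> 8) & 0xff;  /* CLASS carries the UDP payload size */
  buf[4] = udp_size & 0xff;
  buf[5] = 0;                       /* TTL field: extended RCODE */
  buf[6] = 0;                       /*            EDNS version 0 */
  buf[7] = 0; buf[8] = 0;           /*            flags, DO bit clear */
  buf[9] = 0; buf[10] = 0;          /* RDLENGTH = 0, no options */
  return 11;
}
```

The caller also has to bump ARCOUNT in the query header by one. A server
which does not know EDNS0 will either ignore the record or answer with
FORMERR/NOTIMPL, which are the two cases described in the list above.
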
The problem is that with this, udns currently does not accept replies with
the TC (truncation) bit set, and treats such replies the same way as it
treats SERVFAIL replies, thus trying the next server, or temp-failing the
query if there are no more servers to try. In other words, if the reply is
really large, or if the servers you're using don't support EDNS0, your
application will be unable to resolve a given name.

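To make the TC/SERVFAIL comparison concrete: both are plain header fields
of the reply (RFC 1035, section 4.1.1). A small sketch of reading them from
a raw packet; the function names are mine, not the ones udns uses.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_SIZE 12   /* fixed DNS header size */

/* Hypothetical helpers, not udns functions. */

/* Nonzero if the TC (truncation) bit is set in the raw reply. */
int reply_is_truncated(const unsigned char *pkt, size_t len)
{
  assert(pkt != NULL);
  return len >= HDR_SIZE && (pkt[2] & 0x02) != 0;
}

/* RCODE (low 4 bits of the 4th header byte), or -1 for a short packet.
 * 0 is NOERROR, 2 is SERVFAIL. */
int reply_rcode(const unsigned char *pkt, size_t len)
{
  assert(pkt != NULL);
  return len >= HDR_SIZE ? (pkt[3] & 0x0f) : -1;
}
```

A TCP-capable resolver would retry over TCP when reply_is_truncated() says
yes; with UDP only, there is nothing better to do than to move on to the
next server.
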
Yet it's not a common situation - in practice, it's very rare.

Implementing TCP mode isn't difficult, but it complicates the API
significantly. Currently udns uses only a single UDP socket (or - maybe in
the future - two, see below), but in case of TCP, it will need to open and
close sockets for TCP connections left and right, and that has to be
integrated into an application's event loop in an easy and efficient way.
Plus all the timeouts - different for connect(), write, and several stages
of read.

IPv6 vs IPv4 usage
~~~~~~~~~~~~~~~~~~

This is only relevant for nameservers reachable over IPv6, NOT for IPv6
queries. I.e., it matters if you have IPv6 addresses in 'nameserver' lines
in your /etc/resolv.conf file, and even more if you have BOTH IPv6 AND IPv4
addresses there. The same applies if you pass such addresses to udns
initialization routines.

Since udns uses a single UDP socket to communicate with all nameservers,
it should support both v4 and v6 communications. Most current platforms
support this mode - using a PF_INET6 socket and V4MAPPED addresses, i.e.,
"tunnelling" IPv4 inside IPv6. But not all systems support this. And
moreover, it has been said that such a mode is deprecated.

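For those who haven't met V4MAPPED addresses: it's the IPv4 address stuffed
into the last 4 bytes of an IPv6 one, after the ::ffff: prefix. A sketch of
the conversion, with a helper name of my own invention (this is not a udns
function):

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical helper, not a udns function: turn an IPv4 nameserver
 * address into a V4MAPPED (::ffff:a.b.c.d) IPv6 socket address. */
void v4_to_v4mapped(const struct sockaddr_in *sin, struct sockaddr_in6 *sin6)
{
  assert(sin != NULL && sin6 != NULL);
  memset(sin6, 0, sizeof(*sin6));
  sin6->sin6_family = AF_INET6;
  sin6->sin6_port = sin->sin_port;          /* already in network order */
  sin6->sin6_addr.s6_addr[10] = 0xff;       /* bytes 0..9 stay zero */
  sin6->sin6_addr.s6_addr[11] = 0xff;
  memcpy(&sin6->sin6_addr.s6_addr[12], &sin->sin_addr, 4);
}
```

The socket itself has to be a PF_INET6 one with the IPV6_V6ONLY option
switched off. Whether that is allowed, and what the default is, differs
between systems, which is exactly the portability problem mentioned above.
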
So, list only IPv4 or only IPv6 addresses, but don't mix them, in your
/etc/resolv.conf.

An alternative is to use two sockets instead of one - one for IPv6 and one
for IPv4. For now I'm not sure if it's worth the complexity - again, of
the API, not the library itself (but this will not simplify the library
either).

Single socket for all queries
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using a single UDP socket for sending queries to all nameservers has obvious
advantages. First, it gives, again, a trivial, simple-to-use API. And a
simple library too. Also, after sending queries to all nameservers (in case
the first didn't reply in time), we will be able to receive a late reply
from the first nameserver and accept it.

But this mode has disadvantages too. The most important is that it's much
easier to send a fake reply to us, as the UDP port where we expect the reply
to come to is constant during the whole lifetime of an application. More
secure implementations use a random port for every single query. While a
port number (a 16-bit integer) can not hold much randomness, it's still of
some help. Ok, udns is a stub resolver, so it expects a sorta friendly
environment, but on a LAN it's usually much easier to mount an attack, due
to the speed of the local network, where a bad guy can generate a lot of
packets in a short time.

Choosing of DNS QueryID
~~~~~~~~~~~~~~~~~~~~~~~

Currently, udns uses sequential numbers for query IDs. This simplifies
attacks even more (cf. the previous item about the single UDP port), making
them nearly trivial. The library should use random numbers for query IDs.
But there's no portable way to get random numbers, even on various flavors
of Unix. It's possible to use the low bits of the tv_usec field returned by
gettimeofday() (current time, microseconds), but I wrote the library in
a way to avoid making system calls where possible, because many syscalls
mean many context switches and slow processes as a result. Maybe using some
application-supplied callback to get random values would be a better way,
defaulting to the gettimeofday() method.

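The callback idea could look something like the sketch below. All the names
(rnd_fn, new_qid, demo_rnd) are hypothetical, udns has no such interface;
it only shows the "use the application's source if given, else fall back to
the clock" logic.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* All names here are hypothetical; udns has no such interface. */
typedef unsigned (*rnd_fn)(void *arg);   /* application-supplied source */

/* Pick a 16-bit query ID: ask the application's callback if there is
 * one, else fall back to the low bits of the current microseconds. */
unsigned short new_qid(rnd_fn rnd, void *arg)
{
  struct timeval tv;
  if (rnd != NULL)
    return (unsigned short)(rnd(arg) & 0xffff);
  gettimeofday(&tv, NULL);
  return (unsigned short)(tv.tv_usec & 0xffff);
}

/* Demo callback: just hands back the value stored at arg, only to show
 * the plumbing.  A real application would plug in a proper RNG here. */
unsigned demo_rnd(void *arg)
{
  assert(arg != NULL);
  return *(unsigned *)arg;
}
```

Note the fallback costs one system call per new query, and the result is
still guessable by someone who knows roughly when the query was made. It's
better than a counter, not a replacement for real randomness.
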
Note that a single query - even if (re)sent to different nameservers, several
times (due to no reply received in time) - uses the same qID assigned when it
was first dispatched. So we have: a single UDP socket (fixed port number),
sequential (= trivially predictable) qIDs, and a long lifetime of those qIDs.
All this makes (local) attacks against the library really trivial.

See also comments in udns_resolver.c, udns_newid().

And note that at least some other stub resolvers out there (like c-ares
for example) also use sequential qIDs.

Assumptions about RRs returned
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently udns processes records in the reply it received sequentially.
This means that the order of the records is significant. For example, if
we asked for foo.bar A, but the server returned that foo.bar is a CNAME
(alias) for bar.baz, and bar.baz, in turn, has address 1.2.3.4, then
the CNAME should come first in the reply, followed by the A record. While
the DNS specs do not say anything about the order of records - it's an
RRset, which is unordered - I think an implementation which returns the
records in the "wrong" order is somewhat insane...

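Why the order matters for a single sequential pass can be shown with a toy
model. This is not the udns parsing code, just an illustration with made-up
types: the loop follows a CNAME when it sees one, so an A record that comes
before its CNAME is skipped as "not the name we want".

```c
#include <assert.h>
#include <string.h>

/* A toy model of an answer section; not how udns stores records. */
enum rrtype { RR_A, RR_CNAME };
struct rr {
  const char *name;     /* owner name */
  enum rrtype type;
  const char *data;     /* address (A) or target name (CNAME), as text */
};

/* Single sequential pass: follow CNAMEs and count A records which match
 * the name we are currently looking for.  (Real code has to compare
 * names case-insensitively; strcmp keeps the toy short.) */
int count_addrs(const char *qname, const struct rr *rrs, int nrr)
{
  const char *want = qname;
  int i, found = 0;
  assert(qname != NULL && rrs != NULL);
  for (i = 0; i < nrr; ++i) {
    if (strcmp(rrs[i].name, want) != 0)
      continue;                     /* not about the current name */
    if (rrs[i].type == RR_CNAME)
      want = rrs[i].data;           /* alias: switch to the target */
    else
      ++found;
  }
  return found;
}
```

With the records ordered {foo.bar CNAME bar.baz, bar.baz A 1.2.3.4} the
function finds one address for foo.bar; with the two records swapped it
finds none.
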
CNAME recursion
~~~~~~~~~~~~~~~

Another interesting point is the handling of CNAMEs returned as replies
to non-CNAME queries. If we asked for foo.bar A, but it's a CNAME, udns
expects BOTH the CNAME itself and the target DN to be present in the reply.
In other words, udns DOES NOT RECURSE CNAMES. If we asked for foo.bar A,
but the only record in the reply was that foo.bar is a CNAME for bar.baz,
udns will return no records to an application (NXDOMAIN). Strictly speaking,
udns should repeat the query asking for bar.baz A, and recurse. But since
it's a stub resolver, the recursive resolver should recurse for us instead.

It's not very difficult to implement, however. Probably with some (global?)
flag to en/dis-able the feature. Provided there's some demand for it.

To clarify: udns handles CNAME recursion in a single reply packet just fine.

Note also that the standard gethostbyname() routine does not recurse in this
situation either.

Error reporting
~~~~~~~~~~~~~~~

Too many places in the code (various failure paths) set the generic
"TEMPFAIL" error condition. For example, if no nameserver replied to our
query, an application will get a generic TEMPFAIL, instead of something like
TIMEDOUT. This probably should be fixed, but most applications don't care
about the exact reasons of failure - 4 common cases are already too much:
 - query returned some valid data
 - NXDOMAIN
 - valid domain but no data of requested type - =NXDOMAIN in most cases
 - temporary error - this one is sometimes (incorrectly!) treated as
   NXDOMAIN by (naive) applications.
DNS isn't yes/no, it's at least 3 variants, temp err being the 3rd important
case! And adding more variations for the temp error case complicates things
even more - again, from an application writer's standpoint. For diagnostics,
though, such more specific error cases are of good help.

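What "at least 3 variants" means for an application, as a sketch. The enums
below are made up for the illustration (udns reports errors with its own
DNS_E_* codes), and the mail-server policy is just an example of mine:

```c
#include <assert.h>

/* Hypothetical outcome codes, for illustration only. */
enum lookup_result { LR_DATA, LR_NXDOMAIN, LR_NODATA, LR_TEMPFAIL };
enum mail_action { ACCEPT, REJECT, DEFER };

/* What a mail server might do with the result of a sender-domain lookup. */
enum mail_action sender_domain_action(enum lookup_result r)
{
  switch (r) {
  case LR_DATA:     return ACCEPT;
  case LR_NXDOMAIN:
  case LR_NODATA:   return REJECT;   /* definitive "no such thing" */
  case LR_TEMPFAIL: return DEFER;    /* we do not know yet: retry later */
  }
  assert(!"unknown lookup_result");
  return DEFER;
}
```

The naive variant folds LR_TEMPFAIL into the REJECT branch, and so bounces
mail every time a nameserver hiccups.
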
Planned API changes
~~~~~~~~~~~~~~~~~~~

At least one thing I want to change for the 0.1 version is the way queries
are submitted and how replies are handled.

I want to make the dns_query object owned by an application. So that instead
of the udns library allocating it for the lifetime of a query, it will be
pre-allocated by an application. This simplifies and enhances the query
submitting interface, and complicates it a bit too, in the simplest cases.

Currently, we have:

  dns_submit_dn(ctx, dn, cls, typ, flags, parse, cbck, data)
  dns_submit_p(ctx, name, cls, typ, flags, parse, cbck, data)
  dns_submit_a4(ctx, name, flags, cbck, data)

and so on -- with many parameters missing for type-specific cases, but generic
cases being too complex for most common usage.

Instead, with dns_query being owned by an app, we will be able to separately
set up various parts of the query - domain name (various forms), type&class,
parser, flags, callback... and even change them at runtime. And we will also
be able to reuse query structures, instead of allocating/freeing them every
time. So the whole thing will look something like:

  q = dns_alloc_query();
  dns_submit(dns_q_flags(dns_q_a4(q, name, cbck), DNS_F_NOSRCH), data);

The idea is to have a set of functions accepting struct dns_query* and
returning it (so the calls can be "nested" like the above), to set up
relevant parts of the query - specific type of callback, conversion from
(type-specific) query parameters into a domain name (this is for
type-specific query initializers), and setting various flags and options and
type&class things.

One example where this is almost essential - if we want to support
a per-query set of nameservers (which isn't at all useless: imagine a
high-volume mail server, where we want to direct DNSBL queries to a separate
set of nameservers, and rDNS queries to their own set and so on). Adding
another argument (set of nameservers to use) to EVERY query submitting
routine is... insane. Especially since in 99% of cases it will be set to
the default NULL. But with such "nesting" of query initializers, it becomes
trivial.

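A mock-up of how such nested initializers might be written, including the
per-query nameserver set. The struct layout and all sk_* names are invented
for this sketch; none of it exists in udns, and the real thing may well end
up looking different.

```c
#include <assert.h>
#include <stddef.h>

/* Everything below is a mock-up with made-up names (sk_*), only to show
 * the "setter returns its argument" pattern; it is not the udns API. */
struct sk_query {
  const char *name;          /* what to look up */
  int qtype;                 /* 1 = A */
  int flags;
  const char *const *nsset;  /* NULL = use the default nameservers */
  void (*cbck)(struct sk_query *q, void *data);
};

/* Type-specific initializer: sets name, type and callback, and resets
 * the optional parts to their defaults. */
struct sk_query *sk_q_a4(struct sk_query *q, const char *name,
                         void (*cbck)(struct sk_query *, void *))
{
  assert(q != NULL);
  q->name = name; q->qtype = 1; q->cbck = cbck;
  q->flags = 0; q->nsset = NULL;
  return q;
}

struct sk_query *sk_q_flags(struct sk_query *q, int flags)
{
  assert(q != NULL);
  q->flags |= flags;
  return q;
}

/* The optional extra: only the callers which need it ever mention it. */
struct sk_query *sk_q_nsset(struct sk_query *q, const char *const *nsset)
{
  assert(q != NULL);
  q->nsset = nsset;
  return q;
}
```

A DNSBL lookup would then be set up as sk_q_nsset(sk_q_a4(&q, name, cbck),
dnsbl_servers), while everybody else keeps writing just sk_q_a4(&q, name,
cbck) and never sees the extra argument.
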
Another way to do the same is to manipulate the query object right after a
query has been submitted, but before any event processing (during this
time, the query object is allocated and initialized, but no actual network
packets have been sent - that will happen on the next event processing). But
this way it becomes impossible to perform synchronous resolver calls, since
those calls hide the query objects they use internally.

Speaking of reply handling - the planned change is to stop using dynamic
memory (malloc) inside the library. That is, instead of allocating a buffer
for a reply dynamically in a parsing routine (or memdup'ing the raw reply
packet if no parsing routine is specified), I want udns to return the packet
buffer it uses internally, and change parsing routines to expect a buffer
for the result. When parsing, a routine will return the true amount of
memory it needs to place the result, regardless of whether it has enough
room or not, so that an application can (re)allocate a properly sized buffer
and call the parsing routine again.

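The "always return the needed size" convention is the same one snprintf()
uses. A sketch with a made-up parser (not a udns routine) which joins the
character-strings of a TXT record's RDATA into one C string:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser, not a udns routine.  Always returns the number
 * of bytes the result needs, NUL included (0 if RDATA is malformed);
 * the result is stored only when outsize is large enough. */
size_t parse_txt(const unsigned char *rd, size_t rdlen,
                 char *out, size_t outsize)
{
  size_t i, need = 1;                 /* 1 for the trailing NUL */
  assert(rd != NULL);
  /* first pass: validate and measure */
  for (i = 0; i < rdlen; i += 1 + rd[i]) {
    if (rd[i] > rdlen - i - 1)
      return 0;                       /* string runs past end of RDATA */
    need += rd[i];
  }
  if (out == NULL || outsize < need)
    return need;                      /* caller must come back with more */
  /* second pass: copy */
  for (i = 0; i < rdlen; i += 1 + rd[i]) {
    memcpy(out, rd + i + 1, rd[i]);
    out += rd[i];
  }
  *out = '\0';
  return need;
}
```

An application calls it once with whatever buffer it has, and if the
returned size is larger than that buffer, it realloc()s and calls again.
No malloc is needed inside the library at all.
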
Another modification I plan to include is the ability to work in
terms of domain names (DNs) as used in on-wire DNS packets, not only
with asciiz representations of them. For this to work, the above two
changes (query submission and result passing) have to be completed first
(esp. the query submission part), so that it will be possible to specify
some additional query flags (for example) to request domain names instead
of the text strings, and to allow easy query submissions with either DNs
or text strings.

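For readers unfamiliar with the on-wire form: a DN is a sequence of
length-prefixed labels terminated by a zero byte (RFC 1035), so "foo.bar"
becomes \3foo\3bar\0. A simplified converter as a sketch; name_to_wire() is
a made-up helper (not the udns routine) and it handles no escapes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical converter: asciiz name -> on-wire DN.  Returns the wire
 * length, or 0 on error (empty or over-long label, name too long,
 * buffer too small).  "" and "." both mean the root. */
size_t name_to_wire(const char *name, unsigned char *dn, size_t dnsize)
{
  size_t pos = 0;
  assert(name != NULL && dn != NULL);
  if (strcmp(name, ".") == 0)
    ++name;
  while (*name) {
    const char *dot = strchr(name, '.');
    size_t llen = dot ? (size_t)(dot - name) : strlen(name);
    if (llen == 0 || llen > 63)          /* label limits, RFC 1035 */
      return 0;
    if (pos + 1 + llen + 1 > dnsize || pos + 1 + llen + 1 > 255)
      return 0;                          /* keep room for the final 0 */
    dn[pos++] = (unsigned char)llen;
    memcpy(dn + pos, name, llen);
    pos += llen;
    name += llen;
    if (*name == '.')
      ++name;                            /* skip separator / trailing dot */
  }
  if (pos + 1 > dnsize)
    return 0;
  dn[pos++] = 0;                         /* root label ends the name */
  return pos;
}
```

Working with DNs directly saves this conversion (and the reverse one) for
applications which only pass names from one DNS packet to another.
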