dnsanon_rssac

Dnsanon_rssac is an implementation of RSSAC002v5 processing for DNS
statistics. It implements all of v5, except for zone size and update
time. With --version=v4, it defaults to “lax” mode that provides a
superset of v4. With --version v3 or v2 it also implements most of the
prior versions (all but zone size). Given the “RSSAC Advisory on Measurements
of the Root Server System”, at
https://www.icann.org/en/system/files/files/rssac-002-measurements-root-20nov14-en.pdf,
it provides all values that can be computed from packet captures. Its
processing can be parallelized and done incrementally.

Design Goals

Explicit design goals:

- Incremental computation. It must be possible to compute statistics
  over the day and merge once, or to compute statistics at different
  sites and back-haul only minimal information for a central merge.

- Extensibility. It should be easy to add measurements in the future.

- Constant memory usage. Roots get attacked; we don’t want attacks to
  take out (computationally) the measurement system by having steps that
  require O(n) memory.

- Optional parallel processing. It works with Hadoop or GNU parallel,
  and it can also run sequentially.

Non-goals:

- High performance (we know some ways to make it faster; maybe in the
  future). (However, it seems plenty fast enough to track B-root’s
  statistics with a small, 40-core Hadoop cluster.)

- Pedantic levels of accuracy. The goal is to support root operation,
  and that does not require 5 decimal places of precision. We believe
  our approach is correct (we’re just adding up sums), but we do not
  currently implement careful checks around time boundaries (midnight).

- Computation of RSSAC002v3 values that cannot be easily derived from
  packet captures. We do not compute the load time nor zone size
  metrics.

- Graphs. (Although if you want to add some, please let us know.)

Although not an explicit goal, this implementation is largely
independent of the other implementations we know of. We depend on
dnsanon, which includes some code from DSC (TCP reassembly).

The Basic Idea

The basic idea: nearly everything in RSSAC-002 is a specialized version
of “word count”, if you write the words carefully. That lets one use
Hadoop-style parallelism to process and combine data.

Get pcaps and extract the DNS queries to Fsdb format (Fsdb is
tab-separated text with a header; see
http://ant.isi.edu/~johnh/SOFTWARE/FSDB).

Convert each pcap’s queries to “rssacint” format, an internal format
that supports easy aggregation. Each line of rssacint format is of the
format (OPERATOR)(KEY) (COUNT). For example, for “+udp-ipv4-queries 10”
the operator is “+”, the key is “udp-ipv4-queries” and we’ve seen 10 of
them. The + means if we see two rows with the same key, we can add them
together. (In practice we use terser keys because we move a lot of bytes
around, so this key is actually “+3u04”.) Operators allow one to compute
sums, minimum and maximum, lists that check for completeness, and some
others; see rssacint_reduce for details.
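
As a rough illustration of the “+” operator, the sketch below sums the
counts of identical keys in sorted input. It is a simplified stand-in
written with awk, not rssacint_reduce itself: it handles only “+” rows
and assumes plain tab-separated rows with the Fsdb header omitted. The
function name sum_plus_rows and the key “udp-ipv6-queries” are made up
for the example.

```shell
#!/bin/sh
# Simplified stand-in for the "+" operator only; this is NOT rssacint_reduce.
# Input: sorted, tab-separated "KEY<TAB>COUNT" rows (Fsdb header omitted).
sum_plus_rows() {
    awk -F '\t' '
        # Key changed: emit the finished total for the previous key.
        NR > 1 && $1 != key { print key "\t" total; total = 0 }
        # Remember the current key and add its count.
        { key = $1; total += $2 }
        # Emit the last key, if there was any input at all.
        END { if (NR > 0) print key "\t" total }
    '
}

# Two rows share a key, so their counts are added together.
printf '+udp-ipv4-queries\t10\n+udp-ipv4-queries\t5\n+udp-ipv6-queries\t3\n' |
    LC_COLLATE=C sort -k 1,1 | sum_plus_rows
```

Because the input is sorted, the sketch only holds one key and one
running total at a time, which is in the spirit of the constant-memory
design goal.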

Rssacint files can be arbitrarily combined using the rssacint_reduce
command. Merge and sort two or more files, and the reduce command will
then sum up counts (or more generally, apply the operator) without
losing information.
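
The following sketch shows why the order of merging does not matter for
“+” rows: reducing each site separately and then reducing the partial
results gives the same totals as reducing everything at once. As an
assumption for illustration, it uses a small awk stand-in that handles
only “+” rows on header-less tab-separated input, not the real
rssacint_reduce; the function names and the keys “+q” and “+r” are
hypothetical.

```shell
#!/bin/sh
# Simplified "+"-only stand-in; this is NOT rssacint_reduce.
sum_plus_rows() {
    awk -F '\t' '
        NR > 1 && $1 != key { print key "\t" total; total = 0 }
        { key = $1; total += $2 }
        END { if (NR > 0) print key "\t" total }
    '
}
# Sort by key (byte order), then sum counts per key.
reduce_sorted() { LC_COLLATE=C sort -k 1,1 | sum_plus_rows; }

tmp=$(mktemp -d)
printf '+q\t4\n+r\t1\n' > "$tmp/site1"   # hypothetical rows from one site
printf '+q\t6\n' > "$tmp/site2"          # hypothetical rows from another

# Reduce each site on its own, then merge the partial results...
reduce_sorted < "$tmp/site1" > "$tmp/part1"
reduce_sorted < "$tmp/site2" > "$tmp/part2"
two_stage=$(cat "$tmp/part1" "$tmp/part2" | reduce_sorted)

# ...or reduce everything in one pass; the totals agree.
one_stage=$(cat "$tmp/site1" "$tmp/site2" | reduce_sorted)
[ "$two_stage" = "$one_stage" ] && echo "same totals"
rm -r "$tmp"
```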

As the last step, count the number of unique sources and convert to
YAML. These steps lose information.

The Specific Workflow

A full pipeline is:

1.  collect pcaps of all traffic. We use LANDER. Alternates: dnscap.

    We assume pcaps show up as a series of files with dates and/or
    sequence numbers. For B, they look like
    20151227-050349-00203216.lax.pcap, where the last set of numbers is
    a sequence number and “lax” is a site name.

2.  extract the DNS queries to “message_question” format. We use
    dnsanon. Dnsanon is packaged separately at
    https://ant.isi.edu/software/dnsanon/index.html.

         < 20151227-050349-00203216.pcap  dnsanon -i - -o . -p mQ -f 20151227-050349-00203216

    will write the file 20151227-050349-00203216.message_question.xz

    (this code should actually be

         < 20151227-050349-00203216.pcap  dnsanon -i - -o - -p Q > 20151227-050349-00203216.message_question.xz

    but a bug in dnsanon-1.3 (to be fixed in dnsanon-1.4) causes this
    pipeline to not work.)

3.  convert messages to rssacint format. Use ./message_to_rssacint.

         xzcat 20151227-050349-00203216.message_question.xz | \
         ./message_to_rssacint --file-seqno=203216 >20151227-050349-00203216.rssacint

4.  optionally (but recommended), process that rssacint format locally
    to reduce data size:

         < 20151227-050349-00203216.rssacint LC_COLLATE=C sort -k 1,1 | \
           ./rssacint_reduce > smaller.20151227-050349-00203216.rssacint.fsdb

5.  merge all rssacint files into one big one and reduce it (can be done
    multiple times).

         cat smaller*.rssacint.fsdb | LC_COLLATE=C sort -k 1,1 | ./rssacint_reduce > complete.rssacint.fsdb

6.  reduce it again to count unique IPs

         < complete.rssacint.fsdb ./rssacint_reduce --count-ips > complete.rssacfin.fsdb

7.  convert rssacfin to YAML. We use ./rssacfin_to_rssacyaml:

         < complete.rssacfin.fsdb ./rssacfin_to_rssacyaml

In Hadoop terms, steps 2 and 3 are the map phase, step 4 is a combiner,
step 5 is a reduce phase, and steps 6 and 7 are a second reduce phase.
When we run with Hadoop we often do steps 6 and 7 as a single process.

(And there is nothing magical about Hadoop. The only requirement is that
data be sorted before any rssacint_reduce step.)
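
Without Hadoop, steps 3 and 4 might be run as a plain shell loop, as in
the sketch below. It assumes the file naming shown in step 2, that
steps 3 and 4 can be joined in one pipe instead of writing the
intermediate .rssacint file, and that the sequence number should be
passed without leading zeros (as in the step 3 example). The helper
seqno_from_name is hypothetical and not part of this package.

```shell
#!/bin/sh
# Hypothetical helper: pull the sequence number out of a name such as
# 20151227-050349-00203216.message_question.xz (prints 203216).
seqno_from_name() {
    base=${1##*/}       # strip any directory part
    base=${base%%.*}    # keep the part before the first dot
    seq=${base##*-}     # keep the digits after the last dash
    # Strip leading zeros (assumption, following the step 3 example).
    seq=$(printf '%s' "$seq" | sed 's/^0*//')
    printf '%s\n' "${seq:-0}"
}

# Steps 3 and 4 for every message file in the current directory.
for f in *.message_question.xz; do
    [ -e "$f" ] || continue    # nothing matched the pattern
    stem=${f%.message_question.xz}
    xzcat "$f" |
        ./message_to_rssacint --file-seqno="$(seqno_from_name "$f")" |
        LC_COLLATE=C sort -k 1,1 |
        ./rssacint_reduce > "smaller.$stem.rssacint.fsdb"
done
```

The resulting smaller.*.rssacint.fsdb files are then ready for the merge
in step 5.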

Detailed Documentation and Sample Output

Each program has a manual page with examples and short sample input and
output.

Extended sample output is included in the sample_data subdirectory. Run
cd sample_data; make test to exercise this sample output as a test
suite.

At B

For B-Root, we capture about one pcap file every minute or two (step 1)
and process them incrementally over the day (steps 2, 3, and 4). Every
night we run step 5 as a map-reduce job with Hadoop, and run the final
reduce directly (without Hadoop).

On occasion we have re-run an entire day’s computation (steps 2 through
7). We can process that in a few hours on a moderate-size (about
120-core) Hadoop cluster.

Each pcap file is 2GB uncompressed. Each message file is about 200MB
compressed (xz). A merged rssacint file for a day of traffic is
typically 10MB after xz compression. After counting unique IPs, this
drops to about 2KB.
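
For a rough sense of daily volume, the figures above can be multiplied
out. This is only back-of-the-envelope arithmetic that takes “a minute
or two” per file, 2GB per pcap, and 200MB per message file at face
value; actual volumes will vary. The function name daily_volume is made
up for the example.

```shell
#!/bin/sh
# Rough daily volume for one pcap file every N minutes, using the
# approximate per-file sizes quoted above (2 GB pcap, 200 MB messages).
daily_volume() {
    minutes_per_file=$1
    files=$((24 * 60 / minutes_per_file))
    echo "$files files/day: $((files * 2)) GB pcap, $((files * 200 / 1000)) GB message files"
}

daily_volume 1   # one file per minute
daily_volume 2   # one file every two minutes
```

Compared with the roughly 10MB merged rssacint file per day, this shows
how much the reduce steps shrink the data.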

Validation

We have checked our computations for internal consistency and against
the Hedgehog implementation of RSSAC-002. We believe our results are
internally consistent. We see some differences with Hedgehog’s numbers,
but they are close. We believe some differences are due to B-Root’s
specific use of Hedgehog, which triggers a limitation of Hedgehog that we
have never worked around.

The included program dsc_to_rssacint converts Hedgehog’s modified DSC
output to rssacint. Although we do not recommend it for production use,
it may be useful to compare implementations.

Our program includes test cases (make test) and sample data.

Installation

These programs use the standard Perl build system. To install:

    perl Makefile.PL
    make
    make test
    make install

For customization options, see ExtUtils::MakeMaker::FAQ(3) or
http://perldoc.perl.org/ExtUtils/MakeMaker/FAQ.html.

The current version of dnsanon_rssac is at
https://ant.isi.edu/software/dnsanon_rssac/.

This program depends on dnsanon, available from
https://ant.isi.edu/software/dnsanon/.

Releases

- dnsanon_rssac-1.0 2016-05-29: First public release.
- dnsanon_rssac-1.1 2016-05-29: Corrects RPM build specification.
- dnsanon_rssac-1.2 2016-05-31: Adds dsc_to_rssacint.
- dnsanon_rssac-1.3 2016-06-13: Updated to default to output rssac002v3
  (use --version v2 for old).
- dnsanon_rssac-1.4 2016-12-11: Better error messages in
  rssacfin_to_rssacyaml, and a fix to --file-seqno with sites (as in
  --file-seqno=lax:1)
- dnsanon_rssac-1.5 2016-12-29: Add a --fileseqno=comment option to
  message_to_rssacint
- dnsanon_rssac-1.6 2017-03-15: Improved documentation of file format.
  Improved test suite. Fixed non-YAML in extra output. (Thanks to Duane
  Wessels for reporting this bug.)
- dnsanon_rssac-1.7 2017-05-04: Greatly improved performance for IPv6 (a
  factor of 50!).
- dnsanon_rssac-1.8 2017-05-11: Fix a corner case with :: IPv6 addresses
  (prevents failing with “Modification of non-creatable array value
  attempted, subscript -1 at …”)
- dnsanon_rssac-1.9 2017-05-11: Better fix for corner case with :: IPv6
  addresses
- dnsanon_rssac-1.10 2020-02-23: Add support for RSSAC002v4, defaulting
  to v4-lax.
- dnsanon_rssac-1.11 2020-03-23: update test suites for RSSAC002v4.
- dnsanon_rssac-1.12 2020-06-30: remove unnecessary quotes from
  start-time, and fix chown bug.
- dnsanon_rssac-1.13 2020-07-01: move rcodes to the toplevel for
  RSSAC002v3 and later. (Fixing a long-standing divergence from the
  specification. Thanks Anand Buddhdev for reporting this error.)
- dnsanon_rssac-1.14 2020-07-02: add missing timezone (Z) to dates.
- dnsanon_rssac-1.15 2021-05-06: Add --license option, which adds a
  license field to extras.
- dnsanon_rssac-1.16 2021-06-12: rssacint_reducer now correctly
  propagates : rows, rather than throwing an error.
- dnsanon_rssac-1.17 2021-06-13: rssacint_reducer no longer adds /e to
  overlapping rangelists.
- dnsanon_rssac-1.18 2021-12-02: message_to_rssacint now reports
  service addresses (with “s”) and rssacfin_to_rssacyaml puts them in
  “extra”; updated for, and now defaults to, RSSAC002v5.
- dnsanon_rssac-1.19 2023-12-19: message_to_rssacint now outputs tls
  and https for DoT and DoH, and rssacfin_to_rssacyaml reports them.
- dnsanon_rssac-1.20 2024-01-04: tls and https statistics in yaml are
  now prefixed by “b-” until perhaps v6 standardizes them.
- dnsanon_rssac-1.21 2024-01-06: message_to_rssacint now understands
  dnstapmq comments.
- dnsanon_rssac-1.22 2024-01-07: message_to_rssacint now handles a dash
  in site names.
- dnsanon_rssac-1.23 2024-01-07: rssacfin_to_rssacyaml fixes bugs and
  typos in tls accounting.
- dnsanon_rssac-1.24 2025-06-09: message_to_rssacint now reports both
  queries and replies per service address (with “s”) and
  rssacfin_to_rssacyaml puts them in “b-extra”.
- dnsanon_rssac-1.25 2025-07-23: message_to_rssacint now has modes to
  optionally count unique IPs per service address and TLD.
- dnsanon_rssac-1.26 2025-07-25: message_to_rssacint restores query
  counts for all modes.
- dnsanon_rssac-1.27 2025-09-25: message_to_rssacint handles EDNS
  extended rcodes better, adds support for TLD magnitude queries, and
  deprecates --mode in favor of options for different expensive elements.
  rssacint_reduce: generalize handling of “hard” counts.
- dnsanon_rssac-1.28 2025-10-01: rssacfin_to_rssacyaml: add optional
  --service-address-regexp to filter.
- dnsanon_rssac-1.29 2025-10-06: rssacint_reduce: correctly handle
  --count-ips for the new hard sections.
- dnsanon_rssac-1.30 2025-10-24: New rssacint_to_magnitude_tsv to
  extract TLDs to a tab-separated value format for input to
  dns-magnitude
- dnsanon_rssac-1.31 2025-11-06: Fix syntax errors in 1.30.
- dnsanon_rssac-1.32 2026-03-30: Add another option for setting
  --file-seqno, and avoid dnsmag records in YAML conversion.

Feedback

We are interested in feedback, particularly about correctness, and in
hearing from other active users.

Please contact John Heidemann <johnh@isi.edu> with comments.
