Work planning
2019-08-29 - Future - Tony Finch
I'm back from a summer holiday and it is "Back to School" season, so now seems like a good time to take stock and write down some plans.
This is roughly in order of priority.
KSK rollover project status
2019-02-07 - Progress - Future - Tony Finch
I have spent the last week working on DNSSEC key rollover automation in BIND. Or rather, I have been doing some cleanup and prep work. With reference to the work I listed in the previous article...
Done
- Stop BIND from generating SHA-1 DS and CDS records by default, per RFC 8624
- Teach dnssec-checkds about CDS and CDNSKEY
Started
- Teach superglue to use CDS/CDNSKEY records, with similar logic to dnssec-checkds
The "similar logic" is implemented in dnssec-dsfromkey, so I don't actually have to write the code more than once. I hope this will also be useful for other people writing similar tools!
Some of my small cleanup patches have been merged into BIND. We are currently near the end of the 9.13 development cycle, so this work is going to remain out of tree for a while until after the 9.14 stable branch is created and the 9.15 development cycle starts.
Next
So now I need to get to grips with dnssec-coverage and dnssec-keymgr.
Simple safety interlocks
The purpose of the dnssec-checkds improvements is to make it usable as a safety check.
During a KSK rollover, there are one or two points when the DS records in the parent need to be updated. The rollover must not continue until this update has been confirmed; otherwise the delegation can break.
I am using CDS and CDNSKEY records as the signal from the key management and zone signing machinery for when DS records need to change. (There's a shell-style API in dnssec-dsfromkey -p, but that is implemented by just reading these sync records, not by looking into the guts of the key management data.) I am going to call them "sync records" so I don't have to keep writing "CDS/CDNSKEY"; "sync" is also the keyword used by dnssec-settime for controlling these records.
Key timing in BIND
The dnssec-keygen and dnssec-settime commands (which are used by dnssec-keymgr) schedule when changes to a key will happen.
There are parameters related to adding a key: when it is published in the zone, when it becomes actively used for signing, etc. And there are parameters related to removing a key: when it becomes inactive for signing, when it is deleted from the zone.
There are also timing parameters for publishing and deleting sync records. These sync times are the only timing parameters that say when we must update the delegation.
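For illustration, here is a rough Python sketch (not part of BIND) that reads a key's timing metadata by shelling out to dnssec-settime; it assumes the -u and -p all print options behave as described in the BIND ARM, so treat the details as an assumption to check against your BIND version.

```python
# Hypothetical helper: read a key's timing metadata via dnssec-settime.
# Assumes "-p all" prints all timing metadata as "Label: value" lines and
# "-u" switches the values to Unix epoch seconds; verify against your
# BIND version before relying on this.
import subprocess

def key_times(keyfile):
    """Return a dict mapping each timing event printed by dnssec-settime
    (publish, activate, inactive, delete, the sync times, ...) to its
    epoch time, or None if unset."""
    out = subprocess.run(
        ["dnssec-settime", "-u", "-p", "all", keyfile],
        capture_output=True, text=True, check=True,
    ).stdout
    times = {}
    for line in out.splitlines():
        name, _, value = line.partition(":")
        value = value.strip()
        times[name.strip()] = None if value in ("", "UNSET") else int(value)
    return times

# Example: key_times("Kexample.org.+013+12345") -> {"Publish": 1567000000, ...}
```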
What can break?
The point of the safety interlock is to prevent any breaking key changes from being scheduled until after a delegation change has been confirmed. So what key timing events need to be forbidden from being scheduled after a sync timing event?
Events related to removing a key are particularly dangerous. There are some cases where it is OK to remove a key prematurely, if the DS record change is also about removing that key, and there is another working key and DS record throughout. But it seems simpler and safer to forbid all removal-related events from being scheduled after a sync event.
However, events related to adding a key can also lead to nonsense. If we blindly schedule creation of new keys in advance, without verifying that they are also being properly removed, then the zone can accumulate a ridiculous number of DNSKEY records. This has been observed in the wild surprisingly frequently.
A simple rule
There must be no KSK changes of any kind scheduled after the next sync event.
This rule applies regardless of the flavour of rollover (double DS, double KSK, algorithm rollover, etc.)
Applying this rule to BIND
Whereas for ZSKs, dnssec-coverage ensures rollovers are planned for some fixed period into the future, for KSKs it must check correctness up to the next sync event, then ensure nothing will occur after that point.
In dnssec-keymgr, the logic should be (a rough sketch in code follows this list):
- If the current time is before the next sync event, ensure there is key coverage until that time and no further.
- If the current time is after all KSK events, use dnssec-checkds to verify the delegation is in sync.
- If dnssec-checkds reports an inconsistency and we are within some sync interval dictated by the rollover policy, do nothing while we wait for the delegation update automation to work.
- If dnssec-checkds reports an inconsistency and the sync interval has passed, report an error because operator intervention is required to fix the failed automation.
- If dnssec-checkds reports everything is in sync, schedule keys up to the next sync event. The timing needs to be relative to this point in time, since any delegation update delays can make it unsafe to schedule relative to the last sync event.
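Here is a minimal Python sketch of that decision logic, under stated assumptions: next_sync_time(), last_ksk_event(), ensure_coverage_until() and schedule_next_rollover() are hypothetical hooks into the key-management machinery, and dnssec-checkds is assumed to exit non-zero when the parent's DS RRset does not match the child's CDS records. It is not the real dnssec-keymgr code.

```python
import subprocess
import time

def delegation_in_sync(zone):
    """Ask dnssec-checkds whether the parent's DS RRset matches the
    child's CDS records (non-zero exit assumed to mean "not in sync")."""
    return subprocess.run(["dnssec-checkds", zone]).returncode == 0

def manage_ksk(zone, policy, keys):
    """Safety-interlock decision logic for one zone's KSKs, following the
    list above. policy.sync_interval is the grace period allowed for the
    parent to pick up a DS change."""
    now = time.time()
    sync = next_sync_time(keys)          # next CDS/CDNSKEY publish/delete event
    if sync is not None and now < sync:
        # Before the next sync event: cover up to it, and no further.
        ensure_coverage_until(zone, keys, until=sync)
    elif now > last_ksk_event(keys):
        if delegation_in_sync(zone):
            # Confirmed: plan the next steps relative to *now*,
            # up to (and not beyond) the next sync event.
            schedule_next_rollover(zone, policy, keys, start=now)
        elif now - last_ksk_event(keys) < policy.sync_interval:
            pass  # wait for the delegation update automation to catch up
        else:
            raise RuntimeError("DS update automation failed; operator "
                               "intervention required for " + zone)
```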
Caveat
At the moment I am still not familiar with the internals of dnssec-coverage and dnssec-keymgr, so there's a risk that I might have to re-think these plans. But I expect this simple safety rule will be a solid anchor that can be applied to most DNSSEC key management scenarios. (However, I have not thought hard enough about recovery from breakage or compromise.)
DNSSEC key rollover automation with BIND
2019-01-30 - Future - Tony Finch
I'm currently working on filling in the missing functionality in BIND that is needed for automatic KSK rollovers. (ZSK rollovers are already automated.) All these parts exist; but they have gaps and don't yet work together.
The basic setup that will be necessary on the child is:
- Write a policy configuration for dnssec-keymgr.
- Write a cron job to run dnssec-keymgr at a suitable interval. If the parent does not run dnssec-cds then this cron job should also run superglue or some other program to push updates to the parent (a rough sketch of such a cron job follows).
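As a rough illustration of the second item, a cron job along these lines might do; the zone name, the key directory, and the exact superglue invocation are assumptions, not a recommended setup.

```python
#!/usr/bin/env python3
# Sketch of the cron job described above, for the case where the parent
# does NOT run dnssec-cds, so the delegation update is pushed from our
# side with superglue (or similar). Arguments are illustrative only.
import subprocess
import sys

ZONE = "example.org"           # hypothetical zone
KEYDIR = "/var/lib/bind/keys"  # hypothetical key directory
PARENT_RUNS_DNSSEC_CDS = False

def run(*cmd):
    print("+", " ".join(cmd), file=sys.stderr)
    subprocess.run(cmd, check=True)

# Let dnssec-keymgr bring the key files in line with the policy.
run("dnssec-keymgr", "-K", KEYDIR, ZONE)

# If the parent does not poll our CDS/CDNSKEY records itself,
# push the DS change to the parent.
if not PARENT_RUNS_DNSSEC_CDS:
    run("superglue", ZONE)
```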
The KSK rollover process will be driven by dnssec-keymgr, but it will not talk directly to superglue or dnssec-cds, which make the necessary changes. In fact it can't talk to dnssec-cds because that is outside the child's control.
So, as specified in RFC 7344, the child will advertise the desired state of its delegation using CDS and CDNSKEY records. These are read by dnssec-cds or superglue to update the parent. superglue will be loosely coupled, and able to work with any DNSSEC key management software that publishes CDS records.
The state of the keys in the child is controlled by the timing parameters in the key files, which are updated by dnssec-keymgr as determined by the policy configuration. At the moment it generates keys to cover some period into the future. For KSKs, I think it will make more sense to generate keys up to the next DS change, then stop until dnssec-checkds confirms the parent has implemented the change, before continuing. This is a bit different from the ZSK coverage model, but future coverage for KSKs can't be guaranteed because coverage depends on future interactions with an external system which cannot be assumed to work as planned.
Required work
- Teach dnssec-checkds about CDS and CDNSKEY
- Teach dnssec-keymgr to set "sync" timers in key files, and to invoke dnssec-checkds to avoid breaking delegations.
- Teach dnssec-coverage to agree with dnssec-keymgr about sensible key configuration.
- Teach superglue to use CDS/CDNSKEY records, with similar logic to dnssec-checkds
- Stop BIND from generating SHA-1 DS and CDS records by default, per draft-ietf-dnsop-algorithm-update
IPv6 prefixes and LAN names
2018-12-06 - Future - Tony Finch
I have added a note to the ipreg schema wishlist that it should be possible for COs to change LAN names associated with IPv6 prefixes.
A note on prepared transactions
2018-04-24 - Future - Tony Finch
Some further refinements of the API behind shopping-cart style prepared transactions:
On the server side, the prepared transaction is a JSON-RPC request blob which can be updated with HTTP PUT or PATCH. Ideally the server should be able to verify that the result of the PATCH is a valid JSON-RPC blob so that it doesn't later try to perform an invalid request. I am planning to do API validity checks using JSON schema.
This design allows the prepared transaction storage to be just a simple JSON blob store, ignorant of what the blob is for except that it has to match a given schema. (I'm not super keen on nanoservices so I'll just use a table in the ipreg database to store it, but in principle there can be some nice decoupling here.)
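As a sketch of the kind of validity check I have in mind, the Python jsonschema package can verify that a stored blob is at least a well-formed JSON-RPC request envelope; the schema below is deliberately minimal and makes no attempt to cover the IP Register API itself.

```python
# Minimal sketch of the "is this blob a valid JSON-RPC request?" check,
# using the Python jsonschema package. A real deployment would also
# constrain "method" and "params" against the API's own schemas.
import jsonschema

JSONRPC_REQUEST_SCHEMA = {
    "type": "object",
    "required": ["jsonrpc", "method"],
    "properties": {
        "jsonrpc": {"const": "2.0"},
        "id": {"type": ["string", "number", "null"]},
        "method": {"type": "string"},
        "params": {"type": ["array", "object"]},
    },
    "additionalProperties": False,
}

def validate_prepared_blob(blob):
    """Raise jsonschema.ValidationError if the stored blob is not a
    well-formed JSON-RPC request object."""
    jsonschema.validate(instance=blob, schema=JSONRPC_REQUEST_SCHEMA)
```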
It also suggests a more principled API design: An immediate transaction (typically requested by an API client) might look like the following (based on JSON-RPC version 1.0 system.multicall syntax):
{ jsonrpc: "2.0", id: 0, method: "rpc.transaction",
  params: [ { jsonrpc: "2.0", id: 1, method: ... },
            { jsonrpc: "2.0", id: 2, method: ... },
            ... ] }
When a prepared transaction is requested (typically by the browser UI) it will look like:
{ jsonrpc: "2.0", id: 0, method: "rpc.transaction", params: { prepared: "#" } }
The "#" is a relative URI referring to the blob stored on the JSON-RPC endpoint (managed by the HTTP methods other than POST) - but it could in principle be any URI. (Tho this needs some thinking about SSRF security!) And I haven't yet decided if I should allow an arbitrary JSON pointer in the fragment identifier :-)
If we bring back rpc.multicall (JSON-RPC changed the reserved prefix from system. to rpc.) we gain support for prepared non-transactional batches. The native batch request format becomes a special case abbreviation of an in-line rpc.multicall request.
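To illustrate the abbreviation, here is a trivial Python sketch that rewrites a native batch array as an in-line rpc.multicall request; the rpc.multicall method name comes from the discussion above, and nothing else here is part of a published spec.

```python
def batch_to_multicall(batch, call_id=0):
    """Wrap a native JSON-RPC batch (a bare array of request objects)
    in an equivalent in-line rpc.multicall request."""
    return {
        "jsonrpc": "2.0",
        "id": call_id,
        "method": "rpc.multicall",
        "params": batch,
    }

# batch_to_multicall([{"jsonrpc": "2.0", "id": 1, "method": "foo"},
#                     {"jsonrpc": "2.0", "id": 2, "method": "bar"}])
```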
DNS server QA traffic
2018-03-28 - Future - Tony Finch
Yesterday I enabled serve-stale on our recursive DNS servers, and after a few hours one of them crashed messily. The automatic failover setup handled the crash reasonably well, and I disabled serve-stale to avoid any more crashes.
How did this crash slip through our QA processes?
Test server
My test server is the recursive resolver for my workstations, and the primary master for my personal zones. It runs a recent development snapshot of BIND. I use it to try out new features, often months before they are included in a release, and I help to shake out the bugs.
In this case I was relatively late enabling serve-stale, so I was only running it for five weeks before enabling it in production.
It's hard to tell whether a longer test at this stage would have exposed the bug, because there are relatively few junk queries on my test server.
Pre-heat
Usually when I roll out a new version of BIND, I will pre-heat the cache of an upgraded standby server before bringing it into production. This involves making about a million queries against the server based on a cache dump from a live server. This also serves as a basic smoke test that the upgrade is OK.
I didn't do a pre-heat before enabling serve-stale because it was just a config change that can be done without affecting service.
But it isn't clear that a pre-heat would have exposed this bug because the crash required a particular pattern of failing queries, and the cache dump did not contain the exact problem query (though it does contain some closely related ones).
Possible improvements?
An alternative might be to use live traffic as test data, instead of a static dump. A bit of code could read a dnstap feed on a live server, and replay the queries against another server (a rough sketch follows the list). There are two useful modes:
- test traffic: replay incoming (recursive client-facing) queries; this reproduces the current live full query load on another server for testing, in a way that is likely to have reproduced yesterday's crash.
- continuous warming: replay outgoing (iterative Internet-facing) queries; these are queries used to refill the cache, so they are relatively low volume, and suitable for keeping a standby server's cache populated.
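As a rough sketch of the replay side, assuming the dnstap feed has already been rendered into lines of qname and qtype (for example by post-processing dnstap-read output — the dnstap decoding itself is not shown), something like this dnspython loop would do; the target address is a placeholder.

```python
# Replay a line-based stream of "qname qtype" pairs against a test server.
import sys
import dns.message
import dns.query

TARGET = "192.0.2.53"  # hypothetical address of the server under test

def replay(lines, target=TARGET):
    for line in lines:
        try:
            qname, qtype = line.split()[:2]
            query = dns.message.make_query(qname, qtype)
            # Fire and mostly forget: we care about load patterns, not answers.
            dns.query.udp(query, target, timeout=1)
        except Exception:
            continue  # ignore malformed lines and timeouts

if __name__ == "__main__":
    replay(sys.stdin)
```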
There are a few cases where researchers have expressed interest in DNS query data, of either of the above types. In order to satisfy them we would need to be able to split a full dnstap feed so that recipients only get the data they want.
This live DNS replay idea needs a similar dnstap splitter.
Transactions and JSON-RPC
2018-03-02 - Future - Tony Finch
The /update API endpoint that I outlined turns out to be basically JSON-RPC 2.0, so it seems to be worth making the new IP Register API follow that spec exactly.
However, there are a couple of difficulties wrt transactions.
The current not-an-API list_ops page runs each requested action in a separate transaction. It should be possible to make similar multi-transaction batch requests with the new API, but my previous API outline did not support this.
A JSON-RPC batch request is a JSON array of request objects, i.e. the same syntax as I previously described for /update transactions, except that JSON-RPC batches are not transactional. This is good for preserving list_ops functionality but it loses one of the key points of the new API.
There is a simple way to fix this problem, based on a fairly well-known idea. XML-RPC doesn't have batch requests like JSON-RPC, but they were retro-fitted by defining a system.multicall method which takes an array of requests and returns an array of responses.
We can define transactional JSON-RPC requests in the same style, like this:
{ "jsonrpc": "2.0", "id": 0, "method": "transaction", "params": [ { "jsonrpc": "2.0", "id": 1, "method": "foo", "params": { ... } }, { "jsonrpc": "2.0", "id": 2, "method": "bar", "params": { ... } } ] }
If the transaction succeeds, the outer response contains a "result" array of successful response objects, exactly one for each member of the request params array, in any order.
If the transaction fails, the outer response contains an "error" object, which has "code" and "message" members indicating a transaction failure, and an "error" member which is an array of response objects. This will contain at least one failure response; it may contain success responses (for actions which were rolled back); some responses may be missing.
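Here is a sketch of how a server might implement this, assuming a DB-API style connection and a hypothetical dispatch() helper that runs one sub-request and returns its response object; neither is part of the IP Register code.

```python
class TransactionFailed(Exception):
    def __init__(self, responses):
        self.responses = responses

def run_transaction(conn, request):
    """Run the sub-requests of a "transaction" request in one database
    transaction, committing only if every sub-request succeeds."""
    responses = []
    try:
        for sub in request["params"]:
            responses.append(dispatch(conn, sub))   # hypothetical helper
            if "error" in responses[-1]:
                raise TransactionFailed(responses)
        conn.commit()
        return {"jsonrpc": "2.0", "id": request["id"], "result": responses}
    except TransactionFailed as failed:
        conn.rollback()
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {
                "code": -32000,                # implementation-defined error
                "message": "transaction failed",
                "error": failed.responses,     # per the design described above
            },
        }
```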
Edited to add: I've described some more refinements to this idea.
User interface sketch
2018-02-12 - Future - Tony Finch
The current IP Register user interface closely follows the database schema: you choose an object type (i.e. a table) and then you can perform whatever search/create/update/delete operations you want. This is annoying when I am looking for an object and I don't know its type, so I often end up grepping the DNS or the textual database dumps instead.
I want the new user interface to be search-oriented. The best existing example within the UIS is Lookup. The home page is mostly a search box, which takes you to a search results page, which in turn has links to per-object pages, which in turn are thoroughly hyperlinked.
ANAME vs aname
2018-02-01 - Future - Tony Finch
The IETF dnsop working group are currently discussing a draft specification for an ANAME RR type. The basic idea is that an ANAME is like a CNAME, except it only works for A and AAAA IP address queries, and it can coexist with other records such as SOA (at a zone apex) or MX.
I'm following the ANAME work with great interest because it will make certain configuration problems much simpler for us. I have made some extensive ANAME review comments.
An ANAME is rather different from what the IP Register database calls an aname object. An aname is a name for a set of existing IP addresses, which can be an arbitrary subset of the combined addresses of multiple boxes or vboxes, whereas an ANAME copies all the addresses from exactly one target name.
There is more about the general problem of aliases in the IP Register database in one of the items I posted in December. I am still unsure how the new aliasing model might work; perhaps it will become more clear when I have a better idea of how the existing aname implementation works and what its limitations are.
High-level API design
2018-01-09 - Future - Tony Finch
This is just to record my thoughts about the overall shape of the IP Register API; the details are still to be determined, but see my previous notes on the data model and look at the old user interface for an idea of the actions that need to be available.
IP Register schema wishlist
2017-12-19 - Future - Tony Finch
Here are some criticisms of the IP Register database schema and some thoughts on how we might change it.
There is a lot of infrastructure work to do before I am in a position to make changes - principally, porting from Oracle to PostgreSQL, and developing a test suite so I can make changes with confidence.
Still, it's worth writing down my thoughts so far, so colleagues can see what I have in mind, and so we have some concrete ideas to discuss.
I expect to add to this list as thoughts arise.
Authentication and access control
2017-12-06 - Future - Tony Finch
The IP Register database is an application hosted on Jackdaw, which is a platform based on Oracle and Apache mod_perl.
IP Register access control
Jackdaw and Raven handle authentication, so the IP Register database only needs to concern itself with access control. It does this using views defined with check option, as is briefly described in the database overview and visible in the SQL view DDL.
There are three levels of access to the database:
- the registrar table contains privileged users (i.e. the UIS network systems team) who have read/write access to everything via the views with the all_ prefix.
- the areader table contains semi-privileged users (i.e. certain other UIS staff) who have read-only access to everything via the views with the ra_ prefix.
- the mzone_co table contains normal users (i.e. computer officers in other institutions) who have read-write access to their mzone(s) via the views with the my_ prefix.
Apart from a few special cases, all the underlying tables in the database are available in all three sets of views.
IP Register user identification
The first part of the view definitions is where the IP Register database schema is tied to the authenticated user. There are two kinds of connection: either a web connection authenticated via Raven, or a direct sqlplus connection authenticated with an Oracle password.
SQL users are identified by Oracle's user function; Raven users are obtained from the sys_context() function, which we will now examine more closely.
Porting to PostgreSQL
We are fortunate that support for create view with check option was added to PostgreSQL by our colleague Dean Rasheed.
The sys_context() function is a bit more interesting.
The Jackdaw API
Jackdaw's mod_perl-based API is called WebDBI, documented at https://jackdaw.cam.ac.uk/webdbi/
There's some discussion of authentication and database connections at https://jackdaw.cam.ac.uk/webdbi/webdbi.html#authentication and https://jackdaw.cam.ac.uk/webdbi/webdbi.html#sessions but it is incomplete or out of date; in particular it doesn't mention Raven (and I think basic auth support has been removed).
The interesting part is the description of sessions. Each web server process makes one persistent connection to Oracle which is re-used for many HTTP requests. How is one database connection securely shared between different authenticated users, without giving the web server enormously privileged access to the database?
Jackdaw authentication - perl
Instead of mod_ucam_webauth, WebDBI has its own implementation of the Raven protocol - see jackdaw:/usr/local/src/httpd/Database.pm.
This mod_perl code does not do all of the work; instead it calls stored procedures to complete the authentication. On initial login it calls raven_auth.create_raven_session() and for a returning user with a cookie it calls raven_auth.use_raven_session().
Jackdaw authentication - SQL
These raven_auth stored procedures set the authenticated user that is retrieved by the sys_context() call in the IP Register views - see jackdaw:/usr/local/src/httpd/raven_auth/.
Most of the logic is written in PL/SQL, but there is also an external procedure written in C which does the core cryptography - see jackdaw:/usr/local/oracle/extproc/RavenExtproc.c.
Porting to PostgreSQL - reprise
On the whole I like Jackdaw's approach to preventing the web server from having too much privilege, so I would like to keep it, though in a simplified form.
As far as I know, PostgreSQL doesn't have anything quite like sys_context() with its security properties, though you can get similar functionality using PL/Perl.
However, in the future I want more heavy-weight sessions that have more server-side context, in particular the "shopping cart" pending transaction.
So I think a better way might be to have a privileged session table, keyed by the user's cookie and containing their username and jsonb session data, etc. This table is accessed via security definer functions, with something like Jackdaw's create_raven_session(), plus functions for getting the logged-in user (to replace sys_context()) and for manipulating the jsonb session data.
We can provide ambient access to the cookie using the set session command at the start of each web request, so the auth functions can retrieve it using the current_setting() function.
Streaming replication from PostgreSQL to the DNS
2016-12-23 - Future - Tony Finch
This entry is backdated - I'm writing this one year after I made this experimental prototype.
Our current DNS update mechanism runs as an hourly batch job. It would be nice to make DNS changes happen as soon as possible.
user interface matters
Instant DNS updates have tricky implications for the user interface.
At the moment it's possible to make changes to the database in between batch runs, knowing that broken intermediate states don't matter, and with plenty of time to check the changes and make sure the result will be OK.
If the DNS is updated immediately, we need a way for users to be able to prepare a set of inter-related changes, and submit them to the database as a single transaction.
(Aside: I vaguely imagine something like a shopping-cart UI that's available for collecting more complicated changes, though it should be possible to submit simple updates without a ceremonial transaction.)
This kind of UI change is necessary even if we simply run the current batch process more frequently. So we can't reasonably deploy this without a lot of front-end work.
back-end experiments
Ideally I would like to keep the process of exporting the database to the DNS and DHCP servers as a purely back-end matter; the front-end user interface should only be a database client.
So, assuming we have a better user interface, we would like to be able to get instant DNS updates by improvements to the back end without any help from the front end.
PostgreSQL has a very tempting replication feature called "logical decoding", which takes a replication stream and turns it into a series of database transactions. You can write a logical decoding plugin which emits these transactions in whatever format you want.
With logical decoding, we can (with a bit of programming) treat the DNS as a PostgreSQL replication target, with a script that looks something like pg_recvlogical | nsupdate.
I wrote a prototype along these lines, which is published at https://git.uis.cam.ac.uk/x/uis/ipreg/pg-decode-dns-update.git
status of this prototype
The plugin itself works in a fairly satisfactory manner.
However it needs a wrapper script to massage transactions before they are fed into nsupdate, mainly to split up very large transactions that cannot fit in a single UPDATE request.
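A rough sketch of such a wrapper, assuming the decoding plugin emits one nsupdate-style "update add/delete ..." line per DNS change; the chunk size and the use of nsupdate -l are illustrative assumptions.

```python
# Read DNS change lines on stdin and feed them to nsupdate in chunks,
# so no single UPDATE message grows too large.
import subprocess
import sys

CHUNK = 200  # max changes per UPDATE request; tune to stay under limits

def send_chunk(lines):
    if not lines:
        return
    # nsupdate reads commands on stdin; "send" flushes one UPDATE message.
    script = "".join(lines) + "send\n"
    subprocess.run(["nsupdate", "-l"], input=script, text=True, check=True)

def main(stream=sys.stdin):
    pending = []
    for line in stream:
        pending.append(line)
        if len(pending) >= CHUNK:
            send_chunk(pending)
            pending = []
    send_chunk(pending)

if __name__ == "__main__":
    main()
```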
The remaining difficult work is related to starting, stopping, and pausing replication without losing transactions. In particular, during initial deployment we need to be able to pause replication and verify that the replicated updates are faithfully reproducing what the batch update would have done. We can use the same pause/batch/resume mechanism to update the parts of the DNS that are not maintained in the database.
At the moment we are not doing any more work in this area until the other prerequisites are in place.