Public Comment Close
Statement Name
Status
Assignee(s)
Call for Comments Open
Call for Comments Close
Vote Open
Vote Close
Date of Submission
Staff Contact and Email
Statement Number

02 April 2018

ADOPTED

13Y, 0N, 0A

01 April 2018

02 April 2018

03 April 2018

06 April 2018

02 April 2018

AL-ALAC-ST-0418-02-01-EN


FINAL VERSION TO BE SUBMITTED IF RATIFIED

The final version to be submitted, if the draft is ratified, will be placed here upon completion of the vote.



FINAL DRAFT VERSION TO BE VOTED UPON BY THE ALAC

The final draft version to be voted upon by the ALAC will be placed here before the vote is to begin.


The ALAC and At-Large Community understand the need to roll the KSK, but parts of the community have strong concerns about the potential impact on users worldwide.

We believe that a holistic review is needed, including a risk assessment of the alternatives, in time for further discussion at ICANN62. The assessment should include then-current information related to the RFC 8145 trust anchor reports, the prognosis for availability of the in-development IETF “sentinel” mechanism, and the potential for using the sentinel mechanism to create a greater level of comfort prior to the KSK rollover.

In parallel, ICANN should ramp up its awareness campaign using all possible conduits to reach ISPs, telcos, and governments, as well as critical sectors that must be able to continue to function post-rollover and that may be in a position to communicate with key DNS providers in their regions. Banking is one such sector that must not be put offline and that may have valuable contacts in its local areas. RIRs may have good contact information for large ISPs and other large users in their regions.

ICANN should also make available an information packet, in at least the languages ICANN normally supports, and preferably more, which will allow users and businesses to understand the issue (i.e. in simple terms) and tell them what they need to do/ask with regard to their local ISPs.

ICANN should provide a simple test web address and/or application that will allow users to verify if the resolver they typically use is DNSSEC-aware. If it is not, then they are likely to be unaffected by the KSK rollover. If their resolver is DNSSEC-aware, then they should be told what to do to try to verify that their provider is aware of and prepared for the rollover (recognizing that the technical support most end-users can contact will not likely be aware of terms such as DNSSEC, KSK or Rollover). http://dnssec.donnerhacke.de is an example on which such a tool may be modelled.
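As a rough illustration of how such a check could work, the sketch below queries a resolver for two names: one that is correctly signed and one whose signatures are deliberately broken. A validating resolver is expected to answer the first and return SERVFAIL for the second, while a non-validating resolver answers both. The helper names and the idea of using exactly two test names are assumptions made for this example; it is not a description of any existing ICANN tool.

```python
import random
import socket
import struct
from typing import Optional

RCODE_NOERROR = 0
RCODE_SERVFAIL = 2


def build_query(name: str, qtype: int = 1, txid: Optional[int] = None) -> bytes:
    """Build a plain DNS query (recursion desired) for `name`, type A by default."""
    if txid is None:
        txid = random.randint(0, 0xFFFF)
    # Header: ID, flags (RD set), 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def rcode_of(response: bytes) -> int:
    """Extract the 4-bit response code from a DNS response header."""
    _txid, flags = struct.unpack("!HH", response[:4])
    return flags & 0x000F


def classify(rcode_signed: int, rcode_broken: int) -> str:
    """Interpret the two response codes (hypothetical decision rule)."""
    if rcode_signed != RCODE_NOERROR:
        return "inconclusive"      # even the good name failed
    if rcode_broken == RCODE_SERVFAIL:
        return "validating"        # resolver rejected the broken signatures
    return "not validating"        # resolver accepted the broken name


def query_rcode(resolver_ip: str, name: str, timeout: float = 3.0) -> int:
    """Send one UDP query to the resolver and return its response code."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver_ip, 53))
        response, _addr = sock.recvfrom(4096)
    return rcode_of(response)
```

A caller would run `query_rcode` once for each test name against the resolver it normally uses and pass both results to `classify`. Only a "validating" result means the user needs to ask whether the provider is prepared for the rollover.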

ICANN should provide a list (either viewable or searchable) of DNS resolvers known to be DNSSEC-enabled, indicating which of them are definitively known to have, or not to have, the new trust anchor installed, and the awareness campaign should describe how end users can check this list. An automated app that users could run on various platforms would be even better.

Lastly, the At-Large Community has concerns that the rollover is scheduled to take place on a Thursday (and most likely Friday in some parts of the world). That seems like a plan designed to maximize and prolong any problems. We would like to understand the potential and possibility of a minimal delay to ensure that the day-of-the-week issue reduces impact instead of increases it.



FIRST DRAFT SUBMITTED

The first draft submitted will be placed here before the call for comments begins.

Another proposed Draft (Hadia)

The At-Large Advisory Committee (ALAC) of ICANN takes this opportunity to thank ICANN org for opening for public comment the plan to restart the Root Key Signing Key (KSK) Rollover Process and is glad to provide its comments herein.

The postponement of the KSK rollover planned for 11 October 2017 was based on newly discovered information concerning validating recursive resolvers that might not be ready for the rollover. ICANN org researched the new data to determine whether it could be useful in deciding when to roll the root KSK. On 18 December 2017, ICANN org reported the results of its research to the community. In that report, ICANN detailed that the collected data does not provide any clear explanation as to why so many resolvers appeared to still be using only the 2010 KSK. Most of the messages received at the root zone indicating that particular resolvers were not ready for the rollover were not helpful: ICANN org often could not determine which resolvers sent the message or why those resolvers had not updated their trust anchors. Additionally, even when the resolvers were identified, efforts to contact the operator were often unsuccessful.

Taking into consideration that

  • It is not foreseen that new reliable data will be available soon.
  • There is nothing to indicate that operators of DNS resolvers that are operating with only the 2010 KSK will fix their systems soon.
  • The existence of DNSSEC is important to protect the integrity of the DNS data: DNSSEC applies digital signatures to DNS data to authenticate the data's origin and verify its integrity as it moves throughout the Internet.
  • It was agreed that each root zone KSK will be scheduled to be rolled over through a key ceremony as required, or after 5 years of operation, and 6 years have already elapsed.
  • Postponing the KSK rollover might put the security of the DNS at risk: the key could be compromised or lost, among other risks stated in the SSAC Advisory on DNSSEC Key Rollover in the Root Zone of 7 November 2013.


The ALAC recognizes that while it is important to ensure that as few users as possible are affected, it is equally important for the security of the DNS to proceed with the KSK rollover. To this end, the ALAC supports the proposed plan for the KSK rollover while highlighting the importance of an extensive outreach plan and requesting a detailed PR plan that includes all the necessary information and documents, and post-rollover recovery guides for unprepared DNS resolvers. This is in addition to the recommendations previously mentioned in the

-----------------------------------------------------------------------------------------------------------------------------------------------------

The ALAC is pleased to have the opportunity to comment on the “Plan to Restart the Root KSK Rollover Process”.

DNSSEC fundamentally changes the nature of the most decentralized service of the Internet, the DNS. It transforms a lightweight "lookup table" into a trustworthy database. Its trust is made up of two important components: solid cryptographic implementations and transparent operations.

DNSSEC anchors the zone trust by hopping the delegation hierarchy backwards. But this process terminates at the root. Implementing the trust for the root itself is incredibly hard; ultimately it is not a technical problem at all. At this point, the trust (KSK) information needs to be put in the hands of countless network operators all over the world. This task is attributed to ICANN. At-Large has to do its own outreach on this subject using its own distributed structure.

Cryptographic elements always have a lifetime, whether they are keys or algorithms. It is good operational practice to change the keys regularly in order to update the material and to ensure that all processes are still in place. Changes of algorithms require a working key change process. Therefore the important root keys (where all the trust is rooted) need to be changed, too.

During the preparations for the rollover, various issues arose, especially from embedded and operator-less devices. Efforts were made to estimate the impact of a KSK change on affected user groups. But all the data is still vague.

The proposed plan is to schedule the rollover for October this year, missing two possible earlier dates. The time gained should be used to intensify communication with network operators worldwide. Waiting for new protocols to be deployed in order to estimate more accurately is not an option. So the communication should concentrate on preparation checklists and post-rollover recovery guides for validating recursors.

ALAC supports this plan: shifting the schedule by exactly one year makes it much easier for outsiders to keep up with the activities. April is clearly too short for the necessary communication. Starting in July would collide with holiday breaks in several countries. Because we depend on the operators, the October date is more appropriate in terms of workload and remembering the important dates.

ICANN should provide a test page for end users that tells them in a very simple way whether they are affected by the KSK rollover or not (example: http://dnssec.donnerhacke.de/). This page should offer a link to the information gathered via RFC 8145: the end user can enter the IP of the resolver (preferably automatically detected using a Nonce-FQDN) to check whether the active resolver already trusts all active keys. This tool should stay in place for further rollovers.
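The Nonce-FQDN idea can be sketched as follows. The test page asks the browser to load a hostname that has never been used before; because it cannot be cached anywhere, the user's resolver has to ask the authoritative server of the test zone, which then sees the resolver's address and can associate it with the nonce. The zone name `resolver-check.example` and the in-memory log are hypothetical placeholders chosen for this example.

```python
import secrets
from typing import Dict, Optional

# Hypothetical test zone; a real service would use a zone it controls.
TEST_ZONE = "resolver-check.example"


def nonce_fqdn(zone: str = TEST_ZONE) -> str:
    """Return a fresh, never-before-seen hostname under the test zone."""
    nonce = secrets.token_hex(8)  # 16 hex characters, unpredictable
    return f"{nonce}.{zone}"


def record_resolver(log: Dict[str, str], qname: str, source_ip: str) -> None:
    """Called on the authoritative side: remember who asked for this name."""
    log[qname.lower().rstrip(".")] = source_ip


def resolver_for(log: Dict[str, str], fqdn: str) -> Optional[str]:
    """Called by the test page: which resolver fetched the nonce name?"""
    return log.get(fqdn.lower().rstrip("."))
```

The page would call `nonce_fqdn()`, embed the resulting name in a resource the browser fetches, and later look up the resolver address with `resolver_for`, so the user never has to type an IP address.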

There are some recommendations:

  • Do not publish changes near the end of the week (Friday to Sunday cannot be considered work days); choose only Tuesday or Wednesday.
  • Because it's impossible to gather detailed information about every possible situation, prepare a worldwide anti-blame PR action. Try to get this event into the news; it has to be broadcast in the same way as the Year 2000 problem. Ensure that every problem during the rollover will be attributed to the ISPs. Strictly speaking, try to blame the ISPs, hosters, and IoT companies beforehand through the press, (social) media, and TV shows. This action has to reach its climax some weeks before the date.
  • Use the opportunities of the distributed ICANN sub-organisations (for At-Large: ALSs) to distribute knowledge into the communities.
  • Allow the end users to self-test their own environment by providing an appropriate web page.

The rise of IoT is starting to create swamps of non-updatable network devices. If the KSK rollover is further delayed, more and more such devices will be deployed, all of them unable to deal with an upcoming KSK change. The only chance to get those developers and companies on board is to roll over the KSK. Regularly.

-------------------------------------------------------------------------------------------------------------------------------------------------

Alternative Text: (courtesy of John Laprise)

Whereas:

Resolved:

The ALAC advises the Board

  • to provide a holistic risk assessment to the community on the KSK Rollover at ICANN62 comparing the relative risks of implementing the KSK rollover as provided in the revised plan vs. further delaying the KSK rollover. The risk assessment should include an evaluation of both the technical and reputational risks to the security and stability of the Internet and should reflect opinions of the SSAC, RSAC, and Risk Committee
  • To direct ICANN to develop a robust ISP mitigation strategy in the event of resolver failure affecting end users and communicate it to the community
  • To provide clarification on the communication plan. ALAC notes that the recipients for the KSK rollover communication plan considerably overlap those of the DNSSEC implementation

Furthermore

  • Upon receipt of the risk assessment at ICANN62, ALAC will be better prepared to offer advice on the KSK Rollover Plan in a timely fashion prior to its proposed execution in October 2018

116 Comments

  1. Via email Tuesday, February 13, 2018 at 8:20 PM

    Hi everyone,

    I'm very surprised that this plan didn't get our attention earlier. Last year it was postponed almost at the starting date; the reason argued was that an important part of the internet might fail.

    Looking at the actual plan, there doesn’t seem to be a clearer picture of what might happen; all the studies and comments point towards a vague result.

    The main addition of this year's rollover plan is more publicity.

    My real concern is for the small ccTLDs (and I guess some small gTLDs) that don’t have DNSSEC installed because they do not have the resources to do so, let alone to make the appropriate technical adjustments for the new KSK.

    Comments are very welcome

    Best

    Ricardo Holmquist

  2. Via email Tuesday, February 13, 2018 at 8:42 PM

    Dear Ricardo,

    On 13/02/2018 18:20, Ricardo Holmquist wrote:

    My real concern is for the small ccTLDs (and I guess some small gTLDs) that don’t have DNSSEC installed because they do not have the resources to do so, let alone to make the appropriate technical adjustments for the new KSK.


    The discussion about these features took place in the drafting of the Applicant Guidebook a few years ago. Our community was somehow in two minds about this. On the one hand you are correct that some small TLD operators might have challenges installing DNSSEC. Same for IPv6 too. On the other, we owe it to end users that the domains they register under TLDs are as safe as possible, thus should we compromise on DNSSEC? ICANN went the full way due to its Core Mission (taken from the ICANN Bylaws):

    Section 1.1. MISSION

    (a) The mission of the Internet Corporation for Assigned Names and Numbers ("ICANN") is to ensure the stable and secure operation of the Internet's unique identifier systems as described in this Section 1.1(a) (the "Mission"). 

    Not having DNSSEC makes the operation of the Internet's unique identifier system less secure than with DNSSEC.

    Kindest regards,

    Olivier Crépin-Leblond 

  3. Via email Tuesday, February 13, 2018 at 10:43 PM

    I can appreciate Ricardo's concern for smaller ccTLDs, but I do know that David Conrad made attempts to personally attend regional Telecom meetings (as he did in the Pacific, at a PITA meeting on Rarotonga) to explain the reasons why they should install DNSSEC and to prepare for the KSK rollover process. For those who may not have had the expertise or resources to do this for themselves, support was there. Like OCL, I believe it is important to secure the Internet's unique identifier system and that ccTLD managers should be encouraged to do so.

    On saying that, I contacted David a few months ago and found that despite the gathering he attended being held here and hosted by our monopoly Telecom company, it still had not installed DNSSEC. I raised this with both the company itself and our Minister (as co-owner of the company), but they really are still not understanding the importance of this investment. This would be a problem in a lot of developing countries. So what do we do?

    Maureen Hilyard


  4. Via email Tuesday, February 13, 2018 at 11:07 PM

    Ricardo, I had conversations with LACNIC. I offered to collaborate through all our ALSs to spread awareness of both DNSSEC and the KSK rollover. Humberto Carrasco was to continue with this topic, but I do not have any news. The fact is that instead of reaching each of the ISPs, in principle we would go through the chambers that group them in each country.

    Regards

    Alberto Soto

  5. Unless I am missing something, this is not an issue of whether a ccTLD uses DNSSEC. The verification/changes must be carried out by anyone who runs a DNS resolver. Now a ccTLD may do that or not. Resolvers will generally be found at ISPs, registrars and organizations that manage their own DNS entries (such as medium to large organizations).

    The key issues are:

    1. Getting the message out that anyone who runs a resolver MUST verify whether it is doing DNS validation, or relying on an upstream DNS server to do validation for it (that is what DNS resolvers in workstations and WiFi routers typically do)
    2. Providing places to go for help, which should be regional and with local language support.
    3. ICANN needs to be pro-active on ensuring that there are such regional help centres.

    I am asking that this be on the agenda for ICANN61 and that in particular we be briefed on exactly who must act and what they will need to do so that our people on the ground around the world can help.

    We will need someone to be a penholder! Volunteers?? Preferably someone sufficiently technical as to make sure it is worded correctly (that is not me!).


    1. Installing "regional help centers" will not provide any positive effect. It will waste resources.

      If the resolver runs with DNSSEC disabled or with a correct configuration: Nobody will notice.

      There are three possibilities if the rollover breaks an incorrectly configured DNSSEC resolver and none of its clients can work anymore:

      1. Operator googles from a different network, finds a Twitter shitstorm, and reads that the only possible solution is to install a non-validating resolver or disable DNSSEC. (Most likely; will cause long-term problems for DNSSEC)
      2. Operator is aware of the problem and fixes the configuration. (Unlikely)
      3. Operator is aware of the problem but can't fix it. Because he knows that ICANN has regional help centers, he calls there for technical assistance with his outdated and obscure system. (Not even unlikely)


  6. Not having a technical background, I am really not qualified to do anything other than read the contributions of others and learn. But one point does stand out: Alan's call on ICANN to be proactive in ensuring there are regional help centres. Is this something that can be put to the RIRs/GAC? I know APNIC does a lot of free training and assistance throughout the region. Maybe use that as a model?

  7. Sorry for being late.

  8. Thanks for the draft Lutz. It provides good background.  And it does put ALAC in the frame for spreading the word.  But what are we saying about the actual plan?  Do we think it will work?

  9. Lutz, a question for you (or anyone else knowledgeable in this area).

    Is it possible to give users on multiple platforms (Windows, iOS, Android, whatever) a simple, black-box tool that they can run to tell them if the DNS that they are using is ready for the rollover? Run the app, it gives a yes or no answer (and perhaps more detail if asked)? Sort of like the Android Stagefright Detector App.

    That would allow a user to verify whether their DNS chain is vulnerable and just perhaps rattle someone's cage enough to get action.

    1. That's a really cool idea.

      I have to check whether there is a reliable one.

      The problem is (in a nutshell) to determine whether the queried server can update its configuration automatically.

      1. I presume the requirement is to verify that the path to each TLD that is signed is ok. Certainly to the major ones.

        Whether there exists an application that does that today is less important than whether it is possible to construct such a program.

        1. The problem is to determine which recursive resolvers are able to fulfill the requirements of RFC 5011 during the rollover. This requires updating the stored KSK information for the root.

          Certainly an issue that is hard to check remotely.

  10. What about having a program hosted by each registry that could be invoked by a user to check the path from the registry end? This need not be 100% reliable as long as it uncovers SOME resolvers that need work.
  11. If during the rollover the resolver is unable to update the KSK, all clients will have no connectivity at all, once the rollover is finished.

    1. I was referring to doing these tests prior to rollover.
  12. Okay, we currently have two root KSKs in public. See: http://dnsviz.net/d/www.icann.org/dnssec/

    • The actual trust anchor is 19036 and we want to switch to 20326.
    • Every test, we can make from outside, will always succeed, if the 19036 is known to the resolver in question.
    • But we want to test, if 20326 is also trusted (because we want it to become the new root).
    • For the purposes of RFC 5011, 20326 is signed by 19036, so every resolver can learn the new KSK.

    In order to test that the resolver already trusts the new key 20326, we have to use this key in signing.

    • We could sign an extra record in the root (i.e. www.root-key.dnssec-test. only signed by 20326)
    • Now we query the test web page www.root-key.dnssec-test so that the client can check the new trust at his resolver.
    • Unfortunately the new key 20326 is signed by 19036 and can be verified using this (old) trust anchor, so we get nothing.
    • Therefore we need to break this DNSKEY signing (in violation of RFC 5011) and hope, that this split brain zone will not cause serious trouble at various resolvers.
    • But if we break, we can now test the resolver reliably.
    • But if we break, all other resolvers which try to prepare for the situation can't follow RFC 5011 to update their trust anchors.

    That's the situation, including this new hack, never described before. BUT: If you ever point somebody from the SSAC to this idea, please provide personal security guards for me at the next ICANN meeting.

    So how can we check, if the resolver already did update the trust anchors?

    There is a new protocol extension, RFC 8145, which allows a client to retrieve the trust anchors known to the resolver. That's cool, isn't it?

    • If the queried resolver responds to this freshly invented extension, it is operated in a perfect way (trusting the new key).
    • But if the query is ignored or rejected, nothing is gained. It might be a resolver, which trusts the new key, or it might fail during rollover.

    Okay?
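For reference, the RFC 8145 signal mentioned above can be sketched in a few lines. In RFC 8145 the signal travels from a validating resolver to the authoritative servers of the zone (here, the root), which is how trust anchor reports are collected at the root. One form of the signal is a query whose name starts with `_ta-` followed by the resolver's trusted key tags as four-digit hexadecimal values, sorted and joined by hyphens. The function names below are hypothetical; the key tags 19036 and 20326 are the ones from the discussion above.

```python
from typing import Iterable, List


def key_tag_query_name(key_tags: Iterable[int]) -> str:
    """Build the first label of an RFC 8145 key tag query from trusted key tags."""
    tags = sorted(set(key_tags))
    for tag in tags:
        if not 0 <= tag <= 0xFFFF:
            raise ValueError(f"key tag out of range: {tag}")
    return "_ta-" + "-".join(f"{tag:04x}" for tag in tags)


def parse_key_tag_query_name(qname: str) -> List[int]:
    """Recover the key tags from a query name such as '_ta-4a5c-4f66.'."""
    label = qname.rstrip(".").split(".")[0].lower()
    if not label.startswith("_ta-"):
        raise ValueError("not a key tag query name")
    return [int(part, 16) for part in label[4:].split("-")]


def signals_new_key(qname: str, new_tag: int = 20326) -> bool:
    """True if the resolver that sent this query reports trusting the new KSK."""
    return new_tag in parse_key_tag_query_name(qname)
```

A resolver that only trusts the old key would send `_ta-4a5c`, while one that has picked up both keys would send `_ta-4a5c-4f66`. A resolver that sends no such query at all remains in the "nothing is gained" case described above.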

  13. I have attended the ICANN Session with David Conrad, telling us about the KSK Roll-over schedule and concerns over readiness on D day when this will take place.

    I would like to offer the additional points:

    1. We should call for a campaign that gets Internet users around the world to query their ISP on whether they are ready or not. This means a more extensive use of Social Media, a simpler explanation on why this matters and the use of simple schematics and graphics that convey a strong message, and what happens if the roll-over fails for some domains. What about a Roll-Over clock?
    2. We have heard that resolvers running on a Microsoft platform might not properly report whether they are compliant or not. Would a push from end users be necessary to get this issue to be addressed?
    3. The current target date is 11 October 2018. The previous date was 11 October 2017. The previous date was a Wednesday. The current target date, 11 October 2018 is a Thursday. Assuming ICANN will start the roll-over in its preferred US-centric time zones, many parts of the world will see it as Friday (day off in Muslim world) and in the Far East in particular – and it is good practice to never start mission critical work on a Friday due to the proximity of the week-end.
    1. Absolutely agree we should make a comment, particularly after David Conrad's session.

  14. I do not understand the 'DNSSEC anchors the zone trust by hopping the delegation hierarchy backwards. But this process does terminate at the root. Implementing the trust for the root itself is incredible hard, ultimately it's not a technical problem at all' in the first draft.

    If I am not mistaken 'the process' does not 'terminate at the root' but it starts there. That is why we are talking about this KSK rollover, right? And doing the KSK in itself is an easy exercise technically speaking, which to me implies that 'implementing the trust for the root itself' is trivial.

    I furthermore think that the current draft does not voice a strong enough opinion, i.e. what is our concern in terms of potential impact on Internet end-users when, following the KSK rollover, DNSSEC-enabled resolvers that do not have the proper key configured stop resolving.


    1. A DS record, which points from the parent zone to the child zone, establishes trust into (points to) the child zone keys. So the trust anchor is located in the parent zone, which delegates the child zone. So the direction of anchoring trust is reverse to the direction of delegation.

      If you have a parent zone, you can anchor the trust for your zone there. The root does not have a parent zone. So the process of anchoring trust using DS records terminates at the root.

      It has to find a different way to obtain trust in its keys. And this is really hard: It has to be done in millions of resolvers. And it is not a technical problem, it's an operational problem, a problem of how to interact with millions of operators.

      RFC 5011 may ease this process. But we currently do not know, if it's deployed correctly. That's part of the problem.

      I do think it's more problematic to delay the rollover than to raise concern about problems.
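The argument in this reply can be illustrated with a toy model. Each zone's keys are vouched for by a DS record in its parent, so validation walks upwards until it reaches the root, which has no parent and must be trusted through locally configured trust anchors. The zone names, data structures and function below are illustrative assumptions, not how any real resolver is implemented; the key tags 19036 and 20326 are the ones mentioned in the discussion above.

```python
from typing import Dict, Optional, Set

# Toy delegation hierarchy: zone -> parent zone (the root has no parent).
PARENT: Dict[str, Optional[str]] = {
    "example.org.": "org.",
    "org.": ".",
    ".": None,
}


def validates(zone: str, zones_with_ds: Set[str],
              root_ksk_tag: int, trust_anchors: Set[int]) -> bool:
    """Walk from `zone` up to the root, then fall back on local trust anchors."""
    current = zone
    while PARENT[current] is not None:
        if current not in zones_with_ds:
            return False  # parent publishes no DS record: the chain is broken
        current = PARENT[current]
    # At the root there is no parent DS record to rely on, so trust has to
    # come from the resolver operator's own configuration.
    return root_ksk_tag in trust_anchors


ds_records = {"example.org.", "org."}

before = validates("example.org.", ds_records, 19036, {19036})
after_stale = validates("example.org.", ds_records, 20326, {19036})
after_updated = validates("example.org.", ds_records, 20326, {19036, 20326})
```

In this model a resolver configured only with 19036 validates before the rollover and fails afterwards, while one that has added 20326 keeps working. That is the operational problem: the update has to happen in every validating resolver.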

  15. On KSK Rollover:

    After hearing from SSAC and ICANN CTO David Conrad, ALAC has not reached consensus on the timing of the necessary KSK Rollover. On the one hand, the KSK Rollover is delayed and overdue. Continued delay poses unquantified risk to the security of the DNS through a signing compromise. On the other hand, we face unquantified risk from resolvers that will fail the Rollover, the impact this failure will have on end users, and reputational harm ICANN may incur from DNS interruptions. Moreover, the data being collected by SSAC and ICANN is very noisy and bears further analysis. In short there are a lot of unknowns.

    Because of the multidimensional nature of the risks posed by the KSK Rollover, we would ask the Board's Risk Committee rather than SSAC to consider the balance of risks and issue a statement at ICANN62 considering whether the planned October 2018 KSK Rollover should go ahead as planned or should be delayed to allow useful time for additional risk analysis and mitigation to occur.

    1. I did not attend the SSAC session, but I do not see any significant additional information coming in if we spend more time waiting.

      OTOH, delaying will cause more and more embedded systems to enter the market, which will block any later attempt to roll over.

      1. One of the big red flags IMO is that SSAC and ICANN don't understand the data they are getting...

  16. And I also think we need to align our efforts: I saw another (draft) statement from John Laprise, which I think is very good and I will leave it to him or staff to post it here, and learnt that Hadia Elminiawi too is drafting something on the topic or intends to do so.

  17. It would be fine to see any of those drafts.

  18. @Hadia Elminiawi offered to put together a summary of the discussion of the pros and cons that were raised during our KSK Rollover session yesterday, especially as there was no consensus on whether the ALAC would recommend to ICANN that we continue with the Rollover or not. This was a one-off situation and there were so many excellent offerings from the group that I suggested it as an output of our discussion highlighting the end-user concerns and the various arguments for and against the Rollover. I found the whole discussion enlightening, and I am sure David will also find it helpful. While it might be more of a list, it would also be useful for Lutz to be provided with Hadia's summary for this formal statement as well.

  19. Just for argument's sake, see slide 4 from https://static.ptbl.co/static/attachments/169141/1520696901.pdf?1520696901 , 'results of discussion':

    • Agreement there is no way to accurately measure the number of users who would be affected by rolling the root KSK

    and

    • Consensus was that the ICANN org should proceed with rolling the root zone KSK in a timely fashion


    So. No idea what impact might be but still go ahead? I think that is an issue.

    1. We know, there is an issue with badly maintained resolvers, if we do the rollover. We do not know how big the issue is.

      We know, there is an issue with badly developed appliances, if we avoid rollover. We do not know how big the issue is, but we know, it's increasing rapidly.

      We know, the badly developed appliances will ultimately prevent any later rollover.

      Therefore, I strongly suggest to continue with the rollover. And do it regularly.

      1. Lutz, this is the second time you mention "appliances" presumably with bad resolvers built in. Can you be more specific about exactly what devices or kinds of devices you are referring to?

        1. Badly developed applications or apps? Or badly developed devices (as in IoT)?


          1. Yep, but I too would like to know what devices/appliances are being referred to specifically here and what the associated risk ('issue') is in terms of a (non) roll over.

            Just so I understand correctly: the bulk of the 'badly developed appliances' or 'devices' ('IoT') that is coming our way, and/or that is already out there, does not have 'resolver' functionality built in, right?

          2. In our context, let's also call the ones without the ability to accept a new key "bad resolvers", because some still do their job of resolving DNS queries.

        2. I refer to any device which is running a recursive DNS resolver, running as a black box for the user.

          It's very easy to set up such a system (even embedded) and use a hardcoded setup which will work at the day of shipping.

          Given the IoT numbers, the price constraints, and the missing long-term warranty, one is more than likely to find a validating resolver (because it's the default in many current Linux distributions) combined with a read-only configuration on flash in such systems.

  20. Concur: unbounded risk

  21. Revised KSK Rollover statement

    After hearing from SSAC and ICANN CTO David Conrad, ALAC is concerned about the risks we face with respect to the necessary KSK Rollover.

    Delaying the overdue KSK rollover again increases the risk of key compromise. Security best practices dictate that we rollover the key on a regular basis and we are overdue.  

    However, the risks to unprepared service providers and their dependent end users are uncertain. While the data SSAC and ICANN.org have collected to date has proven to be very noisy and difficult to interpret, we know for certain that the rollover will break some resolvers, leaving those who depend on them disconnected for perhaps only a week, assuming the resolver maintainer has skilled staff on hand. Unfortunately, we do not know the identity, location, or communities served by these resolvers, and further research may not yield better information. We also don’t know whether the maintainers of these resolvers have the technical skill on hand to address a failed rollover in a timely fashion. Furthermore, one straightforward fix for a broken rollover is turning off DNSSEC, potentially undoing the careful engagement work to encourage DNSSEC adoption. Finally, significant outages may reflect negatively on ICANN and damage reputation and trust.

    In short, there are a lot of things we don’t know and a large amount of unquantified risk. Because of the multidimensional nature of the risks posed by the KSK Rollover, we ask the Board's Risk Committee rather than SSAC to consider the balance of risks and issue a statement at ICANN62 considering whether the planned October 2018 KSK Rollover should go ahead as planned or should be delayed to allow useful time for additional research and mitigation including but not limited to resolver risk analysis, execution of a strategic integrated communications plan directed at service providers, and communications planning to counter risks to trust.


    1. I see your point, thank you.

      What is your opinion on the opportunities that would arise from a further delay? Which information is likely to come up in this time? Which currently active process will come up with more useful insights?

      Why do you think, that handing over the process to a different entity will speed up the process?

      1. So, SSAC thinks that there will be more data and diagnostic tools becoming available to better gauge the situation. On the mitigation side we're talking about the DNSSEC strategic communications plan to reach service providers only reskinned for KSK rollover. Also a strategic communications plan ready to address any trust erosion issues that might arise.

  22. Hello, Taking into account the exchanges on this page, the last version sent by John and the support from Olivier, I suggest the following comments from the ALAC regarding the Plan to Restart the Root Key Signing Key (KSK) Rollover Process. Thanks for all inputs; any additional comments are welcome.

    _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _

    RESOLVED:

    The ALAC asks the following actions to be taken by

    • SSAC to issue an update report to help the ALAC understand the consequences of executing the KSK rollover on 11 October 2018 (or any other day) so that the community may issue informed advice to the Board on this issue.
    • Board's Risk Committee to consider the balance of risks because of their multidimensional nature posed by the KSK Rollover and to issue a statement to that effect.

    If those statements are published one month in advance, the ALAC will consider them before and at ICANN62 and issue a better-informed point of view regarding the KSK Rollover and any possible implications for end-users.

    In addition, the ALAC suggests that the ICANN communication department, along with the Office of the CTO and the relevant parties of the community, must communicate with the relevant technical organizations concerned and call for a campaign that gets Internet users around the world to query their Internet Service Provider (ISP) on its readiness for the roll-over. This means a more extensive use of Social Media, a simpler explanation of why this matters and what happens if the roll-over fails for some domains, and the use of simple schematics and graphics that convey a strong message.

    DISCUSSION:

    After listening to the information from SSAC and ICANN CTO David Conrad, the ALAC is concerned about the current plan regarding the necessary KSK rollover, the date of such rollover (11 October 2018) and the risks that may be faced at various levels of the network and the community.

    On the one hand, delaying the overdue KSK rollover again increases the risk of key compromise. Security best practices dictate that we roll over the key on a regular basis.

    On the other hand, the risks to unprepared service providers and their dependent end users are uncertain, while the data SSAC and ICANN.org have collected to date has proven to be very noisy and difficult to interpret. The risk is high that the rollover will break some resolvers, leaving those who depend on them disconnected for varying lengths of time depending on the availability of skilled staff at the resolver maintainer. The identity, location, and communities served by these resolvers are unknown, and further research may not yield better information. Furthermore, one straightforward fix for a broken rollover is turning off DNSSEC, potentially undoing the careful engagement work to encourage DNSSEC adoption. Indeed, once DNSSEC is turned off, it may remain off for the foreseeable future. Finally, significant outages may reflect negatively on ICANN and damage reputation and trust.

    Other concerns were raised concerning:

    • Microsoft platforms that might not properly report whether they are KSK rollover compliant.
    • The current target date is 11 October 2018. The previous date was 11 October 2017 and it was a Wednesday. The current target date, 11 October 2018, is a Thursday. Assuming ICANN will start the roll-over in its preferred US-centric time zones, many parts of the world, the Far East in particular, will see it as Friday (a day off in the Muslim world), and it is good practice never to start mission-critical work on a Friday due to the proximity of the week-end.

    _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _

    1. I like this statement better than the first draft. It gets to the point and has calls to action, which is something we can track.


      -ed

    2. My concern is simple: Delaying the rollover will cause real, hard problems, which in turn will delay the rollover further until no rollover is ever possible.

      Everything that can break now will also break at any future date. New problems will accumulate faster than the assumed problems are solved.

      All we can do is focus on appropriate PR to prevent reputational damage to ICANN (which is not really a concern in my eyes).

  23. Hello, Even though I see Sebastian's new draft as an improvement, I do not think we need to make a conditional statement (re: one month limit).... However I can live with what has been proposed. Matt, I suggest a minor update: "....help the ALAC better understand....". Regards
  24. I think we are close to a final statement. I think it could do with a final edit with everything we are asking in one place. Maybe start with what we know of the plan, then our concerns, followed by what we recommend. (Nothing really new, just a bit more clarity.)

  25. I've got time blocked off this weekend to work on this. John
  26. Having a root (and ICANN) means it is centralized. Only a few protocols are decentralized, such as blockchain and BitTorrent. It is fair to use "distributed" instead.

    1. You are correct. Thank you.

    2. My comment on the first line of the draft

      "DNSSEC changes the nature of the most decentralized service of the Internet, the DNS, fundamentally "

      The DNS is a centralized service but distributed, having a single point of failure makes it centralized - Someone should correct Wikipedia (smile) 

      1. It's hierarchically distributed. There is no fundamental single point of failure, thanks to the Root Server Operators. There is a technical single point of failure due to the DNSSEC signature lifetimes, which is handled by the current key officer scheme. There is an organizational single point of failure: the authority to apply changes to the root zone. Misbehaviour by this authority can be overcome by changing the authority organization during the key ceremony.

        So, you are right, the internal structure of the DNS is hierarchical, the resolvers are decentralized and the authority servers are distributed.

        1. Though the debate with regard to the DNS being centralized or decentralized is off topic (KSK rollover)

          But technically speaking, and not from an organizational point of view, the DNS has a distributed hierarchical infrastructure; however, all the distributed components operate in reference to the root zone. So we say the DNS is distributed but not decentralized. If it was decentralized we would have been able to do the KSK rollover by region (smile). DNSSEC is not the issue here. As Andrey mentioned earlier, an example of a distributed architecture is Blockchain technology. I invite you to take a look at the architecture of Blockchain in order to see the difference between a centralized service and a decentralized service.

  27. Hadia's comment from the email thread (which I support) : 23/03/18

    Dear All,
    I would like to note that I communicated earlier today to Maureen that, while I shall be putting forward my comments with regard to the statement, I do not find it necessary to summarize our discussion about the KSK during the ICANN meeting in Puerto Rico, as the group got to discuss it again through the work space and the transcripts are already there.
    Though the comments on the work-space expired yesterday, I still put my proposed statement below, noting that the public comment on the KSK rollover closes on April 2, that is in 11 days. I find no reason not to take it into consideration if it is supported by the group, while of course welcoming any modifications or changes to it.
    -------------------------------------------------------------------------------------
    The At-Large Advisory Committee (ALAC) of ICANN takes this opportunity to thank ICANN org for opening for public comment the plan to restart the Root Key Signing Key (KSK) Rollover Process and is glad to provide its advice herein.

    The postponement of the KSK rollover on 11 October 2017 was based on newly discovered information concerning validating recursive resolvers that might not be ready for the rollover. To this end, ICANN org researched the new data to determine if it could be useful in determining when to roll the root KSK. However, on 18 December 2017, ICANN org reported to the community the results of its research. In that report, ICANN detailed that the collected data does not provide any clear explanation as to why so many resolvers appeared to still be using only the 2010 KSK. Most of the messages received at the root zone that indicated that particular resolvers were not ready for the rollover were not helpful: ICANN org could often not determine which resolvers sent the message or why those resolvers had not updated their trust anchors. Additionally, even when the resolvers were identified, efforts to contact the operator were often unsuccessful.
    Taking into consideration that
    ·        It is not foreseen that new reliable data will be available soon.
    ·        There is nothing to indicate that operators of DNS resolvers that are operating with only the 2010 KSK will fix their systems soon.
    ·        The existence of DNSSEC is important to protect the integrity of the DNS data, where DNSSEC applies digital signatures to DNS data to authenticate the data's origin and verify its integrity as it moves throughout the Internet.
    ·        It was agreed that each root zone KSK will be scheduled to be rolled over through a key ceremony as required, or after 5 years of operation, and 6 years have already elapsed.
    ·        Postponing the KSK rollover might put at risk the security of the DNS, where the key could be compromised.
    The ALAC recognizes that, while it is important to guarantee that as few users as possible are affected, it is equally important for the security of the DNS to proceed with the KSK Rollover. To this end the ALAC supports the proposed plan for the KSK rollover while highlighting the importance of an extensive outreach plan and requesting a detailed PR plan that includes all the necessary information and documents.
    Best
    Hadia

    Response from Alberto Soto: 23/03/18

    I partially agree.
    ICANN has not carried out a risk analysis or, if it has, it still does not know the impact of the change on the end users. I understand the risk of not changing the key. If it is not changed, ICANN will be responsible for any attack that follows as a consequence. If it is changed, there may not be an attack, and ICANN will be responsible for the unpredictable consequences. But in the latter case, the ALAC will be responsible for not having requested information about the impact of the change on the end users.
    Regards
    Alberto

    Response from Bartlett Morgan: 23/03/18

    I'm willing to support this position as you've articulated it Hadia.

    Response from Javier Rua-Jovet: 23/03/18

    My comments were made on the record in San Juan, generally in support of the cautious position first enunciated by John Laprise. I fully trust that the pending ALAC statement on KSK rollover (or postponement) will reflect the sense and temperature of the room during the San Juan discussion: opinions were basically equally divided between those that favored moving forward and those, like me, who favored waiting until understandable and reliable data regarding risks to potential end-users is made available.

  28. Dear All,

    With regard to

    John's proposal "Asking the Board’s risk Committee to consider the balance of risks and issue a statement at ICANN62 considering whether the planned October 2018 KSK Rollover should go etc." and Sabastien's proposal "The ALAC asks the following actions to be taken by the Board's Risk Committee to consider the balance of etc."

    I note here that the proposed plan to restart the Root Key Signing Key (KSK) Rollover is currently put forward to the community for public comment. After the community provides its comments, ICANN organization will prepare a full plan for the rollover and present it to the Board; at that point the Board’s Risk Committee could make considerations with regard to the plan. That is, it is not possible to ask the Board to let the Risk Committee look into a plan that has not yet been finalized or presented to them.

    1. I agree with that point - as the 'Plan for Continuing the Root KSK Rollover', currently up for public comment, says in '3.1 Roll the Root KSK on 11 October 2018':

      'The ICANN org plans to take the next step of rolling the Root KSK on 11 October 2018, assuming that such a step is approved by the ICANN Board of Directors. (...) The date of 11 October 2018 was chosen to give the community time to review the plan with a public comment period and to give the ICANN Board of Directors time to approve it after consulting any parties it may wish to contact. This date also gives ICANN org plenty of time to publicize the new date and attempt to get more validating resolver operators ready for the roll over.'

  29. I support Olivier's points

    1. We should call for a campaign that gets Internet users around the world to query their ISP on whether they are ready or not. This means a more extensive use of Social Media, a simpler explanation of why this matters and what happens if the roll-over fails for some domains, and the use of simple schematics and graphics that convey a strong message. What about a Roll-Over clock?

    2. The current target date is 11 October 2018. The previous date was 11 October 2017. The previous date was a Wednesday. The current target date, 11 October 2018, is a Thursday. Assuming ICANN will start the roll-over in its preferred US-centric time zones, many parts of the world, the Far East in particular, will see it as Friday (a day off in the Muslim world), and it is good practice never to start mission-critical work on a Friday due to the proximity of the week-end.
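
    The weekday arithmetic in this point can be checked with a short Python sketch. The 17:00 US Pacific start time is purely hypothetical (the plan quoted on this page gives a date, not a time of day), and the observer locations and their fixed UTC offsets are illustrative choices, not anything stated in the plan.

```python
from datetime import date, datetime, timedelta, timezone

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def weekday_name(d):
    """Return the English weekday name for a date or datetime."""
    return DAYS[d.weekday()]


# Fixed offsets keep the sketch independent of a time zone database.
# US Pacific is on daylight time (UTC-7) in mid-October.
US_PACIFIC = timezone(timedelta(hours=-7))

# Illustrative observer locations (assumptions, not from the plan).
OBSERVERS = {
    "UTC": timezone.utc,
    "Riyadh (UTC+3)": timezone(timedelta(hours=3)),
    "Tokyo (UTC+9)": timezone(timedelta(hours=9)),
}


def local_weekdays(start):
    """Map each observer location to the weekday on which `start` falls there."""
    return {name: weekday_name(start.astimezone(tz))
            for name, tz in OBSERVERS.items()}


# HYPOTHETICAL start time: late afternoon, US Pacific, on the target date.
hypothetical_start = datetime(2018, 10, 11, 17, 0, tzinfo=US_PACIFIC)

if __name__ == "__main__":
    print("11 October 2017:", weekday_name(date(2017, 10, 11)))
    print("11 October 2018:", weekday_name(date(2018, 10, 11)))
    for place, day in local_weekdays(hypothetical_start).items():
        print(place, "sees the hypothetical start on", day)
```

    How many regions see a Friday depends entirely on the actual start time; an earlier start on the US West Coast would still fall on Thursday in UTC while already being Friday in the Far East.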
  30. I would strongly suggest that we make a clear and relatively terse statement. Too many words will only make it more obscure.

    That being said, the following is NOT that statement but is an attempt to put the issues (at least as I understand them) in a single list (not in any particular order)

    • There is no accurate way to measure the number of users that may be affected by resolvers that enable DNSSEC but do not have the new trust anchor.
    • There is no way for users or major DNS providers (such as the root servers or the servers for each TLD zone file) to know whether the resolvers contacting them are ready for the rollover.
    • Most technical people seem to feel that the risk is warranted and we should proceed.
    • Many non-technical people have some concerns that the impact will be large and worse, this will reflect very poorly on ICANN.
    • When we did a poll in San Juan, the ALAC was roughly divided with a slight preference to proceeding with the rollover in October 2018.
    • The date selected for the rollover is a Thursday, which implies Friday in some parts of the world and leading into a weekend, perhaps the worst time of the week to do such a change. This date cannot be readily changed unless the rollover is delayed.
    • A decision to proceed with the rollover now can always be reversed later. The last potential rollover was deferred just days before it was to happen.
    • There is a new protocol under development in the IETF, "sentinel", which, when deployed, will allow a user to determine to what extent they may be affected by a rollover. It is not known when it will be ready. The last time a related protocol was developed, it was implemented by key DNS implementers very quickly. There is an IETF meeting going on this week, and perhaps there will be more clarity in a few days. See https://tools.ietf.org/html/draft-ietf-dnsop-kskroll-sentinel-07. When deployed, this will give users the ability to query their provider and raise awareness if necessary. If deployed QUICKLY, it may give us more data by October.
    • ICANN has attempted to contact ISPs. Goran has sent a letter to regulators, who may forward it to Telecoms (who are typically large ISPs). RIRs have just recently been involved. We need to be far more aggressive in reaching out, and the RIRs, who presumably know about ISPs in their regions, should be part of this. ICANN needs to also contact major critical industrial segments such as banking to ensure that they are aware of the issue. The GAC should also be involved.
    • At-Large can play a role by spreading the message through our ALSes and individual members.
    • Some people have suggested that the matter be referred to the SSAC although it is not clear what the question(s) would be.
    • Some people have suggested that the matter be referred to the Board RISK Committee (I think this is not necessary since the decision is ultimately up to the Board and it will rely on its own Risk Committee).
    • Regardless of whether we cancel now and set a new date, plan to roll in October and cancel later, or plan to roll and do it, we need an intensive campaign to raise awareness without causing panic.


    What have I missed?
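
    To make the "sentinel" item above more concrete, here is a rough Python sketch of how a user-side test could interpret its results. It is an illustration under stated assumptions, not the draft's normative text: the label names and the category names follow the mechanism as later published in RFC 8509, and the -07 draft linked above may use different labels. The key tag 20326 is assumed to identify the new (2017) root KSK. The three boolean inputs stand for "did this lookup return an answer rather than SERVFAIL" and would come from real DNS queries, which are not performed here.

```python
def sentinel_labels(key_tag):
    """Build the two sentinel labels for a key tag (zero-padded to 5 digits).

    ASSUMPTION: label format as in the published RFC 8509; the draft
    version cited in the thread may differ.
    """
    tag = f"{key_tag:05d}"
    return (f"root-key-sentinel-is-ta-{tag}",
            f"root-key-sentinel-not-ta-{tag}")


def classify_resolver(is_ta_ok, not_ta_ok, invalid_ok):
    """Classify the user's resolver from three lookup outcomes.

    Each argument is True if the lookup returned an answer and False if it
    returned SERVFAIL:
      is_ta_ok   -- name under the "is-ta" sentinel label
      not_ta_ok  -- name under the "not-ta" sentinel label
      invalid_ok -- name with a deliberately broken DNSSEC signature
    """
    if is_ta_ok and not not_ta_ok and not invalid_ok:
        return "Vnew"   # validating, trusts the new KSK
    if not is_ta_ok and not_ta_ok and not invalid_ok:
        return "Vold"   # validating, does NOT trust the new KSK: at risk
    if is_ta_ok and not_ta_ok and not invalid_ok:
        return "Vind"   # validating, but no sentinel support: indeterminate
    if is_ta_ok and not_ta_ok and invalid_ok:
        return "nonV"   # not validating: unaffected by the rollover
    return "other"      # mixed or unexpected behaviour


if __name__ == "__main__":
    print(sentinel_labels(20326))
    print(classify_resolver(False, True, False))
```

    Only the "Vold" outcome indicates a user who would lose resolution after the rollover; "Vind" shows why the mechanism yields data only once resolvers that implement it are widely deployed.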

    1. Great summary, thank you.

    2. Just to get an idea about DNSSEC statistics:

      The DNSSEC-advanced ccTLD .nl: https://stats.sidnlabs.nl/#/dnssec

      The not so DNSSEC-advanced ccTLD .ru: https://statdom.ru/tld/ru/report/domainsdnsseccount/#31:by=month

      We are still talking about a small fraction of DNS queries.

  31. Hi Alan,

    Thanks for your summarized points, below are my comments in green

    • There is no accurate way to measure the number of users that may be affected by resolvers that enable DNSSEC but do not have the new trust anchor.

               And there lies the challenge, that is determining the number of users that are going to be negatively impacted by the rollover

    • There is no way for users or major DNS providers (such as the root servers or the servers for each TLD zone file) to know whether the resolvers contacting them are ready for the rollover

              I would put “accurately know”; however, I do not think that getting into such technical matters is necessary. The true challenge lies in assessing the negative impact on users, and that is what we should be concerned with.

    • Most technical people seem to feel that the risk is warranted and we should proceed.

              This comment is supported by the input and discussions on the mailing list set for this purpose by ICANN

    • Many non-technical people have some concerns that the impact will be large and worse, this will reflect very poorly on ICANN

              How did we come up with the above statement?

    • When we did a poll in San Juan, the ALAC was roughly divided with a slight preference to proceeding with the rollover in October 2018
    • The date selected for the rollover is a Thursday, which implies Friday in some parts of the world and leading into a weekend, perhaps the worst time of the week to do such a change. This date cannot be readily changed unless the rollover is delayed
      Why can’t this date be readily changed unless the rollover is delayed? It could be advanced to Wednesday, that is, instead of initiating the rollover on Thursday they initiate it on Wednesday; however, it is again up to the implementation team to decide what’s best
    • A decision to proceed with the rollover now can always be reversed later. The last potential rollover was deferred just days before it was to happen
    • There is a new protocol under development in the IETF, "sentinel", which, when deployed, will allow a user to determine to what extent they may be affected by a rollover. It is not known when it will be ready. The last time a related protocol was developed, it was implemented by key DNS implementers very quickly. There is an IETF meeting going on this week, and perhaps there will be more clarity in a few days. See https://tools.ietf.org/html/draft-ietf-dnsop-kskroll-sentinel-07. When deployed, this will give users the ability to query their provider and raise awareness if necessary. If deployed QUICKLY, it may give us more data by October.

             Waiting for an indefinite amount of time and hoping for the best defies any kind of logical reasoning, this leaves us with no notion on where we are going forward with this

    • ICANN has attempted to contact ISPs. Goran has sent a letter to regulators, who may forward it to Telecoms (who are typically large ISPs). RIRs have just recently been involved. We need to be far more aggressive in reaching out, and the RIRs, who presumably know about ISPs in their regions, should be part of this. ICANN needs to also contact major critical industrial segments such as banking to ensure that they are aware of the issue. The GAC should also be involved.
    • At-Large can play a role by spreading the message through our ALSes and individual members.
    • Some people have suggested that the matter be referred to the SSAC although it is not clear what the question(s) would be.
    • Some people have suggested that the matter be referred to the Board RISK Committee (I think this is not necessary since the decision is ultimately up to the Board and it will rely on its own Risk Committee).

           To ask for the matter to be referred to the board at this stage is like asking the board to step in and look into a matter that is still put out to the community for public comment, which defies the process

            After the plan is submitted to the board, the board could present it to the Risk committee. It’s the board’s decision; however, ALAC may provide advice to the board after the final plan is submitted to it. We could ask for that

    • Regardless of whether we cancel now and set a new date, plan to roll in October and cancel  later, or plan to roll and do it, we need an intensive campaign to raise awareness without causing panic

    • It is not foreseen that new reliable data will be available soon.
    • There is nothing to indicate that operators of DNS resolvers that are operating with only the 2010 KSK will fix their systems soon.  
    • The existence of DNSSEC is important to protect the security of the DNS, where DNSSEC applies digital signatures to DNS data to authenticate the data's origin and verify its integrity as it moves throughout the Internet.
    • It was agreed that each root zone KSK will be scheduled to be rolled over through a key ceremony as required, or after 5 years of operation, and 6 years have already elapsed
    • Postponing the KSK rollover might put the security of the DNS at risk, where the key could be compromised or lost, among other risks stated in the SSAC Advisory on DNSSEC Key Rollover in the Root Zone of 7 November 2013


    Alan, I suggest that we incorporate some of your points in the comment. We could also ask the board for an opportunity to provide advice to them after the final plan is presented to the board; this will give us a chance to highlight any worries that we might still have with regard to any huge negative impact on end users.

    1. To address the questions that Hadia asks:

      How did we come up with the above statement?  I was merely reporting what a number of people said.

      Why can’t this date be readily changed unless the rollover is delayed? I don't know the answer but my understanding is that it is the case. Happy to be proven wrong.

       "Waiting for an indefinite amount of time and hoping for the best defies any kind of logical reasoning, this leaves us with no notion on where we are going forward with this": This item on Sentinel says that if we wait, we WILL have more information, and a tool by which users can probe their DNS resolver. The problem is that we do not know how long this will take.

       "To ask for the matter to be referred to the board at this stage is like asking the board to step in and look into a matter that is still put out for the community for public comment, which defies the process": The matter is ALREADY before the Board.


      At the ALAC meeting on Tuesday, we will decide if we have an actual preference and what else to put in the statement.

    2. There is nothing to indicate that operators of DNS resolvers that are operating with only the 2010 KSK will fix their systems soon.

      They will not with probability > 50%

  32. The last rollover deployment was unsuccessful because of the technical issues stated. Now we are attempting the rollover again. I understand the need, but ICANN had already identified the issues, so my comment is more a question. Do we have a green light that the situation is better and the rollover can happen with a full deployment? Has ICANN made sure that everyone is reasonably ready?

    Is there a need to be hasty to roll out? Deploying resources in house and then stopping yet again is useless. Either that, or ICANN goes with a very phased, region-by-region approach. Last but not least, DNSSEC is not on everyone's radar yet. So please think properly and work out a phased rollout and regional approach if need be ....

    1. Thanks, Kris - not sure what you mean though: rolling the Root KSK is not something one can do 'phased' or 'regionally', 'at a time', right?

      1. True, but do we know that at this point they are all ready? Else we reschedule when all zones are ready to go. Else it is another waste because you need to roll back to the previous key again. For me, I would be adamant unless we have a clear-cut go-ahead from all zones. I know a lot are still rolling out DNS and haven't really moved to DNSSEC no matter what. So we start over every time, just to ensure staff has something to do. I would really like this rollout to be a success, not a go and restart. I do not see any document or anything that suggests that the zones are really ready.

        1. Can you give an example of the "zones" you are referring to?

          I do not grasp it.

        2. "True but do we know that at this point they are all ready?"

          Kris, we actually do know that some are not ready; however, we cannot tell why they are not ready, or if they are going to be ready anytime in the near future, and in many cases the validating resolvers cannot even be identified. However, there are estimates of the percentage of resolvers that are not ready.

          "Else we reschedule when all zones are ready to go."  We don't know when this will happen, maybe never. To wait for an indefinite amount of time hoping for the best, if not the impossible, defies any kind of logical reasoning.

           

    2. Actually Kris I wouldn't say that the last rollover deployment was unsuccessful, as it did not actually happen, it was postponed, we are now trying to guarantee that when it actually happens it is successful.  

       Do we have a greenlight that the situation is better and the rollover can happen with a full deployment? I don't think that what we are trying to guarantee here is full deployment, we are trying to achieve the first KSK rollover with minimum and acceptable negative impacts, however if we do achieve full deployment that would be great

       Either that or ICANN go in a very phased and regional approach - I am not sure that I correctly understand what you mean however changing the root KSK  cannot happen by region

  33. To add more to it : Level of preparedness !!!!

    As Alan has really pitched it, the issue of rolling out is not only with the root zones, it's also with ISPs. Well, actually it has more to do with ISPs. Not all are DNSSEC capable, and many do not really understand the need, the whys, etc. Take IPv6 deployment: how many ISPs have really deployed or even advertised their v6.....

    I am fully in agreement with Alan's summary... now or later but do it when everyone is clear and prepared.

    1. It's unknown how many servers are badly operated. It's unclear how many people will be affected. It's impossible to gather strong evidence about those issues beforehand.

      In order to create awareness and preparedness, roll! Regularly.

    2. Kris I really don't think that any operator that has deployed DNSSEC does not understand the need !!! As for those who did not deploy DNSSEC, they don't need to care because they will not be affected

  34. As some previous comments have already pointed out, I believe this is an excellent opportunity to get our RALOs and ALSes involved in ICANN policy issues. That is, we may want to raise awareness of this security issue among all the end users around the world, and mobilize all ALSes to urge their local ISPs to comply with the KSK requirement. This will not only be good for this issue, but will also be good in the long run.

    1.  I agree Kaili, and on my return I reported to my Minister about the discussions that we'd had among the ALAC and with David Conrad. I had a query about the cost of connecting to DNSSEC (that's where we are at in our country, the Cook Islands, and in fact across many of the countries in the Pacific) and David has sent me information that again I was able to pass on. I raise the issue of security with the Minister after each meeting, but this meeting provided information which I felt would be helpful for their decision making.

    2. Kaili, I went to our LACRALO RIR and proposed this collaboration of all our ALSs, for 22 countries in LAC. They told me that it was a very good project, but they never contacted me again...

  35. A few comments on risk (and an omnibus draft to follow):

    Going ahead with the KSK rollover

    • ICANN simply does not know which resolvers will break. To a large extent this will occur at the ISP level globally. In more developed nations, a fix could take up to a week (from a conversation with ARIN). In less developed nations with less sophisticated ISPs, it could be longer. In countries with few ISPs, this could be a catastrophic single point of failure. 
    • The data collected to date does not conform to expected SSAC models. SSAC is still building untested hypotheses. The technologists are fine with throwing the switch, breaking stuff, fixing stuff, and moving along however:
      • This will likely further slow the adoption of DNSSEC as one primary solution to a broken resolver is to disable it. Current data suggests such ISPs in the future are unlikely to turn it back on.
      • Without knowing which resolvers will break and who the end users are, we have no sense about the scope of the risk for outages in the wake of the rollover. It could be trivial or could blackout whole countries.
      • The latter possibility cuts to the core of ICANN's trust mission. End users unaware of the fine points of the KSK rollover may awake and find their internet doesn't work and will ask why? ICANN will inescapably be to blame.

    Delaying the KSK rollover

    • In further delaying the rollover we accrue additional risk of key compromise. While this risk is initially minute, good security practice dictates regular changes to the key. Delay increases the likelihood of key compromise.
    • Furthermore, building muscle memory for the key rollover is beneficial. Having never done one before, we don't know where the pain points are.
    • As an aside, a member of SSAC noted that there is a separate procedure for an emergency key change that is unaffected by this decision. 

    Miscellaneous notes:

    • ICANN has an existing communications plan it used to publicize the DNSSEC rollout. The KSK rollover is largely the same audience. After retooling the content, the same plan could be used to communicate to resolver maintainers. 
    • I asked the Board in public session about the rollover and it noted the risk committee, SSAC, and RSAC were already investigating. 
    • The KSK rollover plan offers no details about the communications plan.

    The problem IMO is that both choices involve undefined and unscoped risk. My previous draft asked the board to try to shed light on this risk and to quantify it. At present neither I nor anyone else (I think) can say with any degree of certainty whether going ahead or delaying is the safer course of action, risk-wise. Asking for advice on which is the safer course, so that we can advocate and support it in the interest of end users, is critical.

    1. ICANN simply does not know which resolvers will break.

      Correct. To my knowledge there is no way to gather such information.

      To a large extent this will occur at the ISP level globally. In more developed nations, a fix could take up to a week (from a conversation with ARIN). In less developed nations with less sophisticated ISPs, it could be longer. In countries with few ISPs, this could be a catastrophic single point of failure.

      I'm really interested in reading this conversation, because it contradicts my experience. If a validating resolver has not learned the new KSK, that resolver will break as soon as the old KSK's signatures expire. I know, from my own failure to operate a (test) signed root correctly, that any central validation failure in the root itself causes every DNS query to fail. So this is not a failure that causes some trouble; it's a total show stopper. In such a case every ISP will take whatever action is needed to solve the problem within hours. And you have to use the physical console of the server, in person. Talking about days or weeks to fix this error is incomprehensible to me.

      The data collected to date does not conform to expected SSAC models. SSAC is still building untested hypotheses.

      That's not surprising. SSAC members are not gods who recognise every possible error in advance and know how to deal with it. Fortunately, most errors fall into some category that can be reasoned about: you call these hypotheses. Most people do not have first-hand experience with hard-to-debug errors in production environments. Experience is gathered from lab environments and typical (assumed) setups. Real-world production is much, much stranger. So in reality, nobody can really know what will happen.

      The technologists are fine with throwing the switch, breaking stuff, fixing stuff, and moving along however:

      As said before: technically, there is no other way to find out. We would all prefer to have a palantir or some other mechanism to detect erroneous setups beforehand, but there are none. To understand the issue a bit more: there are not only the central ISP resolvers, there is also a vast number of servers relying on their own DNS resolvers. About 7% of hosted servers are horribly maintained (so that they contribute to DDoS); can you guess how many are ready for a KSK rollover?

      This will likely further slow the adoption of DNSSEC as one primary solution to a broken resolver is to disable it. Current data suggests such ISPs in the future are unlikely to turn it back on.

      That's correct. OTOH, should we stop deploying DNSSEC in order not to lose them? That makes no sense to me. We also know that there is an ever-increasing need for DNSSEC: customers insist on DNSSEC because checker sites tell them to do so. Disabling DNSSEC will not be a long-term solution.

      Without knowing which resolvers will break and who the end users are, we have no sense about the scope of the risk for outages in the wake of the rollover. It could be trivial or could blackout whole countries.

      We know what will happen: total blackout. So every company relying on such a server will have to fix it really soon. The trivial fix is to disable DNSSEC, the better one is to relearn the KSK (and fail again on the next rollover), and the correct one is to deploy RFC 5011.

      The latter possibility cuts to the core of ICANN's trust mission. End users unaware of the fine points of the KSK rollover may awake and find their internet doesn't work and will ask why? ICANN will inescapably be to blame.

      That seems to be the central point of all the comments. I missed it because I did not acknowledge its importance. My fault.

      So, what can we and ICANN do? Very good PR. We have to get this event into the news; it has to be broadcast in the same way as the Year 2000 problem. We have to ensure that every problem during the rollover will be attributed to the ISP (even if it has nothing to do with DNS). So we have to blame the ISPs, hosters, and IoT companies beforehand through the press, (social) media, and TV shows. They have to feel guilty weeks before the date.

      I should rephrase the draft recommendation.

    2. Hi John,

      Please find my comments below in green 

      Going ahead with the KSK rollover

      • ICANN simply does not know which resolvers will break.
        True. ICANN stated clearly on 18 December 2017 that in most cases it cannot know which validating resolvers have not updated their trust anchors. Even when it can determine which validating resolvers are not updated, it cannot tell the reason, and in most cases, even if it can identify the validating resolvers, it cannot contact the operators. This situation is not foreseen to change in the near future.
      •  To a large extent this will occur at the ISP level globally.

        ICANN’s announcement on 27 September 2017 mentions that the KSK rollover was postponed because the data coming from a large number of ISPs and network operators reported only the old key. It also mentions that one of the reasons could be that one widely used resolver program appears not to be automatically updating the key as it should. However, the information quoted below

        “In more developed nations, a fix could take up to a week (from a conversation with ARIN). In less developed nations with less sophisticated ISPs, it could be longer. In countries with few ISPs, this could be a catastrophic single point of failure.”

        is not referenced anywhere. On February 6, 2018, ARIN stated on its website, in a post under the title Update To ICANN’s KSK Rollover: “We are not involved in the rollover itself, nor will anything here at ARIN change as a result of the rollover.”


        • This will likely further slow the adoption of DNSSEC as one primary solution to a broken resolver is to disable it. Current data suggests such ISPs in the future are unlikely to turn it back on.
          This is not our current concern; it could be tackled later.
        • Without knowing which resolvers will break and who the end users are, we have no sense about the scope of the risk for outages in the wake of the rollover. It could be trivial or could blackout whole countries.
          Estimates are available; however, ICANN has already mentioned that the analyzed data did not and will not provide better information. Additionally, there is no expectation of more reliable data in the near future and, by the way, none of the estimates ever referred to blackouts in whole countries.
        • The latter possibility cuts to the core of ICANN's trust mission. End users unaware of the fine points of the KSK rollover may awake and find their internet doesn't work and will ask why? ICANN will inescapably be to blame.
          Let's not forget that ICANN's core mission includes the security of the Internet unique identifiers   


      Delaying the KSK rollover


      • In further delaying the rollover we accrue additional risk of key compromise. While the risk is initially minute, good security practice dictates regular changes to the key. Delay increases the likelihood of key compromise.
      • It was agreed that each root zone KSK will be scheduled to be rolled over through a key ceremony as required, or after 5 years of operation; 6 years have already elapsed.
      • Furthermore, building muscle memory for the key rollover is beneficial. Having never done one before, we don't know where the pain points are.
      • As an aside, a member of SSAC noted that there is a separate procedure for an emergency key change that is unaffected by this decision.
      • Postponing the KSK rollover might put the security of the DNS at risk: the key could be compromised or lost, among other risks stated in the SSAC Advisory on DNSSEC Key Rollover in the Root Zone of 7 November 2013.
      • No further reliable data is foreseen to be available in the near future. That is, we would be waiting for an indefinite period of time hoping for the best, which gives us no indication of where we are going with this.
      • There are no indications that operators who have not updated their systems are going to do so in the near future. Waiting for the unexpected to happen defies any kind of logical reasoning.


        • ARIN conversation: This point comes from an off the record conversation with a senior trusted ARIN technologist I had at ICANN61.
        • DNSSEC: I disagree. It is part of the risk we incur for the KSK Rollover and should be considered.
        • National issues: Sidebar conversations with SSAC confirmed that this is a possibility. Hence the request for advice. Furthermore, this is the rationale for an ISP remediation plan in the aftermath of a failed KSK rollover.
        • Security and trust go hand in hand. The reason for asking for advice is to better understand the risk.
        • The KSK rollover was already delayed once because the board deemed the risk to be too great, SOP notwithstanding.
        • Agreed on security of DNS (see my first bullet point in the section).
        • I am not asking for an indefinite wait. The advice asks for advice by ICANN62 at which point ALAC acting in its advisory capacity can make a statement on the KSK rollover outside the bounds of the public comment period.
        • The whole point of the wording is to seek a holistic risk assessment on the relative merits/risks. Fundamentally, if we delay again for x time, can we use that time to good effect to substantially improve the risk profile? If not, then we should go ahead with rollover. However, I want ICANN's best assessment on the relative risks before coming to that conclusion. I want us to exercise due diligence and best practices with respect to risk management. The fact that the board has already asked SSAC, RSAC and the Risk committee to look at this is heartening as it recognizes the risk.
        1. ARIN conversation: This point comes from an off the record conversation with a senior trusted ARIN technologist I had at ICANN61

          I am happy to acknowledge the conversation, but since it was off the record we cannot mention it in our statement.

  36. First, a big thanks to John for carefully laying out what is known.

    Next - based on that, it is really unclear what the best way forward is.  As John says, BOTH choices involve undefined and unscoped risk. Sooo - my question to John - WHOM do we ask for advice on which is the safer course?  From what I am reading, neither choice is an obvious one.  If there is a best source of advice, I am happy to listen.

  37. Whereas:

    Resolved:

    The ALAC advises the Board

    • to provide a holistic risk assessment to the community on the KSK Rollover at ICANN62 comparing the relative risks of implementing the KSK rollover as provided in the revised plan vs. further delaying the KSK rollover. The risk assessment should include an evaluation of both the technical and reputational risks to the security and stability of the Internet and should reflect opinions of the SSAC, RSAC, and Risk Committee
    • To direct ICANN to develop a robust ISP mitigation strategy in the event of resolver failure affecting end users and communicate it to the community
    • To provide clarification on the communication plan. ALAC notes that the recipients for the KSK rollover communication plan considerably overlap those of the DNSSEC implementation

    Furthermore

    • Upon receipt of the risk assessment at ICANN62, ALAC will be better prepared to offer advice on the KSK Rollover Plan in a timely fashion prior to its proposed execution in October 2018
    1. Hi John,

      We are currently commenting on the plan put forward by ICANN org. After the comment period, and based on the comments provided, ICANN will put together a final plan to present to the board. At that point the ALAC can and should advise the board on the way forward. In other words, your recommendation to provide advice to the board will happen, but not now; it is a second chance for us to provide our input. I don't see the reason for requesting it now; needless to say, it defies the initial process.

      However if we reach no consensus among the ALAC members on the way forward we could state all of our concerns whether they are pros or cons

      As for the request for the outreach, post recovery and PR plan this should be requested now in our comments on the proposed plan so that it can be included in the final plan that will be presented to the board (which we will provide advice on later)

      1. I understand; that's why the link in the proposal references the plan. We are asking now in the public comment period because we've been asked to by the process. We're requesting because as an AC we have the latitude to take the information they provide at ICANN62 and comment further. It in no way defies the process. Rather it takes advantage of the process and our capabilities.

        If we reach no consensus that would be an excellent path forward on advice.

        Included in the draft (bullets 2 & 3 of resolved)

  38. Please note that this draft does not comment on whether or not to go forward. We're mixed on that. ALAC has the capacity and mandate to offer advice outside the public comment period. We're not limited. By asking for information at ICANN62, we can offer informed advice prior to the planned execution.

    1. As you stated multiple times yourself: there is no way to gather robust information about ill-behaving systems ahead of the rollover. Therefore it's pointless to ask somebody to come up with such information. There is no data and there will be no data.

      All we have are a bunch of hypotheses, based on educated guesses.

      Let me tell you my own story: I'm not the person, who enables DNSSEC by accident. But some of my own resolvers were not ready for the KSK roll over until I double checked the situation. It was a simple permission problem.

      But: nothing was reported, nothing broke. The server was even able to respond (from RAM) that it was ready for the new KSK, using the new protocol extension. But in reality it was not ready. It would even have survived the rollover period without any problems, but on the next reboot it would have stopped working.

      If you really believe you will get valid data about readiness, you will be very disappointed. Or you know some god-like entity which will come down to the ICANN62 event and tell us about the hidden mysteries of the future. I, personally, do not know such a deity.

      So please do not try to postpone the decision.

      1. As I understand it, there are many parties actively striving to better understand the data we have and gathering more data. ICANN62 and the planned October KSK rollover are months away. I understand the limitations of robustness. Nevertheless, what I want is an expert opinion from the board on whether the risks of delay outweigh the risks of implementation.

        On the remediation side, I would really like to see the communication plan ICANN has implemented and the contingency plans ICANN has in place for ISPs whose resolvers break. 

        At the end of the day the draft I wrote seeks to bound risk as best we can prior to the rollover to make an informed decision. The lack of robustness is a problem and one we must accept into the risk model. I also understand the limits of valid data. In essence, I'm trying to weigh the risks of delaying the rollover vs going ahead with it and would prefer not to make a blind choice.

        The advice I composed does not advocate postponement; it asks the board to gather data and provide a best assessment so that we can issue informed advice. 

        1. Good luck. You are asking for the impossible.

          1. Lutz,

            I'm not asking for something definitive; I'm asking for a best guess upon which to make a decision. We've yet to receive that and there is time before October left.

            1. The main activity to start, is public information. This is not a quick shot blog posting on the ICANN website, it's a coordinated set of easy to understand instructions for various groups (ISP, end customers, governments, media, yellow press, ...)

              I do not see, how to even think about this in the remaining months ... Do you really believe that such a world information action can be done in a week? Are you kidding me?

              1. The outreach audience for the KSK rollover is the same as it was for the DNSSEC outreach; only the content is different. It's an ISP outreach.

                1. Sorry, John. You do not understand the problem.

                  You introduced the point of "blaming ICANN" into the discussion, but now you will not talk to the very audience that will do so?

                  How do you want to prevent any "blame", if you keep silent?

                  1. I respectfully disagree.

                    Not at all. I'm just saying that ICANN has existing communications plans that can be retooled with KSK rollover content.

                    I have no intention of keeping silent. Sorry I miscommunicated. (See third bullet point).

                    1. Those communication plans and channels are too narrow for the required job.

                      They only stay in touch with those people who are already ready to roll over.

                      They miss all those (technical) people responsible for systems that are not ready. Several commenters have already quoted the problem reports on reaching those people.

                      They also miss all the people who will be subject to failure during the rollover and who, as you correctly mentioned, might blame ICANN for the problem.

                      1. That's why the statement draft asks for clarification on the communication plan.

  39. Something that everyone seems to be forgetting is the Sentinel work going on in the IETF.

    The 24 March version can be found at https://tools.ietf.org/html/draft-ietf-dnsop-kskroll-sentinel-08.

    Alan
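
    As a rough illustration of how a sentinel-style readiness check could be read, the sketch below classifies a resolver from the outcome of three lookups made through it: a name asking "is the new key a trust anchor", a name asking "is it not", and a name that is deliberately badly signed. The function name, argument names and result strings are assumptions made for illustration, and the table is a simplified reading of the draft's idea rather than its normative text.

```python
def classify_resolver(is_ta_ok: bool, not_ta_ok: bool, bogus_ok: bool) -> str:
    """Interpret three lookups made through one resolver.

    Each argument is True if the lookup returned an address and
    False if it failed (for example with SERVFAIL).
    All names and result strings here are hypothetical.
    """
    if bogus_ok:
        # A badly signed name resolved, so nothing is being validated.
        if is_ta_ok and not_ta_ok:
            return "not validating"
        return "indeterminate"
    if is_ta_ok and not not_ta_ok:
        return "validating, new key trusted"
    if not_ta_ok and not is_ta_ok:
        return "validating, new key NOT trusted"
    if is_ta_ok and not_ta_ok:
        # Validates, but ignores the sentinel labels entirely.
        return "validating, sentinel not supported"
    return "indeterminate"


if __name__ == "__main__":
    print(classify_resolver(True, False, False))
    print(classify_resolver(False, True, False))
```

    Only the second outcome identifies users who would be affected by the rollover; mixed results from several resolvers behind one user cannot be separated by a check like this.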


    1. Alan, this is one of the reasons why I believe that all we should be doing as end users is to challenge those directly involved in the technicalities to ensure minimal impact, and to challenge ICANN to improve its awareness efforts. The reality is that ensuring minimal impact is of more concern to service providers than to service receivers; hence our comment, as the receivers, should be more about calling for improved awareness. I don't think we should be in an authoritative position to put a condition on the rollover date, especially as we are not in a position to accurately interpret whatever data is provided, nor are we in a position to fix the problems as end users. Regards
      1. Seun, I agree with you. Let's not say whether we support the rollover in October or not; let's just say that we request an extensive outreach, post-recovery and PR plan, and a guaranteed minimum negative impact on the end users if and when the rollover happens.

        1. Yup, except that I don't think adding "guaranteed" will be practical, as this implies that ICANN would have to be able to get 100% correct data in order to provide such a guarantee, and we both know that isn't going to be feasible. I suggest we leave the word "guarantee" out of it but request that they be well prepared to keep the negative impact to a minimum.
  40. As for the advice to the board, we shall provide it when the final plan is presented to them, which definitely could include having the board's risk committee look into it

  41. John (and others): Why do you wish to treat this as Advice to the Board when there is an open public comment? All the Board can do now is say that the plan is out for comment and that it will consider the input. And by NOT submitting our view for the PC, there is a good chance it will not be considered in the analysis.

    Let's focus on what to say in the comment. Plenty of time for advice to the board if we do not like the outcome.

    What I am hearing (extrapolated) is:

    • Our community understands the need to roll the KSK but is divided on which path to follow at this point.
    • We want an intense awareness campaign, including targeting ISPs, telcos, and major industries. Use RIRs where they can identify potential DNS providers.
    • An information packet we can send to our community (ALSes, Individual Members) to get them to prompt their local ISPs or others, and which can be similarly used by other constituencies within ICANN.
    • A holistic review of the situation (including a risk assessment of alternatives) in time for further discussion at ICANN62, including the then-current state of whatever data is available and a forecast of Sentinel availability.
    • We would like to better understand under what conditions a rollover date change is possible to avoid doing it at the end of a week.
    1. I quite agree. Sorry I miscommunicated.
  42. We have three draft statements in the "First Draft" box above. Two are attributed, one is not (I think it is Lutz's original statement), and none of the additions have a date, so it is impossible to know what order they came in. That makes it hard for someone looking at this now to understand where we are.

    1. Indeed, a shorter statement is needed. Submitted a suggested wording below

  43. Option 1: the change is made and there are problems that can not be quantified. Particularly for end users. It is the responsibility of ICANN. Criticism worldwide. ICANN does not have secure DNS administration.
    Option 2: The change is not made. There may be security problems in the DNS. ICANN responsibility. Criticism worldwide. ICANN does not have secure DNS administration.
    I think we should proceed with option 1, trying to mitigate the risks with some actions that have been proposed. ALAC should note this to the Board.
    My only observation in this problem. Given ICANN's long preparation time for this topic, if it were in the private sphere, the person in charge would no longer be in office within the company.
    Unfortunately, at this point, it seems that it only remains to define responsibilities after the change of password.
    I think the report is ending correctly.

  44. Throwing my hat in drafting a statement (March 27 2018)

    The ALAC understands the need for proceeding with the Root KSK Rollover. However, the ALAC remains concerned about the lack of clarity regarding the number of DNS resolvers not yet ready for the Root KSK rollover in October 2018 and the potential disruption of Internet services for the users of such DNS resolvers.

    The ALAC therefore urges ICANN to raise awareness of the Root KSK Rollover in as many ways as possible: via interviews with media organisations that covered the postponement of the KSK Rollover in October 2017, via presentations at ICT/Internet conferences, and by providing online informational materials that can be shared with the global Internet community.

    Also, it is important that information about the KSK Rollover be developed for Internet end users (in the 6 UN languages) on
    - how to check whether the DNS they use is ready for the KSK Rollover.
    - what to do if their DNS is NOT ready for the KSK Rollover.

    The ALAC welcomes regular updates from ICANN on their KSK Rollover preparations and developments before October 2018.



  45. +1 to Dev's:

    Raise and Outline the concern and its practical implications; offer a solution with verification modality.

    I also like John's stark 'just the facts, ma'am' approach but as Alan pointed out, that would be good and sufficient as Board Advice.  But this is not what's asked for here. 

    -Carlton

    1. I think that your message for DNSSEC-enabled resolvers is a bit too rosy. We know from the stats that we do have that many DNSSEC-enabled resolvers do not have the new trust anchor installed.

      1. It was a first approach.

        The wording has been changed to make the possibilities clear.

      2. Changed the wording another time. Now it is not so optimistic anymore about a successful rollover, if DNSSEC is validated.

    2. I added some background of this test page, how it works, how it can be implemented without causing trouble in the DNS operations of real productive zones, ...

      https://lutz.donnerhacke.de/eng/Blog/Ready-for-the-Root-Zone-Key-Rollover


  46. I just added to the proposed draft:

    ICANN should provide a test page for end users which tells them in a very simple way whether they are affected by the KSK rollover or not (example http://dnssec.donnerhacke.de/). This page should offer a link to the information gathered via RFC 8145: the end user can enter the IP of the resolver (preferably detected automatically using a Nonce-FQDN) to check whether the active resolver already trusts all active keys. This tool should stay in place for further rollovers.
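
    To make the RFC 8145 part more concrete, here is a minimal sketch of the query-name form of that signal: a resolver reports the key tags it trusts as four-digit hexadecimal values, sorted and joined by hyphens after a "_ta-" prefix. The function names are hypothetical, and a real lookup tool would work on collected root-server data rather than on single strings; 19036 and 20326 are the key tags of the old and new root KSK.

```python
OLD_KSK_TAG = 19036  # root KSK in use since 2010
NEW_KSK_TAG = 20326  # root KSK introduced for the rollover


def key_tag_qname(key_tags):
    """Build the first label of an RFC 8145 key tag query.

    Tags are rendered as four hex digits, in ascending order.
    """
    return "_ta-" + "-".join(f"{tag:04x}" for tag in sorted(key_tags))


def parse_key_tags(qname):
    """Extract the set of key tags from a '_ta-...' query name."""
    label = qname.lower().split(".")[0]
    if not label.startswith("_ta-"):
        raise ValueError("not a key tag query: " + qname)
    return {int(part, 16) for part in label[len("_ta-"):].split("-")}


def trusts_new_ksk(qname):
    """True if the reporting resolver lists the new KSK as a trust anchor."""
    return NEW_KSK_TAG in parse_key_tags(qname)


if __name__ == "__main__":
    print(key_tag_qname([NEW_KSK_TAG, OLD_KSK_TAG]))
    print(trusts_new_ksk("_ta-4a5c"))
```

    As noted earlier in this thread, such a report only reflects what the resolver holds in memory, so a positive answer does not prove that the resolver will still be ready after a restart.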

  47. Not sure we need to push for apps.

  48. I thought the same. But he says example ...