I recently worked on an interesting issue while spinning up a new domain controller in a brand new site: promotion went just fine, but the initial replication failed.

Jim Jones

Jim Jones has been a SysAdmin for 15 years and is currently working as a Sr. Network Administrator in West Virginia, USA. Honored to be elected a vExpert and Veeam Vanguard, Jim can be found on Twitter @k00laidIT and at his personal site, koolaid.info.

What initially tipped me off was that, although the DNS Server service was running, the DNS console reported that the DNS server wasn’t running whenever I tried to open it. Beyond that, the Active Directory management tools were exceptionally sluggish, and neither the sysvol nor the netlogon share had been created on the new DC.

Site replication

After working on this myself for a while, I ended up contacting Microsoft support and eventually learned that the issue has no publicly accessible knowledge base article but is evidently documented internally. In this article I’m going to outline the specifics of the issue, the commands I found helpful in troubleshooting it, and what ultimately fixed it.

Problem description

A good starting point is to visualize the network, so please refer to the network diagram above. In my situation all domain controllers are meshed with replication connections to each other. Field office 3 is a brand new location, so a new site and subnet were set up first and then a Windows Server 2008 R2 server was spun up in that subnet. I installed the Active Directory Domain Services role and ran dcpromo, which completed with zero errors, and only then began to see the issues described above. Further inspection showed that no replication connections had been created for the server in AD Sites and Services. The following errors also showed up repeatedly in the event log:

  • ActiveDirectory_DomainService 1865 “The Knowledge Consistency Checker (KCC) was unable to form a complete spanning tree network topology. As a result, the following list of sites cannot be reached from the local site.”
  • ActiveDirectory_DomainService 1311 “The Knowledge Consistency Checker (KCC) has detected problems with the following directory partition.”
  • ActiveDirectory_DomainService 1566 “All directory servers in the following site that can replicate the directory partition over this transport are currently unavailable.”
  • DNS-Server-Service 4013 “The DNS server is waiting for Active Directory Domain Services (AD DS) to signal that the initial synchronization of the directory has been completed.”

Above and beyond these errors, the portqry.exe tool showed that the server was not listening on any of the relevant domain controller ports I checked, TCP 137-139 or UDP port 53.
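
When portqry.exe is not at hand, the same kind of port check can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: it tests TCP connects only (a UDP port such as DNS on 53 cannot be verified with a simple connect), the port list is an assumed set of commonly used domain controller ports rather than anything from the support case, and the function names are made up for illustration.

```python
import socket

# Assumed set of TCP ports a domain controller normally answers on.
# This list is illustrative; adjust it to whatever you need to verify.
DC_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB (sysvol/netlogon shares)",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_dc_ports(host, ports=DC_TCP_PORTS, timeout=2.0):
    """Map each port number to True (accepting connections) or False."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling check_dc_ports with the new domain controller's host name returns one True or False per port, which is enough to see at a glance whether the server is answering at all.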

Problem troubleshooting

Once the problem was as fully defined as possible, both by myself and by the Microsoft support engineers, the troubleshooting process began. Before contacting support I took the generic step of demoting and then re-promoting the domain controller, with no noticeable effect. After contacting support I was honestly surprised that this seemed to be a staple of troubleshooting for them as well: at each tier of support I worked up through, the process was done again, and in total the domain controller was promoted 4 times. Once we got past that, a handful of commands provided quite a bit more information. These include:

  • Repadmin /syncall /AdePq: Performs a synchronization for a server with all of its replication partners; the modifiers help in performing the sync in a multisite environment
  • Repadmin /replsum: Summarizes the state of replication of the forest
  • Repadmin /kcc *: Forces a recalculation of the topology, which has the effect of rebuilding the automatically created partner connections in Sites and Services
  • Dcdiag /test:Connectivity: dcdiag overall is great, but with the /test modifier you can run only the specific tests you need
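
If you end up running these four commands over and over, as I did, they can be wrapped in a small script. The following is a minimal sketch, assuming it runs on the domain controller itself with repadmin and dcdiag on the PATH; the names DIAGNOSTICS and run_diagnostics are hypothetical, and the runner parameter exists only so the function can be exercised without a real domain controller.

```python
import subprocess

# The four diagnostics from the list above, in the order they appear.
DIAGNOSTICS = [
    ["repadmin", "/syncall", "/AdePq"],
    ["repadmin", "/replsum"],
    ["repadmin", "/kcc", "*"],
    ["dcdiag", "/test:Connectivity"],
]


def run_diagnostics(commands=DIAGNOSTICS, runner=subprocess.run):
    """Run each command and return a list of (command line, exit code, output)."""
    results = []
    for argv in commands:
        # Passing a list avoids shell parsing, so "*" reaches repadmin unchanged.
        done = runner(argv, capture_output=True, text=True)
        results.append((" ".join(argv), done.returncode, done.stdout))
    return results
```

Saving the returned output after each attempt makes it easier to compare the state of replication before and after a change.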

Problem solution

As stated in the introduction, the problem here ended up being one known within the Directory Services support group but, as far as they or I know, not documented publicly anywhere. Essentially, if you bring up a domain controller in a site that does not already contain a fully replicated domain controller, replication will continuously fail; as soon as the domain controller is logically placed in a site with a “good” domain controller, it will replicate. So in this case the fix was as simple as going into AD Sites and Services, choosing Move on the domain controller with the issue, and putting it in a different site.

Once that was done I again ran repadmin /kcc * to create the correct site connections, followed by repadmin /syncall. After replication finished I noticed that the local DNS server was functioning correctly and that the sysvol and netlogon shares had been created on the server. Next I ran the repadmin /replsum command again and saw that successful replication had occurred. Finally I was able to logically move the server back to the correct site, and everything functioned normally.
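
The appearance of the sysvol and netlogon shares was my signal that replication had finally completed, and that check is easy to script. Here is a rough sketch, assuming the shares are reachable over their usual UNC paths from the machine running it; the function names are hypothetical, and the exists parameter is there only so the logic can be tried without a real server.

```python
import os


def share_paths(server):
    """Return the UNC paths of the sysvol and netlogon shares on a server."""
    return [rf"\\{server}\SYSVOL", rf"\\{server}\NETLOGON"]


def missing_shares(server, exists=os.path.isdir):
    """Return the share paths that cannot be reached on the given server."""
    return [path for path in share_paths(server) if not exists(path)]
```

An empty list from missing_shares means both shares are present; anything else lists the shares that are still missing.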

8 Comments
  1. Jaymz Mynes 6 years ago

    Not exactly the same problem I had, but another solution can be exporting replication data to a file (from a DC in another site) and importing it when promoting the new DC. I've seen DCs in other sites work fine with normal promotion and I've seen them not work with normal promotion. I like your way as it is quicker, but if moving a DC into another AD site is prohibited (for whatever reason), the export/import method works pretty well.

  2. Paul 6 years ago

    I'm guessing using install from media for the dcpromo would have negated this problem??

  3. Jim Jones 6 years ago

    @Jaymz, that sounds like a good solution as well, but honestly the process I described was very quick (30 minutes) once the solution was found. What got me was that it took MS support 5 days to find an already documented internal bug.

    @Paul, in this case the install actually went fine, it just had an issue with the original replication of Active Directory.

  4. Mathias 6 years ago

    @Jim, thanks a lot! Your solution made my day!!!

  5. anand 4 years ago

    Hello everyone,

    I'm facing a very strange AD replication issue with DCs in different subnets.

    Site A : 172.16.1.0/16

    Site B: 192.168.20.0/24

    The sites are connected through P2P and an IPsec tunnel (two different ways).

    I've verified that all ports are open on the ISP side and that the firewall is off on the DCs.

    I got the following errors:

    Error: The RPC server is unavailable 1752

    Port query to TCP 135

    TCP port 135 is listening, but the error "There are no more endpoints available from the endpoint mapper" is shown.

    Alternative solution:

    When I assigned an IP address from the second subnet to the DC in the other subnet, I got a reply from the endpoint mapper and AD synced properly.

    My question is why the DCs do not replicate with a single IP (from their own subnet).

    Please guide.

  6. MikeR 3 years ago

    This article helped me as well. I added DNS to the new DC I brought up and ran repadmin /kcc *

    Full replication happened after that.

  7. Yongama 1 year ago

    Hi guys, I'm currently facing this issue. Please assist.

    • DNS-Server-Service 4013 “The DNS server is waiting for Active Directory Domain Services (AD DS) to signal that the initial synchronization of the directory has been completed.”

  8. OJRam 12 months ago

    Last weekend, after bringing up our two Windows Server 2008 R2 domain controllers (both VM's, both hosted in Hyper-V) that were down for 24 hours due to power maintenance, they won't sync again. The following RPC-related error began to consistently appear in almost every diagnostic we ran:

    Last attempt (...) failed, result 1753 (0x6d9):
    There are no more endpoints available from the endpoint mapper.

    After three days of troubleshooting the Windows side of the issue, we ended up disabling Symantec Endpoint Protection (SEP) clients on both DC's and then boom they began to synchronize as usual; I suspect some SEP update needing a restart somehow triggered blocking LDAP/RPC connectivity between them.  Needless to say, we went back to happiness.

© 4sysops 2006 - 2020