More User Profile Sync issues in SP2010: Certificate Provisioning Fun

Wow, isn’t the SharePoint 2010 User Profile Service just a barrel of laughs? Without a bit of context, when you compare it to SP2007, you can do little but shake your head in bewilderment at how complex it now appears.

I have a theory about all of this. I think that this saga started over a beer in 2008 or so.

I think that Microsoft decided that SharePoint 2010 should be able to write back to Active Directory (something that AD purists dislike but that sold Bamboo many copies of their sync tool). Presumably the SharePoint team get on really well with the Forefront Identity Manager team and over a few Friday beers, the FIM guys said “Why write your own? Use our fit-for-purpose tool that does exactly this. As an added bonus, you can sync to other directories easily too”.

“Damn, that *is* a good idea”, says the SharePoint team and the rest is history. Remember the old saying, the road to hell is paved with good intentions?

Anyways, when you provision the UPS enough times, and understand what Forefront Identity Manager does, it all starts to make sense. Of course, for it to make sense you have to mess it up in the first place, and I think that nearly everyone will – because it is essentially impossible to get it right the first time unless you run everything as domain administrator. This is a key factor that I feel did not get enough attention within the product team. I have now visited three sites where I have had to correct issues with the user profile service. Remember, not all of us do SharePoint all day – for the humble system administrator who is also looking after the overall network, this implementation is simply too complex. Result? Microsoft support engineers are going to get a lot of calls here – and it’s going to cost Microsoft that way.

One use-case they never tested

I am only going to talk about one of the issues today because Spence has written the definitive article that will get you through if you are doing it from scratch.

I went to a client site where they had attempted, unsuccessfully, to provision user profile synchronisation. I have no idea what they tried because unfortunately I wasn’t there, but I made a few changes to permissions, AD rights and local security policy as per Spence’s post. I then provisioned user profile sync again and hit this issue: a sequence of 4 event log entries.

Event ID:      234
Description:
ILM Certificate could not be created: Cert step 2 could not be created: C:\Program Files\Microsoft Office Servers\14.0\Tools\MakeCert.exe -pe -sr LocalMachine -ss My -a sha1 -n CN="ForefrontIdentityManager" -sky exchange -pe -in "ForefrontIdentityManager" -ir localmachine -is root

Event ID:      234
Description:
ILM Certificate could not be created: Cert could not be added: C:\Program Files\Microsoft Office Servers\14.0\Tools\CertMgr.exe -add -r LocalMachine -s My -c -n "ForefrontIdentityManager" -r LocalMachine -s TrustedPeople

Event ID:      234
Description:
ILM Certificate could not be created: netsh http error:netsh http add urlacl url=http://+:5725/ user=Domain\spfarm sddl=D:(A;;GA;;;S-1-5-21-2972807998-902629894-2323022004-1104)

Event ID:      234
Description:
Cannot get the self issued certificate thumbprint:

The theory

Luckily this is one of those rare times where the error message actually makes sense (well – if you have worked with PKI stuff before). Clearly something went wrong in the creation of certificates. Looking at the sequence of events, it seems that as part of provisioning Forefront Identity Manager, a self-signed certificate is created for the computer account, added to the Trusted People certificate store and then used for SSL on a web application or web service listening on port 5725.

By the way, don’t go looking for the web app listening on such a port in IIS because it’s not there. Just like SQL Reporting Services, FIM likely uses very little of IIS and doesn’t need the overhead.

The way I ended up troubleshooting this issue was to take a good look at the first error in the sequence and what the command was trying to do. Note the description in the event log is important here. “ILM Certificate could not be created: Cert step 2 could not be created”. So this implies that this command is the second step in a sequence and there was a step 1 that must have worked. Below is the step 2 command that was attempted.

C:\Program Files\Microsoft Office Servers\14.0\Tools\MakeCert.exe -pe -sr LocalMachine -ss My -a sha1 -n CN="ForefrontIdentityManager" -sky exchange -pe -in "ForefrontIdentityManager" -ir localmachine -is root

When you create a certificate, it has to have a trusted issuer. Verisign and Thawte are examples and all browsers consider them trustworthy issuers. But we are not using 3rd party issuers here. Forefront uses a self-signed certificate. In other words, it trusts itself. We can infer that step 1 is the creation of this self-trusted certificate issuer by looking at the parameters of the MakeCert command that step 2 is using.

Now I am not going to annotate every MakeCert parameter here, but the English version of the command above says something like:

Make me a shiny new certificate for the local machine account and call it “ForefrontIdentityManager”, issued by a root certificate that can be found in the trusted root store also called ForefrontIdentityManager.

So this command implies that step 1 was the creation of that root certificate that issues the other certificates. (Product team – you could have given the root issuer certificate a different name from the issued certificate.)
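
For readers who do want the switches spelled out, here is a minimal Python sketch that splits the step 2 arguments into switch/value pairs and attaches a plain-English meaning to each. The meanings are paraphrased from the MakeCert documentation rather than quoted, and the helper name `annotate_makecert` is made up for illustration.

```python
import shlex

# Plain-English meanings of the MakeCert switches used in step 2
# (paraphrased; check the MakeCert documentation for exact wording).
FLAG_MEANINGS = {
    "-pe": "mark the private key as exportable",
    "-sr": "subject's certificate store location",
    "-ss": "subject's certificate store name",
    "-a": "signature algorithm",
    "-n": "subject's X.500 name",
    "-sky": "subject's key type",
    "-in": "issuer's certificate common name",
    "-ir": "issuer's certificate store location",
    "-is": "issuer's certificate store name",
}

# Switches that take no value.
SWITCHES_WITHOUT_VALUE = {"-pe"}

# The arguments from the failing step 2 command in the event log.
STEP2_ARGS = (
    '-pe -sr LocalMachine -ss My -a sha1 -n CN="ForefrontIdentityManager" '
    '-sky exchange -pe -in "ForefrontIdentityManager" -ir localmachine -is root'
)


def annotate_makecert(arg_string):
    """Return a list of (flag, value, meaning) tuples for a MakeCert argument string."""
    tokens = shlex.split(arg_string)
    result = []
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        if flag in SWITCHES_WITHOUT_VALUE:
            value = None
            i += 1
        else:
            value = tokens[i + 1]
            i += 2
        result.append((flag, value, FLAG_MEANINGS.get(flag, "unknown switch")))
    return result


if __name__ == "__main__":
    for flag, value, meaning in annotate_makecert(STEP2_ARGS):
        print(f"{flag:5} {value or '':28} {meaning}")
```

The three issuer switches at the end (`-in`, `-ir`, `-is`) are the ones that matter for this problem: they tell MakeCert to look up the issuer by name in a particular store.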

The root cause

Now that we have established a theory of what is going on, the next step is to run the failing MakeCert command from a prompt and see what we get back. Make sure you do this as the SharePoint farm account so you are comparing apples with apples.

C:\Program Files\Microsoft Office Servers\14.0\Tools>MakeCert.exe -pe -sr LocalMachine -ss My -a sha1 -n CN="ForefrontIdentityManager" -sky exchange -pe -in "ForefrontIdentityManager" -ir localmachine -is root

Error: There are more than one matching certificate in the issuer’s root cert store. Failed

Aha! So what do we have here? The error message states that we have more than one matching certificate in the issuer’s root certificate store.

For what it’s worth, it is the parameters “-ir localmachine -is root” that specify the certificate store to use. In this case, it is the trusted root certificate store on the local computer.

So let’s go and take a look. Run the Microsoft Management Console (MMC) and choose “Add/Remove Snap-in” from the File menu.

From the list of snap-ins choose Certificates and then choose “Computer Account”.

Now in the list of certificate stores, we need to examine the one that the command refers to: the Trusted Root Certification Authorities store. Well, look at that, the error was telling the truth: there is more than one ForefrontIdentityManager certificate in there!

Clearly the Forefront Identity Manager provisioning/unprovisioning code does not check for all circumstances. I can only theorise about what my client did to cause this situation because I wasn’t privy to what was done on this particular install before I got there, but step 1 of this provisioning process would create an issuing certificate whether one existed already or not. Step 2 then failed because it had no way to determine which of these certificates is the authoritative one.

This was further exacerbated by the fact that each re-attempt creates another root certificate, since there is no check for whether one already exists.
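
The failure mode can be modelled in a few lines of Python. This is a toy simulation, not the actual provisioning code: the store is just a list, and the function names `step1_create_root`, `step2_issue_cert` and `provision` are invented for illustration. It assumes, as theorised above, that step 1 always adds a root certificate and that step 2 refuses to continue unless exactly one issuer matches by name.

```python
import itertools

ISSUER_NAME = "ForefrontIdentityManager"
_serials = itertools.count(1)  # fake serial numbers so the certs are distinguishable


def step1_create_root(root_store):
    """Add a self-signed root certificate without checking for an existing one."""
    root_store.append({"subject": ISSUER_NAME, "serial": next(_serials)})


def step2_issue_cert(root_store):
    """Issue a certificate, but only if exactly one issuer matches by name."""
    matches = [c for c in root_store if c["subject"] == ISSUER_NAME]
    if not matches:
        raise LookupError("no matching certificate in the issuer's root cert store")
    if len(matches) > 1:
        raise LookupError("more than one matching certificate in the issuer's root cert store")
    return {"subject": ISSUER_NAME, "issuer_serial": matches[0]["serial"]}


def provision(root_store):
    """One provisioning attempt: step 1 followed by step 2."""
    step1_create_root(root_store)
    return step2_issue_cert(root_store)


if __name__ == "__main__":
    # A store left behind by an earlier, half-finished attempt.
    store = [{"subject": ISSUER_NAME, "serial": 0}]
    for attempt in range(1, 4):
        try:
            provision(store)
        except LookupError as err:
            print(f"attempt {attempt}: failed ({err}); root certs now: {len(store)}")
    # Clearing the stale roots lets the next attempt succeed.
    store.clear()
    print("after cleanup:", provision(store))
```

In this model every retry makes things worse, and only an empty starting store lets both steps complete.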

The cure is quite easy. Just delete all of the ForefrontIdentityManager certificates (and only those) from the Trusted Root Certification Authorities store and re-provision the user profile sync in SharePoint. Provided that there is no such certificate in this store to begin with, step 1 will create it and step 2 will then be able to create the self-signed certificate using this issuer just fine.
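
To make the scope of that clean-up explicit, here is a hedged Python sketch of the selection logic only. It works on a hypothetical inventory of (store, subject, thumbprint) records that you would have to export yourself; the `CertRecord` type, the `certs_to_delete` helper and the sample thumbprints are all invented, and nothing here touches the real Windows certificate store.

```python
from typing import NamedTuple


class CertRecord(NamedTuple):
    store: str       # e.g. "Root" for Trusted Root Certification Authorities
    subject: str     # certificate subject, e.g. "CN=ForefrontIdentityManager"
    thumbprint: str  # invented sample values below


TARGET_SUBJECT = "CN=ForefrontIdentityManager"


def certs_to_delete(inventory, store="Root"):
    """Pick only the ForefrontIdentityManager certificates in the given store."""
    return [
        rec for rec in inventory
        if rec.store.lower() == store.lower()
        and rec.subject.lower() == TARGET_SUBJECT.lower()
    ]


if __name__ == "__main__":
    inventory = [
        CertRecord("Root", "CN=ForefrontIdentityManager", "AA11"),
        CertRecord("Root", "CN=ForefrontIdentityManager", "BB22"),
        CertRecord("Root", "CN=Some Public Root CA", "CC33"),
        CertRecord("My", "CN=ForefrontIdentityManager", "DD44"),
    ]
    for rec in certs_to_delete(inventory):
        print("delete:", rec.store, rec.thumbprint)
    # Every other root certificate is left alone.
```

The point of filtering on both store and subject is that the other trusted root certificates on the machine must stay exactly where they are.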

Conclusion (and minor rant)

Many SharePoint pros have commented on the insane complexity of the new design of the user profile sync system. Yes, I understand the increased flexibility offered by the new regime, leveraging a mature product like Forefront, but I see that with all of this flexibility comes risk that has not been well accounted for. SP2010 is twice as tough to learn as SP2007 and you are more likely to make a mistake than not. The more components added, the more points of failure there are and the less capable over-burdened support staff are of dealing with a failure when it happens.

SharePoint 2010 is barely out of nappies and I have already been in a remediation role for several sites over the user profile service alone.

I propose that Microsoft add a new program-level KPI to rate how well they are doing with their SharePoint product development. That KPI should be something like the % of time a system administrator can provision a feature without making a mistake or resorting to running it all as admin. The benefit to Microsoft would be tangible in terms of support calls and failed implementations. Such a KPI would force the product team to look at an example like the user profile service and think “How can we make this more resilient?”, “How can we reduce the number of manual steps that are not obvious?”, “How can we make the wizard clearer to understand?” (yes they *will* use the wizard).

Right now it feels like the KPI was how many features could be crammed in, as well as how much integration between components there is. If there is indeed a KPI for that, they definitely nailed it this time around.

Don’t get me wrong – it’s all good stuff, but if Microsoft are stumping seasoned SharePoint pros with this stuff, then they definitely need to change the focus a bit in terms of what constitutes a good product.

Thanks for reading

Paul Culmsee

www.sevensigma.com.au

31 Comments on “More User Profile Sync issues in SP2010: Certificate Provisioning Fun”

  1. Thorough job, Paul, and something I’ll ask our team to look out for. I even missed your rants a bit and am pleased to see them back (alcohol references included)!

  2. Nice to see you blogging again mate, I know you’ve been busy! User Profiles are busting my ass on every single deployment we’ve done for different reasons…it’s been a “great journey” LOL It’s definitely the most unstable part of #SP2010 for sure.

  3. Hey Jeremy.. Wait till you hear about the Lotus Domino connector for search! yet to write that one as I never took screenshots the first time round…

  4. Sounds like your tech hat is well and truly on again…

    Nice to see your troubleshooting process explained too 🙂 I’m sure a few people can benefit from applying similar logic “what is the system trying to do; where are the points that it is falling over” to technical problems in the future.

    BTW Very well behaved rant!

  5. Trust me I’d rather my tech hat wasn’t on, but I think that with SP2010 that’s impossible. I’m noticing that Jeremy is almost an infrastructure guy now 🙂

  6. In your instructions you say “The cure is quite easy. Just delete all of the certificates from the Trusted Root Certification Authorities and re-provision the user profile sync in SharePoint. Provided that there is no certificate in this store to begin with, step 1 will create it and step 2 will then be able to create the self signed certificate using this issuer just fine.”

    By all, do you mean every certificate in there or just the ones created for Forefront? If it is just deleting the Forefront certs, I’ve done this repeatedly and I’m still not getting anywhere.

  7. No do not under any circumstances delete all of them. I bet that would cause all sorts of issues.

    If you have deleted the trusted root certificates, then run the offending command from the command line, logged in as the farm account and see if the error is any different. Paste the error here if it is different.

    regards

    Paul

  8. I have the same error as described in your post, and tried your solution of deleting the “ForefrontIdentityManager” certificate from the “Trusted Root Certification Authorities” store, then tried to run the failing “MakeCert” command again. I then got the following error: “Error: There is no matching certificate in the issuer’s root cert store Failed”. I also tried to start the service from Central Admin again, but it gets stuck on “Starting” in Central Admin, and “Provisioning” if I run Get-SPServiceInstance in PowerShell.
    Have I misunderstood what you meant and how to “… re-provision the user profile sync in SharePoint. Provided that there is no certificate in this store to begin with, step 1 will create it and step 2 will then be able to create the self-signed certificate using this issuer just fine.”?

    I have also read Henrik Andersson’s post without finding a solution there. Tried deleting the “ForefrontIdentityManager” certificate under “Personal” too, but that didn’t help. Before I deleted the certificate here, I was able to start the FIM service manually; afterwards I get this error when trying to start the FIM service manually: “System.ServiceModel: System.InvalidOperationException: Cannot find the X.509 certificate using the following search criteria: StoreName ‘My’, StoreLocation ‘LocalMachine’, FindType ‘FindByThumbprint’, FindValue ‘”

    I’m at a bit of a loss for where to go next. Any suggestions would be appreciated.

  9. Hi

    You never start the services manually. The error message you’re getting suggests that the root certificate is not being created at all. Is your farm account local admin?

    Your next bet is to follow Spence’s article very slowly and carefully.

    regards

    Paul

  10. With the June CU, the cause of the creation of the 4 certificates in the trusted root, and the 1 in the personal store is the FIMSynchronizationService, spawned from the ProfileSynchronizationSetupJob. If you delete the 5 certs, and do iisreset, reboot, whatever, the attempt to provision the UPS Service will still fail under certain circumstances; which I am in pursuit of now.

  11. UPDATE: According to MS Support, the event id 234 warnings in the event log, and their root cause, are not an issue. I never question what has worked though 🙂

    Plus, as of the June CU, maybe the issue was corrected. Great post though.

  12. Thanks a lot for the valuable information and this is one of the funniest posts I’ve ever read. I literally giggled 🙂 I am posting in my blog about the specific scenario I faced with UPSS and referencing your blog post.

  13. Hi Yousef

    Glad you found the post funny and informative. That’s pretty much my blogging KPI 🙂

    Looking forward to reading your specific issue

    regards

    Paul

  14. Damn! I deleted all my FIM trusted root certificates.

    I can’t seem to setup another UPS Application, SSS. Can someone help me out here.

  15. Wow I just spent a ton of time on this… your info led me down the right path but unfortunately didn’t quite resolve it. My UPS was stuck “starting” even after deleting the FIM certs that littered the cert store. It wasn’t until I started the “SharePoint Web Services Root” application pool in IIS that everything went through normally.

    I figured this out by parsing the ULS logs and noticed immediately after provisioning the UPS it was repeatedly making calls to the web services application (port 32843) which is stopped by default. I’m not sure if it’s supposed to automatically start this up when you provision the UPS but in my case, starting the app pool resolved the stuck on starting issue.

  16. FIM is one of the most frustrating points in SharePoint 2010. Just searching Google for how to configure profile synchronization in SharePoint 2010 returns so many tutorials, documents and rants that it shows what people think about the feature.

  17. ForefrontIdentityManager certificates are stored in 3 Locations:

    – Personal
    – Trusted Root Certification Authorities
    – Trusted People

    Thanks for the great post

  18. Took me a good day to figure this one out…

    My story:
    Had to change the AD account used as farm account. This is pretty simple using stsadm -o updatefarmcredentials (works on SP2010). (see: http://www.tsls.co.uk/index.php/2011/01/21/how-to-change-the-farm-service-account-in-sharepoint-2010/)

    Everything worked pretty good except the UPS (of course) which wouldn’t start (stuck at starting).

    What I did:
    To stop the UPS when stuck on starting: 2 lines of powershell fix the problem:

    Get-SPServiceInstance | Where-Object {$_.TypeName -like "User Pro*"}

    Stop-SPServiceInstance SERVICE_INSTANCE_GUID

    Next I diagnosed the problem using ULS Viewer filtering on category User (see: http://www.harbar.net/articles/sp2010ups2.aspx#ups0)

    My problem was it was always getting stuck on ILM Configuration: Configuring certificate. I followed this post: (see: http://www.cleverworkarounds.com/2010/08/15/more-user-profile-sync-in-sp2010-certificate-provisioning-issues/)
    but it wouldn’t resolve the issue.

    The trouble is that you need to delete some other certificates that aren’t shown if you open the certificate manager with your own user credentials. I was not opening the certificate manager for the computer account (I was opening it for my user account), so I was not seeing anything under Personal, where there is a certificate that also needs to be deleted. That’s why it was getting stuck at ILM: Configuring Certificate.

    After I fixed that it was pretty straightforward.

    Robert – nitront.com

  19. Hi Paul,
    I had exactly the same issues, but with 2013 on premise. Can these things not be fixed? I followed your workarounds, and it still wouldn’t work.
    In the end, I made sure that ‘Automatically detect network settings’ was off, and that there was no mention of proxies in IE on the app server. And it worked – yay!
    Thanks for your post,
    Glenn

  20. Hey Pauly. Fancy coming across this post 4 and a half years later when looking for a solution to the same issues with SP2013 after an SP1 upgrade. 😉

    Keep the CleverWorkarounds coming, my friend.

  21. Great post! Got me on the right track for SP 2013 UPS also. I found that deleting the certs from Trusted Root Certification Authorities and Personal, then copying the cert left for Forefront from Trusted People back to the above-mentioned stores, fixed it also.

  22. Hi There, I know I’m late in the day but does anyone know what the command is for step 1? I can’t run the step 2 command because step 1 never ran. Soooo close!!

  23. When starting the User Profile Service in Central Administration, the service starts and then stops immediately. Inspection of the SharePoint ULS indicates that the failure to start is a result of the following:
    “UserProfileApplication.SynchronizeMIIS: Failed to configure ILM, will attempt to rerun.
    Exception: System.Security.SecurityException: The encryption type requested is not supported by the KDC.”
    https://learn.microsoft.com/en-us/sharepoint/troubleshoot/security/configuration-to-support-kerberos-aes-encryption
