Friday, December 6, 2013

SChannel Errors on Lync Server Preventing Client Logon

I was at a client site setting up a brand-spanking-new Lync 2013 deployment on Windows Server 2012, with two pools in two datacenters. The server deployment went without a hitch and we got everything up and running in no time flat. However, we could not sign in to either pool with a Lync 2013 client. The client just complained that it couldn't log on.

Looking at the server event logs, we saw numerous SChannel errors as below:
Event ID: 36874 - TLS 1.2 connection request was received from a remote client application, but none of the cipher suites supported by the client application are supported by the server. The SSL connection request has failed.
Event ID: 36888 - A fatal alert was generated and sent to the remote endpoint. This may result in termination of the connection. The TLS protocol defined fatal error code is 40. The Windows SChannel error state is 1205.
Looking around for solutions on the web, I came across these two apparent gems:
http://social.technet.microsoft.com/Forums/lync/en-US/41718327-203f-445f-8657-87b0a8545ead/lync-2013-client-signin-issue-with-lync-2013-server?forum=lyncprofile (Look towards the bottom for the answer)
and
http://www.logicspot.net/index.php?id=50

If you don't feel like reading the aforementioned links, the answer was to use Regedit to disable TLS 1.2 on the Lync front-ends. This was the solution provided by MS Support. Sure enough, doing that fixed the problem, but as noted in the links above, this broke Windows Update.  To get Windows Update to work, you would have to remove the registry entry, restart the server, run Windows Update, re-add the registry entry and reboot the server once more.
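
If you inherit a front-end and want to know whether somebody already applied this workaround, a read-only check along these lines might help. It is only a sketch: the post above never spells out the exact registry entry, so the key path here is my assumption based on the standard Schannel protocol settings location, and the function names are made up for illustration.

```python
# Assumed location of the Schannel TLS 1.2 server-side override (not quoted
# from the linked posts, which only say "disable TLS 1.2 in the registry").
SCHANNEL_TLS12_SERVER = (
    r"SYSTEM\CurrentControlSet\Control\SecurityProviders"
    r"\SCHANNEL\Protocols\TLS 1.2\Server"
)


def read_hklm_dword(path, name):
    """Return a value from HKEY_LOCAL_MACHINE, or None if the key or value is absent."""
    import winreg  # Windows-only module, so import it lazily

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
            value, _value_type = winreg.QueryValueEx(key, name)
            return value
    except FileNotFoundError:
        return None


def tls12_workaround_present(enabled):
    """True only when an explicit Enabled=0 override was found."""
    return enabled == 0


if __name__ == "__main__":
    enabled = read_hklm_dword(SCHANNEL_TLS12_SERVER, "Enabled")
    if tls12_workaround_present(enabled):
        print("TLS 1.2 is explicitly disabled for the server role")
    else:
        print("No Enabled=0 override found for TLS 1.2 (server role)")
```

The script only reads the registry; it does not change anything, and it has to run on the Windows server itself.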

Since this was a brand-new Lync deployment on brand-new Windows Server 2012 servers, I had a hard time believing this was the only fix for the problem. And because the problem affected two independent pools, I figured they must share some common feature that was causing it. After much flailing about, I turned my attention to the recently installed Windows Certificate Authority. Another consultant had set up a CA for the company in preparation for Lync.

Comparing against known good installations, we noticed the signature hash algorithm used for the root certificate was SHA512, but other working deployments used SHA256 or lower. We reissued the root certificate using SHA256, and installed new certificates on the Lync front-ends using this hash algorithm. After a server restart, clients were able to log on successfully, and the SChannel errors went away.
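
If you want to make the same comparison without clicking through certificate dialogs, something like the following could report the signature algorithm of an exported certificate. Treat it as a rough sketch, not a proper X.509 parser: it walks just enough of the DER structure to reach the certificate's outer signatureAlgorithm field, it only knows the handful of RSA OIDs listed, and the function names and the file name in the usage comment are hypothetical.

```python
import ssl

# DER-encoded OID bytes for common RSA signature algorithms.
SIGNATURE_OIDS = {
    bytes.fromhex("2a864886f70d010105"): "sha1WithRSAEncryption",
    bytes.fromhex("2a864886f70d01010b"): "sha256WithRSAEncryption",
    bytes.fromhex("2a864886f70d01010c"): "sha384WithRSAEncryption",
    bytes.fromhex("2a864886f70d01010d"): "sha512WithRSAEncryption",
}


def read_tlv(data, offset):
    """Return (tag, value_start, value_end) for the DER element at offset."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        count = length & 0x7F
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    return tag, offset, offset + length


def signature_algorithm(der):
    """Name the algorithm in a DER certificate's outer signatureAlgorithm field."""
    _, cert_start, _ = read_tlv(der, 0)           # Certificate SEQUENCE
    _, _, tbs_end = read_tlv(der, cert_start)     # skip over tbsCertificate
    _, alg_start, _ = read_tlv(der, tbs_end)      # signatureAlgorithm SEQUENCE
    _, oid_start, oid_end = read_tlv(der, alg_start)
    oid = der[oid_start:oid_end]
    return SIGNATURE_OIDS.get(oid, "unknown OID bytes " + oid.hex())


def signature_algorithm_from_pem(pem_text):
    """Same check for a Base-64 (PEM) export."""
    return signature_algorithm(ssl.PEM_cert_to_DER_cert(pem_text))


# Example with a hypothetical DER export of the root certificate:
#     with open("root-ca.cer", "rb") as handle:
#         print(signature_algorithm(handle.read()))
```

Running it against the root certificate and each front-end certificate would show at a glance whether any of them still uses SHA512.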

I'm not a cryptography expert, so I'm not exactly sure why SHA512 caused issues with TLS 1.2. Poking around the Internet gave me the impression that SHA512 and TLS 1.2 just don't work together (but damned if I can find where I saw that again).

Regardless, this just goes to show that even when a workaround provided by Microsoft themselves solves an issue, that doesn't necessarily mean it's the right way to fix it.