Friday, October 21, 2011

Lync Loses Connection Every 8min 28sec

If you enjoy CSI-like detective stories (and by CSI, I mean cheesy quips as he puts on his sunglasses, NOT this) centered around isoteric computer issues involving Lync, then you'll love this post.  For the other 6.99999 billion people in the world, move along....nothing to see here.

Late last week, my Lync client on my main work desktop started acting oddly.  Every so often, it would spontaneously lose its connection, log out, and re-login again - all within about 5-10 seconds.  At first, I suspected a network problem.  Running a steady ping against the Lync edge server showed a constant connection.  I looked at the Lync servers for any potential issue, thinking it was something that could be affecting everybody, but all seemed well.  It was late in the day on Friday, so I dealt with it like I deal with all my problems.....ignore it and hope it just goes away.

Monday morning - same thing.  It was starting to get annoying.  I'd be in the middle of an IM conversation, and BAM, log out and back in again.  Audio calls were unaffected, except for the message about limited connectivity popping up.  Weirdly enough, when I logged on via my laptop, the issue never came up, which lead me to believe it was a problem with my desktop computer.  Even more weird was that if I logged onto another Lync account on the same pool via my desktop, the problem never came up.

The first thing I tried was to restart my computer.  Didn't fix it.  Then I exited Lync and deleted all my settings in my %userprofile%\AppData\Local\Microsoft\Communicator folder.  No change in behaviour.

There were no relevant messages in the Event Log, even though Event Log logging was turned on.  I decided to turn on detailed logging via Options - General - Turn on logging in Lync.  You should only turn on this option when troubleshooting an issue, because it can take up a lot of disk space.  Logs are stored in the %userprofile%\Tracing folder.  The Communicator-uccapi-0.uccapilog is the log file to look at.  This file will grow to about 50 MB before it rolls over into Communicator-uccapi-1.uccapilog and so on.  It's Notepad-friendly. 

Digging through the log, I focussed on the time where the issue arose.  I started to notice a pattern.  My client would lose its connection EXACTLY every 8 minutes 28 seconds.  I racked my mind to think of what would be happening every 8:28, but nothing came to mind.  I ripped out all the log data surrounding that time to try to find a pattern.  This is the sequence that repeated every 8:28:

10/18/2011|12:23:30.119 1674:2350 INFO  :: UCCP:ClientAllowedAuthProts0x10004
10/18/2011|12:23:30.120 1674:2350 INFO  :: SIP_REGISTER::RefreshRegistration force(1),refreshSA(1),refreshRoute(0),state(2)
10/18/2011|12:23:30.120 1674:2350 INFO  :: SIP_REGISTER:State (2) => (4)
10/18/2011|12:23:30.120 1674:2350 INFO  :: SA(b0b8c38) Dropped
10/18/2011|12:23:30.120 1674:2350 INFO  :: Out trxn corr-id (0CE823A8)
10/18/2011|12:23:30.127 1674:2350 INFO  :: SignMsg:NoSA for request(ce823a8)
10/18/2011|12:23:30.127 1674:2350 INFO  :: SignMsg: send request without signature for trans(ce823a8)
10/18/2011|12:23:30.127 1674:2350 INFO  :: Trxn corr-id (0CE823A8), SIP msg corr-id (7aa8ef7f)
10/18/2011|12:23:30.127 1674:2350 INFO  :: Sending Packet - 209.xx.xxx.xxx:443 (From Local Address: 10.0.0.2:55072) 793 bytes:
10/18/2011|12:23:30.127 1674:2350 INFO  :: REGISTER sip:contoso.com SIP/2.0
Via: SIP/2.0/TLS 10.0.0.2:55072

The client would then try re-registering itself multiple times (about 10) for about 3-5 seconds, which would generate a SIP/2.0 401 Unauthorized from the server.  Then, it would just log back in as if nothing ever happened. 

It looked to me like the line that had SIP_REGISTER::RefreshRegistration force(1),refreshSA(1),refreshRoute(0),state(2) was the likely culprit, but I still had no idea why it was happening.  I ran log captures on working machines and never saw that line appear anywhere else.  Internet searches came up blank.  I even ran traces from the front-end Lync server, which only showed me the multiple 401 Unauthorized messages it was returning to the client.  No indication why it was not allowing me in.

I tried uninstalling and reinstalling Lync.  Nope.  I tried uninstalling and searching and deleting any trace of Lync in folders and the registry.  I deleted any cached credentials in Windows via Credential Manager.  Nope.  In frustration, I even deleted my Lync account and re-created it.  Nope.....Chuck Testa.

Then I tried creating a new local account on my desktop and logged into Lync.  Great success!!!  The problem was gone.  However, I wasn't all that thrilled because I didn't want to go through the pain of moving all my profile data to a new account.  Plus, it felt like giving up....kind of like when you re-install the OS to deal with an issue.

So, I reasoned that there must be something about my profile that was causing the issue.  So, I logged back into my main account, deleted all the temp files in %userprofile%\TEMP, cleaned out my browser history and cache, and restarted.  Amazingly, the problem was gone.  It's been 3 days and my client has been working normally.

I'm not sure if it was cleaning out the TEMP folder or the IE cache that fixed the problem.  Ideally, I should have tried one or the other first, but I was getting tired of troubleshooting by this point.  If anybody has any insight as to what was really going on, please drop me a line.