Friday, October 21, 2011

Lync Loses Connection Every 8min 28sec

If you enjoy CSI-like detective stories (and by CSI, I mean cheesy quips as he puts on his sunglasses, NOT this) centered around isoteric computer issues involving Lync, then you'll love this post.  For the other 6.99999 billion people in the world, move along....nothing to see here.

Late last week, my Lync client on my main work desktop started acting oddly.  Every so often, it would spontaneously lose its connection, log out, and re-login again - all within about 5-10 seconds.  At first, I suspected a network problem.  Running a steady ping against the Lync edge server showed a constant connection.  I looked at the Lync servers for any potential issue, thinking it was something that could be affecting everybody, but all seemed well.  It was late in the day on Friday, so I dealt with it like I deal with all my problems.....ignore it and hope it just goes away.

Monday morning - same thing.  It was starting to get annoying.  I'd be in the middle of an IM conversation, and BAM, log out and back in again.  Audio calls were unaffected, except for the message about limited connectivity popping up.  Weirdly enough, when I logged on via my laptop, the issue never came up, which lead me to believe it was a problem with my desktop computer.  Even more weird was that if I logged onto another Lync account on the same pool via my desktop, the problem never came up.

The first thing I tried was to restart my computer.  Didn't fix it.  Then I exited Lync and deleted all my settings in my %userprofile%\AppData\Local\Microsoft\Communicator folder.  No change in behaviour.

There were no relevant messages in the Event Log, even though Event Log logging was turned on.  I decided to turn on detailed logging via Options - General - Turn on logging in Lync.  You should only turn on this option when troubleshooting an issue, because it can take up a lot of disk space.  Logs are stored in the %userprofile%\Tracing folder.  The Communicator-uccapi-0.uccapilog is the log file to look at.  This file will grow to about 50 MB before it rolls over into Communicator-uccapi-1.uccapilog and so on.  It's Notepad-friendly. 

Digging through the log, I focussed on the time where the issue arose.  I started to notice a pattern.  My client would lose its connection EXACTLY every 8 minutes 28 seconds.  I racked my mind to think of what would be happening every 8:28, but nothing came to mind.  I ripped out all the log data surrounding that time to try to find a pattern.  This is the sequence that repeated every 8:28:

10/18/2011|12:23:30.119 1674:2350 INFO  :: UCCP:ClientAllowedAuthProts0x10004
10/18/2011|12:23:30.120 1674:2350 INFO  :: SIP_REGISTER::RefreshRegistration force(1),refreshSA(1),refreshRoute(0),state(2)
10/18/2011|12:23:30.120 1674:2350 INFO  :: SIP_REGISTER:State (2) => (4)
10/18/2011|12:23:30.120 1674:2350 INFO  :: SA(b0b8c38) Dropped
10/18/2011|12:23:30.120 1674:2350 INFO  :: Out trxn corr-id (0CE823A8)
10/18/2011|12:23:30.127 1674:2350 INFO  :: SignMsg:NoSA for request(ce823a8)
10/18/2011|12:23:30.127 1674:2350 INFO  :: SignMsg: send request without signature for trans(ce823a8)
10/18/2011|12:23:30.127 1674:2350 INFO  :: Trxn corr-id (0CE823A8), SIP msg corr-id (7aa8ef7f)
10/18/2011|12:23:30.127 1674:2350 INFO  :: Sending Packet - 209.xx.xxx.xxx:443 (From Local Address: 10.0.0.2:55072) 793 bytes:
10/18/2011|12:23:30.127 1674:2350 INFO  :: REGISTER sip:contoso.com SIP/2.0
Via: SIP/2.0/TLS 10.0.0.2:55072

The client would then try re-registering itself multiple times (about 10) for about 3-5 seconds, which would generate a SIP/2.0 401 Unauthorized from the server.  Then, it would just log back in as if nothing ever happened. 

It looked to me like the line that had SIP_REGISTER::RefreshRegistration force(1),refreshSA(1),refreshRoute(0),state(2) was the likely culprit, but I still had no idea why it was happening.  I ran log captures on working machines and never saw that line appear anywhere else.  Internet searches came up blank.  I even ran traces from the front-end Lync server, which only showed me the multiple 401 Unauthorized messages it was returning to the client.  No indication why it was not allowing me in.

I tried uninstalling and reinstalling Lync.  Nope.  I tried uninstalling and searching and deleting any trace of Lync in folders and the registry.  I deleted any cached credentials in Windows via Credential Manager.  Nope.  In frustration, I even deleted my Lync account and re-created it.  Nope.....Chuck Testa.

Then I tried creating a new local account on my desktop and logged into Lync.  Great success!!!  The problem was gone.  However, I wasn't all that thrilled because I didn't want to go through the pain of moving all my profile data to a new account.  Plus, it felt like giving up....kind of like when you re-install the OS to deal with an issue.

So, I reasoned that there must be something about my profile that was causing the issue.  So, I logged back into my main account, deleted all the temp files in %userprofile%\TEMP, cleaned out my browser history and cache, and restarted.  Amazingly, the problem was gone.  It's been 3 days and my client has been working normally.

I'm not sure if it was cleaning out the TEMP folder or the IE cache that fixed the problem.  Ideally, I should have tried one or the other first, but I was getting tired of troubleshooting by this point.  If anybody has any insight as to what was really going on, please drop me a line.

11 comments:

  1. Mras token valid for 480 minutes, trying to re-register which fails (get the disconnect)and after 10 times it succeeds (gets connected). You should see only 3x 401 errors. More attempts indicate that something might be wrong. Probably some cache files. This is what I think, not 100% sure.

    ReplyDelete
    Replies
    1. For Sorin: I was tracing my registsration proccess in Lync and observed that it gets consistently 4 register operations, the first 3 return 401 error, then the 4th one works. Do you mean that ist the expected behaviour? Do you know were can I find a description of the "correct" registering proccess?

      Delete
  2. http://www.microsoft.com/en-us/download/details.aspx?id=34950

    ReplyDelete
  3. Any other work around other than clearing TEMP folder and IE Cache, besides rebuilding windows profile?

    ReplyDelete
    Replies
    1. Hey Carla,
      I would try clearing the IE cache first. If that doesn't work, then try the TEMP folder. It was one or the other that fixed the issue for me. Rebuilding the Windows profile wasn't necessary.

      Ken

      Delete
  4. Is this because of Kaspersky anti virus?

    ReplyDelete
  5. Did you ever find that root cause?
    Are you on an On-Prem Exchange server or is your serer in the cloud?

    ReplyDelete
    Replies
    1. No, I didn't...outside of something corrupt in either my temp or IE cache. We were running on-prem Exchange at the time.

      Ken

      Delete
  6. Hi Ken, I have a user who is reporting logging out automatically and sign back in after some time. This happens only when he is on WiFi. When he is wired all looks good - didnt attempt any clear cache delete profile etc yet, i thought will find out the root cause. Had any luck finding what was happening here ?
    I saw the below errors in the uccapilog

    SIP_MSG_PROCESSOR::ThrottleRequestForConnection m_pRequestConnection->ThrottleRequest() failed for 80ee0067

    ReplyDelete
  7. Thank you Ken for sharing how you resolved this issue. I only had to clear the content of %temp% folder. Thanks.

    ReplyDelete