Thursday, February 21, 2019

Can You Install an SDN Listener in a SfB CCE Environment?

As part of my job at Nectar Corp, I'm always looking at ways to get call detail information from various Microsoft-based UC telephony components. It's easy to get a wealth of detail from an on-prem deployment of Skype for Business, using either direct access to the SfB monitoring databases (for historical info) or via the Skype for Business SDN dialog listeners (for realtime call data).  As mentioned in my recent blog post, the same cannot be said for Office 365 UC workloads such as SfB Online and Teams.

Nectar has a tool that relies on knowing when a call starts and stops in realtime so we can gather information from various network components as the call is happening. Nectar uses the SfB SDN interface for this realtime info. Since both SfBO and Teams lack ANY sort of call data API, we have to resort to other methods to learn about realtime calls.

A Nectar customer has deployed several Skype for Business Cloud Connector Edition (CCE) instances to provide on-prem PSTN call routing for SfB Online users. CCE is the precursor of Teams Direct Routing, and it does essentially the same thing but in a WAAAAAAY more complex manner. It consists of several pre-packaged VMs running a vastly slimmed down Skype for Business environment.

We were in discussions with the customer to see if there was any way we could get realtime call info from the CCE instances, by installing the Microsoft SDN Dialog Listener on any of the CCE VMs. Nobody seemed to know for certain, so I elected to setup a CCE instance of my own to run some tests. The question I was trying to answer was:

Can you get call data out of a CCE instance using the MS SDN Listener service??? 

TL;DR:  No.

I had built myself a fancy new work PC at home that I figured would be ideal for this test. I put in 32GB of RAM so I could spin up VMs as needed for my tests, and this would be an excellent test OR SO I THOUGHT!

My first hurdle was that I needed a virtual machine host that would host the VMs required for CCE. In other words, I needed to run VMs from within a VM.
Just like the movie, except much more boring.
CCE only supports Windows 2012 R2 as the VM host OS, so I started with that as my VM host for my VMs, but it blocked me from installing the Hyper-V components, since it was already a VM. A bit of research showed that I could use Windows Server 2019, which had no such limitation. I spun up my Hyper-V host VM, gave it 24 GB of RAM, all the processor cores I had, and off I went.

...OR SO I THOUGHT (AGAIN). I ran into issues right off the bat when installing the CCE components, where most of the CCE PowerShell commands were available. Thanks to a workaround from Shawn Harry, I was soon off to the races and installing CCE.

...BUT NO! FOILED AGAIN!!! There are a number of steps to deploy a CCE instance, most of which require running various CCE PowerShell commands. One of these commands performed a hardware check to make sure the Hyper-V host had enough resources to run the required VMs. I was rudely informed that my super-awesome, brand-new workstation PC did not have enough memory, processor cores or disk space to run CCE.

Keanu is sad, and so am I
Time to implement my hacker skilz.

The big red, failure message that PowerShell spit out at me helpfully included the path to the script that did the hardware check:

C:\Program Files\WindowsPowerShell\Modules\CloudConnector\Internal\MtCommon.ps1

A quick review of that script, and a few edits to the $logicalProcessorsCount, TotalPhysicalMemory, FreePhysicalMemory, and FreeDiskSize variables and the script was convinced that I had a super-server with 999999999999 MB of memory, diskspace and processor cores.

With a few more tweaks here and there, I was finally able to get a full Skype for Business Cloud Connector Edition instance running in my modest home lab. Since the CCE environment consists of several standard Windows VMs, I was able to login to them and attempt to install the SDN Listener on what I hoped would be possible candidates: the CMS or the Mediation VM.

Neither of those were running as a full front-end server, but I still held out hope. The CMS VM was running the Centralized Logging Service Agent, File Transfer Agent and Master/Replica Replicator Agents. The Mediation VM was running the Centralized Logging Service Agent, Mediation service and Replica Replicator Agents.

I attempted to install the SDN Listener on both those VMs and got the following message:

And there you go. The answer to a question that nobody else was looking for. No, you can't run SDN on any CCE VM role.

Hope you're happy @ucomsgeek.

Wednesday, February 20, 2019

The Current State of Microsoft Cloud-Based Call Analytics

Microsoft Teams continues to evolve at a rapid pace, and nearly every company who currently runs on-prem Microsoft Skype for Business deployments are investigating the feasibility of moving their UC infrastructure to the cloud.

While moving UC workloads to the cloud eliminates the need for most (if not all) customer-owned server hardware, UC media still must traverse the customer network. Since the network is one of the major causes of poor call quality, customers still require advanced troubleshooting capabilities.
Microsoft does provide some basic call analytics in Office 365, but there is much room for improvement. As such, many 3rd party vendors (including Nectar, the company I work for) are keen to fill the gaps. Unfortunately, there is a lot of confusion and misdirection in the marketplace about what is possible with 3rd party call analytics in Office 365 UC platforms.

Skype for Business Online

Even though Skype for Business Online is being deprecated in favour of Microsoft Teams, there are still many customers who have not made the transition. As mentioned previously, Microsoft does provide some limited call analytics and reporting capabilities for both SfBO and Teams at no extra cost, but the available tools only show a subset of the available information, presented in limited ways. A dedicated person can customize the tools to meet some of their needs, but this takes time, effort and deep knowledge.

This is the ideal place for other companies to provide additional value. Unfortunately, the only way to access detailed call analytics is to use the Get-CsUserSession PowerShell command (included as part of the Skype forBusiness Online PS Module).

The output of the Get-CsUserSession command provides detailed CDR and quality information for a specified user over a specified timeframe in JSON format. Much of the detail is not available in the existing O365 dashboards. At first blush, this appears to be a fantastic way to import detailed call data into 3rd party call analytics platforms, but there are some significant limitations:
  • The Get-CsUserSession command only returns data for a single user
  • No built-in delta mechanism, so no way to return only new/updated information since the last time the command was run
  • Data is only available for 30 days

Theoretically, a tool could constantly run the Get-CsUserSession command for every user in the enterprise, but this is not an approach that scales to the enterprise-level.  Nectar looked at this but discarded it after early tests showed inherent scalability issues.

Another approach would be to provide a GUI front-end that simply runs the Get-CsUserSession in the background on demand for a specified user. This avoids any scalability issues and can provide some useful information on a given user’s call quality. However, this approach doesn’t allow for any enterprise-level reporting or trending data, since data for all users is never downloaded.

Microsoft Teams

As mentioned in the previous section, customers have access to a limited portfolio of Teams call analytics tools through their Office 365 subscription. There is room for improvement and vendors are anxious to show value in this area.

At the time of writing, there is absolutely no way to access detailed call analytics about Microsoft Teams calls or conferences. The Get-CsUserSession command discussed in the previous section only returns call data for SfBO calls.  Some Teams PSTN calls may show up in Get-CsUserSession results, but that’s because it appears PSTN calls are routed through SfBO infrastructure. Detailed call data on P2P calls or conferences are not available.

Some vendors provide some great analytics about Teams usage (channel members, activity etc). These analytics are provided through the Office365 Graph API and does not provide any detailed call analytics. Through misunderstanding or miscommunication, some customers end up confusing this point and think they can get the same level of call quality troubleshooting as they currently can from on-prem Skype for Business deployments.

The Future of Microsoft’s Cloud-Based Analytics

Microsoft has not shared any information about when or if they will release an API that will allow vendors to incorporate Teams call data into their existing call analytics platforms. Rest assured that Nectar is keeping a close eye on the situation and will be able to incorporate Office365 call analytics into our call analytics platform as soon as something is available.

In the meantime, if a vendor is promising they can do advanced analytics for Microsoft’s cloud platform, think of this and ask the right questions:
  • Exactly what kind of call data can the tool retrieve? (MOS scores, jitter, packet loss etc)
  • How does the tool get data from Office365?
  • Does the tool provide detailed call quality reports for the entire enterprise?

If you get appropriate answers, then you can make a more informed decision on purchasing a call analytics platform for your cloud-based UC platforms.

This article was also posted on Nectar's website and Telecom Reseller

Monday, October 1, 2018

Run PowerShell Core in Docker on Raspberry Pi

I've recently started playing around with the latest Raspberry Pi 3 B+ along with a PoE HAT, which is an amazing little piece of kit. I've been trying to offload a bunch of workloads that are currently being served by an aging Windows 10 PC whose primary role is a media PC. This very same PC used to run Windows Media Center on Windows XP, before I turned to XBMC/Kodi and then Plex. It's well over a decade old, but still going strong. Although, now that I think of it, the only thing that's probably still original is the case, since I've replaced the motherboard, videocard, disk drives, and power supply over the years.

Since I've worked with Docker in the past (most recently with a failed attempt to move the Skype Optimizer to a Docker container), I thought I'd try running Docker on the Raspberry Pi.

My home networking setup in the basement furnace room. My Raspberry Pi is circled in red.
I've been pretty successful in getting things working efficiently on Docker.  At the time of writing this post, I've got the following Docker containers running on my little Pi:
  • Traefik - an easy-to-use reverse proxy solution so that I can access the different container UIs by using https://containername.skypeoptimizer.com, instead of http://192.168.1.x:somerandomport. It also simplifies certificate management, since I only have to do it once in Traefik with a wildcard cert, rather than on every container.
  • Portainer - a GUI for container management, so I don't have to remember specific commands
  • Pihole - a DNS blackhole for ads. Blocks ads for all devices on my internal network. Tidbit: its currently blocking almost 50% of my total DNS queries!
  • Radarr/Sonarr/Jackett - *cough cough* media management
  • Unifi Controller - manages my amazing-how-did-I-live-before-this wifi infrastructure
  • NOIP - publishes my ever-changing public IP address to Noip.com.
  • PowerShellPi - Runs PowerShell scripts
It was the last one that took me the most work to get going. While there are several published PowerShell Docker containers for Linux, there didn't seem to be any that would work with the ARM processors inside the Raspberry Pi. 

After much trial-and-error (story of my life), I finally managed to build a working Docker image that works with the Raspberry Pi. I used some of the bits from this TechNet blog to get the right commands, but had to make some modifications for it to work on the latest Raspberry Pi version. 

I'm using this base image to build other images that run simple scripts on a schedule. For example, I've got one that checks my public IP, and updates an Azure DNS record when it changes. It does the same thing as the NOIP one, but its something that's fully under my control.

For those who are interested, the base image is available on https://hub.docker.com under kenlasko/powershellpi. The DockerFile used to build this image is shown below (since I haven't figured out how to display it on Docker Hub yet).

# PowerShell on Raspberry Pi
#
# Version 1.0

FROM raspbian/stretch

LABEL maintainer="Ken.Lasko@gmail.com"

RUN sudo apt-get update
# Install libraries necessary for PowerShell
RUN sudo apt-get install --no-install-recommends -y libunwind8 libicu57 libcurl4-openssl-dev cron
# Get the latest released PS module
RUN wget https://github.com/PowerShell/PowerShell/releases/download/v6.0.4/powershell-6.0.4-linux-arm32.tar.gz
# Make PowerShell directory and install PS module
RUN mkdir ~/powershell && tar -xvf ./powershell-6.0.4-linux-arm32.tar.gz -C ~/powershell && sudo ln -s ~/powershell/pwsh /usr/bin/pwsh && sudo ln -s ~/powershell/pwsh /usr/local/bin/powershell
# Remove install files after completion
RUN sudo rm -rf /var/lib/apt/lists && rm powershell-6.0.4-linux-arm32.tar.gz
# Install PS modules (Azure auth and DNS modules shown as examples)
#RUN ~/powershell/pwsh -Command Install-Module -Name AzureRM.Profile.Netcore -Force && ~/powershell/pwsh -Command Install-Module -Name AzureRM.Dns.Netcore -Force


My next move is to get a few more Raspberry Pis to create a Docker Swarm to automatically load-balance my containers, just like a wee little datacenter!

This is for you, Pat!

Tuesday, September 18, 2018

Skype Optimizer JSON Interface Now Available

For people who like to have more control over how dial rules are implemented but want to leverage my vast database of worldwide dial rules, I have implemented a JSON interface for the Skype Optimizer.

The interface will return all the dial rules for a given country in JSON format, which can then be used in your own scripts or application. 

To access the API, you simply construct the URL to correspond to the correct representation for the desired country. The base URL for the JSON interface starts with:

https://www.ucdialplans.com/queryapi/

You then add specific options as query strings. For NANPA countries like US, Canada and many Caribbean countries, use the following:
Query String
Data Value
Required?
countrycode
1
Yes
npa
3-digit integer
Yes
nxx
3-digit integer
Yes
natscope
national/uscan/all
No (defaults to national)
sevendigit
yes/no;1/0;true/false
No (defaults to no)
apikey
16-character alphanumeric
Yes

For an example location in Chicago, where we want to treat both US and Canada numbers as national numbers, use the following:
https://www.ucdialplans.com/queryapi/?countrycode=1&npa=312&nxx=226&natscope=uscan&apikey=yourapikeyhere

For any other country, the available query strings are as follows:
Query String
Data Value
Required?
countrycode
1-3 digit integer value
Yes
areacode
1-6 digit integer value
Most countries
carriercode
1-3 digit integer valueSome countries
apikey
16-character alphanumeric
Yes

For example, to return the dial rules for London, UK use the following URL:
https://www.ucdialplans.com/queryapi/?countrycode=44&areacode=20&apikey=yourapikeyhere
The results will look something like this, when you parse it through a JSON formatter:

One way to use this data is in your own personal PowerShell script. Below is an example that will pull down the dial rules for 312-226 in Chicago:
$DialPlan = "https://www.ucdialplans.com/queryapi/?countrycode=1&npa=312&nxx=226&apikey=abc1234567890xyz"
$JSON = Invoke-RestMethod -Method Get -Uri $DialPlan
Once you've got the JSON data, you can return specific elements simply by "following the path", as it were:
PS C:\> $JSON.data.routes

local         : {@{pattern=^\+1((872([^01]\d\d))|(847([^01]\d\d))...
tollfree      : @{pattern=^\+18(00|8\d|77|66|55|44|33|22)\d{7}$}premium       : @{pattern=^\+1(900|976)[2-9]\d{6}$}national      : @{pattern=^\+1(?!(24[26]|26[48]|284|345|441|473|649|664|721|[67]58|767|784|8[024]9|86[89]|876|900|976))                [2-9]\d\d[2-9]\d{6}$}international : @{pattern=^\+((1(?!(900|976))[2-9]\d\d[2-9]\d{6})|([2-9]\d{6,14}))}service       : @{pattern=^\+?([2-9]11)$}

You can get as specific as you want:
PS C:\> $JSON.data.normrules.national.pattern^1?([2-9]\d\d[2-9]\d{6})\d*(\D+\d+)?$PS C:\> $JSON.data.routes.international.pattern^\+((1(?!(900|976))[2-9]\d\d[2-9]\d{6})|([2-9]\d{6,14}))
For elements that could have multiple values, such as local routes, you have to write it a little bit differently (note the square brackets):
PS C:\> $JSON.data.routes.local[0].pattern^\+1((872([^01]\d\d))|(847([^01]\d\d))|(815(20[^189]|21[^1378]|22[01348]|23[067]|24[125]|25[02458]|26[^2469]|27[^035]|28[07]|29[0345]|30[^39]|31[03478]|32[^39]|33[^2569]|34[^0]|35[^089]|36[1346]|37[^5689]|38[23568]|40[^026]|41[^149]|42[^089....
You must also include a valid API key. For testing purposes, I've created a demo API key that anybody can use.
abc1234567890xyz
The output will be obfuscated to deter abuse, but is otherwise valid JSON.  Please give it a try, and if you feel there is a place for this in your company or application, please contact me through the usual channels.

I have to give thanks to longtime SfB/Teams MVP Jonathan McKinney for pushing me to do this, and to help test it along the way. Beers are inbound!

Wednesday, June 6, 2018

Teams Direct Routing Now Supported in the Skype Optimizer

Since Microsoft made Teams Direct Routing available as a preview, I've been working to modify the Skype Optimizer codebase to support it. Finally, after weeks of coding and testing, I'm pleased to announce that the Skype Optimizer now fully supports Teams Direct Routing.



O365-native PSTN calling is only available for a small number of countries. A lot of companies have held off from migrating to the cloud until there was some way of allowing PSTN calling for users in the other countries that aren't supported in O365. Direct Routing allows customers basically anywhere in the world to move their users to Office 365 while allowing telephony connections to their provider of choice via on-premises PSTN gateways.

Direct Routing in Teams works almost exactly the same as it does in Skype for Business on-premises. The only differences are in how normalization rules are handled (which I talked about in an earlier post), and there is no support for trunk translation rules. This means that if you have to do any manipulation of numbers before sending to another PBX or the PSTN, you will have to do it at the PSTN gateway level.

There are still voice policies, routes and PSTN usages that are combined together to provide nearly limitless ways to control how phone calls are routed in your company. The difference is really just in the name of the PowerShell commands, as shown below.

Skype for Business
Teams Direct Routing
Get-CsVoicePolicy
Get-CsOnlineVoiceRoutingPolicy
Get-CsVoiceRoute
Get-CsOnlineVoiceRoute
Get-CsPstnUsage
Get-CsOnlinePstnUsage

Fortunately, since the underlying concepts are still the same, that means the Skype Optimizer can still provide many of the same features to Teams as with on-prem deployments.  This includes:

  • Class of Service
  • Least-cost/failover routing
  • Extension ranges
Skype Optimizer features not currently supported in Teams include:
  • Location-based routing
  • Selective caller-ID blocking
  • Premium number blocking via Announcement service
  • Call Park
The PowerShell script generated by the Skype Optimizer builds on the same codebase used for Skype for Business Online scripts. The only prerequisite is that you have already paired one or more PSTN gateways to your Office 365 tenant using the New-CsOnlinePSTNGateway command.  If you haven't done this step, the Skype Optimizer script will only create normalization rules that will apply to both Skype for Business Online and Teams. 

Please try out the new functionality and let me know if you experience any issues.

Monday, March 12, 2018

The Evolution of the Skype Optimizer: From Locally-Run VBScript to Azure Web App

I was recently amazed when I realized the seeds of what is now the Skype Optimizer was created 10 YEARS AGO, back when Office Communications Server 2007 R2 was starting to make headway in the unified communications space.

The Beginning

The genesis of the program grew out of a need to know when to strip the +1 from a North American local number and when not to. Rather than rehash the creation myth, you can read all about it from one of my earliest blog posts where I announced the Dialing Rule Optimizer to the world (at that time, the 30-odd subscribers to my blog).

The very first version was a straight VBScript that I had to manually input the proper variables and run it by hand. Rather than give the code away, I told people to email me the phone numbers they wanted to get optimized dial rules for.  I would run the script and send them the results, which was a simple text file with either a bunch of regex, or text formatted to be applied to AudioCodes or Dialogic gateways.  Word got around within Microsoft, and I found myself busy sending stuff to various Microsoft consultants.

When Lync 2010 came around, which was in its earliest days known as Communications Server "14", I added the capability to create simple routing rules that consisted of a few lines of PowerShell code.  I also wrapped the code around a simple UI in something called an HTA (short for HTML application).  It made generating rulesets easier for me, but it was still something that I was running from my local machine.
The earliest known copy of the original Dialing Rule Optimizer. I obtained this from the Smithsonian Museum. The text-only v1.0 has been lost to the sands of time.
I soon figured out that it would be relatively straightforward to move the HTA into an actual web page.  I put the code on a web server hosted by the company I was working for at the time and opened up the tool to the entire world. I actually put this code on the computer that was running our OCS 2007 R2 server!

The very first web-based iteration of the Dialing Rule Optimizer. Note the Communications Server "14" logo on the top-right.

Once Communications Server "14" became Lync 2010, I realized that I could go beyond simple optimized route creation, and modified the Optimizer to create everything required for a simple Enterprise Voice setup for US and Canada deployments.

Shortly after, I realized that I could do the same for other countries as well. The Optimizer interface grew somewhat to accommodate the requirements for different countries.
Dramatic differences abound! Communications Server "14" has changed to "Microsoft Lync". Also, UK dial plans!
I slowly added other countries to the Optimizer. I also added other features such as extension dialing rules, least-cost/failover routing, among many others.

Over time, the back-end code base was starting to become difficult to support. I was using a series of XML files to deal with languages and country-specific dialrules, and the sheer number of them was becoming cumbersome to manage.  I decided to move everything from the company-hosted platform to Amazon Web Services. I built a single Windows VM with SQL Express and ported the XML files to a database. It worked well, but AWS was starting to cost a fair bit to run for a free service. Donations were not keeping pace with costs.

I then discovered that Microsoft MVPs got a monthly allotment of funds in Azure. I immediately moved my infrastructure to Azure, where it ran mostly trouble-free for the next several years.

The Optimizer featureset grew and grew, but the interface was still as ugly as the day I first created it.  One person even suggested it looked like a GeoCities page. Hey, my argument was always that I was not (and am still not) a web developer.


The Modern Era

I decided to try to give the Optimizer a more modern look. I completely re-wrote the front-end code using Notepad++ as my trusty editor.  I replaced the clunky extension builder with a better Javascript framework that emulated an Excel spreadsheet, and made other significant under-the-hood improvements. After much trial and error, I was pleased to unveil the new look.

The website looked modern, clean and easy-to-use.  However, it bugged me that while the site was running in Azure, it was still just a single Windows VM with a local SQL Server instance. It was also costing most of my monthly Azure MVP credits with not a lot of headroom. I decided to try to make the Skype Optimizer simpler and cheaper to manage, figuring that there would probably come a day when I would not be a Microsoft MVP (the horror!!!) and I'd be expected to pay a monthly bill (Oh the humanity!!!).

My first step was to dump the local SQL and move to Azure SQL Database. First, I had to copy the gigabytes of data to my SQL instance.  I opted to use transactional replication, which would allow me to keep both my local and Azure-based SQL instances up-to-date while I tested things out.  It turned out to be ridiculously easy. I had to modernize my code a bit to allow it to still read/write data to Azure SQL, but this was pretty straightforward as well.

With that hurdle out of the way, I looked at a few different ways to further reduce my costs and administrative burden. 

Docker Containers

I'd heard about Docker containers and how it was an easy way to reduce the overall complexity and costs over a traditional virtual machine.  I installed Docker on my home machine and started messing around with it. I used the Image2Docker tool to make a copy of my Windows VM-based website and installed it locally.  I had to do quite a bit of modifications to my dockerfile to support some of the added features that Image2Docker didn't capture, but after a while, I managed to make the Skype Optimizer work in a Docker container. 

Moving my newly-created container to Azure wasn't too much work, but there was definitely a learning curve involved. It started up fine, and I pointed my DNS entries to the container and away we went!  However, all was not perfect:
  1. My container stopped working a few times over the span of a few weeks. Troubleshooting this proved to be nearly impossible due to the nature of Docker containers and how you lose the previous state every time you restart it. Either that or I just don't know how these things really work.
  2. Making code modifications wasn't simple either. I'd have to make the change in my local Docker image and publish that image to Azure. The startup process took 5-10 minutes, which was probably due to how I built my container.  I looked at ways to improve the startup time, but it was already eating up lots of my time.
  3. The costs to run the container wasn't much cheaper than a full VM
Because of those reasons, I decided that Docker containers weren't well-suited to my needs and I finally turned to....

Azure Web App

All the back-end code changes I made to the Optimizer to support both Azure SQL Database and Docker containers actually had an unexpected side benefit: it allowed me to easily move the Optimizer to Azure Web App, which is Azure's web hosting framework.

To make the process of managing this easier, I finally moved away from Notepad++ as a development environment and embraced Visual Studio. Visual Studio made it trivial to take my entire website and migrate it to Azure. I had a few challenges with making my code-signing certificate work, but in the end it all worked flawlessly. 

The Skype Optimizer has been running as an Azure Web App for several months now. Its extremely reliable, simple to manage, and about 3 times cheaper to run than the original Windows VM. 

The Future

So, there you have it. The entire history of the Skype Optimizer posted here for posterity. Where do things go from here? Well, with Microsoft Teams eventually taking over the Enterprise Voice role from Skype for Business, I may decide to look into what it would take to turn the Skype Optimizer into a Teams Direct Routing and Calling Plans management platform. 

Until then, I will continue to keep updating the Skype Optimizer to make sure Skype and Teams administrators worldwide have a single place to get accurate, up-to-date dialing rules for every country in the world.


Friday, March 9, 2018

Edge Topology Replication Failures Caused by Mismatched Windows Updates

While getting a new Skype for Business edge server ready for production, I generally make sure all the latest Windows Updates are applied. I was doing exactly this for the company I work for (Nectar Services Corp).  Everything seemed to go just fine, but I noticed the edge server's replication status was showing as False when I ran Get-CsManagementStoreReplicationStatus.  The usual procedure of running Invoke-CsManagementStoreReplication -ReplicaFQDN servername did nothing. Even wiping out the C:\RtcReplicaRoot\xds-replica folder as described in this dusty old blog post didn't make a difference.

The Event Logs on the front-end server that was the master replication partner showed these fairly frequent error events:
Log Name:      Lync Server
Source:        LS File Transfer Agent Service
Date:          2/22/2018 10:01:29 AM
Event ID:      1046
Task Category: (1121)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      FRONTENDSERVERNAME.contoso.com
Description:
Skype for Business Server 2015, File Transfer Agent cannot send replication data to Replica Replicator Agent on Edge
Edge machine: EDGESERVERNAME.contoso.com
Exception: System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at https://edgeservername.contoso.com:4443/ReplicationWebService that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 192.168.1.2:4443
I could reach the edge server's replication web service URL via port 4443 as described in the event log, so there wasn't a firewall issue or an issue with the web service that I could determine.

The edge server was also throwing errors, saying that it hadn't heard from any of the replication servers in a while and its feelings were very hurt.
Log Name:      Lync Server
Source:        LS Replica Replicator Agent Service
Date:          2/22/2018 12:04:32 PM
Event ID:      3045
Task Category: (3003)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      EDGESERVERNAME.contoso.com
Description:
The replication synthetic transaction has not been updated in a significant time period.
Time since the last update: 01.03:43:09
Cause: The Master Replicator Agent has not updated the replication transaction document in a significant time period.
Resolution:
If other replicas are experiencing similar issues, check the Master Replicator Agent and File Transfer Agent service health.  Verify access to the DFS files shares and replica file shares
The weird thing was that replication WAS working prior to me installing the final batch of Windows Updates. So, it seemed to make sense to uninstall each of the last Windows Update until things start working again. Unfortunately, after uninstalling those updates, along with a bunch of other ones in an increasingly desperate attempt to find the issue, the issue still persisted. Back to the drawing board.

I noticed that I was unable to copy binary files to the edge server via RDP copy/paste. Text files would copy fine, but any other filetype would cause the RDP session to crash with "Unexpected Server Error".  This seemed to be related to the replication issue, because this also worked fine prior to installing those updates. This is a good time to note that to access the edge server, I had to first RDP to a front-end server, then RDP to the edge from there, because the firewall was blocking all other access. This would prove to be relevant later.

I went as far as uninstalling and reinstalling Skype for Business along with all the supporting components, but it STILL didn't work.

Frustration level: STRATOSPHERIC

Finally, I nuked the entire edge server from orbit (It's the only way to be sure) and had the edge server rebuilt from scratch. Once again, replication worked.

End of story, right? Wrong. I had to figure out what the issue was, so I re-applied each of the original suspected updates until the issue re-appeared. And re-appear it did, along with the RDP copy/paste issue. The offensive update that caused the issue was 4072650, which is vaguely described as Hyper-V integration components update for Windows virtual machines. The description wasn't much help, but it certainly had an effect. Once again, removing that patch didn't make the issue disappear.

Finally, I checked to see if the front-end servers had this truly despicable update, AND THEY DIDN'T.  It was buried in Windows Update as an optional update, which weren't being automatically downloaded and installed. On a hunch, I installed the update (no restart required, thankfully), and VOILA, replication started working again, and I could copy/paste files between front-ends and edge via RDP.

So, this vague Windows Update was the source of all my issues. Sigh... Presumably, this would only show up in the following circumstances:

  • Using Hyper-V hosted virtual machines
  • Running Windows 2012 R2
  • Update 4072650 is installed on some VMs but not all

TL;DR version

If you're having replication errors on your edge server, and are getting LS File Transfer Agent Service error 1046 on your master replication server, then make sure that all front-end and edge servers have Update 4072650 if you see it applied on any one server.



Tuesday, April 25, 2017

Tenant Dial Plans in Skype for Business Online

As others have noted, all Skype for Business Online tenants that are capable of PSTN calling are now able to create their own custom dial plans. This allows administrators to create custom normalization rules for local and extension dialing and other cases the default normalization rules don't handle.

Dial plans work a bit differently in Skype Online than they do in on-premise deployments. In on-prem deployments, dial plans are completely independent of one another. Users assigned either a specific site or user-level dial plan will only use the normalization rules contained within that dial plan. Normalization rules assigned at the global level do not apply.

In Skype Online, users can be assigned to either a tenant global dial plan or a tenant user dial plan. As with Skype for Business on-prem, users will get normalization rules from the global tenant dial plan, unless they are explicitly assigned to a user tenant dial plan. There is no inheritance from tenant global to tenant user dial plans.

However, users WILL inherit dial plan normalization rules from a Skype Online global dial plan specific to the country where their phone numbers are hosted. This is the default dial plan that all users are assigned to in the absence of a tenant global or user dial plan.

If you were to run Get-CsDialPlan on your tenant, you will see dial plans for most of the countries in the world (220 to be exact. 12 short of what's in the Skype Optimizer, but who's counting?). These dial plans are pretty basic, accounting for national and international dialing prefixes but not much else.  The US and UK dial plan normalization rules are shown below:

Name
Priority
Pattern
Translation
US Intl Dialing
0
^011(\d+)$
+$1
US Long Distance
1
^1(\d+)$
+1$1
US Default
2
^(\d+)$
+1$1
US Extension Rule
3
^((\+)?(\d+))(;)?(ext|extn|EXT|EXTN|x|X)(=)?(\d+)$
$1;ext=$7
GB Intl Dialing
0
^00(\d+)$
+$1
GB Long Distance
1
^0(\d+)$
+44$1
GB Default
2
^(\d+)$
+44$1
GB Extension Rule
3
^((\+)?(\d+))(;)?(ext|extn|EXT|EXTN|x|X)(=)?(\d+)$
$1;ext=$7

The rules are quite rudimentary, and don't allow for capturing and fixing user-initiated dialing errors such as overdialing, incorrect adding of national prefixes for international calls, or local calls. The extension rule is quite interesting because it appears to have been written by someone who really doesn't know how to use regular expressions. Unless your extension rule matches the very specific cases of ext, extn, EXT, EXT, x or X, then it won't work. If they used my standard for pattern and translation, it would be both simpler and would capture pretty much ANY style of writing phone numbers with extensions:
Pattern: ^\+?(\d+)\D+(\d+)$
Translation: +$1;ext=$2
If you look at the dial rules for other countries, you'll see the only difference between them is the long distance and international dialing prefix and the country code.  I get that it can be tricky to maintain dial rules for 230+ countries, but hey if one simple Canadian dude can do it, couldn't Microsoft???

Aaaanyways, the global dial plan for your given country is applied along with any tenant level dial plans.  It appears that the tenant dial plans are applied first, followed by the global dial plan for your country.

This means you can use the Skype Optimizer to create more nuanced normalization rules that will work better than the default ones.  You can't disable the global dial plan, so if a user dials a number not covered by your tenant dial plans, then the global ones will apply. So, even if someone tries to dial a completely nonsensical number such as 00000023423422348987987998770, it would get captured by the global Default normalization rule and would allow the user to dial that number. Of course, it wouldn't get far, but I would think that if you can prevent users from dialing invalid numbers at the client level, it would keep Skype Online servers from having to use precious system resources to parse that number and notify the user it is invalid.

Thankfully, this means the Skype Optimizer is still relevant in the Skype Online world. You can still use it to create dial plans and normalization rules that will make the user dialing experience better, including support for:

Just select the desired options for your country, press the Generate Rules button, and run the script. It will ask for your credentials for your Skype Online tenant and will then ask if you want to create a tenant global or tenant user dial plan. It will then apply said normalization rules to your tenant. That's all it takes!

Buy now! Operators are standing by!





Tuesday, August 23, 2016

Lync/S4B Client Can Normalize Numbers with Plus Sign?!?!?

One of my central tenets of number normalization in Lync/Skype for Business has always been:
"Lync/Skype for Business will not attempt to normalize a number that already has a plus sign at the beginning"
There are signs that this is no longer true, at least for customers running Skype for Business Server 2015 and the Skype for Business 2016 client (and even the Lync 2010 client). I was messing around with normalization rules last night and realized that my Skype for Business 2016 client (version 16.0.7030.1021 32-bit) WAS normalizing numbers that started with a plus sign.  I later validated this with a Lync 2010 client against the same Skype for Business 2016 server.  And a commenter below said he's pretty sure he did this with Lync Server 2013.  So this means that either it has ALWAYS worked this way and I never realized it, or its something relatively new on the server-side.

I should clarify here and state that the normalization rule in question was specifically constructed to include a plus sign in the normalization pattern (as highlighted below):
^\+?(?:011)?(1|7|2[07]|3[0-46]|39\d|4[013-9]|5[1-8]|6[0-6]|8[1246]|9[0-58]|2[1-689]\d|3[578]\d|42|5[09]\d|6[789]\d|8[035789]\d|9[679]\d)(?:0)?(\d{6,14})(\D+\d+)?$
This DOES NOT mean that the Skype for Business client will suddenly normalize all numbers that start with a plus. If your normalization rules do not include a plus sign, it will not normalize a number with a plus sign.

This has all sorts of ramifications, all of them positive. That means that dealing with improperly formatted click-to-dial numbers in Internet Explorer is much simpler.  We can now fix this with a simple normalization rule, instead of doing it in a trunk translation rule or route.


This also means that dialing contact phone numbers from Outlook can also be much more reliable, especially if an extension is entered.  Before, if a user entered a contact phone number with an extension in Outlook using the "Phone number builder", it would format the number like "+1 (212) 555-1212 x 345".  If you clicked this number, Lync would parse the "x" as a "9" and the number would come out in Lync like "+121255512129345" and would likely fail.

I have no idea how long this new behaviour has been available.  It works on a Skype for Business Server 2015 environment with both Skype for Business 2016 clients and Lync 2010 clients.  I don't have a Lync 2013 or older environment to test with.  I tried to see if this behaviour extended to server-side normalization scenarios with no luck. I tried to change the destination phone number on an incoming call that started with a plus sign.  It didn't work, which implies that this is a client-only thing. The MSPL workaround is still required.

If this is only new to me, then I apologize for wasting everybody's time.  I've incorporated my new findings into the Lync Optimizer (as you would expect), because it does allow for several dialing scenarios to work better than before.

If anybody has any information on this or can test other versions of Lync server/client, drop me a line.

UPDATE (2016-Aug-24): Updated to clarify that a normalization rule has to explicitly look for a plus sign for this to work. Original post implied that any valid, matching normalization rule would be applied to a number that had a plus sign on the front.  Also made changes based on further observations with older clients.

Wednesday, August 10, 2016

North American Call Authorization Tricks

Almost every country in the world has a dedicated country code assigned to it for telephony purposes (see countrycode.org for a full list). There is one very large exception to this rule and that is countries in North America (US and Canada) and a good portion of the Caribbean.  These countries all share the same country code of +1 and are collectively part of the North American Numbering Plan (NANP), which is managed by the North American Numbering Plan Association (NANPA).

A NANP number consists of the country code +1, a 3-digit area code, and a 7-digit subscriber number, often formatted like +1 (212) 222-3333.  You'll sometimes see the +1 omitted from the number on business cards and websites that don't think globally.

Dialing numbers in North America is quite different from most places in the world.  In the rest of the world, if you want to dial a long-distance or national number, you have to dial a national access code.  In a lot of countries, this number is 0.  However, in NANP countries, you have to dial 1 for long-distance/national numbers. That's right, you have to dial the actual NANP country code as the long-distance access code, which is unique among all other countries.  Incidentally, the international access code is 011 (in many other countries, its 00).
"Connect me with the German Chancellor immediately! I have an idea for a new album!"
If you want to dial Canada from the US, or the US from Canada, or any of the Caribbean countries in NANP, you don't dial the international access code 011, you just dial the number as you would any other long-distance number.  So, if I want to dial Toronto, Canada from Vancouver, Canada I would dial +1 416 555 1111.  If I wanted to dial New York City from Vancouver, I'd dial +1 212 555 1111.

It gets even better.  NANP area codes are allocated by a trio of over-caffeinated spider monkeys who press numbers on a giant keypad at random to assign a new area code (well, that's how it appears).  There is no logical separation between area codes between any of the countries.  For example, area code 646 is in New York, USA, 647 is in Ontario, Canada, and 649 is in the Turks & Caicos (Caribbean island nation). Click here for a current list of area code allocations in NANP.

These days, calls between the US and Canada are often priced identically to a domestic long-distance call.  But calls to the NANP Caribbean countries are usually priced much higher, akin to dialing some very remote international destinations.  Continuing the previous example, a US user that has signed up for Microsoft's Skype for Business Online PSTN Calling Plan can expect to pay $0.013 per minute  (just over 1 cent a minute) to call area code 647 in Ontario, but would end up paying $0.583 (almost 45x more expensive) to $0.741 per minute to call area code 649 in the Turks & Caicos.

Since it is nigh impossible for a regular person to distinguish a Caribbean country from US/Canada based solely on the area code, it is easy for people to accidentally incur high phone charges when calling these countries.
He had an Apple watch long before anybody else. True trendsetter, even if it took 30 years.
Skype for Business administrators often want to prevent certain users from dialing these Caribbean countries, and it can be easily done with some fancy routing rules.  Assuming you've already configured voice polices for national and international access, then you just have to modify the National level voice route to block Caribbean destinations.  You could do some research to determine the Caribbean area codes, and create your own regular expression to block them, but I figure I should do SOMETHING useful with this post and provide you with that information.

This Skype for Business route number pattern will block all Caribbean countries that are part of NANP, except for the US Virgin Islands and Puerto Rico (which are often not priced much differently than US numbers, when calling from the US):
\+1(?!(24[26]|26[48]|284|345|441|473|649|664|721|758|767|784|8[024]9|86[89]|876|900|976))[2-9]\d\d[2-9]\d{6}$
If you want to also block US Virgin Islands and Puerto Rico, then use this one:
^\+1(?!(24[26]|26[48]|284|34[05]|441|473|649|664|721|758|767|78[47]|8[024]9|86[89]|876|900|939|976))[2-9]\d\d[2-9]\d{6}$
You may note that I've also made sure to exclude premium area codes like 900 and 976.

To make sure that people assigned an International voice policy can dial those Caribbean countries, your international route should include a pattern for North American numbers and look something like this:
^\+((1(?!(900|976))[2-9]\d\d[2-9]\d{6})|([2-9]\d{6,14}))$
If you REALLY wanted to get draconian and block calls to Canada from the US, and vice versa, you could use the following rules:

Block Canadian and Caribbean calls:
^\+1(?!(900|976))(20[^04]|21[^1]|22[4589]|23[149]|24[08]|25[12346]|26[0279]|27[026]|30[^06]|31[^1]|32[0135]|33[^2358]|34[067]|35[12]|36[01]|38[056]|40[^03]|41[^168]|42[345]|43[0245]|44[023]|47[0589]|48[04]|50[^06]|51[^149]|53[0149]|54[01]|55[19]|56[12347]|57[01345]|58[056]|60[^04]|61[^13]|62[03689]|63[016]|64[16]|65[017]|66[01279]|67[018]|68[124]|70[^059]|71[^01]|72[0457]|73[1247]|74[037]|75[47]|76[02359]|77[^1678]|78[1567]|80[^079]|81[^19]|83[012]|84[3578]|85[^1235]|86[02345]|87[028]|90[^025]|91[^1]|92[0589]|93[16789]|94[0179]|95[12469]|97[^4567]|98[0459]|281|458|469|520|828)[2-9]\d{6}$
Block American and Caribbean calls:
^\+1(?!(900|976))(41[68]|43[178]|51[49]|58[17]|70[59]|78[02]|90[25]|204|226|236|249|250|289|306|343|365|403|450|506|548|579|604|613|639|647|778|807|819|825|867|873)[2-9]\d{6}$
The above rules are formatted slightly differently from the ones that block Caribbean countries.  The Caribbean rules show which area codes to BLOCK.  The other rules show which area codes to ALLOW.  Of course, I didn't roll these rules by hand, since there are tens of dozens of area codes allocated to each country, and do change from time to time.   I used the super handy-dandy Lync Optimizer.  As you might expect, these options are available in the Lync Optimizer via the "Treat as National" option (only shows for North American dial rules).  You can select one of the following options with regards to dialing other countries within NANP:
  1. In-Country Only - Treat all calls to NANP countries other than your own as international, even though users don't have to dial 011 to reach them. As such, users will have to be a member of a voice policy that allows international dialing.
  2. US/Canada - Treat calls to anywhere in US/Canada as national calls, excluding the Caribbean. To dial Caribbean countries, users will have to be a member of a voice policy that allows international dialing. Not available for Caribbean rulesets.
  3. US/Canada/Caribbean - Treats calls to anywhere in NANPA (US/Canada/Caribbean) as national calls (along with the potentially higher call costs).

If you are creating a ruleset for a Caribbean country that uses +1 as the country code, you won't be able to select US/Canada, since this would make dialing within that Caribbean country difficult.

Selecting the Simple Ruleset option prevents usage of this feature, and will default to US/Canada/Caribbean. You won't be able to control how people dial other NANP countries when there is only a single simple routing rule.

So, there you go.  A lesson in the finer details of dialing numbers in North America that the rest of the world might not be aware of. Try to contain your excitement.