LS Storage Service 32054, A New Twist

Stop me if you’ve heard this one before…  “A guy walks into a bar, and says ‘Ouch'”.  Also, a Skype administrator reviews the Frontend Event logs and sees LS Storage Service errors, event id 32054, and says ‘ignore’.  Guess what, not today!!!

Log Name:      Lync Server
Source:        LS Storage Service
Date:          12/19/2016 9:32:45 AM
Event ID:      32054
Task Category: (4006)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SFB2015.Company.net
Description:
Storage Service had an EWS Autodiscovery failure.

StoreWebException: code=ErrorEwsAutodiscover, reason=GetUserSettings failed, smtpAddress=Bob@Company.com, Autodiscover Uri=https://autodiscover.Company.com/autodiscover/autodiscover.svc, Autodiscover WebProxy=<NULL>, WebExceptionStatus=ConnectFailure ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 40.96.38.248:443

Our environment is a common one these days, I think: Exchange Online combined with Skype for Business on-premises.  And unlike the people who are both Online or both OnPrem, our Skype for Business mobile clients don't get to enjoy server-side Conversation History.  The key reason: OAuth.  I've gone through the Microsoft process of configuring OnPrem with Online (MSLink), and it's ugly; honestly, I couldn't tell if it did anything, and it certainly didn't get my Server-Side Conversation History working for mobile devices.  Fortunately a hero comes along, in this case Aaron Marks, who developed a script that makes that step soooo much easier and quicker: Configure-OAuth.  There are a couple of items you need to install on a Front End first: the MS Online Services Sign-in Assistant and the AAD PowerShell module (Download Link).  The key to this script, which I keep forgetting, is that it MUST be run via the Azure Active Directory (AAD) PowerShell (elevated, of course).  I keep trying with the Skype PowerShell and it fails miserably.  You must also be a Global Admin on the O365 portal; Exchange and Skype Admin alone is not sufficient.  Typical command:

Configure-OAuth_ExOn_Sfb_Server.ps1 -WebExt "webext.company.com"

Works extremely well, but still no conversation history for mobile.
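For the forgetful (me included), the whole session looks roughly like this; a rough sketch, assuming the AAD PowerShell download gives you the MSOnline module:

# From an elevated Azure AD (MSOnline) PowerShell session on the Front End
Import-Module MSOnline
Connect-MsolService        # sign in with a Global Admin account
.\Configure-OAuth_ExOn_Sfb_Server.ps1 -WebExt "webext.company.com"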

This weekend I completed a pool-to-pool transition, and while reviewing the logs: damn, 32054.  Complaining about Autodiscover again.  I'm thinking maybe the ExchangeAutodiscoverUrl line of CsOAuthConfiguration is supposed to be changed to autodiscover.outlook.com or something equally ridiculous (damn, still can't spell rediculus without autocorrect).  Next hero walks in, this time Adam Hand, and he nonchalantly mentions to set the ExchangeAutodiscoverUrl with HTTP instead of HTTPS.  I don't know where he got the divine inspiration for that gem, but a few expletives were emitted on my part.  Maybe all you super Skype Admins knew this; if so, you're jerks.  :p  MS support certainly didn't, when I had a running conversation for 3 months about this exact scenario not working.

Summary: When Skype for Business on-premises is deployed with Exchange Online, the Set-CsOAuthConfiguration command would be as follows:

Set-CsOAuthConfiguration -Identity Global -ExchangeAutodiscoverUrl http://autodiscover.company.com/autodiscover/autodiscover.svc

Note the HTTP, not HTTPS.  Also, if you're just grabbing the URL from your CAS server, change the .xml to .svc.
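If you'd rather pull the published value than guess, on an Exchange server something like this shows the internal Autodiscover URI (note that it ends in .xml, hence the swap to .svc):

Get-ClientAccessServer | Select-Object Fqdn, AutoDiscoverServiceInternalUri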

Within about 5 minutes you should start to see some new entries in your event logs, as follows.  Of course, this is all assuming you have Autodiscover properly set up in the first place.  The SCRAMBLEs were added just in case someone got some funny ideas…

And Test-CsExStorageConnectivity now works too.
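For a quick sanity check from the Skype for Business Management Shell, something along these lines does it (the SIP address is just an example):

Get-CsOAuthConfiguration | Select-Object ExchangeAutodiscoverUrl
Test-CsExStorageConnectivity -SipUri sip:Bob@Company.com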

Log Name:      Lync Server
Source:        LS Storage Service
Date:          12/19/2016 10:22:37 AM
Event ID:      32046
Task Category: (4006)
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      SFB2015.Company.net
Description:
A properly configured certificate from the OAuth Token Issuer was found.

#CTX#{ctx:{traceId:184SCRAMBLE9420, activityId:"be6SCRAMBLE-adc"}}#CTX#
Found OAuthTokenIssuer Certificate, serialNumber=44SCRAMBLE00035, issuerName=CN=IRC-DC02, DC=Company, DC=net, thumbprint=6DESCRAMBLECE20

Log Name:      Lync Server
Source:        LS Storage Service
Date:          12/19/2016 10:22:37 AM
Event ID:      32048
Task Category: (4006)
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      SFB2015.Company.net
Description:
OAuth was properly configured for Storage Service.

#CTX#{ctx:{traceId:184SCRAMBLE9420, activityId:"be6085f9-SCRAMBLE-f6df8f77badc"}}#CTX#
CsOAuthConfiguration validly configured

Log Name:      Lync Server
Source:        LS Storage Service
Date:          12/19/2016 10:22:37 AM
Event ID:      32052
Task Category: (4006)
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      SFB2015.Company.net
Description:
OAuth STS was properly configured for Storage Service.

#CTX#{ctx:{traceId:184SCRAMBLE9420, activityId:"be6085f9-SCRAMBLE-f6df8f77badc"}}#CTX#
GetAppToken succeeded for request with sts=https://accounts.accesscontrol.windows.net/f5e8862b-SCRAMBLE-b67b33a9001a/tokens/OAuth/2

Log Name:      Lync Server
Source:        LS Storage Web Service
Date:          12/19/2016 10:26:15 AM
Event ID:      32001
Task Category: (1307)
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      SFB2015.Company.net
Description:
Storage Web Service has been loaded.

Log Name:      Lync Server
Source:        LS Storage Web Service
Date:          12/19/2016 10:26:24 AM
Event ID:      32006
Task Category: (1307)
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      SFB2015.Company.net
Description:
Storage Web Service request succeeded.

 

UPDATE Dec 27, 2016:  Well, apparently these event errors don't disappear with the change, BUT it does resolve the OAuth issue, and I do get Server-Side Conversation History working with the Skype for Business mobile client in a Skype OnPrem/Exchange Online environment.

tLDN, 127.0.53.53 and Edge servers

It's hard to believe in coincidences in IT, but they do sometimes happen.  Last night a client made a firewall change over to newer, bigger hardware, but due to some glitch this morning they had to switch back.  BUT, they could no longer federate with their sister companies.  Fun scenario: they're companyA.ca, but they federate with companyA.com, companyA.de, and so on.  To make matters more interesting, they have stub zones on the internal DNS.  No biggie; they're using public DNS for name resolution anyway.

Suddenly name lookups started resolving everything to 127.0.53.53, which according to ICANN is the indicator for a name collision.  It would seem that ICANN brought some new Top Level Domain Names online last night, including .ADS.  Guess what the company is using for their internal domain name space… eyuup, companyA.ads.
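If you want to see the collision answer for yourself, bypass the internal DNS and ask a public resolver directly; the host name here is made up, but anything under a newly delegated TLD should come back as 127.0.53.53 during ICANN's controlled-interruption window:

Resolve-DnsName something.companyA.ads -Server 8.8.8.8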

You're thinking, "Ok smart guy, why do I care?".  Well, my experience and training go back to OCS (some LCS), and I was mentored by a guru at Microsoft, thank you CG.  I was taught to never join the Edge to the domain, and to modify the Primary DNS Suffix of the Edge instead.  Now, I was taught to use the internal DNS namespace, you know, the same as the FEs and all that good stuff.  My good buddy ML had the foresight, seeing all these new tLDNs coming, to start using the same namespace as the company SIP domain instead.  Gee, wish I had.

For some very strange reason, on the Windows Server 2012 R2 Edge server, all NSLOOKUPs had companyA.ads appended, so a lookup for CompanyB.com became CompanyB.com.companyA.ads and resolved to 127.0.53.53.  No suffixes on the NICs, in case anyone is wondering.
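A couple of quick ways to check where a rogue suffix is coming from on the Edge (the primary suffix is stored under Tcpip\Parameters in the registry):

# Primary DNS suffix, i.e. what gets appended to unqualified lookups
(Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters").Domain
# Connection-specific suffixes per NIC
Get-DnsClient | Select-Object InterfaceAlias, ConnectionSpecificSuffix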

Resolution Time:  Easy, rename the Edge server…  In my training, I always deploy an Edge Pool, even when only a single Edge is being deployed; luckily I did that here.  I was able to:

  1. Add a "New" Edge server with the new FQDN (valid suffix), same IP addresses and everything. Publish.
  2. Remove the "Old" Edge server. Publish. (Changes replicate up to the Edge.)
  3. On the Edge, run the Deployment Wizard. Edge services are removed as a result.
  4. Update the Primary DNS Suffix. Reboot.
  5. Retest NSLOOKUP; yay, success.
  6. Run Export-CsConfiguration and copy the file up to the Edge (see the sketch after this list).
  7. Run the Deployment Wizard again; you need to rerun Step 1: Install Local Configuration Store, using the new configuration file.
  8. Run Step 2 and reinstall the server components.
  9. Double-check the certificate assignments.
  10. Start-CsWindowsService or reboot.
  11. Bob’s your Uncle.
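Steps 6 through 8, in rough PowerShell terms (the file path is just an example):

# On a Front End: export the CMS configuration
Export-CsConfiguration -FileName C:\CsConfig.zip
# Copy CsConfig.zip to the Edge; the Deployment Wizard's Install Local Configuration Store step consumes it,
# or you can import it from the shell on the Edge:
Import-CsConfiguration -FileName C:\CsConfig.zip -LocalStore
# Once components and certificates are back in place:
Start-CsWindowsService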

Practice going forward: use the SIP domain namespace for the Edge server name/suffix.

FYI, .CORP is coming down the pipe, and I believe .DEV is already available.

One-to-One Bi-directional Static NAT

CORRECTION: Should be One-to-One Bi-Directional Static NAT.  Some firewalls perform NAT Pooling…

Firewalls: love them, hate them.  They are a necessary evil and the bane of pretty much every Lync/Skype deployment specialist since the beginning of UC.  It used to be that we just needed a two-way NAT; then the terminology became Static NAT, or SNAT, which we can't use anymore (thank you very much, F5), so the terminology that seems to have stuck, and which no one has usurped, is "Bi-directional Static Network Address Translation", BDSNAT or BSNAT… ok, we'll stick with Bi-directional Static NAT.

It's hard keeping up with all the different firewalls, so I try to keep the language universal, and yes, it seems just about every firewall product out there has its own language.  Sometimes it's as similar as Canadian English and American English: the same, but with subtle differences.  Or as different as Oxford English and Urban English (yup, look it up, Urban).  Whatever the language, we need to keep our terminology as universal as possible, and some firewall admins still struggle with it.

Now you get past that struggle, and a firmware update comes along for the firewall…  Recent case in point.

A client was going through an ISP-requested IP address change to a new block.  During this outage it was thought to be a good idea to also update the firmware on the firewall, in this case a SonicWALL NSA, which was upgraded from 6.1.2.3 to 6.2.6.1.  For the first 4 days everything seemed fine, but Friday morning things went sideways.  Outbound PSTN calls via their ThinkTel SIP trunks had no outbound audio.  Inbound audio was fine, and inbound calls were fine as well.  Additionally, if the caller put the call on hold and then resumed, outbound audio started working.

After much troubleshooting with the good folks at ThinkTel, and burning my eyes out looking at Wireshark (literally, I had to go to the optometrist), I was having a tough time lining calls up; ports weren't making sense, I thought the time zones were buggered, it was very frustrating.

I decided to do a top-to-bottom review of the SonicWALL with the firewall admin: asking every question about what is working and how, what does this checkbox do, yadda, yadda, yadda.  Borderline embarrassing actually, but it so often works.  That's how we came across the setting for "Disable Source Port Remap" in this environment.  Firmware version 6.1.2.5 had added a new feature for NAT policies: the ability to "Disable Source Port Remap".

After reflecting on the change and the resolution we discovered, I was able to go back and see that I did in fact have the same thing in my earlier network traces, and it is now very obvious what was going on.  Sometimes I'm not losing my mind; I'm just not comprehending what I'm seeing, so I thought I was in error.

After getting my Wireshark all tuned up to display the time of day rather than relative time, and a few other odds and ends which I should post something about (assuming I can remember all of what I did)…  Anyway, line for line you can see the port progression, and it was actually this particular trace that flagged me to look at the ports, because why on earth was there RTP traffic coming from ThinkTel to the Mediation server on port 3227, and why couldn't I find it in my Mediation trace…
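For reference, a rough tshark equivalent of that kind of view, with absolute timestamps and the RTP addressing fields so the port progression stands out (assuming the capture includes the SIP signaling so the RTP streams get classified):

tshark -r capture.pcap -t ad -Y "rtp" -T fields -e frame.time -e ip.src -e udp.srcport -e ip.dst -e udp.dstport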

This will be something lodged in my brain for a while, I hope.  I also do not think this will be isolated to SonicWALL firewalls; I expect you might see this kind of behavior with SIP ALG, SIP Inspect, and whatever else might mess with SIP traffic at the network or application layer.  The SIP packets themselves contain the port negotiation required for RTP connectivity; the last thing we need is a firewall trying to do its own thing.
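For illustration, the SDP carried inside the INVITE includes an audio media line along these lines (made-up port and payload types):

m=audio 49152 RTP/AVP 0 8 101

That 49152 is the port that side expects to receive RTP on, and with symmetric RTP it's also where its own media should come from; if the firewall then quietly remaps the source port of the actual stream, the far end sees audio arriving from a port that was never negotiated and is liable to ignore it.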

Feel free to add a comment if you've seen similar "port remapping" on different firewalls, and whether it was related to some application-layer setting.  I always thought SIP ALG was more about the verbs being used; I'm curious now whether it messes with ports too.