ucready

Lync Front End Service (RTCSRV) won’t start

Hello everyone!

After a long stretch without this problem, I recently ran into trouble with the Lync 2013 Front End service (RTCSRV), which was stuck in the Starting state. Interestingly, in my case the problem occurred on a Lync 2013 Survivable Branch Server (SBS). Until now I had only seen this issue on Enterprise Front End pools.

To resolve the issue, I started by checking the event log on the Lync 2013 SBS and found the typical events that usually accompany this problem: event IDs 32173, 32174 and 30988 from LS User Services, plus some 32027 events from LS Storage Service. If you have already run into the “always starting RTCSRV” problem, you probably know these IDs.
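
If you would rather not scroll through Event Viewer, the same events can be pulled with PowerShell. This is only a minimal sketch: it assumes the events are written to the event log named "Lync Server", and the limit of 200 events is an arbitrary choice of mine.

```powershell
# Assumption: Lync Server 2013 logs these events to the "Lync Server" event log.
$ids = 32173, 32174, 30988, 32027

# Fetch the most recent matching events and list them in chronological order.
Get-WinEvent -FilterHashtable @{ LogName = 'Lync Server'; Id = $ids } -MaxEvents 200 |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, ProviderName, LevelDisplayName -AutoSize
```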

So I checked some basics: the status of the Management Store replication (Get-CsManagementStoreReplicationStatus), the pool upgrade readiness state (Get-CsPoolUpgradeReadinessState), and the current disk and RAM utilization. Everything looked OK. Next I reviewed the status of the Windows Fabric Host Service (FabricHostSvc) and restarted the service, which worked without problems.
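
The basic checks above could look roughly like this when run from the Lync Server Management Shell on the affected server. Treat it as a sketch rather than a fixed procedure: the disk and memory queries are just one way to read utilization and are my own choice, not something the Lync tooling prescribes.

```powershell
Import-Module Lync

# Replication status of the Central Management Store replicas
Get-CsManagementStoreReplicationStatus |
    Format-Table ReplicaFqdn, UpToDate, LastStatusReport -AutoSize

# Upgrade readiness state as seen from this server
Get-CsPoolUpgradeReadinessState | Format-List

# Free space on local fixed disks (DriveType 3) and free physical memory
Get-CimInstance Win32_LogicalDisk -Filter 'DriveType = 3' |
    Select-Object DeviceID, @{ n = 'FreeGB'; e = { [math]::Round($_.FreeSpace / 1GB, 1) } }
Get-CimInstance Win32_OperatingSystem |
    Select-Object @{ n = 'FreeRamMB'; e = { [math]::Round($_.FreePhysicalMemory / 1KB) } }

# Windows Fabric host service: check the state, then restart it
Get-Service FabricHostSvc
Restart-Service FabricHostSvc
```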

Note: In my experience, a Lync 2013 SBA or SBS sometimes shows the state “InsufficientActiveFrontEnds”, even if the required number of Front End servers is active in the corresponding Front End pool.

The Lync Front End pool was not the problem either: its services worked fine and were running at the time the events were logged on the SBS. I decided to restart the branch server, but the behavior stayed the same.

After restarting RTCSRV, the service remained permanently stuck in the Starting state. So I tried to perform a reset of the pool registrar state (Reset-CsPoolRegistrarState -PoolFqdn "FQDN" -ResetType QuorumLossRecovery), but that couldn’t be completed because the service did not start within the expected timeframe.

The Front End service was still starting and the reset command couldn’t do its work, so I decided to shut down RTCSRV manually. Stopping the service with Stop-CsWindowsService RTCSRV was not possible, so I used sc queryex at the command prompt to find the process identifier (PID) and taskkill to terminate the process immediately.
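
For reference, here is a sketch of that forced stop from a PowerShell prompt. It is an equivalent of the two steps, not a transcript of what I typed: the variable name is made up, and the PID is read through Win32_Service instead of being copied from the sc output by hand.

```powershell
# "sc" alone is an alias for Set-Content in Windows PowerShell, so call sc.exe explicitly.
sc.exe queryex RTCSRV

# Read the PID from the service object instead of parsing the text output.
# ($PID is a reserved automatic variable, hence the different name.)
$rtcPid = (Get-CimInstance Win32_Service -Filter "Name = 'RTCSRV'").ProcessId

# A ProcessId of 0 means the service has no running process.
if ($rtcPid -gt 0) {
    taskkill.exe /PID $rtcPid /F
}
```

Killing the process this way skips any clean shutdown, so it is a last resort for a service that no longer reacts to a normal stop request.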

Then, with the service in the Stopped state, I repeated Reset-CsPoolRegistrarState, but this did not resolve the problem either.
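
Spelled out, the second attempt looks roughly like the sketch below. The FQDN is a hypothetical placeholder and has to be replaced with the real pool FQDN of your environment.

```powershell
# Hypothetical placeholder; replace with the real pool FQDN.
$poolFqdn = 'sbs01.contoso.local'

# Confirm that RTCSRV is really stopped before resetting.
Get-CsWindowsService -Name RTCSRV

# Same reset type as in the first attempt.
Reset-CsPoolRegistrarState -PoolFqdn $poolFqdn -ResetType QuorumLossRecovery
```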

After some pondering, I checked the event log in more detail and found an interesting warning event:

Event ID: 507 / Source: ESE

The description for Event ID 507 from source ESE cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Fabric
896
ReconfigurationAgentStoreInstancea44cf2dd4b563f7956de728f40da1c:
C:\ProgramData\Windows Fabric\Lync_Rtm_E9_125\Fabric\work\RA\RA.edb
0 (0x0000000000000000)
65536 (0x00010000)
26
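
To check whether a server logs the same warning, a query along these lines may help. It is a sketch that assumes the event lands in the Application log. The source is filtered after the fact, on the guess that the ESE provider is not properly registered on the machine (which the missing description hints at).

```powershell
# Assumption: the ESE warning is written to the Application log.
Get-WinEvent -FilterHashtable @{ LogName = 'Application'; Id = 507 } -ErrorAction SilentlyContinue |
    Where-Object { $_.ProviderName -eq 'ESE' } |
    ForEach-Object {
        # Join the raw event parameters so the database path (RA.edb) is visible.
        [pscustomobject]@{
            TimeCreated = $_.TimeCreated
            Data        = ($_.Properties | ForEach-Object { $_.Value }) -join ' | '
        }
    } |
    Format-List
```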

I thought this might be related to my problem. So I opened Control Panel -> Programs -> Uninstall a Program and looked for Windows Fabric, but the only option there was Uninstall; a repair wasn’t offered. Then I switched to the Lync Server 2013 DVD (setup\amd64\), where I could repair the installation with a right-click on WindowsFabric.msi. When the repair was done, I restarted the server.
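
I used the right-click repair, but the same thing should be possible from the command line with msiexec and its /f (repair) switch. The following is an untested sketch: the drive letter D: and the log file name are assumptions, and I have not verified that this behaves identically to the context-menu repair.

```powershell
# Assumption: the Lync Server 2013 media is mounted as D:.
$msi = 'D:\Setup\amd64\WindowsFabric.msi'
$log = Join-Path $env:TEMP 'WindowsFabric_repair.log'   # hypothetical log location

# /f without modifiers uses the default repair options; /l*v writes a verbose log.
$msiArgs = @('/f', "`"$msi`"", '/l*v', "`"$log`"")
$proc = Start-Process msiexec.exe -ArgumentList $msiArgs -Wait -PassThru

# 0 = success, 3010 = success but a restart is required
$proc.ExitCode

# Reboot afterwards, as in the manual procedure:
# Restart-Computer
```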

After rebooting, the Front End service started instantly. Yeah! I checked the event log again and it looked fine. So I moved some users back from the backup registrar to the SBS and everything worked as expected.

Unfortunately, I could not find out why this happened on the branch server. There was no unplanned restart, no updates had been applied and no backups were running. I did see some events related to a connection error to the local database a few minutes before the other events were logged, but I cannot really tell whether this was the cause of the problem.

Eric

My name is Eric Schöne. I’m working as a system engineer at T-Systems Multimedia Solutions GmbH in Germany. My focuses are Microsoft Cloud Services, Unified Communications and Infrastructure.