Updated: Aug 1, 2019
We recently ran into an issue where a database server had to be restored to a date when it was working as expected, after patching had broken it.
Admins were able to connect to that server, which hosted vRA's IaaS database, and take a backup of it.
After the server and database were restored, the IaaS service under VAMI did not return to the "REGISTERED" state.
When we browsed to the component registry, we got the following exception:
<serviceStatus serviceId="5a3f7b9a-8d02-4069-b0f4-afd68679657b" serviceName="iaas-service" serviceTypeId="com.vmware.csp.iaas.blueprint.service" notAvailable="true" unregisterDenied="true">
Exception during remote status retrieval for url: https://vra-web/WAPI/api/status. Error Message 500 Internal Server Error.
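If the component registry output is saved to a file, a short script can list the services that report as unavailable. The following is only a minimal sketch: it assumes the output is well-formed XML made up of serviceStatus elements carrying the serviceName and notAvailable attributes seen above, and the function name unavailable_services and the sample wrapper element are hypothetical.

```python
import xml.etree.ElementTree as ET


def unavailable_services(xml_text):
    """Return the serviceName of every serviceStatus element with notAvailable="true".

    Assumption: the registry output is well-formed XML. Any XML namespace
    prefix on the tag is ignored so the check works with or without one.
    """
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter():
        # Strip a "{namespace}" prefix if the document uses one.
        local_tag = element.tag.rsplit("}", 1)[-1]
        if local_tag == "serviceStatus" and element.get("notAvailable") == "true":
            names.append(element.get("serviceName"))
    return names


# Hypothetical sample: the <services> wrapper and the second entry are made up.
SAMPLE = """
<services>
  <serviceStatus serviceName="iaas-service" notAvailable="true" unregisterDenied="true"/>
  <serviceStatus serviceName="example-service" notAvailable="false"/>
</services>
"""

if __name__ == "__main__":
    print(unavailable_services(SAMPLE))
```

In our case such a check would have flagged only iaas-service, which matches what VAMI showed.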
We verified ManagerService.exe.config, Web.config and [<<databasename>>].[DynamicOps.RepositoryModel].[Models]. The configuration was set correctly.
We then checked the exceptions in ManagerService/All.log:
[UTC:2019-07-25 07:09:01 Local:2019-07-25 15:09:01] [Error]: [sub-thread-Id="6" context="" token=""] Failed to ping the database. Details: System.Data.SqlClient.SqlException (0x80131904): The target principal name is incorrect. Cannot generate SSPI context.
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.ProcessSSPI(Int32 receivedLength)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlInternalConnectionTds.CompleteLogin(Boolean enlistOK)
at System.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean ignoreSniOpenTimeout, TimeoutTimer timeout, Boolean withFailover)
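On a busy system All.log can be large, so it may help to pull out just the entries that carry this error. The sketch below is an assumption-laden example, not a VMware tool: it assumes every log entry starts with the "[UTC:... Local:...] [Level]" header shown above, and the names HEADER and find_sspi_errors are hypothetical.

```python
import re

# Matches the entry header seen in the excerpt above, e.g.
# [UTC:2019-07-25 07:09:01 Local:2019-07-25 15:09:01] [Error]: ...
HEADER = re.compile(r"^\[UTC:(?P<utc>[\d\- :]+) Local:[^\]]+\] \[(?P<level>\w+)\]")

SSPI_MESSAGE = "Cannot generate SSPI context"


def find_sspi_errors(lines):
    """Return the UTC timestamps of Error entries that mention the SSPI failure."""
    timestamps = []
    for line in lines:
        match = HEADER.match(line)
        if match and match.group("level") == "Error" and SSPI_MESSAGE in line:
            timestamps.append(match.group("utc"))
    return timestamps


if __name__ == "__main__":
    # Assumption: the path to All.log is passed in by the caller; stack-trace
    # lines ("at System.Data...") have no header and are skipped.
    import sys

    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for stamp in find_sspi_errors(log_file):
            print(stamp)
```

Run it with the path to All.log as the only argument; each printed timestamp marks one failed database ping caused by the SSPI error.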
The "Cannot generate SSPI context" error is generated when SSPI uses Kerberos authentication to delegate over TCP/IP and Kerberos authentication cannot complete the necessary operations to successfully delegate the user security token to the destination computer that is running SQL Server.
This gave us a clue that there might be a trust issue between the SQL Server machine and the domain it is part of.
Verifying group and user memberships confirmed it: the trust relationship was broken. AD account logins to SSMS and to the server itself were failing.
As remediation, we had to remove the node from the domain and then join it back.
After that, AD logins to SSMS worked again and the IaaS service was registered immediately.