• Arun Nukula

Unable to connect to endpoint. Timeout occurred while processing the request


Test Connection Request while adding an endpoint fails with an exception

Unable to connect to endpoint. Timeout occurred while processing the request


Even though we verified firewall , ports , vCenter Services status still we were still encountering this exception.

When we were looking into the logs

For a working environment

Manager Service \ All.log

As seen below, when a Test Connection is initiated from vRA UI , Manager Service is responsible for initiating this request and it creates a workitem to be processed by underlying vSphereAgent

[UTC:2018-11-05 09:25:45 Local:2018-11-05 09:25:45] [Debug]: [sub-thread-Id="67" context="gjbLTAc0" token="gx3pVepj"] TestConnection Request: Name [vc.nukescloud.com], Address: [https://vc.nukescloud.com/sdk], Username: [nukescloud\nukes]

[UTC:2018-11-05 09:25:48 Local:2018-11-05 09:25:48] [Debug]: [sub-thread-Id="25" context="" token=""] TestConnection WorkItemResponse: [<?xml version="1.0" encoding="utf-16"?><workItemResponse id="9bca4157-1fd3-4e19-a7dd-117aacc5e5d6"><task type="testconnection"><result succeeded="false"><message>Test connection failed: Certificate is not trusted (RemoteCertificateChainErrors). Subject: C=US, CN=vc.nukescloud.com Thumbprint: 9C949FAEB87CBF97419CF4BFE70EB3FB8A035173</message></result><parametername="TestConnection.Endpoint.Request.ResultCode">INVALID_CERTIFICATE</parameter><parameter name="trace_id" /></task></workItemResponse>]

vSphereAgent.log

Below snippet is from vSphereAgent.log as you can see , once the workitem is created by Manager Service , vSphereAgent will process the workitem and fetch the certificate from endpoint.

2018-11-05T09:25:48.218Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Info" thread="4332"] [sub-thread-Id="5" context="" token=""] Starting : Processing Workitem ID [9bca4157-1fd3-4e19-a7dd-117aacc5e5d6] [testconnection]

2018-11-05T09:25:48.218Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Debug" thread="4332"] [sub-thread-Id="5" context="" token=""] [[]] [testconnection] TestConnection.Endpoint.Address=https://vc.nukescloud.com/sdk

2018-11-05T09:25:48.218Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Debug" thread="4332"] [sub-thread-Id="5" context="" token=""] [[]] [testconnection] TestConnection.Endpoint.Username=nukescloud\nukes

2018-11-05T09:25:48.218Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Debug" thread="4332"] [sub-thread-Id="5" context="" token=""] [[]] [testconnection] TestConnection.Endpoint.TrustThumbprint=

2018-11-05T09:25:48.218Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Debug" thread="4332"] [sub-thread-Id="5" context="" token=""] [[]] [testconnection] TestConnection.Endpoint.TrustAllCertificates=False

2018-11-05T09:25:48.218Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Debug" thread="4332"] [sub-thread-Id="5" context="" token=""] Begin test connection request....

2018-11-05T09:25:48.234Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Warning" thread="5364"] [sub-thread-Id="10" context="" token=""] Invalid certificate found: C=US, CN=vc.nukescloud.com, Untrusted certificate chain

2018-11-05T09:25:48.234Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Warning" thread="5364"] [sub-thread-Id="10" context="" token=""] Invalid certificate found: C=US, CN=vc.nukescloud.com, Untrusted certificate chain

2018-11-05T09:25:48.249Z IAAS75 vcac: [component="iaas:VRMAgent.exe" priority="Error" thread="4332"] [sub-thread-Id="5" context="" token=""] <?xml version="1.0" encoding="utf-16"?>

<boolean>false</boolean>

System.AggregateException: One or more errors occurred. ---> System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send. ---> DynamicOps.Common.Client.UntrustedCertificateException: Certificate is not trusted (RemoteCertificateChainErrors). Subject: C=US, CN=vc.nukescloud.com Thumbprint: 9C949FAEB87CBF97419CF4BFE70EB3FB8A035173

Above "Certificate is not trusted" message is not an exception but a message which states that certificate of the vCenter is not present in vRA yet.

For non-working environment

Manager Service \ All.log

For the environment where we see exception , We so see test connection request being processed but there is no workitem generated for agent to go ahead and fetch certificate

[UTC:2018-10-26 06:56:46 Local:2018-10-26 17:56:46] [Trace]: [sub-thread-Id="15" context="fW8t555E" token="3wP5GoMr"] TestConnection Request: Begin creating request record for request id [d2b48795-bdd2-4f9a-b585-2f8be4e33ac9]

[UTC:2018-10-26 06:56:46 Local:2018-10-26 17:56:46] [Trace]: [sub-thread-Id="15" context="fW8t555E" token="3wP5GoMr"] TestConnection Request: Finish creating request record for request id [d2b48795-bdd2-4f9a-b585-2f8be4e33ac9]

[UTC:2018-10-26 06:56:46 Local:2018-10-26 17:56:46] [Debug]: [sub-thread-Id="15" context="fW8t555E" token="3wP5GoMr"] TestConnection Request: Name [vcenter.nukescloud.com], Address: [https://vcenter.nukescloud.com/sdk], Username: [nukescloud\nukes]

Apart from above three lines , there was no other information in the logs to tell us why and where it's stuck.

After little bit of research found that dbo.Agent table found that there were multiple entries under AgentName for the same endpoint on the same Agent Machine , Refer the screenshot below.


To come out of this problem , we followed these steps.

  • Firstly , Backup IAAS database

  • Then RDP to IAAS Agent nodes and then uninstall proxy agent for the endpoint which cannot be added

  • Once , agent's are uninstalled , browse to the previously installed location and remove any existing traces ( there might be fe log files left )

  • Now , get back to SQL Database and then execute following query to remove any traces of the endpoint where it's failing to add from dbo.Agent table

  • In my scenario

  • vCenter was vcenter.nukescloud.com

  • dbo.Agent table had following AgentID's for my endpoint vcenter.nukescloud.com

  • AAAAA , CCCCC and DDDDD

  • Now , we need to go ahead and delete these entries from Database, we may use below SQL query to do this

  • delete from dbo.Agent where AgentID in ('AAAAA','CCCCC','DDDDD');

  • Post deletion , go ahead and reinstall the agent

  • Once done your expected to see one entry for this endpoint and the agent in dbo.Agent table and it's AgentAlive value would be set to 1

  • As you can see above if it is not working it's AgentAlive status would be set to 0

  • Now adding endpoint through vRA UI , it should work as expected

Reason why this exception occurs is very simple , when an agent is removed , it does not remove it from IaaS database. When we add an endpoint now , it always looks for the top entry for the agent in the table to assign a workitem. So it hit's the one which is not a proper proxy agent for this endpoint it would not be able to process it further causing a timeout in the UI

So when Proxy Agent is uninstalled , ensure you perform a check in dbo.Agent table under IAAS database if it's completely removed and there isn't any stale entry

|| Happy Learning ||


637 views

Subscribe Now

  • Twitter
  • Facebook Social Icon

Copyright © 2019 nukescloud