Channel: VMware Communities : Discussion List - All Communities

NSX-T 2.4: Node cert replacement via API fails with "com.vmware.nsx.management.container.exceptions.InvalidOwnerException" in nsxapi.log


I've gone through this about half a dozen times with different types of certificates, each time with the same failure message in /var/log/proton/nsxapi.log. This same process worked fine with NSX-T 2.3. Let me explain my topology.

 

  • NSX-T 2.4
  • 3 nodes
    • cz-nest-nsxtm01 (10.2.21.16)
    • cz-nest-nsxtm02 (10.2.21.17)
    • cz-nest-nsxtm03 (10.2.21.18)
  • 1 HA VIP address (ILB)
    • cz-nest-nsxtm (10.2.21.19)

 

I have followed the steps in VVD 5.0.1 and used the CertGen utility to create certificates signed by my internal enterprise CA. When replacement on cz-nest-nsxtm01 with the node cert did not work, I attempted the same steps with a self-signed cert, only to get the same failure. I upload the CA cert and then the node certs, including the one for the cluster IP. The SAN contains both FQDN and IP, so I don't think this is an issue with the cert contents.

 

I retrieve the ID of the cert in question. For more details of the cert, if I curl to /api/v1/trust-management/certificates it returns the following:

 

{
  "pem_encoded": <REDACTED>,
  "used_by": [],
  "resource_type": "certificate_signed",
  "id": "67eb0a1c-e06c-476e-980a-08519b90d16f",
  "display_name": "cz-nest-nsxtm01",
  "tags": [
    {
      "scope": "policyPath",
      "tag": "/infra/certificates/cz-nest-nsxtm01"
    }
  ],
  "_create_user": "nsx_policy",
  "_create_time": 1556977318572,
  "_last_modified_user": "nsx_policy",
  "_last_modified_time": 1556977318572,
  "_system_owned": false,
  "_protection": "REQUIRE_OVERRIDE",
  "_revision": 0
}
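For anyone repeating this, here is a minimal Python sketch of how the certificate lookup could be scripted, so the ID passed to apply_certificate can be double-checked against the display name. It assumes the list endpoint wraps its entries in a "results" array (a single-object response like the one above is handled too); the function names find_certificate and summarize are hypothetical helpers, not part of any NSX-T SDK.

```python
from typing import Optional


def find_certificate(listing: dict, display_name: str) -> Optional[dict]:
    """Return the first certificate whose display_name matches, or None."""
    # Assumption: a list response wraps the entries in a "results" array.
    # If that key is absent, treat the payload as a single certificate object.
    entries = listing.get("results", [listing])
    for entry in entries:
        if entry.get("display_name") == display_name:
            return entry
    return None


def summarize(cert: dict) -> dict:
    """Pull out the fields that matter when debugging ownership problems."""
    return {
        "id": cert.get("id"),
        "create_user": cert.get("_create_user"),
        "protection": cert.get("_protection"),
        "used_by": cert.get("used_by", []),
    }


# Sample payload built from the fields shown in the post (PEM body omitted).
sample = {
    "results": [
        {
            "id": "67eb0a1c-e06c-476e-980a-08519b90d16f",
            "display_name": "cz-nest-nsxtm01",
            "used_by": [],
            "_create_user": "nsx_policy",
            "_protection": "REQUIRE_OVERRIDE",
        }
    ]
}

print(summarize(find_certificate(sample, "cz-nest-nsxtm01")))
```

Printing the summary next to the ID used in the POST makes it easy to spot a mismatch between the certificate inspected and the certificate applied.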

 

 

I post to the necessary URI as follows:

 

curl -k -u 'admin:VMware1!' -X POST "https://cz-nest-nsxtm01.domain.com/api/v1/node/services/http?action=apply_certificate&certificate_id=5c4f0ee9-00cb-4acd-8431-07903767204a"

In response I receive:

 

{
  "error_code": 36235,
  "error_message": "Error updating certificate usage.",
  "module_name": "node-services"
}
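The request and its error response can also be sketched in Python, which helps when repeating the attempt across several nodes. This is only an illustration: apply_certificate_url and describe_response are hypothetical helper names, the sketch builds the URL without sending anything, and it assumes that a body without an "error_code" key means success.

```python
import json
from urllib.parse import urlencode


def apply_certificate_url(host: str, certificate_id: str) -> str:
    """Build the apply_certificate URL used in the curl command above."""
    query = urlencode(
        {"action": "apply_certificate", "certificate_id": certificate_id}
    )
    return f"https://{host}/api/v1/node/services/http?{query}"


def describe_response(body: str) -> str:
    """Turn a response body into a one-line summary."""
    # Assumption: any body lacking "error_code" is treated as success.
    data = json.loads(body) if body.strip() else {}
    if "error_code" in data:
        return (
            f"{data.get('module_name')} error {data['error_code']}: "
            f"{data.get('error_message')}"
        )
    return "ok"


url = apply_certificate_url(
    "cz-nest-nsxtm01.domain.com", "5c4f0ee9-00cb-4acd-8431-07903767204a"
)
error_body = (
    '{"error_code": 36235, '
    '"error_message": "Error updating certificate usage.", '
    '"module_name": "node-services"}'
)

print(url)
print(describe_response(error_body))
```

The URL would then be POSTed with the admin credentials using whatever HTTP client is at hand; that part is left out so the sketch stays side-effect free.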

 

Upon examination of /var/log/proton/nsxapi.log I find the following messages logged after the operation returns failure (markup by VS Code for convenience):

2019-05-04T13:19:18.849Z INFO http-nio-127.0.0.1-7440-exec-1 PreAuthenticatedAuthenticationProvider - - [nsx@6876 comp="nsx-manager" subcomp="manager"] User node-mgmt. Granted authorities: ''

2019-05-04T13:19:18.849Z INFO http-nio-127.0.0.1-7440-exec-1 PreAuthenticatedAuthenticationProvider - - [nsx@6876 comp="nsx-manager" subcomp="manager"] User node-mgmt. Granted authorities: ''

2019-05-04T13:19:18.876Z INFO http-nio-127.0.0.1-7440-exec-1 AuditingServiceImpl - SYSTEM [nsx@6876 audit="true" comp="nsx-manager" reqId="bbf540a7-e46c-4590-811f-b078753c526e" subcomp="manager"] UserName="node-mgmt", ModuleName="CertificateManager", Operation="GetPrivateCertificate", Operation status="success", New value=["5c4f0ee9-00cb-4acd-8431-07903767204a"]

2019-05-04T13:19:18.924Z INFO http-nio-127.0.0.1-7440-exec-2 PreAuthenticatedAuthenticationProvider - - [nsx@6876 comp="nsx-manager" subcomp="manager"] User node-mgmt. Granted authorities: ''

2019-05-04T13:19:18.925Z INFO http-nio-127.0.0.1-7440-exec-2 PreAuthenticatedAuthenticationProvider - - [nsx@6876 comp="nsx-manager" subcomp="manager"] User node-mgmt. Granted authorities: ''

2019-05-04T13:19:18.936Z INFO http-nio-127.0.0.1-7440-exec-2 TrustStoreFacadeImpl - SYSTEM [nsx@6876 comp="nsx-manager" subcomp="manager"] Reserve certificate 5c4f0ee9-00cb-4acd-8431-07903767204a

2019-05-04T13:19:18.944Z INFO http-nio-127.0.0.1-7440-exec-2 TrustStoreServiceImpl - SYSTEM [nsx@6876 comp="nsx-manager" subcomp="manager"] Reserve service type API for node 4c9f2c42-57fd-88d4-24bb-3917f5e69a12 for certificate node-cz-nest-nsxtm01

2019-05-04T13:19:18.950Z ERROR http-nio-127.0.0.1-7440-exec-2 PrincipalOwnerValidator - - [nsx@6876 comp="nsx-manager" errorCode="MP289" subcomp="manager"] XXX Principal 'node-mgmt' with role '[]' attempts to delete or modify an object of type ImmutableCertificateEntity it doesn't own. (createUser=nsx_policy, allowOverwrite=null)

2019-05-04T13:19:18.951Z INFO http-nio-127.0.0.1-7440-exec-2 AuditingServiceImpl - SYSTEM [nsx@6876 audit="true" comp="nsx-manager" reqId="5ce8722f-1c1e-4681-a181-db21e86aa72e" subcomp="manager"] UserName="node-mgmt", ModuleName="CertificateManager", Operation="CertificateReserve", Operation status="failure", New value=["5c4f0ee9-00cb-4acd-8431-07903767204a" {"service_type":"API","node_id":"4c9f2c42-57fd-88d4-24bb-3917f5e69a12"}]

2019-05-04T13:19:18.952Z INFO http-nio-127.0.0.1-7440-exec-2 NsxBaseRestController - - [nsx@6876 comp="nsx-manager" subcomp="manager"] Error in API /nsxapi/api/v1/trust-management/certificates/5c4f0ee9-00cb-4acd-8431-07903767204a?action=reserve caused by exception com.vmware.nsx.management.container.exceptions.InvalidOwnerException: {"moduleName":"common-services","errorCode":289,"errorMessage":"Principal 'node-mgmt' with role '[]' attempts to delete or modify an object of type ImmutableCertificateEntity it doesn't own. (createUser=nsx_policy, allowOverwrite=null)"}
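To pull the relevant details out of the ERROR line without reading the whole log by eye, a small Python sketch like the following could be used. The regular expression is written against the one message shown above and is an assumption about its shape, not a documented log format; parse_owner_error is a hypothetical helper name. The extracted fields show that the principal in the log is 'node-mgmt' rather than admin, and that the certificate's createUser is nsx_policy.

```python
import re
from typing import Optional

# Pattern modeled on the single PrincipalOwnerValidator message in this log.
OWNER_ERROR = re.compile(
    r"Principal '(?P<principal>[^']*)' with role '(?P<role>[^']*)' "
    r"attempts to delete or modify an object of type (?P<object_type>\S+) "
    r"it doesn't own\. "
    r"\(createUser=(?P<create_user>[^,]+), "
    r"allowOverwrite=(?P<allow_overwrite>[^)]+)\)"
)


def parse_owner_error(line: str) -> Optional[dict]:
    """Return the ownership fields from an error line, or None if no match."""
    match = OWNER_ERROR.search(line)
    return match.groupdict() if match else None


# Message portion of the ERROR line from nsxapi.log.
message = (
    "XXX Principal 'node-mgmt' with role '[]' attempts to delete or modify "
    "an object of type ImmutableCertificateEntity it doesn't own. "
    "(createUser=nsx_policy, allowOverwrite=null)"
)

print(parse_owner_error(message))
```

Running the same function over every line of nsxapi.log would show whether other certificates hit the same ownership check.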

As can be seen, it appears to be complaining about the rights assigned to the user (admin) executing the POST, which doesn't make sense because it's the admin account. My other thought was that it's refusing the operation because the 3 appliances have already been clustered. The VVD procedure makes no special mention of node leadership, but it does have the user replace the cert on each node individually before a cluster IP is assigned.

 

I've also checked the official NSX-T 2.4 documentation (doc rev. 12 April 2019; PDF p. 536), and again there is no mention of anything in this process that differs from 2.3.

 

Anyone seen (or tried) this? If I don't hear anything I'll try to break the cluster, delete the other nodes, redeploy, and try again.

 

EDIT 1:  Even if I break the cluster IP (reset action) but leave all three nodes up and try the replacement, I get the same error in the logs as before.

EDIT 2:  I destroyed all the manager nodes except the first, rebooted, and tried the replacement. It failed yet again with the same messages. So I'm pretty much out of ideas here.

