Upgrading vRealize Automation 8.x to 8.2
Updated: Nov 24, 2020
Before we initiate an upgrade of vRealize Automation 8.x to 8.2, we have to upgrade vRLCM itself to 8.2.
Once vRLCM is on 8.2, let's go ahead and walk through the steps taken to upgrade vRA to version 8.2.
User Validations
Validate Postgres Replication
I've ensured there are no Postgres replication issues by executing the command below against all three Postgres pods:
seq 0 2 | xargs -r -n 1 -I {} kubectl -n prelude exec postgres-{} -- chpst -u postgres repmgr node status
DEBUG: connecting to: "user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-0.postgres.prelude.svc.cluster.local fallback_application_name=repmgr"
Node "postgres-0.postgres.prelude.svc.cluster.local":
PostgreSQL version: 10.10
Total data size: 936 MB
Conninfo: host=postgres-0.postgres.prelude.svc.cluster.local dbname=repmgr-db user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10
Role: primary
WAL archiving: enabled
Archive command: /bin/true
WALs pending archiving: 0 pending files
Replication connections: 2 (of maximal 10)
Replication slots: 0 physical (of maximal 10; 0 missing)
Replication lag: n/a
DEBUG: connecting to: "user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-1.postgres.prelude.svc.cluster.local fallback_application_name=repmgr"
Node "postgres-1.postgres.prelude.svc.cluster.local":
PostgreSQL version: 10.10
Total data size: 933 MB
Conninfo: host=postgres-1.postgres.prelude.svc.cluster.local dbname=repmgr-db user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10
Role: standby
WAL archiving: disabled (on standbys "archive_mode" must be set to "always" to be effective)
Archive command: /bin/true
WALs pending archiving: 0 pending files
Replication connections: 0 (of maximal 10)
Replication slots: 0 physical (of maximal 10; 0 missing)
Upstream node: postgres-0.postgres.prelude.svc.cluster.local (ID: 100)
Replication lag: 0 seconds
Last received LSN: 2/DA9C5A00
Last replayed LSN: 2/DA9C5A00
DEBUG: connecting to: "user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-2.postgres.prelude.svc.cluster.local fallback_application_name=repmgr"
Node "postgres-2.postgres.prelude.svc.cluster.local":
PostgreSQL version: 10.10
Total data size: 933 MB
Conninfo: host=postgres-2.postgres.prelude.svc.cluster.local dbname=repmgr-db user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10
Role: standby
WAL archiving: disabled (on standbys "archive_mode" must be set to "always" to be effective)
Archive command: /bin/true
WALs pending archiving: 0 pending files
Replication connections: 0 (of maximal 10)
Replication slots: 0 physical (of maximal 10; 0 missing)
Upstream node: postgres-0.postgres.prelude.svc.cluster.local (ID: 100)
Replication lag: 0 seconds
Last received LSN: 2/DA9C5DA8
Last replayed LSN: 2/DA9C5DA8
My vRA 8.x environment is a distributed instance, hence it consists of three vRA nodes.
Each Postgres instance runs on one of these nodes, with data constantly replicated from the primary to the two standbys.
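If you want a single consolidated view instead of three per-node reports, repmgr can print the whole cluster from any one pod. This is a sketch assuming the same pod naming, prelude namespace, and chpst wrapper used above:

# Print role, status and upstream for all three nodes in one table
kubectl -n prelude exec postgres-0 -- chpst -u postgres repmgr cluster show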

No LB Changes Needed
I have not made any changes to the load balancer that is managing my distributed vRA instances.
Validate Pod Health
Ensure all pods are in the Running and Ready state.
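A quick way to verify this from the command line, assuming the same prelude namespace as the Postgres pods above:

# List every pod in the prelude namespace with its Ready count and status
kubectl -n prelude get pods

# Flag anything across all namespaces that is not in the Running phase
kubectl get pods --all-namespaces --field-selector=status.phase!=Running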
Trigger Inventory Sync
Trigger an inventory sync before the upgrade so that vRLCM has a current view of the environment.

Submitting Upgrade Request and Prechecks
Step 1:
Create a snapshot using vRLCM

Browse to the vRA environment and then select UPGRADE

Step 2:
This brings up the upgrade UI, where you have to select the Repository Type.
In my case, I had downloaded the 8.2 binaries beforehand and had them ready under Product Binaries.

Step 3:
This pane gives you the option to trigger an inventory sync if one was not performed earlier. If it was already done before triggering the upgrade, you may skip it.

Once the inventory sync is complete, you may proceed to the next step.
Step 4:
In this step, you have to run a precheck before performing the upgrade.

Once you click RUN PRECHECK, you are presented with a pane where you have to agree that all manual validations have been performed. This refers to the vIDM hardware resources.

The prechecks now start.

There is a failure: VMware introduced a check to ensure /services-logs has enough space on all the vRealize Automation appliances.
This is a mandatory check that should not be missed.
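You can also verify this yourself on each appliance before running the precheck. This assumes the filesystem is mounted at /services-logs, matching the logical volume shown in the resize output further below:

# Run on each vRA appliance; check size and free space of the services-logs filesystem
df -h /services-logs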

If we click VIEW under the Recommendations pane, we are presented with a pane that has all the steps to resolve the above problem.
The exception states that /dev/sdc, which is Hard Disk 3 on the virtual appliance, does not have enough space.

Remember, I had taken snapshots of my vRealize Automation appliances earlier, so before extending the disk I had to delete those snapshots (vSphere will not let you grow a virtual disk while snapshots exist).
I then extended Hard Disk 3 from 8 GB to 30 GB, adding an additional 22 GB of space.


As you can see in the screenshot below, my /dev/sdc was only 8 GB.
Even after performing the resize, the new size was not reflected.
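To see the discrepancy from the shell rather than the vSphere UI, standard LVM tooling on the appliance can compare the raw disk size against the logical volume size:

# Raw disk size as the kernel sees it (shows the new size once the disk is rescanned)
lsblk /dev/sdc

# Logical volume sizes in logs_vg (services-logs stays at ~8 GiB until the resize succeeds)
lvs logs_vg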

The resize was throwing an error:
[2020-10-08T04:41:12.050Z] Disk size for disk /dev/sdb has not changed.
[2020-10-08T04:41:12.079Z] Rescanning disk /dev/sdc...
[2020-10-08T04:41:12.222Z] Disk size for disk /dev/sdc has increased from 8589934592 to 32212254720.
[2020-10-08T04:41:12.423Z] Resizing physical volume...
Physical volume "/dev/sdc" changed
1 physical volume(s) resized / 0 physical volume(s) not resized
[2020-10-08T04:41:12.559Z] Physical volume resized.
[2020-10-08T04:41:12.722Z] Extending logical volume services-logs...
Size of logical volume logs_vg/services-logs changed from <8.00 GiB (2047 extents) to <30.00 GiB (7679 extents).
Logical volume logs_vg/services-logs successfully resized.
[2020-10-08T04:41:12.903Z] Logical volume resized.
[2020-10-08T04:41:12.916Z] Resizing file system...
resize2fs 1.43.4 (31-Jan-2017)
open: No such file or directory while opening /dev/mapper/logs_vg-services-logs
[2020-10-08T04:41:13.029Z] ERROR: Error resizing file system.
[2020-10-08T04:41:13.053Z] Rescanning disk /dev/sdd...
[2020-10-08T04:41:13.178Z] Disk size for disk /dev/sdd has not changed.
This was the same instruction present under the VIEW pane. If you hit this exception, follow Step 3 from VMware KB article 79925.
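I won't reproduce the KB steps here, but the error itself points at a missing device node: resize2fs could not open /dev/mapper/logs_vg-services-logs even though the logical volume had been extended. Generic LVM tooling can recreate the node so the filesystem resize can complete; treat this only as a sketch of that idea and the KB as the authoritative fix:

# Recreate any missing LVM device nodes under /dev/mapper
vgscan --mknodes

# Grow the filesystem to fill the already-extended logical volume
resize2fs /dev/mapper/logs_vg-services-logs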
After following the KB step, the new size is reflected and we can move forward knowing that the prechecks will be successful.

As stated earlier, after resolving the /services-logs partition sizing issue, all precheck validations were successful.

When we click NEXT, we head into the final phase of submitting the vRA upgrade request.

Once you click SUBMIT, the upgrade is initiated.