• Arun Nukula

Upgrading vRealize Automation 8.x to 8.2

Before we initiate an upgrade of vRealize Automation 8.x to 8.2, vRLCM itself has to be upgraded to 8.2.


Once vRLCM is ready on 8.2, let's go through the steps I took to upgrade vRA to version 8.2.


User validations


Validate Postgres Replication

  • I confirmed there were no Postgres replication issues by running the command below on a vRA node


seq 0 2 | xargs -r -n 1 -I {} kubectl -n prelude exec postgres-{} -- chpst -u postgres repmgr node status

DEBUG: connecting to: "user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-0.postgres.prelude.svc.cluster.local fallback_application_name=repmgr"
Node "postgres-0.postgres.prelude.svc.cluster.local":
        PostgreSQL version: 10.10
        Total data size: 936 MB
        Conninfo: host=postgres-0.postgres.prelude.svc.cluster.local dbname=repmgr-db user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10
        Role: primary
        WAL archiving: enabled
        Archive command: /bin/true
        WALs pending archiving: 0 pending files
        Replication connections: 2 (of maximal 10)
        Replication slots: 0 physical (of maximal 10; 0 missing)
        Replication lag: n/a

DEBUG: connecting to: "user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-1.postgres.prelude.svc.cluster.local fallback_application_name=repmgr"
Node "postgres-1.postgres.prelude.svc.cluster.local":
        PostgreSQL version: 10.10
        Total data size: 933 MB
        Conninfo: host=postgres-1.postgres.prelude.svc.cluster.local dbname=repmgr-db user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10
        Role: standby
        WAL archiving: disabled (on standbys "archive_mode" must be set to "always" to be effective)
        Archive command: /bin/true
        WALs pending archiving: 0 pending files
        Replication connections: 0 (of maximal 10)
        Replication slots: 0 physical (of maximal 10; 0 missing)
        Upstream node: postgres-0.postgres.prelude.svc.cluster.local (ID: 100)
        Replication lag: 0 seconds
        Last received LSN: 2/DA9C5A00
        Last replayed LSN: 2/DA9C5A00

DEBUG: connecting to: "user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-2.postgres.prelude.svc.cluster.local fallback_application_name=repmgr"
Node "postgres-2.postgres.prelude.svc.cluster.local":
        PostgreSQL version: 10.10
        Total data size: 933 MB
        Conninfo: host=postgres-2.postgres.prelude.svc.cluster.local dbname=repmgr-db user=repmgr-db passfile=/scratch/repmgr-db.cred connect_timeout=10
        Role: standby
        WAL archiving: disabled (on standbys "archive_mode" must be set to "always" to be effective)
        Archive command: /bin/true
        WALs pending archiving: 0 pending files
        Replication connections: 0 (of maximal 10)
        Replication slots: 0 physical (of maximal 10; 0 missing)
        Upstream node: postgres-0.postgres.prelude.svc.cluster.local (ID: 100)
        Replication lag: 0 seconds
        Last received LSN: 2/DA9C5DA8
        Last replayed LSN: 2/DA9C5DA8

My vRA 8.x environment is a distributed instance, so it consists of 3 vRA nodes.

Each Postgres instance runs on one of these nodes, and the primary is constantly replicated to the two standbys in the background.
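Rather than reading the three status blocks by eye, the same check can be scripted. The following is only a minimal sketch: check_repmgr_status is a hypothetical helper name of my own, and it assumes the output format shown above (a Node line followed by Role and Replication lag lines).

```shell
# Sketch only: check_repmgr_status is a hypothetical helper, not part of vRA or repmgr.
# It reads `repmgr node status` output on stdin and exits non-zero when there is
# not exactly one primary or when a standby reports a non-zero replication lag.
check_repmgr_status() {
  awk '
    /^Node "/ { node = $2; gsub(/[":]/, "", node) }
    /^[ \t]*Role:/ { role[node] = $2 }
    /^[ \t]*Replication lag:/ { lag[node] = $3 }
    END {
      bad = 0; primaries = 0
      for (n in role) {
        if (role[n] == "primary") primaries++
        else if (lag[n] != "0") { print "LAGGING: " n " (lag: " lag[n] ")"; bad = 1 }
      }
      if (primaries != 1) { print "Expected 1 primary, found " primaries; bad = 1 }
      exit bad
    }'
}
```

To use it, pipe the output of the seq/xargs/kubectl command above into check_repmgr_status. No output and an exit code of 0 means one primary and no lagging standbys.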



No LB Changes Needed

  • I did not make any changes to the Load Balancer that manages my distributed vRA 8.x instance.


Validate Pods Health

  • Ensure all pods are in the Running and Ready state
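One way to check this from the command line is to filter the pod list for anything that is not Running and fully Ready. This is a sketch under assumptions: find_unhealthy_pods is a hypothetical helper, it expects the default column layout of kubectl get pods --no-headers (NAME, READY, STATUS, ...), and it treats Completed pods as healthy.

```shell
# Sketch only: find_unhealthy_pods is a hypothetical helper.
# It reads `kubectl get pods --no-headers` output on stdin, prints every pod
# that is not Running with all containers Ready, and exits 1 if any were found.
find_unhealthy_pods() {
  awk '
    NF < 3 { next }                    # skip blank or malformed lines
    $3 == "Completed" { next }         # finished jobs are not expected to be Ready
    {
      split($2, ready, "/")            # READY column looks like "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) { print $1, $2, $3; bad = 1 }
    }
    END { exit bad + 0 }'
}

# Example (run on a vRA node; the prelude namespace is the one used above):
#   kubectl -n prelude get pods --no-headers | find_unhealthy_pods
```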


Trigger Inventory Sync

  • Trigger an inventory sync before the upgrade




Submitting Upgrade Request and Prechecks


Step:1


Create a snapshot using vRLCM


Browse to the vRA environment and then select UPGRADE


Step:2

This brings up the upgrade UI, where you have to select the Repository Type

In my case, I had downloaded 8.2 beforehand and had it ready under my Product Binaries


Step:3


This pane gives you the option to trigger an inventory sync if one was not performed earlier. If it was already done before starting the upgrade, you may skip it.

Once the inventory sync is complete, you may proceed to the next step


Step:4


In this step, a precheck has to be run before the upgrade can proceed


Once you click on run precheck, you are presented with a pane where you have to confirm that all manual validations have been performed. These validations refer to the vIDM hardware resources

Prechecks start


There is a failure. VMware introduced a check to ensure /services-logs has enough space on all the vRealize Automation appliances

This check is mandatory, so the failure has to be fixed before the upgrade can continue.


If we click on VIEW under the Recommendations pane we will be presented with a pane that has all the steps to resolve the above problem.


The exception states that /dev/sdc, which is Hard Disk 3 on the virtual appliance, does not have enough space

Remember, I had taken snapshots of my vRealize Automation appliances earlier. So to extend the disk, I first had to remove the snapshots


Then I extended Hard Disk 3 from 8 GB to 30 GB, adding 22 GB of space




As you can see in the screenshot below, my /dev/sdc was only 8 GB

Even after performing a resize, the new size was not reflected


The resize was throwing an error

[2020-10-08T04:41:12.050Z] Disk size for disk /dev/sdb has not changed.
[2020-10-08T04:41:12.079Z] Rescanning disk /dev/sdc...
[2020-10-08T04:41:12.222Z] Disk size for disk /dev/sdc has increased from  8589934592 to 32212254720.
[2020-10-08T04:41:12.423Z] Resizing physical volume...
  Physical volume "/dev/sdc" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized
[2020-10-08T04:41:12.559Z] Physical volume resized.
[2020-10-08T04:41:12.722Z] Extending logical volume services-logs...
  Size of logical volume logs_vg/services-logs changed from <8.00 GiB (2047 extents) to <30.00 GiB (7679 extents).
  Logical volume logs_vg/services-logs successfully resized.
[2020-10-08T04:41:12.903Z] Logical volume resized.
[2020-10-08T04:41:12.916Z] Resizing file system...
resize2fs 1.43.4 (31-Jan-2017)
open: No such file or directory while opening /dev/mapper/logs_vg-services-logs
[2020-10-08T04:41:13.029Z] ERROR: Error resizing file system.
[2020-10-08T04:41:13.053Z] Rescanning disk /dev/sdd...
[2020-10-08T04:41:13.178Z] Disk size for disk /dev/sdd has not changed.

The resize here followed the same instructions that are shown under the VIEW pane. If you hit this exception, follow Step#3 from KB article 79925


After this step, the new size is reflected and we can move forward, as we know that the prechecks will now be successful
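The byte counts in the resize log match the 8 GB and 30 GB figures when read as GiB, and the same arithmetic can be used to confirm the new filesystem size on each node. A minimal sketch: bytes_to_gib and mount_size_gib are hypothetical helper names, and mount_size_gib assumes GNU df (for -B1 and --output=size) is available on the appliance.

```shell
# Sketch only: both helpers are hypothetical names used for illustration.

# Convert a byte count to whole GiB (integer division, rounds down).
bytes_to_gib() {
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Print the total size of the filesystem mounted at $1 in whole GiB.
# Assumes GNU df; the reported value is rounded down, so a 30 GB disk
# may show as 29 because of filesystem overhead.
mount_size_gib() {
  size_bytes=$(df -B1 --output=size "$1" | tail -n 1)
  [ -n "$size_bytes" ] || return 1
  bytes_to_gib "$size_bytes"
}

# Values taken from the resize log above:
#   bytes_to_gib 8589934592     # prints 8
#   bytes_to_gib 32212254720    # prints 30
# On a vRA node:
#   mount_size_gib /services-logs
```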


As stated earlier, after resolving the /services-logs partition sizing issue, all precheck validations were successful


Clicking on next takes us to the final phase of submitting the vRA upgrade request

Once you click on submit, the upgrade is initiated


Upgrade


There is nothing a user has to do once an upgrade request is submitted. In my case, it took 2 hours and 35 minutes to complete both stages of the upgrade


Stage 1 is called vRealize Automation Upgrade/Patch/Internal Network Range Change


Stage 2 is called productupgradeinventoryupdate


Stage 1 in detail

  1. Starts the upgrade

  2. Checks vRealize Automation version

  3. Copies vIDM Admin token to vRA

  4. Initiates vRA upgrade

  5. Upload vRA upgrade pre-hook script

  6. Run vRA upgrade pre-hook script

  7. vRA upgrade status check

  8. Prepare vRA for an upgrade. This runs in a loop for a while until all the nodes are prepared

  9. Proceed to take a snapshot

  10. Extract vRA nodes

  11. Extract vMOID from the VMs for vRA

  12. Take a snapshot of vRA using vMOID


  13. Power On vRA using vMOID

  14. Performs Hostname and IP checks until the appliance is back

  15. Upgrade vRealize Automation is triggered

  16. This goes in a loop with upgrade status check

  17. Waits for initialization after vRA upgrade

  18. Finalization


That's it for Stage 1. It takes a lot of time: 2 hours and 35 minutes for a 3-node architecture, most of it spent in the 15th and 16th steps, which is expected since that is where the actual upgrade runs


The second stage, productupgradeinventoryupdate, takes only a few milliseconds





Logs to check during an upgrade


These are a few logs that can be monitored, or that are involved, during the upgrade

They are not listed in the order in which they are written during the upgrade

/var/log/vmware/prelude/upgrade-YYYY-MM-DD-HH-NN-SS.log
/var/log/vmware/prelude/upgrade-report-latest
/var/log/vmware/prelude/upgrade-report-latest.json
/var/log/deploy.log
/opt/vmware/var/log/vami/vami.log
/opt/vmware/var/log/vami/updatecli.log
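Not every one of these files exists at every point in the upgrade, so it helps to filter the list before following them. The snippet below is a sketch, not a VMware-provided tool: existing_logs is a hypothetical helper, and the upgrade-*.log glob in the example is my assumption based on the timestamped file name pattern above.

```shell
# Sketch only: existing_logs is a hypothetical helper.
# It prints each path given as an argument that exists on this node.
existing_logs() {
  for f in "$@"; do
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}

# Example (run on a vRA node). The upgrade-*.log glob is an assumption derived
# from the upgrade-YYYY-MM-DD-HH-NN-SS.log naming pattern:
#   tail -F $(existing_logs /var/log/vmware/prelude/upgrade-*.log \
#       /var/log/deploy.log /opt/vmware/var/log/vami/vami.log \
#       /opt/vmware/var/log/vami/updatecli.log)
```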

We will deep-dive into the upgrade from a logs perspective in my next blog





Copyright © 2019 nukescloud