Thursday, December 1, 2016

Pivotal Cloud Foundry - Stemcell Upgrade



I have been working on some upgrade scenarios for a PCF environment, and this is one of them.

A Stemcell is an OS image that contains a bare-minimum OS with a few utilities, agents, and configuration. The Cloud Foundry BOSH team frequently releases new Stemcell versions, for example to address a security vulnerability, and when that happens the Stemcell must be upgraded. Let's do that...


Existing Setup

       Ops Mgr and BOSH Director @ v1.7.14
       ELR @ v1.7.35
       Stemcell @ ubuntu trusty 3233.3

Goal

       Upgrade Stemcell to "ubuntu trusty 3233.4"

I have a simple PHP test application (3 instances) deployed, and I am hitting it continuously with a while loop and curl to simulate application usage:
[root@myApp php]# while true; do curl myphpapp.pivotal.local; echo; sleep 1; done
This is a Test App, Current Time [03:50:26]
This is a Test App, Current Time [03:50:27]
This is a Test App, Current Time [03:50:28]
This is a Test App, Current Time [03:50:29]
...
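
Watching the output scroll works, but a short outage is easy to miss. The loop can also count failed requests, which makes downtime during the upgrade obvious. The following is a minimal sketch rather than what was run above: the `monitor` function and the `FETCH` and `INTERVAL` variables are hypothetical names introduced here, and only the curl flags (`--silent`, `--fail`, `--max-time`) are standard.

```shell
#!/bin/sh
# Sketch: probe a URL repeatedly and count successes and failures.
# FETCH and INTERVAL are hypothetical knobs introduced for this example;
# FETCH can be overridden so the loop can be dry-run without a network.
FETCH=${FETCH:-"curl --silent --fail --max-time 5"}

monitor() {
  url=$1      # URL to probe
  tries=$2    # number of probes to send
  ok=0
  failed=0
  i=0
  while [ "$i" -lt "$tries" ]; do
    # $FETCH is intentionally unquoted so its flags are split into words
    if $FETCH "$url" >/dev/null 2>&1; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))
      echo "$(date +%T) FAILED $url"
    fi
    i=$((i + 1))
    sleep "${INTERVAL:-1}"
  done
  # Summary line printed once all probes are done
  echo "ok=$ok failed=$failed"
}
```

For example, `monitor myphpapp.pivotal.local 600` would send 600 probes about one second apart, print a timestamped line for each failure, and end with a summary line of the form `ok=... failed=...`.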

Now let's start the upgrade steps:
  1. Go to the Ops Manager Installation Dashboard, open the ELR tile, select Settings, then Stemcell, and upload the new Stemcell to ELR.
  2. Go back to the Installation Dashboard and apply the changes. This starts the whole upgrade process.
  3. The upgrade follows the "Canary" method: BOSH first tries to upgrade a small number of servers (usually 1), the "Canary". Only if that succeeds are the remaining servers upgraded.
  4. During this process, my application remained accessible without any issue.
  5. ** Because I am running this lab without high availability due to limited resources, some components were unavailable during the upgrade, and I could not push new apps or make changes to running apps. ** I also turned "VM resurrection" OFF during the process to avoid any conflicts.
  6. Normally you should not see this happen, as a Production environment should always be designed for High Availability, so that the unavailability of one instance/server of a component during the upgrade does not affect the other instances/servers of that component.
  7. After a while, the upgrade completes successfully.
  8. Verify that the new Stemcell version (ubuntu trusty 3233.4) is now in use.
  9. Turn "VM resurrection" back ON at the end :)
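
The "Canary" idea from the steps above can be sketched in a few lines of shell. This is only an illustration of the pattern, not how BOSH implements it: `canary_rollout` and the update command passed to it are hypothetical names, and the instances are updated strictly one at a time as a simplification.

```shell
#!/bin/sh
# Sketch of a canary-style rollout. This is NOT BOSH's implementation.
# Usage: canary_rollout UPDATE_CMD CANARY_COUNT INSTANCE...
# UPDATE_CMD is a hypothetical command or function that upgrades one
# instance and exits non-zero on failure.

canary_rollout() {
  update_cmd=$1
  canaries=$2
  shift 2

  done_count=0
  for instance in "$@"; do
    if ! "$update_cmd" "$instance"; then
      if [ "$done_count" -lt "$canaries" ]; then
        # A canary failed: nothing after it has been touched
        echo "canary $instance failed; remaining instances left untouched"
      else
        echo "$instance failed after canaries passed; stopping rollout"
      fi
      return 1
    fi
    done_count=$((done_count + 1))
  done
  echo "updated $done_count instances"
}
```

The first `CANARY_COUNT` instances act as the canaries. If one of them fails, the function returns before any other instance is updated, which is what keeps a bad Stemcell from taking down every server of a component at once.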

That's it. The upgrade is successful. I hope this is useful.


