Alcatel Virtual Chassis – neue Firmware als ISSU

Bisher war ein Software Update in einem Virtuellen Chassis immer mit einer Downtime verbunden.

Man musste die Software einspielen und dann beide Member mit der neuen Version booten.

Da die 6900 doch eine gewisse Zeit zum booten brauchen, gab es einen Ausfall im Minuten Bereich.
Mit den neuen Releases (7.3.1.748 und 7.3.2.374R01) wird nun die Möglichkeit eines ISSU (In-Service Software Upgrade) unterstützt.

Vereinfacht spielt man in einen Virtual-Chassis System die Software ein und stößt das Upgrade an.
Dann wird als erstes der Secondary mit der Software versorgt und damit gebootet. Ist er hochgefahren und bereit übernimmt er die Masterrolle und der vorher aktive bekommt die neue Firmware.
Nachdem auch die Box gebootet hat ist das VC mit der neuen Software bereit, der Ausfall sollte dann minimal sein.

Voraussetzung das neue Image (Tos.img und das passende issu_version file) liegt auf dem Master in einem Verzeichnis z.B. new_firm

Konkret:
Kopieren der Konfig

cp working/vcboot.cfg new_firm
cp working/vcsetup.cfg new_firm

Abspeichern (sicher ist sicher)

write memory flash-synchro  

Dann geht es los:

issu from new_firm
Are you sure you want an In Service System Upgrade? (Y/N) : y

Thu Sep 26 14:19:55 : ChassisSupervisor issuMgr info message:
+++ VCISSU: validating images on master. (directory "new_firm")

Thu Sep 26 14:20:16 : ChassisSupervisor issuMgr info message:
+++ VCISSU: images validated.
+++ VCISSU: all slaves copying images...

Thu Sep 26 14:21:02 : ChassisSupervisor issuMgr info message:
+++ VCISSU: slave 1 finished copying images.
+++ VCISSU: all slaves have copied images.
+++ VCISSU: slave 1 is rebooting

Thu Sep 26 14:21:24 : vcmCmm port_mgr info message:
+++ CMM:vcmCMM_client_rx_pm@1412: VFL link 2/0 down (last 2/1/20) [L2]

Thu Sep 26 14:21:24 : ChassisSupervisor bootMgr info message:
+++ bootMgrVCMTopoDataEventHandler: remote chassis 1 no longer in the topology - closing connections

Thu Sep 26 14:21:24 : vcmCmm ipc info message:
+++ CMM:vcmCMM_peer_disconnected@1756: Remote endpoint (chassis 1, slot 65)

Thu Sep 26 14:21:24 : ChassisSupervisor prbDB info message:
+++ fanTray 1,chassis 1 extracted

Thu Sep 26 14:21:24 : ChassisSupervisor issuMgr info message:
+++ VCISSU: slave 1 topology change.  Waiting for shared memory update.

Thu Sep 26 14:21:24 : ChassisSupervisor vcReloadMgr info message:
+++ Remote chassis 1 slot 65 disconnected - status 0

Thu Sep 26 14:21:55 : ChassisSupervisor vcReloadMgr info message:
+++ Reload transaction timeout - canceling

Auf dem Slave sieht man:

Thu Sep 26 14:20:16 : flashManager FlashMgr Main info message:
+++ File vcsetup.cfg differs in /flash/myissu and /flash/new_firm - copying file

Thu Sep 26 14:20:28 : flashManager FlashMgr Main info message:
+++ Image file Tos.img differs - copying file- duplicated 1 times!

Thu Sep 26 14:21:02 : ChassisSupervisor issuMgr alert message:
+++ VCISSU: Slave Chassis rebooting under command from master chassis

Thu Sep 26 14:21:02 : flashManager FlashMgr Main info message:
+++ Verifying image directory new_firm on CMM flash

Thu Sep 26 14:21:02 : ChassisSupervisor issuMgr alert message:
+++ VCISSU: boot manager is reloading from image directory "new_firm".

Thu Sep 26 14:21:24 : ChassisSupervisor reloadMgr info message:
+++ reloadMgrSetupReboot: rebooting primary CMM

Thu Sep 26 14:21:24 : ChassisSupervisor bootMgr info message:
+++ bootMgrRebootCMM: Rebooting CMM
[..]

(none) login: 
Thu Sep 26 14:23:48 : ChassisSupervisor Power Mgr info message:
+++ PS 1 is up

Thu Sep 26 14:23:49 : vcmCmm init info message:
+++ CMM:vcm_mip_eoic_cmm@532: Virtual-chassis mode (chassis 1, protocol 'vcm', max 2) [L0]

Thu Sep 26 14:23:49 : ChassisSupervisor CS Main info message:
+++ Updating chassis ID to 1

Thu Sep 26 14:23:49 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 1 inserted
Alcatel-Lucent OS6900-X40 7.3.2.374.R01 Service Release, August 30, 2013.
Copyright(c), 1994-2013 Alcatel-Lucent. All Rights reserved.


Thu Sep 26 14:23:54 : ipv4 itf info message:
+++ Interface EMP-CHAS1 192.168.252.1/255.255.255.0
chassis mode is
2

Thu Sep 26 14:25:53 : vcmCmm chas_sup info message:
+++ CMM:vcmCMM_cs_handle_chassis_ready@3602: Chassis 1 ready (data 0) [L1]

Thu Sep 26 14:25:56 : vcmCmm port_mgr info message:
+++ CMM:vcmCMM_client_rx_pm@1551: VFL link 1/0 up (pri 1/1/1:0x0) [L2]

Thu Sep 26 14:25:56 : vcmCmm protocol info message:
+++ CMM:vcmCMN_protocol_ready_update_cb@13348: Chassis 1, role Slave, status Running, master 2 [L3]

Thu Sep 26 14:25:57 : vcmCmm ipc info message:
+++ CMM:vcmCMM_peer_connected@1792: Remote endpoint (chassis 2, slot 65) [L4]

Thu Sep 26 14:25:57 : vcmCmm node_sync info message:
+++ CMM:notify_sync_complete@757: Sync complete 'multi node' (peers 1, conn 1, sync 1) [L5]

Thu Sep 26 14:25:59 : healthCmm main warning message:
+++ CPU crossed Above the Threshold Limit!

Thu Sep 26 14:26:10 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 2 inserted

Thu Sep 26 14:26:10 : isis_spb_0 TASK info message:
+++ Base MAC changed to e8:e7:32:0e:65:69

Thu Sep 26 14:26:10 : ChassisSupervisor bootMgr info message:
+++ Sending VC Takeover to NIs and applications [L6]

Thu Sep 26 14:26:10 : ConfigManager CONFD info message:
+++ Slave Configuration Manager waiting for Master sync
+++ Slave Configuration Manager Connecting with Master

Thu Sep 26 14:26:12 : isis_spb_0 TASK info message:
+++ VC Takeover: chassis_id:1

Thu Sep 26 14:26:12 : ipv4 itf info message:
+++ Interface EMP-CHAS1 192.168.252.1/255.255.255.0

Thu Sep 26 14:26:12 : SNMP aluSubagent_thread info message:
+++ snmp_vc_takeover_callback | VC Takeover complete

Thu Sep 26 14:26:12 : appfpCmm main error message:
+++ Master running old version : 

Thu Sep 26 14:26:13 : qosNi Info info message:
+++ VC Takeover in progress.
+++ VC Takeover complete.

Thu Sep 26 14:26:13 : appfpNi main error message:
+++ Master running old version : 

Thu Sep 26 14:26:16 : ChassisSupervisor bootMgr info message:
+++ Received VC Takeover Complete event from all apps [L7]
Chassis Supervision: CMM has reached the ready state [L8]

Thu Sep 26 14:26:18 : ChassisSupervisor reloadMgr info message:
+++ Redundancy time expired - updating next running to new_firm

Thu Sep 26 14:26:50 : vcmCmm port_mgr info message:
+++ CMM:vcmCMM_client_rx_pm@1569: VFL link 1/0 down (last 1/1/20) [L2]

Thu Sep 26 14:26:50 : vcmCmm config info message:
+++ CMM:vcmCMM_check_emp_exec@15094: No EMP query - controlled reset (chassis 2, mac e8:e7:32:0e:66:89)

Thu Sep 26 14:26:50 : ChassisSupervisor bootMgr info message:
+++ bootMgrVCMTopoDataEventHandler: remote chassis 2 no longer in the topology - closing connections

Thu Sep 26 14:26:50 : healthNi main info message:
+++ [hmon_reconnect_to_cmm:156] 

Thu Sep 26 14:26:50 : vcmCmm ipc info message:
+++ CMM:vcmCMM_peer_disconnected@1907: Remote endpoint (chassis 2, slot 65)

Thu Sep 26 14:26:50 : ipni udprelay info message:
+++ UDPRelay disconnected (104)

Thu Sep 26 14:26:50 : ChassisSupervisor prbDB info message:
+++ fanTray 1,chassis 2 extracted

Thu Sep 26 14:26:50 : ChassisSupervisor vcReloadMgr info message:
+++ Remote chassis 2 slot 65 disconnected - status 0

Thu Sep 26 14:26:50 : ChassisSupervisor bootMgr alert message:
+++ Master chassis ID has changed (1/65) - triggering a Virtual Chassis Takeover

Thu Sep 26 14:26:50 : ChassisSupervisor bootMgr info message:
+++ Sending VC Takeover to NIs and applications [L6]
+++ bootMgrSendSecondaryMasterChanged: entry

Thu Sep 26 14:26:55 : ChassisSupervisor bootMgr info message:
+++ Received VC Takeover Complete event from all apps [L7]
Chassis Supervision: CMM has reached the ready state [L8]

Thu Sep 26 14:26:57 : ChassisSupervisor reloadMgr info message:
+++ Redundancy time expired - updating next running to new_firm

Thu Sep 26 14:26:59 : healthCmm main warning message:
+++ CPU crossed Below the Threshold Limit!

Der Slave hat also mit der neuen Firmware geboote.

Es ist dann Master geworden und hat das Management übernommen, damit der vorherige Master das Update machen kann

Die Ausgabe auf dem (Ex)Master dabei:

Thu Sep 26 14:25:56 : vcmCmm port_mgr info message:
+++ CMM:vcmCMM_client_rx_pm@1394: VFL link 2/0 up (pri 2/1/1:0x20000) [L2]

Thu Sep 26 14:25:57 : vcmCmm ipc info message:
+++ CMM:vcmCMM_peer_connected@1634: Remote endpoint (chassis 1, slot 65) [L4]

Thu Sep 26 14:25:57 : vcmCmm port_lib error message:
+++ CMM:vcmCMM_process_config_sync_msg@4010: Get port info (vfl 0, ifidx 40000257, ret -3)

Thu Sep 26 14:26:10 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 1 inserted

Thu Sep 26 14:26:10 : ChassisSupervisor issuMgr info message:
+++ VCISSU: slave 1 updated shared memory. Waiting for slave vc takeover events.

Thu Sep 26 14:26:15 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 1 inserted
For Chassis 1 slot 1 flag is 1 :haVlanHandleCsNIUp

Thu Sep 26 14:26:16 : evbCmm main error message:
+++ plGetUportRangeFromChassisSlot failed: 1/2, rv=-30

Thu Sep 26 14:26:16 : rmon main error message:
+++ plGetUportRangeFromChassisSlot chass: 1 slot: 2 ret: -30

Thu Sep 26 14:26:16 : evbCmm main error message:
+++ plGetUportRangeFromChassisSlot failed: 1/3, rv=-30

Thu Sep 26 14:26:16 : rmon main error message:
+++ plGetUportRangeFromChassisSlot chass: 1 slot: 3 ret: -30

Thu Sep 26 14:26:16 : ChassisSupervisor issuMgr info message:
+++ VCISSU: slave 1 finished vc takeover events.  Waiting for vc ready acknowledgement.

Thu Sep 26 14:26:20 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 1 inserted

Thu Sep 26 14:26:25 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 1 inserted

Thu Sep 26 14:26:30 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 1 inserted

Thu Sep 26 14:26:35 : ChassisSupervisor prbDB info message:
+++ PS 1,chassis_id 1 inserted

Thu Sep 26 14:26:46 : ChassisSupervisor issuMgr info message:
+++ VCISSU: slave 1 acknowledged vc ready. VCISSU complete for this slave.

Thu Sep 26 14:26:46 : ChassisSupervisor issuMgr alert message:
+++ VCISSU: Master Chassis rebooting at end of VCISSU procedure

Thu Sep 26 14:26:46 : flashManager FlashMgr Main info message:
+++ Verifying image directory new_firm on CMM flash

Thu Sep 26 14:26:46 : ChassisSupervisor issuMgr alert message:
+++ VCISSU: boot manager is reloading from image directory "new_firm".

Thu Sep 26 14:26:49 : ChassisSupervisor reloadMgr info message:
+++ reloadMgrSetupReboot: rebooting primary CMM

Thu Sep 26 14:26:49 : ChassisSupervisor bootMgr info message:
+++ bootMgrRebootCMM: Rebooting CMM

Thu Sep 26 14:26:49 : MIP_GATEWAY mipgwd info message:
+++ msgRouter: Client unavailable, dropping response msg=MIP_SETF_RSP(15) msg_id(55509069) (APPID_CHASSISUPER(64/0) -> APPID_CLI(67
Disabling ports
All ports disabled

Thu Sep 26 14:26:49 : ChassisSupervisor bootMgr alert message:
+++ _bootMgrRebootCMM: rebooting system

Thu Sep 26 14:26:50 2013 :: Local Flash Sync complete - fail watchdog is 0
Thu Sep 26 14:26:50 2013 :: Unmounting flash


[..]

Nachdem das durch ist haben beide Maschinen die neue Firmware. (Hier ein Update von 7.3.1.748 auf 7.3.2.374)