March 16, 2017: Cumulative Update 4 Extended Maintenance

This maintenance window had been scheduled for 5:00 AM to 7:30 AM on Friday, March 17th. Due to an unexpected issue during installation, the window was extended to approximately 9:30 AM. Additional details are available below.

Updated patch set (Q2 2016 CU4)

Upgraded the patch set to the Q2 2016 Service Pack with Cumulative Update 4.

  • Numerous quality improvements and bug fixes
  • Security updates

Database firewall update

Updated the database firewall to enable the creation of an additional standby database. Our current standby database is falling out of warranty and may need to be replaced.

Incident analysis

During the upgrade process, Blackboard's installer makes several database changes. Due to an oversight in our maintenance checklist, an open connection from bbprd3 retained a lock on one of the database objects that was scheduled to be modified (bblearn.is_commit_disabled). The installer running on bbprd1 was unable to acquire a lock on that object, and timed out with an ORA-04021 error at 5:42 AM.
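The kind of checklist step that would have caught this can be sketched in a few lines of Python. This is only an illustration, not the check DLT runs: it assumes the list of held locks has already been pulled from the database by some other means, and the `HeldLock` record, the `find_blocking_sessions` helper, the session IDs, and the second host and object are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeldLock:
    """One lock held by a database session (hypothetical record layout)."""
    session_id: int
    host: str
    owner: str
    object_name: str


def find_blocking_sessions(held_locks, objects_to_modify):
    """Return the held locks that touch an object the installer will modify.

    objects_to_modify is an iterable of (owner, object_name) pairs.
    Names are compared case-insensitively.
    """
    targets = {(owner.lower(), name.lower()) for owner, name in objects_to_modify}
    return [
        lock for lock in held_locks
        if (lock.owner.lower(), lock.object_name.lower()) in targets
    ]


# Illustrative data only: session IDs, bbprd2, and some_other_object are made up.
held = [
    HeldLock(101, "bbprd3", "bblearn", "is_commit_disabled"),
    HeldLock(102, "bbprd2", "bblearn", "some_other_object"),
]

blockers = find_blocking_sessions(held, [("bblearn", "is_commit_disabled")])
for lock in blockers:
    print(f"session {lock.session_id} on {lock.host} holds "
          f"{lock.owner}.{lock.object_name}")
```

Run before the installer starts, a check like this would flag the connection from bbprd3 so it could be closed before the upgrade tried to lock the same object.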

Technical staff made several attempts to safely clear the lock and re-run the installer, including a database restart, but the installer continued to fail with the ORA-04021 timeout error. Because the restart should have cleared all sessions, a race condition within the upgrade installer may have caused the subsequent timeouts.

Once it was clear that maintenance would run past the scheduled 7:30 AM end time, technical staff decided to investigate rollback options. Because the upgrade installer had run but had not completed, the Blackboard application servers were left in an uncertain state. Emergency contact with vendor support determined that the safest options for restoring service were (a) a complete restore from system backup, or (b) finding a way to complete the installation.

Technical staff decided to make one more attempt at running the installer, with the restore option available as a failsafe.

At 8:05 AM, technical staff were able to clear the database lock by forcibly bouncing the session which held it. This allowed the remainder of the upgrade to proceed as planned, and Blackboard returned to normal operation at 9:11 AM.
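For reference, the elapsed times implied by the timestamps above can be worked out with a short Python sketch. The times come from this report; the `at` helper and the variable names are just illustrative.

```python
from datetime import datetime


def at(clock):
    """Parse a clock time on the maintenance date (Friday, March 17, 2017)."""
    return datetime.strptime(f"2017-03-17 {clock}", "%Y-%m-%d %I:%M %p")


scheduled_end = at("7:30 AM")     # planned end of the maintenance window
installer_error = at("5:42 AM")   # first ORA-04021 timeout
lock_cleared = at("8:05 AM")      # blocking session cleared
service_restored = at("9:11 AM")  # Blackboard back to normal operation


def minutes(delta):
    """Convert a timedelta to whole minutes."""
    return int(delta.total_seconds() // 60)


overrun_min = minutes(service_restored - scheduled_end)
blocked_min = minutes(lock_cleared - installer_error)
finish_min = minutes(service_restored - lock_cleared)

print(f"Overrun past scheduled end: {overrun_min} min")
print(f"Time spent blocked on the lock: {blocked_min} min")
print(f"Time to finish once unblocked: {finish_min} min")
```

By these figures, service returned 1 hour 41 minutes after the scheduled end, and most of the delay (2 hours 23 minutes) was spent working on the lock rather than running the upgrade itself.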

Future action steps

DLT will hold a peer review of maintenance checklists before any major upgrade installation.

DLT will confirm updated steps for a system restore with new Blackboard releases.

DLT will confirm the updated escalation process for emergency vendor contact.