Mandatory NVIDIA driver update

Scheduled Maintenance Report for Farm HPC cluster

Completed

The drivers of all GPU nodes have been updated and the nodes released back into service.
Posted May 06, 2025 - 13:10 PDT

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.
Posted May 06, 2025 - 08:00 PDT

Scheduled

NVIDIA has notified users of a high-severity vulnerability in its GPU drivers for Linux that could allow an unprivileged user to escalate privileges (https://nvidia.custhelp.com/app/answers/detail/a_id/5630). Under UCOP IS-3 policy, we are required to patch affected systems as soon as possible.

As a result, we will patch the NVIDIA drivers on all HPC GPU systems and reboot them, starting at 8:00 a.m. on May 6th, 2025. Jobs running on HPC GPUs at that time will be killed by the reboot, and new jobs will not be able to start until patching is complete. We expect the maintenance to last until 6:00 p.m. on the same day.

Please email hpc-help@ucdavis with any questions.
Posted Apr 25, 2025 - 16:04 PDT
This scheduled maintenance affected: bgpu, gpuh, and gpum.