Identified - nas-12-2 has suffered from multiple disk failures. Admins are investigating the best path forward.

The following group directories are currently unavailable:

awhitehegrp
millermrgrp
millsgrp
runciegrp
weimergrp
yujingrp

The following home directories are unavailable:

aavalos7
awhitehe
barao
bcbaikie
bcweimer
berdeja
crice
crios
cschles
dglemay
djprince
dkblaufu
drbandoy
eabernat
ecgranad
edkoch
emmaluu
eoziolor
fengq
hahudson
hemstrow
hxhu
jagill
jajpark
jamcgirr
jassim
jcariute
jdowen
jenwash
jmiller1
jroach
jrwashab
jxnliu
katng23
ljcohen
madarm11
mam12n
mary363
millermr
mlyjones
mmosmond
motch
mtreiber
namcnabb
nmariano
nreid
pjseba
profeta
prvasque
psbapat
rsbrenna
sakre
saumyaw
scsastry
seboles
sejoslin
smhigdon
spatel23
tmbolt
vfbetsis
vpdunne
wolfie12
xmixu
yoxue
ytakim
ywdong

Mar 31, 2025 - 18:09 PDT

Component status and uptime over the past 90 days:

Login: Partial Outage, 99.21% uptime
Storage: Partial Outage, 99.19% uptime
File transfer node: Operational, 100.0% uptime
high2,med2,low2: Operational, 99.97% uptime
high,med,low: Operational, 99.97% uptime
bmh,bmm: Operational, 99.97% uptime
bigmemh,bigmemm: Operational, 99.97% uptime
bgpu: Operational, 99.97% uptime
gpuh,gpum: Operational, 99.97% uptime
Email: Operational, 100.0% uptime
Virtualization: Operational, 100.0% uptime
Proxmox Virtualization Nodes: Operational, 100.0% uptime
Ganetti cluster: Operational, 100.0% uptime
Slurm: Operational, 100.0% uptime
Software: Operational, 84.94% uptime

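To put the uptime percentages in the component list above into more familiar terms, the following minimal sketch converts a percentage into implied hours of downtime. It assumes each percentage is measured over the full 90-day window shown on the page; the function name `downtime_hours` is illustrative and not part of any tooling on this cluster.

```python
def downtime_hours(uptime_percent: float, window_days: int = 90) -> float:
    """Return the hours of downtime implied by an uptime percentage.

    Assumption: the percentage covers the whole window (90 days by default).
    """
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    window_hours = window_days * 24
    return (100.0 - uptime_percent) / 100.0 * window_hours


if __name__ == "__main__":
    # Figures copied from the component list above.
    for name, pct in [("Login", 99.21), ("Storage", 99.19), ("Software", 84.94)]:
        print(f"{name}: about {downtime_hours(pct):.1f} hours down in 90 days")
```

Under that assumption, 99.21% uptime for Login corresponds to roughly 17 hours of downtime over 90 days, while 84.94% for Software corresponds to roughly 325 hours.
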
Apr 2, 2025

No incidents reported today.

Apr 1, 2025

No incidents reported.

Mar 31, 2025
Completed - The scheduled maintenance has been completed.
Mar 31, 18:00 PDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Mar 31, 17:00 PDT
Scheduled - Some group and home directories are unable to mount on the login node. An emergency reboot of the login node is scheduled for 5pm today. This will not impact any sbatch jobs, though it will cause all srun jobs launched from the login node to fail.
Mar 31, 10:58 PDT
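
Since the notice above says sbatch jobs are unaffected by a login-node reboot while srun jobs launched from the login node fail, one way to protect work during such a reboot is to submit it as a batch job instead. The sketch below is a hedged illustration, not site-provided tooling: `build_sbatch_command` and `submit` are hypothetical helpers, the job name and workload are placeholders, and it assumes site defaults for partition, time limit, and memory are acceptable. `--job-name`, `--parsable`, and `--wrap` are standard sbatch options.

```python
import shlex
import subprocess


def build_sbatch_command(command: str, job_name: str = "batch-job") -> list[str]:
    """Build an sbatch invocation that runs `command` as a batch job.

    The default job name is a placeholder; no partition or resource
    options are set, so site defaults are assumed.
    """
    return ["sbatch", f"--job-name={job_name}", "--parsable", f"--wrap={command}"]


def submit(command: str, job_name: str = "batch-job") -> str:
    """Submit the job and return sbatch's output.

    With --parsable this is the job ID, possibly followed by a cluster name.
    """
    result = subprocess.run(
        build_sbatch_command(command, job_name),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical workload; substitute the command you would have given to srun.
    argv = build_sbatch_command("python analyze.py --input data.csv")
    print(shlex.join(argv))
```

Running the file only prints the command it would submit; calling `submit(...)` on a node where `sbatch` is available performs the actual submission.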
Mar 30, 2025

No incidents reported.

Mar 29, 2025

No incidents reported.

Mar 28, 2025

No incidents reported.

Mar 27, 2025
Resolved - nas-5-2 is once again correctly serving data.
Mar 27, 11:59 PDT
Monitoring - nas-5-2 has been rebooted and verified to be back in service. It is taking a very high load of writes, so access will be sluggish until backed-up jobs catch up.
Mar 27, 09:39 PDT
Identified - nas-5-2 has crashed. Any home or group directories served from it are currently hung. Admins are investigating.
Mar 27, 09:08 PDT
Mar 26, 2025

No incidents reported.

Mar 25, 2025

No incidents reported.

Mar 24, 2025

No incidents reported.

Mar 23, 2025

No incidents reported.

Mar 22, 2025

No incidents reported.

Mar 21, 2025

No incidents reported.

Mar 20, 2025

No incidents reported.

Mar 19, 2025

No incidents reported.