J. Kissel, R. McCarthy, M. Pirello, O. Patane, D. Barker, B. Weaver

2025-04-06 Power outage: LHO:83753

Among the things that did not recover nicely from the 2025-04-06 power outage was the +18V DC power supply to the SUS ITMY / ITMX / BS rack, SUS-C5. The power supply lives in VDC-C1 U23-U21 (left-hand side if staring at the rack from the front); see D2300167. More details to come, but we replaced both +/-18V power supplies, and the SUS ITMY PUM OSEM satamp did not survive the power-up, so we replaced that too.

Took out:
+18V Power Supply S1300278
-18V Power Supply S1300295
ITMY PUM SatAmp S1100122

Replaced with:
+18V Power Supply S1201919
-18V Power Supply S1201915
ITMY PUM SatAmp S1000227
There was a PSL Beckhoff chassis that needed to be powered on. There is an alog saying that I configured the PSL PLC and IOC to start automatically, so maybe that powered-off chassis is what kept them from doing so?

I physically power cycled the BRS Beckhoff machine at end X. It was unreachable from remote desktop and was in a bad, frozen state when I connected to it from the KVM switch.

I started the end X NCAL PLC and IOC, the end X mains power monitoring PLC and IOC, and the corner station mains power monitoring PLC and IOC.
On the end X mains power monitoring Beckhoff machine I had to enable the tcioc firewall profile for the private network as well.
Oli, Ibrahim, RyanC
We took a look at the OSEMs' current positions for the suspensions post power outage to make sure the offsets are still correct. The previously referenced "golden time" was GPS 1427541769 (the last DRMI lock before the vent). While we did compare against this time, we mainly set them back to where they were before the power outage. (A sketch of one way to pull these archived values is below, after the lists of values.)
Input:
IM1_P: 368.5 -> 396.1, IM1_Y: -382.7 -> -385.4
IM2_P: 558.0 -> 792.0, IM2_Y: -174.7 -> -175.7
IM3_P: -216.3 -> -207.7, IM3_Y: 334.0 -> 346.0
IM4_P: -52.4 -> -92.4, IM4_Y: 379.5 -> 122.5
SR2_P: -114.3 -> -117.6, SR2_Y: 255.2 -> 243.6
SRM_P: 2540.3 -> 2478.3, SRM_Y: -3809.1 -> -3825.1
SR3_P: 439.8 -> 442.4, SR3_Y: -137.7 -> -143.9
Output:
ZM6_P: 1408.7 -> 811.7, ZM6_Y: -260.1 -> -206.1
OM1_P: -70.9 -> -90.8, OM1_Y: 707.2 -> 704.5
OM2_P: -1475.8 -> -1445.0, OM2_Y: -141.2 -> -290.8
OM3_P: Didn't adjust, OM3_Y: Didn't adjust
Input:
PRM_P: -1620.8 -> -1672 (Changed by -51.2), PRM_Y: 579.6 -> -75.6 (Changed by -655.2)
PR2_P: 1555 -> 1409 (Changed by -146), PR2_Y: 2800.7 -> -280.8 (Changed by -3081.5)
PR3_P: -122.2 -> -151 (Changed by -28.8), PR3_Y: -100 -> -232.4 (Changed by -132.4)
MC1_P: 833.3 -> 833.3, MC1_Y: -2230.6 -> -2230.6 (No change)
MC2_P: 591.5 -> 591.5, MC2_Y: -580.4 -> -580.4 (No change)
MC3_P: -20.3 -> -20.3, MC3_Y: -2431.1 -> -2431.1 (No change)
Attached are plots showing the offsets (and their relevant M1 OSEMs) before and after each suspension was realigned.
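For reference, here is a minimal sketch of one way to pull these archived values back out, assuming NDS2/frame access from a control room machine with gwpy installed. This is not necessarily how the numbers above were actually read off; the channel shown is just the OM1 example quoted later in this log, and should be swapped for the optic/DOF of interest.

# Sketch: look up where an optic was sitting at the "golden" DRMI time
# (GPS 1427541769, quoted above) from the archived slow data.
from gwpy.time import tconvert
from gwpy.timeseries import TimeSeries

GOLDEN_GPS = 1427541769                      # last DRMI lock before the vent
CHANNEL = "H1:SUS-OM1_M1_DAMP_P_INMON"       # example channel; swap in the optic/DOF of interest

print("golden time in UTC:", tconvert(GOLDEN_GPS))

# Average a short stretch around the reference time to get a single
# "where was it pointing" number.
data = TimeSeries.get(CHANNEL, GOLDEN_GPS - 30, GOLDEN_GPS + 30)
print(f"{CHANNEL} around GPS {GOLDEN_GPS}: {data.value.mean():.1f}")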
Comparing QUAD, BS, and TMS pointing before and after the outage. All had come back up from the power outage with slightly different OPTICALIGN OFFSET values for P and Y: when systems came back up, the OPTICALIGN OFFSETS were read from the SUS SDF files, and since those channels aren't monitored by SDF they held older offset values. I set the offset values back to what they were before the power outage, but still had to adjust them to get the top masses pointed back to where they were before the outage.
'Before' refers to the OPTICALIGN OFFSET values before the outage, and 'After' is what I changed those values to in order to get the DRIFTMON channels to match where they were before the outage. (A sketch of scripting this check is below the table.)
SUS  | DOF | Before | After
ITMX | P   | -114.5 | -109.7
ITMX | Y   | 110.1  | 110.1 (no change)
ITMY | P   | 1.6    | 2.0
ITMY | Y   | -17.9  | -22.4
ETMX | P   | -36.6  | -45.7
ETMX | Y   | -146.2 | -153.7
ETMY | P   | 164.6  | 160.6
ETMY | Y   | 166.7  | 166.7 (no change)
BS   | P   | 96.7   | 96.7 (no change)
BS   | Y   | -393.7 | -393.7 (no change)
TMSX | P   | -88.9  | -89.9
TMSX | Y   | -94.3  | -92.3
TMSY | P   | 79.2   | 82.2
TMSY | Y   | -261.9 | -261.9 (no change)
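A minimal sketch of scripting this kind of check and restore with pyepics is below. The channel names follow the usual H1 SUS pattern for the QUAD top mass (M0) but are assumptions that should be verified against the MEDM screens before use; the written value is just the ITMX 'Before' number from the table above.

from epics import caget, caput

optic = "ITMX"
# Assumed channel names -- check against the SUS MEDM screens before trusting.
offset_ch = f"H1:SUS-{optic}_M0_OPTICALIGN_P_OFFSET"
drift_ch = f"H1:SUS-{optic}_M0_DRIFTMON_P"

print("current OPTICALIGN offset:", caget(offset_ch))
print("current driftmon:", caget(drift_ch))

# Restore the pre-outage ('Before') offset, then trim in small steps until
# the driftmon matches its pre-outage value.
caput(offset_ch, -114.5)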
The HAM7/8 suspensions were brought back; details in alog 83774.
Betsy, Oli, Camilla
We rechecked that everything was put back to this reference time, looking at the M1 DAMP channels for each optic, e.g. H1:SUS-OM1_M1_DAMP_{P,Y}_INMON.
For each ndscope, the first t-cursor is on the original "good DRMI time" we had planned to go back to, and the second t-cursor is on the time we actually went back to.
I checked the rest of the optics and verified that they all got put back to where they were pointing before the power outage. I've also used one of the horizontal cursors to mark where each optic was at the "good DRMI time", and the other cursor marks where the optic is currently. A scripted version of this check is sketched below.
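Here is a rough sketch of doing the same cross-check without ndscope, assuming these slow DAMP INMON channels are reachable through NDS2 via gwpy. The optic list and the 10-count tolerance are illustrative choices, not what was actually used.

from gwpy.time import tconvert
from gwpy.timeseries import TimeSeries

GOOD_DRMI_GPS = 1427541769            # "good DRMI time" from above
NOW_GPS = int(tconvert()) - 120       # a recent, already-archived stretch

for optic in ("OM1", "OM2", "SR2", "SR3", "SRM"):
    for dof in ("P", "Y"):
        chan = f"H1:SUS-{optic}_M1_DAMP_{dof}_INMON"
        then = TimeSeries.get(chan, GOOD_DRMI_GPS, GOOD_DRMI_GPS + 60).value.mean()
        now = TimeSeries.get(chan, NOW_GPS, NOW_GPS + 60).value.mean()
        flag = "" if abs(now - then) < 10 else "   <-- check this one"
        print(f"{chan}: {then:9.1f} -> {now:9.1f}{flag}")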
J. Warner, J. Kissel, M. Pirello

2025-04-06 Power outage: LHO:83753

Among the things that did not recover nicely from the 2025-04-06 power outage was the aLIGO BSC ISI Interface Chassis (D1002432) for Corner 1 of ISI ITMY. Page 1 of the wiring diagram for a generic BSC SEI (HPI and ISI) system, D0901301, shows that the ST1 and ST2 CPSs, ST1 L4Cs, and ST2 GS13s of a given corner (Corner 1, Corner 2, and Corner 3) are all read out by the same chassis. In the instance of this drawing, the rack for ISI ITMY is SEI-C4, and this Corner 1 chassis lives in SEI-C4 U40.

This morning, Jim identified that this particular interface chassis had failed -- after hearing that Tony tried repeatedly to re-isolate the platform last night and couldn't -- by opening the ISI ITMY overview screen and finding that all of these Corner 1 sensors' signals were reading out a "noisy zero", i.e. 0 +/- a few counts. See the attached time-machine screenshot, where the Corner 1 sensors are highlighted with yellow arrows showing less than 2 counts.

Jim, Marc, and I went out to the CER SEI-C4 and found that power cycling the chassis would not help. There happened to be a spare chassis resting on top of the existing Corner 1 chassis, so we just made the swap. We suspect that the chassis' power regulator didn't survive the current surge of the rack's DC power supplies losing power and coming back (perhaps unequally), but we'll do a post-mortem later.

Taken out:
Interface chassis S1201320

Replaced with:
Interface chassis S1203892

The attached images show the top of the SEI-C4 rack BEFORE and AFTER.
Slow start on vent tasks today -- besides the lack of water onsite this morning and the particularly nasty power outage, which are making things not come back up very well, we popped into HAM2 to keep moving on a few of the next steps. Corey has captured detailed pictures of components and layouts, and Camilla and I have logged all of the LSC and ASC PD serial numbers and cable numbers. We removed all connections at these PD boxes, and Daniel is out making the RF cable meter-length measurements. Ops were realigning all suspensions to a golden DRMI time they chose as a reference for any fall-back times. Jason and Ryan are troubleshooting other PSL items that are misbehaving. We are gearing up to take a side road from today's plan to look for flashes out of HAM2, to convince ourselves that the PSL PZT and the alignment restoration of some suspensions are somewhat correct.
Betsy and I also removed the septum plate VP cover to allow the PSL beam into HAM2 for alignment check work (alog 83794); it was placed on HAM1.
In preparation for disconnecting cables in HAM1, I turned off the following DC interface chassis:
LSC RF PD DC interface chassis in ISC R4 (REFL-A, POP-A among others), LSC RF PD DC interface chassis in ISC R1 (REFL-B among others), and ASC WFS DC interface chassis in ISC R4 (REFL_A, REFL_B among others).
Daniel will perform TDR to measure RF in-vac cable length from outside.
Turning off the DC interface for LSC REFL_B somehow interfered with FSS locking. It turns out that the DC interface provides power to (and maybe fast readback of the DC output of) the FSS RFPD.
Since the point of powering down was to safely disconnect the DC in-vac cable from LSC REFL_B, and since the cable was safely disconnected, I restored the power and the FSS relocked right away.
J. Kissel, M. Pirello

2025-04-06 Power outage: LHO:83753

Among the things that did not recover nicely from the 2025-04-06 power outage was the Timing Comparator D1001370 that lives in ISC-C2 U40 (see component C261 on pg 3 of D1900511-v9). The symptom was that its time-synchronizing FPGA was caught in a bad state, and the timing fanout in the CER Beckhoff status for the comparator was reporting that H1:SYS-TIMING_C_FO_A_PORT_13_NODE_UPLINKUP was in error (a channel value of zero instead of one).

We didn't know any of this at the start of the investigation. At that point, we only knew of an error by following through the automatically generated "SYS" screens (see attached guide):
SITEMAP > SYS > Timing > Corner A button [which had a red status light] > TIMING C_FO_A screen Port 13, dynamically marked with a "C" for comparator [whose number was red, and whose status light was red] > Hitting the "C" opens the subscreen for TIMING C_FO_A NODE 13, which shows the red "Uplink Down" message in the middle right.
The screenshot shows the NODE 13 screen both in the "now" fixed green state and in a time-machined "broken" state.

Going out to the CER, we found that the status light for Digital Port 13 == Analog Port 14 on the timing fanout (D080534; ISC-C3 U11) was blinking. Marc tried moving the comms cable to analog port 16, because "sometimes these things have bad ports." That didn't work, so we moved it back to analog port 14. That port's comms fiber cable was not labeled, so we followed it physically to find its connection to the SQZ timing comparator (again in ISC-C2 U40, thankfully "right next door"), only to find its "up" status light also blinking.

Marc suggested that the comparators may lose sync, so we power cycled it. This chassis doesn't have a power switch, so we simply disconnected and reconnected its +/-18 V power cable. After waiting ~2 minutes, all status lights turned green. #FIXEDIT
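For the record, the error state is also visible directly from EPICS; a quick sketch with pyepics is below. This only reads the status bit named above, it doesn't diagnose the FPGA itself.

from epics import caget

chan = "H1:SYS-TIMING_C_FO_A_PORT_13_NODE_UPLINKUP"
value = caget(chan)
print(chan, "=", value, "(1 = uplink up, 0 = the error state we saw)")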
I've been trying to recover seismic systems this morning after the power outage. So far I've gotten the HEPI pump stations back up, and all the ISIs are damping. Worked with Patrick to get both BRSs back up and running, but their heating was off for a while and it will take some time for them to get back to a good state.
ETMY and HAM8, however, have some issue with the binary readbacks for their coil drivers. I've checked the racks and the overtemp relay lights are all good, but the binary readbacks indicate a number of the overtemp relays are tripped. Not sure what the cause is yet, but fixing these isn't a priority today. Working with Dave, I power cycled both the binary input chassis and the coil drivers for HAM8, but that didn't fix the readbacks. Similar to this alog, we just put in test offsets to get the bit word clean and will wait for Fil to be available to try opening the expansion chassis. These tables should be able to isolate now, not that they are needed. Dave has filed an FRS and the test offset shows up in SDF.
And now the EY bit word has fixed itself, so I've removed the fix Dave put in.
As of 12:16 here is the latest CDS not-working list (some cleared from last list, some added).
Slow controls DEV4 EX chassis terminal missing
NCALEX
Diode Room Beckhoff (h1pslctrl0)
BRS EX and EY
HWS
EX Mains Mon
Weather stations (EX and EY)
PWRCS and PWREX
R. Short, J. Oberling, P. Thomas, J. Hanks
We restarted the PSL after yesterday's power outage. Some notes:
Once the PSL Beckhoff was restarted we noticed that the output power seemed unusually low; we traced this to the calibration settings being out of date. When running a newer version of the software we have to remember to grab the persistent settings file (port_851.bootdata from C:\TwinCAT\3.1\Boot\PLC), which holds all of the trip points, sensor calibrations, and a running tally of operating hours. Not sure why the system lost this information now; I don't think we've ever seen this happen with a system restart and I currently have no explanation.

We were able to get the PD and LD monitor calibration settings from my alog from February, when I changed the pump diode operating currents, and were able to grab the operating hours using ndscope (we looked at what they were reading when the power outage happened and set the operating hours back to that point + 1 hour, since the system had been running for ~1 hour at that point).

One thing to note, however: the persistent operating hour data is now completely wrong. The software tracks operating hours 2 ways: a user-updatable value and a locked value. The former allows us to change operating hours when we install a new component (like swapping chillers or installing a new NPRO/Amplifier after a failure), while the latter tracks total uptime of said components (i.e. the total number of hours an NPRO, any NPRO, has been running in the system). It's these latter operating hours that are completely bogus, as we have no way to update them if the persistent settings file is lost (as I said, it's a locked value).
For future reference I've attached a screenshot of the current system settings table; the operating hours that are now wrong are in the column labeled OPHRS A.
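A sketch of that trend lookup done with gwpy instead of ndscope is below. The operating-hours channel name is a placeholder (not a real H1 channel), and the outage GPS time shown is only the date; both would need to be read off the PSL Beckhoff screens and ndscope.

from gwpy.time import tconvert
from gwpy.timeseries import TimeSeries

# Placeholders -- substitute the real PSL operating-hours channel and the
# actual GPS time of the 2025-04-06 power outage.
OPHRS_CH = "H1:PSL-EXAMPLE_NPRO_OPHRS"        # NOT a real channel name
OUTAGE_GPS = int(tconvert("2025-04-06 00:00:00"))

# Last archived value just before the outage, plus the ~1 hour the system
# had been running after it came back up.
hours = TimeSeries.get(OPHRS_CH, OUTAGE_GPS - 600, OUTAGE_GPS).value[-1]
print("operating hours to restore:", hours + 1.0)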
PSL Beckhoff network connection was restored with the following command:
netsh interface ipv4 add neighbors "Ethernet" 10.105.0.1 <mac-address>
where "Ethernet" was the name of the interface as given by "ipconfig"
and <mac-address> is the MAC address of the 10.105.0.1 gateway.
This adds a permanent entry to the arp table.
A similar entry had to be added to the RGA workstation to communicate with the corner station RGAs after the last network upgrade, but that arp entry had to be deleted after the installation of sw-lvea-aux1 using

netsh interface ipv4 delete neighbors ....

See this Microsoft KB entry: https://support.microsoft.com/en-us/topic/cannot-delete-static-arp-entries-by-using-the-netsh-command-on-a-computer-that-is-running-windows-vista-or-windows-server-2008-08096675-0a9b-81b3-b325-6438af4450bc
Mon Apr 07 10:07:39 2025 Fill completed in 7min 36secs
BSC8 annulus volume has been vented with nitrogen. Randy has been notified that door bolt removal can proceed.
Came in to find all IFO systems down. Working through recovery now.
The SUSB123 power supply seemed to have tripped off. ITM and BS OSEM counts were all sitting very low, around 3000. Once the power supply was flipped back on, OSEM counts returned to normal for ITMX and BS, but now the ITMY coil driver filter banks are flashing ROCKER SWITCH DEATH. Jeff and Richard are back in the CER cycling the coil drivers to hopefully fix that.
Power for ISI ITMY is also down and is being worked on to bring it back.
Most vent work is currently on hold while the focus is on getting systems back online.
Cycling the coil drivers worked to fix that issue with the ITMY coil drivers. They needed to turn the power back off, turn the connected chassis off, then turn the power back on and then each chassis back on one by one.
The ITMY ISI GS13 that failed was replaced, and work is still going on to bring the ITMY ISI back.
There are some timing errors that need to be corrected and a problem with DEV4 at EX.
Once the ISI was back, Elenna and I brought all optics in HAM7/8 back to ALIGNED. Elenna put ZM4 and FC1 back to their pre-power-outage positions, as they had changed ~200 urad from the SQZ/FC ASC signals being zeroed. Everything else (ZM1, ZM2, ZM3, ZM5, OPO, FC2) changed by <5-10 urad, so we're leaving them as is.
The counts were pretty low over the weekend, peaking at ~30 counts of 0.3 um particles and 10 counts of 0.5 um particles.
The EY dust monitor died with the power outage and has not come back with a process restart; it's having connection issues.
The EY dust monitor came back overnight.
By no means a complete list and in no particular order:
FMCS EPICS
CER Timing Fanout 14th port
Diode room PSL Beckhoff
HEPI pump controller, corner station
Slow controls DEV4 terminal error
h1cdsrfm long range dolphin
DTS
ncalex
BRS EX, EY
HWS
EX mains mon
FMCS-EPICS and the DTS have been recovered