I reviewed the weekend lockloss where lock was lost during the calibration sweep on Saturday.
I've compared the calibration injections and what DARM_IN1 is seeing [ndscopes], relative to the last successful injection [ndscopes].
Looks pretty much the same but DARM_IN1 is even a bit lower because I've excluded the last frequency point in the DARM injection which sees the least loop suppression.
It looks like this time the lockloss was a coincidence. BUT. We desperately need to get a successful sweep to update the calibration.
I'll be reverting the cal sweep INI file, in the wiki, to what was used for the last successful injection, even though it includes that last point which I suspected caused the last 2 locklosses. This is out of an abundance of caution, hoping the cause of the locklosses is something more subtle that I'm not yet catching.
Despite the lockloss, I was able to utilise the log file saved in /opt/rtcds/userapps/release/cal/common/scripts/simuLines/logs/H1/
(log file used as input into simulines.py), to regenerate the measurement files.
As you can imagine, the points where the data is incomplete are missing, but 95% of the sweep is present and the fits all look great.
So it is in some way reassuring that in case we lose lock during a measurement, data gets salvaged and processed just fine.
Report attached.
How to salvage data from any failed simulines injection attempt:
Log files are saved in /opt/rtcds/userapps/release/cal/common/scripts/simuLines/logs/{IFO}/ for IFO=L1,H1. Run
./simuLines.py -i /opt/rtcds/userapps/release/cal/common/scripts/simuLines/logs/H1/{time-name}.log
for time-name resembling 20250215T193653Z.
./simuLines.py is the simulines executable and can have a full path like the one the calibration wiki uses: /ligo/groups/cal/src/simulines/simulines/simuLines.py
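As a rough illustration of the salvage step, the sketch below only assembles the command line from an IFO name and a log time-name. The function name `salvage_command` is hypothetical and not part of simulines. The `-i` flag and the log directory are the ones quoted in this entry, and nothing is executed.

```python
from pathlib import PurePosixPath

# Log directory quoted in the entry; the IFO subdirectory is H1 or L1.
LOG_ROOT = PurePosixPath(
    "/opt/rtcds/userapps/release/cal/common/scripts/simuLines/logs"
)


def salvage_command(ifo, time_name, executable="./simuLines.py"):
    """Return the argument list that re-runs simuLines on a saved log file.

    `salvage_command` is a hypothetical helper for illustration only.
    """
    if ifo not in ("H1", "L1"):
        raise ValueError(f"unknown IFO: {ifo!r}")
    log_file = LOG_ROOT / ifo / f"{time_name}.log"
    return [executable, "-i", str(log_file)]


if __name__ == "__main__":
    # Example time-name taken from the entry; print the command, do not run it.
    print(" ".join(salvage_command("H1", "20250215T193653Z")))
```

Pass a full path as `executable` if simuLines.py is not in the current directory.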
The operator team have been finding 11Hz oscillation locklosses. Attached is a comparison of the MICH, PRCL and SRCL spectra from one of our last long (2 day) locks with those from a more recent 6 hour lock. There is a debatable PRCL bump around 10-11Hz.
I'm comparing plots that Oli made in this alog to a plot I added to this alog from early O4a where we were having frequent locklosses due to marginal stability in PRCL. The ring up looks very similar and I would guess that we should measure the PRCL OLG and adjust the gain. Just scrolling back the last few days on the summary pages, I don't visually see the excess noise around 11 Hz for a long time before the locklosses, like we saw in O4a, but that doesn't mean much since I could be fooled by the color scale.
After Ryan S and I changed the SQZ angle servo H1:SQZ-ADF_OMC_TRANS_PHASE yesterday, the SQZ was unhappy overnight (plot) and the servo failed to work for a few hours. It should be keeping H1:SQZ-ADF_OMC_TRANS_SQZ_ANG at zero, but from the attached plot it clearly wasn't.
On the positive side, the SHG power has now increased to 130mW and is more stable since the LVEA temperatures are stable again.
TITLE: 02/18 Day Shift: 1530-0030 UTC (0730-1630 PST), all times posted in UTC
STATE of H1: Lock Acquisition
OUTGOING OPERATOR: Ryan C
CURRENT ENVIRONMENT:
SEI_ENV state: USEISM
Wind: 2mph Gusts, 1mph 3min avg
Primary useism: 0.05 μm/s
Secondary useism: 0.47 μm/s
QUICK SUMMARY: Lost lock at 1507 UTC (0707 PT). Maintenance day today. No active alarms.
Workstations were updated and rebooted. This was an OS packages update. Conda packages were not updated.
TITLE: 02/18 Eve Shift: 0030-0600 UTC (1630-2200 PST), all times posted in UTC
STATE of H1: Observing at 149Mpc
INCOMING OPERATOR: Ryan C
SHIFT SUMMARY: Currently Observing and have been Locked for almost 1 hour.
Two locklosses during my shift, and they both had that weird oscillation in DARM right before the lockloss (see 82847). Relocking went okay as long as I ran an initial alignment.
First relocking:
At the start of the initial alignment during green arms, I did get the ALSX issue where it's locked but the WFS won't turn on. I Forced the auto-centering for the WFS and that was all it needed.
Once we got back up to NLN we were having issues with the filter cavity losing lock (similar to the issues from the rest of this weekend). It was losing lock at different places in the locking process (FIND_IR, GR_VCO_LOCKING, TRANSITION_IR_LOCKING, etc). I tried just running SQZ_MANAGER through DOWN and back up but that didn't work. I then tried moving FC1&2 a bit to get them back to where they were during our last lock but that also didn't work. I then just tried what Ryan S did yesterday (82840) - I paused SQZ_MANAGER in LOCK_CLF and adjusted the OPO temp. That worked well and we were then able to lock the filter cavity, start squeezing, and get back into Observing. Tagging sqz
Second relocking:
I just made sure to immediately run an initial alignment and there were no issues getting back to NLN and Observing.
LOG:
00:06 Lockloss
- started an initial alignment
- ALSX was locked (but oscillating weirdly) around 1 but the WFS wouldn't turn on - the auto-centering was off (we've been seeing this recently sometimes). I Forced the auto-centering and the WFS turned on and took ALSX up to 1.2 right away
- Lockloss from TURN_ON_BS_STAGE2
- Lockloss from PRMI_ASC
02:20 NOMINAL_LOW_NOISE
- FC losing lock during different stages in the locking process (FIND_IR, GR_VCO_LOCKING, TRANSITION_IR_LOCKING, etc.)
- Ran SQZ_MANAGER through DOWN and back up - didn't work
- Adjusted FC1&2 a bit to get them back to where they were during our last lock - didn't work (did not revert these changes)
- Did what Ryan S did yesterday (82840) - paused SQZ_MANAGER in LOCK_CLF and adjusted OPO temp. We were then able to lock the filter cavity, start squeezing, and get back into Observing
03:05 Observing
03:36 Popped out of Observing to reset the sqz angle since we had a message about it and sqz was slowly getting worse
03:36 Back into Observing
03:47 Popped out of Observing to reset the sqz angle since we had a message about it and sqz was slowly getting worse
03:48 Back into Observing
04:00 Lockloss
- Ran initial alignment
05:16 NOMINAL_LOW_NOISE
05:19 Observing
Start Time | System | Name | Location | Lazer_Haz | Task | Time End |
---|---|---|---|---|---|---|
00:19 | PEM | Robert | LVEA | YES | Putting viewport cover back on | 00:29 |
In our ongoing quest to figure out what causes the ETMX glitch locklosses, I've rerun all O4 NLN locklosses with our most up-to-date locklost code to see if we can pinpoint when it started and how/if it comes and goes.
To do this I made a histogram plot for all of O4 showing the number of locklosses we had each day in blue, and then over that plotted the number of those locklosses that have ETMX glitches right before the lockloss (plot). Basically, it looks like it's been with us since the start of O4. There are times where we go a week or maybe almost a whole month without getting locklosses from them, but they're still pretty regular and we see them every few days otherwise. In O4b, it looks like we saw them very often from the end of August to early October and then saw fewer of them in December. That's hard to quantify, though, because we had fewer locklosses overall in December, which could've been due either to more time locked or to more time with the detector down (same with November, when we had all the NPRO stuff).
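The per-day counting behind a histogram like this can be sketched as follows. This is not the locklost code: the record format (a date string plus a flag for whether an ETMX glitch was tagged) and the function name `daily_counts` are assumptions made for illustration, and the dates in the example are made up.

```python
from collections import Counter


def daily_counts(locklosses):
    """Count locklosses per day, and those tagged with an ETMX glitch.

    `locklosses` is an iterable of (day, has_etmx_glitch) pairs; this
    record layout is an assumption, not the lockloss tool's format.
    """
    total = Counter()
    glitch = Counter()
    for day, has_etmx_glitch in locklosses:
        total[day] += 1
        if has_etmx_glitch:
            glitch[day] += 1
    return total, glitch


if __name__ == "__main__":
    # Made-up records, only to show the shape of the result.
    records = [("day-1", True), ("day-1", False), ("day-2", False)]
    total, glitch = daily_counts(records)
    for day in sorted(total):
        print(day, total[day], glitch[day])
```

Plotting `total` as the blue bars and `glitch` over it reproduces the layout described above. A day with few locklosses can mean either a long lock or a long downtime, so the counts alone cannot separate the two.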
I also have this plot as a pdf, and here's the link to my lockloss tool in case anyone wants to peruse the early O4 ETM glitch locklosses. Also keep in mind that the lockloss tool will have missed tagging a few ETMX glitch locklosses. Our code relies on the glitch surpassing a threshold and then going back down and staying below another threshold for a set amount of time, but in a few cases the glitch leads right into the lockloss, so it doesn't stay below that threshold long enough for the lockloss tool to recognize it as an ETM glitch and tag it. This doesn't happen very often, but it's still important to note that there are more ETM glitch locklosses than we have plotted on this histogram.
A few seconds before the lockloss I saw ASC-INP1_P_INMON get pretty large, and ASC-CHARD_P_OUT also started oscillating. I also saw signs of the coming lockloss in LSC-POP_A_LF_OUT and LSC_PR_GAIN_OUT a few seconds before the lockloss. Pretty sure this was one of the weird oscillation locklosses like the ones in 82847
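The two-threshold tagging described above, and the way it can miss a glitch that runs straight into the lockloss, can be sketched like this. It is a simplified illustration and not the lockloss tool's actual code: the function name, the use of sample counts for the quiet time, and the numbers in the example are all assumptions.

```python
def find_glitches(samples, high, low, quiet_samples):
    """Return start indices of tagged glitches.

    A glitch is tagged when the signal magnitude exceeds `high` and then
    stays below `low` for at least `quiet_samples` consecutive samples.
    """
    glitches = []
    start = None  # index where the current candidate crossed `high`
    quiet = 0     # consecutive samples below `low` since the candidate
    for i, x in enumerate(samples):
        level = abs(x)
        if start is None:
            if level > high:
                start, quiet = i, 0
        elif level < low:
            quiet += 1
            if quiet >= quiet_samples:
                glitches.append(start)
                start = None
        else:
            quiet = 0  # signal came back up: restart the quiet count
    return glitches


if __name__ == "__main__":
    # Glitch that settles before the data ends: tagged at index 2.
    settled = [0, 0, 5, 0, 0, 0, 0]
    # Glitch that leads right into the lockloss: never quiet long enough.
    into_lockloss = [0, 0, 5, 0, 9, 9, 9]
    print(find_glitches(settled, high=3, low=1, quiet_samples=3))
    print(find_glitches(into_lockloss, high=3, low=1, quiet_samples=3))
```

The second case returns no glitches, which is the kind of miss described above: the candidate is still waiting for its quiet period when the data ends.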
This is another one of those new oscillation locklosses. I pulled up a bunch of ndscopes from the front wall and set them all to the same t0 and time range. They look like this: ndscope1.
So a bunch of different channels started seeing a ~0.13 Hz oscillation about 15 seconds before the lockloss, and then less than half a second before the lockloss, we see the 10-11Hz oscillation in DARM, the QUADS, and the LSC MICH/PRCL/SRCL channels: ndscope2
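For a quick number to go with an ndscope trace, the frequency of an oscillation like these can be estimated by counting zero crossings. The sketch below uses a synthetic 11 Hz sine rather than real channel data. The function name and the 512 Hz sample rate are assumptions, and a proper analysis would use a spectrum instead.

```python
import math


def zero_crossing_frequency(samples, sample_rate):
    """Estimate the oscillation frequency in Hz from zero crossings.

    The mean is removed first; each full cycle contributes two crossings.
    This is only meaningful for a signal dominated by one oscillation.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / len(samples)
    centered = [x - mean for x in samples]
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(samples) - 1) / sample_rate
    return crossings / (2.0 * duration)


if __name__ == "__main__":
    rate = 512  # assumed sample rate in Hz, for the synthetic example only
    # One second of a synthetic 11 Hz sine with a small phase offset.
    sine = [math.sin(2 * math.pi * 11 * n / rate + 0.3) for n in range(rate)]
    print(round(zero_crossing_frequency(sine, rate), 1))
```

The estimate gets coarse when the window holds only a few cycles, so the ~0.13 Hz oscillation would need a window much longer than the 15 seconds seen here.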
I replaced the viewport cover I had removed in case someone needs the laser safe state in the morning.
Lockloss @ 02/18 00:06 UTC after almost 5.5 hours locked
This lockloss also has the strange oscillation in many LSC channels + QUADs right before the lockloss that was noted in 82847
03:05 UTC OBSERVING
Looking at front room ndscopes+QUADs/DARM in the minute leading up to LL: ndscope1
Oscillation seen by QUADS, ASC-INP1_P_IN, and ASC-CHARD_P_OUT
NOT seen in power recycling gain or circulating power (like the LL that happens after this one - 82865)
Looking at QUADS/DARM and LSC signals in the second before the LL: ndscope2
Seen by all QUADS, DARM, and LSC signals
Ryan S, Camilla
Corey 82849, Ryan and I have had some issues with SQZ this morning. Corey adjusted the SHG temp to get enough green power to lock the OPO but after that the SQZ and range were bad. FR3 signal and FC WFS were very noisy.
Ryan tried to take SQZ out, checked the OPO temp (fine), reset the SQZ angle from 220deg back to a more sensible 190deg and then put SQZ back in again, but he couldn't get the FC to lock. We followed the steps in the SQZ wiki to touch up FC1/2 alignment and got H1:SQZ-FC_TRANS_C_LF_OUTPUT to >120 (plot), but we were still losing the FC at TRANSITION_IR_LOCKING. The FC also seemed unstable when locked on green. While we were troubleshooting, the IFO lost lock. Unsure if this is an ASC issue: FC_ASC trends attached (POS for Y and P were moving much more than usual), SQZ ASC trends (ZM4 PIT changes a lot).
After the lock loss, the SQZ_FC seemed to lock stably in green with H1:SQZ-FC_TRANS_C_LF_OUTPUT = 160. Plot. This is higher than usual and it's not clear what changed!
Ryan mentioned that something similar happened to Oli at the weekend: the range was bad, but SQZ unlocked and re-locked and it came back good (plot). That seemed to be the OPO PZT changing to a better place (we know it likes to be ~90 rather than in the 50s).
After relocking, everything was fine but the FC ASC wasn't turning on because the trigger signal was too low (2.5) for the threshold. Ryan decreased H1:SQZ-FC_ASC_TRIGGER_THRESH_ON from 3.0 to 2.0. The signal has been slowly decreasing, maybe because we decreased the OPO trans from 80uW to 60uW last week (plot). Ryan accepted this in SDF and then checked SQZ_MANAGER, SQZ_FC and sqzparams to confirm it isn't set in GRD.
FC ASC lower trigger threshold accepted in SDF. It is not set anywhere by Guardian nor is this model's SAFE.snap table pushed during SDF_REVERT.
After a while, Camilla and I again dropped H1 out of observing as even after thermalization, BNS range and squeezing weren't looking good. We decided to reset the SQZ ASC and angle in case they were set at a bad reference point. I took SQZ_MANAGER to 'RESET_SQZ_ASC_FDS' and adjusted the SQZ angle to optimize DARM and SQZ BLRMS. To reset the angle servo here, I adjusted SQZ-ADF_OMC_TRANS_PHASE to make SQZ-ADF_OMC_TRANS_SQZ_ANG oscillate around 0 (ended with a total change of -10deg), then requested SQZ_MANAGER back to 'FREQ_DEP_SQZ' to turn servos back on, and finally accepted on SDF (attached) to return to observing. It's been about 20 minutes since then, and so far H1 is observing with a much better steady range at around 160Mpc.
What happened that needed the ADF servo setpoint to be updated with a different SQZ angle? Investigation is ongoing.
Attaching the plot from after Corey got the SQZ locked and it was bad. It looks like one of the loops was oscillating. Compare to the normal SQZ plot.
I was looking at the lockloss sheet and noticed that this weekend, there were three locklosses that were similar to each other. Ryan Short noted one in an alog a couple days ago: 82809.
The locklosses look similar to what we sometimes see when we lose lock due to ground motion from an earthquake or wind, but in all three cases there was no earthquake motion, and the wind was below 20mph. There will be an 11 Hz oscillation starting about 1 second before the lockloss, and it can be seen in DARM, all QUADs, MICH, PRCL, and SRCL.
Lockloss times:
There was another instance of a lockloss that looked just like these on February 7th 2025 at 05:47UTC as well.
Some more of these weird locklosses from the last day:
BS Camera stopped updating just like in alogs:
This takes the Camera Servo guardian into a never-ending loop (and takes ISC LOCK out of Nominal and H1 out of Observe). See attached screenshot.
So, I had to wake up Dave so he could restart the computer & process for the BS Camera. (Dave mentioned there is a new computer for this camera to be installed soon and it should help with this problem.)
As soon as Dave got the BS camera back, the CAMERA SERVO node got back to nominal, but I had accepted the SDF diffs for ASC which happened when this issue started, so I had to go back and ACCEPT the correct settings. Then we automatically went back to Observing.
OK, back to trying to go back to sleep again! LOL
Full procedure is:
Open BS (cam26) image viewer, verify it is a blue-screen (it was) and keep the viewer running
Verify we can ping h1cam26 (we could) and keep the ping running
ssh onto sw-lvea-aux from cdsmonitor using the command "network_automation shell sw-lvea-aux"
IOS commands: "enable", "configure terminal", "interface gigabitEthernet 0/35"
Power down h1cam26 with the "shutdown" IOS command, verify pings to h1cam26 stop (they did)
After about 10 seconds power the camera back up with the IOS command "no shutdown"
Wait for h1cam26 to start responding to pings (it did).
ssh onto h1digivideo2 as user root.
Delete the h1cam26 process (kill -9 <pid>), where the pid is given in the file /tmp/H1-VID-CAM26_server.pid
Wait for monit to restart CAM26's process, verify image starts streaming on the viewer (it did).
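The process-kill step near the end of this procedure can be sketched as below, for use on h1digivideo2 after the camera has been power-cycled. This is an illustration and not an existing CDS script. It assumes the pid file contains a single integer, the function names are hypothetical, and it defaults to a dry run so nothing is killed unless asked.

```python
import os
import signal
from pathlib import Path

# Pid file path quoted in the procedure above.
PID_FILE = Path("/tmp/H1-VID-CAM26_server.pid")


def read_pid(pid_file):
    """Read a process id from a pid file (assumed to hold one integer)."""
    pid = int(Path(pid_file).read_text().strip())
    if pid <= 1:
        raise ValueError(f"refusing to use pid {pid}")
    return pid


def kill_camera_server(pid_file=PID_FILE, dry_run=True):
    """Send SIGKILL to the camera server process, like `kill -9 <pid>`.

    With dry_run=True (the default) the pid is only read and returned.
    """
    pid = read_pid(pid_file)
    if not dry_run:
        os.kill(pid, signal.SIGKILL)
    return pid


if __name__ == "__main__":
    if PID_FILE.exists():
        print("would kill pid", kill_camera_server(dry_run=True))
    else:
        print("pid file not found:", PID_FILE)
```

After the kill, monit is what restarts the process, as in the last step of the procedure, so the sketch does not try to start anything itself.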
FRS: https://services1.ligo-la.caltech.edu/FRS/show_bug.cgi?id=33320
Forgot once again to note timing for this wake-up. This wake-up was at 233amPST (1033utc), and I was done with this work roughly 45min after phoning Dave for help.
ALS EY WFS F9's came up as (2) SDFs. Accepted (see attached).
The last lockloss was most likely due to an EQ (Ecuador?). I was already up, so I stayed up to proactively run an alignment and got almost done with SRC OFFLOADED, but H1 Manager took ALIGN IFO DOWN (!!!!) and started completely over. I guess I should have taken H1 Manager down?
At any rate, ran a manual alignment for SRC again & H1 made it back up all the way except for the ALSey diffs. OK, back to bed.
Here's the lockloss that preceded this 1253amPST (853utc) wake-up call (the one where I happened to still be up at 1115pmPST). It does not have an EQ tag, but I could have sworn I remembered hearing Verbal going to EQ Mode a few min before the lockloss, which is why I stayed up to run an alignment! Since I was up, this is the one where I tried to help H1 by running an alignment before trying to go to bed, but ended up fighting H1_MANAGER with the alignment attempts. Then I was awakened just before 1am for the SDF diffs noted above.
This wake-up call was only me getting out of bed for a few minutes to ACCEPT the ALS SDF diffs (which might have been due to me and my errant alignments).