We had a failure of h1omc0 at 16:37:24 PDT, which precipitated a Dolphin crash of the usual corner station systems.
System recovered by:
Fencing h1omc0 from Dolphin
Complete power cycle of h1omc0 (front end computer and IO Chassis)
Bypass SWWD for BSC1,2,3 and HAM2,3,4,5,6
Restart all models on h1susb123, h1sush2a, h1sush34 and h1sush56
Reset all SWWDs for these chambers
Recover SUS models following restart.
Cause of h1omc0 crash: Low Noise ADC Channel Hop
Right as this happened, LSC-CPSFF got much noisier, but no motion was seen by peakmon or by HAM2 GND-STS in the Z direction (ndscope). After everything was back up, it was still noisy. Probably nothing weird, but I still wanted to mention it.
Also, I put the IMC in OFFLINE for the night since it has now decided to have trouble locking and was showing a bunch of fringes. Tagging Ops, aka tomorrow morning's me.
FRS31855 Opened for this issue
LOGS:
2024-08-15T16:37:24-07:00 h1omc0.cds.ligo-wa.caltech.edu kernel: [11098181.717510] rts_cpu_isolator: LIGO code is done, calling regular shutdown code
2024-08-15T16:37:24-07:00 h1omc0.cds.ligo-wa.caltech.edu kernel: [11098181.718821] h1iopomc0: ERROR - A channel hop error has been detected, waiting for an exit signal.
2024-08-15T16:37:25-07:00 h1omc0.cds.ligo-wa.caltech.edu kernel: [11098181.817798] h1omcpi: ERROR - An ADC timeout error has been detected, waiting for an exit signal.
2024-08-15T16:37:25-07:00 h1omc0.cds.ligo-wa.caltech.edu kernel: [11098181.817971] h1omc: ERROR - An ADC timeout error has been detected, waiting for an exit signal.
2024-08-15T16:37:25-07:00 h1omc0.cds.ligo-wa.caltech.edu rts_awgtpman_exec[28137]: aIOP cycle timeout
Reboot/Restart Log:
Thu15Aug2024
LOC TIME HOSTNAME MODEL/REBOOT
16:49:17 h1omc0 ***REBOOT***
16:50:45 h1omc0 h1iopomc0
16:50:58 h1omc0 h1omc
16:51:11 h1omc0 h1omcpi
16:53:56 h1sush2a h1iopsush2a
16:53:59 h1susb123 h1iopsusb123
16:54:03 h1sush34 h1iopsush34
16:54:10 h1sush2a h1susmc1
16:54:13 h1susb123 h1susitmy
16:54:13 h1sush56 h1iopsush56
16:54:17 h1sush34 h1susmc2
16:54:24 h1sush2a h1susmc3
16:54:27 h1susb123 h1susbs
16:54:27 h1sush56 h1sussrm
16:54:31 h1sush34 h1suspr2
16:54:38 h1sush2a h1susprm
16:54:41 h1susb123 h1susitmx
16:54:41 h1sush56 h1sussr3
16:54:45 h1sush34 h1sussr2
16:54:52 h1sush2a h1suspr3
16:54:55 h1susb123 h1susitmpi
16:54:55 h1sush56 h1susifoout
16:55:09 h1sush56 h1sussqzout
TITLE: 08/15 Day Shift: 1430-2330 UTC (0730-1630 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
SHIFT SUMMARY:
After the HAM6 cameras were reinstalled and the high voltage was turned back on, we were ready for light in the vacuum. The mode cleaner locked within about 30 seconds of requesting IMC_LOCK to Locked! Tomorrow we will bring back our seismic systems as best we can and then try some single bounce to ensure that we can get some type of light into HAM6.
Ion pumps 1,2,3,14 are in a known error state. Gerardo warned this was intentional.
LOG:
Start Time | System | Name | Location | Laser_Haz | Task | Time End |
---|---|---|---|---|---|---|
14:34 | FAC | Karen | Opt lab | n | Tech clean | 14:50 |
14:58 | FAC | Tyler | LVEA | n | Grab a tool | 15:06 |
15:07 | FAC | Karen | LVEA | n | Tech clean | 15:19 |
15:11 | FAC | Kim | LVEA | n | Tech clean | 15:21 |
16:25 | FAC | Kim | MX | n | Tech clean | 17:11 |
16:34 | VAC | Gerardo | LVEA | n | Vac checks at HAM5,6,7 | 17:39 |
17:44 | ISC | Camilla, Oli | LVEA | n | Reinstall cameras on HAM6 | 19:06 |
18:14 | SYS | Betsy | Opt Lab | n | Betsy things | 19:15 |
19:00 | CDS | Marc, Fernando | LVEA | n | Turn on high voltage | 19:15 |
19:27 | - | Sam, tour (5) | LVEA | n | Tour of LVEA | 20:16 |
20:53 | SAF | TJ | LVEA | YES | Transition to regular laser safe | 21:18 |
22:02 | TCS | Camilla, Marc | Opt Lab | Local | Cheeta | 22:57 |
We have been looking for any strange behavior during the locklosses preceding the OFI burns to try to narrow down possible causes. We recently learned from a scientist experienced with KTP optics that movement of a high-powered beam passing through the KTP could damage the optic, which raised the theory that this could have happened due to earthquakes.
There were a couple of decently-sized earthquakes before the incidents:
April Incident (seismic/LL summary alog79132)
- April 18th - nearby earthquake from Canada - we lost lock from this (lockloss ndscope)
- April 20th - nearby earthquake from Richland - stayed locked
- April 23rd - drop in output power noticed - the lockloss right before this had NOT been caused by an earthquake (previous lockloss ndscope)
July Incident
- July 11th - nearby earthquake from Canada (alog79023) (lockloss ndscope, zoomed out lockloss ndscope)
- July 12th - noticed similarities to the April incident
We used ndscope to compare ground motion to DARM, AS_A, AS_C, and the IFO state. In looking over the ndscopes, we don't see anything that would make us think that these earthquakes changed anything in the output arm.
So yes, we did have two decently-sized earthquakes (plus a local one) before the OFI burns took place, but we also have earthquakes hitting us all the time, many with higher ground velocities. Overall, we did not see anything strange in the AS_A and AS_C channels during these earthquake locklosses that would lead us to think that the earthquakes played a part in the OFI issues.
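For anyone repeating this kind of check offline, here is a minimal gwpy sketch of the same comparison (a sketch only: the GPS span and the AS_A/AS_C/ground-motion channel names below are placeholders, not the ones used in the ndscopes above):

# Sketch of the ndscope comparison done offline with gwpy.
# GPS times and all channel names except H1:GDS-CALIB_STRAIN are placeholders.
from gwpy.timeseries import TimeSeriesDict
from gwpy.plot import Plot

start, end = 1397000000, 1397000600   # placeholder span around a lockloss

channels = [
    'H1:GDS-CALIB_STRAIN',             # DARM-derived strain
    'H1:ASC-AS_A_DC_NSUM_OUT_DQ',      # placeholder AS_A channel name
    'H1:ASC-AS_C_DC_NSUM_OUT_DQ',      # placeholder AS_C channel name
    'H1:ISI-GND_STS_HAM2_Z_DQ',        # placeholder ground-motion channel
]

data = TimeSeriesDict.get(channels, start, end)

# Stack the traces on one figure to look for anything unusual in the
# output-arm sensors relative to the ground motion around the lockloss.
plot = Plot(*data.values(), separate=True, sharex=True)
plot.savefig('lockloss_comparison.png')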
People who worked on this:
Kiet Pham, Sushant Sharma-Chaudhary, Camilla, TJ, Oli
I turned the CO2s back on today and CO2X came back to its usual 53W, but CO2Y came back at 24W. We've seen in the past that it will jump up a handful of watts overnight after a break, so maybe we will see something similar here. If not, we will have to investigate this loss. Trending the output of this laser, it has definitely been dropping in the last year, but we should be higher than 24W.
Sure enough, the power increased overnight and we are back to 34W. This is still low, but in line with the power loss that we've been seeing. Camilla is looking into the spare situation and we might swap it in the future.
We searched for home on both CO2 lasers, took them back to minimum power, and then asked the CO2_PWR guardian for NOM_ANNULAR_POWER. This gave 1.73W on CO2X and 1.71W on CO2Y (1.4W before bootstrapping).
There is a script for this (/psl/h1/scripts/RotationStage/CalibRotStage.py), so we should change to this method next time.
Per WP12042, high voltage was reactivated at the corner station via M1300464v13.
1 - ESD ITM's supply enabled
2 - HAM6 Fast Shutter and OMC PZT supply enabled
3 - Ring Heaters ITMs & SR3 enabled in TCS racks
4 - HAM6 Fast Shutter at Rack verified HV enabled
5 - MER SQZ PZT & Piezo driven PSAMs supply enabled
M. Pirello, F. Mera
The new corner station scroll pump air compressor was valved in at 09:39 PDT this morning. It has gone through 3 compression cycles since then. Here are the stats:
I have reset the cell phone alarms with the levels: LOW = 80PSI, HIGH = 120PSI
The new compressor runs at a higher pressure, so the alarms needed to move with that. I changed the minor and major high alarms for H0:VAC-MR_INSTAIR_PT199_PRESS_PSIG to 127 and 130 PSI respectively, in EPICS via the vacuum computer in the back of the control room. These alarm levels cannot be changed from outside the vacuum network. I did not see this channel in the VAC SDF.
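For reference, a change like this can also be scripted. A minimal pyepics sketch, assuming the channel carries the standard EPICS HIGH (minor) and HIHI (major) alarm fields and that it is run from a machine inside the vacuum network:

# Minimal sketch (pyepics assumed): set the minor/major high alarm levels on
# the instrument-air pressure channel, then read them back to confirm.
from epics import caput, caget

CHAN = 'H0:VAC-MR_INSTAIR_PT199_PRESS_PSIG'

caput(CHAN + '.HIGH', 127)   # minor high alarm [PSI]
caput(CHAN + '.HIHI', 130)   # major high alarm [PSI]

print(caget(CHAN + '.HIGH'), caget(CHAN + '.HIHI'))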
Patrick will check if these values are hard coded into Beckhoff and will adjust accordingly.
Currently the slow-controls SDF is not able to monitor non-VAL fields (e.g. alarm fields). A future release will have this feature.
I made the following changes to the PLC code generation scripts (not pushed to the running code) to enable setting the alarm levels for each instrument air pressure EPICS channel separately. I set the alarm levels to what they currently are. Should any of these be changed? https://git.ligo.org/cds/ifo/beckhoff/lho-vacuum/-/commit/8d17cfe3284147b4e398262091ec9d9f2ddbb6ab If there are no objections, I will plan to push this change when we make the changes to add and rename filter cavity ion pump controllers.
Conda packages updated. Control room tools are updated to version 4.1.4. The only change is a fix to the Python foton module.
FilterDesign::get_zroots() would throw an exception when called. Thanks to Chris Wipf for finding and fixing the problem.
Thu Aug 15 08:08:59 2024 INFO: Fill completed in 8min 55secs
TITLE: 08/15 Day Shift: 1430-2330 UTC (0730-1630 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
CURRENT ENVIRONMENT:
SEI_ENV state: MAINTENANCE
Wind: 10mph Gusts, 7mph 5min avg
Primary useism: 0.01 μm/s
Secondary useism: 0.06 μm/s
QUICK SUMMARY: Pumpdown continues.
TITLE: 08/14 Day Shift: 1430-2330 UTC (0730-1630 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
SHIFT SUMMARY: The pumpdown continues (5e-7 Torr).
LOG:
Start Time | System | Name | Location | Laser_Haz | Task | Time End |
---|---|---|---|---|---|---|
15:24 | FAC | Karen | LVEA | n | Tech clean | 16:32 |
16:00 | FAC | Kim | LVEA | n | Tech clean | 16:15 |
16:01 | FAC | Nellie | HAM Shack | n | Tech clean | 17:01 |
16:17 | FAC | Karen | EY | n | Tech clean | 18:17 |
16:33 | FAC | Kim | EX | n | Tech clean | 17:57 |
16:40 | VAC | Gerardo | LVEA | n | HAM5/6 work with RGA and pump | 17:52 |
17:47 | VAC | Janos | LVEA | n | Walkabout | 17:54 |
17:52 | VAC | Gerardo | EX | n | Cart hunt | 18:35 |
17:56 | FAC | Karen | MY | n | Tech clean | 18:35 |
17:58 | FAC | Kim | H2 WH | n | Tech clean | 18:34 |
18:35 | - | Camilla, Detchar (8) | LVEA | n | Tour | 19:18 |
20:10 | VAC | Chris | LVEA | n | Grab straps for bake ovens | 20:16 |
20:19 | VAC | Janos, Gerardo, Travis | Opt lab | n | Viewport inspection | 21:16 |
21:43 | TCS | Camilla | Opt Lab | Local | Cheeta testing | ongoing |
21:49 | CDS | Marc | MY | n | Looking for parts | 22:20 |
21:49 | VAC | Chris | LVEA | n | Putting straps back | 22:09 |
Yesterday Sam and I went around the LVEA with a portable accelerometer (H1:PEM-CS_ADC_5_28_2K_OUT_DQ) to investigate a ~35.4 Hz peak. This peak dates back to at least June 2023, possibly earlier. We placed the portable accelerometer in seven different spots, each time against a wall except for one spot near BS1. See attachment 1 for the reference map. The peak appears strongest at REF0 and REF5, as seen in attachment 2. We attempted to test outside near the Mitsubishis (outside the OSB kitchen) and the southeast-facing mechanical room wall as well, but had data retrieval issues, so we'll go back to that soon.
We are also investigating the ~36 Hz peak seen in attachment 2, strongest near the X OL. It seems to have been around since at least 2023-05-08, but more investigation is needed to find out when it started and whether it shows up consistently every day. The frequency of this peak oscillates over a range of about 1 Hz. Attachment 3 shows the peak in H1:PEM-CS_LVEAFLOOR_XCRYO_Z_DQ and H1:GDS-CALIB_STRAIN_CLEAN. This peak also travels between 30 and 40 Hz, generally near 40 Hz at the beginning and end of the UTC day and dropping towards 30 Hz in the middle of the day, as seen in attachment 4 from the summary pages.
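A sketch of how this drifting peak could be followed offline with gwpy (a sketch only: the GPS span is a placeholder for one UTC day, and the stride/FFT settings are just reasonable guesses):

# Sketch: follow the ~30-40 Hz peak through a UTC day with a spectrogram of
# the floor accelerometer named above (gwpy assumed; GPS span is a placeholder).
from gwpy.timeseries import TimeSeries

start, end = 1407110418, 1407196818   # placeholder: one 24 h UTC day

data = TimeSeries.get('H1:PEM-CS_LVEAFLOOR_XCRYO_Z_DQ', start, end)

# 60 s strides with 8 s FFTs resolve a ~1 Hz wander while still showing the
# slow 30-40 Hz drift over the day; take the square root to plot an ASD.
specgram = data.spectrogram(60, fftlength=8, overlap=4) ** (1 / 2.)

plot = specgram.plot(norm='log')
ax = plot.gca()
ax.set_ylim(25, 45)
plot.savefig('xcryo_peak_drift.png')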
WP12035
Yesterday thermocouple B (TC-B) failed during the fill. Its wire broke and its temperature railed at +1372C. The fill proceeded with TC-A to completion.
If TC-A had similarly failed, the fill would have continued until either the discharge line pressure exceeded its trip pressure or the fill timed out after 30 minutes.
Today's code modification adds a check when reading the TCs that they are not greater than the nominal_max temps (+40C).
Provided at least one TC is in its operational range, the fill will continue; but now, if no TCs are operational, the fill will abort with an error.
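As an illustration of the new check (a minimal sketch only, with hypothetical names; the real fill code differs in detail):

# Hypothetical sketch of the check described above: a TC reading greater than
# the nominal_max temperature (+40C) -- e.g. the railed +1372C value from a
# broken wire -- marks that TC as failed. The fill continues as long as at
# least one TC is operational, and aborts with an error if none are.
NOMINAL_MAX_C = 40.0

def operational_tcs(readings):
    """Return only the TC readings that pass the nominal_max sanity check."""
    return {name: temp for name, temp in readings.items()
            if temp <= NOMINAL_MAX_C}

def check_fill_tcs(readings):
    good = operational_tcs(readings)
    if not good:
        raise RuntimeError('Fill aborted: no operational thermocouples')
    return good   # the fill carries on using these TCs

# Example: TC-B railed at +1372C after its wire broke, TC-A still reads normally.
print(check_fill_tcs({'TC-A': -55.0, 'TC-B': 1372.0}))   # -> {'TC-A': -55.0}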
The new code was installed at 9am today; it will run for the first time during Thursday's fill.
During this investigation I noted that the discharge_line_pressure trip was 2.0PSI. Operationally this pressure has never exceeded 0.6PSI, so I lowered the trip to 1.0PSI.
Arianna, Derek
We noticed that the calibration line subtraction pipeline is often not correctly subtracting the calibration lines at the start of lock segments.
A particularly egregious case is on July 13, where the calibration lines are still present in the NOLINES data for ~4 minutes into the observing mode segment but then quickly switch to being subtracted. This on-to-off behavior can be seen in this spectrogram of H1 NOLINES data from July 13, where red lines at the calibration frequencies disappear at 1:10 UTC. Comparing the H1 spectra at 1:06 UTC and 1:16 UTC shows that the H1:GDS-CALIB_STRAIN spectrum is unchanged, while the H1:GDS-CALIB_STRAIN_NOLINES spectrum has calibration lines at 1:06 UTC and no lines at 1:16 UTC. This demonstrates that the problem is with the subtraction rather than with the actual calibration lines.
This problem is still ongoing for recent locks. The most recent lock on August 15 has the same problem. The calibration lines "turn off" at 4:46 UTC as seen in the attached spectrogram.
The on-to-off behavior of the subtraction is particularly problematic for data analysis pipelines as the quickly changing line amplitude can result in the data being over- or under-whitened.
A while back, I investigated this and found that the reason for occasional early-lock subtraction problems is that, at the end of the previous lock, the lines were turned off but the TFs were still being calculated. Then, at the beginning of the next lock, it takes several minutes (due to the 512-s running median) to update the TFs with accurate values. There were two problems that contributed to this issue. I added some more gating in the pipeline to use the line amplitude channels to gate the TF calculation. This fix was included in gstlal-calibration-1.5.3, as well as in the version currently in production (1.5.4). However, there have been some instances in which those channels indicated that the lines were on when they were actually off at the end of the previous lock, which means this code change, by itself, does not fix the problem in all cases. The version that is currently in testing, gstlal-calibration-1.5.7, includes some other fixes for line subtraction, which may or may not improve this problem.
A more reliable way to solve this issue would be to ensure that the line amplitude channels we are using always carry accurate real-time information. Specifically, we would need to prevent the lines from turning off long before these channels indicate that this has occurred. The names of these channels are:
{IFO}:SUS-ETMX_L{1,2,3}_CAL_LINE_CLKGAIN
{IFO}:CAL-PCAL{X,Y}_PCALOSC{1,2,3,4,5,6,7,8,9}_OSC_SINGAIN
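To illustrate the gating idea described above (a sketch only, not the actual gstlal-calibration implementation): the running-median TF estimate is updated only on samples where the line amplitude channel reports the line as on, so stale end-of-lock data cannot poison the start of the next lock.

# Illustrative sketch only -- not the gstlal-calibration code. A running-median
# transfer-function estimate over a 512 s window, gated by the line amplitude
# channel: samples taken while the line gain is zero are skipped, so the median
# is not polluted by end-of-lock data with the lines off.
from collections import deque
import numpy as np

WINDOW_S = 512   # running-median length used by the pipeline

class GatedMedianTF:
    def __init__(self, sample_rate_hz=1):
        self.buffer = deque(maxlen=WINDOW_S * sample_rate_hz)

    def update(self, tf_sample, line_gain):
        """Add a single-frequency TF estimate only if the cal line is on."""
        if line_gain != 0:   # gate on e.g. ..._CAL_LINE_CLKGAIN / ..._OSC_SINGAIN
            self.buffer.append(tf_sample)

    def median(self):
        if not self.buffer:
            return None      # no trustworthy estimate yet this lock
        samples = np.asarray(self.buffer)
        # complex TF: take the median of the real and imaginary parts separately
        return np.median(samples.real) + 1j * np.median(samples.imag)

# With the gate in place, zero-gain stretches at the end of a lock leave the
# median untouched.
tracker = GatedMedianTF()
tracker.update(1.02 + 0.01j, line_gain=1.0)
tracker.update(5.00 - 3.00j, line_gain=0.0)   # line off -> ignored
print(tracker.median())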