Online Operations Manual


Introduction


The low-latency GstLAL-based compact binary analysis implements a mixed time-domain/frequency-domain filtering scheme to produce extremely low-latency gravitational wave detection capabilities. The purpose is to discover gravitational waves from merging neutron stars and black holes within seconds of the waves arriving at Earth.

There is an initial configuration procedure that must be executed first. During this stage, the template bank is decomposed into SVD bins and initial dist stats are computed. Once set up, the analysis is designed to run continuously throughout an observing period. This page provides an ‘as built’ overview of the entire analysis. If you are simply looking to start an analysis from scratch, there are step-by-step instructions in the rest of this manual, starting with Configuration.




Analysis diagram


Below is a diagram relating the various workflows (dashed line boxes) and communication layers (HTTP, Kafka, File I/O) for a functioning low-latency compact binary search. You can click on the diagram to learn more about each component.


[Diagram not rendered in this export; the original page embeds an interactive SVG. In summary: a set-up DAG (gstlal inspiral bank splitter → gstlal inspiral svd bank → create prior dist_stats, repeated across SVD bins and IFOs) feeds an inspiral DAG of parallel gstlal inspiral jobs filtering h(t) strain data. Supporting jobs include gstlal ll dq, the scald metric and event collectors (backed by an Influx database), gstlal marginalize likelihoods online, and the gstlal ll inspiral trigger counter, event uploader, event plotter, and pastro uploader, which talk to GraceDB. Communication is via Kafka topics (coincs, events, snr_history, likelihood_history, far_history, monitoring topics, uploads, favored events, ranking stat) and on-disk directories (bank/, psd/, split_bank/, filter/, svd_bank/, dist_stats/, dist_stat_pdfs/, zerolag_dist_stat_pdfs/). The diagram key distinguishes single jobs, multiple jobs running in parallel (× # of jobs), Kafka topics, directories on disk, inputs/outputs, and job dependencies.]

Programs used in this analysis


FIXME


gstlal_inspiral
     
  • config
  • doc
  • source
  • Disk I/O

    SVD bank files. Multiple SVD bank files can be given per job, in order to analyze data from multiple IFOs. These should each correspond to the same SVD bin.

    A reference PSD file. We use a file checked into the repo as a starting point, but always use the track-psd option so that the PSD is periodically updated to reflect the current state of the detector noise. What reference PSD do we use? What time range does it correspond to? Is it updated throughout the run at all?

    Ranking statistic input file. This contains likelihood-ratio ranking statistic data for our signal and noise models, created by the create_prior_dist_stats jobs in the set-up stage. These files are in the dist_stats directory and we use the naming convention {IFOs}-{SVD_GROUP_NUM}_GSTLAL_DIST_STATS-0-0.xml.gz. For injection jobs, this input file is the one used by the non-injection job of the same SVD bin, so that the noise model is consistent between non-injection/injection twin jobs. In that case the rankingstat data is not updated or overwritten (technically the job adds counts internally, but that internal data is overwritten by the next snapshot of the rankingstat file, which is what is used for ranking-statistic evaluation).

    Ranking stat PDF file. This is used to compute the FAP and FAR of triggers. This file is made by the gstlal_inspiral_marginalize_likelihoods_online job. It’s kept in the dist_stat_pdfs directory and is named {IFOs}-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz.

    Time slide xml file. This is made in the Makefile using lalapps_gen_timeslide.

    Filename to write out ranking statistic data to. When the output filename is the same as the input filename (as is usually the case for the online analysis), the job overwrites/updates the file that was given as the ranking statistic input, i.e., the one made by create_prior_dist_stats in the dist_stats directory.

    For injection jobs, the rankingstat data is not written out anywhere, because the collected background is nonsense.

    Zero-lag ranking stat PDF. This is a histogram of the likelihood ratios of zero-lag triggers collected in the filtering. It gets written at start-up and updated as the job continually runs. These go in the zerolag_dist_stat_pdfs directory and are named {IFOs}-{SVD_GROUP_NUM}_GSTLAL_ZEROLAG_DIST_STAT_PDFS-0-0.xml.gz.

    Trigger files get written out to the directories gracedb_uploads/{GPS_TIME_STAMP}/ FIXME LINK TO TABLE DEFINITIONS

  • Http requests

    The job’s URL is advertised in a registry file in the top level of the analysis run directory. Data can be requested from the job at this URL using bottle.

  • Kafka topics

    Output topics: all scientific metric topics, data quality metric topics, latency metric topics, and monitoring topics. See https://gwsci.org/ops/diagrams#kafka.

Analyze gravitational wave strain data in real time, filtering it with an SVD bank of CBC waveforms to generate event triggers. In the online mode, the inspiral jobs assign likelihoods and compute the FAR of triggers.
Notes:
  1. Data source - what are all the different options for this? FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  2. Channel names FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  3. State channel names FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  4. DQ channel names FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  5. State vector on and off bits FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  6. Shared memory partition FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  7. FAR threshold for uploads FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  8. Group, pipeline, search FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  9. Labels FIXME LINK TO OPS PAGE SOURCE OF TRUTH
  10. Service URL (GraceDb, playground, test, etc.) FIXME LINK TO OPS PAGE SOURCE OF TRUTH
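The file-naming conventions above follow a fixed pattern; this hypothetical helper (the function names are ours, not part of gstlal) just makes the convention explicit:

```python
# Hypothetical helpers illustrating the naming conventions described above.

def dist_stats_path(ifos, svd_bin):
    """Ranking statistic input/output file for one SVD bin,
    e.g. dist_stats/H1L1V1-0915_GSTLAL_DIST_STATS-0-0.xml.gz"""
    return "dist_stats/%s-%s_GSTLAL_DIST_STATS-0-0.xml.gz" % (ifos, svd_bin)

def zerolag_pdf_path(ifos, svd_bin):
    """Per-bin zero-lag ranking stat PDF written by each inspiral job."""
    return ("zerolag_dist_stat_pdfs/"
            "%s-%s_GSTLAL_ZEROLAG_DIST_STAT_PDFS-0-0.xml.gz" % (ifos, svd_bin))

def marg_pdf_path(ifos):
    """Marginalized ranking stat PDF (all SVD bins combined)."""
    return "dist_stat_pdfs/%s-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz" % ifos

print(dist_stats_path("H1L1V1", "0915"))
# dist_stats/H1L1V1-0915_GSTLAL_DIST_STATS-0-0.xml.gz
```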


gstlal_inspiral_marginalize_likelihoods_online
     
  • config
  • doc
  • source
  • Disk I/O

    Input: Text files containing the URL of a web server from which to retrieve likelihood data from a particular job. These are named as {JOB_NUM}_noninj_registry.txt (there are also {JOB_NUM}_inj_registry.txt for injection gstlal inspiral jobs but these are not used by the marginalize likelihoods job) and are kept in the top level of the run directory.

    Output: Name of an xml file to write out marginalized ranking statistic PDFs to. This file gets written to the dist_stat_pdfs directory and we use the naming convention {IFOs}-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz. This histogram contains noise and zerolag triggers collected from all the inspiral jobs. These zerolag counts are used to apply the extinction model (which simulates the clustering effect on the ranking statistic distribution based on the zerolag counts and applies it to the noise counts) when assigning FAP/FAR in the online configuration.

  • Http requests

    Registry files for all (non-injection) gstlal inspiral jobs, which provide a URL to request data from. The ranking_data.xml files are requested from these URLs.

Calculate ranking statistic PDFs from each running gstlal inspiral job and marginalize the PDFs (i.e., add the histograms) across all SVD bins. This is repeated in a continuous loop; one round of marginalization across all bins takes about 4-8 hours.
Notes:
  1. Ranking stats from each gstlal inspiral job are gathered via HTTP request.
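The marginalization step itself is conceptually simple: add the per-bin histograms together. The sketch below shows only that part, with plain lists standing in for the binned PDFs; the XML I/O and the HTTP retrieval from the registry URLs are omitted.

```python
# Conceptual sketch: the real job retrieves ranking_data.xml from each
# inspiral job's URL (found in the *_noninj_registry.txt files) and adds
# the binned PDFs together. The list representation is illustrative.

def marginalize(per_bin_pdfs):
    """Element-wise sum of equally-binned ranking stat histograms,
    one histogram per SVD bin."""
    total = [0.0] * len(per_bin_pdfs[0])
    for pdf in per_bin_pdfs:
        for i, count in enumerate(pdf):
            total[i] += count
    return total

# Three SVD bins, four likelihood-ratio bins each:
print(marginalize([[1, 0, 2, 0], [0, 1, 1, 0], [2, 2, 0, 1]]))
# [3.0, 3.0, 3.0, 1.0]
```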


gstlal_ll_dq
     
Produce noise and range history metrics for each IFO.
Notes:


gstlal_ll_inspiral_event_plotter
     
  • config
  • doc
  • source
  • Disk I/O

    Optionally write out plots to disk if --output-path is provided.

  • Kafka topics

    Input topic: gstlal.<analysis_tag>.inj_uploads OR gstlal.<analysis_tag>.uploads

    Input topic: gstlal.<analysis_tag>.inj_ranking_stat OR gstlal.<analysis_tag>.ranking_stat

Ingest messages from the uploads and ranking stat Kafka topics, storing the event info in a dictionary keyed by event GPS time and SVD bank bin. Handle the stored event messages by uploading all of the auxiliary files and plots to the event on GraceDb. This includes:

  • The ranking statistic data file (ranking_data.xml.gz)
  • Ranking statistic plots (background (noise) PDF, injection (signal) PDF, zero-lag (candidates) PDF, likelihood ratio, likelihood ratio CCDF, horizon distance vs. time, rates), made by the functions in plots/far.py
  • PSD plots, made by the functions in plots/psd.py
  • SNR timeseries plots
Notes:
  1. Kafka URL
  2. GraceDb group, pipeline, search and service url to use
  3. No file outputs, unless `output-path` is set by user, then plots are saved to disk in addition to being uploaded to GraceDb.
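The event store described above can be sketched as a dictionary keyed by (GPS time, SVD bin), with messages from both input topics merged under one key. The message fields here are illustrative, not the actual JSON schema.

```python
# Sketch of the in-memory event store: messages from the uploads and
# ranking_stat topics that refer to the same event share a key and are
# merged. Field names are illustrative.

events = {}

def store(msg):
    """Merge an incoming message into the event keyed by (time, bin)."""
    key = (msg["time"], msg["bin"])
    events.setdefault(key, {}).update(msg)

store({"time": 1336273064, "bin": "0918", "gid": "G12345"})
store({"time": 1336273064, "bin": "0918", "ranking_data": "ranking_data.xml.gz"})
print(sorted(events[(1336273064, "0918")]))
# ['bin', 'gid', 'ranking_data', 'time']
```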


gstlal_ll_inspiral_event_uploader
     
  • config
  • doc
  • source
  • Kafka topics

    Input topics: gstlal.<analysis_tag>.inj_events OR gstlal.<analysis_tag>.events

    Output topics: gstlal.<analysis_tag>.inj_favored_events OR gstlal.<analysis_tag>.favored_events

Ingest messages via Kafka from the events topic (produced by the gstlal inspiral jobs) for each trigger generated. These messages are JSON packets containing the trigger GPS times, FAR, SNR, and coinc file (including the PSD). Group these individual triggers into event candidates using a time window around the coalescence time, and store them in a dictionary. Handle the stored event candidates: when there is a new candidate, or enough time has passed to upload a new event for an existing candidate, process the candidates by choosing a “favored event” (the default favored-event function is maximum SNR / minimum FAR; composite event aggregation is also supported) and send the favored event information (time, SNR, FAR, PSD and coinc files) to a Kafka topic. Upload the event to GraceDb and send a message with the GraceDB event ID, coinc file, and event time to the Kafka uploads topic.
Notes:
  1. Kafka URL and topic to consume messages from. We consume messages from the events topic which are sent by the inspiral jobs.
  2. GraceDb group, pipeline, search and service url to use in uploading events.
  3. Trials factor on the FAR. The FAR threshold will be FAR / trials factor, where the trials factor usually corresponds to the number of independent online pipelines, i.e., 5: CWB, GstLAL, MBTA, PyCBC, SPIIR. FIXME not used.
  4. Upload cadence - determines how long to wait between sending multiple events for the same event window.
  5. No file outputs.
  6. Produces Kafka messages to the `favored events`, and `uploads` topic.
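The favored-event choice under the default policy (minimum FAR, then maximum SNR) can be sketched as follows; the trigger field names are illustrative, not the actual JSON schema, and the real uploader also supports composite aggregation.

```python
# Sketch of the default "favored event" selection described above.

def favored_event(candidates):
    """Pick the trigger with the smallest FAR, breaking ties by
    highest SNR."""
    return min(candidates, key=lambda t: (t["far"], -t["snr"]))

triggers = [
    {"bin": "0358", "time": 1336273064.2, "far": 1e-7, "snr": 9.1},
    {"bin": "0621", "time": 1336273064.3, "far": 1e-9, "snr": 8.4},
    {"bin": "0918", "time": 1336273064.3, "far": 1e-9, "snr": 12.0},
]
print(favored_event(triggers)["bin"])  # 0918: lowest FAR, then highest SNR
```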


gstlal_ll_inspiral_pastro_uploader
     
  • config
  • doc
  • source
  • Disk I/O

    Input file: p(astro) model file including some pre-computed data necessary to compute p(astros)

    Input file: marginalized ranking stat PDF - this file is read in every four hours so that the p(astro) model can be updated with the latest ranking stat signal and noise model.

    Output file: p(astro) model file - written out to disk every time the ranking stat information is updated.

  • Kafka topics

    Input topics: gstlal.<analysis_tag>.inj_uploads OR gstlal.<analysis_tag>.uploads

Ingest messages from the uploads Kafka topic, storing the event info in a list. Handle the stored event messages: compute the p(astro) values (pTerrestrial, plus the probability of each source class; for AllSky these are pBNS, pBBH, and pNSBH, though the EarlyWarning and LowMass searches may compute only a subset according to their parameter space). Upload the p(astro) json file as gstlal.p_astro.json to the event on GraceDB and apply the label PASTRO_READY to the event.
Notes:
  1. Kafka URL
  2. GraceDb group, pipeline, search and service url to use
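The shape of the gstlal.p_astro.json payload can be illustrated as below. This is not the actual p(astro) computation (which uses the pre-computed model file and the latest ranking stat PDFs); it only shows that the output is pTerrestrial plus one probability per source class, summing to 1. The weights are made up.

```python
# Illustrative only: terrestrial plus per-class probabilities,
# normalized to sum to 1. The input weights are invented.

def normalize_pastro(weights):
    """Turn unnormalized class weights into probabilities."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

p = normalize_pastro({"Terrestrial": 0.02, "BNS": 0.9, "NSBH": 0.05, "BBH": 0.03})
print(round(sum(p.values()), 10))  # 1.0
```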


gstlal_ll_inspiral_trigger_counter
     
  • config
  • doc
  • source
  • Disk I/O

    Output: XML filename to write out the zero-lag ranking statistic data to. We set this to go in the zerolag_dist_stat_pdfs directory, and use the file name {IFOS}-GSTLAL_ZEROLAG_DIST_STAT_PDFS-0-0.xml.gz

Collect zero-lag triggers from the inspiral jobs via the Kafka events topic. These messages contain the GPS times, FAR, SNR, coinc file, PSD file, and p(astro) file of triggers. Cluster triggers over 10 second windows by max likelihood. Write out the zero-lag counts histogram to disk at an interval set by the user.
Notes:
  1. Kafka URL and topic to consume messages from.
  2. Specify the output period, how often to write out the zero-lag counts histogram to disk.
  3. Bootstrap file. The program will try to load the specified output file first to get initial counts; if that file doesn’t exist, it uses the bootstrap file instead to start up.
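The 10 s clustering described above (keep only the highest-likelihood trigger per window) can be sketched as follows. This is a simplification of the real implementation, and the trigger fields are illustrative.

```python
# Sketch of max-likelihood clustering over 10 second windows.

def cluster(triggers, window=10.0):
    """Walk time-sorted triggers; merge any trigger within `window`
    seconds of the current cluster representative, keeping the one
    with the higher likelihood."""
    clusters = []
    for trig in sorted(triggers, key=lambda t: t["time"]):
        if clusters and trig["time"] - clusters[-1]["time"] < window:
            if trig["likelihood"] > clusters[-1]["likelihood"]:
                clusters[-1] = trig
        else:
            clusters.append(trig)
    return clusters

trigs = [
    {"time": 100.0, "likelihood": 5.0},
    {"time": 104.0, "likelihood": 9.0},
    {"time": 120.0, "likelihood": 3.0},
]
print([t["likelihood"] for t in cluster(trigs)])  # [9.0, 3.0]
```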


scald_event_collector
     
Notes:
  1. YAML configuration file from the `web` directory, sets dashboard and plotting options, also the data backend (Influx DB to store metrics in), and schemas
  2. Kafka URI to get messages from
  3. Data type. Always triggers
  4. topics to subscribe to
  5. one schema per topic indicating metrics to aggregate
  6. No file outputs.


scald_metric_collector
     
Aggregate metric timeseries across jobs (per second?) and store them in an Influx database. These metrics are things like FAR and likelihood history, SNR history for each IFO, latency history, etc. These are used to populate the Grafana dashboards for monitoring. The metrics are stored essentially forever in an Influx DB. These messages are sent to Kafka by the gstlal inspiral jobs; see the update function in EyeCandy (part of the LLOIDTracker).
Notes:
  1. YAML configuration file from the `web` directory, sets dashboard and plotting options, also the data backend (Influx DB to store metrics in), and schemas (metrics to aggregate, eg. FAR history or SNR history, along with aggregation type (min, max) etc.)
  2. Kafka URI to get messages from, topics to subscribe to, one schema per topic indicating metrics to aggregate
  3. No file outputs.
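A toy version of the aggregation step: samples arriving within the same second are reduced to a single value per metric using the aggregation type named in the schema (e.g. max or min). The details below are illustrative, not scald's actual implementation.

```python
# Toy per-second aggregation before storage in InfluxDB.

def aggregate(samples, reduce=max):
    """samples: (timestamp, value) pairs -> one value per integer
    second, combined with the given reduction (max by default)."""
    out = {}
    for t, v in samples:
        sec = int(t)
        out[sec] = v if sec not in out else reduce(out[sec], v)
    return out

print(aggregate([(100.1, 8.0), (100.7, 9.5), (101.2, 7.0)]))
# {100: 9.5, 101: 7.0}
```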


Kafka topics


Notes:

  • <analysis_tag> is a user provided string, e.g., “mario_MDC04”, giving topic names like “gstlal.inspiral_mario_MDC04.…”
  • <inj> is the optional prefix “inj_”, present for injection runs and absent for non-injection runs
  • <ifo> is a particular detector, e.g., “L1” or “K1”
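The topic names listed below follow a simple pattern; this hypothetical helper (not part of gstlal) just makes the convention explicit:

```python
# Hypothetical helper assembling topic names from the parts above.

def topic(analysis_tag, name, inj=False, ifo=None, group="inspiral"):
    """Build a topic name like gstlal.inspiral_<tag>.inj_<ifo>_<name>;
    group is "inspiral" or "testsuite"."""
    parts = []
    if inj:
        parts.append("inj")
    if ifo:
        parts.append(ifo)
    parts.append(name)
    return "gstlal.%s_%s.%s" % (group, analysis_tag, "_".join(parts))

print(topic("mario_MDC04", "snr_history", inj=True, ifo="L1"))
# gstlal.inspiral_mario_MDC04.inj_L1_snr_history
```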

Scientific metric topics:

  • gstlal.inspiral_<analysis tag>.far_history:
  • gstlal.inspiral_<analysis tag>.likelihood_history:
  • gstlal.inspiral_<analysis tag>.inj_likelihood_history:
  • gstlal.inspiral_<analysis tag>.snr_history:
  • gstlal.inspiral_<analysis tag>.inj_snr_history:
  • gstlal.inspiral_<analysis tag>.<ifo>_snr_history:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_snr_history:
  • gstlal.testsuite_<analysis tag>.<ifo>_psd:
  • gstlal.inspiral_<analysis tag>.coinc:

Data quality metric topics:

  • gstlal.inspiral_<analysis tag>.inj_<ifo>_inj_dqvectorsegments:
  • gstlal.inspiral_<analysis tag>.<ifo>_dqvectorsegments:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_dqvectorsegments:
  • gstlal.testsuite_<analysis tag>.<ifo>_dqvectorsegments:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_whitehtsegments:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_inj_whitehtsegments:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_inj_statevectorsegments:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_statevectorsegments:
  • gstlal.inspiral_<analysis tag>.<ifo>_statevectorsegments:
  • gstlal.testsuite_<analysis tag>.<ifo>_statevectorsegments:
  • gstlal.inspiral_<analysis tag>.<ifo>_strain_dropped:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_strain_dropped:
  • gstlal.inspiral_<analysis tag>.<ifo>_noise:

Latency metric topics:

  • gstlal.inspiral_<analysis tag>.<ifo>_snrSlice_latency:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_snrSlice_latency:
  • gstlal.inspiral_<analysis tag>.<ifo>_datasource_latency:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_datasource_latency:
  • gstlal.inspiral_<analysis tag>.inj_latency_history:
  • gstlal.inspiral_<analysis tag>.latency_history:
  • gstlal.inspiral_<analysis tag>.<ifo>_whitening_latency:
  • gstlal.inspiral_<analysis tag>.inj_<ifo>_whitening_latency:
  • gstlal.inspiral_<analysis tag>.inj_all_itacac_latency:
  • gstlal.inspiral_<analysis tag>.all_itacac_latency:
  • gstlal.inspiral_<analysis tag>._all_itacac_latency:

Event topics:

  • gstlal.inspiral_<analysis tag>.favored_events:
  • gstlal.inspiral_<analysis tag>.inj_events:
  • gstlal.inspiral_<analysis tag>.uploads:
  • gstlal.inspiral_<analysis tag>.events:
  • gstlal.inspiral_<analysis tag>.p_astro:
  • gstlal.inspiral_<analysis tag>.ranking_stat:
  • gstlal.inspiral_<analysis tag>.ram_history:

Monitoring topics:

  • gstlal.inspiral_<analysis tag>.inj_ram_history:
  • gstlal.inspiral_<analysis tag>.ram_history:
  • gstlal.inspiral_<analysis tag>.inj_uptime:
  • gstlal.inspiral_<analysis tag>.uptime:



On disk layout


A shared file system is used to store configuration data, archives of trigger outputs, and occasionally to pass information between running jobs (though the low-latency information is typically passed via HTTP or Kafka).

  • archive: empty?
  • dtdphi: Makefile only?
  • mass_model: H1L1V1-GSTLAL_MASS_MODEL-0-0.xml.gz: wrong file extension??? and Makefile
  • profiles: empty?
  • svd:
  • cit_mario_online.yml
  • ics_online.yml
  • psd: H1L1V1-GSTLAL_REFERENCE_PSD-0-0.xml.gz Makefile
  • influx_creds.sh
  • bank: bbh_low_q.xml.gz bns.xml.gz imbh_low_q.xml.gz Makefile mario_bros_offline.xml.gz nsbh.xml.gz other_bbh.xml.gz
  • svd_bank: empty???
  • H1L1V1-GSTLAL_REFERENCE_PSD-0-0.xml.gz
  • Makefile
  • env.sh
  • H1L1V1-GSTLAL_SVD_MANIFEST-0-0.json
  • tisi.xml
  • filter: contains e.g., svd_bank/H1-0358_GSTLAL_SVD_BANK-0-0.xml.gz
  • nohup.out
  • split_bank: contains e.g., H1L1V1-0191_GSTLAL_SPLIT_BANK_0577-0-0.xml.gz
  • aggregator: contains e.g., /1/3/3/6/2/7/V1-PSD-1336279900-100.hdf5 do we need these at all?
  • 13362: contains e.g., H1L1V1-0918_inj_mdc04_LLOID-1336273048-14933.xml.gz H1L1V1-0918_inj_mdc04_LLOID-1336273048-60.xml.gz H1L1V1-0918_inj_mdc04_SEGMENTS-1336273048-9028.xml.gz H1L1V1-0918_inj_mdc04_SEGMENTS-1336273108-0.xml.gz H1L1V1-0918_noninj_LLOID-1336273064-14864.xml.gz H1L1V1-0918_noninj_LLOID-1336273064-9.xml.gz H1L1V1-0918_noninj_LLOID-1336287922-14406.xml.gz H1L1V1-0918_noninj_LLOID_DISTSTATS-1336273064-10.xml.gz H1L1V1-0918_noninj_LLOID_DISTSTATS-1336273064-14865.xml.gz H1L1V1-0918_noninj_LLOID_DISTSTATS-1336287922-14407.xml.gz H1L1V1-0918_noninj_SEGMENTS-1336273064-9012.xml.gz H1L1V1-0918_noninj_SEGMENTS-1336273074-0.xml.gz H1L1V1-0918_noninj_SEGMENTS-1336287922-14399.xml.gz
  • plots: contains e.g., COMBINED-GSTLAL_INSPIRAL_PLOT_BACKGROUND_ALL_NOISE_LIKELIHOOD_RATIO_CCDF_CLOSED_BOX-1336272983-114190.png
  • config.yml
  • web: contains e.g., inspiral.yml online_dashboard.json
  • test-suite: is this the test suite dag? Is this the preferred way to run it? Can we point to test-suite specific documentation?
  • dist_stat_pdfs: contains e.g., H1L1V1-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz
  • gracedb_uploads: contains e.g., 13372/H1L1V1-GSTLAL_0621_inj_mdc04_7_945_CBC_AllSky_0621_RankingData-1337299900-1.xml.gz 13372/H1L1V1-GSTLAL_0621_inj_mdc04_7_945_CBC_AllSky-1337299900-1.xml Where is pastro??
  • dist_stats: contains e.g., H1L1V1-0915_GSTLAL_DIST_STATS-0-0.xml.gz
  • logs
  • zerolag_dist_stat_pdfs: contains e.g., H1L1V1-0406_GSTLAL_ZEROLAG_DIST_STAT_PDFS-0-0.xml.gz



HTTP traffic


FIXME


Important References and Resources





Software and service stack


The GstLAL online analysis relies on several open source software libraries. Some of these are available in the gwsci container, but some are not. Namely, the following services are required beyond the software in the gwsci container:

  1. Kafka - used to stream data products for I/O between different processes
  2. InfluxDB - used to store metric data
  3. Grafana - used to visualize metric data

Additionally, there is an implicit assumption that you are deploying this analysis on an LDG-compatible site running HTCondor, with low-latency data services running (a few different flavors are supported).