1. 18 Jul, 2021 2 commits
    • Split storage box monitoring into new text collector · c844d0cb
      Evangelos Foutras authored
      The storage box was previously monitored as part of the borg text
      collector, but now that the borg collector only runs after each backup
      (instead of hourly), the stats on monitoring.archlinux.org do not remain
      accurate for long. Switch back to hourly checks of the storage box's
      disk usage by adding a new text collector just for this purpose.
    • Run borg-textcollector after each backup completes · 68def695
      Evangelos Foutras authored
      Instead of gathering borg statistics every hour or so, run the text
      collector script only once after each borg-backup service finishes.
      
      Also split the borg text collector script into two similar scripts,
      where each one gathers borg statistics for its respective borg host.
  2. 12 Jul, 2021 1 commit
  3. 10 Jul, 2021 1 commit
  4. 01 Jul, 2021 1 commit
  5. 30 Jun, 2021 1 commit
  6. 24 Jun, 2021 1 commit
  7. 23 May, 2021 1 commit
  8. 13 May, 2021 2 commits
  9. 08 Apr, 2021 1 commit
  10. 01 Mar, 2021 1 commit
  11. 26 Feb, 2021 1 commit
  12. 26 Jan, 2021 1 commit
    • Add a btrfs prometheus exporter · 8ea35153
      Jelle van der Waa authored
      Collect btrfs device errors for Prometheus using the btrfs command from
      btrfs-progs, which has supported JSON output for device stats since
      5.10. In the future, the collected errors will trigger an alert when
      they reach a certain threshold.
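A minimal sketch of how such a collector could turn the JSON device stats into Prometheus text format. This is not the code from the commit: the metric name `btrfs_device_errors`, its labels, and the assumed JSON layout (a `device-stats` list whose counters are strings) are illustrative assumptions.

```python
import json

# Error counters expected in each device entry (assumed field names).
ERROR_FIELDS = (
    "write_io_errs",
    "read_io_errs",
    "flush_io_errs",
    "corruption_errs",
    "generation_errs",
)


def render_btrfs_errors(stats_json: str) -> str:
    """Render JSON device stats as Prometheus text exposition lines."""
    data = json.loads(stats_json)
    lines = [
        "# HELP btrfs_device_errors Error counters from btrfs device stats",
        "# TYPE btrfs_device_errors gauge",
    ]
    # "device-stats" is the assumed top-level key holding one dict per device.
    for dev in data.get("device-stats", []):
        for field in ERROR_FIELDS:
            # Counters are assumed to arrive as strings; missing ones count as 0.
            value = int(dev.get(field, 0))
            lines.append(
                f'btrfs_device_errors{{device="{dev["device"]}",type="{field}"}} {value}'
            )
    return "\n".join(lines) + "\n"
```

The returned string could be written to a `.prom` file that the node exporter's textfile collector reads; an alert rule would then compare the gauge against the chosen threshold.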
  13. 25 Jan, 2021 1 commit
  14. 13 Jan, 2021 1 commit
  15. 14 Dec, 2020 1 commit
    • Add archive specific monitoring · 4658d36d
      Jelle van der Waa authored
      To monitor our archive mirrors and the archive size itself, a new
      textcollector has been added. This will allow us to monitor the archive
      growth and the sync rate to mirrors.
  16. 17 Oct, 2020 1 commit
  17. 06 Oct, 2020 1 commit
    • Add rebuilderd_results Prometheus metric · 7abc2500
      Jelle van der Waa authored
      To monitor if reproducible builds are going in the right direction,
      record the good/bad/unknown metrics from rebuilderd with a Prometheus
      textcollector for a Grafana dashboard to display a long term trend.
      
      A Python script is required to handle data collection, as obtaining the
      status with jq/bash is non-trivial and cannot easily collect suites and
      statuses dynamically.
      
      Closes: #146
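A minimal sketch of the dynamic counting described above, not the script from the commit. It assumes each package record is a dict with `suite` and `status` keys; the label names and the sample status values are illustrative assumptions, and only the metric name `rebuilderd_results` comes from the commit title.

```python
from collections import Counter


def render_rebuilderd_results(packages) -> str:
    """Count packages per (suite, status) and render Prometheus text lines."""
    # Suites and statuses are discovered from the data, not hard-coded.
    counts = Counter((pkg["suite"], pkg["status"]) for pkg in packages)
    lines = [
        "# HELP rebuilderd_results Number of packages per suite and status",
        "# TYPE rebuilderd_results gauge",
    ]
    # Sort for a stable output order between runs.
    for (suite, status), total in sorted(counts.items()):
        lines.append(
            f'rebuilderd_results{{suite="{suite}",status="{status}"}} {total}'
        )
    return "\n".join(lines) + "\n"
```

Because the label values come from the records themselves, a new suite or status shows up as a new time series without any change to the script, which is the part that is awkward to express in jq/bash.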
  18. 06 Sep, 2020 3 commits
    • Add rebuilderd build queue length textcollector · cd4b2844
      Jelle van der Waa authored
      Record the rebuilderd queue length in Prometheus so we can generate an
      alert when the queue length keeps rising, as this could be an
      indication that the rebuilders have builds which are stuck.
    • Add blackbox exporter for https status checking · 3fd36ddb
      Jelle van der Waa authored
      Run the blackbox exporter on monitoring.archlinux.org to monitor other
      machines' HTTP status for public services we provide. Also add an alert
      for when a certificate is about to expire in 3 days.
    • Introduce prometheus exporters role for collection · 23564b29
      Jelle van der Waa authored
      Add a new role called prometheus_exporters which should be run on every
      machine we have and starts different collectors depending on what group
      the machine is in. Currently supported are the gitlab runner exporter,
      rebuilder textcollector, mysqld-exporter, borg textcollector and a
      node/arch exporter. The arch exporter monitors the security status and
      provides a gauge of out-of-date pacman packages.