Finding a replacement for the existing job_runner

Replacement for the existing job_runner

1   Introduction

This document is mandated by https://jira.usit.uio.no/browse/CRB-1286.

It aims to produce suggestions for a new and improved job-manager tool.

2   Goals for the new job-manager tool

We set the following goals for the new job-manager tool:

  • covering the current functionality of job_runner.py (developed at UiO)
  • free and actively developed software
  • providing a more robust interface for adding scheduled tasks for and around Cerebrum
  • providing the users (KIA) with the maximum amount of ease and convenience without compromising the criteria set above

3   Research and job-candidates :)

After searching for tools and comparing user experiences and feedback, the only candidate we settled on is luigi.

After internal discussions within INT, APScheduler was suggested as a complementary framework (library) for luigi.

3.1   Advantages with luigi

  • locking (the same task will not run twice)
  • visualization (web interface for status and dependency graph)
  • dependencies and parameters (the output of a dependency task can be "fed" directly as input to the task that requires it)
  • status / progress (API for progress status)
  • "built-in event system that allows you to register callbacks to events and trigger them from your own tasks." - luigi doc.
  • notifications (perhaps a smoother integration with Zabbix?)
  • task priority

3.2   Disadvantages

  • rewriting most cronjobs as luigi tasks is time consuming
  • "the dependencies are decentralized" - luigi doc. (not really a disadvantage)
  • "its focus is on batch processing so it’s probably less useful for near real-time pipelines or continuously running processes." - luigi doc. (not really a disadvantage)
  • "luigi does not come with built-in triggering, and you still need to rely on something like crontab to trigger workflows periodically." - luigi doc.

4   Recommendations

  • creating a fully working POC in the Cerebrum environment, with APScheduler taking the role of crontab
  • gradually migrating from job_runner.py to luigi tasks if the POC result is satisfactory
  • using luigi as a replacement for the cronjob tasks, while considering another framework, perhaps even systemd, for long-running services (processes that must keep running)
By int
Published 19 Jan. 2018 10:08