From our discussion in today's team meeting, it would be helpful to discuss further (especially for me) where we end up placing the bank/accounting module within flux-framework: in flux-core, flux-sched, or as its own separate repo.
Other notes from today's meeting:
In flux-core, we plan to create a porcelain job listing utility which can hopefully leverage the job-event listener module above, and the requirements of this utility should dictate the extent of job query functionality that we provide in flux-core.
Beyond that, does it seem feasible to create a bank/accounting library as a separate project, which perhaps takes job data as input from an arbitrary source, and then provides an API to query the accounting and fair share bank information that will be needed by a fully functional scheduler? When used with Flux, the job database module could feed the bank/accounting db, while in Kubernetes or wherever else flux-sched is used, job data can be inserted in whatever way makes sense there.
I'm not clear on the requirements driving the use of flux-sched outside of Flux, so I'm not sure I can answer that question fully at this time.
Thanks @cmoussa1 and @grondo.
The main use case coming up is that other resource managers (especially Kubernetes) want to use the core software layers within flux-sched, in particular our resource graph infrastructure.
In general, I expect demand from external projects to use "flux components" to increase in the future.
For flux-sched, we have been very careful to keep a good separation of concerns in our software architecture to enable this. In fact, the only major components that depend on flux-core at this point are the modules (qmanager and resource), each of which just assembles the resource and queue policy "library components" and wires them into flux-core.
Ultimately, I'd like to see those library components as submodules of flux-sched or even separate projects within flux-framework.
Because banking and accounting are inherently scheduler components, I do hope that we can architect them the same way: separate the core logic into a library component that lives in its own project, and have the module within flux-core be a thin wrapper around that library.
In any case, having a good separation of concerns so that a component like this can be used for multiple purposes and be tested with multiple clients would be a decent goal.
BTW some of the WCI workflow discussions are also centered around that.
> Beyond that, does it seem feasible to create a bank/accounting library as a separate project, which perhaps takes job data as input from an arbitrary source, and then provides an API to query the accounting and fair share bank information that will be needed by a fully functional scheduler?
At the risk of making this sound too simplistic, if we consider the input and output of this library as character strings (e.g., JSON), I think this should be tractable.
Obviously, one issue would be how effectively we can test this library component as a standalone tool. Some of its unit test cases will want to read inputs written in flux-core's job schema format, and we will need to adjust these inputs over time as flux-core's job schema continues to evolve. But at the same time, this will ultimately make our testing more effective by allowing multi-level testing: the library's unit tests are the first line of defense, and integration tests in flux-core provide the second level.
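To make the "JSON strings in, JSON strings out" idea concrete, here is a minimal sketch of what such a standalone library surface might look like. Everything in it is an assumption for illustration: the class name `BankAccounting`, the methods `ingest` and `query_usage`, and the job fields (`user`, `bank`, `nnodes`, `t_start`, `t_end`) are hypothetical and are not flux-core's job schema.

```python
import json
from collections import defaultdict


class BankAccounting:
    """Hypothetical standalone accounting library: JSON strings in, JSON strings out."""

    def __init__(self):
        # Accumulated usage in node-seconds, keyed by (bank, user).
        self._usage = defaultdict(float)

    def ingest(self, job_json: str) -> None:
        """Record one completed job. The field names are assumed, not flux-core's."""
        job = json.loads(job_json)
        elapsed = job["t_end"] - job["t_start"]
        self._usage[(job["bank"], job["user"])] += elapsed * job["nnodes"]

    def query_usage(self, bank: str) -> str:
        """Return per-user usage for one bank as a JSON string."""
        users = {u: v for (b, u), v in self._usage.items() if b == bank}
        return json.dumps({"bank": bank, "usage": users}, sort_keys=True)


if __name__ == "__main__":
    ba = BankAccounting()
    ba.ingest('{"user": "alice", "bank": "physics", "nnodes": 2, "t_start": 0, "t_end": 60}')
    print(ba.query_usage("physics"))
```

Because both sides of the interface are plain strings, a unit test can feed the library a canned job record without any running Flux instance, which is the first-level testing described above.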
My only concern is that we avoid duplication of effort as a result of splitting the bank/accounting project into these two components:
It seems like a bank/acct db could leverage the jobdb required in 1. above; as a separate project, however, it would have to implement its own separate yet very similar db.
I admit I haven't thought about the design enough to know if my concern is valid. Just wanted to express it to get others' input.
- job event listener and simple jobdb in flux-core
To the extent I can envision it, the job event listener should reside in the BA module component within flux-core. The BA library should assume that the job-data injection point will be determined by its client. In the case of flux-core, injection will be done by the core-side BA module.
Other clients (say Kubernetes) may use a mechanism other than notification, for example polling and injecting job data every n seconds. It would be best to design the BA library to support both. In fact, wouldn't it be ideal for the BA library to treat how the job data are ingested as a black box?
What is the simple jobdb? I was envisioning that the job db belongs to the BA library, so that one can swap its back-end DB in and out for further optimization in the future without worrying about how it is currently being used. Do we have a jobdb already in flux-core?
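As a rough sketch of the "black box" ingestion idea, the library could expose a single injection point and leave it to each client whether records are pushed on notification or pulled by polling. The names `JobSink`, `on_job_event`, and `poll_once`, and the `id` field, are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, List

JobRecord = Dict[str, object]


class JobSink:
    """Stand-in for the BA library's single injection point (hypothetical)."""

    def __init__(self) -> None:
        self.records: List[JobRecord] = []
        self._seen = set()

    def ingest(self, records: Iterable[JobRecord]) -> int:
        """Accept records from any source; skip ids already seen so that
        push and poll clients can overlap without double counting."""
        added = 0
        for rec in records:
            if rec["id"] not in self._seen:
                self._seen.add(rec["id"])
                self.records.append(rec)
                added += 1
        return added


def on_job_event(sink: JobSink, event: JobRecord) -> None:
    """Push style: a client-side listener forwards each event as it arrives."""
    sink.ingest([event])


def poll_once(sink: JobSink, fetch: Callable[[], Iterable[JobRecord]]) -> int:
    """Poll style: the client calls this every n seconds with its own fetcher."""
    return sink.ingest(fetch())
```

In this sketch the library never learns which style is in use: a core-side module would call `on_job_event` from its listener, while another client would call `poll_once` on a timer with whatever fetcher suits its environment.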
> Do we have a jobdb already in flux-core?

No, but it is a requirement in order to support a simple job listing and query tool.
I think it will have very similar data as bank/acct db. Maybe that is ok? Since BA data does not have to be updated in real time, maybe the ingest of job data for the BA library could always be polling using a configured custom job data "fetcher" (default can fetch directly from flux-core jobdb or equivalent).
> I think it will have very similar data as bank/acct db. Maybe that is ok?

I don't know. But there seems to be a benefit to having a two-level strategy: the simple jobdb represents the first-level "cache" with frequent updates, and the BA db represents the second level with a good degree of record merging and aggregation.
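A minimal sketch of that two-level idea, assuming SQLite and a made-up schema: a `jobs` table plays the frequently updated first-level store, and a periodic `merge` step folds unmerged rows into an aggregated `usage` table. For brevity both tables live in one in-memory database here, whereas the discussion has them in separate components; the table names, column names, and functions are all hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # First level: one row per job, written frequently.
    db.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, username TEXT, bank TEXT, "
        "node_seconds REAL, merged INTEGER DEFAULT 0)"
    )
    # Second level: one aggregated row per (username, bank).
    db.execute(
        "CREATE TABLE usage (username TEXT, bank TEXT, node_seconds REAL, "
        "PRIMARY KEY (username, bank))"
    )
    return db


def record_job(db, job_id, username, bank, node_seconds) -> None:
    db.execute(
        "INSERT INTO jobs (id, username, bank, node_seconds) VALUES (?, ?, ?, ?)",
        (job_id, username, bank, node_seconds),
    )


def merge(db) -> int:
    """Fold unmerged job rows into the aggregate table; return groups merged."""
    rows = db.execute(
        "SELECT username, bank, SUM(node_seconds) FROM jobs "
        "WHERE merged = 0 GROUP BY username, bank"
    ).fetchall()
    for username, bank, total in rows:
        cur = db.execute(
            "UPDATE usage SET node_seconds = node_seconds + ? "
            "WHERE username = ? AND bank = ?",
            (total, username, bank),
        )
        if cur.rowcount == 0:  # no aggregate row yet for this pair
            db.execute(
                "INSERT INTO usage (username, bank, node_seconds) VALUES (?, ?, ?)",
                (username, bank, total),
            )
    db.execute("UPDATE jobs SET merged = 1 WHERE merged = 0")
    db.commit()
    return len(rows)
```

Since accounting data does not need to be real time, `merge` could run on whatever interval the client configures, and the first-level table stays free to serve job listing queries in between.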
As discussed in today's meeting (and earlier with @garlick), let's split off the bank/accounting "library" into an independent project, and in flux-core design and implement a job database service which can end up as the source of data for accounting/db (or first level cache as @dongahn put it).
Other services can make use of the job database (e.g. porcelain job query utility), so it makes sense to establish this service in core.
We should try to do some high-level design work on the bank/accounting API, specifically how it might obtain data from this new flux-core service.