You may include all the ODEs on the loss function. Depending on what you are trying to do it might be too complex for a simple model like the one described in the article. If this is the case, I would suggest using a framework, such as tensorDiffEq (https://github.com/tensordiffeq/TensorDiffEq).