Hello
Apologies in advance if this is difficult to follow; my understanding of the inner workings of openCPI is far from perfect. To avoid confusion, I will use the word ‘application’ to refer to an openCPI Application object that has been created from the ACI and the word ‘program’ to refer to the C++ executable that is responsible for interacting with the ACI and managing one or more openCPI applications.
Currently, if an RCC worker’s run() method returns RCC_FATAL, the entire program halts. This differs from the case where a worker’s start() method returns RCC_FATAL, in which case an exception is thrown that can be caught by surrounding the app.start() ACI method call in a try-catch block.
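To make the start() case concrete, here is a minimal sketch of the pattern that does work for me. FakeApplication is a hypothetical stand-in rather than the real ACI class, and catching std::string is an assumption about how the error surfaces; the catch clause may need adjusting to whatever type the ACI actually throws.

```cpp
#include <exception>
#include <string>

// Hypothetical stand-in for an ACI application object; NOT the real OpenCPI class.
struct FakeApplication {
    bool failOnStart;
    void start() {
        // Assumption: a fatal result from a worker's start() surfaces as a thrown error.
        if (failOnStart)
            throw std::string("worker start() returned RCC_FATAL");
    }
};

// Returns true if the application started; on failure, stores the message
// in `error` and returns false so the calling program can carry on.
bool tryStart(FakeApplication &app, std::string &error) {
    try {
        app.start();
        return true;
    } catch (const std::string &e) {    // assumption: error thrown as std::string
        error = e;
        return false;
    } catch (const std::exception &e) { // fallback for standard exception types
        error = e.what();
        return false;
    }
}
```

With this shape, a test program can record the failure for one application and move on to the next, which is what I would like to be able to do for run() as well.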
I discovered this whilst attempting to create a test program that runs multiple test applications, some of which purposely cause a worker to fail. Currently the entire program halts when one of these tests runs, preventing any further tests from running.
I believe this issue is a result of the worker’s run method being called within a container that runs in its own thread. The exception is caught within the runContainer method and printed to the console, after which abort() is called. Surely the container should instead signal that the exception should be rethrown in the main thread? The code responsible for this is in opencpi/runtime/container/src/Container.cc on lines 237-239.
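As an illustration of the kind of hand-off I have in mind, standard C++ allows an exception caught in one thread to be stored in a std::exception_ptr and rethrown in another. The sketch below is not OpenCPI code: runWorkerOnce, containerThread and runAndReport are hypothetical names, and I do not know how well this would fit the real container design.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for a worker's run method that fails fatally.
void runWorkerOnce() {
    throw std::runtime_error("worker run() returned RCC_FATAL");
}

// Hypothetical container loop: rather than printing and calling abort(),
// it stores the in-flight exception so another thread can rethrow it.
void containerThread(std::exception_ptr &failure) {
    try {
        runWorkerOnce();
    } catch (...) {
        failure = std::current_exception();
    }
}

// Runs the container in its own thread, then rethrows any stored exception
// in the calling thread, where an ordinary try-catch can handle it.
// Returns the error message, or an empty string if nothing failed.
std::string runAndReport() {
    std::exception_ptr failure;
    std::thread t([&failure] { containerThread(failure); });
    t.join();  // join() synchronises, so reading `failure` afterwards is safe
    try {
        if (failure)
            std::rethrow_exception(failure);
    } catch (const std::exception &e) {
        return e.what();
    }
    return "";
}
```

In a long-running container the main thread would need to poll or be notified instead of joining, but the principle of carrying the exception across threads would be the same.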
There is also strange behaviour within the Worker class with regard to how the return value of the run method is handled. If setError() is called before returning, the return value is never actually examined, and the behaviour is the same regardless of what is returned. See line 621 of opencpi/runtime/rcc/src/RccWorker.cc, which is where I believe the worker’s run method is called from. Then on line 633 checkError is called, which throws an exception if setError() has been used, before the switch statement that examines the return value is reached.
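My reading of that control flow can be modelled roughly as follows. This is a simplified, hypothetical model and not the actual RccWorker.cc code: RunResult and handleRun are names I have made up, and the real function does considerably more.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical result codes standing in for the real RCC return values.
enum class RunResult { Ok, Done, Fatal };

// Models the order of operations described above: the error check happens
// first, so the switch on the return value is only reached when setError()
// was never called.
std::string handleRun(bool setErrorCalled, RunResult returned) {
    try {
        // Stand-in for the checkError step: throws if an error was set.
        if (setErrorCalled)
            throw std::runtime_error("error set by worker");

        // Only reached when no error was set.
        switch (returned) {
        case RunResult::Ok:    return "ok";
        case RunResult::Done:  return "done";
        case RunResult::Fatal: return "fatal";
        }
    } catch (const std::runtime_error &e) {
        return std::string("exception: ") + e.what();
    }
    return "unknown";
}
```

In this model, handleRun(true, RunResult::Ok) and handleRun(true, RunResult::Fatal) give the same outcome, which matches what I observe: once setError() has been called, the returned value makes no difference.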
I experimented with the behaviour when setError() is not called and found that it differs depending on where I put breakpoints. See line 667 of RccWorker.cc. If I put a breakpoint here, the following exception is thrown in the main thread: Worker "workerName" is now unusable. If I do not have this breakpoint, then before the main thread detects that the worker is unusable, the exception is thrown on line 674; this is caught within the container, printed to the console, and abort() is called.
If anyone with a better understanding of how this all works, or is supposed to work, can suggest a solution, that would be really appreciated.
Many thanks,
Dan