Schema errors in 1.20.2-1
Description:
Since 1.20.2, whenever my program loads any model, it prints a long series of errors like this:
Schema error: Trying to register schema with name Abs (domain: version: 1) from file /usr/src/debug/onnx/onnx/onnx/defs/math/old.cc line 2729, but it is already registered from file /usr/src/debug/onnx/onnx/onnx/defs/math/old.cc line 2729
Schema error: Trying to register schema with name Add (domain: version: 1) from file /usr/src/debug/onnx/onnx/onnx/defs/math/old.cc line 2627, but it is already registered from file /usr/src/debug/onnx/onnx/onnx/defs/math/old.cc line 2627
Schema error: Trying to register schema with name And (domain: version: 1) from file /usr/src/debug/onnx/onnx/onnx/defs/logical/old.cc line 132, but it is already registered from file /usr/src/debug/onnx/onnx/onnx/defs/logical/old.cc line 132
[...]
ONNX: This is an invalid model. In Node, ("/encoder/encoder_embed/Unsqueeze", Unsqueeze, "", -1) : ("x": tensor(float),) -> ("/encoder/encoder_embed/Unsqueeze_output_0",) , Error /usr/src/debug/onnx/onnx/onnx/defs/schema.cc:1116: SchemasRegisterer: Assertion `dbg_registered_schema_count == DbgOperatorSetTracker::Instance().GetCount()` failed: 4 schema were exposed from operator sets and automatically placed into the static registry. 696 were expected based on calls to registration macros. Operator set functions may need to be updated.
Additional info:
A search of the onnxruntime repo turns up similar past reports, with developers responding that the solution is to statically link onnxruntime against an onnx built with ONNX_DISABLE_STATIC_REGISTRATION. My understanding is that both onnx and onnxruntime try to register schemas, so for onnxruntime to work, schema registration must be disabled in the onnx that onnxruntime uses; otherwise the schemas get registered twice, as in the errors above. The relevant developer comments:
You must have built the dependent libraries as shared libraries, which will break things. Please let onnx runtime static link to onnx. - https://github.com/microsoft/onnxruntime/issues/8556#issuecomment-889989120
This seems to be because both ONNX and ONNX runtime call RegisterOnnxOperatorSetSchema(), i.e. register the base schemas multiple times when it should only be done once. When ONNX is built as part of onnxruntime ONNX_DISABLE_STATIC_REGISTRATION is used, which prevents ONNX from doing its own static initialization. ONNX runtime's expectation of disabling static registration seems reasonable as it only wants its own schemas. Thus ONNX should be built with ONNX_DISABLE_STATIC_REGISTRATION=ON. https://github.com/microsoft/onnxruntime/issues/8556#issuecomment-1005672965
Besides, we build ONNX with cmake option "ONNX_DISABLE_STATIC_REGISTRATION=ON". If you used a prebuilt ONNX library from somewhere else, unlikely it was built in such a way. Then it would cause a conflict when registering schemas - https://github.com/microsoft/onnxruntime/issues/23408#issuecomment-2598899951
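To illustrate the mechanism those comments describe, here is a toy model in Python of a process-wide registry that is populated twice. It is only a sketch under my own assumptions: SchemaRegistry, register_base_schemas and the source labels are hypothetical names, not ONNX or onnxruntime code, and the real registration happens in C++ static initializers.

```python
class SchemaRegistry:
    """Toy process-wide registry keyed by (name, domain, version).

    Hypothetical stand-in for illustration; not ONNX's real implementation.
    """

    def __init__(self):
        self._sources = {}  # (name, domain, version) -> who registered it

    def register(self, name, domain, version, source):
        """Record a schema; return an error string if the key is already taken."""
        key = (name, domain, version)
        if key in self._sources:
            return (
                f"Schema error: Trying to register schema with name {name} "
                f"(domain: {domain} version: {version}) from {source}, "
                f"but it is already registered from {self._sources[key]}"
            )
        self._sources[key] = source
        return None


def register_base_schemas(registry, source):
    """Register a few base operators and collect any duplicate errors."""
    errors = []
    for name in ("Abs", "Add", "And"):
        error = registry.register(name, "", 1, source)
        if error is not None:
            errors.append(error)
    return errors


registry = SchemaRegistry()
# First pass: stands in for a shared libonnx.so registering on load.
first = register_base_schemas(registry, "shared libonnx static init")
# Second pass: stands in for onnxruntime registering the same base schemas.
second = register_base_schemas(registry, "onnxruntime's own registration")
```

Here first is empty and second holds one error per operator, which mirrors the shape of the log above. Skipping the first pass is the toy equivalent of building onnx with ONNX_DISABLE_STATIC_REGISTRATION=ON.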
In 1.20.2, ldd /usr/lib/libonnxruntime.so lists libonnx.so as a dependency, but this is not the case if I downgrade to 1.19.2.
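For anyone who wants to script this check across package versions, a minimal Python sketch follows. The helper names (linked_libraries, links_against, onnxruntime_links_onnx) are my own, and it assumes ldd is available and that the library is installed at /usr/lib/libonnxruntime.so as on my system.

```python
import subprocess


def linked_libraries(ldd_output):
    """Return the first token of each non-empty line of ldd output."""
    names = []
    for line in ldd_output.splitlines():
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


def links_against(ldd_output, prefix):
    """True if any listed library's file name starts with prefix.

    The prefix "libonnx.so" does not match "libonnxruntime.so", because
    the character after "libonnx" differs.
    """
    return any(
        name.split("/")[-1].startswith(prefix)
        for name in linked_libraries(ldd_output)
    )


def onnxruntime_links_onnx(path="/usr/lib/libonnxruntime.so"):
    """Run ldd on the given library (path is an assumption) and check for libonnx.so."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return links_against(result.stdout, "libonnx.so")
```

Based on the ldd results described here, I would expect onnxruntime_links_onnx() to return True with 1.20.2-1 and False after downgrading. The same links_against helper can be used with the prefix "libmpi.so" to check the openmpi dependency.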
Steps to reproduce:
git clone https://github.com/microsoft/onnxruntime-inference-examples
cd onnxruntime-inference-examples
cd c_cxx
mkdir build && cd build
cmake -DCMAKE_CXX_FLAGS="-I/usr/include/onnxruntime/" ..
make model-explorer
model-explorer/model-explorer ../MNIST/mnist.onnx
On a fresh Arch install this produces the errors described above, but it works if I downgrade to onnxruntime 1.19.2-4.
Also, the build fails unless I manually install both openmpi and onnx; with 1.19.2-4 I only need to install openmpi. Perhaps openmpi should be listed as a required dependency of onnxruntime anyway, since I'm not sure there is any way to use it without openmpi installed (ldd lists libmpi.so).