Initial Backend Engine Integration #773
base: master
Conversation
ie_inputs[i] = nullptr;
ie_inputs[i] = static_pointer_cast<IETensor>(inputs[i]);
input_names[i] = input_name;
Please make IETensor named so that we can track the name together with the tensor.
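A minimal sketch of what that could look like; the TensorBase stand-in and member names here are hypothetical, and the real IETensor would keep deriving from the nGraph runtime tensor used in this PR:

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in base class so the sketch is self-contained; in the PR
// IETensor derives from the nGraph runtime tensor instead.
class TensorBase {
 public:
  virtual ~TensorBase() = default;
};

// Carry the name together with the tensor so callers no longer have to keep a
// parallel input_names vector in sync with ie_inputs.
class IETensor : public TensorBase {
 public:
  explicit IETensor(std::string name) : m_name(std::move(name)) {}
  const std::string& get_name() const { return m_name; }

 private:
  std::string m_name;
};
```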
@@ -167,7 +165,9 @@ bool Executable::Call(const vector<shared_ptr<runtime::Tensor>>& inputs,
}

// Prepare input blobs
auto func = m_network.getFunction();
auto func = m_ie_engine->GetFunc();
std::vector<std::shared_ptr<IETensor>> ie_inputs(inputs.size());
Not necessary. You can pass inputs directly to the backend.
But in that case we would have to pass nGraph tensors directly and cast them to IETensors inside the engine. Then we would need to move more functionality from the executable onto the backend. We might need further discussion about the related changes.
IETensor is an nGraph tensor.
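Assuming that, a rough sketch of the suggestion (stand-in types so it compiles on its own; in the PR these would be ngraph::runtime::Tensor and the engine's Infer member):

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for the nGraph tensor and IETensor types.
class Tensor {
 public:
  virtual ~Tensor() = default;
};
class IETensor : public Tensor {};

// Sketch: the engine accepts the generic tensor pointers unchanged and
// downcasts internally, so Executable::Call no longer builds an ie_inputs
// copy of the inputs vector.
void Infer(const std::vector<std::shared_ptr<Tensor>>& inputs) {
  for (const auto& input : inputs) {
    auto ie_input = std::static_pointer_cast<IETensor>(input);
    // ... bind ie_input to the corresponding network blob ...
    (void)ie_input;  // placeholder so the sketch compiles
  }
}
```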
m_ie_engine->Infer(ie_inputs, input_names, ie_outputs, output_names,
                   ie_hoisted_params, param_names);
What does the backend need to know about hoisted parameters for? They could just be passed as inputs?
It's possible to merge them into a single vector. But then I think we could not pass the inputs directly as suggested in the review above. Also, I'm not sure whether this would be safe for batching; it may not be an issue, but we need to verify.
A hoisted parameter is an input to the model. The backend doesn't need to know whether an input was hoisted or not.
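For reference, treating them that way at the call site in Executable::Call could look roughly like this (a sketch reusing the variable names from the diff above and assuming the engine's Infer were reduced to four parameters):

```cpp
// Sketch: fold the hoisted parameters into the regular inputs before calling
// the backend, so it only ever sees a single input vector.
auto all_inputs = ie_inputs;
all_inputs.insert(all_inputs.end(), ie_hoisted_params.begin(),
                  ie_hoisted_params.end());
auto all_names = input_names;
all_names.insert(all_names.end(), param_names.begin(), param_names.end());
m_ie_engine->Infer(all_inputs, all_names, ie_outputs, output_names);
```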
For the basic backend this may not matter. The VADM backend will divide the input into multiple batches, but that is not the case for the hoisted parameters. How would their size be affected by the actual input batch size? If we pass them as regular inputs to the backend now, the VADM backend will disable batching because it sees multiple inputs even though the actual number of inputs is 1.
@@ -0,0 +1,74 @@
/*******************************************************************************
Minor nit in terms of naming: I'd prefer to call this Backend. The "IE" prefix is unnecessary since that's the only type of backend we expect to interface with. The files can similarly be renamed to:
backend.{h,cc}
backends/hddl.{h,cc}
backends/myriad.{h,cc}
if (m_device == "MYRIAD") {
  // Set MYRIAD configurations
  if (IEUtils::VPUConfigEnabled()) {
    config["MYRIAD_DETECT_NETWORK_BATCH"] = "NO";
  }

  if (IEUtils::VPUFastCompileEnabled()) {
    config["MYRIAD_HW_INJECT_STAGES"] = "NO";
    config["MYRIAD_COPY_OPTIMIZATION"] = "NO";
  }
}
These customizations should be set in the "myriad" backend implementation.
Answered within the comment below since it is related.
@@ -0,0 +1,80 @@
/*******************************************************************************
"Basic Engine" is unnecessary if you provide a default implementation for the abstract backend? Backends that don't need a custom "Infer" or custom configuration can fall back to the default impl.
Should the basic execution be in Executable or in BackendEngine? We would disagree with the first option. The second option might be possible, and it may also help with the MYRIAD configuration issue above, but we need to think about whether it will impact anything else in the future. Although we may change this design later, I don't think it's in bad shape to be merged now.
"Basic" execution should be in the implementation of the default Backend.
// Returns the NGraph Function from the CNNNetwork
std::shared_ptr<ngraph::Function> GetFunc();

virtual const std::vector<size_t> GetOutputShape(const int i) = 0;
Not needed?
This is a function needed for the VADM backend. We can remove it in this PR, but we will need to bring it back in the next PR.
The output shape can be determined either from the function or the output that was originally passed to the backend?
The VADM backend modifies the batch size of the function, so the output size we get from the function will be wrong. We could get the output size just after creating the function and use it later for allocations. This will add extra functionality into the NGraphEncapsulateOp, and I need to test to see how it works.
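For example, something along these lines right after the function is created (a sketch; it assumes the standard ngraph::Function::get_results() and Node::get_shape() accessors and the func variable from the diff above):

```cpp
// Sketch: record the original output shapes before the VADM backend rewrites
// the batch size, so later allocations can use the unmodified shapes.
std::vector<ngraph::Shape> original_output_shapes;
original_output_shapes.reserve(func->get_results().size());
for (const auto& result : func->get_results()) {
  original_output_shapes.push_back(result->get_shape());
}
```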
// Returns output batch size based on the input batch size and the device
// FIXME: This may not be needed
virtual size_t GetOutputBatchSize(size_t input_batch_size) const;
Not needed?
This can be removed.
// Enables multi request execution if the execution engine supports it
void EnableMultiReqExecution();
// Disables multi request execution
void DisableMultiReqExecution();
This can be a parameter to the HDDL backend constructor.
It may limit our ability to enable/disable batching dynamically. Or this parameter might become part of caching, but that would require more changes to the existing bridge code.
In which scenarios would we want to enable/disable batching dynamically for a given network executing on a given device? If we expect this interface to be limited to a specific backend, then it shouldn't be a part of the abstract interface.
Then this should be a parameter to the Executable constructor since the backend is created there.
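Roughly, moving the switch to construction time could look like this (hypothetical signatures):

```cpp
#include <memory>
#include <string>

class Backend;  // abstract backend interface from this PR

// Hypothetical factory; the real Executable already knows how to create the
// backend for a given device.
std::unique_ptr<Backend> CreateBackend(const std::string& device,
                                       bool enable_multi_req_execution);

// Sketch: the multi-request/batching choice is fixed when the Executable
// constructs its backend instead of being toggled through the abstract
// Backend interface afterwards.
class Executable {
 public:
  Executable(const std::string& device, bool enable_multi_req_execution)
      : m_backend(CreateBackend(device, enable_multi_req_execution)) {}

 private:
  std::unique_ptr<Backend> m_backend;
};
```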
virtual void StartAsyncInference(const int req_id);
virtual void CompleteAsyncInference(const int req_id);
How is one supposed to use these? Infer only uses one infer request.
These are required for asynchronous execution. Currently we need this for VADM. For the other backends, it will be the same as Infer for now.
Yes, I can tell that these are required for asynchronous execution from the name :)
I was asking how one would use this interface because I don't see a way to create an asynchronous inference request. The implementation for these is broken at the moment, and I'd prefer that we implement it correctly if we're extending the interface.
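For reference, the per-request pattern these two calls would typically wrap with the pre-2022 InferenceEngine API; the m_infer_reqs member is hypothetical and the exact API may differ in this codebase:

```cpp
#include <vector>

#include <inference_engine.hpp>

// Sketch of an asynchronous request pair: StartAsyncInference kicks off the
// request for req_id, and CompleteAsyncInference blocks until it finishes.
class AsyncEngineSketch {
 public:
  void StartAsyncInference(int req_id) { m_infer_reqs[req_id].StartAsync(); }

  void CompleteAsyncInference(int req_id) {
    m_infer_reqs[req_id].Wait(
        InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
  }

 private:
  std::vector<InferenceEngine::InferRequest> m_infer_reqs;  // hypothetical
};
```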
It does not have any impact on the execution. We kept all executions as async to have a common call for all backends in case we need it for multiple backends in the future. We can move back to the Infer call for the basic execution for now; it will not make any difference.
This PR includes the new backend engine. It separates the OpenVINO execution from the Executable into an isolated class, IEBackendEngine. This creates an abstraction for integrating other device-based backend engines in the future.