This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Wrong gradients on Windows-GPU #20471

@matteosal

Description

sym.zip
I only see this on Windows. Download the symbol file and run this script:

import mxnet as mx

json_path = 'sym.json'
sym = mx.sym.load(json_path)

def run_example(ctx, reqs):
	# Bind the symbol on the given context. reqs lists the grad_req
	# ('write' or 'null') for Input, Target and the sequence length, in that order.
	ex = sym._bind(
		ctx,
		{
			'.Inputs.Input': mx.ndarray.array([[1, 2, 3]], ctx=ctx),
			'.Inputs.Target': mx.ndarray.array([[4, 5, 6]], ctx=ctx),
			'seq_715248120': mx.ndarray.array([3], ctx=ctx)
		},
		args_grad={
			'.Inputs.Input': mx.ndarray.zeros([1, 3], ctx=ctx),
			'.Inputs.Target': mx.ndarray.zeros([1, 3], ctx=ctx),
			'seq_715248120': mx.ndarray.zeros([1], ctx=ctx)
		},
		grad_req=dict(zip(['.Inputs.Input', '.Inputs.Target', 'seq_715248120'], reqs))
	)

	ex.forward()
	# Backpropagate a head gradient of 1 for each entry in out_grads.
	ex.backward(out_grads=[mx.ndarray.array([1], ctx=ctx), mx.ndarray.array([1], ctx=ctx)])

	# Arguments with grad_req 'null' show up as None here.
	print(ex.grad_dict)

print('Input + Target gradient, CPU (OK):')
run_example(mx.cpu(), ['write', 'write', 'null'])
print('\n')
print('Input + Target gradient, GPU (OK):')
run_example(mx.gpu(), ['write', 'write', 'null'])
print('\n')
print('Target gradient only, CPU (OK):')
run_example(mx.cpu(), ['null', 'write', 'null'])
print('\n')
print('Target gradient only, GPU (WRONG):')
run_example(mx.gpu(), ['null', 'write', 'null'])

Output is:

Input + Target gradient, CPU (OK):
{'.Inputs.Input':
[[-0.33333334 -0.33333334 -0.33333334]]
<NDArray 1x3 @cpu(0)>, '.Inputs.Target':
[[0.33333334 0.33333334 0.33333334]]
<NDArray 1x3 @cpu(0)>, 'seq_715248120': None}


Input + Target gradient, GPU (OK):
{'.Inputs.Input':
[[-0.33333334 -0.33333334 -0.33333334]]
<NDArray 1x3 @gpu(0)>, '.Inputs.Target':
[[0.33333334 0.33333334 0.33333334]]
<NDArray 1x3 @gpu(0)>, 'seq_715248120': None}


Target gradient only, CPU (OK):
{'.Inputs.Input': None, '.Inputs.Target':
[[0.33333334 0.33333334 0.33333334]]
<NDArray 1x3 @cpu(0)>, 'seq_715248120': None}


Target gradient only, GPU (WRONG):
{'.Inputs.Input': None, '.Inputs.Target':
[[-0.33333334 -0.33333334 -0.33333334]]
<NDArray 1x3 @gpu(0)>, 'seq_715248120': None}

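The Target gradient has the sign flipped in the last example: the GPU run returns -0.33333334 for every element, where the CPU run with the same grad_req returns 0.33333334.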
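
The comparison above can also be done mechanically instead of by eye. This is only a minimal sketch: `find_sign_flips` is a hypothetical helper, not part of MXNet, and it assumes the gradients have already been copied to nested Python lists (for example with `.asnumpy().tolist()`), with `None` for arguments whose grad_req is 'null'. The values are the ones printed in the last two outputs.

```python
import math

def find_sign_flips(reference, candidate, rel_tol=1e-5):
    """Return the names whose candidate gradient is the negated reference gradient.

    Hypothetical helper for illustration; both arguments map argument names
    to a 2-D nested list of floats, or to None when no gradient was requested.
    """
    flipped = []
    for name, ref in reference.items():
        cand = candidate.get(name)
        if ref is None or cand is None:
            continue  # no gradient requested for this argument
        ref_flat = [v for row in ref for v in row]
        cand_flat = [v for row in cand for v in row]
        if len(ref_flat) != len(cand_flat):
            continue  # shapes differ, so this is not a plain sign flip
        all_negated = all(
            math.isclose(c, -r, rel_tol=rel_tol)
            for r, c in zip(ref_flat, cand_flat)
        )
        # An all-zero gradient equals its own negation; do not report it.
        any_nonzero = any(r != 0 for r in ref_flat)
        if all_negated and any_nonzero:
            flipped.append(name)
    return flipped

# Values copied from the "Target gradient only" outputs above.
cpu_grads = {
    '.Inputs.Input': None,
    '.Inputs.Target': [[0.33333334, 0.33333334, 0.33333334]],
    'seq_715248120': None,
}
gpu_grads = {
    '.Inputs.Input': None,
    '.Inputs.Target': [[-0.33333334, -0.33333334, -0.33333334]],
    'seq_715248120': None,
}

print(find_sign_flips(cpu_grads, gpu_grads))  # ['.Inputs.Target']
```

Using the CPU result as the reference is a choice made for this sketch, since the CPU gradient agrees with the runs where both gradients are requested.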
