Hi again,
tf.squared_difference appears to produce incorrect gradients. I don't have a toy example handy, but it can be reproduced by running any training session with a loss function like this:
Tensor Loss(Tensor y, Tensor expected)
{
    return tf.squared_difference(y, expected)/2;
}
and comparing it to the same run with the squared difference written out by hand:
Tensor Loss(Tensor y, Tensor expected)
{
    return (y - expected)*(y - expected)/2;
}
In my case, the second version works, while the first learns in the wrong direction. Presumably the issue is in nn_grad._SquaredDifferenceGrad, which just returns op.inputs and ignores the grads argument, but I don't know how to fix this:
[RegisterGradient("SquaredDifference")]
public static Tensor[] _SquaredDifferenceGrad(Operation op, Tensor[] grads)
{
    //"""Returns the gradient for (x-y)^2."""
    Tensor x = op.inputs[0];
    Tensor y = op.inputs[1];
    return new Tensor[]
    {
        x,
        y
    };
}
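For comparison, here is what I would expect this function to return mathematically. This is only the chain rule written out, not a tested fix: the symbols z (the op's output), L (the final loss) and g (the incoming gradient, i.e. grads[0]) are my own notation, and I am assuming x and y have the same shape so no broadcasting reduction is needed.

```latex
\begin{aligned}
z &= (x - y)^2, \qquad g = \frac{\partial L}{\partial z} \\
\frac{\partial L}{\partial x} &= g \cdot \frac{\partial z}{\partial x} = 2\,g\,(x - y) \\
\frac{\partial L}{\partial y} &= g \cdot \frac{\partial z}{\partial y} = -2\,g\,(x - y)
\end{aligned}
```

So the two returned tensors should be 2*g*(x - y) and its negation, rather than x and y themselves. Returning the raw inputs gives a gradient that does not depend on the error (x - y) at all, which would be consistent with training moving in the wrong direction.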
Thanks