A brief description of my model:
- Consists of a single parameter X of dtype ComplexDouble and shape (20, 20, 20, 3). For reference, this must be complex because I need to perform FFTs etc. on it.
- X is used to compute a real scalar value, Y, as the output.
- The objective is to minimise the value of Y using autograd to optimize the value of X.
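To make the setup concrete, here is a minimal stand-in with the same parameter dtype and shape (the loss below is purely illustrative, not my actual computation):

import torch

class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # single complex parameter X with the dtype and shape described above
        self.X = torch.nn.Parameter(
            torch.randn(20, 20, 20, 3, dtype=torch.complex128))

    def forward(self):
        # illustrative loss: FFT of X, reduced to a real scalar Y
        spectrum = torch.fft.fftn(self.X, dim=(0, 1, 2))
        return spectrum.abs().pow(2).sum()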
Simple gradient descent-based optimizers like torch.optim.SGD and torch.optim.Adam seem to work fine for this process. I would like to extend this to L-BFGS.
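For example, a loop along these lines runs without complaint (sketch using the ToyModel above):

model = ToyModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(100):
    optimizer.zero_grad()
    Y = model()       # real scalar loss
    Y.backward()
    optimizer.step()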
The problem is that when I use

import torch.optim as optim

optimizer = optim.LBFGS(model.parameters())

def closure():
    optimizer.zero_grad()
    Y = model()      # forward pass producing the real scalar loss Y
    Y.backward()
    return Y

for i in range(steps):
    optimizer.step(closure)
I get the error
File "xx\Python\Python38\lib\site-packages\torch\optim\lbfgs.py", line 410, in step
if gtd > -tolerance_change:
RuntimeError: "gt_cpu" not implemented for 'ComplexDouble'
According to the source file, the directional derivative is being computed as a complex value, which disrupts the algorithm.
Is there any way to get L-BFGS working for my complex parameter (e.g. using an alternative library), or is this fundamentally impossible? I had some ideas about replacing these "faulty" dot products with something like real(a.conj() * b), but I wasn't sure whether that would work.
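To spell out what I mean: the real part of the complex inner product coincides with the ordinary dot product of the underlying real and imaginary parts, which is presumably what the line search expects. A quick illustrative check:

import torch

a = torch.randn(10, dtype=torch.complex128)
b = torch.randn(10, dtype=torch.complex128)

# real part of the complex inner product <a, b>
lhs = torch.real(a.conj().dot(b))
# dot product of the same data viewed as real vectors of twice the length
rhs = torch.view_as_real(a).flatten().dot(torch.view_as_real(b).flatten())

print(torch.allclose(lhs, rhs))  # True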
CodePudding user response:
My intuition was correct. I replaced every occurrence of a.dot(b) in the file with torch.real(a.conj().dot(b)) and L-BFGS is working great!
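If you would rather not edit the installed lbfgs.py, a possible alternative (a sketch; the loss here is again illustrative) is to store the parameter as a real tensor and only view it as complex inside forward(), so the stock optimizer only ever sees real tensors:

import torch

class RealStorageModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # real storage; the trailing dimension of size 2 holds (real, imag)
        self.X_real = torch.nn.Parameter(
            torch.randn(20, 20, 20, 3, 2, dtype=torch.float64))

    def forward(self):
        X = torch.view_as_complex(self.X_real)       # complex view for FFTs
        spectrum = torch.fft.fftn(X, dim=(0, 1, 2))  # illustrative loss
        return spectrum.abs().pow(2).sum()

model = RealStorageModel()
optimizer = torch.optim.LBFGS(model.parameters())  # parameters are real here

Mathematically this is equivalent to the patch above, since real(a.conj().dot(b)) is exactly the dot product of the same tensors viewed as real.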