
L-BFGS optimizer stops at exactly 30 iterations, no matter the values of its parameters. #1819

Open
enm72 opened this issue Aug 8, 2024 · 8 comments

Comments

@enm72

enm72 commented Aug 8, 2024

Hello Dr. Lu, I greatly appreciate your work on the development of the DeepXDE library. I have been using the library for research for some time, and I recently found that the L-BFGS optimizer stops at exactly 30 iterations. I had been using it without any issue until a few weeks ago, when I first observed this behaviour. This is quite strange, because the "early stopping" occurs in code for which L-BFGS previously functioned properly, meaning that I was able to run L-BFGS for as many iterations as I wanted. The environment I am using is Google Colab.
To make sure that I did not break something in my code, I tried some of the demo code from the DeepXDE documentation, as well as code from your work on sampling strategies. The result is always the same: ADAM works perfectly for as many iterations as I want, but when the training advances to L-BFGS, it stops at exactly 30 iterations. I tried tweaking the gtol and ftol parameters, but with no luck. The arithmetic precision has been set to float64, as always. In short, without my changing anything in the code I am experimenting with, the L-BFGS optimizer now stops at 30 iterations. Any advice on this would be very much appreciated. Thank you.
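
For readers unfamiliar with the two tolerances mentioned above, here is a minimal sketch of how they are defined in the SciPy documentation for L-BFGS-B: the run stops when the relative reduction of the objective falls to `ftol` or below, or when the largest gradient component falls to `gtol` or below. The function name `lbfgs_should_stop` and the numbers are hypothetical, and it is an assumption that the backend in use applies the same definitions. This illustrates the criteria only and does not diagnose the behaviour reported in this issue.

```python
def lbfgs_should_stop(f_prev, f_new, grad, ftol, gtol):
    """Return the name of the tolerance test that fires, or None.

    Hypothetical helper: follows the ftol/gtol definitions in the SciPy
    L-BFGS-B documentation for the unconstrained case, where the
    projected gradient is simply the gradient.
    """
    # Relative reduction of the objective between two iterations.
    rel_reduction = (f_prev - f_new) / max(abs(f_prev), abs(f_new), 1.0)
    if rel_reduction <= ftol:
        return "ftol"
    # Largest absolute gradient component.
    if max(abs(g) for g in grad) <= gtol:
        return "gtol"
    return None


# Hypothetical values: a small loss that barely moves between iterations.
loss_before = 1.0e-3
loss_after = 1.0e-3 - 1.0e-12
gradient = [1e-4, -2e-4]

print(lbfgs_should_stop(loss_before, loss_after, gradient, ftol=1e-9, gtol=1e-8))
print(lbfgs_should_stop(loss_before, loss_after, gradient, ftol=0.0, gtol=1e-8))
```

Because the denominator is never smaller than 1, a loss that is already small yields a tiny relative reduction, so a nonzero `ftol` can end the run early. In this sketch, a loss that does not change at all triggers the `ftol` test even when `ftol` is 0.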

@rispolivc

I am facing the same issue. All of my old code used to converge better with L-BFGS after a round of ADAM iterations. Now it does not work anymore, and I can't find what broke. I am using the TensorFlow backend.

@rispolivc

rispolivc commented Aug 17, 2024

Well, I found out that L-BFGS isn't working properly with the TensorFlow backend in Google Colab (both tensorflow.compat.v1 and tensorflow v2). But it is working with pytorch and paddle, at least for me. If you want to use pytorch, include this at the top of your code:

```python
import os
# Set the backend before importing deepxde
os.environ['DDE_BACKEND'] = 'pytorch'
import deepxde as dde
dde.config.set_default_float('float64')
import numpy as np
import torch
```
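
The order of the lines above matters: DeepXDE picks its backend when it is first imported, so `DDE_BACKEND` has to be set before `import deepxde`. In a notebook where `deepxde` was already imported in an earlier cell, the runtime needs a restart for the change to apply. A minimal sketch of a guard for this, using only the standard library (the helper name `select_backend` is hypothetical and not part of DeepXDE):

```python
import os
import sys

# Backend names listed in the DeepXDE documentation.
KNOWN_BACKENDS = {"tensorflow.compat.v1", "tensorflow", "pytorch", "jax", "paddle"}


def select_backend(name):
    """Set DDE_BACKEND, refusing to do so once deepxde is already imported."""
    if name not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend: {name!r}")
    if "deepxde" in sys.modules:
        # The backend was already chosen at import time; changing the
        # environment variable now would have no effect.
        raise RuntimeError("deepxde is already imported; restart the runtime first")
    os.environ["DDE_BACKEND"] = name


select_backend("pytorch")
# import deepxde as dde  # import only after the call above
```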

@enm72
Author

enm72 commented Aug 20, 2024

Hello rispolivc. I truly appreciate your response. When you replied, I was encouraged that a solution to this issue had finally been found, so I re-enabled the L-BFGS training in my code. It was indeed a comfort that someone else facing the same issue responded, although I was actually expecting this issue to be encountered on a massive scale.
Indeed, I am also using tensorflow.compat.v1 and tensorflow v2. After changing to pytorch, the demo code from the DeepXDE documentation that uses the L-BFGS algorithm did work. However, it did not work for all of my production code. Even after switching to pytorch, I still get the same behaviour when the training advances to the L-BFGS iterations: after 27-30 iterations, the loss terms remain unchanged, training stops, and execution moves on to producing the plots of the solutions of the problem I am trying to solve.
It is really odd. It looks like L-BFGS is not improving the loss at all, which is totally puzzling. I tried changing several of the relevant L-BFGS parameters, but still no luck.

@bakhtiyar-k

@lululxvi, I can also report that L-BFGS has stopped working with the TensorFlow backend.

@lululxvi
Owner

lululxvi commented Sep 18, 2024

@enm72 @rispolivc @bakhtiyar-k Thank you for reporting the issue. I believe pytorch and paddle should work. I guess it is due to the TensorFlow version. Could you let me know which DeepXDE and TensorFlow versions you are using?
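
One way to collect the requested versions without importing the libraries themselves is the standard-library `importlib.metadata` module. The sketch below is only a suggestion; the helper name `report_versions` and the default package list are assumptions, and packages that are not installed are reported as such.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages=("deepxde", "tensorflow", "torch", "scipy")):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in report_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```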

@bakhtiyar-k

DeepXDE v: 1.12.1
Tensorflow v: 2.15.0

@enm72
Author

enm72 commented Sep 21, 2024

Dr. Lu, the same goes for me too. DeepXDE v.1.12.1 and TensorFlow v.2.15.0.

@david-e-gomes

I had the same problem. Switching to pytorch solved it.
