Fix #1178 #1203

Closed
wants to merge 5 commits into from
@ilyaZar
Contributor

ilyaZar commented May 7, 2025

This PR fixes a regression in how guess_where_config() resolves custom config paths in R/app_config.R. Specifically, it now correctly handles nested expressions of the form:

file = Sys.getenv("CONFIG_PATH", app_sys("golem-config.yml"))

Previously, the logic relied on regex and failed to correctly extract the path from nested calls. This led to malformed paths and erroneous conflict detection between the user and default config files.

Also, while experimenting during package development, app_config.R may contain several file = ... lines, one of them commented out, e.g.:

  # file = app_sys("golem-config.yml")
  file = app_sys("golem-config.yml")

This is fixed as well: the new behavior picks up only the active (uncommented) line.
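As a rough illustration of the AST-based idea (not the code from this PR), the sketch below parses a function definition, pulls out the default of its file argument, and walks the call tree for string literals ending in .yml or .yaml. The helper name find_yml_literals() and the sample source text are hypothetical. Because the R parser drops comments, the commented-out candidate is never seen.

```r
# Hypothetical helper (not part of {golem}): recursively collect string
# literals that look like YAML file names from an unevaluated expression.
find_yml_literals <- function(expr) {
  if (is.character(expr)) {
    return(expr[grepl("\\.ya?ml$", expr)])
  }
  if (is.call(expr)) {
    # Visit the function name and every argument of the call
    return(unlist(lapply(as.list(expr), find_yml_literals)))
  }
  character(0)
}

# Sample source, standing in for the content of R/app_config.R
src <- '
get_golem_config <- function(
  value,
  # file = app_sys("old-config.yml")
  file = Sys.getenv("CONFIG_PATH", app_sys("golem-config.yml"))
) {
  NULL
}
'

exprs <- parse(text = src, keep.source = FALSE)
fun_def <- exprs[[1]][[3]]        # the `function(...)` call right of `<-`
formals_list <- fun_def[[2]]      # pairlist of formal arguments
file_default <- formals_list[["file"]]

found <- find_yml_literals(file_default)
print(found)
```

A walk like this only finds literal file names; it cannot resolve paths that are computed at runtime, which is the limitation discussed further down in the thread.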

Changes made and To-Do

Fix:
#1178
#1179

…etection

Parse RHS of file= argument in app_config.R using AST traversal to robustly
support expressions like Sys.getenv("CONFIG_PATH", app_sys("...")). Fixes bug
introduced post-0.4.1 where regex extraction failed on nested calls.
@ilyaZar
Contributor Author

ilyaZar commented May 7, 2025

@ColinFay

Related to this there is #1179, giving a more complicated expression to the argument file which is currently not handled.

In general, I am not sure if one should support this type of behavior.

One could simply restrict the possible values of the file argument in the docs, as there are infinitely many nestings and possible function calls that could identify the config .yml from the file argument. Two other ways come to mind, though:

  1. grep for the .yml config file directly, after recursively walking the complete AST of the right-hand side of the file argument, e.g. the right-hand side is app_sys("golem-config.yml") in the full argument file = app_sys("golem-config.yml"). One can recursively deparse down the AST until "golem-config.yml" or any other .yml is found.
  2. evaluate the right-hand side app_sys("golem-config.yml") from within, but that would require knowledge of app_sys() or any more complicated function, which is usually not available (i.e. eval(parse(file)) will not work when file contains app_sys(), as this is unknown at runtime).
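The obstacle in option 2 can be reproduced in a few lines. This is a minimal sketch that assumes the evaluating environment has no definition of app_sys(), which is modelled here with an empty environment whose parent is the base environment.

```r
# The right-hand side of `file = app_sys("golem-config.yml")` as an
# unevaluated call
rhs <- parse(text = 'app_sys("golem-config.yml")')[[1]]

# Assumption: the evaluating code cannot see the user's package namespace,
# so app_sys() is not defined in this environment
clean_env <- new.env(parent = baseenv())

result <- tryCatch(
  eval(rhs, envir = clean_env),
  error = function(e) conditionMessage(e)
)
print(result)
```

The captured message reports that app_sys could not be found, which is why evaluating the argument from outside the user's package is not a general solution.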

Ready to implement other ideas.

Sometimes, there are commented lines in app_config.R for the file argument
to test different arguments for `file` in the `get_golem_config()`
function.

Ensures that commented-out config candidates like '# file = app_sys(...)'
are not parsed and mistakenly used for user config path resolution.
Improves robustness of multi-line detection in guess_lines_to_config_file().
@ColinFay
Member

Thanks a lot @ilyaZar 🫶

I feel like the code for guess_where_config is getting a bit complex 🤔 and I'm not quite sure if we should continue supporting the "try and guess" behavior. Maybe we should provide an explicit way to change the config path using an env var, and stop trying to guess where it is. A large portion of the devs will not change the default path, so we can expect devs who are playing with config files to be comfortable enough with environment variables.

So, my proposal is:

  • golem has golem::get_current_config() that calls under the hood guess_where_config()
  • most devs will not be playing with the default config file, so we can safely rely on inst/golem-config.yml path for being the default in most cases.
  • power users will change the config path, but we could expect from them to be able to do it with an env var, or to hard code the path to the file in

I suggest modifying the guess_where_config() function that way:

guess_where_config <- function(
  path = golem::pkg_path(),
  file = "inst/golem-config.yml"
) {
  # An explicit GOLEM_CONFIG_PATH wins; otherwise use the default location
  if (Sys.getenv("GOLEM_CONFIG_PATH") == "") {
    path_to_config <- fs_path(
      path,
      file
    )
  } else {
    path_to_config <- Sys.getenv("GOLEM_CONFIG_PATH")
  }
  path_to_config <- fs_path_abs(path_to_config)
  if (!fs_file_exists(path_to_config)) {
    stop(
      "Unable to locate config file. Either use the default path at inst/golem-config.yml or set it with a GOLEM_CONFIG_PATH environment variable"
    )
  }
  # Return the resolved path so that callers can use it
  path_to_config
}
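To try the precedence rule outside of {golem}, here is a self-contained sketch that swaps the internal fs_* wrappers for base R functions. The name resolve_config_path() and the temporary demo directory are made up for illustration; only the GOLEM_CONFIG_PATH variable and the default inst/golem-config.yml location come from the proposal above.

```r
# Hypothetical base-R stand-in for the proposed guess_where_config()
resolve_config_path <- function(path = ".", file = "inst/golem-config.yml") {
  env_path <- Sys.getenv("GOLEM_CONFIG_PATH")
  # The env var takes precedence over the default location
  path_to_config <- if (env_path == "") file.path(path, file) else env_path
  if (!file.exists(path_to_config)) {
    stop("Unable to locate config file: ", path_to_config)
  }
  normalizePath(path_to_config)
}

# Build a throwaway package skeleton with a default and a custom config file
pkg <- file.path(tempdir(), "demo-pkg")
dir.create(file.path(pkg, "inst"), recursive = TRUE, showWarnings = FALSE)
default_cfg <- file.path(pkg, "inst", "golem-config.yml")
custom_cfg <- file.path(pkg, "custom.yml")
invisible(file.create(default_cfg, custom_cfg))

# Without the env var, the default file is used
Sys.unsetenv("GOLEM_CONFIG_PATH")
from_default <- resolve_config_path(pkg)

# With the env var, the custom file wins
Sys.setenv(GOLEM_CONFIG_PATH = custom_cfg)
from_env <- resolve_config_path(pkg)
Sys.unsetenv("GOLEM_CONFIG_PATH")

print(basename(from_default))
print(basename(from_env))
```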

Then, in the R/app_config.R file, switching to:

get_golem_config <- function(
  value,
  config = Sys.getenv(
    "GOLEM_CONFIG_ACTIVE",
    Sys.getenv(
      "R_CONFIG_ACTIVE",
      "default"
    )
  ),
  use_parent = TRUE,
  # If you don't want to use the default config file:
  # - replace the function call 
  #   and write the path to your config file
  # - set a `GOLEM_CONFIG_PATH` env var that will
  #   be picked by golem::get_current_config()
  file = golem::get_current_config()
) {
  config::get(
    value = value,
    config = config,
    file = file,
    use_parent = use_parent
  )
}

So basically, we stop parsing the package files to get the config file path; we rely on either the default value, an env var, or a hard-coded path.
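The nested Sys.getenv() default used for the config argument acts as a fallback chain: GOLEM_CONFIG_ACTIVE first, then R_CONFIG_ACTIVE, then "default". The wrapper name active_config() below is only for illustration, and the values "production" and "dev" are arbitrary examples.

```r
# Illustrative wrapper around the nested default shown above
active_config <- function() {
  Sys.getenv(
    "GOLEM_CONFIG_ACTIVE",
    Sys.getenv("R_CONFIG_ACTIVE", "default")
  )
}

# Neither variable set: falls through to "default"
Sys.unsetenv(c("GOLEM_CONFIG_ACTIVE", "R_CONFIG_ACTIVE"))
none_set <- active_config()

# Only R_CONFIG_ACTIVE set: its value is used
Sys.setenv(R_CONFIG_ACTIVE = "production")
r_only <- active_config()

# Both set: GOLEM_CONFIG_ACTIVE takes precedence
Sys.setenv(GOLEM_CONFIG_ACTIVE = "dev")
both_set <- active_config()

# Clean up so the session is left unchanged
Sys.unsetenv(c("GOLEM_CONFIG_ACTIVE", "R_CONFIG_ACTIVE"))

print(c(none_set, r_only, both_set))
```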

@LDSamson @ilyaZar, what do you both think of this approach?

@LDSamson

This sounds like a nice and simpler solution with a good balance between flexibility and stability; personally I like it.
The only thing I can think of is a user wanting to rename the original config file in the inst folder to something else, for example inst/my_package_config.yml. But that is not something I need, and the name is already customizable if you provide an environment variable or a fixed path.

Furthermore: please correct me if I am wrong, but I think with this approach it also does not matter if arbitrary other custom functions are declared in the app_config.R file, or if this file is renamed for some reason.

@ColinFay
Member

@LDSamson thanks for your feedback.

I also think this would make using a different config file name or path easier: most users will use the default and advanced users can configure it to their needs. And yes, you're right, it would no longer parse app_config.R, so you can do whatever you want there :)

@ilyaZar
Contributor Author

ilyaZar commented May 26, 2025

Hi @ColinFay

yes, definitely agreed, and as said above I was never a big fan of the "try and guess" behavior ... although I contributed to exactly that code :)

However, I have a feeling that the if-case orders need a change.

Ideas for slight change

  1. First, when GOLEM_CONFIG_PATH is set and points to a non-existing file -> hard stop(), because dev users need an early error, e.g. when they have made a typo or moved a file.

  2. Then, if there is no GOLEM_CONFIG_PATH env var, we have:
    2.A a "standard" user with the default path: then either the file exists and everything is fine, or the dance from get_current_config() is triggered
    2.B a dev user who prefers to set a hard-coded path, or made a typo like GOLEM_CONFIG_PATHHHHH; in both cases the file cannot be found

guess_where_config <- function(
  path = golem::pkg_path(),
  file = "inst/golem-config.yml"
) {
  env_path <- Sys.getenv("GOLEM_CONFIG_PATH")
  if (env_path != "") {
    # 1. DEV USER: env var is set; hard stop if the file does not exist
    path_to_config <- fs_path_abs(env_path)
    if (!fs_file_exists(path_to_config)) {
      stop(
        "Unable to locate a config file using the environment variable ",
        "'GOLEM_CONFIG_PATH'. Check for typos in the (path to the) filename."
      )
    }
  } else {
    path_to_config <- fs_path(
      path,
      file
    )
    # 2.A standard user: uses the default path -> all is fine
    is_default_path <- grepl("/inst/golem-config\\.yml$", path_to_config)
    # 2.B DEV USER: sets a non-default path OR has a typo in the env var name;
    # hard stop if the file does not exist
    if (!is_default_path && !fs_file_exists(path_to_config)) {
      stop(
        "Unable to locate a config file from either the 'path' and 'file' ",
        "arguments, or the 'GOLEM_CONFIG_PATH' environment variable. ",
        "Check for typos."
      )
    }
  }
  return(path_to_config)
}

Benefits

  1. There can be cases where a dev user has a typo like GOLEM_CONFIG_PATHHHHH, an incorrect hard-coded path, or a GOLEM_CONFIG_PATH that points to a non-existent file: they get a different error message for case 1 and case 2.B.

  2. If a standard user does not change anything and the file exists, everything works. If a standard user has screwed up the .yml config, the old dance of get_current_config() is executed, so the old main behavior still works and missing files get replaced from shinyexample; see the get_current_config() implementation:

get_current_config <- function(path = getwd()) {
  # We check whether we can guess where the config file is
  path_conf <- guess_where_config(path)

  if (!fs_file_exists(path_conf)) {
    if (rlang_is_interactive()) {
      ask <- ask_golem_creation_upon_config(path_conf)
      # Return early if the user doesn't allow
      if (!ask) {
        return(NULL)
      }
      fs_file_copy(
        path = golem_sys("shinyexample/inst/golem-config.yml"),
        new_path = fs_path(
          path,
          "inst/golem-config.yml"
        )
      )
      fs_file_copy(
        path = golem_sys("shinyexample/R/app_config.R"),
        new_path = fs_path(
          path,
          "R/app_config.R"
        )
      )
      replace_word(
        fs_path(
          path,
          "R/app_config.R"
        ),
        "shinyexample",
        golem::pkg_name(path = path)
      )
...
...
...

Just some ideas, ... ready to make other changes !

ilyaZar added 3 commits May 26, 2025 22:26
- implements different cases with different error messages

- error messages inform mostly about what case exactly went wrong
@ilyaZar
Contributor Author

ilyaZar commented May 26, 2025

Changes made and To-Do

@ilyaZar
Contributor Author

ilyaZar commented May 26, 2025

I encountered some issues.

Issues

Actually, get_current_config() is used numerous times inside {golem}, including inside get_golem_things(), so whenever I start a fresh golem skeleton and try to use a fixed path ./inst/golem-config2.yml in app_config.R, it will not find the file. Instead, the get_current_config() dance is started via ask_golem_creation_upon_config(path_conf), prompting me to add a new config file.

In the old implementation, which parsed app_config.R, we guessed the config location from user changes to that file. So {golem} knew from inside the package the most likely location of the config file and could almost always find it.

Verify the problem

Here, setting file = "./inst/golem-config2.yml" is not working in a fresh golem inside the app_config.R:

get_golem_config <- function(
  value,
  config = Sys.getenv(
    "GOLEM_CONFIG_ACTIVE",
    Sys.getenv(
      "R_CONFIG_ACTIVE",
      "default"
    )
  ),
  use_parent = TRUE,
  # If you don't want to use the default config file:
  # - replace the function call
  #   and write the path to your config file
  # - set a `GOLEM_CONFIG_PATH` env var that will
  #   be picked by golem::get_current_config()
  # file = golem::get_current_config()
  file = "./inst/golem-config2.yml"
) {
  config::get(
    value = value,
    config = config,
    file = file,
    use_parent = use_parent
  )
}

So changing to a hard-coded path at the start of a new golem skeleton fails. amend_golem_config() and set_golem_options() will probably fail unexpectedly as well.

@ColinFay any ideas? I may try to rework get_golem_things() / amend_golem_config() as well, but it might be wiser to think a bit longer here, as this now touches quite a lot of other internal implementation.

@LDSamson

LDSamson commented May 27, 2025

Just some brainstorming ideas below.

What if you would create a file in the inst folder in which you can write the dedicated file path, and nothing else? Optionally you can create a helper set_config_path() to set it properly. It would be much safer to read than the entire app_config.R file. It is similar to how renv discovers the active profile: it uses a single file named profile that only stores the active profile's name.

Then the logic in guess_where_config would check for a valid path in this order:

  • The GOLEM_CONFIG_PATH environment variable
  • The file named golem_config_path in the inst folder (use readLines or an equivalent function)

If the file is not available and the env var is not set, fall back to the standard path.
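A minimal sketch of that lookup order, assuming the pointer file holds a single line with a path relative to the package root (the thread does not specify this). The function name resolve_config_path_with_pointer() is hypothetical; the golem_config_path file name and the GOLEM_CONFIG_PATH variable are taken from the suggestion above.

```r
# Hypothetical resolver: env var first, then the pointer file, then the default
resolve_config_path_with_pointer <- function(pkg_root) {
  env_path <- Sys.getenv("GOLEM_CONFIG_PATH")
  if (env_path != "") {
    return(env_path)
  }
  pointer <- file.path(pkg_root, "inst", "golem_config_path")
  if (file.exists(pointer)) {
    # Assumption: the first line stores a path relative to the package root
    stored <- trimws(readLines(pointer, n = 1L, warn = FALSE))
    if (length(stored) == 1L && nzchar(stored)) {
      return(file.path(pkg_root, stored))
    }
  }
  file.path(pkg_root, "inst", "golem-config.yml")
}

# Throwaway package skeleton for the demo
pkg <- file.path(tempdir(), "pointer-demo")
dir.create(file.path(pkg, "inst"), recursive = TRUE, showWarnings = FALSE)
pointer_file <- file.path(pkg, "inst", "golem_config_path")
unlink(pointer_file)
Sys.unsetenv("GOLEM_CONFIG_PATH")

# No env var and no pointer file: standard path
without_pointer <- resolve_config_path_with_pointer(pkg)

# Pointer file present: its content is used
writeLines("inst/my-config.yml", pointer_file)
with_pointer <- resolve_config_path_with_pointer(pkg)

print(basename(without_pointer))
print(basename(with_pointer))
```

This sketch does not check that the resolved file exists; an existence check with an early stop(), as discussed earlier in the thread, could be layered on top.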

Furthermore, I am probably missing something here:
is there another reason for adding a get_golem_config() function in each golem project, apart from giving end users the possibility to change the location of the golem-config.yml file? If not: is it an option to keep it as an exported function in the golem package? I understand that would be a big change, but I am just curious.

@ilyaZar
Contributor Author

ilyaZar commented May 27, 2025

@LDSamson , yes this is a good idea and all doable

I do not have any hard opinion about the correct way.

The underlying problem to address is always the same:

  • {golem} is used in different projects as a helper package to build other pkgs

  • there have to be conventions about where to store files relevant to {golem}, such as the config file, in each user-side project

Your suggestion (if I understand correctly), makes the implementation inside {golem} easier for the hard-coded path version:

  1. hard coded paths are set in a file, not via a function argument in app_config.R
  2. app_config.R can be altered almost arbitrarily because the underlying logic of {golem} to find config files no longer relates to app_config.R

Instead, config-files are retrieved either as envir-vars or a separate golem_config_path file (and the path stored as a string therein)

Sounds like a good way to me; it would clean up some exotic internal implementations as well. What do you think, @ColinFay?

@ilyaZar
Contributor Author

ilyaZar commented May 27, 2025

Regarding get_golem_config: this is kind of the same initial problem

  • as an end user of golem you usually do not alter the {golem} package

  • since {golem} needs a pre-defined way to work with configs, the user has to adhere to the {golem} conventions to set configs anyway

Yes, it’s even possible to “purge” app_config.R along with get_golem_config(), and rename the file to app_file_access.R and only keep app_sys() there (it is more general, finding arbitrary files)

@ColinFay
Member

Thank you both for the discussion.

I'm under the impression that we should support two cases:

  1. User sticks to the default. File is named golem-config.yml and we don't support renaming. My two cents being that:
    a. Any other workaround (like storing a file with the path) will end up the same, what if they change the name of the file storing the name, or the env var that contains the path, or other things like that. This feels too complex to support, and I'm ok with not supporting a renaming of this file.
    b. Having files that are supposed to be named a certain way sounds ok: for example {renv} has renv/activate.R, {here} has .here, a package has DESCRIPTION, etc., so I'm pretty comfortable with the idea of not allowing a file renaming.

  2. File can be passed as an env var GOLEM_CONFIG_PATH that would be used if set, and otherwise it would default to the file in inst/.

I'm ok with not giving too much flexibility here, as we can't safely allow every possibility and still be reliable. Flexibility might create too much complexity here :)


So, to get back to the original issue: guess_where_config() not finding the config path because it parses text.

I agree with the original message from @ilyaZar

"One could simply restrict the possible values to the file argument in the docs, as there are infinitely many nestings and possible function calls identifying the config .yml from the file argument."

Parsing the file will indeed be too complex because of the infinite possibilities the user could do. For example, what if app_sys() is renamed super_duper_fun? What if app_config.R is renamed i_have_no_idea_what_im_doing.R?

I suggest to then stick to the following:

  • In get_golem_config(), we use file = golem::get_current_config()

Then golem::get_current_config() works this way

  • If GOLEM_CONFIG_PATH is set: use this path
  • If not, default to `app_sys("golem-config.yml")`

And we do not (natively) support:

  • Renaming the config file
  • Using another mechanism to point to the config file.

Then, if advanced developers want to write another way to access this file, I feel like they would be advanced enough to write their own wrapper function, as @LDSamson did.

Also, to answer remarks from @ilyaZar:

  • If there is a typo in the env var => meh, I suppose that happens and that's not up to golem to support that
  • hardcoded path => given that golem apps are packages, there should be no hardcoded paths, as they are not portable.

Your idea for stopping if env var is set but the file doesn't exist is great though.

Given that I've merged #1209 and the args have been renamed, I'll cherry pick these two commits into a new branch and merge to master.

Thanks a lot for this insightful discussion 🫶

@ColinFay
Member

I think I'll go for something even simpler in app_config, so that it's more obvious that you can tweak things here if you need to

get_golem_config <- function(
  value,
  config = Sys.getenv(
    "GOLEM_CONFIG_ACTIVE",
    Sys.getenv(
      "R_CONFIG_ACTIVE",
      "default"
    )
  ),
  use_parent = TRUE,
  # If you don't want to use the default config file,
  # set a `GOLEM_CONFIG_PATH` environment variable that points
  # to your custom config file.
  file = Sys.getenv(
    "GOLEM_CONFIG_PATH",
    app_sys("golem-config.yml")
  )
) {
  config::get(
    value = value,
    config = config,
    file = file,
    use_parent = use_parent
  )
}

@ColinFay
Member

ColinFay commented Jun 5, 2025

This has been rebased and integrated in #1210

@ColinFay ColinFay closed this Jun 5, 2025