@nadavelkabets nadavelkabets commented Jun 14, 2025

Closes #514.

The most ament_cmake way to do this is to change up the way that ament_python_install_package works. For most things that are part of ament_cmake, the way that they work is that the call (to e.g. ament_export_libraries) doesn't do much work, and instead sets up a bunch of cmake variables, along with a hook. When ament_package is eventually called, it calls all of the hooks, and at that point the hook does the work based on the cmake variables. This is done this way so that ament_package has a full view of the environment, and avoids problems exactly like this.

For whatever reason, ament_python_install_package does not work like this at the moment. We should change ament_python_install_package to work in the more ament_cmake way, which will resolve this bug.

Following the suggestion from @clalancette, calling ament_python_install_package now registers an environment hook and sets the parameters in environment variables, moving installation to the ament_package stage.

This makes it possible to extend an existing package, and therefore to use rosidl_generate_interfaces and ament_python_install_package at the same time.
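
The deferred pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ament_cmake implementation: `Project`, `python_install_package` and `package` are hypothetical stand-ins for the CMake state, for ament_python_install_package and for ament_package.

```python
from typing import Callable, Dict, List


class Project:
    """Hypothetical stand-in for the state of one CMake project."""

    def __init__(self) -> None:
        self.python_dirs: Dict[str, List[str]] = {}
        self.hooks: List[Callable[[], List[str]]] = []

    def python_install_package(self, name: str, source_dir: str) -> None:
        """Record the request; nothing is installed at call time."""
        if not self.python_dirs:
            # First registration: add the hook that does the real work later.
            self.hooks.append(self._install_registered)
        self.python_dirs.setdefault(name, []).append(source_dir)

    def _install_registered(self) -> List[str]:
        """Hook body: sees every directory registered for each package."""
        return [
            f"install {src} -> site-packages/{name}"
            for name, dirs in self.python_dirs.items()
            for src in dirs
        ]

    def package(self) -> List[str]:
        """Stand-in for ament_package(): run all hooks once everything is known."""
        actions: List[str] = []
        for hook in self.hooks:
            actions.extend(hook())
        return actions
```

Because the hook only runs inside `package()`, two registrations for the same package name are both visible when installation happens, which is what allows one package to extend another.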

Related:

@nadavelkabets nadavelkabets changed the title from "feature: allow extending a python package in ament_python_isntall_package" to "feature: allow extending a python package in ament_python_install_package" Jun 14, 2025
@nadavelkabets nadavelkabets force-pushed the feature/ament-python-extend-package branch from d83479d to 00311aa Compare June 14, 2025 15:35
@nadavelkabets nadavelkabets marked this pull request as ready for review June 14, 2025 15:39
Signed-off-by: Nadav Elkabets <[email protected]>
@nadavelkabets nadavelkabets force-pushed the feature/ament-python-extend-package branch from e2862ff to 0f9b36d Compare June 14, 2025 17:59
Signed-off-by: Nadav Elkabets <[email protected]>
@nadavelkabets nadavelkabets force-pushed the feature/ament-python-extend-package branch from 0f9b36d to a8419ae Compare June 14, 2025 17:59
pszenher added a commit to pszenher/Distributional_RL_Decision_and_Control that referenced this pull request Jun 20, 2025
While the monopackage pattern (containing both message interfaces and
behavioral nodes) isn't intentionally unsupported by ROS2, it
currently cannot support packages using `ament_cmake_python` to build
C++ and Python code alongside `rosidl` message generation.

This is a well-known and long-standing issue;  see:
  ros2/rosidl_python#141
  ament/ament_cmake#514
  ament/ament_cmake#587

Accordingly, to support well-integrated inclusion of Python code in
this package (i.e., by adding `run_vrx_experiments.py` to the CMake
build system flow), this commit splits `virelex_msgs` out into a
separate package.  An argument could be made for converting
`run_vrx_experiments.py` into a Python-syntax launch file, but it does
a bit too much (defines a class, etc.) for that to feel like the
correct option.

In addition, this commit integrates the `pretrained_models` folder
into the new `trained` subdirectory of `virelex`, providing a stable
filesystem path to access those models during node execution (via
`get_package_share_directory("virelex")`).

rkent commented Jun 26, 2025

I've done an initial check of the code, and it generally looks good, except for one issue. As far as I can tell, you are not supporting AMENT_CMAKE_SYMLINK_INSTALL. Symlink install is probably incompatible with the concept of merging together multiple same-named python packages, but this need to merge is really rather rare, and symlink_install makes sense in the vast majority of packages that do not need this merger. It seems to me that the correct thing to do is to disable symlink install with a warning when multiple same-named packages are detected. Or did I miss something about how you supported this?

I have not actually tested yet, but will soon.

nadavelkabets commented Jun 27, 2025

> I've done an initial check of the code, and it generally looks good, except for one issue. As far as I can tell, you are not supporting AMENT_CMAKE_SYMLINK_INSTALL. Symlink install is probably incompatible with the concept of merging together multiple same-named python packages, but this need to merge is really rather rare, and symlink_install makes sense in the vast majority of packages that do not need this merger. It seems to me that the correct thing to do is to disable symlink install with a warning when multiple same-named packages are detected. Or did I miss something about how you supported this?

I kept the original package installation code without changes. To my knowledge, colcon changes the behavior of the install command depending on the AMENT_CMAKE_SYMLINK_INSTALL flag.
It appears that merging works correctly even with symlink install, because the install command symlinks each file in the package recursively.

rkent commented Jun 27, 2025

This is the code from the original that does not have an equivalent in your PR. (I matched every line of the old code with your new code, and this is the only difference that I found). I'll test it today though to confirm whether it works or not.

@rkent rkent left a comment

I've thoroughly checked the old code vs the new code, and confirmed that all of the old code has corresponding functionality in the new code, with one exception. I've also tested using a very simple combined msg/ directory with a python package of the same name, and everything works as I expect.

Concerning the exception code, I've compared results of the new code with old code for --symlink-install, and find that the new code works fine with the same result. As @nadavelkabets points out, the modifications to the install command are taking care of the symlinks. The only difference I could find is that in the old code, a symlink is created in the build/ directory, that is not present in the new code. But the end result in the /install directory is the same in both cases, so it is not clear to me why this symlink is needed. Trying to add that symlink would lead to a conflict between duplicate python packages and --symlink-install, so it is good that it is not needed.

But as @hidmic points out in PR #326 that created these symlinks, "There should be a Here be dragons sign in ament_cmake root README..." I'm afraid that fighting those dragons here is beyond my expertise. AFAICT that code is not needed. But perhaps @nadavelkabets you could give your own defense of why.

So with one nit, I've reviewed as thoroughly as I can and find no issues, so here's my gray checkmark.

nadavelkabets commented Jul 5, 2025

> Concerning the exception code, I've compared results of the new code with old code for --symlink-install, and find that the new code works fine with the same result. As @nadavelkabets points out, the modifications to the install command are taking care of the symlinks. The only difference I could find is that in the old code, a symlink is created in the build/ directory, that is not present in the new code. But the end result in the /install directory is the same in both cases, so it is not clear to me why this symlink is needed. Trying to add that symlink would lead to a conflict between duplicate python packages and --symlink-install, so it is good that it is not needed.
>
> But as @hidmic points out in PR #326 that created these symlinks, "There should be a Here be dragons sign in ament_cmake root README..." I'm afraid that fighting those dragons here is beyond my expertise. AFAICT that code is not needed. But perhaps @nadavelkabets you could give your own defense of why.

Hi! Thanks for the review and sorry for the delay.

This was my previous code before switching symlink to copy during build:

macro(_ament_cmake_python_copy_or_symlink package_name)
  set(_sync_target "ament_cmake_python_sync_${package_name}")

  set(_dsts  "")
  set(_srcs  "")
  foreach(_dir IN LISTS _PACKAGE_DIRS)
    file(GLOB_RECURSE _dir_files CONFIGURE_DEPENDS RELATIVE "${_dir}" "${_dir}/*")
    foreach(_rel IN LISTS _dir_files)
      set(_src "${_dir}/${_rel}")
      set(_dst "${_build_dir}/${package_name}/${_rel}")

      list(FIND _dsts "${_dst}" _idx)
      if(NOT _idx EQUAL -1)
        list(REMOVE_AT _dsts  ${_idx})
        list(REMOVE_AT _srcs  ${_idx})
      endif()
      list(APPEND _dsts "${_dst}")
      list(APPEND _srcs "${_src}")
    endforeach()
  endforeach()

  set(_sync_deps "")
  list(LENGTH _dsts _len)
  if(_len GREATER 0)
    math(EXPR _last "${_len} - 1")
    foreach(_file_idx RANGE 0 ${_last})
      list(GET _dsts ${_file_idx} _dst)
      list(GET _srcs ${_file_idx} _src)

      get_filename_component(_dst_parent "${_dst}" DIRECTORY)
      file(MAKE_DIRECTORY "${_dst_parent}")

      if(AMENT_CMAKE_SYMLINK_INSTALL)
        add_custom_command(
          OUTPUT  "${_dst}"
          COMMAND ${CMAKE_COMMAND} -E create_symlink "${_src}" "${_dst}"
          DEPENDS "${_src}"
          COMMENT "Symlinking ${_dst}"
          VERBATIM
        )
      else()
        add_custom_command(
          OUTPUT  "${_dst}"
          COMMAND ${CMAKE_COMMAND} -E copy_if_different "${_src}" "${_dst}"
          DEPENDS "${_src}"
          COMMENT "Copying    ${_dst}"
          VERBATIM
        )
      endif()
      list(APPEND _sync_deps "${_dst}")
    endforeach()
  endif()

  if(_SETUP_CFG)
    set(_cfg_dst "${_build_dir}/setup.cfg")
    if(AMENT_CMAKE_SYMLINK_INSTALL)
      set(_copy_cmd ${CMAKE_COMMAND} -E create_symlink "${_SETUP_CFG}" "${_cfg_dst}")
    else()
      set(_copy_cmd ${CMAKE_COMMAND} -E copy_if_different "${_SETUP_CFG}" "${_cfg_dst}")
    endif()

    add_custom_command(
      OUTPUT  "${_cfg_dst}"
      COMMAND ${_copy_cmd}
      DEPENDS "${_SETUP_CFG}"
      COMMENT "Synchronising setup.cfg"
      VERBATIM
    )
    list(APPEND _sync_deps "${_cfg_dst}")
  endif()

  add_custom_target(${_sync_target} DEPENDS ${_sync_deps})

endmacro()

I was also confused by the original implementation. After giving it some thought, I figured that the flag is named "symlink-install" and not "symlink-build". It should not matter to the user whether the build is done with symlinks or by copying the files. Moreover, the use of symlinks in the build does not affect the user, since the egg files that are built only contain the names of the files, and if a new file is added the user must rebuild either way.

The reason I changed this was to avoid generating scripts to execute during runtime.
There are three stages to a CMake build: configuration (when the CMake script is executed), build, and install. Previously, the package directory symlink was generated during the build stage. To allow the new override behavior, I had to "glob" the package directory in order to symlink each file. I could only find a way to execute this glob command during the configuration stage, which meant that generated files, like the ones created by rosidl_generate_interfaces, did not exist yet at that point.
This problem is solved when using "copy". There are builtin CMake commands to execute a copy during the build stage, and copy works out of the box for the override behavior.
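
The timing problem described above can be reproduced outside CMake. The Python sketch below is an analogy only: `glob_at_configure` and `copy_at_build` are hypothetical helpers standing in for a configure-time file glob and a build-time directory copy, and the file names are made up.

```python
import shutil
import tempfile
from pathlib import Path


def glob_at_configure(src: Path) -> list:
    """Snapshot of relative file paths, as a configure-time glob would see them."""
    return sorted(p.relative_to(src).as_posix() for p in src.rglob("*") if p.is_file())


def copy_at_build(src: Path, dst: Path) -> list:
    """Copy the whole tree at build time; return the relative paths now in dst."""
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "src_pkg"), Path(tmp, "build_pkg")
        src.mkdir()
        (src / "__init__.py").write_text("")
        snapshot = glob_at_configure(src)  # configure stage: generated file is missing
        (src / "msg").mkdir()
        (src / "msg" / "_my_msg.py").write_text("# generated\n")  # produced during build
        copied = copy_at_build(src, dst)  # later in the build: sees the generated file
        return snapshot, copied
```

The snapshot taken before generation never learns about `msg/_my_msg.py`, while the later whole-directory copy picks it up without needing a file list.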

Signed-off-by: Nadav Elkabets <[email protected]>
@nadavelkabets

@christophebedard I think we can move forward with this PR.
How do we proceed?

nadavelkabets commented Jul 25, 2025

@mjcarroll
Do you think you could find the time to give this a look?

Also, as far as I'm aware, there are currently no tests for this package. I'm not really familiar with testing methodology in cmake and testing conventions in ament_cmake. How do you usually approach this?

rkent commented Jul 31, 2025

I see other PRs making progress in this repo, with reviews by @cottsay. Is there any way we could get some attention to this PR? @tfoote or @christophebedard, perhaps you could help.

cottsay commented Aug 1, 2025

Hi there. After reading back through the discussions about this change, things make sense to me. It would give us a LOT more confidence if we had tests. I agree that it's not super straightforward to do, but it's totally possible to do. If you have some cycles, would you mind writing at least a simple test to exercise the code?

I see a couple of possible approaches. You can just include the extension and use it (as was done for ament_cmake_vendor_package) or write one or more smoke test packages and then build/test it using something like ExternalProject_Add.

rkent commented Aug 4, 2025

@nadavelkabets will you be able to add a test for this?

@nadavelkabets

@cottsay @rkent I added tests, but apparently ament_cmake_test depends on ament_cmake_python, so ament_add_test or ament_add_pytest_test fail due to a circular dependency:

nadav@nadav:/mnt/hgfs/rclpy/ament_cmake$ colcon build
[0.613s] ERROR:colcon:colcon build: Unable to order packages topologically:
ament_cmake: ['ament_cmake_gen_version_h', 'ament_cmake_gtest', 'ament_cmake_pytest', 'ament_cmake_python', 'ament_cmake_test']
ament_cmake_auto: ['ament_cmake', 'ament_cmake_gen_version_h', 'ament_cmake_gmock', 'ament_cmake_gtest', 'ament_cmake_pytest', 'ament_cmake_python', 'ament_cmake_test']
ament_cmake_gen_version_h: ['ament_cmake_gtest', 'ament_cmake_pytest', 'ament_cmake_python', 'ament_cmake_test']
ament_cmake_gmock: ['ament_cmake_gtest', 'ament_cmake_pytest', 'ament_cmake_python', 'ament_cmake_test']
ament_cmake_google_benchmark: ['ament_cmake_pytest', 'ament_cmake_python', 'ament_cmake_test']
ament_cmake_gtest: ['ament_cmake_pytest', 'ament_cmake_python', 'ament_cmake_test']
ament_cmake_pytest: ['ament_cmake_python', 'ament_cmake_test']
ament_cmake_python: ['ament_cmake_pytest', 'ament_cmake_test']
ament_cmake_test: ['ament_cmake_pytest', 'ament_cmake_python']
ament_cmake_vendor_package: ['ament_cmake_pytest', 'ament_cmake_python', 'ament_cmake_test']

Do you have any creative ideas to solve this?
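
For readers unfamiliar with this kind of error, the sketch below shows how a topological sort detects such a cycle, using Python's standard `graphlib` module (Python 3.9+). It assumes that each line of the listing above reads as "package: dependencies that could not be ordered", and it uses only three of the listed packages; `find_cycle` is a hypothetical helper, not part of colcon.

```python
from graphlib import CycleError, TopologicalSorter

# Subset of the listing above, read as: package -> packages it depends on.
deps = {
    "ament_cmake_python": {"ament_cmake_pytest", "ament_cmake_test"},
    "ament_cmake_pytest": {"ament_cmake_python", "ament_cmake_test"},
    "ament_cmake_test": {"ament_cmake_pytest", "ament_cmake_python"},
}


def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if the graph is acyclic."""
    try:
        list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # graphlib stores the cycle in args[1]; its first and last nodes are equal.
        return err.args[1]
    return None
```

Moving the tests into a package that nothing else depends on removes the back edge, so the remaining graph can be ordered again.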

rkent commented Aug 19, 2025

I tried an alternative approach in #597 which seems to work. That is, I created a separate package ament_cmake_python_test that only does the test. "Works" in the sense that it gets past the problem in the previous comment.

However, now that the test is running there is a new issue. You can see that your test actually fails when it uses the overlay. I've spent a bit of time understanding this, and I'm pretty sure this is a race condition issue. That is, the test (and really this whole PR) assumes that if we do two calls to ament_python_install_package, then the first call is completed before the second call executes. I don't think that is true, they execute in parallel. So depending on which one goes first, the root __init__.py in the final install directory could be either the one from ament_python_test_package, or the one from ament_python_test_package_overlay. In your test, you are looking for the second one, but what is there is the first one.

It's a race condition because copy_directory is supposed to overwrite files that have the same name, but that is not occurring here because the operations are parallel, not sequential.

I can make the test pass by arbitrarily adding some delaying elements to ament_cmake_python_install_registered_packages.cmake. I don't have a clear example of that yet, but I have seen it locally.

I think what needs to be done is to add some target dependencies to at least make it deterministic. But in general, the problem exposed by the test is real. That is, if you merge two different subdirectories into the same python package, and there are duplicate file or directory names, there is no way to really resolve that.

In the use case that I care about, where there are python packages and custom messages in the same package, the files to merge from the custom messages are fairly predictable, so it should be possible to have reliable results. If someone tries to merge two different subdirectories into a single python package using this PR, we'll just have to assume that they know what they are doing. Some sort of warning message might be indicated.

rkent commented Sep 4, 2025

I just want to add an update on my work on this. I've mostly been working on testing. Rather than simply manually testing the features, I've been putting together an automated test that exercises the various options of ament_python_install_package. Essentially I run colcon build on a variety of test packages that exercise various features. I'm about halfway through with this, and I'll propose it formally when I am done.

On the existing test, I'm not quite sure what exactly you are trying to test for in the various asserts. Python asserts have an optional clause where you can specify what you expected and why, or you can do it in the comments. If you have ever been in the position of submitting a PR that causes a test failure in something completely unrelated to what you were doing, you'll appreciate why it is important to document why a test is there. If you want your tests added, they really need to be clear as to their purpose.

As to issues with the code itself, I have not really gone over it thoroughly yet. But there is one known issue we need to discuss: ordering-dependent behavior of the merges, particularly as it affects the overall module __init__.py.

I think I said earlier that the correct behavior would be to respect the order of items added in CMakeLists.txt, but I am having second thoughts about that. I believe that by far the main motivation for this PR is to be able to combine custom IDL generation (messages, actions, and services) with python code. In that case, the __init__.py from the IDL code is blank, and the correct behavior should be to use the __init__.py from the python module. My tests show this is not the current behavior. If we leave it like this, about half of existing uses will possibly fail if the python module is added before the IDL generation. I think we should discuss the desired behavior further, and fix it if that makes sense.

nadavelkabets commented Sep 6, 2025

> I think I said earlier that the correct behavior would be to respect the order of items added in CMakeLists.txt, but I am having second thoughts about that. I believe that by far the main motivation for this PR is to be able to combine custom IDL generation (messages, actions, and services) with python code. In that case, the __init__.py from the IDL code is blank, and the correct behavior should be to use the __init__.py from the python module. My tests show this is not the current behavior. If we leave it like this, about half of existing uses will possibly fail if the python module is added before the IDL generation. I think we should discuss the desired behavior further, and fix it if that makes sense.

My refactor added the ability to overlay multiple python packages, following their declaration order in CMakeLists.txt.
In the existing implementation, declaration of multiple packages with the same name is not allowed and results in an error.
As such, no existing user code should fail.
Furthermore, in a package that combines user code and generated messages, the only effect of a wrong declaration order will be an empty root __init__.py file, which is not catastrophic (as long as we document that behavior).
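
As a rough illustration of the order-dependent overlay described above, the Python sketch below copies two toy source trees into one destination in a given order. It is not the CMake implementation: `overlay` and `root_init_after` are hypothetical helpers, and the file names and contents are invented.

```python
import shutil
import tempfile
from pathlib import Path


def overlay(sources, dest):
    """Copy each source tree into dest in order; later trees overwrite earlier ones."""
    for src in sources:
        shutil.copytree(src, dest, dirs_exist_ok=True)


def root_init_after(order):
    """Build two toy trees, overlay them in `order`, return the surviving root __init__.py."""
    with tempfile.TemporaryDirectory() as tmp:
        trees = {"user": Path(tmp, "user"), "generated": Path(tmp, "generated")}
        for tree in trees.values():
            tree.mkdir()
        (trees["user"] / "__init__.py").write_text("from .node import main\n")
        (trees["user"] / "node.py").write_text("def main():\n    return 0\n")
        (trees["generated"] / "__init__.py").write_text("")  # blank generated file
        (trees["generated"] / "msg").mkdir()
        (trees["generated"] / "msg" / "_status.py").write_text("# generated\n")

        dest = Path(tmp, "install", "my_pkg")
        overlay([trees[name] for name in order], dest)
        return (dest / "__init__.py").read_text()
```

With the generated tree first and the user tree second, the user's `__init__.py` survives; in the reverse order the merged package ends up with the blank one, which is the "wrong declaration order" effect mentioned above.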

I agree with you that combining idl generated interfaces with user code should be the only use case of this feature.
Moreover, I think that we should not allow multiple user created packages with the same name.
With that said, the current architecture and integration between ament_python_install_package and rosidl_generate_interfaces is complicated and lacking.
For example, the build target for ament_python_install_package has no dependency on the build target of rosidl_generate_interfaces, resulting in the built python message files missing from the generated SOURCES.txt file.

This is the current behavior:

  1. Each source dir is copied to the build dir (the idl messages are not compiled yet so an empty directory is copied)
  2. Python egg-info is compiled in the ament_python build dir
  3. Idl messages are compiled
  4. Files are copied or symlinked to the install directory recursively from each source dir, now with all the interfaces that have finished building.
  5. Generated setuptools files are copied or symlinked from the build dir to the install dir.

In my opinion, the ideal behavior would be the following:

  1. Idl messages are compiled
  2. User source dir and generated messages are copied or symlinked recursively into the build dir
  3. Python egg-info data is compiled in the build dir
  4. The build dir is symlinked or copied recursively into the install dir

To achieve that, this refactor is not enough, and modifications to rosidl_generator_py are required.
As rosidl_generator_py currently utilizes ament_python_install_package the same way as user code, I'm not sure how to allow combining generated messages and user code while preventing users from declaring multiple packages themselves, or causing strange behavior by declaring in the wrong order.

I might need to think about this further. Maybe some maintainers can provide feedback on the matter.

rkent commented Sep 6, 2025

I'm wondering if in your code:

        file(INSTALL
          DESTINATION \"\${_dest}\"
          TYPE DIRECTORY
          FILES \"\${_dir}\"
          PATTERN \"*.pyc\"       EXCLUDE
          PATTERN \"__pycache__\" EXCLUDE
        )

we could first detect an empty "__init__.py" wanting to overwrite a non-empty one, and exclude that file from the copy, so that this would work with combined idl/python code without having to worry about order.
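
A minimal Python sketch of that rule follows, under the assumption that "empty" means a zero-byte file. `overlay_keep_nonempty_init` and `merged_result` are hypothetical helpers for illustration and do not correspond to code in this PR.

```python
import shutil
import tempfile
from pathlib import Path


def overlay_keep_nonempty_init(src: Path, dest: Path) -> None:
    """Copy src into dest, but never let an empty __init__.py replace a non-empty one."""
    for path in sorted(src.rglob("*")):
        target = dest / path.relative_to(src)
        if path.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        is_empty_init = path.name == "__init__.py" and path.stat().st_size == 0
        if is_empty_init and target.exists() and target.stat().st_size > 0:
            continue  # keep the existing, contentful __init__.py
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)


def merged_result(order):
    """Overlay toy 'user' and 'generated' trees in `order`; return (root init, file list)."""
    with tempfile.TemporaryDirectory() as tmp:
        trees = {"user": Path(tmp, "user"), "generated": Path(tmp, "generated")}
        (trees["user"]).mkdir()
        (trees["generated"] / "msg").mkdir(parents=True)
        (trees["user"] / "__init__.py").write_text("from .node import main\n")
        (trees["user"] / "node.py").write_text("def main():\n    return 0\n")
        (trees["generated"] / "__init__.py").write_text("")
        (trees["generated"] / "msg" / "_status.py").write_text("# generated\n")

        dest = Path(tmp, "install", "my_pkg")
        for name in order:
            overlay_keep_nonempty_init(trees[name], dest)
        files = sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file())
        return (dest / "__init__.py").read_text(), files
```

Under this rule both call orders give the same merged package. It does not decide what should happen when two non-empty `__init__.py` files meet; in this sketch the later one still wins.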

By the way, I've modified the tests from #598 to work with your code, and all of the tests pass, except one I added where the python install is called before the msg install. I think that will go a long way to making @cottsay comfortable with this change.

nadavelkabets commented Sep 6, 2025

> By the way, I've modified the tests from #598 to work with your code, and all of the tests pass, except one I added where the python install is called before the msg install. I think that will go a long way to making @cottsay comfortable with this change.

First of all, thanks for the time you invested into testing this code.
As far as I understand, the test failure does not indicate a bug as currently this is the intended behavior (for the second install to override the first). Is that correct?

> we could first detect an empty "__init__.py" wanting to overwrite a non-empty one, and exclude that file from the copy, so that this would work with combined idl/python code without having to worry about order.

My issue with this approach is that it's really case specific.
Do we allow users to overlay as many packages as they want? What if we had more than 2 __init__.py files?
What if some user decided to install multiple packages, all with non-empty __init__.py files?
We are opening the door for a lot of edge cases that we did not intend to introduce as we are now allowing a new use case that is unrelated to this bug we are trying to solve.

If we're okay with giving users the freedom to overlay as many self-made packages as they want, I think that the order-dependent implementation is the simplest solution. While I agree that it might be confusing, I think that a more complex solution might make it even more confusing for users if they didn't utilize this feature exactly as we intended.

@nadavelkabets

Also, regarding the build target dependency issue, perhaps we should add a new DEPENDS flag for the ament_python_install_package function and modify rosidl_generator_py to add its build target as a dependency.

rkent commented Sep 6, 2025

> First of all, thanks for the time you invested into testing this code. As far as I understand, the test failure does not indicate a bug as currently this is the intended behavior (for the second install to override the first). Did I understand correctly?

Yes, you understand correctly. I am specifically testing that the python package's __init__.py is the one that survives, even if the python install is called before the msg install. So the issue is what the desired behavior is, not some detected problem with your code.

> we could first detect an empty "__init__.py" wanting to overwrite a non-empty one, and exclude that file from the copy, so that this would work with combined idl/python code without having to worry about order.

> My issue with this approach is that it's really case specific.

I would say that keeping a contentful __init__.py over an empty one is not case specific to the case of combining idl with python, but would also work if someone tried to overlay two python packages.

> What if some user decided to install multiple packages, all with non-empty __init__.py files? We are opening the door for a lot of edge cases that we did not intend to introduce as we are now allowing a new use case that is unrelated to this bug we are trying to solve.

IMHO trying to worry about possible but unlikely abuses should not overrule the desire to make this "just work" for the most common use case, without adding the footgun of having to worry about which comes first in CMakeLists.txt

rkent commented Sep 6, 2025

> Also, regarding the build target dependency issue, perhaps we should add a new DEPENDS flag for the ament_python_install_package function and modify rosidl_generator_py to add its build target as a dependency.

You are more experienced in cmake than I am, so I'm not going to push "my" solution over yours. Yes I think doing this with dependencies would be preferable to my __init__.py kludge.

@nadavelkabets

> You are more experienced in cmake than I am, so I'm not going to push "my" solution over yours. Yes I think doing this with dependencies would be preferable to my __init__.py kludge.

That's a solution for a different bug that exists in the current implementation and is not solved by my refactor.

> If we're okay with giving users the freedom to overlay as many self-made packages as they want, I think that the order-dependent implementation is the simplest solution. While I agree that it might be confusing, I think that a more complex solution might make it even more confusing for users if they didn't utilize this feature exactly as we intended.

@cottsay what's your take on the subject? Should we always take the non-empty root __init__.py file, resulting in a correct installation no matter the call order to ament_python_install_package and rosidl_generate_interfaces, or should we overlay the files following the call order in the CMakeLists file?

rkent added a commit to rkent/ament_cmake that referenced this pull request Sep 8, 2025
Co-authored-by: R. Kent James <[email protected]>
Co-authored-by Nadav Elkabets <[email protected]>

Signed-off-by: R Kent James <[email protected]>

rkent commented Sep 8, 2025

I've added my test code now in #599 and all is working well. In that code, I proposed a change so that when idl code is combined with python code, the idl code is installed first so that the blank __init__.py from the idl does not overwrite the possible non-blank __init__.py from the python package.

I'm not aware of any other issues in this PR's code but I want to look it over in detail one more time in the next day or so. I'd be interested in @nadavelkabets 's take on both that proposed fix, as well as the proposed test code.

cottsay commented Sep 8, 2025

> Should we always take the non-empty root __init__.py file, resulting in a correct installation no matter the call order to ament_python_install_package and rosidl_generate_interfaces, or should we overlay the files following the call order in the CMakeLists file?

Hmm. Looking at CMake's own installation behavior:

cmake_minimum_required(VERSION 3.15)
project(foo NONE)

install(FILES b/foo.txt DESTINATION share/)
install(FILES a/foo.txt DESTINATION share/)

When I run this, both installs are performed in the order they're defined in, so the latter one overwrites.

As much as I'd like it to be that simple, I believe I can reproduce the inconsistency you're discussing if I put the files in the same call to install() like this:

install(FILES a/foo.txt b/foo.txt DESTINATION share/)

So it seems that CMake isn't entirely consistent with itself. Since package authors will be adding the "overlays" of the package data in separate calls from their CMakeLists, I think it would probably be good to align with the behavior when separate install() invocations are made and always overwrite with the files in the later calls (if possible).


Separately, this could land us in a tricky spot if different "overlays" each include non-empty __init__.py that need to be merged (which is obviously way out of scope for what we're trying to do here) :(

rkent commented Sep 8, 2025

@cottsay I agree with one caveat - in the most common case of the combined use of idl files and python packages, the idl file will always have an empty __init__.py file, so it should be installed first to allow for a possible non-empty python package __init__.py. This PR as written does not do that, but there are changes in #599 to do it.

But I won't argue this further if there are strong opinions against it.

cottsay commented Sep 8, 2025

Taking this in another direction, would it be possible to explicitly disallow collisions entirely? Allow only additional Python modules to be added, and punt the merging problem entirely?
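
One way to picture the "no collisions" option is a check that runs before anything is copied. The Python sketch below is only an illustration of that idea: `find_collisions` and `demo` are hypothetical, and the toy trees assume that both the user package and the generated code ship a root `__init__.py`, as discussed earlier in this thread.

```python
import tempfile
from pathlib import Path


def find_collisions(source_dirs):
    """Map each relative file path provided by more than one source dir to those dirs."""
    providers = {}
    for src in source_dirs:
        src = Path(src)
        for path in src.rglob("*"):
            if path.is_file():
                providers.setdefault(path.relative_to(src).as_posix(), []).append(src.name)
    return {rel: dirs for rel, dirs in providers.items() if len(dirs) > 1}


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        user, generated = Path(tmp, "user"), Path(tmp, "generated")
        user.mkdir()
        (generated / "msg").mkdir(parents=True)
        (user / "__init__.py").write_text("from .node import main\n")
        (user / "node.py").write_text("def main():\n    return 0\n")
        (generated / "__init__.py").write_text("")
        (generated / "msg" / "_status.py").write_text("# generated\n")
        return find_collisions([user, generated])
```

In this toy layout the only collision is the root `__init__.py`, so a strict rule would have to either exempt that file or reject the combined IDL and python case.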

rkent commented Sep 8, 2025

How is that different from the current behavior?

As far as I can tell, the major motivation for this is that people have packages converted from ROS1 that combined idl definitions with python modules, and they don't want to separate them. That could be because their packages are used by others, and they want to preserve the previous design, or they just have too many to be practical to adapt.

I would be happy if we supported just the combined IDL and python case, but I don't see why you should restrict the weird case of trying to overlay two python packages, when it clearly could work with the code in this PR.

cottsay commented Sep 8, 2025

I guess I was under the impression that extending Python packages was the goal here, where a Python package can contain multiple Python modules. It's possible that I'm mistaken or that extending modules is not sufficient for the use case you're seeking, but avoiding the ambiguity in question would be preferable to choosing one behavior or the other, so I thought I'd ask.

rkent commented Sep 8, 2025

If I look at some of the people who have commented on #514, here are some repos and the underlying problem:

https://github.com/ADVRHumanoids/nicla_vision_ros2 (attempt to define idl and python in the same package)

https://github.com/ros-drivers/audio_common/tree/master/sound_play (the same)

@brta-jc said: "Currently we still maintain a fork of rosidl_generator_py and as such cannot build our packages on the upstream ROS industrial CI." so I assume it's the same issue.

@sillkjc said: "We maintain a codebase of 100+ packages and splitting them into msgs and srvs is inappropriate for some, and far too much work overall." so I assume the same issue.

So all of the examples I have seen are combining python with idl.

@nadavelkabets
Author

So all of the examples I have seen are combining python with idl

I agree that this is the issue we're trying to solve.
With that said, I must distinguish between the issue and the proposed solution.
This PR does not propose a solution specific to the issue, but a broader feature that allows merging multiple source python packages.
As much as I want this feature to be the perfect solution for our issue, I think the current implementation is the best compromise.
ament_cmake is a low-level build tool that IDL generation is built on. Introducing IDL-specific code into ament_cmake, as you proposed in #599, is in my opinion problematic, because it creates unhealthy and unexpected coupling between the two.
I agree that this specific use case might cause some confusion if a user decides to reverse the call order, but that could be solved easily by improving the documentation.

I'll go over your tests in #599, I agree that we should merge it before merging this PR.

@rkent

rkent commented Sep 17, 2025

In some of my testing with #599, I came across an issue that may or may not be real. One variation added multiple packages through cmake add_subdirectory statements. The problem comes when one of those packages hits the ament_package command, which then triggers ament_cmake_python_install_registered_packages. That command gets a list through get_property(_pkgs GLOBAL PROPERTY AMENT_CMAKE_PYTHON_PKGS) and then installs those packages.

The problem is that AMENT_CMAKE_PYTHON_PKGS is global and persists, and it is never cleared. So when the next package in a subdirectory is installed, and ament_cmake_python_install_registered_packages is hit again, it tries to re-install the previous package, which generates an error.

The solution is simple: just clear the list after ament_cmake_python_install_registered_packages. But I can still imagine problems if there were a hierarchy of subdirectories, where a parent package does not get properly registered if there is a child package that calls registration in the middle of its CMakeLists.txt.

I'm not sure if this is a real problem or not, or just an artifact of one way I tried to do testing. It probably does not make sense to have a cmake python package with 'sub packages'. But clearing the list after all packages are installed is probably a good idea anyway.
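
Editorial sketch of the failure mode described above, modeled in Python rather than CMake. The `Registry` class is a hypothetical stand-in for the global AMENT_CMAKE_PYTHON_PKGS list, and `install_registered` stands in for ament_cmake_python_install_registered_packages; the assumption that a second install of the same package raises an error follows the description in this comment, not the actual CMake code.

```python
class Registry:
    """Toy model of a global list of registered Python packages."""

    def __init__(self, clear_after_install):
        self.clear_after_install = clear_after_install
        self.pending = []    # models the global, persistent list
        self.installed = []  # packages that already have install rules

    def register(self, name):
        self.pending.append(name)

    def install_registered(self):
        # ASSUMPTION: installing the same package twice is an error.
        for name in self.pending:
            if name in self.installed:
                raise RuntimeError(f"{name} is already installed")
            self.installed.append(name)
        if self.clear_after_install:
            self.pending.clear()


def build_two_subdirectories(registry):
    """Model two sibling subdirectories that each register, then finalize."""
    registry.register("pkg_a")
    registry.install_registered()  # first subdirectory finalizes
    registry.register("pkg_b")
    registry.install_registered()  # second subdirectory finalizes
    return registry.installed
```

With `clear_after_install=True` both packages install once; with `False` the second call sees `pkg_a` again and raises, which mirrors the reported re-install error.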

@nadavelkabets
Author

I'm not sure if this is a real problem or not, or just an artifact of one way I tried to do testing. It probably does not make sense to have a cmake python package with 'sub packages'. But clearing the list after all packages are installed is probably a good idea anyway.

I feel like the usage of add_subdirectory contradicts the idea of ros packages. According to the official documentation: "A single workspace can contain as many packages as you want, each in their own folder. You can also have packages of different build types in one workspace (CMake, Python, etc.). You cannot have nested packages."
Clearing the list might solve your issue, but honestly this is not a supported use case so I believe this modification is not required.

@nadavelkabets
Author

@cottsay
Could you run CI on this PR?

@cottsay
Contributor

cottsay commented Sep 24, 2025

Could you run CI on this PR?

Here's a full build:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@rkent

rkent commented Sep 24, 2025

I feel like the usage of add_subdirectory contradicts the idea of ros packages.

That's fine, I don't feel strongly about it. I believe that ideally one of the points of writing tests is to identify edge cases that cause the solution to fail. We can then discuss whether the edge cases are real enough to warrant fixing. Right now I would argue that the "add_subdirectory" edge case is NOT worth fixing, but the "order matters with combined msg and python packages" IS worth fixing. But I can accept what others believe.

@nadavelkabets
Author

@cottsay
CI fails because this yaml is invalid - I think that a colon is missing after "ament/ament_cmake"

%YAML 1.1
---
repositories:
  ament/ament_cmake
    type: git
    url: https://github.com/nadavelkabets/ament_cmake.git
    version: feature/ament-python-extend-package

@cottsay
Contributor

cottsay commented Sep 24, 2025

CI fails because this yaml is invalid - I think that a colon is missing after "ament/ament_cmake"

Thanks, sorry about that. I've re-triggered the jobs and updated the links in the original comment.

rkent added a commit to rkent/ament_cmake that referenced this pull request Sep 25, 2025
@rkent

rkent commented Sep 25, 2025

On https://github.com/rkent/ament_cmake/tree/1-feature/ament-python-extend-package I have the tests from #598 with the mixed msg/python parts enabled. This all passes for me locally.

I've left the test of ordering of msg/python includes in CMakeLists.txt disabled, though as I said earlier I still think this is something that should work.


Development

Successfully merging this pull request may close these issues.

ament_cmake_python: Update ament_python_install_package() to allow installing more files into an existing package.

4 participants