@vepadulano
Member

When the input data source is a TTree, `GetColumnNames` gathers the list of all available TTree branches. If the tree has two branches (e.g. `el1` and `el2`) that each have a sub-branch with the same name (e.g. `electron_pt`), TTree still allows calling `GetBranch("electron_pt")` and returns a pointer to the sub-branch of the first main branch (i.e. `el1.electron_pt`). This behaviour can lead to ambiguities, so this PR avoids exposing the ambiguous column name via RDF.

A test is added to exemplify this case.
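
To make the ambiguity concrete, here is a minimal standalone sketch of one way such short names could be filtered. It is not the code in this PR and does not use ROOT. `LeafName` and `ExposedColumnNames` are hypothetical helpers, and the rule "keep a short name only if exactly one branch path ends with it" is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not a ROOT function): last component of a dotted
// branch path, e.g. "el1.electron_pt" -> "electron_pt".
std::string LeafName(const std::string &fullPath)
{
   assert(!fullPath.empty());
   const auto pos = fullPath.rfind('.');
   return pos == std::string::npos ? fullPath : fullPath.substr(pos + 1);
}

// Hypothetical helper: given the full paths of all branches, return the
// column names to expose. Full paths are always kept. The short name of a
// sub-branch is added only when no other path ends with the same name.
std::vector<std::string> ExposedColumnNames(const std::vector<std::string> &fullPaths)
{
   // Count how many paths share each leaf name.
   std::map<std::string, int> leafCount;
   for (const auto &path : fullPaths)
      ++leafCount[LeafName(path)];

   std::vector<std::string> columns;
   for (const auto &path : fullPaths) {
      columns.push_back(path);
      const bool isSubBranch = path.find('.') != std::string::npos;
      if (isSubBranch && leafCount[LeafName(path)] == 1)
         columns.push_back(LeafName(path)); // unambiguous short name
   }
   return columns;
}
```

With the input `{"el1.electron_pt", "el2.electron_pt", "ev.run"}`, the sketch keeps both full `electron_pt` paths but never exposes the bare `electron_pt`, while `run` is still available as a short name because only one branch path ends with it.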

This PR fixes #19392

Note that this is a draft PR: the fix is fairly obvious, but I am not sure that it won't break other tests.

@vepadulano vepadulano requested review from enirolf and pcanal July 17, 2025 14:05
@vepadulano vepadulano self-assigned this Jul 17, 2025
@vepadulano vepadulano closed this Jul 17, 2025
@vepadulano vepadulano reopened this Jul 17, 2025
@vepadulano vepadulano added the clean build Ask CI to do non-incremental build on PR label Jul 17, 2025
@vepadulano vepadulano closed this Jul 17, 2025
@vepadulano vepadulano reopened this Jul 17, 2025
@dpiparo
Member

dpiparo commented Nov 2, 2025

A few builds failed because lcgpackages.web.cern.ch was unreachable, not because of the proposed changes.

@github-actions

github-actions bot commented Nov 2, 2025

Test Results

    22 files      22 suites   3d 18h 42m 47s ⏱️
 3 705 tests  3 704 ✅ 0 💤 1 ❌
79 556 runs  79 554 ✅ 0 💤 2 ❌

For more details on these failures, see this check.

Results for commit 028676a.

♻️ This comment has been updated with latest results.

@vepadulano vepadulano closed this Nov 2, 2025
@vepadulano vepadulano reopened this Nov 2, 2025
Labels

clean build Ask CI to do non-incremental build on PR in:RDataFrame in:TTree

Development

Successfully merging this pull request may close these issues.

[df] Sub-branches get wrongly added as top-level columns

2 participants