Codecov Report
@@            Coverage Diff             @@
##           master    #3958      +/-   ##
==========================================
- Coverage   85.13%   85.13%   -0.01%
==========================================
  Files         378      378
  Lines       67117    67117
==========================================
- Hits        57138    57137       -1
- Misses       9979     9980       +1
janezd
left a comment
Thanks. I have a few small comments. This widget is indeed difficult to describe, but I guess we've done a good job now.
| The **Merge Data** widget is used to horizontally merge two datasets, based on the values of selected attributes (columns). In the input, two datasets are required, data and extra data. The widget allows selection of one or more attributes from each domain, which will be used to perform the merging. The widget produces one output. It corresponds to the instances from the input data to which attributes (columns) from input extra data are appended. | ||
|
|
||
| Merging is done by values of selected (merging) attributes. First, the value of the merging attribute from Data is taken and instances from Extra Data are searched for matching values. If more than a single instance from Extra Data was to be found, the attribute is removed from available merging attributes. | ||
| Merging is done by values of selected (merging) attributes. First, the value of the merging attribute from Data is taken and instances from Extra Data are searched for matching values. If the selected attribute does not contain unique values (in other words, the attribute has duplicate values), the attribute is removed from the available merging attributes. |
This is no longer true because now it suffices that a combination of attributes yields unique values. Now, it's an error if the chosen combination of values is not unique.
I'd also say "matching" instead of merging.
Actually, I'd remove this paragraph and change the previous one.
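To illustrate the uniqueness point above: a single attribute may contain duplicates while a combination of attributes is still unique. The following is only a minimal sketch in plain Python, not the widget's code; the helper name `duplicate_keys`, the column names, and the sample values are all made up for illustration.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the key combinations that occur more than once in rows.

    Hypothetical helper: rows is a list of dicts, key_columns the
    attributes chosen for matching.
    """
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Made-up Extra Data: "city" alone is not unique, ("city", "year") is.
extra = [
    {"city": "Ljubljana", "year": 2018, "value": 1},
    {"city": "Ljubljana", "year": 2019, "value": 2},
    {"city": "Maribor", "year": 2018, "value": 3},
]

single = duplicate_keys(extra, ["city"])            # one duplicated key
combined = duplicate_keys(extra, ["city", "year"])  # no duplicates
```

An empty result means the chosen combination identifies each row; a non-empty one corresponds to the error case described in the comment.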
| - Data: dataset with features added from extra data | ||
|
|
||
| The **Merge Data** widget is used to horizontally merge two datasets, based on values of selected attributes. In the input, two datasets are required, data and extra data. The widget allows selection of an attribute from each domain, which will be used to perform the merging. The widget produces one output. It corresponds to instances from the input data to which attributes from input extra data are appended. | ||
| The **Merge Data** widget is used to horizontally merge two datasets, based on the values of selected attributes (columns). In the input, two datasets are required, data and extra data. The widget allows selection of one or more attributes from each domain, which will be used to perform the merging. The widget produces one output. It corresponds to the instances from the input data to which attributes (columns) from input extra data are appended. |
The widget allows selection of one or more attributes from each domain, which will be used to perform the merging.
Maybe: Rows from the two data sets are matched by the values of pairs of attributes, chosen by the user.
| Merging is done by values of selected (merging) attributes. First, the value of the merging attribute from Data is taken and instances from Extra Data are searched for matching values. If the selected attribute does not contain unique values (in other words, the attribute has duplicate values), the attribute is removed from the available merging attributes. | ||
|
|
||
|  | ||
| Merge Data can merge also on more than one attribute. Click on the plus icon to add the attribute to merge on. The final result have to be unique combinations for each individual row. |
attribute -> attribute pair? (appears twice)
| 1. Information on main data. | ||
| 2. Information on data to append. | ||
| 3. Merging type: | ||
| - **Append columns from Extra Data** outputs all instances from Data appended by matching instances from Extra Data. When no match is found,unknown values are appended. |
Perhaps: Output all rows from the Data, augmented by columns from matching rows in the Extra data. Rows without matches are retained, though the data in extra columns is missing.
Appending sounds more like vertical merging (to me).
"Unknown values" sounds (to me) as if behaviour is undefined.
| 2. Information on data to append. | ||
| 3. Merging type: | ||
| - **Append columns from Extra Data** outputs all instances from Data appended by matching instances from Extra Data. When no match is found,unknown values are appended. | ||
| - **Find matching pairs of rows** outputs only matching instances. |
Perhaps: Find matching pairs of rows: similar to above, except that Data rows without matches are removed from the output.
| 3. Merging type: | ||
| - **Append columns from Extra Data** outputs all instances from Data appended by matching instances from Extra Data. When no match is found,unknown values are appended. | ||
| - **Find matching pairs of rows** outputs only matching instances. | ||
| - **Concatenate tables** outputs all instances from both inputs, even though the match may not be found. In that case unknown values are assigned. |
- Concatenate tables treats both data sources symmetrically. The output is similar to that of the first option, except that non-matched values from Extra data are appended at the end.
| - **Append columns from Extra Data** outputs all instances from Data appended by matching instances from Extra Data. When no match is found,unknown values are appended. | ||
| - **Find matching pairs of rows** outputs only matching instances. | ||
| - **Concatenate tables** outputs all instances from both inputs, even though the match may not be found. In that case unknown values are assigned. | ||
| 4. List of comparable attributes from Data input. |
|
|
||
| #####Concatenate tables (outer join) | ||
|
|
||
| The rows present in both the Data and the Extra Data will be present on the output. Where rows cannot be matched, missing values will appear. |
Perhaps: All rows from both ...
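The three merging types discussed in the comments above behave like a left, inner, and outer join. The following is a rough sketch in plain Python, not the widget's implementation: the function `merge`, its `how` values, and the sample columns are hypothetical. It assumes non-empty inputs, key combinations that are unique in the extra data, and no clashing column names. Copying the key values into unmatched rows in the outer case is a choice of this sketch, not a statement about what the widget outputs.

```python
def merge(data, extra, pairs, how="left"):
    """Merge two lists of dicts on pairs of (data_column, extra_column).

    how="left":  keep every row of data, missing values are None
    how="inner": keep only rows of data that have a match
    how="outer": like "left", then append unmatched rows of extra
    """
    data_keys = [d for d, _ in pairs]
    extra_keys = [e for _, e in pairs]
    data_cols = list(data[0])
    new_cols = [c for c in extra[0] if c not in extra_keys]
    # Index extra rows by their key combination (assumed unique).
    lookup = {tuple(row[k] for k in extra_keys): row for row in extra}

    merged, used = [], set()
    for row in data:
        key = tuple(row[k] for k in data_keys)
        match = lookup.get(key)
        if match is None and how == "inner":
            continue  # drop data rows without a match
        if match is not None:
            used.add(key)
        out = dict(row)
        for c in new_cols:
            out[c] = None if match is None else match[c]
        merged.append(out)

    if how == "outer":
        # Append extra rows that matched nothing in data.
        for key, row in lookup.items():
            if key in used:
                continue
            out = dict.fromkeys(data_cols)   # data columns are missing
            out.update(zip(data_keys, key))  # keep the key values
            out.update((c, row[c]) for c in new_cols)
            merged.append(out)
    return merged


# Made-up inputs, matched on the attribute pair ("name", "label").
data = [{"name": "a", "x": 1}, {"name": "b", "x": 2}]
extra = [{"label": "b", "y": 20}, {"label": "c", "y": 30}]
pairs = [("name", "label")]

left = merge(data, extra, pairs, how="left")
inner = merge(data, extra, pairs, how="inner")
outer = merge(data, extra, pairs, how="outer")
```

With these inputs, `left` keeps both Data rows (row "a" with a missing `y`), `inner` keeps only row "b", and `outer` additionally appends row "c" from the extra data at the end.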
1e971a6 to
08f87e2
Fixed.
Issue
Documents changes introduced in #3919.
Description of changes
Documents merging by multiple rows. Also extends documentation for merging types.
Includes