The rows are per individual because they carry length information, but the wtcpue column applies to the species as a whole.
This is a problem because the intuitive fix (collapsing the repeated wtcpue values within a species-haul) only works when you assume that the original species names are all correct.
I checked, and this raises very few problems for NEUS if approached simply (i.e., take the mean of the wtcpue within each unique combination of spp-haulid). However, there are a couple of cases in which a species name was corrected. In those cases, 2 taxonomic IDs originally had their own (different) wtcpue values in a given haul, and each of those taxa may have had some individuals lengthed, so the wtcpue value is repeated several times for each taxon. After correcting taxonomy, the 2 taxa are actually the same species. So you can't simply take the average (what you would do if all rows were the same taxon with duplicated wtcpue, as was probably the intended interpretation), nor the sum of wtcpue (what you would do if multiple rows for the same species-haul did not have duplicated wtcpue).
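To make the failure concrete, here is a small language-neutral sketch in plain Python (not trawlData code). The haul, taxon names, and wtcpue values are made up for illustration, and the "intended" total assumes that each original taxon contributes its wtcpue exactly once.

```python
from collections import defaultdict

# Hypothetical rows: (haulid, original taxon ID, corrected spp, wtcpue).
# One row per lengthed individual, so wtcpue repeats within a taxon.
rows = [
    ("h1", "taxonA", "sp1", 2.0),
    ("h1", "taxonA", "sp1", 2.0),
    ("h1", "taxonA", "sp1", 2.0),
    ("h1", "taxonB", "sp1", 5.0),
]

# Group wtcpue by the corrected species name and haul.
by_spp_haul = defaultdict(list)
for haulid, taxon, spp, wtcpue in rows:
    by_spp_haul[(spp, haulid)].append(wtcpue)

values = by_spp_haul[("sp1", "h1")]
naive_mean = sum(values) / len(values)  # 2.75: weighted by how many fish were lengthed
naive_sum = sum(values)                 # 11.0: counts taxonA's wtcpue three times
intended = 2.0 + 5.0                    # 7.0: one wtcpue per original taxon (assumption)
```

Neither the mean nor the sum recovers 7.0, which is why the merged taxa need special handling.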
I hope this issue does not apply to sex too, but it could (i.e., when sex is listed, is the wtcpue sex-specific, or for the whole spp?).
One approach is to first aggregate while including wtcpue as a grouping factor. This can be done with `trawlAgg()`, because at this stage of data processing both `space_lvl` and `time_lvl` are usually `"haulid"`, so one of those (probably time) can be changed to `"wtcpue"`. However, this might become challenging when wtcpue contains NAs; I don't know how the grouping would handle them.
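A rough sketch of that idea in plain Python (not a call to `trawlAgg()`), with made-up rows. Here `None` stands in for NA and is treated as its own group, which is an assumption of this sketch; how `trawlAgg()` actually groups NA values is not verified here.

```python
# Hypothetical rows: (haulid, spp, wtcpue); None marks a missing wtcpue.
rows = [
    ("h1", "sp1", 2.0), ("h1", "sp1", 2.0), ("h1", "sp1", 5.0),
    ("h2", "sp1", None), ("h2", "sp1", None),
]

# Stage 1: treat wtcpue as a grouping key, so repeated values collapse to one group.
groups = {(spp, haulid, wtcpue) for haulid, spp, wtcpue in rows}

# Stage 2: sum the surviving wtcpue values per spp-haul, skipping missing ones.
totals = {}
for spp, haulid, wtcpue in groups:
    key = (spp, haulid)
    totals.setdefault(key, None)  # stays None if every wtcpue was missing
    if wtcpue is not None:
        totals[key] = (totals[key] or 0.0) + wtcpue
```

In this sketch haul `h1` totals 7.0, and haul `h2` keeps a missing total instead of silently becoming 0.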
Another approach could be to make the `bioFun` argument something like `function(x)sumna(una(x))`, where `x` is `"wtcpue"` passed to the `bioCols` argument. This assumes that equal wtcpue values come from duplicated rows that shouldn't be summed together to get the total wtcpue for a species in a haul. That may or may not be true.
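The sum-of-unique-values idea, and the way it can go wrong, can be sketched in plain Python. The `una` and `sumna` functions below are hypothetical stand-ins written for this example; they are assumed to mean "unique, ignoring missing" and "sum, ignoring missing", which may not match the real helpers exactly.

```python
def una(x):
    """Stand-in: distinct non-missing values, in first-seen order."""
    seen = []
    for v in x:
        if v is not None and v not in seen:
            seen.append(v)
    return seen

def sumna(x):
    """Stand-in: sum of the non-missing values."""
    return sum(v for v in x if v is not None)

def bio_fun(x):
    # Python analogue of function(x) sumna(una(x))
    return sumna(una(x))

# Two merged taxa with different wtcpue: duplicates collapse, total is 7.0.
merged_ok = bio_fun([2.0, 2.0, 2.0, 5.0])

# Two merged taxa that happen to share the same wtcpue (3.0 each):
# the values collapse into one, giving 3.0 where 6.0 was the true total.
merged_bad = bio_fun([3.0, 3.0, 3.0])
```

The second case is the risk mentioned above: equal values are not always duplicates.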
Yet another approach could be to aggregate not by `"spp"`, but by the original taxonomic ID column first. In that first aggregation, use `bioFun = meanna`. Then do the subsequent aggregation by `"spp"` with `bioFun = sumna`. This assumes that duplicate rows for a taxon within a haul should not be summed. It also obscures the potentially problematic scenario in which a taxon actually has multiple distinct wtcpue values within a haul. Maybe instead of `meanna`, the first step could use something that lists the unique values and throws an error when there is more than 1.
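A minimal sketch of that two-stage aggregation with a strict uniqueness check, in plain Python with made-up rows. `single_value` is a hypothetical helper name used only for this example, and `None` stands in for NA.

```python
from collections import defaultdict

def single_value(values):
    """Return the one distinct non-missing value, or raise if there are several."""
    distinct = {v for v in values if v is not None}
    if len(distinct) > 1:
        raise ValueError(f"expected one wtcpue, found {sorted(distinct)}")
    return distinct.pop() if distinct else None

# Hypothetical rows: (haulid, original taxon ID, corrected spp, wtcpue).
rows = [
    ("h1", "taxonA", "sp1", 2.0),
    ("h1", "taxonA", "sp1", 2.0),
    ("h1", "taxonB", "sp1", 5.0),
]

# Stage 1: collapse duplicates within each original taxon and haul.
stage1 = defaultdict(list)
for haulid, taxon, spp, wtcpue in rows:
    stage1[(haulid, taxon, spp)].append(wtcpue)

# Stage 2: sum the per-taxon values within each corrected species and haul.
totals = defaultdict(float)
for (haulid, taxon, spp), values in stage1.items():
    w = single_value(values)  # raises if a taxon has conflicting wtcpue values
    if w is not None:
        totals[(spp, haulid)] += w
```

Unlike a mean, the check in stage 1 fails loudly when a taxon carries more than one wtcpue value in a haul, so the problematic scenario is surfaced instead of averaged away.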