R/check_strat.R
+12 -5 Lines changed: 12 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -8,7 +8,7 @@
8
8
#' @param prompt_strat_tol Logical, if in \code{\link{interactive}} mode, prompt user for tolerance? If not, and if \code{append_keep_strat} is TRUE and \code{strat_tol} is left \code{\link{missing}}, then a default will be selected for \code{strat_tol}
9
9
#' @param strat_tol The maximum number of unsampled years that is tolerated for any stratum before all rows corresponding to that stratum have their value in the "keep_strat" column set to FALSE
10
10
#' @param plot Logical, visualize strata over time and the number of strata sampled in all but N years?
11
-
#'
11
+
#'
12
12
#' @details
13
13
#' The aim of the function is to guide the selection of which strata to exclude from analysis because they are not sampled often enough. Having fewer gaps in your data set is better, but sometimes tolerating a tiny amount of missingness can result in huge increases in data; the visualization provided by this function will help gauge that tradeoff.
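The tolerance rule described for `strat_tol` can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `flag_strata`, the column names `stratum` and `year`, and the toy data are all hypothetical.

```r
# Toy data (hypothetical): three strata observed over three survey years
obs <- data.frame(
  stratum = c("A", "A", "A", "B", "B", "C"),
  year    = c(2001, 2002, 2003, 2001, 2003, 2002)
)

# Hypothetical helper: flag rows whose stratum misses more than strat_tol years
flag_strata <- function(x, strat_tol = 0) {
  n_years_total <- length(unique(x$year))
  # number of distinct years in which each stratum was sampled
  n_sampled <- tapply(x$year, x$stratum, function(y) length(unique(y)))
  n_missing <- n_years_total - n_sampled
  # look up each row's stratum and compare its gap count to the tolerance
  x$keep_strat <- as.vector(n_missing[as.character(x$stratum)] <= strat_tol)
  x
}

strict  <- flag_strata(obs, strat_tol = 0)  # only fully sampled strata kept
lenient <- flag_strata(obs, strat_tol = 1)  # one unsampled year tolerated
```

Raising `strat_tol` from 0 to 1 in this toy case retains stratum B as well as A, which is the kind of tradeoff the function's plot is meant to expose.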
R/clean.format.R
+5 Lines changed: 5 additions & 0 deletions
@@ -6,10 +6,15 @@
6
6
#'
7
7
#' @details
8
8
#' It is this function that makes specific corrections for data entry errors. For example, in one region a tow duration of 3 should have been 30. In another region some of the \code{effort} values were entered as \code{0} or \code{NA}, but should have had a particular value.
9
+
#'
9
10
#' This function also ensures that longitude and latitude are in the same format among regions.
11
+
#'
10
12
#' Other data entry errors or necessary corrections are implemented here, too.
13
+
#'
11
14
#' Dates are not thoroughly formatted here, except in some cases where getting a value such as \code{year} requires parsing values out of other columns. POSIX class dates are not created.
R/clean.names.R
+2 Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@
7
7
#' @details
8
8
#' Regions tend to have very different column names for what are essentially the same measurements, descriptors, etc. This function tries to give everything a standardized name when it's appropriate.
R/clean.trimCol.R
+2 Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,8 @@
22
22
#' Names passed to \code{c.drop} take precedence over names passed to \code{cols} or \code{c.add}; e.g., if the same name is passed to both \code{c.drop} and \code{c.add}, it will not be included in the final data.table. The choice is somewhat arbitrary, although giving preference to dropping names is consistent with the intended use of the function.
23
23
#'
24
24
#' Finally, duplicate columns will not be returned if a name is supplied to both \code{cols} and to \code{c.add}.
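The precedence rules described for `cols`, `c.add`, and `c.drop` can be sketched as a small name-resolution step. The helper name `resolve_cols` and the column names are hypothetical; the sketch only illustrates the stated rules and is not taken from the package.

```r
# Hypothetical helper: decide which column names survive trimming
resolve_cols <- function(cols, c.add = NULL, c.drop = NULL) {
  wanted <- unique(c(cols, c.add))  # a name given to both cols and c.add appears once
  setdiff(wanted, c.drop)           # c.drop takes precedence over cols and c.add
}

final_cols <- resolve_cols(
  cols   = c("year", "stratum", "wtcpue"),
  c.add  = c("stratum", "depth"),
  c.drop = "depth"
)
```

Here `stratum` is requested twice but returned once, and `depth` is excluded because it was also passed to `c.drop`.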
R/clean.trimRow.R
+2 Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@
7
7
#' @details
8
8
#' Recommended rows to drop according to Malin's original scripts and what's in the OceanAdapt repo. Rows are not actually dropped; rather, a column called \code{keep.row} is added to the data.table; when \code{keep.row} is \code{FALSE}, it is recommended that the row be dropped.
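The flag-instead-of-drop pattern described above might look like the following in base R. The toy data and the criterion used to set `keep.row` are hypothetical, chosen only to illustrate the pattern; the package's actual criteria are not shown in this diff.

```r
# Toy data (hypothetical): four hauls, two with unusable effort values
tows <- data.frame(haulid = 1:4, effort = c(1, NA, 0.5, 0))

# Flag rows rather than deleting them (hypothetical rule: effort must be positive)
tows$keep.row <- !is.na(tows$effort) & tows$effort > 0

# The user decides later whether to act on the recommendation
kept <- tows[tows$keep.row, ]
```

Because the rows remain in `tows`, the recommendation can be inspected or overridden before any data are discarded.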
R/formatStrat.R
+2 -120 Lines changed: 2 additions & 120 deletions
@@ -38,6 +38,8 @@ ll2km <- function(x,y){
38
38
#' @details
39
39
#' If \code{frac} is 1, then round to the nearest whole number. If \code{frac} is 0.5, then snap everything to the nearest half a degree grid. If 10, then snap to the nearest multiple of 10, plus 5 (6 goes to 5, 8 goes to 5, 10 goes to 15, 21 goes to 25, etc). Handy if you have lat-lon data that you want to redefine as being on a grid.
40
40
#'
41
+
#' @seealso \code{\link{ll2strat}}
42
+
#'
41
43
#' @export
42
44
roundGrid<-function(x, frac=1){
43
45
# if frac is 1, then place in a 1º grid
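The `frac = 10` examples in the details above (6 goes to 5, 10 goes to 15, 21 goes to 25) are consistent with flooring to the lower cell edge and then adding half a cell. The sketch below reproduces those examples; the name `snap_to_cell_center` is hypothetical, the body of `roundGrid` is not visible in this diff, and the package may behave differently for other values of `frac`.

```r
# Hypothetical helper: snap values to the center of a grid cell of width frac
snap_to_cell_center <- function(x, frac = 1) {
  floor(x / frac) * frac + frac / 2  # lower cell edge plus half the cell width
}

snapped <- snap_to_cell_center(c(6, 8, 10, 21), frac = 10)
```

Under this assumption every value in the interval from 0 up to (but not including) 10 maps to 5, every value from 10 up to 20 maps to 15, and so on.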
@@ -65,126 +67,6 @@ ll2strat <- function(lon, lat, gridSize=1){
65
67
}
66
68
67
69
68
-
69
-
70
-
71
-
72
-
# save tolerance: "/Users/Battrd/Documents/School&Work/pinskyPost/trawl/Data/stratTol/"
73
-
# save tolerance figures: "/Users/Battrd/Documents/School&Work/pinskyPost/trawl/Figures/stratTolFigs"
74
-
75
-
# Function can operate in 1 of 2 ways
76
-
# 1) don't save .txt or figures, don't display figures, don't ask for the tolerance (just read in from .txt file), but change stratum in data.table
77
-
# 2) Figures of tolerance are saved, figures are displayed, .txt of tolerance is saved, and stratum is changed in data.table
78
-
#' Make Strata
79
-
#'
80
-
#' Function to make strata for a region, examining missingness
81
-
#'
82
-
#' @param x a data.table of trawl data
83
-
#' @param regName the name of the region
84
-
#' @param doLots option to specify tolerance for missingness; otherwise reads in file for it
85
-
#'
86
-
#' @section Warning:
87
-
#' This function is not ready to be used. Saves figures, has hard-coded paths, looks for reference files outside of package, etc.
#' Dual functionality: turn factors into characters, and ensure those characters are encoded as ASCII. Converting to ASCII relies on the \code{stringi} package, particularly \code{stringi::stri_enc_mark} (for detection of non-ASCII) and \code{stringi::stri_enc_toascii} (for conversion to ASCII).
117
124
#'
125
+
#' This function is used when resaving data sets when building the package to ensure that it is portable.
126
+
#'
118
127
#' @return NULL (invisibly), but affects the contents of the data.table whose name was passed to this function
#' See \code{\link[lubridate]{parse_date_time}} for a summary of how to specify \code{orders}. Examples show a conversion of variable formats. The only reason this function exists is that \code{parse_date_time} did not handle the century very well on some test data.
163
172
#'
164
-
#' The default \code{orders} is \code{paste0(rep(c("ymd", "mdy", "Ymd", "mdY"),each=5), c(" HMS"," HM", " H", "M", ""))}
173
+
#' The default \code{orders} is
174
+
#' \code{paste0(
175
+
#' rep(c("ymd", "mdy", "Ymd", "mdY"),each=5),
176
+
#' c(" HMS"," HM", " H", "M", "")
177
+
#' )}
165
178
#'
166
179
#' @section Note:
167
180
#' In 2056 I will turn 70. At that point, I'll still be able to assume that a date of '57 associated with an ecological field observation was probably made in 1957. If I see '56, I'll round it up to 2056. I'll probably retire by the time I'm 70, or hopefully someone else will have cleaned up the date formats in all ecological data sets by that time. Either way, it is in my own self interest to set the default as `year=1957`; I do not currently use very many data sets that begin before 1957 (and none of such vast size that I need computer code to automate the corrections), and as a result, the default 1957 will continue to work for me until I retire. After that, a date of '57 that was actually taken in 2057 will have its date reverted to 1957. Shame on them.
168
-
#'
181
+
#'
169
182
#' Oh, and the oldest observation in this package is 1958, I believe (the soda bottom temperatures). As for trawl data, NEUS goes back to 1963. So 1957 is a date choice that will work for all dates currently in this package, and given a 1 year buffer, maximizes the duration of the appropriateness of this default for these data sets into the future.
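The century rule described in the Note can be sketched as follows. The function name `expand_two_digit_year` is hypothetical and this is not the package's implementation; only the default `year = 1957` and the '57 and '56 outcomes come from the text above.

```r
# Hypothetical helper: expand two-digit years using a pivot year
expand_two_digit_year <- function(yy, year = 1957) {
  century <- (year %/% 100) * 100  # 1900 for the default pivot
  full <- century + yy             # first guess: same century as the pivot
  # anything that would land before the pivot is pushed to the next century
  ifelse(full < year, full + 100, full)
}

expanded <- expand_two_digit_year(c(57, 56, 63, 0, 99))
```

With the default pivot, '57 is read as 1957 and '56 as 2056, so every two-digit year maps into the 100-year window from 1957 through 2056.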