extracting tables with varying grouping marks (locale issue) #167

Open
AndySibov opened this issue Sep 11, 2024 · 3 comments

@AndySibov
AndySibov commented Sep 11, 2024

I didn't think there would be a package out there for this, thanks!

I was importing a table where the grouping mark is a dot, with values around 10.000.
Because the dot is read as a decimal mark, the extract_table function returns a double, so a value such as 1.950 comes back as 1.95.

It would be best to be able to set an import option, such as locale(), for grouping marks and the like.

Below is a function to recover these imported doubles, but it doesn't work for doubles whose decimals are all zeros (e.g. the input 100.00, from the original value 100.000, will result in 100).

library(stringr)  # provides str_count()

recover_double_grouping_mark <- function(value, grouping_mark = '.', interval = 1000) {

  dbl_as_char <- as.character(value)

  # Determine the interval: number of digits per group (3 for an interval of 1000)
  interval <- log10(interval)

  # Vectorized counting of grouping marks for each element in the vector
  # (the mark is escaped so that '.' is matched literally)
  dot_count <- str_count(dbl_as_char, pattern = paste0('\\', grouping_mark))

  # Vectorized finding of the position of the first grouping mark and counting digits before it
  # (note: grouping_mark is passed as an unescaped regex here, so the default '.'
  # matches the first character and int_count is 0)
  int_count <- sapply(gregexpr(grouping_mark, dbl_as_char), function(x) min(x) - 1)

  # Calculate the difference between expected and actual number of digits for each element
  dif_expected_nchar <- ifelse(dot_count > 0,
                               abs(int_count - (dot_count * interval)),
                               0)

  # Vectorized adjustment of values where there's a mismatch in character length
  adjusted_values <- ifelse(dif_expected_nchar > 0,
                            value * 10^dif_expected_nchar,
                            value)

  return(adjusted_values)
}
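
The all-zeros case cannot be recovered from the double itself: once 100.000 has been read as a number, as.character() gives "100" and the grouping information is gone. A minimal base R sketch of the locale-style alternative is shown here, assuming the raw cell text is available as character strings before any numeric conversion (this thread does not confirm that the package exposes it that way). The helper name parse_grouped_number and its arguments are hypothetical.

```r
# Hypothetical helper: parse numbers written with locale-specific marks.
# Assumes `x` is a character vector holding the raw cell text.
parse_grouped_number <- function(x, grouping_mark = ".", decimal_mark = ",") {
  # Drop every grouping mark; fixed = TRUE so "." is not treated as a regex
  no_groups <- gsub(grouping_mark, "", x, fixed = TRUE)
  # Convert the decimal mark to the "." that as.numeric() expects
  normalised <- sub(decimal_mark, ".", no_groups, fixed = TRUE)
  as.numeric(normalised)
}

raw_cells <- c("1.950", "100.000", "10.000,5")
parsed <- parse_grouped_number(raw_cells)
print(parsed)
```

With these inputs the three values parse as 1950, 100000 and 10000.5, including the all-zeros case that the recovery function above cannot handle. If readr is available, parse_number() with locale(grouping_mark = ".", decimal_mark = ",") should do the same job.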

@pachadotdev
Contributor

pachadotdev commented Oct 24, 2024

to fix your trouble check this solution click maybe this will solve your problem.

LOL no

I opened this in a container and it shows this

reported and blocked

@pachadotdev
Contributor

Hi @AndySibov

Sorry for the late reply. Do you have a real link to the PDF?

If there are no links, my email is in my description.

Sorry about the idiot who posted a phishing link as an answer.
