type validation #83

Open
p-schaefer opened this issue Dec 16, 2024 · 1 comment
@p-schaefer
I'm having trouble validating data against the DataStream schema found here. I'm not sure whether the problem is in the schema or in the package, but validation fails. (I'm also struggling a little with the outputs from the functions, so forgive me if there is a better way of compiling them.)

Data setup

dt <- tibble::tribble(
  ~DatasetName, ~MonitoringLocationID, ~MonitoringLocationName, ~MonitoringLocationLatitude, ~MonitoringLocationLongitude, ~MonitoringLocationHorizontalCoordinateReferenceSystem, ~MonitoringLocationHorizontalAccuracyMeasure, ~MonitoringLocationHorizontalAccuracyUnit, ~MonitoringLocationVerticalMeasure, ~MonitoringLocationVerticalUnit, ~MonitoringLocationType,   ~ActivityType, ~ActivityMediaName, ~ActivityStartDate, ~ActivityStartTime, ~ActivityEndDate, ~ActivityEndTime, ~ActivityDepthHeightMeasure, ~ActivityDepthHeightUnit, ~SampleCollectionEquipmentName,  ~CharacteristicName, ~MethodSpeciation, ~ResultSampleFraction, ~ResultValue, ~ResultUnit, ~ResultValueType, ~ResultDetectionCondition, ~ResultDetectionQuantitationLimitMeasure, ~ResultDetectionQuantitationLimitUnit, ~ResultDetectionQuantitationLimitType, ~ResultStatusID, ~ResultComment, ~ResultAnalyticalMethodID, ~ResultAnalyticalMethodContext, ~ResultAnalyticalMethodName, ~AnalysisStartDate, ~AnalysisStartTime, ~AnalysisStartTimeZone, ~LaboratoryName, ~LaboratorySampleID,
  "test",             "test-314",               "test-314",                  43.5895361,                  -79.9411775,                                                "NAD83",                                           NA,                                        NA,                                 NA,                              NA,          "River/Stream", "Field Msr/Obs",    "Surface Water",       "2005-07-12",                 NA,               NA,               NA,                          NA,                       NA,                 "Probe/Sensor", "Temperature, water",                NA,                    NA,           18,     "deg C",         "Actual",                        NA,                                       NA,                                    NA,                                    NA,              NA,             NA,                        NA,                             NA,                          NA,                 NA,                 NA,                     NA,              NA,                  NA
)

Validating with input serialised by jsonvalidate::json_schema$new()$serialise():

# Convert the one-row tibble to JSON (each column becomes an array)
dt_json <- jsonlite::toJSON(as.list(dt), digits = 999)

# Download the DataStream schema to a temporary file
path <- tempfile()
dl <- download.file("https://datastream.org/schema", path, method = "curl", quiet = TRUE)

sc <- suppressWarnings(jsonvalidate::json_schema$new(path, strict = TRUE))

json <- sc$serialise(dt_json)
out <- sc$validate(json, verbose = TRUE)
ab <- attr(out, "error")

cbind(ab$schemaPath, ab$keyword, ab$params, ab$message)

image

When I validate the serialised input with jsonvalidate::json_validate() instead:

jsonvalidate::json_validate(json, path, verbose = TRUE, strict = TRUE)

image

And when I validate without serialising:

out <- sc$validate(dt_json, verbose = TRUE)
ab <- attr(out, "error")

cbind(ab$schemaPath, ab$keyword, ab$message)

image

jsonvalidate::json_validate(dt_json, path, verbose = TRUE, strict = TRUE)

image

I appreciate any assistance you can provide.

@p-schaefer

I've made some progress. I think part of the issue might be the use of allOf in the schema. Isolating one of the elements of the allOf gets me further:

dt <- tibble::tribble(
  ~DatasetName, ~MonitoringLocationID, ~MonitoringLocationName, ~MonitoringLocationLatitude, ~MonitoringLocationLongitude, ~MonitoringLocationHorizontalCoordinateReferenceSystem, ~MonitoringLocationHorizontalAccuracyMeasure, ~MonitoringLocationHorizontalAccuracyUnit, ~MonitoringLocationVerticalMeasure, ~MonitoringLocationVerticalUnit, ~MonitoringLocationType,   ~ActivityType, ~ActivityMediaName, ~ActivityStartDate, ~ActivityStartTime, ~ActivityEndDate, ~ActivityEndTime, ~ActivityDepthHeightMeasure, ~ActivityDepthHeightUnit, ~SampleCollectionEquipmentName,  ~CharacteristicName, ~MethodSpeciation, ~ResultSampleFraction, ~ResultValue, ~ResultUnit, ~ResultValueType, ~ResultDetectionCondition, ~ResultDetectionQuantitationLimitMeasure, ~ResultDetectionQuantitationLimitUnit, ~ResultDetectionQuantitationLimitType, ~ResultStatusID, ~ResultComment, ~ResultAnalyticalMethodID, ~ResultAnalyticalMethodContext, ~ResultAnalyticalMethodName, ~AnalysisStartDate, ~AnalysisStartTime, ~AnalysisStartTimeZone, ~LaboratoryName, ~LaboratorySampleID,
  "test",             "test-314",               "test-314",                  43.5895361,                  -79.9411775,                                                "NAD83",                                           NA,                                        NA,                                 NA,                              NA,          "River/Stream", "Field Msr/Obs",    "Surface Water",       "2005-07-12",                 NA,               NA,               NA,                          NA,                       NA,                 "Probe/Sensor", "Temperature, water",                NA,                    NA,           18,     "deg C",         "Actual",                        NA,                                       NA,                                    NA,                                    NA,              NA,             NA,                        NA,                             NA,                          NA,                 NA,                 NA,                     NA,              NA,                  NA
)

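# Download the schema and parse it so a single allOf element can be extracted
path <- tempfile()
dl <- download.file("https://datastream.org/schema", path, method = "auto", quiet = TRUE)
scm <- jsonlite::read_json(path)

# Build a validator from the first allOf element only
sc <- jsonvalidate::json_schema$new(
  jsonlite::toJSON(scm$allOf[[1]], auto_unbox = TRUE, digits = 999),
  strict = FALSE
)

# Validate only the first five columns of the first row
json <- sc$serialise(as.list(dt[1, 1:5]))
out <- sc$validate(json, verbose = TRUE, greedy = TRUE)
ab <- attr(out, "error")

cbind(ab$schemaPath, ab$keyword, ab$message)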

image

This now gives me a pattern error, but I don't understand why it's failing, since the following returns TRUE:

stringr::str_detect("test","^[\\p{L}\\p{N}\\p{P}\\p{S} ]+$")
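A TRUE from an R regex engine does not necessarily carry over to the JavaScript engine that ajv runs in. As far as I know, JavaScript only treats `\p{...}` as a Unicode property class when the regex is compiled with the `u` flag. The sketch below is one way to check this. It assumes the V8 package is installed, and it does not show what jsonvalidate or ajv do internally.

```r
# The pattern as R sees it after parsing the string literal: single backslashes.
pattern <- "^[\\p{L}\\p{N}\\p{P}\\p{S} ]+$"

# PCRE (base R, perl = TRUE) treats \p{L} etc. as Unicode property classes.
r_match <- grepl(pattern, "test", perl = TRUE)
print(r_match)

# Optional: run the same pattern in a JavaScript engine,
# once without and once with the "u" (unicode) flag.
if (requireNamespace("V8", quietly = TRUE)) {
  ct <- V8::v8()
  ct$assign("pat", pattern)  # passed to JavaScript as a string value
  print(ct$eval('new RegExp(pat).test("test")'))
  print(ct$eval('new RegExp(pat, "u").test("test")'))
}
```

If the two JavaScript results differ, the pattern failure may come from how the bundled ajv compiles patterns rather than from the data.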

Testing with a simpler schema also doesn't detect the string correctly:

jsonvalidate::json_validate(
    '{"test": "string"}',
    '{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "Test",
    "type": "object",
    "properties": {
      "test": {
        "type": "string",
        "pattern": "^[\\p{L}\\p{N}\\p{P}\\p{S} ]+$"
      }
    }
  }',
    engine = "ajv"
)

#> [1] FALSE
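One thing worth ruling out in the inline test above is string escaping. In R source `\\p` is a single backslash followed by `p`, so the schema text handed to the validator contains `\p`, which is not a valid JSON escape. The sketch below compares the two spellings. The names `schema_one`, `schema_two` and `count_backslashes` are illustrative only, and whether this explains the FALSE result is an assumption, not something I have confirmed.

```r
# In R source, "\\" is one backslash, so this schema text contains \p.
schema_one <- '{"type": "string", "pattern": "^[\\p{L}]+$"}'
# Four backslashes in R source leave \\p in the text, which JSON decodes to \p.
schema_two <- '{"type": "string", "pattern": "^[\\\\p{L}]+$"}'

# cat() shows the text exactly as a JSON parser would receive it.
cat(schema_one, "\n")
cat(schema_two, "\n")

# Count the backslashes that actually reach the parser.
count_backslashes <- function(x) {
  lengths(regmatches(x, gregexpr("\\", x, fixed = TRUE)))
}
n_one <- count_backslashes(schema_one)
n_two <- count_backslashes(schema_two)
print(c(n_one, n_two))

# Optional: ask a strict JSON parser whether each text is valid JSON.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  print(jsonlite::validate(schema_one))
  print(jsonlite::validate(schema_two))
}
```

Reading the schema from a file, as in the earlier snippets, sidesteps R's string escaping, so this would only affect the inline test.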
