Precomputed Forward Observations (FOs)

hkershaw-brown edited this page Mar 5, 2021 · 2 revisions

Externally Computed Forward Operator Values

DART supports ingesting observation Forward Operator (FO) values from an external source and storing those values in a standard observation sequence file.
This capability allows DART to assimilate observations from other DA systems.

These are also referred to as "precomputed forward operator" values, since they are not computed during the assimilation but in a separate processing step beforehand. Important: observations resulting from perfect_model_obs are not precomputed forward observation values!

DART subroutines are provided to store an ensemble of precomputed FO values, to retrieve an ensemble of precomputed FO values, and to mark an observation type as using these precomputed FO values instead of generating them at runtime.

An observation converter, which takes observation information from another format and outputs the data in DART observation sequence format ("obs_seq" format), can provide an ensemble of precomputed FO values. These values are then stored as metadata along with the standard observation information.

At filter runtime, a namelist option (use_precomputed_FOs_these_obs_types in obs_kind_nml) selects which observation types use the precomputed FO values instead of the values calculated by the DART forward operators. Where the code would normally call the DART forward operator routine to compute the prior expected observation value for each ensemble member, it substitutes the precomputed values instead.

If posterior FO values are requested, missing values are returned to prevent the comparison of 'apples' and 'oranges'. The forward operator from the external system is different from the one DART would use to compute the expected posterior value, so a direct comparison of the (external) priors and (internal) posteriors is problematic.

NOTE: if the observation is successfully assimilated, the DART QC is set to '2', indicating that the observation was assimilated and the posterior forward operator "failed". All DART quality control values are still accurate; for example, if the observation fails the outlier_threshold test, the DART QC is '7'.

If needed, the posterior FO values can be computed in a post-processing step after filter exits.
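
The substitution described above can be pictured with a short Python sketch. This is not DART code: the function name, its arguments, and the `forward_operator` callable are all hypothetical, and the missing-value marker is simply the number that appears in the example output on this page.

```python
# Missing-value marker as it appears in the example output on this page.
MISSING_R8 = -888888.0


def expected_obs_values(obs_type, precomputed, use_precomputed_types, forward_operator):
    """Return (prior, posterior) expected values for one observation.

    obs_type              -- observation type name, e.g. 'RADIOSONDE_TEMPERATURE'
    precomputed           -- precomputed FO values stored with the observation
    use_precomputed_types -- types listed in use_precomputed_FOs_these_obs_types
    forward_operator      -- hypothetical stand-in for the internal forward
                             operator; called with 'prior' or 'posterior'
    """
    if obs_type in use_precomputed_types:
        # Priors come from the external system; posteriors are reported as
        # missing so external priors are never compared to internal posteriors.
        prior = list(precomputed)
        posterior = [MISSING_R8] * len(prior)
    else:
        prior = forward_operator("prior")
        posterior = forward_operator("posterior")
    return prior, posterior


def fake_forward_operator(stage):
    # Made-up numbers, only here to exercise the non-precomputed branch.
    return [1.0, 2.0, 3.0] if stage == "prior" else [1.5, 2.5, 3.5]


use_types = {"RADIOSONDE_TEMPERATURE"}
external = [211.38349914550781, 210.05355834960938, 210.05355834960938]

temp_prior, temp_post = expected_obs_values(
    "RADIOSONDE_TEMPERATURE", external, use_types, fake_forward_operator)
wind_prior, wind_post = expected_obs_values(
    "RADIOSONDE_U_WIND_COMPONENT", external, use_types, fake_forward_operator)
```

In this sketch only the type named in `use_types` receives the external values and missing posteriors; every other type goes through the stand-in forward operator for both.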

The precomputed FO metadata specifies the number of precomputed values available (the external ensemble size). The DART ensemble size can be smaller than the number of precomputed values. It is an error to run with an ensemble size larger than the number of precomputed values available. The precomputed FO values are associated with ensemble members in the order they occur in the metadata, so the order of the precomputed values must match the order of the ensemble members as input to filter.
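
A minimal Python sketch of the ensemble-size rule follows. The function name and the error message are hypothetical illustrations, not DART routines; only the rule itself (in-order association, an error when the ensemble is larger than the number of stored values) comes from the text above.

```python
def assign_precomputed(precomputed, ens_size):
    """Map ensemble member numbers (1-based) to precomputed FO values, in order."""
    if ens_size > len(precomputed):
        raise ValueError(
            f"ens_size {ens_size} exceeds the {len(precomputed)} "
            "precomputed FO values available")
    # Member k gets the k-th stored value; surplus stored values are unused.
    return {member: precomputed[member - 1] for member in range(1, ens_size + 1)}


stored = [211.38349914550781, 210.05355834960938, 210.05355834960938]
two_members = assign_precomputed(stored, 2)   # allowed: smaller ensemble

try:
    assign_precomputed(stored, 4)             # not allowed: larger ensemble
    too_large_rejected = False
except ValueError:
    too_large_rejected = True
```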

NOTE: The output obs_seq file from filter removes the precomputed values from the metadata, since they are now part of the standard observation data (the 'copies' array, where all observations store their prior and posterior forward operator results).

All other observation tools preserve the precomputed FO values unless an option is selected to remove them.

Example:

An example precomputed forward observation (before assimilation):

  ...
  num_copies:            1  num_qc:            1
  num_obs:         2113  max_num_obs:         2113
observation
GSI Quality Control
  first:            1  last:         2113
 OBS            1
   209.64999389648438
   1.0000000000000000
          -1           2          -1
obdef
loc3d
     4.418562683632357        0.6736970646382003         19680.00000000000      2
kind
          18
external_FO       3       1
   211.38349914550781        210.05355834960938        210.05355834960938
     0     152053
  0.69421207904815674
  ...
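
To inspect the precomputed values in a file like the one above, a small Python sketch can pull out the external_FO record. It assumes, based only on this listing, that the first integer after external_FO is the number of values that follow; the meaning of the second integer is not described on this page, so it is returned unchanged. The function name is hypothetical.

```python
def parse_external_fo(lines):
    """Return (count, second_int, values) for the first external_FO record,
    or None if the lines contain no such record."""
    for i, line in enumerate(lines):
        fields = line.split()
        if fields and fields[0] == "external_FO":
            count = int(fields[1])       # assumed: number of values that follow
            second_int = int(fields[2])  # meaning not stated here; kept as-is
            values = []
            j = i + 1
            # The values may wrap over several lines; read until we have them all.
            while len(values) < count and j < len(lines):
                values.extend(float(tok) for tok in lines[j].split())
                j += 1
            return count, second_int, values[:count]
    return None


excerpt = """kind
          18
external_FO       3       1
   211.38349914550781        210.05355834960938        210.05355834960938
     0     152053
""".splitlines()

count, second_int, values = parse_external_fo(excerpt)
```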

And here is what it looks like AFTER assimilation:

  ...
  num_copies:           11  num_qc:            2
  num_obs:         2113  max_num_obs:         2113
observation
prior ensemble mean
posterior ensemble mean
prior ensemble spread
posterior ensemble spread
prior ensemble member      1
posterior ensemble member      1
prior ensemble member      2
posterior ensemble member      2
prior ensemble member      3
posterior ensemble member      3
GSI Quality Control
DART quality control
  first:            1  last:         2113
 OBS            1
   209.64999389648438
   210.49687194824219
  -888888.00000000000
  0.76784167651822799
  -888888.00000000000
   211.38349914550781
  -888888.00000000000
   210.05355834960938
  -888888.00000000000
   210.05355834960938
  -888888.00000000000
   1.0000000000000000
   2.0000000000000000
          -1           2          -1
obdef
loc3d
     4.418562683632357        0.6736970646382003         19680.00000000000      2
kind
          18
     0     152053
  0.69421207904815674
  ...
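
As a quick consistency check, the prior ensemble mean and spread in this listing can be reproduced from the three precomputed values in the pre-assimilation file. The Python sketch below assumes the spread is the sample standard deviation (dividing by N-1), which is what matches the numbers shown here.

```python
import math

# The three precomputed FO values from the external_FO record.
precomputed = [211.38349914550781, 210.05355834960938, 210.05355834960938]

n = len(precomputed)
prior_mean = sum(precomputed) / n

# Sample standard deviation (N-1 in the denominator).
prior_spread = math.sqrt(
    sum((value - prior_mean) ** 2 for value in precomputed) / (n - 1))

# Values reported in the post-assimilation listing.
reported_mean = 210.49687194824219
reported_spread = 0.76784167651822799

mean_matches = abs(prior_mean - reported_mean) < 1e-9
spread_matches = abs(prior_spread - reported_spread) < 1e-6
```

Both checks agree with the listing, which confirms that the prior copies were filled from the precomputed values rather than from a DART forward operator.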

For more information:

See the GSI2DART observation converter for an example of how to create observation sequence files with precomputed FO values.

See the obs_kind namelist documentation for how to control their use at runtime.

See the obs_sequence_tool documentation for how to remove precomputed FO values if desired.

Testing

In short, I ran a small observation sequence file through the bgrid_solo filter.

The GSI2DART observation converter has an input dataset in the test_build_datasets_preprocess.tar file that generates an observation sequence file with only 2 ensemble members. This is not enough to run an assimilation, but it does test the converter. In the GSI2DART/data directory, I simply made a duplicate set of inputs for a third ensemble member, changed the input.nml ensemble size to '3' and ran the converter. I moved the output observation sequence file with conventional observations ('obs_seq.out.conv') to the bgrid_solo/work directory.

The GSI2DART observation file has observation types that require the following preprocess support in bgrid_solo/work/input.nml:

&preprocess_nml
   overwrite_output        = .true.
   input_obs_qty_mod_file  = '../../../assimilation_code/modules/observations/DEFAULT_obs_kind_mod.F90'
   output_obs_qty_mod_file = '../../../assimilation_code/modules/observations/obs_kind_mod.f90'
   input_obs_def_mod_file  = '../../../observations/forward_operators/DEFAULT_obs_def_mod.F90'
   output_obs_def_mod_file = '../../../observations/forward_operators/obs_def_mod.f90'
   obs_type_files          = '../../../observations/forward_operators/obs_def_reanalysis_bufr_mod.f90',
                             '../../../observations/forward_operators/obs_def_altimeter_mod.f90',
                             '../../../observations/forward_operators/obs_def_dew_point_mod.f90',
                             '../../../observations/forward_operators/obs_def_gps_mod.f90',
                             '../../../observations/forward_operators/obs_def_gts_mod.f90',
                             '../../../observations/forward_operators/obs_def_mesonet_mod.f90',
                             '../../../observations/forward_operators/obs_def_metar_mod.f90',
                             '../../../observations/forward_operators/obs_def_radar_mod.f90',
                             '../../../observations/forward_operators/obs_def_radiance_mod.f90',
                             '../../../observations/forward_operators/obs_def_rel_humidity_mod.f90',
                             '../../../observations/forward_operators/obs_def_QuikSCAT_mod.f90'
   quantity_files          = '../../../assimilation_code/modules/observations/atmosphere_quantities_mod.f90'
   /

For testing purposes, I used the obs_sequence_tool to cut down the number of observations, leaving only RADIOSONDE_U_WIND_COMPONENT and RADIOSONDE_TEMPERATURE:

&obs_sequence_tool_nml
   filename_seq       = 'obs_seq.out.conv'
   filename_out       = 'obs_seq.smaller',
   print_only         =  .false.,
   first_obs_days     = -1,
   first_obs_seconds  = -1,
   last_obs_days      = -1,
   last_obs_seconds   = -1,
   min_lat            =  39.0,
   max_lat            =  40.0,
   min_lon            =  255.0,
   max_lon            =  256.0,
   gregorian_cal      = .true.
   remove_precomputed_FO_values = ''
   obs_types          = 'MARINE_SFC_PRESSURE',
                        'LAND_SFC_PRESSURE',
                        'GPSRO_REFRACTIVITY',
                        'RADIOSONDE_V_WIND_COMPONENT',
                        'RADIOSONDE_SURFACE_PRESSURE',
                        'ACARS_U_WIND_COMPONENT',
                        'ACARS_V_WIND_COMPONENT',
                        'ACARS_TEMPERATURE',
                        'LAND_SFC_U_WIND_COMPONENT',
                        'LAND_SFC_V_WIND_COMPONENT',
                        'LAND_SFC_TEMPERATURE',
                        'RADIOSONDE_RELATIVE_HUMIDITY',
                        'ACARS_RELATIVE_HUMIDITY',
                        'LAND_SFC_RELATIVE_HUMIDITY'
   keep_types         = .false.,
   /

I took the 'obs_seq.smaller' observations and modified the input.nml to simply assimilate these observations without advancing the model state:

   ens_size                     = 3,
   obs_sequence_in_name         = "obs_seq.smaller",
   obs_sequence_out_name        = "obs_seq.final",
   num_output_obs_members       = 3,
   init_time_days               = 152053,
   init_time_seconds            = 0,

   outlier_threshold     =  3.0,

   use_precomputed_FOs_these_obs_types = 'RADIOSONDE_TEMPERATURE'

From there, you can confirm that the RADIOSONDE_U_WIND_COMPONENT observations are treated 'normally' and that the RADIOSONDE_TEMPERATURE observations are using their precomputed values. 'obs_seq.smaller' is small enough that you can manually edit one of the observation values to ensure that the outlier_threshold value is exceeded.
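
To estimate how far an observation value must be edited before it is rejected, the outlier test can be sketched in Python. This assumes the usual DART criterion (reject when the distance between the observation and the prior mean exceeds outlier_threshold times the square root of the prior variance plus the observation error variance). It also treats the last number of the example observation record earlier on this page as the observation error variance. The function name is hypothetical.

```python
import math


def fails_outlier_threshold(obs_value, prior_mean, prior_spread,
                            obs_error_variance, outlier_threshold):
    """True if the observation would be rejected by the outlier test."""
    distance = abs(obs_value - prior_mean)
    expected_sd = math.sqrt(prior_spread ** 2 + obs_error_variance)
    return distance > outlier_threshold * expected_sd


# Numbers taken from the example listings earlier on this page.
prior_mean = 210.49687194824219
prior_spread = 0.76784167651822799
obs_error_variance = 0.69421207904815674   # assumed to be the error variance
outlier_threshold = 3.0

original_rejected = fails_outlier_threshold(
    209.64999389648438, prior_mean, prior_spread,
    obs_error_variance, outlier_threshold)

# A manually edited value far from the prior mean.
edited_rejected = fails_outlier_threshold(
    220.0, prior_mean, prior_spread, obs_error_variance, outlier_threshold)
```

Under these assumptions the original value passes the test, and an edited value more than about 3.4 away from the prior mean should be rejected.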