Synthetic data to approximate real datasets #5

finnhacks42 · 2021-12-01T05:33:44Z

Since we can't evaluate the performance of observational causal inference directly via cross-validation, one way to decide which algorithm and approach to use is to test how the options perform on synthetic data that has similar properties to the real data. This means replicating:

correlational relationships between confounders
marginal distributions over all variables
strength of relationships between confounders and treatment (e.g. propensity score AUROC)
generating a known response surface y = f(X, T) + epsilon with a specified maximal achievable R^2.
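
A rough sketch of how the four properties above could be wired together, using only NumPy. None of this is taken from the repository: the function names, the Gaussian-copula approach for the confounders, the logistic propensity model, the sin-based response surface, and the placeholder "real" data are all assumptions made for illustration. The noise variance is set from the identity R^2 = Var(f) / (Var(f) + sigma^2), so the oracle model f attains roughly the requested R^2.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # elementwise erf, to avoid a SciPy dependency

def copula_corr(X):
    """Gaussian-copula correlation estimated from Spearman rank correlation."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)  # ties broken arbitrarily
    return 2.0 * np.sin(np.pi * np.corrcoef(ranks, rowvar=False) / 6.0)

def sample_confounders(X_real, n, rng):
    """Draw confounders with roughly the rank correlations and marginals of X_real."""
    d = X_real.shape[1]
    z = rng.multivariate_normal(np.zeros(d), copula_corr(X_real), size=n)
    u = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))  # standard normal CDF
    # Map each uniform column through the empirical quantiles of the real column.
    return np.column_stack([np.quantile(X_real[:, j], u[:, j]) for j in range(d)])

def auroc(score, t):
    """AUROC from the rank-sum (Mann-Whitney) statistic."""
    ranks = np.argsort(np.argsort(score)) + 1.0
    n1 = t.sum()
    n0 = len(t) - n1
    return (ranks[t == 1].sum() - n1 * (n1 + 1) / 2.0) / (n1 * n0)

def sample_treatment(X, w, target_auroc, rng, iters=40):
    """Logistic propensity (an assumption); bisect its scale to hit target_auroc."""
    index = ((X - X.mean(0)) / X.std(0)) @ w
    u = rng.uniform(size=len(X))  # fixed draws so AUROC is deterministic in scale
    lo, hi = 0.0, 20.0
    for _ in range(iters):
        scale = 0.5 * (lo + hi)
        p = 1.0 / (1.0 + np.exp(-scale * index))
        t = (u < p).astype(int)
        if auroc(p, t) < target_auroc:
            lo = scale
        else:
            hi = scale
    return t, p

def sample_outcome(X, t, beta, tau, target_r2, rng):
    """y = f(X, T) + epsilon, with noise sized so the oracle R^2 is about target_r2."""
    Xs = (X - X.mean(0)) / X.std(0)
    f = np.sin(Xs @ beta) + tau * t  # hypothetical response surface
    sigma = math.sqrt(f.var() * (1.0 - target_r2) / target_r2)
    return f + rng.normal(0.0, sigma, size=len(f)), f

rng = np.random.default_rng(0)
# Placeholder standing in for the real confounders (one skewed, one correlated column).
a = rng.normal(size=2000)
X_real = np.column_stack([np.exp(a), 0.6 * a + 0.8 * rng.normal(size=2000)])

X = sample_confounders(X_real, 5000, rng)
t, p = sample_treatment(X, np.array([1.0, -0.5]), target_auroc=0.75, rng=rng)
y, f = sample_outcome(X, t, np.array([0.8, 0.3]), tau=2.0, target_r2=0.6, rng=rng)
```

Because f and tau are known here, any candidate estimator can be scored against the true effect, which is the point of the exercise. The copula step only matches rank correlations and marginals, so higher-order structure in the real confounders would not be reproduced by this sketch.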

dsteinberg · 2022-02-01T05:38:09Z

See the simulations folder for some simple DGPs

dsteinberg · 2022-02-01T05:44:52Z

I'll make better simulations as a "nice to have" for now

dsteinberg added the "nicetohave" label (Nice to have enhancement, but not a priority) on Feb 1, 2022
