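\begin{align*}
\text{model} &: \theta \times \xi \to (\chi \times \phi) \to \psi \\
\text{loss} &: \psi \times \psi \to \ell \\
\text{optimization} &: (\theta \times \xi \times \chi \to \psi) \to (\psi \times \psi \to \ell) \to D \to \theta \times \lambda \times L \\
\text{inference} &: (\theta \times \kappa \to \psi) \to \theta \to (\chi \times \kappa \times \zeta) \to \psi
\end{align*}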
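To make the curried shape of these signatures concrete, the following is a minimal Python sketch of the first three, not a definitive implementation. Every concrete choice in it is an assumption made for illustration: all type variables are instantiated as `float`, ξ is read as a fixed bias, ϕ as an output scale, λ as the step size that was used, and L as the recorded loss history. The helper `predict` is hypothetical and only adapts the curried `model` to the uncurried argument that `optimization` expects. `inference` is left out because this line alone does not say what κ and ζ stand for.

```python
from typing import Callable, List, Tuple

# Assumed concrete instances of the abstract type variables.
Theta = float   # θ: parameters (a single slope)
Xi = float      # ξ: read here as a fixed bias
Chi = float     # χ: input
Phi = float     # ϕ: read here as an output scale
Psi = float     # ψ: output
Ell = float     # ℓ: loss value
Lam = float     # λ: read here as the step size
LossHistory = List[Ell]          # L: read here as the per-step loss trace
Dataset = List[Tuple[Chi, Psi]]  # D: (input, target) pairs

Predictor = Callable[[Theta, Xi, Chi], Psi]
Loss = Callable[[Psi, Psi], Ell]


def model(theta: Theta, xi: Xi) -> Callable[[Chi, Phi], Psi]:
    """model : θ × ξ → (χ × ϕ) → ψ, as a scaled affine map."""
    def apply(chi: Chi, phi: Phi) -> Psi:
        return phi * (theta * chi + xi)
    return apply


def loss(pred: Psi, target: Psi) -> Ell:
    """loss : ψ × ψ → ℓ, as squared error."""
    return (pred - target) ** 2


def optimization(
    predict: Predictor,
) -> Callable[[Loss], Callable[[Dataset], Tuple[Theta, Lam, LossHistory]]]:
    """optimization : (θ × ξ × χ → ψ) → (ψ × ψ → ℓ) → D → θ × λ × L."""
    def with_loss(loss_fn: Loss) -> Callable[[Dataset], Tuple[Theta, Lam, LossHistory]]:
        def with_data(data: Dataset) -> Tuple[Theta, Lam, LossHistory]:
            theta, xi, lam, eps = 0.0, 0.0, 0.05, 1e-6
            history: LossHistory = []

            def mean_loss(t: Theta) -> Ell:
                return sum(loss_fn(predict(t, xi, x), y) for x, y in data) / len(data)

            for _ in range(200):
                # Central finite difference stands in for a real gradient.
                grad = (mean_loss(theta + eps) - mean_loss(theta - eps)) / (2 * eps)
                theta -= lam * grad
                history.append(mean_loss(theta))
            return theta, lam, history
        return with_data
    return with_loss


def predict(theta: Theta, xi: Xi, chi: Chi) -> Psi:
    """Hypothetical adapter: uncurries `model` with ϕ fixed to 1."""
    return model(theta, xi)(chi, 1.0)


data: Dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
theta_hat, lam, history = optimization(predict)(loss)(data)
```

Each arrow in a signature becomes one level of function nesting, so `optimization(predict)(loss)(data)` applies the three arguments one at a time, in the order they appear in the type.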