Add support for vision models and for passing in screenshots, for tasks that need visual information extraction or verification.