-
Notifications
You must be signed in to change notification settings - Fork 15
/
ai.qmd
104 lines (60 loc) · 6.82 KB
/
ai.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
---
title: "Getting started with ai-assisted LLMs"
subtitle: "chatGPT, copilot, Palm, POE"
date-modified: 'today'
date-format: long
license: CC BY-NC
bibliography: references.bib
---
Ai-assisted coding tools such as OpenAI's **ChatGPT**, Google's **Bard**, and GitHub's **Copilot** can be a used in code generation, code completion, and learning to code. R-specific approaches to assisted coding are convenient and available through RStudio addins such as {[**gptstudio**](#0)}.
::: {.callout-warning collapse="true"}
## warning
This information is fluid and ever-evolving. The AI field of Large Language Models (LLM) is in flux.
LLMs can save you time. They can be biased, inaccurate, and generate harmful content.
Be careful and circumspect about the data uploaded to these models.
:::
## Recommendation
- For conceptual coding help
- Use the free version of **GPT4** via [**bing.com/chat**](https://bing.com/chat). Set the *conversation style* to "More Precise"
- Alternatively, [**Claude.ai**](https://Claude.ai)
- Either of the above can be very helpful, especially when focusing your questions (prompt engineering) on the Tidyverse approach to data science.
- For code completion
- Signup for [GitHub for education](https://education.github.com/benefits), to integrate the free code-completion [GitHub Copilot into your RStudio](https://docs.posit.co/ide/user/ide/guide/tools/copilot.html "Copilot and RStudio IDE") or VSCode IDE
## Quick Start
[Bing Chat](https://www.microsoft.com/en-us/edge/features/bing-chat)[^1] can access openAI's *ChatGPT*. [Google **Bard**](https://bard.google.com) works in any browser. Another option is [**poe**.com](https://poe.com). Poe is especially useful for experimenting with a variety of Large Language Models (LLM) including *ChatGPT*, Google-*PaLM* (Bard), *Sage*, and *Claude*. Each of the options mentioned above are provided free of charge.[^2]
[^1]: At the time this sentence was composed Bing Chat is using ChatGPT4.
[^2]: Free but possibly rate limited as of the date-time this sentence was composed.
Below is *from* the Bard LLM when asked for advice on learning to code with ai-assited LLMs:
> - Be specific in your requests
> - Use clear and concise language
> - Provide examples
> - Be patient
> - Use a variety of LLMs and experiment with different approaches
> - **Use LLMs as a supplement to other learning resources**
> - Verify your computational results
See Also: the warning, above, and the *ethics* section, below.
## Tools or packages
Aside from the websites mentioned in the *Quick Start* section above, there are many approaches to integrating AI-assistance directly into an IDE such as RStudio or VSCode. These addins, or plugins, are highlighted for their seemless integration in a coding IDE.
### gptstudio
The gptstudio can reference a variety of LLMs defaulting to ChatGPT. A big advantage to {[gptstudio](https://michelnivard.github.io/gptstudio/)} is the ability to stay within the **RStudio** IDE, interacting seemlessly with notebooks or plain script files. From the Addins menu, get assistance writing code, or checking spelling and grammar. A companion package {gpttools} can extend {gptstudio}.
#### Preequisites
- OpenAI account
- OpenAI API key (requires a credit card but may not require payment)
- Set the API key in RStudio environment. For security, if using version control (such as GitHub), include the .Renviron in the .gitignore file
[Setup details are explained](https://michelnivard.github.io/gptstudio/#prerequisites) at gptstudio. Pricing seems exceedingly low as of this writing but there are no garauntees. The API allows coder to set cost limits.
### RTutor
There is a [website](https://rtutor.ai/) or a [**package** for integrating into **RStudio**](https://github.com/gexijin/RTutor). Does not require an API key, but paying your way is appreciated. This is a tutor designed to help teach about ai-assisted coding and learn about exploring datasets. A subsection of RTutor automatically runs datasets through various EDA packages.
### Github Copilot
Copilot is an autocomplete feature and works well in some cases. RStudio offers [tips for integrating](https://docs.posit.co/ide/user/ide/guide/tools/copilot.html "Copilot integration with RStudio IDE") the tool into the RStudio editor. Defaults to OpenAI's *codex* LLM based on code found in GitHub. This tool is focused on code completion rather than conceptual computational thinking. [Free for teachers and students](https://education.github.com/benefits "Github for Education benefits").
## Comparisons of models
[**gptstudio**](https://michelnivard.github.io/gptstudio/): Great for its deep integration into RStudio and ability to work within code-chunks or prose. Setting up the API key is a necessary configuration. The {gptstudio} documentation shows clearly how to protect API-keys linked to a credit card --- a notable issue when using version control like GitHub.
*Open Source LLMs*: [**h2o**](https://gpt.h2o.ai/) (Apache) [**Llama** **2**](https://codellama.h2o.ai/) (restricted Open) are available. Early reports are that Llama 2 is nearly as powerful as GPT 4 and has a mostly-open, or more-open, license than the proprietary licenses of the more proprietary models.
[**Poe.com**](https://poe.com): Easy access to several LLM models: ChatGPT, PaLM, GPT-4, Claude, and Sage. Works well. Comparison of LLM responses is convenient. Log-out is found in the settings page.
[**Claude**](https://claude.ai/): Of the easily accessible and free LLMs, I've had the best luck with Claude.ai. Llama 2 may be as good or better.
[**Bard**](https://bard.google.com)**/PaLM:** Google's LLM. Fast, efficient and familiar.
[**ChatGPT**](https://openai.com/chatgpt): OpenAI's LLM. This works well. Free website can slow down due to user congestion limits. Subscription models exist, presumably with less congestion.
[**Copilot**](https://github.com/features/copilot): OpenAI's codex LLM focused on code completion. In my experience copilot is less useful in understanding conceptual questions about computational data analysis. Uses the VSCode IDE; a nice app that works with many coding languages including R. VSCode also works with Quarto. The R set-up in VSCode is a bit cranky. If you're a Pythonite, VSCode set-up is more convenient than R. If you're an R coder, you might be happy sticking with RStudio.
### Summary comparison
*Claude* and *Llama 2* are my current favorites. Bard/PaLM and ChatGPT (via addins, poe.com, bing.com, or openai.com) have worked well in my tests. There seem to be some usage-congestion limitations, especially if you don't subscribe or want to use the latest ChatGPT model.
## Ethics
LLMs knowledge bases are private and lack transparency. There are important societal concerns about the fairness of equitable access to these tools. It's unclear how developers or users of these models can be held accountable.