Commit 82955c1

Update index.md
1 parent 0bcd667 commit 82955c1

1 file changed

+46
-39
lines changed

ai/index.md

Lines changed: 46 additions & 39 deletions
@@ -8,48 +8,55 @@
88

99

1010
# en avance: Notes from a lecture
11-
## delivered by Elle O'Brien 14-October-2025
12-
### UMich School of Information
11+
## delivered by Dr. Elle O'Brien 14-October-2025; University of Michigan School of Information
1312

1413

15-
Past org: Data Version Control
16-
17-
18-
- Code LLMs (abbreviated here CL) regarding simulation, cata collection, etcetera
14+
- Code LLMs (abbreviated here CL) are in regular use in research software: simulation, data collection, etc.
1915
- Undertrained as a Developer? You must be a Scientist!
20-
- Debugging code you didn't write is difficult
21-
- Inaccuracies in code and comments are a liability
22-
- Science context: What is "testing practice"?
23-
- Survey of scientists who use CL: Excerpted outcomes
24-
- Who are you? Life sciences and engineering and etcetera
25-
- What CL do you use? Partial: They use Chatbots not GitHub Copilot 3:1
26-
- Chatbots produce longer blocks of code (hypothesis: this increases the cognitive load on R)
27-
- Use case: "language changes" (e.g. due to legacy code, changing labs, specialty tools etcetera)
28-
- Chat is 1000x easier than documentation...
29-
- Why use documentation? CL can read and apply it for me
30-
- Testing: Ad hoc, eyeball, not systematic
31-
- Unsurprising: This can easily lead to failure modes
32-
- Incorrect mental models by R can lead to failure modes
33-
34-
35-
The bottom line seems to be: People with experience and skill in software development, the lower the
36-
"productivity boost".
37-
38-
39-
As part of this process: Be aware of the Retraction Watch database. 10s-of-k retraction; compare count of papers per year: 3 million.
40-
41-
42-
Potential failure mode: The quality of scientific literature slowly and quietly degrades.
43-
44-
45-
Potential failure mode: Scientists stop using professional caliber scientific software.
46-
47-
48-
Potential failure mode: Public trust crisis, as featured in the New York Times.
49-
50-
51-
A bottom line I take away is that this is a Cautionary Tale deserving our attention and effort
52-
as scientists. A grass roots approach (as suggested by the presenter) could begin with "buddy up"
16+
- Established: Debugging code you didn't write is difficult
17+
- Established: Inaccuracies in code and comments are a liability
18+
- Science context: 'What is "testing practice"?' (lack of awareness)
19+
- From a survey of scientists who use CL: Excerpted outcomes
20+
- Who are you? Life sciences, engineering, and so on across domains
21+
- What CL do you use? Ratio 3 to 1: Chatbots over GitHub Copilot (coding assistants)
22+
- Consequently: As Chatbots produce longer blocks of code in comparison...
23+
- ...the hypothesis is that Chatbot code increases the cognitive load on the Researcher-Developer
24+
- Under what conditions do researchers work with unfamiliar programming languages?
25+
- Due to legacy code, moving between labs, domain-specific tools, and the like
26+
- How do Researcher-Developers interact with documentation?
27+
- In short: They don't.
28+
- "Chat is 1000x easier than documentation."
29+
- "Why use documentation? The CL can read it and apply it for me"
30+
- How does research code **testing** get done?
31+
- Ad hoc or 'eyeball' methods; not systematic
32+
- This can easily lead to failure modes
33+
- Another common theme is incorrect mental models...
34+
- ...that is: on the part of the Researcher-Developer
35+
- "The code is looking at an Internet-based resource..."
36+
- ...when in fact the code is not looking at the Internet
37+
- Needless to say, this can produce failure modes
38+
39+
40+
The survey results then turned to the relationship between the perceived productivity benefit of a CL
41+
and facets of skill on the part of the Researcher-Developer.
42+
A summary point: People with more experience and skill in software development report a
43+
lower "productivity boost" from using a CL, in some cases even a *decrease*.
44+
45+
46+
Turning to the scientific literature produced with CL assistance: An
47+
interesting resource is the Retraction Watch database. There are currently
48+
O(10k) retractions per year, compared with 3 million published papers per year.
49+
50+
51+
In summary, the narrative suggests the following failure modes:
52+
- The quality of scientific literature slowly and quietly degrades.
53+
- Scientists stop using professional caliber scientific software.
54+
- Poor research attributable to CL use results in a public trust crisis, 'as featured in the New York Times'.
55+
56+
57+
A Cautionary Tale deserving of attention and effort: For scientists, credibility is an important
58+
part of how we operate ('philosophy of doubt'). Where to begin? As one example, the speaker
59+
suggests a grass roots approach: "Buddy up"
5360
with an RSE.
5461

5562
