feat(r): Generic datasources #28

Open: npelikan wants to merge 50 commits into base: main

npelikan commented Jun 7, 2025

This is a WIP but seems to basically work.

One question -- I retained the basic functionality for local data.frames (that is, df -> querychat -> df), where remote data sources instead return a dbplyr lazy tbl(), meant for chaining. Is this too confusing of a behavior split? Should local data.frames also return a tbl(), just now connected to duckdb?
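To make the split concrete, here is a minimal sketch of what a caller sees on the remote side. It is not the PR's code: it assumes the DBI, RSQLite, dplyr, and dbplyr packages are installed, and it uses an in-memory SQLite copy of `mtcars` purely as a stand-in for a remote source.

```r
# Illustrative stand-in for a remote database (an assumption, not from the PR).
conn <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(conn, "mtcars", mtcars)

# Remote-style result: a lazy tbl. No rows have been fetched at this point.
lazy_result <- dplyr::tbl(conn, dplyr::sql("SELECT * FROM mtcars WHERE cyl = 4"))
is_lazy <- inherits(lazy_result, "tbl_lazy")

# Chaining adds to the generated SQL; collect() runs it and returns a
# local data frame, which is what the data.frame path hands back directly.
heavy_cars <- dplyr::collect(dplyr::filter(lazy_result, wt > 2.5))

DBI::dbDisconnect(conn)
```

With a local data.frame the caller already holds something like `heavy_cars`; with a remote source they hold `lazy_result` and decide when to `collect()`.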

A few immediate TODOs:

  • add more documentation
  • generate some examples
  • create shinytests
  • validate this works on more than just sqlite

jcheng5 and others added 23 commits April 3, 2025 22:47
...instead of requiring explicit DataSource subclass creation
…provements

Plus some improvements:
- Cleaner .md file reading code in example apps
- Use GPT-4.1 by default, not GPT-4 😬
- Make sqlalchemy required
fix: No longer need to manually call session$ns() with shinychat (#1
@npelikan npelikan marked this pull request as draft June 10, 2025 02:24
@npelikan npelikan marked this pull request as ready for review June 10, 2025 02:25
@schloerke schloerke marked this pull request as draft June 10, 2025 13:42
@schloerke schloerke changed the title DRAFT: R generic datasources feat(r): Generic datasources Jun 10, 2025
npelikan (Author) commented:

(Changing the target of this to main so it merges cleanly)

@npelikan npelikan changed the base branch from generic-datasource-improvements to main June 25, 2025 22:53
jcheng5 (Collaborator) commented Jun 26, 2025

I'm really sorry, this is still waiting on me, isn't it? I'm going to book some time for us to discuss in realtime if that's cool with you.

npelikan (Author) commented:

Alright, I think this is ready to merge @jcheng5

npelikan and others added 11 commits July 1, 2025 17:39
Previously, examples/app-database.R would show an error on
startup because the initial query was "", which was then sent
as a SQL query to RSQLite. The get_lazy_data code path accounted
for the "" query, so we decided to make the eager code path just
call the lazy code path, then collect().

Also fixed a formatting issue with the table.
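
The commit above can be pictured roughly as follows. This is a hedged sketch, not the code in the PR: the function names ending in `_sketch` are hypothetical, and treating an empty query as "return the whole table" is an assumption about how the lazy path accounts for `""`. It needs DBI, RSQLite, dplyr, and dbplyr.

```r
# Hypothetical lazy path: an empty query falls back to the whole table,
# so "" is never sent to the database as SQL.
get_lazy_data_sketch <- function(conn, table_name, query = "") {
  if (!nzchar(trimws(query))) {
    return(dplyr::tbl(conn, table_name))
  }
  dplyr::tbl(conn, dplyr::sql(query))
}

# Hypothetical eager path: reuse the lazy path, then materialize with collect().
get_data_sketch <- function(conn, table_name, query = "") {
  dplyr::collect(get_lazy_data_sketch(conn, table_name, query))
}

conn <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(conn, "mtcars", mtcars)

all_rows <- get_data_sketch(conn, "mtcars")  # initial "" query
some_rows <- get_data_sketch(conn, "mtcars", "SELECT * FROM mtcars WHERE cyl = 4")

DBI::dbDisconnect(conn)
```

Routing the eager path through the lazy one means the empty-query handling lives in a single place.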
It seems like dbplyr tables-as-queries can be a bit... temperamental. This should fix that by explicitly declaring sql always.
jcheng5 (Collaborator) commented Jul 17, 2025

Shooot, there is one problem left. We're now using dplyr::tbl(conn, dplyr::sql(query)) to actually perform the SQL queries. The problem is that dplyr assumes that the second argument is a table name, not a whole SQL query. So little things that are valid in a standalone SQL query break; for example, a trailing semicolon, or ending on a SQL comment. You end up seeing pretty confusing error messages with unrecognizable SQL like:

SELECT * FROM (SELECT * FROM iris --test) AS `q01` WHERE (0 = 1)

I'll ask Hadley what the right way to do this is (if there is one).
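
Until there is a proper answer, one possible stopgap is to tidy the query string before it gets wrapped. The sketch below is base R only and makes several assumptions: `prepare_subquery()` and `wrap_like_dbplyr()` are hypothetical helpers, the wrapper merely imitates the shape of the generated SQL shown above, and whether dbplyr would keep the added newline when it builds its own SQL has not been verified here.

```r
# Hypothetical helper (not part of querychat or dbplyr): make a user-supplied
# query safer to embed as a subquery.
prepare_subquery <- function(query) {
  # Drop trailing whitespace and semicolons, e.g. "SELECT 1 ; " -> "SELECT 1".
  query <- sub("[;[:space:]]+$", "", query)
  # End with a newline so a trailing "-- comment" cannot swallow the ")".
  paste0(query, "\n")
}

# Imitates the wrapping seen in the error message; for illustration only.
wrap_like_dbplyr <- function(query) {
  paste0("SELECT * FROM (", prepare_subquery(query), ") AS `q01` WHERE (0 = 1)")
}

wrapped_comment <- wrap_like_dbplyr("SELECT * FROM iris --test")
wrapped_semicolon <- wrap_like_dbplyr("SELECT * FROM iris;")
cat(wrapped_comment, "\n")
```

This does not cover every case: a semicolon followed by a comment (`SELECT 1; -- note`) still slips through, and handling that robustly would need a real SQL tokenizer rather than a regular expression.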
