@lorenabalan lorenabalan commented Oct 30, 2025

Merging in all connectors framework work thus far into main
Will merge after #3815 & #3812

lorenabalan and others added 30 commits September 22, 2025 09:38
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Artem Shelkovnikov <[email protected]>
…ut (#3729)

# 🚧  Restructure Connectors project 🚧 
## Break out `connectors_service` directory under the `app/` folder
### Part of elastic/search-team#10999

This PR adds a subdirectory, `connectors_service`, under the `app/`
directory and moves the connector code under it

`app/`
- `connectors_service`
  - `connectors`
  - `Makefile`
  - `pyproject.toml`
  - `.ruff.toml`
  - etc

## Related Pull Requests
- #3721
- #3724
This PR extracts the minimal amount of code to become a
"connectors_sdk".

The SDK is a separate package containing the minimum required to
implement new connectors. In theory, this package should be sufficient
for extracting the connectors; more code might need to move there
later, which is fine.

Some design decisions:

1. I had to move the Dockerfiles to the root directory so that the current
process of building Docker images would not break. Later, when the
packages are published, the Dockerfiles might move back to the apps
they make available.
2. I also had to update the Makefile in `app/connectors_service` so
that it installs the dev version of the connectors_sdk package using a
relative path. I don't know if there's a better way for now.
3. Some code got moved around in a _specific_ way; I've added comments in
the places that I feel need explanation.

---------

Co-authored-by: Elastic Machine <[email protected]>
## Description
Had a typo in my fix in
87ca5e2
…les into folders (#3747)

# 🚧 Connectors Framework Improvements 🏗️ 
## Convert single source `*.py` files into folders with code broken down
into more granular files

The goal with this PR is to convert (in-place) our single-file data
source implementations into folders that may contain multiple `*.py`
files.

Ex:
Original `MySQL` data source file
```
connectors/sources/mysql.py
```

New source (for example, file names not set in stone yet)
```
../connectors/sources/mysql/
    datasource.py (entry point)
    mysql_utils.py
    mysql_validations.py 
```
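
A minimal sketch of how such a folder could keep a single import path, assuming the package's `__init__.py` re-exports the class from `datasource.py`. The `demo_sources` package, the `MySqlDataSource` body, and `format_table_name` are hypothetical stand-ins, not the actual connector code. The script writes the files to a temporary directory only so that it can run on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; real module, class, and helper names may differ.
PACKAGE_FILES = {
    "demo_sources/__init__.py": "",
    "demo_sources/mysql/__init__.py": (
        "from .datasource import MySqlDataSource\n"
        "__all__ = ['MySqlDataSource']\n"
    ),
    "demo_sources/mysql/datasource.py": (
        "from .mysql_utils import format_table_name\n"
        "\n"
        "class MySqlDataSource:\n"
        "    def table_id(self, database, table):\n"
        "        return format_table_name(database, table)\n"
    ),
    "demo_sources/mysql/mysql_utils.py": (
        "def format_table_name(database, table):\n"
        "    return f'{database}.{table}'\n"
    ),
}


def load_data_source_class():
    """Write the sketch package to a temp dir and import it via its package path."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for relative_path, content in PACKAGE_FILES.items():
            target = root / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)
        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            # Callers import the package, not the individual files inside it.
            package = importlib.import_module("demo_sources.mysql")
        finally:
            sys.path.remove(str(root))
    return package.MySqlDataSource
```

Under this assumption, callers depend only on the package name, so helpers can be moved between the inner files without touching import sites.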

## Why?
- Easier understanding of the constituent parts of what a connector
'needs' (`BaseDataSource`, `utils` etc)
  - Great for both human coders and LLM tools

---------

Co-authored-by: Elastic Machine <[email protected]>
## Description
Split azure blob storage + s3 connectors

## Related Pull Requests

#3747
# 🚧 Connectors Framework Improvements 🚧 

Move `_message_doc` function inside the `GMailDataSource` class so that
we don't have to expose it in gmail's `__init__.py` but can still unit
test it in `test_gmail.py`

## Closes https://github.com/elastic/connectors-py/issues/###


## Checklists

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
Connectors sources file -> folder work:
#3747

---------

Co-authored-by: Elastic Machine <[email protected]>
# 🚧 Connectors Framework Improvements 🚧 
## Part of elastic/search-team#10999

Small PR to make sure connectors source paths are correct in our
Buildkite pipeline
# 🚧 Connectors Framework Improvements 🚧 
## Part of elastic/search-team#10999

This PR splits the PostgreSQL connector file into a folder.

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

---------

Co-authored-by: Elastic Machine <[email protected]>
## Description

Split Jira & Confluence connectors.

Previous folder structure:
```
sources
|__ atlassian.py   # imports from confluence.py, jira.py
|__ confluence.py  # imports from atlassian.py
|__ jira.py        # imports from atlassian.py
```

## Related Pull Requests

* #3747
# 🚧 Connectors Framework Improvements 🚧 
## Part of elastic/search-team#10999

This PR splits the MySQL connector source file into a directory

## Checklists

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747

---------

Co-authored-by: Elastic Machine <[email protected]>
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999

This PR splits the Microsoft SQL connector source file into a directory

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999

This PR splits the Oracle DB connector source file into a directory

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747

---------

Co-authored-by: Elastic Machine <[email protected]>
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999

This PR splits the Sharepoint connector source files into a dedicated
Sharepoint directory

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747

---------

Co-authored-by: Elastic Machine <[email protected]>
PR to bring the recent changes from
#3708 into the `develop`
branch.

This PR is a prerequisite to splitting apart the GitHub connector
## Description 
Split connectors into multiple files.

## Related Pull Requests

elastic/search-team#10999
#3747

---------

Co-authored-by: Elastic Machine <[email protected]>
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999 &
elastic/search-team#11003

This PR splits the Box connector source files into a directory

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747

---------

Co-authored-by: Elastic Machine <[email protected]>
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999 &
elastic/search-team#11003

This PR splits the Redis connector source file into a directory

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747

---------

Co-authored-by: Elastic Machine <[email protected]>
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999 &
elastic/search-team#11003

This PR splits the Outlook connector source file into a directory

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999 &
elastic/search-team#11003

This PR splits the OneDrive connector source file into a directory

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747

---------

Co-authored-by: Elastic Machine <[email protected]>
lorenabalan and others added 14 commits October 15, 2025 09:23
## Description

## Related PRs
#3747

---------

Co-authored-by: Elastic Machine <[email protected]>
# 🚧 Connectors Framework Improvements🚧
## Part of elastic/search-team#10999 &
elastic/search-team#11003

This PR splits the following connector source files into directories:

- Salesforce
- GitHub
- Sandfly

#### Pre-Review Checklist
- [x] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [x] this PR has a meaningful title
- [x] this PR links to all relevant github issues that it fixes or
partially addresses
- [x] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [x] this PR has a thorough description
- [x] Covered the changes with automated tests
- [x] Tested the changes locally

## Related Pull Requests
- #3747
## Description
* Updated watched paths to be more precise (`app/connectors_service/`
prefix)
* Updated the `"${DOCKERFILE_FTEST_PATH}"` entry in the paths to watch. After
7cd4a4c
it wasn't being set anywhere in `pipeline.yml`, so it ended up as `""`.
According to AI, what ends up happening under the hood is:

> When DOCKERFILE_FTEST_PATH is empty, the grep command becomes:
> `echo ".buildkite/pipeline.yml" | grep -q "^"`
> An empty pattern with ^ means "match lines that start with... anything",
> which matches EVERY line!

* Updated `diff` script according to IDE & AI suggestions
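
The empty-variable behaviour described above can be reproduced with a small sketch. It uses Python's `re` module as a stand-in for `grep`; the function names are hypothetical, and escaping the prefix is a simplification (the pipeline passes the pattern to `grep` unescaped).

```python
import re


def matches_watched_path(changed_file: str, watched_prefix: str) -> bool:
    # Mirrors `grep -q "^${PREFIX}"`: an anchored prefix match.
    # With an empty prefix the pattern is just "^", which matches any string.
    return re.search("^" + re.escape(watched_prefix), changed_file) is not None


def matches_watched_path_guarded(changed_file: str, watched_prefix: str) -> bool:
    # One possible guard: an unset or empty prefix never counts as a match.
    if not watched_prefix:
        return False
    return matches_watched_path(changed_file, watched_prefix)
```

With the unguarded version, an unset variable makes every changed file look like a watched one, which is consistent with the extra ftest runs the quote explains.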

## Testing
* CI for
[7359254](7359254)
only runs azure blob storage ftest
https://buildkite.com/elastic/connectors/builds/20480#019a0c3a-0ca5-496a-99d4-7ed21a1a04c3
* CI for this PR otherwise doesn't run any ftests
## Related to https://github.com/elastic/connectors-py/issues/10999
* Added new _manual_ pipeline for publishing to PyPI:
[.buildkite/pypi-publish-pipeline.yml](https://github.com/elastic/connectors/pull/3748/files#diff-028ba08c5e2e74e83911f11edfc92beb3dd8ebe77b6136d593f8b88bc047f8c0)
* Added new step in main pipeline (that runs on PRs and on daily
schedule) to build the packages' [binary & source
distributions](https://packaging.python.org/en/latest/specifications/section-distribution-formats/),
then `twine check` them; for `connectors_service` it will also install
it and test the CLI entrypoints it exposes (e.g. `connectors --help`)
* Changes to `pyproject.toml`:
  * Incorporated `pytest`, `pyright`, and `ruff` config into
    `pyproject.toml` and removed individual files
  * Updated `license` in `pyproject.toml` to avoid deprecation warnings
    (removal in 2026) - see warnings
    [here](https://buildkite.com/elastic/connectors/builds/19996#0199aa3a-2539-4dea-81cb-bd1f0a401eb0)
  * Updated `packages` in `pyproject.toml` to avoid "ambiguous packages"
    warning - see warning
    [here](https://buildkite.com/elastic/connectors/builds/19996#0199aa3a-2539-4dea-81cb-bd1f0a401eb0)
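
The build-and-check step from the list above could be sketched roughly as follows, assuming the `build` and `twine` packages are installed. The helper names and the `dist` output directory are hypothetical, the pipeline's actual commands may differ, and the CLI entrypoint test is not covered here.

```python
import subprocess
import sys
from pathlib import Path


def build_command(package_dir: str, dist_dir: str) -> list[str]:
    """Command that builds both a source distribution and a wheel."""
    return [sys.executable, "-m", "build", "--sdist", "--wheel",
            "--outdir", dist_dir, package_dir]


def check_command(artifacts: list[str]) -> list[str]:
    """Command that validates the metadata of the built distributions."""
    return ["twine", "check", *artifacts]


def find_artifacts(dist_dir: str) -> list[str]:
    """Collect the .tar.gz and .whl files produced by the build."""
    dist = Path(dist_dir)
    return sorted(str(p) for p in [*dist.glob("*.tar.gz"), *dist.glob("*.whl")])


def build_and_check(package_dir: str, dist_dir: str = "dist") -> None:
    # check=True makes a failing build or a failing check abort the step.
    subprocess.run(build_command(package_dir, dist_dir), check=True)
    subprocess.run(check_command(find_artifacts(dist_dir)), check=True)
```

Calling `build_and_check("app/connectors_service")` would then build and check that package; the commands are kept as plain lists so they can be inspected without running anything.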

~❓ I don't see `diff.sh` used anywhere... can I delete it or am I missing
something?~ Never mind, I was looking for `diff.sh` instead of just `diff`.

#### ✍️ Process moving forward
Daily and per PR:
- Validate that packages are built correctly
- The "Publish DRA" part of `pipeline.yml` will contain:
    - building the Docker image by **building and installing the package
      from its distribution**
    - building a `.tar.gz` and `.whl` of our packages

Release day:
- same process as today, a Docker image DRA is chosen as BC and then
released
- _additionally,_ trigger manual pipeline to publish to PyPI from
appropriate commit; ⚠️ as agreed on 21/10/2025, this will only publish
to Test PyPI.

TODO:
- [x] Update `catalog-info.yml`
- [x] Publish .tar.gz and .whl as DRA
- [x] Ensure Docker images are built with installing code from .whl

## Checklists

#### Pre-Review Checklist
- [ ] this PR does NOT contain credentials of any kind, such as API keys
or username/passwords (double check `config.yml.example`)
- [ ] this PR has a meaningful title
- [ ] this PR links to all relevant github issues that it fixes or
partially addresses
- [ ] if there is no GH issue, please create it. Each PR should have a
link to an issue
- [ ] this PR has a thorough description
- [ ] Covered the changes with automated tests
- [ ] Tested the changes locally
- [ ] Added a label for each target release version (example: `v7.13.2`,
`v7.14.0`, `v8.0.0`)
- [ ] For bugfixes: backport safely to all minor branches still
receiving patch releases
- [ ] Considered corresponding documentation changes
- [ ] Contributed any configuration settings changes to the
configuration reference
- [ ] if you added or changed Rich Configurable Fields for a Native
Connector, you made a corresponding PR in
[Kibana](https://github.com/elastic/kibana/blob/main/packages/kbn-search-connectors/types/native_connectors.ts)

#### Changes Requiring Extra Attention

- [ ] Security-related changes (encryption, TLS, SSRF, etc)
- [ ] New external service dependencies added.

---------

Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Sean Story <[email protected]>
elasticmachine and others added 5 commits October 30, 2025 16:54
Final sync before merging develop to main.

⚠️ Should be merged as merge-commit.

---------

Co-authored-by: Sean Story <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
@lorenabalan lorenabalan marked this pull request as ready for review October 30, 2025 17:10
@lorenabalan lorenabalan requested a review from a team as a code owner October 30, 2025 17:10
@mattnowzari

Ahh this is very exciting!!! 🚀

I noticed we are merging from a new branch merge-develop-to-main - The README PR got merged into develop branch after the fact, which means those changes are not present here. We can either merge this PR, then do a follow-up PR or get latest changes from develop into merge-develop-to-main then merge this PR into main, thoughts?

@lorenabalan

> I noticed we are merging from a new branch merge-develop-to-main - The README PR got merged into develop branch after the fact, which means those changes are not present here. We can either merge this PR, then do a follow-up PR or get latest changes from develop into merge-develop-to-main then merge this PR into main, thoughts?

I was gonna merge latest develop into this but you're right I don't need this branch - muscle memory from all the main-to-develop merges before. 😅 I'll re-raise!

@lorenabalan lorenabalan deleted the merge-develop-to-main branch October 31, 2025 08:50