Commit a611df3: Merge branch 'main' into mandd/expandedDemo
2 parents: a984e77 + d4f4864

39 files changed: +1782 additions, -3020 deletions

.github/workflows/github-actions.yml

Lines changed: 33 additions & 12 deletions
@@ -36,16 +36,21 @@ jobs:
       - name: Install DACKAR Required Libraries
         # Either fix scikit-learn==1.5 to allow quantulum3 to use the pretrained classifier or
         # Run "quantulum3-training -s" to retrain classifier
+
         run: |
           pwd
           conda create -n dackar_libs python=3.11
           conda init bash && source ~/.bashrc && conda activate dackar_libs
-          pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas
+          pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas wordcloud jsonschema toml
           pip install neo4j jupyterlab
           pip install pytest
-          python3 -m spacy download en_core_web_lg
-          python3 -m coreferee install en
-          python3 -m nltk.downloader all
+          # python -m spacy download en_core_web_lg [for some reason, GitHub machine complains this command]
+      - name: Download trained models
+        run: |
+          conda init bash && source ~/.bashrc && conda activate dackar_libs
+          python -m coreferee install en
+          python -m nltk.downloader all
+          pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
           quantulum3-training -s

       - name: Test

@@ -81,14 +86,18 @@ jobs:
           pwd
           conda create -n dackar_libs python=3.11
           conda init zsh && source ~/.zshrc && conda activate dackar_libs
-          pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas
+          pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas wordcloud jsonschema toml
           pip install neo4j jupyterlab
           pip install pytest
-          python3 -m spacy download en_core_web_lg
-          python3 -m coreferee install en
-          python3 -m nltk.downloader all
-          quantulum3-training -s

+          # python -m spacy download en_core_web_lg [for some reason, GitHub machine complains this command]
+      - name: Download trained models
+        run: |
+          conda init zsh && source ~/.zshrc && conda activate dackar_libs
+          python -m coreferee install en
+          python -m nltk.downloader all
+          pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
+          quantulum3-training -s

       - name: Test
         run: |

@@ -98,7 +107,7 @@ jobs:


   Test-DACKAR-Windows:
-    runs-on: windows-latest
+    runs-on: windows-2025
     steps:
       - name: Setup Conda
         uses: conda-incubator/setup-miniconda@v3

@@ -125,16 +134,28 @@ jobs:
           echo " Conda information"
           conda info
           echo " Activate Dackar conda environment"
-          pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas
+          pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas wordcloud jsonschema toml
           pip install neo4j jupyterlab
           pip install pytest
+          pip uninstall numba llvmlite
+          pip install --no-cache-dir numba==0.61 llvmlite==0.44
           conda list
           which python
-          python -m spacy download en_core_web_lg
+          pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
           python -m coreferee install en
           python -m nltk.downloader all
           quantulum3-training -s

+          # python -m spacy download en_core_web_lg [for some reason, GitHub machine complains this command]
+          # python -m spacy download en_core_web_lg
+          # pip install numba
+      # - name: Download trained models
+      #   run: |
+      #     pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
+      #     python -m coreferee install en
+      #     python -m nltk.downloader all
+      #     quantulum3-training -s
+
       - name: Test
         run: |
           cd tests
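The workflow above pins several versions (spacy==3.5, numpy==1.26, numba==0.61, llvmlite==0.44) to keep the pretrained quantulum3 classifier and numba working. When debugging a local environment against those pins, a small check can help. This is a minimal stdlib sketch; the helper name and the pin-matching rule ("1.26" accepts any 1.26.x) are assumptions, not part of DACKAR or the workflow:

```python
# Hypothetical helper (not part of DACKAR): compare installed package
# versions against the pins used in the CI workflow above.
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the pip commands in the workflow diff.
PINS = {"spacy": "3.5", "numpy": "1.26", "numba": "0.61", "llvmlite": "0.44"}

def check_pins(pins, get_version=None):
    """Return {package: (installed_version_or_None, ok)} for each pin.

    `get_version` can be injected for testing; it defaults to
    importlib.metadata.version, which raises PackageNotFoundError
    for packages that are not installed.
    """
    get_version = get_version or version
    report = {}
    for pkg, pin in pins.items():
        try:
            installed = get_version(pkg)
        except PackageNotFoundError:
            report[pkg] = (None, False)
            continue
        # Treat a pin like "1.26" as accepting any 1.26.x release.
        ok = installed == pin or installed.startswith(pin + ".")
        report[pkg] = (installed, ok)
    return report

if __name__ == "__main__":
    for pkg, (installed, ok) in check_pins(PINS).items():
        print(f"{pkg}: {installed or 'missing'} ({'ok' if ok else 'MISMATCH'})")
```

Injecting `get_version` keeps the helper testable without actually installing the pinned packages.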

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -70,6 +70,7 @@ instance/

 # Sphinx documentation
 docs/_build/
+docs/notebooks/

 # PyBuilder
 target/

@@ -144,3 +145,8 @@ tmp/
 Profile.prof
 .vscode
 .sass-cache
+
+
+*.csv
+*.bk
+*.png

README.md

Lines changed: 12 additions & 2 deletions
@@ -1,5 +1,15 @@
 # DACKAR
-*Digital Analytics, Causal Knowledge Acquisition and Reasoning for Technical Language Processing*
+*Digital Analytics, Causal Knowledge Acquisition and Reasoning*
+
+A Knowledge Management and Discovery Tool for Equipment Reliability Data
+
+To improve the performance and reliability of highly dependable technological systems such as nuclear power plants, advanced monitoring and health management systems are employed to inform system engineers about observed degradation processes and anomalous behaviors of assets and components. This information is captured as large amounts of data that can be heterogeneous in nature (e.g., numeric, textual). Such large amounts of data pose a challenge when system engineers are required to parse and analyze them to track the historical reliability performance of assets and components. DACKAR tackles this challenge by providing the means to organize equipment reliability data in the form of a knowledge graph. DACKAR distinguishes itself from current knowledge-graph-based methods in that model-based systems engineering (MBSE) models are used to capture system architecture and health and performance data. MBSE models serve as the skeleton of a knowledge graph; numeric and textual data elements, once processed, are associated with MBSE model elements. This feature opens the door to new data analytics methods designed to identify causal relations between observed phenomena.
+
+DACKAR is structured as a set of workflows, where each workflow is designed to process raw data elements (i.e., anomalies, events reported in textual form, MBSE models) and construct or update a knowledge graph. For each workflow, the user can specify the sequence of pipelines that perform specific processing actions on the raw data or on data already processed within the same workflow. Specific guidelines on the formats of the raw data are provided. In addition, a specific data object is defined within each workflow; each pipeline is tasked with either processing a portion of the defined data object or creating knowledge graph data. The available workflows are:
+* mbse_workflow: Workflow to process system and equipment MBSE models
+* anomaly_workflow: Workflow to process numeric data and anomalies
+* tlp_workflow: Workflow to process textual data
+* kg_workflow: Workflow to construct and update knowledge graphs

 ## Installation

@@ -41,7 +51,7 @@ and ``jupyterlab`` is used to execute notebook examples under ``./examples/`` fo

 ## Test

-### Test functions with ```__pytest__```
+### Test functions with ```pytest```

 - Run the following command in your command line to install pytest:
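The new README text describes MBSE model elements as the skeleton of a knowledge graph, with processed numeric and textual data elements attached to the model elements they refer to. A minimal stdlib sketch of that idea follows; all names and the data model here are hypothetical illustrations, not DACKAR's actual API:

```python
# Hypothetical illustration of the MBSE-skeleton idea from the README:
# MBSE model elements form the graph skeleton; processed data elements
# (anomalies, textual event reports) are attached to those elements.
from collections import defaultdict

class KnowledgeGraph:
    def __init__(self):
        self.nodes = {}                # node id -> attribute dict
        self.edges = defaultdict(set)  # node id -> set of neighbor ids

    def add_element(self, node_id, **attrs):
        """Add an MBSE model element to the skeleton."""
        self.nodes[node_id] = {"kind": "mbse_element", **attrs}

    def connect(self, a, b):
        """Add an undirected edge between two nodes."""
        self.edges[a].add(b)
        self.edges[b].add(a)

    def attach_observation(self, element_id, obs_id, **attrs):
        """Attach a processed data element to an existing MBSE element."""
        if element_id not in self.nodes:
            raise KeyError(f"unknown MBSE element: {element_id}")
        self.nodes[obs_id] = {"kind": "observation", **attrs}
        self.connect(element_id, obs_id)

# Skeleton from a toy MBSE model: a pump feeding a heat exchanger.
kg = KnowledgeGraph()
kg.add_element("pump-1", type="centrifugal pump")
kg.add_element("hx-1", type="heat exchanger")
kg.connect("pump-1", "hx-1")

# Attach a processed anomaly (numeric) and an event report (textual).
kg.attach_observation("pump-1", "anomaly-42", source="vibration sensor")
kg.attach_observation("pump-1", "event-7", text="pump seal leak reported")

print(sorted(kg.edges["pump-1"]))  # ['anomaly-42', 'event-7', 'hx-1']
```

Because observations become neighbors of the model elements they describe, graph traversals starting from a component naturally collect all evidence about it, which is the property the README's causal-analysis claim relies on.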
docs/contributors.rst

Lines changed: 4 additions & 0 deletions
@@ -1,2 +1,6 @@
 Contributors
 ============
+
+Congjian Wang: [email protected]
+Diego Mandelli: [email protected]
+Joshua J. Cogliati: [email protected]

docs/install_spacy3.5.rst

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ Install the Required Libraries

   conda activate dackar_libs

-  pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas
+  pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas wordcloud jsonschema toml

 .. conda install -c conda-forge pandas
 .. scikit-learn 1.2.2 is required for quantulum3

docs/support.rst

Lines changed: 2 additions & 5 deletions
@@ -2,11 +2,8 @@
 Support
 =======

-The easiest way to get help with the project is to open an issue on Github_.
-
-.. The mailing list at ... is also available for support.
-
-.. _Github: https://github.inl.gov/congjian-wang/DACKAR/issues
+The easiest way to get help with the project is to open an issue on `GitHub`_.
+.. _Github: https://github.com/idaholab/DACKAR/issues/

 Developers:
 -----------
