Commit bb5345d

Merge branch 'main' into wangc/architecture

2 parents 2c98bcf + e9e6889
File tree: 4 files changed (+47 −15 lines)

.github/workflows/github-actions.yml

Lines changed: 30 additions & 9 deletions
@@ -36,16 +36,21 @@ jobs:
       - name: Install DACKAR Required Libraries
         # Either fix scikit-learn==1.5 to allow quantulum3 to use the pretrained classifier or
         # Run "quantulum3-training -s" to retrain classifier
+
         run: |
           pwd
           conda create -n dackar_libs python=3.11
           conda init bash && source ~/.bashrc && conda activate dackar_libs
           pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas wordcloud jsonschema toml
           pip install neo4j jupyterlab
           pip install pytest
-          python3 -m spacy download en_core_web_lg
-          python3 -m coreferee install en
-          python3 -m nltk.downloader all
+          # python -m spacy download en_core_web_lg [for some reason, GitHub machine complains this command]
+      - name: Download trained models
+        run: |
+          conda init bash && source ~/.bashrc && conda activate dackar_libs
+          python -m coreferee install en
+          python -m nltk.downloader all
+          pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
           quantulum3-training -s
 
       - name: Test
@@ -84,11 +89,15 @@ jobs:
           pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas wordcloud jsonschema toml
           pip install neo4j jupyterlab
           pip install pytest
-          python3 -m spacy download en_core_web_lg
-          python3 -m coreferee install en
-          python3 -m nltk.downloader all
-          quantulum3-training -s
 
+          # python -m spacy download en_core_web_lg [for some reason, GitHub machine complains this command]
+      - name: Download trained models
+        run: |
+          conda init zsh && source ~/.zshrc && conda activate dackar_libs
+          python -m coreferee install en
+          python -m nltk.downloader all
+          pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
+          quantulum3-training -s
 
       - name: Test
         run: |
@@ -98,7 +107,7 @@ jobs:
 
 
   Test-DACKAR-Windows:
-    runs-on: windows-latest
+    runs-on: windows-2025
     steps:
       - name: Setup Conda
         uses: conda-incubator/setup-miniconda@v3
@@ -128,13 +137,25 @@ jobs:
           pip install spacy==3.5 stumpy textacy matplotlib nltk coreferee beautifulsoup4 networkx pysbd tomli numerizer autocorrect pywsd openpyxl quantulum3[classifier] numpy==1.26 scikit-learn pyspellchecker contextualSpellCheck pandas wordcloud jsonschema toml
           pip install neo4j jupyterlab
           pip install pytest
+          pip uninstall numba llvmlite
+          pip install --no-cache-dir numba==0.61 llvmlite==0.44
           conda list
           which python
-          python -m spacy download en_core_web_lg
+          pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
           python -m coreferee install en
           python -m nltk.downloader all
           quantulum3-training -s
 
+          # python -m spacy download en_core_web_lg [for some reason, GitHub machine complains this command]
+          # python -m spacy download en_core_web_lg
+          # pip install numba
+          # - name: Download trained models
+          #   run: |
+          #     pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl
+          #     python -m coreferee install en
+          #     python -m nltk.downloader all
+          #     quantulum3-training -s
+
       - name: Test
         run: |
           cd tests
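The workflow above sidesteps `python -m spacy download en_core_web_lg` by pip-installing the model wheel directly from the spacy-models release page. As a minimal sketch of how that pinned URL is composed (the helper name is ours, not part of spaCy or DACKAR; the pattern matches the URL used in the workflow):

```python
def model_wheel_url(model: str, version: str) -> str:
    """Build the direct-download wheel URL for a pretrained spaCy model.

    spaCy model releases are tagged "<model>-<version>" and ship a
    universal wheel named "<model>-<version>-py3-none-any.whl".
    """
    base = "https://github.com/explosion/spacy-models/releases/download"
    return f"{base}/{model}-{version}/{model}-{version}-py3-none-any.whl"

# The exact URL pinned in the workflow for en_core_web_lg 3.5.0:
print(model_wheel_url("en_core_web_lg", "3.5.0"))
```

Pinning the wheel keeps the model version in lockstep with `spacy==3.5` regardless of what the `spacy download` resolver would pick on the CI machine.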

README.md

Lines changed: 11 additions & 1 deletion
@@ -1,5 +1,15 @@
 # DACKAR
-*Digital Analytics, Causal Knowledge Acquisition and Reasoning for Technical Language Processing*
+*Digital Analytics, Causal Knowledge Acquisition and Reasoning*
+
+A Knowledge Management and Discovery Tool for Equipment Reliability Data
+
+To improve the performance and reliability of highly dependable technological systems such as nuclear power plants, advanced monitoring and health management systems are employed to inform system engineers about observed degradation processes and anomalous behaviors of assets and components. This information is captured as large amounts of data that can be heterogeneous in nature (e.g., numeric, textual). Such volumes of data pose a challenge when system engineers must parse and analyze them to track the historical reliability performance of assets and components. DACKAR tackles this challenge by organizing equipment reliability data in the form of a knowledge graph. DACKAR distinguishes itself from current knowledge-graph-based methods in that model-based systems engineering (MBSE) models are used to capture system architecture and health and performance data. MBSE models serve as the skeleton of the knowledge graph; numeric and textual data elements, once processed, are associated with MBSE model elements. This feature opens the door to new data analytics methods designed to identify causal relations between observed phenomena.
+
+DACKAR is structured as a set of workflows, each designed to process raw data elements (i.e., anomalies, events reported in textual form, MBSE models) and construct or update a knowledge graph. For each workflow, the user can specify the sequence of pipelines that perform specific processing actions on the raw data or on data already processed within the same workflow. Specific guidelines on the formats of the raw data are provided. In addition, each workflow defines a specific data-object; each pipeline is tasked either to process a portion of this data-object or to create knowledge graph data. The available workflows are:
+* mbse_workflow: Workflow to process system and equipment MBSE models
+* anomaly_workflow: Workflow to process numeric data and anomalies
+* tlp_workflow: Workflow to process textual data
+* kg_workflow: Workflow to construct and update knowledge graphs
 
 ## Installation
 
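The README describes workflows as ordered sequences of pipelines that each read or update a shared data-object. A minimal sketch of that pattern, assuming nothing about DACKAR's actual API (all names and the toy pipelines below are hypothetical):

```python
# Sketch of the workflow/pipeline pattern the README describes: a workflow
# runs an ordered list of pipelines, each transforming a shared data-object.
# All names here are illustrative, not DACKAR's actual API.
from typing import Callable

Pipeline = Callable[[dict], dict]  # a pipeline maps data-object -> data-object

def run_workflow(pipelines: list[Pipeline], data_object: dict) -> dict:
    """Apply each pipeline in sequence to the shared data-object."""
    for pipeline in pipelines:
        data_object = pipeline(data_object)
    return data_object

# Toy pipelines standing in for steps of e.g. tlp_workflow:
def split_sentences(obj: dict) -> dict:
    obj["sentences"] = obj["raw_text"].split(". ")
    return obj

def count_tokens(obj: dict) -> dict:
    obj["n_tokens"] = sum(len(s.split()) for s in obj["sentences"])
    return obj

result = run_workflow(
    [split_sentences, count_tokens],
    {"raw_text": "Pump P-101 vibrates. Bearing replaced"},
)
print(result["n_tokens"])  # -> 5
```

The point of the pattern is that each pipeline only touches its portion of the data-object, so pipeline sequences can be reordered or extended per workflow without changing the runner.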

docs/contributors.rst

Lines changed: 4 additions & 0 deletions
@@ -1,2 +1,6 @@
 Contributors
 ============
+
+Congjian Wang: [email protected]
+Diego Mandelli: [email protected]
+Joshua J. Cogliati: [email protected]

docs/support.rst

Lines changed: 2 additions & 5 deletions
@@ -2,11 +2,8 @@
 Support
 =======
 
-The easiest way to get help with the project is to open an issue on Github_.
-
-.. The mailing list at ... is also available for support.
-
-.. _Github: https://github.inl.gov/congjian-wang/DACKAR/issues
+The easiest way to get help with the project is to open an issue on `GitHub`_ .
+.. _Github: https://github.com/idaholab/DACKAR/issues/
 
 Developers:
-----------
