_episodes/01-introduction.md
Lines changed: 2 additions & 0 deletions
@@ -12,6 +12,8 @@ keypoints:
 - This tutorial is brought to you by the DUNE Computing Consortium.
 - The goals are to give you the computing basis to work on DUNE.
 ---
+
+{% include 01-introduction.toc.md %}
 ## DUNE Computing Consortium
 
 The DUNE Computing Consortium works to establish a global computing network that will handle the massive data streams produced by distributing these across the computing grid.
_episodes/03-data-management.md
Lines changed: 38 additions & 13 deletions
@@ -13,7 +13,9 @@ keypoints:
 - Xrootd allows user to stream data files.
 ---
 
-#### Session Video
+{% include 03-data-management.toc.md %}
+
+## Session Video
 
 <!--The session will be captured on video a placed here after the workshop for asynchronous study.-->
 
@@ -77,7 +79,11 @@ If you want to process data using the full power of DUNE computing, you should t
 
 ## How to find and access official data
 
-### What is metacat?
+{% include OfficialDatasets_include.md %}
+
+You can also query the catalogs yourself using [metacat][metacat] and [rucio][rucio] catalogs. Metacat contains information about file content and official datasets, rucio stores the physical location of those files. Files should have entries in both catalogs. Generally you ask metacat first to find the files you want and then ask rucio for their location.
+
+## What is metacat?
 
 Metacat is a file and dataset catalog - it allows you to search for files and datasets that have particular attributes and understand their provenance, including details on all of their processing steps.
 It also allows for querying jointly the file catalog and the DUNE conditions database.
 To find your data you need to specify at the minimum
 
--`core.run_type` (the experiment)
+-`core.run_type` (the experiment: fardet-vd, hd-protodune ...)
 -`core.file_type` (mc or detector)
--`core.data_tier` (the level of processing raw, full-reconstructed, root-tuple)
+-`core.data_tier` (the level of processing raw, full-reconstructed, root-tuple ...)
 
 and when searching for specific types of data
 
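A minimal sketch, not part of this commit, of how the three minimum fields listed in the hunk above combine into a metacat query string. The helper name `build_query` and its argument order are hypothetical; only the query syntax and the `dune:all` scope are taken from the tutorial's own example query.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of metacat): compose a query string from the
# three minimum fields core.run_type, core.file_type and core.data_tier.
build_query() {
  local run_type="$1" file_type="$2" data_tier="$3" limit="${4:-1}"
  printf 'files from dune:all where core.file_type=%s and core.run_type=%s and core.data_tier=%s limit %s\n' \
    "$file_type" "$run_type" "$data_tier" "$limit"
}

# Build and print the query; it can then be passed to the client as:
#   metacat query "$query"
query="$(build_query hd-protodune detector raw 1)"
echo "$query"
```

Further conditions, such as the `core.runs[any]=27331` clause used in the tutorial's example, would be added to the `where` part in the same way.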
@@ -145,7 +151,8 @@ First get metacat if you have not already done so
 token authentication.
 {: .callout} -->
 
-### then do queries to find particular sets of files
+### then do queries to find particular groups of files
+
 
 ~~~
 metacat query "files from dune:all where core.file_type=detector and core.run_type=hd-protodune and core.data_tier=raw and core.runs[any]=27331 limit 1"
@@ -240,10 +247,9 @@ Total size: 17553648200600 (17.554 TB)
 {: .output}
 
 
-
 <!-- To look at all the files in that run you need to use XRootD - **DO NOT TRY TO COPY 4 TB to your local area!!!*** -->
 
-## Official datasets <aname="Official_Datasets"></a>
+<!--## Official datasets <a name="Official_Datasets"></a>
 
 The production group make official datasets which are sets of files which share important characteristics such as experiment, data_tier, data_stream, processing version and processing configuration.
 metacat query -s "files from fardet-vd:fardet-vd__full-reconstructed__v09_81_00d02__reco2_dunevd10kt_anu_1x8x6_3view_30deg_geov3__prodgenie_anu_numu2nue_nue2nutau_dunevd10kt_1x8x6_3view_30deg__out1__v2_official"
+~~~
+{: .language-bash}
+
+~~~
+Files: 20648
+Total size: 34550167782531 (34.550 TB)
+~~~
+{: .output}
 
+this may take a while as that is a big dataset.
 
+
 ### What describes a dataset?
 
-Let's look at the metadata describing that anti-neutrino dataset: the -j means json output
+Let's look at the metadata describing an anti-neutrino dataset: the -j means json output
 
 ~~~
 metacat dataset show -j fardet-vd:fardet-vd__full-reconstructed__v09_81_00d02__reco2_dunevd10kt_anu_1x8x6_3view_30deg_geov3__prodgenie_anu_numu2nue_nue2nutau_dunevd10kt_1x8x6_3view_30deg__out1__v2_official
@@ -386,7 +410,7 @@ You can use any of those keys to refine dataset searches as we did above. You pr
 
 ### What files are in that dataset and how do I use them?
 
-You can either click on a dataset in the web data catalog or:
+You can either locate and click on a dataset in the [web data catalog](https://dune-tech.rice.edu/dunecatalog/) or use the[metacat web interface](https://metacat.fnal.gov:9443/dune_meta_prod/app/gui) or use the command line:
 
 ~~~
 metacat query "files from fardet-vd:fardet-vd__full-reconstructed__v09_81_00d02__reco2_dunevd10kt_anu_1x8x6_3view_30deg_geov3__prodgenie_anu_numu2nue_nue2nutau_dunevd10kt_1x8x6_3view_30deg__out1__v2_official limit 10"
@@ -398,7 +422,7 @@ will list the first 10 files in that dataset (you probably don't want to list al
 You can also use a similar query in your batch job to get the files you want.
 
 
-###Finding those files on disk
+## Finding those files on disk
 
 To find your files, you need to use [Rucio](#Rucio) directly or give the [justIN](https://dunejustin.fnal.gov/docs/tutorials.dune.md) batch system your query and it will locate them for you.
 
@@ -417,7 +441,8 @@ export SAM_EXPERIMENT=dune
 -->
 ## Getting file locations using Rucio
 
-### What is Rucio? <aname="Rucio"></a>
+### What is Rucio?
+<!-- <a name="Rucio"></a> -->
 Rucio is the next-generation Data Replica service and is part of DUNE's new Distributed Data Management (DDM) system that is currently in deployment.
 Rucio has two functions:
 1. A rule-based system to get files to Rucio Storage Elements around the world and keep them there.
@@ -427,7 +452,7 @@ As of the date of the 2025 tutorial:
 - The Rucio client is available in CVMFS and Spack
 - Most DUNE users are now enabled to use it. New users may not automatically be added.
 
-### You will need to authenticate to use read files
+### You will need to authenticate to read files
 
 > #### For SL7 use justin to get a token
 {:.callout}
@@ -498,7 +523,7 @@ which the locations of the file on disk and tape. We can use this to copy the f
 > Try to access the file at manchester using the command:
_episodes/03.2-UPS.md
Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ keypoints:
 > You need to be in the Apptainer to use it.
 > UPS is being replaced by a new [spack][Spack Documentation] system for Alma9. We will be adding a Spack tutorial soon but for now, you need to use SL7/UPS to use the full DUNE code stack.
 >
-> Go back and look at the [SL7/Apptainer]({{ site.baseurl }}/setup.html#SL7_setup) instructions to get an SL7 container for this section.
+> Go back and look at the [SL7/Apptainer]({{ site.baseurl }}/sl7_setup) instructions to get an SL7 container for this section.
 {: .challenge}
 
 An important requirement for making valid physics results is computational reproducibility. You need to be able to repeat the same calculations on the data and MC and get the same answers every time. You may be asked to produce a slightly different version of a plot for example, and the data that goes into it has to be the same every time you run the program.
_episodes/05.1-improve-code-efficiency.md
Lines changed: 4 additions & 2 deletions
@@ -10,11 +10,13 @@ keypoints:
 - CPU, memory, and build time optimizations are possible when good code practices are followed.
 ---
 
-#### Session Video
+## Improve your Code efficiency
+
+### Session Video
 
 The session will be captured on video a placed here after the workshop for asynchronous study.
 
-####Live Notes
+### Live Notes
 
 <!-- Participants are encouraged to monitor and utilize the [Livedoc for May. 2023](https://docs.google.com/document/d/19XMQqQ0YV2AtR5OdJJkXoDkuRLWv30BnHY9C5N92uYs/edit?usp=sharing) to ask questions and learn. For reference, the [Livedoc from Jan. 2023](https://docs.google.com/document/d/1sgRQPQn1OCMEUHAk28bTPhZoySdT5NUSDnW07aL-iQU/edit?usp=sharing) is provided.
_extras/Common-Error-Messages.md
Lines changed: 22 additions & 19 deletions
@@ -5,37 +5,40 @@ keypoints:
 - Errors that people report in doing the tutorial
 ---
 
-- #### `/usr/bin/xauth: unable to write authority file`
-#### `disk quota exceeded error with metacat auth login`
+## Common Error Messages
 
-These likely means your kerberos ticket was not forwarded and you can't access your home are without it. do a kinit in your terminal session. Or possibly you really have filled your home area.
 
-- #### `bash: setup: command not found`
+{% include Common-Error-Messages.toc.md %}
 
-setup is a UPS command. You need to be running in the Apptainer and setup the DUNE ups system - check out the instructions in [SL7 setup]
-({{ site.baseurl }}/sl7_setup)
+### Error: /usr/bin/xauth: unable to write authority file
 
+These likely means your kerberos ticket was not forwarded and you can't access your home are without it. do a kinit in your terminal session. Or possibly you really have filled your home area.
 
-- #### `SyntaxError: future feature annotations is not defined`
+### bash: setup: command not found
 
-This looks like a bad python version, try doing `which python` if it isn't > 3.9 you don't have a modern python version.
+setup is a UPS command. You need to be running in the Apptainer and setup the DUNE ups system - check out the instructions in [SL7 setup]({{ site.baseurl }}/sl7_setup)
 
-- On SL7 we suggest setting up the dunesw as shown in the example setup. alternatively you can
 
-~~~
-setup root -v v6_28_12 -q e26:p3915:prof
-~~~
-{: .language-bash}
+### SyntaxError: future feature annotations is not defined
 
-- On AL9 we suggest loading ROOT which brings in a modern version of python and allows xrootd access to data.
+This looks like a bad python version, try doing `which python` if it isn't > 3.9 you don't have a modern python version.
 
-~~~
-spack load root@6.28.12
-~~~
-{: .language-bash}
+- On SL7 we suggest setting up the dunesw as shown in the example setup. alternatively you can
_extras/ComputerSetup.md
Lines changed: 10 additions & 6 deletions
@@ -12,13 +12,17 @@ keypoints:
 - It is also something almost all people who get paid to program are expected to know well
 ---
 
-## 0. Back up your machine
+## Computer setup
+
+{% include ComputerSetup.toc.md %}
+
+### Back up your machine
 
 We are going to be messing with your operating system at some level so it is extremely wise to do a complete backup of your machine to an external drive right now.
 
 Also turn off automatic updates. Operating system updates can mess with your setup. Generally, back up before doing updates so you can revert if necessary.
 
-##1. Open a unix terminal window
+###Open a unix terminal window
 
 First figure out how to open a terminal on your system. The Carpentries Shell Training has a [section that explains this][New Shell]
 
@@ -35,7 +39,7 @@ On Windows it's a bit more complicated as the underlying operating system is not
 
 
 
-##2. Learn how to use the Unix Shell
+###Learn how to use the Unix Shell
 
 <!-- First figure out [how to open a terminal on your system][New Shell]
 -->
@@ -47,7 +51,7 @@ It tells you how to start a terminal session in Windows, Mac OSX and Unix system
 Please do that [unix shell tutorial][Unix Shell Basics] to learn about the basic command line.
 
 
-##3. Install an x-windows emulator
+###Install an x-windows emulator
 
 #### MacOS
 
@@ -88,7 +92,7 @@ See the information about [Windows]({{ site.baseurl }}/Windows.html) terminal co
 > You should now be ready to go for the ({{ site.baseurl }}/setup)
 {: .callout}
 
-## Extra - Get a compiler/code editor
+###Extra - Get a code editor
 
 Although you will mainly be using python to code to begin with, most HEP code is actually C++ and it is good to have access to a C++ compiler. Bonus is that you normally get a good editor as well.
 
@@ -108,7 +112,7 @@ You can also use vim or emacs if you are old school.
 Likely you should load up the full [Visual Studio][Visual Studio] as it has a nice C++ compiler