# Frequently Asked Questions
## How can I see the code generated by Devito?
After you build an `op = Operator(...)` implementing one or more equations, you can use `print(op)` to see the generated low-level code. The example below builds an operator that takes a 1/2 cell forward-shifted derivative of the `Function` **f** and puts the result in the `Function` **g**.
## How can I see the compilation command with which Devito compiles the generated code

Set the environment variable `DEVITO_LOGGING=DEBUG`. When an `Operator` is compiled, the compilation command used will be emitted to stdout.
If nothing seems to change, it is possible that no compilation is happening under the hood, because all kernels were already compiled in a previous run. In that case you will have to clear the Devito kernel cache. From the Devito root directory, run:
```bash
python scripts/clear_devito_cache.py
```
Devito applies several performance optimizations to improve the number of operations ("operation count") in complex expressions. These optimizations are designed to do a really good job but also be reasonably fast. One such pass attempts to factorize as many common terms as possible in expressions in order to reduce the operation count. We will construct a demonstrative example below that has a common term that is _not_ factored out by the Devito optimization. The difference in floating-point operations per output point for the factoring of that term is about 10 percent, and the generated C is different, but numerical outputs of running the two different operators are indistinguishable to machine precision. In terms of actual performance, the (few) missed factorization opportunities may not necessarily be a relevant issue: as long as the code is not heavily compute-bound, the runtimes may only be slightly higher than in the optimally-factorized version.
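The idea of factoring out a common term to reduce the operation count can be illustrated with plain SymPy (this sketches the concept only; it is not Devito's internal optimization pass):

```python
import sympy as sp

a, b, x = sp.symbols('a b x')

# Two occurrences of the common term x: 2 multiplications + 1 addition
expr = a*x + b*x
# Factored form x*(a + b): 1 multiplication + 1 addition
factored = sp.factor_terms(expr)

print(sp.count_ops(expr), sp.count_ops(factored))  # 3 2
```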
## How to get the list of Devito environment variables
You can get the list of environment variables with the following python code:
```python
from devito import print_defaults
print_defaults()
```
## How do you run the unit tests from the command line
In addition to the [tutorials](https://www.devitoproject.org/devito/tutorials.html), the unit tests provide an excellent way to see how the Devito API works with small self-contained examples. Individual unit tests can be exercised from the command line with `pytest`.
## What is the difference between f() and f[] notation
Devito offers a functional language to express finite difference operators. This is introduced [here](https://github.com/devitocodes/devito/blob/master/examples/userapi/01_dsl.ipynb) and systematically used throughout our examples and tutorials. The language relies on what in jargon we call the "f() notation".
```python
>>> from devito import Grid, Function
>>> grid = Grid(shape=(5, 6))
>>> f = Function(name='f', grid=grid, space_order=2)
>>> f.dx
Derivative(f(x, y), x)
```
Sometimes, one wishes to escape the constraints of the language. Instead of taking derivatives, other special operations are required. Or perhaps, a specific grid point needs to be accessed. In such a case, one could use the "f[] notation" or "indexed notation". Following on from the example above:
```python
>>> x, y = grid.dimensions
>>> f[x + 1000, y]
f[x + 1000, y]
```
Indexed objects can be used at will to construct `Eq`s, and they can be mixed with objects stemming from the "f() notation".
```python
>>> f.dx + f[x + 1000, y]
Derivative(f(x, y), x) + f[x + 1000, y]
```
## What's up with object.data
The `.data` property, which is associated with objects such as `Constant`, `Function` and `SparseFunction` (along with their subclasses), represents the numerical value of the data associated with that particular object. For example, a `Constant` has a single numerical value associated with it, as shown in the following snippet:
```python
from devito import Constant

c = Constant(name='c')
c.data = 2.7

print(c.data)
```

```default
2.7
```
Then, a `Function` defined on a `Grid` will have a data value associated with each of the grid points (as shown in the snippet below) and so forth.
Here we see the `grid` has been created with the 'default' dimensions `x` and `y`. If a grid is created and passed a shape of `(5, 5, 5)` we'll see that in addition it has a `z` dimension. However, what if we want to create a grid with, say, a shape of `(5, 5, 5, 5)`? For this case, we've now run out of the dimensions defined by default and hence need to create our own dimensions to achieve this. This can be done via, e.g.,
## As time increases in the finite difference evolution, are wavefield arrays "swapped" as you might see in C/C++ code
In C/C++ code using two wavefield arrays for second-order acoustics, you might see code like the following to "swap" the wavefield arrays at each time step:
```C
float *p_tmp = p_old;
p_old = p_cur;
p_cur = p_tmp;
```
Second, you must know that these objects are subjected to so-called reconstruction during compilation. Objects are immutable inside Devito; therefore, even a straightforward symbolic transformation such as `f[x] -> f[y]` boils down to performing a reconstruction, that is, creating a whole new object. Since `f` carries around several attributes (e.g., shape, grid, dimensions), each time Devito performs a reconstruction, we only want to specify which attributes are changing -- not all of them, as it would make the code ugly and incredibly complicated. The solution to this problem is that all the base symbolic types inherit from a common base class called `Reconstructable`; a `Reconstructable` object has two special class attributes, called `__rargs__` and `__rkwargs__`. If a subclass adds a new positional or keyword argument to its `__init_finalize__`, it must also be added to `__rargs__` or `__rkwargs__`, respectively. This will provide Devito with enough information to perform a reconstruction when it's needed during compilation. The following example should clarify:
```python
class Foo(Reconstructable):
    __rargs__ = ('a', 'b')
    __rkwargs__ = ('c',)
```
You are unlikely to care about how reconstruction works in practice, but here are a few examples for `a = Foo(3, 5)` to give you more context.
```python
a._rebuild() -> "x(3, 5, 4)"  (i.e., a copy of `a`)
a._rebuild(4) -> "x(4, 5, 4)"
a._rebuild(4, 7) -> "x(4, 7, 4)"
```
* via env vars: use a [CustomCompiler](https://github.com/opesci/devito/blob/v4.0/devito/compiler.py#L446) -- just leave the `DEVITO_ARCH` environment variable unset or set it to `'custom'`. Then, `export CFLAGS="..."` to tell Devito to use the exported flags in place of the default ones.
* programmatically: subclass one of the compiler classes and set `self.cflags` to whatever you need. Do not forget to add the subclass to the [compiler registry](https://github.com/opesci/devito/blob/v4.0/devito/compiler.py#L472). For example, you could do
```python
from devito import configuration, compiler_registry
from devito.compiler import GNUCompiler
```
Up to and including Devito v3.5, domain decomposition occurs along the fastest axis.
## How should I use MPI on multi-socket machines
In general, you should use one MPI rank per NUMA node on a multi-socket machine. You can find the number of NUMA nodes with the `lscpu` command. For example, here is the relevant part of the `lscpu` output on a 2-socket AMD 7502 machine with 2 NUMA nodes:
```default
Architecture:        x86_64
CPU(s):              64
On-line CPU(s) list: 0-63
NUMA node0 CPU(s):   0-31
NUMA node1 CPU(s):   32-63
```
There are a few things you may want to check:
* To refer to the actual ("global") shape of the domain, you should always use `grid.shape` (or analogously through a `Function`, `f.grid.shape`). And unless you know well what you're doing, you should never use the function shape, namely `f.shape` or `f.data.shape`, as that will return the "local" domain shape, that is the data shape after domain decomposition, which might differ across the various MPI ranks.
[top](#Frequently-Asked-Questions)
## Can I manually modify the C code generated by Devito and test these modifications
Yes, as of Devito v3.5 it is possible to modify the generated C code and run it inside Devito. First you need to get the C file generated for a given `Operator`. Run your code in `DEBUG` mode:
```bash
DEVITO_LOGGING=DEBUG python your_code.py
```
The path to the generated C file will be shown in the debug output.
You can now open the C file, do the modifications you like, and save them. Finally, rerun the same program but this time with the _Devito JIT backdoor_ enabled:
```bash
DEVITO_JIT_BACKDOOR=1 python your_code.py
```
This will force Devito to recompile and link the modified C code.
If you have a large codebase with many `Operator`s, here's a [trick](https://github.com/devitocodes/devito/wiki/Efficient-use-of-DEVITO_JIT_BACKDOOR-in-large-codes-with-many-Operators) to speed up your hacking with the JIT backdoor.
An excerpt of the performance profile emitted by Devito upon running an Operator is provided below. In this case, the Operator has two sections, ``section0`` and ``section1``, and ``section1`` consists of two consecutive 6D iteration spaces whose size is given between angle brackets.
```default
Global performance: [OI=0.16, 8.00 GFlops/s, 0.04 GPts/s]
Local performance:
  * section0<136,136,136> run in 0.10 s [OI=0.16, 0.14 GFlops/s]
```
To calculate the GFlops/s performance, Devito multiplies the floating-point operations counted at compile time by the size of the iteration space, and it does so at the granularity of individual expressions.
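As a back-of-the-envelope illustration of that multiplication (all numbers below are invented for the example, not real Devito output):

```python
# Hypothetical expression: 10 flops per grid point, evaluated over a
# 512^3 grid for 100 time steps, measured at a 4.0 s runtime.
flops_per_point = 10
points = 512**3
time_steps = 100
runtime_s = 4.0

gflops = flops_per_point * points * time_steps / runtime_s / 1e9
print(round(gflops, 2))  # 33.55
```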