Add docs (#13)
* Update docstrings and add docs
AlexeyKozhevin authored Jul 30, 2024
1 parent 7f47b77 commit 34dea88
Showing 20 changed files with 530 additions and 129 deletions.
28 changes: 28 additions & 0 deletions .github/workflows/doc.yml
@@ -0,0 +1,28 @@
name: documentation

on: [push, pull_request, workflow_dispatch]

permissions:
  contents: write

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3
      - name: Install dependencies
        run: |
          pip install sphinx sphinx_rtd_theme myst_parser
          pip install .
      - name: Sphinx build
        run: |
          sphinx-build docs _build
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}
        with:
          publish_branch: gh-pages
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: _build/
          force_orphan: true
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
5 changes: 5 additions & 0 deletions docs/api/segfast.loader.rst
@@ -0,0 +1,5 @@
======
Loader
======

.. autofunction:: segfast.loader.open
8 changes: 8 additions & 0 deletions docs/api/segfast.memmap_loader.rst
@@ -0,0 +1,8 @@
============
MemmapLoader
============

.. autoclass:: segfast.memmap_loader.MemmapLoader
    :members:
    :undoc-members:
    :member-order: bysource
12 changes: 12 additions & 0 deletions docs/api/segfast.rst
@@ -0,0 +1,12 @@
===
API
===

.. toctree::
    :maxdepth: 5

    segfast.loader
    segfast.memmap_loader
    segfast.segyio_loader
    segfast.trace_header_spec
    segfast.utils
13 changes: 13 additions & 0 deletions docs/api/segfast.segyio_loader.rst
@@ -0,0 +1,13 @@
============
SegyioLoader
============

.. autoclass:: segfast.segyio_loader.SegyioLoader
    :members:
    :undoc-members:
    :member-order: bysource

.. autoclass:: segfast.segyio_loader.SafeSegyioLoader
    :members:
    :undoc-members:
    :member-order: bysource
8 changes: 8 additions & 0 deletions docs/api/segfast.trace_header_spec.rst
@@ -0,0 +1,8 @@
===============
TraceHeaderSpec
===============

.. autoclass:: segfast.trace_header_spec.TraceHeaderSpec
    :members:
    :undoc-members:
    :member-order: bysource
5 changes: 5 additions & 0 deletions docs/api/segfast.utils.rst
@@ -0,0 +1,5 @@
=====
Utils
=====

.. autoclass:: segfast.utils.ForPoolExecutor
57 changes: 57 additions & 0 deletions docs/conf.py
@@ -0,0 +1,57 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

import sys, os
sys.path.insert(0, os.path.abspath('..'))
import segfast

master_doc = 'index'

project = 'segfast'
author = 'Analysis Center'
copyright = '2024, ' + author

release = segfast.__version__
version = '.'.join(release.split('.')[:2])  # short X.Y version; `release` keeps the full string

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.doctest',
    'sphinx.ext.coverage',
    'sphinx.ext.mathjax',
    'sphinx.ext.viewcode',
    'sphinx.ext.githubpages',
    'sphinx.ext.intersphinx',
    'sphinx.ext.napoleon',
    'sphinx_rtd_theme',
]

templates_path = ['_templates']
exclude_patterns = []
language = 'en'


# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_title = "SegFast"
html_theme = "sphinx_rtd_theme"
html_static_path = ['_static']
html_theme_options = {
    'logo_only': False
}

# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {
    'python': ('https://docs.python.org/', None),
    'numpy': ('https://numpy.org/doc/stable/', None),
    'segyio': ('https://segyio.readthedocs.io/en/latest/', None)
}
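Sphinx distinguishes `release` (the full version string) from `version` (the short major.minor form). As a minimal sketch, assuming a dotted release string such as `'1.0.2'` (a made-up value, not the actual segfast version), the short form can be derived by keeping only the leading components. The helper name `short_version` is hypothetical and is not part of this commit.

```python
def short_version(release: str, parts: int = 2) -> str:
    """Return the first `parts` dot-separated components of `release`."""
    return ".".join(release.split(".")[:parts])


# Hypothetical release string, used only for illustration.
example_release = "1.0.2"
example_short = short_version(example_release)

# Splitting and re-joining without a slice returns the input unchanged.
unchanged = ".".join(example_release.split("."))
```

Here `example_short` keeps only the major and minor components, while `unchanged` is identical to the full release string.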
40 changes: 40 additions & 0 deletions docs/index.rst
@@ -0,0 +1,40 @@
.. segfast documentation master file, created by
   sphinx-quickstart on Thu Feb 1 14:09:14 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

segfast documentation
=====================

**segfast** is a library for interacting with SEG-Y seismic data. Main features are:

* Faster read access to the data: both trace headers and values
* Optional buffering, where the user can provide preallocated memory to load the data into
* Convenient API that relies on :class:`numpy.memmap` for most operations, while providing
  `segyio <https://segyio.readthedocs.io/en/latest/>`_ as a fallback engine


Implementation details
----------------------
We rely on **segyio** to infer file-wide parameters.

For headers and traces, we use custom methods of reading binary data.

Main differences from the **segyio** C++ implementation:

- we read all of the requested headers in one file-wide sweep, which speeds up loading by an order of magnitude
  compared to the **segyio** sequential read of every requested header.
  We also do that in multiple processes across chunks.

- a memory map over trace data is used for loading values. Avoiding redundant copies and relying on
  :mod:`numpy` indexing speeds up reading, especially when traces are sliced along the samples axis.
  This is particularly relevant when loading horizontal (depth) slices.


.. toctree::
    :maxdepth: 1
    :titlesonly:

    installation
    start
    segy
    api/segfast
14 changes: 14 additions & 0 deletions docs/installation.rst
@@ -0,0 +1,14 @@
Installation
============

* With ``pip``/``pip3``:

  .. code-block:: bash

    pip3 install segfast

* Developer version (add ``--depth 1`` if needed)

  .. code-block:: bash

    git clone https://github.com/analysiscenter/segfast.git
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.https://www.sphinx-doc.org/
	exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
29 changes: 29 additions & 0 deletions docs/segy.rst
@@ -0,0 +1,29 @@
SEG-Y description
=================

The most complete description can be found in `the official SEG-Y specification <https://library.seg.org/pb-assets/technical-standards/seg_y_rev2_0-mar2017-1686080998003.pdf>`_, but here we give
a brief introduction to the SEG-Y format.

A SEG-Y file is a binary file divided into several blocks:

- file-wide information block, which in most cases takes the first 3600 bytes:

  - **textual header**: the first 3200 bytes are reserved for textual info about the file. Most software uses
    this header to keep acquisition metadata, date of creation, author, etc.
  - **binary header**: bytes 3200–3600 contain file-wide headers, which describe the number of traces, the format used
    for storing numbers, the number of samples for each trace, acquisition parameters, etc.
  - (optional) bytes from 3600 onwards can be used to store the **extended textual information**. If there is such a header,
    this is indicated by a value in the binary header (bytes 3200–3600).

- a sequence of traces, where each trace is a combination of its header and signal data:

  - **trace header** takes the first 240 bytes of each trace and describes the meta info about it: shot/receiver coordinates,
    the method of acquisition, current trace length, etc. Analogously to the file-wide headers, each trace can also
    have extended headers.
  - **trace data** is usually an array of amplitude values, which can be stored in various numerical types.
    As the original SEG-Y is quite old (1975), one of those numerical formats is IBM float,
    which is very different from standard IEEE floats; therefore, special caution is required to
    correctly decode values from such files.
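
To make the difference between IBM and IEEE floats concrete, here is a minimal pure-Python sketch of decoding a single 4-byte IBM hexadecimal float (1 sign bit, 7-bit base-16 exponent biased by 64, 24-bit fraction). The function name ``ibm32_to_float`` is hypothetical and this is not how segfast decodes values; it only illustrates why the same bytes cannot be read as IEEE 754.

```python
import struct


def ibm32_to_float(raw: bytes) -> float:
    """Decode one big-endian 4-byte IBM hexadecimal float."""
    (word,) = struct.unpack(">I", raw)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F           # power of 16, biased by 64
    fraction = (word & 0x00FFFFFF) / 2**24   # 24-bit fraction in [0, 1)
    return sign * fraction * 16.0 ** (exponent - 64)


ibm_bytes = bytes.fromhex("C276A000")
as_ibm = ibm32_to_float(ibm_bytes)            # decoded as IBM float
(as_ieee,) = struct.unpack(">f", ibm_bytes)   # same bytes misread as IEEE 754
```

The two results differ substantially, so a file whose format code indicates IBM floats has to go through a conversion step of this kind before the amplitudes are usable.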

Most SEG-Y files are written with a constant size of each trace, although the standard itself allows
for variable-sized traces. We do not work with variable-sized files.
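
With a constant trace size, the position of every trace follows from simple arithmetic. The sketch below assumes a big-endian file without extended textual headers and 4-byte samples; the helper names are hypothetical. The number of samples per trace is read from bytes 3221-3222 of the file (1-based), which is where the standard binary header stores it.

```python
import struct

TEXT_HEADER_SIZE = 3200
BINARY_HEADER_SIZE = 400
TRACE_HEADER_SIZE = 240


def read_samples_per_trace(binary_header: bytes) -> int:
    """Read the samples-per-trace field from the 400-byte binary header."""
    # Bytes 3221-3222 of the file (1-based) are offset 20 inside the binary header.
    (n_samples,) = struct.unpack_from(">H", binary_header, 3221 - 1 - TEXT_HEADER_SIZE)
    return n_samples


def trace_offset(index: int, n_samples: int, sample_size: int = 4) -> int:
    """Byte offset of the start of trace `index` (its header) in the file."""
    trace_size = TRACE_HEADER_SIZE + n_samples * sample_size
    return TEXT_HEADER_SIZE + BINARY_HEADER_SIZE + index * trace_size


# Synthetic binary header with a made-up value of 1500 samples per trace.
header = bytearray(BINARY_HEADER_SIZE)
struct.pack_into(">H", header, 20, 1500)

n_samples = read_samples_per_trace(bytes(header))
first_trace = trace_offset(0, n_samples)
third_trace = trace_offset(2, n_samples)
```

This fixed stride is what allows the whole trace region to be described by one array-like layout; with variable-sized traces the offsets would have to be discovered by walking the file.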
31 changes: 31 additions & 0 deletions docs/start.rst
@@ -0,0 +1,31 @@
Quick start
===========

* Open the file:

  .. code-block:: python

    import segfast
    segy_file = segfast.open('/path/to/file.sgy')

* Load headers:

  .. code-block:: python

    headers = segy_file.load_headers(['CDP_X', 'CDP_Y', 'INLINE_3D', 'CROSSLINE_3D'])

* Load inline:

  .. code-block:: python

    traces_idx = headers[headers['INLINE_3D'] == INLINE_IDX].index
    inline = segy_file.load_traces(traces_idx)

* Load certain depths from all traces:

  .. code-block:: python

    segy_file.load_depth_slices(DEPTHS)

  The resulting array will have shape ``(n_traces, len(DEPTHS))``, so it must be reshaped
  to match the shape of the field.
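
One possible way to do that post-processing is sketched below. It is an assumption-laden illustration, not part of segfast: the helper ``to_field_shape`` is hypothetical, the input arrays are synthetic stand-ins for ``headers['INLINE_3D']``, ``headers['CROSSLINE_3D']`` and the loaded depth slices, and the slices are assumed to be laid out as ``(n_traces, n_depths)`` as stated above.

```python
import numpy as np


def to_field_shape(values, inlines, crosslines, fill_value=np.nan):
    """Scatter per-trace values of shape (n_traces, n_depths) onto an
    (n_inlines, n_crosslines, n_depths) grid; missing traces get `fill_value`."""
    values = np.asarray(values, dtype=float)
    unique_il, il_pos = np.unique(inlines, return_inverse=True)
    unique_xl, xl_pos = np.unique(crosslines, return_inverse=True)
    grid = np.full((len(unique_il), len(unique_xl), values.shape[1]), fill_value)
    grid[il_pos, xl_pos] = values
    return grid


# Synthetic survey: 2 inlines x 2 crosslines, traces stored in arbitrary order.
inlines = np.array([2, 1, 2, 1])
crosslines = np.array([6, 5, 5, 6])
slices = np.array([[3., 13.], [0., 10.], [2., 12.], [1., 11.]])  # (n_traces, n_depths)

field = to_field_shape(slices, inlines, crosslines)
```

Indexing ``field[..., k]`` then gives the ``k``-th requested depth as a 2D map ordered by inline and crossline number.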
2 changes: 1 addition & 1 deletion pylintrc
@@ -15,7 +15,7 @@ variable-rgx=(.*[a-z][a-z0-9_]{1,30}|[a-z_])$ # snake_case + single letters
 argument-rgx=(.*[a-z][a-z0-9_]{1,30}|[a-z_])$ # snake_case + single letters

 [MESSAGE CONTROL]
-disable=no-value-for-parameter, no-self-use, too-few-public-methods, unsubscriptable-object, no-member, too-many-lines,
+disable=no-value-for-parameter, too-few-public-methods, unsubscriptable-object, no-member, too-many-lines,
         arguments-differ, too-many-locals, import-error, cyclic-import, duplicate-code, relative-beyond-top-level,
         unused-argument, too-many-public-methods, invalid-name, attribute-defined-outside-init, arguments-renamed,
         abstract-method, no-name-in-module, import-self
21 changes: 20 additions & 1 deletion segfast/loader.py
@@ -5,7 +5,26 @@


 def Loader(path, engine='memmap', endian='big', strict=False, ignore_geometry=True):
-    """ Selector class for loading SEG-Y with either segyio-based loader or memmap-based one. """
+    """ Selector class for loading SEG-Y with either segyio-based loader or memmap-based one.
+
+    Parameters
+    ----------
+    path : str
+        Path to the SEG-Y file
+    engine : 'memmap' or 'segyio'
+        Engine to load data from file: ``'memmap'`` is based on :class:`numpy.memmap` created for the whole file and
+        ``'segyio'`` is for using **segyio** library instruments. In any case, **segyio** is used to load information
+        about the entire file (e.g. ``'sample_interval'`` or ``'shape'``).
+    endian : 'big' or 'little'
+        Byte order in the file.
+    strict : bool
+        See :func:`segyio.open`
+    ignore_geometry : bool
+        See :func:`segyio.open`
+    Returns
+    -------
+    :class:`~.memmap_loader.MemmapLoader` or :class:`~.segyio_loader.SegyioLoader`
+    """
     loader_class = _select_loader_class(engine)
     return loader_class(path=path, endian=endian, strict=strict, ignore_geometry=ignore_geometry)
 open = File = Loader
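
The body of `_select_loader_class` is not shown in this hunk. As a rough sketch of how such engine dispatch could look, assuming a plain name-to-class mapping, the example below uses stand-in classes and hypothetical names (`MemmapLoaderStub`, `SegyioLoaderStub`, `select_loader_class`, `open_file`); it does not reproduce the actual segfast code.

```python
class MemmapLoaderStub:
    """Stand-in for a memmap-based loader; not the real segfast class."""
    def __init__(self, path, **kwargs):
        self.path, self.kwargs = path, kwargs


class SegyioLoaderStub:
    """Stand-in for a segyio-based loader; not the real segfast class."""
    def __init__(self, path, **kwargs):
        self.path, self.kwargs = path, kwargs


# Hypothetical registry mapping engine names to loader classes.
ENGINES = {"memmap": MemmapLoaderStub, "segyio": SegyioLoaderStub}


def select_loader_class(engine):
    """Return the loader class registered for `engine` (case-insensitive)."""
    try:
        return ENGINES[engine.lower()]
    except KeyError:
        raise ValueError(f"Unknown engine {engine!r}; expected one of {sorted(ENGINES)}") from None


def open_file(path, engine="memmap", **kwargs):
    """Instantiate the loader chosen by `engine`, forwarding the other arguments."""
    return select_loader_class(engine)(path=path, **kwargs)


default_loader = open_file("/path/to/file.sgy")
fallback_loader = open_file("/path/to/file.sgy", engine="segyio", endian="big")
```

Keeping the selection in one small function means the public entry point stays a thin wrapper, and an unsupported engine name fails early with a clear message.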