Releases: MobileTeleSystems/data-rentgen

0.2.1 (2025-04-07)

07 Apr 12:03
a4f8cf2

Improvements

  • Reduce Docker image size by a factor of 2.
  • Change Docker image user from root to data-rentgen to improve security.
  • SBOM file is now generated on release.

0.2.0 (2025-03-25)

25 Mar 13:49
9c40220

TL;DR

  • Implemented column lineage support.
  • HDFS/S3 partitions are now truncated from table path.
  • Added total run/operation statistics (input/output bytes, rows, files).
  • Lineage graph UX improvements.
  • Kafka -> consumer integration improvements.

Breaking Changes

  • Change response schema of GET /operations. (158)

    Operation properties are moved to the data key, and a new statistics key is added. This allows showing operation statistics in the UI without building the lineage graph.

    Response examples

    Before:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "kind": "OPERATION",
                "id": "00000000-0000-0000-0000-000000000000",
                "name": "abc",
                "description": "some",
                // ...
            }
        ],
    }

    After:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "id": "00000000-0000-0000-0000-000000000000",
                "data": {
                    "id": "00000000-0000-0000-0000-000000000000",
                    "name": "abc",
                    "description": "some",
                    // ...
                },
                "statistics": {
                    "inputs": {
                        "total_datasets": 2,
                        "total_bytes": 123456,
                        "total_rows": 100,
                        "total_files": 0,
                    },
                    "outputs": {
                        "total_datasets": 2,
                        "total_bytes": 123456,
                        "total_rows": 100,
                        "total_files": 0,
                    },
                },
            }
        ],
    }
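A client that has to work against servers on both sides of this change could read operation items through a small adapter. The following is a minimal sketch, not part of Data.Rentgen: the helper names `operation_properties` and `operation_statistics` are hypothetical, and the sample items only mirror the fields shown in the examples above.

```python
from typing import Any, Optional


def operation_properties(item: dict[str, Any]) -> dict[str, Any]:
    """Return operation properties from either response shape."""
    if "data" in item:
        # New shape: properties are nested under the "data" key.
        return item["data"]
    # Old shape: properties sit on the item itself, next to "kind".
    return {key: value for key, value in item.items() if key != "kind"}


def operation_statistics(item: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the "statistics" object, or None for the old shape."""
    return item.get("statistics")


# Sample items mirroring the "Before" and "After" examples (hypothetical data).
old_item = {
    "kind": "OPERATION",
    "id": "00000000-0000-0000-0000-000000000000",
    "name": "abc",
    "description": "some",
}
new_item = {
    "id": "00000000-0000-0000-0000-000000000000",
    "data": {
        "id": "00000000-0000-0000-0000-000000000000",
        "name": "abc",
        "description": "some",
    },
    "statistics": {
        "inputs": {"total_datasets": 2, "total_bytes": 123456, "total_rows": 100, "total_files": 0},
        "outputs": {"total_datasets": 2, "total_bytes": 123456, "total_rows": 100, "total_files": 0},
    },
}

# Both shapes yield the same properties; only the new one carries statistics.
same_properties = operation_properties(old_item) == operation_properties(new_item)
```

Checking for the `data` key rather than a version number keeps the adapter independent of how the server version is discovered.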
  • Change response schema of GET /runs. (159)

    Run properties are moved to the data key, and a new statistics key is added. This allows showing run statistics in the UI without building the lineage graph.

    Response examples

    Before:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "kind": "RUN",
                "id": "00000000-0000-0000-0000-000000000000",
                "external_id": "abc",
                "description": "some",
                // ...
            }
        ],
    }

    After:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "id": "00000000-0000-0000-0000-000000000000",
                "data": {
                    "id": "00000000-0000-0000-0000-000000000000",
                    "external_id": "abc",
                    "description": "some",
                    // ...
                },
                "statistics": {
                    "inputs": {
                        "total_datasets": 2,
                        "total_bytes": 123456,
                        "total_rows": 100,
                        "total_files": 0,
                    },
                    "outputs": {
                        "total_datasets": 2,
                        "total_bytes": 123456,
                        "total_rows": 100,
                        "total_files": 0,
                    },
                    "operations": {
                        "total_operations": 10,
                    },
                },
            }
        ],
    }
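With statistics embedded in each run, totals can be computed straight from the list response. Below is a rough sketch that sums `total_bytes` over the `items` of a GET /runs response; the function name `total_io_bytes` and the sample values are illustrative assumptions, and runs without a `statistics` key are simply counted as zero.

```python
from typing import Any, Iterable


def total_io_bytes(runs: Iterable[dict[str, Any]]) -> dict[str, int]:
    """Sum input and output bytes over run items in the new response shape."""
    totals = {"inputs": 0, "outputs": 0}
    for run in runs:
        stats = run.get("statistics", {})
        for direction in totals:
            # Missing sections or counters are treated as zero bytes.
            totals[direction] += stats.get(direction, {}).get("total_bytes", 0)
    return totals


# Hypothetical "items" list with three runs; the last one has no statistics.
items = [
    {
        "id": "00000000-0000-0000-0000-000000000000",
        "data": {"id": "00000000-0000-0000-0000-000000000000", "external_id": "abc"},
        "statistics": {
            "inputs": {"total_datasets": 2, "total_bytes": 123456, "total_rows": 100, "total_files": 0},
            "outputs": {"total_datasets": 2, "total_bytes": 123456, "total_rows": 100, "total_files": 0},
            "operations": {"total_operations": 10},
        },
    },
    {
        "id": "11111111-1111-1111-1111-111111111111",
        "data": {"id": "11111111-1111-1111-1111-111111111111", "external_id": "def"},
        "statistics": {
            "inputs": {"total_datasets": 1, "total_bytes": 1000, "total_rows": 5, "total_files": 0},
            "outputs": {"total_datasets": 0, "total_bytes": 0, "total_rows": 0, "total_files": 0},
            "operations": {"total_operations": 1},
        },
    },
    {"id": "22222222-2222-2222-2222-222222222222", "data": {}},
]

totals = total_io_bytes(items)
```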
  • Change response schema of GET /locations. (160)

    Location properties are moved to the data key, and a new statistics key is added. This allows showing location statistics in the UI.

    Response examples

    Before:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "kind": "LOCATION",
                "id": 123,
                "name": "rnd_dwh",
                "type": "hdfs",
                // ...
            }
        ],
    }

    After:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "id": "123",
                "data": {
                    "id": "123",
                    "name": "rnd_dwh",
                    "type": "hdfs",
                    // ...
                },
                "statistics": {
                    "datasets": {"total_datasets": 2},
                    "jobs": {"total_jobs": 0},
                },
            }
        ],
    }

    The same change applies to PATCH /locations/:id:

    Response examples

    Before:

    {
        "kind": "LOCATION",
        "id": 123,
        "name": "abc",
        // ...
    }

    After:

    {
        "id": "123",
        "data": {
            "id": "123",
            "name": "abc",
            // ...
        },
        "statistics": {
            "datasets": {"total_datasets": 2},
            "jobs": {"total_jobs": 0},
        },
    }
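In the location examples above, `id` is the number 123 before the change and the string "123" after it. Client code that compares or stores ids may want to normalize them. The sketch below assumes nothing beyond the two shapes shown; `location_id` is a hypothetical helper name.

```python
from typing import Any


def location_id(item: dict[str, Any]) -> str:
    """Return the location id as a string for either response shape."""
    # New shape nests properties under "data"; old shape has them at the top level.
    properties = item.get("data", item)
    return str(properties["id"])


# Sample PATCH /locations/:id responses mirroring the examples (hypothetical data).
before = {"kind": "LOCATION", "id": 123, "name": "abc"}
after = {
    "id": "123",
    "data": {"id": "123", "name": "abc"},
    "statistics": {
        "datasets": {"total_datasets": 2},
        "jobs": {"total_jobs": 0},
    },
}

ids_match = location_id(before) == location_id(after)
```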
  • Change response schema of GET /datasets. (161)

    Dataset properties are moved to the data key. This makes the API response more consistent with other endpoints (e.g. GET /runs, GET /operations).

    Response examples

    Before:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "kind": "DATASET",
                "id": 123,
                "name": "abc",
                // ...
            }
        ],
    }

    After:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "id": "123",
                "data": {
                    "id": "123",
                    "name": "abc",
                    // ...
                },
            }
        ],
    }
  • Change response schema of GET /jobs. (162)

    Job properties are moved to the data key. This makes the API response more consistent with other endpoints (e.g. GET /runs, GET /operations).

    Response examples

    Before:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "kind": "JOB",
                "id": 123,
                "name": "abc",
                // ...
            }
        ],
    }

    After:

    {
        "meta": {
            // ...
        },
        "items": [
            {
                "id": "123",
                "data": {
                    "id": "123",
                    "name": "abc",
                    // ...
                },
            }
        ],
    }
  • Change response schema of GET /:entity/lineage. (164)

    The list of all nodes (e.g. list[Node]) is split by node type and converted to a map (e.g. dict[str, Dataset], dict[str, Job]).

    The list of all relations (e.g. list[Relation]) is split by relation type (e.g. list[DatasetSymlink], list[Input]).
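The regrouping described here can be sketched on the client side, for example to feed old-style responses into code written for the new shape. The relation keys (`parents`, `symlinks`, `inputs`, `outputs`) are taken from the response example for this endpoint. The node map keys are not visible in these notes, so `index_nodes` below groups by the raw `kind` value purely as a placeholder assumption. Both function names are hypothetical.

```python
from typing import Any

# Old relation "kind" -> key used in the new "relations" object.
RELATION_KEYS = {
    "PARENT": "parents",
    "SYMLINK": "symlinks",
    "INPUT": "inputs",
    "OUTPUT": "outputs",
}


def _with_string_id(ref: dict[str, Any]) -> dict[str, Any]:
    # Ids appear as strings in the new response example.
    return {**ref, "id": str(ref["id"])}


def group_relations(relations: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Split a flat relation list into one list per relation type."""
    grouped: dict[str, list[dict[str, Any]]] = {key: [] for key in RELATION_KEYS.values()}
    for relation in relations:
        grouped[RELATION_KEYS[relation["kind"]]].append(
            {"from": _with_string_id(relation["from"]), "to": _with_string_id(relation["to"])}
        )
    return grouped


def index_nodes(nodes: list[dict[str, Any]]) -> dict[str, dict[str, dict[str, Any]]]:
    """Turn a flat node list into per-kind maps keyed by string id.

    Assumption: the outer keys here are the raw "kind" values, which may
    differ from the key names used by the real API.
    """
    indexed: dict[str, dict[str, dict[str, Any]]] = {}
    for node in nodes:
        indexed.setdefault(node["kind"], {})[str(node["id"])] = node
    return indexed


# Hypothetical old-style payload with two relations and two nodes.
old_relations = [
    {
        "kind": "PARENT",
        "from": {"kind": "JOB", "id": 123},
        "to": {"kind": "RUN", "id": "00000000-0000-0000-0000-000000000000"},
    },
    {
        "kind": "SYMLINK",
        "from": {"kind": "DATASET", "id": 234},
        "to": {"kind": "DATASET", "id": 999},
    },
]
old_nodes = [
    {"kind": "DATASET", "id": 123, "name": "abc"},
    {"kind": "JOB", "id": 234, "name": "cde"},
]

relations_by_type = group_relations(old_relations)
nodes_by_kind = index_nodes(old_nodes)
```

Keying nodes by id turns node lookup for a relation endpoint into a dictionary access instead of a scan over the whole node list.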

    Response examples

    Before:

    {
        "relations": [
            {
                "kind": "PARENT",
                "from": {"kind": "JOB", "id": 123},
                "to": {"kind": "RUN", "id": "00000000-0000-0000-0000-000000000000"},
            },
            {
                "kind": "SYMLINK",
                "from": {"kind": "DATASET", "id": 234},
                "to": {"kind": "DATASET", "id": 999},
            },
            {
                "kind": "INPUT",
                "from": {"kind": "DATASET", "id": 234},
                "to": {"kind": "OPERATION", "id": "11111111-1111-1111-1111-111111111111"},
            },
            {
                "kind": "OUTPUT",
                "from": {"kind": "OPERATION", "id": "11111111-1111-1111-1111-111111111111"},
                "to": {"kind": "DATASET", "id": 234},
            },
        ],
        "nodes": [
            {"kind": "DATASET", "id": 123, "name": "abc"},
            {"kind": "JOB", "id": 234, "name": "cde"},
            {
                "kind": "RUN",
                "id": "00000000-0000-0000-0000-000000000000",
                "external_id": "def",
            },
            {
                "kind": "OPERATION",
                "id": "11111111-1111-1111-1111-111111111111",
                "name": "efg",
            },
        ],
    }

    After:

    {
        "relations": {
            "parents": [
                {
                    "from": {"kind": "JOB", "id": "123"},
                    "to": {"kind": "RUN", "id": "00000000-0000-0000-0000-000000000000"},
                },
            ],
            "symlinks": [
                {
                    "from": {"kind": "DATASET", "id": "234"},
                    "to": {"kind": "DATASET", "id": "999"},
                },
            ],
            "inputs": [
                {
                    "from": {"kind": "DATASET", "id": "234"},
                    "to": {
                        "kind": "OPERATION",
                        "id": "11111111-1111-1111-1111-111111111111",
                    },
                },
            ],
            "outputs": [
                {
                    "from": {
                        "kind": "...

0.1.0 (2024-12-25)

25 Dec 13:15

🎉 Data.Rentgen first release! 🎉