
Commit 85c1de3

9.1 features (#895)
1 parent e217a12 commit 85c1de3

File tree: 3 files changed, +127 -31 lines

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1 +1,2 @@
 _build
+.venv

develop/api_metadata.rst

Lines changed: 32 additions & 28 deletions
@@ -205,38 +205,42 @@ Worker node table

 The pg_dist_node table contains information about the worker nodes in the cluster.

-+----------------+----------------------+---------------------------------------------------------------------------+
-| Name | Type | Description |
-+================+======================+===========================================================================+
-| nodeid | int | | Auto-generated identifier for an individual node. |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| groupid | int | | Identifier used to denote a group of one primary server and zero or more|
-| | | | secondary servers, when the streaming replication model is used. By |
-| | | | default it is the same as the nodeid. |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| nodename | text | | Host Name or IP Address of the PostgreSQL worker node. |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| nodeport | int | | Port number on which the PostgreSQL worker node is listening. |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| noderack | text | | (Optional) Rack placement information for the worker node. |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| hasmetadata | boolean | | Reserved for internal use. |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| isactive | boolean | | Whether the node is active accepting shard placements. |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| noderole | text | | Whether the node is a primary or secondary |
-+----------------+----------------------+---------------------------------------------------------------------------+
-| nodecluster | text | | The name of the cluster containing this node |
-+----------------+----------------------+---------------------------------------------------------------------------+
++------------------+----------------------+---------------------------------------------------------------------------+
+| Name | Type | Description |
++==================+======================+===========================================================================+
+| nodeid | int | | Auto-generated identifier for an individual node. |
++------------------+----------------------+---------------------------------------------------------------------------+
+| groupid | int | | Identifier used to denote a group of one primary server and zero or more|
+| | | | secondary servers, when the streaming replication model is used. By |
+| | | | default it is the same as the nodeid. |
++------------------+----------------------+---------------------------------------------------------------------------+
+| nodename | text | | Host Name or IP Address of the PostgreSQL worker node. |
++------------------+----------------------+---------------------------------------------------------------------------+
+| nodeport | int | | Port number on which the PostgreSQL worker node is listening. |
++------------------+----------------------+---------------------------------------------------------------------------+
+| noderack | text | | (Optional) Rack placement information for the worker node. |
++------------------+----------------------+---------------------------------------------------------------------------+
+| hasmetadata | boolean | | Reserved for internal use. |
++------------------+----------------------+---------------------------------------------------------------------------+
+| isactive | boolean | | Whether the node is active accepting shard placements. |
++------------------+----------------------+---------------------------------------------------------------------------+
+| noderole | text | | Whether the node is a primary or secondary |
++------------------+----------------------+---------------------------------------------------------------------------+
+| nodecluster | text | | The name of the cluster containing this node |
++------------------+----------------------+---------------------------------------------------------------------------+
+| shouldhaveshards | boolean | | If false, shards will be moved off node (drained) when rebalancing, |
+| | | | nor will shards from new distributed tables be placed on the node, |
+| | | | unless they are colocated with shards already there |
++------------------+----------------------+---------------------------------------------------------------------------+

 ::

 SELECT * from pg_dist_node;
-nodeid | groupid | nodename | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster
---------+---------+-----------+----------+----------+-------------+----------+----------+-------------
-1 | 1 | localhost | 12345 | default | f | t | primary | default
-2 | 2 | localhost | 12346 | default | f | t | primary | default
-3 | 3 | localhost | 12347 | default | f | t | primary | default
+nodeid | groupid | nodename | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | shouldhaveshards
+--------+---------+-----------+----------+----------+-------------+----------+----------+-------------+------------------
+1 | 1 | localhost | 12345 | default | f | t | primary | default | t
+2 | 2 | localhost | 12346 | default | f | t | primary | default | t
+3 | 3 | localhost | 12347 | default | f | t | primary | default | t
 (3 rows)

 .. _pg_dist_object:
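
The new ``shouldhaveshards`` column shown above is a plain boolean in ``pg_dist_node``, so it can be queried directly. The snippet below is an editor's illustration (not part of this commit) of listing the active workers that remain eligible for shard placements:

.. code-block:: postgresql

   -- Illustration only: active workers that will still receive new shard
   -- placements (both columns are booleans in pg_dist_node).
   SELECT nodeid, nodename, nodeport
   FROM pg_dist_node
   WHERE isactive AND shouldhaveshards;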

develop/api_udf.rst

Lines changed: 94 additions & 3 deletions
@@ -444,7 +444,7 @@ Arguments
 **node_port:** The port on which PostgreSQL is listening on the worker node.

 **group_id:** A group of one primary server and zero or more secondary
-servers, relevant only for streaming replication. Default 0
+servers, relevant only for streaming replication. Default -1

 **node_role:** Whether it is 'primary' or 'secondary'. Default 'primary'
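
This hunk (like the matching one in the master_add_inactive_node section further down) only changes the documented default of ``group_id`` from 0 to -1. Assuming this Arguments list belongs to master_add_node, a typical call simply omits ``group_id`` and relies on that default; a hedged sketch with a hypothetical hostname:

.. code-block:: postgresql

   -- Sketch only: add a worker without specifying group_id, so the
   -- documented default (-1) applies. 'worker-101' is a placeholder.
   SELECT * from master_add_node('worker-101', 5432);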

@@ -494,6 +494,36 @@ Example

 select * from master_update_node(123, 'new-address', 5432);

+.. _master_set_node_property:
+
+master_set_node_property
+$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
+
+The master_set_node_property() function changes properties in the Citus metadata table :ref:`pg_dist_node <pg_dist_node>`. Currently it can change only the ``shouldhaveshards`` property.
+
+Arguments
+************************
+
+**node_name:** DNS name or IP address for the node.
+
+**node_port:** The port on which PostgreSQL is listening on the worker node.
+
+**property:** The column to change in ``pg_dist_node``; currently only ``shouldhaveshards`` is supported.
+
+**value:** The new value for the column.
+
+Return Value
+******************************
+
+N/A
+
+Example
+***********************
+
+.. code-block:: postgresql
+
+   SELECT * FROM master_set_node_property('localhost', 5433, 'shouldhaveshards', false);
+
 .. _master_add_inactive_node:

 master_add_inactive_node
@@ -512,7 +542,7 @@ Arguments
 **node_port:** The port on which PostgreSQL is listening on the worker node.

 **group_id:** A group of one primary server and zero or more secondary
-servers, relevant only for streaming replication. Default 0
+servers, relevant only for streaming replication. Default -1

 **node_role:** Whether it is 'primary' or 'secondary'. Default 'primary'

@@ -1006,7 +1036,7 @@ The rebalance_table_shards() function moves shards of the given table to make th
 Arguments
 **************************

-**table_name:** The name of the table whose shards need to be rebalanced.
+**table_name:** (Optional) The name of the table whose shards need to be rebalanced. If NULL, then rebalance all existing colocation groups.

 **threshold:** (Optional) A float number between 0.0 and 1.0 which indicates the maximum difference ratio of node utilization from average utilization. For example, specifying 0.1 will cause the shard rebalancer to attempt to balance all nodes to hold the same number of shards ±10%. Specifically, the shard rebalancer will try to converge utilization of all worker nodes to the (1 - threshold) * average_utilization ... (1 + threshold) * average_utilization range.
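
Because ``table_name`` is now optional, the rebalancer can be invoked with no arguments to act on every existing colocation group. A minimal illustration (not part of this commit):

.. code-block:: postgresql

   -- Illustration only: with table_name left NULL, all existing
   -- colocation groups are rebalanced.
   SELECT rebalance_table_shards();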

@@ -1020,6 +1050,8 @@ Arguments
 * ``force_logical``: Use logical replication even if the table doesn't have a replica identity. Any concurrent update/delete statements to the table will fail during replication.
 * ``block_writes``: Use COPY (blocking writes) for tables lacking primary key or replica identity.

+**drain_only:** (Optional) When true, move shards off worker nodes that have ``shouldhaveshards`` set to false in :ref:`pg_dist_node`; move no other shards.
+
 Return Value
 *********************************

@@ -1089,6 +1121,65 @@ Example
 │ 7083 │ foo │ 102019 │ 8192 │ n3.foobar.com │ 5432 │ n4.foobar.com │ 5432 │ 2 │
 └───────────┴────────────┴─────────┴────────────┴───────────────┴────────────┴───────────────┴────────────┴──────────┘

+.. _master_drain_node:
+
+master_drain_node
+$$$$$$$$$$$$$$$$$$$$$$$$$$$
+
+.. note::
+   The master_drain_node function is a part of Citus Enterprise. Please `contact us <https://www.citusdata.com/about/contact_us>`_ to obtain this functionality.
+
+The master_drain_node() function moves shards off the designated node and onto other nodes that have ``shouldhaveshards`` set to true in :ref:`pg_dist_node`. This function is designed to be called prior to removing a node from the cluster, i.e. before turning the node's physical server off.
+
+Arguments
+**************************
+
+**nodename:** The hostname of the node to be drained.
+
+**nodeport:** The port number of the node to be drained.
+
+**shard_transfer_mode:** (Optional) Specify the method of replication, whether to use PostgreSQL logical replication or a cross-worker COPY command. The possible values are:
+
+* ``auto``: Require replica identity if logical replication is possible, otherwise use legacy behaviour (e.g. for shard repair, PostgreSQL 9.6). This is the default value.
+* ``force_logical``: Use logical replication even if the table doesn't have a replica identity. Any concurrent update/delete statements to the table will fail during replication.
+* ``block_writes``: Use COPY (blocking writes) for tables lacking primary key or replica identity.
+
+Return Value
+*********************************
+
+N/A
+
+Example
+**************************
+
+Here are the typical steps to remove a single node (for example '10.0.0.1' on a standard PostgreSQL port):
+
+1. Drain the node.
+
+   .. code-block:: postgresql
+
+      SELECT * from master_drain_node('10.0.0.1', 5432);
+
+2. Wait until the command finishes.
+3. Remove the node.
+
+When draining multiple nodes, it's recommended to use :ref:`rebalance_table_shards` instead. Doing so allows Citus to plan ahead and move shards the minimum number of times.
+
+1. Run this for each node that you want to remove:
+
+   .. code-block:: postgresql
+
+      SELECT * FROM master_set_node_property(node_hostname, node_port, 'shouldhaveshards', false);
+
+2. Drain them all at once with :ref:`rebalance_table_shards`:
+
+   .. code-block:: postgresql
+
+      SELECT * FROM rebalance_table_shards(drain_only := true);
+
+3. Wait until the draining rebalance finishes.
+4. Remove the nodes.
+
 replicate_table_shards
 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
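
The drain workflows above end with "remove the node(s)" but do not show the removal command itself. As an editor's closing note (not part of this commit), removal is typically done with master_remove_node, which is documented elsewhere in api_udf.rst:

.. code-block:: postgresql

   -- Sketch only: drop the drained node from the cluster metadata,
   -- reusing the example address from the steps above.
   SELECT master_remove_node('10.0.0.1', 5432);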
