ozgune changed the title from 'Evaluate "creating a distributed function" in the multi-tenant quick start guide' to 'Evaluate removing the "create distributed function" section from the quick start guide' on Mar 11, 2022
"Distributed functions" is an advanced topic, so it makes sense not to have it on the quick start.
Users typically create a distributed function and expect the function to speed up, much as they would expect from creating a distributed table. In reality, the schema and the functions must be set up properly to benefit from distributed functions, so users end up confused by the concept.
In fact, Marco thinks we could rename create_distributed_function to something more explicit like delegate_procedure_to_nodes or such.
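One way to check the "properly set up" part is to confirm that the tables a function touches share a distribution column and a co-location group. This is a minimal sketch, assuming Citus 10 or later (where the citus_tables view is available) and the campaigns and ads tables from the multi-tenant tutorial:

```sql
-- Sketch: show the distribution column and co-location group of both tables.
-- Assumes the tutorial's campaigns and ads tables are already distributed.
SELECT table_name, distribution_column, colocation_id
FROM citus_tables
WHERE table_name IN ('campaigns'::regclass, 'ads'::regclass);
```

If both rows report the same colocation_id and are distributed by company_id, a function distributed on company_id can find all of a tenant's rows on a single worker.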
Why are we implementing it? (sales eng)
What are the typical use cases?
Communication goals (e.g. detailed howto vs orientation)
Our Quick Start guide is an opportunity to introduce simple concepts to our users.
https://docs.citusdata.com/en/v10.2/get_started/tutorial_multi_tenant.html
In the multi-tenant quick start guide, we introduce the following concept. I feel that the notion of additional roundtrips, creating a new UDF, and then declaring the use of the UDF as a distributed function goes beyond a quick start.
Could we evaluate removing the following section from our Quick Start Guide?
I'm asking because I haven't used create_distributed_function() in this way before. Although I'm not a power user, I also feel that this goes beyond what's needed to get started on Citus.
"Each statement in a transaction causes round trips between the coordinator and workers in multi-node Citus. For multi-tenant workloads, it’s more efficient to run transactions in distributed functions. The efficiency gains become more apparent for larger transactions, but we can use the small transaction above as an example.
First create a function that does the deletions:
CREATE OR REPLACE FUNCTION
  delete_campaign(company_id int, campaign_id int)
RETURNS void LANGUAGE plpgsql AS $fn$
BEGIN
  DELETE FROM campaigns
   WHERE id = $2 AND campaigns.company_id = $1;
  DELETE FROM ads
   WHERE ads.campaign_id = $2 AND ads.company_id = $1;
END;
$fn$;
Next, use create_distributed_function to instruct Citus to run the function directly on workers rather than on the coordinator (except on a single-node Citus installation, which runs everything on the coordinator). Citus will run the function on whichever worker holds the shards of the ads and campaigns tables that correspond to the given company_id value.
SELECT create_distributed_function(
'delete_campaign(int, int)', 'company_id',
colocate_with := 'campaigns'
);
-- you can run the function as usual
SELECT delete_campaign(5, 46);"
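For comparison, the "small transaction above" that the quoted section refers to would look roughly like the following when run as plain statements from the coordinator. This is a sketch reconstructed from the function body, reusing the same example values (company 5, campaign 46), and is not copied from the guide:

```sql
-- Sketch: the same two deletions without a distributed function.
-- On multi-node Citus, each statement is sent from the coordinator
-- to the worker holding company 5's shards.
BEGIN;
DELETE FROM campaigns WHERE id = 46 AND company_id = 5;
DELETE FROM ads WHERE campaign_id = 46 AND company_id = 5;
COMMIT;
```

Both versions delete the same rows. The distributed function only changes where the statements run, which is the extra concept this issue proposes moving out of the quick start.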
Good locations for content in docs structure
How does this work? (devs)
Example sql
Corner cases, gotchas
Are there relevant blog posts or outside documentation about the concept/feature?
Link to relevant commits and regression tests if applicable