Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes: #17976 DeviceType_Count on Manufacturer #18583

Conversation

renatoalmeidaoliveira
Copy link
Collaborator

Fixes: #17976 DeviceType_Count on Manufacturer

  • Changed get_prefetches_for_serializer to support nested serializers
  • For each nested field select all RelatedObjectCountField fields and create a Prefetch with the correct annotation

@jeremystretch
Copy link
Member

@renatoalmeidaoliveira it would be ideal to solve this by enabling recursion through nested serializers inside get_annotations_for_serializer() if possible. This is because these annotations employ the count_related() utility function, which IIRC works a bit differently from the standard Count(). Did you try going down this path at all?

@renatoalmeidaoliveira
Copy link
Collaborator Author

renatoalmeidaoliveira commented Feb 10, 2025

@jeremystretch that was my first try, i.e. using get_annotations_for_serializer I tryed that in two ways the one proposed by @arthanson in the issue and another version I made.
In both ways the annotated attribute falls inside the base model instead of the nested one and even using the nottation {nested_model}__count_devices.
It's possible to reprodute that behaivour inside nbshell, then I tryed to work on get_prefetches_for_serializer, making it return the fields using Count.
Now I think it's possible to use count_related inside the get_prefetches_for_serializer, gonna try to implement that way.

@renatoalmeidaoliveira
Copy link
Collaborator Author

About using count_related I was able to use it inside the get_prefetches_for_serializer.
Looking at the query at Django Debug we have the following differences:

Using count_related

SELECT "dcim_manufacturer"."id",
       "dcim_manufacturer"."created",
       "dcim_manufacturer"."last_updated",
       "dcim_manufacturer"."custom_field_data",
       "dcim_manufacturer"."name",
       "dcim_manufacturer"."slug",
       "dcim_manufacturer"."description",
       COALESCE((SELECT COUNT(*) AS "c" FROM "dcim_manufacturer" U0 INNER JOIN "dcim_devicetype" U1 ON (U0."id" = U1."manufacturer_id") WHERE U1."id" = ("dcim_manufacturer"."id") GROUP BY U1."id"), 0) AS "devicetype_count"
  FROM "dcim_manufacturer"
 WHERE "dcim_manufacturer"."id" IN (3)

Performance using count_related

count_related

Using Count

SELECT "dcim_manufacturer"."id",
       "dcim_manufacturer"."created",
       "dcim_manufacturer"."last_updated",
       "dcim_manufacturer"."custom_field_data",
       "dcim_manufacturer"."name",
       "dcim_manufacturer"."slug",
       "dcim_manufacturer"."description",
       COUNT("dcim_devicetype"."id") AS "devicetype_count"
  FROM "dcim_manufacturer"
  LEFT OUTER JOIN "dcim_devicetype"
    ON ("dcim_manufacturer"."id" = "dcim_devicetype"."manufacturer_id")
 WHERE "dcim_manufacturer"."id" IN (3)
 GROUP BY "dcim_manufacturer"."id"

Performance using Count

Count

Summary

In my environment, using NetBox demo data, the performance using Count was a bit better, but both ways works.
Please let me know the preffered way.

@renatoalmeidaoliveira
Copy link
Collaborator Author

About changing get_annotations_for_serializer, I tryed with this approach:

def get_annotations_for_serializer(serializer_class, fields_to_include=None):
   """
   Return a mapping of field names to annotations to be applied to the queryset for a serializer.
   """
   annotations = {}
   # If specific fields are not specified, default to all
   if not fields_to_include:
       fields_to_include = serializer_class.Meta.fields

   model = serializer_class.Meta.model

   for field_name, field in serializer_class._declared_fields.items():
       if issubclass(type(field), Serializer):
           nested_fields = field.Meta.brief_fields if field.nested else []
           for subfield_name, subfield in field.get_fields().items():
               if subfield_name in nested_fields and type(subfield) is RelatedObjectCountField:
                   subfield_model = field.Meta.model
                   annotations[f'{field_name}__{subfield_name}'] = count_related(subfield_model, subfield.relation)
       if field_name in fields_to_include and type(field) is RelatedObjectCountField:
           related_field = getattr(model, field.relation).field
           annotations[field_name] = count_related(related_field.model, related_field.name)
   return annotations

Running with this with Debug it executes the following query:

SELECT "dcim_devicetype"."id",
       "dcim_devicetype"."created",
       "dcim_devicetype"."last_updated",
       "dcim_devicetype"."custom_field_data",
       "dcim_devicetype"."description",
       "dcim_devicetype"."comments",
       "dcim_devicetype"."weight",
       "dcim_devicetype"."weight_unit",
       "dcim_devicetype"."_abs_weight",
       "dcim_devicetype"."manufacturer_id",
       "dcim_devicetype"."model",
       "dcim_devicetype"."slug",
       "dcim_devicetype"."default_platform_id",
       "dcim_devicetype"."part_number",
       "dcim_devicetype"."u_height",
       "dcim_devicetype"."exclude_from_utilization",
       "dcim_devicetype"."is_full_depth",
       "dcim_devicetype"."subdevice_role",
       "dcim_devicetype"."airflow",
       "dcim_devicetype"."front_image",
       "dcim_devicetype"."rear_image",
       "dcim_devicetype"."console_port_template_count",
       "dcim_devicetype"."console_server_port_template_count",
       "dcim_devicetype"."power_port_template_count",
       "dcim_devicetype"."power_outlet_template_count",
       "dcim_devicetype"."interface_template_count",
       "dcim_devicetype"."front_port_template_count",
       "dcim_devicetype"."rear_port_template_count",
       "dcim_devicetype"."device_bay_template_count",
       "dcim_devicetype"."module_bay_template_count",
       "dcim_devicetype"."inventory_item_template_count",
       COALESCE((SELECT COUNT(*) AS "c" FROM "dcim_manufacturer" U0 INNER JOIN "dcim_devicetype" U1 ON (U0."id" = U1."manufacturer_id") WHERE U1."id" = ("dcim_devicetype"."id") GROUP BY U1."id"), 0) AS "manufacturer__devicetype_count",
       COALESCE((SELECT COUNT(*) AS "c" FROM "dcim_platform" U0 INNER JOIN "dcim_device" U1 ON (U0."id" = U1."platform_id") WHERE U1."id" = ("dcim_devicetype"."id") GROUP BY U1."id"), 0) AS "default_platform__device_count",
       COALESCE((SELECT COUNT(*) AS "c" FROM "dcim_platform" U0 INNER JOIN "virtualization_virtualmachine" U1 ON (U0."id" = U1."platform_id") WHERE U1."id" = ("dcim_devicetype"."id") GROUP BY U1."id"), 0) AS "default_platform__virtualmachine_count",
       COALESCE((SELECT COUNT(*) AS "c" FROM "dcim_device" U0 WHERE U0."device_type_id" = ("dcim_devicetype"."id") GROUP BY U0."device_type_id"), 0) AS "device_count"
  FROM "dcim_devicetype"
 WHERE "dcim_devicetype"."id" = 7
 LIMIT 21

And looking at the results inside Django debug, its returning the correct results but as an attribute of DeviceType :

image

So even the query returning the expected results the JSON output omits the result instead of adding it inside the Nested Object

@@ -87,6 +88,7 @@ def get_prefetches_for_serializer(serializer_class, fields_to_include=None):
fields_to_include = serializer_class.Meta.fields

prefetch_fields = []
annotaded_prefetch = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor spelling issue, annotated would be correct spelling

# Add an annotation to the annotaded_prefetch
if isinstance(serializer_field, RelatedObjectCountField) and source_field is not None:
if model_field_name not in annotaded_prefetch:
annotaded_prefetch[model_field_name] = Count(serializer_field.relation)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to continue processing at this point? I think we can just do a continue here?

# If the serializer field does not map to a discrete model field, skip it.
try:
field = model._meta.get_field(model_field_name)
if isinstance(field, (RelatedField, ManyToOneRel, GenericForeignKey)):
if (isinstance(field, (RelatedField, ManyToOneRel, GenericForeignKey)) and
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be moved below the try/except - if the field doesn't exist we are doing a continue so it won't get hit and field will be set.
Also - if it is issubclass(type(serializer_field), Serializer) do we want to continue to line 117? Not sure - just asking, could we just do a check on that condition and continue?

@arthanson
Copy link
Collaborator

@renatoalmeidaoliveira is this ready for re-review? If you do some fixes please click the re-review button as otherwise it doesn't pop up in our queues.
Monosnap Fixes: #17976 DeviceType_Count on Manufacturer by renatoalmeidaoliveira · Pull Request #18583 · netbox-community:netbox 2025-02-25 08-40-27

@renatoalmeidaoliveira
Copy link
Collaborator Author

@arthanson it's not ready yet... I tryed to use a different approach based on your comments and @jeremystretch's but it breaks a lot of things for other models, gonna post some code snippet about that sollution

except FieldDoesNotExist:
continue

# Prefetch for tags
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is close but I think this is way too specific checking for a field named 'tags'. There shouldn't be anything special about tags - we should be able to check generically for M2M or FK field here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is needed can you do something like:

if isinstance(field, models.ManyToManyField):
    prefetch_fields.append(field.name)
elif serializer_field and source_field is None:
   ...

nested_annotations = get_annotations_for_serializer(serializer_class, fields_to_include=fields_to_include)
if nested_annotations:
related_prefetch = Prefetch(source_field, queryset=model.objects.all().annotate(**nested_annotations))
prefetch_fields.append(related_prefetch)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there - I think you can just add a return prefetch_fields here. If it's gotten to here then in the loop below on line 99 the only time it modifies prefetch_fields is on line 125 and that can only happen if source_field is None, which it wouldn't be in this case, so all that code is looping / processing for nothing.

If that is the case the fields_to_include doesn't matter in this case. I'm seeing when it is called with source_field you are only getting one value returned for prefetch_fieds (which makes sense as the for loop doesn't append anything) Therefore it doesn't look like you need to recurse into this at all, the lines above can just be a separate function that is called (without the fields_to_include).

Please let me know if that isn't clear.

@renatoalmeidaoliveira
Copy link
Collaborator Author

@arthanson I created a method to deal with subfield annotations, and remove the recursion, I think in most cases it gonna work well because the brief fields ussually doesn't contains serializer fields, the only exception I found was PowerPort.
Comparing the prefetch from the current implementation and this one the difference is:

Current Implementation:

['device', 'module', 'module__device', 'module__module_bay', 'module__module_type', 'module__module_type__manufacturer', 'cable', 'tags']

New implementation:

['device', 'module', 'cable', 'tags']

But even with that difference with the prefetch field the amount of queries in the list endpoint was only by one query in my environment with demo data

Copy link
Collaborator

@arthanson arthanson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean take out the recursion, as you noted that removes many of the prefetch's that are needed. At line 19 below I've added a return after the append as the code below it has no effect (since in this case source_field is not None

What I wasn't sure on was if the line get_prefetches_for_serializer(type(serializer_field), subfields, field_name) could be simplified in the case where it is passing in source_field, but looking at it more this might be the easiest way. So I'd suggest reverting to the code I posted below.

For more clarification: You are recursing into a function but it has two completely separate logic paths if source_field is None or not. Wasn't sure if it would be cleaner to call two separate functions (do the recursion if source_field is None, but call another function if source_field isn't None that just does the code you have after if source_field is not None. Again, it might be simpler to just keep it all in one function like you have it.

def get_prefetches_for_serializer(serializer_class, fields_to_include=None, source_field=None):
    """
    Compile and return a list of fields which should be prefetched on the queryset for a serializer.
    """
    model = serializer_class.Meta.model

    # If fields are not specified, default to all
    if not fields_to_include:
        fields_to_include = serializer_class.Meta.fields

    prefetch_fields = []

    # If this serializer is nested, get annotations and prefetches for the nested serializer
    if source_field is not None:
        nested_annotations = get_annotations_for_serializer(serializer_class, fields_to_include=fields_to_include)
        if nested_annotations:
            related_prefetch = Prefetch(source_field, queryset=model.objects.all().annotate(**nested_annotations))
            prefetch_fields.append(related_prefetch)
            return prefetch_fields

    for field_name in fields_to_include:
        serializer_field = serializer_class._declared_fields.get(field_name)

        # Determine the name of the model field referenced by the serializer field
        model_field_name = field_name
        if serializer_field and serializer_field.source:
            model_field_name = serializer_field.source

        # If the serializer field does not map to a discrete model field, skip it.
        try:
            field = model._meta.get_field(model_field_name)
        except FieldDoesNotExist:
            continue

        # If this field is represented by a nested serializer, recurse to resolve prefetches
        # for the related object.
        if serializer_field and source_field is None:
            if issubclass(type(serializer_field), Serializer):
                # Determine which fields to prefetch for the nested object
                subfields = serializer_field.Meta.brief_fields if serializer_field.nested else None
                for subfield in get_prefetches_for_serializer(type(serializer_field), subfields, field_name):
                    if isinstance(subfield, Prefetch):
                        prefetch_fields.append(subfield)
                    else:
                        prefetch_fields.append(f'{field_name}__{subfield}')
            elif isinstance(field, (RelatedField, ManyToOneRel, GenericForeignKey)):
                prefetch_fields.append(field.name)
    return prefetch_fields

@renatoalmeidaoliveira
Copy link
Collaborator Author

Closing this issue since I didn't find a good solution to keep the recursion and allow Prefetches with annotations.
Described my findings in the Issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DeviceType_Count on Manufacturer
3 participants