@carlosahs carlosahs commented Nov 29, 2025

Which issue does this PR close?

Rationale for this change

#[expect] attributes suppress the lint emission but emit a warning if the expectation is unfulfilled. This is useful for being notified when a lint is no longer triggered.
- https://rust-lang.github.io/rust-clippy/master/index.html?search=clippy%3A%3Aupper_case_acronyms

This helps identify lint suppressions that are no longer needed across the functions-* crates.
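
As a minimal sketch of the difference (assuming Rust 1.81 or newer, where #[expect] is stable), the snippet below uses the built-in unused_variables lint so it can be checked without Clippy. The function names are hypothetical and are not part of DataFusion.

```rust
/// Hypothetical helper, for illustration only.
/// `scale` is intentionally unused, so `unused_variables` fires inside this
/// function and the expectation is fulfilled: the compiler stays silent.
/// If `scale` were later used (or removed), the compiler would instead
/// report "this lint expectation is unfulfilled".
#[expect(unused_variables)]
pub fn slope_with_expect(rise: f64, run: f64, scale: f64) -> f64 {
    rise / run
}

/// Hypothetical helper, for illustration only.
/// Nothing in this function is unused, so the `#[allow]` below is stale.
/// Unlike `#[expect]`, a stale `#[allow]` produces no diagnostic at all.
#[allow(unused_variables)]
pub fn slope_with_allow(rise: f64, run: f64) -> f64 {
    rise / run
}
```

Both functions compute the same value; only the behavior of the attribute differs once the suppressed lint stops firing.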

What changes are included in this PR?

Added #![deny(clippy::allow_attributes)] to all functions-* crates. Also, the following lint attributes are removed:

  • clippy::upper_case_acronyms is removed for
    #[derive(Debug, Clone, PartialEq, Hash, Eq)]
    #[allow(clippy::upper_case_acronyms)]
    pub enum RegrType {
        /// Variant for `regr_slope` aggregate expression
        /// Returns the slope of the linear regression line for non-null pairs in aggregate columns.
        /// Given input column Y and X: `regr_slope(Y, X)` returns the slope (k in Y = k*X + b) using minimal
        /// RSS (Residual Sum of Squares) fitting.
        Slope,
        /// Variant for `regr_intercept` aggregate expression
        /// Returns the intercept of the linear regression line for non-null pairs in aggregate columns.
        /// Given input column Y and X: `regr_intercept(Y, X)` returns the intercept (b in Y = k*X + b) using minimal
        /// RSS fitting.
        Intercept,
        /// Variant for `regr_count` aggregate expression
        /// Returns the number of input rows for which both expressions are not null.
        /// Given input column Y and X: `regr_count(Y, X)` returns the count of non-null pairs.
        Count,
        /// Variant for `regr_r2` aggregate expression
        /// Returns the coefficient of determination (R-squared value) of the linear regression line for non-null pairs in aggregate columns.
        /// The R-squared value represents the proportion of variance in Y that is predictable from X.
        R2,
        /// Variant for `regr_avgx` aggregate expression
        /// Returns the average of the independent variable for non-null pairs in aggregate columns.
        /// Given input column X: `regr_avgx(Y, X)` returns the average of X values.
        AvgX,
        /// Variant for `regr_avgy` aggregate expression
        /// Returns the average of the dependent variable for non-null pairs in aggregate columns.
        /// Given input column Y: `regr_avgy(Y, X)` returns the average of Y values.
        AvgY,
        /// Variant for `regr_sxx` aggregate expression
        /// Returns the sum of squares of the independent variable for non-null pairs in aggregate columns.
        /// Given input column X: `regr_sxx(Y, X)` returns the sum of squares of deviations of X from its mean.
        SXX,
        /// Variant for `regr_syy` aggregate expression
        /// Returns the sum of squares of the dependent variable for non-null pairs in aggregate columns.
        /// Given input column Y: `regr_syy(Y, X)` returns the sum of squares of deviations of Y from its mean.
        SYY,
        /// Variant for `regr_sxy` aggregate expression
        /// Returns the sum of products of pairs of numbers for non-null pairs in aggregate columns.
        /// Given input column Y and X: `regr_sxy(Y, X)` returns the sum of products of the deviations of Y and X from their respective means.
        SXY,
    }
    because it resulted in
    warning: this lint expectation is unfulfilled
      --> datafusion/functions-aggregate/src/regr.rs:89:10
       |
    89 | #[expect(clippy::upper_case_acronyms)]
       |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
       |
       = note: `#[warn(unfulfilled_lint_expectations)]` on by default
    
    warning: `datafusion-functions-aggregate` (lib) generated 1 warning
    
    The rationale is as follows: the default value of avoid-breaking-exported-api is true, so the lint is suppressed for the public RegrType enum (Ref). Setting avoid-breaking-exported-api to false raises multiple warnings, as DataFusion does not strictly follow the UpperCamelCase naming convention.
  • clippy::too_many_arguments is removed for
    #[allow(clippy::too_many_arguments)]
    fn regexp_instr_inner<'a, S>(
        values: &S,
        regex_array: &S,
        start_array: Option<&Int64Array>,
        nth_array: Option<&Int64Array>,
        flags_array: Option<S>,
        subexp_array: Option<&Int64Array>,
    ) -> Result<ArrayRef, ArrowError>
    because the default value of too-many-arguments-threshold is 7 and regexp_instr_inner takes only 6 arguments.
  • rustdoc::redundant_explicit_links is removed for (1)
    /// Creates a singleton `ScalarUDF` of the `$UDF` function and a function
    /// named `$NAME` which returns that singleton. Optionally use a custom constructor
    /// `$CTOR` which defaults to `$UDF::new()` if not specified.
    ///
    /// This is used to ensure creating the list of `ScalarUDF` only happens once.
    #[macro_export]
    macro_rules! make_udf_function {
        ($UDF:ty, $NAME:ident, $CTOR:expr) => {
            #[allow(rustdoc::redundant_explicit_links)]
            #[doc = concat!("Return a [`ScalarUDF`](datafusion_expr::ScalarUDF) implementation of ", stringify!($NAME))]
            pub fn $NAME() -> std::sync::Arc<datafusion_expr::ScalarUDF> {
                // Singleton instance of the function
                static INSTANCE: std::sync::LazyLock<
                    std::sync::Arc<datafusion_expr::ScalarUDF>,
                > = std::sync::LazyLock::new(|| {
                    std::sync::Arc::new(datafusion_expr::ScalarUDF::new_from_impl(
                        ($CTOR)(),
                    ))
                });
                std::sync::Arc::clone(&INSTANCE)
            }
        };
        ($UDF:ty, $NAME:ident) => {
            make_udf_function!($UDF, $NAME, <$UDF>::new);
        };
    }
    and (2)
    /// Creates a singleton `ScalarUDF` of the `$UDF` function and a function
    /// named `$NAME` which returns that singleton. The function takes a
    /// configuration argument of type `$CONFIG_TYPE` to create the UDF.
    #[macro_export]
    macro_rules! make_udf_function_with_config {
        ($UDF:ty, $NAME:ident) => {
            #[allow(rustdoc::redundant_explicit_links)]
            #[doc = concat!("Return a [`ScalarUDF`](datafusion_expr::ScalarUDF) implementation of ", stringify!($NAME))]
            pub fn $NAME(config: &datafusion_common::config::ConfigOptions) -> std::sync::Arc<datafusion_expr::ScalarUDF> {
                std::sync::Arc::new(datafusion_expr::ScalarUDF::new_from_impl(
                    <$UDF>::new_with_config(&config),
                ))
            }
        };
    }
    because it looks like the automatically computed link for [`ScalarUDF`](datafusion_expr::ScalarUDF) is not the same as the explicit link. Any concerns with this one?
  • rustdoc::private_intra_doc_links is removed for (1)
    #[allow(rustdoc::private_intra_doc_links)]
    /// See [`TDigest::to_scalar_state()`] for a description of the serialized
    /// state.
    fn state_fields(&self, args: StateFieldsArgs) -> Result<Vec<FieldRef>> {
        Ok(vec![
            Field::new(
                format_state_name(args.name, "max_size"),
                DataType::UInt64,
                false,
            ),
            Field::new(
                format_state_name(args.name, "sum"),
                DataType::Float64,
                false,
            ),
            Field::new(
                format_state_name(args.name, "count"),
                DataType::UInt64,
                false,
            ),
            Field::new(
                format_state_name(args.name, "max"),
                DataType::Float64,
                false,
            ),
            Field::new(
                format_state_name(args.name, "min"),
                DataType::Float64,
                false,
            ),
            Field::new_list(
                format_state_name(args.name, "centroids"),
                Field::new_list_field(DataType::Float64, true),
                false,
            ),
        ]
        .into_iter()
        .map(Arc::new)
        .collect())
    }
    and (2)
    #[allow(rustdoc::private_intra_doc_links)]
    /// See [`TDigest::to_scalar_state()`] for a description of the serialized
    /// state.
    fn state_fields(&self, args: StateFieldsArgs) -> Result<Vec<FieldRef>> {
        self.approx_percentile_cont.state_fields(args)
    }
    since the lint detects intra-doc links from public to private items, but TDigest::to_scalar_state() is public, so we have the opposite case: a link from a private item to a public one.
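
To make the clippy::too_many_arguments reasoning above concrete, here is a small sketch. It assumes the default too-many-arguments-threshold of 7 mentioned above, with Clippy reporting only functions whose parameter count exceeds that threshold. The functions sum_six and sum_eight are hypothetical and only mirror the argument counts involved.

```rust
/// Hypothetical function with 6 parameters, like `regexp_instr_inner`.
/// 6 does not exceed the assumed default threshold of 7, so Clippy does not
/// report it. Putting `#[expect(clippy::too_many_arguments)]` here would
/// therefore be flagged as an unfulfilled expectation when Clippy runs.
pub fn sum_six(a: i64, b: i64, c: i64, d: i64, e: i64, f: i64) -> i64 {
    a + b + c + d + e + f
}

/// Hypothetical function with 8 parameters. 8 exceeds the threshold, so the
/// lint fires under Clippy and this expectation is fulfilled.
#[expect(clippy::too_many_arguments)]
pub fn sum_eight(
    a: i64,
    b: i64,
    c: i64,
    d: i64,
    e: i64,
    f: i64,
    g: i64,
    h: i64,
) -> i64 {
    a + b + c + d + e + f + g + h
}
```

The expectation on sum_eight is only evaluated by Clippy itself, so a plain build neither fulfills nor reports it.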

Are these changes tested?

  • Ran sh ci/scripts/rust_docs.sh with no warnings/errors.
  • Ran sh ci/scripts/rust_clippy.sh with no warnings/errors.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the functions Changes to functions implementation label Nov 29, 2025