Skip to content

Added condtional type to improve typing in case of object row format #58

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

supermar1010
Copy link

Hello,
when using the rowFormat: "object" the type for rows in the onComplete still is any[][]. It should be any[].
This PR converts the interface to a type and thus makes it possible to use conditional types.
With this change the type definition of onComplete changes depending on if rowFormat is set to object.

Have a great day!
Mario

Copy link
Collaborator

@platypii platypii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the type change, but please run lint and typescript, there's some issues to fix please.

@supermar1010
Copy link
Author

supermar1010 commented Jan 26, 2025

Ohh sorry, I did a bit of a sloppy job yesterday.
I fixed the issues lint, test and build:types run successful now!

I added two types, one with the rowFormat being optional, which leads to any[][] (but typed as "array", so rowFormat is always available on the type and additionally now "forced" to be either undefined, "array" or "object"). The other one with rowFormat set to object which makes onComplete type: any[]

@supermar1010 supermar1010 requested a review from platypii January 26, 2025 11:45
@platypii platypii requested a review from severo January 30, 2025 19:01
compressors?: Compressors // custom decompressors
utf8?: boolean // decode byte arrays as utf8 strings (default true)
}

interface ParquetArrayRowReadOptions extends ParquetBaseReadOptions {
rowFormat?: "array" // format of each row passed to the onComplete function
onComplete?: (rows: any[][]) => void // called when all requested rows and columns are parsed
Copy link
Contributor

@severo severo Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I think we want to have both fields, or none (rowFormat is useless if onComplete is undefined, and we want to have the rowFormat for type safety of onComplete)

So: maybe set both fields as always defined (remove ?: -> :) here and in ParquetObjectRowReadOptions, and we would have:

export type ParquetReadOptions =  ParquetBaseReadOptions | ParquetArrayRowReadOptions | ParquetObjectRowReadOptions

Or

interface ParquetRowReadOnCompleteArray {
  rowFormat: "array" // format of each row passed to the onComplete function
  onComplete: (rows: any[][]) => void // called when all requested rows and columns are parsed
}
interface ParquetRowReadOnCompleteObject {
  rowFormat: "object" // format of each row passed to the onComplete function
  onComplete: (rows: Record<string, any>[]) => void // called when all requested rows and columns are parsed
}
type ParquetRowReadOnComplete = ParquetRowReadOnCompleteArray | ParquetRowReadOnCompleteObject | {} 
export type ParquetReadOptions = ParquetBaseReadOptions & ParquetRowReadOnComplete

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or more compact

type ParquetRowReadOnComplete =
  | {}
  | { rowFormat: "array", onComplete: (rows: any[][]) => void }
  | { rowFormat: "object", onComplete: (rows: Record<string, any>[]) => void }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ups, sorry, I mixed comments and review)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should leave onComplete as optional, as there are also places in the code where it is not needed.

Actually here the type is a bit misleading, as you can't pass onComplete as it will always be overwritten

/**
 * @param {ParquetReadOptions} options
 * @returns {Promise<Record<string, any>[]>} resolves when all requested rows and columns are parsed
*/
export function parquetReadObjects(options) {
  return new Promise((onComplete, reject) => {
    parquetRead({
      ...options,
      rowFormat: 'object',
      onComplete,
    }).catch(reject)
  })
}

Copy link
Contributor

@severo severo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that, we would have to add checks in some places. For example, in

if (onComplete) onComplete(rowData)

onComplete will receive any[][], so we would have to check that rowFormat === "array" and throw if not

compressors?: Compressors // custom decompressors
utf8?: boolean // decode byte arrays as utf8 strings (default true)
}

interface ParquetArrayRowReadOptions extends ParquetBaseReadOptions {
rowFormat?: "array" // format of each row passed to the onComplete function
onComplete?: (rows: any[][]) => void // called when all requested rows and columns are parsed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compressors?: Compressors // custom decompressors
utf8?: boolean // decode byte arrays as utf8 strings (default true)
}

interface ParquetArrayRowReadOptions extends ParquetBaseReadOptions {
rowFormat?: "array" // format of each row passed to the onComplete function
onComplete?: (rows: any[][]) => void // called when all requested rows and columns are parsed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or more compact

type ParquetRowReadOnComplete =
  | {}
  | { rowFormat: "array", onComplete: (rows: any[][]) => void }
  | { rowFormat: "object", onComplete: (rows: Record<string, any>[]) => void }

@supermar1010
Copy link
Author

I think this is how it is supposed to be, but I'm not entirely sure, tests, ts and linting runs successful thoug, manual tests, also look good

@@ -7,5 +7,5 @@
"outDir": "types",
"declarationMap": true
},
"include": ["src"]
"include": ["src", "test"]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds ts checks to the test suite

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants