Add support to Asynchronous API #320

ake123 · 2025-05-22T08:14:07Z

Add support to Asynchronous API #319

pitkant · 2025-05-22T11:55:23Z

~~Did you test this with some dataset? For me it produces nonsensical output:~~

Ah, I passed the original dataset ID to the asynchronous function, so of course it produced nonsensical results. Maybe it would be useful to warn the user about this kind of use; see the history of this comment for example output. I will test this more.

pitkant · 2025-05-22T12:53:29Z

I'm sure you have already read the Eurostat documentation on this, but I'll just copy-paste it here:

Examples queries

1 - Query in range for asynchronous extraction

Following query would be considered within limits and processed by the system

http://ec.europa.eu/eurostat/api/comext/dissemination/sdmx/2.1/data/DS-045409/A.DK.US..1.SUPPLEMENTARY_QUANTITY?format=SDMX_2.1_STRUCTURED

This query matches the following positions:

- freq -> 1 position ("A")
- reporter -> 1 position ("DK")
- partner -> 1 position ("US")
- product -> 40321 positions (there is no filter on this dimension)
- flow -> 1 position ("1")
- time_period -> 36 positions (there is no explicit filter on this dimension but the system will only return yearly data)
- indicators -> 1 position ("SUPPLEMENTARY_QUANTITY")

Estimated cost: 1 x 1 x 1 x 40321 x 1 x 36 x 1 = 1 451 556 which is above the synchronous limit but below the maximum extraction limit so this request is treated asynchronously.

2 - Query above range for asynchronous extraction

Following query would be considered off limits and not processed by the system

https://ec.europa.eu/eurostat/api/comext/dissemination/sdmx/2.1/data/DS-045409/A.PT...2.QUANTITY_IN_100KG?format=SDMX_2.1_STRUCTURED

This query matches the following positions:

- freq -> 1 position ("A")
- reporter -> 1 position ("PT")
- partner -> 282 positions (there is no filter on this dimension)
- product -> 40321 positions (there is no filter on this dimension)
- flow -> 1 position ("2")
- time_period -> 36 positions (there is no explicit filter on this dimension but the system will only return yearly data as the frequency requested is annual)
- indicators -> 1 position ("QUANTITY_IN_100KG")

Estimated cost: 1 x 1 x 282 x 40321 x 1 x 36 x 1 = 409 338 792 which is above the maximum extraction limit of 5 000 000 cells and an error is returned.
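
The cost arithmetic in the two examples can be reproduced with a few lines of R. This is only a sketch: `estimate_cost` is a hypothetical helper, not part of the package or of the Eurostat API, and the position counts are the ones quoted above. Only the 5 000 000 cell maximum is checked, because the synchronous limit is not stated in the quoted text.

```r
# Estimated cost = product of the number of matched positions per dimension.
# `estimate_cost` is a hypothetical helper used for illustration only.
estimate_cost <- function(positions) {
  prod(as.numeric(positions))
}

# Maximum extraction limit (in cells) quoted in the documentation above.
max_extraction_limit <- 5e6

# Example 1: product and time_period are unfiltered.
query_1 <- c(freq = 1, reporter = 1, partner = 1, product = 40321,
             flow = 1, time_period = 36, indicators = 1)

# Example 2: partner is unfiltered as well.
query_2 <- c(freq = 1, reporter = 1, partner = 282, product = 40321,
             flow = 1, time_period = 36, indicators = 1)

cost_1 <- estimate_cost(query_1)  # 1451556
cost_2 <- estimate_cost(query_2)  # 409338792

# TRUE means the request stays below the maximum extraction limit.
within_limit_1 <- cost_1 <= max_extraction_limit  # TRUE
within_limit_2 <- cost_2 <= max_extraction_limit  # FALSE
```

A check like this could be run before sending a request, provided the position counts for the unfiltered dimensions are known.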

I think that, when it comes to triggering the asynchronous request and staying within the fair use limits, we should refrain from running any automated tests on this functionality.

Fair use of the service

A request for data extraction will be forced to be processed asynchronously based on the evaluation of 3 main criteria:

- the number of concurrent data extraction requests
- the number of requests performed during a period
  - per day
  - during the last 7 days
  - during the last 30 days
- the cumulative "extraction cost" generated during a period
  - per day
  - during the last 7 days
  - during the last 30 days

If one of the above criteria exceeds some thresholds, further data extraction requests will be forced to be processed asynchronously and this as long as the rule is violated.

In order to avoid this, we recommend to:

- trigger 1 extraction request at a time
- in case of use of scripts, don't use parallelisation
- if applicable, get data from the bulk download
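
For scripts, the "one request at a time, no parallelisation" recommendation could look roughly like the sketch below. `run_sequentially` is a hypothetical helper, the placeholder functions stand in for real download calls, and the pause between requests is an assumption rather than something the Eurostat text asks for.

```r
# Run request functions strictly one after another, pausing between them.
# `requests` is a list of zero-argument functions (hypothetical wrappers
# around whatever download call is used); `pause` is in seconds.
run_sequentially <- function(requests, pause = 1) {
  results <- vector("list", length(requests))
  for (i in seq_along(requests)) {
    # Assign via list() so that a NULL result does not drop the element.
    results[i] <- list(requests[[i]]())
    # Pause only between requests, not after the last one.
    if (i < length(requests)) Sys.sleep(pause)
  }
  names(results) <- names(requests)
  results
}

# Usage with placeholder functions instead of real network calls.
requests <- list(
  fi = function() "FI data",
  se = function() "SE data"
)
out <- run_sequentially(requests, pause = 0)
```

Each request only starts after the previous one has returned, so there is never more than one concurrent extraction from the script.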

Maybe one request of this type could be recorded and used as a dummy? I don't have an immediate answer to that.

Btw, while trying to trigger the asynchronous response I tested get_eurostat_sdmx with the dataset "bop_iip6_q". Curiously, I was able to download the whole dataset with no filters and all 57,312,860 rows. I thought it would've for sure triggered the async response or an error, but no. It was very slow though.

ake123 · 2025-05-22T19:08:02Z

I used some queries like the one below to trigger async mode, but you can't use the same query again: I guess the result is cached by the server, so it switches to synchronous mode and says "Synchronous mode: CSV data returned directly."

dat <- get_eurostat_sdmx(
  id = "DS-045409",
  filters = list(
    FREQ = "A",
    FLOW = "1",
    REPORTER = c("FI", "SE", "ES"),
    PARTNER = c("US"),
    INDICATORS = "SUPPLEMENTARY_QUANTITY"
  ),
  agency = "eurostat_comext",
  type = "code",
  wait = 10,
  max_wait = 600
)
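
The `wait` and `max_wait` arguments suggest a polling loop. A minimal sketch of such a loop follows; it is not the implementation in this PR. `poll_until` and `check_ready` are hypothetical names, and it assumes that `wait` is the interval in seconds between checks and `max_wait` the total time budget in seconds.

```r
# Call `check_ready()` every `wait` seconds until it returns TRUE or until
# another wait would exceed `max_wait` seconds in total.
# Returns TRUE on success and FALSE on timeout.
poll_until <- function(check_ready, wait = 10, max_wait = 600) {
  deadline <- Sys.time() + max_wait
  repeat {
    if (isTRUE(check_ready())) return(TRUE)
    # Give up if sleeping once more would go past the deadline.
    if (Sys.time() + wait > deadline) return(FALSE)
    Sys.sleep(wait)
  }
}

# Usage with a stand-in check that reports "ready" on the third attempt.
attempts <- 0
ready_on_third <- function() {
  attempts <<- attempts + 1
  attempts >= 3
}
ok <- poll_until(ready_on_third, wait = 0, max_wait = 5)

# A check that never succeeds ends in a timeout.
timed_out <- !poll_until(function() FALSE, wait = 0.01, max_wait = 0.05)
```

In a real client `check_ready` would query the status of the asynchronous request, which is not shown here.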

pitkant · 2025-05-23T11:20:40Z

I'm satisfied if you can get the function working just once. We can mark it with an Experimental tag (or similar) in the man pages.

Add support to Asynchronous API

ce4aa90
