What privacy relevant / personally identifiable data does polr collect? #445

Open
ian-kelling opened this issue Apr 25, 2018 · 8 comments

@ian-kelling

I don't see this in the documentation. I'm also concerned about whether this
is GDPR compliant. I've used YOURLS, and out of the box it
stores IP / user agent / time of each request, which is no good.
It can also just store click count, but I'd like to find something
which can do more than that while avoiding PII.

@steveroot

I don't know the answer, as I've only been using POLR to experiment and learn in my own little sandbox (I guess we could work it out by reading through the code). But having spent yesterday reading the EU GDPR to apply it in my own small business, I can appreciate why you're asking.

> It can also just store click count, but I'd like to find something which can do more than that while avoiding pii.

What more than click count are you looking for?

> it stores ip / user agent / time of each request, which is no good.

From my reading of GDPR [and please know that I'm no legal expert and may have misinterpreted it, in the same way that two people take away different meanings from a pop song], you can collect everything you need, provided there is a justifiable reason at the design stage. Storing IP, user agent and time of request (data that is also stored in a regular web server log, and possibly on any ISP servers between you and POLR) is essential in order to monitor system performance and prevent exploitation beyond the intended service (e.g. if the same IP is using or creating a link a million times a minute, knowing that allows you to take defensive action). Combining this log data with other data sources for marketing profiling, on the other hand, would be hard to justify, but (I'd hope!) polr isn't doing this by default, and isn't sharing the full log data publicly in a way that would let 'bad actors' use it like that.

If we can justify collecting the data, the next question is how long we can justify storing it. If server logs typically rotate at, say, 30 days, is there a similarly timed purge option in polr to clear the old data but process and keep aggregate statistics for a longer period (because we might want to know how useful a shortened link has been)? What I can't answer, though, is whether an IP address is personally identifiable. On its own, and without effort to connect other data sources, I'd say no, so perhaps once the GDPR dust settles we won't have an issue storing IP / user agent / time of each request forever, should we wish.

In GDPR terms, I think collecting IP / user agent / timestamp would in most cases come under "Legitimate Interest", summarised by the UK ICO: https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/legitimate-interests/
or, delving directly into the GDPR:
https://gdpr-info.eu/art-5-gdpr/ and https://gdpr-info.eu/recitals/no-49/

@enkota

enkota commented May 7, 2018

Also interested in this, with the GDPR coming into force soon.

Maybe an option to purge data older than a certain period (such as 3 months) could work well?
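
A purge like the one suggested here could be sketched roughly as follows. This is only an illustration of the idea (drop raw click rows past a retention window, keep per-link totals); the record shape, the `purgeOldClicks` name and the 90-day figure are assumptions, not anything taken from polr's code or schema.

```php
<?php
// Hypothetical click record: ['link_id' => int, 'ip' => string, 'created_at' => int (Unix time)].
// These field names are illustrative and are NOT taken from polr's database schema.

/**
 * Separate clicks that are still inside the retention window from
 * those that have expired, keeping only a per-link count of the latter.
 *
 * @return array [clicks to keep, expired click counts keyed by link_id]
 */
function purgeOldClicks(array $clicks, int $now, int $retentionDays): array
{
    $cutoff = $now - $retentionDays * 86400; // 86400 seconds per day
    $kept = [];
    $expiredCounts = [];

    foreach ($clicks as $click) {
        if ($click['created_at'] >= $cutoff) {
            $kept[] = $click; // recent enough: keep the raw row
            continue;
        }
        // Too old: discard IP and timestamp, remember only that a click happened.
        $id = $click['link_id'];
        $expiredCounts[$id] = ($expiredCounts[$id] ?? 0) + 1;
    }

    return [$kept, $expiredCounts];
}

$now = 1526000000; // fixed timestamp so the example is reproducible
$clicks = [
    ['link_id' => 1, 'ip' => '203.0.113.7',  'created_at' => $now - 100 * 86400],
    ['link_id' => 1, 'ip' => '203.0.113.8',  'created_at' => $now - 5 * 86400],
    ['link_id' => 2, 'ip' => '198.51.100.4', 'created_at' => $now - 95 * 86400],
];

[$kept, $expired] = purgeOldClicks($clicks, $now, 90);
```

In a real install this would more likely be a scheduled database job than an in-memory loop; the array version is just there to show the keep/aggregate split.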

@ian-kelling
Author

@steveroot AFAIK, I agree: the data I mentioned is OK; the real problem is that it doesn't expire. I still wonder what data polr collects and for how long.

@DenisLanz

Looking at the clicks database it stores IP, Country as well as Referrer and User Agent. As far as I can tell there is no cookie.
So as long as the IPs can't be anonymized and there is no option to disable click-tracking or have the data expire, polr is not compliant with GDPR and should probably be avoided.

@overint
Contributor

overint commented May 19, 2018

> there is no option to disable click-tracking or have the data expire, polr is not compliant with GDPR and should probably be avoided.

That is incorrect. IPs, Referrer and User Agents are only stored if you turn on advanced analytics.

@DenisLanz

DenisLanz commented May 19, 2018

Ah sorry, my bad. But how do I disable it? I couldn't find an option in the backend, and there is no explicit documentation.

Edit: I was too quick. Disable it in .env:
SETTING_ADV_ANALYTICS=false

@overint
Contributor

overint commented May 19, 2018

Currently there isn't a way to edit the settings in the backend. You are correct: SETTING_ADV_ANALYTICS=false will turn it off.
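
To illustrate what gating on that flag could look like, here is a minimal sketch. How polr itself reads the setting is not shown in this thread, so `advancedAnalyticsEnabled` and `buildClickRecord` are hypothetical names, and `putenv()` merely stands in for whatever loads `.env`. One detail worth noting if you read the variable with plain `getenv()`: the value arrives as the string "false", which PHP treats as truthy, so it needs explicit parsing.

```php
<?php
/**
 * Hypothetical helper: interpret SETTING_ADV_ANALYTICS as a boolean.
 * getenv() returns the raw string (or false if unset); the non-empty
 * string "false" would be truthy if cast directly, so parse it instead.
 */
function advancedAnalyticsEnabled(): bool
{
    $raw = getenv('SETTING_ADV_ANALYTICS');
    return filter_var($raw, FILTER_VALIDATE_BOOLEAN);
}

/**
 * Hypothetical helper: build the per-click record, or return null when
 * advanced analytics is off so that nothing beyond a counter is stored.
 */
function buildClickRecord(array $server): ?array
{
    if (!advancedAnalyticsEnabled()) {
        return null;
    }

    return [
        'ip'         => $server['REMOTE_ADDR'] ?? null,
        'user_agent' => $server['HTTP_USER_AGENT'] ?? null,
        'referer'    => $server['HTTP_REFERER'] ?? null,
    ];
}

// putenv() stands in for the real .env loader in this sketch.
putenv('SETTING_ADV_ANALYTICS=false');
$record = buildClickRecord(['REMOTE_ADDR' => '203.0.113.7']);
```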

@jarlave

jarlave commented Jul 20, 2018

For the time being, if you still want to know which countries and referrers your clicks come from, you could change this yourself in app/Helpers/ClickHelper.php.
Change this:

...
$click->ip = $ip;
...
$click->user_agent = $request->server('HTTP_USER_AGENT');
...

to this:

...
$click->ip = preg_replace('#(?:\.\d+){2}$#', '.0.0', $ip);
...
$click->user_agent = "REDACTED";
...
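
Note that the regex above only rewrites a dotted-decimal suffix, so a plain IPv6 address (one with no dotted part) would be stored unchanged. A variant that covers both families might look like the sketch below. The `anonymizeIp` name, the choice to keep the first 48 bits of an IPv6 address, and the `0.0.0.0` fallback are all assumptions made for illustration; none of them come from polr.

```php
<?php
/**
 * Truncate an IP address so it no longer points at a single host.
 * IPv4: zero the last two octets (same regex as in the snippet above).
 * IPv6: keep the first 48 bits and zero the rest. The 48-bit cut-off
 * is an assumption of this sketch, not a polr or GDPR requirement.
 */
function anonymizeIp(string $ip): string
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4)) {
        return preg_replace('#(?:\.\d+){2}$#', '.0.0', $ip);
    }

    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6)) {
        $packed = inet_pton($ip);                                // 16 raw bytes
        $masked = substr($packed, 0, 6) . str_repeat("\0", 10); // keep 6 bytes, zero 10
        return inet_ntop($masked);
    }

    // Not a valid address: store a placeholder rather than the raw input.
    return '0.0.0.0';
}
```

With this helper, the assignment in the snippet above would become `$click->ip = anonymizeIp($ip);`.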


6 participants