When visiting links, trackers from one page may incorrectly be attributed to the link target #333

Open
philipp-classen opened this issue Sep 26, 2023 · 6 comments

@philipp-classen
Member

This has already been reported on the Ghostery Extension (ghostery/ghostery-extension#1241), but it is worth tracking here as well. When following links, trackers present on the source website may be attributed to the target website.

A good example is the Ghostery Search, but it affects other sites as well:
https://whotracks.me/websites/ghosterysearch.com.html

Screenshot taken on 2023-09-26:
[Screenshot: ghostery-search]

It lists various Google services (including Google Analytics), even though they were never used on the site. The numbers are relatively low (<5%), especially compared to resources actually present on the site (e.g. jsDelivr with 95.6%). Still, it breaks some metrics (e.g. the number of trackers present).

Since the bug is client side, it is not something that we can expect to fix here (or in the processing pipeline). However, if it is fixed, ghosterysearch.com is a useful test page to verify that the false positives go down.

@philipp-classen
Member Author

Also, stats like "Google trackers are present on 75% of the web traffic" (Oct 2023) are most likely inflated.

@philipp-classen
Member Author

Looks like the stats are affected only on Firefox, and the problem is caused by the webNavigation.onBeforeNavigate listener firing twice (see https://bugzilla.mozilla.org/show_bug.cgi?id=1732564). Though it is an old bug, it used to happen only in rare edge cases; recent architectural changes in Firefox have made it more likely to be hit.

The proper solution is to fix it in the browser, but until then we can apply workarounds to filter out duplicated events (see whotracksme/webextension-packages#58).
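
To illustrate the idea of filtering duplicated events, here is a minimal TypeScript sketch. It is not the change made in whotracksme/webextension-packages#58. The `NavigationDeduplicator` class, the `NavigationDetails` interface, and the 500 ms window are all assumptions made for this example. It treats an event as a duplicate when the same URL is reported again for the same tab and frame within a short time window.

```typescript
// Hypothetical shape: only the fields this sketch needs from the
// details object passed to a webNavigation.onBeforeNavigate listener.
interface NavigationDetails {
  tabId: number;
  frameId: number;
  url: string;
  timeStamp: number; // milliseconds since the epoch
}

// Hypothetical helper (an assumption, not the actual workaround):
// reports whether an event repeats the previous event of the same frame.
class NavigationDeduplicator {
  private readonly lastSeen = new Map<string, { url: string; timeStamp: number }>();
  private readonly windowMs: number;

  // windowMs is an assumed tolerance; a real value would need tuning.
  constructor(windowMs: number = 500) {
    this.windowMs = windowMs;
  }

  isDuplicate(details: NavigationDetails): boolean {
    const key = `${details.tabId}:${details.frameId}`;
    const previous = this.lastSeen.get(key);
    // Always remember the most recent event for this tab and frame.
    this.lastSeen.set(key, { url: details.url, timeStamp: details.timeStamp });
    return (
      previous !== undefined &&
      previous.url === details.url &&
      details.timeStamp - previous.timeStamp <= this.windowMs
    );
  }
}
```

In an extension, a listener registered with `webNavigation.onBeforeNavigate.addListener` could call `isDuplicate(details)` first and return early when it yields `true`. A time window is only a heuristic: a user reloading the same page very quickly would also be filtered, so the trade-off would have to be checked against real traffic.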

@ghostwords

> Looks like the stats are affected only on Firefox

Are you sure?

  1. Install Ghostery into a fresh Chrome profile
  2. Do the opt in/enable thing
  3. Visit google.com
  4. Visit example.com
  5. See some Google tracker reported on example.com. This doesn't always happen, but if you try a few times, it eventually will.
@philipp-classen
Member Author

@ghostwords Thanks, we have to check that. Could be that there are more paths affected.

@philipp-classen
Member Author

@chrmod I remember you also found a problem with the adblocker, which could explain the behavior on Chrome. Very likely, we have at least one bug left, though it is not clear yet whether it also affects the WhoTracks.me collection. (WhoTracks.me is built only on anti-tracking messages, so the question is whether the remaining problems are isolated to the UI, or whether they also affect anti-tracking messages.)

@philipp-classen
Member Author

philipp-classen commented Jun 17, 2024

Still applies: https://www.ghostery.com/whotracksme/websites/ghosterysearch.com

I'm not sure if it is a reasonable expectation to end up with zero noise. Maybe we should implement thresholds and not count trackers that do not exceed them.
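
As a rough sketch of what such a threshold could look like (in TypeScript): the `TrackerShare` type, the `filterByThreshold` function, and the 5% cutoff are assumptions for illustration, not part of the actual processing pipeline. The jsDelivr share is the figure quoted earlier in this issue. The Google Analytics share is a made-up value below 5%.

```typescript
// Hypothetical record: the share of a site's page loads on which a tracker was seen.
interface TrackerShare {
  tracker: string;
  share: number; // fraction between 0 and 1
}

// Keeps only the trackers whose share strictly exceeds the threshold.
function filterByThreshold(stats: TrackerShare[], minShare: number): TrackerShare[] {
  return stats.filter((entry) => entry.share > minShare);
}

const exampleStats: TrackerShare[] = [
  { tracker: "jsDelivr", share: 0.956 }, // figure quoted in this issue
  { tracker: "Google Analytics", share: 0.03 }, // illustrative value only
];

// With an assumed cutoff of 5%, only jsDelivr is counted.
const counted = filterByThreshold(exampleStats, 0.05);
```

A fixed cutoff would also hide trackers that really are present but rarely loaded, so the value would need to be chosen with care.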

We recently changed the processing to consider only data from the latest Ghostery 8 clients and from Ghostery 10. The effects will become visible only in about two months, though (with the July release at the beginning of August).
