Track and compare data quality across online survey platforms.
Researchers submit their study results, and the dashboard updates as new studies are added.
A central concern with online studies is whether each response was submitted by a single attentive human participant.
This breaks down into three specific data quality concerns:
Whether participants pay attention to study materials
Whether responses are generated by AI agents, LLMs, or bots
Whether the same person submitted multiple responses through duplicate or fake accounts
Each of these concerns can be addressed with targeted quality checks, which may evolve over time. As a starting point, this dashboard tracks the checks discussed in our companion paper, and it will expand as researchers contribute new metrics.
Builds on “Mission Possible: The Collection of High Quality Online Data” by Çelebi, Exley, Harrs, Kivimaki, Serra-Garcia & Yusof (2026). The paper provides an evidence-based assessment of data quality across online survey platforms, using laboratory and AI agent responses as benchmarks. It also proposes a new two-stage recruitment method. Materials are available in the companion GitHub repository.
Each metric is an indicator equal to 1 if the check passes. Higher pass rates are better across all metrics.
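Concretely, because every check is a 0/1 indicator, a platform's pass rate on a check is simply the mean of that indicator column. A minimal sketch in pandas (platform labels and column names are illustrative, not the dashboard's actual schema):

```python
import pandas as pd

# Illustrative data: one row per response, one 0/1 column per quality check.
df = pd.DataFrame({
    "platform":        ["A", "A", "B", "B", "B"],
    "attention_check": [1, 1, 1, 0, 1],
    "unique_ip":       [1, 0, 1, 1, 1],
})

# Per-platform pass rate for each check is the mean of its indicator column.
pass_rates = df.groupby("platform")[["attention_check", "unique_ip"]].mean()
print(pass_rates)
```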
Equal to 1 if the respondent passes instructional attention checks. For example, the respondent correctly selects the option furthest to the “left” or “right” when instructed to do so in a 5-point Likert question.
Equal to 1 if the respondent correctly types four numbers shown sequentially in a short video.
Equal to 1 if the open-text response was entered through manual typing.
Three conditions must hold: (i) no paste event is recorded; (ii) there is no discrete jump of more than 50 characters in text length without corresponding keystrokes; and (iii) at least one keystroke is recorded. Together these detect copy–paste, drag-and-drop, and fully automated insertion.
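A minimal sketch of how these three conditions could be evaluated, assuming the survey software exports an ordered event log per response (the event format shown is illustrative, not any specific platform's schema):

```python
def manually_typed(events: list[dict], jump_threshold: int = 50) -> int:
    """Return 1 if an open-text response appears manually typed, else 0.

    `events` is an ordered log of dicts such as
    {"type": "keystroke" or "paste", "text_length": length after the event}.
    """
    # (i) No paste event recorded.
    if any(e["type"] == "paste" for e in events):
        return 0
    # (ii) No jump of more than `jump_threshold` characters in text length
    #      between consecutive logged events.
    lengths = [e["text_length"] for e in events]
    if any(b - a > jump_threshold for a, b in zip(lengths, lengths[1:])):
        return 0
    # (iii) At least one keystroke recorded.
    if not any(e["type"] == "keystroke" for e in events):
        return 0
    return 1
```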
Equal to 1 if the respondent typed the open-text response and the median inter-keystroke interval exceeds 75 milliseconds.
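Given that the manual-typing condition above holds, the timing condition reduces to a median over consecutive keystroke gaps. A minimal sketch, assuming keystroke timestamps are logged in milliseconds:

```python
from statistics import median

def human_typing_speed(keystroke_times_ms: list[float], cutoff_ms: float = 75.0) -> int:
    """Return 1 if the median inter-keystroke interval exceeds the cutoff."""
    if len(keystroke_times_ms) < 2:
        return 0  # fewer than two keystrokes: no interval to measure
    gaps = [b - a for a, b in zip(keystroke_times_ms, keystroke_times_ms[1:])]
    return int(median(gaps) > cutoff_ms)
```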
Equal to 1 if Google’s reCAPTCHA score is at least 0.5. The score aggregates behavioral and contextual signals during the respondent’s interaction with the survey interface; lower scores indicate a higher likelihood of bot activity.
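For reference, a reCAPTCHA v3 token is scored server-side through Google's siteverify endpoint; a minimal sketch of applying the 0.5 cutoff (the secret key and token come from your own reCAPTCHA integration):

```python
import requests

def recaptcha_pass(secret_key: str, token: str, threshold: float = 0.5) -> int:
    """Score a reCAPTCHA v3 token server-side and apply the cutoff."""
    resp = requests.post(
        "https://www.google.com/recaptcha/api/siteverify",
        data={"secret": secret_key, "response": token},
        timeout=10,
    )
    result = resp.json()
    # v3 responses include `score` between 0.0 (likely bot) and 1.0 (likely human).
    return int(result.get("success", False) and result.get("score", 0.0) >= threshold)
```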
Equal to 1 if the Pangram AI likelihood score for the open-text response is below 0.5. The score reflects the estimated probability that the text was generated by a large language model rather than a human.
Equal to 1 if the respondent clicked on the screen at least once during the questionnaire.
Equal to 1 if the respondent moved the mouse at least once within the questionnaire page.
Equal to 1 if the IP address is unique within the sample.
Equal to 1 if the IP address geolocates to the targeted country, as determined by an IP intelligence service (e.g., MaxMind).
Equal to 1 if fewer than five responses in the sample share the same geographic latitude–longitude coordinates, as reported by the survey platform (e.g., Qualtrics) or an IP intelligence service (e.g., MaxMind).
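The three location-based checks above can all be computed from response-level metadata. A minimal pandas sketch (the IPs, countries, and coordinates are illustrative, with the US assumed as the targeted country):

```python
import pandas as pd

# One row per response; geolocation fields as returned by an IP
# intelligence service (e.g., MaxMind).
df = pd.DataFrame({
    "ip":      ["1.2.3.4", "1.2.3.4", "5.6.7.8"],
    "country": ["US", "US", "CA"],
    "lat":     [40.71, 40.71, 43.65],
    "lon":     [-74.01, -74.01, -79.38],
})

# Unique IP address: 1 if no other response shares the IP.
df["unique_ip"] = (~df["ip"].duplicated(keep=False)).astype(int)

# Targeted country: 1 if the IP geolocates to the targeted country.
df["in_country"] = (df["country"] == "US").astype(int)

# Location cluster: 1 if fewer than five responses share the same coordinates.
cluster_size = df.groupby(["lat", "lon"])["lat"].transform("size")
df["location_ok"] = (cluster_size < 5).astype(int)
```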
Equal to 1 if the submission is not flagged as a duplicate by the survey platform (e.g., Qualtrics), which identifies repeated submissions using browser cookies.
Equal to 1 if no other response in the sample shares the same device fingerprint, as provided by a device fingerprinting service (e.g., Fingerprint.com). Unlike cookies, device fingerprints persist across IP changes and browser resets, making them more robust for detecting repeated submissions from the same device.
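The same uniqueness logic as the IP check applies here, given a column of visitor identifiers from the fingerprinting service (values illustrative):

```python
import pandas as pd

fingerprints = pd.Series(["fp_a", "fp_b", "fp_a", "fp_c"])  # illustrative visitor IDs

# 1 if no other response in the sample shares the same device fingerprint.
unique_fingerprint = (~fingerprints.duplicated(keep=False)).astype(int)
```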
Help expand the dataset and improve comparisons across platforms.
Want to update a previously submitted entry? Click "Edit Existing Entry" below.
Takes about 5 minutes