  • anonymized using a hash so that joins can still be done post-hoc

  • timestamped so that trends can be visualized

Features include:

  • customer_hash: string hash of customer_no

  • group_customer_hash: string hash of group_customer_no

  • timestamp: submission timestamp

  • survey: filename / title of the survey

  • question: text of the question

  • subquestion: text of the subquestion (or NA if none)

  • answer: text of the response

  • encoded_answer: embedding of the answer (e.g. integers for a Likert scale)
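
As a rough illustration of what an integer encoding of a Likert answer could look like, here is a minimal base R sketch. The five labels and their order are assumptions made for this example, not the mapping the package actually uses.

```r
# Hypothetical five-point Likert labels, ordered from most negative to most positive.
likert_levels <- c("Strongly disagree", "Disagree", "Neutral",
                   "Agree", "Strongly agree")

# Example free-text answers; the last one is not on the assumed scale.
answer <- c("Agree", "Neutral", "Strongly agree", "Not applicable")

# match() returns each answer's position in the ordered labels,
# and NA for anything that is not one of them.
encoded_answer <- match(answer, likert_levels)

print(data.frame(answer, encoded_answer))
```

Answers that fall outside the assumed scale come back as NA rather than being forced onto it, which keeps free-text responses distinguishable from scaled ones.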

Usage

survey_stream(
  survey_dir = config::get("tessistream")$survey_dir,
  reader = survey_monkey
)

Arguments

survey_dir

directory of surveys to parse

reader

function(filename) that reads survey data; the only current reader is survey_monkey

Details

Simple dataset of all survey results.

Note

There is no way to irreversibly anonymize this data and still allow post-hoc joins. The secret in this case (the customer number) is stored openly in the database, the hashing algorithm is explained here, and the number of possible customer numbers is small, so brute-forcing the mapping is trivial.

The goal is only to make it harder to extract customer information from this table, so that anyone who does so is doing it deliberately.
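
The brute-force concern in this note can be sketched in a few lines of base R. Everything here is an assumption for illustration: toy_hash is a stand-in and not the package's hashing algorithm, and the range 1 to 9999 is a made-up customer number space. The point is only that any deterministic hash over a small, enumerable set can be inverted with a lookup table.

```r
# Stand-in hash (hypothetical, NOT the package's algorithm): a simple
# deterministic polynomial hash over the characters of each input.
toy_hash <- function(x) {
  vapply(as.character(x), function(s) {
    h <- 0
    for (code in utf8ToInt(s)) {
      h <- (h * 31 + code) %% 2147483647
    }
    sprintf("%08x", as.integer(h))
  }, character(1), USE.NAMES = FALSE)
}

# Assumed small space of possible customer numbers.
candidates <- 1:9999

# Hash every candidate once to build a reverse lookup table.
lookup <- data.frame(
  customer_no   = candidates,
  customer_hash = toy_hash(candidates)
)

# Hashes as they might appear in the survey table.
observed <- toy_hash(c(42, 1234))

# Recover the original customer numbers by matching on the hash.
recovered <- lookup$customer_no[match(observed, lookup$customer_hash)]
print(recovered)
```

The same lookup table is also what makes post-hoc joins possible: anyone who can hash the known customer numbers can merge them back onto the survey data.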