Before you can reshape or analyze your conjoint survey data, you
first need to import it into R. In
projoint, use the read_Qualtrics()
function to quickly read properly formatted Qualtrics files.
When exporting from Qualtrics:
⚡ If you skip selecting “Use choice text,” your conjoint data may fail to load properly!
read_Qualtrics()
Or, if using an example bundled with projoint:
## # A tibble: 518 × 218
## StartDate EndDate Status Progress
## <dttm> <dttm> <chr> <dbl>
## 1 2022-03-01 10:44:18 2022-03-01 10:44:43 IP Address 100
## 2 2022-03-01 10:44:06 2022-03-01 10:47:59 IP Address 100
## 3 2022-03-01 10:45:30 2022-03-01 10:49:03 IP Address 100
## 4 2022-03-01 10:52:18 2022-03-01 10:56:29 IP Address 100
## 5 2022-03-01 10:54:34 2022-03-01 10:57:30 IP Address 100
## 6 2022-03-01 10:56:51 2022-03-01 10:58:06 IP Address 100
## 7 2022-03-01 10:58:09 2022-03-01 11:00:45 IP Address 100
## 8 2022-03-01 11:01:43 2022-03-01 11:01:51 IP Address 100
## 9 2022-03-01 10:58:35 2022-03-01 11:03:44 IP Address 100
## 10 2022-03-01 11:00:14 2022-03-01 11:04:37 IP Address 100
## # ℹ 508 more rows
## # ℹ 214 more variables: `Duration (in seconds)` <dbl>, Finished <lgl>,
## # RecordedDate <dttm>, ResponseId <chr>, DistributionChannel <chr>,
## # UserLanguage <chr>, Q_RecaptchaScore <dbl>, Q1.2 <chr>, Q2.2 <chr>,
## # Q2.3 <chr>, Q2.4 <chr>, Q2.5 <chr>, Q2.6 <chr>, Q2.7 <chr>, Q2.8 <chr>,
## # Q2.9 <chr>, Q3.1 <chr>, Q4.2 <chr>, Q4.3 <chr>, Q4.4 <chr>, Q4.5 <chr>,
## # Q4.6 <chr>, Q4.7 <chr>, Q4.8 <chr>, Q4.9 <chr>, Q5.1 <chr>, Q6.1 <chr>, …
Preparing your data correctly is one of the most important steps in
conjoint analysis. Fortunately, the reshape_projoint()
function in projoint makes this easy.
Outcome naming & order (important)
- List .outcomes in the order questions were asked.
- If you have a repeated task, its outcome must be the last element.
- For base tasks (all but last), the function reads the digits in each name as the task id (e.g., "choice4", "Q4", "task04" → task 4).
- The repeated base task is inferred from the first base outcome’s digits. The repeated outcome itself need not contain digits—only its position (last) matters.
- Outcome strings should end with your choice labels; by default we parse the last character and expect "A"/"B". If your survey uses "1"/"2" (or other endings), set .choice_labels accordingly.
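The naming rules above can be illustrated with a few lines of base R. This is only a sketch that mimics the described behavior, not projoint's internal code; the response strings ("Community A", "Community B") and all variable names are hypothetical.

```r
# Illustrative only: mimics the naming rules described above in base R.
# This is not projoint's internal code.
base_outcomes <- c("choice4", "Q4", "task04")

# Keep only the digits in each name and read them as the task id.
task_ids <- as.integer(gsub("[^0-9]", "", base_outcomes))
print(task_ids)

# Hypothetical response strings; the last character is the choice label.
responses <- c("Community A", "Community B")
last_char <- substr(responses, nchar(responses), nchar(responses))
print(last_char)

# Map the parsed label onto the expected choice labels (default "A"/"B").
choice_labels <- c("A", "B")
profile_chosen <- match(last_char, choice_labels)
print(profile_chosen)
```

If a survey's responses ended in "1"/"2" instead, the same logic would apply with choice_labels set to c("1", "2"), which is what the .choice_labels argument is for.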
outcomes <- paste0("choice", 1:8)
outcomes1 <- c(outcomes, "choice1_repeated_flipped")
out1 <- reshape_projoint(
  .dataframe = exampleData1,
  .outcomes = outcomes1,
  .choice_labels = c("A", "B"),
  .alphabet = "K",
  .idvar = "ResponseId",
  .repeated = TRUE,
  .flipped = TRUE
)
Key Arguments:
- .outcomes: Outcome columns (include the repeated task last)
- .choice_labels: Profile labels (e.g., “A”, “B”)
- .idvar: Respondent ID variable
- .alphabet: Variable prefix (“K”)
- .repeated, .flipped: Whether a repeated task exists and whether it is flipped
Not-Flipped Repeated Task
outcomes <- paste0("choice", 1:8)
outcomes2 <- c(outcomes, "choice1_repeated_notflipped")
out2 <- reshape_projoint(
  .dataframe = exampleData2,
  .outcomes = outcomes2,
  .repeated = TRUE,
  .flipped = FALSE
)
No Repeated Task
.fill Argument: Should You Use It?
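Use .fill = TRUE to “fill” missing values of agree based on intra-respondent reliability (IRR) agreement: the agreement observed on the repeated task is carried over to the respondent's other tasks.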
fill_FALSE <- reshape_projoint(
  .dataframe = exampleData1,
  .outcomes = outcomes1,
  .fill = FALSE
)
fill_TRUE <- reshape_projoint(
  .dataframe = exampleData1,
  .outcomes = outcomes1,
  .fill = TRUE
)
Compare:
selected_vars <- c("id", "task", "profile", "selected", "selected_repeated", "agree")
fill_FALSE$data[selected_vars]
## # A tibble: 6,400 × 6
## id task profile selected selected_repeated agree
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 R_00zYHdY1te1Qlrz 1 1 1 1 1
## 2 R_00zYHdY1te1Qlrz 1 2 0 0 1
## 3 R_00zYHdY1te1Qlrz 2 1 1 NA NA
## 4 R_00zYHdY1te1Qlrz 2 2 0 NA NA
## 5 R_00zYHdY1te1Qlrz 3 1 1 NA NA
## 6 R_00zYHdY1te1Qlrz 3 2 0 NA NA
## 7 R_00zYHdY1te1Qlrz 4 1 0 NA NA
## 8 R_00zYHdY1te1Qlrz 4 2 1 NA NA
## 9 R_00zYHdY1te1Qlrz 5 1 1 NA NA
## 10 R_00zYHdY1te1Qlrz 5 2 0 NA NA
## # ℹ 6,390 more rows
fill_TRUE$data[selected_vars]
## # A tibble: 6,400 × 6
## id task profile selected selected_repeated agree
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 R_00zYHdY1te1Qlrz 1 1 1 1 1
## 2 R_00zYHdY1te1Qlrz 1 2 0 0 1
## 3 R_00zYHdY1te1Qlrz 2 1 1 NA 1
## 4 R_00zYHdY1te1Qlrz 2 2 0 NA 1
## 5 R_00zYHdY1te1Qlrz 3 1 1 NA 1
## 6 R_00zYHdY1te1Qlrz 3 2 0 NA 1
## 7 R_00zYHdY1te1Qlrz 4 1 0 NA 1
## 8 R_00zYHdY1te1Qlrz 4 2 1 NA 1
## 9 R_00zYHdY1te1Qlrz 5 1 1 NA 1
## 10 R_00zYHdY1te1Qlrz 5 2 0 NA 1
## # ℹ 6,390 more rows
Tip:
- Use .fill = TRUE for small-sample or subgroup analysis
(helps increase power).
- Use .fill = FALSE (default) when in doubt for safer
estimates.
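To make the trade-off concrete, here is a toy base-R sketch of the idea behind filling, as suggested by the two outputs above. It is an illustration under stated assumptions, not the package's implementation: the data frame toy, its values, and the helper fill_within() are all hypothetical.

```r
# Toy data (hypothetical): two respondents, three tasks each, one row per task.
# Only task 1 was repeated, so `agree` is observed there and NA elsewhere.
toy <- data.frame(
  id    = rep(c("r1", "r2"), each = 3),
  task  = rep(1:3, times = 2),
  agree = c(1, NA, NA, 0, NA, NA)
)

# Copy each respondent's observed agreement to all of that respondent's tasks.
fill_within <- function(x) {
  observed <- x[!is.na(x)]
  if (length(observed) == 0) return(x)
  rep(observed[1], length(x))
}
toy$agree_filled <- ave(toy$agree, toy$id, FUN = fill_within)

print(toy)

# Rows with a non-missing agreement value before and after filling.
print(c(before = sum(!is.na(toy$agree)), after = sum(!is.na(toy$agree_filled))))
```

Filling makes more rows usable, which is where the extra power comes from, but it assumes that agreement on the one repeated task carries over to that respondent's other tasks.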
If you already have a clean dataset, use
make_projoint_data():
out4 <- make_projoint_data(
  .dataframe = exampleData1_labelled_tibble,
  .attribute_vars = c(
    "School Quality", "Violent Crime Rate (Vs National Rate)",
    "Racial Composition", "Housing Cost",
    "Presidential Vote (2020)", "Total Daily Driving Time for Commuting and Errands",
    "Type of Place"
  ),
  .id_var = "id",
  .task_var = "task",
  .profile_var = "profile",
  .selected_var = "selected",
  .selected_repeated_var = "selected_repeated",
  .fill = TRUE
)
Preview:
## <projoint_data>
## - data: 6400 rows, 13 columns
## - labels: 24 levels, 4 columns
To reorder or relabel attributes:
- Edit the CSV (change order, label columns; leave level_id untouched).
- Save it as “labels_arranged.csv” or something else.
Reload labels:
Compare using our example: