As an actuary, you can intuitively pick up patterns from looking at data. After helping hundreds of actuaries prepare for Exam PA in our online course, I have noticed which types of questions many people get stuck on. These are the questions that require the most practice: data manipulation and efficient writing. There are no “shortcuts” for these problems, only practice, practice, and more practice.

## If you took PA in June:

Your official exam results won’t be available for another month, on September 4th. This is a long time to wait! Our post-exam surveys may help you determine if you passed.

## If December will be your first sitting:

You can benefit from studying every inch of the syllabus. Each exam includes questions that have never been seen before.

## Here are all of the questions that were asked in June:

As you can see, each day had variations of the same collection of questions. June 17 and 18 are not shown because there was uncertainty about which questions people were remembering when the survey was administered.

| Specific Task | Points |
| --- | --- |
| Task 1 – Data Exploration | 8 |
| Task 2 – Explore target variable (TV) | 12 |
| Task 3 – Missing Values | 4 |
| Task 4 – Correlation matrix and recommending a method | 6 |
| Task 5 – Principal Component Analysis (PCA) | 10 |
| Task 6 – Recommend a distribution and link function | 4 |
| Task 7 – Construct a GLM | 8 |
| Task 8 – Stepwise Selection | 4 |
| Task 9 – Elastic net regularization | 7 |
| Task 10 – Decision tree | 10 |
| Task 11 – Cost complexity pruning | 4 |
| Task 12 – Recommend a model | 5 |
| Task 13 – Executive Summary | 20 |

| Specific Task | Points |
| --- | --- |
| Task 1 – Data Exploration | 6 |
| Task 2 – Explore target variable (TV) | 15 |
| Task 3 – Find variables which may not be appropriate | 4 |
| Task 4 – Write a summary of data modifications from Tasks 1-3 | 6 |
| Task 5 – Explore a decision tree and choose a cp parameter | 10 |
| Task 6 – Explain the bias-variance tradeoff | 4 |
| Task 7 – Explain the binomial and gamma distributions | 5 |
| Task 8 – Fit a Poisson GLM without PCA | 8 |
| Task 9 – Fit a Lasso model | 7 |
| Task 10 – Compare the two GLMs and select Lasso | 6 |
| Task 11 – Run the GLM using the glm() function | 4 |
| Task 12 – Executive Summary | 20 |

## What questions did people have the most difficulty with?

We asked each candidate to rank how well prepared they felt for each question on a scale of 1 to 4 and then took the average. The average was not weighted by the number of points per question.
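To make the averaging concrete, here is a minimal sketch of an unweighted per-task average. It is written in Python for illustration (the exam itself uses R), and the ratings and task names below are made-up placeholders, not actual survey responses.

```python
from statistics import mean

# Hypothetical 1-4 preparedness ratings from three candidates (illustrative only).
ratings = {
    "Task 1": [4, 3, 4],
    "Task 2": [2, 3, 1],
}

def average_rating(scores):
    """Simple unweighted mean of the 1-4 ratings for one task."""
    return mean(scores)

# One average per task; the points each task is worth play no role here.
averages = {task: average_rating(scores) for task, scores in ratings.items()}

# The task with the lowest average is the one candidates felt least prepared for.
hardest = min(averages, key=averages.get)
print(averages, hardest)
```

Because the points per task are ignored, a 4-point task and a 20-point task count equally in this ranking.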

The big picture is that many people lose points on communication. This is a communication exam first and foremost. The Project Statement asks candidates to **use non-technical language**, but many people still used statistical verbiage or were unable to explain concepts in plain English. Our video tutorial on Writing the Executive Summary gives you practice at this.

Data manipulation is always a source of mistakes, and to make it doubly painful, a mistake in the **data manipulation** has a chain-reaction effect, causing later model results to be incorrect. See chapter 6 – Data Manipulation of our Free Study Guide for a data manipulation tutorial from the SOA December 2020 Exam PA. Our online course has tutorials that teach you how to manipulate data like a data scientist, using modern R libraries. There is no need to be an “R Expert” to get 100% on the data manipulation questions.

The most difficult new questions, where people expressed the greatest uncertainty in their answers, were about the **interpretation of a decision tree**, **handling missing values**, and **interpreting a correlation matrix**.

The question about the decision tree was to be expected because decision trees appear frequently on the other exams, but the questions about missing values and correlations caught a number of people by surprise.

The topic of missing values did appear on the SOA Exam December syllabus under Data Types and Exploration:

Learning Outcomes:

c) Understand basic methods of handling missing data.

But no questions on it had appeared on prior exams. In the past, the SOA has given this part to you for free by saying “your assistant has already removed missing values”. It is also worth noting that the word “correlation” doesn’t appear anywhere on the syllabus.
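For readers unsure what “basic methods of handling missing data” might look like, here is a minimal sketch of two common approaches: dropping records with a missing value, and filling the gap with the median of the observed values. The syllabus excerpt above does not say which methods it has in mind, so treat this choice as an assumption. The sketch is in Python for illustration (the exam itself uses R), and the records, column names, and helper functions are hypothetical.

```python
from statistics import median

# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "claims": 2},
    {"age": None, "claims": 0},
    {"age": 51, "claims": 1},
]

def drop_missing(rows, column):
    """Option 1: remove every record where `column` is missing."""
    return [row for row in rows if row[column] is not None]

def impute_median(rows, column):
    """Option 2: replace missing values in `column` with the observed median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = median(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]

dropped = drop_missing(records, "age")
imputed = impute_median(records, "age")
print(len(dropped), imputed[1]["age"])
```

Dropping is simplest but throws away the rest of the record. Imputing keeps every record but inserts a value that was never observed, so either choice should be explained in the report.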

I thought that all tasks except for the Executive Summary would allow for technical language, as the target audience was someone familiar with predictive analytics concepts. The Executive Summary was of course to be written for a non-technical audience.