Project 3 is sometimes assigned as a group project. Based on student feedback in previous terms, I will also allow it as an individual project, so you may complete it on your own or as part of a team. If you complete it as an individual, of course, you must take on all group roles!
Below are the “team” instructions. If you prefer to work alone, you must complete the whole project on your own.
The first part of your assignment (in which you can lock in 5 points!) is to decide who will take on each of the roles.
Each team will have:
- One leader (automatically assigned) in charge of coordinating the team and helping make sure that each person does their part.
- One member will be a proofreader, in charge of the final check at the end.
- One member will be in charge of putting the final touches (in Tableau, Excel, or elsewhere) on any graphics included in your report.
- Four-member teams will also have the luxury of a software guru, someone who commits to making sure the regressions are being run properly by checking (for instance) Excel answers vs. Statgraphics and making sure they agree.
- All team members will contribute to a rough draft and team decision making.
Deliverable #1: Send me (your instructor) an email with each group member’s name and the role they will take on the team. Cc all members of the group. (Example: “Sally is leader, Tom’s finalizing the graphs, Ursula will proofread and Vivian has agreed to check the software.”) [5 points]
After you send me your first deliverable, you will choose a state and use its data file to predict household income (HINCP) from other variables in the data set using multiple linear regression. Your final model should have a reasonable R squared and an overall “significant” p-value. Do not use FINCP as a predictor! FINCP is family income, and saying that we can predict a household’s income if we know the family’s income is very nearly a circular argument. For the vast majority of Americans, family income and household income are the same thing. Similarly, do not use any predictor variable (like OCPIP) that is calculated using household income. If you already know household income (which is used to calculate OCPIP), then it should be pretty darn easy to predict household income! If any of your predictors have large p-values, be sure to justify why you are including them. Every student in the group should contribute to and comment on the body of the report, even if grammar and graph details are left to individual group members.
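For anyone who wants to cross-check their Statgraphics or Excel output, here is a minimal Python sketch of the idea above: fit a multiple linear regression for HINCP while leaving out FINCP and OCPIP, then compute R squared and the overall F statistic. The data are synthetic and invented purely for illustration, the predictor names X1 and X2 are placeholders (not real columns in your data file), and the helper fit_ols is hypothetical. The overall p-value would come from comparing the F statistic to an F distribution, which this sketch does not do.

```python
import numpy as np

def fit_ols(columns, response, excluded):
    """Least-squares fit of `response` on every column not in `excluded`."""
    names = [c for c in columns if c != response and c not in excluded]
    y = np.asarray(columns[response], dtype=float)
    # Design matrix: a column of ones (intercept) followed by the predictors.
    X = np.column_stack(
        [np.ones(len(y))] + [np.asarray(columns[c], dtype=float) for c in names]
    )
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    n, k = len(y), len(names)
    # Overall F statistic for the model (k predictors, n - k - 1 error df).
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return names, beta, r2, f_stat

# Synthetic stand-in for a state data file (all numbers are made up).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
hincp = 50000 + 12000 * x1 - 4000 * x2 + rng.normal(scale=15000, size=n)
columns = {
    "HINCP": hincp,
    "FINCP": 0.98 * hincp,                    # nearly identical to HINCP
    "OCPIP": 2e6 / (np.abs(hincp) + 1.0),     # derived from HINCP
    "X1": x1,
    "X2": x2,
}

# FINCP and OCPIP are excluded because they are (nearly) functions of HINCP.
names, beta, r2, f_stat = fit_ols(columns, "HINCP", excluded={"FINCP", "OCPIP"})
print("Predictors used:", names)
print("Intercept and slopes:", np.round(beta, 1))
print("R squared:", round(r2, 3), " F statistic:", round(f_stat, 1))
```

If you put FINCP back into the predictor list in this sketch, R squared jumps to essentially 100%, which is exactly the circularity the paragraph above warns about.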
To really impress, make a prediction for a particular household with a given set of predictor variables.
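A prediction of this kind is just the intercept plus each coefficient times that household's value for the matching predictor. The short Python sketch below shows the arithmetic; the intercept, the coefficients, the placeholder predictor names X1 and X2, and the helper predict_hincp are all hypothetical and are not taken from any real model or data file.

```python
def predict_hincp(intercept, coefficients, household):
    """Return the fitted HINCP for one household from a linear model."""
    missing = set(coefficients) - set(household)
    if missing:
        raise KeyError(f"household is missing predictor values: {sorted(missing)}")
    # Intercept plus the sum of (coefficient x predictor value).
    return intercept + sum(
        coefficients[name] * household[name] for name in coefficients
    )

# Hypothetical fitted model (numbers invented for illustration only).
intercept = 20000.0
coefficients = {"X1": 9000.0, "X2": -1500.0}

# One particular household, described by its predictor values.
household = {"X1": 3, "X2": 2}

prediction = predict_hincp(intercept, coefficients, household)
print(f"Predicted HINCP: ${prediction:,.0f}")  # prints "Predicted HINCP: $44,000"
```

In your report, substitute the coefficients from your own coefficient table and state clearly which household values you plugged in.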
Deliverable #2: Upload your report to Canvas by the deadline.
5 points: Model, including Statgraphics output with the ANOVA and coefficient tables but NOT any automated “explanation” such as Statgraphics’ “The StatAdvisor”.
5 points: 1-4 complete sentences describing your model
5 points: Graph. Remember your principles of data viz!
5 points: Comment on outliers, patterns [this exploration should also include a scatterplot matrix for your predictor variables: pairs() in R or plot>multivariate visualization>scatterplot matrix in Statgraphics]
5 points: R squared better than 25% (partial credit for getting close)
5 points: Describe each independent variable (what is it?). If your final model included fewer than four predictors, describe other variables you considered but did not include. Describe a minimum of four variables.
5 points: For each independent variable, speculate on its sign and value (why is it positive/negative? is it “big” or “small” in its effect?)
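As a numeric companion to the scatterplot matrix requested in the rubric, a table of pairwise correlations among your predictors can help you spot pairs that move together, which in turn can make coefficient signs and sizes harder to interpret. The sketch below is only an illustration, not a required step: the data are synthetic, the names X1, X2, and X3 are placeholders, and the helper correlation_table is hypothetical.

```python
import numpy as np

def correlation_table(columns):
    """Return predictor names and their pairwise Pearson correlation matrix."""
    names = list(columns)
    stacked = np.vstack([np.asarray(columns[name], dtype=float) for name in names])
    # np.corrcoef treats each row as one variable.
    return names, np.corrcoef(stacked)

# Synthetic predictors (made up): X2 is built to track X1 closely, X3 is unrelated.
rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=300)
x3 = rng.normal(size=300)

names, corr = correlation_table({"X1": x1, "X2": x2, "X3": x3})
for name, row in zip(names, corr):
    print(name, np.round(row, 2))
```

A correlation near 1 or -1 between two predictors (X1 and X2 here) is worth a comment in your discussion of outliers and patterns, and it is one reason a coefficient might have a surprising sign.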