Term 2 Statistics Project - A Statistical Case Study
The goal of this project is to create a statistical case study. The case study must include real-world data, a picture, a description of the data, and statistical questions that pertain to the data. In addition, you must make a webpage to host your case study and its data. A further benefit of hosting the case study and data on a webpage is that the data can then be loaded into RStudio for analysis.
This project must include the following:
1. REAL Data: This data may be:
   a) from a scientific website - feel free to investigate the sites found here.
   b) from your own survey
   c) collected from another class (as part of a physics experiment, for example)
   d) from another website (polling, etc.)
2. Once you have a data set of interest, please ask me (Mr. Simoneau) to review the data before proceeding. Understand that you must submit/create statistical questions that pertain to your data set. So think about what statistical questions you may be able to ask while choosing which data you would like to use.
PART B - UPLOAD DATA - INSTRUCTIONS TO UPLOAD DATA FROM GOOGLE DOCS INTO RSTUDIO:
1. Once your data is all set, make sure it contains no commas, % signs, or $ signs. Also make sure your text (variable names and labels such as college names) is properly formatted with NO spaces or symbols (for example, % is not allowed). Finally, make sure you are not missing any data! Make a best-guess estimate for any data you are missing.
2. In Google Docs, click "File", then "Download as", then choose the comma-separated values (.csv) option.
3. Once you have saved the file, go to RStudio. Click "Files" (this tab is not on the top right; it is in the window where you get plots and R help documentation), then "Upload", and upload your .csv file.
4. Once the data has been uploaded to RStudio, type the following in the command line:
   d=read.csv("yourfilenamehere.csv")   ### make sure to copy your file name exactly
   attach(d); names(d)   ## names(d) lists the names of all your variables
Here are additional instructions on loading data into RStudio: https://www.stats4stem.org/upload-google-docs-spreadsheet-to-rstudio
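After loading, it can help to confirm that the file was read the way you expect. The following is a minimal sketch, not part of the required steps: the file contents, the column names (College, Tuition, AcceptRate), and the variable names csv_path and missing_per_column are all made up for illustration. It writes a small temporary .csv so that it runs on its own; with your own project you would skip that part and pass your uploaded file name to read.csv.

```r
# Hypothetical example data, written to a temporary file so this sketch is self-contained.
# For your project, use your own uploaded file name instead of csv_path.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "College,Tuition,AcceptRate",
  "CollegeA,42000,35",
  "CollegeB,38500,NA",
  "CollegeC,51000,12"
), csv_path)

d <- read.csv(csv_path)

# List the variable names and show how R interpreted each column.
names(d)
str(d)

# Columns meant to hold numbers should report TRUE here. A FALSE usually
# means a stray symbol (such as % or $) or a comma is left in that column.
sapply(d, is.numeric)

# Count the missing values in each column; the project asks for none.
missing_per_column <- colSums(is.na(d))
print(missing_per_column)
```

If a column you expected to be numeric reports FALSE, or a column shows missing values, fix the spreadsheet and download the .csv again before starting your analysis. Writing d$Tuition is an alternative to attach(d) for reaching a single variable.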
PART C - DATA ANALYSIS
4 - If you are doing this project individually, you must propose and answer 7 questions. If you are a group of two, you must propose and answer 14 questions.
AP STUDENTS
For individual students, complete the following number of questions from each chapter. For groups of two, double this number.
1. Describing Data Numerically and Graphically (2 questions)
2. Normal Distribution (1 question)
3. Linear Regression (1 question)
4. Binomial and Geometric (1 question)
5. Random Variables (1 question)
6. Sampling Distribution (1 question)
REGULAR STATISTICS
For individual students, complete the following number of questions from each chapter. For groups of two, double this number.
1. Describing Data Numerically and Graphically (4 questions)
2. Normal Distribution (1 question)
3. Linear Regression (1 question)
4. Random Variables (1 question)
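As a rough illustration of the kind of R commands that could support questions from the first three chapters, here is a small sketch. It uses mtcars, a data set that ships with R, purely as a stand-in for your own data; the choice of variables (mpg, wt), the cutoff of 25, and the names fit, r, and p_below_25 are assumptions made for this example, not project requirements.

```r
# mtcars is built into R and is used here only as a stand-in for your own data set.
d <- mtcars

# Describing data numerically and graphically
summary(d$mpg)   # five-number summary plus the mean
sd(d$mpg)        # sample standard deviation
hist(d$mpg, main = "Miles per gallon", xlab = "mpg")
boxplot(d$mpg, horizontal = TRUE, xlab = "mpg")

# Linear regression: predict mpg from weight
fit <- lm(mpg ~ wt, data = d)
coef(fit)                  # intercept and slope of the fitted line
r <- cor(d$wt, d$mpg)      # correlation coefficient
plot(d$wt, d$mpg, xlab = "wt", ylab = "mpg")
abline(fit)                # draw the fitted line on the scatterplot

# Normal distribution: if mpg is modeled as normal with the sample mean and
# standard deviation, what proportion of values falls below 25?
p_below_25 <- pnorm(25, mean = mean(d$mpg), sd = sd(d$mpg))
p_below_25
```

Whether a normal model is reasonable for a given variable is itself a question worth checking with the histogram before relying on pnorm.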
5. To complete your project, please refer to the following Google Doc. Make your own copy and modify it as needed.