Data & Statistics Projects

 

Project 1

Data Gathering & Presentation

Austin Peay Building at Fort Campbell

Project 1 Report

Meetings:  Each group should meet before each report is due. Meetings prior to the pre-project due date will not count toward Project 1.

1st Draft:  Before meetings and at least 48 hours before the report's due date on the syllabus, each group member should post a draft version of their part of the report in their D2L group discussions.  Before meetings and at least 24 to 36 hours before each report is due, the group leader should post to the group project's assignments folder a draft of the entire report.  Not all group members need to be involved in the meeting when schedules make that impractical.  All students are, however, required to contribute to each report to receive credit.

All group members will be responsible for grading the assignments-folder-posted first draft of the report with the rubric (attached to the Project 1 folder in D2L's Assignments area) and editing the project whenever the rubric is not met clearly, accurately, and completely.  The group should go through the entire report, discussing each part and matching each part to the rubric.  Each group member is responsible for the accuracy of all calculations; the clarity, completeness, and accuracy of the answers to all questions; and the overall report presentation, style, and quality.  If any group member feels a part of the report does not meet rubric specifications, is incorrect, or is lacking in any way, that part of the report should be discussed and revised by the group at this group meeting.

Evaluate your peers by submitting one peer review for each member of your group.  Individual grades will greatly depend on turning in your own evaluations by the due date and on the evaluations of you by fellow group members.

The group leader will immediately (within 2 to 3 days of each previous project or pre-project being due) assign all parts to the various group members, keeping the Wrapping It Up part for himself or herself and splitting the remaining parts as equally as possible based on the point distribution in the rubric and the part divisions already assigned in the project description.  The group leader is responsible for ensuring group members post their respective assigned parts of the project at least two days prior to the project due date - or even earlier if the leader specifies an earlier time when assigning parts, at the leader's discretion.  If a group member is late, the leader will reassign that member's part to active group members and will let the instructor know immediately.

Other members will be responsible for completing all assigned parts at least two days prior, or even earlier, if specified, and posting those parts to the group discussion board.  If the group leader has not assigned roles for the project within two to three days of the previous project or pre-project being due, then the group should let the instructor know immediately.  Any student who does not completely understand his or her assigned part should immediately ask for clarification and help from group members.  Any member turning in his or her part late may receive a zero for the current project if this part was reassigned to and completed by another member because of the lateness.

Report Expectations

In formal, professional reporting style, the full report for Project 1 needs to be written to include answers to everything in all questions, although the questions and question numbers should not be included. This report should read as if the reader has no knowledge of these questions nor any knowledge that the questions have prompted the writing of the report. The entire report should be in complete sentences and full paragraphs with the exception of tables, charts, graphs, equations, and lists of statistics. Everything from formatting to grammar to style should be as professional as possible.

To be successful with this project, students should:

  1. Present a neat, organized and clearly communicated project document.  Students have the latitude to present work on this project in any professional manner.
  2. When rounding, give at least three significant digits for final answers and at least four significant digits for answers that will be used in future calculations.
  3. To ensure you understand how to interpret Minitab output, use only Minitab output and not the calculator or Excel or other programs for all calculations, except for simple arithmetic.
  4. Don't assume what the question is asking unless certain.  Many students get questions wrong that are exemplified in the textbook and course notes.  Looking up items can dramatically improve the group grade.  There is an index in the back of your book and at the end of the digital textbooks in CourseCompass where one can easily jump to the page or pages where each term is discussed.  Please feel free to ask questions in the Data & Statistics Projects discussion, especially when clarity is needed on what is being asked.
  5. Choose a group leader who will immediately assign roles and parts of the project to individuals, dividing up the project as equally as possible.  Choose a backup group leader who would be able to step in and fill the leader role should anything happen to the leader.
  6. Students who don't complete and post assignments to group discussions at least two days prior to the due date for the pre-project report and Project 1 will receive between 0% and 50% of the group project grade.
  7. Reports should be professionally written and completely without reference to this set of instructions or questions. Even though all questions and parts listed here should be addressed using the same or similar terminology within the report, preferably in order, the reader of the report should not be aware that the report was guided by a set of questions or instructions.
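
The rounding guideline in item 2 can be illustrated with a short Python sketch.  This is for illustration only (all project calculations must still come from Minitab), and the function name round_sig and the sample value are made up for this example:

```python
from math import floor, log10


def round_sig(value: float, digits: int) -> float:
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 1 for 12.3 and -3 for 0.00123.
    exponent = floor(log10(abs(value)))
    return round(value, digits - 1 - exponent)


raw = 12.34567                      # hypothetical computed statistic
intermediate = round_sig(raw, 4)    # keep 4 digits if it feeds a later calculation
final = round_sig(raw, 3)           # report 3 digits as a final answer
print(intermediate, final)
```

Note that significant digits are counted from the first nonzero digit, so 0.0012345 reported to three significant digits is 0.00123, not 0.001.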

Instructions for Minitab Only

Note: Minitab should be completed all at once by one member of the group - or possibly by the group as a whole during a face-to-face meeting.  Splitting parts of Minitab is far from ideal since Minitab parts cannot be copied and pasted from file to file.  You can access the Minitab app from the Minitab module in D2L's Content area or by going to https://app.minitab.com/.

1 - After opening and logging into Minitab, https://app.minitab.com/, click on File (upper left) and then New and then Report.  Right-click on the "Untitled Report" that now appears in the Navigator pane (far left) and select Rename.  Type "Project 1" as the name of this report.

Click on the dropdown triangle in the upper right area of the Project 1 report tab and select "Insert Row."  Click on the dropdown triangle to the upper right of this new row and select "Insert Annotation."  Where the report now says, "type annotation here," type (or copy and paste and edit) the following:

            Your Group Name
            List of Group Members' Names
            MATH 1530 Elements of Statistics
            Project #1: Minitab
            Due Date:  (actually type the due date)

2 - For all of the quantitative data posted in group discussions, make sure to enter nothing but numbers in the cells.  In other words, if you have dollar signs, dashes, or any special characters, you must remove those. 

If you have data ranges such as 18-22, 23-27, 28-32 and so on, this will not be acceptable.  Take an average of the class by taking the lower limit of the class you are in (e.g., 18) and the upper limit (e.g., 22) and averaging these two:  20.  Replace all such data ranges with single values.
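
As a quick check of the midpoint arithmetic, here is a minimal Python sketch.  It assumes each range is written as two numbers separated by a single hyphen; the function name class_midpoint is made up for this example:

```python
def class_midpoint(label: str) -> float:
    """Return the average of the lower and upper class limits, e.g. '18-22' -> 20.0."""
    lower, upper = (float(part) for part in label.split("-"))
    return (lower + upper) / 2


ranges = ["18-22", "23-27", "28-32"]
midpoints = [class_midpoint(r) for r in ranges]
print(midpoints)  # [20.0, 25.0, 30.0]
```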

Your Row 1 will correspond with data from person 1, Row 2 with person 2, and so on.  Your columns will be C1 or C1-T, followed by C2 or C2-T, and so on.  The -Ts mean you have text in your columns, and these columns can only be used as categorical variables.  If you have quantitative data in a -T column, you have done something wrong, and you should start again with a new project.  If you have too much trouble, you may need to type things in, but I hope it won't get to that point.

Once all data are in and all columns correctly have -T or no -T, you should add concise column labels to represent the data.  For example, a question like "On a scale of 0 to 10 with 0 being epic fail (bad) and 10 being made of awesome (good), how would you rate the Harry Potter and the Deathly Hallows, Part I movie?" could be titled simply "Movie Rating."

Note:  Be sure that, for categorical data columns, each category is spelled exactly the same way with exactly the same spacing and capitalization.  If not, this will result in duplicate categories. 
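
To see why inconsistent entries create duplicate categories, consider the following Python sketch.  It is only an illustration with made-up responses, and it catches spacing and capitalization differences but not misspellings; the function name find_inconsistent_categories is hypothetical:

```python
from collections import Counter, defaultdict


def find_inconsistent_categories(values):
    """Group entries that differ only in spacing or capitalization."""
    groups = defaultdict(set)
    for value in values:
        # Collapse extra whitespace and ignore case to build a comparison key.
        key = " ".join(value.split()).casefold()
        groups[key].add(value)
    # Keep only the keys that were written in more than one way.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


responses = ["Freshman", "freshman", "Freshman ", "Senior", "Senior"]
print(Counter(responses))                       # three separate "Freshman" categories
print(find_inconsistent_categories(responses))
```

A tally of the raw responses counts "Freshman", "freshman", and "Freshman " (with a trailing space) as three different categories, which is exactly what a frequency table would show.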

3 - Stats, Histograms, and Boxplots:  Do Stat > Basic Statistics > Display Descriptive Statistics.  Inside variables, select all possible quantitative variables listed.  Click on Statistics and additionally check interquartile range, mode, range, and skewness and uncheck N*, SE Mean, and N for Mode.  Press OK once.  Click on Graphs.  Select Histogram of data and Boxplot of data.  Select OK.  Select OK again. 

This step will give you 1) a histogram, 2) a boxplot, and 3) columns of detailed statistics in the session window for every quantitative variable.  Each histogram and boxplot will have its own window.
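
For readers who want to see what the statistics in the session window represent, the following Python sketch computes several of them for a made-up data set.  It is an illustration only and not a substitute for the Minitab output.  It assumes the quartiles are found at positions based on n + 1 (the "exclusive" method in Python's statistics module); skewness is omitted because its formula varies between programs:

```python
import statistics


def describe(data):
    """Return a few descriptive statistics for a list of numbers."""
    # method="exclusive" places quartiles at positions p * (n + 1); assumed here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return {
        "mean": statistics.mean(data),
        "stdev": statistics.stdev(data),   # sample standard deviation (divides by n - 1)
        "min": min(data),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "modes": statistics.multimode(data),
    }


ratings = [2, 4, 4, 5, 7, 8, 9]   # hypothetical movie ratings
summary = describe(ratings)
for name, value in summary.items():
    print(name, value)
```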

4 - Frequency Tables:  Stat > Tables > Tally Individual Variables.  Select all quantitative and categorical variables.  Display Counts and Percents.  Select OK.

Note:  Check to make sure that each category is only listed one time.  If you find that a category appears multiple times, this is probably because of some spacing, spelling, or capitalization difference in the way you have this category written in your data column.  You will want to go through and correct this problem, making each category spelled exactly the same way with exactly the same spacing and capitalization.  Then repeat this step (Step 4).

5 - Pie Chart:  Graph > Pie Chart. Leave the chart of unique values selected.  Under categorical variables, choose all your categorical variables.  Select the Pie Options button and choose Decreasing volume.  Select Labels, choose the Slice Labels tab, and select category name and percent.  Select the Multiple Graphs button and choose On separate graphs.

For the rest of Minitab, choose your one quantitative data variable that looks the most bell-shaped, when comparing all of your histograms. 

6 - Construct a normal probability plot of this variable by selecting Graph > Probability Plot. Click on the single picture and OK.  In the box to the left, double-click your best bell-shaped quantitative variable. Click OK.

7 - To standardize, note the name of the first column that is empty (e.g., C6 or C7 or C8 or something like that).  Then select CALC > Standardize, choose your best bell-shaped quantitative variable for input columns, type in the name of your first blank column (e.g., C6, C7, C8, or whichever yours is) for storing results, and click OK.  Name the standardized data column z.

You should now have new data in the z column, which are the standardized values (z-scores) of every data value listed in your best bell-shaped quantitative data variable.  
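
The standardization itself is simple: subtract the mean from each value and divide by the standard deviation.  The Python sketch below shows the idea on made-up data.  It assumes the sample standard deviation (dividing by n - 1) is used, and it is for understanding only; the z column in your project must come from Minitab:

```python
import statistics


def standardize(data):
    """Convert each value to a z-score: (value - mean) / sample standard deviation."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)   # sample standard deviation (n - 1 in the denominator)
    return [(x - mean) / sd for x in data]


values = [2, 4, 4, 5, 7, 8, 9]    # hypothetical data values
z_scores = standardize(values)
for x, z in zip(values, z_scores):
    print(x, round(z, 3))
```

A z-score tells how many standard deviations a value lies above (positive) or below (negative) the mean, so the z-scores of any data set have a mean of 0 and a standard deviation of 1.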

8 - Let's do a frequency table for our z-scores:  Stat > Tables > Tally Individual Variables.  Select the z-score column.  Select OK.

9 - Assume your best bell-shaped quantitative data are normally distributed with a mean equal to the computed sample mean and a standard deviation equal to the computed sample standard deviation.  Click CALC > Probability Distributions > Normal, select the circle in front of Cumulative Probability, and type the given mean and standard deviation in the appropriate boxes.  Click the circle in front of Input Constant.  Type the constant (data value) representing the first quartile, Q1, of your data in the box next to Input Constant.  Click OK.  The probability will be displayed in the session window. Repeat using Q3 instead of Q1 as the input constant.
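
To help interpret the cumulative probabilities from this step, here is a minimal Python sketch using made-up values for the mean, standard deviation, Q1, and Q3.  It is an illustration only; the numbers in your report must come from the Minitab session window.  The function name normal_tail_probabilities is hypothetical:

```python
from statistics import NormalDist


def normal_tail_probabilities(mean, sd, q1, q3):
    """Return P(X < Q1) and P(X > Q3) under a normal model."""
    model = NormalDist(mu=mean, sigma=sd)
    below_q1 = model.cdf(q1)        # cumulative probability up to Q1
    above_q3 = 1 - model.cdf(q3)    # complement of the cumulative probability at Q3
    return below_q1, above_q3


# Hypothetical sample statistics, not taken from any real data set.
below, above = normal_tail_probabilities(mean=50, sd=10, q1=43, q3=57)
print(round(below, 4), round(above, 4))
```

A cumulative probability is always the area to the left of the input constant, so a "greater than" probability is found by subtracting the cumulative probability from 1.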

Save your project as Project1_GroupName (File > Save Project As) and write down where you saved it!  Submit only Project1_GroupName.mpx to the Minitab Assignments folder, and triple-check that the name of the file you've uploaded ends in .mpx.  If the file name does not end in .mpx, you may not receive credit for the Minitab part.

After loading your file, reopen straight from the Assignments folder to make sure that:

  1. All graphs (probability plot, histograms, boxplots, pie charts) are included,
  2. The worksheet with z-scores is included,
  3. The heading information with your name, date, and so on is included, and
  4. All Minitab analysis is included (frequency tables, descriptive statistics, and two cumulative distribution functions).
     

Instructions for Report Parts
------------

PART A:  Gathering - The Sample and Design

Once your pre-report is submitted, you can start working on the full report. 

1 - Detail how you conducted the survey, including who (target population), when, and where. Exact dates, times, and locations should be given for each survey collection. For the Project 1 group meeting (which should be different from the pre-project group meeting), state when and where it took place, who attended, and how long it lasted.

2 - Answer the following questions clearly so that your answers cannot be misunderstood and so that the reader will not doubt which question you are answering:

------------

PART B:  Organizing and Presenting Quantitative & Categorical Data (First Quantitative & First Categorical Variables)

1 - Make a bolded heading for the first quantitative variable. 

2 - Go to the course in D2L under Tasks and Assignments and Project 1 - Minitab and download the .mpx file to your Downloads folder.  Open the Minitab app and open the file you've downloaded from the assignments using the Minitab app. Under your heading in MS Word, copy and paste that variable's frequency table from Minitab, leaving the table completely unaltered.

3 - Copy the statistics row for that data from the Descriptive Statistics in the session window of Minitab, including column headings.  Descriptive Statistics should have rows for each data variable, but delete all of the rows except for the header row and the row for this first quantitative variable.  No other alterations should be made.

4 - Copy the histogram and boxplot for this data.

5 - Describe the distribution of the first quantitative variable including modality (Hint: Section 2.6 of the textbook), skewness, and symmetry.   

6 - Which measure of central tendency do you think best describes the first quantitative variable?  Why?

7 - Make a bolded heading for the first categorical variable. 

8 - Under that heading, copy and paste that variable's frequency table from Minitab, leaving the table completely unaltered.

9 - Copy the pie chart for this variable. 

10 - Which measure of central tendency do you think best describes the first categorical variable?  Why? 

------------

PART C:  Normal Distributions

HINT: Completing Chapter 6 before doing this report part is highly recommended.

In order to use the normal distribution and its associated area under the curve to compute expected percentages or probabilities, we must assume the data are reasonably normally distributed:  unimodal, symmetric, and without skew.  However, a histogram is not reliable when the data set is small because there may not be enough data values to reveal the shape of the distribution.  In situations such as these, another graph can be produced:  a normal probability plot.  This plot is a scatterplot that places your data on one axis and what has already been determined to be normally distributed data on the other axis.  If our data line up well with the normally distributed data, we can be safe in assuming our data are also normally distributed. (Hint:  Lining up well means that the data form a single, reasonably straight line.  Multiple, disconnected lines do not mean that the data line up well.)

1 - Go to the course in D2L under Tasks and Assignments and Project 1 - Minitab and download the .mpx file to your Downloads folder.  Open the Minitab app and open the file you've downloaded from the assignments using the Minitab app. Create a bolded heading in MS Word called The Most Normal Distribution. Copy and paste in the histogram from the data that your Minitab group member chose as the most normal variable, which is the quantitative variable that was your most bell-shaped. 

2 - Describe this histogram in terms of 1) modality, 2) symmetry, and 3) skewness. 

3 - Copy and paste your normal probability plot for this variable. 

4 - Do the data in this plot line up well with each other in one single line?  (Note:  Lining up in multiple, perfectly straight lines is not what we are talking about here.  Rather, do the data roughly form one single, approximately straight line?) 

5 - Does the normal probability plot indicate the data are normal, approximately normal, or not normal? 

6 - Copy and paste the tally for the standardized values (z-scores) from the z column you created. 

7 - Using the Empirical Rule's definition of outliers, are any of the data values potentially extreme values (outliers)?  If so, which ones are outliers?  Explain why you were able to conclude that you do or do not have outliers.

8 - Copy and paste the cumulative distribution functions for Q1 and Q3, leaving both completely unaltered from Minitab.

9 - The probability that a data value selected at random would be less than Q1 is ???. 

10 - The probability that a data value selected at random would be greater than Q3 is ???. 

11 - If the data were perfectly normal, what would the probability be of selecting a data value less than Q1?  More than Q3?  (Hint:  In other words, what proportion of the data do we expect to be below Q1 and above Q3 by the definitions of Q1 and Q3?)  


Wrapping It Up

Have a paragraph at the end detailing what exactly each group member did to contribute to the entire group effort.

After finishing the paper with this final Wrapping It Up paragraph, you will want to add a title page that includes a title for the paper, Math 1530, the date, the name of the group, and a list of the group members.  As you put the different parts together, be sure to check for formatting consistency, professionalism, and adherence to the report expectations above (at the top).

Save your MS Word document as Project1_GroupName.doc or Project1_GroupName.docx

Load the file to the assignments folder. 

Every group member should open the file straight from the assignments folder and grade it using the rubric (attached to the folder in D2L's Assignments area), correcting the report for every rubric item based on clarity, accuracy, and completeness.  Every group member is responsible for ensuring that the report meets every item in the rubric.  After grading and correcting the report, each member should post the new draft to group discussions with a list of the major changes made so that an appropriate individual grade may be determined. 

The group leader should ensure that the final draft gets loaded to D2L's Project 1 assignments folder well before the final deadline.  

Evaluate your peers by submitting one peer review for each member of your group.  Individual grades will greatly depend on turning in your own evaluations by the due date and on the evaluations of you by fellow group members.

------------