## Statistics homework help

Dr. Beeper, an educational psychologist who studies issues related to higher education, is interested in key factors that affect year-to-year persistence among college students. His review of the literature identifies several factors that appear to be causally related to persistence: academic aptitude, goal commitment, institutional commitment, and the number of work hours.
To test the importance of these factors, Dr. Beeper administers a set of questionnaires to 100 randomly selected first-time, full-time freshman college students (50 male and 50 female) who attended the Freshman Orientation in the Fall of 2016 at Newton Young University (NYU) in Nebraska.
Measures:
Institutional Commitment (IC) represents the importance that students place on graduating from the college they are currently attending. Institutional Commitment was measured with a five-item questionnaire. Each item was rated on a 0, 1, or 2 scale. The possible range of scale scores is 0 to 10, where values close to 0 indicate little to no importance and values close to 10 indicate high importance.
Goal Commitment (GC) represents the importance that students place on obtaining a college degree. Goal Commitment was also measured with a five-item questionnaire. Each item was rated on a 0, 1, or 2 scale. The possible range of scale scores is 0 to 10, where values close to 0 indicate that obtaining a college degree has little to no importance and values close to 10 indicate that graduating from college has high importance.
Academic Aptitude was represented as scores on both the SAT-Math and the SAT-Verbal tests.  SAT scores for all participants were obtained from high school transcripts.
Hours Worked represents the number of hours the student expected to work throughout the semester.
Finally, year-to-year persistence was determined by examining the enrollment records for the sample of 100 students. A student who was registered for Fall 2017 classes was classified as a “Persister” and given a code of 1. A student who did not re-enroll for classes at NYU or any other college/university (based on follow-up phone interviews) was considered a “Non-persister” and given a code of 0. Therefore, the SPSS variable Persist has two levels, 0 and 1.
The assignment is to use the attached SPSS data file to conduct a binary logistic regression analysis in which IC, GC, SAT-Math, SAT-Verbal, and Hours Worked are the predictor variables (covariates in SPSS) and the variable Persist is the outcome (DV in SPSS). Use my sample summary as a model for your summary.
The specific elements of the assignment are:
1) State the null and alternative hypotheses for the logistic regression analysis
2) State the goals of the analysis
3) Summarize the results and interpret the findings for the overall model (for example, the chi-square results and the Nagelkerke R-square or Cox & Snell R-square).
4) Summarize and interpret the results for each predictor, and present, summarize, and interpret the results for each significant predictor (i.e., B, Wald’s test, df, p, and OR (Exp(B))). Interpret each significant OR using the effect size conventions I posted in last week’s (Week 8) discussion board.
5) Include and refer to the appropriate tables within the summary.
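As a rough illustration of element 4, the sketch below shows how the quantities reported for a predictor relate to one another: Wald's chi-square is (B / SE)² with df = 1, and the odds ratio is Exp(B). The coefficient and standard error are hypothetical values chosen for illustration; they are not taken from the assignment data or the sample summary.

```python
import math
from statistics import NormalDist

def summarize_predictor(b, se, alpha=0.05):
    """Return Wald chi-square, p-value, odds ratio, and its CI for one coefficient."""
    wald = (b / se) ** 2                       # Wald chi-square, df = 1
    z = abs(b / se)
    p = 2 * (1 - NormalDist().cdf(z))          # same tail area as chi-square(1)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    odds_ratio = math.exp(b)                   # Exp(B) in the SPSS output
    ci = (math.exp(b - z_crit * se), math.exp(b + z_crit * se))
    return {"wald": wald, "p": p, "or": odds_ratio, "ci": ci}

# Hypothetical coefficient and standard error, for illustration only
result = summarize_predictor(b=0.40, se=0.15)
print(result)
```

An odds ratio above 1 means the odds of persisting rise as the predictor increases by one unit; a confidence interval that excludes 1 agrees with a significant Wald test at the same alpha.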
Please read my sample summary to see what statistics to report, how to report and interpret them in correct APA style, and which tables to include.
You’ll see that my sample summary also includes t-tests. You may want to conduct t-tests that compare “persisters” and “non-persisters” on the predictor variables (covariates). Please note that the t-tests are optional and will have no impact on your grade whether you include them or not. The t-tests are very informative about the bivariate relationship between the predictor variables (covariates in SPSS) and the binomial outcome (DV in SPSS).
Please note that you are not required to conduct the t-tests, or to compute and report Cohen’s d.
Here’s the syntax for my sample summary.
T-TEST GROUPS=BO(0 1)
/MISSING=ANALYSIS
/VARIABLES=teachsat ressat wkoverld
/CRITERIA=CI(.95).
LOGISTIC REGRESSION VARIABLES BO
/METHOD=ENTER teachsat ressat wkoverld
/PRINT=GOODFIT CI(95)
/CRITERIA=PIN(0.05) POUT(0.10) ITERATE(20) CUT(0.5).
• Week5CollegePersistence.sav
• APASummaryforLogisticalRegression_2_pratice5.pdf

## Statistics homework help

Write a 2- to 3-page critique of the research you found in the Walden Library that includes responses to the following prompts:

• Why did the authors select binary logistic regression in the research?
• Do you think this test was the most appropriate choice? Why or why not?
• Did the authors display the results in a figure or table?
• Does the results table stand alone? In other words, are you able to interpret the study from it? Why or why not?
• DifferentialAnalysisofDiseaseRisk.pdf

## Final Project Assignment Instructions

### Scenario Background:

A marketing company based out of New York City is doing well and is looking to expand internationally. The CEO and VP of Operations decide to enlist the help of a consulting firm that you work for, to help collect data and analyze market trends.
You work for Mercer Human Resources. The Mercer Human Resource Consulting website lists prices of certain items in selected cities around the world. They also report an overall cost-of-living index for each city compared to the costs of hundreds of items in New York City (NYC). For example, London at 88.33 is 11.67% less expensive than NYC.
More specifically, if you choose to explore the website further you will find a lot of fun and interesting data. You can explore the website more on your own after the course concludes.
https://mobilityexchange.mercer.com/Insights/cost-of-living-rankings#rankings
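The London figure in the scenario can be reproduced with a one-line calculation. This is a minimal sketch that assumes NYC is the baseline at an index of 100, which the London example implies but the scenario does not state outright; the function name is illustrative.

```python
def percent_vs_nyc(index, baseline=100.0):
    """Percent difference from the NYC baseline; negative means cheaper than NYC."""
    return (index - baseline) / baseline * 100

london = percent_vs_nyc(88.33)
print(f"London: {london:+.2f}% relative to NYC")
```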

### Assignment Guidance:

In the Excel document, you will find the 2018 data for 17 cities in the data set Cost of Living. Included are the 2018 cost of living index, cost of a 3-bedroom apartment (per month), price of monthly transportation pass, price of a mid-range bottle of wine, price of a loaf of bread (1 lb.), the price of a gallon of milk and price for a 12 oz. cup of black coffee. All prices are in U.S. dollars.
You use this information to run a Multiple Linear Regression to predict the cost of living, along with calculating various descriptive statistics. These results are given in the Excel output; that is, the MLR has already been calculated, and your task is to interpret the data.
Based on this information, in which city should you open a second office? You must justify your answer. If you want to recommend 2 or 3 different cities and rank them based on the data and your findings, this is fine as well.

### Deliverable Requirements:

This should be ¾ to 1 page, no more than 1 single-spaced page in length, using 12-point Times New Roman font. You do not need to do any calculations, but you do need to pick a city in which to open a second location and justify your answer based on the provided results of the Multiple Linear Regression.
The format of this assignment will be an Executive Summary. Think of it as the first page of a much longer report, one that summarizes your findings briefly and at a high level. It needs to be written up neatly and professionally, as something you would present at a board meeting in a corporate environment. If you are unsure what an Executive Summary is, this resource can help with an overview: What is an Executive Summary?

### Things to Consider:

To help you make this decision here are some things to consider:

• Based on the MLR output, what variable(s) is/are significant?
• For the significant predictors, review the mean, median, min, max, Q1 and Q3 values.
• It might be a good idea to compare these values to what the New York value is for that variable. Remember New York is the baseline as that is where headquarters are located.
• Based on the descriptive statistics, for the significant predictors, what city has the best potential?
• What city or cities fall below the median?
• What city or cities are above the third quartile (Q3)?
• Math302FinalProject.xlsx
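The comparisons suggested above can be sketched as follows. The city names and rent figures are hypothetical placeholders, not values from Math302FinalProject.xlsx, and the choice of rent as the variable is only an assumption for the example. The `inclusive` quantile method uses the same linear interpolation as Excel's QUARTILE.INC, so the cut points should agree with the spreadsheet.

```python
from statistics import mean, median, quantiles

# Hypothetical monthly rents in USD, for illustration only
rents = {
    "New York": 5800, "City A": 2100, "City B": 3400,
    "City C": 1500, "City D": 4200, "City E": 2900, "City F": 3800,
}

values = sorted(rents.values())
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

print("mean:", round(mean(values), 2), "median:", median(values))
print("min:", values[0], "max:", values[-1], "Q1:", q1, "Q3:", q3)

# Cities cheaper than the median, and cities above the third quartile
below_median = [city for city, v in rents.items() if v < q2]
above_q3 = [city for city, v in rents.items() if v > q3]
print("below median:", below_median)
print("above Q3:", above_q3)

# Difference from the New York baseline for each city
vs_new_york = {city: v - rents["New York"] for city, v in rents.items()}
print(vs_new_york)
```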

## Statistics homework help

Before beginning work on this week’s discussion post, review the following resources:

From the below list, select one topic for which you will lead the discussion in the forum this week. Early in the week, reserve your selected topic by posting your response (reservation post) to the Discussion Area, identifying key words about your topic in the subject line.
By the due date assigned, research your topic and start a scholarly conversation as you respond with your initial or primary post to your own reservation post in the Discussion Area. Make sure your response does not duplicate your colleagues’ responses:
Topic:

• Distinguish between parametric and non-parametric tests.

As the beginning of a scholarly conversation, your initial post should be:

• Succinct—no more than 500 words.
• Provocative—use concepts and combinations of concepts from the readings to propose relationships, causes, and/or consequences that inspire others to engage (inquire, learn). In other words, take a scholarly stand.
• Supported—scholarly conversations are more than opinions. Ideas, statements, and conclusions are supported by clear research and citations from course materials as well as other credible, peer-reviewed resources.

## Statistics homework help

Research Question:
As the Director of Academic Assessment at City University (CU), you are asked to determine whether the SAT is a significant predictor of persistence among first-year college students. The sample consisted of the 100 freshmen who enrolled at CU as first-time, full-time students in the Fall 2019 semester.
Because the SAT scores for the incoming class of students were significantly and severely skewed, thus violating the assumption of normality, you dichotomized the students into two SAT groups coded 0 and 1. Students who scored below the median score of 1000 on the SAT were given a code of 0 and classified as Low SAT students. Students who scored 1000 or above on the SAT were given a code of 1 and classified as High SAT students.
The outcome measure (DV) was year-to-year persistence, specifically whether or not the student re-enrolled as a full-time student in the Fall 2020 semester. Students who dropped out of college and did not enroll in classes during the Fall 2020 semester were assigned a code of 0 on the persistence measure. Students who enrolled full-time in the Fall of 2020, the “persisters”, were assigned a code of 1 on the persistence measure.
Analyze the data to determine whether SAT classification (High or Low) is significantly related to persistence.
Using the attached SPSS dataset, construct a 2×2 contingency table using crosstabs, and then conduct a simple binary logistic regression analysis in which the variable Persist is the dependent variable and the variable SAT is the covariate (predictor) variable.
Be sure to present and discuss the significance of the overall analysis (model).  Be sure to present and discuss:

• Conditional probabilities
• Conditional odds
• Logits
• Odds ratios
• Relative risk
• Slope
• Intercept
• The results for the overall model (Chi-square test and R-square measures)
• The results for Wald’s test.
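Most of the quantities in this list can be derived by hand from the 2×2 table, which is a useful check on the SPSS output. The sketch below uses hypothetical cell counts (not the counts in Week4SATMandPersistence.sav) and relies on the fact that, with a single 0/1 predictor, the fitted intercept equals the logit for the group coded 0 and the slope equals the natural log of the odds ratio.

```python
import math

# Hypothetical 2x2 counts, for illustration only; replace with the crosstabs output
counts = {
    0: {"dropped": 30, "persisted": 20},   # Low SAT  (SAT = 0)
    1: {"dropped": 15, "persisted": 35},   # High SAT (SAT = 1)
}

def describe(group):
    n = group["dropped"] + group["persisted"]
    p = group["persisted"] / n             # conditional probability of persisting
    odds = p / (1 - p)                     # conditional odds of persisting
    return p, odds, math.log(odds)         # logit = ln(odds)

p_low, odds_low, logit_low = describe(counts[0])
p_high, odds_high, logit_high = describe(counts[1])

odds_ratio = odds_high / odds_low          # High SAT relative to Low SAT
relative_risk = p_high / p_low
intercept = logit_low                      # B0: logit when SAT = 0
slope = logit_high - logit_low             # B1: equals ln(odds ratio)

print(f"P(persist | low) = {p_low:.2f}, P(persist | high) = {p_high:.2f}")
print(f"odds low = {odds_low:.3f}, odds high = {odds_high:.3f}")
print(f"OR = {odds_ratio:.2f}, RR = {relative_risk:.2f}")
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

The overall chi-square test, the R-square measures, and Wald's test still come from the SPSS logistic regression output; this sketch covers only the quantities that follow directly from the table.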

Also be sure to include the null and alternative hypotheses for the analysis.
Your summary should be 2 to 3 pages. Please use my sample summary as a model for your summary.

• Week4SATMandPersistence.sav
• APASumaryforCancerTreatmentStudyforORs_1_1.docx

## Statistics homework help

All posts must be 100% original work, with no plagiarism. Results must be produced using Excel. Make sure you interpret your results in a Word document.
Recall the car data set you identified in Forum 2 (Excel file attached). We know that this data set is normally distributed using the mean and SD you calculated. (Be sure you use the numbers without the supercar outlier.)
For the next 4 cars that are sampled, what is the probability that the price will be less than the value \$500 below the mean? Make sure you interpret your results.
Please note: because we are given a new sample size (n = 4), we will need to calculate a new SD (the standard error, SD/√n). Then, to find the value that is \$500 below the mean, take the mean and subtract \$500 from it. For example, if the mean is \$15,000, then \$500 below this would be \$14,500. Thus the probability you would want to find is P(x < 14,500).
For the next 4 cars that are sampled, what is the probability that the price will be higher than the value \$1,000 above the mean? Make sure you interpret your results. Use the same logic as above. If your mean is \$15,000, then \$1,000 above is 15,000 + 1,000 = \$16,000. Thus the probability you would want to find is P(x > 16,000).
For the next 4 cars that are sampled, what is the probability that the price will be equal to the mean?  Make sure you interpret your results.  Use the same logic as above.
For the next 4 cars that are sampled, what is the probability that the price will be within \$1,500 of the mean? Make sure you interpret your results. Use the same logic as above.
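The four probabilities above can be sketched as follows, assuming the questions concern the mean price of the 4 sampled cars, so that the "new SD" is the standard error SD/√n. The mean of \$15,000 is the example value from the text; the SD of \$4,000 is a hypothetical placeholder to be replaced with the value from your own data.

```python
from math import sqrt
from statistics import NormalDist

mean = 15_000   # example mean used in the text
sd = 4_000      # hypothetical SD; use the value from your own data set
n = 4           # new sample size

se = sd / sqrt(n)                            # the "new SD" (standard error)
sampling = NormalDist(mu=mean, sigma=se)     # distribution of the sample mean

p_below = sampling.cdf(mean - 500)           # P(x < 14,500)
p_above = 1 - sampling.cdf(mean + 1_000)     # P(x > 16,000)
p_equal = 0.0                                # a single point of a continuous
                                             # distribution has probability zero
p_within = sampling.cdf(mean + 1_500) - sampling.cdf(mean - 1_500)

print(f"P(x < 14,500)         = {p_below:.4f}")
print(f"P(x > 16,000)         = {p_above:.4f}")
print(f"P(x = 15,000)         = {p_equal:.4f}")
print(f"P(13,500 < x < 16,500) = {p_within:.4f}")
```

In Excel, the first value should correspond to `=NORM.DIST(14500, 15000, 2000, TRUE)`, with 2000 being the standard error under these assumed inputs.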
I encourage you to review the Week 4 normal probabilities PDF at the bottom of the discussion. This will give you a step-by-step example to follow and show you how to find probabilities using Excel. I also encourage you to review the Week 4 Empirical Rule PDF. This will give you a better understanding of how to use the empirical rule.
• Forum2.xlsx
• Week4EmpiricalRule.pdf
• Week4normalprobabilities.pdf

## Statistics homework help

Provide two 200-word responses, each with a minimum of one APA reference, to RESPONSES 1 and 2 below. Responses may include direct questions. In your peer posts, compare the probabilities that you found with those of your classmates. Were they higher or lower, and why? In your responses, refer to the specific data from your classmates’ posts. Make sure you include your data set in your initial post as well. Attached are the Excel documents for both responses.
RESPONSE 1:
For this week’s forum I determined that the average price for the SUVs that I selected was \$49,903.60. Out of the 10 vehicles that I chose, 6 were priced below the average, giving me a probability of p = .60 and q = .40. When I calculated the probability that, out of 10 randomly selected cars, exactly 4 would fall below my average, I got 11%. When I calculated the probability that fewer than 5 of them would fall below the average, I got slightly over 16%. Next, I calculated the probability that more than 6 of them would fall below my average and got 38%. What was interesting is that when I calculated the probability that at least 4 cars would fall below the average of my vehicles, I got 94.5%. I think this is a good exercise for learning to speak to the likelihood that a car dealer will have the vehicle a customer is looking for based on the customer’s price point. If dealers know their averages, they can also make suggestions and marketing assumptions that there is a good probability a customer will find a vehicle of their choice within their price range, using the information we covered in this week’s lesson. The formulas based on the PDF are explained at the bottom of the attached spreadsheet.
RESPONSE 2:
The average price of my vehicles came out to \$16,593. Half of the cars’ prices fell below the average, which made my probabilities of success and failure both 0.5 (p = 0.5 and q = 0.5).
Average = \$16,593
5 of my vehicles fall below the average
P= 5/10
P or Success = 0.5
Q = 1-p
Q = 1-0.5
Q or Failure = 0.5
In another random sample using the same data, the probability that exactly 4 cars would fall below the average is 21%.
In another random sample using the same data, the probability that fewer than 5 vehicles would fall below the average is 62%.
In another random sample using the same data, the probability that more than 6 vehicles will fall below the average is 17%.
In another random sample using the same data, the probability that at least 4 vehicles will fall below the average is 83%.
The results did not particularly surprise me. Since my probability was 0.5, it was easier to simply look at the numbers and make a guess. The result that surprised me the most was how low the probability was for exactly 4 vehicles falling below the average. But then again, anytime you see the word “exactly”, I would expect a lower probability. This was another extremely helpful PDF for this exercise. It was easy to follow and helped me understand the material more clearly.
Jenny
• Response1.xlsx
• Response2.xlsx
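When comparing your own probabilities with those in Responses 1 and 2, it can help to recompute the binomial values directly. The sketch below assumes n = 10 trials and the p values stated in each response; the function names are illustrative, and the Excel equivalents would be `BINOM.DIST` calls with the same arguments.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for a binomial(n, p) variable."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def summarize(n, p):
    return {
        "exactly 4": binom_pmf(4, n, p),
        "fewer than 5": binom_cdf(4, n, p),       # X <= 4
        "more than 6": 1 - binom_cdf(6, n, p),    # X >= 7
        "at least 4": 1 - binom_cdf(3, n, p),     # X >= 4
    }

response_1 = summarize(n=10, p=0.6)
response_2 = summarize(n=10, p=0.5)

for label, result in (("Response 1", response_1), ("Response 2", response_2)):
    print(label, {name: round(value, 4) for name, value in result.items()})
```

With p = 0.6 the results line up with the percentages in Response 1. With p = 0.5, "fewer than 5" comes out near 0.377, while 0.623 is the value for "at most 5"; that difference may be worth raising in a reply to Response 2.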
