6. Given this table, figure out:
"Is the frequency distribution of mask types the same across the six cities?" In other words, "do people in the 6 cities wear the different mask types with similar likelihood?"
Let's say you have 4 types of masks: N95, pleated blue polyester, cloth, and neck gaiter. Then you'll have a 4x6 data table with how many degrees of freedom? With this table you have only one overall H0: the distribution of mask types is the same for all 6 cities. The corresponding H1 is "at least one city is not like the others."
What happens if you have a violation of Cochran's Rule among the 24 expected values? What will you do?
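As a hedged starting point for the two questions above, the usual chi-square test of homogeneity for an r x c table works with the quantities below. Here R_i, C_j and N (row total, column total, grand total) are notation introduced for this sketch, and the rule is quoted in its commonly cited form; check the exact wording used in your course.

```latex
\[
\text{df} = (r-1)(c-1) = (4-1)(6-1) = 15,
\qquad
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{4}\sum_{j=1}^{6} \frac{(O_{ij}-E_{ij})^2}{E_{ij}}
\]
\[
\text{Cochran's rule (common form):}\quad
E_{ij} \ge 1 \ \text{for every cell, and at most } 20\% \text{ of cells with } E_{ij} < 5
\]
```

With 24 cells, 20% is 4.8, so at most 4 expected values may fall below 5. If the rule is violated, typical options are to merge sparse mask types or cities where that is meaningful (which reduces the degrees of freedom) or to switch to an exact or simulation-based test.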
8. Task 1
You are asked to carry out a study on behalf of a specialised business analytics consultancy on a subsample of weekly data from Randall’s Supermarket, one of the biggest in the UK. Randall’s marketing management team wishes to identify trends and patterns in a sample of weekly data collected for a number of their loyalty cardholders during a 26-week period. The data include information on the customers’ gender, age, shopping frequency per week and shopping basket price. Randall’s operates two different types of stores (convenience stores and superstores) but they also sell to customers via an online shopping platform. The collected data are from all three different types of stores. Finally, the data provide information on the composition of the customer’s shopping basket regarding the type of products purchased. These can vary from value products to branded products, as well as the supermarket’s own high-quality product series, Randall’s Top. As a business analyst you are required to analyse those data, making any necessary modifications, in order to determine whether for any single customer it is possible to predict the value of their shopping basket.
Randall’s marketing management team is only interested in identifying whether the spending of the potential customer will fall in one of three possible groups including:
• Low spender (shopping basket value of £25 or less)
• Medium spender (shopping basket value between £25.01 and £70) and
• High spender (shopping basket value greater than £70)
For the purpose of your analysis you are provided with the data set Randall’s.xls. You have to decide which method is appropriate to apply for the problem under consideration and undertake the necessary analysis. Once you have completed this analysis, write a report for Randall’s marketing management team summarising your findings but also describing all necessary steps undertaken in the analysis. The manager is a competent business analyst himself/herself so the report can include technical terms, although you should not exceed five pages. Screenshots and supporting materials can be included in the appendix.
Requirements
After completing your analysis, you should submit a report that consists of two parts: Part A, a non-technical summary of your findings, and Part B, a detailed report of the analysis undertaken.
Part A: A short report for the Head of Randall’s Marketing Management (20 per cent). This should briefly explain the aim of the project, a clear summary and justification of the methods considered as well as an overview of the results.
Although the Head of Randall’s Marketing Management team who will receive this summary is a competent business analytics practitioner, the majority of the other team members have little knowledge of statistical modelling and want to know nothing about the technical and statistical underpinning of the techniques used in this analysis. This report should be no more than two sides of A4 including graphs, tables, etc. In this report you should include all the objectives of this analysis, a summary of data and results, as well as your recommendations (if any).
Part B: A technical report on the various stages of the analysis (80 per cent).
The analysis should be carried out using the range of analytics tools discussed:
• SPSS Statistics
Ensure that the exercise references:
• Binary and multinomial logistic regression
• Linear vs Logistic regression
• Logit Model with Odds Ratio
• Coefficients and Chi Squared
• MLR coefficients
• Assessing usefulness of MLR model
• Interpreting a model
• Assessing overall model fit with Pseudo R-Squared measures
• Classification accuracy (Hit Ratio)
• Wald Statistic
• Odds ratio exp(B)
• Ratio of the probability of an event happening vs not happening
• Ratio of the odds after a unit change in the predictor to the original odds
• Assumptions
• Residuals analysis
• Cook’s distance
• DfBeta
• Adequacy (with variance inflation factor VIF and tolerance statistic)
• Outliers and influential points cannot just be removed. We need to check them (typo? – unusual data?)
• Check for multicollinearity
• Parsimony
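The two odds-ratio bullets in the list above can be written out as a short derivation. This is a sketch for a single predictor x with intercept B_0 and slope B_1 (symbols chosen here for illustration); p(x) is the probability of the event at predictor value x.

```latex
\[
\text{odds}(x) = \frac{p(x)}{1 - p(x)},
\qquad
\ln \text{odds}(x) = B_0 + B_1 x
\]
\[
\frac{\text{odds}(x+1)}{\text{odds}(x)}
= \frac{e^{B_0 + B_1 (x+1)}}{e^{B_0 + B_1 x}}
= e^{B_1}
\]
```

The first line is the ratio of the probability of the event happening to it not happening; the second is the ratio of the odds after a unit change in the predictor to the original odds, which SPSS reports as Exp(B). A value above 1 means the odds rise with x, and a value below 1 means they fall.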
Write a short and concise report to explain the technical detail of what you have done for each step of the analysis.
The report should also cover the following information:
• Any type of analysis that might be useful and check whether the main assumptions behind the analyses do not hold or cannot be
• Give evidence of the understanding of the statistical tools that you are using. For example, comment on the model selection procedure and the coefficient interpretation, e.g. comment on the interpretation of the logistic regression coefficients if such a method is used and provide an example of
• Conclusions and explanation, in non-technical terms, of the main points
9. A bird has three alleles at a locus that controls wing color: W1, W2 and W3. W1 is dominant to W2 and W3; W1 produces blue wings. W2 is co-dominant to W3; W2W2 produces yellow wings, W3W3 produces green wings, and W2W3 produces green-spotted wings. The phenotypic frequencies in a population are as follows:
0.16 green
0.16 green spotted
0.04 yellow
0.64 blue
Assume that this population is in Hardy-Weinberg equilibrium: what is the frequency of the W1 allele in this population?
Group of answer choices
0.64
0.64^2
square root of 0.64
There’s not enough information to figure this out.
0.4
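One way to reason through this, sketched under the Hardy-Weinberg assumption stated in the question: let p, q and r be the frequencies of W1, W2 and W3 (symbols introduced here). The two homozygous recessive phenotypes give q and r directly, and p follows because the three frequencies sum to 1.

```latex
\[
q^2 = 0.04 \Rightarrow q = 0.2,
\qquad
r^2 = 0.16 \Rightarrow r = 0.4,
\qquad
2qr = 2(0.2)(0.4) = 0.16 \ \text{(green spotted, consistent)}
\]
\[
p = 1 - q - r = 1 - 0.2 - 0.4 = 0.4,
\qquad
p^2 + 2pq + 2pr = 0.16 + 0.16 + 0.32 = 0.64 \ \text{(blue, consistent)}
\]
```

Note that taking the square root of 0.64 would be wrong here, because the blue phenotype includes the heterozygotes W1W2 and W1W3 as well as W1W1.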
13. Professor Maya was interested in maximizing student learning in all her classes. She decided the best way to do that would be to investigate her students’ test performance in a number of ways.
The first thing she did was separate her students’ test scores based on the time of day she held her lectures (morning vs evening). Next she recorded the type of test students were writing (multiple choice vs short answer). She selected a random sample of students from her morning (n = 6) and evening (n = 7) classes (total of 13) and recorded scores from two of their tests as shown below.
Morning                              Evening
Multiple Choice    Short Answer      Multiple Choice    Short Answer
66                 74                70                 45
64                 55                80                 55
72                 77                78                 55
70                 57                84                 60
61                 58                64                 70
67                 69                84                 60
-                  -                 70                 63
DATA Set 1:
Good morning sunshine. Is Time of Day important?
1. Prof. Maya recently read an article that concluded students retained more information when attending classes in the morning. Based on this finding she thought students in her morning class might have performed differently on their Short Answer test scores when compared to students in her evening class. Does the data support her hypothesis? [15 points]
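A minimal sketch of one way to approach this question: an independent-samples t test with pooled variance. It assumes equal variances in the two classes (Welch's test is the usual alternative if that is doubtful) and assumes the score listing is read row by row as Morning MC, Morning SA, Evening MC, Evening SA. The function names are illustrative, not from the assignment.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sample mean of a non-empty vector.
double Mean(const std::vector<double>& x) {
    assert(!x.empty());
    double sum = 0.0;
    for (double v : x) sum += v;
    return sum / static_cast<double>(x.size());
}

// Sum of squared deviations from the sample mean.
double SumSquaredDeviations(const std::vector<double>& x) {
    const double m = Mean(x);
    double ss = 0.0;
    for (double v : x) ss += (v - m) * (v - m);
    return ss;
}

// Pooled-variance t statistic for two independent samples.
// Degrees of freedom are n1 + n2 - 2.
double PooledT(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() >= 2 && b.size() >= 2);
    const double n1 = static_cast<double>(a.size());
    const double n2 = static_cast<double>(b.size());
    const double pooledVar =
        (SumSquaredDeviations(a) + SumSquaredDeviations(b)) / (n1 + n2 - 2.0);
    const double standardError = std::sqrt(pooledVar * (1.0 / n1 + 1.0 / n2));
    return (Mean(a) - Mean(b)) / standardError;
}
```

With the Short Answer scores read as morning {74, 55, 77, 57, 58, 69} and evening {45, 55, 55, 60, 70, 60, 63}, PooledT gives roughly 1.40 on 11 degrees of freedom, which is below the two-tailed 5% critical value of 2.201.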
Multiple Guess! Does Exam Type matter?
2. Prof. Maya also knew that students often did better on multiple-choice tests because they only have to recognize the information (rather than recall it). Given this, she thought students attending the morning class might perform differently on the Multiple-Choice test when compared to the Short Answer test. Does the data support her hypothesis? [15 points]
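Because the same morning students wrote both tests, a paired (dependent-samples) t test is a natural candidate here. The sketch below assumes each row of the score listing holds one student's Multiple Choice and Short Answer marks, which the source does not state explicitly; PairedT is an illustrative name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Paired t statistic: mean of the within-student differences divided by its
// standard error. Degrees of freedom are n - 1. Assumes the differences are
// not all identical (otherwise the standard deviation is zero).
double PairedT(const std::vector<double>& first, const std::vector<double>& second) {
    assert(first.size() == second.size() && first.size() >= 2);
    const std::size_t n = first.size();
    const double count = static_cast<double>(n);

    double meanDiff = 0.0;
    for (std::size_t i = 0; i < n; ++i) meanDiff += first[i] - second[i];
    meanDiff /= count;

    double ss = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = (first[i] - second[i]) - meanDiff;
        ss += d * d;
    }
    const double sd = std::sqrt(ss / (count - 1.0));
    return meanDiff / (sd / std::sqrt(count));
}
```

Under that pairing assumption, morning Multiple Choice {66, 64, 72, 70, 61, 67} against Short Answer {74, 55, 77, 57, 58, 69} gives a t of about 0.50 on 5 degrees of freedom, well below the two-tailed 5% critical value of 2.571.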
DATA Set 2:
We’ll try anything once. Does the new Tutorial Plan work?
3. Combining all of her students (and ignoring time of day), Prof. Maya asked her TAs to try a new – and very expensive - tutorial study plan. She then chose a random sample of 20 students to receive the new study plan and another sample of 30 to continue using the old study plan. Following an in-class quiz, she divided the students into 3 levels of achievement (below average, average, and above average), and then created the frequency table below. Does the new expensive tutorial study plan improve student performance? [15 points]
            Below average    Average    Above average
New plan    7                7          6
Old plan    6                15         9
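A frequency table like this is commonly analysed with a chi-square test of independence. The sketch below computes the statistic, the degrees of freedom, and two numbers for checking the expected-count rule of thumb; the struct and function names are illustrative, and the choice of test is an assumption rather than something the question prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ChiSquareResult {
    double statistic;    // sum of (observed - expected)^2 / expected
    int df;              // (rows - 1) * (cols - 1)
    double minExpected;  // smallest expected count in the table
    int cellsBelowFive;  // number of cells with expected count < 5
};

ChiSquareResult ChiSquare(const std::vector<std::vector<double>>& observed) {
    const std::size_t rows = observed.size();
    assert(rows >= 2);
    const std::size_t cols = observed[0].size();
    assert(cols >= 2);

    std::vector<double> rowTotal(rows, 0.0), colTotal(cols, 0.0);
    double grandTotal = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
        assert(observed[i].size() == cols);
        for (std::size_t j = 0; j < cols; ++j) {
            rowTotal[i] += observed[i][j];
            colTotal[j] += observed[i][j];
            grandTotal += observed[i][j];
        }
    }
    assert(grandTotal > 0.0);

    ChiSquareResult result{0.0, static_cast<int>((rows - 1) * (cols - 1)),
                           grandTotal, 0};
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Expected count under independence of row and column.
            const double expected = rowTotal[i] * colTotal[j] / grandTotal;
            assert(expected > 0.0);  // an empty row or column would divide by zero
            const double diff = observed[i][j] - expected;
            result.statistic += diff * diff / expected;
            if (expected < result.minExpected) result.minExpected = expected;
            if (expected < 5.0) ++result.cellsBelowFive;
        }
    }
    return result;
}
```

For the counts {7, 7, 6} and {6, 15, 9}, this gives a statistic of about 1.65 on 2 degrees of freedom, with a smallest expected count of 5.2. The 5% critical value for 2 degrees of freedom is 5.991.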
DATA Set 3:
How are YOU doing?
4. Finally, Prof. Maya thinks that her 2018 class is doing better than her 2017 class did. She decided to collect a sample of test scores from the students in her course this year (combining all of the groups) and compare the average with her previous year’s class average. Does the data support her hypothesis? [15 points]
The 2017 class average = 63%
The 2018 sample size = 25
The 2018 sample standard deviation = 11
The 2018 sample average = use your actual midterm mark (yes, you the student reading this :)
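Since only a sample standard deviation is available, a one-sample t test against the 2017 average is the usual choice. As a sketch, with \bar{x} standing for whatever 2018 sample average is plugged in:

```latex
\[
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{\bar{x} - 63}{11 / \sqrt{25}}
  = \frac{\bar{x} - 63}{2.2},
\qquad
\text{df} = n - 1 = 24
\]
```

For the one-tailed hypothesis "2018 is doing better" at the 5% level, the critical value is t = 1.711 for 24 degrees of freedom, so the sample average would need to exceed roughly 63 + 1.711 x 2.2, or about 66.8, to support it.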
Bonus: What does it all mean?
5. Bonus: IF Prof. Maya had complete control of how and when she ran her course in 2018, considering all the info you just found in the 3 data sets, write a brief statement of how you would recommend she set up the course next year – and explain why. [5 points]
15. To gain experience with the operations involving binary search trees. This data structure, like a linked list, uses dynamic memory allocation to grow as the size of the data set grows. Unlike a linked list, a binary search tree supports fast insertion, deletion and search (O(log n) on average, as long as the tree stays reasonably balanced).
Project Description
When an author produces an index for his or her book, the first step in this process is to decide which words should go into the index; the second is to produce a list of the pages where each word occurs. Instead of trying to choose words out of our heads, we decided to let the computer produce a list of all the unique words used in the manuscript and their frequency of occurrence. We could then go over the list and choose which words to put into the index.
The main object in this problem is a "word" with an associated frequency. The tentative definition of "word" here is a string of alphanumeric characters between markers, where markers are white space and all punctuation marks; anything non-alphanumeric stops the reading. If we skip all disallowed characters before getting the string, we should have exactly what we want. Ignoring words of fewer than three letters will remove from consideration words such as "a", "is", "to", "do", and "by" that do not belong in an index.
In this project, you are asked to write a program that reads any text file and then lists all the "words" in alphabetic order, together with the frequency with which each appears in the text. The "word" is defined above and has at least three letters.
Note:
Your result should be printed to an output file named YourUserID.txt.
You need to create a Binary Search Tree (BST) to store all the word objects by writing an insertion or increment function. Finally, a proper traversal print function of the BST should be able to output the required results.
The BST class in the text cannot be used directly to solve this problem. It is also NOT a good idea to modify the BST class to solve this problem. Instead, the following code is recommended as a starting point for your program.
#include <fstream>
#include <string>
using namespace std;

// Data stored in the node type
struct WordCount
{
    string word;  // the word itself
    int count;    // number of times the word has been seen
};

// Node type:
struct TreeNode
{
    WordCount info;
    TreeNode* left;   // words that compare less than info.word
    TreeNode* right;  // words that compare greater than info.word
};

// Two function prototypes

// Increments the frequency count if the string is in the tree
// or inserts the string if it is not there.
void Insert(TreeNode*&, string);

// Prints the words in the tree and their frequency counts.
void PrintTree(TreeNode*, ofstream&);

// Start your main function and the definitions of the above two functions.
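One possible way to fill in the two prototypes is sketched below; it is an illustration, not the expected solution. ReadWords and DestroyTree are hypothetical helpers that are not part of the assignment. The sketch follows the stated word definition strictly, so a hyphenated token such as "battle-field" would be split into "battle" and "field", which differs from the sample listing further down.

```cpp
#include <cassert>
#include <cctype>
#include <fstream>
#include <istream>
#include <string>
using namespace std;

struct WordCount { string word; int count; };
struct TreeNode { WordCount info; TreeNode* left; TreeNode* right; };

// Increments the count if the word is already in the tree,
// otherwise attaches a new leaf with count 1.
void Insert(TreeNode*& tree, string word) {
    assert(!word.empty());
    if (tree == nullptr) {
        tree = new TreeNode{WordCount{word, 1}, nullptr, nullptr};
    } else if (word < tree->info.word) {
        Insert(tree->left, word);
    } else if (tree->info.word < word) {
        Insert(tree->right, word);
    } else {
        ++tree->info.count;
    }
}

// In-order traversal (left, node, right) prints the words in sorted order.
// std::string comparison is by character code, so capitals sort before lowercase.
void PrintTree(TreeNode* tree, ofstream& out) {
    if (tree == nullptr) return;
    PrintTree(tree->left, out);
    out << tree->info.word << ' ' << tree->info.count << '\n';
    PrintTree(tree->right, out);
}

// Hypothetical helper: splits the stream into runs of alphanumeric characters
// and inserts every run of at least three characters.
void ReadWords(istream& in, TreeNode*& tree) {
    const string::size_type kMinLength = 3;
    string word;
    char ch;
    while (in.get(ch)) {
        if (isalnum(static_cast<unsigned char>(ch))) {
            word += ch;
        } else {
            if (word.size() >= kMinLength) Insert(tree, word);
            word.clear();
        }
    }
    if (word.size() >= kMinLength) Insert(tree, word);  // last word before EOF
}

// Hypothetical helper: post-order deletion releases every node.
void DestroyTree(TreeNode*& tree) {
    if (tree == nullptr) return;
    DestroyTree(tree->left);
    DestroyTree(tree->right);
    delete tree;
    tree = nullptr;
}
```

A main function would open the input file with an ifstream, call ReadWords, open the output file with an ofstream, call PrintTree, and finally call DestroyTree. Comparing strings directly keeps "The" and "the" as separate entries, which matches the sample output.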
Sample Run
Please type the text file name: Lincoln.txt
Please give the output text file name: mus11.txt
You are done! You can open the file "mus11.txt" to check.
Press any key to continue
------------------------------------------------------------------------------------------------------------------------------------------------
Lincoln.txt
The Gettysburg Address
Gettysburg, Pennsylvania
November 19, 1863
Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in
Liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and
so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate
a portion of that field, as a final resting place for those who here gave their lives that that nation
might live. It is altogether fitting and proper that we should do this.
But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground.
The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add
or detract. The world will little note, nor long remember what we say here, but it can never forget what
they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they
who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great
task remaining before us -- that from these honored dead we take increased devotion to that cause for
which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not
have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government
of the people, by the people, for the people, shall not perish from the earth.
------------------------------------------------------------------------------------------------------------------------------------------------
mus11.txt
1863 1
Address 1
But 1
Four 1
Gettysburg 2
God 1
Liberty 1
November 1
Now 1
Pennsylvania 1
The 3
above 1
add 1
advanced 1
ago 1
all 1
altogether 1
and 6
any 1
are 3
battle-field 1
before 1
birth 1
brave 1
brought 1
but 1
can 5
cause 1
civil 1
come 1
conceived 2
consecrate 1
consecrated 1
continent 1
created 1
dead 3
dedicate 2
dedicated 4
detract 1
devotion 2
did 1
died 1
earth 1
endure 1
engaged 1
equal 1
far 2
fathers 1
field 1
final 1
fitting 1
for 5
forget 1
forth 1
fought 1
freedom 1
from 2
full 1
gave 2
government 1
great 3
ground 1
hallow 1
have 5
here 8
highly 1
honored 1
increased 1
larger 1
last 1
little 1
live 1
lives 1
living 2
long 2
measure 1
men 2
met 1
might 1
nation 5
never 1
new 2
nobly 1
nor 1
not 5
note 1
our 2
people 3
perish 1
place 1
poor 1
portion 1
power 1
proper 1
proposition 1
rather 2
remaining 1
remember 1
resolve 1
resting 1
say 1
score 1
sense 1
seven 1
shall 3
should 1
struggled 1
take 1
task 1
testing 1
that 13
the 9
their 1
these 2
they 3
this 4
those 1
thus 1
under 1
unfinished 1
vain 1
war 2
what 2
whether 1
which 2
who 3
will 1
work 1
world 1
years 1