## Chi-Square Test, Fisher’s Exact Test, & Cross Tabulations in R | R Tutorial 4.10 | MarinStatsLectures

Hi! I am Mike Marin, and in this video we’ll talk about how to conduct the chi-square test of independence, as well as Fisher’s exact test, using the R programming language. The chi-square test of independence is a parametric method appropriate for testing independence between two categorical variables.

We’ll be working with the lung capacity data that was introduced earlier in this series of videos. I’ve already imported the data into R and attached it. We’ll explore the relationship between gender and smoking. We can use the “chisq.test” command/function to do the chi-square test in R.

To access the Help menu in R, type “help” and, in brackets, the name of the command/function you would like help for, or place a question mark (?) in front of the name. The first thing we will need to do is produce a contingency table (cross tabulation). This can be done using the “table” command/function. Now let’s go ahead and save this table in an object called TAB for use later.
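The cross tabulation step might look roughly like the sketch below. The video's data set is not reproduced here, so `LungCapData` is rebuilt from made-up counts purely for illustration, and the columns are referenced with `$` rather than relying on the data being attached.

```r
# Hypothetical stand-in for the lung capacity data (made-up counts, NOT the real data set)
LungCapData <- data.frame(
  Gender = rep(c("female", "male", "female", "male"), times = c(60, 70, 15, 10)),
  Smoke  = rep(c("no", "no", "yes", "yes"),           times = c(60, 70, 15, 10))
)

# Documentation for a function: help(chisq.test) or ?chisq.test

# Cross tabulate Gender (rows) by Smoke (columns) and save the table for later use
TAB <- table(LungCapData$Gender, LungCapData$Smoke)
print(TAB)
```

If the data frame has been attached, as in the video, `table(Gender, Smoke)` would give the same table.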

The next thing we can do is produce a bar plot to examine the relationship visually. This can be done using the “barplot” command. Here I will set the “beside” argument equal to TRUE to produce clustered bar charts, and also set the “legend” argument equal to TRUE to have a default legend produced.
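A minimal sketch of the clustered bar plot, again using a made-up table in place of the real data. The full argument name in base R is `legend.text`; writing `legend = TRUE` works as well because R matches the abbreviated argument name.

```r
# Hypothetical 2x2 table standing in for TAB (made-up counts, not the real data)
TAB <- as.table(matrix(c(60, 70, 15, 10), nrow = 2,
                       dimnames = list(Gender = c("female", "male"),
                                       Smoke  = c("no", "yes"))))

# beside = TRUE draws the bars side by side instead of stacked;
# legend.text = TRUE builds a default legend from the row names of TAB
mids <- barplot(TAB, beside = TRUE, legend.text = TRUE,
                xlab = "Smoke", ylab = "Count")
```

The plot is drawn on the active graphics device; `mids` holds the bar midpoints, which can be handy for adding labels later.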

See my earlier videos on producing bar plots to learn more about how to produce these. Next, we can conduct the chi-square test using the “chisq.test” command/function. Here we would like a chi-square test for the contingency table. We can set the “correct” argument equal to TRUE to have R apply Yates’ continuity correction for the chi-square test. We can see the test statistic of 1.744 and the corresponding p-value of 0.1866. Recall that we saw earlier that we can store the result of a test in an object.
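As a rough illustration of the “correct” argument, the sketch below runs the test with and without Yates’ continuity correction. The table uses made-up counts, so the numbers printed will not match the 1.744 and 0.1866 reported in the video.

```r
# Hypothetical 2x2 table standing in for TAB (made-up counts, not the real data)
TAB <- as.table(matrix(c(60, 70, 15, 10), nrow = 2,
                       dimnames = list(Gender = c("female", "male"),
                                       Smoke  = c("no", "yes"))))

# Chi-square test of independence with Yates' continuity correction
# (correct = TRUE is the default and only applies to 2x2 tables)
with_yates <- chisq.test(TAB, correct = TRUE)
print(with_yates)

# The same test without the correction, for comparison
without_yates <- chisq.test(TAB, correct = FALSE)
print(without_yates)
```

The correction shrinks the test statistic, so the corrected p-value is never smaller than the uncorrected one.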

I am going to store the results of this test in the object CHI. Earlier in the series of videos, we also learned about the “attributes” command/function. We can ask what attributes R stored in this object CHI. If we would like to extract certain attributes from this object, we can do so using the “$” sign. Here, let’s pull out the expected table.

If the assumptions of the chi-square test are not met, we may consider using Fisher’s exact test. Fisher’s exact test is a nonparametric equivalent to the chi-square test.
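Storing the chi-square result and extracting pieces of it with “$” could be sketched as follows, on a made-up table. Looking at the expected counts is one common way to judge whether the chi-square approximation is reasonable; the cutoff of 5 used in the comment is a widely quoted rule of thumb, not something stated in the video.

```r
# Hypothetical 2x2 table standing in for TAB (made-up counts, not the real data)
TAB <- as.table(matrix(c(60, 70, 15, 10), nrow = 2,
                       dimnames = list(Gender = c("female", "male"),
                                       Smoke  = c("no", "yes"))))

# Store the test result in an object
CHI <- chisq.test(TAB, correct = TRUE)

# List what is stored in the object (names such as statistic, p.value, expected)
print(attributes(CHI))

# Extract individual pieces with the $ sign
print(CHI$expected)   # expected counts under independence
print(CHI$p.value)    # p-value of the test

# Rule-of-thumb check (an assumption here): are all expected counts at least 5?
print(all(CHI$expected >= 5))
```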

This test may be completed using the “fisher.test” command. Again, here we’d like to do the test on our contingency table. We can set the “conf.int” argument equal to TRUE if we would like to have a confidence interval for the odds ratio returned, and we can use the “conf.level” argument to set the desired level of confidence.
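A minimal sketch of Fisher’s exact test with a confidence interval for the odds ratio. The table is made up, and the 99% level is just an example value; the video does not say which level was used.

```r
# Hypothetical 2x2 table standing in for TAB (made-up counts, not the real data)
TAB <- as.table(matrix(c(60, 70, 15, 10), nrow = 2,
                       dimnames = list(Gender = c("female", "male"),
                                       Smoke  = c("no", "yes"))))

# Fisher's exact test with a 99% confidence interval for the odds ratio
FISH <- fisher.test(TAB, conf.int = TRUE, conf.level = 0.99)
print(FISH)

# The pieces can be extracted with $, just as with the chi-square result
print(FISH$estimate)   # conditional maximum likelihood estimate of the odds ratio
print(FISH$conf.int)   # confidence interval for the odds ratio
```

For 2x2 tables `fisher.test` reports the conditional maximum likelihood estimate of the odds ratio, which generally differs slightly from the simple cross-product ratio of the table.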

In the next video in this series, we will discuss a package we can use to calculate relative risks, odds ratios and so on.

Thanks for watching this video.

Thanks for your video. I would like to ask if R is able to handle Fisher Exact test for contingency table larger than 2X2?

Hi Mike, great videos…really helpful! Just a quick question. I've done a chi-square test on the presence of disease in different seasons. There is a significant difference, but I don't know how to pinpoint this or do a pairwise comparison of the 4 seasons. Is there something on R that I can do to see the comparison between the seasons and get a p-value for each?

Cheers!

Thank you!!

Thanks for your videos!

I came across a paper about Chi-square test and it says Chi-square test is a non-parametric test because it's distribution free:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3900058/

Could you explain a bit more about this?

Thank you!

May God be pleased with you, my brother. You're the man.

Hi Mike, What about the p-value in this case? the chi squared value is below the critical value but the p-value is below the required significance level. Does this mean we reject the null hypothesis that there is no difference between men and women regarding smoking?

a program, a man ,a mission,

you are a hero

Nice tutorial, it has solved many of the problems I had, but I still have a doubt… I ran the chisq function and the outcome is that the p-values might not be reliable… then I ran the Fisher test, but what if the R outcome after a Fisher test is the following:

"LDKEY is too small for this problem. Try increasing the size of the workspace"

Would it be all right to run chisq.test on the contingency table and use the Monte Carlo approach for calculating the p-value?

Thank you very much!!!

@Mike_marin, you rock

Now how do I plot the probability density function?

Hello Sir. What type of test do I use to compare the means of males and females for different factors like Security and Reliability, to know if there is a significant difference in the means of males and females for those factors?

I am somewhat confused. The chi-square test only works for samples with a normal distribution, but when I search the internet for how to test the normality of discrete variables, the most common answer is that there is no such thing as a normality test for categorical variables. So how is it possible that we can call chi-square a PARAMETRIC TEST? I am done with stats; every time I think I have taken a step forward, some BS like this slaps me in the face :/ Can anybody help?

I must thank you for making statistics & R so easy. I wonder if you could please explain a little about Yates' correction. When should it be used and when not?

Thank you very much! You helped me!

Hello, Mike! I wondered if there is a way to obtain the expected contingency table for the fisher.test :/. Awesome video!

You just saved me from failing this shit, much appreciated.

Nice tutorial! Good job! Just for clarification, please replace "parametric" by "non-parametric".

Quick question about Chi-Square Tests:

I am supposed to find out if there is a statistically significant connection between two variables on a level of significance of 1%.

My calculation of the Chi-Square-test has the following output:

X-squared = 81.469, df = 1, p-value < 2.2e-16

I would interpret it so, that there IS a "connection" between my variables on the 1% significance level because the p-value is <0.01

Am I correct?

you are a fucking master of rstudio, thanks bro for helping me to pass my subject!

Thanks very much!! Your videos are very very very helpful!

In this #R video, we learn to conduct Pearson’s chi-square test and Fisher’s exact test in R, as well as produce contingency tables with R. The chi-square test and Fisher’s exact test can be used to check whether two variables are independent. Cross tabs, or contingency tables, show the frequency distribution of variables to help understand the association between them. To better learn the concept of the chi-square test, watch this video (https://youtu.be/pfc9MUz03XA).

Hi prof, thanks for this great job. I have a question. I have two datasets, each with a binary outcome, and I want to compare the outcome between these two datasets to see which one has the poorer or better outcome. There are 10 independent variables in dataset 1 and 5 independent variables in dataset 2, but they all have the same outcome measure, which is coded as a dummy variable (0 and 1). How can I determine which group has the better outcome? Can I just use their proportions? I need an explanation, thanks.