Monday, 14 December 2015

Introduction of Correlation Coefficient (CORRELATION)



Correlation 
Univariate analysis: Analysis of data when only one variable is involved.
Eg. Dispersion, central tendency, skewness, kurtosis.
Bivariate analysis: The analysis of data involving two variables that are related to each other. Bivariate analysis is very common in biological experiments, where one may want to know the strength of a relationship or to predict one variable from another related variable.
These techniques help in measuring the independence or relationship between bivariate data and in predicting the value of one variable for a given value of the other variable.
Correlation Coefficient
Correlation analysis is helpful in ascertaining the strength of the relationship between two variables. It measures how closely the two variables are related.
It ranges from -1 to +1 and it does not have any unit.
Francis Galton was the first person to investigate correlation graphically. Karl Pearson (1857-1936) later introduced a method of assessing correlation by means of the coefficient of correlation.
Eg: Milk production and fat percentage; feed intake and weight gain.
Correlation is the statistical tool with the help of which the relationship between two variables is studied.
Correlation and causation
A high degree of correlation may arise from any one, or a combination, of the following reasons.
1.      By chance: With a small number of observations, a correlation may appear in a sample even though it does not exist in the population. This is due to the chance factor in small samples.
2.      Influence of some external factor on both variables: A high degree of correlation may be due to the same causes affecting each variable.
3.      Mutual influence: The two variables influence each other.
4.      Influence of one variable upon the other: One of the variables is truly independent, acting free from external forces, and influences the other variable, which is truly dependent because it reacts in response to the independent variable.
Types of correlation:
1.      Positive or negative correlation.
2.      Simple, partial or multiple correlation.
3.      Linear or non-linear correlation.
Positive or negative correlation.
It depends on the direction in which the variables are moving. When both the variables move in the same direction it is positive correlation and if they move in the opposite direction it is negative correlation.
Simple, partial and multiple correlation
Simple: Only two variables are involved.
Partial or multiple: The relationship among more than two variables is studied.
Multiple correlation: The relationship between one dependent variable and two or more independent variables is studied.
Eg. Feed intake, body weight and milk yield.
Partial correlation: The relationship between two variables is studied while the effect of the other variables is excluded.
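For three variables, excluding the third variable is usually done with the first-order partial correlation formula, r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below applies it; the function name `partial_correlation` and the three simple correlations passed to it are hypothetical, chosen only to show the arithmetic.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y with Z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical simple correlations among three variables X, Y and Z.
r_xy, r_xz, r_yz = 0.8, 0.6, 0.5

print(round(partial_correlation(r_xy, r_xz, r_yz), 3))
```

When Z is unrelated to both X and Y (r_xz = r_yz = 0), the partial correlation reduces to the simple correlation r_xy.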
Linear and non-linear correlation:
Correlation between two variables is said to be linear if, corresponding to a unit change in one variable, there is a constant change in the other variable over the entire range of values.
 X        30        60        90        120      150
 Y        10        20        30        40        50

The graph of these variables having such relationship will form a straight line.
The distinction between linear and non-linear correlation is based on the ratio of change between the variables under study.
X         1          2          3          4          5
Y         5          7          9          11        13
Thus for a unit change in X there is a constant change of 2 in Y.
Y = 2X + 3
The two variables X and Y are linearly related if there exists a relationship of the form
Y = a + bX
That is, if the paired values are plotted on a graph, one should get a straight line.
If there is no constant change in ‘Y’ for every unit change in ‘X’, the relationship is termed non-linear or curvilinear.
Non-linear, eg. If we double the protein content in the feed, milk production will not be doubled. The graph of a non-linear relationship forms a curve, so it is also called a “curvilinear relationship”.
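The ratio-of-change test can be sketched in a few lines. The helper names `change_ratios` and `is_linear` are hypothetical; the first two data sets are the X and Y tables given above, and the third (Y = X squared) is an assumed curvilinear example added only for contrast.

```python
def change_ratios(xs, ys):
    """Change in Y per unit change in X between consecutive points."""
    points = list(zip(xs, ys))
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def is_linear(xs, ys, tol=1e-9):
    """True when the ratio of change is constant over the whole range."""
    ratios = change_ratios(xs, ys)
    return all(abs(r - ratios[0]) <= tol for r in ratios)

# The two tables from the text.
print(is_linear([30, 60, 90, 120, 150], [10, 20, 30, 40, 50]))
print(change_ratios([1, 2, 3, 4, 5], [5, 7, 9, 11, 13]))

# Assumed curvilinear example (Y = X squared): the ratio keeps changing.
print(change_ratios([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

For the second table every ratio is 2, which is the slope b in Y = a + bX; the intercept a = 3 follows from the first point (5 - 2 × 1).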
The mutual relationship could depend on
1.      Mutual dependence- supply and demand
2.      Both are influenced by same external factors – Effect of weather on rice and potato yield.
3.      Pure chance – size of shoe and degree of intelligence; this is known as spurious or nonsense correlation.
Methods of studying correlation
I.      Scatter diagram method: The paired values of the two variables are plotted as points on a graph sheet so that the relationship can be seen. If the points are widely scattered, there is little or no relationship. If they are closely clustered, there is some relationship between the two variables.


Depending upon the distribution of points in the scatter plot, the correlation may show:
1.      High degree of  positive correlation
2.      High degree of negative correlation
3.      Low degree of negative correlation
4.      Low degree of positive correlation
This method is not affected by extreme values and gives a fair idea of the relationship. However, it is not suitable for large samples. It does not provide an exact measure of the degree of correlation.
Merits and Demerits of scatter diagram
Merits:
1. Simple.
2. We can get a rough idea of the relationship and whether it is +ve or –ve.
3. Not influenced by extreme items.
Demerits:
1. It cannot give the exact degree of correlation.
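As a rough stand-in for the graph sheet, the sketch below draws a scatter diagram in plain text. The function name `ascii_scatter` and the grid size are assumptions made for this illustration; the data are the X and Y values from the first linear table above, and the sketch assumes each variable takes at least two distinct values.

```python
def ascii_scatter(xs, ys, width=20, height=10):
    """Return a text scatter diagram as a list of rows (top row = largest Y)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column or row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return ["".join(line) for line in grid]

rows = ascii_scatter([30, 60, 90, 120, 150], [10, 20, 30, 40, 50])
print("\n".join(rows))
```

The points rise from the bottom-left corner to the top-right corner, the pattern expected for a high degree of positive correlation.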

II.     Graphical method: The individual values of the two variables are plotted on graph paper. We thus get two curves, one for X and another for Y. These two curves form the basis of comparison.
                Jan       Feb       Mar       Apr       May
Variable I      12        16        12        14        18
Variable II     18        14        18        16        13

Both of these methods are about visualizing the relationship.
III.    Coefficient of correlation: Measuring the relationship.
Karl Pearson developed this method.
It is also called the Pearsonian coefficient of correlation.
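The notes do not state the formula, so the sketch below assumes the usual product-moment form, r = Σ(x - x̄)(y - ȳ) / sqrt(Σ(x - x̄)² · Σ(y - ȳ)²). It is applied to the monthly values of Variable I and Variable II tabulated under the graphical method; the function name `pearson_r` is a hypothetical helper.

```python
import math

def pearson_r(xs, ys):
    """Pearson's coefficient of correlation for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

variable_1 = [12, 16, 12, 14, 18]  # Jan to May
variable_2 = [18, 14, 18, 16, 13]  # Jan to May

r = pearson_r(variable_1, variable_2)
print(round(r, 2))  # about -0.99
```

The value is close to -1, a high degree of negative correlation: in the table, whenever one variable rises the other falls.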

