STATISTICS FOR EXPERIMENTAL DATA ANALYSIS 1997 Version
Mean, Standard Deviation, T-test
1. Population mean - The average value of a population of numbers.
2. Standard deviation - A measure of the spread of Xi values.
For the worked example: (Sum Xi)^2 = 204.49 and (Sum Xi)^2 / n = 40.89, so Sum Xi = 14.3, n = 5 and bar X = 14.3 / 5 = 2.86.
The mean and standard deviation are usually expressed as bar X ± sd, here 2.86 ± 1.37. This means that if we were to collect more Xi's from the same population, and the values are normally distributed, about 68% should fall within one standard deviation of the mean (2.86 ± 1.37, or between 1.49 and 4.23), about 95% should fall within bar X ± 2 sd (2.86 ± 2.74) and about 99.7% should fall within bar X ± 3 sd (2.86 ± 4.11).
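The limits quoted above are simple arithmetic on the mean and sd. A minimal Python sketch, using only the summary values 2.86 and 1.37 from the example (the function name `sd_limits` is made up for illustration):

```python
# Limits of one, two and three standard deviations around the mean.
mean = 2.86  # bar X from the worked example
sd = 1.37    # standard deviation from the worked example

def sd_limits(mean, sd, k):
    """Return the (lower, upper) limits of mean +/- k standard deviations."""
    return (round(mean - k * sd, 2), round(mean + k * sd, 2))

limits = {k: sd_limits(mean, sd, k) for k in (1, 2, 3)}
for k, (low, high) in limits.items():
    print(f"{k} sd: {low} to {high}")
```

The 68%, 95% and 99.7% figures attached to these limits hold only to the extent that the numbers follow a normal distribution.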
3. Variance - the square of the standard deviation, that is, the number obtained before the square root is taken to give the sd. Here the variance = 1.88.
4. Standard error - a measure of the spread of means if we were to repeat the experiment numerous times and produce more means. This is also called the standard deviation of the mean, and it is calculated as se = sd / (square root of n). It is the form of deviation most frequently found in the literature. It looks smaller, and besides, we are usually more interested in how means will vary upon experimental repetition than in how individual samples will vary.
The mean and standard error are usually expressed as bar X ± se, here 2.86 ± 0.61. This means that if we were to repeat the experiment and come up with new sets of numbers, about 68% of the means of these sets should fall within one standard error of the mean, and so on.
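The four quantities defined so far (mean, variance, sd, se) can be computed together. The raw Xi values of the first example are not listed in this handout, so the sketch below runs on the Control column of the mouse example in item 5 instead. The function name `describe` is hypothetical, and the n - 1 divisor is an assumption that matches the variances quoted in item 5.

```python
import math

def describe(values):
    """Return n, mean, variance, sd and se using the n - 1 divisor."""
    n = len(values)
    total = sum(values)
    sum_sq = sum(x * x for x in values)
    mean = total / n
    # Computational form: (Sum Xi^2 - (Sum Xi)^2 / n) / (n - 1)
    variance = (sum_sq - total ** 2 / n) / (n - 1)
    sd = math.sqrt(variance)
    se = sd / math.sqrt(n)
    return {"n": n, "mean": mean, "variance": variance, "sd": sd, "se": se}

# Control column of the mouse example in item 5 of this handout.
control = [32, 37, 35, 28, 41, 44, 35, 31, 34]
stats = describe(control)
print({k: round(v, 2) for k, v in stats.items()})

# Standard error of the first example from its summary values.
# n = 5 is inferred from the ratio 204.49 / 40.89, not stated in the handout.
se_first = 1.37 / math.sqrt(5)
print(round(se_first, 2))
```

For the first example this reproduces the quoted standard error of 0.61.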
5. Student's T-test is a test that compares the means and standard deviations (or more accurately the variances) of two populations of numbers to determine the probability that the two means were derived from the same population of numbers rather than from different populations.
Let us suppose we grew two populations of mice and poisoned one population with cigarette smoke. We want to know the probability that the cigarette smoke had an influence on the final weight of the animals. We will have two different means, bar X1 and bar X2, and two different estimates of the spread of the numbers composing these populations. The T-test assumes that the spread of the numbers does not change with the treatment group, so we assume that poisoning may affect the mean weight but not the sd. We can therefore pool the two estimates of the sd and come up with a better one. The pooled estimate is sp.
Example: Control (1) Poisoned (2)
32 35
37 31
35 29
28 25
41 34
44 40
35 27
31 32
34 31
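So that bar X1 = 35.22 and bar X2 = 31.56, with n1 = n2 = 9
sd1^2 = 24.44 and sd2^2 = 20.03
The pooled estimate of variance is:
sp^2 = [(n1 - 1)(sd1^2) + (n2 - 1)(sd2^2)] / (n1 + n2 - 2) = [(8)(24.44) + (8)(20.03)] / 16 = 22.24, so sp = 4.72
The T statistic is as follows:
T = (bar X1 - bar X2) / [sp x square root of (1/n1 + 1/n2)] = (35.22 - 31.56) / [(4.72)(0.471)] = 1.65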
Now one must compare the T statistic to a critical T (Tc) from the table that follows. The table is based on "degrees of freedom" (df), which here is n1 + n2 - 2 = 16; each group contributes one less than its n. Looking across the row for df = 16, we find a series of numbers (the Tc's). Our T of 1.65 is greater than 1.337 but less than 1.746, so our test has exceeded the requirement for p = 0.10 but has not met the requirement for p = 0.05. This means that the chance of seeing a difference this large when the two sets of numbers really come from the same population is between 0.10 and 0.05, or between 1 in 10 and 1 in 20. We would then state in our paper that this chance was less than 0.1 or 10%. Usually this is stated as a probability (p) less than 0.10.
p < 0.10
If we were to state that it was the poisoning (not a sampling abnormality) that affected the final weights of the mice, we would have up to a 10% chance of being incorrect. The chance of error that you want to call significant is up to you. I wouldn't bet my life or scientific reputation on a p < 0.10, but might bet $1.00. Most people in biological investigations consider p < 0.05 as indicating significantly different number populations. Here they have a 1 in 20 chance of error and, after all, you have to stick your neck out a little.
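The mouse example can be reproduced from the raw weights. The sketch below assumes the usual pooled-variance form of the T-test, with n - 1 divisors and df = n1 + n2 - 2. The function names are made up for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Sample variance with the n - 1 divisor."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)

def pooled_t(group1, group2):
    """Return (T, df, pooled variance) for two independent groups."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pool the two variance estimates, weighting each by its own df.
    sp2 = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    t = (mean(group1) - mean(group2)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df, sp2

control = [32, 37, 35, 28, 41, 44, 35, 31, 34]
poisoned = [35, 31, 29, 25, 34, 40, 27, 32, 31]
t, df, sp2 = pooled_t(control, poisoned)
print(f"T = {t:.2f}, df = {df}, sp^2 = {sp2:.2f}")
```

This gives T of about 1.65 with 16 df, the value compared against the table above.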
6. T-test with PAIRED DIFFERENCES. Some variability can be eliminated in experiments by testing the same animal under both the control and the experimental circumstances to be evaluated. For instance, to detect whether a drug has an effect on the respiration of an animal, one could measure respiration on the animal as a control and then on the same animal under the influence of the drug. The statistical test is then done on the differences between the two values for each animal, with df = n - 1 where n is the number of pairs. This test should be used whenever it can be (that is, whenever an animal is used as its own control).
Example: Animal Control Drugged Difference
A 5 7 +2
B 5 4 -1
C 3 6 +3
D 6 7 +1
E 3 6 +3
Total difference = +8
Mean difference ± sd: bar d = 1.6 ± 1.67
The Tc for 4 df is 1.533 at p = 0.1 and 2.132 at p = 0.05, so we could state with not a whole lot of confidence that the drug influenced respiration (p < 0.1).
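The handout does not show the T statistic for this example. The sketch below assumes the usual paired form, T = bar d / (sd / square root of n), and applies it to the five differences in the table.

```python
import math

control = [5, 5, 3, 6, 3]   # animals A to E
drugged = [7, 4, 6, 7, 6]

# Work on the per-animal differences, not on the two columns separately.
diffs = [d - c for c, d in zip(control, drugged)]
n = len(diffs)
mean_d = sum(diffs) / n
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
se_d = sd_d / math.sqrt(n)
t = mean_d / se_d
df = n - 1
print(f"differences = {diffs}")
print(f"bar d = {mean_d:.2f}, sd = {sd_d:.2f}, T = {t:.2f}, df = {df}")
```

Under that assumption T comes out at about 2.14, which clears the 1.533 needed for p = 0.1 and sits only just above the tabulated 2.132 for p = 0.05. With a margin that thin, the cautious statement p < 0.1 is still a reasonable one to report.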
7. Regression analysis (You need not learn this unless you use it in your project) - a mathematical means to determine the best straight line to fit a series of points on a graph. Let's say we did an experiment that measured weight gains at daily intervals. Plotting weight on the y-axis and time on the x-axis might give us points which appear to approximate a straight line. You could "eyeball" the line in, or do it mathematically to give the least error and give each point equivalent weight. Note that each point consists of two numbers, a weight (y) and a time (x). The formula for a straight line is y = mx + b, where m is the slope of the line and b is the y intercept (where the line crosses the y-axis when x = 0).
m = [Sum xiyi - (Sum xi)(Sum yi)/n] / [Sum xi^2 - (Sum xi)^2/n]
b = bar y - m(bar x)
Example:
         yi    xi    xi^2    xiyi    yi^2
         65    39    1521    2535    4225
         78    43    1849    3354    6084
         52    21     441    1092    2704
         82    64    4096    5248    6724
         92    57    3249    5244    8464
         89    47    2209    4183    7921
         98    75    5625    7350    9604
         73    28     784    2044    5329
         56    34    1156    1904    3136
         75    52    2704    3900    5625
      ___________________________________________
Sums:   760   460   23634   36854   59816
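n = 10, bar y = 760/10 = 76, bar x = 460/10 = 46
m = [36854 - (460)(760)/10] / [23634 - (460)^2/10] = 1894 / 2474 = 0.7656
b = 76 - (0.7656)(46) = 40.78
So the best fitting straight line is y = 40.78 + 0.766x.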
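The slope and intercept can be checked from the raw columns. This sketch assumes the standard least-squares formulas built from the same sums that appear in the table (Sum xi, Sum yi, Sum xi^2, Sum xiyi).

```python
y = [65, 78, 52, 82, 92, 89, 98, 73, 56, 75]
x = [39, 43, 21, 64, 57, 47, 75, 28, 34, 52]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Least-squares slope, then the intercept from the two means.
m = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
b = sum_y / n - m * sum_x / n
print(f"y = {b:.2f} + {m:.3f}x")
```

Carrying the unrounded slope into the intercept calculation is what yields 40.78. Rounding the slope to 0.76 first would give a slightly different intercept.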
CRITICAL VALUES OF 'T'

| df | p=0.1 | p=0.05 | p=0.025 | p=0.01 | p=0.005 |
|---|---|---|---|---|---|
| 1 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 |
| 2 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 |
| 3 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 |
| 4 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 |
| 5 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 |
| 6 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 |
| 7 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
| 8 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
| 9 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
| 10 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |
| 11 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 |
| 12 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 |
| 13 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 |
| 14 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 |
| 15 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 |
| 16 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 |
| 17 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 |
| 18 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 |
| 19 | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 |
| 20 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 |
| 21 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 |
| 22 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 |
| 23 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 |
| 24 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 |

If your d.f. is more than 24, use the values for 24.
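Reading the table can also be sketched in code. The example below copies only three rows (df = 4, 16 and 24) from the table, so it is a partial illustration and not a full lookup. The names `P_LEVELS`, `CRITICAL_T` and `smallest_p_exceeded` are hypothetical.

```python
# Subset of the table above: df -> critical T at each tabulated p.
P_LEVELS = (0.1, 0.05, 0.025, 0.01, 0.005)
CRITICAL_T = {
    4: (1.533, 2.132, 2.776, 3.747, 4.604),
    16: (1.337, 1.746, 2.120, 2.583, 2.921),
    24: (1.318, 1.711, 2.064, 2.492, 2.797),
}

def smallest_p_exceeded(t, df):
    """Return the smallest tabulated p whose critical T is exceeded, or None."""
    df = min(df, 24)      # handout rule: above 24 df, use the row for 24
    row = CRITICAL_T[df]  # KeyError if the row was not copied into this subset
    exceeded = [p for p, tc in zip(P_LEVELS, row) if abs(t) > tc]
    return min(exceeded) if exceeded else None

result_mice = smallest_p_exceeded(1.65, 16)
result_none = smallest_p_exceeded(1.20, 30)
print(result_mice, result_none)
```

For the mouse example (T = 1.65, df = 16) the function returns 0.1, matching the statement p < 0.10. A T that exceeds none of the tabulated values returns None.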