Class 7 – Statistics

Study Material

Introduction: Statistics, or data handling, may be defined as the science of collecting, presenting, analysing and interpreting numerical data. Information is collected, organized and presented in the form of tables, graphs, etc., then analysed, and inferences are drawn from it.

Data (plural of ‘datum’): The weights of an NCC batch of 25 students of a class, measured in kg, are obtained as under:

35, 28, 26, 30, 32, 35, 26, 31, 36, 28, 29, 30, 27, 26, 36, 30, 25, 28, 29, 28, 27, 28, 30, 31.

This collection of a particular type of information in the form of numerical figures is called a set of data. A set of data obtained in its original form is called a set of raw (or ungrouped) data. Each numerical figure in the set of data is called an observation.

Array: It is very difficult to draw any inference from this raw set of data, so we arrange it in ascending or descending order of size. The above set of data arranged in ascending order is:

25, 26, 26, 26, 27, 27, 28, 28, 28, 28, 28, 29, 29, 30, 30, 30, 30, 31, 31, 32, 32, 35, 35, 35, 36, 36.

A set of data whose numerical figures are arranged in ascending or descending order is called an array.

Range: By presenting the data in the above manner we can get some information about the data: lowest weight = 25 kg, highest weight = 36 kg. The difference between the highest and lowest observations in a given set of data is called its range. Here the range = 36 – 25 = 11 kg.

Frequency Distribution: The number of times a particular observation occurs is called its frequency. The frequency of 26 kg in the above data is 3 and the frequency of 28 kg is 5. We may represent the set of data obtained above in tabular form, showing the frequency of each observation beside it. A table showing the frequencies of the various observations of the data is called a frequency distribution table, or simply a frequency table.

We take each observation from the set of data and count it with the help of strokes called tally marks. For the sake of convenience we use tally marks in bunches of five, i.e., the fifth mark crosses the other four diagonally.
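The ideas of array, range and frequency can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are our own, and the list holds the weights exactly as they are printed in the raw data above.

```python
# Weights (in kg) exactly as printed in the raw data above.
weights = [35, 28, 26, 30, 32, 35, 26, 31, 36, 28, 29, 30,
           27, 26, 36, 30, 25, 28, 29, 28, 27, 28, 30, 31]

# An array is the same data arranged in ascending (or descending) order.
array = sorted(weights)

lowest = array[0]               # smallest observation
highest = array[-1]             # largest observation
data_range = highest - lowest   # range = highest - lowest

# Frequency = number of times a particular observation occurs.
freq_26 = weights.count(26)
freq_28 = weights.count(28)

print("Lowest:", lowest, "Highest:", highest, "Range:", data_range)
print("Frequency of 26:", freq_26, "Frequency of 28:", freq_28)
```

Sorting first makes the lowest and highest observations the first and last items of the list, which is exactly why we form an array before reading off the range.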

Example: The marks scored by 35 students in a science test were as under:

60, 65, 100, 70, 85, 75, 95, 90, 65, 70, 80, 95, 70, 75, 70, 80, 80, 70, 75, 85, 85, 70, 90, 75, 75, 80, 80, 85, 85, 90, 75, 75, 80, 80.

Prepare a frequency distribution table for the above data.

Solution: First, we re-arrange the given set of data in ascending order as under:

60, 65, 65, 70, 70, 70, 70, 70, 70, 75, 75, 75, 75, 75, 75, 75, 75, 80, 80, 80, 80, 80, 80, 80, 85, 85, 85, 85, 85, 90, 90, 90, 95, 95, 100.

The frequency distribution table for the above data is:
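One way to build such a table is to count how often each mark occurs in the sorted list. The following is a minimal Python sketch of that counting, assuming the 35 sorted marks given above. The helper `tally` and the text form `||||/` for a bunch of five tally marks are our own choices for illustration.

```python
from collections import Counter

# Marks of the 35 students, already arranged in ascending order.
marks = [60, 65, 65, 70, 70, 70, 70, 70, 70, 75, 75, 75, 75, 75, 75, 75, 75,
         80, 80, 80, 80, 80, 80, 80, 85, 85, 85, 85, 85, 90, 90, 90, 95, 95, 100]


def tally(n):
    """Return n tally marks in bunches of five (a bunch is written '||||/')."""
    bunches = ["||||/"] * (n // 5)
    leftover = ["|" * (n % 5)] if n % 5 else []
    return " ".join(bunches + leftover)


frequency = Counter(marks)  # maps each mark to the number of times it occurs

print(f"{'Marks':>5}  {'Tally':<12}  Frequency")
for mark in sorted(frequency):
    print(f"{mark:>5}  {tally(frequency[mark]):<12}  {frequency[mark]}")
print(f"{'Total':>5}  {'':<12}  {sum(frequency.values())}")
```

For the marks above, the frequencies of 60, 65, 70, 75, 80, 85, 90, 95 and 100 come out as 1, 2, 6, 8, 7, 5, 3, 2 and 1 respectively, and they add up to 35.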

Measures of central tendency: Three measures are commonly used to describe the average of a set of data: the mean, the median and the mode. We begin with the mean.

Mean: The arithmetic mean in statistics is the same as ‘average’ in arithmetic.

Mean of ungrouped (or raw) data: The mean of a set of data is found by dividing the sum of all the observations by the total number of observations in the data. We denote the mean by $\bar{x}$ (read “x bar”).

Example: Following are the ages (in years) of 10 teachers in a school. 32, 41, 27, 54, 36, 25, 28, 57, 40, 38.

(i) What is the age of the oldest teacher and that of the youngest teacher?

(ii) Find the range of the ages of the teachers.

(iii) Find the mean age.

Solution: Arranging in ascending order, we get 25, 27, 28, 32, 36, 38, 40, 41, 54, 57 From the above set of data, we find that

(i)   Age of the oldest teacher = 57 years

Age of the youngest teacher = 25 years

(ii) Range = (57-25) years = 32 years

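(iii) Mean age $= \dfrac{25 + 27 + 28 + 32 + 36 + 38 + 40 + 41 + 54 + 57}{10}$ years $= \dfrac{378}{10}$ years $= 37.8$ years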
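The three parts of this example can be checked with a short Python sketch. The variable names are our own; the ages are those given in the example.

```python
# Ages (in years) of the 10 teachers, as given in the example.
ages = [32, 41, 27, 54, 36, 25, 28, 57, 40, 38]

oldest = max(ages)               # (i) age of the oldest teacher
youngest = min(ages)             # (i) age of the youngest teacher
age_range = oldest - youngest    # (ii) range = highest - lowest

# (iii) Mean = sum of all observations / number of observations
mean_age = sum(ages) / len(ages)

print("Oldest:", oldest, "Youngest:", youngest)
print("Range:", age_range)
print("Mean age:", mean_age)
```

Note that the data need not be sorted to find the mean; the sum is the same in any order.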

Example: Find the arithmetic mean of the numbers 3, 0, -1, 7, 11.

Solution: Mean $\bar{x} = \dfrac{3 + 0 + (-1) + 7 + 11}{5} = \dfrac{20}{5} = 4$.

Example: A group of students was given a special test. The test was completed by various students in the following time (in minutes) : 18, 20, 21, 23, 25, 25, 29, 31, 31, 37.

(i) Find the mean time taken by the students to complete the test.

(ii) How many students took more than the mean time to complete the test?

(iii) If the student who took 37 minutes had taken only 23 minutes to complete the test, what would the mean time have been?

Solution: The times taken by the students (in minutes) are 18, 20, 21, 23, 25, 25, 29, 31, 31, 37; number of students = 10.

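(i) Mean time $= \dfrac{18 + 20 + 21 + 23 + 25 + 25 + 29 + 31 + 31 + 37}{10}$ minutes $= \dfrac{260}{10}$ minutes $= 26$ minutes.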

(ii) Number of students who took more than mean time, i.e., more than 26 minutes to complete the test = 4.

(iii) Replacing 37 by 23, new mean time $= \dfrac{260 - 37 + 23}{10}$ minutes $= \dfrac{246}{10}$ minutes $= 24.6$ minutes.
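The same three steps can be sketched in Python. The helper `mean` and the other names are our own, chosen only for illustration.

```python
# Times (in minutes) taken by the 10 students, as given in the example.
times = [18, 20, 21, 23, 25, 25, 29, 31, 31, 37]


def mean(values):
    """Arithmetic mean: sum of the observations divided by their number."""
    return sum(values) / len(values)


# (i) Mean time taken.
mean_time = mean(times)

# (ii) Students who took more than the mean time.
above_mean = [t for t in times if t > mean_time]

# (iii) Replace the observation 37 by 23 and recompute the mean.
new_times = [23 if t == 37 else t for t in times]
new_mean_time = mean(new_times)

print("Mean time:", mean_time)
print("Students above the mean:", len(above_mean))
print("New mean time:", new_mean_time)
```

Changing a single large observation from 37 to 23 lowers the mean from 26 to 24.6 minutes, which shows how sensitive the mean is to individual values.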

Example: The mean age of 5 children of a family is 12 years. If four of them are respectively 6, 11, 13 and 16 years, find the age of the fifth child.

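Solution: Let the age of the fifth child be x years. Then $\dfrac{6 + 11 + 13 + 16 + x}{5} = 12$, so $46 + x = 60$ and $x = 14$. Hence the age of the fifth child is 14 years.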
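Since mean × number of observations gives the total, a missing observation can be found by subtraction. A small Python sketch of this reasoning, with names of our own choosing:

```python
known_ages = [6, 11, 13, 16]   # ages of four of the children (years)
mean_age = 12                  # mean age of all five children (years)
number_of_children = 5

# Total of all ages = mean x number of observations.
total_age = mean_age * number_of_children

# The fifth age is whatever is left after removing the known ages.
fifth_age = total_age - sum(known_ages)

print("Age of the fifth child:", fifth_age, "years")
```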

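Example: The mean of 5 observations is 15. If the mean of the first three observations is 14 and that of the last three is 17, find the third observation.

Solution: Sum of all 5 observations = 5 × 15 = 75. Sum of the first three observations = 3 × 14 = 42. Sum of the last three observations = 3 × 17 = 51. The third observation is counted in both of these sums, so adding them counts it twice. Hence the third observation = 42 + 51 – 75 = 18.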
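The reasoning here rests on the fact that the third observation belongs to both groups of three. A brief Python sketch, using names of our own choosing:

```python
# Totals recovered from the given means (total = mean x count).
total_all_five = 5 * 15      # sum of all five observations
total_first_three = 3 * 14   # sum of observations 1, 2, 3
total_last_three = 3 * 17    # sum of observations 3, 4, 5

# Adding the two group totals counts the third observation twice,
# so subtracting the overall total leaves exactly the third observation.
third_observation = total_first_three + total_last_three - total_all_five

print("Third observation:", third_observation)
```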

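Mean of ungrouped frequency distribution: If the observations $x_1, x_2, x_3, \ldots, x_n$ have respective frequencies $f_1, f_2, f_3, \ldots, f_n$, then the mean of the data is the sum of the values of all the observations, each counted as many times as its frequency, divided by the total number of observations.

Sum of values of all observations $= f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_n x_n$.

Number of observations $= f_1 + f_2 + f_3 + \cdots + f_n$.

∴ Mean of data, $\bar{x} = \dfrac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_n x_n}{f_1 + f_2 + f_3 + \cdots + f_n} = \dfrac{\sum f_i x_i}{\sum f_i}$.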

Consider a frequency distribution showing the scores of 35 students in a mathematics test. How do we find the mean of these scores? One way would be to add the scores of all 35 students as separate addends and then divide the sum by 35, but this is quite cumbersome. We can make the task easier by multiplying each score by its frequency, adding these products, and then dividing the sum by the total number of students.

It is more convenient to find the products of the scores and the frequencies by adding an extra column to the frequency table and arranging the work as shown in table:

Example: Find the mean of the following distribution :

x     5    15    25    35    45
f     7     8    20    10     5

Solution :

x        f      fx
5        7      35
15       8     120
25      20     500
35      10     350
45       5     225
Total   50    1230

Mean $\bar{x} = \dfrac{\sum f x}{\sum f} = \dfrac{1230}{50} = 24.6$.
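The extra fx column and the final division can be sketched in Python as follows. The variable names are our own; the values of x and f are those of the example.

```python
# Observations and their frequencies, as given in the example.
x = [5, 15, 25, 35, 45]
f = [7, 8, 20, 10, 5]

# The extra column: each observation multiplied by its frequency.
fx = [xi * fi for xi, fi in zip(x, f)]

total_f = sum(f)     # number of observations
total_fx = sum(fx)   # sum of the values of all observations

# Mean = (sum of f*x) / (sum of f)
mean = total_fx / total_f

print("fx column:", fx)
print("Sum of f:", total_f, "Sum of fx:", total_fx)
print("Mean:", mean)
```

This gives the same answer as writing out all 50 observations and averaging them, but with only five multiplications.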

Median: A group of students took a spelling test. After evaluation, the teacher announced that “on the average” each of the five students mis-spelled 18 words.

The actual number of words mis-spelled by each student is shown below.

Student    Words mis-spelled
Sunil      50
Manish     15
Ashok      10
Subodh      9
Rekha       6

Is 18 the mean of these scores? (Yes.) How many students mis-spelled at least 18 words? (One.) Does 18 satisfactorily represent the five scores? (No.)

The five scores are arranged in order. Which score has the same number of scores above it as below it? (10.) Is 10 a more satisfactory representative score? (Yes.) Why or why not? (Ans: It is more representative of all of the scores than 18.)

If a set of data contains a few very high scores or very low scores, the mean does not satisfactorily represent the data. In situations such as these it is often more desirable to use the middle score, called the median, as the representative score.
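The effect of one very high score on the mean can be seen by computing both averages for the spelling-test scores. The sketch below uses Python's standard `statistics` module; the variable names are our own.

```python
import statistics

# Words mis-spelled by the five students in the spelling test above.
scores = [50, 15, 10, 9, 6]

mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # middle score after sorting

# Count how many scores are at least as large as each "average".
at_least_mean = sum(1 for s in scores if s >= mean_score)
at_least_median = sum(1 for s in scores if s >= median_score)

print("Mean:", mean_score, "Median:", median_score)
print("Scores >= mean:", at_least_mean)
print("Scores >= median:", at_least_median)
```

Only one student reaches the mean of 18, while the median of 10 sits in the middle of the group, so here the median is the more representative score.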

The median of a set of numbers is the middle number when all the numbers are arranged in order of size, i.e., in descending or ascending order.

Rule: To find the median of a set of numbers, arrange them in order of size and select the middle number. If there is no middle number, that is, when the data contains an even number of values, then the mean of the two middle values is the median.

The median of n observations arranged in order is found as follows:

i) If n is odd ⇒ median = $\left(\dfrac{n+1}{2}\right)$th observation.
ii) If n is even ⇒ median = mean of the $\left(\dfrac{n}{2}\right)$th and $\left(\dfrac{n}{2}+1\right)$th observations.

Example: What is the median weekly salary of workers in a firm whose salaries are ₹ 84, ₹ 60, ₹ 50, ₹ 40, ₹ 45, ₹ 42, ₹ 38, ₹ 65, ₹ 71?

Solution: First arrange the salaries in descending order:

₹ 84, ₹ 71, ₹ 65, ₹ 60, ₹ 50, ₹ 45, ₹ 42, ₹ 40, ₹ 38.

Next, count the number of salaries. It is 9. The fifth salary (₹ 50) has four salaries below it and four salaries above it. Therefore, ₹ 50 is the middle or median salary.
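The rule for odd and even n can be written as a small Python function. This is a sketch under our own naming (`median_of`), not a standard library routine. The second list, with the highest salary left out, is a hypothetical data set used only to illustrate the even case.

```python
def median_of(values):
    """Median by the rule above: middle value, or mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty set of data is undefined")
    middle = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th observation, which is index n // 2 from 0.
        return ordered[middle]
    # n even: mean of the (n / 2)th and (n / 2 + 1)th observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


# The nine weekly salaries from the example (n is odd).
salaries = [84, 60, 50, 40, 45, 42, 38, 65, 71]
median_salary = median_of(salaries)

# Hypothetical data with the salary 84 left out (n is even).
eight_salaries = [60, 50, 40, 45, 42, 38, 65, 71]
median_of_eight = median_of(eight_salaries)

print("Median of nine salaries:", median_salary)
print("Median of eight salaries:", median_of_eight)
```

Sorting in ascending or descending order gives the same median, since the middle position is the same from either end.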

Mode: Miss Rama observed at a club meeting that three of the girls wore red dresses, seven wore black dresses and four wore pink dresses. Since more girls wore black than any other colour she said, “Black is the mode or fashion.”

The mode is another kind of average, and it is found by observing the frequency with which each number in a set of numbers occurs. Since the mode can be found by inspection, it is the easiest of the measures of central tendency to obtain. However, as you will soon see, it is not an especially reliable index of clustering.

The mode of a set of numbers is the number which occurs most frequently in the set. If no number occurs more than once, the set of data is said to have no mode. If two or more numbers share the highest frequency, the set of data has more than one mode.

For example:

Observations                          Mode
6, 7, 8, 9, 14                        No mode
16, 17, 17, 17, 18, 19, 20            One mode: 17
52, 58, 58, 58, 65, 73, 73, 73        Two modes: 58 and 73

Example: Find the mode of the following years of experience of teachers in a school:

7, 6, 10, 12, 5, 4, 7, 4, 2, 7, 1, 2, 3, 10, 1, 7, 5, 4.

Solution: By inspection, 7 occurs most often, with a frequency of 4. Therefore, 7 years of experience is the mode: more teachers have 7 years of experience than any other number of years.

Years of experience   12   10    7    6    5    4    3    2    1
Frequency              1    2    4    1    2    3    1    2    2
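Finding the mode amounts to counting frequencies and picking the largest. A minimal Python sketch follows. The function `modes` is our own helper, and it follows the convention stated above that a set of data in which no number occurs more than once has no mode.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means the data has no mode."""
    if not values:
        return []
    counts = Counter(values)          # frequency of each observation
    highest = max(counts.values())    # the largest frequency
    if highest == 1:
        return []                     # no number occurs more than once
    return sorted(v for v, c in counts.items() if c == highest)


# Years of experience of the teachers, as given in the example.
experience = [7, 6, 10, 12, 5, 4, 7, 4, 2, 7, 1, 2, 3, 10, 1, 7, 5, 4]

print("Mode(s) of experience:", modes(experience))
print(modes([6, 7, 8, 9, 14]))                   # no mode
print(modes([16, 17, 17, 17, 18, 19, 20]))       # one mode
print(modes([52, 58, 58, 58, 65, 73, 73, 73]))   # two modes
```

Returning a list lets the same function report no mode, one mode or several modes, matching the three cases in the table above.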

A histogram is a graphical representation of the distribution of data. It is an estimate of the probability distribution of a continuous variable (quantitative variable) and was first introduced by Karl Pearson.