**Statistics in Economics**

Scarcity is the root of all economic problems. Economics is often discussed in three parts: consumption, production and distribution.

We want to know how the consumer decides, given his income and many alternative goods to choose from, what to buy when he knows the prices. This is the study of Consumption. We also want to know how the producer, similarly, chooses what and how to produce for the market. This is the study of Production. Finally, we want to know how the national income, or the total income arising from what has been produced in the country (called the Gross Domestic Product or GDP), is distributed through wages (and salaries), profits and interest. (We leave aside here income from international trade and investment.) This is the study of Distribution.

Economics is the study of how people and society choose to employ scarce resources that could have alternative uses in order to produce various commodities that satisfy their wants and to distribute them for consumption among various persons and groups in society.

Statistics deals with the collection, analysis, interpretation and presentation of numerical data.

**Collection of Data**

Primary and Secondary Data Sources.

The questionnaire should start from general questions and proceed to more specific ones. For example: (i) Is the electricity supply in your locality regular? (ii) Is increase in electricity charges justified?

**Pilot Survey**

Once the questionnaire is ready, it is advisable to conduct a try-out with a small group, which is known as a Pilot Survey or Pre-testing of the questionnaire.

**CENSUS AND SAMPLE SURVEYS**

A survey that includes every element of the population is known as a Census or the Method of Complete Enumeration.

**SAMPLING AND NON-SAMPLING ERRORS**

Sampling error refers to the difference between the sample estimate and the corresponding population parameter (the actual value of a characteristic of the population, for example, average income). It is possible to reduce the magnitude of sampling error by taking a larger sample.
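A small simulation can make this concrete. The sketch below is illustrative only: the population of incomes is randomly generated (an assumption, not real data), and the names `population`, `sampling_error` and `average_abs_error` are hypothetical helpers introduced for this example.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: monthly incomes (in Rs) of 10,000 households.
population = [random.uniform(5_000, 50_000) for _ in range(10_000)]
population_mean = statistics.fmean(population)  # the population parameter


def sampling_error(sample_size):
    """Return (sample estimate - population parameter) for one random sample."""
    sample = random.sample(population, sample_size)
    return statistics.fmean(sample) - population_mean


def average_abs_error(sample_size, repeats=200):
    """Average magnitude of the sampling error over many repeated samples."""
    return statistics.fmean(
        abs(sampling_error(sample_size)) for _ in range(repeats)
    )


# Larger samples should, on average, give a smaller sampling error.
for n in (50, 500, 5_000):
    print(n, round(average_abs_error(n), 1))
```

When the sample is the whole population (a Census), the sampling error in this sketch is zero, which is why any errors that remain in a Census are non-sampling errors.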

Non-sampling errors are more serious than sampling errors because a sampling error can be minimized by taking a larger sample. It is difficult to minimize non-sampling error, even by taking a large sample. Even a Census can contain non-sampling errors. Some of the non-sampling errors are: Sampling Bias, Non-Response Errors, Errors in Data Acquisition.

**CENSUS OF INDIA AND NSSO**

The Census of India and the National Sample Survey (NSS) are two important agencies at the national level that collect, process and tabulate data on many important economic and social issues. The Census has been conducted regularly every ten years since 1881, and the first Census after Independence was conducted in 1951. The NSS was established by the Government of India to conduct nationwide surveys on socio-economic issues.

**CLASSIFICATION OF DATA**

The data collected from primary and secondary sources are raw or unclassified. Once the data are collected, the next step is to classify them for further statistical analysis. Classification brings order to the data. Raw data can be classified in various ways depending on the purpose.

Data can be grouped according to time. Such a classification is known as a Chronological Classification. In such a classification, data are classified either in ascending or in descending order with reference to time such as years, quarters, months, weeks, etc.

In Spatial Classification the data are classified with reference to geographical locations such as countries, states, cities, districts, etc.

Sometimes you come across characteristics that cannot be expressed quantitatively. Such characteristics are called Qualities or Attributes, for example, nationality, literacy, religion, gender, marital status, etc. They cannot be measured. Yet these attributes can be classified on the basis of either the presence or the absence of a qualitative characteristic. Such a classification of data on attributes is called a Qualitative Classification.

Characteristics, like height, weight, age, income, marks of students, etc., are quantitative in nature. When the collected data of such characteristics are grouped into classes, it becomes a Quantitative Classification.

**Continuous and Discrete Variables**

A continuous variable can take any numerical value. It may take integral values (1, 2, 3, 4, …), fractional values (1/2, 2/3, 3/4, …), and values that are not exact fractions (√2 ≈ 1.414, √3 ≈ 1.732, …, √7 ≈ 2.646). For example, the height of a student, as he/she grows say from 90 cm to 150 cm, would take all the values in between them. It can take values that are whole numbers like 90 cm, 100 cm, 108 cm, 150 cm. It can also take fractional values like 90.85 cm, 102.34 cm, 149.99 cm, etc., that are not whole numbers.

Unlike a continuous variable, a discrete variable can take only certain values. Its value changes only by finite “jumps”. It “jumps” from one value to another but does not take any intermediate value between them. For example, a variable like the “number of students in a class”, for different classes, would assume values that are only whole numbers.

**FREQUENCY DISTRIBUTION**

A frequency distribution shows how different values of a variable (for example, the marks in mathematics scored by students) are distributed in different classes along with their corresponding class frequencies. For marks out of 100, we could have ten classes of marks: 0–10, 10–20, … , 90–100. The term Class Frequency means the number of values in a particular class.
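As a rough illustration, class frequencies can be counted as follows. The marks are invented for this example, and the function name `frequency_distribution` is hypothetical. The sketch assumes the exclusive method for class boundaries, which the text above does not specify.

```python
from collections import Counter

# Hypothetical marks (out of 100) of 20 students; illustrative data only.
marks = [47, 12, 65, 90, 33, 58, 71, 8, 49, 100,
         60, 25, 77, 50, 39, 84, 66, 10, 55, 42]


def frequency_distribution(values, width=10, upper=100):
    """Count values in the classes 0-10, 10-20, ..., 90-100.

    Assumed convention (exclusive method): a value equal to a class
    boundary goes to the higher class, so 10 falls in 10-20. The top
    value (100) is placed in the last class, 90-100.
    """
    counts = Counter()
    for v in values:
        lower = min((v // width) * width, upper - width)
        counts[lower] += 1
    return {f"{lo}-{lo + width}": counts[lo] for lo in range(0, upper, width)}


# Print each class with its class frequency.
for cls, freq in frequency_distribution(marks).items():
    print(cls, freq)
```

The class frequencies always add up to the total number of observations, which is a quick check on any frequency table.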

**BIVARIATE FREQUENCY DISTRIBUTION**

Very often when we take a sample from a population we collect more than one type of information from each element of the sample. For example, suppose we have taken a sample of 20 companies from the list of companies based in a city. Suppose that we collect information on sales and expenditure on advertisements from each company. In this case, we have bivariate sample data. Such bivariate data can be summarised using a Bivariate Frequency Distribution.
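A bivariate frequency distribution can be sketched by counting how many companies fall into each combination of a sales class and an advertisement-expenditure class. All figures below are invented for illustration, and the class widths, starting points and the helper `class_of` are assumptions made for this example.

```python
from collections import Counter

# Hypothetical data for 20 companies: (sales, advertisement expenditure),
# both in lakh Rs. The figures are invented purely for illustration.
companies = [
    (115, 62), (125, 64), (135, 66), (145, 68), (155, 70),
    (165, 72), (175, 74), (185, 76), (195, 78), (120, 63),
    (140, 67), (160, 71), (180, 75), (130, 65), (150, 69),
    (170, 73), (190, 77), (118, 62), (158, 70), (198, 79),
]


def class_of(value, start, width):
    """Return the label of the class (lower limit inclusive) containing value."""
    lower = start + ((value - start) // width) * width
    return f"{lower}-{lower + width}"


# Each cell of the table is keyed by (sales class, advertisement class).
table = Counter(
    (class_of(sales, 100, 20), class_of(ads, 60, 5))
    for sales, ads in companies
)

for (sales_class, ads_class), freq in sorted(table.items()):
    print(sales_class, ads_class, freq)
```

Summing the cells along one variable gives the ordinary (univariate) frequency distribution of the other variable.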

**Presentation of Data**

There are generally three forms of presentation of data:

Textual or Descriptive presentation, Tabular presentation, Diagrammatic presentation.

Classification used in tabulation is of four kinds:

• Qualitative

• Quantitative

• Temporal and

• Spatial

A good table should essentially have the following: a Table Number; a Title, placed at the head of the table; Column Headings; Row Headings; the Body of the Table; the Unit of Measurement; and a Source note, generally written at the bottom of the table.

Diagrams may be less accurate but are much more effective than tables in presenting the data.

There are various kinds of diagrams in common use. Amongst them the important ones are the following:

(i) Geometric diagram: Bar diagram and pie diagram come in the category of geometric diagram.

(ii) Frequency diagram: Data in the form of grouped frequency distributions are generally represented by frequency diagrams like histogram, frequency polygon, frequency curve and ogive. A histogram looks similar to a bar diagram, but there are more differences than similarities. In a histogram no space is left between two rectangles, but in a bar diagram some space must be left between consecutive bars. The width in a histogram is as important as its height. We can have a bar diagram both for discrete and continuous variables, but a histogram is drawn only for a continuous variable.

(iii) Arithmetic line graph

**Correlation**

Correlation is commonly classified into negative and positive correlation. The correlation is said to be positive when the variables move together in the same direction. When income rises, consumption also rises. When income falls, consumption also falls. Sale of ice cream and temperature move in the same direction. The correlation is negative when the variables move in opposite directions. When the price of apples falls, the demand for them increases. When the price rises, the demand decreases. Three important tools used to study correlation are scatter diagrams, Karl Pearson’s coefficient of correlation and Spearman’s rank correlation.
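The two coefficients can be computed in a few lines. This is a minimal sketch with invented temperature and ice-cream figures. The function names are hypothetical, and the rank correlation uses the simple formula 1 − 6Σd² / (n(n² − 1)), which assumes there are no tied ranks.

```python
import math

# Hypothetical observations: daily temperature (°C) and ice-cream sales (units).
temperature = [20, 24, 27, 30, 33, 36]
ice_cream_sales = [110, 135, 160, 150, 210, 240]


def pearson_r(x, y):
    """Karl Pearson's coefficient of correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank 1 = smallest value. Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rank correlation (no-ties formula)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


print(round(pearson_r(temperature, ice_cream_sales), 3))
print(round(spearman_rho(temperature, ice_cream_sales), 3))
```

Both values are positive here because sales rise with temperature. A series such as price against quantity demanded would instead give a negative coefficient.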

**Index Numbers**

The value of money does not remain constant over time. It rises or falls and is inversely related to the changes in the price level. A rise in the price level means a fall in the value of money and a fall in the price level means a rise in the value of money. Thus, changes in the value of money are reflected by the changes in the general level of prices over a period of time. Changes in the general level of prices can be measured by a statistical device known as ‘index number.’

Price index number indicates the average of changes in the prices of representative commodities at one time in comparison with that at some other time taken as the base period.

Steps or Problems in the Construction of Price Index Numbers:

1. Selection of Base Year:

The first step or the problem in preparing the index numbers is the selection of the base year. The base year is defined as that year with reference to which the price changes in other years are compared and expressed as percentages. The base year should be a normal year.

In other words, it should be free from abnormal conditions like wars, famines, floods, political instability, etc. The base year can be selected in two ways: (a) through the fixed base method, in which the base year remains fixed; and (b) through the chain base method, in which the base year goes on changing, e.g., for 1980 the base year will be 1979, for 1979 it will be 1978, and so on.

2. Selection of Commodities:

The second problem in the construction of index numbers is the selection of the commodities. Since all commodities cannot be included, only representative commodities should be selected keeping in view the purpose and type of the index number.

3. Collection of Prices:

After selecting the commodities, the next problem is the collection of their prices. Questions that arise include:

(a) From where the prices are to be collected;

(b) Whether to choose wholesale prices or retail prices;

(c) Whether to include taxes in the prices or not, etc.

While collecting prices, the following points are to be noted:

(a) Prices are to be collected from those places where a particular commodity is traded in large quantities.

(b) Published information regarding the prices should also be utilised.

(c) In selecting individuals and institutions who would supply price quotations, care should be taken that they are not biased.

(d) Selection of wholesale or retail prices depends upon the type of index number to be prepared. Wholesale prices are used in the construction of general price index and retail prices are used in the construction of cost-of-living index number.

4. Selection of Average:

Since index numbers are a specialised average, the fourth problem is to choose a suitable average.

5. Selection of Weights:

Generally, the commodities included in the construction of index numbers are not all of equal importance. Therefore, if the index numbers are to be representative, proper weights should be assigned to the commodities according to their relative importance.

For example, the prices of books will be given more weightage while preparing the cost-of-living index for teachers than while preparing the cost-of-living index for the workers. Weights should be unbiased and be rationally and not arbitrarily selected.

6. Calculation.
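The steps above can be sketched with a small worked example. The commodities, prices and quantities are invented. The weighted version uses base-year quantities as weights (a Laspeyres-type aggregative index), which is one common choice and is an assumption here, since the text does not prescribe a particular formula. The function names are hypothetical.

```python
# Hypothetical basket: (base-year price, current-year price, base-year quantity).
# Prices are in Rs per unit; all figures are illustrative only.
basket = {
    "rice":  (20.0, 30.0, 10),
    "wheat": (15.0, 18.0, 8),
    "milk":  (25.0, 40.0, 6),
    "cloth": (50.0, 60.0, 2),
}


def simple_aggregative_index(items):
    """Unweighted index: total of current prices over total of base prices."""
    base_total = sum(p0 for p0, p1, q0 in items.values())
    current_total = sum(p1 for p0, p1, q0 in items.values())
    return current_total / base_total * 100


def weighted_aggregative_index(items):
    """Laspeyres-type index: base-year quantities serve as the weights."""
    base_cost = sum(p0 * q0 for p0, p1, q0 in items.values())
    current_cost = sum(p1 * q0 for p0, p1, q0 in items.values())
    return current_cost / base_cost * 100


print(round(simple_aggregative_index(basket), 1))
print(round(weighted_aggregative_index(basket), 1))
```

The two results differ because the weighted index gives more importance to commodities that take up a larger share of base-year spending. By construction, the index for the base year itself is 100.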

**SOME IMPORTANT INDEX NUMBERS**

**Consumer Price Index**

The consumer price index (CPI), also known as the cost of living index, measures the average change in retail prices. Consider the statement that the CPI for industrial workers (2001=100) is 277 in December 2014. What does this statement mean? It means that if the industrial worker was spending Rs 100 in 2001 for a typical basket of commodities, he needs Rs 277 in December 2014 to be able to buy an identical basket of commodities. It is not necessary that he/she buys the basket.

**Consumer Price Index Number**

Government agencies in India prepare a large number of consumer price index numbers. Some of them are as follows:

• Consumer Price Index Numbers for Industrial Workers with base 2001=100. Value of Index in May 2017 was 278.

• All-India Consumer Price Index Numbers for Agricultural Labourers with base 1986-87=100. Value of Index in May 2017 was 872.

• All-India Consumer Price Index Numbers for Rural Labourers with base 1986-87=100. Value of Index in May 2017 was 878.

• All-India Rural Consumer Price Index with base 2012 = 100. Value of Index in May 2017 was 133.3.

• All-India Urban Consumer Price Index with base 2012 = 100. Value of Index in May 2017 was 129.3.

• All-India Combined Consumer Price Index with base 2012 = 100. Value of Index in May 2017 was 131.4.

In addition, these indices are available at the state level.

The Reserve Bank of India is using the All-India Combined Consumer Price Index as the main measure of how consumer prices are changing.

Therefore, some details are necessary about this index number. This index is now being prepared with base 2012 = 100 and many improvements have been made in accordance with international standards.

**Wholesale Price Index**

The wholesale price index number indicates the change in the general price level. Unlike the CPI, it does not have any reference consumer category. The Wholesale Price Index is now being prepared with base 2011-12 = 100. The value of the index for May 2017 was 112.8.

**SENSEX**

Sensex is the short form of Bombay Stock Exchange Sensitive Index with 1978–79 as base. The value of the sensex is with reference to this period.

• Consumer price index (CPI) numbers, also called cost of living index numbers, are helpful in wage negotiation, formulation of income policy, price policy, rent control, taxation and general economic policy formulation.

• The wholesale price index (WPI) is used to eliminate the effect of changes in prices on aggregates, such as national income, capital formation, etc.

• The WPI is widely used to measure the rate of inflation. The CPI is used in calculating the purchasing power of money and the real wage.
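These uses reduce to simple arithmetic on index values. The sketch below uses the CPI figure of 277 (2001=100) quoted earlier. The money wage of Rs 20,000 and the two index values passed to `inflation_rate` are hypothetical, as are the function names.

```python
def purchasing_power(cpi, base=100.0):
    """Value of one rupee today, measured in base-period rupees."""
    return base / cpi


def real_wage(money_wage, cpi, base=100.0):
    """Money wage deflated by the CPI, i.e. expressed in base-period prices."""
    return money_wage * base / cpi


def inflation_rate(previous_index, current_index):
    """Percentage change in a price index between two periods."""
    return (current_index - previous_index) / previous_index * 100


cpi_dec_2014 = 277   # CPI for industrial workers, base 2001=100 (from the text)
money_wage = 20_000  # hypothetical monthly wage in Rs

print(round(purchasing_power(cpi_dec_2014), 3))       # rupee value in 2001 terms
print(round(real_wage(money_wage, cpi_dec_2014), 2))  # wage in 2001 prices
print(round(inflation_rate(120, 126), 2))             # hypothetical index values
```

A rising index therefore lowers both the purchasing power of money and the real wage unless the money wage rises in step, which is why the CPI features in wage negotiation.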