**Data collection** is the systematic process of gathering information to provide evidence for solving economic problems and reaching sound conclusions.
**Key Concepts:**
**Purpose of Data Collection:**
---
Data can originate from two main sources, each with distinct characteristics and applications.
**Primary Data**
**Definition**: Data collected first-hand by the researcher, directly from the source, for the first time.
**Characteristics:**
**Example**: A researcher who interviews school students to find out how popular a film star is among them is collecting primary data.
**Advantages:**
**Disadvantages:**
**Secondary Data**
**Definition**: Data already collected, processed (scrutinised and tabulated), and published by another agency, which is then used by other researchers.
**Characteristics:**
**Sources of Secondary Data:**
**Example**: When a researcher uses data on film star popularity that was collected earlier in a similar study, that data is secondary data for the new researcher.
**Advantages:**
**Disadvantages:**
---
**Survey**: A systematic method of gathering information from individuals to describe specific characteristics (price, quality, usefulness, popularity, etc.).
A questionnaire is the primary instrument used in surveys. It may be self-administered by respondents or administered by trained enumerators.
**Essential Guidelines for Questionnaire Design:**
**1. Length and Conciseness**
**2. Language and Clarity**
**3. Logical Sequence and Flow**
**Poor Example**: Asking whether electricity charges are justified before asking about the regularity of supply.
**Good Example**: Asking about the regularity of electricity supply first, then about the charges.
**4. Precision and Clarity**
**Poor**: "What percentage of your income do you spend on clothing in order to look presentable?"
**Good**: "What percentage of your income do you spend on clothing?"
**5. Avoiding Ambiguity**
**Poor**: "Do you spend a lot of money on books in a month?"
**Good**: "How much do you spend on books in a month?"
**6. Avoiding Double Negatives**
**Poor**: "Don't you think smoking should be prohibited?"
**Good**: "Do you think smoking should be prohibited?"
**7. Avoiding Leading Questions**
**Poor**: "How do you like the flavour of this high-quality tea?"
**Good**: "How do you like the flavour of this tea?"
**8. Avoiding Questions Indicating Alternatives**
**Poor**: "Would you like to do a job after college or be a housewife?"
**Good**: "What would you like to do after college?"
**Open-Ended (Unstructured) Questions**
**Advantages:**
**Disadvantages:**
**Closed-Ended (Structured) Questions**
**Two-Way Questions** (Binary choice):
**Multiple Choice Questions**:
**Advantages of Closed-Ended Questions:**
**Disadvantages of Closed-Ended Questions:**
**"Any Other" Option**:
---
Three primary methods exist for collecting survey data, each with distinct advantages, disadvantages, and applications.
**Personal Interview**
**Definition**: The researcher (or investigator) conducts face-to-face interviews with respondents.
**When Used**: When the researcher has access to all members of the population.
**Advantages:**
**Disadvantages:**
**Best For**: Urban areas, small populations, questions requiring clarification.
**Mail Survey**
**Definition**: The questionnaire is sent to respondents by mail with a request to complete and return it by a specified date.
**Modern Variants**: Online surveys and SMS surveys.
**Advantages:**
**Disadvantages:**
**Best For**: Literate respondents, questions needing anonymity, geographically dispersed areas.
**Telephone Interview**
**Definition**: The investigator asks questions over the telephone; the respondent answers orally.
**Advantages:**
**Disadvantages:**
**Best For**: Quick surveys, urban educated populations, when respondents have phone access, supplementary interviews.
| Feature | Personal Interview | Mail Survey | Telephone Interview |
|---------|-------------------|-------------|-------------------|
| Cost | High | Low | Medium |
| Response Rate | Highest | Lowest | Medium-High |
| Time | Long | Long | Short |
| Interviewer Influence | High | None | Medium |
| Reactions Observable | Yes | No | No |
| Geographic Reach | Limited | Excellent | Medium |
| Best for Open Questions | Yes | No | No |
| Suitable for Remote Areas | No | Yes | Limited |
---
**Pilot Survey**
**Definition**: A try-out of the questionnaire with a small sample group before conducting the actual survey.
**Purpose and Functions:**
**1. Testing Questionnaire Quality**
**2. Operational Assessment**
**3. Resource Planning**
**Benefits:**
**Process**:
---
**Census**
**Definition**: A survey that includes every element/unit of the population.
**Also Known As**: Method of Complete Enumeration.
**Characteristics:**
**Census in India:**
**Census of India - Historical Overview:**
**Population Growth Rates:**
(Note: A declining growth rate indicates demographic transition.)
**Data Collected in Census:**
**Conducting Census:**
**Advantages:**
**Disadvantages:**
**Population (Universe)**
**Definition**: The totality of all items/individuals under study; the entire group to which study results are intended to apply.
**Characteristics:**
**Example - Research Problem**: To study the economic condition of agricultural labourers in Churachandpur district, Manipur.
**Sample**
**Definition**: A group or section of the population from which information is actually obtained; it is smaller than the population and should be representative of it.
**Characteristics:**
**Example (continued)**: From above research problem
**Representative Sample**:
**Advantages of Sample Over Census:**
---
**Random Sampling**
**Definition**: A sampling method in which individual units are selected from the population at random.
**Key Principle**: Every unit in the population has an equal probability of being selected.
**Methods of Random Selection:**
**1. Lottery Method**
**Example**:
**2. Random Number Tables**
**3. Computer-Based Selection**
**Characteristics:**
**Sampling Frame**: Complete list of all units in population from which sample is drawn
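The lottery method and computer-based selection amount to the same operation: drawing units from the sampling frame without replacement so that each unit is equally likely to be picked. The following is a minimal Python sketch of that idea. The frame of 300 households, the sample size of 30, and the function name `draw_simple_random_sample` are all illustrative assumptions, not part of the original notes.

```python
import random

# Hypothetical sampling frame: a complete list of the units in the population.
sampling_frame = [f"Household {i}" for i in range(1, 301)]  # 300 units

def draw_simple_random_sample(frame, sample_size, seed=None):
    """Pick sample_size distinct units; every unit has the same chance."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(frame, sample_size)  # selection without replacement

sample = draw_simple_random_sample(sampling_frame, 30, seed=42)
print(len(sample))   # 30
print(sample[:3])    # the first three selected units
```

Because `random.sample` selects without replacement, no household can appear twice, which mirrors drawing slips from a well-mixed box.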
**Advantages:**
**Disadvantages:**
**Exit Polls - Application of Random Sampling:**
During elections, television networks use exit polls to predict results: a random sample of voters leaving the polling booths is asked whom they voted for.
**Non-Random Sampling**
**Definition**: A sampling method in which the investigator uses judgment, convenience, or predetermined criteria to select the sample; not all units have an equal chance of selection.
**Key Principle**: Investigator's judgment plays important role; convenience and bias may influence selection.
**Characteristics:**
**Example**:
**Methods of Non-Random Sampling:**
**1. Convenience Sampling**
**2. Judgment Sampling**
**3. Purposive Sampling**
**4. Quota Sampling**
**Disadvantages:**
**Advantages:**
---
**Sampling Error**
**Definition**: The difference between a sample estimate (calculated from the sample) and the corresponding population parameter (the actual value of the population characteristic).
**Nature of Sampling Error:**
**Mathematical Expression:**
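In words, sampling error = sample estimate − population parameter. A small worked example may make this concrete. The sketch below assumes a hypothetical population of five farmers' incomes and a sample of two of them. The figures and variable names are illustrative, and the sign convention (estimate minus parameter) is an assumption; some texts subtract in the opposite order or report only the magnitude.

```python
# Hypothetical population: incomes of five farmers (illustrative figures).
population_incomes = [500, 550, 600, 650, 700]

# Suppose the sample happens to contain the first and third farmers.
sample_incomes = [population_incomes[0], population_incomes[2]]  # 500 and 600

population_mean = sum(population_incomes) / len(population_incomes)  # parameter
sample_mean = sum(sample_incomes) / len(sample_incomes)              # estimate

# Sign convention assumed here: estimate minus parameter.
sampling_error = sample_mean - population_mean

print(population_mean)  # 600.0
print(sample_mean)      # 550.0
print(sampling_error)   # -50.0
```

A different pair of farmers would give a different sample mean, so the sampling error depends on which units happen to be selected.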
**Causes:**
**Reduction Methods:**
**Characteristics:**
**Non-Sampling Error**
**Definition**: Errors in data collection, processing, and analysis that are NOT due to sampling; they occur in both census and sample surveys.
**Sources of Non-Sampling Error:**
**1. Data Collection Errors**
**2. Interviewer-Related Errors**
**3. Respondent-Related Errors**
**4. Processing Errors**
**5. Survey Design Errors**
**6. Non-Response Errors**
**Characteristics:**
**Reduction/Minimisation Methods:**
**Comparison: Sampling vs Non-Sampling Error**
| Aspect | Sampling Error | Non-Sampling Error |
|--------|---|---|
| **Definition** | Difference between sample estimate and population parameter | Errors in collection, processing, analysis (not due to sampling) |
| **Occurrence** | Only in sample surveys | In both census and sample surveys |
| **Measurement** | Can be measured statistically | Difficult to measure |
| **Relationship to Sample Size** | Decreases with larger sample | Unrelated to sample size |
| **Control** | Can be controlled by proper sampling | Difficult to control completely |
| **Source** | Inherent to sampling process | Errors in survey design and execution |
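The table's claim that sampling error decreases with a larger sample can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population of 1,000 incomes is randomly generated rather than real data, and the function name `average_absolute_error` and the trial count are hypothetical choices.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 monthly incomes (illustrative values).
population = [rng.randint(5_000, 50_000) for _ in range(1_000)]
population_mean = sum(population) / len(population)

def average_absolute_error(sample_size, trials=1_000):
    """Average |sample mean - population mean| over many random samples."""
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        sample_mean = sum(sample) / sample_size
        total += abs(sample_mean - population_mean)
    return total / trials

# Larger samples should give a smaller typical sampling error.
for n in (10, 100, 500):
    print(n, round(average_absolute_error(n)))
```

The simulation models sampling error only. Non-sampling errors, such as misreported incomes or data-entry mistakes, are not represented here and would not shrink simply because more units are surveyed.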
---
**Key Takeaways for Board Examination:**
1. **Data Collection Purpose**: Provides evidence for sound economic decision-making and problem-solving
2. **Primary vs Secondary Data**: Primary (first-hand, specific) vs Secondary (processed, available from published sources)
3. **Questionnaire Design**: Must be clear, concise, unambiguous, logically sequenced, free from bias and leading questions
4. **Three Modes of Collection**: Personal interviews (highest response, most expensive), Mail surveys (cheapest, lowest response), Telephone interviews (medium cost and response)
5. **Census**: Complete enumeration every 10 years in India, providing comprehensive population data
6. **Sampling**: Representative sample more practical than census due to cost, time, and intensive inquiry advantages
7. **Random Sampling**: Scientific selection ensuring equal probability for all units
8. **Non-Random Sampling**: Uses investigator judgment; less reliable but practical
9. **Sampling Error**: Measurable difference between sample and population; decreases with larger sample size
10. **Non-Sampling Error**: More serious; occurs in all surveys; requires careful design and execution to minimise
**Indian Context Relevance:**
Q1. Which of the following is an example of primary data?
Answer: B — Primary data is collected first-hand by the researcher through direct enquiry; options A, C, and D are secondary data already published by other sources.
Q2. Which statement correctly distinguishes primary and secondary data?
Answer: C — This is the correct definition: primary data is original first-hand collection, while secondary data is already processed by someone else; option B reverses the definitions.
Q3. What is a variable in statistics?
Answer: B — A variable is any quantity that varies across observations; for example, food grain production (Y) varies across years (X).
Q4. A researcher wants to study the average monthly income of farmers in a district with 50,000 farms. Which approach would be MORE practical?
Answer: B — A sample survey is practical, faster, and cheaper than a census of all 50,000 farms while still providing reliable results; option A is too costly and time-consuming.
Q5. Which of the following is a problem with a LEADING QUESTION?
Answer: B — A leading question suggests a desired answer (e.g., 'How do you like this high-quality tea?'), influencing respondents to answer in that direction rather than expressing true opinion.
Q6. Which questionnaire design principle is CORRECTLY applied?
Answer: B — Correct questionnaire design arranges questions from general to specific to build comfort; options A, C, and D violate questionnaire design principles.
Q7. A poor questionnaire asks: 'Do you spend a lot of money on books every month?' Which revision BEST improves this question?
Answer: B — Option B removes ambiguity by providing clear, measurable categories instead of vague terms like 'a lot'; it prevents misinterpretation and is easy to analyse.
Q8. Which is NOT a correct statement about closed-ended questions?
Answer: C — Closed-ended questions may miss true responses if the actual answer is not among the given options; options A, B, and D correctly describe closed-ended questions.
Q9. According to the material, India's food grain production was 108 million tonnes in 1970–71 and rose to 272 million tonnes in 2016–17. In this data, which of the following is correctly identified? (i) Year (X) is a variable (ii) Production (Y) is a variable (iii) Each production value (e.g., 108) is an observation (iv) Variables are represented by numbers only, never letters
Answer: B — Both year and production are variables (they change), and each value is an observation; statement (iv) is false because variables are represented by letters like X and Y.
Q10. A researcher collects data by directly interviewing 300 farmers about their annual income. Later, an economist uses this same data from the researcher's published report. For the researcher, this data is PRIMARY, but for the economist, it is SECONDARY because:
Answer: B — Data is primary to the source that collects it first and secondary to anyone else who uses the already-processed data from a published source; the same data changes status based on who is using it and when.
What is primary data?
Data collected directly by the researcher through first-hand enquiry or survey.
What is secondary data?
Data already collected, processed, and published by another agency or source.
Define a variable in statistics.
A characteristic or quantity that changes or varies from observation to observation.
What is an observation?
A single value or measurement of a variable in a dataset.
What is a questionnaire?
A set of carefully designed questions used to collect data from respondents in a survey.
Distinguish between closed-ended and open-ended questions.
Closed-ended questions offer fixed options (yes/no or multiple choice); open-ended questions allow respondents to write their own answer.
Why should a questionnaire avoid double negatives?
Double negatives (like 'Don't you think...') confuse respondents and can bias answers.
What is a leading question and why is it problematic?
A leading question hints at the expected answer, biasing the respondent's response instead of capturing true opinion.
Name three sources of secondary data in India.
Government reports, census documents, newspapers, books by economists, and official websites like RBI or Ministry of Statistics.
Why is a questionnaire arranged from general to specific questions?
This order makes respondents comfortable by starting easy, building rapport before asking sensitive or detailed questions.
Define primary data and secondary data with one example each from the Indian economic context. [2 marks]
Primary = first-hand collection by researcher (e.g., survey of farmers about crop yield); Secondary = already published by others (e.g., RBI inflation data or census reports).
A researcher wants to design a questionnaire to study the awareness of government agricultural schemes among rural farmers in Maharashtra. Identify and correct THREE common questionnaire design errors shown below: (i) 'Don't you think government schemes are helpful?' (ii) 'How much do you earn annually?' (iii) 'Would you prefer job training or subsidy assistance?' (shows alternatives) Explain why each error must be corrected and rewrite the improved versions. [5 marks]
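Error (i) = negatively framed "Don't you think..." opening (double negative), which biases the respondent towards agreeing; rewrite as "Do you think government schemes are helpful?". Error (ii) = vague, with no response categories offered; rewrite as a closed-ended question with clear income ranges. Error (iii) = indicates alternatives, which restricts the respondent to pre-set choices; rewrite as "What kind of assistance would you prefer?". In each rewrite, use simple language and avoid bias-inducing phrasing.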
Discuss the practical advantages and limitations of using a SAMPLE SURVEY instead of a CENSUS for studying the income and employment patterns of 2 lakh workers in India. Use relevant examples and explain when a researcher might choose one over the other. [6 marks]
Sample Survey: cheaper, faster, practical for large populations but has sampling error and may not capture all subgroups; Census: complete and accurate but very costly and time-consuming; justify choice based on research objective, budget, and time constraints with Indian economic example (e.g., Labour Force Survey vs Population Census).