One Man, One World (ஒரு மனிதன், ஒரு உலகம் )

India Census 2001 – Part 1

For the last few weeks I have been trying to get the 2001 Indian census data. Alas, the census website is under construction. Fortunately, the Internet's rewind button works, and the literacy data was still online there. The raw data is available here.

I cleaned up the data so that it is easy to work with in R. I removed the commas from the numbers. In the urban status column, I also removed the dots and capitalized the status codes. One of the urban status codes became ‘NA’, and since R treats ‘NA’ as missing data, I changed it to NA1.
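The clean-up described above could also be done in R itself. Here is a minimal sketch, assuming the raw values look like the sample strings below. The helper names `clean_number` and `clean_status` and the dotted status strings are hypothetical, and the capitalization step is left out because the exact rule is not spelled out here.

```r
# Hypothetical helpers; the dotted status strings are illustrative only.

# Remove thousands separators: "966,642" -> 966642
clean_number <- function(x) {
  as.numeric(gsub(",", "", x, fixed = TRUE))
}

# Remove dots from status codes and rename a code that collapses to "NA",
# which read.csv would otherwise treat as a missing value
clean_status <- function(x) {
  s <- gsub(".", "", x, fixed = TRUE)
  s[which(s == "NA")] <- "NA1"
  s
}

print(clean_number(c("966,642", "9,225")))
print(clean_status(c("M.Corp.", "C.T.", "N.A.")))
```

Another option, if renaming the code is undesirable, is to read the file with `read.csv(..., na.strings = "")`, so that the literal text NA in a text column is no longer interpreted as missing.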

The cleaned-up data is available here. Please download it and rename it to india-census-2001.csv.

Here goes the R code to explore the data:

#---------------------------------------------------------------------------
# set the working directory
# replace dir with your own path where "india-census-2001.csv" is stored
setwd("dir")

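# load the plotting package
library(lattice)

# read the data; stringsAsFactors = TRUE keeps text columns such as State as
# factors, which was the read.csv default before R 4.0.0
india <- read.csv(file = "india-census-2001.csv", header = TRUE,
                  stringsAsFactors = TRUE)

# find the places with zero population
# (columns 2 to 5 are City, UrbanStatus, State and District)
india_pop_zero <- subset(india, TotPop == 0)[, c(2, 3, 4, 5)]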
#---------------------------------------------------------------------------

Let us print out those places with zero population.

#---------------------------------------------------------------------------
print(india_pop_zero)
           City UrbanStatus   State District
200       Anjar           M Gujarat  Kachchh
636     Bhachau           M Gujarat  Kachchh
735        Bhuj           M Gujarat  Kachchh
1495 Gandhidham           M Gujarat  Kachchh
2173     Kandla          CT Gujarat  Kachchh
2937     Mandvi           M Gujarat  Kachchh
3128      Morvi           M Gujarat   Rajkot
3178     Mundra          CT Gujarat  Kachchh
4043      Rapar           M Gujarat  Kachchh
5119   Wankaner           M Gujarat   Rajkot
#---------------------------------------------------------------------------

List the population of every city/town in Kachchh district.

#---------------------------------------------------------------------------
subset(india, District == "Kachchh")[,c(2,3,4,5,6)]
           City UrbanStatus   State District TotPop
200       Anjar           M Gujarat  Kachchh      0
636     Bhachau           M Gujarat  Kachchh      0
735        Bhuj           M Gujarat  Kachchh      0
1495 Gandhidham           M Gujarat  Kachchh      0
2173     Kandla          CT Gujarat  Kachchh      0
2937     Mandvi           M Gujarat  Kachchh      0
3178     Mundra          CT Gujarat  Kachchh      0
4043      Rapar           M Gujarat  Kachchh      0
#---------------------------------------------------------------------------

List the population of every city/town in Rajkot district.

#---------------------------------------------------------------------------
subset(india, District == "Rajkot")[,c(2,3,4,5,6)]
                City UrbanStatus   State District TotPop
695       Bhayavadar           M Gujarat   Rajkot  18246
1298         Dhoraji           M Gujarat   Rajkot  80807
1613          Gondal           M Gujarat   Rajkot  95991
1956          Jasdan           M Gujarat   Rajkot  39041
1984 Jetpur Navagadh           M Gujarat   Rajkot 104311
3128           Morvi           M Gujarat   Rajkot      0
3521        Paddhari          CT Gujarat   Rajkot   9225
3967          Rajkot       MCorp Gujarat   Rajkot 966642
4919          Upleta           M Gujarat   Rajkot  55341
5119        Wankaner           M Gujarat   Rajkot      0
#---------------------------------------------------------------------------

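It looks as if no data was collected in the Kachchh region. I wonder why those two towns in Rajkot district, Morvi and Wankaner, suffered the same unfortunate fate. Maybe they are close to the Kachchh region; a plausible cause is the Gujarat earthquake of January 2001, which was centered in Kachchh shortly before the census count. Anyway, let us look at the data with non-zero population.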
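The pattern can be checked with a quick cross-tabulation. The sketch below rebuilds a small data frame from the Kachchh and Rajkot rows printed above (the name `towns` is just for illustration) and counts, per district, how many towns have a recorded population of zero.

```r
# Rows copied from the Kachchh and Rajkot listings printed above
towns <- data.frame(
  City = c("Anjar", "Bhachau", "Bhuj", "Gandhidham", "Kandla", "Mandvi",
           "Mundra", "Rapar", "Bhayavadar", "Dhoraji", "Gondal", "Jasdan",
           "Jetpur Navagadh", "Morvi", "Paddhari", "Rajkot", "Upleta",
           "Wankaner"),
  District = c(rep("Kachchh", 8), rep("Rajkot", 10)),
  TotPop = c(rep(0, 8), 18246, 80807, 95991, 39041, 104311, 0, 9225,
             966642, 55341, 0)
)

# Cross-tabulate district against "is the recorded population zero?"
zero_by_district <- table(towns$District, towns$TotPop == 0,
                          dnn = c("District", "ZeroPop"))
print(zero_by_district)
```

On the full data set, the same idea would be `with(india, table(District, TotPop == 0))`, run before the zero-population rows are dropped.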

Let us plot the literacy rate of each city/town (x-axis) against its state (y-axis).

#---------------------------------------------------------------------------
india <- subset(india, TotPop > 0)
# Plot the literacy data
dotplot(State ~ 100*Literates/TotPop, xlab = "Literacy", data = india)
#---------------------------------------------------------------------------

Here is the plot:

[Figure: india-census-2001-literacy]

Looking at the plot, it is no surprise that Kerala has a very high literacy rate in all its towns, with a low spread. Tamil Nadu shows a wider spread in literacy rates. The Northeastern states are doing very well on education, if we judge them by their literacy rates.
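The spread that the plot shows visually can also be put into numbers. This is a rough sketch, assuming the data frame has the `State`, `Literates` and `TotPop` columns used in the plotting call; the helper name `literacy_spread` is hypothetical. It reports the median and interquartile range of the town-level literacy rate for each state.

```r
# Hypothetical helper: median and interquartile range (IQR) of the
# town-level literacy rate, per state. Expects columns State, Literates
# and TotPop.
literacy_spread <- function(df) {
  df <- df[df$TotPop > 0, ]                # skip places with no population data
  rate <- 100 * df$Literates / df$TotPop   # literacy rate in percent
  state <- as.character(df$State)
  med <- tapply(rate, state, median)
  iqr <- tapply(rate, state, IQR)
  out <- data.frame(State = names(med),
                    Median = as.numeric(med),
                    IQR = as.numeric(iqr))
  out[order(out$IQR), ]                    # tightest spread first
}

# Usage with the data frame loaded earlier:
# literacy_spread(india)
```

A small IQR together with a high median would correspond to what the plot suggests for Kerala: towns clustered tightly at the high end.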

Let us check which city/town has the highest and which has the lowest literacy rate in India.

#---------------------------------------------------------------------------
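# TotLiteracy is the literacy rate in percent. If it is not already a column
# in your copy of the data, it can presumably be derived as:
# india$TotLiteracy <- 100 * india$Literates / india$TotPop
subset(india, TotLiteracy == max(TotLiteracy))[,c("City", "State", "District", "TotLiteracy")]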
        City           State District TotLiteracy
1663 Gulmarg Jammu & Kashmir Baramula    96.23494
#---------------------------------------------------------------------------
subset(india, TotLiteracy == min(TotLiteracy))[,c("City", "State", "District", "TotLiteracy")]
        City       State District TotLiteracy
4666 Tarapur Maharashtra    Thane   0.7843697
#---------------------------------------------------------------------------

Well, this is a surprise. The town with the highest literacy rate is Gulmarg in Jammu & Kashmir, and the one with the lowest is Tarapur in Maharashtra. What is shocking is that the literacy rate in Tarapur is less than 1%. I hope this is a mistake in data collection; otherwise it is a damning indictment of a huge administrative failure in that district, which would be unacceptable.
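A rate below 1% is easier to judge alongside the raw counts it was computed from. Here is a minimal sketch, assuming `TotLiteracy` is simply `100 * Literates / TotPop` (the post does not confirm this); the helper name `literacy_extremes` is hypothetical.

```r
# Hypothetical helper: the highest and lowest literacy rows, together with
# the raw counts behind them.
# Assumption: TotLiteracy = 100 * Literates / TotPop.
literacy_extremes <- function(df) {
  df <- df[df$TotPop > 0, ]                         # avoid division by zero
  df$TotLiteracy <- 100 * df$Literates / df$TotPop
  cols <- c("City", "State", "District", "Literates", "TotPop", "TotLiteracy")
  list(highest = df[which.max(df$TotLiteracy), cols],
       lowest  = df[which.min(df$TotLiteracy), cols])
}

# Usage with the data frame loaded earlier:
# literacy_extremes(india)
```

If the `Literates` count for the lowest row looks out of proportion to `TotPop`, a data-entry problem becomes the more likely explanation. Note that `which.max` and `which.min` return only the first row in case of ties.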

In the next few posts, I will concentrate on Tamil Nadu and Coimbatore. It should be pretty easy to modify the code in those posts to look at the states and districts that interest you.

Written by anandram

March 22, 2009 at 18:04

3 Responses

  1. Hi, I am Sisir, a graduate student in the Economics Dept. at the University of Virginia. I am interested in education in India, and for my research I need data on the literacy rate (block-wise and by sex) from the 2001 census. Can you tell me where to get that data? I appreciate any help in this regard. Thanks and regards, Sisir

    Sisir

    June 30, 2009 at 16:39

  2. thanks for sharing India Census 2001 – Part 1

    ngekngok poems

    October 11, 2010 at 21:15

