STATA Tutorial I
Anusha Nath
Delhi School of Economics
Winter Semester 2008-09
Running STATA
STATA Windows
When STATA is started, there are four
windows that open on the screen:
1. Command
2. Results
3. Review
4. Variables
Anusha Nath
Delhi School of Economics 2008-09
Do-Files
Step 1: Open the Do-file editor by clicking the icon on the menu or
selecting Do from the File menu.
Step 2: Type all the commands required for the analysis into the file.
Step 3: Run the commands as a batch by using the command:
do dofilename
The Do-file can be saved for use in a future STATA session. Note that
STATA automatically saves do-files with the extension .do
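The steps above can be sketched with a very small do-file. The file name example.do and the auto dataset that ships with STATA are assumptions chosen for illustration; any dataset and commands would do.

```stata
* example.do (hypothetical file name)
* Load the example dataset shipped with STATA; clear drops any data in memory
sysuse auto, clear

* Describe the variables, then summarize two of them
describe
summarize price mpg
```

Saving these lines as example.do and typing do example in the Command window runs them as a batch.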
Log Files
A log file keeps a record of all the commands and output of a particular
STATA session.
To open a log file, you can either go to the File menu, select Log and
then Begin, or simply type the following command:
log using logfilename
where logfilename is the name you want to give to the log file. If no
extension is given, STATA appends .smcl (its formatted log format); to get a
plain-text log, type the name with the extension .log or add the text option.
To add new output to an existing (but closed) log file, we can use the
following command:
log using logfilename, append
To erase an existing log file and overwrite it with new output, we can
use the following command:
log using logfilename, replace
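A complete logging session might look like the sketch below. The file name mysession.log is hypothetical; the explicit .log extension asks for a plain-text log, and capture log close is included only so the example also runs when a log is already open.

```stata
* Close any log left open from earlier work (no error if none is open)
capture log close

* Start a plain-text log, overwriting any earlier file of the same name
log using mysession.log, replace

* Commands typed from here on, and their output, are recorded
display "2 + 2 = " 2 + 2

* Stop recording; use the append option later to add to the same file
log close
```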
To list the last n commands typed, we can use the following
command:
#review n
Setting Memory
By default, STATA allocates 1 megabyte of memory
to its data areas. To work with larger datasets, the
allocation must be increased. This
can be done through the following command:
set memory memsize
where memsize can be 2m, 100m, etc.
Note that the memory can be set only if there is
no data currently loaded in STATA. (From Stata 12
onward, memory is managed automatically and this
command is no longer needed.)
Reading Data
use filename
use c:\user\data\filename
Examining Data
Types of Variables
Expressions in STATA
Logical Expressions
Logical expressions are evaluated as 1 if true and 0 if false.
The logical operators used are:
&   and
|   or
!   not
~   not
The relational operators are:
==  equal to
!=  not equal to (~= also works)
<   less than
<=  less than or equal to
>   greater than
>=  greater than or equal to
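The operators can be tried directly with display, and then used with if to restrict a command to the observations for which the expression is true. The sketch below assumes the auto dataset that ships with STATA; the cut-off of 5000 is arbitrary.

```stata
* Logical expressions evaluate to 1 (true) or 0 (false)
display 3 < 5                    // 1
display (3 < 5) & (2 == 3)       // 0
display (3 < 5) | (2 == 3)       // 1
display !(2 == 3)                // 1

* With if, only observations where the expression is true are used
sysuse auto, clear
count if price < 5000 & foreign == 1
```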
Inserting Comments
To make STATA ignore any comments
inserted in the do file, we can use the
following options:
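As a brief sketch, these are the comment markers STATA recognizes in a do-file. The auto dataset is an assumption used only so that the commands have something to run on. Note that // and /// work in do-files but not in the Command window.

```stata
sysuse auto, clear

* A line beginning with an asterisk is a comment
summarize price    // text after two slashes is ignored

/* Anything between these delimiters is ignored,
   even across several lines */

summarize price mpg ///
    weight          // three slashes continue a command on the next line
```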
Observation Indices
Each observation has an index associated with it. The system variable _n
takes on the value of the running index and _N is equal to the
total number of observations. For example, x[3] refers to the third
observation of variable x and
x[_n-1]
refers to the previous observation of a variable x. We can hence
calculate percentage change in a variable x by using the
following expression:
((x[_n]-x[_n-1])/(x[_n-1]))*100
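The percentage-change expression can be checked on a small made-up variable. The data below are generated purely for illustration; the variable names n_total and pct_change are hypothetical.

```stata
clear
set obs 5
generate x = _n * 10          // x = 10, 20, 30, 40, 50
generate n_total = _N         // 5 in every observation

* Percentage change from the previous observation
generate pct_change = ((x[_n] - x[_n-1]) / x[_n-1]) * 100
list
```

The first observation of pct_change is missing, because there is no observation before it for x[_n-1] to refer to.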
Observation Ranges
We can refer to a range of observations either by
using if with a logical expression involving _n or by
adding the in qualifier to a command:
in f/l
where f/l specifies a range of indices: f
is the first index in the range and l is
the last. For example,
list x in 5/12
will list the fifth through twelfth observations of variable
x.
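The two ways of selecting a range can be compared on generated data. This is a sketch on a made-up variable with 20 observations.

```stata
clear
set obs 20
generate x = _n^2

* Two equivalent ways of listing observations 5 through 12
list x in 5/12
list x if _n >= 5 & _n <= 12

* The letter l stands for the last observation; negative indices count from the end
list x in 18/l
list x in -3/l

* count reports how many observations fall in the range
count in 5/12
```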
Generating Variables
Graphs in STATA
Scatter Plots:
scatter yvar xvar, options
Type help scatter to see which options you can
include with scatter. In general, for plots
requiring an x-axis and a y-axis, the command
twoway (short for graph twoway) can be used:
twoway (scatter yvar xvar), options
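Both forms can be tried on the auto dataset that ships with STATA. The dataset and the axis titles are assumptions for illustration, not the tutorial's own data.

```stata
sysuse auto, clear

* Short form
scatter mpg weight

* Equivalent twoway form, with axis titles supplied as options
twoway (scatter mpg weight), ytitle("Mileage (mpg)") xtitle("Weight (lbs.)")
```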
[Figure: scatter plot of government expenditure (y-axis) against military expenditure (x-axis)]
Line Graphs:
The syntax for this is similar to that of the scatter
plot, with line in place of scatter. To get a line graph of government
against military expenditure, we type:
twoway (line ug um), ytitle(government expenditure) xtitle(military expenditure)
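A line plot joins the points in the order in which the observations are stored, so the data are usually sorted on the x variable first. A minimal sketch, assuming the uslifeexp example dataset that ships with STATA rather than the tutorial's own data:

```stata
sysuse uslifeexp, clear

* sort draws the line in order of the x variable (year)
twoway (line le year, sort), ytitle("Life expectancy") xtitle("Year")
```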
[Figure: line graph of government expenditure (y-axis) against military expenditure (x-axis)]
Histograms:
The syntax for the command is:
histogram varname [if] [in] [weight] [, [continuous_opts | discrete_opts] options]
Type help histogram to see the options
available.
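As an illustration, the sketch below draws a histogram of a simulated variable and overlays a normal density. The seed, sample size, mean and standard deviation are arbitrary assumptions, and rnormal() requires a reasonably recent version of STATA.

```stata
clear
set seed 12345
set obs 1000

* Simulated variable: normal with mean 100 and standard deviation 20
generate normalvar = rnormal(100, 20)

* The normal option overlays a normal density on the histogram
histogram normalvar, normal
```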
[Figure: histogram of normalvar (x-axis) with Density on the y-axis]
References
Everitt, Brian and Rabe-Hesketh, S. (2004),
A Handbook of Statistical Analyses using
STATA
Christenson, Dino and Powell, Scott (2008),
An Introduction to STATA
STATA Tutorials at Princeton