Final for Proseminar
I have provided you with a data set containing 1419
observations on salaries, along with a number of other variables.
Your task is to generate and interpret a multiple linear regression with salary as a function of age (variable age), education (variable educ), gender (variable female), minority status (variable minority), and time on the job (variable jobtime).
Before running that model, you should look at these variables individually and
describe each one: useful statistics include the mean, the minimum, the
maximum, etc. The idea is to describe these data to me, your reader.
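As a starting point, commands along the following lines would produce these descriptive statistics. This is only a sketch: it assumes the data set is already open in Stata, and the choice of commands and options is a suggestion rather than a requirement.

```stata
* Sketch only: assumes the course data set is already loaded in memory.

* Mean, standard deviation, minimum, and maximum for every model variable.
* For the 0/1 variables, the mean is the share of observations coded 1.
summarize salary age educ female minority jobtime

* Percentiles, skewness, and kurtosis for the dependent variable
summarize salary, detail

* Frequency counts for the two categorical variables
tabulate female
tabulate minority
```

Use the numbers these commands report in your own sentences; the output itself does not belong in the paper.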
Because salary is the crucial variable, you might want to look at how the
dependent variable, salary, relates to each of these variables individually.
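One possible way to examine salary against each variable on its own is sketched here. The particular commands are an assumption about what is useful, not a prescribed list, and they again assume the data set is already loaded.

```stata
* Sketch only: one-at-a-time relationships between salary and each predictor.

* Pairwise correlations of salary with the continuous variables
correlate salary age educ jobtime

* Mean salary within each group of the 0/1 variables
tabulate female, summarize(salary)
tabulate minority, summarize(salary)

* Two-sample t test of the difference in mean salary between men and women
ttest salary, by(female)

* A scatterplot is a quick visual check of one relationship
scatter salary educ
```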
You should also look at the
relationships among the independent variables.
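A correlation matrix is one common way to look at how the independent variables relate to one another. The sketch below is a suggestion under the same assumption that the data set is already loaded in Stata.

```stata
* Sketch only: relationships among the independent variables.

* Pairwise correlations with a significance level printed under each coefficient
pwcorr age educ female minority jobtime, sig

* Cross-tabulation of the two 0/1 variables with a chi-squared test
tabulate female minority, chi2
```

Strong correlations between predictors are worth mentioning, because they affect how confidently the individual regression coefficients can be interpreted.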
Finally, generate a regression model
using all of the variables discussed in the introduction and discuss the
output.
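In Stata, the model described in the introduction could be estimated roughly as follows. Treat this as a minimal sketch; the variance inflation factor check at the end is an optional addition that the assignment does not ask for.

```stata
* Sketch only: assumes the course data set is already loaded in memory.

* Ordinary least squares with salary as the dependent variable
regress salary age educ female minority jobtime

* Optional: variance inflation factors, a check for multicollinearity
estat vif
```

The regression table reports a coefficient, standard error, t statistic, p-value, and confidence interval for each variable, along with the R-squared and F statistic for the model as a whole.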
Make sure to discuss each of the coefficients.
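To make the interpretation concrete, the model can be written out as an equation. The notation here (the coefficients \beta_0 through \beta_5, the error term \varepsilon_i, and the subscript i for an individual employee) is introduced only for illustration and does not come from the data set.

```latex
\[
\text{salary}_i = \beta_0
  + \beta_1\,\text{age}_i
  + \beta_2\,\text{educ}_i
  + \beta_3\,\text{female}_i
  + \beta_4\,\text{minority}_i
  + \beta_5\,\text{jobtime}_i
  + \varepsilon_i
\]
```

Each slope is read holding the other variables constant. \beta_2 is the expected difference in salary associated with one additional year of education. \beta_5 is the expected difference associated with one additional month since hire, because jobtime is measured in months. Because female is coded 1=female and 0=male, \beta_3 is the expected difference in salary between a woman and a man who are alike on the other four variables.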
Final words
All of the work that you do above should be presented
in a paper using sentences in paragraphs. Please do indicate the statistics
used but do not include any Stata output directly.
Have fun. Work hard. Do bring questions to class.
variable name | variable label
id            | Employee Code
bdate         | Date of Birth
educ          | Educational Level (years)
jobcat        | Employment Category
salary        | Current Salary
salbegin      | Beginning Salary
jobtime       | Months since Hire
minority      | Minority Classification
female        | Female dummy 1=female 0=male
clerical01    | clerical dummy 1=clerical 0=not clerical
custodial01   | custodial dummy 1=custodial 0=not custodial
manager01     | manager dummy 1=manager 0=not manager
age           | age in years
agesquare     | age squared