*! version 1.0.1 19Dec2011 *! author pjm capture program drop emlcclogit program emlcclogit version 10.1 if replay() { if "`e(cmd)'" != "emlcclogit" { error 301 } } else { Main `0' } Display end program Main syntax varlist(min=2) [if] [in], /* */ GRoup(varname) /* */ [CLuster(varname) /* */ Classes(integer 2) /* */ SEed(integer 101) /* */ maxit(integer 50) /* */ bsreps(integer 0)] gettoken depvar xvars : varlist **Mark sample (via clogit model) marksample touse qui clogit `depvar' `xvars' if `touse', group(`group') qui replace `touse' = e(sample) preserve qui keep if `touse' /* If boostrap performed, calculate bootstrap vce, and post results in r(vce)*/ if `bsreps'>0 { if "`cluster'"=="" { Bootstrap `bsreps' `cluster' `all' } else { Bootstrap `bsreps' `group' `all' } tempname vce matrix `vce'=r(vce) Estimate `0' bslast /* The last estimation in the bootstrap routine returns all estimates in e() */ } /* Otherwise simply run Estimate once */ else { Estimate `0' /* returns all estimates in e() */ } restore end program Bootstrap, rclass gettoken bsreps 0: 0 gettoken cluster all: 0 tempname beta nparm bsbeta one dev dof vce tempfile estdata qui save `estdata' /* Estimate main results */ qui Estimate `all' matrix `beta'=e(beta) scalar `nparm'=e(k) local bcnames: colnames `beta' /* Draw successive bootstrap samples from touse estimation data */ local i=1 while `i'<=`bsreps' { bsample, cluster(`cluster') qui Estimate `all' matrix `bsbeta'=(nullmat(`bsbeta') \ e(beta)) matrix `one'=(nullmat(`one') \ 1) local ++i use `estdata', clear } matrix `dev' = `bsbeta' - (`one'*`beta') scalar `dof' = `bsreps'-1 matrix `vce' = diag(vecdiag((1/`dof')*(`dev''*`dev'))) matrix rownames `vce' =`bcnames' matrix colnames `vce' =`bcnames' return matrix beta `beta' return matrix vce `vce' end program Estimate, eclass syntax varlist(min=2) [if] [in], /* */ GRoup(varname) /* */ [CLuster(varname) /* */ Classes(integer 2) /* */ SEed(integer 101) /* */ maxit(integer 50) /* */ bsreps(integer 0) /* */ bslast] ****SET-UP OPERATIONS gettoken depvar xvars : varlist if "`bslast'"!="" { tempname bsvce matrix `bsvce'=r(vce) } **Create temporary numeric variable for groups (choice situations), and get no. of groups (choice situations) tempvar gid tempname ng egen `gid'=group(`group') qui sum `gid', meanonly scalar `ng'=r(max) **Create temporary numeric variable for clusters (people), and get no. of clusters **Note: if cluster() is not specified, it is set to the group() variable tempvar cid tempname nc if "`cluster'"!="" { egen `cid'=group(`cluster') qui sum `cid', meanonly scalar `nc'=r(max) } else { gen `cid'=`gid' scalar `nc'=`ng' } **Get no. obs per cluster (person) tempvar nobspc egen `nobspc'=count(`depvar'), by(`cid') **Get no. parameters to be estimated local nparm: list sizeof xvars local nparm =(1+`nparm') *`classes' **Get no. observations local N =_N **Amend option macros for classes, and iterations local C=`classes' forvalues c = 1/`C' { local clist `"`clist' Class`c'"' } local I=`maxit' if `seed'!=0 { **Randomly order clusters (to allow testing of starting values by varying seed)* set seed `seed' tempvar rndorder egen `rndorder'=max(100000*uniform()), by(`cid') /*the multiple of 100000 is used simply just to spread the variable a bit. Without this multiple, the group() function did not work properly*/ drop `cid' egen `cid'=group(`rndorder') } /* Else if seed=0, keep same order order of clusters (to allow direct comparison with Train (2008) results */ ****INITIALISE LATENT CLASS ROUTINE **Initialise denominator of weights - a sum to be calculated by looping over classes tempvar sumsK qui gen `sumsK' = 0 **Perform a number of steps in a loop over classes forvalues c = 1/`C' { **Set the initial shares (as equal) tempname s`c' scalar `s`c'' = 1/`C' **Estimate initial logits qui clogit `depvar' `xvars', group(`group'), if (`cid'>`nc'*(`c'-1)/`C') & (`cid'<=`nc'*`c'/`C') **Get the beta estimates tempname b`c' matrix `b`c''=e(b) *Name the beta estimates, in a number of steps: *First remove the depvar: prefix, matrix coleq `b`c'' ="" *then, create a local macro containing the list of xvars prefixed by bc_, for naming the columns of bc foreach x of local xvars { local b`c'xvars `"`b`c'xvars' b`c'_`x'"' } *Finally, name the columns: matrix colnames `b`c'' =`b`c'xvars' **Calculate probability of sequence of choices, conditional on class tempvar L`c' qui predict `L`c'' , pc1 tempvar sumlnL`c' qui egen `sumlnL`c'' =sum(ln(`L`c''*`depvar')) , by(`cid') tempvar K`c' qui gen `K`c'' =exp(`sumlnL`c'') **Calculate unconditional choice probability (denominator of weights) qui replace `sumsK' = `sumsK' + `s`c''*`K`c'' } **Calculate initial weights forvalues c=1/`C' { tempvar h`c' qui gen `h`c'' =`s`c''*`K`c'' / `sumsK' } **Calculate log likelihood tempvar ll qui gen `ll' = ln(`sumsK')/`nobspc' qui sum `ll', meanonly tempname totll_zero scalar `totll_zero'=r(sum) **Initialize the iteration ticker and convergence indicator* local i=0 tempname converged totll_0 totll_1 sall tempvar hscaled scalar `converged'=0 scalar `totll_1'=`totll_zero' scalar `sall'=0 qui gen `hscaled'=0 ****ITERATE LATENT CLASS ROUTINE while `converged'==0 & `i'<`I' { **Display previous iteration's output di as txt "Iteration `i':" _col(16) "log likelihood =" _col(34) as res %15.3f `totll_1' **Reset / drop intermediate variables and results from previous iteration estimates clear scalar `totll_0'=`totll_1' scalar `sall' =0 forvalues c=1/`C' { drop `L`c'' `sumlnL`c'' } qui replace `sumsK' = 0 **Move to next iteration local ++i forvalues c=1/`C' { **Update the shares qui replace `hscaled'=`h`c''/`nobspc' /* Divide by number of obs for each person to account for diff in # obs over people */ qui sum `hscaled', meanonly /* (was hc) */ scalar `s`c'' =r(sum) /* (was r(mean) */ scalar `sall'=`sall'+`s`c'' /* (new) */ **Estimate logits, weighted by hc, and store betas qui clogit `depvar' `xvars' [iweight=`h`c''] , group(`group') matrix `b`c''=e(b) **Calculate probability of sequence of choices, conditional on class qui predict `L`c'' , pc1 qui egen `sumlnL`c'' =sum(ln(`L`c''*`depvar')) , by(`cluster') qui replace `K`c'' =exp(`sumlnL`c'') } forvalues c=1/`C' { **Update the shares scalar `s`c''=`s`c''/`sall' /* (new) */ **Calculate unconditional choice probability (denominator of weights) qui replace `sumsK' = `sumsK' + `s`c''*`K`c'' } forvalues c=1/`C' { **Update weights qui replace `h`c'' =`s`c''*`K`c'' / `sumsK' } **Calculate log likelihood qui replace `ll' = ln(`sumsK')/`nobspc' qui sum `ll', meanonly scalar `totll_1'=r(sum) **Check for convergence scalar `converged' = (`totll_1' - `totll_0') < 0.00005 } **COLLATE AND POST RESULTS ereturn clear tempname b forvalues c=1/`C' { matrix `b'=(nullmat(`b') , (`s`c'') , (`b`c'')) } matrix coleq `b'="" forvalues c=1/`C' { local bnames `"`bnames' Share`c' `b`c'xvars'"' } matrix colnames `b'=`bnames' matrix rownames `b'=Coeff if `bsreps'>0 & "`bslast'"=="" { /* While going through bootstrap reps, only need b and k*/ ereturn matrix beta =`b' ereturn scalar k = `nparm' } else { /* If no bootstrap, or last bootstrap run, collate all results */ tempname s B meanB V forvalues c=1/`C' { matrix `s'=(nullmat(`s'), (`s`c'')) matrix `B'=(nullmat(`B') \ (`b`c'')) } matrix `B'=`B'' matrix `meanB'=`B'*`s'' matrix rownames `s'=Share matrix colnames `s'=`clist' matrix roweq `B'="" matrix colnames `B'=`clist' matrix roweq `meanB'="" matrix colnames `meanB'="" if `bsreps'==0 { /* If no bootstrap, no vce to post. */ ereturn post `b', depname(`depvar') obs(`N') dof(`=`N'-`nparm'') esample(`touse') } if "`bslast'"!="" { /* If last bootstrap run, post vce from r() */ matrix `V'=`bsvce' matrix colnames `V'=`bnames' matrix rownames `V'=`bnames' ereturn post `b' `V', depname(`depvar') obs(`N') dof(`=`N'-`nparm'') esample(`touse') } ereturn scalar ll =`totll_1' ereturn scalar ll_0=`totll_zero' ereturn scalar converged =`converged' ereturn scalar I=`I' ereturn scalar N_clust =`nc' ereturn scalar classes =`C' ereturn scalar k = `nparm' ereturn local cmd "emlcclogit" ereturn local xvars `xvars' ereturn matrix s=`s' ereturn matrix B=`B' ereturn matrix meanB=`meanB' } end program Display syntax if e(converged) == 0 { di _newline(1) as txt "LATENT CLASS MODEL FAILED TO CONVERGE" } di _newline(2) as txt "Latent Class Conditional Logistic Regression" /* */ _col(51) "Classes" _col(67) "=" as res %12.0f e(classes) /* */ _newline(1) _col(51) as txt "No. of obs" _col(67) "=" as res %12.0f e(N) /* */ _newline(1) as txt "Log likelihood =" _col(24) as res %15.3f e(ll) if e(properties) == "b V" { ereturn display } noi matrix list e(s), title("Class Shares") noi matrix list e(B), title("Class Coefficients") noi matrix list e(meanB), title("Mean Coefficients") end