Major Features:
1. Data management: data transformations, match-merge, ODBC, XML, by-group processing, append files, sort, row–column transposition, labeling, saving results, more
2. Basic statistics: summaries, cross-tabulations, correlations, t tests, equality-of-variance tests, tests of proportions, confidence intervals, factor variables, more
3. Linear models: regression; bootstrap, jackknife, and robust Huber/White/sandwich variance estimates; instrumental variables; three-stage least squares; constraints; quantile regression; GLS; more
4. Multilevel mixed-effects models: continuous, binary, and count outcomes; two-, three-, and multiway random-intercepts and random-coefficients models; crossed random effects; ML and REML estimation; BLUPs of effects and fitted values; hierarchical models; residual error structures; more
5. Binary, count, and limited dependent variables: logistic, probit, tobit; Poisson and negative binomial; conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic; multinomial probit; zero-inflated and zero-truncated count models; selection models; marginal effects; more
6. Panel data/longitudinal data: random- and fixed-effects with robust standard errors, linear mixed models, random-effects probit, GEE, random- and fixed-effects Poisson, dynamic panel-data models, and instrumental-variables regression; panel unit-root tests; AR(1) disturbances; more
7. Generalized linear models (GLMs): ten link functions, user-defined links, seven distributions, ML and IRLS estimation, nine variance estimators, seven residuals, more
8. Nonparametric methods: Wilcoxon–Mann–Whitney, Wilcoxon signed-rank, and Kruskal–Wallis tests; Spearman and Kendall correlations; Kolmogorov–Smirnov tests; exact binomial CIs; more
9. Exact statistics: exact logistic and Poisson regression, exact case–control statistics, binomial tests, Fisher’s exact test for r × c tables, more
10. ANOVA/MANOVA: balanced and unbalanced designs; factorial, nested, and mixed designs; repeated measures; marginal means; more
11. Multivariate methods: factor analysis, principal components, discriminant analysis, rotation, multidimensional scaling, Procrustean analysis, correspondence analysis, biplots, dendrograms, user-extensible analyses, more
12. Cluster analysis: hierarchical clustering; kmeans and kmedian nonhierarchical clustering; dendrograms; stopping rules; user-extensible analyses; more
13. Resampling and simulation methods: bootstrapping, jackknife estimation, Monte Carlo simulation, permutation tests, more
14. Model testing and postestimation support: Wald tests; LR tests; linear and nonlinear combinations, tests, and predictions; marginal means, least-squares means, adjusted means, average partial and marginal effects; Hausman tests; more
15. Graphics: line charts, scatterplots, bar charts, pie charts, hi–lo charts, Graph Editor, regression diagnostic graphs, survival plots, nonparametric smoothers, distribution Q–Q plots, more
16. Survey methods: sampling weights, multistage designs; stratification, poststratification; deff; means, proportions, ratios, totals; summary tables; predictive margins; bootstrap, jackknife, and linearization-based variance estimation; regression, instrumental variables, probit, Cox regression; more
17. Survival analysis: Kaplan–Meier and Nelson–Aalen estimators; Cox regression (frailty); parametric models (frailty); competing risks; hazards; time-varying covariates; left and right censoring; Weibull, exponential, and Gompertz analysis; sample size and power analysis; more
18. Tools for epidemiologists: standardization of rates, case–control, cohort, matched case–control, Mantel–Haenszel, pharmacokinetics, ROC analysis, ICD-9-CM, more
19. Time series: ARIMA, ARCH/GARCH, VAR, VECM, multivariate GARCH, dynamic factors, state-space models, high-frequency data, correlograms, periodograms, white-noise tests, unit-root tests, Holt–Winters smoothers, Haver Analytics data, rolling and recursive estimation, more
20. Multiple imputation: five univariate imputation methods, multivariate normal imputation, explore pattern of missingness, manage imputed datasets, estimate model and pool results, transform parameters, joint tests of parameter estimates, more
21. Maximum likelihood: user-specified functions; NR, DFP, BFGS, BHHH; OIM, OPG, robust, bootstrap, and jackknife matrices; Wald tests; survey data; numeric or analytic derivatives; more
22. Other statistical methods: generalized method of moments (GMM), sample size and power, nonlinear regression, stepwise regression, statistical and mathematical functions, more
23. Programming language: adding new commands, command scripting, if, while, command parsing, debugging, menu and dialog-box programming, markup and control language, more
24. Matrix programming—Mata: interactive sessions, large-scale development projects, optimization, matrix inversions, decompositions, eigenvalues and eigenvectors, LAPACK engine, real and complex numbers, string matrices, interface to Stata datasets and matrices, numerical derivatives, object-oriented programming, more
25. Internet capabilities: ability to install new commands, web updating, web file sharing, latest Stata news, more
26. Accessibility: Section 508 compliance, accessibility for persons with disabilities
27. Sample session: A sample session of Stata for Mac, Unix, or Windows.