Sun, 17 May 2020 00:00:00 +0000
http://ehar.se/2020/05/17/random_intercepts/
Model
Consider a linear mixed model like
\[\begin{equation} Y_{ij} = \alpha + u_i + \beta x_{ij} + \epsilon_{ij}, \quad j = 1, \ldots, n_i; \; i = 1, \ldots, K, \tag{1} \end{equation}\]
where \(u_1, \ldots, u_K\) are iid random intercepts (drawn from \(N(0, \tau^2)\)), \(x_{ij}\) are measurements of a continuous covariate, and \(\epsilon_{ij}\) are iid \(N(0, \sigma^2)\). If the \(u\)s and the \(x\)s are independent, we can write the model as
The Swedish age specific sex ratio
Sun, 12 Apr 2020 00:00:00 +0000
http://ehar.se/2020/04/12/sexratio/
I was looking at Swedish vital data from Statistics Sweden as a reaction to certain statistics floating around in the shadow of the ongoing covid-19 pandemic, when I happened to look at the age-specific sex ratio (males to females) in Sweden 2018; see Figure 1.
Figure 1: Sex ratio by age, Sweden 2018.
This is much as expected, except for the small peak at age 19.
Life is short but unlimited?
Old age mortality by social status
Mediation in survival analysis
Mon, 19 Dec 2016 00:00:00 +0000
http://ehar.se/2016/12/19/mediation-in-survival-analysis/
Introduction
The estimation of direct effects and indirect effects (via one or several mediators) of an exposure on survival is a challenging task that has received much attention lately. A good overview is given by Aalen et al. (2012). They emphasize the importance of explicitly including time in any discussion of causality and mediation: A cause must precede an effect. Many relevant references are found there.
Relevant, recent material
Tyler VanderWeele has written a lot of interesting stuff on mediation; look under “Selected Publications” and “Methodological” on his homepage.
Sizeless statistics and ignored nonresponse
Mon, 19 Dec 2016 00:00:00 +0000
http://ehar.se/2016/12/19/sizeless/
Sizeless statistics
We were shown that the log-odds-ratio was 0.101 and the number of significance stars attached to it was three. But we were not told the subject-matter implication of this finding, other than that the “log odds” of migrating was larger in one group than in the other.
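To give the reported number a size, here is a minimal Python sketch that converts the log-odds-ratio of 0.101 into an odds ratio (about 1.11) and into probabilities of migrating. The baseline probabilities are hypothetical and chosen only for illustration; the helper name `shifted_probability` is likewise an assumption, not something from the original analysis.

```python
import math

log_or = 0.101                  # log-odds-ratio reported in the text
odds_ratio = math.exp(log_or)   # the same effect on the odds scale


def shifted_probability(p0, log_odds_ratio):
    """Probability in the comparison group, given baseline probability p0."""
    odds = p0 / (1 - p0) * math.exp(log_odds_ratio)
    return odds / (1 + odds)


# Hypothetical baseline probabilities of migrating (not from the source).
baselines = [0.01, 0.10, 0.50]

print(f"odds ratio: {odds_ratio:.3f}")
for p0 in baselines:
    p1 = shifted_probability(p0, log_or)
    print(f"baseline {p0:.2f} -> {p1:.4f} (difference {p1 - p0:+.4f})")
```

The same log-odds-ratio gives very different absolute differences depending on the baseline. That dependence is the information a bare coefficient with stars does not convey.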
But why are sizeless statistics so common today? This has been a growing problem in applied statistics, especially in the social sciences, for a couple of decades.
Contrasts
Sun, 02 Oct 2016 00:00:00 +0000
http://ehar.se/2016/10/02/contrasts/
1 The problem
1.1 The cause
2 The solution
2.1 A related problem: The Hauck-Donner effect
2.2 Back to the main track: A simulation study
3 Conclusion
References
1 The problem
When I run a logistic regression on infant mortality with a categorical covariate, say season, with four categories (‘spring’, ‘summer’, ‘fall’, ‘winter’), with the following data (400 births),
  deaths births season
1     50    100 spring
2     44    100 summer
3     58    100 fall
4     41    100 winter
I get this output from R, with winter as reference category:
Sufficient statistics and integrity
Fri, 23 Sep 2016 00:00:00 +0000
http://ehar.se/2016/09/23/aggregate/
Introduction
If you suffer from long execution times with huge data sets, and/or integrity problems, this is for you. I show how the theory of sufficient statistics may solve your problem, given that you are willing to organize your data properly. Your integrity problem (if any) is solved on the fly.
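As a language-neutral illustration of the idea, here is a minimal Python sketch under simple assumptions: binary outcomes and one categorical covariate. The records, group names and probabilities are hypothetical. It collapses individual records into (events, total) counts per group and checks that the Bernoulli log-likelihood from the individual rows equals the one computed from the counts alone, which is what sufficiency means in this setting.

```python
import math
from collections import defaultdict

# Hypothetical individual-level records: (group, event), with event in {0, 1}.
records = (
    [("a", 1)] * 3 + [("a", 0)] * 7 +
    [("b", 1)] * 2 + [("b", 0)] * 8
)


def aggregate(rows):
    """Collapse individual rows into (events, total) counts per group."""
    table = defaultdict(lambda: [0, 0])
    for group, event in rows:
        table[group][0] += event
        table[group][1] += 1
    return {g: tuple(counts) for g, counts in table.items()}


def loglik_individual(rows, p):
    """Bernoulli log-likelihood, one term per individual record."""
    return sum(math.log(p[g]) if e else math.log(1 - p[g]) for g, e in rows)


def loglik_aggregated(table, p):
    """The same log-likelihood, computed from the counts only."""
    return sum(
        d * math.log(p[g]) + (n - d) * math.log(1 - p[g])
        for g, (d, n) in table.items()
    )


table = aggregate(records)
p = {"a": 0.25, "b": 0.40}  # arbitrary trial values of the group probabilities

print(table)
print(loglik_individual(records, p), loglik_aggregated(table, p))
```

The aggregated table has one row per group instead of one per individual, so it is both smaller and free of individual-level rows.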
I illustrate everything by example using R and RStudio. However, the principle is universal, and you could, as an exercise, think about how you would implement it in Stata, if you are so inclined.
About
I am a docent in mathematical statistics and professor emeritus at the Centre of Demographic and Ageing Research ("CEDAR") at Umeå University, Umeå, Sweden.
The eha package.
The Second Edition of Event History Analysis with R (work in progress).
Research
Mon, 01 Jan 0001 00:00:00 +0000
http://ehar.se/research/
Edvinsson, S. and Broström, G. (2020). “Is high social class always beneficial for survival? A study of northern Sweden 1801–2013.” (Updated 2020-06-26 13:35)