*Bounty: 150*

I wonder if the Mixed Logit model could be understood, stated, and estimated as a Generalized Linear Mixed Model.

### Mixed Logit

Consider a standard discrete choice setting where individual $n \in \{1,\dots,N\}$ faces $t \in \{1,\dots,T\}$ choice situations, in each of which she chooses one alternative $i$ from the mutually exclusive choice set $\{1,\dots,J\}$. Additionally, we define the binary variable $y_{nit}$, which takes the value $1$ when individual $n$ chooses alternative $i$ in choice situation $t$. Accordingly, the random utility maximization model is specified as

$$ U_{nit} = \alpha_{i} + \boldsymbol{X}_{nit} \boldsymbol{\beta}_{n} + \varepsilon_{nit} = \alpha_{i} + \boldsymbol{X}^{F}_{nit} \boldsymbol{\beta}^{F} + \boldsymbol{X}^{R}_{nit} \boldsymbol{\beta}^{R}_{n} + \varepsilon_{nit} $$

where $U_{nit}$ is the random utility associated with individual $n$ choosing alternative $i$ in choice situation $t$, and $\varepsilon_{nit}$ is an iid extreme value type I preference shock. Moreover, both the alternative attributes and the preference parameters are sorted into two groups.

- On the one hand, $\boldsymbol{\beta}^{F}$ is a vector of fixed preference parameters, and $\boldsymbol{X}^{F}_{nit}$ is the attribute/covariate vector associated with these fixed parameters.
- On the other hand, $\boldsymbol{\beta}^{R}_{n}$ is a vector of **random parameters**, and $\boldsymbol{X}^{R}_{nit}$ is the attribute vector (or regressors) for which the researcher expects the presence of unobserved preference heterogeneity. Commonly, $\boldsymbol{\beta}^{R}_{n}$ is assumed to follow a parametric distribution, for instance a normal distribution, in which case we write $\boldsymbol{\beta}^{R}_{n} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Omega})$ (hence the sub-index $n$ in $\boldsymbol{\beta}^{R}_{n}$: each individual has a different parameter vector, and these parameters are drawn from a Gaussian distribution).

Given this framework, it can be shown (see Train 2009, Section 3.10, *Derivation of Logit Probabilities*) that the **conditional** (conditional on knowing each $\boldsymbol{\beta}^{R}_{n}$) **choice probability** that individual $n$ chooses alternative $i$ in choice situation $t$ is given by:

$$P_{nit}(\boldsymbol{\beta}_{n}) = P(y_{nit}=1 \mid \boldsymbol{\beta}_{n}, \boldsymbol{X}_{nit}) = \dfrac{\exp(\boldsymbol{X}_{nit} \boldsymbol{\beta}_{n})}{\sum_{j=1}^{J}\exp(\boldsymbol{X}_{njt} \boldsymbol{\beta}_{n})}$$
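Numerically, this conditional probability is just a softmax over the systematic utilities. A minimal sketch (all attribute values and coefficients below are hypothetical, and the alternative-specific intercepts are omitted for brevity):

```python
import numpy as np

# Hypothetical example: J = 3 alternatives, K = 2 attributes,
# one individual n in one choice situation t.
X_nt = np.array([[1.0, 0.5],    # attributes of alternative 1
                 [0.2, 1.5],    # attributes of alternative 2
                 [0.8, 0.3]])   # attributes of alternative 3
beta_n = np.array([0.7, -0.4])  # this individual's (known) coefficient vector

# Conditional logit probabilities: softmax of X_{njt} beta_n over j.
v = X_nt @ beta_n                   # systematic utilities, one per alternative
p = np.exp(v) / np.exp(v).sum()     # P_{nit}(beta_n) for i = 1, ..., J

print(p)  # J probabilities that sum to one
```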

Additionally, given that the same individual chooses an alternative from the choice set in each of the $T$ choice situations, we define the probability of the sequence of choices from individual $n$ as:

$$ P_{n}(\boldsymbol{\beta}_{n}) = \prod_{t=1}^{T} \prod_{i=1}^{J} P_{nit}(\boldsymbol{\beta}_{n})^{y_{nit}}$$

Finally, the log-likelihood is defined as

$$ \ln L(\boldsymbol{\varphi}) = \sum_{n=1}^{N} \ln \left[ \int_{\boldsymbol{\beta}_{n}} P_{n}(\boldsymbol{\beta}_{n}) f(\boldsymbol{\beta} \mid \boldsymbol{\varphi}) \, d\boldsymbol{\beta} \right] $$

where $f(\boldsymbol{\beta} \mid \boldsymbol{\varphi})$ is the parametric distribution over the random parameters $\boldsymbol{\beta}^{R}_{n}$, which in this case, given that $\boldsymbol{\beta}^{R}_{n} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Omega})$, means that $\boldsymbol{\varphi} = (\boldsymbol{\mu}, \boldsymbol{\Omega}, \boldsymbol{\beta}^{F})$.

In a frequentist framework, the model can be fitted by **simulated maximum likelihood**, taking draws from the assumed parametric distribution of the random parameters (see Train (2009), Chapter 10).
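The simulated log-likelihood can be sketched as follows: for each individual, draw $\boldsymbol{\beta}^{R}_{n}$ repeatedly from $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Omega})$, evaluate the choice-sequence probability $P_{n}(\boldsymbol{\beta}_{n})$ at each draw, average over draws, and take the log. A minimal sketch (all dimensions and data hypothetical; random coefficients only, with intercepts and fixed coefficients omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and data (for illustration only).
N, T, J, K, R = 5, 3, 3, 2, 200            # individuals, situations, alternatives, attrs, draws
X = rng.normal(size=(N, T, J, K))          # attribute arrays X^R_{nit}
chosen = rng.integers(J, size=(N, T))      # index of the chosen alternative per (n, t)
mu, L = np.zeros(K), np.eye(K)             # candidate mu and Cholesky factor of Omega

def simulated_loglik(mu, L):
    ll = 0.0
    for n in range(N):
        # R draws beta_n^r ~ N(mu, L L')
        draws = mu + rng.normal(size=(R, K)) @ L.T
        p_seq = np.ones(R)                         # P_n(beta^r), built situation by situation
        for t in range(T):
            v = X[n, t] @ draws.T                  # (J, R) systematic utilities per draw
            p = np.exp(v) / np.exp(v).sum(axis=0)  # conditional logit probs per draw
            p_seq *= p[chosen[n, t]]               # multiply in the chosen alternative's prob
        ll += np.log(p_seq.mean())                 # average over draws, then take the log
    return ll

print(simulated_loglik(mu, L))
```

In practice one would maximize this function over $(\boldsymbol{\mu}, \boldsymbol{\Omega})$ with a numerical optimizer and use quasi-random (e.g. Halton) draws rather than plain Monte Carlo.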

### Generalized Linear Mixed Models

In the Generalized Linear Mixed Model we assume that the data consist of outcomes from $m$ clusters, with $n_{i}$ observations in cluster $i$ ($i = 1,\dots,m$). Conditional on the cluster-specific $d \times 1$ vector of random effects $b_{i}$, the outcomes $y_{ij}$ within a cluster are independent and follow a generalized linear model with mean:

$$ \mu_{ij} = E\left[ y_{ij} \mid \beta, b_{i} \right] = g^{-1}\left[\beta^{T}x_{ij} + b_{i}^{T}z_{ij} \right] $$

where $x_{ij}$ and $z_{ij}$ are covariate vectors for the fixed effects $\beta$ and the random effects $b_{i}$ of cluster $i$, and $g(\cdot)$ is the so-called link function. Additionally, $b_{i}$ can be assumed to be normally distributed, that is to say, $b_{i} \sim \mathcal{N}(0, \Sigma)$.
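For concreteness, with a binary outcome and a logit link the conditional mean above is computed as follows (all covariate and parameter values are hypothetical):

```python
import numpy as np

def inv_logit(eta):
    """Inverse of the logit link: g^{-1}(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical values for one observation j in cluster i.
x_ij = np.array([1.0, 0.5, -0.2])   # fixed-effect covariates
z_ij = np.array([1.0, 0.5])         # random-effect covariates
beta = np.array([0.3, -0.1, 0.8])   # fixed effects
b_i  = np.array([0.2, -0.4])        # cluster i's random effects, b_i ~ N(0, Sigma)

# mu_ij = E[y_ij | beta, b_i] = g^{-1}(beta' x_ij + b_i' z_ij)
mu_ij = inv_logit(beta @ x_ij + b_i @ z_ij)
print(mu_ij)  # a probability in (0, 1)
```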

### Mixed logit as a Generalized Linear Mixed Model.

The similarities between the two models are somewhat evident, since both assume a parametric distribution over a subset of the parameters. However, **(1)** what is not very clear to me is how to accommodate the estimation of the *"mean"* of the random parameters, as is done in the **mixed logit** case, where the random effects follow a Gaussian distribution with non-zero mean; and **(2)** since I am not very familiar with Generalized Linear Mixed Models: is it possible to estimate a model where we have $b_{i} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)$?

Finally, if it is indeed possible to write a mixed logit as a Generalized Linear Mixed Model, would it be possible to fit a mixed logit using, for example, the `glmer()` function from the `lme4` R package instead of, say, `mlogit()` from the `mlogit` R package?

Thank you in advance.