The Prediction of the Maternal and Fetal Blood Lead Level via Generalized Linear Model

Zakaria Y. AL-Jammal

Abstract
Generalized linear models (GLMs) are a generalization of linear regression models that allow regression models to be fitted to a response variable that is non-normal and follows a distribution from the general exponential family. The aim of this study is to encourage and initiate the application of GLMs to predict the maternal and fetal blood lead levels. The inverse Gaussian distribution with the inverse quadratic link function is considered. Four main effects were significant in the prediction of the maternal blood lead level (pica, smoking of mother, dairy products intake of mother, and calcium intake of mother), while in the prediction of the fetal blood lead level two main effects were significant (dairy products intake of mother and hemoglobin of mother).


1-Introduction
Generalized linear models (GLMs), as the name implies, are generalizations of the classical linear regression model. The classical linear model assumes that the mean of the response variable y is a linear function of a set of predictor variables (Hardin & Hilbe, 2007), and that the response variable is continuous and normally distributed with constant variance. In many applications, however, the response variable is categorical, consists of counts, or is continuous but non-normal, so the ordinary least squares method cannot be applied to find the

2-Exponential Family of Distributions
An important concept underlying GLMs is the exponential family of distributions. Members of the exponential family all have probability density functions for a response y that can be expressed in the form

$$f(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\}$$

where a(·), b(·), and c(·) are specific functions. The parameter θ is a natural location parameter, and φ is often called a dispersion parameter. The binomial, Poisson, normal, gamma, and inverse Gaussian distributions are members of this family (Myers et al., 2002). Here are some properties of the exponential family:
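As an illustration of the exponential family form and of two standard identities for this family, E(y) = b'(θ) and Var(y) = b''(θ)a(φ), the following minimal sketch writes the Poisson distribution in that form and checks it numerically against the usual Poisson formula. The function names and the value µ = 3.5 are assumptions chosen for illustration only; they do not come from the study.

```python
import math

def exp_family_density(y, theta, phi, a, b, c):
    """Evaluate exp{[y*theta - b(theta)] / a(phi) + c(y, phi)}."""
    return math.exp((y * theta - b(theta)) / a(phi) + c(y, phi))

# Poisson pieces (standard result): theta = log(mu), b(theta) = exp(theta),
# a(phi) = 1 and c(y, phi) = -log(y!).
def poisson_a(phi):
    return 1.0

def poisson_b(theta):
    return math.exp(theta)

def poisson_c(y, phi):
    return -math.lgamma(y + 1)  # lgamma(y + 1) = log(y!)

def poisson_pmf(y, mu):
    """Poisson probability written in its usual form."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def numerical_derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

mu = 3.5                 # illustrative mean, not taken from the study
theta = math.log(mu)     # natural parameter of the Poisson distribution

family_value = exp_family_density(2, theta, 1.0, poisson_a, poisson_b, poisson_c)
direct_value = poisson_pmf(2, mu)

# b'(theta) should reproduce the mean mu.
mean_from_b = numerical_derivative(poisson_b, theta)

print(family_value, direct_value, mean_from_b)
```

The same check could be repeated for any other member of the family by swapping in its a, b, and c functions.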

3-Generalized Linear Models
The theory and use of GLMs were introduced by Nelder and Wedderburn (1972).

This linear combination of explanatory variables is called the linear predictor, $\eta_i$. It is related to the mean response $\mu_i$ through the link function

$$g(\mu_i) = \eta_i \qquad (6)$$

where g is a monotonic differentiable function. The term link is derived from the fact that the function is the link between the mean and the linear predictor (Myers et al., 2002). The expected response is $E(y_i) = \mu_i = g^{-1}(\eta_i)$.

One way of assessing the adequacy of a model is to compare it with a more general model that has the maximum number of parameters that can be estimated. This is called a saturated model; it is a generalized linear model with the same distribution and the same link function as the model of interest. We define a measure of the fit of the model to the data as twice the difference between the log-likelihoods of the saturated model and the model of interest. Since this difference measures the deviation of the model of interest from a perfectly fitting model, it is called the deviance. The deviance, D, is given by

$$D = 2\left[\ell_{\text{saturated}} - \ell_{\text{model}}\right]$$

where $\ell$ denotes the maximized log-likelihood. In fitting a particular model, we seek the values of the parameters that minimize the deviance. A rule of thumb is that the fit is adequate when deviance/(n − p) is less than 1, where n is the number of observations and p is the number of estimated parameters.
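The pieces described in this section (linear predictor, link function, and deviance) can be sketched in a few lines for the inverse Gaussian case with the inverse quadratic link named in the abstract. This is a minimal sketch and not the fitting procedure used in the study: the design matrix, coefficients, and responses below are hypothetical, and the deviance computed is the unscaled inverse Gaussian deviance, which equals twice the log-likelihood difference multiplied by the dispersion parameter.

```python
import math

def inverse_quadratic_link(mu):
    """Link g(mu) = 1 / mu**2, defined for mu > 0."""
    return 1.0 / mu ** 2

def inverse_quadratic_link_inverse(eta):
    """Inverse link mu = eta**(-1/2); requires eta > 0."""
    return 1.0 / math.sqrt(eta)

def linear_predictor(x_row, beta):
    """eta = x_1*beta_1 + ... + x_p*beta_p for one observation."""
    return sum(x * b for x, b in zip(x_row, beta))

def inverse_gaussian_deviance(y, mu):
    """Unscaled inverse Gaussian deviance: sum of (y - mu)^2 / (mu^2 * y)."""
    return sum((yi - mi) ** 2 / (mi ** 2 * yi) for yi, mi in zip(y, mu))

# Hypothetical data: an intercept column plus one predictor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
beta = [0.04, 0.01]        # hypothetical coefficients, not estimates from the paper
y = [5.2, 4.1, 4.3, 3.6]   # hypothetical positive responses

eta = [linear_predictor(row, beta) for row in X]
mu = [inverse_quadratic_link_inverse(e) for e in eta]

deviance = inverse_gaussian_deviance(y, mu)
ratio = deviance / (len(y) - len(beta))   # deviance / (n - p)

print(mu, deviance, ratio)
```

In practice the coefficients would be estimated, for example by iteratively reweighted least squares, rather than fixed in advance as they are here.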

4-Inverse Gaussian Distribution
The inverse Gaussian distribution is a positively skewed continuous distribution with two parameters, µ and σ².
Several alternative parameterizations appear in the literature. In this paper we use the following p.d.f.
The log likelihood function of (10) may be derived as:
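As a sketch only, assuming that equation (10) is the widely used parameterization in terms of µ and σ² (one of the several alternatives mentioned above), the density and the log-likelihood of n independent observations take the form below. The subscripted symbols $y_i$ and $\mu_i$ and the sample size $n$ are notational assumptions.

```latex
% Inverse Gaussian density (assumed parameterization with mean mu and dispersion sigma^2)
f(y;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2 y^3}}
  \exp\left\{ -\frac{(y-\mu)^2}{2\mu^2\sigma^2 y} \right\}, \qquad y > 0

% Log-likelihood for independent observations y_1, ..., y_n with means mu_1, ..., mu_n
\ell(\mu,\sigma^2; y) = \sum_{i=1}^{n} \left[ -\frac{1}{2}\log\left(2\pi\sigma^2 y_i^3\right)
  - \frac{(y_i-\mu_i)^2}{2\mu_i^2\sigma^2 y_i} \right]
```

Under this parameterization the variance is σ²µ³ and the natural parameter is θ = −1/(2µ²), which is why the inverse quadratic function 1/µ² is the canonical link for this distribution.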

5-Application
Great attention has been directed to study maternal and fetal

5-1 Prediction of the Maternal Blood Lead Level
High levels of lead in pregnant women arise from various contributing variables. These explanatory variables include $x$ (dairy products intake of mother) and $x_8$ (calcium intake of mother).
The GLM equation is

$x$ (dairy products intake of mother), $x_3$ (blood pressure of mother), and $x_4$ (hemoglobin of mother).

6-Conclusion
Generalized linear regression models for predicting the maternal blood lead level (MBLL) and the fetal blood lead level (FBLL), with the inverse Gaussian distribution assumed for the response, were considered. From table (1), four explanatory variables (pica, smoking of mother, dairy products intake of mother, and calcium intake of mother) showed significant effects on MBLL, while from table (2), dairy products intake of mother and hemoglobin of mother showed significant main effects on FBLL. The normal probability plots of the residuals for both response variables are shown in figures (2) and (5), and figures (3) and (6) point out that the variance is not constant.