## Using latent class analysis to model prescription medications in the measurement of falling among a community elderly population

Hardigan et al. BMC Medical Informatics and Decision Making 2013, 13:60
http://www.biomedcentral.com/1472-6947/13/60
Patrick C Hardigan1*†, David C Schwartz2† and William D Hardigan3†
Background: Falls among the elderly are a major public health concern. Therefore, a modeling technique that could better estimate fall probability is both timely and needed. Using biomedical, pharmacological and demographic variables as predictors, latent class analysis (LCA) is demonstrated as a tool for the prediction of falls among community-dwelling elderly.

Methods: Using a retrospective data set, a two-step LCA modeling approach was employed. First, we looked for the optimal number of latent classes for the seven medical indicators, along with the patients' prescription medication and three covariates (age, gender, and number of medications). Second, the appropriate latent class structure, with the covariates, was modeled on the distal outcome (fall/no fall). The default estimator was maximum likelihood with robust standard errors. The Pearson chi-square, likelihood ratio chi-square, BIC, Lo-Mendell-Rubin adjusted likelihood ratio test and the bootstrap likelihood ratio test were used for model comparisons.

Results: A review of the model fit indices with covariates shows that a six-class solution was preferred. The predictive probability for latent classes ranged from 84% to 97%. Entropy, a measure of classification accuracy, was good at 90%. Specific prescription medications were found to strongly influence group membership.

Conclusions: The LCA method was effective at finding relevant subgroups within a heterogeneous at-risk population for falling. This study demonstrated that LCA offers researchers a valuable tool to model medical data.

* Correspondence:

†Equal contributors

1Department of Public Health, Nova Southeastern University, 3200 South University Dr., Health Professions Division, Ft. Lauderdale, FL 33328, USA

Full list of author information is available at the end of the article

© 2013 Hardigan et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Latent Class Analysis (LCA) is a statistical method for finding subtypes of related cases (latent classes) from multivariate categorical data. The most common use of LCA is to discover case subtypes (or confirm hypothesized subtypes) based on multivariate categorical data. LCA is well suited to many health applications where one wishes to identify disease subtypes or diagnostic subcategories. LCA models do not rely on traditional modeling assumptions (normal distribution, linear relationship, homogeneity) and are therefore less subject to biases associated with data not conforming to model assumptions. In this paper, we demonstrate the utility of LCA for the prediction of falls among community-dwelling elderly.

Falls among the elderly are a major public health concern. Research on falls and fall-related behavior among the elderly has found that falls are the leading cause of injury deaths among individuals who are over 65 years of age. Research has shown that sixty percent of fall-related deaths occur among individuals who are 75 years of age or older. Demography research estimates that by 2030 the population of individuals who are 65 years of age or older will double, and by 2050 the population of individuals who are 85 years of age or older will quadruple.

Predicting elderly falling can be complex and often involves heterogeneous markers. Therefore, the identification of more homogeneous subgroups of individuals
and the refinement of the measurement criteria are typically inter-related research goals. Appropriate statistical applications, such as latent class analysis, have become available for researchers to model the complex heterogeneity.

Latent class models are used to cluster participants. This type of model is adequate if the sample consists of different subtypes and it is not known beforehand which participant belongs to which of the subtypes. The latent categorical variable is used to model heterogeneity. In the classic form of the latent class model, observed variables within each latent class are assumed to be independent, and no structure for the covariances of observed variables is specified.

LCA is one of the most widely used latent structure models for categorical data. LCA differs from more well-known methods such as K-means clustering, which apply arbitrary distance metrics to group individuals based on their similarity. LCA derives clusters based on conditional independence assumptions applied to multivariate categorical data distributed as binomial or multinomial variables. Using statistical distributions rather than distance metrics to define clusters helps in evaluating whether a model with a particular number of clusters is able to fit the data, since tests can be performed comparing observed ($n_i$) with model-expected ($m_i$) values, using exact methods as recommended. This comparison gives rise to a χ2 test of global model fit, in which significant values indicate lack of fit. Here lack of fit means deviation of (model) predicted frequencies ($m$) from observed frequencies ($n$).

Latent class analysis assumes that each observation is a member of one and only one (unobservable) latent class and that the indicator (manifest) variables are mutually independent of each other within a class. The models are expressed in probabilities of belonging to each latent class. For example, seven manifest variables can be expressed as:

$$\pi_{ijklmno}^{ABCDEFG} = \sum_{t=1}^{T} \pi_{t}^{X}\, \pi_{it}^{A|X}\, \pi_{jt}^{B|X}\, \pi_{kt}^{C|X}\, \pi_{lt}^{D|X}\, \pi_{mt}^{E|X}\, \pi_{nt}^{F|X}\, \pi_{ot}^{G|X}$$

where $\pi_{t}^{X}$ denotes the probability of being in latent class $t$ ($t = 1,2,\ldots,T$) of latent variable $X$; $\pi_{it}^{A|X}$ denotes the conditional probability of obtaining the $i$th response from item A, from members of class $t$, $i = 1,2,\ldots,I$; and $\pi_{jt}^{B|X}$, $\pi_{kt}^{C|X}$, $\pi_{lt}^{D|X}$, $\pi_{mt}^{E|X}$, $\pi_{nt}^{F|X}$, $\pi_{ot}^{G|X}$, with $j = 1,2,\ldots,J$; $k = 1,2,\ldots,K$; $l = 1,2,\ldots,L$; $m = 1,2,\ldots,M$; $n = 1,2,\ldots,N$; $o = 1,2,\ldots,O$, are the corresponding conditional probabilities for items B, C, D, E, F, and G respectively.

We are testing the hypothesis that a two-class distal relationship (fall/no fall) can explain the relationship among the biomedical, pharmacological and demographic variables. Proper analysis of these data requires the understanding of two interdependent outcomes. First, the binary outcome is whether or not the event occurred (fall or no fall), and second, what covariates increase or decrease the likelihood of this occurrence. The four specific aims of the study are to identify items that indicate classes, estimate class probabilities, relate the class probabilities to covariates, and predict a distal outcome (fall/no fall) based on class membership. We model this process through the application of latent class analysis (Figure 1).

Figure 1 Proposed fall model for the latent class analysis. Yi are the observed categorical medical indicators on the latent classes C. Drug Measure is the correspondence analysis derived drug score for each subject. Age is the age of patient. # Rx is the number of prescriptions taken by each subject. Gender is the subject's reported gender. Falling is the distal outcome.

A convenient retrospective database consisting of a random sample of 3,293 elderly patients was used to develop a model to predict the likelihood of falling among individuals aged 65 years or older. Due to the retrospective nature of this study, it was granted an exemption in writing by the Nova Southeastern University's IRB. This is a proof-of-concept analysis, so it should be noted that this data set was not designed for an LCA; therefore, additional medical variables which may predict falling were not available. For this study an elderly person was defined as someone aged 65 years or older. Descriptive data were as follows (Table 1): the average age of patients was 77 years; 32 percent of the subjects had fallen in the last 30 days; falling patients were taking an average of five prescription medications while non-fallers were consuming two; and 75 percent of the subjects were female. Research demonstrates that about 22% of community-dwelling elderly persons fall each year; 10% of these "fallers" have multiple episodes. This research was approved by Nova Southeastern University's Institutional Review Board for human subjects research.
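The latent class model given earlier in this section (a class probability multiplied by class-specific item probabilities, summed over classes) can be illustrated with a minimal sketch. It assumes binary indicators, as with the seven medical conditions, and uses made-up parameter values; the function name `lca_pattern_posterior` and all numbers are hypothetical and are not estimates from the study.

```python
import numpy as np

def lca_pattern_posterior(class_probs, item_probs, responses):
    """Probability of a binary response pattern under a latent class model,
    and the posterior probability of each class given that pattern.

    class_probs: shape (T,), latent class probabilities (sum to 1)
    item_probs:  shape (T, J), P(item j = 1 | class t)
    responses:   shape (J,), observed 0/1 indicators
    """
    class_probs = np.asarray(class_probs, dtype=float)
    item_probs = np.asarray(item_probs, dtype=float)
    responses = np.asarray(responses)

    # Conditional independence: within a class, the pattern probability is
    # the product of the per-item probabilities.
    per_item = np.where(responses == 1, item_probs, 1.0 - item_probs)
    joint = class_probs * per_item.prod(axis=1)   # P(class t and pattern)

    pattern_prob = joint.sum()                    # sum over latent classes
    posterior = joint / pattern_prob              # P(class t | pattern)
    return posterior, pattern_prob

# Hypothetical two-class example with seven binary indicators.
class_probs = [0.3, 0.7]
item_probs = [[0.8] * 7,   # class 0: every condition is likely
              [0.1] * 7]   # class 1: every condition is unlikely
responses = [1, 1, 1, 1, 0, 0, 0]

posterior, pattern_prob = lca_pattern_posterior(class_probs, item_probs, responses)
print(posterior, pattern_prob)
```

Assigning each subject to the class with the largest posterior probability gives a "most likely class membership" classification; the sketch omits covariates and the distal outcome.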
Table 1 Descriptive statistics (table body not recovered; surviving row label: Number of Medications)

The data set was taken from the State of Florida's Elder Affairs Office. All variables were physician diagnosed and recorded in an electronic dataset using appropriate ICD-9 codes. Variables included in the database were:

Arthritis: defined as a person diagnosed with osteoarthritis (OA) and/or rheumatoid arthritis (RA). Presence or absence of arthritis was based on ICD-9 codes 714.0, 715.×-716.×, from both principal and secondary diagnosis fields within a patient record.

High Blood Pressure (HBP): defined as a person diagnosed with hypertension. HBP was identified on the basis of ICD-9 codes 401–405, from both principal and secondary diagnosis fields within a patient record.

Diabetes: defined as a person diagnosed with diabetes mellitus. Diabetes was identified on the basis of ICD-9 codes 250.0×–250.5× and 250.7×–250.9×, from both principal and secondary diagnosis fields within a patient record.

Heart Disease (HD): defined as a person diagnosed with coronary artery disease. HD was identified on the basis of ICD-9 codes 414.0×, from both principal and secondary diagnosis fields within a patient record.

Foot Disorders (FD): defined as a person diagnosed with peripheral neuropathy, foot wounds, peripheral vascular disease, or Charcot arthropathy. FD was identified on the basis of ICD-9 codes 356.9, 892.0-892.2, 443.9, and 713.5, from both principal and secondary diagnosis fields within a patient record.

Parkinson's Disease (PD): defined as a person diagnosed with Parkinson's Disease. PD was identified on the basis of ICD-9 code 332.0, from both principal and secondary diagnosis fields within a patient record.

Stroke: defined as a person diagnosed with occlusion and stenosis of precerebral arteries including basilar artery, carotid artery, and vertebral artery, etc.; occlusion of cerebral arteries including cerebral thrombosis and cerebral embolism; unspecified cerebral artery occlusion; and transient cerebral ischemia. Data were taken from both principal and secondary diagnosis fields within a patient record.

Type of prescription medication: taken from patient records.

Number of prescription medications: taken from patient records.

Demographic variables

Age: taken from patient records.

Gender: self-reported male or female, taken from patients' records.

Falling: defined as "an event which results in the person coming to rest inadvertently on the ground or other lower level, and other than as a consequence of sustaining a violent blow." Falling was taken from both principal and secondary diagnosis fields within a patient record.

A two-step modeling approach was employed. First, it was necessary to reduce the number of different medications (N = 121). Initially, a licensed geriatric pharmacist (PharmD) reviewed the medication list for accuracy and to remove medications that have not been shown to impact the probability of falling. Using correspondence analysis (CA), the medications were converted to continuous scores. CA is an exploratory technique related to principal components analysis which finds a multidimensional representation of the association between the row and column categories of a multi-way contingency table. This technique finds scores for the row and column categories on a small number of dimensions which account for the greatest proportion of the χ2 for association between the row and column categories, just as principal components account for maximum variance. These scores were then used in the latent class analysis. Similar to other data reduction techniques, CA can be used to transform data.

Second, we looked for the optimal number of latent classes for the seven binary indicators: (1) arthritis, (2) high blood pressure, (3) diabetes, (4) heart disease, (5) foot disorders, (6) Parkinson's disease, and (7) stroke; along with the patients' medication "score" and three covariates (age, gender, and number of medications). The appropriate latent class structure, with the covariates, was modeled on the distal outcome (fall/no fall). The default estimator was maximum likelihood with robust standard errors. The Pearson
chi-square, likelihood ratio chi-square, Bayesian information criterion (BIC), Lo-Mendell-Rubin adjusted likelihood ratio test and the bootstrap likelihood ratio test were used for model comparisons.

Table 2 List of medications and correspondence scores (table body not recovered)

Based on patient charts, forty-one different medications were used in the latent class analysis (Table 2). To reduce this to a manageable number, correspondence analysis (CA) was employed and the CA values were saved for use in the latent class analysis. Missing data reduced the number of subjects in the final model to 2,814. The higher the CA score, the more likely the medication will induce a fall. CA scores are given by the following formula:

$$S = D_r^{-1/2}\,\left(P - r c^{\mathsf{T}}\right)\,D_c^{-1/2}$$

where $P$ is the matrix of counts divided by the total, $r$ and $c$ are the row and column sums of $P$, and $D_r$ and $D_c$ are diagonal matrices of the values of $r$ and $c$.

A plot of the values indicates that an elderly person with a score of 0.40 has a 50% chance of falling (Figure 2). For this manuscript CA values are averaged for each person and referred to as the drug falling measure (singular value = 25%, inertia = 6%, chi-square = 164.63). For example, a person may be using three different medications with CA values of -0.30, 0.20, and 1.20, so their drug falling measure is 0.37. A higher drug falling measure is associated with a higher probability of falling (r = .19, p < 0.00).

Figure 2 Proposed fall model for the latent class analysis. This is a plot of the probability of falling by correspondence analysis derived drug score.
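The correspondence analysis step and the per-person averaging described above can be sketched as follows. This is a minimal illustration using the standard singular value decomposition formulation of CA, not the authors' software or settings; the function name `correspondence_scores` and the small medication-by-outcome table are hypothetical. Only the three CA values in the averaging example come from the text.

```python
import numpy as np

def correspondence_scores(counts):
    """Simple correspondence analysis of a two-way contingency table.

    Returns principal row coordinates, singular values, and total inertia.
    Assumes every row and column has a nonzero total.
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()            # correspondence matrix (counts / total)
    r = P.sum(axis=1)          # row masses
    c = P.sum(axis=0)          # column masses

    # Standardized residuals: D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    U, sing, Vt = np.linalg.svd(S, full_matrices=False)

    # Principal row coordinates; the sign of each dimension is arbitrary.
    row_scores = (U * sing) / np.sqrt(r)[:, None]
    total_inertia = float((sing ** 2).sum())   # equals chi-square / n
    return row_scores, sing, total_inertia

# Hypothetical table: rows are medications, columns are (fall, no fall).
counts = np.array([[30, 10],
                   [20, 20],
                   [5, 35]])
row_scores, sing, total_inertia = correspondence_scores(counts)

# Drug falling measure: the mean CA value over a person's medications,
# using the three values from the worked example in the text.
ca_values = [-0.30, 0.20, 1.20]
drug_falling_measure = float(np.mean(ca_values))
print(round(drug_falling_measure, 2))  # 0.37
```

Total inertia multiplied by the table total reproduces the Pearson chi-square statistic, which is why CA dimensions are described as accounting for a share of the chi-square for association.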

Table 3 Basic latent class structure. Columns: four-class, five-class, six-class, and seven-class solutions; rows include number of parameters (values not recovered).

Figure 3 Overlay plot of latent classes by medical condition (x-axis: Medical Condition). Arthritis = Arthritis. HBP = High Blood Pressure. DB = Diabetes. HD = Heart Disease. FD = Foot Disorders. PD = Parkinson's Disease. Stroke = Stroke.

Latent class analysis

For the latent class analysis, a review of the model fit indices shows that a six-class solution was preferred (Table 3). The six-class solution provided a lower Bayesian Information Criterion (BIC; lower is better), much smaller chi-square values, and, as indicated by the Lo-Mendell-Rubin likelihood ratio test (LMR) and the bootstrap likelihood ratio test (BLRT), non-significant p-values. Age, number of medications, and gender were shown to have a significant impact on falling. The probability of falling is greater for females, for older patients, and for those taking more prescription drugs. Table 3 provides a comparison of fit indices for the four-class, five-class, six-class and seven-class solutions. The six-class structure, with covariates, is interpreted as follows:

Class one is most likely to be affected by all medical conditions (Figure 3). The average age of this class is 77.78 ± 7.01, the average number of medications is 4.7, and the average drug falling measure is 0.016. Latent class one is defined as the Poorest-Health Group I. Seventeen percent of the sample is classified into latent class one (Table ). The classification accuracy is 95%; the misclassified elderly were all placed into class four (Table ). Subjects in class one have a 47% chance of falling. The odds ratio indicates that a person in class one is 4.41 times more likely to fall than a person in class six: Healthy Group II (Tables and ).

Class two is also affected by all measured medical conditions (Figure 3). The average age of this class is 76.89 ± 7.02, the average number of medications is 7.5, and the drug falling measure is 0.017. This is defined as the Poorest-Health Group II. Twenty-eight percent of the sample is placed into latent class two (Table ). The classification accuracy is 89% (Table ); misclassified elderly were placed into class three. Subjects in class two have a 46% chance of falling. The odds ratio indicates that a person in class two is about 4.67 times more likely to fall than a person in class six: Healthy Group II (Tables and ).

Class three is generally unaffected by all medical markers (Figure 3). The average age of this class is 78.83 ± 6.63, the average number of medications is 7.8, and the drug falling measure is 0.006. We define this as the Healthy Group I. Seventeen percent of the sample is classified into class three (Table ). The classification accuracy for latent class three is 84% (Table ). Misclassified elderly were placed into class two, indicating some overlap between the two latent classes. Subjects in class three have a 16% chance of falling. There is no significant difference in the likelihood of falling between class three and class six: Healthy Group II (Tables and ).

Class four is primarily affected by arthritis; therefore, this is defined as the arthritis group (Figure 3). Twenty percent of the sample fell into latent class four (Table ). The average age of this class is 78.69 ± 7.32, the average number of medications is 2.6, and the drug falling measure is -0.003. The classification accuracy is 96% (Table ).
Misclassified elderly were placed into class one. Subjects in class four have a 26% chance of falling. The odds ratio indicates that a person in class four is approximately 2.07 times more likely to fall than a person in class six: Healthy Group II (Tables and ).

Class five is primarily affected by high blood pressure, diabetes, heart disease and foot disorders (Figure 3). This group is defined as the diabetes-heart disease group. Eight percent of the sample fell into latent class five (Table ). The average age of this class is 77.53 ± 7.04, the average number of medications is 3.1, and the drug falling measure is -0.009. The classification accuracy is 95% (Table ). Misclassified elderly were placed into either class one (Poorest-Health Group I) or class six (Healthy Group II). Subjects in class five have a 29% chance of falling. The odds ratio indicates that a person in class five is 2.24 times more likely to fall than a person in class six: Healthy Group II (Tables and ).

Class six is least affected by the medical conditions and is defined as Healthy Group II (Figure 3). Ten percent of the sample fell into latent class six (Table ). The average age of this class is 78.87 ± 7.48, the average number of medications is 4.3, and the drug falling measure is -0.012. The classification accuracy is 97% (Table ). Subjects in class six have a 15% chance of falling. Misclassified elderly were placed into class five: the diabetes-heart disease group (Tables and ).

Table 4 Most likely latent class membership (table body not recovered)

Table 5 Most likely latent class membership (table body not recovered)

Table 6 Odds ratios (table body not recovered). Note: Base or comparison group is class six, the "healthy group II" group.

This paper demonstrated the utility of LCA in the measurement of falling among community-dwelling elderly. The basic idea underlying LCA is that variables differ across previously unrecognized subgroups. These subgroups form the categories of a categorical latent variable. Given the potential for confounding among the study variables, latent class analysis holds great promise.

The six-class solution was statistically sound and provided a relatively straightforward, interpretable number of classes. The interpretation of an LCA relies on both the statistical indices and the practical interpretation of the classes. In our example, the statistical indices strongly point toward a six-class model. The classification accuracy for the model was very good. Furthermore, we were able to define each latent class, which provides researchers and practitioners with practical implications of the analysis.

Medication usage helped differentiate the latent classes. Subjects in latent class one have higher probabilities of possessing the seven medical conditions than subjects in latent class two; yet subjects in latent class two possess similar rates of falling. This may be explained by the number of medications that class two is taking (7.5 vs. 4.7). Similarly, latent classes three and six are both defined as the healthy groups. Differentiating the two groups is the number of medications taken by subjects in latent class three vs. latent class six (7.8 vs. 4.3).

It is also true that the type of medications subjects are taking is impacting their probability of falling. This can be demonstrated for latent class one. Holding age and
number of medications at their means, females with a drug falling measure of 1.50 [i.e., Thioridazine & Amoxapine] have an 80% greater chance of falling than the same subjects with a drug falling measure of -0.50 [i.e., Imipramine & Methadone] (p < 0.05). We stress that the latent classes are composite variables, so one should not look at medications in isolation. As one would expect, the two latent classes with the highest probability of falling also possess the highest drug falling measure and the worst medical conditions.

As was demonstrated in past research, correspondence analysis is a useful tool for researchers examining prescription medication data. Combining LCA with CA provides researchers a powerful tool for data reduction analysis. We demonstrated that this approach was effective for finding relevant subgroups within a heterogeneous at-risk population for falling. Nevertheless, the results may not be relevant to other countries, with different lifestyles and different socio-economic status.

LCA and CA possess limitations which make their application to this type of modeling dependent on replication studies. The specific limitations include (1) classes not known prior to analysis, and (2) class characteristics not known until after analysis. Both of these problems are related to LCA being an exploratory procedure for understanding data. Furthermore, the items were not designed for an LCA approach. A latent class study designed a priori may offer better solutions. We would also suggest that additional medical items be used which have been demonstrated to impact falling among elderly community dwellers, such as eye disease and pain.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
PH, DS and WH participated in the design, coordination, project planning and data collection. PH performed the statistical analysis and drafted the manuscript. All authors read and approved the final manuscript.

Author details
1Department of Public Health, Nova Southeastern University, 3200 South University Dr., Health Professions Division, Ft. Lauderdale, FL 33328, USA. 2The ElderCare Companies, Inc, 2517 State Rt. 35, Bldg. J Ste. 203, Manasquan, NJ 08736, USA. 3College of Pharmacy, Nova Southeastern University, 3200 South University Dr., Health Professions Division, Ft. Lauderdale, FL 33328, USA.

Received: 8 August 2012 Accepted: 13 May 2013 Published: 25 May 2013

References

Uebersax JS: LCA Frequently asked questions (FAQ). [

Hagenaars JA, McCutcheon AL: Applied latent class analysis. New York: Cambridge University Press; 2002.

Lazarsfeld PF, Henry NW: Latent structure analysis. Boston: Houghton; 1968.

McCutcheon AC: Latent class analysis. Beverly Hills: Sage Publications Inc.; 1987.

Falls Prevention Act of 2003, Pub. L. No. 109-395, § 1217 Stat. 1217; 2004.

Cummings RG: Epidemiology of medication-related falls and fractures in the elderly. Drugs Aging 1998, 12:43–53.

Kannus SR, Sherrington C, Menz H: Falls in older people: risk factors and strategies for prevention. Cambridge: Cambridge University Press; 2001.

Weir E, Culmer L: Fall prevention in the elderly population. Can Med Assoc J 2004, 171:724.

Tinetti ME: Preventing falls in elderly persons. N Engl J Med 2003,

Kannus P, Sievnen H, Palvanen M, et al: Prevention of falls and consequent injuries in elderly people. Lancet 2005, 366:1885–1893.

Kannus P, Palvanen M, Niemi S: Time trends in severe head injuries among elderly Finns. J Am Med Assoc 2001, 286:673–674.

Croudace TJ, Jarvelin MR, Wadsworth ME, Jones PB: Developmental typology of trajectories to nighttime bladder control: Epidemiologic application of longitudinal latent class analysis. Am J Epidemiol 2003,

Everitt BS: The analysis of contingency tables. London: Chapman & Hall; 1992.

Everitt BS, Hand DJ: Finite mixture distributions. London: Chapman & Hall; 1991.

Magidson J, Vermunt JK: Latent class models for clustering: a comparison with K-means. Canadian Journal of Marketing Research 2002, 20:37–44.

Ploubidis GB, Abbott RA, Huppert FA, Kuh D, Wadsworth EJ, Croudace TJ: Improvements in social functioning reported by a birth cohort in mid-adult life: A person-centred analysis of GHQ-28 social dysfunction items using latent class analysis. Personal Individ Differ 2007, 42:305–316.

In CCC, Arminger G, Clogg CC, Sobel ME: Handbook of statistical modeling for the social and behavioral sciences. New York: Plenum; 1995.

Read TR, Cressie N: Goodness-of-fit statistics for discrete multivariate data. New York: Springer; 1988.

Langeheine R, Pannekoek J, van de Pol F: Bootstrapping goodness-of-fit measures in categorical data analysis. Sociol Methods Res 1996,

Magidson J, Vermunt J: Latent class models. [

Scott V, Donaldson M, Gallagher E: A review of the literature on best practices in falls prevention of long-term care facilities. Long Term Care Falls Review 2003. September.

Ezekiel J, Mordecai L, Fox K: Methods of Correlation and Regression Analysis. 3rd edition. New York: Wiley and Sons; 1959.

Friendly M: Categorical Data Analysis with Graphics: Part 5 Correspondence

Lewis-Beck MS, Bryman A, Liao TF: The Sage encyclopedia of social science research methods. New York: Sage Publications Inc.; 2004.

Inciardi JF, Stijnen T, McMahon K: Using correspondence analysis in pharmacy practice. Am J Health Syst Pharm 2002, 59:968–972.

doi:10.1186/1472-6947-13-60
Cite this article as: Hardigan et al.: Using latent class analysis to model prescription medications in the measurement of falling among a community elderly population. BMC Medical Informatics and Decision Making 2013 13:60.

Source: http://www.eldercarecompanies.biz/app/download/7236955947/Hardigan+Schwartz+Hardigan+Polypharmacy.pdf
