Reliability and Validity of the Tilburg Frailty Indicator in 5 European Countries.

OBJECTIVES
To assess the internal consistency, convergent and divergent validity, and concurrent validity of the Tilburg Frailty Indicator (TFI) within community-dwelling older people in Spain, Greece, Croatia, the Netherlands, and the United Kingdom.


DESIGN
Cross-sectional study.


SETTING
Primary care and community settings.


PARTICIPANTS
In total, 2250 community-dwelling older people (60.3% women; mean age = 79.7 years; standard deviation = 5.7).


METHODS
We assessed the reliability and validity of the full TFI as well as its physical, psychological, and social domains. Baseline data of the Urban Health Centers Europe project were used. The internal consistency was assessed with the Cronbach alpha. The convergent and divergent validity were assessed using Pearson correlation coefficients between the domains and alternative measures: the 12-item short-form, Groningen activity restriction scale, 5-item mental well-being scale of the 36-Item Short Form Survey, and the De Jong Gierveld loneliness scale. The concurrent validity was assessed by the area under the receiver operating characteristic curve with physically frail (Survey of Health, Ageing and Retirement in Europe-Frailty Instrument), loss of independence (Groningen activity restriction scale), limited function (Global Activity Limitation Index), poor mental health (5-item mental well-being scale of the 36-Item Short Form Survey), and feeling lonely (De Jong Gierveld loneliness scale) as criteria.


RESULTS
The internal consistency of the full TFI was satisfactory, with the Cronbach alpha ≥0.70 in the total population and in each country. The internal consistency of the psychological and social domains was not satisfactory. The convergent and divergent validity of the physical, psychological, and social domains was supported by all the alternative measures in the total population and in each country. The concurrent validity of the full TFI and the physical, psychological, and social domains was supported, with most areas under the receiver operating characteristic curve ≥0.70 in the total population and in each country.


CONCLUSIONS AND IMPLICATIONS
The TFI is a reliable and valid instrument to assess frailty in community-dwelling older people in Spain, Greece, Croatia, the Netherlands, and the United Kingdom.

With the population rapidly aging worldwide and the increasing prevalence of chronic multimorbidity, frailty is increasingly recognized as a complex and important public health issue. 1,2 People with frailty have a higher risk of various negative outcomes such as falls, 3 disability, 4 long-term care, 5 hospitalization, 4 and mortality. 6 To improve the management of frailty and deliver more patient-centered care, providing supportive care to people with frailty ideally starts with the identification of their severity level of frailty. 7 Although many assessment tools to measure the severity level of frailty have been developed in the past decades, 7,8 there is no global standard assessment measure for frailty. 8 Hence, it is important to have robust data and studies on the psychometric properties including reliability and validity of existing instruments in order to be able to compare and select the most appropriate and relevant health measurement tools.
Furthermore, researchers, healthcare professionals, and policymakers increasingly acknowledge the multidimensional nature of frailty. 1,5,9 However, most frailty assessment measures only cover the physical domain 4,10,11 and not the psychological and social domains. 9 The Tilburg Frailty Indicator (TFI) is a short self-reported questionnaire, originally developed for identifying frail community-dwelling older people in the Netherlands in 2010. 5,12 It considers frailty from a bio-psycho-social framework, which includes 15 items addressing 3 domains: the physical, psychological, and social domains. 12 Pialoux et al 13 found that the TFI is one of the best 3 measures for screening frailty in primary healthcare settings. The psychometric properties of the TFI have been extensively examined, especially in Dutch populations. 9,12,14 However, the validity of the single domains of the TFI, especially the psychological and social domains, has not yet been extensively examined. 15-19 In addition, research on the properties of the TFI among different populations is still lacking. 5 For example, the TFI has not yet been validated in Greece, Croatia, or the United Kingdom (UK). Conducting the validation study in these countries contributes to the current literature with important evidence on psychometric properties of the TFI. Furthermore, reporting the results of the total population of the 5 European countries contributes to the generalizability of the results to other local contexts.
This study aims to assess the reliability and validity of the full TFI and its 3 domains in a population of community-dwelling older people from 5 European countries, including Spain, Greece, Croatia, the Netherlands, and the UK. In addition, the reliability and validity will be assessed for each country separately.
We examined the following aspects: (1) the internal consistency (reliability) of the full TFI and the 3 domains; (2) the convergent and divergent validity (construct validity) of the 3 domains; and (3) the concurrent validity (criterion validity) of the full TFI and the 3 domains.

Study Population and Data Collection
The Urban Health Centers Europe (UHCE) project aimed to promote the healthy aging of older people by implementing a coordinated preventive care approach. 20,21 The study design has been described in detail elsewhere. 20,21 Citizens aged 70 years or older who lived independently and were expected to be able to participate in the project for at least 6 months were eligible. Participants were recruited in primary care and community settings in 5 European countries between May 2015 and June 2017. Data were collected with a self-reported questionnaire in the local language at baseline and at 12-month follow-up. Ethical committee procedures were followed in all countries, and approval was provided. 20,21 Written informed consent was obtained from all participants. The study was registered as ISRCTN52788952.
In the current study, we adopted a cross-sectional design and used baseline data of the UHCE project (2325 participants from 5 European countries). 20 Participants with missing data on 1 or more items of the TFI (n = 75) were excluded. Thus, our analyses included 2250 participants.

Measures
Frailty
The TFI contains 15 items addressing the physical, psychological, and social domains. 12,15,22 The physical domain is assessed with 8 items regarding physical health, unexplained weight loss, difficulties in walking, balance, hand strength, physical tiredness, eyesight, and hearing impairments. The psychological domain is assessed with 4 items regarding problems with memory, feeling down, feeling nervous or anxious, and inability to cope with problems. The social domain is assessed with 3 items regarding living alone, lack of social relationships, and lack of social support. Eleven items have 2 response categories: Yes and No; and 4 items have 3 response categories: Yes, Sometimes, and No. 5 All items were dichotomized after recoding and scored with 0 or 1 point. 5,19 The score range of the full TFI is 0 to 15, that of the physical domain 0 to 8, psychological domain 0 to 4, and social domain 0 to 3. 5 A detailed description of the recoding is provided in Appendix, Supplementary Table 1.
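As an illustration of the scoring just described, the following minimal sketch sums dichotomized items into domain and total scores. It assumes the items have already been recoded to 0 or 1 (the recoding rules themselves are in Supplementary Table 1 and are not reproduced here). The function name `score_tfi`, the dictionary layout, and the example responses are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence

# Number of items per domain, as described in the text:
# 8 physical, 4 psychological, 3 social (15 in total).
DOMAIN_SIZES = {"physical": 8, "psychological": 4, "social": 3}


def score_tfi(items: Dict[str, Sequence[int]]) -> Dict[str, int]:
    """Sum already-dichotomized (0/1) item scores per domain and overall.

    `items` maps each domain name to its list of recoded item scores.
    This layout is an assumption made for the example.
    """
    scores: Dict[str, int] = {}
    for domain, n_items in DOMAIN_SIZES.items():
        values = list(items[domain])
        if len(values) != n_items:
            raise ValueError(
                f"{domain} domain needs {n_items} items, got {len(values)}"
            )
        if any(v not in (0, 1) for v in values):
            raise ValueError(f"{domain} items must be coded 0 or 1")
        scores[domain] = sum(values)
    # Full TFI score: sum of the three domain scores (range 0 to 15).
    scores["total"] = sum(scores[d] for d in DOMAIN_SIZES)
    return scores


# Made-up responses for one hypothetical participant.
example = score_tfi({
    "physical": [1, 0, 1, 1, 0, 0, 0, 1],
    "psychological": [0, 1, 0, 0],
    "social": [1, 0, 0],
})
print(example)
```

For this hypothetical participant the sketch gives a physical score of 4, a psychological score of 1, a social score of 1, and a full TFI score of 6.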
Previously validated versions of the TFI were available in Spanish, 19 Dutch, 12 and English. 12 Because no validated translation of the TFI was available in Greek and Croatian, all items of the TFI were translated forward and backward. 20,21 Forward- and back-translations were discussed by the study team, and the translation was adapted when needed. Each language version of the TFI was piloted in at least 5 older people in the respective countries. Misinterpretations of questions were identified, and minor changes were made. 20 The translations of the TFI in the 5 languages are provided in Appendix, Supplementary Table 2.

Other measures
Health-related quality of life was measured with the 12-item short-form (SF-12), which contains 12 questions covering 8 health domains. The 8 domains are summarized in the Physical Component Summary (PCS) and Mental Component Summary (MCS), both ranging from 0 (lowest) to 100 (highest level of health). 23 Activity restriction was measured with the Groningen Activity Restriction Scale (GARS), which contains 18 items on independence of activities of daily living (GARS-ADL; 11 items) and instrumental ADL (GARS-IADL; 7 items). 24 The GARS score ranges from 18 (highest) to 72 (lowest level of independence) and the GARS-ADL score from 11 (highest) to 44 (lowest level of independence). Participants with a GARS score ≥29 were categorized as experiencing a loss of independence. 24 Mental well-being was measured with the full 5-item mental well-being scale of the 36-Item Short Form Survey (MHI-5), which measures nervousness, downheartedness and feeling sad, jollity, calmness, and happiness (score range: 0-100). 25,26 Participants with an MHI-5 score ≤52 were categorized as showing signs of poor mental health. 25 Loneliness was measured with the short 6-item version of the De Jong Gierveld loneliness scale (short-JG), which contains 2 domains: emotional (3 items) and social loneliness (3 items). 27 The overall loneliness score ranges from 0 to 6 and the domain scores from 0 to 3, with higher scores indicating a higher experience of loneliness. Participants with a short-JG score ≥2 were categorized as feeling lonely.
Physical frailty was additionally assessed with the Survey of Health, Ageing and Retirement in Europe-Frailty Instrument, which contains 5 items: exhaustion, weight loss, slowness, physical activity, and hand-grip strength. 28,29 An estimation of a discrete factor model based on the 5 items determined whether participants were physically frail. 28 Activity limitation was measured with the 1-item Global Activity Limitation Index (GALI). Participants who indicated their function to be moderately or severely limited were categorized as having a limited function. 30,31

Sociodemographic factors
Age (in years), sex, level of education, and living situation (living alone/not living alone) were assessed. The level of education concerned the highest level of education the participant completed and was categorized according to the 2011 International Standard Classification of Education (ISCED) 32 into primary or less (ISCED 0-1), secondary or equivalent (ISCED 2-5), and tertiary or higher (ISCED 6-8).

Statistical Analyses
Scale scores were described by conventional descriptive statistics. 33 We applied the framework used by Gobbens et al, 7 who originally developed the TFI, for the evaluation of the internal consistency and specific aspects of the validity of the TFI. The internal consistency was assessed with the Cronbach alpha; a value between 0.7 and 0.9 was considered satisfactory. 34
To examine the convergent and divergent validity, we hypothesized that the SF-12 PCS, GARS, and GARS-ADL relate strongly to the physical domain of the TFI and less to the other 2 domains. We hypothesized that the SF-12 MCS and MHI-5 relate strongly to the psychological domain of the TFI and less to the other 2. We also hypothesized that the short-JG relates strongly to the social domain of the TFI and less to the other 2. The convergent and divergent validity were assessed using Pearson correlation coefficients. 12 A statistically significant correlation between a domain score and the score of an alternative measure of the same domain was considered to indicate satisfactory convergent validity, with a higher correlation indicating better validity. 12,15,22 Divergent validity was assumed if each alternative measure had a higher correlation with the corresponding domain of the TFI than with each of the other domains of the TFI. 12,15,22
To examine the concurrent validity, we used the following alternative measures as the criterion: (1) Survey of Health, Aging, and Retirement in Europe-Frailty Instrument, (2) GARS and (3) GALI (physical domain), (4) MHI-5 (psychological domain), and (5) short-JG (social domain). The concurrent validity was assessed using receiver operating characteristic (ROC) curve analysis. 12,22 Accuracy was measured by the area under the ROC curve (AUC). An AUC between 0.7 and 0.8 was considered acceptable, between 0.8 and 0.9 excellent, and an AUC of more than 0.9 was considered outstanding. 35 The Youden index (sensitivity + specificity - 1) was adopted as the criterion for selecting the optimum cut-off point(s). 36
All analyses were conducted among the total population as well as by country. All analyses were performed with SPSS v 23.0 (IBM SPSS Statistics for Windows, IBM Corp, Armonk, NY). The level of significance was set at P < .05.

Results
Table 1 presents the general characteristics of the total population and by country. The mean age of the total population was 79.7 (standard deviation = 5.7) years, and 60.3% were women. Participants from Spain and Greece were younger, had less often completed secondary education, and less often lived alone than participants from other countries (P < .001). Participants from Croatia had higher physical and social domain scores than participants from other countries, and participants from Greece had higher psychological domain scores (P < .001).
Table 2 presents the score distributions of the TFI. A floor effect (>25% of the respondents had the lowest possible score 37) was observed in the physical (the Netherlands), psychological (the total population, Spain, the Netherlands, and the UK), and social (the total population and each country except Croatia) domains.

Convergent and Divergent Validity
Table 3 presents the convergent and divergent validity of the TFI domains. In the total population and in each country, the physical domain correlated significantly with the SF-12 PCS, GARS, and GARS-ADL. These correlations were higher than those between the psychological or social domain and the SF-12 PCS, GARS, and GARS-ADL, respectively.
In the total population and in each country, the psychological domain correlated significantly with the SF-12 MCS and MHI-5. These correlations were higher than those between the physical or social domain and the SF-12 MCS and MHI-5, respectively.
In the total population and in each country, the social domain correlated significantly with the short-JG. These correlations were higher than those between the physical or psychological domain and the short-JG.

Concurrent Validity
Table 4 presents the concurrent validity of the TFI and its 3 domains.
In the total population and in each country, the AUCs of the full TFI and the physical domain using physically frail or loss of independence as the criterion were excellent, and those using limited function as the criterion were acceptable to excellent.
In the total population and in most of the countries, the AUCs of the full TFI and the psychological domain using poor mental health as the criterion were excellent. In Greece, the AUCs of the full TFI and the psychological domain were acceptable.
In the total population and in most of the countries, the AUCs of the full TFI and the social domain using feeling lonely as the criterion were acceptable. In Croatia, the AUC of the social domain was not acceptable.
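The ROC analysis described under Statistical Analyses can be sketched as follows. This is an illustration, not the SPSS procedure used in the study: the function names, the convention that a score at or above the cut-off counts as a positive screen, and the toy scores and criterion labels are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve as the probability that a criterion-positive
    case scores higher than a criterion-negative case (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoffs(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, List[float]]:
    """Return the maximum Youden index (sensitivity + specificity - 1)
    and every cut-off that attains it.

    Assumed convention: score >= cut-off is a positive screen.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_j, best_cuts = -1.0, []
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j + 1e-12:
            best_j, best_cuts = j, [cut]
        elif abs(j - best_j) <= 1e-12:
            best_cuts.append(cut)
    return best_j, best_cuts


# Made-up data: instrument scores and a binary criterion (1 = criterion met).
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]

print(auc(toy_scores, toy_labels))
print(youden_cutoffs(toy_scores, toy_labels))
```

On these made-up data the AUC is 0.8125, which falls in the range labeled excellent under Statistical Analyses, and three cut-offs (3, 5, and 7) tie at a Youden index of 0.5. Returning all tied cut-offs instead of picking one is a design choice of this sketch.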

Discussion
In the present study, within a diverse community-based sample of older people in Spain, Greece, Croatia, the Netherlands, and the UK, we found satisfactory internal consistency of the full TFI and the physical domain in the total population and in each country. However, the internal consistency of the psychological and social domains was not satisfactory. Our results further support the convergent and divergent validity of the 3 domains in the total population and in each country. The concurrent validity of the full TFI and the 3 domains was supported in the total population and in each country, except for the social domain in Croatia.
Regarding the full TFI, the reliability was satisfactory, with the Cronbach alpha ≥0.70 in the total population and in each country. Previous studies in the Netherlands, 12 Portugal, 16 Poland, 18 Brazil, 15 and China 22 found similar results. The concurrent validity was acceptable, with most AUCs ≥0.70 in the total population and in each country. This finding was similar to previous studies on the full TFI in the Netherlands, 12 Italy, 38 and China. 22
Regarding the physical domain, the internal consistency was satisfactory in the total population and in Croatia and the Netherlands, which was consistent with previous studies. 12,15,16,18,22 The Cronbach alpha of the physical domain in Spain, Greece, and the UK varied between 0.60 and 0.67. Earlier studies in Germany, 17 Italy, 38 and Spain 19 reported similar results and concluded that the internal consistency was acceptable with the Cronbach alpha ≥0.60. The convergent and divergent validity was supported in the total population and in each country, which was consistent with previous studies. 12,17,22,38 The concurrent validity was acceptable in the total population and in each country, which was consistent with previous studies on the physical domain in the Netherlands, 12 Italy, 38 and China. 22
Regarding the psychological and social domains, the internal consistency was satisfactory in none of the countries, with the Cronbach alpha varying between 0.22 and 0.55. Previous studies reported similar findings. 12,15,16,18,22 The low internal consistency for the psychological and social domains might be caused by their small number of items. 12,15 The Cronbach alpha increases with the number of items. Therefore, adding items to the psychological and social domains would be beneficial, for instance items referring to feelings of insecurity and the number of social contacts. 5 In addition, the low Cronbach alpha values do not imply that the items of the psychological and (especially) social domains are invalid, but rather that they function more as an index than as a scale. The convergent and divergent validity was supported in the total population and in each country. The concurrent validity of the psychological domain was acceptable in the total population and in each country, and that of the social domain was acceptable in all countries except Croatia. We recommend further studies on the social domain in Croatia, for instance, on cultural adaptation of the items in the social domain. A previous study in China also reported an acceptable concurrent validity of the psychological and social domains. 22 However, the reliability and validity of the psychological and social domains have otherwise received little attention in previous research.
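The point above that the Cronbach alpha rises with the number of items can be illustrated with a short sketch. It computes alpha from item-level data with the usual formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score), and uses the standardized form, alpha = k·r / (1 + (k - 1)·r), to show the effect of the number of items k at a fixed mean inter-item correlation r. The function names, the toy responses, and the value r = 0.2 are assumptions chosen for illustration; they are not estimates from the study data.

```python
from statistics import pvariance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha for a respondents-by-items table of scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least 2 items")
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = pvariance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Made-up 0/1 responses of 4 respondents to a 3-item domain.
toy_rows = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
print(round(cronbach_alpha(toy_rows), 3))

# Same assumed mean inter-item correlation, different numbers of items
# (3 items as in the social domain, 8 as in the physical domain).
for k in (3, 4, 8):
    print(k, round(standardized_alpha(k, 0.2), 3))
```

With the assumed mean inter-item correlation of 0.2, the standardized alpha is about 0.43 for 3 items and about 0.67 for 8 items, so a short domain can show a low alpha even when its items are no less related to each other than those of a longer domain.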
To the best of our knowledge, this is the first study to report the reliability and validity of the TFI for multiple European countries simultaneously and the first in Greece, Croatia, and the UK. We investigated the validity of the full TFI and its 3 domains. However, some limitations of our study should be highlighted. First, we did not assess the consistency of the TFI over time (test-retest reliability). However, frailty is not assumed to be stable over time, and a low test-retest correlation over the follow-up period (12 months) may be expected. Therefore, we believe that assessing the consistency of the TFI across items (internal consistency) is adequate for the current study. Second, we did not assess the sociocultural and language differences in the interpretation of individual items between countries. Consequently, we may have observed some unintended variation between countries. Still, we paid specific attention to translating the items of the TFI into the languages for which no validated translation was available (Greek and Croatian). Further studies on the cultural adaptation of the items are needed to confirm our findings. Third, most of the alternative measures chosen to examine convergent and divergent validity and concurrent validity have been widely applied by previous studies. However, there is no gold standard for choosing alternative measures of the TFI, and the number of alternative measures for the psychological and social domains was limited by the data availability of the UHCE project. Further studies with more alternative measures are still needed. Finally, the application of the TFI in clinical practice still needs further study because of the absence of general population norms or reference scores, 9 and further research on the use of the TFI in other settings, such as the hospital setting, is still required.

Conclusions and Implications
In summary, our study supported the reliability and validity of the full TFI and physical domain. The TFI may be applied as an instrument to assess frailty in community-dwelling older people for large-scale population studies on frailty in the 5 European countries. However, our conclusions are drawn from statistical methods, and we cannot prove whether the use of the TFI will lead to clinically meaningful outcomes. The reliability and validity of the psychological and social domains have not been studied extensively before and more investigations in different countries are needed in the future.