Mortality of Care Home Residents and Community-Dwelling Controls During the COVID-19 Pandemic in 2020: Matched Cohort Study

Objective This study aimed to estimate and compare mortality of care home residents, and matched community-dwelling controls, during the COVID-19 pandemic from primary care electronic health records in England. Design Matched cohort study. Setting and Participants Family practices in England in the Clinical Practice Research Datalink Aurum database. There were 83,627 care home residents in 2020, with 26,923 deaths; 80,730 (97%) were matched on age, sex, and family practice with 300,445 community-dwelling adults. Methods All-cause mortality was evaluated, and rate ratios estimated by negative binomial regression were adjusted for age, sex, number of long-term conditions, frailty category, region, calendar month or week, and clustering by family practice. Results Underlying mortality of care home residents was higher than community controls (adjusted rate ratio 5.59, 95% confidence interval 5.23‒5.99, P < .001). During April 2020, there was a net increase in mortality of care home residents over that of controls. The mortality rate of care home residents was 27.2 deaths per 1000 patients per week, compared with 2.31 per 1000 for controls. Excess deaths for care home residents, above that predicted from pre-pandemic years, peaked between April 13 and 19 (men, 27.7, 95% confidence interval 25.1‒30.3; women, 17.4, 15.9‒18.8 per 1000 per week). Compared with care home residents, long-term conditions and frailty were differentially associated with greater mortality in community-dwelling controls. Conclusions and Implications Individual-patient data from primary care electronic health records may be used to estimate mortality in care home residents. Mortality is substantially higher than for community-dwelling comparators and showed a disproportionate increase in the first wave of the COVID-19 pandemic. Care home residents require particular protection during periods of high infectious disease transmission.

The COVID-19 pandemic had major impacts during 2020. 1 The first wave of infections peaked during April 2020 in the United Kingdom (UK), with more than 1000 deaths per day within 28 days of a positive COVID-19 test. In the second wave, with more widespread testing, the number of people in the UK with a positive COVID-19 test result peaked at 81,525 on December 29, 2020. 1 Early studies identified deprivation, 2 household overcrowding, 3 older age, male sex, obesity, comorbidity, and ethnic minority status as being important risk factors for severe disease and mortality. 4 Residents of care homes, which in the UK include residential homes providing support with personal care, and nursing homes providing support with personal care and assistance from qualified nurses, were severely affected by the pandemic. Contributing factors included the discharge of hospital patients to care homes with risks of disease transmission, 5 limited availability of COVID-19 testing, 6 limited supply of personal protective equipment, 7 and delayed development of guidance to ensure protection of the care home population. 8 Data from the Office for National Statistics showed that weekly counts of deaths of care home residents in England and Wales increased from 2799 in the last week of February to 8476 and 9015 in the last 2 weeks of April 2020. 9 Analysis of data reported to the Care Quality Commission in England suggested that excess deaths represented about 6.5% of care home beds. 10 Care home residents typically have multiple risk markers of vulnerability for severe COVID-19, but susceptibility to infection may also have been increased because the care home environment had potential to facilitate transmission of COVID-19 and outbreaks were frequent. However, rigorous epidemiologic analysis has been limited.
An editorial observed that "the COVID-19 pandemic has placed a spotlight on how little is known about this sector, and the lack of easily accessible, aggregated data on the UK care home population." 11 To address this gap, we aimed to explore whether primary care electronic health records could be used to evaluate care home mortality during the pandemic. 12 We aimed to use primary care electronic health records to estimate all-cause mortality of care home residents in comparison with matched community-dwelling controls in England during 2020.

Data Source and Participant Selection
The study drew on data from the Clinical Practice Research Datalink (CPRD) Aurum database, a database of longitudinal primary care electronic health records in England, 13 including a total of 1473 general practices in England with approximately 14.8 million registered patients at January 1, 2020. The protocol for the study was approved by the CPRD Independent Scientific Advisory Committee (protocol number 20_000214).
This study used data from the March 2021 release of CPRD Aurum, including all 215,110 patients registered in CPRD Aurum general practices in England between January 1, 2015 and December 31, 2020 who were recorded as being resident in a care home. The most frequently recorded index care home codes were "lives in a nursing home [or] care home" (Supplementary Table 1). There were 28,531 (13%) patients with index codes of "patient died in a nursing home [or] care home." For these patients, we assumed that they were resident in the care home for 90 days before death. The median length of stay is 2 years for care home residents, and 1 year for nursing home residents, 14 but we assumed that patients with first codes for "died in nursing/care home" would have lower than average lengths of stay. In sensitivity analyses, we found that varying this assumed duration between 14 and 365 days had negligible influence on estimates. For each patient, the start date was the latest of the patient's start of registration or the first care home code. The end of the patient's record was the earliest of the end of patient registration, the death date recorded by CPRD, and the last data collection date for the practice. There were 7584 care home residents and 16,861 controls whose records were censored by end of registration rather than by death or last data collection date. We included patients aged 18-104 years.
For 83,627 care home residents contributing person-time during 2020, a matched comparison cohort of community-dwelling adults was sampled from the list of all patients registered in the CPRD Aurum March 2021 release after excluding care home residents. Control patients were matched for general practice, sex, and year of birth, and had a start date that was no later than 18 months after the start date for matched cases. Up to 4 community-dwelling control participants were randomly sampled with replacement 15 for each care home resident. Care home residents were omitted from this analysis where there were no eligible matched controls.
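The matching step described above can be illustrated with a minimal sketch. It is not the authors' code: the column names (`patient_id`, `practice_id`, `sex`, `birth_year`, `start_date`) and the function `sample_matched_controls` are hypothetical, and the sketch assumes that "with replacement" means a control may be drawn for more than one care home resident while the controls chosen for any single resident are distinct.

```python
import numpy as np
import pandas as pd

def sample_matched_controls(cases, pool, max_controls=4, seed=0):
    """Sample up to `max_controls` community controls per care home resident.

    Matching keys: general practice, sex, year of birth. A control may be
    reused across different cases (sampling with replacement across cases);
    within one case the sampled controls are distinct (an assumption).
    """
    rng = np.random.default_rng(seed)
    keys = ["practice_id", "sex", "birth_year"]
    pool_groups = {key: group for key, group in pool.groupby(keys)}
    rows = []
    for case in cases.itertuples(index=False):
        candidates = pool_groups.get((case.practice_id, case.sex, case.birth_year))
        if candidates is None:
            continue  # no eligible matched controls: the case is omitted
        # Control start date must be no later than 18 months after the case start date.
        latest_start = case.start_date + pd.DateOffset(months=18)
        eligible = candidates[candidates["start_date"] <= latest_start]
        if eligible.empty:
            continue
        n = min(max_controls, len(eligible))
        picked = rng.choice(eligible["patient_id"].to_numpy(), size=n, replace=False)
        rows.extend({"case_id": case.patient_id, "control_id": c} for c in picked)
    return pd.DataFrame(rows, columns=["case_id", "control_id"])

# Toy data (not study data): case 3 has no matching pool and is dropped.
cases = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "practice_id": [10, 10, 11],
    "sex": ["F", "F", "M"],
    "birth_year": [1930, 1930, 1940],
    "start_date": pd.to_datetime(["2019-01-01", "2019-06-01", "2019-03-01"]),
})
pool = pd.DataFrame({
    "patient_id": [101, 102, 103, 104, 105, 106],
    "practice_id": [10] * 6,
    "sex": ["F"] * 6,
    "birth_year": [1930] * 6,
    "start_date": pd.to_datetime(["2015-01-01"] * 5 + ["2022-01-01"]),
})
matches = sample_matched_controls(cases, pool)
```

In this toy example the two residents of practice 10 draw from the same five eligible controls, so the same control can appear for both of them, which is what sampling with replacement permits.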

Main Measures
The primary measure of interest was mortality from any cause based on the CPRD death date. Covariates were age, sex, region in England, multiple morbidity, and frailty category. Age in 2020 was divided into the age-groups of 18-64, 65-74, 75-84, 85-94, and 95-104 years. Multiple morbidity was represented by a count of 20 conditions, ever recorded in each patient's record up to the end of 2020, from the list of atrial fibrillation, cancer, chronic kidney disease, chronic obstructive pulmonary disease, dementia, depression, diabetes mellitus, epilepsy, frailty fractures, heart failure, hemorrhagic stroke, hypertension, ischemic heart disease, ischemic stroke, other mental health diagnoses, peripheral arterial disease, palliative care, rheumatoid arthritis, or transient ischemic attack. Frailty was evaluated from coded records of deficits noted in CPRD Observation files according to the e-Frailty index, as described by Clegg et al. 16 The e-Frailty index is informed by the cumulative deficit model of frailty and includes 36 deficits across physical, mental, cognitive, and social functioning. Coded records of frailty index scores and frailty index categories were also analyzed to inform frailty classification, with the highest recorded value being employed.
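As an illustration of how the age-group and morbidity-count covariates might be derived, the following sketch uses pandas. The column names, the function `add_covariate_categories`, and the three example condition flags are hypothetical. The grouping of counts into "none recorded", 1 to 8, and "9 or more" follows the categories named in the Statistical Analyses section.

```python
import pandas as pd

AGE_BINS = [18, 65, 75, 85, 95, 105]  # left-closed bins: [18, 65), [65, 75), ...
AGE_LABELS = ["18-64", "65-74", "75-84", "85-94", "95-104"]

def label_count(n):
    """Map a capped condition count to a category label."""
    if n == 0:
        return "none recorded"
    return "9 or more" if n >= 9 else str(n)

def add_covariate_categories(patients, condition_columns):
    """Add age-group and morbidity-count categories to a patient table."""
    out = patients.copy()
    out["age_group"] = pd.cut(
        out["age_2020"], bins=AGE_BINS, labels=AGE_LABELS, right=False
    )
    # One 0/1 flag per condition, "ever recorded"; the row sum is the count.
    counts = out[condition_columns].astype(bool).sum(axis=1)
    out["morbidity_count"] = counts
    out["morbidity_cat"] = counts.map(label_count)
    return out

# Toy data (not study data) with three of the listed conditions.
demo = pd.DataFrame({
    "age_2020": [64, 65, 104],
    "dementia": [0, 1, 1],
    "hypertension": [0, 1, 1],
    "diabetes_mellitus": [0, 0, 1],
})
demo = add_covariate_categories(
    demo, ["dementia", "hypertension", "diabetes_mellitus"]
)
```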

Statistical Analyses
We initially analyzed eligible care home patient records between January 1, 2015 and December 31, 2020. We divided records into calendar months, calculating the number of deaths, and person time at risk for each month. We fitted a negative binomial regression model using data up to the end of 2019 as the training dataset, with counts of observed deaths as dependent variable and age-group, sex, region, multiple morbidity, frailty category, calendar month, and calendar year as predictors. Month was fitted as a categorical variable, while year was fitted as a continuous predictor. Multiple morbidity was fitted with categories from 1 to 9 or more morbidities, with a separate category for "none recorded." Frailty category was fitted as a categorical variable. The categories of "nonfrail," "mild frailty," "moderate frailty," and "severe frailty" were employed for analysis. Robust standard errors were employed to allow for general practice clustering. The general practice effect was allowed to differ between care home residents and controls, by representing the care home residents and community controls of each practice as separate clusters, because the former were clustered within care homes. We estimated predicted deaths by month for pre-pandemic and pandemic periods (2015-2020), comparing predicted and observed deaths graphically. To evaluate mortality in 2020 in more detail, we analyzed data for care home residents and community controls, evaluating counts of deaths and persons at risk by calendar week. We fitted a negative binomial model, with robust standard errors, now including interaction terms that allowed the associations of long-term conditions and frailty with mortality to differ between care home residents and community controls. To summarize the results, we fitted models separately for the periods of January to March, April, and May to December. 
However, we also present a difference-in-difference analysis that estimated the main effect of group (care home residence), time (January to March, April, and May to December) and the group by time interaction.
Analyses were performed using the "statsmodels" 17 package in Python 3.8.3 (Python Software Foundation). The "matplotlib" 18 package in Python and the "ggplot2" 19 package in the R program (R Foundation for Statistical Computing) were used for data visualization.

Results
We analyzed data for 215,110 patients who were registered at general practices in England and were recorded as resident in a care home, who contributed follow-up between January 1, 2015 and December 31, 2020. There were 137,024 (64%) women; 97,192 (45%) were age 85-94 years and 24,685 (11%) were age 95 years or older; 180,390 (84%) had 2 or more morbidities. Figure 1 shows the distribution of observed deaths (red line) by month from 2015 to 2020, compared with predicted values estimated from 2015 to 2019 data (blue line). It was clear that there was a substantial excess of observed over predicted deaths in early 2020, with a peak in April 2020.
Analyses were then restricted to 83,627 care home residents who were registered during 2020, of whom 80,730 (97%) were matched with 300,445 community-dwelling controls. Characteristics of the sample are shown in Supplementary Table 2. Care home residents and community controls were similar with respect to matching variables of sex and age-group, but care home residents generally showed higher counts of long-term conditions and greater levels of frailty.
There was a peak in observed deaths between April 6, 2020 and April 26, 2020 (Figure 2, upper panel). Mortality rates were higher in men than women and increased in successive age-groups. The highest age-specific
Figure 3 shows mortality rates per 1000 patients per week for each week of 2020 for care home residents (red) and community controls (blue). Data are presented separately by number of long-term conditions (upper panel) and frailty category (lower panel). The peak in observed deaths between April 6 and April 26, 2020 was evident in both care home residents and community controls. Mortality of care home residents was always higher than for community controls. Mortality also increased with number of long-term conditions and frailty category. However, the effect of increasing long-term condition count or frailty category was greater for community controls than for care home residents.
Table 1 shows data aggregated for the periods January to March, April, and May to December 2020. Mortality of care home residents was higher in the April period than the other periods; this increase was evident at each level of frailty, with absolute risks of mortality increasing with frailty level. Mortality of community controls was also higher in April compared with the other periods; the increase was proportionately smaller than for care home residents but, in absolute terms, the increase was greatest for patients with the most advanced level of frailty. Comparing care home residents and community controls, the adjusted relative mortality rate decreased with increasing level of frailty, reflecting the higher mortality of frail community controls. However, relative risks were higher in the April period than in other periods of 2020.
Table 2 presents the unadjusted and adjusted estimates from a difference-in-difference analysis. After adjustment for covariates, the rate ratio for care home residence overall was 5.59 (5.23-5.99).
Mortality for controls showed a 66% (55%-78%) relative increase during April 2020 compared with January to March 2020. After allowing for the underlying difference between care home residents and controls, and the increase shown by controls in April 2020, care home residents showed a further 76% (60%-93%) relative increase in mortality during April 2020.
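In a log-link model, group, period, and interaction effects add on the log scale and therefore multiply on the rate scale. The sketch below shows how the three reported point estimates would combine under that structure. It assumes that 5.59 is the group main effect for the reference period (January to March), uses rounded point estimates, and ignores confidence intervals, so the results are rough illustrations rather than study estimates. The period-specific models summarized in Table 1 may give different values.

```python
import math

rr_group = 5.59        # care home vs. controls, reference period (reported)
rr_april = 1.66        # April vs. January-March among controls (66% increase)
rr_interaction = 1.76  # further April increase in care home residents (76%)

# On the log scale the three terms add; on the rate scale they multiply.
log_rr = math.log(rr_group) + math.log(rr_april) + math.log(rr_interaction)

# Care home residents in April vs. controls in January-March.
rr_care_april_vs_control_q1 = math.exp(log_rr)

# Care home residents vs. controls, both in April (the period effect cancels).
rr_care_vs_control_in_april = rr_group * rr_interaction

# Care home residents in April vs. care home residents in January-March.
rr_care_april_vs_care_q1 = rr_april * rr_interaction

print(round(rr_care_april_vs_control_q1, 2))  # 16.33
print(round(rr_care_vs_control_in_april, 2))  # 9.84
print(round(rr_care_april_vs_care_q1, 2))     # 2.92
```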
Supplementary Figure 1 presents a forest plot of the adjusted mortality rate ratios, comparing care home residents with community controls. At each level of morbidity or frailty, the relative risk of mortality for care home residents was higher during the COVID-related peak of mortality in April 2020 compared with the mostly prepandemic period of January to March or later-pandemic period of May to December.

Main Findings
This analysis shows that primary care electronic health records have potential to provide timely and relevant information concerning the care home population. There was evidence of a substantial underlying mortality difference between care home residents and community-dwelling controls that were matched for age, sex, and general practice. This difference persisted after further adjustment for frailty category, number of long-term conditions, and region. We caution that, because of residual confounding from unmeasured and incompletely measured confounders, this analysis cannot determine to what extent the underlying mortality difference between care home residents and community controls is determined by the health status of residents, or the shared environment of the care home. Analyses quantified the first wave of COVID-19 mortality in April 2020 and showed that mortality peaked between April 6 and 26, 2020, being strongly associated with advanced age, male sex, multiple morbidity, and frailty category. Compared with community-dwelling control patients, mortality for care home residents was 4 to 5 times higher before the onset of the pandemic. Care home residents were disproportionately affected: during April 2020, after allowing for differences in case-mix, mortality of care home residents was more than 10 times higher than for community-dwelling patients overall. Mortality remained high during the remainder of 2020 while the pandemic continued. The level of frailty and number of long-term conditions were found to be effect modifiers, being more strongly associated with mortality of community-dwelling patients than those living in care homes.

Strengths and Limitations
We drew on a well-described database, 13 and the quality of data offered by electronic health records has been shown to be generally high. 20 However, we acknowledge that there could be misclassification of care home status and it is possible that care home residence might be under-recorded. Community controls were matched on a small number of well recorded variables including age, sex, and general practice. Community controls were exactly matched with care home residents on year of birth, to allow for the important confounder of age; results were summarized over age groups. Controls might have been more closely matched for health status, but this might lead to problems of bias from over-matching. We compared unadjusted and covariate-adjusted estimates, as well as stratifying analyses by health status. We also acknowledge that limited testing for COVID-19, and recording of COVID-19 diagnoses, might have underestimated the burden of illness during the early stages of the pandemic. We addressed this by comparing the mortality of care home residents in 2020 with the mortality experienced in the preceding 5-year period (2015-2019). We also evaluated mortality for each week from January 1, 2020 onward.
For control participants, the Office for National Statistics Coronavirus (COVID-19) Infection Survey showed that at the height of the first wave of infection from April 27 to May 10, 2020, an average of 0.27% (95% confidence interval: 0.17%-0.41%) of the general population had COVID-19. 21 We did not have data concerning whether control participants were receiving social or nursing care support in their own homes, which might have been associated with frailty status. We included a count of important long-term conditions as well as analyzing frailty category. In the cumulative deficit model, frailty and multiple morbidity are closely related concepts, 22 but more accurate phenotypic characterization of patients' frailty status over time would have added to the study. 23 Deprivation is associated with reduced healthy life expectancy, which could lead to care home admission. Patients were matched for general practice, so it was not possible to adjust for deprivation at the general practice-level. We did not employ individual postcode-level deprivation scores as these might have presented difficulties if the care home postcode did not reflect deprivation exposures over the life-course. COVID-19 mortality is associated with deprivation, as well as age, but the effect of deprivation diminishes with age. 24 Ethnic minorities make up about 3% of the English population age 80 years and over 25 and, while ethnic minorities may be under-represented in care homes, mortality of minorities from COVID-19 was generally higher than in the white population. 2 Future studies should aim to include ethnicity and socioeconomic measures.
Control sampling was with replacement and duplicated controls were included to reduce bias. 15 Matching for family practice ensured that care home cases and community controls were resident in similar local areas and exposed to similar community prevalence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection.

Comparison with Other Studies
Previous studies of care home mortality during the COVID-19 pandemic have mainly drawn on data from care home records. 10,26 Morciano et al 10 analyzed data for numbers of deaths reported to the Care Quality Commission and estimated that over the first 7 months of 2020, deaths accounted for 6.5% of care home beds. The estimates from our analyses are not directly comparable because we estimated the mortality rate per 1000 residents per week. Dutey-Magni et al 26 analyzed data collected by care homes for incidence of COVID-19 and mortality. Their findings, like our study, suggested that deaths were frequent among residents who were probably infected with SARS-CoV-2 but were not tested. Burton et al 27 found that outbreaks of COVID-19 were frequent within care homes and most deaths occurred in the context of outbreaks. 10,27 In the United States, mortality in care homes was consistently associated with facility size, community incidence of COVID-19, and poverty. 28 We did not have data to identify individuals at the same care homes and the possible clustering of deaths at care homes could not be investigated in our data. Hollinghurst et al 29 analyzed linked primary care and administrative records for the population of Wales and found that care homes showed increased mortality during the first wave of the pandemic. Other studies confirm that background mortality is very high in care home residents. Vossius et al 30 found that annual mortality of nursing home residents was 31.8%. Shah et al, 31 analyzing The Health Improvement Network primary care database for 2009, found that the age and sex standardized mortality ratio for nursing home residents was 419 and for residential home residents was 284, consistent with the elevated relative rates observed in the present analyses.

Conclusions and Implications
This study shows that individual-patient data from primary care electronic health records may be used to estimate mortality in care home residents in comparison with community-dwelling comparators. Mortality of care home residents is substantially higher than for community-dwelling comparators and showed a disproportionate increase in the first wave of the COVID-19 pandemic. Care home residents require particular protection during periods of high infectious disease transmission.