Nurse Staffing Tools: Evaluating Methodologies for Optimal Hospital Staffing

1. Introduction

Extensive research has consistently demonstrated the critical link between registered nurse staffing levels in hospitals and patient outcomes. Higher staffing is associated with improved care quality, including reduced in-hospital mortality, shorter hospital stays, and fewer instances of missed essential care [Brennan et al., 2013; Griffiths et al., 2016, 2018b; Kane et al., 2007; Shekelle, 2013]. However, these findings often fall short of specifying the precise number of staff required for optimal care. Determining the “right” staffing level, both in terms of overall employment and shift-by-shift deployment, is crucial for ensuring both quality and efficiency in healthcare delivery [Saville et al., 2019]. This article examines the evidence underpinning various approaches to measuring nursing workload and the tools designed to calculate necessary nurse staffing levels within general acute-care hospital wards.

1.1. The Impact of Nurse Staffing Levels on Patient Outcomes

Insufficient nurse staffing leads to omissions in essential nursing care [Griffiths et al., 2018b], a significant factor contributing to negative patient outcomes [Recio-Saucedo et al., 2018]. Recent studies, building upon substantial cross-sectional evidence, have reinforced these associations at the individual patient level, rather than just at hospital or unit levels [Griffiths et al., 2018a, 2019; Needleman et al., 2011b]. These studies incorporate direct observation of care delivery [Bridges et al., 2019] and demonstrate that care omissions mediate the relationship between staffing levels and patient outcomes [Ball et al., 2018; Bruyneel et al., 2015; Griffiths et al., 2018a]. While observational studies cannot definitively prove causation, the evidence strongly suggests that low nurse staffing directly contributes to patient harm. Denying this link seems increasingly untenable.

Mandatory minimum staffing policies, such as those implemented in California, USA [Donaldson and Shapiro, 2010; Mark et al., 2013; Royal College of Nursing, 2012], reflect a response to this evidence. However, even with mandated minimums, patient needs exceeding these thresholds must be addressed, necessitating staffing adjustments. The fundamental question of how to accurately determine the required nurse staffing level remains a challenge.

1.2. Exploring Staffing Tools and Methodologies

The quest to determine appropriate nurse staffing levels and measure workload has been ongoing since the early days of nursing research [Lewinski-Corwin, 1922]. Numerous reviews over the years have examined methods for setting nurse staffing requirements, consistently highlighting significant gaps in the evidence. The issue isn’t a lack of research volume; a 1973 review of nurse staffing methodologies already cited over 1000 studies [Aydelotte, 1973]. Yet, this review concluded that all methodologies were weak, finding no evidence on the relative costs or effectiveness of different staffing methods and limited evidence for validity or reliability.

Subsequent reviews have grappled with an expanding body of research and a growing number of staffing systems. A 1982 UK review for the Department of Health and Social Services identified over 400 different systems for determining staffing requirements [DHSS Operational Research Service, 1982]. Despite this wealth of material, evidence to evaluate these systems remained elusive. In 1994, Edwardson and Giovannetti noted the absence of scientific evidence for widely used systems like GRASP or Medicus in North America [Edwardson and Giovannetti, 1994]. They also pointed out that while different systems often showed correlated results, they could still produce significantly different estimates of required nurse staffing levels for the same patient or unit [Edwardson and Giovannetti, 1994].

Fasoli and Haddock, reviewing 63 sources, echoed these concerns, finding insufficient evidence for the validity of many current systems for measuring nursing workload and staffing needs, concluding that systems lacked the accuracy for effective resource allocation or decision-making [Fasoli et al., 2011; Fasoli and Haddock, 2010]. Other reviews reinforce this consistently negative view of the evidence [Arthur and James, 1994; Butler et al., 2011; Hurst, 2002; Twigg and Duffield, 2009]. These reviews consistently found descriptive reports of locally developed approaches, but no evidence of tool implementation impacting patient, staff, or care quality outcomes [Griffiths et al., 2016].

Despite the lack of strong evidence, the issue remains critical. The Francis Inquiry into the Mid Staffordshire General Hospital failings in the UK, identifying low staffing as a significant contributor to “appalling care,” recommended developing nurse staffing guidance including:

“…evidence-based tools for establishing what each service is likely to require as a minimum in terms of staff numbers and skill mix.” (p. 1678) [Francis, 2013]

This paper aims to provide an overview of approaches to measuring nurse staffing requirements in general acute hospital wards. It draws on existing reviews and presents a comprehensive analysis of recent primary research to assess how the evidence landscape has evolved in recent years.

2. Review Methods and Scope

2.1. Search Strategy and Review Approach

Given the extensive literature and persistent unanswered questions, summarizing this field is a significant undertaking. This review is systematic in its explicit approach to literature identification and selection. However, its primary aim is to map the literature, identify recent trends, key characteristics, and areas of strength and weakness, rather than providing in-depth critical appraisal of each study. Therefore, it is classified as a scoping review, designed to summarize findings and pinpoint knowledge gaps [Arksey and O’Malley, 2005].

This review selectively uses older authoritative sources and reviews to establish a general background, incorporating results from comprehensive searches and reviews undertaken for the National Institute for Health and Care Excellence (NICE) [Griffiths et al., 2014] as a central resource.

To identify recent studies, searches were conducted in Medline, CINAHL (keyword only), and The Cochrane Library using terms like “Workload”[keyword, MESH] or “Patient Classification”[keyword] AND “Personnel Staffing and Scheduling” AND “Nurs*”[keyword] or “Nursing”[MESH]. Results were limited using OVID Medline sensitive limits for reviews, therapy, clinical prediction guides, costs, or economics. The search strategy’s sensitivity was verified using results from a prior comprehensive search [Griffiths et al., 2014] as a test set. Additional searches included citations to existing reviews, works by review authors, and focused searches on key author databases and the World Wide Web for widely used tools. Searches were completed by mid-December 2018. The review focused on new reviews published after 2014 and primary studies from 2008 onwards, as the most recent review in the prior review of reviews was published in 2010 [Fasoli and Haddock, 2010]. After removing duplicates, 392 recent sources were considered.

2.2. Selection of Primary Research

A broad inclusion approach was adopted for this scoping review. Included were primary studies describing the development, reliability, or validity testing of systems/tools for measuring nursing workload/predicting staffing needs; studies comparing workload assessments from different measures; studies using a tool in a way that offered insights into tool validity or other aspects of nurse staffing determination; and studies reporting costs and/or consequences of tool use, including patient outcome impacts. Descriptive papers with data were also included. Only studies directly relevant to staffing in general acute adult inpatient units were included, excluding those focused solely on intensive or maternity care. However, methodologically significant advances or insights from other areas could be considered for illustrative purposes.

3. Results

3.1. Overview of Approaches to Determining Nurse Staffing Levels

The literature describes numerous methods for determining nurse staffing requirements, generally categorized into several types (Fig. 1), although distinctions can be less clear-cut, and terminology varies.

Fig. 1. Major approaches for determining nurse staffing requirements.

Telford’s professional judgement method [Telford, 1979], formalized in the UK in the 1970s, converts shift-level staffing plans, based on expert opinion, into the number of staff to employ. This calculates the nursing ‘establishment’ needed to reliably fill daily staffing plans, accounting for leave and absence. Conversely, it can infer daily staffing plans from the total staff employed, as illustrated by Hurst [Hurst, 2002]. While the full Telford method provides a broad framework, staffing judgments do not require objective measures of need [Arthur and James, 1994], making it a ‘professional judgement’-based approach. The US Veteran’s Administration staffing methodology [Taylor et al., 2015] reflects a similar deliberative approach without formal measurement.
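The conversion between a daily staffing plan and an establishment can be illustrated with a short calculation. This is a minimal sketch and not Telford’s published procedure: the function names, the 37.5-hour contracted week, and the 22% uplift for leave and absence are assumptions chosen only for the example.

```python
def establishment_wte(shift_plan, contracted_hours=37.5, uplift=0.22):
    """Whole-time equivalents (WTE) needed to fill a weekly staffing plan.

    shift_plan: iterable of (staff_per_shift, shift_hours, shifts_per_week).
    contracted_hours: assumed weekly hours for one WTE.
    uplift: assumed fractional allowance for leave, sickness and study.
    """
    weekly_hours = sum(staff * hours * shifts for staff, hours, shifts in shift_plan)
    return weekly_hours * (1 + uplift) / contracted_hours


def staff_hours_available(wte, contracted_hours=37.5, uplift=0.22):
    """Inverse view: weekly rostered hours an establishment can reliably fill."""
    return wte * contracted_hours / (1 + uplift)


# Hypothetical ward: 5 staff on earlies, 4 on lates, 3 on nights, every day.
plan = [(5, 7.5, 7), (4, 7.5, 7), (3, 10.0, 7)]
wte = establishment_wte(plan)
print(f"Establishment: {wte:.1f} WTE")
print(f"Rostered hours covered: {staff_hours_available(wte):.1f} per week")
```

The second function shows the converse use described above: starting from the staff employed and inferring the hours available for daily plans. Neither function measures patient need; the shift plan itself remains a matter of judgement.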

‘Benchmarking approaches’ compare staffing levels between similar units to determine requirements, using expert judgments to select comparators. The UK Audit Commission used this for years to compare nursing establishments and spending across hospitals [Audit Commission, 2001]. Though Hurst [Hurst, 2002] categorizes it separately, benchmarking, like professional judgement, lacks formal patient needs assessment. It relies on consensus and expert professional judgement to select benchmarks, making it a specific form of the professional judgement approach. While comparing to similar wards appears objective, it depends heavily on how initial staffing levels were determined, and perceptions are often anchored to historical levels [Ball et al., 2019; Twigg and Duffield, 2009].

Professional judgement and benchmarking can determine establishments, daily staffing plans, or nurse-patient ratios [Hurst, 2002], informing staff deployment. These approaches set target staffing numbers or hours per patient or bed, specifying unit types (e.g., intensive care, general medical-surgical). Newer methods extend this by considering broader activity, such as admissions and discharges, leading to the term ‘volume-based’ approaches.

Volume-based approaches, setting minimum staffing per patient, sometimes acknowledge the need for extra staffing during demand peaks. California’s mandatory nurse-patient ratios legislation requires hospitals to use systems to identify individual patient care needs and adjust staffing above minimums [State of California, 1999]. Approaches accounting for individual patient variation or workload drivers can supplement or replace volume-based minimums.

While volume-based methods measure workload variation by patient counts, others recognize varying care needs within ward types. Edwardson and Giovannetti [Edwardson and Giovannetti, 1994] propose prototype, task, and indicator systems. Hurst describes Patient Classification Systems, timed-task, and regression-based methods [Hurst, 2002].

Prototype or Patient Classification Systems group patients by care needs and assign staffing levels [Fasoli and Haddock, 2010; Hurst, 2002], using pre-existing categories (e.g., diagnosis-related groups [Fasoli and Haddock, 2010]) or bespoke classifications (acuity/dependency levels). The Safer Nursing Care Tool [The Shelford Group, 2014], widely used in England [Ball et al., 2019], is an example. It categorizes patients into five acuity/dependency levels with multipliers indicating required staff.
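The arithmetic of a prototype system can be sketched as follows. The multipliers below are hypothetical placeholders and are not the published Safer Nursing Care Tool values; the level labels, function names, and sample censuses are likewise assumptions made for illustration.

```python
# Hypothetical multipliers: staff required per patient at each of five
# acuity/dependency levels. Placeholders only, not published tool values.
MULTIPLIERS = {1: 1.0, 2: 1.4, 3: 1.8, 4: 2.2, 5: 5.0}


def required_staff(census, multipliers=MULTIPLIERS):
    """Staff requirement for one census: patients at each level x multiplier."""
    return sum(multipliers[level] * count for level, count in census.items())


def mean_requirement(daily_censuses, multipliers=MULTIPLIERS):
    """Average requirement over a sample of days."""
    totals = [required_staff(c, multipliers) for c in daily_censuses]
    return sum(totals) / len(totals)


# Hypothetical sample: number of patients at each level on three days.
sample = [
    {1: 10, 2: 8, 3: 6, 4: 3, 5: 1},
    {1: 12, 2: 7, 3: 5, 4: 4, 5: 0},
    {1: 9, 2: 9, 3: 7, 4: 2, 5: 1},
]
print(round(mean_requirement(sample), 2))
```

Averaging over sampled days gives a single figure, so the day-to-day spread in the sample is discarded unless it is reported separately.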

In task (or timed-task) approaches, detailed care plans with specific ‘tasks’ are created for each patient to determine staffing [Hurst, 2002]. Each task has an assigned time. The GRASP system, common in the US, is a task-based system [Edwardson and Giovannetti, 1994].

Indicator approaches categorize patients based on factors related to care time, like condition (‘unstable’), states (‘non-ambulatory’), activities (complex dressings), or needs (emotional support) [Edwardson and Giovannetti, 1994]. The Oulu Patient Classification (OPC) in the RAFAELA system is an example. Patients are classified into four levels based on weighted care needs across six dimensions [Fagerström and Rainio, 1999]. Indicator systems’ inclusion of specific activities blurs the line with task-based systems, but task-based systems typically consider far more elements (over 200 in some cases [Edwardson and Giovannetti, 1994]).

Hurst also identifies regression-based approaches, modeling relationships between patient, ward, and hospital variables and staffing in well-staffed wards [Hurst, 2002]. Regression models estimate required staffing using coefficients. The Workload Intensity Measurement System [Hoi et al., 2010] is a recent example. Regression models can be seen as a specific method of allocating time across factors in indicator systems, rather than directly observing task times. The RAFAELA system, used in Nordic countries, uses a regression-based approach to determine staffing for acceptable nursing work intensity based on a simple indicator system [Fagerström and Rainio, 1999; Fagerstrom and Rauhala, 2007; Rauhala and Fagerström, 2004].
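A regression-based estimate can be sketched with ordinary least squares on a single predictor. Real systems model several patient, ward, and hospital variables; the single predictor, the data, and the function names here are assumptions used only to show how fitted coefficients translate into a staffing estimate.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_hours(occupied_beds, intercept, slope):
    """Estimated nursing hours per day for a ward with the given occupancy."""
    return intercept + slope * occupied_beds


# Hypothetical observations from wards judged to be well staffed.
beds = [20, 24, 28, 30]
nursing_hours = [100, 118, 136, 145]

intercept, slope = fit_line(beds, nursing_hours)
print(predict_hours(26, intercept, slope))
```

The estimate inherits whatever staffing levels prevailed in the wards used to fit the model, so the choice of ‘well-staffed’ reference wards is itself a judgement.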

Methods for determining times for patient groups or tasks vary, using empirical observations and expert opinion [De Cordova et al., 2010; Myny et al., 2014; Myny et al., 2010]. Some explicitly link workload/time allocations to quality thresholds. For example, Safer Nursing Care Tool multipliers are derived from wards meeting predefined quality standards [Smith et al., 2009]. Non-patient contact time (documentation, care planning, etc.) is handled differently across approaches, often with a fixed percentage allocation above measured direct care time.
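A timed-task calculation with a fixed percentage allowance for non-patient contact time might look like the sketch below. The task names, standard times, and the 25% allowance are hypothetical; real systems may include several hundred tasks.

```python
# Hypothetical standard times in minutes per occurrence of each task.
TASK_MINUTES = {"medication_round": 10, "assisted_wash": 20, "observations": 5}


def patient_care_hours(care_plan, task_minutes=TASK_MINUTES, indirect_share=0.25):
    """Daily nursing hours for one patient.

    care_plan: mapping of task name to occurrences per day.
    indirect_share: assumed fixed percentage added for documentation,
        care planning and other non-patient-contact work.
    """
    direct_minutes = sum(task_minutes[task] * n for task, n in care_plan.items())
    return direct_minutes * (1 + indirect_share) / 60


care_plan = {"medication_round": 4, "assisted_wash": 1, "observations": 6}
print(patient_care_hours(care_plan))
```

Each standard time is an average, so the result carries no information about how much a given patient’s actual care time may differ from it.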

While some approaches appear more precise (timed-task) and others simpler (volume-based), all use average time allocations, assuming individual variation averages out across tasks and patients.

3.1.1. Staffing Decisions and the Use of Tools

Staffing systems and tools inform various decisions across different timeframes (Table 1). Nursing managers decide on employment levels (nursing establishment) and deployment each shift, either fixed or responsive to demand. Indicator and task approaches often focus on immediate need and deployment, rather than establishment setting. These decisions are interrelated and depend on quantifying nursing workload. The distinction and relationship between these uses are sometimes unclear in publications.

Table 1. Uses of staffing systems and tools.
Prospective employment
  • Establishment setting: employment and base deployment decisions (long term)
Concurrent deployment
  • Predict immediate future demand (e.g. next shift)
  • Determine current staffing adequacy and guide deployment/redeployment
  • Prioritise and allocate work to a team
Retrospective review
  • Review success of staffing plans
  • Billing and resource use

For instance, the Safer Nursing Care Tool was designed to inform establishment decisions based on meeting sampled patient needs [The Shelford Group, 2014]. Its acuity-dependency scoring is now used for daily staffing and real-time redeployment, e.g., in Allocate’s SafeCare system [Allocate Software, 2017].

Some tools focus on workload balancing within units, primarily for immediate staff assignments [Brennan and Daly, 2015; Brennan et al., 2012]. Finally, tools can retrospectively review staffing plan success or measure resource use for pricing or budgeting [Kolakowski, 2016].

3.1.2. Overlap Between Approaches

While classifications distinguish broad approaches, overlaps exist. Professional judgement might use benchmarking to set a fixed establishment based on a nurse-patient ratio, resembling a volume-based approach. Initial staffing determination might involve detailed patient need appraisal similar to other systems, without formal workload calculation.

Prototype or indicator systems set establishments or daily plans based on sampled patient needs, assuming generalizability. The establishment implies care needs are met by a fixed nurse-to-patient ratio, though ratios vary across wards. A prototype system like the Safer Nursing Care Tool resembles a volume-based mandatory minimum staffing policy (like California’s), supplemented by assessing variation above the base requirement. It implies an absolute minimum staffing per patient, associated with the lowest staffing prototype.

3.1.3. Choice of Tools

Prior reviews indicated little evidence to favor one approach over another. Professional judgement, despite subjectivity concerns, cannot be easily dismissed without evidence that tool-informed models improve outcomes or staffing efficiency. Existing reviews show no such evidence [Arthur and James, 1994; Aydelotte, 1973; DHSS Operational Research Service, 1982; Fasoli and Haddock, 2010; Griffiths et al., 2016; Hurst, 2002; Twigg and Duffield, 2009]. Professional judgement remains central, even in some tools. One well-researched system determines staffing by titration against subjective work intensity reports [Fagerström and Rainio, 1999; Rauhala and Fagerström, 2004].

Subjective judgements would be less concerning if approaches yielded similar results, but this is not the case. Different systems can produce vastly different staffing estimates [Jenkins-Clarke, 1992; O’Brien-Pallas et al., 1989, 1991, 1992]. One study found high correlations between five tested systems, yet their estimates ranged from 6.65 to 11.18 hours per patient day for the same 256 patients [O’Brien-Pallas et al., 1992].
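High correlation and poor agreement can coexist, because correlation is unaffected by a constant offset or scaling between two measures. The sketch below uses invented hours for two hypothetical systems, not data from the cited study.

```python
from math import sqrt
from statistics import mean


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical hours per patient day from two systems for the same patients.
system_a = [4.0, 5.0, 6.0, 7.0, 8.0]
system_b = [1.5 * hours + 1.0 for hours in system_a]  # same ranking, higher level

print(f"r = {pearson_r(system_a, system_b):.2f}")
print(f"mean A = {mean(system_a):.1f}, mean B = {mean(system_b):.1f}")
```

The two systems rank patients identically, yet one would call for substantially more staff than the other. Correlation alone therefore cannot show that a system gives the ‘right’ level.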

3.2. Recent Evidence

Searches for primary studies yielded 37 recent sources, diverse in methods and all observational. Sources were classified by main purpose, with some dual classifications (Table 2, and fuller descriptions in Supplemental material Table 4).

Table 2. Recent studies/sources used in the review.

Group (number of sources): Descriptions (9)
Overall description: Six sources simply described the use of a staffing system but also reported some data, which generally consisted of exemplar graphs or charts of varying workload. Three others provide measures of nursing workload/demand: for different ward designs, for different diagnostic groups and for determining variability in patient need prior to developing a new workload management system.
Sources: Fagerström et al. (2014), Fenton and Casey (2015), Gabbay and Bukchin (2009), Hurst (2008, 2009), Kolakowski (2016), Smith et al. (2009), Taylor et al. (2015), The Shelford Group (2014)

Group (number of sources): Comparisons (4)
Overall description: These sources compared workload as assessed by different approaches.
Sources: Beswick et al. (2010), Hoi et al. (2010), Rivera (2017), Simon et al. (2011)

Group (number of sources): Tool development (13)
Overall description: These studies reported on the full or partial development of a new measure or adaptation of an existing measure.
Sources: Baernholdt et al. (2010), Brennan et al. (2012), de Cordova et al. (2010), Ferguson-Paré and Bandurchin (2010), Gabbay and Bukchin (2009), Hoi et al. (2010), Hurst et al. (2008), Larson et al. (2017), Morales-Asencio et al. (2015), Myny et al. (2014), Myny et al. (2010), Myny et al. (2012), Perroca (2013)

Group (number of sources): Evaluation (17)
Overall description: Sources classified as evaluation included assessments of the reliability or validity of a measure (9 sources); assessment of implementation including usability or user experience of the system (3 sources); and studies that provided some evidence of outcomes or costs when staffing is guided by a particular method (6 sources).
Sources: Brennan and Daly (2015), Brennan et al. (2012), Fagerstrom et al. (2018), Fagerström et al. (2014), Griffiths et al. (2018a), Hurst et al. (2008), Junttila et al. (2016), Larson et al. (2017), Liljamo et al. (2017), Morales-Asencio et al. (2015), Needleman et al. (2011a), Perroca (2013), Smith et al. (2009), Taylor et al. (2015), Twigg et al. (2011), Twigg et al. (2013), van Oostveen et al. (2016)

Group (number of sources): Operational research (4)
Overall description: Operational research studies seeking to optimise staffing in the face of varying supply/demand including simulations/mathematical models of different approaches to staff deployment.
Sources: Davis et al. (2014), Harper et al. (2010), Kortbeek et al. (2015), Maenhout and Vanhoucke (2013)

3.2.1. Descriptions

Descriptive studies illustrate the ongoing use of professional judgement [Taylor et al., 2015], prototype [Fenton and Casey, 2015; The Shelford Group, 2014], and indicator systems [Fagerström et al., 2014; Kolakowski, 2016], with some combining approaches [Fagerström et al., 2014]. Studies show variation between wards and over time [Gabbay and Bukchin, 2009; Smith et al., 2009], driven by patient numbers, admissions/discharges, patient characteristics, and contextual factors like ward layout [Hurst, 2008].

While demonstrating workload variation, none of these studies directly quantify daily staff variability. This lack of quantification is a key limitation, as tools guide fixed staffing plans.

3.2.2. Comparisons

Recent research echoes earlier findings that different methods yield different results. Differences in hours per patient day arising from different patient counting methods appear marginal in practice [Beswick et al., 2010; Simon et al., 2011], but other factors significantly affect staffing estimates. More comprehensive methods tend to estimate higher workloads. Including patient turnover in a volume-based measure significantly increased measured workload [Beswick et al., 2010]. An acuity-dependency indicator system estimated six more care hours per day than a fixed hours per patient day method [Rivera, 2017]. A new multifactorial indicator system doubled the estimated nursing requirement compared to a simpler system [Hoi et al., 2010].

3.2.3. Tool Development

Many studies (thirteen) document new measure development or adaptation. Most system types, including professional judgement, volume-based, and timed-task, are represented, expanding the range in recent descriptions. Measures are often for local use. Papers typically identify average time or weighting for care aspects or patient groups, but often fail to report or consider variability in these estimates.

Myny and colleagues’ Belgian work [Myny et al., 2014, 2010], which reports variability, is a rare example of sustained recent research. While focused on mean time estimate precision, it illustrates task variability. “Partial help with hygienic care in bed” had a 95% confidence interval of 7.6-21.2 minutes. “Settling a bedridden patient” had an interquartile range of 5-25.75 minutes [Myny et al., 2010].

Prototype approaches, based on typical care needs of patient profiles, might have less individual variation due to averaging, but no equivalent variation estimates were found. Variability measures may be lacking partly because system times or weights are often partly or wholly derived from expert consensus [Brennan et al., 2012; Hurst et al., 2008], and partly because of the volume of observations needed for reliable time estimates [Myny et al., 2010]. Professional judgement remains a crucial source of information and validation for any system.

3.2.4. Evaluation

Correlations between staffing requirement or workload measures have been used to establish validity [Brennan et al., 2012; Hurst et al., 2008; Larson et al., 2017; Morales-Asencio et al., 2015; Smith et al., 2009]. In all but one case, validity criteria are, in effect, professional judgements of nursing care demand. The RAFAELA system exemplifies professional judgement centrality, using OPC weighting associated with nurses’ ‘optimal’ staffing judgements to set targets [Fagerström et al., 2014].

Successful system implementation requires significant staff engagement and training investment. Taylor et al. describe substantial challenges implementing a professional judgement-based system in the US Veteran’s Administration [Taylor et al., 2015], highlighting nursing leadership and staff buy-in as essential, along with training and addressing potential cynicism if staff see little outcome. Even with broad staff support, a pre-implementation study of RAFAELA found insufficient engagement with staffing adequacy measures and reliability challenges [van Oostveen et al., 2016]. Nurses can reliably assess using several systems [Brennan et al., 2012; Liljamo et al., 2017; Perroca, 2013], though inter-rater agreement isn’t always straightforward, and reliability in new settings shouldn’t be assumed [van Oostveen et al., 2016]. Real-life assessment reliability may be lower than in controlled conditions, and omitting user-valued care aspects due to psychometric properties can negatively impact engagement [Brennan and Daly, 2015].

Despite the importance of nurse staffing for patient care and its share of hospital budgets, the impact of tools and systems has received limited attention. However, recent evidence links mismatches between deployed staff and calculated staffing requirements to adverse outcomes. This evidence doesn’t clearly favor any specific measurement system but aligns with evidence of higher staffing benefits. These studies further validate some tools as workload measures but generally do not support the conclusion that tools yield ‘optimal’ staffing levels, in the sense of levels at which adverse outcomes are minimized or beyond which further increases show diminishing returns.

A US study using an unspecified commercial Patient Classification System found a 2% increased death hazard for each shift a patient experienced with staffing 8 or more hours below target [Needleman et al., 2011a]. Mortality also increased with high patient turnover shifts, suggesting unmeasured workload.
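To give a sense of scale, the per-shift figure can be compounded over several below-target shifts. The sketch assumes that the effect is multiplicative per exposed shift, as it would be if the count of exposed shifts entered a proportional hazards model as a covariate; that assumption and the function name are illustrative, and the study’s own model is not reproduced here.

```python
def cumulative_hazard_ratio(per_shift_increase, exposed_shifts):
    """Hazard ratio after a number of exposed shifts, assuming the
    per-shift effect compounds multiplicatively (an assumption)."""
    return (1 + per_shift_increase) ** exposed_shifts


for shifts in (1, 5, 10):
    ratio = cumulative_hazard_ratio(0.02, shifts)
    print(f"{shifts:>2} below-target shifts: hazard ratio {ratio:.3f}")
```

Under this assumption a small per-shift effect accumulates to a noticeably larger one for patients with long stays on persistently understaffed wards.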

In Finland, workload above ‘optimal’ levels measured by OPC was linked to adverse outcomes, including increased mortality [Fagerstrom et al., 2018; Junttila et al., 2016]. However, workload below optimal (higher staffing) improved outcomes [Fagerstrom et al., 2018; Junttila et al., 2016], challenging the ‘optimal’ level concept. The OPC workload measure wasn’t clearly superior to a simple patient-per-nurse measure in decision curve analysis [Fagerstrom et al., 2018].

A recent UK study found registered nurse staffing below Safer Nursing Care Tool levels associated with a 9% increased death hazard in one hospital trust, though assistant staffing below criteria was not linked to mortality [Griffiths et al., 2018a]. This study also found a linear relationship between mortality and registered nurse staffing, with no clear threshold at the Safer Nursing Care Tool-recommended level.

Implementing a ‘Nursing Hours per Patient Day’ methodology in three Australian hospitals led to staffing level increases and improved patient outcomes, including mortality [Twigg et al., 2011]. This volume-based method assigns minimum staffing (hours per patient day) for six ward types based on patient mix and complexity. An economic analysis estimated a cost of AUD$8907 per life year gained [Twigg et al., 2013].

3.2.5. Operational Research

Operational research studies, part of a larger body of nurse rostering literature [Saville et al., 2019], highlight that average staffing-based rosters may not optimally meet varying patient needs.

Two studies determined optimal staffing during demand variation was higher than levels meeting mean demand [Davis et al., 2014; Harper et al., 2010]. ‘Overstaffing’ in one model led to net cost savings, partly due to ‘excess’ staff redeployable to understaffed units [Davis et al., 2014]. Other studies modeled ‘float’ pool configurations to manage demand fluctuations [Kortbeek et al., 2015; Maenhout and Vanhoucke, 2013]. These demonstrate multiple demand variation sources and the challenge of matching nursing care supply to demand, especially with ‘average’ demand-based establishments, while offering limited insight into initial nursing care demand measurement.
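Why an optimal staffing level can exceed mean demand is shown by a simple cost model in which a shortfall is more costly than a surplus. This is a generic newsvendor-style sketch, not a reproduction of the cited models; the demand sample and the 3:1 cost ratio are hypothetical.

```python
from statistics import mean


def expected_cost(staff, demand_samples, cost_over=1.0, cost_under=3.0):
    """Average cost per shift of rostering `staff` nurses.

    cost_over: assumed cost per surplus nurse per shift.
    cost_under: assumed cost per missing nurse per shift (agency cover, harm).
    """
    total = 0.0
    for demand in demand_samples:
        total += cost_over * max(staff - demand, 0)
        total += cost_under * max(demand - staff, 0)
    return total / len(demand_samples)


def best_staffing(demand_samples, **costs):
    """Staffing level with the lowest average cost over the observed range."""
    candidates = range(min(demand_samples), max(demand_samples) + 1)
    return min(candidates, key=lambda s: expected_cost(s, demand_samples, **costs))


# Hypothetical nurses required on ten shifts.
demand = [4, 5, 5, 6, 6, 6, 7, 7, 8, 10]
print(f"mean demand = {mean(demand):.1f}, lowest-cost staffing = {best_staffing(demand)}")
```

When shortfalls and surpluses cost the same, the lowest-cost level falls at the median of demand; the more costly a shortfall is relative to a surplus, the further above the centre of the distribution the lowest-cost level moves.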

4. Discussion

In 1994, Edwardson and Giovannetti identified key unanswered questions about nursing workload systems:

  • Do workload measurement system results significantly differ from practicing nurses’ professional judgements?
  • Does implementing a staffing methodology or tool alter staffing levels, or do historical levels influence need assessment?
  • Do workload measurement systems improve care quality?
  • Do workload measurement systems enhance nursing personnel efficiency?

Despite continued interest and publications, these questions remain largely unanswered. Evidence suggests some systems are reliable, workload measurements correlate with subjective measures, low staffing relative to measured needs worsens outcomes, and increased staffing with system use improves outcomes. However, no system is proven to give ‘correct’ staffing levels.

Workload measurement system results correlate with professional nursing judgement, but discrepancies exist, and their significance is unclear. Despite correlations, systems can yield dramatically different results, meaning there’s no single answer to whether workload measurement systems improve nursing personnel utilization. Complex systems’ advantage over simpler ones is unclear. More comprehensive indicator or volume-based systems tend to estimate higher staffing needs, but evidence beyond higher staffing correlating with better outcomes is lacking to judge correctness.

Patient outcomes improve when staffing exceeds ‘optimal’ levels defined by professional judgement and prototype systems. This aligns with historical staffing levels and expectations influencing perceived needs. While professional judgement remains central and no system is superior, it may be systematically biased. Perceived benefits of staffing methodologies exist, but their impact on care costs or quality remains unclear, and system operating costs are unquantified, though potentially substantial [Ball et al., 2019].

Given the strong link between registered nurse staffing/skill mix and outcomes [Aiken et al., 2017], the literature surprisingly rarely addresses skill mix directly. This might be due to systems originating in settings with lower support staff contributions, like the USA [Aiken et al., 2017]. Skill mix determination is complicated by varying support staff involvement [Kessler et al., 2010]. Some tools consider only registered nurses, while others, like the Safer Nursing Care Tool [The Shelford Group, 2014], plan total nursing team size, deferring skill mix decisions to professional judgement.

4.1. Sources of Variation

Methods often match staffing to average demand for patient groups or care aspects when estimating current or future needs. However, in variable demand situations, average-based responses may not be optimal. While much literature focuses on measuring and identifying sources of variation, it does little to quantify the impact of that variation on decision-making.

When workload distributions are near-normal with small deviations, the mean may be suitable for planning. Assuming staff work capacity flexibility, most patient needs might be safely met most of the time. While some systems like RAFAELA acknowledge acceptable variation from the mean [Fagerström et al., 2014], this is rare, and the safety impacts of small deviations are under-researched.

However, substantial variability and skewed distributions seem more likely. Reports rarely estimate variation in the time taken for aspects of care, but those that do show considerable variation around the mean [Myny et al., 2014]. Left-skewed ward occupancy distributions are reported [Davis et al., 2014]. In such cases, the mean staffing need is lower than the median, so staffing to the mean leaves a ward understaffed more than 50% of the time.
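The effect of left skew can be checked on a small invented sample in which occupancy is usually high with an occasional quiet day. The numbers and the function name are hypothetical.

```python
from statistics import mean, median


def share_understaffed(demand_samples, staffing_level):
    """Fraction of days on which demand exceeds the planned staffing level."""
    short_days = sum(1 for demand in demand_samples if demand > staffing_level)
    return short_days / len(demand_samples)


# Hypothetical daily demand with a long left tail (one unusually quiet day).
demand = [10, 18, 19, 20, 20, 20, 21, 21, 22, 22]

print(f"mean = {mean(demand)}, median = {median(demand)}")
print(f"days short when staffed to the mean: {share_understaffed(demand, mean(demand)):.0%}")
```

The single quiet day pulls the mean below the median, and a plan set at the mean falls short on most days.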

Even when a mean adequately addresses variable demand, the number of observations needed to estimate that mean reliably is often unclear. As shown by Myny et al. [Myny et al., 2010], reliable mean estimation can be challenging even in large studies. The basis for recommended observation periods for systems like the Safer Nursing Care Tool is unclear because variation is unreported.
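A rough guide to the number of observations needed comes from the standard normal-approximation formula for the precision of a mean, n = (z × SD / margin)². The sketch assumes independent observations; the standard deviation and target margin are hypothetical inputs, which is precisely the information that is rarely reported.

```python
import math


def observations_needed(sd_minutes, margin_minutes, z=1.96):
    """Observations needed for a 95% confidence interval of +/- margin
    around a mean task time, using the normal approximation."""
    return math.ceil((z * sd_minutes / margin_minutes) ** 2)


# Hypothetical task with a standard deviation of 10 minutes.
for margin in (5, 2, 1):
    n = observations_needed(10, margin)
    print(f"+/- {margin} min: {n} observations")
```

Halving the margin roughly quadruples the observations required, which helps explain why highly variable tasks remain imprecisely timed even in large studies.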

Demand variation arises at multiple levels: patient census, need per patient, and care delivery time. While some systems partially account for these, they rarely consider that averages used to determine staffing are also variable. Task-based systems might recognize varied patient care needs, but average times don’t account for task time variability. Table 3 summarizes major variation sources. Variation around averages might compound as more care aspects are considered, or may average out – this is unknown.
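Whether variation compounds or averages out depends largely on how the times for different aspects of care are correlated, which is the unknown referred to above. The sketch below is an illustration under simplifying assumptions only: every task has the same mean time and standard deviation, and every pair of tasks shares the same correlation `rho`. The function names and numbers are hypothetical.

```python
import math


def total_sd(k, sd, rho):
    """SD of the summed time for k tasks, each with standard deviation `sd`
    and a common pairwise correlation `rho`."""
    variance = k * sd ** 2 + k * (k - 1) * rho * sd ** 2
    return math.sqrt(variance)


def relative_variation(k, mean_time, sd, rho):
    """Coefficient of variation (SD divided by mean) of the summed time."""
    return total_sd(k, sd, rho) / (k * mean_time)


# Hypothetical tasks: mean 10 minutes, SD 5 minutes each.
for rho in (0.0, 0.5, 1.0):
    cvs = [relative_variation(k, 10, 5, rho) for k in (1, 4, 16)]
    print(f"rho = {rho}: " + ", ".join(f"{cv:.3f}" for cv in cvs))
```

Under independence (rho = 0) the relative variation of total workload shrinks as more tasks are added; under perfect correlation (rho = 1) it does not shrink at all. In both cases the absolute standard deviation of total time grows.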

Table 3. Sources of variation in demand for and supply of nursing care.

Demand
• Differing care needs: different patients have different needs, even within the same prototype; variability unknown.
• Varying time to deliver care: different lengths of time to undertake the same aspect of care (may be patient- or staff-related).
• Patient census/occupancy: variation between and within days, known to be left skewed.
• Patient turnover (admission/discharge): considerable variation between and within wards, potentially left skewed.
• Ward layout: potentially systematic alteration in time required for some care.

Supply
• Staff sickness/absence: relatively rare occurrence with non-random clustering and seasonal variation.
• Staff leave (holiday and study): predictable seasonal variation.
• Vacancies: unpredictable with non-random clustering.
• Varying time to deliver care: different staff may be more or less efficient at performing care and multi-tasking during care delivery.

Task-based systems struggle with specifying and timing all nursing work. Prototype systems can’t account for variation in activities not directly linked to the prototype. For example, patient turnover generates substantial, variable nursing work [Myny et al., 2012], with predictable variation sources like day of the week [Griffiths et al., 2018a]. This variation is hard to capture in patient prototypes as prototypes are static, while admissions/discharges occur dynamically.

Few systems formally consider workload factors that are not related to patients. Although evidence on how ward layout affects staffing is limited [Hurst, 2008], factors influenced by layout, like travel distances and opportunities for patient surveillance, can generate considerable workload variation [Maben et al., 2016, 2015]. While variation in layout can be accommodated if times are estimated per unit, this raises a final issue.

Variation is often systematic. Supply of staff also varies (Table 3). This is crucial for establishment and roster planning. For example, establishments often include an “uplift” for staff sickness [Hurst, 2002; Telford, 1979]. However, sickness is clustered, seasonal, and varies by day of week [Barham and Begum, 2005]. A small percentage uplift based on average absence doesn’t guarantee sufficient staff during peak absence periods.
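The limitation of an average-based uplift can be shown with a small worked sketch. The monthly absence rates and the requirement of 30 whole-time equivalents are invented for illustration and are not drawn from Barham and Begum or any other cited source; the variable names are hypothetical. The uplift is set so that the requirement is met exactly when absence is at its annual average.

```python
from statistics import mean

# Hypothetical monthly sickness-absence rates (fraction of contracted time
# lost), invented to show a winter peak.
absence_rate = {
    "Jan": 0.060, "Feb": 0.055, "Mar": 0.045, "Apr": 0.040,
    "May": 0.035, "Jun": 0.030, "Jul": 0.030, "Aug": 0.035,
    "Sep": 0.040, "Oct": 0.045, "Nov": 0.050, "Dec": 0.060,
}

required_wte = 30.0  # hypothetical staff needed at work, in whole-time equivalents

# Establishment with an uplift sized to the average absence rate.
average_rate = mean(list(absence_rate.values()))
establishment = required_wte / (1 - average_rate)

# Staff actually available each month once that month's absence is removed.
available = {month: establishment * (1 - rate)
             for month, rate in absence_rate.items()}
short_months = [month for month, wte in available.items()
                if wte < required_wte]

print(f"average absence rate: {average_rate:.2%}")
print(f"establishment: {establishment:.1f} WTE")
print("months below requirement:", ", ".join(short_months))
```

With these invented rates the uplift funds about 31.4 whole-time equivalents, yet the ward falls below its requirement in six months of the year, which are exactly the months when absence is above average.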

4.2. ‘Optimal’ Staffing

Each staffing method implicitly assumes what constitutes ‘adequate’, ‘safe’, or ‘quality’ staffing. Timed-task approaches must decide the ‘right’ frequency and length of each task, and nurse-patient ratios must decide the ‘right’ amount of care per patient. These parameters are generally derived from expert judgement, observations of care, or existing establishments, ideally in settings known to meet quality standards [Hurst, 2002]. Whether the resulting staffing level is ‘optimal’, and by what criteria, is rarely addressed.

Evidence suggests staffing at RAFAELA ‘optimal’ levels reduces mortality compared to lower staffing [Junttila et al., 2016], but further mortality reduction at higher staffing challenges the ‘optimal’ label. It’s effectively a professional judgement of reasonable staffing, bounded by historical expectations [Taylor et al., 2015; Telford, 1979]. While this arises with RAFAELA due to its explicit optimum, it applies to all systems. Tools can motivate staffing increases but also potentially restrict staffing below truly ‘optimal’ levels.

The appropriate response to staff productivity variation (experience, team deployment) also complicates ‘optimal’ staffing definitions. While recognizing less experienced staff might need more support, lowering staffing based on team efficiency might penalize success. Furthermore, while systems emphasize demand measurement, optimal staffing management balances supply and demand. ‘Optimal’ staffing may be lower if demand peaks are reduced [Litvak et al., 2005; Litvak and Laskowski-Jones, 2011]. Nursing services are interconnected, and staffing needs may change with other staff group inputs. This highlights measurement limitations and the complex judgements required.

4.3. Limitations

Given the scope of the review and the volume of literature, critiques of specific studies and conclusions about particular approaches were not prioritized. Some recent or older studies of the featured tools might have been missed. However, because the review builds on existing reviews and extensive searches, it is unlikely that a body of research substantial enough to change the overall conclusions was missed.

4.4. Future Research

Staff costs and patient outcomes across different systems are rarely compared. Controlled trials comparing tool-guided staffing outcomes to other approaches might be challenging but conceivable. Cluster randomized trials and controlled before-and-after studies are feasible, with some reported or underway [Drennan et al., 2018]. Many unanswered questions allow progress outside trials. Natural variation around target staffing (e.g., due to sickness) offers opportunities to study target staffing level associations with outcomes using quasi-experimental methods. Unanswered tool questions include their ability to identify staffing levels truly meeting patient needs, and the observation number needed for accurate average need estimation. The assumption that staffing to meet average need is optimal for variable demand is empirically untested and likely incorrect. For establishment-setting systems, their efficiency and effectiveness in delivering staffing to match varying patient needs (with or without flexible staffing) can be addressed in observational and simulation studies.

5. Conclusions

The literature on staffing methodologies is vast and growing, yet a substantial evidence base for selecting specific methods or tools is lacking. A recurring pattern is new tool development with limited programmatic research on existing, even widely used, tools. The RAFAELA system research stands out as an exception, though costs and comparative effects are unreported. Tool-associated benefits appear linked to increased staffing levels.

Despite evidence gaps, a demand for formal systems and tools persists. While professional judgement remains closest to a gold standard, the desire for tools to support and justify it is constant, traceable back to Telford’s 1970s work and beyond. While tool limitations drive new approaches, limited evidence makes it hard to assess if existing approaches are ‘good enough’ or if new ones are better. The lack of progress in building an evidence base suggests shifting focus from new tool development to closer examination of existing tools, their best use, costs, and consequences.

Conflict of interest

Other than project funding, the authors declare no competing interests that might be perceived as influencing the results of this paper.

Acknowledgements and funding

This research was funded by the National Institute for Health Research’s Health Services & Delivery Research programme (grant number 14/194/21).

The views expressed are those of the author(s) and not necessarily those of the National Institute for Health Research, the Department of Health and Social Care, ‘arms-length’ bodies or other government departments.

The Safer Nursing Care Study Group comprises: Jane Ball (University of Southampton), Rosemary Chable (University Hospital Southampton National Health Service Foundation Trust), Andrew Dimech (Royal Marsden National Health Service Foundation Trust), Peter Griffiths (University of Southampton), Yvonne Jeffrey (Poole Hospital National Health Service Foundation Trust), Jeremy Jones (University of Southampton), Thomas Monks (University of Southampton), Natalie Pattison (University of Hertfordshire/East & North Herts NHS Trust), Alexandra Recio Saucedo (University of Southampton), Christina Saville (University of Southampton) and Nicky Sinden (Portsmouth Hospitals National Health Service Trust).

Footnotes

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.ijnurstu.2019.103487.

Appendix. Supplementary materials

mmc1.pdf (133.2KB, pdf)


Citation List:

[Aiken et al., 2017] Aiken, L.H., Sloane, D.M., Bruyneel, L., Van den Heede, K., Griffiths, P., Busse, R., Diomidous, M., Kinnunen, J., Kózka, M., Lesaffre, E., et al., 2017. Nurse staffing and education and hospital mortality in nine European countries: a retrospective observational study. Lancet 390 (10106), 1741-1752.

[Allocate Software, 2017] Allocate Software, 2017. SafeCare – Allocate Software. https://www.allocatesoftware.com/solutions/safecare/.

[Arksey and O’Malley, 2005] Arksey, H., O’Malley, L., 2005. Scoping studies: towards a methodological framework. Int. J. Soc. Res. Methodol. 8 (1), 19-32.

[Arthur and James, 1994] Arthur, H.M., James, E.M., 1994. Measuring workload in nursing: an historical review. Nurse Educ. Today 14 (2), 95-101.

[Audit Commission, 2001] Audit Commission, 2001. Bedside Manners: a Review of Clinical Nursing Costs. Audit Commission, London.

[Aydelotte, 1973] Aydelotte, M.K., 1973. Nurse Staffing Methodology: a Review and Critique of Selected Literature. U.S. Department of Health, Education, and Welfare, Public Health Service, Health Resources Administration, Bureau of Health Manpower Education, Division of Nursing.

[Baernholdt et al., 2010] Baernholdt, M., Hinton, I., Yan, G., Skinner, A., Mattos, M., 2010. Development and validation of the patient assignment tool: a workload measurement instrument for primary care. Nurs. Res. 59 (5), 351-359.

[Ball et al., 2018] Ball, J.E., Bruyneel, L., Aiken, L.H., Sermeus, W., Sloane, D.M., Rafferty, A.M., Lindqvist, R., Tishelman, C., Griffiths, P., 2018. Omission of care and clinical outcomes: a cross-sectional study of hospital nurses in Europe. Int. J. Nurs. Stud. 78, 54-62.

[Ball et al., 2019] Ball, J., Barker, C., Griffiths, P., 2019. Costs of Safer Nursing Care: an economic analysis of the Safer Nursing Care Tool. Health Serv. Deliv. Res. 7 (7).

[Ball et al., 2011] Ball, J., Murrells, T., Rafferty, A.M., Morrow, E., Griffiths, P., 2011. ‘Care left undone’ during nursing shifts: associations with workload and perceived quality of care. BMJ Qual. Saf. 23 (2), 116-125.

[Barham and Begum, 2005] Barham, V., Begum, R., 2005. Absenteeism in the NHS: a study of the reasons and the costs. Public Money Manag. 25 (1), 3-6.

[Beswick et al., 2010] Beswick, S., McNeill, L., Stirling, C., 2010. Nursing hours per patient day: a comparison of methodologies. J. Nurs. Manag. 18 (4), 468-473.

[Brennan and Daly, 2015] Brennan, C.W., Daly, B.J., 2015. Patient assignment decision support: a usability test. CIN: Comput. Inform. Nurs. 33 (7), 303-310.

[Brennan et al., 2012] Brennan, C.W., Daly, B.J., Durkalski, V., 2012. Development and psychometric testing of the nursing workload assignment decision support (NWADS) tool. Int. J. Nurs. Stud. 49 (1), 62-70.

[Brennan et al., 2013] Brennan, C.W., Daly, B.J., Jones, K., 2013. Nurse staffing, burnout, and job satisfaction. Nurs. Econ. 31 (1), 13-18.

[Bridges et al., 2019] Bridges, J., N., Griffiths, P., Moses, K., Price, A., Gould, D., Leslie, G., 2019. Missed nursing care: a mixed-methods study of predictors and consequences in acute hospitals. BMJ Qual. Saf. 28 (4), 274-283.

[Bruyneel et al., 2015] Bruyneel, L., Aiken, L.H., Van den Heede, K., Sermeus, W., 2015. Nursing staff’s perceptions of the quality of care and safety in hospitals with different nurse staffing models. Policy Polit. Nurs. Pract. 16 (1-2), 19-28.

[Butler et al., 2011] Butler, M., Halligan, P., Cairns, P., Westbrook, J., 2011. Nursing workload measurement systems: a systematic review of systems that measure the relationship between workload and patient dependency. J. Nurs. Manag. 19 (1), 1-11.

[Davis et al., 2014] Davis, D.A., Diaz, A., Wang, H., Petrides, K.V., Jiang, Y., Sun, X., 2014. Nurse staffing optimization with patient flow and staffing pool considerations. Decis. Support Syst. 59, 372-383.

[de Cordova et al., 2010] de Cordova, P.B., Stewart, S., Halligan, P., 2010. Development and validation of a nursing workload measurement tool for cardiac surgery intensive care. Aust. Crit. Care 23 (4), 205-212.

[DHSS Operational Research Service, 1982] DHSS Operational Research Service, 1982. A Review of Approaches to Setting Nurse Staffing Levels. Department of Health and Social Security, London.

[Donaldson and Shapiro, 2010] Donaldson, N., Shapiro, M., 2010. California’s mandate on nurse staffing ratios: what we know and what we don’t know. Policy Polit. Nurs. Pract. 11 (4), 264-271.

[Drennan et al., 2018] Drennan, V.M., Furness, S., Fothergill-Bourbonnais, F., Lanouette, N., Letourneau, J., Clarke, S.P., Teare, G., 2018. Impact of nurse staffing levels on patient, nurse, and hospital outcomes in acute hospital settings: a systematic review and meta-analysis. JBI Database System. Rev. Implement. Rep. 16 (2), 449-483.

[Edwardson and Giovannetti, 1994] Edwardson, S.R., Giovannetti, P., 1994. Workload measurement systems for nursing. Annu. Rev. Nurs. Res. 12, 107-135.

[Fagerstrom and Rauhala, 2007] Fagerstrom, L., Rauhala, A., 2007. RAFAELA patient classification system for nursing resource allocation: a systematic review of reliability and validity. J. Nurs. Manag. 15 (2), 121-133.

[Fagerström and Rainio, 1999] Fagerström, L., Rainio, A.K., 1999. RAFAELA: a patient classification system to estimate nursing staff input in hospital wards. J. Adv. Nurs. 30 (5), 1206-1215.

[Fagerström et al., 2014] Fagerström, L., Kinnunen, M., Lindqvist, R., Whitehead, L., 2014. RAFAELA system’s optimal nursing care intensity as a predictor of adverse patient outcomes. J. Nurs. Manag. 22 (7), 916-925.

[Fagerstrom et al., 2018] Fagerstrom, L., Nilsson, K., Finnilä, K., Niemi, M., Carlson, E., Rahkonen, R., Lindqvist, R., 2018. Comparison of the predictive validity of the RAFAELA Optimal Care Intensity indicator and a patient-to-nurse ratio on in-hospital mortality: a prospective cohort study. Int. J. Nurs. Stud. 77, 143-150.

[Fasoli and Haddock, 2010] Fasoli, D., Haddock, C., 2010. Nursing workload: a concept analysis. J. Adv. Nurs. 66 (1), 228-237.

[Fasoli et al., 2011] Fasoli, D., Haddock, C., Lasiter, S., Ma, C., Melnyk, B.M., 2011. Evidence-based workload management systems in acute care: a systematic review of the literature. Worldviews Evid.-Based Nurs. 8 (3), 145-156.

[Fenton and Casey, 2015] Fenton, M.V., Casey, A., 2015. Using the safer nursing care tool in an acute medical ward. Br. J. Nurs. 24 (17), 858-863.

[Ferguson-Paré and Bandurchin, 2010] Ferguson-Paré, M., Bandurchin, D., 2010. Development of a patient classification instrument for use in a critical care setting. Can. J. Nurs. Res. 42 (3), 74-91.

[Francis, 2013] Francis, R., 2013. Report of the Mid Staffordshire NHS Foundation Trust Public Inquiry. The Stationery Office.

[Gabbay and Bukchin, 2009] Gabbay, E., Bukchin, S., 2009. Using workload management to improve nurses’ work environment and reduce burnout. Nurs. Econ. 27 (6), 374-381.

[Griffiths et al., 2014] Griffiths, P., Ball, J., Briggs, J., Dall’Ora, C., Jones, J., Leary, A., Maben, J., Murrells, T., Recio-Saucedo, A., Redfern, O., et al., 2014. Nurse Staffing, Nursing Services and Redesigning Care: Evidence to Support Decision-Making. National Institute for Health Research.

[Griffiths et al., 2016] Griffiths, P., Ball, J., Murrells, T., Jones, S., Rafferty, A.M., 2016. Nurse staffing levels and mortality in acute hospital settings: systematic review and meta-analysis. BMJ Open 6 (5), e011704.

[Griffiths et al., 2018a] Griffiths, P., Dall’Ora, C., Ball, J., Briggs, J., Maruotti, A., Patton, D., Recio-Saucedo, A., Smith, G.B., Rafferty, A.M., 2018a. Nurse staffing and patient mortality: retrospective longitudinal observational study. BMJ Open 8 (3), e019775.

[Griffiths et al., 2018b] Griffiths, P., Dall’Ora, C., Simon, M., Ball, J., Lindqvist, R., Rafferty, A.M., Schubert, M., Ausserhofer, D., Bruyneel, L., Sermeus, W., et al., 2018b. Nurse staffing, nursing workload, and ‘rationing of care’: a survey study in 12 European countries. Int. J. Nurs. Stud. 80, 20-28.

[Griffiths et al., 2019] Griffiths, P., Saville, C., Ball, J., Dall’Ora, C., Jones, J., Monks, T., Rafferty, A.M., 2019. Beyond ratios – flexible nurse staffing for flexible demand. BMJ Qual. Saf. 28 (5), 347-350.

[Harper et al., 2010] Harper, P.R., Shahani, A.K., Gallagher, J., Duffield, C., Twigg, D., 2010. Nurse staffing and patient flow: a discrete event simulation approach. Health Care Manag. Sci. 13 (4), 327-340.

[Hoi et al., 2010] Hoi, L.M., Ang, E.N., Koh, Y.C., 2010. Development of a workload intensity measurement system for medical-surgical units. J. Nurs. Adm. 40 (2), 87-94.

[Hurst, 2002] Hurst, K., 2002. Setting staffing establishments: a review of the literature. Int. J. Nurs. Stud. 39 (3), 305-321.

[Hurst, 2008] Hurst, K., 2008. The impact of ward layout on nursing workload in an acute medical admissions unit. Health Estate J. 62 (4), 33-39.

[Hurst, 2009] Hurst, K., 2009. Nursing workload in acute medical admissions units: a descriptive study. J. Clin. Nurs. 18 (14), 2064-2071.

[Hurst et al., 2002] Hurst, K., Smith, G., Knowles, L., 2002. Nurse Staffing Levels in Acute Wards in Adult General Hospitals. National Institute for Nursing.

[Hurst et al., 2008] Hurst, K., Smith, G., Wilson-Barnett, J., Richardson, J., Baker, P., Kay, J., Wright, C., 2008. Developing a valid and reliable patient classification tool for use in acute wards. J. Nurs. Manag. 16 (5), 581-591.

[Jenkins-Clarke, 1992] Jenkins-Clarke, S., 1992. Workload Measurement in Nursing: a Review of Systems and Their Use. University of York, York.

[Junttila et al., 2016] Junttila, J., Fagerström, L., Hietanen, R., Jousela, M., Carlson, E., Rahkonen, R., Lindqvist, R., 2016. Optimal nursing care intensity and hospital mortality in older patients: a prospective cohort study. Int. J. Nurs. Stud. 60, 14-22.

[Kane et al., 2007] Kane, R.L., Shamliyan, T.A., Mueller, C., Duval, S., Wilt, T.J., 2007. Nurse Staffing and Quality of Patient Care. Agency for Healthcare Research and Quality (US), Rockville (MD).

[Kessler et al., 2010] Kessler, C.S., Chang, Y., Ngo, K., Sioris, M., Meara, J.G., Lee, P.T., 2010. Team composition and clinical outcomes. J. Surg. Res. 163 (1), 1-8.

[Kolakowski, 2016] Kolakowski, D., 2016. Patient classification systems: comparing workload measures to improve staffing. Nurs. Manag. 47 (10), 32-36.

[Kortbeek et al., 2015] Kortbeek, N., Maenhout, B., Vanhoucke, M., Cardoen, B., 2015. Staffing policies for coping with demand uncertainty and staff absenteeism in hospitals. Eur. J. Oper. Res. 247 (2), 543-562.

[Larson et al., 2017] Larson, E., Albrecht, S., Timmons, S., 2017. The nursing workload index: instrument development and validation. J. Nurs. Meas. 25 (2), 219-232.

[Lewinski-Corwin, 1922] Lewinski-Corwin, E.H., 1922. The Hospital Situation in Greater New York. GP Putnam’s Sons, New York.

[Liljamo et al., 2017] Liljamo, P., Vehviläinen-Julkunen, K., Lindqvist, R., 2017. The inter-rater reliability of the RAFAELA patient classification instrument. J. Nurs. Manag. 25 (8), 641-647.

[Litvak and Laskowski-Jones, 2011] Litvak, E., Laskowski-Jones, L., 2011. Managing variability in patient demand. J. Healthc. Manag. 56 (1), 8-13.

[Litvak et al., 2005] Litvak, E., Long, M.C., Cooper, G.S., 2005. Queuing theory can improve hospital efficiency. Manag. Sci. 51 (11), 1664-1671.

[Maben et al., 2015] Maben, J., Griffiths, P., Bridges, J., Dall’Ora, C., Redfern, O.C., Jones, J., Lear
