Problem Identification: Hot flashes are common and bothersome in patients with breast and prostate cancer and can adversely affect patients’ quality of life.
Literature Search: Databases were searched for randomized controlled trials (RCTs) evaluating the effects of one or more interventions for hot flashes in patients with a history of breast or prostate cancer.
Data Evaluation: Outcomes of interest included changes in hot flash severity, hot flash frequency, quality of life, and harms. Pairwise meta-analyses and network meta-analyses were performed where feasible, with narrative synthesis used where required.
Synthesis: 40 RCTs were included. Findings from network meta-analysis for hot flash frequency suggested that several therapies may offer benefits compared to no treatment, but little data suggested differences between active therapies. Findings from network meta-analysis for hot flash score were similar.
Implications for Research: Although many interventions may offer improvements for hot flashes versus no treatment, minimal data suggest important differences between therapies.
Supplementary materials can be found by visiting https://bit.ly/2WGzi30.
Improvements in diagnosis and treatment of breast and prostate cancer are leading to a growing number of cancer survivors (American Cancer Society, 2019). A consequence of this is that many more patients are faced with managing the long-term side effects of their treatment. Treatments for breast or prostate cancer that target production of estrogen and testosterone can be associated with hormone-deprivation symptoms, the most common of which is hot flashes. The frequency, severity, and duration of hot flashes can vary widely from patient to patient but are reported in more than 65% of breast cancer survivors (Chang et al., 2016; Kontos et al., 2010; Mann et al., 2012) and in 80% of men undergoing androgen deprivation therapy (ADT) for prostate cancer (Frisk, 2010). Hot flashes not only can significantly affect a patient’s quality of life (Goldman, 2017), but also can be significant enough to lead to discontinuation of cancer treatment (Buijs et al., 2009). Despite their frequency and significance, there is currently a lack of consensus on evidence-based interventions to treat hot flashes (Goldman, 2017).
A hot flash has been defined as “a subjective sensation of heat that is associated with objective signs of cutaneous vasodilation and a subsequent drop in core temperature” (Boekhout et al., 2006, p. 642). The concept of hot flashes in men has not been well explored in the literature. A concept analysis identified the key attributes of hot flashes in men to consist of physiologic (e.g., warmth, sweating, chills) and psychological factors (e.g., anxiety, impaired memory) (Engstrom, 2005). Hot flashes are also often referred to as hot flushes, night sweats, and vasomotor symptoms. The exact pathophysiology of hot flashes in women or men has not been determined, leading to difficulty in understanding why some interventions or treatments may or may not work.
A wide variety of options for treatment of hot flashes has been researched in patients with and without cancer. Pharmacologic treatments (e.g., anticonvulsants, antidepressants, antiadrenergics, anticholinergics, progestins), natural health products (e.g., herbals, vitamins, phytoestrogens), and complementary medicine and physical and behavioral therapies (e.g., acupuncture, reflexology, exercise, yoga, relaxation training, mindfulness-based stress reduction, hypnosis, cognitive behavioral therapy) have all been evaluated in research studies (Fisher et al., 2013; Goldman, 2017; Marino et al., 2018; Santen et al., 2017; Zoberi & Tucker, 2019).
Although antidepressants have been shown to relieve hot flashes (Barton et al., 2002; Loprinzi et al., 2000) with positive efficacy and good tolerability, many patients look to alternative treatment options. Acupuncture has been studied as a treatment for hot flashes from menopause, as well as following treatment for cancer. Results have been mixed, with some studies finding no evidence of an effect and others finding some benefit (Dodin et al., 2013). Nutraceutical and complementary or behavioral therapies have also had mixed results, with insufficient data to firmly establish effectiveness (Kaplan et al., 2011). Collectively, many treatments are tried by patients to address hot flashes, but the evidence to support the presence of benefits for treatments of all forms is limited (Goldman, 2017).
Given the increased success of cancer therapies, physicians and nurses have an increasingly important need to address survivorship challenges. To improve treatment adherence and, ultimately, quality of life, it is important to identify evidence-based strategies that reduce the frequency and severity of hot flashes following cancer treatment (Kaplan et al., 2011; Pinkerton & Santen, 2019). A systematic review of the currently available evidence incorporating meta-analysis and network meta-analysis methodology was planned because of the potential to identify effective treatments while noting gaps in existing knowledge (Caldwell et al., 2005; Catalá-Lopez et al., 2014; Dias et al., 2011).
This review was conducted to address the following research question: In breast cancer and prostate cancer survivors, what are the relative benefits of nonhormonal therapies on frequency and severity of hot flashes, quality of life, and quality of life related to depression and sleep quality? A protocol for the study was prepared a priori and followed throughout the review process (Hutton, Yazdi, et al., 2015). The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) Extension Statement for Network Meta-Analysis (Hutton, Salanti, et al., 2015) was used to guide authorship of this article, and a completed checklist is provided in the appendices. Adjustments to the protocol made because of characteristics of the data collected are described within the article.
The search strategy was developed and tested by an experienced medical information specialist in consultation with the review team. Using the OVID platform, Ovid MEDLINE®, Ovid MEDLINE In-Process and Other Non-Indexed Citations, Embase®, AMED, and PsycINFO® were searched. The CENTRAL database using the Cochrane Library on Wiley was also searched. Strategies used a combination of controlled vocabulary (e.g., “breast neoplasms,” “prostatic neoplasms,” “hot flashes”) and keywords (e.g., breast cancer, prostate cancer, vasomotor symptoms). Vocabulary and syntax were adjusted across databases, and no date restrictions were in place. Specific details regarding the strategies have been published previously in the related protocol (Hutton, Yazdi, et al., 2015). The final search strategy is provided in Appendix 1 and was last updated on November 13, 2018.
The study selection criteria employed for this study have been described previously (Hutton, Yazdi, et al., 2015). Briefly, the authors sought randomized controlled trials (RCTs) performed in patients with breast or prostate cancer with a history of hot flashes that compared different interventions in terms of their effects to improve quality of life endpoints and reduce hot flash frequency and severity. Studies evaluating the effects of nonhormonal pharmacologic, natural health product, and behavioral and physical interventions were of interest. Eligible pharmacologic interventions included antidepressants from the selective serotonin reuptake inhibitor (SSRI) and serotonin-norepinephrine reuptake inhibitor (SNRI) classes, certain neuroleptic agents, and antihypertensive medications. Structured physical and behavioral interventions of interest included exercise programs, acupuncture, hypnosis, yoga, relaxation techniques, and cognitive behavioral therapy. Natural health products of interest included ginseng, black cohosh, isoflavones, menerba, vitamin E, flax, and soy. A detailed account of the selection criteria is provided in Appendix 2.
Pairs of reviewers among a set of six team members (M.H., M.P., P.B., F.Y., N.A., and S.M.) screened all citations independently, with a pilot exercise used at both stages of screening to ensure consistency between reviewers. Stage 1 review consisted of screening titles and abstracts, and stage 2 consisted of screening the full texts of citations that were considered potentially relevant. After each stage, reviewers resolved discrepancies through discussion or by consultation of a third party (B.H. or M.C.) if needed. The process of study selection is presented in a flow diagram provided in the appendices.
Data collection from the included studies was performed by pairs of reviewers among a set of six research team members (M.H., P.B., M.P., N.A., S.M., and F.Y.) using a standardized data extraction template implemented in the Systematic Review Data Repository (srdr.ahrq.gov) and Microsoft® Excel. A pilot test of the data collection form was performed on the first five studies and refined accordingly. Data items collected included the following: study design, patient eligibility criteria, patient demographics (e.g., type of malignancy; age distribution; menopausal status; baseline measures of hot flash frequency, severity, and composite scores), intervention details (e.g., type of drug, natural health product, therapy or activity and related frequency, dosage, duration), and outcome data (e.g., final values and/or changes in hot flash endpoints, quality-of-life endpoints, harms data). After data collection, the reviewers resolved any discrepancies and consulted a third party when needed.
Full-text articles were independently assessed for risk of bias by two reviewers (M.H., M.P.) using the Cochrane Collaboration’s tool for assessing risk of bias (Higgins et al., 2011). The tool assessed potential areas of bias, including selection bias, performance/detection bias, attrition bias, and reporting bias. Discrepancies in the initial independent assessments were resolved by discussion or consultation of a third party (B.H.), if necessary. A narrative summary of findings from these assessments is provided in the current article, and a tabular summary of all assessments is provided in the appendices.
Prior to network meta-analyses, pairwise meta-analyses were performed using inverse variance random effects models where multiple studies informing the same treatment comparison were available; the I² statistic was used to evaluate the presence of statistical heterogeneity between studies. Where comparisons involving single studies existed and no connected network of evidence could be completed, narrative syntheses were prepared.
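As a rough illustration of the pairwise step described above, the following Python sketch pools study-level effects with inverse variance weights under a random effects model and reports I². It assumes the DerSimonian-Laird estimator for the between-study variance, which the review does not specify; the function name and the inputs are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool study effects with inverse variance weights.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method; the review does not name its estimator.
    Requires at least two studies.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                          # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0    # I^2 as a percentage
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2, i2
```

With effects expressed on a log scale (for example, log ratios of means), the pooled value and its interval would be exponentiated before reporting.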
Network meta-analysis is an extension of traditional pairwise meta-analysis, which enables the comparison of multiple interventions in a single analysis and allows incorporation of direct and indirect evidence of relevance (Caldwell et al., 2005; Lu & Ades, 2004). In the current review, network meta-analyses of the change in hot flash daily frequency and the change in hot flash composite score (frequency × severity) were conducted (network structures for other endpoints were disconnected). The nature of reporting these endpoints varied across included studies, with some reporting changes from baseline and others reporting mean values of each endpoint at baseline and follow-up, with standard deviations for each. In some cases, the percentage change from baseline in each group was reported. To accommodate these varied reporting structures, a network meta-analysis model based on the ratio of means (RoM) was developed and implemented using WinBUGS and R. Model code is provided in Appendix 3, along with a detailed description of the underlying statistical methods. Random effects consistency models were considered using a Bayesian approach with burn-in of 100,000 iterations and sampling of 100,000 iterations, and model convergence was assessed using Gelman–Rubin plots and inspection of Monte Carlo standard errors. The adequacy of fit of each model was assessed through comparison of the mean posterior total residual deviance with the number of unconstrained data points (approximately equal values suggest strong fit), and the comparison of fit between models was based on deviance information criteria. The consistency assumption was evaluated by fitting corresponding inconsistency models for each analysis and comparing the deviance information criterion value with that from the corresponding consistency analysis.
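To make the ratio of means concrete, the sketch below computes a log RoM and an approximate standard error for a single two-arm comparison using the first-order delta method. This is a simplified frequentist illustration only, not the Bayesian model described in Appendix 3; the function names are hypothetical, and which arm goes in the numerator is left to the caller as a convention.

```python
import math

def log_ratio_of_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return log(mean_a / mean_b) and its standard error.

    The standard error uses the first-order delta method, which assumes
    both means are positive and the two arms are independent.
    """
    log_rom = math.log(mean_a / mean_b)
    se = math.sqrt(sd_a ** 2 / (n_a * mean_a ** 2)
                   + sd_b ** 2 / (n_b * mean_b ** 2))
    return log_rom, se

def rom_with_interval(log_rom, se, z=1.96):
    """Back-transform to the ratio scale with an approximate 95% interval."""
    return (math.exp(log_rom),
            math.exp(log_rom - z * se),
            math.exp(log_rom + z * se))
```

Because the ratio is unitless, studies reporting daily counts and studies reporting weekly counts can contribute to the same analysis without rescaling, which is one reason a ratio-based measure suits heterogeneous reporting formats.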
Although several secondary analyses using meta-regression analysis were planned to assess the impact of heterogeneity between studies (with regard to factors including duration of hot flashes, baseline hot flash severity, smoking status, and concomitant treatments), limited reporting of patient characteristics precluded their performance. The median effect sizes of interventions versus standard care are reported along with corresponding 95% credible intervals (CrIs). League tables (a tabular approach to summarizing comparisons between treatments derived from network meta-analysis, listing the point estimate and related measure of uncertainty) are presented in this article to summarize findings from comparisons between interventions. Surface under the cumulative ranking (SUCRA) curve values and treatment rankings are provided in the appendices (Salanti et al., 2011). All network meta-analyses were performed using OpenBUGS software, version 3.2.3. Model convergence was assessed using established methods including Gelman–Rubin diagnostics and the Potential Scale Reduction Factor. Network diagrams (a commonly used visual for network meta-analyses used to present the totality of evidence available for quantitative analysis [Salanti et al., 2008]) were prepared to summarize the available evidence for each network meta-analysis, and patterns of comparisons between active interventions and control were assessed.
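For readers unfamiliar with SUCRA, the following sketch shows how the value can be derived from a matrix of rank probabilities: it is the average of the cumulative probabilities of being ranked within the top 1, top 2, and so on up to the second-to-last rank. The function name and the example matrix are hypothetical; in practice the rank probabilities would come from the posterior samples of the network meta-analysis.

```python
def sucra(rank_probs):
    """Compute SUCRA values from rank probabilities.

    rank_probs[k][j] is the probability that treatment k holds rank j + 1,
    where rank 1 is best. Each row is assumed to sum to 1.
    """
    n = len(rank_probs)
    values = []
    for probs in rank_probs:
        cumulative = 0.0
        total = 0.0
        for p in probs[:-1]:      # cumulative probabilities for ranks 1..n-1
            cumulative += p
            total += cumulative
        values.append(total / (n - 1))
    return values

# Hypothetical example: three treatments with certain ranks 1, 2, and 3
example = sucra([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```

A value of 1 corresponds to a treatment that is certain to be best and 0 to one that is certain to be worst, so SUCRA summarizes ranking and not the size of any treatment effect.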
Two team members (R.M., B.S.A.) rated the certainty of evidence for each pairwise and network estimate using the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach (Brignardello-Petersen et al., 2018; Guyatt et al., 2008; Puhan et al., 2014). For each direct comparison, the body of evidence from RCTs started at high certainty of evidence and could be rated down because of study limitations (risk of bias), inconsistency (heterogeneity), indirectness, imprecision, or publication bias (Guyatt et al., 2008).
For the assessment of the network, the certainty of evidence of the indirect evidence was rated with a focus on the dominant lowest order loop (Puhan et al., 2014). The certainty of the indirect evidence was set to the lowest certainty rating among the contributing direct comparisons. The network estimate certainty of evidence started as the higher of the direct and indirect evidence; however, their relative contributions were considered for the final estimate. The network estimate could be rated downward because of incoherence between the direct and indirect estimates or an imprecise treatment effect (Brignardello-Petersen et al., 2018; Puhan et al., 2014).
The current authors present the network effect estimates and associated certainty of evidence using an approach that categorized interventions from most effective to least effective. For each outcome, groups of interventions are presented as follows: (a) the reference intervention (placebo) and interventions no different from placebo (i.e., 95% CrI includes null value), which the authors refer to as “among the least effective”; (b) interventions superior to placebo but not superior to any other of the intervention(s) superior to placebo, which the authors call category 2 and describe as “inferior to the most effective, but superior to the least effective”; and (c) interventions that proved superior to at least one category 2 intervention, which the authors call “among the most effective.” The authors then divided all three categories into two groups: those with moderate or high certainty of evidence relative to placebo, and those with low or very low certainty of evidence relative to placebo (Florez et al., 2018). Details of the assessments are provided in the appendices.
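The grouping rule above can be expressed as a short decision procedure. The Python sketch below is one possible reading of it, assuming that effects are RoMs with a null value of 1 and that an RoM above 1 favors the first-named intervention (consistent with how results are reported in this article). The function names, the "unclassified" fallback, and the example inputs are hypothetical, and the sketch does not cover the separate split by certainty of evidence.

```python
NULL_ROM = 1.0  # null value on the ratio-of-means scale

def is_superior(cri):
    """True when the whole 95% CrI lies above the null value (assumed direction)."""
    lower, _upper = cri
    return lower > NULL_ROM

def categorize(vs_placebo, pairwise):
    """Assign each intervention to one of the three groups.

    vs_placebo: {name: (lower, upper)} CrI of each intervention versus placebo.
    pairwise:   {(a, b): (lower, upper)} CrI of intervention a versus b.
    """
    better = {t for t, cri in vs_placebo.items() if is_superior(cri)}

    def beats_any(t, pool):
        return any(is_superior(pairwise[(t, o)])
                   for o in pool if o != t and (t, o) in pairwise)

    # Category 2: superior to placebo, but not to any other such intervention
    category2 = {t for t in better if not beats_any(t, better)}
    labels = {}
    for t in vs_placebo:
        if t not in better:
            labels[t] = "among the least effective"
        elif t in category2:
            labels[t] = "inferior to the most effective, but superior to the least effective"
        elif beats_any(t, category2):
            labels[t] = "among the most effective"
        else:
            labels[t] = "unclassified"  # hypothetical fallback, not in the source
    return labels

# Hypothetical inputs, not taken from the review's tables
labels = categorize(
    {"A": (1.5, 4.0), "B": (1.1, 2.0), "C": (0.8, 1.6)},
    {("A", "B"): (1.2, 3.0), ("B", "A"): (0.33, 0.83)},
)
```

In this hypothetical example, A is placed among the most effective, B in category 2, and C among the least effective.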
The initial search identified a total of 3,901 citations. Duplicates were removed, leaving 2,992 unique citations for review. Stage 1 screening of titles and abstracts identified 570 potentially relevant citations, which were subsequently reviewed in full text. Of these citations, 41 met the a priori inclusion criteria, representing 40 unique individual studies that were included for analysis; 36 were in patients with breast cancer and 4 were in patients with prostate cancer. Reasons for study exclusion are listed in the flow diagram presented in Appendix 4 and reflected in the full listing of articles excluded during full-text screening provided in Appendix 5.
A total of 4,186 patients (4,075 with breast cancer and 111 with prostate cancer) participated in the included studies (see Table 1). Year of publication ranged from 1998 to 2016, and study size ranged from 24 to 422 patients (median = 88). Parallel group and crossover designs were used in 80% and 20% of studies, respectively, and patient follow-up ranged from 4 to 108 weeks (median = 8 weeks). The largest proportion of studies was conducted in the United States (50%), with smaller proportions conducted in the United Kingdom (10%), Italy (7.5%), Sweden (7.5%), Canada (5%), Germany (5%), the Netherlands (5%), and other countries (7.5%). Regarding study funding, 58% were government or academically funded, 12.5% had mixed funding, 5% were industry funded, 2.5% had no funding, and 20% did not report on funding source (Biglia et al., 2009). A total of 42.5% of studies were at single sites, 27.5% were multicenter, and 30% did not report the number of participating centers.
Regarding demographics, several patient descriptors were largely unreported in the included studies. This included details of patients’ types of breast and prostate cancer diagnoses, average duration of hot flashes prior to the time of the study, concomitant interventions (for hot flashes or other indications), duration of time since cancer treatment, body mass index, smoking history, and socioeconomic status. Average or median patient age was available from 35 studies, with a median measure of 54 years (range = 45–70); patients in the four prostate cancer studies (Frisk et al., 2009; Loprinzi et al., 2009; Stefanopoulou et al., 2015; Vitolins et al., 2013) were all associated with higher values between 69 and 71 years. The median proportion of Caucasian patients was 85.5% (range = 40%–100%) among 19 studies with available data. A total of 31 studies (77.5%) reported patients’ average daily number of hot flashes prior to randomization (weekly values were divided accordingly); the median value was 7.7, with some variation as expected (range = 2.8–12.5) based on studies’ variable entry criteria regarding this trait. Only seven studies reported on the use of prior interventions for hot flashes, with a median percentage of patients of 13.5% (range = 0%–61%) (Boekhout et al., 2011; Bordeleau et al., 2010; Hervik & Mjåland, 2009; Loibl et al., 2007; MacGregor et al., 2005; Mann et al., 2012; Mao et al., 2015). Time from diagnosis was reported in nine studies (Bordeleau et al., 2010; Carson et al., 2009; Chen et al., 2014; Cramer et al., 2015; Frisk et al., 2009; Mann et al., 2012; Nedstrand et al., 2005; Stefanopoulou et al., 2015; Van Patten et al., 2002), with a median of 3.28 years (range = 2–7.6). The ability to assess the extent of homogeneity amongst study populations was limited based on the lack of information available for inspection.
Overall, the set of included studies evaluated a broad range of natural health product interventions (e.g., soy, melatonin, black cohosh, vitamin E), pharmacologic interventions (e.g., venlafaxine, paroxetine, fluoxetine, escitalopram, duloxetine, sertraline, gabapentin, clonidine), physical activity interventions (e.g., yoga, acupuncture, electroacupuncture), psychological interventions (e.g., relaxation, cognitive behavioral therapy, hypnosis), and combination interventions. Figure 1 presents network diagrams displaying the patterns of comparisons across the set of included studies prior to consideration of the availability of outcomes; data separated by population (breast cancer and prostate cancer) and considering data collectively from both populations are shown. The majority of comparisons in the included studies used placebo or standard care as the control group. As the presence of outcomes per study was inspected, the scope and feasibility of network meta-analyses encompassing the broad range of therapies was limited; this is discussed further later in this article.
A summary table provided in Appendix 6 provides a study-by-study account of the availability of the a priori outcomes of interest. Hot flash frequency was reported by 32 studies, hot flash severity was reported by 10 studies, and hot flash composite score was reported by 26 studies. Reporting format of outcome data varied across studies and included mean difference of change from baseline, mean percentage change from baseline, and values of endpoints at baseline and follow-up. Values were expressed as weekly measures in some cases and daily measures in other cases. In addition, treatment comparisons within studies could not be fully connected within evidence networks. Network meta-analyses consisting of a subset of all noted studies were feasible for each of the hot flash composite score and hot flash frequency outcome measures (supplemented by narrative description of unconnected studies), and only a narrative summary was feasible for hot flash severity.
Quality-of-life measures were heterogeneously reported across studies. General health-related quality of life was assessed by one of a broad range of scales in 21 studies including the SF-36®, European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire–Core 30 (EORTC QLQ-C30), EuroQoL, Functional Assessment of Cancer Therapy (FACT)–Breast, FACT-General, FACT-Prostate, FACT-Endocrine Subscale, Global Rating of Well-Being, Menopause-Specific Quality of Life (MENQOL), Menopause Rating Scale, single-item global score, the Symptom Checklist, and a visual analog scale. Sleep-related quality of life was assessed by eight studies involving use of the Pittsburgh Sleep Quality Index, the Medical Outcomes Study (MOS) Sleep Scale, the Groningen Sleep Quality Scale, and the Women’s Health Questionnaire sleep subscale. Depression-related quality of life was described in 15 studies based on the Hospital Anxiety and Depression Scale (HADS), Center for Epidemiologic Studies–Depression (CES-D), the Beck Depression Inventory (BDI), the Women’s Health Questionnaire depression subscale, or the Profile of Mood States (POMS). Sexual function quality of life was described in five studies and assessed using the Sexual Activity Questionnaire, the MOS Sexual Problem Index, or a visual analog scale. Based on the disconnected patterns of treatment comparisons and diversity in assessment scales used, only narrative summaries were used to summarize the available evidence for these outcomes.
Appendix 7 provides a summary bar graph of the distribution of judgments across studies on all risk of bias domains, as well as a tabular summary of all risk of bias assessments performed for the set of included studies. Totals of 45% and 57.5% of included studies were assessed to be associated with an unclear risk of bias for randomization and allocation concealment. Blinding of patients, study personnel, and outcome assessors was inconsistent among studies and judged to be unclear in 22.5%, 57.5%, and 25% of studies, respectively. Risk of selective reporting was unclear in 62.5% of studies because of lack of availability of a study protocol. A total of 66.5% of studies were judged to be at high risk of bias based on a lack of intention-to-treat analysis, and 52.5% of studies were judged at unclear risk of bias with regard to treatment compliance.
The following sections present findings from network meta-analyses (where they were feasible), as well as narrative summaries of individual study findings for studies that could not be included in network meta-analyses because of a lack of connectivity of the evidence base. Network meta-analyses were feasible using a subset of all available evidence for changes in hot flash frequency and hot flash composite score; for both analyses, model fit statistics suggested the fit of the random effects model to be adequate (see Appendix 8) and that there was minimal evidence to support a violation of the consistency of direct and indirect evidence (see Appendix 9). Secondary measures of treatment effect from network meta-analyses including SUCRA are described later in this article, with full details provided in Appendix 10. Where narrative summaries of study findings were required because of disconnected treatment networks (whether for all treatments or only a subset of treatments), the main text of the article mentions findings only briefly, and Appendix 11 presents additional numeric details from the individual studies with available data.
A total of 11 studies (Biglia et al., 2009; Chen et al., 2014; Kimmick et al., 2006; Loibl et al., 2007; Loprinzi et al., 2000, 2007, 2009; Pandya et al., 2000, 2005; Stearns et al., 2005; Wu et al., 2009) comparing 9 interventions (see Figure 2) that enrolled a total of 1,403 patients had outcome data for changes in hot flash frequency that could be synthesized using network meta-analysis. Overall, eight studies were placebo-controlled, all of which had two study arms.
Table 2 is a league table summarizing all pairwise RoMs comparing interventions included in the network. Paroxetine, venlafaxine, gabapentin plus antidepressant, gabapentin, clonidine, and sertraline were associated with more benefit than vitamin E. Only paroxetine and venlafaxine were better than placebo. The largest RoM versus placebo was associated with paroxetine (RoM = 3.15, 95% CrI [1.29, 7.58], SUCRA = 0.873). Based on descending magnitude of SUCRA value, the next best ranked interventions were venlafaxine (RoM versus placebo = 2.48, 95% CrI [1.36, 4.32], SUCRA = 0.8), gabapentin plus antidepressant (RoM versus placebo = 1.8, 95% CrI [0.65, 4.65], SUCRA = 0.59), sertraline (RoM versus placebo = 1.67, 95% CrI [0.69, 3.94], SUCRA = 0.55), gabapentin (RoM versus placebo = 1.62, 95% CrI [0.92, 2.73], SUCRA = 0.53), clonidine (RoM versus placebo = 1.62, 95% CrI [0.86, 2.98], SUCRA = 0.52), melatonin (RoM versus placebo = 1.03, 95% CrI [0.11, 8.9], SUCRA = 0.39), and vitamin E (RoM versus placebo = 0.27, 95% CrI [0.06, 1.18], SUCRA = 0.033).
No physical or psychological interventions could be included in the analysis because they were disconnected from the evidence network, warranting separate description. In total, 20 additional studies (Barton et al., 1998; Biglia et al., 2016; Bordeleau et al., 2010; Carson et al., 2009; Deng et al., 2007; Duijts et al., 2012; Elkins et al., 2008; Fenlon, 1999; Fenlon et al., 2008; Frisk et al., 2009; Hervik & Mjåland, 2009; Liljegren et al., 2012; Loprinzi et al., 2002; Mann et al., 2012; Mao et al., 2015; Nedstrand et al., 2005; Quella et al., 2000; Stefanopoulou et al., 2015; Van Patten et al., 2002; Vitolins et al., 2013) reported data regarding changes in hot flash frequency that could not be included in the network meta-analysis. Briefly, findings related to pharmacologic interventions suggested that duloxetine and escitalopram may offer a reduction in hot flash frequency in women with breast cancer (Biglia et al., 2016), as may fluoxetine (Loprinzi et al., 2002); gabapentin may offer fewer benefits than electroacupuncture (Mao et al., 2015); gabapentin and venlafaxine may offer benefits for patients with breast cancer, with patient preferences from a crossover study favoring venlafaxine (Bordeleau et al., 2010). 
Findings related to nonpharmacologic interventions suggested that cognitive behavioral therapy may reduce the number of hot flash episodes as compared to placebo among men with prostate cancer (Stefanopoulou et al., 2015); group cognitive behavioral therapy and usual care (including telephone support and information leaflets on symptom management) may offer reductions in hot flash frequency in women with breast cancer, with no significant difference identified between treatments (Mann et al., 2012); yoga may offer more improvement than no treatment for patients with breast cancer (Carson et al., 2009); acupuncture and electroacupuncture may offer reductions for as long as one year in patients with prostate cancer (Frisk et al., 2009; Nedstrand et al., 2005); relaxation therapy may reduce frequency in women with breast cancer (Fenlon et al., 2008; Nedstrand et al., 2005); and hypnosis in women with breast cancer may offer benefits as compared to no treatment (Elkins et al., 2008). Two studies observed reductions in hot flash frequency in groups of patients with breast cancer randomized to acupuncture versus sham acupuncture, with no statistically significant difference identified between groups (Deng et al., 2007; Liljegren et al., 2012); one other study making the same treatment comparison observed a significant difference favoring acupuncture (Hervik & Mjåland, 2009). There was little to no evidence of benefit with soy (Quella et al., 2000; Van Patten et al., 2002) or vitamin E (Barton et al., 1998) as compared to placebo, or with exercise (with or without cognitive behavioral therapy) as compared to no treatment (Duijts et al., 2012).
Among the interventions with high or moderate certainty of evidence relative to placebo, only venlafaxine reduced the mean hot flash frequency as compared to placebo (RoM = 2.48, 95% CrI [1.36, 4.32], moderate certainty of evidence) (see Appendix 12). Gabapentin was no more effective than placebo (RoM = 1.62, 95% CrI [0.92, 2.73], moderate certainty of evidence). Among the interventions with low or very low certainty of evidence, paroxetine significantly reduced the mean hot flash frequency (RoM = 3.15, 95% CrI [1.29, 7.58], low certainty of evidence). No other statistically significant differences were identified among the remainder of the interventions and placebo comparisons.
Data from 12 RCTs (Biglia et al., 2009; Chen et al., 2014; Kimmick et al., 2006; Loibl et al., 2007; Loprinzi et al., 2000, 2007, 2009; Mao et al., 2015; Pandya et al., 2000, 2005; Stearns et al., 2005; Wu et al., 2009) comparing 11 interventions that enrolled a total of 1,523 patients had outcome data for changes in hot flash score that could be synthesized using network meta-analysis. Overall, 9 studies (1,286 patients) were placebo-controlled, of which 5 studies had more than two arms.
Table 3 is a league table summarizing all pairwise RoMs comparing interventions included in the network. Paroxetine, clonidine, electroacupuncture, and venlafaxine were better than placebo, and vitamin E was worse than placebo. The largest RoM versus placebo was associated with paroxetine (RoM = 2.83, 95% CrI [1.31, 6.09], SUCRA = 0.87). In order of descending magnitude of SUCRA value, the subsequently ranked interventions were clonidine (RoM versus placebo = 2.13, 95% CrI [1.27, 3.54], SUCRA = 0.76), electroacupuncture (RoM versus placebo = 2.07, 95% CrI [1.01, 4.24], SUCRA = 0.73), venlafaxine (RoM versus placebo = 1.71, 95% CrI [1.05, 2.76], SUCRA = 0.59), sham acupuncture (RoM versus placebo = 1.65, 95% CrI [0.83, 3.31], SUCRA = 0.57), sertraline (RoM versus placebo = 1.58, 95% CrI [0.7, 3.41], SUCRA = 0.54), gabapentin (RoM versus placebo = 1.43, 95% CrI [0.95, 2.12], SUCRA = 0.451), gabapentin plus antidepressant (RoM versus placebo = 1.34, 95% CrI [0.59, 3.01], SUCRA = 0.42), melatonin (RoM versus placebo = 0.7, 95% CrI [0.05, 11.19], SUCRA = 0.34), placebo (SUCRA = 0.21), and vitamin E (RoM versus placebo = 0.14, 95% CrI [0.03, 0.58], SUCRA = 0.02).
As was observed for the hot flash frequency analysis, no physical or psychological interventions could be included in the analysis because they were disconnected from the evidence network. An additional 14 studies (Bao et al., 2014; Barton et al., 1998; Biglia et al., 2016; Boekhout et al., 2011; Bordeleau et al., 2010; Carson et al., 2009; Elkins et al., 2008; Frisk et al., 2009; Jacobson et al., 2001; Lesi et al., 2016; Loprinzi et al., 2002; Quella et al., 2000; Van Patten et al., 2002; Vitolins et al., 2013) reported data regarding composite hot flash score that could not be included in the network meta-analysis. Briefly, findings regarding pharmacologic interventions suggested that escitalopram and duloxetine may reduce hot flash scores in women with breast cancer (Biglia et al., 2016); venlafaxine and clonidine offered improvements as compared to placebo for patients with breast cancer (Boekhout et al., 2011), as did fluoxetine (Loprinzi et al., 2002). Venlafaxine and gabapentin offered reductions from baseline in patients with breast cancer that were of comparable magnitude (Bordeleau et al., 2010). 
Among nonpharmacologic interventions, findings suggested that acupuncture with enhanced self-care (i.e., provision of an information booklet about climacteric symptom management addressing considerations for diet, physical exercise, and psychological support) improved hot flash score as compared to enhanced self-care alone (Lesi et al., 2016); yoga provided significant benefits as compared to waitlist control post-treatment in patients with breast cancer after three months of follow-up (Carson et al., 2009); hypnosis may improve hot flash score as compared to waitlist control in patients with breast cancer (Elkins et al., 2008); vitamin E may offer some improvement as compared to placebo for patients with breast cancer (Barton et al., 1998); and acupuncture and electroacupuncture may reduce distress from hot flashes in patients with prostate cancer (Frisk et al., 2009). No statistically significant differences were reported in a trial comparing acupuncture with sham acupuncture (Bao et al., 2014), although a significant change from baseline was noted in the acupuncture group. Likewise, no statistically significant differences were reported in trials comparing soy with placebo (Quella et al., 2000; Van Patten et al., 2002), black cohosh with placebo (Jacobson et al., 2001), or exercise (with or without cognitive behavioral therapy) with no treatment (Duijts et al., 2012).
No interventions demonstrated high or moderate certainty of evidence when compared to placebo. Among the studies with low or very low certainty of evidence, venlafaxine (RoM = 1.71, 95% CrI [1.05, 2.76], low certainty of evidence), paroxetine (RoM = 2.83, 95% CrI [1.31, 6.09], low certainty of evidence), clonidine (RoM = 2.13, 95% CrI [1.27, 3.54], low certainty of evidence), and electroacupuncture (RoM = 2.07, 95% CrI [1.01, 4.24], low certainty of evidence) significantly reduced the mean hot flash composite score as compared to placebo. Vitamin E significantly increased the mean hot flash composite score when compared to placebo (RoM = 0.14, 95% CrI [0.03, 0.58], very low certainty of evidence). No other statistically significant differences were identified among the remainder of the interventions and placebo comparisons.
The authors identified 10 studies that reported on the outcome of hot flash severity (Barton et al., 1998; Bordeleau et al., 2010; Carson et al., 2009; Chen et al., 2014; Fenlon et al., 2008; Hernández Muñoz & Pluchino, 2003; Jacobson et al., 2001; Pandya et al., 2000; Vitolins et al., 2013; Walker et al., 2010). Because of the variety of reporting formats and low number of studies evaluating the outcome measure, a narrative approach to synthesis was used.
There was uncertainty of effects based on the small amounts of evidence available for venlafaxine, gabapentin, and clonidine. Each may offer some benefits related to hot flash severity, but the clinical relevance is unclear (Bordeleau et al., 2010; Loibl et al., 2007; Pandya et al., 2000; Walker et al., 2010). In patients with breast cancer, acupuncture may provide beneficial effects comparable to those of venlafaxine, yoga, and black cohosh (but studies for the latter conflicted) as compared to no treatment (Carson et al., 2009; Hernández Muñoz & Pluchino, 2003; Jacobson et al., 2001). There was insufficient evidence of any important effects associated with vitamin E, melatonin, or soy (Barton et al., 1998; Chen et al., 2014; Vitolins et al., 2013). Relaxation therapy may reduce hot flash severity as compared to control in patients with breast cancer (low certainty of evidence) (Fenlon et al., 2008).
The authors identified 19 studies that reported on measures of quality of life (Bao et al., 2014; Biglia et al., 2009; Bordeleau et al., 2010; Cramer et al., 2015; Fenlon et al., 2008; Jacobson et al., 2001; Kimmick et al., 2006; Loprinzi et al., 2000, 2002, 2007, 2009; MacGregor et al., 2005; Nedstrand et al., 2005; Pandya et al., 2000; Stearns et al., 2005; Stefanopoulou et al., 2015; Vitolins et al., 2013; Walker et al., 2010; Wu et al., 2009). A variety of general quality-of-life measures were assessed across trials, as described previously. Because of the diversity of measurements and treatment patterns, only pairwise comparisons were possible, primarily based on single studies. Only three studies that reported data related to general quality of life identified statistically significant differences between groups. One identified a difference between venlafaxine and placebo in patients with breast cancer based on single-item global quality-of-life items (Loprinzi et al., 2000). The second identified a difference between yoga and no treatment in patients with breast cancer on the FACT-Breast scale and the related physical, social, and emotional well-being subscales (Cramer et al., 2015). The third identified a difference between soy and no soy in patients with prostate cancer with regard to FACT-General, FACT-Prostate, and related emotional and functional domains (Vitolins et al., 2013). Limited data exist to suggest important general quality-of-life improvements for any intervention. Two studies reported that relaxation training may improve quality of life, but there is a low certainty of evidence (Fenlon et al., 2008). Yoga as compared to placebo may improve quality of life in patients with breast cancer, as measured by FACT-Breast (low certainty of evidence) (Cramer et al., 2015).
The authors identified 15 studies that reported on measures of quality of life related to depression (Bao et al., 2014; Biglia et al., 2016; Boekhout et al., 2011; Chen et al., 2014; Cramer et al., 2015; Duijts et al., 2012; Elkins et al., 2008; Jacobson et al., 2001; Kimmick et al., 2006; Loprinzi et al., 2000, 2009; Mann et al., 2012; Stearns et al., 2005; Stefanopoulou et al., 2015; Walker et al., 2010). Depression outcomes were assessed in a variety of formats and tools, including the BDI (various approaches, including mean change and amount reaching certain thresholds), CES-D (mean values and amount reaching certain threshold), HADS, Montgomery–Åsberg Depression Rating Scale, POMS, and Women’s Health Questionnaire depression subscale. Because of the diversity of measurements and treatment patterns, only pairwise comparisons were possible, most informed by single studies. There is very low certainty that hypnosis reduces depression as compared to a waitlist (Elkins et al., 2008). Among women with breast cancer and men with prostate cancer, cognitive behavioral therapy may reduce depression (measured with HADS) as compared to placebo, but it is unlikely (very low certainty of evidence) (Stefanopoulou et al., 2015). Among women with breast cancer, cognitive behavioral therapy may reduce depression (measured with Women’s Health Questionnaire; low certainty of evidence) (Mann et al., 2012). Yoga is unlikely to change depression when compared to placebo (measured with HADS; low certainty of evidence) (Cramer et al., 2015). Acupuncture is unlikely to affect depression as compared to placebo (low certainty of evidence) (Bao et al., 2014).
The authors identified eight RCTs that reported on the outcome of sleep quality (Bao et al., 2014; Biglia et al., 2009; Boekhout et al., 2011; Carson et al., 2009; Chen et al., 2014; Elkins et al., 2008; Mann et al., 2012; Stearns et al., 2005). Because of the diversity of measurements and treatment patterns, only pairwise comparisons were possible based mainly on single studies. Based on one study, paroxetine may offer benefits as compared to placebo for patients with breast cancer (Stearns et al., 2005); however, the benefits of venlafaxine and clonidine are unclear (Boekhout et al., 2011). Based on one small study, gabapentin may offer greater benefits than vitamin E in patients with breast cancer (Biglia et al., 2009). Based on small single studies, evidence suggested no clear benefits of acupuncture over sham acupuncture (Bao et al., 2014), statistically significant gains in sleep quality with melatonin as compared to placebo (Chen et al., 2014), reduced sleep problems in those receiving cognitive behavioral therapy as compared to usual care (Mann et al., 2012), improvements in sleep disturbance attained with yoga as compared to no therapy (Carson et al., 2009), and improvements in sleep achieved with hypnosis as compared to no treatment (Elkins et al., 2008). Hypnosis may improve sleep compared to placebo (low certainty of evidence) (Elkins et al., 2008).
The authors identified five RCTs that reported on sexual function; these evaluated the following interventions: cognitive behavioral therapy, exercise, venlafaxine, clonidine, paroxetine, and fluoxetine (Boekhout et al., 2011; Duijts et al., 2012; Loprinzi et al., 2000, 2002; Stearns et al., 2005). Because of the diversity of measurements and treatment patterns, only pairwise comparisons were possible.
Among studies of pharmacologic interventions, improvement in sexual function from venlafaxine, clonidine, or paroxetine as compared to placebo is unlikely (very low certainty of evidence). Fluoxetine may improve sexual function as compared to placebo (very low certainty of evidence); however, it is unlikely. Among the studies reporting on nonpharmacologic interventions (cognitive behavioral therapy and exercise), the combination of cognitive behavioral therapy with exercise may improve sexual function as compared to no intervention; however, it is unlikely (very low certainty of evidence).
Reporting of adverse events related to the interventions varied by type of outcome reported and measure of the outcome. In addition, the reporting of adverse events raised concerns with selective reporting because many studies failed to report disaggregated data, instead narratively stating whether or not there were differences between the two arms. Because of the diversity of measurements and treatment patterns, only pairwise comparisons were possible.
The most common adverse events reported were constipation, headache, nausea, and fatigue/sleepiness. When compared to placebo, no statistically significant difference was found for sertraline, electroacupuncture, gabapentin, melatonin, soy, or vitamin E (Barton et al., 1998; Chen et al., 2014; Kimmick et al., 2006; MacGregor et al., 2005; Mao et al., 2015; Wu et al., 2009). For the outcome of constipation, one small trial suggested lower risk with venlafaxine as compared to placebo (Boekhout et al., 2011). The remaining interventions (i.e., soy, black cohosh, clonidine, electroacupuncture, gabapentin, and sertraline) showed no statistically significant difference from placebo for the outcome of constipation (Jacobson et al., 2001; Kimmick et al., 2006; MacGregor et al., 2005; Mao et al., 2015; Van Patten et al., 2002). When compared to placebo, sertraline, soy, clonidine, venlafaxine, and vitamin E did not suggest a difference in risk for nausea (Barton et al., 1998; Boekhout et al., 2011; Kimmick et al., 2006; MacGregor et al., 2005; Van Patten et al., 2002; Wu et al., 2009). No study suggested a difference between clonidine, electroacupuncture, gabapentin, melatonin, sertraline, soy, or venlafaxine and placebo for the outcome of fatigue (Barton et al., 1998; Boekhout et al., 2011; Chen et al., 2014; Kimmick et al., 2006; Mao et al., 2015). Acupuncture as compared to placebo may not increase risk of fatigue (low certainty of evidence) (Mao et al., 2015). Review of data related to headache occurrence showed no statistically significant differences between pairs of therapies with available data.
To the authors’ knowledge, this is the first systematic review to incorporate network meta-analyses for the comparison of nonhormonal therapies to manage hot flashes in patients who have a history of breast or prostate cancer. In patients with breast cancer, hormonal therapies for hot flashes are generally contraindicated, unlike in the general population, where these treatments are often offered in the first-line setting (Harris et al., 2020). Given these unique challenges, clinicians and nurses must use alternative treatment strategies, with limited evidence guiding selection of interventions. A total of 40 trials met eligibility criteria, 36 being conducted with patients with breast cancer. As demonstrated by the network diagrams, the patterns of treatment comparisons for the outcomes studied were sparse, with most involving inactive control groups and interventions assessed by few studies in the context of generally small trials. The a priori outcomes of interest were not consistently assessed or reported among all trials, and measurement scales and reporting formats were also varied. Based on these obstacles, the ability to compare therapies within inclusive network meta-analyses was limited, the certainty of evidence was judged to be low, and no interventions were considered to be supported by strong evidence for use in the management of hot flashes in the target population.
Network meta-analyses were feasible in the current review to assess relative changes in hot flash frequency and composite hot flash score. However, the evidence base for these analyses was limited, and additional relevant interventions and data disconnected from networks also warranted consideration; the ability to consider recommendations among all previously studied therapies is, therefore, compromised. Network meta-analyses for the endpoints of hot flash frequency and composite hot flash score were primarily focused on pharmacologic and natural health product interventions. Psychological and physical therapies required assessment from other study data. Findings from network meta-analyses generally suggested that most therapies in the analyses (primarily antidepressants and natural health products) offered benefits relative to no treatment but provided little to favor certain active therapies over others. In addition, tolerability data were limited, creating additional challenges for physicians and nurses when trying to counsel patients on expected side effects for these interventions. Trial data external to the network meta-analyses for these endpoints, as well as data on changes in hot flash severity, provided support for other therapies (e.g., cognitive behavioral therapy, yoga, acupuncture/electroacupuncture), which again suggested benefits relative to no treatment. These findings generally align with findings from other reviews that have studied subsets of clinically relevant interventions (Chien et al., 2017; Guo et al., 2019; Li et al., 2019; Shan et al., 2019; Tao et al., 2017) and have noted benefits associated with a broad range of therapies, including selective serotonin reuptake inhibitors (SSRIs) and serotonin-norepinephrine reuptake inhibitors (SNRIs), neuroleptic agents, acupuncture, hypnosis, cognitive behavioral therapy, soy, and omega-3 supplementation.
They also align with the American Cancer Society/American Society of Clinical Oncology breast cancer care survivorship guideline (Runowicz et al., 2016) that recommended that physicians offer SSRIs/SNRIs, gabapentin, lifestyle modification (e.g., vitamins, exercise, rhythmic breathing, reductions of alcohol and caffeine intake), and/or environmental modifications (e.g., layered dressing, cool rooms) for management of vasomotor symptoms.
Although past evidence supports that many forms of therapy may help patients in reducing hot flash frequency and severity relative to no treatment, there is a clear need to enhance research in this area in multiple ways to improve decision-making ability and the development of recommendations in the future. These include enhanced reporting of study population characteristics in clinical trials to better understand to whom findings apply; the development of a core outcome set (Williamson et al., 2012) in this area to guide the design and reporting of future RCTs, which may standardize measurements related to hot flashes while encouraging more regular assessment of other outcomes, such as generic quality of life and measures related to key symptoms (e.g., sleep quality, depression, sexual function); and enrichment of the treatment comparisons made in future RCTs to improve the robustness of evidence available for meta-analyses in the future.
Strengths of the current review include the use of rigorous review methods, including a robust search of the literature, detailed appraisal of the evidence, a thorough analysis plan, and an assessment of the certainty of the evidence. However, several limitations should also be noted. First, the included trials were all assessed to be at a high or unclear risk of bias, bringing into question the validity of findings generated from a synthesis of their outcome data. Second, from the perspective of evaluating between-study heterogeneity of patient populations, many characteristics, including prior/concomitant treatments, baseline duration of hot flashes, and duration of time since cancer treatment, were unreported by many studies, limiting the ability to compare populations based on aggregate baseline information and to assess appropriateness of the transitivity assumption for network meta-analysis (Donegan et al., 2013; Salanti, 2012). Third, reporting of study findings was mixed in terms of the scales used to assess different outcomes and the reporting format (e.g., mean change from baseline, percentage change from baseline). The ability to perform meta-analyses was limited, and simple narrative descriptions were often necessary; the development of core endpoint sets (Boers & Kirwan, 2017; Williamson et al., 2017) and efforts among researchers to establish consistent reporting are needed. Fourth, the degree of confidence in the treatment comparisons presented varies according to gaps in available direct information, and overall evidence for active therapies was sparse. Future studies in this area should be designed to fill gaps in the currently available set of studied treatment comparisons in areas of clinical relevance in the network and should ensure thorough outcome assessment, detailed summary of patient demographics, and transparent reporting of methods.
Fifth, vitamin E and gabapentin plus antidepressant were the only two interventions in the evidence networks informing network meta-analyses that had not been compared directly with placebo; therefore, caution in the interpretation of related findings is warranted. Finally, the approach to network meta-analysis in the current review was adjusted from the original study protocol to employ an RoM approach; this was done to maximize the ability to include outcome data reported in the encountered range of reporting formats.
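To illustrate what a ratio of means expresses, the sketch below computes an RoM and an approximate interval from arm-level summary statistics. It is a generic frequentist illustration that assumes the standard delta-method variance on the log scale; it is not the Bayesian network model used in this review, and it yields a confidence interval rather than a credible interval. The function name `ratio_of_means` and the input values are hypothetical.

```python
import math


def ratio_of_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Return (RoM, lower, upper) for mean_a / mean_b.

    The interval is built on the log scale with the delta-method variance
    sd^2 / (n * mean^2) summed over both arms, then back-transformed.
    """
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("means must be positive for a ratio of means")
    rom = mean_a / mean_b
    se_log = math.sqrt(sd_a ** 2 / (n_a * mean_a ** 2)
                       + sd_b ** 2 / (n_b * mean_b ** 2))
    half_width = z * se_log
    return rom, rom * math.exp(-half_width), rom * math.exp(half_width)


if __name__ == "__main__":
    # Hypothetical arm summaries (mean, SD, n); not taken from any included trial.
    rom, lower, upper = ratio_of_means(6.0, 3.0, 50, 3.0, 2.0, 50)
    print(f"RoM = {rom:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Because the ratio is unitless, results measured on different scales or reported in different formats can be placed on a common footing, which is consistent with the rationale given above for adopting this approach.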
An important additional limitation of this review specific to the prostate cancer population should be noted. Studies evaluating hormonal interventions were excluded from consideration in this review, with the aim of ensuring uniform study selection criteria across breast cancer and prostate cancer trials. It is acknowledged, however, that hormonal therapies (particularly progestational agents such as medroxyprogesterone) are commonly used in the treatment of hot flashes in patients with prostate cancer receiving ADT and have been studied in a large randomized trial (Irani et al., 2010). The exclusion of hormonal interventions does partially limit the generalizability of the findings of this review to the prostate cancer population.
Nurses and other healthcare providers working with patients treated for breast or prostate cancer should ensure that patients are educated on the potential side effects of treatment, including hot flashes. Patients should understand that hot flashes are common, can affect quality of life, and can be a challenge to manage. Patients should notify their nurse or healthcare provider if they experience hot flashes, and clinicians should assess patients at high risk. Working together, patients and clinicians can be proactive to manage this challenging side effect of treatment (Kaplan & Mahon, 2014; Qan’ir et al., 2019).
In the current review, treatment comparisons derived from network meta-analyses and narrative review of individual trials generally highlight the presence of small differences between interventions; therefore, there is limited evidence and considerable uncertainty to guide the selection of certain interventions over others for management of vasomotor symptoms. It is challenging for physicians and nurses to provide specific suggestions regarding management of hot flashes based on the available evidence when asked by patients. At this time, for patients who are motivated to try an intervention to relieve their symptoms, physicians and nurses may wish to focus on those in which tolerability risks are felt to be minimal. Additional trials that are of high methodologic quality, are better reported, and add to the robustness of currently available networks are needed. The development of a structured set of outcomes for measurement in future research, ideally established by methodologists, clinical experts, patients, and stakeholders, would also enhance the design of future clinical trials.
The authors gratefully acknowledge Fatemeh Yazdi, MSc, for her support during data collection.
Brian Hutton, MSc, PhD, is a senior scientist at Ottawa Hospital Research Institute and an assistant professor in the Department of Epidemiology and Community Medicine at the University of Ottawa, both in Ontario, Canada; Mona Hersi, MSc, is a senior clinical research associate, Wei Cheng, PhD, is a senior methodologist in the Clinical Epidemiology Program, Misty Pratt, MES, is a research assistant III, Pauline Barbeau, MSc, is a clinical research associate, Sasha Mazzarello, MSc, BSc, is a research assistant, Nadera Ahmadzai, MD, MPH, MSc, is a senior methodologist, and Becky Skidmore, BA(H), MLS, is a consulting librarian to the Knowledge Synthesis Unit, all at Ottawa Hospital Research Institute in Ontario, Canada; Scott C. Morgan, MD, MSc, FRCPC, is a radiation oncologist at Ottawa Hospital Research Institute and assistant professor in the Division of Radiation Oncology in the Department of Radiology at the University of Ottawa; Louise Bordeleau, MD, MSc, FRCPC, is an associate professor and medical oncologist in the Department of Oncology at McMaster University in Hamilton, Ontario, Canada; Pamela K. Ginex, EdD, RN, OCN®, is the senior manager of evidence-based practice and inquiry at the Oncology Nursing Society in Pittsburgh, PA; Behnam Sadeghirad, PharmD, MPH, PhD, is an assistant professor in the Department of Health Research Methods, Evidence, and Impact and a methodologist/biostatistician in the Michael G. DeGroote Institute for Pain Research and Care, and Rebecca L. Morgan, PhD, MPH, is an assistant professor in the Department of Health Research Methods, Evidence, and Impact, both at McMaster University; Katherine Marie Cole, MDCM, is a resident physician in medical oncology at the Ottawa Hospital Cancer Centre at the University of Ottawa; and Mark Clemons, MD, is a medical oncologist at the Ottawa Hospital Research Institute, in the Department of Epidemiology and Community Medicine at the University of Ottawa, and in the Division of Medical Oncology in the Department of Medicine at the University of Ottawa. This work was funded by a Canadian Institutes of Health Research (CIHR) Knowledge Synthesis Award and by the National Center for Complementary and Integrative Health of the National Institutes of Health (No. R24AT001293). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funders. The funders had no role in the development of the protocol or final study report. Hutton has previously received honoraria from Eversana (previously Cornerstone Research Group) for the provision of methodologic advice related to systematic reviews and meta-analysis. Hutton, Mazzarello, Skidmore, S.C. Morgan, Bordeleau, and Clemons contributed to the conceptualization and design. Hersi, Pratt, Barbeau, Mazzarello, Ahmadzai, Skidmore, and Clemons completed the data collection. Hutton, Cheng, and Clemons provided statistical support. Hutton, Cheng, Pratt, S.C. Morgan, Bordeleau, Ginex, Sadeghirad, R.L. Morgan, and Clemons provided the analysis. Hutton, Hersi, Cheng, Mazzarello, Ahmadzai, Skidmore, S.C. Morgan, Bordeleau, Ginex, Sadeghirad, R.L. Morgan, Cole, and Clemons contributed to the manuscript preparation. Hutton can be reached at bhutton@ohri.ca, with copy to ONFEditor@ons.org. (Submitted January 2020. Accepted March 26, 2020.)