Clinical efficacy of chloroquine derivatives in COVID-19 infection: comparative meta-analysis between the big data and the real world
In periods of large epidemics such as the current coronavirus disease 2019 (COVID-19) pandemic, information spreads very fast with different levels of reliability, including fake news, press releases, preprints and peer-reviewed published reports. In addition, there appears to be competition between low-cost generic medications that are potentially effective against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and very expensive new drugs that are not yet approved, which raises financial and organizational issues, stakeholders’ expectations and administrative/policy complexity. This may lead to positions that are not driven only by science and public health.
In this context, we aimed to conduct a meta-analysis on the effects of chloroquine derivatives (i.e. hydroxychloroquine (HCQ) or chloroquine) in individuals with COVID-19, based on all available information from preprints and peer-reviewed published reports. For preprints, we asked two reviewers of our team to provide an open review of the content (see Supplementary material, Appendix S1) and we considered the comment of an external scientist. We were surprised to find major discrepancies between study conclusions, ranging from dramatic clinical improvement to dramatic increase in mortality rates under chloroquine-derivative treatment. We sought to understand what could explain such differences. We recently discussed the fact that it does not make sense to investigate a summary effect when inconsistent studies and unexplained heterogeneity make the average effect difficult to interpret and potentially misleading. Hence, we first investigated the differential characteristics of studies showing a very favourable effect of the treatment and of studies showing a clearly deleterious effect.
First, we found that a clear standardized protocol for treatment and follow up was detailed in studies conducted by clinicians (clinical studies), whereas it was completely lacking in studies conducted by public-health experts on a large number of patients whose data were extracted from electronic medical records (big data). We have already pointed out the limitations of these ‘big data’ analyses in relation to clinical inaccuracy.
Adequate timing (early administration versus delayed administration), dosage, screening of contraindications, adjuvant measures and monitoring following standardized protocols are critical in the benefit–risk ratio of any drug against infectious diseases. Based on our 30 years of experience treating hundreds of patients with Q fever endocarditis and Whipple’s disease with HCQ 600 mg/day (200 mg three times per day) [5,6], we know that this drug is effective with negligible side effects when compared to the fatal outcome of both diseases. Chloroquine derivatives (and paracetamol) can be used to commit suicide by overdose and may be fatal, at therapeutic dosage, when contraindications and adjuvant measures are not carefully followed. In this context, it is expected that studies using double-dose HCQ (1200 mg/day) in COVID-19 would be associated with toxicity. Accordingly, we investigated whether a well-described treatment protocol, including dosage, for at least 48 hours was associated with an improved outcome.
From our seminal study, we observed an improved efficacy of the combination of HCQ and azithromycin when compared with HCQ alone. A synergistic effect was confirmed by in vitro studies. This led us to change our standardized protocol by shifting from a mono-therapy to a combined therapy. This combination could not be neglected in the treatment of COVID-19 and was therefore also analysed in the present study.
In the context of a pandemic with an unknown virus, development of new drugs is a major opportunity for the ‘big pharma’ industry, and this is potentially associated with a very high risk of conflicts of interest. This led us to consider these conflicts of interest as a moderator variable in the present work. As major financial issues are at stake, and may impact the interpretation of scientific data, we felt it was important to mention that none of us has a conflict of interest with any pharmaceutical company.
We performed this meta-analysis taking into account three important moderator variables: clinical studies or studies based on electronic registry data analysis (big data), studies based on a mono-therapy (chloroquine derivatives) or a combined therapy (HCQ with azithromycin), and finally studies where authors had potential conflicts of interest and where authors had no conflicts of interest. In the context of the current pandemic, providing a timely and critical analysis of available data on this topic seems appropriate to us, from a public-health perspective.
We conducted a meta-analysis of studies evaluating the effects of chloroquine derivatives against SARS-CoV-2 in groups of patients with COVID-19 as compared with control groups of patients who did not receive chloroquine derivatives. In these studies, groups were expected to be similar with respect to demographics, chronic conditions, clinical presentation at enrolment and use of other antiviral drugs during the course of the disease. The keywords ‘hydroxychloroquine’, ‘chloroquine’, ‘coronavirus’, ‘COVID-19’ and ‘SARS-Cov-2’ were used in the PubMed, Google Scholar and Google search engines without any restrictions as to date or language. Preprints were also included. Open reviews and reviewer’s recommendations regarding preprints are available in the Supplementary material (Appendix S1). Articles published in peer-reviewed journals, preprints and articles available on the internet, even when not published on official websites, were included.
The following outcomes were considered: hospitalization rate, duration of cough, duration of fever, clinical cure, lymphocyte count, C-reactive protein level, interleukin-6 level, thoracic CT scan, worsening to severe symptoms, death, transfer to intensive care unit (ICU), ventilation, length of hospital stay and persistent viral shedding as assessed by PCR.
Only studies in which a group of COVID-19 patients treated with a chloroquine derivative was compared with a control group without chloroquine derivatives were included. Non-comparative (single-arm) studies and studies comparing two groups treated with chloroquine derivatives at different dosages or with different delays of treatment were excluded.
Studies were classified as ‘big data’ studies when conducted on electronic medical records extracted by public-health specialists and epidemiologists who did not care for COVID-19 patients themselves. Conversely, studies were classified as ‘clinical studies’ when they mentioned details of treatments (e.g. dosages, duration, contraindications, monitoring) and where the authors were physicians (infectious diseases and internal medicine specialists, and pulmonologists) who cared for COVID-19 patients themselves. Conflicts of interest were retrieved from author statements in the article. Another check was performed using the Euros for Docs (https://www.eurosfordocs.fr/) and Dollars for Docs (https://projects.propublica.org/docdollars/) websites. We considered that there was a conflict of interest when funding by the pharmaceutical industry exceeded €50 000 over 7 years.
Studies were classified as ‘Pro’, when at least one comparison reported a significant improvement, and none was associated with a significant deleterious effect in the treated group. Studies were classified as ‘Con’ when none of the comparisons reported a significant favourable outcome and/or at least one comparison reported a significant deleterious outcome.
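The classification rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Comparison` record, its field names and the function name are hypothetical, and significance is taken at the p < 0.05 threshold used elsewhere in this work. Because the ‘Con’ definition is phrased with ‘and/or’, the sketch treats ‘Con’ as the complement of ‘Pro’.

```python
from dataclasses import dataclass
from typing import List

ALPHA = 0.05  # significance threshold stated in the statistical methods


@dataclass
class Comparison:
    """One treated-versus-control comparison (hypothetical record layout)."""
    outcome: str       # e.g. "death", "duration of fever"
    favourable: bool   # True if the direction of effect favours the treated group
    p_value: float


def classify_study(comparisons: List[Comparison]) -> str:
    """Return 'Pro' or 'Con' for one study.

    'Pro': at least one significant favourable comparison and no
    significant deleterious comparison. Everything else is 'Con'.
    """
    sig_favourable = any(c.favourable and c.p_value < ALPHA for c in comparisons)
    sig_deleterious = any((not c.favourable) and c.p_value < ALPHA for c in comparisons)
    return "Pro" if sig_favourable and not sig_deleterious else "Con"
```

Under this reading, a study with one significant benefit and one significant harm is classified as ‘Con’, as is a study with no significant comparison at all.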
The meta-analysis was performed with a random-effects model using Comprehensive Meta-Analysis v3 (Biostat, Englewood, NJ, USA) as recommended by Borenstein et al. This software made it possible to include dichotomous outcomes (number of events out of the total) and quantitative outcomes (mean in each group, sample size, p-value). Heterogeneity was considered substantial when I2 >50%. A p-value <0.05 was considered significant. A heat map analysis was performed to test a possible clustering between Pro and Con studies, clinical and big data study design, well-described treatment protocol and not described treatment protocol, and conflict of interest and no conflict of interest, using XLSTAT v2020.2.2 (Addinsoft, Paris, France).
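For readers who wish to see what pooling dichotomous outcomes and computing I2 involve, a minimal sketch follows. It is not the software used in this work. It assumes a random-effects model with the DerSimonian-Laird (method of moments) estimator of between-study variance, odds ratios as the effect measure, and a 0.5 continuity correction when a cell is zero; all function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def log_odds_ratio(a: float, b: float, c: float, d: float) -> Tuple[float, float]:
    """Log odds ratio and its variance from a 2x2 table.

    a, b: events / non-events in the treated group
    c, d: events / non-events in the control group
    Assumption: 0.5 is added to every cell if any cell is zero.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dl_random_effects(effects: List[Tuple[float, float]]) -> Dict[str, float]:
    """Pool (log effect, variance) pairs with DerSimonian-Laird weights."""
    y = [e for e, _ in effects]
    w = [1 / v for _, v in effects]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0          # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0      # heterogeneity, in %
    w_star = [1 / (v + tau2) for _, v in effects]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "pooled_or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

With this sketch, a returned `i2` above 50 corresponds to the ‘substantial heterogeneity’ threshold stated above.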
Twenty-three comparative studies were screened. Three studies were excluded because they compared two groups treated with a chloroquine derivative (high versus low dose and combination therapy with or without zinc). As a result, 20 studies were identified involving 105 040 individuals (19 270 patients treated with a chloroquine derivative, including 11 247 in combination with a macrolide) from nine countries (Brazil, China, France, Iran, Saudi Arabia, South Korea, Spain and the USA) (see Supplementary material, Table S1). The 20 studies included eight published papers, nine preprints published on MedRxiv, one preprint published on preprints.org and two available on the internet (uniform resource locator (url) provided in the Supplementary material, Table S2). All but two papers, in Chinese and French, were written in English. The Chinese study was translated and included.
We noted that registry studies based on electronic medical records did not mention the dosage or included several dosages of the chloroquine derivatives used [, , , , ]. We found that in several studies, patients used several molecules with established or potential antiviral properties. For instance, in China and Iran almost all patients used multiple antivirals: lopinavir/ritonavir, oseltamivir, entecavir, ribavirin, umifenovir and nebulization of interferon aerosol. In eight studies [15,, , , , , , ] patients were given the combined therapy that we have recommended (HCQ and azithromycin combination). Four randomized controlled trials (RCTs) were included in this analysis [14,, , ].
We observed major methodological pitfalls in some studies. Lymphopenia, a marker of severity, was significantly more frequent in the treated group in one study. In another study, eight patients received HCQ in the ‘untreated’ group. In this study, none of the 15 patients treated with combined therapy (HCQ + azithromycin) died or were transferred to the ICU, and the difference was significant with the untreated control group. Strikingly, this was not analysed because it was not prespecified in the study protocol. In another work, all results reporting a favourable effect of HCQ in the first version of the preprint on alleviation of symptoms and C-reactive protein were removed in the final preprint version and in the published version of the article. Finally, the largest study that has been carried out is impossible to analyse because there is no notification of hospital sources or referral to any physician. It is not known if the authors of this study saw a single patient infected with SARS-CoV-2.
Big data and clinical studies were perfectly discriminated by unsupervised clustering
As we observed that several studies reported a clear favourable effect [15,, , ,25,26,30,, , , ] but others reported no effect [14,16,17,19,24,29] or a clear deleterious effect, we first performed an unsupervised clustering analysis including the following variables: ‘Pro’/‘Con’ studies, ‘big data’ versus ‘clinical studies’, ‘detailed’ or ‘absence of detailed treatment’, presence or absence of a conflict of interest (Fig. 1).
In this unsupervised analysis, only the variable ‘big data’ versus ‘clinical’ studies yielded a perfect clustering. None of the other variables (conflict of interest, Pro/Con, detailed treatment) provided a perfect clustering. We subsequently investigated whether each of these parameters was significantly associated with a favourable or unfavourable effect.
All ‘big data’ studies reported a lack of beneficial effect of the treatment and were significantly more likely to be associated with ‘Con’ variable (5/5 versus 3/15, p 0.004). This was also observed by examination of the meta-analysis forest plot (Fig. 2, see Supplementary material, Tables S3 to S8). In addition, both ‘conflicts of interest’ (p 0.01) and ‘not described treatment protocol’ variables (p 0.004) were associated with the ‘Con’ variable. Conversely, clinical studies were more likely to report a favourable effect of chloroquine derivatives in individuals with COVID-19 (p < 0.05). Consistently, clinical studies with detailed treatment protocol were more likely to be associated with the observation of a favourable effect of the treatment (p < 0.05).
Conflicts of interests are linked to part of the biases in favour of Con
We found four studies with author conflicts of interest (Fig. 1; see Supplementary material, Table S1). The ‘Conflicts of interest’ variable was associated with big data studies (3/5 versus 1/15, p < 0.05) and had a negative direction of treatment effect (p < 0.05, Fig. 1).
Direct care of patients (clinical versus big data) explains the direction of effect
We first tested whether the studies involving direct care of patients (clinical studies performed by physicians who took care of patients) were associated with a different direction of effect compared with ‘big data’ studies (Fig. 2). The visual examination of the forest plot clearly showed that ‘big data’ studies reported no effect [16,17,19,20] or a deleterious effect. In contrast, several clinical studies reported significant favourable effects, notably regarding hospitalization rate, duration of fever [25,33], duration of cough [23,25], clinical cure [15,30], C-reactive protein levels, interleukin-6 levels, thoracic CT-imaging, length of hospital stay [23,26], death or ICU transfer [22,32], death [34,35] and persistent viral shedding [9,23,33].
We compared the proportion of comparisons reporting significant differences according to treatment. In the big data analyses, four comparisons reported a significant effect, and all were deleterious. In the clinical studies, 17 comparisons reported a significant effect, and all were beneficial. The difference was highly significant (4/4 versus 0/17, bilateral mid-P exact test, p 0.00016). This was also supported by the significant heterogeneity between the two subgroups (big data versus clinical studies, mixed effect analysis, Q-value 51.8, p < 0.001).
Three of four RCTs reported a significant favourable effect
Four RCTs were included [14,, , ,30,31]. All were performed in China. Three of them reported significant favourable effects. Chen Z et al. reported a significant favourable effect on duration of fever, duration of cough and thoracic CT imaging. Huang et al. reported a significant reduction of length of hospital stay [26]. Interestingly, Tang et al. reported in the first version of their preprint a significant favourable effect on alleviation of symptoms (post hoc analysis) and C-reactive protein reduction (subgroup with baseline increased C-reactive protein), but these results were removed in the final published version of the manuscript [27,31]. This was requested by editors and reviewers from the British Medical Journal (open review), where the final version was published, because this was not prespecified in the study protocol. In addition, they were concerned about the justification of including these secondary outcome results and post hoc analyses from an under-powered sample size (due to early termination). This is surprising because a lack of power may be associated with a risk of not finding a difference when there is one, but not with a risk of finding a difference when there is none. None of these RCTs reported a significant deleterious effect.
Effect of chloroquine derivatives without azithromycin
As several studies addressed the effectiveness of the combination of chloroquine derivatives with a macrolide, specifically azithromycin, we tested whether the favourable clinical effect (observed in clinical studies) remained after exclusion of comparisons with combination therapy (see Supplementary material, Fig. S1). A favourable effect was still observed for duration of cough (n = 1, point estimate 0.12, p 0.001), duration of fever (n = 2, point estimate 0.05, p 0.002), clinical cure (n = 2, point estimate 0.48, p 0.022), C-reactive protein levels (n = 1, point estimate 0.55, p 0.045), interleukin-6 levels (n = 1, point estimate 0.43, p 0.002) and death (n = 3, point estimate 0.31, p < 0.001). Interestingly, the effect was not significant for persistent viral shedding (n = 7, point estimate 0.51, 95% CI 0.20–1.33, p 0.17).
Outcomes with a significant summary effect in clinical studies
We found a favourable summary effect on duration of cough (n = 2, point estimate 0.19, 95% CI 0.09–0.42, p 0.00003; I2 = 0%), duration of fever (n = 3, point estimate 0.11, 95% CI 0.01–0.90, p 0.039; I2 = 91%, p < 0.001), clinical cure (n = 3, point estimate 0.21, 95% CI 0.05–1.0, p 0.0495; I2 = 81%, p < 0.001) and death (n = 4, point estimate 0.32, 95% CI 0.19–0.52, p 4.1 × 10−6; I2 = 0%, p 0.71; see Supplementary material, Table S9). A trend for the outcome ‘death or ICU transfer’ was also noted (n = 3, point estimate 0.29, 95% CI 0.08–1.10, p 0.069; I2 = 85%, p < 0.002) with a point estimate very similar to that observed for the death outcome (0.3, i.e. a three-fold decrease in the risk of ICU transfer and/or death). For persistent viral shedding, ten comparisons were included, with a significant favourable summary effect (n = 10, point estimate 0.43, 95% CI 0.20–0.92, p 0.031; I2 = 75%, p < 0.001).
Chloroquine derivatives present a paradox. On the one hand, the heterogeneity of patients and treatment schemes makes it difficult to obtain a clear picture while the epidemic is still ongoing. On the other hand, despite controversy, only chloroquine derivatives have been used by physicians on a large-scale basis as a treatment for COVID-19. According to the Sermo Real Time Covid-19 Barometer (https://www.sermo.com/, consulted 27 May), for over 20 000 physicians across 30 countries, chloroquine derivatives are the first medication used to treat COVID-19 patients in ICUs (43%; except oxygen, anti-clotting/anticoagulants, steroids and norepinephrine) and in other hospital settings (52%; except oxygen), and the second in outpatient settings (33%, after AZ and similar antibiotics).
Indeed, we were challenged by the major discrepancies between the results of the various published studies and our experience at the IHU, where 7800 electrocardiograms were performed in 4000 patients. To understand which elements could lead to contradictory results, we compared the results of studies carried out by clinicians (real world) and those carried out by database analysts (virtual world of big data; Fig. 1). The clinical studies used a standardized treatment protocol with methods that included assessment of contraindications, daily dosage, adjuvant measures and duration of treatment, with at least 48 hours of treatment before the objective could be assessed. For example, assessment of kalaemia and electrocardiogram is critical before treatment, especially when the chloroquine derivative is combined with azithromycin. At the same time, we observed that virtual big data studies did not mention these elements and considered the presence of a chloroquine derivative prescription in electronic records in a binary fashion. Obviously, the number of patients included in the database analyses was much higher than the number of patients included in the clinical studies, because these databases are made up of thousands of electronic medical records. As mentioned in the past, this type of study has tremendous statistical power but is limited by clinical inaccuracy that makes its conclusions difficult to believe.
We cannot believe that in some series up to 8% of deaths are due to cardiac rhythm disorders, whereas the team of cardiologists specializing in heart rhythms who analysed all the electrocardiograms performed in the IHU (our centre) for 4000 patients did not observe any, except for an increase in QTc, which justified stopping treatment in only three individuals. Under these conditions, we thought that people who actually observed the patients had a very different perception of the results from people who had not observed the patients but only analysed recorded observations. The major finding of this study is that, overall, there is an extremely significant difference between the analyses of data not collected directly by the doctors who cared for the patients and the studies carried out by the physicians who set up these studies and cared for patients, including the randomized studies. The second finding is that in the studies conducted on electronic records, the treatment is never really specified; without the dosage and duration of treatment, it is impossible to assess efficacy (dose too low) or toxicity (dose too high). In addition to this major bias, we also noted a significant bias when the authors had conflicts of interest due to their relationship with industrialists trying to market molecules in the same therapeutic framework competing with HCQ.
For discrepancies in published data, favourable evidence for chloroquine derivatives is sometimes censored by the journal (open review of Tang’s randomized controlled trial, published in the British Medical Journal [27,30,31]). For the article by Mahevas et al., one of us (DR) had contact with one of the authors (B Godeau), who told him that it was the methodologist (P Ravaud) who did not want to carry out the statistical tests demonstrating the superiority of dual therapy over the control group (death or transfer to ICU, 0/15 versus 16/63, bilateral mid-P exact test p = 0.02).
Overall, and as previously published, the relevance of the analysis of important medical data depends on clinical accuracy. Indeed, the discrepancy between clinicians and epidemiologists reflects a major trend, that of the analysis of large medical data, with a data warehouse more or less well filled by individuals who are not directly involved in the work reported. This analysis is unrelated to the observations made by physicians who are in direct contact with patients, and the two approaches lead to divergent interpretations and opposite conclusions. These divergences are of real interest and show that the world predicted by Baudrillard, a parallel world of numerical analysis completely disconnected from reality, is being born.
Under these conditions, a meta-analysis allowing for the combination of different studies makes it possible to identify a general trend. This makes it possible to reconcile the chloroquine derivative efficacy that many doctors have perceived with the results of the first published studies. This meta-analysis is based on several studies, including four RCTs, and identifies a favourable trend toward the benefit of chloroquine derivatives in the treatment of individuals with COVID-19, enabling us to make a grade I recommendation for its use against the disease. The retraction of the only big data study associated with a significantly deleterious effect the day after (June 5, 2020) the acceptance of the present work (June 4, 2020) confirms the relevance of this work.
This work was funded by ANR-15-CE36-0004-01 and by ANR Investissements d’avenir, Méditerranée Infection 10-IAHU-03, and was also supported by Région Provence-Alpes-Côte d’Azur. This work also received financial support from the Mediterranean Infection Foundation.
Declaration of competing interest
The authors declare no competing interests. Funding sources had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; and preparation, review or approval of the manuscript. Our group used widely available generic drugs distributed by many pharmaceutical companies.
We thank Christian Devaux for helpful interactions and Fanyu Huang for Chinese to English translation of the study by Chen J et al.
M. Million and P. Gautret are equal first co-authors.