A Review of Exhaled Volatile Organic Compounds as Biomarkers for Thoracic Malignancies

Lung cancer, malignant pleural mesothelioma, and esophageal cancer are the most common thoracic malignancies and are responsible for substantial cancer-related morbidity and mortality worldwide. Early cancer identification prompts earlier intervention and can therefore improve patient survival. Traditional diagnostics are costly and invasive, however, creating an urgent need for alternative methods. Over the past 30 years, breath analysis has emerged as a rapid, minimally invasive, and cost-effective approach. Metabolites in exhaled breath, known as volatile organic compounds (VOCs), reflect internal biomolecular processes, and their composition has been shown to vary in association with numerous pathological states. This review provides an overview of the use of VOCs in exhaled breath for the early screening and diagnosis of thoracic malignancies. Study design, methodology, and significant results from over sixty studies published since 1990 are specified and summarized. A total of 439 significant VOCs are reported in the literature, mainly consisting of aromatic compounds, aldehydes, alkanes, lipids, ketones, and sulfur-containing compounds. Diagnostic sensitivities and specificities range from 51-100% and 68.8-100%, respectively. Cancer-specific VOC profiles and associations of clinical interest (e.g., comorbidities, histology, and staging) are emphasized and discussed. While there is considerable evidence to support the diagnostic utility of VOCs, the lack of standardization and of external validation in large independent cohorts remains a key barrier to clinical translation. However, efforts to address these limitations are currently underway.


Introduction
Thoracic malignancies are aggressive neoplasms of uncontrolled cell growth that originate within the chest cavity. Lung cancer, malignant pleural mesothelioma, and esophageal cancer are the most common thoracic malignancies. Early screening and detection can reduce mortality in all three cancers [1][2][3]. Current methods of investigation rely on a combination of blood, imaging, and tissue sampling techniques that are invasive, inaccurate, and/or costly [3][4][5][6][7][8]. Over the past few decades, volatile organic compounds (VOCs) in exhaled breath have emerged as a potentially quick, safe, noninvasive biomarker for detecting lung cancer, esophageal cancer, and malignant pleural mesothelioma. Prior reviews have considered these three malignancies separately; overall VOC patterns between lung cancer, esophageal cancer, and malignant pleural mesothelioma have not been explored.
The aim of this review is to consolidate the current knowledge on exhaled VOCs in the early screening and diagnosis of thoracic malignancies. Literature published between January 1990 and May 2020 is included. In the following sections, we will summarize breath composition and findings of available studies on lung cancer, esophageal cancer, and mesothelioma. Clinically relevant VOCs will be emphasized, along with sensitivities and specificities. Focused reviews on study design and methodology of the three respective cancers can be found elsewhere. We will conclude with a discussion on general VOC patterns between all three thoracic cancers along with potential biological mechanisms.
The most abundant VOCs in human breath are acetone, methanol, ethanol, propanol, and isoprene [18]. On average, a single breath sample contains around 200 different VOCs [19]. Changes in VOC composition and/or concentration can therefore signal underlying pathology and be used to guide clinical practice. To date, breath VOCs have shown clinical value in the diagnosis and management of numerous disease states, including inflammatory bowel disease (IBD) [20], infectious diseases [21][22][23], asthma [24][25][26], cystic fibrosis [25], chronic obstructive pulmonary disease (COPD) [25,26], Alzheimer's disease [27], and various cancers [28][29][30].
The use of VOCs for lung cancer screening and detection has been investigated in dozens of studies over the past 30 years. More recently, VOC biomarker studies on esophageal cancer and malignant pleural mesothelioma have begun to surface as well. However, despite the sizeable body of literature, a consensus list of validated VOCs does not exist.

Methodologic Principles
VOC studies primarily concern themselves with two distinct but related outcomes. Characterization studies seek to discover VOCs significantly associated with cancer, while diagnostic studies evaluate the ability of VOC profiles to determine the presence of cancer. While the latter approach is of greater clinical interest, the approaches are complementary and a majority of studies employ both.
Many techniques and technologies have been developed over the years to enhance VOC extraction and analysis [31].

Breath Sampling
Exhaled breath is typically either collected and analyzed immediately or stored in containers for transport to a VOC detector. Given the presence of trace VOC concentrations, breath samples often undergo preconcentration with solid-phase microextraction (SPME) to improve VOC detection. The most common VOC containers are sampling bags (e.g., Tedlar, Mylar). Sampling bags are cheap, chemically inert, and readily interface with numerous other lab and clinical equipment for preconcentration and/or detection. However, sampling bags are susceptible to leakage, UV degradation, and water condensation.
Once exhaled breath samples are collected and preconcentrated, they undergo VOC detection and profiling.

VOC Detection
VOC detection is primarily based on mass spectrometry and electronic nose (e-nose) technologies (Figure 2). Gas chromatography-mass spectrometry (GC-MS) is the current gold standard for VOC analysis. Compounds in exhaled breath are separated by gas chromatography and ionized in the mass spectrometer. VOCs can then be identified based on their mass/charge ratios. GC-MS allows for chemical identification and quantification of VOCs.
However, GC-MS is slow, expensive, and labor intensive, and it requires skilled operation. E-nose technology is rapidly emerging as a more practical alternative for VOC detection and essentially relies on pattern recognition. Sensors react to VOCs in exhaled breath and produce electrical signals, which combine to generate a composite pattern of surrounding air components, or "breathprint," that can be further analyzed.
Unlike GC-MS, e-nose sensor arrays cannot selectively characterize or identify individual VOCs. However, sensor arrays are cheaper, portable, and provide real-time results, making them more suitable for point-of-care use than GC-MS.
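To make the idea of breathprint pattern recognition concrete, the following is a minimal sketch of one simple approach, a nearest-centroid classifier. It is an illustration only and is not the method of any study reviewed here: the four-sensor array, the response values, and the function names `centroid` and `classify` are all hypothetical, and real e-nose pipelines typically involve many more sensors, preprocessing, and cross-validation.

```python
from math import dist
from statistics import fmean


def centroid(rows):
    """Mean response per sensor across a list of breathprints."""
    return [fmean(column) for column in zip(*rows)]


def classify(breathprint, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(breathprint, centroids[label]))


# Hypothetical, made-up responses from a 4-sensor array (arbitrary units).
training = {
    "cancer": [[0.82, 0.40, 0.13, 0.55], [0.78, 0.45, 0.10, 0.60]],
    "control": [[0.30, 0.42, 0.50, 0.20], [0.35, 0.38, 0.55, 0.25]],
}
centroids = {label: centroid(rows) for label, rows in training.items()}

# Classify a new (also made-up) breathprint by its overall pattern.
new_sample = [0.75, 0.41, 0.15, 0.52]
predicted = classify(new_sample, centroids)
print(predicted)
```

Note that the classifier never learns which compound drives which sensor; it separates groups on the aggregate pattern alone, which is why this kind of analysis cannot name individual VOCs.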

VOC Analysis
VOC analysis typically compares the breath composition of cancer and benign subjects to characterize malignancy using two main approaches: (1) VOC identification, and (2) VOC patterning. VOC identification uses GC-MS to generate a differential panel of known and discrete VOCs, while VOC patterning uses e-nose to generate a discriminant breathprint from unknown and aggregate VOCs. In either case, rigorous statistical analyses can be used to develop a VOC biosignature of malignant disease and train diagnostic models.

Thoracic Malignancies & VOCs
Our review included 66 publications since 1990 using exhaled breath VOCs to detect thoracic cancer.
In general, studies sought to identify VOCs uniquely associated with thoracic cancer (n = 39), and/or test the ability of VOCs to determine malignancy (n = 61). Within included studies, lung (n = 53, Table 1), mesothelioma (n = 7, Table 2), and esophagogastric (n = 6, Table 3) cancers were the primary cancers of interest. One study investigated VOCs as biomarkers for both lung cancer and mesothelioma. Studies typically compared patients with thoracic cancer against a healthy control population and/or patients with benign conditions. Cancer cohorts were often of mixed histological subtype and staging. MS-based techniques were the most common analytical platforms (n = 42 studies), followed by electronic sensor-based techniques (n = 28).
The ability of VOCs to determine malignancy was tested in 48 studies. Sensitivity and specificity ranged from 51-100% and 13.0-100%, respectively. Diagnostic VOC models, ranging from 1 to 500 discrete VOCs, were described and evaluated in 26 studies. Five studies tested the ability to diagnose lung cancer (LC) using a single VOC. Song et al. [75] showed lung cancer could be diagnosed using two univariate models: 1-butanol with 95.3% sensitivity and 85.4% specificity, and 3-hydroxy-2-butanone with 93% sensitivity and 92.7% specificity. In a study with 233 LC cases and 140 controls by Wang et al. [80], a diagnostic model based on heneicosane achieved a sensitivity of 75.6%, specificity of 78.9%, and overall accuracy of 76.7%. Oguma et al. [59] collected breath samples from 116 LC cases and 37 controls and analyzed 14 VOCs with gas chromatography. Using cyclohexane alone, they achieved a sensitivity of 53% and a specificity of 78% for the diagnosis of lung cancer. Using xylene alone, they achieved a sensitivity of 49% and a specificity of 86%. Using either cyclohexane or xylene, they achieved a sensitivity of 75% and a specificity of 78%. Corradi et al. [39] examined breath samples from 71 LC cases and 67 controls, which included patients with comorbidities and pulmonary nodules. Trans-2-nonenol identified lung cancer with a sensitivity of 60.6% and a specificity of 62.7%. In subgroup analyses of smoking exposure, sensitivity and specificity increased to 84.6% and 83.3% in patients with less than 10 pack-years exposure, and 75% and 73.3% in patients with less than 30 pack-years exposure. In a study by Molina et al. [58], p-cresol was able to discriminate between LC cases and healthy (H) controls with 77.9% sensitivity and 74.2% specificity.
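Two pieces of arithmetic behind these figures can be sketched in a few lines. First, when two single-VOC tests are combined with an "either marker positive" rule, as in the cyclohexane-or-xylene model, the combined sensitivity cannot fall below the better single-marker sensitivity and the combined specificity cannot exceed the worse single-marker specificity. Second, overall accuracy is the case/control-weighted average of sensitivity and specificity. The function names below are hypothetical, and the accuracy check assumes all 233 cases and 140 controls contributed to the reported rates, which the source does not state.

```python
def or_rule_bounds(sens_a, spec_a, sens_b, spec_b):
    """Bounds for a test that is positive when marker A OR marker B is positive.

    Without knowing how the two markers correlate, only a range is known.
    """
    sens_range = (max(sens_a, sens_b), min(1.0, sens_a + sens_b))
    spec_range = (max(0.0, spec_a + spec_b - 1.0), min(spec_a, spec_b))
    return sens_range, spec_range


def accuracy_from_rates(sensitivity, specificity, n_cases, n_controls):
    """Overall accuracy as the weighted average of sensitivity and specificity."""
    expected_correct = sensitivity * n_cases + specificity * n_controls
    return expected_correct / (n_cases + n_controls)


# Cyclohexane (53%/78%) and xylene (49%/86%) as reported for the single-VOC models.
sens_range, spec_range = or_rule_bounds(0.53, 0.78, 0.49, 0.86)
print("combined sensitivity must lie in", sens_range)
print("combined specificity must lie in", spec_range)

# Heneicosane model: 75.6% sensitivity, 78.9% specificity, 233 cases, 140 controls.
acc = accuracy_from_rates(0.756, 0.789, 233, 140)
print(f"implied accuracy: {acc:.3f}")
```

Under these assumptions, the reported combined result (75% sensitivity, 78% specificity) falls inside the permitted ranges, and the implied accuracy of the heneicosane model is within about 0.1 percentage point of the reported 76.7%; the small gap may simply reflect rounding.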

Comorbid Cohorts
Different disease states have been shown to alter the composition and concentration of VOCs in exhaled breath. Therefore, the presence of chronic conditions may complicate VOC detection and diagnosis of lung cancer. In addition to lung cancer, patients may also have hypertension, heart disease, and diabetes mellitus. Lung disease is particularly common and typically includes COPD, pulmonary nodules, and asthma. Pulmonary nodules may be benign or malignant, and early detection has been associated with improved survival [98].
i. Pulmonary Nodules
Four studies examined the ability to determine malignancy in the presence of pulmonary nodules (PNs). Peled et al. [60] collected breath samples from 53 patients with malignant PNs and 19 patients with benign PNs with similar smoking histories and comorbidities. GC-MS analysis identified a significantly higher concentration of 1-octene in the breath of LC patients, and the nanoarray distinguished between malignant and benign PNs with 86% sensitivity and 96% specificity. Similarly, Fu et al. [43] correctly distinguished malignant PNs from benign PNs with 89.8% sensitivity and 81.3% specificity. Broza et al. [34] evaluated over two dozen breath samples from 17 patients undergoing resection for suspicious PNs and were able to detect LC with 100% sensitivity and 80% specificity. Likewise, Shlomi et al. [74] were able to detect LC in a cohort of suspicious PNs with 75% sensitivity and 93% specificity.
One study examined the ability to predict the presence of suspicious pulmonary nodules on low-dose CT. Using a single unidentified VOC, Phillips et al. [62] predicted biopsy-proven lung cancer with 75.4% sensitivity and 85% specificity, and presence of suspicious pulmonary nodules with 80.1% sensitivity and 75.0% specificity.
ii. Comorbidities
Three studies aimed to characterize LC in the presence of comorbidities. In 2005, Poli et al. [67] found significantly higher levels of 2-methylpentane and isoprene and significantly lower levels of ethylbenzene and styrene in NSCLC patients vs COPD patients. Molina et al. [58] screened LC vs CM with 70% sensitivity and 61.1% specificity using 2,6-bis-1,1-dimethylethyl-4-(1-methyl-1-phenylethyl)phenol. Wang et al. [80] examined breath samples from 484 subjects and were able to distinguish between LC and CM groups with a 0.701 AUC using 1,2,4-trimethylbenzene. The authors concluded that comorbidities may significantly interfere with the selection of VOC markers for LC diagnosis.
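For readers less familiar with the AUC figure quoted above, it can be read as the probability that a randomly chosen case has a higher marker value than a randomly chosen control. The sketch below computes it directly from that definition; the concentration values are made up for illustration and are not taken from any study, and `auc` is a hypothetical helper name.

```python
def auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has the
    higher value; ties count as half a win.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical breath concentrations of a single VOC (arbitrary units).
cases = [4.1, 3.2, 5.0, 2.7, 3.9]
controls = [2.5, 3.0, 3.5, 2.0, 4.0]

print(f"AUC = {auc(cases, controls):.2f}")
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect separation, so a value near 0.70 indicates only modest discrimination between the two groups.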
Ten studies examined the ability to diagnose LC in the presence of comorbidities. Seven studies reported an overall decrease in diagnostic accuracy and two studies reported an overall increase. All studies reported a decrease in specificity, and five studies reported an increase in sensitivity. Dragonieri et al. [42] correctly recognized LC with an overall accuracy of 90% amongst H subjects, which decreased to 85% amongst CM subjects. D'Amico et al. [40]
In summary, the presence of comorbidities appears to increase the sensitivity, decrease the specificity, and decrease the overall accuracy of lung cancer diagnosis. Consequently, VOCs may hold greater potential for screening than diagnosis in patients with comorbidities.

Classification
Lung cancer is classified by histological and molecular subtype. The type of lung cancer can have significant implications for patient prognosis and therapeutic management [99]. The two major histological classifications are small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). NSCLC accounts for roughly 85% of LC cases and mainly consists of adenocarcinoma (ADC), squamous cell carcinoma (SQC), and large cell carcinoma (LCC) [100]. NSCLC can be further subdivided into molecular subtypes based on the presence of genetic mutations, e.g. KRAS, TP53, Epidermal Growth Factor Receptor (EGFR), Anaplastic Lymphoma Kinase (ALK).
i. Histology
Two studies distinguished histological subtypes from controls. Oguma et al. [59] found that concentrations of cyclohexane and xylene were significantly higher in patients with ADC or SCLC than in controls, though the difference between controls and patients with SQC was less marked. Handa et al. [47] analyzed 115 VOC peaks from exhaled breath samples of 50 LC patients (including 32 ADC) and 39 healthy subjects. n-Dodecane was able to separate adenocarcinoma and healthy subjects with a sensitivity of 81.3% and a specificity of 89.7%.
Three studies distinguished histological subtypes from each other. Song et al. [75]
Two studies sought to distinguish molecular subtypes from controls. In a study by Handa et al. [47], LC patients with EGFR mutations had significantly higher levels of n-dodecane compared to LC patients without EGFR mutations, with a sensitivity of 85.7% and a specificity of 78.6%. In a study by Shlomi et al. [74], LC patients with EGFR mutations were discriminated from LC patients without EGFR mutations with a sensitivity of 79% and specificity of 85%.

Stage
The extent of lung cancer is most commonly described using the international TNM-based staging system. Stages range from one to four (I through IV) and take into account the size of the primary tumor, the involvement of regional lymph nodes, and the presence or absence of distant metastatic spread. The lower the stage, the less the cancer has spread.
Three studies distinguished the stages of lung cancer from healthy controls. One approach distinguished between various stages and controls. Fu et al. [43] found the exhaled breath concentration of 2-butanone significantly higher in patients with stage I LC. A related study by Oguma et al. [59] found concentrations of cyclohexane and xylene increased significantly as the clinical stage of lung cancer advanced. In stage III patients, concentrations of xylene were significantly elevated. In stage IV patients, concentrations of cyclohexane and xylene were significantly elevated.
Three studies distinguished the stages of lung cancer from each other. Fu et al. [43] found the exhaled breath concentration of 2-butanone significantly lower in patients with stage I LC compared to patients with stages II-IV LC. Corradi et al. [39] detected significantly elevated concentrations of ethylbenzene in stages III+IV compared to stages I+II. Oguma et al. [59] found concentrations of cyclohexane and xylene increased significantly as the clinical stage of lung cancer advanced.
Two studies evaluated the ability to diagnose lung cancer by stages. A diagnostic model by Oguma et al. [59] was able to determine lung cancer with 75% sensitivity and 78% specificity, which decreased slightly to 73% sensitivity and 78% specificity for early lung cancer. Lu et al. [54] analyzed 214 breath samples by e-nose and was able to correctly identify LC with 94.2% sensitivity and 92.8% specificity. The authors detected stage II with 97.9% sensitivity and 70% specificity, stage III with 82.8% sensitivity and 81.8% specificity, and stage IV with 83.2% sensitivity and 81.6% specificity.
Three studies evaluated the ability to diagnose early stage lung cancer versus late stage. Shehada et al. [73] was able to correctly distinguish early stage LC from late stage LC with 34.5% sensitivity and 95% specificity. Mazzone et al. [56] was able to correctly distinguish early stage LC from late stage LC with 81% sensitivity and 73% specificity. Peled et al. [60] achieved excellent distinction between early and advanced stages of NSCLC with 86% sensitivity and 88% specificity.

Malignant Pleural Mesothelioma
A total of 7 studies on MPM and VOCs were reviewed.
The ability of VOCs to determine malignancy was tested in 7 studies. Sensitivity and specificity ranged from 80-91% and 80-89.5%, respectively. Diagnostic VOC models, ranging from 2 to 32 VOCs, were described and evaluated in 5 studies. Two studies characterized MPM with respect to healthy controls. de Gennaro et al. [86] found cyclohexane alone to be highly discriminant for MPM against H controls.

Asbestos Exposure
Asbestos refers to a group of naturally occurring mineral fibers historically used in building materials and fabrics. Asbestos exposure (AEx) is a known cause of lung and pleural disease: inhaled asbestos fibers lead to oxidative stress and stimulate a protracted immune reaction that damages the pleura [101][102][103].
Two studies characterized MPM in the presence of asbestos exposure. In 2010, de Gennaro et al. [86] found cyclohexane to be highly discriminant for MPM against AEx controls. Lamote et al. [89] found cyclohexane and limonene highly discriminant for MPM against AEx controls.
Five studies assessed the ability to diagnose MPM in the presence of asbestos exposure. All three studies reported an increase in diagnostic accuracy and specificity. Two studies reported a decrease in sensitivity. A 3-way classification model of MPM vs. H vs. AEx by de Gennaro et al. [86] was able to discriminate for MPM with 84.6% sensitivity and 100% specificity. Gilio et al. [88] correctly predicted cancer with 100% specificity and 50% specificity on a prospective independent cohort of 5 AEx subjects. Lamote et al. first distinguished MPM from H with 96% sensitivity and 67% specificity, which increased overall to 87% sensitivity and 86% specificity against the AEx group, and decreased overall to 87% sensitivity and 70% specificity against the pooled B group [90]. In a subsequent study, Lamote et al. distinguished MPM from H controls with 89% sensitivity and 42% specificity, which increased to 87% sensitivity and 90% specificity against the AEx cohort [91]. One study reported an increase in sensitivity. Lamote et al. [89] obtained 67% sensitivity and 64% specificity against H subjects using an electronic sensor array, which increased overall to 80% sensitivity and 64% specificity against AEx subjects. Using GC-MS, the authors obtained 64% sensitivity and 79% specificity against H subjects, which increased to 93% sensitivity and 100% specificity against AEx subjects.
In summary, diagnostic accuracy and specificity for MPM appear to increase given a history of asbestos exposure. These results suggest that VOC screening may yield lower false positive rates in patients with a history of asbestos exposure.
Four studies tested the ability to diagnose MPM in the presence of asbestos-related disease (ARD), including fibrosis and/or asbestos plaques. Three studies report an increase in diagnostic accuracy in the presence of ARD, while two studies report a decrease in diagnostic accuracy. Three studies report an increase in specificity while one study reports a decrease in specificity. Three studies report no changes in sensitivity, while one study reports an increase in sensitivity. Chapman et al. [85] correctly classified MPM from H groups with 90% sensitivity and 91% specificity; in a three-way classification of MPM, ARD, and H, the authors achieved 90% sensitivity and 88% specificity. Dragonieri et al. [87] correctly classified MPM with 85% accuracy against H subjects, which decreased to 81% accuracy against ARD subjects and 80% accuracy in a 3-way classification model between MPM, ARD, and H groups. Lamote et al. [91] were able to detect MPM from H controls with 89% sensitivity and 42% specificity, which increased overall to 89% sensitivity and 73% specificity against the ARD cohort. Lamote et al. [89] obtained 67% sensitivity and 64% specificity against H subjects using an electronic sensor array, which increased overall to 75% sensitivity and 64% specificity against ARD subjects. Using GC-MS, the authors obtained 64% sensitivity and 79% specificity against H subjects, which increased overall to 79% sensitivity and 80% specificity against ARD subjects. In a blinded validation cohort of 5 AEx individuals, a model by Gilio et al. [88] misclassified two patients found to have pleural plaques, suggesting a potential diagnostic challenge.
Two studies examined the ability to diagnose MPM in the presence of asbestos-related disease and asbestos exposure (ARD + AEx). Given the combination of ARD + AEx, one study reports a decrease in accuracy compared to AEx alone. Another study reports mixed results: a decrease in accuracy using GC-MS, and an increase in accuracy using e-nose. Both studies report an increase in diagnostic accuracy compared to ARD alone. Lamote et al. [91] were able to distinguish MPM from ARD with 89% sensitivity and 73% specificity, which increased overall to 94% sensitivity and 80% specificity against a pooled cohort of AEx + ARD subjects. In a follow-up study, Lamote et al. [89] used GC-MS to discriminate MPM from ARD with 79% sensitivity and 80% specificity, which increased overall to 100% sensitivity and 91% specificity when both AEx and ARD groups were pooled. Using electronic sensor technology, the authors discriminated MPM patients from ARD with 75% sensitivity and 64% specificity. When AEx and ARD patients were pooled, diagnostic performance increased overall to 82% sensitivity and 55% specificity. The authors found diethyl ether and nonanal to be highly discriminant for MPM vs. AEx + ARD.
One study examined the determination of MPM in the setting of pulmonary disease. Lamote et al. [91] sampled breath from 330 participants with a history of asbestos exposure, asbestos-related disease, and/or pulmonary disease (e.g., COPD, cystic fibrosis, and lung cancer). The authors detected MPM versus a pooled AEx + ARD cohort with 94% sensitivity and 80% specificity, which decreased overall to 71.2% and 87% versus a CM cohort and 73% and 71% versus a LC cohort, respectively.
In summary, diagnostic accuracy and specificity for MPM appear to increase in the presence of asbestos-related disease and may be further increased by a history of asbestos exposure.

Esophageal Cancer
A total of 6 studies on EC and VOCs were reviewed. VOCs uniquely associated with EC were identified in 4 studies. A total of 35 VOCs were reported as significant biomarkers for EC, mainly consisting of aldehydes (~28.6%), aromatic compounds (~28.6%), and fatty acids (~20.0%) (Figure 3). Notably, alkanes and ketones were not detected by any study. The most common VOCs were phenol, hexanoic acid, methyl phenol, ethyl phenol, decanal, butanal, pentanoic acid, and butyric acid, each reported in at least 2 studies. Of note, cancer groups typically included both esophageal and gastric adenocarcinoma; these cancers are considered comparable subtypes that have frequently been grouped together in neoadjuvant chemotherapy trials. Markar et al. [95] analyzed concentrations of 5 highly predictive VOCs and found no significant differences between esophageal and gastric cancer patients.
The ability of VOCs to determine malignancy was tested in 6 studies. Sensitivity and specificity ranged from 80-91% and 80-89.5%, respectively. Diagnostic VOC models, ranging from 4 to 8 discrete VOCs, were described and evaluated in 6 studies. Kumar et al. [93] noted that exhaled breath concentrations of methanol, hexanoic acid, phenol, and methyl phenol were clearly elevated in subjects with EC compared to H controls. A multicenter study by Markar et al. [95] was able to diagnose esophagogastric cancer with 80% sensitivity and 81% specificity.

Comorbidities
One study sought to characterize EC in the presence of comorbidities. Kumar et al. [93] noted elevated breath concentrations of hexanoic acid, phenol, methyl phenol, and ethyl phenol in the EC group vs CM group.
Two studies tested the ability to diagnose EC in the presence of comorbidities. In the aforementioned study by Kumar et al., subsequent ROC analysis found the same 4 VOCs were able to discriminate EGC from CM with an AUC of 0.91. A subsequent study from the same team was able to discriminate between EC vs. H cohorts with 98% sensitivity and 91.7% specificity [94]. Between EC vs. B cohorts, sensitivity and specificity decreased to 87.5% and 82.9%, respectively.

Barrett's Esophagus
Barrett's esophagus is a precursor to esophageal cancer and its early identification has been associated with improved survival [104].
Two studies examined the ability to diagnose Barrett's esophagus (BE). In 2016, Chan et al. [92] analyzed 66 BE patients and 56 B controls with an electronic sensor and achieved 82% sensitivity and 81% specificity. More recently, a prediction model by Peters et al. [96] correctly differentiated between EC and H controls with 57% sensitivity and 67% specificity.
One study examined the ability to diagnose Barrett's Esophagus in the presence of comorbidities. In the same Peters et al. [96] study mentioned above, sensitivity and specificity increased to 64% and 74% versus CM groups and 91% and 74% versus B groups, respectively.

Stage
The extent of esophageal cancer is described using an international TNM-based staging system similar to lung cancer.
One study evaluated the ability to diagnose esophageal cancer by stages. Zou et al. [97] analyzed breath samples from 29 EC cases and achieved a discriminant accuracy of 100%, 71%, 86%, and 93% for stage I, II, III, and IV, respectively. However, the study population was relatively small (1, 7, 7, and 14 patients in stage I, II, III, and IV, respectively), so the stage-specific accuracy estimates carry considerable uncertainty.
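The effect of these small group sizes can be quantified with a confidence interval. The sketch below uses the Wilson score interval, a standard choice for proportions from small samples. The counts of correctly classified patients (1/1, 5/7, 6/7, 13/14) are inferred here from the rounded percentages and group sizes and are an assumption, as is the 95% level; the study itself may have computed accuracy differently.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# (correct, total) per stage; the "correct" counts are inferred, not reported.
stage_counts = {"I": (1, 1), "II": (5, 7), "III": (6, 7), "IV": (13, 14)}

for stage, (correct, total) in stage_counts.items():
    low, high = wilson_interval(correct, total)
    print(f"stage {stage}: {correct}/{total} correct, 95% CI {low:.2f}-{high:.2f}")
```

With a single stage I patient, for example, the interval spans most of the possible range, so the reported 100% accuracy for that stage is compatible with much poorer true performance.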

Discussion
Interest in the use of exhaled breath VOCs to detect thoracic malignancies is growing. Almost half of all studies included in our review were published within the past 5 years alone. In addition, electronic sensors account for the majority of lung cancer studies published within the same time period and are gaining popularity over GC-MS as the VOC detection method of choice. The reliable performance of electronic sensor arrays, combined with their many practical strengths, makes them an appealing technology for point-of-care screening and diagnosis. Concurrently, recent studies are less likely to identify specific VOCs. This trend reflects a broader and more holistic shift in our approach to exhaled breath VOCs as complex, dynamic, and multidimensional biomarkers for malignant disease that are unlikely to be adequately characterized by discrete singular compounds. Still, identification of VOCs remains useful for elucidating metabolic pathways of disease, enriching our understanding of underlying cellular processes, and generating hypotheses for novel therapeutic interventions.
Exhaled breath VOCs in thoracic malignancies typically consisted of aromatics, aldehydes, alkanes, lipids, ketones, and sulfur-containing compounds. Aromatic compounds are usually considered to be pollutants from exogenous sources, including cigarette smoke, alcohol, and pollution. Most lung cancer patients have a long smoking history, and some studies found that certain aromatic compounds increased in the breath of smoking patients versus nonsmokers [50,105]. These pollutants may result in peroxidative damage to polyunsaturated fatty acids (PUFA), proteins, and DNA, leading to age-dependent diseases [106]. Aldehydes in the body come from 4 major sources [16]: (1) oxidation of fatty acids, where aldehyde concentrations have been found to increase during inflammation and oxidative stress; (2) ethanol metabolism, where ethanol is oxidized by alcohol dehydrogenase to acetaldehyde, which is subsequently oxidized by aldehyde dehydrogenase to acetate; (3) tobacco metabolism, where their formation is catalyzed by cytochrome P450 as part of the detoxification process; and (4) cigarette smoke (e.g., formaldehyde, acetaldehyde (ethanal), propanal, butanal). In patients with esophageal cancer, genetic dysregulation of aldehyde metabolism has been observed [95]. Alkanes are largely produced by oxidation of fatty acids, particularly during the peroxidation of polyunsaturated fatty acids (e.g., ethane, pentane) [16]. Protein oxidation and fecal flora may also yield alkanes (e.g., propane, butane) [107]. Lipids are required for membrane synthesis, so elevated levels may reflect accelerated cell proliferation. The upregulation of fatty acids has been reported in esophageal cancer tissue [108,109]. Ketones, like other hydrocarbons, are mainly generated via fatty acid oxidation. During times of fasting or starvation, hepatocytes produce acetone via decarboxylation of excess acetyl-CoA secondary to lipid peroxidation by cytochrome P450. In cachexia, which typically accompanies illnesses such as cancer, protein metabolism increases and results in higher levels of ketone bodies. In addition to endogenous ketones, some occur naturally in the environment and are absorbed by the body (e.g., 2-butanone) [16]. The presence of volatile sulfur compounds (e.g., dimethyl sulfide, dimethyl disulfide, methanethiol) is largely the result of incomplete metabolism of methionine [107].

Conclusion
There is considerable evidence to support the use of exhaled breath VOCs to detect thoracic malignancies. A variety of breath sampling and analytical techniques have been able to determine the presence of lung cancer in particular. Similar studies on malignant pleural mesothelioma and esophageal cancer have begun to emerge within the past 10 years with comparable and equally encouraging results.
Despite their promise, exhaled VOCs have not been translated into routine clinical practice. Results vary widely between studies. Studies tend to manage covariates poorly and differ widely in methodology, making it difficult to generate consensus. Further, the lack of externally validated multicenter studies on independent cohorts remains a critical issue. In order to enhance reproducibility and facilitate the transition of exhaled VOCs into a clinical setting, the European Respiratory Society has published recommendations for standardization of sampling, analysis, and reporting of data [110]. Commercially available breath sampling devices offer several advantages over traditional sampling methods and may also further improve experimental validity [111]. The recently developed ReCIVA breath sampler is a quick and convenient device that is repeatable and provides the researcher with added control and functionality [112]. The use of standardized instrumentation to diagnose thoracic cancers via exhaled VOCs is an active area of research; two major ongoing studies funded by the United Kingdom National Health Service are of particular interest [113,114]. The Lung Cancer Indicator Detection (LuCID) study is an international multicenter prospective case-control cohort study which aims to identify an exhaled VOC biosignature that can accurately diagnose lung cancer. Exhaled breath will be sampled from up to 4000 patients with clinical suspicion of lung cancer. The PAN-cancer Early Detection (PAN) study is a prospective cross-sectional observational case-control study of up to 1500 participants evaluating whether breath VOCs can accurately distinguish between individuals with and without different cancer types, including esophagogastric cancer. The LuCID and PAN studies are the largest of their kind and may finally provide the insights necessary to move forward with a reliable and non-invasive biomarker for thoracic cancers.