Determining the Bio-Based Carbon Content of Surfactants

In response to a mandate from the European Commission, the European Committee for Standardization (CEN) called on the technical committee CEN/TC 276 to develop a European standard (EN 17035) to define bio-based surfactants and enable quantification of the bio-based carbon content of surfactants based on radiocarbon analyses. This analytical approach was tested through directly contracted analyses and through a round robin procedure at commercial facilities in Europe. Initial results were unsatisfactory and further investigation identified issues surrounding the degree of homogenization in the samples. In general, the samples were only homogeneous at the gram level while the maximum quantity of material that could be introduced to the analytical process was at the milligram level. Once the root cause of the discrepancies between measured and expected results had been identified, new samples were sent to six European laboratories. The results were satisfactory, indicating linearity and accuracy across the measurement range.


Introduction
Surfactants are the major active ingredient in most detergent formulations such as laundry liquids and powders as well as dishwashing liquids; they may also be an important component of personal care products such as shampoos and body washes (Mudge et al., 2019). Typical compounds present in such formulations include cationic, anionic, or amphoteric surfactants along with nonpolar surfactants (Freeling et al., 2019). As part of the molecular structure, the surfactants usually have an alkyl chain that imparts the hydrophobic component of the molecule. These are principally carbon chains in the region of 10-18 carbons in length (Mudge et al., 2019). The source of these carbon atoms may be from fossil carbon such as crude oil or natural gas (through gas to liquid technologies) or from oleochemical sources such as palm kernel oil (PKO) or coconut oil. These sources are usually classified as petrogenic for the fossil carbons or bio-based for those derived from recent plant or animal sources (Hill, 2000). Examples of common surfactants with long alkyl chains are shown in Fig. 1. In each case, the alkyl chain may be derived from either petrochemical or bio-based sources; the resulting compounds are functionally identical and have the same chemical properties within the product and environmental fate.
Within the EU, standardized determination technologies for bio-based content are based on measuring the carbon content alone, for instance, as defined in the European Norm EN 17035, "Surface active agents - Bio-based surfactants - Requirements and test methods". Among the elemental constituents of surfactants, it is principally the carbon atoms that can be sourced either from fossil sources such as oil or gas or from recent biological material such as plants (Gaubert et al., 2016). Carbon atoms form the majority of the backbone of the surfactant molecules and the number of carbons in the molecule will alter its properties such as water solubility (see Fig. 1 for examples). The major part of the molecular weight of most surfactants is also derived from the carbon content (e.g., a 12-carbon alcohol ethoxylate with 7 ethoxylate sub-units weighs 494 g mol⁻¹ with 63% of this derived from the carbon atoms).
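The molecular weight and carbon share quoted for the 12-carbon, 7-ethoxylate example can be checked with a few lines of arithmetic. The sketch below assumes the generic alcohol ethoxylate structure CnH(2n+1)(OCH2CH2)mOH and rounded standard atomic masses; the function names are illustrative only, and the result agrees with the quoted figures to within rounding.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def alcohol_ethoxylate_atoms(alkyl_carbons, eo_units):
    """Atom counts for CnH(2n+1)-(OCH2CH2)m-OH (assumed generic structure)."""
    return {
        "C": alkyl_carbons + 2 * eo_units,
        "H": (2 * alkyl_carbons + 1) + 4 * eo_units + 1,
        "O": eo_units + 1,
    }


def molar_mass(atoms):
    """Molar mass in g/mol from a dict of element counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in atoms.items())


def carbon_mass_fraction(atoms):
    """Fraction of the molar mass contributed by carbon atoms."""
    return ATOMIC_MASS["C"] * atoms["C"] / molar_mass(atoms)


c12e7 = alcohol_ethoxylate_atoms(alkyl_carbons=12, eo_units=7)
print(f"molar mass: {molar_mass(c12e7):.1f} g/mol")
print(f"carbon share of mass: {100 * carbon_mass_fraction(c12e7):.1f}%")
```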
In 2011, the European Commission adopted Mandate M/491 in relation to developing the bio-based economy in the area of surfactants and solvents. CEN (European Committee for Standardization) was tasked with developing technical reports and technical specifications (TS) in these areas and CEN/TC 276 was asked to address the surfactant component of this mandate. One of the aspects that requires a TS is the analytical method needed to quantify the amount of bio-based carbon within a surfactant. It should be noted that the poly-ethoxylate shown in Fig. 1 may have the alkyl chain derived from bio-based carbon but the ethoxylates (CH₂CH₂O) are typically derived from petrochemical sources (Trujillo-Cayado et al., 2014). Therefore, these compounds do not have 0% or 100% bio-based carbon content but some intermediate value. It is possible to calculate what those values might be based on knowledge of the supply chain and an example of the content is shown in Fig. 2. Typical ethoxylate distributions may range from 1 to 20 units but tend to peak around 7-9 (Cohen et al., 2001; Wind et al., 2006). If it is assumed that each ethoxylate number from 0 to 20 is equally abundant for any given bio-based alkyl chain, the mean bio-based carbon content for the resulting mixture ranges from 44.9% for C₁₂ to 49.5% for C₁₅.
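The mean values quoted for C12 and C15 can be reproduced by counting carbon atoms. This is a minimal sketch, assuming that every alkyl carbon is bio-based, that each ethoxylate unit adds two petrochemical carbons, and that the mean is a simple unweighted average over ethoxylate numbers 0 to 20; the function names are illustrative.

```python
def bio_based_carbon_fraction(alkyl_carbons, eo_units):
    """Share of carbon atoms that are bio-based in one ethoxylate molecule.

    Assumes a fully bio-based alkyl chain and fully petrochemical
    ethoxylate units (two carbons each).
    """
    return alkyl_carbons / (alkyl_carbons + 2 * eo_units)


def mean_bio_based_percent(alkyl_carbons, max_eo=20):
    """Unweighted mean over ethoxylate numbers 0..max_eo, as a percentage."""
    fractions = [
        bio_based_carbon_fraction(alkyl_carbons, n) for n in range(max_eo + 1)
    ]
    return 100 * sum(fractions) / len(fractions)


for chain_length in (12, 13, 14, 15):
    print(f"C{chain_length}: {mean_bio_based_percent(chain_length):.1f}%")
```

A real ethoxylate distribution peaks around 7-9 units rather than being flat, so a weighted average over a measured distribution would give a different figure.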
There are two major analytical approaches that might be used to determine the quantity of bio-based carbon present within any sample:
• Stable isotopes. The carbon and hydrogen isotopes present within the surfactant molecule will reflect the initial carbon source and any transformations they may have undergone since formation. The δ¹³C for many crude oils is in the range of −23 to −28‰. This may be compared to −26 to −36‰ for terrestrial plant matter and −20 to −26‰ for unicellular algae (Mudge et al., 2012). Since these ranges overlap, this may not be definitive in separating out the two sources. However, the δ²H values for petrochemical compounds are typically −40 to −90‰ while those surfactants derived from oleochemical sources, usually PKO, have δ²H values in the range of −250 to −320‰ (Gaubert et al., 2016; Mudge et al., 2012).
• Radiocarbon (¹⁴C). Along with the ¹³C stable isotope, a very small proportion of the carbon in a molecule will be of the naturally occurring radioactive form, ¹⁴C, also called radiocarbon. The ¹⁴C atoms are formed in the upper atmosphere due to interactions between cosmic rays and nitrogen atoms. The natural abundance of ¹⁴C in compounds is around 1 part per trillion. This radioactive carbon isotope decays with a half-life of 5730 years such that after six half-lives, it is functionally undetectable in a sample. Carbon compounds that are derived from fossil sources such as oil or gas will contain no radiocarbon, as it will have decayed away during the millions of years needed to make such reserves. This contrasts with recently grown plant-based materials that do contain measurable amounts of ¹⁴C.
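The contrast between fossil and recent carbon follows directly from the exponential decay law. The sketch below uses only the 5730-year half-life given above; the function name and the example ages are illustrative choices, not values from this study.

```python
HALF_LIFE_YEARS = 5730.0  # half-life of 14C as quoted in the text


def fraction_remaining(age_years):
    """Fraction of the original 14C still present after age_years."""
    return 0.5 ** (age_years / HALF_LIFE_YEARS)


# Illustrative ages: one half-life, six half-lives, and a fossil-scale age.
for label, age in [
    ("one half-life", HALF_LIFE_YEARS),
    ("six half-lives", 6 * HALF_LIFE_YEARS),
    ("one million years", 1_000_000),
]:
    print(f"{label}: {fraction_remaining(age):.3e} of the original 14C")
```

For carbon that is millions of years old the remaining fraction is vanishingly small, which is why fossil-derived material is treated as containing no radiocarbon.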
Several other technical committees of CEN (e.g., CEN/TC 411) have adopted the radiocarbon approach to analysis along with the United States Department of Agriculture (USDA) in their bio-preferred program (USDA, 2020). Radioactive carbon can be measured using gas proportional counting, liquid scintillation counting, and accelerator mass spectrometry (AMS) (Yates et al., 2015). The latter approach is the most sensitive of the three. However, there are a number of confounding factors that make the measurement of ¹⁴C difficult:
• The quantity of ¹⁴C present in a compound is small at the best of times and so a suitable quantity of material is needed to achieve good counting statistics.
• Contamination with even a small amount of modern carbon will lead to incorrect estimates of the bio-based carbon content and so all analyses need to be conducted with precautions to prevent cross-contamination (Yates et al., 2015).
• The cost of such analyses is relatively high due to the small market for such analyses and the expense of the analytical equipment.

Development of Samples for Testing
The draft TS developed by CEN/TC 276 assumed a 5% error in bio-based carbon measurements after inspection of the results from a round robin test by CEN/TC 411. Therefore, a series of thresholds between bio-based carbon categories were developed to help in describing the carbon content rather than simply presenting the percentage. These thresholds were set at 5% (anything below 5% was described as not having any bio-based content); 50% (anything below 50% but above 5% was described as minority bio-based); and 95% (anything between 50% and 95% was described as majority bio-based, and anything above 95% was described as wholly bio-based). In order to test the ability of the ¹⁴C radiocarbon method to distinguish between the different classes, samples were prepared at the following percentage bio-based carbon contents:
• 3% and 6% to distinguish samples either side of the 5% boundary. The 3% sample was made from a blend of anionic surfactants from Stepan Company. The 6% sample was a nonionic/anionic blend from BASF.
• 10%, 25%, and 75% were produced to provide coverage of the entire percentage range even though these values are not close to a threshold. The 10% and 25% samples were developed from BASF nonionic/anionic blends while the 75% sample was a cationic surfactant from Stepan Company.
• 48% and 52% samples were developed to test the ability of the method to distinguish samples either side of the 50% boundary. The 48% sample was an anionic/amphoteric blend from Stepan Company and the 52% sample was a nonionic/anionic blend from BASF.
• 93% and 100% samples were devised to distinguish samples at the top end of the scale. The 93% sample was a nonionic/anionic blend from Stepan Company along with the 100% sample, which was a nonionic surfactant.
• Several replicate samples were developed to test the repeatability and precision of the method.
The 3%, 48%, 52%, and 93% samples were made up, homogenized and divided between separate sample containers (three each for 3%, 48%, and 93% and four for 52%) and each was analyzed separately to test the analytical variability of the protocol.
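The threshold scheme described above can be expressed as a small lookup. This is a sketch only: the text does not say which class a value lying exactly on 5%, 50%, or 95% belongs to, so the boundary handling here is an assumption, as are the function name and the shortened class labels.

```python
def classify_bio_based(percent):
    """Map a bio-based carbon percentage to the draft TS category.

    Boundary handling (exactly 5, 50, or 95) is an assumption of this sketch.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent < 5:
        return "not bio-based"
    if percent < 50:
        return "minority bio-based"
    if percent <= 95:
        return "majority bio-based"
    return "wholly bio-based"


# Nominal contents of the prepared test samples.
for nominal in (3, 6, 10, 25, 48, 52, 75, 93, 100):
    print(f"{nominal:>3}% -> {classify_bio_based(nominal)}")
```

Pairs such as 3%/6% and 48%/52% fall into different classes while differing by less than the assumed 5% measurement error, which is why they are a demanding test of the method.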
A total of 18 samples were generated for analysis. Samples were sent by courier to Beta Analytic in Miami, USA for analysis through AMS methods according to the EN 16640 protocol. The calculations are based on the published CEN method EN 16640:2017 and this was used by each of the laboratories. In this approach, the bio-based carbon content by dry mass (x_B) is expressed from the percent modern carbon (pMC) by the following equation:
$$x_B = x^{TC} \times \frac{\mathrm{pMC}}{\mathrm{REF}}$$
where x^TC is the total carbon content expressed as a percentage of the dry mass of the sample, pMC is the measured value of the pMC of the sample, and REF is the pMC value of the 100% bio-based carbon reference sample.
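As a worked illustration of this calculation, the sketch below assumes the relation takes the form x_B = x^TC × pMC / REF, which is what the variable definitions imply. The function names and all numerical inputs are hypothetical and are not measured values from this study; EN 16640:2017 should be consulted for the normative procedure.

```python
def bio_based_carbon_by_mass(total_carbon_pct, pmc_sample, pmc_reference):
    """Bio-based carbon as a percentage of dry mass (assumed form of the relation).

    total_carbon_pct: total carbon content, % of dry mass
    pmc_sample:       measured percent modern carbon of the sample
    pmc_reference:    pMC of a 100% bio-based carbon reference
    """
    return total_carbon_pct * pmc_sample / pmc_reference


def bio_based_share_of_total_carbon(pmc_sample, pmc_reference):
    """Bio-based carbon as a percentage of the total carbon in the sample."""
    return 100 * pmc_sample / pmc_reference


# Hypothetical inputs chosen only to show the arithmetic.
total_carbon = 63.0   # % of dry mass
pmc_measured = 52.0   # pMC
pmc_ref = 100.0       # pMC of the reference (assumed)

print(bio_based_carbon_by_mass(total_carbon, pmc_measured, pmc_ref))
print(bio_based_share_of_total_carbon(pmc_measured, pmc_ref))
```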

Analytical Methodology
Samples were analyzed by Beta Analytic according to their standard procedures using method EN 16640. A summary of the approach is detailed below. For fuller descriptions, the reader is directed to (BetaAnalytical, 2020).
For each sample, the vial was shaken well prior to opening. For liquid samples, a drop was removed and rapidly frozen in a hydrogen atmosphere in combination with copper oxide. For solid samples, a small quantity (10-20 mg) was taken from the homogenized sample vial and treated as for the liquid samples. The vessel was then evacuated of the H₂ while at −196 °C and flame sealed. The vessel was heated at 800 °C for 5 h to obtain CO₂ from all carbon species present in that sample. Each tube was marked with a high-temperature ink pen and a picture taken of each to ensure proper chain of custody prior to placing the vessel in a labeled tube cracker. The cracker was attached to a vacuum line, which was purged and evacuated six times with H₂ to ensure it was free of any carbon memory. The tube was then cracked and the CO₂ collected. The CO₂ was mixed with H₂ over a cobalt catalyst and heated to 500 °C for 5 h to synthesize graphite.
The graphite was then loaded into an AMS target under strict protocols, which identified the sample number with the target number in the AMS. All the samples were handled identically in these steps.

Round Robin Testing
Following the initial analyses at Beta Analytic, five new samples were prepared for analysis at six different AMS laboratories in Europe (UK, Belgium, Germany, Italy, Poland, and Spain). Three of the five samples were chosen to be completely homogeneous at any sample size and these were composed of a single compound with mixed carbon sources. The remaining two samples were mixtures of the previous three samples in a 1:1 ratio. The sample types can be seen in Table 1.
The only information given to the laboratories was that at least two of the samples were mixtures and may not be completely homogeneous at all scales. A total of 5 g of each sample was sent to each laboratory by courier. We did not specify the analytical protocol but did alert the laboratories to the potential for mixtures that may not be completely homogeneous at all sample sizes.

Results
The results of the initial ¹⁴C radiocarbon analyses are shown in Fig. 3. In this figure, the measured bio-based carbon content is plotted against the known values from the manufacturing process (the raw material origins are known in each case). It is clear from this figure that many samples fall well outside the expected ±5% error range; this is especially apparent with the 52% bio-based carbon sample that had measured bio-based carbon values as high as 100%. Additionally, replicate analyses of the 6% and 52% samples clearly show a wide range of results (7-35% and 75-100%, respectively). Notwithstanding these results, the instrumental error reported is less than 0.3% in all cases (error bars not shown in the figure as they are too small to distinguish).
Table note: Three pure compounds were used as "end-members" and these were mixed in a 1 to 1 (w/w) ratio to develop intermediates.
The samples that deviated most from the expected values were solids, principally dry powders (Table 2). The errors may have arisen from a number of sources and processes. One possibility is that the raw materials used to make the surfactants were not as described and had a higher bio-based carbon content than expected. However, all samples deviated from the calculated value in the positive direction (over-estimated the bio-based carbon content), implying that the petrochemical-based raw material would have been contaminated with bio-based materials, which seems unlikely. Additionally, there was a wide range of measured values for the same sample, suggesting that this is not likely to be the cause.
There may have been a miscalculation of the bio-based carbon content. Likewise, incorrect blending of the surfactants to give intermediate values may have occurred. In both cases, the wide range of measured results for single samples rules this out. It is possible that there was poor homogenization at the blending stage leading to heterogeneous mixtures. The initial surfactants were ground to a <100 μm diameter powder during the blending process. Based on a density of 1.0 g cm⁻³, a 10 mg sample used for analysis would contain ~20,000 individual grains so this seems unlikely too.
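The grain-count estimate can be checked, and its consequence for sub-sampling error illustrated, with the sketch below. It assumes spherical grains of exactly 100 μm diameter and, for the variability estimate, a simple binomial model in which every grain is wholly bio-based or wholly petrochemical and carries the same amount of carbon. Neither simplification comes from the study, and the function names are illustrative.

```python
import math


def grains_in_sample(sample_mass_mg, grain_diameter_um, density_g_cm3=1.0):
    """Approximate number of spherical grains in a sample of the given mass."""
    radius_cm = grain_diameter_um * 1e-4 / 2          # 1 um = 1e-4 cm
    grain_volume_cm3 = (4.0 / 3.0) * math.pi * radius_cm ** 3
    grain_mass_mg = grain_volume_cm3 * density_g_cm3 * 1000.0  # g -> mg
    return sample_mass_mg / grain_mass_mg


def sampling_sd_percent(bio_fraction, n_grains):
    """Binomial standard deviation of the bio-based share, in percentage points."""
    return 100 * math.sqrt(bio_fraction * (1 - bio_fraction) / n_grains)


n = grains_in_sample(sample_mass_mg=10, grain_diameter_um=100)
print(f"grains in a 10 mg sub-sample: {n:,.0f}")
print(f"expected sampling SD at 52% bio-based: {sampling_sd_percent(0.52, n):.2f} points")
```

Under these assumptions, random grain-to-grain variation in a well-mixed powder would contribute well under one percentage point, far less than the spread seen in the initial results.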
Settling and separation during transport is also unlikely due to the fast transit times to the laboratory and near identical densities of the two blended surfactants. The volatility of the compounds is also the same and so no partitioning is expected here either.
At the laboratory, there may have been some bias during the sub-sampling procedure that favored the bio-based surfactant over the petrochemical one. It should be noted that the vials were thoroughly homogenized at the laboratory before sub-sampling and there was no density difference between the two types. The quantity of sample taken for analysis may have been too great for the subsequent chemical reagents and the petrochemical surfactant was somehow discriminated against in such a condition. However, no mechanism for this has been identified. Contamination of the process lines during AMS graphite preparation may have occurred. However, no mechanism for this has been identified. Further investigation of the analytical results indicated that the results were not correlated to the quantity of carbon dioxide generated during the combustion process either.
The unexpected lack of accuracy and the poor precision of the results raised several questions regarding the failure of a method that is of known and proven reliability. It was suspected at this stage that there was a mismatch between the minimum sample size that might be considered homogeneous within the sample and the maximum quantity of sample that may be introduced into the sample preparation scheme.
In order to test the analytical method further and establish the reason for poor performance in the initial samples, a single 52% (calculated) bio-based carbon dry powder (blend) surfactant was sent to the Beta Analytic laboratory. This sample was divided into sub-samples of different weights, which were analyzed either as a dry powder or after dissolving in the smallest amount of water necessary to solubilize the sample. Two replicates around 12 mg were analyzed along with two around 20 mg with one being dissolved and one treated as a dry powder in each case. Additionally, approximately 2 g of the sample was dissolved in the smallest amount of pure water and a sub-sample of this taken. This latter approach removes any possibility of bias in the sub-sampling. The results of these analyses can be seen in Table 3.
The measured values in these later analyses are closer to the expected (actual) value than the original results and straddle it, with the two larger sub-samples indicating less bio-based carbon and the smaller samples having more. The samples are not differentiated based on the form of the sample (solid or liquid) but may be differentiated based on the mass analyzed.
Table notes: With liquid samples, single drops were taken, and no accurate weight was recorded for these samples. Pluriol® E4000 is a PEG with a molecular weight of 4000 g/mol; Sulfopon® 1214G is a C₁₂-C₁₄ sulfate.

Round Robin Results
The results of the round robin testing are summarized in Fig. 4. Some of the laboratories did more than one analysis and presented multiple values for each sample. The individual data points are presented in this figure. One laboratory did not conduct the analyses and presented no data.
The samples with ~30% and ~70% bio-based carbon were mixtures of the other three samples. In general, the results are as good as those of the single compounds.
Each laboratory was provided with a minimal amount of information regarding the nature of the compounds in the sample; this included the health and safety aspects and that the samples were either a single (pure) compound or a mixture of two compounds without indicating which were which. The laboratories converted the samples to graphite according to their own in-house protocols and analyzed the samples on an AMS. Two of the results from different laboratories initially were considered outliers in the distribution. The laboratories were asked if they had an explanation for the differences and in both cases they re-analyzed the sample and produced an answer closer to the expected results. These two samples have been excluded from Fig. 4 and Table 4.
Table notes: The whole sample was based on a dissolved sub-sample taken from the 2 g solution. It should be noted that the instrumental confidence in the individual results from the AMS is small (<1%) in all cases. CoV, coefficient of variation; SD, standard deviation.
The regression of the data with the expected bio-based carbon content clearly shows a very good agreement (R² > 0.99) and a slope of 1.0 (Fig. 4).

Stable Isotope (δ¹³C) Content
Part of the process of measuring ¹⁴C uses the measured ¹³C content to determine the amount of fractionation between isotopes that may have occurred during the analysis. It should be noted, however, that the ¹³C content of the samples may vary depending on the source of the carbon (Mudge et al., 2012; Mudge et al., 2014). It is well known that terrestrial plant carbon typically has δ¹³C values around −30‰, petrochemical sources are typically −26‰, and unicellular algae are closer to −22‰. This aspect is not generally incorporated into the corrections for fractionation but the likely errors in the final ¹⁴C measurement are small.
The δ¹³C signature from these samples is not determined by the bio-based carbon content. The samples that contained 100% bio-based carbon have a δ¹³C value that is different from the values measured in the compound-specific analyses conducted on USA products (Mudge et al., 2012). It was anticipated at the time that the raw material source for the USA products was PKO since the chain lengths in the oleochemical cluster (Fig. 5) were either C₁₂ or C₁₄ except for the deodorant sample (C₁₈). These stable isotope results from earlier work by Mudge et al. (2012) also suggest that the δ²H composition may be a more satisfactory indicator of petrochemical and oleochemical (bio-based) carbon sources than the carbon itself. While this approach may be able to differentiate between the end members (petrochemical and oleochemical), quantifying the exact bio-based carbon content around a threshold would be challenging. Additionally, the δ²H composition will vary according to the geographical source of the PKO and the type of crude oil used, leading to "fuzzy" end members rather than specific values.
Fig. 5. The compound-specific stable isotopic composition of the alkyl component of surfactants purchased in the USA as part of a study on the fate of such compounds (Mudge et al., 2012). The frequency histograms indicate the distribution of the δ²H values (top left) relative to Standard Mean Ocean Water (SMOW) and δ¹³C values (lower left) relative to Pee Dee Belemnite (PDB). Those products in the lower right (oleochemical sourced carbon) were marketed as containing "natural" materials. Figure redrawn from Mudge et al. (2012). DEO, deodorants; HDD, hand dish detergents; LHS, liquid hand soaps; LLD, liquid laundry detergents; PLD, powdered laundry detergents; SHA, shampoo.

Discussion
It was concluded that the discrepancy between the expected and measured bio-based carbon content in the initial samples was due to the mismatch in sample sizes required: the maximum sample size that could be applied to the AMS was in the region of 10 mg while the minimum sample size that could be guaranteed to be homogeneous for the dry samples was around 10 g. This is a difference of three orders of magnitude in scale. To overcome this, some samples were dispersed in water to assist in the homogenization process before a smaller volume was withdrawn for analysis. Three of the samples that were sent for analysis in the second phase of this work were homogeneous at any quantity as the different carbon sources were incorporated into the molecule at the time of synthesis. In general, these samples had a low coefficient of variation and the AMS results were an accurate reflection of the expected bio-based carbon content. Reassuringly, this clearly shows the AMS approach is an accurate and reliable method to measure the bio-based carbon content in these surfactant samples and all of the reporting laboratories were able to provide good results.
For the two samples that were mixtures, the results were also acceptable in most cases with only a single outlier (57% as opposed to an expected value of 70%). If that outlier was removed, the CoV for the two mixtures would also be less than 1.3%.
The poor results seen initially were due to the mismatch between the sample size needed to ensure a homogeneous sub-sample and the maximum quantity that could be used in the subsequent analysis. This is a key factor in ensuring accurate results and when samples are sent for analysis, this aspect should be given significant thought and planned for.

Conclusions
The consistency of the results across the different laboratories suggests that the AMS ¹⁴C radiocarbon approach to bio-based carbon content determination in surfactant mixtures is a reliable method suitable for end-users. The initial discrepancies between the expected and measured results were due to the difference in scale between the sample size that might be considered homogeneous and the maximum sample size that might be analyzed by the AMS method. It is clear, however, that if users are aware of this issue, representative sub-samples can be taken from a sample and accurately reflect the bio-based carbon content. The turnaround time at the laboratories could be long and so users would be encouraged to plan ahead. This work contributed to the adoption of the European Norm on bio-based surfactants (EN 17035).