Defining totipotency using criteria of increasing stringency

Totipotency is the ability of a single cell to give rise to all the differentiated cells that build the conceptus, yet how to capture this property in vitro remains incompletely understood. Defining totipotency relies upon a variety of assays of variable stringency. Here we describe criteria to define totipotency. We illustrate how distinct criteria of increasing stringency can be used to judge totipotency by evaluating candidate totipotent cell types in the mouse, including early blastomeres and expanded or extended pluripotent stem cells. Our data challenge the notion that expanded or extended pluripotent states harbor increased totipotent potential relative to conventional embryonic stem cells under in vivo conditions.


Introduction
During early mammalian development, the totipotent state of early blastomeres is rapidly lost as cells gradually restrict their developmental potential and commit to distinct cell lineages by the blastocyst stage 1,2,3 . In mouse, the first cell fate decision starting at embryonic day (E)2.5 sets aside the trophectoderm (TE), the precursors of the placenta, from the inner cell mass (ICM). A second cell fate decision starting around E3.5 within the ICM gives rise to the pluripotent epiblast (EPI) and the primitive endoderm (PE), precursors of all embryonic germ layers and extraembryonic yolk sac, respectively 4 . For the most part EPI, PE and TE cells maintain blastocyst-defined lineage assignments throughout subsequent development, with a notable exception for the PE, which was shown to also contribute to otherwise EPI-derived definitive endodermal lineages during post-implantation stages 5,6 .
While EPI, PE and TE cells exist only transiently in the developing embryo, distinct self-renewing stem cell types can be derived from each cell type using a combination of appropriate growth factors and/or inhibitors which capture and preserve their developmental potential in culture 7,8,9,10 . Murine embryonic stem cells (ESCs) derived from the early EPI lineage were originally established using fetal bovine serum and leukemia inhibitory factor (Lif) 7,8 . They can however also be cultured under naïve conditions, using inhibitors against mitogen-activated protein kinase and glycogen synthase kinase-3, termed 2i, in combination with Lif 11 . Trophoblast stem cells (TSCs) can be derived from the TE lineage using fibroblast growth factor 4 and heparin 9 and extraembryonic endoderm (XEN) stem cells can be established from the PE using various methods 10,12,13 . Importantly, while each of these stem cell types is able to re-enter the normal course of embryonic development and differentiate into the downstream cell types similar to their in vivo counterparts, they are also lineage restricted in that they do not readily cross lineage boundaries that have been set during blastocyst formation 14,15 .
Strict lineage restriction differs between the three stem cell types, and is reflected in the time elapsed since the source lineages parted ways during embryo development. The closer relationship between EPI and PE lineages is also underscored by the observation that XEN cells can spontaneously appear in ESC cultures 16,17 and ESCs can be readily converted into XEN cells using only soluble factors 13 . On the other hand, ESCs have only been reported to rarely contribute to trophectoderm-derived lineages in vivo 18 . Work by several laboratories has also shown that stem cell types with properties of TE, EPI or PE can be obtained by reprogramming lineage restriction using transcription factor (TF) expression. Long-term TF overexpression was shown to reprogram ESCs into TSC-like cells in vitro 19,20,21,22 . Induced pluripotent stem cells (iPSCs), as well as induced TSCs and induced XEN stem cells have also been derived by TF overexpression followed by culture with the appropriate growth media 23,24 .
Additionally, mouse primed epiblast stem cells (EpiSCs), isolated from the post-implantation epiblast 25,26 can be reverted back into ESCs 27 . Collectively, these studies suggested that it might be possible to induce totipotent stem cells, or at least cells that approach the totipotent state, by conversion from pluripotent stem cells.
In recent years there have been several reports of conditions to derive novel mouse stem cell types with the ability to produce descendants contributing to all three blastocyst-defined lineages 28,29,30,31,32,33,34 . In particular, two methods were described to derive extended or expanded pluripotent stem cells (EPSCs) by conversion from pre-existing ESCs or directly from 8-cell stage blastomeres 33,32 . In the first method, Liu lab EPSCs (L-EPSCs) 32 were derived in Lif, CHIR, PD0325901, JNK Inhibitor VIII, SB203580, A-419259 and XAV939. In the second method, Deng lab EPSCs (D-EPSCs) 33 were derived using Lif, CHIR, DiM ((S)-(+)-Dimethindene maleate), and MiH (Minocycline hydrochloride). Both cell types showed molecular and functional features that suggested expanded pluripotency, such as totipotency-associated marker gene expression and contribution to the EPI, PE as well as TE lineages in chimeric assays. Additionally, recent studies reported the ability of EPSCs, alone or in combination with TSCs, to self-assemble into blastocyst-like structures, termed blastoids, that contain cells with features of all three embryonic lineages 35,36 . These studies suggested that stem cells with the potential to give rise to both ICM and TE lineages, properties that define totipotent stem cells, can be isolated and expanded in vitro.
Many criteria of variable stringency can be used to assess totipotency. One criterion is to assess gene expression, in search of activated totipotency-associated marker genes. This can either be performed in bulk for a set of genes or through a more stringent approach taking advantage of transcriptome-wide single cell correlation analysis with totipotent cells of early embryos. More demanding is providing evidence of the potential to enter both the embryonic and extraembryonic pathway using in vitro differentiation assays. Finally, a more stringent requirement for evaluating the potential of different stem cell types is to perform in vivo aggregation experiments, by combining candidate cells with a host embryo and analyzing lineage contributions in the resulting chimera. Candidate cells are typically combined with morula (8-16 cell stage) or blastocyst stage host embryos and analyzed at different developmental stages. It is important to analyze chimeric contributions not only based on localization, but also by assessing lineage integration using functional marker analysis.
Here we subject candidate totipotent stem cells to these assays of increasing stringency to assess their developmental potential. We analyze the transcriptome and gene regulatory networks of ESCs, L-EPSCs and D-EPSCs and pre-implantation embryos using bulk and single-cell RNA-sequencing (RNA-seq), and provide a resource for the community enabling interactive online data exploration. We investigate the ability of EPSCs to give rise to TSCs in both a conversion and a reprogramming setting. We analyze the transcriptome and gene regulatory networks of blastoids derived from EPSCs. Finally, we examine how EPSCs and blastomeres perform in chimeric experiments. We present a gold standard for analyzing contribution to different lineages, with a focus on contribution to the trophoblast lineage at different stages combined with molecular analyses. We emphasize the importance of thorough analysis of cell potential using high stringency assays and highlight the ongoing challenges of unlocking the totipotent state.

Transcriptional signatures of preimplantation embryos and different stem cell states
Transcriptomic analysis can serve as an effective means to monitor cellular states and analyze marker gene expression. Therefore, using transcriptional similarity analysis, we investigated which in vivo developmental stage or previously established in vitro stem cell state L-EPSCs and D-EPSCs resemble the most. First, we converted naïve ESCs (2iLif) to L-EPSCs and D-EPSCs using published protocols 33,32 . We observed similar morphological changes after conversion as previously reported 33,32 and were able to stably maintain L-EPSC and D-EPSC cell lines (Figure S1A). In our first experiments we used bulk RNA sequencing for genome-wide detection of transcription and assessment of totipotency marker gene expression in L-EPSCs.
We first set out to explore the dynamics with which a transcriptome shift is induced after switching ESCs into L-EPSC conditions (Figure 1A). Our results reveal a rapid transcriptome change, within 3 days of induction, indicating that ESCs can readily convert into L-EPSCs (Figure 1B). Intriguingly, despite these differences between L-EPSCs and ESCs, the L-EPSC transcriptome resembled the ESC transcriptome more than any early mouse embryo stage (Figure 1C and S2A and B) and 4-cell and 8-16 cell stage embryo marker genes remained mostly silenced (Figure 1D).
Single-cell RNA-seq (scRNA-seq) is particularly suited to resolve cellular heterogeneity and identify subpopulations with distinct transcriptional features. To examine whether totipotent features can be detected in individual cells, we applied SMART-seq2 scRNA-seq to ESCs, as well as to L-EPSCs and D-EPSCs derived from them. As a reference we transcriptionally tracked mouse preimplantation lineage segregation and post-implantation epiblast development from zygote to E6.75, and included naïve ESC and primed EpiSC states as well (Figure 1E; Figure S2D). As expected, ESCs cultured in 2iLif conditions occupied the space between E3.5 ICM and E4.5 EPI, while primed EpiSCs clustered with E5.5 and E6.75 EPI cells.
We found that along the embryonic developmental trajectory L-EPSCs, like parental ESCs, clustered between the E3.5 ICM and E4.5 EPI stages, while the majority of D-EPSCs clustered together with E5.5 stage EPI cells (Figure 1E). We observed that top differentially expressed genes reported to be upregulated in D-EPSCs compared to Lif/serum-cultured ESCs were also upregulated in D-EPSCs compared to 2iLif-cultured ESCs (Figure S2E). We additionally constructed a correlation matrix from the averaged expression of the top 2000 genes to compare each stem cell condition independently with each developmental stage (Figure 1F). While ESCs showed high correlation with all preimplantation stages: 8-cell (r = .45, p < .001), morula (r = .55, p < .001), ICM (r = .53, p < .001), and highest similarity with E4.5 EPI (r = .72, p < .001), L-EPSCs (from both this and a previous study) exhibited the most resemblance to E4.5 EPI (r = .64, p < .001), while lacking significant correlation with other developmental stages (p > .05).
Consistent with the UMAP, D-EPSCs correlated the most with E5.5 EPI (r = .89, p < .001), but also showed close similarity with primed EpiSCs (r = .71, p < .001), and E6.75 EPI (r = .51, p < .001). The position occupied by L-EPSCs in the UMAP space is consistent with the original report by Yang et al. 32 . In conclusion, L- and D-EPSC single-cell transcriptomes align with pluripotent rather than totipotent states.
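The stage-by-stage comparison described above can be illustrated with a minimal Python sketch. It assumes that the "top 2000 genes" are the most variable genes across the averaged profiles, which the text does not specify, and it omits the significance testing behind the reported p-values. All function names are hypothetical and are not taken from the analysis code used in this study.

```python
from math import sqrt
from statistics import mean, pvariance


def average_profile(cells):
    """Average expression per gene over a group of cells (each cell is a list of values)."""
    return [mean(values) for values in zip(*cells)]


def top_variable_genes(profiles, n_top):
    """Indices of the n_top genes with the highest variance across the averaged profiles.

    Assumption: "top genes" means most variable genes; the text does not state the criterion.
    """
    n_genes = len(profiles[0])
    variances = [pvariance([p[g] for p in profiles]) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:n_top])


def pearson(x, y):
    """Pearson correlation coefficient; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(stem_groups, stage_groups, n_top=2000):
    """Correlate each stem cell condition with each developmental stage.

    Both arguments map a group name to a list of cells; genes must be in the same order.
    """
    stem_avg = {name: average_profile(cells) for name, cells in stem_groups.items()}
    stage_avg = {name: average_profile(cells) for name, cells in stage_groups.items()}
    genes = top_variable_genes(list(stem_avg.values()) + list(stage_avg.values()), n_top)
    return {
        stem: {
            stage: pearson([sp[g] for g in genes], [tp[g] for g in genes])
            for stage, tp in stage_avg.items()
        }
        for stem, sp in stem_avg.items()
    }
```

Each stem cell condition is compared with each stage independently, so a condition can correlate with several stages at once, as reported for ESCs.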
Embryo development is under the control of transcription factors that bind to cis-regulatory regions, forming gene regulatory networks. We reconstructed gene regulatory networks which are active in early development, ESCs and EPSCs, from scRNA-seq data, using single-cell gene regulatory inference and clustering (pySCENIC 40 ). SCENIC predicts TFs that may control cellular states present in the dataset, together with candidate TF target genes. A TF and its candidate targets are called a regulon, and by quantifying the activity of regulons in each single cell, SCENIC can be used to cluster cells based on the activity of regulatory programs. In contrast to identifying the global transcriptional state of cells, here we project a UMAP visualization based on regulon activity and reveal that 2iLif ESCs localize closest to E3.5 ICM, and both L- and D-EPSCs clustered between E4.5 EPI and E5.5 EPI (Figure S2F). These results show that the regulatory state of EPSCs resembles that of late pluripotent EPI rather than earlier, totipotent developmental stages.
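To make the idea of per-cell regulon activity concrete, the following Python sketch scores a regulon in a single cell as the fraction of its target genes found among that cell's most highly expressed genes. This is a deliberately simplified stand-in and an assumption on our part: it is not the pySCENIC API, and it does not reproduce the area-under-the-recovery-curve scoring that SCENIC applies. Function names, the `top_fraction` parameter and the gene identifiers in the usage example are hypothetical.

```python
def regulon_activity(cell_expression, regulon_targets, top_fraction=0.05):
    """Simplified regulon score for one cell.

    cell_expression: dict mapping gene name -> expression value in this cell.
    regulon_targets: candidate target genes of one TF.
    Returns the fraction of detectable targets ranked within the top
    `top_fraction` of genes by expression (a stand-in for SCENIC scoring).
    """
    ranked = sorted(cell_expression, key=cell_expression.get, reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    top_genes = set(ranked[:n_top])
    targets = [g for g in regulon_targets if g in cell_expression]
    if not targets:
        return 0.0
    return sum(g in top_genes for g in targets) / len(targets)


def activity_matrix(cells, regulons, top_fraction=0.05):
    """Score every regulon in every cell; returns cell -> {TF -> score}."""
    return {
        cell_id: {
            tf: regulon_activity(expression, targets, top_fraction)
            for tf, targets in regulons.items()
        }
        for cell_id, expression in cells.items()
    }


# Usage with placeholder gene names (not real data):
cells = {
    "cell_1": {"geneA": 10, "geneB": 9, "geneC": 1, "geneD": 0},
    "cell_2": {"geneA": 0, "geneB": 1, "geneC": 9, "geneD": 10},
}
regulons = {"TF1": ["geneA", "geneB"], "TF2": ["geneC", "geneD"]}
scores = activity_matrix(cells, regulons, top_fraction=0.5)
```

A cells-by-regulons matrix of this kind, rather than the gene expression matrix, is the sort of input a regulon-based UMAP would be computed from.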

Capacity of EPSCs to enter the trophectoderm program and generate TSC-like cells in vitro
Another test to judge totipotency is to evaluate the capacity of cells to enter the trophoblast lineage. This can be assessed in vitro by switching cells to TSC culture conditions and assaying whether cells give rise to TSC-like cells, a transition that pluripotent ESCs cannot make. When bulk L-EPSC cultures were switched to TSC conditions followed by RT-PCR to assay trophoblast marker gene expression (Figure 2A), no substantial activation of such genes was seen (Figure 2B).
To examine whether a small subpopulation of L-EPSCs or D-EPSCs may harbor the potential to directly convert into TSCs, as suggested by a previous study (Yang et al. 32 ), we analyzed the expression of key TSC markers at the single-cell level using flow cytometry (Figure 2C).
Additionally, we tested whether L-EPSCs and D-EPSCs can be reprogrammed into TSC-like cells more efficiently than ESCs. While ESCs do not readily convert into TSCs, they can be reprogrammed with low efficiency into TSCs by induced overexpression of TSC-associated TFs, such as Cdx2, and by lowering the expression of the pluripotency factor Oct4 19 . To assay TSC reprogramming, we used a tetracycline-inducible Cdx2 (iCdx2) and Oct4 heterozygous ESC background, used in the original ESC-to-TSC reprogramming experiments 19 . To read out fate conversion using flow cytometry, we immunostained for two TSC-specific cell surface markers, CD40 and Plet1. We also established an Elf5-2A-mCherry reporter ESC line by targeting 2A-mCherry to the C-terminus of the endogenous Elf5 locus 41,42,20,22 (Figure 2C). We then switched L-EPSCs, D-EPSCs and ESCs to TSC medium (with Fgf4 and Heparin) with or without tetracycline (+/-4OH) and cultured the cells for 6 days (6d) before analyzing TSC-marker expression. We found that in both the absence and presence of Cdx2 induction, there were no significant differences in the number of single or triple marker positive cells between ESCs and either EPSC condition (Figure 2D). In contrast, a control TSC line in which we also targeted the Elf5 gene with the mCherry reporter showed 80% CD40/PLET1/ELF5 triple-positive cells (Figure 2E). Collectively, these results indicate that the EPSC states do not facilitate more efficient reprogramming into TSCs in vitro, in contrast with previous studies that suggested an increased ability of EPSCs to give rise to TSC-like cells compared to ESCs 33,32 .

In vitro blastoid-forming ability of D-EPSCs based on Li et al. 2019 and Sozen et al. 2019
The ultimate proof of totipotency is the ability of a single cell type, or more stringently a single cell, to give rise to an entire blastocyst and subsequently a viable and fertile animal. Recently, blastocyst-like structures, termed blastoids, have been generated in vitro from different stem cell types 43,44,35,36 . These protocols use different combinations of growth factors and inhibitors to generate blastoids, which in multiple aspects resemble real blastocyst stage embryos, although until now none have been able to generate viable animals. Most notably, two recent reports used D-EPSCs, either as a sole stem cell source (Belmonte group, B-blastoid) 35 or in combination with TSCs (Zernicka-Goetz group, ZG-blastoid) 36 to generate blastoids. Importantly, a large proportion of the blastoid cells generated with only D-EPSCs showed expression of genes previously associated with post-implantation stage lineages and not cells of the blastocyst. We therefore re-analyzed the scRNA-seq data provided in these reports and aligned them to our existing sampled preimplantation cells, along with an additional dataset containing cells up to E7.5 45 . At the gene regulatory level, both B- and ZG-blastoid cells aligned well with embryo cells, indicating that the gene regulatory programs of natural embryos are recapitulated to a large extent in blastoids derived from D-EPSCs (Figure S4A). For example, the activity of selected regulons for lineage-specific TFs showed that blastoid-generated PE and EPI cells shared regulatory activity with their respective cell types in natural embryos (Figure S4B). Furthermore, in-depth analysis of target genes of NANOG and GATA4, EPI and PE TFs, respectively, showed similar target gene expression patterns between blastoids and natural embryos (Figure S4C). However, the TE regulatory state did not seem to be well recapitulated in D-EPSC-derived blastoids.
Regulons associated with TE such as GATA3, CDX2, PITX1 and SOX6 are downregulated in blastoid TE compared to embryo TE ( Figure S4B). Indeed, GATA3 target genes are downregulated in blastoid TE cells compared with natural embryos ( Figure S4C), indicating that the misregulation of specific parts of the regulatory program underlying embryogenesis may limit blastoid development. We also investigated the regulatory activity of  Figure S4A, S4B) such as T and MIXL1. We therefore analyzed gene expression of T target genes and found that B-blastoid-Intermediate-2 cells activate most, but not all, T targets ( Figure S4C). At the same time, however, these cells, as well as mesodermal cells, also activate many targets of CDX2 ( Figure S4C). This suggests that B-blastoid-intermediate-2 cells may have failed to activate the TE regulatory program and instead arrested between overlapping mesodermal and TE states. Altogether, these results demonstrate that the gene regulatory programs used in natural embryos are engaged to a large extent in EPSC-derived blastoids, but not fully, which might contribute to the developmental arrest of these structures.

In vivo lineage contributions of totipotent blastomeres, L-EPSCs and D-EPSCs at embryonic day 4.5
The capacity to enter the trophoblast lineage can also be assessed in vivo by creating chimeras with a host embryo and analyzing lineage contributions at later developmental time points. To test lineage contributions of truly totipotent cells, we aggregated a morula-stage (E2.5) embryo (host) with a single blastomere from an 8-cell stage embryo (donor), as most, if not all, blastomeres at the 8-cell stage are considered totipotent 46,47 . We allowed chimeras to develop for 48 hours before analyzing lineage contributions at the late blastocyst stage (E4.5). At E4.5 the three blastocyst lineages are clearly segregated (Figure 4A) and express well-characterized lineage-specific markers, such as Sox2 (EPI), Sox17 (PE) and Cdx2, Gata3, Krt8, and Krt18 (TE) 48,49,50,51,16,52 . To visualize progeny of the donor cell, we isolated single blastomeres from embryos expressing either H2B-Gfp (nuclear-localized marker) or DsRed (no nuclear localization, marker appears in both cytoplasm and nucleus) and used wild-type embryos as hosts. We found that in 60-70% of chimeras the donor blastomere contributed to both the inner cell mass (EPI + PE) and the TE (Figure 4B), which was verified by co-immunostaining with the panel of lineage-specific markers (Figure 4C). These data serve as a benchmark for lineage contributions of truly totipotent cells in a chimera.
We then aggregated L-EPSCs, D-EPSCs or control parental ESCs to wild-type host embryos and analyzed chimeras at E4.5. Interestingly, we observed that progeny of both L-EPSCs and D-EPSCs localized to trophectodermal positions in ~20% of chimeras, while progeny of the parental ESC line cultured in 2iLif conditions localized only to the epiblast, corroborating previous studies 33,32 (Figure 4D). However, when we immuno-stained chimeras for epiblast (Sox2) and trophectoderm markers (Cdx2), none of the L-EPSC or D-EPSC derived cells in the TE position showed expression of either marker (Figure 4E and S5). Therefore, we conclude that EPSCs can contribute cells that localize to the TE but do not express a key TE marker.

In vivo lineage contributions of totipotent blastomeres, L-EPSCs and D-EPSCs at embryonic day 6.25
To confirm that the observed lineage contributions at E4.5 persist later in development, we examined chimeras post implantation. Shortly after implantation the EPI forms a cup-shaped epithelium, the PE forms the two layers of the visceral and parietal endoderm and the TE cells overlying the EPI proliferate to form the extraembryonic ectoderm (ExE) and the ectoplacental cone ( Figure 5A). Before gastrulation is initiated at ~E6.5, the boundaries of the different compartments are easily discernable, prompting us to analyze lineage contributions at E6.25.
First, we generated chimeras with H2B-Gfp expressing blastomeres and showed that progeny of the blastomeres can contribute to the Oct4-positive EPI, the Tfap2c and Elf5-positive ExE and the Tfap2c-positive ectoplacental cone (Figure 5B). Next, we generated chimeras with H2B-Gfp expressing L-EPSCs, D-EPSCs or ESCs and performed similar lineage analysis at E6.25. We found that all cell types readily contributed to the EPI lineage of the host embryo (Figure 5C). Interestingly, while ESC and D-EPSC chimeras occasionally contained donor cells in the trophoblast compartment (~5% of chimeras), we also found that around 25% of L-EPSC chimeras contained a few cells in the ExE. However, when we performed immuno-staining for lineage-specific markers, cells localized to trophoblast regions did not express trophoblast markers such as Elf5 and Tfap2c (Figure 5D and S5F). Instead, most of these mis-localized cells expressed the EPI marker Oct4. These data emphasize that donor cell localization alone does not necessarily indicate appropriate lineage-specific marker allocation and therefore question functional integration into the tissue. We also show an increased frequency of mis-localized L-EPSCs in chimeras, which may potentially explain the previously reported behavior of these cells.

In vivo lineage contributions of totipotent blastomeres, L-EPSCs and ESCs in embryonic day 12.5 placentas
To test whether ExE-localized donor cells in chimeras give rise to differentiated trophoblast cell types, we analyzed chimeric placentas at E12.5. The placenta has a complex structure and contains both trophoblast as well as embryo-derived cell types 53 (Figure 6A). Additionally, due to its high metabolite content, the placenta exhibits elevated levels of autofluorescence. These properties make immuno-fluorescent lineage analysis in the placenta a difficult task, requiring thorough evaluation aided by appropriate positive and negative controls. First, we identified antibodies and immuno-fluorescent staining conditions to label the different cell types of the placenta. We used Mct1 and Mct4 to label syncytiotrophoblast I and II, respectively, Tpbpa to label spongiotrophoblast, and Krt8, Cdh3 and Tfap2c to label all trophoblast cell types in both the spongio and the labyrinth zones. Tfap2c is a nuclear-localized TF, making it an ideal marker to detect co-localization with nuclear-localized lineage tracers (e.g. H2B-Gfp). Finally, we used CD31 to label embryo-derived endothelial cells in the placenta. We then used this marker panel to show that in chimeric placentas generated with a single H2B-Gfp or DsRed expressing totipotent blastomere and a wild-type host embryo, blastomere progeny contribute to all analyzed lineages (Figure S6). To unambiguously distinguish between trophoblast and embryo-derived cells in the placenta we took advantage of a technique termed tetraploid complementation, which involves generating a chimera using a tetraploid host embryo and diploid ESCs 54,55 (Figure 6B). Tetraploid cells are not tolerated in the embryonic compartment and ESCs do not contribute to the trophoblast compartment. Therefore, any surviving conceptus at E12.5 consists of trophoblast originating from tetraploid cells and embryonic tissues originating from ESCs.
We generated chimeras in which either the tetraploid cells or ESCs (Figure 6C) carried H2B-eGfp and immuno-stained placental sections at E12.5 for the markers described above (Figure S7). As expected, we saw that in chimeras with H2B-eGfp-positive tetraploid cells trophoblast markers (Tfap2c, Cdh3, Tpbpa, and Mct4) always overlapped with the Gfp signal, while the embryonic marker CD31 did not. In contrast, in chimeras with H2B-eGfp-positive ESCs only CD31 overlapped with the Gfp signal, and trophoblast markers were excluded from Gfp-positive cells. This panel highlights the difficulty in distinguishing different cell types in the placenta, especially in the labyrinth zone, without detailed analysis of markers and also emphasizes the challenge of matching a nuclear label with a membrane-localized signal in individual cells. Co-localization can be interpreted more clearly when the fluorescent lineage tracer and the cell-type specific marker are in the same sub-cellular compartment, as exemplified in our staining panel by the co-localization of H2B-eGfp and Tfap2c. Next we generated chimeric placentas using diploid host embryos and L-EPSCs and analyzed them using the same marker panel (Figure 6D). We could only detect Gfp-positive cells in the embryonic, but not in the trophoblast compartment, suggesting that L-EPSCs do not readily give rise to differentiated trophoblast cell types.

Discussion
Here we present different criteria to evaluate the differentiation potential of early embryonic stem cells. We provide a large compiled dataset of single cell transcriptomes covering different in vivo cell types from fertilization to gastrulation, as well as several early stem cell types, which can be used to map a novel cell type based on transcriptional similarities. As a resource for the community, the data presented here are made available in a user-friendly file format that can be explored using the single-cell analysis tool SCope 56 . Users can upload the .loom files provided here (https://github.com/pasquelab/totipotency) to browse the data sets at will.
We demonstrate assays to directly test the differentiation potential of cells by converting or Using these criteria, we examine the potential of two novel stem cell states (L-EPSCs and D-EPSCs) that have been reported to have expanded/extended potential beyond pluripotency.
Surprisingly, we fail to find convincing evidence that these cell types harbor extensive expanded or extended potential. Instead, based on our transcriptomic comparison L-EPSCs most closely resemble E4.5 EPI cells or the parental ESCs cultured in 2iLif and D-EPSCs the E5.5 EPI or EpiSCs. In TSC conversion or reprogramming settings neither L-EPSCs nor D-EPSCs show enhanced potential compared to parental ESCs. Finally, in chimeric experiments L-EPSCs and D-EPSCs only show convincing contribution to the EPI lineage. Interestingly, we found that in chimeras analyzed at the late blastocyst stage generated with either L-EPSCs or D-EPSCs, but not ESCs, cells occasionally localized to TE positions but did not express either EPI or TE markers. We hypothesize that these mis-localized, marker-negative cells are not maintained long term and are likely in the process of being eliminated from the compartment where they do not belong. We also detected mis-localized cells in chimeras made with L-EPSCs just prior to the onset of gastrulation. The majority of these cells however continued to express Oct4 and lacked trophoblast marker expression. Therefore, it is likely that these cells are not the progeny of the TE-localized, marker-negative cells observed in blastocyst-stage chimeras. Instead, mis-localization of L-EPSCs may occur during postimplantation development and could potentially be due to weak anchorage or un-synced developmental timing, allowing spurious integration.
Our results, which suggest that L- and D-EPSCs are unable to enter the trophectoderm lineage, are seemingly in contrast with the recent study reporting the formation of blastocyst-like structures, termed blastoids, using only D-EPSCs 35 or L-EPSCs. We therefore carefully reexamined the cell types generated in blastoids. Our results corroborate the idea that EPSCs are able to engage the gene regulatory programs utilized in distinct cellular lineages in natural embryos. However, we also found differences between the gene regulatory programs of natural embryos and EPSC-derived blastoids, which were most apparent in the TE lineage. Our reanalysis of the scRNA-seq data of B-blastoids made from D-EPSCs indicated that only 6.7% of cells sequenced were categorized as TE and even these showed an ExE-like profile. These TE-like cells also failed to show robust expression of classical TE markers such as Cdx2, Gata3 or Elf5. Problematically, the most abundant cell types in B-blastoids (B-blastoid intermediate 1 and 2) seem to most closely resemble mesoderm, expressing markers such as T, yet also share a number of common markers with the trophoblast lineage, such as Cdx2, Krt8, Krt18, Tfap2c.
With such cell composition it is not surprising that B-blastoids are not able to generate a live conceptus. Although not abundant, the presence of TE or ExE-like cells in B-blastoids is intriguing and leaves the door open for the possibility that some EPSCs may indeed harbor potential to differentiate into trophoblast. Of note, ESCs are also able, in some cases, to form blastoids 44,36 but whether they also recapitulate the gene regulatory programs of natural embryos like EPSC-derived blastoids do remains to be determined.

Why is this potential only revealed in the blastoid-forming assays and not in the context of chimeras? Forcing cells to the surface of a forming sphere in the blastoid method may mimic TE-inducing cues better than aggregation assays, which allow positional freedom of aggregated cells within the host embryo 3 . Positional freedom permits cells to group with TE or ICM compartments in the forming blastocyst based on their identity; therefore, if only the potential for TE fate exists, it may not be realized in an aggregation setting. Additionally, B-blastoids are formed under specific culture conditions which may direct differentiation more robustly than the environment of the embryo. Supporting this notion, D-EPSCs were not able to give rise to a TE-like layer under a different blastoid forming protocol (ZG-blastoid). Instead, TSCs had to be used 36 . It should however still be considered that the TE or ExE-like blastoid cells fail to express Cdx2, Gata3 and Elf5 transcripts at levels similar to endogenous TE or ExE of the embryo, suggesting that their transcriptional profile is still distinct from that of in vivo cells.
Notably, the B-blastoid method employs Bmp4 and inhibits Activin/Nodal signaling 35 , conditions which were also used in another blastoid protocol by the Tomoda lab 44 . The Tomoda group used EpiSCs as starting cells and were also able to produce blastoids with certain TE-like marker expression in the surface layer. Additionally, high Bmp4 and low Activin/Nodal was previously computationally predicted and shown in vitro to activate TE-marker gene expression in ESCs in which Jak/Stat signaling was inhibited 57 . These data suggest that high Bmp4 and low Activin/Nodal signaling may be key to TE-like cell induction.
Importantly, these signaling conditions are also involved in inducing proximal mesoderm fates during gastrulation and Bmp induces mixed mesoderm and trophoblast differentiation in EpiSCs and hESCs 58 , consistent with the appearance of abundant mesoderm-like cells in B-blastoids.
Could the starting stem cell state be crucial for facilitating mesoderm versus trophoblast differentiation? Indeed, it was shown that Cdx2 overexpression in ESCs induces reprogramming into TSCs, while Cdx2 overexpression in EpiSCs results in mesodermal gene expression 19,59 , highlighting the importance of the starting state for different differentiation outcomes. Notably, as our analysis placed D-EPSCs closest to primed EpiSCs, the widespread induction of mesodermal profiles is not surprising. We propose that to truly unlock a cell's differentiation potential into any extraembryonic or embryonic lineage, a starting state more closely resembling earlier embryonic stages, such as the morula, is needed. Our study highlights this challenge and sets gold standards for evaluating the differentiation potential of cells using various methods.

ESC-to-TSC reprogramming
For in vitro testing of EPSC potency, iCdx2 Elf5-2A-mCherry ESCs were converted to either D-EPSC or L-EPSC conditions as described above. To initiate ESC-to-TSC reprogramming, 50-200 cells/cm² were plated onto low-density E12.5 DR4 MEFs (~10,000 cells/cm² on gelatin-coated plates) in their original media. The following day the media was changed to TSC media (+Fgf4/Heparin) with or without 1 µg/mL 4-hydroxytamoxifen (Sigma-Aldrich H7904) to induce Cdx2. Media was changed daily and the degree of TSC reprogramming was assessed by flow cytometry on day 6.
For ESC-to-TSC differentiation followed by RT-qPCR, L-EPSCs and ESCs were gradually feeder-depleted by passaging every two days onto 0.1% gelatin (porcine skin, 0.1% g/v final, Sigma, G2500), with the feeder percentage reduced stepwise (100%, 75%, 50%, 0%) at each passage, in L-EPSCM and 2iLif, respectively. After complete feeder removal, the cells were cultured at a density of 1 × 10⁵ cells on gelatin-coated 6-well culture plates in TSC medium (as above). The cell culture medium was refreshed every two days and cells were collected every three days from D0 to D12 for RT-qPCR analysis of TSC marker gene expression using the following primers: AGGAGCTGGAGGCCTTTTT CTAGGTTCAGGTAAGCCCAGG Eomes AAAGGCTTCCGGGACAACTA TAATATCGGGCTTGAGGCAA

Flow cytometry
Flow cytometry assay was designed to assess differential protein levels of TSC surface markers ranging ~33 million reads. Sequencing reads were mapped to the mm10 reference genome using STAR 2.5.3a 64 . On average, 77.16% of reads were uniquely mapped, and only those were kept for further analyses. Subsequently, the featureCounts function from the R Bioconductor package "Rsubread" (version 1.5.2) 65 was used to assign mapped reads to genomic features.

Bulk RNA sequencing analysis
Raw read counts were processed as described in 66 . Briefly, the DESeq2 package and the associated protocol 67 were used. Only genes with at least 10 reads in total across all libraries were retained. PCA was performed using the plotPCA function from the DESeq2 package, with the top 500 most variable genes after rlog transformation as input. Unless mentioned otherwise, gene expression was presented as log2 values after size-factor normalization for differences in library size (DESeq2). stage preimplantation embryos was done in two steps. First, stage-specific markers were defined using k-means clustering (SC3) of single-cell data, followed by differential expression analysis between the clusters. Second, these markers were used for the comparison between L-EPSC timepoints and the corresponding embryo stages in the averaged single-cell data.
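
As a rough illustration of the count filtering and variable-gene selection described in this section, the following Python sketch applies the same three ideas to a toy count table: genes with fewer than 10 reads in total are dropped, libraries are scaled by median-of-ratios size factors, and genes are ranked by variance. It is not the DESeq2 workflow used in this study. As labeled assumptions, log2(normalized count + 1) stands in for the rlog transformation, the size-factor calculation is a simplified re-implementation, the function names and toy counts are hypothetical, and the PCA step itself is omitted.

```python
import math
import statistics


def filter_genes(counts, min_total=10):
    """Keep genes whose reads summed over all libraries reach min_total."""
    return {gene: row for gene, row in counts.items() if sum(row) >= min_total}


def size_factors(counts):
    """Median-of-ratios size factors, using genes detected in every library.

    Simplified stand-in for the DESeq2 estimate (assumption); it raises
    StatisticsError if no gene is detected in all libraries.
    """
    rows = [row for row in counts.values() if all(c > 0 for c in row)]
    n_libs = len(next(iter(counts.values())))
    factors = []
    for j in range(n_libs):
        log_ratios = []
        for row in rows:
            log_geo_mean = sum(math.log(c) for c in row) / len(row)
            log_ratios.append(math.log(row[j]) - log_geo_mean)
        factors.append(math.exp(statistics.median(log_ratios)))
    return factors


def log2_normalized(counts, factors):
    """log2(count / size factor + 1); a simple substitute for rlog (assumption)."""
    return {
        gene: [math.log2(c / f + 1.0) for c, f in zip(row, factors)]
        for gene, row in counts.items()
    }


def top_variable_genes(values, n_top=500):
    """Return up to n_top gene names ranked by variance across libraries."""
    ranked = sorted(values, key=lambda g: statistics.pvariance(values[g]), reverse=True)
    return ranked[:n_top]


# Hypothetical toy data: three libraries, five genes.
counts = {
    "geneA": [0, 0, 0],        # total 0, removed by the filter
    "geneB": [3, 3, 3],        # total 9, removed by the filter
    "geneC": [10, 20, 40],
    "geneD": [100, 200, 400],
    "geneE": [50, 100, 800],
}
kept = filter_genes(counts)
factors = size_factors(kept)
logged = log2_normalized(kept, factors)
top = top_variable_genes(logged, n_top=2)
```

In this toy table, geneC and geneD differ between libraries only by sequencing depth, so after normalization they are flat and geneE is ranked as the most variable gene. A PCA would then be run on the selected genes only.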

Library preparation for single-cell RNA sequencing
Single cells from stem cell cultures were sorted by FACS into 384-well plates containing lysis and RT, whereas embryo cells were manually picked and directly dispensed into lysis buffer containing RT. The current study generated cDNA libraries using the Smart-seq2 protocol, as previously described 68,69 .

Single-cell RNA sequencing data pre-processing and quality control
Smart-Seq2 read files (E2.5 and E4.5 embryos, and three pluripotent stem cell conditions: ESCs cultured in 2iLif, L-EPSCs, and D-EPSCs) were mapped to the mouse reference genome (mm10) using STAR aligner 64

Single-cell gene expression analysis of merged datasets
Analysis of the filtered data was conducted in R version 3.

Gene Regulatory Network inference
Gene regulatory networks were inferred using pySCENIC (0.9.15; Python implementation of SCENIC) 40 in Python version 3.6.9. First, raw expression data were normalized by dividing the feature counts of each cell by the total counts for that cell and multiplying by a factor of 10,000, followed by log1p transformation. Subsequently, normalized counts were used to generate co-
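
The per-cell normalization described in this section (division by the cell's total counts, scaling by 10,000, then log1p) can be sketched in a few lines of plain Python. This is a minimal illustration only, not the code used in the study: the function name, the cell names and the toy counts are hypothetical, and returning zeros for a cell with no counts is an assumption made here to avoid division by zero.

```python
import math


def normalize_cell(counts, scale=10_000):
    """Library-size normalize one cell's counts, then apply log1p.

    Each count is divided by the cell's total, multiplied by `scale`,
    and transformed with log(1 + x).
    """
    total = sum(counts)
    if total == 0:
        # Assumption: an empty cell is returned as all zeros.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale) for c in counts]


# Hypothetical toy data: two cells, three genes.
cells = {
    "cell_1": [5, 0, 15],
    "cell_2": [1, 1, 2],
}
normalized = {name: normalize_cell(c) for name, c in cells.items()}
```

Because every cell is rescaled to the same total before the log transform, cells sequenced to different depths become directly comparable, and zero counts remain exactly zero.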

Immunofluorescent staining of E4.5 and E6.25 chimeras
Whole mount immunofluorescence staining of E4.5 embryos was performed as previously Scientific) and placed between two cover glasses for imaging. E6.25 embryos were mounted in agarose plugs using 1% low melting agar (Sigma).

Immunofluorescent staining of E12.5 placentas
Placentas were dissected from pregnant females at E12.5, washed briefly in ice cold PBS and fixed in 4% PFA overnight. Depending on the experiment, embryos and placentas were prescreened for contribution to individual compartments using a fluorescent stereomicroscope.
The following day, placentas were embedded, frozen and sectioned (10 μm) starting at the sagittal plane. Sections were blocked with 10% horse serum in PBS for 1 h at room temperature and incubated overnight at 4°C with primary antibodies diluted in 5% horse serum in PBS. DsRed chimeric placentas were stained with anti-mCherry/Rfp antibody and H2B-Gfp chimeric placentas with anti-GFP. For detection of fetal endothelial cells, placentas were first subjected to antigen retrieval using 10 mM citrate buffer, pH 6.0, at 100°C for 10 minutes, then cooled, blocked, and co-stained with anti-CD34. The following day, sections were stained with the appropriate secondary antibodies for 1 h at room temperature, washed in PBS, and finally counterstained with DAPI (Sigma). Sections exposed to secondary antibody alone were used as negative controls. Antibody specificity for mCherry/Rfp and Gfp was confirmed on non-chimeric placentas.

Data access
All raw and processed sequencing data generated in this study have been submitted to the NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE145609.