DNA copy number aberrations (CNAs) are of biological and medical interest because they help identify regulatory mechanisms underlying tumor initiation and evolution. [...] profile and generates a sequence of simplicial complexes, mathematical objects that generalize the concept of a graph. This representation permits segmenting the data at different resolutions and identifying CNAs by interrogating the topological properties of these simplicial complexes. We tested our approach on a published dataset with the goal of identifying breast cancer CNAs associated with specific molecular subtypes. CNAs associated with each subtype were identified by analyzing that subtype separately from the others and taking the remaining subtypes as the control. We found a new amplification in 11q at the location of the progesterone receptor in the Luminal A subtype. Aberrations in the Luminal B subtype were found only upon removal of the basal-like subtype from the control set. Under those conditions, all regions found in the original publication, except for 17q, were confirmed; all aberrations, except those in chromosome arms 8q and 12q, were confirmed in the basal-like subtype. These two chromosome arms, however, were detected only upon removal of three patients with exceedingly large copy number values. More importantly, we detected 10 and 21 additional regions in the Luminal B and basal-like subtypes, respectively. Most of the additional regions were validated on an independent dataset, using GISTIC, or both. Furthermore, we found three new CNAs in the basal-like subtype: a combination of gains and losses in 1p, a gain in 2p, and a loss in 14q.
Based on these results, we suggest that topological approaches that incorporate multiresolution analyses and interrogate topological properties of the data can help identify copy number changes in cancer.

[...] for clones inside an aberration and of mean [...] for clones outside any aberration. The standard deviation was constant for all clones in any given simulation. The mean value of an aberration was [...] and the length of an aberration [...]

[...] and Luminal B [...] bps away from the position reported in ENSEMBL. Finally, we removed 28 clones that were on the correct chromosomes but had relative positions inconsistent with those of their immediate neighboring clones. We imputed missing values using locally weighted scatterplot smoothing (lowess) [47]. Entries of clones that mapped to the same location were averaged.

2.3. Detection of Focal Copy Number Aberrations Using TAaCGH

Here, we extend the method initially proposed in [22,48] to analyze microarray data (see the Conclusions Section for a detailed explanation of the new features reported in this work). For a chosen section of copy number values, TAaCGH associates a point cloud in a Euclidean n-dimensional coordinate system [...] from a section of copy number values (see Figure 1). Any three consecutive copy number values naturally define a point in three-dimensional space with coordinates [...] is shown in Figure 1B with the points connected by edges (see below for an explanation of the meaning of the edges).

A number of features can be noticed when the data are represented as a point cloud. First, the associated point cloud has an elliptical shape, because consecutive copy number values are correlated. By contrast, when TAaCGH was applied to gene expression profiles, we observed that the associated clouds were spherical, owing to the lack of correlation between expression values of consecutive genes along the genome [48].
Second, consecutive gains are mapped to the octant with all positive values, consecutive losses to the octant with all negative values, and windows that combine positive and negative values to the other octants. Third, the higher the absolute value of the gain or loss, the further the corresponding points lie from the origin. Consequently, the noise in the data is mapped near the origin.

Figure 1. Generation of a point [...]
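The sliding-window mapping and the three features described above can be illustrated with a short sketch. This is not the TAaCGH implementation: the function names (`sliding_window_cloud`, `octant_signs`, `distance_from_origin`) and the profile values are hypothetical and chosen only for illustration, and the window size of three follows the three-dimensional case discussed in the text.

```python
from math import sqrt


def sliding_window_cloud(values, n=3):
    """Map a profile to points whose coordinates are n consecutive values."""
    if n < 1:
        raise ValueError("window size n must be at least 1")
    return [tuple(values[i:i + n]) for i in range(len(values) - n + 1)]


def octant_signs(point):
    """Return the sign pattern of a point, e.g. ('+', '+', '+')."""
    return tuple('+' if c > 0 else '-' if c < 0 else '0' for c in point)


def distance_from_origin(point):
    """Euclidean distance of a point from the origin."""
    return sqrt(sum(c * c for c in point))


# Hypothetical profile (illustrative values only): noise, a gain, noise, a loss.
profile = [0.05, -0.03, 0.02, 0.9, 1.1, 1.0, 0.04, -0.8, -0.9, -1.0]
cloud = sliding_window_cloud(profile, n=3)

for point in cloud:
    print(point, octant_signs(point), round(distance_from_origin(point), 2))
```

In this toy profile, the window covering the three gained values lands in the all-positive octant, the window covering the three lost values lands in the all-negative octant, and the window made of noise alone stays close to the origin.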