Takes one or more STRUCTURE, TESS, BAPS, BASIC (numeric delimited runs) or CLUMPP format files and converts them to a qlist (list of dataframes).
readQ(files = NULL, filetype = "auto", indlabfromfile = FALSE, readci = FALSE)
Argument | Description |
---|---|
files | A character or character vector of one or more files. |
filetype | A character indicating input filetype. Options are 'auto', 'structure','tess2','baps','basic' or 'clumpp'. See details. |
indlabfromfile | A logical indicating if individual labels must be read from input file and used as row names for resulting dataframe. Spaces in labels may be replaced with _. Currently only applicable to STRUCTURE runs. |
readci | A logical indicating if confidence intervals from the STRUCTURE run file (if available) should be read. Set to FALSE by default because it takes up extra space. This argument is only applicable to STRUCTURE run files. |
A list of dataframes (a qlist) is returned. List items are named by
input filename, with file extensions such as '.txt', '.csv', '.tsv' and '.meanQ'
removed. If filenames are missing or not available,
list items are named sample1, sample2 etc. For STRUCTURE runs, if individual
labels are present in the run file and indlabfromfile=TRUE, they are
added to the dataframe as row names. STRUCTURE metadata including loci,
burnin, reps, elpd, mvll, and vll is added as attributes to each dataframe.
When readci=TRUE and CI data is available in STRUCTURE run files,
it is read in and attached as an attribute named ci.
For CLUMPP files, multiple runs within one file are suffixed by -1, -2 etc.
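The naming and attribute conventions described above can be illustrated in base R without calling readQ. This is a minimal sketch, not pophelper's implementation: the file names, the strip_ext helper, the metadata values and the number of CLUMPP runs are all made up for illustration.

```r
# Hypothetical input file names (not shipped with pophelper)
files <- c("runs/structure_01.txt", "runs/admix_k2.meanQ", "runs/tess_run.csv")

# Assumed helper: drop the directory and a trailing .txt/.csv/.tsv/.meanQ
strip_ext <- function(x) sub("\\.(txt|csv|tsv|meanQ)$", "", basename(x))
run_names <- strip_ext(files)

# A stand-in for one run: a dataframe of assignment probabilities
df <- data.frame(Cluster1 = c(0.2, 0.4), Cluster2 = c(0.8, 0.6))

# Attach illustrative metadata as attributes (values are invented)
attr(df, "burnin") <- 10000L
attr(df, "reps") <- 50000L

# Build a qlist-like named list and read the metadata back
qlist <- setNames(list(df, df, df), run_names)
names(qlist)
attr(qlist[[1]], "burnin")

# Runs inside one CLUMPP file get a numeric suffix (3 runs assumed here)
clumpp_names <- paste0(strip_ext("STRUCTUREpop_K4-combined.txt"), "-", 1:3)
clumpp_names
```

On a real qlist returned by readQ, the same attr() and attributes() calls would be the natural way to inspect whatever metadata was attached.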
STRUCTURE, TESS2 and BAPS run files each have a unique layout and format (see the
vignette). BASIC files can be Admixture run files, fastStructure meanQ files
or any tab-delimited, space-delimited or comma-delimited tabular data
without a header. CLUMPP files can be COMBINED, ALIGNED or MERGED files.
COMBINED files are generated by clumppExport. ALIGNED and
MERGED files are generated by CLUMPP.
To convert TESS3 R objects to a pophelper qlist, see readQTess3.
See the vignette for more details.
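To make the BASIC format concrete, the sketch below writes a small headerless, space-delimited table to a temporary file and reads it back with base R. It only illustrates the file shape and the resulting dataframe; it is not how readQ parses files internally, and the Cluster1, Cluster2 column names are assumed from the pattern shown in the examples.

```r
# Write a small space-delimited table with no header to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("0.2 0.8", "0.4 0.6", "0.6 0.4"), tmp)

# Read it as plain numeric columns; header = FALSE because the file has none.
# The default sep = "" splits on any whitespace (spaces or tabs);
# a comma-delimited file would need sep = "," instead.
q <- read.table(tmp, header = FALSE)

# Assumed column naming, following the Cluster1, Cluster2 pattern
colnames(q) <- paste0("Cluster", seq_len(ncol(q)))

# Wrap in a named list so it has the shape of a one-element qlist
basic_qlist <- list(sample1 = q)
str(basic_qlist)

# Clean up the temporary file
unlink(tmp)
```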
# STRUCTURE files
sfiles <- list.files(path=system.file("files/structure",package="pophelper"), full.names=TRUE)
# create a qlist of all runs
slist <- readQ(sfiles)
slist <- readQ(sfiles,filetype="structure")
# use ind names from file
slist <- readQ(sfiles[1],indlabfromfile=TRUE)
# access the first run (note: slist now holds a single dataframe, not a qlist)
slist <- readQ(sfiles)[[1]]
# access names (here, the cluster columns of that run)
names(slist)
#> [1] "Cluster1" "Cluster2"
# access attributes of the first element
attributes(slist[[1]])
#> NULL
# access attributes of all elements
lapply(slist,attributes)
#> $Cluster1
#> NULL
#>
#> $Cluster2
#> NULL
#>

# TESS files
tfiles <- list.files(path=system.file("files/tess",package="pophelper"), full.names=TRUE)
# create a qlist
tlist <- readQ(tfiles)

# BASIC files
afiles <- list.files(path=system.file("files/admixture",package="pophelper"), full.names=TRUE)
# create a qlist
alist <- readQ(afiles)

# CLUMPP files
cfiles1 <- system.file("files/STRUCTUREpop_K4-combined.txt", package="pophelper")
cfiles2 <- system.file("files/STRUCTUREpop_K4-combined-aligned.txt", package="pophelper")
cfiles3 <- system.file("files/STRUCTUREpop_K4-combined-merged.txt", package="pophelper")
# create a qlist
clist1 <- readQ(cfiles1)
clist2 <- readQ(cfiles2)
clist3 <- readQ(cfiles3)

# manually create qlist
df1 <- data.frame(Cluster1=c(0.2,0.4,0.6,0.2),Cluster2=c(0.8,0.6,0.4,0.8))
df2 <- data.frame(Cluster1=c(0.3,0.1,0.5,0.6),Cluster2=c(0.7,0.9,0.5,0.4))
# one-element qlist
q1 <- list("sample1"=df1)
str(q1)
#> List of 1
#>  $ sample1:'data.frame': 4 obs. of 2 variables:
#>   ..$ Cluster1: num [1:4] 0.2 0.4 0.6 0.2
#>   ..$ Cluster2: num [1:4] 0.8 0.6 0.4 0.8
# two-element qlist
q2 <- list("sample1"=df1,"sample2"=df2)
str(q2)
#> List of 2
#>  $ sample1:'data.frame': 4 obs. of 2 variables:
#>   ..$ Cluster1: num [1:4] 0.2 0.4 0.6 0.2
#>   ..$ Cluster2: num [1:4] 0.8 0.6 0.4 0.8
#>  $ sample2:'data.frame': 4 obs. of 2 variables:
#>   ..$ Cluster1: num [1:4] 0.3 0.1 0.5 0.6
#>   ..$ Cluster2: num [1:4] 0.7 0.9 0.5 0.4
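When a qlist is built by hand as in the last example, a quick sanity check can catch malformed input before it is used further. The sketch below is one possible check in base R. The check_qlist helper is hypothetical (it is not a pophelper function), and the requirement that each row sums to 1 is an assumption based on the values in the manual example.

```r
# Hand-built runs, using the same values as the manual example
df1 <- data.frame(Cluster1 = c(0.2, 0.4, 0.6, 0.2), Cluster2 = c(0.8, 0.6, 0.4, 0.8))
df2 <- data.frame(Cluster1 = c(0.3, 0.1, 0.5, 0.6), Cluster2 = c(0.7, 0.9, 0.5, 0.4))
q2 <- list(sample1 = df1, sample2 = df2)

# Hypothetical helper: returns one TRUE/FALSE per run
check_qlist <- function(x, tol = 1e-6) {
  # The container must be a named list
  stopifnot(is.list(x), !is.null(names(x)))
  vapply(x, function(d) {
    is.data.frame(d) &&
      all(vapply(d, is.numeric, logical(1))) &&  # every column is numeric
      all(abs(rowSums(d) - 1) < tol)             # assumed: rows sum to 1
  }, logical(1))
}

check_qlist(q2)
```

A tolerance is used for the row sums because floating-point addition may not give exactly 1.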