The genome coverage analysis of the chromosome contig_1.
The following figure shows the per-base coverage along the reference genome (black line). The blue line indicates the running median. From the normalised coverage, we estimate z-scores at the per-base level. The red lines indicate the z-scores at plus or minus N standard deviations, where N is chosen by the user (default: 4). Only a million points are shown, which may explain some visual discrepancies.
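To make the normalisation concrete, here is a minimal sketch of how a running median and per-base z-scores could be computed. It is illustrative only and not sequana's implementation: the function names are hypothetical, windows are simply truncated at the ends of the sequence (an assumption), and mu and sigma are passed in as given values rather than estimated.

```python
from statistics import median


def running_median(coverage, window):
    """Median of `coverage` in a centred window (truncated at both ends)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [
        median(coverage[max(0, i - half): i + half + 1])
        for i in range(len(coverage))
    ]


def per_base_zscores(coverage, window, mu, sigma):
    """z = (coverage / running median - mu) / sigma for every base."""
    medians = running_median(coverage, window)
    zscores = []
    for depth, med in zip(coverage, medians):
        if med == 0:
            zscores.append(float("nan"))  # normalised coverage is undefined
        else:
            zscores.append((depth / med - mu) / sigma)
    return zscores


# Toy data (hypothetical): a single-base dip in otherwise flat coverage.
toy = [10, 10, 10, 2, 10, 10, 10]
print(per_base_zscores(toy, window=3, mu=1.0, sigma=0.1))
```

In this toy input only the dipped base gets a strongly negative z-score, because the running median ignores the isolated drop.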
Here are some basic statistics about the genome coverage.
The following tables give regions of interest detected by sequana. Here are the definitions of the columns:
Regions with a z-score lower than -2.00 and at least one base with a z-score lower than -4.00 are detected. There are 58 low regions of interest.
| chr | start | end | size | mean_cov | max_cov | mean_rm | mean_zscore | max_zscore | log2_ratio | link |
|---|---|---|---|---|---|---|---|---|---|---|
| contig_1 | 162488 | 165525 | 3037 | 50.4 | 58 | 66.8 | -3.66 | -5.53 | -0.407 | subplots/contig_1_1_200001.html |
| contig_1 | 194264 | 194700 | 436 | 51.8 | 53 | 64 | -2.84 | -4.87 | -0.305 | subplots/contig_1_1_200001.html |
| contig_1 | 219275 | 222426 | 3151 | 46.1 | 51 | 63.6 | -4.08 | -6.24 | -0.464 | subplots/contig_1_200001_400001.html |
| contig_1 | 286026 | 286211 | 185 | 50.8 | 51 | 69 | -3.92 | -4.3 | -0.442 | subplots/contig_1_200001_400001.html |
| contig_1 | 288186 | 288187 | 1 | 51 | 51 | 70 | -4.03 | -4.03 | -0.457 | subplots/contig_1_200001_400001.html |
| contig_1 | 289004 | 289005 | 1 | 46 | 46 | 71 | -5.22 | -5.22 | -0.626 | subplots/contig_1_200001_400001.html |
| contig_1 | 334865 | 335193 | 328 | 39.8 | 41 | 56 | -4.3 | -5.29 | -0.493 | subplots/contig_1_200001_400001.html |
| contig_1 | 341212 | 342104 | 892 | 35.9 | 40 | 56 | -5.31 | -6.61 | -0.641 | subplots/contig_1_200001_400001.html |
| contig_1 | 496662 | 496663 | 1 | 63 | 63 | 92 | -4.68 | -4.68 | -0.546 | subplots/contig_1_400001_600001.html |
| contig_1 | 513253 | 513988 | 735 | 73.5 | 76 | 89 | -2.61 | -4.01 | -0.277 | subplots/contig_1_400001_600001.html |
| contig_1 | 557720 | 563903 | 6183 | 22.2 | 31 | 40 | -6.58 | -10.3 | -0.849 | subplots/contig_1_400001_600001.html |
| contig_1 | 617912 | 621902 | 3990 | 38.1 | 42 | 51 | -3.76 | -5.23 | -0.421 | subplots/contig_1_600001_800001.html |
| contig_1 | 680069 | 680070 | 1 | 65 | 65 | 91 | -4.24 | -4.24 | -0.485 | subplots/contig_1_600001_800001.html |
| contig_1 | 718928 | 720118 | 1190 | 64.2 | 68 | 85 | -3.63 | -4.88 | -0.404 | subplots/contig_1_600001_800001.html |
| contig_1 | 767151 | 767329 | 178 | 40.5 | 41 | 55 | -3.91 | -4.05 | -0.44 | subplots/contig_1_600001_800001.html |
| contig_1 | 830855 | 830856 | 1 | 45 | 45 | 62 | -4.07 | -4.07 | -0.462 | subplots/contig_1_800001_1000001.html |
| contig_1 | 834937 | 834949 | 12 | 48.8 | 50 | 62 | -3.18 | -5.26 | -0.347 | subplots/contig_1_800001_1000001.html |
| contig_1 | 889327 | 889328 | 1 | 37 | 37 | 51 | -4.08 | -4.08 | -0.463 | subplots/contig_1_800001_1000001.html |
| contig_1 | 890687 | 891671 | 984 | 39.7 | 42 | 51 | -3.3 | -4.65 | -0.362 | subplots/contig_1_800001_1000001.html |
| contig_1 | 892452 | 892453 | 1 | 37 | 37 | 51 | -4.08 | -4.08 | -0.463 | subplots/contig_1_800001_1000001.html |
| contig_1 | 893139 | 893140 | 1 | 33 | 33 | 51 | -5.23 | -5.23 | -0.628 | subplots/contig_1_800001_1000001.html |
| contig_1 | 910306 | 910569 | 263 | 57.7 | 58 | 77 | -3.72 | -4.43 | -0.415 | subplots/contig_1_800001_1000001.html |
| contig_1 | 999325 | 1000116 | 791 | 38.3 | 41 | 49 | -3.24 | -4.54 | -0.354 | subplots/contig_1_800001_1000001.html |
| contig_1 | 1000712 | 1002268 | 1556 | 37.2 | 42 | 49 | -3.58 | -4.84 | -0.398 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1028167 | 1031441 | 3274 | 44.7 | 49 | 63.4 | -4.39 | -5.88 | -0.505 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1058763 | 1058764 | 1 | 53 | 53 | 73 | -4.07 | -4.07 | -0.462 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1077175 | 1077376 | 201 | 54.5 | 56 | 67.4 | -2.84 | -4.43 | -0.304 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1177678 | 1178305 | 627 | 53.6 | 55 | 66.3 | -2.85 | -4.94 | -0.306 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1327784 | 1327785 | 1 | 40 | 40 | 57 | -4.43 | -4.43 | -0.511 | subplots/contig_1_1200001_1400001.html |
| contig_1 | 1329689 | 1329690 | 1 | 41 | 41 | 57 | -4.17 | -4.17 | -0.475 | subplots/contig_1_1200001_1400001.html |
| contig_1 | 1387057 | 1387058 | 1 | 36 | 36 | 53 | -4.76 | -4.76 | -0.558 | subplots/contig_1_1200001_1400001.html |
| contig_1 | 1398359 | 1401714 | 3355 | 34.1 | 37 | 45 | -3.59 | -4.94 | -0.399 | subplots/contig_1_1200001_1400001.html |
| contig_1 | 1470762 | 1470763 | 1 | 36 | 36 | 52 | -4.57 | -4.57 | -0.531 | subplots/contig_1_1400001_1600001.html |
| contig_1 | 1481617 | 1483671 | 2054 | 38.1 | 40 | 52.5 | -4.07 | -5.59 | -0.463 | subplots/contig_1_1400001_1600001.html |
| contig_1 | 1567390 | 1567391 | 1 | 32 | 32 | 45 | -4.29 | -4.29 | -0.492 | subplots/contig_1_1400001_1600001.html |
| contig_1 | 1569248 | 1569249 | 1 | 33 | 33 | 46 | -4.2 | -4.2 | -0.479 | subplots/contig_1_1400001_1600001.html |
| contig_1 | 1588229 | 1588831 | 602 | 36.5 | 38 | 46 | -3.08 | -4.2 | -0.334 | subplots/contig_1_1400001_1600001.html |
| contig_1 | 1625297 | 1625298 | 1 | 40 | 40 | 55 | -4.05 | -4.05 | -0.459 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1626490 | 1626491 | 1 | 36 | 36 | 53 | -4.76 | -4.76 | -0.558 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1671929 | 1675107 | 3178 | 39.9 | 43 | 53 | -3.67 | -5.59 | -0.409 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1689714 | 1689715 | 1 | 41 | 41 | 59 | -4.53 | -4.53 | -0.525 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1689763 | 1689764 | 1 | 40 | 40 | 59 | -4.78 | -4.78 | -0.561 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1707339 | 1707757 | 418 | 38.9 | 41 | 53.3 | -4 | -4.76 | -0.453 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1757843 | 1762538 | 4695 | 43 | 49 | 58 | -3.84 | -5.62 | -0.431 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1844741 | 1844742 | 1 | 38 | 38 | 56 | -4.77 | -4.77 | -0.559 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 1901216 | 1901217 | 1 | 48 | 48 | 67 | -4.21 | -4.21 | -0.481 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 1903392 | 1903746 | 354 | 50.9 | 52 | 66.9 | -3.56 | -4.87 | -0.395 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 1934221 | 1934222 | 1 | 36 | 36 | 50 | -4.16 | -4.16 | -0.474 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 1954556 | 1954557 | 1 | 31 | 31 | 49 | -5.44 | -5.44 | -0.661 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 2006320 | 2006321 | 1 | 43 | 43 | 64 | -4.87 | -4.87 | -0.574 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2028301 | 2028302 | 1 | 38 | 38 | 57 | -4.94 | -4.94 | -0.585 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2030535 | 2030583 | 48 | 45.8 | 46 | 57 | -2.94 | -4.43 | -0.317 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2032030 | 2032267 | 237 | 45.4 | 46 | 57 | -3.04 | -4.94 | -0.329 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2093516 | 2093517 | 1 | 46 | 46 | 63 | -4.01 | -4.01 | -0.454 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2143759 | 2144317 | 558 | 45 | 47 | 57 | -3.14 | -5.2 | -0.342 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2163329 | 2163330 | 1 | 44 | 44 | 62 | -4.31 | -4.31 | -0.495 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2176032 | 2176271 | 239 | 51.8 | 53 | 65 | -3.03 | -4.57 | -0.328 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2177438 | 2177439 | 1 | 47 | 47 | 65 | -4.11 | -4.11 | -0.468 | subplots/contig_1_2000001_2200001.html |
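The log2_ratio values in the low-region table are numerically consistent with log2(mean_cov / mean_rm), that is, the ratio of a region's mean coverage to its mean running median. This reading is inferred from the numbers and is not stated in the report. The short sketch below (with a hypothetical helper name) checks it on two single-base rows copied from the table.

```python
import math


def log2_ratio(mean_cov, mean_rm):
    """log2 of mean coverage over mean running median (assumed definition)."""
    return math.log2(mean_cov / mean_rm)


# (mean_cov, mean_rm, reported log2_ratio) for two single-base regions
rows = [
    (51, 70, -0.457),  # contig_1:288186-288187
    (46, 71, -0.626),  # contig_1:289004-289005
]
for mean_cov, mean_rm, reported in rows:
    computed = log2_ratio(mean_cov, mean_rm)
    print(f"computed={computed:.3f} reported={reported}")
```

Single-base rows are used because their mean_cov and mean_rm are exact integers, so rounding in the table does not blur the comparison.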
Regions with a z-score higher than 2.00 and at least one base with a z-score higher than 4.00 are detected. There are 17 high regions of interest.
| chr | start | end | size | mean_cov | max_cov | mean_rm | mean_zscore | max_zscore | log2_ratio | link |
|---|---|---|---|---|---|---|---|---|---|---|
| contig_1 | 480790 | 480963 | 173 | 128 | 130 | 100 | 4.13 | 4.37 | 0.36 | subplots/contig_1_400001_600001.html |
| contig_1 | 1013433 | 1015246 | 1813 | 86 | 91 | 66 | 4.41 | 5.53 | 0.381 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1167922 | 1169982 | 2060 | 83.5 | 88 | 64 | 4.44 | 5.48 | 0.384 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1170369 | 1171840 | 1471 | 86.1 | 90 | 64.4 | 4.93 | 5.94 | 0.42 | subplots/contig_1_1000001_1200001.html |
| contig_1 | 1522745 | 1522864 | 119 | 56.9 | 57 | 44 | 4.29 | 4.31 | 0.372 | subplots/contig_1_1400001_1600001.html |
| contig_1 | 1581436 | 1582693 | 1257 | 61.9 | 64 | 47.7 | 4.35 | 4.97 | 0.377 | subplots/contig_1_1400001_1600001.html |
| contig_1 | 1616788 | 1620446 | 3658 | 68.2 | 72 | 49 | 5.72 | 6.87 | 0.476 | subplots/contig_1_1600001_1800001.html |
| contig_1 | 1836428 | 1836714 | 286 | 73.8 | 75 | 58 | 3.97 | 4.27 | 0.348 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 1837221 | 1837434 | 213 | 74.3 | 75 | 58 | 4.1 | 4.27 | 0.358 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 1838252 | 1839057 | 805 | 74.1 | 75 | 58 | 4.05 | 4.27 | 0.354 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 1908494 | 1908513 | 19 | 76.9 | 77 | 60 | 4.1 | 4.13 | 0.358 | subplots/contig_1_1800001_2000001.html |
| contig_1 | 2060072 | 2060768 | 696 | 82 | 84 | 64 | 4.11 | 4.56 | 0.358 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2061572 | 2061576 | 4 | 82 | 82 | 64 | 4.1 | 4.1 | 0.358 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2061577 | 2061773 | 196 | 82 | 82 | 64 | 4.09 | 4.1 | 0.357 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2184399 | 2184412 | 13 | 77 | 77 | 60 | 4.13 | 4.13 | 0.36 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2184413 | 2186320 | 1907 | 76.5 | 80 | 59.2 | 4.25 | 5.2 | 0.369 | subplots/contig_1_2000001_2200001.html |
| contig_1 | 2188006 | 2188165 | 159 | 75.7 | 77 | 59 | 4.14 | 4.45 | 0.361 | subplots/contig_1_2000001_2200001.html |
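The selection rule quoted for the low and high regions (a run of bases beyond ±2.00 that contains at least one base beyond ±4.00) can be sketched as a double-threshold scan. This is a simplified illustration with hypothetical names, not sequana's implementation: it groups only strictly contiguous bases and does not merge nearby runs. The inner threshold of 2.00 equals the outer threshold 4.0 multiplied by the `--clustering-parameter` value 0.5 given in the command, which is an inference from the report rather than a documented rule here.

```python
def find_regions(zscores, outer=4.0, inner=2.0, high=True):
    """Return half-open (start, end) runs passing the double threshold.

    high=True  : runs with z > inner holding at least one z > outer
    high=False : runs with z < -inner holding at least one z < -outer
    """
    sign = 1.0 if high else -1.0
    regions = []
    start = None   # index where the current run began, if any
    peak = 0.0     # most extreme (signed) z-score seen in the current run
    for i, z in enumerate(zscores):
        value = sign * z
        if value > inner:
            if start is None:
                start, peak = i, value
            else:
                peak = max(peak, value)
        elif start is not None:
            # The run just ended: keep it only if it crossed the outer threshold.
            if peak > outer:
                regions.append((start, i))
            start = None
    if start is not None and peak > outer:
        regions.append((start, len(zscores)))
    return regions


# Toy z-scores (hypothetical values)
z = [0.1, -2.5, -4.6, -3.0, 0.0, -2.2, -2.9, 0.3, 2.4, 5.1]
print(find_regions(z, high=False))
print(find_regions(z, high=True))
```

The half-open convention is chosen to mirror the tables, where size equals end minus start and single-base regions have end = start + 1.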
The following figures show histograms of the genome coverage. Both the X and Y axes are in log scale in the left panel, while only the Y axis is in log scale in the right panel.
The correlation coefficient between the coverage and GC content is -0.114 with a window size of 201bp.
Note: the correlation coefficient lies between -1.0 and 1.0. A coefficient of 0 means no linear correlation, while a coefficient of -1 or 1 means a perfect negative or positive linear correlation between GC content and coverage.
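As an illustration of the quantity reported above, the sketch below computes GC content in a centred sliding window and its Pearson correlation with per-base coverage. The helper names and the toy data are hypothetical, and the edge handling (windows truncated at the ends) is an assumption; sequana's own computation may differ.

```python
import math


def gc_content(sequence, window):
    """Fraction of G/C bases in a centred window (truncated at the ends)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    is_gc = [1 if base in "GCgc" else 0 for base in sequence]
    fractions = []
    for i in range(len(sequence)):
        chunk = is_gc[max(0, i - half): i + half + 1]
        fractions.append(sum(chunk) / len(chunk))
    return fractions


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): coverage dips over the GC-rich stretches.
sequence = "ATATGCGCGCATATATGCGC"
coverage = [30, 31, 29, 28, 22, 20, 19, 20, 21, 24,
            29, 30, 31, 30, 27, 23, 21, 20, 19, 20]
gc = gc_content(sequence, window=5)
print(round(pearson(gc, coverage), 3))
```

A value close to 0, such as the -0.114 reported here, suggests that coverage varies largely independently of local GC content at this window size.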
Distribution of the normalised coverage with the predicted Gaussian. The red line should follow the trend of the bar plot.
Distribution of the z-score (normalised coverage). You should see a Gaussian distribution centered around 0. The estimated parameters of the Gaussian fitted to the normalised coverage are mu=1.00 and sigma=0.07.
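To relate these parameters to the thresholds used in this report, the arithmetic below converts a z-score back to normalised coverage with x = mu + z × sigma, and gives the fraction of bases expected beyond a threshold if the z-scores were exactly standard normal. The function names are hypothetical, the Gaussian assumption is an idealisation, and sigma is taken at its rounded reported value, so the numbers are approximate.

```python
import math


def normalised_coverage_at(z, mu=1.00, sigma=0.07):
    """Normalised coverage corresponding to a given z-score."""
    return mu + z * sigma


def one_sided_tail(z):
    """P(Z > |z|) for a standard normal variable Z."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2.0))


for z in (-4.0, -2.0, 2.0, 4.0):
    coverage_ratio = normalised_coverage_at(z)
    print(f"z={z:+.1f} normalised coverage={coverage_ratio:.2f} "
          f"tail probability={one_sided_tail(z):.2e}")
```

With mu=1.00 and sigma=0.07, z = ±4 corresponds to a normalised coverage of roughly 0.72 and 1.28, that is, coverage about 28% below or above the running median.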
Command used:
sequana_coverage --input-file hifi3/minimap2/hifi3.bed -H 4.0 -L -4.0 --clustering-parameter 0.5 --chunk-size 5000000 --window-gc 201 --mixture-models 2 --output-directory hifi3/sequana_coverage --window-median 20001 --reference-file hifi3/sorted_contigs/hifi3.fasta
Sequana version: 0.18.0.