| layout | color |
|---|---|
| section | navy |
Technical Roadmap and Analysis Progress
Current analysis progress
layout: side-title color: navy align: rm-lm titlewidth: is-3 title: Technical Roadmap (Part 1)
::title::
Technical Roadmap (Part 1)
::content::
flowchart TD
classDef default fill:#ECF5FF,stroke:#409eff,stroke-width:2px,color:#409eff;
classDef done fill:#F0F9EB,stroke:#67c23a,stroke-width:2px,color:#67c23a;
classDef warn fill:#FDF6EC,stroke:#e6a23c,stroke-width:2px,color:#e6a23c
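subgraph subgraph1[Step1. Data preprocessing and assembly]
direction LR
RawData[Raw data from SRA]:::warn --> CleanData[Read cleaning and QC, fastp]:::done
CleanData -- sea buckthorn samples, with reference --> AlignData[Align to reference genome, HISAT2]:::done
AlignData --> Assembly[Assemble transcripts, Trinity]:::done
CleanData -- Elaeagnus, no reference --> Assembly[Assemble transcripts and assess the assembly with N50 and BUSCO, Trinity]:::done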
end
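subgraph subgraph2[Step2. Homologous gene set clustering]
direction LR
ass[Transcripts]:::warn --> CDS[Predict CDS, taking the longest ORF of the longest transcript as the input CDS, TD2]:::done
CDS --> reducedCDS[Remove redundant CDS within each sample, CD-HIT]:::done
reducedCDS --> ortho[Cluster gene families to obtain the low-copy homologous gene set, OrthoFinder]:::done
ortho --> homolog[Add jujube homologs to the low-copy homologous gene set, HMMER]:::done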
end
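subgraph subgraph3[Step3. Gene set filtering and gene trees]
direction LR
homo[Low-copy homologous gene set]:::warn --> alignment[Codon alignment guided by protein translation, MACSE<br>Trim alignment, trimAl]:::done
alignment --> treeshrink[Quick tree building with FastTree, then filter the gene set by long-branch outliers with TreeShrink and by alignment length]:::done
treeshrink --> modeltest[Substitution model selection, ModelTest-NG]:::done
modeltest --> raxml[Build ML single-gene trees]:::done
modeltest --> mrbayes[Build Bayesian single-gene trees]:::done
raxml --> trees[Filter gene trees and their sequences by convergence of the MrBayes runs]:::done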
mrbayes --> trees
end
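subgraph subgraph4[Step4. Species phylogeny]
direction LR
tree[Filtered single-gene trees and their sequences]:::warn -- ML gene trees --> coalescence[Coalescent species tree, ASTER]
tree -- ML gene trees --> phyparts[Gene tree discordance visualization]
tree -- gene sequences --> concatenation[Concatenated species tree, IQ-TREE 3]
tree -- MrBayes output per gene --> CFs[Gene tree concordance factors, BUCKy]:::done
CFs --> network[Phylogenetic network, SNaQ]:::done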
coalescence --> network
end
subgraph note[Node key]
direction TB
done[Completed analysis]:::done
warn[Key intermediate data]:::warn
default[Pending analysis]
end
subgraph1 --> subgraph2
subgraph2 --> subgraph3
subgraph3 --> subgraph4
subgraph4 ~~~ note
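
Step 1 assesses the Trinity assemblies with N50 and BUSCO. As a minimal sketch of the N50 part only, the snippet below computes N50 from a list of transcript lengths. The function name `n50` and the example lengths are illustrative assumptions, not taken from the pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/transcript lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical transcript lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

In practice the lengths would be read from the assembled FASTA file; N50 alone says nothing about completeness, which is why the roadmap pairs it with BUSCO.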
layout: side-title color: navy align: rm-lm titlewidth: is-3 title: Technical Roadmap (Part 2)
::title::
Technical Roadmap (Part 2)
::content::
flowchart TD
classDef default fill:#ECF5FF,stroke:#409eff,stroke-width:2px,color:#409eff;
classDef done fill:#F0F9EB,stroke:#67c23a,stroke-width:2px,color:#67c23a;
classDef warn fill:#FDF6EC,stroke:#e6a23c,stroke-width:2px,color:#e6a23c
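subgraph subgraph5[Step5. Chloroplast phylogeny]
direction LR
cleanData[Clean Reads]:::done --> alignCp[Align to reference chloroplast genome, HISAT2]:::done
alignCp --> callExon[Extract aligned exon sequences, samtools]
callExon --> cpTree[Concatenate high-quality exon fragments and build the chloroplast genome phylogeny, IQ-TREE 3]
cpTree --> divergence[Estimate chloroplast genome divergence times, BEAST]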
end
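subgraph subgraph6[Step6. Sea buckthorn model analysis]
direction LR
genes[Filtered low-copy homologous gene set]:::warn --> filterGenes[Select genes for downstream models based on ML trees]
filterGenes --> mcmcTree[Estimate divergence times with MCMCTree]
mcmcTree --> fitModel[Fit models to the distribution of interspecific divergence times]
fitModel --> integrate[Combine results across models into the final divergence time estimate]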
end
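subgraph subgraph7[Step7. BPP simulation model analysis]
direction LR
bpp[Gene set obtained from coalescent simulation with BPP]:::warn --> mcmcTreeBPP[Estimate divergence times with MCMCTree]
mcmcTreeBPP --> fitModelBPP[Fit models to the distribution of interspecific divergence times]
fitModelBPP --> integrateBPP[Combine results across models into the final divergence time estimate]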
end
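subgraph subgraph8[Step8. Snake data analysis]
direction LR
snakeData[Snake data]:::warn --> mcmcTreeSnake[Estimate divergence times with MCMCTree]
mcmcTreeSnake --> fitModelSnake[Fit models to the distribution of interspecific divergence times]
fitModelSnake --> integrateSnake[Combine results across models into the final divergence time estimate]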
end
subgraph note[Node key]
direction TB
done[Completed analysis]:::done
warn[Key intermediate data]:::warn
default[Pending analysis]
end
subgraph5 --> subgraph6
subgraph6 --> subgraph7
subgraph7 --> subgraph8
subgraph8 ~~~ note
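
Step 5 concatenates exon fragments before tree building. A minimal sketch of that concatenation step follows, assuming each exon alignment is already available as a mapping from taxon name to aligned sequence. The function name `concatenate_alignments` and the toy data are hypothetical; taxa missing from an exon are padded with gaps.

```python
def concatenate_alignments(alignments):
    """Concatenate per-locus alignments into one supermatrix.

    `alignments` is a list of dicts mapping taxon name -> aligned sequence.
    Sequences within one locus must share the same length. Taxa missing
    from a locus are padded with gap characters ('-').
    Returns (supermatrix, partitions), where partitions holds 1-based
    inclusive (start, stop) column ranges, one per locus.
    """
    taxa = sorted({taxon for locus in alignments for taxon in locus})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    offset = 0
    for locus in alignments:
        widths = {len(seq) for seq in locus.values()}
        if len(widths) != 1:
            raise ValueError("sequences in a locus must have equal length")
        width = widths.pop()
        for taxon in taxa:
            pieces[taxon].append(locus.get(taxon, "-" * width))
        partitions.append((offset + 1, offset + width))
        offset += width
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy exon alignments, for illustration only.
exon1 = {"A": "ATG", "B": "ATA"}
exon2 = {"A": "CC", "C": "CT"}
matrix, parts = concatenate_alignments([exon1, exon2])
print(matrix)
print(parts)
```

The returned column ranges can be written out as a partition file so that each exon may receive its own substitution model during tree inference.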