---
layout: section
color: navy
---
<div class="w-7/10 mx-auto text-center">

## Technical Roadmap and Analysis Progress

<hr/>

Current status of the analyses

</div>
---
layout: side-title
color: navy
align: rm-lm
titlewidth: is-3
title: Technical Roadmap (Part 1)
---
::title::
# Technical Roadmap (Part 1)
::content::
```mermaid
flowchart TD
classDef default fill:#ECF5FF,stroke:#409eff,stroke-width:2px,color:#409eff;
classDef done fill:#F0F9EB,stroke:#67c23a,stroke-width:2px,color:#67c23a;
classDef warn fill:#FDF6EC,stroke:#e6a23c,stroke-width:2px,color:#e6a23c
subgraph subgraph1[Step 1. Data preprocessing and assembly]
direction LR
RawData[Raw data from SRA]:::warn --> CleanData[Read cleaning and QC<br>fastp]:::done
CleanData -- sea buckthorn samples, with reference --> AlignData[Align to reference genome<br>HISAT2]:::done
AlignData --> Assembly[Assemble transcripts, assess assemblies with N50 and BUSCO<br>Trinity]:::done
CleanData -- Elaeagnus, no reference --> Assembly
end
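subgraph subgraph2[Step 2. Homologous gene set clustering]
direction LR
ass[Transcripts]:::warn --> CDS[Predict CDS, taking the longest ORF of the longest transcript as the input CDS<br>TD2]:::done
CDS --> reducedCDS[Remove redundant CDS within each sample<br>CD-HIT]:::done
reducedCDS --> ortho[Cluster gene families into a low-copy homologous gene set<br>OrthoFinder]:::done
ortho --> homolog[Add jujube homologs to the low-copy homologous gene set<br>HMMER]:::done
end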
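subgraph subgraph3[Step 3. Gene set filtering and gene trees]
direction LR
homo[Low-copy homologous gene set]:::warn --> alignment[Protein-guided alignment, MACSE<br>Alignment trimming, trimAl]:::done
alignment --> treeshrink[Quick tree building with FastTree<br>Filter gene sets by outlier long branches with TreeShrink and by alignment length]:::done
treeshrink --> modeltest[Substitution model selection<br>ModelTest-NG]:::done
modeltest --> raxml[Build ML single-gene trees]:::done
modeltest --> mrbayes[Build Bayesian single-gene trees]:::done
raxml --> trees[Filter gene trees and their sequences by MrBayes convergence]:::done
mrbayes --> trees
end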
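subgraph subgraph4[Step 4. Species phylogeny]
direction LR
tree[Filtered single-gene trees and their sequences]:::warn -- ML gene trees --> coalescence[Coalescent species tree<br>ASTER]
tree -- ML gene trees --> phyparts[Gene tree discordance visualization]
tree -- gene sequences --> concatenation[Concatenation tree<br>IQ-TREE 3]
tree -- MrBayes results per gene --> CFs[Gene tree concordance factors<br>BUCKy]:::done
CFs --> network[Phylogenetic network<br>SNaQ]:::done
coalescence --> network
end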
subgraph note[Legend]
direction TB
done[Completed analysis]:::done
warn[Key intermediate data]:::warn
default[Analysis not yet completed]
end
subgraph1 --> subgraph2
subgraph2 --> subgraph3
subgraph3 --> subgraph4
subgraph4 ~~~ note
```
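---

Step 1 assesses the Trinity assemblies with N50 and BUSCO. As a reminder of what the N50 statistic measures, here is a minimal sketch in Python. The function name `n50` and the toy lengths are illustrative assumptions, not part of the actual pipeline, which would normally rely on an existing assembly-statistics tool.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical transcript lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # prints 400
```

N50 only describes contiguity, which is why it is paired with BUSCO as a check on gene-content completeness.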
---
layout: side-title
color: navy
align: rm-lm
titlewidth: is-3
title: Technical Roadmap (Part 2)
---
::title::
# Technical Roadmap (Part 2)
::content::
```mermaid
flowchart TD
classDef default fill:#ECF5FF,stroke:#409eff,stroke-width:2px,color:#409eff;
classDef done fill:#F0F9EB,stroke:#67c23a,stroke-width:2px,color:#67c23a;
classDef warn fill:#FDF6EC,stroke:#e6a23c,stroke-width:2px,color:#e6a23c
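subgraph subgraph5[Step 5. Chloroplast phylogeny]
direction LR
cleanData[Clean reads]:::done --> alignCp[Align to reference chloroplast genome<br>HISAT2]:::done
alignCp --> callExon[Extract aligned exon sequences<br>SAMtools]
callExon --> cpTree[Concatenate high-quality exon fragments and build the chloroplast genome phylogeny<br>IQ-TREE 3]
cpTree --> divergence[Estimate chloroplast genome divergence times<br>BEAST]
end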
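subgraph subgraph6[Step 6. Sea buckthorn model analysis]
direction LR
genes[Filtered low-copy homologous gene set]:::warn --> filterGenes[Select gene sets based on ML trees for downstream modeling]
filterGenes --> mcmcTree[Estimate divergence times with MCMCTree]
mcmcTree --> fitModel[Fit models to the distribution of interspecific divergence times]
fitModel --> integrate[Integrate results across models into final divergence time estimates]
end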
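subgraph subgraph7[Step 7. BPP simulation model analysis]
direction LR
bpp[Gene sets simulated under the coalescent with BPP]:::warn --> mcmcTreeBPP[Estimate divergence times with MCMCTree]
mcmcTreeBPP --> fitModelBPP[Fit models to the distribution of interspecific divergence times]
fitModelBPP --> integrateBPP[Integrate results across models into final divergence time estimates]
end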
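subgraph subgraph8[Step 8. Snake data analysis]
direction LR
snakeData[Snake data]:::warn --> mcmcTreeSnake[Estimate divergence times with MCMCTree]
mcmcTreeSnake --> fitModelSnake[Fit models to the distribution of interspecific divergence times]
fitModelSnake --> integrateSnake[Integrate results across models into final divergence time estimates]
end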
subgraph note[Legend]
direction TB
done[Completed analysis]:::done
warn[Key intermediate data]:::warn
default[Analysis not yet completed]
end
subgraph5 --> subgraph6
subgraph6 --> subgraph7
subgraph7 --> subgraph8
subgraph8 ~~~ note
```
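---

Step 5 concatenates high-quality exon fragments before tree building. The sketch below shows one plausible way to do that concatenation in Python, assuming each exon is already aligned and stored as a mapping from taxon name to sequence. The function name, the taxon labels, and the gap-padding of missing taxa are assumptions made for illustration, not the pipeline's actual code.

```python
def concatenate_alignments(alignments, gap="-"):
    """Concatenate per-exon alignments into one supermatrix.

    alignments: list of dicts mapping taxon name -> aligned sequence;
    all sequences within one dict must have the same length.
    Taxa absent from an exon are padded with gap characters.
    Returns (supermatrix dict, list of 1-based (start, end) partitions).
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    position = 0
    for aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("sequences in one alignment differ in length")
        width = widths.pop()
        for taxon in taxa:
            # Use the aligned sequence if present, otherwise all gaps.
            pieces[taxon].append(aln.get(taxon, gap * width))
        partitions.append((position + 1, position + width))
        position += width
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Hypothetical toy exons, for illustration only.
exon1 = {"sp_A": "ATGC", "sp_B": "ATGT"}
exon2 = {"sp_A": "GGA", "sp_C": "GGT"}
supermatrix, partitions = concatenate_alignments([exon1, exon2])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
print(partitions)
```

The returned partition coordinates can be written out as a partition file, so each exon can be given its own substitution model in the concatenated analysis.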