layout: section
color: navy

Technical Roadmap and Analysis Progress


Current status of the analysis


layout: side-title
color: navy
align: rm-lm
titlewidth: is-3
title: Technical Roadmap (Part 1)


::title::

Technical Roadmap (Part 1)

::content::

flowchart TD

classDef default fill:#ECF5FF,stroke:#409eff,stroke-width:2px,color:#409eff;
classDef done fill:#F0F9EB,stroke:#67c23a,stroke-width:2px,color:#67c23a;
classDef warn fill:#FDF6EC,stroke:#e6a23c,stroke-width:2px,color:#e6a23c

subgraph subgraph1[Step 1. Data preprocessing and assembly]
    direction LR
    RawData["Raw data (from SRA)"]:::warn --> CleanData["Read cleaning and QC (fastp)"]:::done
    CleanData -- "Sea buckthorn samples, reference-based" --> AlignData["Align to reference genome (HISAT2)"]:::done
    AlignData --> Assembly["Assemble transcripts (Trinity)<br>Evaluate assembly with N50 and BUSCO"]:::done
    CleanData -- "Elaeagnus, de novo" --> Assembly
end

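subgraph subgraph2[Step 2. Homologous gene set clustering]
    direction LR
    ass[Transcripts]:::warn --> CDS["Predict CDS (TD2)<br>Use the longest ORF of the longest transcript as the input CDS"]:::done
    CDS --> reducedCDS["Remove redundant CDS within each sample (CD-HIT)"]:::done
    reducedCDS --> ortho["Cluster gene families into a low-copy homologous gene set (OrthoFinder)"]:::done
    ortho --> homolog["Add jujube homologs to the low-copy homologous gene set (HMMER)"]:::done
end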

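subgraph subgraph3[Step 3. Gene set filtering and gene trees]
    direction LR
    homo[Low-copy homologous gene set]:::warn --> alignment["Protein-guided alignment (MACSE)<br>Trim alignments (trimAl)"]:::done
    alignment --> treeshrink["Quick trees with FastTree<br>Filter gene sets by long branches (TreeShrink) and alignment length"]:::done
    treeshrink --> modeltest["Test substitution models (ModelTest-NG)"]:::done
    modeltest --> raxml[Build single-gene ML trees]:::done
    modeltest --> mrbayes[Build single-gene BI trees]:::done
    raxml --> trees[Filter gene trees and their sequences by MrBayes convergence]:::done
    mrbayes --> trees
end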

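subgraph subgraph4[Step 4. Species phylogeny]
    direction LR
    tree[Filtered single-gene trees and their sequences]:::warn -- "single-gene ML trees" --> coalescence["Coalescent species tree (ASTER)"]
    tree -- "single-gene ML trees" --> phyparts[Visualize gene-tree discordance]
    tree -- "single-gene sequences" --> concatenation["Concatenation tree (IQ-TREE 3)"]
    tree -- "per-gene MrBayes results" --> CFs["Gene-tree concordance factors (BUCKy)"]:::done
    CFs --> network["Phylogenetic network (SNaQ)"]:::done
    coalescence --> network
end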

subgraph note[Legend]
    direction TB
    done[Completed analysis]:::done
    warn[Key intermediate data]:::warn
    default[Pending analysis]
end

subgraph1 --> subgraph2
subgraph2 --> subgraph3
subgraph3 --> subgraph4
subgraph4 ~~~ note
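
Step 1 evaluates the Trinity assemblies with N50 and BUSCO. As a reminder of what the N50 figure means, here is a minimal Python sketch. The function name `n50` and the plain list of lengths are illustrative assumptions, not part of the actual pipeline, which presumably reads this value from an assembly statistics report.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript (or contig) lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical transcript lengths, for illustration only.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

N50 only summarizes contiguity, which is why the roadmap pairs it with BUSCO as a completeness check.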

layout: side-title
color: navy
align: rm-lm
titlewidth: is-3
title: Technical Roadmap (Part 2)


::title::

Technical Roadmap (Part 2)

::content::

flowchart TD

classDef default fill:#ECF5FF,stroke:#409eff,stroke-width:2px,color:#409eff;
classDef done fill:#F0F9EB,stroke:#67c23a,stroke-width:2px,color:#67c23a;
classDef warn fill:#FDF6EC,stroke:#e6a23c,stroke-width:2px,color:#e6a23c

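subgraph subgraph5[Step 5. Chloroplast phylogeny]
    direction LR
    cleanData[Clean reads]:::done --> alignCp["Align to reference chloroplast genome (HISAT2)"]:::done
    alignCp --> callExon["Extract aligned exon sequences (samtools)"]
    callExon --> cpTree["Concatenate high-quality exons and build the chloroplast phylogeny (IQ-TREE 3)"]
    cpTree --> divergence["Estimate chloroplast divergence times (BEAST)"]
end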

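subgraph subgraph6[Step 6. Sea buckthorn model analysis]
    direction LR
    genes[Filtered low-copy homologous gene set]:::warn --> filterGenes[Select gene sets based on ML trees for downstream modeling]
    filterGenes --> mcmcTree[Estimate divergence times with MCMCTree]
    mcmcTree --> fitModel[Fit models to the distribution of interspecific divergence times]
    fitModel --> integrate[Integrate results across models into final divergence-time estimates]
end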

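subgraph subgraph7[Step 7. BPP simulation model analysis]
    direction LR
    bpp[Gene sets from coalescent simulation with BPP]:::warn --> mcmcTreeBPP[Estimate divergence times with MCMCTree]
    mcmcTreeBPP --> fitModelBPP[Fit models to the distribution of interspecific divergence times]
    fitModelBPP --> integrateBPP[Integrate results across models into final divergence-time estimates]
end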

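subgraph subgraph8[Step 8. Snake data analysis]
    direction LR
    snakeData[Snake data]:::warn --> mcmcTreeSnake[Estimate divergence times with MCMCTree]
    mcmcTreeSnake --> fitModelSnake[Fit models to the distribution of interspecific divergence times]
    fitModelSnake --> integrateSnake[Integrate results across models into final divergence-time estimates]
end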

subgraph note[Legend]
    direction TB
    done[Completed analysis]:::done
    warn[Key intermediate data]:::warn
    default[Pending analysis]
end

subgraph5 --> subgraph6
subgraph6 --> subgraph7
subgraph7 --> subgraph8
subgraph8 ~~~ note
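
Step 5 concatenates exon alignments before building the chloroplast tree. The sketch below shows one common way to build such a supermatrix in Python, padding taxa that lack an exon with gaps. The function name `concatenate_alignments`, the dict-based input, and the `exonN` partition names are assumptions made for illustration; the actual inputs prepared for IQ-TREE 3 in this project may differ.

```python
def concatenate_alignments(alignments, missing="-"):
    """Concatenate per-exon alignments into one supermatrix.

    alignments: list of dicts mapping taxon name -> aligned sequence;
                all sequences within one dict must have equal length.
    Returns (supermatrix, partitions), where partitions holds
    1-based inclusive (name, start, end) coordinates per exon.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for index, aln in enumerate(alignments, 1):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"alignment {index} is empty or not rectangular")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa missing from this exon are filled with gap characters.
            pieces[taxon].append(aln.get(taxon, missing * length))
        partitions.append((f"exon{index}", start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


if __name__ == "__main__":
    # Toy alignments with hypothetical taxa A, B, C.
    exons = [{"A": "ATG", "B": "AT-"}, {"A": "CC", "C": "CG"}]
    matrix, parts = concatenate_alignments(exons)
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print(parts)
```

Keeping the partition coordinates alongside the matrix allows each exon to receive its own substitution model later, if a partitioned analysis is wanted.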