# Synteny analysis of long- vs short-lived sea urchins

Analysis completed by Kate Castellano

Purpose: Look for chromosomal rearrangements in the long- versus short-lived sea urchins to identify novel rearrangements associated with longevity/negligible senescence.

Summary: Pairwise synteny comparisons were inferred between M. franciscanus and S. purpuratus, M. franciscanus and L. variegatus, M. franciscanus and L. pictus, and L. variegatus and L. pictus using MCscan. Version 5.0 of the S. purpuratus gene annotations and version 3.0 of the L. variegatus gene annotations were obtained from Echinobase (https://www.echinobase.org); version 2.0 of the L. pictus gene annotations was obtained from NCBI. LAST (v 1445) was used for genome-wide alignments of coding regions, followed by filtering of tandem duplications and weak hits. Linkage clustering into syntenic blocks and visualization were performed with the MCscan Python workflow (https://github.com/tanghaibao/jcvi/wiki/MCscan-(Python-version)). Microsynteny visualization of the Hox cluster was also completed with MCscan modules.

# *Mesocentrotus franciscanus* versus *Lytechinus variegatus*

MCscan (Python version) program to download: https://github.com/tanghaibao/jcvi

MCscan manual/workflow: https://github.com/tanghaibao/jcvi/wiki/MCscan-%28Python-version%29#dependencies

## Reformat files for MCscan
##Manually Remove noncoding, miscRNA and rRNA sequences
awk '/^>/ {P=index($0,"ncRNA")==0} {if(P) print} ' Lvariegatus.cds > Lvariegatus_test.cds
awk '/^>/ {P=index($0,"miscrna")==0} {if(P) print} ' Lvariegatus_test.cds > Lvariegatus_test2.cds
awk '/^>/ {P=index($0,"rRNA")==0} {if(P) print} ' Lvariegatus_test2.cds > Lvariegatus_test3.cds
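#replace the original file with the filtered one, then remove the intermediate files
#(Lvariegatus_test3.cds no longer exists after the mv, so it needs no rm)
rm Lvariegatus.cds
mv Lvariegatus_test3.cds Lvariegatus.cds
rm Lvariegatus_test.cds Lvariegatus_test2.cds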
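
The three filtering passes can also be collapsed into one awk pass that needs no intermediate files. This is a minimal sketch, assuming the same case-sensitive substring matching as the commands above. The four-record FASTA and the file names `example.cds` and `example.filtered.cds` are hypothetical stand-ins, not part of the original analysis.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical four-record FASTA standing in for Lvariegatus.cds.
cat > example.cds <<'EOF'
>rec1 [gbkey=mRNA]
ATGAAA
>rec2 [gbkey=ncRNA]
ATGCCC
>rec3 [gbkey=rRNA]
ATGGGG
>rec4 [gbkey=mRNA]
ATGTTT
TTTAAA
EOF

# On each header line, decide whether the record is kept (no unwanted tag in
# the header); the sequence lines that follow inherit that decision.
awk '/^>/ { keep = !(index($0, "ncRNA") || index($0, "miscrna") || index($0, "rRNA")) }
     keep' example.cds > example.filtered.cds

# Number of records that survived the filter.
grep -c '^>' example.filtered.cds
```

Because the match is a plain substring test, it is worth grepping the real headers for the exact spelling of each tag before relying on the filter.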
##edit header to contain only the transcript ID
#remove everything from the beginning of the header up to transcript_id=; add the ">" back
sed -i 's/.*transcript_id=/>/g' Lvariegatus.cds
#remove everything after "]"
sed -i 's/].*//g' Lvariegatus.cds
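
To see what the two substitutions do, the sketch below applies them to a single made-up header. The header text is hypothetical (real headers differ between annotation releases); only the `transcript_id=` key and the closing `]` are taken from the commands above.

```shell
# Hypothetical FASTA header in the style the sed commands expect.
header='>lcl|NW_000001.1_cds_XP_000001.1_1 [gene=LOC000001] [transcript_id=XM_000001.1] [protein_id=XP_000001.1]'

# The same two substitutions, chained in one sed call and run without -i so
# the result goes to stdout instead of editing a file in place.
clean=$(printf '%s\n' "$header" | sed -e 's/.*transcript_id=/>/' -e 's/].*//')

printf '%s\n' "$clean"
```

The order matters: if the `]` substitution ran first, it would cut the header at the first bracketed field, before `transcript_id=` is ever reached.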
Genome Mfranciscanus depths:
Depth 0: 3,148 of 22,306 (14.1%)
Depth 1: 15,494 of 22,306 (69.5%)
Depth 2: 3,367 of 22,306 (15.1%)
Depth 3: 247 of 22,306 (1.1%)
Depth 4: 46 of 22,306 (0.2%)
Depth 5: 4 of 22,306 (0.0%)
Genome Lvariegatus depths:
Depth 0: 5,355 of 33,669 (15.9%)
Depth 1: 27,330 of 33,669 (81.2%)
Depth 2: 971 of 33,669 (2.9%)
Depth 3: 13 of 33,669 (0.0%)
Mfranciscanus vs Lvariegatus syntenic depths
1:2 pattern
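
Each percentage in the depth report is the count divided by the genome's total gene number, so the figures can be recomputed as a quick sanity check when copying numbers between runs. This is a rough sketch that assumes the `Depth N: count of total (pct%)` layout shown above; the variable names are illustrative.

```shell
# Depth lines copied from the Mfranciscanus report above.
report='Depth 0: 3,148 of 22,306 (14.1%)
Depth 1: 15,494 of 22,306 (69.5%)
Depth 2: 3,367 of 22,306 (15.1%)
Depth 3: 247 of 22,306 (1.1%)
Depth 4: 46 of 22,306 (0.2%)
Depth 5: 4 of 22,306 (0.0%)'

# Strip the thousands separators, then recompute each percentage as
# 100 * count / total and print it next to its depth.
recomputed=$(printf '%s\n' "$report" | tr -d ',' |
  awk '$1 == "Depth" { sub(":", "", $2); printf "%s\t%.1f\n", $2, 100 * $3 / $5 }')

printf '%s\n' "$recomputed"
```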
sort -n Mfran_chr_length.txt | sed 's/>//g' > Mfran_chr_length_sort.txt
rm Mfran_chr_length.txt
awk '{print $1}' Mfran_chr_length_sort.txt | uniq | paste -d, -s > seqids
awk '{print $1}' Lvariegatus.bed | uniq | paste -d, -s >> seqids
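
The `seqids` file holds one comma-separated line of sequence names per genome. The sketch below builds one from a made-up length table to show the sort and collapse steps. The two-column name/length format, the chromosome names and the `example_*` file names are assumptions for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-column table: sequence name (with the FASTA ">") and length.
cat > example_chr_length.txt <<'EOF'
>chr2 500
>chr1 900
>chr3 100
EOF

# Longest sequence first; drop the ">" carried over from the FASTA headers.
sort -k 2 -nr example_chr_length.txt | sed 's/>//g' > example_chr_length_sort.txt

# The first genome overwrites seqids (>), later genomes append (>>), so the
# file ends up with exactly one line per genome. The same table is reused for
# the second line here only to keep the example short.
awk '{print $1}' example_chr_length_sort.txt | uniq | paste -d, -s - > seqids
awk '{print $1}' example_chr_length_sort.txt | uniq | paste -d, -s - >> seqids

cat seqids
```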
# y, xstart, xend, rotation, color, label, va, bed
.6, .1, .8, 0, #f1b6da, Mfranciscanus, top, Mfranciscanus.bed
.4, .1, .8, 0, #4dac26, Lvariegatus, top, Lvariegatus.bed
# edges
e, 0, 1, Mfranciscanus.Lvariegatus.anchors.simple
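
The layout lines above are the contents of a plain-text file that is passed to the karyotype plotter together with `seqids`. A minimal sketch of writing that file from the shell follows. The file name `layout` and the commented plotting call follow the MCscan wiki and are assumptions here, since the original notes do not show them.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A quoted heredoc delimiter ('EOF') writes the text verbatim, with no shell
# expansion of anything inside it.
cat > layout <<'EOF'
# y, xstart, xend, rotation, color, label, va, bed
.6, .1, .8, 0, #f1b6da, Mfranciscanus, top, Mfranciscanus.bed
.4, .1, .8, 0, #4dac26, Lvariegatus, top, Lvariegatus.bed
# edges
e, 0, 1, Mfranciscanus.Lvariegatus.anchors.simple
EOF

# Assumed plotting call (per the MCscan wiki); it needs jcvi installed plus the
# seqids, .bed and .anchors.simple files in the current directory.
# python -m jcvi.graphics.karyotype seqids layout

# Quick check: track lines have 8 comma-separated fields, edge lines have 4.
awk -F, '!/^#/ { print NF }' layout
```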
#get the line numbers for the first and last hox gene
grep -n "Mfran_g4479" Mfranciscanus.Lvariegatus.i1.blocks
10306:Mfran_g4479 XM_041602940.1
grep -n "Mfran_g4498" Mfranciscanus.Lvariegatus.i1.blocks
10315:Mfran_g4498 XM_041604911.1
#pull out the lines of the hox cluster (+ 2 genes on either end)
sed -n '10304,10317p' Mfranciscanus.Lvariegatus.i1.blocks > Mfran_hox.blocks
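
The line numbers above were read off the grep output by hand. The same window can be computed, which makes the step easier to reuse for other species pairs. This is a sketch on a made-up blocks file; the gene IDs, the `example*` file names and the `pad` variable are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-column, tab-separated blocks file with ten gene pairs.
for i in 1 2 3 4 5 6 7 8 9 10; do
  printf 'geneA%s\tgeneB%s\n' "$i" "$i"
done > example.blocks

first_gene=geneA4   # stand-in for the first Hox gene
last_gene=geneA6    # stand-in for the last Hox gene
pad=2               # extra genes to keep on either side

# -w matches whole words, so geneA1 does not also hit geneA10;
# cut keeps only the line number that grep -n prints before the colon.
start=$(grep -n -w "$first_gene" example.blocks | head -n 1 | cut -d: -f1)
end=$(grep -n -w "$last_gene" example.blocks | head -n 1 | cut -d: -f1)

# Widen the window by pad lines on each side, clamping at line 1.
from=$((start - pad))
if [ "$from" -lt 1 ]; then from=1; fi
to=$((end + pad))

sed -n "${from},${to}p" example.blocks > example_hox.blocks
wc -l < example_hox.blocks
```

Using `grep -w` also guards against a shorter gene ID matching a longer one that shares its prefix.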
##Manually edit header - remove everything from the beginning of the header up to transcript_id=; add the ">" back
sed -i 's/.*transcript_id=/>/g' Lpictus.cds
Genome Mfranciscanus depths:
Depth 0: 3,515 of 22,306 (15.8%)
Depth 1: 17,828 of 22,306 (79.9%)
Depth 2: 930 of 22,306 (4.2%)
Depth 3: 33 of 22,306 (0.1%)
Genome Lpictus depths:
Depth 0: 5,737 of 28,631 (20.0%)
Depth 1: 21,633 of 28,631 (75.6%)
Depth 2: 1,227 of 28,631 (4.3%)
Depth 3: 34 of 28,631 (0.1%)
Mfranciscanus vs Lpictus syntenic depths
1:1 pattern
sort -k 2 -nr Lpictus_chr_length.txt | sed 's/>//g' > Lpictus_chr_length_sort.txt
rm Lpictus_chr_length.txt
awk '{print $1}' Mfran_chr_length_sort.txt | uniq | paste -d, -s > seqids
awk '{print $1}' Lpictus_chr_length_sort.txt | uniq | paste -d, -s >> seqids
# y, xstart, xend, rotation, color, label, va, bed
.6, .1, .8, 0, #f1b6da, Mfranciscanus, top, Mfranciscanus.bed
.4, .1, .8, 0, #4dac26, Lpictus, top, Lpictus.bed
# edges
e, 0, 1, Mfranciscanus.Lpictus.anchors.simple
##Manually edit header - remove everything from the beginning of the header up to transcript_id=; add the ">" back
sed -i 's/.*transcript_id=/>/g' Spurp.cds
Genome Mfranciscanus depths:
Depth 0: 747 of 22,306 (3.3%)
Depth 1: 20,811 of 22,306 (93.3%)
Depth 2: 737 of 22,306 (3.3%)
Depth 3: 11 of 22,306 (0.0%)
Genome Spurp depths:
Depth 0: 2,015 of 29,585 (6.8%)
Depth 1: 27,486 of 29,585 (92.9%)
Depth 2: 84 of 29,585 (0.3%)
Mfranciscanus vs Spurp syntenic depths
1:1 pattern
sort -k 2 -nr Spurp_chr_length.txt | sed 's/>//g' > Spurp_chr_length_sort.txt
rm Spurp_chr_length.txt
awk '{print $1}' Mfran_chr_length_sort.txt | uniq | paste -d, -s > seqids
awk '{print $1}' Spurp_chr_length_sort.txt | uniq | paste -d, -s >> seqids
# y, xstart, xend, rotation, color, label, va, bed
.6, .1, .8, 0, #f1b6da, Mfranciscanus, top, Mfranciscanus.bed
.4, .1, .8, 0, #4dac26, Spurp, top, Spurp.bed
# edges
e, 0, 1, Mfranciscanus.Spurp.anchors.simple
Genome Lvariegatus depths:
Depth 0: 1,577 of 33,669 (4.7%)
Depth 1: 31,863 of 33,669 (94.6%)
Depth 2: 209 of 33,669 (0.6%)
Depth 3: 20 of 33,669 (0.1%)
Genome Lpictus depths:
Depth 0: 1,809 of 28,631 (6.3%)
Depth 1: 23,516 of 28,631 (82.1%)
Depth 2: 3,267 of 28,631 (11.4%)
Depth 3: 39 of 28,631 (0.1%)
Lvariegatus vs Lpictus syntenic depths
2:1 pattern
# y, xstart, xend, rotation, color, label, va, bed
.6, .1, .8, 0, #f1b6da, Lvariegatus, top, Lvariegatus.bed
.4, .1, .8, 0, #4dac26, Lpictus, top, Lpictus.bed
# edges
e, 0, 1, Lvariegatus.Lpictus.anchors.simple
Genome Spurp depths:
Depth 0: 5,186 of 29,585 (17.5%)
Depth 1: 19,627 of 29,585 (66.3%)
Depth 2: 4,384 of 29,585 (14.8%)
Depth 3: 308 of 29,585 (1.0%)
Depth 4: 80 of 29,585 (0.3%)
Genome Lvariegatus depths:
Depth 0: 5,037 of 33,669 (15.0%)
Depth 1: 26,363 of 33,669 (78.3%)
Depth 2: 2,120 of 33,669 (6.3%)
Depth 3: 149 of 33,669 (0.4%)
Spurp vs Lvariegatus syntenic depths
1:2 pattern
# y, xstart, xend, rotation, color, label, va, bed
.6, .1, .8, 0, #f1b6da, Spurpuratus, top, Spurp.bed
.4, .1, .8, 0, #4dac26, Lvariegatus, bottom, Lvariegatus.bed
# edges
e, 0, 1, Spurp.Lvariegatus.anchors.simp
Genome Spurp depths:
Depth 0: 5,186 of 29,585 (17.5%)
Depth 1: 19,627 of 29,585 (66.3%)
Depth 2: 4,384 of 29,585 (14.8%)
Depth 3: 308 of 29,585 (1.0%)
Depth 4: 80 of 29,585 (0.3%)
Genome Lpictus depths:
Depth 0: 5,037 of 33,669 (15.0%)
Depth 1: 26,363 of 33,669 (78.3%)
Depth 2: 2,120 of 33,669 (6.3%)
Depth 3: 149 of 33,669 (0.4%)
Spurp vs Lpictus syntenic depths
1:2 pattern
# y, xstart, xend, rotation, color, label, va, bed
.6, .1, .8, 0, #f1b6da, Spurpuratus, top, Spurp.bed
.4, .1, .8, 0, #4dac26, Lpictus, bottom, Lpictus.bed
# edges
e, 0, 1, Spurp.Lpictus.anchors.simp
# y, xstart, xend, rotation, color, label, va, bed
.7, .1, .8, 0, , Spurp, top, Spurp.bed
.5, .1, .8, 0, , Mfran, top, Mfranciscanus.bed
.3, .1, .8, 0, , Lpictus, bottom, Lpictus.bed
.1, .1, .8, 0, , Lvar, bottom, Lvariegatus.bed
# edges
e, 0, 1, Mfranciscanus.Spurp.anchors.simple
e, 1, 2, Mfranciscanus.Lpictus.anchors.simple
e, 2, 3, Lvariegatus.Lpictus.anchors.simple
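
In a multi-track layout the edge lines refer to tracks by zero-based position, so it is easy to point an edge at a track that does not exist. The sketch below checks that every edge index falls within the defined tracks. It assumes the layout format shown above and a file named `layout`; it does not check whether the anchors file matches the two species on that edge.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > layout <<'EOF'
# y, xstart, xend, rotation, color, label, va, bed
.7, .1, .8, 0, , Spurp, top, Spurp.bed
.5, .1, .8, 0, , Mfran, top, Mfranciscanus.bed
.3, .1, .8, 0, , Lpictus, bottom, Lpictus.bed
.1, .1, .8, 0, , Lvar, bottom, Lvariegatus.bed
# edges
e, 0, 1, Mfranciscanus.Spurp.anchors.simple
e, 1, 2, Mfranciscanus.Lpictus.anchors.simple
e, 2, 3, Lvariegatus.Lpictus.anchors.simple
EOF

# Count the track lines, then confirm both indices on every edge line fall
# inside 0..ntracks-1. Prints "ok" or the offending edge lines.
result=$(awk -F', *' '
  /^#/      { next }
  $1 == "e" { edges[++n] = $0; a[n] = $2 + 0; b[n] = $3 + 0; next }
            { ntracks++ }
  END {
    bad = 0
    for (i = 1; i <= n; i++)
      if (a[i] < 0 || a[i] >= ntracks || b[i] < 0 || b[i] >= ntracks) {
        print "bad edge: " edges[i]
        bad = 1
      }
    if (!bad) print "ok"
  }' layout)

printf '%s\n' "$result"
```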