I'm ending up with inconsistent field ordering in my BAMs after running calmd to correct an initially-badly-generated MD field. The reads whose MD fields need correcting get corrected, but the MD field is moved to the last column. Here are two adjacent reads from the same BAM; the first one is in its original form, and the second has been modified by calmd.

1618_1392_70    16      chr1    58997   10      50M     *       0       0       AACTCCGGTCCTTCCTATTTATGTTGTTTTTTGTATTCTATGGAGGAATC      *:""4565557;:"%<%)8G4#:OK4""X\H<7S`]K<W`^SET`>0T``      PG:Z:bfast      AS:i:1350       NM:i:2  NH:i:2  IH:i:1  HI:i:1  MD:Z:6A1A41     CS:Z:T1230.0220133220331100002110112300.3212020210302110        CQ:Z:=?;0!>B)37>B,';9B(&-2A..%&<*'%8&)!%.*())%'%'&&%)'* CM:i:6  XA:i:3  XE:Z:--1----------1--4---2-----2------------------4----
1559_1614_1163  0       chr1    59304   255     5M1I1M4I3M4I1M1I30M     *       0       0       AGCAAGTAGAAATGAACTCTAAGCCCCTACACTACACTACAATCATGTGT      5"3J``F4P``P3G`^C3P``N3L\]M3J``N3M``N33P`I33""=""%      PG:Z:bfast      AS:i:300        NM:i:11 NH:i:1  IH:i:1  HI:i:1  CS:Z:T31310213220031201222302300023111231112311032111021        CQ:Z:;'%%<@7&%B=B%%9@5%%BB@%%>5?%%<<@%%?B@%%%B;%%%AA%%% CM:i:4  XA:i:3  XE:Z:-1-------------------------------------------1-02- MD:Z:33T6

Here's the second read, in its original form, before calmd:

1559_1614_1163  0       chr1    59304   255     5M1I1M4I3M4I1M1I30M     *       0       0       AGCAAGTAGAAATGAACTCTAAGCCCCTACACTACACTACAATCATGTGT      5"3J``F4P``P3G`^C3P``N3L\]M3J``N3M``N33P`I33""=""%      PG:Z:bfast      AS:i:300        NM:i:11 NH:i:1  IH:i:1  HI:i:1  MD:Z:5^G1^AGAA3^AACT1^T23T6     CS:Z:T31310213220031201222302300023111231112311032111021        CQ:Z:;'%%<@7&%B=B%%9@5%%BB@%%>5?%%<<@%%?B@%%%B;%%%AA%%% CM:i:4  XA:i:3  XE:Z:-1-------------------------------------------1-02-

I'm not particularly surprised to find optional fields in different orders in different BAMs from different sources; but two different orderings in the same BAM seems like a bit much!

I'm on samtools 0.1.6, and this is SOLiD data mapped by BFAST.

(I've also asked this question on the samtools mailing list, but thought I would help prime the pump over here, as well as giving the formatting tools a bit of a workout. Do people like the read all on one line like that, or is it more annoying to scroll than to read multiline data?)

asked 21 May '10, 23:26

Jenn

I've seen this in various SAM files as well. I think it's best to treat the optional fields as an unordered hash of key-value pairs (or, more precisely, key:type:value triples). This is likely what various BAM/SAM implementations do, so the output order will depend on their hash implementation.
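
To make the "unordered hash" view concrete, here is a minimal Python sketch (not taken from samtools or any other SAM library) that splits one tab-separated SAM line into its 11 mandatory columns and a dict of optional fields. The function name `parse_optional_fields` and the shortened example records are made up for illustration; the only thing assumed about the format is that optional fields follow the 11 mandatory columns and look like TAG:TYPE:VALUE.

```python
def parse_optional_fields(sam_line):
    """Return (mandatory_columns, tags) for one tab-separated SAM alignment line.

    tags maps each two-character tag to a (type, value) pair.
    """
    columns = sam_line.rstrip("\n").split("\t")
    mandatory, optional = columns[:11], columns[11:]
    tags = {}
    for field in optional:
        # Split on the first two colons only: values such as CQ qualities
        # may themselves contain ':' characters.
        tag, type_code, value = field.split(":", 2)
        tags[tag] = (type_code, value)
    return mandatory, tags


# Hypothetical, shortened records: same tags, different order.
core = ["read1", "0", "chr1", "59304", "255", "4M", "*", "0", "0", "ACGT", "IIII"]
before = "\t".join(core + ["NM:i:1", "MD:Z:2T1", "CM:i:4"])
after = "\t".join(core + ["NM:i:1", "CM:i:4", "MD:Z:2T1"])

# Compared as dicts, the two records carry identical optional fields.
print(parse_optional_fields(before)[1] == parse_optional_fields(after)[1])
```

Looking up a tag by name, for example `tags["MD"]`, then works wherever the field sits in the line.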

answered 22 May '10, 10:08

brentp

Sorry it took me a while to accept this answer, but...it's taken me a while to accept this answer. :) It does seem to be true, though, that SAM's optional fields are an unordered hash rather than columnar tab-separated data. This still bugs me; it's fine when you're reading things programmatically but lousy when you're trying to do quick and dirty things at the command line. Thanks for helping pound the fact into my head, though!

(08 Jun '10, 12:47) Jenn

There was a bug in BFAST with the MD tag, which was fixed to match that of samtools calmd in version 0.6.4d. Please download the latest release to see if the MD inconsistencies go away.

answered 22 May '10, 22:33

nilshomer

The reordering wasn't BFAST's fault. Though I am on 0.6.4d now, and enjoying the good MD tags!

(08 Jun '10, 12:44) Jenn

Does BFAST generate MD? Samtools moves the MD tag to the end when the MD it computes differs from the one already in the SAM/BAM file (and you can see that the MD is different in your example). You should check whether this is a bug in samtools or in BFAST.

As to the ordering, there has never been a rule on the order of optional tags.
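
Since no order is guaranteed, one workaround for quick command-line inspection is to impose an order yourself on the text form of the records. The following is only a rough Python sketch under that assumption: the names `sort_optional_fields` and `normalize` are hypothetical, it sorts optional fields alphabetically by tag, and it passes header lines (those starting with `@`) through untouched. It rewrites text lines only and does not touch the BAM itself.

```python
def sort_optional_fields(line):
    """Return one SAM text line with its optional fields sorted by tag name."""
    line = line.rstrip("\n")
    if line.startswith("@"):
        # Header line: leave as-is.
        return line
    columns = line.split("\t")
    mandatory, optional = columns[:11], columns[11:]
    # Each optional field starts with its two-character tag.
    optional.sort(key=lambda field: field[:2])
    return "\t".join(mandatory + optional)


def normalize(lines):
    """Yield every line of an iterable of SAM text lines in normalized form."""
    for line in lines:
        yield sort_optional_fields(line)
```

One possible way to use it would be to loop over `sys.stdin`, print each normalized line, and pipe the output of `samtools view` into the script, so that the same tag lands in the same relative position for reads that carry the same set of tags.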

answered 22 May '10, 20:36

lh3

edited 22 May '10, 20:38
