
Were the Kallisto files on the INCLUDE hub created using two analysis pipelines?

If I transfer the Kallisto files to Cavatica and unzip them, 400 Kallisto files from the HTP have 200401 lines and 555 Kallisto files from the HTP have 244998 lines. The 400 files with 200401 lines cover 400 Participant IDs, and the 555 files with 244998 lines cover 507 Participant IDs; 329 Participant IDs appear in both groups. All the Kallisto files from both BRI-DSR and X01-Hakonarson have 244998 lines.
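For reference, this is roughly how the counts above can be reproduced. It is only a minimal sketch: the function names, the file names, and the file-to-Participant-ID mapping are hypothetical placeholders, not the actual hub manifest or my exact commands.

```python
from collections import defaultdict


def count_lines(path):
    """Count the lines in one unzipped Kallisto output file."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def group_by_line_count(line_counts):
    """Map each distinct line count to the sorted list of files that have it."""
    groups = defaultdict(list)
    for name, n_lines in line_counts.items():
        groups[n_lines].append(name)
    return {n_lines: sorted(names) for n_lines, names in groups.items()}


def shared_participants(file_to_participant, files_a, files_b):
    """Return the Participant IDs that occur in both groups of files."""
    ids_a = {file_to_participant[name] for name in files_a}
    ids_b = {file_to_participant[name] for name in files_b}
    return ids_a & ids_b


# Toy data only (hypothetical file names and Participant IDs).
line_counts = {"a.tsv": 200401, "b.tsv": 244998, "c.tsv": 244998}
file_to_participant = {"a.tsv": "P1", "b.tsv": "P1", "c.tsv": "P2"}

groups = group_by_line_count(line_counts)
overlap = shared_participants(file_to_participant, groups[200401], groups[244998])
```

With the real files, `line_counts` would be built by calling `count_lines` on each unzipped file, and `file_to_participant` would come from the file manifest.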

Based on the overlapping Participant IDs and sample IDs, can I assume that the 400 Kallisto files from the HTP with 200401 lines came from the first analysis of the HTP data? And can I assume that the fastq files behind some of those 400 Kallisto files were reanalyzed together with some new fastq files, so that the 555 Kallisto files from the HTP with 244998 lines are the second analysis of the HTP data?

However, if the 555 files are the second analysis of the HTP, I want to make sure my other assumption is right. Of the 555 Kallisto files from the HTP that have 244998 lines, 139 have an “External Sample ID” of the format internalsampleID_PAXgeneWholeBloodRNA, while 416 have the format internalsampleID_WhiteBloodCellsRNA. Can I assume all 555 files were created from the same type of RNA prep and cell type, even though one label says whole blood and the other says white blood cells? The difference in the “External Sample ID” doesn’t mean they are different types of samples, correct?
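In case it helps, the 139 / 416 split was tallied from the part of the “External Sample ID” after the last underscore. A minimal sketch of that tally is below; the function names and the example IDs are made up for illustration, and it assumes the sample-type label is always the final underscore-separated piece.

```python
from collections import Counter


def sample_type_suffix(external_sample_id):
    """Return the text after the last underscore, or '' if there is none."""
    _, separator, suffix = external_sample_id.rpartition("_")
    return suffix.strip() if separator else ""


def tally_suffixes(external_sample_ids):
    """Count how many External Sample IDs carry each sample-type suffix."""
    return Counter(sample_type_suffix(sample_id) for sample_id in external_sample_ids)


# Hypothetical IDs, shaped like the two formats described above.
example_ids = [
    "A1_PAXgeneWholeBloodRNA",
    "B2_WhiteBloodCellsRNA",
    "C3_WhiteBloodCellsRNA",
]
suffix_counts = tally_suffixes(example_ids)
```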

Thanks!