Quality Control
Quality Control of a FASTQ File
It is necessary to understand, identify and exclude error types that may distort the interpretation of downstream analysis. Sequence quality control is therefore an essential first step in your analysis. Catching errors early saves time later on.
The FASTQ file format is the de facto standard for sequence reads generated by next-generation sequencing technologies. It evolved from FASTA: it contains the sequence data, but also quality information for that sequence. As with FASTA, a FASTQ file begins with a header line. The difference is that the FASTQ header is denoted by a @ character, whereas FASTA uses >. A single record (sequence read) spans four lines, each of which is described below:
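To make the four-line layout concrete, here is a minimal Python sketch that splits one record into its parts. The record is made up for illustration and does not come from a real sequencing run, and the Phred+33 quality encoding is an assumption (it is the common encoding in current Illumina output, but older files may differ).

```python
# A made-up FASTQ record, for illustration only.
record = """@read_1 hypothetical example
GATTACAGATTACA
+
IIIIHHHHGGGGFF
"""

# Line 1: header, starts with "@" followed by the read identifier
# Line 2: the raw sequence
# Line 3: separator, starts with "+" (may optionally repeat the identifier)
# Line 4: quality string, one character per base in line 2
header, sequence, separator, quality = record.splitlines()

assert header.startswith("@")
assert separator.startswith("+")
assert len(sequence) == len(quality)

# Assumption: Phred+33 encoding, so score = ASCII code - 33.
phred_scores = [ord(ch) - 33 for ch in quality]
print(header[1:], len(sequence), phred_scores)
```

The two length and prefix checks are the same sanity checks a parser would typically apply to every record.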
Check total number of lines
Because each record occupies exactly four lines, the total number of lines in a FASTQ file should be divisible by 4. We can check the total number of lines with the “wc -l” command in Unix/Linux.
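The same check can be scripted. The following is a rough Python sketch, not a full validator: it counts newline characters the way “wc -l” does and reports whether the count is a multiple of 4. The function names and the file name reads.fastq are hypothetical, and the gzip handling simply assumes that compressed files end in .gz.

```python
import gzip


def count_lines(path):
    """Count newline characters in a file, like `wc -l` (also reads .gz files)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        # Read in 1 MiB chunks so large FASTQ files are not loaded into memory.
        return sum(chunk.count(b"\n") for chunk in iter(lambda: handle.read(1 << 20), b""))


def check_fastq_line_count(path):
    """Return (lines, complete_records, is_multiple_of_four)."""
    lines = count_lines(path)
    records, remainder = divmod(lines, 4)
    return lines, records, remainder == 0


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_fastq.py reads.fastq
    for fastq_path in sys.argv[1:]:
        lines, records, ok = check_fastq_line_count(fastq_path)
        status = "OK" if ok else "line count is not a multiple of 4"
        print(f"{fastq_path}: {lines} lines, {records} complete records, {status}")
```

A line count that is not a multiple of 4 usually points to a truncated or corrupted file. Note that, like “wc -l”, this counts newline characters, so a file whose last line lacks a trailing newline will report one line fewer than it visually contains.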