Accelerating minimap2 for long-read sequencing applications on modern CPUs

  • Brief Communication

From Nature Computational Science

A preprint version of the article is available at bioRxiv.


Long-read sequencing is now routinely used at scale for genomics and transcriptomics applications. Mapping long reads or a draft genome assembly to a reference sequence is often one of the most time-consuming steps in these applications. Here we present techniques to accelerate minimap2, a widely used software tool for this task. The optimizations use single-instruction multiple-data parallelization, efficient cache utilization and a learned index data structure to accelerate the three main computational modules of minimap2: seeding, chaining and pairwise sequence alignment. Together they result in up to a 1.8-fold reduction in the end-to-end mapping time of minimap2 while maintaining identical output.

Fig. 1: Work distribution for three modules.
Fig. 2: Performance comparison of minimap2 and mm2-fast on a single-socket Cascade Lake CPU (28 cores) for full datasets.

Data availability

Datasets used for benchmarking are publicly available (Supplementary Table 2). Human reference genome is available at All ONT and PacBio HiFi datasets (HG002, HG003, HG004) used are available at Datasets for PacBio CLR (HG002, HG003, HG004) are available at Genome assemblies are available at: CHM13: NCBI (GCA009914755.3), HG002 (hap1) and HG002 (hap2) are publicly available at ref. 33. The speedup shown in the paper can also be realized with a smaller subset of the above datasets. Source Data are provided with this paper.

Code availability

The mm2-fast source code is available under the open source MIT license at The particular version of mm2-fast used in this manuscript is publicly available at ref. 34. The scripts used for the experiments in the manuscript are available at ref. 35.

Optimization Notice: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to Intel, Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the US and/or other countries.


Acknowledgements

This work is supported in part by the National Supercomputing Mission (NSM) India under DST/NSM/R&D_HPC_Applications to C.J. The authors are grateful to H. Li for guidance and technical discussions on minimap2, and for working with us to integrate our improvements into a branch of the minimap2 GitHub repository.

Author information




Contributions

S.K. led the software implementation of mm2-fast. All authors contributed to algorithm design, experiments and manuscript preparation, and read and approved the final manuscript.

Corresponding authors

Correspondence to Saurabh Kalikar, Chirag Jain, Md Vasimuddin or Sanchit Misra.

Ethics declarations

Competing interests

S.K., V.M. and S.M. are employees of Intel Corporation.

Peer review

Peer review information

Nature Computational Science thanks Aydin Buluc, Zemin Ning and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Handling editor: Fernando Chirigati, in collaboration with the Nature Computational Science team.


Extended data

Extended Data Fig. 1 Minimap2 workflow depicting its three key modules – (i) seeding, (ii) chaining, and (iii) alignment – and mm2-fast optimizations.

The seeding stage identifies short fixed-length exact matches between a read and a reference sequence. The chaining stage selects an ordered subset of these exact matches (anchors) to form a chain. The final alignment stage computes base-level alignments to fill the gaps between adjacent anchors in these chains. Our optimizations to each of the modules are shown in the blue dotted rectangle.
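The seeding step described above can be illustrated with a minimal sketch. It indexes every k-mer of the reference and reports each exact match with a read as an anchor. This is a simplification for illustration only: minimap2 samples minimizers rather than indexing all k-mers, and the names `build_kmer_index` and `find_anchors` are hypothetical, not part of minimap2 or mm2-fast.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every length-k substring of the reference to its start positions.

    Assumption: all k-mers are indexed; minimap2 itself keeps only minimizers.
    """
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def find_anchors(read, index, k):
    """Return sorted (reference_pos, read_pos) pairs for exact k-mer matches."""
    anchors = []
    for qpos in range(len(read) - k + 1):
        for rpos in index.get(read[qpos:qpos + k], ()):
            anchors.append((rpos, qpos))
    return sorted(anchors)


reference = "ACGTACGTGA"
index = build_kmer_index(reference, k=4)
print(find_anchors("CGTGA", index, k=4))  # [(5, 0), (6, 1)]
```

The two anchors returned here are co-linear (both coordinates increase together), which is the property the chaining stage looks for when it selects an ordered subset of anchors.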

Extended Data Fig. 2 Cross-platform performance of our optimizations for Rome, Skylake, Cascade Lake and Ice Lake architectures using a single socket.

The x-axis shows the query datasets and the y-axis shows the speedup achieved by mm2-fast over minimap2, with both running on the same CPU.

Extended Data Fig. 3 Data structures used for hash table.

Minimizers extracted from the reference sequence are stored in a sorted list as key-value pairs. A separate position list holds the positions of the minimizers on the reference sequence.
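One way to picture this layout is the sketch below. It assumes that each value in the sorted key-value list is an (offset, count) pair pointing into the position list; the caption does not spell out what the value holds, so that encoding is an assumption, and `build_index` and `lookup` are hypothetical names. A plain binary search stands in for the lookup here.

```python
from bisect import bisect_left


def build_index(minimizer_hits):
    """Build sorted keys, per-key [offset, count] values, and a flat position list.

    minimizer_hits: iterable of (minimizer_hash, reference_position) pairs.
    Assumption: a value is an [offset, count] window into the position list.
    """
    keys, values, positions = [], [], []
    for h, pos in sorted(minimizer_hits):
        if not keys or keys[-1] != h:
            keys.append(h)
            values.append([len(positions), 0])
        values[-1][1] += 1
        positions.append(pos)
    return keys, values, positions


def lookup(keys, values, positions, h):
    """Return all reference positions of minimizer h, or [] if it is absent."""
    i = bisect_left(keys, h)
    if i == len(keys) or keys[i] != h:
        return []
    offset, count = values[i]
    return positions[offset:offset + count]


keys, values, positions = build_index([(42, 10), (7, 3), (42, 55), (19, 8)])
print(lookup(keys, values, positions, 42))  # [10, 55]
```

Keeping the positions in one contiguous list means that all hits of a minimizer sit next to each other in memory, which is friendly to the cache.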

Extended Data Fig. 4 Two-layer RMI.

An example minimizer lookup is illustrated: get_mm_hits(mm5) performs a lookup for the minimizer mm5. The RMI root predicts the leaf-layer model, which in turn predicts the location of mm4 in the sorted list. Finally, the last-mile search walks from mm4 to the location of mm5 and returns its value to the caller.
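The lookup path in this caption (root model, leaf model, last-mile search) can be sketched as follows. The sketch assumes simple least-squares linear models at both layers and a linear walk for the last mile; the source does not state which model types or search strategy mm2-fast uses, and `fit_line`, `TwoLayerRMI` and `num_leaves` are hypothetical names chosen for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line y = a*x + b; constant model for degenerate input."""
    n = len(xs)
    if n == 0:
        return 0.0, 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0:
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


class TwoLayerRMI:
    """Toy two-layer recursive model index over a sorted list of integer keys."""

    def __init__(self, sorted_keys, num_leaves=4):
        self.keys = sorted_keys
        self.num_leaves = num_leaves
        n = len(sorted_keys)
        # Root model: key -> approximate leaf id (position scaled to leaf count).
        self.root = fit_line(sorted_keys, [i * num_leaves / n for i in range(n)])
        # Each leaf model is trained only on the keys the root routes to it.
        buckets = [([], []) for _ in range(num_leaves)]
        for i, key in enumerate(sorted_keys):
            xs, ys = buckets[self._leaf_id(key)]
            xs.append(key)
            ys.append(i)
        self.leaves = [fit_line(xs, ys) for xs, ys in buckets]

    def _leaf_id(self, key):
        a, b = self.root
        return min(self.num_leaves - 1, max(0, int(a * key + b)))

    def predict(self, key):
        """Predicted (possibly inexact) position of key in the sorted list."""
        a, b = self.leaves[self._leaf_id(key)]
        return min(len(self.keys) - 1, max(0, int(round(a * key + b))))

    def lookup(self, key):
        """Return the index of key in the sorted list, or -1 if it is absent."""
        if not self.keys:
            return -1
        i = self.predict(key)
        # Last-mile search: walk from the predicted slot to the true slot.
        while i > 0 and self.keys[i] > key:
            i -= 1
        while i < len(self.keys) - 1 and self.keys[i] < key:
            i += 1
        return i if self.keys[i] == key else -1


keys = [3, 8, 15, 16, 23, 42, 77, 91, 120, 500]
rmi = TwoLayerRMI(keys)
print(rmi.lookup(42))  # 5
```

The prediction only needs to land near the right slot: correctness comes from the last-mile walk, so a better model shortens the walk but never changes the result.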

Extended Data Fig. 5 Chaining of two co-linear anchors A and B.

Here, the two anchors overlap on the query sequence. The gap cost function in minimap2 is calculated using the reference gap, the query gap and the average length of all anchors, avg_qlen.
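A hedged sketch of such a gap cost is given below. It assumes the form described in the original minimap2 publication, where the cost of a gap difference l is 0.01 × avg_qlen × |l| + 0.5 × log2|l| (and 0 when l = 0). The constants, the anchor tuple layout and the function names `gap_cost` and `chain_score_gain` are assumptions for illustration; the minimap2 source code may compute this differently, for example with an integer logarithm.

```python
import math


def gap_cost(ref_gap, query_gap, avg_qlen):
    """Penalty for the difference between the reference gap and the query gap.

    Assumed form: 0.01 * avg_qlen * |l| + 0.5 * log2(|l|), and 0 when l == 0.
    """
    l = abs(ref_gap - query_gap)
    if l == 0:
        return 0.0
    return 0.01 * avg_qlen * l + 0.5 * math.log2(l)


def chain_score_gain(a, b, avg_qlen):
    """Score gained by appending anchor b to a chain that ends at anchor a.

    Anchors are (ref_end, query_end, length) tuples (a hypothetical layout).
    Returns None when b does not follow a on both sequences.
    """
    ref_gap = b[0] - a[0]
    query_gap = b[1] - a[1]
    if ref_gap <= 0 or query_gap <= 0:
        return None
    # Only bases not already covered by anchor a count as newly matched.
    matched = min(ref_gap, query_gap, b[2])
    return matched - gap_cost(ref_gap, query_gap, avg_qlen)


# Two 15-base anchors that overlap on the query (query gap 8 < length 15).
print(chain_score_gain((100, 50, 15), (110, 58, 15), avg_qlen=15))
```

In this example the anchors overlap on the query, so only 8 of the 15 bases of the second anchor count as new matches, and the 2-base difference between the reference and query gaps adds a small penalty.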

Supplementary information

Supplementary Information

Supplementary Tables 1–4, Figs. 1 and 2, Algorithms 1 and 2, and Sections 1 and 2.

Peer Review Information

Supplementary Data 1

Source data showing the single-threaded and multithreaded runtime of mm2-fast.

Supplementary Data 2

Source data showing the time spent by mm2-fast and minimap2 in the chaining module.

Source data

Source Data Fig. 1

Source Data showing the time spent by mm2-fast and minimap2 in various modules.

Source Data Fig. 2

Source Data showing the end-to-end mapping time of mm2-fast and minimap2 on the full datasets.

Source Data Extended Data Fig. 2

Source data showing the speedups of mm2-fast on various architectures.

Kalikar, S., Jain, C., Vasimuddin, M. et al. Accelerating minimap2 for long-read sequencing applications on modern CPUs. Nat Comput Sci 2, 78–83 (2022).

