
KARAJ: an efficient adaptive multi-processor tool to streamline genomic and transcriptomic sequence data acquisition

Mahdieh Labani, Amin Beheshti, Nigel H. Lovell, Hamid Alinejad-Rokny, Ali Afrasiabi*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Here we developed KARAJ, a fast and flexible Linux command-line tool that automates the end-to-end process of querying and downloading a wide range of genomic and transcriptomic sequence data types. KARAJ takes as input a list of PMCIDs, publication URLs, or various types of accession numbers and automates four tasks. First, it provides a summary list of the accessible datasets generated by or used in these scientific articles, enabling users to select appropriate datasets. Second, it calculates the size of the files that users want to download and confirms that adequate space is available on the local disk. Third, it generates a metadata table containing sample information and the experimental design of the corresponding study. Finally, it enables users to download supplementary data tables attached to publications. In addition, KARAJ provides a parallel downloading framework powered by Aspera Connect, which significantly reduces download time.
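The disk-space check and the parallel download step described in the abstract can be illustrated with a minimal Python sketch. This is not KARAJ's implementation or interface: the function names `has_enough_space` and `download_all`, the `fetch` callback, and the accession strings are hypothetical, and a thread pool from the standard library stands in for the Aspera Connect transfer that KARAJ uses.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def has_enough_space(file_sizes: Dict[str, int], target_dir: str = ".") -> bool:
    """Return True if the summed file sizes (in bytes) fit in the free space of target_dir."""
    required = sum(file_sizes.values())
    free = shutil.disk_usage(target_dir).free
    return required <= free


def download_all(
    accessions: List[str],
    fetch: Callable[[str], str],
    workers: int = 4,
) -> List[str]:
    """Call fetch on every accession in parallel and return results in input order.

    fetch is a hypothetical placeholder for whatever transfer routine is used;
    it receives one accession and returns the path of the downloaded file.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, accessions))


if __name__ == "__main__":
    # Hypothetical accessions and sizes, for illustration only.
    sizes = {"SRR0000001": 2_000_000, "SRR0000002": 3_500_000}
    if has_enough_space(sizes):
        paths = download_all(list(sizes), lambda acc: f"{acc}.fastq")
        print(paths)
```

Checking the total size before starting any transfer avoids a partially completed download on a full disk, and passing the transfer routine in as a callback keeps the sketch independent of any particular download client.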

Original language: English
Article number: 14418
Pages (from-to): 1-9
Number of pages: 9
Journal: International Journal of Molecular Sciences
Volume: 23
Issue number: 22
Publication status: Published - 2 Nov 2022

Bibliographical note

Copyright the Author(s) 2022. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • biological data
  • Genomics
  • transcriptomics
  • Download
  • Bioinformatics
  • sequence data
  • FASTQ
  • Linux
