Bioinformatician | Genomic Workflows | Nextflow, AWS & Data Science
I build robust, reproducible bioinformatics pipelines for genomic research. My core focus is developing modular Nextflow workflows, and I have experience extending these architectures into the cloud using AWS and Terraform.
My background is in Agrobiotechnology and Molecular Pathology, so my code is grounded in wet-lab science. I understand the genetics behind antimicrobial resistance (AMR) and plant pathology, which helps me build pipelines that are scientifically sound, not just computationally fast.
| **Nextflow | AWS Batch | Amazon S3 | Docker | Terraform | Streamlit** |
NextAMR is a cloud-native bioinformatics pipeline for bacterial genome assembly, annotation, and antimicrobial resistance gene detection. It supports short-read, long-read, and hybrid sequencing workflows and is designed for reproducible execution on AWS Batch.
Highlights:
- Architected a modular Nextflow DSL2 workflow for bacterial genome assembly and AMR detection.
- Integrated tools including fastp, Filtlong, Unicycler, Flye, Medaka, Bakta, and AMRFinderPlus.
- Deployed scalable execution using AWS Batch, EC2 Spot, Amazon S3, and Docker containers.
- Provisioned cloud infrastructure using Terraform-based Infrastructure as Code.
- Developed a Streamlit interface for S3 upload, manifest generation, workflow launch, and result reporting.
🔗 Repository: NextAMR on GitHub
| **M.Sc. Thesis Project | Nextflow | Bacterial Genomics | Genome Assembly | Annotation** |
Baktflow is a reproducible Nextflow workflow developed for bacterial genome analysis, from raw sequencing reads to assembly and functional annotation.
Highlights:
- Designed a modular Nextflow DSL2 workflow for short-read, long-read, and hybrid sequencing data.
- Automated routine bacterial genomics tasks including quality control, assembly, and annotation.
- Built the workflow as the core software deliverable for my M.Sc. thesis at Justus Liebig University Giessen.
🔗 Record: 10.5281/zenodo.14995561
(Note: Source code access is restricted due to university IP conditions.)
| **Nextflow | Variant Calling | GATK | Bowtie2 | Bash** |
An archival Nextflow pipeline that reproduces a yeast genome study. It processes raw Illumina sequencing data and identifies genomic variants following the GATK Best Practices framework.
🔗 Repository: yeast-variant-calling-nf
Bioinformatics & Genomics: NGS analysis, bacterial genome assembly, genome annotation, AMR detection, RNA-seq, variant analysis, promoter analysis
Workflow Engineering: Nextflow DSL2, Docker, Bash, Git/GitHub, reproducible pipelines, workflow modularization
Cloud & Infrastructure: AWS Batch, Amazon S3, EC2 Spot, Terraform, Infrastructure as Code, HPC/Slurm
Programming & Data Science: Python, R, SQL, Pandas, Scikit-learn, Tidyverse, data cleaning, quality control, time-series forecasting
LinkedIn: linkedin.com/in/naoueleldjouher
Email: hellonaouel@gmail.com