Genomics Bioinformatician
We are looking for a Genomics Bioinformatician to join the Genomics team.
Basecamp Research is a market leader in mapping biodiversity for AI-based design of biological systems. With the world’s largest biological database built from biodata collected from every corner of the world, often in unexplored locations and biomes, we are able to create a foundational dataset tailored for AI. This leads to tremendous opportunities both to expand basic understanding of biology and to tackle some of the toughest challenges in the life sciences.
Basecamp Research is backed by leaders across technology, life sciences and biopharma, and collaborates with global organisations such as Nvidia and Johnson Matthey and with world-leading academic labs such as the David Liu Lab of the Broad Institute. We are also listed among Bloomberg's top 25 UK startups to watch in 2024.
The successful candidate will, in the first instance, take ownership of, expand and manage our sequencing and metagenomic data analysis production operations.
They will have the opportunity to investigate new methods to maximise the curation and annotation of microbial dark matter and of our unique sequencing datasets.
This will be a strongly collaborative role, working closely with all teams at all data collection and analysis points. This will include, but is not limited to, our biodiversity partners, field scientists, sequencing ops, ML scientists, data engineers and commercial stakeholders.
Develop and run software to support the genome collection and sequencing operations, post curation and labelling of data and the overall goals of the Genomics team. Our sequencing data stack includes second and third generation technologies.
Responsibilities may include, in coordination with other team members:
Taking ownership in building, improving and managing the in-house genomic assembly and annotation pipeline. This will entail:
Manage and audit the data workflow of our samples from collection to data warehousing. This includes the quality control of appropriate files and datasets
Collaborate with the Data Engineering team in building and managing the pipeline in our in-house designed infrastructure platform
Investigate, benchmark and integrate novel analyses into the pipeline
Write and document high-quality code and process methodology
Develop methods that leverage the in-house sequencing datasets to create complete, high-quality genomes for context analysis
Contribute to problem-solving discussions within and across teams to generate ideas that will benefit all aspects of the organisation
Opportunity to lead from the front when it comes to bringing new ideas and approaches to the table
A graduate (MSc/PhD) degree in the life sciences, computer science or similar
At least three years of post-bachelor's experience in the life sciences and genomics, preferably in industry or a high-throughput research institute
Have directly worked in an environment that involved processing hundreds to thousands of samples. This goes beyond downloading from public databases: it means handling data at the raw level and being able to track each datapoint
If microbial or metagenomics: have demonstrated managing samples in the upper hundreds
If human or clinical resequencing: have demonstrated managing samples in the thousands
Have experience writing complex pipelines using workflow languages and tools for genomic and protein analysis, and have demonstrated the ability to benchmark and choose the right tool for each step of a pipeline
Order of preference: Dagster > Nextflow > Snakemake > Step Functions > other
Have worked with environmental metagenomic or microbiome datasets. This means not a single organism, parasite or cultured bacterium
Have extensive experience working with second and third generation sequencing datasets for downstream analysis
Experience with methods development with sequencing read datasets
Proficient at using Unix-based operating systems, libraries, and tools
Knowledge and experience of tools used in bioinformatics in genomics, metagenomics and/or protein biology
Proficiency with a scripting language (Python, Bash, etc.)
Excellent analytical and problem-solving skills
Excellent communication skills and ability to work closely with interdisciplinary teams
Fluency in English
Have developed analyses or methods with population sequencing data, or analysed variation within metagenomic sequencing data
Have worked with novel deep learning methods for analysing genomic data
Have worked on developing novel computational methods or analyses for the annotation of phage, viral and/or archaeal genomes
Experience in the techbio/biotech industry that is focused on product development (therapeutics, protein/drug discovery, CRO, etc.)
Experience with building databases (relational and/or non-relational)
Experience using an HPC environment
Experience using cloud solutions (AWS, GCP, etc.)
Experience with containerization (Docker, Singularity)
Experience with Git and GitLab or GitHub
Experience with Agile software development
London office based (non-remote) or happy to relocate to London.
The opportunity to be a key early member of our team in an exciting, dynamic, and fast-moving field.
A fun, flexible, and supportive work environment with an office in the centre of London and an emphasis on collaboration and personal development.
Unique, never-before-seen data sourced in partnership with nature parks across 5 continents
A network of top-tier advisors and academic partners
Exploring the world’s untapped biodiversity.
At Basecamp Research, we’re building a workplace that’s diverse, inclusive, and welcoming to everyone. Our team of 30 represents 14 nationalities and even more languages. We’re 30% women and 50% of us hold a PhD. If you’re excited about this job but you don’t meet all the qualifications, we still encourage you to apply. You may just be the right candidate for our team.
Do you want to join our team as our new Genomics Bioinformatician? Then we'd love to hear about you!