Author: Rafeul Hasan

  • From Pipettes to Pipelines: What I Wish I Knew When I Moved From Wet Lab to Dry Lab 

    I used to think I wasn’t a good researcher because I didn’t thrive in wet lab. 

    I’m a night owl. I work best at my own pace, late at night, when the world is quiet and I can think deeply about a problem. The wet lab simply doesn’t care about that. Experiments run on their own schedule. Cells don’t wait for you to be a morning person. Protocols demand precision at 8 AM whether you slept well or not. 

    An AI depiction showing my transition

    I spent time in a hybrid wet lab–dry lab project early in my career, and while I could do the bench work, I never felt like it was where I belonged. I wasn’t bad at it, but I wasn’t lighting up. The part that excited me was always what came after: looking at the data, asking what it meant, synthesizing ideas across experiments. I just didn’t know yet that there was a whole career built around that. 

    Then, at the beginning of 2024, I moved fully into dry lab work and started working with population genetics and large-scale biobank analysis, and something clicked. This post is about that transition, the mistakes I made, and the things I wish someone had told me on day one. 

    The Beginning Was Still Overwhelming 

    Let me be clear: just because dry lab was a better fit for me didn’t mean it was easy. The first months were genuinely overwhelming. I didn’t know what tools to use, what language to write in, or even how to organize my files. Everything felt like it had a learning curve stacked on top of another learning curve. 

    I remember being overwhelmed working with a full Linux system in FinnGen’s sandbox environment. I had taken many bioinformatics courses, but hands-on work turned out to be a different ball game.

    One of my early mistakes was trying to go deep on everything at once. I’d encounter a new concept or tool and feel like I needed to master it completely before moving on. That meant I spent far more time on certain things than I actually needed to. I was learning, but not efficiently. It took a while to develop the instinct for knowing when you understand something well enough to keep going, and when you actually need to dig deeper. I remember being stuck on the concept of principal component analysis for far longer than was actually needed.

    I think there is a term for this, “time blindness”: losing track of how long you have been stuck on one thing. It is common among beginners.

    That instinct doesn’t come from reading; it comes from doing the work, making mistakes, and slowly building judgment. Give yourself permission to learn unevenly. Not everything deserves the same depth at the same time.

    Finding My Rhythm With FinnGen-Scale Data 

    The work that really shaped me was working with biobank-scale data, specifically FinnGen, where you’re dealing with over 500,000 individuals, health record data with close to a million data points, and the sheer volume of information that comes with population-level genetics. 

    The first time I tried to work with data at that scale, I hit a wall. My usual approaches were too slow. Loading everything into memory wasn’t an option. I needed to learn how to be efficient, not just correct. 

    That’s where tools like DuckDB and BigQuery became essential. DuckDB was a revelation for working with huge datasets locally — being able to run SQL-style queries directly on large files without spinning up a database server changed how I approached data extraction. BigQuery was critical for working with the phenotype data efficiently at scale. Instead of pulling entire datasets and filtering afterward, I learned to query smartly: extract exactly what I needed, filter early, and keep my working data manageable. 
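
    To make “filter early” concrete, here is a minimal sketch. It uses Python’s built-in sqlite3 module as a stand-in so that it runs anywhere; it is not DuckDB or BigQuery code, and the table and column names (phenotypes, person_id, endpoint, event_age) are hypothetical. The pattern carries over to those tools: put the column selection and the WHERE clause in the query itself, so only the rows you need ever reach your working session.

```python
import sqlite3

# Hypothetical toy table standing in for a much larger phenotype table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phenotypes (person_id TEXT, endpoint TEXT, event_age REAL)"
)
conn.executemany(
    "INSERT INTO phenotypes VALUES (?, ?, ?)",
    [
        ("P1", "T2D", 54.2),
        ("P2", "ASTHMA", 31.0),
        ("P3", "T2D", 47.9),
        ("P4", "T2D", 71.5),
    ],
)


def extract_endpoint(conn, endpoint, max_age):
    """Return only the needed rows and columns, filtered inside the query."""
    query = (
        "SELECT person_id, event_age FROM phenotypes "
        "WHERE endpoint = ? AND event_age < ? "
        "ORDER BY person_id"
    )
    return conn.execute(query, (endpoint, max_age)).fetchall()


rows = extract_endpoint(conn, "T2D", 60)
print(rows)
```

    The slower alternative is to select everything into memory and filter in a loop afterwards, which is exactly what stops working at biobank scale.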

    These aren’t glamorous skills. Nobody puts “learned to write efficient SQL queries for phenotyping” on a conference poster. But they’re the skills that let you actually do the science instead of waiting for your computer to finish choking on a file. 

    R and Python: It’s Not a Philosophy — It’s Driven by the Task 

    One of my earliest sources of confusion was the classic question: R or Python? I wasted time going back and forth, starting things in one language and switching to the other, never feeling settled. 

    What eventually resolved it wasn’t a blog post or a debate — it was just doing enough real work that the answer became obvious from context. 

    R became my primary tool for analysis and visualization. When I needed to explore GWAS results, make Manhattan plots, generate QQ plots, or do any kind of interactive data exploration, R was where I went. The Bioconductor ecosystem is built for genomics, and ggplot2 made it possible to create publication-quality figures that actually communicated what I wanted to show. When the output from a pipeline was messy — and it often was — R was where I cleaned it up and turned it into something presentable. 

    Python, on the other hand, became my tool for automation and backend tasks. When I needed to convert rsIDs to genomic positions, or run a script in the terminal as part of a larger pipeline, Python was more practical. It’s better suited for scripting, for batch processing, for anything that needs to run unattended on a server. 

    The choice was never ideological. It was always driven by the specific task in front of me.

    Once I stopped thinking about it as an identity question and started thinking about it as a practical one, the anxiety disappeared. 

    The Clumsy Outputs Nobody Warns You About 

    Here’s something that surprised me: the tools we rely on in genomics — the GWAS pipelines, FUMA, the various annotation and enrichment software — they work, but their output is often a mess. 

    You run an analysis, you get results, and then you open the output file and it’s formatted in a way that’s not immediately usable. Columns are named inconsistently. Files have weird delimiters. Summary statistics come in formats that don’t match what the next tool expects. You end up spending significant time just wrangling output from one tool into input for another. 

    This is where R became indispensable for me. Not for the analysis itself, but for the cleanup afterward. Reading in clumsy output files, renaming columns, filtering, reformatting, merging datasets, and then visualizing the results — that whole post-processing workflow lived in R, and ggplot was at the center of it. 
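
    As a rough illustration of that wrangling step, the sketch below reads two small made-up outputs with different delimiters and header spellings and maps them onto one set of column names. I did this kind of work in R; the example is in Python only to keep it short and dependency-free. The alias map and the sample rows are hypothetical, not the real output format of any particular tool.

```python
import csv
import io

# Hypothetical alias map: header spellings seen in the wild -> one canonical name.
COLUMN_ALIASES = {
    "snp": "rsid", "rsid": "rsid", "markername": "rsid",
    "chr": "chrom", "chrom": "chrom", "#chrom": "chrom",
    "bp": "pos", "pos": "pos", "position": "pos",
    "p": "pval", "pval": "pval", "p_value": "pval",
}


def read_sumstats(text):
    """Parse delimited text and return rows keyed by canonical column names."""
    header = text.splitlines()[0]
    # Use a tab delimiter if the header contains one, otherwise assume commas.
    delimiter = "\t" if "\t" in header else ","
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for raw in reader:
        row = {}
        for name, value in raw.items():
            if name is None:
                continue  # skip stray extra fields with no header
            key = name.strip().lower()
            row[COLUMN_ALIASES.get(key, key)] = (value or "").strip()
        rows.append(row)
    return rows


# Two made-up outputs describing the same kind of data in different shapes.
tool_a = "SNP\tCHR\tBP\tP\nrs123\t1\t1000\t5e-8\n"
tool_b = "rsID,chrom,position,P_value\nrs456,2,2000,0.01\n"
rows = read_sumstats(tool_a) + read_sumstats(tool_b)
print(rows)
```

    Once every file is in the same shape, merging and plotting become routine instead of a fresh puzzle each time.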

    Learning the basics of how GWAS pipelines work, how to interpret FUMA output, how to trace a signal from summary statistics through to functional annotation — that all takes time. But what takes even more time, and what nobody teaches you, is building the data wrangling muscle to handle the mess in between. 

    Discovering ggplot Changed How I Think About Data 

    I remember my early attempts at plotting — base R graphics that technically showed the data but looked awful and were painful to customize. Then I found ggplot2, and it reshaped not just my plots, but how I think about communicating results. 

    The grammar of graphics forces you to be intentional. What variable maps to which axis? What does color represent? Is a boxplot or a violin plot the right choice for this distribution? When you’re presenting PCA plots of population structure or allele frequency distributions across cohorts, these decisions matter. A careless plot can mislead. A thoughtful one can make your point before you say a word. 

    I invested real time in learning ggplot well — axis labels, themes, color palettes that are colorblind-friendly, faceting for multi-panel figures — and it has paid off in every presentation, every paper, every time a collaborator says “that’s a clear figure.” 

    Data Cleaning Is the Real First Step

    This is the unsexy truth of bioinformatics: most of your time is spent cleaning data. Mismatched sample IDs. Missing phenotype fields. Inconsistent chromosome naming. Duplicated entries. Numeric columns with hidden character values buried in row 40,000. 

    When you’re working at biobank scale, these problems are amplified. A small inconsistency in 500,000 records can silently corrupt an entire analysis. I learned — the hard way — to treat data cleaning as the foundation, not a chore. If your input is wrong, everything downstream is wrong. Every time. 
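
    A few of these checks are easy to sketch. The helpers below are illustrative assumptions rather than a complete cleaning pipeline: one normalizes chromosome labels, one lists duplicated sample IDs, and one finds values in a supposedly numeric column that will not parse as numbers.

```python
from collections import Counter


def normalize_chrom(label):
    """Map 'chr1', 'CHR1' and '1' to the same label, '1'."""
    label = label.strip()
    if label.lower().startswith("chr"):
        label = label[3:]
    return label.upper()


def duplicated_ids(sample_ids):
    """Return IDs that appear more than once, sorted."""
    counts = Counter(sample_ids)
    return sorted(sid for sid, n in counts.items() if n > 1)


def non_numeric_rows(values):
    """Return (row_index, value) pairs that cannot be parsed as numbers.

    Note: float() accepts strings such as 'nan' and 'inf', so check for
    those separately if they should count as bad values.
    """
    bad = []
    for i, value in enumerate(values):
        try:
            float(value)
        except ValueError:
            bad.append((i, value))
    return bad


print(normalize_chrom("chrX"), normalize_chrom("CHR22"), normalize_chrom("7"))
print(duplicated_ids(["S1", "S2", "S1", "S3", "S2"]))
print(non_numeric_rows(["1.2", "3.4", "NA ", "5"]))
```

    Running checks like these before any analysis, and failing loudly when they find something, is cheaper than discovering the problem in a finished figure.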

    The Small Habits That Changed Everything 

    Some of the most impactful things I learned had nothing to do with code. 

    Date your files. Not final_results.csv. Not results_v3_FINAL.csv. Instead: 2025-02-15_gwas_results_chr22.csv. When you’re juggling dozens of output files, timestamps are how you keep your sanity. 
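
    If you generate output from scripts, the date prefix can be added automatically. A minimal sketch, where dated_name is a hypothetical helper of my own naming:

```python
from datetime import date


def dated_name(description, extension, day=None):
    """Build 'YYYY-MM-DD_description.ext'; defaults to today's date."""
    day = day or date.today()
    return f"{day.isoformat()}_{description}.{extension}"


print(dated_name("gwas_results_chr22", "csv", date(2025, 2, 15)))
```

    A side benefit of the year-month-day order is that sorting the file names alphabetically also sorts them chronologically.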

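    Use R Projects. For months, I was setting working directories manually and losing track of which script belonged to which analysis. R Projects give you a self-contained workspace: the working directory is set to the project root, so relative paths just work, and each project keeps its own session history and files in one place. (A project on its own does not isolate package versions; a tool such as renv handles that.) It’s the single easiest thing you can do to bring order to your work. 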

    Keep a dry lab notebook. This one hit me hardest, because it’s so obvious in retrospect. In wet lab, we document everything — every protocol, every gel, every result. Then we move to dry lab and somehow abandon that discipline entirely. I’d run an analysis, get a result, move on, and two weeks later have no idea what parameters I used or which version of the data I was working with. 

    Now I keep a running document in each project: what I did, why I did it, what the input was, what I changed from last time. When a collaborator asks “how did you get this number?” or when I revisit something six months later, the answer is there. Organization isn’t a soft skill in bioinformatics. It’s a survival skill. 
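
    The notebook does not need special software. As one possible sketch, a small function can append a timestamped entry to a plain-text file in the project folder. The field names mirror the list above; the function name, file name, and every example value are made up for illustration.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def log_entry(notebook, what, why, input_file, changed, now=None):
    """Append one timestamped entry to a plain-text notebook file."""
    now = now or datetime.now()
    lines = [
        f"=== {now:%Y-%m-%d %H:%M} ===",
        f"What: {what}",
        f"Why: {why}",
        f"Input: {input_file}",
        f"Changed since last time: {changed}",
    ]
    with open(notebook, "a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n\n")


# Demo in a temporary directory; every value below is made up.
with tempfile.TemporaryDirectory() as tmp:
    notebook = Path(tmp) / "notebook.txt"
    log_entry(
        notebook,
        what="reran GWAS QC",
        why="sample list was updated",
        input_file="2025-02-15_samples.csv",
        changed="stricter missingness filter",
        now=datetime(2025, 2, 15, 10, 30),
    )
    text = notebook.read_text(encoding="utf-8")

print(text)
```

    In a real project the file would live next to the scripts rather than in a temporary directory, so the record travels with the analysis.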

    I Started Enjoying It — And That Changed Everything 

    The turning point wasn’t a single moment. It was gradual. One day I realized I was excited to open my laptop. I was staying up late not because I had to, but because I wanted to figure something out. The night owl in me had finally found a schedule that worked. 

    I started enjoying the puzzle of it — taking a messy dataset and making it tell a story. Learning how different tools fit together. Getting faster at the things that used to take me days. Building intuition for when something in the data looked off.

    That enjoyment is what carried me through the steep parts of the learning curve. The beginning is hard, and it stays hard for a while. But if you’re the kind of person who gets satisfaction from synthesizing ideas, from seeing patterns in data, from turning a wall of numbers into a clean figure that makes a point — you’ll find your stride. 

    What I’d Tell Someone Just Starting Out 

    If you’re a wet lab person staring at a terminal for the first time, here’s what I want you to know. 

    You don’t have to learn everything deeply all at once. Build judgment about what needs depth and what just needs familiarity. Let R and Python serve different roles — don’t agonize over choosing one forever. Invest early in ggplot; it’ll pay off in every figure you make. Learn to query your data efficiently before you try to analyze it, especially at biobank scale. Expect messy output from bioinformatics tools and build the skills to wrangle it. Date your files, use R Projects, and for the love of science, keep a notebook. 

    The transition from wet lab to dry lab isn’t about becoming a different kind of scientist. It’s about finding the version of science that fits how your brain actually works. For me, that meant late nights, big datasets, and the quiet satisfaction of a clean pipeline producing a clear result. 

    You’ve already proven you can learn hard things. This is just the next one.

  • Resistance Training: The Unusual Road of Recovering Mental Health Through the Lessons of Resistance Training

    I had never been a gym person.

    I always attached a stigma to the gym. Or rather, I never thought strength or resistance training was my thing.

    Image generated by ChatGPT

    I vividly remember my first day at the gym. Looking at everyone using different machines, I had no idea even where to start.

    YouTube had left me spoilt for choice. I had zero idea where to begin.

    And there I stood, in the gym in a cold November (no pun intended, Guns N’ Roses), trying to do my first deadlift. To be honest, I never had the belief. And, to no surprise of mine, the barbell didn’t move an inch, but something inside me probably did.

    Fast forward: with progressive overload and months of accumulated work, I can lift a fair bit of weight. Reflecting on that, and on years of therapy and training in Cognitive Behavioral Therapy, I realise the connection.

    The realization was that things could change. As someone with chronic, diagnosed OCD, I came to realise that lifting had started to give me some sense of control. And to be honest, it helped me challenge some of the core beliefs behind the cognitive distortions I struggled with.

    Every time you train a muscle, it grows through resistance. There is muscle memory, and a great deal of neuromuscular signalling going on.

    In many ways our brains are similar. The brain is plastic too.

    After years of journalling and battling with my mental health, I realised I had unknowingly been doing a form of resistance training with my brain. And therapy is itself a form of resistance training.

    I am a researcher by profession, and the geeky side of me dove into a copious number of research articles.

    Of course, there have been loads of positive findings in different journals about resistance training. I am not going to bore you with the technical parts.

    A 2021 review by T. Hortobágyi and colleagues looked at what happens inside the body when you train regularly. They found something fascinating:

    Strength training boosts muscle activation, reshapes neural circuits, and even changes the way your brain and peripheral nerves communicate.

    Using brain-imaging data and nerve-stimulation studies, the researchers showed that when you train, you’re strengthening the “software” (your brain and nervous system) just as much as the “hardware” (your muscles).

    This was the first study that made me think: maybe the mental clarity and emotional stability I gained from lifting weren’t just psychological — maybe they were neurological.

    A massive 2019 meta-analysis by U. Singhique et al. examined where exactly these brain-and-nerve changes happen.

    They looked at things like:

    • Motor-evoked potentials
    • Cortical silent periods
    • Spinal reflexes (like the V-wave)
    • Subcortical excitability

    Their conclusion was pretty straightforward:

    Resistance training upgrades the entire system — from the motor cortex in your brain all the way down the spinal cord.

    It’s full-body neural reprogramming.

    When I read this, it made sense why lifting sometimes feels like my thoughts become sharper, my reactions cleaner, and my anxiety a little quieter.

    My nervous system is literally learning to fire more efficiently.

    The bookworm in me immediately pictured Kafka Tamura from Murakami’s “Kafka on the Shore”: Kafka doing all sorts of strength training on his way to becoming the world’s toughest 15-year-old. I suspect Murakami meant that toughness, perhaps subconsciously, to be both physical and mental.

    In CBT, you are training your brain. In the spirit of Aaron Beck’s work, your thoughts shape your reality. Through therapy you train your brain to let go of different cognitive distortions and, in a way, to reshape your perception of reality.

    Your mental strength training can borrow lessons from your resistance training. You are building new connections; more synapses form as you slowly recover from your cognitive distortions. And you are learning.

    There is a caveat. One reason strength training is rewarding is that you can see the progress. With every passing week, you take joy in the extra load you can carry, and the hypertrophy shows in the mirror.

    The same is hard to say about mental strength. There is no direct indicator of growth in your mental health.

    Now, setting the mumbo jumbo aside, the core message is very simple. Strength training has helped me challenge my core beliefs. It made me believe I can change. I can grow.

    I started tricking my brain into thinking that with every day of tackling life’s challenges in a healthy way, my mental strength is growing. It is not visible, so I resort to imagination. A progress bar pops up in my head. I am growing.

    As Master Oogway said, you need to believe. And believing changes a lot of things. Belief also poses major challenges in my research, because the placebo effect is a big thing to contend with there, but let’s save that story for another day.

    I have found my own sense of strength and self-belief through strength training. I hope you do as well!