Data science workflows for research that lasts
This handbook aims to teach you everything you need to know about carrying out data science in the Golden Lab: working on the cluster, maintaining code as the source of truth, documenting decisions close to the analysis, and building projects that other people can reproduce.
🚀 Supercomputing, simplified
Treat HPC as the default, not the backup plan. Use FASRC’s interactive apps for exploration and experimentation, and scale your analyses with SLURM jobs.
🏗️ Build projects that survive handoff
Make your science reproducible with organized project structures, robust relative paths, and fully specified software environments, so that your research runs anywhere.
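As one illustration of what "robust relative paths" can mean in practice, here is a minimal Python sketch that resolves files against the project root instead of the current working directory. The helper names (`find_project_root`, `project_path`) and the marker files are assumptions chosen for illustration, not conventions stated in this handbook; adapt them to whatever actually sits at the top of your project.

```python
from pathlib import Path
from typing import Optional, Sequence

# Assumed marker names: files or folders that exist only at the project root.
ROOT_MARKERS = (".git", "pyproject.toml")


def find_project_root(start: Path, markers: Sequence[str] = ROOT_MARKERS) -> Path:
    """Walk upward from `start` until a directory containing a marker is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    raise FileNotFoundError(f"No project root marker {tuple(markers)} found above {start}")


def project_path(*parts: str, start: Optional[Path] = None) -> Path:
    """Build a path relative to the project root, wherever the code is launched from."""
    root = find_project_root(start if start is not None else Path.cwd())
    return root.joinpath(*parts)
```

A notebook in a nested folder could then call `project_path("data", "raw", "example.csv")` (a hypothetical file) and get the same location whether it runs on a laptop or inside a cluster job, as long as the project keeps its layout.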
🤝 Collaborate without chaos
Record your experiments with git, collaborate on code through GitHub, and document your work beautifully so others can support it and build upon it.
Lab Principles
1. Remote & reproducible beats local & manual 🌐
If your work matters, it should run on any infrastructure, and be re-runnable from code — not clicks.
2. Code is the source of truth 💎
Results, figures, and tables are merely products of the workflow. The lasting record — and scientific truth underlying them — is in the code and documentation that created them.
3. Analysis should be narrated 📇
Use notebooks and reports to explain what you did, why you did it, and what should happen next.
4. Robust habits compound 🌱
Version control, clean project structure, and clear documentation reduce technical debt for your colleagues and for your future self.
What students should be able to do
By the end of onboarding, a student should be able to:
- Start a project on the correct FASRC system and choose an appropriate storage location.
- Create a reproducible project structure with notebooks, scripts, and dependency tracking.
- Read shared datasets without making unnecessary copies.
- Push analysis code and documentation to GitHub without exposing restricted data.
- Ask for help with enough context that someone else can reproduce the problem.
Suggested reading path
- Read Start Here for the workflow philosophy and first-week checklist.
- Read Workflow Roadmap for the full research lifecycle.
- Use the guide pages as reference while you are actively working on a project.
- Keep Reference Library open when you need the official docs.
Footnotes
If you were looking for the Climate Smart Public Health Research Site, visit https://www.climatesmartpublichealth.com/.