Data Analysis in Natural Sciences
An R-Based Approach

Welcome
Welcome to Data Analysis in Natural Sciences: An R-Based Approach, a comprehensive, practical guide designed for students, professionals, and researchers across the natural sciences. This book provides hands-on methods for analyzing and visualizing data using R, with real-world applications spanning ecology, forestry, agriculture, marine biology, environmental science, and beyond.
What You'll Learn
Data Analysis Fundamentals
- Import, clean, and transform data
- Exploratory data analysis techniques
- Working with the tidyverse ecosystem
Statistical Methods
- Hypothesis testing frameworks
- Parametric and non-parametric tests
- Regression analysis with tidymodels
Data Visualization
- Publication-quality graphics with ggplot2
- Interactive visualizations
- Effective scientific communication
Real-World Applications
- Conservation case studies
- Environmental data analysis
- Reproducible research practices
Book Structure
| Part | Chapters | Topics |
|---|---|---|
| Getting Started | 1-2 | Introduction to R, data structures, importing data |
| Data Analysis Fundamentals | 3-5 | EDA, hypothesis testing, statistical tests |
| Data Visualization | 6-7 | ggplot2, advanced graphics, interactive plots |
| Advanced Topics | 8-10 | Regression analysis, advanced modeling, conservation applications |
| R in Context | 11 | Integrations with jamovi, JASP, Positron, Jupyter, reticulate, plumber |
Who Is This Book For?
This book is designed for anyone working with data in the natural sciences:
- Students: Undergraduate and postgraduate students in biology, ecology, forestry, agriculture, and environmental sciences
- Researchers: Scientists seeking to enhance their data analysis and visualization skills
- Practitioners: Conservation professionals, environmental consultants, and natural resource managers
- Data Enthusiasts: Anyone interested in learning R for scientific data analysis
New to R? Start with Chapter 1: Introduction to Data Analysis for installation instructions and your first steps with R and RStudio.
Features
- Complete code examples: All code is fully reproducible
- Real datasets: Learn with actual data from ecological and environmental research
- Modern R practices: Tidyverse and tidymodels workflows throughout
- Professional tips: Best practices from experienced researchers
- Exercises: Practice problems to reinforce learning
- Open access: Free to read online
How to Use This Book
Read Online
Browse chapters directly in your web browser. Use the navigation menu to move between sections.
Run the Code
Copy code examples into R or RStudio. All code is designed to be reproducible with the included datasets.
Adapt & Apply
Modify examples for your own data and research questions. The techniques are broadly applicable.
Prerequisites
To get the most out of this book, you should have:
- Basic computer skills
- R and RStudio installed (instructions in Chapter 1)
- Curiosity about data and natural sciences!
No prior programming experience is required. I start from the basics and build up progressively.
A note on the example datasets
The book ships with ten dataset directories named after scientific disciplines (forestry/, epidemiology/, marine/, and so on). Most of these CSV files came from public TidyTuesday-style sources and kept their original schemas. The directory names reflect the chapter context in which each file is used, not always what the file literally contains. For example, forestry/forest_inventory.csv is actually a Star Wars character table that we treat as a stand-in for continuous-variable analysis.
This is documented in full at data/MISMATCHES.md. When you reach a chapter exercise that asks you to load one of these files, glance at names(df) and the first few rows before assuming the variables match the directory name. If you have a real domain dataset you'd like to swap in, the code in each chapter is small enough to adapt.
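The check described above can be sketched in a few lines of base R. This is a minimal illustration, not code from the book: `peek_dataset()` is a hypothetical helper, and the `data/` prefix in the example path is an assumption based on where MISMATCHES.md lives.

```r
# Hypothetical helper (not part of the book's code): report what a CSV
# actually contains before trusting the directory it sits in.
peek_dataset <- function(path, n = 6) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  df <- read.csv(path)
  list(
    columns = names(df),                                            # column names as read
    types   = vapply(df, function(col) class(col)[1], character(1)), # first class of each column
    preview = head(df, n)                                           # first n rows
  )
}

# Assumed location: the text names forestry/forest_inventory.csv;
# the data/ prefix is a guess, so the call is skipped if the file is absent.
path <- file.path("data", "forestry", "forest_inventory.csv")
if (file.exists(path)) {
  info <- peek_dataset(path)
  print(info$columns)
  print(info$preview)
}
```

If the printed column names look nothing like the directory name suggests, check data/MISMATCHES.md before continuing with the exercise.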
Get Involved
Found an Issue?
Report errors or suggest improvements on GitHub Issues.
Want to Contribute?
Contributions welcome! See our Contributing Guide.
Acknowledgments
This book would not be possible without:
- The R Core Team and the incredible R community
- The tidyverse and tidymodels teams for transforming how I work with data
- RStudio/Posit for excellent development tools
- The Quarto team for this beautiful publishing system
- All the data providers whose open datasets make the examples possible
- Students and colleagues who provided feedback and inspiration
Ready to start your data analysis journey?