Introduction to Machine learning in R: Classification

An introduction to machine learning in R focusing on classification (supervised learning)

Time and place:

The course consists of two sessions:

Tuesday October 17th, 09:15-12:00, in seminar room Postscript, Ole-Johan Dahls hus

Thursday October 19th, 12:15-15:00, in seminar room Java, Ole-Johan Dahls hus

Who is it for?

This course is for UiO-affiliated students and researchers who want to learn more about machine learning and how it can be used in research, but who do not have a strong background in mathematics or data science. It is a hands-on course, so it is an advantage, but not a requirement, to be accustomed to writing code in R. Basic knowledge of descriptive statistics and the tidyverse is a plus.

A video (approximately 25 minutes) with example use cases in research has been prepared, and it might be useful for those who are completely new to machine learning.

How do I sign up?

A valid UiO user account is required to attend this course. The sign-up form is here. You will be notified in advance if the course has to be held online over Zoom.

Important: Participants must use their own PC or Mac (laptop) with both R and RStudio installed. Both R (≥ 3.3.0) and RStudio are free and do not require a licence. R can be installed from https://cran.r-project.org and RStudio from https://www.rstudio.com/products/rstudio/download/

Contact IT-support from your faculty or department if you need help with installation. You can use UiO Programkiosk ("Statistikk fullskjerm") if it is not possible to install either R or RStudio on your own computer. 

Install the following packages in R (or RStudio) before the start of the course:
tidyverse, tidymodels, xgboost, vip, patchwork, workflowsets
Extra packages: doParallel, discrim

How to install packages in R
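
As a minimal sketch of one way to do this, the R code below installs only those packages from the lists above that are not already present. The helper name missing_packages is hypothetical (it is not part of R or of the course material), and the installation step assumes an internet connection and a working CRAN mirror.

```r
# Packages named in the course description
required_packages <- c("tidyverse", "tidymodels", "xgboost", "vip",
                       "patchwork", "workflowsets")
extra_packages <- c("doParallel", "discrim")

# Hypothetical helper: return the entries of `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Install only what is missing. The interactive() guard means this runs when
# pasted into the R or RStudio console, but not in a non-interactive script.
if (interactive()) {
  to_install <- missing_packages(c(required_packages, extra_packages))
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
}
```

Running missing_packages(required_packages) again afterwards should return an empty character vector if everything was installed successfully.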

A second screen/monitor is an advantage (for example, one for Zoom and the other for coding).

36 participants

Links to course material

Contents

The focus will be on building and evaluating machine learning models in R rather than on an in-depth breakdown of specific algorithms. We will use XGBoost to build models that distinguish between different categories of text based on linguistic features (such as the number of nouns and adjectives).

  • Exploratory data analysis
  • Binary classification
    • Feature importance
  • Multiclass classification
  • Cross-validation
  • Additional topics
    • Preprocessing data with the "recipes" package
    • Building and evaluating multiple models simultaneously
    • Statistically comparing models
    • Hyperparameter tuning

Language

The course will be in English this semester.

Instructor

Luigi Maglanoc, PhD, Data Management, IT Department.

Contact information

Send questions about the course to statistikk@usit.uio.no

Tags: R, RStudio, statistics, data analysis, data science, data visualization, machine learning, classification
Published Aug. 28, 2023 1:00 PM - Last modified Oct. 12, 2023 2:46 PM