Logistical Concerns for Deep Learning

Deep Learning
Author

Daniel Kick

Published

February 18, 2025

Overview

This week we talked about non-deep learning tasks that are necessary for effective modeling.

  • Data cleaning and storage
    • Pandas, Polars, and Databases
  • Model evaluation and test set construction
    • Structure in the data and information leakage
  • Case Study: Workflow to predict viral infection potential
    • Separating data prep from modeling
    • Using cross validation folds stratified with respect to species
    • How do we think about predicting cases that were not observed?
    • Example pipeline
  • Tracking Hyperparameters and Experiments
    • General workflow
    • KerasTuner, Ray, and Ax
    • Example in Ax

Meeting Notes

  • The slides from today can be found here.