
Details

  • Journal: SimplyStats
  • Date: Jan. 22, 2018
  • Category: Visualization & Software

Description

Rafael Irizarry of the Department of Data Sciences at the Dana-Farber Cancer Institute created the Data Science Labs (dslabs) R package for use in his data science courses. Irizarry used Project Tycho data to show how to plot data in more than two dimensions.

Authors

Rafael Irizarry

Related Project Tycho Datasets

United States of America - Acute nonparalytic poliomyelitis
United States of America - Acute paralytic poliomyelitis
United States of America - Acute poliomyelitis
United States of America - Acute type A viral hepatitis
United States of America - Congenital rubella syndrome
United States of America - Measles
United States of America - Mumps
United States of America - Pertussis
United States of America - Rubella
United States of America - Smallpox
United States of America - Smallpox without rash
United States of America - Viral hepatitis, type A

Abstract

In this post I describe the dslabs package, which contains some datasets that I use in my data science courses.

A much-discussed topic in stats education is that computing should play a more prominent role in the curriculum. I strongly agree, but I think the main improvement will come from bringing applications to the forefront and mimicking, as best as possible, the challenges applied statisticians face in real life. I therefore try to avoid widely used toy examples, such as the mtcars dataset, when I teach data science. However, my experience has been that finding examples that are realistic, interesting, and appropriate for beginners is not easy. After a few years of teaching, I have collected a few datasets that I think fit these criteria. To facilitate their use in introductory classes, I include them in the dslabs package.
