Reflections from the Data Science Course

After taking the Data Science for Statisticians course, I have a clearer understanding of what a data scientist does. The biggest change in my understanding is how little of their day-to-day work actually involves machine learning or data analysis. I’ve learned that this is only a small part of the job, and that the bulk of the work is understanding business needs, processing and cleaning data, and deploying and interpreting models.

After getting more hands-on R experience in this course, I realize there’s still much more for me to uncover about the language. It’s great to learn about all the different things I can do with it, and I’m excited to keep learning and practicing it throughout my career and in other projects. I still prefer R over most languages, but I recognize that some tasks are less efficient in R than in other languages, such as accessing APIs in Python or performing complex statistical analyses in SAS. For this reason, it is important not to focus solely on one language, so that I can optimize my workflow and use whichever language is most effective for the task at hand.

After taking this course, there are a few things I plan to do differently in future work. First, I plan to continue using GitHub to share projects that I’ve completed and to collaborate with peers or coworkers. Second, I plan to use R Markdown techniques to help keep my work organized, my code commented, and my data tidy. The initial investment of time pays off greatly when working on a large project. Finally, I plan to look for places where I can make my code more efficient, in R and in other languages. Understanding functions, loops, and which kinds of operations are more time-consuming than others can save hours on a single project.

Written on July 29, 2021