Day 24


Final projects

The final project accounts for 55% of your grade, so I am expecting something substantial. This will involve careful thinking about your data, which should result in interesting questions. Answering these questions will involve both critical statistical thinking and writing well-tested code.

The best way I know to produce a high-quality final project is to spend time thinking about your problem, researching existing approaches, looking carefully and critically at your data, and writing and revising your code.

Most of the teams have substantially begun this process. However, a few teams still need to get started on their final projects. It will not suffice to just commit code you got from us (i.e., the teaching staff).

You should also be following best practices, which include configuring your computer properly. For example, I noticed that several of you still have not configured Git with your information. You should, at a bare minimum, configure Git with your name and email address. If you don’t know how to do this, please see the notes from the second lecture.

I expect to see several non-trivial commits from each member of your team. Roughly half of the teams have members who have contributed less to your projects than Ross or I have. Every team member should visibly and publicly contribute to the project.

If my expectations are not met, it will be reflected in your final team grade.

Next steps

  • Draft reports: If your team wishes, you may turn in a 9-page draft of your final report by Monday, November 30th at 21:00 and give me an @mention. If you submit this draft, I will give you feedback on your report by the evening of Sunday, December 6th. (If you would rather that I give you feedback on the report you submitted last Friday, just let me know. However, I will only give you feedback on one draft.)

  • Meeting as a group: A few groups mentioned that they had a hard time finding time to meet as a group. I’ve given several hours of in-class time specifically for this purpose. However, I’ve noticed that several students have been missing class. If your team is having difficulty meeting as a group and your teammates aren’t attending class, please let me know so I can address the situation.

  • Class time: For the next 3 classes, we will have about 30 minutes of lecture at the beginning of each class followed by project work. The lectures are designed to give you additional ideas about what to do with your data, in the hope that you will be inspired to do something similar in your own analysis. So please pay attention to the final lectures, and when you break into groups, think about whether a similar approach would be useful for your projects.

  • Project scope: Several projects still have a scope that is too ambitious. In particular, if you’ve made little progress up to this point, I would try to limit your ambitions to something you can successfully accomplish during the remaining time. Having a loosely defined project may be new to you at this point. However, as you progress in your academic careers or in industry, you will find that well-defined projects (such as those normally encountered in homework assignments) are much more the exception than the rule. Defining the questions to ask and the scope of your work is a challenging task. The best way I know to face this challenge is to start early, proceed cautiously, and continuously refine your questions and the scope of your work.

  • Final presentations: While I was concerned about a lack of progress on the final projects, I was impressed by everyone’s ability to present. So I’ve decided that rather than have formal slide presentations in front of the class, we (i.e., the teaching staff) will meet with each team to go over the final presentation slides. Rather than having you present the slides, we will go through the slides together, ask questions, and suggest improvements. This will be a much more interactive process than the progress presentations were. We will be able to meet with each group for about 10 minutes on Thursday, December 3rd. However, if your team can be ready to discuss your project (with slides) on Tuesday, December 1st, then your team will be able to get more than 10 minutes to discuss your project. You will be asked in lab on Monday, November 30th what day your team would like to meet.

    I strongly recommend that the format of your slides parallel the outline of your final report. However, I recommend that the slides consist primarily of plots, figures, tables, equations, and algorithms. You should feel free to include many more plots (etc.) than you will include in your final report. That way, you can get feedback about what to include in your report. This will also allow us to give you feedback on the overall story you tell with your data.

  • Tentative final project rubric


  • If you have questions about your projects, open a pull request with the relevant code or text and @mention @jarrodmillman, @matthew-brett, @rossbar, and/or @jbpoline.
  • Most of your code should be written as a collection of functions with tests. Your analysis should then be performed by scripts that call these functions.
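
As a rough illustration of the last point, here is a minimal sketch of the "functions with tests, plus a thin script" layout. Everything in it is hypothetical: the function name `zscore`, the sample data, and the single-file arrangement are assumptions chosen for illustration, not code from any team's project. In a real project the function, its test, and the script would more likely live in separate files.

```python
"""Hypothetical example: a tested function and a thin script that calls it."""
import math


def zscore(values):
    """Return z-scores of `values` using the population standard deviation."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std == 0:
        raise ValueError("values must not all be equal")
    return [(v - mean) / std for v in values]


def test_zscore():
    """Check the function on a small input whose answer is known by hand."""
    # Mean is 5 and population standard deviation is 2 for this data.
    result = zscore([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
    expected = [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
    assert all(math.isclose(r, e) for r, e in zip(result, expected))


def main():
    """Script portion: load (here, hard-code) data and call the function."""
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # placeholder data
    for value, z in zip(data, zscore(data)):
        print(f"{value:5.1f} -> {z:+.2f}")


if __name__ == "__main__":
    main()
```

The point of the split is that the logic in `zscore` can be checked on small inputs with known answers, while `main` stays short enough to read at a glance and contains no logic that needs its own tests.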