SciLifeLab Stockholm alpha building, lunch room.

2015-11-30 - 2015-12-02

9:00 am - 4:30 pm

Instructors: Oxana Sachenkova, Olav Vahtras, Radovan Bast, Roman Valls Guimera

Helpers: Ahmed Kachkach, Mikael Huss

General Information

Software Carpentry's mission is to help scientists and engineers get more research done in less time and with less pain by teaching them basic lab skills for scientific computing. This hands-on workshop will cover basic concepts and tools, including program design, version control, data management, and task automation. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.

For more information on what we teach and why, please see our paper "Best Practices for Scientific Computing".

Who: The course is aimed at graduate students and other researchers. You don't need to have any previous knowledge of the tools that will be presented at the workshop.

Where: Science for Life Laboratory, Tomtebodavägen 23A, 17165 Solna, Sweden. Get directions with OpenStreetMap or Google Maps.

Requirements: Participants must bring a laptop with a few specific software packages installed (listed below). They are also required to abide by Software Carpentry's Code of Conduct. All participants must sign up for a GitHub account and configure Git on their laptop before the course starts. We also ask participants to get familiar with Git by reading Version Control with Git and practicing the basic Git challenges at https://try.github.io. We will offer a short recap of Git basics on the first day of the course.

Contact: Please mail brainstorm+swc_data@nopcode.org for more information.


Schedule

Day 1

09:00 Introduction to the workshop(s).
09:30 Introduction to Python.
10:30 Test driven development, Continuous Integration, and Git recap.
12:00 Lunch break.
13:00 Students start with the Python exercises.
15:45 Collaborative GitHub workflow exercise.
16:30 Wrap up, check on status, Q&A and more TA :)

Day 2

09:30 Recap/logistics/Q&A from previous session.
10:30 Data wrangling lesson from Data Carpentry: python-ecology
12:30 Lunch break.
13:30 Putting it all together with some real biological data. Follow this notebook and instructions during the seminar.
14:00 Introduction to machine learning by Ahmed.
15:00 More real world examples: predictive models of gene expression, cancer hyplots, scikit-allel.
18:00 - onwards Post workshop mingling.

Day 3

NOTE: Day 3 takes place in the lunch room on the gamma floor (instead of alpha), right in front of the SciLifeLab reception.

To get academic credits for the course, or just to feel that you have accomplished something, we recommend attending the seminar on day 3, where you can get your exercise solutions accepted and discuss any questions you have with the instructors.

9:00 am - 12:00 pm (drop-in), same location

These workshops are registered as PhD level courses at Stockholm University:

  • Software carpentry for beginners 1,5 hp
  • Data carpentry and machine learning 1,5 hp

Getting the course certificates requires completing all the workshop assignments and getting them accepted by one of the instructors at any time during the workshop. If you are not registered at SU, please provide your personal number in this file (will be open for editing until 15:30 2 Dec): Credits file


Syllabus

Git and best practices recap

  • Tracking changes.
  • Basic clone/pull-add-commit-push-workflow.
  • Working with GitHub.
  • Collaborating.
  • Test driven development and continuous integration.
  • Reference...
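
As a taste of the test-driven development topic listed above, here is a minimal sketch in Python. The function `mean` and both tests are hypothetical examples made up for illustration; they are not taken from the course material. The idea is to write the tests first, watch them fail, and then write just enough code to make them pass.

```python
# Hypothetical example for illustration only; not part of the course material.

def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if len(values) == 0:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


def test_mean_of_known_values():
    # A case whose answer we can work out by hand.
    assert mean([1, 2, 3]) == 2.0


def test_mean_rejects_empty_input():
    # An empty input should fail loudly instead of returning nonsense.
    try:
        mean([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")


if __name__ == "__main__":
    test_mean_of_known_values()
    test_mean_rejects_empty_input()
    print("all tests passed")
```

A continuous integration service would typically run tests like these automatically on every push to the repository.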

Data analysis with Pandas

  • Working with vectors and data frames.
  • Reading and plotting data.
  • Network visualization.
  • Slicing datasets.
  • Data types, dataframe transformations.

Programming in Python

  • Using libraries.
  • Working with arrays.
  • Reading and plotting data.
  • Creating and using functions.
  • Loops and conditionals.
  • Defensive programming.
  • Using Python from the command line.
  • Reference...

Machine learning

  • Machine learning examples with scikit-learn.
  • Applied analysis to biological and ecological datasets.

Setup

To participate in a Software Carpentry workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but participants may find it useful as well.

Editor

When you're writing code, it's nice to have a text editor that is optimized for writing code, with features like automatic color-coding of key words. Editors that you could use if you have not chosen one already are:

Python

Python is a popular language for scientific computing, and great for general-purpose programming as well. Installing all of its scientific packages individually can be a bit difficult, so we recommend Anaconda, an all-in-one installer.

Regardless of how you choose to install it, please make sure you install Python version 3.x (e.g., 3.4 is fine).
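
If you are unsure which Python version you ended up with, a quick check along the following lines may help. This is only a sketch using the standard library; the function name `python_is_recent_enough` is made up for this example.

```python
import sys


def python_is_recent_enough(minimum=(3, 0)):
    """Return True if the running interpreter is at least version `minimum`."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.4.3 (default, ...)".
    print("Python", sys.version.split()[0])
    if python_is_recent_enough():
        print("OK: Python 3.x detected")
    else:
        print("Please install Python 3.x before the workshop")
```

Save it to a file and run it with the same Python you plan to use at the workshop.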

We will teach Python using the Jupyter notebook, a programming environment that runs in a web browser. We will occasionally use Nature's (Rackspace) JupyterHub and mybinder.org for some of the lessons.

Once you are done installing the software listed above, please go to this page, which has instructions on how to test that everything was installed correctly. Please also verify that you have correctly configured Git on your laptop before the beginning of the course.
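
As an optional extra (not a replacement for the test instructions linked above), the following sketch checks whether Git has a global user name and e-mail configured. It assumes that "configured" means these two settings are present, which is our reading of the requirement; the function names are made up for this example.

```python
import shutil
import subprocess


def git_global_setting(key):
    """Return the global Git setting `key`, or None if it is unset or Git is missing."""
    if shutil.which("git") is None:
        return None
    try:
        output = subprocess.check_output(
            ["git", "config", "--global", "--get", key],
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except (subprocess.CalledProcessError, OSError):
        # Git exits with a non-zero status when the key is not set.
        return None
    return output.strip() or None


def check_git_identity():
    """Look up the two settings Git uses to label your commits."""
    return {key: git_global_setting(key) for key in ("user.name", "user.email")}


if __name__ == "__main__":
    for key, value in check_git_identity().items():
        if value is None:
            print('{0}: not set (try: git config --global {0} "...")'.format(key))
        else:
            print("{}: {}".format(key, value))
```

If either value is reported as not set, configure it with `git config --global` before the course starts.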