
PhD Course: Social Data Science - An applied introduction to machine learning

Get started with data science and machine learning with this PhD course for social science and humanities scholars!


23.11.2020, 09:00 - 27.11.2020, 16:00


The developments in computer science technologies and the increasing amount of accessible data present a range of new methodological opportunities for the social sciences and humanities.

Data from websites, social media and electronic devices (often referred to as ‘Big Data’) allow for new approaches and perspectives on issues relevant for both the social sciences and humanities. Meanwhile, the increasing computational power and development of artificial intelligence algorithms provide the means for accessing, combining and analyzing a variety of data types (numerical, textual, relational) in new and meaningful ways.

This course is a hands-on, practical introduction to applying computer science techniques (such as programming and machine learning) in humanities and social science research. It has no prerequisites. It will cover a broad range of techniques and methods representing the latest methodological innovations in social science and humanities applications of machine learning and artificial intelligence.
Techniques include:

  • Collecting data from the web using web scraping methods and APIs
  • Processing textual data for quantitative analysis (Natural Language Processing)
  • Working with and visualizing networks (network analysis)
  • Dimensionality reduction and clustering techniques (topic models and k-means clustering)
  • Visualization techniques for text data and networks
  • Building and understanding machine learning classifiers

This course is meant as a hands-on tools course focusing on the practical use of these methods. It will not go into depth on the mathematical and theoretical foundations. Instead, it provides a broad overview of the data science ecosystem and toolbox and enables immediate application.

Structure / teaching format

Each day will consist of a mixture of lectures and exercises using interactive online notebooks allowing participants to try out and use the various methods as they are being taught.

Participants are expected to work on a portfolio during the week. Each day has hours dedicated to portfolio work, with the opportunity to discuss it with the course lecturers. In these sessions, participants will apply the methods and techniques presented to various cases.


The course teaches the methods in Python using the Jupyter Notebook IDE on Google Colab.

Prior knowledge of Python is not required: access to relevant courses will be provided, and the first day of the course gives the necessary introduction.

Participants are expected to complete assigned introductory e-courses on DataCamp before the course. Access to DataCamp will be provided 4 weeks in advance. Two mandatory online check-in sessions are scheduled to properly prepare participants for the course.

If you are already familiar with Python programming and the application of statistical tools, you can skip the first day and sign up for the 4-day version, which has a lower fee and gives fewer credits.

Please bring your own laptop for the course. Make sure to have a Google Account to use on Google Colab. An account can be created for free at

Learning objectives

The objective of the course is for participants to obtain knowledge of key data science concepts and their relevance in the social sciences and humanities, and to gain practical competencies in applying and embedding data science methods in quantitative and qualitative workflows using a variety of data types (numerical, textual, relational).


Participants are expected to hand in a portfolio assignment no later than 2 weeks after the conclusion of the course. Credits can only be received by handing in the portfolio assignment by December 11th 2020.


Access to relevant e-course material from DataCamp will be provided 4 weeks in advance. All other materials and notebooks will be provided at the course.



Credits: 4 (3 for the 4-day version)

Number of seats



  • For AAU PhD students: 455,- DKK (incl. VAT)
  • For non-AAU PhD students (entire course): 4,800,- DKK (excl. VAT)
  • For non-AAU PhD students (4-day version): 3,600,- DKK (excl. VAT)

No-show fee: For some time, we have experienced problems with no-shows at both project and general courses. It has now reached a point where we are forced to take action. The Doctoral School has therefore decided to introduce a no-show fee of DKK 5,000 for each course a registered student fails to attend. Cancellations are accepted no later than 2 weeks before the start of the course. Registered illness is, of course, an acceptable reason for not attending. Furthermore, all courses open for registration approximately three months before they start, which will hopefully also give new students a chance to register for courses during the year. We look forward to your registrations.

Organizing Committee

  • Associate Professor Daniel Hain (Aalborg University Business School)
  • Associate Professor Roman Jurowetzki (Aalborg University Business School)
  • Assistant Professor Rolf Lyneborg Lund (Department of Sociology and Social Work, Aalborg University)
  • Professor Thomas B. Moeslund (Department of Architecture, Design and Media Technology, Aalborg University)
  • Kristian Gade Kjelmann (General Manager of CALDISS, Aalborg University)

Please contact Kristian (e-mail: with any questions regarding the course.


A course site with all relevant material is currently being developed.


Main course: November 23rd-27th 2020, 9:00-16:00

Online check-in sessions: November 4th & 16th (time TBD)


Online via Zoom

Registration deadline

02.11.2020, 23:59