PhD Course: Social Data Science - An applied introduction to machine learning

Get started with data science and machine learning with this PhD course for social science and humanities scholars!


22.11.2021 at 09.00 - 26.11.2021 at 16.00


The developments in computer science technologies and the increasing amount of accessible data present a range of new methodological opportunities for the social sciences and humanities.

Data from websites, social media and electronic devices (often referred to as ‘Big Data’) allow for new approaches and perspectives on issues relevant for both the social sciences and humanities. Meanwhile, the increasing computational power and development of artificial intelligence algorithms provide the means for accessing, combining and analyzing a variety of data types (numerical, textual, relational) in new and meaningful ways.

This course is a hands-on, practical introduction to applying computer science techniques (such as programming and machine learning) in humanities and social science research. It has no prerequisites. It will cover a broad range of techniques and methods representing the latest methodological innovations in social science and humanities applications of machine learning and artificial intelligence.
Some techniques include:

  • Collecting data from the web using web scraping methods and APIs
  • Processing textual data for quantitative analysis (Natural Language Processing)
  • Working with and visualizing networks (network analysis)
  • Dimensionality reduction and clustering techniques (topic models and k-means clustering)
  • Visualization techniques for text data and networks
  • Building and understanding machine learning classifiers

This course is meant as a hands-on tools course focusing on the practical use of these methods and will not go into depth on the mathematical and theoretical foundations. Instead, it will provide a broad overview of the data science ecosystem and toolbox and enable immediate application.

Structure / teaching format

Each day will consist of a mixture of lectures and exercises using interactive online notebooks allowing participants to try out and use the various methods as they are being taught.

Participants are expected to work on a portfolio during the week. Each day has hours dedicated to portfolio work, with the possibility of sparring with the course lecturers. Here, participants will apply the methods and techniques presented to various cases.


The course teaches the methods in Python using Jupyter notebooks on Google Colab.

Prior knowledge of Python is not required: access to relevant courses will be provided, and the first day of the course provides the relevant introduction.

Participants are expected to complete assigned introductory e-courses on DataCamp before the course. Access to DataCamp will be provided 4 weeks in advance. Two mandatory online check-in sessions are scheduled to properly prepare participants for the course.

Please bring your own laptop for the course. Make sure to have a Google Account to use on Google Colab. An account can be created for free.

Learning objectives

The objective of the course is to obtain knowledge of key data science concepts and their relevance in the social sciences and humanities, as well as to gain practical competencies in applying and embedding data science methods in quantitative and qualitative workflows using a variety of data types (numerical, textual, relational).


Participants are expected to hand in a portfolio assignment no later than 2 weeks after the conclusion of the course. Credits can only be received by handing in the portfolio assignment by December 10th 2021.


Access to relevant e-course material from DataCamp will be provided 4 weeks in advance. All other materials and notebooks will be provided at the course.




Number of seats



  • For AAU PhD students: 455,- DKK (incl. VAT)
  • For non-AAU PhD students (entire course): 4,800,- DKK (excl. VAT)

No-show fee: For some time we have experienced problems with no-shows for both project and general courses. It has now reached a point where we are forced to take action. Therefore, the Doctoral School has decided to introduce a no-show fee of DKK 5,000 for each course where the student does not show up. Cancellations are accepted no later than 2 weeks before the start of the course. Registered illness is of course an acceptable reason for not showing up on those days. Furthermore, all courses open for registration approximately three months before they start. This will hopefully also give new students a chance to register for courses during the year. We look forward to your registrations.


Organizing Committee

  • Associate Professor Daniel Hain (Aalborg University Business School)
  • Associate Professor Roman Jurowetzki (Aalborg University Business School)
  • Assistant Professor Rolf Lyneborg Lund (Department of Sociology and Social Work, Aalborg University)
  • Professor Thomas B. Moeslund (Department of Architecture, Design and Media Technology, Aalborg University)
  • Kristian Gade Kjelmann (General Manager of CALDISS, Aalborg University)

Please contact Kristian Gade Kjelmann with any questions regarding the course.


A course site with all relevant material is currently being developed.


Main course: November 22nd-26th 2021, 9:00-16:00

Online check-in sessions: November 8th and November 16th 2021, both days 13:00-14:30

Covid-19 contingencies

This course will be held with physical attendance (with the exception of the online check-ins). Should precautionary measures for COVID-19 still be in effect at the time of the course, the course will be converted to an entirely digital course.

Should the course be held digitally, participants will be refunded 455,- DKK corresponding to the price of catering.

Registered participants will be notified via e-mail if the course is converted to an entirely digital format.


Aalborg University, Rendsburggade 14, 9000 Aalborg, room 4.105

Registration deadline

08.11.2021 at 23.59