
Data Programming in Python

Module information

Academic Direction
Goldsmiths, University of London
Also part of
MSc Data Science
Modes of Study

This module aims to give you the programming skills you will need for the tasks you will encounter in the other modules of this programme.

You will learn about general programming techniques such as variables, functions and control flow. You will learn how to work with different types of data structures such as arrays and dictionaries.

You will develop data processing pipelines, which allow you to convert raw data into data that you can analyse. You will apply mathematical and statistical procedures to data. You will learn how to plot graphs of various types. You will also familiarise yourself with an industry-standard data science programming environment, which you can use throughout the programme.
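As a rough illustration of what a small data processing pipeline might look like, the sketch below reads raw comma-separated text, cleans it, and computes a few summary statistics using only the Python standard library. The sample data, column names and function names (`load_rows`, `clean_rows`, `summarise`) are hypothetical and chosen for this example; they are not taken from the module materials.

```python
import csv
import io
import statistics

# Hypothetical raw data: a small CSV with stray whitespace and a missing value.
RAW_CSV = """city,temperature_c
London, 14.5
Leeds,
Bristol,16.0
York, 12.5
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Strip whitespace, convert readings to float, drop rows with no reading."""
    cleaned = []
    for row in rows:
        value = (row["temperature_c"] or "").strip()
        if not value:
            continue  # skip rows where the measurement is missing
        cleaned.append({"city": row["city"].strip(), "temperature_c": float(value)})
    return cleaned


def summarise(rows):
    """Compute simple descriptive statistics over the cleaned readings."""
    temps = [row["temperature_c"] for row in rows]
    return {"count": len(temps), "mean": statistics.mean(temps), "max": max(temps)}


cleaned = clean_rows(load_rows(RAW_CSV))
summary = summarise(cleaned)
print(summary)
```

Each stage is a separate function so that loading, cleaning and analysis can be tested and replaced independently; a real pipeline would typically read from a file or database rather than an inline string.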

Upon successful completion of this module, you will:

  • be able to work with different data representations, including basic data types, comma-separated values (CSV), eXtensible Markup Language (XML), JavaScript Object Notation (JSON), Resource Description Framework (RDF) and relations.
  • be familiar with techniques for data acquisition, storage, retrieval and publication, with experience of working with filesystems, version control, network programming, HTTP, web servers and relational database systems (e.g. via SQL).
  • be confident working with data of different types and representing data for analysis in the form of statistics and visualisations.

Topics covered

  • Setting up your programming environment 
  • Variables, control flow and functions 
  • Data structures 
  • Data plotting 
  • Reading and writing data on the file system 
  • Retrieving data from the web 
  • Retrieving data from databases using query languages 
  • Cleaning data 
  • Restructuring data 
  • Version control systems 


Credits: 15 (150 hours)


Assessment

  • Coursework item 1 (30%)
  • Coursework item 2 (70%)