Ingo Scholtes
Data Analytics Group
Department of Informatics (IfI)
University of Zurich
September 5 2018
In the last (open-ended) exploration of this first tutorial session, you have the chance to use higher-order network analytics to study real data sets for yourself.
Details on the available data sets can be found here. Using the methods introduced in the previous unit, you can, for instance, address the following questions (in ascending order of difficulty):
The data sets and questions above are mere suggestions for your exploration of higher-order network analytics. You are welcome to study other data sets or questions instead. Please reach out to me if you run into problems or have questions (also after the tutorial). You can reach me at scholtes@ifi.uzh.ch.
import pathpy as pp
# Flight data
flight_paths = pp.Paths.read_file('../data/US_flights.ngram', frequency=False)
# Clickstreams; max_ngram_length=100 skips overly long paths, notably the single path with more than 400 clicks
clickstreams = pp.Paths.read_file('../data/wikipedia_clickstreams.ngram', frequency=False, max_ngram_length=100)
# London Tube trips based on Oyster card checkin-checkouts
tube_net = pp.Network.read_file('../data/tube.edges', separator=';')
od_stats = pp.path_extraction.read_origin_destination('../data/tube_od.csv', separator=';')
tube_trips = pp.path_extraction.paths_from_origin_destination(od_stats, tube_net)
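
To get a feeling for what the `frequency` and `max_ngram_length` arguments used above do, the following minimal sketch mimics the reading of n-gram lines in plain Python. It is not pathpy's actual parser: the helper `read_ngrams` is hypothetical, the sample lines are made up, and it assumes that each line holds one comma-separated path, that with `frequency=True` the last field is the observation count, and that paths with more than `max_ngram_length` nodes are skipped.

```python
from collections import Counter


def read_ngrams(lines, separator=",", frequency=False, max_ngram_length=None):
    """Count paths from n-gram lines (hypothetical helper, not pathpy's parser)."""
    counts = Counter()
    for raw in lines:
        fields = [f.strip() for f in raw.strip().split(separator)]
        if fields == [""]:
            continue  # skip blank lines
        weight = 1.0
        if frequency:
            # Assumed convention: the last field holds the observation count
            weight = float(fields.pop())
        if max_ngram_length is not None and len(fields) > max_ngram_length:
            continue  # drop paths that exceed the length cap
        counts[tuple(fields)] += weight
    return counts


# Made-up sample lines, one path per line, no frequency column
sample = ["JFK,ORD,LAX", "JFK,ORD,LAX", "SFO,DEN", "A,B,C,D,E"]
paths = read_ngrams(sample, frequency=False, max_ngram_length=4)
for path, count in paths.items():
    print(" -> ".join(path), count)
```

With `frequency=False`, repeated lines simply add up to the count of a path, and the five-node path is dropped because it exceeds the cap of four nodes.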