Project 2

Bike sharing

Many cities provide a bike-sharing service, and some of them publish their trip data as open data. For example:

New York	Citibike	data
Chicago	Divvy	data
Boston	Bluebikes	data
Pittsburgh	HealthyRide	data
Philadelphia	Indego	data
Washington, DC	Capital	data
San Francisco	Bay Wheels	data

Each trip is anonymized and includes:

Trip start day and time
Trip end day and time
Usertype

These data can be aggregated into frequency distributions in different ways. Unless stated otherwise, a trip t adds 1 to a bin iff (if and only if) [t.start, t.end) ∩ bin ≠ ∅. For example:

DURA: A trip duration (= t.end - t.start) distribution for a selected year (bin size = 5 min) for each usertype and total.

WEEK: A weekly activity for a selected year for each usertype and total. The units U are the weeks, bins are days of the week (Mo, Tu, We, Th, Fr, Sa, Su).

DAY: A daily activity for a selected year for each usertype and total. The units U are days in a year, bins are half hours of a day.

ISTA: A station daily arrival activity for a selected year for each usertype and total. The units U are stations, bins are half hours of a day. A trip t adds 1 to a bin iff t.end ∈ bin.

OSTA: A station daily departure activity for a selected year for each usertype and total. The units U are stations, bins are half hours of a day. A trip t adds 1 to a bin iff t.start ∈ bin(station,usertype).
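The counting rules above can be sketched in Python. This is a minimal illustration, not a prescribed solution: the function names (half_hour_bins_touched, point_bin, duration_bin) are made up for this sketch, trips are assumed to be available as datetime values, and a trip with end <= start is assumed to count as a single point at its start.

```python
from datetime import datetime, timedelta

HALF_HOUR = timedelta(minutes=30)


def half_hour_bins_touched(start, end):
    """Overlap rule (DAY): list (date, bin index) for every half-hour bin
    that the half-open interval [start, end) intersects. Indices run 0..47."""
    if end <= start:
        # Assumption: a zero-length or reversed trip counts as a point at start.
        end = start + timedelta(microseconds=1)
    # Move back to the beginning of the half hour that contains start.
    cursor = start.replace(minute=(start.minute // 30) * 30, second=0, microsecond=0)
    touched = []
    while cursor < end:
        touched.append((cursor.date(), cursor.hour * 2 + cursor.minute // 30))
        cursor += HALF_HOUR
    return touched


def point_bin(moment):
    """Point rule (ISTA uses t.end, OSTA uses t.start): index of the
    half-hour bin that contains the given moment."""
    return moment.hour * 2 + moment.minute // 30


def duration_bin(start, end, bin_minutes=5):
    """DURA: index of the duration bin, where bin k covers
    [k * bin_minutes, (k + 1) * bin_minutes) minutes."""
    return int((end - start).total_seconds() // (bin_minutes * 60))


trip_start = datetime(2019, 3, 1, 8, 25)
trip_end = datetime(2019, 3, 1, 9, 5)
print(half_hour_bins_touched(trip_start, trip_end))
print(point_bin(trip_end), duration_bin(trip_start, trip_end))
```

Because the interval is half-open, a trip that ends exactly at 09:00 does not touch the 09:00-09:30 bin. The WEEK type follows the same overlap rule with days of the week as bins.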

For each student number s, the TYPE and the service are determined by the following table:

	Citibike	Divvy	Bluebikes	HealthyRide	Indego	Capital	Bay Wheels
WEEK	1	4	7	10	13	16	19
DAY	2	5	8	11	14	17	20
ISTA	3	6	9	12	15	18	21
OSTA	22	23	24	25	26	27	28

For example, student s=5 is assigned TYPE=DAY and service=Divvy.
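The table lookup can also be written as a few lines of Python, which may be handy for checking an assignment. The names SERVICES, TABLE and assignment are illustrative only; the numbers are copied from the table above.

```python
# Column order of the table (services) and its rows (types).
SERVICES = ["Citibike", "Divvy", "Bluebikes", "HealthyRide",
            "Indego", "Capital", "Bay Wheels"]
TABLE = {
    "WEEK": [1, 4, 7, 10, 13, 16, 19],
    "DAY": [2, 5, 8, 11, 14, 17, 20],
    "ISTA": [3, 6, 9, 12, 15, 18, 21],
    "OSTA": [22, 23, 24, 25, 26, 27, 28],
}


def assignment(s):
    """Return (TYPE, service) for student number s."""
    for type_name, numbers in TABLE.items():
        if s in numbers:
            return type_name, SERVICES[numbers.index(s)]
    raise ValueError(f"student number {s} is not in the table")


print(assignment(5))
```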

For the year 2019 and “your” service, construct the aggregated descriptions for type DURA and for “your” TYPE.

Each distribution is a vector. To get a unit description, join the corresponding vectors into a named list. To get a data set description, join the unit descriptions into a named list. Save both data set descriptions in JSON format.
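As a rough sketch of this nesting, a named list maps naturally to a Python dict: the data set is a dict of units, and each unit is a dict of vectors, one per usertype plus "total". The example below builds a DAY-type data set from two invented trips. The usertype labels, the ISO-date unit names and the function name build_day_dataset are assumptions made for illustration; the real labels depend on the service.

```python
import json
from datetime import datetime, timedelta

N_BINS = 48  # half hours in a day


def build_day_dataset(trips):
    """trips: iterable of (start, end, usertype).
    Returns {day: {usertype: [48 counts], "total": [48 counts]}}."""
    dataset = {}
    for start, end, usertype in trips:
        # Beginning of the half hour that contains start.
        cursor = start.replace(minute=(start.minute // 30) * 30,
                               second=0, microsecond=0)
        while cursor < end:  # every bin that [start, end) intersects
            unit = dataset.setdefault(cursor.date().isoformat(),
                                      {"total": [0] * N_BINS})
            vector = unit.setdefault(usertype, [0] * N_BINS)
            index = cursor.hour * 2 + cursor.minute // 30
            vector[index] += 1
            unit["total"][index] += 1
            cursor += timedelta(minutes=30)
    return dataset


# Two invented trips, for illustration only.
trips = [
    (datetime(2019, 3, 1, 8, 25), datetime(2019, 3, 1, 9, 5), "Subscriber"),
    (datetime(2019, 3, 1, 23, 50), datetime(2019, 3, 2, 0, 10), "Customer"),
]
dataset = build_day_dataset(trips)
text = json.dumps(dataset, indent=2)  # json.dump(dataset, file) writes to a file
print(text[:200])
```

The second trip crosses midnight, so it contributes to the last bin of one unit and the first bin of the next.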

Visualize/analyze the created data sets. Report your observations.

Hints

Students; EDA