Project author: tgbnhy

Project description:
Repository of k-paths: code, dataset, technical report, visualization
Language: Java
Repository URL: git://github.com/tgbnhy/k-paths-clustering.git
Created: 2019-04-15T03:47:56Z
Project page: https://github.com/tgbnhy/k-paths-clustering


Fast large-scale trajectory clustering

Technical Report

https://t4research.github.io/k-paths-tr.pdf

Introduction

This repo holds the source code and scripts for reproducing the key experiments of k-paths trajectory clustering.

Usage

  1. If you run in Eclipse, go to “au.edu.rmit.trajectory.expriments.kpathEfficiency”, open “Run Configurations”, create a new Java application, and fill in the following parameters:
     .\data_porto\reassign\porto_mm_edge.dat 10 1000000 .\data_porto\reassign\new_edge_street.txt .\data_porto\reassign\new_graph.txt Porto

There are six parameters:

  1. arg[0] is the trajectory data file
  2. arg[1] is the number of clusters (k)
  3. arg[2] is the number of trajectories in the datafile which will be clustered (|D|)
  4. arg[3] is the edge info file which contains the street name
  5. arg[4] is the road network graph file
  6. arg[5] is the city name.

All results are then written to a log file in the “logs” folder.
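The six positional arguments listed above can be checked before a long run starts. The following is a minimal sketch, not code from the repository: the class name `KPathArgs` and its field names are hypothetical, and the only validation assumed is that there are exactly six arguments and that k and |D| are positive integers.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical holder for the six positional arguments; not a class from the repository. */
final class KPathArgs {
    final Path trajectoryFile;   // arg[0]: trajectory data file
    final int k;                 // arg[1]: number of clusters
    final int numTrajectories;   // arg[2]: number of trajectories to cluster (|D|)
    final Path edgeInfoFile;     // arg[3]: edge info file with street names
    final Path graphFile;        // arg[4]: road network graph file
    final String city;           // arg[5]: city name

    private KPathArgs(Path trajectoryFile, int k, int numTrajectories,
                      Path edgeInfoFile, Path graphFile, String city) {
        this.trajectoryFile = trajectoryFile;
        this.k = k;
        this.numTrajectories = numTrajectories;
        this.edgeInfoFile = edgeInfoFile;
        this.graphFile = graphFile;
        this.city = city;
    }

    /** Parses the arguments in the order given above; rejects a wrong count or non-positive numbers. */
    static KPathArgs parse(String[] args) {
        if (args.length != 6) {
            throw new IllegalArgumentException("expected 6 arguments, got " + args.length);
        }
        int k = Integer.parseInt(args[1]);
        int n = Integer.parseInt(args[2]);
        if (k <= 0 || n <= 0) {
            throw new IllegalArgumentException("k and |D| must be positive");
        }
        return new KPathArgs(Paths.get(args[0]), k, n,
                Paths.get(args[3]), Paths.get(args[4]), args[5]);
    }

    public static void main(String[] args) {
        KPathArgs parsed = parse(args);
        System.out.println("k=" + parsed.k + ", |D|=" + parsed.numTrajectories
                + ", city=" + parsed.city);
    }
}
```

Passing the Porto parameters shown above to `main` prints the parsed k, |D|, and city name; a missing argument fails immediately with an `IllegalArgumentException`.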

  2. If you want to run from the command line (recommended), build the jar first:
     mvn clean package

This generates the file “torch-clus-0.0.1-SNAPSHOT.jar” in the “target” folder.

Run the T-Drive clustering for the efficiency comparison:

    java -Xmx16192M -cp ./torch-clus-0.0.1-SNAPSHOT.jar au.edu.rmit.trajectory.expriments.kpathEfficiency ./data_tdrive/beijing_mm_edge.txt.reassign 10 250997 ./data_tdrive/new_id_edge_raw_beijing.txt ./data_tdrive/beijing_graph_new.txt tdrive

Run the Porto clustering for the efficiency comparison:

    java -Xmx16192M -cp ./torch-clus-0.0.1-SNAPSHOT.jar au.edu.rmit.trajectory.expriments.kpathEfficiency ./data_porto/porto_mm_edge.dat 10 1565595 ./data_porto/new_edge_street.txt ./data_porto/new_graph.txt porto

Run the Porto clustering and produce clustering results for visualization:

    java -Xmx16192M -cp ./torch-clus-0.0.1-SNAPSHOT.jar au.edu.rmit.trajectory.clustering.Running ./data_porto/porto_mm_edge.dat 10 100000 ./data_porto/new_edge_street.txt ./data_porto/new_graph.txt porto

Run the T-Drive clustering and produce clustering results for visualization:

    java -Xmx16192M -cp ./torch-clus-0.0.1-SNAPSHOT.jar au.edu.rmit.trajectory.clustering.Running ./data_tdrive/beijing_mm_edge.txt.reassign 10 10000 ./data_tdrive/new_id_edge_raw_beijing.txt ./data_tdrive/beijing_graph_new.txt tdrive

Compare with other distance measures on the T-Drive dataset:

    java -Xmx16192M -cp ./torch-clus-0.0.1-SNAPSHOT.jar au.edu.rmit.trajectory.expriments.EBD ./data_tdrive/beijing_mm_edge.txt.reassign 10 1000 ./data_tdrive/new_id_edge_raw_beijing.txt ./data_tdrive/beijing_graph_new.txt tdrive

Compare with other distance measures on the Porto dataset:

    java -Xmx16192M -cp ./torch-clus-0.0.1-SNAPSHOT.jar au.edu.rmit.trajectory.expriments.EBD ./data_porto/porto_mm_edge.dat 10 100000 ./data_porto/new_edge_street.txt ./data_porto/new_graph.txt porto

Datasets

We use map-matched datasets in which the trajectory data is composed of integer IDs. Because the files exceed GitHub's file size limit, they are stored on Google Drive; you can find them at:
https://sites.google.com/site/shengwangcs/torch

Download the trajectory dataset from the link above and place it in “data_porto” or “data_tdrive”. (The road network graph files are already there.)
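Because the trajectory files are downloaded separately, it can help to confirm they are in place before starting a run. The sketch below is an illustration only: the class `DatasetCheck` and its method are hypothetical, and the file names are taken from the Porto commands in the Usage section.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the repository. */
final class DatasetCheck {

    /** Returns the paths under baseDir that do not exist as regular files. */
    static List<Path> missingFiles(Path baseDir, String... relativePaths) {
        List<Path> missing = new ArrayList<>();
        for (String rel : relativePaths) {
            Path candidate = baseDir.resolve(rel);
            if (!Files.isRegularFile(candidate)) {
                missing.add(candidate);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Base directory defaults to the current working directory.
        Path base = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> missing = missingFiles(base,
                "data_porto/porto_mm_edge.dat",     // downloaded trajectory data
                "data_porto/new_edge_street.txt",   // edge info file
                "data_porto/new_graph.txt");        // road network graph
        if (missing.isEmpty()) {
            System.out.println("All Porto input files found.");
        } else {
            missing.forEach(p -> System.out.println("Missing: " + p));
        }
    }
}
```

The same method can be pointed at the “data_tdrive” file names used in the T-Drive commands.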

Visualization

We use MapV (https://github.com/huiyan-fe/mapv) to visualize the clustering results, drawing each cluster in a different color.

If you are familiar with JavaScript, you can use WebStorm (https://www.jetbrains.com/webstorm/) to open the webpage and see how the data is displayed.

An online visualization using dynamic flow is also available at http://203.101.224.103:8080/TTorchServer/.
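Giving each of the k clusters its own color can be sketched as follows. This is not how the repository or MapV assigns colors; the class `ClusterPalette` is hypothetical and only produces hex color strings by spacing hues evenly around the color wheel, which a front end could then attach to each cluster's paths.

```java
import java.awt.Color;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the repository or of MapV. */
final class ClusterPalette {

    /** Returns k hex color strings with hues spaced evenly over [0, 1). */
    static List<String> hexColors(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        List<String> colors = new ArrayList<>(k);
        for (int i = 0; i < k; i++) {
            float hue = (float) i / k;
            // HSBtoRGB packs alpha into the top byte; keep only the RGB bits.
            int rgb = Color.HSBtoRGB(hue, 1.0f, 1.0f) & 0xFFFFFF;
            colors.add(String.format("#%06x", rgb));
        }
        return colors;
    }

    public static void main(String[] args) {
        // One color per cluster for k = 10, as in the commands in the Usage section.
        List<String> palette = hexColors(10);
        for (int cluster = 0; cluster < palette.size(); cluster++) {
            System.out.println("cluster " + cluster + " -> " + palette.get(cluster));
        }
    }
}
```

Evenly spaced hues keep neighboring cluster colors distinguishable for small k; for large k a hand-picked palette may read better.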

Citation

If you use our code for research work, please cite our paper as below:

@article{wang2019fast,
  title={Fast large-scale trajectory clustering},
  author={Wang, Sheng and Bao, Zhifeng and Culpepper, J Shane and Sellis, Timos and Qin, Xiaolin},
  journal={Proceedings of the VLDB Endowment},
  volume={13},
  number={1},
  pages={29--42},
  year={2019},
  publisher={VLDB Endowment}
}

If you use our mapped trajectory dataset for research work, please cite our paper as below:

@inproceedings{wang2018torch,
  author = {{Wang}, Sheng and {Bao}, Zhifeng and {Culpepper}, J. Shane and {Xie}, Zizhe and {Liu}, Qizhi and {Qin}, Xiaolin},
  title = "{Torch: {A} Search Engine for Trajectory Data}",
  booktitle = {Proceedings of the 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval},
  organization = {ACM},
  pages = {535--544},
  year = 2018,
}