Author: shwetank3

Description:
Data Science - Projects done at Virginia Tech
Primary language: Python
Repository URL: git://github.com/shwetank3/niyati.git
Created: 2017-03-22T01:16:31Z
Project page: https://github.com/shwetank3/niyati


Data Science

This repository contains my work on various topics in Data Science.

Graph Mining

Graph algorithms are used to solve the problem of link prediction. The dataset is represented as an adjacency matrix, from which features such as centrality, betweenness, Jaccard similarity, and cosine similarity are derived.
New features were derived by taking the exponential of the adjacency matrix. With the new features, accuracy improved by 5%.
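
As a rough illustration of the features named above, the following sketch computes Jaccard and cosine similarity between two nodes and approximates the matrix exponential with a truncated Taylor series. It is not the project's code: the function names, the number of series terms, and the small path graph `ADJ` are assumptions made for the example.

```python
import math

def neighbors(adj, node):
    """Return the set of nodes adjacent to `node`."""
    return {j for j, weight in enumerate(adj[node]) if weight}

def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours of u and v."""
    nu, nv = neighbors(adj, u), neighbors(adj, v)
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def cosine(adj, u, v):
    """Cosine of the angle between the adjacency rows of u and v."""
    dot = sum(a * b for a, b in zip(adj[u], adj[v]))
    norm = math.sqrt(sum(a * a for a in adj[u])) * math.sqrt(sum(b * b for b in adj[v]))
    return dot / norm if norm else 0.0

def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_exponential(adj, terms=20):
    """Approximate exp(A) = sum_k A^k / k! with the first `terms` terms."""
    n = len(adj)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, adj)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

# Hypothetical example graph: a path 0 - 1 - 2 - 3.
ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
```

Entry (i, j) of the exponential sums the walks of every length between i and j, with a walk of length k weighted by 1/k!. Nodes 0 and 3 are not adjacent in `ADJ`, yet `matrix_exponential(ADJ)[0][3]` is positive, which is what makes the entries usable as link-prediction features.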

https://shwetank3.github.io/graph.html

Spatial Statistics

This topic deals with modeling geospatial data points over a region. Algorithms such as CAR and SAR were used to model the dataset.
Project PPP (Poisson Point Processes) deals with modeling spatial points that are distributed within a designated region and presumed to have been generated by some form of stochastic mechanism.
Geyer and Strauss processes were used for model fitting.
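
To make the point-process idea concrete, here is a minimal sketch that simulates a homogeneous Poisson point process on a rectangle and evaluates the unnormalised Strauss log density, n·log(beta) + s_r·log(gamma), where s_r is the number of point pairs within distance r of each other. It does not fit a model and is not the project's code; the function names and the parameters `beta`, `gamma`, and `radius` are assumptions chosen for illustration.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count with Knuth's method (suited to moderate lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_homogeneous_ppp(intensity, width, height, rng):
    """Poisson number of points, placed uniformly on [0, width] x [0, height]."""
    n = sample_poisson(intensity * width * height, rng)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def close_pairs(points, radius):
    """Count unordered pairs of points at most `radius` apart."""
    return sum(
        1
        for i in range(len(points))
        for j in range(i + 1, len(points))
        if math.dist(points[i], points[j]) <= radius
    )

def strauss_log_density(points, beta, gamma, radius):
    """Unnormalised Strauss log density; requires beta > 0 and 0 < gamma <= 1."""
    return len(points) * math.log(beta) + close_pairs(points, radius) * math.log(gamma)
```

With `gamma = 1` the interaction term vanishes and the density reduces to that of a Poisson process. Values of `gamma` below 1 penalise close pairs, which models inhibition between points.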

https://shwetank3.github.io/spatial.html

Time Series

This topic deals with modeling time-series data points over a period of time.
The Bayesian Gibs folder contains R files demonstrating the use of conjugate and reference priors in modeling, and the convergence of Gibbs sampling.
Project Dynamic Linear Model involves predicting traffic speed from features such as temperature and humidity.
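
The simplest dynamic linear model is the local level model, in which the observation is a noisy reading of a level that follows a random walk. The sketch below runs the Kalman filter recursions for that model. It is only an illustration under stated assumptions: it uses no regressors such as temperature or humidity, the variances are treated as known, and the function and parameter names are hypothetical rather than taken from the project.

```python
def local_level_filter(observations, obs_var, state_var,
                       prior_mean=0.0, prior_var=1e6):
    """Kalman filter for y_t = level_t + v_t, level_t = level_{t-1} + w_t.

    obs_var is Var(v_t) and state_var is Var(w_t).
    Returns the one-step-ahead forecasts and the filtered (mean, variance) pairs.
    """
    mean, var = prior_mean, prior_var
    forecasts, filtered = [], []
    for y in observations:
        # Predict: the level is a random walk, so only the variance grows.
        pred_var = var + state_var
        forecasts.append(mean)
        # Update: move the mean toward y in proportion to the Kalman gain.
        gain = pred_var / (pred_var + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * pred_var
        filtered.append((mean, var))
    return forecasts, filtered
```

A call such as `local_level_filter(speeds, obs_var=1.0, state_var=0.1)`, where `speeds` is a hypothetical list of traffic-speed readings, returns the forecast made before each reading and the updated estimate after it. A large `prior_var` expresses little prior knowledge, so the first observation dominates the initial estimate.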

https://shwetank3.github.io/timeseries

Topic Modeling

Traditional NLP methods are used to model topics in Tweets from various cities in India over a period of time.
NLTK is used to tokenize, stem, and lemmatize the words, followed by topic modeling of the dataset.
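
The preprocessing stage can be sketched as follows. To stay runnable without extra downloads, this example deliberately does not use NLTK: a regular expression stands in for `nltk.tokenize.word_tokenize`, and a crude suffix-stripping rule stands in for `nltk.stem.PorterStemmer` and `nltk.stem.WordNetLemmatizer`. The stop-word list, suffix list, and function names are assumptions, and the term counts at the end are a simple frequency summary, not a topic model.

```python
import re
from collections import Counter

# Tiny illustrative lists; a real pipeline would use NLTK's resources instead.
STOP_WORDS = {"the", "is", "in", "a", "and", "of", "to", "are", "on"}
SUFFIXES = ("ing", "ed", "es", "s")

def tokenize(text):
    """Lowercase, drop URLs and @mentions, and keep alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)

def crude_stem(token):
    """Strip one common suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token

def preprocess(text):
    """Tokenize, remove stop words, and stem what is left."""
    return [crude_stem(tok) for tok in tokenize(text) if tok not in STOP_WORDS]

def top_terms(documents, n=3):
    """Most frequent stemmed terms across all documents."""
    counts = Counter(term for doc in documents for term in preprocess(doc))
    return counts.most_common(n)
```

The output of `preprocess` is the kind of cleaned token list that a topic model would take as input.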