Data Science - Projects done at Virginia Tech
This repository collects my work on various Data Science topics.
Graph algorithms are used to solve the problem of link prediction. The graph is represented as an adjacency matrix, from which features such as centrality, betweenness, Jaccard similarity, and cosine similarity are derived.
Additional features were derived by taking the exponential of the adjacency matrix. With these new features, accuracy improved by 5%.
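As a rough illustration of these features, the sketch below computes Jaccard similarity, cosine similarity, and a matrix-exponential score for a pair of nodes. It is written in Python with NumPy and SciPy as an assumption; the toy graph and the helper names (`jaccard`, `cosine`, `pair_features`) are hypothetical and are not taken from the project code.

```python
import numpy as np
from scipy.linalg import expm

# Toy undirected graph on 5 nodes (hypothetical data, not the project's dataset).
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

def jaccard(A, i, j):
    """Jaccard similarity of the neighbour sets of nodes i and j."""
    ni, nj = A[i] > 0, A[j] > 0
    union = np.logical_or(ni, nj).sum()
    return np.logical_and(ni, nj).sum() / union if union else 0.0

def cosine(A, i, j):
    """Cosine similarity of the adjacency rows of nodes i and j."""
    denom = np.linalg.norm(A[i]) * np.linalg.norm(A[j])
    return float(A[i] @ A[j]) / denom if denom else 0.0

# expm(A) = sum_k A^k / k!, so entry (i, j) sums walks of every length
# between i and j, with longer walks down-weighted by 1/k!.
E = expm(A)

def pair_features(A, E, i, j):
    """Feature vector for the candidate link (i, j)."""
    return np.array([jaccard(A, i, j), cosine(A, i, j), E[i, j]])

features = pair_features(A, E, 0, 4)
```

Unlike the neighbour-based similarities, the exponential entry is non-zero even for nodes with no common neighbours, as long as some path connects them. Feature vectors like this could then be fed to any binary classifier that predicts whether a link exists.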
https://shwetank3.github.io/graph.html
This topic deals with modeling geospatial data points over a region. Models such as CAR (conditional autoregressive) and SAR (simultaneous autoregressive) were used to model the dataset.
Project PPP (Poisson Point Processes) deals with modeling spatial points distributed within a designated region and presumed to have been generated by some form of stochastic mechanism.
Geyer and Strauss processes were used for model fitting.
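For intuition, a homogeneous Poisson point process is the no-interaction baseline that the Strauss process generalizes with a pairwise interaction term. The sketch below simulates that baseline on a rectangle. It is written in Python with NumPy as an assumption; the function name and parameter values are hypothetical, and it does not fit a Geyer or Strauss model.

```python
import numpy as np

def simulate_poisson_points(intensity, width, height, rng):
    """Simulate a homogeneous Poisson point process on [0, width] x [0, height]."""
    # The number of points is Poisson with mean intensity * area.
    n = rng.poisson(intensity * width * height)
    # Given the count, locations are independent and uniform over the region.
    xs = rng.uniform(0.0, width, size=n)
    ys = rng.uniform(0.0, height, size=n)
    return np.column_stack([xs, ys])

rng = np.random.default_rng(0)
points = simulate_poisson_points(intensity=50.0, width=2.0, height=1.0, rng=rng)

# The maximum-likelihood estimate of a constant intensity is count / area.
intensity_hat = len(points) / (2.0 * 1.0)
```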
https://shwetank3.github.io/spatial.html
This topic deals with modeling time-series data.
The Bayesian Gibs folder contains R files demonstrating the use of conjugate and reference priors in modeling, and the convergence of Gibbs sampling.
The Dynamic Linear Model project involves predicting traffic speed from features such as temperature and humidity.
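To show the idea behind Gibbs sampling, here is a minimal sketch for a normal model with unknown mean and precision. It is in Python rather than the R used in the folder, and the priors (a normal prior on the mean, a gamma prior on the precision), the hyperparameter defaults, and the function name `gibbs_normal` are all assumptions chosen for illustration, not the models in the repository.

```python
import numpy as np

def gibbs_normal(y, n_iter=2000, m0=0.0, p0=1e-2, a0=1e-2, b0=1e-2, seed=0):
    """Gibbs sampler for y_i ~ N(mu, 1/tau).

    Assumed priors: mu ~ N(m0, 1/p0) and tau ~ Gamma(shape=a0, rate=b0).
    Returns an (n_iter, 2) array of (mu, tau) draws.
    """
    rng = np.random.default_rng(seed)
    n, ybar = len(y), np.mean(y)
    mu, tau = ybar, 1.0
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        # mu | tau, y is normal: precisions add, means are precision-weighted.
        prec = p0 + n * tau
        mean = (p0 * m0 + tau * n * ybar) / prec
        mu = rng.normal(mean, 1.0 / np.sqrt(prec))
        # tau | mu, y is gamma; NumPy takes scale = 1 / rate.
        rate = b0 + 0.5 * np.sum((y - mu) ** 2)
        tau = rng.gamma(a0 + 0.5 * n, 1.0 / rate)
        draws[t] = mu, tau
    return draws

rng = np.random.default_rng(1)
y = rng.normal(5.0, 2.0, size=200)   # synthetic data
draws = gibbs_normal(y)
burned = draws[500:]                 # discard burn-in
post_mean_mu = burned[:, 0].mean()

# Crude convergence check: the two halves of the chain should agree.
half = len(burned) // 2
half_gap = abs(burned[:half, 0].mean() - burned[half:, 0].mean())
```

Each parameter is drawn from its full conditional given the current value of the other, which is what makes conjugate-style priors convenient here. Comparing chain halves is only a rough check; trace plots or running several chains give a better picture of convergence.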
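The simplest dynamic linear model is the local level model, where the observation is a noisy reading of a state that follows a random walk. The sketch below runs a Kalman filter for that model on synthetic data. It is in Python with NumPy as an assumption, it omits the regression on temperature and humidity that the project describes, and the variances and the function name `local_level_filter` are hypothetical.

```python
import numpy as np

def local_level_filter(y, obs_var, state_var, m0=0.0, c0=1e6):
    """Kalman filter for y_t = theta_t + v_t, theta_t = theta_{t-1} + w_t."""
    m, c = m0, c0                      # vague prior on the initial state
    means = np.empty(len(y))
    for t, obs in enumerate(y):
        # Predict: the random-walk state keeps its mean and gains variance.
        a, r = m, c + state_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = r / (r + obs_var)
        m = a + gain * (obs - a)
        c = (1.0 - gain) * r
        means[t] = m
    return means

rng = np.random.default_rng(0)
# Synthetic slowly drifting "speed" level plus measurement noise.
level = 60.0 + np.cumsum(rng.normal(0.0, 0.5, size=100))
y = level + rng.normal(0.0, 2.0, size=100)
filtered = local_level_filter(y, obs_var=4.0, state_var=0.25)
```

Covariates can be added by extending the state vector with regression coefficients, at which point the scalar updates above become matrix operations.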
https://shwetank3.github.io/timeseries
Traditional NLP methods are used to model topics in Tweets from various cities in India, collected over a period of time.
NLTK is used to tokenize, stem, and lemmatize words, followed by topic modeling of the dataset.
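A minimal sketch of the preprocessing step, assuming NLTK's `RegexpTokenizer` and `PorterStemmer`, neither of which needs a corpus download. The example tweet and the `preprocess` helper are made up for illustration. Lemmatization with `WordNetLemmatizer` follows the same pattern but requires the WordNet corpus, so it is left out here.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# Keep alphabetic runs only; this drops punctuation, digits, and symbols.
tokenizer = RegexpTokenizer(r"[A-Za-z]+")
stemmer = PorterStemmer()

def preprocess(text):
    """Lowercase, tokenize, and stem a piece of text."""
    tokens = tokenizer.tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens]

tweet = "Heavy rains are flooding the streets of Mumbai again"  # made-up example
tokens = preprocess(tweet)
```

The resulting token lists can be turned into a bag-of-words matrix and passed to a topic model of choice.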