项目作者: pratian-pyclub
项目描述 :
POCS for AI Course
高级语言: Python
项目地址: git://github.com/pratian-pyclub/pocs.git
Outline of Case Studies, Labs and Projects for the four verticals of AI.
Machine Learning
Labs:
- Classification of Multivariate Gaussion Distribution using FeedForward Network
Projects:
- Automatic Image Classification with Neural Networks and PyBrain
The learnings will be,
- Understanding implementation difference between Classification and Regression
- Data split importance
- Accuracy Calculations
The skills learned will be,
- Reading images with Python
- Converting images to datasets
- Feature extraction in images that will lead to classification
Case Studies:
- Back Propagation Algorithm Implementation with Python
NLP
Labs:
- Conditional Frequency Distributions
- Frequency Distribution and Dispersion Plots
- Plotting and Tabulating Distributions
- Counting Words by Genre
- Convert English Text to Pig Latin
- Definition of Pig Latin Each word of the text is converted as follows: move any consonant (or consonant cluster) that appears at the start of the word to the end, then append ay, e.g. string → ingstray, idle → idleay. http://en.wikipedia.org/wiki/Pig_Latin
- Write a function to convert a word to Pig Latin.
- Write code that converts text, instead of individual words. Extend it further to preserve capitalization, to keep qu together (i.e. so that quiet becomes ietquay), and to detect when y is used as a consonant (e.g. yellow) vs a vowel (e.g. style).
- Process the Brown Corpus to find answers to the following questions,
- Which nouns are more common in their plural form, rather than their singular form? (Only consider regular plurals, formed with the -s suffix.)
- Which word has the greatest number of distinct tags. What are they, and what do they represent?
- List tags in order of decreasing frequency. What do the 20 most frequent tags represent?
- Which tags are nouns most commonly found after? What do these tags represent?
- Write a tag pattern to identify places of work from a set of resumes by building your own grammar.
Projects:
Analysis of a real world event using Twitter data. This includes graphing with matplotlib.
The various analysis will be,
- Sentiment Classification
- Keyword Extraction
- Frequency Distribution
The feature extractors will be,
- Unigrams + Bigrams
- Frequency Distribution Extraction
Skills learnt will be,
- Data Gathering
- Data Classification
- Data Normalizing
- Graphing
- JSON API Calls and Ratelimiting
- Reading and Storing YAML data
Case Studies
- Resume Parser
- Accepts resumes in PDF format
- Must parse the applicants for,
- Name
- Birthdate
- Address
- Experience in years
- Categorize skills
Algorithmic Programming
Labs:
- Sierpinski Triangle
- Tower of Hanoi
- Exploring a Maze
- Implementing a binary search tree that accepts any input type
- The Knight’s Tour Problem
- Dijkstra’s Problem
Projects:
- Create rate-limiting tool with Python and Redis
- Use the Python Redis package, https://pypi.python.org/pypi/redis.
- Redis will be used to store all the values to be processed.
- A background job will read the data from Redis and process them at regular intervals.
- Background jobs will be managed using the RQ Python package, http://python-rq.org/
- Create a CLI function that would start these background processes
Case Studies
- An Anagram Detection Example using multiple Algorithm Solutions
- Solution 1: Checking Off
- Solution 2: Sort and Compare
- Solution 3: Brute Force
- Solution 4: Count and Compare
- Longest Common Sunsequence Algorithm Python Implementation
- Automatic Text Generation with Hidden Markov Models
Bayesian Inference
Labs:
- Modelling Coin flip patterns using Bayes Rule
Projects:
- Ordering Reddit submissions
- Theory
- Relook at Law of Large Numbers Relook at Naive Bayes Theorem
- Relook at Gaussian distribution
- Introduction to the problem
- Discussion on factors involved
- Popularity (Number of upvotes)
- Difference of upvotes and downvotes
- Time adjusted (rate of upvotes/downvotes)
- Ratio of upvotes to total number of votes
- Understanding how human behavior affects data
Case Studies
- Algorithm for human deceit
- A look at Binomial distribution
- Example: Cheating Students