Concurrent Pandas is a Python library that lets you use Pandas and/or Quandl to download bulk data concurrently using threads or processes. What does concurrency do for you? It downloads your data simultaneously instead of one key at a time. Concurrent Pandas automatically spawns an optimal number of processes or threads based on the number of processors available on your machine.
Note: Concurrent Pandas is not associated with Quandl or Python Pandas; it just allows you to access them faster.
# Import the library
import concurrentpandas
# Define your keys
yahoo_keys = ["aapl", "xom", "msft", "goog", "brk-b", "TSLA", "IRBT"]
# Instantiate Concurrent Pandas
fast_panda = concurrentpandas.ConcurrentPandas()
# Set your data source
fast_panda.set_source_yahoo_finance()
# Insert your keys
fast_panda.insert_keys(yahoo_keys)
# Choose either asynchronous threads, processes, or a single sequential download
fast_panda.consume_keys_asynchronous_threads()
# The Concurrent Pandas object contains a dict of your results now
mymap = fast_panda.return_map()
# Easily pull the data out of the map for your research
# (head is a method, so call it to print the first rows)
print(mymap["aapl"].head())
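For intuition about what a threaded download of many keys looks like, here is a minimal sketch using only the Python 3 standard library. It is not the Concurrent Pandas implementation: the names fetch and download_all are hypothetical, fetch is a placeholder for a real network call, and sizing the pool from os.cpu_count() is only an assumption about how a worker count might be chosen.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fetch(key):
    # Hypothetical stand-in for a real download; a real version would
    # request the data for `key` from a remote source.
    return key.upper()


def download_all(keys, fetch_one=fetch, max_workers=None):
    # Assumption: size the pool from the processor count when the caller
    # does not specify one. Fall back to 1 if the count is unknown.
    if max_workers is None:
        max_workers = os.cpu_count() or 1
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keys,
        # so zipping pairs each key with its own result.
        for key, value in zip(keys, pool.map(fetch_one, keys)):
            results[key] = value
    return results


result_map = download_all(["aapl", "xom", "msft"])
```

As with return_map() above, the outcome is a dict keyed by the names you passed in. Threads tend to suit this kind of work because each download spends most of its time waiting on the network.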
Note: only tested on Linux.
To install, execute:
pip install ConcurrentPandas
New in 0.1.2
Ability to interact with stock options
Now requires BeautifulSoup4 and Pandas 0.16 or newer.
Tested on Python 2.7.6 and Python 3.4.0
To see what else I’m building, or to follow or contact me, check out my GitHub, Twitter, and personal site.