The frustration
You are halfway through downloading data for 100 stocks when you suddenly realize that half of the dataframes are empty. What just happened? NSE's website probably detected too many requests from your IP, and now you are blocked. But the real question is: do you need to download the same data every time?
Typical analysis cycle
Let's say you are running an analysis every week on the last month of data:
from datetime import date, timedelta
from nsepy import get_history

def download_data_for_month():
    end_day = date.today()
    start_day = end_day - timedelta(30)
    stock_price_df = get_history(symbol="SBIN", start=start_day, end=end_day)
    return stock_price_df

def do_some_analysis(df):
    """Some cool analysis"""

df = download_data_for_month()
do_some_analysis(df)
What's wrong with the above example?
- The script downloads data every time you run it. During development you might change your code frequently, multiple times a day or even an hour. Why re-download the data on each iteration? It not only slows down YOUR development productivity, it also puts unnecessary load on NSE's website.
- If you run the script every week, the incremental change is only 7 days of data, yet you still request the full 30 days each time.
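One simple fix for the first problem is a small cache helper that re-downloads only when the saved file is missing or stale. This is a minimal sketch: `load_or_download`, the pickle path, and the `max_age_days` threshold are all illustrative choices, and `download_fn` stands in for whatever download routine you use.

```python
import os
import time

import pandas as pd

def load_or_download(path, download_fn, max_age_days=7):
    """Return the cached DataFrame if the pickle at `path` is fresh
    enough; otherwise call download_fn(), cache the result, return it."""
    if os.path.exists(path):
        # Age of the cached file in days, based on its modification time
        age_days = (time.time() - os.path.getmtime(path)) / 86400
        if age_days < max_age_days:
            return pd.read_pickle(path)
    df = download_fn()
    df.to_pickle(path)
    return df
```

During development, every run after the first one loads from disk and sends no request to NSE at all.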
What can we change?
Have separate scripts for downloading and analysis
download_data.py
from datetime import date, timedelta
from nsepy import get_history

def download_data_for_month():
    end_day = date.today()
    start_day = end_day - timedelta(30)
    stock_price_df = get_history(symbol="SBIN", start=start_day, end=end_day)
    return stock_price_df

df = download_data_for_month()
df.to_pickle('SBIN.pkl')
analysis.py
import pandas as pd

def do_some_analysis(df):
    """Some cool analysis"""

df = pd.read_pickle('SBIN.pkl')
do_some_analysis(df)
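The second problem, re-downloading all 30 days when only 7 are new, can be handled by treating the pickle as an incremental cache: load it, check the last date it covers, and fetch only the missing rows. This is a sketch under assumptions: `update_cache` and `fetch_fn` are made-up names, and in practice `fetch_fn` would be a thin wrapper around nsepy's `get_history`.

```python
import os
from datetime import date

import pandas as pd

def update_cache(path, fetch_fn, end_day=None):
    """Extend the cached DataFrame at `path` up to `end_day`,
    fetching only the rows newer than what is already on disk.
    fetch_fn(start, end) should return a date-indexed DataFrame."""
    end_ts = pd.Timestamp(end_day or date.today())
    if os.path.exists(path):
        cached = pd.read_pickle(path)
        # Resume one day after the last cached date
        start_ts = cached.index.max() + pd.Timedelta(days=1)
        if start_ts > end_ts:
            return cached  # cache already covers the requested range
        df = pd.concat([cached, fetch_fn(start_ts, end_ts)])
    else:
        # No cache yet: fall back to a full 30-day download
        df = fetch_fn(end_ts - pd.Timedelta(days=30), end_ts)
    df.to_pickle(path)
    return df
```

With this, the weekly run transfers only the new trading days, and only the very first run performs the full 30-day download.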