Description
Describe bug
When yfinance.download is called concurrently (e.g., via concurrent.futures.ThreadPoolExecutor) to download the same ticker with different date ranges, results may be silently overwritten.
This is because yfinance.download() internally uses a shared global dictionary, which is not thread-local or protected from concurrent writes.
Here is a simplified version of the internal function used inside download() (from source):
def _download_one(ticker, ...):
    ...
    data = Ticker(ticker).history(...)
    shared._DFS[ticker.upper()] = data
    return data
Since _DFS is a global dictionary, if two threads are downloading the same ticker with different parameters (e.g., different start/end), they may overwrite each other's results inside _DFS before the final result is returned.
This leads to incorrect results when the user calls yf.download() concurrently.
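To make the interleaving concrete, here is a minimal simulation of the pattern described above. It does not call yfinance: `_DFS` is a plain module-level dict standing in for the shared dictionary, the string values stand in for DataFrames, and `simulate_race`, `first_call`, and `second_call` are hypothetical names. Events force the unlucky ordering so the outcome is deterministic.

```python
import threading

# Hypothetical stand-in for the shared global dictionary, keyed only by ticker.
_DFS = {}


def simulate_race():
    """Force the ordering: first writes, second overwrites, then both read back."""
    first_wrote = threading.Event()
    second_wrote = threading.Event()
    results = {}

    def first_call():
        _DFS["AAPL"] = "rows from 2023-01-01"   # store result under the ticker key
        first_wrote.set()
        second_wrote.wait()                     # the other call overwrites meanwhile
        results["first"] = _DFS["AAPL"]         # read back: no longer our own data

    def second_call():
        first_wrote.wait()
        _DFS["AAPL"] = "rows from 2022-12-01"   # same key, different date range
        second_wrote.set()
        results["second"] = _DFS["AAPL"]

    threads = [threading.Thread(target=first_call),
               threading.Thread(target=second_call)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(simulate_race())
```

Both entries end up holding the 2022-12-01 data, which mirrors the "Bad data proof" output further down, where both results cover the same range.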
Notes on Debug Mode
This issue can be harder to reproduce when debug mode is enabled via yf.enable_debug_mode(), because yfinance.download() then sets threads=False internally.
However, even in debug mode, the user's own code may still call download() from multiple threads, so the overwrite is probably avoided only by timing, not by design.
The root cause remains: a global shared dictionary is accessed and written to concurrently. This is a race condition waiting to happen.
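Until the shared state is addressed in the library, one possible user-side workaround is to serialize the calls behind a lock. The sketch below is an assumption-laden illustration, not yfinance code: `racy_download` is a hypothetical stand-in that writes to and reads from a shared dict the same way, and `safe_download` is the wrapper one might place around `yf.download`. The cost is that downloads no longer overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

_store = {}                      # hypothetical stand-in for the shared dictionary
_download_lock = threading.Lock()


def racy_download(ticker, start):
    """Stand-in for a download that stores its result under the ticker key."""
    _store[ticker] = f"{ticker} from {start}"
    time.sleep(0.05)             # widen the window in which another thread could write
    return _store[ticker]


def safe_download(ticker, start):
    """Allow only one download at a time so the write and read-back stay paired."""
    with _download_lock:
        return racy_download(ticker, start)


def run_safe():
    starts = ("2023-01-01", "2022-12-01")
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(safe_download, "AAPL", s) for s in starts]
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(run_safe())
```

With the lock held across both the write and the read, each call returns the data for its own start date regardless of scheduling.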
Simple code that reproduces your problem
import asyncio
import yfinance as yf
from functools import partial
from concurrent.futures import ThreadPoolExecutor

def download(ticker: str, start: str):
    return yf.download(ticker, start=start, end="2023-01-10", progress=False)

async def main():
    with ThreadPoolExecutor(max_workers=2) as executor:
        loop = asyncio.get_event_loop()
        tasks = [
            loop.run_in_executor(executor, partial(
                download, "AAPL", "2023-01-01")),
            loop.run_in_executor(executor, partial(
                download, "AAPL", "2022-12-01")),
        ]
        results = await asyncio.gather(*tasks)
    for i, df in enumerate(results):
        print(f"\nResult {i + 1}:")
        if df is None or df.empty:
            print("DataFrame is empty!")
        else:
            print(f"  From: {df.index.min().date()} To: {df.index.max().date()}")
            print(f"  First Close: {float(df['Close'].iloc[0].item()):.2f}")
            print(f"  Last Close: {float(df['Close'].iloc[-1].item()):.2f}")
            print(f"  Rows: {len(df)}")

asyncio.run(main())
Debug log from yf.enable_debug_mode()
DEBUG Entering download()
DEBUG Entering download()
YF.download() has changed argument auto_adjust default to True
DEBUG Disabling multithreading because DEBUG logging enabled
DEBUG Disabling multithreading because DEBUG logging enabled
DEBUG Using User-Agent: Mozilla/5.0 (X11; Linux i686; rv:135.0) Gecko/20100101 Firefox/135.0
DEBUG Entering history()
DEBUG Entering history()
DEBUG Entering history()
DEBUG Entering history()
DEBUG AAPL: Yahoo GET parameters: {'period1': '2023-01-01 00:00:00-05:00', 'period2': '2023-01-10 00:00:00-05:00', 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'}
DEBUG Entering get()
DEBUG AAPL: Yahoo GET parameters: {'period1': '2022-12-01 00:00:00-05:00', 'period2': '2023-01-10 00:00:00-05:00', 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'}
DEBUG Entering _make_request()
DEBUG Entering get()
DEBUG url=https://query2.finance.yahoo.com/v8/finance/chart/AAPL
DEBUG Entering _make_request()
DEBUG params=frozendict.frozendict({'period1': 1672549200, 'period2': 1673326800, 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'})
DEBUG url=https://query2.finance.yahoo.com/v8/finance/chart/AAPL
DEBUG Entering _get_cookie_and_crumb()
DEBUG params=frozendict.frozendict({'period1': 1669870800, 'period2': 1673326800, 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'})
DEBUG cookie_mode = 'basic'
DEBUG Entering _get_cookie_and_crumb()
DEBUG Entering _get_cookie_and_crumb_basic()
DEBUG cookie_mode = 'basic'
DEBUG crumb = 'mfIDcZ8MHpF'
DEBUG Exiting _get_cookie_and_crumb_basic()
DEBUG Exiting _get_cookie_and_crumb()
DEBUG Entering _get_cookie_and_crumb_basic()
DEBUG reusing crumb
DEBUG Exiting _get_cookie_and_crumb_basic()
DEBUG Exiting _get_cookie_and_crumb()
DEBUG response code=200
DEBUG Exiting _make_request()
DEBUG Exiting get()
DEBUG AAPL: yfinance received OHLC data: 2023-01-03 14:30:00 -> 2023-01-09 14:30:00
DEBUG AAPL: OHLC after cleaning: 2023-01-03 09:30:00-05:00 -> 2023-01-09 09:30:00-05:00
DEBUG AAPL: OHLC after combining events: 2023-01-03 00:00:00-05:00 -> 2023-01-09 00:00:00-05:00
DEBUG AAPL: yfinance returning OHLC: 2023-01-03 00:00:00-05:00 -> 2023-01-09 00:00:00-05:00
DEBUG Exiting history()
DEBUG Exiting history()
DEBUG Exiting download()
DEBUG response code=200
DEBUG Exiting _make_request()
DEBUG Exiting get()
DEBUG AAPL: yfinance received OHLC data: 2022-12-01 14:30:00 -> 2023-01-09 14:30:00
DEBUG AAPL: OHLC after cleaning: 2022-12-01 09:30:00-05:00 -> 2023-01-09 09:30:00-05:00
DEBUG AAPL: OHLC after combining events: 2022-12-01 00:00:00-05:00 -> 2023-01-09 00:00:00-05:00
DEBUG AAPL: yfinance returning OHLC: 2022-12-01 00:00:00-05:00 -> 2023-01-09 00:00:00-05:00
DEBUG Exiting history()
DEBUG Exiting history()
DEBUG Exiting download()
Result 1:
From: 2023-01-03 To: 2023-01-09
First Close: 123.47
Last Close: 128.49
Rows: 5
Result 2:
From: 2022-12-01 To: 2023-01-09
First Close: 146.41
Last Close: 128.49
Rows: 26
Bad data proof
Result 1:
From: 2022-12-01 To: 2023-01-09
First Close: 146.41
Last Close: 128.49
Rows: 26
Result 2:
From: 2022-12-01 To: 2023-01-09
First Close: 146.41
Last Close: 128.49
Rows: 26
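In the bad output above, Result 1 was requested with start 2023-01-01 but its first row is dated 2022-12-01. A caller could detect this kind of overwrite with a simple date check; the helper below is a hypothetical sketch (`starts_before_request` is not part of yfinance) that uses only the standard library and takes the first index date as a `datetime.date`.

```python
from datetime import date


def starts_before_request(first_row: date, requested_start: str) -> bool:
    """Return True if the data begins earlier than the requested start date.

    Data starting later than requested is normal (weekends, holidays), but
    data starting earlier suggests the result belongs to another request.
    """
    return first_row < date.fromisoformat(requested_start)


if __name__ == "__main__":
    # Values taken from the outputs in this report.
    print(starts_before_request(date(2022, 12, 1), "2023-01-01"))  # overwritten result
    print(starts_before_request(date(2023, 1, 3), "2023-01-01"))   # expected result
```

This only catches the case where the foreign result begins earlier than requested; a result overwritten by a narrower date range would need a comparable check on the end of the range or the row count.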
yfinance version
0.2.64
Python version
3.13
Operating system
macOS Sequoia 15.5