A package for applying a fast implementation of Fisher's exact test to observations in a pandas DataFrame.
Contingency tables are computed for every pair of columns in cols and every pair of unique values within those columns. The results are tested against scipy.stats.fisher_exact, and the package falls back on SciPy if numba is not available.
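To make the pairing concrete, here is a minimal sketch of how a single 2x2 table could be built for one pair of values from two columns and then tested with SciPy. The helper name `pair_table`, the toy data, and the cell layout `[[a, b], [c, d]]` are assumptions made for illustration; they are not taken from the package's internals.

```python
import pandas as pd
from scipy.stats import fisher_exact

def pair_table(df, col1, val1, col2, val2):
    """Hypothetical helper: 2x2 counts of (col1 == val1) versus (col2 == val2)."""
    x = df[col1] == val1
    y = df[col2] == val2
    a = int((x & y).sum())    # both values present
    b = int((x & ~y).sum())   # val1 present, val2 absent
    c = int((~x & y).sum())   # val1 absent, val2 present
    d = int((~x & ~y).sum())  # neither value present
    return [[a, b], [c, d]]

# Toy data, invented for this example
toy = pd.DataFrame({'VA': ['TRAV14', 'TRAV14', 'TRAV12', 'TRAV12', 'TRAV14', 'TRAV3'],
                    'JA': ['TRAJ4', 'TRAJ4', 'TRAJ2', 'TRAJ4', 'TRAJ4', 'TRAJ2']})

table = pair_table(toy, 'VA', 'TRAV14', 'JA', 'TRAJ4')
odds_ratio, pvalue = fisher_exact(table, alternative='two-sided')
```

Repeating this for every pair of columns and every pair of unique values gives one test per combination, which is the workload the package is meant to speed up.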
The package is compatible with Python 3 and can be installed from PyPI or cloned and installed directly.
pip install fishersapi
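import numpy as np
import pandas as pd
import fishersapi

# Vectorized test: each index i defines one 2x2 table from a[i], b[i], c[i], d[i]
n = 50
a = np.random.randint(1, 50, size=n)
b = np.random.randint(1, 50, size=n)
c = np.random.randint(1, 100, size=n)
d = np.random.randint(1, 100, size=n)
ORs, pvalues = fishersapi.fishers_vec(a, b, c, d, alternative='two-sided')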
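For comparison, the vectorized call can be checked against a plain loop over scipy.stats.fisher_exact. The sketch below is only a reference under one assumption: that the four arguments map to the table `[[a, b], [c, d]]`. The function name `fishers_loop` is hypothetical and not part of fishersapi.

```python
import numpy as np
from scipy.stats import fisher_exact

def fishers_loop(a, b, c, d, alternative='two-sided'):
    """Hypothetical reference: one SciPy call per index, assuming layout [[a, b], [c, d]]."""
    ors = np.empty(len(a))
    pvals = np.empty(len(a))
    for i in range(len(a)):
        ors[i], pvals[i] = fisher_exact([[a[i], b[i]], [c[i], d[i]]],
                                        alternative=alternative)
    return ors, pvals

# Small random inputs, seeded so the run is repeatable
rng = np.random.default_rng(0)
va, vb, vc, vd = (rng.integers(1, 50, size=20) for _ in range(4))
ref_ors, ref_pvalues = fishers_loop(va, vb, vc, vd)
```

If the assumed layout is right, passing the same four arrays to fishersapi.fishers_vec should give p-values that agree with `ref_pvalues` up to floating-point tolerance, for example under `np.allclose`.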
n = 50
df = pd.DataFrame({'VA': np.random.choice(['TRAV14', 'TRAV12', 'TRAV3', 'TRAV23', 'TRAV11', 'TRAV6'], n),
                   'JA': np.random.choice(['TRAJ4', 'TRAJ2', 'TRAJ3', 'TRAJ5', 'TRAJ21', 'TRAJ13'], n),
                   'VB': np.random.choice(['TRBV14', 'TRBV12', 'TRBV3', 'TRBV23', 'TRBV11', 'TRBV6'], n),
                   'JB': np.random.choice(['TRBJ4', 'TRBJ2', 'TRBJ3', 'TRBJ5', 'TRBJ21', 'TRBJ13'], n)})
df = df.assign(Count=1)
df.loc[:10, 'Count'] = 15
res = fishersapi.fishers_frame(df, ['VA', 'JA', 'VB', 'JB'], count_col=None, alternative='two-sided')