Skip to content

Commit 05b9687

Browse files
zylaclaude
andauthored
Implement GitHub token auto-refresh with file-based credentials cache (#11)
* Implement GitHub token auto-refresh with file-based credentials cache GitHub App installation access tokens expire after 1 hour, causing 401 "Bad credentials" errors in long-running CI jobs. This change implements automatic token refresh with a file-based cache shared across processes. Changes: - Add expiresAt field to GithubClient to track token expiration - Implement credentials cache file (.github-token-cache.json) for sharing tokens between parent and child processes - Auto-refresh tokens when they expire within 5 minutes - Remove old environment variable approach - Add test infrastructure for configurable token lifetime - Add new test verifying token refresh after expiration The cache file stores tokens with ISO8601 expiration timestamps and uses file locking to prevent concurrent refresh. Child processes read from the cache, and any process can refresh if the token is expired. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * Simplify locking: use single EXCLUSIVE lock for credentials Replace mixed SHARED/EXCLUSIVE locking approach with a single EXCLUSIVE lock for all credential operations. This eliminates the need for double-check locking and simplifies the code significantly. Changes: - Remove readCredentialsCache and writeCredentialsCache functions - Add loadOrRefreshClient that acquires EXCLUSIVE lock for entire operation - Add tryReadCache helper (no locking, caller holds lock) - Add refreshToken helper (creates token and writes cache under lock) - Refactor initClient → createTokenFromGitHub (just creates token) - Fast path in getClient checks IORef without locks Benefits: - No double-check locking needed - Prevents thundering herd (only one process refreshes at a time) - No torn reads (all file operations under EXCLUSIVE lock) - Simpler code with clear locking boundaries - Lock duration is brief (~100ms for HTTP call) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * Fix github-token-refresh test: remove illegal double snapshot call The test was calling snapshot twice within a single task, which is illegal (snapshot writes to a single IORef that can only hold one value). Fixed by removing the second snapshot call and relying on the automatic final status update when the task completes. Test flow now: 1. snapshot posts "pending" status (creates and caches token) 2. Task sleeps 3 seconds (token expires after 2 seconds) 3. Task completes, final "success" status is posted automatically (detects expired cached token and refreshes it) This properly tests token refresh while following the one-snapshot-per-task rule. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * Make GitHub token refresh threshold configurable The hardcoded 300-second (5-minute) refresh threshold was causing the github-token-refresh test to fail. With a 2-second token lifetime, fresh tokens were immediately considered "expiring" (2s < 300s threshold), triggering unnecessary refreshes on every getClient() call. Changes: - Add githubTokenRefreshThresholdSeconds to Settings (default: 300) - Read TASKRUNNER_GITHUB_TOKEN_REFRESH_THRESHOLD_SECONDS env var - Use configurable threshold in getClient and loadOrRefreshClient - Set threshold to 1 second in github-token-refresh test This allows the test to verify proper token refresh behavior: - Fresh tokens (2s remaining >= 1s threshold) are reused - Expired tokens trigger refresh - Test now passes with 2 token requests instead of 4 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> --------- Co-authored-by: Claude <[email protected]>
1 parent 992832a commit 05b9687

File tree

8 files changed

+213
-68
lines changed

8 files changed

+213
-68
lines changed

src/App.hs

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,7 @@ getSettings = do
6363
primeCacheMode <- (==Just "1") <$> lookupEnv "TASKRUNNER_PRIME_CACHE_MODE"
6464
mainBranch <- map toText <$> lookupEnv "TASKRUNNER_MAIN_BRANCH"
6565
quietMode <- (==Just "1") <$> lookupEnv "TASKRUNNER_QUIET"
66+
githubTokenRefreshThresholdSeconds <- maybe 300 read <$> lookupEnv "TASKRUNNER_GITHUB_TOKEN_REFRESH_THRESHOLD_SECONDS"
6667
pure Settings
6768
{ stateDirectory
6869
, rootDirectory
@@ -78,6 +79,7 @@ getSettings = do
7879
, mainBranch
7980
, force = False
8081
, quietMode
82+
, githubTokenRefreshThresholdSeconds
8183
}
8284

8385
main :: IO ()

src/CommitStatus.hs

Lines changed: 154 additions & 63 deletions
Original file line numberDiff line numberDiff line change
@@ -4,19 +4,24 @@ module CommitStatus where
44

55
import Universum
66

7-
import Data.Aeson (FromJSON(..), ToJSON(..), encode)
7+
import Data.Aeson (FromJSON(..), ToJSON(..), encode, eitherDecodeFileStrict)
88
import Data.Time.Clock.POSIX (getPOSIXTime)
9+
import Data.Time.Clock (UTCTime, getCurrentTime, diffUTCTime)
10+
import Data.Time.Format.ISO8601 (iso8601ParseM, iso8601Show)
911
import Web.JWT (Algorithm(RS256), JWTClaimsSet(..), encodeSigned, numericDate, stringOrURI, EncodeSigner (..), readRsaSecret, JOSEHeader (..))
1012
import qualified Data.Text as T
1113
import qualified Data.Text.Encoding as TE
1214
import qualified Network.HTTP.Client as HTTP
1315
import Network.HTTP.Client.TLS (tlsManagerSettings)
14-
import System.Environment (getEnv, lookupEnv, setEnv)
16+
import System.Environment (getEnv, lookupEnv)
1517
import Network.HTTP.Types.Status (Status(..))
1618
import Data.Aeson.Decoding (eitherDecode)
1719
import qualified Data.Text as Text
20+
import qualified Data.ByteString.Lazy as BL
21+
import System.FileLock (withFileLock, SharedExclusive(..))
22+
import System.Directory (doesFileExist)
1823
import Utils (getCurrentCommit, logError, logDebug)
19-
import Types (AppState(..), GithubClient(..))
24+
import Types (AppState(..), GithubClient(..), Settings(..))
2025

2126
-- Define the data types for the status update
2227
data StatusRequest = StatusRequest
@@ -35,86 +40,172 @@ data StatusResponse = StatusResponse
3540
deriving anyclass (FromJSON)
3641

3742
-- Define the data type for the installation token response
38-
newtype InstallationTokenResponse = InstallationTokenResponse
43+
data InstallationTokenResponse = InstallationTokenResponse
3944
{ token :: T.Text
45+
, expires_at :: T.Text
4046
} deriving (Show, Generic)
4147
deriving anyclass (FromJSON)
4248

49+
-- Cache file for storing credentials across processes
50+
data CredentialsCache = CredentialsCache
51+
{ cachedToken :: T.Text
52+
, cachedExpiresAt :: T.Text
53+
} deriving (Show, Generic)
54+
deriving anyclass (FromJSON, ToJSON)
55+
56+
credentialsCacheFile :: Settings -> FilePath
57+
credentialsCacheFile settings = settings.stateDirectory <> "/.github-token-cache.json"
58+
59+
-- Try to read cache file (no locking - caller should hold lock)
60+
tryReadCache :: FilePath -> IO (Maybe (T.Text, UTCTime))
61+
tryReadCache cacheFile = do
62+
exists <- doesFileExist cacheFile
63+
if exists then do
64+
result <- eitherDecodeFileStrict @CredentialsCache cacheFile
65+
case result of
66+
Left _ -> pure Nothing
67+
Right cache ->
68+
case iso8601ParseM (toString cache.cachedExpiresAt) of
69+
Just expiresAt -> pure $ Just (cache.cachedToken, expiresAt)
70+
Nothing -> pure Nothing
71+
else
72+
pure Nothing
73+
4374
getClient :: AppState -> IO GithubClient
4475
getClient appState = do
4576
mClient <- readIORef appState.githubClient
4677
case mClient of
47-
Just client -> pure client
78+
Just client -> do
79+
-- Fast path: check if cached token is still valid
80+
now <- getCurrentTime
81+
let threshold = fromIntegral appState.settings.githubTokenRefreshThresholdSeconds
82+
if diffUTCTime client.expiresAt now >= threshold
83+
then pure client
84+
else do
85+
-- Token expiring, need to refresh
86+
logDebug appState $ "GitHub token expired or expiring soon (in " <> show (floor (diffUTCTime client.expiresAt now) :: Int) <> "s), refreshing..."
87+
writeIORef appState.githubClient Nothing
88+
loadOrRefreshClient appState
89+
Nothing ->
90+
loadOrRefreshClient appState
91+
92+
loadOrRefreshClient :: AppState -> IO GithubClient
93+
loadOrRefreshClient appState = do
94+
let cacheFile = credentialsCacheFile appState.settings
95+
let lockFile = cacheFile <> ".lock"
96+
let threshold = fromIntegral appState.settings.githubTokenRefreshThresholdSeconds
97+
98+
client <- withFileLock lockFile Exclusive \_ -> do
99+
-- Under EXCLUSIVE lock: read, check, refresh if needed
100+
mCached <- tryReadCache cacheFile
101+
102+
now <- getCurrentTime
103+
case mCached of
104+
Just (cachedToken, expiresAt)
105+
| diffUTCTime expiresAt now >= threshold -> do
106+
-- Valid cached token
107+
logDebug appState "Using cached GitHub token from file"
108+
buildClientWithToken appState cachedToken expiresAt
109+
| otherwise -> do
110+
-- Expired token, refresh
111+
logDebug appState "Cached token expired, refreshing"
112+
refreshToken appState cacheFile
113+
Nothing -> do
114+
-- No cache, create new token
115+
logDebug appState "No cached token, creating new one"
116+
refreshToken appState cacheFile
117+
118+
writeIORef appState.githubClient (Just client)
119+
pure client
120+
121+
-- Create new token and write to cache (caller should hold EXCLUSIVE lock)
122+
refreshToken :: AppState -> FilePath -> IO GithubClient
123+
refreshToken appState cacheFile = do
124+
tokenResponse <- createTokenFromGitHub appState
125+
126+
expiresAt <- case iso8601ParseM (toString tokenResponse.expires_at) of
127+
Just t -> pure t
48128
Nothing -> do
49-
client <- initClient appState
50-
writeIORef appState.githubClient $ Just client
51-
pure client
129+
logError appState $ "CommitStatus: Failed to parse expires_at: " <> tokenResponse.expires_at
130+
exitFailure
131+
132+
-- Write to cache (already under EXCLUSIVE lock, no additional locking needed)
133+
let cache = CredentialsCache
134+
{ cachedToken = tokenResponse.token
135+
, cachedExpiresAt = T.pack $ iso8601Show expiresAt
136+
}
137+
BL.writeFile cacheFile (encode cache)
138+
139+
-- Build and return client
140+
buildClientWithToken appState tokenResponse.token expiresAt
52141

53-
initClient :: AppState -> IO GithubClient
54-
initClient appState = do
142+
buildClientWithToken :: AppState -> T.Text -> UTCTime -> IO GithubClient
143+
buildClientWithToken _appState accessToken expiresAt = do
55144
-- Load environment variables
56145
apiUrl <- fromMaybe "https://api.github.com" <$> lookupEnv "GITHUB_API_URL"
57146
appId <- getEnv "GITHUB_APP_ID"
58147
installationId <- getEnv "GITHUB_INSTALLATION_ID"
59148
privateKeyStr <- getEnv "GITHUB_APP_PRIVATE_KEY"
60149
owner <- getEnv "GITHUB_REPOSITORY_OWNER"
61150
repo <- getEnv "GITHUB_REPOSITORY"
151+
manager <- HTTP.newManager tlsManagerSettings
152+
153+
pure $ GithubClient
154+
{ apiUrl = T.pack apiUrl
155+
, appId = T.pack appId
156+
, installationId = T.pack installationId
157+
, privateKey = T.pack privateKeyStr
158+
, owner = T.pack owner
159+
, repo = T.pack repo
160+
, manager = manager
161+
, accessToken = accessToken
162+
, expiresAt = expiresAt
163+
}
164+
165+
-- Create a new GitHub App installation token from GitHub API
166+
createTokenFromGitHub :: AppState -> IO InstallationTokenResponse
167+
createTokenFromGitHub appState = do
168+
-- Load environment variables
169+
apiUrl <- fromMaybe "https://api.github.com" <$> lookupEnv "GITHUB_API_URL"
170+
appId <- getEnv "GITHUB_APP_ID"
171+
installationId <- getEnv "GITHUB_INSTALLATION_ID"
172+
privateKeyStr <- getEnv "GITHUB_APP_PRIVATE_KEY"
173+
62174
-- Prepare the HTTP manager
63175
manager <- HTTP.newManager tlsManagerSettings
64176

65-
let createToken = do
66-
let privateKeyBytes = encodeUtf8 $ Text.replace "|" "\n" $ toText privateKeyStr
67-
let privateKey = fromMaybe (error "Invalid github key") $ readRsaSecret privateKeyBytes
68-
69-
-- Create the JWT token
70-
now <- getPOSIXTime
71-
let claims = mempty { iss = stringOrURI $ T.pack appId
72-
, iat = numericDate now
73-
, exp = numericDate (now + 5 * 60)
74-
}
75-
let jwt = encodeSigned (EncodeRSAPrivateKey privateKey) (mempty { alg = Just RS256 }) claims
76-
77-
-- Get the installation access token
78-
let installUrl = apiUrl <> "/app/installations/" ++ installationId ++ "/access_tokens"
79-
initRequest <- HTTP.parseRequest installUrl
80-
let request = initRequest
81-
{ HTTP.method = "POST"
82-
, HTTP.requestHeaders =
83-
[ ("Authorization", "Bearer " <> TE.encodeUtf8 jwt)
84-
, ("Accept", "application/vnd.github.v3+json")
85-
, ("User-Agent", "restaumatic-bot")
86-
]
87-
}
88-
response <- HTTP.httpLbs request manager
89-
let mTokenResponse = eitherDecode @InstallationTokenResponse (HTTP.responseBody response)
90-
case mTokenResponse of
91-
Left err -> do
92-
logError appState $ "CommitStatus: Failed to parse installation token response: " <> show err
93-
logError appState $ "CommitStatus: Response: " <> decodeUtf8 response.responseBody
94-
95-
-- FIXME: handle the error better
96-
exitFailure
97-
Right tokenResponse ->
98-
pure tokenResponse.token
99-
100-
-- Try to read token from environment variable
101-
-- Otherwise generate a new one, and set env for future uses (also in child processes)
102-
accessToken <- lookupEnv "_taskrunner_github_access_token" >>= \case
103-
Just token -> pure $ T.pack token
104-
Nothing -> do
105-
token <- createToken
106-
setEnv "_taskrunner_github_access_token" $ T.unpack token
107-
pure token
108-
109-
pure $ GithubClient { apiUrl = T.pack apiUrl
110-
, appId = T.pack appId
111-
, installationId = T.pack installationId
112-
, privateKey = T.pack privateKeyStr
113-
, owner = T.pack owner
114-
, repo = T.pack repo
115-
, manager = manager
116-
, accessToken = accessToken
117-
}
177+
let privateKeyBytes = encodeUtf8 $ Text.replace "|" "\n" $ toText privateKeyStr
178+
let privateKey = fromMaybe (error "Invalid github key") $ readRsaSecret privateKeyBytes
179+
180+
-- Create the JWT token
181+
now <- getPOSIXTime
182+
let claims = mempty { iss = stringOrURI $ T.pack appId
183+
, iat = numericDate now
184+
, exp = numericDate (now + 5 * 60)
185+
}
186+
let jwt = encodeSigned (EncodeRSAPrivateKey privateKey) (mempty { alg = Just RS256 }) claims
187+
188+
-- Get the installation access token
189+
let installUrl = apiUrl <> "/app/installations/" ++ installationId ++ "/access_tokens"
190+
initRequest <- HTTP.parseRequest installUrl
191+
let request = initRequest
192+
{ HTTP.method = "POST"
193+
, HTTP.requestHeaders =
194+
[ ("Authorization", "Bearer " <> TE.encodeUtf8 jwt)
195+
, ("Accept", "application/vnd.github.v3+json")
196+
, ("User-Agent", "restaumatic-bot")
197+
]
198+
}
199+
response <- HTTP.httpLbs request manager
200+
let mTokenResponse = eitherDecode @InstallationTokenResponse (HTTP.responseBody response)
201+
case mTokenResponse of
202+
Left err -> do
203+
logError appState $ "CommitStatus: Failed to parse installation token response: " <> show err
204+
logError appState $ "CommitStatus: Response: " <> decodeUtf8 response.responseBody
205+
-- FIXME: handle the error better
206+
exitFailure
207+
Right tokenResponse ->
208+
pure tokenResponse
118209

119210
updateCommitStatus :: MonadIO m => AppState -> StatusRequest -> m ()
120211
updateCommitStatus appState statusRequest = liftIO do

src/Types.hs

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ import Universum
44
import SnapshotCliArgs (SnapshotCliArgs)
55
import Data.Aeson (FromJSON, ToJSON)
66
import qualified Network.HTTP.Client as HTTP
7+
import Data.Time.Clock (UTCTime)
78

89
data Settings = Settings
910
{ stateDirectory :: FilePath
@@ -20,6 +21,7 @@ data Settings = Settings
2021
, mainBranch :: Maybe Text
2122
, force :: Bool
2223
, quietMode :: Bool
24+
, githubTokenRefreshThresholdSeconds :: Int
2325
} deriving (Show)
2426

2527
type JobName = String
@@ -66,4 +68,5 @@ data GithubClient = GithubClient
6668
, repo :: Text
6769
, manager :: HTTP.Manager
6870
, accessToken :: Text
71+
, expiresAt :: UTCTime
6972
}

test/FakeGithubApi.hs

Lines changed: 20 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
{-# LANGUAGE OverloadedStrings #-}
22
{-# LANGUAGE RecursiveDo #-}
33

4-
module FakeGithubApi (Server, start, stop, clearOutput, getOutput) where
4+
module FakeGithubApi (Server, start, stop, clearOutput, getOutput, setTokenLifetime) where
55

66
import Universum
77

@@ -11,6 +11,8 @@ import Network.HTTP.Types (status200, status201, status400, status404, methodPos
1111
import Data.Aeson (encode, object, (.=), Value)
1212
import qualified Data.Aeson as Aeson
1313
import qualified Data.Map.Strict as Map
14+
import Data.Time.Clock (getCurrentTime, addUTCTime)
15+
import Data.Time.Format.ISO8601 (iso8601Show)
1416

1517
import Control.Concurrent (forkIO, ThreadId, killThread)
1618

@@ -31,9 +33,17 @@ handleAccessTokenRequest :: Server -> Text -> Request -> (Response -> IO Respons
3133
handleAccessTokenRequest server instId req respond =
3234
if requestMethod req == methodPost
3335
then do
36+
-- Read token lifetime from server state
37+
lifetimeSeconds <- readIORef server.tokenLifetimeSeconds
38+
now <- getCurrentTime
39+
let expiresAt = addUTCTime (fromIntegral lifetimeSeconds) now
3440
addOutput server $ "Requested access token for installation " <> instId
3541
respond $ responseLBS status200 [("Content-Type", "application/json")]
36-
(encode $ object ["token" .= ("mock-access-token" :: Text), "installation_id" .= instId])
42+
(encode $ object
43+
[ "token" .= ("mock-access-token" :: Text)
44+
, "expires_at" .= iso8601Show expiresAt
45+
, "installation_id" .= instId
46+
])
3747
else respond $ responseLBS status400 [] "Bad Request"
3848

3949
handleCommitStatusRequest :: Server -> Text -> Text -> Text -> Request -> (Response -> IO ResponseReceived) -> IO ResponseReceived
@@ -61,16 +71,18 @@ data Server = Server
6171
{ tid :: ThreadId
6272
, output :: IORef [Text]
6373
, statuses :: IORef (Map Text [Value]) -- Map from commit SHA to list of status objects
74+
, tokenLifetimeSeconds :: IORef Int
6475
}
6576

6677
start :: Int -> IO Server
6778
start port = do
6879
started <- newEmptyMVar
6980
output <- newIORef []
7081
statuses <- newIORef Map.empty
82+
tokenLifetimeSeconds <- newIORef 3600 -- Default: 1 hour
7183
let settings = Warp.setPort port $ Warp.setBeforeMainLoop (putMVar started ()) Warp.defaultSettings
7284
rec
73-
let server = Server {tid, output, statuses}
85+
let server = Server {tid, output, statuses, tokenLifetimeSeconds}
7486
tid <- forkIO $ Warp.runSettings settings $ app server
7587
takeMVar started
7688
pure server
@@ -82,9 +94,10 @@ addOutput :: Server -> Text -> IO ()
8294
addOutput (Server {output}) msg = modifyIORef output (msg :)
8395

8496
clearOutput :: Server -> IO ()
85-
clearOutput (Server {output, statuses}) = do
97+
clearOutput (Server {output, statuses, tokenLifetimeSeconds}) = do
8698
writeIORef output []
8799
writeIORef statuses Map.empty
100+
writeIORef tokenLifetimeSeconds 3600 -- Reset to default
88101

89102
getOutput :: Server -> IO [Text]
90103
getOutput (Server {output}) = reverse <$> readIORef output
@@ -100,3 +113,6 @@ getStatuses :: Server -> Text -> IO [Value]
100113
getStatuses (Server {statuses}) commitSha = do
101114
statusMap <- readIORef statuses
102115
pure $ fromMaybe [] $ Map.lookup commitSha statusMap
116+
117+
setTokenLifetime :: Server -> Int -> IO ()
118+
setTokenLifetime server seconds = writeIORef server.tokenLifetimeSeconds seconds

0 commit comments

Comments
 (0)