Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jobs not subscribing after unsubscribing from time to time #227

Open
ukulekek opened this issue Nov 26, 2021 · 4 comments
Open

Jobs not subscribing after unsubscribing from time to time #227

ukulekek opened this issue Nov 26, 2021 · 4 comments
Labels

Comments

@ukulekek
Copy link

Hello, our Jenkins jobs are being unsubscribed without subscribing back from time to time. I have found two different reasons for not subscribing back in logs. We are using the latest 1.1.24 version of the plugin.

1:

2021-11-04 18:12:26.639+0000 [id=1256]  INFO    c.r.j.p.c.m.ActiveMqMessagingWorker#unsubscribe: Unsubscribing job 'ci-brew-docker-dispatcher' from the 'Consumer.rh-jenkins-ci-plugin.60aa7a6
b-df3b-5d9b-b08d-198beeb4db36.VirtualTopic.eng.brew.build.complete.>' topic.
2021-11-04 18:12:26.727+0000 [id=1256]  SEVERE  h.i.i.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler#uncaughtException: A thread (CIBuildTrigger-ci-brew-docker-dispatcher-1-
ActiveMqMessagingProvider/1256) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
java.lang.StackOverflowError
        at org.acegisecurity.context.SecurityContext$1.getAuthentication(SecurityContext.java:46)
        at org.acegisecurity.context.SecurityContext$2.getAuthentication(SecurityContext.java:60)
        at org.acegisecurity.context.SecurityContext$1.getAuthentication(SecurityContext.java:46)
        at org.acegisecurity.context.SecurityContext$2.getAuthentication(SecurityContext.java:60)
        at org.acegisecurity.context.SecurityContext$1.getAuthentication(SecurityContext.java:46)
...

No messages from the job appeared after that. We resolved it by disabling and enabling the job back.

2:

2021-11-10 00:53:32.818+0000 [id=33]    INFO    c.r.j.plugins.ci.CIBuildTrigger#stopTriggerThreads: Removed thread lock for 'ci-dispatch-github_pull_request_opened'.
2021-11-10 00:53:32.818+0000 [id=33]    INFO    c.r.j.plugins.ci.CIBuildTrigger#startTriggerThreads: Starting thread (83) for 'ci-dispatch-github_pull_request_opened'.
2021-11-10 00:53:32.830+0000 [id=33]    INFO    jenkins.model.Jenkins$5#runTask: Took 514ms for Loading item ci-dispatch-github_pull_request_opened by pool-6-thread-6
2021-11-10 00:53:33.020+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#subscribe: Subscribing job 'ci-dispatch-github_pull_request_opened' to org.fedoraproject.prod.github.pull_request.opened topic.
2021-11-10 00:53:33.021+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#subscribe: Successfully subscribed job 'ci-dispatch-github_pull_request_opened' to topic 'org.fedoraproject.prod.github.pull_request.opened'.
2021-11-10 00:53:36.225+0000 [id=28]    WARNING    j.a.s.AdvancedQueueSorterQueueListener#onEnterBuildable: onEnterBuilding() called without prior call to onEnterWaiting() for 'ci-dispatch-github_pull_request_opened'
2021-11-10 00:53:36.226+0000 [id=28]    WARNING    j.a.s.AdvancedQueueSorterQueueListener#onEnterBuildable: onEnterBuilding() called without prior call to onEnterWaiting() for 'ci-dispatch-github_pull_request_opened'
2021-11-10 01:53:33.124+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#receive: No message received for the past 3600000 ms, re-subscribing for job 'ci-dispatch-github_pull_request_opened'.
2021-11-10 01:53:33.131+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#unsubscribe: Un-subscribing job 'ci-dispatch-github_pull_request_opened' from org.fedoraproject.prod.github.pull_request.opened topic.
2021-11-10 01:53:33.162+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#subscribe: Subscribing job 'ci-dispatch-github_pull_request_opened' to org.fedoraproject.prod.github.pull_request.opened topic.
2021-11-10 01:53:33.166+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#subscribe: Successfully subscribed job 'ci-dispatch-github_pull_request_opened' to topic 'org.fedoraproject.prod.github.pull_request.opened'.
2021-11-10 02:06:03.269+0000 [id=7312]    INFO    c.r.j.plugins.ci.CIBuildTrigger#stopTriggerThreads: Stopping thread (83) for 'ci-dispatch-github_pull_request_opened'.
2021-11-10 02:06:03.269+0000 [id=7312]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#prepareForInterrupt: Un-subscribing job 'ci-dispatch-github_pull_request_opened' from org.fedoraproject.prod.github.pull_request.opened topic.
2021-11-10 02:06:03.270+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#receive: We have been interrupted...
2021-11-10 02:06:03.270+0000 [id=83]    INFO    c.r.j.p.c.m.FedMsgMessagingWorker#unsubscribe: We are being interrupted. Skipping unsubscribe...

Again, no messages appeared after that. With this one, even if it looks that unsubscribing was skipped, however, it wasn't as the job hasn't been triggered after that. In this case, disable/enable didn't work, there was no trigger thread running. Only Jenkins restart did help.

Version report

Jenkins and plugins versions report:

Jenkins: 2.289.3
OS: Linux - 3.10.0-1062.4.3.el7.x86_64
---
ColumnPack-plugin:1.0.3
ColumnsPlugin:1.0
PrioritySorter:4.0.0
ace-editor:1.1
ansicolor:1.0.0
ant:1.11
antisamy-markup-formatter:2.1
apache-httpcomponents-client-4-api:4.5.13-1.0
authentication-tokens:1.4
badge:1.8
bootstrap4-api:4.6.0-3
bootstrap5-api:5.1.1-1
bouncycastle-api:2.25
branch-api:2.7.0
build-failure-analyzer:2.1.0
build-name-setter:2.2.0
build-pipeline-plugin:1.5.8
build-timeout:1.20
build-view-column:0.3
build-with-parameters:1.5.1
buildresult-trigger:0.17
built-on-column:1.1
caffeine-api:2.9.2-29.v717aac953ff3
checks-api:1.7.2
cloudbees-disk-usage-simple:0.10
cloudbees-folder:6.16
command-launcher:1.6
compact-columns:1.13
conditional-buildstep:1.4.1
configuration-as-code:1.53
configure-job-column-plugin:1.0
console-column-plugin:1.5
console-tail:1.1
copy-to-slave:1.4.4
copyartifact:1.46.2
credentials:2.6.1
credentials-binding:1.27
cron_column:1.4
cvs:2.19
dashboard-view:2.17
depgraph-view:1.0.5
description-column-plugin:1.3
description-setter:1.10
disk-usage:0.28
display-url-api:2.3.5
docker-commons:1.17
docker-workflow:1.26
dtkit-api:3.0.0
dumpling:2.2
durable-task:1.39
dynamicparameter:0.2.0
echarts-api:5.1.2-11
email-ext:2.83
email-ext-recipients-column:1.0
envinject:2.4.0
envinject-api:1.7
extended-read-permission:3.2
external-monitor-job:1.7
extra-columns:1.24
ez-templates:1.3.4
flexible-publish:0.16.1
font-awesome-api:5.15.4-1
gerrit-trigger:2.35.1
git:4.8.2
git-client:3.9.0
git-server:1.10
github:1.34.1
github-api:1.133
greenballs:1.15.1
groovy:2.4
groovy-postbuild:2.5
handlebars:3.0.8
icon-shim:3.0.0
instant-messaging:1.46
ircbot:2.40
jackson2-api:2.12.4
javadoc:1.6
jdk-tool:1.5
jenkins-multijob-plugin:1.36
jms-messaging:1.1.24
jobConfigHistory:2.28.1
jobtype-column:1.3
jquery:1.12.4-1
jquery-detached:1.2.1
jquery-ui:1.0.2
jquery3-api:3.6.0-2
jsch:0.1.55.2
junit:1.53
kerberos-sso:1.7
ldap:2.7
locale:1.4
lockable-resources:2.11
mailer:1.34
mapdb-api:1.0.9.0
mask-passwords:3.0
material-theme:0.4.1
matrix-auth:2.6.8
matrix-project:1.19
maven-plugin:3.13
metrics:4.0.2.8
momentjs:1.1.1
monitoring:1.88.0
multiple-scms:0.8
nested-view:1.21
nodelabelparameter:1.9.2
nodenamecolumn:1.2
okhttp-api:3.14.9
ownership:0.13.0
pam-auth:1.6
parameterized-trigger:2.41
pipeline-build-step:2.15
pipeline-graph-analysis:1.11
pipeline-input-step:2.12
pipeline-milestone-step:1.3.2
pipeline-model-api:1.9.2
pipeline-model-declarative-agent:1.1.1
pipeline-model-definition:1.9.2
pipeline-model-extensions:1.9.2
pipeline-rest-api:2.19
pipeline-stage-step:2.5
pipeline-stage-tags-metadata:1.9.2
pipeline-stage-view:2.19
pipeline-utility-steps:2.10.0
plain-credentials:1.7
plugin-util-api:2.4.0
popper-api:1.16.1-2
popper2-api:2.10.1-1
postbuildscript:3.0.0
progress-bar-column-plugin:1.0
project-build-times:1.2.1
prometheus:2.0.10
python:1.3
rebuild:1.32
redhat-ci-plugin:1.5.11
resource-disposer:0.16
role-strategy:3.2.0
run-condition:1.5
saferestart:0.3
schedule-build:0.5.1
scm-api:2.6.5
script-security:1.78
scriptler:3.3
scripttrigger:0.34
sectioned-view:1.25
shiningpanda:0.24
skip-certificate-check:1.0
snakeyaml-api:1.29.1
sonar:2.13.1
sonargraph-plugin:1.6.4
ssh-agent:1.23
ssh-credentials:1.19
ssh-slaves:1.33.0
sshd:3.1.0
structs:1.23
subversion:2.14.5
support-core:2.76
swarm:3.28
test-stability:2.3
theme-manager:0.6
thinBackup:1.10
throttle-concurrents:2.4
timestamper:1.13
tmpcleaner:1.3
token-macro:266.v44a80cf277fd
toolenv:1.2
translation:1.16
trilead-api:1.0.13
update-sites-manager:2.0.0
variant:1.4
windows-slaves:1.8
workflow-aggregator:2.6
workflow-api:2.46
workflow-basic-steps:2.24
workflow-cps:2.94
workflow-cps-global-lib:2.21
workflow-durable-task-step:2.40
workflow-job:2.41
workflow-multibranch:2.26
workflow-scm-step:2.13
workflow-step-api:2.24
workflow-support:3.8
ws-cleanup:0.39
xunit:3.0.3
xvnc:1.25
zmq-event-publisher:0.0.5
  • What Operating System are you using (both controller, and any agents involved in the problem)?
RHEL-7.7

Reproduction steps

I'm unable to reproduce the issue on purpose.

Results

Expected result:

The job is subscribed back.

Actual result:

The job keeps being unsubscribed.

@ukulekek ukulekek added the bug label Nov 26, 2021
@olivergondza
Copy link
Member

@ukulekek, thanks for the report. Please attach full (or at least relevant part of the) StackOverflowError stacktrace - we need to know how did it manage to get into that infinite recursion.

@ukulekek
Copy link
Author

@olivergondza the full StackOverflowError consists of the same two lines repeating 1024 times, starting and ending with:

2021-11-04 18:12:26.727+0000 [id=1256]  SEVERE  h.i.i.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler#uncaughtException: A thread (CIBuildTrigger-ci-brew-docker-dispatcher-1-ActiveMqMessagingProvider/1256) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
java.lang.StackOverflowError
        at org.acegisecurity.context.SecurityContext$1.getAuthentication(SecurityContext.java:46)
        at org.acegisecurity.context.SecurityContext$2.getAuthentication(SecurityContext.java:60)
        at org.acegisecurity.context.SecurityContext$1.getAuthentication(SecurityContext.java:46)
        at org.acegisecurity.context.SecurityContext$2.getAuthentication(SecurityContext.java:60)
...
        at org.acegisecurity.context.SecurityContext$1.getAuthentication(SecurityContext.java:46)
        at org.acegisecurity.context.SecurityContext$2.getAuthentication(SecurityContext.java:60)
        at org.acegisecurity.context.SecurityContext$1.getAuthentication(SecurityContext.java:46)
        at org.acegisecurity.context.SecurityContext$2.getAuthentication(SecurityContext.java:60)

@olivergondza
Copy link
Member

Unfortunately, that does not help us much in finding out the cause of it. No idea why is the stacktrace shortened.

However, I found https://issues.jenkins.io/browse/JENKINS-65590 describing the same symptoms addressed by update in the API they use. It is something we eventually must do here as well, so it might be worth prioritizing it.

@olivergondza olivergondza changed the title Jobs not subsribing after unsubscribing from time to time Jobs not subscribing after unsubscribing from time to time Dec 1, 2021
@olivergondza
Copy link
Member

Filing API update: #228

It is strange we did not see this on 2.289.2 or 3.303.3. Is there any particular way to reproduce this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants