Closed
Labels
bug: Something isn't working
Description
Describe the bug
When following the Get Started with AWS IoT tutorial, I replaced the Cloud9 instance with my own Raspberry Pi. I managed to do everything up to the end of 3.3, but for some reason the aws-iot-device-client service does not work. When I run the executable directly (sudo /sbin/aws-iot-device-client --config-file /etc/.aws-iot-device-client/aws-iot-device-client.conf), I get a "Segmentation fault" error.
To Reproduce
Steps to reproduce the behavior:
- Install a clean version of Raspberry Pi OS Lite
- Install cmake, libssl-dev, git
- Install the AWS IoT SDK (is this necessary?)
- Install and configure the AWS CLI
- Install the AWS IoT Device Client
- Execute the setup.sh script
Logs
2024-07-04T14:02:35.906Z [INFO] {Config.cpp}: Successfully fetched JSON config file:
{
"endpoint": "endpoint",
"cert": "cert",
"key": "key",
"root-ca": "root-ca",
"thing-name": "deviceClientThing",
"logging": {
"level": "DEBUG",
"type": "FILE",
"file": "/var/log/aws-iot-device-client/aws-iot-device-client.log",
"enable-sdk-logging": false,
"sdk-log-level": "TRACE",
"sdk-log-file": "/var/log/aws-iot-device-client/sdk.log"
},
"jobs": {
"enabled": true,
"handler-directory": "/etc/.aws-iot-device-client/jobs"
},
"tunneling": {
"enabled": true
},
"device-defender": {
"enabled": true,
"interval": 300
},
"fleet-provisioning": {
"enabled": false,
"template-name": "",
"template-parameters": "",
"csr-file": "",
"device-key": ""
},
"samples": {
"pub-sub": {
"enabled": true,
"publish-topic": "/topic/workshop/dc/pub",
"publish-file": "/home/pi/workshop_dc/pubfile.txt",
"subscribe-topic": "/topic/workshop/dc/sub",
"subscribe-file": "/home/pi/workshop_dc/subfile.txt"
}
},
"config-shadow": {
"enabled": false
},
"sample-shadow": {
"enabled": false,
"shadow-name": "",
"shadow-input-file": "",
"shadow-output-file": ""
}
}
2024-07-04T14:02:35.906Z [INFO] {FileUtils.cpp}: Successfully create directory /root/.aws-iot-device-client/sample-shadow/ with required permissions 700
2024-07-04T14:02:35.906Z [INFO] {Config.cpp}: ~/.aws-iot-device-client/sample-shadow/default-sample-shadow-document
2024-07-04T14:02:35.906Z [INFO] {Config.cpp}: Succesfully create default file: /root/.aws-iot-device-client/sample-shadow/default-sample-shadow-document required for storage of shadow document
2024-07-04T14:02:35.906Z [DEBUG] {Config.cpp}: Did not find a runtime configuration file, assuming Fleet Provisioning has not run for this device
2024-07-04T14:02:35.906Z [DEBUG] {Config.cpp}: Did not find a http proxy config file /root/.aws-iot-device-client/http-proxy.conf, assuming HTTP proxy is disabled on this device
2024-07-04T14:02:35.907Z [DEBUG] {EnvUtils.cpp}: Updated PATH environment variable to: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/.aws-iot-device-client:/root/.aws-iot-device-client/jobs:/home/pi/workshop_dc/aws-iot-device-client:/home/pi/workshop_dc/aws-iot-device-client/jobs
2024-07-04T14:02:35.907Z [DEBUG] {LockFile.cpp}: creating lockfile
2024-07-04T14:02:35.907Z [INFO] {Main.cpp}: Now running AWS IoT Device Client version v1.9.1-bfae937
2024-07-04T14:02:35.907Z [INFO] {SharedCrtResourceManager.cpp}: SDK logging is disabled. Enable it with --enable-sdk-logging on the command line or logging::enable-sdk-logging in your configuration file
2024-07-04T14:02:35.907Z [DEBUG] {Retry.cpp}: Retryable function starting, it will retry until success
2024-07-04T14:02:36.022Z [INFO] {SharedCrtResourceManager.cpp}: Establishing MQTT connection with client id deviceClientThing...
2024-07-04T14:02:36.648Z [INFO] {SharedCrtResourceManager.cpp}: MQTT connection established with return code: 0
Segmentation fault
Additional context
Is this related to this issue?
HarshGandhi-AWS commented on Jul 17, 2024
Hello @veigaMak, thank you for reaching out to us. To answer your question: no, the issue you linked should not be related, since it was resolved in a previous client version update.
Give us some time to reproduce the issue and find the root cause. Most likely it is a setup issue, but I can share more details once I have reproduced and solved it.
Regards,
Harsh Gandhi
rui-maksense commented on Aug 20, 2024
Hello @HarshGandhi-AWS.
Any news on this issue? I'm also getting this exact behaviour.
Here's some additional info on my system
Cheers
ig15 commented on Sep 25, 2024
Hi @rui-maksense. Thanks for reaching out to us. I see that you haven't provided absolute paths to the cert, key and root-ca in your aws-iot-device-client.conf file. It also seems you haven't provided the endpoint. Kindly try with these two modifications and let me know if it works.
It should look something like:
If it still doesn't work, we suggest following the updated documentation for Device Client; let us know if that resolves your issue.
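As a rough illustration of the two checks described above (this is not the snippet from the original comment), here is a minimal Python sketch. The function name `find_config_problems` is hypothetical, the key names come from the config printed in the issue's log, and the "an endpoint contains a dot" test is only an assumed heuristic for spotting placeholders.

```python
import posixpath

# Key names as they appear in the config printed in the issue's log.
PATH_KEYS = ("cert", "key", "root-ca")


def find_config_problems(config: dict) -> list[str]:
    """Return human-readable problems with the endpoint and credential paths."""
    problems = []
    endpoint = config.get("endpoint", "")
    # Assumed heuristic: a real endpoint is a hostname, so it contains a dot.
    if "." not in endpoint:
        problems.append(f"endpoint looks like a placeholder: {endpoint!r}")
    for name in PATH_KEYS:
        value = config.get(name, "")
        # The device runs Linux, so POSIX path rules are used explicitly.
        if not posixpath.isabs(value):
            problems.append(f"{name} is not an absolute path: {value!r}")
    return problems


# The redacted values from the posted log trigger all four checks.
redacted = {"endpoint": "endpoint", "cert": "cert", "key": "key", "root-ca": "root-ca"}
for problem in find_config_problems(redacted):
    print(problem)
```

In practice the dictionary would come from `json.load` on the real aws-iot-device-client.conf rather than from an inline literal.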
veigaMak commented on Sep 25, 2024
Hi @ig15, I was the one who provided the aws-iot-device-client.conf file. Yes, I know the log does not show the correct paths: I removed them before posting, for privacy reasons. In the actual log the paths were correct.
ig15 commented on Oct 15, 2024
Hey @veigaMak. Can you set
"enable-sdk-logging": true, "sdk-log-level": "DEBUG",
and send us the resulting logs, to help us better understand and debug the issue? Also, I hope you have checked the updated documentation for Device Client.
garysferrao commented on Jan 22, 2025
just if it helps anyone, i faced the same problem: the MQTT client just segmentation-faults, without any trace/debug log.
i was using the Docker image (ubuntu, armv7) from AWS ECR built 15 days ago (seems this image ref).
after i deleted the samples section of the config.json, it ran without any faults. need to investigate that part further.
(note: technically, this is the OP's config and error log; i just want to show what section. mine was the same.)
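For anyone who wants to try the same workaround, a minimal Python sketch of stripping a section from the config might look like this. The helper name `without_sections` is hypothetical, and the inline dictionary is a cut-down stand-in for the real config file.

```python
import copy
import json


def without_sections(config: dict, sections=("samples",)) -> dict:
    """Return a deep copy of config with the named top-level sections removed."""
    trimmed = copy.deepcopy(config)
    for name in sections:
        # pop with a default so a missing section is not an error
        trimmed.pop(name, None)
    return trimmed


# Cut-down stand-in for the config shown in the issue's log.
config = {
    "thing-name": "deviceClientThing",
    "samples": {"pub-sub": {"enabled": True}},
    "sample-shadow": {"enabled": False},
}
print(json.dumps(without_sections(config), indent=2))
```

The original dictionary is left untouched, so the trimmed copy can be written to a separate file and the full config kept for comparison.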
Ogy-GrayLine commented on May 7, 2025
I was about to pull my hair out doing exactly the same thing: following the same tutorial step by step and fighting with this very same setup (an RPi 5 with the latest Raspbian and kernel updates), feeling stupid that the straightforward case just doesn't work and checking whether I had done something wrong :-(
I was just about to file a ticket when I discovered there already is one. Apparently no real progress or resolution has happened, but at least I can provide a bit more insight from my own troubleshooting.
Raspberry Pi 5 with default OS, updated to latest just 2 days ago:
1.1. uname -a
Linux raspberrypi 6.12.25+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.25-1+rpt1 (2025-04-30) aarch64 GNU/Linux
1.2. Installed libssl-dev, libc6, build-essential, zlib1g-dev, checkinstall
1.3. Compiled just fine, both initially and later with the "DEBUG" flag added to CMake so I could attach GDB while troubleshooting. I have snippets of the compile-time output for reference if needed.
Output of the GDB session:
Any thoughts on why it actually fails on the "memset.S" which seems missing?
Thanks a lot in advance!
Ogy-GrayLine commented on May 7, 2025
Getting rid of the "samples" and "sample-shadow" sections, as mentioned by @garysferrao here: #462 (comment)
does change the behavior on my compiled, non-Docker RPi 5, but unfortunately still doesn't give me a successfully running client. Here's what I can see in the logs from that test:
Apparently there is not much of a clue as to what happens, and I've run out of troubleshooting ideas; it looks like the client is unusable on RPis, at least at the moment :-(
garysferrao commented on May 7, 2025
@Ogy-GrayLine thanks for the debug logs. unfortunately i'm not a maintainer; i hope a maintainer will take up this issue soon.
this happens because gdb is trying to find even more details about the line number and is looking up a source file. i assume that because memset is common, there is no source file available. (the fault already happened, and then gdb is trying to find the exact cause.)
https://stackoverflow.com/a/10629233
this answer says to use -g when compiling to explicitly add that source information.
i myself don't have the resources to compile and debug this. but on some light testing at the time, it seems the fault happened because those topics in samples.pub-sub.publish-topic et cetera do not exist. when i used an existing topic there was no error iirc. 🤔
Ogy-GrayLine commented on May 7, 2025
Hi @garysferrao ,
Thanks a lot for the assistance.
Yes, but in the tutorial mentioned initially, i.e. this one:
Get Started with AWS IoT
the build happens with CMake instead of GCC/G++ directly. Thus I followed this post here and configured the build by executing
cmake -D CMAKE_BUILD_TYPE=Debug ../
instead of simply "cmake ../". To my understanding, that should be what is needed to add the debug info. Maybe it's some external library which is missing the source to show additional info? 🤔
@garysferrao Could you please clarify what you mean by:
I have no topics preloaded in AWS IoT Core, and it seems they are not explicitly created. So I'm a bit confused about what to try in that direction.
@ig15, Hi! I've provided the debug logs you requested above, plus gdb snippets. Is there anything else I can do, since I have a test bed set up, to get this client actually working on a Raspberry Pi?
Thanks a lot in advance, Team!
P.S. I've also read up and tried to debug this further, not only with GDB but also with valgrind, a tool which I wasn't aware of and am not familiar with. Still, to show what else could be seen there:
It prints the following interesting snippet (I can paste the full log if needed, but it's large):
Reading the stack trace from the bottom up, it seems to me to point to line 249 in PubSubFeature.cpp in the code.
I can see this was changed 2 years ago by this PR, with the motivation of doing a cleanup exactly upon startup.
But as I am not a C/C++ guy at all, this is the most I can make of it and marks my limit in that direction.
Not really sure if it's in the right direction at all 🤷
ig15 commented on May 9, 2025
@Ogy-GrayLine Thanks for sharing the details. We have ordered a Raspberry Pi 5 device and we'll start to work on the issue once we get our hands on it to reproduce the issue.
garysferrao commented on May 13, 2025
@Ogy-GrayLine
sorry, it was just me trying to remember what i was doing at the time. it could have been replacing the topic /topic/workshop/dc/* or the file /home/pi/workshop_dc/*.txt with something that exists. my reasoning was that if it's in the document tutorial, surely something must work; and iirc replacing them with existing paths did not crash the iot-device-client. but in the end i just removed that section from the JSON because i didn't need them.
basically, i didn't mean for you to try anything.
hmm, if CMAKE_BUILD_TYPE=Debug is not adding the -g flag, maybe you can manually add it.
the file seems to be CMakeLists.txt; for example on L21:
aws-iot-device-client/CMakeLists.txt
Line 21 in af44d15
assuming your compiler is GCC, you can manually add the -g or -ggdb flag, say on line 22 then.
i mean to try this only if you're wondering about that error in gdb:
personally, i just extracted the compiled binary from the Docker image. or better, wait for a response 🥺.
ig15 commented on May 28, 2025
@Ogy-GrayLine I was able to reproduce the Device Client error that you encountered on an RPi 5 with the following specifications:
pi@raspberrypi:~/.aws-iot-device-client $ uname -a
Linux raspberrypi 6.12.25+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.25-1+rpt1 (2025-04-30) aarch64 GNU/Linux
Error logs:
This error occurs when the policy for the samples pub-sub feature is set up incorrectly. You must have publish, subscribe and receive permissions for the topic in your AWS IoT policy.
For example, if the samples section in your aws-iot-device-client.conf looks like:
You should have a corresponding policy attached to your IoT thing:
Refer here for more information.
That is why if you remove the samples section itself, the issue is not seen.
Do let us know if your problem is resolved.
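To make the permission requirement concrete, here is a hedged Python sketch that builds a policy document for the pub-sub topics from the config in the issue description. It is not the policy posted in the original comment. The function name `pubsub_policy`, the region and the account ID are placeholders; the action names (iot:Connect, iot:Publish, iot:Subscribe, iot:Receive) and the client/, topic/ and topicfilter/ resource prefixes follow the AWS IoT Core policy format. Other enabled features (jobs, tunneling, Device Defender) need their own permissions, which this sketch does not cover.

```python
import json


def pubsub_policy(region, account_id, client_id, publish_topic, subscribe_topic):
    """Build an AWS IoT policy document covering only the pub-sub sample."""
    prefix = f"arn:aws:iot:{region}:{account_id}"

    def allow(action, resource):
        return {"Effect": "Allow", "Action": action, "Resource": f"{prefix}:{resource}"}

    return {
        "Version": "2012-10-17",
        "Statement": [
            allow("iot:Connect", f"client/{client_id}"),
            allow("iot:Publish", f"topic/{publish_topic}"),
            # Subscribe is authorized against a topic filter, Receive against a topic.
            allow("iot:Subscribe", f"topicfilter/{subscribe_topic}"),
            allow("iot:Receive", f"topic/{subscribe_topic}"),
        ],
    }


# Region and account ID are placeholders; topics and client id come from the posted config.
policy = pubsub_policy(
    "us-east-1",
    "123456789012",
    "deviceClientThing",
    "/topic/workshop/dc/pub",
    "/topic/workshop/dc/sub",
)
print(json.dumps(policy, indent=2))
```

Because the tutorial's topic names start with a slash, the resulting resource ARNs contain a double slash (for example topic//topic/workshop/dc/pub); the sketch concatenates the topic as written rather than normalizing it.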
ig15 commented on Jun 8, 2025
We are closing this issue. Feel free to reopen or open a new issue in case of further problems. Thank you.