Problem processing sl2 files #215

IronicSven opened this issue Mar 26, 2017 · 5 comments

@IronicSven
Contributor

IronicSven commented Mar 26, 2017

I processed the raw data today and noticed that some of the logs I uploaded are missing from the trackpoints tables, although they are part of the raw data.

When several sl2 logs belong together as a set, it seems that sometimes only the first log makes it into the trackpoints tables.

Example from my uploads of 2016-08-31:
trackid  name           result
53539    Sonar0008.sl2  makes it into the trackpoints tables
53540    Sonar0009.sl2  is missing in the trackpoints tables
53556    Sonar0010.sl2  is missing in the trackpoints tables
53557    Sonar0011.sl2  is missing in the trackpoints tables
53558    Sonar0012.sl2  is missing in the trackpoints tables

Example from my uploads of 2016-09-25:
trackid  name             result
54185    LASonar0000.sl2  makes it into the trackpoints tables
54186    LASonar0001.sl2  is missing in the trackpoints tables
54187    LASonar0002.sl2  is missing in the trackpoints tables
54188    LASonar0003.sl2  is missing in the trackpoints tables
54189    LASonar0005.sl2  makes it into the trackpoints tables
54190    LASonar0006.sl2  is missing in the trackpoints tables
54191    LASonar0007.sl2  is missing in the trackpoints tables
54192    LASonar0004.sl2  is missing in the trackpoints tables

@cleanerx
Member

Without having looked further into the data: I noticed this problem with SL2 files, too. From what I could see, the first file breaks off and a new file starts. Unfortunately I could not detect this situation, because the headers of the consecutive files gave no hint of it. I think this scenario needs further investigation of the format. Maybe the first file stores some information about the files that follow.

@IronicSven
Contributor Author

Sorry, I think I left out an important piece of information:
The subsequent sl2 logs result from pressing stop and then start logging, so they could/should be processed independently.

The reason behind this: splitting large tracks is a best practice recommended by Insight Genesis and Reefmaster.

@cleanerx
Member

cleanerx commented Apr 2, 2017

When I looked at the format, I could not find any hint inside a file that further files exist. I think some more reverse engineering is required. Try opening these files in a hex editor and see if you can find something:
http://wiki.openstreetmap.org/wiki/SL2

@IronicSven
Contributor Author

It seems that there is no problem with the format itself, because all files can be processed separately. If I delete a file that was processed on the last run, another file is processed on the next run.

The log of the processing tool states that all files are processed and clustered into two tracks. But those clustered tracks only contain their own data. The missing track data is logged with "Partially correct data for for track".

Is there a way to disable the TimeBasedTrackClustering so I could test if this is the cause?

Extract of the logfile:

....
net.sf.seesea.data.postprocessing.filter.TimeBasedTrackClustering: Track Clusters are 2
...
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54185
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54186
net.sf.seesea.data.postprocessing.filter.FilterController: Partially correct data for for track id 54186
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54187
net.sf.seesea.data.postprocessing.filter.FilterController: Partially correct data for for track id 54187
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54188
net.sf.seesea.data.postprocessing.filter.FilterController: Partially correct data for for track id 54188
...
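To see at a glance which tracks were flagged, the log extract above can be summarized with a small script. This is only a sketch: the two message patterns are copied from the lines quoted here (including the doubled "for for"), the real log may contain other variants, and `summarize_log` is a hypothetical helper name, not part of the processing tool.

```python
import re

# Patterns copied from the log lines quoted in this issue (assumption:
# the real log uses exactly these two message formats).
PROCESSING = re.compile(r"Processing track id:(\d+)")
PARTIAL = re.compile(r"Partially correct data for for track id (\d+)")


def summarize_log(lines):
    """Return (processed_ids, partial_ids), each in order of first appearance."""
    processed, partial = [], []
    for line in lines:
        match = PARTIAL.search(line)
        if match:
            track_id = int(match.group(1))
            if track_id not in partial:
                partial.append(track_id)
            continue
        match = PROCESSING.search(line)
        if match:
            track_id = int(match.group(1))
            if track_id not in processed:
                processed.append(track_id)
    return processed, partial


# The FilterController lines from the extract above, verbatim.
SAMPLE = """\
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54185
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54186
net.sf.seesea.data.postprocessing.filter.FilterController: Partially correct data for for track id 54186
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54187
net.sf.seesea.data.postprocessing.filter.FilterController: Partially correct data for for track id 54187
net.sf.seesea.data.postprocessing.filter.FilterController: Processing track id:54188
net.sf.seesea.data.postprocessing.filter.FilterController: Partially correct data for for track id 54188
""".splitlines()

processed, partial = summarize_log(SAMPLE)
print("processed:", processed)
print("partially correct:", partial)
```

For the extract above, every track after 54185 ends up in the "partially correct" list, which matches the tracks missing from the trackpoints tables.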

This is my test case:

sven@sven-shuttle$ ls -lah ../test/54100/
total 1,2G
drwxrwxr-x 2 sven sven 4,0K Apr  3 08:27 .
drwxrwxr-x 3 sven sven 4,0K Apr  3 08:18 ..
-rw-rw-r-- 1 sven sven 273M Apr  3 08:18 54185.dat
-rw-rw-r-- 1 sven sven  30M Apr  3 08:18 54186.dat
-rw-rw-r-- 1 sven sven  68M Apr  3 08:27 54187.dat
-rw-rw-r-- 1 sven sven  58M Apr  3 08:26 54188.dat
-rw-rw-r-- 1 sven sven 165M Apr  3 08:26 54189.dat
-rw-rw-r-- 1 sven sven 160M Apr  3 08:27 54190.dat
-rw-rw-r-- 1 sven sven 387M Apr  3 08:27 54191.dat

sven@sven-shuttle$ ./eclipse
...

SELECT DISTINCT datasetid FROM trackpoints_raw_filter_16 ORDER BY datasetid;

54185
54189

sven@sven-shuttle$ rm ../test/54100/54185.dat
sven@sven-shuttle$ ls -lah ../test/54100/
total 865M
drwxrwxr-x 2 sven sven 4,0K Apr  3 09:11 .
drwxrwxr-x 3 sven sven 4,0K Apr  3 08:18 ..
-rw-rw-r-- 1 sven sven  30M Apr  3 08:18 54186.dat
-rw-rw-r-- 1 sven sven  68M Apr  3 08:27 54187.dat
-rw-rw-r-- 1 sven sven  58M Apr  3 08:26 54188.dat
-rw-rw-r-- 1 sven sven 165M Apr  3 08:26 54189.dat
-rw-rw-r-- 1 sven sven 160M Apr  3 08:27 54190.dat
-rw-rw-r-- 1 sven sven 387M Apr  3 08:27 54191.dat

sven@sven-shuttle$ ./eclipse
...

SELECT DISTINCT datasetid FROM trackpoints_raw_filter_16 ORDER BY datasetid;

54186
54189

sven@sven-shuttle$ rm ../test/54100/54186.dat
sven@sven-shuttle$ ls -lah ../test/54100/
total 836M
drwxrwxr-x 2 sven sven 4,0K Apr  3 09:23 .
drwxrwxr-x 3 sven sven 4,0K Apr  3 08:18 ..
-rw-rw-r-- 1 sven sven  68M Apr  3 08:27 54187.dat
-rw-rw-r-- 1 sven sven  58M Apr  3 08:26 54188.dat
-rw-rw-r-- 1 sven sven 165M Apr  3 08:26 54189.dat
-rw-rw-r-- 1 sven sven 160M Apr  3 08:27 54190.dat
-rw-rw-r-- 1 sven sven 387M Apr  3 08:27 54191.dat
sven@sven-shuttle$ ./eclipse
...

SELECT DISTINCT datasetid FROM trackpoints_raw_filter_16 ORDER BY datasetid;

54187
54189
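The comparison done by hand in the test case (files in the directory versus ids returned by the query) can be sketched as a set difference. `missing_track_ids` is a hypothetical helper; the file names and ids below are the ones from the first run shown above, and it assumes each file is named `<trackid>.dat`.

```python
from pathlib import Path


def missing_track_ids(file_names, processed_ids):
    """Track ids that have a .dat file but no rows in the filter table.

    Assumes every file is named <trackid>.dat, as in the test directory.
    """
    on_disk = {int(Path(name).stem) for name in file_names if name.endswith(".dat")}
    return sorted(on_disk - set(processed_ids))


# First run: directory listing and query result as shown in the test case.
first_run_files = [
    "54185.dat", "54186.dat", "54187.dat", "54188.dat",
    "54189.dat", "54190.dat", "54191.dat",
]
first_run_result = [54185, 54189]

print(missing_track_ids(first_run_files, first_run_result))
# prints [54186, 54187, 54188, 54190, 54191]
```

In a real check, the file list could come from `os.listdir` on the test directory and the id list from the `SELECT DISTINCT datasetid` query.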

@cleanerx
Member

cleanerx commented Apr 3, 2017

No, this is the only clustering algorithm. It tries to put the log files in sequence according to the recorded time. "Partially correct data" appears if the format suddenly breaks off. Maybe the SL2 Track File Processor stores some state, so it cannot recover from this scenario when multiple tracks are being processed.
Good indication.
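
To make the idea of ordering logs by recorded time concrete, here is a minimal sketch of gap-based clustering. It is not the actual TimeBasedTrackClustering code: the `Track` fields, the `max_gap_seconds` threshold, and the rule "start a new cluster when the gap exceeds the threshold" are all assumptions for illustration, and the timestamps in the example are invented.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    start: float  # recorded start time in seconds (hypothetical field)
    end: float    # recorded end time in seconds (hypothetical field)


def cluster_by_time(tracks, max_gap_seconds=3600.0):
    """Group tracks whose recorded times follow each other closely.

    Tracks are sorted by start time; a track joins the current cluster if it
    starts no later than max_gap_seconds after the cluster's latest end time.
    """
    clusters = []
    cluster_end = None
    for track in sorted(tracks, key=lambda t: t.start):
        if cluster_end is not None and track.start - cluster_end <= max_gap_seconds:
            clusters[-1].append(track)
            cluster_end = max(cluster_end, track.end)
        else:
            clusters.append([track])
            cluster_end = track.end
    return clusters


# Invented example data: two logs recorded back to back, one much later.
example = [
    Track(track_id=3, start=10000.0, end=10100.0),
    Track(track_id=1, start=0.0, end=100.0),
    Track(track_id=2, start=160.0, end=300.0),
]
cluster_ids = [[t.track_id for t in c] for c in cluster_by_time(example)]
print(cluster_ids)  # prints [[1, 2], [3]]
```

Under this sketch, logs produced by pressing stop and start in quick succession would land in the same cluster, so any parser state carried from one file to the next inside a cluster would be one place to look for the cause.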
