Investigate current debug file chunk uploading codepath #2197

Closed
szokeasaurusrex opened this issue Oct 23, 2024 · 2 comments

szokeasaurusrex commented Oct 23, 2024

Investigate how debug files are currently uploaded via chunk uploading. In particular, we should focus on determining which logic is specific to debug files, and which logic is general to chunk uploading.

Document the results in this issue. This will inform #2194.

Investigation results

This diagram summarizes the general process for chunk uploading.

The process for debug files specifically starts with the sentry-cli debug-files upload command, defined here.

The happy code path then calls this function to perform the upload. If chunk uploading is supported by the server, we proceed to the upload_difs_chunked function to perform the upload.

upload_difs_chunked

The first several lines of upload_difs_chunked perform logic specific to uploading debug files; this logic is unrelated to chunk uploading.

// Search for debug files in the file system and ZIPs
let found = search_difs(options)?;
if found.is_empty() {
    println!("{} No debug information files found", style(">").dim());
    return Ok(Default::default());
}
// Try to resolve BCSymbolMaps
let symbol_map = options.symbol_map.as_deref();
let mut processed = process_symbol_maps(found, symbol_map)?;
if options.upload_il2cpp_mappings {
    let il2cpp_mappings = create_il2cpp_mappings(&processed)?;
    processed.extend(il2cpp_mappings);
}
// Resolve source code context if specified
if options.include_sources {
    let source_bundles = create_source_bundles(&processed, options.upload_il2cpp_mappings)?;
    processed.extend(source_bundles);
}

The first chunk uploading logic comes here, where we split each of the DifMatch objects into chunks. We can likely generalize this logic.

// Calculate checksums and chunks
let chunked = prepare_difs(processed, |m| {
    ChunkedDifMatch::from(m, chunk_options.chunk_size)
})?;
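To illustrate what a file-type-agnostic version of this chunking step could look like, here is a minimal sketch that splits a byte buffer into fixed-size chunks and attaches a digest to each one. The names `SketchChunk`, `stand_in_digest`, and `split_into_chunks` are hypothetical and do not exist in sentry-cli, and the digest is a standard-library hash used only as a stand-in for a real content checksum.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A chunk of a larger file: a digest plus the borrowed bytes.
/// Hypothetical type for illustration; sentry-cli's own chunk type differs.
#[derive(Debug, PartialEq)]
pub struct SketchChunk<'a> {
    pub digest: u64,
    pub data: &'a [u8],
}

/// Stand-in checksum (assumption): a std hash, NOT a real content digest.
fn stand_in_digest(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Splits `data` into chunks of at most `chunk_size` bytes.
/// The last chunk may be shorter; empty input yields no chunks.
pub fn split_into_chunks(data: &[u8], chunk_size: usize) -> Vec<SketchChunk<'_>> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    data.chunks(chunk_size)
        .map(|bytes| SketchChunk {
            digest: stand_in_digest(bytes),
            data: bytes,
        })
        .collect()
}
```

Nothing in this sketch depends on the input being a debug file, which is why this step looks like a good candidate for generalization.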

The rest of the function mixes debug-file-specific logic with chunk-upload logic:

// Upload missing chunks to the server and remember incomplete difs
let missing_info = try_assemble_difs(&chunked, options)?;
upload_missing_chunks(&missing_info, chunk_options)?;
// Only if DIFs were missing, poll until assembling is complete
let (missing_difs, _) = missing_info;
if !missing_difs.is_empty() {
    poll_dif_assemble(&missing_difs, options)
} else {
    println!(
        "{} Nothing to upload, all files are on the server",
        style(">").dim()
    );
    Ok((Default::default(), false))
}

In the above code snippet, we check which difs and chunks are missing from the server by calling the dif assemble endpoint, then upload any missing chunks. If any difs were missing, we then poll until assembling is complete. It is worth looking separately at the try_assemble_difs and upload_missing_chunks functions.
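The round trip described above (ask the server what is missing, upload only that, then confirm) can be sketched against an in-memory stand-in for the server. `FakeServer` and `upload_chunked` are hypothetical names invented for this illustration; chunks are represented by bare `u64` digests, and the polling step is reduced to a single follow-up check.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory stand-in for the server's chunk store.
#[derive(Default)]
pub struct FakeServer {
    stored: HashSet<u64>,
}

impl FakeServer {
    /// Stand-in for the assemble call: returns the requested chunk
    /// digests that the server does not have yet.
    pub fn assemble(&self, chunks: &[u64]) -> Vec<u64> {
        chunks
            .iter()
            .copied()
            .filter(|c| !self.stored.contains(c))
            .collect()
    }

    /// Stand-in for the chunk upload call.
    pub fn upload(&mut self, chunks: &[u64]) {
        self.stored.extend(chunks.iter().copied());
    }
}

/// Uploads only the chunks the server reports missing and returns how
/// many were uploaded. Assumes `chunks` contains no duplicate digests.
pub fn upload_chunked(server: &mut FakeServer, chunks: &[u64]) -> usize {
    let missing = server.assemble(chunks);
    if missing.is_empty() {
        // Nothing to upload, all chunks are on the server.
        return 0;
    }
    server.upload(&missing);
    // A real client would poll until assembling completes; the sketch
    // only checks that nothing is reported missing anymore.
    debug_assert!(server.assemble(chunks).is_empty());
    missing.len()
}
```

Calling `upload_chunked` twice with overlapping chunk lists uploads the shared chunks only once, which is the main benefit of asking the server first.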

try_assemble_difs

The purpose of try_assemble_difs is to call the dif assemble endpoint to determine which, if any, debug files and chunks need to be uploaded because they are missing from the server.

We first perform the API call to the assemble endpoint:

let request = difs
    .iter()
    .map(|d| d.to_assemble(options.pdbs_allowed))
    .collect();
let response = api
    .authenticated()?
    .assemble_difs(&options.org, &options.project, &request)?;

Next, we iterate through all of the files in the response to find the missing difs and the missing chunks. If a file errored or is still assembling, we add it to the list of missing difs, but we don't add any of its chunks to the list of missing chunks (since the chunks are already on the server). If the dif is not found, we add each of its chunks that the server reports as missing to the missing chunks list, and we add the dif to the missing difs list (unless there are no missing chunks, e.g. in the case of an empty file or an invalid server response). For difs in any other state, we assume the file was uploaded successfully and take no action.

let mut difs = Vec::new();
let mut chunks = Vec::new();
for (checksum, ref file_response) in response {
    let chunked_match = *difs_by_checksum
        .get(&checksum)
        .ok_or_else(|| format_err!("Server returned unexpected checksum"))?;
    match file_response.state {
        ChunkedFileState::Error => {
            // One of the files could not be uploaded properly and resulted
            // in an error. We include this file in the return value so that
            // it shows up in the final report.
            difs.push(chunked_match);
        }
        ChunkedFileState::Assembling => {
            // This file is currently assembling. The caller will have to poll this file later
            // until it either resolves or errors.
            difs.push(chunked_match);
        }
        ChunkedFileState::NotFound => {
            // Assembling for one of the files has not started because some
            // (or all) of its chunks have not been found. We report its
            // missing chunks to the caller and then continue. The caller
            // will have to call `try_assemble_difs` again after uploading
            // them.
            let mut missing_chunks = chunked_match
                .chunks()
                .filter(|&Chunk((c, _))| file_response.missing_chunks.contains(&c))
                .peekable();
            // Usually every file that is NotFound should also contain a set
            // of missing chunks. However, if we tried to upload an empty
            // file or the server returns an invalid response, we need to
            // make sure that this match is not included in the missing
            // difs.
            if missing_chunks.peek().is_some() {
                difs.push(chunked_match);
            }
            chunks.extend(missing_chunks);
        }
        _ => {
            // This file has already finished. No action required anymore.
        }
    }
}

Lastly, we return the two constructed lists:

Ok((difs, chunks))

try_assemble_difs should be generalizable to file types other than debug files.
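As a rough sketch of what a generalized version might look like, the classification step can be written against a file-type-agnostic object that only knows its checksum and its chunk digests. Everything here is hypothetical: `ChunkedObject`, `FileState`, `FileResponse`, and `find_missing` are invented for illustration, digests are plain `u64` values, and errors are plain strings.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical per-file state, loosely mirroring the states handled above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileState {
    NotFound,
    Assembling,
    Error,
    Done,
}

/// Hypothetical per-file entry of an assemble response.
pub struct FileResponse {
    pub state: FileState,
    pub missing_chunks: HashSet<u64>,
}

/// Hypothetical file-type-agnostic object: any file split into chunks.
pub struct ChunkedObject {
    pub name: String,
    pub checksum: u64,
    pub chunks: Vec<u64>,
}

/// Returns the objects that still need attention and the chunk digests
/// that still need to be uploaded.
pub fn find_missing<'a>(
    objects_by_checksum: &HashMap<u64, &'a ChunkedObject>,
    response: &HashMap<u64, FileResponse>,
) -> Result<(Vec<&'a ChunkedObject>, Vec<u64>), String> {
    let mut missing_objects = Vec::new();
    let mut missing_chunks = Vec::new();
    for (checksum, file_response) in response {
        let object = *objects_by_checksum
            .get(checksum)
            .ok_or_else(|| "server returned unexpected checksum".to_string())?;
        match file_response.state {
            // Chunks are on the server; the caller only has to poll or report.
            FileState::Error | FileState::Assembling => missing_objects.push(object),
            FileState::NotFound => {
                let before = missing_chunks.len();
                missing_chunks.extend(
                    object
                        .chunks
                        .iter()
                        .copied()
                        .filter(|d| file_response.missing_chunks.contains(d)),
                );
                // Skip objects without missing chunks (e.g. empty files).
                if missing_chunks.len() > before {
                    missing_objects.push(object);
                }
            }
            // Already finished; nothing to do.
            FileState::Done => {}
        }
    }
    Ok((missing_objects, missing_chunks))
}
```

The only debug-file-specific parts of the original function are the request construction and the API call, so a generalized version could take those as parameters or a trait.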

upload_missing_chunks

upload_missing_chunks is quite simple. We first check whether there are any missing chunks that we still need to upload:

if chunks.is_empty() {
    return Ok(());
}

If so, we call upload_chunks to perform the actual upload. Since upload_chunks is already generalized to chunks of any type (not specific to debug files), we don't need to investigate its internal workings.

upload_chunks(chunks, chunk_options, progress_style)?;

The function also contains some logic to output information about the upload and to display a progress bar.

We can likely generalize this code to any file type without much difficulty.

@szokeasaurusrex

The investigation results are above

@szokeasaurusrex

Also potentially a helpful visualization of the server-CLI interactions:

[Image: visualization of the server-CLI interactions]
