Improved metadata binding parsing and validation. #11101
base: dev
@@ -175,7 +176,7 @@ private static async Task<JObject> GetFunctionConfig(FunctionMetadata metadata,

 private static async Task<JObject> GetFunctionConfigFromFile(string path)
 {
-    return JObject.Parse(await FileUtility.ReadAsync(path));
+    return JObject.Parse(Sanitizer.Sanitize(await FileUtility.ReadAsync(path)));
The sync trigger endpoint and the /functions/{functionname} endpoint read this file when it is present. (This applies to the custom handler / non-worker-indexing case.)
@@ -257,7 +258,7 @@ internal FunctionMetadata ValidateBindings(IEnumerable<string> rawBindings, Func

 foreach (string binding in rawBindings)
 {
-    var deserializedObj = JsonConvert.DeserializeObject<JObject>(binding, _dateTimeSerializerSettings);
+    var deserializedObj = JsonConvert.DeserializeObject<JObject>(Sanitizer.Sanitize(binding), _dateTimeSerializerSettings);
This method runs in the worker indexing case.
@@ -41,6 +41,9 @@ public class SanitizerTests
 [InlineData("test,aaa://aaa:[email protected]:1111,test", "test,[Hidden Credential],test")]
 [InlineData(@"some text abc://abc:[email protected]:1111 some text abc://abc:[email protected]:1111 text", @"some text [Hidden Credential] some text [Hidden Credential] text")]
 [InlineData(@"some text abc://abc:[email protected]:1111 some text AccountKey=heyyyyyyy text", @"some text [Hidden Credential] some text [Hidden Credential]")]
+[InlineData("""{"queueName":"my-q-items","connection":"MyConnection","type":"queueTrigger","name":"qTrigger1","direction":"in"}""", "{\"queueName\":\"my-q-items\",\"connection\":\"MyConnection\",\"type\":\"queueTrigger\",\"name\":\"qTrigger1\",\"direction\":\"in\"}")]
+[InlineData("""{"queueName":"my-q-items","connection":"DefaultEndpointsProtocol=https;AccountName=a;AccountKey=b/c==;EndpointSuffix=core.windows.net","type":"queueTrigger","name":"queueTrigger1","direction":"in"}""", "{\"queueName\":\"my-q-items\",\"connection\":\"[Hidden Credential]\",\"type\":\"queueTrigger\",\"name\":\"queueTrigger1\",\"direction\":\"in\"}")]
Could this be a breaking change in some scenarios? Say the queue name is "key=myqueue": we would then replace it with "[Hidden Credential]". Will the function stop executing because we changed the queue name, or is only the sync trigger response sanitized (so that no other component depending on the sync trigger response breaks)?
The functions are already in a broken state (not executing anything). The extension listener code throws when it is not able to resolve a valid app setting value from the value of these attributes (example code for service bus extension here). So this is not breaking anything.
We are sanitizing the entire binding object in all scenarios, not just the exception cases. If any extension introduces (or already has) a binding property with a name such as "password=" or "key=" (these may not be good examples), it will be changed to [Hidden Credential] and the app can break.
In terms of binding values, I checked that the queue name and the connection setting name cannot contain "=" (though other binding values may), but they can contain other characters such as ":". So we may need to ensure that the Sanitizer class never has tokens such as "key:"; otherwise we would change that value to [Hidden Credential]. A unit test should cover this scenario.
I am not sure I follow the last part of the comment. But for inputs like "one:two;three", Sanitizer returns the input unchanged.
Input | Output |
---|---|
one:two;three | one:two;three |
one:a;two=three | one:a;two=three |
foo=bar | foo=bar |
If you meant something else, can you share an example input?
Today, key=myvalue returns [Hidden Credential]. If someone later adds a "key:" token to Sanitizer, then key:myvalue will also return [Hidden Credential], even though key:myvalue is a valid connection setting name. Let's also discuss this offline to make sure we are not confusing each other here.
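The concern above can be made concrete with a small stand-in. The sketch below is not the host's Sanitizer: `TokenMaskingDemo` and `MaskIfContainsToken` are hypothetical names, and the rule (mask the whole value if it contains any token) is a deliberate simplification. It only illustrates how adding a "key:" token would change the result for a value that passes through untouched today.

```csharp
using System;

public static class TokenMaskingDemo
{
    public const string Mask = "[Hidden Credential]";

    // Hypothetical rule, assumed for illustration only: if the value contains
    // any of the tokens (case-insensitive), the whole value is replaced.
    public static string MaskIfContainsToken(string value, params string[] tokens)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }

        foreach (string token in tokens)
        {
            if (value.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return Mask;
            }
        }

        return value;
    }
}
```

With only the "key=" token, `MaskIfContainsToken("key:myvalue", "key=")` returns the input unchanged; once "key:" is added to the token list, the same input is masked. A unit test that pins down the first behavior would guard against that kind of regression.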
Right, because ":" is the connection separator, so it would not be unexpected for connection names to include it. Now, if customers have a section literally called "key" or "secret", or something else covered by our sanitizer, that is another question.
Maybe we should reconsider the approach taken here and sanitize only the binding values when we serve the response from the /functions/ endpoints. Let's follow up offline.
Had an offline sync with @surgupta-msft and agreed to mask only the "connection" binding value.
if (jObject.TryGetValue("bindings", StringComparison.OrdinalIgnoreCase, out JToken bindingsToken) && bindingsToken is JArray bindings)
{
    var bindingObjects = bindings.OfType<JObject>().ToList();
Is it possible to avoid converting the JArray to a List here? In HostFunctionMetadataProvider, we are able to do a similar thing with the JArray directly.
{
    if (propertyNames.Contains(prop.Name, StringComparer.OrdinalIgnoreCase))
    {
        jsonObject[prop.Name] = Sanitizer.Sanitize(prop.Value.ToString());
Connection value could be null. So, I believe we need a null check for prop.Value.
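A minimal sketch of what the two suggestions above could look like together, assuming Newtonsoft.Json. `BindingMasking`, `MaskBindingProperties`, and the `sanitize` delegate are hypothetical names used here in place of the PR's actual helper and the host's Sanitizer. It walks the JArray without copying it to a List and only touches properties whose value is a JSON string, so null and non-string values are skipped.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class BindingMasking
{
    // Hypothetical helper: applies `sanitize` to the named properties of each
    // binding object, leaving every other property untouched.
    public static JObject MaskBindingProperties(
        JObject functionConfig,
        string[] propertyNames,
        Func<string, string> sanitize)
    {
        if (functionConfig.TryGetValue("bindings", StringComparison.OrdinalIgnoreCase, out JToken token)
            && token is JArray bindings)
        {
            // Enumerate the JArray directly; the array itself is not modified.
            foreach (JObject binding in bindings.OfType<JObject>())
            {
                foreach (JProperty prop in binding.Properties())
                {
                    bool isTarget = propertyNames.Contains(prop.Name, StringComparer.OrdinalIgnoreCase);

                    // A JSON null has type JTokenType.Null, so this check also
                    // acts as the null guard.
                    if (isTarget && prop.Value.Type == JTokenType.String)
                    {
                        prop.Value = sanitize((string)prop.Value);
                    }
                }
            }
        }

        return functionConfig;
    }
}
```

Passing the sanitizer as a delegate keeps the sketch independent of the real Sanitizer class; in the host, the delegate would presumably be `Sanitizer.Sanitize`.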
@@ -257,8 +258,8 @@ internal FunctionMetadata ValidateBindings(IEnumerable<string> rawBindings, Func

 foreach (string binding in rawBindings)
 {
     var deserializedObj = JsonConvert.DeserializeObject<JObject>(binding, _dateTimeSerializerSettings);
We still need to use _dateTimeSerializerSettings. Without it, date handling will break.
For example, the current serializer settings say not to parse date-like strings; without them, the serializer will start parsing them. I added an example in the test class below. I worked on a similar issue before: Azure/azure-functions-dotnet-worker#2442
""type"": ""queue"",
""direction"": ""out"",
""queueName"": ""test-output-node"",
""connection"": ""MyConnection""
Can we also cover a null and an empty connection?
var functionMetadata = new FunctionMetadata();
List<string> rawBindings =
[
    """{"type": "queueTrigger","name": "myQueueItem","direction": "in","queueName": "test-input-node","connection": "DefaultEndpointsProtocol=https;AccountName=a;AccountKey=b/c==;EndpointSuffix=core.windows.net"}""",
Add a test with the queue name (or any other field) set to "2024-04-29T11:35:00+08:00". You will see that the datetime is parsed into a different value. This does not happen if we use new JsonSerializerSettings { DateParseHandling = DateParseHandling.None }.
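The effect described above can be checked with a short sketch, assuming Newtonsoft.Json. `DateParsingDemo` and `QueueNameTokenType` are hypothetical names; the only library behavior relied on is that Json.NET's default `DateParseHandling` converts ISO 8601 strings into date tokens when reading into a `JObject`, while `DateParseHandling.None` leaves them as strings.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class DateParsingDemo
{
    // Hypothetical helper: deserializes one raw binding and reports how the
    // "queueName" value was read. Passing null uses Json.NET's defaults.
    public static JTokenType QueueNameTokenType(string bindingJson, JsonSerializerSettings settings)
    {
        JObject binding = JsonConvert.DeserializeObject<JObject>(bindingJson, settings);
        return binding["queueName"].Type;
    }

    public static void Main()
    {
        string json = "{\"type\":\"queue\",\"queueName\":\"2024-04-29T11:35:00+08:00\"}";
        var noDateParsing = new JsonSerializerSettings { DateParseHandling = DateParseHandling.None };

        // Default settings: the date-like string becomes a Date token.
        Console.WriteLine(QueueNameTokenType(json, null));

        // DateParseHandling.None: the value stays a String token.
        Console.WriteLine(QueueNameTokenType(json, noDateParsing));
    }
}
```

Once the value is a Date token, serializing it back does not necessarily reproduce the original text, which is why the test suggested above would catch a missing settings argument.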
Improves how metadata bindings are read and used. Handles both the worker indexing and host indexing cases.
Pull request checklist
IMPORTANT: Currently, changes must be backported to the in-proc branch to be included in Core Tools and non-Flex deployments.
in-proc branch is not required
release_notes.md