Adjusting the timeout config for hdfs storage #386
```diff
@@ -39,6 +39,16 @@ public synchronized void init() throws IOException {
         storageProperties.getTypes().get(HDFS_TYPE.getValue());
     org.apache.hadoop.conf.Configuration configuration = new org.apache.hadoop.conf.Configuration();
     configuration.set("fs.defaultFS", hdfsStorageProperties.getEndpoint());
+
+    // Connection timeout configuration - fail fast on unreachable nodes
+    configuration.set("ipc.client.connect.timeout", "10000"); // default: 20000ms, override to 10s
+    configuration.set(
```
Member:
All these values are scaled down to fail fast? What if it succeeds with retries or a higher-value config?

Author (Collaborator):
There are IO retries, block retries, and client retries. IO retries may retry the same host several times; block retries retry the same block across datanodes; client retries re-request the namenode and refresh block locations. So if a host is down, IO retries will keep hitting that host even though it is not responding. And if a block is missing, we would not want to check every datanode for that block, but instead re-request the namenode for fresh locations. I'm not particularly knowledgeable here, so I've asked an HDFS SME offline to review.
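For context on the thread above, here is a hedged sketch mapping each retry layer to the standard Hadoop client key that tunes it. Only the three keys in this diff are part of the PR; `dfs.client.max.block.acquire.failures` is included purely to illustrate the client-retry layer and is not touched here.

```java
// Sketch only: which standard Hadoop client key tunes each retry layer
// described in the comment above. Values mirror the PR's overrides.
org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();

// IPC connect layer: connection attempts against the SAME address.
conf.set("ipc.client.connect.timeout", "10000");   // per-attempt connect timeout (default 20000ms)
conf.set("ipc.client.connect.max.retries", "3");   // attempts per address (default 10)

// Datanode read layer: how long a socket read from one datanode may block.
conf.set("dfs.client.socket-timeout", "30000");    // default 60000ms

// Client layer (illustration only, not part of this PR): how many times the
// client goes back to the namenode for fresh block locations after exhausting
// the datanodes it already knows about (default 3).
conf.set("dfs.client.max.block.acquire.failures", "3");
```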
The diff continues:

```diff
+        "ipc.client.connect.max.retries", "3"); // default: 10, override to 3 per address
+
+    // Socket timeout configuration - fail fast per datanode attempt
+    configuration.set(
+        "dfs.client.socket-timeout", "30000"); // default: 60000ms, override to 30s per node
+
     fs = FileSystem.get(configuration);
 }
```
Comment:
Should these be config driven?

Comment:
+1 for getting them from config. Can be injected through li internal config.
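Following up on the suggestion above, a minimal sketch of what a config-driven version of this block could look like, assuming hypothetical getters (getConnectTimeoutMs() and friends) on hdfsStorageProperties that fall back to the PR's current values; these getters are not existing API in this repo.

```java
// Minimal sketch of config-driven timeouts, inside the existing init().
// The getters below are hypothetical properties, not existing API.
org.apache.hadoop.conf.Configuration configuration = new org.apache.hadoop.conf.Configuration();
configuration.set("fs.defaultFS", hdfsStorageProperties.getEndpoint());

configuration.set("ipc.client.connect.timeout",
    String.valueOf(hdfsStorageProperties.getConnectTimeoutMs()));  // hypothetical; default 10000
configuration.set("ipc.client.connect.max.retries",
    String.valueOf(hdfsStorageProperties.getConnectMaxRetries())); // hypothetical; default 3
configuration.set("dfs.client.socket-timeout",
    String.valueOf(hdfsStorageProperties.getSocketTimeoutMs()));   // hypothetical; default 30000

fs = FileSystem.get(configuration);
```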