HADOOP-18325: ABFS: Add correlated metric support for ABFS operations #6314

Merged (114 commits) on May 23, 2024

@anmolanmol1234 anmolanmol1234 commented Dec 1, 2023

We have introduced support for metric collection at the filesystem instance level.
Metrics are pushed to the store upon the closure of a filesystem instance, encompassing all operations that utilized that specific instance.

Collected Metrics:

  1. Number of successful requests without any retries.
  2. Count of requests that succeeded after a specified number of retries (x retries).
  3. Request count subjected to throttling.
  4. Number of requests that failed despite exhausting all retry attempts, etc.
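
As a rough illustration of the counters listed above, here is a minimal sketch in plain Java. The class and method names (`RequestOutcomeCounters`, `recordSuccess`, and so on) are hypothetical and are not the classes added by this PR; it only shows one way such per-instance counts could be kept thread-safely.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-filesystem-instance counters; not the classes added by this PR. */
class RequestOutcomeCounters {
  private final AtomicLong succeededWithoutRetry = new AtomicLong();
  private final AtomicLong throttled = new AtomicLong();
  private final AtomicLong failedAfterAllRetries = new AtomicLong();
  // Key: number of retries a request needed before it succeeded.
  private final Map<Integer, AtomicLong> succeededAfterRetries = new ConcurrentHashMap<>();

  /** Records a successful request that needed the given number of retries. */
  void recordSuccess(int retryCount) {
    if (retryCount == 0) {
      succeededWithoutRetry.incrementAndGet();
    } else {
      succeededAfterRetries
          .computeIfAbsent(retryCount, k -> new AtomicLong())
          .incrementAndGet();
    }
  }

  /** Records a request that was throttled by the service. */
  void recordThrottled() {
    throttled.incrementAndGet();
  }

  /** Records a request that failed after all retries were exhausted. */
  void recordFailure() {
    failedAfterAllRetries.incrementAndGet();
  }

  long getSucceededWithoutRetry() {
    return succeededWithoutRetry.get();
  }

  long getSucceededAfter(int retryCount) {
    AtomicLong counter = succeededAfterRetries.get(retryCount);
    return counter == null ? 0 : counter.get();
  }

  long getThrottled() {
    return throttled.get();
  }

  long getFailedAfterAllRetries() {
    return failedAfterAllRetries.get();
  }
}
```

A caller would invoke `recordSuccess(retryCount)` once per completed request and read the getters when the instance is closed.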

Implementation Details:

  1. Incorporated logic in the AbfsClient to facilitate metric pushing through an additional request.
  2. This occurs in scenarios where no requests are sent to the backend for a defined idle period.

By implementing these enhancements, we ensure comprehensive monitoring and analysis of filesystem interactions, enabling a deeper understanding of success rates, retry scenarios, throttling instances, and exhaustive failure scenarios. Additionally, the AbfsClient logic ensures that metrics are proactively pushed even during idle periods, maintaining a continuous and accurate representation of filesystem performance.
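
The idle-period push described above could be sketched as follows. This is only an assumption about the general shape, not the AbfsClient code: `IdleMetricPusher`, its injectable clock, and the `Runnable` standing in for the extra metrics request are all hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical idle detector; names and structure are illustrative, not AbfsClient's. */
class IdleMetricPusher {
  private final long idlePeriodMillis;
  private final LongSupplier clock;     // source of "now" in milliseconds
  private final Runnable pushMetrics;   // stands in for the additional metrics request
  private final AtomicLong lastRequestMillis;

  IdleMetricPusher(long idlePeriodMillis, LongSupplier clock, Runnable pushMetrics) {
    this.idlePeriodMillis = idlePeriodMillis;
    this.clock = clock;
    this.pushMetrics = pushMetrics;
    this.lastRequestMillis = new AtomicLong(clock.getAsLong());
  }

  /** Call whenever a normal request is sent to the backend. */
  void onRequestSent() {
    lastRequestMillis.set(clock.getAsLong());
  }

  /**
   * Call periodically, for example from a timer.
   * Pushes metrics only if no request has been sent for the idle period.
   *
   * @return true if a push happened
   */
  boolean checkAndPush() {
    long now = clock.getAsLong();
    long last = lastRequestMillis.get();
    if (now - last < idlePeriodMillis) {
      return false;
    }
    // CAS so that concurrent checkers push at most once per idle window.
    if (!lastRequestMillis.compareAndSet(last, now)) {
      return false;
    }
    pushMetrics.run();
    return true;
  }
}
```

In this sketch the push itself resets the idle window, so a completely idle client pushes at most once per idle period. Whether the real implementation behaves that way is not stated in this thread.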

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 8 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 29s trunk passed
+1 💚 compile 0m 26s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 0m 23s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 22s trunk passed
+1 💚 mvnsite 0m 27s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 23s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 0m 44s trunk passed
+1 💚 shadedclient 19m 37s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 19m 49s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 19s the patch passed
+1 💚 compile 0m 19s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 0m 19s the patch passed
+1 💚 compile 0m 16s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 16s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 13s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 2 new + 10 unchanged - 1 fixed = 12 total (was 11)
+1 💚 mvnsite 0m 19s the patch passed
+1 💚 javadoc 0m 17s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 17s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 0m 42s the patch passed
+1 💚 shadedclient 19m 16s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 54s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 24s The patch does not generate ASF License warnings.
82m 47s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/24/artifact/out/Dockerfile
GITHUB PR #6314
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 50b4d82014b7 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 18205c1
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/24/testReport/
Max. process+thread count 557 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/24/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 46s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 8 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 47s trunk passed
+1 💚 compile 0m 37s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 38s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 2s trunk passed
+1 💚 shadedclient 37m 42s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 38m 2s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 0m 29s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s hadoop-tools/hadoop-azure: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11)
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 25s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 3s the patch passed
+1 💚 shadedclient 37m 58s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 22s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
141m 12s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/23/artifact/out/Dockerfile
GITHUB PR #6314
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux fb68d19f8bf0 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / e18b2f5
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/23/testReport/
Max. process+thread count 531 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/23/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@anujmodi2021 anujmodi2021 left a comment

Just one comment, more of a doubt.
Rest LGTM

if (abfsStore.getClient().isMetricCollectionEnabled()) {
TracingContext tracingMetricContext = new TracingContext(
clientCorrelationId,
fileSystemId, FSOperationType.GET_ATTR, true,
Contributor

Here for metric tracing context we are using fileSystemId only, whereas for the metric tracing context created in AbfsClient, we are using hostname...

Any reason for this difference?

Contributor Author

This is mainly because the filesystem id is private to the filesystem class, hence cannot be used in the client class for a unique identifier. Hence we have used hostname for the identifier in the client class. If this looks like a disparity, we can make both the tracing contexts use the hostname.

Contributor

Looks good this way only.

Contributor

have a look at the CommonAuditContext; it includes things like jobid, kerberos principal and more...so letting you identify jobs and people/applications

@anmolanmol1234 anmolanmol1234 marked this pull request as ready for review March 18, 2024 13:53
@anmolanmol1234
Contributor Author

Hi @steveloughran @mukund-thakur, requesting you to kindly review this PR.

@anmolanmol1234
Contributor Author

Hi @steveloughran @mukund-thakur @mehakmeet, requesting you to kindly review this PR.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 8 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 55s trunk passed
+1 💚 compile 0m 23s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 0m 21s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 0m 22s trunk passed
+1 💚 mvnsite 0m 28s trunk passed
+1 💚 javadoc 0m 27s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 22s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 0m 45s trunk passed
+1 💚 shadedclient 20m 24s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 20m 36s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 13s /patch-mvninstall-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
-1 ❌ compile 0m 12s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.
-1 ❌ javac 0m 12s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.
-1 ❌ compile 0m 12s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.
-1 ❌ javac 0m 12s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 10s /buildtool-patch-checkstyle-hadoop-tools_hadoop-azure.txt The patch fails to run checkstyle in hadoop-azure
-1 ❌ mvnsite 0m 14s /patch-mvnsite-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
-1 ❌ javadoc 0m 14s /patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.
-1 ❌ javadoc 0m 14s /patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.
-1 ❌ spotbugs 0m 13s /patch-spotbugs-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 shadedclient 22m 4s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 16s /patch-unit-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 asflicense 0m 25s The patch does not generate ASF License warnings.
82m 57s
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/25/artifact/out/Dockerfile
GITHUB PR #6314
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 4c5a0b1a1f81 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 1479da8
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/25/testReport/
Max. process+thread count 552 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/25/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 8 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 33m 4s trunk passed
+1 💚 compile 0m 25s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 0m 20s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 0m 20s trunk passed
+1 💚 mvnsite 0m 29s trunk passed
+1 💚 javadoc 0m 24s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 23s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 0m 45s trunk passed
+1 💚 shadedclient 20m 50s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 21m 3s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 20s the patch passed
+1 💚 compile 0m 18s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 0m 18s the patch passed
+1 💚 compile 0m 14s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 0m 14s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 13s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 2 new + 10 unchanged - 1 fixed = 12 total (was 11)
+1 💚 mvnsite 0m 21s the patch passed
+1 💚 javadoc 0m 16s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 18s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 0m 39s the patch passed
+1 💚 shadedclient 20m 59s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 49s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 25s The patch does not generate ASF License warnings.
86m 18s
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/26/artifact/out/Dockerfile
GITHUB PR #6314
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 815c1cb01b86 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ac3cb16
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/26/testReport/
Max. process+thread count 560 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/26/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran steveloughran changed the title Hadoop 18325: ABFS: Add correlated metric support for ABFS operations HADOOP-18325: ABFS: Add correlated metric support for ABFS operations Apr 16, 2024
Contributor

@steveloughran steveloughran left a comment

well this is quite the patch, isn't it!

commented.

import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.HUNDRED;
import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.THOUSAND;

public class AbfsBackoffMetrics {
Contributor

can you add javadocs for each of these and the class.

= new ConcurrentHashMap<>();

public AbfsBackoffMetrics() {
initializeMap();
Contributor

I'd suggest you actually use an {{IOStatisticsStoreBuilder}} to build an IOStatisticsStore, from which you can then extract the counters for direct access -but still be able to snapshot, aggregate and share the stats though public APIs.

See S3AInstrumentation.InputStreamStatistics as an example of this
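
To make the idea concrete without depending on Hadoop, the following is a plain-Java stand-in for a store of named counters that can be updated directly, snapshotted, and aggregated. It is an illustration of the pattern only: `NamedCounterStore` is a hypothetical class, and Hadoop's `IOStatisticsStore` has its own, richer API that this sketch does not reproduce.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative stand-in only; Hadoop's IOStatisticsStore has its own, richer API. */
class NamedCounterStore {
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  /** Registers every counter name up front so all of them always exist. */
  NamedCounterStore(String... names) {
    for (String name : names) {
      counters.put(name, new AtomicLong());
    }
  }

  /** Returns the live counter for hot-path updates; fails fast on unknown names. */
  AtomicLong counter(String name) {
    AtomicLong c = counters.get(name);
    if (c == null) {
      throw new IllegalArgumentException("Unknown counter: " + name);
    }
    return c;
  }

  /** Point-in-time copy of all values, sorted by name for stable printing. */
  Map<String, Long> snapshot() {
    Map<String, Long> copy = new TreeMap<>();
    counters.forEach((k, v) -> copy.put(k, v.get()));
    return copy;
  }

  /** Adds another store's snapshot into this one, creating counters as needed. */
  void aggregate(Map<String, Long> other) {
    other.forEach((k, v) ->
        counters.computeIfAbsent(k, n -> new AtomicLong()).addAndGet(v));
  }
}
```

Because every counter is created in the constructor, printing the snapshot immediately after construction is safe, which is the property the review is asking for.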

this.numberOfNetworkFailedRequests = new AtomicLong();
}

public AbfsBackoffMetrics(String retryCount) {
Contributor

this only partially initialises the metrics


private AtomicLong numberOfRequestsSucceeded;

private AtomicLong minBackoff;
Contributor

are these timings? if so should include time unit in name or at least use millis and javadoc it
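
One way to follow this suggestion, assuming the values are indeed backoff durations, is to put the unit in the field name and javadoc it. The class below is a hypothetical sketch, not the PR's code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder for backoff timings; names are illustrative. */
class BackoffTimings {
  /** Smallest backoff observed, in milliseconds; Long.MAX_VALUE until the first sample. */
  private final AtomicLong minBackoffMillis = new AtomicLong(Long.MAX_VALUE);

  /** Largest backoff observed, in milliseconds; 0 until the first sample. */
  private final AtomicLong maxBackoffMillis = new AtomicLong(0);

  /** Records one backoff duration, given in milliseconds. */
  void recordBackoffMillis(long millis) {
    // accumulateAndGet applies the function atomically, so concurrent updates are safe.
    minBackoffMillis.accumulateAndGet(millis, Math::min);
    maxBackoffMillis.accumulateAndGet(millis, Math::max);
  }

  long getMinBackoffMillis() {
    return minBackoffMillis.get();
  }

  long getMaxBackoffMillis() {
    return maxBackoffMillis.get();
  }
}
```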

+ getNumberOfOtherThrottledRequests();
double percentageOfRequestsThrottled =
((double) totalRequestsThrottled / getTotalNumberOfRequests()) * HUNDRED;
for (Map.Entry<String, AbfsBackoffMetrics> entry : metricsMap.entrySet()) {
Contributor

this is going to blow up if you used the string constructor.

  • consider adding a unit test calling .toString() immediately after creating the object.
  • if you use IOStatistics, as I've proposed, you can just use ioStatisticsToString to print these things
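
The hazard described here can be reproduced in a few lines. The two classes below are hypothetical reductions, not the PR's `AbfsBackoffMetrics`: the first has a constructor that leaves a field null so `toString()` throws, the second initialises the field at declaration so no constructor can skip it.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical class reproducing the hazard; not the PR's AbfsBackoffMetrics. */
class PartiallyInitialisedMetrics {
  private AtomicLong requestsSucceeded;   // only set by the no-arg constructor
  private final String retryCount;

  PartiallyInitialisedMetrics() {
    this.retryCount = null;
    this.requestsSucceeded = new AtomicLong();
  }

  PartiallyInitialisedMetrics(String retryCount) {
    this.retryCount = retryCount;         // requestsSucceeded stays null
  }

  @Override
  public String toString() {
    // Throws NullPointerException if the String constructor was used.
    return "succeeded=" + requestsSucceeded.get();
  }
}

/** Same fields, but initialised at declaration so no constructor can skip them. */
class FullyInitialisedMetrics {
  private final AtomicLong requestsSucceeded = new AtomicLong();
  private final String retryCount;

  FullyInitialisedMetrics() {
    this(null);
  }

  FullyInitialisedMetrics(String retryCount) {
    this.retryCount = retryCount;
  }

  @Override
  public String toString() {
    return "succeeded=" + requestsSucceeded.get();
  }
}
```

A unit test along the lines suggested above would construct the object with each constructor and call `toString()` straight away.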


import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.ONE_KB;

public class AbfsReadFooterMetrics {
Contributor

again, I think a lot of the numbers could be collected in IOStatisticsStore so easily printed and marshalled



@Override
public String toString() {
Contributor

i think this should be an explicit method, not toString() which is generally assumed to be for logging and lower cost.

if (abfsReadFooterMetrics.getIsParquetFile()) {
isParquetList.add(abfsReadFooterMetrics);
} else {
if (abfsReadFooterMetrics.getReadCount() >= 2) {
Contributor

so it only goes into the non-parquet list if it's read the footer more than once?

@@ -60,6 +61,9 @@ public enum AzureServiceErrorCode {
private final String errorCode;
private final int httpStatusCode;
private final String errorMessage;

private static final Logger LOG1 = LoggerFactory.getLogger(AzureServiceErrorCode.class);
Contributor

call it LOG


private AtomicLong numberOfRequestsFailed;

private final Map<String, AbfsBackoffMetrics> metricsMap
Contributor

this is mixing up a metrics class "stats on use" with the map of filenames -> instances. it explains why there are two constructors and one only does partial init and whose toString() will NPE -but it doesn't justify this design.

Proposed: split out the metrics (which should use IOStatisticsStore for its structure, and implement IOStatisticsSource serving this) from the map managing the metrics.

that must be at most one per client instance and needs a good cleanup story so it scales well.
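
A sketch of the proposed split, under the assumption that the map is keyed by file name: the statistics class holds only counters, and a separate registry owned by one client manages the map and its cleanup. `FileStats` and `FileStatsRegistry` are hypothetical names, and a real version would build the statistics on `IOStatisticsStore` as suggested above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-file statistics: counters only, no knowledge of the map. */
class FileStats {
  final AtomicLong requestsFailed = new AtomicLong();
}

/** Hypothetical registry, intended to be owned by a single client instance. */
class FileStatsRegistry implements AutoCloseable {
  private final Map<String, FileStats> byPath = new ConcurrentHashMap<>();

  /** Returns the statistics for a path, creating them on first use. */
  FileStats forPath(String path) {
    return byPath.computeIfAbsent(path, p -> new FileStats());
  }

  /** Drops a file's entry once it is no longer in use so the map stays bounded. */
  FileStats release(String path) {
    return byPath.remove(path);
  }

  int size() {
    return byPath.size();
  }

  /** Clears everything when the owning client shuts down. */
  @Override
  public void close() {
    byPath.clear();
  }
}
```

With this shape there is a single constructor per class, so the partial-initialisation problem does not arise, and `release` plus `close` give the map a cleanup path.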

@steveloughran
Contributor

@anmolanmol1234 you got any timeline for changes here?

@anmolanmol1234
Contributor Author

@anmolanmol1234 you got any timeline for changes here?

Hi @steveloughran, since the changes requested would need a design refactor and scale tests from our side, I would say July end sounds feasible to me.

@steveloughran
Contributor

well, let's do that as a next step and we can merge this in? makes sense. but it will be your next piece of homework...

@anmolanmol1234
Contributor Author

well, let's do that as a next step and we can merge this in? makes sense. but it will be your next piece of homework...

Sure @steveloughran, sounds good.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 17m 44s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 8 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 49m 8s trunk passed
+1 💚 compile 0m 39s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 38s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 34s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 1m 4s trunk passed
+1 💚 shadedclient 39m 7s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 39m 28s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 31s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 0m 31s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 0m 27s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 19s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 2 new + 10 unchanged - 1 fixed = 12 total (was 11)
+1 💚 mvnsite 0m 30s the patch passed
+1 💚 javadoc 0m 27s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 1m 5s the patch passed
+1 💚 shadedclient 38m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 22s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 37s The patch does not generate ASF License warnings.
161m 46s
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/27/artifact/out/Dockerfile
GITHUB PR #6314
JIRA Issue HADOOP-18325
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 27e6025db822 5.15.0-107-generic #117-Ubuntu SMP Fri Apr 26 12:26:49 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f8df32f
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/27/testReport/
Max. process+thread count 539 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/27/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

ok, fix those 2 checkstyles

./hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsRestOperation.java:28:import org.apache.hadoop.fs.azurebfs.AbfsConfiguration;:8: Unused import - org.apache.hadoop.fs.azurebfs.AbfsConfiguration. [UnusedImports]
./hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsRestOperation.java:29:import org.apache.hadoop.fs.azurebfs.contracts.exceptions.InvalidUriException;:8: Unused import - org.apache.hadoop.fs.azurebfs.contracts.exceptions.InvalidUriException. [UnusedImports]

+1 pending that

Contributor

@steveloughran steveloughran left a comment

+1

will merge once the build finishes

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 47s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 8 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 49m 4s trunk passed
+1 💚 compile 0m 39s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 javadoc 0m 39s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 1m 4s trunk passed
+1 💚 shadedclient 38m 45s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 39m 6s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 31s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 0m 31s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 0m 27s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s hadoop-tools/hadoop-azure: The patch generated 0 new + 9 unchanged - 2 fixed = 9 total (was 11)
+1 💚 mvnsite 0m 30s the patch passed
+1 💚 javadoc 0m 26s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 1m 5s the patch passed
+1 💚 shadedclient 39m 6s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 20s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
144m 36s
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/28/artifact/out/Dockerfile
GITHUB PR #6314
JIRA Issue HADOOP-18325
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux bbcecd70f66a 5.15.0-107-generic #117-Ubuntu SMP Fri Apr 26 12:26:49 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 05edb95
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/28/testReport/
Max. process+thread count 563 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6314/28/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran steveloughran left a comment

LGTM
+1

@steveloughran steveloughran merged commit d168d3f into apache:trunk May 23, 2024
@steveloughran
Contributor

fixed in 3.5, submit a PR for this against 3.4.

regarding my outstanding comments, i will create a followup

@anmolanmol1234
Contributor Author

fixed in 3.5, submit a PR for this against 3.4.

regarding my outstanding comments, i will create a followup

Sure will do so.

anmolanmol1234 added a commit to anmolanmol1234/hadoop that referenced this pull request Jan 20, 2025
…apache#6314)

Adds support for metric collection at the filesystem instance level.
Metrics are pushed to the store upon the closure of a filesystem instance, encompassing all operations
that utilized that specific instance.

Collected Metrics:

- Number of successful requests without any retries.
- Count of requests that succeeded after a specified number of retries (x retries).
- Request count subjected to throttling.
- Number of requests that failed despite exhausting all retry attempts, etc.

Implementation Details:

Incorporated logic in the AbfsClient to facilitate metric pushing through an additional request.
This occurs in scenarios where no requests are sent to the backend for a defined idle period.

By implementing these enhancements, we ensure comprehensive monitoring and analysis of filesystem interactions, enabling a deeper understanding of success rates, retry scenarios, throttling instances, and exhaustive failure scenarios. Additionally, the AbfsClient logic ensures that metrics are proactively pushed even during idle periods, maintaining a continuous and accurate representation of filesystem performance.

Contributed by Anmol Asrani
anujmodi2021 pushed a commit that referenced this pull request Jan 21, 2025

6 participants