
List:       hadoop-dev
Subject:    [jira] [Resolved] (HADOOP-18521) ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
From:       "Steve Loughran (Jira)" <jira () apache ! org>
Date:       2022-12-19 11:12:00
Message-ID: JIRA.13496586.1667660426000.214646.1671448320055 () Atlassian ! JIRA


     [ https://issues.apache.org/jira/browse/HADOOP-18521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-18521.
-------------------------------------
    Fix Version/s: 3.3.5
       Resolution: Fixed

This is fixed in HADOOP-18546; the followups were just tuning.

Closing as such. My larger piece of work has some advantages (better testability,
iostats of use), but that makes it too complex to put in 3.3.5 and means more work
remaining to fix.

When we do a rework of the read buffer manager, some aspects of it can be applied:
* iostats
* tryEvict prioritising eviction of completed fetches with buffers belonging to
  closed windows
* AbfsInputStream calls to go through an interface, with unit tests
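
The tryEvict idea in the list above could look roughly like the sketch below.
Everything in it is hypothetical: EvictionSketch, Completed, ownerClosed and
pickVictim are illustrative names, not part of hadoop-azure, and the tie-break
on completion time is an assumption rather than anything stated in this issue.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these names exist in hadoop-azure. */
final class EvictionSketch {

  /** Stand-in for a completed prefetch whose buffer could be reclaimed. */
  static final class Completed {
    final String id;
    final boolean ownerClosed;     // true if the owning stream has been closed
    final long completedAtMillis;  // when the fetch finished

    Completed(String id, boolean ownerClosed, long completedAtMillis) {
      this.id = id;
      this.ownerClosed = ownerClosed;
      this.completedAtMillis = completedAtMillis;
    }
  }

  private EvictionSketch() {
  }

  /**
   * Pick the completed fetch to evict: entries whose owner is closed come
   * first, then the oldest completion (assumed tie-break).
   */
  static Optional<Completed> pickVictim(List<Completed> completed) {
    Comparator<Completed> closedFirst =
        (a, b) -> Boolean.compare(b.ownerClosed, a.ownerClosed);
    Comparator<Completed> order =
        closedFirst.thenComparingLong(c -> c.completedAtMillis);
    return completed.stream().min(order);
  }
}
```

Only completed fetches appear in the candidate list here; buffers still being
filled by an active GET are deliberately out of scope for eviction.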

It'd also be good to pass the split start/end and read policy from the stream to the manager:
* don't prefetch past end of split (or at most, one block)
* on random IO, use optimised policy (no prefetch? one block max)
* on vectored IO: no prefetching
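
A minimal sketch of how those three rules might combine, assuming the manager
is handed the split end and a read policy. PrefetchPlanner, ReadPolicy and
blocksToPrefetch are hypothetical names for illustration only, and the choice
of "one block max" for random IO is one of the two options floated above.

```java
/** Hypothetical sketch; none of these names exist in hadoop-azure. */
final class PrefetchPlanner {

  /** Read policies named in the list above. */
  enum ReadPolicy { SEQUENTIAL, RANDOM, VECTORED }

  private PrefetchPlanner() {
  }

  /**
   * @param policy     read policy of the stream
   * @param nextOffset offset at which the next prefetch block would start
   * @param splitEnd   end offset of the split (exclusive); negative if unknown
   * @param blockSize  prefetch block size in bytes, must be positive
   * @param queueDepth configured maximum number of prefetch blocks
   * @return number of blocks to queue for prefetch
   */
  static int blocksToPrefetch(ReadPolicy policy, long nextOffset,
      long splitEnd, int blockSize, int queueDepth) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    // Vectored IO: no prefetching. Depth 0 disables prefetching entirely.
    if (policy == ReadPolicy.VECTORED || queueDepth <= 0) {
      return 0;
    }
    // Random IO: assumed here to allow at most one block.
    int limit = policy == ReadPolicy.RANDOM ? 1 : queueDepth;
    if (splitEnd < 0) {
      return limit; // split unknown: fall back to the policy limit
    }
    long remaining = splitEnd - nextOffset;
    if (remaining <= 0) {
      return 0; // already at or past the end of the split
    }
    // Round up, so the last block may overrun the split by less than one block.
    long blocksInSplit = (remaining + blockSize - 1) / blockSize;
    return (int) Math.min(limit, blocksInSplit);
  }
}
```

With a 4 MB block size and a queue depth of 2, a sequential stream 2 MB short
of its split end would queue one block instead of two under this sketch.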

> ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
> ---------------------------------------------------------------------
> 
> Key: HADOOP-18521
> URL: https://issues.apache.org/jira/browse/HADOOP-18521
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Affects Versions: 3.3.2, 3.3.3, 3.3.4
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.3.5
> 
> 
> AbfsInputStream.close() can trigger the return of buffers used for active prefetch
> GET requests into the ReadBufferManager free buffer pool. A subsequent prefetch by
> a different stream in the same process may acquire this same buffer. This risks
> corrupting the second stream's prefetched data, which may then be returned to the
> thread reading that stream. On releases without the fix (3.3.2 to 3.3.4), the bug
> can be avoided by disabling all prefetching:
> {code}
> fs.azure.readaheadqueue.depth = 0
> {code}
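
For reference, the workaround quoted above would typically be applied through
the standard Hadoop XML configuration. A minimal fragment, assuming it is
placed in core-site.xml (the file choice is an assumption; the property name
and value are taken from the issue text):

```xml
<!-- Disable ABFS prefetching on releases without the fix -->
<property>
  <name>fs.azure.readaheadqueue.depth</name>
  <value>0</value>
</property>
```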



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


