Commit 2e586c1

DieterDP-ng authored and steveloughran committed
HADOOP-18987. Various fixes to FileSystem API docs (apache#6292)
Contributed by Dieter De Paepe
1 parent 19c7952 commit 2e586c1

5 files changed

+75
-69
lines changed

hadoop-common-project/hadoop-common/src/site/markdown/filesystem/abortable.md

Lines changed: 1 addition & 2 deletions
@@ -88,14 +88,13 @@ for example. output streams returned by the S3A FileSystem.
8888
The stream MUST implement `Abortable` and `StreamCapabilities`.
8989

9090
```python
91-
if unsupported:
91+
if unsupported:
9292
throw UnsupportedException
9393

9494
if not isOpen(stream):
9595
no-op
9696

9797
StreamCapabilities.hasCapability("fs.capability.outputstream.abortable") == True
98-
9998
```
10099

101100

hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md

Lines changed: 50 additions & 47 deletions
@@ -64,13 +64,13 @@ a protected directory, result in such an exception being raised.
6464

6565
### `boolean isDirectory(Path p)`
6666

67-
def isDirectory(FS, p)= p in directories(FS)
67+
def isDir(FS, p) = p in directories(FS)
6868

6969

7070
### `boolean isFile(Path p)`
7171

7272

73-
def isFile(FS, p) = p in files(FS)
73+
def isFile(FS, p) = p in filenames(FS)
7474

7575

7676
### `FileStatus getFileStatus(Path p)`
@@ -250,7 +250,7 @@ process.
250250
changes are made to the filesystem, the result of `listStatus(parent(P))` SHOULD
251251
include the value of `getFileStatus(P)`.
252252

253-
* After an entry at path `P` is created, and before any other
253+
* After an entry at path `P` is deleted, and before any other
254254
changes are made to the filesystem, the result of `listStatus(parent(P))` SHOULD
255255
NOT include the value of `getFileStatus(P)`.
256256

@@ -305,7 +305,7 @@ that they must all be listed, and, at the time of listing, exist.
305305
All paths must exist. There is no requirement for uniqueness.
306306

307307
forall p in paths :
308-
exists(fs, p) else raise FileNotFoundException
308+
exists(FS, p) else raise FileNotFoundException
309309

310310
#### Postconditions
311311

@@ -381,7 +381,7 @@ being completely performed.
381381

382382
Path `path` must exist:
383383

384-
exists(FS, path) : raise FileNotFoundException
384+
if not exists(FS, path) : raise FileNotFoundException
385385

386386
#### Postconditions
387387

@@ -432,7 +432,7 @@ of data which must be collected in a single RPC call.
432432

433433
#### Preconditions
434434

435-
exists(FS, path) else raise FileNotFoundException
435+
if not exists(FS, path) : raise FileNotFoundException
436436

437437
### Postconditions
438438

@@ -463,7 +463,7 @@ and 1 for file count.
463463

464464
#### Preconditions
465465

466-
exists(FS, path) else raise FileNotFoundException
466+
if not exists(FS, path) : raise FileNotFoundException
467467

468468
#### Postconditions
469469

@@ -567,7 +567,7 @@ when writing objects to a path in the filesystem.
567567
#### Postconditions
568568

569569

570-
result = integer >= 0
570+
result = integer >= 0
571571

572572
The outcome of this operation is usually identical to `getDefaultBlockSize()`,
573573
with no checks for the existence of the given path.
@@ -591,12 +591,12 @@ on the filesystem.
591591

592592
#### Preconditions
593593

594-
if not exists(FS, p) : raise FileNotFoundException
594+
if not exists(FS, p) : raise FileNotFoundException
595595

596596

597597
#### Postconditions
598598

599-
if len(FS, P) > 0: getFileStatus(P).getBlockSize() > 0
599+
if len(FS, P) > 0 : getFileStatus(P).getBlockSize() > 0
600600
result == getFileStatus(P).getBlockSize()
601601

602602
1. The outcome of this operation MUST be identical to the value of
@@ -654,12 +654,12 @@ No ancestor may be a file
654654

655655
forall d = ancestors(FS, p) :
656656
if exists(FS, d) and not isDir(FS, d) :
657-
raise [ParentNotDirectoryException, FileAlreadyExistsException, IOException]
657+
raise {ParentNotDirectoryException, FileAlreadyExistsException, IOException}
658658

659659
#### Postconditions
660660

661661

662-
FS' where FS'.Directories' = FS.Directories + [p] + ancestors(FS, p)
662+
FS' where FS'.Directories = FS.Directories + [p] + ancestors(FS, p)
663663
result = True
664664

665665

@@ -688,7 +688,7 @@ The return value is always true—even if a new directory is not created
688688

689689
The file must not exist for a no-overwrite create:
690690

691-
if not overwrite and isFile(FS, p) : raise FileAlreadyExistsException
691+
if not overwrite and isFile(FS, p) : raise FileAlreadyExistsException
692692

693693
Writing to or overwriting a directory must fail.
694694

@@ -698,7 +698,7 @@ No ancestor may be a file
698698

699699
forall d = ancestors(FS, p) :
700700
if exists(FS, d) and not isDir(FS, d) :
701-
raise [ParentNotDirectoryException, FileAlreadyExistsException, IOException]
701+
raise {ParentNotDirectoryException, FileAlreadyExistsException, IOException}
702702

703703
FileSystems may reject the request for other
704704
reasons, such as the FS being read-only (HDFS),
@@ -712,8 +712,8 @@ For instance, HDFS may raise an `InvalidPathException`.
712712
#### Postconditions
713713

714714
FS' where :
715-
FS'.Files'[p] == []
716-
ancestors(p) is-subset-of FS'.Directories'
715+
FS'.Files[p] == []
716+
ancestors(p) subset-of FS'.Directories
717717

718718
result = FSDataOutputStream
719719

@@ -734,7 +734,7 @@ The behavior of the returned stream is covered in [Output](outputstream.html).
734734
clients creating files with `overwrite==true` to fail if the file is created
735735
by another client between the two tests.
736736

737-
* The S3A and potentially other Object Stores connectors not currently change the `FS` state
737+
* The S3A and potentially other Object Stores connectors currently don't change the `FS` state
738738
until the output stream `close()` operation is completed.
739739
This is a significant difference between the behavior of object stores
740740
and that of filesystems, as it allows >1 client to create a file with `overwrite=false`,
@@ -762,15 +762,15 @@ The behavior of the returned stream is covered in [Output](outputstream.html).
762762
#### Implementation Notes
763763

764764
`createFile(p)` returns a `FSDataOutputStreamBuilder` only and does not make
765-
change on filesystem immediately. When `build()` is invoked on the `FSDataOutputStreamBuilder`,
765+
changes on the filesystem immediately. When `build()` is invoked on the `FSDataOutputStreamBuilder`,
766766
the builder parameters are verified and [`create(Path p)`](#FileSystem.create)
767767
is invoked on the underlying filesystem. `build()` has the same preconditions
768768
and postconditions as [`create(Path p)`](#FileSystem.create).
769769

770770
* Similar to [`create(Path p)`](#FileSystem.create), files are overwritten
771-
by default, unless specify `builder.overwrite(false)`.
771+
by default, unless specified by `builder.overwrite(false)`.
772772
* Unlike [`create(Path p)`](#FileSystem.create), missing parent directories are
773-
not created by default, unless specify `builder.recursive()`.
773+
not created by default, unless specified by `builder.recursive()`.
774774

775775
### <a name='FileSystem.append'></a> `FSDataOutputStream append(Path p, int bufferSize, Progressable progress)`
776776

@@ -780,14 +780,14 @@ Implementations without a compliant call SHOULD throw `UnsupportedOperationExcep
780780

781781
if not exists(FS, p) : raise FileNotFoundException
782782

783-
if not isFile(FS, p) : raise [FileAlreadyExistsException, FileNotFoundException, IOException]
783+
if not isFile(FS, p) : raise {FileAlreadyExistsException, FileNotFoundException, IOException}
784784

785785
#### Postconditions
786786

787787
FS' = FS
788788
result = FSDataOutputStream
789789

790-
Return: `FSDataOutputStream`, which can update the entry `FS.Files[p]`
790+
Return: `FSDataOutputStream`, which can update the entry `FS'.Files[p]`
791791
by appending data to the existing list.
792792

793793
The behavior of the returned stream is covered in [Output](outputstream.html).
@@ -813,7 +813,7 @@ Implementations without a compliant call SHOULD throw `UnsupportedOperationExcep
813813

814814
#### Preconditions
815815

816-
if not isFile(FS, p)) : raise [FileNotFoundException, IOException]
816+
if not isFile(FS, p)) : raise {FileNotFoundException, IOException}
817817

818818
This is a critical precondition. Implementations of some FileSystems (e.g.
819819
Object stores) could shortcut one round trip by postponing their HTTP GET
@@ -842,7 +842,7 @@ The result MUST be the same for local and remote callers of the operation.
842842
symbolic links
843843

844844
1. HDFS throws `IOException("Cannot open filename " + src)` if the path
845-
exists in the metadata, but no copies of any its blocks can be located;
845+
exists in the metadata, but no copies of its blocks can be located;
846846
-`FileNotFoundException` would seem more accurate and useful.
847847

848848
### `FSDataInputStreamBuilder openFile(Path path)`
@@ -861,7 +861,7 @@ Implementations without a compliant call MUST throw `UnsupportedOperationExcepti
861861

862862
let stat = getFileStatus(Path p)
863863
let FS' where:
864-
(FS.Directories', FS.Files', FS.Symlinks')
864+
(FS'.Directories, FS.Files', FS'.Symlinks)
865865
p' in paths(FS') where:
866866
exists(FS, stat.path) implies exists(FS', p')
867867

@@ -931,16 +931,16 @@ metadata in the `PathHandle` to detect references from other namespaces.
931931

932932
### `FSDataInputStream open(PathHandle handle, int bufferSize)`
933933

934-
Implementaions without a compliant call MUST throw `UnsupportedOperationException`
934+
Implementations without a compliant call MUST throw `UnsupportedOperationException`
935935

936936
#### Preconditions
937937

938938
let fd = getPathHandle(FileStatus stat)
939939
if stat.isdir : raise IOException
940940
let FS' where:
941-
(FS.Directories', FS.Files', FS.Symlinks')
942-
p' in FS.Files' where:
943-
FS.Files'[p'] = fd
941+
(FS'.Directories, FS.Files', FS'.Symlinks)
942+
p' in FS'.Files where:
943+
FS'.Files[p'] = fd
944944
if not exists(FS', p') : raise InvalidPathHandleException
945945

946946
The implementation MUST resolve the referent of the `PathHandle` following
@@ -951,7 +951,7 @@ encoded in the `PathHandle`.
951951

952952
#### Postconditions
953953

954-
result = FSDataInputStream(0, FS.Files'[p'])
954+
result = FSDataInputStream(0, FS'.Files[p'])
955955

956956
The stream returned is subject to the constraints of a stream returned by
957957
`open(Path)`. Constraints checked on open MAY hold to hold for the stream, but
@@ -1006,7 +1006,7 @@ A directory with children and `recursive == False` cannot be deleted
10061006

10071007
If the file does not exist the filesystem state does not change
10081008

1009-
if not exists(FS, p):
1009+
if not exists(FS, p) :
10101010
FS' = FS
10111011
result = False
10121012

@@ -1089,7 +1089,7 @@ Some of the object store based filesystem implementations always return
10891089
false when deleting the root, leaving the state of the store unchanged.
10901090

10911091
if isRoot(p) :
1092-
FS ' = FS
1092+
FS' = FS
10931093
result = False
10941094

10951095
This is irrespective of the recursive flag status or the state of the directory.
@@ -1152,7 +1152,7 @@ has been calculated.
11521152

11531153
Source `src` must exist:
11541154

1155-
exists(FS, src) else raise FileNotFoundException
1155+
if not exists(FS, src) : raise FileNotFoundException
11561156

11571157
`dest` cannot be a descendant of `src`:
11581158

@@ -1162,7 +1162,7 @@ This implicitly covers the special case of `isRoot(FS, src)`.
11621162

11631163
`dest` must be root, or have a parent that exists:
11641164

1165-
isRoot(FS, dest) or exists(FS, parent(dest)) else raise IOException
1165+
if not (isRoot(FS, dest) or exists(FS, parent(dest))) : raise IOException
11661166

11671167
The parent path of a destination must not be a file:
11681168

@@ -1240,7 +1240,8 @@ There is no consistent behavior here.
12401240

12411241
The outcome is no change to FileSystem state, with a return value of false.
12421242

1243-
FS' = FS; result = False
1243+
FS' = FS
1244+
result = False
12441245

12451246
*Local Filesystem*
12461247

@@ -1319,28 +1320,31 @@ Implementations without a compliant call SHOULD throw `UnsupportedOperationExcep
13191320

13201321
All sources MUST be in the same directory:
13211322

1322-
for s in sources: if parent(S) != parent(p) raise IllegalArgumentException
1323+
for s in sources:
1324+
if parent(s) != parent(p) : raise IllegalArgumentException
13231325

13241326
All block sizes must match that of the target:
13251327

1326-
for s in sources: getBlockSize(FS, S) == getBlockSize(FS, p)
1328+
for s in sources:
1329+
getBlockSize(FS, s) == getBlockSize(FS, p)
13271330

13281331
No duplicate paths:
13291332

1330-
not (exists p1, p2 in (sources + [p]) where p1 == p2)
1333+
let input = sources + [p]
1334+
not (exists i, j: i != j and input[i] == input[j])
13311335

13321336
HDFS: All source files except the final one MUST be a complete block:
13331337

13341338
for s in (sources[0:length(sources)-1] + [p]):
1335-
(length(FS, s) mod getBlockSize(FS, p)) == 0
1339+
(length(FS, s) mod getBlockSize(FS, p)) == 0
13361340

13371341

13381342
#### Postconditions
13391343

13401344

13411345
FS' where:
1342-
(data(FS', T) = data(FS, T) + data(FS, sources[0]) + ... + data(FS, srcs[length(srcs)-1]))
1343-
and for s in srcs: not exists(FS', S)
1346+
(data(FS', p) = data(FS, p) + data(FS, sources[0]) + ... + data(FS, sources[length(sources)-1]))
1347+
for s in sources: not exists(FS', s)
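
The corrected `concat` preconditions and postconditions can be exercised against a toy in-memory model. The sketch below is illustrative only: `concat_model`, the `files` and `block_sizes` dictionaries, and the use of `ValueError` in place of `IllegalArgumentException` are assumptions made for this example, not part of the Hadoop API. The specification names no exception for a block size mismatch or for duplicate paths, so `ValueError` stands in there as well. The HDFS-specific complete-block rule is left out.

```python
import posixpath


def concat_model(files, block_sizes, p, sources):
    """Toy model of concat(p, sources); returns the new file table FS'.

    files maps path -> bytes and block_sizes maps path -> int.
    Both are hypothetical stand-ins for FS.Files and getBlockSize(FS, path).
    """
    for s in sources:
        # All sources MUST be in the same directory as the target.
        if posixpath.dirname(s) != posixpath.dirname(p):
            raise ValueError("source %s is not in the target's directory" % s)
        # All block sizes must match that of the target.
        if block_sizes[s] != block_sizes[p]:
            raise ValueError("block size of %s differs from the target" % s)

    # No duplicate paths among the sources plus the target.
    inputs = list(sources) + [p]
    if len(set(inputs)) != len(inputs):
        raise ValueError("duplicate path in concat arguments")

    # Postconditions: the target holds its old data followed by each source
    # in order, and no source exists afterwards.
    new_files = dict(files)
    new_files[p] = files[p] + b"".join(files[s] for s in sources)
    for s in sources:
        del new_files[s]
    return new_files
```

The function returns a new dictionary rather than mutating its input, which mirrors the `FS` / `FS'` notation used in the specification.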
13441348

13451349

13461350
HDFS's restrictions may be an implementation detail of how it implements
@@ -1360,7 +1364,7 @@ Implementations without a compliant call SHOULD throw `UnsupportedOperationExcep
13601364

13611365
if not exists(FS, p) : raise FileNotFoundException
13621366

1363-
if isDir(FS, p) : raise [FileNotFoundException, IOException]
1367+
if isDir(FS, p) : raise {FileNotFoundException, IOException}
13641368

13651369
if newLength < 0 || newLength > len(FS.Files[p]) : raise HadoopIllegalArgumentException
13661370

@@ -1369,8 +1373,7 @@ Truncate cannot be performed on a file, which is open for writing or appending.
13691373

13701374
#### Postconditions
13711375

1372-
FS' where:
1373-
len(FS.Files[p]) = newLength
1376+
len(FS'.Files[p]) = newLength
13741377

13751378
Return: `true`, if truncation is finished and the file can be immediately
13761379
opened for appending, or `false` otherwise.
@@ -1399,7 +1402,7 @@ Source and destination must be different
13991402
if src = dest : raise FileExistsException
14001403
```
14011404

1402-
Destination and source must not be descendants one another
1405+
Destination and source must not be descendants of one another
14031406
```python
14041407
if isDescendant(src, dest) or isDescendant(dest, src) : raise IOException
14051408
```
@@ -1429,7 +1432,7 @@ Given a base path on the source `base` and a child path `child` where `base` is
14291432

14301433
```python
14311434
def final_name(base, child, dest):
1432-
is base = child:
1435+
if base == child:
14331436
return dest
14341437
else:
14351438
return dest + childElements(base, child)
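
A runnable sketch of the corrected `final_name` pseudocode follows. It assumes POSIX-style paths and models `childElements(base, child)` with `PurePosixPath.relative_to`; the helper name `child_elements` is introduced here for illustration and does not exist in Hadoop.

```python
from pathlib import PurePosixPath


def child_elements(base, child):
    """Path elements of child below base (hypothetical helper).

    Raises ValueError if child is not base or a descendant of base.
    """
    return PurePosixPath(child).relative_to(PurePosixPath(base)).parts


def final_name(base, child, dest):
    """Map a path under base to the matching path under dest."""
    if PurePosixPath(base) == PurePosixPath(child):
        return PurePosixPath(dest)
    # Append the elements of child that lie below base to dest.
    return PurePosixPath(dest).joinpath(*child_elements(base, child))
```

Under these assumptions, `final_name("/src", "/src/a/b", "/dst")` evaluates to `PurePosixPath("/dst/a/b")`, and passing the same path as `base` and `child` returns `dest` unchanged.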
@@ -1557,7 +1560,7 @@ while (iterator.hasNext()) {
15571560

15581561
As raising exceptions is an expensive operation in JVMs, the `while(hasNext())`
15591562
loop option is more efficient. (see also [Concurrency and the Remote Iterator](#RemoteIteratorConcurrency)
1560-
for a dicussion on this topic).
1563+
for a discussion on this topic).
15611564

15621565
Implementors of the interface MUST support both forms of iterations; authors
15631566
of tests SHOULD verify that both iteration mechanisms work.
