Hadoop Change Log
Trunk (unreleased changes)
INCOMPATIBLE CHANGES
HADOOP-4895. Remove deprecated methods DFSClient.getHints(..) and
DFSClient.isDirectory(..). (szetszwo)
HADOOP-4941. Remove deprecated FileSystem methods: getBlockSize(Path f),
getLength(Path f) and getReplication(Path src). (szetszwo)
HADOOP-4648. Remove obsolete, deprecated InMemoryFileSystem and
ChecksumDistributedFileSystem. (cdouglas via szetszwo)
HADOOP-4940. Remove a deprecated method FileSystem.delete(Path f). (Enis
Soztutar via szetszwo)
HADOOP-4010. Change semantics for LineRecordReader to read an additional
line per split, rather than moving back one character in the stream, to
work with splittable compression codecs. (Abdul Qadeer via cdouglas)
HADOOP-5094. Show hostname and separate live/dead datanodes in DFSAdmin
report. (Jakob Homan via szetszwo)
HADOOP-4942. Remove deprecated FileSystem methods getName() and
getNamed(String name, Configuration conf). (Jakob Homan via szetszwo)
HADOOP-5486. Removes the CLASSPATH string from the command line and instead
exports it in the environment. (Amareshwari Sriramadasu via ddas)
HADOOP-2827. Remove deprecated NetUtils::getServerAddress. (cdouglas)
HADOOP-5681. Change examples RandomWriter and RandomTextWriter to
use new mapreduce API. (Amareshwari Sriramadasu via sharad)
HADOOP-5680. Change org.apache.hadoop.examples.SleepJob to use new
mapreduce api. (Amareshwari Sriramadasu via sharad)
HADOOP-5699. Change org.apache.hadoop.examples.PiEstimator to use
new mapreduce api. (Amareshwari Sriramadasu via sharad)
HADOOP-5720. Introduces new task types - JOB_SETUP, JOB_CLEANUP
and TASK_CLEANUP. Removes the isMap methods from TaskID/TaskAttemptID
classes. (ddas)
HADOOP-5668. Change TotalOrderPartitioner to use new API. (Amareshwari
Sriramadasu via cdouglas)
HADOOP-5738. Split "waiting_tasks" JobTracker metric into waiting maps and
waiting reduces. (Sreekanth Ramakrishnan via cdouglas)
HADOOP-5679. Resolve findbugs warnings in core/streaming/pipes/examples.
(Jothi Padmanabhan via sharad)
HADOOP-4359. Support for data access authorization checking on Datanodes.
(Kan Zhang via rangadi)
HADOOP-5690. Change org.apache.hadoop.examples.DBCountPageView to use
new mapreduce api. (Amareshwari Sriramadasu via sharad)
HADOOP-5694. Change org.apache.hadoop.examples.dancing to use new
mapreduce api. (Amareshwari Sriramadasu via sharad)
HADOOP-5696. Change org.apache.hadoop.examples.Sort to use new
mapreduce api. (Amareshwari Sriramadasu via sharad)
HADOOP-5698. Change org.apache.hadoop.examples.MultiFileWordCount to
use new mapreduce api. (Amareshwari Sriramadasu via sharad)
HADOOP-5913. Provide ability to an administrator to stop and start
job queues. (Rahul Kumar Singh and Hemanth Yamijala via yhemanth)
NEW FEATURES
HADOOP-4268. Change fsck to use ClientProtocol methods so that the
corresponding permission requirement for running the ClientProtocol
methods will be enforced. (szetszwo)
HADOOP-3953. Implement sticky bit for directories in HDFS. (Jakob Homan
via szetszwo)
HADOOP-4368. Implement df in FsShell to show the status of a FileSystem.
(Craig Macdonald via szetszwo)
HADOOP-3741. Add a web ui to the SecondaryNameNode for showing its status.
(szetszwo)
HADOOP-5018. Add pipelined writers to Chukwa. (Ari Rabkin via cdouglas)
HADOOP-5052. Add an example computing exact digits of pi using the
Bailey-Borwein-Plouffe algorithm. (Tsz Wo (Nicholas), SZE via cdouglas)
HADOOP-4927. Adds a generic wrapper around outputformat to allow creation of
output on demand (Jothi Padmanabhan via ddas)
HADOOP-5144. Add a new DFSAdmin command for changing the setting of restore
failed storage replicas in namenode. (Boris Shkolnik via szetszwo)
HADOOP-5258. Add a new DFSAdmin command to print a tree of the rack and
datanode topology as seen by the namenode. (Jakob Homan via szetszwo)
HADOOP-4756. A command line tool to access JMX properties on NameNode
and DataNode. (Boris Shkolnik via rangadi)
HADOOP-4539. Introduce backup node and checkpoint node. (shv)
HADOOP-5363. Add support for proxying connections to multiple clusters with
different versions to hdfsproxy. (Zhiyong Zhang via cdouglas)
HADOOP-5528. Add a configurable hash partitioner operating on ranges of
BinaryComparable keys. (Klaas Bosteels via shv)
HADOOP-5257. HDFS servers may start and stop external components through
a plugin interface. (Carlos Valiente via dhruba)
HADOOP-5450. Add application-specific data types to streaming's typed bytes
interface. (Klaas Bosteels via omalley)
HADOOP-5518. Add contrib/mrunit, a MapReduce unit test framework.
(Aaron Kimball via cutting)
HADOOP-5469. Add /metrics servlet to daemons, providing metrics
over HTTP as either text or JSON. (Philip Zeyliger via cutting)
HADOOP-5467. Introduce offline fsimage image viewer. (Jakob Homan via shv)
HADOOP-5752. Add a new hdfs image processor, Delimited, to oiv. (Jakob
Homan via szetszwo)
HADOOP-5266. Adds the capability to do mark/reset of the reduce values
iterator in the Context object API. (Jothi Padmanabhan via ddas)
HADOOP-5745. Allow setting the default value of maxRunningJobs for all
pools. (dhruba via matei)
HADOOP-5643. Adds a way to decommission TaskTrackers while the JobTracker
is running. (Amar Kamat via ddas)
HADOOP-4829. Allow FileSystem shutdown hook to be disabled.
(Todd Lipcon via tomwhite)
HADOOP-5815. Sqoop: A database import tool for Hadoop.
(Aaron Kimball via tomwhite)
HADOOP-4861. Add disk usage with human-readable size (-duh).
(Todd Lipcon via tomwhite)
HADOOP-5844. Use mysqldump when connecting to local mysql instance in Sqoop.
(Aaron Kimball via tomwhite)
HADOOP-5170. Allows jobs to set max maps/reduces per-node and per-cluster.
(Matei Zaharia via ddas)
HADOOP-5897. Add name-node metrics to capture java heap usage.
(Suresh Srinivas via shv)
IMPROVEMENTS
HADOOP-4565. Added CombineFileInputFormat to use data locality information
to create splits. (dhruba via zshao)
HADOOP-4936. Improvements to TestSafeMode. (shv)
HADOOP-4985. Remove unnecessary "throw IOException" declarations in
FSDirectory related methods. (szetszwo)
HADOOP-5017. Change NameNode.namesystem declaration to private. (szetszwo)
HADOOP-4794. Add branch information from the source version control into
the version information that is compiled into Hadoop. (cdouglas via
omalley)
HADOOP-5070. Increment copyright year to 2009, remove assertions of ASF
copyright to licensed files. (Tsz Wo (Nicholas), SZE via cdouglas)
HADOOP-5037. Deprecate static FSNamesystem.getFSNamesystem(). (szetszwo)
HADOOP-5088. Include releaseaudit target as part of developer test-patch
target. (Giridharan Kesavan via nigel)
HADOOP-2721. Uses setsid when creating new tasks so that subprocesses of
this process will be within this new session (and this process will be
the process leader for all the subprocesses). Killing the process leader,
or the main Java task in Hadoop's case, kills the entire subtree of
processes. (Ravi Gummadi via ddas)
HADOOP-5097. Remove static variable JspHelper.fsn, a static reference to
a non-singleton FSNamesystem object. (szetszwo)
HADOOP-3327. Improves handling of READ_TIMEOUT during map output copying.
(Amareshwari Sriramadasu via ddas)
HADOOP-5124. Choose datanodes randomly instead of starting from the first
datanode for providing fairness. (hairong via szetszwo)
HADOOP-4930. Implement a Linux native executable that can be used to
launch tasks as users. (Sreekanth Ramakrishnan via yhemanth)
HADOOP-5122. Fix format of fs.default.name value in libhdfs test conf.
(Craig Macdonald via tomwhite)
HADOOP-5038. Direct daemon trace to debug log instead of stdout. (Jerome
Boulon via cdouglas)
HADOOP-5101. Improve packaging by adding 'all-jars' target building core,
tools, and example jars. Let findbugs depend on this rather than the 'tar'
target. (Giridharan Kesavan via cdouglas)
HADOOP-4868. Splits the hadoop script into three parts - bin/hadoop,
bin/mapred and bin/hdfs. (Sharad Agarwal via ddas)
HADOOP-1722. Adds support for TypedBytes and RawBytes in Streaming.
(Klaas Bosteels via ddas)
HADOOP-4220. Changes the JobTracker restart tests so that they take much
less time. (Amar Kamat via ddas)
HADOOP-4885. Try to restore failed name-node storage directories at
checkpoint time. (Boris Shkolnik via shv)
HADOOP-5209. Update year to 2009 for javadoc. (szetszwo)
HADOOP-5279. Remove unnecessary targets from test-patch.sh.
(Giridharan Kesavan via nigel)
HADOOP-5120. Remove the use of FSNamesystem.getFSNamesystem() from
UpgradeManagerNamenode and UpgradeObjectNamenode. (szetszwo)
HADOOP-5222. Add offset to datanode clienttrace. (Lei Xu via cdouglas)
HADOOP-5240. Skip re-building javadoc when it is already
up-to-date. (Aaron Kimball via cutting)
HADOOP-5042. Add a cleanup stage to log rollover in Chukwa appender.
(Jerome Boulon via cdouglas)
HADOOP-5264. Removes redundant configuration object from the TaskTracker.
(Sharad Agarwal via ddas)
HADOOP-5232. Enable patch testing to occur on more than one host.
(Giri Kesavan via nigel)
HADOOP-4546. Fix DF reporting for AIX. (Bill Habermaas via cdouglas)
HADOOP-5023. Add Tomcat support to HdfsProxy. (Zhiyong Zhang via cdouglas)
HADOOP-5317. Provide documentation for LazyOutput Feature.
(Jothi Padmanabhan via johan)
HADOOP-5455. Document rpc metrics context to the extent dfs, mapred, and
jvm contexts are documented. (Philip Zeyliger via cdouglas)
HADOOP-5358. Provide scripting functionality to the synthetic load
generator. (Jakob Homan via hairong)
HADOOP-5442. Paginate jobhistory display and add some search
capabilities. (Amar Kamat via acmurthy)
HADOOP-4842. Streaming now allows specifying a command for the combiner.
(Amareshwari Sriramadasu via ddas)
HADOOP-5196. Avoid unnecessary byte[] allocation in
SequenceFile.CompressedBytes and SequenceFile.UncompressedBytes.
(hong tang via mahadev)
HADOOP-4655. New method FileSystem.newInstance() that always returns
a newly allocated FileSystem object. (dhruba)
HADOOP-4788. Set Fair scheduler to assign both a map and a reduce on each
heartbeat by default. (matei)
HADOOP-5491. In contrib/index, better control memory usage.
(Ning Li via cutting)
HADOOP-5423. Include option of preserving file metadata in
SequenceFile::sort. (Michael Tamm via cdouglas)
HADOOP-5331. Add support for KFS appends. (Sriram Rao via cdouglas)
HADOOP-4365. Make Configuration::getProps protected in support of
meaningful subclassing. (Steve Loughran via cdouglas)
HADOOP-2413. Remove the static variable FSNamesystem.fsNamesystemObject.
(Konstantin Shvachko via szetszwo)
HADOOP-4584. Improve datanode block reports and associated file system
scan to avoid interfering with normal datanode operations.
(Suresh Srinivas via rangadi)
HADOOP-5502. Documentation for backup and checkpoint nodes.
(Jakob Homan via shv)
HADOOP-5485. Mask actions in the fair scheduler's servlet UI based on
value of webinterface.private.actions.
(Vinod Kumar Vavilapalli via yhemanth)
HADOOP-5581. HDFS should throw FileNotFoundException when opening
a file that does not exist. (Brian Bockelman via rangadi)
HADOOP-5509. PendingReplicationBlocks does not start monitor in the
constructor. (shv)
HADOOP-5494. Modify sorted map output merger to lazily read values,
rather than buffering at least one record for each segment. (Devaraj Das
via cdouglas)
HADOOP-5396. Provide ability to refresh queue ACLs in the JobTracker
without having to restart the daemon.
(Sreekanth Ramakrishnan and Vinod Kumar Vavilapalli via yhemanth)
HADOOP-4490. Provide ability to run tasks as job owners.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5697. Change org.apache.hadoop.examples.Grep to use new
mapreduce api. (Amareshwari Sriramadasu via sharad)
HADOOP-5625. Add operation duration to clienttrace. (Lei Xu via cdouglas)
HADOOP-5705. Improve TotalOrderPartitioner efficiency by updating the trie
construction. (Dick King via cdouglas)
HADOOP-5589. Eliminate source limit of 64 for map-side joins imposed by
TupleWritable encoding. (Jingkei Ly via cdouglas)
HADOOP-5734. Correct block placement policy description in HDFS
Design document. (Konstantin Boudnik via shv)
HADOOP-5657. Validate data in TestReduceFetch to improve merge test
coverage. (cdouglas)
HADOOP-5613. Change S3Exception to checked exception.
(Andrew Hitchcock via tomwhite)
HADOOP-5717. Create public enum class for the Framework counters in
org.apache.hadoop.mapreduce. (Amareshwari Sriramadasu via sharad)
HADOOP-5217. Split AllTestDriver for core, hdfs and mapred. (sharad)
HADOOP-5364. Add certificate expiration warning to HsftpFileSystem and HDFS
proxy. (Zhiyong Zhang via cdouglas)
HADOOP-5733. Add map/reduce slot capacity and blacklisted capacity to
JobTracker metrics. (Sreekanth Ramakrishnan via cdouglas)
HADOOP-5596. Add EnumSetWritable. (He Yongqiang via szetszwo)
HADOOP-5727. Simplify hashcode for ID types. (Shevek via cdouglas)
HADOOP-5500. In DBOutputFormat, where field names are absent permit the
number of fields to be sufficient to construct the select query. (Enis
Soztutar via cdouglas)
HADOOP-5081. Split TestCLI into HDFS, Mapred and Core tests. (sharad)
HADOOP-5015. Separate block management code from FSNamesystem. (Suresh
Srinivas via szetszwo)
HADOOP-5080. Add new test cases to TestMRCLI and TestHDFSCLI
(V.Karthikeyan via nigel)
HADOOP-5135. Splits the tests into different directories based on the
package. Four new test targets have been defined - run-test-core,
run-test-mapred, run-test-hdfs and run-test-hdfs-with-mr.
(Sharad Agarwal via ddas)
HADOOP-5771. Implements unit tests for LinuxTaskController.
(Sreekanth Ramakrishnan and Vinod Kumar Vavilapalli via yhemanth)
HADOOP-5419. Provide a facility to query the Queue ACLs for the
current user.
(Rahul Kumar Singh via yhemanth)
HADOOP-5780. Improve per block message printed by "-metaSave" in HDFS.
(Raghu Angadi)
HADOOP-5823. Added a new class DeprecatedUTF8 to help with removing
UTF8 related javac warnings. These warnings are removed in
FSEditLog.java as a use case. (Raghu Angadi)
HADOOP-5824. Deprecate DataTransferProtocol.OP_READ_METADATA and remove
the corresponding unused codes. (Kan Zhang via szetszwo)
HADOOP-5721. Factor out EditLogFileInputStream and EditLogFileOutputStream
into independent classes. (Luca Telloli & Flavio Junqueira via shv)
HADOOP-5838. Fix a few javac warnings in HDFS. (Raghu Angadi)
HADOOP-5854. Fix a few "Inconsistent Synchronization" warnings in HDFS.
(Raghu Angadi)
HADOOP-5369. Small tweaks to reduce MapFile index size. (Ben Maurer
via sharad)
HADOOP-5858. Eliminate UTF8 and fix warnings in test/hdfs-with-mr package.
(shv)
HADOOP-5866. Move DeprecatedUTF8 from o.a.h.io to o.a.h.hdfs since it may
not be used outside hdfs. (Raghu Angadi)
HADOOP-5857. Move normal java methods from hdfs .jsp files to .java files.
(szetszwo)
HADOOP-5873. Remove deprecated methods randomDataNode() and
getDatanodeByIndex(..) in FSNamesystem. (szetszwo)
HADOOP-5572. Improves the progress reporting for the sort phase for both
maps and reduces. (Ravi Gummadi via ddas)
HADOOP-5839. Fix EC2 scripts to allow remote job submission.
(Joydeep Sen Sarma via tomwhite)
HADOOP-5877. Fix javac warnings in TestHDFSServerPorts, TestCheckpoint,
TestNameEditsConfig, TestStartup and TestStorageRestore.
(Jakob Homan via shv)
HADOOP-5438. Provide a single FileSystem method to create or open-for-append
to a file. (He Yongqiang via dhruba)
HADOOP-5472. Change DistCp to support globbing of input paths. (Dhruba
Borthakur and Rodrigo Schmidt via szetszwo)
HADOOP-5175. Don't unpack libjars on classpath. (Todd Lipcon via tomwhite)
HADOOP-5620. Add an option to DistCp for preserving modification and access
times. (Rodrigo Schmidt via szetszwo)
HADOOP-5664. Change map serialization so a lock is obtained only where
contention is possible, rather than for each write. (cdouglas)
HADOOP-5896. Remove the dependency of GenericOptionsParser on
Option.withArgPattern. (Giridharan Kesavan and Sharad Agarwal via
sharad)
HADOOP-5784. Makes the number of heartbeats that should arrive a second
at the JobTracker configurable. (Amareshwari Sriramadasu via ddas)
HADOOP-5955. Changes TestFileOuputFormat so that it uses LOCAL_MR
instead of CLUSTER_MR. (Jothi Padmanabhan via das)
HADOOP-5948. Changes TestJavaSerialization to use LocalJobRunner
instead of MiniMR/DFS cluster. (Jothi Padmanabhan via das)
HADOOP-2838. Add mapred.child.env to pass environment variables to
tasktracker's child processes. (Amar Kamat via sharad)
HADOOP-5961. DataNode process understands generic hadoop command line
options (like -Ddfs.property=value). (Raghu Angadi)
HADOOP-5938. Change org.apache.hadoop.mapred.jobcontrol to use new
api. (Amareshwari Sriramadasu via sharad)
HADOOP-2141. Improves the speculative execution heuristic. The heuristic
is currently based on the progress-rates of tasks and the expected time
to complete. Also, statistics about trackers are collected, and speculative
tasks are not given to the ones deduced to be slow.
(Andy Konwinski and ddas)
OPTIMIZATIONS
HADOOP-5595. NameNode does not need to run a replicator to choose a
random DataNode. (hairong)
HADOOP-5603. Improve NameNode's block placement performance. (hairong)
HADOOP-5638. More improvement on block placement performance. (hairong)
BUG FIXES
HADOOP-5379. CBZip2InputStream to throw IOException on data crc error.
(Rodrigo Schmidt via zshao)
HADOOP-5326. Fixes CBZip2OutputStream data corruption problem.
(Rodrigo Schmidt via zshao)
HADOOP-4963. Fixes a logging problem to do with getting the location of
map output file. (Amareshwari Sriramadasu via ddas)
HADOOP-2337. Trash should close FileSystem on exit and should not start
emptying thread if disabled. (shv)
HADOOP-5072. Fix failure in TestCodec because testSequenceFileGzipCodec
won't pass without native gzip codec. (Zheng Shao via dhruba)
HADOOP-5050. TestDFSShell.testFilePermissions should not assume umask
setting. (Jakob Homan via szetszwo)
HADOOP-4975. Set classloader for nested mapred.join configs. (Jingkei Ly
via cdouglas)
HADOOP-5078. Remove invalid AMI kernel in EC2 scripts. (tomwhite)
HADOOP-5045. FileSystem.isDirectory() should not be deprecated. (Suresh
Srinivas via szetszwo)
HADOOP-4960. Use datasource time, rather than system time, during metrics
demux. (Eric Yang via cdouglas)
HADOOP-5032. Export conf dir set in config script. (Eric Yang via cdouglas)
HADOOP-5176. Fix a typo in TestDFSIO. (Ravi Phulari via szetszwo)
HADOOP-4859. Distinguish daily rolling output dir by adding a timestamp.
(Jerome Boulon via cdouglas)
HADOOP-4959. Correct system metric collection from top on Redhat 5.1. (Eric
Yang via cdouglas)
HADOOP-5039. Fix log rolling regex to process only the relevant
subdirectories. (Jerome Boulon via cdouglas)
HADOOP-5095. Update Chukwa watchdog to accept config parameter. (Jerome
Boulon via cdouglas)
HADOOP-5147. Correct reference to agent list in Chukwa bin scripts. (Ari
Rabkin via cdouglas)
HADOOP-5148. Fix logic disabling watchdog timer in Chukwa daemon scripts.
(Ari Rabkin via cdouglas)
HADOOP-5100. Append, rather than truncate, when creating log4j metrics in
Chukwa. (Jerome Boulon via cdouglas)
HADOOP-5204. Fix broken trunk compilation on Hudson by letting
task-controller be an independent target in build.xml.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5212. Fix the path translation problem introduced by HADOOP-4868
running on cygwin. (Sharad Agarwal via omalley)
HADOOP-5226. Add license headers to html and jsp files. (szetszwo)
HADOOP-5172. Disable misbehaving Chukwa unit test until it can be fixed.
(Jerome Boulon via nigel)
HADOOP-4933. Fixes a ConcurrentModificationException problem that shows up
when the history viewer is accessed concurrently.
(Amar Kamat via ddas)
HADOOP-5253. Remove duplicate call to cn-docs target.
(Giri Kesavan via nigel)
HADOOP-5251. Fix classpath for contrib unit tests to include clover jar.
(nigel)
HADOOP-5206. Synchronize "unprotected*" methods of FSDirectory on the root.
(Jakob Homan via shv)
HADOOP-5292. Fix NPE in KFS::getBlockLocations. (Sriram Rao via lohit)
HADOOP-5219. Adds a new property io.seqfile.local.dir for use by SequenceFile,
which earlier used mapred.local.dir. (Sharad Agarwal via ddas)
HADOOP-5300. Fix ant javadoc-dev target and the typo in the class name
NameNodeActivtyMBean. (szetszwo)
HADOOP-5218. libhdfs unit test failed because it was unable to
start namenode/datanode. Fixed. (dhruba)
HADOOP-5273. Add license header to TestJobInProgress.java. (Jakob Homan
via szetszwo)
HADOOP-5229. Remove duplicate version variables in build files
(Stefan Groschupf via johan)
HADOOP-5383. Avoid building an unused string in NameNode's
verifyReplication(). (Raghu Angadi)
HADOOP-5347. Create a job output directory for the bbp examples. (szetszwo)
HADOOP-5341. Make hadoop-daemon scripts backwards compatible with the
changes in HADOOP-4868. (Sharad Agarwal via yhemanth)
HADOOP-5456. Fix javadoc links to ClientProtocol#restoreFailedStorage(..).
(Boris Shkolnik via szetszwo)
HADOOP-5458. Remove leftover Chukwa entries from build, etc. (cdouglas)
HADOOP-5386. Modify hdfsproxy unit test to start on a random port,
implement clover instrumentation. (Zhiyong Zhang via cdouglas)
HADOOP-5511. Add Apache License to EditLogBackupOutputStream. (shv)
HADOOP-5507. Fix JMXGet javadoc warnings. (Boris Shkolnik via szetszwo)
HADOOP-5191. Accessing HDFS with any ip or hostname should work as long
as it points to the interface NameNode is listening on. (Raghu Angadi)
HADOOP-5561. Add javadoc.maxmemory parameter to build, preventing OOM
exceptions from javadoc-dev. (Jakob Homan via cdouglas)
HADOOP-5149. Modify HistoryViewer to ignore unfamiliar files in the log
directory. (Hong Tang via cdouglas)
HADOOP-5477. Fix rare failure in TestCLI for hosts returning variations of
'localhost'. (Jakob Homan via cdouglas)
HADOOP-5194. Disables setsid for tasks run on cygwin.
(Ravi Gummadi via ddas)
HADOOP-5322. Fix misleading/outdated comments in JobInProgress.
(Amareshwari Sriramadasu via cdouglas)
HADOOP-5198. Fixes a problem to do with the task PID file being absent and
the JvmManager trying to look for it. (Amareshwari Sriramadasu via ddas)
HADOOP-5464. DFSClient did not treat write timeout of 0 properly.
(Raghu Angadi)
HADOOP-4045. Fix processing of IO errors in EditsLog.
(Boris Shkolnik via shv)
HADOOP-5462. Fixed a double free bug in the task-controller
executable. (Sreekanth Ramakrishnan via yhemanth)
HADOOP-5652. Fix a bug where in-memory segments are incorrectly retained in
memory. (cdouglas)
HADOOP-5533. Recovery duration shown on the jobtracker webpage is
inaccurate. (Amar Kamat via sharad)
HADOOP-5647. Fix TestJobHistory to not depend on /tmp. (Ravi Gummadi
via sharad)
HADOOP-5661. Fixes some findbugs warnings in o.a.h.mapred* packages and
suppresses a bunch of them. (Jothi Padmanabhan via ddas)
HADOOP-5704. Fix compilation problems in TestFairScheduler and
TestCapacityScheduler. (Chris Douglas via szetszwo)
HADOOP-5650. Fix safemode messages in the Namenode log. (Suresh Srinivas
via szetszwo)
HADOOP-5488. Removes the pidfile management for the Task JVM from the
framework and instead passes the PID back and forth between the
TaskTracker and the Task processes. (Ravi Gummadi via ddas)
HADOOP-5658. Fix Eclipse templates. (Philip Zeyliger via shv)
HADOOP-5709. Remove redundant synchronization added in HADOOP-5661. (Jothi
Padmanabhan via cdouglas)
HADOOP-5715. Add conf/mapred-queue-acls.xml to the ignore lists.
(szetszwo)
HADOOP-5612. Some c++ scripts are not chmodded before ant execution.
(Todd Lipcon via tomwhite)
HADOOP-5611. Fix C++ libraries to build on Debian Lenny. (Todd Lipcon
via tomwhite)
HADOOP-5592. Fix typo in Streaming doc in reference to GzipCodec.
(Corinne Chandel via tomwhite)
HADOOP-5656. Counter for S3N Read Bytes does not work. (Ian Nowland
via tomwhite)
HADOOP-5406. Fix JNI binding for ZlibCompressor::setDictionary. (Lars
Francke via cdouglas)
HADOOP-3426. Fix/provide handling when DNS lookup fails on the loopback
address. Also cache the result of the lookup. (Steve Loughran via cdouglas)
HADOOP-5476. Close the underlying InputStream in SequenceFile::Reader when
the constructor throws an exception. (Michael Tamm via cdouglas)
HADOOP-5675. Do not launch a job if DistCp has no work to do. (Tsz Wo
(Nicholas), SZE via cdouglas)
HADOOP-5737. Fixes a problem in the way the JobTracker used to talk to
other daemons like the NameNode to get the job's files. Also adds APIs
in the JobTracker to get the FileSystem objects as per the JobTracker's
configuration. (Amar Kamat via ddas)
HADOOP-5648. Not able to generate gridmix.jar on the already compiled version of hadoop.
(gkesavan)
HADOOP-5808. Fix import never used javac warnings in hdfs. (szetszwo)
HADOOP-5203. TT's version build is too restrictive. (Rick Cox via sharad)
HADOOP-5818. Revert the renaming from FSNamesystem.checkSuperuserPrivilege
to checkAccess by HADOOP-5643. (Amar Kamat via szetszwo)
HADOOP-5820. Fix findbugs warnings for http related codes in hdfs.
(szetszwo)
HADOOP-5822. Fix javac warnings in several dfs tests related to unnecessary
casts. (Jakob Homan via szetszwo)
HADOOP-5842. Fix a few javac warnings under packages fs and util.
(Hairong Kuang via szetszwo)
HADOOP-5845. Build successful despite test failure on test-core target.
(sharad)
HADOOP-5314. Prevent unnecessary saving of the file system image during
name-node startup. (Jakob Homan via shv)
HADOOP-5855. Fix javac warnings for DisallowedDatanodeException and
UnsupportedActionException. (szetszwo)
HADOOP-5582. Fixes a problem in Hadoop Vaidya to do with reading
counters from job history files. (Suhas Gogate via ddas)
HADOOP-5829. Fix javac warnings found in ReplicationTargetChooser,
FSImage, Checkpointer, SecondaryNameNode and a few other hdfs classes.
(Suresh Srinivas via szetszwo)
HADOOP-5835. Fix findbugs warnings found in Block, DataNode, NameNode and
a few other hdfs classes. (Suresh Srinivas via szetszwo)
HADOOP-5853. Undeprecate HttpServer.addInternalServlet method. (Suresh
Srinivas via szetszwo)
HADOOP-5801. Fixes the problem: If the hosts file is changed across restart
then it should be refreshed upon recovery so that the excluded hosts are
lost and the maps are re-executed. (Amar Kamat via ddas)
HADOOP-5841. Resolve findbugs warnings in DistributedFileSystem,
DatanodeInfo, BlocksMap, DataNodeDescriptor. (Jakob Homan via szetszwo)
HADOOP-5878. Fix import and Serializable javac warnings found in hdfs jsp.
(szetszwo)
HADOOP-5782. Revert a few formatting changes introduced in HADOOP-5015.
(Suresh Srinivas via rangadi)
HADOOP-5687. NameNode throws NPE if fs.default.name is the default value.
(Philip Zeyliger via shv)
HADOOP-5867. Fix javac warnings found in NNBench and NNBenchWithoutMR.
(Konstantin Boudnik via szetszwo)
HADOOP-5728. Fixed FSEditLog.printStatistics IndexOutOfBoundsException.
(Wang Xu via johan)
HADOOP-5847. Fixed failing Streaming unit tests (gkesavan)
HADOOP-5252. Streaming overrides -inputformat option (Klaas Bosteels
via sharad)
HADOOP-5710. Counter MAP_INPUT_BYTES missing from new mapreduce api.
(Amareshwari Sriramadasu via sharad)
HADOOP-5809. Fix job submission, broken by errant directory creation.
(Sreekanth Ramakrishnan and Jothi Padmanabhan via cdouglas)
HADOOP-5759. Fix for IllegalArgumentException when
CombineFileInputFormat is used as job InputFormat.
(Amareshwari Sriramadasu via dhruba)
HADOOP-5635. Change distributed cache to work with other distributed file
systems. (Andrew Hitchcock via tomwhite)
HADOOP-5856. Fix "unsafe multithreaded use of DateFormat" findbugs warning
in DataBlockScanner. (Kan Zhang via szetszwo)
HADOOP-4864. Fixes a problem to do with -libjars with multiple jars when
client and cluster reside on different OSs. (Amareshwari Sriramadasu via ddas)
HADOOP-5623. Fixes a problem to do with status messages getting overwritten
in streaming jobs. (Rick Cox and Jothi Padmanabhan via ddas)
HADOOP-5895. Fixes computation of count of merged bytes for logging.
(Ravi Gummadi via ddas)
HADOOP-5805. problem using top level s3 buckets as input/output directories.
(Ian Nowland via tomwhite)
HADOOP-5940. trunk eclipse-plugin build fails while trying to copy
commons-cli jar from the lib dir (Giridharan Kesavan via gkesavan)
HADOOP-5864. Fix DMI and OBL findbugs in packages hdfs and metrics.
(hairong)
HADOOP-5935. Fix the broken link to Hudson's release audit warnings.
(Giridharan Kesavan via gkesavan)
HADOOP-5947. Delete empty TestCombineFileInputFormat.java
HADOOP-5899. Move a log message in FSEditLog to the right place to
avoid unnecessary logging. (Suresh Srinivas via szetszwo)
HADOOP-5944. Add Apache license header to BlockManager.java. (Suresh
Srinivas via szetszwo)
HADOOP-5891. SecondaryNamenode is able to converse with the NameNode
even when the default value of dfs.http.address is not overridden.
(Todd Lipcon via dhruba)
HADOOP-5953. The isDirectory(..) and isFile(..) methods in KosmosFileSystem
should not be deprecated. (szetszwo)
HADOOP-5954. Fix javac warnings in TestFileCreation, TestSmallBlock,
TestFileStatus, TestDFSShellGenericOptions, TestSeekBug and
TestDFSStartupVersions. (szetszwo)
HADOOP-5956. Fix ivy dependency in hdfsproxy and capacity-scheduler.
(Giridharan Kesavan via szetszwo)
HADOOP-5836. Bug in S3N handling of directory markers using an object with
a trailing "/" causes jobs to fail. (Ian Nowland via tomwhite)
HADOOP-5861. s3n files are not getting split by default. (tomwhite)
HADOOP-5762. Fix a problem where DistCp does not copy empty directories.
(Rodrigo Schmidt via szetszwo)
HADOOP-5859. Fix "wait() or sleep() with locks held" findbugs warnings in
DFSClient. (Kan Zhang via szetszwo)
HADOOP-5457. Fix to continue to run builds even if contrib test fails
(Giridharan Kesavan via gkesavan)
HADOOP-5963. Remove an unnecessary exception catch in NNBench. (Boris
Shkolnik via szetszwo)
HADOOP-5989. Fix streaming test failure. (gkesavan)
HADOOP-5981. Fix a bug in HADOOP-2838 in parsing mapred.child.env.
(Amar Kamat via sharad)
HADOOP-5420. Fix LinuxTaskController to kill tasks using the process
groups they are launched with.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-6031. Remove @author tags from Java source files. (Ravi Phulari
via szetszwo)
HADOOP-5980. Fix LinuxTaskController so tasks get passed
LD_LIBRARY_PATH and other environment variables.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-4041. IsolationRunner does not work as documented.
(Philip Zeyliger via tomwhite)
HADOOP-6076. Fix a broken compilation of Forrest documentation due to
misplaced tags in commands_manual.xml. (yhemanth)
HADOOP-6004. Fixes BlockLocation deserialization. (Jakob Homan via
szetszwo)
HADOOP-6079. Serialize proxySource as DatanodeInfo in DataTransferProtocol.
(szetszwo)
Release 0.20.1 - Unreleased
INCOMPATIBLE CHANGES
HADOOP-5726. Remove pre-emption from capacity scheduler code base.
(Rahul Kumar Singh via yhemanth)
HADOOP-5881. Simplify memory monitoring and scheduling related
configuration. (Vinod Kumar Vavilapalli via yhemanth)
NEW FEATURES
IMPROVEMENTS
HADOOP-5711. Change Namenode file close log to info. (szetszwo)
HADOOP-5736. Update the capacity scheduler documentation for features
like memory based scheduling, job initialization and removal of pre-emption.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5714. Add a metric for NameNode getFileInfo operation. (Jakob Homan
via szetszwo)
HADOOP-4372. Improves the way history filenames are obtained and manipulated.
(Amar Kamat via ddas)
OPTIMIZATIONS
BUG FIXES
HADOOP-5691. Makes org.apache.hadoop.mapreduce.Reducer a concrete class
instead of an abstract one. (Amareshwari Sriramadasu via sharad)
HADOOP-5646. Fixes a problem in TestQueueCapacities.
(Vinod Kumar Vavilapalli via ddas)
HADOOP-5655. TestMRServerPorts fails on java.net.BindException. (Devaraj
Das via hairong)
HADOOP-5654. TestReplicationPolicy.<init> fails on java.net.BindException.
(hairong)
HADOOP-5688. Fix HftpFileSystem checksum path construction. (Tsz Wo
(Nicholas) Sze via cdouglas)
HADOOP-4674. Fix fs help messages for -test, -text, -tail, -stat
and -touchz options. (Ravi Phulari via szetszwo)
HADOOP-5718. Remove the check for the default queue in capacity scheduler.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5719. Remove jobs that failed initialization from the waiting queue
in the capacity scheduler. (Sreekanth Ramakrishnan via yhemanth)
HADOOP-4744. Another fix for the jetty port issue. The TaskTracker
kills itself if it ever discovers that the port to which jetty is actually
bound is invalid (-1). (ddas)
HADOOP-5349. Fixes LocalDirAllocator to check the returned path value
in the case where the file to be written is of unknown size.
(Vinod Kumar Vavilapalli via ddas)
HADOOP-5636. Prevents a job from going to RUNNING state after it has been
KILLED (this used to happen when the SetupTask would come back with a
success after the job has been killed). (Amar Kamat via ddas)
HADOOP-5641. Fix a NullPointerException in capacity scheduler's memory
based scheduling code when jobs get retired. (yhemanth)
HADOOP-5828. Use absolute path for mapred.local.dir of JobTracker in
MiniMRCluster. (yhemanth)
HADOOP-4981. Fix capacity scheduler to schedule speculative tasks
correctly in the presence of High RAM jobs.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5210. Solves a problem in the progress report of the reduce task.
(Ravi Gummadi via ddas)
HADOOP-5850. Fixes a problem to do with not being able to run jobs with
0 maps/reduces. (Vinod K V via ddas)
HADOOP-4626. Correct the API links in hdfs forrest doc so that they
point to the same version of hadoop. (szetszwo)
HADOOP-5883. Fixed tasktracker memory monitoring to account for
momentary spurts in memory usage due to java's fork() model.
(yhemanth)
HADOOP-5539. Fixes a problem to do with not preserving intermediate
output compression for merged data.
(Jothi Padmanabhan and Billy Pearson via ddas)
HADOOP-5932. Fixes a problem in capacity scheduler in computing
available memory on a tasktracker.
(Vinod Kumar Vavilapalli via yhemanth)
HADOOP-5908. Fixes a problem to do with ArithmeticException in the
JobTracker when there are jobs with 0 maps. (Amar Kamat via ddas)
HADOOP-5924. Fixes a corner case problem to do with job recovery with
empty history files. Also, after a JT restart, sends KillTaskAction to
tasks that report back but the corresponding job hasn't been initialized
yet. (Amar Kamat via ddas)
HADOOP-5882. Fixes a reducer progress update problem for new mapreduce
api. (Amareshwari Sriramadasu via sharad)
HADOOP-5746. Fixes a corner case problem in Streaming, where if an exception
happens in MROutputThread after the last call to the map/reduce method, the
exception goes undetected. (Amar Kamat via ddas)
HADOOP-5884. Fixes accounting in capacity scheduler so that high RAM jobs
take more slots. (Vinod Kumar Vavilapalli via yhemanth)
HADOOP-5937. Correct a safemode message in FSNamesystem. (Ravi Phulari
via szetszwo)
HADOOP-5869. Fix bug in assignment of setup / cleanup task that was
causing TestQueueCapacities to fail.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5921. Fixes a problem where the JobTracker sometimes failed to come
up because creation of a system file in the JobTracker's system-dir failed.
This problem would sometimes show up only when the FS for the system-dir
(usually HDFS) is started at nearly the same time as the JobTracker.
(Amar Kamat via ddas)
HADOOP-5920. Fixes a testcase failure for TestJobHistory.
(Amar Kamat via ddas)
Release 0.20.0 - 2009-04-15
INCOMPATIBLE CHANGES
HADOOP-4210. Fix findbugs warnings for equals implementations of mapred ID
classes. Removed public, static ID::read and ID::forName; made ID an
abstract class. (Suresh Srinivas via cdouglas)
HADOOP-4253. Fix various warnings generated by findbugs.
The following deprecated methods in RawLocalFileSystem are removed:
public String getName()
public void lock(Path p, boolean shared)
public void release(Path p)
(Suresh Srinivas via johan)
HADOOP-4618. Move http server from FSNamesystem into NameNode.
FSNamesystem.getNameNodeInfoPort() is removed.
FSNamesystem.getDFSNameNodeMachine() and FSNamesystem.getDFSNameNodePort()
replaced by FSNamesystem.getDFSNameNodeAddress().
NameNode(bindAddress, conf) is removed.
(shv)
HADOOP-4567. GetFileBlockLocations returns the NetworkTopology