
rmr2 mapreduce does not produce any output #179

@byu777

Description

I tried the following simple script with rmr2 on the Cloudera Quickstart 5.7.0 VM, but mapreduce does not generate any results. Here is the script:

library(rmr2)

small.ints <- to.dfs(1:10)
out <- mapreduce(input = small.ints, map = function(k, v) keyval(v, v^2))
from.dfs(out)
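
For comparison, given map = function(k, v) keyval(v, v^2), I would expect from.dfs(out) to return the numbers 1 through 10 as keys and their squares as values, i.e. something like (possibly in a different order):

$key
 [1]  1  2  3  4  5  6  7  8  9 10

$val
 [1]   1   4   9  16  25  36  49  64  81 100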

Here is the output:

> small.ints <- to.dfs(1:10)
16/08/07 20:14:42 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/08/07 20:14:42 INFO compress.CodecPool: Got brand-new compressor [.deflate]
Warning message:
S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’ were declared in NAMESPACE but not found 
> out <- mapreduce(input = small.ints, map = function(k, v) keyval(v, v^2))
16/08/07 20:14:48 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
packageJobJar: [] [/usr/jars/hadoop-streaming-2.6.0-cdh5.7.0.jar] /tmp/streamjob543400947433267521.jar tmpDir=null
16/08/07 20:14:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/08/07 20:14:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/08/07 20:14:50 INFO mapred.FileInputFormat: Total input paths to process : 1
16/08/07 20:14:50 INFO mapreduce.JobSubmitter: number of splits:2
16/08/07 20:14:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470447912721_0016
16/08/07 20:14:50 INFO impl.YarnClientImpl: Submitted application application_1470447912721_0016
16/08/07 20:14:50 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1470447912721_0016/
16/08/07 20:14:50 INFO mapreduce.Job: Running job: job_1470447912721_0016
16/08/07 20:15:00 INFO mapreduce.Job: Job job_1470447912721_0016 running in uber mode : false
16/08/07 20:15:00 INFO mapreduce.Job:  map 0% reduce 0%
16/08/07 20:15:12 INFO mapreduce.Job:  map 50% reduce 0%
16/08/07 20:15:13 INFO mapreduce.Job:  map 100% reduce 0%
16/08/07 20:15:13 INFO mapreduce.Job: Job job_1470447912721_0016 completed successfully
16/08/07 20:15:13 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=236342
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1001
        HDFS: Number of bytes written=244
        HDFS: Number of read operations=14
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4
    Job Counters 
        Launched map tasks=2
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=19917
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=19917
        Total vcore-seconds taken by all map tasks=19917
        Total megabyte-seconds taken by all map tasks=9958500
    Map-Reduce Framework
        Map input records=3
        Map output records=0
        Input split bytes=208
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=115
        CPU time spent (ms)=1300
        Physical memory (bytes) snapshot=239222784
        Virtual memory (bytes) snapshot=2127200256
        Total committed heap usage (bytes)=121503744
    File Input Format Counters 
        Bytes Read=793
    File Output Format Counters 
        Bytes Written=244
16/08/07 20:15:13 INFO streaming.StreamJob: Output directory: /tmp/file10106a0b36b6
> from.dfs(out)
$key
NULL

$val
NULL

to.dfs and from.dfs themselves do work, since I tried the following:

> small.ints <- to.dfs(1:10)
16/08/07 07:15:34 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/08/07 07:15:34 INFO compress.CodecPool: Got brand-new compressor [.deflate]
> out <- from.dfs(small.ints)
16/08/07 07:15:44 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/08/07 07:15:44 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
> out
$key
NULL

$val
 [1]  1  2  3  4  5  6  7  8  9 10
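
The counters in the job log above also show Map input records=3 but Map output records=0, so the map tasks finish but emit nothing. In case it helps with diagnosis, here is a rough sketch of how I can inspect the streaming output directory and the task logs from within R; it assumes the hadoop and yarn command-line tools are on the PATH of the R session, and the output path and application id are taken from the log above (the part-file naming is a guess):

# Sketch only: assumes the hadoop/yarn CLIs are reachable from the R session.
# List whatever the streaming job wrote to its output directory
# (path taken from the StreamJob line in the log above).
system("hadoop fs -ls /tmp/file10106a0b36b6")

# Dump the part files (assumed naming); rmr2 output is a binary SequenceFile,
# so this mainly shows whether the files are empty or not.
system("hadoop fs -text /tmp/file10106a0b36b6/part-*")

# Fetch the YARN container logs to look for R-side errors on the worker nodes
# (application id taken from the log above).
system("yarn logs -applicationId application_1470447912721_0016")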
